January 24 – 29, 2021, Dagstuhl Seminar 21041

CANCELLED Toward Scientific Evidence Standards in Empirical Computer Science

Due to the Covid-19 pandemic, this seminar was cancelled. A related Dagstuhl Seminar, Seminar 22442, was scheduled for October 30 – November 4, 2022.


Brett A. Becker (University College Dublin, IE)
Christopher D. Hundhausen (Washington State University – Pullman, US)
Ciera Jaspan (Google Inc. – Mountain View, US)
Andreas Stefik (University of Nevada – Las Vegas, US)
Thomas Zimmermann (Microsoft Corporation – Redmond, US)

For information about this Dagstuhl Seminar, please contact

Dagstuhl Service Team


Scientists in a variety of fields are increasingly concerned about the quality of the evidence gathered in the sciences. This concern has many sources, including a lack of procedures to detect fraud, the challenges of replication, the limited use of pre-registration, and statistical problems like p-hacking. For example, pre-registration requires researchers to register the methodologies of their studies before running an experiment and can be closely tied to the publication process. Besides discouraging p-hacking, it also helps prevent the file drawer problem: publishing only the results that confirm the authors' biases. No journal in computer science of which we are aware has checks and balances such as these in place.

Issues of evidence quality have also motivated political change and are beginning to limit the ability of computer scientists to win government grants. For example, the Every Student Succeeds Act in the United States places empirical studies into "Tiers" of evidence, which automatically discount many papers in our field because of our lack of evidence standards. Further, while not all scholars hold the same view, a recent Nature survey found that more than 70% of researchers have tried and failed to reproduce another scientist's experiments, more than 50% have failed to reproduce their own experiments, and 90% believe there is a replication crisis. Researchers in many fields are thus concerned about standards of evidence, replication, and other issues that are meaningful to the credibility of the science.

The discipline of computer science is not immune to any of these challenges and faces its own unique difficulties in addressing them. Multiple years-long investigations into software engineering and programming languages have furnished compelling evidence that authors in these fields fail to present rigorous evidence in their publications and lack basic checks and balances like "gathering data," "having a control group," or testing people "other than the authors of the publication." Researchers have raised concerns about the quality of research in computer science education and specifically the lack of replication. Further, no journal or conference in the field has formalized a standard of evidence for its publications, which makes comparisons across studies difficult.

We invite you to a Dagstuhl Seminar that has three primary objectives: 1) to establish a process for creating a computer science-specific evidence standard for empirical research, 2) to build a community of scholars in software engineering, human factors, and computer science education to discuss what a general standard should include, and 3) to collaborate between these sub-fields in analyzing and drafting evidence standards that can be adopted by journals in our respective sub-fields. Our overall goal is to define a standard that is general and flexible, focusing on what should be reported in empirical studies, but not prescribing to scholars the content of what they decide to investigate. The bulk of the activities will be focused on discussion time to determine what could work across and within our respective sub-fields of computer science.

Motivation text license
  Creative Commons BY 3.0 DE
  Brett A. Becker, Christopher D. Hundhausen, Ciera Jaspan, Andreas Stefik, and Thomas Zimmermann



  • Human-Computer Interaction
  • Other Computer Science
  • Software Engineering


  • Community Evidence Standards
  • Human Factors


The Dagstuhl Reports series documents all Dagstuhl Seminars and Dagstuhl Perspectives Workshops. Together with the seminar's collector, the organizers compile a report that gathers the authors' contributions and supplements them with a summary.



Dagstuhl's Impact

Please inform us if a publication arises from your seminar. Such publications are listed separately in the Dagstuhl's Impact section and displayed on the ground floor of the library.


It is also possible to publish a comprehensive collection of peer-reviewed papers in the Dagstuhl Follow-Ups series.