
Dagstuhl-Seminar 22442

Toward Scientific Evidence Standards in Empirical Computer Science

( 30. Oct – 04. Nov, 2022 )








The goals of the seminar Toward Scientific Evidence Standards in Empirical Computer Science were to establish a process for introducing evidence standards in computer science and to build a community of scholars that can discuss what a general standard should include, with enough diversity of background to reflect the breadth of community needs across a range of computer science-related venues.

Over the first few days, we conducted a series of breakout groups and larger group discussions. In these, to introduce participants to evidence standards, we reviewed several of them, including APA JARS [1], WWC [2], and CONSORT [3]. The purpose was to provide an introduction and to scaffold discussions on what could work across the breadth of computer science or in its subareas. We also conducted a session examining existing papers and noting the changes that would be needed to fit the APA JARS standards. Participants found this exercise particularly useful, as it made clear that the conversion is not especially difficult, although it is aided by advance planning for what might need to be collected during a study.

During the seminar, we also had several talks. These included an introductory talk by Andreas Stefik on evidence standards as a whole, telling the story of the well-known drug Tolbutamide and its influence on the medical field with regard to evidence standards. Christopher Hundhausen gave a talk on his experience introducing reporting standards at ACM's Transactions on Computing Education (TOCE) (Section 3.2). Paul Ralph presented on the problems in scholarly peer review and how evidence standards could be a solution, along with a reviewing tool that he has developed (Section 3.3). Neil Ernst covered registered reports, their benefits to the transparency and quality of research, and his experience introducing them at Mining Software Repositories (MSR) and Empirical Software Engineering (EMSE) (Section 3.4). Lastly, Kate Sanders et al. discussed a review of reviews spanning a variety of computer science subfields, including their observations on review criteria, ethical concerns in the peer review process, and excerpts from interviews with conference chairs and journal editors relevant to the subject of the seminar (Section 3.5). Each talk gave insight into the process of adopting an evidence standard and into the potential impacts, positive or negative, of both the status quo and possible changes.

Finally, after discussion, we identified four topics for breakout groups to brainstorm potential avenues toward actionable progress on our goals: how to write guidelines for more complex experiments such as mixed-methods studies (Section 4.5); how to measure the effects that evidence standards have on both paper quality and community satisfaction (Section 4.6); what the first steps toward community engagement are, in terms of introducing the topic and driving adoption (Section 4.7); and how to operationalize these standards in an open-source way that allows for community control (Section 4.8). A final working group session went through some of the first steps that could be taken at conferences and a dissemination plan for how to start informing the community about the topic (Section 4.9). Overall, the seminar brought a range of computer science stakeholders up to speed on the state of evidence standards in the field and on what could be gained by moving toward domain-wide guidelines, and it started a discussion on how to spark the conversation in various communities. A set of next steps on where and what to recommend and discuss with communities was set in motion, as well as plans for a collaborative position paper to introduce the topic to a wider audience.


  1. American Psychological Association. APA Style Journal Article Reporting Standards (APA Style JARS). Accessed on December 12, 2022 from
  2. National Center for Education Evaluation and Regional Assistance. WWC | Find What Works. Accessed on December 12, 2022 from
  3. The CONSORT Group. CONSORT Transparent Reporting of Trials. Accessed on December 12, 2022 from
Copyright Timothy Kluthe and Andreas Stefik


Many scientific fields of study use formally established evidence standards during the peer review and evaluation process, such as CONSORT in medicine, the What Works Clearinghouse in education, or the APA Journal Article Reporting Standards (JARS) in psychology. The basis for these standards is community agreement on what to report in empirical studies. Such standards achieve two key goals. First, they make it easier to compare studies, facilitating replications, which can provide confidence that multiple research teams can obtain the same results. Second, they establish community agreement on how to report on and evaluate studies using different methodologies.

The discipline of computer science does not have formalized evidence standards, even for major conferences or journals. This Dagstuhl Seminar has three primary objectives:

  1. To establish a process for creating or adopting an existing evidence standard for empirical research in computer science.
  2. To build a community of scholars that can discuss what a general standard should include.
  3. To kickstart the discussion with scholars from software engineering, human-computer interaction, and computer science education.

To better discuss and understand the implications of such standards across several empirical subfields of computer science, and to facilitate adoption, our plan for the seminar includes having representatives from prominent journals in attendance.

Copyright Brett A. Becker, Christopher D. Hundhausen, Ciera Jaspan, Andreas Stefik, and Thomas Zimmermann


  • Computers and Society
  • Human-Computer Interaction
  • Software Engineering

  • Community Evidence Standards
  • Human Factors