- Timothy Kluthe, Brett A. Becker, Christopher D. Hundhausen, Ciera Jaspan, Andreas Stefik, and Thomas Zimmermann. Toward Scientific Evidence Standards in Empirical Computer Science (Dagstuhl Seminar 22442). In Dagstuhl Reports, Volume 12, Issue 10, pp. 225-240, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)
The goals of the seminar Toward Scientific Evidence Standards in Empirical Computer Science were to establish a process for introducing evidence standards in computer science and to build a community of scholars that can discuss what a general standard should include, with enough diversity of background to represent the breadth of community needs across a range of computer science-related venues.
Over the first few days, we conducted a series of breakout groups and larger group discussions. To introduce participants to evidence standards, we reviewed several of them, including APA JARS, WWC, and CONSORT. The purpose was introductory, serving as scaffolding for discussions on what could work across the breadth of computer science or in its subareas. We also conducted a session examining existing papers and noting the changes that would be needed to meet the APA JARS standards. Participants found this exercise particularly useful, as it made clear that the conversion is not especially difficult, although it is aided by advance planning for what might need to be collected during a study.
During the seminar, we also had several talks. These included an introductory talk by Andreas Stefik on evidence standards as a whole, telling the story of the well-known drug Tolbutamide and its influence on evidence standards in the medical field. Christopher Hundhausen spoke about his experience introducing reporting standards at ACM's Transactions on Computing Education (TOCE) (Section 3.2). Paul Ralph presented on the problems in scholarly peer review, how evidence standards could be a solution, and a reviewing tool that he has developed (Section 3.3). Neil Ernst covered registered reports, their benefits for the transparency and quality of research, and his experience introducing them at Mining Software Repositories (MSR) and Empirical Software Engineering (EMSE) (Section 3.4). Lastly, Kate Sanders et al. discussed a review of reviews spanning a variety of computer science subfields, including their observations on review criteria, ethical concerns in the peer review process, and excerpts from interviews with conference chairs and journal editors relevant to the subject of the seminar (Section 3.5). Each of these talks gave insights into the process of adopting an evidence standard and into the potential impacts, positive or negative, of both the status quo and possible changes.
Finally, after discussion, we identified four topics for breakout groups to brainstorm potential avenues toward actionable progress on our goals: a deeper dive into how to write guidelines for more complex experiments such as mixed-methods studies (Section 4.5); how to measure the effects that evidence standards have on both paper quality and community satisfaction (Section 4.6); what the first steps are toward community engagement, in terms of introducing the topic and encouraging adoption (Section 4.7); and how to operationalize these standards in an open-source way that allows for community control (Section 4.8). A final working group session went through some of the first steps that could be taken at conferences and a dissemination plan for how to start informing the community about the topic (Section 4.9). Overall, the seminar brought a range of computer science stakeholders up to speed on the state of evidence standards in the field and on what could be gained by moving toward domain-wide guidelines, and it started a discussion on how to spark the conversation in various communities. Next steps on where and what to recommend and discuss with communities were set in motion, as were plans for a collaborative position paper to introduce the topic to a wider audience.
- American Psychological Association. APA Style Journal Article Reporting Standards (APA Style JARS). Accessed on December 12, 2022 from https://apastyle.apa.org/jars.
- National Center for Education Evaluation and Regional Assistance. WWC | Find What Works. Accessed on December 12, 2022 from https://ies.ed.gov/ncee/wwc/.
- The CONSORT Group. CONSORT Transparent Reporting of Trials. Accessed on December 12, 2022 from https://www.consort-statement.org/.
Many scientific fields of study use formally established evidence standards during the peer review and evaluation process, such as CONSORT (http://www.consort-statement.org) in medicine, the What Works Clearinghouse (https://ies.ed.gov/ncee/wwc/) in education, or the APA Journal Article Reporting Standards (JARS) in psychology (https://apastyle.apa.org/jars). The basis for these standards is community agreement on what to report in empirical studies. Such standards achieve two key goals. First, they make it easier to compare studies, facilitating replications which can provide confidence that multiple research teams can obtain the same results. Second, they establish community agreement on how to report on and evaluate studies using different methodologies.
The discipline of computer science does not have formalized evidence standards, even for major conferences or journals. This Dagstuhl Seminar has three primary objectives:
- To establish a process for creating or adopting an existing evidence standard for empirical research in computer science.
- To build a community of scholars that can discuss what a general standard should include.
- To kickstart the discussion with scholars from software engineering, human computer interaction, and computer science education.
In order to better discuss and understand the implications of such standards across several empirical subfields of computer science and to facilitate adoption, our plan for the seminar includes having representatives from prominent journals in attendance.
- Brett A. Becker (University College Dublin, IE) [dblp]
- Andrew Begel (Carnegie Mellon University - Pittsburgh, US) [dblp]
- Michelle Craig (University of Toronto, CA) [dblp]
- Andrew Duchowski (Clemson University, US) [dblp]
- Neil Ernst (University of Victoria, CA)
- Arto Hellas (Helsinki University of Technology, FI) [dblp]
- Christopher D. Hundhausen (Oregon State University - Corvallis, US) [dblp]
- Ciera Jaspan (Google - Mountain View, US) [dblp]
- Timothy Kluthe (University of Nevada - Las Vegas, US)
- Juho Leinonen (Aalto University, FI)
- Joseph Maguire (University of Glasgow, GB)
- Monica McGill (CSEdResearch.org - Peoria, US)
- Brad Myers (Carnegie Mellon University - Pittsburgh, US) [dblp]
- Andrew Petersen (University of Toronto Mississauga, CA) [dblp]
- Mauro Pezzè (University of Lugano, CH) [dblp]
- Paul Ralph (Dalhousie University - Halifax, CA)
- Kate Sanders (Rhode Island College - Providence, US) [dblp]
- Andreas Stefik (University of Nevada - Las Vegas, US) [dblp]
- Claudia Szabo (University of Adelaide, AU) [dblp]
- Jan Vahrenhold (Universität Münster, DE) [dblp]
- Titus Winters (Google - New York, US)
- Aman Yadav (Michigan State University - East Lansing, US) [dblp]
- Computers and Society
- Human-Computer Interaction
- Software Engineering
- Community Evidence Standards
- Human Factors