07.10.18 - 10.10.18, Seminar 18412

Encouraging Reproducibility in Scientific Research of the Internet

The following text appeared on our web pages prior to the seminar, and was included as part of the invitation.


Reproducibility of research in computer science, and in the field of networking in particular, is a well-recognized problem. For several reasons, including the sensitive and/or proprietary nature of some Internet measurements, the networking research community discounts the importance of reproducibility of results, instead tending to accept papers that appear plausible. Studies have shown that only a fraction of published papers release the artefacts (such as code and datasets) that are needed to reproduce their results.

To provide incentives to authors, conferences bestow best dataset awards and actively solicit submissions that reproduce results. Community archives (such as DatCat and CRAWDAD) provide an index of existing measurement data and invite the community to reproduce existing research. SIGCOMM Computer Communication Review allows authors to upload artefacts during paper submission so that reviewers can check for reproducibility, and relaxes page limits for reproducible papers. The Association for Computing Machinery (ACM) has recently introduced a new policy on result and artefact review and badging, which defines a common terminology for assessing results and artefacts. ACM has also initiated a task force on data, software, and reproducibility in publication to understand how it can effectively promote reproducibility within the computing research community. Despite these continued efforts, reproducibility of research in computer science, and in networking in particular, remains an ongoing problem: papers that reproduce existing research rarely get published in practice.

In this seminar, we aim to discuss the challenges of improving reproducibility of scientific Internet research, and we hope to develop a set of recommendations that we as a community can adopt to initiate a cultural change toward reproducibility of our work. Questions we anticipate discussing during the seminar include:

  • What are the challenges with reproducibility?
    How can researchers (and data providers) navigate concerns with openly sharing datasets? How should we cope with datasets that lack stable ground truth?
  • What incentives are needed to encourage reproducibility?
    What can publishers do? What can conference organising committees do? How can we ensure that reviewers consider reproducibility when reviewing papers? How can we manage and scale the evaluation of artefacts during peer review? Do we need new venues that specifically require reproducibility of the submitted research?
  • What tools and systems are available to facilitate reproducibility?
    How effective are emerging interactive lab notebook tools (e.g., Jupyter) at enabling or facilitating reproducibility? Should computer science course curricula integrate use of these tools for student projects to help develop skills and habits that enable reproducibility?
  • What guidelines or best practices are needed to help reproducibility?
    How can we ensure authors think about reproducibility? What guidelines would assist reviewers in evaluating artefacts?
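As a concrete illustration of the habits such tools and guidelines aim to instil, the sketch below (purely illustrative, not part of the seminar material; the function names and the toy analysis are assumptions) shows three practices that make a measurement study easier to reproduce: seeding randomness explicitly, fingerprinting the input dataset, and recording the runtime environment alongside the result.

```python
# Illustrative sketch of reproducibility habits (hypothetical example):
# seed randomness, fingerprint input data, and record provenance.
import hashlib
import platform
import random


def fingerprint(data: bytes) -> str:
    """Stable SHA-256 digest of an input dataset, so others can
    verify they analysed the same bytes."""
    return hashlib.sha256(data).hexdigest()


def run_experiment(seed: int = 42) -> float:
    """A toy 'measurement analysis' that is deterministic given the
    seed: mean of 1000 pseudo-random samples."""
    rng = random.Random(seed)  # local seeded RNG, not global state
    samples = [rng.random() for _ in range(1000)]
    return sum(samples) / len(samples)


if __name__ == "__main__":
    # Log provenance next to the result, so a released artefact
    # carries everything needed to rerun and compare.
    print("python:", platform.python_version())
    print("data sha256:", fingerprint(b"example-dataset"))
    print("mean:", run_experiment(seed=42))
```

The same ideas carry over directly to notebook environments such as Jupyter, where a pinned seed and a recorded environment are what turn an interactive exploration into a reproducible artefact.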

To encourage reproducibility of research, practitioners continue to perform community service that educates researchers on the need for this cultural change.

Creative Commons BY 3.0 DE
Vaibhav Bajpai, Olivier Bonaventure, Kimberly Claffy, and Daniel Karrenberg