November 15–18, 2015, Dagstuhl Seminar 15472

Programming with "Big Code"


William W. Cohen (Carnegie Mellon University, US)
Charles Sutton (University of Edinburgh, GB)
Martin Vechev (ETH Zürich, CH)

For information about this Dagstuhl Seminar, please contact

Dagstuhl Service Team


Dagstuhl Report, Volume 5, Issue 11


The main objective of the seminar was to bring together several research communities that have so far been working separately on the emerging topic of "Big Code" and to foster a new community around the topic. Over the last 4-5 years there have been several developments and interesting results involving "Big Code", spanning a wide range of fields and conferences. The seminar brought these communities together and enabled them to interact for the first time.

The program was structured as a series of talks interspersed with discussion. Almost all seminar participants gave a talk on their latest research. Although the initial plan was to include special discussion sessions, each talk triggered so much discussion, both during the talk and afterwards, that there was no need for dedicated discussion slots. We believe the seminar was successful in setting the right atmosphere for open-ended discussion and achieved the desired effect of triggering much organic interaction.

Only the last day (morning) included a short wrap-up discussion session focusing on the future of the area, defining common data sets and future challenges the community can address. That discussion is summarized in the working group report.

The seminar was highly interdisciplinary, involving experts from programming languages, software engineering, machine learning and natural language processing. Further, it brought together research groups from Europe, Asia and the U.S., all working on the topic of "Big Code", and raised awareness of and familiarity with what the different research groups are working on.

The talks and discussions spanned several topics, including the kinds of statistical methods used (e.g., n-gram models, recurrent neural networks, graphical models, probabilistic grammars), new programming applications that can benefit from these models (e.g., code completion, code search, code similarity, translating natural language to code), and the interaction between the two. Some of the presentations were introductory or overview in nature, while others focused on the more technical aspects of particular programming tools and machine learning models.

After two days of presentations and discussions, we used the last day of the seminar (before lunch) to summarize the discussions and to outline a future research direction. A suggestion enthusiastically embraced by everyone was to create a web site that lists the current data sets, challenges, tools and research groups working on the topic. The view was that this would not only enable existing groups to compare their tools on common problems and data sets but would also make it much easier for other research groups and graduate students to get into the area and start contributing. It also serves as a useful instrument for raising awareness of the topic.

We have now created this web site and have made it available here:

In a short time, several groups have started contributing by uploading links to tools, data sets and challenges.

Overall, the seminar was successful both in stimulating new and fruitful interaction between research communities that were working in the area but had so far been separated, and in setting a common agenda moving forward. Given the high interest in and feedback from this seminar, we anticipate that in a year or two we will be ready to propose a larger seminar on the topic.

Summary text license
  Creative Commons BY 3.0 Unported license
  William W. Cohen, Charles Sutton, and Martin Vechev


  • Artificial Intelligence / Robotics
  • Programming Languages / Compiler
  • Software Engineering


  • Statistical programming tools
  • Machine learning
  • Natural language processing
  • Programming languages
  • Software engineering


The Dagstuhl Reports series documents all Dagstuhl Seminars and Dagstuhl Perspectives Workshops. Together with the seminar's collector, the organizers compile a report that summarizes the authors' contributions and adds an overall summary.



Dagstuhl's Impact

Please let us know if a publication results from your seminar. We list such publications separately under Dagstuhl's Impact and display them on the ground floor of the library.


It is also still possible to publish a comprehensive collection of peer-reviewed papers in the Dagstuhl Follow-Ups series.