Dagstuhl Seminar 24172

Code Search

(Apr 21 – Apr 24, 2024)

Permalink
Please use the following short url to reference this page: https://www.dagstuhl.de/24172

Motivation

Code search is the process of retrieving source code that matches a query from a repository. Whether a developer is looking for where an error was thrown, learning how to use an unfamiliar API, learning a new programming language, or browsing their team’s directory to become familiar with the codebase, search underpins all of these activities. Beyond those human-driven software engineering processes, search is also a component of automated software engineering tasks such as automated program repair, code example recommendation, and clone detection. Furthermore, new generative AI tools have challenged traditional code search by presenting alternative approaches to finding and reusing code.

Code search research has implications for developer productivity, code quality, and software engineering ethics, and tools to facilitate code search are widely available. Some are internal to companies (e.g., Google has invested substantially in this), others are open source (e.g., GitHub has a search interface for public repositories), while still others generate code to match a user query (e.g., ChatGPT). Students and professionals also use generic web search to find source code examples. Across these platforms, query formats, indexing, rankings, the origin of the code, and use cases all vary. This provides many avenues for innovation and exploration in code search research.

For example, what is the appropriate scope for a search result? This question has implications for the underlying technology (e.g., should the indexed unit be a file, function, sub-function, or something else?) and for the use case (e.g., does the user want to adapt the code to their context? Are they seeking to understand a codebase? Or something else?). There are many other questions worth exploring: How should source code be indexed? Which search results should appear first? Are there artifacts beyond the code itself that should be surfaced, such as diffs against previous versions or documentation? How diverse should the results shown to the user be? What are the ethical considerations of code search, and of code search compared with code generation?

This Dagstuhl Seminar brings together experts in mining software repositories, human factors in software engineering, software documentation, code examples, program analysis, and industrial code search systems to bridge the gap between industry and academia and set the roadmap for the next decade of code search research.

Expected outcomes of this seminar include: new ideas on how to better support developers in searching for code across different user segments (e.g., industrial, open source software, student populations, developers with low language familiarity), clarity on how search can help during different stages of software development (e.g., writing new code, debugging existing issues, reviewing code), a better understanding of code search ethics, and guidelines for more rigorous, repeatable evaluations for code search research.

Copyright Satish Chandra, Michael Pradel, and Kathryn T. Stolee

Classification
  • Software Engineering

Keywords
  • code search
  • developer tools