August 25 – 30, 2019, Dagstuhl Seminar 19351

Computational Proteomics


Nuno Bandeira (University of California – San Diego, US)
Ileana M. Cristea (Princeton University, US)
Lennart Martens (Ghent University, BE)



Mass spectrometry (MS) based proteomics has seen an enormous increase in analytical capability over the past twenty years, which has allowed the field to become a cornerstone technology in the life sciences. Concomitant with this increased analytical capability, the field has seen rapid and continuous growth in the amount of data it produces. This in turn has led to a strong dependence on dedicated, state-of-the-art computational approaches to process and interpret the acquired data. Moreover, the field has seen the development of complex experimental approaches that allow researchers to dig ever deeper into protein biology. Yet these new approaches also require their own dedicated analysis algorithms.

Two such novel approaches that have gained prominence over the past two years are data-independent acquisition (DIA) and protein cross-linking experiments. The DIA approach omits the selection of a narrow mass-over-charge (m/z) range prior to fragmenting a peptide analyte, thus effectively creating compound spectra that consist of a multitude of co-fragmented peptides. These spectra contrast markedly with those from traditional data-dependent acquisition (DDA), where spectra are typically derived from the fragmentation of a single peptide. Current identification algorithms, built for the interpretation of DDA spectra, are therefore not suited to handling DIA spectra.
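The contrast between the two acquisition modes can be illustrated with a minimal sketch. It is not taken from any seminar material: the `Precursor` type, the `co_isolated` helper, the peptide labels, and all m/z values and window widths are hypothetical and chosen only to show why a wide isolation window yields a compound spectrum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    name: str  # hypothetical peptide label
    mz: float  # precursor mass-over-charge


def co_isolated(precursors, center, width):
    """Return the precursors whose m/z lies inside the isolation window."""
    half = width / 2.0
    return [p for p in precursors if center - half <= p.mz <= center + half]


# Made-up precursor m/z values, for illustration only.
precursors = [
    Precursor("pep_A", 500.3),
    Precursor("pep_B", 501.9),
    Precursor("pep_C", 507.4),
    Precursor("pep_D", 512.0),
    Precursor("pep_E", 530.1),
]

# Narrow window centered on one precursor (DDA-like, assumed 2 m/z wide).
dda_like = co_isolated(precursors, center=500.3, width=2.0)

# Wide window with no precursor targeting (DIA-like, assumed 25 m/z wide).
dia_like = co_isolated(precursors, center=512.5, width=25.0)

print([p.name for p in dda_like])
print([p.name for p in dia_like])
```

With these assumed values the narrow window isolates a single precursor, while the wide window captures several at once; the fragments of all of them would end up in the same spectrum.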

A second challenge emerges for protein cross-linking, where small-molecule cross-linkers are used to establish a covalent link between two sections of a single protein, or between two different proteins. The resulting peptide analytes of interest are thus covalently cross-linked, creating a so-called chimeric di-peptide. When fragmented, these cross-linked di-peptides create complex spectra, with many novel types of fragment ions. Interpretation of these spectra thus requires dedicated algorithms that should moreover be able to adapt to the large variety of small-molecule cross-linkers available today.
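As a rough illustration of why the choice of cross-linker matters to an algorithm, the sketch below computes the precursor m/z of a cross-linked peptide pair as the sum of the two peptide masses plus a linker-specific mass shift. The function names, the peptide masses, and the linker mass shift are assumptions made for this example; a real tool would take the mass shift from the definition of the cross-linker actually used.

```python
PROTON_MASS = 1.007276  # Da, approximate mass of a proton


def crosslinked_mass(mass_a, mass_b, linker_delta):
    """Neutral mass of two peptides joined by a cross-linker.

    linker_delta is the net mass the linker adds once both ends have reacted.
    """
    return mass_a + mass_b + linker_delta


def precursor_mz(neutral_mass, charge):
    """m/z of a positively charged ion carrying `charge` protons."""
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical peptide masses and linker mass shift, for illustration only.
pair_mass = crosslinked_mass(mass_a=1000.5, mass_b=1500.75, linker_delta=138.068)
print(round(precursor_mz(pair_mass, charge=3), 4))
```

Keeping the linker mass shift as a parameter, rather than a constant, is one simple way to let the same code handle different cross-linkers.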

Apart from these experiment-specific data analysis challenges, the field also must find ways to deal effectively with all the public data that is being amassed at an increasing rate. Indeed, while there is an enormous (and growing) amount of proteomics data now available in the public domain, efforts to channel these data into a comprehensive picture of the proteome of a given cell, tissue, or organism are still at an early stage. The field therefore needs to develop novel ways to assemble and present proteomes for direct consumption by biologists, which will require new algorithms to combine and filter data collected across tens of thousands of individual analyses, along with novel visualization approaches that are tailored to biologists. To successfully address these challenges, different experts need to be brought together: computer scientists, bioinformaticians and statisticians who develop algorithms, approaches and software for the interpretation of the acquired data; life scientists who rely on mass spectrometry-based proteomics as a key means to elucidate biology; and analytical chemists and engineers who develop the instruments.

Our key topics for discussion and investigation at this seminar will follow the outlines of the challenges identified above, and will center on:

  1. Identification and quantification of DIA data
    There is an urgent need to bring together the various researchers involved in establishing novel approaches for DIA analytics, and in developing novel algorithms to process DIA data, so that the specific features of these data can be leveraged for robust identification and quantification.
  2. Algorithms for the analysis of protein cross-linking data
    Cross-linking MS data exposes weaknesses in current scoring functions, as well as scaling issues. This creates a clear need for fresh approaches to the processing of cross-linking data, including data from cleavable cross-linkers. We will therefore bring together experimentalists and bioinformaticians working in cross-linking MS to derive novel solutions to support this field.
  3. Creating an online view of complete, browsable proteomes from public data
    We will investigate approaches to combine data across tens of thousands of analyses into high-quality proteomes and develop an interface for biologists to explore and interrogate such a proteome based on a new visual design language for proteomics.
  4. Detecting interesting biology from proteomics findings
    The re-processing of public data is highly likely to deliver novel biology, yet we are currently extremely poorly equipped to detect such biologically significant findings, or to assess their role or importance. We will therefore investigate the creation of such methods and approaches in this seminar.

Motivation text license
  Creative Commons BY 3.0 DE
  Lennart Martens

Dagstuhl Seminar Series


  • Bioinformatics


  • Computational Mass Spectrometry
  • Computational Biology
  • Proteomics

