25–30 August 2019, Dagstuhl Seminar 19351

Computational Proteomics


Nuno Bandeira (University of California – San Diego, US)
Ileana M. Cristea (Princeton University, US)
Lennart Martens (Ghent University, BE)

The Dagstuhl Seminar 19351 'Computational Proteomics' discussed several key challenges facing the field of computational proteomics. The discussions were wide-ranging, and radiated out from the four topics set out at the start.

These four topics were (i) personally identifiable proteomics data; (ii) unique computational challenges in data-independent acquisition (DIA) approaches; (iii) computational approaches for cross-linking proteomics; and (iv) the visual design of proteomics data and results, to communicate more clearly to the broad life sciences community. A cross-cutting topic was introduced as well, which focused on proteotyping in clinical trials, as it brings many of the previous challenges together. It does so by asking the logical but complex question of how proteomics approaches, data, and associated computational methods and tools can become part of routine clinical trial data acquisition, monitoring and processing.

Based on these initial topics, breakout sessions were organized around proteomics data privacy, dealing with data from DIA approaches, how best to apply computational approaches to cross-linking data for structural elucidation, and the importance of visualisation of proteomics data and results to engender excitement for the field's capabilities in the life sciences in general. These breakout sessions in turn inspired additional breakout sessions on associated topics.

The DIA and cross-linking breakouts both yielded the issue of ambiguity in identification as a cross-cutting topic that merited its own dedicated breakout session. A closely related breakout session, derived from the proteomics privacy and DIA sessions, centered on open modification searches, which are now becoming feasible in proteomics for the first time, but which are also prone to potentially crippling ambiguity issues while raising even more complex privacy issues. The visual design breakout explicitly identified multi-omics data integration as a direct offshoot of its discussions, which led to a dedicated breakout session on this topic as well. Another emerging breakout session concerned public data, which was triggered by both the DIA and cross-linking topics because of their shared need to disseminate their respective specialised data and results in a standardised, uniform, and well-structured manner. Finally, the cross-linking and DIA topics also led to a breakout session on ion mobility, as this technological advance was seen as a key aspect in the future of these technologies.

Each of these breakout sessions had exciting outcomes, and gave rise to future research ideas and collaborations. The proteomics privacy breakout concluded that the field is now ready to delve in more detail into the issues surrounding proteomics data privacy, and that a white paper will be written that can be used to propose policy and to inform the community. The DIA breakout identified three future tasks: (i) to develop a perspective manuscript that will discuss peptide-centric and spectrum-centric FDR, as well as the effects of shared evidence; (ii) to conduct an experiment for testing DDA versus DIA on the same sample to discover the sampling space for precursors and fragments; and (iii) to conduct a second experiment for understanding target/decoy scoring for different decoy generation models using both synthetic and predicted target/decoy peptides. The cross-linking breakout concluded that a cross-linked ribosomal protein complex should be used as a standardized dataset publicly available to the community, while a 'Minimum Information Requirements About a Cross Linking Experiment (MIRACLE)' was proposed to unify results from many cross-linking tools. The results will also be presented at the Symposium on Structural Proteomics in Göttingen in November 2019. The visual design breakout came up with many fine-grained conclusions, but also with an overall design philosophy centered on three levels of technical detail, depending on the audience: (i) interfaces for detailed data exploration for experienced consumers; (ii) interfaces with minimal technical information, focusing on high-level data for the specific scientific question for novice consumers; and (iii) interfaces with only relevant information for clinical decision making (e.g. a short list of proteins significantly affected by the disease) for clinicians.

The five offshoot breakouts described above also came to conclusions, and the interested reader is referred to the corresponding abstracts for details.

Overall, the 2019 Dagstuhl Seminar on Computational Proteomics was extremely successful as a catalyst for careful yet original thinking about key challenges in the field, and as a means to make progress by setting important, high-impact goals to work on in close collaboration. Moreover, during the Seminar, several highly interesting topics for a future Dagstuhl Seminar on Computational Proteomics were proposed, showing that this active and inspired community has not yet run out of challenges, nor out of ideas and opportunities!

   Lennart Martens and Nuno Bandeira

