
Dagstuhl Seminar 22181

Computational Metabolomics: From Spectra to Knowledge

(May 01 – May 06, 2022)



Metabolomics is the study of small molecules in living systems, including those that generate the energy to sustain life, those that form the building blocks of macromolecules such as DNA, and some that originate outside the living system, such as pollutants. Biologically, this field is of increasing importance due to its strong connection to organism function. Metabolomics is rapidly expanding, with significant advances in both measurement technology (e.g., mass spectrometry, chromatography, NMR spectroscopy) and informatics approaches. The amount and complexity of data routinely exceed the capacity of typical software and other computational systems used in bioanalytical labs, and there is an ongoing and increasingly acute need for improvements in computational, informatics and statistical/machine learning approaches to make sense of it all.

This seminar, the fourth in the series on computational metabolomics, continued some previously well-developed themes and explored many new ones. A good example of the former is the problem of how to use mass spectral data to annotate (putatively identify) the thousands of unknown metabolites typically observed in routine assays. Another example is the discussion of new developments in dealing with Data Independent Acquisition, which has diversified considerably in the last 5 years. Many new directions were also discussed. For instance, the question of pathway analysis (how to generate semi-automated interpretation of metabolomics data at the level of groups of molecules working together in biological processes) is becoming more prominent as larger annotated datasets become available. Another new direction was "metaboproteomics", which looks at the diverse array of interactions between metabolites and proteins, in particular at how metabolite-derived post-translational modifications of proteins can be picked up in annotation pipelines. Other discussions focused on software aspects such as visualization of chemical space (a key problem in designing effective software tools) and the generation and curation of high-quality data for benchmarking new informatics algorithms. A session on extended metabolic models looked at ways to link data and prediction tools from protein function studies to metabolites in order to gain new knowledge of unknown metabolic pathways. From a data generation technology perspective, while mass spectrometry (MS) dominated as expected (e.g., sessions on MS spectral quality requirements, fragmentation trees, etc.), the seminar extended beyond previous ones into a discussion of NMR data processing and modeling. Open databases, repositories and knowledge representation also featured in their own discussions, including Wikidata, CXSMILES and WikiPathways/RaMP-DB. Finally, the important issue of integrating metabolomic data with other relevant data types (e.g., genomics, proteomics, etc.) was discussed.

The seminar organization followed a flexible format similar to that of the previous one, where topics were both suggested in advance and brainstormed on the Monday. The whole group participated in brainstorming and prioritization, and the agenda was further refined each morning of the meeting. Parallel discussions were organized with the aim of minimizing clashes in individual interests, and at the end of each morning and afternoon session a plenary feedback session was held to disseminate the main discussion points to the whole group. Evening sessions were very popular and covered a wide range of topics. Overall, the seminar was felt to be one of the most successful yet, highlighting the growing importance of computational metabolomics as a field in its own right and emphasizing the need for further meetings to address the important problems in this exciting area of research.

Copyright Corey Broeckling, Timothy Ebbels, Ewy Mathé, and Nicola Zamboni


Metabolomics is an analytical approach which aims to comprehensively describe the small molecule composition of a sample. Analyses are typically produced via mass spectrometry (MS) and/or nuclear magnetic resonance (NMR) spectroscopy. Computational tools for data processing and interpretation are critical to realizing the full potential of metabolomics in biological and biomedical research, environmental monitoring, or industrial biotechnology.

In recent years, network-based and multi-omics approaches have attracted a great deal of attention. Because small molecules, in both biological and non-biological systems, are transformed over time, networks can be drawn which visualize and model small molecule transformations or other relationships. These networks provide information that can guide assignments of chemical structures and their annotations, and inform on mechanistic processes driving small molecule transformation (metabolism) in the sample. In one example of such a network, small molecules are the nodes, and the edges may represent catalysts (synthetic, environmental) or gene products. While generic computational methods and metrics exist to construct and navigate networks, the appropriate definition of nodes and edges for a particular application is non-trivial and critical. In particular, the metabolomics field needs computational methods that address the peculiarities of metabolomics data (e.g., identification, annotations, etc.). This task necessitates a deep knowledge of metabolomics, and of other omics datasets when multi-omics integration is taken into account.
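The node-and-edge description above can be illustrated with a minimal Python sketch. It is not taken from the seminar material: the three reactions are the textbook first steps of glycolysis, chosen only as familiar example data, and the names `REACTIONS`, `build_network` and `find_route` are hypothetical. Metabolites are nodes, each directed edge is labelled with the gene product that catalyses the transformation, and a breadth-first search recovers a route between two metabolites.

```python
from collections import defaultdict, deque

# Illustrative data only (first steps of glycolysis); not from the seminar.
# Each tuple is (substrate, product, gene product catalysing the step).
REACTIONS = [
    ("glucose", "glucose 6-phosphate", "HK1"),
    ("glucose 6-phosphate", "fructose 6-phosphate", "GPI"),
    ("fructose 6-phosphate", "fructose 1,6-bisphosphate", "PFKM"),
]


def build_network(reactions):
    """Return an adjacency list: metabolite -> [(product, edge label), ...]."""
    network = defaultdict(list)
    for substrate, product, catalyst in reactions:
        network[substrate].append((product, catalyst))
    return network


def find_route(network, start, goal):
    """Breadth-first search over directed edges.

    Returns the list of (catalyst, metabolite) steps from start to goal,
    or None if goal cannot be reached from start.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, route = queue.popleft()
        if node == goal:
            return route
        for product, catalyst in network.get(node, []):
            if product not in visited:
                visited.add(product)
                queue.append((product, route + [(catalyst, product)]))
    return None


network = build_network(REACTIONS)
route = find_route(network, "glucose", "fructose 1,6-bisphosphate")
for catalyst, metabolite in route:
    print(f"--[{catalyst}]--> {metabolite}")
```

Treating edges as directed is itself a modelling assumption; many reactions are reversible, and a real application would have to decide how to define nodes and edges for its own data, which is exactly the non-trivial choice the paragraph above points to.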

Building upon previous meetings, this multidisciplinary Dagstuhl Seminar will focus on improving the interpretation of metabolomics data through network and statistical analysis in a wider biological or environmental context (e.g., by incorporating other data types). This five-day seminar aims to bring together mass spectrometrists, NMR spectroscopists, statisticians, epidemiologists, biologists, and computer scientists to find solutions to the major challenges still remaining in this highly dynamic and rapidly evolving field.

Copyright Corey Broeckling, Timothy Ebbels, Ewy Mathé, and Nicola Zamboni


Related Seminars
  • Dagstuhl Seminar 15492: Computational Metabolomics (2015-11-29 - 2015-12-04)
  • Dagstuhl Seminar 17491: Computational Metabolomics: Identification, Interpretation, Imaging (2017-12-03 - 2017-12-08)
  • Dagstuhl Seminar 20051: Computational Metabolomics: From Cheminformatics to Machine Learning (2020-01-26 - 2020-01-31)

  • Data Structures and Algorithms
  • Emerging Technologies
  • Machine Learning

  • metabolomics
  • mass spectrometry
  • bioinformatics
  • chemoinformatics
  • exposomics