January 27 – February 1, 2019, Dagstuhl Seminar 19052

Computational Methods for Melody and Voice Processing in Music Recordings


Emilia Gómez (UPF – Barcelona, ES)
Meinard Müller (Universität Erlangen-Nürnberg, DE)
Yi-Hsuan Yang (Academia Sinica – Taipei, TW)

For information about this Dagstuhl Seminar, please contact

Susanne Bach-Bernhard for administrative matters

Michael Gerke for scientific matters




In our daily lives, we are constantly surrounded and deeply influenced by music. Making music together can create strong ties between people while fostering communication and creativity. This is demonstrated, for example, by the large community of singers active in choirs and by the fact that music constitutes an important part of our cultural heritage. The availability of music in digital formats and its distribution over the World Wide Web have changed the way we consume, create, enjoy, explore, and interact with music. To cope with the increasing amount of digital music, one requires computational methods and tools that allow users to find, organize, analyze, and interact with music – topics that are central to the research field known as Music Information Retrieval (MIR).

This Dagstuhl Seminar is devoted to a branch of MIR that is of particular importance: processing melodic voices using computational methods. It is often the melody, a specific succession of musical tones, that constitutes the leading element in a piece of music. In this seminar we want to discuss how to detect, extract, and analyze melodic voices as they occur in recorded performances of a piece of music. Even though it may be easy for a human to recognize and hum a main melody, the automated extraction and reconstruction of such information from audio signals is extremely challenging due to the superposition of different sound sources as well as the intricacy of musical instruments and, in particular, the human voice. As one main objective of the seminar, we want to critically review the state of the art of computational approaches to various MIR tasks related to melody processing, including pitch estimation, source separation, instrument recognition, singing voice analysis and synthesis, and performance analysis (timbre, intonation, expression). Second, we aim to trigger interdisciplinary discussions that leverage insights from fields such as audio processing, machine learning, music perception, music theory, and information retrieval. Third, we shall explore novel applications in music and multimedia retrieval, content creation, musicology, education, and human-computer interaction.

By gathering internationally renowned key players from different research areas, our goal is to highlight and better understand the problems that arise when dealing with a highly interdisciplinary topic such as melody extraction or voice separation. A special focus will be placed on increasing the diversity of the MIR community in collaboration with the mentoring program developed by the Women in Music Information Retrieval (WiMIR) initiative. General questions and issues that may be addressed in this seminar include, but are not limited to, the following:

  • Model-based melody extraction and singing voice separation
  • Multipitch and predominant frequency estimation
  • Modeling of musical aspects such as vibrato, tremolo, and glissando
  • Informed singing voice separation (exploiting additional information such as score and lyrics)
  • Integrated models for voice separation, melody extraction, and singing analysis
  • Alignment methods for synchronizing lyrics and recorded songs
  • Assessment of singing style, expression, skills, and enthusiasm
  • Content-based retrieval (e.g. query-by-humming, theme identification)
  • Data mining techniques for identifying singing-related resources on the World Wide Web
  • Collecting and annotating musical sounds via crowdsourcing
  • User interfaces for singing information analysis and visualization
  • Singing for gaming and medical purposes
  • Understanding expressiveness in singing
  • Extraction of emotion-related parameters from melodic voices
  • Extraction of vocal quality descriptors
  • Singing voice synthesis and transformation
  • Cognitive and sensorimotor factors in singing
  • Deep learning approaches for melody extraction and voice processing
  • Deep autoencoder models for feature design and classification
  • Hierarchical models for short-term/long-term dependencies

  Creative Commons BY 3.0 DE
  Emilia Gómez, Meinard Müller, and Yi-Hsuan Yang



  • Data Bases / Information Retrieval
  • Multimedia
  • Society / Human-computer Interaction


  • Music information retrieval
  • Music processing
  • Singing voice processing
  • Audio signal processing
  • Machine learning



