August 28 – September 2, 2016, Dagstuhl Seminar 16351
Next Generation Sequencing - Algorithms, and Software For Biomedical Applications
In recent years, Next Generation Sequencing (NGS) data have begun to appear in many applications that are clinically relevant, such as resequencing of cancer patients, disease-gene discovery and diagnostics for rare diseases, microbiome analyses, and gene expression profiling, to name but a few. Other fields of biological research, such as phylogenomics, functional genomics, and metagenomics, are also making increasing use of the new sequencing technologies.
The analysis of sequencing data is demanding because of the enormous data volume and the need for fast turnaround time, accuracy, reproducibility, and data security. Addressing these issues requires expertise in a large variety of areas: algorithm design, high-performance computing on big data (and hardware acceleration), statistical modeling and estimation, and specific domain knowledge for each medical problem. In this Dagstuhl Seminar we aimed at bringing together leading experts from both sides -- computer scientists including theoreticians, algorithmicists and tool developers, as well as leading researchers who work primarily on the application side in the biomedical sector -- to discuss the state of the art and to identify areas of research that might benefit from a joint effort of all the groups involved.
Goals of the seminar
The key goal of this seminar was a free and deep exchange of ideas and needs between the communities of algorithmicists and theoreticians and practitioners from the biomedical field. This exchange was intended to trigger discussions about the implications of new types of data or experimental protocols for the algorithms and data structures needed.
We started the seminar with a number of challenge talks to encourage discussion about the various topics introduced in the proposal. Before the seminar started we identified three areas the participants were most interested in, namely:
- Data structures and algorithms for large data sets, hardware acceleration
- New problems in the upcoming age of genomes
- Challenges arising from new experimental frontiers and validation
For the first area, Laurent Mouchard, Gene Myers, and Simon Gog presented results and challenges; for the second area, Siavash Mirarab, Niko Beerenwinkel, Shibu Yooseph, and Kay Nieselt introduced some thoughts; and finally, for the last area, Jason Chin, Ewan Birney, Alice McHardy, and Pascal Costanza talked about challenges. For most of those talks the abstracts can be found below. Following this introductory phase, the participants organized themselves into working groups on relatively broad topics. Those first breakout groups were about
- Haplotype phasing
- Big data
- Pangenomics data representation
- Cancer genomics
The results of the groups were discussed in plenary sessions interleaved with some impromptu talks. As a result, the participants split up into smaller, more focused breakout groups, which were very well received. Indeed, some participants have already extended data formats for assembly or improved recent results on full-text string indices.
Based on the initial feedback from the participants, we think that the topic of the seminar was interesting and led to a lively exchange of ideas. We thus intend to revisit the field in another Dagstuhl Seminar in the coming years, most likely organized by different leaders of the field in order to account for upcoming changes. In such a seminar we intend to encourage more people from clinical bioinformatics to join the discussions.
Creative Commons BY 3.0 Unported license
Gene Myers, Mihai Pop, Knut Reinert, and Tandy Warnow
- Data Structures / Algorithms / Complexity
- Software Engineering
- Sequence analysis
- DNA Sequence Assembly
- Expression Profiles
- Human Disease
- Software Engineering (Tools & Libraries)
- Next Generation Sequencing