Dagstuhl Seminar 07131
Similarity-based Clustering and its Application to Medicine and Biology
( Mar 25 – Mar 30, 2007 )
- Michael Biehl (University of Groningen, NL)
- Barbara Hammer (TU Clausthal, DE)
- Michel Verleysen (University of Louvain, BE)
- Thomas Villmann (Universität Leipzig, DE)
- Accelerating relational clustering algorithms with sparse prototype representation : article : proceedings of the 6th international workshop on self-organizing maps : S. 1-6 = WSOM 2007 - Fabrice Rossi ; Alexander Hasenfuß ; Barbara Hammer - Bielefeld : Universität, 2007.
- Advances in feature selection with mutual information : article in LNAI 5400 : S. 52-69 - Verleysen, Michel; Rossi, Fabrice; Francois, Damien - Berlin : Springer, 2009 - (Lecture notes in artificial intelligence : 5400 ; S. 52-69).
- Median topographic maps for biomedical data sets : article in LNAI 5400 : S. 92-117 - Hammer, Barbara; Hasenfuss, Alexander; Rossi, Fabrice - Berlin : Springer, 2009 - (Lecture notes in artificial intelligence : 5400 ; S. 92-117).
- Patch relational neural gas : clustering of huge dissimilarity datasets : article in LNAI 5064, S. 1-12 - Alexander Hasenfuß ; Barbara Hammer ; Fabrice Rossi - Berlin : Springer, 2008 - (Lecture notes in artificial intelligence : 5064 : S. 1-12).
- Similarity-Based Clustering : Recent Developments and Biomedical Applications - Biehl, Michael; Hammer, Barbara; Verleysen, Michel; Villmann, Thomas - Berlin : Springer, 2009. - IX, 201 S. - (Lecture notes in artificial intelligence : state-of-the-art survey ; 5400). ISBN: 978-3-642-01804-6 / 3-642-01804-1.
- Topographic processing of relational data : article : proceedings of the 6th international workshop on self-organizing maps : S. 1-6 = WSOM 2007 - Hammer, Barbara; Hasenfuß, Alexander; Rossi, Fabrice; Strickert, Marc - Bielefeld : Universität, 2007.
In medicine, biology, and medical bioinformatics, more and more data arise from clinical measurements such as EEG or fMRI studies for monitoring brain activity, mass spectrometry data for the detection of proteins, peptides and composites, or microarray profiles for the analysis of gene expression. Typically, the data are high-dimensional, noisy, and very hard to inspect using classical (e.g. symbolic or linear) methods. At the same time, new technologies, ranging from very high-resolution spectra to high-throughput screening for microarray data, are developing rapidly and promise efficient, cheap, and automatic gathering of large amounts of high-quality data with great information potential. Thus, there is a need for appropriate machine learning methods which help to automatically extract and interpret the relevant parts of this information and which, eventually, help to enable an understanding of biological systems, reliable diagnosis of faults, and therapy of diseases such as cancer based on this information.
The seminar centered on the development, understanding, and application of similarity-based clustering in complex domains related to the life sciences. These methods have great potential as an intuitive and flexible toolbox for mining, visualization, and inspection of large data sets, since they combine simple and human-understandable principles with a large variety of problem-adapted design choices. The goal of the seminar was to bring together researchers from Computer Science and Biology to explore recent algorithmic developments, discuss theoretical background and problems, and identify important applications and challenges of the methods.
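As an illustration of the kind of method meant here, the following is a minimal sketch of prototype-based clustering that relies only on pairwise dissimilarities. It uses k-medoids as a stand-in and is not the specific algorithm of any seminar contribution. The function name `k_medoids`, its parameters, and the toy data are assumptions made for this example.

```python
import random


def k_medoids(dissim, k, n_iter=100, seed=0):
    """Cluster items given only a symmetric pairwise dissimilarity matrix.

    dissim is a list of lists with dissim[i][j] >= 0; the values need not
    come from a Euclidean distance. Returns (medoids, labels).
    """
    n = len(dissim)
    rng = random.Random(seed)
    medoids = rng.sample(range(n), k)  # random data points as initial prototypes

    def assign(current):
        # Each item joins the cluster of its least dissimilar prototype.
        return [min(range(k), key=lambda c: dissim[i][current[c]])
                for i in range(n)]

    for _ in range(n_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old prototype if its cluster became empty.
                new_medoids.append(medoids[c])
                continue
            # The new prototype is the member with the smallest summed
            # dissimilarity to all members of its cluster.
            best = min(members,
                       key=lambda m: sum(dissim[m][j] for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


# Toy data (hypothetical): two well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in points] for a in points]
medoids, labels = k_medoids(D, k=2)
```

Because the prototypes are restricted to actual data points, the dissimilarity measure can be exchanged for a problem-specific one (for example an alignment score) without changing the algorithm.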
A variety of open problems and challenges came up during the week. Before the seminar, the main challenge of similarity-based clustering in medicine and biology was seen as the problem of adapting similarity-based learning to the complex, high-dimensional, and possibly non-Euclidean data structures that occur in these domains. During the discussions a much broader and more subtle picture emerged, identifying the following topics as central issues for clustering:
- Feature extraction
- Cluster evaluation
- Good sampling
Overall, the presentations and discussions revealed that similarity-based clustering is a rapidly evolving field which seems particularly suitable for problems in medicine and biology and which still holds quite a few open problems for researchers, a central one being the formalization of the goals and the implicit regularization of clustering in the context of medicine and biology.
- Michael Biehl (University of Groningen, NL)
- Rainer Breitling (University of Groningen, NL)
- Hans Burkhardt (Universität Freiburg, DE)
- Nestor Caticha (University of São Paulo, BR)
- Eytan Domany (Weizmann Inst. - Rehovot, IL)
- Colin Fyfe (University of the West of Scotland - Paisley, GB)
- Barbara Hammer (TU Clausthal, DE)
- Alexander Hasenfuss (TU Clausthal, DE)
- David Hoyle (Univ. of Manchester, GB)
- Samuel Kaski (Helsinki University of Technology, FI)
- John A. Lee (University of Louvain, BE)
- Yang Li (University of Groningen, NL)
- Thomas Martinetz (Universität Lübeck, DE)
- Erzsébet Merényi (Rice University - Houston, US)
- Johannes Mohr (Bernstein Center for Comp. Neuroscience - Berlin, DE)
- Peter Riegler (Ostfalia Hochschule - Wolfenbüttel, DE)
- Michal Rosen-Zvi (IBM - Haifa, IL)
- Fabrice Rossi (INRIA Rocquencourt, FR)
- David Saad (Aston University - Birmingham, GB)
- Lidia Sanchez-Gonzalez (University of León, ES)
- Craig Saunders (University of Southampton, GB)
- Frank-Michael Schleif (Universität Leipzig, DE)
- Petra Schneider (University of Groningen, NL)
- Udo Seiffert (IPK Gatersleben, DE)
- Joachim Selbig (Universität Potsdam, DE)
- Marc Strickert (IPK Gatersleben, DE)
- David M. J. Tax (TU Delft, NL)
- Peter Tino (University of Birmingham, GB)
- Michel Verleysen (University of Louvain, BE)
- Thomas Villmann (Universität Leipzig, DE)
- Ulrike von Luxburg (MPI für biologische Kybernetik - Tübingen, DE)
- Michael Wilkinson (University of Groningen, NL)
- Aree Witoelar (University of Groningen, NL)
- Dagstuhl Seminar 09081: Similarity-based learning on structures (2009-02-15 - 2009-02-20)
- Soft computing
- Interdisciplinary (medicine/biology)
- Similarity-based clustering and classification
- prototype-based classifiers
- learning vector quantization
- medical diagnosis