Dagstuhl Seminar 12081: Information Visualization, Visual Data Mining and Machine Learning

(Feb 19 – Feb 24, 2012)

Permalink

Please use the following short URL to link to this page: https://www.dagstuhl.de/12081

Organizers

Daniel A. Keim (Universität Konstanz, DE)
Fabrice Rossi (University of Paris I, FR)
Thomas Seidl (RWTH Aachen, DE)
Michel Verleysen (University of Louvain, BE)
Stefan Wrobel (Fraunhofer IAIS, St. Augustin, DE & University of Bonn, DE)

Contact

Simone Schilke (for administrative matters)

Summary

Information visualization and visual data mining leverage the human visual system to provide insight into and understanding of unorganized data. Visualizing data in a way that suits the user's needs proves essential in a number of situations: gaining insight into data before a further, more quantitative analysis; presenting data to a user through a well-chosen table, graph, or other structured representation; relying on the cognitive skills of humans to show them extensive information in a compact way; and so on.

Machine learning enables computers to automatically discover complex patterns in data and, when examples of such patterns are available, to learn from those examples how to recognize occurrences of the patterns in new data. Machine learning has proven quite successful in day-to-day tasks such as spam filtering and optical character recognition.

Both research fields share a focus on data and information, and it might seem at first that the main difference between them is the predominance of visual representations of the data in information visualization, compared to their relatively low presence in machine learning. However, visual representations are used quite systematically in machine learning, for instance to summarize predictive performance, i.e., whether a given system performs well in detecting some pattern. This practice can be traced back to, among other things, a long tradition of statistical graphics. Dimensionality reduction is also a major topic in machine learning: the aim is to describe some data as accurately as possible with a small number of variables rather than with the original, possibly numerous, variables. Principal component analysis is the simplest and best-known example of such a method. In the extreme case where only two or three variables are used, dimensionality reduction is a form of information visualization, as the new variables can be used to display the original data directly.
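
As a rough illustration of the two-variable case, the sketch below projects a small synthetic data set onto its first two principal components using plain Python. It is not taken from the seminar material: the function names, the synthetic data, and the use of power iteration with deflation are all assumptions made for the example, and the resulting pairs are what one would pass to a scatter plot.

```python
import random

def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]

def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(d)] for i in range(d)]

def mat_vec(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def leading_eigenpair(matrix, iterations=1000, seed=0):
    """Power iteration: repeatedly apply the matrix and renormalize."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in matrix]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v

def pca_2d(rows):
    """Return 2-D coordinates and the fraction of variance they retain."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    axes, kept = [], 0.0
    for _ in range(2):
        eigenvalue, axis = leading_eigenpair(cov)
        axes.append(axis)
        kept += eigenvalue
        # Deflation: remove the component just found so that the next
        # power iteration converges to the following component.
        cov = [[cov[i][j] - eigenvalue * axis[i] * axis[j]
                for j in range(d)] for i in range(d)]
    coords = [[sum(x * a for x, a in zip(row, axis)) for axis in axes]
              for row in centered]
    return coords, kept / total_variance

# Hypothetical data: 200 points in 6 dimensions that vary mostly
# along 2 hidden directions, plus a little noise.
rng = random.Random(42)
mixing = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(2)]
data = []
for _ in range(200):
    a, b = rng.gauss(0, 3), rng.gauss(0, 1)
    data.append([a * mixing[0][j] + b * mixing[1][j] + rng.gauss(0, 0.05)
                 for j in range(6)])

coords, retained = pca_2d(data)
print(len(coords), len(coords[0]))  # number of points, number of display variables
print(round(retained, 3))           # fraction of total variance kept in 2-D
```

The retained-variance fraction gives a crude indication of how faithful the two-dimensional display is. In practice a numerical library would typically replace the hand-written eigenvector computation.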

Even if this could be seen as an oversimplification, one could consider that machine learning tends to provide scalability through automated methods based on the optimization of some ad hoc quality measure, while information visualization tends to rely on the user to direct the summarizing process, using suitable interactive techniques. As a result, the two fields remain quite isolated, despite some well-known contact points such as the Self-Organizing Map and Multidimensional Scaling.

The main difference between the two fields is the role of the user in data exploration and modeling. The ultimate goal of machine learning is, in a sense, to get rid of the user: everything should be completely automated and done by a computer. While the user could still play a role by, e.g., choosing the data description or the type of algorithm to use, their influence should be kept to a strict minimum. Information visualization puts forward quite the opposite point of view, as visual representations are designed to be leveraged by a human to extract knowledge from the data. Patterns are discovered by the user, models are adjusted to the data under user steering, etc.

The seminar was organized in this context with the specific goal of bringing together researchers from both communities in order to tighten the loose links between them.

It became clear that a large effort is still needed at the algorithmic and software levels. First, fast machine learning techniques are needed that can be embedded in interactive visualization systems. Second, there is a need for a standard software environment that can be used in both communities. The unavailability of such a system hurts research to some extent, as some actively used software environments in one field do not include even basic facilities from the other. One typical example is the R statistical environment, in which a large part of machine learning research is conducted and whose interactive visualization capabilities are limited, in particular in comparison to the state-of-the-art static visualization capabilities. One possible solution foreseen at the seminar was the development of a dynamic data sharing standard that can be implemented in several software environments, allowing fast communication between those environments and facilitating software reuse.

Judging by the liveliness of the discussions and the number of joint research projects proposed at the end of the seminar, this meeting between the machine learning and information visualization communities was much needed. The flexible format of Dagstuhl seminars is perfectly suited to this type of meeting, and the only frustration perceptible at the end of the week was that it had indeed reached its end. It was clear that researchers from the two communities were starting to understand each other and were eager to share more thoughts and actually start working on joint projects. This calls for further seminars ...

More information about the Dagstuhl seminar can be found at http://www.dagstuhl.de/12081.

Participants

Daniel Archambault (University College Dublin, IE)
Michael Aupetit (Commissariat à l'Énergie Atomique - Gif-sur-Yvette, FR)
Michael Biehl (University of Groningen, NL)
Kerstin Bunte (Universität Bielefeld, DE)
Etienne Côme (IFSTTAR - Noisy le Grand, FR)
Di Cook (Iowa State University - Ames, US)
Jean-Daniel Fekete (University of Paris South XI, FR)
Brian D. Fisher (Simon Fraser University - Surrey, CA)
Ksenia Genova (TU Dresden, DE)
Andrej Gisbrecht (Universität Bielefeld, DE)
Hans Hagen (TU Kaiserslautern, DE)
Barbara Hammer (Universität Bielefeld, DE)
Helwig Hauser (University of Bergen, NO)
Nathalie Henry Riche (Microsoft Corporation - Redmond, US)
Heike Hofmann (Iowa State University - Ames, US)
Ata Kaban (University of Birmingham, GB)
Samuel Kaski (Aalto University, FI & University of Helsinki, FI)
Johannes Kehrer (VRVis - Wien, AT)
Daniel A. Keim (Universität Konstanz, DE)
Andreas Kerren (Linnaeus University - Växjö, SE)
Jörn Kohlhammer (Fraunhofer Institut - Darmstadt, DE)
Bongshin Lee (Microsoft Corporation - Redmond, US)
John Aldo Lee (University of Louvain, BE)
Marcus A. Magnor (TU Braunschweig, DE)
Florian Mansmann (Universität Konstanz, DE)
Ian Nabney (Aston University - Birmingham, GB)
Jaakko Peltonen (Aalto University, FI)
Gabriele Peters (FernUniversität in Hagen, DE)
Fabrice Rossi (University of Paris I, FR)
Frank-Michael Schleif (Universität Bielefeld, DE)
Tobias Schreck (Universität Konstanz, DE)
Marc Strickert (Universität Marburg, DE)
Holger Theisel (Universität Magdeburg, DE)
Peter Tino (University of Birmingham, GB)
Laurens van der Maaten (TU Delft, NL)
Jarke J. van Wijk (TU Eindhoven, NL)
Michel Verleysen (University of Louvain, BE)
Nathalie Villa-Vialaneix (University of Paris I, FR)
Thomas Villmann (Hochschule Mittweida, DE)
Daniel Weiskopf (Universität Stuttgart, DE)
Hadley Wickham (Rice University - Houston, US)
Leishi Zhang (Universität Konstanz, DE)

Classification

information visualization
machine learning
computer graphics
information retrieval
soft computing

Keywords

Information visualization
machine learning
nonlinear dimensionality reduction
exploratory data analysis
