Dagstuhl Seminar 17332
Scalable Set Visualizations
( Aug 13 – Aug 18, 2017 )
- Yifan Hu (Yahoo! Research - New York, US)
- Luana Micallef (Aalto University, FI)
- Martin Nöllenburg (TU Wien, AT)
- Peter Rodgers (University of Kent - Canterbury, GB)
- Susanne Bach-Bernhard (for administrative matters)
- Euler diagrams drawn with ellipses area-proportionally (Edeap) : article - Wybrow, Michael; Rodgers, Peter J.; Dib, Fadi K. - Berlin : Springer, 2021. - 27 pp. - (BMC Bioinformatics ; 22. 2021: 214).
- Short Plane Supports for Spatial Hypergraphs - Castermans, Thom; Garderen, Mereke van; Meulemans, Wouter; Nöllenburg, Martin; Yuan, Xiaoru - Cornell University : arXiv.org, 2018. - 23 pp.
- Short Plane Supports for Spatial Hypergraphs : article - Castermans, Thom; Garderen, Mereke van; Meulemans, Wouter; Nöllenburg, Martin; Yuan, Xiaoru - jgaa.info, 2019. - pp. 463-498 - (Journal of Graph Algorithms and Applications ; 23. 2019, 3 : article).
- Short Plane Supports for Spatial Hypergraphs : article in LNCS 11282: Graph Drawing and Network Visualization : GD 2018 - Castermans, Thom; Garderen, Mereke van; Meulemans, Wouter; Nöllenburg, Martin; Yuan, Xiaoru - Berlin : Springer, 2018. - pp. 53-66 - (Lecture notes in computer science ; 11282 : article).
- Using the Metro-Map Metaphor for Drawing Hypergraphs : article in SOFSEM 2021: Theory and Practice of Computer Science - Frank, Fabian; Kaufmann, Michael; Kobourov, Stephen G.; Mchedlidze, Tamara; Pupyrev, Sergey; Ueckerdt, Torsten; Wolff, Alexander - Berlin : Springer, 2021. - pp. 361-372 - (Lecture notes in computer science ; 12607 : article).
Sets are a fundamental way of organizing data. From mathematical logic to the results of automated gene analysis, sets are used because they intuitively represent the way data is structured. Information visualization is key to gaining insight into data, as the human perceptual system is an analytic system of enormous power. As a result, there has been a recent proliferation of automated visualization methods for set-based data. An important goal for researchers in set visualization is to develop visually and computationally scalable methods to address the challenge of interpreting large data sets.
The goal of this seminar is to bring together researchers with different backgrounds but a shared interest in set visualization. It will involve computer scientists with expertise, e.g., in visualization, algorithms, and human-computer interaction, but also users of set visualizations from domains outside computer science. Despite the large number of set visualization techniques, for which there is often a considerable practical and theoretical understanding of their capabilities, there has only been limited success in scaling these methods. This seminar starts with a few overview talks on the state of the art in set visualization, but then focuses on small hands-on working groups during most of the seminar week. We aim to accelerate the efforts to improve scalability of set visualizations by addressing open questions proposed by the seminar attendees, in order to produce concrete research outcomes, including new set visualization software and peer-reviewed research publications.
Topics to be addressed include:
Algorithms. Effective and efficient algorithms are required to generate accurate and comprehensible visualizations of large set-based data, and to allow smooth interaction with the data for analysis and exploration. Various algorithms have been devised, but more work is needed in areas including:
- Scaling overlay techniques;
- Exploring colouring and defragmentation in map-based techniques;
- Developing new interaction techniques.
Theory. Theoretical findings on the drawability and readability of set visualizations help in selecting an appropriate technique and optimizing its visual design and layout. Findings are needed in subjects including:
- Limits on the drawability of different set visualization techniques, starting off from Venn and Euler diagrams;
- Measures and models predicting the readability of layouts as the number of sets and their intersections increase;
- Measures and computational methods predicting user performance in completing data analysis tasks for different set visualization techniques with a large number of sets and set intersections.
Evaluation. Both laboratory experiments and crowd-sourced experiments have been used to evaluate the effectiveness of set visualizations. Methodologies for evaluating set-based visualizations will be explored, including:
- Defining relevant tasks and mapping data and task to visualization;
- Effective use of perceptual, cognitive and HCI theories;
- Evaluation of new visual metrics quantifying the effectiveness of a set visualization for a specific task and user traits.
Application Areas and Users. Set visualizations are designed both to communicate key aspects of information and to derive understanding from data. Hence, the visualizations are typically targeted at non-computer scientists, and engagement with users is crucial. Example user-focused topics include:
- Examining improved visualizations for biosciences data;
- Developing visualization techniques for social network and security applications;
- Generating more accurate visualizations for the results of medical studies.
- Verborgene Muster aufdecken: die Kunst, Mengen zu zeigen
Article about this seminar published in the "Saarbrücker Zeitung" on September 11, 2017 (in German)
- Exposing Hidden Patterns – The Art of Set Visualization
Press release in English
- Verborgene Muster aufdecken – Die Kunst, Mengen zu veranschaulichen
Press release in German
Sets are a fundamental way of organizing information. Visualizing set-based data is crucial in gaining understanding of it as the human perceptual system is an analytic system of enormous power. The number of different set visualization methods has increased rapidly in recent years, and they vary widely in the visual metaphors used and set-related tasks that they support. Large volume set-based data can now be found in diverse application areas such as social networks, biosciences and security analysis. At present, set visualization methods lack the facility to provide analysts with the visual tools needed to successfully interpret large-scale set data as the scalability of existing metaphors and methods is limited. This seminar provided a forum for set visualization researchers and application users to discuss how this challenge could be addressed.
Existing set visualizations can be grouped into several families of techniques, including traditional Euler and Venn diagrams, but also node-link diagrams, map- and overlay-based representations, and matrix-based visualizations. Inevitably, the approaches taken to drawing these visualizations are diverse; for example, node-link diagrams require graph drawing methods, whereas overlay techniques use algorithms from computational geometry. However, they are similar in a number of aspects. One aspect is the underlying set theory. For instance, theoretical results on the drawability of many of these set visualization techniques for different data characteristics are possible (as already obtained, for example, in Venn and Euler diagram research). Another common aspect is that the visualizations are typically focused on an end-user, so perceptual, cognitive and evaluation considerations are an important concern.
A particularly pressing issue in set visualization is that of scaling representations. The number of data items can be large and many methods aggregate individual items. Yet, even using aggregation, the limit of the most scalable of these methods is considered to be in the region of 100 sets. Typical application areas that make use of sets include social networks, biosciences and security analysis. In these applications, there may be many millions of data items in thousands of sets. Other applications have high-dimensional data, where each item is associated with a large number of variables, which poses different scalability challenges for set visualizations.
A distinct feature of set visualization is that visualizations must support set-related, element-related, and attribute-related analysis tasks that involve, e.g., visually evaluating containment relations, cardinalities, unions, intersections, or set differences. For example, bioscience microarray experiments classify large numbers of genes, and multiple visualization tools have been developed to visualize this data. However, current efforts can only visualize small sections of the information at once. Similar scalability challenges for set visualizations appear in many other applications as well. Hence, developing effective visualization methods for large set-based data would greatly facilitate analysis of such data in a number of important application areas.
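As a rough illustration of the set-related tasks named above, the following minimal Python sketch computes cardinalities, a containment check, a union, an intersection, a set difference, and the exclusive zones that an Euler-style diagram would have to depict. The gene identifiers, the set names A, B, C, and the helper names `set_tasks` and `zones` are hypothetical and chosen only for this example; they do not come from any tool discussed at the seminar.

```python
# Hypothetical gene sets; names and memberships are made up for illustration.
sets = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g3", "g4", "g5"},
    "C": {"g4"},
}


def set_tasks(sets):
    """Answer a few typical set-related questions for the sets A, B and C."""
    a, b, c = sets["A"], sets["B"], sets["C"]
    return {
        "cardinalities": {name: len(members) for name, members in sets.items()},
        "c_contained_in_a": c <= a,   # containment: is C a subset of A?
        "union_a_b": a | b,           # elements in A or B
        "intersection_a_b": a & b,    # elements in both A and B
        "difference_a_b": a - b,      # elements in A but not in B
    }


def zones(sets):
    """Group elements by the exact combination of sets they belong to."""
    result = {}
    for element in set().union(*sets.values()):
        signature = frozenset(name for name, members in sets.items()
                              if element in members)
        result.setdefault(signature, set()).add(element)
    return result


tasks = set_tasks(sets)
zone_map = zones(sets)
```

Even in this toy case three sets yield four non-empty zones; the number of possible zones grows as 2^n - 1 with the number n of sets, which is one way to see why the scalability of zone-based representations is limited.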
The goal of this seminar was to bring together researchers with different backgrounds but a shared interest in set visualization. It involved computer scientists with expertise, e.g., in visualization, algorithms, and human-computer interaction, but also users of set visualizations from domains outside computer science. Despite the large number of set visualization techniques, for which there is often a considerable practical and theoretical understanding of their capabilities, there has only been limited success in scaling these methods. Thus the intended focus of this seminar was to discuss and study specific research challenges for scalable set visualizations concerning fundamental theory, algorithms, evaluation, applications, and users. We started with a few overview talks on the state of the art in set visualization, but then focused on small hands-on working groups during most of the seminar week. We aimed to accelerate the efforts to improve scalability of set visualizations by addressing open questions proposed by the seminar attendees, in order to produce concrete research outcomes, including new set visualization software and peer-reviewed research publications.
- On the first two days of the seminar we enjoyed five invited overview lectures on different aspects of set visualizations. The topics and speakers were chosen so as to create a joint understanding of the state of the art of set visualization techniques, evaluations and applications. Silvia Miksch gave a systematic overview of set visualization techniques, grouped by types of visual representations and tasks, with a special focus on set visual analytics. Martin Krzywinski reported on his experience of using visual analogies for showing set-based data in the area of genomics. Sara Fabrikant took a cartographer's view on visualizing sets and explained how successful cartographic maps work as information displays by taking not only the design but also the context and the user into account. Stephen Kobourov explained how large graph-based set data can be represented using a familiar map metaphor by showing several interesting data sets and their map representations. Finally, John Howse presented how set visualizations can be used as diagrammatic reasoning systems in logic.
- In the open problem session on the first day of the seminar we collected a list of 13 open research problems that were contributed by the seminar participants. In a preference voting we determined the five topics that raised the most interest among the participants and formed small working groups around them. During the following days the groups worked by themselves, except for a few plenary reporting sessions, formalizing and solving their respective theoretical and practical challenges. Below is a list of the working group topics; more detailed group reports are found in Section 4.
- Mapifying the genome: Can the axis of the entire genome be mapped on a 2-dimensional space based on gene function rather than a 1-dimensional line based on gene position?
- Area-proportional Euler diagrams with ellipses: Can the use of ellipses extend the size of data that can be drawn with area-proportional Euler diagrams?
- Spatially informative set visualizations: Can we improve spatial overlay-based set visualizations when allowing some limited displacement of the given set positions?
- Set visualization using the metro map metaphor: How and under which conditions can the metro map metaphor be used to visualize set systems?
- Visual analytics of sets/set-typed data and time: challenges and opportunities: What are the main research challenges and opportunities in the context of set visualizations that change over time and how can these be structured?
- We had a flexible working schedule with a short plenary session every morning to accommodate group reports and impromptu presentations by participants. In two of these sessions, Wouter Meulemans and Nan Cao shared their recent results related to set visualization.
- During the week we encouraged participants to come up with suggestions for further strengthening this growing community of set visualization researchers. In a plenary session on Friday we collected and structured these ideas and started planning future events related to set visualizations, see Section 1.
During the entire seminar, participants actively discussed how to disseminate, proliferate and promote scalable set visualization research in diverse specific areas, such as: set theory and diagrammatic reasoning; algorithms and graph theory; information visualization and visual analytics; evaluation, users and application areas. This led to the following concrete future milestones, each of which is being coordinated by volunteer seminar participants:
- Diagrams Workshop in 2018 on Set Visualization and Reasoning (SetVR)
The workshop aims at promoting set visualization to the Diagrams community, of which well-renowned mathematicians and logicians are members, thus proliferating relevant set theory and diagrammatic reasoning research;
- IEEE VIS Workshop in 2019 on Set Visualization and Analytics (SetVA)
The workshop aims at promoting set visualization to the Information Visualization and Visual Analytics communities, at the premier forum for advances in information and scientific visualization, with the aim of generating new visualization and analytic techniques to handle large set-typed data;
- Dagstuhl seminar in 2019 on Set Visualization and Analytics (SetVA) over Time and Space
This seminar has revealed, for the first time, the need for visualization and analytic techniques for set-typed data that has an element of time and/or space; thus a follow-up Dagstuhl seminar will be organized to discuss this topic, once again among researchers with diverse set visualization backgrounds;
- Set Visualization Workshop in 2020 in the Computational Geometry Week or collocated with Graph Drawing
This workshop aims at disseminating set visualization to a more algorithmic and computational geometry research community, to ensure the production of effective, yet efficient and scalable set visualization algorithms;
- Set Visualization browser, like http://setviz.net
This browser will collect and disseminate available set visualization techniques, making them easily accessible through various categorizations, such as the type of data analysis tasks or application areas they target;
- Set Visualization book
The book would serve as a guide for researchers who are new to set visualization and as a review of the current state of the art of set-typed data visualization in the various related domains.
We decided to have an annual set visualization workshop that each year focuses on one of (i) diagrammatic reasoning and logic, (ii) information visualization and visual analytics, and (iii) computational geometry and graph drawing, at premier venues of the respective research communities, to generate further research interest in all of these three diverse areas that are all important for scalable set visualizations.
According to the Dagstuhl survey conducted after the seminar, as well as informal feedback to the organizers, the seminar was highly appreciated. In particular, the small group size, the group composition, and the seminar structure focusing on hands-on working groups were very well received. The seminar's goal of identifying and initiating collaboration on new research challenges was met very successfully (also in comparison to other Dagstuhl seminars), as the participants rated the seminar highly for inspiring new research directions, joint projects and joint publications. We are looking forward to seeing the first scientific outcomes of the seminar in the near future and to continuing the efforts to support the growth of the set visualization community.
Schloss Dagstuhl was the perfect place for hosting a seminar like this. The unique scientific atmosphere and the historic building provided not only all the room we needed for our program and the working groups, but also plenty of opportunities for continued discussions and socializing outside the official program. On behalf of all participants, the organizers want to express their deep gratitude to the entire Dagstuhl staff for their outstanding support and service accompanying this seminar. We further thank Tamara Mchedlidze for helping us collect the contributions and prepare this report.
- Peter Rodgers. The State-of-the-Art of Set Visualization. Computer Graphics Forum, 35(1):234–260, 2015.
- Sebastian Behrens and Hans A Kestler. Using VennMaster to evaluate and analyse shRNA data. Ulmer Informatik-Berichte, page 8, 2013.
- Daniel Archambault (Swansea University, GB)
- Robert Baker (University of Kent - Canterbury, GB)
- Kerstin Bunte (University of Groningen, NL)
- Nan Cao (Tongji University - Shanghai, CN)
- Thom Castermans (TU Eindhoven, NL)
- Fadi Dib (Gulf University for Science&Technology - Mishreff, KW)
- Sara Irina Fabrikant (Universität Zürich, CH)
- John Howse (University of Brighton, GB)
- Yifan Hu (Yahoo! Research - New York, US)
- Radu Jianu (City, University of London, GB)
- Michael Kaufmann (Universität Tübingen, DE)
- Andreas Kerren (Linnaeus University - Växjö, SE)
- Stephen G. Kobourov (University of Arizona - Tucson, US)
- Martin Krzywinski (BC Cancer Research Centre - Vancouver, CA)
- Tamara Mchedlidze (KIT - Karlsruher Institut für Technologie, DE)
- Wouter Meulemans (TU Eindhoven, NL)
- Luana Micallef (Aalto University, FI)
- Silvia Miksch (TU Wien, AT)
- Martin Nöllenburg (TU Wien, AT)
- Sergey Pupyrev (Facebook - Menlo Park, US)
- Peter Rodgers (University of Kent - Canterbury, GB)
- Mereke van Garderen (Universität Konstanz, DE)
- Alexander Wolff (Universität Würzburg, DE)
- Hsiang-Yun Wu (TU Wien, AT)
- Michael Wybrow (Monash University - Caulfield, AU)
- Xiaoru Yuan (Peking University, CN)
- computer graphics / computer vision
- data structures / algorithms / complexity
- society / human-computer interaction
- information visualization
- set visualization
- visual analytics
- cognition and graphical perception
- geometric algorithms