http://www.dagstuhl.de/16061

February 7 – 12, 2016, Dagstuhl Seminar 16061

Data-Driven Storytelling

Organizers

Sheelagh Carpendale (University of Calgary, CA)
Nicholas Diakopoulos (University of Maryland – College Park, US)
Nathalie Henry Riche (Microsoft Research – Redmond, US)
Christophe Hurter (ENAC – Toulouse, FR)


Documents

Dagstuhl Report, Volume 6, Issue 2

Summary

Data visualization is the "use of computer-supported, interactive, visual representations of data to amplify cognition" [5]. Visualization can play a crucial role in exploring data and in communicating information, as "a picture is worth a thousand words". Early research in this field focused on producing static images and quantifying the perception of different visual encodings [6] in these visual representations. The vast majority of research since then has focused on designing and implementing novel interfaces and interactive techniques to enable data exploration. Major advances in visual analytics and big data initiatives have concentrated on integrating machine learning and analysis methods with visual representations to enable powerful exploratory analysis and data mining [10]. As interactive visualizations have come to play an increasing role in data analysis scenarios, they have also started to appear as a powerful vector for communicating information. Stories supported by facts extracted from data analysis proliferate in many different forms, from animated infographics and videos [2] to interactive online visualizations on news media outlets. We argue that it is now time for the visualization research community to understand how these powerful interactive visualizations play a role in communicating information. We define this line of research as data-driven storytelling.

The popularity of JavaScript web technology and the availability of the D3 toolkit [3] have enabled a wider range of people to create data visualizations. Being able to easily share interactive data visualizations on the web has also furthered the democratization of interactive visualizations. Coupled with the emphasis on data science, these advances have given rise to new practices such as data journalism. Data journalists gather and explore available datasets to extract relevant insights, often conveying their stories via interactive data visualizations [1,9]. The popularity of data-driven stories, especially in the New York Times, has revealed the potential of interactive visualizations as a powerful communication tool [7].

Central to our vision of the convening was that the vast majority of research on data visualization to date has focused on designing and implementing novel interfaces and interactive techniques to enable data exploration. Major advances in visual analytics and big data initiatives have concentrated on integrating machine learning and analysis methods with visual representations to enable powerful exploratory analysis and data mining. But just as interactive visualization plays an important role in data analysis scenarios, it is also becoming increasingly important in structuring the communication and conveyance of insights and stories in a compelling format. Visual data-driven stories have proliferated in many different forms, from talks [8], to animated infographics and videos [1, 9, 7], to interactive online visualizations.

Data-driven storytelling is also compelling for a wide range of applications. In enterprise scenarios, the output of data analysis (often reports and slide-based presentations) has to be conveyed to decision makers. In scientific research, interactive visualizations are increasingly used to convey data-driven discoveries to peers or to communicate complex findings to a broader audience. In education scenarios, interactive visualizations are used by teachers to explain mathematical concepts or to illustrate biological or physical mechanisms. Many questions arise as interactive visualizations are used beyond data exploration by experts, for communication purposes to a broader audience. Research on the understanding of static images in cognitive psychology and perception must be extended to encompass more advanced techniques (videos and interactive applications). Visualization literacy, defined as the ability to extract, interpret, and make meaning from information presented in the form of an (interactive) data visualization, is also a crucial component of data-driven storytelling research. Assessing the visualization literacy of an audience and developing techniques to better teach how to decode interactive visualizations have started to attract the attention of our research community [4]. However, a plethora of research remains to be done. For example, research on how visualizations can lie [11], or at least how they may introduce bias in the reader’s mind, has focused on static visual representations but has not yet been extended to other media. Similarly, it is crucial for advancing research in visualization to assess the role data-driven storytelling can play in easing the comprehension of messages or in increasing their memorability.

The visualization research community needs to reflect on data-driven storytelling and to develop a research agenda to investigate how advanced data-driven stories are understood by the audience, and to identify factors that make them compelling as well as factors that can introduce bias in their perception. By learning strategies for crafting successful stories from master storytellers in other fields (journalism, design, art and education), our community will be able to reflect on these questions and eventually build novel consumption tools that engage a broad audience while minimizing perception bias, as well as novel authoring tools for crafting high-quality data-driven stories.

One domain where there has been extensive and practical progress on the question of data-driven storytelling is data journalism. News sites like FiveThirtyEight or the New York Times’ The Upshot have seen a recent surge of attention and interest as a means of communicating data-driven news to the public. By carefully structuring the information and integrating explanation to guide the consumer, journalists help lead users toward a valid interpretation of the underlying data. Because of the rapid and practical progress of data-driven storytelling in the domain of journalism, our seminar sought to put some of the top practitioners from that field together with computer science researchers to discuss the challenges and opportunities of data-driven communication.

The Dagstuhl seminar was structured to leverage the interdisciplinarity of the attendees by first tapping into a divergent design thinking process meant to enumerate the range of issues that are relevant to data-driven stories. Hundreds of index cards and sticky notes were sacrificed as participants generated ideas (see Figure 1).

We then clustered these ideas to arrive at a set of key themes, including:

  • Techniques and Design Choices for Storytelling
  • Exploration and Explanation
  • From Analysis to Communication
  • Audience
  • Evaluation
  • Devices and Gadgets
  • Ethics

Groups of participants formed around common interests, and each of these major themes was then the focus of discussion. Each work group was geared towards developing an outline and plan to produce a written chapter for a forthcoming edited book on the topic of data-driven storytelling. Some groups met for a day or two and then re-formed around other topics, whereas other groups spent the entire week going deep in exploring a single topic. And as if the daytime activities weren’t enough, additional evening breakout groups formed around further topics of interest like Education in Data Visualization, Urban Visualization, and the Technology Stack for data-driven stories.

In between the intense small-group sessions, the entire group came together daily for five-minute lightning talks on a wide array of relevant topics. These stimulating talks primed the group for approaching data-driven storytelling from different perspectives and were an entertaining and informative way to share creative ideas or results in small and easily digestible nuggets. Among the more than 25 lightning talks, topics ranged from storytelling with timelines, to mobile visualization, the use of data comics, visual literacy, affect and color, data-story design workflows, and even the visualization of data through cuisine.

Outcomes

Our initial goal for the seminar was to have groups work intensively on their chosen topic(s) so that an outline and workplan could be developed to write a contributing chapter to a book on data-driven storytelling. The book is underway and will have contributions on each of the main themes outlined above, as well as an introductory chapter by the editors / organizers of the Dagstuhl seminar. Moreover, our creative contributors at the seminar produced other outputs as well: curated lists of example data-driven stories and of storytelling techniques were created and will be published online, and a blog has pulled together some of the formative impressions of participants (https://medium.com/data-driven-storytelling).

Below we briefly summarize the expected contents of each of the chapters that will form the book.

Techniques and Design Choices for Storytelling

This chapter will discuss techniques and design choices for visual storytelling grounded in a survey of over 60 examples collected from various online news sources and from award-winning visualization and infographic design work. These design choices represent a middle ground between low-level visualization and interaction techniques and high-level narrative devices or structures. The chapter will define several classes of design choices: embellishment, explanation, exploration, navigation, story presentation, emphasis, focus, and annotation. Examples from the survey for each class of design choices will be provided. Finally, several case studies of examples from the survey that make use of multiple design choices will be developed.

Exploration and Explanation in Data-Driven Stories

This chapter will explore the differences between, and the integration of, exploration and explanation in visual data-driven storytelling. Exploratory visualizations allow for a lot of freedom, which can include changing the visual representation, the focus of what is being shown, and the sequence in which the data is viewed. They allow readers to find their own stories in the data. Explanatory stories include a focused message, which is usually narrower and guides the reader, often in a linear way. Advantages and disadvantages of exploration and explanation, as well as dimensions that help to describe and classify data-driven stories, will be developed. The space is described by identifying freedom, guidance regarding representation, focus and sequence, as well as interpretation as important dimensions of data-driven storytelling, and existing systems are characterized along these dimensions. Recommendations will be developed for how to integrate both aspects of exploration and explanation in data-driven stories.

From Analysis to Communication: Supporting the Lifecycle of a Story

This chapter will explore how tools can better support the authoring of rich and custom data stories with natural / seamless workflows. The aim is to understand the roles and limitations of analysis / authoring tools within current workflow practices and use these insights to suggest opportunities for future research and design. First, the chapter will report a summary of interviews with practitioners at the Dagstuhl seminar; these interviews aim to understand current workflow practices for analysis and authoring, the tools used to support those practices, and pain points in those processes. Then the chapter will reflect on design implications that may improve tool support for the authoring process as well as research opportunities related to such tool support. A strong theme is the interplay between analytical and communicative phases during both creation and consumption of data-driven stories.

The Audience for Data-Driven Stories

Creators of data-driven visual stories want to be as effective as possible in communicating their message. By carefully considering the needs of their audience, content creators can help their readers better understand their content. This chapter will describe four separate characteristics of audience that creators should consider: expertise and familiarity with the topic, the medium, data, and data visualization; expectations about how and what the story will deliver; how the reader uses the interface such as reading, scrolling, or other interactivity; and demographic characteristics of the audience such as age, gender, education, and location. This chapter will discuss how these audience goals match the goals of the creator, be it to inform, persuade, educate, or entertain. Then it will discuss certain risks creators should recognize, such as confusing or offending the reader, or using unfamiliar jargon or technological interfaces. Case studies from a variety of fields including research, media, and government organizations will be presented.

Evaluating Data-Driven Storytelling

The study of data-driven storytelling requires specific guidelines, metrics, and methodologies reflecting its different complex aspects. Evaluation is essential not only for researchers to learn about the quality of data-driven storytelling but also for editorial rooms in media and enterprises to justify the resources required for gathering, analyzing, and presenting data. A framework will be presented that takes into account the different perspectives of author, audience, and publisher and their corresponding criteria. Furthermore, it connects them with methods and metrics to provide a roadmap for what to measure, and how, to determine whether the resulting data-driven stories met their goals. In addition, the chapter will explore and define the constraints that might limit the available metrics and methods, making it difficult to reach the goals.

Devices and Gadgets for Data Storytelling

This chapter will discuss the role of different hardware devices and media in visual data-driven storytelling. The different form factors offer different affordances for data storytelling, affecting their suitability for different data storytelling settings. For example, wall displays are well suited to synchronous co-located presentation, while watches and virtual reality headsets work better for personal consumption of pre-authored data stories.

Ethics in Data-Driven Visual Storytelling

Is the sample representative? Have we thought of the bias of whoever collected or aggregated the data? Can we extract a certain conclusion from the dataset? Is it implying something the data doesn’t cover? Does the visual device, the interaction, or the animation affect the interpretation that the audience can have of the story? These are questions that anyone who has produced or edited a data-driven visual story has, or at least should have, been confronted with. After introducing the space, and the reasons and implications of ethics in this space, this chapter will look at the risks, caveats, and considerations at every step of the process, from the collection/acquisition of the data, to the analysis, presentation, and publication. Each point will be supported by an example of a successful or flawed ethical consideration.

Conclusion

The main objective of this Dagstuhl seminar was to develop an interdisciplinary research agenda around data-driven storytelling as we seek to develop generalizable findings and tools to support the use of visualization in communicating information. Productive group work converged to delineate several research opportunities moving forward:

  • The need for interfaces that enable the fluid movement between exploratory and communicative visualization so that storytelling workflow is seamless and powerful.
  • The need to develop typologies of visual storytelling techniques and structures used in practice so that opportunities for supporting these techniques can be sought through computing approaches.
  • The need to develop evaluation frameworks that can assess storytelling techniques and tools both scientifically and critically.
  • The need for design frameworks that can guide the structure of visual information for experiences across different output devices, both existing and future.
  • The need to understand the audience and their role in co-constructing meaning with the author of a data-driven story.
  • The need for ethical frameworks that should guide tool development for visual data-driven communication.

These opportunities were productively enumerated at the Dagstuhl seminar and are in the process of being written up as chapters in our book on data-driven storytelling.

References:

  1. All the medalists: Men’s 100-meter freestyle: Racing against history. Web, 2012. http://www.nytimes.com/interactive/2012/08/01/sports/olympics/racing-against-history.html?_r=1&.
  2. Fereshteh Amini, Nathalie Henry Riche, Bongshin Lee, Christophe Hurter, and Pourang Irani. Understanding data videos: Looking at narrative visualization through the cinematography lens. In Proceedings of CHI: SIGCHI Conference on Human Factors in Computing Systems, pages 1459–1468, New York, NY, USA, 2015. ACM.
  3. Michael Bostock, Vadim Ogievetsky, and Jeffrey Heer. D3 data-driven documents. IEEE Transactions on Visualization and Computer Graphics, 17(12):2301–2309, December 2011.
  4. Jeremy Boy, Sara Johansson Fernstad, Martin Turner, Simon Walton, David Ebert, Jean-Daniel Fekete, Andy Kirk, and Mario Romeo. Eurovis 2014 workshop: Towards visualization literacy. visualization literacy workshop. EuroVis 2014, 2014. https://www.kth.se/profile/178785/page/eurovis-2014-workshop-towards-visualiza/.
  5. Stuart K. Card, Jock D. Mackinlay, and Ben Shneiderman. Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1999.
  6. William C. Cleveland and Marylyn E. McGill. Dynamic Graphics for Statistics (1st ed.). CRC Press, Inc., Boca Raton, FL, USA, 1999.
  7. Instance of data storytelling to explain the subprime mortgage crisis in the USA. Web, 2014. http://www.bloomberg.com/dataview/2014-02-25/bubble-to-bust-to-recovery.html.
  8. Hans Rosling. The best stats you’ve ever seen. TED Talks, 2006. https://youtu.be/hVimVzgtD6w.
  9. The wealth report 2013: Examining high net worth individuals from around the world. Web, 2013. http://www.elitehavenssales.com/news/47/The-Wealth-Report-2013-Examining-High-Net-Worth-Individuals-from-around-the-world.
  10. J. J. Thomas and K. A. Cook. Illuminating the Path: The Research and Development Agenda for Visual Analytics. IEEE CS Press, USA, 2005.
  11. Edward Tufte. The Visual Display of Quantitative Information, Second Edition. Graphics Press, USA, 1991.

License
  Creative Commons BY 3.0 Unported license
  Sheelagh Carpendale and Nicholas Diakopoulos and Nathalie Henry Riche and Christophe Hurter

Classification

  • Computer Graphics / Computer Vision
  • Multimedia

Keywords

  • Information Visualization
  • Storytelling
  • Visual literacy
  • Data journalism
  • Personal visualization
