https://www.dagstuhl.de/19282

July 7 – 12, 2019, Dagstuhl Seminar 19282

Data Series Management

Organizers

Anthony Bagnall (University of East Anglia – Norwich, GB)
Richard L. Cole (Tableau Software – Palo Alto, US)
Themis Palpanas (Paris Descartes University, FR)
Konstantinos Zoumpatianos (Harvard University – Cambridge, US)

Motivation

Users across many domains now show strong interest in data series (a.k.a. time series) management. It is not unusual for industrial applications that produce data series to involve sequences (or subsequences) numbering in the order of billions (i.e., multiple TBs). As a result, analysts are unable to handle the vast amounts of data series that they have to manage and process. The goal of this Dagstuhl Seminar is to enable researchers and practitioners to exchange ideas and foster collaborations on the topic of data series management, and to identify the corresponding open research directions. The main questions to be answered are the following: i) What are the data series management needs across various domains, and what are the shortcomings of current systems? ii) How can we use machine learning to optimize our current data systems, and how can these systems help in machine learning pipelines? iii) How can visual analytics assist the process of analyzing big data series collections?

The seminar will focus on the following key topics related to data series management:

  1. Data series storage and access patterns: We will describe some of the existing (academic and commercial) systems for managing data series, compare their differences, and comment on their evolution over time. We will try to answer the following questions: What are their shortcomings? What are the best ways to lay out data series on disk and in memory to optimize data series queries? How can we integrate domain-specific summarizations/indexes and compression schemes into existing systems?
  2. Query optimization: One of the most important open problems in data series management is query optimization. In particular, there has been no work on estimating the hardness/selectivity of data series similarity search queries, which is of paramount importance for effective access path selection. During the seminar we will discuss the current work on the topic and examine future research directions.
  3. Machine learning and data mining for data series: Recent developments in deep neural network architectures have sparked intense interest in the interactions between machine learning algorithms and data series management. We will discuss machine learning from two perspectives. First, we will discuss machine learning techniques for data series analysis tasks, as well as for tuning data series management systems. Second, we will discuss how data series management systems can help with the scalability of machine learning pipelines.
  4. Visualization for data series exploration: There are several research problems at the intersection of visualization and data series management. Existing data series visualization and human interaction techniques only consider very small datasets, yet they can play a significant role in the tasks of similarity search, analysis, and exploration of very large data series collections. We will discuss promising directions for addressing these problems, related to both the frontend and the backend.
  5. Applications in multiple domains: We will discuss applications and requirements originating from various fields, including astrophysics, neuroscience, engineering, and operations management. The goal will be to allow scientists and practitioners to exchange ideas, foster collaborations, and develop a common terminology.

Motivation text license
  Creative Commons BY 3.0 DE
  Anthony Bagnall, Richard L. Cole, Themis Palpanas, and Konstantinos Zoumpatianos

Classification

  • Data Bases / Information Retrieval
  • Data Structures / Algorithms / Complexity

Keywords

  • Sequences
  • Time series
  • Data series analytics
  • Machine learning
  • Data systems

Documentation

Each Dagstuhl Seminar and Dagstuhl Perspectives Workshop is documented in the series Dagstuhl Reports. The seminar organizers, in cooperation with the collector, prepare a report that includes contributions from the participants' talks together with a summary of the seminar.

 

Publications

Furthermore, a comprehensive peer-reviewed collection of research papers can be published in the series Dagstuhl Follow-Ups.

Dagstuhl's Impact

Please inform us when a publication results from your seminar. These publications are listed in the category Dagstuhl's Impact and are presented on a special shelf on the ground floor of the library.