https://www.dagstuhl.de/19282

July 7 – 12, 2019, Dagstuhl Seminar 19282

Data Series Management

Organizers

Anthony Bagnall (University of East Anglia – Norwich, GB)
Richard L. Cole (Tableau Software – Palo Alto, US)
Themis Palpanas (Paris Descartes University, FR)
Konstantinos Zoumpatianos (Harvard University – Cambridge, US)

For information about this Dagstuhl Seminar, please contact

Susanne Bach-Bernhard for administrative matters

Shida Kunz for scientific matters

Motivation

Users across many domains now show very strong interest in data series (a.k.a. time series) management. It is not unusual for industrial applications that produce data series to involve numbers of sequences (or subsequences) on the order of billions (i.e., multiple TBs). As a result, analysts are unable to handle the vast amounts of data series that they have to manage and process. The goal of this Dagstuhl Seminar is to enable researchers and practitioners to exchange ideas, foster collaborations on the topic of data series management, and identify the corresponding open research directions. The main questions to be addressed are the following: i) What are the data series management needs across various domains, and what are the shortcomings of current systems? ii) How can we use machine learning to optimize our current data systems, and how can these systems help in machine learning pipelines? iii) How can visual analytics assist the process of analyzing big data series collections?

The seminar will focus on the following key topics related to data series management:

  1. Data series storage and access patterns: We will describe some of the existing (academic and commercial) systems for managing data series, explain how they differ, and comment on their evolution over time. We will try to answer the following questions: What are their shortcomings? What are the best ways to lay out data series on disk and in memory to optimize data series queries? How can we integrate domain-specific summarizations/indexes and compression schemes into existing systems?
  2. Query optimization: One of the most important open problems in data series management is that of query optimization. Yet there has been no work on estimating the hardness/selectivity of data series similarity search queries, which is of paramount importance for effective access path selection. During the seminar we will discuss the current work on the topic and examine future research directions.
  3. Machine learning and data mining for data series: Recent developments in deep neural network architectures have sparked intense interest in the interactions between machine learning algorithms and data series management. We will discuss machine learning from two perspectives. First, we will discuss machine learning techniques for data series analysis tasks, as well as for tuning data series management systems. Second, we will discuss how data series management systems can help machine learning pipelines scale.
  4. Visualization for data series exploration: There are several research problems at the intersection of visualization and data series management. Existing data series visualization and human interaction techniques only consider very small datasets, yet they can play a significant role in the tasks of similarity search, analysis, and exploration of very large data series collections. We will discuss promising directions for addressing these problems on both the frontend and the backend.
  5. Applications in multiple domains: We will discuss applications and requirements originating from various fields, including astrophysics, neuroscience, engineering, and operations management. The goal will be to allow scientists and practitioners to exchange ideas, foster collaborations, and develop a common terminology.

License
  Creative Commons BY 3.0 DE
  Anthony Bagnall, Richard L. Cole, Themis Palpanas, and Konstantinos Zoumpatianos

Classification

  • Data Bases / Information Retrieval
  • Data Structures / Algorithms / Complexity

Keywords

  • Sequences
  • Time series
  • Data series analytics
  • Machine learning
  • Data systems

Book Exhibition

Books by the participants

Book exhibition on the ground floor of the library (only during the week of the event).

Documentation

All Dagstuhl Seminars and Dagstuhl Perspectives Workshops are documented in the Dagstuhl Reports series. The organizers, together with the seminar's collector, compile a report that summarizes the authors' contributions and adds an overall summary.

Download overview flyer (PDF).

Publications

It also remains possible to publish a comprehensive collection of peer-reviewed papers in the Dagstuhl Follow-Ups series.

Dagstuhl's Impact

Please let us know if a publication results from your seminar. Such publications are listed separately in the Dagstuhl's Impact section and displayed on the ground floor of the library.