https://www.dagstuhl.de/20391

September 20 – 25, 2020, Dagstuhl Seminar 20391

Database Indexing and Query Processing

Organizers

Renata Borovica-Gajic (The University of Melbourne, AU)
Goetz Graefe (Google – Madison, US)
Allison Lee (Snowflake – San Mateo, US)
Caetano Sauer (Tableau – München, DE)
Pinar Tözün (IT University of Copenhagen, DK)

For information about this Dagstuhl Seminar, please contact

Annette Beyer for administrative matters

Michael Gerke for scientific matters

Documents

Dagstuhl Seminar schedule

Motivation

Following up on earlier Dagstuhl Seminars on robust performance of database query processing, a new Dagstuhl Seminar in 2020 will discuss and advance multiple topics in database query processing. We hope to achieve mutual education as well as concrete solutions for specific hard problems, and possibly publications based on the seminar and on collaborations initiated during it.

In our selection of topics, we focus on problems that are hard, relevant, and unsolved throughout academic research and industrial development. Technical topics of particular interest include:

  1. Robust query performance: resource policies, algorithms, data structures, query execution plans, and query optimization – with specific focus on dynamic sequences of multiple joins and on skew and load balancing in parallel systems.
  2. Sort-based versus hash-based query processing – a question already decided in many minds, yet there are new techniques to consider. For example, pause-and-resume and restart-after-failure are important in highly parallel systems, and waste-free designs may require sorted intermediate results. As another example, storage structures and intermediate results sorted on hash values could combine the advantages of traditional indexes and of traditional hash-based query processing.
  3. Columns, rows, or clusters as storage formats and as intermediate results – with row storage widely favored for transaction processing and traditional line-of-business applications, column storage may or may not hold up to critical inspection once row storage is deeply optimized, including compressed indexes as well as advanced sorting and merging of index contents.
  4. Modern hardware: accelerators, memory & storage hierarchies – two mostly independent topics: hardware accelerators remain a promising opportunity mostly unused in industrial practice, and deep hierarchies of memory and storage hardware are a practical reality not fully addressed or exploited in most database research and products. Dedicated instructions already speed up compression, encryption, transactional memory, and sorting, e.g., priority queues and string comparisons.
  5. Compilation, vectorization, or normalized keys – strong and diverging opinions and beliefs notwithstanding, we should design and optimize hybrid systems that combine the advantages of these techniques. Some initial architectures already exist, e.g., deep compilation of query execution plans while interpreted query execution is already under way. With luck, we can integrate all three of these promising techniques for high-bandwidth query execution over large databases.
  6. Stream processing, stream indexing – deferred maintenance and incremental optimization of derived data, as in Vertica’s write-optimized storage, can be taken much further. Log-structured merge-forests and stepped merging are widely used in industrial key-value stores but remain far from optimal in three crucial dimensions: insertion (information capture) bandwidth, query efficiency and query performance, and insertion-to-query latency.

As seminar outcomes, we hope to advance the state of the art as well as educate all seminar participants on both current technologies and new ideas. We will structure our activities at Dagstuhl in such a way that each group and individual leaves with the possibility of publishing their results.

Motivation text license
  Creative Commons BY 3.0 DE
  Renata Borovica-Gajic, Goetz Graefe, Allison Lee, Caetano Sauer, and Pinar Tözün

Dagstuhl-Seminar Series

Classification

  • Data Bases / Information Retrieval

Keywords

  • Database
  • Query
  • Optimization
  • Execution
  • Hardware
  • Performance

Documentation

All Dagstuhl Seminars and Dagstuhl Perspectives Workshops are documented in the Dagstuhl Reports series. Together with the seminar's collector, the organizers compile a report that summarizes the authors' contributions and adds an overall summary.

Download overview flyer (PDF).

Publications

It also remains possible to publish a comprehensive collection of peer-reviewed papers in the Dagstuhl Follow-Ups series.

Dagstuhl's Impact

Please let us know if a publication results from your seminar. Such publications are listed separately in the Dagstuhl's Impact section and displayed on the ground floor of the library.