Dagstuhl Seminar 21283

Data Structures for Modern Memory and Storage Hierarchies

(Jul 11 – Jul 16, 2021)

Permalink
Please use the following short url to reference this page: https://www.dagstuhl.de/21283


Motivation

For decades, virtually all systems software used DRAM as main memory and disk for persistent storage. Over the past few years, a plethora of novel technologies have emerged that radically change the storage and memory hierarchy:

  • Byte-addressable persistent memory (also known as NVM, NVRAM, or SCM) has been a topic of intense research and different use cases for this technology have been proposed. However, in most cases this research had to resort to simulation or emulation of the hardware. With the recent commercial availability of persistent memory, it has become possible to re-evaluate the different proposals and determine their role in the storage hierarchy of the future.
  • PCIe-attached flash solid-state drives (SSDs) have become fast and cheap. Arrays of such devices can approach DRAM bandwidth, but at a much lower cost. NAND flash also has peculiar physical properties (e.g., out-of-place writes, garbage collection), which are usually hidden today, but can severely impact throughput, latency, and durability. This requires rethinking how secondary storage is managed.
  • Technologies like RDMA and Gen-Z blur the line between local and remote data structures ("Far Memory"), which has major implications for system design.
  • More and more data management systems are moving into public clouds, which offer a bouquet of storage services with widely differing performance and cost characteristics. Choosing the best mix of services for a given use case is a major challenge.

This "zoo" of technologies offers widely differing storage capacities, performance characteristics, access interfaces, and persistency guarantees. Each of these technologies has the potential to significantly affect system architecture and data structure design.

This Dagstuhl Seminar will bring together researchers and practitioners from the data management and systems communities to foster cross-cutting architectural discussions. During the seminar, the participants will discuss opportunities and challenges of exploiting modern storage technologies. As seminar outcomes, we hope to advance the state of the art as well as educate all seminar participants on database system architecture, data structure design, caching strategies, and operating system support for data processing. We will structure our activities at Dagstuhl in such a way that each group and individual leaves with the possibility of publishing their results.

Copyright Stratos Idreos, Viktor Leis, Kai-Uwe Sattler, and Margo Seltzer

Summary

The seminar brought together researchers and practitioners from the data management and systems/storage communities to discuss the implications of the modern hardware landscape on high-performance systems. Due to the pandemic, the seminar was organized as a hybrid event: Virtual participation was limited to one session per day that featured invited talks. The in-person component consisted of free-flowing plenary discussions and several smaller, focused working groups. Some key takeaways from the discussion are:

  • OS/DBMS co-design: Traditional POSIX-style OS abstractions do not work well for data-intensive systems, leading to complex workarounds and suboptimal performance. While some of these issues could in principle be fixed by optimizing OS implementations, others require new APIs. For example, it is very difficult to implement crash-consistent data structures on top of the mmap system call.
  • Cloud: The cloud is taking over, and cloud-native data processing systems often have a very different architecture from traditional data management systems. For example, many systems strive to separate storage from compute. This trend is enabled by ever faster networks.
  • Near-data processing: Separating storage from compute leads to costly data movement, which may be mitigated by pushing down (parts of) the computation close to the data. Major public cloud vendors already optimize their internal services towards this goal. The challenge is how to program such distributed and specialized hardware components.
  • Persistent Memory: One major question discussed at the seminar was the role of byte-addressable persistent memory in future systems and what the "killer app" for this technology is. While there are several promising applications (e.g., graph processing or systems that require fast recovery times), it is not clear whether wide adoption will occur. Currently, the technology is quite expensive (prices per byte are similar to DRAM) and very hard to program in a crash-consistent way (e.g., writes must be carefully ordered, similar to lock-free programming).
Copyright Viktor Leis
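
To illustrate the write-ordering problem mentioned in the takeaways on mmap and persistent memory, the following is a minimal sketch and not something presented at the seminar. It simulates an append-only log on a memory-mapped regular file: a record is written and flushed first, and only then is a length field in the header updated and flushed to publish it. The names open_log, append_record, and read_records, as well as the record layout, are hypothetical. The sketch assumes that the 8-byte header update reaches storage atomically, which mmap itself does not guarantee; on real persistent memory one would typically rely on cache-line flush and fence instructions rather than on msync.

```python
import mmap
import os
import struct
import tempfile
import zlib

HEADER = struct.Struct("<Q")   # number of committed bytes after the header
RECORD = struct.Struct("<II")  # payload length, CRC32 of the payload


def open_log(path, size=4096):
    """Map a fixed-size log file; a new file is zero-filled (nothing committed)."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    try:
        if os.fstat(fd).st_size < size:
            os.ftruncate(fd, size)
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)  # the mapping keeps its own reference to the file


def append_record(mm, payload):
    """Append a record using a two-step, ordered write."""
    (committed,) = HEADER.unpack_from(mm, 0)
    start = HEADER.size + committed
    end = start + RECORD.size + len(payload)
    if end > len(mm):
        raise ValueError("log is full")
    # Step 1: write the record into space that is not yet published.
    RECORD.pack_into(mm, start, len(payload), zlib.crc32(payload))
    mm[start + RECORD.size:end] = payload
    mm.flush()  # the record must be durable before it becomes visible
    # Step 2: publish the record by advancing the committed length.
    # Assumption: this small header write is not torn by a crash.
    HEADER.pack_into(mm, 0, end - HEADER.size)
    mm.flush()


def read_records(mm):
    """Return all committed records; bytes past the committed length are ignored."""
    (committed,) = HEADER.unpack_from(mm, 0)
    pos, limit = HEADER.size, HEADER.size + committed
    records = []
    while pos < limit:
        length, crc = RECORD.unpack_from(mm, pos)
        body = bytes(mm[pos + RECORD.size:pos + RECORD.size + length])
        if zlib.crc32(body) != crc:
            break  # defensive check against corrupted records
        records.append(body)
        pos += RECORD.size + length
    return records


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "log.bin")
    log = open_log(log_path)
    append_record(log, b"alpha")
    append_record(log, b"beta")
    print(read_records(log))
    log.close()
```

If a crash occurs between the two steps, the header still holds the old length, so recovery simply ignores the partially written record. Reversing the two steps would expose a record whose contents may never have reached storage, which is the kind of subtle ordering bug the summary alludes to.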

Participants
On-site
  • Gustavo Alonso (ETH Zürich, CH) [dblp]
  • Alexander Baumstark (TU Ilmenau, DE)
  • Carsten Binnig (TU Darmstadt, DE) [dblp]
  • André Brinkmann (Universität Mainz, DE) [dblp]
  • Christian Dietrich (TU Hamburg-Harburg, DE)
  • Muhammad Attahir Jibril (TU Ilmenau, DE)
  • Alfons Kemper (TU München, DE) [dblp]
  • Viktor Leis (Universität Erlangen-Nürnberg, DE) [dblp]
  • Alberto Lerner (University of Fribourg, CH) [dblp]
  • Ulrich Carsten Meyer (Goethe-Universität - Frankfurt am Main, DE) [dblp]
  • Thomas Neumann (TU München, DE) [dblp]
  • Marcus Paradies (German Aerospace Center - Jena, DE) [dblp]
  • Kai-Uwe Sattler (TU Ilmenau, DE) [dblp]
  • Jens Teubner (TU Dortmund, DE) [dblp]
  • Alexander van Renen (Universität Erlangen-Nürnberg, DE)
Remote
  • Marcos K. Aguilera (VMware - Palo Alto, US) [dblp]
  • Raja Appuswamy (EURECOM - Biot, FR)
  • Manos Athanassoulis (Boston University, US) [dblp]
  • Alexander Böhm (SAP SE - Walldorf, DE) [dblp]
  • Peter A. Boncz (CWI - Amsterdam, NL) [dblp]
  • Mark Callaghan (Rockset - Bend, US)
  • Khuzaima Daudjee (University of Waterloo, CA) [dblp]
  • Jana Giceva (TU München, DE) [dblp]
  • Goetz Graefe (Google - Madison, US) [dblp]
  • Gabriel Haas (Universität Erlangen-Nürnberg, DE)
  • Stratos Idreos (Harvard University - Cambridge, US) [dblp]
  • Wolfgang Lehner (TU Dresden, DE) [dblp]
  • Ismail Oukid (Snowflake - Berlin, DE) [dblp]
  • Danica Porobic (Oracle Labs Switzerland - Zürich, CH) [dblp]
  • Ken Salem (University of Waterloo, CA) [dblp]
  • Wolfgang Schröder-Preikschat (Universität Erlangen-Nürnberg, DE) [dblp]
  • Margo Seltzer (University of British Columbia - Vancouver, CA) [dblp]
  • Tianzheng Wang (Simon Fraser University - Burnaby, CA) [dblp]
  • William Wang (ARM Ltd. - Cambridge, GB) [dblp]

Classification
  • Data Structures and Algorithms
  • Databases
  • Performance

Keywords
  • persistent memory
  • non-volatile memory
  • SSD
  • database systems
  • storage