Foundations of Persistent Programming
( 14. Nov – 17. Nov, 2021 )
- Hans-J. Boehm (Google - Mountain View, US)
- Ori Lahav (Tel Aviv University, IL)
- Azalea Raad (Imperial College London, GB)
- Michael Gerke (für wissenschaftliche Fragen)
- Jutka Gasiorowski (für administrative Fragen)
Modern society depends on storing and accessing vast quantities of data, at ever-increasing speeds. This global data reliance is underpinned by fast and reliable storage solutions. To meet our conﬂicting desiderata of fast and persistent data, storage hardware is traditionally divided into two categories: fast, byte-addressable memory (e.g. RAM) that is volatile (it loses its contents after a crash, e.g. power loss); and slow, block-addressable storage (e.g. hard disks) that is persistent (its contents survive crashes). However, advances in cutting-edge non-volatile memory (NVM) render this dichotomy obsolete by combining the strengths of both: NVM provides fast, byte-addressable access as with DRAM, while ensuring persistence across crashes as with hard disks.
NVM (persistent memory) has recently become commercially available through products such as Intel Optane which already has a larger capacity than DRAM, its access latency and throughput are within an order of magnitude of DRAM and 2-3 orders of magnitude better than hard disks. With DRAM scalability increasingly becoming an obstacle, NVM is expected to grow in importance and to complement or even supplant DRAM in the near future, which will inevitably lead to substantial changes in software and its engineering.
Existing persistency research has made commendable strides in several directions:
- Architecture (low-level) persistency models: The two major mainstream architectures (Intel-x86 in desk-top/server computing and ARMv8 for mobile/embedded computing) provide support for NVM programming via instructions for managing and ﬂushing the cache, with their persistency semantics recently formalised.
- Libraries: Existing work includes several persistent libraries and data structures. Most notably, PMDK is an industry-backed library by Intel, providing low-level NVM primitives and a few persistent data structures.
- Verification: Verifying persistent code has recently attracted the attention of the formal methods community; with the proliferation of persistency architectures this area will undoubtedly grow in importance and popularity.
All in all, persistent programming and persistency semantics are highly active research areas; nevertheless, there are several directions in which the existing work on persistency is lacking, leading to a research gap, including:
- Persistency validation and testing: Although the existing formal persistency models of mainstream architectures are a good starting point, they have a key limitation in that they lack validation: they were devised after discussions with processor architects, but were not checked against existing hardware implementations. This is a signiﬁcant gap: closely related work on testing processor consistency semantics revealed errors in prior models, bugs in deployed hardware implementations, and even led to changes in the oﬃcial architecture speciﬁcations.
- Language (high-level) persistency models: There is currently no high-level support for persistent programming in mainstream languages such as C/C++ and Rust. Speciﬁcally, there are currently only a handful of academic proposals for high-level persistency which suﬀer two main drawbacks: they cannot be implemented eﬃciently and lack veriﬁed implementations. For instance, they rely on primitives such as persistent fences to enforce persist-ordering on writes. However, such primitives may not be efficiently implementable on mainstream architectures, and thus may not be supported by architecture vendors in the foreseeable future. As such, these proposals cannot currently be reasonably implemented on existing hardware, thus lacking compilation schemes.
- Persistent libraries and data structures: With a few notable exceptions such as PMDK, there is currently little work on implementing persistent libraries and data structures. The main drawback of PMDK and similar (less-developed) libraries is that they lack formal speciﬁcations, and given the current lack of testing infrastructure for persistency, it is unclear if their implementations are correct.
- General correctness conditions for persistency: A key challenge in verifying persistent implementations is deﬁning a general correctness condition for persistent libraries. The only existing such condition is durable linearisability, a simple extension of the well-known linearisability criterion to persistency. Although a useful starting point, durable linearisability has a key drawback: it presupposes strong consistency and does not cater to the more realistic weak consistency setting. As such, durable linearisability is too strong and is not suitable for verifying libraries implemented in mainstream languages (e.g. C/C++) with weak consistency guarantees.
In this Dagstuhl Seminar we aim to take the ﬁrst steps to close this gap by bringing together a diverse group of world-renowned researchers on formal concurrency and persistency semantics and leading industry practitioners including chief hardware architects, concurrent software developers and experts on programming language design and standards. The seminar is structured along the four topics above to facilitate an interdisciplinary interchange between the theory and practice of persistency, thereby fostering the development of rigorous foundations for persistent programming.
We brought together 15 in-person attendees at Schloss Dagstuhl, with a roughly equal number of remote attendees. Remote attendance was challenging, particularly for attendees from very different time-zones. It nonetheless provided the opportunity for us to hear from a wider selection of participants.
We decided up-front not to try to cater the schedule to remote participation. Given the number of time zones covered by the participants, we continue to believe that, although it clearly had adverse impacts, it was the right decision.
We had a number of remote presentations that included interactions with the speakers. Otherwise the discussion tended to happen mostly among the in-person participants, in spite of the excellent AV systems at Dagstuhl. Many remote participants were limited in attendance due to time-zone issues.
Our area is perhaps unique, in that it includes deep theoretical work, but is also very dependent on technological developments. Accordingly, the participants of the seminar were from a spectrum of topics ranging from theory of distributed systems to hardware specification and design. The seminar gave many of us the opportunity to catch up on both theory and practice, including input from some participants with more direct insights into industrial developments.
We began the seminar by reviewing some of the underlying assumptions that were made by prior work in this field, often without certainty about their correctness. We were actually able to get much more shared clarity on a few of these as a result of audience discussion during the seminar. Some of us learned that non-volatile caches are being publicly discussed by Intel, and that there also is similar agreement that memory encryption, to restore volatility when needed, is desirable. We also learned that writes to the same cache line are not just believed by software researchers to reach memory in the correct order, but at least one hardware vendor also agrees. Though this last fact is rather obscure, it is important for some NVM algorithms, and not normally reflected in hardware manuals.
The rest of the seminar consisted of talks and group discussions. Three talks were longer overview talks on different aspects (Michael Scott on buffered persistency, Parosh Aziz Abdulla on verification, and Erez Petrank on persistent lock-free data structures). We did not feel the need for smaller break-out sessions, since the in-person group was quite small, largely due to our timing with respect to Covid waves. Much of the benefit here appears to have been in listening to discussions that often were either significantly more theoretical or significantly more practical than our own research.
NVM programming is both complicated by, and often synergistic with concurrent programming. Much of our focus was on the interaction between the two. Due to these close interactions, we asked several speakers to talk about concurrency issues that seemed particularly relevant (e.g., Peter Sewell on Armv8-A virtual memory model, Mark Batty on novel solution to the ``thin air problem'', and Paul McKenney on weak memory schemes used in the Linux kernel).
Several talks raised questions on the foundations of the field, such as what are the hardware-supplied programming models, or what it means for a persistent program to be correct. It is entirely possible that most future NVM programmers will be more concerned with something like the persistent transactions discussed in Michael Bond's talk. However, given the relative immaturity of the area and foundational uncertainty, the emphasis seemed appropriate.
- Mark Batty (University of Kent - Canterbury, GB) [dblp]
- Hans-J. Boehm (Google - Mountain View, US) [dblp]
- Michael D. Bond (Ohio State University - Columbus, US) [dblp]
- Brijesh Dongol (University of Surrey - Guildford, GB) [dblp]
- Michal Friedman (Technion - Haifa, IL) [dblp]
- Michalis Kokologiannakis (MPI-SWS - Kaiserslautern, DE) [dblp]
- Ori Lahav (Tel Aviv University, IL) [dblp]
- Maged M. Michael (Facebook - New York, US) [dblp]
- Erez Petrank (Technion - Haifa, IL) [dblp]
- Azalea Raad (Imperial College London, GB) [dblp]
- Konstantinos Sagonas (Uppsala University, SE) [dblp]
- Michael Scott (University of Rochester, US) [dblp]
- Viktor Vafeiadis (MPI-SWS - Kaiserslautern, DE) [dblp]
- Heike Wehrheim (Universität Oldenburg, DE) [dblp]
- Hagit Attiya (Technion - Haifa, IL) [dblp]
- Parosh Aziz Abdulla (Uppsala University, SE) [dblp]
- Piotr Balcer (Intel Technology - Gdansk, PL)
- Dhruva Chakrabarti (AMD - Santa Clara, US) [dblp]
- Soham Chakraborty (TU Delft, NL) [dblp]
- Ricardo Ciríaco da Graca (MPI-SWS - Kaiserslautern, DE)
- Brian Demsky (University of California - Irvine, US) [dblp]
- Derek Dreyer (MPI-SWS - Saarbrücken, DE) [dblp]
- João F. Ferreira (INESC-ID - Lisboa, PT)
- Maurice Herlihy (Brown University - Providence, US) [dblp]
- Joseph Izraelevitz (University of Colorado - Boulder, US) [dblp]
- Limin Jia (Carnegie Mellon University - Pittsburgh, US) [dblp]
- Jeehoon Kang (KAIST - Daejeon, KR) [dblp]
- Artem Khyzha (Arm - Cambridge, GB) [dblp]
- Umang Mathur (National University of Singapore, SG)
- Paul McKenney (Facebook - Beaverton, US) [dblp]
- Nikos Nikoleris (Arm - Cambridge, GB) [dblp]
- Andy Rudoff (Intel - Boulder, US) [dblp]
- Peter Sewell (University of Cambridge, GB) [dblp]
- William Wang (Arm - Cambridge, GB) [dblp]
- John Wickerson (Imperial College London, GB) [dblp]
- Dagstuhl-Seminar 23412: Formal Methods for Correct Persistent Programming (2023-10-08 - 2023-10-11) (Details)
- Distributed / Parallel / and Cluster Computing
- Programming Languages
- weak memory models