CHALLENGES AND DIRECTIONS FOR TRANSACTIONAL MEMORY RESEARCH

Christos Kozyrakis
Pervasive Parallelism Lab
Stanford University
http://ppl.stanford.edu/~christos

The Transactional Memory (TM) research community has made great progress within the past five years. We now have a reasonable understanding of how to implement TM in hardware or software, how to optimize TM code, how to argue about and prove TM semantics, how to manage contention, and how to reason about issues such as strong atomicity or composability. More important, the TM community has created excitement and attracted participation from multiple research domains including architecture, compilers, programming languages, operating systems, and distributed algorithms.

There are several low-level implementation issues for TM researchers to debate and investigate. Nevertheless, the goal of this talk is to initiate a discussion on the urgent high-level challenges for the TM community. The following five issues are often raised by colleagues outside of this community and are particularly important for the long-term success of TM research. Some suggestions on how to address these issues, based on past and current work at the TCC group at Stanford, are also included:

(1) How does TM fit with parallel programming environments? Since TM cannot address all the challenges of concurrency on its own, it is important to consider how it fits within complete parallel programming environments. We have put significant effort into integrating transactions with existing parallel idioms (C++ threads, Java threads, OpenMP, etc.). It is now time to explore how transactions fit with other innovative ideas for parallel programming. One such idea is domain-specific languages that hide the complexities of concurrency using high-level, domain-specific abstractions. In such an environment, transactions may simply be an implementation tool, hidden from the end programmer.
The advantage of this approach is that we can limit the type and scope of transactions used in practice, avoiding the difficult cases of nesting, inter-transaction communication, etc.

(2) How does TM fit in the stack of a modern computer system? Related to (1), modern computing environments do not consist of just processors and memory. They also include I/O, networking, interprocess communication, distributed environments over cluster substrates, etc. How does TM technology fit in such a stack? Can we provide atomicity and isolation as user code interacts with multiple system components? If yes, what are the semantics and what are the restrictions? So far, we have either ignored these issues or "stretched" TM to cover parts of the system functionality. An alternative approach is to consider system-scale transactions, where TM is just one of the many transactional components in the system. Similar to IBM's QuickSilver system, a transaction manager would coordinate the execution of user-level transactions across transactional components such as TM, log-based file systems, DBMS, and network queues.

(3) Does TM technology scale? The only concurrency that matters is concurrency that scales. For TM to remain relevant, its language abstractions and implementations must scale from tens to hundreds of thousands of threads. Virtually all TM implementations currently rely on coherent shared memory, a technology that we are still not certain how to scale. On the other hand, transactions may be the abstraction that makes inter-thread communication sufficiently coarse-grained in space and time so that coherent shared memory can scale to large numbers of threads.

(4) Can TM help with system challenges beyond concurrency? Application developers are facing several challenges in addition to exploiting concurrency. Security, reliability and robustness, and debugging and testing are a few of the many.
The basic mechanisms of a TM system (data versioning, conflict detection, serializability enforcement) can potentially help simplify or improve solutions to these challenges. The opportunity for the TM community is that such uses may be the points that convince system vendors to deploy TM and application developers to actually use it. The challenge is to explore how such uses interact with transactions for concurrency control.

(5) How much easier does TM make parallel programming after all? Last but definitely not least, for all our work on TM, we still have no quantitative data to support the main claim for TM research. Measuring ease of programming is an extremely difficult task, but it is also a task that we must undertake. We need to consider which user studies or deployments we should put together to quantify some aspects of programmability. Apart from convincing critics, this will help us understand how programmers will actually use transactions, what the common cases are, what the pitfalls are, how useful transactions are beyond concurrency, etc. Such work is much more important at this point than yet another optimization of some implementation aspect.

Hopefully, the Dagstuhl workshop can make some progress towards addressing these challenges.