February 23 – 28, 2014, Dagstuhl Seminar 14091
Data Structures and Advanced Models of Computation on Big Data
1 / 2 >
For support, please contact
Annette Beyer for administrative matters
Marc Herbstritt for scientific matters
As part of the mandatory documentation, participants are asked to submit their talk abstracts, working group results, etc. for publication in our series Dagstuhl Reports via the Dagstuhl Reports Submission System.
(Use seminar number and access code to log in)
Computing is about the processing, exchanging, handling and storage of information. The way information is stored profoundly influences the efficiency of operations over the data. Research in data structures attempts to provide ways of storing and manipulating data and information that are appropriate for the computational model at hand: internal random access memory, sequentially accessible data streams, multiple cache-external disk memory hierarchies, parallel and distributed computing, etc. In spite of the maturity of the field many data structuring problems are still open, while new ones arise due to technological advances. Almost every new computing or storage technology poses new data structuring problems. This Dagstuhl Seminar will address besides the "classical" data structuring topics the following issues:
One consequence of the exponential growth of computing usage has been a large increase in the scale of the problems addressed by computer scientists. For example, twenty-five years ago a few tens of megabytes constituted a substantial text corpus. Nowadays search engine indexes handle document collections whose size is in the order of a hundred terabytes of data or so. We have seen similar growth in the size of large data collections such as those derived from internet traffic management, genomic data, social networks, large scale scientific experiments such as the Large Hadron Collider (LHC) and sensor networks in general. In this context the introduction of novel data structures that can operate over massively distributed computing platforms with minimal interprocess communication is a must.
Memory Hierarchy Methods for CMPs and SSDs
Processor speeds and external memory sizes have increased exponentially over the last decades. However the transfer speed between external and internal memory has not kept pace. Data structuring research has been addressing this problem and has worked on solutions that attempt to minimize the data transfer between external and internal memory and the various caches in between. However, many data structuring problems still have unsatisfactory solutions in this model. New challenges arise from multicore computers in which certain caches are shared, giving rise to issues such as cache contention, coherence and optimal order of execution. Similarly, flash memory storage, suffer from certain peculiarities like read/write imbalances and wear, which are not captured in the standard I/O model.
Energy-Aware Data Structure Design
Reducing the energy consumption of computations has become an important issue both for large scale computing and for small mobile devices. Clearly, the power consumption of a program depends on its execution time and memory usage, but the details are quite sophisticated. Unstructured memory access patterns, for example, have be shown to severely influence the energy consumption of smart phones -- more than could be expected by their respective time delay. Thus, suitable theoretical models and experimental studies will be needed to develop meaningful data structures and algorithms for energy-efficient computing.
Web-scale Data Structures
They are playing an increasingly important role in modern computing. The focus on implementing them has properly been on systems and commercial issues, but as the scale continues to increase, the scientific basis for studying performance and developing new algorithms that has proven so successful for classical data structures is certain to play a critical role in the evolution of the web.
Dagstuhl Seminar Series
- 10091: "Data Structures" (2010)
- 08081: "Data Structures" (2008)
- 06091: "Data Structures " (2006)
- 04091: "Data Structures" (2004)
- 02091: "Data Structures" (2002)
- 00091: "Data Structures" (2000)
- 98091: "Data Structures" (1998)
- 9609: "Data Structures" (1996)
- 9409: "Data Structures" (1994)
- 9145: "Data Structures" (1991)
- Data Bases / Information Retrieval
- Data Structures / Algorithms / Complexity
- Data structures
- Large data sets
- External memory methods
- Big data