06.03.16 - 11.03.16, Seminar 16101

Data Structures and Advanced Models of Computation on Big Data

Diese Seminarbeschreibung wurde vor dem Seminar auf unseren Webseiten veröffentlicht und bei der Einladung zum Seminar verwendet.

Motivation

Computing is about the processing, exchanging, handling and storage of information. The way information is stored profoundly inuences the efficiency of operations over the data. Research in data structures attempts to provide ways of storing and manipulating data and information that are appropriate for the computational model at hand: A setting where data are to be stored in internal random access memory requires data structuring methods different from settings where data are to be stored on sequentially accessible tapes, or on a multiple cache-external disk memory hierarchy with highly varying access latency. In spite of the maturity of the field many data structuring problems are still open, while new ones arise due to technological advances. Moreover, almost every new computing or storage technology poses new data structuring problems. This seminar will address besides the "classical" data structuring topics the following issues:

  • Distributed Data Structures for Cloud Services. Many large scale cloud services are based on distributed data structures such as Google's BigTable that are designed to scale across hundreds or thousands of machines. Design goals for such data structures to be addressed at this seminar are the prevention of large latency with only a small computational overhead and the ability for the simple use of additional servers without the need of reconfiguration. Furthermore, the ever growing amount of data to be processed asks for compression; here a modernized view of information assimilating elements of space, time, structure, semantics, and context might be helpful to improve the data structures performance. Accordingly, one focus of this seminar will be on Information Theory of Data Structures; a starting point for the discussion at our workshop will be the search for measures and e_cient algorithms to appraise the amount of organization and structure embodied in artifacts and natural objects.
  • Memory Hierarchy Methods for CMPs. Processor speeds and external memory sizes have increased exponentially over the last decades. However the transfer speed between external and internal memory has not kept pace. The „von Neumann bottleneck“ is becoming a „von Neumann capillary“. Data structuring research has been addressing this problem and has worked on solutions that attempt to minimize the data transfer between external and internal memory and the various caches in between (see e.g. GI-Dagstuhl Seminar 02112 and Dagstuhl Seminar 14091). However, many data structuring problems still have unsatisfactory solutions and new challenges arise from multicore computers in which certain caches are shared. Here, new models are needed to guide the design of I/O efficient data structures and algorithms. Our seminar will start a discussion on how such models should look like.
  • Energy-Aware Data Structure Design. Reducing the energy consumption of computations has become an important issue both for large scale computing and for small mobile devices. Clearly, the power consumption of a program depends on its execution time and memory usage, but the details are quite sophisticated. Unstructured memory access patterns, for example, have been shown to severely inuence the energy consumption of smart phones - more than could be expected by their respective time delay. In addition, it has been observed that a „better“ data structure (runtime of related algorithm on a given machine) not necessarily gives rise to a smaller energy consumption (on that machine). Thus, suitable theoretical models and experimental studies will be needed to develop meaningful data structures and algorithms for energy-efficient computing. Our seminar will host a discussion on such models.