http://www.dagstuhl.de/14091

February 23rd – February 28th 2014, Dagstuhl Seminar 14091

Data Structures and Advanced Models of Computation on Big Data

Organizers

Alejandro Lopez-Ortiz (University of Waterloo, CA)
Ulrich Carsten Meyer (Goethe-Universität Frankfurt am Main, DE)
Robert Sedgewick (Princeton University, US)



Documents

Dagstuhl Report, Volume 4, Issue 2
Aims & Scope
List of Participants
Shared Documents
Dagstuhl Seminar Schedule [pdf]

Summary

A persistent theme in the presentations at this Dagstuhl seminar was the need to refine our models of computation to adapt to modern architectures if we are to develop a scientific basis for inventing efficient algorithms to solve real-world problems. For example, Mehlhorn's presentation on the cost of memory translation, Iacono's reexamination of the cache-oblivious model, and Sanders' description of communication efficiency all left many participants questioning basic assumptions they have carried for many years, and they are certain to stimulate new research.

A better understanding of the properties of modern processors can certainly be fruitful. Several presentations, such as those by Aumüller, López-Ortiz, and Wild on Quicksort and by Bingmann on string sorting, described faster versions of classic algorithms that are based on careful examination of modern processor design.

Overall, many presentations described experience with data from actual applications. For example, the presentations by Driemel and Varenhold on trajectory data described a relatively new big-data application that underscores the importance and breadth of application of classic techniques in computational geometry and data structure design.

Other presentations discussed large data sets on modern architectures. Jacob presented a lower bound on parallel external list ranking, which also applies to the MapReduce and BSP models commonly used on large distributed platforms. Hagerup considered the standard problem of performing a depth-first search (DFS) on a graph, a task that is trivial on small graphs but extremely complex on "big data" sets such as the Facebook graph. He proposed a space-efficient algorithm that reduces the space required by DFS by a log n factor, or an order of magnitude on practical data sets.

Schweikardt gave a model for computations on MapReduce, a very common computing platform on very large server farms. Salinger considered the opposite end of the spectrum, namely how to simplify the programming task so as to take optimal advantage of a single server, which has its own degree of parallelism from multiple cores, GPUs, and other parallel facilities.

In terms of geometric data structures for large data sets, Afshani presented sublinear algorithms for the I/O model that generalize earlier work on sublinear algorithms. Sublinear algorithms are of key importance for very large data sets, which presumably cannot fit in main memory, yet most previously proposed algorithms assumed that such data sets were held in main memory. Toma gave an external-memory representation of the popular quad tree data structure, commonly used in computer graphics and other spatial data applications.

License
Creative Commons BY 3.0 Unported license
Alejandro Lopez-Ortiz and Ulrich Carsten Meyer and Robert Sedgewick

Dagstuhl Seminar Series

Classification

  • Data Bases / Information Retrieval
  • Data Structures / Algorithms / Complexity

Keywords

  • Data structures
  • Algorithms
  • Large data sets
  • External memory methods
  • Big data
  • Streaming
  • Web-scale

Book exhibition

Books from the participants of the current Seminar 

Book exhibition in the library, 1st floor, during the seminar week.

Documentation

Each Dagstuhl Seminar and Dagstuhl Perspectives Workshop is documented in the series Dagstuhl Reports. The seminar organizers, in cooperation with the collector, prepare a report that includes contributions from the participants' talks together with a summary of the seminar.

 

Publications

Furthermore, a comprehensive peer-reviewed collection of research papers can be published in the series Dagstuhl Follow-Ups.

Dagstuhl's Impact

Please inform us when a publication appears as a result of your seminar. These publications are listed in the category Dagstuhl's Impact and are presented on a special shelf on the ground floor of the library.