https://www.dagstuhl.de/23071

February 12 – 17, 2023, Dagstuhl Seminar 23071

From Big Data Theory to Big Data Practice

Organizers

Martin Farach-Colton (Rutgers University – Piscataway, US)
Fabian Daniel Kuhn (Universität Freiburg, DE)
Ronitt Rubinfeld (MIT – Cambridge, US)
Przemyslaw Uznanski (University of Wroclaw, PL)

For support, please contact

Susanne Bach-Bernhard for administrative matters

Andreas Dolzmann for scientific matters

Motivation

Some recent advances in the theory of algorithms for big data – sublinear/local algorithms, streaming algorithms and external memory algorithms – have translated into impressive improvements in practice, whereas others have remained stubbornly resistant to useful implementations. This Dagstuhl Seminar aims to glean lessons from those aspects of these algorithms that have led to practical implementations, and to see whether the lessons learned can both improve the implementations of other theoretical ideas and help guide the next generation of theoretical advances.

As data sets have grown faster than RAM capacity, the theory of algorithms has expanded to provide approaches for handling inputs too large to hold in main memory. These approaches fall into three broad categories:

  • Streaming and semi-streaming algorithms
  • Sublinear or local algorithms
  • External memory algorithms

Each of these areas has a vibrant literature, and many of the results from the theory literature have made their way into practice. Other results are not suitable for implementation and deployment. By bringing together algorithmicists from these subcommunities, as well as algorithms engineers, the seminar aims to address the following questions:

  • What themes emerge from considering practical algorithms from the theory literature?
  • Can we use these insights to create new models or to capture interesting new optimization criteria?

By bringing together researchers in these disparate areas and by including researchers in algorithms engineering, we hope to bring to light the deep connections among them. The goals are to:

  • Extract shared lessons to help guide theoretical research towards practical solutions;
  • Create a feedback loop where commonalities of practical solutions can help guide future theoretical research;
  • Help cross-pollinate these research areas.

Motivation text license
  Creative Commons BY 4.0
  Martin Farach-Colton, Fabian Daniel Kuhn, Ronitt Rubinfeld, and Przemyslaw Uznanski

Classification

  • Data Structures and Algorithms
  • Distributed, Parallel, and Cluster Computing

Keywords

  • Sublinear algorithms
  • Local algorithms
  • External memory

Documentation

Each Dagstuhl Seminar and Dagstuhl Perspectives Workshop is documented in the series Dagstuhl Reports. The seminar organizers, in cooperation with the collector, prepare a report that includes contributions from the participants' talks together with a summary of the seminar.

Download overview leaflet (PDF).

Dagstuhl's Impact

Please inform us when a publication results from your seminar. These publications are listed in the category Dagstuhl's Impact and are presented on a special shelf on the ground floor of the library.

Publications

Furthermore, a comprehensive peer-reviewed collection of research papers can be published in the series Dagstuhl Follow-Ups.