01.12.19 - 06.12.19, Seminar 19491

Big Graph Processing Systems

The following text appeared on our web pages prior to the seminar, and was included as part of the invitation.

Motivation

In our world, data is not just getting bigger; it is also getting more connected. Exploring, describing, predicting, and explaining phenomena in this interconnected world requires an adequate data abstraction. Graphs are recognized as a general, natural, and flexible data abstraction that can model complex relationships, interactions, and interdependencies between objects. Graphs have been widely used to represent datasets and encode problems across an already extensive range of application domains. The ever-increasing size of graph-structured data for these applications creates a critical need for scalable and even elastic systems that can process large amounts of it efficiently. Additionally, the complexity of using multiple datasets simultaneously in complex analyses raises numerous challenges for graph processing, from new requirements to new capabilities.

This Dagstuhl Seminar plans to discuss some of these open challenges in the interplay between graph data, abstractions, systems, performance engineering, and software engineering, with a main focus on the following key topics related to big graph processing systems:

  1. Design Decisions of Big Graph Processing Ecosystems: In modern setups, graph processing is not a self-contained, independent activity, but rather part of a larger big-data processing ecosystem with many system alternatives and possible design decisions. We need a clear understanding of the impact and the trade-offs of the various decisions in order to effectively guide the developers of big graph processing applications.
  2. High-Level Graph Processing Abstractions: While imperative programming models, such as vertex-centric or edge-centric programming models, are popular, they lack a high-level interface for the end user. To increase the power of graph processing systems and foster the usage of graph analytics in applications, we need to design high-level graph processing abstractions. It is currently an entirely open question what future declarative graph processing abstractions could look like.
  3. Application- and Domain-Specific Requirements: Due to the ubiquity of graph-shaped data, users and applications deal with such data in their daily tasks, from private to professional life. It thus becomes crucial to understand users' requirements for executing queries and complex analytical tasks on graph data, and to understand their actual usage of these data in production environments.
  4. Performance and Scalability Evaluation: Traditional measures of performance and scalability, e.g., FLOPS, throughput, or speedup, are difficult to apply to graph processing, especially since performance depends non-trivially on platform, algorithm, and dataset. Moreover, running graph-processing workloads in the cloud raises additional challenges. Addressing such performance-related issues is key to identifying, designing, and building upon widely recognized benchmarks for graph processing.

To address these topics, the seminar will bring together researchers, developers, and practitioners actively working in these areas. The aim is to use the insights and results of the seminar's discussions to provide a roadmap that can guide the development of several aspects of future big graph processing systems.

License
Creative Commons BY 3.0 DE
Angela Bonifati, Alexandru Iosup, Sherif Sakr, and Hannes Voigt