29. Oktober – 03. November 2017, Dagstuhl Seminar 17441
Big Stream Processing Systems
For information about this Dagstuhl Seminar, please contact
Simone Schilke for administrative questions
Marc Herbstritt for scientific questions
We are living in the information age: the world is progressively moving towards a data-driven society in which data is the most valuable asset. The digital transformation therefore represents a revolution that cannot be missed. It is significantly changing many aspects of modern life, including the way we live, socialize, think, work, do business, conduct research, and govern society. The digital transformation is characterized by the enormous amounts of data that are produced and analyzed. Big data is commonly characterized by the defining 3V properties: huge Volume, consisting of terabytes or petabytes of data; high Velocity, being created in or near real time; and wide Variety of type, being structured and unstructured in nature.
As the world becomes more instrumented and connected, we are witnessing a flood of digital data generated, at high velocity, by hardware (e.g., sensors) and software in the form of data streams. This phenomenon is crucial for several applications and domains, including financial markets, surveillance systems, manufacturing, smart cities, and scalable monitoring infrastructure. In these applications and domains, there is a crucial requirement to collect, process, and analyze big streams of data in order to extract valuable information, discover new insights in real time, and detect emerging patterns and outliers.
Stream computing is a new paradigm necessitated by new data-generating scenarios, such as the ubiquity of mobile devices, location services, and sensor pervasiveness. In general, stream processing systems support a large class of applications (e.g., financial markets, surveillance systems, manufacturing, smart cities, and scalable monitoring infrastructure) in which data are generated from multiple sources and pushed asynchronously to servers that are responsible for processing them. Recently, several systems (e.g., Apache Storm, Apache Heron, Apache Flink, Spark Streaming, Apache Apex) have been introduced to tackle the real-time processing of big streaming data. However, several challenges and open problems need to be addressed in order to improve the state of the art and make big stream processing systems widely usable by a large number of users and enterprises. Thus, this seminar brings together researchers, developers, and practitioners actively working in this domain to discuss its most relevant open challenges, with a focus on two main topics: benchmarking and high-level declarative programming abstractions for big streaming jobs.
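To make the core processing model concrete, the following is a minimal, system-agnostic Python sketch of windowed aggregation, the primitive at the heart of most streaming jobs; it is not the API of any of the systems named above, which additionally provide parallelism, fault-tolerant state, and handling of out-of-order events.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_size):
    """Group (timestamp, key) events into tumbling windows and count keys.

    A simplified illustration of windowed stream aggregation: each event
    is assigned to the window containing its timestamp, and a per-key
    count is maintained for every window.
    """
    windows = defaultdict(lambda: defaultdict(int))
    for timestamp, key in events:
        window_start = (timestamp // window_size) * window_size
        windows[window_start][key] += 1
    return {w: dict(counts) for w, counts in sorted(windows.items())}

# Hypothetical sensor readings as (timestamp_in_seconds, sensor_id) pairs
events = [(1, "a"), (2, "a"), (3, "b"), (11, "a"), (12, "b"), (13, "b")]
print(tumbling_window_counts(events, window_size=10))
# {0: {'a': 2, 'b': 1}, 10: {'a': 1, 'b': 2}}
```

A declarative streaming abstraction, one of the seminar's two focus topics, would let users state this same computation as a query (e.g., a windowed `GROUP BY`) and leave the incremental evaluation to the engine.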
Creative Commons BY 3.0 DE
Irini Fundulaki, Tilmann Rabl, and Sherif Sakr
- Databases / Information Retrieval
- Optimization / Scheduling
- Programming Languages / Compiler
- Big Data
- Big Streams
- Stream Processing Systems
- Declarative Programming