Dagstuhl Seminar 08261
Structure-Based Compression of Complex Massive Data
( Jun 22 – Jun 27, 2008 )
- Stefan Böttcher (Universität Paderborn, DE)
- Markus Lohrey (Universität Leipzig, DE)
- Sebastian Maneth (Univ. of New South Wales, AU)
- Wojciech Rytter (University of Warsaw, PL)
- Fast in-memory XPath search using compressed indexes : article - Arroyuelo, Diego; Claude, Francisco; Nguyen, Kim; Siren, Jouni; Välimäki, Niko; Navarro, Gonzalo; Mäkinen, Veli; Maneth, Sebastian - Chichester : Wiley, 2015. - pp. 399-434 - (Software : Practice and Experience ; 45. 2015 : article).
- Fast in-memory XPath search using compressed indexes : article in 2010 IEEE 26th International Conference on Data Engineering (ICDE) - Arroyuelo, Diego; Claude, Francisco; Nguyen, Kim; Siren, Jouni; Välimäki, Niko; Navarro, Gonzalo; Mäkinen, Veli; Maneth, Sebastian - Los Alamitos : IEEE, 2010. - pp. 417-428 - (International Conference on Data Engineering ; 26. 2010 : article).
Die Kunst der Datenquetschung
Interview Wolfgang Back vom Computerclub Zwei mit Prof. Dr. Stefan Böttcher (Universität Paderborn); Folge 116, 07.07.08
- Download der Sendung 116, 32 Kbit/s (~7 MB)
Compression of massive complex data is a promising technique to reduce stored and transferred data volumes. In industry, semi-structured data and XML in particular, have become the de facto data exchange format, and e.g. grammar-based compression offers an efficient memory representation that can be used within XML databases. Therefore, we expect improvements on compression techniques for structured data and XML to have significant impact on a variety of industry applications where data exchange or limited data storage are bottlenecks.
The big advantage of structure and grammar-based compression over other compression techniques is that many algorithms can be performed directly on the compressed structure, without prior decompression. This idea of “computing on compressed structures” has given rise to efficient algorithms in several areas; e.g., term graph rewriting, model-checking using BDDs, and querying XML; all three areas profit from the use of dags (directed acyclic graphs) which allow to share common subtrees and which can be seen as a particular instance of grammar-based compression. In these areas and others, we expect efficiency improvements through the use of more sophisticated grammar-based compression techniques.
The goal of the seminar was to bring together researchers working on various aspects of structure-based compression and to discuss new ideas and directions for doing research on compression and on computation over compressed structures. In particular, the sub-goals were to achieve a deeper understanding of the whole field of compressing structured data, to discover new interconnections between different research directions in compression, to interchange ideas between theory and application oriented research, to distinguish between solved and open questions in the field, to identify key problems for further research, and to initiate new collaborations.
Within the seminar, many different forms of cooperations took place. The seminar started with a short introduction of each participant. The introduction was followed by a first series of overview talks, namely:
- “Grammar-Based Compression (Survey)” by Wojciech Rytter and
- “Pattern Matching on Compressed Strings”, jointly given by Shunsuke Inenaga and Ayumi Shinohara.
- “Practical Search on Compressed Text” by Gonzalo Navarro and
- “XML Compression Techniques” by Gregory Leighton.
- “Algorithmics on Compressed Strings” by Markus Lohrey.
The seminar brought together researchers from different fields of data compression. Many fruitful discussions provided a unique opportunity to discover interconnections between different research directions in compression, to identify open key problems for further research, and to get a much broader understanding of the field. Furthermore, a creative interchange of ideas between theory and application oriented research led to many deep insights for further research in the field. Finally, Dagstuhl initiated new collaborations which have been fruitful already (one open problem has been resolved, as mentioned in the section on Working Group II), and a large software project software project has been initiated (see the section on Working Group I). As always, Dagstuhl provided a highly enjoyable atmosphere to work in.
- Siva Anantharaman (Université d'Orléans, FR)
- Angela Bonifati (ICAR-CNR - Rende, IT) [dblp]
- Stefan Böttcher (Universität Paderborn, DE) [dblp]
- Jurek Czyzowicz (Université du Québec en Outaouais, CA)
- William Evans (University of British Columbia - Vancouver, CA) [dblp]
- Barbara Fila (University of Orleans, FR)
- Wojciech Fraczak (Université du Québec en Outaouais, CA)
- Rita Hartel (Universität Paderborn, DE) [dblp]
- Shunsuke Inenaga (Kyushu University, JP) [dblp]
- Marek Karpinski (Universität Bonn, DE) [dblp]
- Gregory Leighton (University of Calgary, CA)
- Markus Lohrey (Universität Leipzig, DE) [dblp]
- Veli Mäkinen (University of Helsinki, FI) [dblp]
- Sebastian Maneth (Univ. of New South Wales, AU) [dblp]
- Tomasz Müldner (Acadia University - Wolfville, CA)
- Gonzalo Navarro (University of Chile - Santiago de Chile, CL) [dblp]
- Andrea Pugliese (University of Calabria, IT)
- Rüdiger Reischuk (Universität Lübeck, DE) [dblp]
- Wojciech Rytter (University of Warsaw, PL) [dblp]
- Hiroshi Sakamoto (Kyushu Institute of Technology, JP) [dblp]
- Manfred Schmidt-Schauss (Universität Frankfurt, DE) [dblp]
- Ayumi Shinohara (Tohoku University - Sendai, JP) [dblp]
- Data structures
- Data compression
- algorithms for compressed strings and trees