March 9 – 14, 2008, Dagstuhl Seminar 08111

Ranked XML Querying


Sihem Amer-Yahia (Yahoo! Research – New York, US)
Divesh Srivastava (AT&T Labs Research – Florham Park, US)
Gerhard Weikum (MPI für Informatik – Saarbrücken, DE)




This paper is based on a five-day workshop on “Ranked XML Querying” that took place at Schloss Dagstuhl, Germany, in March 2008 and was attended by 27 people from three different research communities: database systems (DB), information retrieval (IR), and Web. The seminar title was interpreted in an IR-style “andish” sense (it also covered subsets of {Ranking, XML, Querying}, with larger subsets being favored) rather than in the DB-style strictly conjunctive manner. In essence, the seminar addressed the integration of DB and IR technologies, with Web 2.0 as an important target area.

DB and IR have evolved as separate communities for historical reasons. Both emerged in the sixties with a focus on very different application areas: accounting and reservation systems on the DB side, and library and patent information on the IR side. Consequently, they have emphasized different methodological paradigms: precise querying over schematized data, based on logic and algebra (DB), vs. keyword search and ranking over text and uncertain data, based on statistics and probability theory (IR). However, many applications now require managing both structured and unstructured data and thus mandate serious consideration of how to integrate the DB and IR worlds at both the foundational and the software-system level. These applications include Web and Web 2.0 use cases as well as more corporate-oriented scenarios such as customer support and health care. All three communities that participated in the seminar (DB, IR, Web) agreed on the importance of this general direction and came up with ten tenets, from different viewpoints, on why DB&IR integration is desirable.

All three participating communities (DB, IR, and Web) felt that looking across the fence paid off very well and that the communities should continue learning from each other. Challenges lie ahead in areas such as Web 2.0, personal information management, and entity-relationship search; these will remain difficult and rewarding areas for some time. Combining the different and quite complementary expertise of DB and IR would be vital for well-founded and practically viable solutions.



  • Data bases / information retrieval
  • Data structures / algorithms / complexity
  • Web


  • Scoring methods for XML
  • Ranking approximate XML answers
  • Top-K query processing
  • Querying structured and unstructured data
  • XML Full-Text Querying
  • Querying heterogeneous XML
  • Extracting structure from unstructured data
  • Text mining
  • XML data integration




