February 13 – 18, 2005, Dagstuhl Seminar 05071

Machine Learning for the Semantic Web


Fabio Ciravegna (University of Sheffield, GB)
AnHai Doan (University of Wisconsin – Madison, US)
Craig A. Knoblock (USC – Marina del Rey, US)
Nicholas Kushmerick (University College Dublin, IE)
Steffen Staab (Universität Koblenz-Landau, DE)



The Semantic Web has attracted great attention since the vision was first articulated several years ago. In a nutshell, the Semantic Web will augment conventional Web content with explicit machine-processable semantic metadata, enabling a variety of automated content manipulation and aggregation tasks.

As demonstrated by the first two International Semantic Web Conferences, the initial "futuristic vision" has matured into a carefully crafted set of substantive technical proposals, such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL). However, it is widely recognized that the Semantic Web will never "take off" until a critical mass of semantic metadata has been deployed. Many Semantic Web researchers have therefore built tools to help developers attach semantic metadata to their content.

More ambitiously, machine learning and other artificial intelligence techniques are being developed that generate the requisite semantic metadata in a semi-automated or even entirely automated fashion. For example, machine learning algorithms for information extraction allow large legacy text repositories to be rapidly enriched with semantic metadata, and machine learning approaches to ontology learning and matching are being developed for the Semantic Web context.

The goal of this seminar is to assemble the leading researchers who work at the intersection of machine learning and the Semantic Web, in order to review progress and identify the most significant opportunities and challenges over the next several years. We will also invite leading figures from the "conventional" (hand-crafted metadata) Semantic Web community, to ensure both that our technology is fully appreciated by the Semantic Web community, and that the machine learning community focuses on important and realistic problems.

The seminar will focus specifically on the following five topics:

  1. Automated document annotation;
  2. Ontology learning and maintenance;
  3. Ontology mapping and merging;
  4. Service discovery; and
  5. Content cleaning and normalization.

