11. – 16. September 2011, Event 11372

Semantic Statistics for Social, Behavioural, and Economic Sciences: Leveraging the DDI Model for the Web


Richard Cyganiak (National University of Ireland – Galway, IE)
Arofan Gregory (Open Data Foundation – Tucson, US)
Wendy Thomas (University of Minnesota – Minneapolis, US)
Joachim Wackerow (GESIS – Mannheim, DE)

The workshop has several goals. Core knowledge of the DDI model and Semantic Web technologies will be taught. A possible design for implementing the DDI metadata model using Semantic Web standards will be discussed, as will the outline and draft of a best-practice paper on publishing microdata and related metadata into the Linked Data Web. That paper might be put forward as a standard for disseminating data in this domain on the Web.

Description of the workshop

This workshop will examine the metadata model of the Data Documentation Initiative (DDI), which is used in the Social, Behavioural, and Economic (SBE) sciences, and design an implementation of that model using Semantic Web standards (RDF, OWL, etc.). Invited participants will represent the user community (data librarians, archivists, researchers, and data producers), DDI experts, and experts in Semantic Web technologies and standards. The goal of the workshop is to develop a best practice for publishing microdata and related metadata into the Linked Data Web, which might be put forward as a standard for disseminating data in this domain on the Web.

Statistical data and metadata are already being standardized within the Linked Data Web through the Data Cube vocabulary, which has been further specialized into the SDMX model. There is no equivalent for the discovery and possible use of microdata. Demand for discovering both aggregate statistics and the underlying data is nonetheless strong, driven by open government initiatives and by the efforts of many data producers, data archives, and research centers. Further, Linked Data technologies are becoming increasingly popular within universities as the basis for tools that assist research and teaching.

Microdata are often confidential, and this aspect of the problem will be a point of discussion in the workshop: how best to advertise the existence of data that cannot be openly exposed? Other aspects of the problem, such as quality, documentation, and provenance, will also be addressed. How best to align existing RDF vocabularies with the DDI metadata model will also be a focus of discussion, as will the publication of thesauri and formal classifications in a standard RDF format.

