Dagstuhl Seminar 06491: Digital Historical Corpora – Architecture, Annotation, and Retrieval

(December 3–8, 2006)

Permalink

Please use the following short URL to link to this page: https://www.dagstuhl.de/06491

Organizers

Lou Burnard (University of Oxford, GB)
Milena Dobreva (Bulgarian Academy of Sciences, BG)
Norbert Fuhr (Universität Duisburg-Essen, DE)
Anke Lüdeling (HU Berlin, DE)

Contact

Dagstuhl Service Team

Publications

Digital Historical Corpora – Architecture, Annotation, and Retrieval. Lou Burnard, Milena Dobreva, Norbert Fuhr, and Anke Lüdeling (Eds.). Dagstuhl Seminar Proceedings, Volume 6491. June 13, 2007

Press/News

Press Room

Press Release

"Digitalisierung von historischen Texten" 27.11.06 (German only)

Summary

The seminar brought together scholars from (historical) linguistics, (historical) philology, computational linguistics, and computer science who work with collections of historical texts. These collections, whether described as texts, digital libraries, or corpora, are assembled for a number of different purposes, such as lexicography, history, linguistics, and philology. This naturally leads to different decisions in their design and architecture.

The purpose of this seminar was twofold. First, we wanted to inform each other about the decisions each of us had taken in building a historical corpus and to discuss the options. Second, we wanted to build an international network of people working with historical corpora and to explore the options for further partnerships or projects. We think that both goals were reached.

The seminar was interesting and stimulating. In the final discussion of the workshop, participants developed a ‘grand picture’ of the research issues in the area of digital historical corpora (see Figure 1), in which the arcs represent enabling or supporting methods. As the picture shows, the major goal is research on large historical corpora, which requires work in the areas that point to it directly or indirectly. A researcher’s workbench should support personalization, collaboration, and problem solving. It must be complemented by tools for annotating and analyzing corpora, as well as functions for visualization, browsing, and retrieval (especially retrieval of spelling variants). These methods should first be applied to and tested on small corpora before they are used on large ones. In this context, evaluation also plays a major role. For large corpora (stored in digital libraries), the choice of an appropriate architecture is a crucial issue.

Two further issues of interest to all participants were quality control and standardization.

Participants

Victor Baranov (Izhevsk State Technical University, RU)
Lou Burnard (University of Oxford, GB)
Gregory R. Crane (Tufts University, US)
James Cummings (Oxford University Computing Services, GB)
Mark Davies (Brigham Young University, US)
Mila Dimitrova-Vulchanova (NTNU - Trondheim, NO)
Milena Dobreva (Bulgarian Academy of Sciences, BG)
Karin Donhauser (HU Berlin, DE)
Eva Dyllong (Universität Duisburg-Essen, DE)
Astrid Ensslin (The University of Manchester, GB)
Tomaz Erjavec (Jozef Stefan Institute - Ljubljana, SI)
Andrea Ernst-Gerlach (Universität Duisburg-Essen, DE)
Stefan Evert (Universität Osnabrück, DE)
Jean-Daniel Fekete (University of Paris South XI, FR)
Norbert Fuhr (Universität Duisburg-Essen, DE)
Kurt Gärtner (Universität Trier, DE)
Markus Heller (LMU München, DE)
Karin Hess (MPI-SWS - Saarbrücken, DE)
Nikola Ikonomov (Bulgarian Academy of Sciences, BG)
Fotis Jannidis (TU Darmstadt, DE)
Jaap Kamps (University of Amsterdam, NL)
Alexander Karosseit (HU Berlin, DE)
Meike Klettke (Universität Rostock, DE)
Ulf Leser (HU Berlin, DE)
Anke Lüdeling (HU Berlin, DE)
Wolfram Luther (Universität Duisburg-Essen, DE)
Manfred Markus (Universität Innsbruck, AT)
Roland Meyer (Universität Regensburg, DE)
Thomas Pilz (Universität Duisburg-Essen, DE)
Paul Rayson (Lancaster University, GB)
Klaus U. Schulz (LMU München, DE)
Thorsten Vitt (TU Darmstadt, DE)
Andreas Witt (Universität Tübingen, DE)
Amir Zeldes (HU Berlin, DE)

Classification

Interdisciplinary (Computer Science, Computational Linguistics, Corpus Linguistics, Literacy, Bioinformatics)

Own categories:
Corpus architecture
Processing and representing multilingual and multimodal parallel text corpora
Annotation standards
Retrieval facilities in multilevel hypertext

Keywords

Corpus architecture
annotation standards
multilingual
multimodal corpora
fuzzy search
multilevel hypertext
