20.07.14 - 24.07.14, Seminar 14302

Digital Palaeography: New Machines and Old Texts

Digital Palaeography emerged as a research community in the late 2000s. Following a successful Dagstuhl Perspectives Workshop on Computation and Palaeography (12382), this seminar focuses on the interaction of Palaeography and computerized tools developed in Computer Vision for the analysis of digital images.

Techniques already exist for enhancing damaged documents, for optical text recognition or computer-assisted transcription, and for identifying and categorising scripts and scribes. The current technical challenge is to develop “new machines”, i.e. efficient solutions for palaeographic tasks, and to provide scholars with quantitative evidence towards palaeographical arguments. This goes beyond the reading of “old texts” (ancient, mediaeval and early modern documents), which is itself of interest to industry, to the wider public and to the broad community of genealogists.

The core issue is to create the conditions for fluid and seamless communication between Humanities and Computer Sciences in order to advance research in Palaeography, Manuscript Studies and History, on the one hand, and in Computer Vision, Semantic Technologies, Image Processing, and Human-Computer Interaction (HCI) systems on the other hand. Researchers must articulate their respective systems of proof in order to produce efficient systems that present palaeographical data quickly and easily, in a way that scholars can understand, evaluate, and trust. Doing so should optimize collaboration, prevent the implementation of “black boxes”, make use of the outreach potential offered by computerized technologies to enrich palaeographical knowledge, and facilitate the sharing of methodologies.

The primary outcomes will be the sharing of insights based on the state of the art in Digital Palaeography, interdisciplinary discussion of the potential and limitations of future research in this field, and the establishment of a community of practice in Digital Palaeography. Further prospective outcomes include the dissemination of methodologies and current research within the community, a better understanding of how to conduct interdisciplinary research across all the fields of expertise involved in Digital Palaeography, as well as new research directions in the Computer Sciences and new research strategies in Palaeography.

On the technical side, the following key issues have been identified:

  • Ontologies to discuss and qualify the variability of scripts
  • Semi-automatic and interactive image processing and classification methods
  • Approaches to alignment (establishing a correspondence between images of texts, textual transcriptions and printed editions)
  • Search methods across modalities and datasets (texts/images/shapes)
  • Methods to define “mid-level features”, which are key to productive communication between scholars and computational systems
  • Data sources, data collection, and their use and management
  • Use of input devices (e.g., sensitive digital pens, touch-surfaces) to interact with the images of texts and to collect data on letter formation and on the ergonomics of writing
  • Explorations of the cognitive underpinnings of palaeographical research and how digital technologies might assist (e.g., kinaesthetic engagement in reading)
  • Methods of visualization
  • Dissemination of results, ideas, and developments outside of expert communities and to the general public
  • Going beyond Palaeography: applying the expertise and capabilities developed through the joint efforts of researchers from both fields to new domains, including forensics, recognition of handwriting in business (e.g., postal services), and indexing of large-scale Cultural Heritage datasets with a large audience (e.g., genealogy)