https://www.dagstuhl.de/15201
May 10 – 13, 2015, Dagstuhl Seminar 15201
Cross-Lingual Cross-Media Content Linking: Annotations and Joint Representations
Organizers
Alexander G. Hauptmann (Carnegie Mellon University, US)
James Hodson (Bloomberg – New York, US)
Juanzi Li (Tsinghua University – Beijing, CN)
Nicu Sebe (University of Trento, IT)
Coordinators
Achim Rettinger (KIT – Karlsruher Institut für Technologie, DE)
Documents
Dagstuhl Report, Volume 5, Issue 5
Motivation text
List of participants
Summary
Content in multiple modalities (text, audio, video) and languages is generated by a variety of sources. These sources either broadcast information on channels such as TV and news or allow collaboration in social media forums. Often multiple sources are consumed in parallel: for example, users watching TV tweet their opinions about a show. This kind of consumption poses new challenges and requires innovative approaches to content search and recommendation.
Currently, most search and content-based recommendation is limited to monolingual text. Finding semantically similar content across different languages and modalities requires considerable research contributions from several computer science communities working on natural language processing, computer vision and knowledge representation. Despite success in the individual research areas, cross-lingual and cross-media content retrieval remains an unsolved research problem.
To tackle this research challenge, the seminar provided a common platform for researchers from different disciplines to collaborate and identify approaches for finding similar content across languages and modalities. After group discussions among the seminar participants, two possible solutions were taken into consideration:
- Building a joint space from heterogeneous data of different modalities, either to generate missing modalities or to retrieve them. This is achieved through aligned media collections (like parallel text corpora). Once content is mapped to the joint latent space, similarity measures can be used to find its cross-media, cross-lingual relatedness.
- Building a shared conceptual space using knowledge bases (KBs) such as DBpedia for semantic annotation of concepts or events shared across modalities and languages. Entities expressed in any channel, media type or language can be mapped to a concept space in the KB. Commonalities between annotations can then be used to find cross-media, cross-lingual relatedness.
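As a rough illustration of the two ideas above, the following Python sketch compares items first by vector similarity in a joint latent space and then by overlap of their knowledge-base annotations. It is only a minimal sketch under stated assumptions: the embedding vectors, the DBpedia-style concept identifiers, and the choice of cosine similarity and Jaccard overlap are all hypothetical and were not prescribed by the seminar.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors of the same joint latent space."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_u * norm_v)


def annotation_overlap(concepts_a, concepts_b):
    """Jaccard overlap between two sets of knowledge-base concept identifiers."""
    a, b = set(concepts_a), set(concepts_b)
    if not a and not b:
        return 0.0  # no annotations on either side
    return len(a & b) / len(a | b)


# Hypothetical embeddings, assumed to come from some model trained on
# aligned media collections (the values are invented for illustration).
english_article_vec = [0.9, 0.1, 0.3]
german_video_vec = [0.8, 0.2, 0.4]
unrelated_image_vec = [-0.1, 0.9, -0.5]

# Hypothetical semantic annotations, written as DBpedia-style identifiers.
english_article_concepts = {"dbr:Berlin", "dbr:Germany", "dbr:Film_festival"}
german_video_concepts = {"dbr:Berlin", "dbr:Film_festival", "dbr:Actor"}

if __name__ == "__main__":
    print(cosine_similarity(english_article_vec, german_video_vec))
    print(cosine_similarity(english_article_vec, unrelated_image_vec))
    print(annotation_overlap(english_article_concepts, german_video_concepts))
```

In both cases the language or media type of the original item no longer matters once it has been mapped into the shared space; only the mapping step is modality-specific.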
Implementing these solutions requires a joint effort across research disciplines to relate the representations and to use them for linking languages and modalities. The seminar also aimed to build datasets that can be used as a standard test bed and benchmark for cross-lingual cross-media content linking. The seminar was very well received by all participants. There was common agreement that the areas of text, vision and knowledge graphs should work more closely together and that each discipline would benefit from the others. The participants agreed to continue working on two cross-modal challenges and to discuss progress and future steps at a follow-up meeting in September in Berlin.


Classification
- Artificial Intelligence / Robotics
- Data Bases / Information Retrieval
- Multimedia
Keywords
- Cross-lingual
- Cross-media
- Cross-modal
- Natural language processing
- Computer vision
- Multimedia
- Knowledge representation
- Machine learning
- Information extraction
- Information retrieval