Knowledge engineering has changed dramatically in the last twenty years. When the organisers of this seminar were starting out, it was about gathering highly curated knowledge from experts and encoding it into computational representations in knowledge bases. It was primarily a manual process, focusing more on how knowledge was structured and organised, for instance, as schemas or ontologies, and less on tying existing data into that process. The results were used in expert systems and required considerable up-front investment. Today, knowledge base construction is a largely automatic process with humans in the loop. Owing to greater availability of data in different modalities and to advances in data management, machine learning, and crowdsourcing, knowledge bases today incorporate large amounts of knowledge. Given access to data and (off-the-shelf) AI capabilities, an organisation can create a large knowledge base at a fraction of what it cost decades ago. It is for these reasons that we see knowledge bases, in particular in the form of knowledge graphs, routinely applied in anything from search and intelligent assistants to digital twins, supply chain management, and legal compliance. Many socio-technical challenges remain, which the seminar aimed to address with a mix of invited talks, deep-dives, and small-group workshops, as follows:
Landscape review: as the field has changed so much, both in research and in practice, it was important to take inventory of approaches, methods, techniques, and tools by analysing real-world case studies where knowledge bases and knowledge graphs are created and used. Participants reflected on core lessons learned, knowledge gaps, and opportunities to create and maintain knowledge graphs at scale in various domains.
The knowledge graph life cycle: participants discussed extant knowledge engineering pipelines and identified gaps and connections between knowledge sources and the methods and tools used in the construction and maintenance of knowledge graphs, including large language models and generative AI systems. There was consensus that we need a sustained effort to update and upgrade classical ontology engineering methodologies and develop a prototype infrastructure to make the most of the latest neurosymbolic technologies and tools. One specific challenge identified during the seminar was taking knowledge engineering and knowledge graphs beyond structured data (e.g., tables) and information extraction from text to other modalities.
Using AI responsibly: as knowledge graph construction increasingly relies on sophisticated AI capabilities to scale, it is critical that processes and outcomes are aligned with fairness, accountability, and transparency guidance and standards. Solutions need to consider a range of end-users and stakeholders, including those that are unique to knowledge engineering settings, such as domain experts, information scientists and librarians, and knowledge graph developers. Participants discussed the need for setting up task-based studies and in-depth analyses of human-centric challenges, and for developing bespoke explainability solutions and bias and fairness assessments.
Knowledge and technology transfer: knowledge graphs and knowledge engineering do not exist in isolation. From a research point of view, participants suggested activities to build capabilities to use the latest neurosymbolic technologies and tools in knowledge graph construction, including tutorials, workshops, and hackathons, and to jointly develop frameworks and methodologies. From an application point of view, it was recognised that there is a need to promote knowledge graphs to the wider developer community and communicate their benefits, for instance, alongside neural methods.
As knowledge graphs are now extensively used in everything from search engines and chatbots to product recommenders and autonomous systems, we think it is time to reflect upon the state of the art of the field, revisit its foundations, and look into the future.
Like many other methodology-driven disciplines, knowledge engineering has evolved from waterfall-like approaches to agile, participatory ones. Understanding how these work for knowledge graphs in a world of smart devices, alternative user interfaces, data silos, and misinformation is critical.
This Dagstuhl Seminar will help us gain a better understanding of the way knowledge graphs are created, maintained, and used today, and identify research challenges in modelling, representation, reasoning, and evolution. These will form the basis for new methodologies, methods, and tools, applicable to various types of AI systems using knowledge graphs, for instance, natural language processing or information retrieval. To facilitate this, we will bring together knowledge scientists and engineers from different areas to take an inventory of solutions, discuss open problems, and identify opportunities for novel research and technology transfer. We intend the seminar to be the kick-off of a virtual network of seminar participants and other researchers, who will jointly work on the challenges, convene regularly to present results and exchange ideas, and identify opportunities for funding and support for cross-border collaborations.
Specific challenges include understanding knowledge graphs and automation, user experiences of creating and using knowledge graphs for a diverse set of contributors, and supporting and evaluating hybrid knowledge graph engineering workflows.
Expected results / outcome of the Dagstuhl Seminar
- A report summarising the discussions held during the seminar, which we will aim to turn into a call for papers for an edited volume inviting the community to further the state of the art.
- A visual summary of the plenary discussions, created by a professional illustrator and shared with the wider community.
- Terms of reference for the virtual network, initial composition, and plans to launch in Q3 2022.
- Funding roadmap and emerging bid ideas.
Participants
- Marcel R. Ackermann (Schloss Dagstuhl - Trier, DE) [dblp]
- Mehwish Alam (FIZ Karlsruhe, DE) [dblp]
- Bradley Allen (Merit - Millbrae, US)
- Sören Auer (TIB - Hannover, DE) [dblp]
- Eva Blomqvist (Linköping University, SE) [dblp]
- George Fletcher (TU Eindhoven, NL) [dblp]
- Paul Groth (University of Amsterdam, NL) [dblp]
- Aidan Hogan (University of Chile - Santiago de Chile, CL) [dblp]
- Filip Ilievski (USC - Marina del Rey, US)
- Antoine Isaac (Europeana Foundation - Den Haag, NL)
- Diana Maynard (University of Sheffield, GB)
- Deborah L. McGuinness (Rensselaer Polytechnic Institute - Troy, US) [dblp]
- Axel-Cyrille Ngonga Ngomo (Universität Paderborn, DE) [dblp]
- Heiko Paulheim (Universität Mannheim, DE) [dblp]
- Lydia Pintscher (Wikimedia - Germany, DE) [dblp]
- Valentina Presutti (University of Bologna, IT) [dblp]
- Florian Reitz (Schloss Dagstuhl - Trier, DE) [dblp]
- Marta Sabou (Wirtschaftsuniversität Wien, AT) [dblp]
- Harald Sack (FIZ Karlsruhe, DE) [dblp]
- Stefan Schlobach (VU University Amsterdam, NL)
- Juan F. Sequeda (data.world - Austin, US) [dblp]
- Elena Simperl (King's College London, GB) [dblp]
- Steffen Staab (Universität Stuttgart, DE) [dblp]
- Lise Stork (VU University Amsterdam, NL)
- Hideaki Takeda (National Institute of Informatics - Tokyo, JP)
- Katherine Thornton (Yale University Library - New Haven, US) [dblp]
- Marieke van Erp (KNAW Humanities Cluster - Amsterdam, NL) [dblp]
- Denny Vrandecic (Wikimedia - San Francisco, US) [dblp]
Classification
- Artificial Intelligence
- Human-Computer Interaction
Keywords
- knowledge graphs
- knowledge engineering
- knowledge science
- hybrid workflows
- user experience