##### Dagstuhl Seminar 23401

### Automated Mathematics: Integrating Proofs, Algorithms and Data

##### ( Oct 01 – Oct 06, 2023 )

##### Permalink

##### Organizers

- Andrej Bauer (University of Ljubljana, SI)
- Katja Bercic (University of Ljubljana, SI)
- Florian Rabe (Universität Erlangen-Nürnberg, DE)
- Nicolas Thiéry (University Paris-Saclay - Orsay, FR)

##### Contact

- Michael Gerke (for scientific matters)
- Jutka Gasiorowski (for administrative matters)

##### Shared Documents

- Dagstuhl Materials Page (Use personal credentials as created in DOOR to log in)

##### Schedule

Modern mathematical software and large datasets of mathematical knowledge allow new approaches to solving mathematical problems, and support new kinds of mathematical exploration. In the past, lack of cooperation led to each project developing its own standards and techniques. Both developers and users experienced the resulting incompatibilities and fragmentation as serious usability issues and major obstacle to wider adoption, and none of the projects can address these successfully on their own. To make further progress that will eventually result in truly comprehensive and advanced computer mathematical assistant systems, the developers of individual software projects must start working together and tackle questions of interoperability, software engineering, and knowledge management on a large scale.

Concretely, this Dagstuhl Seminar brought together system developers, library authors, and users from three key areas of computer-supported mathematics: theorem provers, symbolic computation, and databases of mathematical structures. All three areas develop large formal mathematical libraries, but they do so in fundamentally different and incompatible ways. Theorem provers optimize for precise definitions and automation of proof support, often employing complex and abstract representation languages that capture exactly the semantics of the mathematical concepts of interest. Computer algebra systems, on the other hand, prioritize the efficiency of computation, which usually requires hardware-near representations that can be optimized for speed. Mathematical datasets finally employ general purpose database languages, which are optimized for indexing and fast querying, often requiring non-trivial encodings of mathematical objects in terms of concrete data. Over several decades of mostly independent development, these different communities have built software systems that are as impressively large as they are different from each other.

#### Work Groups

Broadly speaking, the seminar participants employed three independent approaches towards system integration.

Firstly, *direct integration* builds individual bridges between (usually) two systems.
These tend to be more ad hoc but enable a problem-driven approach that delivers a specific practically needed integration solution. Multiple work groups were formed that tackled individual bridges.

Secondly, *ontology-based integration* uses a central representation that acts as an interoperability layer.
The ontology describes mathematical concepts abstractly without a commitment to any of the three flavors of systems.
Two work groups investigated this approach:

- Alignments: This work group worked towards building a central ontology of mathematical concepts. It surveyed existing approaches and judged the feasibility of major future approaches. Of special interests was the difficulty of \emph{library alignment}, the task of connecting the central ontology to the individual libraries.
- Knowledge graphs: This work group investigated services that can be built on top of a central ontology. Of particular interest were knowledge graph techniques, which use concepts as the nodes and alignments as some of the edges.

Thirdly, for the special goal of integrating datasets, a work group on building an *index of datasets* started a major push towards cataloging the many existing datasets, which are distributed all over the internet, often without active maintenance.
This is a necessary step towards more systemic integration with each other and deduction and computation systems.

#### Outcomes

Overall, we observed that the field has made major progress over the last 10 years. Direct integrations that would have been very expensive in the past, often prohibitively so, have become feasible targets for short meetings as within a Dagstuhl Seminar. This can be attributed to increased awareness in the community of interoperability needs that has led to better interface design. Nonetheless, integrations are still brittle, and a major incentive problem remains: it is difficult for two communities to maintain bridges between their systems.

Ontology-based integration had developed little momentum in the past because of the high cost of additionally maintaining the central ontology. Here, the seminar showed that the time is right for a major push towards this and initiated a community-driven ontology curation project.

The work on building an index of datasets kick-started a dataset curation project. This project has already attracted the attention of outside researchers and has led to the founding of the Mathbase project.

Modern mathematical software and large databases of mathematical knowledge allow new approaches to solving mathematical problems, and support new kinds of mathematical exploration. The growing popularity of such approaches is increasingly straining the resources of the development teams. To make progress that will eventually result in a truly comprehensive and advanced computer mathematical assistant, the developers of individual software projects must start working together and tackle questions of interoperability, software engineering and knowledge management on a large scale.

This Dagstuhl Seminar plans to bring together system developers, library authors, and users from key branches of computer-supported mathematics: formalized mathematics, symbolic computation, and mathematical databases. The seminar aims to connect the communities, foster future collaboration across branches, and tackle problems that no branch can address alone. We shall address issues that are common to all areas of computer-supported mathematics: library management, dependencies and interoperability between software components, quality and correctness assurances, searching for information, and usability by end users.

**Community building through interoperability.** Participants will share their experiences and best practices, analyze common challenges, and work towards interoperability between existing systems. During the seminar, we will draft common terminology and information exchange protocols as a step towards serious integration of systems. We will rely on existing protocols, and will not depend on any particular system for automated mathematics. Doing so will allow computer supported mathematics to integrate more easily with other software systems.

**Case studies.** Participants from different communities will work on joint cases studies, for example by importing a library of mathematical knowledge from one system into another. The organizers will suggest challenges that require versatile expertise, for example formal verification of a mathematical dataset, incorporation of certain capabilities of a computer algebra system into a proof assistant, or enrichment of a mathematical dataset with new knowledge generated by a computer algebra system.

**Impact.** Advances in automated mathematics have a positive impact on research branches that rely on its methods and tools: formal verification of software and hardware systems uses proof assistants, applied mathematicians rely on computer algebra systems, and of course mathematicians themselves use automated mathematics in their research.

- Andrej Bauer (University of Ljubljana, SI) [dblp]
- Christoph Benzmüller (Universität Bamberg, DE) [dblp]
- Katja Bercic (University of Ljubljana, SI) [dblp]
- Alex Best (King's College - London, GB) [dblp]
- James Boyd (Wolfram Institute - Cambridge, US)
- Jacques Carette (McMaster University - Hamilton, CA) [dblp]
- Mario Carneiro (Carnegie Mellon University - Pittsburgh, US) [dblp]
- Edgar Costa (MIT - Cambridge, US) [dblp]
- James H. Davenport (University of Bath, GB) [dblp]
- Valeria de Paiva (Topos Institute - Berkeley, US) [dblp]
- Gilles Dowek (ENS - Gif-sur-Yvette, FR) [dblp]
- Catherine Dubois (ENSIIE - Evry, FR) [dblp]
- Stefania Dumbrava (ENSIIE - Paris, FR) [dblp]
- Madalina Erascu (West University of Timisoara, RO) [dblp]
- Jan Goedgebeur (KU Leuven, BE) [dblp]
- Kiran Gopinathan (National University of Singapore, SG) [dblp]
- Darij Grinberg (Drexel Univ. - Philadelphia, US) [dblp]
- Jyoti Jyoti (Panjab University - Chandigarh, IN)
- Michael Kohlhase (Universität Erlangen-Nürnberg, DE) [dblp]
- Olexandr Konovalov (University of St Andrews, GB) [dblp]
- Filip Koprivec (University of Ljubljana, SI) [dblp]
- Dimitri Leemans (UL - Brussels, BE) [dblp]
- Samuel Lelievre (University Paris-Saclay - Orsay, FR) [dblp]
- Assia Mahboubi (INRIA - Nantes, FR) [dblp]
- Harshit J Motwani (Ghent University, BE) [dblp]
- Dennis Müller (Universität Erlangen-Nürnberg, DE) [dblp]
- Tobias Nipkow (TU München - Garching, DE) [dblp]
- Matej Petkovic (University of Ljubljana, SI) [dblp]
- Christian Pfrang (Bay. Min. für Digitales - München, DE)
- Florian Rabe (Universität Erlangen-Nürnberg, DE) [dblp]
- Talia Ringer (University of Illinois - Urbana-Champaign, US) [dblp]
- Claudio Sacerdoti Coen (University of Bologna, IT) [dblp]
- Kazuhiko Sakaguchi (INRIA - Nantes, FR) [dblp]
- Natarajan Shankar (SRI - Menlo Park, US) [dblp]
- Kathrin Stark (Heriot-Watt University - Edinburgh, GB) [dblp]
- Jure Taslak (University of Ljubljana, SI)
- Nicolas Thiéry (University Paris-Saclay - Orsay, FR) [dblp]
- Josef Urban (Czech Technical University - Prague, CZ) [dblp]
- Makarius Wenzel (Dr. Wenzel - Augsburg, DE) [dblp]
- Tom Kaspar Wiesing (Universität Erlangen-Nürnberg, DE) [dblp]

##### Classification

- Databases
- Logic in Computer Science
- Mathematical Software

##### Keywords

- mathematical knowledge management
- mathematical software
- formalized mathematics
- computer algebra
- databases of mathematical structures