##### Dagstuhl-Seminar 24092

### Applied and Combinatorial Topology

##### ( 25. Feb – 01. Mar, 2024 )

##### Permalink

##### Organisatoren

- Pawel Dlotko (Polish Academy of Science, PL)
- Dmitry Feichtner-Kozlov (Universität Bremen, DE)
- Anastasios Stefanou (Universität Bremen, DE)
- Yusu Wang (University of California, San Diego - La Jolla, US)

##### Kontakt

- Andreas Dolzmann (für wissenschaftliche Fragen)
- Christina Schwarz (für administrative Fragen)

##### Gemeinsame Dokumente

- Dagstuhl Materials Page (Use personal credentials as created in DOOR to log in)

##### Programm

The Dagstuhl Seminar titled "Applied and Combinatorial Topology" brought together researchers in mathematics and computer science to engage in active discussions and exchange of ideas on theoretical, computational, and practical aspects of applied and combinatorial topology. The seminar has led to further the connections between the Discrete Morse Theory, Computational Topology, and Statistics communities and identification of open problems that can be addressed together.

##### Context

Applied Topology is a new and rapidly increasing research field within applied mathematics. Its main focus is to utilize topological methods to solve applied problems. The common emphasis of the methods in Applied Topology is on the computational aspect. Application areas include: data analysis, computational biology, network analysis, graph visualization and reconstruction, feature selection, and many more.

##### Goals

The Dagstuhl Seminar 24092 (February 25-March 1, 2024) brought together three research communities, namely researchers in discrete Morse theory (DMT), computational topology, and statistics. The aim was to facilitate collaborations that could strengthen the existing interactions of the fields (e.g. Reeb graphs, Mappers and discrete vector fields) and collaborations that may lead to the development of new descriptors of data (e.g. persistence invariants, magnitude functions, etc.), which in turn have the potential to be inputted into statistical methodologies and to provide their efficient implementations.

##### Topics

We chose three research topics for which the respective communities will benefit from a knowledge exchange and mutual discussion.

**Discrete Morse theory.** The research field of discrete Morse theory, developed by R. Forman, is a discrete counterpart of a continuous Morse theory. It has recently found many practical applications both within mathematics (e.g. configuration spaces, and homology computation) as well as outside mathematics, such as computer science (e.g. denoising and mesh compression). The target of discrete Morse theory is to construct a discrete vector field that either simplifies the data at hand, without losing its important features, or to introduce discrete dynamics on the data. The resulting dynamics can be further analyzed to extract certain interesting sets, for instance invariant sets. In both cases, the aim is to simplify the data or to find the important regions of the data. One example of use of Morse theory in data analysis is the Reeb graph of a Morse function. A discrete adaptation of Reeb graphs which uses ideas of partial clustering is known as Mapper and has seen a great deal of success in data analysis. Another example is an adaptation of discrete Morse theory to computations of persistent homology in topological data analysis: the machinery of discrete Morse theory can be used to help reduce the complexity of the evolving topology in the filtrations of datasets. Moreover, it was recently shown that discrete Morse theory can be utilized for simplification and complexity reduction also in the multiparameter persistence setting. Furthermore, discrete Morse theory has been studied in conjunction with persistent homology theory and has found interesting applications, such as reconstruction of grayscale digital images and reconstruction of graphs, see for instance 2D road reconstruction and 3D neuron reconstruction. Another powerful application of Discrete Morse theory is in distributed computing. In both cases, the discrete Morse theory is used to simplify data at hand, and recover their invariants. We believe that using this machinery, one can do even more. We would like to take the paradigm of discrete Morse theory further and directly try to recover certain invariants from the constructed discrete vector fields.

**Computational Topology.** The research field of combinatorial topology originated from the study of topological invariants derived from combinatorial decompositions of spaces (cf. simplicial approximation theorem), known as simplicial complexes. One of the main examples of such invariants are the Betti numbers. Combinatorial topology was later named algebraic topology due to the switch of focus of the field on its algebraic aspects (as homology groups), which is attributed to Emmy Noether. In the research area of computational topology (also known as topological data analysis), we are interested in studying a single parameter filtration of complexes associated with a data set (viewed as finite metric space) such as the Vietoris-Rips filtration, or multiparameter filtrations of complexes associated to datasets, such as the function-Rips bifiltration and the multicover bifiltration. Those structures can be simplified with the tools of discrete Morse theory. A homology functor is then being applied to those filtrations resulting in a single or multiparameter persistence module. Single-parameter persistence modules are visualized by their persistence diagrams. A well-known invariant of multiparameter persistence modules is the rank invariant which captures important persistence information about multifiltrations of datasets. Recently, there have been some refinements of the rank invariant and also a generalization of the notion of persistence diagram (induced by the rank invariant). Developing algorithms for the efficient computation of multiparameter persistence modules and their rank invariants, is one of the big challenges of computational topology and topological data analysis (TDA).

**Statistics in Topological Data Analysis.** Persistence invariants such as the persistence diagram are equipped with a family of metrics, e.g. the ℓ^{p}-Wasserstein distances and the bottleneck distance. To make these signatures applicable, one must interface them with standard statistical methods. This has already been done e.g. when developing statistics on persistent diagrams, or other signatures such as persistence landscapes. However, much remains unknown in the case of limits of persistence diagrams when the number of points goes to infinity. A good example of a successful synergy between statistics and combinatorial topology is a process of vectorization of persistence diagrams. This process allowed the community to build multiple applications of persistent homology into many branches of science and engineering. We believe that, if new invariants originated from discrete Morse theory and combinatorial topology are introduced, such as the recently introduced Mapper graph of datasets, a work needs to be done, to incorporate them into existing statistical pipelines, hypothesis testing methods and similar. Moreover, a vectorization method for Mapper graphs needs to be established and their limit behavior (when e.g. the number of points goes to infinity) need to be studied. Also an application in standard statistics will be further explored; It is widely known that one should not rely on summary statistics, but always attempt to visualize the data. However, oftentimes the data are very high dimensional. In this case, Mapper type algorithms may serve as a surrogate of a scatter plot in visualization by providing a graph-based summary of the data. Our aim will be to explore this connection and look for ways of inputting Mappers into standard statistical pipelines, e.g. including concepts of averages and central limit theorems. We will also explore the connections between Mapper and other combinatorial topology concepts via, for instance, discrete Morse theory.

##### Participants, Schedule, and Organization

The attendees were strongly encouraged to prepare talks that will include open problems and new research directions. The program for the week consisted of talks of different lengths, open problem sessions, breakout sessions, and summary sessions with the participants. On Monday, we started with an 1-hour session where the participants introduced themselves, and then we had 6 introductory talks, two on Discrete Morse theory, two on computational topology and two on Statistics in TDA. Then, we had an open problem session where participants identified certain open problems and directions for research for the breakout sessions.

Participants chose one or more from the following proposed topics for breakout sessions:

- Can we compute representatives of generators of persistent homology in less than cubic time? (proposed by Tamal Dey)
- Optimal Discrete Morse function given a partial matching (proposed by Yusu Wang)
- Topological information, i.e. "how much topological information remains when going from one to two dimensional filtrations (or from Reeb graphs to Reeb spaces)"(proposed by Bei Wang Phillips)
- Manifold reconstruction guarantees (proposed by Ulrich Bauer)
- Algorithmic questions on (multiparameter) persistence (proposed by Fabian Lenzen)
- Can TDA detect planted cliques? (proposed by Bastian Rieck)
- Monotonicity of magnitude functions of Euclidean metric spaces (proposed by Sara Kalisnik)
- General applied topology (proposed by Dmitry Feichtner-Kozlov).

Tuesday to Thursday in the morning we had the lecture talks and we organized breakout sessions on Tuesday and Thursday afternoon. We reserved three rooms for the breakout sessions that ran in parallel, the main seminar room for topics (1)-(5), another room for topics (6)-(7), and a small room for topic (8). On Wednesday afternoon we organized some groups for hiking near Schloss Dagstuhl. Representatives from the working groups summarized the discussions during their breakout sessions and presented it to all participants on Thursday evening and Friday morning.

##### Results and Reflection

The seminar successfully facilitated a rich exchange of ideas and expertise among participants. The varied program, including talks, open problem sessions, breakout discussions, and outdoor activities, created an environment conducive to collaborative exploration. Attendees expressed satisfaction with the content and structure of the seminar, indicating a strong interest in future editions. During the breakout sessions, it was encouraging to note that some participants reported preliminary results related to the open problems presented. These early findings sparked lively discussions and provided valuable insights into potential directions for further research. The seminar served as a platform not only for sharing existing knowledge but also for generating new ideas and approaches.

The last twenty years of rapid development of Topological Data Analysis (TDA) have shown the need of understanding the shape of data to better understand the data. Since an explosion of new ideas in 2000’ including those of *Persistent Homology* and *Mapper Algorithms*, the community rushed to solve detailed theoretical questions related to the existing invariants. However, topology and geometry have still much to offer to the data science community. New tools and techniques are within reach, waiting to be brought over the fence to enrich our understanding and potential to analyze data.

At the same time, the fields of Discrete Morse Theory (DMT) and Combinatorial Topology (CT) are developed in parallel with no strong connection to data-intensive TDA or to other statistical pipelines (e.g. machine learning).

Our major goal is to soften this strict separation between TDA, CT, and DMT by creating new cross-community connections. We plan to bring together cross-sections of both communities, including researchers with theoretical, applied, and computational backgrounds. Initial speed up of interpersonal communication should result in a significant boost to all the involved areas and even encourage them to search for connections and common ground to create community in-between – transferring ideas, tools and methods.

This Dagstuhl Seminar aims to bring together a number of experts in Discrete Morse Theory, Combinatorial Topology, Topological Data Analysis, and Statistics to (i) enhance the existing interactions between these fields on the one hand, and to (ii) discuss the possibility of adopting new invariants from algebra, geometry, and topology; in particular inspired by continuous and discrete Morse theory and combinatorial topology; to analyze and better understand the notion of shape of the data.

We will start our discussion by understanding interactions of DMT and CT with the current tools of TDA and Statistics, i.e. we will start by examining how to input Mappers into statistical pipelines, study how to define averages and study the limit behavior. Building on this, new possible techniques will be explored and integrated into proper statistical frameworks. We will also discuss how to adopt new invariants from DMT and CT to the setting of TDA and how to input them into Statistics. Possible invariants from DMT and CT to be examined include *invariant sets of discrete vector fields* and *algebraic invariants of (multi-)filtered complexes*, respectively. Their stability, robustness and computational complexity will be thoroughly discussed. Lastly, if time permits, we want to test them on a number of sample datasets and compare their efficacy to the already existing tools.

We hope that our seminar in Dagstuhl will provide an interactive environment for the discrete Morse theory, combinatorial topology, and statistics communities that will result in new joint projects and research collaborations in applied topology and data science.

- Ulrich Bauer (TU München, DE) [dblp]
- Julian Brüggemann (MPI für Mathematik - Bonn, DE)
- Mathieu Carrière (Centre Inria d'Université Côte d'Azur - Sophia Antipolis, FR)
- Tamal K. Dey (Purdue University - West Lafayette, US) [dblp]
- Pawel Dlotko (Polish Academy of Science, PL) [dblp]
- Dmitry Feichtner-Kozlov (Universität Bremen, DE) [dblp]
- Robert Green (University at Albany, US)
- Teresa Heiss (IST Austria - Klosterneuburg, AT) [dblp]
- Niklas Hellmer (Polish Academy of Science, PL)
- Ingrid Hotz (Linköping University, SE) [dblp]
- Sara Kališnik Hintz (ETH Zürich, CH)
- Michael Kerber (TU Graz, AT) [dblp]
- Woojin Kim (Duke University - Durham, US & KAIST - Daejeon, KR)
- Claudia Landi (University of Modena, IT) [dblp]
- Fabian Lenzen (TU Berlin, DE)
- Michael Lesnick (University at Albany, US) [dblp]
- Roy Meshulam (Technion - Haifa, IL) [dblp]
- Elizabeth Munch (Michigan State University, US) [dblp]
- Tom Needham (Florida State University - Tallahassee, US)
- Martin Raussen (Aalborg University, DK) [dblp]
- Bastian Rieck (Helmholtz Zentrum München, DE) [dblp]
- Nicholas Scoville (Ursinus College - Collegeville, US)
- Jan Felix Senge (Universität Bremen, DE)
- Anastasios Stefanou (Universität Bremen, DE)
- Yusu Wang (University of California, San Diego - La Jolla, US) [dblp]
- Bei Wang Phillips (University of Utah - Salt Lake City, US) [dblp]
- Leonard Wienke (Universität Bremen, DE)
- Ling Zhou (Duke University - Durham, US)

##### Klassifikation

- Computational Geometry
- Data Structures and Algorithms
- Discrete Mathematics

##### Schlagworte

- Applied Topology
- Topological Data Analysis
- Discrete Morse Theory
- Combinatorial Topology
- Statistics