https://www.dagstuhl.de/11451

### November 6 – 11, 2011, Dagstuhl Seminar 11451

# Data Mining, Networks and Dynamics

## Organizers

Lars Elden (Linköping University, SE)

Andreas Frommer (Bergische Universität Wuppertal, DE)

## Documents

Dagstuhl Report, Volume 1, Issue 11

List of participants

Shared documents

Dagstuhl Seminar schedule [pdf]

## Summary

In many areas one needs to extract relevant information from signals generated by dynamical systems evolving on networks whose configuration itself evolves with time. Such problems occur, for example, in surveillance systems for security, in early warning systems for disasters such as earthquakes, hurricanes and forest fires, in collaborative filtering, in search engines for evolving databases and in reputation systems based on evolving votes. In each of those examples one wants to extract relevant information from an evolving database. Information science is one of the most rapidly expanding scientific areas today, largely owing to the vast amount of information available on the Internet and the rapid growth of e-business. There are several open problems in these areas of research, and even partial answers would have an important impact. There is a pressing need to make progress in analysis tools and in algorithms for such complex tasks. The purpose of the seminar was to bring together a diverse community of researchers working on different aspects of this exciting field.

The main focus of the seminar was the theory and computational aspects of methods for the extraction of information in evolving networks and recent advances in algorithms for related linear algebra problems. The ever-increasing size of data sets and its influence on algorithmic progress appeared as a recurrent general theme of the seminar. Some of the participants are experts in the modeling aspects, some focus on the theoretical analysis, and some are more directed toward software development and concrete applications. There was a healthy mix of participants of different ages and academic status, from several PhD students and post-docs to senior researchers. As the subject has immediate and important applications, the seminar had some attendees with an industrial background (Yahoo and Google). Those participants also contributed greatly by introducing the academic researchers to new applications. The seminar also had several academic participants from application areas, who presented recent advances and new problems. Apart from well-known applications in social networks, search engines and biology approached from a different angle (Alter, Bast, Groh, Harb, Stumme), new applications were presented, such as network methods in epidemiology (Poletto), human contact networks (Yoneki), credibility analysis of Twitter postings (Castillo), and structure determination in cryo-electron microscopy (Boumal).

On the methodological side, new methods from graph theory and corresponding numerical linear algebra were presented (Bolten, Brannick, Delvenne, Dhillon, Gleich, Ishteva, Kahl, Savas, Szyld). Of particular interest is the extraction of information from extremely large graphs. Given that the class of multigrid methods is standard for solving large sparse matrix problems derived from partial differential equations, it is very natural that these methods should be tried for graph problems. In this direction, new adaptive algebraic multigrid methods for obtaining the stationary distribution of Markov chains were presented.
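To make the underlying problem concrete: the stationary distribution of a Markov chain with row-stochastic transition matrix P is a probability vector pi satisfying pi = pi P. The sketch below approximates it with plain power iteration on a small, made-up three-state chain. It is only an illustration of the problem statement and is not the adaptive algebraic multigrid approach presented at the seminar; the function name, tolerance, and example matrix are assumptions chosen for this example.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi P for a row-stochastic matrix P (list of rows).

    Plain power iteration; assumes the chain is irreducible and aperiodic
    so that the iteration converges to a unique stationary distribution.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: new_pi[j] = sum_i pi[i] * P[i][j]
        new_pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(new_pi)
        new_pi = [x / total for x in new_pi]  # renormalize against rounding drift
        if sum(abs(a - b) for a, b in zip(new_pi, pi)) < tol:
            return new_pi
        pi = new_pi
    return pi


# Hypothetical 3-state chain used only for illustration.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = stationary_distribution(P)
print(pi)
```

Power iteration converges at a rate governed by the second-largest eigenvalue modulus of P, which can be very close to 1 for large graphs; that slow convergence is one motivation for trying multilevel methods on such problems.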

It has recently been recognized that optimization on manifolds (Boumal, Sepulchre) is a powerful tool for solving problems that occur in information sciences. As real-world data are often organized in more than two categories, tensor methods (Alter, De Lathauwer, Elden, Khoromskij, Sorber) are becoming a hot topic, and the talks showed that the techniques have developed to the point where large problems can now be treated. Tensor methods have long been used for extremely large problems arising in physics. It has been conjectured that those techniques can also be used for problems in information science, and preliminary discussions along those lines took place. Some tensor computations are themselves based on manifold optimization, which led to interesting discussions on the interplay between these areas.

The atmosphere of the meeting was very informal and friendly. Lively discussions took place during and after the talks and continued after dinner. Although it is too early to tell whether the seminar led to new collaborations between the participants, some preliminary contacts were made. An open problem in spectral partitioning was raised (Elden) and a preliminary solution was suggested (Gleich, Kahl).

The participants of this seminar had a chance to interact with Dagstuhl Seminar 11452 "Analysis of Dynamic Social and Technological Networks", held at the same time. Indeed, the Thursday morning session was arranged as a joint session of both seminars, focusing on introducing the different methodological approaches to all participants.

## Classification

- Data Bases / Information Retrieval

## Keywords

- Data Mining
- Network Analysis
- Modeling
- Dynamical Systems
- Optimization
- Numerical Methods