### 09.11.14 - 14.11.14, Seminar 14461

# High-performance Graph Algorithms and Applications in Computational Science

### The following text appeared on our web pages prior to the seminar, and was included as part of the invitation.

## Motivation

Graphs (or networks) are a very powerful abstraction of various phenomena that can be expressed as a relation between entities. For several decades researchers in theoretical computer science and discrete mathematics have been developing a wealth of graph theory and graph algorithms. Recently, however, we see a qualitative change how graph algorithms are used in practice. This change is due to (i) the complex structure of graphs in new and emerging applications, (ii) the size of typical inputs, and (iii) the computer systems on which graph problems are solved.

Application areas driving the aforementioned change can be very diverse. To name a few, consider microblogging services in social media, genome assembly in bioinformatics, routing in traffic networks, network analysis and graph mining (e. g. predicting network evolution or detecting communities), technology diffusion and decentralized trade in economics, and combinatorial scientific computing (e. g. spectral graph theory, graph partitioning). These application areas have in common that current and emerging data sources show massive scale and irregular or non-uniform distribution of structural and semantic properties of entities and their interactions.

Despite the increasing importance of these complex networks, research on their many aspects is still in early stages, and we are facing daunting challenges. One big challenge stems from the fact that the governing principles on how some of these networks evolve and function are unknown. Without such principles, we cannot predict what the Internet will look like in five years, we cannot simulate email exchange networks, and we cannot tell whether two graphs are similar or not.

Additional challenges are posed by the hardware. Tomorrow’s computer architectures are becoming increasingly parallel and heterogeneous. Yet, many graph algorithms as well as techniques in relevant applications are inherently sequential. That is why many previous results do not transfer easily into efficient computational tools. To fully exploit emerging architectures, however, it is important to develop parallel algorithms and tools. Other technological challenges, for example deep memory hierarchies and solid-state disks (SSDs), and particularly their impact on algorithm design have also to be considered in this context. Algorithmic techniques that show promise try to be oblivious to the specific hardware, and optimize for general classes of typical hardware.

As a consequence, the central goal of the seminar is to exploit different techniques for highperformance graph algorithm design in various application areas. This will span the issues of efficient and effective implementation from well known network analysis and graph mining techniques that are traditionally sequential to modern methods that require a high degree of parallelism to applications that need the performance of emerging hardware architectures. Only if larger inputs can be processed faster than before, the underlying science can reach new levels.

We will exploit synergies by bringing together researchers that represent various communities such as (a) researchers designing, analyzing, and implementing graph algorithms relevant to the seminar topics; (b) specialists in innovative models of computation as well as in parallel and highperformance computing; (c) specialists in combinatorial scientific computing; (d) researchers working on graph mining and network analysis techniques, who may come from very diverse fields such as databases, data mining, theoretical computer science, applied mathematics, and applied physics; (e) domain scientists as users of algorithms and software tools developed by the communities mentioned before.

Research on high-performance graph algorithms and applications in computational science is currently scattered across several communities with different cultures. With this seminar we want to promote this emerging scientific area as a subfield in the computing sciences. We hope to nurture strong collaborations and a “common language” between communities that are otherwise disjoint to some extent. By bringing together experts from different fields, we hope that the next-generation tools of the trade become significantly faster and more feature-rich than the current state of the art. This, in turn, will allow domain scientists to process larger data sets more effectively, which will allow them to tackle problems previously out of reach.