https://www.dagstuhl.de/15161

### April 12 – 17, 2015, Dagstuhl Seminar 15161

# Advanced Stencil-Code Engineering

## Organizers

Matthias Bolten (Bergische Universität Wuppertal, DE)

Robert D. Falgout (LLNL – Livermore, US)

Christian Lengauer (Universität Passau, DE)

Olaf Schenk (University of Lugano, CH)

## Documents

Dagstuhl Report, Volume 5, Issue 4

Motivation text

List of participants

Dagstuhl's Impact: documents available

## Summary

Stencil codes are compute-intensive algorithms in which data points arranged in a large grid are repeatedly recomputed from the values of data points in a predefined neighborhood. This fixed neighborhood pattern is called a stencil. Stencil codes are widely used to compute discrete solutions of partial differential equations and of systems composed of such equations. Connected to the implementation of stencil codes is the use of efficient solver technology, i.e., iterative solvers, such as multigrid methods, that rely on the application of a stencil and provide good convergence properties. Major application areas are the natural sciences and engineering. Many of these applications employ unstructured adaptive discretizations. However, for the efficient use of exascale supercomputers, whose architectures may include accelerators or be heterogeneous, structured discretizations and, thus, stencil codes have turned out to be helpful.
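To make the notion of a stencil concrete, here is a minimal sketch of one of the simplest cases: a sweep over a two-dimensional grid in which every interior point is replaced by the average of its four direct neighbors (the Jacobi update for the 5-point Laplacian). The function names `jacobi_sweep` and `iterate`, the list-of-lists grid, and the fixed boundary values are assumptions made for illustration; they are not taken from the seminar material.

```python
def jacobi_sweep(grid):
    """Apply one sweep of a four-neighbor averaging stencil.

    Every interior point of the result is computed from the OLD values of
    its north, south, west and east neighbors; boundary points are copied.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # copy, so the sweep reads only old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def iterate(grid, sweeps):
    """Recompute the grid repeatedly, as a stencil code does."""
    for _ in range(sweeps):
        grid = jacobi_sweep(grid)
    return grid
```

The loop nest has fixed bounds and the same access pattern at every point, which is the regularity that the remainder of this summary relies on.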

Stencil codes come in a large variety: there are many thousands of them. Deriving each one individually, even by modifying existing code, is not practical. The goal of the seminar was to raise the level of abstraction for application programmers significantly and to support this with automated software technology that generates highly efficient, massively parallel implementations tuned to the specific problem at hand and to the execution platform used.

### Research Challenges

Stencil codes are algorithms with a pleasantly high regularity: the data structures are multi-dimensional grids, and the computations follow a static, locally contained dependence pattern and are typically arranged in nested loops with affine bounds. This invites massive parallelism and raises the hope of easily achieved high performance. However, serious challenges remain:

- Because of the large number and variety of stencil code implementations, deriving each of them individually, even by modifying existing code, is not practical. Not even the use of program libraries is practical; instead, a domain-specific metaprogramming approach is needed.
- Reaching petascale to exascale execution speed is a challenge in the frequently used multigrid algorithms, which work on a hierarchy of progressively finer, and hence larger, grids. The coarse grids in the upper part of the hierarchy are too small for massive parallelism.
- Efficiency, i.e., a high ratio of speedup to the degree of parallelism, is impaired by the low mathematical density of stencil codes, i.e., their low ratio of computation steps to data transfers (often called arithmetic intensity).
- An inappropriate use of the execution platform may act as a performance brake.
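Two of the challenges above can be illustrated with back-of-the-envelope arithmetic. The sketch below lists the number of points on each level of a multigrid hierarchy and computes the ratio of computation steps to data transfers for a simple stencil. The helper names, the coarsening rule (halving the number of intervals per dimension), and the cost model (4 floating-point operations and 16 bytes of memory traffic per point for a four-neighbor average in double precision with ideal caching) are all assumptions chosen for illustration, not figures from the seminar.

```python
def grid_hierarchy(finest_points_per_dim, dims=2):
    """Total number of points on each level, finest first.

    Assumes standard coarsening: a level with 2**k + 1 points per
    dimension is followed by one with 2**(k - 1) + 1 points per dimension.
    """
    sizes = []
    n = finest_points_per_dim
    while n >= 3:
        sizes.append(n ** dims)
        n = (n - 1) // 2 + 1
    return sizes


def arithmetic_intensity(flops_per_point, bytes_per_point):
    """Ratio of computation steps to data transferred, in flops per byte."""
    return flops_per_point / bytes_per_point


if __name__ == "__main__":
    for level, points in enumerate(grid_hierarchy(1025)):
        print(f"level {level}: {points} points")
    # Assumed cost model: 3 additions + 1 multiplication per point,
    # one 8-byte read and one 8-byte write per point.
    print(arithmetic_intensity(4, 16), "flops per byte")
```

Under these assumptions, a finest grid of 1025 × 1025 points leads to ten levels, the coarsest of which has only 9 points, far fewer than the number of cores in a large machine. The assumed stencil performs only 0.25 floating-point operations per byte moved.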

Stencil-code engineering has received increased attention in the last few years, as evidenced by the appearance of a number of stencil-code programming languages and frameworks. To reach the highest possible execution speed and to conserve hardware resources and energy, the stencil code must be tuned cleverly to the specific application problem at hand and to the execution platform used. One approach that could be followed was demonstrated by the earlier U.S. project SPIRAL, whose target was the domain of linear transforms: domain-specific optimization at several levels of abstraction, from the mathematical equations via an abstract, domain-specific program and, in further steps, down to the actual target code on the execution platform used. At each level, one makes aggressive use of knowledge of the problem and platform and employs up-to-date, automated software technology suitable for that level.

### Questions and Issues Addressed

The charter of the seminar was to foster international cooperation in the development of a radically new, automatic, optimizing software technology for the effective and flexible exploitation of massively parallel architectures for dedicated, well-delineated problem domains.

The central approaches in achieving this technology are:

- the aggressive use of domain knowledge for optimization at different levels of abstraction
- the exploitation of commonalities and variabilities in application codes via product-line technology and domain engineering
- the use of powerful models for program optimization, like the polyhedron model for loop parallelization and feature-orientation for software product lines
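As a small illustration of the kind of loop transformation that such models reason about, the sketch below applies spatial tiling by hand to a four-neighbor averaging sweep and checks it against the untiled loop nest. It was not produced by any polyhedral tool; the function names, the tile size, and the list-of-lists grid are assumptions for illustration. Tiling is legal here because the sweep writes to a separate output grid, so the iterations of one sweep are independent of each other.

```python
def sweep(grid):
    """Reference sweep: plain doubly nested loop over the interior points."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def tiled_sweep(grid, tile=2):
    """Same computation, with the iteration space cut into tile x tile blocks."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for ii in range(1, rows - 1, tile):          # loop over tile origins
        for jj in range(1, cols - 1, tile):
            # Loop over the points inside one tile, clipped at the boundary.
            for i in range(ii, min(ii + tile, rows - 1)):
                for j in range(jj, min(jj + tile, cols - 1)):
                    new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return new
```

Both functions visit the same set of points and perform the same operations per point; only the visiting order differs, which is what makes blocks of the grid candidates for cache reuse or for distribution across processing units.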

Among the issues discussed were:

- What are suitable abstraction, modularization, composition and generation mechanisms for stencil codes?
- What are the appropriate language features of a domain-specific language for stencil codes?
- What are the commonalities and variabilities of stencil codes?
- What are the computational performance barriers, especially of multigrid methods using stencils, and how can they be overcome?
- What are the performance barriers caused by data exchanges and how can they be overcome? How can communication be avoided in multilevel algorithms?
- What are the roles of nested loops and divide-and-conquer recursions in stencil codes?
- How can other solvers and preconditioners benefit from autotuned stencil codes?
- What role should techniques like autotuning and machine learning play in the optimization of stencil codes?
- What options exist for mapping stencil codes to a heterogeneous execution platform, and how can an informed choice be made?
- Which techniques can be employed to make clever use of large-scale hybrid architectures, e.g., by the combination of multigrid with mathematical domain decomposition?

On the informatics side, one important role of the seminar was to inform the international stencils community about the techniques used in ExaStencils: software product lines, polyhedral loop optimization and architectural metaprogramming. Equally important was for ExaStencils members to learn about the experience gained with other techniques, such as divide-and-conquer, multicore optimization in parallel algorithms, or autotuning. The application experts contributed to a realistic grounding of the research questions.

On the mathematics side, the seminar fostered cooperation between experts in parallel solver technology and the groups from informatics, enabling the latter to make use of the advanced techniques available. Further, different strategies for improving the scalability of iterative methods were discussed, and awareness of the opportunities and complexities of modern architectures was raised in the numerical mathematics community.

**Summary text license**

Creative Commons BY 3.0 Unported license

Matthias Bolten, Robert D. Falgout, Christian Lengauer, and Olaf Schenk

## Classification

- Modelling / Simulation
- Programming Languages / Compiler
- Software Engineering

## Keywords

- Stencil codes
- Architectural metaprogramming
- Linear solvers
- Multigrid methods
- Supercomputing
- Software engineering
- Massive parallelism