- Peter Zaspel (HITS & Universität Heidelberg)
- Simone Schilke (for administrative matters)
High performance computing (HPC) is a key technology for solving large-scale real-world simulation problems on parallel computers. Simulations for a fixed, deterministic set of parameters are the current state of the art. However, there is a growing demand for methods that appropriately cope with uncertainties in those input parameters. This demand is addressed by the developing research field of uncertainty quantification. Here, (pure) Monte Carlo methods are easy to parallelize and are thus well suited to parallel computing. However, they converge slowly: the sampling error decays only in proportion to the inverse square root of the number of samples.
The proposed seminar aims at bringing together experts in the fields of uncertainty quantification and high performance computing. Contributions with an industrial background are strongly encouraged. Discussions on the latest numerical techniques beyond pure Monte Carlo, such as Polynomial Chaos, Stochastic Collocation, Gaussian Process Regression, Quasi-Monte Carlo or Multi-Level Monte Carlo, shall be fostered. The covered topics will include, but are not limited to, inference, control and optimization under uncertainties on HPC systems, scalable multi-level, higher-order and low-discrepancy methods, parallel adaptive methods, model reduction, parallelization techniques, parallel software frameworks and resilience. These topics shall be put in the context of large-scale real-world problems on parallel computers.
Uncertainty quantification (UQ) aims at approximating measures of the impact of uncertainties in, e.g., simulation parameters or simulation domains. It is therefore of great importance for both academic research and industrial development. In uncertainty quantification, one distinguishes between classical forward uncertainty propagation and more involved inference, optimization or control problems under uncertainties. Forward uncertainty propagation is concerned with deterministic numerical models for, e.g., engineering problems, in which parts of the input data (domain, parameters, ...) might be affected by uncertainties, i.e. they have a random nature. Randomness is usually characterized by random fields that replace the originally deterministic inputs. In Bayesian inference, parameters of a system are to be derived from given measurements. Since the measurements are assumed to be affected by some (stochastic) error, this inference approach tries to derive the probability with which a given parameter leads to the observed measurements. In this sense, Bayesian inference complements classical inverse problems with a stochastic viewpoint. Other fields of interest for a similar uncertainty analysis are optimization and control.
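The Bayesian setting described above is commonly written in the following form. The notation is assumed here for illustration and does not come from the seminar text: theta denotes the unknown parameters, y the measurements, G the deterministic forward model, eta the stochastic measurement error and pi_0 the prior density.

```latex
y = G(\theta) + \eta,
\qquad
\pi(\theta \mid y)
  = \frac{\pi(y \mid \theta)\,\pi_0(\theta)}
         {\int \pi(y \mid \theta')\,\pi_0(\theta')\,\mathrm{d}\theta'}
```

Every evaluation of the likelihood pi(y | theta) requires one run of the forward model G, which is why inference is typically far more expensive than a single forward propagation.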
High performance computing (HPC) is an interdisciplinary research field in computer science, mathematics and engineering. Its aim is to develop hardware, algorithmic approaches and software to solve (usually) mathematically formulated problems on large clusters of interconnected computers. The dominant part of the research involved is done in parallel computing. From a hardware perspective, HPC or parallel computing requires the development of computing technologies that can, e.g., solve several problems at the same time at high performance and low power. Moreover, hardware developments in HPC often aim at improving network communication technologies, which are necessary to let a (potentially) large set of computers solve a single problem in a distributed way. From an algorithmic perspective, methods known from numerical mathematics and data processing are adapted such that they can run in a distributed way on different computers. Here, a key notion is (parallel) scalability, which describes the ability to improve the performance or throughput of a given method by increasing the number of computers used. Most algorithmic developments aim to improve this scalability for numerical methods. Research in software aims at defining appropriate programming models for parallel algorithms, providing efficient management layers for the underlying hardware and implementing the proposed parallel algorithms in real software.
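The notion of parallel scalability can be made concrete with the standard definitions of speedup and parallel efficiency. The following is a minimal sketch; the function names and the timings in the usage note are illustrative assumptions, and Amdahl's law is added here as a well-known textbook model, not as something taken from the seminar.

```python
def speedup(t_serial, t_parallel):
    """Speedup: how many times faster the parallel run is than the serial run."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, n_workers):
    """Efficiency: speedup divided by the number of workers (1.0 is ideal)."""
    return speedup(t_serial, t_parallel) / n_workers


def amdahl_speedup(serial_fraction, n_workers):
    """Amdahl's law: upper bound on the speedup if a fixed fraction of the
    work cannot be parallelized (a simplified model, assumed for illustration)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_workers)
```

For example, a hypothetical code that takes 100 s on one computer and 25 s on eight has a speedup of 4 and an efficiency of 0.5. Under Amdahl's model, a serial fraction of 10 % limits the speedup to less than 10, no matter how many computers are added.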
In UQ, (partial) differential equations with random data are approximately solved by either intrusive or non-intrusive methods. An intrusive technique discretizes stochastic and physical space simultaneously; the classical example is the stochastic Galerkin approach. This method has favorable properties such as small errors with a smaller number of equations and a potentially small overall run-time. To achieve that, however, existing deterministic PDE solvers have to be re-discretized and re-implemented. Non-intrusive techniques (e.g. (quasi-)Monte Carlo, multi-level Monte Carlo, stochastic collocation, ...), on the other hand, reuse existing solvers and simulation tools and generate a series of deterministic solutions, which are then used to approximate stochastic moments. This makes uncertainty quantification possible even for very complex large-scale applications for which re-implementing existing solvers is not an option. The non-intrusive approach comes with a rather extreme computational effort, since hundreds, thousands or even more deterministic problems have to be solved. While a single real-world forward uncertainty propagation problem is already extremely computationally intensive, even on a larger parallel computer, inference, optimization and control under uncertainties often go beyond the limits of currently available parallel computers.
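The non-intrusive idea, i.e. treating an existing deterministic solver as a black box and approximating stochastic moments from many of its outputs, can be sketched as follows. Everything in this block is a hypothetical stand-in chosen for illustration: the "solver" is the closed-form midpoint value of a one-dimensional diffusion problem, and the uniform distribution of the coefficient is an assumption.

```python
import random
import statistics


def deterministic_solver(diffusion_coefficient):
    """Hypothetical stand-in for an existing simulation code.

    For -a u'' = 1 on (0, 1) with u(0) = u(1) = 0, the solution is
    u(x) = x (1 - x) / (2 a), so the midpoint value is u(0.5) = 1 / (8 a).
    """
    return 1.0 / (8.0 * diffusion_coefficient)


def sample_coefficient(rng):
    """Assumed input uncertainty: a is uniformly distributed on [1, 2]."""
    return rng.uniform(1.0, 2.0)


def monte_carlo_moments(solver, sample_input, n_samples, seed=0):
    """Plain Monte Carlo: call the unmodified solver once per random input
    and estimate mean and variance of the output from the results."""
    rng = random.Random(seed)
    outputs = [solver(sample_input(rng)) for _ in range(n_samples)]
    return statistics.fmean(outputs), statistics.variance(outputs)


mean, variance = monte_carlo_moments(deterministic_solver, sample_coefficient, 10_000)
print(f"estimated mean = {mean:.5f}, estimated variance = {variance:.2e}")
```

The solver itself is never modified, which is the defining property of a non-intrusive method. In a real application each call would be a full PDE solve, so the 10,000 calls used here illustrate where the extreme computational effort comes from.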
In HPC, we have to distinguish between methods that are intrinsically (often also called embarrassingly) parallel and those that have to exchange data to compute a result. Embarrassingly parallel algorithms can compute independently on completely decoupled parts of a given problem. Prominent examples in UQ are Monte Carlo-type methods. The other extreme are approaches that have to exchange a lot of data in order to solve a given problem. Here, prominent examples are adaptive and multi-level methods in general, and stochastic Galerkin methods. Both types of methods tend to have excellent approximation properties, but require considerable effort in parallel algorithm design to be scalable on parallel computers. Scalability considerations might become even more important on the next generation of the largest parallel computers, which are expected to be available at the beginning of the next decade. These parallel Exascale computers will operate at the exaFLOP level, i.e. they will be able to perform 10^18 floating-point operations per second. Technological limitations in chip production will force computing centers to install systems with a parallel processor count that is orders of magnitude higher than in current systems. Current parallel algorithms might not be prepared for this next step.
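What makes Monte Carlo-type methods embarrassingly parallel is that each sample is a fully independent task and data is combined only once, in the final average. The sketch below illustrates this with Python's standard thread pool. The per-sample "solve" is a hypothetical closed-form stand-in, one seed per sample is a simplification, and threads are used only to keep the example portable; a production HPC code would more likely distribute the samples over processes or MPI ranks.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_sample(seed):
    """One independent task: draw a random input and 'solve' the problem.

    The closed-form expression stands in for an expensive deterministic
    simulation (hypothetical). Each task has its own random generator,
    so no state is shared between tasks.
    """
    rng = random.Random(seed)
    coefficient = rng.uniform(1.0, 2.0)
    return 1.0 / (8.0 * coefficient)


def serial_mean(n_samples):
    """Reference: process all samples one after another."""
    return sum(run_sample(seed) for seed in range(n_samples)) / n_samples


def parallel_mean(n_samples, n_workers=4):
    """Distribute the samples over workers; the only data exchange is
    the collection of results for the final average."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        outputs = list(pool.map(run_sample, range(n_samples)))
    return sum(outputs) / len(outputs)
```

Because `pool.map` returns the results in submission order, the parallel and the serial version sum the same numbers in the same order. Adaptive, multi-level or stochastic Galerkin methods do not decompose this cleanly, which is why their parallelization takes considerably more effort.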
The Dagstuhl Seminar on "Uncertainty Quantification and High Performance Computing" brought together experts from UQ and HPC to discuss some of the following challenging questions:
- How can real-world forward uncertainty problems or even inference, control and optimization under uncertainties be made tractable by high performance computing?
- What types of numerical uncertainty quantification approaches are able to scale on current or future parallel computers, without sticking to pure Monte Carlo methods?
- Might adaptivity, model reduction or similar techniques improve existing uncertainty quantification approaches, without breaking their parallel performance?
- Can we efficiently use Exascale computing for large-scale uncertainty quantification problems without being affected by performance, scalability and resilience problems?
- Does current research in uncertainty quantification fit the needs of industrial users? Would industrial users be willing and able to use HPC systems to solve uncertainty quantification problems?
Several presentations covered Bayesian inference / inversion (Ghattas, Marzouk, Najm, Peters), where seismology is an extremely computationally expensive problem that can only be solved on the largest parallel computers (Ghattas). While parallelization is crucial, the numerical methods have to be adapted as well, such that fast convergence is achieved (Ghattas, Marzouk, Peters). The very computationally intensive optimization under uncertainties (Benner) becomes tractable through the use of tensor approximation methods (Benner, Oseledets). Tensor approximation methods as well as hierarchical matrices (Börm, Zaspel) are optimal-complexity numerical methods for a series of applications in UQ. However, their large-scale parallelization is still a subject of research.
A series of talks considered mesh-free approximation methods (Rieger, Teckentrup, Zaspel) with examples in Gaussian process regression (Teckentrup) and kernel-based methods. These methods were shown to have provable error bounds (Rieger, Teckentrup) and to scale on parallel computers (Rieger, Zaspel). Moreover, they are also well suited to inference (Teckentrup). Sparse grid techniques were considered as an example of classical approximation methods for higher-dimensional problems (Stoyanov, Peters, Harbrecht, Pflüger). Here, recent developments in adaptivity and optimal convergence were discussed. Sparse grid techniques are usually considered in a non-intrusive setting, such that parallel scalability is often guaranteed. Compressed sensing promises to reduce the number of simulations in a non-intrusive framework (Dexter). Quasi-Monte Carlo methods are under investigation for optimal convergence (Nuyens). The latter methods are of high interest because they scale excellently on parallel computers, owing to the full decoupling of all deterministic PDE solves, while achieving convergence orders beyond those of classical Monte Carlo methods.
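The difference between quasi-Monte Carlo and classical Monte Carlo sampling can be illustrated on a small integration problem. The sketch below uses a Halton sequence as one well-known low-discrepancy construction; this choice, the test integrand and all function names are assumptions made for illustration, and the methods presented at the seminar (e.g. lattice rules) may differ.

```python
import random


def van_der_corput(index, base):
    """Radical inverse of `index` in the given base: the digits of the
    index are mirrored at the radix point, giving a value in [0, 1)."""
    value, denominator = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denominator *= base
        value += digit / denominator
    return value


def halton_point(index, bases=(2, 3)):
    """Point of the Halton sequence: one coprime base per dimension."""
    return tuple(van_der_corput(index, base) for base in bases)


def integrand(x, y):
    """Test function; its exact integral over the unit square is 1/4."""
    return x * y


def qmc_estimate(n_points):
    """Quasi-Monte Carlo: average over deterministic low-discrepancy points."""
    total = sum(integrand(*halton_point(i)) for i in range(1, n_points + 1))
    return total / n_points


def mc_estimate(n_points, seed=0):
    """Classical Monte Carlo: average over pseudo-random points."""
    rng = random.Random(seed)
    total = sum(integrand(rng.random(), rng.random()) for _ in range(n_points))
    return total / n_points


for n in (64, 512, 4096):
    print(n, abs(qmc_estimate(n) - 0.25), abs(mc_estimate(n) - 0.25))
```

As in plain Monte Carlo, every point can be evaluated independently of all others, so the decoupling of the deterministic solves is preserved. Only the way the points are chosen changes.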
Adaptivity leads to strongly improved approximations using the same number of deterministic PDE solutions (Pflüger, Stoyanov, Webster, ...). However, how to parallelize adaptive schemes efficiently is still a subject of research. The general class of multi-level schemes was also under investigation (Dodwell, Zhang), including, but not limited to, multi-level Monte Carlo and multi-level reduced basis approaches. These methods show excellent convergence properties. However, their efficient and scalable parallelization is likewise the subject of intensive study.
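The basic mechanism behind multi-level Monte Carlo is a telescoping sum: the expectation on the finest level is written as the expectation on the coarsest level plus the expectations of the differences between consecutive levels, and most samples are spent on the cheap coarse level. The following is a minimal sketch under stated assumptions. The level hierarchy (a midpoint quadrature rule with a doubling number of cells), the uniform random input and the fixed sample counts are all hypothetical illustration choices, not the schemes presented at the seminar.

```python
import math
import random


def level_approximation(z, level):
    """Hypothetical model hierarchy: midpoint rule with 2**(level + 1)
    cells for the integral of exp(z * x) over [0, 1]. Higher levels are
    more accurate and more expensive."""
    n_cells = 2 ** (level + 1)
    h = 1.0 / n_cells
    return h * sum(math.exp(z * (k + 0.5) * h) for k in range(n_cells))


def mlmc_estimate(samples_per_level, seed=0):
    """Multi-level Monte Carlo estimate of E[Q(z)] for z uniform on [0, 1].

    Level 0 estimates E[Q_0]; each further level estimates the
    correction E[Q_l - Q_(l-1)], using the SAME random input for the
    fine and the coarse evaluation so that the difference has small
    variance and needs few samples.
    """
    rng = random.Random(seed)
    estimate = 0.0
    for level, n_samples in enumerate(samples_per_level):
        total = 0.0
        for _ in range(n_samples):
            z = rng.random()
            fine = level_approximation(z, level)
            coarse = level_approximation(z, level - 1) if level > 0 else 0.0
            total += fine - coarse
        estimate += total / n_samples
    return estimate


# Many cheap coarse samples, few expensive fine samples (assumed counts).
print(mlmc_estimate([4000, 1000, 250, 60]))
```

In practice the number of samples per level is chosen from estimated variances and costs rather than fixed in advance. The levels then carry very different workloads, which is one reason why load balancing makes scalable parallelization of such schemes harder than for plain Monte Carlo.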
Performance considerations in the field of HPC (including future parallel computers) were discussed (Heuveline, Legrand). Performance predictability is necessary to understand the scaling behavior of parallel codes on future machines (Legrand). Parallel scalability of (elliptic) stochastic PDEs by domain decomposition was discussed by Le Maître. His approach makes it possible to increase parallel scalability and might also point towards resilience.
Industrial applications were considered for the company Bosch (Schick), where intrusive and non-intrusive approaches are under investigation. High performance computing is still a subject of discussion in this industrial context. One of the key applications, which is expected to become an industry-like application, is UQ in medical engineering (Heuveline). Once introduced into the daily work cycle at hospitals, it is expected to become a driving technology for our health.
Based on the survey and personal feedback from the invitees, the general consensus is that there is a high interest in deepening the discussions at the border of UQ and HPC. While some answers to the above questions could be given, there is still a lot more to learn, to discuss and to develop. A general wish is therefore to have similar meetings in the future.
The organizers would like to express their gratitude to all participants of the Seminar. Special thanks go to the Schloss Dagstuhl team for its extremely friendly support during the preparation phase and for the warm welcome at Schloss Dagstuhl.
- Peter Benner (MPI - Magdeburg, DE)
- Steffen Börm (Universität Kiel, DE)
- Nick Dexter (University of Tennessee - Knoxville, US)
- Tim Dodwell (University of Exeter, GB)
- Omar Ghattas (Univ. of Texas at Austin, US)
- Helmut Harbrecht (Universität Basel, CH)
- Vincent Heuveline (HITS & Universität Heidelberg)
- Olivier Le Maitre (LIMSI - Orsay, FR)
- Arnaud Legrand (INRIA - Grenoble, FR)
- Youssef M. Marzouk (MIT - Cambridge, US)
- Habib Najm (Sandia Nat. Labs - Livermore, US)
- Dirk Nuyens (KU Leuven, BE)
- Ivan Oseledets (Skoltech - Moscow, RU)
- Michael Peters (Universität Basel, CH)
- Dirk Pflüger (Universität Stuttgart, DE)
- Christian Rieger (Universität Bonn, DE)
- Michael Schick (Robert Bosch GmbH - Stuttgart, DE)
- Miroslav Stoyanov (Oak Ridge National Laboratory, US)
- Aretha Teckentrup (University of Warwick - Coventry, GB)
- Clayton Webster (Oak Ridge National Laboratory, US)
- Peter Zaspel (HITS & Universität Heidelberg)
- Guannan Zhang (Oak Ridge National Laboratory, US)
- data structures / algorithms / complexity
- modelling / simulation
- software engineering
- Uncertainty Quantification
- High Performance Computing
- Stochastic Modeling