- Christina Schwarz (for administrative matters)
The focus of this Dagstuhl Seminar will be on fundamental theoretical issues in discrete and continuous adjoint methods with applications in Computational Science, Engineering, and Finance (CSEF).
With the ever-increasing performance of HPC systems, the resolution at which problems in CSEF formulated as (systems of) partial differential equations (PDE) can be solved numerically is rising steadily. Adjoint methods enable the highly desirable transition from numerical simulation to deterministic (derivative-based nonlinear) optimization, as they provide sensitivity information of target functionals with respect to a potentially very large number of parameters at a computational complexity that is independent of this number. Thus many (inverse) problems whose solution would otherwise be beyond modern computational capabilities can be dealt with numerically. Typical examples include PDE-constrained optimization and optimal control, uncertainty quantification and error analysis / correction, and large-scale parameter estimation / data assimilation.
The development of adjoint numerical methods yields a large number of theoretical, algorithmic, and practical (implementation) challenges, most of them to be addressed by state-of-the-art Computer Science and Applied Mathematics methodology, including parallel high-performance computing, domain-specific program analysis and compiler construction, combinatorial scientific computing, numerical linear algebra / analysis, and functional analysis.
One aim of this seminar is to tackle these challenges by setting the stage for accelerated development and deployment of such methods based on in-depth discussions between computer scientists, mathematicians, and practitioners from various application areas. We plan to initiate a sequence of related events alternating between Schloss Dagstuhl and the Mathematisches Forschungsinstitut Oberwolfach, thus emphasizing the obvious synergies between computer science and mathematics in the given context.
For this first meeting we would like to focus on some fundamental theoretical issues arising in the context of "continuous vs. discrete adjoints" in addition to introductory lectures and round table discussions / working groups on both approaches. Motivation will be provided by presentations of various applications of adjoint methods in CSEF.
The human desire for meaningful numerical simulation of physical, chemical, biological, economic, financial, and other phenomena in CSEF has been growing with the performance of continuously improving computer systems. As a result of this development we are (and will remain) faced with a large and growing number of highly complex numerical simulation codes that run at the limit of the available HPC resources. These codes often result from the discretization of systems of PDE. Their run time correlates with the spatial and temporal resolution, which often needs to be very high in order to capture the real behavior of the underlying system. There is no doubt that the available hardware will always be used to the extreme. Improvements in the run time of the simulations must therefore be sought through research in numerical algorithms and their efficient implementation on HPC architectures.
Problem sizes are often in the billions of unknowns; with emerging large-scale computing systems, this size is expected to increase by a factor of one thousand over the next five years. Moreover, simulations are increasingly used in design optimization and parameter identification, which is even more complex and requires the highest possible computational performance as well as fundamental enabling algorithmic technology. Derivatives of certain objectives of these numerical models with respect to a potentially very large number of model parameters are crucial for the highly desirable transition from pure simulation to optimization. Approximation of these derivatives via finite difference quotients often lacks the required accuracy. More importantly, it may be infeasible for a large parameter space in terms of computational complexity. Until recently, adjoint numerical programs were written by hand to overcome this problem. Such programs compute (large) gradients with machine accuracy at a small constant multiple of the computational cost of the underlying primal simulation. Due to the enormous size of most numerical simulation codes, the manual procedure may take several person-years. Moreover, manual adjoint codes are error-prone and hard to maintain as the primal simulation evolves. Computer scientists have therefore developed special software tools based on the principles of algorithmic differentiation (AD) to generate discrete adjoint code automatically. Consequently, this method has gained considerable acceptance within the CSEF community, as illustrated by numerous successful case studies presented in the proceedings of the six international conferences on AD held to date. See http://www.autodiff.org for details.
Illustrative Example: Classical applications of adjoint methods arise in the context of large-scale inverse problems, such as the estimation of unknown or uncertain parameters of mathematical models for real-world problems, implemented as computer programs. Imagine the optimization of the shape of an aircraft with the objective of maximizing its lift. The continuous mathematical domain (the surface of the aircraft) is typically discretized through the generation of a mesh with a potentially very large number of points spread over the whole surface. Optimization aims to adapt the positions of these points in 3D space such that the objective is met while at the same time satisfying various constraints (e.g. a prescribed volume). A naive approach might run a potentially very large number of primal numerical simulations with varying mesh configurations, identifying an optimum only within this very limited search space.
Derivative-based approaches use information on the sensitivity of the objective at the given mesh configuration with respect to changes in the positions of all mesh points (the gradient) in order to make a deterministic decision about the next configuration to be considered. The sensitivities can be approximated through local perturbations of the position of each mesh point (finite difference quotients). A single optimization step would thus require a number of primal simulations that is of the order of the number of degrees of freedom (three spatial coordinates for each mesh point) induced by the mesh. This approach is practically infeasible as a single simulation may easily run for several minutes (if not hours) on the latest HPC architectures. The approximation of a single gradient would take months (if not years) for a mesh with only one million points.
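The cost argument above can be illustrated with a toy sketch: forward finite differences require one additional primal evaluation per parameter, so the cost of a single gradient grows linearly with the number of degrees of freedom. The objective below is a trivial stand-in for an expensive simulation; names and the step size are illustrative only.

```python
def finite_difference_gradient(f, x, h=1e-6):
    """Approximate grad f(x) with n + 1 evaluations of the primal f."""
    fx = f(x)  # one baseline primal run
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        grad.append((f(xp) - fx) / h)  # one extra primal run per parameter
    return grad

# Toy objective standing in for an expensive simulation: f(x) = sum of squares,
# with exact gradient 2 * x.
f = lambda x: sum(xi * xi for xi in x)
g = finite_difference_gradient(f, [1.0, 2.0, 3.0])
```

With three parameters this takes four primal runs; for a mesh with a million points and three coordinates each, the same scheme would need roughly three million runs per gradient, which is the infeasibility described above.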
Adjoint methods deliver the gradient at the cost of only a few (between 2 and 10) primal simulations. Continuous adjoint methods derive an adjoint version of the primal mathematical model analytically, followed by the numerical solution of the resulting adjoint model. While this approach promises low computational cost (approximately two primal simulations), it can be mathematically challenging, and its discretization may be numerically inconsistent with the primal numerical simulation. To the best of our knowledge, the derivation of continuous adjoint models has not yet been automated.
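The construction behind the adjoint approach can be sketched for a generic constrained formulation; this is the standard Lagrangian derivation, not tied to any particular application from the seminar. The key point is that the gradient with respect to all parameters follows from a single additional (linear) adjoint solve.

```latex
% Minimize an objective J over parameters p, with the state u
% determined by the (discretized) PDE residual R(u, p) = 0:
\min_{p} \; J(u, p) \quad \text{subject to} \quad R(u, p) = 0

% Lagrangian with adjoint variable (multiplier) \lambda:
L(u, p, \lambda) = J(u, p) + \lambda^{T} R(u, p)

% Stationarity with respect to u yields the adjoint equation --
% one linear solve, regardless of the dimension of p:
\left( \frac{\partial R}{\partial u} \right)^{T} \lambda
  = - \left( \frac{\partial J}{\partial u} \right)^{T}

% The full gradient then follows from inexpensive partial derivatives:
\frac{\mathrm{d} J}{\mathrm{d} p}
  = \frac{\partial J}{\partial p}
  + \lambda^{T} \, \frac{\partial R}{\partial p}
```

The continuous adjoint applies this derivation before discretization; the discrete adjoint applies the same idea to the discretized model, which is what AD automates.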
Discrete adjoint methods rely on the algorithmic differentiation of the primal numerical model, thus overcoming the potential numerical inconsistencies induced by the continuous adjoint. Depending on the mode of implementation of AD, the level of maturity of the AD tool, and the expertise of the user of the tool, the computational cost can range between 2 and 20 primal simulations, sometimes even more. Still, this cost is independent of the number of mesh points (referring to the above example). Solutions to problems arising in adjoint methods require expertise in both theoretical and applied Computer Science as well as in Numerical Analysis. Robust methods for the data-flow reversal within adjoint code are built on special graph partitioning and coloring algorithms. Their implementation on modern HPC architectures (e.g., using MPI and/or OpenMP) has an impact on simulation software design and data management. The use of accelerators has been considered only recently, with many as-yet-unsolved problems. Static and dynamic program analysis and compiler construction techniques have been developed to facilitate the semi-automatic generation of discrete adjoint code. The exploration of a potential extension of these techniques to continuous adjoint code was one of the subjects of this seminar. Other conceptual problems discussed included functional analytic aspects of adjoint methods and their impact on practical implementation, combinatorial problems in adjoint code generation and their computational complexities, and simulation software engineering guidelines in the light of adjoint methods.
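The mechanism that AD tools automate can be sketched in a minimal tape-based reverse-mode example (a didactic toy, not the interface of any particular AD tool): operations are recorded during the primal sweep, and a single reverse sweep propagates adjoints, yielding the full gradient at a cost independent of the number of inputs.

```python
class Var:
    """A value plus its recorded dependencies (the 'tape')."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent Var, local partial)
        self.adjoint = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def gradient(output, inputs):
    """One reverse sweep over the recorded computational graph."""
    # Topologically order the graph so each node's adjoint is fully
    # accumulated before it is propagated to its parents.
    order, seen = [], set()
    def visit(v):
        if id(v) not in seen:
            seen.add(id(v))
            for parent, _ in v.parents:
                visit(parent)
            order.append(v)
    visit(output)
    output.adjoint = 1.0
    for v in reversed(order):
        for parent, partial in v.parents:
            parent.adjoint += partial * v.adjoint
    return [x.adjoint for x in inputs]

x, y = Var(3.0), Var(4.0)
z = x * y + x * x          # primal sweep records the tape
g = gradient(z, [x, y])    # dz/dx = y + 2x = 10, dz/dy = x = 3
```

Production AD tools apply the same principle to entire simulation codes via source transformation or operator overloading; the engineering challenges listed above (data-flow reversal, checkpointing, parallelism) arise from doing this at scale.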
Adjoint methods borrow from a variety of subfields of Computer Science and Applied Mathematics, including high-performance and combinatorial scientific computing, program analysis and compiler construction, functional analysis, and numerical analysis and linear algebra, and are relevant to a wide range of potential areas of application. As such, the topic lends itself to a series of seminars taking more detailed looks at the respective subjects. With this seminar we intend to initiate a sequence of related events alternating between the Leibniz Center for Informatics at Schloss Dagstuhl and the Mathematisches Forschungsinstitut Oberwolfach, thus emphasizing the obvious synergies between Computer Science and Mathematics in the given context.
- Christian Bischof (TU Darmstadt, DE) [dblp]
- Luise Blank (Universität Regensburg, DE) [dblp]
- David Bommes (INRIA Sophia Antipolis - Méditerranée, FR) [dblp]
- Martin Bücker (Universität Jena, DE) [dblp]
- Bruce Christianson (University of Hertfordshire, GB) [dblp]
- Mike Dewar (NAG Ltd. - Oxford, GB) [dblp]
- Boris Diskin (National Institute of Aerospace - Hampton, US) [dblp]
- Patrick Farrell (University of Oxford, GB) [dblp]
- Christian Frey (DLR - Köln, DE)
- Nicolas R. Gauger (TU Kaiserslautern, DE) [dblp]
- Andreas Griewank (HU Berlin, DE) [dblp]
- Stefanie Günther (TU Kaiserslautern, DE)
- Ralf Hannemann-Tamas (NTNU - Trondheim, NO) [dblp]
- Laurent Hascoet (INRIA Sophia Antipolis - Méditerranée, FR) [dblp]
- Patrick Heimbach (MIT - Cambridge, US) [dblp]
- Paul D. Hovland (Argonne National Laboratory, US) [dblp]
- Barbara Kaltenbacher (Alpen-Adria-Universität Klagenfurt, AT) [dblp]
- Peter Korn (MPI für Meteorologie - Hamburg, DE) [dblp]
- Drosos Kourounis (University of Lugano, CH) [dblp]
- Ben Lenser (HU Berlin, DE)
- Johannes Lotz (RWTH Aachen, DE) [dblp]
- Marcus Meyer (Rolls-Royce - Blankenfelde-Mahlow, DE) [dblp]
- Viktor Mosenkis (RWTH Aachen, DE) [dblp]
- Jens-Dominik Mueller (Queen Mary University of London, GB)
- Uwe Naumann (RWTH Aachen, DE) [dblp]
- Eric Nielsen (NASA Langley ASDC - Hampton, US) [dblp]
- Jacques Peter (ONERA - Châtillon, FR)
- Olivier Pironneau (UPMC - Paris, FR) [dblp]
- Alex Pothen (Purdue University - West Lafayette, US) [dblp]
- Thomas Rung (TU Hamburg-Harburg, DE) [dblp]
- Ekkehard Sachs (Universität Trier, DE) [dblp]
- Max Sagebaum (TU Kaiserslautern, DE) [dblp]
- Stephan Schmidt (Imperial College London, GB) [dblp]
- Volker Schulz (Universität Trier, DE) [dblp]
- Markus Towara (RWTH Aachen, DE) [dblp]
- Stefan Ulbrich (TU Darmstadt, DE) [dblp]
- Jörn Ungermann (Forschungszentrum Jülich, DE)
- Jean Utke (Allstate - Northbrook, US) [dblp]
- Andrea Walther (Universität Paderborn, DE) [dblp]
- Qiqi Wang (MIT - Cambridge, US) [dblp]
- modelling / simulation
- continuous adjoints
- discrete adjoints
- high-performance scientific computing
- algorithmic differentiation