February 12 – 17, 2006, Dagstuhl Seminar 06071
Architectures and Algorithms for Petascale Computing
- Press release of February 6, 2006: "Supercomputer simulieren virtuelle Welten" (German only)
- Video by Saarländischer Rundfunk Aktueller Bericht: "Computer-Tagung im Nordsaarland"
DiVX [14.9 MB], WMV [13.8 MB] (Author: Jürgen Rinner; February 16, 2006)
- Radio broadcast by Radio Berlin-Brandenburg: "Supercomputer - wozu sind sie gut?"
MP3, rbb-online (Author: Thomas Prinzler; February 21, 2006)
This seminar will focus on high-end simulation as a tool for computational science and engineering applications. To be useful tools for science, such simulations must be based on accurate mathematical descriptions of the processes involved, and thus they begin with mathematical formulations such as partial differential equations or integral equations.
Because of the ever-growing complexity of scientific and engineering problems, computational needs continue to increase rapidly. But most currently available hardware, software, systems, and algorithms are focused primarily on business applications or smaller-scale scientific and engineering problems, and cannot meet the high-end computing needs of cutting-edge scientific and engineering work. This seminar is concerned with petascale scientific computations, which are highly computation- and data-intensive and whose demands cannot be satisfied in today's typical cluster environment. The target hosts for these simulation tools are systems comprising thousands to tens of thousands of processors. By the end of the decade such systems are expected to reach a performance of one Petaflop, that is, 10^15 floating point operations per second.
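To make the scale concrete, the short Python sketch below divides the one-Petaflop target evenly across a machine. It is only a back-of-the-envelope illustration: the function name and the sample processor counts are assumptions chosen from the "thousands to tens of thousands" range mentioned above, and it assumes perfect scaling with an even load, which real codes will not achieve.

```python
# Rough estimate: sustained rate each processor must deliver so that the
# whole machine reaches one Petaflop (10^15 floating point operations/s).
# Assumes perfect scaling and an even load (an idealization).

PETAFLOP = 1e15  # floating point operations per second


def sustained_rate_per_processor(total_flops: float, n_processors: int) -> float:
    """Return the flop/s each processor must sustain to reach total_flops."""
    if n_processors <= 0:
        raise ValueError("n_processors must be positive")
    return total_flops / n_processors


if __name__ == "__main__":
    # Sample processor counts (illustrative, not from the seminar text).
    for n in (2_000, 10_000, 50_000):
        rate = sustained_rate_per_processor(PETAFLOP, n)
        print(f"{n:>6} processors: {rate / 1e9:,.0f} Gflop/s each")
```

With 10,000 processors, for example, each one would have to sustain 100 Gflop/s, which is why per-processor efficiency matters as much as the processor count.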
The rapid progress over the past three decades in high performance simulation has recently been facing an increasing number of obstacles that are so fundamental that no single solution is in sight. Instead, only a combination of approaches seems to promise the successful transition to petascale simulation.
- Petaflop systems are necessarily massively parallel. Many simulation codes currently in use (e.g. commercial finite element packages) scale poorly on parallel systems, if at all, and even purpose-built programs cannot be expected to scale successfully to the tens of thousands of processing units that petascale systems will use.
- Achieving a significant percentage of peak processor performance has become increasingly difficult, especially for many commodity processors, because the so-called memory wall prevents better efficiency: the compute speed is not matched by the memory performance of such systems. Mainstream computer and CPU architecture is hitting severe limits, which may be most noticeable in high performance computing (but not only there).
- Further improvements in latency and bandwidth are hitting fundamental limits. At a clock rate of 10 GHz, light travels only 3 cm in vacuum per clock cycle, but a Petaflop system may be physically 100 m across, so latencies of several thousand clock cycles are unavoidable for such a system.
- Similarly, a Petaflop system would ideally have an aggregate bandwidth that, if transported on 128-bit-wide buses at a clock rate of 1 GHz, would require in excess of a million such buses operating in parallel. Therefore not only latency but also the available bandwidth may become a severe bottleneck.
- New and innovative hardware and software architectures will be required, but it will not be sufficient to design solutions only at the system level:
  - Additionally, the design (and implementation) of the algorithms must be revised and adapted.
  - New latency-tolerant and bandwidth-optimized algorithms must be invented and designed for petascale systems.
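The latency and bandwidth figures in the list above can be checked with a few lines of arithmetic. The Python sketch below is a rough sanity check, not a performance model: the function names are illustrative, the inputs are the figures quoted in the text, and the bus model (one full-width transfer per bus per clock cycle) is a simplifying assumption. The bytes-per-operation ratio at the end is derived here and does not appear in the seminar description.

```python
# Sanity check of the latency and bandwidth figures quoted in the text.
# Function names are illustrative; the inputs come from the seminar text.

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def light_distance_per_cycle_m(clock_hz: float) -> float:
    """Distance light covers in vacuum during one clock cycle."""
    return SPEED_OF_LIGHT_M_PER_S / clock_hz


def min_one_way_latency_cycles(distance_m: float, clock_hz: float) -> float:
    """Lower bound on the clock cycles a signal needs to cross distance_m."""
    return distance_m / light_distance_per_cycle_m(clock_hz)


def aggregate_bandwidth_bytes_per_s(n_buses: int, width_bits: int,
                                    clock_hz: float) -> float:
    """Total bandwidth, assuming one full-width transfer per bus per cycle."""
    return n_buses * (width_bits / 8) * clock_hz


if __name__ == "__main__":
    clock_hz = 10e9  # 10 GHz, as in the text
    print(f"light per cycle: {light_distance_per_cycle_m(clock_hz) * 100:.1f} cm")

    one_way = min_one_way_latency_cycles(100.0, clock_hz)  # 100 m machine
    print(f"one-way latency across 100 m: at least {one_way:.0f} cycles")
    print(f"round trip: at least {2 * one_way:.0f} cycles")

    # One million 128-bit buses at 1 GHz, the lower bound given in the text.
    bandwidth = aggregate_bandwidth_bytes_per_s(1_000_000, 128, 1e9)
    print(f"aggregate bandwidth: {bandwidth:.1e} bytes/s")
    print(f"bytes per operation at 1 Petaflop: {bandwidth / 1e15:.0f}")
```

The one-way crossing alone costs more than 3,000 cycles and a round trip roughly twice that, consistent with "several thousand clock cycles". Signals in real cables and switches travel slower than light in vacuum, so these are lower bounds.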
The proposed seminar will focus on developing solutions for current and future problems in high-end computing. Specifically, these are:
- innovative hardware and software architectures for petascale computing
- scalable parallel simulation algorithms, whose complexity must depend only linearly (or almost linearly) on the problem size
- scalable massively parallel systems and architectures
- simultaneously using multiple granularity levels of parallelism, from instruction or task level to message passing in a networked cluster
- devising algorithms and implementation techniques capable of tolerating the latency and bandwidth restrictions of Petaflop systems
- tools and techniques for improving the usability of such systems
- possible alternatives to silicon-based computing