31.01.16 - 03.02.16, Seminar 16052

Dark Silicon: From Embedded to HPC Systems

This seminar description was published on our web pages before the seminar and was used in the invitation to the seminar.

Motivation

The semiconductor industry is hitting the utilization wall and is shifting its focus to parallel and heterogeneous many-core architectures. While continuous technological improvements in the chip manufacturing process enable the dense integration of more and more processing cores and, thus, processing capabilities, the resulting power consumption per area (the power density) increases enormously. With this density, the problem of dark silicon will become more prevalent in the future: it will be impossible to power up all the components on the chip at the same time due to thermal constraints. This is not only an emerging threat for SoC and MPSoC designers. HPC faces the same problem: the power supplied by the energy companies and the available cooling capacity no longer allow the entire machine to run at its highest performance. The goal of this workshop is to increase the awareness of the research communities of those similarities and to explore solutions based on more flexible resource management schemes, including run-time, design-time, and hybrid solutions.

Recent research work on power management for Dark Silicon aims at efficiently utilizing the TDP (Thermal Design Power) budget to maximize performance, or at allocating the full power budget to boost single-application performance by running, for a very short time period, either a single core at the maximum voltage or multiple cores at the nominal level. Control-based frameworks have been proposed to find the optimal trade-off between power and performance of many-core systems under a given power budget. The work on near-threshold computing (NTC) enables operating multiple cores at a voltage close to the threshold voltage. Though this approach favors applications with thread-level parallelism at low power, it suffers severely from errors or inefficiency due to process variations and voltage fluctuations. On the other hand, the computational sprinting approach leverages Dark Silicon to power on many extra cores for a very short time period (hundreds of milliseconds) to facilitate sub-second bursts of parallel computation through multi-threading, thereby wasting a significant amount of energy due to leakage current.
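The budget trade-off described above can be illustrated with a minimal sketch. It assumes the standard first-order model for dynamic power, P = a * C * V^2 * f, and ignores leakage and thermal effects entirely. The function names and all numeric values (capacitance, voltages, frequencies, budget, core count) are hypothetical and chosen only for illustration. They do not describe any real chip or any method presented at the seminar.

```python
def dynamic_power_watts(c_eff_nf, voltage_v, freq_ghz, activity=1.0):
    """First-order dynamic power: activity * C * V^2 * f.

    With C in nanofarads and f in gigahertz the scale factors cancel,
    so the result is directly in watts. Leakage power is ignored.
    """
    return activity * c_eff_nf * voltage_v ** 2 * freq_ghz


def cores_within_budget(tdp_watts, per_core_watts, total_cores):
    """Number of cores that can be powered at once without exceeding the budget."""
    if per_core_watts <= 0:
        raise ValueError("per_core_watts must be positive")
    return min(total_cores, int(tdp_watts // per_core_watts))


def dark_fraction(tdp_watts, per_core_watts, total_cores):
    """Fraction of cores that must stay powered off ("dark") under the budget."""
    active = cores_within_budget(tdp_watts, per_core_watts, total_cores)
    return 1.0 - active / total_cores


# Hypothetical operating points for one core (illustrative values only).
nominal = dynamic_power_watts(c_eff_nf=1.0, voltage_v=1.0, freq_ghz=2.0)
boost = dynamic_power_watts(c_eff_nf=1.0, voltage_v=1.2, freq_ghz=3.0)

TDP_WATTS = 40.0    # assumed chip-level power budget
TOTAL_CORES = 64    # assumed number of cores on the chip

for name, watts in (("nominal", nominal), ("boost", boost)):
    active = cores_within_budget(TDP_WATTS, watts, TOTAL_CORES)
    dark = dark_fraction(TDP_WATTS, watts, TOTAL_CORES)
    print(f"{name}: {watts:.2f} W/core, {active} active cores, {dark:.1%} dark")
```

Because power grows with the square of the voltage, raising voltage and frequency together shrinks the number of cores that fit into the same budget much faster than it raises per-core speed. This is the tension between boosting a few cores and running many cores at a lower operating point.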

The energy consumption of HPC systems is steadily growing. The energy costs over the five-year lifetime of a large-scale supercomputer already almost equal the cost of the machine itself. It is therefore necessary to carefully tune systems, infrastructure, and applications to reduce the overall energy consumption. In addition, computing centers running very big systems face two further constraints: the power that energy providers can deliver is limited, and the power draw from the grid is required to be almost constant. These challenges also require careful and flexible power and resource management for HPC systems.

Modern applications have to exploit the available parallelism and heterogeneity of (non-darkened) cores to meet their functional and non-functional requirements and to gain performance improvements. A main challenge originates from many-cores promoting (a) highly dynamic usage scenarios, as already observable in today's "smart devices", where multiple and varying numbers of applications are running at different points in time, while (b) the available cores are subject to change due to dark silicon. As a consequence, a mapping or pinning of applications to processor cores that is optimal and predictable with respect to performance, timing, energy consumption, etc. may not be guaranteed by static design-time optimization alone. At the same time, pure run-time resource management may result in unpredictable and non-optimal system states. A research direction that addresses this field of tension is invasive computing, where design-time analysis and optimization of the applications is combined with run-time resource management approaches that try to balance the requirements of the individual applications with the system's requirements, e.g., to respect a maximum power density.

The goal of this Dagstuhl Seminar is to bring together experts from the different domains to discuss the state of the art and to identify future collaboration topics based on common research interests. We will have three main parts on the topics Dark Silicon, Power and Energy Usage in HPC, and Hybrid Approaches to Resource Management, each with longer overview presentations by invited speakers and research presentations by the attendees. Each part will close with a discussion slot. After these three parts, we plan group discussions to identify future collaborative research directions.