11.10.15 - 16.10.15, Seminar 15421

Rack-scale Computing

This seminar description was published on our web pages before the seminar and was used in the invitation to the seminar.

Motivation

Rack-scale computing is the emerging research area of how we design and program the machines used in data centers. “Traditional” data center racks each contain dozens of discrete machines connected by Ethernet or InfiniBand. Over the last few years researchers and industry have been moving away from this model to rack-level integrated design, driven by the need to increase density and connectivity between servers, while lowering cost and power consumption.

In the near future we expect to see rack-scale computers with 10,000 to 100,000 cores, petabytes of solid-state memory, and high-bandwidth / low-latency internal fabrics. This raises interesting research questions. How should the fabric be organized, and how should CPUs, DRAM, and storage be placed in it? Are different rack-scale designs required for different applications? How should we integrate rack-scale computers into data center networks and warehouse-scale computers (WSCs)? Should rack-scale machines be programmed like large shared-memory NUMA servers, like traditional distributed systems, or a combination of the two? What are the best communication primitives to let applications benefit from low-latency interconnects? What are the likely failure modes and how do we achieve fault tolerance? How can researchers effectively prototype and test novel ideas?

We wish to bring together researchers and practitioners working on:

  • Physical design. High resource density under cost, power, and cooling constraints can require physical redesign of the rack. High utilization requires us to balance processing, bandwidth, and storage resources. Physical designs such as the Pelican cold storage rack achieve very high density with commodity components for specialized applications. The physical design space, ranging from general-purpose to specialized, needs further exploration.
  • Systems-on-Chip. SoCs are used in both industrial and research rack-scale systems, motivated by a drive for high density and high performance per watt. For instance, the UC Berkeley FireBox is a 50 kW WSC building block containing a thousand compute sockets, each containing an SoC with around 100 cores connected to high-bandwidth on-package DRAM.
  • Interconnects. Rack-scale computers could support workloads with more fine-grained communication than is supported in traditional racks. Research in rack-scale interconnects ranges from the physical level (for instance, silicon photonics) to new hardware-software interfaces. For instance, Scale-Out NUMA exposes the interconnect via a remote memory controller which is mapped into a node’s local cache-coherent address space.
  • Storage systems. Emerging non-volatile random-access memory (NV-RAM) technologies promise high-capacity non-volatile storage with read performance comparable to DRAM. At the same time, other researchers are exploring 3D-stacking of flash and DRAM. These technologies can have a huge impact on the capabilities of rack-scale computers.
  • Systems software and language runtime systems. How should operating systems manage and schedule applications on rack-scale systems? Should a single OS instance cover the whole rack, or should separate instances run on each core or socket? How do we expose locality, failures, and tail latencies to applications? What is the role of virtualization? What problems are best solved in hardware versus software? Experience with multi-core research operating systems, such as Barrelfish, fos, and Tessellation, is relevant here.

The goal of this Dagstuhl Seminar is to bring together leading international researchers from both academia and industry working on different aspects of rack-scale systems. Effective solutions will require renewed collaboration across architecture, systems, and programming language communities. In addition, we want to involve participants with practical experience of workloads, and of running industrial warehouse-scale systems.