Dagstuhl Seminar 08371: Fault-Tolerant Distributed Algorithms on VLSI Chips

(September 7–10, 2008)

Permalink

Please use the following short URL to link to this page: https://www.dagstuhl.de/08371

Organizers

Bernadette Charron-Bost (Ecole Polytechnique - Palaiseau, FR)
Shlomi Dolev (Ben Gurion University - Beer Sheva, IL)
Jo Ebergen (Sun Microsystems - Menlo Park, US)
Ulrich Schmid (TU Wien, AT)

Contact

Annette Beyer (for administrative matters)

Publications

Fault-Tolerant Distributed Algorithms on VLSI Chips. Bernadette Charron-Bost, Shlomi Dolev, Jo Ebergen, and Ulrich Schmid (Eds.). Dagstuhl Seminar Proceedings, Volume 8371. March 13, 2009

Impacts

On the Threat of Metastability in an Asynchronous Fault-Tolerant Clock Generation Scheme. Gottfried Fuchs, Matthias Függer, and Andreas Steininger. In: 15th IEEE Symposium on Asynchronous Circuits and Systems (ASYNC 2009). Los Alamitos: IEEE, 2009, pp. 127-136.

Summary

Dagstuhl Seminar 08371 on Fault-Tolerant Distributed Algorithms on VLSI Chips explored whether the wealth of existing research on fault-tolerant distributed algorithms can be used to meet the challenges of future-generation VLSI chips. Participants from the distributed fault-tolerant algorithms community, interested in this emerging application domain, and from the VLSI systems-on-chip and digital design community, interested in well-founded system-level approaches to fault tolerance, surveyed the current state of the art and tried to identify opportunities for collaboration. The seminar clearly achieved its purpose: It became apparent that most existing research in distributed algorithms is too heavyweight to be applied directly in the "core" VLSI design context, where power, area, etc. are scarce resources. At the same time, however, it was recognized that emerging trends such as large multicore chips and increasingly critical applications create new and promising application domains for fault-tolerant distributed algorithms. We are convinced that the very fruitful cross-community interactions that took place during the seminar will contribute to new research activities in those areas.

Description

Shrinking feature sizes and increasing clock speeds are the most visible signs of the tremendous advances in VLSI design, which will accommodate billions of transistors on a single chip in the near future. This comes, however, at the price of increased system-level complexity: In today's deep submicron technology with GHz clock speeds, wiring delays dominate transistor switching delays, and signals can no longer traverse the whole die within a single clock cycle. In fact, a modern VLSI chip can no longer be viewed as a monolithic block of synchronous hardware in which all state transitions occur simultaneously. Rather, VLSI chips are nowadays considered systems of interacting subsystems, as reflected in the advent of Systems-on-Chip (SoC) and Networks-on-Chip (NoC).

In addition, ever-increasing manufacturing variabilities increase the defect ratio, and the reduced voltage swing needed for high clock speeds and low power consumption also increases the adverse effects of α-particle and neutron hits during operation, as well as cross-talk and ground-bouncing sensitivity. The resulting increase of the transient failure rate (soft-error rate), which was negligible in most former-generation chips, has hence raised general concerns about the dependability of future generation VLSI chips. Consequently, suitable fault-tolerance mechanisms with respect to timing errors or value errors are vital for such devices: Fine-grained fault-tolerance like radiation-hardening, fault masking at transistor or gate level, error-correcting codes or error detection and recovery are the primary methods of choice here.

Due to the above trends, however, modern VLSI chips have much in common with the loosely-coupled distributed systems that have been studied by the fault-tolerant distributed algorithms community for decades. System-level fault tolerance based on replication and distributed agreement is the dominant approach here, and a wealth of different computing and failure models, algorithms & protocols, and theoretical results regarding solvability of problems and achievable performance have been established in the past.

The purpose of our Dagstuhl seminar was to explore whether fault-tolerant distributed algorithms research can indeed be utilized for meeting the challenges of future-generation VLSI chips: Just as temporal logic, established in distributed computing decades ago, found its way into the VLSI domain, other radically new solutions and methods may follow. And indeed, some recent research suggested a positive answer to this question: For example, it has been demonstrated that distributed fault-tolerant clock generation algorithms can be adapted to the very special requirements of VLSI chips, and that self-stabilization is a very promising approach for designing robust VLSI chips.

Fifteen participants from the distributed fault-tolerant algorithms community (and related fields, like verification), interested in the new application domain of VLSI chips, and twelve participants from the VLSI community, interested in system-level approaches to fault-tolerance, joined at Dagstuhl in order to survey the current state-of-the-art and identify possibilities to work together.

The presentations and the unique setting of Dagstuhl, with its relaxed and stimulating atmosphere, fully achieved their purpose: They stimulated long discussions during the official sessions and many fruitful cross-community interactions during free time, more than the available time could accommodate.

Participants

Janusz Brzozowski (University of Waterloo, CA)
Bernadette Charron-Bost (Ecole Polytechnique - Palaiseau, FR)
Shlomi Dolev (Ben Gurion University - Beer Sheva, IL)
Jo Ebergen (Sun Microsystems - Menlo Park, US)
Sergey Frenkel (Russian Academy of Sciences - Moscow, RU)
Gottfried Fuchs (TU Wien, AT)
Matthias Függer (TU Wien, AT)
Mike Gerdes (Universität Augsburg, DE)
Leslie Lamport (Microsoft Corp. - Mountain View, US)
Rajit Manohar (Cornell University, US)
Alain Martin (CalTech - Pasadena, US)
Philippe Matherat (Télécom ParisTech, FR)
Chris J. Myers (Univ. of Utah, US)
Lirida Naviner (ENST - Paris, FR)
Tim Nieberg (Universität Bonn, DE)
Dhiraj Pradhan (University of Bristol, GB)
Rüdiger Reischuk (Universität Lübeck, DE)
André Schiper (EPFL - Lausanne, CH)
Ulrich Schmid (TU Wien, AT)
Daniel J. Sorin (Duke University - Durham, US)
Andreas Steininger (TU Wien, AT)
Oliver Theel (Universität Oldenburg, DE)
Philippas Tsigas (Chalmers UT - Göteborg, SE)
Helmut Veith (TU Darmstadt, DE)
Jennifer L. Welch (Texas A&M University - College Station, US)
Josef Widder (Ecole Polytechnique - Palaiseau, FR)
Alex Yakovlev (Newcastle University, GB)

Classification

data structures / algorithms / complexity
networks
hardware

Keywords

Fault-tolerant distributed algorithms
system-level fault tolerance
VLSI systems-on-chip
digital logic
formal specification
