CaltechAUTHORS
  A Caltech Library Service

Tolerating Faults in Counting Networks

Riedel, Marc D. and Bruck, Jehoshua (2000) Tolerating Faults in Counting Networks. In: Dependable Network Computing. Springer International Series in Engineering and Computer Science . No.538. Springer , Boston, MA, pp. 267-277. ISBN 978-1-4613-7053-6. https://resolver.caltech.edu/CaltechAUTHORS:20190830-101628653

Full text is not posted in this repository. Consult Related URLs below.

Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:20190830-101628653

Abstract

Counting networks were proposed by Aspnes, Herlihy and Shavit [3] as a low-contention concurrent data structure for multiprocessor coordination. We address the issue of tolerating faults in counting networks. In our fault model, balancer objects experience responsive crash failures: they behave correctly until they fail, and thereafter they are inaccessible. We propose two methods for tolerating such faults. The first is based on a construction of a k-fault-tolerant balancer with 2(K + 1) bits of memory. All balancers in a counting network are replaced by fault-tolerant ones. Thus, a counting network with depth O(log2 n), where n is the width, is transformed into a k-fault-tolerant counting network with depth O(k log^2 n). We also consider the case where inaccessible balancers can be remapped to spare balancers. We present a bound on the error in the output token distribution of counting networks with remapped faulty balancers (a generalization of the error bound for sorting networks with faulty comparators presented by Yao & Yao [10]). Our second method for tolerating faults is based on the construction of a correction network. Given a token distribution with a bounded error, the correction network produces a token distribution that is smooth (i.e., the number of tokens on each output wire differs by at most one — a weaker condition than the step property of counting networks). The correction network is constructed with fault-tolerant balancers. It is appended to a counting network in which faulty balancers are remapped to spare balancers. In order to tolerate k faults, the correction network has depth 2k(k + l)(log n + 1), for a network of width n. Therefore, this method results in a network with a smaller depth provided that O(k) < O(log n). However, it is only applicable if it is possible to remap faulty balancers.


Item Type:Book Section
Related URLs:
URLURL TypeDescription
https://doi.org/10.1007/978-1-4615-4549-1_12DOIArticle
http://resolver.caltech.edu/CaltechPARADISE:1998.ETR022Related ItemTechnical Report
ORCID:
AuthorORCID
Riedel, Marc D.0000-0002-3318-346X
Bruck, Jehoshua0000-0001-8474-0812
Additional Information:© 2000 Springer Science+Business Media New York.
Subject Keywords:counting networks; faults; fault tolerance; concurrent data structures; multiprocessor coordination; load balancing; network routing
Series Name:Springer International Series in Engineering and Computer Science
Issue or Number:538
Record Number:CaltechAUTHORS:20190830-101628653
Persistent URL:https://resolver.caltech.edu/CaltechAUTHORS:20190830-101628653
Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:98368
Collection:CaltechAUTHORS
Deposited By: Tony Diaz
Deposited On:30 Aug 2019 18:13
Last Modified:22 Nov 2019 09:58

Repository Staff Only: item control page