
Fault-Tolerant Switched Local Area Networks

LeMahieu, Paul S. and Bohossian, Vasken and Bruck, Jehoshua (1998) Fault-Tolerant Switched Local Area Networks. California Institute of Technology. (Unpublished) https://resolver.caltech.edu/CaltechPARADISE:1998.ETR021



Abstract

The RAIN (Reliable Array of Independent Nodes) project at Caltech focuses on creating highly reliable distributed systems by leveraging commercially available personal computers, workstations and interconnect technologies. In particular, the issue of reliable communication is addressed by introducing redundancy in the form of multiple network interfaces per compute node. When compute nodes have multiple network connections, the question arises of how best to connect these nodes to a given network of switches. We examine networks of switches (e.g., based on Myrinet technology) and focus on degree-2 compute nodes (two network adaptor cards per node). Our primary goal is to create networks that are as resistant as possible to partitioning. Our main contributions are: (i) a construction for degree-2 compute nodes connected by a ring network of switches of degree 4 that can tolerate any 3 switch failures without partitioning the nodes into disjoint sets, (ii) a proof that this construction is optimal in the sense that no construction can tolerate more switch failures while avoiding partitioning, and (iii) generalizations of this construction to arbitrary switch and node degrees and to other switch networks, in particular, to a fully connected network of switches.
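
The partitioning criterion in the abstract can be explored with a small brute-force checker. The sketch below is illustrative only and is not the construction from the report, which the abstract does not spell out. It assumes n switches in a ring, n compute nodes with two interfaces each, that compute nodes may relay traffic between their two switches, and that a node whose two switches have both failed counts as lost rather than partitioned. The function names and the offset-based attachment rule are hypothetical.

```python
from itertools import combinations


def attach_by_offset(n, offset):
    """Hypothetical rule: node i plugs into switches i and (i + offset) mod n.

    Each switch then hosts two nodes and has two ring links (degree 4).
    """
    return [(i, (i + offset) % n) for i in range(n)]


def is_partitioned(n, attach, failed):
    """True if the surviving nodes fall into more than one connected component."""
    failed = set(failed)
    # Union-find over switches (0..n-1) and nodes (n..2n-1).
    parent = list(range(2 * n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Ring links between adjacent surviving switches.
    for s in range(n):
        t = (s + 1) % n
        if s not in failed and t not in failed:
            union(s, t)

    # Attach each node to its surviving switches (assumption: nodes relay).
    alive = []
    for node, switches in enumerate(attach):
        up = [s for s in switches if s not in failed]
        if up:
            alive.append(n + node)
            for s in up:
                union(n + node, s)

    return len({find(v) for v in alive}) > 1


def max_tolerated_failures(n, attach):
    """Largest k such that no set of at most k failed switches causes a partition."""
    for k in range(1, n + 1):
        if any(is_partitioned(n, attach, f) for f in combinations(range(n), k)):
            return k - 1
    return n


if __name__ == "__main__":
    naive = attach_by_offset(8, 1)   # each node uses two adjacent switches
    across = attach_by_offset(8, 4)  # each node uses two opposite switches
    print(is_partitioned(8, naive, {0, 4}))
    print(is_partitioned(8, across, {0, 4}))
    print(max_tolerated_failures(8, naive))
```

Under these assumptions, attaching each node to two adjacent switches is already partitioned by two well-chosen switch failures, which illustrates why the choice of attachment matters. The exhaustive search grows combinatorially with n, so it is suited only to small rings.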


Item Type: Report or Paper (Technical Report)
Related URLs:
URL | URL Type | Description
http://www.paradise.caltech.edu/papers/etr021.ps | Publisher | UNSPECIFIED
ORCID:
Author | ORCID
Bruck, Jehoshua | 0000-0001-8474-0812
Group: Parallel and Distributed Systems Group
Record Number: CaltechPARADISE:1998.ETR021
Persistent URL: https://resolver.caltech.edu/CaltechPARADISE:1998.ETR021
Usage Policy: You are granted permission for individual, educational, research and non-commercial reproduction, distribution, display and performance of this work in any format.
ID Code: 26053
Collection: CaltechPARADISE
Deposited By: Imported from CaltechPARADISE
Deposited On: 03 Sep 2002
Last Modified: 22 Nov 2019 09:58
