Bruck, Jehoshua and Dolev, Danny and Ho, Ching-Tien and Orni, Rimon and Strong, Ray (1995) PCODE: an efficient and reliable collective communication protocol for unreliable broadcast domain. In: International Parallel Processing Symposium, 9th, (IPPS '95), Santa Barbara, CA, 25-28 April 1995. IEEE , Piscataway, NJ, pp. 130-139. ISBN 0-8186-7074-6. https://resolver.caltech.edu/CaltechAUTHORS:BRUipps95
Use this Persistent URL to link to this item: https://resolver.caltech.edu/CaltechAUTHORS:BRUipps95
Abstract
Existing programming environments for clusters are typically built on top of a point-to-point communication layer (send and receive) over local area networks (LANs) and, as a result, suffer from poor performance in the collective communication part. For example, a broadcast that is implemented using a TCP/IP protocol (which is a point-to-point protocol) over a LAN is obviously inefficient as it is not utilizing the fact that the LAN is a broadcast medium. We have observed that the main difference between a distributed computing paradigm and a message passing parallel computing paradigm is that, in a distributed environment the activity of every processor is independent while in a parallel environment the collection of the user-communication layers in the processors can be modeled as a single global program. We have formalized the requirements by defining the notion of a correct global program. This notion provides a precise specification of the interface between the transport layer and the user-communication layer. We have developed PCODE, a new communication protocol that is driven by a global program and proved its correctness. We have implemented the PCODE protocol on a collection of IBM RS/6000 workstations and on a collection of Silicon Graphics Indigo workstations, both communicating via UDP broadcast. The experimental results we obtained indicate that the performance advantage of PCODE over the current point-to-point approach (TCP) can be as high as an order of magnitude on a cluster of 16 workstations.
Item Type: | Book Section |
---|---|
Additional Information: | © Copyright 1996 IEEE. Reprinted with permission. Meeting Date: 04/25/1995 - 04/28/1995. Supported in part by the NSF Young Investigator Award CCR-9457811, by a grant from the IBM Almaden Research Center, San Jose, California, and by a grant from the AT&T Foundation. We would like to thank especially Dalia Malki for her invaluable advice, coding ideas and troubleshooting. Thanks to Yair Amir for his coding ideas and advice on IPC and to Jim Wiley for useful help and advice on AIX. |
Subject Keywords: | local area networks; message passing; transport protocols; LAN; PCODE; Silicon Graphics Indigo workstations; broadcast; communication protocol; point-to-point protocol; programming environments; unreliable broadcast domain |
DOI: | 10.1109/IPPS.1995.395924 |
Record Number: | CaltechAUTHORS:BRUipps95 |
Persistent URL: | https://resolver.caltech.edu/CaltechAUTHORS:BRUipps95 |
Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
ID Code: | 12367 |
Collection: | CaltechAUTHORS |
Deposited By: | INVALID USER |
Deposited On: | 14 Nov 2008 04:49 |
Last Modified: | 08 Nov 2021 22:27 |