Building a High-Performance Collective Communication Library
Abstract
We report on a project to develop a unified approach for building a library of collective communication operations that performs well on a cross-section of problems encountered in real applications. The target architecture is a two-dimensional mesh with wormhole routing, but the techniques are more general. The approach differs from traditional library implementations in that we address the need for implementations that perform well for vectors of various sizes and for various grid dimensions, including non-power-of-two grids. We show how a general approach to hybrid algorithms yields good performance across the entire range of vector lengths. Moreover, many scalable implementations of application libraries require collective communication within groups of nodes. Our approach yields the same kind of performance for group collective communication. Results from the Intel Paragon system are included.
Additional Information
© 1994 IEEE. This research was performed in part using the Intel Paragon System and the Intel Touchstone Delta System operated by the California Institute of Technology on behalf of the Concurrent Supercomputing Consortium. Access to this facility was provided by Intel Supercomputer Systems Division and the California Institute of Technology. Funding for this project was provided in part by the Intel Research Council, Intel Supercomputer Systems Division, and the University of Texas Center for High Performance Computing.
Attached Files
Published - 00344270.pdf
Files
Name | Size |
---|---|
md5:642f5d201d8ad7d05100c045863c234c | 681.1 kB |
Additional details
- Eprint ID: 66629
- DOI: 10.1109/SUPERC.1994.344270
- Resolver ID: CaltechAUTHORS:20160503-162855047
- Funders: Intel Research Council; Intel Supercomputer Systems Division; University of Texas Center for High Performance Computing
- Created: 2016-05-04 (created from EPrint's datestamp field)
- Updated: 2021-11-11 (created from EPrint's last_modified field)