Johnsson, Lennart (1982) Pipelined linear equation solvers and VLSI. California Institute of Technology , Pasadena, CA. (Unpublished) http://resolver.caltech.edu/CaltechAUTHORS:20120419-115610622
See Usage Policy.
Use this Persistent URL to link to this item: http://resolver.caltech.edu/CaltechAUTHORS:20120419-115610622
Many of the commonly used methods for solution of linear systems of equations on sequential machines can be given a concurrent formulation. The concurrent algorithms take advantage of independence of operations in order to reduce the time complexity of the methods. During the course of computations specified by the algorithm data has to be routed to the various places of computation. Pipelining can be used to avoid broadcasting in VLSI arrays for computation. Pipelining will in general allow for a reduced cycle time but may force data to be spread out in time, as is the case for Gaussian elimination. What the required spacing is depends on the pipelining and the data flow. In the paper concurrent algorithms and their pipelining for Gaussian elimination, Householder transformations and Given's rotations are discussed, Gaussian elimination and Given's rotations can use two dimensional arrays while Householder transformation uses a one dimensional array. If partial pivoting is necessary in Gaussian elimination, then one dimension of the array is essentially lost and s linear array is almost as efficient as a two-dimensional array. Householder transformations that are numerically stable may perform the triangulation in shorter time, if partial pivoting is necessary in Gaussian elimination. The amount of arithmetic that a node in the arrays perform is somewhat different for the different methods. The difference is largest for the boundary cells. However, it should be feasible to design a common node of very low complexity that very efficiently supports a range of methods for the solution of linear systems of equations.
|Item Type:||Report or Paper (Technical Report)|
|Additional Information:||© California Institute of Technology , 1982 Presented at Microelectronics 1982 Adelaide, Australia May 1982. The research described in this paper was sponsored by the Defense Advanced Research Projects Agency, ARPA Order Number 3771, and monitored by the Office of Naval Research under contract number N00014-79-C-0597. The author gratefully acknowledges the support provided by the Defense Advanced Research Project Agency under contract N00014-79-c-0597 with the California Institute of Technology. Views and conclusions contained in this paper are the author's and should not be interpreted as representing the official opinion of DARPA, the U.S. Government, nor any person or agency connected with them.|
|Group:||Computer Science Technical Reports|
|Subject Keywords:||Concurrent algorithms; Numerical analysis; Computational Arrays; VLSI|
|Other Numbering System:|
|Usage Policy:||No commercial reproduction, distribution, display or performance rights in this work are provided.|
|Deposited By:||Kristin Buxton|
|Deposited On:||02 May 2012 18:01|
|Last Modified:||26 Dec 2012 15:05|
Repository Staff Only: item control page