Dally, William J. (1986) A VLSI Architecture for Concurrent Data Structures. California Institute of Technology . (Unpublished) http://resolver.caltech.edu/CaltechCSTR:1986.5209-tr-86
Other (Adobe PDF (10MB))
See Usage Policy.
Use this Persistent URL to link to this item: http://resolver.caltech.edu/CaltechCSTR:1986.5209-tr-86
Concurrent data structures simplify the development of concurrent programs by encapsulating commonly used mechanisms for synchronization and communication into data structures. This thesis develops a notation for describing concurrent data structures, presents examples of concurrent data structures, and describes an architecture to support concurrent data structures. Concurrent Smalltalk (CST), a derivative of Smalltalk-80 with extensions for concurrency, is developed to describe concurrent data structures. CST allows the programmer to specify objects that are distributed over the nodes of a concurrent computer. These distributed objects have many constituent objects and thus can process many messages simultaneously. They are the foundation upon which concurrent data structures are built. The balanced cube is a concurrent data structure for ordered sets. The set is distributed by a balanced recursive partition that maps to the subcubes of a binary n-cube using a Gray code. A search algorithm, VW search, based on the distance properties of the Gray code, searches a balanced cube in O(log N) time. Because it does not have the root bottleneck that limits all tree-based data structures to O(1) concurrency, the balanced cube achieves O(~og ) concurrency. Considering graphs as concurrent data structures, graph algorithms are presented for the shortest path problem, the mix-flow problem, and graph partitioning. These algorithms introduce new synchronization techniques to achieve better performance than existing algorithms. A message-passing, concurrent architecture is developed that exploits the characteristics of VLSI technology to support concurrent data structures. Interconnection topologies are compared on the basis of dimension. It is shown that minimum latency is achieved with a very low dimensional network. A deadlock-free routing strategy is developed for this class of networks, and a prototype VLSI chip implementing this strategy is described. A message-driven processor complements the network by responding to messages with a very low latency. The processor directly executes messages, eliminating a level of interpretation. To take advantage of the performance offered by specialization while at the same time retaining flexibility, processing elements can be specialized to operate on a single class of objects. These object experts accelerate the performance of all applications using this class.
|Item Type:||Report or Paper (Technical Report)|
|Group:||Computer Science Technical Reports|
|Usage Policy:||You are granted permission for individual, educational, research and non-commercial reproduction, distribution, display and performance of this work in any format.|
|Deposited By:||Imported from CaltechCSTR|
|Deposited On:||03 Dec 2001|
|Last Modified:||26 Dec 2012 14:09|
Repository Staff Only: item control page