Brian J. d'Auriol, Ph.D.



Brian J. d'Auriol, A Unified Model for Compiling Systolic Computations for Distributed Memory Multicomputers. PhD thesis, Faculty of Computer Science, University of New Brunswick, Fredericton, New Brunswick, June 1995.
Existing terminology and definitions in systolic computing, formal systolic design concepts and several hardware and software implementations of systolic algorithms are reviewed. Based on this review, it is noted that the following areas have been neglected in current research: (a) generic definitions of systolic processing concepts are lacking, (b) none of the work generates different possible implementations for the given systolic algorithm which can then be compared in order to determine an efficient final implementation, (c) there is a general lack of unification of the overall compilation process from algorithm to implementation, and (d) there is a general lack of performance modeling for systolic implementations for general purpose DMMCs. The main contribution of this thesis is the development of a unified model for compiling systolic algorithms for DMMCs. The unified model addresses the issues noted above. The unified model consists of a set of partitioning strategies, a set of implementational objects and a set of given machine characteristics which are used as input to a set of five processes that transform a given systolic algorithm into an executable program. When all the components of the unified model are implemented, the unified model implementation would represent an automatic process for compiling systolic algorithms for distributed memory multicomputers. A second major contribution of this thesis is the derivation of the details of two of the implementational objects, including high level language code generation and performance models. Also proposed are a new set of systolic computing definitions which are more suitable to the software viewpoint adopted in this thesis. Many experiments are conducted to investigate various key issues and concerns relating to the application of the unified model to compile systolic algorithms. Based on these results, the following is demonstrated: (a) the performance models developed in this thesis correctly predict the execution time behavior of the implementation, (b) different systolic algorithms may be compiled by the use of our model, and (c) the usefulness of the unified model. In the course of this work, a language parser for Occam 2 named COMMAN which predicts the pointtopoint transmission times of communication between processes executing on transputers has been developed. It is noted that some of the methods used in the implementation of the communication analyzing parser are unique. COMMAN may be used as a general tool in the analysis of Occam 2 programs. As such, it may be applied in many environments where an analysis of communication times is necessary. The use of COMMAN in relation to this thesis is twofold. First, the prediction of transmission times is required by the performance model(s) and second, it forms a significant part of the compiling process described above. In conclusion, the first major attempt to define a theoretical framework for compiling systolic algorithms for distributed memory multicomputers has been made in this thesis. 