A Parallel Algorithm with Embedded Load Balancing for Autocorrelation Matrix Computation
Abstract
The autocorrelation matrix is used heavily in several areas, including signal and image processing, where parallel and application-specific architectures are also increasingly used. Therefore, an efficient scheme to compute the autocorrelation matrix on parallel architectures has tremendous benefits. In this paper, a parallel algorithm for computing the autocorrelation matrix on a 2-D mesh is presented. The computation requirements for the elements of the autocorrelation matrix are highly skewed, and the proposed algorithm attempts to balance the computation load without requiring an external load-balancing algorithm or processor. In this sense, the load balancing is embedded within the algorithm. The exact number of computation steps is derived. The time complexity of the proposed algorithm is shown to be within twice the optimal (lower bound). It is also shown to have twice the speedup of a straightforward parallel algorithm.
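To illustrate why the per-element workload is skewed, here is a minimal sequential sketch in Python. It assumes the standard unnormalized autocorrelation of a 1-D signal, r(k) = sum of x[i] * x[i + k], which the abstract does not spell out. All function names are hypothetical. The lag-pairing helper is only one simple way to even out the work and is not claimed to be the paper's 2-D mesh algorithm.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r(0..max_lag) of a 1-D sequence (assumed definition)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def autocorrelation_matrix(x, order):
    """Symmetric Toeplitz matrix whose (i, j) entry is r(|i - j|)."""
    r = autocorrelation(x, order - 1)
    return [[r[abs(i - j)] for j in range(order)] for i in range(order)]


def multiplications_per_lag(n, max_lag):
    """Lag k needs n - k multiplications, so low lags cost far more than high lags."""
    return [n - k for k in range(max_lag + 1)]


def paired_loads(n):
    """Illustrative balancing only: give one worker lag k together with lag n - 1 - k.

    Assumes n is even; every pair then costs exactly n + 1 multiplications.
    """
    costs = multiplications_per_lag(n, n - 1)
    return [costs[k] + costs[n - 1 - k] for k in range(n // 2)]
```

For a signal of length 4, the per-lag costs are 4, 3, 2, and 1 multiplications, while pairing lags (0, 3) and (1, 2) gives two workers 5 multiplications each.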
Similar resources
Parallel computation of autocorrelation matrices on a limited number of processors
Autocorrelation matrices are used heavily in several areas including signal and image processing, where parallel and application-specific architectures are also being increasingly employed. Therefore, an efficient scheme to compute autocorrelation matrices on parallel architectures has considerable benefits. In this paper, a parallel algorithm for the computation of autocorrelation matrices on a li...
Static versus dynamic heterogeneous parallel schemes to solve the symmetric tridiagonal eigenvalue problem
Computation of the eigenvalues of a symmetric tridiagonal matrix is a problem of great relevance. Many linear algebra libraries provide subroutines for solving it, but none of them is designed to run on heterogeneous distributed-memory multicomputers. In this work we focus on this kind of platform. Two different load balancing schemes are presented and implemented. The experimental res...
A load balancing strategy for parallel computation of sparse permanents
Research on parallel machine scheduling in combinatorial optimization suggests that the desired parallel efficiency can be achieved when the jobs are sorted in non-increasing order of processing time. In this paper, we find that the time spent computing the permanent of a sparse matrix with a hybrid algorithm is strongly correlated with its permanent value. A strategy is introduce...
A Parallel Implementation of the Invariant Subspace Decomposition Algorithm for Dense Symmetric Matrices
We give an overview of the Invariant Subspace Decomposition Algorithm for dense symmetric matrices (SYISDA) by first describing the algorithm, followed by a discussion of a parallel implementation of SYISDA on the Intel Delta. Our implementation utilizes an optimized parallel matrix multiplication implementation we have developed. Load balancing in the costly early stages of the algorithm is acco...
Fast Distributed Network Decompositions and Covers
This paper presents deterministic sublinear-time distributed algorithms for network decomposition and for constructing a sparse neighborhood cover of a network. The latter construction leads to improved distributed preprocessing time for a number of distributed algorithms, including all-pairs shortest paths computation, load balancing, broadcast, and bandwidth management.
Publication date: 1997