Scheduling Block-Cyclic Array Redistribution

Authors

  • Frédéric Desprez
  • Jack J. Dongarra
  • Antoine Petitet
  • Cyril Randriamaro
  • Yves Robert
Abstract

This article is devoted to the run-time redistribution of arrays that are distributed in a block-cyclic fashion over a multidimensional processor grid. While previous studies have concentrated on efficiently generating the communication messages to be exchanged by the processors involved in the redistribution, we focus on the scheduling of those messages: how to organize the message exchanges into "structured" communication steps that minimize contention. We build upon results of Walker and Otto, who solved a particular instance of the problem, and we derive an optimal scheduling for the most general case, namely, moving from a CYCLIC(r) distribution on a P-processor grid to a CYCLIC(s) distribution on a Q-processor grid, for arbitrary values of the redistribution parameters P, Q, r, and s.

This work was supported in part by the National Science Foundation Grant No. ASC-9005933; by the Defense Advanced Research Projects Agency under contract DAAH04-95-1-0077, administered by the Army Research Office; by the Department of Energy Office of Computational and Technology Research, Mathematical, Information, and Computational Sciences Division under Contract DE-AC05-84OR21400; by the National Science Foundation Science and Technology Center Cooperative Agreement No. CCR-8809615; by the CNRS-ENS Lyon-INRIA project ReMaP; and by the Eureka Project EuroTOPS. Yves Robert is on leave from Ecole Normale Supérieure de Lyon and is partly supported by DRET/DGA under contract ERE 96-1104/A000/DRET/DS/SR. The authors acknowledge the use of the Intel Paragon XP/S 5 computer, located in the Oak Ridge National Laboratory Center for Computational Sciences, funded by the Department of Energy's Mathematical, Information, and Computational Sciences Division subprogram of the Office of Computational and Technology Research.
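To make the redistribution problem in the abstract concrete, here is a minimal Python sketch for the one-dimensional case. It assumes the usual block-cyclic ownership rule (global index i belongs to processor floor(i / r) mod P under CYCLIC(r) on P processors) and an illustrative contention model in which a processor sends at most one message and receives at most one message per step. The function names `owner`, `comm_matrix`, and `greedy_steps` are hypothetical, and the greedy grouping is a simple heuristic for illustration only; it is not the optimal scheduling derived in the article.

```python
from collections import Counter
from math import gcd


def owner(i, block, nprocs):
    """Processor owning global index i under CYCLIC(block) on nprocs processors."""
    return (i // block) % nprocs


def comm_matrix(P, r, Q, s):
    """Count elements moving from each source p to each target q.

    The ownership pattern repeats every lcm(P*r, Q*s) elements, so one
    period is enough to describe the whole communication pattern.
    """
    a, b = P * r, Q * s
    period = a * b // gcd(a, b)
    counts = Counter()
    for i in range(period):
        counts[(owner(i, r, P), owner(i, s, Q))] += 1
    return period, counts


def greedy_steps(counts):
    """Group messages into steps with at most one send and one receive
    per processor per step (assumed contention model, greedy heuristic)."""
    remaining = sorted(counts, key=lambda pq: -counts[pq])
    steps = []
    while remaining:
        senders, receivers, step, rest = set(), set(), [], []
        for p, q in remaining:
            if p in senders or q in receivers:
                rest.append((p, q))  # conflicts with this step, defer it
            else:
                step.append((p, q))
                senders.add(p)
                receivers.add(q)
        steps.append(step)
        remaining = rest
    return steps


if __name__ == "__main__":
    period, counts = comm_matrix(P=2, r=1, Q=2, s=2)
    for k, step in enumerate(greedy_steps(counts)):
        print(f"step {k}: {step}")
```

A greedy pass like this always produces conflict-free steps, but it gives no guarantee on the number of steps or on how evenly message sizes are balanced within a step, which is the kind of question a dedicated scheduling analysis addresses.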

Similar resources

Multi-phase array redistribution: modeling and evaluation

[Table 1: Execution times (ms) for cyclic(s) to cyclic(t) redistribution on 32 processors; columns: s, t, lcm, lcm*2, lcm*4, gcd, gcd/2, gcd/4.] ... other block sizes t. Fig. 3 shows the total times in milliseconds for a cyclic(192) to cyclic(8) redistribution on 32 processors for increasing data sizes. This redistribution corresponds to the cyclic(Y t) to cyclic(t) case with Y = 2...

More on Scheduling Block-Cyclic Array Redistribution

This article is devoted to the run-time redistribution of one-dimensional arrays that are distributed in a block-cyclic fashion over a processor grid. In a previous paper [2], we have reported how to derive optimal schedules made up of successive communication steps. In this paper we assume that successive steps may overlap. We show how to obtain an optimal scheduling for the most general case, ...

Irregular Redistribution Scheduling by Partitioning Messages

Dynamic data redistribution enhances data locality and improves algorithm performance for numerous scientific problems on distributed-memory multicomputer systems. Regular data distribution typically employs BLOCK, CYCLIC, or BLOCK-CYCLIC(c) to specify array decomposition. Conversely, an irregular distribution specifies an uneven array distribution based on user-defined functions. Performing ...

Efficient Methods for kr → r and r → kr Array

Array redistribution is usually required to enhance algorithm performance in many parallel programs on distributed memory multicomputers. Since it is performed at run-time, there is a performance tradeoff between the efficiency of new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present efficient algorithms for...

Message Scheduling for Irregular Data Redistribution in Parallelizing Compilers

In parallelizing compilers on distributed memory systems, distributions of irregularly sized array blocks are provided for load balancing and irregular problems. Irregular data redistribution differs from regular block-cyclic redistribution. This paper is devoted to scheduling messages for irregular data redistribution, attempting to obtain suboptimal solutions while satisfying the m...

Multi-Phase Redistribution: A Communication-Efficient Approach to Array Redistribution

Distributed-memory implementations of several scientific applications require array redistribution. Array redistribution is used in languages such as High Performance Fortran to dynamically change the distribution of arrays across processors. Performing array redistribution incurs two overheads: an indexing overhead for determining the set of processors to communicate with and the array elements...

Publication date: 1997