Parallel graph reduction for divide-and-conquer applications† Part II - program performance

Authors

  • Pieter H. Hartel
  • Willem G. Vree
Abstract

An extensible machine architecture is devised to efficiently support a parallel reduction model of computation. The organisation of the machine is designed to match the behaviour of the application programs. A pilot implementation of the architecture is used to obtain an execution profile of each application. These profiles are used with a performance model to calculate optimal schedules of the applications. The resulting speedup figures give an upper bound for the performance gain that may be attained on a full implementation of the architecture. The most important result is that each application allows a processor utilisation of over 50% to be attained on our parallel architecture.

Key words: local memory architecture, multiple processor system, optimal scheduling, parallel graph reduction, performance measurement
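The abstract relates speedup figures to processor utilisation. As a rough illustration only, the sketch below uses the textbook definitions (speedup as sequential time over parallel time, utilisation as speedup divided by the number of processors) together with the standard work/critical-path bound on speedup. The paper's own performance model is not reproduced on this page and may differ; all function names and all numbers are hypothetical.

```python
def speedup(sequential_time: float, parallel_time: float) -> float:
    """Ratio of one-processor run time to parallel run time."""
    return sequential_time / parallel_time


def utilisation(sequential_time: float, parallel_time: float, processors: int) -> float:
    """Fraction of the available processor time spent on useful work."""
    return speedup(sequential_time, parallel_time) / processors


def speedup_upper_bound(total_work: float, critical_path: float, processors: int) -> float:
    """Standard bound: no schedule can exceed min(p, work / critical path)."""
    return min(float(processors), total_work / critical_path)


# Hypothetical execution profile, NOT taken from the paper.
work, span, p = 120.0, 12.0, 8
parallel_time = 20.0  # assumed run time of some schedule on p processors

s = speedup(work, parallel_time)
u = utilisation(work, parallel_time, p)
bound = speedup_upper_bound(work, span, p)
print(f"speedup={s:.2f} utilisation={u:.0%} bound={bound:.2f}")
```

With these assumed numbers the schedule reaches a utilisation above 50%, while the bound shows how much further an ideal schedule could go on the same profile.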


Similar resources

Parallel graph reduction for divide-and-conquer applications† Part I - program transformations

A proposal is made to base parallel evaluation of functional programs on graph reduction combined with a form of string reduction that avoids duplication of work. Pure graph reduction poses some rather difficult problems to implement on a parallel reduction machine, but with certain restrictions, parallel evaluation becomes feasible. The restrictions manifest themselves in the class of applicat...


Parallel Combinator Reduction: Some Performance Bounds

A parallel graph reduction machine simulator is described. This performs combinator reduction and can simulate various different parallel reduction strategies. A number of functional programs are examined, and experimental results presented comparing the amount of parallelism obtainable using explicit divide-and-conquer with the maximum amount of parallelism available in the programs. Keywords...


Kinematic Identification of Parallel Mechanisms by a Divide and Conquer Strategy

This paper presents a Divide and Conquer strategy to estimate the kinematic parameters of parallel symmetrical mechanisms. The Divide and Conquer kinematic identification is designed and performed independently for each leg of the mechanism. The estimation of the kinematic parameters is performed using the inverse calibration method. The identification poses are selected optimizing the observab...


N-Graphs: A Topology for Parallel Divide-and-Conquer on Transputer Networks

A parallel implementation of a divide-and-conquer template (skeleton) is derived systematically from its functional specification. The implementation makes use of a new processor topology for divide-and-conquer, called N-graph, which suits transputer networks well: there are not more than 4 links per processor, overlapping of computations and communication within a processor is exploited, the pr...


Programming, Tuning and Automatic Parallelization of Irregular Divide-and-Conquer Applications in DAMPVM/DAC

This paper presents a new object-oriented framework DAMPVM/DAC which is implemented on top of DAMPVM and provides automatic partitioning of irregular divide-and-conquer (DAC) applications at runtime. The processes are then mapped dynamically to processors taking into account their speeds and even loads by other user processes. The paper presents the programming interface (API) of the framework, ...


Publication date: 2009