A simple and efficient parallel FFT algorithm using the BSP model

Authors

  • Márcia A. Inda
  • Rob H. Bisseling
Abstract

We present a new parallel radix-4 FFT algorithm based on the BSP model. Our parallel algorithm uses the group-cyclic distribution family, which makes it simple to understand and easy to implement. We show how to reduce the communication cost of the algorithm by a factor of three, in the case that the input/output vector is in the cyclic distribution. We also show how to reduce computation time on computers with a cache-based architecture. We present performance results on a Cray T3E with up to 64 processors, obtaining reasonable efficiency levels for local problem sizes as small as 256 and very good efficiency levels for local sizes larger than 2048.
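
The abstract names the group-cyclic distribution family without defining it. The following is a minimal sketch, not the authors' code, assuming the commonly cited definition: with cycle `c`, the vector is cut into `p/c` contiguous blocks, each block is assigned to a group of `c` processors, and elements are dealt out cyclically within that group. Under this assumption `c = 1` gives the block distribution and `c = p` gives the cyclic distribution. The function names and the divisibility restrictions are illustrative choices.

```python
def group_cyclic_owner(j, n, p, c):
    """Return the processor owning element j of a length-n vector.

    Assumed definition (not taken from the abstract): group-cyclic
    distribution over p processors with cycle c.
    """
    if p % c != 0 or n % p != 0:
        raise ValueError("this sketch assumes c divides p and p divides n")
    block = c * n // p      # number of elements held by one group of c processors
    group = j // block      # index of the processor group that owns element j
    return group * c + (j % block) % c  # cyclic assignment inside the group


def distribution(n, p, c):
    """List the owner of every element, for inspection."""
    return [group_cyclic_owner(j, n, p, c) for j in range(n)]


if __name__ == "__main__":
    for c in (1, 2, 4):
        print(f"c={c}:", distribution(8, 4, c))
```

For `n = 8` and `p = 4`, the three cycles give the block layout, an intermediate layout with two groups of two processors, and the cyclic layout that the abstract refers to for the input/output vector.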

Similar resources

Efficient Parallel FFTs for Different Computational Models

We select the Fast Fourier Transform (FFT) to demonstrate a methodology for deriving the optimal parallel algorithm according to predetermined performance metrics, within a computational model. Following the vector space framework for parallel permutations, we provide a specification language to capture the algorithm, derive the optimal parallel FFT specification, compute the arithmetic, memory, ...

Design of a Hybrid Genetic Algorithm for Parallel Machines Scheduling to Minimize Job Tardiness and Machine Deteriorating Costs with Deteriorating Jobs in a Batched Delivery System

This paper studies the parallel machine scheduling problem subject to machine and job deterioration in a batched delivery system. By the machine deterioration effect, we mean that each machine deteriorates over time, at a different rate. Moreover, job processing times are increasing functions of their starting times and follow a simple linear deterioration. The objective functions are minimizin...

Computational Models for Parallel Computing and BSP Lab

A major challenge for parallel computing is the development of a standardized combination of portable and efficient parallel programming. An interesting approach towards this major goal is the research with offspring in Leslie Valiant’s Bulk Synchronous Parallel Model (BSP). The BSP model is a theoretical framework outlining how parallel computations can be organized in a way that bridges the g...

Using WPT as a New Method Instead of FFT for Improving the Performance of OFDM Modulation

Orthogonal frequency division multiplexing (OFDM) is used in order to provide immunity against very hostile multipath channels in many modern communication systems. The OFDM technique divides the total available frequency bandwidth into several narrow bands. In conventional OFDM, the FFT algorithm is used to provide orthogonal subcarriers. Intersymbol interference (ISI) and intercarrier interferen...

Journal:
  • Parallel Computing

Volume 27  Issue 

Pages  -

Publication date: 2001