Comparison of Scalable Parallel Matrix Multiplication Libraries

نویسندگان

Steven Huss-Lederman

Elaine M. Jacobson

Anna Tsao

چکیده

This paper compares two general library routines for performing parallel distributed matrix multiplication. The PUMMA algorithm utilizes block scattered data layout, whereas BiMMeR utilizes virtual 2-D torus wrap. The algorithmic diierences resulting from these diierent layouts are discussed as well as the general issues associated with diierent data layouts for library routines. Results on the Intel Delta for the two matrix multiplication algorithms are presented.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure

The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication which could run on a Fibonacci Hypercube structure. Most of the popular algorithms for parallel matrix multiplication can not run on Fibonacci Hypercube structure, therefore giving a method that can be run on all structures especially Fibonacci Hypercube structure is necessary for parallel matr...

متن کامل

Parallel Matrix Multiplication: A Systematic Journey

We expose a systematic approach for developing distributed memory parallel matrix matrix multiplication algorithms. The journey starts with a description of how matrices are distributed to meshes of nodes (e.g., MPI processes), relates these distributions to scalable parallel implementation of matrix-vector multiplication and rank-1 update, continues on to reveal a family of matrix-matrix multi...

متن کامل

Comparative Study of Parallel Programming Models to Compute Complex Algorithm

The main goal of this research is to use OpenMP, Posix Threads and Microsoft Parallel Patterns libraries to design an algorithm to compute Matrix Multiplication effectively. By using the libraries of OpenMP, Posix Threads and Microsoft Parallel Patterns Libraries, one can optimize the speedup of the algorithm. First step is to write simple program which calculates a predetermined Matrix and giv...

متن کامل

Matrix Multiplication Specialization in STAPL

The Standard Template Adaptive Parallel Library (STAPL) is a superset of C++’s Standard Template Library (STL) which allows highproductivity parallel programming in both distributed and shared memory environments. This framework provides parallel equivalents of STL containers and algorithms enabling ease of development for parallel systems. In this paper, we will discuss our methodology for imp...

متن کامل

Toward Scalable Matrix Multiply on Multithreaded Architectures

We show empirically that some of the issues that affected the design of linear algebra libraries for distributed memory architectures will also likely affect such libraries for shared memory architectures with many simultaneous threads of execution, including SMP architectures and future multicore processors. The always-important matrix-matrix multiplication is used to demonstrate that a simple...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 1993

Comparison of Scalable Parallel Matrix Multiplication Libraries

نویسندگان

چکیده

منابع مشابه

A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure

Parallel Matrix Multiplication: A Systematic Journey

Comparative Study of Parallel Programming Models to Compute Complex Algorithm

Matrix Multiplication Specialization in STAPL

Toward Scalable Matrix Multiply on Multithreaded Architectures

عنوان ژورنال:

اشتراک گذاری