Implementing Communication-optimal Parallel and Sequential Qr Factorizations

نویسندگان

  • JAMES DEMMEL
  • LAURA GRIGORI
  • MARK HOEMMEN
  • JULIEN LANGOU
چکیده

We present parallel and sequential dense QR factorization algorithms for tall and skinny matrices and general rectangular matrices that both minimize communication, and are as stable as Householder QR. The sequential and parallel algorithms for tall and skinny matrices lead to significant speedups in practice over some of the existing algorithms, including LAPACK and ScaLAPACK, for example up to 6.7x over ScaLAPACK. The parallel algorithm for general rectangular matrices is estimated to show significant speedups over ScaLAPACK, up to 22x over ScaLAPACK.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Communication-optimal Parallel and Sequential QR and LU Factorizations

We present parallel and sequential dense QR factorization algorithms that are both optimal (up to polylogarithmic factors) in the amount of communication they perform and just as stable as Householder QR. We prove optimality by deriving new lower bounds for the number of multiplications done by “non-Strassen-like” QR, and using these in known communication lower bounds that are proportional to ...

متن کامل

Communication-optimal parallel and sequential QR and LU factorizations: theory and practice

We present parallel and sequential dense QR factorization algorithms that are both optimal (up to polylogarithmic factors) in the amount of communication they perform, and just as stable as Householder QR. Our first algorithm, Tall Skinny QR (TSQR), factors m × n matrices in a one-dimensional (1-D) block cyclic row layout, and is optimized for m n. Our second algorithm, CAQR (Communication-Avoi...

متن کامل

Communication-optimal Parallel and Sequential Cholesky Decomposition

Numerical algorithms have two kinds of costs: arithmetic and communication, by which we mean either moving data between levels of a memory hierarchy (in the sequential case) or over a network connecting processors (in the parallel case). Communication costs often dominate arithmetic costs, so it is of interest to design algorithms minimizing communication. In this paper we first extend known lo...

متن کامل

Communication-avoiding parallel and sequential QR factorizations

We present parallel and sequential dense QR factorization algorithms that are optimized to avoid communication. Some of these are novel, and some extend earlier work. Communication includes both messages between processors (in the parallel case), and data movement between slow and fast memory (in either the sequential or parallel cases). Our first algorithm, Tall Skinny QR (TSQR), factors m× n ...

متن کامل

Communication Avoiding Rank Revealing QR Factorization with Column Pivoting

In this paper we introduce CARRQR, a communication avoiding rank revealing QRfactorization with tournament pivoting. We show that CARRQR reveals the numerical rank of amatrix in an analogous way to QR factorization with column pivoting (QRCP). Although the upperbound of a quantity involved in the characterization of a rank revealing factorization is worse forCARRQR than for QRCP...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008