semi inherited lu factorization

Search results for: semi inherited lu factorization

Number of results: 204029

Managing the Complexity of Lookahead for LU Factorization with Pivoting (FLAME Working Note #40)

2009

Ernie Chan Andrew Chapman Robert van de Geijn

We describe parallel implementations of LU factorization with pivoting for multicore architectures. Implementations that differ along two dimensions are discussed: (1) using classical partial pivoting versus recently proposed incremental pivoting and (2) extracting parallelism only within the Basic Linear Algebra Subprograms versus building and scheduling a directed acyclic graph of task...

Permuting Sparse Rectangular Matrices into Block-Diagonal Form

Journal: SIAM J. Scientific Computing 2004

Cevdet Aykanat Ali Pinar Ümit V. Çatalyürek

We investigate the problem of permuting a sparse rectangular matrix into block-diagonal form. Block-diagonal form of a matrix grants an inherent parallelism for solving the deriving problem, as recently investigated in the context of mathematical programming, LU factorization, and QR factorization. To represent the nonzero structure of a matrix, we propose bipartite graph and hypergraph models t...

Parallel Multilevel Block ILU Preconditioning Techniques for Large Sparse Linear Systems

2003

Chi Shen Jun Zhang Kai Wang

We present a class of parallel preconditioning strategies built on a multilevel block incomplete LU (ILU) factorization technique to solve large sparse linear systems on distributed memory parallel computers. The preconditioners are constructed by using the concept of block independent sets. Two algorithms for constructing block independent sets of a distributed sparse matrix are proposed. We c...

LU Factorization with Partial Pivoting for a Multi-CPU, Multi-GPU Shared Memory System

2012

Jakub Kurzak Piotr Luszczek Mathieu Faverge Jack Dongarra

LU factorization with partial pivoting is a canonical numerical procedure and the main component of the High Performance LINPACK benchmark. This article presents an implementation of the algorithm for a hybrid shared-memory system with standard CPU cores and GPU accelerators. Performance in excess of one TeraFLOPS is achieved using four AMD Magny Cours CPUs and four NVIDIA Fermi GPUs.

A Distributed CPU-GPU Sparse Direct Solver

2014

Piyush Sao Richard W. Vuduc Xiaoye S. Li

This paper presents the first hybrid MPI+OpenMP+CUDA implementation of a distributed memory right-looking unsymmetric sparse direct solver (i.e., sparse LU factorization) that uses static pivoting. While BLAS calls can account for more than 40% of the overall factorization time, the difficulty is that small problem sizes dominate the workload, making efficient GPU utilization challenging. This ...

Fault tolerant variants of the fine-grained parallel incomplete LU factorization

2017

Evan Coleman Masha Sosonkina Edmond Chow

This paper presents an investigation into fault tolerance for the fine-grained parallel algorithm for computing an incomplete LU factorization. Results concerning the convergence of the algorithm with respect to the occurrence of faults, and the impact of any sub-optimality in the produced incomplete factors in Krylov subspace solvers are given. Numerical tests show that the simple algorithmic ...

Spectral factorization of bi-infinite multi-index block Toeplitz matrices

2000

Cornelis V.M. van der Mee Sebastiano Seatzu Giuseppe Rodriguez

In this paper we formulate a theory of LU and Cholesky factorization of bi-infinite block Toeplitz matrices $A = (A_{i-j})_{i,j \in \mathbb{Z}^d}$ indexed by $i, j \in \mathbb{Z}^d$ and develop two numerical methods to compute such factorizations. © 2002 Elsevier Science Inc. All rights reserved.

Solution of Dense Systems of Linear Equations Arising from Integral Equation Formulations

2007

Kimmo Forsman William Gropp Lauri Kettunen David Levine Jukka Salonen

This paper discusses efficient solution of dense systems of linear equations arising from integral equation formulations. Several preconditioners in connection with Krylov iterative solvers are examined and compared with LU factorization. Results are shown demonstrating practical aspects and issues we have encountered in implementing iterative solvers on both parallel and sequential computers.

New Evaluation Index of Orderings in Incomplete Factorization Preconditioning

2005

Takeshi Iwashita Masaaki Shimasaki

It is well known that ordering of unknowns greatly affects convergence in Incomplete LU (ILU) factorization preconditioned iterative methods. The authors recently proposed a simple evaluation way for orderings in ILU preconditioning. The evaluation index, which has a simple relationship with a norm of a remainder matrix, is easily computed without additional memory requirement. The computation...

A Class of Communication-avoiding Algorithms for Solving General Dense Linear Systems on CPU/GPU Parallel Machines

2012

Marc Baboulin Simplice Donfack Jack J. Dongarra Laura Grigori Adrien Rémy Stanimire Tomov

We study several solvers for the solution of general linear systems where the main objective is to reduce the communication overhead due to pivoting. We first describe two existing algorithms for the LU factorization on hybrid CPU/GPU architectures. The first one is based on partial pivoting and the second uses a random preconditioning of the original matrix to avoid pivoting. Then we introduce...
