Search results for: semi inherited lu factorization

Number of results: 204029

2023

Abstract: This chapter considers the LU factorization of a general nonsymmetric nonsingular sparse matrix A. In practice, numerical pivoting for stability and/or ordering to limit fill-in in the factors is often needed, and the factorization is then computed for a permuted matrix PAQ. Pivoting is discussed in Chapter 7 and ordering algorithms in Chapter 8.

2017
Erin Carson Nicholas J. Higham

We propose a general algorithm for solving an n×n nonsingular linear system Ax = b based on iterative refinement with three precisions. The working precision is combined with possibly different precisions for solving for the correction term and for computing the residuals. Via rounding error analysis of the algorithm we derive sufficient conditions for convergence and bounds for the attainable n...
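As a rough illustration of the idea in this abstract, the sketch below runs iterative refinement with three different precisions in plain Python. It is not the authors' algorithm or code: the choice of precisions (emulated IEEE single for the factorization, double as the working precision, exact rational arithmetic standing in for a higher residual precision) and the helper names `to_single`, `lu_single`, `lu_solve`, and `refine` are all assumptions made for the example. The matrix is assumed to be nonsingular and reasonably well conditioned.

```python
import struct
from fractions import Fraction


def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_single(A):
    """LU with partial pivoting; each stored entry is rounded to single.

    Returns the packed factors (unit-lower L below the diagonal, U on and
    above it) and the row permutation. Hypothetical helper for this sketch.
    """
    n = len(A)
    LU = [[to_single(v) for v in row] for row in A]
    perm = list(range(n))
    for k in range(n):
        # Pick the largest pivot candidate in column k.
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = to_single(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = to_single(LU[i][j] - LU[i][k] * LU[k][j])
    return LU, perm


def lu_solve(LU, perm, b):
    """Solve using the packed factors; the substitutions run in double."""
    n = len(b)
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):                      # forward substitution (L y = P b)
        for j in range(i):
            y[i] -= LU[i][j] * y[j]
    for i in reversed(range(n)):            # back substitution (U x = y)
        for j in range(i + 1, n):
            y[i] -= LU[i][j] * y[j]
        y[i] /= LU[i][i]
    return y


def refine(A, b, steps=5):
    """Iterative refinement: low-precision factors, accurate residuals."""
    n = len(b)
    LU, perm = lu_single(A)
    x = lu_solve(LU, perm, b)
    for _ in range(steps):
        # Residual r = b - A x computed exactly, then rounded to double.
        r = [float(Fraction(b[i])
                   - sum(Fraction(A[i][j]) * Fraction(x[j]) for j in range(n)))
             for i in range(n)]
        d = lu_solve(LU, perm, r)           # correction from the cheap factors
        x = [xi + di for xi, di in zip(x, d)]
    return x


A = [[4.0, 1.0, 2.0], [1.0, 3.0, 0.1], [2.0, 0.1, 5.0]]
b = [1.0, 2.0, 3.0]
x = refine(A, b)
```

The factorization is done once in the cheapest precision and reused for every correction solve; only the residual is computed more accurately. A fixed number of steps is used here for simplicity, whereas a practical code would stop on a convergence test.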

2000
Igor Brainman Sivan Toledo

We describe the implementation and performance of a novel fill-minimization ordering technique for sparse LU factorization with partial pivoting. The technique was proposed by Gilbert and Schreiber in 1980 but never implemented and tested. Like other techniques for ordering sparse matrices for LU with partial pivoting, our new method preorders the columns of the matrix (the row permutation is c...

2012
Jakub Kurzak Piotr Luszczek Mathieu Faverge Jack J. Dongarra

LU factorization with partial pivoting is a canonical numerical procedure and the main component of the High Performance Linpack benchmark. This article presents an implementation of the algorithm for a hybrid, shared memory, system with standard CPU cores and GPU accelerators. Performance in excess of one TeraFLOPS is achieved using four AMD Magny Cours CPUs and four NVIDIA Fermi GPUs.

2017

Performance of FPGA-based token dataflow architectures is often limited by the long tail distribution of parallelism in the compute paths of dataflow graphs. This is known to limit speedup of dataflow processing of Sparse LU factorization to only 3–10× over CPUs. In this paper, we show how to overcome these limitations by exploiting criticality information along compute paths; both statically ...

Journal: SIAM J. Matrix Analysis Applications 2006
Bor Plestenjak

We consider numerical methods for the computation of the eigenvalues of the tridiagonal hyperbolic quadratic eigenvalue problem. The eigenvalues are computed as zeros of the characteristic polynomial using the bisection, Laguerre’s method, the Ehrlich–Aberth method, and the Durand–Kerner method. Initial approximations are provided by a divide-and-conquer approach using rank two modifications. T...

1990
Jack Dongarra Susan Ostrouchov

The aim of this project is to implement the basic factorization routines for solving linear systems of equations and least squares problems from LAPACK, namely the blocked versions of LU with partial pivoting, QR, and Cholesky, on a distributed-memory machine. We discuss our implementation of each of the algorithms and the results we obtained using varying orders of matrices and block sizes.

2006
M. I. Bueno

In this paper, we consider shifted tridiagonal matrices. We prove that the standard algorithm to compute the LU factorization in this situation is mixed forward-backward stable and, therefore, componentwise forward stable. Moreover, we give a formula to compute the corresponding condition number in O(n) flops. Mathematics Subject Classification (2000). 65F35, 65F50, 15A12, 15A23, 65G50.
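For orientation, the following sketch shows one common O(n) recurrence for the LU factorization of a shifted tridiagonal matrix T − σI without pivoting. It assumes that this no-pivoting recurrence is what the abstract calls the "standard algorithm"; the function name `tridiagonal_lu` and its argument layout are hypothetical, and no zero-pivot handling is included.

```python
def tridiagonal_lu(lower, diag, upper, shift=0.0):
    """Factor T - shift*I = L U for a tridiagonal T, without pivoting.

    lower, upper: sub- and superdiagonal of T (length n - 1)
    diag:         main diagonal of T (length n)
    Returns (l, u): l holds the subdiagonal multipliers of the unit lower
    bidiagonal L, u holds the diagonal of the upper bidiagonal U. The
    superdiagonal of U equals `upper` unchanged.
    Raises ZeroDivisionError if a pivot u[i] is exactly zero.
    """
    n = len(diag)
    u = [diag[0] - shift]
    l = []
    for i in range(1, n):
        l.append(lower[i - 1] / u[i - 1])                  # multiplier
        u.append(diag[i] - shift - l[i - 1] * upper[i - 1])  # next pivot
    return l, u


# Example: the 3x3 second-difference matrix with no shift.
l, u = tridiagonal_lu([-1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0])
```

Each step costs a constant number of operations, so the whole factorization is O(n). The recurrence breaks down when a pivot vanishes, which is one reason the stability of the shifted case is worth analyzing.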

2014
Raimondas CIEGIS Juozas SIMKEVICIUS

This paper deals with load balancing of parallel algorithms for distributed-memory computers. The parallel versions of BLAS subroutines for matrix-vector product and LU factorization are considered. Two task partitioning algorithms are investigated and speed-ups are calculated. The cases of homogeneous and heterogeneous collections of computers/processors are studied, and special partitioning al...
