نتایج جستجو برای: matrix multiplication
تعداد نتایج: 385488 فیلتر نتایج به سال:
Matrix multiplication is a key primitive in block matrix algorithms such as those found in LAPACK. We present results from our study of matrix multiplication algorithms on the Intel Touchstone Delta, a distributed memory message-passing architecture with a two-dimensional mesh topology. We obtain an implementation that uses communications primitives highly suited to the Delta and exploits the s...
Parallelization of generalized matrix-matrix multiplication is crucial for achieving high performance required in many situations. Parallelization performed using contemporary compilers is not sufficient enough to replace expert-tuned multi-threaded implementations or to get close to their performance. All competitive solutions require previously optimized external implementations that cannot b...
In this article, we continue exploring the topic of so-called induced methods for implementing complex matrix multiplication. Previous work investigated two approaches and demonstrated various algorithms for each method that compute matrix products in the complex domain using only a real matrix multiplication kernel. However, algorithms based on the more generally applicable of the two methods—...
We analyze single-node performance of sparse matrix-vector multiplication by investigating issues of data locality and ne-grained parallelism. We examine the data-locality characteristics of the compressed-sparse-row representation and consider improvements in locality through matrix permutation. Motivated by potential improvements in ne-grained parallelism, we evaluate modiied sparse-matrix re...
In this work the relation between Boolean Matrix Multiplication (BMM) and Context Free Grammar (CFG) parsing is shown. The first described approach, which is due to Valiant (1975), shows how CFG parsing can be reduced to Boolean Matrix Multiplication. Afterwards the reverse direction, i.e. how a CFG parser can be used to multiply two Boolean matrices, is presented, which is due to Lee (2002). T...
A sufficiently optimized matrix multiplication on embedded systems can facilitate data processing in high performance mobile measuring equipment since plenty of the kernel mathematical algorithms are based on matrix multiplication. In this paper, we propose a matrix multiplication specially optimized for ARMv7 architecture. The performance-critical differences between ARMv7 and conventional des...
نمودار تعداد نتایج جستجو در هر سال
با کلیک روی نمودار نتایج را به سال انتشار فیلتر کنید