Numerical reproducibility for the parallel reduction on multi- and many-core architectures
نویسندگان
چکیده
منابع مشابه
Numerical reproducibility for the parallel reduction on multi- and many-core architectures
Onmodern multi-core, many-core, and heterogeneous architectures, floating-point computations, especially reductions, may become non-deterministic and, therefore, non-reproducible mainly due to the non-associativity of floating-point operations. We introduce an approach to compute the correctly rounded sums of large floating-point vectors accurately and efficiently, achieving deterministic resul...
متن کاملParallel Dual Tree Traversal on Multi-core and Many-core Architectures for Astrophysical N-body Simulations
In astrophysical N -body simulations, Dehnen’s algorithm, implemented in the serial falcON code and based on a dual tree traversal, is faster than serial Barnes-Hut tree-codes, but outperformed by parallel CPU and GPU tree-codes. In this paper, we present a parallel dual tree traversal, implemented in the pfalcON code, targeting multi-core CPUs and manycore architectures (Xeon Phi). We focus he...
متن کاملSolving Matrix Equations on Multi-Core and Many-Core Architectures
We address the numerical solution of Lyapunov, algebraic and differential Riccati equations, via the matrix sign function, on platforms equipped with general-purpose multicore processors and, optionally, one or more graphics processing units (GPUs). In particular, we review the solvers for these equations, as well as the underlying methods, analyze their concurrency and scalability and provide ...
متن کاملParallel Packet Processing on Multi-core and Many- core Processors
The Service-oriented Router (SoR), a highly functional router based on a novel router architecture, enables unprecedented web services traditional routers were unable to provide. The SoR performs Deep Packet Inspection (DPI) to analyze Layer 7 information, which is becoming increasingly difficult due to the substantial increase in Internet traffic. Meanwhile, multi-core processors and general-p...
متن کاملParallel HEVC Decoding on Multi- and Many-core Architectures - A Power and Performance Analysis
The Joint Collaborative Team on Video Decoding is developing a new standard named High Efficiency Video Coding (HEVC) that aims at reducing the bitrate of H.264/AVC by another 50%. In order to fulfill the computational demands of the new standard, in particular for high resolutions and at low power budgets, exploiting parallelism is no longer an option but a requirement. Therefore, HEVC include...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Parallel Computing
سال: 2015
ISSN: 0167-8191
DOI: 10.1016/j.parco.2015.09.001