Numerical reproducibility for the parallel reduction on multi- and many-core architectures

Similar resources

Numerical reproducibility for the parallel reduction on multi- and many-core architectures

On modern multi-core, many-core, and heterogeneous architectures, floating-point computations, especially reductions, may become non-deterministic and, therefore, non-reproducible mainly due to the non-associativity of floating-point operations. We introduce an approach to compute the correctly rounded sums of large floating-point vectors accurately and efficiently, achieving deterministic resul...
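The non-associativity mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm, which the truncated abstract does not describe; it uses the standard library's `math.fsum` as a stand-in for a summation whose result does not depend on the order of the operands. The helper `left_to_right_sum` and the sample values are assumptions chosen for illustration.

```python
import itertools
import math


def left_to_right_sum(values):
    """Plain floating-point accumulation in the given order (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v
    return total


# Illustrative values: the exact mathematical sum is 2.0, but 1.0 is lost
# whenever it is added to a running total of magnitude 1e16.
values = [1e16, 1.0, -1e16, 1.0]

# A parallel reduction may combine operands in any order; emulate that by
# trying every permutation of the input.
naive_results = {left_to_right_sum(p) for p in itertools.permutations(values)}

# math.fsum tracks exact partial sums, so its result is the same for every
# ordering of this input.
exact_results = {math.fsum(p) for p in itertools.permutations(values)}

print("order-dependent results:", sorted(naive_results))
print("order-independent results:", sorted(exact_results))
```

The plain accumulation yields more than one distinct result across orderings, while `math.fsum` yields a single one. On typical IEEE 754 builds `math.fsum` returns the correctly rounded sum, which is the property that makes a reduction reproducible regardless of how the work is split.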

Parallel Dual Tree Traversal on Multi-core and Many-core Architectures for Astrophysical N-body Simulations

In astrophysical N-body simulations, Dehnen’s algorithm, implemented in the serial falcON code and based on a dual tree traversal, is faster than serial Barnes-Hut tree-codes, but outperformed by parallel CPU and GPU tree-codes. In this paper, we present a parallel dual tree traversal, implemented in the pfalcON code, targeting multi-core CPUs and manycore architectures (Xeon Phi). We focus he...

Solving Matrix Equations on Multi-Core and Many-Core Architectures

We address the numerical solution of Lyapunov, algebraic and differential Riccati equations, via the matrix sign function, on platforms equipped with general-purpose multicore processors and, optionally, one or more graphics processing units (GPUs). In particular, we review the solvers for these equations, as well as the underlying methods, analyze their concurrency and scalability and provide ...

Parallel Packet Processing on Multi-core and Many-core Processors

The Service-oriented Router (SoR), a highly functional router based on a novel router architecture, enables unprecedented web services traditional routers were unable to provide. The SoR performs Deep Packet Inspection (DPI) to analyze Layer 7 information, which is becoming increasingly difficult due to the substantial increase in Internet traffic. Meanwhile, multi-core processors and general-p...

Parallel HEVC Decoding on Multi- and Many-core Architectures - A Power and Performance Analysis

The Joint Collaborative Team on Video Decoding is developing a new standard named High Efficiency Video Coding (HEVC) that aims at reducing the bitrate of H.264/AVC by another 50%. In order to fulfill the computational demands of the new standard, in particular for high resolutions and at low power budgets, exploiting parallelism is no longer an option but a requirement. Therefore, HEVC include...

Journal

Journal title: Parallel Computing

Year: 2015

ISSN: 0167-8191

DOI: 10.1016/j.parco.2015.09.001