Search results for: many core architectures
Number of results: 1178744
We propose parallel algorithms for operations on univariate polynomials (multi-point evaluation, interpolation) based on subproduct tree techniques and targeting many-core GPUs. On those architectures, we demonstrate the importance of adaptive algorithms, in particular the combination of parallel plain arithmetic and parallel FFT-based arithmetic. Experimental results illustrate the benefits of...
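The subproduct tree idea named in the abstract above can be sketched as follows. This is a minimal sequential illustration, not the authors' GPU implementation: it uses schoolbook polynomial multiplication where the paper combines plain and FFT-based arithmetic, and all function names (`poly_mul`, `poly_rem`, `subproduct_tree`, `multipoint_eval`) are hypothetical. Polynomials are lists of integer coefficients, lowest degree first.

```python
def poly_mul(a, b):
    """Schoolbook product of two coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_rem(a, m):
    """Remainder of a divided by a monic polynomial m of degree >= 1."""
    r = list(a)
    dm = len(m) - 1
    while len(r) - 1 >= dm:
        c = r[-1]                      # leading coefficient to cancel
        shift = len(r) - 1 - dm
        for i, mi in enumerate(m):
            r[shift + i] -= c * mi
        r.pop()                        # leading term is now zero
    return r


def subproduct_tree(points):
    """Binary tree whose leaves are (x - u_i) and whose nodes are products."""
    if len(points) == 1:
        return {"poly": [-points[0], 1], "left": None, "right": None}
    mid = len(points) // 2
    left = subproduct_tree(points[:mid])
    right = subproduct_tree(points[mid:])
    return {"poly": poly_mul(left["poly"], right["poly"]),
            "left": left, "right": right}


def evaluate_down(f, node):
    """Reduce f modulo each node on the way down; leaves yield f(u_i)."""
    r = poly_rem(f, node["poly"])
    if node["left"] is None:
        return [r[0]]
    return evaluate_down(r, node["left"]) + evaluate_down(r, node["right"])


def multipoint_eval(f, points):
    """Evaluate f at every point (points must be a non-empty list)."""
    return evaluate_down(f, subproduct_tree(points))


# f(x) = 1 + 2x + 3x^2 evaluated at 0, 1, 2, 3
print(multipoint_eval([1, 2, 3], [0, 1, 2, 3]))
```

The two recursive calls in `evaluate_down` are independent of each other, which is where a parallel version would find its concurrency.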
In this paper we address the problem of identifying and exploiting techniques to increase the performance of large-scale scientific codes on recent many-core processors. As a test-bed we consider a state-of-the-art Lattice Boltzmann (LB) model that accurately reproduces the thermo-hydrodynamics of a 2D fluid obeying the equation of state of a perfect gas. The regular structure of Lattice B...
The roadmaps of the leading supercomputer manufacturers are based on hybrid systems, which consist of a mix of conventional processors and accelerators. This trend is mainly due to the fact that the power consumption cost of future CPU-only Exascale systems will be unsustainable; thus accelerators such as graphics processing units (GPUs) and many-integrated-core (MIC) processors will likely be the inte...
On modern multi-core, many-core, and heterogeneous architectures, floating-point computations, especially reductions, may become non-deterministic and, therefore, non-reproducible, mainly due to the non-associativity of floating-point operations. We introduce an approach to compute the correctly rounded sums of large floating-point vectors accurately and efficiently, achieving deterministic resul...
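The non-associativity described in the abstract above is easy to reproduce. The sketch below sums the same four values in two different orders, as two differently scheduled parallel reductions might, and gets two different answers. Python's standard `math.fsum` is used here only as a stand-in for an order-independent, accurately rounded summation; it is not the method proposed in the paper, and the helper name `sum_in_order` is hypothetical.

```python
import math


def sum_in_order(values):
    """Naive left-to-right accumulation in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [1e16, 1.0, -1e16, 1.0]

# Order A: the first 1.0 is absorbed by 1e16 (spacing between doubles there is 2).
order_a = sum_in_order(values)

# Order B: the large terms cancel first, so both 1.0 terms survive.
order_b = sum_in_order([1e16, -1e16, 1.0, 1.0])

# Accurately rounded sum: independent of the order of the inputs.
exact_a = math.fsum(values)
exact_b = math.fsum(reversed(values))

print(order_a, order_b, exact_a, exact_b)
```

Because the naive result depends on evaluation order, any reduction whose order varies with thread count or scheduling can change its output from run to run.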
The European FP7 project PEPPHER is addressing programmability and performance portability for current and emerging heterogeneous many-core architectures. As its main idea, the project proposes a multi-level parallel execution model comprised of potentially parallelized components existing in variants suitable for different types of cores, memory configurations, input characteristics, optimizat...
This paper presents a benchmarking, performance analysis and optimisation study of the OP2 “active” library, which provides an abstraction framework for the parallel execution of unstructured mesh applications. OP2 aims to decouple the scientific specification of the application from its parallel implementation, and thereby achieve code longevity and near-optimal performance through re-targetin...
Sparse matrix-matrix multiplication is a key kernel that has applications in several domains such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this paper, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high performance computing archite...
Xiang Zhang and Ahmed Louri, Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721. E-mail: {zxkidd, louri}@ece.arizona.edu. Abstract: We explore silicon photonics and 3D stacked technology to implement a photonic network-on-chip. The proposed scheme provides 2.56 Tb/sec bandwidth with much lower power consumption and latency compared to any leading on-chip photonic net...
A new synchronization mechanism, called I-Structure, was created under the dataflow model of computation and introduced during the late 1970s. I-Structure exhibited the following important features: (1) it is a dataflow-style synchronization, i.e., synchronization only occurs between I-Structure producer and consumer operations that access the same memory location; (2) it is fine-grain ...
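Feature (1) above can be illustrated with a small software sketch. It assumes the usual I-Structure semantics (each element is written at most once, and a read of an empty element waits until the producer writes it) and emulates them with Python threads. The class name `IStructure` and its methods are hypothetical; this is not the mechanism studied in the paper.

```python
import threading


class IStructure:
    """Array of write-once cells; readers block per element until it is full."""

    def __init__(self, size):
        self._values = [None] * size
        # One "full" flag per element: synchronization is per memory location.
        self._full = [threading.Event() for _ in range(size)]
        self._lock = threading.Lock()

    def write(self, index, value):
        """Producer side: fill an empty element exactly once."""
        with self._lock:
            if self._full[index].is_set():
                raise RuntimeError("I-Structure element written twice")
            self._values[index] = value
            self._full[index].set()    # wakes any deferred readers

    def read(self, index, timeout=None):
        """Consumer side: wait until the element has been written."""
        if not self._full[index].wait(timeout):
            raise TimeoutError("element was never written")
        return self._values[index]


cells = IStructure(2)
results = []

# The consumer may start before the producer; it waits only on element 0.
consumer = threading.Thread(
    target=lambda: results.append(cells.read(0, timeout=5)))
consumer.start()
cells.write(0, 42)
consumer.join()
print(results)
```

A reader of element 0 is never delayed by activity on element 1, which is the point of synchronizing only between operations on the same location.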
Maximizing data throughput is a very common implementation objective for streaming applications. Such a task is particularly challenging for implementations based on many-core and multi-core target platforms because, in general, it implies tackling several NP-complete combinatorial problems. Moreover, an efficient design space exploration requires an accurate evaluation on the basis of...