Programming CUDA and OpenCL: A Case Study Using Modern C++ Libraries
Similar resources
Programming CUDA and OpenCL: A Case Study Using Modern C++ Libraries
We present a comparison of several modern C++ libraries providing high-level interfaces for programming multi- and many-core architectures on top of CUDA or OpenCL. The comparison focuses on the solution of ordinary differential equations and is based on odeint, a framework for the solution of systems of ordinary differential equations. Odeint is designed in a very flexible way and may be easily ...
Teaching Parallel Programming Using CUDA: A Case Study
A recent prevailing trend in microprocessor architecture is the constant increase in chip-level parallelism. However, practical parallel processing instruction is made difficult by shortcomings in existing platforms. The programming of graphics processing units (GPUs) is emerging as an effective alternative to the traditional paradigms, permitting students the chance to construct and assess pa...
A Performance Comparison of CUDA and OpenCL
CUDA and OpenCL offer two different interfaces for programming GPUs. OpenCL is an open standard that can be used to program CPUs, GPUs, and other devices from different vendors, while CUDA is specific to NVIDIA GPUs. Although OpenCL promises a portable language for GPU programming, its generality may entail a performance penalty. In this paper, we compare the performance of CUDA and OpenCL usin...
Leveraging Parallelism with CUDA and OpenCL
Graphics processing units (GPUs), originally designed for computing and manipulating pixels, have become general-purpose processors capable of executing in excess of a trillion calculations per second. Taking advantage of the GPU’s compute power and commodity popularity, the field of computing systems is exhibiting a trend toward heterogeneous platforms consisting of a central processor integrated wi...
CUDA and OpenCL-based asynchronous PSO
1. GPU-BASED PSO PARALLELIZATION In ‘synchronous’ PSO, positions and velocities of all particles are updated in turn in each ‘generation’, after which each particle’s new fitness is evaluated. The value of the social attractor is only updated at the end of each generation, when the fitness values of all particles are known. The ‘asynchronous’ version of PSO, instead, allows the social attractor...
Journal
Journal title: SIAM Journal on Scientific Computing
Year: 2013
ISSN: 1064-8275,1095-7197
DOI: 10.1137/120903683