Search results for: CUDA platform

Number of results: 19735

Journal: ACM Transactions on Embedded Computing Systems, 2016

Journal: International Journal of Parallel Programming, 2016

A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code written in CUDA programming language is developed by modifying a single-GPU code. The OpenMP library is used for parall...

Journal: The Journal of The Institute of Image Information and Television Engineers, 2012

2008
S. M. Lechner D. Butnaru H-J. Bungartz D. Chen M. W. Vogel

Matrix size   LS [s]   VS [s]   CUDA Solver [s]   LS vs. CUDA   VS vs. CUDA
16x3408       85.68    1.20     0.10              870.70        12.20
32x3472       348.92   2.36     0.41              851.02        5.76
64x360

Journal: CoRR, 2014
Ahmad Lashgar Alireza Majidi Amirali Baniasadi

In this paper we introduce IPMACC, a framework for translating OpenACC applications to CUDA or OpenCL. IPMACC is composed of a set of translators that convert OpenACC for C applications to CUDA or OpenCL. The framework uses the system compiler (e.g. nvcc) to generate the final accelerator binary. The framework can be used for extending the OpenACC API, executing OpenACC applications, or obtaining...

2010
Muthu Manikandan Baskaran J. Ramanujam P. Sadayappan

Graphics Processing Units (GPUs) offer tremendous computational power. CUDA (Compute Unified Device Architecture) provides a multi-threaded parallel programming model, facilitating high performance implementations of general-purpose computations. However, the explicitly managed memory hierarchy and multi-level parallel view make manual development of high-performance CUDA code rather complicate...

Journal: RITA, 2015
Esteban Walter Gonzalez Clua Marcelo Panaro de Moraes Zamith

Since the first version of CUDA was launched, many improvements have been made in GPU computing. Every new CUDA version has included important novel features, bringing this architecture closer and closer to a typical parallel High Performance Language. This tutorial will present the GPU architecture and CUDA principles, trying to conceptualize novel features included by NVIDIA, such as dynamics...

Journal: JCP, 2012
Zuo Chen Jialiang Ji Renfa Li

For video coding, weighing the balance between coding rate and image quality, we apply a global motion search algorithm to avoid loss of image quality and use the parallel computing capacity of graphics processors to accelerate the encoding process. Based on the heterogeneous CPU+GPU system and on the multi-threaded parallel structure and thread synchronization features of the CUDA platform, we build a pro...

Journal: SoftwareX, 2022

Geometric Semantic Genetic Programming (GSGP) is a state-of-the-art machine learning method based on evolutionary computation. GSGP performs search operations directly at the level of program semantics, which can be done more efficiently than operating on syntax, as most GP systems do. Efficient implementations in C++ exploit this fact, but not to its full potential. This paper presents GSGP-CUDA, f...
