Search results for: gpu parallel computation

Number of results: 358612  

2009
S-Y. Ju Y-W. Tang T-Y. Huang

Introduction: Image registration [1] has been an important topic in MRI applications, such as longitudinal follow-up studies, brain normalization for group statistics, and motion correction for fMRI studies. There are many different image registration algorithms. In general, the calculations require many iterations of coordinate transformations to find the displacements and rotations ...

2011
Ekaterina Gonina Kurt Keutzer Armando Fox

Most current speaker diarization systems use agglomerative clustering of Gaussian Mixture Models (GMMs) to determine “who spoke when” in an audio recording. While state-of-the-art in accuracy, this method is computationally costly, mostly due to the GMM training, and thus limits the performance of current approaches to be roughly real-time. Increased sizes of current datasets require processing...

2013
Wei-Jen Wang I-Fan Hsieh Chun-Chuan Chen

This study aims to improve the performance of Dynamic Causal Modelling for Event Related Potentials (DCM for ERP) in MATLAB by using external function calls to a graphics processing unit (GPU). DCM for ERP is an advanced method for studying neuronal effective connectivity. DCM utilizes an iterative procedure, the expectation maximization (EM) algorithm, to find the optimal parameters given a se...

2013
Giuliano Laccetti Marco Lapegna Valeria Mele Diego Romano

In this work, a parallel adaptive algorithm for the computation of a multidimensional integral on heterogeneous GPU- and multicore-based systems is described. Two different strategies have been combined in the algorithm: a first procedure is responsible for the load balancing among the threads on the multicore CPU and a second one is responsible for an efficient execution on the GPU of ...

2013
Dongyou Seo Hyeonsang Eom Heon Y. Yeom

Today, there are many studies on complicated computation and big data processing that exploit the high computing performance of GPUs. The Tesla K20X, recently announced by NVIDIA, provides 3.95 TFLOPS of single-precision floating-point performance [1]. The performance of the K20X is 10 times higher than that of Intel’s high-end CPUs. Owing to this high computing performance, the K20X was adopted in Titan, the first...

2014
Huayou Su Mei Wen Nan Wu Ju Ren Chunyuan Zhang

By reorganizing the execution order and optimizing the data structure, we proposed an efficient parallel framework for an H.264/AVC encoder based on a massively parallel architecture. We implemented the proposed framework in CUDA on NVIDIA GPUs. Not only are the compute-intensive components of the H.264 encoder parallelized, but the control-intensive components are also realized effectively, su...

2015
Christian Zentner Yan Liu

This paper elaborates on the possibility of leveraging the highly parallel nature of GPUs to implement more efficient stereo matching algorithms. Different algorithms have been implemented and compared on the CPU and the GPU in order to show the speedup gained by moving the computation to the graphics card. The results were evaluated for accuracy using the test available on the Middlebury website...

2010
Andreas Klöckner Jan Sickmann

Abstract of “High-Performance High-Order Simulation of Wave and Plasma Phenomena” by Andreas Klöckner, Ph.D., Brown University, May 2010. This thesis presents results aiming to enhance and broaden the applicability of the discontinuous Galerkin (“DG”) method in a variety of ways. DG was chosen as a foundation for this work because it yields high-order finite element discretizations with very favorable nu...

2003
Zhaowei Fan Huagen Wan Shuming Gao

Real-time collision detection is required by almost all computer graphics applications. However, current collision detection methods still have difficulty achieving real-time performance. Recent advances in programmable graphics hardware (GPUs) make it possible to use them for general-purpose computation. In this paper, we explore solving the collision detection problem with programmable GPUs. An a...

2008
Bryan Catanzaro Narayanan Sundaram Kurt Keutzer

Recent developments in programmable, highly parallel Graphics Processing Units (GPUs) have enabled high performance general purpose computation. We describe a framework designed for high performance GPU programming, built on Nvidia’s Compute Unified Device Architecture (CUDA) platform. The framework is built around the Map Reduce abstraction, which allows application developers to focus on thei...
