GPU parallel computation

Measuring the Performance of Realtime DSP using Pure Data and GPU

2012

André J. Bianchi Marcelo Queiroz

In order to achieve greater amounts of computation while lowering the cost of artistic and scientific projects that rely on realtime digital signal processing techniques, it is interesting to study the performance of commodity parallel processing GPU cards coupled with commonly used software for realtime DSP. In this article, we describe the measurement of data roundtrip time using the Pure Dat...

3D Recursive Gaussian IIR on GPUs and FPGAs A Case Study for Accelerating Bandwidth-Bounded Applications

2011

Jason Cong Muhuan Huang Yi Zou

GPU devices typically have a higher off-chip bandwidth than FPGA-based systems. Thus, GPUs should typically perform better for bandwidth-bounded, massively parallel applications. In this paper we present our implementations of a 3D recursive Gaussian IIR on multicore CPU, many-core GPU and multi-FPGA platforms. Our baseline implementation on the CPU features the smallest arithmetic computation (2 MA...

New Sparse Matrix Storage Format to Improve The Performance of Total SPMV Time

Journal: Scalable Computing: Practice and Experience, 2012

Neelima Reddy Raghavendra Prakash Ram Mohana Reddy

Graphics Processing Units (GPUs) are massive data parallel processors. High performance comes only at the cost of identifying data parallelism in the applications while using data parallel processors like GPU. This is an easy effort for applications that have regular memory access and high computation intensity. GPUs are equally attractive for sparse matrix vector multiplications (SPMV for shor...

A Bi-objective Optimization Framework for Heterogeneous CPU/GPU Query Plans

2013

Piotr Przymus Krzysztof Kaczmarski Krzysztof Stencel

Graphics Processing Units (GPU) have significantly more applications than just rendering images. They are also used in general-purpose computing to solve problems that can benefit from massive parallel processing. However, there are tasks that either hardly suit GPU or fit GPU only partially. The latter class is the focus of this paper. We elaborate on hybrid CPU/GPU computation and build optim...

Hybrid Parallel-in-Time-and-Space Transient Stability Simulation of Large-Scale AC/DC Grids

Journal: IEEE Transactions on Power Systems, 2022

The increasing complexity of modern AC/DC power systems poses a significant challenge to the fast solution of large-scale transient stability simulation problems. This paper proposes a hybrid parallel-in-time-and-space (PiT+PiS) approach on a CPU-GPU platform to thoroughly exploit parallelism from time and spatial perspectives, thereby fully utilizing parallel processing hardware. The respective electromechanical ele...

Fastplay-A Parallelization Model and Implementation of SMC on CUDA based GPU Cluster Architecture

Journal: IACR Cryptology ePrint Archive, 2011

Shi Pu Pu Duan Jyh-Charn Liu

We propose a four-tiered parallelization model for accelerating secure multiparty computation (SMC) on the CUDA-based Graphics Processing Unit (GPU) cluster architecture. The specification layer is the top layer, which adopts the SFDL of Fairplay for specification of secure computations. The SHDL file generated by the SFDL compiler of Fairplay is used as input to the function layer, for whic...

Accelerating Image Retrieval Using Factorial Correspondence Analysis on GPU

2009

Nguyen-Khang Pham Annie Morin Patrick Gros

We are interested in the intensive use of Factorial Correspondence Analysis (FCA) for large-scale content-based image retrieval. FCA is a useful method for analyzing textual data, and we adapt it to images using the SIFT local descriptors. FCA is used to reduce dimensions and to limit the number of images to be considered during the search. Graphics Processing Uni...

A Parallel Access Method for Spatial Data Using GPU

2012

Byoung-Woo Oh

Spatial access methods (SAMs) are used for information retrieval in large spatial databases. Many of the SAMs use sequential tree structures to search the result set of the spatial data which are contained in the given query region. In order to improve performance for the SAM, this paper proposes a parallel method using GPU. Since the searching process needs intensive computation but is indepen...

High-Speed GPU-Based Fully Three-Dimensional Diffuse Optical Tomographic System

2014

Manob Jyoti Saikia Rajan Kanhirodan Ram Mohan Vasu

We have developed a graphics processing unit (GPU)-based high-speed fully 3D system for diffuse optical tomography (DOT). The reduction in execution time of the 3D DOT algorithm, a severely ill-posed problem, is made possible through the use of (1) an algorithmic improvement that uses the Broyden approach for updating the Jacobian matrix and thereby updating the parameter matrix and (2) the multinode m...

Optimizing data intensive GPGPU computations for DNA sequence alignment

Journal: Parallel Computing, 2009

Cole Trapnell Michael C. Schatz

MUMmerGPU uses highly-parallel commodity graphics processing units (GPU) to accelerate the data-intensive computation of aligning next generation DNA sequence data to a reference sequence for use in diverse applications such as disease genotyping and personal genomics. MUMmerGPU 2.0 features a new stackless depth-first-search print kernel and is 13× faster than the serial CPU version of the ali...
