The implementation and optimization of Bitonic sort algorithm based on CUDA
Authors
Abstract
This paper describes the bitonic sort algorithm in detail and implements it on the CUDA architecture. We also apply two optimizations to the implementation details, based on the characteristics of the GPU, which greatly improve efficiency. Finally, we measure the speedup of the optimized bitonic sort on the GPU relative to quicksort on the CPU. Quicksort is not well suited to parallel implementation, but it is, to some extent, more efficient than other sorting algorithms on the CPU, so we use it as the CPU baseline for speedup and performance. For a series of 32-bit random integers, the experimental results show that our implementation achieves a speedup of nearly 20 times. When array size is about 2, the speedup ratio is even up to 30.
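As an illustration of the compare-exchange network that bitonic sort is built on, here is a minimal sequential sketch in Python. It is not the paper's CUDA implementation, and it does not include the two optimizations mentioned in the abstract, which are not described here. The function name `bitonic_sort` and the power-of-two length requirement are assumptions made for this sketch. The inner `for i` loop marks the work that a GPU version would typically assign to one thread per index.

```python
def bitonic_sort(values):
    """Return a sorted copy of `values` using the bitonic sorting network.

    Assumption for this sketch: len(values) is a nonzero power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")

    a = list(values)
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # compare-exchange distance in this step
            # Every iteration of this loop is independent of the others,
            # which is what makes each (k, j) step parallelizable.
            for i in range(n):
                partner = i ^ j
                if partner > i:                  # handle each pair once
                    ascending = (i & k) == 0     # direction of this block
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a
```

For an input of length n, the network runs log2(n) * (log2(n) + 1) / 2 compare-exchange steps. Each step touches every element independently, which is why the algorithm maps well onto many parallel threads even though its total work exceeds that of quicksort.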
Related resources
Fast In-Place Sorting with CUDA Based on Bitonic Sort
State of the art graphics processors provide high processing power and furthermore, the high programmability of GPUs offered by frameworks like CUDA increases their usability as high-performance coprocessors for general-purpose computing. Sorting is well-investigated in Computer Science in general, but (because of this new field of application for GPUs) there is a demand for high-performance pa...
An approach to Improve Particle Swarm Optimization Algorithm Using CUDA
The time consumption in solving computationally heavy problems has always been a concern for computer programmers. Due to simplicity of its implementation, the PSO (Particle Swarm Optimization) is a suitable meta-heuristic algorithm for solving computationally heavy problems. However, despite the simplicity, the algorithm is inefficient for solving real computationally heavy problems but the pr...
Comparison of parallel sorting algorithms
In our study we implemented and compared seven sequential and parallel sorting algorithms: bitonic sort, multistep bitonic sort, adaptive bitonic sort, merge sort, quicksort, radix sort and sample sort. Sequential algorithms were implemented on a central processing unit using C++, whereas parallel algorithms were implemented on a graphics processing unit using CUDA platform. We chose these algo...
The Bitonic Sort on Transputer Architectures
The bitonic sort algorithm is a parallel sorting algorithm that has been implemented in sorting networks and is readily adaptable to Transputer arrays. This paper looks at the implementation and time cost of the algorithm for a machine and compares this with an implementation on T8 networks to see the benefits that the new architecture provides over the previous generation of Transputers.
Parallel Implementation of Particle Swarm Optimization Variants Using Graphics Processing Unit Platform
There are different variants of Particle Swarm Optimization (PSO) algorithm such as Adaptive Particle Swarm Optimization (APSO) and Particle Swarm Optimization with an Aging Leader and Challengers (ALC-PSO). These algorithms improve the performance of PSO in terms of finding the best solution and accelerating the convergence speed. However, these algorithms are computationally intensive. The go...
Journal title:
- CoRR
Volume: abs/1506.01446
Issue:
Pages: -
Publication date: 2015