Functional programming for nested data parallelism on GPUs
Abstract
Recent advances in general-purpose GPU computing technology allow new data-parallel kernel jobs to be dispatched dynamically during kernel execution. This enables significantly more expressive programming using nested data parallelism (NDP), because it lifts the restrictive need for flat data structures and computation. Functional programming is fundamentally well suited for expressing data-parallel computation. Until now, vectorizing irregular data structures and computation has required expensive flattening and fusion transformations, which made functional NDP inefficient. With hardware-supported nesting on massively parallel GPUs, high-abstraction functional nested data parallelism can now be explored further. This paper introduces nested data parallelism and briefly makes a case for using functional languages to program GPUs for general-purpose data-parallel computing.
Similar resources
Nessie: A NESL to CUDA Compiler
Modern GPUs provide supercomputer-level performance at commodity prices, but they are notoriously hard to program. To address this problem, we have been exploring the use of Nested Data Parallelism (NDP), and specifically the first-order functional language NESL, as a way to raise the level of abstraction for programming GPUs. This paper describes a new compiler for the NESL language that generated...
Harnessing the Multicores: Nested Data Parallelism in Haskell
If you want to program a parallel computer, a purely functional language like Haskell is a promising starting point. Since the language is pure, it is by-default safe for parallel evaluation, whereas imperative languages are by-default unsafe. But that doesn’t make it easy! Indeed it has proved quite difficult to get robust, scalable performance increases through parallel functional programming...
Enlarging the Scope of Vector-Based Computations: Extending Fortran 90 by Nested Data Parallelism
This paper describes the integration of nested data parallelism into Fortran 90. Unlike flat data parallelism, nested data parallelism directly provides means for handling irregular data structures and certain forms of control parallelism, such as divide-and-conquer algorithms, thus enabling the programmer to express such algorithms far more naturally. Existing work deals with nested data parall...
Efficient Primitives and Algorithms for Many-core architectures
Graphics Processing Units (GPUs) are a fast-evolving architecture. Over the last decade their programmability has been harnessed to solve non-graphics tasks, in many cases at a huge performance advantage to CPUs. Unlike CPUs, GPUs have always been a highly parallel architecture: thousands of lightweight execution ...
A Language for Nested Data Parallel Design-space Exploration on GPUs
Graphics Processing Units (GPUs) offer potential for very high performance; they are also rapidly evolving. Obsidian is an embedded language (in Haskell) for implementing high performance kernels to be run on GPUs. We would like to have our cake and eat it too; we want to raise the level of abstraction beyond CUDA code and still give the programmer control over the details relevant kernel perfor...
Publication date: 2012