Barra, a Modular Functional GPU Simulator for GPGPU

Authors

  • Sylvain Collange
  • David Defour
  • David Parello
Abstract

The use of GPUs for general-purpose applications promises huge performance returns for a small investment. However, the internal design of such processors is undocumented and many details are unknown, preventing developers from optimizing their code for these architectures. One solution is to use functional simulation to determine program behavior and gather statistics when hardware counters are missing or unavailable. In this article we present a GPU functional simulator targeting GPGPU, based on the UNISIM framework, which takes an NVIDIA cubin file as input.
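The idea of gathering statistics through functional simulation can be illustrated with a minimal sketch. It is not Barra's implementation and does not use the UNISIM API or the Tesla instruction set: the three-instruction toy ISA, the `simulate` function, and the `PROGRAM` listing are all hypothetical names introduced here only for illustration. The sketch interprets each instruction for its architectural effect on the registers and counts executed opcodes, which is the kind of information a missing hardware counter would otherwise provide.

```python
from collections import Counter


def simulate(program, registers=None, max_steps=10_000):
    """Functionally execute a toy program and count executed opcodes.

    The ISA here is hypothetical (mov/add/bnz/halt), not a real GPU ISA.
    """
    regs = dict(registers or {})
    stats = Counter()          # per-opcode dynamic instruction counts
    pc = 0                     # program counter (index into `program`)
    steps = 0
    while pc < len(program):
        if steps >= max_steps:
            raise RuntimeError("step limit exceeded")
        op, *args = program[pc]
        stats[op] += 1
        steps += 1
        if op == "mov":        # mov dst, immediate
            regs[args[0]] = args[1]
        elif op == "add":      # add dst, src_a, src_b
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "bnz":      # branch to target index if register is non-zero
            if regs[args[0]] != 0:
                pc = args[1]
                continue
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return regs, stats


# Countdown loop that accumulates 3 + 2 + 1 into `acc`.
PROGRAM = [
    ("mov", "acc", 0),
    ("mov", "i", 3),
    ("mov", "minus1", -1),
    ("add", "acc", "acc", "i"),      # loop body starts at index 3
    ("add", "i", "i", "minus1"),
    ("bnz", "i", 3),
    ("halt",),
]

if __name__ == "__main__":
    final_regs, opcode_counts = simulate(PROGRAM)
    print(final_regs["acc"], dict(opcode_counts))
```

Because the simulator only models what each instruction does, not how long it takes, the counts describe program behavior (instruction mix, branches executed) rather than timing.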


Related resources

Barra, a Parallel Functional GPGPU Simulator

We present a GPU functional simulator targeting GPGPU based on the UNISIM framework which takes unaltered NVIDIA CUDA executables as input. It simulates the native instruction set of the Tesla architecture at the functional level and generates detailed execution statistics. Simulation speed is competitive with the less-accurate CUDA emulation mode thanks to optimizations which exploit the inher...


CrystalGPU: Transparent and Efficient Utilization of GPU Power

General-purpose computing on graphics processing units (GPGPU) has recently gained considerable attention in various domains such as bioinformatics, databases and distributed computing. GPGPU is based on using the GPU as a co-processor accelerator to offload computationally-intensive tasks from the CPU. This study starts from the observation that a number of GPU features (such as overlapping co...


Fault injection on GPGPU application

Today, with the development of GPU computing techniques in terms of architectures and hardware and software support, people have realized that compute-intensive workloads can be ported to GPU devices. Applications can exploit the characteristics of GPUs for parallel computing and gain a significant speedup compared to CPU architectures. However, failures are still unavoidable. People have alrea...


A complete and efficient CUDA-sharing solution for HPC clusters

In this paper we detail the key features, architectural design, and implementation of rCUDA, an advanced framework to enable remote and transparent GPGPU acceleration in HPC clusters. rCUDA allows decoupling GPUs from nodes, forming pools of shared accelerators, which brings enhanced flexibility to cluster configurations. This opens the door to configurations with fewer accelerators than nodes,...


Towards Multi-tenant GPGPU: Event-driven Programming Model for System-wide Scheduling on Shared GPUs

Graphics processing units (GPUs) are attractive for general-purpose computing (GPGPU) beyond graphics. Sharing GPUs among such GPGPU applications is a key requirement, especially for cloud platforms whose resources are utilized by various cloud users. However, consolidating recent GPU applications, referred to as GPU eaters, on a GPU poses a new challenge. Such advanced application...



Publication date: 2009