Search results for: opencl
Number of results: 807
Convolutional neural networks (CNNs) have been widely employed in many applications such as image classification, video analysis and speech recognition. Being compute-intensive, CNN computations are mainly accelerated by GPUs with high power dissipation. Recently, studies have been carried out exploiting FPGAs as CNN accelerators because of their reconfigurability and energy efficiency advantage over GP...
We present an automatic analysis technique for checking data races on OpenCL kernels. Our method defines symbolic execution techniques based on separation logic with suitable abstractions to automatically detect non-benign racy behaviours on kernels.
We study how the C11 memory model can be simplified and how it can be extended. Our first contribution is to propose a mild strengthening of the model that enables the rules pertaining to sequentially-consistent (SC) operations to be significantly simplified. We eliminate one of the total orders that candidate executions must range over, leading to a model that is significantly faster to simula...
While many-core processors offer multiple layers of hardware parallelism to boost performance, applications are lagging behind in exploiting them effectively. A typical example is vector parallelism (SIMD), offered by many processors, but used by too few applications. In this paper we discuss two different strategies to enable the vectorization of naive OpenCL kernels. Further, we show how these...
The Weather Research and Forecasting (WRF) model is a simulation system developed for atmospheric weather prediction. The WRF model is used for both operational and research purposes. The need for accurate weather and climate simulations to be carried out in less time is increasing day by day, which motivates the acceleration of existing Numerical Weather Prediction (NWP) systems. This paper...
The end of Moore’s law creates a significant turning point for computer architecture. Today, performance is largely limited by energy, power, and cooling. Heterogeneity and radical new architecture designs are keys to achieving higher energy proportionality. In mobile computing, heterogeneity is well adopted in system-on-chip designs (e.g., to improve battery life). In high-performance computin...
This paper presents the single kernel multiple devices (SKMD) system, a framework that transparently orchestrates collaborative execution of a single data-parallel kernel across multiple asymmetric CPUs and GPUs. SKMD is an abstraction layer located between applications and the OpenCL library. It uses OpenCL as the intermediate language. SKMD transparently partitions an OpenCL kernel across mu...
Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity from both industry and academia. Special interest is around Convolutional Neural Networks (CNN), which take inspiration from the hierarchical structure of the visual cortex, to form deep layers of convolutional operations, along with fully connected classifiers. Hardware implementat...