Search results for: opencl
Number of results: 807
This work demonstrates how to accelerate dense linear algebra computations using CLBlast, an open-source OpenCL BLAS library providing optimized routines for a wide variety of devices. It is targeted at machine learning and HPC applications and thus provides a fast matrix-multiplication routine (GEMM) to accelerate the core of many applications (e.g. deep learning, iterative solvers, astrophysi...
For decades, the streaming architecture of FPGAs has delivered accelerated performance across many application domains, such as option pricing solvers in finance, computational fluid dynamics in oil and gas, and packet processing in network routers and firewalls. However, this performance comes at the expense of programmability. FPGA developers use hardware design languages (HDLs) to implement ...
The OpenCL standard allows targeting a large variety of CPU, GPU and accelerator architectures using a single unified programming interface and language. While the standard guarantees portability of functionality for complying applications and platforms, performance portability on such a diverse set of hardware is limited. Devices may vary significantly in memory architecture as well as type, n...
Systems that comprise accelerators (e.g., GPUs) promise high performance, but programming them is still a challenge, mainly for two reasons: 1) two distinct programming models have to be used within an application: one for the host CPU (e.g., C++), and one for the accelerator (e.g., OpenCL or CUDA); 2) using Just-In-Time (JIT) compilation and its optimization opportunities in both OpenC...
Modern SoC-FPGAs, which combine an FPGA with embedded ARM cores, are becoming popular as embedded vision system platforms. However, the design approach for SoC-FPGA applications still follows the traditional workflow of separate hardware and software development, which is a barrier to rapid product design and iteration on SoC-FPGAs. High-Level Synthesis (HLS) and OpenCL-based system-level design approaches provide...
The introduction of Graphics Processing Units (GPUs) and GPU computing in recent years has opened up possibilities for simple parallelization of programs. In this update, we present the modernized version of the program ARVO [J. Buša, J. Dzurina, E. Hayryan, S. Hayryan, C.-K. Hu, J. Plavka, I. Pokorný, J. Skivánek, M.-C. Wu, Comput. Phys. Comm. 165 (2005) 59]. The whole package has been rewritten in t...
OpenCL support across many heterogeneous nodes (FPGAs, GPUs, CPUs) has increased the programmability of these systems significantly. At the same time, it opens up new challenges and design choices for system designers and application programmers. While OpenCL offers a universal semantic to capture the parallel behavior of applications independent of the target architecture, some customization s...
Upcoming heterogeneous systems ask for new programming paradigms. Abstracting the underlying hardware architecture is desirable in order to support productive software development. This thesis proposes a design flow and runtime-system for executing process networks on heterogeneous systems using OpenCL. Process networks are a popular model of computation for deterministic parallel programming a...