moDNN: Memory Optimal Deep Neural Network Training on Graphics Processing Units



Similar resources

Cofactorization on Graphics Processing Units

We show how the cofactorization step, a compute-intensive part of the relation collection phase of the number field sieve (NFS), can be farmed out to a graphics processing unit. Our implementation on a GTX 580 GPU, which is integrated with a state-of-the-art NFS implementation, can serve as a cryptanalytic co-processor for several Intel i7-3770K quad-core CPUs simultaneously. This allows those ...


Rapid Training of Acoustic Models using Graphics Processing Units

Robust and accurate speech recognition systems can only be realized with adequately trained acoustic models. For common languages, state-of-the-art systems are now trained on thousands of hours of speech data. Even with a large cluster of machines the entire training process can take many weeks. To overcome this development bottleneck we propose a new framework for rapid training of acoustic mo...


Biological Sequence Alignment on Graphics Processing Units

Sequence alignment is a common and often repeated task in molecular biology. The need for speeding up this treatment comes from the rapid growth rate of biological sequence databases. In this paper we present a new approach to high performance biological sequence database scanning on graphics processing units. Using modern graphics processing units for high performance computing is facilitated ...


Systolic neighborhood search on graphics processing units

In this paper, we propose a parallel processing model based on systolic computing merged with concepts of evolutionary algorithms. The proposed model works over a Graphics Processing Unit using the structure of threads as cells that form a systolic mesh. Data passes through those cells, each one performing a simple computing operation. The systolic algorithm is implemented using NVIDIA’s comput...


Parallel Genetic Programming on Graphics Processing Units

In program inference, the evaluation of how well a candidate solution solves a certain task is usually a computationally intensive procedure. Most of the time, the evaluation involves either submitting the program to a simulation process or testing its behavior on many input arguments; both situations may turn out to be very time-consuming. Things get worse when the optimization algorithm needs...


Journal

Journal title: IEEE Transactions on Parallel and Distributed Systems

Year: 2019

ISSN: 1045-9219,1558-2183,2161-9883

DOI: 10.1109/tpds.2018.2866582