Search results for: sgd

Number of results: 1169

2018
Zhanxing Zhu, Jingfeng Wu, Bing Yu, Lei Wu, Jinwen Ma

Understanding the generalization of deep learning has attracted much attention recently, and the learning algorithm plays an important role in generalization performance, a prominent example being stochastic gradient descent (SGD). Along this line, we particularly study the anisotropic noise introduced by SGD and investigate its importance for generalization in deep neural networks. Through a thorough empi...
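
As a minimal illustration of the noise structure discussed above (a toy least-squares sketch in numpy, not the paper's code), the minibatch gradient can be viewed as the full gradient plus a noise term whose empirical covariance is state-dependent and generally anisotropic:

```python
import numpy as np

# Hypothetical data for a least-squares model; for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=1000)
w = np.zeros(5)

# Per-sample gradients of 0.5 * (x.w - y)^2 are (x.w - y) * x.
residuals = X @ w - y
G = residuals[:, None] * X            # shape (n_samples, n_params)
full_grad = G.mean(axis=0)

# SGD noise: deviation of per-sample gradients from the full gradient.
noise = G - full_grad
cov = noise.T @ noise / len(X)        # empirical noise covariance

# Unequal eigenvalues show the noise is anisotropic, not spherical.
print(np.round(np.linalg.eigvalsh(cov), 3))
```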

2016
Wei Zhang, Suyog Gupta, Xiangru Lian, Ji Liu

This paper investigates the effect of stale (delayed) gradient updates within the context of asynchronous stochastic gradient descent (Async-SGD) optimization for distributed training of deep neural networks. We demonstrate that our implementation of Async-SGD on an HPC cluster can achieve a tight bound on the gradient staleness while providing near-linear speedup. We propose a variant of the SG...
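
A toy simulation of the stale-gradient effect (our construction from the description; the queue depth, step-size rule, and objective are assumptions, not the authors' implementation) might look like this:

```python
import collections
import numpy as np

w = np.zeros(2)
queue = collections.deque()           # pending (gradient, version) pairs
lr = 0.1

def grad(w):
    # Toy objective 0.5 * ||w - 1||^2; its gradient is w - 1.
    return w - 1.0

for step in range(200):
    queue.append((grad(w), step))     # a worker computes a gradient on the current w
    if len(queue) >= 4:               # emulate a bounded staleness of ~4 steps
        g, version = queue.popleft()
        tau = max(step - version, 1)  # how stale the applied gradient is
        w -= (lr / tau) * g           # staleness-dependent step-size scaling
print(w)                              # ends near the optimum [1, 1]
```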

2011
Jinlong Zhou, Xipeng Qiu, Xuanjing Huang

The sparse learning framework, which has recently become popular in natural language processing due to its efficiency and generalizability, can be applied to Conditional Random Fields (CRFs) with the L1 regularization method. The stochastic gradient descent (SGD) method has been used to train L1-regularized CRFs because it often requires much less training time than the batch tra...
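
For a flavor of how SGD handles an L1 penalty (a generic proximal soft-thresholding step, one common approach; not necessarily the method in this paper), consider:

```python
import numpy as np

def sgd_l1_step(w, g, lr, lam):
    """One SGD step on the data loss, followed by the L1 proximal operator."""
    w = w - lr * g
    # Soft-thresholding: shrink each weight by lr*lam and clip at zero,
    # which is what produces exactly-zero (sparse) weights.
    return np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)

w = np.array([0.50, -0.02, 0.30])
g = np.array([0.10, 0.00, -0.20])     # hypothetical stochastic gradient
print(sgd_l1_step(w, g, lr=0.1, lam=0.5))   # the tiny weight snaps to 0.0
```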

2013
Yangyang Shi, Mei-Yuh Hwang, Kaisheng Yao, Martha Larson

Recurrent neural network-based language models (RNNLMs) have been shown to outperform traditional n-gram language models in automatic speech recognition. However, this superior performance comes at the cost of expensive model training. In this paper, we propose a sentence-independent subsampling stochastic gradient descent algorithm (SIS-SGD) to speed up the training of RNNLM using p...
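
A rough sketch of the subsampling idea, as we read it (the worker split, sampling rate, and gradient stand-in below are assumptions, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(1)
sentences = list(range(10_000))       # placeholder corpus of sentence ids

def worker_gradient(subset):
    # Stand-in for accumulating RNNLM gradients over the sampled sentences.
    return np.full(3, len(subset) / len(sentences))

def subsampled_epoch(num_workers=4, rate=0.1):
    grads = []
    for _ in range(num_workers):
        # "Map": each worker samples every sentence independently.
        subset = [s for s in sentences if rng.random() < rate]
        grads.append(worker_gradient(subset))
    # "Reduce": average the workers' gradients into one update direction.
    return np.mean(grads, axis=0)

print(subsampled_epoch())
```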

Journal: CoRR 2016
Hossein Mobahi

This work presents a new algorithm for training recurrent neural networks (although the ideas are applicable to feedforward networks as well). The algorithm is derived from a theory in nonconvex optimization related to the diffusion equation. The contributions made in this work are twofold. First, we show how some seemingly disconnected mechanisms used in deep learning, such as smart initialization...
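
The diffusion-equation view suggests a graduated (continuation) scheme: optimize a heavily smoothed objective first, then anneal the smoothing. A compact sketch under our assumptions (Monte-Carlo Gaussian smoothing of a toy 1-D loss; not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(2)

def loss(x):
    # Nonconvex toy objective with several local minima.
    return np.sin(5 * x) + 0.5 * x**2

def smoothed_grad(x, sigma, samples=512):
    # Monte-Carlo gradient of the Gaussian-smoothed (diffused) loss;
    # subtracting loss(x) is a control variate that reduces variance.
    eps = rng.normal(size=samples)
    return np.mean((loss(x + sigma * eps) - loss(x)) * eps) / sigma

x = 3.0
for sigma in [2.0, 1.0, 0.5, 0.1, 0.02]:   # anneal the diffusion scale
    for _ in range(200):
        x -= 0.05 * smoothed_grad(x, sigma)
print(x, loss(x))                          # tracks toward a good minimum
```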

Journal: The Journal of Experimental Medicine 1999
Julie A. Lekstrom-Himes, Susan E. Dorman, Piroska Kopar, Steven M. Holland, John I. Gallin

Neutrophil-specific granule deficiency (SGD) is a rare disorder characterized by recurrent pyogenic infections, defective neutrophil chemotaxis and bactericidal activity, and lack of neutrophil secondary granule proteins. CCAAT/enhancer binding protein (C/EBP)epsilon, a member of the leucine zipper family of transcription factors, is expressed primarily in myeloid cells, and its knockout mouse ...

2008
Jiu Jimmy Jiao

Algal blooms in Tolo Harbour, Hong Kong have received much attention, and submarine groundwater discharge is speculated to be a significant pathway carrying nutrients into the constricted estuary. Plover Cove, a small cove in the Harbour, was selected for SGD analysis usin...

Journal: Developmental neurorehabilitation 2010
Mandy Jenkins Rispoli, Jessica H. Franco, Larah van der Meer, Russell Lang, Síglia Pimentel Höher Camargo

OBJECTIVE: This review synthesizes communication interventions that involved the use of speech generating devices (SGD) for individuals with developmental disabilities. METHODS: Systematic searches of electronic databases, journals and reference lists identified 35 studies meeting the inclusion criteria. These studies were evaluated in terms of (a) participants, (b) SGD function, (c) SGD charac...

Journal: CoRR 2014
Huahua Wang, Arindam Banerjee

Two types of low cost-per-iteration gradient descent methods have been extensively studied in parallel. One is online or stochastic gradient descent (OGD/SGD), and the other is randomized block coordinate descent (RBCD). In this paper, we combine the two types of methods together and propose online randomized block coordinate descent (ORBCD). At each iteration, ORBCD only computes the partial gradie...
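
A bare-bones sketch of the combined update, as we read it (the block partition, step size, and least-squares objective are our assumptions): each iteration draws one sample and one coordinate block and updates only that block.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 8))
y = X @ np.ones(8)                         # true weights are all 1
w = np.zeros(8)
blocks = [np.arange(0, 4), np.arange(4, 8)]  # two hypothetical blocks

for _ in range(5000):
    i = rng.integers(len(X))               # "online": one random sample
    b = blocks[rng.integers(len(blocks))]  # "randomized block": one block
    residual = X[i] @ w - y[i]
    w[b] -= 0.01 * residual * X[i, b]      # partial gradient, block b only
print(np.round(w, 2))                      # approaches the true weights
```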

Journal: Journal of Machine Learning Research 2009
Antoine Bordes, Léon Bottou, Patrick Gallinari

The SGD-QN algorithm is a stochastic gradient descent algorithm that makes careful use of second-order information and splits the parameter update into independently scheduled components. Thanks to this design, SGD-QN iterates nearly as fast as a first-order stochastic gradient descent but requires fewer iterations to achieve the same accuracy. This algorithm won the “Wild Track” of the first PAS...
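
To convey the flavor of a diagonally scaled stochastic step with infrequently refreshed curvature estimates (a simplified sketch inspired by the description; the secant rule, smoothing constants, and schedule below are our assumptions, not the published SGD-QN algorithm):

```python
import numpy as np

rng = np.random.default_rng(4)
curv = np.array([1.0, 4.0, 0.5, 2.0])      # toy per-coordinate curvatures

def grad(w):
    # Noisy gradient of a separable quadratic centered at w* = 1.
    return curv * (w - 1.0) + 0.01 * rng.normal(size=4)

w = np.zeros(4)
scale = np.ones(4)                         # diagonal scaling; starts as plain SGD
skip = 16                                  # refresh the scaling only every `skip` steps

for t in range(1, 2001):
    lr = 1.0 / (t + 10)
    g = grad(w)
    w_new = w - lr * scale * g
    if t % skip == 0:
        # Secant-style diagonal curvature estimate, smoothed and kept positive.
        ratio = np.abs(grad(w_new) - g) / (np.abs(w_new - w) + 1e-12)
        scale = 0.9 * scale + 0.1 / np.maximum(ratio, 1e-2)
    w = w_new
print(np.round(w, 2))                      # close to the optimum (all 1)
```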

[Chart: number of search results per year]