Search results for: stochastic gradient descent learning
Number of results: 840,759
We introduce a simple algorithm, True Asymptotic Natural Gradient Optimization (TANGO), that converges to a true natural gradient descent in the limit of small learning rates, without explicit Fisher matrix estimation. For quadratic models the algorithm is also an instance of averaged stochastic gradient, where the parameter is a moving average of a “fast”, constant-rate gradient descent. TANGO...
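The averaged-SGD structure mentioned here is easy to sketch: a "fast" iterate runs constant-rate SGD while the parameter actually reported is a running average of it. The following is a minimal sketch of generic averaged SGD, assuming a stochastic gradient oracle `grad`; it is not TANGO itself, whose precise update is not given in this excerpt.

```python
import numpy as np

def averaged_sgd(grad, w0, lr=0.1, beta=0.999, steps=10_000):
    """Generic averaged SGD (Polyak-Ruppert style, here with an exponential
    moving average): the fast iterate takes constant-rate stochastic gradient
    steps, and the reported parameter is its moving average.
    A sketch only -- not the TANGO update itself."""
    w_fast = w0.copy()   # fast, constant-learning-rate iterate
    w_avg = w0.copy()    # slow moving average that is actually returned
    for _ in range(steps):
        w_fast -= lr * grad(w_fast)                    # stochastic gradient step
        w_avg = beta * w_avg + (1.0 - beta) * w_fast   # average of the fast iterate
    return w_avg
```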
s from the year’s technical reports. A sample abstract may be seen in printed form as Figure 3. This technical report is issued jointly by UMIACS and by the Computer Science Department; hence it carries two identifying numbers. The abstract was described by the troff source of Figure 4, and the corresponding Hyperties article is shown in Figure 5. Here, the identifying numbers are used as the ...
A novel hybrid evolutionary approach is presented in this paper for improving the performance of neural network classifiers in slowly varying environments. For this purpose, we investigate a coupling of Differential Evolution Strategy and Stochastic Gradient Descent, using both the global search capabilities of Evolutionary Strategies and the effectiveness of on-line gradient descent. The use o...
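The coupling described, global exploration by Differential Evolution plus local refinement by on-line gradient descent, can be illustrated with a generic sketch. The DE/rand/1 operator, the constants, and the way SGD is interleaved below are assumptions for illustration, not the paper's exact design.

```python
import numpy as np

def hybrid_de_sgd(loss, grad, pop, gens=50, F=0.5, CR=0.9, lr=0.01, sgd_steps=10):
    """Couple Differential Evolution (global search) with SGD (local
    refinement): each DE trial vector is polished by a few gradient steps
    before greedy selection. Requires a population of at least 4 vectors."""
    pop = [p.copy() for p in pop]
    n = len(pop)
    for _ in range(gens):
        for i in range(n):
            idx = np.random.choice([k for k in range(n) if k != i], 3, replace=False)
            a, b, c = (pop[j] for j in idx)
            mutant = a + F * (b - c)                    # DE/rand/1 mutation
            mask = np.random.rand(*mutant.shape) < CR   # binomial crossover
            trial = np.where(mask, mutant, pop[i])
            for _ in range(sgd_steps):                  # local gradient refinement
                trial = trial - lr * grad(trial)
            if loss(trial) < loss(pop[i]):              # greedy selection
                pop[i] = trial
    return min(pop, key=loss)
```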
Though we use conventional (batch) L-BFGS to finalize optimization when training our model, optimization can be sped up by using stochastic (non-batch) gradient descent techniques prior to L-BFGS. Inspired by recent advances in “second-order” stochastic gradient descent techniques [2, 6, 7], we developed a novel variant of stochastic gradient descent based on exponentially decaying different quan...
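The two-phase recipe, stochastic warm-up under a decaying learning rate followed by batch L-BFGS, might look like the sketch below. The exponential decay schedule and the hand-off to SciPy's L-BFGS-B are assumptions; the paper's own SGD variant is not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize

def sgd_then_lbfgs(loss, grad, w0, lr0=0.1, decay=0.999, sgd_steps=2000):
    """Phase 1: cheap stochastic gradient steps with an exponentially
    decaying learning rate. Phase 2: conventional batch L-BFGS to finalize.
    A sketch of the two-phase recipe, not the paper's specific variant."""
    w, lr = w0.copy(), lr0
    for _ in range(sgd_steps):
        w -= lr * grad(w)     # stochastic step (grad may be a minibatch gradient)
        lr *= decay           # exponential learning-rate decay
    result = minimize(loss, w, jac=grad, method="L-BFGS-B")  # batch refinement
    return result.x
```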
Neural networks have traditionally been applied to recognition problems, and most learning algorithms are tailored to those problems. We discuss the requirements of learning for generalization, where the traditional methods based on gradient descent have limited success. We present a new stochastic learning algorithm based on simulated annealing in weight space. We verify the convergence proper...
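A plain simulated-annealing loop over the weight vector, of the kind this abstract alludes to, can be sketched as follows; the Gaussian proposal, geometric cooling schedule, and constants are illustrative assumptions rather than the paper's algorithm.

```python
import numpy as np

def anneal_weights(loss, w0, T0=1.0, cooling=0.995, steps=20_000, scale=0.05):
    """Simulated annealing in weight space: propose a random perturbation,
    always accept improvements, and accept worse moves with probability
    exp(-delta/T), cooling T geometrically."""
    rng = np.random.default_rng(0)
    w, f, T = w0.copy(), loss(w0), T0
    for _ in range(steps):
        cand = w + scale * rng.standard_normal(w.shape)  # random move in weight space
        fc = loss(cand)
        if fc < f or rng.random() < np.exp(-(fc - f) / T):
            w, f = cand, fc    # Metropolis acceptance
        T *= cooling           # geometric cooling schedule
    return w
```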
We establish novel generalization bounds for learning algorithms that converge to global minima. We do so by deriving black-box stability results that only depend on the convergence of a learning algorithm and the geometry around the minimizers of the loss function. The results are shown for nonconvex loss functions satisfying the Polyak-Łojasiewicz (PL) and the quadratic growth (QG) conditions...
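For reference, the two conditions are usually stated as follows, with $f^*$ the global minimum value, $\mathcal{X}^*$ the set of minimizers, and some $\mu > 0$; these are the standard textbook forms, which may differ in constants from the paper's exact statement.

```latex
% Polyak-Lojasiewicz (PL) condition: the squared gradient norm
% dominates the suboptimality gap everywhere.
\frac{1}{2}\,\lVert \nabla f(x) \rVert^{2} \;\ge\; \mu \bigl( f(x) - f^{*} \bigr)
\qquad \text{for all } x.

% Quadratic growth (QG) condition: the suboptimality gap grows at least
% quadratically with the distance to the set of minimizers.
f(x) - f^{*} \;\ge\; \frac{\mu}{2}\, \operatorname{dist}\bigl(x, \mathcal{X}^{*}\bigr)^{2}
\qquad \text{for all } x.
```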
We evaluate natural gradient, an algorithm originally proposed in Amari (1997), for learning deep models. The contributions of this paper are as follows. We show the connection between natural gradient and three other recently proposed methods: Hessian-Free (Martens, 2010), Krylov Subspace Descent (Vinyals and Povey, 2012) and TONGA (Le Roux et al., 2008). We empirically evaluate the robustness...
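In its standard form, which the methods compared here approximate in different ways, natural gradient preconditions the ordinary gradient with the inverse Fisher information matrix:

```latex
% Natural gradient update with step size \eta; F is the Fisher
% information matrix of the model distribution p_\theta.
\theta_{t+1} \;=\; \theta_{t} \;-\; \eta\, F(\theta_{t})^{-1}\, \nabla_{\theta} L(\theta_{t}),
\qquad
F(\theta) \;=\; \mathbb{E}_{x \sim p_{\theta}}\!\left[
  \nabla_{\theta} \log p_{\theta}(x)\, \nabla_{\theta} \log p_{\theta}(x)^{\top}
\right].
```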
We propose the application of a semi-supervised learning method to improve the performance of acoustic modelling for automatic speech recognition based on deep neural networks. As opposed to unsupervised initialisation followed by supervised fine-tuning, our method takes advantage of both unlabelled and labelled data simultaneously through minibatch stochastic gradient descent. We tested the me...
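One way to use unlabelled and labelled data simultaneously in a single minibatch SGD step is to mix a supervised loss on the labelled part with an unsupervised term on the unlabelled part. The sketch below uses a linear softmax model and entropy minimisation as the unsupervised term; both are placeholder assumptions, since the excerpt does not name the paper's actual objective.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mixed_minibatch_step(W, X_lab, y_lab, X_unl, lr=0.1, lam=0.1):
    """One SGD step on a minibatch combining labelled and unlabelled data
    for a linear softmax model. Supervised term: cross-entropy on
    (X_lab, y_lab). Unsupervised term (weight lam): entropy minimisation
    on X_unl -- an assumed placeholder objective."""
    # supervised cross-entropy gradient
    p = softmax(X_lab @ W)
    p[np.arange(len(y_lab)), y_lab] -= 1.0
    g = X_lab.T @ p / len(y_lab)
    # gradient of the mean prediction entropy on the unlabelled part
    q = softmax(X_unl @ W)
    logq = np.log(q + 1e-12)
    dz = -q * (logq - (q * logq).sum(axis=1, keepdims=True))  # dH/dlogits
    g += lam * X_unl.T @ dz / len(X_unl)
    return W - lr * g
```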