Search results for: stochastic gradient descent

Number of results: 258,150

Journal: :CoRR 2016
Alexandre Salle, Aline Villavicencio, Marco Idiart

In this paper, we propose LexVec, a new method for generating distributed word representations that uses low-rank, weighted factorization of the Positive Pointwise Mutual Information matrix via stochastic gradient descent, employing a weighting scheme that assigns heavier penalties to errors on frequent co-occurrences while still accounting for negative co-occurrence. Evaluation on word simila...
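Below is a minimal illustrative sketch of the general idea, not the authors' implementation: a toy PPMI matrix is factorized by SGD, with each pair's error weighted by a function of its co-occurrence count. The specific weighting form, the uniform pair sampling, and all hyperparameters here are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 50, 8                                   # toy vocabulary size and embedding dimension
counts = rng.poisson(2.0, (V, V)).astype(float)  # hypothetical co-occurrence counts
total = counts.sum()
p_w = counts.sum(axis=1) / total
p_c = counts.sum(axis=0) / total
joint = counts / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log(joint / np.outer(p_w, p_c))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)   # positive pointwise mutual information

W = 0.1 * rng.standard_normal((V, d))          # word vectors
C = 0.1 * rng.standard_normal((V, d))          # context vectors
lr = 0.05
for _ in range(20000):
    i, j = rng.integers(V), rng.integers(V)
    weight = 1.0 + np.log1p(counts[i, j])      # heavier penalty for frequent pairs (assumed form)
    err = W[i] @ C[j] - ppmi[i, j]
    wi = W[i].copy()
    W[i] -= lr * weight * err * C[j]
    C[j] -= lr * weight * err * wi
```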

2017
Yiming Wang, Vijayaditya Peddinti, Hainan Xu, Xiaohui Zhang, Daniel Povey, Sanjeev Khudanpur

In this paper we describe a modification to Stochastic Gradient Descent (SGD) that improves generalization to unseen data. It consists of two steps for each minibatch: a backward step with a small negative learning rate, followed by a forward step with a larger learning rate. The idea was initially inspired by adversarial training, but we show that it can be viewed as a crude w...
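A minimal sketch of the two-step update described above, applied to a toy least-squares problem; the scale `alpha`, the learning rate, and the exact backward/forward step formula are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((256, 5))
w_true = rng.standard_normal(5)
y = X @ w_true + 0.1 * rng.standard_normal(256)

def minibatch_grad(w, xb, yb):
    # gradient of the mean squared error on one minibatch
    return 2.0 * xb.T @ (xb @ w - yb) / len(yb)

w = np.zeros(5)
lr, alpha = 0.1, 0.3                         # forward learning rate and backward scale (assumed)
for epoch in range(20):
    for start in range(0, len(X), 32):
        xb, yb = X[start:start + 32], y[start:start + 32]
        g1 = minibatch_grad(w, xb, yb)
        w = w + alpha * lr * g1              # backward step: small negative learning rate
        g2 = minibatch_grad(w, xb, yb)       # gradient recomputed at the perturbed point
        w = w - (1.0 + alpha) * lr * g2      # forward step with a larger learning rate
```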

Journal: :CoRR 2011
Wei Xu

For large-scale learning problems, it is desirable to obtain the optimal model parameters by going through the data in only one pass. Polyak and Juditsky (1992) showed that asymptotically the test performance of the simple average of the parameters obtained by stochastic gradient descent (SGD) is as good as that of the parameters which minimize the empirical cost. However, to our knowled...
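The following sketch shows the kind of one-pass averaged SGD the paper studies, on a synthetic streaming least-squares problem; the step-size schedule and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 10
w_true = rng.standard_normal(d)

w = np.zeros(d)
w_bar = np.zeros(d)                 # running average of the SGD iterates
for t in range(1, 10001):           # a single pass over streamed examples
    x = rng.standard_normal(d)
    y = x @ w_true + 0.1 * rng.standard_normal()
    g = 2.0 * (x @ w - y) * x       # stochastic gradient of the squared error
    w -= 0.02 / np.sqrt(t) * g      # slowly decaying step size (assumed schedule)
    w_bar += (w - w_bar) / t        # the averaged iterate is the one-pass estimator
```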

2014
Mohammad Taha Bahadori, Yi Chang, Bo Long, Yan Liu

In this paper, we propose to study the problem of heterogeneous transfer ranking, a transfer learning problem with heterogeneous features, in order to utilize the rich, large-scale labeled data in popular languages to help the ranking task in less popular languages. We develop a large-margin algorithm, namely LM-HTR, to solve the problem by mapping the input features in both the source domain and...
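Purely as an illustration of a generic large-margin transfer-ranking setup (not the authors' LM-HTR algorithm), the sketch below maps heterogeneous source and target features into a shared space with linear maps and trains a pairwise hinge ranking loss by SGD; every name, dimension, and hyperparameter here is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(8)
d_src, d_tgt, k = 12, 7, 4                   # heterogeneous feature dimensions, shared space size
Xs = rng.standard_normal((500, d_src)); ys = rng.integers(0, 2, 500)   # many labeled source docs
Xt = rng.standard_normal((50, d_tgt));  yt = rng.integers(0, 2, 50)    # few labeled target docs

Ws = 0.1 * rng.standard_normal((d_src, k))   # maps source features into the shared space
Wt = 0.1 * rng.standard_normal((d_tgt, k))   # maps target features into the shared space
w = np.zeros(k)                              # ranking vector in the shared space
lr, margin = 0.01, 1.0

def hinge_step(Wmap, X, y):
    # sample one relevant/irrelevant pair and take an SGD step on the pairwise hinge loss
    global w
    pos = rng.choice(np.flatnonzero(y == 1))
    neg = rng.choice(np.flatnonzero(y == 0))
    zp, zn = X[pos] @ Wmap, X[neg] @ Wmap
    if margin - w @ (zp - zn) > 0:           # margin violated: relevant item not ranked high enough
        Wmap = Wmap + lr * np.outer(X[pos] - X[neg], w)
        w = w + lr * (zp - zn)
    return Wmap

for _ in range(3000):
    Ws = hinge_step(Ws, Xs, ys)              # mostly pairs from the source domain
    if rng.random() < 0.2:
        Wt = hinge_step(Wt, Xt, yt)          # occasional pairs from the target domain
```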

Journal: :CoRR 2013
Shenghuo Zhu

With a weighting scheme proportional to t, a traditional stochastic gradient descent (SGD) algorithm achieves a high-probability convergence rate of O(κ/T) for strongly convex functions, instead of O(κ ln(T)/T). We also prove that an accelerated SGD algorithm achieves a rate of O(κ/T).
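A sketch of the weighting scheme mentioned above: plain SGD on a strongly convex quadratic whose iterates are averaged with weights proportional to t. The step-size cap and all constants are illustrative assumptions, not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 5
A = np.diag(np.linspace(1.0, 10.0, d))    # strongly convex quadratic, condition number ~10
b = rng.standard_normal(d)
mu, L = 1.0, 10.0                          # strong convexity and smoothness constants

w = np.zeros(d)
weighted_sum = np.zeros(d)
weight_total = 0.0
for t in range(1, 5001):
    g = A @ w - b + 0.1 * rng.standard_normal(d)   # noisy gradient of 0.5*w'Aw - b'w
    step = min(1.0 / L, 2.0 / (mu * (t + 1)))      # O(1/(mu*t)) step, capped for stability
    w -= step * g
    weighted_sum += t * w                          # averaging weight proportional to t
    weight_total += t
w_avg = weighted_sum / weight_total                # the t-weighted averaged iterate
```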

Journal: :CoRR 2017
Tianyang Li, Liu Liu, Anastasios Kyrillidis, Constantine Caramanis

We present a novel method for frequentist statistical inference in M-estimation problems, based on stochastic gradient descent (SGD) with a fixed step size: we demonstrate that the average of such SGD sequences can be used for statistical inference, after proper scaling. An intuitive analysis using the Ornstein-Uhlenbeck process suggests that such averages are asymptotically normal. From a prac...
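The sketch below illustrates the basic recipe of fixed-step-size SGD with iterate averaging, forming several independent averages and a naive confidence interval for one coordinate; the proper scaling from the paper is not reproduced, and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
d = 3
theta_true = np.array([1.0, -0.5, 2.0])

def sgd_average(n_steps=2000, step=0.05):
    # fixed-step-size SGD on streaming least squares, returning the iterate average
    theta = np.zeros(d)
    avg = np.zeros(d)
    for t in range(1, n_steps + 1):
        x = rng.standard_normal(d)
        y = x @ theta_true + 0.1 * rng.standard_normal()
        theta -= step * 2.0 * (x @ theta - y) * x
        avg += (theta - avg) / t
    return avg

# several independent averaged runs give a crude sampling distribution
replicates = np.array([sgd_average() for _ in range(20)])
mean = replicates.mean(axis=0)
se = replicates.std(axis=0, ddof=1) / np.sqrt(len(replicates))
ci_first_coord = (mean[0] - 1.96 * se[0], mean[0] + 1.96 * se[0])
```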

2008
Thomas Gärtner

Training Non-linear Structured Prediction Models with Stochastic Gradient Descent. Thomas Gärtner and Shankar Vembu, Fraunhofer IAIS, Schloß Birlinghoven, 53754 Sankt Augustin, Germany.

Journal: :Math. Program. 2012
Guanghui Lan, Arkadi Nemirovski, Alexander Shapiro

The main goal of this paper is to develop accuracy estimates for stochastic programming problems by employing stochastic approximation (SA) type algorithms. To this end we show that while running a Mirror Descent Stochastic Approximation procedure one can compute, with a small additional effort, lower and upper statistical bounds for the optimal objective value. We demonstrate that for a certai...
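For orientation, here is a toy mirror descent stochastic approximation loop with the entropy prox-function on the probability simplex; the paper's lower and upper statistical bounds on the optimal objective value are not reproduced here, and the problem data are invented.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
c = np.array([0.3, 0.1, 0.5, 0.2])        # minimize E[(c + noise)' x] over the simplex

x = np.full(n, 1.0 / n)                   # start at the center of the simplex
x_avg = np.zeros(n)
for t in range(1, 1001):
    g = c + 0.1 * rng.standard_normal(n)  # stochastic gradient
    step = 1.0 / np.sqrt(t)
    x = x * np.exp(-step * g)             # entropic (exponentiated-gradient) update
    x /= x.sum()                          # normalize back onto the simplex
    x_avg += (x - x_avg) / t              # averaged iterate returned by the SA procedure
```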

2011
Aditya Krishna Menon, Charles Elkan

We propose to solve the link prediction problem in graphs using a supervised matrix factorization approach. The model learns latent features from the topological structure of a (possibly directed) graph, and is shown to make better predictions than popular unsupervised scores. We show how these latent features may be combined with optional explicit features for nodes or edges, which yields bett...
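An illustrative sketch (not the authors' model) of latent-feature matrix factorization for link prediction, trained by SGD on a logistic loss over entries of a toy adjacency matrix; combining latent features with explicit node or edge features is omitted.

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 30, 4
A = (rng.random((n, n)) < 0.1).astype(float)   # toy directed adjacency matrix

U = 0.1 * rng.standard_normal((n, k))          # latent features of source nodes
V = 0.1 * rng.standard_normal((n, k))          # latent features of target nodes
lr = 0.1
for _ in range(20000):
    i, j = rng.integers(n), rng.integers(n)
    s = 1.0 / (1.0 + np.exp(-U[i] @ V[j]))     # predicted link probability
    err = s - A[i, j]                          # gradient factor of the logistic loss
    ui = U[i].copy()
    U[i] -= lr * err * V[j]
    V[j] -= lr * err * ui

scores = U @ V.T                               # higher score = more likely link
```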

2005
Anatoli Juditsky, Alexander V. Nazin, Alexandre B. Tsybakov, Nicolas Vayatis

We consider the problem of constructing an aggregated estimator from a finite class of base functions which approximately minimizes a convex risk functional under an ℓ1 constraint. For this purpose, we propose a stochastic procedure, mirror descent, which performs gradient descent in the dual space. The generated estimates are additionally averaged in a recursive fashion with specific weig...
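A rough sketch of this style of aggregation: weights on a small dictionary of base functions are kept on the simplex (an ℓ1-type constraint) and updated by stochastic entropic mirror descent, with the iterates averaged recursively. The dictionary, the step sizes, and the plain running mean standing in for the paper's specific averaging weights are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 200, 6                              # samples and base functions
X = rng.standard_normal((n, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

# toy dictionary of base functions evaluated on the data
H = np.column_stack([X[:, 0], X[:, 1], X[:, 2],
                     X[:, 0] ** 2, np.sin(X[:, 0]), np.cos(X[:, 1])])

lam = np.full(m, 1.0 / m)                  # aggregation weights on the simplex
lam_avg = np.zeros(m)
for t in range(1, 2001):
    i = rng.integers(n)                    # one random example: stochastic gradient
    resid = H[i] @ lam - y[i]
    g = 2.0 * resid * H[i]                 # gradient of the squared risk w.r.t. the weights
    step = 0.5 / np.sqrt(t)
    lam = lam * np.exp(-step * g)          # entropic mirror descent step (dual-space update)
    lam /= lam.sum()
    lam_avg += (lam - lam_avg) / t         # recursively averaged aggregate
```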
