Statistical inference for model parameters in stochastic gradient descent
Related resources
Stochastic Gradient Descent as Approximate Bayesian Inference
Stochastic Gradient Descent with a constant learning rate (constant SGD) simulates a Markov chain with a stationary distribution. With this perspective, we derive several new results. (1) We show that constant SGD can be used as an approximate Bayesian posterior inference algorithm. Specifically, we show how to adjust the tuning parameters of constant SGD to best match the stationary distributi...
On Scalable Inference with Stochastic Gradient Descent
In many applications involving large datasets or online updating, stochastic gradient descent (SGD) provides a scalable way to compute parameter estimates and has gained increasing popularity due to its numerical convenience and memory efficiency. While the asymptotic properties of SGD-based estimators were established decades ago, statistical inference such as interval estimation remains m...
Statistical Inference for Online Learning and Stochastic Approximation via Hierarchical Incremental Gradient Descent
Stochastic gradient descent (SGD) is an immensely popular approach for online learning in settings where data arrives in a stream or data sizes are very large. However, despite an ever-increasing volume of work on SGD, much less is known about the statistical inferential properties of SGD-based predictions. Taking a fully inferential viewpoint, this paper introduces a novel proc...
Variational Stochastic Gradient Descent
In the Bayesian approach to probabilistic modeling of data, we select a model for probabilities of data that depends on a continuous vector of parameters. For a given data set, Bayes' theorem gives a probability distribution of the model parameters. Then the inference of outcomes and probabilities of new data could be found by averaging over the parameter distribution of the model, which is an intr...
Byzantine Stochastic Gradient Descent
This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the m machines which allegedly compute stochastic gradients every iteration, an α-fraction are Byzantine, and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds ε-approximate minimizers of convex functions in T = Õ ( 1...
Journal
Journal title: The Annals of Statistics
Year: 2020
ISSN: 0090-5364
DOI: 10.1214/18-aos1801