Search results for: ideal batch size

Number of results: 662202

Journal: Journal of Applied Mathematics and Stochastic Analysis 1996

Journal: Jurnal Ilmu Komputer dan Informasi 2021

Neural networks possess an ability to generalize well to a data distribution, to the extent that they are capable of fitting randomly labeled data. But they are also known to be extremely sensitive to adversarial examples. Batch Normalization (BatchNorm), a very common part of deep learning architectures, has been found to increase this vulnerability. Fixup Initialization (Fixup Init) has been shown as an alternative to BatchNorm, which can cons...
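
A minimal PyTorch sketch of the Fixup idea this entry refers to: a residual block with no BatchNorm, where the first convolution is downscaled at initialization and the last one is zeroed so the block starts as an identity map. The class and argument names (`FixupBasicBlock`, `num_blocks`) are illustrative, and the paper's actual blocks are richer than this.

```python
import torch
import torch.nn as nn

class FixupBasicBlock(nn.Module):
    """Residual block without BatchNorm, initialized Fixup-style (a sketch)."""

    def __init__(self, channels: int, num_blocks: int):
        super().__init__()
        # scalar biases and a scale stand in for BatchNorm's affine parameters
        self.bias1 = nn.Parameter(torch.zeros(1))
        self.bias2 = nn.Parameter(torch.zeros(1))
        self.bias3 = nn.Parameter(torch.zeros(1))
        self.scale = nn.Parameter(torch.ones(1))
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        # Fixup: downscale the first conv by L^(-1/(2m-2)), here m = 2 layers
        nn.init.kaiming_normal_(self.conv1.weight)
        with torch.no_grad():
            self.conv1.weight.mul_(num_blocks ** -0.5)
        # zero-init the last conv so each block starts as the identity
        nn.init.zeros_(self.conv2.weight)

    def forward(self, x):
        out = torch.relu(self.conv1(x + self.bias1))
        out = self.conv2(out + self.bias2)
        return torch.relu(x + self.scale * out + self.bias3)
```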

Journal: Methodology and Computing in Applied Probability 2021

We consider batch size selection for a general class of multivariate batch means variance estimators, which are computationally viable for high-dimensional Markov chain Monte Carlo simulations. We derive the asymptotic mean squared error for this class of estimators. Further, we propose a parametric technique for estimating optimal batch sizes and discuss practical issues regarding the estimation process. Vector auto-regressive, Bayesian logistic...
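
For context, a plain NumPy sketch of the multivariate batch means estimator whose tuning parameter is the batch size discussed here; this is the standard textbook construction, not necessarily the authors' exact variant.

```python
import numpy as np

def multivariate_batch_means(chain, batch_size):
    """Batch means estimate of Sigma = lim n * Cov(sample mean) for an
    (n, p) array of correlated MCMC draws. The batch size trades bias
    (too small) against variance (too few batches)."""
    n, p = chain.shape
    a = n // batch_size                       # number of full batches
    trimmed = chain[: a * batch_size]
    batch_mean = trimmed.reshape(a, batch_size, p).mean(axis=1)
    centered = batch_mean - trimmed.mean(axis=0)
    return batch_size * centered.T @ centered / (a - 1)

# toy bivariate AR(1) chain; in the literature the MSE-optimal batch size
# typically grows on the order of n**(1/3)
rng = np.random.default_rng(0)
n, rho = 10_000, 0.9
chain = np.zeros((n, 2))
for t in range(1, n):
    chain[t] = rho * chain[t - 1] + rng.standard_normal(2)
print(multivariate_batch_means(chain, batch_size=round(n ** (1 / 3))))
```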

Journal: CoRR 2017
Hamed R. Bonab, Fazli Can

The number of component classifiers chosen for an ensemble has a great impact on its prediction ability. In this paper, we use a geometric framework for a priori determining the ensemble size, applicable to most of the existing batch and online ensemble classifiers. There are only a limited number of studies on the ensemble size considering Majority Voting (MV) and Weighted Majority Voting (WMV...
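
How strongly the ensemble size drives Majority Voting accuracy is easy to see in the classical independent-voter model; the sketch below is that textbook Condorcet-style calculation, not the paper's geometric framework.

```python
from math import comb

def majority_vote_accuracy(p: float, k: int) -> float:
    """P(majority of k independent classifiers is correct), each classifier
    correct with probability p, binary task, odd k."""
    need = k // 2 + 1                         # votes required for a majority
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(need, k + 1))

# accuracy rises with k but with sharply diminishing returns
for k in (1, 3, 7, 15, 31, 63):
    print(f"k={k:2d}  MV accuracy={majority_vote_accuracy(0.6, k):.4f}")
```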

Journal: Information Sciences 2021

The StochAstic Recursive grAdient algoritHm (SARAH), originally proposed for convex optimization and also proven to be effective for general nonconvex optimization, has received great attention because of its simple recursive framework for updating stochastic gradient estimates. The performance of SARAH significantly depends on the choice of the step size sequence. However, its variants often manually select a best-tune...
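
To make the "simple recursive framework" concrete, here is a NumPy sketch of SARAH on a toy least-squares problem; the constant `step` stands in for the step size sequence whose choice the abstract highlights, and all problem sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

def grad_full(w):                 # gradient of f(w) = mean of 0.5*(a_i.w - b_i)^2
    return A.T @ (A @ w - b) / n

def grad_i(w, i):                 # gradient of the i-th component
    return (A[i] @ w - b[i]) * A[i]

def sarah_outer(w, step=0.02, inner=200):
    """One outer SARAH iteration: anchor with a full gradient, then update
    the estimate recursively from single-sample gradients."""
    v = grad_full(w)
    w_prev, w = w, w - step * v
    for _ in range(inner - 1):
        i = rng.integers(n)
        v = grad_i(w, i) - grad_i(w_prev, i) + v      # recursive estimate
        w_prev, w = w, w - step * v
    return w

w = np.zeros(d)
for epoch in range(5):
    w = sarah_outer(w)
    print(epoch, float(np.linalg.norm(grad_full(w))))
```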

Journal: Journal of Biosocial Science 1971

Journal: CoRR 2017
Siyuan Ma, Raef Bassily, Mikhail Belkin

Stochastic Gradient Descent (SGD) with small mini-batch is a key component in modern large-scale machine learning. However, its efficiency has not been easy to analyze as most theoretical results require adaptive rates and show convergence rates far slower than that for gradient descent, making computational comparisons difficult. In this paper we aim to clarify the issue of fast SGD convergenc...
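
A toy experiment in the spirit of this line of work: on a noiseless (interpolating) linear regression, count the SGD steps each mini-batch size needs to reach a fixed loss, and note that step counts barely improve with larger batches while the per-sample cost grows. The fixed learning rate and problem sizes are arbitrary choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 256, 32
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d)          # zero noise: SGD can interpolate

def steps_to_tol(batch, lr=0.01, tol=1e-3, max_steps=100_000):
    """Mini-batch SGD steps until mean squared error drops below tol."""
    w = np.zeros(d)
    for step in range(1, max_steps + 1):
        idx = rng.integers(n, size=batch)
        w -= lr * A[idx].T @ (A[idx] @ w - b[idx]) / batch
        if np.mean((A @ w - b) ** 2) < tol:
            return step
    return max_steps

for m in (1, 4, 16, 64, 256):
    s = steps_to_tol(m)
    print(f"batch={m:3d}  steps={s:6d}  sample-gradients={m * s}")
```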

2017
Matteo Papini, Matteo Pirotta, Marcello Restelli

PROBLEM
• Monotonically improve a parametric Gaussian policy πθ in a continuous MDP, avoiding unsafe oscillations in the expected performance J(θ).
• Episodic policy gradient:
  – estimate ∇̂θJ(θ) from a batch of N sample trajectories;
  – update θ′ ← θ + Λ∇̂θJ(θ).
• Tune the step size Λ and the batch size N to limit oscillations (see the sketch below). This is not trivial:
  – Λ: trade-off with speed of convergence ← adaptive methods.
  – N: trade-off...
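
A stylized version of that trade-off: assume the gradient-estimation error shrinks as σ/√N and a quadratic lower bound on the performance improvement, then pick the (N, step) pair with the best guaranteed improvement per sampled trajectory. The bound's form and all constants here are illustrative, not the authors' exact results.

```python
import numpy as np

def safe_step_and_batch(g_est, sigma, c, candidates=(10, 50, 100, 500, 1000)):
    """Choose batch size N and scalar step size maximizing a guaranteed
    improvement per trajectory, under the assumed (hypothetical) bound
        Delta_J >= step * (g - eps)^2 - (c / 2) * step^2 * (g + eps)^2,
    with estimation error eps = sigma / sqrt(N)."""
    best = None
    for N in candidates:
        eps = sigma / np.sqrt(N)
        g_lo, g_hi = max(g_est - eps, 0.0), g_est + eps
        if g_lo == 0.0:
            continue                          # no guaranteed improvement
        step = g_lo**2 / (c * g_hi**2)        # maximizer of the bound
        gain = g_lo**4 / (2 * c * g_hi**2)    # bound value at that step
        if best is None or gain / N > best[0]:
            best = (gain / N, N, step)
    return best                               # (gain per trajectory, N, step)

# larger N shrinks eps and allows bigger safe steps, but costs trajectories
print(safe_step_and_batch(g_est=1.0, sigma=2.0, c=1.0))
```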

Journal: CoRR 2017
Yang You, Igor Gitman, Boris Ginsburg

The most natural way to speed up the training of large networks is to use data-parallelism on multiple GPUs. To scale Stochastic Gradient (SG) based methods to more processors, one needs to increase the batch size to make full use of the computational power of each GPU. However, keeping the accuracy of the network while increasing the batch size is not trivial. Currently, the state-of-the-art method is t...
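
This appears to be the paper that proposes LARS (Layer-wise Adaptive Rate Scaling) for large-batch training; below is a minimal sketch of that layer-wise update on plain NumPy arrays. The trust coefficient and weight-decay handling follow common implementations and may differ in detail from the paper.

```python
import numpy as np

def lars_step(weights, grads, base_lr, trust=0.001, weight_decay=5e-4):
    """One LARS update over a list of per-layer arrays: each layer's rate is
    rescaled by ||w|| / ||g + wd*w||, so update magnitudes track weight
    magnitudes instead of the batch-size-dependent gradient scale."""
    out = []
    for w, g in zip(weights, grads):
        g = g + weight_decay * w
        w_norm, g_norm = np.linalg.norm(w), np.linalg.norm(g)
        local = trust * w_norm / g_norm if w_norm > 0 and g_norm > 0 else 1.0
        out.append(w - base_lr * local * g)
    return out
```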

2004
Andrew Potter, Stephen M. Disney

The rounding of orders to achieve a batch size is recognised as a source of the bullwhip problem within supply chains. While it is often advocated that batch sizes should be reduced as much as possible, there has been limited investigation into the impact of batching on bullwhip. Here we consider scenarios where orders are placed only in multiples of a fixed batch size, for both deterministic a...
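
A quick simulation of the scenario described, with a hypothetical replenishment policy: each period, order as many whole batches of size Q as demand plus carried-over backlog allows, and carry the remainder forward. The bullwhip ratio Var(orders)/Var(demand) grows with Q.

```python
import numpy as np

rng = np.random.default_rng(2)

def bullwhip_ratio(Q, periods=20_000, mean_demand=10.0, sd=2.0):
    """Variance amplification from rounding orders to multiples of Q."""
    demand = rng.normal(mean_demand, sd, periods)
    orders = np.empty(periods)
    backlog = 0.0                             # unmet demand carried forward
    for t in range(periods):
        needed = demand[t] + backlog
        orders[t] = Q * np.floor(needed / Q)  # whole batches only
        backlog = needed - orders[t]
    return orders.var() / demand.var()

for Q in (1, 2, 5, 10, 20):
    print(f"Q={Q:2d}  bullwhip ratio={bullwhip_ratio(Q):.2f}")
```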

[Chart: number of search results per year]