training iteration

Least-Squares Methods in Reinforcement Learning for Control

2002

Michail G. Lagoudakis Ronald Parr Michael L. Littman

Least-squares methods have been successfully used for prediction problems in the context of reinforcement learning, but little has been done in extending these methods to control problems. This paper presents an overview of our research efforts in using least-squares techniques for control. In our early attempts, we considered a direct extension of the Least-Squares Temporal Difference (LSTD) a...


A Multi-Batch L-BFGS Method for Machine Learning

2016

Albert S. Berahas Jorge Nocedal Martin Takáč

The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead on batch methods that use a sizeable fraction of the training set at each iteration to facilitate parallelism, and that employ second-order information. In order to improve the learning process, we follow a multi-batch approach in which t...


Better Self-training for Image Classification Through Self-supervision

Journal: Lecture Notes in Computer Science 2022

Self-training is a simple semi-supervised learning approach: unlabelled examples that attract high-confidence predictions are labelled with their predictions and added to the training set, this process being repeated multiple times. Recently, self-supervision (learning without manual supervision by solving an automatically generated pretext task) has gained prominence in deep learning. This paper investigate...
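
The self-training loop described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the toy 1-D nearest-centroid classifier, the softmax-style confidence score, the 0.9 threshold, and all function names are assumptions made for the example, and no self-supervised pretext task is included.

```python
import math


def fit_centroids(xs, ys):
    """Fit a toy 1-D nearest-centroid classifier: one mean per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_with_confidence(centroids, x):
    """Return (label, confidence) from a softmax over negative distances."""
    scores = {label: math.exp(-abs(x - c)) for label, c in centroids.items()}
    total = sum(scores.values())
    label = max(scores, key=scores.get)
    return label, scores[label] / total


def self_train(labelled_x, labelled_y, unlabelled_x, threshold=0.9, max_rounds=10):
    """Repeatedly pseudo-label confident examples and retrain (hypothetical helper)."""
    xs, ys = list(labelled_x), list(labelled_y)
    pool = list(unlabelled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(xs, ys)
        remaining, added = [], 0
        for x in pool:
            label, confidence = predict_with_confidence(centroids, x)
            if confidence >= threshold:
                # Confident prediction: treat it as a label and grow the training set.
                xs.append(x)
                ys.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break  # nothing new was labelled, so further rounds change nothing
    return fit_centroids(xs, ys), xs, ys, pool


centroids, xs, ys, pool = self_train([0.0, 10.0], [0, 1], [1.0, 9.0, 5.0])
```

In this toy run the points 1.0 and 9.0 are pseudo-labelled in the first round, while the ambiguous point 5.0 never reaches the confidence threshold and stays in the unlabelled pool.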


An Improved Gradient Projection-based Decomposition Technique for Support Vector Machines

Journal: Comput. Manag. Science 2006

Luca Zanni

In this paper we propose some improvements to a recent decomposition technique for the large quadratic program arising in training Support Vector Machines. Like standard decomposition approaches, the technique we consider is based on the idea of optimizing, at each iteration, a subset of the variables through the solution of a quadratic programming subproblem. The innovative features of this approa...


Continuous Learning: Engineering Super Features With Feature Algebras

Journal: CoRR 2013

Michael Tetelman

In this paper we consider a problem of searching a space of predictive models for a given training data set. We propose an iterative procedure for deriving a sequence of improving models and a corresponding sequence of sets of non-linear features on the original input space. After a finite number of iterations N, the non-linear features become 2^N-degree polynomials on the original space. We sh...


Stochastic Backward Euler: An Implicit Gradient Descent Algorithm for k-means Clustering

Journal: CoRR 2017

Penghang Yin Minh Pham Adam M. Oberman Stanley Osher

In this paper, we propose an implicit gradient descent algorithm for the classic k-means problem. The implicit gradient step or backward Euler is solved via stochastic fixed-point iteration, in which we randomly sample a mini-batch gradient in every iteration. It is the average of the fixed-point trajectory that is carried over to the next gradient step. We draw connections between the proposed...
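
The scheme outlined in this abstract can be sketched roughly as below. It is a hedged illustration only: the 1-D data, the step size, the iteration counts, and the function names are assumptions for the example, not values or code from the paper. The implicit step solves y = x - step * grad(y) by fixed-point iteration, draws a fresh mini-batch gradient at every inner iteration, and carries the average of the inner trajectory over to the next outer step.

```python
import random


def minibatch_grad(centroids, batch):
    """Gradient of (1 / 2B) * sum_i min_j (c_j - p_i)^2 with respect to each centroid."""
    grad = [0.0] * len(centroids)
    for p in batch:
        # Assign the point to its nearest centroid, as in standard k-means.
        j = min(range(len(centroids)), key=lambda k: (centroids[k] - p) ** 2)
        grad[j] += (centroids[j] - p) / len(batch)
    return grad


def stochastic_backward_euler(points, centroids, step=0.5, outer_iters=50,
                              inner_iters=10, batch_size=8, seed=0):
    """Sketch of an implicit gradient step solved by stochastic fixed-point iteration."""
    rng = random.Random(seed)
    x = list(centroids)
    for _ in range(outer_iters):
        y = list(x)
        running_sum = [0.0] * len(x)
        for _ in range(inner_iters):
            batch = rng.sample(points, batch_size)
            g = minibatch_grad(y, batch)
            # Fixed-point map for the backward Euler equation y = x - step * grad(y).
            y = [xj - step * gj for xj, gj in zip(x, g)]
            running_sum = [s + yj for s, yj in zip(running_sum, y)]
        # The average of the fixed-point trajectory becomes the next iterate.
        x = [s / inner_iters for s in running_sum]
    return x


points = [0.0, 0.1, -0.1, 0.2, -0.2, 0.05, -0.05, 0.0,
          10.0, 10.1, 9.9, 10.2, 9.8, 10.05, 9.95, 10.0]
result = stochastic_backward_euler(points, [1.0, 9.0])
```

On this toy data the two returned centroids should land close to the cluster centres near 0 and 10; the exact values depend on the random mini-batches.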


Generalized Value Iteration Networks: Life Beyond Lattices

Journal: CoRR 2017

Sufeng Niu Siheng Chen Hanyu Guo Colin Targonski Melissa C. Smith Jelena Kovacevic

In this paper, we introduce a generalized value iteration network (GVIN), which is an end-to-end neural network planning module. GVIN emulates the value iteration algorithm by using a novel graph convolution operator, which enables GVIN to learn and plan on irregular spatial graphs. We propose three novel differentiable kernels as graph convolution operators and show that the embedding-based ke...
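
For orientation, the classical value iteration algorithm that GVIN is said to emulate can be sketched on a small irregular graph. This is plain tabular value iteration, not the paper's learned graph convolution operator; the deterministic transitions, the reward-on-entry convention, the discount factor, and the graph itself are assumptions made for the example.

```python
def value_iteration(neighbors, rewards, gamma=0.9, tol=1e-9, max_iters=1000):
    """Tabular value iteration on a directed graph with deterministic moves.

    neighbors maps each node to the nodes reachable in one step;
    rewards[n] is the reward collected on entering node n (assumed convention).
    """
    values = {node: 0.0 for node in neighbors}
    for _ in range(max_iters):
        new_values = {}
        for node, nbrs in neighbors.items():
            # Bellman backup: best one-step reward plus discounted future value.
            new_values[node] = max(
                (rewards[n] + gamma * values[n] for n in nbrs), default=0.0
            )
        delta = max(abs(new_values[n] - values[n]) for n in neighbors)
        values = new_values
        if delta < tol:
            break
    return values


# Hypothetical irregular graph: A can reach the goal C via B directly or via D then B.
graph = {"A": ["B", "D"], "B": ["C"], "D": ["B"], "C": []}
rewards = {"A": 0.0, "B": 0.0, "C": 1.0, "D": 0.0}
values = value_iteration(graph, rewards)
```

Here node C is terminal, so its value stays at 0, B is worth 1.0 because it enters the rewarding node directly, and A and D are each one discounted step further away.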


An Iterative Learning Control of Nonlinear Systems Using Neural Network Design

2002

Chiang-Ju Chien Li-Chen Fu

In this paper, a feedforward neural network with sigmoid hidden units is used to design a neural network based iterative learning controller for nonlinear systems with state dependent input gains. No prior offline training phase is necessary, and only a single neural network is employed. All the weights of the neurons are tuned during the iteration process in order to achieve the desired learni...


CISO: Co-iteration semi-supervised learning for visual object detection

Journal: Multimedia Tools and Applications 2023

Semi-supervised learning offers a solution to the high cost and limited availability of manually labeled samples in supervised learning. In semi-supervised visual object detection, the use of unlabeled data can significantly enhance the performance of deep models. In this paper, we introduce an end-to-end framework, named CISO (Co-Iteration Semi-Supervised Learning for Object Detection), which integrat...


Regional Simulation of Bootstrap Efficiency of Broiler Production in Peninsular Malaysia

Journal: Journal of Agricultural Science and Technology

B. H. Gabdo M. I. Mansor H. A. W. Kamal A. M. Ilmas (all: Institute of Agricultural and Food Policy Studies, University of Putra, Malaysia)

Bootstrapping the DEA is one of the current methods of measuring robust efficiency by constructing a confidence interval and measuring the noise (bias) in production. In this study, two estimators, the conventional data envelopment analysis (DEA) and bootstrap simulation with 2,000 bootstrap iterations, were applied to cross-sectional data from 296 broiler farms in Peninsular Malaysia. The objec...
