Search results for: trust region dogleg method

Number of results: 2,150,475
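For context on the query topic: the dogleg method chooses a trust-region step along a piecewise-linear path from the steepest-descent (Cauchy) point to the full Newton step, clipped to the region boundary. A minimal sketch (function name and interface are illustrative, not from any of the papers listed below):

```python
import numpy as np

def dogleg_step(g, B, delta):
    """One dogleg step for min m(p) = g^T p + 0.5 p^T B p, subject to ||p|| <= delta.

    g: gradient vector, B: (positive-definite) Hessian approximation,
    delta: trust-region radius.
    """
    p_newton = -np.linalg.solve(B, g)          # full Newton step
    if np.linalg.norm(p_newton) <= delta:
        return p_newton                        # Newton step lies inside the region
    p_cauchy = -(g @ g) / (g @ B @ g) * g      # unconstrained minimizer along -g
    if np.linalg.norm(p_cauchy) >= delta:
        return -delta * g / np.linalg.norm(g)  # truncated steepest-descent step
    # Otherwise intersect the dogleg segment p_cauchy + tau * (p_newton - p_cauchy)
    # with the boundary ||p|| = delta: positive root of a quadratic in tau.
    d = p_newton - p_cauchy
    a, b, c = d @ d, 2 * (p_cauchy @ d), p_cauchy @ p_cauchy - delta**2
    tau = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
    return p_cauchy + tau * d
```

In a full trust-region loop, the radius `delta` is then grown or shrunk according to how well the quadratic model predicted the actual objective decrease.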

Journal: SIAM J. Scientific Computing 1999
Mary Ann Branch, Thomas F. Coleman, Yuying Li

BOUND-CONSTRAINED MINIMIZATION PROBLEMS. Abstract. A subspace adaptation of the Coleman-Li trust region and interior method is proposed for solving large-scale bound-constrained minimization problems. This method can be implemented with either sparse Cholesky factorization or conjugate gradient computation. Under reasonable conditions the converg...

Journal: CoRR 2017
Ryo Iwaki, Minoru Asada

Monotonic policy improvement and off-policy learning are two main desirable properties for reinforcement learning algorithms. In this study, we show that monotonic policy improvement is guaranteed from a mixture of on- and off-policy data. Based on the theoretical result, we provide an algorithm which uses the experience replay technique for trust region policy optimization. The proposed method ca...

2017

Trust region methods, such as TRPO, are often used to stabilize policy optimization algorithms in reinforcement learning (RL). While current trust region strategies are effective for continuous control, they typically require a large amount of on-policy interaction with the environment. To address this problem, we propose an off-policy trust region method, Trust-PCL, which exploits an observati...

Journal: Universität Trier, Mathematik/Informatik, Forschungsbericht 1998
Florian Jarre

In several applications, semidefinite programs with nonlinear equality constraints arise. We give two such examples to emphasize the importance of this class of problems. We then propose a new solution method that also applies to smooth nonconvex programs. The method combines ideas of a predictor-corrector interior-point method, of the SQP method, and of trust region methods. In particular, we b...

Journal: Comp. Opt. and Appl. 2016
Wim van Ackooij, Antonio Frangioni, Welington de Oliveira

We explore modifications of the standard cutting-plane approach for minimizing a convex nondifferentiable function, given by an oracle, over a combinatorial set, which is the basis of the celebrated (generalized) Benders’ decomposition approach. Specifically, we combine stabilization—in two ways: via a trust region in the L1 norm, or via a level constraint—and inexact function computation (solu...

Journal: Iranian Journal of Numerical Analysis and Optimization 0
Maziar Salahi, Akram Taati

The extended trust region subproblem has been the focus of considerable research recently. Under various assumptions, strong duality and certain SOCP/SDP relaxations have been proposed for several classes of it. Due to its importance, in this paper, without any assumption on the problem, we apply the widely used alternating direction method of multipliers (ADMM) to solve it. The convergence of ADMM ...

Journal: J. Global Optimization 2017
José Mario Martínez, Marcos Raydan

In a recent paper we introduced a trust-region method with variable norms for unconstrained minimization and we proved standard asymptotic convergence results. Here we will show that, with a simple modification with respect to the sufficient descent condition and replacing the trust-region approach with a suitable cubic regularization, the complexity of this method for finding approximate first...

Journal: CoRR 2017
Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans

Trust region methods, such as TRPO, are often used to stabilize policy optimization algorithms in reinforcement learning (RL). While current trust region strategies are effective for continuous control, they typically require a prohibitively large amount of on-policy interaction with the environment. To address this problem, we propose an off-policy trust region method, Trust-PCL. The algorithm ...

2006
C. G. Baker, Pierre-Antoine Absil, Kyle A. Gallivan

The recently proposed Riemannian Trust-Region method can be applied to the problem of computing extreme eigenpairs of a matrix pencil, with strong global convergence and local convergence properties. This paper addresses inherent inefficiencies of an explicit trust-region mechanism. We propose a new algorithm, the Implicit Riemannian Trust-Region method for extreme eigenpair computation, which s...
