Search results for: trust region
Number of results: 594912
In this paper, we study a few challenging theoretical and numerical issues on the well-known trust region policy optimization for deep reinforcement learning. The goal is to find a policy that maximizes the total expected reward when the agent acts according to the policy. The trust region subproblem is constructed with a surrogate function coherent to the total expected reward and a general distance constraint around the latest policy. We solve the subproblem using the preconditioned stochastic gradient...
Binary trust-region steepest descent (BTR) and combinatorial integral approximation (CIA) are two recently investigated approaches for the solution of optimization problems with distributed binary-/discrete-valued variables (control functions). We show improved convergence results for BTR by imposing a compactness assumption that is similar to the convergence theory of CIA. As a corollary, we conclude that CIA also constitutes a...
We describe an approach to managing the use of approximate models in optimization. This approach combines the idea of approximation models from engineering design optimization with the model trust region approach from nonlinear programming. The trust region framework regulates the amount of optimization done with the approximate models before one needs to appeal to a detailed model to check the...
The trust-region problem, which minimizes a nonconvex quadratic function over a ball, is a key subproblem in trust-region methods for solving nonlinear optimization problems. It enjoys many attractive properties such as exact semidefinite linear programming relaxation (SDP-relaxation) and strong duality. Unfortunately, such properties do not, in general, hold for an extended trust-region proble...