Approximate Modified Policy Iteration
نویسندگان
چکیده
Modified policy iteration (MPI) is a dynamic programming (DP) algorithm that contains the two celebrated policy and value iteration methods. Despite its generality, MPI has not been thoroughly studied, especially its approximation form which is used when the state and/or action spaces are large or infinite. In this paper, we propose three implementations of approximate MPI (AMPI) that are extensions of well-known approximate DP algorithms: fitted-value iteration, fittedQ iteration, and classification-based policy iteration. We provide error propagation analysis that unifies those for approximate policy and value iteration. For the classificationbased implementation, we develop a finitesample analysis that shows that MPI’s main parameter allows to control the balance between the estimation error of the classifier and the overall value function approximation.
منابع مشابه
Solving time-fractional chemical engineering equations by modified variational iteration method as fixed point iteration method
The variational iteration method(VIM) was extended to find approximate solutions of fractional chemical engineering equations. The Lagrange multipliers of the VIM were not identified explicitly. In this paper we improve the VIM by using concept of fixed point iteration method. Then this method was implemented for solving system of the time fractional chemical engineering equations. The ob...
متن کاملApproximate modified policy iteration and its application to the game of Tetris
Modified policy iteration (MPI) is a dynamic programming (DP) algorithm that contains the two celebrated policy and value iteration methods. Despite its generality, MPI has not been thoroughly studied, especially its approximation form which is used when the state and/or action spaces are large or infinite. In this paper, we propose three implementations of approximate MPI (AMPI) that are exten...
متن کاملOptimization of Solution Regularized Long-wave Equation by Using Modified Variational Iteration Method
In this paper, a regularized long-wave equation (RLWE) is solved by using the Adomian's decomposition method (ADM) , modified Adomian's decomposition method (MADM), variational iteration method (VIM), modified variational iteration method (MVIM) and homotopy analysis method (HAM). The approximate solution of this equation is calculated in the form of series which its components are computed by ...
متن کاملNon-Stationary Approximate Modified Policy Iteration
We consider the infinite-horizon γ-discounted optimal control problem formalized by Markov Decision Processes. Running any instance of Modified Policy Iteration—a family of algorithms that can interpolate between Value and Policy Iteration—with an error at each iteration is known to lead to stationary policies that are at least 2γ (1−γ)2 -optimal. Variations of Value and Policy Iteration, that ...
متن کاملSome New Existence, Uniqueness and Convergence Results for Fractional Volterra-Fredholm Integro-Differential Equations
This paper demonstrates a study on some significant latest innovations in the approximated techniques to find the approximate solutions of Caputo fractional Volterra-Fredholm integro-differential equations. To this aim, the study uses the modified Adomian decomposition method (MADM) and the modified variational iteration method (MVIM). A wider applicability of these techniques are based on thei...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1205.3054 شماره
صفحات -
تاریخ انتشار 2012