Iterative integral versus dynamic programming
Related resources
Convergence of Stochastic Iterative Dynamic Programming Algorithms
Increasing attention has recently been paid to algorithms based on dynamic programming (DP) due to the suitability of DP for learning problems involving control. In stochastic environments where the system being controlled is only incompletely known, however, a unifying theoretical account of these methods has been missing. In this paper we relate DP-based learning algorithms to the powerful te...
A new keyword spotting approach based on iterative dynamic programming
This paper addresses the problem of detecting keywords in unconstrained speech without explicit modeling of nonkeyword segments. The proposed algorithm is based on recent developments in confidence measures using local posterior probabilities, and searches for the segment maximizing the average observation posterior along the most likely path in the hypothesized keyword model. As known, this ...
Dominance Rules for the Choquet Integral in Multiobjective Dynamic Programming
Multiobjective Dynamic Programming (MODP) is a general problem solving method used to determine the set of Pareto-optimal solutions in optimization problems involving discrete decision variables and multiple objectives. It applies to combinatorial problems in which Pareto-optimality of a solution extends to all its sub-solutions (Bellman principle). In this paper we focus on the determination o...
Analysis of an Iterative Dynamic Programming Approach To
In this paper, we consider a novel Bayesian approach to 2-D phase unwrapping. The phase is unwrapped according to a maximum a posteriori (MAP) rule, where the estimate is made through a form of 2-D dynamic programming. The approach uses structured iterated conditional modes to achieve good performance without examining a large number of states in the dynamic system. We analyze the performance o...
On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms, including the TD algorithm of Sutton and the Q-learning algorithm of Watkins, can be motivated heuristically as approximations to dynamic programming (DP). In this paper we provide a rigorous proof of convergence of these DP-ba...
Journal
Journal title: Computers & Mathematics with Applications
Year: 1991
ISSN: 0898-1221
DOI: 10.1016/0898-1221(91)90104-c