Temporal Difference Learning: A Chemical Process Control Application
Similar resources
Temporal Difference Learning in Continuous Time and Space
A continuous-time, continuous-state version of the temporal difference (TD) algorithm is derived in order to facilitate the application of reinforcement learning to real-world control tasks and neurobiological modeling. An optimal nonlinear feedback control law was also derived using the derivatives of the value function. The performance of the algorithms was tested in a task of swinging up a ...
Analytical Mean Squared Error Curves in Temporal Difference Learning
We have calculated analytical expressions for how the bias and variance of the estimators provided by various temporal difference value estimation algorithms change with offline updates over trials in absorbing Markov chains using lookup table representations. We illustrate classes of learning curve behavior in various chains, and show the manner in which TD is sensitive to the choice of its step-...
Evolutionary Algorithms for Reinforcement
There are two distinct approaches to solving reinforcement learning problems, namely, searching in value function space and searching in policy space. Temporal difference methods and evolutionary algorithms are well-known examples of these approaches. Kaelbling, Littman and Moore recently provided an informative survey of temporal difference methods. This article focuses on the application of evo...
Learning to Achieve Goals
Temporal difference methods solve the temporal credit assignment problem for reinforcement learning. An important subproblem of general reinforcement learning is learning to achieve dynamic goals. Although existing temporal difference methods, such as Q-learning, can be applied to this problem, they do not take advantage of its special structure. This paper presents the DG-learning algorithm, whi...
Structural Measures for Games and Process Control in the Branch Learning Model
Process control problems can be modeled as closed recursive games. Learning strategies for such games is equivalent to the concept of learning infinite recursive branches for recursive trees. We use this branch learning model to measure the difficulty of learning and synthesizing process controllers. We also measure the difference between several process learning criteria, and their difference to co...
Publication date: 1995