Search results for: action value function

Number of results: 2342819

2013
Stefan Elfwing Eiji Uchibe Kenji Doya

Free-energy based reinforcement learning (FERL) was proposed for learning in high-dimensional state- and action spaces, which cannot be handled by standard function approximation methods. In this study, we propose a scaled version of free-energy based reinforcement learning to achieve more robust and more efficient learning performance. The action-value function is approximated by the negative ...

2015
Akiyoshi Shioura Natalia V. Shakhlevich Vitaly A. Strusevich

We study scheduling problems with controllable processing times on parallel machines, in which the decision-maker selects an actual processing time for each job from a given interval. The chosen values of the processing times must be such that each job can be preemptively scheduled within a given time interval, and it is required to minimize a certain cost function that depends on the chosen ti...

2013
Matthieu Geist Edouard Klein Bilal Piot Yann Guermeur Olivier Pietquin

Inverse reinforcement learning (IRL) aims at estimating an unknown reward function optimized by some expert agent from interactions between this expert and the system to be controlled. One of its major application fields is imitation learning, where the goal is to imitate the expert, possibly in situations not encountered before. A classic and simple way to handle this problem is to see it as a...

Introduction: Audit is an organized and documented process in which qualified and trained auditors evaluate the laboratories using tools such as checklists. This study was designed and conducted to survey the audit process in medical laboratories of Hamadan province. Methods and Materials: This study was cross-sectional and retrospective research. Laboratories of Hamadan province were studied ...

Journal: :Fuzzy Sets and Systems 2010
Vali Derhami Vahid Johari Majd Majid Nili Ahmadabadi

This paper offers a fuzzy balance management scheme between exploration and exploitation, which can be implemented in any critic-only fuzzy reinforcement learning method. The paper, however, focuses on a newly developed continuous reinforcement learning method, called fuzzy Sarsa learning (FSL) due to its advantages. Establishing balance greatly depends on the accuracy of action value function ...

2006
Branislav Kveton Milos Hauskrecht

Markov decision processes (MDPs) with discrete and continuous state and action components can be solved efficiently by hybrid approximate linear programming (HALP). The main idea of the approach is to approximate the optimal value function by a set of basis functions and optimize their weights by linear programming. It is known that the solution to this convex optimization problem minimizes the...

2017
Eric S. Hall

A generally accepted value for the Radiation Amplification Factor (RAF), with respect to the erythemal action spectrum for sunburn of human skin, is −1.1, indicating that a 1.0% increase in stratospheric ozone leads to a 1.1% decrease in the biologically damaging UV radiation in the erythemal action spectrum reaching the Earth. The RAF is used to quantify the non-linear change in the biological...

Journal: :J. Artif. Intell. Res. 2014
S. W. Carden

This paper presents a reinforcement learning algorithm for solving infinite horizon Markov Decision Processes under the expected total discounted reward criterion when both the state and action spaces are continuous. This algorithm is based on Watkins’ Q-learning, but uses Nadaraya-Watson kernel smoothing to generalize knowledge to unvisited states. As expected, continuity conditions must be im...
