action value function

Search results for: action value function

Number of results: 2342819

Scaled free-energy based reinforcement learning for robust and efficient learning in high-dimensional state spaces

2013

Stefan Elfwing, Eiji Uchibe, Kenji Doya

Free-energy based reinforcement learning (FERL) was proposed for learning in high-dimensional state- and action spaces, which cannot be handled by standard function approximation methods. In this study, we propose a scaled version of free-energy based reinforcement learning to achieve more robust and more efficient learning performance. The action-value function is approximated by the negative ...

Structural and Functional Transformations of the Bazaar in the Contemporary Era. Case Study: The Zanjan Bazaar, 1336-1386

Thesis: Ministry of Science, Research and Technology - University of Zanjan 1387

Fatemeh Lotfi, Mohsen Ahadnejad

No abstract.

Scheduling Imprecise Computation Tasks on Parallel Machines to Minimize Linear and Non-Linear Error Penalties: Reviews, Links and Improvements

2015

Akiyoshi Shioura, Natalia V. Shakhlevich, Vitaly A. Strusevich

We study scheduling problems with controllable processing times on parallel machines, in which the decision-maker selects an actual processing time for each job from a given interval. The chosen values of the processing times must be such that each job can be preemptively scheduled within a given time interval, and it is required to minimize a certain cost function that depends on the chosen ti...

Around Inverse Reinforcement Learning and Score-based Classification

2013

Matthieu Geist, Edouard Klein, Bilal Piot, Yann Guermeur, Olivier Pietquin

Inverse reinforcement learning (IRL) aims at estimating an unknown reward function optimized by some expert agent from interactions between this expert and the system to be controlled. One of its major application fields is imitation learning, where the goal is to imitate the expert, possibly in situations not encountered before. A classic and simple way to handle this problem is to see it as a...

Survey on Audit Process of Medical Laboratories in Hamadan Province

Journal: Paramedical Sciences and Military Health 2022

Fatemeh Amiri, Mohaddese Biglari, Ghazaleh Massoum, Sima Tavakoli

Introduction: Audit is an organized and documented process in which qualified and trained auditors evaluate laboratories using tools such as checklists. This study was designed and conducted to survey the audit process in medical laboratories of Hamadan province. Methods and Materials: This was a cross-sectional, retrospective study. Laboratories of Hamadan province were studied ...

Exploration and exploitation balance management in fuzzy reinforcement learning

Journal: Fuzzy Sets and Systems 2010

Vali Derhami, Vahid Johari Majd, Majid Nili Ahmadabadi

This paper offers a fuzzy balance management scheme between exploration and exploitation, which can be implemented in any critic-only fuzzy reinforcement learning method. The paper, however, focuses on a newly developed continuous reinforcement learning method, called fuzzy Sarsa learning (FSL) due to its advantages. Establishing balance greatly depends on the accuracy of action value function ...

On the Smoothness of Linear Value Function Approximations

2006

Branislav Kveton, Milos Hauskrecht

Markov decision processes (MDPs) with discrete and continuous state and action components can be solved efficiently by hybrid approximate linear programming (HALP). The main idea of the approach is to approximate the optimal value function by a set of basis functions and optimize their weights by linear programming. It is known that the solution to this convex optimization problem minimizes the...
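As a rough illustration of the idea in this abstract, the sketch below represents a value function as a weighted sum of basis functions and checks the Bellman inequality that approximate linear programming typically imposes as its constraints. It assumes a small, fully discrete MDP rather than the hybrid discrete-continuous setting of the paper, and every name in it (`v_approx`, `alp_feasible`, `reward`, `trans`) is hypothetical, not taken from the paper.

```python
def v_approx(x, weights, basis):
    """Linear value function approximation: V_w(x) = sum_i w_i * f_i(x)."""
    return sum(w * f(x) for w, f in zip(weights, basis))


def alp_feasible(weights, basis, states, actions, reward, trans,
                 gamma=0.9, tol=1e-9):
    """Check V_w(x) >= R(x, a) + gamma * sum_x' P(x'|x, a) * V_w(x')
    for every state-action pair (assumed: finite states and actions).

    reward(x, a) returns a float; trans(x, a) returns a dict that maps
    each next state to its probability. Both are hypothetical helpers.
    """
    for x in states:
        for a in actions:
            expected_next = sum(
                p * v_approx(x2, weights, basis)
                for x2, p in trans(x, a).items()
            )
            backup = reward(x, a) + gamma * expected_next
            if v_approx(x, weights, basis) < backup - tol:
                return False
    return True


# Toy example: two states with self-loops, one action, reward 1 per step.
states = [0, 1]
actions = [0]
reward = lambda x, a: 1.0
trans = lambda x, a: {x: 1.0}
basis = [lambda x: 1.0]  # a single constant basis function
```

In this toy problem the exact value is 1 / (1 - 0.9) = 10 in both states, so the weight 10.0 satisfies the constraints while a smaller weight such as 5.0 does not. An LP solver would search over such feasible weights for the best objective value; that step is omitted here.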

Comparison of Five Modeling Approaches to Quantify and Estimate the Effect of Clouds on the Radiation Amplification Factor (RAF) for Solar Ultraviolet Radiation

2017

Eric S. Hall

A generally accepted value for the Radiation Amplification Factor (RAF), with respect to the erythemal action spectrum for sunburn of human skin, is −1.1, indicating that a 1.0% increase in stratospheric ozone leads to a 1.1% decrease in the biologically damaging UV radiation in the erythemal action spectrum reaching the Earth. The RAF is used to quantify the non-linear change in the biological...

Predictive value of local and core laboratory echocardiographic assessment of cardiac function in patients with chronic stable angina: The ACTION study

Journal: European Journal of Echocardiography 2007

Convergence of a Q-learning Variant for Continuous States and Actions

Journal: J. Artif. Intell. Res. 2014

S. W. Carden

This paper presents a reinforcement learning algorithm for solving infinite horizon Markov Decision Processes under the expected total discounted reward criterion when both the state and action spaces are continuous. This algorithm is based on Watkins’ Q-learning, but uses Nadaraya-Watson kernel smoothing to generalize knowledge to unvisited states. As expected, continuity conditions must be im...
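To make the kernel smoothing step mentioned in this abstract concrete, here is a minimal sketch of a Nadaraya-Watson estimate of an action value at an unvisited (state, action) point. It is not the paper's algorithm: the Gaussian kernel, the Euclidean distance over scalar state and action, the bandwidth value, and the names `gaussian_kernel` and `nw_q_estimate` are all assumptions made for illustration.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratio."""
    return math.exp(-0.5 * u * u)


def nw_q_estimate(query, samples, bandwidth=0.5):
    """Nadaraya-Watson estimate of Q at query = (state, action).

    samples is a list of ((state, action), q_value) pairs collected at
    visited points. The estimate is a kernel-weighted average of the
    stored q_values, with weights that decay with distance to the query.
    """
    qs, qa = query
    num = 0.0
    den = 0.0
    for (s, a), q in samples:
        dist = math.hypot(qs - s, qa - a)
        w = gaussian_kernel(dist / bandwidth)
        num += w * q
        den += w
    if den == 0.0:
        # All weights underflowed to zero: no nearby data (assumed default).
        return 0.0
    return num / den


# Two visited points with stored action values 1.0 and 3.0.
samples = [((0.0, 0.0), 1.0), ((1.0, 0.0), 3.0)]
```

A query halfway between the two visited points weights them equally and returns their mean, while a query that coincides with a visited point returns a value pulled toward that point's stored estimate. The smaller the bandwidth, the stronger that pull.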
