smoothed minima

State-action value functions (i.e., Q-values) are ubiquitous in reinforcement learning (RL), giving rise to popular algorithms such as SARSA and Q-learning. We propose a new notion of action value defined by a Gaussian smoothed version of the expected Q-value. We show that such smoothed Q-values still satisfy a Bellman equation, making them learnable from experience sampled from an environment....

Smoothed Dual Embedding Control

Journal: CoRR, 2017

Bo Dai, Albert Shaw, Lihong Li, Lin Xiao, Niao He, Jianshu Chen, Le Song

We revisit the Bellman optimality equation with Nesterov’s smoothing technique and provide a unique saddle-point optimization perspective of the policy optimization problem in reinforcement learning based on Fenchel duality. A new reinforcement learning algorithm, called Smoothed Dual Embedding Control or SDEC, is derived to solve the saddle-point reformulation with arbitrary learnable function...

Adaptive Smoothed Aggregation (αSA)

Journal: SIAM J. Scientific Computing, 2004

Marian Brezina, Robert D. Falgout, Scott P. MacLachlan, Thomas A. Manteuffel, Stephen F. McCormick, John W. Ruge

Substantial effort has been focused over the last two decades on developing multilevel iterative methods capable of solving the large linear systems encountered in engineering practice. These systems often arise from discretizing partial differential equations over unstructured meshes, and the particular parameters or geometry of the physical problem being discretized may be unavailable to the ...

Incompressible smoothed particle hydrodynamics

Journal: J. Comput. Physics, 2007

Marco Ellero, Mar Serrano, Pep Español

We present a smoothed particle hydrodynamic model for incompressible fluids. As opposed to solving a pressure Poisson equation in order to get a divergence-free velocity field, here incompressibility is achieved by requiring as a kinematic constraint that the volume of the fluid particles is constant. We use Lagrangian multipliers to enforce this restriction. These Lagrange multipliers play the...

