Search results for: critic weighting

Number of results: 69016

2003
XIAOQUN LIAO MASOUD GHAFFARI SOUMA M. ALHAJ ALI ERNEST L. HALL

For intelligent robots to accomplish tasks in an unstructured environment, the adaptive critic algorithm has been shown to provide useful approximations or even optimal control policies to non-linear systems. The purpose of this paper is to explore the use of new learning control methods defined as Creative Learning or Creative Control that goes beyond the adaptive critic method for unstructure...

2005
Huizhen Yu

We consider the estimation of the policy gradient in partially observable Markov decision processes (POMDP) with a special class of structured policies that are finite-state controllers. We show that the gradient estimation can be done in the Actor-Critic framework, by making the critic compute a “value” function that does not depend on the states of POMDP. This function is the conditional mean...

1994
Toby Walsh

Inductive theorem provers often diverge. This paper describes a critic which monitors the construction of inductive proofs attempting to identify diverging proof attempts. The critic proposes lemmas and generalizations which hopefully allow the proof to go through without divergence. The critic enables the system SPIKE to prove many theorems completely automatically from the definitions alone.

Journal: :Health and Safety at Work 0
Farideh Golbabaei (F. Golbabaei), Professor of Occupational Health Engineering, Department of Occupational Health, Faculty of Health, Tehran University of Medical Sciences, Tehran, Iran; Leila Heidari (L. Heidari), MSc of HSE, East Tehran Health Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran; Sanaz Ghazi (S. Ghazi), Assistant Professor, Department of Statistics, Faculty of Environment, Islamic Azad University, Science and Research Branch, Tehran, Iran; Karim Jabari (K. Jabari), MSc of Occupational Health, North Tehran Health Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran

Introduction: Preventing occupational accidents and work-related diseases, in the context of sustainable development and increased productivity, is not possible without considering the safety of employees, customers, contractors, and others. Assessing the state of safety management in industry is therefore one of the activities that can reduce these losses. The present study assesses the state of safety management in a home appliance manufacturing company. Methods: This study...

1999
Vijaymohan Konda

We propose and analyze a class of actor-critic algorithms for simulation-based optimization of a Markov decision process over a parameterized family of randomized stationary policies. These are two-time-scale algorithms in which the critic uses TD learning with a linear approximation architecture and the actor is updated in an approximate gradient direction based on information provided by the ...

2005
Jan Peters Sethu Vijayakumar Stefan Schaal

This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari’s natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show that actor improvements with natural p...

2009
Reinaldo A Uribe

4 Actor-Critic Marble Control; 4.1 R-code; 4.2 The critic; 4.3 Unstable actors; 4.4 Trading off stability against...

Journal: :J. Artif. Intell. Res. 1996
Toby Walsh

Inductive theorem provers often diverge. This paper describes a simple critic, a computer program which monitors the construction of inductive proofs attempting to identify diverging proof attempts. Divergence is recognized by means of a "difference matching" procedure. The critic then proposes lemmas and generalizations which "ripple" these differences away so that the proof can go through with...

2009
Gary Greenfield Penousal Machado

We describe an agent based artist-critic simulation. Artist agents use a swarm based evolutionary art system to evolve images that try to match their preferences. Preferred images are submitted to critic agents who then decide, according to their own criteria, which images should be displayed in a public gallery. The purpose of our model is to enable the implementation of a variety of behavio...
