Search results for: policy space

Number of results: 747,131

2012
Riad Akrour, Marc Schoenauer, Michèle Sebag

This work tackles in-situ robotics: the goal is to learn a policy while the robot operates in the real world, with neither ground truth nor rewards. The proposed approach is based on preference-based policy learning: iteratively, the robot demonstrates a few policies, is informed of the expert’s preferences about the demonstrated policies, constructs a utility function compatible with all exper...
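
A minimal sketch of the iterative preference loop described above, with a simulated expert and a linear utility model; the helpers `demonstrate` and `ask_expert_preference` are toy stand-ins, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4                                    # toy behaviour-feature dimension
TRUE_UTILITY = rng.normal(size=DIM)        # stands in for the hidden expert taste

def demonstrate(theta):
    """Stand-in for running the policy on the robot: behaviour features
    are just a noisy copy of the policy parameters."""
    return theta + 0.05 * rng.normal(size=DIM)

def ask_expert_preference(feat_a, feat_b):
    """Simulated expert: prefers the demonstration with higher true utility."""
    return TRUE_UTILITY @ feat_a > TRUE_UTILITY @ feat_b

def preference_based_policy_learning(n_iterations=50):
    w = np.zeros(DIM)                      # learned linear utility weights
    best_theta = rng.normal(size=DIM)
    best_feat = demonstrate(best_theta)
    for _ in range(n_iterations):
        cand_theta = best_theta + 0.2 * rng.normal(size=DIM)
        cand_feat = demonstrate(cand_theta)
        # Perceptron-style update: make the learned utility rank the
        # preferred demonstration above the other one.
        diff = cand_feat - best_feat
        w += diff if ask_expert_preference(cand_feat, best_feat) else -diff
        # Keep whichever policy the current learned utility scores higher.
        if w @ cand_feat > w @ best_feat:
            best_theta, best_feat = cand_theta, cand_feat
    return best_theta, w

theta, w = preference_based_policy_learning()
```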

Journal: CoRR, 2013
Bruno Scherrer, Matthieu Geist

Local Policy Search is a popular reinforcement learning approach for handling large state spaces. Formally, it searches locally in a parameterized policy space in order to maximize the associated value function averaged over some predefined distribution. It is commonly believed that the best one can hope for, in general, from such an approach is a local optimum of this criterion. In t...
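
In the notation usually used for this setting, the criterion can be written as follows, where ν is the predefined state distribution and v_{π_θ} the value function of the parameterized policy π_θ (the notation is assumed here, not taken from the abstract):

```latex
J(\theta) \;=\; \mathbb{E}_{s \sim \nu}\!\left[\, v_{\pi_\theta}(s) \,\right],
\qquad
\theta^{\star} \in \arg\max_{\theta} J(\theta)
```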

2006
Nicholas Nguyen

We present a type and effect system for statically determining whether concurrent programs in a simple functional language adhere to a strict access control policy. Policy states are represented by automata states and are tracked, statically, by the type and effect system. We ensure that, per thread, all function calls are, independently, in accordance with policy with respect to the current st...
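
A dynamic analogue of the policy-as-automaton idea, sketched in Python: the access-control policy is a finite automaton whose state advances on every operation, and an operation with no transition from the current state violates the policy. The paper's actual mechanism is a static type and effect system; this runtime check is only an illustration, and the file-access policy below is hypothetical:

```python
class PolicyAutomaton:
    """Access-control policy as a finite automaton over operations."""

    def __init__(self, initial, transitions):
        # transitions: dict mapping (state, operation) -> next state
        self.state = initial
        self.transitions = transitions

    def check(self, operation):
        """Advance the policy state; raise if the operation is not allowed."""
        key = (self.state, operation)
        if key not in self.transitions:
            raise PermissionError(
                f"operation {operation!r} not permitted in state {self.state!r}")
        self.state = self.transitions[key]

# Example policy: a file must be opened before it is read, and cannot be
# read after it has been closed.
policy = PolicyAutomaton(
    initial="closed",
    transitions={
        ("closed", "open"): "opened",
        ("opened", "read"): "opened",
        ("opened", "close"): "closed",
    },
)
policy.check("open")
policy.check("read")
policy.check("close")
```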

1997
Sivarama P. Dandamudi, Thyagaraj Thanalapati

Processor scheduling policies for distributed-memory systems can be divided into space-sharing or time-sharing policies. In space sharing, the set of processors in the system is partitioned and each partition is assigned for the exclusive use of a job. In time-sharing policies, on the other hand, none of the processors is given exclusively to jobs; instead, several jobs share the processors (for ...
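
A toy illustration of the two families of policies, assuming a fixed processor count and named jobs (the quantum and job set are illustrative, not from the paper):

```python
def space_sharing(jobs, n_processors):
    """Partition the processors: each job gets an exclusive, equal share."""
    share = n_processors // len(jobs)
    return {job: share for job in jobs}

def time_sharing(jobs, n_processors, quantum_ms=100, rounds=3):
    """All jobs share all processors: each job gets the whole machine for
    one quantum per round (round-robin over time instead of processors)."""
    schedule = []
    for r in range(rounds):
        for i, job in enumerate(jobs):
            start = (r * len(jobs) + i) * quantum_ms
            schedule.append((start, job, n_processors))
    return schedule

jobs = ["A", "B", "C", "D"]
print(space_sharing(jobs, 16))    # {'A': 4, 'B': 4, 'C': 4, 'D': 4}
print(time_sharing(jobs, 16)[:4]) # first round of 100 ms quanta
```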

2014
Jared Glover, Charlotte Zhu

We present a ping-pong-playing robot that learns to improve its swings with human advice. Our method learns a reward function over the joint space of task and policy parameters T × P, so the robot can explore policy space more intelligently, in a way that trades off exploration vs. exploitation to maximize the total cumulative reward over time. Multimodal stochastic policies can also easily be le...
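
One standard way to trade off exploration and exploitation over a discretised joint space T × P is an upper-confidence-bound rule; the sketch below is a generic UCB bandit over (task, policy) cells with a random stand-in reward, not the paper's learned reward model:

```python
import math
import random

def ucb_select(counts, means, total, c=1.4):
    """Pick the (task, policy) cell maximising mean reward + exploration bonus."""
    best, best_score = None, -float("inf")
    for cell, n in counts.items():
        if n == 0:
            return cell                      # try every cell at least once
        score = means[cell] + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = cell, score
    return best

cells = [(t, p) for t in range(3) for p in range(4)]   # discretised T x P grid
counts = {cell: 0 for cell in cells}
means = {cell: 0.0 for cell in cells}
for step in range(1, 201):
    cell = ucb_select(counts, means, step)
    reward = random.random()                 # stand-in for the reward of one swing
    counts[cell] += 1
    means[cell] += (reward - means[cell]) / counts[cell]
```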

2008
Edward J. López

This note presents data on Congressional fiscal policy by party and chamber, using the National Taxpayers Union annual vote index for 1979-2002. NTU scores are presented with and without adjusting for interchamber and intertemporal movements of the policy space over which the scores are calculated. Results indicate that the parties and chambers are much more stable over time, and exhibit a slig...

2007
Kumar Padmanabh, Rajarshi Roy

In this paper a special kind of buffer management policy is studied in which packets are preempted even when sufficient space is available in the buffer for incoming packets. This is done to reduce congestion for future incoming packets and to improve QoS for certain types of packets. This type of study has been done in the past for ATM-type scenarios. We extend the same to heterogeneous traffic where data ...
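
A rough sketch of the kind of policy described: once buffer occupancy crosses a threshold, an already-queued low-priority packet may be preempted (dropped) even though free space remains, keeping headroom for higher-priority arrivals. The threshold, capacities, and priority values are illustrative only:

```python
class PreemptiveBuffer:
    """Drop a queued low-priority packet early, despite free space,
    to reserve headroom for future high-priority arrivals."""

    def __init__(self, capacity, preempt_threshold=0.7):
        self.capacity = capacity
        self.threshold = int(preempt_threshold * capacity)
        self.queue = []                       # items are (priority, packet)

    def enqueue(self, priority, packet):
        if len(self.queue) >= self.threshold:
            # Preemption: evict one lower-priority packet although the
            # buffer is not yet full.
            for i, (p, _) in enumerate(self.queue):
                if p < priority:
                    del self.queue[i]
                    break
        if len(self.queue) < self.capacity:
            self.queue.append((priority, packet))
            return True
        return False                          # buffer genuinely full: arrival dropped

buf = PreemptiveBuffer(capacity=10)
for k in range(9):
    buf.enqueue(priority=1, packet=f"data-{k}")   # low-priority data packets
buf.enqueue(priority=5, packet="voice-0")          # evicts a data packet first
```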

Journal: CoRR, 2017
Smruti Amarjyoti

The focus of this work is to enumerate the various approaches and algorithms that center around the application of reinforcement learning to robotic manipulation tasks. Earlier methods utilized specialized policy representations and human demonstrations to constrain the policy. Such methods worked well with the continuous state and policy spaces of robots but failed to come up with generalized policies....

Journal: Science, 2001
A. Lawler

A delay would add millions to the cost of the mission, disrupt the intricate space shuttle schedule, and anger researchers eager to do science. But as the long-time program scientist of the Hubble Space Telescope effort, whose faulty mirror had to be corrected in space, Weiler knew that haste might have even more serious consequences. His instincts told him to wait. “I overruled three independe...

2004
Martin C. Martin

The essential dynamics algorithm is a novel policy search algorithm for learning in a class of stochastic Markov decision processes (MDPs) with continuous state and action spaces. We apply it to the control of a 5-degree-of-freedom robot arm atop a Segway base. Movement of the arm causes the base to translate and tilt, which in turn affects the movement of the arm. The state space has 14 dimens...
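
The essential dynamics algorithm itself is not reproduced here; as a generic stand-in for policy search over continuous parameters evaluated by noisy rollouts, the sketch below uses simple hill climbing, with `rollout_return` replaced by a toy quadratic score rather than the real 14-dimensional arm/Segway dynamics:

```python
import numpy as np

rng = np.random.default_rng(1)
TARGET = rng.normal(size=14)               # toy stand-in for good controller gains

def rollout_return(theta, noise=0.1):
    """Stand-in for the average return of a few noisy rollouts of the
    14-dimensional arm/Segway system (here: a noisy quadratic score)."""
    return -np.sum((theta - TARGET) ** 2) + noise * rng.normal()

def hill_climb_policy_search(dim=14, n_iters=500, step=0.1):
    theta = rng.normal(size=dim)
    best = rollout_return(theta)
    for _ in range(n_iters):
        cand = theta + step * rng.normal(size=dim)
        val = rollout_return(cand)
        if val > best:                       # keep the perturbation only if it helps
            theta, best = cand, val
    return theta, best

theta, score = hill_climb_policy_search()
```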

[Chart: number of search results per year]