Search results for: policy space

Number of results: 747,131

2012
Riad Akrour, Marc Schoenauer, Michèle Sebag

This work tackles in-situ robotics: the goal is to learn a policy while the robot operates in the real world, with neither ground truth nor rewards. The proposed approach is based on preference-based policy learning: iteratively, the robot demonstrates a few policies, is informed of the expert’s preferences about the demonstrated policies, constructs a utility function compatible with all exper...
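
A minimal sketch of the iterative preference loop described above, with a simulated expert and a linear utility model; the helpers `demonstrate` and `ask_expert_preference` are toy stand-ins, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4                                    # toy behaviour-feature dimension
TRUE_UTILITY = rng.normal(size=DIM)        # stands in for the hidden expert taste

def demonstrate(theta):
    """Stand-in for running the policy on the robot: behaviour features
    are just a noisy copy of the policy parameters."""
    return theta + 0.05 * rng.normal(size=DIM)

def ask_expert_preference(feat_a, feat_b):
    """Simulated expert: prefers the demonstration with higher true utility."""
    return TRUE_UTILITY @ feat_a > TRUE_UTILITY @ feat_b

def preference_based_policy_learning(n_iterations=50):
    w = np.zeros(DIM)                      # learned linear utility weights
    best_theta = rng.normal(size=DIM)
    best_feat = demonstrate(best_theta)
    for _ in range(n_iterations):
        cand_theta = best_theta + 0.2 * rng.normal(size=DIM)
        cand_feat = demonstrate(cand_theta)
        # Perceptron-style update: make the learned utility rank the
        # preferred demonstration above the other one.
        diff = cand_feat - best_feat
        w += diff if ask_expert_preference(cand_feat, best_feat) else -diff
        # Keep whichever policy the current learned utility scores higher.
        if w @ cand_feat > w @ best_feat:
            best_theta, best_feat = cand_theta, cand_feat
    return best_theta, w

theta, w = preference_based_policy_learning()
```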

Journal: CoRR, 2013
Bruno Scherrer, Matthieu Geist

Local Policy Search is a popular reinforcement learning approach for handling large state spaces. Formally, it searches locally in a parameterized policy space in order to maximize the associated value function averaged over some predefined distribution. It is commonly believed that the best one can hope for, in general, from such an approach is a local optimum of this criterion. In t...
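
In the notation usually used for this setting, the criterion can be written as follows, where ν is the predefined state distribution and v_{π_θ} the value function of the parameterized policy π_θ (the notation is assumed here, not taken from the abstract):

```latex
J(\theta) \;=\; \mathbb{E}_{s \sim \nu}\!\left[\, v_{\pi_\theta}(s) \,\right],
\qquad
\theta^{\star} \in \arg\max_{\theta} J(\theta)
```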

2006
Nicholas Nguyen

We present a type and effect system for statically determining whether concurrent programs in a simple functional language adhere to a strict access control policy. Policy states are represented by automata states and are tracked, statically, by the type and effect system. We ensure that, per thread, all function calls are, independently, in accordance with policy with respect to the current st...
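
A dynamic analogue of the policy-as-automaton idea, sketched in Python: the access-control policy is a finite automaton whose state advances on every operation, and an operation with no transition from the current state violates the policy. The paper's actual mechanism is a static type and effect system; this runtime check is only an illustration, and the file-access policy below is hypothetical:

```python
class PolicyAutomaton:
    """Access-control policy as a finite automaton over operations."""

    def __init__(self, initial, transitions):
        # transitions: dict mapping (state, operation) -> next state
        self.state = initial
        self.transitions = transitions

    def check(self, operation):
        """Advance the policy state; raise if the operation is not allowed."""
        key = (self.state, operation)
        if key not in self.transitions:
            raise PermissionError(
                f"operation {operation!r} not permitted in state {self.state!r}")
        self.state = self.transitions[key]

# Example policy: a file must be opened before it is read, and cannot be
# read after it has been closed.
policy = PolicyAutomaton(
    initial="closed",
    transitions={
        ("closed", "open"): "opened",
        ("opened", "read"): "opened",
        ("opened", "close"): "closed",
    },
)
policy.check("open")
policy.check("read")
policy.check("close")
```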

1997
Sivarama P. Dandamudi, Thyagaraj Thanalapati

Processor scheduling policies for distributed-memory systems can be divided into space-sharing or time-sharing policies. In space sharing, the set of processors in the system is partitioned and each partition is assigned for the exclusive use of a job. In time-sharing policies, on the other hand, none of the processors is given exclusively to jobs; instead, several jobs share the processors (for ...
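
A toy illustration of the two families of policies, assuming a fixed processor count and named jobs (the quantum and job set are illustrative, not from the paper):

```python
def space_sharing(jobs, n_processors):
    """Partition the processors: each job gets an exclusive, equal share."""
    share = n_processors // len(jobs)
    return {job: share for job in jobs}

def time_sharing(jobs, n_processors, quantum_ms=100, rounds=3):
    """All jobs share all processors: each job gets the whole machine for
    one quantum per round (round-robin over time instead of processors)."""
    schedule = []
    for r in range(rounds):
        for i, job in enumerate(jobs):
            start = (r * len(jobs) + i) * quantum_ms
            schedule.append((start, job, n_processors))
    return schedule

jobs = ["A", "B", "C", "D"]
print(space_sharing(jobs, 16))    # {'A': 4, 'B': 4, 'C': 4, 'D': 4}
print(time_sharing(jobs, 16)[:4]) # first round of 100 ms quanta
```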

2014
Jared Glover, Charlotte Zhu

We present a ping-pong-playing robot that learns to improve its swings with human advice. Our method learns a reward function over the joint space of task and policy parameters T × P, so the robot can explore policy space more intelligently, in a way that trades off exploration vs. exploitation to maximize the total cumulative reward over time. Multimodal stochastic policies can also easily be le...
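
One standard way to trade off exploration and exploitation over a discretised joint space T × P is an upper-confidence-bound rule; the sketch below is a generic UCB bandit over (task, policy) cells with a random stand-in reward, not the paper's learned reward model:

```python
import math
import random

def ucb_select(counts, means, total, c=1.4):
    """Pick the (task, policy) cell maximising mean reward + exploration bonus."""
    best, best_score = None, -float("inf")
    for cell, n in counts.items():
        if n == 0:
            return cell                      # try every cell at least once
        score = means[cell] + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = cell, score
    return best

cells = [(t, p) for t in range(3) for p in range(4)]   # discretised T x P grid
counts = {cell: 0 for cell in cells}
means = {cell: 0.0 for cell in cells}
for step in range(1, 201):
    cell = ucb_select(counts, means, step)
    reward = random.random()                 # stand-in for the reward of one swing
    counts[cell] += 1
    means[cell] += (reward - means[cell]) / counts[cell]
```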

2008
Edward J. López

This note presents data on Congressional fiscal policy by party and chamber, using the National Taxpayers Union annual vote index for 1979-2002. NTU scores are presented with and without adjusting for interchamber and intertemporal movements of the policy space over which the scores are calculated. Results indicate that the parties and chambers are much more stable over time, and exhibit a slig...

2007
Kumar Padmanabh, Rajarshi Roy

In this paper a special kind of buffer management policy is studied in which packets are preempted even when sufficient space is available in the buffer for incoming packets. This is done to reduce congestion for future incoming packets and to improve QoS for certain types of packets. This type of study has been done in the past for ATM-type scenarios. We extend the same to heterogeneous traffic where data ...
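
A rough sketch of the kind of policy described: once buffer occupancy crosses a threshold, an already-queued low-priority packet may be preempted (dropped) even though free space remains, keeping headroom for higher-priority arrivals. The threshold, capacities, and priority values are illustrative only:

```python
class PreemptiveBuffer:
    """Drop a queued low-priority packet early, despite free space,
    to reserve headroom for future high-priority arrivals."""

    def __init__(self, capacity, preempt_threshold=0.7):
        self.capacity = capacity
        self.threshold = int(preempt_threshold * capacity)
        self.queue = []                       # items are (priority, packet)

    def enqueue(self, priority, packet):
        if len(self.queue) >= self.threshold:
            # Preemption: evict one lower-priority packet although the
            # buffer is not yet full.
            for i, (p, _) in enumerate(self.queue):
                if p < priority:
                    del self.queue[i]
                    break
        if len(self.queue) < self.capacity:
            self.queue.append((priority, packet))
            return True
        return False                          # buffer genuinely full: arrival dropped

buf = PreemptiveBuffer(capacity=10)
for k in range(9):
    buf.enqueue(priority=1, packet=f"data-{k}")   # low-priority data packets
buf.enqueue(priority=5, packet="voice-0")          # evicts a data packet first
```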

Journal: CoRR, 2017
Smruti Amarjyoti

The focus of this work is to enumerate the various approaches and algorithms that center around the application of reinforcement learning to robotic manipulation tasks. Earlier methods utilized specialized policy representations and human demonstrations to constrain the policy. Such methods worked well with the continuous state and policy spaces of robots but failed to come up with generalized policies....

Journal: Science, 2001
A. Lawler

A delay would add millions to the cost of the mission, disrupt the intricate space shuttle schedule, and anger researchers eager to do science. But as the long-time program scientist of the Hubble Space Telescope effort, whose faulty mirror had to be corrected in space, Weiler knew that haste might have even more serious consequences. His instincts told him to wait. “I overruled three independe...

2004
Martin C. Martin

The essential dynamics algorithm is a novel policy search algorithm for learning in a class of stochastic Markov decision processes (MDPs) with continuous state and action spaces. We apply it to the control of a 5-degree-of-freedom robot arm atop a Segway base. Movement of the arm causes the base to translate and tilt, which in turn affects the movement of the arm. The state space has 14 dimens...
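
The essential dynamics algorithm itself is not reproduced here; as a generic stand-in for policy search over continuous parameters evaluated by noisy rollouts, the sketch below uses simple hill climbing, with `rollout_return` replaced by a toy quadratic score rather than the real 14-dimensional arm/Segway dynamics:

```python
import numpy as np

rng = np.random.default_rng(1)
TARGET = rng.normal(size=14)               # toy stand-in for good controller gains

def rollout_return(theta, noise=0.1):
    """Stand-in for the average return of a few noisy rollouts of the
    14-dimensional arm/Segway system (here: a noisy quadratic score)."""
    return -np.sum((theta - TARGET) ** 2) + noise * rng.normal()

def hill_climb_policy_search(dim=14, n_iters=500, step=0.1):
    theta = rng.normal(size=dim)
    best = rollout_return(theta)
    for _ in range(n_iters):
        cand = theta + step * rng.normal(size=dim)
        val = rollout_return(cand)
        if val > best:                       # keep the perturbation only if it helps
            theta, best = cand, val
    return theta, best

theta, score = hill_climb_policy_search()
```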

[Chart: number of search results per year]