Search results for: reactive policies

Number of results: 277752

2001
Christian R. Shelton

Importance sampling has recently become a popular method for computing off-policy Monte Carlo estimates of returns. It has been known that importance sampling ratios can be computed for POMDPs when the sampled and target policies are both reactive (memoryless). We extend that result to show how they can also be efficiently computed for policies with memory state (finite state controllers) witho...
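
As a rough illustration of the memoryless case summarized above: for reactive policies, the per-trajectory importance-sampling ratio is just the product of per-step action-probability ratios, since each policy conditions only on the current observation. The sketch below is a generic illustration under that assumption; the function and variable names are not taken from the paper.

```python
import numpy as np

def reactive_is_ratio(trajectory, target_policy, behavior_policy):
    """Importance-sampling ratio for one trajectory under two reactive
    (memoryless) policies: prod_t pi(a_t | o_t) / mu(a_t | o_t).

    `trajectory` is a list of (observation, action) pairs; each policy
    maps an observation to a distribution over actions (indexable by
    action). Illustrative sketch only, not the paper's code.
    """
    ratio = 1.0
    for obs, act in trajectory:
        ratio *= target_policy(obs)[act] / behavior_policy(obs)[act]
    return ratio

def off_policy_return_estimate(trajectories, returns, target_policy, behavior_policy):
    """Ordinary importance-sampling estimate of the target policy's expected
    return, using trajectories sampled from the behavior policy."""
    weights = np.array([reactive_is_ratio(tau, target_policy, behavior_policy)
                        for tau in trajectories])
    return float(np.mean(weights * np.array(returns)))
```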

Thesis: Ministry of Science, Research and Technology - Semnan University - Faculty of Chemistry, 1393

In the present work, three anionic dyes, reactive yellow 145, reactive blue 19, and reactive red 195, were selected for removal. The removal method used carbon nanotubes modified with Fe3O4 magnetic nanoparticles as the adsorbent. The prepared nanoparticles were characterized by TEM, XRD, and VSM. The prepared magnetic adsorbent disperses well in water and is easily separated from the medium with a magnet. The variables that typically affect the efficiency of the pro...

Journal: Adaptive Behaviour, 1997
Marco Wiering Jürgen Schmidhuber

HQ-learning is a hierarchical extension of Q(λ)-learning designed to solve certain types of partially observable Markov decision problems (POMDPs). HQ automatically decomposes POMDPs into sequences of simpler subtasks that can be solved by memoryless policies learnable by reactive subagents. HQ can solve partially observable mazes with more states than those used in most previous POMDP work.

2013
Byron Boots Dieter Fox

We address the problem of learning a policy directly from expert demonstrations. Typically, this problem is solved with a supervised learning method such as regression or classification to learn a reactive policy. Unfortunately, reactive policies lack the ability to model long-range dependencies, and this omission can result in suboptimal performance. So, we take a different approach. We observe...
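
For context on the supervised-learning baseline this abstract refers to: a reactive policy learned from demonstrations is simply a classifier or regressor from the current observation to the expert's action, so by construction it cannot condition on history, which is the limitation the authors point out. The snippet below is a generic behavior-cloning sketch (classifier choice, file names, and shapes are assumptions of this example, not the authors' method).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Behavior cloning of a reactive (memoryless) policy:
# each training example pairs the current observation with the expert's
# action, so the learned policy ignores everything before the current step.
observations = np.load("expert_observations.npy")   # (N, obs_dim), hypothetical file
actions = np.load("expert_actions.npy")             # (N,), discrete action labels

policy = LogisticRegression(max_iter=1000).fit(observations, actions)

def act(observation):
    """Reactive policy: the action depends only on the current observation."""
    return int(policy.predict(observation.reshape(1, -1))[0])
```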

1994
Andrew H. Fagg David Lotspeich George A. Bekey

Within the field of robotics, much recent attention has been given to control techniques that have been termed reactive or behavior-based. The design of such control systems for even a remotely interesting task is typically a laborious effort, requiring many hours of experimental "tweaking" as the actual behavior of the system is observed by the system designer. In this paper, we present a neur...

2009
Mazlina Abdul Majid Uwe Aickelin

In our research we investigate the output accuracy of discrete event simulation models and agent based simulation models when studying human centric complex systems. In this paper we focus on human reactive behaviour as it is possible in both modelling approaches to implement human reactive behaviour in the model by using standard methods. As a case study we have chosen the retail sector, and h...

Journal: CoRR, 2010
Mazlina Abdul Majid Uwe Aickelin Peer-Olaf Siebers

In our research we investigate the output accuracy of discrete event simulation models and agent based simulation models when studying human centric complex systems. In this paper we focus on human reactive behaviour as it is possible in both modelling approaches to implement human reactive behaviour in the model by using standard methods. As a case study we have chosen the retail sector, and h...

2013
Stephen Jordan William Benson

The quality of life and economies of coastal communities depend, to a great degree, on the ecological integrity of coastal ecosystems. Paradoxically, as more people are drawn to the coasts, these ecosystems and the services they provide are increasingly stressed by development and human use. Employing the coastal Gulf of Mexico as an example, we explore through three case studies how government...

2007
Douglas Aberdeen Jonathan Baxter Peter L. Bartlett

GPOMDP is an algorithm for estimating the gradient of the average reward for arbitrary Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. It applies to purely reactive (memoryless) policies, or policies that generate actions as a function of finite histories of observations. Based on the fact that maintenance of a belief state is sufficient ...
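
For readers unfamiliar with GPOMDP: for a purely reactive, parameterized stochastic policy, the estimator accumulates a discounted eligibility trace of score functions and averages reward-weighted traces along a single trajectory. The sketch below follows that standard description; the environment interface and all function names are assumptions of this example, not code from the paper.

```python
import numpy as np

def gpomdp_gradient(env, policy_sample, policy_grad_logp, theta, beta=0.95, T=10_000):
    """Single-trajectory GPOMDP estimate of the average-reward gradient.

    policy_sample(theta, obs)        -> action sampled from the reactive policy
    policy_grad_logp(theta, obs, a)  -> grad_theta log pi_theta(a | obs)
    beta is the bias/variance trade-off parameter of the algorithm.
    The env.reset()/env.step() interface is an assumption of this sketch.
    """
    z = np.zeros_like(theta)       # eligibility trace of score functions
    delta = np.zeros_like(theta)   # running gradient estimate
    obs = env.reset()
    for t in range(T):
        action = policy_sample(theta, obs)
        z = beta * z + policy_grad_logp(theta, obs, action)  # update trace at o_t
        obs, reward = env.step(action)                       # observe r_{t+1}, o_{t+1}
        delta += (reward * z - delta) / (t + 1)              # running average of r * z
    return delta
```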

2007
Jonathan Dinerstein Parris K. Egbert Dan Ventura

Although many powerful AI and machine learning techniques exist, it remains difficult to quickly create AI for embodied virtual agents that produces visually lifelike behavior. This is important for applications (e.g., games, simulators, interactive displays) where an agent must behave in a manner that appears human-like. We present a novel technique for learning reactive policies that mimic de...
