Search results for: reactive policies
Number of results: 277752
Importance sampling has recently become a popular method for computing off-policy Monte Carlo estimates of returns. It has been known that importance sampling ratios can be computed for POMDPs when the sampled and target policies are both reactive (memoryless). We extend that result to show how they can also be efficiently computed for policies with memory state (finite state controllers) witho...
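For the reactive (memoryless) case mentioned in this abstract, the importance sampling ratio of a trajectory is the product, over time steps, of the target policy's action probability divided by the sampling policy's action probability, each conditioned only on the current observation. The following is a minimal sketch of that idea, not the paper's method for policies with memory. The names `importance_ratio`, `off_policy_return_estimate`, and the `Policy` type are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

# Assumption for illustration: a reactive policy is a function
# (observation, action) -> probability of taking that action.
Policy = Callable[[int, int], float]
Step = Tuple[int, int]  # (observation, action)


def importance_ratio(trajectory: Sequence[Step],
                     target: Policy,
                     behavior: Policy) -> float:
    """Product of per-step probability ratios for memoryless policies."""
    ratio = 1.0
    for obs, act in trajectory:
        b = behavior(obs, act)
        if b == 0.0:
            # The sampling policy could not have produced this action.
            raise ValueError("behavior policy assigns zero probability")
        ratio *= target(obs, act) / b
    return ratio


def off_policy_return_estimate(episodes: Sequence[Tuple[Sequence[Step], float]],
                               target: Policy,
                               behavior: Policy) -> float:
    """Ordinary importance sampling: mean of (ratio * observed return)."""
    if not episodes:
        raise ValueError("need at least one episode")
    total = 0.0
    for trajectory, ret in episodes:
        total += importance_ratio(trajectory, target, behavior) * ret
    return total / len(episodes)
```

Because both policies depend only on the current observation, the unknown transition and observation probabilities of the POMDP cancel in the ratio, which is why no model of the environment appears in the sketch.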
In the present work, three anionic dyes, Reactive Yellow 145, Reactive Blue 19, and Reactive Red 195, were selected for removal. The removal method used carbon nanotubes modified with magnetic Fe3O4 nanoparticles as the adsorbent. The prepared nanoparticles were characterized by TEM, XRD, and VSM. The prepared magnetic adsorbent dissolves well in water and is easily separated from the medium with a magnet. The variables that usually affect the efficiency of ...
HQ-learning is a hierarchical extension of Q(λ)-learning designed to solve certain types of partially observable Markov decision problems (POMDPs). HQ automatically decomposes POMDPs into sequences of simpler subtasks that can be solved by memoryless policies learnable by reactive subagents. HQ can solve partially observable mazes with more states than those used in most previous POMDP work.
We address the problem of learning a policy directly from expert demonstrations. Typically, this problem is solved with a supervised learning method such as regression or classification to learn a reactive policy. Unfortunately, reactive policies lack the ability to model long-range dependencies, and this omission can result in suboptimal performance. So, we take a different approach. We observe...
Within the field of robotics, much recent attention has been given to control techniques that have been termed reactive or behavior-based. The design of such control systems for even a remotely interesting task is typically a laborious effort, requiring many hours of experimental "tweaking" as the actual behavior of the system is observed by the system designer. In this paper, we present a neur...
In our research we investigate the output accuracy of discrete event simulation models and agent based simulation models when studying human centric complex systems. In this paper we focus on human reactive behaviour as it is possible in both modelling approaches to implement human reactive behaviour in the model by using standard methods. As a case study we have chosen the retail sector, and h...
Comparing Simulation Output Accuracy of Discrete Event and Agent Based Models: A Quantitative Approach
The quality of life and economies of coastal communities depend, to a great degree, on the ecological integrity of coastal ecosystems. Paradoxically, as more people are drawn to the coasts, these ecosystems and the services they provide are increasingly stressed by development and human use. Employing the coastal Gulf of Mexico as an example, we explore through three case studies how government...
GPOMDP is an algorithm for estimating the gradient of the average reward for arbitrary Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies. It applies to purely reactive (memoryless) policies, or policies that generate actions as a function of finite histories of observations. Based on the fact that maintenance of a belief state is sufficient ...
Although many powerful AI and machine learning techniques exist, it remains difficult to quickly create AI for embodied virtual agents that produces visually lifelike behavior. This is important for applications (e.g., games, simulators, interactive displays) where an agent must behave in a manner that appears human-like. We present a novel technique for learning reactive policies that mimic de...