Learning Topological Maps from Sequential Observation and Action Data under Partially Observable Environment

Authors

  • Takehisa Yairi
  • Masahito Togami
  • Koichi Hori
Abstract

A map is an abstract internal representation of an environment for a mobile robot, and learning it autonomously is one of the most fundamental problems in intelligent robotics and artificial intelligence. In this paper, we propose a topological map learning method for mobile robots that constructs a POMDP-based discrete state transition model from time-series data of observations and actions. The key idea is to find the set of states, or nodes of the map, incrementally so as to minimize three types of entropies, or uncertainties, of the map: “what observations are obtained”, “what actions are available”, and “what state transitions are expected”. We show that this method effectively recovers the topological structure of the state transition model.
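To make the three uncertainties concrete, here is a minimal Python sketch (not the authors' implementation) that scores a candidate assignment of time steps to map nodes by the sum of three empirical conditional entropies: over observations, over the actions taken, and over the successor node given the current node and action. The function names, the use of action-occurrence frequencies as a proxy for action availability, and the unweighted sum of the three terms are assumptions made for illustration.

```python
# Minimal sketch: scoring a candidate labelling of time steps with discrete
# map nodes by the three entropies the abstract names, roughly
# H(O|S), H(A|S) and H(S'|S, A). Names and equal weighting are assumptions.
from collections import Counter, defaultdict
from math import log2

def entropy(counter):
    """Shannon entropy (bits) of an empirical distribution given as counts."""
    total = sum(counter.values())
    return -sum(c / total * log2(c / total) for c in counter.values() if c)

def conditional_entropy(pairs):
    """H(Y | X) estimated from a list of (x, y) samples."""
    by_x = defaultdict(Counter)
    for x, y in pairs:
        by_x[x][y] += 1
    n = len(pairs)
    return sum(sum(cnt.values()) / n * entropy(cnt) for cnt in by_x.values())

def map_uncertainty(states, observations, actions):
    """Sum of the three entropies for one candidate state labelling.

    states[t]       : discrete map node assigned to time step t
    observations[t] : (discretised) observation at time step t
    actions[t]      : action taken at time step t
    """
    h_obs = conditional_entropy(list(zip(states, observations)))   # what is observed in a node
    h_act = conditional_entropy(list(zip(states, actions)))        # what actions occur in a node
    h_trans = conditional_entropy(
        [((s, a), s_next) for s, a, s_next in zip(states, actions, states[1:])]
    )                                                               # where transitions lead
    return h_obs + h_act + h_trans

# Toy usage: a lower score means a less ambiguous, more deterministic map.
states = [0, 0, 1, 1, 2, 0]
observations = ["wall", "wall", "door", "door", "open", "wall"]
actions = ["fwd", "turn", "fwd", "turn", "fwd", "turn"]
print(map_uncertainty(states, observations, actions))
```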


Similar resources

On Improving Deep Reinforcement Learning for POMDPs

Deep Reinforcement Learning (RL) recently emerged as one of the most competitive approaches for learning in sequential decision making problems with fully observable environments, e.g., computer Go. However, very little work has been done in deep RL to handle partially observable environments. We propose a new architecture called Action-specific Deep Recurrent Q-Network (ADRQN) to enhance learn...
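As a rough illustration of the idea behind an action-specific recurrent Q-network, the following PyTorch sketch feeds the previous action together with the current observation into an LSTM whose hidden state summarizes the interaction history. The layer sizes, the flat-vector observation, and all names are illustrative assumptions, not the architecture described in the paper.

```python
# Sketch of an action-conditioned recurrent Q-network: the recurrent hidden
# state can summarise a belief over the partially observed state because it
# sees both observations and the actions that produced them.
import torch
import torch.nn as nn

class ActionConditionedRecurrentQNet(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden_dim=64, act_embed_dim=16):
        super().__init__()
        self.act_embed = nn.Embedding(n_actions, act_embed_dim)    # embed previous action
        self.obs_encoder = nn.Linear(obs_dim, hidden_dim)          # encode observation
        self.lstm = nn.LSTM(hidden_dim + act_embed_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, n_actions)             # one Q-value per action

    def forward(self, obs_seq, prev_act_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim); prev_act_seq: (batch, time) integer actions
        x = torch.cat([torch.relu(self.obs_encoder(obs_seq)),
                       self.act_embed(prev_act_seq)], dim=-1)
        out, hidden = self.lstm(x, hidden)
        return self.q_head(out), hidden                            # (batch, time, n_actions)

# Toy forward pass
net = ActionConditionedRecurrentQNet(obs_dim=8, n_actions=4)
q, _ = net(torch.randn(2, 5, 8), torch.randint(0, 4, (2, 5)))
print(q.shape)  # torch.Size([2, 5, 4])
```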


A Survey of POMDP Solution Techniques

One of the goals of AI is to design an agent which can interact with an environment so as to maximize some reward function. Control theory addresses the same problem, but uses slightly different language: agent = controller, environment = plant, maximizing reward = minimizing cost. Control theory is mainly concerned with tasks in continuous spaces, such as designing a guided missile to interce...


Learning Others' Intentional Models in Multi-Agent Settings Using Interactive POMDPs

Interactive partially observable Markov decision processes (I-POMDPs) provide a principled framework for planning and acting in a partially observable, stochastic and multiagent environment, extending POMDPs to multi-agent settings by including models of other agents in the state space and forming a hierarchical belief structure. In order to predict other agents’ actions using I-POMDP, we propo...


Apprenticeship Learning for Model Parameters of Partially Observable Environments

We consider apprenticeship learning — i.e., having an agent learn a task by observing an expert demonstrating the task — in a partially observable environment when the model of the environment is uncertain. This setting is useful in applications where the explicit modeling of the environment is difficult, such as a dialogue system. We show that we can extract information about the environment m...


Data-Efficient Reinforcement Learning in Continuous State-Action Gaussian-POMDPs

We present a data-efficient reinforcement learning method for continuous state-action systems under significant observation noise. Data-efficient solutions under small noise exist, such as PILCO, which learns the cartpole swing-up task in 30s. PILCO evaluates policies by planning state-trajectories using a dynamics model. However, PILCO applies policies to the observed state, therefore planning i...
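The following sketch illustrates, in a heavily simplified form, the model-based evaluation idea this abstract refers to: fit a probabilistic dynamics model to collected transitions and score a policy by rolling trajectories through the model rather than the real system. PILCO itself uses Gaussian-process moment matching and analytic policy gradients; here a scikit-learn GP and a plain Monte Carlo rollout on an assumed toy 1-D system stand in purely for illustration.

```python
# Simplified model-based policy evaluation with a learned probabilistic
# dynamics model (a stand-in for PILCO's GP dynamics model, without
# moment matching or gradient-based policy search).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

# Collected transitions from a toy 1-D system: x' = x + 0.1*a + noise
X = rng.uniform(-1, 1, size=(200, 2))                     # columns: state, action
y = X[:, 0] + 0.1 * X[:, 1] + rng.normal(0, 0.01, 200)    # observed next state

dynamics = GaussianProcessRegressor().fit(X, y)            # probabilistic dynamics model

def evaluate_policy(policy, x0=1.0, horizon=20, n_rollouts=30):
    """Expected sum of squared-state costs under the learned model."""
    total = 0.0
    for _ in range(n_rollouts):
        x = x0
        for _ in range(horizon):
            a = policy(x)
            mean, std = dynamics.predict([[x, a]], return_std=True)
            x = rng.normal(mean[0], std[0])                 # sample the model's prediction
            total += x ** 2                                 # quadratic cost on the state
    return total / n_rollouts

print(evaluate_policy(lambda x: -x))   # a stabilising policy should score a low cost
```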



Journal title:

Volume   Issue

Pages  -

Publication date: 2002