Off-Policy Evaluation With Online Adaptation for Robot Exploration in Challenging Environments
Abstract
Autonomous exploration has many important applications. However, classic information gain-based or frontier-based exploration relies only on the robot's current state to determine the immediate exploration goal, which lacks the capability of predicting the value of future states and thus leads to inefficient exploration decisions. This letter presents a method to learn how "good" states are, measured by the state value function, to provide guidance for robot exploration in real-world challenging environments. We formulate our work as an off-policy evaluation (OPE) problem for robot exploration (OPERE). It consists of offline Monte-Carlo training on real-world data and performs Temporal Difference (TD) online adaptation to optimize the trained value estimator. We also design an intrinsic reward function based on sensor information coverage to enable the robot to gain more information with sparse extrinsic rewards. Results show that our method enables the robot to predict the value of future states so as to better guide robot exploration. The proposed algorithm achieves better prediction and exploration performance compared with state-of-the-art methods. To the best of our knowledge, this work demonstrates for the first time value function prediction on a real-world dataset for robot exploration in challenging subterranean and urban environments.
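The two-stage idea in the abstract (offline Monte-Carlo training, then online TD adaptation) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear value estimator, the feature vectors, the function names, and the hyperparameters `gamma` and `lr` are all assumptions made for illustration, and the intrinsic sensor-coverage reward is not modeled here.

```python
import numpy as np

def monte_carlo_returns(rewards, gamma=0.99):
    """Discounted return G_t for every step of one recorded episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    # Accumulate backwards so each G_t = r_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def fit_offline(features, returns):
    """Offline stage (assumed form): least-squares fit of a linear
    value estimator V(s) = w . phi(s) to Monte-Carlo returns."""
    w, *_ = np.linalg.lstsq(features, returns, rcond=None)
    return w

def td0_update(w, phi_s, reward, phi_next, gamma=0.99, lr=0.05, terminal=False):
    """Online stage (assumed form): one TD(0) step that moves V(s)
    toward the bootstrapped target r + gamma * V(s')."""
    bootstrap = 0.0 if terminal else gamma * (w @ phi_next)
    td_error = reward + bootstrap - w @ phi_s
    return w + lr * td_error * phi_s

# Hypothetical usage: three logged states with one-hot features.
features = np.eye(3)
returns = monte_carlo_returns([1.0, 1.0, 1.0], gamma=0.5)
w = fit_offline(features, returns)
# Adapt the estimate for the first state after observing a new transition.
w = td0_update(w, features[0], 2.0, features[1], gamma=0.5, lr=0.1)
```

Monte-Carlo targets are unbiased but need complete episodes, which suits logged data; the TD step only needs a single transition, which is why it is the natural choice for adaptation during deployment.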
Similar resources
Searching for Optimal Off-Line Exploration Paths in Grid Environments for a Robot with Limited Visibility
Robotic exploration is an on-line problem in which autonomous mobile robots incrementally discover and map the physical structure of initially unknown environments. Usually, the performance of exploration strategies used to decide where to go next is not compared against the optimal performance obtainable in the test environments, because the latter is generally unknown. In this paper, we prese...
Eligibility Traces for Off-Policy Policy Evaluation
Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference methods. Here we generalize eligibility traces to off-policy learning, in which one learns about a policy different from the policy that generates the data. Off-policy methods can greatly multiply learning, as many policie...
Incremental Online Evolution and Adaptation of Neural Networks for Robot Control in Dynamic Environments
Many approaches have been developed to tackle the design complexity of modern robotic systems by using evolutionary processes. Starting with an initial solution, the evolutionary process tries to adapt to a given scenario and in the end produces an improved solution. Previous work showed that incremental evolution, a stepwise increase in the scenario difficulty, can increase the success of evol...
Online policy adaptation for ensemble classifiers
Ensemble algorithms can improve the performance of a given learning algorithm through the combination of multiple base classifiers into an ensemble. In this paper, the idea of using an adaptive policy for training and combining the base classifiers is put forward. The effectiveness of this approach for online learning is demonstrated by experimental results on several UCI benchmark databases.
Toward Reliable Off Road Autonomous Vehicles Operating in Challenging Environments
The DARPA PerceptOR program implements a rigorous evaluative test program which fosters the development of field relevant outdoor mobile robots. Autonomous ground vehicles are deployed on diverse test courses throughout the USA and quantitatively evaluated on such factors as autonomy level, waypoint acquisition, failure rate, speed, and communications bandwidth. Our efforts over the three year ...
Journal
Journal title: IEEE Robotics and Automation Letters
Year: 2023
ISSN: 2377-3766
DOI: https://doi.org/10.1109/lra.2023.3271520