Robotics
Computer science
Adaptation (eye)
Artificial intelligence
Function (biology)
State (computer science)
Estimator
Machine learning
Human-computer interaction
Algorithm
Biology
Optics
Evolutionary biology
Statistics
Mathematics
Physics
Authors
Yafei Hu, Junyi Geng, Chen Wang, J. S. Keller, Sebastian Scherer
Source
Journal: IEEE Robotics and Automation Letters
Date: 2023-05-11
Volume/Issue: 8 (6): 3780-3787
Citations: 13
Identifier
DOI: 10.1109/lra.2023.3271520
Abstract
Autonomous exploration has many important applications. However, classic information gain-based or frontier-based exploration relies only on the robot's current state to determine the immediate exploration goal; it cannot predict the value of future states and thus leads to inefficient exploration decisions. This letter presents a method to learn how "good" states are, measured by the state value function, to provide guidance for robot exploration in challenging real-world environments. We formulate our work as an off-policy evaluation (OPE) problem for robot exploration (OPERE). It consists of offline Monte-Carlo training on real-world data, followed by online Temporal Difference (TD) adaptation to optimize the trained value estimator. We also design an intrinsic reward function based on sensor information coverage to enable the robot to gain more information when extrinsic rewards are sparse. Results show that our method enables the robot to predict the value of future states and thus better guide exploration. The proposed algorithm achieves better prediction and exploration performance than state-of-the-art methods. To the best of our knowledge, this work is the first to demonstrate value function prediction on a real-world dataset for robot exploration in challenging subterranean and urban environments.
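The abstract's core idea, pretraining a value estimator offline and then refining it online with Temporal Difference updates, can be illustrated with a minimal sketch. This is not the authors' implementation: the tabular `ValueTable` estimator, the `td_adapt` helper, and the learning-rate and discount parameters (`alpha`, `gamma`) are all illustrative assumptions; the paper uses a learned function approximator over sensor data.

```python
class ValueTable:
    """Tabular stand-in for a pretrained state-value estimator."""
    def __init__(self):
        self.v = {}  # state -> estimated value (default 0.0)

    def __call__(self, state):
        return self.v.get(state, 0.0)

    def update(self, state, target, alpha):
        # Move the estimate a step toward the TD target.
        self.v[state] = self(state) + alpha * (target - self(state))


def td_adapt(estimator, transitions, alpha=0.1, gamma=0.99):
    """One pass of TD(0) updates over observed (s, r, s') transitions.

    In the paper's setting, the reward r would combine the sparse
    extrinsic reward with an intrinsic term based on sensor
    information coverage.
    """
    for s, r, s_next in transitions:
        td_target = r + gamma * estimator(s_next)
        estimator.update(s, td_target, alpha)
    return estimator


# Example: adapt the estimator on a short observed trajectory.
est = ValueTable()
traj = [("a", 0.5, "b"), ("b", 1.0, "c"), ("c", 0.0, "c")]
td_adapt(est, traj)
```

The online TD step lets the estimator correct its offline Monte-Carlo estimates as the robot encounters states that differ from the training distribution.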