Keywords
Reinforcement learning; Computer science; Benchmark; Monte Carlo tree search; Heuristic; Context; Generalization; State space; Machine learning; Artificial intelligence; Task; Dynamic decision-making; Speedup; Offline learning; Decision tree; Online learning; Monte Carlo method
Authors
Sunan Zhang, Weichao Zhuang, Bingbing Li, Ke Li, Tianyu Xia, Bo Hu
Source
Journal: IEEE Transactions on Transportation Electrification
Date: 2024-04-24
Volume/Issue: 11(1): 521-535
Cited by: 3
Identifier
DOI: 10.1109/tte.2024.3393293
Abstract
The intricate interactions with other road users and the diversity of traffic environments make decision-making a challenging task for autonomous driving systems. Offline learning solutions are known for their high execution efficiency and their ability to approximate the optimal policy across the entire state space, but they are often unsafe and brittle when they encounter untrained states. Conversely, online planning methods can thoroughly assess at run time how current decisions influence future outcomes, and therefore generalize better; however, they suffer from low execution efficiency and are prone to getting stuck in locally optimal solutions. In this context, this paper proposes an Integrated Planning and Learning (IPL) algorithm, built on the reinforcement learning framework, for speed and lane-change decision-making on highways. Specifically, at each decision step the method uses an offline-learned model to guide an online Monte Carlo Tree Search (MCTS) in a heuristic search, yielding a forward-looking policy. Experimental results show that the IPL algorithm generalizes better to unknown scenarios and achieves better asymptotic performance than the benchmark algorithms. In addition, compared with the MCTS-based online planning method, the IPL algorithm improves execution efficiency and comes closer to global optimality.
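The abstract's core mechanism is an offline-learned model guiding an online MCTS. The paper's implementation details are not given here, so the following is a minimal sketch only, assuming an AlphaZero-style PUCT search in which the learned model supplies action priors and value estimates. The action set and the `policy_and_value` and `transition` functions are hypothetical placeholders, not the authors' code.

```python
import math
import random

# Hypothetical highway action set for illustration only.
ACTIONS = ["keep_lane", "change_left", "change_right", "accelerate", "decelerate"]

class Node:
    def __init__(self, prior):
        self.prior = prior      # P(a|s) from the offline-learned policy
        self.children = {}      # action -> child Node
        self.visits = 0
        self.value_sum = 0.0

    @property
    def value(self):
        return self.value_sum / self.visits if self.visits else 0.0

def policy_and_value(state):
    # Placeholder for the offline-learned model: action priors plus a
    # state-value estimate. A trained network would replace this.
    priors = {a: 1.0 / len(ACTIONS) for a in ACTIONS}
    return priors, random.uniform(-1.0, 1.0)

def transition(state, action):
    # Placeholder environment model; a real one would roll the ego vehicle
    # and surrounding traffic forward by one decision step.
    return state

def puct(parent, child, c_puct=1.5):
    # PUCT rule: mean backed-up value plus a prior-weighted exploration bonus,
    # so the learned policy biases the search toward promising actions.
    return child.value + c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)

def simulate(node, state):
    # One simulation: descend via PUCT to a leaf, expand it with the learned
    # priors, evaluate it with the learned value, and back the value up.
    if not node.children:
        priors, value = policy_and_value(state)
        for action, p in priors.items():
            node.children[action] = Node(prior=p)
    else:
        action, child = max(node.children.items(), key=lambda kv: puct(node, kv[1]))
        value = simulate(child, transition(state, action))
    node.visits += 1
    node.value_sum += value
    return value

def plan(state, n_simulations=200):
    root = Node(prior=1.0)
    for _ in range(n_simulations):
        simulate(root, state)
    # Act greedily on visit counts, the usual MCTS decision rule.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

if __name__ == "__main__":
    print(plan({"lane": 1, "speed_mps": 28.0}))
```

Under these assumptions, the division of labor matches the trade-off the abstract describes: the offline model keeps the search efficient by pruning implausible actions, while the online lookahead evaluates the downstream consequences that a purely offline policy would miss in untrained states.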