Reinforcement learning
Markov decision process
Computer science
Construct (python library)
Event (particle physics)
Action (physics)
Artificial intelligence
State (computer science)
Control (management)
Realization (probability)
Deep learning
Value network
Optimal control
Machine learning
Markov process
Mathematical optimization
Algorithm
Mathematics
Computer network
Statistics
Physics
Quantum mechanics
Business model
Marketing
Business
Authors
Jingwei Lu, Liyuan Han, Qinglai Wei, Xiao Wang, Xingyuan Dai, Fei‐Yue Wang
Identifiers
DOI:10.1109/tiv.2023.3262132
Abstract
This paper utilizes parallel control to investigate the problem of event-triggered deep reinforcement learning and develops an event-triggered deep Q-network (ETDQN) for decision-making of autonomous driving, without training an explicit triggering condition. Based on the framework of parallel control, the developed ETDQN incorporates action information into the feedback and constructs a dynamic control policy. First, to realize the dynamic control policy, we integrate the current state and the previous action to construct the augmented state as well as the augmented Markov decision process. Meanwhile, it is shown theoretically that the goal of the developed dynamic control policy is to learn the variation rate of the action. Because the augmented state contains information on both the current state and the previous action, the developed ETDQN can directly design the immediate reward with communication loss taken into account. Then, based on the dueling double deep Q-network (dueling DDQN), we establish the augmented action-value, value, and advantage functions to directly learn the optimal event-triggered decision-making policy for autonomous driving without an explicit triggering condition. It is worth noting that the developed ETDQN applies to various deep Q-networks (DQNs). Empirical results demonstrate that, in event-triggered control, the developed ETDQN outperforms dueling DDQN and effectively reduces communication loss.
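To make the two main ideas in the abstract concrete, the following Python (PyTorch) sketch is a minimal, hypothetical illustration, not the authors' implementation: it builds an augmented state by concatenating the current state with the previous action, and feeds it to a dueling Q-network head that decomposes the augmented action-value function into value and advantage streams. All names and dimensions (augment_state, DuelingQNet, state_dim, num_actions, hidden) are assumptions for illustration.

# Minimal sketch (assumed, not the paper's code) of the augmented state and
# the dueling decomposition Q(s_aug, a) = V(s_aug) + A(s_aug, a) - mean_a A(s_aug, a).
import torch
import torch.nn as nn
import torch.nn.functional as F


def augment_state(state: torch.Tensor, prev_action: torch.Tensor, num_actions: int) -> torch.Tensor:
    """Concatenate the current state with a one-hot encoding of the previous action."""
    prev_action_onehot = F.one_hot(prev_action, num_classes=num_actions).float()
    return torch.cat([state, prev_action_onehot], dim=-1)


class DuelingQNet(nn.Module):
    """Dueling head over the augmented state (illustrative architecture)."""

    def __init__(self, state_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        aug_dim = state_dim + num_actions  # augmented state dimension
        self.trunk = nn.Sequential(
            nn.Linear(aug_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.value_head = nn.Linear(hidden, 1)                 # V(s_aug)
        self.advantage_head = nn.Linear(hidden, num_actions)   # A(s_aug, a)

    def forward(self, aug_state: torch.Tensor) -> torch.Tensor:
        h = self.trunk(aug_state)
        value = self.value_head(h)
        advantage = self.advantage_head(h)
        # Standard dueling aggregation: subtract the mean advantage for identifiability.
        return value + advantage - advantage.mean(dim=-1, keepdim=True)


if __name__ == "__main__":
    # Usage sketch: when the greedy action over the augmented state equals the
    # previous action, the agent could keep the previous command instead of
    # transmitting a new one, so the event-triggered behavior emerges from the
    # learned Q-values rather than from an explicit triggering condition.
    num_actions, state_dim = 5, 10
    net = DuelingQNet(state_dim, num_actions)
    state = torch.randn(1, state_dim)
    prev_action = torch.tensor([2])
    q = net(augment_state(state, prev_action, num_actions))
    keep_previous_action = q.argmax(dim=-1).item() == prev_action.item()

Under these assumptions, the sketch only shows how an augmented input and a dueling head fit together; the paper's double-Q target computation, reward shaping for communication loss, and training loop are not reproduced here.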