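Reinforcement learning
Computer science
Markov decision process
Artificial intelligence
Context (archaeology)
Machine learning
Gradient descent
Markov process
Artificial neural network
Mathematics
Biology
Statistics
Paleontology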
Authors
Xingze Guo,Feng Gao,Xiaojun Zheng,Shiduo Ning
Identifier
DOI:10.1109/cscwd57460.2023.10152759
Abstract
To address the problem of collaborative decision-making, we propose a multi-agent deep reinforcement learning algorithm for collaborative behavior decision-making. First, a discrete state space and a greedy strategy-based action space are established in the context of a multi-agent collaborative attack, and the conditions for a successful collaborative siege are specified to meet the requirements of rapidity and collocation. Second, a Markov Decision Process (MDP) framework is established for the multi-agent collaborative behavior decision algorithm, and experience replay is introduced to train the neural network using gradient descent. Finally, a centralized training and distributed execution architecture is used to train the collaborative behavioral decision-making, in which the agents share the same strategy and execute actions independently. The simulation shows that the deep reinforcement learning algorithm is able to realize multi-agent collaborative decision-making and can be deployed in a real environment.
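The training scheme outlined in the abstract (experience replay, gradient-descent updates, and a single strategy shared by agents that act independently) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the one-dimensional toy world, the reward values, the hyperparameters, the names `ReplayBuffer`, `SharedQ`, `step`, and `train`, and the use of a linear Q-function over one-hot states in place of a deep neural network.

```python
import random
from collections import deque

# Hypothetical toy world (not from the paper): a line of 5 cells, target at the right end.
N_CELLS, TARGET = 5, 4
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)

    def push(self, transition):
        self.items.append(transition)

    def sample(self, batch_size, rng):
        pool = list(self.items)
        return rng.sample(pool, min(batch_size, len(pool)))

    def __len__(self):
        return len(self.items)


class SharedQ:
    """Linear Q-function over one-hot states: one weight per (state, action).

    Stands in for the neural network; all agents share this single object.
    """

    def __init__(self, n_states, n_actions):
        self.w = [[0.0] * n_actions for _ in range(n_states)]

    def greedy(self, state):
        row = self.w[state]
        return max(range(len(row)), key=row.__getitem__)

    def act(self, state, epsilon, rng):
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
        if rng.random() < epsilon:
            return rng.randrange(len(self.w[state]))
        return self.greedy(state)

    def sgd_step(self, batch, lr, gamma):
        # Per-sample semi-gradient descent on 0.5 * (Q(s, a) - target)^2.
        for s, a, r, s2, done in batch:
            target = r if done else r + gamma * max(self.w[s2])
            self.w[s][a] -= lr * (self.w[s][a] - target)


def step(pos, action):
    """Move one cell, clamped to the line; assumed reward of +1 at the target, -0.01 otherwise."""
    nxt = min(max(pos + ACTIONS[action], 0), N_CELLS - 1)
    done = nxt == TARGET
    return nxt, (1.0 if done else -0.01), done


def train(n_agents=2, episodes=500, seed=0):
    rng = random.Random(seed)
    q, buffer = SharedQ(N_CELLS, len(ACTIONS)), ReplayBuffer(1000)
    for _ in range(episodes):
        positions = [rng.randrange(TARGET) for _ in range(n_agents)]
        active = [True] * n_agents
        for _ in range(20):  # step limit per episode
            # Distributed execution: each agent acts on its own state only.
            for i in range(n_agents):
                if not active[i]:
                    continue
                a = q.act(positions[i], 0.3, rng)
                nxt, r, done = step(positions[i], a)
                buffer.push((positions[i], a, r, nxt, done))
                positions[i], active[i] = nxt, not done
            # Centralized training: one update from the pooled experience of all agents.
            q.sgd_step(buffer.sample(16, rng), lr=0.1, gamma=0.9)
            if not any(active):
                break
    return q
```

Calling `train()` returns the shared Q-function; `q.greedy(state)` then gives the action each agent would take independently at execution time. Pooling all agents' transitions into one buffer is one simple way to realize a shared strategy, chosen here for brevity.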