Reinforcement learning
Computer science
Multi-agent system
Distributed algorithm
Distributed computing
Artificial intelligence
Authors
Martin Lauer, Martin Riedmiller
Source
Venue: International Conference on Machine Learning
Date: 2000-06-29
Pages: 535-542
Citations: 399
Abstract
The article focuses on distributed reinforcement learning in cooperative multi-agent-decision-processes, where an ensemble of simultaneously and independently acting agents tries to maximize a discounted sum of rewards. We assume that each agent has no information about its teammates' behaviour. Thus, in contrast to single-agent reinforcement-learning, each agent has to consider its teammates' behaviour and to find a cooperative policy. We propose a model-free distributed Q-learning algorithm for cooperative multi-agent-decision-processes. It can be proved to find optimal policies in deterministic environments. No additional expense is needed in comparison to the non-distributed case. Further, there is no need for additional communication between the agents.

1. Introduction

Reinforcement learning has originally been discussed for Markov Decision Processes (MDPs): a single agent has to learn a policy that maximizes the discounted sum of rewards in a stochastic environment...
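The abstract does not spell out the update rule, so the following is only a minimal sketch of one way independent agents could learn without communication in a deterministic cooperative task. It assumes an optimistic update in which each agent keeps a table over its own actions only, never lowers a value, and changes its greedy action only when the best value in a state strictly improves. The reward matrix, the single-state one-step game, and all names (`REWARD`, `optimistic_update`, `train`) are hypothetical illustrations, and the sketch further assumes nonnegative rewards and zero-initialised tables.

```python
import random

# Hypothetical one-step cooperative game: both agents receive REWARD[a0][a1].
# Two joint actions, (0, 0) and (2, 2), are equally optimal, so the agents
# must agree on one of them without communicating.
REWARD = [[10, 0, 0],
          [0, 2, 0],
          [0, 0, 10]]
ACTIONS = [0, 1, 2]
STATE = "s"  # single non-terminal state; every episode ends after one step


def optimistic_update(q, policy, state, action, reward, next_value, gamma):
    """Raise q[state][action] to the one-step target if the target is larger.

    The table never decreases (assumed optimistic rule). The stored greedy
    action for `state` changes only when the best value strictly improves.
    """
    old_best = max(q[state].values())
    target = reward + gamma * next_value
    if target > q[state][action]:
        q[state][action] = target
    if q[state][action] > old_best:
        policy[state] = action


def train(episodes=500, gamma=0.9, seed=0):
    """Run purely random exploration; each agent sees only its own action."""
    rng = random.Random(seed)
    qs = [{STATE: {a: 0.0 for a in ACTIONS}} for _ in range(2)]
    policies = [{STATE: ACTIONS[0]} for _ in range(2)]
    for _ in range(episodes):
        joint = [rng.choice(ACTIONS) for _ in range(2)]
        reward = REWARD[joint[0]][joint[1]]  # shared team reward
        for i in range(2):
            # next_value is 0.0 because the episode terminates after one step
            optimistic_update(qs[i], policies[i], STATE, joint[i],
                              reward, 0.0, gamma)
    return policies, qs
```

Calling `train()` and reading `policies[0][STATE]` and `policies[1][STATE]` gives each agent's learned action. Under the stated assumptions, both agents switch their stored action in the same episode, the first one in which an optimal joint action is tried, which is how this sketch avoids mixing the two equally good joint actions.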