Reinforcement learning
Shortest path problem
Path (computing)
Table (database)
Computer science
K shortest path routing
Artificial intelligence
Algorithm
Mathematical optimization
Mathematics
Theoretical computer science
Data mining
Graph
Programming language
Authors
Nitchakun Kantasewi,Sanparith Marukatat,Somying Thainimit,Manabu Okumura
Identifiers
DOI:10.1109/ictemsys.2019.8695963
Abstract
Q-learning is a popular reinforcement learning technique for solving the shortest path (STP) problem. In a maze with multiple sub-tasks, such as collecting treasures and avoiding traps, Q-learning has been observed to converge to the optimal path; however, the average sum of rewards obtained along that path is only moderate. This paper proposes Multi-Q-Table Q-learning to address this problem of a low average sum of rewards. The proposed method constructs a new Q-table whenever a sub-goal is reached. This modification lets the agent learn that a sub-reward has already been collected and can be obtained only once. Our experimental results show that the modified algorithm can find an optimal solution that collects all treasures (positive rewards), avoids the pit, and reaches the goal along the shortest path. For a small maze, the proposed algorithm requires more time than conventional Q-learning to reach the optimal solution.
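The abstract's core idea (a separate Q-table for each stage of collected sub-goals, so the agent's value estimates reflect which one-time rewards are already gone) can be illustrated with a minimal sketch. The sketch below keys Q-tables by the set of treasures collected so far, which is one possible reading of "constructs a new Q-table whenever a sub-goal is reached"; the maze layout, reward values, and hyperparameters are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal Multi-Q-Table Q-learning sketch (assumptions, not the paper's exact setup).
import random
from collections import defaultdict

GRID = 5                       # 5x5 maze (assumption)
START, GOAL = (0, 0), (4, 4)
TREASURES = {(1, 3), (3, 1)}   # sub-goals paying a one-time reward
PIT = (2, 2)                   # stepping here ends the episode with a penalty
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
ALPHA, GAMMA, EPS, EPISODES = 0.1, 0.95, 0.1, 5000

# One Q-table per set of already-collected treasures; a fresh table is created
# lazily the first time a new sub-goal combination is reached.
q_tables = defaultdict(lambda: defaultdict(lambda: [0.0] * len(ACTIONS)))

def step(pos, collected, a):
    """Apply action a; return next position, updated collected set, reward, done."""
    nxt = (min(max(pos[0] + ACTIONS[a][0], 0), GRID - 1),
           min(max(pos[1] + ACTIONS[a][1], 0), GRID - 1))
    reward, done = -1.0, False             # step cost encourages short paths
    if nxt == PIT:
        return nxt, collected, -100.0, True
    if nxt in TREASURES and nxt not in collected:
        collected = collected | {nxt}      # sub-reward can be obtained only once
        reward += 50.0
    if nxt == GOAL:
        reward += 100.0
        done = True
    return nxt, collected, reward, done

for _ in range(EPISODES):
    pos, collected, done = START, frozenset(), False
    while not done:
        q = q_tables[collected][pos]
        a = random.randrange(len(ACTIONS)) if random.random() < EPS \
            else max(range(len(ACTIONS)), key=lambda i: q[i])
        nxt, new_collected, r, done = step(pos, collected, a)
        # Bootstrap from the Q-table of the *next* sub-goal state, so values
        # reflect which treasures are already collected.
        target = 0.0 if done else max(q_tables[new_collected][nxt])
        q[a] += ALPHA * (r + GAMMA * target - q[a])
        pos, collected = nxt, new_collected

print(len(q_tables), "Q-tables learned (one per collected-treasure set)")
```

Keying the tables by the collected-treasure set is what lets the greedy policy differ before and after a treasure is picked up, which a single shared Q-table cannot represent.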