Variable neighborhood search
Reinforcement learning
Benchmark (surveying)
Metaheuristic
Computer science
Mathematical optimization
Variable (mathematics)
Multi-armed bandit
Local search (optimization)
Vehicle routing problem
Scheme (mathematics)
Routing (electronic design automation)
Artificial intelligence
Machine learning
Mathematics
Geodesy
Mathematical analysis
Regret
Geography
Computer networks
Authors
Panagiotis Kalatzantonakis,Angelo Sifaleras,Nikolaos Samaras
Identifier
DOI:10.1016/j.eswa.2022.118812
Abstract
Finding the best sequence of local search operators that yields the optimal performance of Variable Neighborhood Search (VNS) is an important open research question in the field of metaheuristics. This paper proposes a Reinforcement Learning method to address this question. We introduce a new hyperheuristic scheme, termed Bandit VNS, inspired by the Multi-Armed Bandit (MAB), a particular type of single-state reinforcement learning problem. In Bandit VNS, we utilize the General Variable Neighborhood Search metaheuristic and enhance it with a hyperheuristic strategy. We examine several variations of the Upper Confidence Bound algorithm to create a reliable strategy for adaptive neighborhood selection. Furthermore, we utilize Adaptive Windowing, a state-of-the-art algorithm for estimating and detecting changes in a data stream. Bandit VNS is designed for effective parallelization and encourages cooperation between agents to produce the best solution quality. We demonstrate this concept's advantages in accuracy and speed by extensive experimentation using the Capacitated Vehicle Routing Problem. We compare the novel scheme's performance against the conventional General Variable Neighborhood Search metaheuristic in terms of CPU time and solution quality. The Bandit VNS method shows excellent results and reaches significantly higher performance metrics when applied to well-known benchmark instances. Our experiments show that our approach achieves an improvement of more than 25% in solution quality compared to the General Variable Neighborhood Search method on standard library instances of medium and large size.
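The core idea of the abstract — treating each local search neighborhood as an arm of a multi-armed bandit and selecting among them with an Upper Confidence Bound rule — can be illustrated with a minimal sketch. This is not the authors' implementation: the reward definition (positive cost improvement), the UCB1 exploration constant, and the toy search loop below are illustrative assumptions, and the paper's ADWIN-based change detection and parallel cooperation are omitted.

```python
import math
import random

def ucb1_select(counts, rewards, t, c=2.0):
    """Return the neighborhood index maximizing the UCB1 score.

    counts[i]  -- how many times neighborhood i was applied
    rewards[i] -- cumulative reward (illustrative: total cost improvement)
    t          -- current iteration (1-based)
    c          -- exploration constant (assumed value, not from the paper)
    """
    for i, n in enumerate(counts):
        if n == 0:  # play every arm once before using the UCB formula
            return i
    return max(
        range(len(counts)),
        key=lambda i: rewards[i] / counts[i]
        + math.sqrt(c * math.log(t) / counts[i]),
    )

def bandit_vns(initial, neighborhoods, cost, iters=200, seed=0):
    """Toy VNS-style loop where UCB1 picks the next neighborhood operator.

    neighborhoods -- list of callables (solution, rng) -> candidate solution
    cost          -- objective to minimize
    """
    rng = random.Random(seed)
    best = initial
    counts = [0] * len(neighborhoods)
    rewards = [0.0] * len(neighborhoods)
    for t in range(1, iters + 1):
        k = ucb1_select(counts, rewards, t)
        candidate = neighborhoods[k](best, rng)
        improvement = cost(best) - cost(candidate)
        counts[k] += 1
        rewards[k] += max(0.0, improvement)  # reward only actual gains
        if improvement > 0:  # accept strictly improving moves
            best = candidate
    return best
```

A quick use: minimize a sum of squares with two hypothetical neighborhoods, a small and a large random perturbation of one coordinate; the bandit learns online which perturbation scale pays off more often.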