Reinforcement learning
Computer science
Job-shop scheduling
Job shop
Metaheuristic
Markov decision process
Mathematical optimization
Artificial intelligence
Scheduling (production processes)
Flow-shop scheduling
Distributed computing
Metro train timetable
Markov process
Mathematics
Statistics
Operating system
Authors
Jiang-Ping Huang, Liang Gao, Xinyu Li, Chunjiang Zhang
Identifier
DOI:10.1016/j.jmsy.2023.06.007
Abstract
With the development of the global economy, distributed manufacturing has become common in industry. The Distributed Job-shop Scheduling Problem (DJSP), which is widespread in real-life production, is a hotspot in the academic field. Existing Priority Dispatch Rules (PDRs), which assign a value to each waiting job according to some method and select the job with the minimum or maximum "value" for the next processing step, are relatively simple but lack self-learning ability, while metaheuristics are complex, follow fixed evolutionary trajectories, and cannot adapt to changes in the manufacturing environment. This paper proposes a novel PDR generation method based on Graph Neural Networks (GNN) and Reinforcement Learning (RL), which can self-learn and self-evolve by interacting with the scheduling environment. To closely combine the DJSP with the GNN, a new solution representation based on the disjunctive graph is designed. The DJSP is formulated as a Markov decision process, and the problem features and inner connections among the vertices of the disjunctive graph are fully explored by the GNN. An Actor-Critic RL method is applied to automatically train the network parameters and optimize the policy, so that it can be used to select the best action at each scheduling step. Comprehensive experiments on 240 test instances are conducted to evaluate the performance of the proposed method, and the results indicate that it shows greater effectiveness, generalizability and stability than 8 classical PDRs, 5 metaheuristics and 3 RL-based methods.
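As the abstract notes, a Priority Dispatch Rule scores every waiting job and picks the one with the minimum or maximum score. The sketch below illustrates that mechanism with two well-known classical rules (SPT and EDD); the job data and function names are invented for demonstration and are not from the paper.

```python
def dispatch(jobs, key, pick_min=True):
    """Apply a Priority Dispatch Rule: score every waiting job with
    `key` and select the job with the minimum (or maximum) score."""
    chooser = min if pick_min else max
    return chooser(jobs, key=key)

# Three hypothetical waiting jobs with processing times and due dates.
jobs = [
    {"id": "J1", "proc_time": 5, "due": 20},
    {"id": "J2", "proc_time": 2, "due": 15},
    {"id": "J3", "proc_time": 8, "due": 10},
]

# SPT (Shortest Processing Time): pick the job with the smallest
# processing time -> J2.
spt_choice = dispatch(jobs, key=lambda j: j["proc_time"])

# EDD (Earliest Due Date): pick the job with the earliest due
# date -> J3.
edd_choice = dispatch(jobs, key=lambda j: j["due"])
```

Rules like these are simple and fast but, as the paper argues, they are fixed heuristics with no ability to learn from the scheduling environment, which motivates replacing the hand-crafted `key` with a GNN-based learned policy.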