Keywords
Scheduling (production processes)
Reinforcement learning
Computer science
Rebar
End-to-end principle
Mathematical optimisation
Operations research
Distributed computing
Industrial engineering
Engineering
Artificial intelligence
Mathematics
Structural engineering
Authors
Haoran Li, Liang Gao, Qingsong Fan, Xinyu Li, Bao An Han
Identifier
DOI: 10.1080/00207543.2024.2449240
Abstract
Heterogeneity among factories in distributed manufacturing significantly expands the solution space, complicating optimisation. Traditional centralised scheduling methods lack the scalability to adapt to varying factory scales. This paper proposes an end-to-end decentralised scheduling framework based on deep reinforcement learning (DRL) for the dynamic distributed heterogeneous permutation flowshop scheduling problem (DDHPFSP) with random job arrivals. The framework utilises a multi-agent architecture in which each factory operates as an independent agent, enabling efficient, robust, and scalable scheduling. Specifically, the DDHPFSP is formulated as a partially observable Markov decision process (POMDP), with a state space reflecting heterogeneity and permutation characteristics and a tailored reward function addressing sparse rewards and high reward variance. An end-to-end policy network with a dual-layer architecture is developed, incorporating a feature extraction network that captures intrinsic relationships between jobs and heterogeneous factories and enhances the agent's self-learning and policy evolution. Moreover, a backward swap search (BSS) method based on greedy heuristics optimises the pre-scheduling plan during the online phase with minimal computation time. Experimental results demonstrate that the framework outperforms the best comparison methods by 39.76% on 540 baseline instances and 59.95% on 2430 generalisation instances. Furthermore, the framework's effectiveness improves by 68.9% with the introduction of the BSS method.
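The abstract does not detail the backward swap search beyond describing it as a greedy heuristic applied to the pre-scheduling plan, but the general idea of such a refinement step can be illustrated with a minimal sketch: given a job permutation for one factory, scan positions from the back of the sequence, try swapping each with an earlier job, and keep any swap that strictly reduces the permutation-flowshop makespan. The function names (`makespan`, `backward_swap_search`), the processing-time layout, and the first-improvement acceptance rule below are assumptions for illustration, not the authors' implementation.

```python
from typing import List


def makespan(perm: List[int], proc: List[List[int]]) -> int:
    """Makespan of a permutation flowshop schedule.

    proc[j][m] is the processing time of job j on machine m;
    every job visits machines 0..M-1 in the same order.
    """
    num_machines = len(proc[0])
    completion = [0] * num_machines  # completion time of the latest job on each machine
    for job in perm:
        for m in range(num_machines):
            # A job starts on machine m once the machine is free and the job
            # has finished on machine m-1 (completion[m-1] already refers to
            # the current job because machines are updated in order).
            ready = completion[m] if m == 0 else max(completion[m], completion[m - 1])
            completion[m] = ready + proc[job][m]
    return completion[-1]


def backward_swap_search(perm: List[int], proc: List[List[int]]) -> List[int]:
    """Greedy backward swap refinement (illustrative sketch only).

    Scans job positions from the end of the sequence toward the front,
    tries swapping each with every earlier position, accepts the first
    swap that strictly reduces the makespan, and repeats until no
    improving swap exists.
    """
    best = list(perm)
    best_cost = makespan(best, proc)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1, 0, -1):          # backward scan
            for j in range(i - 1, -1, -1):
                candidate = list(best)
                candidate[i], candidate[j] = candidate[j], candidate[i]
                cost = makespan(candidate, proc)
                if cost < best_cost:                   # greedy acceptance
                    best, best_cost = candidate, cost
                    improved = True
                    break
            if improved:
                break
    return best


if __name__ == "__main__":
    # 4 jobs x 3 machines with arbitrary processing times
    proc = [[3, 2, 4], [2, 5, 1], [4, 1, 3], [1, 3, 2]]
    initial = [0, 1, 2, 3]  # e.g. a pre-scheduling order produced by the DRL policy
    refined = backward_swap_search(initial, proc)
    print(refined, makespan(refined, proc))
```

Each accepted swap costs one O(nM) makespan re-evaluation, which is why a local refinement of this kind can remain cheap enough for the online phase mentioned in the abstract; the paper's actual BSS may restrict the swap neighbourhood or reuse partial completion times to cut this further.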