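Reinforcement learning
Control reconfiguration
Computer science
Dynamic priority scheduling
Scheduling (production processes)
Markov decision process
Artificial intelligence
Graph
Genetic algorithm scheduling
Distributed computing
Job shop scheduling
Mathematical optimization
Dynamic programming
Fair-share scheduling
Two-level scheduling
Field (mathematics)
Multi-agent system
Partially observable Markov decision process
Markov process
Reduction
Markov chain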
Authors
Yuxin Li,Qihao Liu,Chunjiang Zhang,Xinyu Li,Liang Gao
Identifier
DOI:10.1109/tsmc.2025.3612300
Abstract
Human–machine hybrid reconfiguration manufacturing is an emerging paradigm in the field of precision equipment production and can greatly improve the production capability of the workshop. However, numerous complex constraints and a dynamic environment make reasonable scheduling very difficult. To this end, this article studies the dynamic human–machine hybrid reconfiguration manufacturing scheduling problem (DHMRSP) and proposes a novel deep reinforcement learning (DRL) scheduling method. Specifically, a dual-agent Markov decision process (MDP) is established, which can handle seven complex constraints and three disturbance events. Then, a heterogeneous competition graph attention network (HCGAN) is designed, where the meta-path-based subgraph conversion reflects the resource-operation competition, and three modules use node-level attention and semantic-level attention to realize important information embedding. Afterward, a dual proximal policy optimization (PPO) algorithm with HCGAN and mixed action space (HM-DPPO) is proposed, where the allocation agent and reconfiguration agent achieve collaborative learning by taking joint action and sharing graph embeddings and reward. Experimental results prove that the proposed approach outperforms rules, genetic programming (GP), and three DRL methods on different instances and can effectively handle various disturbance events.