A DRL-Based Reactive Scheduling Policy for Flexible Job Shops With Random Job Arrivals

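Tardiness, Computer science, Scheduling (production processes), Reinforcement learning, Job shop scheduling, Job shop, Distributed computing, Job scheduler, Minification, Dynamic priority scheduling, Mathematical optimization, Operations research, Flow shop scheduling, Artificial intelligence, Engineering, Computer network, Mathematics, Queueing, Quality of service, Programming language, Routing (electronic design automation)
Authors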
Linlin Zhao,Jiaxin Fan,Chunjiang Zhang,Weiming Shen,Jing Zhuang
Source
Journal: IEEE Transactions on Automation Science and Engineering [Institute of Electrical and Electronics Engineers]
Volume (issue): 21 (3): 2912-2923 Citations: 34
Identifier
DOI:10.1109/tase.2023.3271666
Abstract

In real-life production systems, job arrivals are usually unpredictable, which makes it necessary to develop robust reactive scheduling policies to meet delivery requirements. Deep reinforcement learning (DRL) based scheduling methods can respond quickly to dynamic events by learning from training data. However, most policy networks in DRL algorithms are trained to choose priority dispatching rules (PDRs), so the efficiency of the resulting scheduling plans is, to some extent, limited by the performance of the PDRs. This paper investigates a dynamic flexible job shop scheduling problem with random job arrivals, with the objective of minimizing total tardiness. A DRL-based reactive scheduling method, proximal policy optimization with an attention-based policy network (PPO-APN), is proposed to make real-time decisions in the dynamic scheduling environment. In contrast to an action space that consists of PDRs, the attention-based policy network (APN) directly selects pending jobs. Additionally, a global/local reward function (GLRF) is designed to address the reward sparsity issue during training. The proposed PPO-APN is tested on randomly generated instances with different production configurations and is compared with frequently used PDRs and DRL-based methods. Numerical experimental results indicate that the APN and GLRF components significantly improve training efficiency, and that PPO-APN shows better overall performance than the other methods.

Note to Practitioners: This work is motivated by a typical production scenario in discrete manufacturing systems, where orders arrive randomly at the shop floor and must be scheduled quickly to ensure on-time delivery. Previous research tends to apply DRL algorithms to choose suitable dispatching rules for ease of implementation. Nevertheless, the jobs that can be selected by dispatching rules are rather limited, so many possible high-quality scheduling plans are ignored. This work first sorts all the unscheduled jobs with a heuristic algorithm and places some of the top-ranked jobs into a pool. When a machine becomes available, it directly chooses a job from the pool as its next processing task. The job selection policy is represented by a novel attention-based network and is trained with a DRL algorithm (proximal policy optimization). This process is executed repeatedly in a simulation environment to collect training data. After being trained for a certain period of time, the policy improves and can be applied to make decisions in real time. The proposed reactive scheduling method has been shown to be more efficient than dispatching rules and other DRL-based approaches, and is effective for production scheduling in a wide variety of discrete manufacturing scenarios, such as the automobile and electronics industries. Moreover, the proposed method can be extended to dynamic scheduling problems with additional production characteristics by adding constraints on job selection or by redefining how the completion time of operations is calculated.
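The pool-based selection step described in the Note to Practitioners can be sketched as follows. This is only an illustration under stated assumptions, not the paper's PPO-APN. The abstract does not name the ranking heuristic, so slack (time to due date minus remaining processing time) is used as a stand-in. The job features and the fixed `machine_query` vector are hypothetical; in a trained policy the query and key projections would be learned. The names `Job`, `build_pool`, `attention_probs`, and `total_tardiness` are all introduced here for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    due: float        # due date
    remaining: float  # remaining processing time


def build_pool(jobs, now, pool_size):
    """Rank unscheduled jobs by slack (assumed heuristic) and keep the top ones."""
    ranked = sorted(jobs, key=lambda j: j.due - now - j.remaining)
    return ranked[:pool_size]


def job_features(job, now):
    """Hypothetical two-dimensional job key: time to due date, remaining work."""
    return [job.due - now, job.remaining]


def attention_probs(machine_query, pool, now):
    """Scaled dot-product score of each pooled job against a machine query,
    followed by a softmax over the pool."""
    d = len(machine_query)
    scores = [
        sum(q * k for q, k in zip(machine_query, job_features(job, now))) / math.sqrt(d)
        for job in pool
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def total_tardiness(completion_times, due_dates):
    """Sum of max(0, completion - due) over all jobs."""
    return sum(max(0.0, c - d) for c, d in zip(completion_times, due_dates))


if __name__ == "__main__":
    jobs = [
        Job("A", due=10.0, remaining=4.0),
        Job("B", due=6.0, remaining=5.0),
        Job("C", due=20.0, remaining=3.0),
        Job("D", due=8.0, remaining=2.0),
    ]
    pool = build_pool(jobs, now=0.0, pool_size=3)
    # Hypothetical fixed query: a negative weight on time-to-due favors urgent jobs.
    probs = attention_probs([-0.5, 0.1], pool, now=0.0)
    chosen = pool[probs.index(max(probs))]
    print([j.name for j in pool], probs, chosen.name)
```

During training a policy of this kind would typically sample a job from the probabilities, while greedy selection (as in the demo) is one option at deployment time. Selecting jobs directly from the pool, instead of choosing among dispatching rules, is what lets the policy reach schedules that no single rule would produce.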