数学优化
拉格朗日松弛
马尔可夫决策过程
启发式
计算机科学
时间范围
维数之咒
线性规划松弛
线性规划
调度(生产过程)
动态规划
水准点(测量)
随机规划
数学
马尔可夫过程
统计
机器学习
大地测量学
地理
作者
Abderrahmane Abbou,Viliam Makis
出处
期刊:Informs Journal on Computing
日期:2019-06-19
卷期号:31 (4): 719-731
被引量:32
标识
DOI:10.1287/ijoc.2018.0863
摘要
We consider a maintenance planner problem to dynamically allocate the available repairmen to a system of unreliable production facilities. Each facility has several machines that incur a linear production loss due to stochastic degradation, which we model as a continuous time Markov process with fully observable states. The objective is to schedule group maintenance interventions, in discrete time epochs, so as to minimize production losses over an infinite horizon. Direct solution procedures, such as dynamic programming value or policy iteration, are impractical due to the curse of dimensionality. An approximate scheduling procedure is developed following Whittle’s restless bandits approach. In particular, we decompose the Whittle’s relaxation of our scheduling problem by production facility (i.e., bandit) using the Lagrangian technique. Based on the structural investigation of a single-bandit problem, we prove indexability and propose a novel index computational algorithm. Our numerical study shows that, for systems with three or four facilities, the index policy has a near-zero optimality gap. For systems with 10 or more facilities, the index policy expected cost remains fairly close to a lower bound that we compute using the known linear programming (LP) formulation of Whittle’s relaxation. Furthermore, the numerical study also shows that our policy yields substantial expected cost improvements relative to a benchmark LP-based heuristic when the states are partially observable and can handle large-scale systems unlike LP-based heuristics, which have excessive memory requirements.
科研通智能强力驱动
Strongly Powered by AbleSci AI