Markov decision process
Computer science
Curse of dimensionality
Dynamic programming
Mathematical optimization
Bellman equation
Queueing theory
Admission control
Function (biology)
Operations research
Markov process
Mathematics
Algorithm
Artificial intelligence
Evolutionary biology
Biology
Statistics
Quality of service
Computer network
Identifier
DOI:10.1287/msom.2018.0730
Abstract
Problem definition: Inpatient beds are usually grouped into several wards, and each ward is assigned to serve patients from certain “primary” specialties. However, when a patient waits excessively long before a primary bed becomes available, hospital managers have the option to assign her to a nonprimary bed, although this is undesirable. Deciding when to use such “overflow” is difficult in real time and under uncertainty. Relevance: To aid the decision making, we model hospital inpatient flow as a multiclass, multipool parallel-server queueing system and formulate the overflow decision problem as a discrete-time, infinite-horizon average-cost Markov decision process (MDP). The MDP incorporates many realistic and important features, such as patient arrival and discharge patterns that depend on the time of day. Methodology: To overcome the curse of dimensionality of this formulated MDP, we resort to approximate dynamic programming (ADP). A critical part of designing an ADP algorithm is choosing appropriate basis functions to approximate the relative value function. Using a novel combination of fluid control and single-pool approximation, we develop analytical forms to approximate the relative value functions at midnight, which then guide the choice of the basis functions for all other times of day. Results: We demonstrate, via numerical experiments in realistic hospital settings, that our proposed ADP algorithm is remarkably effective in finding good overflow policies. These ADP policies can significantly improve system performance over some commonly used overflow strategies—for example, in a baseline scenario, the ADP policy achieves a congestion level similar to that achieved by a complete bed sharing policy, while reducing the overflow proportion by 20%. Managerial implications: We quantify the trade-off between the overflow proportion and congestion from implementing ADP policies under a variety of system conditions and generate useful insights.
The plotted efficient frontiers allow managers to observe various performance measures in different parameter regimes, and the ADP policies provide managers with operational strategies to achieve the desired performance.
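To illustrate the average-cost MDP machinery the abstract refers to, the sketch below solves a toy overflow problem by relative value iteration. This is an illustrative stand-in, not the paper's multiclass, multipool model: the state is a single queue of patients waiting for a primary bed, the action is whether to overflow one waiting patient, and all parameter values are hypothetical.

```python
# Toy single-pool overflow MDP, solved by relative value iteration.
# NOT the paper's model: state n = number of patients waiting for a
# primary bed; action u = 1 overflows one waiting patient to a
# nonprimary bed. All parameters below are hypothetical.
N = 20                      # queue truncation level
lam, mu = 0.45, 0.40        # per-period arrival / primary-discharge probabilities
h_wait, c_over = 1.0, 3.0   # waiting cost per patient-period, overflow cost

def transitions(n, u):
    """(probability, next state) pairs after overflowing u patients in state n."""
    m = n - u
    out = []
    for a, pa in ((1, lam), (0, 1.0 - lam)):    # Bernoulli arrival
        for d, pd in ((1, mu), (0, 1.0 - mu)):  # Bernoulli discharge
            out.append((pa * pd, min(max(m + a - d, 0), N)))
    return out

def one_step_cost(n, u, h):
    """Immediate cost plus expected relative value of the next state."""
    return (h_wait * (n - u) + c_over * u
            + sum(p * h[s] for p, s in transitions(n, u)))

def relative_value_iteration(iters=2000):
    """Relative value iteration for the infinite-horizon average-cost MDP."""
    h = [0.0] * (N + 1)
    for _ in range(iters):
        new = [min(one_step_cost(n, u, h) for u in range(2 if n else 1))
               for n in range(N + 1)]
        ref = new[0]                 # normalize at reference state n = 0
        h = [v - ref for v in new]
    return h

h = relative_value_iteration()
policy = [min(range(2 if n else 1), key=lambda u: one_step_cost(n, u, h))
          for n in range(N + 1)]
print(policy)  # greedy policy w.r.t. the relative value function
```

In this toy instance the greedy policy is of threshold type: wait while the queue is short and overflow once it exceeds a level set by the cost trade-off. The exact relative value function `h` can be tabulated here only because the state space is tiny; the paper's ADP approach instead approximates the relative value function by a linear combination of basis functions, since enumerating the full multipool state space is infeasible.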