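Reinforcement learning
Computer science
Time horizon
Mathematical optimization
Inventory control
Monotonic function
Curse of dimensionality
Artificial intelligence
Operations research
Mathematics
Mathematical analysis
Authors
Rui Wang, Xianghua Gan, Qing Li, Yan Xiao
Source
Journal: Complexity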
[Hindawi Publishing Corporation]
Date: 2021-01-01
Volume (issue): 2021 (1)
Citations: 19
Abstract
We study a joint pricing and inventory control problem for perishables with positive lead time in a finite horizon periodic‐review system. Unlike most studies considering a continuous density function of demand, in our paper the customer demand depends on the price of the current period and arrives according to a homogeneous Poisson process. We consider both backlogging and lost‐sales cases, and our goal is to find a joint ordering and pricing policy to maximize the expected discounted profit over the planning horizon. When there is no fixed ordering cost involved, we design a deep reinforcement learning algorithm to obtain a near‐optimal ordering policy and show that there are some monotonicity properties in the learned policy. We also show that our deep reinforcement learning algorithm achieves better performance than tabular‐based Q‐learning algorithms. When a fixed ordering cost is involved, we show that our deep reinforcement learning algorithm is effective and efficient, under which the problem of the “curse of dimensionality” is circumvented.
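To make the problem setting concrete, the following is a minimal sketch of one review period for a perishable item under lost sales, with price-dependent Poisson demand and a positive lead time. It is not the paper's model or algorithm: the linear price response in `demand_rate`, the cost parameters, the oldest-first issuing rule, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def poisson_sample(rate, rng):
    """Draw one Poisson(rate) variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def demand_rate(price, base_rate=20.0, sensitivity=1.5):
    """Hypothetical linear price response: mean demand falls as price rises."""
    return max(base_rate - sensitivity * price, 0.0)


def step(on_hand, pipeline, order_qty, price, demand,
         unit_cost=2.0, holding_cost=0.1, disposal_cost=0.5):
    """Advance one review period under lost sales (illustrative assumptions).

    on_hand:  units in stock grouped by age, oldest first (length = shelf life)
    pipeline: outstanding orders, next arrival first (length = lead time)
    Returns (new_on_hand, new_pipeline, profit).
    """
    # Serve demand from the oldest stock first; unmet demand is lost.
    remaining = demand
    stock = list(on_hand)
    for i in range(len(stock)):
        used = min(stock[i], remaining)
        stock[i] -= used
        remaining -= used
    sales = demand - remaining

    # Leftover oldest units expire; the rest age by one period,
    # and the earliest outstanding order arrives as the freshest stock.
    expired = stock[0]
    new_on_hand = stock[1:] + [pipeline[0]]
    new_pipeline = list(pipeline[1:]) + [order_qty]

    # Revenue minus ordering, holding (on end-of-period stock), and disposal costs.
    profit = (price * sales - unit_cost * order_qty
              - holding_cost * sum(new_on_hand) - disposal_cost * expired)
    return new_on_hand, new_pipeline, profit


if __name__ == "__main__":
    rng = random.Random(0)
    price = 4.0
    demand = poisson_sample(demand_rate(price), rng)
    print(step([3, 5], [4], order_qty=6, price=price, demand=demand))
```

In this sketch the state is the pair (`on_hand`, `pipeline`), whose size grows with both shelf life and lead time. That growth is what makes a tabular method impractical for larger instances and motivates a function approximator such as a deep network.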