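Regret
Nonparametric statistics
Computer science
Decision-maker
Time horizon
Mathematical optimization
Sequence (biology)
Product (mathematics)
Matching (statistics)
A priori and a posteriori
Inventory control
Mathematical economics
Mathematics
Operations research
Econometrics
Statistics
Philosophy
Geometry
Epistemology
Machine learning
Biology
Genetics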
Authors
Cong Yang,Woonghee Tim Huh
Identifier
DOI:10.1177/10591478241231858
Abstract
We consider a periodic-review single-product multi-echelon inventory problem with instantaneous replenishment. In each period, the decision-maker makes ordering decisions for all echelons. Any unsatisfied demand is back-ordered, and any excess inventory is carried to the next period. In contrast to the classic inventory literature, we assume that the information of the demand distribution is not known a priori, and the decision-maker observes demand realizations over the planning horizon. We propose a nonparametric algorithm that generates a sequence of adaptive ordering decisions based on the stochastic gradient descent method. We compare the T-period cost of our algorithm to the clairvoyant, who knows the underlying demand distribution in advance, and we prove that the expected T-period regret is at most [Formula: see text], matching a lower bound for this problem.
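To illustrate the general idea of adapting order-up-to levels by stochastic gradient descent on observed demand, here is a minimal single-echelon sketch. It is not the paper's multi-echelon algorithm. The function name `sgd_base_stock`, the cost parameters `h` and `b`, the cap `y_max`, and the step-size rule are all assumptions made for illustration, and the sketch further assumes that the target level can always be reached in each period.

```python
import random


def sgd_base_stock(demands, h=1.0, b=3.0, y_max=100.0, y0=0.0):
    """Hypothetical single-echelon illustration (not the paper's method).

    Adapts an order-up-to level y by projected stochastic subgradient
    descent on the per-period cost h*(y - d)^+ + b*(d - y)^+, using only
    observed demands d and no knowledge of the demand distribution.
    """
    y = y0
    levels = []
    total_cost = 0.0
    # Assumed step-size scale: domain width over the largest subgradient.
    step_scale = y_max / max(h, b)
    for t, d in enumerate(demands, start=1):
        levels.append(y)
        # Holding cost on leftover stock, backorder cost on unmet demand.
        total_cost += h * max(y - d, 0.0) + b * max(d - y, 0.0)
        # Subgradient of the period cost with respect to y.
        grad = h if y > d else -b
        # Step size shrinking like 1/sqrt(t), then projection onto [0, y_max].
        y = min(max(y - step_scale / t ** 0.5 * grad, 0.0), y_max)
    return levels, total_cost


if __name__ == "__main__":
    rng = random.Random(0)
    demands = [rng.uniform(0.0, 100.0) for _ in range(5000)]
    levels, cost = sgd_base_stock(demands)
    print("average of last 1000 levels:", sum(levels[-1000:]) / 1000)
    print("average cost per period:", cost / len(demands))
```

Under these assumed costs and uniform demand on [0, 100], a clairvoyant would order up to the b/(h+b) = 0.75 quantile of demand, which is 75, so the adaptive levels should drift toward that value as more demand is observed.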