Subgradient method
Regret
Mathematical optimization
Computer science
A priori and a posteriori
Time horizon
Online algorithm
Markov decision process
Dynamic programming
Order (exchange)
Inventory control
Joint (building)
Stochastic programming
Upper and lower bounds
Joint probability distribution
Linear programming
Resource allocation
Probability distribution
Stochastic optimization
Operations research
Class (philosophy)
Robust optimization
Decision problem
Size
Optimization problem
Block (permutation group theory)
Work (physics)
Authors
Jingwen Tang,Izak Duenyas,Cong Shi,Nan Yang
Identifier
DOI: 10.1287/msom.2024.0974
Abstract
Problem definition: We consider the joint optimization of ordering and upgrading decisions in a dynamic multiproduct system over a finite horizon of T periods. In each period, multiple types of demand arrive stochastically and can be satisfied either with supply of the same type or by upgrading to a higher-quality product. The goal is to find an optimal joint replenishment and allocation policy that maximizes total expected profit, both when the firm knows the demand distributions a priori and when the firm must learn them over time.

Methodology/results: We first characterize the structure of the clairvoyant optimal joint ordering and allocation policy. Building on this structure, we propose a new online learning algorithm, termed stochastic subgradient descent with perturbed subgradient (SGD-PG for short), and show that it achieves cumulative regret growing on the order of the square root of T, which matches the known lower bound for any online learning method. We further show that SGD-PG can be extended to a nested censored demand setting. In the course of the algorithmic design, we propose a linear programming (LP)-based approach to compute the subgradient and prove that it produces the same output as the perturbed subgradient method. The LP-based method also allows us to extend the results to general upgrading structures. We demonstrate the efficacy of the proposed algorithms in numerical experiments.

Managerial implications: This work provides practitioners with the optimal policy for inventory replenishment and allocation in a multiproduct system with upgrading. When the demand distribution is unknown, we propose an easy-to-implement and provably good algorithm for demand learning. In addition, our numerical results quantify the value of optimal upgrading and identify the conditions under which upgrading is most beneficial.

Funding: This research was partially supported by an Amazon research award.

Supplemental Material: The e-companion is available at https://doi.org/10.1287/msom.2024.0974.
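The abstract's core algorithmic idea, a per-period allocation problem solved as an LP whose duals feed a stochastic subgradient update of the order quantities, can be illustrated with a small sketch. The code below is not the paper's SGD-PG: it ignores inventory carryover, demand censoring, and the perturbation step, and every parameter (costs, revenues, the Poisson demand model, the step-size constant) is hypothetical. It only shows the generic pattern of projected stochastic subgradient ascent with an LP-based subgradient and O(1/sqrt(t)) step sizes.

```python
# Minimal sketch (assumptions throughout): projected stochastic subgradient ascent on
# order-up-to levels y, where each period's allocation-with-upgrading problem is a small
# LP and the duals of its supply constraints give a subgradient of revenue in y.
# This is NOT the paper's SGD-PG: there is no inventory carryover, no demand censoring,
# and no perturbation; prices, costs, and the Poisson demand model are made up.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

N = 3                                   # product/demand types; index 0 = highest quality (assumed)
c = np.array([5.0, 4.0, 3.0])           # unit ordering costs (hypothetical)
r = np.array([9.0, 7.5, 6.0])           # unit revenues by demand type (hypothetical)
u_cost = 0.5                            # extra handling cost per upgraded unit (hypothetical)

# upgrade[i, j]: product i may serve demand j iff i is of equal or higher quality (i <= j)
upgrade = np.triu(np.ones((N, N), dtype=bool))

def allocation_lp(y, d):
    """max sum_ij p_ij x_ij  s.t.  sum_j x_ij <= y_i,  sum_i x_ij <= d_j,  x >= 0.
    Returns (revenue, subgradient of revenue w.r.t. y) read off the supply duals."""
    offdiag = np.arange(N)[:, None] != np.arange(N)[None, :]
    p = r[None, :] - u_cost * offdiag                 # per-unit profit of serving j with i
    A_supply = np.kron(np.eye(N), np.ones((1, N)))    # rows: sum_j x_ij <= y_i
    A_demand = np.kron(np.ones((1, N)), np.eye(N))    # rows: sum_i x_ij <= d_j
    bounds = [(0, None) if upgrade[i, j] else (0, 0)  # forbid downgrades by fixing x_ij = 0
              for i in range(N) for j in range(N)]
    res = linprog(-p.ravel(),                         # linprog minimizes, so negate
                  A_ub=np.vstack([A_supply, A_demand]),
                  b_ub=np.concatenate([y, d]),
                  bounds=bounds, method="highs")
    revenue = -res.fun
    grad_y = -res.ineqlin.marginals[:N]               # d(revenue)/d(y_i) from supply duals
    return revenue, grad_y

T = 2000
y = np.full(N, 5.0)                     # initial order-up-to levels
y_max = 60.0                            # assumed a priori bound used for the projection
total_profit = 0.0
for t in range(1, T + 1):
    d = rng.poisson(lam=[8.0, 12.0, 20.0])            # demand draw, unknown to the learner
    revenue, grad_rev = allocation_lp(y, d)
    total_profit += revenue - c @ y                   # simplified: full amount reordered each period
    subgrad = grad_rev - c                            # subgradient of -c.y + P(y, d) at y
    y = np.clip(y + (2.0 / np.sqrt(t)) * subgrad, 0.0, y_max)  # O(1/sqrt(t)) step, projected

print("learned order-up-to levels:", np.round(y, 2))
print("average per-period profit:", round(total_profit / T, 2))
```

On the dual-based subgradient: in SciPy's HiGHS interface, res.ineqlin.marginals reports the sensitivity of the minimized objective to the inequality right-hand sides, so negating the supply-row marginals yields a subgradient of the maximized allocation revenue with respect to the available inventory. How this relates to the paper's perturbed-subgradient construction, and its extension to censored demand and general upgrading structures, is exactly what the paper establishes.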