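Psychological intervention
Reinforcement learning
Computer science
Randomized controlled trial
Randomized experiment
Complementarity (molecular biology)
Machine learning
Bayesian probability
Artificial intelligence
Psychology
Mathematics
Medicine
Statistics
Surgery
Psychiatry
Biology
Genetics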
Authors
Yicheng Song, Tianshu Sun
Source
Journal: Management Science
[Institute for Operations Research and the Management Sciences]
Date: 2023-10-04
Volume (issue): 70 (8): 5115-5130
Citations: 12
Identifiers
DOI: 10.1287/mnsc.2023.4914
Abstract
Firms adopt randomized experiments to evaluate various interventions (e.g., website design, creative content, and pricing). However, most randomized experiments are designed to identify the impact of one specific intervention. The literature on randomized experiments lacks a holistic approach to optimize a sequence of interventions along the customer journey. Specifically, locally optimal interventions unveiled by randomized experiments might be globally suboptimal when considering their interdependence as well as the long-term rewards. Fortunately, the accumulation of a large number of historical experiments creates exogenous interventions at different stages along the customer journey and provides a new opportunity. This study integrates multiple experiments within the reinforcement learning (RL) framework to tackle the questions that cannot be answered by stand-alone randomized experiments. How can we learn optimal policy with a sequence of interventions along the customer journey based on an ensemble of historical experiments? Additionally, how can we learn from multiple historical experiments to guide future intervention trials? We propose a Bayesian recurrent Q-network model that leverages the exogenous interventions from multiple experiments to learn their effectiveness at different stages of the customer journey and optimize them for long-term rewards. Beyond optimization within the existing interventions, the Bayesian model also estimates the distribution of rewards, which can guide subject allocation in the design of future experiments to optimally balance exploration and exploitation. In summary, the proposed model creates a two-way complementarity between RL and randomized experiments, and thus, it provides a holistic approach to learning and optimizing interventions along the customer journey.
This paper was accepted by Anindya Ghose, information systems.
Funding: This work was supported by Adobe Faculty Research Award and the Marketing Science Institute Research Grant.
Supplemental Material: The data files and online appendix are available at https://doi.org/10.1287/mnsc.2023.4914.
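Illustrative sketch (not from the paper): the abstract notes that an estimated distribution of rewards can guide subject allocation in future experiments so as to balance exploration and exploitation. One simple way to picture that idea is Thompson-style allocation over independent arms. The code below is a deliberately simplified stand-in, assuming a Beta-Bernoulli model per intervention rather than the authors' Bayesian recurrent Q-network; the function name `allocation_shares`, the uniform prior, and the outcome counts are all hypothetical.

```python
import random


def allocation_shares(successes, failures, draws=10_000, seed=0):
    """Estimate, for each arm, the posterior probability that it is the best arm.

    Assumption: each arm has a Bernoulli reward with a uniform Beta(1, 1) prior,
    so its posterior is Beta(1 + successes, 1 + failures). This is a simplified
    stand-in, not the model described in the abstract.
    """
    rng = random.Random(seed)
    wins = [0] * len(successes)
    for _ in range(draws):
        # Draw one plausible reward rate per arm from its posterior.
        samples = [
            rng.betavariate(1 + s, 1 + f) for s, f in zip(successes, failures)
        ]
        # Credit the arm whose sampled rate is highest in this draw.
        wins[samples.index(max(samples))] += 1
    return [w / draws for w in wins]


# Hypothetical outcomes pooled from past experiments (made-up numbers):
# arms 0 and 1 were tested on 200 subjects each, arm 2 on only 20.
successes = [30, 45, 5]
failures = [170, 155, 15]

shares = allocation_shares(successes, failures)
for arm, share in enumerate(shares):
    print(f"arm {arm}: allocate about {share:.1%} of new subjects")
```

In this sketch, an arm's share of new subjects is the estimated probability that it is the best arm. An arm with little data keeps a wide posterior and can therefore still receive a meaningful share (exploration), while an arm that is clearly worse on ample data receives very little (exploitation).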