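Reinforcement learning
Computer science
Operator (biology)
Operations research
Profit (economics)
Fleet management
Retraining
Value (mathematics)
Service (business)
Artificial neural network
Mathematical optimization
Artificial intelligence
Economy
Engineering
Microeconomics
Machine learning
Mathematics
Business
Marketing
Telecommunications
Repressor
International trade
Chemistry
Gene
Transcription factor
Biochemistry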
Authors
Nicholas Kullman, Martin Cousineau, Justin C. Goodson, Jorge E. Mendoza
Source
Journal: Transportation Science
[Institute for Operations Research and the Management Sciences]
Date: 2021-07-29
Volume (issue): 56 (3): 775-794
Citations: 58
Identifier
DOI: 10.1287/trsc.2021.1042
Abstract
We consider the problem of an operator controlling a fleet of electric vehicles for use in a ride-hailing service. The operator, seeking to maximize profit, must assign vehicles to requests as they arise as well as recharge and reposition vehicles in anticipation of future requests. To solve this problem, we employ deep reinforcement learning, developing policies whose decision making uses Q-value approximations learned by deep neural networks. We compare these policies against a reoptimization-based policy and against dual bounds on the value of an optimal policy, including the value of an optimal policy with perfect information, which we establish using a Benders-based decomposition. We assess performance on instances derived from real data for the island of Manhattan in New York City. We find that, across instances of varying size, our best policy trained with deep reinforcement learning outperforms the reoptimization approach. We also provide evidence that this policy may be effectively scaled and deployed on larger instances without retraining.