Two-Sided Deep Reinforcement Learning for Dynamic Mobility-on-Demand Management with Mixed Autonomy

强化学习重新安置计算机科学运筹学车队管理可扩展性操作员（生物学）分布式计算人工智能工程类生物化学电信数据库转录因子基因抑制因子化学程序设计语言

作者

Jiaohong Xie,Yang Liu,Nan Chen

出处

期刊：Transportation Science [Institute for Operations Research and the Management Sciences]
日期：2023-01-17 卷期号：57 (4): 1019-1046 被引量：42

标识

DOI：10.1287/trsc.2022.1188

摘要

Autonomous vehicles (AVs) are expected to operate on mobility-on-demand (MoD) platforms because AV technology enables flexible self-relocation and system-optimal coordination. Unlike the existing studies, which focus on MoD with pure AV fleet or conventional vehicles (CVs) fleet, we aim to optimize the real-time fleet management of an MoD system with a mixed autonomy of CVs and AVs. We consider a realistic case that heterogeneous boundedly rational drivers may determine and learn their relocation strategies to improve their own compensation. In contrast, AVs are fully compliant with the platform’s operational decisions. To achieve a high level of service provided by a mixed fleet, we propose that the platform prioritizes human drivers in the matching decisions when on-demand requests arrive and dynamically determines the AV relocation tasks and the optimal commission fee to influence drivers’ behavior. However, it is challenging to make efficient real-time fleet management decisions when spatiotemporal uncertainty in demand and complex interactions among human drivers and operators are anticipated and considered in the operator’s decision making. To tackle the challenges, we develop a two-sided multiagent deep reinforcement learning (DRL) approach in which the operator acts as a supervisor agent on one side and makes centralized decisions on the mixed fleet, and each CV driver acts as an individual agent on the other side and learns to make decentralized decisions noncooperatively. We establish a two-sided multiagent advantage actor-critic algorithm to simultaneously train different agents on the two sides. For the first time, a scalable algorithm is developed here for mixed fleet management. Furthermore, we formulate a two-head policy network to enable the supervisor agent to efficiently make multitask decisions based on one policy network, which greatly reduces the computational time. The two-sided multiagent DRL approach is demonstrated using a case study in New York City using real taxi trip data. Results show that our algorithm can make high-quality decisions quickly and outperform benchmark policies. The efficiency of the two-head policy network is demonstrated by comparing it with the case using two separate policy networks. Our fleet management strategy makes both the platform and the drivers better off, especially in scenarios with high demand volume. History: This paper has been accepted for the Transportation Science Special Issue on Emerging Topics in Transportation Science and Logistics. Funding: This work was supported by the Singapore Ministry of Education Academic Research [Grant MOE2019-T2-2-165] and the Singapore Ministry of Education [Grant R-266-000-135-114].

求助该文献

最长约 10秒，即可获得该文献文件

Two-Sided Deep Reinforcement Learning for Dynamic Mobility-on-Demand Management with Mixed Autonomy

今日热心研友