An End-to-End Reinforcement Learning Based Approach for Micro-View Order-Dispatching in Ride-Hailing

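Reinforcement learning; Computer science; Order (exchange); End-to-end principle; Artificial intelligence; Operations research; Engineering; Business; Finance
Authors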
Xinlang Yue,Yiran Liu,Fangzhou Shi,Sihong Luo,Chen Zhong,M. Lu,Zhe Xu
Identifier
DOI:10.1145/3627673.3680013
Abstract

Assigning orders to drivers under localized spatiotemporal context (micro-view order-dispatching) is a major task at Didi, as it influences the ride-hailing service experience. Existing industrial solutions mainly follow a two-stage pattern that incorporates heuristic or learning-based algorithms with naive combinatorial methods, tackling the uncertainty of both sides' behaviors, including emerging timings, spatial relationships, and travel duration. In this paper, we propose a one-stage, end-to-end, reinforcement learning based order-dispatching approach that solves behavior prediction and combinatorial optimization uniformly in a sequential decision-making manner. Specifically, we employ a two-layer Markov Decision Process framework to model this problem, and present the Deep Double Scalable Network (D2SN), an encoder-decoder network that generates order-driver assignments directly and stops assigning accordingly. In addition, by leveraging contextual dynamics, our approach can adapt to behavioral patterns for better performance. Extensive experiments on Didi's real-world benchmarks demonstrate that the proposed approach significantly outperforms competitive baselines in optimizing matching efficiency and user experience tasks. We also evaluate the deployment outline and discuss the gains and experiences obtained during the deployment tests from the perspective of large-scale engineering implementation.
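The sequential decision-making idea in the abstract (emit one order-driver assignment per step, then choose to stop) can be illustrated with a minimal Python sketch. This is not the authors' D2SN: the function `sequential_dispatch`, the `score` callable, and the `stop_score` threshold are hypothetical stand-ins for what a learned network would output, and the toy distance-based score is an assumption made only for illustration.

```python
from typing import Callable, List, Set, Tuple

Pair = Tuple[int, int]  # (order_id, driver_id)


def sequential_dispatch(
    orders: List[int],
    drivers: List[int],
    score: Callable[[int, int], float],
    stop_score: float,
) -> List[Pair]:
    """Pick one order-driver pair per step until the stop action wins.

    `score` and `stop_score` are hypothetical placeholders for the values
    a trained policy would produce; here they are supplied by the caller.
    """
    free_orders: Set[int] = set(orders)
    free_drivers: Set[int] = set(drivers)
    assignments: List[Pair] = []
    while free_orders and free_drivers:
        # Inner decision: choose the highest-scoring remaining pair.
        best = max(
            ((o, d) for o in sorted(free_orders) for d in sorted(free_drivers)),
            key=lambda p: score(p[0], p[1]),
        )
        # Stop action: no remaining pair beats the stop threshold, so the
        # unassigned orders and drivers wait for a later decision round.
        if score(best[0], best[1]) <= stop_score:
            break
        assignments.append(best)
        free_orders.remove(best[0])
        free_drivers.remove(best[1])
    return assignments


# Toy usage: score is the negative pickup distance on a line (an assumption).
order_pos = {0: 1.0, 1: 5.0}
driver_pos = {0: 1.5, 1: 9.0}


def toy_score(order_id: int, driver_id: int) -> float:
    return -abs(order_pos[order_id] - driver_pos[driver_id])


result = sequential_dispatch([0, 1], [0, 1], toy_score, stop_score=-3.0)
print(result)
```

In this toy run the first step assigns order 0 to driver 0, and the second step stops because the only remaining pair scores below the threshold, leaving order 1 unassigned for a later round. A learned policy would replace both the greedy `max` and the fixed threshold.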