Sampling Big Trajectory Data for Traversal Trajectory Aggregate Query

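Tree traversal, Computer science, Trajectory, Sampling (signal processing), Traverse, Estimator, Query optimization, Data mining, Variance (accounting), Aggregate (composite), Algorithm, Statistics, Mathematics, Accounting, Geodesy, Physics, Materials science, Filter (signal processing), Astronomy, Business, Composite material, Computer vision, Geography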

Authors

Yichen Ding,Yanhua Li,Xun Zhou,Zhuojie Huang,Simin You,Jun Luo

Source

Journal: IEEE Transactions on Big Data [IEEE Computer Society]
Date: 2018-04-30 Volume (issue): 5 (4): 550-563 Citations: 8

Identifiers

DOI: 10.1109/tbdata.2018.2830780

Abstract

This paper defines and investigates a novel trajectory query, namely, Traversal Trajectory Aggregate (TTA) Query: Given a trajectory database and a pair of upstream and downstream spatio-temporal (ST) regions (i.e., spatial area coupled with a time interval), a TTA query aims to retrieve the total number of unique trajectories that traverse through these two ST regions. Such TTA queries play an important role in various urban applications, such as route planning, taxi dispatching, and location-based advertising. Two baselines can answer such TTA queries: (a) exact search (over the entire ST query regions) can obtain the exact answer, but it leads to extremely long running time when the ST query regions are huge; (b) uniform-sampling-based approaches estimate the query answer with sampled trajectories. However, the uniform sampling distribution may lead to significant estimation variance for TTA query, because traversal trajectories are relatively few and unevenly distributed in the query regions. To tackle these challenges, this paper proposes a novel Targeted Index Sampling (TIS) framework to answer TTA queries with high estimation accuracy. TIS employs a two-stage framework, with a Pilot Sampling Estimation (PSE) stage to estimate the distribution of trajectories in ST query region, and an Integrated Importance Sampling (IIS) stage, which collects trajectories with the importance sampling distribution obtained in PSE, and estimates the query result with an asymptotically unbiased estimator. Extensive experiments and case studies using a large-scale real taxi trajectory dataset from Shenzhen, China demonstrate that our TIS framework achieves < 10 percent estimation error with > 90 percent computational time reduction over exact search, and 50 percent reduction on estimation error (with similar running time) over uniform-distribution-based sampling approaches.
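
The two-stage idea in the abstract (a pilot stage that guesses where traversal trajectories concentrate, followed by importance sampling with a reweighted estimator) can be illustrated with a toy sketch. This is not the paper's TIS implementation. It assumes the upstream region is split into hypothetical non-empty index cells, that each trajectory is counted in exactly one cell, and that a with-replacement Hansen-Hurwitz estimator stands in for the paper's estimator. All names (`pilot_distribution`, `hansen_hurwitz`, `cells`) are invented for illustration.

```python
import random


def pilot_distribution(cells, pilot_size, rng, smoothing=0.5):
    """Pilot stage: inspect a few trajectories per cell and turn the
    observed hit rates into a sampling distribution over cells."""
    weights = []
    for flags in cells:
        sample = rng.sample(flags, min(pilot_size, len(flags)))
        # Scale the pilot hit rate by the cell size; the smoothing term
        # keeps every cell's probability strictly positive.
        weights.append(len(flags) * sum(sample) / len(sample) + smoothing)
    total = sum(weights)
    return [w / total for w in weights]


def hansen_hurwitz(counts, probs, n_draws, rng):
    """Estimate sum(counts) by drawing cells with replacement according
    to probs and averaging counts[i] / probs[i] over the draws."""
    indices = list(range(len(counts)))
    draws = rng.choices(indices, weights=probs, k=n_draws)
    return sum(counts[i] / probs[i] for i in draws) / n_draws


rng = random.Random(0)

# Toy data: 50 cells of 200 trajectories each. A flag of 1 means the
# trajectory also passes through the downstream region. Traversals are
# concentrated in the first 5 cells (an assumption made for the demo).
cells = [
    [1 if rng.random() < (0.4 if c < 5 else 0.01) else 0 for _ in range(200)]
    for c in range(50)
]
counts = [sum(flags) for flags in cells]  # exact per-cell traversal counts
true_total = sum(counts)

uniform = [1 / len(cells)] * len(cells)
targeted = pilot_distribution(cells, pilot_size=20, rng=rng)

est_uniform = hansen_hurwitz(counts, uniform, n_draws=10, rng=rng)
est_targeted = hansen_hurwitz(counts, targeted, n_draws=10, rng=rng)
print(true_total, round(est_uniform), round(est_targeted))
```

In this sketch the per-cell counts are precomputed for simplicity; in a real system, scanning a drawn cell would be the expensive step, which is why the number of draws is kept small. The estimator is unbiased whenever every cell containing traversals has positive probability, and its variance shrinks as the sampling distribution gets closer to being proportional to the true per-cell counts.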
