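Computer science
Computer vision
Artificial intelligence
Video tracking
Kalman filter
Tracking (education)
Benchmark (surveying)
Satellite
Motion (physics)
Object (grammar)
Exploit
Tracking system
Prior probability
Motion estimation
Object detection
Satellite imagery
Frame (networking)
Coding (set theory)
Match moving
Eye movement
Pixel
Track (disk drive)
State (computer science)
Motion detection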
Authors
Fan Ruijie, Ye Junyan, Chen Huan, Huang Zilong, Wang Xiao-lei, Li Weijia
Source
Journal: Cornell University - arXiv
Date: 2025-11-23
Identifier
DOI: 10.48550/arxiv.2511.18264
Abstract
Existing satellite video tracking methods often struggle with generalization, requiring scenario-specific training to achieve satisfactory performance, and are prone to track loss in the presence of occlusion. To address these challenges, we propose SatSAM2, a zero-shot satellite video tracker built on SAM2, designed to adapt foundation models to the remote sensing domain. SatSAM2 introduces two core modules: a Kalman Filter-based Constrained Motion Module (KFCMM) to exploit temporal motion cues and suppress drift, and a Motion-Constrained State Machine (MCSM) to regulate tracking states based on motion dynamics and reliability. To support large-scale evaluation, we propose MatrixCity Video Object Tracking (MVOT), a synthetic benchmark containing 1,500+ sequences and 157K annotated frames with diverse viewpoints, illumination, and occlusion conditions. Extensive experiments on two satellite tracking benchmarks and MVOT show that SatSAM2 outperforms both traditional and foundation model-based trackers, including SAM2 and its variants. Notably, on the OOTB dataset, SatSAM2 achieves a 5.84% AUC improvement over state-of-the-art methods. Our code and dataset will be publicly released to encourage further research.
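The abstract names a Kalman filter-based motion module and a state machine that regulates tracking states, but gives no implementation details. As a rough illustration of the general idea only, the sketch below runs a textbook constant-velocity Kalman filter on one image axis and switches between two states depending on a per-frame confidence score. It is not the authors' KFCMM or MCSM: the names `AxisKalman` and `track`, the `score_thresh` parameter, the two state labels, and all noise values are assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one image axis.

    State is (position, velocity); one time step equals one frame.
    All noise values are illustrative assumptions, not from the paper.
    """
    x: float                      # position estimate (pixels)
    v: float = 0.0                # velocity estimate (pixels per frame)
    p: list = field(default_factory=lambda: [[1.0, 0.0], [0.0, 100.0]])
    q: float = 1e-2               # process noise added to the diagonal
    r: float = 1.0                # measurement noise variance

    def predict(self):
        """Propagate the state by one frame: P <- F P F^T + Q, F = [[1, 1], [0, 1]]."""
        (p00, p01), (p10, p11) = self.p
        self.x += self.v
        self.p = [[p00 + p01 + p10 + p11 + self.q, p01 + p11],
                  [p10 + p11, p11 + self.q]]
        return self.x

    def update(self, z):
        """Correct the state with a position measurement z (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.p
        s = p00 + self.r                  # innovation variance
        k0, k1 = p00 / s, p10 / s         # Kalman gain
        y = z - self.x                    # innovation
        self.x += k0 * y
        self.v += k1 * y
        self.p = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.x


def track(measurements, scores, score_thresh=0.5):
    """Hypothetical two-state loop: 'tracked' when the detector score is
    reliable, 'coasting' on the motion prediction alone otherwise."""
    kf = AxisKalman(x=measurements[0])
    out = [("tracked", kf.x)]
    for z, s in zip(measurements[1:], scores[1:]):
        kf.predict()
        if s >= score_thresh:
            out.append(("tracked", kf.update(z)))
        else:
            # Measurement is ignored (e.g. occlusion); keep the prediction.
            out.append(("coasting", kf.x))
    return out


if __name__ == "__main__":
    # Target moving 2 px/frame, with frames 5 to 7 treated as occluded.
    zs = [2.0 * t for t in range(10)]
    scores = [1.0] * 5 + [0.0] * 3 + [1.0] * 2
    for frame, (state, est) in enumerate(track(zs, scores)):
        print(frame, state, round(est, 2))
```

In this toy setup the filter keeps extrapolating along the learned velocity while the score is low, which is the basic mechanism by which a motion prior can bridge a short occlusion. A real tracker would filter both axes (and typically box size), and would use richer reliability cues than a single threshold.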