Scalable Video Object Segmentation With Identification Mechanism

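Computer science, Scalability, Segmentation, Artificial intelligence, Object (grammar), Video tracking, Benchmark (surveying), Machine learning, Computer vision, Data mining, Database, Geodesy, Geography
Authors
Zongxin Yang, Jiaxu Miao, Yunchao Wei, Wenguan Wang, Xiaohan Wang, Yi Yang
Source
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [Institute of Electrical and Electronics Engineers]
Pages: 1-15
Identifier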
DOI:10.1109/tpami.2024.3383592
Abstract

This paper delves into the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object Segmentation (VOS). Previous VOS methods decode features with a single positive object, limiting the learning of multi-object representation as they must match and segment each target separately under multi-object scenarios. Additionally, earlier techniques catered to specific application objectives and lacked the flexibility to fulfill different speed-accuracy requirements. To address these problems, we present two innovative approaches, Associating Objects with Transformers (AOT) and Associating Objects with Scalable Transformers (AOST). In pursuing effective multi-object modeling, AOT introduces the IDentification (ID) mechanism to allocate each object a unique identity. This approach enables the network to model the associations among all objects simultaneously, thus facilitating the tracking and segmentation of objects in a single network pass. To address the challenge of inflexible deployment, AOST further integrates scalable long short-term transformers that incorporate scalable supervision and layer-wise ID-based attention. This enables online architecture scalability in VOS for the first time and overcomes ID embeddings' representation limitations. Given the absence of a benchmark for VOS involving densely multi-object annotations, we propose a challenging Video Object Segmentation in the Wild (VOSW) benchmark to validate our approaches. We evaluated various AOT and AOST variants using extensive experiments across VOSW and five commonly used VOS benchmarks, including YouTube-VOS 2018 & 2019 Val, DAVIS-2017 Val & Test, and DAVIS-2016. Our approaches surpass the state-of-the-art competitors and display exceptional efficiency and scalability consistently across all six benchmarks. Moreover, we notably achieved the 1st position in the 3rd Large-scale Video Object Segmentation Challenge.
Project page: https://github.com/yoxu515/aot-benchmark.
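
The abstract describes the ID mechanism only at a high level: each object receives a unique identity so that all objects can be handled in one network pass. The following is a minimal sketch of that idea, not the authors' implementation. The identity bank, the function names `assign_identities` and `embed_mask`, the zero vector for background pixels, and the toy sizes are all assumptions made for illustration.

```python
import random


def assign_identities(object_ids, bank_size, seed=0):
    """Give every object a distinct slot in a (hypothetical) identity bank."""
    if len(object_ids) > bank_size:
        raise ValueError("more objects than identity slots")
    rng = random.Random(seed)
    # Sampling without replacement guarantees that identities are unique.
    slots = rng.sample(range(bank_size), len(object_ids))
    return dict(zip(object_ids, slots))


def embed_mask(label_mask, identity_of, bank):
    """Turn a multi-object label mask into one per-pixel ID embedding.

    Assumption: label 0 is background and maps to a zero vector.
    """
    dim = len(bank[0])
    return [
        [
            list(bank[identity_of[label]]) if label != 0 else [0.0] * dim
            for label in row
        ]
        for row in label_mask
    ]


# Toy setup: a bank of 10 identity vectors with 4 dimensions each.
rng = random.Random(1)
bank = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(10)]

# A 2x3 label mask containing two objects (labels 1 and 2).
mask = [[0, 1, 1],
        [2, 2, 0]]

identity_of = assign_identities([1, 2], bank_size=len(bank))
id_embedding = embed_mask(mask, identity_of, bank)
```

Because both objects live in the same `id_embedding`, a downstream network could in principle read all targets from a single tensor rather than processing one mask per object, which is the property the abstract attributes to the ID mechanism.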