Keywords
Computer science; Artificial intelligence; Computer vision; Feature extraction; Feature matching; Pattern recognition; Pixel; Detector; Margin (machine learning); Transformer; Matching (statistics); Machine learning
Authors
Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, Xiaowei Zhou
Identifier
DOI:10.1109/cvpr46437.2021.00881
Abstract
We present a novel method for local image feature matching. Instead of performing image feature detection, description, and matching sequentially, we propose to first establish pixel-wise dense matches at a coarse level and later refine the good matches at a fine level. In contrast to dense methods that use a cost volume to search correspondences, we use self and cross attention layers in Transformer to obtain feature descriptors that are conditioned on both images. The global receptive field provided by Transformer enables our method to produce dense matches in low-texture areas, where feature detectors usually struggle to produce repeatable interest points. The experiments on indoor and outdoor datasets show that LoFTR outperforms state-of-the-art methods by a large margin. LoFTR also ranks first on two public benchmarks of visual localization among the published methods. Code is available at our project page: https://zju3dv.github.io/loftr/.
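The interleaved self- and cross-attention scheme described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation (LoFTR's actual layers differ in attention variant, multi-head structure, and depth); the function names and toy dimensions are illustrative assumptions. Self-attention gives each coarse-grid descriptor a global receptive field within its own image, while cross-attention conditions descriptors of one image on the other.

```python
import numpy as np

def attention(q, k, v):
    # Scaled dot-product attention: every query position attends
    # to all key positions, then aggregates the values.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def self_and_cross_update(feat_a, feat_b):
    # Self-attention within each image: descriptors see the whole
    # image, which helps in low-texture regions.
    feat_a = feat_a + attention(feat_a, feat_a, feat_a)
    feat_b = feat_b + attention(feat_b, feat_b, feat_b)
    # Cross-attention between images: each descriptor becomes
    # conditioned on both images, as the abstract describes.
    feat_a = feat_a + attention(feat_a, feat_b, feat_b)
    feat_b = feat_b + attention(feat_b, feat_a, feat_a)
    return feat_a, feat_b

# Toy example: 16 coarse-grid positions per image, 32-dim descriptors.
rng = np.random.default_rng(0)
fa = rng.normal(size=(16, 32))
fb = rng.normal(size=(16, 32))
fa2, fb2 = self_and_cross_update(fa, fb)
```

Coarse matches would then be taken between `fa2` and `fb2` (e.g. via a similarity matrix) and refined at the fine level.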