Computer science
Artificial intelligence
Convolutional neural network
Pattern recognition (psychology)
Transformer
Feature learning
Outlier
Matching (statistics)
Computer vision
Mathematics
Quantum mechanics
Statistics
Physics
Voltage
Authors
Jiaxuan Chen, Xiaoming Chen, Shuang Chen, Yuyan Liu, Yujing Rao, Yang Yang, Haifeng Wang, Dan Wu
Identifier
DOI:10.1016/j.inffus.2022.10.030
Abstract
As with any data fusion task, the front end of the image fusion pipeline, which aims to collect the manifold physical properties captured in multimodal images taken by different types of sensors, requires registering the overlapping content of two images via image matching. In other words, the accuracy of image matching directly influences the subsequent fusion results. In this work, we propose a hybrid correspondence learning architecture, termed Shape-Former, which is capable of solving matching problems such as multimodal and multiview cases. Existing attempts have trouble capturing the intricate feature interactions needed to find good correspondences when the image pairs simultaneously suffer from geometric and radiation distortion. To address this, our key idea is to take advantage of both convolutional neural networks (CNNs) and Transformers to enhance structure-consensus representation ability. Specifically, we introduce a novel ShapeConv so that CNNs and Transformers can be generalized to sparse match learning. Furthermore, we provide a robust soft outlier estimation mechanism for filtering the responses of outliers before capturing shape features. Finally, we also propose coupling multiple consensus representations to further resolve context-conflict problems such as local ambiguity. Experiments on a variety of datasets reveal that our Shape-Former outperforms the state of the art on multimodal image matching and shows promising generalization to different types of image deformations.
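The abstract's "soft estimation of outliers" can be pictured as assigning each putative correspondence a continuous inlier weight in (0, 1) and down-weighting likely outliers before any shape or context aggregation. Below is a minimal NumPy sketch of that idea only; the sigmoid gating form, the function names, and the random residuals are illustrative assumptions, not the paper's actual ShapeConv mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_outlier_weights(residuals, tau=1.0):
    # Sigmoid gate (illustrative): a small residual yields a weight
    # near 1 (likely inlier); a large residual yields a weight near 0.
    return 1.0 / (1.0 + np.exp(residuals / tau))

# 100 hypothetical putative correspondences, each a (x1, y1, x2, y2) pair.
matches = rng.normal(size=(100, 4))

# Per-match geometric residuals under some provisional model
# (random here, purely for illustration).
residuals = np.abs(rng.normal(size=100))

# Soft weights replace a hard inlier/outlier decision, so the
# downstream aggregation stays differentiable.
w = soft_outlier_weights(residuals)

# Crude stand-in for context aggregation: a weighted mean of the
# match coordinates, with likely outliers suppressed.
context = (w[:, None] * matches).sum(axis=0) / w.sum()
```

The design point this illustrates is that filtering responses *softly* keeps the whole pipeline end-to-end trainable, whereas a hard RANSAC-style rejection would not be.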