Multimodal image matching: A scale-invariant algorithm and an open dataset

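Artificial intelligence; Computer science; Scale-invariant feature transform; Computer vision; Pattern recognition (psychology); Scale space; Optical flow; Feature (linguistics); Pyramid (geometry); Matching (statistics); Synthetic aperture radar; Feature extraction; Mathematics; Image processing; Image (mathematics); Statistics; Philosophy; Linguistics; Geometry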

Authors

Jiayuan Li, Qingwu Hu, Yongjun Zhang

Source

Journal: ISPRS Journal of Photogrammetry and Remote Sensing  Date: 2023-09-11  Volume: 204, pp. 77-88  Citations: 59

Identifier

DOI: 10.1016/j.isprsjprs.2023.08.010

Abstract

Multimodal image matching is a core basis for information fusion, change detection, and image-based navigation. However, multimodal images may simultaneously suffer from severe nonlinear radiation distortion (NRD) and complex geometric differences, which pose great challenges to existing methods. Although deep learning-based methods have shown potential in image matching, they mainly focus on same-source images or on a single type of multimodal image pair, such as optical-synthetic aperture radar (SAR). One of the main obstacles is the lack of public data for different types of multimodal images. In this paper, we make two major contributions to the multimodal image matching community. First, we collect six typical types of images, including optical-optical, optical-infrared, optical-SAR, optical-depth, optical-map, and nighttime, to construct a multimodal image dataset with a total of 1200 pairs. This dataset has good diversity in image categories, feature classes, resolutions, geometric variations, etc. Second, we propose a scale and rotation invariant feature transform (SRIF) method, which achieves good matching performance without relying on data characteristics. This is one of the advantages of SRIF over deep learning methods. SRIF obtains the scales of FAST keypoints by projecting them into a simple pyramid scale space, a design based on the finding that methods with and without a scale space perform similarly under small scale change factors. This strategy greatly reduces complexity compared with a traditional Gaussian scale space. SRIF also proposes a local intensity binary transform (LIBT) for SIFT-like feature description, which can greatly enhance the structure information inside multimodal images. Extensive experiments on these 1200 image pairs show that SRIF outperforms current state-of-the-art methods, including RIFT, CoFSM, LNIFT, and MS-HLMO, by a large margin. Both the created dataset and the code of SRIF will be publicly available at https://github.com/LJY-RS/SRIF.
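
The idea of projecting keypoints into a simple pyramid scale space can be illustrated with a minimal sketch. This is not the SRIF implementation: the function names `build_pyramid` and `project_keypoints`, the 2x2 averaging, the downsampling ratio of 2, and the number of levels are all assumptions chosen for illustration, and the abstract does not specify these details. The linked repository is the authoritative reference. The sketch assumes keypoints are detected once at full resolution and then mapped into each coarser level, so no Gaussian scale space has to be built.

```python
def build_pyramid(image, num_levels=3):
    """Build a plain image pyramid (hypothetical helper, not SRIF code).

    `image` is a list of rows of numbers. Each new level halves the
    previous one by averaging non-overlapping 2x2 blocks.
    """
    levels = [image]
    for _ in range(1, num_levels):
        prev = levels[-1]
        h, w = len(prev) // 2, len(prev[0]) // 2
        if h == 0 or w == 0:
            break  # image too small to downsample further
        down = [
            [
                (prev[2 * r][2 * c] + prev[2 * r][2 * c + 1]
                 + prev[2 * r + 1][2 * c] + prev[2 * r + 1][2 * c + 1]) / 4.0
                for c in range(w)
            ]
            for r in range(h)
        ]
        levels.append(down)
    return levels


def project_keypoints(keypoints, num_levels, ratio=2.0):
    """Map full-resolution keypoints (x, y) into every pyramid level.

    Returns tuples (x_at_level, y_at_level, level, scale), where `scale`
    is the assumed downsampling factor of that level relative to level 0.
    """
    projected = []
    for x, y in keypoints:
        for level in range(num_levels):
            scale = ratio ** level
            projected.append((x / scale, y / scale, level, scale))
    return projected


# Usage: an 8x8 synthetic image and one keypoint standing in for a FAST detection.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image, num_levels=3)
candidates = project_keypoints([(4.0, 6.0)], num_levels=len(pyramid))
```

In a full pipeline, a descriptor would then be computed for each projected candidate at its own level, so that the same physical point is described at several scales. How SRIF actually selects or uses these scales is not stated in the abstract.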
