Artificial intelligence
Computer science
Orientation (vector space)
Aerial imagery
Benchmark (surveying)
Computer vision
Discriminative model
Similarity (geometry)
Pattern recognition (psychology)
Feature (linguistics)
Precision and recall
Matching (statistics)
Deep learning
Pairwise comparison
Pixel
Artificial neural network
Image (mathematics)
Geography
Mathematics
Cartography
Geometry
Philosophy
Statistics
Linguistics
Source
Journal: Cornell University - arXiv
Date: 2019-01-01
Citations: 3
Identifiers
DOI:10.48550/arxiv.1903.12351
Abstract
This paper studies the image-based geo-localization (IBL) problem using ground-to-aerial cross-view matching. The goal is to predict the spatial location of a ground-level query image by matching it against a large geotagged aerial image database (e.g., satellite imagery). This is a challenging task due to the drastic differences in viewpoint and visual appearance. Existing deep learning methods for this problem have focused on maximizing feature similarity between spatially close-by image pairs while minimizing it for image pairs that are far apart, using deep feature embeddings based on the visual appearance of the ground and aerial images. In everyday life, however, humans commonly use orientation information as an important cue for spatial localization. Inspired by this insight, this paper proposes a novel method that endows deep neural networks with the 'commonsense' of orientation. Given a ground-level spherical panoramic image as the query (and a large georeferenced satellite image database), we design a Siamese network that explicitly encodes the orientation (i.e., spherical direction) of each pixel of the images. Our method significantly boosts the discriminative power of the learned deep features, yielding much higher recall and precision and outperforming all previous methods. Our network is also more compact, using only one-fifth as many parameters as the previously best-performing network. To evaluate the generalization of our method, we also created a large-scale cross-view localization benchmark containing 100K geotagged ground-aerial pairs covering a city. Our code and datasets are available at https://github.com/Liumouliu/OriCNN.
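The abstract's core idea, concatenating per-pixel orientation channels to the inputs of a Siamese network trained with metric learning, can be sketched in code. The following is a minimal PyTorch illustration under stated assumptions, not the authors' OriCNN: the orientation_maps helper, the tiny Branch network, and the triplet loss are all hypothetical stand-ins; the actual architecture and orientation encodings (which differ between the ground and aerial views) are in the linked repository.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def orientation_maps(h, w):
    """Per-pixel orientation channels for an equirectangular panorama:
    azimuth (horizontal direction) and altitude (vertical angle).
    Illustrative stand-in for the paper's spherical-direction encoding."""
    az = torch.linspace(-math.pi, math.pi, w).expand(h, w)
    alt = torch.linspace(math.pi / 2, -math.pi / 2, h).unsqueeze(1).expand(h, w)
    return torch.stack([az, alt], dim=0)  # shape (2, h, w)

class Branch(nn.Module):
    """One Siamese branch: RGB plus 2 orientation channels in, embedding out."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(5, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, embed_dim)

    def forward(self, rgb):
        b, _, h, w = rgb.shape
        uv = orientation_maps(h, w).to(rgb.device)
        x = torch.cat([rgb, uv.unsqueeze(0).expand(b, -1, -1, -1)], dim=1)
        return F.normalize(self.fc(self.conv(x).flatten(1)), dim=1)

# Metric learning: pull matching ground-aerial pairs together in feature
# space, push mismatched pairs apart (a standard triplet loss stands in
# for the paper's actual training objective).
ground_net, aerial_net = Branch(), Branch()
g = ground_net(torch.rand(4, 3, 64, 128))  # ground-level panoramas
a = aerial_net(torch.rand(4, 3, 64, 64))   # matching aerial crops
neg = a.roll(1, dims=0)                    # shifted batch as negatives
loss = F.triplet_margin_loss(g, a, neg, margin=0.3)
loss.backward()
```

At query time, localization reduces to a nearest-neighbor search: embed the ground panorama once and rank the precomputed aerial embeddings by feature similarity.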