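Computer science
Leverage (statistics)
Artificial intelligence
Feature learning
Machine learning
Generalization
Representation (politics)
Deep learning
Aerial imagery
Domain (mathematical analysis)
Pattern recognition (psychology)
Image (mathematics)
Mathematical analysis
Politics
Law
Mathematics
Political science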
Authors
Xian Sun, Peijin Wang, Wanxuan Lu, Zicong Zhu, Xiaonan Lü, Qibin He, Junxi Li, Xuee Rong, Zhujun Yang, Hao Chang, Qinglin He, Guang Yang, Ruiping Wang, Jiwen Lu, Kun Fu
Identifier
DOI:10.1109/tgrs.2022.3194732
Abstract
Deep learning approaches have contributed to the rapid development of remote sensing (RS) image interpretation. The most widely used training paradigm is to use ImageNet pretrained models to process RS data for specified tasks. However, there are issues such as domain gap between natural and RS scenes and the poor generalization capacity of RS models. It makes sense to develop a foundation model with general RS feature representation. Since a large amount of unlabeled data is available, the self-supervised method has more development significance than the fully supervised method in RS. However, most of the current self-supervised methods use contrastive learning, whose performance is sensitive to data augmentation, additional information, and selection of positive and negative pairs. In this article, we leverage the benefits of generative self-supervised learning (SSL) for RS images and propose an RS foundation model framework called RingMo, which consists of two parts. First, a large-scale dataset is constructed by collecting two million RS images from satellite and aerial platforms, covering multiple scenes and objects around the world. Second, we propose an RS foundation model training method designed for dense and small objects in complicated RS scenes. We show that the foundation model trained on our dataset with RingMo method achieves state-of-the-art (SOTA) on eight datasets across four downstream tasks, demonstrating the effectiveness of the proposed framework. Through in-depth exploration, we believe it is time for RS researchers to embrace generative SSL and leverage its general representation capabilities to speed up the development of RS applications.