Changen2: Multi-Temporal Remote Sensing Generative Change Foundation Model

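Probabilistic logic, Scalability, Computer science, Change detection, Artificial intelligence, Generative grammar, Scale (ratio), Process (computing), Generative model, Remote sensing, Pixel, Data mining, Event (particle physics), Transformer, Semantic change, Stochastic modelling, Machine learning, Data modeling, Time series, Stochastic process, Deep learning, Graphical model, Segmentation, Data assimilation, Statistical model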

Authors

Zhuo Zheng, Stefano Ermon, Dongjun Kim, Liangpei Zhang, Yanfei Zhong

Source

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [IEEE Computer Society]
Date: 2024-10-10 Volume (issue): 47 (2): 725-741 Citations: 29

Links

nih.gov, doi.org

Identifiers

DOI: 10.1109/tpami.2024.3475824

Abstract

Our understanding of the temporal dynamics of the Earth's surface has been significantly advanced by deep vision models, which often require a massive amount of labeled multi-temporal images for training. However, collecting, preprocessing, and annotating multi-temporal remote sensing images at scale is non-trivial since it is expensive and knowledge-intensive. In this paper, we present scalable multi-temporal change data generators based on generative models, which are cheap and automatic, alleviating these data problems. Our main idea is to simulate a stochastic change process over time. We describe the stochastic change process as a probabilistic graphical model, namely the generative probabilistic change model (GPCM), which factorizes the complex simulation problem into two more tractable sub-problems, i.e., condition-level change event simulation and image-level semantic change synthesis. To solve these two problems, we present Changen2, a GPCM implemented with a resolution-scalable diffusion transformer which can generate time series of remote sensing images and corresponding semantic and change labels from labeled and even unlabeled single-temporal images. Changen2 is a "generative change foundation model" that can be trained at scale via self-supervision, and is capable of producing change supervisory signals from unlabeled single-temporal images. Unlike existing "foundation models", our generative change foundation model synthesizes change data to train task-specific foundation models for change detection. The resulting model possesses inherent zero-shot change detection capabilities and excellent transferability. Comprehensive experiments suggest Changen2 has superior spatiotemporal scalability in data generation, e.g., a Changen2 model trained on 256-pixel single-temporal images can yield time series of any length and resolutions of 1,024 pixels. Changen2 pre-trained models exhibit superior zero-shot performance (narrowing the performance gap to 3% on LEVIR-CD and approximately 10% on both S2Looking and SECOND, compared to fully supervised counterparts) and transferability across multiple types of change tasks, including ordinary and off-nadir building change, land-use/land-cover change, and disaster assessment.
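
To make the first sub-problem named in the abstract more concrete, the following is a minimal toy sketch of condition-level change event simulation on a label mask. It is not the Changen2 implementation: the function names, the "randomly remove object instances" event, and the `removal_prob` parameter are all assumptions made for illustration, and the second sub-problem (image-level semantic change synthesis with a diffusion transformer) is omitted entirely. The sketch only shows how a change label can fall out of comparing two consecutive masks.

```python
import random


def simulate_change_event(instance_mask, removal_prob=0.5, seed=0):
    """Toy change event (hypothetical): randomly remove object instances.

    instance_mask: 2D list of ints, 0 = background, k > 0 = instance id.
    Returns (next_mask, change_label), both 2D lists of the same shape.
    """
    rng = random.Random(seed)
    instance_ids = sorted({v for row in instance_mask for v in row if v > 0})
    # Decide independently, per instance, whether it disappears at the next step.
    removed = {i for i in instance_ids if rng.random() < removal_prob}
    next_mask = [[0 if v in removed else v for v in row] for row in instance_mask]
    # A pixel is "changed" when its foreground/background status differs.
    change_label = [
        [int((a > 0) != (b > 0)) for a, b in zip(row_t, row_next)]
        for row_t, row_next in zip(instance_mask, next_mask)
    ]
    return next_mask, change_label


def simulate_mask_series(instance_mask, steps, removal_prob=0.5, seed=0):
    """Chain change events to get a mask time series and per-step change labels."""
    series = [instance_mask]
    labels = []
    for t in range(steps):
        next_mask, change_label = simulate_change_event(
            series[-1], removal_prob, seed + t
        )
        series.append(next_mask)
        labels.append(change_label)
    return series, labels


if __name__ == "__main__":
    mask = [[1, 1, 0], [0, 2, 2]]
    series, labels = simulate_mask_series(mask, steps=3)
    for t, m in enumerate(series):
        print(t, m)
```

In a full pipeline of the kind the abstract describes, each simulated mask would condition an image generator, so that every synthetic image in the series comes paired with semantic and change labels at no annotation cost.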
