Authors
Hao Dong, M. H. Liu, Kaiyang Zhou, Eleni Chatzi, Juho Kannala, Cyrill Stachniss, Olga Fink
Abstract
Domain adaptation and generalization are crucial for real-world applications, such as autonomous driving and medical imaging, where models must operate reliably across environments with distinct data distributions. These tasks are challenging because the model needs to overcome domain gaps caused by variations in, for example, lighting, weather, and sensor configurations. Addressing domain gaps across multiple modalities simultaneously, known as multimodal domain adaptation and generalization, is even more difficult because each modality presents its own distinct challenges. Over the past few years, significant progress has been made in these areas, with applications ranging from action recognition to semantic segmentation. Recently, the emergence of large-scale pre-trained multimodal foundation models, such as CLIP, has inspired numerous studies that leverage these models to enhance downstream adaptation and generalization. This survey summarizes recent advances in multimodal adaptation and generalization, with particular attention to how these areas have evolved from traditional approaches to foundation models. Specifically, this survey covers (1) multimodal domain adaptation, (2) multimodal test-time adaptation, (3) multimodal domain generalization, (4) domain adaptation and generalization aided by multimodal foundation models, and (5) adaptation of multimodal foundation models. For each topic, we formally define the problem and give a thorough review of existing methods. Additionally, we analyze relevant datasets and applications, highlighting open challenges and potential future research directions. We also maintain an actively updated repository of literature supporting research in these fields at https://github.com/donghao51/Awesome-Multimodal-Adaptation.