计算机科学
人工智能
瓶颈
频道(广播)
桥接(联网)
规范化(社会学)
卷积神经网络
融合
过程(计算)
分割
RGB颜色模型
模式识别(心理学)
操作系统
人类学
哲学
社会学
嵌入式系统
语言学
计算机网络
作者
Yikai Wang,Wenbing Huang,Fuchun Sun,Tingyang Xu,Yu Rong,Junzhou Huang
出处
期刊:Cornell University - arXiv
日期:2020-01-01
被引量:102
标识
DOI:10.48550/arxiv.2011.05005
摘要
Deep multimodal fusion by using multiple sources of data for classification or regression has exhibited a clear advantage over the unimodal counterpart on various applications. Yet, current methods including aggregation-based and alignment-based fusion are still inadequate in balancing the trade-off between inter-modal fusion and intra-modal processing, incurring a bottleneck of performance improvement. To this end, this paper proposes Channel-Exchanging-Network (CEN), a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities. Specifically, the channel exchanging process is self-guided by individual channel importance that is measured by the magnitude of Batch-Normalization (BN) scaling factor during training. The validity of such exchanging process is also guaranteed by sharing convolutional filters yet keeping separate BN layers across modalities, which, as an add-on benefit, allows our multimodal architecture to be almost as compact as a unimodal network. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of our CEN compared to current state-of-the-art methods. Detailed ablation studies have also been carried out, which provably affirm the advantage of each component we propose. Our code is available at https://github.com/yikaiw/CEN.
科研通智能强力驱动
Strongly Powered by AbleSci AI