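Computer science
Mode
Multimodal learning
Artificial intelligence
Fuse (electrical)
Machine learning
Modality (human-computer interaction)
Feature (linguistics)
Representation (politics)
Complement (music)
Feature learning
Key (lock)
Electrical engineering
Engineering
Phenotype
Philosophy
Gene
Sociology
Politics
Chemistry
Complementarity
Law
Biochemistry
Linguistics
Computer security
Social science
Political science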
Authors
Yi Ren, Ning Xu, Miaogen Ling, Xin Geng
Identifier
DOI:10.1007/s11704-021-0611-6
Abstract
Multimodal machine learning (MML) aims to understand the world from multiple related modalities. It has attracted much attention as multimodal data have become increasingly available in real-world applications. MML has been shown to perform better than single-modal machine learning, since multiple modalities contain more information and can complement each other. However, fusing the modalities remains a key challenge in MML. Unlike previous work, we further consider side-information, which reflects the situation and influences the fusion of the modalities. By leveraging the side-information, we recover the multimodal label distribution (MLD), which represents the degree to which each modality contributes to describing the instance. Accordingly, a novel framework named multimodal label distribution learning (MLDL) is proposed to recover the MLD and to fuse the modalities under its guidance, in order to learn an in-depth understanding of the joint feature representation. Moreover, two versions of MLDL are proposed to deal with sequential data. Experiments on multimodal sentiment analysis and disease prediction show that the proposed approaches perform favorably against state-of-the-art methods.
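To make the idea of distribution-guided fusion concrete, the following is a minimal illustrative sketch, not the MLDL method itself, whose details the abstract does not give. It assumes that each modality already has a fixed-length feature vector and a scalar relevance score (here imagined as derived from side-information), normalizes the scores into a distribution with a softmax, and fuses the features as a weighted sum. The names `softmax`, `fuse`, `text`, and `audio` are hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary real scores into a distribution that sums to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(modality_features, modality_scores):
    """Weighted sum of per-modality feature vectors.

    modality_features: list of equal-length feature vectors, one per modality.
    modality_scores: one relevance score per modality (assumed here to come
        from side-information; how it is obtained is outside this sketch).
    Returns the modality weights and the fused feature vector.
    """
    weights = softmax(modality_scores)
    dim = len(modality_features[0])
    fused = [0.0] * dim
    for w, feat in zip(weights, modality_features):
        for i, x in enumerate(feat):
            fused[i] += w * x
    return weights, fused


# Two toy modalities with equal scores contribute equally.
text = [1.0, 0.0]
audio = [0.0, 1.0]
weights, fused = fuse([text, audio], [0.0, 0.0])
```

Raising one modality's score shifts the weights toward it, so the fused vector moves toward that modality's features; the weights play the role of a per-instance distribution over modalities.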