Autoencoder
Computer science
Artificial intelligence
Pairwise comparison
Exploit
Representation (politics)
Machine learning
Hypergraph
Multimodal learning
Feature learning
Supervised learning
Pattern recognition (psychology)
Inference
Object (grammar)
Deep learning
Artificial neural network
Mathematics
Discrete mathematics
Law
Politics
Computer security
Political science
Authors
Jingquan Liu,Xin Du,Yuanzhe Li,Weidong Hu
Identifier
DOI:10.1007/978-3-031-15937-4_33
Abstract
In many real-world settings, the external environment is perceived through multimodal information, such as visual, radar, and lidar data. This naturally motivates us to exploit intra-modal interactions and to integrate information from multiple sources using limited labels on a multimodal dataset, framed as a semi-supervised task. A challenging issue in multimodal semi-supervised learning is the complicated correlations among pairwise modalities. In this paper, we propose a hypergraph variational autoencoder (HVAE) that can mine high-order interactions in multimodal data and introduce extra prior knowledge for inferring a fused multimodal representation. On the one hand, the hypergraph structure can represent high-order data correlations in multimodal scenes. On the other hand, a prior distribution is introduced via mask-based variational inference to enhance the multimodal characterization. Moreover, the variational lower bound is leveraged to support semi-supervised learning. We conduct experiments on a semi-supervised visual object recognition task, and extensive experiments on two datasets demonstrate the superiority of our method over existing baselines.
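The abstract combines two standard building blocks: hypergraph message passing to capture high-order correlations, and a variational (Gaussian) encoder with a KL term toward a prior. The following is a minimal NumPy sketch of those two pieces only, assuming the common symmetric-normalized hypergraph convolution and a standard-normal prior; the incidence matrix, weights, and dimensions here are illustrative toys, not the paper's actual HVAE architecture or its mask-based inference scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def hypergraph_conv(X, H, w=None):
    """Symmetric-normalized hypergraph convolution:
    X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X
    X: (n, d) node features; H: (n, m) node-hyperedge incidence matrix."""
    n, m = H.shape
    w = np.ones(m) if w is None else w           # hyperedge weights
    Dv = H @ w                                   # node degrees
    De = H.sum(axis=0)                           # hyperedge degrees
    Dv_is = np.diag(1.0 / np.sqrt(np.maximum(Dv, 1e-12)))
    De_inv = np.diag(1.0 / np.maximum(De, 1e-12))
    A = Dv_is @ H @ np.diag(w) @ De_inv @ H.T @ Dv_is
    return A @ X

def encode(X, W_mu, W_logvar):
    """Gaussian encoder with the reparameterization trick; returns a
    latent sample and the KL divergence to a standard-normal prior
    (the KL term enters the variational lower bound)."""
    mu, logvar = X @ W_mu, X @ W_logvar
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))
    return z, kl

# Toy example: 4 nodes with 3-dim features; two hyperedges, each joining
# a group of nodes (e.g. samples that co-occur across modalities).
X = rng.standard_normal((4, 3))
H = np.array([[1, 0],
              [1, 1],
              [1, 1],
              [0, 1]], dtype=float)
X_smoothed = hypergraph_conv(X, H)
z, kl = encode(X_smoothed, rng.standard_normal((3, 2)), rng.standard_normal((3, 2)))
print(X_smoothed.shape, z.shape, kl >= 0.0)
```

A hyperedge connects any number of nodes at once, which is what lets the structure encode correlations beyond pairwise modality links; the KL term is always non-negative, so it acts as the regularizer in the lower bound.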