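Hamming distance
Computer science
Embedding
Artificial intelligence
Pattern recognition (psychology)
Feature (linguistics)
Modality (human-computer interaction)
Hamming code
Feature extraction
Generalization
Representation (politics)
Feature learning
Decoding methods
Algorithm
Mathematics
Mathematical analysis
Philosophy
Block code
Law
Political science
Linguistics
Politics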
Authors
Biao Gong, Chenggang Yan, Junjie Bai, Changqing Zou, Yue Gao
Identifier
DOI:10.1109/tip.2020.3013138
Abstract
Three-dimensional multi-modal data are used to represent 3D objects in the real world in different ways. Features extracted separately from multi-modal data are often poorly correlated. Recent solutions that leverage the attention mechanism to learn a joint network for fusing multi-modal features have weak generalization capability. In this paper, we propose a Hamming embedding sensitivity network to address the problem of effectively fusing multi-modal features. The proposed network, called HamNet, is the first end-to-end framework with the capacity to theoretically integrate data from all modalities with a unified architecture for 3D shape representation, which can be used for 3D shape retrieval and recognition. HamNet uses the feature concealment module to achieve effective deep feature fusion. The basic idea of the concealment module is to re-weight the features from each modality at an early stage with the Hamming embedding of these modalities. The Hamming embedding also provides an effective solution for fast retrieval tasks on a large-scale dataset. We have evaluated the proposed method on the large-scale ModelNet40 dataset for the tasks of 3D shape classification, single-modality retrieval, and cross-modality retrieval. Comprehensive experiments and comparisons with state-of-the-art methods demonstrate that the proposed approach can achieve superior performance.
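The abstract notes that a Hamming embedding supports fast retrieval on large datasets. The general idea can be illustrated with a minimal sketch: real-valued features are reduced to binary codes, and candidates are ranked by the number of differing bits. This is not HamNet's actual embedding or retrieval procedure, which the abstract does not specify. The sign-threshold binarization, the function names (`binarize`, `hamming_distance`, `retrieve`), and the toy feature values are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def binarize(features: Sequence[float]) -> int:
    """Pack the sign pattern of a feature vector into an int, one bit per dimension.

    Assumption: a simple sign threshold stands in for a learned Hamming embedding.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query: int, database: Sequence[int], k: int) -> List[Tuple[int, int]]:
    """Return the k (index, distance) pairs nearest to the query in Hamming distance."""
    scored = [(idx, hamming_distance(query, code)) for idx, code in enumerate(database)]
    # Sort by distance first, then by index so ties are resolved deterministically.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:k]


# Toy features, made up for illustration only.
database_features = [
    [0.9, -0.2, 0.4, -0.7],  # sign pattern 1010
    [-0.5, -0.1, 0.3, 0.8],  # sign pattern 0011
    [0.6, -0.3, 0.2, 0.1],   # sign pattern 1011
]
database_codes = [binarize(f) for f in database_features]
query_code = binarize([0.7, -0.4, 0.5, -0.1])  # sign pattern 1010

top2 = retrieve(query_code, database_codes, k=2)
print(top2)  # [(0, 0), (2, 1)]
```

Because each comparison is an XOR followed by a bit count, a linear scan over compact codes of this kind is typically much cheaper than comparing floating-point feature vectors, which is the usual motivation for binary embeddings in large-scale retrieval.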