Artificial intelligence
Deep learning
Lift (data mining)
Computer science
Feature (linguistics)
Annotation
Image (mathematics)
Automatic image annotation
Pattern recognition (psychology)
Image retrieval
Machine learning
Linguistics
Philosophy
Authors
Junbing Li,Changqing Zhang,Joey Tianyi Zhou,Huazhu Fu,Shuyin Xia,Qinghua Hu
Identifier
DOI: 10.1109/tcyb.2021.3049630
Abstract
Image annotation aims to jointly predict multiple tags for an image. Although significant progress has been achieved, existing approaches usually overlook aligning specific labels with their corresponding regions because the supervision is weak (i.e., a "bag of labels" for regions), and thus fail to explicitly exploit the discrimination between different classes. In this article, we propose the deep label-specific feature (Deep-LIFT) learning model to build an explicit and exact correspondence between each label and its local visual region, which improves the effectiveness of feature learning and enhances the interpretability of the model itself. Deep-LIFT extracts features for each label by aligning that label with its region. Specifically, Deep-LIFTs are obtained by learning multiple correlation maps between image convolutional features and label embeddings. Moreover, we construct two variant graph convolutional networks (GCNs) to further capture the interdependency among labels. Empirical studies on benchmark datasets validate that the proposed model achieves superior multilabel classification performance over existing state-of-the-art methods.
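Since the abstract only sketches the core mechanism (correlating image convolutional features with label embeddings to obtain label-specific features, with a GCN capturing label interdependency), the snippet below is a minimal, hypothetical PyTorch sketch of that general idea. All class and variable names, tensor dimensions, the single GCN layer, and the placeholder identity label graph are illustrative assumptions; this is not the authors' Deep-LIFT implementation.

```python
# Hypothetical sketch: label-specific features via correlation maps between
# CNN feature maps and label embeddings, plus one illustrative GCN step over
# a label graph. Assumed names/shapes; not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LabelSpecificFeatures(nn.Module):
    """Correlate backbone feature maps with label embeddings to get per-label features."""

    def __init__(self, num_labels, feat_dim, embed_dim, adj):
        super().__init__()
        self.label_embed = nn.Embedding(num_labels, embed_dim)      # learnable label embeddings
        self.proj = nn.Conv2d(feat_dim, embed_dim, kernel_size=1)   # project image features to embedding space
        self.register_buffer("adj", adj)                            # fixed (L x L) label-correlation graph
        self.gcn_weight = nn.Linear(embed_dim, embed_dim)           # one GCN layer (assumption)
        self.classifier = nn.Linear(embed_dim, 1)                   # shared per-label binary classifier

    def forward(self, conv_feats):
        # conv_feats: (B, C, H, W) features from any CNN backbone
        B, _, H, W = conv_feats.shape
        x = self.proj(conv_feats).flatten(2)            # (B, D, H*W)

        # propagate label embeddings over the label graph (simple GCN step)
        e = self.label_embed.weight                     # (L, D)
        e = F.relu(self.gcn_weight(self.adj @ e))       # (L, D)

        # correlation maps: similarity of every label embedding to every spatial location
        corr = torch.einsum("ld,bdn->bln", e, x)        # (B, L, H*W)
        attn = corr.softmax(dim=-1)                     # per-label spatial attention

        # label-specific features: attention-weighted pooling of image features
        lift = torch.einsum("bln,bdn->bld", attn, x)    # (B, L, D)
        logits = self.classifier(lift).squeeze(-1)      # (B, L) multilabel scores
        return logits, attn.view(B, -1, H, W)           # scores and per-label correlation maps


if __name__ == "__main__":
    L, C, D = 20, 512, 256
    adj = torch.eye(L)                                  # placeholder label graph
    model = LabelSpecificFeatures(L, C, D, adj)
    feats = torch.randn(2, C, 14, 14)                   # e.g. a ResNet conv5 output
    logits, maps = model(feats)
    print(logits.shape, maps.shape)                     # torch.Size([2, 20]) torch.Size([2, 20, 14, 14])
```

In this sketch, each label's feature is pooled from the spatial locations most correlated with its embedding, which is one way to realize the explicit label-region correspondence and per-label interpretability the abstract describes.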