Double Attention Based on Graph Attention Network for Image Multi-Label Classification

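Computer science; Multi-label classification; Leverage (statistics); Artificial intelligence; Pascal (unit); Pattern recognition (psychology); Embedding; Correlation; Generalizability theory; Graph; Machine learning; Classifier (UML); Feature (linguistics); Data mining; Theoretical computer science; Mathematics; Statistics; Geometry; Philosophy; Linguistics; Programming language
Authors
Wei Zhou,Zhiwu Xia,Peng Dou,Tao Su,Haifeng Hu
Source
Journal: ACM Transactions on Multimedia Computing, Communications, and Applications [Association for Computing Machinery]
Volume (issue): 19 (1): 1-23 Cited by: 10
Identifier
DOI:10.1145/3519030
Abstract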

The task of image multi-label classification is to accurately recognize multiple objects in an input image. Most recent works rely on a label co-occurrence matrix counted from training data to construct the graph structure, which is inflexible and may degrade model generalizability. In addition, these methods fail to capture the semantic correlation between channel feature maps, which could further improve model performance. To address these issues, we propose DA-GAT (a Double Attention framework based on the Graph Attention neTwork) to effectively learn the correlation between labels from training data. First, we devise a new channel attention mechanism to enhance the semantic correlation between channel feature maps, so as to implicitly capture the correlation between labels. Second, we propose a new label attention mechanism to avoid the adverse impact of a manually constructed label co-occurrence matrix. It takes only the label embeddings as network input and automatically constructs the label relation matrix to explicitly establish the correlation between labels. Finally, we fuse the outputs of these two attention mechanisms to further improve model performance. Extensive experiments are conducted on three public multi-label classification benchmarks. Our DA-GAT model achieves mean average precision of 87.1%, 96.6%, and 64.3% on MS-COCO 2014, PASCAL VOC 2007, and NUS-WIDE, respectively, outperforming other existing state-of-the-art methods. In addition, visual analysis experiments demonstrate that each attention mechanism captures the correlation between labels well and significantly improves model performance.
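
The label attention idea in the abstract (label embeddings in, a learned label relation matrix out) can be illustrated with a small sketch. This is not the authors' implementation: it assumes the generic graph-attention scoring form, LeakyReLU(a^T [h_i || h_j]) followed by a row-wise softmax over a fully connected label graph, and the weights `W` and `a`, the dimensions, and all function names are hypothetical placeholders filled with random values.

```python
import math
import random


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def project(embeddings, W):
    """Linear projection h_i = e_i W, with W given as d_in rows of length d_out."""
    cols = list(zip(*W))  # columns of W, each of length d_in
    return [[dot(e, col) for col in cols] for e in embeddings]


def label_relation_matrix(h, a):
    """Graph-attention-style scores over all label pairs (assumed form).

    score_ij = LeakyReLU(a^T [h_i || h_j]); each row is softmax-normalized,
    so row i holds the attention label i pays to every label j.
    """
    d_out = len(h[0])
    a_src, a_dst = a[:d_out], a[d_out:]
    src = [dot(hi, a_src) for hi in h]  # contribution of the "source" half
    dst = [dot(hj, a_dst) for hj in h]  # contribution of the "target" half
    scores = [[leaky_relu(s + t) for t in dst] for s in src]
    return [softmax(row) for row in scores]


def propagate(A, h):
    """Updated feature of label i = sum_j A[i][j] * h_j."""
    d_out = len(h[0])
    return [
        [sum(A[i][j] * h[j][k] for j in range(len(h))) for k in range(d_out)]
        for i in range(len(A))
    ]


# Hypothetical sizes and random stand-ins for learned parameters.
rng = random.Random(0)
num_labels, d_in, d_out = 4, 6, 3
label_embeddings = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(num_labels)]
W = [[rng.gauss(0, 1) for _ in range(d_out)] for _ in range(d_in)]
a = [rng.gauss(0, 1) for _ in range(2 * d_out)]

h = project(label_embeddings, W)
A = label_relation_matrix(h, a)   # num_labels x num_labels, rows sum to 1
updated = propagate(A, h)         # relation-aware label features

for row in A:
    print(" ".join(f"{v:.3f}" for v in row))
```

In this sketch the relation matrix `A` is computed from the embeddings alone, with no co-occurrence statistics, which is the property the abstract emphasizes. In a trained model `W` and `a` would be learned jointly with the image branch; how DA-GAT fuses the result with its channel attention output is not specified in the abstract and is left out here.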