The impact of introducing textual semantics on item instance retrieval with highly similar appearance: An empirical study

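Image retrieval; Similarity (geometry); Computer science; Semantics (computer science); Feature (linguistics); Artificial intelligence; Dimension (graph theory); Image (mathematics); Information retrieval; Pattern recognition (psychology); Feature vector; Space (punctuation); Mathematics; Operating system; Philosophy; Linguistics; Programming language; Pure mathematics
Authors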
Bo Li,Jiang Zhu,Lingyun Dai,Hui Jing,Zhizheng Huang
Source
Journal: Image and Vision Computing [Elsevier BV]
Volume: 143, article 104925
Identifier
DOI:10.1016/j.imavis.2024.104925
Abstract

Feature representation plays an important role in image instance retrieval (IIR). In practical applications, we find that items from different categories that look highly similar are often retrieved incorrectly. Our analysis suggests that extracting features from the appearance dimension alone can place objects with similar appearance close together in feature space. Appearance, however, is not the only factor that determines whether two items are the same, and differences in shooting angle can further amplify the appearance difference between images of the same item. In this paper, through a detailed empirical study, we verify the conjecture that introducing textual semantics and fusing them with appearance features can correct the similarity distance of falsely retrieved objects in feature space, thereby improving retrieval effectiveness on data with highly similar appearance. We derive textual semantics for image instances from an image-text cross-modal model. Specifically, we increase the proportion of similar-looking items in three open-source item instance datasets (Products-10K, RP2k and Stanford products) and add multi-angle image samples of the same item to enlarge the intra-item difference. We then run baseline experiments for appearance features and textual features from two perspectives, shooting angle similarity and visual character similarity, and explore the advantages of multiple strategies for fusing textual semantics with appearance features. Finally, we examine the effect of our method on state-of-the-art fine-grained item instance retrieval methods.
Taking mean Average Precision (mAP) as the quantitative metric and averaging the experimental results, our method clearly improves on both the appearance and textual baselines, and the gain over the appearance feature baselines is generally larger than the gain over the textual feature baselines. For example, on our expanded RP2k dataset, from the perspective of shooting angle similarity, the mAP is about 19.62 for the appearance feature baseline, 32.45 for the textual feature baseline, and 43.19 for our method; from the perspective of visual character similarity, the values are 27.14, 43.59, and 54.76, respectively. Moreover, our method outperforms state-of-the-art fine-grained item instance retrieval methods, with improvements of nearly 13.05% and 22.49% on the expanded Products-10K and RP2k datasets, respectively.
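
The abstract does not specify which fusion strategies were evaluated, so the following is only a minimal sketch of one common option, assuming a weighted concatenation of L2-normalized appearance and text embeddings, together with a plain mAP computation. The function names (`l2_normalize`, `fuse`, `mean_average_precision`), the weight `alpha`, and the toy vectors are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def fuse(appearance, text, alpha=0.5):
    """Hypothetical fusion: weighted concatenation of normalized embeddings.

    With the sqrt weights, the dot product of two fused vectors equals
    alpha * cos(appearance) + (1 - alpha) * cos(text).
    """
    a = l2_normalize(appearance)
    t = l2_normalize(text)
    return np.concatenate([np.sqrt(alpha) * a, np.sqrt(1.0 - alpha) * t], axis=1)


def mean_average_precision(q_feats, q_labels, g_feats, g_labels):
    """mAP over queries, ranking the gallery by descending dot-product similarity."""
    sims = q_feats @ g_feats.T
    aps = []
    for i in range(sims.shape[0]):
        order = np.argsort(-sims[i])
        hits = (g_labels[order] == q_labels[i]).astype(float)
        if hits.sum() == 0:
            continue  # skip queries with no relevant gallery item
        precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
        aps.append(float((precision_at_k * hits).sum() / hits.sum()))
    return float(np.mean(aps)) if aps else 0.0


# Toy data (made up): gallery item 1 looks identical to the query but is a
# different product; gallery item 0 is the same product from another angle.
q_app = np.array([[1.0, 0.0]])
q_txt = np.array([[1.0, 0.0]])
q_lab = np.array([0])
g_app = np.array([[0.8, 0.6], [1.0, 0.0]])
g_txt = np.array([[1.0, 0.0], [0.0, 1.0]])
g_lab = np.array([0, 1])

map_appearance = mean_average_precision(
    l2_normalize(q_app), q_lab, l2_normalize(g_app), g_lab
)
map_fused = mean_average_precision(
    fuse(q_app, q_txt), q_lab, fuse(g_app, g_txt), g_lab
)
print(f"appearance-only mAP: {map_appearance:.2f}")
print(f"fused mAP:           {map_fused:.2f}")
```

In this toy case the look-alike item outranks the true match under appearance features alone (mAP 0.50), while the fused representation ranks the true match first (mAP 1.00). It only illustrates the mechanism the abstract describes and says nothing about the reported results.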
