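Computer science
Image retrieval
Regularization (linguistics)
Image (mathematics)
Information retrieval
Image processing
Set (abstract data type)
Artificial intelligence
Programming language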
Authors
Yahui Xu,Jiwei Wei,Yi Bin,Yang Yang,Zeyu Ma,Heng Tao Shen
Identifier
DOI:10.1109/tcsvt.2024.3401006
Abstract
Composed image retrieval searches for a target image given a composed input: a reference image together with complementary modification text. The goal is to learn a shared latent space in which the representation of the composed input lies close to that of the desired target image. Most previous methods model a one-to-one correspondence between the composed input and the target image, encoding each as a single point in the feature space. However, a one-to-one correspondence cannot handle this task effectively because of the inherent ambiguity arising from diverse semantic meanings and data uncertainty. Specifically, the composed inputs and the target image each carry multiple semantic meanings, which affects the retrieval results. Moreover, for a given composed input (resp. target image), multiple target images (resp. composed inputs) make equal sense. In this paper, we propose a novel method, Set of Diverse Queries with Uncertainty Regularization (SDQUR), to address this inherent ambiguity. First, we use diverse queries to adaptively aggregate the composed inputs and the target image into multiple deterministic embeddings that capture the different semantic meanings in the triplet that affect retrieval. These set-based queries exploit the deterministic many-to-many correspondence within each triplet. Second, we introduce an uncertainty regularization module that encodes the composed inputs and the target image as Gaussian distributions. Multiple potential positive candidates are sampled from these distributions to model probabilistic many-to-many correspondence. By combining the complementary deterministic and probabilistic many-to-many correspondences, we achieve consistent improvements on the standard FashionIQ, CIRR, and Shoes benchmarks, surpassing state-of-the-art methods by a large margin.
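The two ideas named in the abstract, set-based embeddings and sampling from a Gaussian, can be illustrated with a small sketch. This is not the SDQUR implementation: the abstract gives no architecture or loss details, so the attention pooling, the diagonal Gaussian with a log-variance parameter, the max-over-pairs cosine score, and every function name below are assumptions chosen only for illustration. Random vectors stand in for learned features and queries.

```python
import math
import random


def attention_pool(tokens, queries):
    """Aggregate N token vectors into K embeddings, one per query.

    Each query attends over the tokens with a softmax, so each output is a
    convex combination of the token vectors (an assumed pooling scheme).
    """
    dim = len(tokens[0])
    pooled = []
    for q in queries:
        scores = [sum(qi * ti for qi, ti in zip(q, t)) / math.sqrt(dim)
                  for t in tokens]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        pooled.append([sum(w * t[d] for w, t in zip(weights, tokens))
                       for d in range(dim)])
    return pooled


def sample_gaussian(mean, log_var, num_samples, rng):
    """Draw samples from a diagonal Gaussian as mean + sigma * eps."""
    return [[m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
             for m, lv in zip(mean, log_var)]
            for _ in range(num_samples)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def set_similarity(set_a, set_b):
    """Score two embedding sets by their best-matching pair (an assumption)."""
    return max(cosine(u, v) for u in set_a for v in set_b)


rng = random.Random(0)
# Hypothetical stand-ins: 5 local features of dimension 8, and 3 queries.
tokens = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(5)]
queries = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(3)]

# Deterministic branch: one embedding per query.
query_set = attention_pool(tokens, queries)

# Probabilistic branch: sample candidate embeddings around one mean vector.
candidates = sample_gaussian(query_set[0], [-2.0] * 8, 4, rng)

score = set_similarity(query_set, candidates)
```

In this sketch a composed input is represented by several points rather than one, and a match is scored against a set of sampled candidates, which is one simple way to express a many-to-many correspondence.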