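Discriminative model
Computer science
Artificial intelligence
Pattern recognition (psychology)
Embedding
Metric (unit)
Feature (linguistics)
Contextual image classification
Iterative reconstruction
Image (mathematics)
Shot (pellet)
Linguistics
Operations management
Philosophy
Chemistry
Organic chemistry
Economics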
Authors
Xiaoxu Li,Qi Song,Jijie Wu,Rui Zhu,Zhanyu Ma,Jing‐Hao Xue
Identifier
DOI:10.1109/tcsvt.2023.3275382
Abstract
Few-shot fine-grained image classification has attracted considerable attention in recent years because its setting realistically imitates how humans perform recognition tasks. Metric-based few-shot classifiers have achieved high accuracy. However, their metric functions usually take two vectors as arguments, and transforming or reshaping three-dimensional feature maps into vectors can lose spatial information. Image reconstruction is therefore used to retain more appearance details: a test image is reconstructed from each class and then assigned to the class with the smallest reconstruction error. However, discriminative local information, which is vital for distinguishing highly similar sub-categories in fine-grained images, is not well captured when only the base features from a usual embedding module are used for reconstruction. Hence, we propose the local content-enriched cross-reconstruction network (LCCRN) for few-shot fine-grained classification. In LCCRN, we design two new modules: the local content-enriched module (LCEM), which learns discriminative local features, and the cross-reconstruction module (CRM), which fully engages the local features with the appearance details obtained from a separate embedding module. The classification score is computed from the weighted sum of the reconstruction errors of the cross-reconstruction tasks, with weights learnt during training. Extensive experiments on four fine-grained datasets show that LCCRN outperforms state-of-the-art few-shot classification methods. Code is available at: https://github.com/lutsong/LCCRN .
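The decision rule described in the abstract (reconstruct the test features from each class, then pick the class with the smallest weighted reconstruction error) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how reconstruction is performed, so the sketch assumes closed-form ridge regression, uses flat feature vectors rather than feature maps, and replaces the learnt weights with fixed numbers. The names `solve`, `reconstruction_error`, `classify`, `lam`, and the toy data are all hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def reconstruction_error(support, query, lam=0.1):
    """Squared error of the ridge reconstruction of `query` from the rows of `support`.

    Assumption: coefficients w = (S S^T + lam I)^-1 S q, reconstruction = S^T w.
    """
    n = len(support)
    gram = [[dot(support[i], support[j]) + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    w = solve(gram, [dot(s, query) for s in support])
    recon = [sum(w[i] * support[i][k] for i in range(n))
             for k in range(len(query))]
    return sum((q - r) ** 2 for q, r in zip(query, recon))


def classify(query_views, support_views, weights, lam=0.1):
    """Score each class by the negative weighted sum of its reconstruction errors.

    query_views:   one query feature vector per reconstruction task
    support_views: {class label: [support feature rows per reconstruction task]}
    weights:       one weight per task (fixed here; learnt in the paper)
    """
    scores = {}
    for label, supports in support_views.items():
        errors = [reconstruction_error(s, q, lam)
                  for s, q in zip(supports, query_views)]
        scores[label] = -sum(w * e for w, e in zip(weights, errors))
    return max(scores, key=scores.get), scores


# Toy data: two classes, two reconstruction tasks, 4-dimensional features.
class_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
class_b = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
support_views = {"a": [class_a, class_a], "b": [class_b, class_b]}
query_views = [[0.9, 0.1, 0.2, 0.0], [0.8, 0.0, 0.3, 0.1]]

predicted, scores = classify(query_views, support_views, weights=[0.6, 0.4])
print(predicted, scores)
```

In this toy setup the query lies almost entirely in the subspace spanned by the support features of class `a`, so that class yields the smaller reconstruction error in both tasks and receives the higher score.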