Large Scale Visual Food Recognition

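Computer science; Artificial intelligence; Benchmark (surveying); Context (archaeology); Feature (linguistics); Deep learning; Segmentation; Pattern recognition (psychology); Machine learning; Feature learning; Geography; Linguistics; Philosophy; Archaeology; Geodesy
Authors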
Weiqing Min,Zhiling Wang,Yuxin Liu,Mengjiang Luo,Liping Kang,Xiaoming Wei,Xiaolin Wei,Shuqiang Jiang
Source
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence [IEEE Computer Society]
Volume (issue): 45 (8): 9932-9949 Citations: 99
Identifier
DOI:10.1109/tpami.2023.3237871
Abstract

Food recognition plays an important role in food choice and intake, which is essential to the health and well-being of humans. It is thus of importance to the computer vision community, and can further support many food-oriented vision and multimodal tasks, e.g., food detection and segmentation, cross-modal recipe retrieval and generation. Unfortunately, while generic visual recognition has advanced remarkably thanks to released large-scale datasets, progress in the food domain largely lags behind. In this paper, we introduce Food2K, which is the largest food recognition dataset with 2,000 categories and over 1 million images. Compared with existing food recognition datasets, Food2K surpasses them in both categories and images by one order of magnitude, and thus establishes a new challenging benchmark to develop advanced models for food visual representation learning. Furthermore, we propose a deep progressive region enhancement network for food recognition, which mainly consists of two components, namely progressive local feature learning and region feature enhancement. The former adopts improved progressive training to learn diverse and complementary local features, while the latter utilizes self-attention to incorporate richer multi-scale context into local features for further local feature enhancement. Extensive experiments on Food2K demonstrate the effectiveness of our proposed method. More importantly, we have verified the better generalization ability of Food2K in various tasks, including food image recognition, food image retrieval, cross-modal recipe retrieval, food detection and segmentation. Food2K can be further explored to benefit more food-relevant tasks, including emerging and more complex ones (e.g., nutritional understanding of food), and models trained on Food2K can be expected to serve as backbones that improve performance on more food-relevant tasks. We also hope Food2K can serve as a large-scale fine-grained visual recognition benchmark and contribute to the development of large-scale fine-grained visual analysis.
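
The abstract describes region feature enhancement as using self-attention to fold multi-scale context into local features. The sketch below is only an illustration of that general idea, not the authors' network: it applies plain scaled dot-product self-attention to region vectors pooled from two hypothetical grid scales and adds the result back as a residual. All names (`self_attention`, `enhance_regions`, `rand_matrix`), the grid sizes, the feature width, and the random projection matrices are assumptions made for this example.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(regions, w_q, w_k, w_v):
    """Scaled dot-product self-attention over n region vectors of width d."""
    q, k, v = matmul(regions, w_q), matmul(regions, w_k), matmul(regions, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def enhance_regions(regions_by_scale, w_q, w_k, w_v):
    """Pool regions from all scales, attend across them, add a residual."""
    regions = [r for scale in regions_by_scale for r in scale]
    context, weights = self_attention(regions, w_q, w_k, w_v)
    enhanced = [[x + c for x, c in zip(r, ctx)] for r, ctx in zip(regions, context)]
    return enhanced, weights


def rand_matrix(rows, cols, std=1.0):
    """Hypothetical stand-in for learned features or projection weights."""
    return [[random.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
d = 8  # assumed feature width
# Assumed scales: a 2x2 grid (4 regions) and a 3x3 grid (9 regions).
regions_by_scale = [rand_matrix(4, d), rand_matrix(9, d)]
w_q, w_k, w_v = rand_matrix(d, d, 0.1), rand_matrix(d, d, 0.1), rand_matrix(d, d, 0.1)

enhanced, weights = enhance_regions(regions_by_scale, w_q, w_k, w_v)
print(len(enhanced), len(enhanced[0]))  # 13 8
```

Each output row is the original region vector plus a weighted mix of all 13 regions, so a coarse 2x2 region can draw on fine 3x3 regions and vice versa. A real model would learn the projections and would typically use a deep learning framework rather than list arithmetic.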