Computer science
Leverage (statistics)
Embedding
Artificial intelligence
Class (philosophy)
Object (grammar)
Visualization
Cognitive neuroscience of visual object recognition
Natural language processing
Training set
Image (mathematics)
Deep learning
Information retrieval
Pattern recognition (psychology)
Authors
Andrea Frome, Greg S. Corrado, Jon Shlens, Samy Bengio, Jeff Dean, Marc'Aurelio Ranzato, Tomáš Mikolov
Source
Venue: Neural Information Processing Systems
Date: 2013-12-05
Volume/pages: 26: 2121-2129
Citations: 2213
Abstract
Modern visual recognition systems are often limited in their ability to scale to large numbers of object categories. This limitation is in part due to the increasing difficulty of acquiring sufficient training data in the form of labeled images as the number of object categories grows. One remedy is to leverage data from other sources - such as text data - both to train visual models and to constrain their predictions. In this paper we present a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data as well as semantic information gleaned from unannotated text. We demonstrate that this model matches state-of-the-art performance on the 1000-class ImageNet object recognition challenge while making more semantically reasonable errors, and also show that the semantic information can be exploited to make predictions about tens of thousands of image labels not observed during training. Semantic knowledge improves such zero-shot predictions achieving hit rates of up to 18% across thousands of novel labels never seen by the visual model.
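The abstract describes a model that maps image features into a word-embedding space learned from unannotated text, so that recognition becomes nearest-neighbor search among label word vectors, including labels never seen during visual training. A minimal sketch of that idea follows; the dimensions, the linear projection `M`, the margin value, and the hinge-style ranking loss here are illustrative stand-ins, not the paper's exact architecture or hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not the paper's actual dimensions.
d_img, d_word, n_labels = 16, 8, 5

# Stand-ins for pretrained components:
# - word_vecs: unit-norm label embeddings from a text model (e.g. skip-gram)
# - M: learned linear projection from image-feature space into word space
word_vecs = rng.normal(size=(n_labels, d_word))
word_vecs /= np.linalg.norm(word_vecs, axis=1, keepdims=True)
M = rng.normal(scale=0.1, size=(d_word, d_img))

def hinge_rank_loss(img_feat, true_idx, margin=0.1):
    """Sum of margin violations where a wrong label's similarity comes
    within `margin` of the true label's similarity (a ranking-style loss)."""
    z = M @ img_feat                       # project image into word space
    sims = word_vecs @ z                   # dot-product similarity to each label
    viol = margin - sims[true_idx] + sims  # per-label margin violations
    viol[true_idx] = 0.0                   # the true label incurs no penalty
    return float(np.maximum(0.0, viol).sum())

def predict(img_feat, candidate_vecs):
    """Label prediction as nearest neighbor in word-vector space; the
    candidate set may include labels unseen during visual training
    (zero-shot prediction)."""
    z = M @ img_feat
    return int(np.argmax(candidate_vecs @ z))

img = rng.normal(size=d_img)
loss = hinge_rank_loss(img, true_idx=2)
pred = predict(img, word_vecs)
```

Because prediction only requires a word vector for each candidate label, swapping in embeddings for tens of thousands of unseen labels extends the model's output space without retraining the visual component, which is the mechanism behind the zero-shot hit rates the abstract reports.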