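Computer science
Annotation
Natural language processing
Artificial intelligence
Information retrieval
Construct (Python library)
Word (group theory)
Variety (cybernetics)
Pattern recognition (psychology)
Linguistics
Philosophy
Programming language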
Authors
Eric Schwenker, Weixin Jiang, Trevor Spreadbury, Nicola Ferrier, Oliver Cossairt, Maria K. Y. Chan
Source
Journal: Patterns
[Elsevier]
Date: 2023-09-30
Volume (issue): 4 (11): 100843-100843
Citations: 11
Identifier
DOI:10.1016/j.patter.2023.100843
Abstract
This work introduces the EXSCLAIM! toolkit for the automatic extraction, separation, and caption-based natural language annotation of images from scientific literature. EXSCLAIM! is used to show how rule-based natural language processing and image recognition can be leveraged to construct an electron microscopy dataset containing thousands of keyword-annotated nanostructure images. Moreover, it is demonstrated how a combination of statistical topic modeling and semantic word similarity comparisons can be used to increase the number and variety of keyword annotations on top of the standard annotations from EXSCLAIM! With large-scale imaging datasets constructed from scientific literature, users are well positioned to train neural networks for classification and recognition tasks specific to microscopy, tasks often otherwise inhibited by a lack of sufficient annotated training data.