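Consistency (knowledge bases)
Usability
Computer science
Gold standard (test)
Test (biology)
Medical diagnosis
Natural language processing
Quality (philosophy)
Artificial intelligence
Data mining
Medical physics
Medicine
Statistics
Mathematics
Radiology
Human-computer interaction
Biology
Paleontology
Philosophy
Epistemology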
Authors
Yao-Shun Chuang, Xiaoqian Jiang, Chun‐Teh Lee, Ryan Brandon, Duong Thuy Tran, Oluwabunmi Tokede, Muhammad F. Walji
Source
Journal: Cornell University - arXiv
Date: 2023-01-01
Citations: 1
Identifier
DOI:10.48550/arxiv.2311.10810
Abstract
This study explored the usability of prompt generation for named entity recognition (NER) tasks and how performance varies across prompt settings. Prompts generated with GPT-J models were used in two ways: to test directly against the gold standard, and to generate the seed, which was then fed to a RoBERTa model through the spaCy package. In the direct test, prompts with a lower ratio of negative examples and a higher number of examples achieved the best results, with an F1 score of 0.72. After training with the RoBERTa model, performance was consistent across all settings, with F1 scores of 0.92-0.97. The study highlights the importance of seed quality rather than quantity when feeding NER models. This research reports an efficient and accurate way to mine clinical notes for periodontal diagnoses, allowing researchers to build an NER model quickly and easily with the prompt generation approach.
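The two prompt settings named in the abstract, the number of examples and the ratio of negative examples, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt layout, the function name `build_prompt`, the `none` label for negative examples, and the toy sentences are all assumptions made up for illustration, and no model is called.

```python
import random

# Toy sentences invented for illustration only; they are not from the study's data.
POSITIVES = [
    ("Generalized periodontitis, stage III, grade B.", "periodontitis stage III grade B"),
    ("Dx: localized periodontitis stage II grade A.", "periodontitis stage II grade A"),
    ("Plaque-induced gingivitis noted at recall.", "gingivitis"),
]
NEGATIVES = [
    "Patient presents for routine cleaning.",
    "Reviewed oral hygiene instructions with patient.",
]


def build_prompt(positives, negatives, n_examples, negative_ratio, query, seed=0):
    """Assemble a few-shot prompt (hypothetical format).

    n_examples     -- total number of worked examples placed in the prompt
    negative_ratio -- fraction of those examples that contain no diagnosis
    """
    rng = random.Random(seed)
    n_neg = round(n_examples * negative_ratio)
    n_pos = n_examples - n_neg
    # random.sample raises ValueError if more examples are requested than exist.
    chosen = list(rng.sample(positives, n_pos))
    chosen += [(text, "none") for text in rng.sample(negatives, n_neg)]
    rng.shuffle(chosen)
    blocks = [f"Text: {text}\nDiagnosis: {label}" for text, label in chosen]
    # The final block is left open for the language model to complete.
    blocks.append(f"Text: {query}\nDiagnosis:")
    return "\n\n".join(blocks)


prompt = build_prompt(
    POSITIVES, NEGATIVES,
    n_examples=4, negative_ratio=0.25,
    query="Assessment: generalized periodontitis stage IV grade C.",
)
print(prompt)
```

Under these assumptions, a "lower ratio of negative examples with higher numbers of examples" corresponds to a smaller `negative_ratio` and a larger `n_examples`. The completed text from a generative model would then serve either as a direct prediction or as a seed annotation for training a downstream NER model.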