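Computer science
Inference
Task (project management)
Feature selection
Artificial intelligence
Machine learning
Natural language processing
Selection (genetic algorithm)
Performance prediction
Data science
Data mining
Systems engineering
Engineering
Simulation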
Authors
Tong Xie, Yuwei Wan, Yufei Zhou, Wei Huang, Yixuan Liu, Qingyuan Linghu, Shaozhou Wang, Chunyu Kit, Clara Grazian, Wenjie Zhang, Bram Hoex
Source
Journal: Patterns
[Elsevier BV]
Date: 2024-03-22
Volume (Issue): 5 (5): 100955
Citations: 11
Identifier
DOI: 10.1016/j.patter.2024.100955
Abstract
Materials scientists usually collect experimental data to summarize experiences and predict improved materials. However, a crucial issue is how to proficiently utilize unstructured data to update existing structured data, particularly in applied disciplines. This study introduces a new natural language processing (NLP) task called structured information inference (SII) to address this problem. We propose an end-to-end approach to summarize and organize the multi-layered device-level information from the literature into structured data. After comparing different methods, we fine-tuned LLaMA, which achieved an F1 score of 87.14%, and used it to update an existing perovskite solar cell dataset with articles published since its release, allowing its direct use in subsequent data analysis. Using the structured information, we developed regression tasks to predict the electrical performance of solar cells. Our results demonstrate performance comparable to that of traditional machine-learning methods without feature selection and highlight the potential of large language models for scientific knowledge acquisition and material development.
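The abstract reports an F1 score for the extracted structured data but does not describe how it was scored. The sketch below is one plausible way to compute precision, recall, and F1 over extracted field/value pairs, assuming exact-match scoring. The function name `field_level_f1`, the field names, and the example values are hypothetical illustrations, not the authors' evaluation code or data.

```python
from typing import Dict, Tuple


def field_level_f1(
    predicted: Dict[str, str], reference: Dict[str, str]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exact-match field/value pairs.

    Assumption: a predicted field counts as correct only if both its name
    and its value match the reference exactly.
    """
    pred_items = set(predicted.items())
    ref_items = set(reference.items())
    true_pos = len(pred_items & ref_items)

    precision = true_pos / len(pred_items) if pred_items else 0.0
    recall = true_pos / len(ref_items) if ref_items else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical device record (values invented for illustration only).
reference = {
    "absorber": "MAPbI3",
    "etl": "TiO2",
    "htl": "Spiro-OMeTAD",
    "pce_percent": "18.2",
}
# Hypothetical model output: one wrong value, one missing field.
predicted = {
    "absorber": "MAPbI3",
    "etl": "SnO2",
    "htl": "Spiro-OMeTAD",
}

precision, recall, f1 = field_level_f1(predicted, reference)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy record, two of the three predicted fields are correct and two of the four reference fields are recovered, so a wrong value lowers both precision and recall, while a missing field lowers recall only.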