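Lipotoxicity
Transfer learning
Nuclear receptor
Receptor
Chemistry
Computer science
Artificial intelligence
Medicine
Internal medicine
Biochemistry
Insulin resistance
Insulin
Transcription factor
Gene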
Authors
Rouyi Wang, Shujun Yi, Guoqiang Shan, Lingyan Zhu
Identifier
DOI:10.1021/acs.est.5c07895
Abstract
Per- and polyfluoroalkyl substances (PFAS) can induce hepatic lipotoxicity by activating nuclear receptors (NRs). Here, we first developed machine-learning models to predict the activities of PFAS toward five NRs related to hepatic lipotoxicity, using five conventional algorithms and three commonly used data sets: a general chemical data set (A-data set, 6388-10199 compounds), a broad PFAS data set based on the OECD definition (B-data set, 369-772 compounds), and a strictly defined PFAS data set (C-data set, 184-198 compounds). Unexpectedly, the models trained on the broad chemical spaces (A- and B-data sets) identified active PFAS poorly, which might be due to distributional shifts. The models trained on the C-data set identified active PFAS best, but discriminated less well than the models trained on the A-data set. Therefore, a transfer-learning multitask deep neural network (TL-MT-DNN) was implemented to transfer knowledge from the A-data set to the C-data set, which greatly improved prediction performance, reaching an average AUC of 0.886 and an average F1 of 0.665. When this model was applied to 3716 PFAS from the PFASSTRUCTv5 database, 391 compounds were predicted to activate all five NRs. The reliability of the model's predictions was validated by in vitro cell-based assays and in vivo animal experiments. This study provides a modeling strategy that improves PFAS activity prediction by overcoming the distributional shift inherent in models trained on broad chemical spaces, and highlights its potential for practical application in risk screening.
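The transfer step described in the abstract (pretrain on a large general data set, then adapt to a small PFAS-only data set) can be illustrated with a toy sketch. This is not the authors' TL-MT-DNN: the abstract does not give the architecture or the fine-tuning scheme, so everything below is an assumption made for illustration. That includes the class `MultiTaskNet`, the single shared tanh layer, the choice to freeze the shared layer during fine-tuning, the synthetic data, and the use of two tasks in place of the five NRs.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class MultiTaskNet:
    """Hypothetical toy model: one shared tanh layer, one sigmoid head per task."""

    def __init__(self, n_in, n_hidden, n_tasks, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.heads = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                      for _ in range(n_tasks)]

    def hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.W]

    def predict(self, x):
        h = self.hidden(x)
        return [sigmoid(sum(v * hj for v, hj in zip(head, h)))
                for head in self.heads]

    def train(self, data, epochs, lr, freeze_shared=False):
        """Plain SGD on the summed binary cross-entropy over all tasks."""
        for _ in range(epochs):
            for x, labels in data:
                h = self.hidden(x)
                p = [sigmoid(sum(v * hj for v, hj in zip(head, h)))
                     for head in self.heads]
                # d(loss)/d(logit) for a sigmoid output with cross-entropy
                err = [pt - yt for pt, yt in zip(p, labels)]
                if not freeze_shared:
                    # Backpropagate through the heads (old values) and tanh.
                    for j, row in enumerate(self.W):
                        grad_h = sum(e * head[j]
                                     for e, head in zip(err, self.heads))
                        g = grad_h * (1.0 - h[j] ** 2)
                        for i in range(len(row)):
                            row[i] -= lr * g * x[i]
                for e, head in zip(err, self.heads):
                    for j in range(len(head)):
                        head[j] -= lr * e * h[j]


def make_data(n, shift, rng):
    """Synthetic stand-in for descriptors and two activity labels."""
    data = []
    for _ in range(n):
        x = [rng.gauss(shift, 1.0) for _ in range(3)]
        labels = [float(x[0] + x[1] > 2 * shift), float(x[1] - x[2] > 0)]
        data.append((x, labels))
    return data


rng = random.Random(1)
source = make_data(200, 0.0, rng)   # large, general set (plays the A-data set role)
target = make_data(20, 0.5, rng)    # small, shifted set (plays the C-data set role)

model = MultiTaskNet(n_in=3, n_hidden=4, n_tasks=2)
model.train(source, epochs=20, lr=0.05)                      # pretraining
shared_before = [row[:] for row in model.W]
heads_before = [head[:] for head in model.heads]
model.train(target, epochs=20, lr=0.05, freeze_shared=True)  # fine-tuning

print(model.predict(target[0][0]))  # one probability per task
```

Freezing the shared layer is only one way to transfer knowledge; fine-tuning all layers with a smaller learning rate is a common alternative, and the abstract does not say which, if either, the study used.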