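Computer science
Benchmark (surveying)
Data mining
Identification (biology)
Redundancy (engineering)
Fingerprint (computing)
Set (abstract data type)
Artificial intelligence
Geodesy
Plants
Biology
Operating system
Programming language
Geography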
Authors
Hongbin Yang,Jie Li,Zengrui Wu,Weihua Li,Guixia Liu,Yun Tang
Identifier
DOI:10.1021/acs.chemrestox.7b00083
Abstract
Identifying structural alerts for toxicity is useful in drug discovery and in other fields such as environmental protection. With structural alerts, researchers can quickly flag potentially toxic compounds and learn how to modify them. Hence, it is important to determine structural alerts from a large number of compounds quickly and accurately. Many methods for identifying structural alerts have already been reported, but how to evaluate them remains a problem. In this paper, we evaluated four methods for monosubstructure identification using three indices (accuracy rate, coverage rate, and information gain) to compare their advantages and disadvantages. The Kazius' Ames mutagenicity data set was used as the benchmark. The four methods were MoSS (graph-based), SARpy (fragment-based), and two fingerprint-based methods: Bioalerts and the fingerprint (FP) method we previously used. The results showed that Bioalerts and FP could detect key substructures with high accuracy and coverage rates because they allowed unclosed rings and wildcard atom or bond types. However, they also produced redundancy, so their predictive performance was not as good as that of SARpy. SARpy was competitive in predictive performance on both the training set and the external validation set. These results might help users select appropriate methods and might inform further development of methods for identifying structural alerts.
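The abstract names three indices (accuracy rate, coverage rate, and information gain) without defining them. The sketch below shows one plausible way to compute them for a single candidate substructure. The definitions are assumptions, not taken from the paper: accuracy rate is read as the fraction of alert-containing compounds that are toxic, coverage rate as the fraction of toxic compounds that contain the alert, and information gain as the reduction in class entropy after splitting on the alert. The function names `entropy` and `alert_indices` are hypothetical.

```python
import math


def entropy(pos, neg):
    """Shannon entropy (in bits) of a two-class count."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def alert_indices(has_alert, is_toxic):
    """Return (accuracy_rate, coverage_rate, information_gain).

    has_alert: list of bools, True if the compound contains the substructure.
    is_toxic:  list of bools, True if the compound is labeled toxic.
    The three definitions are assumed for illustration (see lead-in).
    """
    n = len(has_alert)
    tp = sum(1 for a, t in zip(has_alert, is_toxic) if a and t)
    fp = sum(1 for a, t in zip(has_alert, is_toxic) if a and not t)
    fn = sum(1 for a, t in zip(has_alert, is_toxic) if not a and t)
    tn = n - tp - fp - fn

    # Assumed: share of alert-containing compounds that are toxic.
    accuracy = tp / (tp + fp) if (tp + fp) else 0.0
    # Assumed: share of toxic compounds that contain the alert.
    coverage = tp / (tp + fn) if (tp + fn) else 0.0

    # Entropy of the labels before and after splitting on the alert.
    h_before = entropy(tp + fn, fp + tn)
    h_after = 0.0
    if n:
        h_after = ((tp + fp) / n) * entropy(tp, fp) \
                + ((fn + tn) / n) * entropy(fn, tn)
    return accuracy, coverage, h_before - h_after


if __name__ == "__main__":
    # Toy data: four compounds, three contain the alert, two are toxic.
    alert = [True, True, True, False]
    toxic = [True, True, False, False]
    print(alert_indices(alert, toxic))
```

Under these assumed definitions, a substructure that matches many compounds can score high on coverage while scoring lower on accuracy, which is one reason to report the indices together rather than alone.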