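Quantitative structure-activity relationship
Chemistry
Training set
Arachidonate 5-lipoxygenase
Computer science
Stereochemistry
Artificial intelligence
Drug discovery
Biochemistry
Arachidonic acid
Enzyme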
Authors
Alexey A. Lagunin, Athina Geronikaki, Phaedra Eleftheriou, Pavel V. Pogodin, Alexey Zakharov
Identifier
DOI:10.1021/acs.jcim.8b00617
Abstract
Numerous studies published in recent years report acceptable quantitative structure-activity relationship (QSAR) models based on heterogeneous data. In many cases, the training sets for QSAR modeling were constructed from compounds tested in different biological assays, contradicting the opinion that QSAR modeling should be based on data measured by a single protocol. We attempted to develop approaches that help determine how heterogeneous data should be used to create QSAR models from different sets of compounds tested by different experimental methods for the same target and the same endpoint. To this end, more than 100 QSAR models for the IC50 values of ligands interacting with cyclooxygenase-1 and -2 (COX-1, COX-2) and seed lipoxygenase (LOX), obtained from the ChEMBL database, were created using the GUSAR software. The QSAR models were tested on an external set that included 26 new thiazolidinone derivatives, which were experimentally tested for COX-1, COX-2, and LOX inhibition. The IC50 values of the derivatives varied from 89 μM to 26 μM for LOX, from 200 μM to 0.018 μM for COX-1, and from 210 μM to 1 μM for COX-2. This study showed that the accuracy of the models depends on the distribution of IC50 values of low-activity compounds in the training sets. In most cases, QSAR models built on combined training sets had advantages over QSAR models based on a single publication. We introduced a new method, called "scaling", for combining quantitative data from different experimental studies using data on reference compounds.
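The abstract does not say how "scaling" is computed, so the following is only a minimal sketch of one plausible reading: each study's values are shifted on the pIC50 scale so that a shared reference compound takes the same value in every study. The function names, the additive offset, and all compound names and numbers are hypothetical illustrations, not the authors' procedure or data.

```python
import math


def pic50_from_um(ic50_um: float) -> float:
    """Convert an IC50 given in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    return 6.0 - math.log10(ic50_um)


def scale_to_reference(study: dict, ref_name: str, ref_target: float) -> dict:
    """Shift all pIC50 values of one study by a constant offset.

    ASSUMPTION: scaling is modeled here as an additive shift that makes the
    reference compound's pIC50 equal to ref_target. The paper may use a
    different transformation.
    """
    offset = ref_target - study[ref_name]
    return {name: value + offset for name, value in study.items()}


# Hypothetical measurements (compound -> pIC50) from two different assays.
study_a = {"ref": pic50_from_um(1.0), "cmpd_1": pic50_from_um(10.0)}
study_b = {"ref": pic50_from_um(10.0), "cmpd_2": pic50_from_um(100.0)}

# Bring study B onto study A's scale using the shared reference compound.
scaled_b = scale_to_reference(study_b, "ref", study_a["ref"])

# Merge into one training set, keeping a single copy of the reference.
combined = dict(study_a)
combined.update({k: v for k, v in scaled_b.items() if k != "ref"})
print(combined)
```

Working on the logarithmic scale means a constant offset corresponds to a constant fold-change in IC50 between assays, which is a common simplifying assumption when pooling data measured under different protocols.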