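hERG
Chemistry
Cardiotoxicity
Set (abstract data type)
Channel (broadcasting)
Usability
Computer science
Small molecule
Toxicity
Drug discovery
Machine learning
Virtual screening
Data mining
Artificial intelligence
Bioinformatics
Human-computer interaction
Biology
Potassium channel
Medicine
Biochemistry
Organic chemistry
Programming language
Computer network
Endocrinology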
Authors
Issar Arab, Kris Laukens, Wout Bittremieux
Identifier
DOI:10.1021/acs.jcim.4c01102
Abstract
Predicting drug toxicity is a critical aspect of ensuring patient safety during the drug design process. Although conventional machine learning techniques have shown some success in this field, the scarcity of annotated toxicity data poses a significant challenge in enhancing models' performance. In this study, we explore the potential of leveraging large unlabeled small molecule data sets using semisupervised learning to improve drug cardiotoxicity predictive performance across three cardiac ion channel targets: the voltage-gated potassium channel (hERG), the voltage-gated sodium channel (Nav1.5), and the voltage-gated calcium channel (Cav1.2). We extensively mined the ChEMBL database, comprising approximately 2 million small molecules, and then employed semisupervised learning to construct robust classification models for this purpose. We achieved a performance boost on highly diverse (i.e., structurally dissimilar) test data sets across all three targets. Using our built models, we screened the whole ChEMBL database and a large set of FDA-approved drugs, identifying several compounds with potential cardiac ion channel activity. To ensure broad accessibility and usability for both technical and nontechnical users, we developed a cross-platform graphical user interface that allows users to make predictions and gain insights into the cardiotoxicity of drugs and other small molecules. The software is made available as open source under the permissive MIT license at https://github.com/issararab/CToxPred2.