Toward more realistic drug-target interaction predictions

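Benchmarking; Computer science; Drug target; Binary classification; Machine learning; Set (abstract data type); Artificial intelligence; Test apparatus; Binary number; Data mining; Drug discovery; Computational biology; Support vector machine; Bioinformatics; Pharmacology; Mathematics; Biology; Marketing; Business; Arithmetic; Programming language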

Authors

Tapio Pahikkala,Antti Airola,Samuli Pietilä,Sushil Kumar Shakyawar,Agnieszka Szwajda,Jing Tang,Tero Aittokallio

Source

Journal: Briefings in Bioinformatics [Oxford University Press]
Date: 2014-04-09 Volume (issue): 16 (2): 325-337 Citations: 455

Links

europepmc.org nih.gov doi.org

Identifiers

DOI: 10.1093/bib/bbu010

Abstract

A number of supervised machine learning models have recently been introduced for the prediction of drug-target interactions based on chemical structure and genomic sequence information. Although these models could offer improved means for many network pharmacology applications, such as repositioning of drugs for new therapeutic uses, the prediction models are often constructed and evaluated under overly simplified settings that do not reflect the real-life problem in practical applications. Using quantitative drug-target bioactivity assays for kinase inhibitors, as well as a popular benchmarking data set of binary drug-target interactions for enzyme, ion channel, nuclear receptor and G protein-coupled receptor targets, we illustrate here the effects of four factors that may lead to dramatic differences in the prediction results: (i) problem formulation (standard binary classification or more realistic regression formulation), (ii) evaluation data set (drug and target families in the application use case), (iii) evaluation procedure (simple or nested cross-validation) and (iv) experimental setting (whether training and test sets share common drugs and targets, only drugs or targets or neither). Each of these factors should be taken into consideration to avoid reporting overoptimistic drug-target interaction prediction results. We also suggest guidelines on how to make supervised drug-target interaction prediction studies more realistic in terms of model formulations and evaluation setups that better address the inherent complexity of the prediction task in practical applications, as well as novel benchmarking data sets that capture the continuous nature of the drug-target interactions for kinase inhibitors.
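
Factor (iv) in the abstract, the experimental setting, can be made concrete with a small sketch. The Python snippet below is illustrative only and is not taken from the paper: the function name `experimental_setting`, the returned labels, and the toy drug and target identifiers are all assumptions introduced here. It classifies a test pair by whether its drug, its target, both, or neither also appear in the training pairs.

```python
def experimental_setting(test_pair, train_pairs):
    """Label a (drug, target) test pair by what it shares with the training set.

    Hypothetical helper for illustration; the labels are not the paper's own terms.
    """
    drug, target = test_pair
    # Collect every drug and every target that occurs in at least one training pair.
    train_drugs = {d for d, _ in train_pairs}
    train_targets = {t for _, t in train_pairs}

    drug_seen = drug in train_drugs
    target_seen = target in train_targets

    if drug_seen and target_seen:
        return "shares drug and target"
    if drug_seen:
        return "shares drug only"
    if target_seen:
        return "shares target only"
    return "shares neither"


# Toy training set (made-up identifiers).
train = [("d1", "t1"), ("d2", "t2")]

for pair in [("d1", "t2"), ("d1", "t9"), ("d9", "t1"), ("d9", "t9")]:
    print(pair, "->", experimental_setting(pair, train))
```

Reporting performance separately for each of these four groups, rather than pooling all test pairs, is one way to see how much a result depends on drugs or targets already being present in the training data.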
