Computer science
Random forest
Raw data
Support vector machine
Machine learning
Artificial intelligence
Data mining
Programming language
Authors
Atsushi Matsumoto, Shin Aoki, Hayato Ohwada
Source
Journal: International Journal of Machine Learning and Computing
[EJournal Publishing]
Date: 2016-04-01
Volume (issue): 6 (2): 145-148
Citations: 13
Identifier
DOI:10.18178/ijmlc.2016.6.2.589
Abstract
This paper compares random forest and SVM on raw data in drug discovery. Both machine-learning methods are often applied in drug discovery, and the choice between them should depend on the problem. SVM is suitable for virtual screening when the target protein is known. In contrast, random forest is suitable for virtual screening when the target protein is unknown or cannot be determined uniquely, because random forest can find good combinations of features among many features. Random forest is therefore more effective for problems that include many unknown parts. Selecting good features is important in both methods; in particular, the features must be narrowed using importance calculations when sufficient biochemical knowledge is lacking. As a case study, we predicted the radiation protection function and toxicity of radioprotectors targeting p53. When predicting the radiation protection function, the target protein is known. In contrast, when predicting toxicity, the target protein is unknown or cannot be determined uniquely. We evaluated each experiment based on its AUC score. We found that SVM was better than random forest when predicting the radiation protection function, whereas random forest was better than SVM when predicting toxicity.
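The abstract ranks the two classifiers by AUC score. As a rough illustration of what that metric measures, the sketch below computes AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name `auc_score`, the labels, and both score lists are hypothetical and are not taken from the paper; the paper's own data and implementation are not shown in this record.

```python
def auc_score(labels, scores):
    """Area under the ROC curve, computed pairwise (Mann-Whitney form).

    labels: iterable of 0/1 class labels (1 = active / positive).
    scores: iterable of classifier scores, higher = more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical example: the same six compounds scored by two models.
labels = [1, 1, 0, 1, 0, 0]
model_a_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
model_b_scores = [0.9, 0.4, 0.7, 0.6, 0.3, 0.1]

auc_a = auc_score(labels, model_a_scores)
auc_b = auc_score(labels, model_b_scores)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

The pairwise form is quadratic in the number of examples, which is fine for a small illustration; a rank-based computation would be the usual choice for larger screening sets.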