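Principal component analysis
Feature selection
Support vector machine
Computer science
Artificial neural network
Multivariable calculus
Artificial intelligence
Biological system
Feedforward neural network
Pattern recognition (psychology)
Linear regression
Feature (linguistics)
Cross-validation
Regression
Regression analysis
Amyloid (mycology)
Chemistry
Machine learning
Mathematics
Statistics
Biology
Engineering
Philosophy
Control engineering
Inorganic chemistry
Linguistics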
Authors
Wuyue Yang,Pengzhen Tan,Xianjun Fu,Liu Hong
Abstract
This paper reports a novel data-driven machine learning algorithm for predicting amyloid aggregation rates. The model is a highly nonlinear mapping from 16 intrinsic features of a protein and 4 extrinsic features of its environment to the protein aggregation rate. A feedforward fully connected neural network (FCN) with one hidden layer is trained on a dataset of 21 different kinds of amyloid proteins and tested on the remaining 4 proteins. With an average accuracy higher than 90%, the FCN performs much better than traditional algorithms such as multivariable linear regression and support vector regression. Furthermore, correlation analysis and principal component analysis show that seven key features (folding energy; HP patterns for helices, sheets, and membrane-crossing helices; pH; ionic strength; and protein concentration) constitute a minimum feature set for characterizing amyloid aggregation kinetics.
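As a rough illustration of the architecture the abstract describes, the following is a minimal sketch of a one-hidden-layer fully connected regression network with 16 + 4 = 20 inputs and a single output. It is not the authors' implementation. The hidden-layer width, the tanh activation, the squared-error loss, the plain SGD training loop, and the synthetic data are all assumptions made for illustration; the abstract specifies none of them.

```python
import math
import random

N_INTRINSIC, N_EXTRINSIC = 16, 4      # feature counts stated in the abstract
N_INPUT = N_INTRINSIC + N_EXTRINSIC   # 20 inputs in total
N_HIDDEN = 8                          # hypothetical hidden-layer width (assumption)


def init_params(rng):
    """Small random weights for a 20 -> N_HIDDEN -> 1 network."""
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUT)] for _ in range(N_HIDDEN)]
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
    return {"w1": w1, "b1": [0.0] * N_HIDDEN, "w2": w2, "b2": 0.0}


def forward(params, x):
    """Return (hidden activations, predicted rate) for one feature vector."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(params["w1"], params["b1"])]
    y = sum(w * hi for w, hi in zip(params["w2"], h)) + params["b2"]
    return h, y


def sgd_step(params, x, target, lr=0.01):
    """One stochastic gradient step on the squared error 0.5 * (y - target)**2."""
    h, y = forward(params, x)
    err = y - target
    for j in range(N_HIDDEN):
        # Backpropagate through tanh, using the output weight before it is updated.
        grad_pre = err * params["w2"][j] * (1.0 - h[j] ** 2)
        params["w2"][j] -= lr * err * h[j]
        for i in range(N_INPUT):
            params["w1"][j][i] -= lr * grad_pre * x[i]
        params["b1"][j] -= lr * grad_pre
    params["b2"] -= lr * err


def mse(params, data):
    """Mean squared error of the network over (features, target) pairs."""
    return sum((forward(params, x)[1] - t) ** 2 for x, t in data) / len(data)


def make_synthetic_data(rng, n=100):
    """Random stand-in data; NOT the paper's protein dataset."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in range(N_INPUT)]
        target = math.tanh(x[0] + x[1] + x[2]) + 0.5 * x[16]  # arbitrary nonlinear rule
        data.append((x, target))
    return data


def train(params, data, rng, epochs=50, lr=0.01):
    """Plain SGD over shuffled samples; updates params in place."""
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, t in data:
            sgd_step(params, x, t, lr)
    return params


if __name__ == "__main__":
    rng = random.Random(0)
    samples = make_synthetic_data(rng)
    train_set, test_set = samples[:80], samples[80:]  # hold out a test split
    params = init_params(rng)
    print("held-out MSE before training:", mse(params, test_set))
    train(params, train_set, rng)
    print("held-out MSE after training:", mse(params, test_set))
```

The hold-out split loosely mirrors the evaluation in the abstract, where the network is trained on one group of proteins and tested on proteins it has not seen. In practice a library regressor would replace this hand-written loop.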