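Preprocessor
Computer science
Feature extraction
Artificial intelligence
Feature (linguistics)
Machine learning
Structural bioinformatics
Sequence (biology)
Software
Data mining
Pattern recognition (psychology)
Protein structure
Biology
Programming language
Philosophy
Linguistics
Biochemistry
Genetics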
Authors
Hamid D. Ismail,C White,Hussam Al-Barakati,Robert H. Newman,Dukka B. Kc
Source
Journal: Methods in Molecular Biology
Date: 2022-01-01
Pages: 65-104
Citations: 3
Identifiers
DOI: 10.1007/978-1-0716-2317-6_3
Abstract
Machine learning has become one of the most popular choices for developing computational approaches in protein structural bioinformatics. Extracting features from protein sequence or structure is often a crucial step in developing machine learning-based approaches. Over the years, various sequence, structural, and physicochemical descriptors have been developed for proteins, and these descriptors have been used to address a range of bioinformatics prediction problems. Accordingly, several feature extraction tools have been developed to help researchers generate numeric features from protein sequences. Most of these tools are limited in the number of sequences they can handle and require additional preprocessing of the generated features before they can be fed to machine learning methods. Here, we present Feature Extraction from Protein Sequences (FEPS), a toolkit for feature extraction. FEPS is a versatile software package that generates various descriptors from protein sequences and can handle multiple sequences, the number of which is limited only by the available computational resources. In addition, the features extracted by FEPS require no subsequent processing and are ready to be fed to machine learning techniques, because FEPS provides various output formats as well as the ability to concatenate the generated features. FEPS is freely available via an online web server as well as a stand-alone toolkit. As a comprehensive toolkit for feature extraction, FEPS will help spur the development of machine learning-based models for various bioinformatics problems.
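To make the idea of generating and concatenating numeric descriptors concrete, the sketch below computes two widely used sequence descriptors, amino acid composition and dipeptide composition, and joins them into one feature vector per protein. It is an illustration only and is not the FEPS implementation or its interface: the function names (`amino_acid_composition`, `dipeptide_composition`, `extract_features`) and the choice of these two descriptors are assumptions made for this example.

```python
from itertools import product

# Assumption for this sketch: only the 20 standard residues are counted;
# any other character (e.g. X, gaps) is ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return the fraction of each standard residue in seq (20 values)."""
    seq = seq.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [c / total for c in counts]


def dipeptide_composition(seq):
    """Return the fraction of each ordered residue pair in seq (400 values)."""
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard characters
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


def extract_features(sequences):
    """Map each sequence name to its concatenated descriptor vector (420 values)."""
    return {
        name: amino_acid_composition(seq) + dipeptide_composition(seq)
        for name, seq in sequences.items()
    }


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the paper.
    demo = {"protein_1": "MKTAYIAKQR", "protein_2": "ACACGGHHKL"}
    for name, vector in extract_features(demo).items():
        print(name, len(vector))
```

Each resulting vector has a fixed length regardless of sequence length, which is the property that lets such descriptors be passed to a machine learning method without further reshaping.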