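Feature (linguistics)
Ranking (information retrieval)
Quantization (signal processing)
Computer science
Pattern recognition (psychology)
Algorithm
Artificial intelligence
Linguistics
Philosophy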
Authors
Chaolu Meng,Yunyun Shi,Quan Zou,Ruijun Liu
Identifier
DOI:10.1109/tcbbio.2024.3524677
Abstract
The identification of protein sequences depends on the effective selection of an optimized set of features. Traditional algorithms prioritize global feature importance, often overshadowing the significance of local metrics. Addressing this imbalance, we introduce an innovative algorithm that fuses feature ranking with an advanced weight quantization technique. This algorithm unfolds in two pivotal stages: initially, it generates a weighted directed graph based on normal distribution metrics; subsequently, it employs the HodgeRank algorithm to amalgamate these rankings. Specifically, the algorithm evaluates feature score normality by employing z-scores for skewness and kurtosis, resulting in a graph that quantitatively reflects both local and global feature contributions. HodgeRank then translates this graph into a Laplacian matrix, enabling the calculation of a comprehensive scoring function for each feature. We refine the initial rankings by incorporating weights during the integration phase, capturing a holistic view of feature significance. The proposed method, termed HodgeRankWeight, showcases superior performance, achieving accuracy rates of 87.02%, 92.84%, and 74.51% across different datasets. In head-to-head comparisons, HodgeRankWeight outstripped existing models, achieving an overall accuracy of 82.6923% and setting a new benchmark for precision in protein sequence identification. We also offer a complimentary web server for related research. The website is http://112.124.26.17:8004/ .
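The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' HodgeRankWeight implementation: the abstract does not say how the skewness and kurtosis z-scores are turned into edge weights, so the sketch only shows (a) the usual large-sample z-scores for skewness and excess kurtosis, and (b) the standard least-squares HodgeRank step that solves a graph Laplacian system for a global score per feature. The function names, the matrices Y and W, and the uniform weights in the usage lines are assumptions made for illustration.

```python
import numpy as np


def normality_z_scores(x):
    """Return (z_skew, z_kurt) for a 1-D sample.

    Uses the large-sample standard errors sqrt(6/n) and sqrt(24/n).
    This is a common approximation, assumed here; the paper may use
    a different normality statistic.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d ** 2)
    skew = np.mean(d ** 3) / m2 ** 1.5
    excess_kurt = np.mean(d ** 4) / m2 ** 2 - 3.0
    return skew / np.sqrt(6.0 / n), excess_kurt / np.sqrt(24.0 / n)


def hodgerank_scores(Y, W):
    """Least-squares HodgeRank on a weighted comparison graph.

    Y[i, j] is the observed preference of feature j over feature i
    (skew-symmetric), W[i, j] >= 0 is the symmetric weight of that
    comparison. Minimizes sum_ij W[i, j] * (s[j] - s[i] - Y[i, j])**2,
    whose normal equations are L s = -div with L the graph Laplacian.
    """
    Y = np.asarray(Y, dtype=float)
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W      # graph Laplacian D - W
    div = (W * Y).sum(axis=1)           # weighted divergence of Y
    # The pseudo-inverse picks the minimum-norm solution, which has
    # zero mean on a connected graph.
    return np.linalg.pinv(L) @ (-div)


# Hypothetical usage: four features, uniform weights, and pairwise
# differences taken from made-up per-feature scores.
base = np.array([0.5, -1.0, 2.0, 0.0])
Y = base[None, :] - base[:, None]       # Y[i, j] = base[j] - base[i]
W = np.ones((4, 4)) - np.eye(4)
scores = hodgerank_scores(Y, W)
ranking = np.argsort(-scores)           # best feature first
```

When the pairwise differences are perfectly consistent, as in the usage lines, the recovered scores equal the made-up base scores shifted to zero mean. With noisy or conflicting comparisons the same solve returns the closest consistent ranking in the weighted least-squares sense, which is where non-uniform weights would change the result.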