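Autoencoder
Feature selection
Artificial intelligence
Pattern recognition (psychology)
Feature (linguistics)
Selection (genetic algorithm)
Computer science
Artificial neural network
Philosophy
Linguistics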
Authors
Shiqiao Gu,Matee Ullah,Jiangning Song,Dong‐Jun Yu
Identifier
DOI:10.1109/tcbbio.2025.3562809
Abstract
Accurate prediction of protein subcellular localization is critical for understanding cellular functions and guiding drug design. However, current computational methods deliver limited performance, and few efficient vision learners based on self-supervised learning exist for extracting deep, informative features. To address this, we propose a novel bioimage-based method, termed PScL-SDNNMAE, to predict the subcellular localizations of proteins in human cells. PScL-SDNNMAE first extracts classical features using traditional image descriptors. Next, a masked autoencoder (MAE) is trained on the training image data and then used to extract MAE-based deep features. In the feature selection phase, PScL-SDNNMAE applies analysis of variance (ANOVA), mutual information (MI), and stepwise discriminant analysis (SDA) to select the optimal features from the classical feature sets. Finally, PScL-SDNNMAE trains a deep neural network (DNN) classifier on the super feature set generated by integrating all the optimal classical features with the MAE-based deep features. Extensive benchmark experiments, including 10-fold cross-validation on the training dataset and an independent test on a separate dataset, show that PScL-SDNNMAE achieves better performance and generalization capability than existing state-of-the-art predictors. Moreover, the experiments demonstrate the effectiveness of self-supervised learning methods in learning representations of immunohistochemistry (IHC) images, as well as the significant potential of pre-training on massive unlabeled datasets in the future.
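The feature selection and integration steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the ANOVA stage (MI and SDA are omitted), uses synthetic data in place of real image descriptors and MAE features, and every name in it (`anova_f_scores`, `select_top_k`, `build_super_feature_set`, the array shapes, and `k`) is a hypothetical choice made for illustration.

```python
import numpy as np


def anova_f_scores(X, y):
    """Return the one-way ANOVA F statistic of each feature column."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n_samples, n_classes = X.shape[0], classes.size
    grand_mean = X.mean(axis=0)
    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        # Between-class scatter: class size times squared mean offset.
        ss_between += Xc.shape[0] * (class_mean - grand_mean) ** 2
        # Within-class scatter: squared deviations from the class mean.
        ss_within += ((Xc - class_mean) ** 2).sum(axis=0)
    ms_between = ss_between / (n_classes - 1)
    ms_within = ss_within / (n_samples - n_classes)
    # Guard against division by zero for constant features.
    return ms_between / np.maximum(ms_within, 1e-12)


def select_top_k(X, y, k):
    """Indices (ascending) of the k features with the largest F statistic."""
    scores = anova_f_scores(X, y)
    return np.sort(np.argsort(scores)[::-1][:k])


def build_super_feature_set(classical, deep, y, k):
    """Concatenate the selected classical features with the deep features."""
    kept = select_top_k(classical, y, k)
    return np.hstack([classical[:, kept], deep]), kept


# Synthetic stand-ins (assumption): 60 images, 3 classes,
# 10 classical descriptors, 4 deep features.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 20)
classical = rng.normal(size=(60, 10))
classical[:, 3] += 5.0 * y  # make column 3 class-dependent
classical[:, 7] -= 5.0 * y  # make column 7 class-dependent
deep = rng.normal(size=(60, 4))

super_features, kept = build_super_feature_set(classical, deep, y, k=2)
```

Here `super_features` holds the two selected classical columns followed by the four deep-feature columns, and would be the input to whatever classifier is trained next. In a real pipeline the selection should be fitted on the training folds only, so that the cross-validation estimate is not biased by the held-out labels.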