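Chemistry
Oligonucleotide
Linear discriminant analysis
Support vector machine
Principal component analysis
DNA
Raman spectroscopy
Computational biology
Artificial intelligence
Pattern recognition (psychology)
Typing
Biological system
Combinatorial chemistry
Biochemistry
Computer science
Speech recognition
Biology
Physics
Optics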
Authors
Guantong Xu, Ying Bao, Yujing Zhang, Xiaoxuan Xiang, Hong Qun Luo, Xinhua Guo
Identifier
DOI:10.1021/acs.analchem.4c02143
Abstract
Surface-enhanced Raman spectroscopy (SERS) has been demonstrated as an effective method for elucidating the secondary structural characteristics of DNA. However, the inherent complexity of DNA conformations and the scarcity of SERS samples make it difficult to identify the many possible secondary structures. To address these issues, a method integrating machine learning with SERS was proposed to analyze the SERS spectra of 54 oligonucleotides with well-defined conformations, namely G-quadruplex (G4), i-motif (iM), double-strand (DS), and single-strand (SS) configurations. Principal component analysis (PCA) separated the oligonucleotides into three distinct conformational groups (G4s, iMs, and others). Linear discriminant analysis (LDA), K-nearest neighbor (KNN), and support vector machine (SVM) approaches were then used to improve the typing accuracy for the 54 trained sequences. This enabled correct classification of the structures of five untrained sequences, as well as identification of the predominant conformations (G4, iM, and DS) formed by two complementary G-rich and C-rich sequences under acidic and neutral pH conditions. These results demonstrate the potential of the proposed methodology for rapid screening and prediction of DNA secondary conformations.
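The workflow outlined in the abstract (reduce spectra with PCA, then classify) can be illustrated with a minimal sketch on synthetic data. This is not the authors' code or data: the band positions, noise level, number of spectral bins, and the 18-per-group split are all hypothetical, and only a KNN classifier is shown, with LDA and SVM omitted for brevity. The example uses only the Python standard library so it runs anywhere; in practice a library such as scikit-learn would normally handle PCA and classification.

```python
import math
import random

random.seed(0)  # fixed seed so the synthetic data are reproducible

N_BINS = 40  # hypothetical number of wavenumber bins per spectrum
# Hypothetical band positions for each group; not taken from the paper.
CLASS_PEAKS = {"G4": 8, "iM": 20, "other": 32}


def synthetic_spectrum(peak):
    """One Gaussian band plus noise, standing in for a measured SERS spectrum."""
    return [math.exp(-((i - peak) ** 2) / 18.0) + random.gauss(0.0, 0.05)
            for i in range(N_BINS)]


def make_dataset(n_per_class):
    X, y = [], []
    for label, peak in CLASS_PEAKS.items():
        for _ in range(n_per_class):
            X.append(synthetic_spectrum(peak))
            y.append(label)
    return X, y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def center(X, means):
    return [[v - m for v, m in zip(row, means)] for row in X]


def principal_axes(Xc, k, iters=100):
    """Leading k principal axes of centered data, by power iteration."""
    axes = []
    for _ in range(k):
        v = [random.random() for _ in range(len(Xc[0]))]
        for _ in range(iters):
            scores = [dot(row, v) for row in Xc]           # X v
            w = [sum(s * row[j] for s, row in zip(scores, Xc))
                 for j in range(len(v))]                   # X^T (X v)
            for a in axes:                                 # deflate earlier axes
                proj = dot(w, a)
                w = [wi - proj * ai for wi, ai in zip(w, a)]
            norm = math.sqrt(dot(w, w))
            v = [wi / norm for wi in w]
        axes.append(v)
    return axes


def project(Xc, axes):
    return [[dot(row, a) for a in axes] for row in Xc]


def knn_predict(train_pts, train_labels, point, k=3):
    """Majority vote among the k nearest training points (Euclidean)."""
    ranked = sorted(zip(train_pts, train_labels),
                    key=lambda pair: math.dist(pair[0], point))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


X_train, y_train = make_dataset(18)  # 3 x 18 = 54 training spectra
X_test, y_test = make_dataset(2)     # 6 held-out spectra

# Fit the PCA on training spectra only, then reuse its mean and axes for test data.
means = [sum(col) / len(col) for col in zip(*X_train)]
axes = principal_axes(center(X_train, means), k=2)
train_scores = project(center(X_train, means), axes)
test_scores = project(center(X_test, means), axes)

predictions = [knn_predict(train_scores, y_train, p) for p in test_scores]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The held-out spectra are projected with the mean and axes learned from the training set, so no information from the test data enters the model. Real SERS spectra would typically also need preprocessing, such as baseline correction and normalization, before a step like this.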