双谱
超参数
计算机科学
人工智能
特征选择
特征(语言学)
计算复杂性理论
机器学习
集成学习
降维
模式识别(心理学)
算法
语言学
电信
哲学
光谱密度
作者
Jiawei Jiang,Li-Chun Xu,Fenglian Li,Jian-Li Shao
出处
期刊:Metals
[MDPI AG]
日期:2023-01-13
卷期号:13 (1): 169-169
被引量:6
摘要
With the continuous improvement of machine learning methods, building the interatomic machine learning potential (MLP) based on the datasets from quantum mechanics calculations has become an effective technical approach to improving the accuracy of classical molecular dynamics simulation. The Spectral Neighbor Analysis Potential (SNAP) is one of the most commonly used machine learning potentials. It uses the bispectrum to encode the local environment of each atom in the lattice. The hyperparameter jmax controls the mapping complexity and precision between the local environment and the bispectrum descriptor. As the hyperparameter jmax increases, the description will become more accurate, but the number of parameters in the bispectrum descriptor will increase dramatically, increasing the computational complexity. In order to reduce the computational complexity without losing the computational accuracy, this paper proposes a two-level ensemble feature selection method (EFS) for a bispectrum descriptor, combining the perturbation method and the feature selector ensemble strategy. Based on the proposed method, the feature subset is selected from the original dataset of the bispectrum descriptor for building the dimension-reduced MLP. As a method application and validation, the data of Fe, Ni, Cu, Li, Mo, Si, and Ge metal elements are used to train the linear regression model based on SNAP for predicting these metals’ atomic energies and forces them to evaluate the performance of the feature subsets. The experimental results show that, compared to the features of SNAP and qSNAP, the training complexity improvement of our EFS method on the qSNAP feature is more effective than SNAP. Compared with the existing methods, when the feature subset size is 0.7 times that of the original features, the proposed EFS method based on the SSWRP ensemble strategy can achieve the best performance in terms of stability, achieving an average stability of 0.94 across all datasets. The training complexity of the linear regression model is reduced by about half, and the prediction complexity is reduced by about 30%.
科研通智能强力驱动
Strongly Powered by AbleSci AI