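Pearson product-moment correlation coefficient
Reliability (semiconductor)
Computer science
Correlation coefficient
Data mining
Variable (mathematics)
Distance correlation
Statistics
Random variable
Mathematics
Machine learning
Mathematical analysis
Power (physics)
Physics
Quantum mechanics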
Authors
Hewei Zhang,Xiaoguang Di,Longlong Yang
Identifier
DOI:10.1109/iccia59741.2023.00023
Abstract
Because measuring the importance of the variables in an input dataset matters, Variable Importance Analysis (VIA) techniques have drawn great attention in reliability assessment and prediction. Traditionally, the Pearson Correlation Coefficient (Pearson CC) has been used to select the most critical variables based on the dataset alone. A clear disadvantage of this technique is that it fails to provide satisfactory results for large-scale datasets containing both linear and non-linear relationships, which limits its practical application. This paper proposes an improved Pearson CC method to better perform VIA for reliability requirements. The proposed method improves on the original in that the Pearson CC is derived, using information theory, from the total information provided by all input variables rather than from an analysis of the linear relationship alone. Data from an in-service pipeline is used as a case study to demonstrate how the proposed improved Pearson CC is applied in a real application. Comparative analysis with the existing method's results provides clear evidence that the proposed method is more accurate, as it assigns higher values to variables with non-linear relationships. Moreover, the results show that the improved method, in accordance with information theory, is suitable for both structured and unstructured data, which promotes the exploration of VIA based on big data analysis in line with current development trends.
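The limitation named in the abstract, that the standard Pearson CC only captures linear relationships, can be illustrated with a minimal sketch. This is not the authors' improved method, which the abstract does not specify in detail. It compares the ordinary Pearson CC with a simple histogram-based mutual information estimate on synthetic data. The function names, the bin count, and the synthetic variables are all assumptions made for illustration.

```python
import numpy as np


def pearson_cc(x, y):
    """Standard Pearson correlation coefficient of two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def mutual_information(x, y, bins=16):
    """Histogram-based mutual information estimate in nats.

    Illustrative stand-in for an information-theoretic measure;
    the bin count is an arbitrary assumption.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x (column vector)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y (row vector)
    mask = pxy > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))


# Synthetic data (hypothetical, not the pipeline data from the paper).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=5000)
noise = rng.normal(0.0, 0.05, size=5000)
y_linear = 2.0 * x + noise          # linear dependence on x
y_quadratic = x**2 + noise          # non-linear dependence on x
y_independent = rng.uniform(-1.0, 1.0, size=5000)

r_lin = pearson_cc(x, y_linear)
r_quad = pearson_cc(x, y_quadratic)
mi_quad = mutual_information(x, y_quadratic)
mi_ind = mutual_information(x, y_independent)

print(f"Pearson CC, linear:      {r_lin:.3f}")
print(f"Pearson CC, quadratic:   {r_quad:.3f}")
print(f"MI estimate, quadratic:  {mi_quad:.3f}")
print(f"MI estimate, independent:{mi_ind:.3f}")
```

In this sketch the Pearson CC is close to 1 for the linear variable but close to 0 for the quadratic one, even though the quadratic variable depends strongly on `x`. The mutual information estimate separates the quadratic variable from the independent one, which is the kind of non-linear dependence an information-theoretic measure can register.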