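Computer science
Data mining
Visualization
Outlier
Data set
Exploratory data analysis
Set (abstract data type)
Data visualization
Raw data
Data quality
Data analysis
Benchmark (surveying)
Process (computing)
Artificial intelligence
Engineering
Programming language
Geodesy
Geography
Metric (unit)
Operating system
Operations management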
Authors
Priyadarshini Mahalingam, D. Kalpana, S. Sendhilkumar, T. Thyagarajan
Identifier
DOI:10.1177/09596518221117326
Abstract
This article illustrates the importance of exploratory data analysis as a key step in data validation and in subsequent model development. Descriptive statistics are obtained by applying exploratory data analysis to a data set derived from the Simulink library of a benchmark industrial actuator process provided by the Development and Application of Methods for Actuator Diagnosis in Industrial Control Systems (DAMADICS) research group. In this work, the data set is synthetically generated by simulating the actuator model under different fault conditions. Exploratory data analysis is performed as an initial investigation of the data set to reveal anomalies and patterns, which inform suitable assumptions and hypothesis tests for developing more accurate models. The raw data are visualised using different visualisation techniques to find patterns and behaviours. The data distribution, class distribution, feature correlation and presence of outliers are revealed through data visualisation. Data processing is then performed to transform the data and to treat outliers and missing values. The treated data set is visually confirmed using appropriate visualisation techniques. The inferences drawn from the visualisations are validated quantitatively with statistical results. Data profiling is carried out by collecting metadata on the DAMADICS data set; it helps enhance the quality and content of the data set so that accurate predictive models of actuator performance can be obtained.
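The workflow named in the abstract (class distribution, feature correlation, outlier and missing-value treatment) can be illustrated with a minimal sketch. The abstract does not say which outlier rule or imputation method the authors used, so the Tukey IQR fences and median replacement below are assumptions chosen for illustration. The signal names, fault labels and values are hypothetical toy data, not taken from the actuator data set.

```python
import math
import statistics
from collections import Counter


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) Tukey fences for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def treat_column(column, k=1.5):
    """Replace missing values (None) and IQR outliers with the inlier median.

    This is an assumed treatment, not the method reported by the authors.
    """
    present = [v for v in column if v is not None]
    lower, upper = iqr_bounds(present, k)
    inliers = [v for v in present if lower <= v <= upper]
    fill = statistics.median(inliers)
    return [v if v is not None and lower <= v <= upper else fill
            for v in column]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical toy data: one signal with a missing value and a spike,
# a second signal, and one fault label per sample.
flow = [1.0, 1.1, 0.9, None, 1.2, 9.5, 1.0, 1.05]
pressure = [2.0, 2.2, 1.8, 2.1, 2.4, 2.1, 2.0, 2.1]
labels = ["normal", "normal", "normal", "fault_a",
          "normal", "fault_b", "normal", "normal"]

class_counts = Counter(labels)         # class distribution
cleaned_flow = treat_column(flow)      # outlier and missing-value treatment
r = pearson(cleaned_flow, pressure)    # feature correlation after treatment

print(class_counts)
print(cleaned_flow)
print(round(r, 3))
```

Recomputing the summary statistics on the treated column is the quantitative counterpart of the visual confirmation step the abstract mentions.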