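Autoencoder
Anomaly detection
Computer science
Deep learning
Data mining
Data quality
Anomaly (physics)
Artificial intelligence
Baseline (sea)
Supervised learning
Scale (ratio)
Machine learning
Artificial neural network
Geology
Engineering
Physics
Condensed matter physics
Metric (unit)
Operations management
Oceanography
Quantum mechanics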
Authors
Jiun‐Ting Lin,A. C. Aguiar,Qingkai Kong,Amanda Price,Stephen C. Myers
Abstract
Seismic data quality assessment (QA) is the first, and one of the most important, steps before any further data analysis. Traditional methods check various metrics, such as spike detection and power spectral density, by setting strict thresholds or by comparing data against synthetic benchmarks. However, these approaches often rely on pre-existing knowledge and assumptions about data anomalies, which can lead to misclassification of unusual cases. Here, we propose a deep autoencoder model, an unsupervised learning approach that evaluates data quality without making assumptions about normal and anomalous data, and that can be used to identify deviations in recorded data that may indicate nascent instrument failure. We test the model with the U.S. International Monitoring System (IMS) seismic stations and demonstrate that it can detect anomalies on a monthly scale. This could prompt station operators to examine potential problems early, allowing sufficient time for instrument maintenance to prevent data outages. In addition, we use a new manually selected testing dataset to compare our model's performance against two supervised machine learning (ML) approaches and a standard QA package as baseline models. When applied to the dataset containing known data anomalies, the supervised and unsupervised ML approaches perform similarly, with an accuracy of 88.1% for our model compared to ~90% for the supervised ML approach and 78.2% for the standard QA package. Our model outperforms the baseline models when applied to new stations, where new types of data anomalies can be station-specific and not included in the training dataset. Finally, we show model transferability by training the model with data from the Global Seismograph Network only and applying it to the IMS network data. The results suggest that our model is generalizable and can be applied to new stations with good accuracy.
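The general idea of autoencoder-based anomaly detection can be illustrated with a minimal sketch. This is not the authors' model: it swaps the deep autoencoder for a linear autoencoder fitted in closed form via SVD (equivalent to PCA), uses synthetic feature vectors in place of seismic data, and assumes that anomalies are flagged by thresholding the reconstruction error at the 99th percentile of the training error. The function names, feature dimensions, and threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_linear_autoencoder(X, n_latent):
    """Fit a linear autoencoder in closed form (equivalent to PCA)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions; the top n_latent rows serve as
    # tied encoder/decoder weights.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_latent]

def reconstruction_error(X, mean, components):
    """Mean squared reconstruction error per sample."""
    latent = (X - mean) @ components.T   # encode to the latent space
    recon = latent @ components + mean   # decode back to feature space
    return np.mean((X - recon) ** 2, axis=1)

rng = np.random.default_rng(0)

# Hypothetical "healthy" data: 500 samples of 16 features near a 3-D subspace.
basis = rng.normal(size=(3, 16))
train = rng.normal(size=(500, 3)) @ basis + 0.05 * rng.normal(size=(500, 16))

mean, components = fit_linear_autoencoder(train, n_latent=3)

# Assumed decision rule: flag anything above the 99th percentile of training error.
threshold = np.percentile(reconstruction_error(train, mean, components), 99)

# New samples: some follow the learned structure, some do not.
healthy = rng.normal(size=(5, 3)) @ basis + 0.05 * rng.normal(size=(5, 16))
faulty = 3.0 * rng.normal(size=(5, 16))

flags_healthy = reconstruction_error(healthy, mean, components) > threshold
flags_faulty = reconstruction_error(faulty, mean, components) > threshold
print("healthy flagged:", flags_healthy)
print("faulty flagged: ", flags_faulty)
```

Because the model is fitted only on data assumed to be mostly healthy, no labeled anomalies are needed; samples that do not follow the learned structure reconstruct poorly and stand out by their error.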