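Outlier
Computer science
Data mining
Principal component analysis
Wireless sensor network
Robust principal component analysis
Missing data
Population
Artificial intelligence
Tensor (intrinsic definition)
Mathematics
Pattern recognition (psychology)
Machine learning
Computer network
Demography
Sociology
Pure mathematics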
Authors
G Rajesh, Ashvini Chaturvedi
Source
Journal: IEEE Transactions on Signal and Information Processing over Networks
Date: 2021-01-01
Volume and pages: 7: 539-550
Citations: 10
Identifier
DOI: 10.1109/tsipn.2021.3105795
Abstract
The major challenges during data acquisition in an environmental wireless sensor network (EWSN) architecture are outliers and missing data. Outliers are ubiquitous in the data acquired by an EWSN because of sensor failures, aging effects, power depletion, external noise, etc. Missing data at the sink node is caused by communication failures, sensor node malfunction, inadequate sampling frequency, the switching of sensor nodes into sleep mode, etc. Robust tensor principal component analysis (RTPCA) decomposes a noisy data tensor into a low-rank tensor and a sparse tensor, which can be exploited in the data recovery process of multi-attribute WSNs: the low-rank component represents the intrinsic data tensor and the sparse component represents the gross outlier tensor. In this paper, a novel probabilistic outlier modeling scheme using the multivariate Chebyshev's inequality hypothesis is proposed, which maps the sample population and the associated magnitudes of outliers to the spatio-temporal correlations inherently present in the acquired heterogeneous sensory data. In the EWSN scenario, the inherent spatio-temporal and multi-attribute correlations in the acquired environmental data are established using singular value methods. The RTPCA method uses a tensor nuclear norm (TNN), which can extract more temporal and multi-attribute correlations in the sensory data through block circulant matricization, as the convex surrogate of tensor rank minimization in the robust recovery of the intrinsic tensor data from outlier-corrupted, incomplete tensor data. Two real-world outdoor environmental datasets are used for the performance evaluation of the proposed scheme. Analytical experiments show that the reconstruction accuracy of RTPCA is close to 90% even for outlier corruptions of the largest magnitude bounds in the forest surveillance data at a high missing ratio of 0.95. The reconstruction of oceanographic data, whose variance is much lower, maintains almost similar performance for various outlier contaminations, and RTPCA performs much better than other tensor and matrix completion methods in terms of achieved reconstruction accuracy.
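
The tensor nuclear norm and block circulant matricization named in the abstract can be illustrated with a small numerical sketch. It assumes the common t-SVD convention, in which the TNN of a third-order tensor is the sum of the nuclear norms of its frontal slices in the Fourier domain, divided by the number of slices; the paper's exact normalization may differ. The function names `bcirc`, `tnn_fft`, and `tnn_bcirc` are illustrative and not taken from the paper.

```python
import numpy as np

def bcirc(A):
    """Block circulant matricization of an (n1, n2, n3) tensor.

    Block (i, j) of the result is the frontal slice A[:, :, (i - j) mod n3].
    """
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3))
    for i in range(n3):
        for j in range(n3):
            M[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return M

def tnn_fft(A):
    """TNN via the FFT along the third mode (assumed 1/n3 normalization)."""
    n3 = A.shape[2]
    A_hat = np.fft.fft(A, axis=2)
    return sum(np.linalg.norm(A_hat[:, :, k], ord="nuc") for k in range(n3)) / n3

def tnn_bcirc(A):
    """Same quantity computed from the block circulant matrix directly."""
    return np.linalg.norm(bcirc(A), ord="nuc") / A.shape[2]

# Hypothetical toy tensor, e.g. (nodes, time slots, attributes).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))
print(np.isclose(tnn_fft(A), tnn_bcirc(A)))
```

The two computations agree because the discrete Fourier transform block-diagonalizes a block circulant matrix, which is why the FFT route is usually preferred: it works on n3 small slices instead of one large matrix.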