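Multivariate statistics
Data mining
Artificial neural network
Computer science
Sample (material)
Population
Water quality
Sample size determination
Statistics
Machine learning
Artificial intelligence
Mathematics
Ecology
Sociology
Demography
Chemistry
Biology
Chromatography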
Authors
Ali El Bilali, Houda Lamane, Abdeslam Taleb, Ayoub Nafii
Identifier
DOI:10.1016/j.jclepro.2022.133227
Abstract
Deep Neural Networks (DNNs) are a powerful tool for predicting and monitoring water quality. However, their application is limited to well-monitored zones where enough data are available for the training and validation phases. In this study, we develop a novel framework based on a Virtual Sample Generation (VSG) method that uses multivariate distributions (MVD), specifically elliptical copulas, to broaden the application of DNNs to water quality prediction even with a small dataset. The framework is evaluated by predicting the Entropy Weighted Water Quality Index (EWQI) with a DNN that takes Electrical Conductivity, Temperature, and pH as input variables, in the Berrechid and Chaouia aquifer systems, Morocco. Validation results showed that the virtual samples generated from 400, 50, 30, and 20 original samples improved the Nash-Sutcliffe efficiency (NSE) from 0.88 to 0.92, from 0.53 to 0.91, from 0.42 to 0.91, and from 0.24 to 0.87, respectively. In addition, a sensitivity analysis of the methodology to the virtual data size and the number of original samples showed that the root mean square error (RMSE) and NSE of the DNN models approach limits as the virtual data size increases, following first-order exponential decay and logistic trends, respectively. These limits depend strongly on the original sample size. Such empirical trends are crucial for reproducing the proposed methodology at other sites and determining the optimal virtual dataset size. Overall, the proposed methodology provides new insights into improving DNN performance in predicting water quality with small datasets. Hence, it is useful for managing water quality in order to supply clean water to the population in poorly monitored zones.
• Data availability is one of the limitations in applying the DNN approach.
• Existing Virtual Sample Generation methods are not capable of generating appropriate combinations of water quality parameters.
• Copulas are valuable for generating virtual datasets for training DNNs.
• A small dataset of 20 original samples is sufficient to generate virtual data for training a DNN.
• Validation was performed with sufficient observed data: 300 groundwater samples.
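To make the idea of copula-based virtual sample generation concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say which elliptical copula, which marginal models, or which DNN setup were used. The sketch assumes a Gaussian copula (one member of the elliptical family) with empirical marginals. All function names (`generate_virtual_samples`, `nse`, `rmse`) are hypothetical, and the toy rows are synthetic stand-ins for Electrical Conductivity, Temperature, and pH, not data from the study.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def _normal_scores(column):
    """Map a column to normal scores via its ranks (pseudo-observations)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = _STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def _correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def _cholesky(matrix):
    """Lower-triangular Cholesky factor; assumes a positive-definite matrix."""
    d = len(matrix)
    lower = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def _empirical_quantile(sorted_values, u):
    """Linearly interpolated quantile of the observed values at level u."""
    pos = u * (len(sorted_values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def generate_virtual_samples(samples, n_virtual, seed=0):
    """Draw n_virtual rows from a Gaussian copula fitted to `samples` (list of rows)."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*samples)]
    d = len(columns)
    scores = [_normal_scores(col) for col in columns]
    corr = [[1.0 if i == j else _correlation(scores[i], scores[j])
             for j in range(d)] for i in range(d)]
    lower = _cholesky(corr)
    sorted_cols = [sorted(col) for col in columns]
    virtual = []
    for _ in range(n_virtual):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Impose the fitted dependence structure on independent normals.
        correlated = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        u = [_STD_NORMAL.cdf(v) for v in correlated]
        virtual.append([_empirical_quantile(sorted_cols[i], u[i]) for i in range(d)])
    return virtual


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


# Synthetic toy rows (NOT measurements from the study): EC, temperature, pH stand-ins.
toy_rng = random.Random(42)
original = []
for _ in range(20):
    ec = toy_rng.uniform(500.0, 3000.0)
    original.append([ec, toy_rng.uniform(18.0, 26.0), 7.5 - 0.0002 * ec + toy_rng.gauss(0.0, 0.1)])
virtual = generate_virtual_samples(original, n_virtual=200, seed=1)
print(len(virtual), "virtual rows generated from", len(original), "original rows")
```

Because the marginals here are empirical, every virtual value stays within the observed range of its variable. A parametric marginal or a different elliptical copula (for example a Student t copula) would behave differently in the tails. In a workflow like the one the abstract describes, the virtual rows would feed model training, and `nse` and `rmse` would be computed against held-out observed samples.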