Resampling
Great flood
Estimator
Entropy (arrow of time)
Flooding (psychology)
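Data mining
k-nearest neighbors algorithm
Random forest
Sampling (signal processing)
Artificial intelligence
Computer science
Machine learning
Statistics
Mathematics
Geography
Archaeology
Psychology
Physics
Filter (signal processing)
Quantum mechanics
Computer vision
Psychotherapist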
Authors
Hongfa Wang, Yu Meng, Hongshi Xu, Huiliang Wang, Xinjian Guan, Yuan Liu, Meng Liu, Zening Wu
Identifier
DOI:10.1016/j.jhydrol.2024.130742
Abstract
As urban flood prevention and rational urban development receive growing emphasis, more data on urban flooding are being collected, but the class sizes in these data are often imbalanced, a phenomenon that is also widespread in other real-world fields. Imbalanced datasets compromise the performance of classification models, so minority-class samples, which correspond to higher-risk floods, often go unalerted or trigger incorrect warnings. To address this problem, this study proposes a novel hybrid resampling method, which proved effective for balancing data. First, the imbalanced dataset is optimized with the Borderline-SMOTE algorithm. Next, candidate datasets are generated through under-sampling techniques, and their quality is evaluated using information entropy, computed with the k-nearest neighbor entropy estimator. The proposed method not only makes full use of the information in the original data but also avoids the under-fitting that results from relying on under-sampling alone. In a practical application to the central area of Zhengzhou, China, the resampling method was combined with a Random Forest classification model optimized by a Genetic Algorithm. The results are significantly better than those obtained without any resampling treatment: all assessment indicators (Accuracy, Recall, G-mean and F1-score) improved.
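To make the entropy-scoring step more concrete, the sketch below estimates the differential entropy of a point set from k-nearest-neighbor distances. It assumes the common Kozachenko-Leonenko form of the estimator with Euclidean distance. The abstract does not say which variant, which value of k, or which distance metric the authors use, so those choices, along with the names `knn_entropy` and `digamma_int`, are illustrative assumptions rather than the paper's implementation.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def digamma_int(n):
    """Digamma function at a positive integer: psi(n) = -gamma + sum_{i=1}^{n-1} 1/i."""
    return -EULER_GAMMA + sum(1.0 / i for i in range(1, n))


def knn_entropy(points, k=3):
    """Kozachenko-Leonenko k-NN estimate of differential entropy, in nats.

    H ~= psi(n) - psi(k) + log(c_d) + (d / n) * sum_i log(r_i)
    where r_i is the Euclidean distance from point i to its k-th nearest
    neighbor and c_d is the volume of the d-dimensional unit ball.
    Assumption: k and the distance metric are illustrative choices.
    """
    n = len(points)
    d = len(points[0])
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    # log of the unit-ball volume: pi^(d/2) / Gamma(d/2 + 1)
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)

    log_dist_sum = 0.0
    for i, p in enumerate(points):
        # Brute-force neighbor search; fine for a sketch, O(n^2 log n) overall.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        # Floor the distance so duplicate points do not produce log(0)
        # (an assumption made here for robustness).
        log_dist_sum += math.log(max(dists[k - 1], 1e-12))

    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_dist_sum / n


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical data: 300 points drawn uniformly from the unit square,
    # whose true differential entropy is 0 nats.
    sample = [(random.random(), random.random()) for _ in range(300)]
    print("estimated entropy (nats):", knn_entropy(sample, k=3))
```

In a pipeline like the one the abstract describes, a function of this kind could be applied to each candidate under-sampled dataset to score it. How those scores are then used to select a dataset is not stated in the abstract and is left open here.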