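Computer science
Outlier
Artificial intelligence
Feature selection
Support vector machine
Machine learning
Feature (linguistics)
Predictive modelling
Data mining
Heteroscedasticity
Feature engineering
Principal component analysis
Artificial neural network
Skewness
Deep learning
Statistics
Philosophy
Linguistics
Mathematics
Authors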
Lubna Obaid, Khaled Hamad, Mohamad Ali Khalil, Ali Bou Nassif
Identifier
DOI:10.1016/j.engappai.2024.107845
Abstract
Developing a high-performing traffic incident-duration prediction model is a key component of evaluating the impact of incidents on the roadway network. Various studies have developed robust incident-duration prediction models, yet they have struggled to produce accurate predictions because of numerous data modeling issues, such as complex correlations, highly skewed data distributions, heteroscedasticity, and outliers. This paper investigates the impact of feature optimization (FO), a relatively new term encompassing two established topics, feature engineering (FE) and feature selection (FS), on the performance of several machine learning (ML) models developed for predicting incident durations. The models include multivariate linear regression, decision trees, support vector regressors, K-Nearest Neighbors regression, ensembles, and artificial neural networks. For each model, various FO techniques were applied to a massive traffic incident dataset and the prediction process was repeated. Our results show that the proposed filter, wrapper, and embedded FS techniques can reduce the number of features without sacrificing prediction performance. FE techniques can improve the accuracy of incident-duration estimation across the assessed ML models: log-normal transformation to handle skewness in continuous data, min-max normalization to handle data variability, and principal component analysis (PCA) to reform the dataset into a smaller subset of independent features. The best-performing FE technique was PCA, which improved performance across all developed ML models. The best-performing FS technique was Recursive Feature Elimination, which outperformed the other tested techniques in reducing model complexity while maintaining model accuracy.
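As a rough illustration of two of the FE steps named in the abstract, the sketch below applies a log transform and min-max normalization to a small list of durations. It is not the authors' implementation: it assumes the "log-normal transformation" can be read as a natural-log transform of a continuous variable, the function names are hypothetical, and the duration values are made up for illustration rather than taken from the paper's dataset.

```python
import math
import statistics


def skewness(values):
    """Population skewness: mean of cubed standardized deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def log_transform(values):
    """Natural-log transform; log1p keeps a zero duration well defined."""
    return [math.log1p(v) for v in values]


def min_max_normalize(values):
    """Rescale values linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical incident durations in minutes (illustrative only,
# not data from the paper). A few long incidents create a right tail.
durations = [5, 8, 10, 12, 15, 18, 20, 25, 30, 45, 60, 90, 240]

logged = log_transform(durations)
scaled = min_max_normalize(logged)

print(f"skewness before log transform: {skewness(durations):.2f}")
print(f"skewness after log transform:  {skewness(logged):.2f}")
print(f"scaled range: {min(scaled):.1f} to {max(scaled):.1f}")
```

On right-skewed data like this, the log transform compresses the long tail, and min-max normalization then puts the feature on a common [0, 1] scale before it is passed to a model.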