Estimation of building project completion duration using a natural gradient boosting ensemble model and legal and institutional variables

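Boosting (machine learning); Gradient boosting; Metro train timetable; Duration (music); Regression analysis; Regression; Computer science; Econometrics; Statistics; Operations research; Mathematics; Artificial intelligence; Random forest; Art; Literature; Operating system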
Authors
Farshad Peiman, Mohammad Khalilzadeh, Nasser Shahsavari-Pour, Mehdi Ravanshadnia
Source
Journal: Engineering, Construction and Architectural Management [Emerald Publishing Limited]
Citations: 4
Identifier
DOI: 10.1108/ecam-12-2022-1170
Abstract

Purpose
Earned value management (EVM)-based models for estimating a project's actual duration (AD) and cost at completion are continuously developed to improve the accuracy and realism of the predicted values. This study primarily aimed to examine natural gradient boosting (NGBoost-2020) with classification and regression trees (CART) as the base model (base learner). To the best of the authors' knowledge, this approach has never been applied to the EVM AD forecasting problem. The authors therefore compared it with three alternatives: the single K-nearest neighbor (KNN) method, the extreme gradient boosting ensemble (XGBoost-2016) with the CART base model, and the optimal EVM equation, namely the earned schedule (ES) equation with a performance factor equal to 1 (ES1). The paper also sought to determine the extent to which the World Bank's two legal factors affect countries and how the two legal causes of delay (related to institutional flaws) influence AD prediction models.

Design/methodology/approach
Data from 30 construction projects of various building types in Iran, Pakistan, India, Turkey, Malaysia and Nigeria (chosen for the high number of delayed projects and the detrimental effects of these delays in these countries) were used to develop three models. The target variable of the models was a dimensionless output: the ratio of estimated duration to completion (ETC(t)) to planned duration (PD). In total, 426 tracking periods were used to build the three models, with 353 samples from 23 projects in the training set and 73 samples (17% of the total) from six projects (21% of the total) in the testing set. Seventeen dimensionless input variables were used, including ten based on the main variables and performance indices of EVM and several others detailed in the study. The three models were created using Python and several GitHub-hosted codes.
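The ES1 baseline and the dimensionless target ETC(t)/PD can be illustrated with a minimal sketch. It assumes the standard earned schedule definition (linear interpolation on the cumulative planned value curve) and the usual forecast EAC(t) = AT + (PD - ES) / PF with PF = 1; the abstract does not spell these formulas out, so the function names, the equal-length periods and the sample figures below are hypothetical.

```python
def earned_schedule(pv, ev):
    """Earned schedule (ES): the time at which the cumulative planned
    value curve reaches the current earned value (linear interpolation).

    pv: cumulative planned value at the end of periods 1..PD (non-decreasing)
    ev: cumulative earned value at the current tracking period
    """
    curve = [0.0] + list(pv)              # planned value at times 0..PD
    if ev >= curve[-1]:
        return float(len(pv))             # all planned work has been earned
    # Largest whole period c whose planned value does not exceed EV.
    c = max(t for t, value in enumerate(curve) if value <= ev)
    return c + (ev - curve[c]) / (curve[c + 1] - curve[c])


def es1_forecast(pv, ev, actual_time):
    """ES forecast with performance factor 1 (assumed form of ES1).

    Returns (EAC(t), ETC(t) / PD), taking ETC(t) = EAC(t) - AT.
    """
    planned_duration = len(pv)            # PD in tracking periods
    es = earned_schedule(pv, ev)
    eac_t = actual_time + (planned_duration - es) / 1.0   # PF = 1
    etc_t = eac_t - actual_time
    return eac_t, etc_t / planned_duration


# Hypothetical four-period project, tracked at the end of period 3.
pv = [10.0, 30.0, 60.0, 100.0]
eac_t, target = es1_forecast(pv, ev=45.0, actual_time=3)
print(eac_t, target)
```

In this made-up example the earned value of 45 corresponds to ES = 2.5 periods, so the project is half a period behind schedule at AT = 3 and the ratio ETC(t)/PD is 0.375.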
Findings
For the testing set of the optimal model (NGBoost), the mean better percentage (better%) of the prediction error (based on projects with a lower error percentage) relative to the two single models, KNN and ES1, was 100 and 83.33%, respectively, and the total mean absolute percentage error (MAPE) and mean lags (MeLa, indicating model stability) were 5.62 and 3.17%, respectively. Notably, for the NGBoost model with ten EVM-based input variables, the total MAPE and MeLa on the testing set were 6.74 and 5.20%, respectively. The ensemble artificial intelligence (AI) models exhibited a much lower MAPE than ES1, and ES1 was less stable in prediction than NGBoost. Excessive and unusual MAPE and MeLa values occurred only in the two single models. However, on some data sets, ES1 outperformed the AI models. NGBoost also outperformed the other models, especially the single models, for most developing countries, and was more accurate than previously presented optimized models.

In addition, sensitivity analysis was conducted on the NGBoost predicted outputs of the 30 projects using the SHapley Additive exPlanations (SHAP) method. All variables demonstrated an effect on ETC(t)/PD. The most influential input variables, in order of importance, were actual time (AT) to PD, regulatory quality (RQ), earned duration (ED) to PD, schedule cost index (SCI), planned complete percentage, rule of law (RL), actual complete percentage (ACP) and ETC(t) of the ES optimal equation to PD. The probabilistic hybrid model was selected based on the outputs predicted by the NGBoost and XGBoost models and the MAPE values from the three AI models. For the NGBoost–XGBoost model, 96.10 and 98.60% of the actual output values of the testing and training sets, respectively, fall within the 95% prediction interval.
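Two of the reported metrics, MAPE and the share of actual values inside a 95% prediction interval, can be sketched as follows. The sketch assumes a Gaussian predictive distribution described by a mean and a standard deviation per sample; the abstract does not state how the interval of the hybrid NGBoost–XGBoost model was built, and MeLa is omitted because its definition is not given here. All names and numbers are hypothetical.

```python
from statistics import NormalDist


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def interval_coverage(actual, means, stds, level=0.95):
    """Percentage of actual values inside the central prediction interval
    of a Gaussian predictive distribution (an assumption of this sketch)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for 95%
    inside = sum(
        1 for a, m, s in zip(actual, means, stds)
        if m - z * s <= a <= m + z * s
    )
    return 100.0 * inside / len(actual)


# Hypothetical ETC(t)/PD values for four tracking periods.
actual = [0.50, 0.40, 0.20, 0.10]
means = [0.45, 0.44, 0.20, 0.12]    # predicted means
stds = [0.05, 0.01, 0.05, 0.05]     # predicted standard deviations

print(mape(actual, means))
print(interval_coverage(actual, means, stds))
```

With these made-up values the second sample falls outside its narrow interval, so the coverage is 75% while the MAPE is about 10%. A well-calibrated 95% interval should contain roughly 95% of the actual values, which is the property the reported 96.10 and 98.60% figures speak to.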
Research limitations/implications
Because the projects were carried out in different countries, it was not possible to distribute a questionnaire to the managers and stakeholders of the 30 projects in the six developing countries. Because few EVM-based projects are reported in the references, it was not feasible to include other types of projects. Future prospects include evaluating the accuracy and stability of NGBoost for timely and non-fluctuating projects (mostly in developed countries), considering a greater number of legal/institutional variables as inputs, using legal/institutional/internal/inflation inputs for complex projects with extremely high uncertainty (such as bridge and road construction) and integrating these inputs and NGBoost with new technologies (such as blockchain, radio frequency identification (RFID) systems, building information modeling (BIM) and the Internet of things (IoT)).

Practical implications
The legal/institutional recommendations made to governments are strict control of prices, adequate supervision, removal of additional rules, removal of unfair regulations, clarification of the future trend of a law change, strict monitoring of property rights, simplification of the processes for obtaining permits and elimination of unnecessary changes, particularly in developing countries and at the onset of irregular projects with limited information and numerous uncertainties. Furthermore, the managers and stakeholders of this group of projects were informed of the significance of the following: considering seven construction variables (institutional/legal external risks, internal factors and inflation) at an early stage; using time series (dynamic) models to predict AD; accurately calculating progress percentage variables; the effect of building type in non-residential projects; regularly updating inflation during implementation; the effect of employer type in the early stage of public projects as well as the late stage of private projects; and allocating a reserve duration (buffer) to respond to institutional/legal risks.

Originality/value
Ensemble methods were optimized in 70% of references. To the authors' knowledge, NGBoost, one of the ensemble methods, has not previously been used to estimate construction project duration and delays. NGBoost is an effective method for considering uncertainties in irregular projects, which are often implemented in developing countries. Furthermore, existing AD estimation models fail to incorporate RQ and RL from the World Bank's worldwide governance indicators (WGI) as risk-based inputs. In addition, the various WGI, EVM and inflation variables are not combined, as inputs, with institutional risks that cause substantial delays. Consequently, given the critical and complex risks present in different countries, it is vital to consider legal and institutional factors. This is especially recommended when an in-depth, accurate and reality-based method such as SHAP is used for analysis.