Trade-off between accuracy and fairness of data-driven building and indoor environment models: A comparative study of pre-processing methods

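Computer science; Sampling (signal processing); Machine learning; Apartment; Data mining; Predictive modelling; Artificial intelligence; Engineering; Computer vision; Filter (signal processing); Civil engineering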

Authors

Ying Sun, Fariborz Haghighat, Benjamin C. M. Fung

Source

Journal: Energy [Elsevier]
Date: 2022-01-15; Volume: 239, article no. 122273; Cited by: 4

Identifier

DOI: 10.1016/j.energy.2021.122273

Abstract

Data-driven models have drawn extensive attention in the building domain in recent years, and their predictive accuracy depends on the features and the data distribution. Variation in accuracy among users or periods is unfair to some users. This paper addresses a new research problem: fairness-aware prediction for data-driven building and indoor environment models. First, three types of fairness definitions are introduced for building engineering. Next, Type I and Type II fairness are investigated. To achieve Type I fairness, we study the effect of suppressing the protected attribute (i.e., an attribute whose value cannot be disclosed or discriminated against) from the inputs. To improve Type II fairness while preserving the predictive accuracy of data-driven building and indoor environment models, we propose three pre-processing methods for the training dataset: sequential sampling, reversed preferential sampling, and sequential preferential sampling. The proposed methods are compared with two existing pre-processing methods in a case study of lighting status prediction in an apartment building. Overall, 576 study cases were used to study the effect of these pre-processing methods on the accuracy and fairness of 12 series of lighting status prediction based on 2 types of feature combinations and 4 types of classifiers. The results show that suppressing the protected attribute only slightly affects overall predictive accuracy, whereas all pre-processing methods decrease it. In general, however, sequential sampling is a good option for improving Type II fairness with an acceptable decrease in accuracy. The fairness improvement achieved by the other pre-processing methods varies with the features and classifiers applied.

• Fairness concepts are introduced to the building and indoor environment domain for the first time.
• The trade-off between accuracy and fairness of data-driven building models is studied.
• Three pre-processing methods are proposed to process the training dataset.
• 576 study cases are designed and investigated.
• Four kinds of machine learning algorithms are utilized.
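The abstract names two operations that are easy to picture in code: suppressing a protected attribute from the inputs, and comparing accuracy across users or periods. The Python sketch below is only an illustration under stated assumptions. The abstract does not give formal definitions of Type I or Type II fairness, so the accuracy gap used here is a stand-in measure, not the paper's metric. The function names, the `apartment` attribute, and the toy data are all hypothetical.

```python
from collections import defaultdict


def suppress_attribute(records, protected):
    """Return copies of the feature dicts without the protected attribute."""
    return [{k: v for k, v in r.items() if k != protected} for r in records]


def accuracy_by_group(y_true, y_pred, groups):
    """Compute accuracy separately for each group label (e.g. user or period)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(per_group):
    """Largest minus smallest per-group accuracy (assumed stand-in measure)."""
    values = list(per_group.values())
    return max(values) - min(values)


# Hypothetical toy data: lighting status (1 = on, 0 = off) for two apartments.
records = [
    {"hour": 8, "occupied": 1, "apartment": "A"},
    {"hour": 13, "occupied": 0, "apartment": "A"},
    {"hour": 20, "occupied": 1, "apartment": "B"},
    {"hour": 23, "occupied": 1, "apartment": "B"},
]
y_true = [1, 0, 1, 0]
y_pred = [1, 0, 1, 1]  # placeholder predictions, not the output of a real model

# Inputs with the protected attribute removed; the originals stay untouched.
features = suppress_attribute(records, "apartment")

# The protected attribute is still used afterwards to audit the predictions.
groups = [r["apartment"] for r in records]
per_group = accuracy_by_group(y_true, y_pred, groups)
gap = accuracy_gap(per_group)
```

Keeping the group labels separate from the model inputs is the design point here: the attribute is withheld from the classifier but remains available for checking whether accuracy differs between groups.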
