Identifying financial statement fraud with decision rules obtained from Modified Random Forest

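Keywords: Commit; Random forest; Decision tree; Computer science; Classifier (UML); Originality; Profit (economics); Set (abstract data type); Artificial intelligence; Economic shortage; Financial statement; Machine learning; Statement (logic); Benchmark (surveying); Data mining; Business; Economics; Accounting; Psychology; Database; Geography; Philosophy; Linguistics; Government (linguistics); Microeconomics; Creativity; Political science; Geodesy; Law; Social psychology; Auditing; Programming language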

Authors

Byungdae An, Yongmoo Suh

Source

Journal: Data Technologies and Applications [Emerald Publishing Limited]
Date: 2020-05-04 Volume (issue): 54 (2): 235-255 Citations: 54

Identifiers

DOI: 10.1108/dta-11-2019-0208

Abstract

Purpose: Financial statement fraud (FSF) committed by a company suggests that the company may not be financially healthy. Detecting FSF is important because such companies tend to conceal unfavorable information, which causes substantial losses to various stakeholders. The objective of this paper is to propose a novel approach to building a classification model for identifying FSF that achieves high classification performance and yields human-readable rules explaining why a company is likely to commit FSF.

Design/methodology/approach: To cope with the class imbalance problem, we prepare multiple sub-datasets and build a set of decision trees for each one. For each sub-dataset, we form a model by removing from its set every tree whose accuracy is below the average accuracy of all trees in that set. We then select, among these models, the one with the best accuracy and call it MRF (Modified Random Forest). Given a new instance, we extract rules from the MRF model to explain whether the corresponding company is likely to commit FSF.

Findings: Experimental results show that the MRF classifier outperformed the benchmark models. The results also revealed that all profit-related variables are among the most important indicators of FSF, and identified two new variables related to gross profit that had not been considered in previous studies on FSF.

Originality/value: This study proposes a method of building a classification model that shows outstanding performance and provides decision rules that can be used to explain the classification results. It also suggests a new way to resolve the class imbalance problem.
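The selection and rule-extraction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything beyond the abstract is an assumption made for illustration: trees are hand-written nested dicts rather than learned models, accuracy is measured on a single shared validation set, the ensemble predicts by majority vote (ties go to non-fraud), and a rule is the root-to-leaf path of each tree that agrees with the ensemble. The feature names (gross_margin, debt_ratio), thresholds, and toy data are invented.

```python
from statistics import mean

# Hypothetical tree representation: internal nodes test one feature against a
# threshold; leaves carry a class label (1 = fraud, 0 = non-fraud).
def leaf(label):
    return {"label": label}

def node(feature, threshold, left, right):
    return {"feature": feature, "threshold": threshold, "left": left, "right": right}

def decision_path(tree, x):
    """Walk from root to leaf for instance x; return (conditions, label)."""
    conditions = []
    while "label" not in tree:
        f, t = tree["feature"], tree["threshold"]
        if x[f] <= t:
            conditions.append(f"{f} <= {t}")
            tree = tree["left"]
        else:
            conditions.append(f"{f} > {t}")
            tree = tree["right"]
    return conditions, tree["label"]

def predict(tree, x):
    return decision_path(tree, x)[1]

def accuracy(tree, X, y):
    return mean(1.0 if predict(tree, xi) == yi else 0.0 for xi, yi in zip(X, y))

def prune_below_average(trees, X, y):
    """Keep only trees whose accuracy is at least the average over the set."""
    scores = [accuracy(t, X, y) for t in trees]
    avg = mean(scores)
    return [t for t, s in zip(trees, scores) if s >= avg]

def vote(trees, x):
    """Majority vote (an assumption); a tie resolves to 0."""
    votes = [predict(t, x) for t in trees]
    return 1 if 2 * sum(votes) > len(votes) else 0

def ensemble_accuracy(trees, X, y):
    return mean(1.0 if vote(trees, xi) == yi else 0.0 for xi, yi in zip(X, y))

def select_mrf(tree_sets, X, y):
    """Prune each sub-dataset's tree set, then keep the most accurate model."""
    models = [prune_below_average(ts, X, y) for ts in tree_sets]
    return max(models, key=lambda m: ensemble_accuracy(m, X, y))

def explain(model, x):
    """Return the ensemble label and the path rules of the trees that agree."""
    label = vote(model, x)
    rules = []
    for t in model:
        conds, pred = decision_path(t, x)
        if pred == label:
            rules.append(" AND ".join(conds) + f" => {label}")
    return label, rules

# Toy validation data, invented for this sketch.
X_val = [
    {"gross_margin": 0.05, "debt_ratio": 0.9},
    {"gross_margin": 0.40, "debt_ratio": 0.3},
    {"gross_margin": 0.10, "debt_ratio": 0.8},
    {"gross_margin": 0.35, "debt_ratio": 0.2},
]
y_val = [1, 0, 1, 0]

# One tree set per sub-dataset; here the trees are written by hand.
good_a = node("gross_margin", 0.2, leaf(1), leaf(0))
good_b = node("debt_ratio", 0.5, leaf(0), leaf(1))
bad = node("gross_margin", 0.2, leaf(0), leaf(1))
weak = leaf(0)
tree_sets = [[good_a, good_b, bad], [weak, bad]]

mrf = select_mrf(tree_sets, X_val, y_val)
label, rules = explain(mrf, {"gross_margin": 0.08, "debt_ratio": 0.85})
print(label)
for rule in rules:
    print(rule)
```

In a real setting the trees would be learned from each sub-dataset, and the data used to score them would follow whatever protocol the paper specifies, which the abstract does not state.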
