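Overfitting
Computer science
Artificial intelligence
Data mining
Fuzzy logic
Machine learning
Neuro-fuzzy
Cluster analysis
Variance (accounting)
Fuzzy classification
Artificial neural network
Pattern recognition (psychology)
Fuzzy control system
Accounting
Business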
Authors
Rukshima Dabare, Kok Wai Wong, Mohd Fairuz Shiratuddin, Polychronis Koutsakis
Abstract
Deep learning (DL) has achieved superior classification performance in many applications owing to its ability to extract features from data. However, the success of DL comes with the risk of overfitting: a model biased towards the data it has seen during training generalises poorly. One way of addressing this issue is to have enough training data that the classifier is invariant to many data patterns. In the literature, data augmentation has been used as a regularisation method to reduce the chance that the model overfits. However, most of the relevant work focuses on image, sound or text data. There is little work on numerical data augmentation, although many real-world problems deal with numerical data. In this paper, we propose a technique based on Fuzzy C-Means clustering and fuzzy membership grades. Fuzzy-related techniques are used to address the variance problem by generating new data items based on fuzzy numbers and on each data item's degree of membership in the different fuzzy clusters. This data augmentation technique, which is suitable for numerical data, is used to improve the generalisation of a Deep Neural Network. By combining the proposed fuzzy data augmentation technique with the Dropout regularisation technique, we manage to balance the classification model's bias-variance tradeoff. The proposed technique is evaluated on four popular data sets and is shown to provide better regularisation and higher classification accuracy than popular regularisation approaches.
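The abstract does not give the generation rule, so the following is only a minimal sketch of how augmentation driven by Fuzzy C-Means memberships might look for numerical data. The clustering step is standard Fuzzy C-Means. The generation step (nudging each item a small random distance towards its membership-weighted cluster centre) is an assumption made for illustration and is not the authors' method. The names `fuzzy_c_means`, `fuzzy_augment` and the `spread` parameter are hypothetical.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard Fuzzy C-Means.

    Returns (centers, U) where U[i, k] is the membership grade of
    sample i in cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised per sample.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m  # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance of every sample to every centre, shape (n, c).
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def fuzzy_augment(X, y, n_clusters=3, spread=0.1, seed=0):
    """Return the original data plus one synthetic item per original item.

    ASSUMPTION (illustrative rule, not taken from the paper): each new
    item is the original item moved by a random fraction in [0, spread]
    towards its membership-weighted mixture of cluster centres. Labels
    are copied unchanged.
    """
    rng = np.random.default_rng(seed)
    centers, U = fuzzy_c_means(X, n_clusters, seed=seed)
    target = U @ centers  # membership-weighted centre per sample, (n, d)
    alpha = rng.uniform(0.0, spread, size=(X.shape[0], 1))
    X_new = X + alpha * (target - X)
    return np.vstack([X, X_new]), np.concatenate([y, y])
```

Under these assumptions, the augmented set returned by `fuzzy_augment` could then be passed to a Dropout-regularised network in place of the original training set. A small `spread` keeps synthetic items close to the originals so that copying labels remains plausible.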