Authors
Arwa M. Eldhai, Mosab Hamdan, Ahmed Abdelaziz, Mohamed Hashem, Sharief F. Babiker, Muhammad Nadzir Marsono, Muzaffar Hamzah, N. Z. Jhanjhi
Abstract
Traffic classification (TC) in software-defined networks (SDN) based on machine learning (ML) is a viable option for improving network management. TC assists SDN, and SDN in turn facilitates the feature selection (FS) process, especially when ML is used as the classification mechanism to extract measurements and related information from the data arriving at the SDN controller. Despite these advantages, support for TC and FS tasks remains inadequate because traffic profiles are often very similar, making classification difficult. Moreover, stream learning (SL) applied to TC poses many challenges. Robust statistical flow features are therefore needed to reduce the overhead on the SDN control plane; such features should support online feature extraction, handle concept drift, and process an infinite data stream with finite resources (time and memory). This paper aims to improve the overall performance of SL-based TC by selecting the relevant flow features and thereby relieving the SDN control plane, as follows. First, an FS mechanism based on Boruta is proposed. Second, we propose streaming-based traffic classification methods for SDN using Hoeffding adaptive trees (HAT), adaptive random forest (ARF), and k-nearest neighbours with an adaptive sliding window drift detector (KNN-ADWIN). These techniques dynamically handle concept drift and limit memory and time consumption, reducing the SDN controller's overhead. Third, real and synthetic traffic traces are used to assess the performance of the proposed FS and stream TC methods. According to the simulation results, the Boruta FS technique achieves up to 95% average accuracy and up to 87% average per-application precision, recall, and F-score, outperforming other works in the literature. Furthermore, the SL techniques retain up to 85% average accuracy, 78% kappa, and average precision, recall, and F-score between 62% and 88%. In addition, HAT has lower time and memory consumption, down to 15 s and 105 KB, compared with ARF and KNN-ADWIN.
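The abstract does not specify an implementation, but a minimal sketch of Boruta-style feature selection over statistical flow features might look as follows, assuming the Python `boruta` package (BorutaPy) with a random-forest base estimator; the file name `flow_features.csv` and the `app` label column are hypothetical placeholders, not artifacts of the paper.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from boruta import BorutaPy

# Hypothetical flow-feature table: one row per flow, statistical flow
# features as columns, and an "app" column holding the application label.
flows = pd.read_csv("flow_features.csv")
X = flows.drop(columns=["app"]).values   # BorutaPy expects NumPy arrays
y = flows["app"].values

# A random forest serves as the base estimator for Boruta's shadow-feature test.
rf = RandomForestClassifier(n_jobs=-1, max_depth=5, class_weight="balanced")
selector = BorutaPy(rf, n_estimators="auto", random_state=42)
selector.fit(X, y)

# Columns confirmed as relevant by Boruta
selected = flows.drop(columns=["app"]).columns[selector.support_]
print("Relevant flow features:", list(selected))
```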
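Similarly, a hedged sketch of prequential (test-then-train) evaluation for the three stream learners named in the abstract could be built on scikit-multiflow, which ships HAT, ARF, and KNN-ADWIN implementations; the input file, column layout, and evaluation parameters below are illustrative assumptions rather than the paper's actual setup.

```python
import pandas as pd
from skmultiflow.data import DataStream
from skmultiflow.trees import HoeffdingAdaptiveTreeClassifier
from skmultiflow.meta import AdaptiveRandomForestClassifier
from skmultiflow.lazy import KNNADWINClassifier
from skmultiflow.evaluation import EvaluatePrequential

# Hypothetical table of Boruta-selected flow features; the last column
# is assumed to hold an integer-encoded application label.
flows = pd.read_csv("selected_flow_features.csv")
stream = DataStream(flows, target_idx=-1)

models = [HoeffdingAdaptiveTreeClassifier(),   # HAT
          AdaptiveRandomForestClassifier(),    # ARF
          KNNADWINClassifier(n_neighbors=5)]   # KNN with ADWIN drift detection

# Prequential evaluation reporting accuracy, kappa, running time, and
# model size, mirroring the metrics summarised in the abstract.
evaluator = EvaluatePrequential(pretrain_size=200,
                                max_samples=50000,
                                metrics=['accuracy', 'kappa',
                                         'running_time', 'model_size'])
evaluator.evaluate(stream=stream, model=models,
                   model_names=['HAT', 'ARF', 'KNN-ADWIN'])
```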