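Encryption
Traffic classification
Computer science
Payload (computing)
Robustness (evolution)
Data mining
Traffic generation model
Deep packet inspection
Artificial intelligence
Machine learning
Network packet
Computer network
Biochemistry
Gene
Chemistry
Authors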
Quanbo Pan, Yang Yu, Hanbing Yan, Maoli Wang, Bingzhi Qi
Identifier
DOI:10.1109/trustcom60117.2023.00039
Abstract
With the widespread adoption of network traffic encryption, traffic identification has become increasingly critical. As the number of encryption protocols continues to grow, identifying encrypted traffic from limited training samples has become more challenging. In recent years, pre-trained models have been applied extensively in natural language processing because they make effective use of large amounts of unlabeled data. When applied to encrypted traffic identification, however, these methods extract too little information from encrypted network traffic, losing some essential features and degrading recognition performance. We therefore propose FlowBERT, a Transformer-based encrypted traffic classification model. FlowBERT learns the semantic features of traffic along two dimensions, payload and packet length sequence, from large-scale unlabeled encrypted traffic. Length sequences are encoded so that traffic sequence features can be extracted efficiently, enabling the model to learn the contextual semantic relationships within the sequences. In addition, the pre-training process is improved by balancing the data samples, which enhances the performance of the pre-trained model. We validated the method on both classic encrypted traffic classification datasets and a dataset for the newer DoH (DNS over HTTPS) protocol. The results show that our approach is robust and achieves better recognition performance than similar methods.
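
The abstract names two preprocessing ideas without giving details: encoding packet length sequences as model input and balancing the data samples before pre-training. The Python sketch below is one plausible reading of those two steps and is not the authors' implementation. Everything in it is an assumption made for illustration: the function names `length_tokens`, `build_vocab`, and `balance`, the `len_<n>` token format, the clipping value of 1500, the special tokens, and the choice of downsampling as the balancing strategy.

```python
import random
from collections import defaultdict

MAX_LEN = 1500  # assumed clipping value (a common Ethernet MTU), not from the paper


def length_tokens(lengths, max_len=MAX_LEN):
    """Map each packet length to a discrete token, clipping to [0, max_len]."""
    return [f"len_{min(max(n, 0), max_len)}" for n in lengths]


def build_vocab(token_sequences, specials=("[PAD]", "[UNK]", "[MASK]")):
    """Assign integer ids: special tokens first, then tokens in first-seen order."""
    vocab = {tok: i for i, tok in enumerate(specials)}
    for seq in token_sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab


def balance(samples, key, seed=0):
    """Downsample every group (as defined by key) to the smallest group's size."""
    groups = defaultdict(list)
    for sample in samples:
        groups[key(sample)].append(sample)
    if not groups:
        return []
    target = min(len(group) for group in groups.values())
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    return balanced
```

Treating each clipped length as its own token lets a BERT-style model embed lengths the same way it embeds words, so masked-token pre-training can pick up the context of a length within a flow. Bucketing lengths into ranges would shrink the vocabulary at the cost of resolution; which trade-off the paper makes is not stated in the abstract.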