Computer science
Quantization (signal processing)
Forwarding plane
Real-time computing
Speedup
Inference
Artificial intelligence
Network packet
Algorithm
Parallel computing
Computer network
Authors
Kaiyi Zhang, Nancy Samaan, Ahmed Karmouch
Identifier
DOI:10.1109/noms56928.2023.10154321
Abstract
Offloading some of the traffic management decision-making functionalities to intelligent data-planes (IDPs) can significantly enhance the accuracy and adaptation speed of network services. An IDP executes, at line speed, one or more machine learning (ML) models for real-time inference and decision making. Unfortunately, existing IDP deployments either realize only a limited set of ML models, such as decision trees, or require substantial modifications to the switch hardware. These limitations can be attributed to the inherent scarcity of both computational and memory resources and the strict high-speed per-packet processing demands. To address the aforementioned limitations, we propose a novel ML-based management framework, the in-network quantized ML architecture (INQ-MLA). First, INQ-MLA delegates the task of training and continuously optimizing the IDP ML model to the control-plane. The latter adopts a tailored quantization-aware training process to compensate for the precision loss caused by quantization. Second, INQ-MLA employs an efficient quantization mechanism to transform the trained ML model parameters (e.g., weights and activation-function outputs) from floating-point representations to smaller, low-precision fixed-point integer values that can be easily processed and stored in the data-plane. Finally, INQ-MLA ensures that the deployed ML model is integrated into the IDP pipeline by limiting all its execution operations to simplified arithmetic operations that are available in most switches. We developed a proof-of-concept implementation of our proposed architecture using P4-based switches. Experimental results demonstrate that INQ-MLA can achieve a high level of accuracy at runtime.
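To illustrate the kind of transformation the abstract describes, the sketch below shows uniform affine quantization of floating-point weights into low-precision integers, so that inference can be carried out with simple integer arithmetic. This is a generic, minimal example for illustration only; the function names, bit width, and rounding scheme are assumptions, not the paper's INQ-MLA implementation.

```python
import numpy as np

def quantize(weights, num_bits=8):
    """Uniform affine quantization of float weights to unsigned integers.

    Returns the integer tensor plus the (scale, zero_point) needed to
    recover an approximation: w ≈ scale * (q - zero_point).
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = float(weights.min()), float(weights.max())
    # Map the observed float range onto the integer range.
    scale = (w_max - w_min) / (qmax - qmin) if w_max > w_min else 1.0
    zero_point = int(round(qmin - w_min / scale))
    q = np.clip(np.round(weights / scale + zero_point), qmin, qmax)
    return q.astype(np.int32), scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the quantized representation."""
    return scale * (q.astype(np.float32) - zero_point)

# Example: quantize a small weight vector to 8-bit integers.
w = np.array([-1.2, 0.0, 0.7, 2.5], dtype=np.float32)
q, s, z = quantize(w)
w_hat = dequantize(q, s, z)
```

In a data-plane setting, only the integer tensor `q` (and per-tensor constants `s`, `z`) would be stored in switch memory, and the per-packet computation would use integer additions, comparisons, and shifts rather than floating-point operations.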