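Computer science
Cloud computing
Distributed computing
Reinforcement learning
Resource (disambiguation)
Fidelity
Service (business)
Resource allocation
Exploit
High fidelity
Shared resource
Reliability (semiconductor)
Artificial intelligence
Real-time computing
Computer network
Engineering
Computer security
Operating system
Telecommunications
Electrical engineering
Quantum mechanics
Physics
Economics
Economy
Power (physics)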
Authors
Chunyang Meng, Jingwan Tong, Maolin Pan, Yang Yu
Identifier
DOI: 10.1109/icws55610.2022.00033
Abstract
The elastic cloud uses autoscaling to let users automatically provision or deprovision resources on demand, which has attracted many application providers to migrate their applications to the cloud. However, autoscaling multi-service applications remains challenging because of the complex correlations among services. This paper presents HRA, an intelligent, holistic resource autoscaling framework for multi-service applications that uses model-based deep reinforcement learning (DRL) to mitigate service-level agreement (SLA) violations while saving costs. HRA (i) leverages historical telemetry data and machine learning methods to adaptively build a simulated environment that models the relations between resources, workloads, and performance, (ii) exploits the environment model to improve the training efficiency of the DRL agent, and (iii) uses the agent to scale resources online automatically, based on simple low-level features from a monitor rather than elaborate high-level features that represent the complex correlations and require sophisticated prior knowledge. Experiments (i) evaluated the fidelity of the proposed (simulated) environment modeling method, (ii) evaluated the reliability of the resource allocation policy when transferred from simulation to reality, and (iii) compared HRA with related autoscaling methods. The results demonstrate that HRA learns a more effective resource allocation policy within a limited number of time-consuming interactions and significantly reduces the SLA violation rate, by 32-92%, at a lower cost compared with other mainstream methods.
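The three-step idea in the abstract (learn an environment model from telemetry, train a policy against that model, then use the policy to scale) can be illustrated with a toy sketch. This is not the authors' implementation: it swaps deep reinforcement learning for tabular Q-value iteration, treats the workload as fixed within each state, and every name, constant, and data point below (`TELEMETRY`, `fit_latency_model`, `simulate_step`, `plan_q`, `policy`, the 110 ms SLA, the penalty of 20) is a hypothetical assumption chosen for illustration.

```python
ACTIONS = (-1, 0, 1)  # remove one replica, hold, add one replica

# Hypothetical telemetry rows: (replicas, workload_rps, latency_ms).
TELEMETRY = [(r, w, 20.0 + 2.0 * w / r) for r in range(1, 9) for w in (50, 100, 200)]


def fit_latency_model(rows):
    """Least-squares fit of latency = a + b * (workload / replicas)."""
    xs = [w / r for r, w, _ in rows]
    ys = [lat for _, _, lat in rows]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


def simulate_step(model, replicas, workload, action, sla_ms=110.0, max_replicas=8):
    """One step in the learned environment: returns (new_replicas, reward)."""
    a, b = model
    new_replicas = min(max_replicas, max(1, replicas + action))
    latency = a + b * workload / new_replicas
    penalty = 20.0 if latency > sla_ms else 0.0  # assumed SLA violation penalty
    return new_replicas, -float(new_replicas) - penalty  # assumed cost of 1 per replica


def plan_q(model, workloads=(50, 100, 200), max_replicas=8, gamma=0.9, sweeps=200):
    """Q-value iteration over the simulated environment (stand-in for DRL training)."""
    q = {((r, w), a): 0.0
         for r in range(1, max_replicas + 1) for w in workloads for a in ACTIONS}
    for _ in range(sweeps):
        for state, action in list(q):
            replicas, workload = state
            nxt, reward = simulate_step(model, replicas, workload, action,
                                        max_replicas=max_replicas)
            q[(state, action)] = reward + gamma * max(q[((nxt, workload), a)] for a in ACTIONS)
    return q


def policy(q, replicas, workload):
    """Greedy scaling action for the observed low-level state."""
    return max(ACTIONS, key=lambda a: q[((replicas, workload), a)])


model = fit_latency_model(TELEMETRY)
q_table = plan_q(model)
```

Under these assumptions, `policy(q_table, 1, 200)` asks for one more replica because a single replica violates the assumed SLA at that load, while `policy(q_table, 8, 50)` scales down because the penalty-free replica count is much lower. All training interactions happen against the fitted model rather than a live cluster, which is the point the abstract makes about limiting time-consuming real interactions.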