Monitoring Large-Scale Cloud-Native Infrastructure Using One-Sided RDMA

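Keywords: Remote direct memory access, Cloud computing, Computer science, Overhead (engineering), Database, Operating system, Algorithm
Authors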
Zhuo Song,Jiejian Wu,Teng Ma,Zhe Wang,Linghe Kong,Zhenzao Wen,Jingxuan Li,Yang Lu,Yong Yang,Tao Ma,Zheng Liu,Guihai Chen
Source
Journal: IEEE/ACM Transactions on Networking [Institute of Electrical and Electronics Engineers]
Volume (Issue): 32 (4): 3499-3514
Identifier
DOI:10.1109/tnet.2024.3394514
Abstract

Cloud services have shifted from monolithic designs to microservices running on cloud-native infrastructure, with monitoring systems to ensure service level agreements (SLAs). However, traditional monitoring systems no longer meet the demands of cloud-native monitoring. During Alibaba's "Double Eleven" shopping festival, the monitor was observed to occupy resources of the monitored infrastructure and even to disrupt services. In this paper, we propose a novel monitoring system for cloud-native monitoring. The system achieves zero overhead in collecting raw metrics using one-sided remote direct memory access (RDMA) and remedies network congestion by adopting a receiver-driven flow control scheme. It also features a priority queue mechanism to meet different quality of service requirements and an efficient batch processing design to reduce CPU occupation. The system has been deployed and evaluated in four different clusters with heterogeneous RDMA NIC devices and architectures in Alibaba Cloud. Results show that it achieves no CPU occupation at the monitored host and supports $1\sim10k$ hosts with a $0.1\sim1s$ sampling interval using a single thread for network I/O. It significantly relieves the incast issue and maintains $80\sim95\%$ of bandwidth utilization in several clusters when monitoring $1k$ hosts. It also ensures that high-priority services finish collecting metrics earlier than low-priority ones by at least $400\,\mu s$ when monitoring $1k$ hosts.
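
The receiver-driven flow control and priority queue mentioned in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation and performs no real RDMA: the class names `PullRequest` and `ReceiverDrivenCollector`, the `window` parameter, and the "lower number means higher priority" convention are all assumptions made for illustration. It only shows the general idea that a collector which initiates every read can cap the number of outstanding reads (limiting incast) and serve urgent hosts first.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class PullRequest:
    # Hypothetical request record; ordering uses (priority, seq) only.
    priority: int                      # assumed convention: lower value = more urgent
    seq: int                           # tie-breaker that preserves submission order
    host: str = field(compare=False)   # monitored host whose metrics would be read


class ReceiverDrivenCollector:
    """Toy model of a collector that decides when each host is read.

    At most `window` reads are outstanding at once, so replies cannot
    pile up at the receiver. No networking happens here.
    """

    def __init__(self, window: int):
        self.window = window
        self.pending = []        # min-heap of PullRequest
        self.in_flight = set()   # hosts with an outstanding read
        self.seq = 0

    def submit(self, host: str, priority: int) -> None:
        heapq.heappush(self.pending, PullRequest(priority, self.seq, host))
        self.seq += 1

    def issue(self) -> list:
        """Issue reads in priority order until the window is full."""
        issued = []
        while self.pending and len(self.in_flight) < self.window:
            req = heapq.heappop(self.pending)
            self.in_flight.add(req.host)
            issued.append(req.host)
        return issued

    def complete(self, host: str) -> None:
        """Mark a read as finished, freeing one slot in the window."""
        self.in_flight.discard(host)


collector = ReceiverDrivenCollector(window=2)
for host, prio in [("h1", 1), ("h2", 0), ("h3", 1), ("h4", 0)]:
    collector.submit(host, prio)
first = collector.issue()    # the two priority-0 hosts go first
collector.complete("h2")
second = collector.issue()   # one freed slot, so one more read is issued
```

In a real deployment the `issue` step would post one-sided RDMA reads and `complete` would be driven by completion events; both are left abstract here because the abstract does not describe those details.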