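Data deduplication
Computer science
Container (type theory)
Encryption
Similarity (geometry)
Database
Data mining
Information retrieval
Computer vision
Computer security
Image (mathematics)
Materials science
Composite material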
Authors
Tong Sun, Bowen Jiang, Borui Li, Jiamei Lv, Yi Gao, Wei Dong
Abstract
Encrypted container images are increasingly common in registries, and as image volumes grow, registries need deduplication to keep storage manageable. Traditional deduplication struggles with encrypted content, because encryption makes duplicate data look random and therefore distinct. Current state-of-the-art methods address this by decompressing images and applying message-locked encryption (MLE). These techniques have two main limitations: even minor content changes can defeat deduplication, and decompressing layers increases storage requirements while degrading both user pull latency and overall system throughput. We propose SimEnc, a high-performance and secure deduplication system for encrypted container images that exploits multiple similarity spaces. SimEnc is the first to integrate semantic hashing with MLE, extracting semantic relationships across layers to increase the deduplication ratio. It includes a fast selection mechanism for similarity spaces, which offers more flexibility than previous approaches that relied on full decompression. By using Huffman decoding to explore a new similarity space, SimEnc improves both the deduplication ratio and overall performance. Our experimental results show that SimEnc reduces storage needs by up to 261.7% compared to encrypted serverless platforms and by 54.2% compared to plaintext registries, while also delivering lower pull latency.
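The interaction between MLE and deduplication described in the abstract can be illustrated with a minimal sketch. This is not SimEnc's implementation: it assumes the simplest form of MLE (convergent encryption, where the key is a hash of the plaintext), and it uses a toy SHA-256 counter-mode keystream in place of a real cipher so the example needs only the standard library. The names `mle_encrypt`, `mle_decrypt`, `dedup_tag`, and the sample layer contents are hypothetical.

```python
import hashlib
from typing import Tuple


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustration only, not a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def mle_encrypt(plaintext: bytes) -> Tuple[bytes, bytes]:
    """Convergent encryption: the key is derived from the message itself."""
    key = hashlib.sha256(plaintext).digest()
    stream = _keystream(key, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return key, ciphertext


def mle_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same keystream recovers the plaintext."""
    stream = _keystream(key, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


def dedup_tag(ciphertext: bytes) -> str:
    """Fingerprint a registry could use to detect identical ciphertexts."""
    return hashlib.sha256(ciphertext).hexdigest()


# Hypothetical layer contents (assumption, for illustration).
layer_a = b"FROM alpine\nRUN apk add curl\n" * 4
layer_b = bytes(layer_a)          # identical copy uploaded by another user
layer_c = layer_a[:-1] + b"!"     # same layer with a one-byte change

key_a, ct_a = mle_encrypt(layer_a)
key_b, ct_b = mle_encrypt(layer_b)
key_c, ct_c = mle_encrypt(layer_c)

same_ab = dedup_tag(ct_a) == dedup_tag(ct_b)   # identical plaintexts deduplicate
same_ac = dedup_tag(ct_a) == dedup_tag(ct_c)   # a minor edit defeats exact-match dedup
roundtrip = mle_decrypt(key_a, ct_a) == layer_a
```

Under these assumptions, identical layers produce identical ciphertexts and can be deduplicated, while a one-byte change yields an unrelated ciphertext. That second case is the limitation the abstract attributes to MLE-only approaches and the motivation for adding similarity-aware techniques.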