
Learning-based Sketches for Frequency Estimation in Data Streams without Ground Truth

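Ground truth; Streams; Computer science; Estimation; Data stream mining; Data mining; Artificial intelligence; Engineering; Computer networks; Systems engineering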

Authors

Xinyu Yuan, Qiao Yan, Li Meng, Zhenchun Wei, Cuiying Feng

Source

Journal: Cornell University - arXiv; Date: 2024-12-04

Links

arxiv.org, doi.org

Identifier

DOI: 10.48550/arxiv.2412.03611

Abstract

Estimating item frequencies in high-volume, fast data streams has been extensively studied in many areas, such as databases and network measurement. Traditional sketches provide only coarse estimates under strict memory constraints. Although some learning-augmented methods have emerged recently, they typically rely on offline training with real frequencies and/or labels, which are often unavailable. Moreover, these methods suffer from slow update speeds, which limits their suitability for real-time processing even though they offer only marginal accuracy improvements. To overcome these challenges, we propose UCL-sketch, a practical learning-based paradigm for per-key frequency estimation. Our design introduces two key innovations: (i) an online training mechanism based on equivalent learning that requires no ground truth (GT), and (ii) a highly scalable architecture that leverages logically structured estimation buckets to scale to real-world data streams. UCL-sketch, which utilizes compressive sensing (CS), converges to an estimator that provably yields an error bound far lower than that of prior works, without sacrificing processing speed. Extensive experiments on both real-world and synthetic datasets demonstrate that our approach outperforms previously proposed approaches in per-key accuracy and distribution. Notably, under extremely tight memory budgets, its quality almost matches that of an (infeasible) omniscient oracle. Moreover, compared to the existing equation-based sketch, UCL-sketch achieves an average decoding speedup of nearly 500 times. To support further research and development, our code is publicly available at https://github.com/Y-debug-sys/UCL-sketch.
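
For readers unfamiliar with the "traditional sketches" that the abstract contrasts against, the following is a minimal sketch of a textbook Count-Min sketch. It is not UCL-sketch and is not taken from the authors' repository; the abstract does not name a specific baseline, so Count-Min is used here only as a well-known representative. The class name, the `width` and `depth` parameters, the salted BLAKE2b hashing, and the toy stream are all illustrative assumptions.

```python
import hashlib
from collections import Counter


class CountMinSketch:
    """Textbook Count-Min sketch (illustrative only; not the UCL-sketch design)."""

    def __init__(self, width: int, depth: int) -> None:
        self.width = width    # counters per row (assumed parameter name)
        self.depth = depth    # number of hash rows (assumed parameter name)
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, key: str, row: int) -> int:
        # One hash function per row, obtained by salting BLAKE2b with the row id.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key: str, count: int = 1) -> None:
        # Add the count to one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(key, row)] += count

    def estimate(self, key: str) -> int:
        # Collisions can only inflate counters, so the minimum over rows
        # is the tightest available estimate and never undercounts.
        return min(
            self.table[row][self._index(key, row)] for row in range(self.depth)
        )


# Toy stream: two heavy keys plus many keys that appear once.
stream = ["a"] * 50 + ["b"] * 30 + [f"k{i}" for i in range(200)]
truth = Counter(stream)

sketch = CountMinSketch(width=64, depth=4)
for item in stream:
    sketch.update(item)

# Per-key overestimation error, the kind of coarseness the abstract refers to.
errors = {key: sketch.estimate(key) - truth[key] for key in truth}
```

With only 64 counters per row and 202 distinct keys, hash collisions are unavoidable, so many entries of `errors` will be positive. This one-sided error under tight memory is the kind of limitation that motivates learning-based approaches such as the one the abstract describes.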
