织女星
粒度
索引(排版)
计算机科学
群(周期表)
人工智能
化学
万维网
物理
操作系统
有机化学
天文
作者
Meng Li,Huayi Chai,Siqiang Luo,Haipeng Dai,Rong Gu,Jiaqi Zheng,Guihai Chen
摘要
Learned indexes, which model key-value data structures by machine learning models, have been extensively studied. However, the fastest immutable learned indexes (e.g., RMI) do not provide the same tight lookup bounds as classical indexes such as B-trees. There are learned indexes that provide tight bounds (e.g., PGM) but those fall short in query performance. This gives rise to an interesting open question: whether there exists a learned index that simultaneously achieves state-of-the-art empirical performance and matching complexity? In this paper, we give a positive answer to this standing problem.We propose two new online model-building policies: (1) simplifying distribution by the adoption of a proper granularity (i.e., grouping multiple keys together for model-building) and (2) actively tuning distribution through key repositioning. Additionally, we introduce a general framework that combines these two policies for performance optimization under a given memory budget. We put everything together to design VEGA, a learned index that simultaneously achieves competitive theoretical and empirical performance compared to state-of-the-art learned indexes. We conducted extensive evaluations, demonstrating VEGA achieves both better lookup and building performance.
科研通智能强力驱动
Strongly Powered by AbleSci AI