Computer science
Sparse matrix
Collaborative filtering
Locality-sensitive hashing
Overhead (engineering)
Big data
Recommender system
Hash function
Matrix factorization
Kernel (algebra)
Non-negative matrix factorization
Theoretical computer science
Data mining
Machine learning
Hash table
Feature vector
Mathematics
Operating system
Combinatorics
Physics
Quantum mechanics
Gaussian distribution
Computer security
Authors
Zixuan Li, Hao Li, Kenli Li, Fan Wu, Lydia Y. Chen, Keqin Li
Source
Journal: ACM/IMS Transactions on Data Science
[Association for Computing Machinery]
Date: 2021-11-30
Volume/Issue: 2 (4): 1-27
Cited by: 1
Abstract
Matrix factorization (MF) can extract low-rank features from high-dimensional data and integrate information about the data's manifold distribution, thereby capturing nonlinear neighborhood information. MF has therefore drawn wide attention for low-rank analysis of sparse big data, e.g., Collaborative Filtering (CF) recommender systems, social networks, and Quality of Service. However, two problems exist: (1) huge computational overhead for constructing the Graph Similarity Matrix (GSM) and (2) huge memory overhead for storing the intermediate GSM. Consequently, GSM-based MF methods, e.g., kernel MF and graph-regularized MF, cannot be directly applied to low-rank analysis of sparse big data on cloud and edge platforms. To solve this intractable problem for sparse big data analysis, we propose Locality Sensitive Hashing (LSH) aggregated MF (LSH-MF), which solves the following problems: (1) the proposed probabilistic projection strategy of LSH-MF avoids constructing the GSM while still satisfying the requirement for accurate projection of sparse big data; (2) to run LSH-MF with fine-grained parallelization and online learning on GPUs, we also propose CULSH-MF, which works on CUDA parallelization. Experimental results show that CULSH-MF not only reduces computational time and memory overhead but also obtains higher accuracy. Compared with deep learning models, CULSH-MF not only saves training time but also achieves the same accuracy performance.
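The abstract's key idea is that LSH can group similar rows directly, so the O(n²) Graph Similarity Matrix never has to be materialized. The paper's exact projection strategy is not given here, so the sketch below is only an illustrative random-hyperplane (sign-projection) LSH in NumPy: rows whose sign patterns under a few random projections agree fall into the same bucket, approximating cosine-similar neighborhoods; the function name `lsh_buckets` and all parameters are hypothetical, not from the paper.

```python
import numpy as np

def lsh_buckets(X, n_planes=8, seed=0):
    """Hash rows of X into buckets via random-hyperplane sign projections.

    Illustrative stand-in for GSM-free neighborhood grouping: rows with
    identical sign patterns share a bucket, so pairwise similarities are
    only ever needed within a bucket, never as a full n x n matrix.
    """
    rng = np.random.default_rng(seed)
    # One random projection direction per hyperplane.
    planes = rng.standard_normal((X.shape[1], n_planes))
    # Boolean sign pattern of each row under every hyperplane, shape (n, n_planes).
    signs = (X @ planes) >= 0
    # Pack each row's sign pattern into one integer bucket key.
    keys = signs @ (1 << np.arange(n_planes))
    buckets = {}
    for i, k in enumerate(keys):
        buckets.setdefault(int(k), []).append(i)
    return buckets

# Toy usage: nearly parallel rows are likely to share a bucket,
# the dissimilar third row is likely to land elsewhere.
X = np.array([[1.0, 0.0, 2.0],
              [1.1, 0.0, 1.9],
              [-3.0, 4.0, 0.0]])
buckets = lsh_buckets(X, n_planes=6)
```

With cosine-similar rows, the probability that a single random hyperplane separates them is θ/π (θ the angle between them), which is why a handful of planes suffices to keep near neighbors together with high probability.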