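Computer science
Tensor decomposition
Tensor (intrinsic definition)
Algorithm
Noise (video)
Data mining
Artificial intelligence
Mathematics
Pure mathematics
Image (mathematics)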
Authors
Jiaqi Liu, Qiwu Wu, Lingzhi Jiang, Renjun Zhan, Xiaochuan Zhao, Husheng Wu, W.M. Tan
Source
Journal: PLOS ONE
[Public Library of Science]
Date: 2024-12-02
Volume (issue): 19 (12): e0312723-e0312723
Identifier
DOI:10.1371/journal.pone.0312723
Abstract
Tensor data is common in real-world applications such as recommendation systems and air quality monitoring, but such data is often sparse, noisy, and produced at high speed. CANDECOMP/PARAFAC (CP) is a popular tensor decomposition model that is both theoretically advantageous and numerically stable. However, learning the CP model in a Bayesian framework, though promising for handling data sparsity and noise, is computationally challenging, especially with fast-arriving data streams. This paper addresses the efficient processing of streaming tensor data. We propose BS-CP, a fast and accurate framework that dynamically updates the posterior of the latent factors whenever a new observation tensor is received. We first present the BS-CP1 algorithm, an efficient implementation using assumed density filtering (ADF). We then propose the BS-CP2 algorithm, which uses the Gauss–Laguerre quadrature method to integrate out the noise effect and shows better empirical results. We tested BS-CP1 and BS-CP2 on generic real-world recommendation system datasets, including Beijing-15k, Beijing-20k, MovieLens-1m, and Fit Record. Compared with state-of-the-art methods, BS-CP1 achieves 31.8% and 33.3% RMSE improvements on the last two datasets, with a similar trend observed for BS-CP2. These results indicate that our algorithm performs better on large datasets and is more suitable for real-world scenarios. Compared with most of the other methods evaluated, our approach shows an improvement of over 10% and exhibits superior stability.
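
As a rough illustration of two building blocks named in the abstract, the sketch below shows how a rank-R CP model predicts a single tensor entry and how Gauss–Laguerre quadrature approximates an integral of the form ∫₀^∞ e^(−x) f(x) dx. It is not the BS-CP1 or BS-CP2 algorithm: the function names `cp_entry` and `gauss_laguerre`, the tensor shape, the rank, and the test integrand are all assumptions chosen for illustration, and the abstract does not specify how the paper applies the quadrature to the noise term.

```python
import numpy as np


def cp_entry(factors, index):
    """Predict one tensor entry under a rank-R CP model (illustrative helper).

    factors: list of K arrays; factors[k] has shape (dim_k, R)
    index:   tuple of K integers, one per tensor mode
    """
    # Pick the factor row for each mode, multiply them element-wise,
    # then sum over the R rank-one components.
    rows = [U[i] for U, i in zip(factors, index)]
    return float(np.sum(np.prod(rows, axis=0)))


def gauss_laguerre(f, degree=20):
    """Approximate the integral of exp(-x) * f(x) over [0, inf)."""
    # laggauss returns the quadrature nodes and weights for the weight exp(-x).
    nodes, weights = np.polynomial.laguerre.laggauss(degree)
    return float(np.sum(weights * f(nodes)))


# Hypothetical 4 x 5 x 6 tensor of rank 3, built from random factor matrices.
rng = np.random.default_rng(0)
factors = [rng.normal(size=(d, 3)) for d in (4, 5, 6)]

# Dense reconstruction, used only to check the single-entry prediction.
full = np.einsum("ir,jr,kr->ijk", *factors)
entry = cp_entry(factors, (1, 2, 3))

# Integral of x^2 * exp(-x) over [0, inf) equals Gamma(3) = 2.
gamma3 = gauss_laguerre(lambda x: x**2, degree=5)
```

In a streaming setting only the factor rows touched by a new observation need to be read, which is why the single-entry form is shown rather than the dense reconstruction. A 5-point Gauss–Laguerre rule is exact for polynomial integrands up to degree 9, so the quadratic test integrand is recovered up to floating-point error.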