Recommender system
Computer science
Artificial intelligence
Artificial neural network
Machine learning
Deep learning
Deep neural network
Authors
Qicai Shi,Feng Xiao,Douglas Pickard,Inga Chen,Liang Chen
Identifier
DOI:10.1145/3543873.3587684
Abstract
Recommender systems are widely used in Web applications to recommend items that are relevant to a user's preferences. However, focusing on exploiting user preferences while ignoring exploration leads to biased feedback and hurts the user's experience in the long term. The Multi-Armed Bandit (MAB) is introduced to balance the tradeoff between exploitation and exploration. By utilizing context information in the reward function, contextual bandit algorithms achieve better performance than context-free bandit algorithms. However, existing contextual bandit algorithms either assume a linear relation between the expected reward and the context features, which limits their representation power, or use a deep neural network in the reward function, which is impractical to implement. In this paper, we propose a new contextual bandit algorithm, DeepLinUCB, which leverages the representation power of a deep neural network to transform the raw context features in the reward function. Specifically, this deep neural network is dedicated to the recommender system and is efficient and practical in real-world applications. Furthermore, we conduct extensive experiments in our online recommender system using requests from real-world scenarios and show that DeepLinUCB is efficient and outperforms other bandit algorithms.
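The general idea of running a linear UCB rule on top of network-transformed context features can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the network architecture, how it is trained, or the exact update rule. The sketch assumes a per-arm (disjoint) LinUCB model, and the names `make_mlp_transform` and `LinUCBOnFeatures` are hypothetical. A fixed, randomly initialized two-layer network stands in for a trained one.

```python
import numpy as np


def make_mlp_transform(in_dim, hidden_dim, out_dim, seed=0):
    """Return a fixed feature map phi(x). ASSUMPTION: random, untrained
    weights are only a stand-in for a trained deep network."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=1.0 / np.sqrt(in_dim), size=(hidden_dim, in_dim))
    W2 = rng.normal(scale=1.0 / np.sqrt(hidden_dim), size=(out_dim, hidden_dim))

    def transform(x):
        h = np.maximum(0.0, W1 @ x)   # ReLU hidden layer
        return np.tanh(W2 @ h)        # bounded output features

    return transform


class LinUCBOnFeatures:
    """Disjoint LinUCB applied to transformed features z = phi(x).
    ASSUMPTION: one ridge-regression model per arm."""

    def __init__(self, n_arms, transform, feat_dim, alpha=1.0):
        self.transform = transform
        self.alpha = alpha  # exploration weight
        self.A = [np.eye(feat_dim) for _ in range(n_arms)]    # I + sum z z^T
        self.b = [np.zeros(feat_dim) for _ in range(n_arms)]  # sum reward * z

    def scores(self, x):
        z = self.transform(x)
        out = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                               # ridge estimate
            bonus = self.alpha * np.sqrt(z @ A_inv @ z)     # confidence width
            out.append(theta @ z + bonus)
        return np.array(out)

    def select(self, x):
        return int(np.argmax(self.scores(x)))

    def update(self, arm, x, reward):
        z = self.transform(x)
        self.A[arm] += np.outer(z, z)
        self.b[arm] += reward * z


# Usage: pick an arm for one context, then feed back an observed reward.
phi = make_mlp_transform(in_dim=4, hidden_dim=16, out_dim=8)
bandit = LinUCBOnFeatures(n_arms=3, transform=phi, feat_dim=8, alpha=0.5)
context = np.array([0.2, -1.0, 0.5, 1.5])
chosen = bandit.select(context)
bandit.update(chosen, context, reward=1.0)
```

Keeping the bandit layer linear in the transformed features means each update is a rank-one change to a small matrix, which is one plausible reading of why such a design would be cheaper than placing a full network inside the reward function.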