Reservoir Sampling Based Streaming Method for Large Scale Collaborative Filtering

Tevfik Aytekin


Collaborative filtering algorithms work on user feedback data (such as purchases, clicks, or ratings) to build models of users and items. User feedback data on real-life e-commerce sites can be very large, which incurs high costs for maintenance and model building. Parallelizing the computation might help, but it brings additional costs for extra computing power, and the maintenance problems of very large datasets still persist. Sampling can be an effective way to reduce the amount of data in this setting. In this work, we propose a novel sampling technique for collaborative filtering that can reduce the amount of data considerably. Experimental results on three real-life datasets show that the proposed method leads to a significant reduction in the amount of data with little harm to the accuracy of the models. The method works in a streaming fashion, which makes it suitable for real-time use in large-scale e-commerce applications with a large, continuous flow of user feedback.
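
For readers unfamiliar with the technique named in the title, the following is a minimal sketch of classic reservoir sampling (often called Algorithm R), which keeps a uniform random sample of fixed size k from a stream of unknown length in a single pass. It is not the method proposed in the paper, whose details are not given in this abstract; the function name `reservoir_sample`, its parameters, and the (user, item, rating) event tuples are assumptions made purely for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return a uniform random sample of up to k items from a stream.

    Classic Algorithm R, shown for illustration only; this is NOT the
    sampling method proposed in the paper.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical feedback events: (user_id, item_id, rating).
events = ((u, i, (u + i) % 5 + 1) for u in range(100) for i in range(50))
sample = reservoir_sample(events, k=20, rng=random.Random(42))
print(len(sample))
```

Because each incoming event is examined once and memory is bounded by k, a sketch like this illustrates why reservoir-style sampling fits the continuous feedback streams described above.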


Collaborative filtering; reservoir sampling; large-scale recommender systems.


Submitted: 2018-02-27 14:41:35
Published: 2018-09-26 07:04:22


Copyright (c) 2018 International Journal of Intelligent Systems and Applications in Engineering

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.