The Netflix Prize: Alternating Least Squares in MPI
Author
Abstract
In 2006 Netflix announced a million-dollar prize for the first team to beat its Cinematch recommendation system by 10% on a particular test data set. Specifically, given over 100 million ratings on a scale of 1 to 5, from 480,189 users on 17,770 movies, the goal was to produce predictions for the test set that minimize the root mean square error (RMSE). Cinematch scored an RMSE of 0.9525, so the target was an RMSE of 0.8572 or lower. My goal in this project was simply to beat Cinematch, in parallel. The prize was won in September 2009 by a team using a blend of many different techniques. One technique that played a large part in the winning blend was matrix factorization: the roughly 480,000 × 17,000 sparse ratings matrix R is approximated by a product of two much smaller matrices. For a fully observed matrix, a singular value decomposition produces such a pair and minimizes the squared (Frobenius-norm) error for a given rank. After choosing a number of “features”, f, you want an f × 480,000 user matrix U and an f × 17,000 movie matrix M whose product UᵀM comes as close as possible to the given training data in R. A number of algorithms exist to find these two matrices; I used an approach called alternating least squares with weighted-λ-regularization [1].
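The alternating update described in the abstract can be illustrated with a small serial sketch. This is not the project's MPI code. It assumes the usual weighted-λ form, in which each user column u solves (M_I M_Iᵀ + λ·n·I) u = M_I r over the n movies that user rated, and likewise for each movie column. The function names, the toy ratings, the random initialization, and the values of `f`, `lam`, and `n_iters` are all illustrative assumptions.

```python
import numpy as np

def rmse(pred, actual):
    """Root mean square error between two equal-length sequences."""
    diff = np.asarray(pred, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def als_wr(ratings, n_users, n_movies, f=2, lam=0.05, n_iters=20, seed=0):
    """Fit U (f x n_users) and M (f x n_movies) to (user, movie, rating) triples.

    Serial sketch; the random initialization is an assumption, not the
    project's choice.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(f, n_users))
    M = rng.normal(scale=0.1, size=(f, n_movies))

    # Index the known ratings by user and by movie.
    by_user = [[] for _ in range(n_users)]
    by_movie = [[] for _ in range(n_movies)]
    for u, m, r in ratings:
        by_user[u].append((m, r))
        by_movie[m].append((u, r))

    def solve_side(fixed, groups, out):
        # Each column is an independent regularized least-squares problem.
        for idx, obs in enumerate(groups):
            if not obs:
                continue  # no ratings: leave the column unchanged
            cols = [j for j, _ in obs]
            r = np.array([v for _, v in obs], dtype=float)
            A = fixed[:, cols]                          # f x n_obs
            lhs = A @ A.T + lam * len(obs) * np.eye(f)  # weighted-lambda term
            out[:, idx] = np.linalg.solve(lhs, A @ r)

    for _ in range(n_iters):
        solve_side(M, by_user, U)   # fix M, update every user column
        solve_side(U, by_movie, M)  # fix U, update every movie column
    return U, M

# Hypothetical toy data: 4 users, 3 movies, 8 known ratings.
TOY_RATINGS = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 2),
               (2, 1, 2), (2, 2, 1), (3, 0, 5), (3, 2, 3)]

def training_rmse(U, M, ratings):
    """RMSE of the predictions U[:, u] . M[:, m] on the known ratings."""
    preds = [U[:, u] @ M[:, m] for u, m, _ in ratings]
    return rmse(preds, [r for _, _, r in ratings])
```

Calling `U, M = als_wr(TOY_RATINGS, n_users=4, n_movies=3)` and then `training_rmse(U, M, TOY_RATINGS)` gives the fit on the toy data. Within one half-step every column solve is independent of the others, so the columns can be divided among processes; how the project actually distributes them in MPI is not stated here.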
Similar resources
Large-Scale Parallel Collaborative Filtering for the Netflix Prize
Many recommendation systems suggest items to users by utilizing the techniques of collaborative filtering (CF) based on historical records of items that the users have viewed, purchased, or rated. Two major problems that most CF approaches have to resolve are scalability and sparseness of the user profiles. In this paper, we describe Alternating-Least-Squares with Weighted-λ-Regularization (ALS...
Statistical Properties of Alternating Least Squares Estimators of a Collaborative Filtering Model
Recommender systems are emerging as important tools for improving customer satisfaction by mathematically predicting user preferences. Several major corporations including Amazon.com and Pandora use these types of systems to suggest additional options based on current or recent purchases. Netflix uses a recommender system to provide its customers with suggestions for movies that they may like, ...
The Netflix Prize High Performance Computing Neural Networks Final Report
A solution for the Netflix Prize was developed based on back propagation neural networks. The solution is different than most other Collaborative Filtering techniques in that rather than perform a global dimensionality reduction, this method focuses on each desired prediction by creating an entirely new neural network for each prediction. The implementation was parallelized using MPI, achieving...
P-Tree Singular Value Decomposition Item-Feature Collaborative Filtering Algorithm for Netflix Prize
Collaborative Filtering is effective at providing customers with personalized recommendations by analyzing purchase patterns. Matrix factorization, e.g. Singular Value Decomposition, is another successful technique in recommendation systems. We implemented a Singular Value Decomposition algorithm to achieve the least total squared error. Based on the result, item-feature Collaborative Filtering ...
Parallel stochastic gradient algorithms for large-scale matrix completion
This paper develops Jellyfish, an algorithm for solving data-processing problems with matrix-valued decision variables regularized to have low rank. Particular examples of problems solvable by Jellyfish include matrix completion problems and least-squares problems regularized by the nuclear norm or γ2-norm. Jellyfish implements a projected incremental gradient method with a biased, random order...
Publication date: 2010