Recommender Systems (RS) suggest items that are likely to interest a target user. These suggestions can support any kind of decision-making process.
YouTube video suggestions, Spotify recommendations, and the related products a shopping site shows you after a purchase are all the output of a recommender system.
There are several approaches to building recommender systems, each with its own pros and cons: Content-Based, Collaborative Filtering, Knowledge-Based, Demographic, and Hybrid systems. In this article, we focus only on collaborative filtering, using movie recommendations as the example.
It is like getting a movie recommendation from a friend whose taste in cinema you trust.
As we know, people tend to take movie recommendations from friends or other people. The basic idea of collaborative filtering is to analyze people's shared interests in domain-specific items (in our case, movies). Calculating the similarity between people's taste in cinema, or the similarity between different movies, allows us to make plausible recommendations. These two approaches are called user-based collaborative filtering (UB-CF) and item-based collaborative filtering (IB-CF), respectively.
In the rest of this article, to keep the illustration simple, we will consider a hypothetical situation with two people, Person A and Person B, and two movies, Movie X and Movie Y.
User-Based Collaborative Filtering
In real life, Person A asks Person B for a good movie to watch. If Person A likes the suggested movie, we assume that Person A will be more likely to ask Person B for new suggestions in the future.
In our case, we first define a threshold: the minimum number of movies that two people must both have watched. It was found that a threshold of 25 could significantly improve the accuracy of the predicted ratings, and that a value of 50 gave the best results [2,3].
We will assume that the number of movies watched by both Person A and Person B exceeds our threshold. We then calculate the similarity of their taste in cinema with the Pearson correlation, and we assume that the two are positively correlated. In that case, we consider the two people neighbours.
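The neighbour test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pearson_similarity`, the `min_common` parameter, the movie ids and the 1-5 ratings are all hypothetical, and the toy threshold of 3 stands in for the much larger values (25 or 50) mentioned earlier.

```python
from math import sqrt

def pearson_similarity(ratings_a, ratings_b, min_common=2):
    """Pearson correlation over the movies both people rated.

    Returns None when they share fewer than `min_common` movies,
    or when either person's shared ratings have zero variance.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < min_common:
        return None
    a = [ratings_a[m] for m in common]
    b = [ratings_b[m] for m in common]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)

# Hypothetical ratings (movie id -> rating on a 1-5 scale)
person_a = {"m1": 5, "m2": 3, "m3": 4, "m4": 1}
person_b = {"m1": 4, "m2": 2, "m3": 5, "m4": 1, "m5": 3}

# Toy threshold of 3 shared movies, purely for illustration
similarity = pearson_similarity(person_a, person_b, min_common=3)
are_neighbours = similarity is not None and similarity > 0
```

Here the two people share four movies and their ratings move in the same direction, so the correlation is positive and they are treated as neighbours. With a stricter threshold than the number of shared movies, the function returns None and no similarity is computed at all.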
After that, if some of Person A's neighbours (such as Person B) like Movie X, recommending Movie X to Person A can be a good suggestion.
Furthermore, with enough information and a good algorithm, we can also make a plausible prediction of the rating that Person A will give to Movie X. This is user-based collaborative filtering (UB-CF). At the lowest level, UB-CF first takes two people and compares their ratings on the movies they have both watched, then decides whether or not they are neighbours. Once all the neighbours have been considered, the algorithm decides whether a specific movie can be recommended to Person A.
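One common way to turn neighbours into a predicted rating is a similarity-weighted average of their ratings. The sketch below assumes that approach; the article does not say which prediction formula is used, and the function name `predict_rating`, the similarity values and the cut-off of 4.0 for recommending are hypothetical.

```python
def predict_rating(neighbours, movie):
    """Similarity-weighted average of the neighbours' ratings for `movie`.

    `neighbours` is a list of (similarity, ratings) pairs, where
    `ratings` maps a movie title to that neighbour's rating.
    Returns None if no positively correlated neighbour rated the movie.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for similarity, ratings in neighbours:
        if similarity > 0 and movie in ratings:
            weighted_sum += similarity * ratings[movie]
            weight_total += similarity
    if weight_total == 0:
        return None
    return weighted_sum / weight_total

# Hypothetical neighbours of Person A: (similarity to A, their ratings)
neighbours_of_a = [
    (0.9, {"Movie X": 5, "Movie Y": 2}),
    (0.5, {"Movie X": 3}),
    (0.7, {"Movie Y": 4}),
]

predicted = predict_rating(neighbours_of_a, "Movie X")
# Assumed rule: recommend when the predicted rating is at least 4.0
recommend = predicted is not None and predicted >= 4.0
```

The closer a neighbour's taste is to Person A's, the more that neighbour's rating counts. Production systems often also subtract each neighbour's mean rating first, so that generous and harsh raters become comparable; that refinement is left out here for brevity.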
Item-Based Collaborative Filtering
Item-Based Collaborative Filtering (IB-CF), on the other hand, analyzes the similarity of two movies by comparing the ratings that the same users gave to both.
In our case, let's say Movie X and Movie Y have both been rated by 50 different people. First, the similarity between these two movies is calculated. If the two items are positively correlated, we can say that Movie X and Movie Y are neighbour items.
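The item-to-item comparison can be sketched the same way as the user-to-user one, except that the correlation now runs over the users who rated both movies. The example assumes Pearson correlation as the similarity measure (the text only says "positively correlated"), and the function name `item_similarity`, the user ids and the ratings are hypothetical.

```python
from math import sqrt

def item_similarity(user_ratings, movie_x, movie_y):
    """Pearson correlation between two movies over users who rated both.

    `user_ratings` maps a user id to a {movie: rating} dict.
    Returns None with fewer than two co-raters or zero variance.
    """
    pairs = [
        (ratings[movie_x], ratings[movie_y])
        for ratings in user_ratings.values()
        if movie_x in ratings and movie_y in ratings
    ]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)

# Hypothetical data: u4 rated only Movie X, so u4 is ignored
user_ratings = {
    "u1": {"Movie X": 5, "Movie Y": 4},
    "u2": {"Movie X": 2, "Movie Y": 1},
    "u3": {"Movie X": 4, "Movie Y": 5},
    "u4": {"Movie X": 3},
}

sim_xy = item_similarity(user_ratings, "Movie X", "Movie Y")
neighbour_items = sim_xy is not None and sim_xy > 0
```

Users who liked Movie X here also tended to like Movie Y, so the correlation is positive and the two movies count as neighbour items.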
In item-based methods, the rating predicted for a movie is based on the ratings given to similar movies. Consequently, recommender systems using this approach tend to recommend items related to those the user usually appreciates. For instance, in our case, movies with the same genre, actors, or director as those highly rated by the user are likely to be recommended. While this may lead to safe recommendations, it reduces the chance of discovering a movie from a genre that the user has never watched before.
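As a rough illustration of predicting from similar movies, the sketch below averages the user's own ratings, weighted by how similar each rated movie is to the target. The weighting scheme, the function name `predict_from_similar_items`, the extra titles Movie Z and Movie W, and all numbers are assumptions made for the example.

```python
def predict_from_similar_items(own_ratings, similarity_to_target):
    """Predict a user's rating for a target movie from their own ratings.

    `own_ratings` maps movies the user has rated to the rating given.
    `similarity_to_target` maps movies to their similarity with the target.
    Only positively similar movies that the user has rated contribute.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for movie, rating in own_ratings.items():
        similarity = similarity_to_target.get(movie, 0.0)
        if similarity > 0:
            weighted_sum += similarity * rating
            weight_total += similarity
    if weight_total == 0:
        return None
    return weighted_sum / weight_total

# Hypothetical: Person A's ratings, and each movie's similarity to Movie X
ratings_of_a = {"Movie Y": 4, "Movie Z": 2}
similarity_to_x = {"Movie Y": 0.8, "Movie Z": 0.2, "Movie W": 0.6}

predicted_x = predict_from_similar_items(ratings_of_a, similarity_to_x)
```

Because the prediction draws only on movies the user has already rated, it naturally stays close to their known taste, which is the "safe" behaviour described above.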
Although the user-based method is relatively riskier than the item-based method, it is more likely to make serendipitous recommendations.
At Pixly, we calculate the similarities of movies with various machine learning methods. We also attach great importance to the serendipitous experience of our users. You will see two different ‘similar’ sections on movie pages. One is ‘movie recommendations’, which is based on user-based collaborative filtering, and the other is ‘similar movies’, which is based on content-based recommendations.
References
[1] Ricci, F., Rokach, L., Shapira, B.: Recommender Systems Handbook, 2nd edn.
[2] Herlocker, J., Konstan, J.A., Riedl, J.: An empirical analysis of design choices in neighbourhood-based collaborative filtering algorithms. Inf. Retr. 5(4), 287–310 (2002)
[3] Herlocker, J.L., Konstan, J.A., Borchers, A., Riedl, J.: An algorithmic framework for performing collaborative filtering. In: SIGIR ’99: Proc. of the 22nd Annual Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 230–237. ACM, New York, NY, USA (1999)