SML PBL
MOVIE RECOMMENDATION SYSTEM
22R11A66G8-K.SrivalliVarshini
23R15A6618-M.Bhargavi
23R15A6619-N.Sruthi
INTRODUCTION
● A recommendation system is an information filtering system that predicts a user's preferences and makes
suggestions based on them. Recommendation systems have a wide variety of applications.
● They have become increasingly popular over the last few years and are now used by most of the online
platforms we use.
● The objective of this project is to build a movie recommendation system using collaborative filtering
techniques.
● The system uses two primary datasets:
movies.csv, which contains movieId, title, and genres, and ratings.csv, which contains userId, movieId,
rating, and timestamp.
● Key tools used in this implementation are R and its packages recommenderlab, ggplot2, and data.table,
which are used to analyze user preferences and recommend movies.
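As a rough illustration of how the two files fit together, the sketch below parses ratings into a user-item structure and maps movie IDs to titles. It is written in plain Python for readability, whereas the project itself uses R with data.table and recommenderlab. The inline rows are made-up stand-ins that only mirror the column names listed above, and the helper names `load_ratings` and `load_titles` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Toy stand-ins for ratings.csv and movies.csv (made-up rows, real column names).
RATINGS_CSV = """userId,movieId,rating,timestamp
1,10,4.0,964982703
1,20,3.0,964981247
2,10,5.0,964982224
2,30,2.0,964983815
"""
MOVIES_CSV = """movieId,title,genres
10,Movie A,Comedy
20,Movie B,Drama
30,Movie C,Action|Thriller
"""


def load_ratings(text):
    """Return {userId: {movieId: rating}} parsed from CSV text."""
    matrix = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        matrix[int(row["userId"])][int(row["movieId"])] = float(row["rating"])
    return dict(matrix)


def load_titles(text):
    """Return {movieId: title} parsed from CSV text."""
    return {int(row["movieId"]): row["title"]
            for row in csv.DictReader(io.StringIO(text))}


user_item = load_ratings(RATINGS_CSV)
titles = load_titles(MOVIES_CSV)

for user, ratings in user_item.items():
    for movie_id, rating in ratings.items():
        print(user, titles[movie_id], rating)
```

Movies a user has not rated are simply absent from that user's dictionary, which reflects how sparse a real user-item matrix usually is.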
Project Overview
Objective: Build an Item-Based Collaborative Filtering system in R to recommend movies.
Data Preprocessing
Normalization:
• User ratings are normalized, for example by subtracting each user’s mean rating, to remove biases such as some users always rating high or low.
• This ensures fair comparisons between users’ preferences.
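One common way to do this is mean-centering, sketched below in plain Python. This is an assumption about the kind of normalization meant here, not the project's R code, and the function name `center_by_user` and the two example users are hypothetical.

```python
def center_by_user(user_item):
    """Subtract each user's mean rating from that user's ratings."""
    centered = {}
    for user, ratings in user_item.items():
        if not ratings:  # skip users with no ratings to avoid dividing by zero
            continue
        mean = sum(ratings.values()) / len(ratings)
        centered[user] = {movie: r - mean for movie, r in ratings.items()}
    return centered


# Two hypothetical users with the same taste but different rating habits.
ratings = {
    "generous": {"A": 5.0, "B": 4.0},
    "strict": {"A": 3.0, "B": 2.0},
}
centered = center_by_user(ratings)
print(centered)
```

After centering, both users show the same pattern (A slightly above their average, B slightly below), so their preferences become directly comparable.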
Binarization:
• Ratings are binarized to focus on whether a user likes a movie (1) or does not like it (0).
• For instance, ratings above 3 might be converted to 1, and ratings of 3 or below to 0.
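The thresholding rule in the example above can be sketched in a few lines of plain Python. The cutoff of 3 is taken from that example, and the function name `binarize_ratings` is hypothetical; the project's R code may apply the cutoff differently.

```python
def binarize_ratings(user_item, threshold=3.0):
    """Map each rating to 1 (liked) if it is above the threshold, else 0."""
    return {
        user: {movie: 1 if rating > threshold else 0
               for movie, rating in ratings.items()}
        for user, ratings in user_item.items()
    }


binary = binarize_ratings({"u1": {"A": 4.0, "B": 3.0, "C": 1.5}})
print(binary)
```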
Collaborative Filtering
Overview of Collaborative Filtering:
This approach relies on user behavior and preferences to recommend items.
Item-Based Collaborative Filtering: Compares items (movies) rather than users, identifying similar items
based on how users rate them.
Similarity Calculation:
• Cosine similarity is used to measure how similar two items are, based on their rating vectors.
• Example: If two movies receive similar ratings from the same users, their cosine similarity score will
be high.
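A minimal plain-Python sketch of cosine similarity between two movies follows. It assumes each movie is represented as a list of ratings from the same users in the same order, with 0 standing for "not rated". That representation and the toy vectors are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a movie with no ratings is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Ratings from three hypothetical users, in the same user order for every movie.
movie_x = [5, 4, 1]
movie_y = [5, 4, 1]  # rated the same way as movie_x
movie_z = [1, 0, 5]  # rated almost the opposite way

sim_xy = cosine_similarity(movie_x, movie_y)
sim_xz = cosine_similarity(movie_x, movie_z)
print(round(sim_xy, 3), round(sim_xz, 3))
```

Movies rated alike by the same users score close to 1, while movies rated in opposite ways score much lower.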
Recommendation Process:
• The model predicts how much a user would like a movie based on the ratings of similar movies.
• Movies with the highest predicted scores are recommended.
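The two steps above can be sketched as a similarity-weighted average followed by a ranking. This is a simplified plain-Python illustration, not the recommenderlab implementation used in the project. The similarity values, the user's ratings, and the function names `predict_rating` and `recommend` are all hypothetical.

```python
def predict_rating(user_ratings, similarities, target, k=30):
    """Similarity-weighted average of the user's ratings on the k movies
    most similar to `target` (only positive similarities are used)."""
    scored = [(similarities[target].get(movie, 0.0), rating)
              for movie, rating in user_ratings.items() if movie != target]
    top = sorted((p for p in scored if p[0] > 0),
                 key=lambda p: p[0], reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None  # no similar rated movie, so no prediction
    return sum(sim * rating for sim, rating in top) / total


def recommend(user_ratings, similarities, n=2, k=30):
    """Return up to n unrated movies, ordered by predicted rating."""
    preds = {m: predict_rating(user_ratings, similarities, m, k)
             for m in similarities if m not in user_ratings}
    ranked = sorted((m for m in preds if preds[m] is not None),
                    key=lambda m: preds[m], reverse=True)
    return ranked[:n]


# Hypothetical item-item similarities (symmetric) and one user's ratings.
similarities = {
    "A": {"B": 0.9, "C": 0.1, "D": 0.2},
    "B": {"A": 0.9, "C": 0.2, "D": 0.1},
    "C": {"A": 0.1, "B": 0.2, "D": 0.8},
    "D": {"A": 0.2, "B": 0.1, "C": 0.8},
}
user_ratings = {"A": 5.0, "D": 1.0}

pred_b = predict_rating(user_ratings, similarities, "B")
top_movies = recommend(user_ratings, similarities)
print(pred_b, top_movies)
```

Movie B is very similar to A, which the user rated highly, so its predicted score is high and it is ranked first. Movie C resembles the low-rated D and ends up lower in the list.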
Model Training and Testing
Data Splitting:
The dataset is divided into two parts:
Training Set (80%): Used to train the recommendation model.
Testing Set (20%): Used to evaluate the model's performance.
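An 80/20 split can be sketched as a seeded shuffle followed by a cut. The slides do not say whether users or individual ratings are the unit being split, so the plain-Python example below splits generic rows. The function name `split_rows` and the seed are hypothetical.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle rows reproducibly and split them into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 100 hypothetical (userId, movieId) rows.
rows = [(user, movie) for user in range(1, 11) for movie in range(1, 11)]
train, test = split_rows(rows)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, so evaluation results can be compared across runs.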
Model Parameters:
Number of Neighbors (k): Determines how many similar items are considered
during recommendations. In this case, k=30.
Similarity Metric: Cosine similarity was chosen to evaluate item similarity.
Evaluation:
The model’s effectiveness is assessed by comparing predicted ratings against
actual user ratings in the test set.
Metrics such as precision, recall, or Mean Absolute Error (MAE) may be used.
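For reference, the metrics named above can be computed as in the plain-Python sketch below. The example predictions, the recommended list, and the set of relevant movies are made up for illustration and are not results from this project.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def precision_recall(recommended, relevant):
    """Precision: share of recommended movies that are relevant.
    Recall: share of relevant movies that were recommended."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


mae = mean_absolute_error([4.0, 3.0, 5.0], [5.0, 3.0, 3.0])
precision, recall = precision_recall(["A", "B", "C", "D"], ["A", "C", "E"])
print(mae, precision, recall)
```

MAE suits predicted ratings, while precision and recall suit top-N recommendation lists, where a movie counts as relevant if the user actually liked it in the test set.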
Visualisation
Heatmaps
Recommendations
Conclusion
In conclusion, the movie recommendation system built using
Item-Based Collaborative Filtering (IBCF) provides personalized
movie suggestions based on user ratings. By preprocessing the data
and transforming it into a user-item matrix, we were able to
calculate similarities between movies using cosine similarity. The
system generated top movie recommendations for each user.
Visualizations of movie popularity and rating distributions provided
insights into user behavior. Overall, the approach demonstrated the
potential of collaborative filtering to deliver relevant and tailored
recommendations. Future improvements could involve incorporating
hybrid models for more accurate suggestions.
THANK YOU!