SML PBL

movie recommendation system

GEETANJALI COLLEGE OF ENGINEERING AND TECHNOLOGY

MOVIE RECOMMENDATION SYSTEM

22R11A66G8-K.SrivalliVarshini
23R15A6618-M.Bhargavi
23R15A6619-N.Sruthi
INTRODUCTION
● A recommendation system is a type of information filtering system which attempts to predict the preferences of
a user, and make suggestions based on these preferences. There are a wide variety of applications for
recommendation systems.
● These have become increasingly popular over the last few years and are now utilized in most online platforms
that we use.
● The objective of this project is to build a movie recommendation system using collaborative filtering
techniques.
● The system uses data from two primary datasets:
movies.csv, which contains movieId, title, and genres, and ratings.csv, which contains userId,
movieId, rating, and timestamp.
● Key tools and libraries used in this implementation include R with packages such as recommenderlab, ggplot2,
and data.table. The system leverages these resources to analyze user preferences and recommend movies
effectively.
Project Overview
Objective: Build an Item-Based Collaborative Filtering system in R to recommend movies.

Dataset: MovieLens (105,339 ratings across 10,329 movies).

Techniques: Data preprocessing, collaborative filtering, model building and evaluation.
Dataset Details

Movies Dataset
• It contains the movieId, title, and genres columns. The genres column lists categories such as Action, Adventure, and Fantasy.
• A binary matrix was then created where each column represented a genre.
• For each movie, a 1 indicated the presence of a genre, and a 0 indicated its absence. This transformation allowed for efficient filtering and searching by genre.

Rating Dataset
• It contains movieId, userId, rating, and timestamp.
• The dataset was transformed into a user-item matrix, where rows represented users and columns represented movies. Each cell held the user's rating for the corresponding movie.
• This matrix was converted into a sparse matrix using the recommenderlab package to optimize storage and computation, especially since most cells in the matrix were empty.

Integration
• The processed movies and ratings data were combined into a single searchable matrix, enabling analysis of user preferences and movie characteristics.
• The combined data structure served as the foundation for computing similarities and generating recommendations.
• Both datasets are merged on a common identifier (e.g., movieId). This ensures that movie metadata and user ratings are aligned for processing.
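The merge on a common identifier can be sketched as follows. This is a minimal Python illustration rather than the project's R code; the sample rows, the titles "Movie A" and "Movie B", and the helper name merge_on_movie_id are hypothetical, and the pipe-separated genre strings are an assumed layout for the genres column.

```python
# Hypothetical movie metadata keyed by movieId (stand-in for movies.csv).
movies = {
    1: {"title": "Movie A", "genres": "Action|Adventure"},
    2: {"title": "Movie B", "genres": "Fantasy"},
}

# Hypothetical (userId, movieId, rating) rows (stand-in for ratings.csv).
ratings = [(10, 1, 4.0), (11, 2, 3.0), (12, 99, 5.0)]


def merge_on_movie_id(rating_rows, movie_table):
    """Inner join: keep only ratings whose movieId has metadata."""
    merged = []
    for user_id, movie_id, rating in rating_rows:
        meta = movie_table.get(movie_id)
        if meta is None:
            continue  # rating refers to a movie with no metadata; skip it
        merged.append({
            "userId": user_id,
            "movieId": movie_id,
            "rating": rating,
            "title": meta["title"],
            "genres": meta["genres"],
        })
    return merged


merged = merge_on_movie_id(ratings, movies)
```

The rating for movieId 99 is dropped because it has no metadata, which is the alignment guarantee the merge is meant to provide.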
What is ...?
Data Preprocessing?
Data preprocessing is the process of preparing raw data for analysis by transforming it
into a clean and usable format for machine learning models. It involves handling
missing values, scaling numerical data, encoding categorical data, and more.

Collaborative Filtering?
Collaborative filtering is a technique used in recommendation systems to predict
user preferences based on the preferences of other users.

Model Training and Testing?
This process involves teaching a machine learning model to make accurate predictions
and evaluating its performance.
Data Preprocessing
Genre Encoding:
• Movie genres are transformed into a numerical format using one-hot encoding.
• For example, if a movie belongs to genres like Comedy and Action, its encoded vector might
look like [1,0,1,0] where each position corresponds to a genre.
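A minimal sketch of the genre encoding, written in Python for illustration (the project itself uses R). The four-genre vocabulary and its order are assumptions chosen so that the result matches the [1,0,1,0] example, and the pipe-separated genre string is an assumed input format.

```python
# Hypothetical genre vocabulary; in practice it would be derived from
# the genres column of movies.csv.
GENRES = ["Action", "Adventure", "Comedy", "Drama"]


def one_hot_genres(genre_field, vocabulary=GENRES):
    """Encode a pipe-separated genre string as a 0/1 vector."""
    present = set(genre_field.split("|"))
    return [1 if genre in present else 0 for genre in vocabulary]


example = one_hot_genres("Comedy|Action")
```

Each position in the vector answers "does this movie have genre i?", so filtering by genre reduces to checking a single column.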

Sparse Matrix Creation:


• A sparse matrix is built to represent user-movie interactions.
• Rows represent users, columns represent movies, and matrix values contain ratings or binary
indicators (liked or not liked).
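One way to picture the sparse user-movie matrix is a dictionary of dictionaries that stores only the ratings that exist. The sketch below is a Python illustration, not the recommenderlab representation; the sample rows and the name build_sparse_matrix are hypothetical.

```python
from collections import defaultdict

# Hypothetical (userId, movieId, rating) rows.
ratings = [(1, 10, 4.0), (1, 20, 3.5), (2, 10, 5.0), (3, 30, 2.0)]


def build_sparse_matrix(rows):
    """Map userId -> {movieId: rating}; absent cells are simply not stored."""
    matrix = defaultdict(dict)
    for user_id, movie_id, rating in rows:
        matrix[user_id][movie_id] = rating
    return dict(matrix)


user_item = build_sparse_matrix(ratings)

# Density = stored cells / (users x movies); low density is what makes
# a sparse representation worthwhile.
n_users = len(user_item)
n_movies = len({m for row in user_item.values() for m in row})
stored = sum(len(row) for row in user_item.values())
density = stored / (n_users * n_movies)
```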

Normalization:
• User ratings are standardized to remove biases, such as some users always rating high or low.
• This ensures fair comparisons between users' preferences.
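A common form of this normalization is mean-centering, where each user's average rating is subtracted from that user's ratings. The Python sketch below assumes mean-centering, which may not be the exact method the project used; full standardization would also divide by the user's standard deviation. The function name and sample ratings are hypothetical.

```python
def center_user_ratings(user_ratings):
    """Subtract the user's mean rating from each of their ratings."""
    mean = sum(user_ratings.values()) / len(user_ratings)
    return {movie: rating - mean for movie, rating in user_ratings.items()}


# Two hypothetical users with the same taste but different rating habits.
generous = center_user_ratings({10: 5.0, 20: 4.0, 30: 3.0})
harsh = center_user_ratings({10: 3.0, 20: 2.0, 30: 1.0})
```

After centering, both users have identical profiles, so the habit of rating high or low no longer affects the comparison.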

Binarization:
• Ratings are binarized to focus on whether a user likes a movie (1) or does not like it (0).
• For instance, ratings above 3 might be converted to (1), and ratings below or equal to 3 to (0).
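The thresholding rule can be sketched in a few lines of Python. The threshold of 3 comes from the example above; the function name and sample ratings are hypothetical.

```python
def binarize(user_ratings, threshold=3.0):
    """Map ratings strictly above the threshold to 1, all others to 0."""
    return {movie: 1 if rating > threshold else 0
            for movie, rating in user_ratings.items()}


liked = binarize({10: 4.5, 20: 3.0, 30: 2.0})
```

A rating of exactly 3 maps to 0 here, matching the "below or equal to 3" rule.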
Collaborative Filtering
Overview of Collaborative Filtering:
This approach relies on user behavior and preferences to recommend items.
Item-Based Collaborative Filtering: Compares items (movies) rather than users, identifying similar items
based on how users rate them.

Similarity Calculation:
• Cosine similarity is used to measure the relationship between items.
• Example: If two movies have similar ratings from multiple users, their cosine similarity score will be
high.
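As an illustration of the similarity calculation, the Python sketch below computes cosine similarity between two movies, each represented as a mapping from userId to rating. Treating missing ratings as 0 is an assumption of this sketch, and the three sample movies are hypothetical.

```python
import math


def cosine_similarity(item_a, item_b):
    """Cosine of the angle between two item rating vectors.

    Each item is a dict of userId -> rating; users who did not rate an
    item contribute 0 to that item's vector.
    """
    common_users = set(item_a) & set(item_b)
    dot = sum(item_a[u] * item_b[u] for u in common_users)
    norm_a = math.sqrt(sum(v * v for v in item_a.values()))
    norm_b = math.sqrt(sum(v * v for v in item_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical ratings by users 1, 2 and 3.
movie_a = {1: 5.0, 2: 4.0, 3: 1.0}
movie_b = {1: 4.0, 2: 5.0, 3: 1.0}  # rated much like movie_a
movie_c = {1: 1.0, 2: 1.0, 3: 5.0}  # rated the opposite way

sim_ab = cosine_similarity(movie_a, movie_b)
sim_ac = cosine_similarity(movie_a, movie_c)
```

Movies rated alike by the same users (movie_a and movie_b) score close to 1, while movie_c scores noticeably lower.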

Recommendation Process:
• The model predicts how much a user would like a movie based on the ratings of similar movies.
• Movies with the highest predicted scores are recommended.
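The prediction step can be sketched as a similarity-weighted average of the user's own ratings on the most similar movies. This Python sketch is a simplified variant and not necessarily how recommenderlab computes it; the function names, movie labels, and similarity values are hypothetical.

```python
def predict_rating(user_ratings, similarities, k=30):
    """Predict a user's rating for one target movie.

    user_ratings: movie -> rating given by this user.
    similarities: movie -> similarity between that movie and the target.
    Only the k most similar rated movies with positive similarity are used.
    """
    neighbours = sorted(
        (m for m in user_ratings if similarities.get(m, 0.0) > 0),
        key=lambda m: similarities[m],
        reverse=True,
    )[:k]
    total_weight = sum(similarities[m] for m in neighbours)
    if total_weight == 0:
        return None  # no similar rated movie, so no prediction
    weighted = sum(similarities[m] * user_ratings[m] for m in neighbours)
    return weighted / total_weight


def top_n(predictions, n=10):
    """Return the n movies with the highest predicted scores."""
    scored = {m: s for m, s in predictions.items() if s is not None}
    return sorted(scored, key=scored.get, reverse=True)[:n]


predicted = predict_rating({"A": 5.0, "B": 1.0}, {"A": 0.9, "B": 0.1})
best = top_n({"X": 4.6, "Y": 2.0, "Z": 3.5}, n=2)
```

Because movie A is far more similar to the target than movie B, the prediction lands close to the user's rating for A.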
Model Training and Testing
Data Splitting:
The dataset is divided into two parts:
Training Set (80%): Used to train the recommendation model.
Testing Set (20%): Used to evaluate the model's performance.
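A minimal Python sketch of an 80/20 split. Splitting by user rather than by individual rating is an assumption here, as are the fixed seed and the name split_users.

```python
import random


def split_users(user_ids, train_fraction=0.8, seed=42):
    """Shuffle user ids reproducibly and cut them into train and test lists."""
    ids = sorted(user_ids)
    rng = random.Random(seed)  # fixed seed keeps the split repeatable
    rng.shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


train_users, test_users = split_users(range(1, 101))
```

Keeping the two sets disjoint is what makes the evaluation meaningful: the model never sees the test users' ratings during training.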

Model Parameters:
Number of Neighbors (k): Determines how many similar items are considered
during recommendations. In this case, k=30.
Similarity Metric: Cosine similarity was chosen to evaluate item similarity.

Evaluation:
The model’s effectiveness is assessed by comparing predicted ratings against
actual user ratings in the test set.
Metrics such as precision, recall, or Mean Absolute Error (MAE) may be used.
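The metrics named above can be sketched in Python as follows. The sample predictions, recommendation lists, and function names are hypothetical and do not reproduce results from the project.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def precision_recall(recommended, relevant):
    """Precision: share of recommended movies the user actually liked.
    Recall: share of liked movies that were recommended."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


mae = mean_absolute_error([4.0, 3.0, 5.0], [5.0, 3.0, 3.0])
precision, recall = precision_recall(["A", "B", "C", "D"], ["A", "C", "E"])
```

MAE suits rating predictions, while precision and recall suit top-N lists, where only the presence of a movie in the list matters.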
Visualisation
Heatmaps, bar charts, and distribution plots.
Recommendation
Output of the model:
The model generates a list of top 10 movies for
each user based on their prior ratings.
Example: If a user liked Toy Story and Shrek, the
model might recommend Finding Nemo or
Frozen.
Recommendation
Output of the Model:
The recommendations are personalized and vary across users, showcasing the system's ability to
adapt to individual preferences.
Improvement Potential:
The system's accuracy can be enhanced by incorporating more contextual features (e.g.,
user demographics, time of viewing).
Conclusion
In conclusion, the movie recommendation system built using
Item-Based Collaborative Filtering (IBCF) effectively provides
personalized movie suggestions based on user ratings. By
preprocessing the data and transforming it into a user-item matrix, we
were able to calculate similarities between movies, based on user ratings, using
cosine similarity. The system successfully predicted top movie
recommendations for each user, offering an enhanced user experience.
Visualizations of movie popularity and rating distributions provided
valuable insights into user behavior. Overall, the approach
demonstrated the potential of collaborative filtering to deliver relevant
and tailored recommendations. Future improvements could involve
incorporating hybrid models for even more accurate suggestions.
THANK YOU!
