
IMDB Movie Analysis Report

This document summarizes an analysis of the IMDB movie dataset conducted in Google Sheets. Key steps included cleaning the data by removing nulls and duplicates, then analyzing various metrics like highest grossing films, top-rated movies, best directors, popular genres, and top actors based on critic and audience ratings. The highest grossing film was Avatar, the director with the best average rating was John Blanchard, the most popular genres were Drama and Comedy, and Leonardo DiCaprio was found to be the favorite actor of both critics and audiences.

Uploaded by

abrarahmed120401

Trainity

IMDB MOVIE ANALYSIS


14th May 2023

Project description:
This project analyzes the IMDB movie dataset by applying various techniques and
calculations to selected features of the dataset. The analysis is driven by the
questions and tasks given and is meant to fulfill the project requirements.

Approach:
● The project begins with loading the dataset into the spreadsheet tool.
● After loading, the first step is to clean the dataset by finding missing values,
removing nulls, removing duplicates, etc.
● Then, based on the questions asked and the problem statement, specific
criteria, calculations and visualizations are used to build tables, pivot tables,
charts, etc.
● The answer to each question is stored on a separate spreadsheet.
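The cleaning step was done in Google Sheets, but the same idea can be sketched in plain Python. This is only a minimal illustration under assumptions: the column names and sample rows are hypothetical, and it only drops rows, whereas the report also filled some missing values in manually.

```python
# Hypothetical sample rows; the column names are an assumption and the
# values are made up for illustration.
rows = [
    {"movie_title": "Movie A", "director_name": "Director X", "gross": "100", "budget": "40"},
    {"movie_title": "Movie A", "director_name": "Director X", "gross": "100", "budget": "40"},  # duplicate
    {"movie_title": "Movie B", "director_name": "", "gross": "250", "budget": "90"},  # missing director
    {"movie_title": "Movie C", "director_name": "Director Y", "gross": "", "budget": "30"},  # missing gross
    {"movie_title": "Movie D", "director_name": "Director Z", "gross": "75", "budget": "80"},
]

REQUIRED = ("movie_title", "director_name", "gross", "budget")


def count_nulls(rows):
    """Count empty cells per column, for a before/after null summary."""
    return {col: sum(1 for r in rows if not r[col].strip()) for col in REQUIRED}


def clean(rows):
    """Drop rows with any empty required cell, then drop exact duplicates."""
    seen = set()
    cleaned = []
    for r in rows:
        if any(not r[col].strip() for col in REQUIRED):
            continue  # row still contains a null
        key = tuple(r[col] for col in REQUIRED)
        if key in seen:
            continue  # exact duplicate of an earlier row
        seen.add(key)
        cleaned.append(r)
    return cleaned


nulls_before = count_nulls(rows)
cleaned_rows = clean(rows)
nulls_after = count_nulls(cleaned_rows)
print(nulls_before)
print(nulls_after)
print(len(cleaned_rows), "rows kept")
```

Counting nulls both before and after cleaning gives the same kind of comparison as the before/after charts in the results.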

Tech-stack used:

Google Sheets
INSIGHTS & RESULTS:
A. Cleaning the data: This is one of the most important steps to perform before
moving forward with the analysis.

Task : Clean the data

Results:

The dataset contained many null values. Some rows containing nulls were dropped,
and others were filled in manually after researching the movies in question.
Many duplicated rows, along with some columns that were not important for the
analysis, were also dropped during cleaning. The results are visualized below:

Null values, before and after cleaning:


B. Movies with highest profit: Use a scatter plot of profit vs budget, with one
data point per movie, and identify the outlier.

Task : Find the movies with highest profit

Results:

Here are the top 10 movies with the highest profit:


Note: The third graph is plotted with 1 unit = $100,000

Many movies made a profit, and the movie with the highest profit was
‘Avatar’. Some movies also made a loss, which can be seen in the second
graph. The second graph shows a clear outlier: the movie ‘The Host’,
with a profit of -$12,213,298,588 on a budget of $12,215,500,000. These
figures suggest that the movie was never released, that its release ran into
problems, or that the data was collected incorrectly.
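The profit calculation can be sketched in Python as follows, assuming that profit is defined as gross minus budget (the report does not spell out the formula). Movies A to C are hypothetical rows. The budget and profit for ‘The Host’ are the figures reported above, with the gross back-computed from them.

```python
# (title, gross, budget) in dollars.
movies = [
    ("Movie A", 500_000_000, 200_000_000),  # hypothetical
    ("Movie B", 120_000_000, 100_000_000),  # hypothetical
    ("Movie C", 30_000_000, 45_000_000),    # hypothetical
    # Budget from the report; gross back-computed as budget + reported profit.
    ("The Host", 12_215_500_000 - 12_213_298_588, 12_215_500_000),
]


def with_profit(movies):
    """Return (title, profit, budget) tuples, assuming profit = gross - budget."""
    return [(title, gross - budget, budget) for title, gross, budget in movies]


def top_by_profit(movies, n=10):
    """Return the n most profitable movies, highest profit first."""
    return sorted(with_profit(movies), key=lambda m: m[1], reverse=True)[:n]


ranked = top_by_profit(movies)
# The most negative profit is a simple way to surface the loss-side outlier.
worst = min(with_profit(movies), key=lambda m: m[1])

print(ranked[0])
print(worst)
```

Taking the minimum only flags the single largest loss. Whether that point is a genuine outlier still has to be judged from the scatter plot.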
C. Top 250: Find the top 250 movies according to IMDB score in the IMDB dataset.
Also find the foreign movies.

Task : Find the top 250

Results:

Out of the IMDB top 250, the top 29 movies are as follows:
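As a rough sketch of this task, the ranking and the foreign-movie filter could look like this in Python. The `language` column and the definition of "foreign" as any language other than English are assumptions, and the sample rows are hypothetical.

```python
# Hypothetical sample rows; "language" as a column name is an assumption.
movies = [
    {"movie_title": "Movie A", "imdb_score": 8.9, "language": "English"},
    {"movie_title": "Movie B", "imdb_score": 9.1, "language": "Polish"},
    {"movie_title": "Movie C", "imdb_score": 7.4, "language": "Korean"},
    {"movie_title": "Movie D", "imdb_score": 8.2, "language": "English"},
]


def top_n_by_score(movies, n=250):
    """Return the n highest-scoring movies (ties keep their input order)."""
    return sorted(movies, key=lambda m: m["imdb_score"], reverse=True)[:n]


def foreign_only(movies, home_language="English"):
    """Keep movies whose language differs from home_language (an assumption)."""
    return [m for m in movies if m["language"] != home_language]


top3 = top_n_by_score(movies, n=3)
foreign_top = foreign_only(top3)

print([m["movie_title"] for m in top3])
print([m["movie_title"] for m in foreign_top])
```

With the full dataset, `n` would stay at its default of 250.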
D. Best Directors: Group the column using the director_name column.

Find out the top 10 directors for whom the mean of imdb_score is the highest and
store them in a new column top 10 director. In case of a tie in IMDb score
between two directors, sort them alphabetically.
Your task: Find the best directors.

Results:

In this analysis, I found that John Blanchard is the director with the highest
average IMDB score, 9.5, followed by Krzysztof Kieslowski with an
average score of 9.1.
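The grouping and tie-breaking rule from the task (highest mean imdb_score first, alphabetical order on ties) can be sketched in Python as below. The directors and scores are hypothetical and do not come from the dataset.

```python
from collections import defaultdict
from statistics import mean

# (director_name, imdb_score) pairs; hypothetical values for illustration.
rows = [
    ("Director C", 8.0),
    ("Director C", 9.0),
    ("Director A", 8.5),
    ("Director B", 9.5),
    ("Director D", 7.0),
]


def top_directors(rows, n=10):
    """Return (director, mean score) pairs, best first, ties sorted by name."""
    scores = defaultdict(list)
    for director, score in rows:
        scores[director].append(score)
    averages = [(d, mean(s)) for d, s in scores.items()]
    # Negative mean sorts high scores first; the name breaks ties alphabetically.
    averages.sort(key=lambda item: (-item[1], item[0]))
    return averages[:n]


result = top_directors(rows, n=3)
print(result)
```

Here "Director A" and "Director C" both average 8.5, so the composite sort key places "Director A" first.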
E. Popular Genres: Perform this step using the knowledge gained while
performing previous steps.
Your task: Find popular genres

Results: Here are the top 10 genres by number of movies.

Also, the genres with the highest average IMDB score are:
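Both genre rankings can be sketched in Python. The sketch assumes that each movie's genres are stored in one cell separated by `|`, which the report does not state, and it uses hypothetical rows.

```python
from collections import Counter, defaultdict
from statistics import mean

# (genres, imdb_score) pairs; the "|" separator is an assumption and the
# values are hypothetical.
rows = [
    ("Drama|Comedy", 7.0),
    ("Drama", 8.0),
    ("Comedy|Romance", 6.0),
    ("Drama|Romance", 9.0),
]


def genre_stats(rows):
    """Return (movie count per genre, mean imdb_score per genre)."""
    counts = Counter()
    scores = defaultdict(list)
    for genres, score in rows:
        for genre in genres.split("|"):
            counts[genre] += 1
            scores[genre].append(score)
    averages = {g: mean(s) for g, s in scores.items()}
    return counts, averages


counts, averages = genre_stats(rows)
print(counts.most_common(10))  # most frequent genres first
print(sorted(averages.items(), key=lambda kv: kv[1], reverse=True))
```

A movie with several genres is counted once for each of them, so the counts can add up to more than the number of movies.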


F. Charts: Create three new columns namely, Meryl_Streep, Leo_Caprio, and
Brad_Pitt which contain the movies in which the actors: 'Meryl Streep', 'Leonardo
DiCaprio', and 'Brad Pitt' are the lead actors. Use only the actor_1_name column
for extraction. Also, make sure that you use the names 'Meryl Streep', 'Leonardo
DiCaprio', and 'Brad Pitt' for the said extraction.
Append the rows of all these columns and store them in a new column named
Combined.
Group the combined column using the actor_1_name column.
Find the mean of the num_critic_for_reviews and num_users_for_review and
identify the actors which have the highest mean.

Your task: Find the critic-favorite and audience-favorite actors

Results:
Leonardo DiCaprio is both the audience favorite and the critic favorite, with an average
of 914 user reviews and 330 critic reviews per movie. He also has the most
movies of the three actors.
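The filter, group and mean steps from the task can be sketched in Python as follows. The review counts are made up for illustration and do not reproduce the figures above. The tuple fields stand in for the actor_1_name, num_critic_for_reviews and num_users_for_review columns named in the task.

```python
from collections import defaultdict
from statistics import mean

ACTORS = ("Meryl Streep", "Leonardo DiCaprio", "Brad Pitt")

# (actor_1_name, num_critic_for_reviews, num_users_for_review); made-up numbers.
rows = [
    ("Leonardo DiCaprio", 300, 900),
    ("Leonardo DiCaprio", 200, 700),
    ("Brad Pitt", 220, 600),
    ("Meryl Streep", 150, 300),
    ("Meryl Streep", 250, 500),
    ("Someone Else", 999, 9999),  # not a lead actor of interest, filtered out
]


def favorite_actors(rows, actors=ACTORS):
    """Return (critic favorite, audience favorite, critic means, user means)."""
    critic = defaultdict(list)
    user = defaultdict(list)
    for name, n_critic, n_user in rows:
        if name in actors:  # use only the lead-actor column, as the task asks
            critic[name].append(n_critic)
            user[name].append(n_user)
    critic_mean = {a: mean(v) for a, v in critic.items()}
    user_mean = {a: mean(v) for a, v in user.items()}
    critic_fav = max(critic_mean, key=critic_mean.get)
    audience_fav = max(user_mean, key=user_mean.get)
    return critic_fav, audience_fav, critic_mean, user_mean


critic_fav, audience_fav, critic_mean, user_mean = favorite_actors(rows)
print(critic_fav, critic_mean)
print(audience_fav, user_mean)
```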
Conclusion:
This concludes the report. I learned a lot about advanced Excel and how to
analyze such datasets effectively. I also learned a lot about the movies, their
profits, and much more.
