IMDB Movie Analysis Project Report

The IMDb Movie Analysis project investigates a dataset of movies to uncover insights related to genres, durations, languages, directors, and budgets. Key findings include the most common genres (Drama, Comedy, Thriller, Action), an average movie duration of 109 minutes with a positive correlation to IMDb scores, and top directors with high average scores. The project utilized Microsoft Excel for data cleaning and analysis, resulting in valuable experience in data visualization and statistical analysis.

Uploaded by

projectdemo091

IMDB Movie Analysis

PROJECT DESCRIPTION :-
The IMDb Movie Analysis project explores and analyzes a dataset of movies listed on the
IMDb platform. The dataset contains key information about each movie, including director
name, movie title, duration, genre, budget, gross earnings, IMDb rating, and more. Using
Excel together with data visualization and statistical techniques, the project seeks to extract
insights and trends about the factors that contribute to a movie's success.

In this project, I was required to prepare a detailed report on this dataset that answers the
following questions:

A. Movie Genre Analysis: Analyze the distribution of movie genres and their impact on the
IMDB score.

● Task: Determine the most common genres of movies in the dataset. Then, for each genre,
calculate descriptive statistics (mean, median, mode, range, variance, standard deviation)
of the IMDB scores.

B. Movie Duration Analysis: Analyze the distribution of movie durations and its impact on the
IMDB score.

● Task: Analyze the distribution of movie durations and identify the relationship between
movie duration and IMDB score.

C. Language Analysis: Examine the distribution of movies based on their language.

● Task: Determine the most common languages used in movies and analyze their impact
on the IMDB score using descriptive statistics.

D. Director Analysis: Analyze the influence of directors on movie ratings.

● Task: Identify the top directors based on their average IMDB score and analyze their
contribution to the success of movies using percentile calculations.

E. Budget Analysis: Explore the relationship between movie budgets and their financial success.

● Task: Analyze the correlation between movie budgets and gross earnings, and identify
the movies with the highest profit margin.

MY APPROACH :-
I went through the dataset to understand all the given columns and observed that it has a
total of 28 columns and 5043 rows. The dataset contains unwanted columns, null values and
blank rows, so I decided to clean it thoroughly.
1) First, I deleted the columns that have no relation to the project and don't provide
any valuable insights. In the end, I was left with 9 columns: director's name,
duration, movie title, genre, budget, gross, IMDb rating, language and country.
2) Then, I noticed that there were many rows with blank cells. To find them, I clicked
“Find & Select”, then “Go To Special”, and selected the “Blanks” option, which
highlighted all the blank cells. I then pressed the shortcut “CTRL + -” and selected the
“Entire row” option. This deleted every row that contained a blank cell.
3) Finally, I deleted the duplicate rows present in the dataset. I was left with a total of
9 columns and 3786 rows. The cleaned dataset is provided below.

https://docs.google.com/spreadsheets/d/1QZcrT5BZhKOTA9_pnpaorlPPRI7wW4BCzT_FyVd0YQY/edit?usp=sharing
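
The cleaning steps above were carried out by hand in Excel. As a rough sketch only, the same three steps (keep selected columns, drop rows with a blank cell, drop duplicate rows) could look like this in Python. The column names and the sample rows are hypothetical and are not taken from the actual dataset.

```python
import csv
import io


def clean_rows(rows, keep):
    """Keep only the `keep` columns, drop rows with blanks, drop duplicates."""
    cleaned = []
    seen = set()
    for row in rows:
        # Step 1: keep only the columns of interest (missing values become "").
        record = tuple((row.get(col) or "").strip() for col in keep)
        # Step 2: skip any row that has a blank cell in a kept column.
        if any(value == "" for value in record):
            continue
        # Step 3: skip rows already seen (duplicates).
        if record in seen:
            continue
        seen.add(record)
        cleaned.append(dict(zip(keep, record)))
    return cleaned


# Hypothetical sample standing in for the raw CSV export.
raw = """movie_title,director_name,imdb_score,color
Movie A,Director A,7.1,Color
Movie B,,6.0,Color
Movie A,Director A,7.1,Color
Movie C,Director C,8.2,Color
"""

rows = list(csv.DictReader(io.StringIO(raw)))
cleaned = clean_rows(rows, keep=["movie_title", "director_name", "imdb_score"])
print(len(rows), "rows before,", len(cleaned), "rows after")
```

Removing the unwanted columns first matters here: a blank in a column that is later discarded would otherwise cause a usable row to be deleted.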

TECH STACK :-
For this project, I used Microsoft Excel 365 to run the functions and answer the above
questions. I also used it to plot the graphs.

INSIGHTS :-

1) Movie Genre Analysis:
Task: Determine the most common genres of movies in the dataset. Then, for each genre,
calculate descriptive statistics (mean, median, mode, range, variance, standard deviation)
of the IMDB scores.
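
The genre statistics in this report were computed in Excel. As an illustrative sketch only, the same per-genre summary could be computed in Python as shown below. It assumes that a movie's genres are stored as one pipe-delimited string and that sample (not population) variance is wanted. Both are assumptions, and the rows are hypothetical.

```python
from collections import Counter, defaultdict
import statistics

# Hypothetical rows: (pipe-delimited genres, IMDb score). Not real data.
rows = [
    ("Drama|Thriller", 7.0),
    ("Drama", 8.0),
    ("Comedy|Drama", 6.0),
    ("Comedy", 5.0),
    ("Action|Thriller", 6.5),
]


def genre_statistics(rows):
    """Group scores by genre and compute descriptive statistics per genre."""
    scores_by_genre = defaultdict(list)
    for genres, score in rows:
        # A movie with several genres counts once toward each of them.
        for genre in genres.split("|"):
            scores_by_genre[genre].append(score)

    summary = {}
    for genre, scores in scores_by_genre.items():
        enough = len(scores) > 1  # sample variance needs at least two values
        summary[genre] = {
            "count": len(scores),
            "mean": statistics.mean(scores),
            "median": statistics.median(scores),
            "mode": statistics.mode(scores),
            "range": max(scores) - min(scores),
            "variance": statistics.variance(scores) if enough else 0.0,
            "stdev": statistics.stdev(scores) if enough else 0.0,
        }
    return summary


summary = genre_statistics(rows)
counts = Counter({genre: stats["count"] for genre, stats in summary.items()})
print(counts.most_common(2))
print(summary["Drama"])
```

Because a multi-genre movie is counted under every genre it lists, the per-genre counts add up to more than the number of movies.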
2) Movie Duration Analysis:
Task: Analyze the distribution of movie durations and identify the relationship between
movie duration and IMDB score.
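
For a straight-line trendline with a single predictor, the R^2 value equals the square of the Pearson correlation coefficient between the two columns. The sketch below illustrates that relationship in Python. The duration and score values are hypothetical and are not drawn from the dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical durations (minutes) and IMDb scores.
durations = [90, 100, 110, 120, 130]
scores = [6.0, 6.5, 6.2, 7.0, 7.1]

r = pearson_r(durations, scores)
r_squared = r ** 2  # matches the R^2 of a linear trendline through the points
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

A positive r means the trendline slopes upward, and R^2 is the share of the variation in scores that the line accounts for.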
3) Movie Language Analysis:
Task: Determine the most common languages used in movies and analyze their impact
on the IMDB score using descriptive statistics.

4) Movie Director Analysis:
Task: Identify the top directors based on their average IMDB score and analyze their
contribution to the success of movies using percentile calculations.
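
One possible way to sketch the director ranking in Python is shown below: average the scores per director, then express each average as a percentile rank among all directors. The percentile definition used here (share of averages less than or equal to the value) is a simple assumption and may differ from the Excel function used for the report. The directors and scores are hypothetical.

```python
from collections import defaultdict
import statistics

# Hypothetical (director, IMDb score) pairs. Not real data.
movies = [
    ("Director A", 8.5),
    ("Director A", 8.7),
    ("Director B", 6.0),
    ("Director C", 7.0),
    ("Director D", 7.5),
    ("Director E", 5.5),
]


def director_averages(movies):
    """Average IMDb score per director."""
    scores = defaultdict(list)
    for director, score in movies:
        scores[director].append(score)
    return {director: statistics.mean(vals) for director, vals in scores.items()}


def percentile_rank(value, values):
    """Percentage of `values` that are less than or equal to `value`."""
    return 100.0 * sum(v <= value for v in values) / len(values)


averages = director_averages(movies)
all_averages = list(averages.values())
ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
top = [(name, avg, percentile_rank(avg, all_averages)) for name, avg in ranked[:2]]
for name, avg, pct in top:
    print(f"{name}: average {avg:.2f}, percentile rank {pct:.0f}")
```

An average based on a single movie is treated the same as one based on many, so a minimum movie count per director may be worth considering before ranking.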
5) Movie Budget Analysis:
Task: Analyze the correlation between movie budgets and gross earnings, and identify
the movies with the highest profit margin.
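
As a small illustration of the profit calculation, the sketch below computes profit as gross minus budget and margin as profit divided by gross, then ranks by profit. These definitions are assumptions, since the report does not state the formula used, and the movies and amounts are hypothetical.

```python
# Hypothetical movies with budget and gross in the same currency unit.
movies = [
    {"title": "Movie A", "budget": 100.0, "gross": 400.0},
    {"title": "Movie B", "budget": 50.0, "gross": 60.0},
    {"title": "Movie C", "budget": 10.0, "gross": 90.0},
]


def add_profit(movies):
    """Return copies of the movies with 'profit' and 'margin' fields added."""
    result = []
    for movie in movies:
        profit = movie["gross"] - movie["budget"]
        # Margin is undefined when gross is zero, so store None in that case.
        margin = profit / movie["gross"] if movie["gross"] else None
        result.append({**movie, "profit": profit, "margin": margin})
    return result


by_profit = sorted(add_profit(movies), key=lambda m: m["profit"], reverse=True)
for movie in by_profit:
    print(movie["title"], movie["profit"], round(movie["margin"], 3))
```

Ranking by absolute profit and ranking by margin can give different orders: in this sample, Movie A has the largest profit while Movie C has the largest margin.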
The Results Dataset Link:-

https://docs.google.com/spreadsheets/d/1X-ak_kajhbePb1_NzvtnA8kmErCSwUN9yEJqvIdsac0/edit?usp=sharing

I have noticed that:

1) The most common movie genres in the dataset are Drama, Comedy, Thriller and
Action.
2) The average duration of a movie is 109 minutes. The trendline of duration vs.
IMDb score slopes upward with R^2 = 0.131, which indicates a positive but weak
relationship.
3) The most common languages used in the movies are English, French, Spanish, Mandarin
and German. I also observed that Telugu and Persian have the highest average
IMDb scores.
4) Tony Kaye, Charles Chaplin, Alfred Hitchcock, Ron Fricke, Damien Chazelle,
Majid Majidi, Sergio Leone, Christopher Nolan, SS Rajamouli and Richard Marquand
are the top 10 directors, each with an average IMDb score >= 8.4.
5) The top 5 movies with the highest profits are Avatar, Jurassic World, Titanic,
Star Wars: Episode IV - A New Hope and E.T. The Extra-Terrestrial. The correlation
between budget and gross is positive.

RESULTS :-
Through this project, I gained valuable experience in data analysis using statistical
knowledge and Excel's data visualization features. I learnt to apply my data analysis
skills to solving real-life problems.
