SMDM Guided Project Ashish

The document analyzes data from a food delivery service to help the company improve business. It describes the data, performs univariate and multivariate analysis of variables like cuisine, cost, time and rating, and identifies relationships between variables. Key findings are shared to help answer questions on popular cuisine, restaurant demand, and factors affecting customer experience.

Uploaded by

Ashish Kumar

Confidential

Statistical Methods for Decision Making


Project Report

FoodHub Analysis

This file is meant for personal use by kumar.ashish2050@gmail.com only.


Sharing or publishing the contents in part or full is liable for legal action.
Contents

S.no Topics

1 Problem - FoodHub Analysis

1.1 Problem Definition

1.2 Data Overview

1.3 Univariate Analysis

1.4 Multivariate Analysis

1.5 Conclusion and Recommendations

List of Tables

No Name of the Table

1 Top five rows of dataset

2 Basic Information of dataset

3 Statistical summary

4 Restaurant rating

Proprietary content. © Great Learning. All Rights Reserved.

List of Figures

No Name of Figure

1 Univariate Analysis of Cuisine type

2 Univariate Analysis of Cost of the Order

3 Univariate Analysis of Day of the Week

4 Univariate Analysis of Rating

5 Univariate Analysis of Food Preparation Time (Histogram)

6 Univariate Analysis of Food Preparation Time (Boxplot)

7 Univariate Analysis of Delivery Time (Histogram)

8 Univariate Analysis of Delivery Time (Boxplot)

9 Multivariate analysis of Cuisine vs Cost of the Order

10 Multivariate analysis of Cuisine vs Food preparation Time

11 Multivariate analysis of Day of the Week vs Delivery Time

12 Multivariate analysis of Rating vs Delivery Time

13 Multivariate analysis of Rating vs Food Preparation Time

14 Multivariate analysis of Rating vs Cost of the Order

15 Correlation Plot

FoodHub Data Analysis
Problem Definition

Context
The number of restaurants in New York is increasing day by day. Lots of students
and busy professionals rely on those restaurants due to their hectic lifestyles.
Online food delivery service is a great option for them. It provides them with good
food from their favorite restaurants. A food aggregator company FoodHub offers
access to multiple restaurants through a single smartphone app.
The app allows restaurants to receive a direct online order from a customer. The app
assigns a delivery person from the company to pick up the order after it is confirmed
by the restaurant. The delivery person then uses the map to reach the restaurant and
waits for the food package. Once the food package is handed over to the delivery
person, he/she confirms the pick-up in the app and travels to the customer's location
to deliver the food. The delivery person confirms the drop-off in the app after
delivering the food package to the customer. The customer can rate the order in the
app. The food aggregator earns money by collecting a fixed margin of the delivery
order from the restaurants.

Objective
The food aggregator company has stored the data of the different orders made by the
registered customers in their online portal. They want to analyze the data to get a fair
idea about the demand for different restaurants, which will help them enhance
their customer experience. Suppose you are hired as a Data Scientist in this company
and the Data Science team has shared some of the key questions that need to be
answered.
Perform the data analysis to find answers to these questions that will help the
company improve the business.

Data Description
The dataset contains information related to each food order. The detailed data dictionary is
given below.

Data Dictionary

order_id: Unique ID of the order
customer_id: ID of the customer who ordered the food
restaurant_name: Name of the restaurant
cuisine_type: Cuisine ordered by the customer
cost_of_the_order: Cost of the order
day_of_the_week: Indicates whether the order is placed on a weekday or weekend
(The weekday is from Monday to Friday and the weekend is Saturday and Sunday)
rating: Rating given by the customer out of 5
food_preparation_time: Time (in minutes) taken by the restaurant to prepare the
food. This is calculated by taking the difference between the timestamps of the
restaurant's order confirmation and the delivery person's pick-up confirmation.
delivery_time: Time (in minutes) taken by the delivery person to deliver the food
package. This is calculated by taking the difference between the timestamps of the
delivery person's pick-up confirmation and drop-off information

Data Overview

Structure of the data

The data frame has 9 columns as mentioned in the Data Dictionary. Data in each
row corresponds to the order placed by a customer.

Table 1: Top 5 rows of the dataset

The data frame has 1898 rows and 9 columns.

Table 2: Basic information of the dataset

There are a total of 1898 non-null observations in each of the columns.

The dataset contains 9 columns: 4 are of integer type ('order_id',
'customer_id', 'food_preparation_time', 'delivery_time'), 1 is of floating point type
('cost_of_the_order') and 4 are of the general object type ('restaurant_name',
'cuisine_type', 'day_of_the_week', 'rating').
Total memory usage is approximately 133.6 KB.

Missing Value Treatment

There are no missing values in the data.

Statistical Summary

Inspecting the Summary Statistics of the Dataset (Numerical fields)

Table 3: Statistical summary

Observations:
● Order ID and Customer ID are just identifiers for each order.
● The cost of an order ranges from 4.47 to 35.41 dollars, with an average order
costing around 16 dollars and a standard deviation of 7.5 dollars. The cost of
75% of the orders is below 23 dollars. This indicates that most customers
prefer low-cost food compared to expensive ones.
● Food preparation time ranges from 20 to 35 minutes, with an average of
around 27 minutes and a standard deviation of 4.6 minutes. The spread is not
very high for the food preparation time.
● Delivery time ranges from 15 to 33 minutes, with an average of around 24
minutes and a standard deviation of 5 minutes. The spread is not too high
for delivery time either.

How many orders are not rated?

There are 736 orders that are not rated.
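
A count like this can be sketched in plain Python. The snippet below is only an illustration on a few made-up rows, not the FoodHub data. It assumes that unrated orders carry a text label such as 'Not given' in the 'rating' column; the function name `count_unrated` is hypothetical.

```python
def count_unrated(ratings, missing_label="not given"):
    """Count entries equal to the missing-rating label (case-insensitive)."""
    return sum(1 for r in ratings if str(r).strip().lower() == missing_label)


# Made-up sample values, not taken from the dataset.
sample_ratings = ["5", "Not given", "4", "Not given", "3"]
print(count_unrated(sample_ratings))
```

Applied to the full 'rating' column, the same count should reproduce the figure reported above if the label assumption holds.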

Exploratory Data Analysis

Univariate Analysis

Order ID
There are 1898 unique orders. As mentioned earlier, 'order_id' is just an identifier for
the orders.

Customer ID
There are 1200 unique customers. As 'customer_id' is a variable to identify customers,
and the number of unique customer IDs is less than the number of unique order IDs,
we can see that there are some customers who have placed more than one order.

Restaurant Name
There are 178 unique restaurants in the dataset.
Let's check the number of orders served by each restaurant.
The restaurant that has received the maximum number of orders is Shake Shack.

Cuisine Type

Figure 1: Univariate analysis of Cuisine Type

Observation:
There are 14 unique cuisines in the dataset.
The distribution of cuisine types shows that cuisine types are not equally
distributed. The most frequent cuisine type is American followed by Japanese and
Italian.

Vietnamese appears to be the least popular of all the cuisines.

Cost of the Order

Figure 2: Univariate analysis of Cost of the order

Observations:
The average cost of the order is greater than the median cost indicating that the
distribution for the cost of the order is right-skewed.
The mode of distribution indicates that a large chunk of people prefer to order food
that costs around 10-12 dollars.
There are a few orders that cost greater than 30 dollars. These orders might be
for some expensive meals.
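
The mean-versus-median comparison used above can be expressed as a small helper. This is a rough heuristic sketch rather than a formal skewness test, the sample costs are made up, and the name `skew_direction` is hypothetical.

```python
from statistics import mean, median


def skew_direction(values):
    """Give a rough skew hint by comparing the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed"
    if m < med:
        return "left-skewed"
    return "roughly symmetric"


# Made-up order costs with one expensive order pulling the mean up.
sample_costs = [8.0, 10.5, 11.0, 12.0, 12.5, 16.0, 31.0]
print(skew_direction(sample_costs))
```

A single large value is enough to pull the mean above the median, which is the pattern the observation describes for the cost of the order.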

Day of the Week

Figure 3: Univariate analysis of Day of the week

Observations:
The 'day_of_the_week' column consists of 2 unique values: Weekday and Weekend.
The distribution shows that the number of orders placed on weekends is
approximately double the number of orders placed on weekdays.

Rating

Figure 4: Univariate analysis of Rating

Observations:
The distribution of 'rating' shows that the most frequent rating category is 'not
given', followed by a rating of 5.
Only around 200 orders have been rated 3.

Food Preparation Time

Figure 5: Univariate analysis of Food Preparation time (Histogram)

Figure 6: Univariate analysis of Food Preparation time (Boxplot)

Observations:
The average food preparation time is almost equal to the median food preparation time
indicating that the distribution is nearly symmetrical.
The food preparation time is pretty evenly distributed between 20 and 35 minutes.
There are no outliers in this column.

Delivery Time

Figure 7: Univariate analysis of Delivery time (Histogram)

Figure 8: Univariate analysis of Delivery time (Boxplot)

Observations:
The average delivery time is a bit smaller than the median delivery time indicating that
the distribution is a bit left-skewed.
Comparatively more orders have delivery time between 24 and 30 minutes.
There are no outliers in this column.

Answer the Key Questions

Which are the top 5 restaurants in terms of the number of orders received?

The top 5 restaurants by number of orders received are 'Shake Shack',
'The Meatball Shop', 'Blue Ribbon Sushi', 'Blue Ribbon Fried Chicken', and
'Parm'.
Almost 33% of the orders in the dataset are from these restaurants.
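
One way to obtain such a ranking and the combined share is sketched below with the standard library. The restaurant names in the sample are placeholders, and `top_n_with_share` is a hypothetical helper, not part of the original analysis.

```python
from collections import Counter


def top_n_with_share(names, n=5):
    """Return the n most frequent names and their combined share in percent."""
    if not names:
        return [], 0.0
    counts = Counter(names)
    top = counts.most_common(n)
    share = 100 * sum(count for _, count in top) / len(names)
    return top, share


# Placeholder restaurant names, one entry per order.
sample_orders = ["A"] * 4 + ["B"] * 3 + ["C"] * 2 + ["D"]
top, share = top_n_with_share(sample_orders, n=2)
print(top, share)
```

The same helper works for any identifier column, for example to rank customers by the number of orders they placed.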

Which is the most popular cuisine on weekends?

The most popular cuisine type on weekends is American.
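
A minimal sketch of this lookup: filter the orders to one day type, then count cuisines. The rows are invented for illustration, the 'Weekend' and 'Weekday' labels follow the values described for 'day_of_the_week', and `most_popular_cuisine` is a hypothetical name.

```python
from collections import Counter


def most_popular_cuisine(orders, day="Weekend"):
    """orders: iterable of (cuisine_type, day_of_the_week) pairs."""
    counts = Counter(cuisine for cuisine, d in orders if d == day)
    # most_common(1) returns a one-element list of (cuisine, count).
    return counts.most_common(1)[0] if counts else None


# Invented rows, not the FoodHub data.
sample_rows = [
    ("American", "Weekend"), ("Japanese", "Weekend"), ("American", "Weekend"),
    ("Italian", "Weekday"), ("Japanese", "Weekday"), ("Japanese", "Weekday"),
]
print(most_popular_cuisine(sample_rows))
```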

What percentage of the orders cost more than 20 dollars?

There are a total of 555 orders that cost above 20 dollars.


The percentage of such orders in the dataset is around 29.24%.
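
The percentage follows directly from the two counts reported in this document (555 orders above 20 dollars out of 1898 orders). The short sketch below reproduces that arithmetic and also shows a generic helper; `percent_above` is a hypothetical name and the second sample list is made up.

```python
def percent_above(values, threshold):
    """Percentage of values strictly greater than the threshold."""
    if not values:
        return 0.0
    return 100 * sum(1 for v in values if v > threshold) / len(values)


# Arithmetic check using the counts reported in the text.
reported_share = round(100 * 555 / 1898, 2)
print(reported_share)

# Generic use on made-up costs.
print(percent_above([10, 25, 30, 5], 20))
```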

What is the mean order delivery time?

The mean delivery time is around 24.16 minutes.

The company has decided to give 20% discount vouchers to the
top 5 most frequent customers. Find the IDs of these customers
and the number of orders they placed.

A customer with ID 52832 has ordered 13 times.

Multivariate Analysis
Multivariate analysis helps to explore relationships between the important variables in
the dataset. (It is a good idea to explore relations between numerical variables as
well as relations between numerical and categorical variables)

Cuisine vs Cost of the Order

Figure 9: Multivariate analysis of Cuisine vs Cost of the Order


Observations:
Vietnamese and Korean cuisines cost less compared to other cuisines.
The boxplots for Italian, American, Chinese, and Japanese cuisines are quite similar.
This indicates that the quartile costs for these cuisines are quite similar.

Outliers are present for the cost of Korean, Mediterranean, and
Vietnamese cuisines.
French and Spanish cuisines are costlier compared to other cuisines.

Cuisine vs Food Preparation Time


Figure 10: Multivariate analysis of Cuisine vs Food preparation Time

Observations:
Food preparation time is very consistent for most cuisines.
The median food preparation time lies between 24 and 30 minutes for all the cuisines.
Outliers are present for the food preparation time of Korean cuisine.
Korean cuisine takes less time compared to other cuisines.

Day of the Week vs Delivery Time

Figure 11: Multivariate analysis of Day of the Week vs Delivery Time

Observations:
The delivery time for all the orders over the weekends is less compared to
weekdays. This could be due to the dip in traffic over the weekends.

Revenue Generated by the Restaurants

Observations:
The 14 restaurants displayed each generate more than 500 dollars in revenue.

Rating vs Delivery Time

Figure 12: Multivariate analysis of Rating vs Delivery Time

Observations:
It is possible that delivery time plays a role in the low rating of the orders.

Rating vs Food Preparation Time

Figure 13: Multivariate analysis of Rating vs Food Preparation Time

Observations:
It seems that food preparation time does not play a role in the low rating of the orders.

Rating vs Cost of the Order

Figure 14: Multivariate analysis of Rating vs Cost of the Order
Observations:
It seems that high-cost orders have been rated well and low-cost orders have not been
rated.

Correlation among Variables

Figure 15: Correlation Plot

Observations:
There is no correlation between the cost of the order, delivery time, and food
preparation time.
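
For reference, the quantity behind a correlation plot of this kind is usually the Pearson correlation coefficient, although the report does not state which measure was used. The sketch below computes it by hand for two numeric columns; the function name `pearson` and the sample values are illustrative only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Perfectly linear made-up data gives a coefficient of +1 or -1.
print(pearson([1, 2, 3], [2, 4, 6]))
print(pearson([1, 2, 3], [3, 2, 1]))
```

Values close to 0 for each pair of columns would be consistent with the observation above.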

Answer Key Questions

The company wants to provide a promotional offer in the
advertisement of the restaurants. The condition to get the offer is that
the restaurants must have a rating count of more than 50 and the
average rating should be greater than 4. Find the restaurants fulfilling
the criteria to get the promotional offer.

In order to find the restaurants that fulfill the criteria, we need to filter the data as
follows:

1. First, find the restaurants that have more than 50 ratings

2. Once we have the restaurants that have more than 50 ratings, sort them
in descending order of their average rating and keep those with an average
rating greater than 4

Table 4: Restaurant rating

The restaurants fulfilling the criteria to get the promotional offer are: 'The Meatball
Shop', 'Blue Ribbon Fried Chicken', 'Shake Shack', and 'Blue Ribbon Sushi'.
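
The two-step filter can be sketched as follows. This is a minimal illustration on made-up rows, assuming unrated orders carry a text label such as 'Not given' and are excluded from the rating count; `eligible_restaurants` and its parameters are hypothetical names.

```python
from collections import defaultdict
from statistics import mean


def eligible_restaurants(rows, min_count=50, min_avg=4.0,
                         missing_label="not given"):
    """rows: iterable of (restaurant_name, rating) pairs; rating may be text."""
    ratings = defaultdict(list)
    for name, rating in rows:
        if str(rating).strip().lower() == missing_label:
            continue  # assumption: unrated orders do not count as ratings
        ratings[name].append(float(rating))
    result = [
        (name, len(vals), mean(vals))
        for name, vals in ratings.items()
        if len(vals) > min_count and mean(vals) > min_avg
    ]
    # Highest average rating first.
    return sorted(result, key=lambda item: item[2], reverse=True)


# Made-up rows; a small threshold is used so the tiny sample can qualify.
sample_rows = [("A", "5"), ("A", "4"), ("A", "Not given"),
               ("B", "3"), ("B", "4"), ("C", "5")]
print(eligible_restaurants(sample_rows, min_count=1))
```

With the default thresholds (more than 50 ratings, average above 4) the function applies the criteria stated in the question.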

The company charges the restaurant 25% on orders having cost
greater than 20 dollars and 15% on orders having cost greater than 5
dollars. Find the net revenue generated by the company across all
orders.

In order to find the net revenue generated by the company, we need to follow these steps:
1. First, compute the charge on each order as per the requirement
2. Finally, find the sum of the charges across all orders.

The net revenue generated on all the orders given in the dataset is around 6166.3
dollars.
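
The charging rule can be written as a small function. The sketch assumes that orders costing 5 dollars or less carry no charge, which the question implies but does not state; the names `commission` and `net_revenue` are hypothetical and the sample costs are made up.

```python
def commission(cost):
    """Charge on one order: 25% above 20 dollars, 15% above 5 dollars."""
    if cost > 20:
        return 0.25 * cost
    if cost > 5:
        return 0.15 * cost
    return 0.0  # assumption: no charge on orders of 5 dollars or less


def net_revenue(costs):
    """Sum of the charges over all orders."""
    return sum(commission(c) for c in costs)


# Made-up order costs: one in each tier.
sample_costs = [30.0, 10.0, 4.0]
print(net_revenue(sample_costs))
```

The order of the two checks matters: an order above 20 dollars is also above 5 dollars, so the higher tier has to be tested first.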

The company wants to analyze the total time required to deliver the
food. What percentage of orders take more than 60 minutes to get
delivered from the time the order is placed? (The food has to be
prepared and then delivered.)

In order to find the percentage of orders that take more than 60 minutes to get
delivered from the time the order is placed, follow the steps,
1. First, find the total time by adding food preparation time and delivery time
2. Then, find the percentage of orders that have more than 60 minutes of
total delivery time

Approximately 10.54% of the total orders have more than 60 minutes of total
delivery time.
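
The two steps above can be sketched in a few lines. The times in the sample are made up, and `percent_over_total` is a hypothetical helper name.

```python
def percent_over_total(prep_times, delivery_times, limit=60):
    """Percentage of orders whose preparation plus delivery time exceeds limit."""
    totals = [p + d for p, d in zip(prep_times, delivery_times)]
    if not totals:
        return 0.0
    return 100 * sum(1 for t in totals if t > limit) / len(totals)


# Made-up times in minutes; the totals are 65, 35, 60 and 61.
sample_prep = [35, 20, 30, 28]
sample_delivery = [30, 15, 30, 33]
print(percent_over_total(sample_prep, sample_delivery))
```

Note that an order taking exactly 60 minutes is not counted, since the question asks for orders taking more than 60 minutes.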

The company wants to analyze the delivery time of the orders
on weekdays and weekends. How does the mean delivery time
vary during weekdays and weekends?

In order to find the mean delivery time during weekdays and weekends, follow the steps,
1. First, get the mean delivery time on weekdays
2. Then, get the mean delivery time on weekends

The mean delivery time on weekdays is around 28 minutes whereas the mean delivery
time on weekends is around 22 minutes.
This could be due to the dip in traffic volume on the weekends.
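
A grouped mean of this kind can be sketched with the standard library. The rows below are invented so that the group means are easy to check by hand; `mean_delivery_by_day` is a hypothetical name.

```python
from collections import defaultdict
from statistics import mean


def mean_delivery_by_day(rows):
    """rows: iterable of (day_of_the_week, delivery_time) pairs."""
    groups = defaultdict(list)
    for day, minutes in rows:
        groups[day].append(minutes)
    return {day: mean(values) for day, values in groups.items()}


# Invented rows, not the FoodHub data.
sample_rows = [("Weekday", 30), ("Weekday", 26),
               ("Weekend", 20), ("Weekend", 24)]
print(mean_delivery_by_day(sample_rows))
```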

Actionable Insights and Recommendations

Actionable Insights:
● Around 80% of the orders are for American, Japanese, Italian, and Chinese
cuisines. Thus, it seems that these cuisines are quite popular among
customers of FoodHub.
● Shake Shack is the most popular restaurant, having received the highest
number of orders.
● Order volumes increase on the weekends compared to the weekdays.
● Delivery time over the weekends is less compared to the weekdays. This could
be due to the dip in traffic volume over the weekends.
● Around 39% of the orders have not been rated.

Business Recommendations:
● FoodHub should integrate with restaurants serving American, Japanese,
Italian, and Chinese cuisines as these cuisines are very popular among
FoodHub customers.
● FoodHub should provide promotional offers to top-rated popular restaurants
like Shake Shack that serve most of the orders.
● As the order volume is high during the weekends, more delivery persons
should be employed during the weekends to ensure timely delivery of the
order. Weekend promotional offers should be given to the customers to
increase the food orders during weekends.
● Customer Rating is a very important factor to gauge customer satisfaction. The
company should investigate the reason behind the low count of ratings. They
can redesign the rating page in the app and make it more interactive to lure
the customers to rate the order.
● Around 11% of the total orders have more than 60 minutes of total delivery
time. FoodHub should try to minimize such instances in order to avoid
customer dissatisfaction. They can provide some reward to the punctual
delivery persons.
