SMDM Guided Project Ashish
FoodHub Analysis
Confidential
Context
The number of restaurants in New York is increasing day by day. Lots of students
and busy professionals rely on those restaurants due to their hectic lifestyles.
Online food delivery service is a great option for them. It provides them with good
food from their favorite restaurants. A food aggregator company FoodHub offers
access to multiple restaurants through a single smartphone app.
The app allows restaurants to receive a direct online order from a customer. The app
assigns a delivery person from the company to pick up the order after it is confirmed
by the restaurant. The delivery person then uses the map to reach the restaurant and
waits for the food package. Once the food package is handed over to the delivery
person, he/she confirms the pick-up in the app and travels to the customer's location
to deliver the food. The delivery person confirms the drop-off in the app after
delivering the food package to the customer. The customer can rate the order in the
app. The food aggregator earns money by collecting a fixed margin of the delivery
order from the restaurants.
Objective
The food aggregator company has stored the data of the different orders made by the
registered customers in their online portal. They want to analyze the data to get a fair
idea about the demand of different restaurants which will help them in enhancing
their customer experience. Suppose you are hired as a Data Scientist in this company
and the Data Science team has shared some of the key questions that need to be
answered.
Perform the data analysis to find answers to these questions that will help the
company to improve the business.
Data Dictionary
order_id: Unique ID of the order
customer_id: ID of the customer who ordered the food
restaurant_name: Name of the restaurant
cuisine_type: Cuisine ordered by the customer
cost_of_the_order: Cost of the order
day_of_the_week: Indicates whether the order is placed on a weekday or weekend
(The weekday is from Monday to Friday and the weekend is Saturday and Sunday)
rating: Rating given by the customer out of 5
food_preparation_time: Time (in minutes) taken by the restaurant to prepare the
food. This is calculated by taking the difference between the timestamps of the
restaurant's order confirmation and the delivery person's pick-up confirmation.
delivery_time: Time (in minutes) taken by the delivery person to deliver the food
package. This is calculated by taking the difference between the timestamps of the
delivery person's pick-up confirmation and drop-off confirmation.
The data frame has 9 columns as mentioned in the Data Dictionary. Data in each
row corresponds to the order placed by a customer.
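As a minimal sketch of this structure, the snippet below parses a two-row sample laid out like the Data Dictionary and checks that every row carries the 9 expected columns. The sample values and the `load_orders` helper are made up for illustration, and the in-memory text stands in for the real data file, whose name is not given here.

```python
import csv
import io

# Column names taken from the Data Dictionary.
EXPECTED_COLUMNS = [
    "order_id", "customer_id", "restaurant_name", "cuisine_type",
    "cost_of_the_order", "day_of_the_week", "rating",
    "food_preparation_time", "delivery_time",
]

# Hypothetical sample rows; the values are illustrative, not from the dataset.
SAMPLE_CSV = """order_id,customer_id,restaurant_name,cuisine_type,cost_of_the_order,day_of_the_week,rating,food_preparation_time,delivery_time
1,101,Restaurant A,American,12.50,Weekend,5,25,20
2,102,Restaurant B,Japanese,30.75,Weekday,Not given,30,28
"""


def load_orders(text):
    """Parse CSV text into a list of dicts, one dict per order."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return list(reader)


orders = load_orders(SAMPLE_CSV)
print(len(orders), "orders,", len(orders[0]), "columns")
```

All values are read as text here, so numeric columns such as 'cost_of_the_order' would need converting before any arithmetic.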
Missing Value Treatment
Table 3: Statistical summary
Observations:
● Order ID and Customer ID are just identifiers for each order.
● The cost of an order ranges from 4.47 to 35.41 dollars, with an average order
costing around 16 dollars and a standard deviation of 7.5 dollars. The cost of
75% of the orders is below 23 dollars. This indicates that most customers
prefer low-cost food compared to expensive ones.
● Food preparation time ranges from 20 to 35 minutes, with an average of
around 27 minutes and a standard deviation of 4.6 minutes. The spread is not
very high for the food preparation time.
● Delivery time ranges from 15 to 33 minutes, with an average of around 24
minutes and a standard deviation of 5 minutes. The spread is not too high
for delivery time either.
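The figures quoted in these observations (minimum, maximum, mean, standard deviation, and 75th percentile) can be reproduced for any numeric column. The sketch below uses Python's standard `statistics` module on a small list of made-up costs; the `summarize` helper and the sample values are assumptions for illustration, and the quartile method chosen here may differ slightly from the one behind Table 3.

```python
import statistics


def summarize(values):
    """Return min, max, mean, sample standard deviation, and 75th percentile."""
    # quantiles(n=4) returns the three quartile cut points.
    _, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "p75": q3,
    }


# Hypothetical order costs in dollars (illustrative only).
costs = [5.0, 10.0, 12.0, 15.0, 30.0]
summary = summarize(costs)
print(summary)
```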
Univariate Analysis
Order ID
There are 1898 unique orders. As mentioned earlier, 'order_id' is just an identifier for
the orders.
Customer ID
There are 1200 unique customers. As 'customer_id' is a variable to identify customers,
and the number of unique customer IDs is less than the number of unique order IDs,
we can see that there are some customers who have placed more than one order.
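The reasoning above can be sketched in a few lines: count orders per customer, then keep the customers with more than one order. The (order_id, customer_id) pairs below are hypothetical and only illustrate the comparison.

```python
from collections import Counter

# Hypothetical (order_id, customer_id) pairs; illustrative only.
orders = [(1, 101), (2, 102), (3, 101), (4, 103), (5, 101)]

unique_orders = len({order_id for order_id, _ in orders})
orders_per_customer = Counter(customer_id for _, customer_id in orders)
unique_customers = len(orders_per_customer)

# Customers with more than one order explain the gap between the two counts.
repeat_customers = {c: n for c, n in orders_per_customer.items() if n > 1}
print(unique_orders, unique_customers, repeat_customers)
```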
Restaurant Name
There are 178 unique restaurants in the dataset.
Let's check the number of orders that get served by the restaurants.
The restaurant that has received the maximum number of orders is Shake Shack.
Cuisine Type
Vietnamese appears to be the least popular of all the cuisines.
Observations:
The average cost of the order is greater than the median cost indicating that the
distribution for the cost of the order is right-skewed.
The mode of distribution indicates that a large chunk of people prefer to order food
that costs around 10-12 dollars.
There are a few orders that cost greater than 30 dollars. These orders might be
for some expensive meals.
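The mean-versus-median comparison used above is a rule of thumb rather than a formal test of skewness. A minimal sketch of it follows; the `skew_direction` helper and the sample costs are assumptions for illustration.

```python
import statistics


def skew_direction(values):
    """Compare mean and median as a rough indicator of skew."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right-skewed"
    if mean < median:
        return "left-skewed"
    return "roughly symmetric"


# Hypothetical order costs: one expensive order pulls the mean above the median.
costs = [8.0, 10.0, 11.0, 12.0, 12.0, 14.0, 31.0]
print(skew_direction(costs))
```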
Figure 3: Univariate analysis of Day of the week
Observations:
The 'day_of_the_week' column consists of 2 unique values: Weekday and Weekend.
The distribution shows that the number of orders placed on weekends is
approximately double the number of orders placed on weekdays.
Observations:
The distribution of 'rating' shows that the most frequent rating category is 'not
given', followed by a rating of 5.
Only around 200 orders have been rated 3.
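A frequency count of the 'rating' column is enough to reproduce this kind of observation. The sketch below assumes unrated orders are stored as the text 'Not given' (the exact label in the data is an assumption) and uses made-up ratings.

```python
from collections import Counter

# Hypothetical ratings; 'Not given' marks unrated orders (label assumed).
ratings = ["Not given"] * 4 + ["5"] * 3 + ["4"] * 2 + ["3"]

rating_counts = Counter(ratings)
most_frequent, _ = rating_counts.most_common(1)[0]
percent_not_rated = 100 * rating_counts["Not given"] / len(ratings)
print(rating_counts, most_frequent, percent_not_rated)
```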
Observations:
The average food preparation time is almost equal to the median food preparation time
indicating that the distribution is nearly symmetrical.
The food preparation time is pretty evenly distributed between 20 and 35 minutes.
There are no outliers in this column.
Delivery Time
Observations:
The average delivery time is a bit smaller than the median delivery time indicating that
the distribution is a bit left-skewed.
Comparatively more orders have delivery time between 24 and 30 minutes.
There are no outliers in this column.
The top 5 restaurants by number of orders are 'Shake Shack', 'The Meatball Shop',
'Blue Ribbon Sushi', 'Blue Ribbon Fried Chicken', and 'Parm'.
Almost 33% of the orders in the dataset are from these restaurants.
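A ranking and share of this kind can be computed by counting orders per restaurant and dividing the top counts by the total. The sketch below uses placeholder restaurant names and a hypothetical `top_share` helper.

```python
from collections import Counter


def top_share(restaurant_names, n=5):
    """Return the n most-ordered restaurants and their combined share (%)."""
    counts = Counter(restaurant_names)
    top = counts.most_common(n)
    share = 100 * sum(count for _, count in top) / len(restaurant_names)
    return top, share


# Hypothetical order log: one restaurant name per order.
names = ["A"] * 4 + ["B"] * 3 + ["C"] * 2 + ["D"]
top, share = top_share(names, n=2)
print(top, share)
```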
The most popular cuisine type on weekends is American.
Multivariate Analysis
Multivariate analysis helps to explore relationships between the important variables in
the dataset. (It is a good idea to explore relations between numerical variables as
well as relations between numerical and categorical variables)
Observations:
Food preparation time is very consistent for most cuisines.
The median food preparation time lies between 24 and 30 minutes for all the cuisines.
Outliers are present for the food preparation time of Korean cuisine.
Korean cuisine takes less time compared to other cuisines.
Figure 11: Multivariate analysis of Day of the Week vs Delivery Time
Observations:
The delivery time for all the orders over the weekends is less compared to
weekdays. This could be due to the dip in traffic over the weekends.
Observations:
The 14 restaurants displayed are generating more than 500 dollars in revenue.
Figure 12: Multivariate analysis of Rating vs Delivery Time
Observations:
It is possible that delivery time plays a role in the low rating of the orders.
Confidential
Figure 14: Multivariate analysis of Rating vs Cost of the Order
Observations:
It seems that high-cost orders have been rated well and low-cost orders have not been
rated.
Figure 15: Correlation Plot
Observations:
There is no correlation between the cost of the order, delivery time, and food
preparation time.
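A correlation plot of this kind is typically built from pairwise Pearson coefficients, where values near 0 indicate little or no linear relationship. Which coefficient was used for Figure 15 is not stated, so the sketch below is an assumption; the `pearson` helper and the sample values are illustrative only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: delivery time does not move with cost here.
cost = [10, 20, 30, 40]
delivery_time = [20, 25, 25, 20]
print(pearson(cost, delivery_time))
```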
average rating should be greater than 4. Find the restaurants fulfilling
the criteria to get the promotional offer.
To find the restaurants that fulfill the criteria, we need to filter the data as
follows:
The restaurants fulfilling the criteria to get the promotional offer are: 'The Meatball
Shop', 'Blue Ribbon Fried Chicken', 'Shake Shack', and 'Blue Ribbon Sushi'.
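The filtering code itself is not shown, so the following is only a sketch of the one criterion stated in full above, an average rating greater than 4. Any further condition from the truncated part of the question is not reproduced. Unrated orders are assumed to carry the text 'Not given' and are skipped; the helper name and sample data are hypothetical.

```python
from collections import defaultdict


def restaurants_with_avg_rating_above(orders, threshold=4.0):
    """Return restaurants whose mean rating (rated orders only) exceeds threshold."""
    ratings = defaultdict(list)
    for name, rating in orders:
        if rating != "Not given":  # label for unrated orders is an assumption
            ratings[name].append(int(rating))
    return sorted(
        name for name, values in ratings.items()
        if sum(values) / len(values) > threshold
    )


# Hypothetical (restaurant_name, rating) pairs; illustrative only.
orders = [
    ("Restaurant A", "5"), ("Restaurant A", "4"), ("Restaurant A", "Not given"),
    ("Restaurant B", "3"), ("Restaurant B", "4"),
    ("Restaurant C", "Not given"),
]
eligible = restaurants_with_avg_rating_above(orders)
print(eligible)
```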
The net revenue generated on all the orders given in the dataset is around 6166.3
dollars.
The company wants to analyze the total time required to deliver the
food. What percentage of orders take more than 60 minutes to get
delivered from the time the order is placed? (The food has to be
prepared and then delivered.)
To find the percentage of orders that take more than 60 minutes to get
delivered from the time the order is placed, follow these steps:
1. First, find the total time by adding food preparation time and delivery time
2. Then, find the percentage of orders that have more than 60 minutes of
total delivery time
Approximately 10.54% of the total orders have more than 60 minutes of total
delivery time.
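The two steps above can be sketched as follows: add the two times for each order, then take the share of orders whose total is strictly greater than 60 minutes. The `percent_over` helper and the sample times are assumptions for illustration.

```python
def percent_over(orders, limit=60):
    """Percentage of orders whose prep + delivery time exceeds limit minutes."""
    # Step 1: total time = food preparation time + delivery time.
    total_times = [prep + delivery for prep, delivery in orders]
    # Step 2: share of orders strictly above the limit.
    over = sum(1 for total in total_times if total > limit)
    return 100 * over / len(total_times)


# Hypothetical (food_preparation_time, delivery_time) pairs in minutes.
orders = [(35, 30), (20, 15), (30, 31), (25, 24), (35, 25)]
print(percent_over(orders))
```

An order totalling exactly 60 minutes, such as the last pair, is not counted, because the question asks for more than 60 minutes.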
To find the mean delivery time on weekdays and weekends, follow these steps:
1. First, get the mean delivery time on weekdays
2. Then, get the mean delivery time on weekends
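Both means can be obtained in one pass by grouping delivery times on 'day_of_the_week'. The sketch below uses made-up values and a hypothetical `mean_delivery_by_day` helper.

```python
from collections import defaultdict
from statistics import mean


def mean_delivery_by_day(orders):
    """Mean delivery time (minutes) for each day_of_the_week value."""
    groups = defaultdict(list)
    for day, delivery_time in orders:
        groups[day].append(delivery_time)
    return {day: mean(times) for day, times in groups.items()}


# Hypothetical (day_of_the_week, delivery_time) pairs; illustrative only.
orders = [
    ("Weekday", 28), ("Weekday", 30),
    ("Weekend", 20), ("Weekend", 24), ("Weekend", 22),
]
means = mean_delivery_by_day(orders)
print(means)
```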
Actionable Insights:
● Around 80% of the orders are for American, Japanese, Italian, and Chinese
cuisines. Thus, it seems that these cuisines are quite popular among
customers of FoodHub.
● Shake Shack is the most popular restaurant, having received the highest
number of orders.
● Order volumes increase on the weekends compared to the weekdays.
● Delivery time over the weekends is less compared to the weekdays. This could
be due to the dip in traffic volume over the weekends.
● Around 39% of the orders have not been rated.
Business Recommendations:
● FoodHub should integrate with restaurants serving American, Japanese,
Italian, and Chinese cuisines as these cuisines are very popular among
FoodHub customers.
● FoodHub should provide promotional offers to top-rated popular restaurants
like Shake Shack that serve most of the orders.
● As the order volume is high during the weekends, more delivery persons
should be employed during the weekends to ensure timely delivery of the
order. Weekend promotional offers should be given to the customers to
increase the food orders during weekends.
● Customer Rating is a very important factor to gauge customer satisfaction. The
company should investigate the reason behind the low count of ratings. They
can redesign the rating page in the app and make it more interactive to lure
the customers to rate the order.
● Around 11% of the total orders have more than 60 minutes of total delivery
time. FoodHub should try to minimize such instances in order to avoid
customer dissatisfaction. They can provide some reward to the punctual
delivery persons.