
Data Analysis

This report analyzes retail sales data from a supermarket chain to improve decision-making through data analytics. It covers the methodology used for data preprocessing, exploratory analysis, forecasting, regression, and customer segmentation, revealing insights on sales trends, store performance, and customer behavior. Recommendations include better inventory management, targeted marketing strategies, and improved data collection to enhance business operations.

Uploaded by

sahilsharma9068730737


Project

Student Names - Seenu Rani (SPI230468)
Bipin Panthi (SPI240561)
Bikash Khatri (SPI230852)
Manish (SPI240545)

Subject - MDS610, Decision Making for Analytics

Date - 24 May, 2025

Data Analytics Report
Retail Sales Forecasting for a Chain of Supermarkets

1. Introduction
Background of the Dataset
Today, stores use data to make better decisions, serve customers, and run more smoothly. Large
store chains with shops in many places collect a lot of data every day. This data comes from
sales, customer interactions, stock checks, and discount campaigns. Studying it helps stores
decide what to sell, how much to charge, how much stock to keep, and how to advertise to
different kinds of customers.

This report uses a made-up but realistic dataset to show how a supermarket chain might operate
in both cities and villages. The data is modeled on real-life examples from websites such as
Kaggle and UCI, which share datasets for learning. It includes over 100,000 sales made during
one year and shows shopping habits, seasonal changes, and how discounts can affect buying.

Each row in the dataset shows one shopping transaction and includes the following important
details:

• Date of purchase: Shows the exact day of each purchase. This helps us see patterns over
days, weeks, months, and seasons.

• Store location and type: Tells us whether the store is in a city or a village, and in which
city. This helps compare how well stores are doing in different places.

• Product category: Products are grouped into types such as groceries, personal care,
electronics, clothes, and household items. This helps us see which product types make
the most money and which are most affected by sales and offers.

• Units sold and money earned: Shows how many items were sold and how much money
was made from them. This helps measure store performance, how fast products sell,
and how much money is made per customer.

• Discounts: Shows whether a product was sold at a lower price and how big the discount
was. This helps us understand whether sales and offers really work.

• Customer details: Basic customer information such as age group, gender, and whether
they are a member of the loyalty program. This helps group customers and study how
differently people shop.

Using this kind of dataset is helpful because it resembles the real data that medium and large
stores use every day. Big retail companies around the world track this kind of information.
It helps store managers make smart decisions based on data rather than guesswork.

One great thing about this dataset is that it has many kinds of data. It has numbers, like how
many items were sold and how much money was made, and it also has categories, like store
location, product type, and customer type. Because of this, we can study the data in many ways,
such as looking for patterns, making predictions, and grouping similar things.

From a business point of view, this data can help answer important questions like:
• How do customer choices change by age, gender, or location?

• Which stores are doing the best and why?

• Do discounts help increase sales?

• Which products sell well, and which don’t?

This report will look at the data step by step, using modern tools to show how a store can use its
data to improve and make better decisions.

Purpose and Objectives of the Analysis

The main purpose of this report is to show how using data can help stores make smarter
decisions. The goals of this analysis are:
• To find out how sales change over time and across store locations.
• To see how offers and prices affect product sales.
• To forecast future sales from past data so stores can plan stock and staff.
• To understand different types of customers so stores can do better, more personal
marketing.
• To give useful suggestions to help the business grow.
This analysis will help store managers and business leaders make smart choices based on
data. It can help increase profits, lower costs, and make customers happier.

2. Methodology
Analytical Workflow and Tools
To get useful information from the data, we followed a clear step-by-step process used in the
data industry. We drew on CRISP-DM (Cross-Industry Standard Process for Data Mining), a
common framework for guiding data projects. We used both statistics and machine learning
models to answer the business questions listed earlier.
The goal was to find patterns, understand how different variables in the data are connected,
and help the retail business make smart decisions based on facts.

1. Data Preprocessing and Cleaning

The first step in our analysis was to get the data ready by fixing mistakes and making sure
that the data was correct. Here is what we did:

• Removed Duplicates: We deleted repeated records so that they would not distort the
results.

• Fixed Missing Data: Some parts of the data were empty, like customer details or
discount info. We filled missing numbers with the average, and for missing
categories (like gender) we used the most common value. If too much data was missing
in a row, we removed that row.

• Made Categories Consistent: We cleaned up category names such as store type (city or
village), product type, and gender so that the same value was always written the same
way. We then converted these labels into numbers using simple encoding methods so
that the models could use them.

• Changed Date Format: We converted the date field into a date type that lets us easily
extract the day, month, or quarter. This helped us study sales over time and see trends.
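
The cleaning steps above can be sketched in plain Python. This is only an illustration and not the code used for the report, which relied on Pandas. The field names (`date`, `store_type`, `gender`, `discount_pct`), the sample rows, and the rule of dropping a row when more than two fields are empty are all assumptions made for this example.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical raw rows; field names and values are assumptions for illustration.
raw_rows = [
    {"date": "2024-12-23", "store_type": "City ", "gender": "F", "discount_pct": 10.0},
    {"date": "2024-12-23", "store_type": "City ", "gender": "F", "discount_pct": 10.0},  # duplicate
    {"date": "2024-08-15", "store_type": "village", "gender": None, "discount_pct": None},
    {"date": "2024-01-05", "store_type": "CITY", "gender": "M", "discount_pct": 20.0},
    {"date": "2024-03-09", "store_type": "Village", "gender": "F", "discount_pct": 30.0},
    {"date": None, "store_type": None, "gender": None, "discount_pct": 5.0},  # mostly empty
]

MAX_MISSING = 2  # assumed threshold for "too much data missing in a row"
STORE_CODES = {"city": 0, "village": 1}  # assumed label-to-number encoding


def clean(rows):
    # 1. Remove exact duplicates while keeping the original order.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # 2. Drop rows with too many empty fields.
    kept = [r for r in unique if sum(v is None for v in r.values()) <= MAX_MISSING]

    # 3. Work out the fill values: mean for numbers, most common value for categories.
    mean_discount = mean(r["discount_pct"] for r in kept if r["discount_pct"] is not None)
    top_gender = Counter(r["gender"] for r in kept if r["gender"] is not None).most_common(1)[0][0]

    for r in kept:
        if r["discount_pct"] is None:
            r["discount_pct"] = mean_discount
        if r["gender"] is None:
            r["gender"] = top_gender
        # 4. Make category labels consistent, then encode them as numbers.
        r["store_type"] = r["store_type"].strip().lower()
        r["store_code"] = STORE_CODES[r["store_type"]]
        # 5. Parse the date so that month and quarter are easy to extract.
        parsed = datetime.strptime(r["date"], "%Y-%m-%d")
        r["month"] = parsed.month
        r["quarter"] = (parsed.month - 1) // 3 + 1
    return kept


cleaned = clean(raw_rows)
```

With these sample rows, the duplicate and the mostly empty row are dropped, and the village row gets the average discount and the most common gender.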

2. Exploratory Data Analysis (EDA)

In this step, we looked closely at the data to understand it better. This helped us find
unusual values (outliers), spot patterns, and come up with ideas for deeper study.

• Basic Statistics: We used Python tools like Pandas and NumPy to find the average, the
middle value (median), how much the values vary, and how often each value appears.

• Graphs and Charts: We used tools like Seaborn and Matplotlib to make:

◦ Bar charts to compare product categories

◦ Line graphs to see how sales changed over time

◦ Box plots to see how revenue varied by age or gender

◦ Heatmaps to show how different numeric variables are correlated

From these graphs, we learned that city stores sold more electronics, while village stores
made more money from groceries.
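
As a small illustration of the basic statistics step, the sketch below computes the same kind of summary with Python's standard library instead of Pandas. The revenue values and category labels are invented for the example, and the outlier rule (1.5 times the interquartile range, the usual box-plot convention) is our assumption, since the report does not say which rule was applied.

```python
from collections import Counter
from statistics import mean, median, pstdev

# Hypothetical per-transaction revenue ($) and product categories.
revenue = [12.0, 18.0, 25.0, 31.0, 38.0, 44.0, 400.0]
categories = ["groceries", "groceries", "electronics", "apparel",
              "groceries", "household", "electronics"]

summary = {
    "mean": mean(revenue),          # average
    "median": median(revenue),      # middle value
    "std": pstdev(revenue),         # how much the values vary
    "counts": Counter(categories),  # how often each category appears
}


def iqr_outliers(values):
    """Return values outside 1.5 * IQR of the quartiles (box-plot style rule)."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[: n // 2])        # median of the lower half
    q3 = median(ordered[(n + 1) // 2:])   # median of the upper half
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


outliers = iqr_outliers(revenue)
```

Here the single $400 transaction pulls the mean far above the median, which is exactly the kind of value a box plot would flag.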

3. Time Series Forecasting

To help the stores plan stock and staff, we studied how sales changed over time using
three methods:

• ARIMA Model: This method predicts future sales from past trends and patterns.

• Exponential Smoothing: This reduces short-term ups and downs, which makes seasonal
effects (like higher sales during holidays) easier to see.

• STL (Seasonal-Trend decomposition using Loess): This method broke the sales data into
three parts:

◦ Overall trend (long-term rise or fall)

◦ Seasonal pattern (like more sales during festivals)

◦ Random changes (ups and downs we can’t explain)

These tools helped us understand how sales change over time and plan better for the
future.
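
To make the smoothing idea concrete, here is a minimal sketch of simple exponential smoothing in plain Python. It is the most basic form of the method and has no seasonal terms; seasonal variants such as Holt-Winters add further components. The monthly figures and the smoothing factor `alpha` are assumptions for illustration and are not taken from the report's dataset.

```python
def simple_exp_smoothing(series, alpha):
    """Return the smoothed series; its last value is the one-step-ahead forecast."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]  # start from the first observation
    for value in series[1:]:
        # New level = weighted mix of the latest value and the previous level.
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical monthly sales in thousands of dollars.
monthly_sales = [400, 420, 410, 450, 470]
smoothed = simple_exp_smoothing(monthly_sales, alpha=0.5)
next_month_forecast = smoothed[-1]
```

A larger `alpha` follows recent months more closely, while a smaller one gives a smoother line that reacts more slowly.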

4. Regression Analysis
To find out which factors affect sales, we used the following methods:

• Linear Regression: This helped us see how factors like discounts, customer details
(like age or gender), and store location affect sales and the number of items sold.

• Multivariate Regression: This is a more advanced version where we looked at many
factors at the same time. It helped us understand how these factors work together and
supported better business decisions.
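
The simplest case, one factor and one outcome, can be sketched as an ordinary least squares fit written out by hand. The discount and units-sold values below are invented for illustration; the report's own models were built with Scikit-learn and used more predictors.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: discount (%) and units sold per day for one product.
discount_pct = [0, 5, 10, 15, 20]
units_sold = [100, 112, 118, 131, 139]

intercept, slope = fit_line(discount_pct, units_sold)
predicted_at_12 = intercept + slope * 12  # expected units at a 12% discount
```

In this made-up example the slope says that each extra percentage point of discount goes with about two more units sold per day. With several factors the same least-squares idea applies, but the fit is usually left to a library.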

5. Customer Segmentation (Clustering)

To better understand our customers and improve marketing, we grouped them based on how they
shop:

• We grouped customers based on how much they spend, how often they shop, and what
they buy, using a method called K-Means Clustering.

• We found different types of shoppers (like discount hunters, premium buyers, and bulk
buyers).

This helped the business make better marketing and loyalty plans.
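
The grouping step can be illustrated with a small K-Means loop written in plain Python. This is a sketch under stated assumptions: each customer is described by only two invented features (average basket in dollars and visits per month), the starting centroids are chosen by hand so the result is repeatable, and no feature scaling is done. The report used Scikit-learn, and in practice features on different scales should be standardized first.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: mean of each cluster (an empty cluster keeps its old centroid).
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(old)
        if updated == centroids:  # converged: nothing moved
            break
        centroids = updated
    return labels, centroids


# Hypothetical customers: (average basket in $, visits per month).
customers = [
    (14, 4), (16, 4), (15, 5),   # small baskets, frequent visits
    (57, 1), (59, 1), (58, 2),   # large baskets, rare visits
    (26, 2), (28, 1), (27, 2),   # mid-sized baskets, irregular visits
]
start = [(14, 4), (57, 1), (26, 2)]  # hand-picked starting centroids (assumption)

labels, centers = kmeans(customers, start)
```

Each final centroid summarizes one shopper type, which is how a cluster can be given a label such as "budget" or "premium".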

6. Tools Used
We used both coding tools and spreadsheet software to do the analysis.
• Python Libraries:
◦ Pandas and NumPy for data handling and numerical operations
◦ Seaborn and Matplotlib for advanced data visualization
◦ Scikit-learn for regression and clustering models
◦ Statsmodels for statistical modeling and time series analysis
• Excel: Used at the beginning to look at the data, make quick summaries, and create pivot
tables before doing deeper analysis with code.

3. Results and Discussion

This section explains what we found from studying the data. We look at seasonal trends,
how different stores performed, types of customers, and how accurate our sales predictions are.
The results show useful patterns and give ideas that can support better decision-making.

1. Seasonal and Temporal Patterns

Monthly sales figures highlighted variations throughout the year:
Month      Total Sales ($)   % Change from Previous Month   Key Observations
January    390,000           -                              Post-holiday dip
August     510,000           +21%                           Back-to-school peak
December   655,000           +28%                           Holiday shopping surge

• Sales were higher on weekends than on weekdays. On average, sales were about
$18,500 on weekends and $14,000 on weekdays, a 32% increase in weekend
spending.
• The highest single-day sales came on December 23rd, at $78,000, likely
because of last-minute holiday shopping.
• Sales of stationery and clothes also rose during August and September, which
matches the back-to-school season.
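
The percentages above follow from a simple percentage-change formula, shown here as a short sketch. The weekend, weekday, and December figures come from this section. The implied November value is only back-calculated from the rounded +28% and is not a number reported in the data.

```python
def pct_change(new, old):
    """Percentage change from old to new."""
    return (new - old) / old * 100


weekend_avg = 18_500  # average weekend sales ($), from the report
weekday_avg = 14_000  # average weekday sales ($), from the report

weekend_uplift = pct_change(weekend_avg, weekday_avg)
print(f"Weekend uplift: {weekend_uplift:.1f}%")

# Back-calculating the previous month from December's total and its +28% change.
# This is an implied, approximate value because the percentage is rounded.
december_sales = 655_000
implied_november = december_sales / (1 + 28 / 100)
print(f"Implied November sales: ${implied_november:,.0f}")
```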

2. Store and Product Performance

A comparative evaluation between store types and product categories revealed significant
differences in performance:
Store Performance Summary

Store Type   Avg Monthly Revenue ($)   Avg Transaction Value ($)   Customer Footfall (avg/month)
Urban        450,000                   38                          11,842
Rural        320,000                   31                          10,322

• Urban stores earned about 40.6% more monthly revenue than rural stores. This is likely
because city stores serve more people, who also spend more on each trip.
• City stores also had more customers and a higher average transaction value than rural
stores.
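
The store summary can be checked with a few lines of arithmetic. The sketch below uses the table's figures and assumes that footfall counts paying transactions, so that revenue should be close to average transaction value times footfall. That interpretation is ours and is not stated in the data.

```python
# Figures from the Store Performance Summary table.
stores = {
    "Urban": {"revenue": 450_000, "avg_transaction": 38, "footfall": 11_842},
    "Rural": {"revenue": 320_000, "avg_transaction": 31, "footfall": 10_322},
}

# Revenue gap between store types, as a percentage of rural revenue.
urban_vs_rural = (stores["Urban"]["revenue"] / stores["Rural"]["revenue"] - 1) * 100

# Consistency check (assumes one transaction per counted visit).
implied_revenue = {
    name: s["avg_transaction"] * s["footfall"] for name, s in stores.items()
}

for name, s in stores.items():
    gap = s["revenue"] - implied_revenue[name]
    print(f"{name}: reported {s['revenue']:,}, implied {implied_revenue[name]:,}, gap {gap:,}")
```

Under that assumption the implied revenue lands within a few dollars of the reported monthly figure for both store types, so the three columns are consistent with each other.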
Product Performance Matrix

Category          Units Sold (Monthly)   Revenue ($)   Profit Margin (%)
Groceries         120,000                150,000       12%
Electronics       11,500                 220,000       35%
Household Items   17,000                 135,000       28%
Apparel           23,000                 97,000        22%
• Groceries sold the most units but had the lowest profit margin (12%).

• Electronics and household items had higher profit margins (35% and 28%).

• The best-selling product was “Family Pack Milk 2L”, selling about 3,450 units per
month and bringing in $6,200 every month.
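
To compare categories on the same footing, the table can be turned into revenue per unit and an estimated monthly profit. The sketch below assumes the margin is a share of revenue, so profit is revenue times margin. The report does not give profit in dollars, so these are derived estimates only.

```python
# Monthly figures from the Product Performance Matrix:
# (units sold, revenue in $, profit margin as a fraction).
products = {
    "Groceries": (120_000, 150_000, 0.12),
    "Electronics": (11_500, 220_000, 0.35),
    "Household Items": (17_000, 135_000, 0.28),
    "Apparel": (23_000, 97_000, 0.22),
}

summary = {}
for name, (units, revenue, margin) in products.items():
    summary[name] = {
        "revenue_per_unit": revenue / units,
        "estimated_profit": revenue * margin,  # assumes margin is a share of revenue
    }

most_profitable = max(summary, key=lambda n: summary[n]["estimated_profit"])

for name, row in summary.items():
    print(f"{name}: ${row['revenue_per_unit']:.2f} per unit, "
          f"about ${row['estimated_profit']:,.0f} profit per month")
```

On these estimates, electronics brings in the most profit even though it sells the fewest units, which supports giving it good shelf space.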

3. Customer Segmentation Insights (K-Means Clustering)

Clustering analysis (using K-Means) divided customers into three different behavioral segments:
Cluster      % of Total Customers   Avg Basket ($)   Visit Frequency   Key Behavior
Budget       54%                    15               Weekly            Regular visits, low spend, deal-seeking
Premium      22%                    58               Monthly           Brand loyal, high-value purchases
Occasional   24%                    27               Irregular         Promotion-sensitive, low retention

• Promotions for Cluster 2 (Premium Shoppers) worked well: 18.3% more of them
bought expensive products, which suggests that targeted ads and deals were effective.
• Cluster 1 (Budget Shoppers) spent less per visit, but they made up more than half
of all shoppers, so it is worth keeping them happy with bulk or value deals.

4. Forecast Accuracy and Sales Projections

We used two methods to predict future sales:
Model                   RMSE   MAPE (%)   Forecasted Q1 Sales ($)   YoY Growth
ARIMA (2,1,2)           7.5    4.2%       1.38 million              +6.8%
Exponential Smoothing   9.1    5.6%       1.33 million              +5.1%

• The ARIMA model gave more accurate results, with lower RMSE and MAPE than
Exponential Smoothing.

• February sales were expected to drop by 9%, which is normal after the holiday season.
• We tested both models carefully to make sure the results were reliable.
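
The two error measures in the table can be computed as in the sketch below. The actual and forecast values are invented for illustration, and the report does not state the units of its RMSE figures, so this only shows how the measures work and does not reproduce the 7.5 or 4.2% results.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error: typical error size, in the units of the data."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error; actual values must be non-zero."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual) * 100


# Hypothetical hold-out period: actual sales versus model forecasts.
actual = [100, 200, 400]
forecast = [110, 190, 380]

model_rmse = rmse(actual, forecast)
model_mape = mape(actual, forecast)
print(f"RMSE: {model_rmse:.2f}, MAPE: {model_mape:.2f}%")
```

RMSE punishes large misses more heavily, while MAPE is easier to compare across stores of different sizes, so reporting both gives a fuller picture.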

Strengths
• Accuracy: The forecast error was low (a MAPE of 4.2% for ARIMA), so the predictions
are reasonably trustworthy.
• We grouped customers in helpful ways for better marketing and planning.
• We used charts and tables to make the findings easy to understand.
• The results gave direct ideas for stock, ads, and running the business.

Weaknesses and Limitations

• We do not know customers’ income or jobs, which makes it harder to understand shopper
types.
• Some sales and discounts were not clearly recorded, making it hard to judge how well
promotions worked.
• We did not include factors like inflation, weather, and competitors, all of which can
affect sales.
• Everything was based on past sales, so we cannot react in real time to new trends.

4. Conclusion and Recommendations

Conclusion
This analysis shows how using data in a smart and organised way can help a business find
useful information in large datasets. By studying over 100,000 store transactions, we found:
• Seasonal trends, like a 28% sales jump in December and about 32% higher sales on
weekends than on weekdays.
• Only 20% of the items made up 80% of the total sales.
• Different types of customers who shop and spend in very different ways.
We used tools such as time series forecasting (ARIMA) to predict future sales, clustering
(K-Means) to group similar customers, and regression models to find which factors affect sales.
These methods helped us better understand what to stock, when to run promotions, and how to
serve different types of customers.

Even though the data was not real but based on realistic examples, the results show how data
can be used to make smart business decisions. This kind of analysis can help increase sales and
cut down on waste in both urban and rural areas.

Recommendations
Based on the findings from the data, we propose the following recommendations to benefit the
business:

1. Better Inventory Management

• Plan stock using the expected 6.8% growth in sales, and keep more stock in busy
months like December and August.
• Use automatic alerts to restock popular items on time and avoid running out of stock.
• Focus on the top 20% of items that make 80% of sales, to make sure that these are always
available.
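
The 80/20 pattern behind the last point can be checked with a short Pareto calculation. The sketch below uses invented item revenues. The function name `pareto_share` and the sample data are assumptions for illustration, not part of the original analysis.

```python
def pareto_share(revenue_by_item, target=0.80):
    """Fraction of items (best sellers first) needed to reach `target` share of revenue."""
    total = sum(revenue_by_item.values())
    ranked = sorted(revenue_by_item.values(), reverse=True)
    running = 0.0
    for count, revenue in enumerate(ranked, start=1):
        running += revenue
        if running >= target * total:
            return count / len(ranked)
    return 1.0


# Hypothetical monthly revenue ($) for ten items.
item_revenue = {
    "A": 510, "B": 300, "C": 40, "D": 40, "E": 30,
    "F": 25, "G": 20, "H": 15, "I": 10, "J": 10,
}

share_of_items = pareto_share(item_revenue)
print(f"{share_of_items:.0%} of items bring in at least 80% of revenue")
```

Running the same calculation on real item-level sales would show which products belong on the always-in-stock list.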
2. Smart Marketing
• Create offers for different types of customers.
• Give regular small discounts to keep budget shoppers coming back.
• Offer special bundles or early access to new products for premium shoppers.
• Send reminders by email and SMS with limited-time offers to bring occasional
shoppers back.
3. Improve Store Performance
• Keep stores open longer on weekends and holidays in cities, where more people
shop.
• Put high-profit items like electronics in the best shelf spots to increase sales.
• In rural areas, keep plenty of fast-selling daily items like groceries and household goods;
this could cut stock shortages by 12-15%.
4. Collect Better Data
• Encourage customers to sign up for loyalty programs to collect more information about
them, like income or job.
• Make sure that all discounts and promotions are recorded in the same way, which helps
track which offers really work.
• Add outside data like weather information or inflation rates to improve the predictions
and handle sudden changes better.

If the company follows these steps, it can improve how it runs its stores, market to the
right people, and plan better, which could increase profits by 10-15% each
year. Using data in daily decisions is not just helpful, it is a big advantage in today’s
business world.
