
Problem Statement

The document outlines a plan to analyze transactional data from a grocery store to identify common product combinations and make recommendations for combo offers. Key steps include: 1) Loading the transaction data and removing irrelevant products to prepare the data for market basket analysis. 2) Using the KNIME workflow to split product data into sets, apply association rule learning to identify frequent item sets, and extract the results. 3) Analyzing the association rules to identify combinations of products that are commonly purchased together based on metrics like support, confidence and lift in order to suggest new combo offers for the grocery store.

Uploaded by gowtham
© All Rights Reserved

AGENDA

• Problem Statement
A grocery store has shared its transactional data with you. Your job is to identify the most
popular combos that can be suggested to the grocery store chain after a thorough analysis of
the most commonly occurring sets of items in customer orders. The store doesn't currently have
any combo offers. Can you suggest the best combos and offers?

• We aim to analyze the association rules to suggest the best combos and offers for the
grocery store chain using Market Basket Analysis.

• The data provided in the CSV file is Point of Sale (POS) data.
• Tableau is used for EDA visualization
  • Tableau Public link:
    https://public.tableau.com/app/profile/sivaramakrishnan3623/viz/Sivaramakrishnan_S_Marketing_Project_Milestone_2/MonthlyTrend
• A KNIME workflow is used for the Market Basket Analysis (MBA)
  • Sivaramakrishnan_S_MRA_Project_MileStone_2.knwf
TOOLS USED
ABOUT DATA

DATA DICTIONARY

INFORMATION & ASSUMPTIONS

• Number of transactions: 20682
• Number of features: 3
• No missing values
• No duplicates
• Unique orders: 1139 (numbered 1 to 1139)
• Unique products: 37
• Unique dates: 603
• Data covers January to September of 2018 and 2019, plus two months (January and February) of 2020
YEARLY TREND

2018 has the highest number of orders, followed by 2019. The count for 2020 is very low
because the data covers only two months of that year.

OVERALL MONTHLY

January has the highest number of unique orders (174) and June the lowest (105).

MONTHLY TREND

No clear trend or seasonality is visible in the data provided.

QUARTERLY TREND

Q1 2019 and Q3 2018 have the highest number of orders (180). Q1 2020 has the lowest because it contains only two months of data.

DAY WISE TREND

Orders are low at the start of the month, peak in the middle of the month, and decline
toward the end of the month.

PRODUCTS COUNT

Poultry has the highest number of orders and hand soap the lowest.
POULTRY - 480
ICE CREAM - 454
CEREALS - 451
LUNCH MEAT - 450
WAFFLES - 449
CHEESES - 445
SODA - 445
EGGS - 444
DINNER ROLLS - 443
DISHWASHING LIQUID/DETERGENT - 442
BAGELS - 439
ALUMINUM FOIL - 438
YOGURT - 438
MILK - 433
COFFEE/TEA - 432
SOAP - 432
LAUNDRY DETERGENT - 431
TOILET PAPER - 431
JUICE - 429
INDIVIDUAL MEALS - 428
MIXES - 428
ALL-PURPOSE - 427
BEEF - 427
SPAGHETTI SAUCE - 425
KETCHUP - 423
PASTA - 423
FRUITS - 422
TORTILLAS - 421
SHAMPOO - 420
BUTTER - 419
SANDWICH BAGS - 419
PAPER TOWELS - 413
SUGAR - 411
PORK - 405
FLOUR - 402
SANDWICH LOAVES - 398
HAND SOAP - 394


• All-Purpose is a general product, so we remove it from the data to get better combos

• Unique products count after removal of All-Purpose: 15,484

• Products count after removal of All-Purpose: 20,090


MARKET BASKET ANALYSIS

Market Basket Analysis is a technique that identifies the strength of association between pairs
of products purchased together and identifies patterns of co-occurrence. A co-occurrence is when
two or more things take place together.

Market Basket Analysis creates If-Then scenario rules, for example, if item A is purchased then
item B is likely to be purchased. The rules are probabilistic in nature or, in other words, they are
derived from the frequencies of co-occurrence in the observations. Frequency is the proportion of
baskets that contain the items of interest. The rules can be used in pricing strategies, product
placement, and various types of cross-selling strategies.

To make this easier to understand, think of Market Basket Analysis in terms of shopping at a
supermarket. Market Basket Analysis takes data at transaction level, which lists all items bought
by a customer in a single purchase. The technique determines which products were purchased
together with which other product(s). These relationships are then used to build profiles
containing If-Then rules of the items purchased. The rules can be written as If {A} Then {B}.

The If part of the rule (the {A} above) is known as the antecedent and the Then part of the rule
(the {B} above) is known as the consequent.

The antecedent is the condition and the consequent is the result. An association rule has three
measures that express the degree of confidence in the rule: Support, Confidence, and Lift.
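Threshold Values

Support: the baseline popularity of an item. In mathematical terms, the support of item A is
the ratio of transactions involving A to the total number of transactions.

Confidence: the likelihood that a customer who bought A also bought B. For the rule
If {A} Then {B}, it divides the number of transactions involving both A and B by the number of
transactions involving A.

Lift: the increase in the sale of B when A is sold. It is the confidence of the rule divided by
the support of B. A lift above 1 means A and B are bought together more often than would be
expected if they were purchased independently.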
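The three measures can be worked through on a small example. The sketch below is only an illustration in plain Python: the five baskets and the helper names `support`, `confidence` and `lift` are made up for this purpose and are not part of the project data or the KNIME workflow.

```python
def support(transactions, items):
    """Share of transactions that contain every item in `items`."""
    items = frozenset(items)
    hits = sum(1 for basket in transactions if items <= basket)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence of the rule divided by the support of the consequent."""
    return confidence(transactions, antecedent, consequent) / support(transactions, consequent)


# Hypothetical baskets, one frozenset of products per order.
baskets = [
    frozenset({"milk", "eggs", "bagels"}),
    frozenset({"milk", "eggs"}),
    frozenset({"milk", "soda"}),
    frozenset({"eggs", "bagels"}),
    frozenset({"soda"}),
]

print(support(baskets, {"milk"}))               # 3 of 5 baskets contain milk
print(confidence(baskets, {"milk"}, {"eggs"}))  # 2 of the 3 milk baskets also contain eggs
print(lift(baskets, {"milk"}, {"eggs"}))        # above 1: bought together more than by chance
```

In this toy data the rule If {milk} Then {eggs} has support 0.4, confidence of about 0.67 and lift of about 1.11.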


KNIME WORK-FLOW

Node Name                   Description
Read CSV                    Read the CSV file
Data Explorer               Explore the data parameters
Row Filter                  Filter out the "All-Purpose" value, since it does not look relevant
GroupBy                     Group by Order ID
Cell Splitter               Convert the products data to a set
Association Rule Learner    Run Market Basket Analysis to generate the frequent item list
Excel Extract               Export the frequent item list to Excel
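For readers without KNIME, the data preparation steps (read, filter, group, convert to set) can be approximated in plain Python. This is a rough sketch, not the workflow itself: the column names `Date`, `Order_id` and `Product`, the sample rows, and the function name `load_baskets` are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in an assumed three-column layout (Date, Order_id, Product).
SAMPLE_CSV = """Date,Order_id,Product
2018-01-01,1,milk
2018-01-01,1,eggs
2018-01-01,1,all-purpose
2018-01-01,1,milk
2018-01-02,2,soda
2018-01-02,2,all-purpose
2018-01-03,3,bagels
2018-01-03,3,eggs
"""


def load_baskets(csv_text, excluded=("all-purpose",)):
    """Read rows, drop excluded products, and collect one product set per order."""
    baskets = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        product = row["Product"].strip().lower()
        if product in excluded:
            continue  # mirrors the Row Filter step
        # Adding to a set mirrors GroupBy plus removal of repeated products.
        baskets[int(row["Order_id"])].add(product)
    return dict(baskets)


baskets_by_order = load_baskets(SAMPLE_CSV)
for order_id, products in sorted(baskets_by_order.items()):
    print(order_id, sorted(products))
```

One consequence of filtering before grouping is that an order containing only the excluded product would disappear from the result entirely.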
MBA – DATA LOAD

The loaded dataset has 20,641 rows. After filtering out All-Purpose, 20,090 rows remain.
The filtered data is then grouped by Order ID, which leaves 1,139 rows, one per unique order.
This helps us classify the products for our further Market Basket Analysis.

[Screenshots: Data Load, Filtered Data, Grouped Data]

MBA – CELL SPLITTER

Convert Products to Set

In the ‘Cell Splitter’ node we removed the duplicated products and concatenated the remaining
products into a single group per purchase or order ID.

This puts the items into set format, which is shown in square brackets.
MBA – Association Rule

This is the most important node for our Market Basket Analysis. It uses the three metrics
Support, Confidence and Lift. The minimum support takes a value between 0 and 1. We set it to
0.03, meaning an item set must appear in at least 3% of all transactions, and we set the
minimum confidence for the association rules to 0.05.

These rules are actionable in that they can be used to target customers for marketing, for
product placement, or more generally to inform decision making. Examples of areas in which
association rules have been used include supermarket purchases, where common combinations of
products can be used to inform product placement on supermarket shelves.
INFERENCE

• As the table in the previous slide shows, the output contains 145104 records, and each row
contains a different rule.

• Multiple rules were created on the basis of the threshold limits set earlier in the
Association Rule Learner node, and we recommend to the customer the product from whichever
rule has the higher lift value.

• The Consequent column contains the recommended products, and we sorted the lift values from
highest to lowest for better recommendations.
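The thresholding and sorting described above can be sketched in plain Python. This is a deliberately simplified stand-in for the Association Rule Learner node, not its actual algorithm: it only builds rules with a single item on each side, whereas the node also produces rules over larger item sets. The function name `pair_rules` and the toy baskets are assumptions; the default thresholds are the 0.03 and 0.05 used in the project.

```python
from collections import Counter
from itertools import permutations


def pair_rules(baskets, min_support=0.03, min_confidence=0.05):
    """Build single-item rules If {a} Then {c}, sorted by lift from highest to lowest."""
    n = len(baskets)
    item_count = Counter(item for basket in baskets for item in basket)
    # Ordered pairs (a, c) with a != c, counted once per basket that contains both.
    pair_count = Counter(
        pair for basket in baskets for pair in permutations(sorted(basket), 2)
    )
    rules = []
    for (a, c), count in pair_count.items():
        supp = count / n               # share of baskets containing both items
        conf = count / item_count[a]   # share of a-baskets that also contain c
        if supp >= min_support and conf >= min_confidence:
            rules.append({
                "antecedent": a,
                "consequent": c,
                "support": supp,
                "confidence": conf,
                "lift": conf / (item_count[c] / n),
            })
    rules.sort(key=lambda rule: rule["lift"], reverse=True)
    return rules


# Hypothetical baskets, one set of products per order.
toy_baskets = [
    {"milk", "eggs", "bagels"},
    {"milk", "eggs"},
    {"milk", "soda"},
    {"eggs", "bagels"},
    {"soda"},
]

for rule in pair_rules(toy_baskets)[:3]:
    print(rule["antecedent"], "->", rule["consequent"], round(rule["lift"], 2))
```

Raising `min_support` shrinks the rule table quickly, which is one reason a low threshold such as 0.03 yields a very large number of rules on real data.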


INSIGHTS & RECOMMENDATION

• In the result table of the Association Rule Learner, some item sets contain a single item,
some contain two, and some contain more within a single bracket.

• We generally recommend the products listed in the Consequent column that have a higher lift
value.

• A higher lift means the product is more likely to be purchased by a customer who has bought
the antecedent items than by a customer in general.
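As a small illustration of this recommendation logic, the sketch below picks consequents by lift for a given basket. The rule table, its lift values and the function name `recommend` are hypothetical and are not taken from the project's output. Only rules with a lift above 1 are used, which is an assumed cut-off rather than one stated in the slides.

```python
def recommend(rules, basket, top_n=3, min_lift=1.0):
    """Return up to top_n products not yet in the basket, ranked by their best lift."""
    basket = set(basket)
    best_lift = {}
    for rule in rules:
        applies = rule["antecedent"] in basket and rule["consequent"] not in basket
        if applies and rule["lift"] > min_lift:
            product = rule["consequent"]
            best_lift[product] = max(best_lift.get(product, 0.0), rule["lift"])
    ranked = sorted(best_lift.items(), key=lambda pair: pair[1], reverse=True)
    return [product for product, _ in ranked[:top_n]]


# Hypothetical rule table with made-up lift values.
example_rules = [
    {"antecedent": "eggs", "consequent": "bagels", "lift": 1.67},
    {"antecedent": "milk", "consequent": "eggs", "lift": 1.11},
    {"antecedent": "milk", "consequent": "soda", "lift": 0.83},
]

print(recommend(example_rules, {"milk"}))          # ['eggs']
print(recommend(example_rules, {"milk", "eggs"}))  # ['bagels']
```

Soda is never suggested here because its rule has a lift below 1, meaning it is bought with milk less often than chance would predict in this made-up table.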
