
Market Basket Analysis and Association Rules
Instructor: Junghye Lee

Ref.: M.J.A. Berry and G. Linoff, Data Mining Techniques, Wiley, 1997.

1
Contents

 Introduction
 Association Rules
 Basic Process
- Choosing the right set of items
- Generating rules and their measures
- Overcoming the practical limits
 Strengths and Weaknesses
 Application Areas

2
Introduction: What is Market Basket Analysis?

 Finding useful information in the ‘Market Basket’
 Useful information
– Who customers are
– Which products tend to be purchased together
– Why some products tend to be purchased together
 Useful information like “If Item A, then Item B” is called an ‘association rule’.

Example: a shopping cart containing window cleaner, detergent, orange juice, milk, and bananas

3
Introduction: Point of Sale Transactions

 Two basic elements: transactions and items

Ex) Grocery point-of-sale transactions

customer   items
1          orange juice, banana
2          orange juice, milk
3          detergent, window cleaner

(each row of the table is one transaction)
4
Introduction: Transactions and Co-Occurrence
Customer   Items
1          Orange juice, Soda
2          Milk, Orange juice, Window Cleaner
3          Orange juice, Detergent
4          Orange juice, Detergent, Soda
5          Window Cleaner, Soda

• OJ and soda are more likely to be purchased together.
• Detergent is never purchased with window cleaner or milk.
• Milk is never purchased with soda or detergent.

Co-occurrence matrix (counts):

                 OJ   Window Cleaner   Milk   Soda   Detergent
OJ               4    1                1      2      1
Window Cleaner   1    2                1      1      0
Milk             1    1                1      0      0
Soda             2    1                0      3      1
Detergent        1    0                0      1      2

Confidence of the rule
- “if soda, then orange juice”: 2/3 (67%)
- “if orange juice, then soda”: 2/4 (50%)
5
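The co-occurrence counts and the two confidence values above can be reproduced with a few lines of Python. The snippet below is a minimal sketch (the `transactions` list and the `confidence` helper are illustrative, not part of the slides):

```python
from itertools import combinations
from collections import Counter

# The five transactions from the table above.
transactions = [
    {"OJ", "Soda"},
    {"Milk", "OJ", "Window Cleaner"},
    {"OJ", "Detergent"},
    {"OJ", "Detergent", "Soda"},
    {"Window Cleaner", "Soda"},
]

# Off-diagonal co-occurrence counts: how often each pair appears together.
pairs = Counter(p for t in transactions for p in combinations(sorted(t), 2))
print(pairs[("OJ", "Soda")])  # 2, matching the matrix

def confidence(condition, result):
    """P(result | condition), estimated from the transactions."""
    cond = [t for t in transactions if condition in t]
    return sum(1 for t in cond if result in t) / len(cond)

print(confidence("Soda", "OJ"))  # 2/3 ~ 0.67
print(confidence("OJ", "Soda"))  # 2/4 = 0.50
```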
Association Rules

 The clear and useful result of market basket analysis
 Rules with only one item in the “result” are strongly recommended.
– “If diapers and Thursday, then beer” is more useful than “If Thursday, then diapers and beer”.
 Three types of rules
– the useful
– the trivial
– the inexplicable

6
Association Rules - The Useful Rule

 Contains high-quality, actionable information
 Once the pattern (rule) is found, it is often not hard to justify.
– Example: rules like “On Thursday, customers who purchase diapers are likely to purchase beer”
 Young couples prepare for the weekend by stocking up on diapers for the infants and beer for dad.
 Can be applied to store layout
– Ex) placing other baby products within sight of the beer

7
Association Rules - The Trivial Rule

 Trivial results are already known by anyone familiar with the business.
Ex) - “Customers who purchase maintenance agreements are very likely to purchase large appliances” (they purchase both at the same time)
- “Customers purchasing paint buy paint brushes”
 Results from market basket analysis may simply be measuring the success of previous marketing campaigns.

8
Association Rules - The inexplicable rules

 Inexplicable rules give a new fact but no explanation about consumer behavior or future actions.
– Example) “When a new hardware store opens, one of the most commonly sold items is toilet rings”
 Inexplicable rules can be flukes in the data.
 More investigation might give an explanation.

9
The Basic Process in Market Basket Analysis

 Choosing the right set of items and the right level
- using taxonomies and virtual items

 Generating rules via the co-occurrence matrix
(measures: support, confidence, improvement)

 Overcoming the practical limits
(pruning)
10
Basic Process:
Choosing the right set of items - Taxonomy
 Why?
– To reduce too many item combinations

 How to find the appropriate level?
– Consider frequency (roll rare items up to higher levels to help generalize items)
– Consider importance (roll expensive items down to lower levels)

Example taxonomy: Frozen Foods → Desserts → Ice Cream → Vanilla

 When the items occur about the same number of times in the data, the analysis produces the best results.
11
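As an illustration of rolling rare items up a taxonomy, here is a hypothetical Python sketch; the `taxonomy` dict and the `roll_up` helper are invented for this example and are not from the slides:

```python
from collections import Counter

# Hypothetical taxonomy: each item maps to its parent category.
taxonomy = {"Vanilla": "Ice Cream", "Ice Cream": "Desserts",
            "Desserts": "Frozen Foods"}

def roll_up(item, counts, min_count):
    """Replace a rare item by its parent until it occurs often enough."""
    while counts.get(item, 0) < min_count and item in taxonomy:
        item = taxonomy[item]
    return item

counts = Counter({"Vanilla": 3, "Ice Cream": 40, "Desserts": 200})
print(roll_up("Vanilla", counts, 25))  # 'Ice Cream'
```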
Basic Process:
Choosing the right set of items- Virtual Items

 Items that cross product boundaries
– They do not appear in the product taxonomy
(e.g., “Calvin Klein”, “cash”, “month”)

 Cautions
– Prime cause of redundant rules
(e.g., “If Coke products, then Coke”)
– A virtual item and a generalized item appearing together can be a proxy for individual items
(e.g., “If coke product and diet soda, then pretzels” really means “If diet coke, then pretzels”)
12
Basic Process: Generating rules

1. Gather transactions for the selected items (including virtual items)
2. Build the co-occurrence matrix
3. Find the most frequent combinations in the matrix
4. Split each combination into a “condition” and a “result”:
   If “condition”, then “result”.
– evaluated by the measures: support, confidence, improvement
13
Performance Measures - Support

Rule: If “condition” then “result”.

 Support
- How many transactions contain both “condition” and “result”?

S = P(“condition” ∩ “result”)
  = (# of transactions that include “condition” and “result”) / (# of total transactions)

- Support can be used to eliminate uninteresting rules.

14
Performance Measures - Confidence

 Confidence
- Among the transactions that include “condition”, how many also contain “result”?

C = P(“result” | “condition”) = P(“condition” ∩ “result”) / P(“condition”)
  = (# of transactions that include condition and result) / (# of transactions that include condition)

- A conditional probability
- A degree of association; may not imply causality
- Not symmetric

15
Performance Measures - Improvement

 Improvement (lift)
- Lift (improvement) tells us how much better a rule is at predicting the result than just guessing the result at random.

I = P(result | condition) / P(result) = P(condition ∩ result) / (P(condition) · P(result))

Improvement   Meaning                         Example
= 1           the two items are independent   pepper and cookie
> 1           complementary                   bread and butter
< 1           substitutional                  butter and margarine

16
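The three measures fit together in a few lines of code. The following is a sketch with assumed names (`rule_measures`, the abbreviated `carts` data), run on the grocery data of slide 5:

```python
def rule_measures(transactions, condition, result):
    """Estimate support, confidence, and improvement (lift) for the rule
    'if condition then result'; condition and result are sets of items."""
    n = len(transactions)
    n_cond = sum(1 for t in transactions if condition <= t)
    n_res = sum(1 for t in transactions if result <= t)
    n_both = sum(1 for t in transactions if condition | result <= t)
    support = n_both / n                    # P(condition and result)
    confidence = n_both / n_cond            # P(result | condition)
    improvement = confidence / (n_res / n)  # = P(both) / (P(cond) * P(res))
    return support, confidence, improvement

carts = [{"OJ", "Soda"}, {"Milk", "OJ", "WC"}, {"OJ", "Det"},
         {"OJ", "Det", "Soda"}, {"WC", "Soda"}]
print(rule_measures(carts, {"Soda"}, {"OJ"}))  # (0.4, 0.667, 0.833)
```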
Basic Process: Generating rules -
Example

 Grocery shopping carts

Customer   Items
1          Orange juice, Soda
2          Milk, Orange juice, Window Cleaner
3          Orange juice, Detergent
4          Orange juice, Detergent, Soda
5          Window Cleaner, Soda

17
Basic Process: Generating rules -
Example
 Co-Occurrence Matrix (counts, with proportions of the 5 transactions in parentheses)

                 OJ        Window Cleaner   Milk      Soda      Detergent
OJ               4 (0.8)   1 (0.2)          1 (0.2)   2 (0.4)   1 (0.2)
Window Cleaner   1 (0.2)   2 (0.4)          1 (0.2)   1 (0.2)   0
Milk             1 (0.2)   1 (0.2)          1 (0.2)   0         0
Soda             2 (0.4)   1 (0.2)          0         3 (0.6)   1 (0.2)
Detergent        1 (0.2)   0                0         1 (0.2)   2 (0.4)

(e.g., OJ appears in 4 of the 5 transactions, hence 0.8)
18
Basic Process: Generating rules -
Example
 Assume the most common combination is ‘A, B, C’

Combination     Probability
A               45%
B               42.5%
C               40%
A and B         25%
A and C         20%
B and C         15%
A and B and C   5%
19
Basic Process: Generating rules -
Example
 Which of A, B, C should be the result?
 Set the result on the basis of ‘confidence’.
 Confidence of the rule “If condition then result”
– P(Result | Condition) = P(Result and Condition) / P(Condition)
 e.g., Confidence of “If AB then C” = P(ABC) / P(AB)

Association Rule   P(condition)   P(condition and result)   Confidence
If AB then C       25%            5%                        0.20
If AC then B       20%            5%                        0.25
If BC then A       15%            5%                        0.33
20
Basic Process: Generating rules -
Example
 What if P(R) > P(R|C) (= confidence)?
 ‘Improvement’ tells how much better a rule is than just randomly guessing the result.
 Improvement = P(R|C) / P(R) = P(RC) / (P(R)·P(C))
– If Improvement > 1, the rule is better.
– If Improvement < 1, “If C, then NOT R” is better (a negative rule).

Rule               Confidence           Improvement
If BC then A       0.33                 0.74 (= 0.33/0.45)
If BC then Not A   0.67                 1.22 (= 0.67/0.55)
If A then B        0.56 (= 0.25/0.45)   1.31 (= 0.56/0.425)
21
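For transparency, a few lines of Python (probabilities copied from the table on slide 19) reproduce these confidence and improvement figures:

```python
# Probabilities from the slides, as fractions of all transactions.
p_a, p_b, p_ab, p_bc, p_abc = 0.45, 0.425, 0.25, 0.15, 0.05

conf_bc_a = p_abc / p_bc              # 0.33
print(conf_bc_a / p_a)                # improvement of 'If BC then A', ~0.74
print((1 - conf_bc_a) / (1 - p_a))    # 'If BC then Not A', ~1.22
conf_a_b = p_ab / p_a                 # ~0.56
print(conf_a_b / p_b)                 # improvement of 'If A then B', ~1.31
```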
Basic Process:
Overcoming the practical limits
 Exponential growth as the problem size increases
– On a menu with 100 items, how many 3-item combinations are there? → 161,700
 How to handle this on big data?
– Use the taxonomy: generalize items so that they meet the criterion.
– Use pruning: throw out items or combinations of items that do not meet the criterion. “Minimum support pruning” is the most common pruning mechanism.
22
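The 161,700 figure is just the number of 3-element subsets of 100 items, C(100, 3); one line of Python confirms it:

```python
import math

print(math.comb(100, 3))  # 161700 three-item combinations from a 100-item menu
```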
Strengths of Market Basket Analysis

 It produces clear and understandable results
– because the results are association rules
 It supports both directed and undirected data mining
 It can handle the transactions themselves
 The computations it uses are simple to understand
– the calculation of confidence and improvement is simple

23
Weaknesses of Market Basket Analysis

 It requires exponentially more computational effort as the problem size grows
– number of items, complexity of the rules
 It has limited support for data attributes
– virtual items make rules more expressive
 It is difficult to determine the right number of items
 It discounts rare items

24
Application of Market Basket Analysis

 It can suggest store layouts.
 It can tell which products are amenable to promotion.
 It is used to compare stores.
 It can be applied to time-series problems.

25
Apriori Algorithm

 Agrawal and Srikant, 1994

(Phase 1) Find all frequent itemsets having the minimum support s_min.

(Phase 2) Consider a subset A of a frequent itemset L.
For a specified confidence c_min,
if supp(L)/supp(A) ≥ c_min,
then generate a rule R: A → (L − A).

So the support of this rule will be supp(R) = supp(L), and
the confidence will be conf(R) = supp(L)/supp(A).

26
Apriori Algorithm – Phase 1
Step 0. Specify the minimum support s_min.
        k = 1; C_1 = [{i_1}, {i_2}, ..., {i_m}]; L_1 = {c ∈ C_1 | supp(c) ≥ s_min}

Step 1. k = k + 1
        Generate new candidate itemsets C_k from L_{k-1} (the apriori-gen function):
        Step 1-1 (join): generate candidate k-itemsets by joining L_{k-1} with itself: C = L_{k-1} * L_{k-1}
        Step 1-2 (prune): delete from C any itemset having a (k-1)-subset that does not belong to L_{k-1};
                          the pruned set is C_k. Stop if C_k = ∅.

Step 2. Generate L_k such that L_k = {c ∈ C_k | supp(c) ≥ s_min}.
        Repeat Step 1.
27
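The following is a compact Python sketch of Phase 1 (function and variable names are assumptions, not from the slides); it follows the join-prune-count loop above. Running it on the five transactions of the next slide with s_min = 0.4 yields the listed L1 through L4:

```python
from itertools import combinations

def apriori_frequent_itemsets(transactions, s_min):
    """Return all itemsets whose support is at least s_min."""
    n = len(transactions)
    support = lambda c: sum(1 for t in transactions if c <= t) / n
    items = sorted({i for t in transactions for i in t})
    L = [frozenset([i]) for i in items if support(frozenset([i])) >= s_min]
    frequent, k = list(L), 2
    while L:
        prev = set(L)
        # Step 1-1 (join): unions of two (k-1)-itemsets that give a k-itemset
        C = {a | b for a in L for b in L if len(a | b) == k}
        # Step 1-2 (prune): every (k-1)-subset must itself be frequent
        C = {c for c in C
             if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # Step 2 (count): keep candidates meeting the minimum support
        L = [c for c in C if support(c) >= s_min]
        frequent.extend(L)
        k += 1
    return frequent
```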
Apriori Algorithm – Example
transaction   items
1             b, c, g
2             a, b, d, e, f
3             a, b, c, g
4             b, c, e, f
5             b, c, e, f, g

s_min = 0.4

C1=[{a}, {b}, {c}, {d}, {e}, {f}, {g}]
L1=[{a}, {b}, {c}, {e}, {f}, {g}]

C2=[{a,b}, {a,c}, {a,e}, {a,f}, {a,g}, {b,c}, {b,e}, {b,f}, {b,g}, {c,e}, {c,f}, {c,g}, {e,f}, {e,g}, {f,g}]
L2=[{a,b}, {b,c}, {b,e}, {b,f}, {b,g}, {c,e}, {c,f}, {c,g}, {e,f}]

C3=[{b,c,e}, {b,c,f}, {b,c,g}, {b,e,f}, {c,e,f}]
L3=[{b,c,e}, {b,c,f}, {b,c,g}, {b,e,f}, {c,e,f}]

C4=[{b,c,e,f}]=L4
28
Apriori Algorithm – Example

 Rules generated from L = {b,c,g} (supp(L) = 0.6)

Candidate rules having one item in the “result”:

R1: {b,c} → {g}   conf(R1) = 0.6/0.8 = 0.75
R2: {b,g} → {c}   conf(R2) = 0.6/0.6 = 1
R3: {c,g} → {b}   conf(R3) = 0.6/0.6 = 1

29
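A matching Phase 2 sketch (again with assumed names) enumerates the subsets A of a frequent itemset L and keeps the rules A → (L − A) whose confidence reaches c_min; on the example above it recovers R1 through R3, along with any other subset rules that qualify:

```python
from itertools import combinations

def rules_from_itemset(L, transactions, c_min):
    """Generate rules A -> (L - A) with conf = supp(L)/supp(A) >= c_min."""
    n = len(transactions)
    supp = lambda s: sum(1 for t in transactions if s <= t) / n
    rules = []
    for r in range(1, len(L)):
        for A in map(frozenset, combinations(L, r)):
            conf = supp(L) / supp(A)
            if conf >= c_min:
                rules.append((set(A), set(L - A), conf))
    return rules

trans = [frozenset(s) for s in ("bcg", "abdef", "abcg", "bcef", "bcefg")]
for rule in rules_from_itemset(frozenset("bcg"), trans, 0.7):
    print(rule)  # includes ({'b','c'}, {'g'}, 0.75), ({'b','g'}, {'c'}, 1.0), ...
```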
Sequential Patterns
 Sequence: a list of itemsets in the order of time, etc.
– e.g., s1 = <A_1, A_2, ..., A_n>, s2 = <B_1, B_2, ..., B_m>
 Length of a sequence: the number of items in the sequence (a k-sequence contains k items)
 Subsequence: s1 is a subsequence of s2 if there exist indices i_1 < i_2 < ... < i_n such that
A_1 ⊆ B_{i_1}, A_2 ⊆ B_{i_2}, ..., A_n ⊆ B_{i_n}
– e.g., s1=<{a}, {b,c}, {e}>, s2=<{f}, {a,g}, {h}, {b,c,d}, {e,j}>, s3=<{a}, {b}, {c}, {e}>
 s1 is a subsequence of s2, but s3 is not a subsequence of s2
30
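Since the subsequence test (itemset containment in strictly increasing order) is easy to get wrong, here is a small Python sketch of the definition, checked on the slide's example (the function name is an assumption):

```python
def is_subsequence(s1, s2):
    """True if each itemset of s1 is contained in a strictly later
    itemset of s2, preserving order (A_j subset of B_{i_j}, i_1 < i_2 < ...)."""
    i = 0
    for A in s1:
        while i < len(s2) and not A <= s2[i]:
            i += 1
        if i == len(s2):
            return False
        i += 1  # the next itemset of s1 must match later in s2
    return True

s2 = [{"f"}, {"a", "g"}, {"h"}, {"b", "c", "d"}, {"e", "j"}]
print(is_subsequence([{"a"}, {"b", "c"}, {"e"}], s2))    # True  (s1)
print(is_subsequence([{"a"}, {"b"}, {"c"}, {"e"}], s2))  # False (s3)
```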
Sequential Patterns
 A sequence s is called maximal if it is not a subsequence of any other large sequence.
 Support of a sequence s:
supp(s) = proportion of customers whose sequence contains s

<example>

#   Customer sequence
1   <{a}, {b}>
2   <{c,d}, {a}, {e,f,g}>
3   <{a,h,g}>
4   <{a}, {e,g}, {b}>
5   <{b}>

Maximal sequences having min. support 0.4: s1 = <{a}, {b}>, s2 = <{a}, {e,g}>
(s3 = <{a}, {e}> is large but not maximal.)
31
Algorithm for Finding Sequences
 Agrawal and Srikant (1995)
1. Sort Phase: convert the transaction database into customer sequences.
2. Litemset Phase: find the set of all large itemsets L by considering the minimum support.
3. Transformation Phase: transform each customer sequence by replacing each transaction with the set of large itemsets it contains.
4. Sequence Phase: generate large sequences.
5. Maximal Phase: find the maximal sequences among the large sequences.
32
Algorithm for Finding Sequences
<example> (cont’d)
Min. support: 0.4

2) Litemset Phase

itemset   support   mapped to #
{a}       4         1
{b}       3         2
{e}       2         3
{e, g}    2         4
{g}       3         5

3) Transformation Phase

cust #   Customer sequence        Transformed sequence
1        <{a}, {b}>               <{1}, {2}>
2        <{c,d}, {a}, {e,f,g}>    <{1}, {3, 4, 5}>
3        <{a,h,g}>                <{1, 5}>
4        <{a}, {e,g}, {b}>        <{1}, {3, 4, 5}, {2}>
5        <{b}>                    <{2}>
33
Algorithm for Finding Sequences
4) Sequence Phase
• AprioriAll: does not guarantee maximal sequences, so it requires the Maximal Phase.
• AprioriSome, DynamicSome: guarantee maximal sequences.

AprioriAll
Step 0. Set all large 1-sequences to L_1; k = 1.
Step 1. k = k + 1; C_k = L_{k-1} * L_{k-1}
Step 2. Obtain L_k from C_k.
        Stop if L_k = ∅; repeat Step 1 otherwise.

Example (cont’d)
L1=[<1>, <2>, <3>, <4>, <5>]
L2=[<1, 2>, <1, 3>, <1, 4>, <1, 5>]
L3=[] → stop
Max sequences: <1, 2> and <1, 4>
34
Algorithm for Finding Sequences
Example

cust   Transformed sequence
1      <{1,5}, {2}, {3}, {4}>
2      <{1}, {3}, {4}, {3,5}>
3      <{1}, {2}, {3}, {4}>
4      <{1}, {3}, {5}>
5      <{4}, {5}>

Min. support: 0.4
L1=[<1>, <2>, <3>, <4>, <5>]
L2=[<1 2>, <1 3>, <1 4>, <1 5>, <2 3>, <2 4>, <3 4>, <3 5>, <4 5>]
C3=[<1 2 3>, <1 2 4>, <1 3 4>, <1 3 5>, <1 4 5>, <2 3 4>, <2 3 5>, <2 4 5>, <3 4 5>]
L3=[<1 2 3>, <1 2 4>, <1 3 4>, <1 3 5>, <2 3 4>]
C4=L4=[<1 2 3 4>]
Max sequences: <1 2 3 4>, <1 3 5>, <4 5>

35
Algorithm for Finding Sequences
AprioriSome

(Forward phase)
Step 0. k = 1; obtain L_1; C_1 = L_1; last = 1.
Step 1. (Generate C_k) k ← k + 1
        1) If L_{k-1} is known: C_k = L_{k-1} * L_{k-1}
        2) If L_{k-1} is unknown: C_k = C_{k-1} * C_{k-1}
Step 2. (Select the k for which L_k is counted)
        Stop if C_k = ∅; proceed otherwise.
        1) If k = next(last): obtain L_k, set last = k, go to Step 1.
        2) If k ≠ next(last): go to Step 1.

(Backward phase)
Step 0. k = k_max
Step 1.
        1) If L_k is known: delete from L_k all subsequences of sequences in L_i (i > k).
        2) If L_k is unknown: delete from C_k all subsequences of sequences in L_i (i > k), then obtain L_k from the remaining candidates.
Step 2. k ← k − 1; go to Step 1.
36
Algorithm for Finding Sequences
 The function ‘next’ determines the length of the sequences to count next.
– Agrawal & Srikant (1995)

hit(k) = |L_k| / |C_k|

1) If hit(k) < 0.666:          next(k) = k + 1
2) If 0.666 ≤ hit(k) < 0.75:   next(k) = k + 2
3) If 0.75 ≤ hit(k) < 0.80:    next(k) = k + 3
4) If 0.80 ≤ hit(k) < 0.85:    next(k) = k + 4
5) If hit(k) ≥ 0.85:           next(k) = k + 5

 Once all the large sequences are obtained, their union gives the maximal sequences.
37
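The step-size heuristic above translates directly into a few lines of Python (a sketch; `next_k` is an assumed name):

```python
def next_k(k, n_large, n_cand):
    """AprioriSome heuristic: the more candidates turn out to be large
    (hit = |L_k| / |C_k|), the further ahead we jump before counting again."""
    hit = n_large / n_cand
    if hit < 0.666:
        return k + 1
    if hit < 0.75:
        return k + 2
    if hit < 0.80:
        return k + 3
    if hit < 0.85:
        return k + 4
    return k + 5
```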
Algorithm for Finding Sequences
Example (cont’d) - AprioriSome, with next(i) = 2i

(Forward phase)
Iteration 0.
L1=C1=[<1>, <2>, <3>, <4>, <5>], last=1
Iteration 1. (k=2)
C2=[<1 2>, <1 3>, <1 4>, <1 5>, <2 3>, <2 4>, <2 5>, <3 4>, <3 5>, <4 5>]
next(1)=2=k, so count supports:
L2=[<1 2>, <1 3>, <1 4>, <1 5>, <2 3>, <2 4>, <3 4>, <3 5>, <4 5>]
last=2
Iteration 2. (k=3)
C3=[<1 2 3>, <1 2 4>, <1 3 4>, <1 3 5>, <1 4 5>, <2 3 4>, <2 3 5>, <2 4 5>, <3 4 5>]
next(2)=4 ≠ 3, so L3 is not counted.
Iteration 3. (k=4)
C4=[<1 2 3 4>, <1 2 3 5>, <1 2 4 5>, <1 3 4 5>, <2 3 4 5>]
next(2)=4=k, so count supports:
L4=[<1 2 3 4>]
38
Algorithm for Finding Sequences
Example (cont’d)

(Backward phase)
Iteration 0.
k_max=4
Iteration 1.
L4=[<1 2 3 4>]; k=3
Iteration 2.
L3 was not counted: delete from C3 all subsequences of <1 2 3 4>, count the rest: L3=[<1 3 5>]; k=2
Iteration 3.
L2: delete subsequences of the longer maximal sequences: L2=[<4 5>]

Max sequences: <1 2 3 4>, <1 3 5>, <4 5>

39
