DWMExp 7
Aim: Implement Association Rule mining algorithms using Weka.
Theory:
Association rule learning is an unsupervised learning technique that checks how the occurrence
of one data item depends on the occurrence of another and expresses that dependency as a rule,
so that it can be put to practical use, for example to make sales more profitable. It looks for
interesting relations or associations among the variables of a dataset, and it uses measures
such as support and confidence to decide which relations are interesting.
Association rule learning is an important concept in machine learning and is used in market
basket analysis, web usage mining, continuous production, etc. Market basket analysis is a
technique used by large retailers to discover associations between items. A supermarket is a
typical example: products that are frequently purchased together are placed close together.
Association rule learning can be divided into three types of algorithms:
1. Apriori
2. Eclat
3. FP-Growth
The Apriori algorithm is used for mining frequent itemsets and the association rules derived
from them. It generally operates on a database containing a large number of transactions, for
example, the items customers buy at a Big Bazaar store.
The Apriori algorithm relies on the following three measures:
1. Support
2. Confidence
3. Lift
Support:
Support refers to the overall popularity of an item. It is the number of transactions that
contain the item divided by the total number of transactions.
Support(X) = (Number of transactions containing X) / (Total number of transactions)
Confidence:
Confidence refers to the likelihood that a customer who bought biscuits also bought chocolates.
To get the confidence, divide the number of transactions that contain both biscuits and
chocolates by the number of transactions that contain biscuits.
Confidence(X -> Y) = Support(X and Y) / Support(X)
Lift:
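Continuing the above example, lift refers to the increase in the ratio of the sale of chocolates
when you sell biscuits. It is the confidence of the rule divided by the support of the
consequent:
Lift(X -> Y) = Confidence(X -> Y) / Support(Y)
A lift greater than 1 means X and Y occur together more often than would be expected if they
were independent; a lift of 1 means there is no association.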
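The three measures can be checked by hand on a small example. The sketch below is plain Python,
not Weka, and the five transactions are made-up data chosen only for illustration; the function
names support, confidence and lift are likewise just illustrative helpers.

```python
# Toy transactions (hypothetical data, used only to illustrate the formulas).
transactions = [
    {"biscuits", "chocolates", "milk"},
    {"biscuits", "chocolates"},
    {"biscuits", "bread"},
    {"chocolates", "milk"},
    {"bread", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence of the rule divided by the support of the consequent."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


print(support({"biscuits"}, transactions))
print(confidence({"biscuits"}, {"chocolates"}, transactions))
print(lift({"biscuits"}, {"chocolates"}, transactions))
```

With this data, biscuits appear in 3 of 5 transactions (support 0.6), 2 of those 3 also contain
chocolates (confidence about 0.67), and the lift is about 1.11, i.e. slightly above 1.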
Algorithm:
Algorithm for association rule mining
Step 1: Data Collection
Gather transaction data, where each transaction lists items bought together.
Step 2: Data Preprocessing
Prepare the data, ensuring items and transactions are well-defined and properly structured.
Step 3: Itemset Generation
Create a list of frequent itemsets (sets of items whose support meets a minimum support threshold).
Step 4: Rule Generation
Generate association rules from frequent item sets by examining itemset combinations.
Step 5: Rule Evaluation
Evaluate rules based on metrics like confidence and lift to find meaningful associations.
Step 6: Visualization and Interpretation
Visualize and interpret the discovered association rules for insights.
Association rule mining aims to discover patterns in data, such as "if X, then Y," and is widely
used in retail and recommendation systems.
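Steps 3 to 5 above can be sketched in a few lines of plain Python. This is a minimal
illustration of the level-wise Apriori idea under simplifying assumptions (in-memory data, no
optimisation); it is not Weka's implementation, and the function names, the toy transactions
and the thresholds 0.4 and 0.6 are all assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search: return {itemset: support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # One pass over the data to count each candidate of size k.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune every
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                consequent = itemset - antecedent
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    rule_lift = conf / frequent[consequent]
                    rules.append((antecedent, consequent, sup, conf, rule_lift))
    return rules


# Hypothetical transactions, used only to exercise the sketch.
transactions = [
    {"biscuits", "chocolates", "milk"},
    {"biscuits", "chocolates"},
    {"biscuits", "bread"},
    {"chocolates", "milk"},
    {"bread", "milk"},
]
itemsets = frequent_itemsets(transactions, min_support=0.4)
for antecedent, consequent, sup, conf, rule_lift in association_rules(itemsets, 0.6):
    print(sorted(antecedent), "->", sorted(consequent),
          round(sup, 2), round(conf, 2), round(rule_lift, 2))
```

The pruning step is what gives the algorithm its name: an itemset can only be frequent if all
of its subsets are frequent, so candidates with an infrequent subset are never counted.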
Disadvantages of Apriori Algorithm:
• Computing support is expensive, since each pass of the calculation has to scan the whole
database.
• Sometimes a huge number of candidate itemsets and rules has to be generated, which makes the
algorithm computationally more expensive.
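The growth in the number of candidates can be made concrete with a short calculation. The
sketch below counts how many itemsets are possible over n distinct items in the worst case,
i.e. before any support-based pruning; the function names are illustrative only.

```python
from math import comb


def candidate_counts(n_items):
    """Number of possible k-itemsets over n_items distinct items, for each k."""
    return {k: comb(n_items, k) for k in range(1, n_items + 1)}


def total_candidates(n_items):
    """All non-empty itemsets over n_items distinct items: 2^n - 1."""
    return 2 ** n_items - 1


for n in (10, 20, 30):
    print(n, total_candidates(n))
```

Each additional item doubles the number of possible itemsets, which is why a higher minimum
support threshold (more pruning) has such a large effect on running time.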