DM Module 3 - Online

Data mining

Uploaded by

imsupriya1409
MINING ASSOCIATION RULES
UNIT 3
❑ Mining Association Rules
• Basic Concepts
• Single Dimensional Boolean Association Rules from Transaction Databases
• Multilevel Association Rules from Transaction Databases
• Multidimensional Association Rules from Relational Databases and Data Warehouses
• Apriori algorithm
• FP-Tree algorithm
Basic Concepts
■ Association rule mining discovers relationships between items stored in
relational databases or other data repositories, expressed as simple
if/then statements.
■ The procedure aims to identify frequently occurring patterns, correlations, or
associations in datasets across various relational and transactional
databases.
■ Association rule learning is a technique that helps identify the
dependencies between data items.
■ An association rule has 2 parts:
– an antecedent (if) and a consequent (then)
– An antecedent is something that’s found in data, and a consequent is
an item that is found in combination with the antecedent.
– “If a customer buys bread, they are 70% likely to buy milk.”
– In the above association rule, bread is the antecedent and milk is the
consequent.
■ Association rules are created by thoroughly analyzing data and looking for
frequent if/then patterns.
■ The important relationships are identified using the following two
parameters:
– Support: the fraction of transactions in the database that contain both
the antecedent and the consequent, i.e. how frequently the if/then
relationship appears.
– Confidence: the fraction of transactions containing the antecedent that
also contain the consequent, i.e. how often the relationship has been
found to be true.
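A minimal sketch of how support and confidence could be computed for the bread/milk rule. The transactions below are made up for illustration, and the helper names `support` and `confidence` are assumptions, not part of any library.

```python
# Toy transactions; the items and counts are invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    antecedent_support = support(antecedent, transactions)
    if antecedent_support == 0:
        return 0.0  # the antecedent never occurs, so the rule is never triggered
    both = support(set(antecedent) | set(consequent), transactions)
    return both / antecedent_support

rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(f"bread => milk: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Here bread and milk appear together in 3 of 5 transactions, and bread appears in 4 of 5, so the rule has support 0.60 and confidence 0.75.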
Single Dimensional Boolean Association
■ Association rules are a commonly used technique in data
mining and warehousing for discovering interesting
relationships between variables or items in a dataset.
■ Single-dimensional association rules involve analyzing the
relationships between two items, such as the association
between the purchase of a certain product and the purchase of
another product.
■ These rules can be represented as “if A, then B” statements,
where A is the antecedent (the item that is observed) and
B is the consequent (the item that is being predicted or
associated with A).
■ For example, a single-dimensional association rule could be “If
a customer buys bread, they are likely to also buy milk.”
■ If the items or attributes in an association rule reference only
one dimension, then it is a single-dimensional association rule.
■ For example, the rule
– computer => antivirus software [support = 2%,
confidence = 60%] could be written as
– buys(X, "computer") => buys(X, "antivirus software")
Single Dimensional Boolean Association

■ A Boolean association rule is a rule that
describes an association between items or attributes
in terms of yes or no (present or absent), rather than quantitative values.
■ If a rule concerns associations between the
presence or absence of items, it is a Boolean
association rule.
– For example,
– buys(X, "laptop computer") => buys(X, "HP_printer")
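To make the presence/absence idea concrete, the sketch below turns a few transactions into Boolean rows, one column per item. The transactions are invented for illustration; only the item names "laptop computer" and "HP_printer" come from the example above.

```python
# Toy transactions; "paper" is an extra item added for illustration.
transactions = [
    ["laptop computer", "HP_printer"],
    ["laptop computer"],
    ["HP_printer", "paper"],
]

# Sorted list of all distinct items gives a stable column order.
items = sorted({item for t in transactions for item in t})

# One row per transaction; True means the item is present in it.
boolean_rows = [[item in t for item in items] for t in transactions]

for row in boolean_rows:
    print(dict(zip(items, row)))
```

A Boolean rule such as buys(X, "laptop computer") => buys(X, "HP_printer") only looks at these True/False values, never at quantities or prices.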
Rules from Transactional Database
What is a transactional database?
– Transactional databases are used to handle transactions efficiently and reliably, and are
optimized for running production systems like banks, retail stores, and websites.
Transactional database attributes –
– High data integrity – By enforcing data rules and constraints, transactional databases
prevent entry of inconsistent or invalid data.
– Scalability – These databases can scale vertically (e.g., adding resources to a single machine) or
horizontally (e.g., adding machines to increase total capacity).
– Real-time processing – To maintain up-to-date, accurate records, transactional
databases should update in real time so that they keep track of
transactional changes and avoid data conflicts.
– Concurrent access – When two or more users are submitting database transactions at the same time (e.g.,
a retailer with multiple stores with multiple registers), transactional databases must be
able to manage this concurrent traffic while preventing data conflicts.
– Auditability – An audit trail (provided by a database transaction log) is an important aspect
of many transactional databases.
What type of database is a transactional database?
– Relational databases –
o Relational databases organize data in rows and columns which are used to form
tables.
o A relational database contains multiple tables, and a relationship between
two tables can be created using a foreign key.
o A foreign key references a unique identifier (the primary key) in another table and maintains the predefined relationships that
exist between the tables.
o These databases are run using relational database management system (RDBMS)
software, such as MySQL, and use SQL to process transactions, often called from
application languages such as Python.
o Relational databases are not well suited to unstructured data (e.g., text files,
photos, videos).
Types of transactions in a transactional database –
– Single transactions – A single transaction within a transactional database is
one reliable unit of work consisting of one or more database operations.
– Examples include an e-commerce order being placed or money being withdrawn from an ATM.
– Multi transactions – A multi transaction, sometimes called a distributed
transaction, includes multiple, interdependent transactions that span
several different databases and systems.
– Examples of these transactions include a multi-document transaction where
customer information is updated and the associated invoices and billing must be
updated as well, or validating the dependencies
Rules of transactional databases (the ACID properties) include –
• Atomicity – A transaction must complete in its entirety or have no effect at all.
• Consistency – A transaction must conform to existing constraints in the database.
• Isolation – The intermediate results of an uncompleted transaction are not visible to other transactions.
• Durability – Once a transaction is committed, its changes persist, even after a system failure.
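Atomicity and consistency can be sketched with Python's built-in sqlite3 module. The `accounts` table, the names, the balances and the `transfer` helper are all hypothetical, chosen only to show a transaction that either applies fully or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the consistency rule: balances may not go negative.
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)  # would make alice's balance negative
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the second transfer the credit to bob is executed first, but the debit from alice violates the constraint, so the whole transaction is rolled back and bob's balance is left unchanged.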
Multilevel Association Rules from Transaction Databases
■ Multilevel Association Rule is a technique that
extends Association Rule to discover
relationships between items at different levels
of granularity.
■ Multilevel Association Rule can be classified
into two types:
o Multi-dimensional Association Rule.
o Multi-level Association Rule.
Multi-dimensional Association Rule –
■ This is used to find relationships between items in different dimensions of a dataset.
■ For example, in a sales dataset, multi-dimensional Association Rule can be used to find relationships between products, regions, and time.
Multi-level Association Rule –
■ This is used to find relationships between items at different levels of granularity.
■ For example, in a retail dataset, multi-level Association Rule can be used to find relationships between individual items and categories of
items.
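A small sketch of what "different levels of granularity" means in practice: the same transactions are counted once at the item level and once after rolling each item up to its category. The concept hierarchy `category_of` and the transactions are hypothetical.

```python
# Hypothetical concept hierarchy: specific item -> category.
category_of = {
    "skim milk": "milk",
    "whole milk": "milk",
    "white bread": "bread",
    "wheat bread": "bread",
}

# Toy item-level transactions, invented for illustration.
transactions = [
    {"skim milk", "white bread"},
    {"whole milk", "wheat bread"},
    {"skim milk", "wheat bread"},
    {"whole milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

# Roll every transaction up one level by replacing items with their category.
rolled_up = [{category_of[item] for item in t} for t in transactions]

low_level = support({"skim milk", "white bread"}, transactions)
high_level = support({"milk", "bread"}, rolled_up)
print(f"item level: {low_level:.2f}, category level: {high_level:.2f}")
```

The category-level itemset {milk, bread} has support 0.75, while the item-level itemset {skim milk, white bread} has only 0.25, which is why patterns at lower levels are easier to miss.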
Approaches to Multilevel Association rule –
Multilevel Association Rule has different approaches to finding relationships between items at different levels of granularity. There are three
approaches as explained below in brief.
■ Uniform Support (using uniform minimum support for all levels)
o where only one minimum support threshold is used for all levels.
o This approach is simple but may miss meaningful associations at low levels.
■ Reduced Support (using reduced minimum support at lower levels)
o where the minimum support threshold is lowered at lower levels to avoid missing important associations.
o This approach uses different search techniques, such as level-by-level independent search and level-cross filtering by single item.
■ Group-based Support (using item or group-based support)
o where the user or expert sets the support and confidence threshold based on a specific group or product category.
o For example, if an expert wants to study the purchase patterns of the laptops and clothes categories, a low support threshold can be set
for this group so that these items' purchase patterns receive attention.
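The difference between uniform and reduced support can be sketched as below. The support values, thresholds and the `frequent` helper are illustrative assumptions; level 1 stands for a category and level 2 for individual items under it.

```python
# Hypothetical supports, keyed by (name, hierarchy level).
supports = {
    ("milk", 1): 0.10,
    ("2% milk", 2): 0.06,
    ("skim milk", 2): 0.04,
}

def frequent(supports, min_support_by_level):
    """Return the names whose support meets the threshold for their level."""
    return sorted(
        name
        for (name, level), s in supports.items()
        if s >= min_support_by_level[level]
    )

# Uniform support: one threshold (5%) for every level.
uniform = frequent(supports, {1: 0.05, 2: 0.05})

# Reduced support: the threshold is lowered to 3% at the lower level.
reduced = frequent(supports, {1: 0.05, 2: 0.03})

print("uniform:", uniform)
print("reduced:", reduced)
```

With the uniform threshold, skim milk (4%) is dropped. With the reduced threshold at level 2 it is kept, which illustrates how uniform support may miss associations at low levels.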
Applications of Multilevel Association Rule in data mining –
■ Retail Sales Analysis – Multilevel Association Rule helps retailers gain insights into customer
buying behavior and preferences, optimize product placement and pricing, and improve supply
chain management.
■ Healthcare Management – Multilevel Association Rule helps healthcare providers identify
patterns in patient behavior, diagnose diseases, identify high-risk patients, and optimize
treatment plans.
■ Fraud Detection – Multilevel Association Rule helps companies identify fraudulent patterns,
detect anomalies, and prevent fraud in various industries such as finance, insurance, and
telecommunications.
■ Web Usage Mining – Multilevel Association Rule helps web-based companies gain insights into
user preferences, optimize website design and layout, and personalize content for individual
users by analyzing data at different levels of abstraction.
■ Social Network Analysis – Multilevel Association Rule helps social network providers identify
influential users, detect communities, and optimize network structure and design by analyzing
social network data at different levels of abstraction.
Challenges in Multilevel Association Rule –
■ High dimensionality – It is the problem of dealing with data sets that have a large number of
attributes.
■ Large data set size – It is the problem of dealing with data sets that have a large number of
records.
■ Scalability – It is the problem of dealing with data sets that are too large to fit into memory.
