
Chapter 3

Unsupervised learning



Unsupervised learning

• Unsupervised learning involves training with unlabeled data, allowing the model to act on that information without guidance.

• In this type of machine learning, the model is not fed labeled data.

• Consider, for example, a collection of images of Tom and Jerry: the model has no clue which image is Tom and which is Jerry; it figures out the patterns and the differences between Tom and Jerry on its own by taking in tons of data.



Unsupervised learning
Cont..

For example, it identifies prominent features of Tom, such as pointy ears and bigger size, to understand that one image is of type 1. Similarly, it finds such features in Jerry and knows that the other image is of type 2. Therefore, it classifies the images into two different classes without knowing who Tom or Jerry is.
Unsupervised learning
Cont..
Unsupervised learning problems are further grouped into clustering and association problems.

• Clustering: This type of problem involves assigning the inputs to two or more clusters based on feature similarity.

 For example, viewers can be clustered into similar groups based on their interests, age, geography, etc. by using unsupervised learning algorithms such as K-Means clustering (a minimal sketch follows this list).

• Association: Association rules allow you to establish associations among data objects inside large databases. This unsupervised technique is about discovering interesting relationships between variables in large databases.

 For example, people who buy a new home are likely to also buy new furniture.
 Recommender systems are a typical application.
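
A minimal sketch of such a clustering with scikit-learn's KMeans class; the viewer features (age, weekly watch hours) and k=3 are made-up assumptions for illustration, not data from the slides.

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical viewer features: [age, weekly watch hours]
X = np.array([
    [18, 20], [21, 18], [19, 22],   # younger, heavy viewers
    [45, 5],  [50, 4],  [48, 6],    # older, light viewers
    [30, 12], [33, 10], [29, 11],   # an in-between group
])

# k=3 is assumed here; in practice K is chosen e.g. with the elbow method
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)            # cluster index assigned to each viewer
print(kmeans.cluster_centers_)   # centroid of each cluster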
K-means Algorithm

• The K-Means algorithm is an unsupervised learning algorithm that is used to solve clustering problems in machine learning.
• Clustering:
o The process of dividing a dataset into groups consisting of similar data points.
o Points in the same group are as similar as possible.
o Points in different groups are as dissimilar as possible.



K-means Algorithm
Cont..

• K-Means groups the unlabelled dataset into different clusters. Here K defines the number of pre-defined clusters that need to be created in the process: if K=2 there will be two clusters, for K=3 there will be three clusters, and so on.

• It allows us to cluster the data into different groups and is a convenient way to discover the categories of groups in an unlabelled dataset on its own, without the need for any supervision.

• It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim of the algorithm is to minimize the sum of distances between the data points and their corresponding cluster centroids.

• The algorithm takes the unlabelled dataset as input, divides the dataset into k clusters, and repeats the process until it finds the best clusters. The value of k should be predetermined in this algorithm.


K-means Algorithm
Cont..
The k-means clustering algorithm mainly performs two tasks:
• Determines the best positions for the K center points (centroids) by an iterative process.
• Assigns each data point to its closest k-center. The data points near a particular k-center form a cluster.
Hence each cluster has data points with some commonalities and lies away from the other clusters.



How does the K-Means
Algorithm Work?

• The working of the K-Means algorithm is explained in the steps below (a code sketch follows the list):

Step 1: Initialization: The user specifies the value of K and randomly selects K data points to serve as the initial centroids of the clusters.
Step 2: Assignment: Each data point is assigned to the cluster whose centroid is closest to it, based on the Euclidean distance between the point and the centroid.
Step 3: Recalculation: After all data points have been assigned to a cluster, the centroids of the clusters are recalculated as the mean of all the data points in each cluster.
Step 4: Reassignment: Steps 2 and 3 are repeated until the centroids no longer change or a maximum number of iterations is reached. During each iteration, data points are reassigned to the closest centroid, and centroids are recalculated based on the new cluster assignments.
Step 5: Termination: The algorithm terminates when the centroids no longer change or the maximum number of iterations is reached.
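
The steps above can be condensed into a short NumPy sketch; the function name, the iteration cap, and the seed below are my own choices, not part of the slides.

import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    # Step 1: Initialization - randomly select k data points as centroids
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # Step 2: Assignment - each point goes to the nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: Recalculation - centroids become the mean of their cluster
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Steps 4-5: Reassignment/Termination - stop once centroids settle
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels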
How does the K-Means Algorithm
Work? Cont..

Calculate the distance between the first point and the three centroids.





How does the K-Means Algorithm
Work? Cont..

The rest of the points will be assigned in the same manner.



How does the K-Means Algorithm
Work? Cont..

The iterations continue in the same manner, up to n iterations, until the best clustering is found.
How to choose the value of K in
K-means Algorithm?

• The performance of the K-means clustering algorithm depends on the highly efficient clusters that it forms, but choosing the optimal number of clusters is a big task.

Elbow Method:

• The elbow method is one of the most popular ways to find the optimal number of clusters. This method uses the concept of the WCSS value.

• WCSS stands for Within-Cluster Sum of Squares, which measures the total variation within the clusters.

• The formula to calculate the value of WCSS (for 3 clusters) is given below:

WCSS = Σ_{Pi in Cluster1} distance(Pi, C1)² + Σ_{Pi in Cluster2} distance(Pi, C2)² + Σ_{Pi in Cluster3} distance(Pi, C3)²



How to choose the value of K in
K-means Algorithm? Cont..

• In the above formula of WCSS:
• Σ_{Pi in Cluster1} distance(Pi, C1)² is the sum of the squared distances between each data point in Cluster1 and its centroid C1; the same holds for the other two terms.
• To measure the distance between data points and a centroid, we can use any method such as Euclidean distance or Manhattan distance.
• To find the optimal number of clusters, the elbow method follows the steps below (a sketch follows this list):
• Execute K-means clustering on the given dataset for different K values (ranging from 1 to 10).
• For each value of K, calculate the WCSS value.
• Plot a curve of the calculated WCSS values against the number of clusters K.
• The sharp point of bend, where the plot looks like an arm, is considered the best value of K.
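
A sketch of these steps with scikit-learn, whose inertia_ attribute on a fitted KMeans model is exactly the WCSS; the synthetic three-blob dataset is an assumption for illustration.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Synthetic data with three natural clusters (assumed for illustration)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 2))
               for c in ([0, 0], [5, 5], [0, 5])])

wcss = []
for k in range(1, 11):                    # K values ranging from 1 to 10
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    wcss.append(km.inertia_)              # inertia_ = WCSS of this model

plt.plot(range(1, 11), wcss, marker="o")  # the bend ('elbow') suggests K
plt.xlabel("Number of clusters K")
plt.ylabel("WCSS")
plt.show()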
How to choose the value of K in
K-means Algorithm? Cont..

• Since the graph shows a sharp bend that looks like an elbow, the technique is known as the elbow method.
• The graph for the elbow method (shown as an image in the original slide) plots the WCSS values against K, as produced by the sketch above.



Apriori Algorithm

• The Apriori algorithm is an unsupervised learning algorithm that uses frequent itemsets to generate association rules.

• Frequent itemsets are those itemsets whose support is greater than a threshold value, the user-specified minimum support.

• The algorithm is based on the Apriori property: every subset of a frequent itemset must itself be a frequent itemset.
o That is, if {A, B} is a frequent itemset, then A and B individually must also be frequent itemsets.


How Does the Apriori Algorithm
Work?

• Example: Suppose we have a dataset of various transactions (shown as a table in the original slides). From this dataset, we need to find the frequent itemsets and generate the association rules using the Apriori algorithm, given a minimum support count of 2 and a minimum confidence of 50%.



Apriori Algorithm Example
Cont..

Step-1: Calculating C1 and L1:
In the first step, we create a table that contains the support count (the frequency of each item in the dataset) of each itemset in the given dataset. This table is called the candidate set, C1.

Now we take all the itemsets whose support count is greater than or equal to the minimum support (2). This gives us the table for the frequent itemset L1.
All the itemsets have a support count greater than or equal to the minimum support except E, so the itemset {E} is removed.



Apriori Algorithm Example
Cont..

Step-2: Candidate generation C2, and L2:
In this step, we generate C2 with the help of L1: C2 contains the pairs of the itemsets of L1 in the form of subsets.
After creating the pairs, we again find the support count of each from the main transaction table of the dataset.

We then compare each C2 support count with the minimum support count; itemsets with a lower support count are eliminated from table C2. This gives us the table for L2.
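
Since the transaction table itself is only a figure in the original slides, the transactions below are hypothetical; the sketch just shows how the C1/L1 and C2/L2 tables of Steps 1 and 2 are computed from a transaction list.

from collections import Counter
from itertools import combinations

# Hypothetical transactions (the slide's actual table is not reproduced)
transactions = [
    {"A", "B"}, {"B", "C"}, {"A", "B", "C"}, {"A", "C"},
    {"B"}, {"A", "B", "E"}, {"B", "C"},
]
MIN_SUPPORT = 2

# C1: support count of each individual item
c1 = Counter(item for t in transactions for item in t)
# L1: keep only items meeting the minimum support ({E} is pruned here)
l1 = {item: n for item, n in c1.items() if n >= MIN_SUPPORT}

# C2: candidate pairs built from L1, with support recounted from the data
c2 = {pair: sum(1 for t in transactions if set(pair) <= t)
      for pair in combinations(sorted(l1), 2)}
# L2: prune pairs below the minimum support
l2 = {pair: n for pair, n in c2.items() if n >= MIN_SUPPORT}
print(l1, l2)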



Apriori Algorithm Example
Cont..

Step-3: Candidate generation C3, and L3:
• For C3, we repeat the same two processes, but now we form the C3 table with subsets of three items together and calculate their support count from the dataset.

o Now we create the L3 table. As the C3 table shows, there is only one combination of itemsets whose support count is equal to the minimum support count.
o So L3 has only one combination, i.e., {A, B, C}.



Apriori Algorithm Example
Cont..

Step-4: Finding the association rules for the subsets:
• For each candidate rule, we calculate the confidence using the formula confidence(X → Y) = sup(X∧Y)/sup(X). After calculating the confidence value for all rules, we exclude the rules whose confidence is below the minimum threshold (50%).

Rules       Support   Confidence
A∧B → C     2         sup{(A∧B)∧C}/sup(A∧B) = 2/4 = 0.50 = 50%
B∧C → A     2         sup{(B∧C)∧A}/sup(B∧C) = 2/4 = 0.50 = 50%
A∧C → B     2         sup{(A∧C)∧B}/sup(A∧C) = 2/4 = 0.50 = 50%
C → A∧B     2         sup{C∧(A∧B)}/sup(C)   = 2/5 = 0.40 = 40%
A → B∧C     2         sup{A∧(B∧C)}/sup(A)   = 2/6 ≈ 0.33 = 33.33%
B → A∧C     2         sup{B∧(A∧C)}/sup(B)   = 2/7 ≈ 0.29 = 28.57%

Since the given threshold (minimum confidence) is 50%, the first three rules, A∧B → C, B∧C → A, and A∧C → B, can be considered strong association rules for the given problem.
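
Using the support counts that can be read off the table above (sup(A∧B∧C)=2, sup(A∧B)=sup(B∧C)=sup(A∧C)=4, sup(C)=5, sup(A)=6, sup(B)=7), a sketch of the confidence filter:

# Support counts taken from the rule table above
support = {
    frozenset("ABC"): 2,
    frozenset("AB"): 4, frozenset("BC"): 4, frozenset("AC"): 4,
    frozenset("A"): 6, frozenset("B"): 7, frozenset("C"): 5,
}
MIN_CONFIDENCE = 0.5

rules = [("AB", "C"), ("BC", "A"), ("AC", "B"),
         ("C", "AB"), ("A", "BC"), ("B", "AC")]
for antecedent, consequent in rules:
    # confidence(X -> Y) = sup(X and Y) / sup(X)
    conf = support[frozenset(antecedent + consequent)] / support[frozenset(antecedent)]
    verdict = "strong" if conf >= MIN_CONFIDENCE else "rejected"
    print(f"{antecedent} -> {consequent}: {conf:.2%} ({verdict})")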



Co-training Algorithm

• Co-training is a well-known semi-supervised learning algorithm in which two classifiers are trained on two different views (feature sets): the initially small training set is iteratively extended with unlabelled samples classified with high confidence by one of the two classifiers.

• Semi-supervised learning techniques are useful in many practical applications in which a small set of labelled data is available, but a large set of unlabelled data can be exploited to improve the performance of learning algorithms.



Working of Co-training
Algorithm

• Assume we have an instance space X = X1 × X2, where X1 and X2 correspond to two different views of an example.

• That is, each example x is given as a pair (x1, x2). We assume that each view in itself is sufficient for correct classification.

• Specifically, let D be a distribution over X, and let C1 and C2 be concept classes defined over X1 and X2, respectively.

• We assume that all labels on examples with non-zero probability under D are consistent with some target function f1 ∈ C1, and are also consistent with some target function f2 ∈ C2.

• In other words, if f denotes the combined target concept over the entire example, then for any example x = (x1, x2) observed with label ℓ, we have f(x) = f1(x1) = f2(x2) = ℓ.

• This means in particular that D assigns probability zero to any example (x1, x2) such that f1(x1) ≠ f2(x2).
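
A minimal sketch of this loop with scikit-learn; the choice of Gaussian Naive Bayes for the two view classifiers and the per-round confidence budget are my own assumptions, not part of the original formulation.

import numpy as np
from sklearn.naive_bayes import GaussianNB

def co_train(X1, X2, y, labeled, rounds=10, per_round=2):
    # X1, X2: the two views (x1, x2) of the same examples;
    # labeled: boolean mask of the initially small labelled set
    y, labeled = y.copy(), labeled.copy()
    clf1, clf2 = GaussianNB(), GaussianNB()
    for _ in range(rounds):
        # Train one classifier per view on the current labelled set
        clf1.fit(X1[labeled], y[labeled])
        clf2.fit(X2[labeled], y[labeled])
        # Each classifier pseudo-labels its most confident unlabelled
        # examples, which both classifiers then train on next round
        for clf, X in ((clf1, X1), (clf2, X2)):
            pool = np.where(~labeled)[0]
            if pool.size == 0:
                return clf1, clf2
            conf = clf.predict_proba(X[pool]).max(axis=1)
            best = pool[np.argsort(conf)[-per_round:]]
            y[best] = clf.predict(X[best])
            labeled[best] = True
    return clf1, clf2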
Q-Learning Algorithm

Q-Learning is a fundamental type of reinforcement learning that utilizes Q-values (also known as action values) to improve the learner's behaviour continuously.
 Q-values (action values): Q-values are defined for state-action pairs. Q(S, A) is an estimate of how good it is to take action A in state S.
 The estimate of Q(S, A) is computed iteratively using the TD-update rule described in the next section.
 Episodes and rewards: Throughout its lifetime, an agent starts in a start state and makes numerous transitions from its current state to a next state, according to its actions and the environment it interacts with.
 At every transition step, the agent takes an action, receives a reward from the environment, and then moves to a new state.
 If at some point the agent lands in one of the terminal states, no further transitions are possible; this is referred to as the end of an episode.
Q-Learning Algorithm
Cont..

• Temporal Difference (TD) update: the TD update rule can be expressed in the following manner:
Q(S, A) ← Q(S, A) + α(R + γQ(S′, A′) − Q(S, A))
• This update rule is applied at every time step of the agent's interaction with its environment. The terminology is explained below (a sketch follows this list):
 S: The current state of the agent.
 A: The current action, selected by a policy.
 S′: The next state, where the agent ends up.
 A′: The next best action, chosen using the current Q-value estimate, i.e., the action with the highest Q-value in the next state.
 R: The current reward observed from the environment in response to the current action.
 γ (0 < γ ≤ 1): The discounting factor for future rewards. Future rewards are worth less than immediate rewards, so they are discounted. Because a Q-value estimates the expected reward from a given state, the discounting applies here as well.
 α: The step length (learning rate) used to update Q(S, A).
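
A sketch of the TD update in tabular form; the environment size, the α and γ values, and the sample transition are placeholder assumptions, not from the slides.

import numpy as np

n_states, n_actions = 6, 4        # placeholder environment size
alpha, gamma = 0.1, 0.9           # step length and discount factor
Q = np.zeros((n_states, n_actions))

def td_update(s, a, r, s_next):
    # Q(S,A) <- Q(S,A) + alpha * (R + gamma * max_a' Q(S',a') - Q(S,A))
    best_next = Q[s_next].max()   # value of A', the best action in S'
    Q[s, a] += alpha * (r + gamma * best_next - Q[s, a])

# One hypothetical transition: in state 0, action 2 yields reward 1.0
# and lands the agent in state 3
td_update(s=0, a=2, r=1.0, s_next=3)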





