
Tutorial Series 4: Exercise 1

This document is a tutorial for a second-year secondary cycle course on file structures and data structures. It contains exercises on KNN and K-means methods, hierarchical clustering, and the Expectation-Maximization algorithm for clustering data. Each exercise requires calculations and descriptions of clustering algorithms applied to given datasets.

Uploaded by

Sarah Houas

Higher School of Computer and Digital Sciences and Technologies

File Structures and Data Structures


2nd year, secondary cycle

Tutorial series 4

Exercise 1:
1. What are the similarities and differences between the KNN and K-means methods?
2. Given n observations represented by p binary variables, using a hierarchical clustering
algorithm with single linkage, what is the maximum depth of the resulting dendrogram? If
we now use complete linkage and suppose that p << n, does this change the answer?
3. What distance metric is used in a hierarchical clustering algorithm applied to data
represented by binary variables?
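
As a point of reference for question 3, two dissimilarity measures commonly applied to binary vectors are the Hamming distance (number of positions that differ) and the Jaccard distance (one minus the share of positions set to 1 in both vectors among those set to 1 in at least one). The sketch below is only illustrative: the vectors `u` and `v` are made-up examples, not data from this sheet, and it does not claim to be the answer expected by the course.

```python
def hamming_distance(u, v):
    """Number of positions where the two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def jaccard_distance(u, v):
    """1 - |positions where both are 1| / |positions where at least one is 1|."""
    both = sum(a and b for a, b in zip(u, v))
    either = sum(a or b for a, b in zip(u, v))
    # Convention (assumption): two all-zero vectors are at distance 0.
    return 0.0 if either == 0 else 1 - both / either


# Hypothetical observations described by p = 5 binary variables.
u = [1, 0, 1, 1, 0]
v = [1, 1, 0, 1, 0]

d_hamming = hamming_distance(u, v)
d_jaccard = jaccard_distance(u, v)
print(d_hamming, d_jaccard)
```

With p binary variables the Hamming distance can only take the integer values 0 to p, which is worth keeping in mind when reasoning about question 2.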
Exercise 2:
Let D be the following set of integers: D = {2, 5, 8, 10, 11, 18, 20}.
We want to divide the data in D into three (3) clusters using the K-means algorithm.
The distance d between two numbers a and b is calculated as follows: d(a, b) = |a - b| (the absolute
value of a minus b).
Apply K-means using the initial cluster centers: 8, 10, and 11, respectively. Show all calculation
steps.
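
The hand calculation can be checked with a minimal sketch of the K-means loop in one dimension. Two choices here are assumptions that the exercise does not state: each center is updated to the arithmetic mean of its cluster, and a point that is equally close to two centers goes to the one listed first. The function name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(data, centers, max_iter=100):
    """Plain K-means on numbers with d(a, b) = |a - b|."""
    centers = list(centers)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for x in data:
            # min() returns the first minimal index, so ties go to the
            # center listed first (assumption).
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # Update step: new center = mean of the cluster (assumption);
        # an empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # centers unchanged: converged
            break
        centers = new_centers
    return clusters, centers


clusters, centers = kmeans_1d([2, 5, 8, 10, 11, 18, 20], [8, 10, 11])
print(clusters)
print(centers)
```

Under these assumptions the first pass groups the data as {2, 5, 8}, {10}, {11, 18, 20}, and the assignments stop changing once the clusters are {2, 5}, {8, 10, 11}, {18, 20}.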
Exercise 3:
Given the points: 1, 2, 9, 12, 20
1. Apply the hierarchical clustering algorithm and draw the corresponding dendrogram.
2. Consider a top-down hierarchical clustering algorithm that, at each iteration, looks for the
best way to split a set of points into two parts.
Describe in detail the first iteration of this algorithm, using the minimum-jump (single-linkage) strategy.
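
Both parts can be explored with a short sketch. Question 1 does not name a linkage, so the code assumes single linkage (distance between clusters = smallest |a - b| across them) for the bottom-up pass as well. For the top-down pass it assumes that the "best" split is the bipartition whose two parts are farthest apart under that same distance, found by brute force. The function names are hypothetical and the points are assumed to be distinct.

```python
from itertools import combinations


def single_link(c1, c2):
    """Single-linkage distance: smallest |a - b| between the two clusters."""
    return min(abs(a - b) for a in c1 for b in c2)


def agglomerative(points):
    """Bottom-up clustering; returns the merges as (cluster, cluster, distance)."""
    clusters = [[p] for p in sorted(points)]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        d, i, j = min(
            (single_link(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


def best_first_split(points):
    """Top-down, first iteration: try every bipartition, keep the widest gap."""
    pts = sorted(points)
    best = None
    for size in range(1, len(pts) // 2 + 1):
        for part in combinations(pts, size):
            rest = [p for p in pts if p not in part]
            gap = single_link(part, rest)
            if best is None or gap > best[0]:
                best = (gap, list(part), rest)
    return best


points = [1, 2, 9, 12, 20]
merges = agglomerative(points)
split = best_first_split(points)
for left, right, d in merges:
    print(left, right, d)
print(split)
```

The merge distances give the heights at which the dendrogram branches join. Under these assumptions the merges occur at distances 1, 3, 7 and 8, and the first top-down split separates 20 from the other four points.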
Exercise 4:
Given the following 1D dataset: X = {1, 2, 2, 3, 6, 7, 8, 9}
Assume the data comes from a mixture of two Gaussian distributions (K = 2).
Use the Expectation-Maximization algorithm to cluster the data (1 iteration).

Initial Conditions:
• μ₁ = 2, μ₂ = 8
• σ₁² = 1, σ₂² = 1
• π₁ = 0.5, π₂ = 0.5
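
One EM iteration for this mixture can be sketched as follows, to check a hand calculation. The code uses the standard update formulas for a univariate Gaussian mixture, with the variance update computed around the newly updated means; the function names `gaussian_pdf` and `em_step` are hypothetical, and a point is assigned to the component with the larger responsibility, which is a common convention rather than something the exercise specifies.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution N(mean, var) at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_step(data, means, variances, weights):
    """One EM iteration for a 1D Gaussian mixture."""
    k = len(means)
    # E-step: responsibility of each component for each point.
    resp = []
    for x in data:
        joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        resp.append([p / total for p in joint])
    # M-step: re-estimate the parameters from the soft assignments.
    n_k = [sum(r[j] for r in resp) for j in range(k)]
    new_means = [
        sum(r[j] * x for r, x in zip(resp, data)) / n_k[j] for j in range(k)
    ]
    new_vars = [
        sum(r[j] * (x - new_means[j]) ** 2 for r, x in zip(resp, data)) / n_k[j]
        for j in range(k)
    ]
    new_weights = [n / len(data) for n in n_k]
    return resp, new_means, new_vars, new_weights


X = [1, 2, 2, 3, 6, 7, 8, 9]
resp, means, variances, weights = em_step(X, [2, 8], [1, 1], [0.5, 0.5])

# Hard clustering: pick the component with the larger responsibility.
labels = [0 if r[0] >= r[1] else 1 for r in resp]
print(labels)
print(means, variances, weights)
```

Because the initial variances and mixing weights are equal, the responsibility of the first component for a point x reduces to 1 / (1 + exp(6x - 30)), so every point below 5 is assigned almost entirely to the first component and every point above 5 to the second.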
