CT075-3-2 DTM Topic 10 Cluster Analysis
Cluster Analysis
Learning Outcomes
• Pattern Recognition
• Spatial Data Analysis
– create thematic maps in GIS by clustering feature
spaces
– detect spatial clusters and explain them in spatial data
mining
• Image Processing
• Economic Science (especially market research)
• WWW
– Document classification
– Cluster web log data to discover groups of similar
access patterns
Examples of Clustering Applications
• Marketing: Help marketers discover distinct groups in their
customer bases, and then use this knowledge to develop
targeted marketing programs
• Land use: Identification of areas of similar land use in an
earth observation database
• Insurance: Identifying groups of motor insurance policy
holders with a high average claim cost
• City-planning: Identifying groups of houses according to
their house type, value, and geographical location
What Is Good Clustering?
[Figure: scatter-plot panels, both axes running from 0 to 10]
Equations required
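The formulas themselves did not survive on this slide. As a hedged reconstruction: the distances tabulated in the worked example are consistent with the Manhattan (city-block) distance, and the updated centroids with a per-cluster mean. The symbols x, c, and S_k (the set of points allocated to cluster k) are notation assumed here, not taken from the slides.

```latex
d(x, c) = |x_A - c_A| + |x_B - c_B|
\qquad
c_k = \frac{1}{|S_k|} \sum_{x \in S_k} x
```

For instance, for M1 = (1, 1) and a centroid at (5, 7), the distance is |1 - 5| + |1 - 7| = 10.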
Movie A B
M1 1.0 1.0
M2 1.5 2.0
M3 3.0 4.0
M4 5.0 7.0
M5 3.5 5.0
M6 4.5 5.0
M7 3.5 4.5
Example
M1 1.0 1.0
M4 5.0 7.0
Example
Initial centroids (seeds):
      A    B
C1    1    1
C2    5    7

                DISTANCE FROM
Movie A    B    C1   C2   ALLOCATION TO NEAREST CLUSTER
M1    1    1    0    10   C1
M3    3    4    5    5    C1, C2
M4    5    7    10   0    C2
M7    3.5  4.5  6    4    C2
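The allocation step can be sketched in Python as follows. This is a minimal illustration, assuming the Manhattan distance (which matches the tabulated values) and using hypothetical helper names `manhattan` and `allocate` that do not come from the slides. Ties are reported as a list of every nearest cluster, as for M3.

```python
# Ratings from the example table; the seeds are the positions of M1 and M4.
movies = {
    "M1": (1.0, 1.0), "M2": (1.5, 2.0), "M3": (3.0, 4.0), "M4": (5.0, 7.0),
    "M5": (3.5, 5.0), "M6": (4.5, 5.0), "M7": (3.5, 4.5),
}
seeds = {"C1": (1.0, 1.0), "C2": (5.0, 7.0)}


def manhattan(p, q):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def allocate(point, centroids):
    """Return (distances, nearest), where nearest lists every tied cluster."""
    dists = {name: manhattan(point, c) for name, c in centroids.items()}
    best = min(dists.values())
    nearest = [name for name, d in dists.items() if d == best]
    return dists, nearest


allocation = {m: allocate(p, seeds) for m, p in movies.items()}
for m, (dists, nearest) in allocation.items():
    print(m, dists, nearest)
```

Returning a list rather than a single label keeps the M3 tie visible instead of silently picking one cluster.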
Example
STEP 5
A B
C1 1.83 2.33
C2 3.9 5.1
SEED1 1 1
SEED2 5 7
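The centroid update in this step can be sketched as a per-cluster mean. One assumption is labeled in the code: the tied point M3 is counted in both clusters, which is the membership that reproduces the values 1.83, 2.33 and 3.9, 5.1 shown here. The name `centroid` and the `members` mapping are hypothetical and used only for illustration.

```python
# Ratings from the example table.
movies = {
    "M1": (1.0, 1.0), "M2": (1.5, 2.0), "M3": (3.0, 4.0), "M4": (5.0, 7.0),
    "M5": (3.5, 5.0), "M6": (4.5, 5.0), "M7": (3.5, 4.5),
}

# Assumption: M3 is equidistant from both seeds, so it is counted in both
# clusters when the means are taken.
members = {
    "C1": ["M1", "M2", "M3"],
    "C2": ["M3", "M4", "M5", "M6", "M7"],
}


def centroid(names):
    """Mean position of the named movies, one coordinate at a time."""
    points = [movies[n] for n in names]
    return tuple(sum(coord) / len(points) for coord in zip(*points))


new_centroids = {cluster: centroid(names) for cluster, names in members.items()}
for cluster, (a, b) in new_centroids.items():
    print(cluster, round(a, 2), round(b, 2))
```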
Example
                DISTANCE FROM CLUSTERS
Movie A    B    C1     C2    ALLOCATION TO NEAREST CLUSTER
M1    1    1    2.16   7     C1
M2    1.5  2    0.66   5.5   C1
M3    3    4    2.84   2     C2
M4    5    7    7.84   3     C2
M5    3.5  5    4.34   0.5   C2
M6    4.5  5    5.34   0.7   C2
M7    3.5  4.5  3.84   1     C2
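The reallocation against the updated centroids can be checked with a short script. This is a sketch under two assumptions: the Manhattan distance is used, and the centroids are taken at the rounded values from step 5. The names `manhattan` and `reassigned` are hypothetical.

```python
# Ratings from the example table.
movies = {
    "M1": (1.0, 1.0), "M2": (1.5, 2.0), "M3": (3.0, 4.0), "M4": (5.0, 7.0),
    "M5": (3.5, 5.0), "M6": (4.5, 5.0), "M7": (3.5, 4.5),
}
# Assumption: centroids rounded to two decimals, as printed in step 5.
centroids = {"C1": (1.83, 2.33), "C2": (3.9, 5.1)}


def manhattan(p, q):
    """City-block distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


reassigned = {}
for name, point in movies.items():
    # Round to two decimals to match the precision used in the table.
    dists = {c: round(manhattan(point, pos), 2) for c, pos in centroids.items()}
    nearest = min(dists, key=dists.get)  # cluster with the smallest distance
    reassigned[name] = (dists, nearest)
    print(name, dists, nearest)
```

Repeating the update and reallocation until no movie changes cluster is the usual stopping rule for this kind of procedure.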