Modul 8 (ANN1)

This module covers several machine learning clustering and classification algorithms: k-means clustering, radial basis function (RBF) networks, probabilistic neural networks (PNN), and self-organizing maps (SOM). It describes how each algorithm works: how k-means assigns data points to centroids to minimize a distance function, how PNN uses Gaussian basis functions to classify data, and how SOM uses competitive learning on a neural network to project high-dimensional data onto a lower-dimensional display. Issues such as local minima and "dead" neurons in SOM are also discussed.

Module 8

k-means
RBF networks
PNN
SOM
Non-Hierarchical Cluster Analysis: k-means

[Figure: two point clusters, A and B; "+" marks each cluster's centroid]

1. Select the number of clusters K (centroids)
2. Calculate centroids µ_k of partitions M_k
3. Assign cluster members x to centroids
4. Minimize the distance function:

$$D = \sum_{k=1}^{K} \sum_{x \in M_k} (x - \mu_k)^2 \rightarrow \min$$

http://www.elet.polimi.it/upload/matteucc/Clustering/tutorial_html/kmeans.html
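A minimal k-means sketch in Python/NumPy following steps 1-4 above; the random-point initialization, the fixed iteration cap, and the convergence test are illustrative assumptions, not prescribed by the slides:

import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Alternate assignment (step 3) and centroid update (step 2) until D stops decreasing."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=K, replace=False)]   # init: K random data points
    for _ in range(n_iter):
        # step 3: assign every x to its nearest centroid (squared Euclidean distance)
        labels = np.argmin(((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1), axis=1)
        # step 2: recompute each centroid mu_k as the mean of its partition M_k
        new_mu = np.array([X[labels == k].mean(axis=0) if np.any(labels == k) else mu[k]
                           for k in range(K)])
        if np.allclose(new_mu, mu):                     # step 4: D no longer decreases
            break
        mu = new_mu
    return mu, labels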
Example: k-means Clustering

Data:

ID  x1  x2
A1   1   1
A2   2   1
B1   4   5
B2   5   7
B3   7   7

• 2 centroids (k = 2)
• Euclidean distance

[Figure: scatter plot of the five points in the (x1, x2) plane; "+" marks the two centroids; the cluster boundary between A and B acts as a classifier]
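Running the kmeans() sketch from the previous page on these five points (a hypothetical session) reproduces the partition shown in the figure:

import numpy as np

# the five data points from the table above
X = np.array([[1, 1], [2, 1], [4, 5], [5, 7], [7, 7]], dtype=float)
mu, labels = kmeans(X, K=2)  # kmeans() as defined in the sketch above
# expected: A1, A2 form one cluster (centroid ~[1.5, 1.0]),
# B1, B2, B3 the other (centroid ~[5.33, 6.33])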
k-Nearest Neighbors

• Pick the k nearest objects to a reference point
• Problem: reasonable choice of k

[Figure: neighborhoods of a reference point in the (x1, x2) plane for k = 3 and k = 6]
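A matching k-NN classifier sketch (the function name and the Counter-based majority vote are our choices):

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training objects."""
    d = np.linalg.norm(X_train - x, axis=1)   # Euclidean distance to every object
    nearest = np.argsort(d)[:k]               # indices of the k closest objects
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

As the figure suggests, the predicted class can change between k = 3 and k = 6, which is why the choice of k is the main practical problem.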
Kernel-based Nearest Neighbors

• Pick the nearest objects to a reference point according to a kernel function
• Problem: reasonable choice of the kernel & its parameters

Kernel function:
f(A, B) → scalar value
f(A, B) = 0 if A = B
f(A, B) > 0 if A ≠ B

Gaussian kernel:

$$\Phi_j(x) = \exp\left(-\frac{\|x - \mu_j\|^2}{2\sigma_j^2}\right)$$

[Figure: two Gaussian kernels in the (x1, x2) plane, centered at µ_A (width σ_A) and µ_B (width σ_B)]
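The Gaussian kernel, transcribed directly (the function name is ours):

import numpy as np

def gaussian_kernel(x, mu_j, sigma_j):
    """Phi_j(x) = exp(-||x - mu_j||^2 / (2 sigma_j^2))"""
    return np.exp(-np.sum((x - mu_j) ** 2) / (2.0 * sigma_j ** 2))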

Kernel Discrimination Methods


Probabilistic Neural Network (PNN)

One standardized Gaussian basis function is placed on the location of each pattern (x_i = µ_i):

$$y_{class} = f_{class}(\mathbf{x}) = \frac{1}{M} \sum_{j \in CLASS} \frac{1}{(2\pi)^{D/2}\,\sigma_j^{D}} \exp\left(-\frac{\|\mathbf{x} - \mathbf{x}_{class,j}\|^2}{2\sigma_j^2}\right)$$

Optional: softmax, a smoothed version of "winner-take-all":

$$z_{class} = \frac{\exp(y_{class})}{\sum_{k=1}^{C} \exp(y_k)}$$

[Figure: network diagram with inputs x_1 … x_d, basis functions φ_1 … φ_M, and outputs y_1 … y_C]
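A PNN forward-pass sketch following the formula above: one standardized Gaussian per training pattern, averaged per class, plus the optional softmax. A single shared smoothing parameter sigma is an assumption made here for simplicity:

import numpy as np

def pnn_scores(X_train, y_train, x, sigma=1.0):
    """f_class(x): mean of standardized Gaussians centered on each class pattern."""
    D = X_train.shape[1]
    norm = (2 * np.pi) ** (D / 2) * sigma ** D          # standardization constant
    scores = {}
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]                      # the patterns x_{class,j}
        g = np.exp(-np.sum((Xc - x) ** 2, axis=1) / (2 * sigma ** 2))
        scores[c] = g.sum() / (len(Xc) * norm)          # (1/M) * sum of basis functions
    return scores

def softmax(scores):
    """Optional smoothed winner-take-all over the class outputs y_class."""
    y = np.array(list(scores.values()))
    z = np.exp(y - y.max())                             # shift by max for stability
    return dict(zip(scores.keys(), z / z.sum()))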
Probabilistic Neural Network (PNN)

Example: separating cytoplasmic proteins (class 1, f_class1(x)) from secreted proteins (class 2, f_class2(x)) in a two-dimensional property space.

[Figure: class densities P(x) over the properties <lipophilicity> and <volume>, with the decision boundary between the two classes]
Radial Basis Function (RBF) Network

$$y_k(\mathbf{x}) = \sum_{j=1}^{M} w_{kj}\,\Phi_j(\mathbf{x}) + w_{k0}$$

Gaussian basis function (M basis functions φ_1 … φ_M):

$$\Phi_j(\mathbf{x}) = \exp\left(-\frac{\|\mathbf{x} - \boldsymbol{\mu}_j\|^2}{2\sigma_j^2}\right) = \exp\left(-\frac{(\mathbf{x} - \boldsymbol{\mu}_j)^T(\mathbf{x} - \boldsymbol{\mu}_j)}{2\sigma_j^2}\right)$$

Standardized Gaussian (D: dimension of x):

$$\Phi_j(\mathbf{x}) = \frac{1}{(2\pi)^{D/2}\,\sigma_j^{D}} \exp\left(-\frac{\|\mathbf{x} - \boldsymbol{\mu}_j\|^2}{2\sigma_j^2}\right)$$

[Figure: network diagram with inputs x_1 … x_d, basis functions φ_1 … φ_M, weights w, and outputs y_1 … y_C]
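An RBF forward pass matching the equations above; the centers mu, widths sigma, and weights W, w0 are assumed to be given (in practice the centers are often placed by k-means and the weights fitted by least squares):

import numpy as np

def rbf_forward(x, mu, sigma, W, w0):
    """y_k(x) = sum_j W[k, j] * Phi_j(x) + w0[k], with Gaussian Phi_j."""
    phi = np.exp(-np.sum((mu - x) ** 2, axis=1) / (2 * sigma ** 2))  # M basis activations
    return W @ phi + w0                                              # C network outputs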
The Self-Organizing Map (SOM)
Data Analysis by Self-Organizing Maps
(Kohonen networks)

[Figure: input data space X projected onto a SOM]
Properties of Kohonen Networks

• Projection of high-dimensional space
• Self-organized feature extraction and cluster formation
• Non-linear, topology-preserving mapping

Other Projection Techniques

• Principal Component Analysis (linear)
• Projection to Latent Structures (linear, non-linear)
• Encoder networks (non-linear)
• Sammon mapping (non-linear)
Architecture of Kohonen Networks

[Figure: an (A × B) neuron array (here A = 6, B = 5); inputs x_1 … x_4 connect through weights w to every neuron in the array; the output of each neuron is binary (1/0)]

One neuron fires at a time.


Neighborhood Definition in Kohonen Networks

[Figure: first and second neighborhoods around the central (active) neuron, for a square neuron array and for a hexagonal neuron array]
Toroidal Topology of 2D-Kohonen Maps

An “endless plane”
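On a torus the grid distance between two neurons wraps around both edges of the array; a small sketch (hypothetical helper) for an A × B map:

def toroidal_distance(r, s, A, B):
    """Grid distance on an A x B torus: take the shorter way around each axis."""
    da = min(abs(r[0] - s[0]), A - abs(r[0] - s[0]))   # wrap along the A axis
    db = min(abs(r[1] - s[1]), B - abs(r[1] - s[1]))   # wrap along the B axis
    return (da ** 2 + db ** 2) ** 0.5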
Competitive Learning
1. Randomly select an input pattern x
2. Determine the “winner” neuron (competitive stage)
$$i^* \leftarrow \arg\min_{i} \left\{ \sum_{j=1}^{dim} (x_j - w_{ij})^2 \right\}, \qquad i = 1, 2, \ldots, n$$

3. Update network weights (cooperative stage)

$$w_{ij}^{new} = \begin{cases} w_{ij}^{old} + \eta\,x_j & \text{if } i \in N_{i^*} \quad \text{(followed by normalization of } \mathbf{w}_i\text{)} \\ w_{ij}^{old} & \text{if } i \notin N_{i^*} \end{cases}$$
4. Go to step 1, or terminate
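One learning iteration as a sketch. Note two deviations from the slide, both deliberate simplifications: the update uses the common unnormalized SOM form w + η(x − w) instead of w + ηx followed by normalization, and a square (Chebyshev) neighborhood stands in for N_{i*}:

import numpy as np

def competitive_step(W, pos, x, eta, radius):
    """W: (n, dim) weight matrix; pos: (n, 2) grid coordinates of the n neurons."""
    # competitive stage: winner i* minimizes sum_j (x_j - w_ij)^2
    i_star = np.argmin(np.sum((W - x) ** 2, axis=1))
    # cooperative stage: update all neurons inside the neighborhood N_{i*}
    in_nbhd = np.max(np.abs(pos - pos[i_star]), axis=1) <= radius
    W[in_nbhd] += eta * (x - W[in_nbhd])   # unnormalized SOM-style update (assumption)
    return W, i_star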


Scaling Functions ("connection kernel")

• Neighborhood (N) correction:

$$h(t, r, s) = \exp\left(-\frac{d_1(r, s)^2}{2\sigma^2(t)}\right), \qquad \sigma(t) = \sigma_{ini}\left(\frac{\sigma_{fin}}{\sigma_{ini}}\right)^{t/t_{max}}$$

• Time-dependent learning rate:

$$\eta(t) = \eta_{ini}\left(\frac{\eta_{fin}}{\eta_{ini}}\right)^{t/t_{max}}$$

[Figure: the neighborhood kernel h plotted against the grid distance between neurons r and s, shrinking over time; a "Mexican Hat" kernel is also shown (not used in SOM)]
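The neighborhood kernel and the two decay schedules, transcribed directly (function names are ours):

import numpy as np

def neighborhood(t, d_rs, sigma_ini, sigma_fin, t_max):
    """h(t, r, s) = exp(-d1(r, s)^2 / (2 sigma(t)^2)), sigma decaying exponentially."""
    sigma_t = sigma_ini * (sigma_fin / sigma_ini) ** (t / t_max)
    return np.exp(-d_rs ** 2 / (2 * sigma_t ** 2))

def learning_rate(t, eta_ini, eta_fin, t_max):
    """eta(t) = eta_ini * (eta_fin / eta_ini)^(t / t_max)"""
    return eta_ini * (eta_fin / eta_ini) ** (t / t_max)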
Vectorial Representation of Competitive Learning

[Figure: weight vectors of neuron 1 and neuron 2 drawn on the unit sphere in the (x1, x2) plane, shown before and after learning time]
SOM Adaptation to Probability Distributions

[Figure: snapshots of the map at t = 0, 100, 200, 300, 400, and 500 learning steps, showing the SOM adapting to the input probability distribution; the trained map induces a Voronoi tessellation with regions A and B]
SOM - Issues

• Neighboring neurons code for neighboring input positions, but the inverse does not hold
• Best results when input dimensionality = lattice dimensionality
• Neighborhood decay & shape can lead to local minima
• Problems with capturing the fine structure of the input space (oversampling of low-probability regions)
• "Dead" or "empty" neurons
• Features are not invariant to, e.g., translations of the input signal
Mapping Chemical Space: “Drugs” and “Nondrugs”

120-dimensional data (Ghose & Crippen parameters); 5,000 drugs and 5,000 nondrugs (Sadowski & Kubinyi, 1998)
Visualizing Combinatorial Libraries (UGI)
[Scheme: Ugi four-component reaction in MeOH at room temperature: an isocyanide (R1-NC), a carboxylic acid (R2-COOH), an amine (R3-NH2), and a fourth component (R4) react to give the Ugi products]

[Figure: two visualizations of the library screened in a thrombin binding assay, with hits (IC50 < 10 µM) marked: a PCA score plot (PC1 vs. PC2) and a Kohonen map]
Self-organizing neural networks demos:

1) University of Bochum:
http://www.neuroinformatik.ruhr-uni-bochum.de/ini/VDM/research/gsn/DemoGNG/GNG

2) SOMMER: link on the modlab software page, www.modlab.de
