Lab 8
Aman Agarwal
Sec. C
Question: Write your own GMM implementation, using the EM algorithm for
parameter learning. Learn a GMM with 10 components on your data in PCA space.
SOURCE CODE:
import numpy as np
from scipy.stats import norm

np.random.seed(0)
X = np.linspace(-5, 5, num=20)
X0 = X * np.random.rand(len(X)) + 10  # Create data cluster 1 (around +10)
X1 = X * np.random.rand(len(X)) - 10  # Create data cluster 2 (around -10)
X2 = X * np.random.rand(len(X))       # Create data cluster 3 (around 0)
X_tot = np.stack((X0, X1, X2)).flatten()  # Combine the clusters to get the random datapoints from above

# E-step: fill r[i, c] with the responsibility of Gaussian c for datapoint i.
# The three initial Gaussians below are assumed (chosen to be consistent with
# the printed output); the report does not show the initialisation it used.
gaussians = [norm(loc=-5, scale=5), norm(loc=8, scale=3), norm(loc=1.5, scale=1)]
r = np.zeros((len(X_tot), 3))
for c, g in enumerate(gaussians):
    r[:, c] = g.pdf(X_tot)
r = r / r.sum(axis=1, keepdims=True)  # Normalise so each row of r sums to 1

print('Dimensionality', '=', np.shape(r))
print(r)
print(np.sum(r, axis=1))
OUTPUT:
Dimensionality = (60, 3)
[[2.97644006e-02 9.70235407e-01 1.91912550e-07]
[3.85713024e-02 9.61426220e-01 2.47747304e-06]
[2.44002651e-02 9.75599713e-01 2.16252823e-08]
[1.86909096e-02 9.81309090e-01 8.07574590e-10]
[1.37640773e-02 9.86235923e-01 9.93606589e-12]
[1.58674083e-02 9.84132592e-01 8.42447356e-11]
[1.14191259e-02 9.88580874e-01 4.48947365e-13]
[1.34349421e-02 9.86565058e-01 6.78305927e-12]
[1.11995848e-02 9.88800415e-01 3.18533028e-13]
[8.57645259e-03 9.91423547e-01 1.74498648e-15]
[7.64696969e-03 9.92353030e-01 1.33051021e-16]
[7.10275112e-03 9.92897249e-01 2.22285146e-17]
[6.36154765e-03 9.93638452e-01 1.22221112e-18]
[4.82376290e-03 9.95176237e-01 1.55549544e-22]
[7.75866904e-03 9.92241331e-01 1.86665135e-16]
[7.52759691e-03 9.92472403e-01 9.17205413e-17]
[8.04550643e-03 9.91954494e-01 4.28205323e-16]
[3.51864573e-03 9.96481354e-01 9.60903037e-30]
[3.42631418e-03 9.96573686e-01 1.06921949e-30]
[3.14390460e-03 9.96856095e-01 3.91217273e-35]
[1.00000000e+00 2.67245688e-12 1.56443629e-57]
[1.00000000e+00 4.26082753e-11 9.73970426e-49]
[9.99999999e-01 1.40098281e-09 3.68939866e-38]
[1.00000000e+00 2.65579518e-10 4.05324196e-43]
[9.99999977e-01 2.25030673e-08 3.11711096e-30]
[9.99999997e-01 2.52018974e-09 1.91287930e-36]
[9.99999974e-01 2.59528826e-08 7.72534540e-30]
[9.99999996e-01 4.22823192e-09 5.97494463e-35]
[9.99999980e-01 1.98158593e-08 1.38414545e-30]
[9.99999966e-01 3.43722391e-08 4.57504394e-29]
[9.99999953e-01 4.74290492e-08 3.45975850e-28]
[9.99999876e-01 1.24093364e-07 1.31878573e-25]
[9.99999878e-01 1.21709730e-07 1.17161878e-25]
[9.99999735e-01 2.65048706e-07 1.28402556e-23]
[9.99999955e-01 4.53370639e-08 2.60841891e-28]
[9.99999067e-01 9.33220139e-07 2.02379180e-20]
[9.99998448e-01 1.55216175e-06 3.63693167e-19]
[9.99997285e-01 2.71542629e-06 8.18923788e-18]
[9.99955648e-01 4.43516655e-05 1.59283752e-11]
[9.99987200e-01 1.28004505e-05 3.20565446e-14]
[9.64689131e-01 9.53405294e-03 2.57768163e-02]
[9.77001731e-01 7.96383733e-03 1.50344317e-02]
[9.96373670e-01 2.97775078e-03 6.48579562e-04]
[3.43634425e-01 2.15201653e-02 6.34845409e-01]
[9.75390877e-01 8.19866977e-03 1.64104537e-02]
[9.37822997e-01 1.19363656e-02 5.02406373e-02]
[4.27396946e-01 2.18816340e-02 5.50721420e-01]
[3.28570544e-01 2.14190231e-02 6.50010433e-01]
[3.62198108e-01 2.16303800e-02 6.16171512e-01]
[2.99837196e-01 2.11991858e-02 6.78963618e-01]
[2.21768797e-01 2.04809383e-02 7.57750265e-01]
[1.76497129e-01 2.01127714e-02 8.03390100e-01]
[8.23252013e-02 2.50758227e-02 8.92598976e-01]
[2.11943183e-01 2.03894641e-02 7.67667353e-01]
[1.50351209e-01 2.00499057e-02 8.29598885e-01]
[1.54779991e-01 2.00449518e-02 8.25175057e-01]
[7.92109803e-02 5.93118654e-02 8.61477154e-01]
[9.71905134e-02 2.18698473e-02 8.80939639e-01]
[7.60625670e-02 4.95831879e-02 8.74354245e-01]
[8.53513721e-02 2.40396004e-02 8.90609028e-01]]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
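Note that the code above performs only a single E-step with three fixed Gaussians on toy 1-D data, while the question asks for full EM parameter learning with 10 components. Below is a minimal sketch of the complete EM loop in the same 1-D setting; the function name em_gmm_1d, the random-data-point initialisation of the means, the variance floor, and the iteration count are my assumptions, not part of the original submission.

import numpy as np
from scipy.stats import norm

def em_gmm_1d(X, n_components=10, max_iter=50, seed=0):
    # Sketch of EM for a 1-D GMM; the initialisation scheme is assumed, not from the lab.
    rng = np.random.default_rng(seed)
    mu = rng.choice(X, size=n_components, replace=False)  # means start at random data points (assumption)
    var = np.full(n_components, X.var())                  # shared initial variance (assumption)
    pi = np.full(n_components, 1.0 / n_components)        # uniform mixing weights
    for _ in range(max_iter):
        # E-step: responsibilities r[i, c] proportional to pi_c * N(x_i | mu_c, var_c)
        r = np.array([p * norm(loc=m, scale=np.sqrt(v)).pdf(X)
                      for p, m, v in zip(pi, mu, var)]).T
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and variances from the responsibilities
        Nc = r.sum(axis=0)
        pi = Nc / len(X)
        mu = (r * X[:, None]).sum(axis=0) / Nc
        var = np.maximum((r * (X[:, None] - mu) ** 2).sum(axis=0) / Nc, 1e-6)  # floor avoids collapse
    return pi, mu, var

pi, mu, var = em_gmm_1d(X_tot)  # fit 10 components to the toy data above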
PCA:
SOURCE CODE:
# Principal Component Analysis (PCA)
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('Wine.csv')
X = dataset.iloc[:, :-1].values  # Features: every column except the last
y = dataset.iloc[:, -1].values   # Target: the last column

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)  # split parameters assumed; the report omits this code

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
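The template above stops at feature scaling; the code that actually answers the question, projecting into PCA space and learning a 10-component GMM there, is not shown in the report. A sketch of those remaining steps follows; it uses scikit-learn's PCA and GaussianMixture in place of the hand-written 1-D EM above, since that implementation only handles one-dimensional data.

from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

# Project the scaled data onto the first two principal components
pca = PCA(n_components=2)
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)

# Learn a 10-component GMM in PCA space, as the question asks
gmm = GaussianMixture(n_components=10, random_state=0)
gmm.fit(X_train_pca)
labels = gmm.predict(X_train_pca)  # hard cluster assignment for each training point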
Accuracy Score:
Graph:
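The graph itself is missing from this copy of the report. A typical plot for this exercise, colouring the PCA-space points by their GMM cluster, could be produced as sketched below; the styling is an assumption.

import matplotlib.pyplot as plt

plt.scatter(X_train_pca[:, 0], X_train_pca[:, 1], c=labels, cmap='viridis', s=15)
plt.scatter(gmm.means_[:, 0], gmm.means_[:, 1], c='red', marker='x')  # component means
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.title('10-component GMM in PCA space')
plt.show()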