assignment(2)
Assignment no. 2
Dr. Mohamed Abdelhafeez
1)
Min-Max normalization
The formula for Min-Max normalization is:
X_normalized = (X - X_min) / (X_max - X_min)
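As a minimal sketch, the formula can be applied to the eight values used in the Z-score calculation of this question, assuming they are the intended data set. The names data and min_max_normalize are illustrative only.

```python
def min_max_normalize(values):
    """Rescale values to the range [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # The formula is undefined when every value is the same.
        raise ValueError("all values are equal; the range is zero")
    return [(x - lo) / (hi - lo) for x in values]

# Assumed data set: the eight values that appear in this question.
data = [10, 40, 50, 10, 50, 70, 90, 30]
normalized = min_max_normalize(data)
print(normalized)
```

Here X_min = 10 and X_max = 90, so, for example, 50 maps to (50 - 10) / 80 = 0.5.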
Z-score normalization
The formula for Z-score normalization is:
X_normalized = (X - mean) / standard_deviation
mean = (10 + 40 + 50 + 10 + 50 + 70 + 90 + 30) / 8 = 350 / 8 = 43.75
standard_deviation = sqrt(((10 - 43.75)^2 + (40 - 43.75)^2 + (50 - 43.75)^2 + (10 - 43.75)^2 +
(50 - 43.75)^2 + (70 - 43.75)^2 + (90 - 43.75)^2 + (30 - 43.75)^2) / 8) = sqrt(5387.5 / 8) ≈ 25.9507
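The calculation can be checked with a short sketch. It assumes the population standard deviation (dividing by n = 8, as in the formula above) rather than the sample version; the function name z_score_normalize is illustrative.

```python
import math

def z_score_normalize(values):
    """Return (mean, population std, z-scores) for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation: divide by n, not n - 1.
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    if std == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return mean, std, [(x - mean) / std for x in values]

data = [10, 40, 50, 10, 50, 70, 90, 30]
mean, std, z_scores = z_score_normalize(data)
print(mean, round(std, 4))
print([round(z, 4) for z in z_scores])
```

After normalization the values have mean 0 and standard deviation 1, which is a quick way to sanity-check the result.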
2)
Mean Imputation:
To impute the missing value using mean imputation, you calculate the mean of the
available values in the dataset:
Mean = (10 + 40 + 50 + 10 + 50 + 70 + 90 + 30) / 8 = 43.75
Then, you replace the missing value with the calculated mean:
[10, 40, 50, 10, 50, 70, 90, 30, 43.75]
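The same steps can be sketched in code. This assumes the missing value is marked with None and sits in the last position, as in the imputed list above; impute_mean is a hypothetical helper name.

```python
def impute_mean(values):
    """Replace every None with the mean of the observed (non-missing) values."""
    observed = [x for x in values if x is not None]
    if not observed:
        raise ValueError("no observed values to average")
    mean = sum(observed) / len(observed)
    return [mean if x is None else x for x in values]

# Assumption: the missing entry is the ninth value.
data = [10, 40, 50, 10, 50, 70, 90, 30, None]
imputed = impute_mean(data)
print(imputed)
```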
Linear Interpolation:
For linear interpolation, you consider the neighboring data points around the missing
value. In this case, the value before the missing one is 30, and the value after it is taken
to be 43.75 (the mean computed above).
You can then calculate the interpolated value using the linear interpolation formula:
Interpolated value = 30 + (43.75 - 30) * (1/9) ≈ 31.5278
Replace the missing value with the interpolated value:
[10, 40, 50, 10, 50, 70, 90, 30, 31.5278]
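The arithmetic above can be reproduced with a small sketch of the general formula before + (after - before) * fraction. The endpoints 30 and 43.75 and the fraction 1/9 are taken from the answer as given; lerp is an illustrative name.

```python
def lerp(before, after, fraction):
    """Linear interpolation: fraction = 0 gives `before`, fraction = 1 gives `after`."""
    return before + (after - before) * fraction

# Values taken directly from the calculation in the text.
interpolated = lerp(30, 43.75, 1 / 9)
print(round(interpolated, 4))
```

With a different position between the two neighbors only the fraction changes; a point exactly halfway between them would use fraction = 0.5.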
4)
1. {a} / {a, b, c, d, e} = 1/5.
Cosine Similarity:
sklearn.metrics.pairwise.cosine_distances: Calculates the pairwise cosine distances
between two sets of points.
sklearn.metrics.pairwise.cosine_similarity: Calculates the pairwise cosine similarities
between two sets of points.
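To show what these two functions compute, here is a dependency-free sketch for a single pair of vectors. It does not call scikit-learn; cosine_sim and cosine_dist are illustrative names, and the vectors are made up for the example. Cosine distance is 1 minus cosine similarity.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def cosine_dist(a, b):
    """Cosine distance, defined as 1 - cosine similarity."""
    return 1.0 - cosine_sim(a, b)

# Hypothetical example vectors.
u = [1.0, 0.0]
v = [0.0, 1.0]   # perpendicular to u
w = [2.0, 0.0]   # same direction as u, different length

print(cosine_sim(u, v), cosine_dist(u, v))
print(cosine_sim(u, w), cosine_dist(u, w))
```

Vectors pointing in the same direction have similarity 1 regardless of length, while perpendicular vectors have similarity 0.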
Minkowski Distance:
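sklearn.metrics.pairwise_distances: Calculates the pairwise distances between two sets of
points using the chosen metric. Passing metric="minkowski" together with an order p selects
the Minkowski distance; p = 2 gives the Euclidean distance.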
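import numpy as np
# distances is assumed to be the matrix returned by pairwise_distances;
# entry [0, 0] is the distance between the first point of each set.
euclidean_distance = distances[0, 0]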
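For reference, the Minkowski distance itself can be sketched without any library. The points below are hypothetical, and the nested lists only mimic the shape of a pairwise distance matrix, so the entry is read as distances[0][0] rather than distances[0, 0].

```python
def minkowski(a, b, p):
    """Minkowski distance of order p: (sum |a_i - b_i|^p)^(1/p)."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

# Hypothetical point sets, one point each.
X = [[0.0, 0.0]]
Y = [[3.0, 4.0]]

# Pairwise matrix: row i, column j holds the distance from X[i] to Y[j].
distances = [[minkowski(a, b, 2) for b in Y] for a in X]
euclidean_distance = distances[0][0]            # p = 2: Euclidean
manhattan_distance = minkowski(X[0], Y[0], 1)   # p = 1: Manhattan

print(euclidean_distance, manhattan_distance)
```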