Module-3 Conti.. Similarity & Dissimilarity
Similarity and
Dissimilarity
Unit - II
Data Mining
Measures of Similarity and
Dissimilarity
Transformations
● Transformations are often applied to
○ convert a similarity to a dissimilarity,
○ convert a dissimilarity to a similarity, or
○ transform a proximity measure to fall within a particular range, such as [0, 1].
● Example
○ Similarities between objects range from 1 (not at all similar) to 10 (completely similar).
○ We can make them fall within the range [0, 1] by using the transformation
■ s’ = (s − 1)/9
■ s: original similarity
■ s’: new similarity value
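The rescaling in the example can be sketched in a few lines of Python. This is only an illustration: the function names `rescale_similarity` and `to_dissimilarity` are made up here, and d = 1 − s is just one common way to turn a [0, 1] similarity into a dissimilarity, assumed for the sketch rather than taken from the slides.

```python
def rescale_similarity(s, s_min=1, s_max=10):
    """Map a similarity from the range [s_min, s_max] onto [0, 1].

    With the defaults this is the slide's transformation s' = (s - 1) / 9.
    """
    return (s - s_min) / (s_max - s_min)


def to_dissimilarity(s_scaled):
    """One common (assumed) conversion for similarities already in [0, 1]."""
    return 1 - s_scaled


for s in (1, 5.5, 10):
    s_new = rescale_similarity(s)
    print(s, "->", s_new, "dissimilarity:", to_dissimilarity(s_new))
```

The lowest original value (1) maps to 0 and the highest (10) maps to 1, so the order of the similarities is preserved.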
Dissimilarities between Data Objects
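Euclidean Distance
d(x, y) = sqrt( (x1 − y1)² + (x2 − y2)² + ... + (xn − yn)² )
where n is the number of dimensions and xk, yk are the k-th attributes of x and y.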
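The Euclidean distance between two points is the square root of the sum of the squared differences of their attributes. A minimal sketch, assuming the points are given as equal-length sequences of numbers; the function name and the sample points are chosen only for illustration:

```python
import math


def euclidean_distance(x, y):
    """Square root of the sum of squared attribute differences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of attributes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical 2-D points, not taken from the slides.
p1 = (0, 2)
p2 = (2, 0)
print(round(euclidean_distance(p1, p2), 3))  # 2.828
```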
If d(x, y) is the distance between two points, x and y, then the following properties hold.
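1. Positivity
   (a) d(x, y) ≥ 0 for all x and y,
   (b) d(x, y) = 0 only if x = y.
2. Symmetry: d(x, y) = d(y, x) for all x and y.
3. Triangle Inequality: d(x, z) ≤ d(x, y) + d(y, z) for all points x, y, and z.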
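Not every dissimilarity has all of these properties. For two sets A and B, if d(A, B) = size(A − B), where A − B is the set of elements of A that are not in B, then d does not satisfy the second part of the positivity property or the symmetry property. (The triangle inequality still holds, because A − C ⊆ (A − B) ∪ (B − C).)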
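The properties can be checked numerically on a few values. The sketch below uses Python's standard `math.dist` (available from Python 3.8) for the Euclidean distance and a hypothetical helper `set_difference_size` for d(A, B) = size(A − B). The sample points and sets are assumptions chosen for illustration, and passing on a handful of values is a sanity check, not a proof.

```python
import math


def set_difference_size(a, b):
    """d(A, B) = size(A - B): number of elements of A that are not in B."""
    return len(a - b)


# Hypothetical sample points.
p, q, r = (0, 2), (2, 0), (3, 1)

positivity = math.dist(p, q) >= 0 and math.dist(p, p) == 0
symmetry = math.dist(p, q) == math.dist(q, p)
triangle = math.dist(p, r) <= math.dist(p, q) + math.dist(q, r)
print(positivity, symmetry, triangle)  # True True True

# Hypothetical sample sets.
A = {1, 2, 3, 4}
B = {2, 3, 4}
print(set_difference_size(A, B))  # 1
print(set_difference_size(B, A))  # 0, although B != A
```

Here d(B, A) = 0 even though A and B differ, which breaks the second part of positivity, and d(A, B) ≠ d(B, A), which breaks symmetry.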
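Examples of proximity measures
● Similarity Measures for Binary Data
○ Jaccard Coefficient
■ J = f11 / (f01 + f10 + f11)
■ f11 - number of attributes where both objects are 1
■ f01, f10 - number of attributes where exactly one of the objects is 1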
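For binary data, the Jaccard coefficient counts the attributes where both objects are 1 and divides by the number of attributes where at least one of them is 1, so 0-0 matches are ignored. A minimal sketch, assuming the objects are equal-length lists of 0s and 1s; the function name, the sample vectors, and the choice to return 0.0 when both vectors are all zeros are assumptions made for illustration:

```python
def jaccard(x, y):
    """Jaccard coefficient of two equal-length binary vectors."""
    both_one = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    any_one = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    # Assumed convention: two all-zero vectors get similarity 0.0.
    return both_one / any_one if any_one else 0.0


# Hypothetical binary vectors.
x = [1, 0, 0, 1, 1, 0]
y = [1, 0, 1, 1, 0, 0]
print(jaccard(x, y))  # 0.5
```

Both vectors have a 1 in two positions, and four positions have a 1 in at least one vector, which gives 2/4 = 0.5.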
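Cosine similarity (Document similarity)
If x and y are two document vectors, then
cos(x, y) = (x · y) / (||x|| ||y||)
where x · y is the dot product of x and y, and ||x|| is the length (Euclidean norm) of x.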
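A = [1, 1, 0, 1]  # hypothetical document vector, assumed for illustration
B = [1, 0, 1, 1]  # hypothetical document vector, assumed for illustration
print("A:", A)
print("B:", B)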
Note:
Dividing x and y by their lengths normalizes them to have a length of 1. Cosine similarity therefore ignores the magnitude of the two vectors and depends only on the angle between them.
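Cosine similarity is the dot product of two vectors divided by the product of their lengths. A minimal sketch, assuming both document vectors are non-zero term-count lists of equal length; the function name and the sample vectors are chosen only for illustration:

```python
import math


def cosine_similarity(x, y):
    """Dot product of x and y divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Assumes neither vector is all zeros; otherwise this divides by zero.
    return dot / (norm_x * norm_y)


# Illustrative term-count vectors for two documents.
x = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
y = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]
print(round(cosine_similarity(x, y), 2))  # 0.31

# Scaling a vector changes its magnitude but not the cosine similarity.
x_scaled = [10 * a for a in x]
print(round(cosine_similarity(x_scaled, y), 2))  # 0.31
```

The second call shows that magnitude is not considered: making one document ten times longer leaves the result unchanged.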