K-Means Clustering
Introduction
• K-Means Clustering is an
Unsupervised Machine Learning algorithm,
which groups the unlabeled dataset into
different clusters.
• Unsupervised Machine Learning is the process of teaching a
computer to use unlabeled, unclassified data and enabling the
algorithm to operate on that data without supervision. Without
any prior training on labeled data, the machine’s job in this case is to
organize unsorted data according to similarities, patterns, and
differences.
• K-Means clustering assigns each data point to one of K
clusters depending on its distance from the centers of the
clusters. It starts by randomly placing the cluster
centroids in the space. Then each data point is assigned to
the cluster whose centroid is nearest to it.
After every point has been assigned to a cluster,
each centroid is recomputed as the mean of the points in
its cluster. This process runs iteratively until the clusters
stabilize, that is, until the centroids no longer move. In this
analysis we assume that the number of clusters is given in
advance and we have to put each point into one of the groups.
• A centroid is a point that represents the
center of the cluster (the mean), and it is
not necessarily a member of the dataset.
• In some cases, K is not clearly defined, and we have to
think about the optimal value of K. K-Means
clustering performs best when the data is well separated. When
data points overlap, this clustering is not suitable. K-Means
is faster than many other clustering
techniques. It produces tight clusters, with strong coupling
between the data points within each cluster. K-Means
clustering does not provide clear
information regarding the quality of the clusters. Different
initial assignments of the cluster centroids may lead to
different clusters. Also, the K-Means algorithm is sensitive
to noise. It may get stuck in local minima.
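The iterative procedure described above (place centroids, assign points, recompute centroids, repeat) can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function name k_means, its parameters n_iters and seed, and the toy data are all hypothetical choices made for this example. The initial centroids are drawn from the data points themselves, which is one common way of placing them at random.

```python
import numpy as np

def k_means(points, k, n_iters=100, seed=0):
    """Minimal k-means sketch. Returns (centroids, labels)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: initialise centroids as k distinct, randomly chosen data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]

    for _ in range(n_iters):
        # Assignment step: Euclidean distance from every point to every centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Update step: each centroid becomes the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])

        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    # Final assignment so that labels match the returned centroids.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    return centroids, labels

# Toy data (made up for illustration): two well-separated 2-D blobs.
data_rng = np.random.default_rng(42)
blob_a = data_rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = data_rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
data = np.vstack([blob_a, blob_b])

centroids, labels = k_means(data, k=2)
print(centroids)
print(labels)
```

Changing seed changes the initial centroids. On well-separated data like this toy example the resulting clusters should not depend on the seed, while on overlapping or noisy data different seeds may give different clusters, which is the sensitivity to initialization noted above.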
Objective of K-Means Clustering