
Cluster Analysis

Cluster analysis is a technique used to classify objects into homogeneous groups called clusters based on their similarities. Objects within each cluster are similar to each other and dissimilar to objects in other clusters. The key steps involve formulating the problem, selecting a distance measure and clustering procedure, deciding the number of clusters, interpreting and profiling clusters, and assessing validity. Common clustering procedures include hierarchical and non-hierarchical methods. Discriminant analysis develops a linear discriminant function to best discriminate between categorical dependent variables based on interval independent variables. It involves a linear combination of predictor variables with discriminant coefficients. Multi-group discriminant analysis handles problems with three or more dependent variable categories.

Uploaded by

awanish kumar

Cluster analysis

Cluster analysis is a class of techniques used to classify objects or cases into relatively
homogeneous groups called clusters. Objects in a cluster tend to be similar to each other
and dissimilar to objects in other clusters. Cluster analysis is
also called classification analysis.
The steps involved in cluster analysis are:
1. Formulate the problem
2. Select a distance measure
3. Select a clustering procedure
4. Decide on the number of clusters
5. Interpret and profile clusters
6. Assess the validity of clustering

1) The key part in formulating the problem is selecting the variables on
which the clustering is based. The selected variables should describe
the similarity between the objects in terms relevant to the research problem.
Including even one or two irrelevant variables can distort an otherwise useful solution.
2) As the objective of clustering is to group similar objects together, we need some
measure to assess how similar or different the objects are. The most common
approach is to measure similarity in terms of the distance between pairs of objects.
Objects with a smaller distance between them are considered more similar. The
most commonly used measure of similarity is the Euclidean distance. It is the square root
of the sum of the squared differences in values for each variable.
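
The Euclidean distance described in step 2 can be sketched in a few lines of Python. The function name and the respondent ratings are hypothetical, chosen only for illustration:

```python
import math

def euclidean_distance(a, b):
    """Square root of the sum of squared differences across variables."""
    if len(a) != len(b):
        raise ValueError("both objects must be measured on the same variables")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical respondents rated on three variables (illustrative values only).
respondent_a = [6, 4, 7]
respondent_b = [2, 1, 7]
print(euclidean_distance(respondent_a, respondent_b))  # 5.0
```

Because the variables are summed on their raw scales, a variable measured in large units can dominate the distance; standardizing the variables first is a common precaution.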
3) The clustering procedure can be hierarchical, non-hierarchical, or of some other type.
Hierarchical clustering develops a tree-like structure, or hierarchy. It can be
carried out by two procedures, agglomerative clustering and divisive
clustering:
a. Agglomerative clustering is a hierarchical clustering procedure where each object
starts out in a separate cluster. Clusters are formed by grouping objects into
bigger and bigger clusters.
b. Divisive clustering is a hierarchical clustering procedure where all objects start
out in one giant cluster. Clusters are formed by dividing this cluster into smaller
and smaller clusters.
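
A minimal sketch of the agglomerative procedure follows. The notes do not say how the distance between two clusters is measured, so this sketch assumes single linkage (distance between the closest members), and the data points are hypothetical:

```python
import math

def single_linkage(c1, c2):
    """Assumed linkage rule: distance between the two closest members."""
    return min(math.dist(p, q) for p in c1 for q in c2)

def agglomerate(points, n_clusters):
    # Each object starts out in a separate cluster.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters that are closest to each other.
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        # Merge the pair into one bigger cluster.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical objects measured on two variables.
points = [(1, 1), (1.5, 1), (5, 5), (5, 5.5), (9, 10)]
print(agglomerate(points, 3))
```

With these points the two tight pairs are merged first, leaving the isolated object (9, 10) in a cluster of its own. A divisive procedure would run in the opposite direction, starting from one cluster and splitting it.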
In non-hierarchical clustering, there are 3 methods:
a. Sequential threshold method is a non-hierarchical clustering procedure in which
a cluster center is selected and all objects within a prespecified threshold value
from the center are grouped together.
b. Parallel threshold method is a non-hierarchical clustering procedure that
specifies several cluster centers at once. All objects that are within a prespecified
threshold value from a center are grouped with the nearest center.
c. Optimizing partitioning method is a non-hierarchical clustering method that
allows for later reassignment of objects to clusters to optimize an overall
criterion.
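
As an illustration of the optimizing partitioning idea, here is a minimal k-means sketch. Treating k-means as the example, and seeding the centers with the first k objects, are assumptions made for this sketch, not something the notes prescribe:

```python
import math

def mean_point(points):
    """Mean of a list of points, computed variable by variable."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, k, max_iter=100):
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of objects")
    # Assumption: use the first k objects as initial cluster centers.
    centers = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        # Reassign every object to its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no object changed cluster, so the partition is stable
        assignment = new_assignment
        # Recompute each center as the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = mean_point(members)
    return assignment, centers

# Hypothetical objects measured on two variables.
points = [(1, 1), (5, 5), (1.2, 0.8), (5.2, 4.8), (0.8, 1.2), (4.8, 5.2)]
labels, centers = k_means(points, 2)
print(labels)
```

The reassignment step inside the loop is what distinguishes this from the threshold methods: an object placed in one cluster early on can move to another cluster later if that lowers its distance to the center.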
4) Decide on the number of clusters:
a. In hierarchical clustering, the distances at which clusters are combined can be
used as a criterion. This information can be obtained from the agglomeration schedule or
from the dendrogram.
b. In non-hierarchical clustering, the ratio of total within-group variance to
between-group variance can be plotted against the number of clusters. The point
at which an elbow or a sharp bend occurs indicates an appropriate number of
clusters. Increasing the number of clusters beyond this point is usually not
worthwhile.
c. The relative size of the clusters should be meaningful.
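
The ratio mentioned in point b can be computed for any candidate partition. The sketch below assumes "variance" is measured by sums of squared distances (within: object to its cluster centroid; between: cluster centroid to the overall centroid, weighted by cluster size). The data and labels are hypothetical:

```python
def centroid(points):
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def within_between_ratio(points, labels):
    """Within-group sum of squares divided by between-group sum of squares."""
    grand = centroid(points)
    within = 0.0
    between = 0.0
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        c = centroid(members)
        within += sum(sq_dist(p, c) for p in members)
        between += len(members) * sq_dist(c, grand)
    if between == 0:
        raise ValueError("need at least two clusters with distinct centroids")
    return within / between

# Hypothetical objects and two candidate partitions of them.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(within_between_ratio(points, [0, 0, 1, 1]))  # two clusters
print(within_between_ratio(points, [0, 1, 2, 2]))  # three clusters
```

Computing this ratio for partitions with 2, 3, 4, ... clusters and plotting it against the number of clusters gives the curve in which one looks for the elbow.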
5) Interpret and profile the clusters: this involves examining the cluster centroids. A
centroid represents the mean value of the objects contained in the cluster on each
of the variables. The centroids enable us to describe each cluster by assigning it a
name or label.
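
A short sketch of how centroids might be computed for profiling, once each object has a cluster label. The variable names, ratings, and labels are hypothetical:

```python
from collections import defaultdict

def cluster_centroids(rows, labels, variables):
    """Mean of each variable within each cluster."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return {
        label: {
            name: sum(r[i] for r in members) / len(members)
            for i, name in enumerate(variables)
        }
        for label, members in groups.items()
    }

# Hypothetical ratings on two variables and hypothetical cluster labels.
variables = ["price_sensitivity", "brand_loyalty"]
rows = [(6, 2), (7, 1), (2, 6), (1, 7)]
labels = [1, 1, 2, 2]
for label, profile in cluster_centroids(rows, labels, variables).items():
    print(label, profile)
```

Here cluster 1 averages high on the first variable and low on the second, and cluster 2 the reverse, which is the kind of contrast used to name the clusters.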
6) Assess reliability and validity:
a. Perform cluster analysis on the same data using different distance measures.
Compare the results across different measures.
b. Use different methods of clustering and compare the results.
c. Split the data randomly into two halves. Perform clustering on each half. Compare cluster
centroids across the two subsamples.
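
One way the split-half comparison in point c might be carried out is to pair each centroid from one half with its nearest centroid from the other half and inspect the distances. This matching rule is an assumption of the sketch, and the centroid values are hypothetical:

```python
import math

def match_centroids(centroids_a, centroids_b):
    """Pair each centroid from half A with its nearest centroid from half B."""
    pairs = []
    for a in centroids_a:
        nearest = min(centroids_b, key=lambda b: math.dist(a, b))
        pairs.append((a, nearest, math.dist(a, nearest)))
    return pairs

# Hypothetical centroids obtained by clustering each half separately.
half_a = [(1, 1), (5, 5)]
half_b = [(5, 4), (1, 2)]
for a, b, d in match_centroids(half_a, half_b):
    print(a, "matches", b, "at distance", d)
```

Small distances between matched centroids suggest that the two halves produce similar clusters; large distances suggest the solution is not stable.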

Discriminant Analysis:

Discriminant analysis is used to analyse market research data in which the dependent variable is
categorical and the independent variables are interval-scaled. One of the key
objectives of discriminant analysis is to develop a discriminant function.

Discriminant Function:
It is a linear combination of the independent variables that will best discriminate between the
categories of the dependent variable.

The discriminant analysis model involves a linear combination of the following form:

D = b_0 + b_1 X_1 + b_2 X_2 + ... + b_k X_k

where
D = discriminant score
b_0 = constant term
b_1, ..., b_k = discriminant coefficients or weights
X_1, ..., X_k = predictor or independent variables
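
The discriminant score defined above is a weighted sum, which can be sketched as follows. The coefficient and predictor values are hypothetical; in practice the coefficients are estimated from the data:

```python
def discriminant_score(b0, coefficients, predictors):
    """D = b0 + b1*X1 + b2*X2 + ... + bk*Xk."""
    if len(coefficients) != len(predictors):
        raise ValueError("need one coefficient per predictor")
    return b0 + sum(b * x for b, x in zip(coefficients, predictors))

# Hypothetical constant, weights, and predictor values for one respondent.
b0 = -2.0
b = [0.5, 0.25]
x = [4, 8]
print(discriminant_score(b0, b, x))  # 2.0
```

Each respondent gets one score per discriminant function, and the scores are then used to separate the categories of the dependent variable.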

Multi group discriminant analysis:

This is a discriminant analysis technique in which the criterion variable involves three or more
categories. The main distinction is that in two-group discriminant analysis it is possible to
derive only one discriminant function, whereas in multiple discriminant analysis more than
one function may be computed. With G groups and k predictors, at most min(G - 1, k) functions can be derived.
