Cluster Analysis - Market Segmentation

Cluster analysis is a technique used to segment a market. It involves several steps: 1. Standardize the dataset variables to ensure each has equal effect on cluster selection. 2. For more than two variables, decide the number of clusters and select initial cluster anchors. 3. Use a solver tool to iteratively assign data points to clusters to minimize the distance between points and their cluster anchor, finding the optimal cluster configuration. The number of clusters can be varied to determine the right number for the data.

Cluster Analysis – Market Segmentation

T RAMA NAMBI RAJAN || 16PGM36

Step I – Start with your data set

• Use a data set whose attributes are not nominal in scale, so that distances
between observations are meaningful.

Step II – If there are just two variables, use a scatter plot in Excel

• When the data set involves only two variables, a scatter plot is
sufficient to identify the clusters visually.

Step III – Decide the number of clusters

• If there are more than two variables, decide on the number of clusters.
• Then select the cluster anchors so that they are clearly differentiated from
one another.

Step IV – Standardize the attributes

• To do this, compute the mean and standard deviation (SD) of every attribute:
  Standardized value = (attribute value – attribute’s mean) / attribute’s SD
• Working with standardized values for each attribute ensures that your
analysis is unit-free and that each attribute has the same effect on your cluster
selection.
• The standardized value is known as the z-score.
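
The standardization formula above can be sketched in a few lines of Python. This is only an illustration: the attribute values are made up, and using the sample standard deviation (as Excel's STDEV does) is an assumption, since the text does not say which SD is meant.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize one attribute: (value - mean) / SD."""
    m = mean(values)
    sd = stdev(values)  # assumption: sample SD, as Excel's STDEV computes
    return [(v - m) / sd for v in values]

# Hypothetical attribute values for five observations (not from the source data).
incomes = [30.0, 40.0, 50.0, 60.0, 70.0]
z = z_scores(incomes)
print([round(x, 3) for x in z])
```

After standardization the attribute has mean 0 and SD 1, which is what makes attributes measured in different units comparable.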

Step V – Use Solver to find the optimal clusters

• Each observation in a cluster should be similar to its cluster anchor, and
every observation outside the cluster should differ from that cluster anchor.
• The initial cluster anchors can be chosen arbitrarily.

Example (from the book)


For each city in the data set, you can determine the squared distance (using
z-scores) from each of the four cluster anchors. You then assign each
city the squared distance to its closest anchor and set your Solver target cell
equal to the sum of these squared distances.
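
As a rough sketch of this target quantity, the Python below sums, over all points, the squared distance to the closest anchor. The points are hypothetical two-attribute z-score rows, not the city data from the book, and the function names are invented for illustration.

```python
def squared_distance(a, b):
    """Sum of squared differences between two rows of z-scores."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def total_squared_distance(points, anchors):
    """Sum over all points of the squared distance to the closest anchor."""
    return sum(min(squared_distance(p, a) for a in anchors) for p in points)

# Hypothetical z-score rows (two attributes each).
points = [(-1.0, -1.0), (-0.8, -1.2), (1.0, 0.9), (1.2, 1.1)]
anchors = [points[0], points[2]]  # anchors are chosen among the data points
total = total_squared_distance(points, anchors)
print(round(total, 3))
```

A smaller total means the points sit closer to their anchors, which is why this is the cell Solver is asked to minimize.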
To begin, set up a way to “look up” the z-scores for candidate cluster anchors:
• In H5:H8 enter “trial values” for the cluster anchors. Each of these values can
be any integer between 1 and 49. For simplicity you can let the four trial
anchors be cities 1–4.
• Name the range A9:N58 Lookup. Then, in G5, look up the name of the
first cluster anchor with the formula =VLOOKUP(H5,Lookup,2).
• Copy this formula to G6:G8 to identify the name of each cluster anchor
candidate.
• In I5:N8 identify the z-scores for each cluster anchor candidate by copying
the formula =VLOOKUP($H5,Lookup,I$3) from I5 to I5:N8.
[Figure: Looking up z-scores for cluster anchors]
• You can now compute the squared distance from each city to each cluster
anchor candidate.
• To compute the squared distance from city 1 (Albuquerque) to cluster anchor
candidate 1, enter in O10 the formula =SUMXMY2($I$5:$N$5,$I10:$N10).
This Excel function computes the following:
(I5-I10)^2+(J5-J10)^2+(K5-K10)^2+(L5-L10)^2+(M5-M10)^2+(N5-N10)^2
• To compute the squared distance of Albuquerque from the second cluster
anchor, enter the same formula in P10 with each 5 changed to a 6. Similarly,
in Q10 change each 5 to a 7.
• Finally, in R10 change each 5 to an 8.
• Copy from O10:R10 to O11:R58 to compute the squared distance of each
city from each cluster anchor.
[Figure: Computing squared distances from cluster anchors]
• In S10:S58 compute the squared distance from each city to the “closest” cluster
anchor by entering the formula =MIN(O10:R10) in cell S10 and copying it
to the cell range S11:S58.
• In S8 compute the sum of squared distances of all cities from their closest
cluster anchor with the formula =SUM(S10:S58).
• In T10:T58 compute the cluster to which each city is assigned by entering in
T10 the formula =MATCH(S10,O10:R10,0) and copying this formula to
T11:T58. This formula identifies which element in columns O:R gives the
smallest squared distance to the city.
• Use the Solver window, as shown in the figure, to find the optimal cluster
anchors for the four clusters, with S8 as the target cell to minimize.
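
The worksheet logic above can be mirrored in Python as a sketch. Two things here are assumptions rather than part of the source: the six data points are hypothetical z-score rows, and an exhaustive search over every possible set of anchors stands in for Excel's Solver, which is only practical for small data sets. Anchor positions are 0-based list indices here, unlike the 1-based city numbers in H5:H8.

```python
from itertools import combinations

def sumxmy2(a, b):
    """Sum of squared differences, like Excel's SUMXMY2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, anchor_ids):
    """Return (total, clusters), mirroring cell S8 and column T."""
    total, clusters = 0.0, []
    for p in points:
        dists = [sumxmy2(points[i], p) for i in anchor_ids]  # columns O:R
        closest = min(dists)                                  # column S (MIN)
        clusters.append(dists.index(closest) + 1)             # column T (MATCH, 1-based)
        total += closest                                      # accumulates into S8
    return total, clusters

def best_anchors(points, k):
    """Try every set of k anchor rows and keep the one with the smallest total."""
    return min(combinations(range(len(points)), k),
               key=lambda ids: assign(points, ids)[0])

# Hypothetical z-score rows forming two visible groups.
points = [(-1.0, -1.0), (-0.8, -1.2), (-1.1, -0.9),
          (1.0, 0.9), (1.2, 1.1), (0.9, 1.0)]
ids = best_anchors(points, 2)
total, clusters = assign(points, ids)
print(ids, clusters, round(total, 3))
```

With 49 cities and four anchors an exhaustive search is still feasible (about 212,000 combinations), but for larger problems a heuristic optimizer such as Solver is the more realistic choice.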

Determining the right number of clusters:

Rerun Solver while varying the number of clusters, one at a time, and compare
the results to identify the optimal number of clusters.
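
One way to compare different cluster counts is to record the best achievable sum of squared distances for each count. The sketch below does this on hypothetical data, again using exhaustive search as a stand-in for Solver. Looking for the point where the total stops dropping sharply is a common heuristic, not a rule stated in the source.

```python
from itertools import combinations

def sumxmy2(a, b):
    """Sum of squared differences, like Excel's SUMXMY2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def min_total(points, k):
    """Smallest total squared distance using k anchors chosen among the points."""
    return min(
        sum(min(sumxmy2(points[i], p) for i in ids) for p in points)
        for ids in combinations(range(len(points)), k)
    )

# Hypothetical z-score rows forming two visible groups.
points = [(-1.0, -1.0), (-0.8, -1.2), (-1.1, -0.9),
          (1.0, 0.9), (1.2, 1.1), (0.9, 1.0)]

totals = {k: min_total(points, k) for k in range(1, 5)}
for k, t in totals.items():
    print(k, round(t, 3))
```

The total can never increase when another anchor is added, so the useful signal is how much it falls: here the large drop comes from one cluster to two, and further clusters add little.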
