Paper GE. 404: Business Research Methods


Dr. Bappaditya Biswas
Module I

Topic: Scaling Techniques and Questionnaire Design

1. Concept of Measurement

Measurement means assigning numbers or other symbols to characteristics of objects according to
certain pre-specified rules. In our daily life we are said to measure when we use some yardstick to
determine weight, height, or some other feature of a physical object. We also measure when we
judge how well we like a song, a painting or the personalities of our friends. We, thus, measure
physical objects as well as abstract concepts. Measurement is a relatively complex and demanding
task, especially when it concerns qualitative or abstract phenomena. By measurement we mean
the process of assigning numbers to objects or observations, the level of measurement being a
function of the rules under which the numbers are assigned.
It is easy to assign numbers in respect of properties of some objects, but it is relatively difficult in
respect of others. For instance, measuring such things as social conformity, intelligence, or marital
adjustment is much less obvious and requires much closer attention than measuring physical weight,
biological age or a person’s financial assets. In other words, properties like weight, height, etc., can
be measured directly with some standard unit of measurement, but it is not that easy to measure
properties like motivation to succeed, ability to stand stress and the like. We can expect high accuracy in
measuring the length of pipe with a yard stick, but if the concept is abstract and the measurement
tools are not standardized, we are less confident about the accuracy of the results of measurement.

2. Measurement Scales
From what has been stated above, scales of measurement can be considered in
terms of their mathematical properties. The most widely used classification of measurement scales
is: (a) nominal scale; (b) ordinal scale; (c) interval scale; and (d) ratio scale.

(a) Nominal scale:

A nominal scale is simply a system of assigning number symbols to events in order to label them.
The usual example of this is the assignment of numbers to basketball players in order to identify
them. Such numbers cannot be considered to be associated with an ordered scale, because their order
is of no consequence; the numbers are just convenient labels for the particular class of events and
as such have no quantitative value. Nominal scales provide convenient ways of keeping track of
people, objects and events. One cannot do much with the numbers involved.

For example, one cannot usefully average the numbers on the back of a group of football players
and come up with a meaningful value. Neither can one usefully compare the numbers assigned to
one group with the numbers assigned to another. The counting of members in each group is the
only possible arithmetic operation when a nominal scale is employed. Accordingly, we are
restricted to using the mode as the measure of central tendency. There is no generally used measure of
dispersion for nominal scales. The chi-square test is the most common test of statistical significance
that can be utilized, and for measures of correlation, the contingency coefficient can be
worked out.
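The statistics permitted for nominal data can be illustrated with a short Python sketch. The response codes and the 2x2 cross-tabulation below are hypothetical, and the contingency coefficient is computed with the usual formula C = sqrt(chi2 / (chi2 + n)), which the text names but does not spell out.

```python
from collections import Counter
from math import sqrt

# Hypothetical nominal data: 1 = "Yes", 2 = "No" (the codes are labels only).
responses = [1, 2, 1, 1, 2, 1, 2, 1]

# Counting category members is the only valid arithmetic on nominal data.
counts = Counter(responses)
mode_code = counts.most_common(1)[0][0]  # the most frequent category


def chi_square(table):
    """Return (Pearson chi-square statistic, sample size) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


# Hypothetical cross-tabulation: rows = two groups, columns = Yes / No.
table = [[30, 10], [20, 40]]
chi2, n = chi_square(table)
contingency_coefficient = sqrt(chi2 / (chi2 + n))

print(counts, mode_code, round(chi2, 3), round(contingency_coefficient, 3))
```

Swapping the codes 1 and 2 would change none of these results apart from the label reported as the mode, which is the sense in which the numbers are mere labels.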

Nominal scale is the least powerful level of measurement. It indicates no order or distance
relationship and has no arithmetic origin. A nominal scale simply describes differences between
things by assigning them to categories. Nominal data are, thus, counted data. The scale wastes any
information that we may have about varying degrees of attitude, skills, understandings, etc. In spite
of all this, nominal scales are still very useful and are widely used in surveys and other ex-post-facto
research when data are being classified by major sub-groups of the population.

Characteristics
• It has no arithmetic origin.
• It shows no order or distance relationship.
• It distinguishes things by putting them into various groups.
Use
This scale is generally used in conducting surveys and ex-post-facto research.
Example:
Have you ever visited Bangalore?
Yes-1
No-2
'Yes' is coded as 'One' and 'No' is coded as 'Two'. The numbers attached to the answers have no
meaning and are mere identification. If the numbers are interchanged, with one for 'No' and two for
'Yes', it won't affect the answers given by respondents. The numbers used in nominal scales serve
only the purpose of labelling and counting.
Telephone numbers are an example of a nominal scale, where one number is assigned to one
subscriber. The idea of using a nominal scale here is to make sure that no two persons or objects receive
the same number. Similarly, bus route numbers are an example of a nominal scale. By contrast, a
question such as "How old are you?" does not yield nominal data: age in years has a true zero and is
measured on a ratio scale.

Limitations
• There is no rank ordering.

• No arithmetic operation other than counting is possible.

• Statistical implication: calculation of the standard deviation and the mean is not possible. It
is possible to express the mode.

(b) Ordinal scale:

The lowest level of the ordered scale that is commonly used is the ordinal scale. The ordinal scale
places events in order, but there is no attempt to make the intervals of the scale equal in terms of
some rule. Rank orders represent ordinal scales and are frequently used in research relating to
qualitative phenomena. A student’s rank in his graduation class involves the use of an ordinal
scale. One has to be very careful in making statements about scores based on ordinal scales.
For instance, if Ram’s position in his class is 10 and Mohan’s position is 40, it cannot be said
that Ram’s position is four times as good as that of Mohan. The statement would make no sense
at all. Ordinal scales only permit the ranking of items from highest to lowest. Ordinal measures
have no absolute values, and the real differences between adjacent ranks may not be equal. All
that can be said is that one person is higher or lower on the scale than another, but more precise
comparisons cannot be made.
Thus, the use of an ordinal scale implies a statement of ‘greater than’ or ‘less than’ (an equality
statement is also acceptable) without our being able to state how much greater or less. The real
difference between ranks 1 and 2 may be more or less than the difference between ranks 5 and 6.
Since the numbers of this scale have only a rank meaning, the appropriate measure of central
tendency is the median. A percentile or quartile measure is used for measuring dispersion.
Correlations are restricted to various rank order methods. Measures of statistical significance are
restricted to the non-parametric methods.
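A minimal Python sketch of the statistics appropriate to ordinal data follows. The rankings and class positions are hypothetical, and the rank correlation uses Spearman's formula for rankings without ties, which is one of the "rank order methods" the text refers to rather than the only one.

```python
from statistics import median, quantiles

# Hypothetical ranks given to five brands by two respondents (1 = most preferred).
ranks_a = [1, 2, 3, 4, 5]
ranks_b = [2, 1, 4, 3, 5]


def spearman_rho(x, y):
    """Spearman rank correlation for two rankings of the same items, no ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(ranks_a, ranks_b)

# Hypothetical class positions: the median and quartiles use only the order.
positions = [3, 7, 10, 15, 22, 31, 40]
middle = median(positions)
q1, q2, q3 = quantiles(positions, n=4)

print(rho, middle, (q1, q2, q3))
```

Neither the median nor the rank correlation relies on the distance between adjacent ranks, which is why they remain valid when those distances are unknown.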

Characteristics

• The ordinal scale ranks things from the highest to the lowest.

• Such scales are not expressed in absolute terms.

• The difference between adjacent ranks is not always equal.

• For measuring central tendency, the median is used.

• For measuring dispersion, percentiles or quartiles are used.

Example:

The respondents may be given a list of suitable brands and asked to rank them in order of
preference, which yields an ordinal scale:

o Lux

o Liril

o Cinthol

o Lifebuoy

o Park Avenue

(c) Interval scale:


In the case of an interval scale, the intervals are adjusted in terms of some rule that has been
established as a basis for making the units equal. The units are equal only in so far as one accepts
the assumptions on which the rule is based. Interval scales can have an arbitrary zero, but it is not
possible to determine for them what may be called an absolute zero or the unique origin. The
primary limitation of the interval scale is the lack of a true zero; it does not have the capacity to
measure the complete absence of a trait or characteristic. The Fahrenheit scale is an example of
an interval scale and shows what one can and cannot do with it. One can say that
an increase in temperature from 30° to 40° involves the same increase in temperature as an
increase from 60° to 70°, but one cannot say that the temperature of 60° is twice as warm as the
temperature of 30°, because both numbers depend on the fact that the zero on the scale is
set arbitrarily and does not mark the complete absence of heat. The ratio of the two temperatures,
30° and 60°, means nothing because zero is an arbitrary point.
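The point about ratios can be checked numerically. The sketch below converts the same temperatures to Celsius (another interval scale with a different arbitrary zero): equal differences stay equal, but the ratio of 60° to 30° changes completely. The function name is illustrative only.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit reading to Celsius."""
    return (f - 32) * 5 / 9


# Equal intervals in Fahrenheit remain equal after conversion.
diff_low = fahrenheit_to_celsius(40) - fahrenheit_to_celsius(30)
diff_high = fahrenheit_to_celsius(70) - fahrenheit_to_celsius(60)

# Ratios do not survive a change of zero point.
ratio_f = 60 / 30
ratio_c = fahrenheit_to_celsius(60) / fahrenheit_to_celsius(30)

print(diff_low, diff_high)  # the two differences agree
print(ratio_f, ratio_c)     # the two ratios do not
```

Because the ratio depends on where the zero happens to be placed, "twice as warm" has no meaning on an interval scale.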

Interval scales provide more powerful measurement than ordinal scales for interval scale also
incorporates the concept of equality of interval. As such more powerful statistical measures can
be used with interval scales. Mean is the appropriate measure of central tendency, while standard
deviation is the most widely used measure of dispersion. Product moment correlation techniques
are appropriate and the generally used tests for statistical significance are the ‘t’ test and ‘F’ test.

Characteristics

• Interval scales have no absolute zero. It is set arbitrarily.

• For measuring central tendency, the mean is used.

• For measuring dispersion, the standard deviation is used.

• For tests of significance, the t-test and F-test are used.

• The scale is based on the equality of intervals.

Use

Most of the common statistical methods of analysis require only interval scales in order that they
might be used. These are not recounted here because they are so common and can be found in
virtually all basic texts on statistics.

Interval scales may be either numeric or semantic.

Suppose we want to rate a refrigerator using an interval scale by scoring it on a
scale of 5 down to 1 (i.e. 5 = Excellent; 1 = Poor) on each of the criteria listed. Circle the appropriate
score on each line.

Brand name 5 4 3 2 1

Price 5 4 3 2 1

After Sale Service 5 4 3 2 1

Utility 5 4 3 2 1

Attractive Design 5 4 3 2 1

The researcher cannot conclude that the respondent who gives a rating of 4 is 2 times more
favourable towards a product under study than another respondent who awards a rating of 2.

Please indicate your views on Jack Olive Oil by ticking the appropriate responses below:
Excellent Very Good Good Fair Poor

Succulent

Freshness

Freedom from skin blemish

Value for money

Attractiveness of packaging

(d) Ratio scale:

The highest level of measurement is a ratio scale. This has the properties of an interval scale
together with a fixed origin or zero point. Examples of variables which are ratio scaled include
weights, lengths and times. Ratio scales permit the researcher to compare both differences in scores
and the relative magnitude of scores. For instance the difference between 5 and 10 minutes is the
same as that between 10 and 15 minutes, and 10 minutes is twice as long as 5 minutes.

Ratio scales have an absolute or true zero of measurement. The term ‘absolute zero’ is not as precise
as it was once believed to be. We can conceive of an absolute zero of length and similarly we can
conceive of an absolute zero of time. For example, the zero point on a centimeter scale indicates the
complete absence of length or height. But an absolute zero of temperature is theoretically
unobtainable and it remains a concept existing only in the scientist’s mind. The number of minor
traffic-rule violations and the number of incorrect letters in a page of type script represent scores on
ratio scales. Both these scales have absolute zeros and as such all minor traffic violations and all
typing errors can be assumed to be equal in significance. With ratio scales involved one can make
statements like "Jyoti's typing performance was twice as good as that of Reetu." The ratio
involved does have significance and facilitates a kind of comparison which is not possible in the case of
an interval scale.

Ratio scale represents the actual amounts of variables. Measures of physical dimensions such as
weight, height, distance, etc. are examples. Generally, all statistical techniques are usable with ratio

scales and all manipulations that one can carry out with real numbers can also be carried out with
ratio scale values. Multiplication and division can be used with this scale but not with other scales
mentioned above. Geometric and harmonic means can be used as measures of central tendency and
coefficients of variation may also be calculated.
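As a brief illustration, the sketch below computes the geometric mean, harmonic mean and coefficient of variation for a few hypothetical durations using Python's standard `statistics` module. The data are invented, and the coefficient of variation is taken here as the population standard deviation divided by the mean.

```python
from statistics import geometric_mean, harmonic_mean, mean, pstdev

# Hypothetical ratio-scaled data: task durations in minutes.
times = [5.0, 10.0, 20.0]

gm = geometric_mean(times)        # cube root of 5 * 10 * 20
hm = harmonic_mean(times)         # 3 / (1/5 + 1/10 + 1/20)
cv = pstdev(times) / mean(times)  # spread relative to the mean

# Ratios are meaningful: 10 minutes is twice as long as 5 minutes.
ratio = times[1] / times[0]

print(gm, hm, cv, ratio)
```

These measures involve multiplication and division of the scores, so they presuppose a true zero and are not appropriate for nominal, ordinal or interval data.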

Thus, proceeding from the nominal scale (the least precise type of scale) to the ratio scale (the most
precise), increasingly more relevant information is obtained. If the nature of the variables permits, the
researcher should use the scale that provides the most precise description. Researchers in the physical
sciences have the advantage of describing variables in ratio scale form, but the behavioural sciences
are generally limited to describing variables in interval scale form, a less precise type of measurement.
Characteristics

• This scale has an absolute zero of measurement.

• For measuring central tendency, geometric and harmonic means can be used.

Use:
Ratio scale can be used in all statistical techniques.
Example: Sales this year for product A are twice the sales of the same product last year.

Summary
Type | Basic empirical operation | Typical usage | Typical statistics: descriptive | Typical statistics: inferential
1. Nominal | Determination of equality (0, 1, 2, 9) | Classification: male-female, purchaser/non-purchaser, Team A/Team B | Percentage, mode | Chi-square, binomial test
2. Ordinal | Determination of greater or less (0<1<2…<9) | Rankings: preference data, market position, attitude measures, many psychological measures | Percentile, median | Rank-order correlation
3. Interval | Determination of equality of intervals (2-1 = 7-6) | Index numbers, attitude measures | Mean, range, standard deviation | Product-moment correlation
4. Ratio | Determination of equality of ratios (2/4 = 4/8) | Sales, units produced, number of customers, costs, age | Geometric mean | Coefficient of variation

3. Scaling

In research we quite often face a measurement problem (since we want a valid measurement but may
not obtain it), especially when the concepts to be measured are complex and abstract and we do not
possess standardised measurement tools. In other words, while measuring attitudes
and opinions, we face the problem of their valid measurement.

Scaling is a process or set of procedures, which is used to assess the attitude of an individual.
Scaling is defined as the assignment of objects to numbers according to a rule. The objects in the
definition are text statements, which can be statements of attitude or principle. The attitude of an
individual is not measured directly by scaling. It is first translated into statements, and then numbers
are assigned to them.

Meaning of Scaling

Scaling describes the procedures of assigning numbers to various degrees of opinion, attitude and
other concepts. This can be done in two ways viz., (i) making a judgement about some characteristic
of an individual and then placing him directly on a scale that has been defined in terms of that
characteristic and (ii) constructing questionnaires in such a way that the score of individual’s responses
assigns him a place on a scale.

It may be stated here that a scale is a continuum, consisting of the highest point (in terms of some
characteristic e.g., preference, favourableness, etc.) and the lowest point along with several
intermediate points between these two extreme points. These scale-point positions are so related to
each other that when the first point happens to be the highest point, the second point indicates a
higher degree in terms of a given characteristic as compared to the third point and the third point
indicates a higher degree as compared to the fourth and so on. Numbers for measuring the
distinctions of degree in the attitudes/opinions are, thus, assigned to individuals corresponding to
their scale-positions. All this is better understood when we talk about scaling technique(s). Hence
the term ‘scaling’ is applied to the procedures for attempting to determine quantitative measures of
subjective abstract concepts.

Scaling has been defined as a “procedure for the assignment of numbers (or other symbols) to a
property of objects in order to impart some of the characteristics of numbers to the properties in

question.”

4. Different Scaling Techniques


We now take up some of the important scaling techniques often used in research,
especially in social or business research.

(i) Rating scales:

The rating scale involves qualitative description of a limited number of aspects of a thing or of
traits of a person. When we use rating scales (or categorical scales), we judge an object in absolute
terms against some specified criteria i.e., we judge properties of objects without reference to other
similar objects. These ratings may be in such forms as “like-dislike”, “above average, average, below
average”, or other classifications with more categories such as “like very much—like
somewhat—neutral—dislike somewhat—dislike very much”; “excellent—good—average—below
average—poor”, “always—often—occasionally—rarely—never”, and so on. There is no specific
rule on whether to use a two-point scale, a three-point scale or a scale with still more points. In practice,
three- to seven-point scales are generally used, for the simple reason that more points on a scale
provide an opportunity for greater sensitivity of measurement.

The following rating scales are often used in organizational research:

a. Dichotomous scale
b. Category scale
c. Likert scale
d. Semantic differential scale
e. Numerical scale
f. Itemized rating scale
g. Fixed or constant sum rating scale
h. Stapel scale
i. Graphic rating scale
j. Consensus scale

Other scales such as the Thurstone Equal Appearing Interval Scale, and the Multidimensional
Scale are less frequently used. We will briefly describe each of the above attitudinal scales.

a. Dichotomous Scale
The dichotomous scale is used to elicit a Yes or No answer, as in the
example below. Note that a nominal scale is used to elicit the response.

Example
Do you own a car? Yes No

b. Category Scale
The category scale uses multiple items to elicit a single response as per the
following example. This also uses the nominal scale.

Example
Where in northern California do you reside?
North Bay
South Bay
East Bay
Peninsula
Other

c. Likert Scale
The Likert scale is designed to examine how strongly subjects agree or
disagree with statements on a 5-point scale with the following anchors:

Strongly Disagree = 1; Disagree = 2; Neither Agree Nor Disagree = 3; Agree = 4; Strongly Agree = 5

The responses over a number of items tapping a particular concept or
variable (as per the following example) are then summated for every
respondent. This is an interval scale and the differences in the responses
between any two points on the scale remain the same.
Example
Using the preceding Likert scale, state the extent to which you agree
with each of the following statements:
My work is very interesting 1 2 3 4 5
I am not engrossed in my work all day 1 2 3 4 5
Life without my work will be dull 1 2 3 4 5
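The summation step can be sketched in Python as follows. The respondent's answers are hypothetical. Reverse-scoring the negatively worded second item is a common practice that the text itself does not mention, so it is shown as an optional assumption through the illustrative `reverse_items` parameter.

```python
# Hypothetical answers of one respondent (1 = Strongly Disagree ... 5 = Strongly Agree).
answers = {
    "My work is very interesting": 4,
    "I am not engrossed in my work all day": 2,
    "Life without my work will be dull": 5,
}

# Assumption: negatively worded items may be reverse-scored before summation.
NEGATIVELY_WORDED = {"I am not engrossed in my work all day"}


def summated_score(item_scores, reverse_items=frozenset(), points=5):
    """Sum the item scores, optionally reversing the listed items (1<->5, 2<->4)."""
    total = 0
    for item, score in item_scores.items():
        if not 1 <= score <= points:
            raise ValueError(f"score out of range for item: {item}")
        total += (points + 1 - score) if item in reverse_items else score
    return total


raw_total = summated_score(answers)                          # plain summation
adjusted_total = summated_score(answers, NEGATIVELY_WORDED)  # with reverse-scoring

print(raw_total, adjusted_total)
```

With three 5-point items the summated score ranges from 3 to 15, and it is this total, not any single item, that is treated as the respondent's score on the concept.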

d. Semantic Differential Scale

Several bipolar attributes are identified at the extremes of the scale, and
respondents are asked to indicate their attitudes, on what may be called a
semantic space, toward a particular individual, object, or event on each of the attributes. The
bipolar adjectives used, for instance, would employ such terms as Good–Bad;
Strong–Weak; Hot–Cold. The semantic differential scale is used to assess respondents'
attitudes toward a particular brand, advertisement, object, or individual. The responses can be
plotted to obtain a good idea of their perceptions. This is treated as an interval scale. An
example of the semantic differential scale follows.
Example
Responsive — — — — — — — Unresponsive
Beautiful — — — — — — — Ugly
Courageous — — — — — — — Timid

e. Numerical Scale
The numerical scale is similar to the semantic differential scale, with the
difference that numbers on a 5-point or 7-point scale are provided, with
bipolar adjectives at both ends, as illustrated below. This is also an interval
scale.

Example: How pleased are you with your new real estate agent?
Extremely Pleased 7 6 5 4 3 2 1 Extremely Displeased

f. Itemized Rating Scale


A 5-point or 7-point scale with anchors, as needed, is provided for each
item and the respondent states the appropriate number on the side of each
item, or circles the relevant number against each item, as per the examples
that follow. The responses to the items are then summated. This uses an interval
scale.

Example (i) Respond to each item using the scale below, and indicate your response number
on the line by each item.

1 = Very Unlikely; 2 = Unlikely; 3 = Neither Unlikely Nor Likely; 4 = Likely; 5 = Very Likely

1. I will be changing my job within the next 12 months. —
2. I will take on new assignments in the near future. —
3. It is possible that I will be out of this organization within the next 12 months. —
Note that the above is a balanced rating scale with a neutral point.

Example: (ii) Circle the number that is closest to how you feel for the item
below.

1 = Not at All Interested; 2 = Somewhat Interested; 3 = Moderately Interested; 4 = Very Much Interested

How would you rate your interest in changing current organizational policies? 1 2 3 4
This is an unbalanced rating scale which does not have a neutral point.

g. Fixed or Constant Sum Scale


The respondents are here asked to distribute a given number of points across various items
as per the example below. This is more in the nature of an ordinal scale.
Example
In choosing a toilet soap, indicate the importance you attach to each of the following five aspects
by allotting points for each to total 100 in all.
Fragrance —
Color —
Shape —
Size —
Texture of lather —
Total points 100
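When constant sum responses are processed, each allocation first has to be checked against the fixed total. The following Python sketch does this for a hypothetical respondent. The function name and the conversion to shares are illustrative choices, not part of the scale's definition.

```python
def constant_sum_shares(allocation, total=100):
    """Validate a constant sum response and return each item's share of the total."""
    if any(points < 0 for points in allocation.values()):
        raise ValueError("points must not be negative")
    if sum(allocation.values()) != total:
        raise ValueError(f"points must add up to {total}")
    return {item: points / total for item, points in allocation.items()}


# Hypothetical response to the toilet soap question.
allocation = {
    "Fragrance": 40,
    "Color": 10,
    "Shape": 10,
    "Size": 15,
    "Texture of lather": 25,
}

shares = constant_sum_shares(allocation)
most_important = max(shares, key=shares.get)

print(shares, most_important)
```

Because every point given to one aspect is taken from another, the allocations are interdependent, and a response that does not total 100 cannot be interpreted.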

h. Stapel Scale
This scale simultaneously measures both the direction and intensity of the attitude toward
the items under study. The characteristic of interest to the study is placed at the center, with a
numerical scale ranging, say, from +3 to –3, on either side of the item as illustrated below.
This gives an idea of how close or distant the individual response to the stimulus is, as
shown in the example below. Since this does not have an absolute zero point, this is an
interval scale.

State how you would rate your supervisor’s abilities with respect to each of
the characteristics mentioned below, by circling the appropriate number.

+3 +3 +3
+2 +2 +2
+1 +1 +1
Adopting Modern Product Interpersonal
Technology Innovation Skills
–1 –1 –1
–2 –2 –2
–3 –3 –3

i. The graphic rating scale

It is quite simple and is commonly used in practice. Under it the various points are usually put
along the line to form a continuum and the rater indicates his rating by simply making a mark
(such as ✓) at the appropriate point on a line that runs from one extreme to the other. Scale-
points with brief descriptions may be indicated along the line, their function being to assist the
rater in performing his job. The following is an example of five-points graphic rating scale when
we wish to ascertain people’s liking or disliking any product:

How do you like the product? (Please check)

Like very much    Like somewhat    Neutral    Dislike somewhat    Dislike very much

j. Consensus Scale

Scales are also developed by consensus, where a panel of judges selects certain items,
which in its view measure the relevant concept. The items are chosen particularly based on
their pertinence or relevance to the concept. Such a consensus scale is developed after the
selected items are examined and tested for their validity and reliability. One such
consensus scale is the Thurstone Equal Appearing Interval Scale, where a concept is
measured by a complex process followed by a panel of judges. Using a pile of cards
containing several descriptions of the concept, a panel of judges offers inputs to indicate
how close or not the statements are to the concept under study. The scale is then
developed based on the consensus reached. However, this scale is rarely used for measuring
organizational concepts because of the time necessary to develop it.

(ii) Ranking scales:

Under ranking scales (or comparative scales) we make relative judgements against other
similar objects. The respondents under this method directly compare two or more objects and
make choices among them. Ranking scales are used to tap
preferences between two or among more objects or items (ordinal in nature). However,
such ranking may not give definitive clues to some of the answers sought. For instance,
let us say there are four product lines and the manager seeks information that would
help decide which product line should get the most attention. Let us also assume that
35% of the respondents choose the first product, 25% the second, and 20% choose
each of products three and four as of importance to them. The manager cannot then
conclude that the first product is the most preferred, since 65% of the respondents did not
choose that product! The three generally used approaches to ranking scales, discussed
below, are paired comparison, forced choice, and the method of rank order.

a. Paired Comparison

The paired comparison scale is used when, among a small number of objects, respondents
are asked to choose between two objects at a time. This helps to assess preferences. If,
for instance, in the previous example, during the paired comparisons, respondents
consistently show a preference for product one over products two, three, and four, the
manager reliably understands which product line demands his utmost attention. However,
as the number of objects to be compared increases, so does the number of paired
comparisons. The number of paired choices for n objects will be n(n–1)/2. The greater the
number of objects or stimuli, the greater the number of paired comparisons presented to
the respondents, and the greater the respondent fatigue. Hence paired comparison is a
good method if the number of stimuli presented is small.
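The growth in the number of pairs is easy to verify. The Python sketch below enumerates every pair with the standard library's `itertools.combinations` and compares the count with n(n–1)/2. The product names are placeholders.

```python
from itertools import combinations


def paired_comparisons(objects):
    """Return every unordered pair of objects that a respondent must compare."""
    return list(combinations(objects, 2))


def number_of_pairs(n):
    """Number of paired comparisons for n objects: n(n-1)/2."""
    return n * (n - 1) // 2


products = ["Product 1", "Product 2", "Product 3", "Product 4"]
pairs = paired_comparisons(products)

for first, second in pairs:
    print(f"Which do you prefer: {first} or {second}?")

print(len(pairs), number_of_pairs(10), number_of_pairs(20))
```

Four product lines need only a handful of comparisons, but the count rises roughly with the square of the number of objects, which is the source of the respondent fatigue mentioned above.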

b. Forced Choice

The forced choice enables respondents to rank objects relative to one another among the
alternatives provided. This is easier for the respondents, particularly if the number of
choices to be ranked is limited in number.

Rank the following magazines that you would like to subscribe to in the order of
preference, assigning 1 for the most preferred choice and 5 for the least preferred.

Fortune —
India Today —
Time —
People —
Nature —

c. Method of rank order

Under this method of comparative scaling, the respondents are asked to rank their choices.
This method is easier and faster than the method of paired comparisons stated above. For
example, with 10 items it takes 45 pair comparisons to complete the task, whereas the
method of rank order simply requires ranking of 10 items only. The problem of transitivity
(such as A is preferred to B, B to C, but C is preferred to A) also does not arise when we adopt the method
of rank order. Moreover, a complete ranking at times is not needed, in which case the
respondents may be asked to rank only their first, say, four choices while the number of
overall items involved may be more than four, say, 15 or 20 or more. To secure a
simple ranking of all items involved we simply total the rank values received by each item.
There are methods through which we can as well develop an interval scale from these data.
But then there are limitations of this method. The first one is that data obtained through this
method are ordinal data and hence rank ordering is an ordinal scale with all its limitations.
Then there may be the problem of respondents becoming careless in assigning ranks,
particularly when there are many (usually more than 10) items.
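The step of totalling rank values can be sketched as follows. The three respondents' rankings of the magazines are hypothetical, and the item with the lowest rank total is taken as the most preferred overall.

```python
# Hypothetical rankings from three respondents (1 = most preferred, 5 = least).
rankings = [
    {"Fortune": 1, "India Today": 2, "Time": 3, "People": 4, "Nature": 5},
    {"Fortune": 2, "India Today": 1, "Time": 3, "People": 5, "Nature": 4},
    {"Fortune": 1, "India Today": 3, "Time": 2, "People": 4, "Nature": 5},
]


def rank_totals(responses):
    """Total the rank values each item received across all respondents."""
    totals = {}
    for response in responses:
        for item, rank in response.items():
            totals[item] = totals.get(item, 0) + rank
    return totals


totals = rank_totals(rankings)

# A lower total means the item was ranked nearer the top more often.
overall_order = sorted(totals, key=totals.get)

print(totals)
print(overall_order)
```

The resulting order is still ordinal: a total of 4 against 8 shows which item is preferred, not that it is preferred twice as strongly.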

5. Criteria for a Good Test


There are two criteria to decide whether the scale selected is good or not. They are:

a. Reliability Test; and

b. Validity Test

a. Reliability Test
Reliability means the extent to which the measurement process is free from errors. Reliability
deals with accuracy and consistency. The scale is said to be reliable, if it yields the same
results when repeated measurements are made under constant conditions. Reliability refers to
the extent to which a scale produces consistent results if repeated measurements are made.
Systematic sources of error do not have an adverse impact on reliability, because they affect
the measurement in a constant way and do not lead to inconsistency. In contrast, random
error produces inconsistency, leading to lower reliability. Reliability is assessed by
determining the proportion of systematic variation in a scale. This is done by determining the
association between scores obtained from different administrations of the scale. If the
association is high, the scale yields consistent results and is therefore reliable. Approaches for
assessing reliability include the test–retest, alternative-forms and internal consistency
methods.

Test–retest reliability
In test–retest reliability, respondents are administered identical sets of scale items at two
different times, under as nearly equivalent conditions as possible. The time interval between
tests or administrations is typically two to four weeks. The degree of similarity between the
two measurements is determined by computing a correlation coefficient. The higher the
correlation coefficient, the greater the reliability.
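A minimal sketch of the test–retest calculation is given below, assuming the correlation coefficient meant is the Pearson product-moment coefficient. The two sets of scale scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


# Hypothetical total scale scores of the same five respondents,
# measured at two administrations a few weeks apart.
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 3))
```

A coefficient close to 1 indicates that respondents kept nearly the same relative positions on both occasions, which is what the text means by consistent results.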

Alternative-forms reliability
In alternative-forms reliability, two equivalent forms of the scale are constructed. The same
respondents are measured at two different times, usually two to four weeks apart (e.g. by
initially using Likert scaled items and then using Stapel scaled items). The scores from the
administrations of the alternative scale forms are correlated to assess reliability.

Internal consistency reliability
Internal consistency reliability is used to assess the reliability of a summated scale where
several items are summed to form a total score. In a scale of this type, each item measures
some aspect of the construct measured by the entire scale, and the items should be consistent
in what they indicate about the construct. This measure of reliability focuses on the internal
consistency of the set of items forming the scale.

Split-half reliability
The simplest measure of internal consistency is split-half reliability. The items on the scale
are divided into two halves and the resulting half scores are correlated. High correlations
between the halves indicate high internal consistency. The scale items can be split into halves
based on odd- and even-numbered items or randomly. The problem is that the results will
depend on how the scale items are split.
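
The dependence on the split can be illustrated with a small sketch. The response matrix below is invented, and the helper names `pearson_r` and `split_half_r` are hypothetical; the same data are split once by odd and even item positions and once into first and last halves.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_r(responses, first_half, second_half):
    """Correlate half scores built from two lists of 0-based item indices."""
    half_a = [sum(row[i] for i in first_half) for row in responses]
    half_b = [sum(row[i] for i in second_half) for row in responses]
    return pearson_r(half_a, half_b)

# Hypothetical data: 5 respondents (rows) x 4 items (columns), each rated 1-5
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

odd_even = split_half_r(responses, [0, 2], [1, 3])    # items 1 and 3 vs 2 and 4
first_last = split_half_r(responses, [0, 1], [2, 3])  # items 1 and 2 vs 3 and 4
print(f"odd/even split: {odd_even:.2f}, first/last split: {first_last:.2f}")
```

The two splits give different coefficients for the same respondents, which is the weakness noted above.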

Coefficient alpha, or Cronbach’s alpha


A popular approach to overcoming this problem is to use the coefficient alpha. The
coefficient alpha, or Cronbach’s alpha, is the average of all possible split-half coefficients
resulting from different ways of splitting the scale items. This coefficient varies from 0 to 1,
and a value of 0.6 or less generally indicates unsatisfactory internal consistency reliability.
An important property of coefficient alpha is that its value tends to increase with an increase
in the number of scale items. Therefore, coefficient alpha may be artificially, and
inappropriately, inflated by including several redundant scale items.
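
A minimal sketch of the calculation follows. It does not average split-half coefficients as described above; it assumes the commonly used variance-based formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The data and the function name `cronbach_alpha` are hypothetical.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Variance-based coefficient alpha for a respondents x items score matrix."""
    k = len(responses[0])  # number of items in the scale
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the summated (total) scores
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 5 respondents (rows) x 4 items (columns), each rated 1-5
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)

# Add two redundant copies of the first item to every respondent's answers
padded = [row + [row[0], row[0]] for row in responses]
alpha_padded = cronbach_alpha(padded)

print(f"alpha with 4 items: {alpha:.3f}")
print(f"alpha with 2 redundant items added: {alpha_padded:.3f}")
```

In this invented example the padded scale scores higher even though the added items carry no new information, which illustrates the inflation described above.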

Another coefficient that can be employed in conjunction with coefficient alpha is coefficient
beta. Coefficient beta assists in determining whether the averaging process used in
calculating coefficient alpha is masking any inconsistent items.

Example:

Attitude towards a product or brand preference.

Reliability can be ensured by using the same scale on the same set of respondents, using the
same method. In actual practice, however, this is difficult to achieve. The main points on
reliability can be summarised as follows:

(i) Reliability is the extent to which a scale produces consistent results.

(ii) Test-retest reliability: respondents are administered the same scale at two different
times under nearly equivalent conditions.

(iii) Alternative-forms reliability: two equivalent forms of a scale are constructed and then
administered to the same respondents at two different times.

(iv) Internal consistency reliability:

(a) The consistency with which each item represents the construct of interest.

(b) Used to assess the reliability of a summated scale.

(c) Split-half reliability: the items constituting the scale are divided into two halves, and
the resulting half scores are correlated.

(d) Coefficient alpha (the most common test of reliability): the average of all possible
split-half coefficients resulting from different splittings of the scale items.

a. Validity Test
The paradigm of validity focuses on the question, "Are we measuring what we think we are
measuring?" The success of a scale lies in measuring what it is intended to measure. Of the
two attributes of scaling, validity is the more important.

The validity of a scale may be considered as the extent to which differences in observed scale
scores reflect true differences among objects on the characteristic being measured, rather
than systematic or random error.

There are several methods to check the validity of the scale used for measurement:

Construct Validity: A sales manager believes that there is a clear relation between a person's
job satisfaction, the degree to which the person is an extrovert, and the work performance of
his sales force. Therefore, those who enjoy high job satisfaction and have extrovert
personalities should exhibit high performance. If they do not, then we can question the
construct validity of the measure.

Content Validity: A researcher should define the problem clearly, identify the item to be
measured, and evolve a suitable scale for this purpose. Despite this, the scale may be
criticised for lacking content validity. Content validity is also known as face validity. An
example can be the introduction of a new packaged food that represents a major change in
taste. Thousands of consumers may be asked to taste the new packaged food. Overwhelmingly,
people may say that they liked the new flavour. Despite such a favourable reaction, the
product may still meet with failure when introduced on a commercial scale. So, what went
wrong? Perhaps a crucial question was omitted. The people may have been asked if they liked
the new packaged food, to which the majority might have answered "yes", but the same
respondents were not asked, "Are you willing to give up the product which you are consuming
currently?" In this case, the problem was not clearly identified and the item to be
'measured' was left out.

Predictive Validity: This pertains to the question, "How well can a researcher predict future
performance from knowledge of the attitude score?"

6. Questionnaire construction
A questionnaire is a structured technique for data collection consisting of a series of
questions, written or verbal, that a respondent answers. A questionnaire, whether it is called
a schedule, interview form or measuring instrument, is a formalised set of questions for
obtaining information from respondents. Typically, a questionnaire is only one element of a
data collection package that might also include (1) fieldwork procedures, such as instructions
for selecting, approaching and questioning respondents; (2) some reward, gift or payment
offered to respondents; and (3) communication aids, such as maps, pictures, advertisements and
products (as in personal interviews) and return envelopes (in mail surveys). Any
questionnaire has three specific objectives.

First, it must translate the information needed into a set of specific questions that the
respondents can and will answer. Developing questions that respondents can and will answer
and that will yield the desired information is difficult. Two apparently similar ways of posing
a question may yield different information. Hence, this objective is most challenging.

Second, a questionnaire must uplift, motivate and encourage the respondent to become
involved, to cooperate, and to complete the task. Figure 13.1 uses a basic marketing model of
exchange of values between two parties to illustrate this point. Before designing any
questionnaire or indeed any research technique, the researcher must evaluate ‘what is the
respondent going to get out of this?’ In other words, the marketing researcher must have an
empathy with target respondents and appreciate what they think when approached and
questioned. Such an appreciation of what respondents go through affects the design of how
they are approached, the stated purpose of the research, the rewards for taking part and the
whole process of questioning and question design.

Third, a questionnaire should minimise response error. The potential sources of error in
research designs were discussed in Chapter 3, where response error was defined as the error
that arises when respondents give inaccurate answers or when their answers are misrecorded
or mis-analysed. A questionnaire can be a major source of response error. Minimising this
error is an important objective of questionnaire design.

7. Questionnaire Design Process


The design process is founded upon generating information that will effectively support
decision-makers. Establishing the nature of social problems and the corresponding research
problems amounts to defining the nature of effective support. Different techniques and sources
of information were outlined to help in the diagnosis process, which feed directly into the
stages set out below:

a. The ‘source of idea’ represents the culmination of the marketing decision-makers’ and
marketing researchers’ diagnoses and the information they have available at the time
of commissioning a marketing research project.
b. From the diagnoses, and the statement of marketing and research problems, emerge
specific research questions. Based upon the diagnoses, the purpose of each potential
question should be established, i.e. ‘question purposes’. Some research problems
may be tackled through actual measurements in questionnaires. Other research
problems may not be tackled by questionnaires.
c. With clear question purposes, the process of establishing ‘actual questions’ can
begin. At this point, the researchers have to put themselves ‘in the shoes’ of the
potential respondent. It is fine to say that certain questions need to be answered, but
this has to be balanced with an appreciation of whether respondents are able or indeed
willing to answer particular questions, as illustrated in the following example.
d. Deciding how the data collected are to be analysed does not happen when
questionnaires have been returned from respondents. ‘Question analyses’ must be
thought through from an early stage. The connections between questions and the
appropriate statistical tests that fulfil the question purposes should be established as
the questionnaire is designed. Again, trade-offs have to be considered. In Chapter 12,
different scale types were linked to different statistical tests. As one progresses from
nominal through ordinal to interval and then ratio scales, more sophisticated statistical
analyses can be performed. However, as one progresses through these scale types, the
task for respondents becomes more onerous.
e. The understanding that is taken from the data comes back to the ‘source of idea’. By
now the researcher or questionnaire designers may have collected other data,
interpreted existing data differently, or been exposed to new forces in the
marketplace. They may even now see what questions they should have been asking!
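
The link in stage d between scale type and permissible analysis can be sketched as follows. This is an illustration only: it assumes the conventional pairing of mode with nominal data, median with ordinal data, and mean with interval or ratio data, and the variable names and responses are hypothetical.

```python
from statistics import mean, median, mode

# Conventional measure of central tendency for each level of measurement
SUMMARY_BY_SCALE = {
    "nominal": mode,     # categories only: most frequent value
    "ordinal": median,   # ranked data: middle value
    "interval": mean,    # equal intervals: arithmetic mean
    "ratio": mean,       # equal intervals and a true zero: arithmetic mean
}

def summarise(values, scale_type):
    """Return the central-tendency summary appropriate to the scale type."""
    if scale_type not in SUMMARY_BY_SCALE:
        raise ValueError(f"unknown scale type: {scale_type}")
    return SUMMARY_BY_SCALE[scale_type](values)

# Hypothetical responses from five respondents
brand_chosen = ["A", "B", "A", "C", "A"]   # nominal
preference_rank = [1, 3, 2, 2, 5]          # ordinal
purchase_intent = [4, 5, 3, 4, 4]          # 1-5 rating treated as interval

print(summarise(brand_chosen, "nominal"))
print(summarise(preference_rank, "ordinal"))
print(summarise(purchase_intent, "interval"))
```

Deciding the scale type of each question at the design stage makes it clear in advance which summaries of this kind the returned data can support.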

8. Types of Questionnaire
A question may be unstructured or structured. We define unstructured questions and
discuss their relative advantages and disadvantages and then consider the major types of
structured questions: multiple choice, dichotomous and scales.

a. Unstructured questions
Unstructured questions are open-ended questions that respondents answer in their own words.
They are also referred to as free-response or free-answer questions. The following are some
examples:
● What is your occupation?
● What do you think of people who patronise secondhand clothes shops?
● Who is your favourite film personality?

Open-ended questions can be good first questions on a topic. They enable the respondents to
express general attitudes and opinions that can help the researcher interpret their responses to
structured questions. They can also be useful as a final question in a questionnaire.

After respondents have thought through and given all their answers in a questionnaire, there
may be other issues that are important to them and that may not have been covered. Having
an open-ended question at the end allows respondents to express these issues. As well as
providing material to help the researcher interpret other responses, the respondents have the
chance to express what they feel to be important. Unstructured questions have a much less
biasing influence on response than structured questions. Respondents are free to express any
views. Their comments and explanations can provide the researcher with rich insights.

A principal disadvantage is that potential for interviewer bias is high. Whether the
interviewers record the answers verbatim or write down only the main points, the data depend
on the skills of the interviewers. Recorders should be used if verbatim reporting is important.

b. Structured questions
Structured questions specify the set of response alternatives and the response format. A
structured question may be multiple choice, dichotomous or a scale.

i. Multiple-choice questions
In multiple-choice questions, the researcher provides a choice of answers and respondents are
asked to select one or more of the alternatives given. Consider the following question:

Do you intend to buy a new car within the next six months?
____________________________ Definitely will not buy
____________________________ Probably will not buy
____________________________ Undecided
____________________________ Probably will buy
____________________________ Definitely will buy
____________________________ Other (please specify)
Of concern in designing multiple-choice questions are the number of alternatives that should
be included and the order in which they are presented, which can give rise to position bias.
The response alternatives should include the set of all possible choices. The general
guideline is to list all alternatives that may be of importance and to include an alternative
labelled ‘other (please specify)’, as shown above. The response alternatives should be
mutually exclusive.

Respondents should also be able to identify one, and only one, alternative, unless the
researcher specifically allows two or more choices (e.g. ‘Please indicate all the brands of soft
drinks that you have consumed in the past week’). If the response alternatives are numerous,
consider using more than one question to reduce the information processing demands on the
respondents.

ii. Dichotomous questions
A dichotomous question has only two response alternatives, such as yes or no, or agree or
disagree. Often, the two alternatives of interest are supplemented by a neutral alternative,
such as ‘no opinion’, ‘don’t know’, ‘both’ or ‘none’, as in this example. The question asked
before about intentions to buy a new car as a multiple-choice question can also be asked as a
dichotomous question.
Do you intend to buy a new car within the next six months?

Yes
No
Don’t know

iii. Scales
To illustrate the difference between scales and other kinds of structured questions, consider
the question about intentions to buy a new car. One way of framing this using a
scale is as follows:

Do you intend to buy a new car within the next six months?

Definitely      Probably        Undecided    Probably    Definitely
will not buy    will not buy                 will buy    will buy
1               2               3            4           5
