CHAPTER 2
A Review of Related Literature and Studies
2.1 Prologue:
can be done by anybody. The score given by any assessor will always
remain the same.
2.2.2.2 Norms:
Objective scoring means that any two equally trained scorers would arrive at
identical scores for the same test paper. This is possible only for
objective-type tests, for which a scoring key can be fixed: the scorer
simply matches the answer key against the examinee's responses and counts
those that are correct. Essay-type subjective tests are therefore not used
in standardized tests; only objective-type tests can be standardized. The
scoring of a standardized objective-type test is not influenced by the
subjective reactions of the scorers. The scoring of objective tests can now
be done by an automated scoring machine known as the "Optical Mark Reader"
(Wardrop, 1976, p.3).
decided in such a way that it remains useful for the years to come. In
content analysis one has to prepare a blueprint, which gives the topics to
be covered, the number of questions from each topic, the distribution of
difficulty levels of the questions under each topic, and the weightage to be
given to each topic. In short, the blueprint of a test gives a detailed
picture of the test.
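For example, a blueprint for a 50-item test might be laid out as follows; the topics, item counts and weightages here are purely illustrative and are not taken from any study cited in this chapter:

    Topic                Easy   Average   Difficult   Total items   Weightage
    Arithmetic             8       8          4            20          40%
    Verbal reasoning       6       6          3            15          30%
    General knowledge      6       6          3            15          30%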
Item writing: The next step after content analysis is item writing. Writing
good items is a difficult task. The items written must go through the
stages of review, modification and editing before they are put to actual
use. Even with a careful review process, a number of items are likely to be
discarded after the actual try-out. Therefore, at least two to four times
the required number of items should be written.
Item Analysis:
Item analysis is a technique for determining the quality of items on the
basis of the responses given by the candidates who have taken the test. It
is necessary to ensure the effectiveness of the test as a whole in terms of
each of its units, i.e. its items.
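Although the text does not give the formulas, the standard indices computed in item analysis are the difficulty index (the proportion of candidates answering an item correctly) and the discrimination index (how well an item separates strong candidates from weak ones). A minimal sketch with a hypothetical 0/1 response matrix:

    import numpy as np

    # Hypothetical response matrix: rows = candidates, columns = items;
    # 1 = correct answer, 0 = wrong answer.
    responses = np.array([
        [1, 1, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1],
        [0, 0, 0, 1],
        [1, 1, 0, 0],
    ])

    total = responses.sum(axis=1)   # each candidate's total score

    # Difficulty index: proportion of candidates answering each item correctly.
    difficulty = responses.mean(axis=0)

    # Discrimination index: correlation of each item with the score on the
    # remaining items (corrected item-total correlation).
    discrimination = np.array([
        np.corrcoef(responses[:, j], total - responses[:, j])[0, 1]
        for j in range(responses.shape[1])
    ])

    print("difficulty:", difficulty)
    print("discrimination:", discrimination)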
True Score - Refers to the score that an individual would have obtained on
a test under perfect conditions, i.e. if it could be measured without error.
However, this is a hypothetical situation so far as psychological tests are
concerned. Therefore, the score obtained on a test can be represented by the
following equation -
X_t = X_T + E

where X_t = the obtained score,
      X_T = the true score, and
      E   = the error.
It may be noted here that certain concepts viz. true score, error,
true variance, error variance, etc. cannot be directly measured but they
can only be estimated.
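As a minimal sketch of this model, one can simulate X_t = X_T + E with hypothetical numbers (in practice only the obtained scores would be observable) and verify that the observed variance decomposes into true variance plus error variance. The final ratio printed is the classical-test-theory definition of reliability as the share of observed variance that is true variance, a standard result not stated explicitly in the text:

    import numpy as np

    rng = np.random.default_rng(0)
    true_scores = rng.normal(50, 10, size=10_000)   # X_T: unobservable true scores
    errors = rng.normal(0, 5, size=10_000)          # E: random measurement error
    obtained = true_scores + errors                 # X_t = X_T + E

    # Observed variance decomposes (approximately, since the simulated
    # errors are independent of the true scores) into true + error variance.
    print(obtained.var(), true_scores.var() + errors.var())

    # Reliability, in classical test theory, is the share of observed
    # variance that is true variance.
    print("reliability:", true_scores.var() / obtained.var())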
representing a zone in which the student's true ability lies. That is, if a
particular student scores at the 60th percentile, it is wrong to conclude
that this is his exact standing. It is more appropriate to say that his true
score lies somewhere near the 60th percentile. Scores obtained on any
particular test on any particular day are somewhat biased: high scores are
too high and low scores are too low. It is not only the ability of the
student that determines the score; a luck factor also contributes. The
students who scored high on the test are not only high on ability but also
had better luck at that particular point of time, and the students who
scored low are not only low on ability but also had bad luck at that
particular point of time. If an alternate form of the test is administered
on some other day, the students who scored high on the first occasion will,
as a group, score a little lower, while the group of students who scored low
will, as a group, score a little higher. This is called the tendency of
regression towards the mean. However, this tendency does not necessarily
apply to each individual: a few students who scored high on the first
occasion may score still higher on the second occasion, and a few students
who scored low on the first occasion may score still lower on the second.
All scores above the mean tend to be somewhat biased upward, i.e. they are
probably higher than they should be.
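This regression effect is easy to demonstrate in a simulation. The sketch below, with hypothetical numbers, administers two parallel forms (a stable ability plus independent "luck" on each occasion) and compares the group means of the top and bottom scorers:

    import numpy as np

    rng = np.random.default_rng(1)
    ability = rng.normal(50, 10, size=10_000)        # stable ability
    form1 = ability + rng.normal(0, 5, size=10_000)  # first occasion: ability + luck
    form2 = ability + rng.normal(0, 5, size=10_000)  # parallel form, independent luck

    top = form1 >= np.percentile(form1, 90)   # high scorers on the first occasion
    low = form1 <= np.percentile(form1, 10)   # low scorers on the first occasion

    # As groups, the high scorers drop and the low scorers rise on the retest.
    print(form1[top].mean(), "->", form2[top].mean())
    print(form1[low].mean(), "->", form2[low].mean())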
In this method the practice effect is reduced but not completely
eliminated. The forms of the test, being parallel, contain problems based on
the same principles. Once the principle is understood
In this method scores are obtained for each student on two halves of a
test. The test is split into two equal halves, with the number of questions
in each part being equal. This can be done by taking the first half and the
second half of the questions, by taking alternate questions in the two
parts, or by any other method that ensures that the two parts of the test
are equivalent. The method of splitting depends upon the structure of the
test. In most tests, the first half and the second half will not be
equivalent, owing to differences in the nature and difficulty level of the
items as well as the cumulative effects of warming up, practice, fatigue,
boredom and other factors varying progressively from the beginning to the
end of the test. A procedure that is suitable for most purposes is to find
the scores on the odd and the even items of the test. Such a division yields
the most nearly equivalent half-scores. However, while dividing the test
into two halves, if a group of items is based on common
information, all the items in that group should be put in the same part.
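A minimal sketch of the odd-even procedure, together with the Spearman-Brown correction that steps the half-test correlation up to an estimate for the full test length (the correction formula is the conventional one; it is not given in the text, and the demo data are simulated):

    import numpy as np

    def split_half_reliability(responses):
        # responses: candidates x items matrix of item scores.
        odd = responses[:, 0::2].sum(axis=1)    # score on the 1st, 3rd, 5th, ... items
        even = responses[:, 1::2].sum(axis=1)   # score on the 2nd, 4th, 6th, ... items
        r_half = np.corrcoef(odd, even)[0, 1]   # correlation between half-scores
        # Spearman-Brown correction for the full-length test.
        return 2 * r_half / (1 + r_half)

    # Demo: 200 candidates answering 40 items whose probability of success
    # depends on a single underlying ability.
    rng = np.random.default_rng(3)
    ability = rng.normal(size=(200, 1))
    items = (ability + rng.normal(size=(200, 40)) > 0).astype(int)
    print(split_half_reliability(items))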
2.2.4 Validity:
Any test is designed with some purpose, and the test scores have meaning
only when they are related to some other variable; e.g. an achievement test
is meant to assess the academic performance of students in the classroom,
while an aptitude test is meant to measure certain abilities of an
individual which in turn predict the individual's behaviour in a certain
domain. A test is useful when its scores can be correlated with some other
variable and can be meaningfully interpreted. If
a test has perfect consistency but its scores cannot be correlated with any
other variable, such a test will be of no use and hence will have no validity.
Validity is a generic term and can be defined at various levels in various
ways. The validity of a test can be determined from the answers to questions
such as: How well does the test measure what it is supposed to measure?
What traits does it measure? Does it really measure what it is supposed to
measure? Does it supply information that can be used in decision making?
What interpretation can be given to the test scores? What percentage of the
variance in the test scores is attributable to the variable that the test
measures?
Classroom tests are used for assessing the knowledge acquired by students
through classroom teaching. These tests are syllabus based, and the items
included in the test are just a sample, as it is not feasible to cover the
entire syllabus in a test. The teacher thus has to assess the entire body of
knowledge through these test items, and it is therefore necessary that the
tests be representative of the behavioural domain. Scores on the tests are
used not as ends in themselves, but rather to make inferences about
performance in the wider domain. The purpose of
The most common use of tests, other than classroom tests, is to predict
performance in a domain of behaviour. For example, achievement tests are
used to place students in various class sections, while aptitude tests are
used for predicting performance on the job.
The variable that is predicted by the test is called criterion, and the validity
thus obtained is called criterion-related validity. Since the test is used to
predict the criterion, the criterion-related validity is sometimes referred to
as predictive validity; and since it involves collection of empirical data
on the relationship between test scores and the criterion measure, this
type of validity is also referred to as empirical validity. The proper
measure of a test's criterion related validity, and thus its usefulness, is an
index of its relative contribution, over and above that of other measures
and sources of information, to increased decision making accuracy. We
are not interested in the test scores per se, but are interested in the test
because it predicts some important criterion behaviour.
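In its simplest form, the criterion-related validity coefficient is the correlation between test scores and the criterion measure. A minimal sketch with hypothetical data (the scores and ratings below are invented for illustration):

    import numpy as np

    test_scores = np.array([62, 48, 75, 55, 80, 40, 68, 59])           # hypothetical selection-test scores
    job_ratings = np.array([3.5, 2.8, 4.2, 3.0, 4.5, 2.5, 3.9, 3.1])   # hypothetical supervisor ratings

    validity = np.corrcoef(test_scores, job_ratings)[0, 1]
    print("predictive validity coefficient:", round(validity, 2))

The decision table below classifies the four possible outcomes of using such a test for selection.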
Test Decision                            Miss       Hit
Acceptable (+ve, above average)           A          B
Unacceptable (-ve, below average)         C          D
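Given hypothetical counts for the four cells, decision-making accuracy can be summarized as the proportion of hits. The sketch below uses one common reading of the table, in which A is a false acceptance, B a correct acceptance, C a false rejection and D a correct rejection; the counts are invented for illustration:

    # Hypothetical counts for the four cells of the decision table above.
    A, B = 10, 45   # accepted candidates: unsuccessful (miss) / successful (hit)
    C, D = 15, 30   # rejected candidates: would have succeeded (miss) / would have failed (hit)

    hit_rate = (B + D) / (A + B + C + D)   # proportion of correct decisions
    print("decision accuracy:", hit_rate)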
A. Sample:
It is important to select a sample that represents the population well. The
size of the sample also plays an important role: as the size of the sample
increases, errors of measurement tend to counterbalance each other, and the
obtained results can be expected to be more stable. With a larger sample,
therefore, the results are more likely to be statistically significant.
B. Base Rates:
As described in Brown (1970, p. 127), the base rate may be defined as the
rate of occurrence of a phenomenon in an unselected population.
Thus, the base rate would be the proportion of people who would be
successful on a job or in an academic program if there were no selection.
In practical situations the selection is always made using some or the
2. Out of the entire list of tasks, the group has to identify the main
   activities being performed, the exact role played in performing these
   activities, and how much time is spent on each of them, and then rank
   the activities in order of importance to determine the weightage for
   each activity. The group may also spell out the expectations from a role
   incumbent if his performance is to be rated as excellent.
[Fig. 2.1: Determinants of Job-Performance - a schematic relating
personality variables (values, aspiration, locus of control), background
variables (age, sex, marital status, income, SES) and the job situation
(organisational climate), through the perception of the job situation, to
job satisfaction.]
2.2.6 Job-Satisfaction:
[Fig. 2.2: A chart plotting a percentage measure (y-axis, ranging from 0 to
90) against the score ranges 0-10, 11-20, 21-30, 31-40 and 41-50 (x-axis).]
The results of the study did indicate the strength of objective tests
vis-a-vis descriptive papers in clerical selection. However, it must be
borne in mind that the tests have to be related to the important components
of the job and also have to be carefully constructed. Any objective test
that is poorly constructed or improperly administered may not prove
effective.
The validity of the tests was tested against the criterion of on-the-job
performance. The findings of the study were: (1) A considerably large
proportion of employees recruited through NIBM were evaluated by their
supervisors as very good performers; even with regard to the specific
abilities, the evaluation was on the same lines. (2) It confirmed that the
selection strategy evolved at NIBM had been more effective in selecting
clerks with higher potential for supervisory and management jobs in
comparison with the traditional system of selection.
The study was based on the work attitudes of 100 skilled blue-collar
workers. Questionnaires were employed to determine the workers'
satisfaction with 8 job-context factors and 5 job-content factors. A
satisfaction index for each factor was derived by subtracting the
importance score (need-strength) from the extent to which the
Thus the study explored the factors associated with both job-satisfaction
and dissatisfaction and found that the factors were not necessarily
distinct and separate. The factors were found to interact among themselves,
the feeling of either satisfaction or dissatisfaction being determined by
the achievement of the aims of promotion and salary. Thus, Herzberg's
dual-factor theory was not supported by the empirical evidence in this
study.
The final sample of the study had 586 secondary school teachers selected by
a proportionate stratified random sampling technique. The tools used to
measure job-satisfaction and job-involvement were Indiresan's job
satisfaction inventory and a job involvement scale respectively. Winer's
Leadership Behaviour
The sample consisted of 205 students from six high schools, with males and
females in almost equal proportion. In order to test the validity of the
three-factor model of learning approaches, structural equation modelling
techniques were employed. The problem required the use of confirmatory
factor analysis. The overall fit of the data to the model was very good,
resulting in a comparative fit index of 0.97 based on 11 degrees of
freedom. As required for the test of the model, all three latent variables
were maintained in an orthogonal configuration. The lack of correlation
between the three latent variables and the resulting good fit of the
It was found that the tests used were appropriate for predicting only some
measures of job performance. In fact, the traits of hard work, convincing
people and planning had very poor values. The test battery, however, needed
to consist of tests which could be given sequentially, so that the
desirable traits could be held common and other cognitive tests then
administered to assess the planning and other abilities of the applicants.
The item areas of the test were verbal analogy, verbal reasoning,
vocabulary, general information and numerical relations. The test was
standardized on a sample of 2000 boys and girls chosen on a stratified
random basis. Split-half, test-retest and other reliability coefficients
were calculated. The inter-item correlations
and factor analysis with varimax rotation were used for a study of the
validity of the test.
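As an illustration of the technique named here (not a reconstruction of this study's analysis), a factor analysis with varimax rotation can be run with scikit-learn's FactorAnalysis; the data below are simulated so that the sub-tests load on two hypothetical factors:

    import numpy as np
    from sklearn.decomposition import FactorAnalysis  # rotation needs scikit-learn >= 0.24

    # Simulated scores: the first four sub-tests share one latent factor
    # (a stand-in "verbal" factor), the last three share another ("numerical").
    rng = np.random.default_rng(2)
    verbal = rng.normal(size=(500, 1))
    numerical = rng.normal(size=(500, 1))
    X = np.hstack([np.repeat(verbal, 4, axis=1), np.repeat(numerical, 3, axis=1)])
    X = X + rng.normal(scale=0.5, size=X.shape)

    fa = FactorAnalysis(n_components=2, rotation="varimax")
    fa.fit(X)
    print(fa.components_.T)   # rows = sub-tests, columns = rotated factor loadings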
The reliability indices were high, in the range of 0.73 to 0.92, and the
validity coefficients were in the range of 0.52 to 0.73. The factors
identified through factor analysis were general reasoning and verbal
comprehension.
For the pre-tryout, a sample of 100 students was used. For the further
try-out, a sample of 750 students selected from three schools in
Chidambaram was used. For the final administration, 5000 students from 34
schools in one of the districts of Tamilnadu were selected by the method of
stratified proportionate sampling. The test included seven sub-tests:
synonyms, antonyms, analogy, classification, mixed words, verbal reasoning
and numerical reasoning. There were a total of 110 items in the test.
Nair, K.S. (1972, p.493) carried out An Analytical Study of the Factor
Pattern of Verbal and Non-Verbal Tests of Intelligence.
The sample for item analysis consisted of 370 students, and the sample for
final analysis consisted of 420 students, chosen from the secondary schools
of the Trivandrum educational districts.
(a) Verbal and Non-Verbal tests were formed mainly on the basis of
content.
(b) A third factor, identified as numerical ability, was the same as the one
identified by others.
(c) A fourth factor, which showed possibilities for tests to be grouped on
    its basis, would have emerged if more tests had been used.
(d) Factor I, which had high loadings on the tests of analogies, series,
    spatial relations, classifications, water-reflection and arithmetic
    reasoning, could be identified as a Non-Verbal factor.
(e) Factor II, which had high loadings on the vocabulary tests as well as
    on water-reflection (classified as a Non-Verbal item), could be termed
    a Verbal factor.
(f) Factor III, which had high loadings on the arithmetic reasoning tests,
    number series and number classification, could be termed a numerical
    reasoning factor.
The studies of Deshpande et al. (1974) and Mankidy (1977) gave information
about the types of tests being used at the inception of the system and the
work done by NIBM in relation to testing as a selection strategy for bank
recruitment. They answered questions such as why more importance was given
to objective-type tests as compared to the 'traditional' descriptive-type
tests, and how the selection system was evolved. Since the present study
basically dealt with the present selection strategy for bank recruitment
and was an effort to establish the validity of objective-type tests, the
above information was very much required and useful. Mankidy's study also
threw light on how the validity of the tests was established against the
criterion of job-performance, and on the tools and traits selected for
measuring job performance. The present study was an extension
2.5 Epilogue: