Raven 1989
In this paper, some recent results relating to the stability of scores on the Raven
Progressive Matrices Test for different subgroups within and between the
United Kingdom, the United States, and other Western societies are summarised.
Subsequent sections deal with variation over time. A possible explanation
for the variation in norms over time and between ethnic groups within the
United States is offered.
(for a cross-section of ability). The MHV scale itself is available in senior and
junior forms. Both sets of tests have been revised from time to time over the past
40 years.
U.S. Data
Between 1983 and 1987 some 30 norming studies, involving more than 30,000
students aged 5 to 18 years, were carried out in school districts across the United
States of America (Raven, Summers et al., 1986).1 Each sample was chosen to be
representative of the school district from which it was drawn.
The norms that were obtained varied markedly from one school district to
another and, within districts, between socioeconomic and ethnic groups. Both
ethnicity and SES made independent contributions to the variance. Differences
between the norms for school systems catering to white students of differing SES
were as great as the ethnic differences within school districts. The ethnic
differences seem to correspond to differences in birth weight, infant mortality,
and the incidence of serious childhood illness (U.S. Bureau of the Census,
1984).
The data hint that the Hispanic-white difference may be declining: There was
no major Hispanic/white difference in the data collected by Stallard (see Raven,
Summers et al., 1986) in the Ontario-Montclair school district of California. Yet
there had been a marked difference between these two groups in the adjacent
county of Riverside when A. R. Jensen (1980) collected his data 15 years earlier.
Although Jensen's results have been confirmed in studies conducted in other
areas by Burciaga (1973) and Hoffman (1983), Stallard's results are confirmed
John Raven
[Figure: percentile curves labelled 1979 (95th and 90th percentiles, among others) and 1938; plot not recoverable.]
evidence suggesting that we can), it would appear that children now master the
abilities tested by the Matrices at an earlier age, and that the scores of the less
able plateau above their previous level. However, for the majority of children
aged 11½ and over, there has been little increase in score.
Data assembled by Flynn (1987a) both support and challenge this conclusion.
The Progressive Matrices scores of young adults (military conscripts) from
a wide variety of westernised societies appear (see, e.g., Bouvier's data in Figure
2) to have been going up steadily (Flynn says "dramatically") over time.
(Flynn cites a round figure of about one standard deviation per generation. 2)
Flynn suggests that what seems to be a more moderate increase in the United
Kingdom (that is, the increase documented in Figure 1) might be explained if
the 1938 British norms were too high. This explanation is certainly consistent
with the available data. However, an alternative explanation is that British social
and welfare provisions were, at that time, well above the international average
but have now fallen behind. What is perhaps most striking about Flynn's
compendium is that differences between the various norms summarised previ-
ously which, taken by themselves, could be dismissed as sampling errors, turn
out, in a broader context, to be meaningful.
It would appear from the results summarised earlier that there has been, and
still is, considerable--if far from perfect--stability in SPM scores from one
Western society to another at any given point in time. However, in common with
other test norms (see, e.g., Bouvier, 1969; Garfinkel & Thorndike, 1976;
Thorndike, 1975, 1977; and the large number of published and unpublished
studies brought together by Flynn, 1984b, 1987b), there has been an increase in
mean scores over time, particularly for the less able, at least since 1970 and,
according to Flynn, continuously throughout the period (although accelerating
more recently).
Thorndike (1977) and Garfinkel and Thorndike (1976) suggest several
hypotheses that might explain the increase. However, the data available on the
Progressive Matrices do not really support any of them. Thorndike suggests, for
example, that the observed acceleration in development may be due to earlier
maturity. However, if maturity is a factor, the growth curves for boys and girls
should differ more than the evidence summarised in Raven (1981), Raven,
Summers et al. (1986), and Court (1983) suggests that they do. Thorndike
suggests that the differences may be due to the nature of early school education,
but the fact that there was little difference between the RPM norms obtained in
Scotland and England in the course of the 1979 standardisation suggests that this
is unlikely because Scottish infant education remains very formal (HMI, 1980).
Thorndike suggests that television may have had an effect. However, television
was widely available in Ireland when what can now be seen to be low Irish norms
were collected.
Others have suggested that the increases in Progressive Matrices scores over
time may be attributed to schools using matrix-type problems to teach problem
Raven Progressive Matrices
solving. However, Thorndike showed that performance on all the subscales of the
Binet had improved.
Flynn (1987a) (having, in 1984b, queried Thorndike's hypotheses concerning
the Binet results) concluded that most of the explanations for the RPM increase
that readily come to mind do not hold up. He shows, through a detailed analysis
of Leeuw and Meester's (1984) data, that changes in level of education could
account for only 1 point of the 20-point IQ gain in RPM scores documented among
servicemen. Changes in the intellectual quality of the home environment, as
indexed by SES, could account for little more.
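The proportion implied by these figures can be made explicit. The following is a minimal sketch rather than part of Flynn's analysis: the function name `share_explained` is hypothetical, and the only inputs taken from the text are the 20-point gain and the 1 point attributed to education.

```python
from fractions import Fraction


def share_explained(points_explained: int, total_gain: int) -> Fraction:
    """Return the fraction of an observed score gain attributed to one factor."""
    if total_gain <= 0:
        raise ValueError("total_gain must be positive")
    return Fraction(points_explained, total_gain)


# Figures quoted in the text: a 20 IQ-point gain, 1 point of it due to education.
education_share = share_explained(1, 20)
remaining_share = 1 - education_share

print(f"education: {float(education_share):.0%}")      # education: 5%
print(f"other factors: {float(remaining_share):.0%}")  # other factors: 95%
```

On these figures, education accounts for 5% of the gain, leaving 95% to be explained by other influences.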
The variation in mean scores between ethnic groups within the United States
does seem to correspond to variation between the same groups in height, birth
weight, and infant mortality. Height and birth weight have, like intelligence test
scores, increased over the past 80 years. 3 The most probable explanation of the
increase in scores over time therefore seems to be that it is due to the same
variables as increases in height and birth weight and decline in infant mortality,
that is, to improved nutrition, welfare, and hygiene. What it is about these
variables that is important is as obscure for height and birth weight as it is for
intellectual ability. However, the fact that such variables do have important
effects on RPM score as well as birth weight and height is shown in a remarkable
study carried out in Aberdeen, Scotland, most of the results from which have
never been published (Baird & Scott, 1953; Scott, Illsley, & Thomson, 1956). In
this study, calcium intake was used as an index of quality of diet, and it was
shown that this has a marked impact on all three of the outcomes mentioned, and
that the relationship held both within and between socioeconomic groups. More
recently, Benton and Roberts (1988) have shown that vitamin and mineral
supplements appear to have a rapid and marked effect on eductive, but not
reproductive, ability.
variables still account for only a small proportion of the total variance. The
development of the ability to perceive and think clearly appears to be promoted
by "democratic" childrearing practices in the home and by "open" educational
activities in the school (Chan, 1981; McGillicuddy-DeLisi, DeLisi, Flaugher, &
Sigel, 1987; Miller, Kohn, & Schooler, 1985; Raven, 1980; Raven et al., 1985;
Sigel, 1986; Sigel & McGillicuddy-DeLisi, 1984; Stallings & Kaskowitz, 1974).
It is also promoted by work that demands high levels of problem solving and
responsibility (Jaques, 1976; Kohn & Schooler, 1978; Lempert, 1986). However,
it appears that the child-rearing and educational practices that increase
Progressive Matrices scores depress reading, writing, and arithmetical ability if
these abilities are measured by conventional educational tests that load heavily
on reproductive ability (Sigel, 1986; Stallings & Kaskowitz, 1974). 4
Experimental attempts to teach the strategies required to solve matrix
problems (e.g., Budoff & Corman, 1976; Budoff, Corman, & Gimon, 1976;
Feuerstein, 1979, 1980; Haywood et al., 1982; Jacobs, 1977; Wortman, 1968)
yield dramatic short-term results, but it is not clear that these affect eductive
ability in any basic sense (Savell, Twohig, & Rachford, 1986). What is notable
about at least some "democratic" parents and "open" educators is that they
extend children's utterances and encourage them to ask questions and to invent
ways of perceiving and conceptualising things for themselves. They also encour-
age them to share in their own problematizing and thinking about the long-term
consequences of their actions in the context of their own personal views of how
society works. 5 In this context it is, however, important to note that attempts to
introduce such activities into schools in general have not succeeded even in
getting teachers to change their behaviour (Fraley, 1981; HMI 1980; Raven et
al., 1985). The same has been true of attempts to get parents to change their
behaviour (Raven, 1980). This is almost certainly why overall evaluations of
intervention programmes such as Head Start and Follow-Through (e.g., Bock,
Stebbins, & Proper, 1977; Spitz, 1986) show no effect.
The evidence that child-rearing practices can influence scores suggests that
there is possibly a more psychologically based explanation of the increase in
RPM scores over time than the explanation in terms of nutrition and hygiene
favoured earlier. This is that the move toward smaller families, combined with
welfare provisions that satisfy the more fundamental needs in Maslow's (1954)
hierarchy, has permitted more parents to adopt child-rearing practices that
facilitate their children's cognitive development. These parents may encourage
their children to think things out for themselves; to share in their own thought
processes and, in particular, to think about the operation of complex social
processes (society) and consider the probable long-term, and as yet intangible,
social consequences of their actions. If they do, these experiences might well be
reflected in increases in both cognitive ability and the development of internal-
ised codes to guide behaviour. When these more socialised children reach school,
their teachers can change their discipline strategies and treat them in a more
developmental way. And when they get to work these young people might find
that it, too, has become more cognitive, requiring them to take more responsibil-
ity for the personal, organisational, and societal consequences of their actions. If
work has changed in this way, Jaques (1976), Kohn and Schooler (1978), and
Lempert (1986) already have demonstrated that it further promotes the develop-
ment of problem-solving ability. What is important about this possible explana-
tion of the observed increase in scores is that it is not disconfirmed by the
observations that earlier led us to reject Thorndike's hypotheses.
It is important to conclude this section on a less "environmentalist" note by
juxtaposing the fact that Progressive Matrices scores can be influenced by and
have changed over time with the fact that the bulk of the variance is still between
children from similar ethnic and socioeconomic backgrounds. Indeed, as Max-
well (1969) and Jencks et al. (1973) have shown, two-thirds of the variance in
intelligence test scores is between children from the same families.
Concluding Comments
The first general conclusion that may be drawn out of this material is that it
does not really support Flynn's (1987a) contention that "IQ tests do not measure
intelligence but rather a correlate with a weak causal link to intelligence." It is
easiest to demonstrate this by citing parallel data on height. According to Flynn
(personal communication), height has, since well before the turn of the century,
been increasing at about two thirds of a standard deviation per generation. This is
of the same order of magnitude as the one standard deviation per generation
figure he cites for RPM scores. For height, too, sex and ethnic differences
abound, and the differences persist despite intergenerational increases. Neither
the intergenerational increases nor the sex and ethnic differences imply that
rulers cannot validly be used to measure height. Flynn buttresses his conclusion
that IQ scores have little meaning by suggesting that, if people's intelligence
really has improved, more patents should have been filed. This has not happened.
But no one would suggest that a prerequisite to rulers being regarded as valid
measures of height would be that the number of Olympic high jump gold
medalists should have gone up. (Incidentally, would not the number of books and
articles published be a better criterion against which to judge the construct
validity of IQ tests than the number of patents filed? The number has certainly
increased over time.)
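The comparison of the two rates can be put on a common footing. The sketch below is illustrative only: it assumes the conventional IQ scaling in which one standard deviation equals 15 points (a convention not stated in the text), and the names `IQ_SD` and `sd_to_iq_points` are hypothetical.

```python
from fractions import Fraction

IQ_SD = 15  # assumed conventional IQ scale: 1 SD = 15 points (not from the text)


def sd_to_iq_points(sd_units: Fraction) -> Fraction:
    """Express a change given in standard-deviation units as IQ-scale points."""
    return sd_units * IQ_SD


rpm_gain_sd = Fraction(1)        # RPM: about one SD per generation (Flynn)
height_gain_sd = Fraction(2, 3)  # height: about two thirds of an SD per generation

rpm_points = sd_to_iq_points(rpm_gain_sd)
height_equiv_points = sd_to_iq_points(height_gain_sd)
ratio = rpm_gain_sd / height_gain_sd

print(rpm_points)           # 15
print(height_equiv_points)  # 10
print(ratio)                # 3/2
```

On this assumed scale, the height gain is equivalent to 10 points per generation against 15 for RPM scores, a ratio of 1.5, which is consistent with describing the two as being of the same order of magnitude.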
My own view is that it would have been virtually impossible to have obtained
meaningful evidence on the issues in question if the tests had not been available.
The impressive stability in the norms across cultures at any given point in time,
the stability in the test properties and item statistics across time and socioeco-
nomic and ethnic group, the consistency of the increases from year to year, and
the persistence of the ethnic differences imply that we are dealing with
psychological processes of fundamental importance.
Despite my disagreement with Flynn's general conclusion, I feel that his
compilation of data is of the greatest theoretical and practical importance. These
data demonstrate that some, as yet unidentified, features of the environment
have dramatic effects on eductive ability.
Another set of insights that stem from the material summarised in this article
are those that relate to the controversy about test bias which has troubled
American education for two decades. Because RPM and MHV scores are
relatively unambiguous, the data that have been reviewed direct attention toward
a constructive search for explanations of differences and for educational policies
that reflect the socioeconomic realities, values, and educational needs of the
groups concerned. (See Raven (1987) and the article by Tharp, Gallimore, and
others (Tharp et al., 1984) for a fuller discussion of the way in which this might be
done.) Nevertheless, to fully capitalize on these observations it is essential for
psychologists to undertake the fundamental research that is needed to develop
measures of a wider range of human talents. It seems that we are still trying to
administer our educational system with the aid of concepts and tests developed
by Binet at the turn of the century. (A new attempt to provide a conceptual
framework and measurement model appropriate to thinking about the assess-
ment, development and release of a wider range of talents is available in Raven,
1984, 1988, in press-a.)
It may be useful to expand on what was said in the last paragraph by taking an
example. The material that has been presented shows that, if Federal US funds
for Gifted Education were administered on the basis of National US norms, those
funds would be channelled into what are already very wealthy school districts. If
this is regarded as inequitable, and if it is therefore felt that local instead of
national norms should be used to determine the level of provision of, and
eligibility for entry to, both "gifted" and "remedial" forms of special education,
then the logic of the argument points to the use of local ethnic norms. In the end,
therefore, decisions about which norms to use depend on judgments about
whether Special Education is a good thing or a bad thing: does it help or does it
label? By introducing these facts and considerations into the debate it is possible
to direct attention to the goals of policy and the steps that are required to reach
them.
Psychologists should be arguing much more forcefully for a major role in
identifying policy goals and the processes to be used to reach them and in
developing the tools needed to administer the programmes and assess their
effects. If they did this, they would find themselves developing a whole new range
of understandings of the qualities which are to be fostered in the course of
educational programmes, the processes to be used to nurture them, and the
psychometric models to be used to assess educational processes and outcomes
(Raven, 1977, 1984, 1988).
Summary
At their most basic level, the data briefly summarised in this paper suggest that
there is considerable stability in SPM and MHV performance both within and
between societies with a literary tradition at any given point in time. This is true
at the level of item statistics as well as both mean scores and variance. On the
other hand, there has been an impressively continuous increase in scores over
time. Despite these increases, the differences between socioeconomic and ethnic
groups, both within the United States and across cultures, remain.
From a more fundamental point of view the material suggests that the use of
theoretically based tests which have good psychometric properties has:
• demonstrated that we are dealing with psychological processes that are of
Notes
1Compilation of these norms is continuing. The author would welcome correspondence
from anyone able to contribute to the process and, in particular, from anyone able to
establish norms for an urban, black sample.
2There is no doubt that there has been a major increase in scores over time and that
Flynn has done an outstanding service by unearthing and bringing together studies that
demonstrate this. Likewise, there is no doubt it was necessary for Flynn, like most other
researchers, to employ statistics that do not fully satisfy the relevant statistical assump-
References
Adams, A. E. (1952). Analysis of Raven's matrices scores: Preliminary report. Surrey,
England: Surrey Educational Research Association.
Baird, D., & Scott, E. M. (1953). Intelligence and childbearing. Eugenics Review, 45,
139-154.
Benton, D., & Roberts, G. (1988). Effect of vitamin and mineral supplementation on
intelligence of a sample of schoolchildren. Lancet, 140-143.
Bock, G., Stebbins, L. B., & Proper, E.C. (1977). Education as experimentation: A
planned variation model (Vol. IV-B). Effects of follow through models. Cambridge,
MA: Abt Associates.
Bouvier, U. (1969). Evolution des cotes a quelques test. Belgium: Centre de Recherches,
Forces Armees Belges.
Budoff, M., & Corman, L. (1976). Effectiveness of a learning potential procedure in
improving problem-solving skills of retarded and non-retarded children. American
Journal of Mental Deficiency, 81, 260-264.
Budoff, M., Corman, L., & Gimon, A. (1976). An educational test of learning potential
assessment with Spanish speaking youth. Interamerican Journal of Psychology, 10,
13-24.
Burciaga, L. E. (1973). A research study on the Raven Coloured Progressive Matrices
among school children of the El Paso public schools. Doctoral dissertation, University
of Texas, El Paso.
Byrt, E., & Gill, P. E. (1973). Standardisation of Raven's Standard Progressive Matrices
and Mill Hill Vocabulary for the Irish population ages 6-12. Master's thesis, National
University of Ireland, University College Cork.
Chan, J. (1981). Correlates of parent-child interaction and certain psychological variables
Raven, J. (1980). Parents, teachers and children. Edinburgh: Scottish Council for
Research in Education.
Raven, J. (1981). The 1979 British standardisation of the standard progressive matrices
and Mill Hill vocabulary scales, together with comparative data from earlier studies in
the UK, US, Canada, Germany, and Ireland. In J. C. Raven, J. H. Court, & J. Raven
(1987), A manual for Raven's progressive matrices and vocabulary tests (Research
Supp. No. 1). London: H. K. Lewis.
Raven, J. (1984). Competence in modern society: Its identification, development and
release. London: H. K. Lewis.
Raven, J. (1987). Values, diversity and cognitive development. Teachers College Record,
89, 21-38.
Raven, J. (1988). The assessment of competencies. In H. D. Black & W. B. Dockrell
(Eds.), New developments in educational assessment: British Journal of Educational
Psychology, Monograph Series No. 3. 98-126.
Raven, J. (in press-a). A model of competence, motivation and its assessment. In H.
Berlak (Ed.), Assessing academic achievement: Issues and problems. Madison, WI:
National Center for Effective Secondary Schools.
Raven, J. (in press-b). Equity in diversity: The problems posed by values--and their
resolution. In F. Macleod (Ed.), Families and schools: Issues in accountability and
parent power. Brighton, England: Falmer Press.
Raven, J. (in press-c). Parents and schools in early childhood. British Journal of
Educational Psychology (Special issue).
Raven, J. (in press-d). Progressive matrices: Solution strategies and their implications
for education. Madrid: Ministry of Education.
Raven J., & Court, J. H. (in press). Additional normative, reliability, and validity
data--to 1988. Research supplement No. 4 to Raven, J. C., Court, J. H., & Raven, J.
Manual for Raven's Progressive Matrices and Vocabulary Scales. London: H.K.
Lewis; San Antonio, TX: Psychological Corporation.
Raven, J., Johnstone, J., & Varley, T. (1985). Opening the primary classroom.
Edinburgh: Scottish Council for Research in Education.
Raven, J., Summers, W. A., et al. (1986). A compendium of North American normative
and validity studies. In J. C. Raven, J. H. Court, & J. Raven, A manual for Raven's
progressive matrices and vocabulary tests (Research Supp. No. 3). London: H. K.
Lewis; San Antonio, Texas: The Psychological Corporation.
Raven, J. C. (1941). Standardisation of progressive matrices. British Journal of Medical
Psychology, 19, 137-150.
Raven, J.C., Court, J.H., & Raven, J. (1987). A manual for Raven's Progressive
Matrices and Vocabulary Tests. London: H.K. Lewis; San Antonio, Texas: The
Psychological Corporation.
Raven, J. C., & Walshaw, J. B. (1944). Vocabulary Tests. British Journal of Medical
Psychology, 20, 185-194.
Savell, J. M., Twohig, P. T., & Rachford, D. L. (1986). Empirical status of Feuerstein's
"Instrumental Enrichment" (FIE) Technique as a method of teaching thinking skills.
Review of Educational Research, 56, 381-409.
Scott, E. M., Illsley, R., & Thomson, A. M. (1956). A psychological investigation of
primigravidae: II. Maternal social class, age, physique and intelligence. Journal of
Obstetrics and Gynaecology of the British Empire, 63, 338-343.
Sigel, I.E. (1986). Early social experience and the development of representational
competence. In W. Fowler (Ed.), Early experience and the development of competence.
New Directions for child development (No. 32). San Francisco: Jossey-Bass.
Author
JOHN RAVEN, Consultant, 30 Great King St., Edinburgh EH3 6QH, Scotland.
Degrees: BSc, University of Aberdeen; Dip. Soc. Psych., London School of Economics;
PhD, University of Dublin. Specializations: broadly based educational evaluation, links
between education and society.