
Anderson 1991


Psychological Science http://pss.sagepub.com/

Reflections of the Environment in Memory


John R. Anderson and Lael J. Schooler
Psychological Science 1991 2: 396
DOI: 10.1111/j.1467-9280.1991.tb00174.x

The online version of this article can be found at:


http://pss.sagepub.com/content/2/6/396

Published by:

http://www.sagepublications.com

On behalf of:

Association for Psychological Science


>> Version of Record - Nov 1, 1991


Downloaded from pss.sagepub.com at COLUMBIA UNIV on October 14, 2014


PSYCHOLOGICAL SCIENCE

Research Article

REFLECTIONS OF THE
ENVIRONMENT IN MEMORY
John R. Anderson and Lael J. Schooler
Department of Psychology, Carnegie Mellon University

Abstract: Availability of human memories for specific items shows reliable relationships to frequency, recency, and pattern of prior exposures to the item. These relationships have defied a systematic theoretical treatment. A number of environmental sources (New York Times, parental speech, electronic mail) are examined to show that the probability that a memory will be needed also shows reliable relationships to frequency, recency, and pattern of prior exposures. Moreover, the environmental relationships are the same as the memory relationships. It is argued that human memory has the form it does because it is adapted to these environmental relationships. Models for both the environment and human memory are described. Among the memory phenomena addressed are the practice function, the retention function, the effect of spacing of practice, and the relationship between degree of practice and retention.

The title of our paper is inspired by the following remark in Shepard (1990): "We may look into that window on the mind as through a glass darkly, but what we are beginning to discern there looks very much like a reflection of the world" (p. 213). He was commenting on how the principles of perception are exquisitely tuned to the features of the environment in which we live. Basically, Shepard's thesis is that perception has been optimized through evolution to make the best possible inferences about the world given the perceptual input. Recently, Anderson (1989, 1990) has suggested that the same might be true about human memory.

Many people hold the bias that human memory is anything but optimal. They point to the many frustrating failures of memory. However, these criticisms fail to appreciate the task before human memory, which is to try to manage a huge stockpile of memories. In any system responsible for managing a vast data base there must be failures of retrieval. It is just too expensive to maintain access to an unbounded number of items.

Given the initial bias against human memory, it would be particularly compelling if we could show that human memory were optimal. How does a system behave optimally when it is faced with a huge data base of items and cannot make all of them instantaneously available? It would be behaving optimally if it made most available those items that were most likely to be needed.

In this paper we explore the issue of whether human memory is behaving optimally with respect to the pattern of past information presentation. Each item in memory has had some history of past use. For instance, our memory for one person's name may not have been used in the past month but might have been used five times in the month previous to that. What is the probability that the memory will be needed (used) during the current day? Memory would be behaving optimally if it made this memory less available than memories that were more likely to be used but made it more available than less likely memories.

In this paper we examine a number of environmental sources to determine how probability of a memory being needed varies with pattern of past use. However, we first review how availability in human memory varies with pattern of past use. Some aspects of this problem have been extensively studied in empirical studies of human memory.

FORM OF THE MEMORY FUNCTIONS

Two of the most basic statistics we might gather about pattern of past use are how often a memory has been practiced and how long it has been since it was last practiced. Learning functions and retention functions to describe these two aspects of human memory have been collected since the original experiments of Ebbinghaus (1885/1964). Figure 1 shows the retention function and practice function obtained by Ebbinghaus.

The Retention Function

Ebbinghaus measured retention in terms of the percent savings in relearning a list of nonsense syllables. The function shows the classic negative acceleration typical of such retention functions. In order to be able to compare this memory function to the environment, we need to decide how to characterize the forgetting function. Some (e.g., Loftus, 1985) have suggested that these functions satisfy an exponential formula:

$$P = A e^{-bT} \tag{1}$$

where P is the performance measure, T is the delay time, and A and b are parameters of the model. The intuitive appeal of an exponential function probably explains why it is so often suggested. It implies that during each unit of time, the memory loses a constant fraction of what is left. This process evokes images of radioactive decay, an analogy often used to describe forgetting. One can investigate whether this function holds by performing a log transformation of the performance scale. If the underlying relationship is exponential, a linear relationship should obtain between log performance and time:

$$\log P = \log A - bT. \tag{2}$$

A precondition to performing an adequate test of such a function is that we have a large manipulation of the time scale.
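The log-transformation test for Equation 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis: the parameter values `a=60.0` and `b=0.01`, the delays, and the helper name `exponential_retention` are hypothetical and chosen to make the arithmetic easy to follow. The sketch checks the two properties described above: a constant fraction is lost over each equal time step, and log P falls linearly in T with slope -b.

```python
import math

def exponential_retention(t, a=60.0, b=0.01):
    """Equation 1: P = A * exp(-b * T). The values of a and b are assumed."""
    return a * math.exp(-b * t)

# Hypothetical, equally spaced delays (in hours).
delays = [0, 100, 200, 300, 400]
performance = [exponential_retention(t) for t in delays]

# Property 1: each equal time step keeps the same fraction of what was left.
ratios = [performance[i + 1] / performance[i] for i in range(len(delays) - 1)]

# Property 2 (Equation 2): log P = log A - b*T, so the change in log P per
# unit of time is the same between every pair of neighbouring delays.
slopes = [
    (math.log(performance[i + 1]) - math.log(performance[i]))
    / (delays[i + 1] - delays[i])
    for i in range(len(delays) - 1)
]

print(ratios)
print(slopes)
```

Data that follow some other function, such as a power function, would fail the second check: the slopes would shrink in magnitude as the delay grows.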

396 Copyright © 1991 American Psychological Society VOL. 2, NO.6, NOVEMBER 1991

[Figure 1. Panel (a): Ebbinghaus's Retention Data; x-axis: Hours of Delay. Panel (b): Ebbinghaus's Practice Data; x-axis: Days of Practice.]

Fig. 1. (a) Ebbinghaus's (1885/1964) retention function showing percent savings as a function of delay. Ebbinghaus used delays from 20 minutes to 31 days. (b) Ebbinghaus's practice data showing total number of trials to master a set of lists as a function of number of days of practice.

Ebbinghaus's data certainly satisfy this precondition, as he varied retention intervals from 20 minutes to 31 days.

Figure 2a illustrates the Ebbinghaus data with the performance scale transformed. As may be observed, the resulting function is anything but linear. Thus, despite its popularity, the hypothesis of an exponential forgetting function is not supported. Wickelgren (1976), using a d' memory measure and delays from 2 minutes to 14 days, found evidence for a power function relating delay to retention.¹ A power function has the form:

$$P = A T^{-b} \tag{3}$$

1. Actually, Wickelgren's theory also had an exponential component that would dominate the power component at very long delays.

[Figure 2. Panel (a): Ebbinghaus's Retention Data with Log Transformation of the Performance Scale; x-axis: Hours of Delay. Panel (b): Ebbinghaus's Retention Data with Log Transformations of Both Scales; x-axis: Log Hours of Delay; printed fit with intercept 3.862 and slope -0.126, R^2 = 0.978.]

Fig. 2. The retention data from Figure 1 with (a) the performance measure transformed according to a
logarithmic function and (b) both performance and delay scales transformed according to a logarithmic
function.
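The comparison made in Figure 2 can be sketched as follows. The sketch does not use Ebbinghaus's observations: the values are synthetic, generated from the log-log line reported for Figure 2b (intercept 3.862, slope -0.126), and the particular delays and the helper `least_squares` are assumptions made for illustration. It shows that data following a power function look far from linear when only the performance scale is log transformed, but exactly linear when both scales are. The perfect log-log fit here is by construction, since the data come from the fitted line.

```python
import math

def least_squares(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope, (sxy * sxy) / (sxx * syy)

# Illustrative delays in hours, spanning 20 minutes to 31 days.
delays_hours = [1 / 3, 1, 9, 24, 48, 144, 744]

# Synthetic (not observed) log performance from the reported log-log line.
log_p = [3.862 - 0.126 * math.log(t) for t in delays_hours]

# Exponential hypothesis: log P should be linear in the raw delay.
_, _, r2_semilog = least_squares(delays_hours, log_p)

# Power hypothesis: log P should be linear in log delay.
log_delays = [math.log(t) for t in delays_hours]
intercept, slope, r2_loglog = least_squares(log_delays, log_p)

print(round(r2_semilog, 3), round(r2_loglog, 3))
print(round(intercept, 3), round(slope, 3))
```

With real data both fits would be imperfect, and the question is which transformation leaves the smaller systematic curvature.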


This can produce a very slowly decaying memory function. If one performs log transformations of both the performance measure and the time measure, one obtains a linear relationship:

$$\log P = \log A - b \log T. \tag{4}$$

Figure 2b illustrates the Ebbinghaus data with both scales log transformed. As can be seen, one gets a very good approximation to a linear relationship in these log scales with log A = 3.862 and b = .126. If we go back to the original scales, we get a relationship of the form:

$$P = 47.56\, T^{-.126}. \tag{5}$$

The exponent .126 can be taken as the forgetting rate.

A power function implies that the performance measure will go to infinity as time goes to zero. In contrast, an exponential function implies a bound on how good performance can be at T = 0. Although we never realize a true delay of zero, we still can fail to find power functions if we use scales with an upper bound. Probability of recall is such a scale. Ebbinghaus's percent savings is another scale, but even at the 20-minute delay in Ebbinghaus's experiment there was only 58% savings, so the ceiling was not approached. Power functions for forgetting tend to be obtained when we use measures that do not have upper bounds or do not approach their upper bounds. The d' measure of Wickelgren is a scale that does not have an artificial upper bound. Later we will also advocate recall odds rather than recall probability, since odds vary from zero to infinity. Recall time is another measure that ranges from zero to infinity and tends to yield power functions for retention, although in this case we have to switch the sign of the exponent since recall time increases with delay.

One of our goals is to explain why retention functions tend to satisfy a power relationship. Given that people have preferred an exponential function on an intuitive basis, such an explanation would be a nontrivial result. Power functions seem to describe memory performance from a few seconds to years. As Wickelgren (1974) has argued, there does not seem to be any discontinuity that would be associated with a shift from short-term memory to long-term memory. It will be a significant result if we can find a reason for predicting a power function (in contrast to an exponential function) from an analysis of the environment.

The Practice Function

We can ask the same thing about the practice functions: are they better fit by an exponential form or a power form? The measure used in Ebbinghaus's Figure 1b is appropriate for addressing this question. Plotted there are the number of trials to learn a list of 36 nonsense syllables to a criterion of one correct anticipation. Ebbinghaus practiced these lists each successive day and so we see the improvement across days with practice.

Figure 3 compares how well exponential and power functions fit these data. The range of practice (1 to 6 days) is not large enough to enable a clear discrimination among the functions, although the power function produces a somewhat better fit. This practice function has been explored over much larger ranges of practice and a power function typically provides a better fit (Newell & Rosenbloom, 1981), although there has

[Figure 3. Panel (a): Ebbinghaus's Practice Data with log Transformation of the Performance Scale; x-axis: Days of Practice; fit as printed: log Trials = 4.24 - 0.50 Days, R^2 = 0.949. Panel (b): Ebbinghaus's Practice Data with log Transformations of Both Scales; x-axis: log Days of Practice; fit as printed: log Trials = 4.08 - 1.44 log Days, R^2 = 0.996.]

Fig. 3. The practice data from Figure 1b with (a) the performance measure transformed according to a logarithmic function and (b) both performance measures transformed according to a logarithmic function.
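The step from the log-log line to the retention function in the original scales, and the contrast between bounded and unbounded functions near a delay of zero, can be checked numerically. The sketch below is an illustration only: the intercept 3.862 and exponent .126 are the values reported in the text, while the exponential parameters (`a=60.0`, `b=0.01`) and the function names are assumptions introduced here for comparison.

```python
import math

LOG_A = 3.862  # intercept of the log-log line reported in the text
B = 0.126      # forgetting rate (exponent) reported in the text

def power_retention(t_hours):
    """Power function rebuilt from the log-log line: P = exp(log A) * T**(-b)."""
    return math.exp(LOG_A) * t_hours ** (-B)

def exponential_retention(t_hours, a=60.0, b=0.01):
    """Exponential function with assumed parameters; bounded above by a."""
    return a * math.exp(-b * t_hours)

# Exponentiating the intercept recovers the scale constant of the power function.
scale = math.exp(LOG_A)
print(round(scale, 2))

# As the delay shrinks toward zero the power function keeps growing,
# while the exponential function approaches its ceiling a.
for t in [1.0, 1e-3, 1e-6, 1e-9]:
    print(t, round(power_retention(t), 1), round(exponential_retention(t), 1))
```

This is why a measure with a ceiling, such as recall probability, can hide a power relationship at short delays: the function would have to exceed the ceiling.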


again been a history of initial preference for the exponential function (Mazur & Hastie, 1975; Restle & Greeno, 1970). Another goal we have is to provide an environmental explanation for why there is this ubiquitous practice function. Again this result is not trivial given the initial beliefs that the learning function should be exponential in form.

The power function that corresponds to the data in Figure 3 is:

$$P = 513\, S^{-1.24} \tag{6}$$

where S is the number of days of study. The size of this exponent can be interpreted as the learning rate.

Implications of Power Functions

Note that in Figures 1-3 we are measuring retention by a savings measure, where larger numbers are better, while we are measuring practice by a trials-to-relearn measure, where large numbers are worse. Throughout the literature one can find a variety of performance scales, some of which have a positive valence like savings and others of which have a negative valence like trials to relearn. Later we have more to say about percent correct, the most common positive valence scale, and reaction time, the most common negative valence scale. Generally, power functions are found whatever scale is used (provided it is not a scale with an upper bound, or if it is, the upper bound is not approached). Forgetting functions display a negative slope on positive valence scales and a positive slope on negative valence scales. This relation is reversed for practice functions. It might seem curious that power functions appear for different performance scales, but the power relationship is a strong one and will be approximately maintained by many transformations of scale. As a final comment, we should say we have no investment in the claim that these empirical functions are best modeled or correctly modeled as power functions. For our purposes, it is enough to note that power functions give remarkably good approximations. Our goal is to show that these remarkably good approximations are implied by the structure of the environmental input to memory.

A number of recent theories are capable of accounting for power-law learning (Anderson, 1982; Lewis, 1978; Logan, 1988; McKay, 1988; Newell & Rosenbloom, 1981; Shrager, Hogg, & Huberman, 1988). Except for Anderson (1982), however, none of these theories are capable of accounting for the forgetting function. This is a serious deficit. Any extended practice must be taking place over many days and it is reasonable to assume that subjects are forgetting the impact of the early training. Models that predict power-law learning but ignore forgetting might well fail to predict power-law learning when forgetting is factored in. The model in Anderson (1982) basically assumes that the power-law learning function arises from a simple linear learning process being slowed down by forgetting. That theory led to the prediction that the forgetting exponent and the practice exponent should sum to 1 (see Anderson, 1982, for derivation). There is no evidence for that prediction in the Ebbinghaus (1885/1964) data nor in any other experimental effort that has obtained estimates of both practice effects and forgetting effects.

Thus, it is fair to say that there is no theory of human memory that adequately predicts both the practice and forgetting functions. This is a pretty startling result since it has been a field of constant research and theorizing for over 100 years.

The Spacing Effect

One other effect that we would like to note creates even greater stress on theories of memory: the spacing effect (Bahrick, 1979; Glenberg, 1976). It is found that the spacing between successive repetitions of an item affects how well the item is remembered. Moreover, this effect interacts with the delay between the last study of an item and the test. Figure 4 displays the results from Glenberg (1976). In this experiment there were two studies of an item followed by a test. The data are organized according to the lag between the two studies and the lag between the second study and the test. At short test lags, recall is better the shorter the study lag. This can be seen as derivative from what we have seen about the retention curve. The longer the study lag, the greater the retention interval from the first study to the final test. However, when the test lag is long, there is better recall the longer the study lag. This result contradicts what we would extrapolate from the retention curve alone. The spacing effects might be characterized as showing greatest recall when study lag matches test lag. Whether this conclusion is correct or not is unclear, but there is abundant evidence for an interaction of the sort illustrated in Figure 4 between study lag and test lag.

[Figure 4. Proportion recalled against Number of Events Between Two Presentations (0 to 40), with separate curves for retention intervals of 2, 8, 32, and 64 events.]

Fig. 4. The proportion of paired-associate responses recalled as a function of the number of events between two presentations of repeated items (lag interval) and the number of events between the second presentation and the test (retention interval). From Glenberg (1976).

No theory of human memory, including Anderson (1982), has been able to account adequately for practice effects, retention effects, and the spacing effect. The reason should be apparent, as the three effects would seem to be somewhat in contradiction. Holding test lag to last presentation constant, the advantage of each presentation should diminish as they are spaced further apart because we are increasing the retention intervals from the earlier presentations. However, the spacing effect tells us that this is not always true. One should not think, however, that the spacing effect eliminates the retention effect. The biggest effect in Figure 4 is the retention effect, which is reflected in how far apart the separate curves are. One way of characterizing what is going on is that there is a large effect of delay since last presentation but that the other delays have a less clear effect.

A number of theories have been able to predict simultaneously a practice function, a retention function, and a spacing effect (Estes, 1955; Landauer, 1975; Glenberg, 1976). However, it does not seem that they can predict the power-function form that these functions appear to take. These theories assume that memories get associated to contexts that gradually change over time. The practice function simply results from the increased associations to context with repetition. The retention function results because with time, the test context changes from the learning context. The spacing effects result because at long lags memories are likely to be associated to different contexts. This results in increased probability that the test context will overlap with one of the study contexts. Such a model might well be given an expression that would produce the parametric form of the three effects. However, we and others have been frustrated in our attempts to find such an expression.²

AN ENVIRONMENTAL EXPLANATION

Given that there have been no successful mechanistic explanations for practice, retention, and spacing phenomena, it becomes all the more interesting to see whether we can explain these phenomena from the assumption that the memory system is adapted to the structure of the environment. The basic idea is that at any point in time, memories vary in how likely they are to be needed and the memory system tries to make available those memories that are most likely to be useful. The memory system can use the past history of use of a memory to estimate whether the memory is likely to be needed now. This view sees human memory in some sense as making a statistical inference. However, it does not imply that memory is explicitly engaged in statistical computations. Rather, the claim is that whatever memory is doing parallels a correct statistical inference.

What memory is inferring is something we call the need probability, which is the probability that we will need a particular memory trace now. The basic assumption developed in Anderson (1990) is that memories are considered in order of their need probabilities until the need probability is so low that it no longer is worth considering any more. If we let p be the need probability, C be the cost of considering a memory, and G be the gain associated with a successful retrieval, one should stop when C > pG.

Despite the description of this process in terms that evoke images of memories being considered one at a time, there are equivalent parallel processes. We prefer a parallel model in which different memories are allocated different resources according to their need probability. However, for current purposes we simply note that this analysis does not imply a commitment as to the mechanism of retrieval.

Relationship between Need Odds and Behavioral Measures

This analysis does allow predictions to be derived about the relationship between need probability and the dependent measures of recall latency and recall accuracy. With respect to recall latency, the critical assumption is that there is a distribution of memories in terms of their estimated need probabilities. The reasonable assumption is that there will be a mass of need probabilities near zero with a tail of a few higher probability memories; that is to say, the distribution of memories will be J-shaped or highly skewed. It is more convenient to think about the shape of such a distribution in terms of need odds. If p is need probability, then q = p/(1 - p) will be need odds. An odds measure has the advantage of varying from zero to infinity. Thus, the expectation is that most memories will have near-zero odds and a rapidly diminishing few will have higher odds.

A great many phenomena show such J-shaped distributions, including distributions of scientists by number of publications, words by frequency, and firms by size. Simon and Ijiri (1977) present the following density as characterizing such distributions:

$$f(x) = a x^{-k} \tag{7}$$

where f is the frequency of an item of measure x (e.g., word frequency, firm size, or need odds) and a and k are constants.

If we assume that memories are examined in order of odds, then the time to examine a memory with odds q will be proportional to the number of memories with odds greater than q. This can be calculated as:

$$\int_q^{\infty} a x^{-k}\,dx = b\,q^{-(k-1)} \tag{8}$$

where b = a/(k - 1). Thus, we see that time is related to need odds as a power function with exponent (k - 1). Thus, if odds were related to retention interval or practice as a power relation with exponent c, then time would be related to retention interval or practice with exponent c(k - 1). The force of this analysis is that power functions in need probability imply power functions in time, although not necessarily with the same exponent. If k = 2, the exponent will be the same. Simon and Ijiri report that values of k = 2 are common.

The above was an analysis of time. Anderson and Milson (1989) can be consulted for a similar analysis of recall probability. The basic assumption there is that recall will stop before retrieving the target item if its need probability is too low. This

2. Wickelgren (1972) produced a mathematical theory that was tailored to the form of the retention function but does not address the form of practice function. It mispredicts the spacing effect in that it claims that the utility of later presentations is a function of how distant they are from the first. It has no role for the lag among these later presentations.
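The serial description of retrieval, in which memories are considered in order of need probability until C > pG, can be sketched in Python. This is a toy illustration of the stopping rule and of the odds transformation q = p/(1 - p), not a model proposed in the paper (which prefers an equivalent parallel formulation). The list of need probabilities, the values `cost=1.0` and `gain=40.0`, and the function names are all hypothetical.

```python
def need_odds(p):
    """Need odds q = p / (1 - p) for a need probability 0 <= p < 1."""
    return p / (1.0 - p)

def memories_considered(need_probabilities, cost, gain):
    """Return the need probabilities examined, highest first, stopping as
    soon as the cost C of considering a memory exceeds its expected gain pG."""
    examined = []
    for p in sorted(need_probabilities, reverse=True):
        if cost > p * gain:  # stopping rule: C > pG
            break
        examined.append(p)
    return examined

# Hypothetical need probabilities: many near zero, a few higher (J-shaped).
probabilities = [0.50, 0.20, 0.05, 0.02, 0.01, 0.01, 0.005]
examined = memories_considered(probabilities, cost=1.0, gain=40.0)

print(examined)
print([round(need_odds(p), 4) for p in examined])
```

In this serial reading, the position of a memory in the examined list plays the role of retrieval time: the more memories have higher odds, the later a given memory is reached.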


might seem to imply a step function in which all items above a certain need probability are recalled and all below are not recalled. However, there has to be some noise in the process such that the distance between an item's need probability and the threshold varies. A natural scale on which to try to model this variation is log need odds, which varies from minus infinity to infinity. If we assume that there is a normal distribution of estimated log need odds around true need odds, we predict a sigmoidal function rather than a step function relating need odds to recall odds. Anderson and Milson show that this relation implies a power relationship between need odds and recall odds. Thus, as in the case of time, we see that the natural prediction is that a power function in need odds implies a power function in the observed behavior. Again, the exponent need not be the same.

These considerations about recall odds and reaction time greatly simplify our research program. They mean that these dependent measures should directly reflect the functional form and ordinal relationships displayed by need odds. Thus, we can look to see whether need odds functions are power functions like the behavioral functions. It is not necessary that they have the same parameters such as exponent or scale constant for the power function. For instance, it is reasonable to suppose recall odds will be much greater than the corresponding need odds, but they should have the same functional forms.

INFORMATION ABOUT ENVIRONMENTAL STRUCTURE

What we need to find out is how past history of usage of information predicts the probability that the knowledge will be used in the next time interval. Anderson and Milson (1989) developed a theory based on mathematical models that were developed to explain library borrowings and accesses to files in computer systems. While this approach has some strengths, it has two considerable weaknesses that we hope to redress in this paper. First, while these are examples of systems that have to retrieve information, they are not systems facing human retrieval demands and so we are left with an argument by analogy. Second, while a formal model has some analytic advantages, it obscures the very direct relationship being proposed between the environment and memory, leading some (e.g., Simon, in press) to claim that the predictions rest on the auxiliary assumptions in the environmental model. Quite the contrary, it is the case that the predictions are a direct reflection of the structure of the environment.

Ideally, we would like to follow people about determining when demands are being made on their memory to retrieve a piece of information and how demands for the same piece of information tend to repeat over time. While this is technically infeasible, it is possible to study certain subsets of demands that are being placed on human memory. We have studied the following three sources (see Schooler and Anderson, unpublished, for detailed information about each source):

1. We have analyzed 730 days of New York Times headlines from January 1, 1986, to December 31, 1987. Every time a particular word like "Qaddafi" appears in the New York Times headline, this is a demand on a potential reader of the article to retrieve information about the referent of that word to decide whether this is an article that the reader might want to read.

2. We have looked at the subset of the CHILDES data base of MacWhinney and Snow (1990) looking at children's verbal interactions. Every time someone says a word to a child, this is a demand on the child to retrieve the word's meaning.

3. We have looked at the electronic mail messages the first author (J.A.) received from March 1985 to December 1989. Here we have analyzed the senders of the messages. The assumption here is that every time J.A. receives a message from a certain person, that is another demand to retrieve some information from J.A.'s memory about the sender.

Figure 5 illustrates the pattern of usage of some words over a 100-day period for the New York Times. The question of interest is how does this pattern of use over the 100 days predict the probability of use on the 101st day? In addressing this question we can look at the relationship between various statistics describing the past 100 days and probability of occurring on the 101st day. For instance, "Reagan" occurs 52 times in that 100-day period. We can look at need probability on the 101st day. This would be representative of an item that has had 52 practices in an experiment and we are looking at its recall. It turns out in this case "Reagan" actually appeared in the headlines on day 101 but aggregating over items that appeared 52 times in a 100-day window, some will appear on day 101 and some will not. We can use the empirical proportion as an estimate of the probability that an item used 52 times in 100 days will be needed on day 101.

The Practice Function

Figure 6a shows the relationship between the number of previous days on which a word has appeared during the past 100

[Figure 5. Patterns of Word Usage (New York Times); x-axis: Date; one row of occurrence marks per word, including "of", "reagan", "american", "challenger", and "Qaddafi".]

Fig. 5. Patterns of usage of various words in the New York Times data base over a 100-day period.
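The estimation idea described for the headline data (count how often an item occurred in a window, then take the proportion of such items that occur on the following day) can be sketched as below. This is a minimal sketch under assumptions, not the authors' procedure: the use of every possible sliding window, the data layout (`appearance_days` mapping each item to a set of day numbers), the function name, and the toy record are all hypothetical.

```python
from collections import defaultdict

def need_probability_by_frequency(appearance_days, total_days, window=100):
    """Estimate P(item appears on the day after a window) as a function of
    the number of days within the window on which the item appeared.

    appearance_days maps each item to the set of day numbers (1-based) on
    which it occurred. Every run of `window` consecutive days that is
    followed by another recorded day contributes one observation per item.
    """
    hits = defaultdict(int)    # frequency -> times the item appeared next day
    trials = defaultdict(int)  # frequency -> number of (item, window) pairs
    for days in appearance_days.values():
        for start in range(1, total_days - window + 1):
            target_day = start + window
            frequency = sum(1 for d in days if start <= d < target_day)
            trials[frequency] += 1
            if target_day in days:
                hits[frequency] += 1
    return {f: hits[f] / trials[f] for f in trials}

# Tiny made-up record: 6 days, window of 3 days (so targets are days 4 to 6).
toy_record = {
    "alpha": {1, 2, 3, 4, 5, 6},  # appears every day
    "beta": {1, 4},               # appears occasionally
    "gamma": set(),               # never appears
}
estimates = need_probability_by_frequency(toy_record, total_days=6, window=3)
print(estimates)
```

With a real corpus the call would use `window=100`, and the resulting dictionary would give one estimated need probability per observed frequency, which is the kind of relationship plotted against frequency in the analyses that follow.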


[Figure 6. Top row, probability against frequency: (a) New York Times Practice, x-axis: Frequency in Past 100 Days, printed slope .01 per occurrence, R^2 = 0.997; (b) Parental Speech Practice, x-axis: Frequency in Past 100 Utterances, printed slope .0076 per occurrence, R^2 = 0.964; (c) Mail Sources Practice, x-axis: Frequency in Past 100 Days, printed slope .009 per occurrence, R^2 = 0.999. Bottom row, log odds against Log Frequency: (d) New York Times Practice; (e) Parental Speech Practice; (f) Mail Sources Practice. Printed fits for the bottom row: Log Odds = -4.88 + 1.13 Log Frequency; Log Odds = -5.26 + 1.28 Log Frequency; Log Odds = -5.07 + 1.15 Log Frequency; R^2 between .994 and .996.]

Fig. 6. (a) Probability of a word occurring in a headline of the New York Times on Day 101 as a function of the number of times it occurred in the previous 100 days; (b) probability of a word occurring in the 101st utterance from a parent as a function of the number of times it occurred in the previous 100 utterances; (c) probability of receiving a message on the 101st day from a source as a function of the number of times messages were received from that source in the previous 100 days. Panels (d-f) provide transformations of (a-c), plotting log need odds against log frequency.
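The relation between the two rows of Figure 6 can be explored with a small calculation. The sketch below is an illustration under assumptions rather than a reanalysis: it takes a probability that is exactly linear in frequency, with the proportionality constant of .9 reported for the mail data, converts it to log need odds, and measures the local slope against log frequency. The function names are hypothetical.

```python
import math

def need_probability(frequency, constant, window=100):
    """Linear relationship of panels (a-c): p = constant * frequency / window."""
    return constant * frequency / window

def log_need_odds(p):
    """Log of the need odds q = p / (1 - p)."""
    return math.log(p / (1.0 - p))

def local_slope(f_low, f_high, constant):
    """Slope of log need odds against log frequency between two frequencies."""
    rise = log_need_odds(need_probability(f_high, constant)) - log_need_odds(
        need_probability(f_low, constant)
    )
    return rise / (math.log(f_high) - math.log(f_low))

# Constant of proportionality reported for the mail messages.
MAIL_CONSTANT = 0.9

print(need_probability(52, MAIL_CONSTANT))
print(round(local_slope(1, 10, MAIL_CONSTANT), 2))   # low frequencies
print(round(local_slope(50, 90, MAIL_CONSTANT), 2))  # high frequencies
```

Under this idealization the log-log slope stays close to 1 while probabilities are small and steepens as the probability approaches 1, so an exactly linear probability function is only approximately a power function in odds. Whether this accounts for any departure from linearity in the empirical panels is not something the sketch can establish.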

days and the probability it will appear in the current day. We have plotted probability of occurrence on the 101st day against number of uses in the previous 100 days. This analysis reveals a particularly straightforward relationship. In this data base, future probability of use perfectly reflects the proportion of past use in the data base.

Figure 6b shows a similar analysis for the child language data base. Here we are looking at the probability of a word occurring in the 101st utterance to the child as a function of the number of times it appeared in the previous 100 utterances to the child. Again we have plotted probability of use against number of prior utterances. The relationship is again linear, although we find that past proportion overestimates future use. Basically, if an item has occurred in a proportion P of the past 100 utterances, it has a probability .76P of occurring in the next utterance.

Finally, Figure 6c shows a similar analysis for the electronic mail data. Again a linear relationship is found, but this time the function is .9P.

Simon (1955) noted that the probability of an item being repeated was proportional to its past frequency of usage in a number of sources. We have just replicated this result. The constant of proportionality (1.0 for New York Times, .76 for child language, and .9 for mail messages) reflects the rate at which new terms are appearing. One minus this constant is the probability that the next item is a new term.

In Figures 6a-c we have plotted the relationship between need probability and frequency. Our prediction is that there should be a power relationship between need odds and frequency or a linear relationship between log need odds and log frequency. Figures 6d-f plot log odds rather than log probability. Generally, there is a strong correlation between log need odds and log frequency but systematic deviations appear for frequencies over 50. We have estimated best-fitting linear functions for frequencies under 50 and the results are every bit as good as in the original Figures 6a-c. We are not bothered by deviations for frequencies over 50 because these represent very few items. In the case of the New York Times, they are a few


functor words. In the case of electronic mail, they are two individuals. There are no such items in the case of child language. These few items do represent extremes that are not realized in memory experiments that produce power functions. They are items that occur nearly every day of our lives and no memory experiment comes close to creating that ubiquitous a learning experience.

The Retention Function

We also used a window of 100 days in analyzing the New York Times for an analog of the retention function. Here we look at probability of recall on the 101st day as a function of how many days have elapsed since the item last occurred in that 100-day window. Figure 7a shows this relationship with an untransformed scale, and Figure 7d shows the relationship plotting log need odds against log time. As can be seen, the data in Figure 7a show the typical negative acceleration of a retention curve, and the data in Figure 7d show that this satisfies a power function with exponent .73. Figures 7b and e show the comparable analysis from the child language data. Here we plotted probability that the word would appear in the 101st utterance to the child as a function of where last it appeared in the last 100 utterances. Figure 7e shows another power relationship, this time with exponent .77. Figures 7c and f show the data for the mail messages. Again a linear relationship appears in the case of the log transformed data in Figure 7f, implying a power relationship. In this case the exponent is .83. Although we have not bothered to include the plots, in each case the data do not satisfy an exponential relationship.

Spacing Effects

We tried to find an analog of the Glenberg study in the environment. For the New York Times, we selected cases where a word occurred exactly twice in the past 100 days and considered the probability of its occurring on day 101. We analyzed this probability of occurrence as a function of the lag between the two occurrences (the analog of study lag) and the lag between the second occurrence and test (the analog of test lag).

[Figure 7 plots. Panels (a) New York Times Retention, (b) Parental Speech Retention, (c) Mail Sources Retention: probability against days (a, c) or utterances (b) since last occurrence. Panels (d-f): log odds against log days (d, f) or log utterances (e), with fitted slopes of -0.73, -0.77, and -0.83 (R^2 = 0.993, 0.984, and 0.986).]

Fig. 7. (a) Probability of a word occurring in a headline in the New York Times on day 101 as a function of how long it has been since the word previously occurred; (b) probability of word occurring in the 101st utterance from a parent as a function of how many utterances it has been since the word previously occurred; (c) probability of receiving a mail message from a source as a function of how many days it has been since a message was last received from that source. Panels (d-f) provide transformation of (a-c) plotting log need odds against log time.
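The transformation in panels (d-f) amounts to a straight-line fit of log need odds against log time, whose slope is the negative of the power-function exponent. The following Python sketch shows one way such a fit could be done with ordinary least squares. The helper names are hypothetical, natural logarithms are assumed, and the input probabilities are synthetic values generated from an exact power law with exponent .73 and an illustrative intercept; they are not the New York Times data.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def power_law_fit(times, probabilities):
    """Fit log odds = intercept + slope * log time.

    A power function in the odds appears as a straight line here, and the
    exponent of the power function is -slope.
    """
    log_times = [math.log(t) for t in times]
    log_odds = [math.log(p / (1.0 - p)) for p in probabilities]
    return fit_line(log_times, log_odds)


# Synthetic retention data (illustration only): odds = exp(-1.95) * t ** -0.73
days = list(range(1, 101))
odds = [math.exp(-1.95) * t ** -0.73 for t in days]
probs = [o / (1.0 + o) for o in odds]

intercept, slope = power_law_fit(days, probs)
print(f"intercept={intercept:.2f}, exponent={-slope:.2f}")
```

Real counts would be noisy and would include lags with no observations, so an actual analysis would need to decide how to bin or drop such points before taking logarithms.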


[Figure 8 plots. Panels (a) New York Times Spacing, (b) Parental Speech Spacing, (c) Mail Spacing: probability against the number of days (a, c) or utterances (b) between the two mentionings, with separate curves for items unmentioned for 5, 20, and 40 days or utterances.]

Fig. 8. Analog of the Glenberg study (Fig. 4) in the (a) New York Times, (b) child language data source, and (c) electronic mail data source.

Such data are relatively rare, and therefore we collapsed these into three categories: lags of 1-9, lags of 10-30, and lags of 31-89. Classifying study and test lag according to these three categories gives us a 3 x 3 classification of the data. Figure 8a shows the data organized according to this classification. This figure qualitatively reproduces the data of Glenberg. At short test lags, probability decreases with study lag, but the reverse relationship holds for long test lags. Figure 8b shows the same analysis for the child language data, and Figure 8c shows the data for mail messages. Again the same qualitative interaction appears (see Footnote 3).

The data in both figures are plotted as Glenberg reports his data: the abscissa is study lag and different curves represent different retention lags. This analysis makes the point that, as in Glenberg's data, the big effect is for the retention interval (different curves) and the relatively small effect is for the study lag (shape of individual curves). It is of interest to replot this data looking for the effect of retention interval for various study lags. We have done this in Figure 9, collapsing the two longer study intervals. Two things are apparent. First, the retention function is steeper for the shorter study lag (.20 for long lag and .49 for short lag in New York Times; .45 for long lag and .76 for short lag in parental speech; .48 for long lag and 1.03 for short lag in mail sources). Second, these functions, which are controlled for number of prior studies, show much shallower slopes than those in Figure 7, where it was possible that number of prior studies was confounded with retention interval. The shallower slope is particularly apparent in the case of long lags.

Interactions between Practice and Forgetting

Recently, there has been some controversy as to the form of various forgetting functions at various degrees of learning (Bogartz, 1990; Loftus, 1985; Slamecka & McElree, 1983). Unfortunately, this research has not considered the power function for forgetting (see Footnote 4). Figure 10 shows some of the data that have fueled this controversy on graphs that plot log odds scales. Figure 10a is from Hellyer (1962), who gave one to eight presentations of a three-consonant unit followed by a retention interval of 3 to 27 seconds. Figure 10b is from Krueger (1929), who trained a list of 12 nouns to various degrees of overlearning and then looked at retention from 1 to 28 days. Figure 10c is from Underwood and Keppel (1963), who looked at retention of nine letter associates at 1 or 7 days as a function of number of trials of training. Figures 10a and b use log delay as the abscissa and plot different degrees of learning as different curves. Figure 10c plots amount of learning as the abscissa and has two different curves for the two different delays. All three sources illustrate the same point. Delay and practice have approximately additive effects in these log transformed scales.

One interesting question is what is the relationship in our three environmental sources. Figure 11a shows the retention data from the New York Times broken down into high- and low-frequency items and Figure 11b shows the comparable data for child language. Both data sources show the same approximately additive effect of the two factors. We should stress that we have no investment in the claim that the effects are truly additive in either memory or the environment. Rather, our observation is simply that the two effects are approximately the same.

3. As the Glenberg interaction is basically ordinal, we have chosen to plot Figure 9 in the conceptually simpler need probability rather than the theoretically more correct need odds.

4. Bogartz (1990) focuses on the model in Wickelgren (1972), rather than Wickelgren (1974), where power fits are discussed.

SUMMARY AND CONCLUSIONS

We have now looked at some details surrounding the relationship between retention and practice and found that human memory mirrors, with a remarkable degree of fidelity, the structure that exists in the environment. Both display retention and practice functions that are at least approximately power laws. Retention and practice effects are approximately additive. These are not trivial conclusions and other relationships are


[Figure 9 plots. Panels (a) New York Times Retention Interaction, (b) Parental Speech Retention Interaction, (c) Mail Sources Retention Interaction: retention curves against log days (a, c) or log utterances (b), with separate curves for short lag and long lag.]

Fig. 9. Retention function for items occurring twice in the previous 100 time units: (a) New York Times, (b)
child language, and (c) electronic mail messages. Separate functions are plotted for items for whom the two
occurrences were less than 10 units apart (short lag) and for whom the two occurrences were at least 10 units
apart (long lag).
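The spacing analysis described in the text (items occurring exactly twice in the window, cross-classified by study lag and test lag) can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function names and the input format are hypothetical, the three lag categories (1-9, 10-30, 31-89) are those given in the text, and the example records are made up to show the mechanics rather than drawn from any of the three data bases.

```python
from collections import defaultdict

# Lag categories used in the text for both study lag and test lag.
BINS = [(1, 9), (10, 30), (31, 89)]


def lag_bin(lag):
    """Return the index of the category containing `lag`, or None if outside."""
    for index, (low, high) in enumerate(BINS):
        if low <= lag <= high:
            return index
    return None


def spacing_table(items, test_day=101):
    """Probability of occurrence on the test day per (study bin, test bin) cell.

    `items` is an iterable of (occurrence_days, occurred_on_test_day) pairs,
    where occurrence_days lists the days in the preceding window on which the
    item appeared. Only items with exactly two occurrences are used.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [occurrences on test day, items]
    for days, occurred in items:
        if len(days) != 2:
            continue
        first, second = sorted(days)
        study_lag = second - first      # analog of study lag
        test_lag = test_day - second    # analog of test lag
        cell = (lag_bin(study_lag), lag_bin(test_lag))
        if None in cell:
            continue  # lag falls outside the three categories
        counts[cell][1] += 1
        counts[cell][0] += int(occurred)
    return {cell: hits / total for cell, (hits, total) in counts.items()}


# Made-up records, purely to show the input format.
example = [
    ([10, 15], True),
    ([10, 15], False),
    ([5, 50], True),
    ([1, 2, 3], True),   # three occurrences: ignored
]
print(spacing_table(example))
```

The short-lag versus long-lag split of Figure 9 corresponds to comparing the first study-lag category against the other two combined.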

quite plausible. Evidence for their nontriviality can be seen in the fact that these conclusions have been reached with some reluctance and controversy in psychology, to the extent that we can consider these conclusions established. Finally, there is an interaction between spacing and retention such that retention functions are steeper for more massed practice.

What are we to make of this parallelism between memory and environment? Certainly we can go away with the conclusion that the functioning of memory is remarkably well adapted to the structure of the environment. We also believe that there is a causal link here: that memory has the structure it has because the environment has the structure it has. However, it is possible to hold out for the hypothesis of an accidental correlation between the two.

Formulating the Effects of Practice and Retention

There remains the question of what memory mechanism would actually produce the practice and retention functions we saw. One can aspire to address this question at different levels. One level would be the underlying processes that produce these results. We believe that such an explanation would have to be at the neural level in terms of the physical changes that underlie learning. Short of this, one could aspire to have a mathematical description of how memory would respond to various presentation schedules. There has not been a satisfactory mathematical description to date. However, as a consequence of the analyses we have developed in this paper, we think we are now in possession of such a formulation.

[Figure 10 plots. Panels (a) Hellyer's Data: curves for 8, 4, 2, and 1 presentations against log seconds; (b) Krueger's Data: curves by degree of learning (the legend shows 200%, 150%, and 100% learning) against log days; (c) Underwood and Keppel's Data: curves for 1-day and 7-day retention against log trials of learning.]

Fig. 10. (a) Forgetting curves at four practice levels from Hellyer (1962); (b) forgetting curves at four practice levels from Krueger (1929); (c) effects of practice at two retention intervals from Underwood and Keppel (1963).


[Figure 11 plots. Panels (a) New York Times and (b) Child Language: retention curves against log days (a) or log utterances (b), with separate curves for items with 7-12 occurrences and items with 1-6 occurrences.]

Fig. 11. (a) Retention effects in the New York Times for a word with different frequencies of occurrence; (b) retention effects in the child language data base for items with different frequencies of occurrence.

Before providing a mathematical formulation, we would like to state the basic assumptions behind the model:

0. Strength of a trace provides an encoding of its need odds memory performance.
1. The strengths from individual presentations sum to produce a total strength.
2. Strengths of individual presentations decay as a power function of the time.
3. The exponent of the power function for decay of each presentation decreases as a function of time since previous presentation.

We now give an equation to formalize each of the assumptions 1-3.

[Figure 12 plots. Log strength against log days. Fitted lines: (a) Log S = -1.98 + 0.84 Log D, R^2 = 1.000; (b) Log S = -11.60 + 0.31 Log D, R^2 = 0.964.]

Fig. 12. Practice functions generated by the mathematical model for (a) $d_1 = .125$ and (b) $d_1 = 1.000$.


Let $t_i$ be the time since the ith presentation of an item and $s(t_i)$ be the strength remaining after this time. Then, corresponding to Assumption 1, we have:

$$S = A \sum_{i=1}^{n} s(t_i) \qquad (9)$$

where S is total strength and A is a scale factor. Corresponding to Assumption 2, we have

$$s(t_i) = t_i^{-d_i} \qquad (10)$$

where $d_i$ is an exponent that can be different for each presentation i. In the case of the first presentation, $d_1$ is a parameter of the experiment. It may vary with the type of material. Corresponding to Assumption 3, we have for other $d_i$:

$$d_i = \max\left[d_1,\; b\,(t_i - t_{i-1})^{-d_1}\right] \qquad (11)$$

that is, $d_i$ is the maximum of the decay rate for the initial presentation $d_1$, and $b(t_i - t_{i-1})^{-d_1}$. The basic idea is that the decay rate should also decline according to a power function of the time elapsed between the ith and (i-1)st presentation, $b(t_i - t_{i-1})^{-d_1}$, but that in no case should it become lower than $d_1$. Thus, if we wait a short time for a second presentation, the decay rate for the second presentation will be high; whereas if we wait long enough, the decay rate for the second presentation will be no different than for the first. Intuitively, the closer two studies are together, the smaller the contribution the second makes to the overall strength. While Equation 11 satisfies these constraints, its exact form is a bit arbitrary in that it also has decay rate declining as a power function. There is no evidence one way or the other for this precise an assumption.

We have fit this model to various empirical results. Our goal is to see if we can reproduce the empirical relations we have observed in terms of strength. Since we leave open the mapping of strength onto actual behavioral measures, we can arbitrarily set A = 1 for simplicity. We have also set b = .61, a value that works well for all of our applications, leaving $d_1$ as the one parameter to be chosen.

The model can obviously produce the phenomena of power-law forgetting, since that is directly built into the retention function. We explored the growth in strength of one practice per day when $d_1 = .125$ and when $d_1 = 1.000$. The results are plotted in Figure 12. Both curves approximate a power function quite well, although the approximation is better when $d_1 = .125$. As can be seen, the exponent of the learning curve decreases with the decay exponent as proposed by Anderson (1982), but it is no longer the case that the two sum to 1.

Next, we investigate whether this model can reproduce the spacing effects. Figure 13 shows the strength calculation for Glenberg's (1976) experiment with $d_1 = .125$. The correspondence with Figure 4 is compelling. Finally, we attempted a simulation with $d_1 = 1.5$ of the Hellyer (1962) data in Figure 10a on the additivity of retention and practice effects (Fig. 14). Once again the correspondence between data and simulation is compelling.

[Figure 13 plot. Glenberg Simulation: strength against lag between presentations (0 to 50), with separate curves for test lags of 2, 8, 32, and 64.]

Fig. 13. Simulation of the Glenberg (1976) data (Fig. 4) by the mathematical model.

Although deeper mechanistic explanations would be nice, we think it is an accomplishment to finally have mathematical functions that can capture the effects of practice and delay. We think this has been a direct result of our focus on the structure of the environment. The relationships determining need probability in the environment seem particularly apparent, perhaps because one is not blinded by prior beliefs about mechanistic models.

What Produces the Environmental Structure?

In lieu of a mechanistic explanation, one can ask for an explanation of why the environment displays the relationships it does. Anderson and Milson (1989) can be consulted for the details of an explanation that is an elaborated version of a model proposal by Burrell (1980) to account for library borrowings. It has basically two assumptions. First, it assumes that memories vary in a property called desirability, where a memory's intrinsic desirability determines its rate of use. It turns out that this assumption helps explain frequency and recency effects in that memories that have been used more recently or frequently are more likely under a Bayesian analysis to be highly desirable. Second, the model assumes that memories can rise and fall in this desirability and memories also differ in such volatility. This assumption, again under a Bayesian analysis, helps predict recency and spacing effects. For instance, an item that has had a number of massed presentations a long while ago is identified as probably being a volatile item that had a momentary rise to high desirability and is no longer in use.
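The strength model of Equations 9-11 is compact enough to sketch in a few lines of Python. The version below is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the gap in Equation 11 is taken as the positive elapsed time between consecutive presentations, and the time unit is left to the caller because the text above does not state one. The defaults A = 1 and b = .61 are the values given in the text.

```python
def decay_exponents(ages, d1, b=0.61):
    """Equation 11: one decay exponent per presentation.

    `ages` holds the time since each presentation, oldest first, so the
    values decrease along the list. The first presentation decays at d1;
    later ones decay at max(d1, b * gap ** -d1), where gap is the positive
    time elapsed since the previous presentation (an assumption of this
    sketch about how the difference in Equation 11 is signed).
    """
    exponents = [d1]
    for previous_age, age in zip(ages, ages[1:]):
        gap = previous_age - age
        exponents.append(max(d1, b * gap ** (-d1)))
    return exponents


def total_strength(ages, d1, b=0.61, scale=1.0):
    """Equations 9 and 10: S = A * sum over presentations of t_i ** -d_i."""
    exponents = decay_exponents(ages, d1, b)
    return scale * sum(t ** (-d) for t, d in zip(ages, exponents))


# Two presentations, massed (gap of 1) or spaced (gap of 40), tested after a
# short delay (2 time units) or a long delay (100 time units). Illustrative
# schedules only, not Glenberg's actual design.
for test_lag in (2, 100):
    massed = total_strength([test_lag + 1, test_lag], d1=0.125)
    spaced = total_strength([test_lag + 40, test_lag], d1=0.125)
    print(test_lag, round(massed, 3), round(spaced, 3))
```

With these illustrative schedules the sketch gives the massed pair the higher strength at the short test lag and the spaced pair the higher strength at the long test lag, which is the qualitative crossover the spacing analysis describes. Absolute strengths depend on the assumed time unit, so they should not be compared with the values plotted in Figures 12-14.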


This is not a particularly obscure model of the environmental properties of memories. Nonetheless, it turns out these simple assumptions have led to memory characteristics that have confounded psychologists since Ebbinghaus.

[Figure 14 plot. Hellyer Simulation: log strength against log seconds, with separate curves for 8, 4, 2, and 1 presentations.]

Fig. 14. Simulation of the Hellyer (1962) data (Fig. 10a) by the mathematical model.

Acknowledgments-This research was supported by Grant BNS-8705811 from the National Science Foundation and Contract N00014-90-J-1489 from the Office of Naval Research. We would like to thank Ching-Fan Sheu for his comments on this paper. The second author was supported by Training Grant 1-T32-MH19102-01 from the National Institute of Mental Health.

REFERENCES

Anderson, J.R. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369-406.
Anderson, J.R. (1989). A rational analysis of human memory. In H.L. Roediger & F.I.M. Craik (Eds.), Varieties of memory and consciousness (pp. 195-210). Hillsdale, NJ: Lawrence Erlbaum Associates.
Anderson, J.R. (1990). The adaptive character of thought. Hillsdale, NJ: Lawrence Erlbaum Associates.
Anderson, J.R., & Milson, R. (1989). Human memory: An adaptive perspective. Psychological Review, 96(4), 703-719.
Baddeley, A.D. (1986). Working memory. Oxford, UK: Oxford University Press.
Bahrick, H.P. (1979). Maintenance of knowledge: Questions about memory we forget to ask. Journal of Experimental Psychology: General, 108, 296-308.
Bahrick, H.P. (1984). Semantic memory content in permastore: Fifty years of memory for Spanish learned in school. Journal of Experimental Psychology: General, 113, 1-24.
Bogartz, R.S. (1990). Evaluating forgetting curves psychologically. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 138-148.
Burrell, Q.L. (1980). A simple stochastic model for library loans. Journal of Documentation, 41, 100-115.
Ebbinghaus, H. (1964, 1885). Memory: A contribution to experimental psychology. Mineola, NY: Dover Publications.
Estes, W.K. (1955). Statistical theory of distributional phenomena in learning. Psychological Review, 62, 369-377.
Glenberg, A.M. (1976). Monotonic and nonmonotonic lag effects in paired-associate and recognition memory paradigms. Journal of Verbal Learning and Verbal Behavior, 15, 1-16.
Hellyer, S. (1962). Frequency of stimulus presentation and short-term decrement in recall. Journal of Experimental Psychology, 64, 650.
Krueger, W.C.F. (1929). The effects of overlearning on retention. Journal of Experimental Psychology, 12, 71-78.
Landauer, T.K. (1975). Memory without organization: Properties of a model with random storage and undirected retrieval. Cognitive Psychology, 7, 495-531.
Lewis, C.H. (1978). Production system models of practice effects. Unpublished doctoral dissertation, University of Michigan, Ann Arbor.
Loftus, G.R. (1985). Evaluating forgetting curves. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 397-406.
Logan, G.D. (1988). Toward an instance theory of automatization. Psychological Review, 95, 492-527.
MacWhinney, B., & Snow, C. (1990). The child language data exchange system: An update. Journal of Child Language, 17, 457-472.
Mazur, J.E., & Hastie, R. (1975). Learning as accumulation: A reexamination of the learning curve. Psychological Bulletin, 85, 1256-1274.
McKay, D.G. (1988). The problem of flexibility, fluency, and speed-accuracy trade-off in skilled behavior. Psychological Review, 89, 483-506.
Newell, A., & Rosenbloom, P. (1981). Mechanisms of skill acquisition and the law of practice. In J.R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 1-55). Hillsdale, NJ: Lawrence Erlbaum Associates.
Restle, F., & Greeno, J.G. (1970). Introduction to mathematical psychology. Reading, MA: Addison-Wesley.
Schooler, L.J., & Anderson, J.R. (unpublished). Environmental demands in memory: Statistical analogs to learning and forgetting curves.
Shepard, R.N. (1990). Mind sights. New York: Freeman.
Shrager, J.C., Hogg, T., & Huberman, B.A. (1988). A dynamical theory of the power-law learning in problem-solving. In Proceedings of the tenth annual conference of the Cognitive Science Society (pp. 468-474). Hillsdale, NJ: Cognitive Science Society.
Simon, H.A. (1955). On a class of skew distribution functions. Biometrika, 52, 425-440.
Simon, H.A. (in press). Cognitive architectures and rational analysis: Comment. In K. Van Lehn (Ed.), Architectures for intelligence. Hillsdale, NJ: Lawrence Erlbaum Associates.
Simon, H.A., & Ijiri, Y. (1977). Skew distributions and the sizes of business firms. New York: Elsevier/North Holland.
Slamecka, N.J., & McElree, B. (1983). Normal forgetting of verbal lists as a function of their degree of learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 9, 384-397.
Underwood, B.J., & Keppel, G. (1963). Retention as a function of degree of learning and letter-sequence interference. Psychological Monographs, 77 (1, Whole No. 567).
Wickelgren, W.A. (1972). Trace resistance and the decay of long-term memory. Journal of Mathematical Psychology, 9, 418-455.
Wickelgren, W.A. (1974). Single-trace fragility theory of memory dynamics. Memory and Cognition, 2, 775-780.
Wickelgren, W.A. (1976). Memory storage dynamics. In W.K. Estes (Ed.), Handbook of learning and cognitive processes (pp. 321-361). Hillsdale, NJ: Lawrence Erlbaum Associates.

(RECEIVED 1/18/91; REVISION ACCEPTED 7/1/91)
