Kintsch - 1998 - Comprehension
Walter Kintsch
In this landmark volume, Walter Kintsch presents a theory of human text comprehension that he has refined and developed over the past 20 years. Characterizing the comprehension process as one of constraint satisfaction, this comprehensive theory is concerned with mental processes - not primarily with the analysis of materials to be understood. The author describes comprehension as a two-stage process: first, approximate, inaccurate representations are constructed via context-insensitive construction rules, which are then integrated via a process of spreading activation.
In Part I, the general theory is presented and an attempt is made to situate it within the current theoretical landscape in cognitive science. In the second part, many of the topics are discussed that are typically found in a cognitive psychology text. How are word meanings identified in a discourse context? How are words combined to form representations of texts, both at the local and global level? How do texts and the mental models readers construct from them represent situations? What is the role of working memory in comprehension? The book addresses how relevant knowledge is activated during reading and how readers recognize and recall texts. It then draws implications of these findings for how people solve word problems, how they act out verbal instructions, and how they make decisions based on verbal information.
Comprehension is impressive in its scope, bringing in relevant ideas from all of
cognitive science. It presents a unified, sophisticated theory backed by a wealth
of empirical data.
Comprehension
A paradigm for cognition
Walter Kintsch
University of Colorado, Boulder
Cambridge University Press
PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE
The Pitt Building, Trumpington Street, Cambridge CB2 1RP, United Kingdom
Typeset in Ehrhardt
For Eileen
Preface
A few years ago some of my friends, colleagues, and former students put together a book entitled Discourse Comprehension, with chapters that relate in some way to work that I have done in this area. They are excellent chapters and illustrate how far research on discourse processes has advanced in the last few decades. Building on and incorporating earlier work in linguistics and philosophy, cognitive scientists have gained new insights about how language is understood and have refined their experimental methods and elaborated their theories. Indeed, although the contributors to that book did not represent a single or uniform theoretical viewpoint, it seemed to me that the current research on discourse comprehension in cognitive science, at least in the section represented there, could be brought under a unifying, overarching theoretical framework. In fact, I thought I knew what that framework could be - the construction-integration (CI) model of comprehension that I had been developing. More than that, I believed that that framework was broad and powerful enough to include related areas in cognitive psychology beyond discourse comprehension proper.
The only problem was that this framework was nowhere stated explicitly and in detail. I had published the basic ideas, and several applications of the model had also appeared in the last few years, but a complete and coherent treatment was lacking. Although it seemed that the CI model could provide an account for many interesting findings in the study of cognition, in many cases the actual work of developing and testing a detailed simulation of the phenomena in question had not yet been done. So I could not even be sure that the model was really as powerful as I surmised. Worse, the applications that had been done were often published in inaccessible places, making it difficult for a reader to gain an understanding of the scope of the model and to evaluate its effectiveness. Furthermore, even if a reader studied all the available publications, it would not have been easy to arrive at a coherent picture, because in the published papers each application of the model stood by itself, and the all-important links between them were rarely made explicit.
Therefore, this book was written to present to my colleagues and students interested in discourse comprehension a coherent and broad theoretical framework to better investigate its effectiveness. In so doing I had to explore for myself just how adequate this framework was: Could it really do all the things claimed for it? Readers are entitled to their own answers, but I became more and more convinced of the power of the CI architecture as I continued to explore new issues in discourse comprehension within that framework. In fact, I came to realize that the framework I proposed applies not only to discourse comprehension but also to a broader range of issues in the study of cognition in general.
Thus, my own goals changed as I engaged more deeply with the research for this book. I realized that not only discourse comprehension but also other cognitive processes may be viewed from the vantage point of the CI model. I became interested in just how far the domain of the comprehension paradigm extends. Comprehension is modeled as a constraint-satisfaction process in the CI model. It is apparent now that some other cognitive processes, such as action planning and decision making, can be modeled in the same way. Hence the title of this book, which suggests that comprehension can serve as a theoretical paradigm for a wider range of cognitive processes. It does not encompass all of cognition - there are truly analytic thinking processes that are beyond the range of the CI model. Such is the case even in the discourse area itself. For instance, certain kinds of inference processes require conscious, active problem solving rather than mere constraint satisfaction. But on the other hand, I was able to show that most of what goes under the label of inference in discourse comprehension can in fact be accounted for within the proposed framework.
Writing this book thus became a quest to determine the limits of the comprehension paradigm. I am still not quite sure where these limits lie, except that they are farther out than most psychologists had assumed. Further study will be required to explore in depth some of the suggestions made here, but I think that the usefulness of the comprehension paradigm for the study of cognitive processes has been established.
Acknowledgments
The research reported here could not have been performed without the long-term support afforded by a Merit Award from the National Institute of Mental Health, grant MH-15872. Additional support has been received in the past ten years for various special projects: from the National Science Foundation (with James G. Greeno) for the work on word arithmetic problems, from the National Science Foundation (with Gerhard Fischer, Clayton Lewis, and Peter G. Polson) and the Army Research Institute (with Fischer) for the research on action planning, from the Mellon Foundation (with Eileen Kintsch) for the research on learning from texts, from the McDonnell Foundation (with Fischer and Thomas K. Landauer), and from DARPA (with Landauer) for the exploration of the Latent Semantic Analysis technique. I am deeply grateful to these institutions for permitting us to develop and evaluate the ideas about discourse comprehension that eventually led to this book.
The University of Colorado has been blessed with an exceptional group of graduate students, several generations of whom participated in this endeavor. Their dedication and enthusiasm, experimental and modeling skills, but above all their fresh ideas and infectious inquisitiveness, have profoundly shaped this research program. It is one of the great pleasures of being a professor to be able to collaborate with an ever changing but eternally young team of the brightest and the best. Among the students and post-docs who were associated most closely with the work described here are, in roughly chronological order: Randy Fletcher, Chuck Weaver, Ernie Mross, Susan Zimny, Suzanne Mannes, Walter Perrig, Denise Cummins, Kurt Reusser, Franz Schmalhofer, Stephanie Doane, Jose Otero, Isabelle Tapiero, Mitch Nathan, Danielle McNamara, Evelyn Ferstl, Marita Franzke, Julia Moravcsik, Mike Wolfe, Dave Steinhart, and Missy Schreiner.
The Institute of Cognitive Science and the Department of Psychology at the University of Colorado have provided an ideal environment for my work. I am deeply indebted to my colleagues for many stimulating discussions and arguments. I could not have worked without their feedback, whether critical or encouraging. Over the years, Peter Polson has been a most valuable partner in this respect. I also want to mention two specific collaborations that had a significant influence on the work presented in this book. My former colleague Anders Ericsson and I became intrigued with the functioning of working memory, which resulted in the theory of long-term working memory - ideas that put a whole new twist on our understanding of discourse comprehension. Last, I gratefully acknowledge my debt to Tom Landauer, who introduced me to latent semantic analysis (LSA) and who has been my companion in exploring this exciting new technique.
My intellectual debts are owed not only to my collaborators at Colorado but also to a much broader group of researchers in psychology, cognitive science, and education. I shall not try to list them here, but I want to express my appreciation and my hope that in the pages that follow I have always given credit where credit is due.
Special thanks, however, go to those who read and criticized one or more of the chapters of this book: Anders Ericsson, Tom Landauer, Akira Miyake, Peter Polson, Paul van den Broek, and Sashank Varma. I also thank Julia Hough of Cambridge University Press for her interest and support.
This book is dedicated to my wife and co-worker, Eileen Kintsch. She
has been my supporter, research collaborator, discussion partner, critic,
and editor — not only on this project but throughout many wonderful
joint years. This book would certainly not exist without her. I only hope
that she and the many others who helped me with this book will be happy
with what it has become.
Walter Kintsch
1
Introduction
vidual models constrain and support one another. Newell’s own work and
the research program of J. R. Anderson are stellar examples of this
approach to cognitive science (Anderson, 1983, 1993; Newell, 1990).
Comprehension may be another paradigm for cognition, providing us with a fairly general though perhaps not all-encompassing framework within which cognitive phenomena can be explained. Since Newell and Simon's groundbreaking book, Human Problem Solving (1972), problem solving has been the paradigm for the higher cognitive processes, in particular these authors' conceptualization of problem solving as search in a problem space. The task of the problem solver is conceived of as finding a solution path in a large, complex problem space full of dead ends. This has been an extremely successful way of thinking about human cognition. But it may be that for at least some cognitive processes there are alternative conceptualizations that could be fruitfully explored. For instance, one can look at cognition, at least certain forms of cognition, as a constraint-satisfaction, or comprehension, process instead. In this book, I work out a constraint-satisfaction theory of comprehension and show how this kind of cognitive architecture can serve as a paradigm for much of cognition.
The terms understanding and comprehension are not scientific terms but are commonsense expressions. As with other such expressions, their meaning is fuzzy and imprecise. To use them in a scientific discourse, we need to specify them more precisely without, however, doing undue violence to common usage. First, understanding and comprehension are used as synonyms here. The choice of one or the other term is thus purely a matter of linguistic variation.
But just what do we mean by either understanding or comprehension? What seems most helpful here is to contrast understanding with perception on the one hand and with problem solving on the other. In ordinary usage, perceive is used for simple or isolated instances of perception, especially when no specific action is involved. Understand is used when the relationship between some object and its context is at issue or when action is required. The boundaries between the two terms are certainly fuzzy, however. Understanding is clearly the preferred term when a perception involves language.
means that it is not a theory of text structure, or a text analysis. The text structure is only indirectly important, in that it is one determinant of the comprehension process and therefore of the product of this process, the mental representation of the text and actions based on this construction. Second, at the most general level the theory characterizes the comprehension process as one of constraint satisfaction. Comprehension occurs when and if the elements that enter into the process achieve a stable state in which the majority of elements are meaningfully related to one another and other elements that do not fit the pattern of the majority are suppressed. The theory thus must specify what these elements are and how they reach a stable configuration.
All kinds of elements enter into the comprehension process. In commonsense terms, these may be perceptions, concepts, ideas, images, or emotions. We need a way to deal with all these in the theory. A propositional representation will be described that provides a common notation for these elementary units of the comprehension process and for the description of the relations among them.
A crucial consideration is where these elements come from: from the world via the perceptual system, as well as from the organism in the form of memories, knowledge, beliefs, body states, or goals. At the heart of the theory is a specific mechanism that describes how elements from these two sources are combined into a stable mental product in the process of comprehension.
Roughly, the story goes like this. We start with a comprehender who has specific goals, a given background of knowledge and experience, and a given perceptual situation. The perceptual situation may, for instance, be the printed words on a page of text. We mostly skip the question of how the reader forms basic idea units from these words (though we deal extensively with word identification in a discourse context and, at least tangentially, with the question of how sentences are parsed into their constituents). Given these idea units in the form of propositions as well as the reader's goals, associated elements from the reader's long-term memory (knowledge, experience) are retrieved to form an interrelated network together with the already existing perceptual elements. Because this retrieval is entirely a bottom-up process, unguided by the larger discourse context, the nascent network will contain both relevant and irrelevant items. Spreading activation around this network until the pattern of activation stabilizes works as a constraint-satisfaction process, selectively activating those elements that fit together or are somehow related and deactivating the rest. Hence the name of the theory, the construction-integration (CI) theory: A context-insensitive construction process is followed by a constraint-satisfaction, or integration, process that yields, if all goes well, an orderly mental structure out of initial chaos.
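The integration phase can be sketched as a simple relaxation loop over a network of propositions. The following is an illustrative sketch only; the particular connection weights, the clipping of negative activations, and the normalization step are assumptions made for the example, not the book's published parameter values.

```python
import numpy as np

def integrate(links, a0, tol=1e-6, max_iter=1000):
    """Spread activation over a proposition network until it stabilizes.

    links: symmetric matrix of connection strengths (positive = mutual
    support, negative = inhibition); a0: initial activation vector.
    Illustrative constraint-satisfaction relaxation, not the exact
    published algorithm.
    """
    a = np.asarray(a0, dtype=float)
    for _ in range(max_iter):
        new = links @ a
        new = np.clip(new, 0.0, None)      # activations stay non-negative
        if new.max() > 0:
            new = new / new.max()          # normalize to keep values bounded
        if np.abs(new - a).max() < tol:    # converged: a stable pattern
            return new
        a = new
    return a

# Three mutually supporting propositions plus one inconsistent intruder:
links = np.array([
    [ 0.0,  0.5,  0.5, -0.4],
    [ 0.5,  0.0,  0.5, -0.4],
    [ 0.5,  0.5,  0.0, -0.4],
    [-0.4, -0.4, -0.4,  0.0],
])
final = integrate(links, [1.0, 1.0, 1.0, 1.0])
print(final)  # the intruder's activation is driven to zero
```

In this toy network the three elements that fit together suppress the fourth, inconsistent one - the initial chaos settles into an orderly pattern of activation.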
the empirical work and modeling have been done in the area of text comprehension. Furthermore, the theory of comprehension presented here is a computational theory. It describes a sequence of steps that, given certain initial conditions, yields outcomes comparable to human comprehension. Because computation presupposes a representation on which computations can be performed, the concept of mental representation and the role that mental representations play in cognition is discussed in the second chapter. Parts of this chapter are highly speculative, and a wiser author may have refrained from touching on these issues. It is important, however, that we attempt to place our work within a broader framework.
In the third chapter, propositional representations are introduced and justified. The fourth chapter discusses the computations performed on these propositional representations that form the basis of text comprehension. This first section of the book concludes with a discussion of the processing theory, the construction-integration model.
The second part of the book consists of seven chapters touching on many of the topics that are typically discussed in a cognitive psychology text. How are word meanings identified in a discourse context, including anaphora and metaphors? How are words combined to form coherent representations of texts, at both the local and global level? How do texts and the mental models readers construct from them represent situations? What is the role of working memory in comprehension? How is relevant knowledge activated during reading, and how is the information provided by a text integrated with a reader's knowledge? How do readers recognize and recall texts? What is the distinction between remembering a text and learning from a text, and what principles govern remembering and learning? What are the implications of these findings for how people solve word problems, how they execute verbal instructions, and how they make decisions based on verbal information? Thus, the book is concerned with attention and pattern recognition, knowledge representations, working memory, recognition and recall, learning, problem solving, and judgment - almost the complete range of topics in cognitive psychology. I have tried to discover for myself and to show others how comprehension theory can be applied to these phenomena. Some of this work has been published in scholarly journals or book chapters before; much of it is presented here for the first time. But even when I describe previously published work, it is set here into a general framework that did not exist earlier or that could not be as fully explicated as I have done here.
Some of the work reported in these chapters consists of large-scale, carefully executed, and well-documented experimental studies and theoretical simulations. But I have not resisted the temptation to include preliminary results and simple examples illustrating how the theory could be applied. Such illustrations are like Gedankenexperiments exploring the implications of the comprehension theory. They are no substitute for systematic experimentation, theoretical proof, or large-scale simulation. Yet we cannot do without them if we wish to explore the full scope of a theory, and although the reader may not want to take them as ultimate evidence, they may seduce the reader with their promise. Thus, many results are presented on the following pages, some with more assurance than others. If these pages convince the reader that comprehension is a useful paradigm for cognition, a conceptual framework will have been provided for use in following up on the issues that have been treated here only superficially, as well as for exploring the even more numerous issues that have not been considered here at all.
Part I
The theory

2
Cognition and representation

Geometry was one thing and algebra was quite another before Descartes discovered the isomorphism between these two fields:
Figure 2.1
tations that are not possible in the actual environment. The environment rarely provides all the information necessary for an action. It is usually the case that human knowledge and experience are required in coordination with the environment to guide action. The knowledge representation makes this possible. At a minimum, it supplements information in the environment by filling in gaps that are unspecified. Often, however, the environmental input must be transformed in complex ways to ensure an optimal action.
1. The term symbol has at least two senses: (1) a symbol is defined by its referent, and (2) a symbol is abstracted from its referent and defined by its conceptual role. I am using the term here in its second sense.
Cognition and representation 17
system had returned to normal, allowing him to act normally in his environment, but the recognition system had not yet adapted.
representations. The professor must access his own memory of his lecture, realize just what confused the student, and reformulate his explanation in a more judicious manner. This is no longer a direct response to the environment based on some well-established ability - but neither does it require a deliberate act of problem solving. The professor "understands" what the student is saying and why, and responds appropriately.
Greeno emphasizes the social origin of mental representations. People do not think and act alone but as part of a social and cultural community. Concepts evolve out of the discourse of communities of practitioners in some particular domain. For instance, the concept of the turning radius of a car was constructed in response to certain constraints experienced by drivers and automotive engineers. It may be used quite differently by different groups of people. It may be an implicit concept - affordance plus ability of an experienced driver. Or it may be an explicit symbolic concept - for example, for the driving instructor who must explain it to students, or for the engineer who describes it with a mathematical formula. And if the engineer also drives a car, it may be an implicit and an explicit concept at the same time.
Cognitive neuroscience: Habit and cognition systems. The idea that there are two distinct psychological systems involved in cognition and learning, direct action and representation, has attracted a great deal of attention within cognitive neuroscience. Many patients with damage to certain areas of the brain have been studied who are unable to recognize objects but behave appropriately with respect to these objects (e.g., the blindsight phenomenon first reported by Weiskrantz, Warrington, Sanders, & Marshall, 1974). Leibowitz and Post (1982) provide a succinct summary of this research that supports a distinction between visual processes devoted to object recognition on the one hand and orientation on the other. They demonstrate this distinction by the ease with which it is possible to walk while simultaneously reading. Attention is focused on the reading material, but locomotion in most environments is smooth and trouble-free. Focal vision is primarily involved in reading, whereas orientation engages peripheral vision. Recognition is conscious, whereas the ambient functions usually operate without awareness. Evidence that supports these common observations from both anatomical studies involving ablations and perceptual experiments is reviewed by Leibowitz and Post.
they were doing; this is also typical for studies of this kind. Second, and more surprisingly, normal and amnesic subjects performed about equally well. This is a striking result, because Squire and Knowlton's amnesic subjects had severe memory deficits. In the extreme case, such subjects could remember nothing at all; even after many experimental sessions, they did not remember ever having performed the experimental task - although they had learned the reinforcement contingencies quite well, as well as normal subjects. Third, once subjects reached their performance asymptote on the prediction task, amnesic subjects remained at this level, but normal subjects showed a small further improvement with continued training. This improvement was correlated with verbalizations such as "Now I understand that the double X on top is more likely to predict sun than rain."
The probability learning task did not involve the formation of mediating mental representations. Learning was slow, based on reinforcement, and understanding played no role. The brain areas that perform this function were unaffected by the brain damage that destroyed the event memory of the amnesic patients. Thus, although the amnesic patients could not remember events such as having participated in this experiment before, their acquisition of a direct isomorphism (implicit learning, habit formation) was unimpaired. The superior representational facilities of normal subjects were of no use in this task - the experimenters cleverly manipulated the environment in such a way that there was nothing to learn except rote stimulus-response contingencies, thus depriving the normal subjects of a chance to use their superior abilities. Only with prolonged training did normal subjects start to reflect on the habit they had acquired and to verbalize their experiences and formulate rules, which the amnesics could not do. The task was rigged in such a way that reflection and rule formulation were of little help, however. Nevertheless, it is of great interest theoretically. It demonstrates that the lower habit system is encapsulated within the higher cognitive system. Our habits are not merely habits, as in a dog or an amnesic patient; we can reflect on them, talk about them, mathematize them, and so on. In Squire and Knowlton's experiment, this ability was of little consequence, but in everyday life this ability to link habits with language and symbolic thought can have profound consequences. It permits the higher cognitive systems a certain amount of control over the lower ones. If one asks a tennis player to analyze a certain stroke, he will be at a loss and ask to be allowed to demonstrate what he does; the tennis coach, however, has learned how to talk about the player's stroke, and the kinesiologist may be able to provide a scientific analysis of it.
The fact that so many investigators in such different areas have found it useful to distinguish two broad categories of mental representations suggests that we are on fairly solid ground in differentiating a habit and a cognitive system. In addition, there seems to be fairly general agreement about the need to separate procedural representations from episodic memory. The best evidence comes from studies of child development (Karmiloff-Smith, 1992; Nelson, 1996). Procedural learning tends to be unconscious and is fully under environmental control. Episodic memory may involve different degrees of consciousness and provides an organism with the earliest opportunities to weaken the direct link between action and environment.
The distinctions within the cognitive system that have been made here are based partly on Bruner (1986), who argued for a distinction between what he termed the narrative and paradigmatic forms of language use, and partly on studies of cognitive development, both its ontogeny and phylogeny. In particular, it is the work of Donald (1991) on the evolution of human cognition and the related work of Nelson (1996) on child development that have strongly influenced my thinking in these matters. I briefly discuss their ideas, both as justification for the types of representations assumed here and to further elaborate them.
2.1.3 Implications
We can take for granted that several different types of mental representation play a role in behavior and cognition and that some representations are embedded in others. How are we as scientists to represent these layers of embedded representations in our theories? It is not easy to envisage a theory of complex cognition that provides an optimal formalism for each type of representation and that adequately describes their interactions, though for some purposes that may be required. However, for a theory of comprehension, especially one that focuses on text comprehension, a simpler solution is available: Find one form of representation that fits everything. Obviously, a single format cannot fit everything precisely, or we would not need to talk about different types of representation in the first place. However, there may be a form of representation that allows us to adequately approximate the various types of representation. One does not have to look far to find a candidate for such a superrepresentation: language. Language has evolved to enable us to talk about all the world and all human affairs and is as suitable a tool for our purposes as we have. This is not to deny that expressing everything through language sometimes distorts things. It simply means that on the whole language is the best tool we have available.
What are the difficulties that arise when we represent the complexity
4 Some of the borrowers did not intend originally to stray beyond the boundaries of the
field of logic (e.g., Bierwisch, 1969; van Dijk, 1972), but others (including Kintsch,
1974) extended the meaning of the term, in spite of the possibility of confusion with
its original logical meaning.
tic role of the participants. For example, the predicate GIVE may have three argument slots, as in GIVE[agent:MARY, object:BOOK, goal:FRED]. Atomic propositions may be modified by embedding one proposition within another, as in GIVE[agent:MARY, object:OLD[BOOK], goal:FRED], or INADVERTENTLY[GIVE[agent:MARY, object:OLD[BOOK], goal:FRED]].
Complex propositions (van Dijk & Kintsch, 1983) are compounds composed of several atomic propositions that are subordinated to a core propositional meaning. The general schema for complex propositions is given by

- Predicate:
- Arguments:
  - Modifiers:
- Circumstance:
  - Time:
  - Place:

Thus, the sentence Yesterday, Mary gave Fred the old book in the library would be represented as:

- Action:
  - Predicate: GIVE
  - Arguments:
    - Agent: MARY
    - Object: BOOK
      - Modifier: OLD
    - Goal: FRED
  - Circumstance:
    - Time: YESTERDAY
    - Place: LIBRARY
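The complex-proposition tree for the Mary sentence can be written down directly as nested data. This is an illustrative sketch only; the lower-case slot names and the helper function are assumptions made for the example, not notation from the book.

```python
# A complex proposition as a nested structure. The slot names
# (predicate, arguments, modifier, circumstance) mirror the schema in
# the text; the representation itself is an illustrative sketch, not
# code from the book.

proposition = {
    "predicate": "GIVE",
    "arguments": {
        "agent": "MARY",
        "object": {"head": "BOOK", "modifier": "OLD"},
        "goal": "FRED",
    },
    "circumstance": {
        "time": "YESTERDAY",
        "place": "LIBRARY",
    },
}

def arguments_of(prop):
    """Collect the argument fillers, unwrapping modified arguments."""
    out = []
    for filler in prop["arguments"].values():
        out.append(filler["head"] if isinstance(filler, dict) else filler)
    return out

print(arguments_of(proposition))  # ['MARY', 'BOOK', 'FRED']
```

The nesting makes the subordination explicit: OLD modifies one argument of the core GIVE proposition rather than standing on its own.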
Not all expressions in the surface structure are represented in this notation (e.g., tense is not, nor is the definite article, which is a discourse signal identifying that book as known to the hearer). In general, surface structure may express pragmatic, rhetorical, stylistic, cognitive, or interactional properties, as well as additional syntactic and semantic properties that are neglected by this notation. The choice of which sentence properties to represent in the propositional notation is pragmatic: Whatever seems of little importance for a given theoretical or experimental purpose is omitted. Hence, significant differences may be found in the explicitness and completeness of propositional representations constructed for different uses.
The van Dijk and Kintsch (1983) notation in terms of complex propositions has several advantages over the earlier notation of Kintsch (1974) in terms of atomic propositions. In the earlier notation, the sentence would have been represented as follows (using [.] as an abbreviation for a list of arguments when it is clear what these arguments are):

GIVE[MARY,BOOK,FRED]
INADVERTENTLY[GIVE[.]]
OLD[BOOK]
YESTERDAY[GIVE[.]]
IN-LIBRARY[GIVE[.]]

or, equivalently,

P1 GIVE[MARY,P3,FRED]
P2 INADVERTENTLY[P1]
P3 OLD[BOOK]
P4 YESTERDAY[P1]
P5 IN-LIBRARY[P1]
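The labeled form maps naturally onto a flat table keyed by proposition IDs, with cross-references (P1, P3) as arguments. The helper below, which checks whether two propositions are connected by argument overlap, is an illustrative sketch and not part of the published model:

```python
# Atomic propositions in the labeled (P1...P5) notation: each entry is a
# predicate plus a list of arguments, where an argument may be a concept
# (MARY) or a reference to another proposition (P1, P3). Illustrative
# sketch only; the helper is not code from the book.

props = {
    "P1": ("GIVE", ["MARY", "P3", "FRED"]),
    "P2": ("INADVERTENTLY", ["P1"]),
    "P3": ("OLD", ["BOOK"]),
    "P4": ("YESTERDAY", ["P1"]),
    "P5": ("IN-LIBRARY", ["P1"]),
}

def linked(p, q):
    """Two propositions are linked if they share an argument or if one
    takes the other as an argument (argument overlap)."""
    (_, args_p), (_, args_q) = props[p], props[q]
    return bool(set(args_p) & set(args_q)) or p in args_q or q in args_p

print(linked("P2", "P4"))  # True: both embed P1
print(linked("P1", "P3"))  # True: P1 takes P3 as an argument
print(linked("P3", "P5"))  # False: no shared argument
```

Links of this kind are what turn a list of propositions into a network, which is the structure the coherence rules discussed next operate on.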
1. Indirect coherence. The meaning units are part of the same episode.
That is, they share a time, place, or argument.
2. Direct coherence. The same as indirect coherence, but in addition the
(1) The snow was deep on the mountain. The skiers were lost, so
they dug a snow cave, which provided them shelter.
The coherence between the first two meaning units is indirect, through
the shared place, because the skiers presumably were lost on the mountain.
The coherence between the second and third proposition is direct,
expressed by so, but in addition coherence is provided by argument overlap
via agent and place, for the skiers presumably are still on the mountain.
tain. The fourth meaning unit, PROVIDE[SNOWCAVE, SKIERS,
where the ellipsis dots indicate information that has been omitted for the
sake of brevity. A complex proposition like this does the same work as a
conventional schema: It specifies relevant attributes and lists their default
values.
5. It is also worth noting that production rules, which are the preferred
representational system for procedural knowledge (Anderson, 1983, 1993;
Newell, 1990), can also be written as schemas with two slots, IF[...]
and THEN[...].
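The point that production rules reduce to two-slot schemas can be made concrete with a toy production: the rule is just a predicate-argument unit whose IF slot holds condition propositions and whose THEN slot holds actions. The rule content and helper name below are invented for illustration:

```python
# A production rule written as a schema with IF[...] and THEN[...] slots.
# Conditions and actions are proposition strings, for illustration only.
rule = {
    "IF":   ["GOAL[CROSS-STREET]", "LIGHT[GREEN]"],
    "THEN": ["WALK[SELF]"],
}

def fires(rule, working_memory):
    """A production fires when every IF proposition is in working memory."""
    return all(cond in working_memory for cond in rule["IF"])

wm = {"GOAL[CROSS-STREET]", "LIGHT[GREEN]"}
if fires(rule, wm):
    wm.update(rule["THEN"])  # the THEN slot adds its propositions

print(sorted(wm))
```

Nothing here depends on rules being a separate representational medium: matching the IF slot and depositing the THEN slot are ordinary operations over predicate-argument units.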
2.2.4 Imagery
The question of whether mental images are analog representations or
propositional representations (Anderson, 1978; Kosslyn, 1980; Pylyshyn,
1981) has been the subject of a long-standing debate. No definitive
conclusions have been reached in this debate on the basis of either logical
argument or behavioral data. However, Kosslyn (1994) has tried to resolve
the imagery debate by considering the evidence from brain studies. The
processes underlying mental imagery in the brain appear to be closely
related to perceptual processes and can be clearly differentiated from
verbal and symbolic processes.
Nevertheless, for many purposes - including modeling text comprehension
with the construction-integration model - it is desirable to work with a
uniform medium, that is, the predicate-argument schema. Therefore, imagery
and linear strings will be translated into this format - not because all
information is by nature propositional but because of practical
considerations. We know how to work with predicate-argument units, and it
is not clear how to interface linear or spatial analog representations
with such units. On the other hand, the translation from the linear or
spatial analog to propositional form should be done in such a way that the
unique properties of the analog representations are maintained in the
propositional form. That is, correspondences must be established between
direct forms of reasoning that are possible in these analog systems in the
propositional domain (e.g., Furnas, 1990). Thus, ideally, representational
properties that are characteristic of imagery but not of verbal or
propositional information (some such properties are the existence of a
canonical view, symmetry biases, the hierarchical structure of space,
framing effects, alignment, and perspective) should be representable in
the propositional format. This is not an easily achieved ideal, however.
The best-known approach to relating propositional and imagery
representations is that of Kosslyn (1980). Kosslyn distinguishes a deep,
nonpictorial level of representation (names and lists of spatial
coordinates) and a surface, quasi-pictorial representation in a spatial
medium. Spatial information can thus be translated into propositional form
through the use of spatial predicates. This procedure does not necessarily
satisfy the criterion stated above that the properties of spatial
reasoning be preserved in the transformation. It is not entirely
satisfactory, therefore, and can only be considered as a stopgap solution
in the absence of a more adequate procedure.
To illustrate this treatment of spatial imagery, a brief example from
The study of cognition entails the need for some way of representing
mental structures. The difficulty is that there are many different kinds
of mental structures and that a good format for the representation of
mental structures should be suitable for all kinds of structures. The
basic linguistic meaning unit is the predicate-argument schema, and this
schema can be used to represent other types of structures as well. It is a
very general format that subsumes feature representations, semantic nets,
production systems, and frame systems. However, predicate-argument units
must not only represent abstract units of meaning but must also represent
meaning at multiple levels, including the perceptual, action, linguistic,
and abstract-symbolic levels. Although predicate-argument units are more
suitable for the representation of
Propositional representations
Figure 3.1
Figure 3.3
actually generate? If readers have all this freedom to reorganize and
distort a text according to their ideas and beliefs, how can we ever hope
to predict what they will do? Actually the situation in text comprehension
is not that bad, and better than in some other areas of cognitive science,
notably problem solving. When we give a subject a reasonably complex
problem to solve, we really have no way of telling how the problem will be
approached. In text comprehension, on the other hand, the subjects in our
experiments tend to follow the text we present more or less faithfully.
Many texts are designed so that the subject's situation model will not
differ much from the textbase and hence is fairly well predictable. When
we do studies in which prior knowledge and beliefs play a larger role, we
usually have some idea what these are and are able to formulate hypotheses
about the way they should affect text processing. Thus, although we do not
have total control over our experimental subjects, we can constrain their
behavior reasonably well in our experiments. Real-life comprehension
episodes are much less constrained, and instead of prediction we
Each predicate is characterized by a predicate frame that specifies how
many and which arguments the predicate takes. A verb frame, for instance,
may specify that a particular verb must have an agent, an object, and an
optional goal, with restrictions on these categories, such as that the
agent must be human, or the goal must be a location. Not only can concepts
be arguments, but a proposition may have as an argument another embedded
atomic proposition. Sentence connectives, for instance, are predicates
that require propositions as arguments. Circumstances refer to the whole
propositional frame, specifying place and time. All propositional elements
may be modified by additional atomic propositions.
Thus, to repeat the example discussed in section 2.2.2, the sentence
Yesterday, Mary inadvertently gave Fred the old book in the library is
represented as a complex proposition structured around the predicate GIVE.
The arguments of this predicate are an agent MARY, an object BOOK, and a
goal FRED. Inadvertently modifies GIVE, and old modifies BOOK. Time and
place are specified by yesterday and in the library.
A brief list of frequently encountered propositional constructions
follows. This list parallels the one provided by Bovair and Kieras (1985)
to facilitate comparison. In each case, only a simple text example and its
propositional form are given without much comment. When more than one
version is given, these are notational variants of varying explicitness to
be used as convenient.
CARRY[HEMOGLOBIN, OXYGEN]
The blood from the body arrives at the atrium through the veins.
ARRIVE[BLOOD,FROM-BODY,AT-ATRIUM,THROUGH-VEINS]
or, abbreviated
ARRIVE[BLOOD,BODY,ATRIUM,VEINS]
ARRIVE[BLOOD,RIGHT[ATRIUM]]
IS[FIRST[CHAMBER],RIGHT[ATRIUM]]
TEND
- BLOOD
    - PURPLISH
- LACK
    - BLOOD
        - PURPLISH
    - OXYGEN
or more compactly:
P1 TEND[P2,P3]
P2 PURPLISH[BLOOD]
P3 LACK[P2,OXYGEN]
DISCOVER[DOCTOR,CONGENITAL[DEFECT]]
Blood that has been drained of oxygen arrives at the right atrium.
ARRIVE[BLOOD,RIGHT[ATRIUM]]
- DRAIN[BLOOD,OF-OXYGEN]
FATAL[HEARTATTACK]
- POSSIBLE
or
POSSIBLE[FATAL[HEARTATTACK]]
septal defect
SEPTAL[DEFECT]
DEFECT
- SEPTAL
QUICK[RETURN[BLOOD,HEART]]
or
CARDIOLOGIST[DOCTOR]
NOT-CONSTANT[HEARTRATE]
IS[LEFT[VENTRICLE],LARGEST[CHAMBER]]
- OF-HEART
POSSESS[HEART,CHAMBER]
MORE-THAN[SMALL[RIGHT[VENTRIC]],SMALL[LEFT[VENTRIC]]]
SMALLER-THAN[RIGHT[VENTRIC],LEFT[VENTRIC]]
HOW[PUMP[HEART,BLOOD],?]
4. Problems and issues. Cognitive states (hope to, decide to, expect to,
fail to, be afraid to, begin to, hesitate to), causal verbs (cause, bring
about, result in), and verbs of saying, thinking, and believing are
treated as modifiers of the sentence complement:
COME
- MARY
- HOPE[PAUL]
COME
- MARY
- SAY[PAUL]
LEAVE
- JOHN
- EARLY
    - SHOCK
        - EVERYONE
or simply
P1 LEAVE[JOHN]
P2 EARLY[P1]
P3 SHOCK[P2,EVERYONE]
CON[P1,P2]
Note that Mary ran quickly and Mary might have run are represented in the
same way here, as a proposition RUN[MARY] modified by either QUICKLY or
POSSIBLY. This (and many similar "problems" one could come up with) is not
a weakness of the system but a source of its strength. Quickly modifies
ran, whereas might have applies to the whole proposition, which is a valid
distinction and crucial when we are concerned with the truth value of
sentences. But we are not; we want to model comprehension, to count
elements in working memory, to score recall protocols, and so on - for
which purposes we do not need a logically consistent system or a detailed
semantics. We need a way to represent the meaning of a text sufficiently
independently of the words used that preserves those aspects of meaning
that are important for our purposes and glosses over those that are not.
Propositions as defined here best fulfill that need.
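The deliberate conflation the passage describes can be shown directly in the notation's data structure: whether the modifier is adverbial (quickly) or modal (might have), both end up as a predicate taking the RUN[MARY] proposition as its argument. A small sketch (tuple encoding and function name are my own):

```python
# Both QUICKLY and POSSIBLY are encoded as predicates that take the
# whole RUN[MARY] proposition as their argument - exactly the
# conflation the text notes, and accepts, for modeling comprehension.
run = ("RUN", ("MARY",))
quickly = ("QUICKLY", (run,))
possibly = ("POSSIBLY", (run,))

def fmt(prop):
    """Render a (predicate, args) tuple in the book's bracket notation."""
    pred, args = prop
    inner = ",".join(fmt(a) if isinstance(a, tuple) else a for a in args)
    return pred + "[" + inner + "]"

print(fmt(quickly))   # QUICKLY[RUN[MARY]]
print(fmt(possibly))  # POSSIBLY[RUN[MARY]]
```

A truth-conditional semantics would have to keep the two cases apart; for counting elements in working memory or scoring recall, the shared encoding is all that is needed.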
1 This section is based on Kintsch (1985). An excellent discussion of the general issues
involved in such an analysis is provided by Perfetti and Britt (1995).
(1) The first of the heart’s four chambers, the right atrium, receives
purplish blood, short of oxygen and laden with carbon dioxide.
This used blood arrives through the body’s two major veins, the
superior and inferior venae cavae, and from the many minute
blood vessels that drain blood from the walls of the chamber
itself.
Figure 3.4 A sample analysis. Text elements are shown in the first
column; the second column shows the syntactic tags required for building
propositions; propositions under construction but still incomplete are
shown in the third column; the completed propositions are shown in the
fourth column.
Figure 3.5 The three complex propositions constructed from the analysis in
Figure 3.4.
P2 RECEIVE[RIGHT[ATRIUM],BLOOD]
P3 BLOOD
   - SHORT-OF[BLOOD,OXYGEN]
   - LADEN-WITH[BLOOD,CARBONDIOXIDE]
RIGHT[ATRIUM]
RIGHT[VENTRICLE]   CHAMBER
LEFT[ATRIUM]       - OF-HEART
LEFT[VENTRICLE]
EXPAND[MUSCLE]
   - OF-HEART      -> PUMP[HEART]
CONTRACT[MUSCLE]
Figure 3.6
store), conventions, and habits. There are, however, also rich signals in
the texts themselves that enable the formation of macrostructures. These
may be structural signals, such as titles, initial topic sentences,
summary statements, and the like, as well as a great variety of syntactic
and semantic signals that are mainly used to indicate local importance in
a discourse but that may achieve macrorelevance through cumulative
inference. Text elements that have been repeatedly signaled to be of local
importance become important for the macrostructure, too. Syntactic signals
for discourse relevance include phonetic stress, cleft sentences,
passives, the clause structure of sentences, and other foregrounding and
topicalization devices. Semantic signals in a text include topic change
markers (such as change in time or place, introduction of new
participants, change of perspective, and so on) and various cues that
indicate local coherence or a break thereof that subjects can use either
to maintain or to change their current macrohypothesis (Givon, 1995; van
Dijk & Kintsch, 1983).
Textual schemas, what van Dijk and Kintsch (1983) have called the
rhetorical superstructures, also play an important role in the formation
of macrostructures. Readers know that particular text types tend to be
organized in certain ways and employ this knowledge to construct
schema-based macrostructures. Thus, narratives in our culture have a basic
exposition-complication-resolution structure, with the possibility of
embedding. By assigning macropropositions to these schematic categories,
readers can simplify for themselves the task of deriving the overall
organization of a story. Indeed, as Poulsen, Kintsch, Kintsch, and Premack
(1979) have shown, even four-year-olds make good use of this schematic
knowledge in story understanding. Conversely, when the schema is
misapplied because the narrative is constructed according to some other
cultural schema, comprehension may be distorted accordingly (Bartlett,
1932; Kintsch & Greene, 1978).
Rhetorical schemas play a role in understanding descriptive texts that is
as important as their role in understanding stories. If familiar schemas
are present, comprehension is facilitated, but only at the macrolevel, not
locally. For instance, Kintsch and Yarborough (1982) wrote brief essays
according to four familiar rhetorical schemas: classification,
illustration, compare-and-contrast, and procedural description. The texts
conformed as closely as possible to these schemas. The texts were then
rewritten so that their schematic structure was destroyed but their
content remained unchanged. This was done by a partial reordering of
sentences and the
Why do we need propositions at all for our theorizing? What is wrong with
just plain natural language? In doing research on text understanding it is
frequently the case that propositional representations of the texts are
more readily related to the behavioral data than are the texts themselves.
Presumably this is the case in those situations in which meaning matters
most; propositions are designed to capture those semantic relations that
are most salient in text comprehension, whereas natural language serves
many purposes other than the expression of meaning and hence is often
less suited for our purposes than a representation that is focused on
meaning. Propositions appear to be the semantic processing units of the
mind and hence the most useful form of representation for our studies.
What is the empirical evidence that propositions are in fact effective
units in text comprehension? Because complex propositions are defined in
such a way that they (generally) correspond to sentences, all the evidence
that shows that sentences are psychological units also supports the notion
of complex propositions. But sentences are structured according to the
rules of syntax, whereas the internal structure of complex propositions is
quite different and only indirectly related to syntax. Complex
propositions have a central proposition and various associated elements -
modifiers of various sorts and/or circumstantials. The question then
becomes whether these smaller propositional elements, or atomic
propositions, can be shown to function as psychological processing units.
What is the evidence that atomic propositions are the functional
processing units of text comprehension?
A discussion of the psychological reality of propositions can be found in
most contemporary cognitive psychology textbooks as well as in van Dijk
and Kintsch (1983, chap. 2). Hence, no more than a brief summary is
presented here. In a sense, every experimental study that successfully
employs a propositional analysis either to design the experimental
materials or to analyze the data provides evidence for the usefulness of
propositional analyses. However, one would like more direct evidence as to
the psychological reality of propositions. Such evidence comes from three
sources: recall experiments, reading times, and priming studies.
Cued and free recall. If subjects study sentences that are based on more
than one proposition and then are given a word from the sentence as a
recall cue, words from the sentence that belong to the same atomic
proposition as the cue word are recalled better than words from the
sentence that do not come from the same atomic proposition. This result
has been obtained in several studies (e.g., Wanner, 1975). Thus, consider
the two-element sentence that consists of a main proposition and a
propositional modifier
(2) The mausoleum that enshrined the tzar overlooked the square.
OVERLOOK[MAUSOLEUM,SQUARE]
- ENSHRINE[MAUSOLEUM,TZAR]
Subjects who are given the word overlooked as a recall cue will more often
recall square than tzar, because overlooked and square belong to the same
atomic proposition, in spite of the fact that tzar is closer in the
surface structure to the recall cue than square.
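The cued-recall prediction can be stated mechanically: a cue word should favor targets that share at least one atomic proposition with it, regardless of surface distance. A rough sketch with the two propositions of sentence (2) hand-coded as word sets (the encoding and helper name are mine, not Wanner's procedure):

```python
# The two atomic propositions of sentence (2), coded as sets of the
# content words they contain.
propositions = [
    {"overlook", "mausoleum", "square"},  # OVERLOOK[MAUSOLEUM,SQUARE]
    {"enshrine", "mausoleum", "tzar"},    # ENSHRINE[MAUSOLEUM,TZAR]
]

def share_proposition(cue, target):
    """Predict stronger cued recall when cue and target occur
    together in at least one atomic proposition."""
    return any(cue in p and target in p for p in propositions)

# "overlooked" as a cue should favor "square" over "tzar":
print(share_proposition("overlook", "square"))  # True
print(share_proposition("overlook", "tzar"))    # False
```

Note that surface adjacency plays no role in the predicate: tzar is closer to overlooked in the sentence, yet falls outside the cue's proposition.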
Barshi (1997) studied how people were able to execute verbal instructions
that differed in the number of propositions and/or words. His subjects
listened to messages that instructed them to move in a simulated
three-dimensional space on a computer screen. They repeated these
instructions orally and then attempted to follow them - as airline pilots
do when receiving real air traffic control instructions. The instructions
were given either in complete English sentences ("move two squares to the
left") or in abbreviated form ("two left" - an unambiguous command in the
context of Barshi's experiment). The number of propositions a subject had
to remember and respond to mattered greatly: Doubling the number of
propositions from two to four caused an increase in errors from 3% to 52%.
The number of words used to express these propositions, on the other hand,
had no effect at all in Barshi's experiment: A two-proposition instruction
was as hard to remember and execute whether it was expressed
with four words or with eight words. But there was a striking difference
in difficulty depending on whether eight words expressed two propositions
or four.
There are also a number of studies that demonstrate that when recalling
sentences based on complex propositions, subjects tend to recall the
various elements of the complex proposition as units. That is, they either
recall the whole element or omit it entirely. The most extensive of these
studies is one by Goetz, Anderson, and Schallert (1981). In this study
subjects read eight sentences and then recalled them. The sentences were
either based on a single propositional element, such as
LECTURE[PROFESSOR]
- FAMOUS
- LOC: CLASSROOM
Almost all the free recall of these sentences, 94%, consisted of complete
propositional elements. That is, if a subject recalled anything at all
from sentences like (3), the whole sentence was recalled. In contrast,
sentences like (4) were often recalled partially; that is, subjects may
recall The professor lectured, The professor was famous, or The famous
professor was in the classroom, for instance. However, Goetz et al. argued
that the holistic recall they observed may not have been owing to the fact
that subjects represent the meaning of sentences in memory
propositionally, but that it may be a consequence of the familiarity,
concreteness, and high imagery value of these units. Therefore, they
constructed similar sentences in which unlikely elements were combined and
that would be much harder to image, such as
Reading times and recall. In a number of early studies, sentence recall
was found to be a function of the number of propositional elements in the
sentence when reading time was controlled. Thus, Forster (1970) presented
six-word sentences at a constant rate. All sentences were six words long
but contained either one or two propositional elements:
The average number of words recalled for sentences of types (7) and (8)
were 4.41 and 3.09, respectively.
Conversely, if reading time is subject controlled, subjects take longer to
read propositionally complex sentences, even when the number of words is
the same. Kintsch and Keenan (1973) constructed sentences that were all of
approximately the same length (14-16 words) but that contained anywhere
from four to eight propositional elements. Of course, the mere fact that a
sentence contains many propositional elements does not guarantee that a
reader constructs an equally complex representation. A fast reader may
simply skip over some propositional elements. Hence, it is inappropriate
to plot reading times against sentence complexity. However, if one plots
reading time against the number of propositional elements that subjects
recall on a free recall test, a rather striking approximately linear
relationship results. For each additional propositional element recalled,
reading times increased by 1.5 s in the Kintsch and Keenan study.
One needs to be cautious about generalizing this finding, however. Kintsch
and Keenan examined only a very limited range - between four and eight
propositional elements. It is not really possible to extend this range,
because as sentences or texts get longer, retrieval and forgetting
problems complicate the interpretation of the data. Even more critically,
when sentences are constructed that vary in propositional complexity but
not in length, these sentences necessarily also vary in many other ways -
lexical items, syntax, familiarity, the number of repeated concepts, and
so on.
Since these early studies, reading times have been analyzed repeatedly
using multiple regression analyses in the attempt to untangle these
confoundings. In these studies, propositions in the text, not propositions
in the head (as reflected by subjects' recall), have invariably been used
as the variable of interest. Hence, these studies are not entirely
relevant for the question at hand - whether or not propositions are
meaningful psychological units of comprehension. Nevertheless,
considerable support for the usefulness of this notion can be derived from
these studies (which are reviewed in section 9.1.2).
Figure 3.7 Part of the long-term memory network after reading the
playing-with-tomatoes (top) and painting-tomatoes (bottom) paragraph and
two target sentences. The numbers below the nodes in the network are
long-term memory strengths; the numbers below the targets are the
activation values they receive from the net.
Thus, not only are concepts idiosyncratic, but they are also highly
unstable. However, Barsalou was able to show that only the concepts that
subjects constructed in this underconstrained experimental situation were
so unpredictable. If subjects were given all the features generated by
everyone in the experiment, they agreed very well that these were indeed
features characteristic of the concept in question (97% agreement between
subjects, 98% within subjects). Everyone had more or less the same
knowledge base when it came to these familiar everyday concepts.
Furthermore, in a more constraining context, different subjects tended to
agree much better on the features that characterize a concept. Thus, in
the experiment reported by Barsalou, intersubject agreement was 45% when
subjects defined concepts like vehicle in isolation, but rose to 70% with
even a minimally constraining context ("Taking a vacation in the rugged
mountains of Mexico").
Barsalou's data nicely illustrate the fact that, although concepts are
fleeting and flexible, not all is chaos, because the knowledge bases from
which these concepts are constructed are more stable and predictable, and
most of the time the context itself will be sufficiently constraining to
ensure that the concepts different people form will be similar.
Nevertheless, concepts are never quite the same - surely a limiting factor
in communication.
The theory of meaning advocated here is not only constructivist but also
minimalist. Clearly, readers can study a text over and over again and
construct very elaborate meanings for its propositions and concepts.
Linguists, philosophers, and literary critics do this all the time, and
most people do so at least some of the time. But most of the time, in
reading or conversation, the process of meaning construction remains
shallow, not just because comprehenders are inherently lazy but mostly
because no more is required. A slight knowledge elaboration of a text is
usually quite sufficient for whatever action is intended. Most of the time
texts do not need much elaboration and interpretation to arrive at stable
interpretations upon which appropriate responses can be based. Long-term
working memory allows the comprehender easy elaborations and inferences
whenever they are required. It is enough for the well-informed reader to
feel that the potential for the elaboration of meaning is there - there is
no need to realize it. It is of course possible to do so, and we often do
so, sometimes readily, sometimes with the expenditure of considerable
effort. Indeed, the deliberate construction of meaning may extend over
long periods and may
One of the most salient differences between knowledge nets and most other
forms of knowledge representation is that knowledge nets are associative,
hence based on perception and experience, and therefore relatively
unorganized and chaotic. Semantic nets, scripts, frames, and the like are
all attempts to make knowledge organization more orderly, more logical.
Clearly, knowledge can be organized logically; there is abundant evidence
for the psychological reality of higher-order schematic structures such as
scripts and frames. Why, then, has an associative net been chosen as the
basis for representing knowledge? And how can one account for the obvious
fact that people operate with schematic structures all the time if such
structures are not components of the basic knowledge representation?
When psychologists and AI researchers began to use scripts and frames as
knowledge units, they viewed them as preexisting structures in memory that
are retrieved upon demand and used for a wide variety of purposes. This
view soon came into conflict with the available psychological evidence as
well as the computational demands of AI systems. Computationally, scripts
and frames proved to be too inflexible to serve the purposes for which
they were originally designed (Schank, 1982). Scripts have to be applied
in ever changing contexts, and various ways of fine-tuning scripts proved
to be inadequate to provide the flexibility that was needed. For instance,
attempts to fine-tune the restaurant script by distinguishing various
tracks (fancy restaurant, Chinese restaurant, fast-food restaurant) only
led to an endless proliferation of subscripts. Empirically, too, there
were problems. Although certain aspects of scripts (differences in the
centrality of items in a script - Bower, Black, & Turner, 1979; the
directionality of scripts - Haberlandt & Bingham, 1984) could be
substantiated by psychological experiment, others could not. Primarily,
the distance between events in a script did not behave as it should if
scripts were fixed linear structures: The time it takes to retrieve an
event given another script event as a cue does not vary proportionally to
the distance between the events (Bower et al., 1979; Haberlandt &
Bingham, 1984).
Schank (1982) therefore modified the original script notion by introducing
smaller units called Memory Organization Packets (MOPs), out of which
scripts can be constructed in contextually more appropriate ways. A
different but related proposal was made by Kintsch and Mannes (1987), who
showed that scripts and related structures can be constructed out of
associative nets. Scripts are not fixed, fully elaborated, preexisting
knowledge structures that merely need to be retrieved for use. Instead,
only the raw material for constructing a script is part of the knowledge
net: a script proposition that can function as a frame for other,
contextually appropriate, associated information in the knowledge network.
Scripts and other types of schemas are emergent structures, and because
they always are constructed in the context of their use, they are well
attuned to that context.
Kintsch and Mannes (1987) start from the observation that the retrieval of
the items of a script occurs in quite a different way from the retrieval
of the items of a conceptual category. If subjects retrieve instances of
some conceptual category, such as cars, a plot of the number of items
retrieved against time yields a scalloped, negatively accelerated curve as
in Figure 3.8 (Bousfield & Sedgewick, 1944; Walker & Kintsch, 1985).
Typically, a subject produces a burst of category items in response to a
retrieval cue that he or she is using, pauses, and then produces another
burst. Such retrieval cues may be "Japanese cars," "cars in my dormitory
parking lot," "cars my parents owned," "cars I have wrecked," and the
like. Each retrieval cue works for a while, rapidly retrieving some number
of items that are closely associated with it, but then dries up. What
happens during the pause is that the subject is searching for a new
retrieval cue when the old one has become ineffective. Finally, the
subject is unable to generate new, effective retrieval cues and quits. The
results of this process are the scalloped curves, steep at first and then
flattening out, as shown in Figure 3.8.
If a subject retrieves the items of a script, such as going to a grocery
store, a smooth function increasing at a constant rate is obtained, as in
Figure 3.9 (Kintsch & Mannes, 1987). This pattern of retrieval might be
explained by assuming that a script is retrieved and the items are read
off from it at a more or less constant rate until the end is reached.
However, the protocol data analyzed by Kintsch and Mannes suggest a
different interpretation. The retrieval of items is associative and occurs
in bursts, very much as in category retrieval. The difference is that the
subject does not have to engage in the time-consuming task of generating a
new retrieval cue when an old one becomes ineffective, because the next
retrieval cue is readily available. It is provided by a script proposition
that is part of the knowledge net. Such a proposition is not a full script
but
merely a skeleton script - an ordered list of episode names. For the
GROCERY-SHOPPING script this would be as shown in Figure 3.10. Each
episode name can serve as a retrieval cue. Retrieval of items associated
with the retrieval cue occurs associatively, just as in the case of
category retrieval, except that the associations among the elements that
comprise a script are often unidirectional. Going through the aisles is
associated with
Figure 3.10
going to the checkout counter, but there is no link in the reverse
direction. Hence, the retrieval process will continue to move through the
script without backtracking. The particular items that will be retrieved
in each retrieval episode will vary in this probabilistic process and will
be strongly influenced by whatever other retrieval cues are operative in
the situation. But there will be no scallops as in category retrieval, for
there is no need to pause to search for a new retrieval cue. The next
retrieval cue is provided by the script proposition, and it is just as
effective as the previous one, so that there is no general slowing down as
in Figure 3.8.
Kintsch and Mannes (1987) have described a simulation that shows how
scripts can emerge in this way from an associative knowledge base, given
basic information about the temporal sequence of the script episodes. For
this purpose, they combined the model of memory retrieval proposed by
Raaijmakers and Shiffrin (1981) with the processes for category and script
retrieval that have been discussed here. They obtained the scalloped,
negatively accelerated retrieval functions typical for category retrieval
and the smooth, constant-rate functions typical for script generation from
the same associative net. Thus, they have demonstrated that
global structures such as schemata need not be in the mind but may
emerge from the action of certain types of control processes on a globally
unstructured, only locally linked knowledge base.
the network and the entry in the ith row and jth column is the strength of the link between nodes i and j.
A concept or proposition can thus be thought of as a vector of numbers, each number indicating the strength with which the concept or proposition is linked to another concept or proposition.
What determines these numbers? Presumably, they are the end product of lifelong experience, of interacting with the world we live in. We
learn - by observation, by talking to others, by reading stories - that cats
are not normally scared but that they are when chased by a dog. The values in our vectors, therefore, are the fine-tuned products of numerous and diverse experiences that we as humans have.
There appears to be no way one could provide an artificial system
with these numbers by some sort of hand coding. The system is too complex and too huge and too subtle, and unavailable to reliable introspection.
The only way to acquire these numbers is to live a normal human life,
learning through interaction with the human environment. A machine has
no chance to learn all these numbers perfectly, because it cannot live and
act like a human - an obvious point, much belabored by philosophers.
Does that mean that we cannot build a machine that will simulate
human cognition adequately? Are we forever restricted to hand coding of
propositions, as is being done in most of the research described in this
book? Not, perhaps, if we shift our criteria somewhat. Machines cannot
act and live as humans can and hence they cannot learn from experience
as we do, but they can read. Therefore, they can learn from reading.
A machine that knows about the world only from reading surely is a far cry
from a human with real red blood and surging hormones, but there is a lot
to be learned from the written word. It is only the second-best choice, but
suppose we teach a machine what the strength values in all these concept
and proposition vectors are by experience with the written word only.
Latent Semantic Analysis (LSA) is a technique that allows us to do
something like that (Landauer & Dumais, 1997; Landauer, Foltz, &
Laham, in press). This technique uses singular value decomposition, a
mathematical generalization of factor analysis. It was originally developed
as an information retrieval technique by Deerwester, Dumais, Furnas,
Landauer, and Harshman (1990) and extended to discourse analysis and
general problems concerning learning and language by Foltz (1996) and
Landauer and Dumais (1997). The reader will have to refer to these
sources for an adequate description of the technique and technical details.
Here, I try to give only a general impression of what LSA does and then
M = ADA'
where A and A' are matrices composed of the eigenvectors of the matrix
and D is a diagonal matrix of the eigenvalues (or singular values) of the
matrix. The theorem generalizes to the nonsquare matrices used by LSA.
The eigenvalues are ordered in terms of their magnitude or importance.
Multiplying the three matrices yields back the original matrix, M. What
is done in LSA is to throw away most of the eigenvalues (and their associated eigenvectors), keeping only the largest ones, say, the 300 largest ones. Multiplying the three matrices thus reduced does not reproduce M
precisely but only approximates the original M. That turns out to be a
considerable advantage. The original matrix contains too much information - all the details and accidents of word use. By discarding all this
detail, we keep only the essence of each word meaning, its pure semantic
structure, abstracted from particular situations. This constructs a semantic space of, in the encyclopedia case, 300 dimensions in which each word
and document from the original matrix can be expressed as a vector. Furthermore, new words and documents can be inserted into this space and compared with one another and with any of the vectors originally computed. Of the various ways to compare vectors, the only one I discuss here
is closely related to correlation: A measure of relatedness between vectors
is the cosine between the vectors in the 300-dimensional space. Identical
vectors have a cosine of 1, orthogonal vectors have a cosine of 0, and
opposite vectors have a cosine of-1. For instance, tree and trees have a cos
= .85; tree and cat are essentially independent, cos = -01; cat and The dog
chased the cat yield a cos = .36.
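The dimension reduction and the cosine measure can be sketched with numpy's singular value decomposition. The four-word, four-document count matrix below is made up, and keeping only two dimensions is an illustrative stand-in for the roughly 300 dimensions of a real LSA space.

```python
import numpy as np

# A toy word-by-document count matrix (all counts invented); real LSA uses
# an encyclopedia-sized corpus and keeps ~300 dimensions rather than 2.
words = ["tree", "trees", "cat", "dog"]
M = np.array([
    [2.0, 3.0, 0.0, 0.0],   # "tree" occurs only in the first two documents
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])

# Singular value decomposition: M = A D A' in the text's notation.
U, s, Vt = np.linalg.svd(M, full_matrices=False)

k = 2                          # keep only the k largest singular values
vecs = U[:, :k] * s[:k]        # each row is now a word vector in k dimensions
v = dict(zip(words, vecs))

def cosine(a, b):
    """Relatedness: 1 for identical, 0 for orthogonal, -1 for opposite."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

In this toy space, tree and trees come out nearly identical while tree and cat are essentially orthogonal, mirroring the pattern of cosines reported in the text.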
What is to be gained by this vector representation? Unlike propositional analysis, it is fully automatic and objective. It is computationally not very demanding (once an original semantic space has been constructed). The cosine measure of semantic relatedness is readily interpretable. Thus, it is not necessary to assign links between nodes in a network arbitrarily or by collecting empirical association data. The cosines between words, sentences, or paragraphs are easily computed and provide objective, empirically based measures. We need no longer guess what
the neighboring nodes of a word (or sentence) are in semantic space - we
can look it up in the LSA space. Furthermore, words and documents are
treated in exactly the same way. “Documents” for LSA are akin to experiences or episodes for a human learner. For us the meaning of a word or
proposition is determined as much by the episodes in our memory to
which it is related as by the other words to which it is related; that is,
meaning involves both semantic and episodic memory. LSA computes
relationships both between words and between documents qua episodes.
Thus, it allows us to explore this important aspect of meaning, which is
not easily done within conventional approaches.
The initial results that the developers of LSA (Landauer, Dumais, Foltz, and their co-workers) obtained with this method impressively demonstrate its promise: LSA appears indeed capable of capturing much of word meaning. More specifically, Landauer and Dumais (1997) found that the vectors for words derived from the encyclopedia analysis predicted the correct answers to standardized vocabulary tests in which
described in this book involving LSA is only a beginning and still quite
tentative.2 Nevertheless, it opens exciting new possibilities that should be
explored by discourse researchers.
2 See sections 5.1.2 on word meanings, 5.3 on metaphor, 6.2.3 on macrostructures, 7.4
on retrieval structures, 8.2.2 on evaluation of summaries, 9.6 on learning from text,
and 11.4 on decision processes. Readers may check (and extend on their own) these
computations by using the LSA program available at http://samiam.colorado.edu
/~lsi/Home.html
selected from these all-too-rich offerings, and how do they cohere into an
understanding that is attuned to our goals and motivations as well as the
characteristics of the situation within which we find ourselves?
There is no question that this happens. But how? The traditional
answer is that order is guaranteed because the process of understanding
is under the control of a schema that guides it (Bartlett, 1932; Schank &
Abelson, 1977; Selz, 1922). A schema in this view serves as a control
structure that regulates comprehension processes in a top-down fashion.
It works, on the one hand, like a perceptual filter, in that it admits material consistent with itself but blocks irrelevant materials, and, on the other
hand, it serves as an inference machine, in that it fills in the gaps that are
inevitably found in the actual stimulus material.
Schema theory has taken many forms in the years past, in both AI and
psychology, and has become rather more sophisticated than the caricature
just presented suggests. Nevertheless, it has its problems. First and fore¬
most, there exist enough psychological data to question whether the top-down guidance of comprehension is as tight as schema theory suggests. I
do not review these data here, but results are presented throughout this
book that indicate the need for conceiving of comprehension as a more
bottom-up, loosely structured process. Second, human comprehension is
incredibly flexible and context-sensitive. It is hard to see how one could
model that process with fixed control structures like schemas. It is, therefore, worth exploring an alternative.
A coherent model is certainly the outcome of comprehension, but it
does not have to be the result of forcing comprehension into a Procrustean schema. One can conceive of comprehension as a loosely structured, bottom-up process that is highly sensitive to context and that flexibly adjusts to shifts in the environment. Comprehension, in this view,
may in fact be quite chaotic in its early stages. It becomes orderly and well
behaved only when it reaches consciousness. Such a process can be modeled by a construction process that is only weakly controlled and proceeds largely in an associative, bottom-up manner, followed by a constraint satisfaction process in the form of a spreading activation mechanism that yields the coherence and order that we experience.
I have called this model the construction-integration (Cl) model. In
this model mental representations are formed by weak production rules
that yield disorderly, redundant, and even contradictory output. How-
Modeling comprehension processes 95
To say that construction rules are weak and bottom-up and require an
integration process to function does not specify what these rules are. It
merely characterizes an architecture within which various kinds of models can be developed. That is, to model a particular cognitive process, a set of (weak) construction rules must be formulated that are involved in this process. I discuss a few examples here that are important for understanding text. This discussion remains informal and relies on hand-simulation of the rules involved. Construction rules for a number of different cognitive processes are described and evaluated empirically in the remaining
chapters of this book.
FLY[THEY,PLANES] - ISA[THEY,FLYING[PLANE]]
The negative link is assigned not because the two interpretations are logically incompatible, but because the two alternative propositions were formed from the same sentence. Thus, the system does not have to understand the contradiction in order to assign a negative link, but it does so on
purely formal grounds. If the same string of words can be configured
once into one proposition and once into another, the two constructions
are mutually exclusive and inhibit each other.
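The effect of such a mutual inhibitory link can be illustrated with a three-node toy network. The nodes, link strengths, and the context node are invented for this sketch, and clamping negative activations to zero is an implementation assumption, not a rule stated in the text.

```python
import numpy as np

# Two alternative readings of "They are flying planes" inhibit each other
# with a -1 link; a third node stands in for context that happens to
# support the first reading (a toy setup, not the book's simulation).
#              FLY   ISA  CONTEXT
W = np.array([
    [ 1.0, -1.0, 1.0],   # FLY[THEY,PLANES]
    [-1.0,  1.0, 0.0],   # ISA[THEY,FLYING[PLANE]]
    [ 1.0,  0.0, 1.0],   # context supporting the FLY reading
])

a = np.ones(3)                        # all nodes start fully active
for _ in range(50):                   # spreading activation
    a = np.clip(W @ a, 0.0, None)     # negative activations clamped to zero
    a = a / a.max()                   # normalize by the maximum score
```

After a few iterations the contextually supported reading settles at full strength while its rival is driven to zero: the inhibition does the disambiguation, with no rule ever inspecting the content of the two propositions.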
A propositional network is thus formed. Link strengths may be chosen
to be all equal, or they may be varied according to theoretical considerations. For instance, embedding relations may be given more weight than
direct links, which in turn may be given more weight than indirect links.
Link strengths may also be estimated empirically from free association
data or by means of the cosine values in an LSA analysis. Different versions of the model may thus be obtained to be evaluated empirically for
their adequacy.
3. Rules for the activation of knowledge. The basic mechanism for knowledge activation is assumed to be associative. Items in working memory activate some neighboring nodes in the knowledge net with probabilities proportional to the strengths with which they are linked to them.
This mechanism may be modeled formally (see, for instance, Mannes & Kintsch, 1987, where the SAM theory of memory retrieval by Raaijmakers & Shiffrin, 1981, is used for that purpose; likewise Doane, McNamara, Kintsch, Polson, & Clawson, 1992, or Kitajima & Polson, 1995, where memory retrieval processes are of central importance). Sometimes,
however, a more informal treatment is appropriate, as in many of the illustrative examples discussed later. Thus, Kintsch (1994a) obtained empirical estimates of what knowledge was likely to be retrieved but then made the simplifying assumption that all of it actually was retrieved. Even more
informally, often only directly relevant knowledge is dealt with explicitly
in uses of the model when the complete details of knowledge retrieval do
not play a central role.
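The associative activation rule can be sketched directly. The knowledge-net fragment and its link strengths below are invented; only the sampling rule - neighbors retrieved with probability proportional to link strength - comes from the text.

```python
import random

# A toy fragment of a knowledge net: the neighbors of one working-memory
# item, with link strengths (items and strengths invented for illustration).
NET = {
    "UPSET[JANE]": {"ANNOY": 0.9, "CRY": 0.4, "LEAVE-STORE": 0.2},
}

def activate_knowledge(item, net, n=2, rng=random):
    """Sample n neighboring nodes, weighted by the strength of their links
    to the cue - the associative mechanism described above."""
    nodes = list(net[item])
    strengths = [net[item][x] for x in nodes]
    return rng.choices(nodes, weights=strengths, k=n)
```

Strongly linked nodes such as ANNOY are retrieved most often, but weakly linked neighbors still enter working memory occasionally, which is what makes the construction phase promiscuous rather than smart.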
4. Rules for constructing inferences. Just as we are unable to state explicitly all the rules for deriving a propositional representation from a text, we cannot list all rules that people use for inferencing. Examples of such rules appear frequently throughout these chapters. For instance, a simple rule used in forming a situation model might involve the following transitive inference:
ABOVE[x,y] & ABOVE[y,z]
into
ABOVE[x,z]
Rules like (1) and (2) are used intuitively in applications of the present
model, just like the parsing rules. We are primarily concerned with the
resulting structures, not with the parsing and inference rules themselves.
W = {w_ij}

Activation spreads to a node from all of its neighboring nodes, so that the activation of node i at time t + 1 becomes

a_i(t + 1) = Σ_j a_j(t) w_ij / max_j a_j(t + 1)

where dividing by the maximum normalizes all scores. Other standard connectionist techniques could also be used for
normalization - for example, a sigmoidal transformation, as in Goldman
and Varma (1995).
The spreading activation process is continued for as many iterations as
are necessary for the activation values to settle in a stable pattern. Once
no activation value changes by more than some criterial amount, say .001
or .0001, the process is stopped. The conditions under which a spreading
activation process stabilizes, that is, approaches an asymptotic activation
vector A(∞), or simply A, are not known in general (though see Rodenhausen, 1992). Symmetric connectivity matrices with no negative values
behave like ergodic Markov chains. That is, an asymptote exists and is
independent of the starting vector.
The values of A indicate the strength of each node in the integrated
network. Nodes that are positively connected to many other nodes in the
net will be strengthened. Nodes that have few connections or are negatively connected will wither away or become suppressed. This is a process
of constraint satisfaction: Nodes that satisfy the multiple constraints of the network become stronger, whereas nodes that do not satisfy them become weaker.
The resulting pattern of activation indicates for each node the role it plays
in the network as a whole and can be considered a measure of the strength
of that node in the mental representation.
A simple numerical example helps to form some intuitions about this
integration mechanism. Consider the network of five nodes shown in
Figure 4.1. The dashed line indicates a negative link. If we let all links
have a strength of either 1 or -1, we obtain the following connectivity
matrix, remembering that each node is linked to itself:
     A   B   C   D   E
A    1   1   1   1   0
B    1   1   0   1   0
C    1   0   1   0   1
D    1   1   0   1  -1
E    0   0   1  -1   1
We assume that each of the five nodes has an activation value of 1 initially:
A(l) = (1, 1, 1, 1, 1)
Figure 4.1
Multiplying the connectivity matrix by this activation vector yields the scores (4, 3, 3, 2, 1). The activation values after one iteration are obtained by dividing these scores by their maximum: A(2) = (1, .75, .75, .5, .25).
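The full iteration to asymptote for this five-node example can be sketched in a few lines of Python. Clamping negative activations at zero is an implementation choice assumed here, not stated in the text.

```python
import numpy as np

# Connectivity matrix for the five-node network of Figure 4.1 (each node is
# linked to itself; the dashed D-E link is negative).
W = np.array([
    [1, 1, 1, 1,  0],
    [1, 1, 0, 1,  0],
    [1, 0, 1, 0,  1],
    [1, 1, 0, 1, -1],
    [0, 0, 1, -1, 1],
], dtype=float)

def integrate(W, a, eps=1e-4, max_iter=1000):
    """Spread activation until no value changes by more than eps."""
    for _ in range(max_iter):
        new = np.clip(W @ a, 0.0, None)   # clamp negative activations
        new = new / new.max()             # divide all scores by the maximum
        if np.abs(new - a).max() < eps:
            return new
        a = new
    return a

a = integrate(W, np.ones(5))              # all activations start at 1
```

The first iteration reproduces the hand computation, (4, 3, 3, 2, 1)/4 = (1, .75, .75, .5, .25); at asymptote node A, the best-connected node, is strongest, and node E, with its negative link, is suppressed.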
The integration that takes place at the sentence end has a special status, however. Except for very short sentences, working memory at this point is usually loaded to capacity and must be cleared to make room for the next sentence. Whatever has been constructed is transferred to long-term memory. As a consequence, except for one or two central propositions that are retained in the focus of attention because of their presumed relevance to further processing, all that has been constructed up to this point in working memory is now lost from consciousness/primary memory. However, in a normal text, this information is still readily retrievable
because the succeeding sentence most likely will contain retrieval cues
that make it accessible in long-term working memory, as discussed in
chapter 7. The active processing focuses on only the current sentence,
plus whatever information had to be retrieved from long-term memory
that is necessary for its processing. Thus, working memory is like a spotlight that moves across a text, sentence by sentence, constructing and
integrating a mental representation in the process. The representation
that results from this cyclical process is a coherent structure and not a
sequence of disjoint structures, each corresponding to a sentence. The
coherence is a result of the fact that propositions are kept in short-term
memory from one sentence to the next to serve as bridging material and
because earlier portions of the text as well as general knowledge items that
were retrieved during processing provide a lattice of interconnections.
The number of propositions that are carried over from one cycle to the
next is a free parameter of the model. The estimates that have been
obtained so far suggest that it is a small number - generally 1. Just which
proposition is carried over is determined by the activation values propositions receive at the end of a processing cycle. Typically, the strongest
proposition of the previous cycle remains in the focus of attention as
attention shifts to a new sentence. This strength criterion replaces the
selection mechanisms that were introduced in earlier work, for example,
the “leading edge strategy” of Kintsch and van Dijk (1978). A full discussion of memory use in comprehension is given in chapter 7.
As a result of this cyclical processing of a text, not all relations in a
text that can be detected by linguistic analysis are actually realized in the
mental representation. Only those relations that hold between propositions that were together in working memory at some time during the sentence-by-sentence process of comprehension play a role in the text representation.
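This cyclical bookkeeping can be sketched as follows. The proposition names and their final activation strengths are supplied by hand, standing in for a full integration step; only the control structure - a small carry-over buffer and links restricted to co-resident propositions - comes from the text.

```python
def process_cycles(sentences, strengths, carry_over=1):
    """Process a text sentence by sentence. Only propositions that are
    together in working memory at some point can become linked, and the
    strongest proposition of each cycle is carried over into the next."""
    buffer, links = [], set()
    for props in sentences:                 # props: one sentence's propositions
        cycle = buffer + props              # working memory for this cycle
        links |= {(p, q) for p in cycle for q in props if p != q}
        buffer = sorted(cycle, key=lambda p: strengths[p],
                        reverse=True)[:carry_over]
    return links
```

With sentences [["P1", "P2"], ["P3"], ["P4"]] and strengths {"P1": .9, "P2": .4, "P3": .7, "P4": .5}, the strong proposition P1 is carried through every cycle and becomes linked to P3 and P4, whereas P2 and P3 are never co-resident and so remain unlinked, even though a linguistic analysis of the full text might relate them.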
where w_ij is an element of W and a_i is the final activation value of the ith element; the summation is over the k processing cycles that the ith element has participated in, k ≥ 1.
The extent to which a reader will actually perform the work of transforming a textbase into a situation model is subject to numerous influences. In a given comprehension episode, one influence or another may dominate, but it is generally the case that both play a role, complementing each other.
Figure 4.2 illustrates the relationship between textbase and situation
model. The textbase can be more or less coherent and complete, and the
situation model can be more or less adequate and precise. Mental text
representations may fall anywhere in this quadrant. In the text-that-tells-it-all, in which every detail as well as the overall structure is made perfectly explicit (as far as that is possible!), the textbase is also a good situation model and no further knowledge elaborations on the part of the comprehender are required. Typically, however, the mental text representation is a mixture of text-derived and knowledge-derived information, not necessarily in equal parts. Extreme cases, in which either the
[Figure 4.2 appears here as a quadrant plot: adequacy of the situation model on the vertical axis and adequacy of the textbase on the horizontal axis, each running from low to high, with Bransford, Barclay, and Franks (1972), Moravcsik and Kintsch (1993), and the entirely explicit text marking the extreme positions.]
Figure 4.2 Textbase and situation model jointly determine the nature of the
mental text representation. Extreme cases are Bransford, Barclay, and Franks
(1972), where the situation model dominates at the expense of the textbase,
and Moravcsik and Kintsch (1993), where the textbase dominates at the
expense of the situation model.
(3) Jane could not find the vegetable and the fruit she was looking
for. She became upset.
Not finding something one is looking for is a common reason for becoming upset. Hence, a causal link is inferred between the two propositions,
thus forming a simple situation model.
Elaborative inferences come in many kinds:
(4) Jack missed his class because he went to play golf. He told his
teacher he was sick.
the situation model might be an image of a lake, a log with a turtle, and a
fish underneath. Various elaborations are implied by such an image, for
example, that the turtle is above the fish, the fish is in the water, and so on.
Situation models often involve more than filling gaps between propositions or local elaborations. They may provide a basis for the overall organization of the text even in cases where the text structure is not signaled explicitly in the text. Readers recognize certain text structures and
can use this knowledge to impose an organization on the text. For
instance, in a little story “Jane goes grocery shopping” readers can generate a macrostructure for the text, even if none is signaled, by reference to their grocery-shopping script, which identifies the episodes typically involved in grocery shopping (preparation, shopping, checkout, pay/
leave). The sentences of the text can be subordinated to these episode
nodes, thus creating the macrostructure of the text via the schema-based
(6) John traveled by car from the bridge to the house on the hill. A train passed under the bridge.
To form the textbase we assume that the two brief sentences are both
processed in a single cycle. The first sentence yields a single proposition
with a modifier; another proposition with a filled location slot is derived
from the second sentence; the two propositions are only indirectly related
(by the common argument BRIDGE), as shown in Figure 4.3. The
textbase is thus strongly constrained by the text; most or all cooperative
readers would be expected to form a propositional textbase like the one
shown in the figure. Just what sort of situation model would be formed is
much less constrained. We make the assumption here, which will be cor¬
rect for at least some readers, that an image will be formed. For the first
sentence, this image might be of a house on a hill, a bridge crossing a
river, a road connecting the bridge and the river, and a car driving from
the bridge up to the house. Note that this image does not include John,
but there is a road and a river, as well as a particular spatial relationship
between hill and river - for example, the hill might be to the left of the
TRAVEL
  agent: JOHN
  instrument: CAR
  source: BRIDGE
  goal: HOUSE
  mod: ON-HILL

PASS
  object: TRAIN
  location: UNDER-BRIDGE

Figure 4.3
river, for which there is no license in the text. This image can be represented by the four spatial relations (shown in the ovals of Figure 4.4), which form a network together with the two textbase elements (shown in the rectangles). The TRAVEL proposition corresponds to the ON[CAR,ROAD] image element, which therefore links them together in the network, and the spatial and propositional ON[HOUSE,HILL] elements correspond to each other.
As each proposition or imagery element is formed, it is integrated into
the network that is being formed in working memory. The status of the
network after the elements shown in Figure 4.4 have been constructed
and integrated is indicated by the activation values that are shown with
each element. The main point is that the gratuitous elaboration of including a river in the image receives a low but positive activation value. This
will change, as the next sentence becomes part of the network (see Figure
4.5).
We have the two propositional elements derived from the second sentence, which give rise to a new imagery element: a train under the bridge.
This is linked to the previous image of the bridge, road, and house but is
incompatible with the other element of the previous image of a river
under the bridge. Thus, the negative link is based on an inference that a
bridge over train tracks does not have a river underneath (neglecting the
possibility that a bridge could span both train tracks and a river). The two
UNDER nodes therefore interfere with each other and are linked by a value of -1 in the network (indicated by a broken line). The outcomes of
the final integration cycle are shown at the top, with each node. The
image of the train under the bridge dominates the net and completely
suppresses the image of bridge and river.
This example was chosen to demonstrate two points. First, it shows
how the model can deal with images - not very elegantly, but it can do so.
Some imagery information can be presented in the way shown here but
with some loss of information - the concreteness of the image together
with some of its spatial implications. Second, it provides a good example
of the operation of weak, dumb rules. In response to the first sentence, an
image was constructed of a bridge and a river. The text said nothing about
(7) (SI) Jane went grocery shopping on Sunday afternoon. (S2) She
had to park the car far away from the store because the parking
lot was full. (S3) The aisles were as crowded as the parking lot.
(S4) The lettuce was wilted and the fruit was picked over. (S5)
She became upset. (S6) But she filled her cart with whatever she
could find. (S7) The bill was larger than expected. (S8) She carried two heavy bags across the parking lot.
GROCERY SHOPPING
agent: PERSON
Figure 4.6
read in the second sentence (S2). This sentence is assigned to the preparation slot of the grocery-shopping script, because preparation is associated in the knowledge base with making a shopping list, driving to the
store, parking, and entering store. Thus, the structure that is generated
when the first two sentences are processed now contains seven nodes
(Figure 4.7). All nodes and links are assigned a value of 1. The main
proposition is GROCERY-SHOPPING, with three filled slots; the
PREPARATION slot is connected to PARK, which has two modifiers
and is linked (via an explicit BECAUSE relation) to FULL. Working
memory is loaded to capacity, and the end-of-cycle integration determines the final activation values for these nodes, which are as shown in
Figure 4.7 before each element of the net.
The third sentence (S3) is associated with the SHOPPING slot of the
main proposition, and it also reinstates the FULL proposition from the
previous processing cycle. Because S3 is a short sentence and only three
new elements have been constructed at this point, the model goes on and
reads in S4 as part of the same cycle. It is also connected to the SHOPPING episode. In addition, the PARK proposition is carried over in the
short-term memory buffer as the most highly activated proposition from
the first processing cycle. This yields the network shown in Figure 4.8.
The third processing cycle comprises S4 and S5. The key element here is
the UPSET proposition. Because other information has already been
[Figure 4.7 appears here: the network after the first processing cycle, in which a BECAUSE link connects PARK to FULL[PARKING-LOT], the latter with an activation value of .48.]
2 This is only one possible interpretation of this text - not all readers will follow this
course.
Figure 4.8
ated by the fact that a script that was retrieved on the basis of the first sentence proved to provide an appropriate organizational structure for the remainder of the text. Each sentence of the text was recognized as a possible filler for the slots of the script that was available. This computation
involves matching each (complex) text proposition with propositions
associated with the slots of the scripts in long-term memory. The second
macroproposition involved a similar process. A proposition in the text -
UPSET — retrieved information from long-term memory (let’s call it an
ANNOY schema) about normal conditions for becoming upset in a grocery store; text propositions were found that matched this information
and hence were subordinated to the ANNOYED proposition. These two
macropropositions provide an organization for the text. They are the core
1.00  ANNOYED
        agent: JANE
 .41    condition: FULL[PARKING LOT]
 .41    state: CROWDED[AISLES]
      BUT
 .32  GROCERY SHOPPING
        agent: JANE
 .87    FILL[JANE,CART,$]
 .75    event sequence: SHOPPING
 .43    FIND[JANE,GROCERIES]
 .17    WHATEVER

Figure 4.9
Figure 4.10
of the situation model; the rest of the text provides detail that fits into that
model and supports the situation model. The situation model was of
course constructed on the basis of that detail, but once the construction
process is over, it becomes the dominant element in the episodic text
structure.
Note that in all the preceding examples one cannot make a strict distinction between situation model and textbase. Certain propositions are
clearly text derived and hence belong to the textbase, but the way they are
organized and indeed their very strength in the overall structure depend
on the situation model that has been created of a woman getting annoyed
while grocery shopping. A reader without knowledge about shopping in
a supermarket who has not experienced the annoyances of such places
could and would not create this kind of structure. His or her memory representation would therefore be quite different - more like a pure textbase.
[Figure 4.11 appears here: a bar graph of memory strength, on a scale from 0 to 4, for the propositions S1-grocery-shop, S1-Sunday, S2-parking, S2-full-lot, S3-crowd-store, S4-wilt-lettuce, S4-picked-fruit, S5-upset, S6-fill-cart, S7-large-bill, and S8-carry.]
Figure 4.11 Long-term memory strengths for the propositions of the grocery-shopping text.
(8) This woman went grocery shopping. She got upset because it
was crowded.
Although much has changed since Kintsch and van Dijk (1978) began
work on a process model of comprehension, the basic framework of that
model is still the same today: the propositional representation, the cyclical processing, the distinction between the micro- and macrostructure.
But the model has evolved and, indeed, changed in some important ways.
The year of the mental model was 1983. Johnson-Laird published his
influential book of that title, and van Dijk and I introduced the related
notion of situation model (van Dijk & Kintsch, 1983). This was a major
change in the theory, which shifted the emphasis away from the text itself
to the knowledge/text interaction. This emphasis has been retained and
strengthened here.
A more technical innovation made by van Dijk and Kintsch (1983) was
the use of complex propositions in contrast to atomic propositions, as in
the earlier work. However, in actual modeling work I continued to use
atomic propositions for some time thereafter, and only in the present
work have I employed complex propositions more consistently.
In 1988 the Cl architecture was developed (Kintsch, 1988). It involved
a shift from a schema-based control system to a bottom-up system in
which context effects were viewed as constraint satisfaction. This made it
possible for the first time to deal seriously with the use of knowledge and
memory in comprehension. In the present book I have tried to develop in
detail the program laid out in 1988. In particular, the theory of long-term
working memory of Ericsson and Kintsch (1995) is used to specify the
details of the use of knowledge and memory in comprehension, as detailed
in chapter 7. Thus, the new framework has significantly modified earlier
views of working memory management, the use of episodic text memory
and knowledge in comprehension, and inferencing.
In the present book I also introduce Latent Semantic Analysis (LSA)
as an alternative to propositional representations. It is consistent with the
processing theory presented here, as I have tried to show with the various
5
Models of comprehension
200 to 250 ms, word meanings in a discourse context need more time than
that to stabilize. This means that meaning construction at the word level
extends beyond the fixation time. In general, readers attempt to interpret
whatever structure they encounter - lexical, syntactic, semantic, or discourse - immediately rather than engaging in a wait-and-see strategy (for
a detailed review, see Just & Carpenter, 1987.) However, this means only
that they begin the interpretive process as soon as possible. It is not nec¬
essarily the case that each word is fully encoded while it is being fixated.
During the fixation, the perceptual construction processes are initiated
and largely completed. According to the Cl model, these do not produce
a final product - a particular word meaning - but must be integrated into
the discourse context. This process of contextual constraint satisfaction
starts immediately, as soon as anything has been constructed to be inte¬
grated, but may not be completed until sufficient local context is available
to allow the integration process to stabilize.
5.1.2 Homographs
The results were most interesting: The lexical decision times to words
that were associates of either sense of the homograph were both primed,
that is, they were about 40 ms shorter than the lexical decision times for
unrelated control words. In further work, it was found that if the lexical
decision task is not presented immediately after the target word but
delayed for 350 ms, the contextually inappropriate associate is no longer
primed. Swinney concluded that all meanings of a homograph were
accessed initially but that the context quickly selects the appropriate one,
so that the inappropriate one never even enters consciousness.
Following Swinney’s original demonstration, a large number of studies
of lexical access have appeared using either the lexical decision task or
simply a naming task. Swinney’s findings were replicated repeatedly, but
the opposite result was obtained in a number of experiments, too. Despite
this confusion, the issue seems to be settled today: Initial lexical access
does appear to be exhaustive and context independent, at least for
homographs with two balanced meanings. In a recent review of that
literature, Rayner, Pacht, and Duffy (1994) conclude that the most likely
reason for the repeated findings of context effects is that both the lexical
decision task and the naming task are subject to postlexical influences.
Thus, what appears to be a context effect on lexical access is really an
effect of the context on the integration of the word into the discourse
context subsequent to lexical access. They argue convincingly that
measures of eye movement provide a better and less biased index of
lexical access, and they report some data that let us observe both the
context-independent access to multiple meanings of homographs and the
process of contextual integration that begins the moment the multiple
meanings have been generated.
Rayner et al. (1994) measure the fixation time on homographs embedded
in a disambiguating prior discourse. They find that readers fixate
longer on a homograph that has two meanings, one of which is much
more frequent than the other, when the subordinate meaning is
instantiated, than on balanced homographs or on unambiguous control
words matched for length and frequency. They call this the subordinate
bias effect. Presumably, the more frequent meaning is accessed first, and
the longer initial fixation time is required to access the subordinate
meaning required by the context.
Now what happens if a biased homograph is repeated, both times in its
subordinate sense? If context can constrain access, it makes no sense for
was obtained for unassociated thematic words. Priming was clearly a
matter of local associations. The theme neither helped to prime
unassociated words related to it, nor did it suppress associated words
unrelated to it.
Till, Mross, and Kintsch (1988) extended these results by including
not only local associates but also thematic associates as test words in
their study. An example of one of their experimental sentences was the
following:
(1) The townspeople were amazed to find that all the buildings had
collapsed except the mint.
After reading this sentence, subjects were given a lexical decision task
with one of four test words (nonword strings were, of course, also
used in the experiment):
These test words were presented 200, 300, 400, 500, 1,000, or 1,500 ms
after the last word in the sentence. Figure 5.2 shows the amount of
priming that was obtained for these test words as a function of the
stimulus onset asynchrony (SOA): the lexical decision time for
contextually inappropriate associates minus the lexical decision time for
contextually appropriate associates. For short SOAs, the Till et al. (1988)
data replicate the results of Swinney (1979) and Kintsch and Mross
(1985): Both the contextually appropriate associates and the inappropriate
associates are equally activated initially. By 350 ms, however, only the
contextually appropriate associate remains primed. We can explain this
in the following way. The process of word-meaning construction is
completed by 350 ms: The contextually appropriate word sense has been
established, and its associate money remains activated. Because the
alternate meaning is now deactivated, no priming effect is obtained for
candy at this point.
However, globally relevant inference words corresponding to the
theme of the sentence do become activated after a certain time even in the
absence of any local associations, unlike the scriptally relevant test words
Figure 5.2 Time course of priming effects for associate and inference
targets. Points represent the mean priming effect: response latencies for
contextually inappropriate targets minus response latencies for
contextually appropriate targets. After Till et al. (1988).
1 E.g., “gate” is part of the boarding a plane script but may not be topically relevant in
a story about someone almost missing a plane.
Word identification in discourse 133
(2) The gardener pulled the hose around to the holes in the yard.
Perhaps the water would solve his problem with the mole.
Landauer and Dumais (1997) have already shown that in an LSA
simulation of the Till et al. (1988) materials the cosines between the target
words and the experimental sentences mirror the obtained experimental
results. We go a step further here by incorporating their semantic
relatedness estimates into a construction-integration simulation, which
allows us to model the changing time course of priming. Table 5.1 shows
the cosines obtained from LSA, that is, the strength of the semantic
relation, between the prime mole and the two target words, ground and
face. Both targets are reasonably closely related to the prime; empirically,
both are strong associates of the prime. In separate CI simulations, these
two target words were then linked to the words in (2) according to their
cosine values. The results of the spreading activation process are also
shown in Table 5.1. Ground is strongly activated, because it has high or
moderate cosines with several words in (2), whereas face ends up with a
low activation value, because its cosines with all of the words in (2) except
mole are low.
The implications of this analysis are in agreement with the Till et al.
data: There is immediate priming for both ground and face because of
their relatedness to mole. But once mole has been integrated into the
sentence context, only ground will be primed; face is no longer
semantically related to the mole-in-the-yard.
Now consider an example of the kind Steinhart used:
(3) For his birthday last year, I wanted to give my brother something
ugly. But no pet store in town would sell me a mole.
The two target words can be linked up with (3) in the same manner,
employing the cosine values between the targets and the words in (3) as
link strengths. However, now ground, the target related to the
contextually appropriate sense of mole, fares no better than the target
related to the inappropriate sense: All the cosines are low in both cases,
as are the activation values resulting from a CI simulation, also shown
in Table 5.1.
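In rough outline, the CI simulations just described work as follows: each target word is linked to the content words of the sentence with the corresponding cosine as link strength, and activation is then spread over the resulting network until the pattern settles. The Python sketch below illustrates the technique only; the cosine values are invented placeholders, not those of Table 5.1, and the normalization rule is a generic one.

```python
# Sketch of a construction-integration (CI) settling step. The link matrix
# holds hypothetical word-to-word cosines; the actual values of Table 5.1
# are not reproduced here.
import numpy as np

def integrate(W, a, n_iter=50):
    """Spread activation over the network until it settles. Negative
    activations are clipped to zero, and the total amount of activation
    is renormalized to a constant sum on every iteration."""
    total = a.sum()
    for _ in range(n_iter):
        a = np.clip(W @ a, 0.0, None)
        a *= total / a.sum()
    return a

# Content words of sentence (2) plus the two target words.
nodes = ["gardener", "hose", "yard", "mole", "ground", "face"]
W = np.array([  # symmetric, hypothetical cosine link strengths
    [1.00, 0.30, 0.40, 0.20, 0.35, 0.05],
    [0.30, 1.00, 0.25, 0.15, 0.20, 0.05],
    [0.40, 0.25, 1.00, 0.30, 0.45, 0.05],
    [0.20, 0.15, 0.30, 1.00, 0.40, 0.40],
    [0.35, 0.20, 0.45, 0.40, 1.00, 0.05],
    [0.05, 0.05, 0.05, 0.40, 0.05, 1.00],
])
a = integrate(W, np.ones(len(nodes)))
# "ground", linked to several sentence words, settles at a higher
# activation than "face", which is linked only to "mole".
```

With link strengths drawn instead from sentence (3), where all the cosines are low, neither target would gain much activation, mirroring the contrast reported in Table 5.1.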
It does not seem that word identification is a look-up process in a
mental lexicon. Otherwise, the ground-mole sense should have been
selected in (3) as well as in (2), and ground should have been primed in
both cases, because we know that it is a strong associate. On the other
hand, suppose that in word identification there are no “senses” of words
to be picked out of a lexicon. If words and sentences are represented as
vectors in a semantic space, mole is a weird mixture of the ground-mole
and the face-mole, by itself priming ground as well as face. Once mole is
integrated into the sentence context of (3), it is no longer the ground-mole
but a new construction, the ugly-pet-mole. The ground-mole burrows
holes into the ground and annoys gardeners; none of this is important for
the ugly-pet-mole - it is ugly and supposed to annoy my brother! Reading
(3) does not select a particular sense of mole but constructs a new meaning
of mole meticulously attuned to its context.
Results in the memory literature lend support to Steinhart’s priming
data. For instance, Barclay et al. (1974) had subjects study sentences like
The man lifted the piano or The man tuned the piano. Later, two kinds of
recall cues were provided: either Pianos are heavy or Pianos make nice
sounds. The subjects had to recall the study sentences. Depending on
which form of the study sentence they had read, one or the other of these
cues was appropriate. Recall was significantly better (80%) with the
appropriate cue than with the inappropriate cue (61%). Similar results
were obtained in an experiment by Anderson and Ortony (1975), who
found that the sentence Television sets need expert repairmen was recalled
best with the cue appliance, and the sentence Television sets look nice in the
family room was recalled best with the cue furniture. Clearly, both appliance
and furniture are potential components of the meaning of television set, but
only the contextually relevant feature was actually used in comprehension.
These memory data with unambiguous words support Steinhart’s
finding that priming effects for homonyms are influenced by their
contextual use. However, negative results like Steinhart’s are at best
suggestive and need to be replicated with appropriate controls. If they do
hold up, this would constitute strong evidence for the position advocated
here.
Word meanings are not something to be pulled from a mental lexicon but
are ephemeral constructions that are generated during comprehension.
In the next section, I explore in a little more detail the nature of this
construction process as it is conceived by the CI model.
Cycle Number   Proposition Number   Proposition
1              P1                   DESIGN[BAR,P2]
1              P2                   MALE[PROFESSIONALS]
2              P3                   TURN-AWAY[BAR,SECRETARIES,NURSES]
2              P4                   TRY[SECRETARIES,NURSES,P5]
2              P5                   ENTER[SECRETARIES,NURSES]
3              P6                   LUXURIOUS[BAR]
4              P7                   LOCATION[BAR]
4              P8                   IN[P7,NICEPLACE]
4              P9                   IN[P7,BUSINESSDISTRICT]
4              P10                  IN[P7,TOWN]
5              P11                  PERFECT[BAR,P12]
with a value of .5, to indicate that an item retrieved from long-term
memory during comprehension receives less weight than one that is
explicit in the text.
We are now ready to start the integration process. The complete
network that has been constructed is shown in Figure 5.3. A cycle size of
five elements was assumed, but knowledge elaborations belonging to a
text proposition were always included in the same cycle as the text
proposition, even if more than five elements had to be processed in
working memory because of that. A buffer size of 1 was assumed, that is,
the most highly activated proposition on each cycle was carried over to
the next processing cycle. Six cycles were required to process this text
under these conditions. The long-term memory strengths of the 13
propositional nodes in the network that resulted from these six
processing cycles are shown in Figure 5.4.3
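The cyclical processing regime just described - a small working-memory window, a one-proposition buffer holding the most active proposition, and memory strength accruing on every cycle in which a proposition participates - can be sketched roughly as follows. The cycle contents, the activation values, and the simple strength-accrual rule are invented simplifications for illustration, not the model's actual parameters.

```python
# Sketch of cyclical processing with a buffer size of 1. Activation values
# per cycle are assumed inputs here; in the full model they would come
# from an integration step over the network of Figure 5.3.

def process_cycles(cycles, activations):
    """Accumulate long-term memory strength over processing cycles.
    The most strongly established proposition is carried over into the
    next cycle's buffer, so it accrues strength on every cycle."""
    ltm = {}          # long-term memory strength per proposition
    buffer = []       # carried-over propositions (size 1 here)
    for props, acts in zip(cycles, activations):
        in_wm = buffer + props
        acts_wm = [ltm[p] for p in buffer] + acts
        for p, a in zip(in_wm, acts_wm):
            ltm[p] = ltm.get(p, 0.0) + a
        buffer = [max(in_wm, key=lambda p: ltm[p])]
    return ltm

cycles = [["P1", "P2"], ["P3", "P4", "P5"], ["P6"]]
activations = [[0.9, 0.4], [0.8, 0.5, 0.3], [0.7]]
ltm = process_cycles(cycles, activations)
# P1, repeatedly carried over in the buffer, accumulates the most strength.
```

This is why, in figures such as 5.4, propositions that stay in the buffer across cycles end up with disproportionately high memory strengths.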
The knowledge elaborations have been omitted from Figure 5.4
because none of them received enough strength to be experimentally
significant. However, this does not always have to be the case. Imagine a
reader who is reminded of discrimination every time she reads the six
sentences of Table 5.2. For such a reader, the knowledge elaboration
discrimination would achieve considerable strength. Specifically, the
discrimination node will have a strength of .62 in the network, the eighth
highest value. Thus, under appropriate circumstances it is quite possible
that knowledge elaborations end up with greater memory strength than
actually presented items.
In the experiment of Schwanenflugel and White (1991), subjects read
texts like the one shown in Table 5.2 with an RSVP procedure, that is, one
word at a time, appearing centered on a computer screen. It was a dual
task experiment, because in addition to reading the text (and answering
questions about it later), at certain points during reading subjects made
rapid lexical decisions. The point of particular interest here concerns the
last word of sentence 6, namely, women.
3 The normalization procedure used for these calculations required the total sum of
activation to remain constant in each cycle and differs therefore from the one
employed in the Mross and Roberts (1992) program.

Figure 5.4 Final memory strengths for the propositions shown in Figure
5.3. (The figure plots strengths between 0 and 5 for DES[BAR,P2],
MALE[PROF], TURN[BAR,SEC,NUR], LOC[BAR,P9], PERF[BAR,P12],
LUX[BAR], TAKE[CLT,BAR], NS[BAR,DRK,WOM], IN[P7,NICEPL],
IN[P7,BUSDIS], IN[P7,TOWN], TRY[SEC,NUR,P5], and ENTER[SEC,NUR].)

We make the assumption that priming effects will occur in the lexical
decision task to the extent that the target words are activated by the
context of reading. Suppose that subjects are reading sentences 1 to 6 but that
at the point where the last word should have been presented, a lexical
decision trial occurs with the word women. We need to calculate how
strongly this word will be activated by the memory structure for the text
that already has been constructed at this point. This structure is identical
with that shown in Figure 5.3, except that the last proposition, P13, is
incomplete and has the form NOT-SERVE[BAR, DRINKS, $] - the
recipient of the action has not yet been presented. How will this structure
activate the target word women? Women is connected with the episodic
text structure in two ways. First, it fits semantically into the slot of the
incomplete NOT-SERVE proposition and hence receives activation from
that source. However, we already have a potential slot filler for that
proposition in the form of the knowledge elaboration minors, which
became a part of the net as an associate of NOT-SERVE[BAR,DRINKS,10-
PERSON]. Because women and minors are mutually exclusive alternatives
for the same slot, they interfere with each other; thus, we add a link with
a negative weight of -1 between them. Hence, there are two sources of
activation for women, a positive and a negative. If we let activation spread
to it from the incomplete network through the two links just mentioned,
that source too. However, as was the case for students, the NOT-SERVE
proposition completed with the object minors is no longer macrorelevant
and hence receives no special activation from the discourse topic. As a
consequence, minors ends up with an activation value not as high as that
for women but higher than the activation of students.
The CI model therefore correctly predicts at the same time both local
priming effects that are independent of context and priming through the
discourse context. Purely associative models would predict the former,
schema theories the latter, but what is needed to account for what actually
happens is something like the marriage of bottom-up processes and
contextual integration processes as postulated by the CI theory. Actually,
the theory makes even more powerful predictions, for which, however, no
empirical tests are available as yet. As shown in Figure 5.5, it predicts that
the target minors is strongly activated from the very beginning (via the
preexisting minors that is already a node in the network), but that its
activation then decreases because it does not fit well into the discourse
context. Women and students, in contrast, start out with zero activation
but then become activated from the discourse structure to different degrees.
Figure 5.5 The settling process for the activation values of three target
words (women, students, and minors) in a lexical decision task on the CI
model. The targets were presented in place of the last word of sentence 6.
5.2 Anaphora
in the same passage of a text (Rayner et al., 1994). These results suggest
that repeated words are identified in much the same bottom-up way as
words that appear for the first time in a discourse context.
There is, of course, a difference in the way new words and repeated
words are processed. Part of the process of meaning construction for the
repeated word involves retrieving the context in which the word appeared
earlier and incorporating it into the present context. This retrieval of
prior information about a repeated word occurs quite rapidly, roughly in
the same time that it takes to activate and fixate information about the
word from semantic memory. A study by Dell, McKoon, and Ratcliff
(1983) illustrates this process.
Dell et al. (1983) investigated the case in which a discourse referent is
first introduced with one word (e.g., burglar) and then referred to with
another, definite noun phrase (e.g., the criminal). One of their texts was
the following:
(4) A burglar surveyed the garage set back from the street.
Several milk bottles were piled at the curb.
The banker and her husband were on vacation.
The1 criminal slipped2 away from the streetlamp3.
At the positions labeled 1, 2, and 3 in the last sentence, subjects were given
a single word recognition test with one of three test words: the prior
referent of the noun phrase (burglar), a word appearing in the same
proposition as the prior referent (garage), and a control word from a
different sentence (bottles). The results are shown in Figure 5.6 as the
difference in the recognition times between the test words and the control
word. Before the anaphoric word (position 1), response times for all words
were about equal. About 250 ms after the anaphor (position 2), both the
prior referent and the word referring to a concept from the same
proposition as the prior referent were recognized significantly faster than
the control word, suggesting that both were activated in the reader’s
working memory. At position 3, the prior referent was still activated, but
other concepts from the prior context of the anaphor were no longer
activated.
This pattern of results is roughly what one would expect from the CI
model. Figure 5.7a shows the network that a reader has constructed
according to the model at position 2: It contains the two surface elements
the criminal and slipped, the corresponding proposition, a bridging inference
Figure 5.6 Priming effects (recognition time for test word minus
recognition time for control word) for prior referent (burglar) and other
concepts from the same proposition (garage) at three positions in text (4).
SURVEY[BURGL,GARAGE]
(a)
ISA[BURGL,CRIM]
(b)
Figure 5.7 (a) The CI network at position 2. (b) The CI network at position
3. Circles are surface structure nodes, squares indicate text propositions, and
the triangle indicates a bridging inference.
The literature is rich but confusing. The results of many studies that vary
by only a single factor are difficult to interpret because other confounding
factors were not controlled. An overall picture emerges nevertheless.
(5) Flying to America. Jane wasn’t enjoying the flight at all. The dry
air in the plane made her really thirsty. Just as she was about to
call her, she noticed the stewardess coming down the aisle with
the drinks trolley.
What concerns us here is how readers interpret the “she” in the two
continuation sentences - as Jane or as the stewardess.
To simulate these conditions in the CI model, the comprehension of
the text prior to the test sentence must be simulated. The resulting network
in the context of this LTM trace. Each continuation sentence was twice
represented as a main proposition with a modifier: once for JANE and
once for the STEWARDESS, connected with an inhibitory link. In the
order case, the test proposition was connected to three situation model
elements: the actors Jane and stewardess as possible antecedents of the
pronoun she and, for the JANE version only, to order, because ordering a
drink is what passengers are supposed to do. In the case of the pour
sentence, it was linked to the two possible antecedents and to serve in the
situation model for the STEWARDESS version only. Because we are
interested in the whole course of the integration process and not just the
final outcome, the starting values of the test sentence elements were set to
zero and those for prior text elements were set to their LTM values. Most
of the prior text was deactivated, and only the three elements directly
linked to the new input, plus the situation model header FLIGHT, were
allowed to participate in the processing of the test sentences. Figure 5.9
shows the course of pronoun resolution in the two continuation cases. For
the order sentence, the interpretation shejane orders a Coke gains quickly in
activation, whereas shestewardess orders a Coke never gathers much activation
at all.
Discourse focus and pragmatic acceptability combine in favor of the
passenger ordering the drink.

Figure 5.9 Activation values for the two referents Jane and stewardess in
she-orders and she-pours sentences.

The situation is quite different, however, for
continuation 2. Initially, it is Jane, the referent in the discourse focus, that
is more strongly activated, and only after four processing cycles does the
fact that stewardesses are more likely to pour drinks than are passengers
make itself felt. In the end, however, the model arrives at the correct,
intended interpretation in both cases. Asymptotically, shejane orders a Coke
and shestewardess pours a Coke are equally strong. Total reading times for
these sentences in Garrod et al.’s (1994) experiment were not
significantly different.
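The settling dynamics described above can be sketched roughly as follows: the two readings of the pronoun inhibit each other, test-sentence elements start at zero activation, prior discourse elements start at their LTM strengths, and activation is tracked over cycles. All link strengths and LTM values below are invented for illustration; only the qualitative shape of the simulation is the point.

```python
# Sketch of the pronoun-resolution settling process for the "order"
# continuation. The two readings of "she" inhibit each other (weight -1);
# test nodes start at zero, prior discourse nodes at their LTM strengths.
# All numbers are invented stand-ins, not the simulation's actual values.
import numpy as np

def settle(W, a, n_iter=10):
    """Spread activation, clip negatives, renormalize; return the full
    trajectory so the time course can be inspected."""
    history = [a.copy()]
    total = a.sum()
    for _ in range(n_iter):
        a = np.clip(W @ a, 0.0, None)
        a *= total / a.sum()
        history.append(a.copy())
    return np.array(history)

nodes = ["JANE", "STEWARDESS", "she=Jane orders", "she=stewardess orders"]
W = np.array([
    [1.0, 0.0,  0.8,  0.0],   # JANE strongly supports the Jane reading
    [0.0, 1.0,  0.0,  0.3],   #   (ordering fits a passenger); STEWARDESS
    [0.8, 0.0,  1.0, -1.0],   #   gives the other reading only weak support
    [0.0, 0.3, -1.0,  1.0],   # the two readings inhibit each other
])
a0 = np.array([1.5, 0.4, 0.0, 0.0])   # Jane is in the discourse focus
history = settle(W, a0)
# The Jane reading gains activation quickly; the stewardess reading of
# "she orders" never takes hold.
```

Swapping the link strengths to favor the stewardess for a *pour* continuation would reproduce the slower crossover described for Figure 5.9.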
The pattern of results is quite different if the text (5) is altered so that
it contains a steward rather than a stewardess. Now the pronoun she in the
continuation sentences uniquely identifies JANE, but discourse focus
and verb pragmatics still have their effects, as is shown in Figure 5.10.
These simulations were performed in the same way as before, except that
no link was made between shesteward pours a Coke and STEWARD because
of the incongruent pronoun. The model does arrive at the intended
referent Jane for both continuation sentences, but shejane pours a Coke
never catches up with shejane orders a Coke. There is a significant reading
time difference in this case in the data of Garrod et al. (1994).

Figure 5.10 Activation values for the two referents Jane and steward in
she-orders and she-pours sentences.
Figure 5.11 shows the results of another variation: Here the text
contains steward, but the continuation sentences also contain he. A
straightforward, totally unambiguous case, but the discourse focus effects
in the model are powerful. Again, the model arrives at the intended
interpretation of he, namely, the STEWARD, but it takes some time, and
hesteward orders a Coke never receives much strength. The Garrod et al.
(1994) data show significant differences in reading times. The differences
between Figures 5.10 and 5.11 reflect discourse focus effects: JANE has
an LTM strength of 1.54 and STEWARD of 0.40, and the strong concept
(i.e., the one in the discourse focus, in the terminology of Garrod et al.)
is better able to violate pragmatic constraints than is the weak concept.
Figures 5.9 to 5.11 can hardly be considered decisive tests of the
Garrod et al. (1994) data. It is not obvious just what feature of the
simulations would predict reading times. I have suggested that when the
activations of two continuation sentences reach the same asymptotic
value, no reading time differences should be expected. Another
possibility is that reading times for the continuation sentences might
depend on the difference between how strongly and how long the
intended and unintended alternatives were activated in the integration
process. Thus, if one adds up all activation values in these figures for each
pair of alternatives and takes the difference (e.g., the sum of activation
values for shejane orders a Coke minus the sum of activation values for
shestewardess pours a Coke), one should get a predictor of reading times.
Indeed, these differences are correlated r = .84 with the mean reading
times reported by Garrod et al. (1994). There are, however, only six
means, so one cannot make any strong claims about the goodness of fit of
the model.

Figure 5.11 Activation values for the two referents Jane and steward in
he-orders and he-pours sentences.
It is also of interest to examine with the CI model how explicit
anaphors are identified in the context of Garrod et al.’s (1994)
experiments. Consider, for instance, the case in which the continuations
for text (5) do not involve a pronoun but either a name (Jane) or a definite
description (stewardess). In this case, simulations can be performed just as
before, except that now there is only a single alternative to be considered
for each continuation sentence, either JANE or the STEWARDESS,
whatever the sentence was. Figures 5.12 and 5.13 show what happens:
Explicit anaphors are identified very rapidly, but there are still effects due
to the pragmatics of the verb (Jane orders is activated more rapidly than
Jane pours, and the reverse is true for stewardess). The effects are very
small, however, and Garrod et al.’s (1994) results show no differences.
The discourse focus effect, on the other hand, is a little bigger, as a
comparison of Figures 5.12 and 5.13 reveals. Again, Garrod et al. observed
no differences.
Thus, the model provides a framework within which we can simulate
anaphora resolution in considerable detail, in fact, in more detail than the
resolution power of our experimental procedures. These simulations
summarize and organize what we know about pronoun resolution from
the psycholinguistic literature as exemplified by the results of the Garrod
et al. experiments. Furthermore, the simulations could be done without
any really new theoretical machinery. It is true that some things were
done differently here than in other simulations reported in this book. In
particular, because we were interested in the time course of activation, the
test sentence was started with an initial activation value of zero, and prior
elements were given their LTM strengths as starting values. But these are
technicalities, well within the general theoretical framework. They are
not used elsewhere only because they are not needed and would simply
add irrelevant complexities.
It would have been a significant ad hoc theoretical adjustment to
introduce a new concept such as discourse focus into the theory. However,
the model could do very well without it because this concept was implicit
in the CI theory all the time.

Figure 5.12 Activation values for the referent Jane in Jane-orders and
Jane-pours sentences.

Currently activated discourse entities that have
acquired a high LTM strength in the process of comprehension are said
to be in discourse focus. Thus, one can talk about discourse focus in the
framework of the Cl theory, as done here, where the use of that term was
natural and convenient. However, discourse focus is not really a concept
needed by the CI theory. For many theories of text comprehension,
discourse focus is a central, explanatory concept (e.g., Greene et al., 1994;
Grosz & Sidner, 1986; Sanford & Garrod, 1981). The CI model is more
parsimonious in this respect, for discourse focus is simply a concept that
falls right out of its basic architecture.
There is another context in which simulations such as the ones
performed here may become relevant. Within functional linguistics, there
are several studies that attempt to describe the factors that control the use
of anaphora in discourse. This is essentially correlational work: A
particular discourse characteristic is observed to be correlated with the
use of, say, zero anaphora, pronouns, or full noun phrases. The most basic
of these factors is linear distance (Givon, 1983). If few words intervene
before a concept is repeated, many languages employ zero anaphora. If
more words intervene between concept repetitions, pronouns are used,
unstressed pronouns or stressed ones in English if the distance is larger.
If the number of intervening words is high, noun phrases are normally
used. These default cases can be overridden for specific linguistic
purposes. For instance, pronoun use when zero anaphora would suffice
may signal a topic shift (e.g., Fletcher, 1984; see also section 6.2.1). Thus,
linear distance is by no means the only determinant of pronoun use. Fox
(1987) shows that a theoretically motivated measure based on the distance
in a theoretical structure provided a better fit than linear distance
measures. A host of additional syntactic and semantic factors have been
identified. These include, for example, the role of the antecedent (e.g.,
whether the antecedent concept was the subject or actor, whether it was
the protagonist of a story, whether it was animate or not, and so on). It has
been shown that not only linear distance in words matters, but also
whether or not a paragraph boundary intervenes. By putting all these
factors together in a multiple correlation, good accounts of pronoun use
can be obtained.
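The multiple-correlation approach can be illustrated with a toy regression. All of the factors, coefficients, and data below are invented; the point is only the shape of the analysis: several discourse factors jointly predicting the explicitness of the anaphor a writer chooses.

```python
# Toy multiple regression predicting anaphor explicitness (0 = zero
# anaphor, 1 = pronoun, 2 = full noun phrase) from invented discourse
# factors. None of these numbers come from the studies cited above.
import numpy as np

rng = np.random.default_rng(1)
n = 200
distance = rng.integers(1, 60, n).astype(float)   # words since last mention
paragraph_break = rng.integers(0, 2, n).astype(float)
is_protagonist = rng.integers(0, 2, n).astype(float)

# Generate a criterion that depends on the factors plus noise.
explicitness = (0.02 * distance + 0.8 * paragraph_break
                - 0.5 * is_protagonist + rng.normal(0.0, 0.3, n))

# Least-squares fit of all factors at once; R is the multiple correlation.
X = np.column_stack([np.ones(n), distance, paragraph_break, is_protagonist])
beta, *_ = np.linalg.lstsq(X, explicitness, rcond=None)
R = np.corrcoef(X @ beta, explicitness)[0, 1]
```

With real corpus data, `R` plays the role of the "good accounts" figure mentioned above: how well the combined factors predict which anaphoric form is used.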
It may be conjectured that the activation value in working memory of
5.3 Metaphor
Table 5.4. Four neighbors of theory (top four items) and four neighbors of
laser beam (bottom four items) and their cosines in the LSA space

polarizer .60
both theory and laser beam, and it tells us how strong these links are. With
this information we can construct a CI model network, integrate it, and
obtain some predictions as to the meaning of the metaphorical statement.
Specifically, the four nearest neighbors of theory in the encyclopedia-
based LSA space (excluding proper names, morphologically related
words, and function words) are the first four nodes shown in the first
column of Table 5.4 together with their link strengths (the cosine between
the respective vectors in the LSA space). The table also shows four
neighbors of laser beam - the last four items in column 1 - and the cosines
between all these vectors.
A network consisting of the text nodes THEORY, LASER-BEAM,
and ISA[THEORY,LASERBEAM] and the eight knowledge
elaborations in Table 5.4, with all their pairwise cosine values as link
strengths, yields the activation values shown in Figure 5.14 when
integrated. Although I have no data to evaluate these predictions, they
seem intuitively acceptable. The dominant features of theory in isolation
are deemphasized, and several features derived from laser beam become
relatively strong: coherent, light, and illuminates. However, note that this
feature transfer is quite crude. Features like polarizer, which seem
intuitively irrelevant, also are introduced at a fairly high level of strength.
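All of the link strengths used in these networks are cosines between vectors in the LSA space. As a reminder of that computation, here is a minimal sketch; the 300-dimensional random vectors are stand-ins for real LSA vectors, which are not reproduced here.

```python
# Cosine between two vectors: their dot product divided by the product
# of their lengths. Random vectors stand in for actual LSA vectors.
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
theory = rng.normal(size=300)      # LSA vectors typically have ~300 dims
laser_beam = rng.normal(size=300)
link_strength = cosine(theory, laser_beam)  # would serve as a link weight
```

In an actual simulation, values like those in Table 5.4 would be computed this way from the trained LSA space and then entered into the network as link strengths.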
As another example, consider the famous opening line of Carl
Sandburg’s poem: The fog comes on little cat feet. Again, I selected four
neighbors of The fog comes from the LSA space as well as four neighbors
of on little cat feet.

Figure 5.15 Activation strength of nodes related to The fog comes in isolation
and to the metaphorical statement The fog comes on little cat feet.

The text propositions
(6) P1 COME[FOG]
P2 LITTLE[CAT[FEET]]
P3 ON[P1,P2]
are integrated together with the eight semantic neighbors that have been
retrieved by LSA, with the cosine values from the LSA serving as link
strengths. Figure 5.15 shows the results. Dominant features of The fog
comes, such as drizzle, become less central to the meaning of the whole
sentence, and some new features are introduced, such as stalk and kitten.
Note that The fog comes already was somewhat soft, and although this node
is even more strongly related to little cat feet, its absolute activation did
not increase much, but its relative weight in the pattern that constitutes
the meaning of text (6) certainly has increased.
Note that if I had used the sentence The theory is abstract, the process
of meaning construction would have been essentially the same, though
quite nonmetaphorical. Some of the things we associate with theory
would be deemphasized, whereas some features of abstract would transfer, creating a new concept.
My next example makes this equivalence between metaphorical and
literal processes explicit. Consider The old rock had become brittle with age,
uttered about a former professor and the wall of a medieval monastery,
respectively (adapted from Gibbs, 1994):
(7a) John was in for a surprise when he visited his former professor
after many years. The old rock had become brittle with age.
(7b) John was in for a surprise when he touched the wall of the
monastery tower. The old rock had become brittle with age.
Again, I used LSA to determine what these neighbors are and how strongly they are associated to the noun phrases in the (7a) and (7b) sentences. Thus, I assumed that the items shown in Table 5.5 will be retrieved; in each case I selected three word concepts, that is, words that have a high cosine between their vector and the eliciting noun phrase in the LSA space, and one document (the starred item) - the LSA analog of a memory episode.
Latent semantic analysis knows the world only through reading documents, whereas people have real experiences, episodes they remember and use to understand and interpret language. Nicolaas Bloembergen, I suppose, must be a famous old professor to whom an encyclopedia article was
dedicated, and when LSA read the phrase old professor it was reminded of
that article, much as a human might be reminded of some old professor or
of a particular encounter with a member of that curious species.
If we construct a network linking the two noun phrases in (7a) and (7b) in the same way as in Figure 5.14 and integrate these networks, the activation patterns shown in Figures 5.16 and 5.17 are obtained. Whether
metaphoric or not, the results are much the same: Features of old rock are
transferred to wall as well as to professor, and some of the original features
of these terms are muted, though by no means effaced. The wall reminds
us now more of the Dome of the Rock than Westminster Abbey and becomes
old and more like a monolith; it is less likely to be composed of masonry.
The professor, too, ages and assumes some features of the Dome of the Rock
Figure 5.16 The activation of eight nodes linked to wall of monastery tower alone and linked to the wall of the monastery tower is an old rock. The first four items have been retrieved from the neighborhood of wall of monastery tower, and the last four items have been retrieved from the neighborhood of old rock.
4 Actually, there may be less agreement than linguists think concerning what a metaphor is. I often have trouble convincing my class that “the stock market crashed” is a metaphor.
Figure 5.17 The activation of eight nodes linked to old professor alone and linked to the old professor is an old rock. The first four items have been retrieved from the neighborhood of old professor, and the last four items have been retrieved from the neighborhood of old rock.
their vector in the LSA space is .07 and .04, respectively. Old rock is just
as unrelated to wall of the monastery tower as to former professor. Perhaps
judging whether a sentence or phrase is or is not a metaphor involves
more analysis and is not directly based on semantic distance. Perhaps the
aesthetic pleasure one derives from reading figurative or poetic language
is also based on such a postcomprehension analysis. For instance, a reader
might detect a category violation in the case of the simple IS-A metaphors we have discussed here: We say (7a) is a metaphor because we know
that rocks and professors belong to different conceptual categories, and
we take pleasure in understanding the sentence nevertheless.
The illustrations of metaphor comprehension described here show that it is feasible to use the CI model together with semantic relatedness data obtained from LSA to simulate metaphor comprehension. However, many aspects of metaphor comprehension and usage have not been addressed here (such as asymmetries in metaphorical comparisons). To demonstrate the general validity of the points made here will require a
A parser in the CI model should take text as its input and generate as its output a network of propositions that then could be used as a foundation for the further modeling of comprehension processes. We do not have such a parser and must make do with hand coding as an unsatisfactory (but workable) substitute. Nevertheless, there are at least the beginnings of some research on parsing processes within the CI framework, which I discuss in the first section of this chapter. The main focus of the CI model has always been on how textbases and situation models are put together, once the elementary propositions that are their building blocks have been constructed. Thus, I review recent research on the formation of macrostructures and the role they play in comprehension. This leads to
the general topic of inferences, which is discussed in some detail here,
in part because there has been a great deal of controversy in this field
in recent years and in part because of their acknowledged central role in
comprehension. Inferences (though that term, I argue, is misleading) are
involved in both the formation of the textbase and the construction of the
situation model. In particular, I discuss the construction of spatial situation models because of the special challenges this topic provides for a
propositional theory.
Although most research on text comprehension has been done with
narrative or descriptive texts, the principles of comprehension are very
general. How they might be applied to literary texts is explored in section
6.5. A good argument can be made that the comprehension of literary
texts is not different at the level of the basic architecture from the comprehension of the kinds of texts we typically study in our laboratories,
although it demands special strategies and knowledge. However, all text
genres require domain-specific strategies and knowledge.
6.1 Parsing
(1) The linguists knew the solution of the problem would not be
easy.
[Figure 6.1 (caption not recovered): final activation values after the integration phase]
LINGUISTS 0.3848
KNOW[LING,SOLUTION] 0.3247
KNOW[LING,$] 0.4137
SOLUTION 1.0000
NOT-EASY[SOLUTION] 0.7344
NOT-EASY??? 0.3496
OF-PROBLEM 0.5189

LINGUISTS 0.3302
KNOW[LING,SOLUTION] 0.0000
KNOW[LING,$] 0.6249
SOLUTION 1.0000
NOT-EASY[SOLUTION] 0.8549
NOT-EASY??? 0.5246
OF-PROBLEM 0.5248

Figure 6.2 The same network as in Figure 6.1, except that the broken link in boldface has a strength of −2. Final activation values after the integration phase are displayed on the left.
to provide a better solution with a higher harmony value. Indeed, the harmony for the network shown in Figure 6.2 increases to a value of .30.
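The harmony of an activation pattern over a link matrix is commonly computed as H = Σ a_i · w_ij · a_j over node pairs. A hedged sketch follows; the three-node network and its weights are invented for illustration (they are not the values of Figure 6.2), but they show why a pattern that suppresses one of two mutually inhibitory parses scores higher than one that keeps both active:

```python
def harmony(act, links):
    """H = sum over ordered pairs i != j of a_i * w_ij * a_j."""
    n = len(act)
    return sum(act[i] * links[i][j] * act[j]
               for i in range(n) for j in range(n) if i != j)

# Nodes 0 and 1 are rival parses (mutual inhibition of -2);
# node 2 is shared evidence supporting both (+1 links).
W = [[0.0, -2.0, 1.0],
     [-2.0, 0.0, 1.0],
     [1.0, 1.0, 0.0]]
both_parses_on = [1.0, 1.0, 1.0]
one_parse_on = [1.0, 0.0, 1.0]
# Suppressing one rival avoids the inhibitory penalty and raises harmony.
```

Integration can thus be read as hill climbing on harmony: the settled activation pattern is the one that best satisfies the positive and negative constraints encoded in the links.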
Ferstl (1994a) discusses a parsing problem that parallels (1). She is
concerned with another well-known garden-path sentence:
Figure 6.3 The network constructed to process the sentence The horse raced past the barn fell. Final activation values after the integration phase are shown on the left. Positive links are indicated by solid lines, negative links by broken lines. After Ferstl (1994a).
WINDOWS 0.0731
ROOM 0.3877
CLEAN[JAN,ROOM,BROOM] 1.0000
WITH[ROOM,BROOM] 0.0000
CLEAN[JAN,ROOM] 0.0000
JANITOR 0.3729
CLEANING 0.4589
BROOM 0.4590

Figure 6.4 The network constructed to process the sentence The janitor cleaned the room with the broom. Final activation values after the integration phase are shown on the left. Positive links are indicated by solid lines, negative links by broken lines. After Ferstl (1994a).
To represent the verb attachment bias of subjects in Figure 6.4, the self-activation of the preferred CLEAN[JANITOR,ROOM,BROOM] node was increased, as was the inhibitory link between the two alternatives. ROOM was associated with WINDOW, and CLEANING was associated with BROOM and JANITOR (association strengths were assumed to be .5). As a result of these minimal knowledge elaborations, the model correctly computes the verb attachment for BROOM in (3) and (in a network analogous to Figure 6.4) the noun attachment for WINDOW in (4).
A final example from Ferstl (1994a) illustrates how the CI model takes account of the discourse context in parsing a sentence. It has already been mentioned that the noun attachment required by (4) is less preferred than the verb attachment. This bias can be overcome by an appropriate discourse context. For instance, the noun attachment is supported if the sentence

(5) There was a room with plants and a room with windows.

precedes (4); if instead

(6) There was a lounge with plants and a room with windows.

precedes (4), this effect is not obtained. Altmann and Steedman (1988)
have called this the principle of referential support: Example (5) has the
effect of introducing a room with windows into the discourse, to distinguish it from the room with plants, whereas the window modifier is redundant in (6).
Ferstl (1994a) simulated this effect by first integrating (5) and carrying over in the short-term memory buffer the two most highly activated propositions for the second processing cycle, which involved the integration of The janitor cleaned the room with the window. The propositions carried over were WITH[ROOM1,PLANT] and WITH[ROOM2,WINDOW]. Similarly, (6) adds WITH[LOUNGE,PLANT] to the network. In either case, whether (5) or (6) preceded it, the noun attachment was successfully made (i.e., the proposition with the instrument interpretation was correctly deactivated). However, the time course of integration
was quite different, depending upon whether the discourse context was
(5) or (6). Ferstl’s results (1994a) are shown in Figure 6.5. In the context
when only one room was mentioned, the preferred (but incorrect) instrument interpretation for window was slightly more activated for the first 14
iterations and did not become fully deactivated until after 27 iterations.
In contrast, when two rooms were mentioned so that a discourse object
Textbases and situation models 173
Figure 6.5 The time course of activation for the two propositions CLEAN[JANITOR,ROOM,BROOM] and CLEAN[JANITOR,ROOM,WINDOWS]. After Ferstl (1994a).
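The kind of time-course analysis Ferstl reports can be sketched by recording the activation vector at each iteration rather than only at settling. Everything below is invented for illustration (two abstract readings, made-up link and support values, not Ferstl's parameters); it shows only the qualitative point that stronger context support deactivates the dispreferred reading in fewer iterations:

```python
def settle(links, act, support, steps=40):
    """Iterate clamped, normalized activation updates; record each state."""
    history = []
    for _ in range(steps):
        act = [max(0.0, sum(links[i][j] * act[j] for j in range(len(act)))
                   + support[i])
               for i in range(len(act))]
        m = max(act) or 1.0
        act = [a / m for a in act]
        history.append(act[:])
    return history

def first_deactivation(history):
    """Index of the first iteration at which node 0 is fully deactivated."""
    return next(t for t, acts in enumerate(history) if acts[0] == 0.0)

# node 0 = instrument reading (preferred a priori: boosted self-link),
# node 1 = noun-attachment reading (favored by the discourse context).
links = [[1.2, -0.5],
         [-0.5, 1.0]]
weak_context = settle(links, [1.0, 1.0], [0.0, 0.3])
strong_context = settle(links, [1.0, 1.0], [0.0, 0.8])
# The dispreferred reading dies out sooner when context support is strong.
```

Counting iterations to deactivation in this way is the model's analog of the 14- versus 27-iteration difference described in the text.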
parsing within the CI framework is yet to come.2 Such a task may need to wait until we have acquired a more extensive database on the parsing strategies people actually use than is available now. Richer empirical studies of parsing strategies, with data such as those in Ferstl (1994b), would substantially help our progress theoretically.
According to the model of text processing of Kintsch and van Dijk (1978)
and van Dijk and Kintsch (1983), the formation of a macrostructure is an
integral part of normal text comprehension. It does not occur merely in
response to special task demands, such as instructions to summarize the
text, but is an automatic component of the process of comprehension that
cannot be separated from it.
That readers are able to recognize topic sentences that are expressed in
a text has been shown many times with a variety of procedures, including
reading times, think aloud protocols, and importance ratings (e.g., by
Kieras, 1980). Similarly, it has been shown that readers can produce adequate summaries of simple texts on demand (e.g., Kintsch & Kozminsky,
1977). However, there also exist good experimental data to support the
stronger claim of the theory that macrostructure formation occurs as an
integral part of comprehension.
If subjects read a text as part of a word recognition experiment, there is no reason to think that they would intentionally engage in macroprocessing if such processing were an optional, strategic part of comprehension. Thus, if one can show that readers under these conditions form macrostructures anyway, this would provide evidence for the automatic nature of macroprocesses. Guindon and Kintsch (1984) produced such evidence in a study that relied on the recognition priming method of Ratcliff and McKoon (1978).
Ratcliff and McKoon had shown that recognition priming can provide a very sensitive and at the same time nonintrusive measure for the analysis of the memory structures generated during reading. Their method
was exceedingly simple. They gave subjects sentences to read, followed by
a word-recognition test. They looked at the speed with which words from
2 The present approach has some similarities to the probabilistic parsers that have recently been developed within computational linguistics (Charniak, 1993; Jurafsky, 1996), but the compatibility of these approaches needs to be investigated further.
the sentences were recognized as a function of the word that preceded the
test word on the recognition test: whether or not the preceding word was
from the same sentence as the target word, and whether or not it was from
the same proposition. In the first case, they observed a priming effect of
110 ms over different-sentences controls. When the preceding word and
the target word not only came from the same sentence but also from the
same proposition, an additional small, but statistically significant priming
effect of 20 ms was obtained.
Guindon and Kintsch (1984) used this procedure to study macroprocessing in comprehension. In experiment 1, their subjects read paragraphs with an initial topic sentence. For instance, a text describing the training of decathloners might start with the topic sentence A decathloner develops a well-rounded athletic body and contain as one of the sentences in the body of the paragraph the statement A decathloner also builds up strong hands. In what we called macropairs, both the preceding word and the target word came from a topic sentence (e.g., develop, body); in micropairs, both words came from a detail sentence (e.g., build, hand); in control pairs, words that appeared in the text but in different sentences succeeded each other. Hence, the correct response to all target words was “yes.” The priming effect was computed as the difference between the recognition time for control target words minus the recognition time for macro- and microtarget words. Figure 6.6 (left panel) shows that the priming effect was substantially larger for macro words than for micro words.3
Similar results were obtained in experiment 2 of Guindon and Kintsch (1984), except that in this experiment the paragraphs were shown without the topic sentences, so that macropropositions had to be inferred rather than constructed on the basis of an actual topic sentence. Control word pairs were used (words that had not appeared in the paragraphs), as well as word pairs thematically related to the paragraph but that did not function as macrowords. Thus, we compare the recognition times for body as a macroword and feet as a thematically related nonmacroword with a thematically unrelated control word. Because the subject had not seen any of these words before, the correct response in all cases was “no” in this experiment. The results are shown in the right panel of Figure 6.6: A
Figure 6.6 Priming effects in word recognition for macro words and nonmacro words. After Guindon and Kintsch (1984).
stronger (negative) priming effect was obtained for macrowords than for
thematically related nonmacrowords.
Figure 6.6 provides evidence that readers react to macrowords differently than to nonmacrowords, even though this was only a word recognition task. Readers did not even have to understand what they read. However, just as they could not help understanding the texts they read, they could not help generating a macrostructure for these texts. Macrostructure formation indeed appears to be an automatic and integral part of reading comprehension.
A study by Lorch, Lorch, and Mathews (1985) adds further support to this conclusion by means of a different experimental methodology. Lorch
and his colleagues constructed texts that were hierarchically organized
into major and minor subcategories. They compared reading times for
topic sentences after a major and a minor topic shift and found that after
a major shift topic sentences required 300 ms more reading time than
after a minor shift. However, there were no shift effects on sentence reading times when the sentences were presented in a disorganized fashion,
thus preventing readers from generating an orderly macrostructure. It is
interesting that these effects occurred when subjects were reading without special instructions. When the subjects were informed that they would have to outline the text, these effects became even larger. Thus,
In the implicit version of the texts these topic statements were omitted.
(9) There are no legal requirements for the sole proprietorship form
of organization. (Unmarked text)
In his first experiment, Mross used an item recognition task to study the
speed and accuracy with which words from topic statements and words
from detail sentences were recognized. These recognition probes were
presented at two points during the reading of each paragraph. Macroprobes (sole proprietorship) were recognized more accurately (93% vs.
76%) than microprobes (legal requirements) and were responded to faster
than microprobes (487 ms vs. 524 ms, respectively), in agreement with
Guindon and Kintsch’s (1984) results.
In another experiment (experiment 3), Mross (1989) recorded reading
times for both topic sentences and detail sentences in the marked as well
as the unmarked texts. These results are shown in Figure 6.7. Topic sentences were read more slowly than detail sentences, replicating the results
Figure 6.7 Reading time per syllable in ms for topic and detail sentences in the marked and unmarked condition. After Mross (1989).
and asked them to rewrite the second sentence to make it sound more
natural. His striking results are shown in Figure 6.8. For the highly coherent (1a) - (2) pair, subjects almost always chose either zero anaphora or a pronoun. For the (1c) - (2) pair, which involves a clear topic shift, a
full noun phrase was the overwhelming choice, and intermediate results
were observed for the pair with medium coherence.
Fletcher's results were corroborated by Vonk, Hustinx, and Simons (1992), who had subjects provide verbal descriptions of picture stories
that did or did not involve a topic shift. Because there was only a single
actor in these picture stories, no linguistic ambiguity was involved. Nevertheless, topic shifts in the pictures were signaled linguistically by a full noun phrase in 88% of all continuations, which contrasts with 83% pronoun choices when no topic shift was involved.
The experiments reviewed here clearly show that macroprocesses are real. They are as much part of discourse comprehension as microprocesses are. These experiments also allow us to glimpse how readers form macrostructures. In particular, we note that natural language has available rich means, including syntactic ones, to help the reader to generate macrostructures. What the experiments discussed here do not make clear is the enormous importance of macroprocesses for text understanding: how easy a text is, how well readers understand it, how well they can remember it, what they learn from it - all this is strongly dependent on a successful macrostructure. Local comprehension problems may be a nuisance, but problems at the macro level tend to be a disaster.
The two contradictory sentences (in boldface type) are separated by only two sentences; all sentences in the paragraph are on the topic of superconductivity; the contradiction is explicit (cooling vs. increasing the temperature). The readers were asked to report any comprehension problems
while reading and later recalled the paragraph. They were not warned of
contradictions in the texts (four of the six experimental texts contained a
contradiction). Most of the failures (82%) to notice a contradiction in a
text were of three types:
Thus, we have a high incidence of errors, and errors that are highly systematic rather than random - a phenomenon that cries for an explanation.
To simulate comprehension of the superconductivity paragraph, the
text (11) was represented as a list of propositions. Because subjects read a
series of texts that all had the same structure (definitional sentence plus three elaborative sentences), it is highly likely that they would have considered SUPERCONDUCTIVITY and IS[SUPERCONDUCTIVITY,DISAPPEARANCE] as macropropositions. As an additional macroproposition, we assumed that subjects would use either OF[DISAPPEARANCE,RESISTANCE] or OBTAIN[SUPERCONDUCTIVITY,COOL[TEMPERATURE]], corresponding to the type 1 and type 2 readers. The network contains the two contradictory propositions OBTAIN[SUPERCONDUCTIVITY,COOL[TEMPERATURE]] and OBTAIN[SUPERCONDUCTIVITY,INCREASE[TEMPERATURE,OF MATERIAL]], which are linked by an inhibitory link of strength −1. All other links have strength +1.
Integrating the propositional network thus generated in cycles yields the long-term memory strength values for the two contradictory propositions shown in Figure 6.9 (“normal”). The strength values are unequal, but both propositions have substantial strength, and one would expect that the contradiction would be noted.
Type 1 and type 2 errors can be created in the model by increasing the weights of the macropropositions in the network. If we increase the strength of the links between macropropositions from 1 to a value of 10, the model makes type 1 and type 2 errors, depending on which macrostructure we assume. If OF[DISAPPEARANCE,RESISTANCE] is
Figure 6.9 Strength values for the two contradictory propositions for normal readers and three types of nondetectors.
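The qualitative effect of boosting the macroproposition links can be sketched with a deliberately simplified three-node network (a macroproposition, a proposition consistent with it, and the contradictory rival, linked by an inhibitory −1 link). The network and its values are my invention for illustration, not the full propositional network of the text:

```python
def integrate(links, act, steps=60):
    """Clamp-and-normalize spreading activation until it settles."""
    for _ in range(steps):
        act = [max(0.0, sum(links[i][j] * act[j] for j in range(len(act))))
               for i in range(len(act))]
        m = max(act) or 1.0
        act = [a / m for a in act]
    return act

def network(macro_weight):
    # nodes: 0 = macroproposition, 1 = proposition consistent with it,
    # 2 = contradictory proposition (inhibitory -1 link to node 1).
    return [[1.0, macro_weight, 1.0],
            [macro_weight, 1.0, -1.0],
            [1.0, -1.0, 1.0]]

normal = integrate(network(1.0), [1.0, 1.0, 1.0])   # both rivals survive
biased = integrate(network(10.0), [1.0, 1.0, 1.0])  # rival suppressed
```

With macro links of strength 1, both contradictory propositions retain substantial strength and the contradiction is detectable; with strength 10, the proposition favored by the macrostructure starves its rival of activation, mimicking the nondetecting readers.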
5 The fact that these results are based on a semantic space derived from reading an
encyclopedia must be borne in mind. An adult human semantic space is based on
much more experience with words, and with words from different kinds of texts —
encyclopedias focus on that part of human knowledge that is not common sense,
whereas common sense knowledge is what is most involved in understanding simple
texts like the ones analyzed here.
186 Models of comprehension
Figure 6.10 Cosines between words and sentences, sentences and paragraphs, and paragraphs and the whole text.
rules, one might generate the following summary sentences for ¶1, ¶2, and the whole text, T, respectively:

have cosines with the ¶1 and ¶2 vectors of .39 and .42, respectively, and
the cosine between the summary sentence S(T) and the whole text vector
T is also .42. These are fairly high values (compare them to the cosines
between words and sentences in Figure 6.10), indicating that we have constructed good summaries. This technique may prove useful for evaluating summaries objectively and automatically.6
The cosine values between words and sentences, sentences and paragraphs, and paragraphs and the whole text shown in Figure 6.10 can be used to implement a CI-model simulation of the text together with its macrostructure. We add to the atomic propositions of the text the complex propositions S1 to S4 corresponding to the four sentences, and a macrostructure consisting of ¶1 and ¶2 for the two paragraphs, and T for the text as a whole, and link them with the cosine values obtained from the LSA. The resulting network is shown in Figure 6.11. Sentence
2 fills the time-circumstance slot of DRIVE; DRIVE provides the setting
for the events described by S3 and S4; the propositions of S3 and S4 are all causally linked. The microstructure consists of the atomic text propositions and the four complex propositions corresponding to sentences S1 to S4. The macrostructure consists of the two paragraph vectors ¶1 and ¶2 (P1 and P2), and the whole-text vector T. Links between atomic propositions have a value of 1; all other links have values equal to the strength of their relationship as assessed by LSA and shown in Figure 6.10. The numbers after each label are the long-term memory strength values computed from the CI simulation.
Comprehension of this network was simulated in four cycles, each
cycle comprising one sentence, including its superordinate macronodes.
For each cycle, the most strongly activated node from the previous cycle
was held over in the short-term memory buffer. The results, shown in
terms of long-term memory strengths in Figure 6.11, are interesting.
The strongest nodes are the DRIVE proposition from the first sentence and sentence vectors S2, S3, and S4. On the other hand, the paragraph representations ¶1 and ¶2 and the whole-text vector T end up with relatively low strengths.
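The cyclical procedure described above (one sentence per cycle, the most strongly activated node carried over in a one-item short-term memory buffer, activation accumulated as long-term memory strength) can be sketched as follows. All node names and link values are invented for illustration:

```python
def integrate(nodes, links, steps=30):
    """Settle activations for one processing cycle over the given nodes."""
    act = {n: 1.0 for n in nodes}
    for _ in range(steps):
        new = {i: max(0.0,
                      sum(links.get((i, j), links.get((j, i), 0.0)) * act[j]
                          for j in nodes if j != i) + act[i])
               for i in nodes}
        m = max(new.values()) or 1.0
        act = {i: a / m for i, a in new.items()}
    return act

def process(cycles, links):
    """Process a text cycle by cycle with a one-item STM buffer."""
    ltm = {}
    buffer = []  # most activated node carried over from the last cycle
    for sentence_nodes in cycles:
        nodes = list(dict.fromkeys(buffer + sentence_nodes))
        act = integrate(nodes, links)
        for n, a in act.items():
            ltm[n] = ltm.get(n, 0.0) + a  # accumulate LTM strength
        buffer = [max(act, key=act.get)]
    return ltm

# Two-cycle toy text: P1 and P2 in sentence 1, P3 in sentence 2.
links = {("P1", "P2"): 0.8, ("P2", "P3"): 0.6, ("P1", "P3"): 0.4}
ltm = process([["P1", "P2"], ["P3"]], links)
```

Because a carried-over node is processed in more than one cycle, it accumulates extra long-term memory strength, which is one mechanism behind the advantage of well-connected propositions in recall.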
Does including the macrostructure in this way make a difference for
the model? It certainly does, for if we simulate comprehension of this text
without the complex propositions and macronodes in Figure 6.11, the strength values obtained for the atomic text propositions correlate only
6.3 Inferences
and prior experience, has been studied under the label of inferencing in text comprehension. In particular, the process of knowledge use that every reader must engage in to properly understand a text has been characterized as making inferences in text comprehension. This is a regrettable terminology that has caused a great deal of confusion, for much of what we call inferencing has very little to do with real inferences.
Retrieval    Generation
A            C
B            D
7 I am not claiming that this sort of inference does or does not occur “naturally” during comprehension. I am simply classifying inferences.
the cues available in short-term memory are used to retrieve other likely
cues from long-term memory that in turn are capable of retrieving what
is needed. Consider the following:
(14) Three turtles rested on a floating log, and a fish swam beneath
them.
The statement The turtles are above the fish is immediately available to a reader. Indeed, readers often are unable to distinguish whether they were explicitly told this information or not (e.g., Bransford, Barclay, & Franks, 1972). Note, however, that this is not merely a question of knowledge retrieval as in doors are parts of cars. The statement the turtles are above the
fish is not something that already exists in long-term memory and is
now retrieved, but it is generated during the comprehension process. The
reason it is so highly available in the reader's working memory is presumably that the fish-log-and-turtle scene is encoded as an image, and
this mental image constitutes a highly effective retrieval structure that
provides ready access to all its parts - not just the verbal expression used
in its construction.
Here is an instructive case in which the decision to represent mental
8 Good experimental evidence exists on both sides of the rule versus model controversy; whether it can be aligned along levels of representation, as suggested here, is
another question.
9 Nevertheless, I shall drop the quotation marks from “inference” after this sentence
in deference to the commonly accepted terminology.
caused the rain. But given The clouds gathered quickly, and it became ominously dark, they do not infer the consequent the clouds caused rain. This finding that antecedent, though not consequent, causal inferences are made in text comprehension is readily accounted for by the CI model.
Suppose a text describes a situation that is a common cause of some
event, and then asserts that this event occurred, without mentioning an
explicit causal connection between the antecedent and the event. Preex¬
isting retrieval structures causally link the antecedent and the event in the
reader’s memory; the causal link is activated and is likely to become a per¬
manent part of the reader’s episodic text memory because it connects two
highly activated nodes in the memory structure.
The situation is different for the consequent inferences. The same
retrieval structures that made available the causal antecedent will make
available the causal consequent, too. But at that point in the reading
process, the consequent is a dangling node in the episodic text structure
because it is connected to nothing else in the network but the antecedent.
Therefore, the consequent will not receive much activation in the integration process and will be excluded from episodic memory. Thus, The
clouds gathered quickly, and it became ominously dark might make available
the clouds caused rain, but if nothing else in the text connects to rain, this
node will become quickly deactivated in the network. When in a later
processing cycle other information becomes available that could have
linked up with rain, that node is most likely lost from working memory.
Hence, although the retrieval structures in the reader's long-term memory make available both antecedent and consequent information, only the
former is likely to survive the integration process and become a stable
component of the reader’s text memory.
According to Table 6.1, it should make a great deal of difference
whether bridging inferences occur in a familiar domain or in an unfamiliar domain. In the former case, preexisting retrieval structures make
available the information that fills the gap. The process of retrieving
information from long-term memory via a retrieval structure takes about
400 ms, as is shown in chapter 7. Hence, reading times for sentence pairs
for which a bridging inference is required should be increased by at most
400 ms in comparison with sentence pairs for which this information has
been explicitly stated. In fact, 400 ms is an upper limit, because the
retrieval from long-term working memory most likely occurs at least
partly in parallel with other ongoing reading processes. Noordman and
Vonk (1992) provide some data that allow us to test this prediction. In one
of their texts about a woman looking for an apartment to rent, the following sentence pair occurred:
(15) The room was large, but one was not allowed to make music in
the room.
The text is about a sailboat race, and the required inference, which is provided in the explicit condition, is
Presumably, the subjects in this experiment did not know this fact about Kevlar sails before reading this text. The results are also shown in Table 6.2. Reading times for the because phrase in (17) are equal in the explicit and implicit conditions; that is, subjects wasted no time on figuring out the obscure relationship between the Kevlar sails and the wind while reading this sentence. On the other hand, when forced to make the inference in a sentence verification task, they were considerably faster to verify (18) in the explicit than in the implicit condition. Noordman and Vonk (1992) also report a version of this experiment in which the same texts on economics were used in the familiar and unfamiliar condition, but the readers were either novice students in economics or expert economists. The results are analogous to those in Table 6.2.

The but and because in (15) and (17) play a crucial role in this experiment. Keenan and I have reported an analogous experiment with familiar
Textbases and situation models 197
Table 6.2. Phrase reading times for the second phrase in (15) (familiar text) and (17) (unfamiliar text) and inference verification times in ms for explicit and implicit conditions. After Noordman and Vonk (1992)

[Table body garbled in digitization; the recoverable entries are the Familiar and Unfamiliar condition labels, the Explicit values 1,473 and 2,169 ms, and the significance markers * and ns.]
materials, such as those in examples (15) and (16) but without linguistic signals like but and because (Kintsch & Keenan, 1973). Our results were completely different: Reading times for explicit and implicit sentences were not significantly different, but verification times were 400 ms longer in the implicit condition (these results are discussed in section 7.3.2). However, I do not believe that these results contradict Noordman and Vonk (1992). By using a sentence connective such as but or because, Noordman and Vonk invite their subjects to find a connection between the two phrases. When that is easy, because the bridging inference involves no more than a retrieval from long-term working memory, subjects comply, resulting in longer reading times for the to-be-linked phrase but verification times that are the same as for explicit sentences. When that is difficult, because the domain is unfamiliar and suitable retrieval structures are lacking, readers do not construct the link they are invited to form. Hence, they have to do it in the verification phase of the experiment. In the experiment by Keenan and me, the domain was familiar enough and readers could have retrieved the missing link readily enough, but being lazy readers and lacking a linguistic cue specifically requesting a bridging inference, they did not make the effort. This added 400 ms to the verification time, because the linking proposition had to be retrieved at that point from long-term working memory. Thus, the question "Are inferences made during reading?" needs to be approached with great care: Background knowledge, task demands, and linguistic cues in the text all interact in determining what will happen in a particular case.
Much recent research has been concerned with the construction of situation models (e.g., Glenberg, Kruley, & Langston, 1994; Glenberg & Langston, 1992; Graesser & Zwaan, 1995; Zwaan, Langston, & Graesser, 1995; Mani & Johnson-Laird, 1982; Trabasso & Suh, 1993; van den Broek, Risden, Fletcher, & Thurlow, 1996). It should be clear from everything that has been said so far that there is not a single type of situation model and not a single process for the construction of such models. Situation models are a form of inference by definition, and Table 6.1 is as relevant for situation models as it is for any other inference in discourse comprehension. That is, situation models may vary widely in their character. In the simplest case, their construction is automatic. Relevant information is furnished by existing retrieval structures, as in the examples given for cell A in Table 6.1. Or it may be available simply as a consequence of a particular form of representation, such as imagery in (14). Such situation-model inferences do not add new propositions to the memory representation of the text but simply make available information in long-term memory via retrieval structures, or information that is implicit in the mental representation, such as an image (see Fincher-Kiefer, 1993, and Perfetti, 1993, for similar suggestions). On the other hand, situation models can be much more complex and result from extended, resource-demanding, controlled processes. All kinds of representations and constructions may be involved. The process may be shared by a social group or even by a whole culture and extend over prolonged periods. Text interpretation is not confined to the laboratory. At least some experimental methods for studying more complex cases are being developed today. Trabasso and Suh's three-pronged method that combines theoretical analysis, verbal protocols, and experimental procedures is a promising development in this respect, as is the simple but very effective "landscape" method of van den Broek et al. (1996). However, for the most part current research on situation models prudently sticks to rather simple cases, as I do in the next section, where the formation of spatial situation models is discussed, a topic that presents a particular challenge to a propositional theory such as the CI model.

10 Because poor readers do not always make inferences even when more skillful readers do, some of these estimates (in particular, the Till study) may be unduly inflated by averaging results over skilled and less skilled readers.
other the same information was provided in the form of a survey description. Thus, where one text might say
(19) After two blocks, turn right on 6th Street to reach the school
house.
(20) Sixth Street crosses Main Street two blocks north. The school
house is on 6th Street just west of its intersection with Main
Street.
Each subject read either the route or the survey text. Reading rate was controlled by the experimenter. Half of the subjects read the text once, and half were given three presentations of the text. After reading, subjects were given a free recall test and a sentence verification test, with true old sentences, paraphrases, and inferences, and false distractor sentences. Reproductive memory, tested either by free recall or verification of sentences actually in the text (verbatim or not), was quite good. Across both texts, subjects recalled 22% of the propositions after one reading and 40% after three readings. Similarly, the d′ for the verification of old test sentences was 1.63 after one presentation and 2.36 after three presentations, attesting to the fact that these subjects had been able to form an adequate textbase. However, the subjects had not been able to form a situation model that would have allowed them to verify the inference questions correctly. The d′ value for inference sentences was a mere .20 after one presentation and rose to only .85 after three presentations. Thus, we have here a good textbase enabling normal reproductive memory but a weak situation model that yielded poor performance on inference sentences.
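The d′ values reported above come from standard signal detection theory: d′ is the difference between the z-transformed hit rate (old sentences correctly accepted) and the z-transformed false-alarm rate (distractors incorrectly accepted). A small worked example with made-up rates, not the study's raw data:

```python
# Computing d' = z(hit rate) - z(false-alarm rate), where z is the
# inverse of the standard normal cumulative distribution function.
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

# With a hypothetical 80% hit rate and 25% false-alarm rate:
print(round(d_prime(0.80, 0.25), 2))  # → 1.52
```

A d′ near 0, like the .20 reported for inference sentences after one reading, means subjects accepted true inferences barely more often than distractors.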
In their experiment 2, Perrig and Kintsch (1985) used briefer texts (14 sentences), describing a simpler geographic layout, so that readers would be better able to form a usable situation model. Furthermore, subjects were allowed to study the text at their leisure. Again, both a route and a survey version of the text were used. These changes in the experiment had the desired effects. Overall, reproductive memory remained at a high level, as in experiment 1: The average recall was 40% of the text propositions, and the d′ for old test sentences was 3.01. However, unlike in experiment 1, the verification of spatial inference sentences was also good, with an average d′ of 1.79. Thus, with a simpler task and more time, these subjects were able to form an adequate situation model that enabled them to verify inference sentences fairly accurately.
Figure 6.12 Floor plan of the castle that subjects had to memorize for the
spatial inference task in experiment 1. From Haenggi et al. (1995).
moved through the rooms of the castle. At four points during the story
two objects were presented on the computer screen, and the subjects had
to decide whether the objects were in the same room or in different
rooms. For instance, just after the main character walked into the ballroom, CARAFE-RUG was presented as the test pair, requiring a "same" response. In this case the objects were in a different room than the actor.
On another test both objects were in the same room as the actor, and there
were tests where the objects belonged to different rooms. The main result
of the experiment, just as in Morrow et al., was that the decision time for
“same” responses depended on whether the objects were in the same
room as the actor in the story or in another room. In experiment 1, for
instance, the mean response time for “same” responses was 2.108 s when
the objects were in the same room as the actor, and 2.782 s when they
were in a different room. Readers seem to focus on the location of the
actor, so that objects close to that location become more available than
objects farther away. Obviously this is a dynamic process, for the readers
must update their situation model throughout the story.
Only later, after she had acquired more vocabulary and learned to read, did she find out that it wasn't a nonsense rhyme at all:

The rhyme in this case had overpowered the semantics. It is not an isolated instance. In comprehending and remembering rhymes, the poetic language is at least as essential a factor as the meaning. How can the CI model deal with such observations?
Kelly and Rubin (1988) have collected a large number of observations
about the following nursery rhyme, which therefore is an ideal object for
our analysis:
Figure 6.14 shows an analysis of the first two lines of this nursery rhyme. The surface structure has been subdivided into two parts, the phonology and the versification. The first text element - (1) in Figure 6.14 - is identified phonologically as /ee/. It is linked to the second element /nie/, which in turn is linked with /mee/, and so on. At the level of versification /ee/ is not directly linked to /nie/ but with the verse markers VOWEL[1], BEGINNING-OF-LINE[1], RHYME[1,3], and RHYTHM[1,3,5,7, STRESSED]. The same element, therefore, enters into a different set of
For most genres that have been studied so far (descriptive texts, stories,
manuals), a single situation model is assigned to a text.

Figure 6.15 The activation values for the network shown in Figure 6.14.

The author's intent is to communicate this model to the reader via the text. How effective a textbook or a manual is depends on how well the author succeeds
with this communication. If competent readers construct the wrong
model, the author has failed. Not so for literature. It is almost always the
case that the author intends the reader to construct situation models at
more than one level: At one level there may be an action sequence, at
another a commentary about the social condition, at yet another a moral
message, and so on. It is also not necessary or even possible for the author
to define the intended situation model precisely and uniquely, for what
readers make of a text depends on their personal experiences and the way
they read the text.
As yet, only very little research in psychology and cognitive science has
been concerned with the comprehension processes involving multilevel
situation models. Story grammars, for instance, have focused primarily
on the action sequence in a story, with the moral of the story as a sort of
afterthought (Mandler & Johnson, 1977). On the other hand, there have been attempts in the context of artificial intelligence to describe the constellation of actions that can be considered as "betrayal" or "seduction" (Lehnert, 1981). Multiple situation models are more familiar in a different context - the study of word arithmetic problems and the use of computers, for instance, as discussed later in this book. There we distinguish between the situational understanding of a word problem and the formal problem model, its mathematization (chapter 10), or between the task model that a computer user might have and the device model that is necessary to perform this task on the machine in question (chapter 11).
There has also been some work on the role that personal experiences play
in interpreting new situations, although not in a literary context (such as
“reminding” in Schank, 1982, or the work on case-based reasoning by
Kolodner, 1993).
There are no simulations of how multilevel situation models are
formed for literary texts. The general theory for such a task exists, and
there are no compelling reasons why it could not be undertaken. Such a
project would have to be conceived as an interdisciplinary enterprise, for
although the principles of comprehension are presumably the same for
literary texts as for other genres, only literary experts could tell us what
the content of the multilevel situation models ought to be. The cognitive
scientist can construct the situation model, but the content of the model
and the strategies used in its construction are domain specific. This is no
different from other cases - for example, when we must ask a computer
expert what the device model should be, or when we consult a biologist
about the biology content of an instructional text.
Do the strategies used for comprehending literary text differ from those used in other genres? This issue has received little attention among cognitive psychologists, but there are some interesting studies on how balladeers in primitive societies without a written record reproduce their songs from memory or, rather, reconstruct them (Lord, 1960; Wallace & Rubin, 1988). Strategies similar to those these singers use to reconstruct
their songs are employed by educated persons in comprehending literary
texts. We thus can gain an impression of the rather complex strategies
that must be acquired before literary texts can be appreciated.
Lord (1960) describes a well-known Serbo-Croatian singer of ballads
who was able to repeat a 2,294-line ballad he had never heard before after
listening to it once. His reproduction was 6,313 lines long and contained
many embellishments, character descriptions, and an emotional depth
that were lacking in the somewhat naive original. The strategies that make
such a feat possible are reasonably well understood (Lord, 1960; Rubin,
1988).
The songs of the balladeers - in Homer's day as well as in our own (Wallace & Rubin, 1988) - largely consist of metric formulas (90% in the Odyssey) that can be used whenever needed, irrespective of the semantics.
Thus, we find "brilliant Odysseus," "resourceful Odysseus," or "long-suffering, brilliant Odysseus," depending on whether the verse requires two syllables, three, or more. In composing the song, this metric vocabulary is organized around particular topics. A large variety of scripts is
used for this purpose. For instance, the topic “council meeting” is used
over and over again in just as stereotyped a manner as the restaurant
script in AI. This includes various subscripts, such as the arrival of the
hero, with horse and weapons always being described in the same order.12
Equipped with this vocabulary and his scripts, the balladeer is able to
concentrate entirely on the action and characters of the story - not unlike
musical improvisation, especially in jazz.
The balladeer employs a variety of strategies that enable him to regenerate songs of a certain kind. (There is no claim to originality here; the
balladeer insists that he merely reproduces faithfully what he has heard.)
These generative strategies are employed effortlessly, automatically, and
unconsciously. Thus, they must be well rehearsed before they can be
used. Furthermore, the singer's memory must contain a rich store of similar songs as well as related usable information and episodes from everyday life. According to the findings of Ericsson and Chase (1982), we can
safely assume that this material is organized in memory in certain ways.
Thus, what we have here is a form of expert memory, which, as is well
known, enables one to achieve a great deal but which can be acquired only
with much effort and over an extended period (e.g., Ericsson, 1985).
Comprehending literary texts makes similar demands. The reader
must be able to employ the required encoding strategies automatically
and must have readily available the knowledge basis needed for the use of
these strategies. Everyone can read and remember a trite newspaper story. We are all experts in this regard, the required linguistic encoding strategies and the necessary domain-specific knowledge having been practiced over a whole lifetime (Ericsson & Kintsch, 1995, chap. 7). Literary texts are different. Unfamiliar linguistic strategies are needed for comprehension, beginning with verse and rhyme, all the way to schemes for the organizational principles of the novel. Information about the author and the time and social context in which the text was generated may be relevant. Remindings and reminiscences of related texts may be fundamental for the interpretation of the text. I am suggesting, therefore, that the comprehension of literary texts should be regarded in the same way as any other expert performance. Peak performance, accordingly, would probably require a decade of intensive study. Of course, a literary text can be enjoyed at some level even without that expertise, but deep understanding is reserved for the expert.
Returning to the question posed at the beginning of this section - Is
the comprehension of literary texts different from that of nonliterary
texts? - the answer must be “yes” and “no.” Yes, because literary texts
demand specific encoding strategies and specific knowledge that do not
play a role in comprehending nonliterary texts. Specifically, the encoding strategies for literary language are different from those employed for everyday language, and specific domain knowledge is required to understand literary texts. No, because the psychological processes involved are the same in both cases: The "what" is different, but the "how" is the same. A simulation of a literary text more complex than Eenie, meenie, minie, mo is quite conceivable, but it would be unwise to attempt it until the comprehension theory can be expanded beyond the realm of the purely cognitive. Without a computational account of the emotional reactions (chapter 11) and aesthetic experiences that play a central role in understanding literary text, such a simulation would be unsatisfactory.
Memory for text is usually quite good, and often very good. There is an
issue here, because if the same memory system is used for remembering
text as for learning the nonsense syllables and word lists we study in the
laboratory, the discrepancy between the good text memory and the poor
list memory needs to be explained.
The span of immediate memory is seven plus or minus two items. From a
list of 30 random words, college students recall about 12 to 14 after one
reading, adults somewhat fewer. It takes about one hour to memorize 100
random words. How can people live with such a terrible memory?
They can’t and they don’t. The memory demands of many cognitive
1 The distinctions between these different types of memory, such as episodic and semantic memory, are irrelevant for present purposes, although they may be important in other contexts.
Figure 7.1 (a) The classical theory of working memory: ST-WM is the activated portion of LTM, containing no more than 4 to 7 chunks. (b) The long-term working memory theory of Ericsson and Kintsch: LT-WM is the part of LTM (gray circles) linked to the chunks in STM (circles) by retrieval structures.
It is only when we study expert memory, inside and outside the laboratory,
that this severe capacity constraint becomes intolerable. I have already
mentioned the many things a discourse comprehender must maintain in working memory in order to succeed at comprehension. Expert performance in chess, mental calculation, scientific problem solving, and medical diagnosis makes similar demands on working memory. Furthermore,
The role of working memory in comprehension 219
in all these tasks one can show that working memory involves a long-term component, not some sort of expanded temporary storage in a super-STM. This is implied by the repeated findings of insensitivity to interruption, good incidental long-term recall, and resistance to interference. Apparently, these experts have learned to use parts of their long-term memory as working memory - the long-term working memory, or LT-WM.
Figure 7.1b illustrates how LT-WM functions. The items available in
the capacity-limited STM serve as retrieval cues for those parts of LTM
that are connected to them by retrieval structures (as explained in section
7.3). A single retrieval operation, using one of the cues in STM, makes
available that subset of LTM memory that is linked to the cue by such a
retrieval structure. A retrieval from LTM by a cue in STM requires about
300 to 400 ms (see Ericsson & Kintsch, 1995). Hence, the amount of
information in working memory consists of two sets of items: those already in ST-WM, which are accessible very rapidly though not instantaneously (Sternberg, 1969), and those reachable by a retrieval structure
in about 400 ms. Whereas the capacity of ST-WM is strictly limited, that
of LT-WM is constrained only by the extent and nature of the retrieval
structures that can be accessed via the contents of STM.
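The functional picture can be sketched schematically. In the toy model below (the cues and LTM contents are invented for illustration), the items held in capacity-limited STM act as cues, and a retrieval structure maps each cue to the subset of LTM it makes available at the cost of one roughly 400 ms retrieval:

```python
# A schematic sketch of LT-WM: working memory = the few chunks in STM
# plus whatever those chunks can retrieve via retrieval structures.
retrieval_structure = {
    "apartment": {"rent", "landlord", "lease"},
    "music": {"piano", "neighbors", "noise complaint"},
}

def lt_wm_contents(stm, structure):
    """Everything effectively in working memory: STM plus what its cues retrieve."""
    retrieved = set()
    for cue in stm:
        retrieved |= structure.get(cue, set())  # one ~400 ms retrieval per cue
    return set(stm) | retrieved

stm = ["apartment", "music"]  # a handful of chunks, strictly capacity-limited
print(sorted(lt_wm_contents(stm, retrieval_structure)))
```

The point of the sketch is the asymmetry: STM itself holds only a few chunks, but the reachable set grows with the richness of the retrieval structures, exactly as the text describes.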
Retrieval structures were first studied in detail by Chase and Ericsson
(1982), who investigated a subject who was able to increase his digit span
to more than 90 digits after extended training. Later, many other subjects
were taught to extend their digit span to 30 or more items. Such feats
were achieved by developing and automating efficient encoding strategies
to store digits in LTM. This requires a large body of relevant knowledge,
for the encoder must be able to perceive familiar patterns in the digit
sequences that are to be memorized and to associate these patterns with
retrieval cues. Schemata retrieved from LTM must be used to further organize these retrieval cues into stable retrieval structures that will support the quick and reliable recall of the digit sequence to be learned. For instance, S. F., Chase and Ericsson's original subject, was a runner who used his extensive knowledge about running times as an encoding scheme. Typically, he formed four-digit chunks by associating the digits with some remembered numerical fact about running - for example, 3,596 might be coded as 3 m 59.6 s, or just under a four-minute mile. He then used spatial relations to encode these chunks into higher-order groups, forming a hierarchical retrieval scheme, as shown in the upper part of Figure 7.2. Having formed such a structure, S. F. was able to recall the corresponding digits at any location in this structure that the experimenter requested.
Subject S. F. further strengthened this retrieval scheme through elaborative encoding. He not only associated digits with a retrieval cue but
used higher-order relations among the encodings he had generated to
organize his retrieval structure. He might, for instance, have noted that
several successive chunks were all related to running times for a mile,
thereby forming a supergroup. The resulting retrieval structure thus was
organized both hierarchically and elaboratively, as shown in Figure 7.2.
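A toy version of S. F.'s strategy can make the mechanics concrete. The encodings and group size below are invented for illustration: the digit string is cut into four-digit chunks, each chunk is read as a running time, and chunks are then grouped into higher-order units to form a hierarchical retrieval scheme.

```python
# A sketch of S. F.'s chunking strategy: four-digit chunks encoded as
# running times, then grouped into supergroups for hierarchical retrieval.
def encode_chunks(digits, size=4):
    chunks = [digits[i:i + size] for i in range(0, len(digits), size)]
    # e.g. "3596" -> "3 m 59.6 s", just under a four-minute mile
    return [f"{c[0]} m {c[1:3]}.{c[3]} s" for c in chunks]

def group(chunks, per_group=3):
    """Collect chunks into higher-order groups (one level of the hierarchy)."""
    return [chunks[i:i + per_group] for i in range(0, len(chunks), per_group)]

codes = encode_chunks("359641023482")
print(codes[0])  # → 3 m 59.6 s
hierarchy = group(codes)
```

What makes the real feat possible is not the grouping itself but the rich knowledge base behind the encoding step: each chunk must map onto a pattern that is already meaningful to the rememberer.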
There are several important things to keep in mind about retrieval
structures such as that shown in Figure 7.2. Only a very rich knowledge
base enables one to form such structures, because many different patterns
and associations must be readily available for encoding. Suitable knowledge bases exist only in areas where a person has reached a high level of
Table 7.1. Example of a text used in the interruption experiment by Glanzer et al. (1981). Sentences 1, 3, 5, . . . comprise a connected discourse; the interspersed sentences 2, 4, 6, . . . are unrelated

S1 James Watt's first steam engine was completed after many years of thought and labor.
S2 Honeybees gather nectar from flowers of a particular kind during spring.
S3 Even then much work was needed to make it practical.
S4 Democratic principles are not always observed in the developing countries of Asia.
S5 He supplied enough money to make the machine a success.
S6 Water management in the arid states of the Southwest is affected by legal issues.
S7 Mine owners came from all over to see the engine work.
Etc.
working memory, for reading an unrelated sentence surely must wipe out
any traces of the prior text from the reader’s STM. Resource-consuming
reinstatement searches should be required, and comprehension should
be badly degraded. The theory of LT-WM, however, readily accounts for
the observed results. The next sentence of a text following an interruption provides the cues in STM that can retrieve the LTM trace of the previous text from LT-WM. The mental structure that the reader has created in the process of comprehending the text itself functions as a retrieval structure. Hence, with every new sentence, the reader gains access to the previous memory trace of the text in LTM, at the cost of a single retrieval from LT-WM, which takes about 400 ms.
Many versions of the Glanzer experiment exist, with longer and
shorter interruption intervals and different interrupting activities such as
doing simple arithmetic problems. A full discussion of their results can be
found in Ericsson and Kintsch (1995). All these results are in agreement
with the interpretation in terms of LT-WM offered here. Interrupting a
text does not interfere with comprehension because LT-WM allows the
reader reliable access to the LTM trace of the prior text with a single
retrieval operation. Similar observations have been reported concerning the interruption of other highly skilled cognitive operations, such as playing a chess game.
Text comprehension is structure building. To comprehend a text means forming a mental structure that represents the meaning and message of the text. Different theories of text comprehension (e.g., Gernsbacher, 1990; Just & Carpenter, 1987; van Dijk & Kintsch, 1983) differ on the precise characteristics of that structure, but they agree on the central issue of structure building. Any one of these theories could therefore in principle serve as a theory of LT-WM. Whatever mental structure is incidentally generated in the process of comprehension also serves as a retrieval structure, thereby generating LT-WM. In the next section I explore in somewhat more detail how the text representations generated by the CI model function as retrieval structures.
Memory plays two distinct roles in text comprehension. First, texts must
be comprehended sequentially, one sentence after another; therefore,
when focusing on one sentence, memory is needed to provide access to
the prior text. Second, text representations are not based solely on the
text but require significant contributions from the comprehender's long-term memory. Hence, relevant structures and facts must be continuously retrieved from long-term memory during comprehension.
Figure 7.3 A retrieval structure for a text (the propositions P1-P6) stored in
long-term working memory. On top is the hierarchical macrostructure of the
text. On the bottom are the knowledge-based associations relating units of
encoded information with scripts, frames, and schemas.
(1) John hit a deer driving [0] late at night on a forest road. It was dark
and at first he [Punstressed] did not realize how much damage had
been done. However, the mechanic at the garage said that the bill
would be very large. The insurance agent took a long time to
inspect the damage, and he [Pstressed] got quite impatient. In the
meantime in the forest, three vultures circled the dead deer [NP].
In the first sentence, zero anaphora is used for John as the agent of driving. In the second sentence, John is referred to with an unstressed pronoun. In both cases John is presumably still available in the reader's focus of attention. The stressed pronoun in the fourth sentence signals a referent in LT-WM, namely John. When the deer is mentioned in the last sentence, it is reintroduced with a full, definite noun phrase, the dead deer. Needless to say, there is a great deal more to say about the use of anaphora in English (e.g., Chafe, 1974; Greene et al., 1994; see also section 5.2). Besides their other roles, anaphora also serve as directives for the use of memory. They are by no means the only processing signals. Givon (1995) has discussed other signals of this kind. His evidence is based on the linguistic analysis of texts that shows that anaphora are indeed employed in the way he claims. For example, the distance between a referent and its anaphora is generally short for zero anaphora, moderate for stressed pronouns, and long for full noun phrases. Such data are very important and strongly support his analyses. It is to be hoped that psycholinguists will also find ways to test these claims with their methods.
Figure 7.4 Long-term working memory: some links between the words of a sentence and items in LTM.
(2) Implicit
A burning cigarette was carelessly discarded. The fire destroyed
many acres of virgin forest.
(3) Explicit
A burning cigarette was carelessly discarded. It caused a fire that
destroyed many acres of virgin forest.
On the immediate test, verification times for explicit test sentences were 400 ms faster on the average than for implicit test sentences. In the case of the explicit sentences, all three propositions CAUSE[CIGARETTE,FIRE], DISCARD[BURNING[CIGARETTE]], and DESTROY[FIRE,FOREST] were active in STM, and hence the test sentence could be verified quickly. In the case of the implicit sentence, only DISCARD[BURNING[CIGARETTE]] and DESTROY[FIRE,FOREST] were in STM, but because both are linked associatively to CAUSE[CIGARETTE,FIRE], this proposition could be retrieved from LT-WM within about 400 ms, which added that much to the verification time for the implicit test sentence.
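This account reduces to a simple back-of-the-envelope model. In the sketch below, the 1,000 ms base time is a hypothetical constant, not a figure from the experiment; only the 400 ms retrieval cost comes from the text.

```python
# A toy model of the verification-time pattern: a proposition already
# active in STM is verified at the base time; one that must be fetched
# from LT-WM adds roughly one 400 ms retrieval.
LTWM_RETRIEVAL_MS = 400  # approximate cost of one retrieval from LT-WM

def verification_time(base_ms, proposition, stm, lt_wm):
    if proposition in stm:
        return base_ms
    if proposition in lt_wm:
        return base_ms + LTWM_RETRIEVAL_MS
    raise LookupError("proposition unavailable; a real inference is needed")

stm = {"DISCARD[BURNING[CIGARETTE]]", "DESTROY[FIRE,FOREST]"}
lt_wm = {"CAUSE[CIGARETTE,FIRE]"}

explicit = verification_time(1000, "CAUSE[CIGARETTE,FIRE]", stm | lt_wm, lt_wm)
implicit = verification_time(1000, "CAUSE[CIGARETTE,FIRE]", stm, lt_wm)
print(implicit - explicit)  # → 400
```

The LookupError branch corresponds to the unfamiliar-domain case discussed below, where no retrieval structure exists and a deliberate inference is required instead.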
It is misleading, however, to talk about a bridging inference in this case, because that would imply that by some inference operation the proposition CAUSE[CIGARETTE,FIRE] was inserted into the reader's working memory independently of the test statement. What happened, instead, was that as a normal consequence of comprehension, the bridging proposition became part of the reader's LT-WM, and when the test statement required it, it was readily accessible.

Given the right knowledge about what happens when burning cigarettes are thrown away in dry forests, comprehension assured that this linking knowledge was available when needed. Without the right knowledge, this bridging cannot happen. Consider the following sentence pair:
For most readers, hydrocele and spermatic cord retrieve nothing. A real
bridging inference is required to establish the coherence of such a text. A
reader might reason that the spermatic cord is some sort of place where
hydrocele, which seems to be a kind of fluid, is located. This is deliberate,
conscious inferencing, reflected in the reader’s verbal protocol, unlike the
automatic knowledge access that occurs in a familiar domain.
This is not the place for a comprehensive discussion of either the priming data or of topical inferences. (For another discussion of these issues, see chapters 5 and 6, respectively.) My goal here is merely to show how the theory of LT-WM is relevant to the interpretation of these phenomena.
We have already discussed the interruption studies of Glanzer and his
colleagues (e.g., Glanzer et al., 1981). Their studies constitute strong
empirical evidence for the operation of LT-WM. In comprehending easy texts in
familiar domains, readers construct coherent, orderly text representations,
tied into their LTM, which serve as efficient retrieval structures, almost
obviating the effects of the interruption. If the textbase formed were
incoherent, however, no effective retrieval structure would be available, and
interruptions should have a deleterious effect. McNamara and Kintsch (1996)
tried to arrange for such conditions. Instead of the simple story Glanzer
used, we chose difficult scientific, technical texts in domains with which
our subjects were unfamiliar. Furthermore, instead of interrupting after each
sentence of a text, we interrupted in the middle of each sentence. A subject
read the last half of the prior sentence plus the first half of the present
sentence and was then interrupted by an unrelated sentence. Thus, at the time
of interruption whatever representation a subject had formed was necessarily
incomplete, so it would be less likely to function as a retrieval structure.
The results of this modified interruption procedure are shown in Figure 7.5,
together with the results of a replication of Glanzer's work with familiar
texts and interruptions after complete sentences. In neither case was there
an effect on comprehension as measured by the subjects' free recall. Subjects
recalled as much with midsentence interruptions as with end-of-sentence
interruptions as with no interruptions at all. However,
The role of working memory in comprehension 231
Figure 7.5 The difference in reading times per sentence between normal
reading and interrupted reading for familiar texts with end-of-sentence
interruptions and for unfamiliar texts with midsentence interruptions. After
McNamara and Kintsch (1996).
actual text and its organization only sketchily, whereas LTM structures are
used to organize and elaborate the textual material. Text representations are
always a mixture of textbase and situation model, but one or the other
component may predominate. In other words, either the top or bottom half of
the structure shown in Figure 7.3 is emphasized. Furthermore, depending on
their knowledge base, different readers may form quite different situation
models.
This argument can be elaborated by considering in more detail the role of
LT-WM in the process of comprehension.
more but also stronger links in their LTM structure. Poor readers may
need some sort of help to bring their text-LTM associations above
threshold strength so that an effective LT-WM is formed.
Situation model dominance. At the other extreme, a person with rich
background knowledge forms a very different LT-WM. The text is not only
linked to (mostly useless) single-word associates but also to rich and
extensively interconnected larger knowledge structures that connect different
text elements. Thus, activation of a schema or script might connect a large
number of text propositions in different parts of a text. When such a
structure is retrieved from LT-WM, it tends to assume a dominating position
in the integration process because of its centrality in the network.
High-knowledge persons, therefore, will tend to reconstruct a text when
trying to recall it in terms of these general knowledge structures and will
reproduce correspondingly fewer text propositions and less of the text
structure.
These different outcomes can be observed in studies in which persons at
different levels of expertise recall medical case records and diagnoses
(Groen & Patel, 1988; Schmidt & Boshuizen, 1993). Interns behave like readers
who mostly depend on the textbase and have only a weak situation model. Thus,
they reproduce parts of the text without clearly differentiating essential
and inessential information and make few knowledge intrusions. The recall of
doctors with several years of practice, in contrast, clearly reflects their
situation model. It is very rich, and much material is included that was not
in the case record but was inferred in the process of arriving at a
diagnosis. In particular, this includes general medical and physiological
knowledge that is relevant to their understanding of these cases. Interns do
not include such material in their recall protocols either because they do
not have the requisite knowledge or because, although they learned these
things in medical school, their knowledge is too weak and insufficiently
situated to be used in diagnosis.
Interestingly, the real experts, doctors with many years of practice,
produced much briefer and quite different recall protocols than the doctors
with less experience. They too relied on their LT-WM and reconstructed rather
than reproduced the protocols, but because their situation model was very
different from that of less experienced doctors, their LT-WM and what they
recalled were different, too. Expert doctors gave brief recalls concerned
with the essential symptoms and diagnosis of a case.
In the text-processing theory of Kintsch and van Dijk (1978), the STM buffer
played a crucial role. It was the only way a coherent text representation
could be constructed. According to that theory, a few elements, or at least
one, from one processing cycle must be maintained in the STM buffer to be
reprocessed together with the elements of the next cycle, in the hope that
some link (argument overlap) will be found so that a coherent text
representation can be formed. The buffer was the bridge in the model between
processing cycles that permitted the formation of a coherent mental
representation of the text, which had to be processed sentence by sentence.
The capacity of the buffer became an important issue, though most estimates
suggested that it remained relatively constant at one or at most two
propositions.
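The cycle-by-cycle operation of such a buffer can be sketched in a few lines of code. This is a simplified illustration, not the Kintsch and van Dijk program: the proposition format is invented, and selection of the carried-over proposition is reduced to list order rather than importance.

```python
# Sketch of the coherence mechanism in Kintsch and van Dijk (1978):
# each sentence is one processing cycle; propositions held over in an
# STM buffer are linked to the next cycle's propositions via argument
# overlap. Simplified illustration only.

def argument_overlap(p, q):
    """Two propositions cohere if they share at least one argument."""
    return bool(set(p[1:]) & set(q[1:]))

def process_cycles(cycles, buffer_size=1):
    buffer, links = [], []
    for cycle in cycles:
        for held in buffer:                 # reprocess buffered propositions
            links += [(held, p) for p in cycle if argument_overlap(held, p)]
        buffer = cycle[:buffer_size]        # carry over for the next cycle
    return links

# Propositions as (PREDICATE, argument, ...); the example echoes the
# burning-cigarette text discussed earlier.
cycles = [
    [("DISCARD", "CIGARETTE"), ("BURNING", "CIGARETTE")],
    [("CAUSE", "CIGARETTE", "FIRE")],
    [("DESTROY", "FIRE", "FOREST")],
]
for a, b in process_cycles(cycles):
    print(a, "<->", b)
```

With these three cycles, the buffered proposition links each sentence to the next through the shared arguments CIGARETTE and FIRE, yielding a connected textbase.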
This picture must now be modified in some important respects. First, the
Glanzer data discussed earlier (Glanzer et al., 1981) clearly demonstrate
that comprehension is possible without the use of an STM buffer. The
interruption procedure surely wipes out the contents of STM - yet
comprehension is unaffected, except for a slight increase in reading times.
Second, the theory of LT-WM provides an alternative mechanism for the
construction of a coherent text representation. Links between text elements
from different processing cycles are formed via LT-WM.
Thus, both empirically and theoretically, comprehension can do without an
STM buffer. However, there is good evidence (Fletcher, 1981; Fletcher &
Bloom, 1988) that an STM buffer is used in normal uninterrupted
comprehension. One may question why an STM buffer is used when comprehension
obviously can proceed without it, but just because a mechanism is redundant
does not mean that it is useless. The STM buffer, for instance, allows for
the formation of a more coherent textbase. Propositions maintained in the
buffer can be linked with the propositions derived from the next sentence on
the basis of purely textual relations, whereas LT-WM links must be based on
prior knowledge. High-knowledge readers are probably not as dependent on the
buffer, but interference with the STM buffer might have stronger effects for
low-knowledge readers.
The notion of retrieval structure has appeared in several places in the
literature on comprehension and problem solving in recent years. Richman,
Staszewski, and Simon (1995) propose a formal model of retrieval structures
in recall in the context of the EPAM theory of memory. Clark and Marshall
(1981) describe how what they call "reference diaries" function as retrieval
structures in a conversation, somewhat like macrostructures do in reading a
text (Figure 7.4). Myers, O'Brien, Albrecht, and Mason have reported a series
of experiments in which they explore how the content that is in the focus of
attention during reading accesses related prior information in the text or in
the reader's knowledge base (e.g., Albrecht & Myers, 1995; O'Brien, 1995;
O'Brien & Albrecht, 1991). Myers uses a well-chosen metaphor for this
process, namely, that long-term memory resonates with the contents of
short-term memory; the items that resonate most strongly become accessible in
working memory (Myers et al., 1994). Myers's resonance theory is a specific,
mathematical model of retrieval structures in comprehension. It builds on
memory theory, in particular the SAM formulation of Gillund and Shiffrin
(1984), to describe how resonance works in comprehension and is in principle
quite compatible with the CI model. Indeed, Varma and Goldman (1996) have
implemented a version of the CI model that includes a resonance-based
reinstatement mechanism.
Retrieval structure, therefore, is not just a useful but vague concept;
rather, it is something that can be modeled formally. The possibility of
formally specifying retrieval structures brings up another problem, however.
How can we objectively and with some degree of accuracy hope to model the
reader's long-term memory? As suggested in section 3.3, latent semantic
analysis may provide at least a partial solution to this problem.
Latent semantic analysis (LSA) can be used to model retrieval structures.
Retrieval structures are fixed, well-established links in long-term memory.
(6) S1 John was driving his new car on a lonely country road.
S2 The air was warm and full of the smell of spring flowers.
S3 He hit a hole in the road and a spring broke.
S4 John lost control and the car hit a tree.
What are likely words that LSA might be reminded of when reading S1, John
was driving his new car on a lonely country road? Table 7.2 shows two of the
10 most closely related terms in the encyclopedia LSA space for each of the
content words in that sentence.
For John, LSA thinks of two historical Johns; for new it knows no better
than New York and New Jersey. But suppose we add these remindings to the
network created for that sentence in (6), using the cosines from the LSA
analysis for link strengths. Integrating, we obtain the results shown in
Figure 7.6, where the areas of the squares used for each node are
approximately proportional to the final activation value of that node. Note
that lonely has completely disappeared from the network, as have most of the
intuitively irrelevant remindings from the list in Table 7.2. Only rail and
York remain as obviously irrelevant associations but with activation values
barely above zero.
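The procedure just described can be sketched as follows. The words, remindings, and cosine link strengths below are invented for illustration; in the actual analysis they would be read off an LSA space, and the integration step is the usual CI relaxation.

```python
import numpy as np

# Sketch: add LSA remindings to a sentence network and integrate by
# spreading activation. All link strengths are invented stand-ins for
# LSA cosines.

nodes = ["John", "driving", "car", "country", "road", "lonely",
         "York", "rail"]            # last two: hypothetical remindings
n = len(nodes)
W = np.zeros((n, n))

def link(a, b, w):
    i, j = nodes.index(a), nodes.index(b)
    W[i, j] = W[j, i] = w

link("driving", "car", 0.60)        # invented word-word cosines
link("car", "road", 0.50)
link("country", "road", 0.55)
link("John", "driving", 0.30)
link("John", "car", 0.25)
link("lonely", "road", 0.05)        # 'lonely' is only weakly connected
link("lonely", "country", 0.05)
link("John", "York", 0.30)          # remindings added to the network
link("road", "rail", 0.25)

# Integration: repeatedly multiply the activation vector by the link
# matrix and renormalize until it stabilizes.
a = np.ones(n)
for _ in range(100):
    a = W @ a
    a = a / a.max()

for name, act in sorted(zip(nodes, a), key=lambda t: -t[1]):
    print(f"{name:8s} {act:.3f}")
```

With these invented cosines, the weakly linked nodes (lonely, and the remindings York and rail) end up with low activation after integration, mirroring the pattern described for Figure 7.6.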
If we similarly select two associates from the top 10 neighbors of each
Figure 7.6 The activation values for the words from the sentence John was
driving his new car on a lonely country road (gray squares) as well as for
several remindings (white squares) and the sentence node S (black square).
The area of each square indicates the activation value of the node.
content word in the other three sentences of example (6), the following
knowledge intrusions can be obtained.
S2 The air was warm and full of the smell of spring flowers:
smell drops out; humidity, dew, showy, and leaves are added.
S3 The car hit a hole and a spring broke:
driver, automobile, smash, and album are added.
S4 John lost control and the car hit a tree:
automobile, trunk, and bark are added.
tree along the road. Therefore, LSA misunderstands things because of its
inadequate knowledge base. For instance, the nearest neighbor in the LSA
space for the sentence S1, John driving his car around, is the document "New
Frontier." This is not as puzzling as it appears: Put together John, country,
new, road - quickly you get into Camelot country. One might say LSA is doing
a good job here. If it knew more about the everyday world - knowledge that
LSA might get from reading newspapers, novels, movie scripts - it might also
be reminded of more prosaic drives in the country. The example also
illustrates another limitation of the present version of LSA: its disregard
of syntax. Sentence S1 is not about John (JFK) leading his country along a
new road but about driving on a country road.
In spite of these caveats about LSA, its use for modeling the operation of
retrieval structures in comprehension opens exciting new possibilities. Only
further research will show how much of this promise can be realized. It
should be possible, however, to employ LSA to derive specific, a priori
predictions for experiments such as those of Myers and his colleagues - with
a better, more informed LSA space.
In section 9.1 there is a discussion of the factors that make for a good or
a poor reader. Decoding skills, language skills, and domain knowledge are
emphasized. Another factor that has often been suggested as a determinant of
reading comprehension is the capacity of working memory. The theory of
long-term working memory described earlier implies that individual
differences in comprehension may be the result of skill and knowledge
differences. Capacity differences in working memory may or may not play an
additional role (see also Ericsson & Kintsch, 1995).
At first glance, the data seem to contradict any relation between reading
comprehension and STM capacity. Reading comprehension improves enormously
between childhood and adulthood, but measures of STM capacity such as the
digit span show only a slight improvement (e.g., Chi, 1976; Huttenlocher &
Burke, 1976; and others). Good and poor readers cannot be distinguished on
the basis of their memory span (e.g., Farnham-Diggory & Gregg, 1975; Rizzo,
1939). However, such data are not really conclusive, because the memory span
merely measures a subject's ability to store unfamiliar digit sequences and
does not provide a direct indication of the STM capacity available during
reading. Daneman and
storage, but that they are more skilled in putting things into long-term
storage and retrieving them again.
This reinterpretation of the findings of Daneman and Carpenter
(1980) fits many of the results that have been reported in support of the
capacity theory. Differences in language skills, not STM capacity, explain
these findings. Cantor and Engle (1993), for instance, had subjects with
high and low reading spans memorize either a list of unrelated sentences
or a list of thematically related sentences. The number of statements in
these sentences that related to a particular concept was varied. A fan
effect is typically observed in such situations when subjects are asked to
verify these statements: The more statements there are about a particular
concept, the slower the verification time. Indeed, this was what Cantor
and Engle observed in both low- and high-span readers when the sentences
were unrelated. When the sentences were related, however, low-
span readers still showed the typical fan effect, whereas this effect was
reversed for high-span readers: The more statements about a particular
concept, the faster were the latters’ verification times. The fan effect
occurs because the more unrelated statements subjects have to remember,
the larger the search set becomes on a verification test. For thematically
related sentences, however, high-span readers could use their superior
organizational skills to construct a single, coherent representation. The
more information there was about a given concept, the more strongly
integrated this representation became and the better it was able to serve
as a retrieval structure. Hence, the negative fan effect. The low-span
readers, on the other hand, did not fully succeed in constructing a coherent
representation of the thematically related sentences because of their
inferior organizational skills. The mental representation they created was
therefore only partially coherent, which resulted in a typical fan effect.
We can similarly explain the results of Just and Carpenter (1992) in terms
of LT-WM processes. In one of their experiments, Just and Carpenter showed
that only high-span subjects took advantage of pragmatic cues when
comprehending reduced relative clauses. Consider the following two sentences
with unreduced relative clauses:
(7) The evidence that was examined by the lawyer shocked the jury.
(8) The defendant who was examined by the lawyer shocked the
jury.
From these sentences, a baseline measure can be obtained for the reading
time for the phrase by the lawyer. Reduced relative clauses can be
constructed by omitting the that was or who was from the original sentences:
(11) The experienced soldiers warned about the danger before the
midnight raid.
This sentence has a simple structure, and low-span readers read through
it without difficulties. High-span readers, on the other hand, pay more
attention to the subtle cues the language provides and notice a potential
(12) The experienced soldiers (who were) warned about the danger
were ready.
Readers who form an expectation like this will stumble over the last word
of the actual sentence and experience a garden-path effect. Indeed, that is
what happened to high-span readers, who took about 130 ms longer to
read raid than did mid- and low-span readers.
Just and Carpenter (1992) attribute this result to STM capacity. Low-span
and high-span readers alike form both expectations (that the sentence will
continue as a normal sentence, as it actually does, or that there is a
reduced relative clause). However, low-span readers have insufficient memory
capacity to retain both alternatives in STM and hence drop the less likely
parse before they come to the end of the sentence. High-capacity readers
retain both parses to the end, which results in unnecessary processing
difficulties. An alternative possibility is that the less skilled low-span
readers never consider the possibility of a reduced relative clause and
hence have no difficulty. High-span readers habitually use more of the
information available to them in the parsing process, which in this case
gets them into trouble. Pearlmutter and MacDonald (1995)
have actually investigated whether both low- and high-span readers are
sensitive to the cues in question and found results in accord with our
claims. They found that only high-span readers were sensitive to
probabilistic constraints in the sentences used in their experiment.
Low-span readers ignored these constraints. It was not that low-span readers
did not know about these constraints. They had the knowledge when
appropriately questioned, but that knowledge was not sufficiently automated
to be used for the encoding of sentences in an actual comprehension task.
This illustrates a very important feature of retrieval structures. Simply
knowing something - that is, being able to retrieve it under optimal
circumstances - is not sufficient for a retrieval structure; the knowledge
must also be strong, stable, well practiced, and automated, so that it can
be employed for encoding without additional resource demands. Retrieval
structures are a characteristic of expert memory. Unskilled, low-span
readers may have the knowledge, but it is not in readily usable form.
Thus, much of what has been taken as studies of working memory capacity
turns out to be studies of comprehension skills. This may also be
(13) Since Ken liked the boxer, he took a bus to the nearest pet store
to buy the animal.
There were large differences between high- and low-span readers with
such sentences. Low-span readers became confused when they had to
activate the nonpreferred sense of the ambiguous noun at the end of the
sentence, whereas high-span readers had no such difficulties. However,
when the disambiguating phrase followed right after the homograph,
low-span readers performed just like high-span readers. Clearly, low-
span readers knew both word senses.
The capacity interpretation of this finding is that both low- and high-span
readers activated both word senses, but only the high-span readers had
enough working memory capacity to maintain both meanings until the end of
the sentence, when it was needed for understanding. Low-span readers were
forced to drop the less likely meaning because of the limitations of their
working memory capacity, and hence they had difficulties understanding the
end of the sentence. The trouble with this interpretation is that we know
(e.g., section 5.1.2) that word meanings in context are fixed in about 350
msec; hence, by the end of (13) all readers must have deactivated the
nonpreferred word sense. However, high-span readers had built a good
retrieval structure that enabled them to reactivate without difficulties the
lost word sense when it was needed, whereas low-span readers had to engage
in more cumbersome recovery processes because the tentative sentence
structure they had built did not serve them as an efficient retrieval
structure.
This interpretation is strengthened by another finding Miyake et al. (1994)
reported. Differences between high- and low-span readers were observed only
with sentences like (13), where the homograph has a preferred and a
nonpreferred meaning. With equibiased homographs, there were no differences
between high- and low-span readers. This observation is difficult for a
capacity theory but is readily explained if knowledge and skill differences
are involved. Low-span readers know all about word senses, but only in the
case of the more frequent, equibiased homographs is that knowledge
sufficiently automated and usable. Again, knowledge of word meanings is not
sufficient to support the formation of a retrieval structure.
the sports page of the newspaper or a trite little story, and their
memory for what they read and did and their ability to activate
relevant knowledge will not be characterized by the theory of
retrieval structures but by the classical theory of short-term
working memory.
8
MEMORY FOR TEXT

[Figure: the Gillund and Shiffrin retrieval framework - cues (the context
CX and items Ii) linked to memory images Ij with strengths S(CX,Ij) and
S(Ii,Ij).]

F(CX, I1, . . . , In) = Σ (all j) S(CX,Ij) * S(I1,Ij) * . . . * S(In,Ij)   (2)
How well does the CI model account for the data from sentence recognition
experiments? An earlier attempt was made by Kintsch, Welsch, Schmalhofer,
and Zimny (1990) to answer this question. Some simulations consistent with
the present version of the model are presented below (see also Singer &
Kintsch, in preparation).
[Figure legend: New-Inappropriate, New-Appropriate, Inference, Paraphrase,
Old.]
Their results are shown in Figure 8.2. Under the conditions of their
experiment, no memory loss was observed for the situation model in a
four-day period; the propositional textbase lost about half of its strength
over that period, and surface memory was lost completely.
Differences in instructions are another factor that contributes
significantly to the complexity of sentence recognition data. Results are
similar whether subjects recognize a sentence as having been part of a text
or judge whether the sentence is true with respect to that text
(Schmalhofer, 1986). However, differences are obtained when subjects are
asked to recognize a sentence or to judge its plausibility (Reder, 1982,
1987). Her recognition data are about the same as those shown in Figure 8.1,
but the plausibility data show a different pattern of results. There are
almost the same number of "yes" responses to old sentences and inferences
(roughly an 8% difference), on both the immediate and delayed test, and
there is essentially no forgetting.
Sentence recognition data, therefore, are fairly complex. The challenge to
the CI model is to account for this pattern of results. Which of the
empirical phenomena are simply a consequence of the basic architecture of
the CI model? What additional assumptions are needed to account for these
data, and how justifiable are these assumptions?
The CI model is not a model of recognition memory and hence it must
Memory for text 253
Figure 8.2 Estimates of trace strengths for surface, textbase, and situation
model traces as a function of delay. After Kintsch et al. (1990).
be combined with one. As stated earlier, the easiest choice for that purpose
is the recognition memory model of Gillund and Shiffrin (1984). This model
represents memory in the same form as the CI model does, as a network of
interconnected nodes. The CI computation generates a network that can be
used as an input to the Gillund and Shiffrin memory model. Other memory
models, such as TODAM (Murdock, 1993) and MINERVA (Hintzman, 1988), use
different representations that would require a transformation from the
network representation of the CI model to the format appropriate for TODAM
or MINERVA.
To model delay effects, we need to know what happens to the memory trace as
a function of delay. Figure 8.2 provides the necessary constraints. Thus,
the simulations of sentence recognition are constructed from three
components: the simulation of comprehension by the CI model, the simulation
of recognition by the Gillund and Shiffrin model, and the empirically
determined decay functions for different levels of sentence memory.
differences between items are random to the extent that other factors
have been successfully controlled.
Sentence recognition experiments are fundamentally different in this
respect, because the sentences in a natural text are never equal. Indeed, it
has been shown that important, central sentences in a text are recognized
better than unimportant, less central sentences (Walker & Yekovich, 1984).
The CI model allows one to calculate the strength of each sentence in a
textbase. Hence, if we use these calculations as the input to the Gillund
and Shiffrin recognition model, the model should predict correctly that
sentences high in the textbase hierarchy should be recognized better than
sentences low in the hierarchy.
A simple test of this prediction can be obtained by constructing an
arbitrary textbase and then computing how well a sentence corresponding to a
proposition high in the hierarchy is recognized versus a sentence
corresponding to a lower-order proposition. Figure 8.3 shows the network I
have used for that purpose. It contains seven nodes forming a
textbase, the propositions P1 to P7, the corresponding surface structure
nodes S1 to S7, and a situation model consisting of a general context node,
CX, and two macropropositions, M1 and M2. This is an abstract
example, in that no particular text is associated with it. Its structure is
arbitrary but designed to allow us to investigate recognition for important
and unimportant sentences, that is, sentences high or low in the textbase
hierarchy. Thus, the propositional nodes are arranged so that they occupy
three levels of a hierarchy: P3 and P6 are at the top of the hierarchy
(directly subordinated to the macropropositions), P2, P4, P5, and P7 are
at the intermediate level (subordinated either to P3 or P6), and PI is at the
lowest level (subordinated to P2). This allows us to test recognition for
sentences based on PI and P3, that is, for a low-importance and for a
high-importance sentence.
As a first step, comprehension of the structure shown in Figure 8.3 was
simulated. (At this point, the stars in the upper left corner of Figure 8.3
play no role.) The simulation was done in two cycles: The first contained
CX, M1, and all nodes under M1; the second cycle included CX and M2 and all
the nodes dominated by M2, plus P3, which was retained in the STM buffer as
the most strongly activated proposition from the first cycle. Thus, a
long-term memory matrix Ltext, which became the basis for the recognition
tests, was calculated. The program of Mross and Roberts (1992) was used for
these calculations.
The recognition probe is represented in Figure 8.3 by the stars labeled
Figure 8.3 A network representation of a text and a test sentence high in the
textbase hierarchy. Nodes that do not participate in the test are linked by
dashed lines.
1. I thank Ernie Mross for writing the "Matrix" program that performs these operations.
P(yes)low = .172   (3)
Thus, in our example, a high-importance sentence is recognized much more
often than a low-importance sentence. Of course, these values depend on how
the example has been constructed; a different and considerably higher value
would be obtained had we chosen P2 as the low-importance proposition. Yet
different values would be obtained with different networks. The point here
is merely that when we compare recognition of a high-importance sentence and
a low-importance sentence, the former is correctly predicted to be higher
than the latter. This prediction is obtained without ad hoc assumptions,
merely by using the CI model together with an off-the-shelf recognition
model and standard signal detection analyses.
2. mhigh = .536 and d'high = 2 implies shigh = .536/2 = .268. For d' = 2, a
b value of -.84 corresponds to a b value of 1.16 for d' = 0. For the
distribution of the low-importance sentence with a mean of .038, d' =
.038/.268 = .142. The criterion value of 1.16 corresponds to 1.16 - .142 =
1.02 on the low-importance distribution. Hence P(yes)low = .172.
Figure 8.4 A network representation of a text and an old test sentence.
Nodes that do not participate in the test are linked by dashed lines.
Table 8.1. The LTM strength values for the network shown in Figure 8.4 and
the strength values calculated for the recognition of an old sentence, a paraphrase
sentence, and an inference sentence
nodes are strong for old items, whereas for paraphrases only the MOD and
PRO nodes are high, and for inferences only the MOD node is high.
The CI model having done its work, we turn to the recognition module. We
first need to transform the sparse matrices obtained from the CI model into
full matrices, showing indirect as well as direct connections between all
nodes in the network. Table 8.2 shows the sparse matrix obtained from
testing the old sentence; this is equivalent to column 2 of Table 8.1,
except presented in a different form.
Table 8.2. The sparse matrix showing the direct links between nodes for an old test sentence
Memory Images
CX 0.95 0 0 0 0 0.67 0 0 0 0 0 0 0
M1 1.69 1 0 0.71 0 0 0 0 0 0 0.74 0 0
P1 1 1.25 0.82 0 0 0 0 0 0 0 0 0.87 0
S1 0 0.82 0.56 0 0 0 0 0 0 0 0 0 0.61
P2 0.71 0 0 0.5 0.26 0 0 0 0 0 0 0 0
S2 0 0 0 0.26 0.14 0 0 0 0 0 0 0 0
M2 0 0 0 0 0 1 0.67 0.67 0 0 0 0 0
P3 0 0 0 0 0 0.67 0.45 0 0.22 0 0 0 0
P4 0 0 0 0 0 0.67 0 0.45 0 0.22 0 0 0
S3 0 0 0 0 0 0 0.22 0 0.11 0 0 0 0
S4 0 0 0 0 0 0 0 0.22 0 0.11 0 0 0
MOD 0.74 0 0 0 0 0 0 0 0 0 0.95 0.97 0.92
PRO 0 0.87 0 0 0 0 0 0 0 0 0.97 1 0.94
SUR 0 0 0.61 0 0 0 0 0 0 0 0.92 0.94 0.89
Table 8.3 shows the corresponding full matrix, in which the strengths of
all indirect paths have been computed by multiplying together the strength
values of all path segments connecting two nodes. Because the network in
Figure 8.4 is fully connected, all entries in this matrix are nonzero. These
entries are the strength values with which nodes i and j are connected in
memory, S(Ii,Ij).
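This path computation can be sketched as a max-product closure over the direct-link matrix. This is one plausible reading of the procedure; the original "Matrix" program may combine multiple paths differently, and the toy strengths below (kept at or below 1) are invented.

```python
import numpy as np

def full_matrix(direct):
    """Fill in indirect connections: for every pair of nodes, take the
    strongest path between them and multiply the strengths of its
    segments (Floyd-Warshall with * in place of +)."""
    S = np.array(direct, dtype=float)
    n = len(S)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = S[i, k] * S[k, j]
                if via_k > S[i, j]:
                    S[i, j] = via_k
    return S

# Tiny invented example: A-B = .8, B-C = .5, no direct A-C link.
direct = [[1.0, 0.8, 0.0],
          [0.8, 1.0, 0.5],
          [0.0, 0.5, 1.0]]
print(full_matrix(direct))  # indirect A-C strength becomes .8 * .5 = .4
```

Because segment strengths at or below 1 only shrink under multiplication, longer paths contribute less, so the closure converges on the strongest short path between each pair of nodes.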
In the Gillund and Shiffrin model (1984), the Ms, Ps, and Ss of Figure 8.4
are the memory images, which we now label as Ik. The recognition probe
consists of the context element CX and the test sentence, which are
represented by the three elements MOD, PRO, and SUR in Figure 8.4. The
familiarity of the probe (CX, MOD, PRO, SUR) according to equation 1 of
Gillund and Shiffrin is given by
F(CX,MOD,PRO,SUR) = Σ (k = 1 to 9) S(CX,Ik) * S(MOD,Ik) * S(PRO,Ik) * S(SUR,Ik).   (4)
The familiarity value calculated in this way for old sentences from Table
8.3 is 1.378. These calculations were repeated for paraphrases and
inferences, starting with the values for these test sentences shown in Table
8.1. The familiarity value of the paraphrase test sentence turns out to be
262 Models of comprehension
Table 8.3. The full matrix showing both direct and indirect links between nodes
for an old test sentence
      Memory Images
      M1    P1    S1    P2    S2    M2    P3    P4    S3    S4    MOD   PRO   SUR
CX    0.95  0.95  0.78  0.67  0.17  0.67  0.44  0.44  0.1   0.1   0.7   0.82  0.64
M1    1.69  1     0.82  0.71  0.18  0.63  0.42  0.42  0.09  0.09  0.74  0.87  0.68
P1    1     1.25  0.82  0.71  0.18  0.63  0.42  0.42  0.09  0.09  0.84  0.87  0.82
S1    0.82  0.82  0.56  0.58  0.15  0.52  0.35  0.35  0.08  0.08  0.56  0.71  0.61
P2    0.71  0.71  0.58  0.5   0.26  0.45  0.3   0.3   0.07  0.07  0.52  0.61  0.48
S2    0.18  0.18  0.15  0.26  0.14  0.12  0.08  0.08  0.02  0.02  0.14  0.16  0.12
M2    0.63  0.63  0.52  0.45  0.12  1     0.67  0.67  0.15  0.15  0.47  0.55  0.43
P3    0.42  0.42  0.35  0.3   0.08  0.67  0.45  0.44  0.22  0.1   0.31  0.37  0.29
P4    0.42  0.42  0.35  0.3   0.08  0.67  0.44  0.45  0.1   0.22  0.31  0.37  0.29
S3    0.09  0.09  0.08  0.07  0.02  0.15  0.22  0.1   0.11  0.02  0.07  0.08  0.06
S4    0.09  0.09  0.08  0.07  0.02  0.15  0.1   0.22  0.02  0.11  0.07  0.08  0.06
MOD   0.74  0.84  0.56  0.52  0.14  0.47  0.31  0.31  0.07  0.07  0.95  0.97  0.92
PRO   0.87  0.87  0.71  0.61  0.16  0.55  0.37  0.37  0.08  0.08  0.97  1     0.94
SUR   0.68  0.82  0.61  0.48  0.12  0.43  0.29  0.29  0.06  0.06  0.92  0.94  0.89
.834, and of the inference test sentence .221. These values are shown in
Table 8.4.
Familiarity values for test sentences after a delay can also be calculated by making suitable assumptions about the decay of probe strengths as a function of delay. These assumptions were motivated by Figure 8.2. Specifically, it was assumed that there was no loss of strength as a function of delay for situation model nodes in memory but that propositional nodes lost half their strength and surface elements all their strength. Thus, the terms in equation 2 were multiplied by .5 if the imagery element was a proposition and by 0 if it was a surface element. The resulting familiarity values for old sentences, paraphrases, and inference test sentences are also shown in Table 8.4.
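These decay assumptions amount to a simple weighting of each strength by node type. A minimal sketch, assuming that node labels carry their type as the first letter, as in Table 8.2: M for situation model nodes, P for propositional nodes, S for surface nodes.

```python
# Sketch of the delayed-test decay assumption: situation model (M)
# nodes keep their full strength, propositional (P) nodes keep half,
# and surface (S) nodes lose all of it. The label-prefix convention
# follows Table 8.2; the function itself is illustrative.
DECAY = {"M": 1.0, "P": 0.5, "S": 0.0}

def delayed_strength(image, strength):
    """Weight a cue-to-image strength for the delayed test."""
    return strength * DECAY[image[0]]
```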
Finally, a signal-detection model was used to calculate recognition probabilities from the familiarity values of the test sentences. As in the previous example, d'old was arbitrarily set to 2, and the criterion βold was chosen so that P(yes)old = .80, as suggested by Figure 8.1. The d' values for paraphrases and inferences were obtained by scaling their familiarity values in the
Table 8.4. Familiarity values, d' and β values, and P(yes) for old sentences, paraphrases, inferences, and unrelated distractor sentences for immediate and delayed recognition
same way as for old sentences (i.e., dividing by .69). The β and P(yes) values could then be calculated for paraphrases, inferences, and distractors, as shown in the table. The calculations for the delayed recognition test were identical, again choosing a criterion value that yields a recognition probability of .80 for old test sentences. Figure 8.5 summarizes the predicted recognition probabilities for both the immediate and delayed test.
These predictions capture the essential features of the empirical data of
Figure 8.1. That the “yes” probability for old sentences remains at .80 for
both immediate and delayed tests was built into Figure 8.5 by the way the
parameters of the signal detection model were estimated. What the model
predicts correctly is the ordering of the test items and the increase with
delay in false alarms for items that were not actually part of the original
text (paraphrases, inferences, and distractors).
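The signal-detection step can be sketched as follows, under the standard equal-variance normal assumption: d' for an item type is its familiarity divided by a scaling constant (.69, so that the old-sentence familiarity 1.378 maps onto d' = 2), and the "yes" criterion is placed so that P(yes | old) = .80. The function name and parameterization are illustrative, not from the original.

```python
# Sketch of the signal-detection calculation, assuming equal-variance
# normal evidence distributions. The scale .69 maps familiarity onto
# d' (1.378 / .69 = 2 for old sentences), and the criterion is set so
# that the hit rate for old sentences is .80.
from statistics import NormalDist

def p_yes(familiarity, scale=0.69, d_prime_old=2.0, hit_rate=0.80):
    norm = NormalDist()
    criterion = d_prime_old - norm.inv_cdf(hit_rate)  # gives P(yes|old) = .80
    d_prime = familiarity / scale
    return norm.cdf(d_prime - criterion)
```

With the familiarity values computed above (1.378, .834, .221, and 0 for unrelated distractors), a calculation of this kind reproduces the predicted ordering old > paraphrase > inference > new.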
The two examples of how well-known phenomena of sentence recog¬
nition can be derived from the Cl model and current memory theory
make an important point. We don’t need a special theory of sentence
memory: If we understand sentence comprehension (the Cl theory) and
recognition memory (the list-learning literature), we have all the parts we
need for a sentence recognition model. Conceptually it is very simple.
Figure 8.5 Predicted probabilities of a "yes" response for old sentences, paraphrases, inferences, and new sentences, for the immediate and delayed tests.
The actual calculations are complex, and the pattern of data that must be accounted for is complex, too. But underlying all this complexity is a simple model of comprehension and a simple memory model.
The examples discussed have been intentionally simplified in order to be able to work through the actual calculations step by step. This is important if one wants to show that a complex new phenomenon can be accounted for by simple old theories. It is, of course, also necessary to work with real, nonarbitrary texts, fitting actual data, and to extend the domain of phenomena modeled - for example, by including the plausibility results mentioned in the introduction to this section. Singer and Kintsch (in preparation) attempt to do so, thus demonstrating that the kind of account proposed here of sentence recognition in terms of familiar theories about comprehension and memory is indeed feasible, and not just a theoretical possibility.
Providing an account of how readers recall a text they have read has been one of the oldest and most important applications of the CI theory and its predecessors (Kintsch, 1974, 1976, 1977b; Kintsch & van Dijk, 1978; Miller & Kintsch, 1980; van Dijk & Kintsch, 1983). In general, one can say that the theory has done a reasonably good job of predicting recall, especially for short texts and stories. The way these predictions were derived was to simulate comprehension of a text and then assume that recall probabilities for propositions are a monotonic function of the activation values, or better, the long-term memory strength values calculated in the simulation. Correlations between predicted recall and observed recall frequencies typically range between .4 and .8. This is as good as or better than intuitive judgments of what will or will not be recalled from a text.
Exceptions have been encountered, however. In my experience, there are two cases in which the theory regularly yields poor predictions. First, recall for long texts cannot be predicted without an explicit account of macrostructure formation. Brief, paragraph-long texts are recalled reproductively, but longer texts are summarized. Their macrostructure may be reproduced, but the microstructure is either deleted or reconstructed, as Bartlett (1932) showed long ago. The solution to this problem is to include the macrostructure explicitly in the simulation. When this is done, as in Kintsch and van Dijk (1978), the predictions of the model can be substantially improved.
Prediction failures also occur when readers do not properly understand the text that they are recalling. The model assumes an ideal reader; that is, it assumes that the reader has processed the whole text and has formed a mental representation of the text as the author intended - essentially complete and essentially correct. If the reader misunderstands the text and makes various kinds of errors in recall, the model simply does not apply. Such misunderstandings occur quite frequently, especially with scientific texts. For instance, recall data recently collected in my laboratory on a text discussing several hypotheses concerning the extinction of the Irish elk - a huge animal with giant antlers - were beyond the scope of the comprehension theory, because most subjects egregiously misunderstood the discussion of the competing hypotheses. They formed idiosyncratic mental text representations, depending on the nature of their misconceptions, which is unpredictable. Thus, some conspicuous failures of the model to predict recall data that have been reported in the literature may very well be attributable to the subjects' inadequate comprehension of the text they had read.
The model does, overall, a fairly good and reliable job of predicting story recall. Some issues, however, need to be considered and might help improve the predictions. Propositions may be omitted from a recall protocol for a variety of reasons, even if they are remembered. For instance, a modifier may be omitted, not because it was not encoded as part of the text representation, but because it was considered redundant in the production phase. Such editing processes introduce further and spurious variability into data scored in terms of atomic propositions. Thus, complex propositions appear to be better units for analyzing recall data and for comparing predictions with data.
In the previous section I showed how memory strength estimates derived from the CI model of comprehension can predict sentence recognition data when plugged into a standard model of recognition memory. In contrast, no recall theory was used in deriving recall predictions in the studies of story recall; it was simply assumed that recall was proportional to the activation values calculated in the comprehension simulation. Although this simplifying assumption yielded reasonable predictions, it is certainly as questionable as in the case of sentence recognition. Why should activation values predict frequency of recall? In theories of list recall, we do not assume that memory strengths directly predict recall; instead, it has been necessary to introduce complex processing assumptions in order to describe the experimental phenomena typically observed in list recall experiments. Thus, the SAM theory of Gillund and Shiffrin (1984) assumes recall to be a complex process of cue-based retrieval and recovery. Whatever cues are present in short-term memory (for example, a context cue plus an already recalled list item) serve as retrieval cues. The retrieval process then may or may not implicitly produce another, as yet unrecalled, item from the list, depending on the strength of the relation between the retrieval cue and the items of the list. This implicitly retrieved item may then be recovered with a probability depending on its memory strength. This process is formulated mathematically, and predictions derived in this way indeed mimic all the major features of list recall data that have been observed in the laboratory. Why should text recall be different? Should not the same model of recall that has proven its worth elsewhere also be used to derive predictions for text recall? The current practice in predicting story recall from comprehension simulations assumes without any justification that recall is directly determined by memory strength. Can we enhance our predictions based on the CI model by using a proven memory theory to model the recall process, as we did for the recognition process in the previous section?
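The cue-based retrieval and recovery cycle just described can be sketched roughly as follows. This is a much-simplified illustration, not SAM itself: the recovery function 1 − e^(−strength), the rule that a recalled item becomes the next retrieval cue, and the fixed number of retrieval attempts are all assumptions standing in for the model's full control structure.

```python
# A much-simplified sketch of SAM-style free recall (after Gillund &
# Shiffrin, 1984). Items are sampled with probability proportional to
# their strength to the current retrieval cue (retrieval stage), then
# recovered with probability 1 - exp(-strength) (recovery stage).
import math
import random

def recall(strengths, context, n_attempts=20, rng=random):
    """strengths[cue][item] -> cue-to-item retrieval strength."""
    recalled = []
    cue = context
    for _ in range(n_attempts):
        candidates = [i for i in strengths[cue] if i not in recalled]
        if not candidates:
            break
        weights = [strengths[cue][i] for i in candidates]
        item = rng.choices(candidates, weights=weights)[0]      # sampling
        if rng.random() < 1 - math.exp(-strengths[cue][item]):  # recovery
            recalled.append(item)
            cue = item if item in strengths else context
    return recalled
```

Because recovery depends on absolute strength, with very strong traces (as in the simulations described below) recovery is almost assured and the interesting variance comes from the sampling stage.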
Table 8.5. The Frog story: Two Frogs and the Well
   └ SUMMER
   └ IN-HEAT
P4 The frogs decided to find a new home.
   FIND[F, HOME]
   └ NEW
   └ DECIDE
P5 The two frogs left the pool of water.
   LEAVE[F, P]
P6 The two frogs began their search for another pool of water.
   SEARCH[F, POOL]
   └ OF-WATER
   └ BEGIN
P7 As the two frogs traveled along, they reached a well.
   REACH[F, W]
   └ AS-TRAVEL
P8 The well was very deep.
   DEEP[W]
   └ VERY
P9 The first frog said the well would provide them with shelter and food.
   PROVIDE[W, SHELTER, FOOD]
   └ SAID[FIRST-FROG]
P10 The first frog wanted to enter the well.
   ENTER[F, W]
   └ WANT[FF]
P11 The second frog wanted to think about the well as a new home.
   THINK[F, W, H]
   └ WANT[SEC-FROG]
P12 The second frog replied: "Suppose the well's bottom is not a good home."
   NOT-GOOD[BOTTOM, H]
   └ SUPPOSE
   └ OF-W
P13 The second frog said, "How will we get out of the deep well in that case?"
   HOW-GET-OUT[F, W]
   └ IF-N-G
   └ SAID[SF]
[Figure: the propositional network for the Frog story; surviving node labels include POOL, HOME, FIRST-FROG, SECOND-FROG, P12-NOT-GOOD, and P13-GET-OUT.]
Table 8.7. The Miser story: The Miser, the Neighbor, and the Gold
Figure 8.7 The correlations between observed recall frequencies and model
predictions for two stories. The connected lines show the correlations with
models based on argument repetition only (Arg) or argument repetition plus
causal links (Caus) for input cycle sizes between 4 and 8 atomic propositions.
The correlations between predictions and recall with complex propositions as
units, with links based on argument repetition and an input size of 7, are also
shown.
see that not even that matters very much. It is the structure of the model that yields the predictions, not the specific parameter values chosen. However, an input size of 7 appears to be best for these two stories.
Figure 8.7 also shows the correlations for complex propositions, for cycle size 7 only. These were obtained by adding together the activation values of all atomic propositions as well as the corresponding recall frequencies. The activation values for complex propositions predict recall much better than do the activation values for atomic propositions and indeed account for over half of the variance in the recall data.
Although correlating activation values for complex propositions with recall data gives good predictions, it remains to be investigated how much better we can do by using an established memory model for free recall, such as the Gillund and Shiffrin (1984) theory. In the present case there is probably not much to be gained because, for these very simple stories, activation values predict recall quite well - the reliability of the data is
The strengths of the complex propositions (13 for Frog, 17 for Miser) can be calculated by summing the strength values of all atomic propositions that belong to the same complex proposition. Note that the self strengths thus calculated are not just the sum of the self-activation values but also include the links between the atomic propositions that make up the complex proposition.
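Under this definition, a complex proposition's strength is the sum of the submatrix of strengths among its atomic members. A sketch; because the matrix is symmetric, the version below counts each between-member link twice, and whether the original counted links once or twice is not specified, so that detail is an assumption.

```python
# Sketch of the complex-proposition strength: sum all matrix entries
# among the complex proposition's atomic members, so that self
# strengths (diagonal) and between-member links are both included.
# With a symmetric matrix each link is counted twice here.
def complex_strength(matrix, members):
    return sum(matrix[i][j] for i in members for j in members)
```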
Table 8.9 shows the correlations between predictions and recall data for complex propositions when the activation values obtained from the CI simulation of comprehension were used directly as predictors and when these values were used in conjunction with the Gillund and Shiffrin model, as it has been described. Both sets of predictions are about equally good in predicting the overall frequency with which complex text propositions were recalled. Neither predicts the order of recall: The Gillund and Shiffrin model makes order predictions, but, as has been discussed, the list-learning control structure is not suited for story recall. Thus, there are numerous jumps from one part of the story to another, forward and backward, in the Monte Carlo simulations, that have no counterpart in the data.
Wolfe and Kintsch (in preparation) have explored modifications of the control structure of the Gillund and Shiffrin (1984) model that would allow the model to predict the order in which stories are recalled. The Gillund and Shiffrin model of recall is a two-stage model. In the first stage, memory nodes (text propositions in our case) are implicitly retrieved with probabilities proportional to their relative memory strength; in the second stage, an attempt is made to recover these implicitly retrieved memory items, the success of which depends on the absolute memory strength of the item. In our simulations, this recovery process played no role, because absolute memory strength was so high that recovery was almost assured. The recovery process may be likened to a recognition judgment of implicitly retrieved items: Does the item that was retrieved come from the right
Figure 8.8 Observed and predicted recall frequencies for the Frog story.
Figure 8.9 Observed and predicted recall frequencies for the Miser story.
The most straightforward way to derive predictions from the model for summarization would be to assume that subjects employ the same memory representation as for recall but that their output criterion is higher. Thus, the implicit retrieval process would remain the same as has been described for recall, except that when a story proposition is retrieved,
The empirical data consist of the frequencies with which each sentence of a paragraph was chosen as the most important one. To see whether this pattern of choices agrees with LSA, all paragraphs and their sentences were expressed as vectors in the LSA space obtained from the encyclopedia scaling (see section 3.3). The cosines were computed between a paragraph and each sentence in the paragraph. These cosines indicate the strength of the semantic relationship between a sentence and the paragraph. More important sentences should be closer to the overall meaning of the paragraph than less important sentences. Hence, these cosines should predict the frequency with which readers choose each sentence as most important in a paragraph. The overall correlation between the frequency judgments and the cosines was r = .51, p < .01. The split-half reliability of the frequency data is r = .75. Thus, the LSA predictions are not quite as good as the agreement among subjects but are reasonably close to human performance.
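The measure itself is just a cosine in the LSA space. A sketch with plain vector lists standing in for genuine LSA vectors (which would come from the singular value decomposition of a large term-document matrix); the function names are illustrative.

```python
# Sketch of the LSA importance measure: each sentence and the whole
# paragraph are vectors in the LSA space, and the cosine between a
# sentence vector and the paragraph vector is taken as that sentence's
# importance within the paragraph.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_by_importance(paragraph_vec, sentence_vecs):
    """Return sentence indices sorted from most to least important."""
    sims = [cosine(paragraph_vec, s) for s in sentence_vecs]
    return sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
```

The same cosine, computed between a paragraph and a whole passage, or between a summary and the text it summarizes, underlies the ranking and grading analyses reported below.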
The importance ranking data for the three-paragraph passages were analyzed in the same way: The mean empirical rank for each paragraph was compared with the cosine between the paragraph and the whole passage. The overall correlation was r = .58, p < .01. To obtain an idea of how high a correlation might be expected, a split-half reliability coefficient was computed for the empirical rankings. This correlation was r = .80. As was the case for sentence judgments in paragraphs, LSA's ability to predict the importance ranking of paragraphs in a text is not quite as good as human judgments. This may indicate a weakness of LSA (almost certainly the fact that LSA was trained on an encyclopedia limits its performance in the present case, where commonsense knowledge is more relevant than the kind of formal knowledge that is usually found in an encyclopedia), or it may indicate that semantic relatedness is not the only factor in importance judgments. Further research is needed to clarify this question.
The summaries for the single paragraphs as well as the whole texts were evaluated by computing the cosine for each summary with the text it was supposed to summarize. The better a summary, the closer its vector in the LSA space should be to the text it summarizes. Not surprisingly, there were substantial differences between paragraphs. The first paragraph was summarized quite well (the average cosine between all summaries and the paragraph was .57), the second paragraph was summarized quite poorly (average cosine .28), and the third paragraph was intermediate (average cosine .42). The overall correlation between the cosines
and the average grade assigned to a summary was r = .57, p < .01. The correlation between the two human graders was only r = .48, however. Thus, LSA actually did a little better than the two human graders in this case.
Similar results were obtained for the passage summaries. The differences between texts were less pronounced (the average cosines for the three texts were .36, .43, and .38, respectively). The overall correlation between the average grade and cosines was r = .48, p < .01, about the same as the correlation between the two human graders, r = .45. Although these results need to be extended with more texts, systematic variation between text types, summaries of different length, and human graders who achieve a higher level of agreement, they suggest that LSA is quite successful as an automatic method for evaluating the quality of summaries, thus opening up a new research area as well as a potentially fruitful field of application.
Henderson (1992) obtained estimates ranging from 355 ms per word for poor fifth-grade readers to 170 ms for good seventh-grade readers. Rapid decoding is important because better word recognition frees up resources for higher-level processing (Perfetti's verbal efficiency hypothesis). Thus, better decoders build more accurate and complete representations of text content. Their decoding capabilities also feed the growth of vocabulary: they know more word meanings and tend to have richer representations of individual word meanings.
However, slow decoders can use higher-order processes to compensate for their lack of decoding skills. Stanovich (1980) has shown that poor readers can use the sentence context to speed up their word recognition. Good readers, on the other hand, can recognize words rapidly and accurately either within a sentence context or without context; although context may speed their processing, they do not depend on it. Many studies investigating this phenomenon are summarized and discussed in Perfetti (1985, pp. 143ff): vocalization latency to words in context and in isolation, word identification latencies, predicting the next word, and degrading of visual stimuli. A number of other studies are in agreement with these findings. Good readers, for example, fixate every word in a sentence, whereas poor readers show more irregular fixation patterns (Just & Carpenter, 1980). As a consequence, good readers detect misspellings better than poor readers (McConkie & Zola, 1981). Note that poor readers, according to Perfetti, also are less able to exploit orthographic patterns in decoding when processing demands are high, even though they apparently do not lack knowledge of such structures.
Good decoding skills make good readers less dependent on discourse context in order to recognize a word, but they are certainly able to use contextual information when they need to, and they do so more effectively than poor readers. Frederiksen (1981) provided compelling evidence of this ability in comparing good and poor readers on a word recognition task. The target word was shown either without context or as the last word of a sentence. When the sentence made the target word highly predictable, both good and poor readers saved 125 ms in comparison to the context-free presentation. However, when the sentence context was only mildly constraining, poor readers could not take advantage of such a context, whereas good readers still showed a saving. Thus, good readers are not only better decoders but also better top-down processors.
weakens (the Long et al. results) but does not fully suppress (the Gernsbacher et al. results) such associations for less skilled readers. Further research is clearly needed to clarify these issues.
Efforts to determine the role of syntactic complexity in reading have not yielded uniform results. Graesser et al. (1980; also Haberlandt & Graesser, 1989) used an augmented transition network analysis as a measure of syntactic complexity, but without notable success. Simpler measures, such as the number of main and subordinate clauses per sentence, tend to be redundant with propositional analyses.
The number of propositions per sentence is indeed related to reading speed and recall. In an early study, Kintsch and Keenan (1973) observed an increase in reading times of 1 s per proposition recalled. However, the effects of other potentially important variables were not partialled out in this study. When this is done, estimates of the additional reading time per proposition range from 191 to 238 ms for poor readers and 75 to 122 ms for good readers in the Graesser et al. (1980) experiment. Estimates obtained by Bisanz et al. (1992) for younger readers are consistent with these values: 166 ms per proposition for fifth-grade poor readers, down to 94 ms for seventh-grade good readers.
Of the skills that are important for the organization of the textbase, especially its macrostructure, the ones that have been studied most thoroughly are the role of new propositional arguments in the coherence structure and the reader's use of causal relations in narratives. Kintsch et al. (1975) found longer reading times and poorer recall for paragraphs containing many new arguments than for paragraphs containing few new arguments. For short history and science paragraphs, subjects required on average 1.6 s of reading time per proposition recalled when the paragraphs contained few new arguments, versus 2.5 s per proposition recalled when the paragraphs contained many new arguments. The overall level of recall was 75% for the paragraphs with few new arguments and 58% for those with many new arguments. In other words, subjects read at a fairly constant rate but recalled more from the paragraphs with few new arguments than from the paragraphs with many new arguments. Haberlandt et al. (1986) and others obtained similar results for reading times, and Bisanz et al. (1992) for recall.
A number of studies have pointed out the important role that causal links play in a text. For instance, Trabasso and van den Broek (1985) have shown that for both adults and children recall of a text unit is a function of the number of causal links connecting it with other text units. If stories are analyzed into units that fall on the primary causal chain of states, events, and actions that make up the story and dead-end units that are not located on this causal chain, units on the causal chain are recalled better than units off the chain (e.g., Omanson, 1982; Trabasso & van den Broek, 1985).
Bisanz et al. (1992) have shown that poor readers (fifth- and seventh-grade pupils) employ their causal knowledge in a compensatory manner. The more causal relations there were in a text, the faster poor readers could read it. In contrast, the number of causal relations had no effect on reading times for good readers. Thus, we have here another instance of readers using a top-down process (causal inferences) to compensate for a weakness in other components of the reading process (they are slow at decoding and require more time to form propositions).
Bisanz et al. (1992) also make the interesting observation that different skills are related to reading times and recall. Specifically, reading time is primarily a function of the number of words and the number of propositions in a sentence, whereas recall is a function of the causal links in a text and the number of new arguments, both of which are important for the overall organization of the textbase.
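This division of labor can be illustrated with the kind of additive reading-time model such regression studies estimate: time grows linearly with the number of words and the number of propositions. The default coefficients below are placeholders in the ranges reported above (e.g., 355 ms per word for poor fifth-grade readers, 166 ms per proposition), not fitted values from any study.

```python
# Illustrative additive reading-time model of the kind estimated in
# the regression studies discussed in the text. The coefficients are
# placeholder values in the reported ranges, not fitted parameters.
def reading_time_ms(n_words, n_props, base=0, ms_per_word=355, ms_per_prop=166):
    return base + ms_per_word * n_words + ms_per_prop * n_props
```

In such a model, good readers would simply carry smaller per-word and per-proposition coefficients, while causal-link and new-argument variables would enter a separate equation predicting recall rather than reading time.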
Topic inferences by skilled and less skilled readers have also been investigated in the study by Long et al. (1994) already cited. Their experiment used a design like that of Till et al. (1988): At SOA intervals between 200 and 1,000 ms after reading a sentence, lexical decision responses were made to appropriate and inappropriate topic words. Response times to appropriate topic words were significantly faster than to inappropriate topic words at the 500 ms SOA for good readers, and an even larger difference was observed at the 750 ms SOA, but no differences in the lexical decision times for appropriate and inappropriate topic words were observed for poor readers, even after 1 s. It was not the case that poor readers could not infer the topic of these sentences, however, because when asked to provide topics for these sentences they performed just as well as skilled readers. Thus, they had the knowledge, but what was missing was the automatic retrieval structure necessary to make this knowledge available during normal reading. Referring back to Table 6.1, for good readers, the topic inference belonged to the inference category in cell A of that table (automatic retrieval of bridging information), whereas it fell into cell B (controlled search for bridging information) for poor readers. Or,
Learning from text 287
1 Elaborative recall is that portion of a recall protocol that is left over when verbatim or paraphrased reproductions of the text are deleted.
We prepared two versions of a text, varying their style. In one version, the language was as helpful as we could make it in signaling discourse importance to the reader. The other version was as unhelpful as we could make it and still write an English text. Writing quality had a major effect on reproductive recall, facilitating reproduction about as much as domain knowledge did, but it did not help understanding. Whether or not a text was well written did not have a statistically significant effect on the proportion of erroneous elaborations. Thus, although good writing can help the reader to construct a better textbase that is sufficient for recall, it does not by itself guarantee the deeper understanding that is a prerequisite for learning.
Just how low-ability students go about using their domain knowledge
to achieve good comprehension results was investigated by Yekovich,
Walker, Ogle, and Thompson (1990). One group of their subjects had
high knowledge of football, and one had low knowledge. With texts that
had nothing to do with football, the two groups performed equivalently.
With a football text, the high-knowledge subjects outperformed the low-
knowledge subjects, as in the other studies reported here. The questions
on which the high-knowledge subjects showed the greatest advantage
In the domain of problem solving, one distinguishes routine problem solving, which is only mechanical, from real problem solving, which involves deep understanding. The same difference exists in text comprehension. In most cases, deep understanding and mere memory for the words of a text are intermingled to various degrees, but there are experiments in the literature that represent fairly pure cases of one or the other extreme. One is the well-known study of Bransford and Franks (1971), which can be characterized as all understanding and no memory. In this study, the texts consisted of four simple sentences, such as The ants were in the kitchen. The ants ate the jelly, and so on. These ideas could also be expressed in more complex sentences, such as The ants in the kitchen ate the jelly. Subjects were given several such texts, either in the form of four one-idea sentences, or two two-idea sentences, or one three-idea sentence and one one-idea sentence, and other such combinations. Later, they were given a recognition test consisting of some sentences they had actually read and others they had not seen before. The results of the study were clear: Subjects remembered very well the stories they had read (e.g., they remembered the ants and the jelly and whatever else there was to that text) but did not know which particular sentences they had read. They remembered the meaning of each minitext - a scene, an image, perhaps - but not the way it had been presented verbally. The memory for the actual text they had read was wiped out by heavy interference (the subjects read many sentences, all very similar), but they had no trouble keeping in mind the few simple and distinct situation models they had formed for each of the several texts they had read.
Close to the other extreme - all memory and little understanding - is the study by Moravcsik and Kintsch (1993). Their readers had formed a good textbase but only a very sketchy or erroneous situation model. The subjects in the Bransford and Franks experiment had formed a good situation model but no textbase at all. Given the extreme simplicity and artificiality of the texts used by Bransford and Franks, it seems likely that their subjects merely combined what was in the text into an integrated scene to form a situation model.
The textbase and the situation model are aspects of the same episodic memory trace of a text. We distinguish these aspects for the purposes of scientific analysis because this distinction is often useful in research as well as instruction.
the basis of the comprehender’s knowledge but was not actually stated in
the text).
5. A third distinction that must be kept in mind is how complete and good each of these structures is. One may have a poor textbase (either at the micro- or the macrolevel), perhaps because the text is poorly written or perhaps because the comprehender did not properly encode what was there. One may have a poor situation model, perhaps because the comprehender lacked the required knowledge, perhaps because he or she failed to apply it. A poor textbase may be combined with a poor situation model but may also be associated with a good situation model (as was shown in Figure 4.1 earlier).
6. As long as one tests for information that is directly given in the text,
one is measuring textbases. For instance, in many laboratory experiments
recall is almost purely reproductive, as are summaries. They may involve
some semantic knowledge, as in generalizing a concept, but that still
remains at the textbase level. Recall and summaries are not always repro¬
ductive, of course, and can be perfectly good indicators of well-developed
situation models, when and if they go beyond the text.
As soon as one tests for things not directly in the text, one is testing for
aspects of the situation model. Inference questions and sorting tasks are
obvious examples. In some cases, recall or summaries may also function
as situation model measures to the extent that they go beyond reproduc¬
tion.
7. Bridging inferences often involve merely semantic information; one
does not need a situation model construct for that case. They may, how¬
ever, involve more than that in some texts. Often a comprehender needs
a good model of the situation under discussion in order to make some of
the bridging inferences required by the text.
9.2.2 Learning
Situation models not only integrate the text but integrate the text with
the reader’s knowledge. An example from Kintsch (1994c) makes this
clear. In Figure 9.3 a brief text is shown, together with its propositional
textbase and a situation model in the form of a graph. The textbase con¬
sists of three complex propositions linked by sentence connectives. The
situation model consists of an image (shown here as a graph), plus back-
294 Models of comprehension
Text
When a baby has a septal defect, the blood cannot get rid of enough carbon
dioxide through the lungs. Therefore, it looks purple.
Figure 9.3 A text, its textbase, and situation model. The textbase is printed
in italics and is linked to prior knowledge, forming a situation model. After
Kintsch (1994c).
Recall may be reproductive (e.g., the subject retrieves the textbase that has been formed and reproduces it), or it may be recon¬
structive (e.g., the subject retrieves the situation model that has been
formed and reconstructs the text on this basis). Obviously, the two cases
are not mutually exclusive. Learning from text, according to the defini¬
tion given earlier, requires the formation of a situation model. Thus, the
subject has learned something from the text in Figure 9.3 if the septal gap
is added to the reader’s knowledge about how the blood circulates in the
heart.
From the pairwise comparison judgments, a matrix is obtained showing the rated closeness between all word pairs. Such a
matrix can then be used as the basis for multidimensional scaling (e.g.,
Bisanz, LaPorte,Vesonder, &Voss, 1978; Henley, 1969; Stanners, Price, &
Painton, 1982). A low-dimensional space is generated in which the key
words are embedded. One can then ask whether the space is the same for
the students as for the experts, and whether the location of the key words
in this space is similar for students and experts. This method has been
used successfully a number of times. For instance, the semantic field of
animal names has been scaled in this way, yielding a space with the two
dimensions of size and ferocity, which account for 59% of the variance of
the paired-comparison judgments (Henley, 1969).
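In outline, the scaling step can be expressed as a short program. The following is a minimal classical (Torgerson) multidimensional scaling sketch in pure Python; the four animal names, the 7-point relatedness ratings, and the power-iteration shortcut for extracting one axis are illustrative assumptions, not Henley's materials or method.

```python
# Classical MDS sketch: relatedness ratings -> dissimilarities ->
# double-centered matrix -> dominant axis via power iteration.
# All numbers are invented for illustration.
names = ["mouse", "rat", "lion", "tiger"]
R = [[7, 6, 2, 2],          # rated relatedness: 7 = identical, 1 = unrelated
     [6, 7, 2, 2],
     [2, 2, 7, 6],
     [2, 2, 6, 7]]
n = len(names)
D2 = [[(7 - R[i][j]) ** 2 for j in range(n)] for i in range(n)]

# Double-center the squared dissimilarities: B = -1/2 * J * D2 * J.
row = [sum(D2[i]) / n for i in range(n)]
tot = sum(row) / n
B = [[-0.5 * (D2[i][j] - row[i] - row[j] + tot) for j in range(n)]
     for i in range(n)]

def power_axis(M, iters=200):
    """Dominant eigenvector of M by power iteration (one MDS dimension)."""
    v = [1.0, 0.5, -0.5, -1.0]           # arbitrary start vector
    for _ in range(iters):
        u = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in u) ** 0.5
        v = [x / norm for x in u]
    lam = sum(v[i] * sum(M[i][j] * v[j] for j in range(n)) for i in range(n))
    return [x * abs(lam) ** 0.5 for x in v]

axis1 = power_axis(B)                    # first recovered dimension
```

For these invented ratings the first recovered axis puts the two small animals on one side and the two large, ferocious ones on the other, in the spirit of Henley's size dimension.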
However, these scaling methods are of limited usefulness. The pairwise
comparison method is laborious for the subject and rapidly becomes
impossible to use as the number of keywords increases. Furthermore,
multidimensional scaling methods work with group data, but we often
need to work with data from individual subjects. Most important, how¬
ever, it has become apparent that very few knowledge domains (other
than animal names) are regular and simple enough to be described by a
space of a few namable dimensions.
A more practical method for indirect knowledge assessment has been
developed by Ferstl and Kintsch (in press). It applies to the problem of
measuring the amount of learning that occurs from reading a text. If the
text has an effect on the reader’s memory and knowledge, it should
change the way the reader organizes a knowledge domain, and the change
should be in the direction of the text organization. Thus, we assess a
reader’s knowledge about a particular domain, have the reader study a
related text, and reassess the reader’s knowledge organization to see
whether it has changed in accordance with the text organization.
Ferstl and Kintsch chose a knowledge domain that is quite rich for
most subjects and fairly stereotyped: the birthday party. We then wrote a
story about a somewhat weird birthday party, designed to distort our
readers’ conventional birthday party schema. We assessed this particular
schema both before and after reading the story. The question we were
concerned with was whether the story had an effect on the way subjects
organized the birthday party domain, whether this effect could be attrib¬
uted to reading the story, and whether we could measure this effect with
our procedures.
Figure 9.4 The two top layers in the clustering hierarchy before reading
based on sorting data. Only the two top layers are shown; long words and
compounds are abbreviated. After Ferstl and Kintsch (in press).
The clusters correspond closely to semantic category membership. Indeed, the results are almost too neat, suggesting that
subjects performed some sort of semantic analysis when sorting the key
words. We have previously argued (Walker & Kintsch, 1985) that the sort¬
ing task produces results that are a bit too orderly and logical and do not
necessarily reflect the memory structures that are operative in memory
retrieval. In comparison, the structure derived from the cued association
data (Figure 9.5) is much richer. It still shows much the same clusters as
Figure 9.4, but there is a complex, rich pattern of interconnections evi¬
dent that Figure 9.4 lacks. My guess is that the cued association data more
accurately reflect knowledge organization than the less spontaneous sort¬
ing data do, but whether or not this is so would have to be established
empirically.
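The route from sorting data to a clustering hierarchy can be sketched as follows. The key words, the sorts, and the average-link merging rule are invented stand-ins for Ferstl and Kintsch's materials and analysis; the point is only to show how co-occurrence in subjects' sorts yields a similarity measure and, from it, clusters.

```python
# Sorting data -> similarity -> average-link agglomerative clustering.
# Each subject's sort is a list of groups of key words (hypothetical data).
from itertools import combinations

words = ["cake", "candles", "presents", "games", "guests"]
sorts = [
    [{"cake", "candles"}, {"presents"}, {"games", "guests"}],
    [{"cake", "candles", "presents"}, {"games", "guests"}],
    [{"cake", "candles"}, {"presents", "games", "guests"}],
]

def similarity(a, b):
    """Proportion of subjects who put a and b in the same group."""
    together = sum(any(a in g and b in g for g in s) for s in sorts)
    return together / len(sorts)

clusters = [{w} for w in words]
merges = []                               # records the clustering hierarchy
while len(clusters) > 2:
    # Merge the pair of clusters with the highest average similarity.
    i, j = max(combinations(range(len(clusters)), 2),
               key=lambda p: sum(similarity(a, b)
                                 for a in clusters[p[0]]
                                 for b in clusters[p[1]])
                             / (len(clusters[p[0]]) * len(clusters[p[1]])))
    merged = clusters[i] | clusters[j]
    merges.append(merged)
    clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
```

With these sorts the procedure recovers two top-level clusters, one for the party objects and one for the social activities, analogous to the two top layers shown in Figure 9.4.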
If Figures 9.4 and 9.5 represent our subjects’ birthday party schema,
Figure 9.5 Part of the Pathfinder structure based on the cued association
data. Only the links that were present both before and after reading are
shown. Heavy lines indicate links that were stronger after reading, and thin
lines indicate links that were stronger before reading. After Ferstl and Kintsch
(in press).
how did this structure change after they read the birthday party story?
One obvious way to answer this question is to have subjects re-sort the
key words after reading the story, or ask them again for their cued associ¬
ations, and construct from the resulting data postreading structures to be
compared with Figures 9.4 and 9.5. Ferstl and Kintsch showed that it is
indeed possible to see how these structures changed as a result of reading
and argue that the changes that have taken place can be attributed to the
new relations among the key words that were established by the story the
subjects had read. However, this is a bit like interpreting inkblots, and there is certainly a danger of reading more into the data in accord with one’s expectations than the facts warrant.
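The filtering step behind Figure 9.5 — keep only links present in both the prereading and postreading networks, and mark each as stronger before or after — can be sketched directly. The link strengths below are invented for a few birthday-party key words, not the actual Pathfinder output.

```python
# Compare link strengths before and after reading (hypothetical values).
before = {("cake", "candles"): .9, ("games", "guests"): .7,
          ("cake", "guests"): .2}
after  = {("cake", "candles"): .6, ("games", "guests"): .8,
          ("candles", "guests"): .5}

shared = set(before) & set(after)               # links present in both networks
stronger_after  = {l for l in shared if after[l] > before[l]}   # "heavy lines"
stronger_before = {l for l in shared if before[l] > after[l]}   # "thin lines"
```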
Learning from text 301
reading the text, both immediately after reading and one week later. How¬
ever, the proportion after one week is significantly lower than the pro¬
portion immediately after reading.
The story subjects read thus clearly influenced the pattern of associa¬
tions they produced. Was their knowledge changed thereby, or was what
we see in Figure 9.6 merely due to the influence of the episodic text rep¬
resentation they had formed? We don’t know this, but if episodic mem¬
ory and knowledge are all part of the same dynamic network, this is not a
question we necessarily have to answer. Our subjects had learned some¬
thing new that changed the way they thought about birthday parties,
more strongly right after they read the story than a week later, but even
then some change could still be documented.
The measurement of learning from text, as distinct from text memory,
is probably best approached through a combination of direct and indirect
procedures: questions that require inferencing and problem solving on
the one hand and scaling methods based on sorting or association tasks,
as in Ferstl and Kintsch (in press), on the other. Both approaches have
their limitations, so it is important to obtain results with different meth¬
ods that confirm each other. The experiments discussed in the next sec¬
tion demonstrate the practicality of this approach.
Figure 9.7 Knowledge maps for (a) a high-knowledge subject and (b) a low-
knowledge subject. After Kintsch (1994c).
propositions. Links are drawn between core propositions and their mod¬
ifiers, and whenever propositions are linked by a specific semantic rela¬
tion (like CAUSE, or CONSEQUENCE), or when one proposition is
embedded as an argument of another proposition. An inspection of Fig¬
ure 9.7a shows that this subject knew quite a bit about the heart and that
his understanding about the functioning of the heart was basically cor¬
rect. Compare this with the skimpy knowledge map also shown in Figure
9.7b for a low-knowledge subject.
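The linking rules just described — connect propositions joined by an explicit semantic relation, and connect a proposition to any proposition embedded as one of its arguments — can be sketched as follows. The propositions are simplified stand-ins for those in Figure 9.7, coded as nested (predicate, arguments...) tuples; the coding scheme is an illustrative assumption.

```python
# Build a knowledge map as a graph from propositional notation.
# Propositions are tuples: (PREDICATE, arg1, arg2, ...); an argument
# may itself be a proposition (embedding). Content is illustrative.
P1 = ("PUMP", "HEART", "BLOOD")
P2 = ("FLOW", "BLOOD", "FROM-HEART", "TO-BODY")
P3 = ("ALLOW", "VALVE", P1, P2)          # P1 and P2 embedded as arguments
relations = [("CAUSE", P1, P2)]          # explicit semantic relation

edges = set()
for prop in (P1, P2, P3):
    for arg in prop[1:]:
        if isinstance(arg, tuple):       # embedded proposition -> link
            edges.add((prop[0], arg[0]))
for rel, a, b in relations:              # semantic relation -> link
    edges.add((a[0], b[0]))
```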
Two separate processing simulations of a text about heart disease (not
shown here) were performed, one for the high- and one for the low-
knowledge reader. The assumption was made that whenever a concept
occurred in the text, it retrieved all the knowledge the reader had about
this concept. Thus, the phrase the heart supplies blood to the body retrieved
PUMP[HEART,BLOOD] for the low-knowledge subject, and FLOW[BLOOD, FROM-HEART, TO-BODY] for the high-knowledge subject. Valve retrieved ALLOW[VALVE, PUMP[BLOOD], FLOW[BLOOD, FROM-HEART, TO-BODY]] for the high-knowledge subject and nothing for the low-knowledge subject, and so on. In this manner 15 nodes
from the knowledge map for the high-knowledge subject and one node
from the knowledge map of the low-knowledge subject (the nodes printed
in boldface in Figure 9.7) were added to the simulation at the appropri¬
ate points. It is not necessarily the case that every bit of knowledge a
reader knows that is relevant to a certain text is actually retrieved when
processing this text, but for present purposes this is a reasonable simpli¬
fying assumption.
For the simulation, working memory was set to five atomic proposi¬
tions. Hence, the model reads the proposition(s) from the first sentence,
goes on to the next sentence, and continues until it has read at least five
elements. The model then retrieves from its knowledge base (the maps
shown in Figure 9.7) any related information and adds these elements to
the network. The network in working memory is then integrated. The
most strongly activated proposition in each processing cycle is carried
over to the next processing cycle in the short-term memory buffer.
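The integration step of such a cycle can be sketched as follows: text propositions and retrieved knowledge nodes form a link matrix, activation is spread by repeated multiplication and renormalization until the pattern settles, and the most activated node is carried over. The five nodes and their link strengths are invented for illustration, not taken from the reported simulations.

```python
# One construction-integration cycle: settle activations over a small
# network of text propositions (P1-P3) and knowledge items (K1, K2).
nodes = ["P1", "P2", "P3", "K1", "K2"]
links = {                                 # symmetric link strengths (invented)
    ("P1", "P2"): 0.5, ("P2", "P3"): 0.5,
    ("P1", "K1"): 0.7, ("P3", "K2"): 0.3,
}

def w(a, b):
    if a == b:
        return 1.0                        # self-link keeps nodes active
    return links.get((a, b), links.get((b, a), 0.0))

act = {n: 1.0 for n in nodes}             # start with uniform activation
for _ in range(100):                      # integration: spread and renormalize
    new = {n: sum(w(n, m) * act[m] for m in nodes) for n in nodes}
    top = max(new.values())
    new = {n: v / top for n, v in new.items()}
    if all(abs(new[n] - act[n]) < 1e-6 for n in nodes):
        break
    act = new

carryover = max(act, key=act.get)         # held in the short-term buffer
```

With these links, P1 — connected both to other text propositions and to a knowledge node — ends the cycle with the highest activation and is carried over to the next cycle.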
The results of the simulation (not shown here) were not very different
for high- and low-knowledge subjects so far as text propositions are
concerned. The pattern of activation is similar, and the average activa¬
tion strengths are about the same. However, there are two important
differences. First, because 15 knowledge nodes were combined with
the text structure for the high-knowledge subject but only one for the
low-knowledge subject, the text representation generated by the high-
knowledge subject is solidly anchored to the subjects’ prior knowledge.
This is not the case for the low-knowledge subject. Second, the prior
knowledge that has become incorporated into the text representation for
the high-knowledge subject bridges some coherence gaps in the textbase
of the low-knowledge subject. Where the text is coherent, the textbase for
the low-knowledge subject is also coherent, but where there is a gap in the
text, requiring special knowledge for a bridging inference, the textbase
lacks coherence. This is not the case for the high-knowledge reader,
where gaps in the text can be bridged by prior knowledge.
What empirical predictions do these simulations imply? Behaviors that
depend mostly on the textbase, such as questions about facts explicitly
stated in the text, should not differ much for high- and low-knowledge
subjects. High-knowledge subjects, however, should be better on ques¬
tions requiring inferences or problem-solving tasks. The predictions for
free recall are somewhat ambiguous. The reproductive component should
be about equal for high- and low-knowledge readers, especially for a text
that does not require many bridging inferences, but the former readers
could successfully reconstruct portions of the text on the basis of their
deeper understanding of the situation and hence end up with higher recall
scores overall.
Predictions for the sorting task are illustrated in Figure 9.8. Consider
how subjects with different knowledge would sort the four key words oxy¬
gen, purple, septal defect, and brain defect before reading the text on heart
disease. A subject who knows nothing about septal defects would put the
two defects together and sort the remaining two words into separate cat¬
egories. A more knowledgeable reader, on the other hand, should sort the
first three words together and use a separate category for brain defect. Fig¬
ure 9.8 shows that the simulation makes just these predictions. The figure
shows a segment of the mental representation constructed by the high-
knowledge reader, together with the four key words. An inspection of the
figure shows that there exist pathways with strong links between oxygen,
purple, and septal defect, which therefore would tend to be sorted together,
whereas brain defect is only weakly linked to septal defect. The predictions
are quite different for the low-knowledge reader, however. The low-
knowledge reader’s text representation is the same as that of the high-
knowledge reader, except that it lacks the knowledge component (all the
Figure 9.8 Network formed by four key words from the sorting task, related
text propositions from reading, and related knowledge items. The numbers
refer to the memory strengths of the links. After Kintsch (1994c).
nodes below the lower dashed line in Figure 9.8). It is easy to see that in
this case there are no pathways linking purple with any of the other words.
Hence, after reading the text a low-knowledge subject should keep purple
separate in sorting these four key words but group oxygen and septal
defect, which are linked together on the basis of the text alone. A low-
knowledge subject who had not read the text would group the two defects
only on the basis of their preexperimental association.
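The sorting prediction amounts to a connectivity claim: key words joined by pathways of sufficiently strong links should land in the same group. A minimal sketch, with link strengths invented rather than taken from Figure 9.8, finds the connected components that remain once weak links are discarded.

```python
# Sorting prediction: group key words connected by strong-link pathways.
# Link strengths and the threshold are illustrative assumptions.
links = {
    ("oxygen", "purple"): .6,
    ("purple", "septal defect"): .5,
    ("oxygen", "septal defect"): .7,
    ("septal defect", "brain defect"): .1,   # weak preexperimental link
}
THRESHOLD = .3

def groups(words):
    """Connected components over links at or above THRESHOLD (union-find)."""
    parent = {w: w for w in words}
    def find(w):
        while parent[w] != w:
            w = parent[w]
        return w
    for (a, b), s in links.items():
        if s >= THRESHOLD:
            parent[find(a)] = find(b)
    out = {}
    for w in words:
        out.setdefault(find(w), set()).add(w)
    return list(out.values())

result = groups(["oxygen", "purple", "septal defect", "brain defect"])
```

For the high-knowledge reader's network sketched here, *oxygen*, *purple*, and *septal defect* form one group and *brain defect* stands alone, as the text predicts.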
The most important predictions that can be derived from the simula¬
tion described concern learning from text, that is, the usability of the tex¬
tual information in novel situations, which should be fairly high for the
high-knowledge subject. According to the simulation, there are 12 entry
points that allow access to the textual information from the reader’s
knowledge base — the items in boldface type in Figure 9.7 that form links
between the text representation and background knowledge. Thus, even
at a time after reading when the contextual and temporal cues no longer
permit retrieval of the episodic text structure, it remains still accessible
via the reader’s general knowledge. Indeed, it has become a part of it and
ceased to exist effectively as a separate episodic memory structure.
Learning has occurred. With only one reading, the links may be too weak
to permit reliable retrieval at a later date — typically one does not learn
much from a single reading - but they could be strengthened through
further study. The high-knowledge reader of our simulation is on the way
to learning something he or she did not know before about the heart.
This is not the case for the low-knowledge reader, who in the simula¬
tion forms much the same text structure as does the high-knowledge
reader, except that it is only very weakly linked to his knowledge base.
This reader has stored in his memory the same information as the high-
knowledge fellow student, but the only way to retrieve it is through the
contextual and temporal retrieval cues associated with the episodic text
memory. It is thus inert knowledge - the student knows something but
cannot spontaneously make use of this knowledge in novel tasks because
what is known is not tied in with the rest of his knowledge. The informa¬
tion may be retrievable given a retrieval cue such as “Think of the chap¬
ter on heart disease that you read here the other day,” but once such
episodic retrieval is no longer effective, the text information is lost.
(1) The heart is connected to the arteries. The blood in the aorta is
red.
For a reader who does not know what the relationship between arteries
and aorta is, there will be a coherence problem. Or consider
(la) The heart is connected to the arteries. The blood in the aorta, the
artery that carries blood from the heart to the body, is red.
And we can help with (2) by inserting explicit links between the unknown
terms and what a reader might be expected to know.
addressed. Many years after the war, the people who participated in Brit¬
ton and Gulgoz’s study had very little specific information about the war
and hence found the text hard going. Although they were able to recall
quite a bit from the text and answered more than half of the fact questions
correctly, their performance was poor on the inference questions. In fact,
they really did not understand the text at all. This conclusion follows
from an analysis of the conceptual understanding of the situation as it was
assessed by a key word comparison task.
Britton and Gulgoz (1991) selected 12 key words that were of crucial
importance for the understanding of the text and had students give pair¬
wise relatedness judgments for these key words. They also had the origi¬
nal author of the text plus several other experts on the Vietnam war pro¬
vide relatedness judgments for the same set of key words. The author and
experts agreed quite well among themselves (average intercorrelation r =
.80). But the students who had obtained their information from reading
the text did not agree with the author or the experts at all (average r =
.08). Their understanding of the air war in Vietnam on the basis of read¬
ing this text was quite different from the one the author had intended
(and which the experts achieved from reading the same text). Thus, even
though the students recalled a good part of the text and answered ques¬
tions about it reasonably well, they really did not understand what they
were saying! Britton and Gulgoz also report a Pathfinder analysis of the
key word judgment data that makes clear some of the fundamental mis¬
conceptions of these readers. For instance, whereas the text emphasized
the failure of the operation Rolling Thunder, in the students’ judgments
Rolling Thunder was linked to success, instead. Britton likens this result to
reading the Bible and concluding that the devil was the good guy.
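The key-word comparison analysis reduces to correlating two sets of pairwise relatedness ratings over the same word pairs. A minimal sketch with invented ratings (Britton and Gulgoz used 12 key words, hence 66 pairs; only six are shown):

```python
# Correlate a reader's pairwise relatedness ratings with an expert's.
# Ratings are invented for illustration.
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Ratings for the same six word pairs, in the same order.
expert  = [7, 6, 1, 5, 2, 1]
student = [2, 3, 6, 1, 5, 7]   # a reader whose organization disagrees

r = pearson(expert, student)
```

A reader whose ratings pattern with the experts' yields r near 1; the invented student above, like Britton and Gulgoz's actual readers, does not.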
This dismal performance could be significantly improved by some
rather simple revisions of the original text to make it understandable to
readers without adequate background knowledge. Britton and Gulgoz
(1991) used the Miller and Kintsch (1980) simulation program to locate
all coherence gaps in their original text. Whenever that program encoun¬
ters a coherence gap (no argument overlap between propositions), it stops
and asks the operator to supply a bridging inference. For instance, if in
one sentence North Vietnam is mentioned, and the next sentence begins
with In response to the American threats, Hanoi decided, Britton and Gul¬
goz might have made this sentence pair coherent by adding to the second
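The argument-overlap check that drives this procedure can be sketched directly: flag a coherence gap whenever consecutive propositions share no argument. The propositions below are invented stand-ins for the North Vietnam/Hanoi example, coded as (predicate, arguments...) tuples.

```python
# Flag coherence gaps: consecutive propositions with no shared argument
# require a bridging inference. Propositions are illustrative.
props = [
    ("THREATEN", "AMERICA", "NORTH-VIETNAM"),
    ("DECIDE", "HANOI", "RESPONSE"),       # "Hanoi" shares no argument
]

def coherence_gaps(props):
    gaps = []
    for i in range(1, len(props)):
        if not set(props[i - 1][1:]) & set(props[i][1:]):
            gaps.append(i)                 # operator must supply a bridge
    return gaps

gaps = coherence_gaps(props)
```

Adding a bridging proposition that carries NORTH-VIETNAM into the second sentence (e.g., identifying Hanoi with North Vietnam) removes the gap.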
Why do authors not write fully coherent, explicit texts all the time? The answer is that to
write such a text is an elusive goal; a writer must always rely on the
reader’s knowledge to some degree. There is no text comprehension that
does not require the reader to apply knowledge: lexical knowledge, syn¬
tactic and semantic knowledge, domain knowledge, personal experience,
and so on. The printed words on the page or the sound waves in the air
are but one source of constraints that must be satisfied. The reader’s
knowledge provides the other. Ideally, a text should contain the new
information a reader needs to know, plus just enough old information to
allow the reader to link the new information with what is already known.
Texts that contain too much that the reader already knows are boring to
read and, indeed, confusing (e.g., legal and insurance documents that
leave nothing to be taken for granted). Hence, too much coherence and
explication may not necessarily be a good thing.
Figure 9.13 Sorting scores as a function of prior knowledge for high- and
low-coherence texts. After McNamara et al. (1996).
the low-coherence text. That text was apparently just too difficult for
these subjects, and they could get nothing out of it. However, the con¬
ceptual change induced by the low-coherence text in readers with a good
knowledge background, as indexed by the sorting scores in Figure 9.13,
was almost twice as great as for the high-coherence text. It appears that
the high-coherence text that spelled out everything these readers already
knew quite well induced an illusory feeling of knowing that prevented
them from processing the text deeply. They were satisfied with a superfi¬
cial understanding, which was good enough for recall and answering text-
based questions, but they failed to construct an adequate situation model
combining their prior knowledge with the information from the text.
Hence, their relatively poor performance with the text that should have
been the easiest and most effective text.
The interaction shown in Figure 9.13 has been replicated by McNa¬
mara and Kintsch (1996) with a different text and a different knowledge
manipulation. In this experiment, we used the original and revised ver¬
sions of the Vietnam text of Britton and Gulgoz (1991). The goal of the
experiment was to replicate Britton and Gulgoz but at the same time to
investigate whether their results depended on a low level of background
knowledge, as we have argued. To achieve more variation in background
knowledge among our subjects, we pretrained half of our subjects. These
subjects received a brief, 20-minute history lesson designed to teach them
the knowledge that was needed to understand the original version of the
Britton and Gulgoz text - a few facts about the geography and history of
Vietnam and about the role of the United States and its South Vietnamese
allies in the war. Perhaps not surprisingly, this minilesson was not very
effective. There was a considerable overlap in scores on the preknowledge
test for subjects with and without pretraining. Hence, not all of our pre¬
trained subjects had become high-knowledge subjects, and the division
between high- and low-knowledge subjects had to be made strictly on the
basis of the knowledge pretest, without regard to the subject’s pretrain¬
ing condition.
The results of this study were similar for both recall and questions
(text-based questions and bridging inferences). Subjects who read the
revised, coherent text performed better than subjects who read the orig¬
inal text, and high-knowledge subjects did better than low-knowledge
subjects. There were no significant interactions. Thus, these results repli¬
cate both Britton and Gulgoz (1991) and Figures 9.10 and 9.11. The cru-
cial results, however, are the sorting data (no problem-solving questions
were given in this study), which are shown in Figure 9.14. Obviously, the
same kind of interaction between prior knowledge and the coherence of
the text was obtained as in McNamara et al. (1996) (Figure 9.13). For
low-knowledge subjects, the original low-coherence text is quite ineffec¬
tive, as is the revised, high-coherence text for high-knowledge subjects.
However, if readers who know very little about Vietnam read the coherent, well-written text, their sorting is influenced by it. Similarly, if the better-informed readers are given the more challenging text, the text
clearly influences the way they sort the key words.
These results are paradoxical. Should we start writing incoherent texts
and give disorganized lectures so that our better students will benefit
from them? The answer to this question seems to be a qualified “yes.”
Making things too easy for a student may be a significant impediment to
learning. However, just messing up a lecture is not a solution. Instead, we
need to challenge the student to engage in active, deep processing of a
text. This can be done, as we have shown here, by placing impediments in
the path of comprehension, but impediments of the right kind and in the
right amount. They must be impediments we have reason to think the
student can overcome with enough effort, and the activity of overcoming
Figure 9.14 Sorting scores as a function of prior knowledge for high- and
low-coherence texts. After McNamara & Kintsch (1996).
[Figure: Proportion correct on old and inference questions for the same-structure and different-structure conditions.]
the advance information and the text were organized differently. The for¬
mation of a textbase was presumably facilitated because the macrostruc¬
ture they had formed for the advance organizer provided a fitting schema
for the text itself. When this was not the case, the textbase was not as per¬
fect, and behavioral measures depending primarily on the textbase were
lower. However, their situation model was enhanced because in under¬
standing the target text, they had to retrieve and integrate information
from the advance organizer that had presented the information in a dif¬
ferent context. Therefore, a richer, more interrelated network was con¬
structed, which later helped the readers with inference questions and
problem solving (see also Mannes, 1994).
Making comprehension difficult for students is an effective technique
to foster learning but only up to a point. If we make comprehension so
difficult that students cannot succeed, nothing positive is achieved. E.
Kintsch (1990) showed this very nicely in a study comparing the ability
of sixth-grade students, tenth-grade students, and college students to
summarize a text. The text was a descriptive one, comparing two countries on
a variety of dimensions. In one version, the text had a well-ordered and
clearly signaled macrostructure; in a second version, the content was the
same but was presented in a disorganized, though locally coherent way.
Figure 9.16 shows some of the results of this study. An ideal summary
should consist entirely or mostly of sentences corresponding to macro¬
propositions expressed in the text or inferred. By that standard, sixth-
grade students write pretty poor summaries, though the quality of the
summaries improves considerably with age. The important point is not
this general improvement, however, but the interaction between the age
of the students and the quality of the text they read. The college students
actually wrote better summaries when they received the poorly organized
text, paralleling the findings of some of the other studies reviewed here.
For the sixth-graders, however, the poorly organized text was too diffi¬
cult, such that they performed only about half as well with the poorly
organized text as with the well-organized text.
Thus, there is evidence now from a number of studies from our labo¬
ratory as well as elsewhere that increasing the difficulty of the learning
phase can have beneficial effects. Moreover, these studies are not limited
to learning from text but involve other kinds of learning situations as well.
Task difficulty can stimulate active processing, with the result that a more
elaborate, better integrated situation model will be constructed. As we
[Figure 9.16: Quality of summaries as a function of age for the well-organized (Good Text) and disorganized (Poor Text) versions.]
have seen, this statement must be carefully qualified. The student must
have the necessary skills and knowledge to successfully engage in the
required activity, and the activity must be task relevant.
Studies like McNamara et al. (1996) and Voss and Silfies (1996) show how
important it is to provide a learner with the right text, not too easy and
not too hard. But how can that be done? To perform a theoretical analy¬
sis with the Cl model of both the student and the text as was done in sec¬
tion 9.4 is clearly impractical. Of course, good teachers and friendly
librarians who know their students and their books do so intuitively all the time, and often quite successfully. Our goal is to individualize learning,
yet teachers cannot always be there to help, nor can they know everything
that is available, and large electronic databases just don’t come with a
friendly librarian.
Kintsch (1994c) has hypothesized the existence of “zones of learnabil-
ity,” in analogy with Vygotsky’s “zones of proximal development” (Vy¬
gotsky, 1986). If a student’s knowledge overlaps too much with an
instructional text, there is simply not enough for the student to learn
from that text. If there is no overlap, or almost no overlap, there can be no
learning either: The necessary hooks in the students’ knowledge, onto
which the new information must be hung, are missing.
One way to determine whether a particular text falls within a student’s
zone of learnability is to represent both the student’s knowledge and the
text as vectors in the semantic space constructed by LSA and determine
how far apart they are (see section 3.3 on LSA). Wolfe, Schreiner, Rehder,
Laham, Foltz, Landauer, and Kintsch (in press) have investigated the
potential of this idea.
Wolfe and his co-workers used the heart as their knowledge domain.
They first assessed what average college students knew about the func¬
tioning of the heart and then developed a 40-point questionnaire to test
for this knowledge. This questionnaire was used both as a measure of the
students’ background knowledge and, after instruction, as a measure of
their learning. Specifically, if a student scored x points before instruction
and y points after, the proportion of improvement was used as a measure
of learning:
learnquest = (y - x) / (40 - x)
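The learning measure just defined, together with the cosine comparison used for LSA vectors, can be expressed in a few lines. The scores and the three-component vectors below are invented for illustration; actual LSA vectors have a few hundred dimensions.

```python
# Proportion-of-possible-improvement learning score, plus the cosine
# used to compare LSA vectors. All values are illustrative.
from math import sqrt

def learn_score(pre, post, maximum=40):
    """Improvement achieved as a proportion of the improvement possible."""
    return (post - pre) / (maximum - pre)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

score = learn_score(pre=10, post=25)       # (25 - 10) / (40 - 10)
essay, text_c = [1.0, 2.0, 0.0], [2.0, 3.0, 1.0]   # hypothetical vectors
sim = cosine(essay, text_c)
```

A student who scores 10 before and 25 after instruction has achieved half of the possible improvement; the cosine between a pretest essay vector and a text vector serves as the LSA-derived knowledge measure described below.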
one in our informal judgment. We assumed that these two texts would be
approximately appropriate for the majority of our subjects. The fourth
text, Text D, was from a medical journal and was definitely too hard for
most or all of our subjects. All four texts were concerned with the func¬
tioning of the heart, and although their contents were by no means iden¬
tical in detail, they were all intended to describe, at rather different levels
of sophistication, how the heart functions. We wanted to find out how well
prior knowledge predicts what students learned from these texts and
whether we could use LSA for these predictions.
Latent semantic analysis was used in two ways: to represent the four
instructional texts as vectors in the semantic space, and to represent the
essays each student wrote before as well as after instruction as vectors in
the same space. A new semantic space was constructed for this purpose
that dealt specifically with knowledge about the heart. Thirty-six articles
of an encyclopedia that dealt with the heart were used to construct this
space. A comparison of the cosines among the four text vectors confirmed
our intuitions that the texts were ordered from easiest to most difficult in
the order A < B < C < D. In each case, the cosines between nearer texts
were higher than the cosines between more distant texts.
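The comparison step itself is a cosine between vectors and can be sketched as follows. The vectors here are hypothetical three-dimensional stand-ins for the four heart texts; real LSA vectors come from a singular value decomposition of a word-by-document matrix and typically have a few hundred dimensions:

```python
import math

def cosine(u, v):
    """Cosine of the angle between two semantic-space vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical stand-ins for the four texts, ordered A < B < C < D:
texts = {"A": [1.0, 0.2, 0.0], "B": [0.8, 0.5, 0.1],
         "C": [0.5, 0.8, 0.3], "D": [0.1, 0.6, 0.9]}

# Adjacent texts should be closer than distant ones:
print(cosine(texts["A"], texts["B"]) > cosine(texts["A"], texts["D"]))  # True
```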
Student scores on the two measures of knowledge, the questionnaire
and the essay, were well correlated, r = .74. Furthermore, if we assume
that Texts B and C are standard texts, in the sense that their content is what
college sophomores should know about the heart, the cosine between a
subject’s pretest essay and one of these texts provides another knowledge
measure - how distant is that student’s essay from where it should be in
the LSA space? The fact that this third, LSA-derived measure of prior
knowledge also correlated quite highly with either the question-based or
the essay-based knowledge measures shows that we are dealing with a
rather orderly data set and that LSA may be the right measuring stick for
our purposes. Specifically, the correlation between the cosine for a stu¬
dent’s pretest essay and Text C and scores on the questionnaire and essay
grades were r = .64 and .69, respectively.
The main questions of interest were, however, whether prior knowl¬
edge predicted learning and whether there was any evidence for a non¬
monotonic relationship, as implied by the hypothesis of the zones of
learnability. According to that hypothesis, the relationship between prior
knowledge and learning should depend on the difficulty of the text.
Specifically, students should not learn much from the too-easy Text A
unless they really know nothing about the heart; nor should they learn
much from the too-hard Text D unless they already know a lot.
We have a restricted range problem in this study, however. All our col¬
lege sophomores knew a little bit about the heart, but there were very few
really high-knowledge people, so we really did not have a text that was too
easy. We therefore ran a fifth group of subjects, a group of medical school
students, who were given the easy Text A to study. The results are shown
in Figure 9.17. Figure 9.17 shows the mean improvement in learnquest and
learnessay as a function of the average cosine between the texts that stu¬
dents read and their pretest essays. Including the medical students in
Figure 9.17 clearly brings out the hypothesized nonmonotonic relation¬
ship between prior knowledge and amount learned. It is particularly
noteworthy that two very different methods of measuring learning - by
questionnaire and by essay grades — yielded comparable results.
According to Figure 9.17, a great deal can be gained by assigning stu¬
dents the right text for learning. We could easily have doubled the learn¬
ing scores for the college students who read Text D, as well as improved
the scores of students who read Text A, simply by giving them either of
the other two texts! Wolfe et al. (in press) made some calculations that
suggest that by assigning individual students to texts on the basis of their
background knowledge, learning scores could have been improved by
more than 50%. Even if in practice these calculations turn out to be opti¬
mistic, such a potential can hardly be ignored.
The complex interaction between learning, text characteristics, and
background knowledge certainly deserves more research. However,
Wolfe et al. (in press; see also Rehder, Schreiner, Wolfe, Laham, Lan-
dauer, & Kintsch, in press, for a more detailed analysis of the LSA meas¬
ures that can be used to predict learning) have shown one thing quite
clearly: LSA predicts learning about as well as the more cumbersome
empirical measures (essays have to be graded, questionnaires require
careful design), thus giving us a powerful research tool and perhaps even
a practical tool that could be used to match students with the texts they
could most profit from.
Learning from text, like all learning, involves the construction or modifi¬
cation of situation models. Hence, the first educational implication of the
work discussed here is that if we are interested in learning, we must make
sure that our measures are sensitive to learning — that is, they must reflect
properties of the situation model rather than merely the textbase. Such
measures are inference questions and problem-solving tasks, and, at least
for laboratory studies, keyword sorting and cued association tasks. Recog¬
nition and fact questions, in contrast, may merely reflect understanding at
the level of the textbase. Recall and summarization are intermediate in this
respect because reproductive recall can be quite good even in the absence
of an appropriate situation model, whereas the reconstructive component
of recall can be more indicative of the nature of the learner’s situation
model. In the studies we have reviewed, textbase and situation model mea¬
sures frequently gave different results. Failure to distinguish between
measures of learning and measures of memory invites theoretical confu¬
sion and cannot produce educationally satisfactory results. Many educa¬
tors have stressed the dangers of superficial understanding and the need
for engaging students in deeper understanding. I hope that by tying the
commonsense terms deep and superficial understanding into the theory of
Word problems
People spend a great deal of time reading, some much more than
others. The amount of time spent in leisure reading, according to
one study (Guthrie & Greaney, 1991), is 63 minutes per day, 39
minutes for newspapers and 24 minutes for books. These times
are medians of highly skewed distributions and they are probably
underestimates because a substantial part of leisure reading is sec¬
ondary to some other activity (cooking, planning a vacation, wait¬
ing in a doctor’s office) and hence is underreported. In addition,
the median estimated time people spend in occupational reading
amounts to another 61 minutes (much more for professional
workers). Once again, it is hard to get accurate reports, because
so much of occupational reading is embedded in another activ¬
ity. Scanning a production schedule tends not to be reported as
reading.
It is clear from these numbers that reading takes a consider¬
able portion of our time, even if we live outside a university and
even if much of the reading we do occurs in the service of some
other activity. Comprehension, memory, and learning are the key
concepts when we look at reading for its own sake for relaxation
or acquiring knowledge. But we need to broaden our approach
when looking at reading in the service of other goals, such as
solving a mathematical problem or performing some action.
skills. Most students profess to hate them and find them confusing. They
know the math but not whether to add or subtract or what to multiply
with what. Teachers are challenged to make their students understand
what they are doing, instead of applying rules mechanically.
To the psychologist, word problems provide some wonderful research
opportunities. Not only are they texts, so that one can recall and summa¬
rize them just like stories, but one can also observe the student’s problem
solving and the eventual answer. What the student remembers and what
the student does are related in informative ways and mutually constrain
each other. Thus, a richer set of experimental observations can be obtained
than in other text research.
Most efforts to study how children learn to solve word arithmetic prob¬
lems have focused on the formal aspects of the problem. Thus, the first
models of word arithmetic problem solving by Riley, Greeno, and Heller
(1983) and Briars and Larkin (1984) had no explicit language-processing
component and instead dealt with only the problem representations the
children constructed, the inferences they made, and the counting opera¬
tions they used. However, the language used to express a problem makes
a difference. A well-known example of how important the language can be
has been provided by Hudson (1983). Only 39% of Hudson's first-grade
students were able to solve the following problem:
Kintsch and Greeno (1985) argued that word problem solving involves a
text understanding phase that yields the kind of mental representation
needed by the second problem-solving phase to compute an answer.
Many texts have a conventional structure that readers know and use to
organize their mental representation of the text. Thus, stories are orga¬
nized by the story schema and legal briefs by an argumentation schema.
Word problems have their own schemas that have to be learned in school,
recognized as relevant when the problem is being read, and used as an
organizational basis for the text. The schemas are the structures identi¬
fied and described by Riley et al. (1983): the Transfer schema, the Part-
Whole schema, and the More(Less)-Than schema in their various ver¬
sions. They are all based on the notion of a set. A set is a schema, too, a
frame with slots to be filled by particular quantities and objects:
SET
object:<count noun>
quantity:<number, some, many>
specification: <owner, location, time>
role:<start, transfer, result;
superset, subset;
large-set, small-set, difference-set>
Thus, the sentence Joe had 5 marbles yields a set with Joe as the owner, 5
as the quantity, and past as specification. Its role depends on the problem
context. For example, it might be the start-set of a Transfer problem (He
gave away 3 marbles. How many marbles does he have now?), or the subset
of a Combine problem (Jack had 3 marbles. How many do they have alto¬
gether?), or the large set of a Compare problem (He had 2 more marbles
than Jack. How many marbles did Jack have?)
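The SET frame can be rendered as a small data structure (a sketch; the slot names follow the frame above, while the class name and field layout are ours):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SetSchema:
    """A frame with slots, after Kintsch and Greeno (1985); the role
    slot is filled only once the problem context is known."""
    obj: str                    # count noun, e.g. "marbles"
    quantity: Optional[int]     # a number, or None for "some"/"many"
    owner: str                  # specification: owner
    time: str = "past"          # specification: time
    role: Optional[str] = None  # start, transfer, result, subset, ...

# "Joe had 5 marbles" yields a set whose role is still open:
s1 = SetSchema(obj="marbles", quantity=5, owner="Joe")
print(s1.role)  # None -- assigned later, e.g. start-set of a Transfer problem
```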
Riley, Greeno, and Heller (1983) classified word arithmetic problems
into 14 types in three main classes, as shown in the following list:
• The 3 marbles in the first sentence provide the cue for making a set S1
of 3 marbles, with owner Joe in the past, role unknown.
• Similarly, a second set is formed on the basis of the next sentence,
the objects being marbles, quantity some, and specification that Joe is
the owner, at a later time. Furthermore, give cues the Transfer schema,
with the result that S1 can be assigned the role of start-set and S2
the role of transfer-set. An expectation is created for the missing result-
set.
• It is filled by S3, which is created on the basis of the third sentence: 8
marbles, owned by Joe now, the result of the transfer process.
• The final sentence merely asks to calculate the as yet unspecified quan¬
tity of S2. This is done by using a counting procedure called Add-On,
as is typical for first-grade students.
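The Add-On procedure itself can be sketched as counting up from the start-set (a schematic rendering; the text names the procedure, but this particular formulation is ours):

```python
def add_on(start: int, result: int) -> int:
    """Count up from the start-set until the result-set size is
    reached; the number of count steps is the unknown transfer-set."""
    count, steps = start, 0
    while count < result:
        count += 1
        steps += 1
    return steps

# Joe had 3 marbles ... now he has 8: the transfer-set must be 5.
print(add_on(3, 8))  # 5
```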
• S1: objects: marbles
quantity: 8
specification: owner Joe
role: unknown
• S2: objects: marbles
quantity: 5
specification: owner Joe
role: difference-set
What is it that children learn when they solve word arithmetic problems?
According to the logicomathematical development view, which has its
roots in the work of Piaget, children learn to solve certain kinds of prob¬
lems by acquiring the conceptual knowledge necessary for these prob¬
lems. This is the view taken by, among others, Riley et al. (1983) and Bri¬
ars and Larkin (1984). Riley et al. (1983), for instance, hypothesize that
the problem solving of good students involves complete schemas such as
those we have described, whereas poor problem solvers work with impov¬
erished schemas that represent individual sets but not the relations
among the sets.
Alternatively, one might hypothesize that the children have the neces¬
sary formal knowledge but that certain difficult problems employ lin¬
guistic forms that are unfamiliar to the children. This linguistic develop¬
ment view contrasts with the logicomathematical hypothesis because it
places the source of difficulty not in the children’s deficient conceptual
structures but in their lack of linguistic knowledge. The child may have
an adequate understanding of the Part-Whole schema, for example, but
might not know how to map a phrase like How many more Xs than Ys?
onto that schema.
Cummins et al. (1988) have shown that one can decide among these
competing hypotheses by considering the way errors occur in the simu¬
lation and comparing these errors to the ones children make. The trick is
HAVE[AND[MARK,SALLY], SEVEN[TRUCKS]].
As a consequence, the system will create 2 sets of 7 trucks - one for Sally
and one for Mark. The quantity of trucks in Mark’s set gets updated to 2
by the second sentence, and the problem becomes a superset problem. In
answer to the question, the simulation uses its “Already-know-the-
answer” strategy and responds with 7.
Real children make this sort of error 45% of the time. The children
also recall the problem as a Combine-1 problem:
This type of misrecall occurs on 74% of the trials on which the children
made a superset error, whereas it was rare when other types of errors
occurred, and it did not occur at all when the problem was solved cor¬
rectly. Figure 10.1 shows the recall data, including a control group of sub¬
jects who read the problem but did not solve it.
Compare-5 problems were also a difficult problem type for our sub¬
jects. Errors occurred in our data again on 32% of all trials; when an error
occurred, either the difference set was given as the answer (32%), or the
two numbers in the problem were added (44%). The simulation yields
both of these error types if we lesion a key piece of its linguistic knowl¬
edge. Have-more-than requires a complex linguistic construction, involv¬
ing three sets: a large-set, a small-set, and a difference-set. If it is simply
parsed as more-than, the simulation makes the same errors that the chil¬
dren do:
This incorrect parsing does not yield a well-structured problem, but nei¬
ther the children nor the simulation is at a loss. The children use one of
their default strategies to come up with a reasonable answer. Two of these
default strategies apply: the “Already-know-the-answer” strategy yields
the difference set error - Tom has 5 sticks. The “Addition” strategy based
on the keyword more yields the addition error - Tom has 13 sticks. These
two errors in fact accounted for 76% of all errors the children made. Fur¬
thermore, the way the children recalled the problem correlated highly
with the way they solved it. If they made an addition error, the most fre¬
quent recall pattern was
Figure 10.2 shows the frequency with which this pattern of recall was
observed for Compare-5 problems when subjects made an addition error,
when they solved the problem correctly, and when they only read the
problem without solving it in a control condition.
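The two default strategies can be sketched schematically (the strategy names follow the text; the dispatch logic is a simplification we introduce for illustration, not the simulation's actual rule):

```python
def default_answer(known_quantities, keywords):
    """Fallback strategies children use when parsing fails to yield a
    well-structured problem (schematic):
    - "Addition": the keyword 'more' triggers adding the given numbers.
    - "Already-know-the-answer": repeat a quantity already given."""
    if "more" in keywords:
        return sum(known_quantities)   # addition error, e.g. 8 + 5 = 13
    return known_quantities[-1]        # difference-set error, e.g. 5

# Joe has 8 sticks and 5 sticks more than Tom. How many has Tom?
print(default_answer([8, 5], ["more"]))  # 13 (the addition error)
```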
The subjects’ tendency to simplify problems and solve the simplified
problem rather than the original one does not always result in errors. For
instance, the second line of the Compare-5 problem exemplified in (7) is
rather clumsily worded. The real structure of the problem is expressed
much better by Tom has 5 sticks fewer than Joe. Here the fewer directly keys
subtraction, and the owner of the set is kept in subject position. The
with real goals and motivations for the actors. The simulation that was
developed was also given the new problem set to solve. Apart from telling
it the meanings of several new words, the simulation was not changed at
all. Nevertheless, it managed to solve thirteen of the complex problems
correctly. It also accounted quite well for the kinds of errors children
made in solving these problems when the simulation’s linguistic knowl¬
edge was degraded. Degrading its mathematical knowledge, in contrast,
yielded errors quite unlike those the children made. The simulation failed
to solve three of the problems. The reason for this failure was the same in
each case: It did not have enough linguistic or general world knowledge
to understand the situation described in the story. In one case, for
instance, it failed to connect a reference to some pencils a boy gave his sis¬
ter with a later reference to the 5 pencils he gave her. In another case, it did
not know that trade implies a switch of ownership.
What we see here is the central importance of situational understand¬
ing in word problem solving. The reader must have a correct and com¬
plete situation model of the various actions, objects, and events described
in a story problem. In the textually simplest arithmetic problem, this
amounts to little more than learning how to use correctly a few key words
and phrases. As problem texts get richer, however, all kinds of linguistic
and world knowledge may be required for the construction of an adequate
situation model. Familiarity with the overall domain becomes very
important, as we have already seen from the example of Hudson (1983).
Stern and Lehrndorfer (1992) provide some nice experimental demon¬
strations that problems of a certain type can be solved by children at a cer¬
tain age when the problems are embedded in familiar contexts, but not
otherwise.
Reusser (1989) has stressed most emphatically the importance of situ¬
ation models in word problem solving. Reusser developed a simulation
model that extends the Cummins et al. (1988) model by building in an
explicit situation model. This was a significant innovation in the evolu¬
tion of such models. In the past, theorists of word problem solving
attempted to go directly from the text to the equation (and instructors
stressing key-word strategies followed suit). Kintsch and Greeno (1985)
and Cummins et al. (1988) presented a more complex model in which text
understanding is interwoven with the construction of a mathematical
problem representation (in terms of sets and their interrelations), from
The Kintsch and Greeno (1985) theory and the simulations based upon
it (Cummins et al., 1988; Reusser, 1989) are top-down schema theories.
Problems are understood because cues in the text lead to the activation of
an appropriate arithmetic schema around which the text is organized and
on which solution procedures can be based. If the right schema is acti¬
vated, the problem will be solved correctly. This model has the same dif¬
ficulties as all top-down schema theories, in that the cues that tell a stu¬
dent which arithmetic schema is needed may be subtle, complex,
multiple, unreliable, and even contradictory. Instead of jumping to per¬
haps premature conclusions and committing oneself to a single hypothe¬
sis about the nature of a word problem, it might be better to consider all
the evidence for several alternatives in parallel in the context of the story
as a whole. That is what the Cl architecture is designed to do, and Kintsch
(1988) has shown that it is quite possible to embed the Kintsch and
Greeno theory within the Cl framework.
Which version of the theory is the better one, however? As we have just
seen, quite a bit of productive work could be done with the top-down ver¬
sion. Nevertheless, there are reasons to prefer the Cl architecture in this
case, too. If one just looks at the end result, both versions of the theory
achieve identical outcomes. They differ in the way a schema is selected.
In the earlier version, the organizing schema is selected on the basis of a
cue or cues in the text, which then controls the rest of the process in a
top-down fashion. In the Cl version, evidence for various alternative
schemas is collected in parallel, and the schema that best satisfies the mul¬
tiple constraints of a problem eventually wins out. The fact that there are
multiple or even contradictory cues is easily incorporated in the Cl ver¬
sion of the theory. Thus, experimental evidence that during the course of
reading a word problem subjects consider more than one alternative, and
that they combine multiple cues, would support the Cl version of the
theory and contradict or at least require a revision and elaboration of the
single-schema version.
Such evidence has been obtained in an experiment by Lewis, reported
in Kintsch and Lewis (1993). Lewis asked subjects to make a choice
between two alternative hypotheses about a problem, addition or sub¬
traction, at several query points during reading the problem. Thus, the
time course of hypothesis formation could be recorded. The problems
she used contained two cues, the overall structure of the problem and a
keyword favoring either addition or subtraction.
Prototypical problems used by Lewis — four versions of Compare
problems from Table 10.1 - are shown in Table 10.2. The keywords in the
example in Table 10.2 are taller and shorter. Taller is associated with more
and hence with addition; shorter is associated with less and hence with
subtraction. Lewis used 16 problems of this kind with different keyword
pairs, such as fast-slow or hot-cold. The force of these keywords was
either consistent with the nature of the problem, or inconsistent. Lin¬
guistically, the two comparative terms are not equivalent. Taller is the
default term, which can be used without particular connotations, as in
How tall is Jeff? Shorter is the marked member of the pair. Its use is not
neutral. How short is Jeff? implies that Jeff is short. One would therefore
expect that unmarked comparative terms would influence the course of
hypothesis formation little or not at all; marked terms, however, should
have a stronger effect, because normally speakers use such terms only if
they have something special in mind.
Lewis’s data for 144 college students are shown in Figures 10.4 and 10.5
for addition and subtraction problems, respectively. These figures show
the percentage with which subjects chose the correct hypothesis about the
appropriate mathematical operation at three query points located at the
end of each of the three sentences of the word problems. Logically, sub¬
jects should guess randomly after the first sentence and should respond
correctly at the end of the second sentence. The final question is logically
redundant. The keyword should be disregarded, and hence the consistent
and inconsistent problem version should be equivalent. This is not what
happened. Performance was imperfect after the second sentence and gen¬
erally improved after the third sentence. When the keyword was consis¬
tent with the problem structure, the correct hypothesis was chosen earlier
and with a higher final accuracy. Inconsistent keywords depressed both
the rate and asymptote of performance. There was a significant interac¬
tion between consistency and markedness. Linguistically marked com¬
Pr(ADD) = s(ADD) / [s(ADD) + s(SUBTRACT)]
where s(ADD) and s(SUBTRACT) are the strength values of these pro¬
cedures determined by the spreading activation mechanism of the Cl
model. The strengths of these procedures depend on how much support
they receive from their associated schemas, ADD-SCHEMA and
SUB-SCHEMA, as well as from the more- or less- keywords. The schemas in
turn are supported by the text. Text propositions favoring either the
ADD-SCHEMA or the SUB-SCHEMA are linked to the respective
schema nodes. Text propositions, such as those derived from the first sen-
tence of these problems, which do not indicate the problem structure, are
linked equally to both hypotheses, although more weakly (Figures 10.6a,
10.6b, and 10.6c).
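The choice rule is a simple ratio of strengths (a direct rendering of the equation above, using the activation values the text reports for the first query point):

```python
def pr_add(s_add: float, s_subtract: float) -> float:
    """Probability of choosing the addition hypothesis, given the
    strengths spreading activation assigns to the two procedures."""
    return s_add / (s_add + s_subtract)

# With the settled activations reported for the first query point:
print(round(pr_add(0.58, 0.47), 2))  # 0.55 -- a modest addition bias
```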
Figures 10.6a, 10.6b, and 10.6c provide a processing trace of the model
for the inconsistent addition problem. The parameter values used were
estimated by informal exploration to maximize the goodness of fit of the
model to the data in Figures 10.4 and 10.5. Specifically, links among text
propositions and links among schemas and procedures were assigned
a value of 1. Links between text propositions and the schema they sup¬
ported were given a value of .8. Neutral propositions were linked to both
schemas with a value of .4. Marked keywords were linked directly to the
procedure with a value of .4. All other links were zero. Each node was
linked to itself with a value of 1, except for the ADD procedure, which
was assigned a value of 1.2, which is the model’s way of accounting for the
observed response bias in favor of addition. Intuitively, these parameter
estimates seem reasonable in this problem context.
The first sentence, shown in Figure 10.6a, expresses the proposition
TALL[TOM] with the modifier 175CM. These nodes are linked to both
schemas, which in turn are linked to their corresponding procedures. The
network settles with the activation values shown. The probability of
choosing the correct hypothesis (ADD) at this query point is .58/(.58 +
.47) = .55; that is, the model starts out with a modest addition bias.
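The settling process can be illustrated with a toy five-node network. This is a hypothetical miniature, not the Cl simulation itself: one neutral text proposition, the two schemas, and the two procedures, with link values following the parameter estimates above (self-links 1, ADD-procedure self-link 1.2, neutral proposition linked .4 to each schema, schema-to-procedure links 1). Repeated multiplication with renormalization stands in for the spreading-activation cycle:

```python
# Nodes: 0 TALL[TOM], 1 ADD-SCHEMA, 2 SUB-SCHEMA, 3 ADD proc, 4 SUB proc
W = [
    [1.0, 0.4, 0.4, 0.0, 0.0],  # neutral proposition: .4 to both schemas
    [0.4, 1.0, 0.0, 1.0, 0.0],  # each schema linked 1 to its procedure
    [0.4, 0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.2, 0.0],  # ADD proc self-link 1.2 (addition bias)
    [0.0, 0.0, 1.0, 0.0, 1.0],
]

a = [1.0] * 5
for _ in range(50):  # spread activation until the network settles
    a = [sum(W[i][j] * a[j] for j in range(5)) for i in range(5)]
    m = max(a)
    a = [x / m for x in a]  # renormalize after each cycle

s_add, s_sub = a[3], a[4]
print(s_add > s_sub)  # True: the self-link bias tips the choice toward ADD
```

The asymmetry of a single link weight (1.2 versus 1.0) is enough to break the tie between two otherwise identical branches, which is how the model captures the observed response bias in favor of addition.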
The second sentence, in Figure 10.6b, adds another proposition with
a modifier,

SHORTER[TOM,JEFF] (modifier: 12CM),
which further strengthens the add procedure (it supports the ADD-
SCHEMA in the same way as the previous sentence) but succeeds in lift¬
ing the probability of choosing the correct procedure to only .65. This far
from perfect performance level attests to the strong interference pro¬
duced by the marked keyword and corresponds to the actual performance
of Lewis’s subjects.
Figures 10.4 and 10.5 show the predictions for all four problem types
(Add, Subtract, Consistent, & Inconsistent) used in the Lewis experi¬
ment. It is obvious that the predictions agree rather well with the data.
Indeed, a chi-square goodness-of-fit test yields a nonsignificant value,
χ2(9) = 18.75, which is quite impressive considering (1) that each of the 12
data points was based on 576 observations so that the goodness of fit test
is very sensitive, and (2) that we have estimated only three parameters.
The success of the Cl model in accounting for these data does not
mean that alternative models are wrong. However, it is worth noting that
time course predictions for the formation of hypotheses could be
obtained with the Cl model without any additional assumptions, just by
running the model sentence by sentence. Also, key word effects and bias
parameters could be incorporated in a natural way. That is not to say that
alternative models could not be similarly elaborated, although for single¬
schema, top-down control models some rather arbitrary assumptions
would be needed to account for the time course data as well as the key¬
word effects.
What intellectual gain has been achieved by modeling these data? If
one wants to know about the time course of hypothesis formation, key¬
word effects, or response biases, one merely has to look at the data to find
out. The simulation does not tell us anything about these matters that the
data did not already reveal. The primary value of the simulation is that it
allows us to understand the particular results of this study within a broad,
overarching framework that ties together many different observations
about language understanding and problem solving. It lets us see these
results as pieces of a much larger puzzle, as systematic observations
related to a host of others.
There may also be some more immediate gain. As Kintsch and Lewis
(1993) have suggested, if we want to find out about inconsistency effects
in Change problems, we would have to run another expensive experi¬
ment. Or we can boldly calculate. According to our simulation, the fol¬
lowing problem should have a solution probability of .88.
what the next generation model for word problem solving should look
like. It incorporates the work discussed so far but turns further away from
the abstract schema concept and toward the notion of situated cognition.
The Kintsch (1988) formulation corresponds to the performance of
children who have had a great deal of experience with these particular
word problems. However, a schema-based problem model is the end
product of learning to solve word problems in school; it does not charac¬
terize the way children learn to solve these problems. Stern (1993b)
showed convincingly that children in the early elementary school years do
not have an abstract schema that they use to form problem models in the
way we have surmised. Instead, their knowledge is more accurately char¬
acterized as a set of concrete, situation-bound principles about arithmetic
that allow them to solve many problems and out of which abstract general
schemas will eventually be constructed.
Stern’s data come from an eight-year longitudinal study that assessed
the cognitive and social development of a large group of school children
in Munich, Germany - the LOGIK project of Weinert and Schneider (in
press). Two observations from this study form the core of Stern’s argu¬
ment. First, in a large number of cases, students can provide the correct
answer to a word problem but are unable to explain how they arrived at
their answer. Specifically, they are unable to write a correct equation for
their solution. This happens to all students, both high and low achievers,
and for all kinds of problems, as long as the problem is challenging for a
particular student. The student is able to figure out the problem by what
he or she knows about numbers and arithmetic operations, taking into
account the situational constraints imposed by the problem, but without
being able to form a metalinguistically explicit problem model. Figure
10.7 shows some representative results. Percentage of errors, correct
solutions with equation, and correct solutions without equation are
shown for a three-year period for the following problem:
Figure 10.7 The percentage error, percentage correct solutions with correct
equation, and percentage of correct solutions without correct equation for the
“apple tart” problem in grades 2, 3, and 4. After Stern (1993b).
In this problem the Compare schema must be used first to determine how
many rabbits Mike has, but that is not specifically asked for. About one
third of the children in the two grades where all these problems were
tested solved the first two problems correctly but could not solve the
third problem. (The reverse, that the third problem was solved correctly
but errors were made on either of the first two, was very rarely observed,
indicating that we are not dealing here with random variability but with
a true asymmetry.)
Problem models, therefore, are not derived from metalinguistically
available schemas, except at a very late stage of schooling. Stern (1993b)
discusses some evidence that suggests such schemas are in fact the end
result of learning how to solve word problems.
Stern also points out that the assumption that children use arithmetic
schemas to solve word problems is somewhat inconsistent with the
essence of the Cl models. In other applications of the model, schemas are
not treated as full-fledged structures, ready to be retrieved and used, but
as emergent structures that are put together in the context of a particular
task from smaller knowledge units. In the case of word arithmetic prob¬
lem solving, the knowledge units used to construct problem models are
the arithmetic principles as they are learned in concrete task contexts by
the child. Only later will schemas be abstracted from these principles.
One principle will come to be associated with another one, thus eventu¬
ally becoming a unit and a schema. But while the children are learning to
solve word problems, their knowledge is less orderly, less abstract, and
more situated. For instance, commutativity may be contextually
restricted to numbers below 20; the possibility of reformulating more
statements by less statements may be realized in some contexts but not
others; the term more than may be understood in the context of an action
but not in the context of a static comparison. Instead of a few powerful
358 Models of comprehension
There are some good reasons why one should study word algebra problems.
The domain of algebra is much richer and more complex than
that of arithmetic, and hence the texts of word problems are longer and
more interesting, placing greater demands on a model of discourse comprehension.
Furthermore, the central issues in algebra word problem
solving appear to be the same as in the arithmetic domain. In particular,
situation models matter at least as much in algebra word problem solving
as in arithmetic word problem solving. In a pioneering study
using extensive protocol analyses, Hall, Kibler, Wenger, and Truxaw
(1989) showed that competent college students reason within the situational
context of a story problem to identify the quantitative constraints
required for a solution. They use the text to build, elaborate, and
verify a situation model from which they derive their solution. Students
use a variety of reasoning strategies to develop these situation
models, which are by no means restricted to the algebraically relevant
aspects of a problem. Integrating the dual representations of a problem at
the situational and the mathematical level appears to be the central aspect of
competence according to Hall and his co-workers, as indeed it was for
children working on arithmetic word problems.
Word problems 359
The only way one could build a simulation of word algebra problem
solving would be to restrict the domain radically, say to a blocks world, so
that the simulation could know everything there is to know in that
domain. Further restrictions would have to be placed on the form of the
linguistic input. This did not seem an attractive alternative to us. We
decided, therefore, in favor of a hand simulation in which the problem of
general world knowledge is avoided by using our own intuitions as
required. There is a great deal this approach cannot deliver in comparison
with a real simulation, but some useful implications can still be gathered
from it.
The units may be amount per time (a boat travels 4 km per hour), cost per
unit (a pound of almonds costs $5.99), portion to total cost (7% of the cost
of a car), and amount to amount (5% acid in 3 gallons of solution). These
Rate schemas are the building blocks of algebraic problem models. The
story text usually specifies one or two of the members of a schema (Unit1,
Unit2, rate). If only one member is missing, it can be computed from the
Text:
A plane leaves Denver and travels east at 200 miles per hour. Three
hours later, a second plane leaves Denver on a parallel course and
travels east at 350 miles per hour. How long will the second plane
take to overtake the first plane?
Textbase:
LEAVE[PLANE1, DENVER] - rate: 200 MPH; direction: EAST
LEAVE[PLANE2, DENVER] - rate: 350 MPH; direction: EAST; time: 3 H LATER; loc: PARALLEL COURSE
OVERTAKE[P1, P2] - time: ?
Equation:
200(t+3) = 350t
Figure 10.8 A word algebra problem: the text, textbase, situation model,
problem model, and equation.
To enter this information correctly into the problem model, the student
must realize that this means that the flying time of Plane 1 equals the flying
time of Plane 2 plus the 3 hours. The other piece of information
needed is not explicit and must be inferred on the basis of the student’s
world knowledge: If the planes start from the same place and fly along the
same route, they will have covered the same distance at the point at which
the faster one overtakes the slower plane. Students have a great deal of
experience with races between various kinds of vehicles and are quite
familiar with this situation. Therefore, this knowledge is available, but it
must be activated at this point and used to complete the problem model.
An equation is easily derived from the problem model. The unknown
in the problem is the time t that the second plane needs to overtake the
first. This can be entered in the problem schema as shown. Next, t + 3 can
be computed as the travel time for Plane 1. Finally, the distances traveled
by each plane can be expressed as the product of time multiplied by
speed, and set equal, resulting in the equation shown in Figure 10.8. The
equation can then be solved as in any numerical problem.
This is indeed a complex model for solving a simple problem! Similar
problem models can be constructed for all rate problems and, with some
obvious extensions, also for physics and geometry problems. But is this
complexity necessary for such simple problems? First, algebra word
problems are not simple for most people and most students. Second,
there is some evidence that the model presented here is by no means
unnecessarily complex. What we lack, and cannot have for the reasons we
have discussed, is a full simulation of the model and direct empirical tests
of it, as in the case of word arithmetic problems. (The subtlety of inferring
equal distance from overtake is a good example of the kind of general
world knowledge that is necessary for these problems but that we do not know
how to represent in a knowledge base.) We can argue for the present
model on three grounds.
First, the components of the model are directly motivated by our
research on word arithmetic problems. We had to introduce a situation
model and a problem model representation between the textbase and the equation
to understand how people solve word arithmetic problems. It would
be surprising if we could do with less in the case of word algebra. Second,
in their study of word algebra problem solving, Hall et al. (1989) obtained
strong empirical evidence that good students engage in extensive reasoning
at the level of the situation model and that the major difficulty they
Intelligent tutoring is based on the premise that the system solves a problem
in a way that simulates what human experts do. The system also has
a model of the student and is able to interpret the student's behavior in
terms of that model and of its own understanding of the problem. Hence,
it can diagnose errors and take remedial steps. Because we cannot simulate
human word algebra problem solving, this approach is not open to us.
However, suppose the model described here is essentially correct. If
that were the case, what instructional implications would it have? What
are the difficult steps in solving word algebra problems where students
might need help? Because reading is not problematic for college students,
they are perfectly capable of forming a textbase. Also, because our initial
goal is to bring their performance on word problems up to their level of
performance on numerical problems, we will not bother helping them
with the equations. Furthermore, we know that college students have no
trouble understanding the situations described in algebra word problems.
There are some exceptions, especially in the areas of statistics and probability,
and we have also encountered some rather fuzzy notions about the
concept of “interest.” However, these exceptions are neglected here. In
general, students do not need help constructing situation models from
the simple contexts depicted in word problems. They do need help with
the formation of the problem model, because they often skip this stage
and jump to equations without a clear understanding of the problem
structure. Thus, we must force them to build explicit problem models,
and we must provide them with the means to do so. Some students and
experts successfully use notes and graphs for this purpose, but in general
students do not have available a systematic, easy system for designing
problem models. Nathan et al. (1992) decided to teach students the
graphic system we used to represent problem models in our theory as a
systematic way to construct such models while attempting to solve them.
with negative values for the length of a board and similar absurdities (Paige
& Simon, 1966). Because algebra students cannot operate in the real world
and receive feedback from it, the next best thing is to supply them with a
substitute world in the form of an animation that acts out what is implied
by the problem model they have constructed. Thus, in the case of the
problem in Figure 10.8, the animation would show Plane 1 leaving; after a
delay, Plane 2 leaves at a greater speed. After some time t, it overtakes the
first plane and the animation stops. On the other hand, if the student
makes an error and constructs a problem model in which the slow plane
leaves after the fast plane, this is exactly what happens in the simulation,
and the student invariably realizes the error. Similarly, if the student forgets
to specify the distances the planes travel, they will simply go on flying,
even after one overtakes the other. The student notices that something
is wrong and now has a chance to figure out what.
Nathan et al. (1992) have built such a tutor, called ANIMATE. It is not
an intelligent tutor, because the system does not understand the problem
at all. It merely executes an animation as instructed by the student’s prob¬
lem model (it paints fences, mixes solutions, stacks up piles of money, or
lets objects move). If the right thing happens, the student realizes that the
solution is correct. If the wrong thing happens, the student must figure
out how to correct it. If nothing happens because the simulation was not
given enough information to execute anything, the student knows that
one or more pieces are missing from his formal model and can try to complete it.
Figure 10.9 provides a graphical description of word problem solving.
Both problem-relevant information and situation-relevant information
are extracted from the problem text, forming potentially isolated problem
representations that must be coordinated. This link is what the ANIMATE
tutor focuses on. ANIMATE has two components: Network,
a system that allows problem models to be constructed according
to the graphical conventions we have sketched, and the Animation,
which runs whatever Network tells it to. The purpose of ANIMATE is to
support the student at the two links in the figure labeled Is it right? and
What went wrong? by making explicit the correspondence between the
problem model (in Network) and the situation model (the Animation).
ANIMATE works quite well. Although it has never received a field test
in a classroom study, it has passed a number of laboratory tests with good
grades (Nathan et al., 1992). Nathan showed that with only minimal
[Diagram: a Situation Model and a Conceptual Problem Model are both constructed from the WORD PROBLEM and must be coordinated.]
Figure 10.9 The role of the ANIMATE tutor in helping students to solve
word algebra problems. From Nathan et al. (1992).
what they knew and failed to use ANIMATE to deepen their understanding.
If they had been given more challenging problems, they might have
found more use for ANIMATE and learned to engage in situational reasoning
in the process. For the other three subjects, ANIMATE seemed
just right. The problems were difficult for them but not too difficult, and
the kind of support ANIMATE provided enabled them to solve these
problems and learn a great deal in the process. My hunch is not that this
style of tutoring fails with poor students or with very good students;
rather, in order to learn, students must receive problems that are
adjusted to their level of skill: problems they cannot quite solve on their
own but that they can just manage with the scaffolding provided by the
tutor.
It is unfortunate that ANIMATE was never used in a real educational
environment. As it is, it is impossible to judge its value as a classroom
tool. But from a theoretical perspective, the kind of work that has been
done with ANIMATE has proven to be quite interesting. Discourse
comprehension theory can be used for the analysis of word problem solving,
and such an analysis allows us to expand and elaborate the theory significantly.
Beyond text
The model for word problem solving that was developed in the previous
chapter can provide a framework for a broad range of related comprehension
tasks, whenever a verbal instruction - a text or an oral communication
- must be understood in the context of a particular task and situation
to guide the actions of the comprehender. In the previous chapter the
action required was to calculate a numerical answer to a mathematical
word problem; the text was used to construct a situation model and a corresponding
problem model. The formal problem model provided the
mathematization of the problem, which made it possible to employ the
tools of arithmetic or algebra for the solution of the problem. An analogous
situation is encountered in many other tasks, in particular tasks that
require the use of a computer for the solution of a problem. Instead of a
mathematical problem model, we are dealing in this case with a formal
system model. To use the computer, the situational understanding of a
task must be reformulated into what we have called a system model
(Fischer, Henninger, & Redmiles, 1991).
The formal specification of a computing task in terms of a system
model is analogous to the formulation of a word problem in terms of a
mathematical problem model. It requires a situational understanding of
the task at hand, as well as a knowledge of the formal constraints of the
computer system that must be respected in generating the system model.
The required situational understanding can be more complex than in the
case of word problems. In a word problem the question (usually) specifies
precisely the goal of the computation. When people perform tasks on a
computer system, an explicit specification of the goal state is not necessarily
available. The goal of the action is frequently specified only in
general, situational terms and must first be translated into system terms.
them and provided rich verbal protocols. These protocols were propositionalized
and formed the core of the knowledge for our computer simulation.
In addition, this knowledge base was augmented in three ways:
1. Six other experienced computer users were shown each of the propositions
generated in the concurrent verbal protocols and asked to provide
a free association - the first thing that came to mind. These subjects
were not told anything about our purposes, so that the associations they
produced (such as post office and stamp in response to mail) were frequently
irrelevant to the computer context.
2. For all propositions that stated a request (e.g., to enter mail, edit a file),
a second proposition was added that stated what the outcome of the
action would be if the request were acted on. This information was
sometimes generated spontaneously in the protocols or the free association
task, but because it was crucial for our purposes we had to make
sure that it was available for every request.
3. We added a few action plans that were needed to perform the tasks we
were concerned with but that were not generated spontaneously by our
subjects.
Plan-element: (EDIT FILE?)
Preconditions: EXIST FILE?; KNOW FILE? LOCATION
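The plan-element structure above can be sketched as a small data type. This is a hedged illustration, not the original implementation: the class and method names are mine, and the outcome shown for editing a file is an assumption.

```python
# Hedged sketch of a plan element: a name, preconditions that must hold
# INTHEWORLD before the element can be executed, and outcomes it produces.

from dataclasses import dataclass, field

@dataclass
class PlanElement:
    name: str
    preconditions: list = field(default_factory=list)
    outcomes: list = field(default_factory=list)

    def can_do(self, world):
        # Executable only if every precondition already holds in the world.
        return all(p in world for p in self.preconditions)

edit_file = PlanElement(
    name="EDIT FILE?",
    preconditions=["EXIST FILE?", "KNOW FILE? LOCATION"],
    outcomes=["INBUFFER FILE?"],  # assumed outcome, for illustration only
)
```

With only `{"EXIST FILE?"}` in the world, `edit_file.can_do(...)` is false: the file exists, but its location is not yet known.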
The nodes thus generated were connected via argument overlap, forming
an 81 x 81 long-term memory matrix that was fixed and did not
change during the course of a simulation. This long-term memory network
was then used to generate dynamically a task network in working
memory when the system was instructed to perform some particular task.
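Argument overlap can be sketched in a few lines. The proposition encoding below is illustrative, not the one actually used; the point is only that two nodes are linked whenever they share at least one argument.

```python
# Hedged sketch: build a symmetric 0/1 connection matrix over propositions,
# linking any two that share an argument (argument overlap).

def overlap_matrix(propositions):
    """propositions: list of (predicate, args) tuples."""
    n = len(propositions)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if set(propositions[i][1]) & set(propositions[j][1]):
                matrix[i][j] = matrix[j][i] = 1
    return matrix

props = [
    ("REQUEST", ("SEND", "LETTER")),   # invented example propositions
    ("EXIST", ("LETTER",)),
    ("EXIST", ("PRINTER",)),
]
m = overlap_matrix(props)  # nodes 0 and 1 share LETTER; node 2 is isolated
```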
Generating the task network involved the following sequence of steps:
system used practically everything it knew that was related to the task at
hand; for small values of n, on the other hand, the system might sometimes
fail to use critical information it knew about but did not sample on
a particular occasion, leading it to make an error (see section 11.3 for further
discussion of this aspect of the model).
4. All plan elements in long-term memory were bound to existing
objects in the world and added to the proposition list. Thus, if two
files are mentioned in the instructions, two copies of each plan element
containing the variable FILE? were generated, one bound to each
INTHEWORLD file.
5. The next step was to link the nodes that had been generated in
working memory into a network:
Figure 11.2 Excitatory (solid lines) and inhibitory (broken lines) relations
among plan elements and INTHEWORLD propositions.
However, it is usually not possible to carry out the action one wants to do
most, because the preconditions for that action might not be met by the
existing INTHEWORLD environment. Thus, we cannot send a letter
without first writing it, entering the mail system, and so on. We therefore
introduced a distinction between can-do and want-to nodes.
Accordingly, all computations were done in the network of want-to
nodes, but the can-do nodes determined what action was taken. Each
want-to node activated a corresponding can-do node in proportion to its
own strength. However, can-do nodes were inhibited if their preconditions
did not exist INTHEWORLD. Thus, although the system wants
most strongly to send a letter, it cannot do so because there is no letter. It
therefore executes the most highly activated can-do plan element. As we
shall see, this tends to be an action that produces the missing preconditions
and starts a chain of intermediary actions that change the state of
the world step by step and eventually result in the execution of the
desired final action.
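The want-to/can-do distinction can be sketched as follows. The activation values, action names, and precondition sets here are invented for illustration; in the actual model, activations emerge from the integration process rather than being stipulated.

```python
# Hedged sketch: activation is computed over want-to nodes, but the action
# taken is the most activated node whose preconditions hold INTHEWORLD.

def choose_action(want_to, preconditions, world):
    """want_to: {action: activation}; preconditions: {action: set of facts}."""
    can_do = {
        action: activation
        for action, activation in want_to.items()
        if preconditions[action] <= world  # all preconditions satisfied
    }
    if not can_do:
        return None
    return max(can_do, key=can_do.get)

want_to = {"SEND LETTER": 0.9, "WRITE LETTER": 0.4}
preconditions = {"SEND LETTER": {"EXIST LETTER"}, "WRITE LETTER": set()}

# SEND is wanted most, but there is no letter yet, so WRITE is executed;
# WRITE produces the missing precondition, and SEND becomes possible.
first = choose_action(want_to, preconditions, world=set())
then = choose_action(want_to, preconditions, world={"EXIST LETTER"})
```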
Mannes and Kintsch (1991) give a formal and detailed description of
how task networks are constructed and integrated, and they describe a computer
program called NETWORK that performs these operations. I
must refer the reader to this original paper for the details of how this system
works and focus here instead on the major conceptual issues
involved. The details of such a program are, of course, all-important, but
they are available elsewhere, which allows me to discuss the theoretically
significant points at a more general level.
Figure 11.3 Relative activation values for selected plan elements on the first
three cycles of the “Include” task. AD abbreviates ADDRESS and LET
abbreviates LETTER. Want-to nodes are shown as open circles, can-do nodes
are shown as filled circles. A star indicates the action taken on each cycle.
The system still wants most strongly to paste in the address or type it, but cannot
do so. The strongest can-do node now is EDIT FILEALETTER. This
action is executed; it once again changes the state of the world - the file
letter is now in a buffer, ready for typing - and another processing cycle
begins. This time TYPE TEXTAADDRESS can actually be performed;
it generates the outcome (IN TEXTAADDRESS FILEALETTER),
which completes the pattern established by the original request. As a final
step (not shown in Figure 11.3), the system exits the newly created
FILEALETTER/ADDRESS.
We have here an example of situated cognition. The system does not
plan ahead by setting up a search space and finding a goal path by means-ends
analysis or some other heuristic. Instead, it looks at the situation,
comprehends it in terms of its goals, figures out what it wants to do, and
does what it can do, thereby creating a new situation. It then comprehends
the changed situation and acts again, producing another change in
its world. In the example just described, each of these changes was a step
in the right direction, eventually allowing the system to reach its goal.
There is no guarantee that this would have been the case, but in an
orderly world about which the system has sufficient knowledge, that is
exactly what should happen.
This acting without planning, responding instead in a goal-directed manner
to an ever-changing situation, has considerable intuitive appeal. Consider
what happens when I decide to leave the building: I do not plan to
open the door, go out into the hallway, turn right, descend the stairs, and so
on. Instead, I simply respond to the open door by walking through it,
respond to being in the hallway by walking toward the stairs, and so on.
Such routine tasks are different from real problem solving. For example,
when I design an experiment, I do try to think of all the possibilities in
advance and plan very far ahead. Not all thinking consists in understanding
situations in terms of particular goals. More complex, intentional
problem-solving processes certainly are required in many cases, but not
all planning is problem solving, and much that we have treated in cognitive
science as problem solving might be better understood within a comprehension
paradigm.
Mannes and Kintsch (1991) work through several more illustrative
tasks in their paper. One is a version of the notorious conflicting subgoal
task: “Print and delete the file.” To handle the sequencing in this task is
not trivial for most AI programs and may require special mechanisms.
NETWORK solves the task without problems. The spreading activation
process favors PRINT over DELETE because of the causal chaining built
into the plan elements (Figure 11.2). PRINT inhibits DELETE because
it would destroy a precondition necessary for printing, but not vice versa.
NETWORK also handles more complex, two-step tasks, such as
“Revise a manuscript you are working on with a colleague by removing a
paragraph, and then send the revised manuscript to that person.” Quite
a few steps are involved in executing these instructions, all in the proper
sequence. There is much that could go wrong, therefore, but NETWORK
manages not to lose its way.
I have already commented on one characteristic of NETWORK,
namely, that it does not plan ahead, but responds to each situation as well
as it can in a goal-directed way. Another aspect of NETWORK worth
noting is that it is nondeterministic. In generating a task net, it does not
include all information it has available in long-term memory but only a
random sample - whatever the current propositions in working memory
are able to retrieve. This means that it has different information available
on different trials with the same task, and hence it may take a different
course of action. If by accident a crucial piece of information is missing
(has not been sampled from long-term memory), it may fail at a task that
it could do otherwise (see section 11.1.3 for further discussion) or perform
it in a different way. That, of course, happens to people, too. Mannes
and Kintsch (1991) asked one of their subjects, an experienced computer
user who had just performed one of the tasks we had given him in a rather
clumsy, inelegant way, why he had not chosen the optimal solution.
He immediately understood the optimal solution but commented that he
just had not thought of it. Because of its random sampling mechanism,
NETWORK also might “not think” of a possible solution path.
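The nondeterministic retrieval step can be sketched in a few lines. The memory contents and the sample size are invented for illustration; the point is only that different runs over the same long-term memory can build task nets from different subsets of it.

```python
# Hedged sketch: on each trial, only a random sample of n associated
# long-term memory items enters the task net, so a crucial item can
# occasionally be missed.

import random

def sample_knowledge(long_term_memory, n, rng=None):
    rng = rng or random.Random()
    return set(rng.sample(long_term_memory, min(n, len(long_term_memory))))

ltm = ["pipes", "redirection", "sort syntax", "head syntax", "mail syntax"]
net_a = sample_knowledge(ltm, 3, random.Random(1))
net_b = sample_knowledge(ltm, 3, random.Random(2))
# Different trials may retrieve different subsets of the same memory.
```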
We, the theorists, have built into NETWORK all the knowledge it has.
Thus, NETWORK does not address the question of how knowledge is
acquired; yet once constructed, NETWORK can learn from experience.
Mannes and Kintsch (1991) reported some initial observations of how
such a system could learn from its own experience by remembering particular
cases it had worked on. Case-based reasoning is a large and important
topic in cognitive science, and our explorations with NETWORK
are certainly not the last word on this topic, but they are worth noting
here because of the extreme simplicity of our approach. We simply add
to the long-term memory network a node that encodes a few essential
features of a case NETWORK has already solved. If retrieved when
NETWORK works on a new case (e.g., because there is some overlap
between the current task and the prior task), this case node participates
in the integration process as part of the task network, often playing a
major role in redirecting the activation flow.
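The case-node mechanism can be sketched as follows. The encoding and the retrieval criterion (at least one shared argument) are simplifications of my own; the original model retrieves case nodes through the same argument-overlap links as any other proposition.

```python
# Hedged sketch: a solved task leaves behind a node recording its salient
# arguments and the plan elements executed; overlap with a new task's
# arguments retrieves it from long-term memory.

def make_case_node(arguments, executed_plan_elements):
    return {"args": set(arguments), "plans": list(executed_plan_elements)}

def retrieve_cases(case_memory, task_arguments):
    # A case is retrieved when it shares at least one argument with the task.
    return [c for c in case_memory if c["args"] & set(task_arguments)]

include_case = make_case_node(
    ["FILEALETTER", "TEXTAADDRESS"], ["FIND", "EDIT", "TYPE", "EXIT"]
)
hits = retrieve_cases([include_case], ["FILEALETTER", "PARAGRAPH"])
# The retrieved case strengthens its associated plan elements.
```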
To illustrate how NETWORK reasons with cases, Mannes and
Kintsch describe how NETWORK solved the “Revise manuscript” task
after having previously worked on three related tasks. For each prior task,
we created a memory node consisting of the most strongly activated arguments
of the non-plan-element propositions in the network, plus the
names of the plan elements involved in performing the task. Thus, for the
“Include” task shown in Figure 11.3, the arguments FILEALETTER,
TEXTAADDRESS, and so on would be included, together with the plan
elements that were actually executed (FIND, EDIT, TYPE, EXIT).
Given a new problem involving a FILEALETTER, this case memory
might be retrieved from long-term memory and strengthen the plan elements
associated with it. If the two tasks are sufficiently similar, this
might be a good thing: Relevant plan elements would be strengthened,
and irrelevant, potentially interfering elements would be weakened, leading
to a quicker solution of the problem. On the other hand, case memory
can equally well interfere with the solution of new problems - for
instance, if the new task requires some entirely different action involving
FILEALETTER. Both positive and negative effects of case memory are
well demonstrated, and it remains to be seen whether NETWORK can
simulate these results in detail.
Considerable work needs to be done on case-based reasoning in NETWORK.
For instance, just what information, and how much, does a case
memory contain? Do we need a special forgetting mechanism (a more
recently performed task probably has a stronger influence than one done
a long time ago)? Thus, there are many open questions at this point, but
they seem worth studying, for NETWORK has some potential advantages
over other case-based reasoning systems. It requires neither a special
selection mechanism nor special evaluation procedures nor analogical
reasoning. All we need is to add properly defined memory nodes to our
network - the retrieval and integration mechanisms are already there. It
would be interesting to see how far such a simple-minded model can be
pushed.
one source of this difficulty, but as we show later, this is no more than a
minor problem among many more serious ones that UNIX poses for its
users.
Doane, Pellegrino, and Klatzky (1990) observed UNIX users of various
levels of experience, some of them over extended periods: novices
(average experience with UNIX, 8 months), intermediates (average experience,
2 years), and experts (5 years' experience). All subjects had taken
classes in the advanced features of UNIX and were active users of UNIX.
Thus, even the novices in this study were by no means inexperienced.
Subjects were given brief verbal instructions to produce legal UNIX
commands, using as few keystrokes as possible. There were three types of
tasks: single commands (e.g., “display a file”), multiple commands (e.g.,
“arrange the contents of a file alphabetically,” “display the file names of
the current directory”), and composite commands (e.g., “sort the contents
of a file alphabetically and print the first ten elements”). Composite
commands are thus little custom-made programs to perform specific
tasks, made up of elementary command sequences using special
UNIX facilities such as pipes or redirection symbols.
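The contrast between an expert's piped solution and an intermediate's roundabout one can be made concrete. The command strings below are ordinary UNIX; the Python wrapper is only a device for assembling and comparing them, and the temporary-file variant is one plausible nonoptimal solution, not a transcript from the study.

```python
# Hedged sketch: two ways to realize the composite task "sort the contents
# of a file alphabetically and print the first ten elements."

def expert_composite(filename, n=10):
    # Direct solution: pipe the output of sort into head.
    return f"sort {filename} | head -{n}"

def intermediate_composite(filename, n=10):
    # Same effect via an unnecessary temporary file and redirection.
    return f"sort {filename} > tmp; head -{n} tmp"

cmd = expert_composite("data.txt")   # "sort data.txt | head -10"
```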
Doane et al. (1990) made a number of interesting observations. Both
novice and intermediate users knew the elements of UNIX; that is, they
could successfully execute the single and multiple commands, but they
could not put these elements together to form composite commands. The
novices failed entirely at this task, and the intermediates often found
nonoptimal solutions by using unnecessarily complex redirection methods
rather than directly piping the output of one command into the next
one. Only the experts could effectively use the input-output (I/O) redirection
facilities of UNIX - one of its central and most powerful design
features!
Doane et al. (1989) simulated these results in a system called UNICOM,
which was modeled after the NETWORK system described in the
previous section. A knowledge base for UNICOM was constructed on
the basis of the empirical results of Doane et al. (1990). It consisted of
general, declarative knowledge and plan elements (procedural knowledge
or productions). General knowledge included specific information
about the syntax of UNIX commands and the I/O redirection syntax, as
well as more abstract conceptual knowledge about commands and redirection
in UNIX. The plan elements were formally equivalent to those in
NETWORK (Figure 11.1): a name, preconditions, and outcomes. As in
Figure 11.2, each plan element activated those plan elements that generate
its preconditions and inhibited those elements that interfere with its
preconditions. Thus, the model is designed so that the interconnections
among plan elements represent the user's knowledge about causal relationships
within the UNIX operating system.
The verbal instructions were again propositionalized into REQUEST,
OUTCOME, and INTHEWORLD propositions, as in the previous section.
UNICOM integrated these instructions together with its knowledge
and responded by performing the most activated among its plan elements
whose preconditions were met, as was described in the previous
section for NETWORK. In this way, the system performs correctly all
the tasks used in the Doane et al. (1990) experiment. Thus, it simulates
the behavior of the expert subjects in that study - roughly speaking, for
even experts sometimes make errors.
To simulate the behavior of the intermediate and novice subjects, we
(Doane et al., 1989) lesioned the knowledge base of UNICOM. We took
out all the knowledge, general and specific, about redirecting input and
output of commands to simulate the novice subjects. To simulate the
intermediate subjects, we gave the system knowledge about redirection in
general and information about the use of redirection symbols, but we
deleted more advanced features, such as knowledge about the use of
pipes. With these lesioned knowledge bases, UNICOM reproduced the
major features of the Doane, Pellegrino, and Klatzky data. The system correctly
performed single and multiple commands, unless knowledge about
a specific command was deleted, but it was unable to perform composite
tasks. It failed completely in the novice version and succeeded only occasionally,
and without finding an optimal solution, in performing a composite
command in the intermediate version.
We also were able to simulate the performance of particular users. Our
strategy was to give UNICOM all the knowledge a specific user had
employed in solving the task. This knowledge was determined from the
explicit evidence the user had provided in a verbal report. We then ran
UNICOM with this knowledge base on the problems that we had previously
given to the subject and noted whether or not UNICOM's behavior
matched that of the subject. If we had everything right - the theory as
well as the knowledge - the behavior of the subject and our simulation
should match closely.
For instance, one of our tasks was “Display the first 10 elements of the
Task Description
Format the text in ATT2 using the -ms macro package and
store the formatted version in ATT1
Figure 11.4 Example of task description and prompts for the problem
“nroff -ms ATT2 > ATT1.” From Doane et al. (1992).
Figure 11.5 Changes in percentage correct after four types of prompts for
novice, intermediate, and expert UNIX users. After Doane et al. (1992).
Memory losses may involve both item and order information. In our study,
novices and intermediates left out five or six times as many items initially
as experts (very few deletion errors were made by anyone after the third
prompt). Loss of order information as indexed by substitution errors
showed a similar pattern. Such errors were two to three times more fre¬
quent among novices and intermediates than among experts, and they
continued to be a serious problem up to the sixth prompt, which first pro¬
vided the order information the novices and intermediates were lacking.
Thus, memory loss appeared to be a serious problem for the novices and
intermediates, whereas the experts managed working memory much
more successfully.
Novices and intermediates need help retrieving the elements that go
into making a composite command, and they need help with ordering
these commands. The task of command construction severely taxes their
working memory, so that memory errors become a serious problem. This
is just what one would expect from the UNICOM simulations reported
earlier. The number of elements that go into the construction of a com¬
posite command, together with their ordering, imposes a significant bur¬
den on working memory. Furthermore, the user must also represent and
maintain in working memory the invisible intermediate results of the
process. Nowhere does UNIX tell the user what the outcome of the first computational step is or what the input to the next step will be. The user must anticipate and remember such information without help from the system. Only
experts with well-established retrieval structures are up to this task.
Novices and intermediates not only lack knowledge; they can become
confused even when they have it.
One puzzling fact about skilled performance is that even experts, who
certainly know better, make errors. Indeed, error rates for experts are
often surprisingly high. For instance, in the prompting study described
in the previous section, experts made many fewer errors than intermedi¬
ates and novices in producing composite UNIX commands, but their ini¬
tial solution attempts were by no means always successful, with error
rates varying for different problems between about 13% and 65%. Card,
Moran, and Newell (1983) have studied a skilled user performing two
tasks, manuscript editing and electronic circuit design editing. Errors
were made on 37% of the editing operations. Most of these were detected
during the generation of the command sequence, but 21% of the com¬
mands issued by this highly skilled user contained an error that had to be
corrected later. Similarly, Hanson, Kraut, and Farber (1984) collected a
large database from 16 experienced UNIX users performing document
preparation and e-mail tasks. They observed an overall error rate of 10%,
with a range for different commands from 3% to 50%. Why do experts
make errors, and can simulations with the construction-integration
model account for this phenomenon?
Kitajima and Polson (1995) have developed a simulation of a human-
computer interaction task that suggests an answer to this puzzle. The
sample task that they analyze involves the use of the popular graphics
application, Cricket Graph. A user is given a set of data and instructed to
produce a graph with certain characteristics. This is a much more com¬
plex problem than the routine tasks studied by Mannes and Kintsch
(1991) or even the production of composite UNIX commands simulated
by Doane et al. (1992). Nevertheless, Kitajima and Polson successfully
simulated performance on this task with a model designed in much the
same way as these simpler models. That is, they used a propositional rep¬
resentation for the various knowledge sources involved: the display the
user is confronted with, the user’s goals, the user’s knowledge about
graphs and computer systems, and finally the objects and actions involved
(what we called the plan elements in Figure 11.1). A task network was con¬
structed from these elements in much the same way as in Mannes and
Kintsch (1991), and an integration process then selected the strongest
action candidate for execution. This action changed the display in one way
or another, which led to the selection of a new action, until the desired goal
state was reached.
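The select-act-reselect cycle just described can be sketched as a simple loop. The `integrate` function below is a trivial scoring stand-in for the real integration phase (which derives action strengths by spreading activation over a constructed network), and the toy display and actions are invented for illustration.

```python
# The select-act-reselect cycle: integrate, execute the strongest
# action, let it change the display, and repeat until the goal state.
def integrate(display, goal, actions):
    """Score each action candidate; a stand-in for the integration phase."""
    return {name: 1.0 / (1 + abs(goal - act(display)))
            for name, act in actions.items()}

def run(display, goal, actions, max_steps=20):
    trace = [display]
    for _ in range(max_steps):
        if display == goal:                  # desired goal state reached
            break
        scores = integrate(display, goal, actions)
        best = max(scores, key=scores.get)   # strongest action candidate
        display = actions[best](display)     # the action changes the display
        trace.append(display)
    return trace

actions = {"increment": lambda d: d + 1, "decrement": lambda d: d - 1}
print(run(0, 3, actions))   # → [0, 1, 2, 3]
```

The point of the sketch is the control structure: no plan is computed in advance; each action is selected afresh against the current display.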
However, to achieve their goal, Kitajima and Polson had to elaborate
this basic model in several ways. First, the practice of binding all objects
in the environment to all possible actions had to be abandoned - there were
simply too many possibilities, leading to a combinatorial explosion. Inter¬
views with human users suggested that these users simply were not pay¬
ing attention to most of these possibilities. Selective attention is of course
something familiar to psychologists, and Kitajima and Polson found a
ready way to incorporate such a mechanism into the CI model. First, the
model does a construction-integration cycle with just the objects (e.g.,
menu items, whatever else there is on the screen) in view and the task goals
Beyond text 389
some similar term. Labels often, but not always, indicate the action to
2 The details of this simulation are described in Kitajima and Polson (1995).
which an object is relevant. The Kitajima and Polson model strongly relies
on such labels. If the labels do not indicate which objects are relevant, the
model might still make a correct decision if it has a great deal of knowl¬
edge about the task at hand and if it uses that knowledge, but for the most
part it will be lost. So are human users. Novices rely heavily on a label¬
following strategy, failing when that strategy fails. Experts are more flexi¬
ble and often can figure things out on the basis of their knowledge and
experience, but even experts do better when they can follow the labels.
Figure 11.6 shows data on Cricket Graph users from a dissertation by
Franzke (1994). On the first trial, novice users of Cricket Graph are able
to act quickly if the action is labeled in the same way as in the instructions
or by means of a synonym. The action time increases sharply when an
inference is required, that is, when there is no menu item that is linked to
the instruction either directly or by a synonym. Action times are even
longer when no verbal link exists at all (e.g., to change an axis in a graph
you must double click on it in Cricket Graph — even longtime Macintosh
users have a hard time figuring this out!). On a second trial, the time differences are greatly reduced, but they are still apparent. Indeed, some recent unpublished data of Polson show that even experts are faster when they can rely on the label-following strategy than when they cannot.
[Figure 11.6 Time required for using the Cricket Graph software for actions that are labeled in the same way as the instruction, actions that are labeled with a synonym, actions that are labeled in such a way that the user must make an inference, and actions that are not labeled at all. The data are for the first and second trial of novice users. After Franzke (1994).]
Thus, the label-following strategy appears to be the choice of novices in
a wide variety of situations and remains important even when a user has
become familiar with an application. The Kitajima and Polson (1995)
model correctly implies the importance of this label-following strategy in
human-computer interaction because labels play an often crucial role in
directing attention to the task-relevant objects on the screen in the inte¬
gration cycle that precedes the actual action selection.
The other aspect of Kitajima and Polson's (1995) model that is of particular interest here is their error simulation. A simulation that can solve
all the problems is not a good simulation of human behavior because
humans, even experts, make errors. Kitajima and Polson found a natural
way to account for these errors. One of the parameters of the model is the
rate of sampling knowledge that is related to the current display, as in
Mannes and Kintsch (1991). Nsample is the number of times each element
in the representation that has been constructed is used to retrieve a
related knowledge element, where the probability of sampling a particu¬
lar knowledge element depends on the strength of its relationship to the
cue, relative to all other elements. Kitajima and Polson ran 50 simulations
each for values of Nsample between 4 and 20. Some of their results are
shown in Figure 11.7, where the error rate for various component actions
is shown as a function of how much long-term memory is retrieved.
Although some actions are much more knowledge-dependent than oth¬
ers, it is clear that the sampling rate is a major determinant of success. If an item is allowed to retrieve no more than four associated knowledge elements from long-term memory, the error rates vary between 8% and 75%. When up to 20 elements can be sampled, the error rates drop to between 0% and 15%. Thus, the model has a natural explanation for
such errors. If users try hard to retrieve whatever knowledge they have,
error rates will be low. But if users are sloppy and do not use the knowl¬
edge they have, large error rates will result.
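Why more sampling helps can be illustrated with elementary probability: if a critical knowledge element holds a fraction p of the cue's total associative strength, the chance that all of Nsample strength-proportional retrievals miss it shrinks as (1 - p) raised to Nsample. The strengths below are illustrative stand-ins, not Kitajima and Polson's fitted values.

```python
# Chance that Nsample independent, strength-proportional retrievals
# all miss a critical knowledge element of relative strength p.
def miss_probability(strength, total_strength, nsample):
    p = strength / total_strength          # sampling is strength-proportional
    return (1 - p) ** nsample

# A critical element holding 1/10 of the cue's total associative strength:
for nsample in (4, 20):
    print(nsample, round(miss_probability(1.0, 10.0, nsample), 3))
```

Even under this crude independence assumption, raising the sampling rate from 4 to 20 cuts the miss probability severalfold, which matches the qualitative drop in error rates described above.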
The three simulations discussed here - routine actions by Mannes and
Kintsch (1991), UNIX commands by Doane et al. (1992), and Cricket
Graph by Kitajima and Polson (1995) - demonstrate that the CI model of
comprehension can be extended beyond text comprehension. The same
[Figure 11.7 Predicted error rates for a difficult (Task 1), an intermediate (Task 2), and an easy task (Task 3) as a function of the amount of elaboration, the Nsample parameter in the model. After Kitajima and Polson (1995).]
[Figure 11.8 Initial state of the Tower of Hanoi problem and formal notation: (3-2-1, 0, 0).]
placed on top of a smaller disk. With just three disks it is not a difficult
problem, and most people solve it after some fumbling.
The strategies people use in solving the Tower of Hanoi problem have
been analyzed and documented by Kotovsky, Hayes, and Simon (1985).
Typically, novices initially adopt a simple difference-reduction strategy,
which proves ineffective. They then shift to a means-end strategy in
which subgoals are created (e.g., “clear everything off the largest disk, so
it can be moved”). Once they adopt the right subgoals, they solve the
problem quickly.
Now consider how one could solve the Tower of Hanoi task by relying
strictly on comprehension processes, that is, by constructing a network
representing this task and using spreading activation to integrate this net¬
work.
We first need a more compact notation, which I have already intro¬
duced in Figure 11.8: (3—2—1, 0, 0) means that the largest disk 3, the
medium disk 2, and the smallest disk 1 are all on the first stick, and that
no disks are on the second and third sticks; (3, 2-1, 0) means that the
largest disk is on the first stick and the other two disks are on the second
stick; (3, 1-2, 0) is not a permissible state, because the medium disk 2 cannot be placed on top of the small disk 1.
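The notation can be made executable as a small legality check - a sketch, with each stick represented as a tuple of disk sizes from bottom to top:

```python
# A state is permissible only if every stick is strictly decreasing
# from bottom to top (no disk ever rests on a smaller one).
def permissible(state):
    return all(all(stick[i] > stick[i + 1] for i in range(len(stick) - 1))
               for stick in state)

assert permissible(((3, 2, 1), (), ()))      # (3-2-1, 0, 0): initial state
assert permissible(((3,), (2, 1), ()))       # (3, 2-1, 0)
assert not permissible(((3,), (1, 2), ()))   # (3, 1-2, 0): disk 2 on disk 1
```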
Thus, the anticipated goal state is (0, 0, 3-2-1) - all disks on the third
stick. The task for the comprehender is to construct a network of moves
that link the initial state with this anticipated goal state. The minimum
such network is shown in Figure 11.9. The initial state is labeled
WORLD. There are two possible moves from here, labeled DO: move
disk 1 either to stick 2 or stick 3. Which is the right one? We need to con¬
struct a path linking the given state of the problem in the world with the
anticipated outcome. [0, 2-1, 3] is the first state that links up with the
anticipated goal state: Disk 3 is in the right place. If the network shown
in Figure 11.9 is integrated, the resulting activation values for the two
competing moves - first disk to stick 2 or 3 - clearly favor the correct
move, and hence a decision that starts the solution process off on the right
path can be made.
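A minimal sketch of such an integration, assuming a toy network in which only the correct first move lies on a chain of states connecting WORLD to the anticipated OUTCOME. The link structure and weights here are illustrative, not the fitted values shown in Figure 11.9.

```python
import numpy as np

# Two candidate first moves compete; only the correct one starts a
# chain of intermediate states that links up with OUTCOME.
nodes = ["WORLD", "DO[move 1 to stick 2]", "DO[move 1 to stick 3]",
         "[3-2,0,1]", "[3,2,1]", "[3-1,2,0]", "[0,2-1,3]",
         "OUTCOME[0,0,3-2-1]"]
edges = [(0, 1), (0, 2),            # both moves are possible from WORLD
         (2, 3), (3, 4), (4, 5),    # the correct move starts a chain ...
         (5, 6), (6, 7)]            # ... that links up with OUTCOME

n = len(nodes)
M = np.eye(n)                       # self-links keep activation positive
for i, j in edges:
    M[i, j] = M[j, i] = 1.0

# Integration: spread activation until the pattern stabilizes,
# renormalizing after each pass (a standard power-iteration scheme).
a = np.zeros(n)
a[0] = a[-1] = 1.0                  # WORLD and OUTCOME start activated
for _ in range(200):
    a = M @ a
    a /= a.max()

for name, act in zip(nodes, a.round(3)):
    print(f"{name:24s} {act:.3f}")
```

Because only the correct move is connected to the OUTCOME chain, its final activation exceeds that of the competitor, so a decision favoring it falls out of the integration itself, with no explicit search.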
But note that in order to link up the current state of the world and the
outcome state, a large and complex network had to be constructed,
including nine intervening nodes! This may not look like much compared
with the networks we are familiar with from analyses of text comprehen-
[Figure 11.9 State network for the Tower of Hanoi - initial state. The activation values of the nodes after integration: WORLD[3-2-1,0,0] 0.3469; DO[3-2,1,0] 0.0000; DO[3-2,0,1] 0.6836; [3,1,2] 0.0145; [3-1,0,2] 0.0103; [3,0,2-1] 0.0103; [3,2,1] 1.0000; [3-1,2,0] 0.5076; [3,2-1,0] 0.7787; [0,2-1,3] 0.5334; OUTCOME[0,0,3-2-1] 0.2711.]
cerns a striking but common error in decision making that Tversky and
Kahneman (1983) have labeled the representativeness bias: Subjects judge
that a conjunction of two sets is more probable than one of the sets
involved if the other set is highly representative. This is a clear violation
of probability theory, for P(A&B) ≤ P(A), P(B) for all sets A and B. Concretely, if subjects are given the text
(1) Linda is 31 years old, single, outspoken, and very bright. She
majored in philosophy. In college, she was involved in several
social issues, including the environment, the peace campaign,
and the antinuclear campaign.
and are asked to judge which of the following two statements is more
probable:
comparable to the link values between text propositions, which are set to
1. With these values, the spreading activation process yields final activa¬
tion values of .31 for BANK TELLER and .75 for BANK TELLER &
FEMINIST. Hence, the model predicts that the more highly activated
BANK TELLER & FEMINIST should be the preferred response.
Does this little exercise in modeling a classical result tell us anything
worthwhile, or is it just that - a little exercise? I think it has significant
implications that ought to be explored. We might not need a representa¬
tiveness bias in order to understand the Linda results. If a subject repre¬
sents the text as a story and the decision is based on activation values in
that representation, Tversky and Kahneman's result follows. However, a
subject with an elementary knowledge of probability theory may form a
higher-level, abstract, formal representation, as shown in Figure 11.11.
This representation includes the same net as before, with the same link
strengths estimated from LSA, but in addition there is another part of
the net that brings to bear the subject’s knowledge of probability theory
as it applies to this situation. I have indicated the general knowledge of
probability theory in terms of relations among the sets A and B and a theorem from probability theory, P(A&B) ≤ P(A), P(B), which the decision maker must know. Set A is linked to BANK TELLER (BT), and set B is linked to BANK TELLER & FEMINIST (BT&FEM). The general theorem supports the hypothesis that p(BT) is greater but contradicts the
hypothesis that p(BT&FEM) is greater. As a result, BT receives an acti¬
vation value of .34, but BT&FEM receives only .01. Therefore, a correct
response will be made.
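The contrast between the two representations can be sketched with a toy network. The link strengths below are illustrative stand-ins for the LSA estimates used in the text, and the negative weights implement the inhibitory link from the theorem node to the conjunction hypothesis.

```python
import numpy as np

# Nodes: STORY, BANK TELLER (BT), BT & FEMINIST, THEOREM P(A&B) <= P(A).
# Spreading activation with negative activation suppressed each cycle.
def integrate(M, a, steps=100):
    for _ in range(steps):
        a = np.clip(M @ a, 0, None)   # spread, clamp inhibition at zero
        a /= a.max()                  # renormalize each cycle
    return a

M = np.array([[1.0, 0.3, 0.8, 0.0],   # the story is more strongly
              [0.3, 1.0, 0.0, 1.0],   # related to BT&FEM than to BT;
              [0.8, 0.0, 1.0, -1.0],  # the theorem supports p(BT) but
              [0.0, 1.0, -1.0, 1.0]]) # contradicts p(BT&FEM)

narrative = integrate(M[:3, :3], np.array([1.0, 0.0, 0.0]))
formal = integrate(M, np.array([1.0, 0.0, 0.0, 1.0]))
print("narrative BT vs BT&FEM:", narrative[1].round(2), narrative[2].round(2))
print("formal    BT vs BT&FEM:", formal[1].round(2), formal[2].round(2))
```

Without the theorem node the conjunction wins, as in the narrative-level simulation; with it, the inhibitory link reverses the preference, mirroring the .34 versus .01 pattern described above.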
If a decision maker represents the problem formally (which requires not
only a knowledge of probability theory but also that an encoding strategy
must be used to ensure that this knowledge will actually be applied), a
“correct” response will be made. If only a narrative-level representation
is used instead, a response will be made that is incorrect in the formal sense
but consistent with the story representation: From what we know about
Linda, she surely looks like a feminist, even though she has a job as a bank
teller.
The CI model thus allows us to reanalyze and reinterpret a classical example from the literature on decision making. But obviously, analyzing a single example is by no means the same as proposing a new theory of
[Figure residue: activation values from the "pushed someone" networks. Black actor observed: observed 1.0000, black 0.9988, pushed-someone 0.9998, aggressive 0.9978, violent-push 0.9984, jovial-push 0.0000. White actor observed: observed 0.9269, white 0.5533, pushed-someone 1.0000, violent-push 0.3736, jovial-push 0.3736.]
acteristic of lawyers, and the argue node is supported by the verbal node
also characteristic of lawyers. In contrast, construction workers are charac¬
terized by the nodes working class and unrefined, both of which support
punch and are unrelated to argue. The results are displayed in Figures
11.14 and 11.15. Aggressive lawyers argue rather than punch. Aggressive
construction workers argue just as much, but even more likely, they
punch.
[Figures 11.14 and 11.15 residue: activation values. Lawyer network: observed 0.4330, lawyer 1.0000, aggressive 0.8110, upper-middle class 0.3434, verbal 0.7214, punch 0.2043, argue 0.6647. Construction-worker network: observed 0.3563, construction worker 1.0000, aggressive 0.8919, working class 0.6663, unrefined 0.3919, punch 0.8725, argue 0.6351.]
[Figures 11.16 and 11.17 residue: activation values. Well-dressed black person observed: observed 0.6187, black 0.6182, well-dressed 0.9991, ghetto-black 0.0000, businessman-black 1.0000, aggressive 0.0000. Black person observed, not well dressed: observed 0.4142, black 1.0000, well-dressed 0.0000, ghetto-black 0.9997, businessman-black 0.0000, aggressive 0.4141.]
node. On the other hand, in the same network, if a black is observed who
is not well dressed, aggressive will become activated, as shown in Figure
11.17.
[Figure residue: activation values. Housewife network: observed 0.4477, punch adult 1.0000, housewife 0.0000, aggressive 0.3943. Construction-worker network: observed 0.5700, punch adult 1.0000, construction worker 0.7850, aggressive 0.8891.]
dala and prefrontal cortices) are involved in the control of the state of the
body, effecting changes in adjustments in the body. The body signals
these state changes back to other parts of the brain (the somatosensory
cortex), where these signals may be attended to and become conscious.
Moreover, it is not necessary for the body actually to assume a particular
state. The amygdala and prefrontal cortex may influence the somatosen¬
sory cortex directly, so that it acts as if the body were in a particular state.
Suppose that working memory contains, in addition to the cognitive
nodes that comprise short-term memory or consciousness, a set of
somatic markers that we experience as general body feelings. Introspec-
tively, this appears plausible. These somatic markers in working memory
can serve as cues that, via retrieval structures, can readily retrieve all
kinds of information about one’s own body. Thus, although I am not con¬
scious now of the position of my right foot, this information is directly
available. People certainly have had the years of practice necessary to
acquire the retrieval structures that provide access to information about
their own body.
If somatic markers are always present in working memory, we must
then modify our view of what is involved in perception. Traditionally, we
think of perception as representing the environment, enriched by the
perceiver’s prior knowledge and beliefs. But if somatic markers are an
important part of working memory, body feelings become parts of the
perceptual experience, just as knowledge and beliefs do. Damasio (1994)
has pointed out that perception consists of more than simply receiving
sensory information. When I look out the window of my study on the
scene below, I see the city nestled in the trees, the flat curve of the hori¬
zon, and the sky above speckled with afternoon clouds. I move my head
as the eyes scan the horizon, the familiar view makes me feel good, and I
stretch my legs and take a deep breath. All of this contributes to the per¬
ception, not just the visual image. When I recreate the image later, traces
of my movements, the way I sat in my chair, and the somatic reactions that
occurred in the first place are regenerated with it. I experience a little of
that good feeling that went with the original perception. Cognition does
not occur in a vacuum or in a disembodied mind but in a perceiving, feel¬
ing, acting body. The image is reconstructed in the brain, and its content
includes the reactions of the body that occurred in response to the origi¬
nal scene - not just what happened on the retina, but bodily reactions in
the widest sense, motor as well as visceral. It is my perception and my
memory image because its content is not only visual but also includes a
variety of somatic reactions of my body.
What we would gain from adding somatic markers as nodes to our net¬
work is the possibility of formulating an account of emotions, attention,
and motivation, as I suggest later. First, however, this would be a way of
representing the self in working memory.
The arena in which cognition takes place is working memory, alias
short-term memory, alias focus of attention, alias consciousness. The
whole model of cognition as it has been constructed here is based on the
contents and operations in working memory, but a moment’s considera¬
tion shows that we have omitted rather a lot from our working memory.
Take the typical reader as modeled by the CI model: Working memory
contains a dynamically changing stream of about seven (atomic) proposi¬
tions. But there must be more than that. There is an awareness of the
physical environment, especially of changes in it. There is an awareness
not only of the goal set for this particular reading task but also of multiple other goals. There is the constant background feeling of one's body -
its position, tone, and feeling. There is the self that is reading.
The self is important, for we know quite well from experience, obser¬
vation, as well as experiment, that everything self-related is special for an
organism and receives favored treatment. Yet experimental psychologists
and most cognitive scientists have refused to deal with the self concept.
In part, this refusal has been based on the prevalent view of the self,
which they rightly find unsatisfactory: that the self, in one way or another,
is some sort of homunculus that does the experiencing. However, think¬
ing about cognition in terms of shifting patterns of activation in a large
network suggests a more plausible alternative: that the self does not exist
but is continuously reconstructed as the activated part of the network.
The self nodes consist of some highly and more or less permanently
activated nodes in our long-term memory, including our name, the per¬
sons closest to us, our occupation, and other significant items. These personal memory nodes would always be activated, though with some fluctuations. The memories that make up my self are probably not entirely the
same when I am at home with my family and when I am speaking at a pro¬
fessional meeting. Moreover, such essentially propositional nodes are not
the only ones that constitute the self. There is another group of nodes in
the network representing the body that is part of the self.
I am not only my memories but also my body. These nodes may repre-
sent limbs and muscles, the position of the body in space, visceral infor¬
mation, as well as the chemical or hormonal balance or imbalance of the
body. I make no attempt either to list or describe all that might be
involved but only point to the existence and significance of this part of
our being, which I lump together as the somatic nodes in the network.
Not all these somatic nodes are so strongly or permanently activated. Our
body sense depends on context: whether I am sitting in my chair or
climbing a mountain, whether I am hungry or satiated, excited or drowsy.
But at any time, there is a group of somatic nodes that is sufficiently activated and that, together with the memory nodes, constitutes my self - or more precisely, that is how the CI theory could model the self.
The two groups of self nodes form a single, tightly interrelated cluster.
In consequence, perception is not just a sensory event but also has an
inherent somatic component. Another way of saying this is that percep¬
tion is not only a cognitive process but also an emotional process. We react
to the world not only with our sense organs but also with gut-level feel¬
ings. The things that excite us, please us, scare us are the ones most
closely linked to the somatic level. Our most central memories are the
ones most intimately linked to our body.
All this is clearly speculation. Furthermore, at this point these specu¬
lations are not sufficiently well formulated to be empirically testable.
However, some support for notions like the ones advanced here exists.
Damasio (1994) has provided evidence from neuroscience for his the¬
ory of somatic markers. He specifically points to the importance of body
reactions for cognition. Feelings about the body — real or vicarious —
become conditioned to certain perceptions and actions and thus become
crucial to the cognitive processes involved in planning and decision mak¬
ing. Actions and perceptions that are tied to good body feelings are
favored over others, and those tied to negative feelings are avoided, thus
directly influencing planning and decision making. The search of prob¬
lem spaces and the rational calculation of outcomes is cut short in this
way, allowing a person to act without becoming lost in thought and plan¬
ning. Damasio reiterates here a point made by Simon (1969) that, far
from being irrelevant by-products of decision making, emotions are cru¬
cial for resource management in a system with limited capacity, such as
human working memory.
There is another line of evidence from cognitive science that appears to
require an explicit account of the self. Many observations about language
and language use have been amassed by such writers as Lakoff (1987) and
Glenberg (1997) to show that language expresses relationships with refer¬
ence to the human body and that we interpret language by means of our
body schema. Through the experience with our own bodies we acquire a
number of schemas, such as the container schema, the part-whole schema,
or the source-path-goal schema. These body schemas are transferred to
language and the world. We understand the world in terms of these
schemas, and language has evolved around the use of them. We know what
a container is because of our direct experience with our body. It has an
inside, an outside, and a boundary, and we can map our understanding of
these relationships, which is extralinguistic and experiential, on the world
by means of metaphorical language. We understand Mary felt trapped in her marriage and wanted to get out by assigning to marriage the role of the container. In John and Mary are splitting up, we map marriage into the part-whole schema instead (examples from Lakoff, 1987). We understand the
world with reference to our own body experience and action; when lan¬
guage evolved, it reflected this bodily basis of understanding.
A working memory in which somatic nodes are continuously active may
provide a mechanism for the explanation of the linguistic phenomena
described by Lakoff (1987). All spatial perception occurs in the context of
activated somatic nodes that represent the position and state of our body.
Thus, from the very beginning, spatial information about the world is
linked to our own body. For instance, the concept more is acquired in a spa¬
tial context, as an increase in quantity along the verticality dimension,
which is given directly in our bodily experience. Once acquired in this con¬
text, the use of the word more is extended metaphorically to nonspatial
contexts, resulting in the more is up, less is down metaphor. Language is
embodied because it is linked to the representations of our own body in
working memory — in its daily use, in its acquisition by the individual, and
in its cultural evolution.
(2) Imagine that the United States is preparing for the outbreak of
an unusual Asian disease that is expected to kill 600 people. Two
alternative programs to combat the disease have been proposed.
Assume that the exact scientific estimates of the consequences of
the programs are as follows:
Program B is now preferred over Program A (78% vs. 22%). The CI theory readily predicts this switch, as Figure 11.21 shows. The negative
[Figure 11.20]
[Figure 11.21]
wording produces inhibition from the BAD node (.5 for 400 dying, 1 for
600 dying), whereas the GOOD node is connected only to none-dying.
Integrating this new network yields much lower activation values for both
choices, but their relative strength is now reversed: Program A has zero
activation value, whereas Program B is favored with an activation value of
.05. I am not aware of any data showing that negatively worded choice-
pairs in planning decisions are less activated than positively worded
choices, but that is a prediction of this kind of model.
Note that there is nothing in the CI model as such that yields these predictions. Rather, they are a consequence of the assumption we made
about the nature of the mental representation that is being constructed.
If people employ a nonanalytical comprehension strategy, treating the
text just like a story, they will make these “errors” in judgment. However,
if they take a more analytic approach and construct an abstract represen¬
tation, employing their knowledge of probability theory, both decision
alternatives will be equally strong (exp-value abbreviates the proposition
expected value equals 200) (see Figure 11.22).
In this case, both save-200 and save-600-or-none receive activation val¬
ues of .41. Thus, it is not a question whether people have “decision biases” but, rather, how they approach the decision problem. If they treat
it as an everyday comprehension problem, their choices will be quite dif¬
ferent than if they employ an analytic strategy, representing the informa¬
tion provided by the text in terms of the appropriate probability schema.
Some decision makers cannot use this strategy because they do not have
the necessary knowledge of probability theory and hence are unable to
construct a representation at this level. Tversky and Kahneman’s point is
that many decision makers who have the right schema still make errors.
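The arithmetic behind the abstract representation can be checked directly. This assumes the standard version of the problem (Program A saves 200 people for certain; Program B saves all 600 with probability 1/3 and nobody otherwise), whose full wording falls outside the passage quoted above.

```python
# Expected number of lives saved under each program; with these
# standard figures, the two alternatives are exactly matched.
def expected_saved(outcomes):
    """outcomes: list of (probability, lives_saved) pairs."""
    return sum(p * saved for p, saved in outcomes)

program_a = [(1.0, 200)]                  # save-200: 200 saved for certain
program_b = [(1 / 3, 600), (2 / 3, 0)]    # save-600-or-none
print(expected_saved(program_a), expected_saved(program_b))  # both equal 200
```

This equality is what the exp-value node encodes: at the formal level the two programs are indistinguishable, so neither framing can tip the balance.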
Thus, by formally incorporating into the CI model the self-relevance of textual information, in both a positive and negative way, some inter-
[Figure 11.22]
links. The other class of interesting things are those that are neither too
familiar nor too strange - objects of intermediate novelty (Berlyne, 1960).
Those are the things that arouse our curiosity, without necessarily being
linked to our own self. This form of interest seems rather different in its
psychological origin than the intrinsic interest based on links to the self.
“Interest,” like so many other psychological concepts, is a folk concept,
and perhaps scientific theory will have to separate what language has put
together. James (1969) thought so for emotions, which he claimed were a
heterogeneous class of phenomena lumped together by our language. In
the CI theory, it may be possible to treat emotion as a unitary phenomenon after all - in much the same way as James wanted to, namely, as perceptions of bodily states.
11.4 Outlook
Albrecht, J. E., O’Brien, E. J., Mason, R. A., & Myers, J. L. (1995). The role of
perspective in the accessibility of goals during reading. Journal of Experi¬
mental Psychology: Learning, Memory, and Cognition, 21, 364-72.
Albrecht, J. E., & Myers, J. L. (1995). The role of context in accessing distant
information during reading. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 21, 1459-68.
Albrecht, J. E., & O’Brien, E. J. (1991). Effects of centrality on retrieval of text-
based concepts. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 17, 932-9.
(1993). Updating a mental model: Maintaining both global and local coher¬
ence. Journal of Experimental Psychology: Learning, Memory, and Cognition,
19, 1061-70.
(1995). Goal processing and the maintenance of global coherence. In R. F.
Lorch & E. J. O’Brien (Eds.), Sources of coherence in reading (pp. 263-78).
Hillsdale, NJ: Erlbaum.
Altmann, G. T. M., & Steedman, M. (1988). Interaction with context during
human sentence processing. Cognition, 30, 191-238.
Anderson, A., Garrod, S., & Sanford, A. J. (1983). The accessibility of pronominal antecedents as a function of episode shift in narrative text. Quarterly Journal of Experimental Psychology, 35, 427-40.
Anderson, J. R. (1978). Arguments concerning representations for mental imagery. Psychological Review, 85, 249-77.
(1983). The architecture of cognition. Cambridge, MA: Harvard University
Press.
(1990). The adaptive character of thought. Hillsdale, NJ: Erlbaum.
(1993). Rules of the mind. Hillsdale, NJ: Erlbaum.
Anderson, J. R., & Bower, G. H. (1972). Human associative memory. Washington,
DC: Winston.
Anderson, R. C., & Ortony, A. (1975). On putting apples into bottles - a problem of polysemy. Cognitive Psychology, 7, 167-80.
Bisanz, G. L., Das, J. P., Varnhagen, C. K., & Henderson, H. K. (1992). Structure and components of reading times and recall for sentences in narratives: Exploring changes with age and reading ability. Journal of Educational Psychology, 84, 102-14.
Bisanz, G. L., LaPorte, R. E., Vesonder, G. T., & Voss, J. F. (1978). On the representation of prose: New dimensions. Journal of Verbal Learning and Verbal Behavior, 17, 337-57.
Bousfield, W. A., & Sedgwick, C. H. (1944). The analysis of sequences of
restricted associative responses. Journal of General Psychology, 30, 149-65.
Bovair, S., & Kieras, D. E. (1985). A guide to propositional analysis for research
on technical prose. In B. K. Britton & J. B. Black (Eds.), Understanding
expository text (pp. 315-62). Hillsdale, NJ: Erlbaum.
Bower, G. H. (1981). Mood and memory. American Psychologist, 36, 129-48.
Bower, G. H., Black, J. B., & Turner, T. J. (1979). Scripts in memory for text.
Cognitive Psychology, 11, 177-220.
Bransford, J. D., Barclay, J. R., & Franks, J. J. (1972). Sentence memory: A constructive versus interpretive approach. Cognitive Psychology, 3, 193-209.
Bransford, J. D., & Franks, J. J. (1971). The abstraction of linguistic ideas. Cognitive Psychology, 2, 331-50.
Bransford, J. D., & Johnson, M. K. (1972). Contextual prerequisites for understanding: Some investigations of comprehension and recall. Journal of Verbal Learning and Verbal Behavior, 11, 717-26.
Briars, D. J., & Larkin, J. H. (1984). An integrated model of skill in solving elementary word problems. Cognition and Instruction, 1, 245-96.
Britton, B. K., & Eisenhart, F. J. (1993). Expertise, text coherence, and constraint satisfaction: Effects on harmony and settling rate. In Proceedings of the Cognitive Science Society (pp. 266-71). Hillsdale, NJ: Erlbaum.
Britton, B. K., & Gulgoz, S. (1991). Using Kintsch’s model to improve instructional text: Effects of inference calls on recall and cognitive structures. Journal of Educational Psychology, 83, 329-45.
Broadbent, D. E. (1975). The magic number seven after fifteen years. In
A. Kennedy & A. Wilkes (Eds.), Studies in long-term memory (pp. 3-18).
London: Wiley.
Brown, A. L., & Palincsar, A. S. (1989). Guided, cooperative learning and individual knowledge acquisition. In L. B. Resnick (Ed.), Knowing, learning, and instruction. Hillsdale, NJ: Erlbaum.
Brown, J. S., & VanLehn, K. (1980). Repair theory: A generative theory of bugs in procedural skills. Cognitive Science, 4, 379-426.
Bruner, J. S. (1986). Actual minds, possible worlds. Cambridge, MA: Harvard University Press.
Cacciari, C., & Glucksberg, S. (1994). Understanding figurative language. In
Cummins, D. D., Kintsch, W., Reusser, K., & Weimer, R. (1988). The role of understanding in solving word problems. Cognitive Psychology, 20, 405-38.
Damasio, A. R. (1994). Descartes’ error: Emotion, reason, and the human brain. New York: Putnam.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working
memory and reading. Journal of Verbal Learning and Verbal Behavior, 19,
450-66.
DeCorte, E., Verschaffel, L., & DeWin, L. (1985). The influence of rewording verbal problems on children’s problem representation and solutions. Journal of Educational Psychology, 77, 460-70.
Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by Latent Semantic Analysis. Journal of the American Society for Information Science, 41, 391-407.
Dell, G. S., McKoon, G., & Ratcliff, R. (1983). The activation of antecedent information during the processing of anaphoric reference in reading. Journal of Verbal Learning and Verbal Behavior, 22, 121-32.
Dellarosa, D. (1986). A computer simulation of children’s arithmetic word problem solving. Behavior Research Methods, Instruments, & Computers, 18, 147-54.
Dewey, J. (1897). The University Elementary School: History and character.
University [of Chicago] Record, 2, 72-75.
Doane, S. M., Kintsch, W., & Polson, P. G. (1989). Action planning: Producing UNIX commands. In The Eleventh Annual Conference of the Cognitive Science Society (pp. 458-65). Hillsdale, NJ: Erlbaum.
Doane, S. M., McNamara, D. S., Kintsch, W., Polson, P. G., & Clawson, D. (1992). Prompt comprehension in UNIX command production. Memory & Cognition, 20, 327-43.
Doane, S. M., Pellegrino, J. W., & Klatzky, R. L. (1990). Expertise in a computer operating system: Conceptualization and performance. Human-Computer Interaction, 5, 267-304.
Donald, M. (1991). Origins of the modern mind. Cambridge, MA: Harvard University Press.
Ehrlich, K. (1980). Comprehension of pronouns. Quarterly Journal of Experimental Psychology, 32, 247-55.
Ehrlich, K., & Rayner, K. (1983). Pronoun assignment and semantic integration during reading: Eye-movements and immediacy of processing. Journal of Verbal Learning and Verbal Behavior, 22, 75-87.
Eich, E., Macauley, D., & Ryan, L. (1994). Mood dependent memory for events
of the personal past. Journal of Experimental Psychology: General, 123,
201-15.
Ericsson, K. A. (1985). Memory skill. Canadian Journal of Psychology, 39, 188-231.
Ericsson, K. A., & Charness, N. (1994). Expert performance: Its structure and
acquisition. American Psychologist, 49, 727-47.
Ericsson, K. A., & Chase, W. G. (1982). Exceptional memory. American Scientist,
70, 607-15.
Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102, 211-45.
Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data
(Rev. ed.). Cambridge, MA: MIT Press.
Estes, W. K. (1986). Array models for category learning. Cognitive Psychology,
18, 500-49.
(1995). A general theory of classification and memory applied to discourse processing. In C. A. Weaver, S. Mannes, & C. R. Fletcher (Eds.), Discourse comprehension: Essays in honor of Walter Kintsch (pp. 35-48). Hillsdale, NJ: Erlbaum.
Farnham-Diggory, S., & Gregg, L. (1975). Short-term memory function in young readers. Journal of Experimental Child Psychology, 19, 279-98.
Ferreira, F., & Clifton, C. (1986). The independence of syntactic processing.
Journal of Memory and Language, 25, 348-68.
Ferstl, E. C. (1994a). The Construction-Integration Model: A framework for studying context effects in sentence processing. In The Sixteenth Annual Conference of the Cognitive Science Society (pp. 289-94). Hillsdale, NJ: Erlbaum.
(1994b). Context effects in syntactic ambiguity resolution: The location of prepositional phrase attachment. Ph.D. dissertation, University of Colorado.
Ferstl, E. C., & Kintsch, W. (in press). Learning from text: Structural knowledge
assessment in the study of discourse comprehension. In S. R. Goldman &
H. van Oostendorp (Eds.), The construction of mental representations during
reading.
Fincher-Kiefer, R. H. (1993). The role of predictive inferences in situation
model construction. Discourse Processes, 16, 99-124.
Fischer, B., & Glanzer, M. (1986). Short-term storage and the processing of
cohesion during reading. Quarterly Journal of Experimental Psychology, 38,
431-60.
Fischer, G., Henninger, S., & Redmiles, D. (1991). Cognitive tools for locating and comprehending software objects for reuse. In The Thirteenth International Conference on Software Engineering (pp. 318-28). Austin, TX: ACM, IEEE.
Fletcher, C. R. (1981). Short-term memory processes in text comprehension. Journal of Verbal Learning and Verbal Behavior, 20, 264-74.
(1984). Markedness and topic continuity in discourse processing. Journal of Verbal Learning and Verbal Behavior, 23, 487-93.
Rayner, K., Pacht, J. M., & Duffy, S. A. (1994). Effects of prior encounter and global discourse bias on the processing of lexically ambiguous words: Evidence from eye fixations. Journal of Memory and Language, 33, 527-44.
Recht, D. R., & Leslie, L. (1988). Effect of prior knowledge on good and poor
readers. Journal of Educational Psychology, 80, 16-20.
Reder, L. M. (1982). Plausibility judgments versus fact retrieval: Alternative strategies for sentence verification. Psychological Review, 89, 250-80.
(1987). Strategy selection in question answering. Cognitive Psychology, 19, 90-138.
Rehder, R., Schreiner, M. E., Wolfe, M. B. W., Laham, D., Landauer, T. K., & Kintsch, W. (in press). Using Latent Semantic Analysis to assess knowledge: Some technical considerations. Discourse Processes.
Reusser, K. (1989). From text to situation to equation: Cognitive simulation of
understanding and solving mathematical word problems. In H. Mandl,
E. DeCorte, N. Bennett, & H. F. Friedrich (Eds.), Learning and instruction
(pp. 477-98). Oxford: Pergamon Press.
(1993). Tutoring systems and pedagogical theory. In S. Lajoie & S. Derry (Eds.), Computers as cognitive tools (pp. 143-77). Hillsdale, NJ: Erlbaum.
Richman, H. B., Staszewski, J. J., & Simon, H. A. (1995). Simulation of expert memory using EPAM IV. Psychological Review, 102, 305-30.
Riley, M. S., Greeno, J. G., & Heller, J. H. (1983). Development of children’s problem solving ability in arithmetic. In H. P. Ginsburg (Ed.), The development of mathematical thinking (pp. 153-96). New York: Academic Press.
Rips, L. J. (1994). The psychology of proof: Deductive reasoning in human thinking.
Cambridge, MA: MIT Press.
Rizzo, N. D. (1939). Studies in visual and auditory memory span with specific reference to reading disability. Journal of Experimental Education, 8, 208-44.
Rodenhausen, H. (1992). Mathematical aspects of Kintsch’s model of discourse
comprehension. Psychological Review, 99, 547-49.
Rosch, E., & Mervis, C. B. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7, 573-605.
Rubin, D. C. (1988). Learning poetic language. In F. S. Kessel (Ed.), The development of language and language researchers: Essays in honor of Roger Brown (pp. 339-51). Hillsdale, NJ: Erlbaum.
Rumelhart, D. E., & McClelland, J. L. (Eds.). (1986). Parallel distributed processing. Cambridge, MA: MIT Press.
Sanford, A. J., & Garrod, S. C. (1981). Understanding written language. London:
Wiley.
Sanford, A. J., Moar, K., & Garrod, S. (1988). Proper names as controllers of discourse focus. Language and Speech, 31, 43-56.
Singer, M., & Ritchot, K. F. M. (1996). The role of working memory capacity
and knowledge access in text inference processing. Memory & Cognition,
24, 733-43.
Skinner, B. F. (1938). The behavior of organisms. New York: Appleton-Century-
Crofts.
Smith, E. E., & Medin, D. L. (1981). Categories and concepts. Cambridge, MA:
Harvard University Press.
Smith, E. E., Shoben, E. J., & Rips, L. J. (1974). Structure and process in semantic memory: A feature model for semantic decision. Psychological Review, 81, 214-41.
Smolensky, P. (1986). Information processing in dynamical systems: Foundations of harmony theory. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel distributed processing. Cambridge, MA: MIT Press.
Spence, K. W. (1956). Behavior theory and conditioning. New Haven, CT: Yale
University Press.
Squire, L. R. (1992). Memory and the hippocampus: A synthesis of findings
with rats, monkeys, and humans. Psychological Review, 99, 195-231.
Squire, L. R., & Knowlton, B. (1995). Learning about categories in the absence
of memory. Proceedings of the National Academy of Sciences, 92, 12470.
Squire, L. R., Knowlton, B., & Musen, G. (1993). The structure and organization of memory. Annual Review of Psychology, 44, 453-95.
Stanners, R. F., Price, J. M., & Painton, S. (1982). Interrelationships among text elements in fictional prose. Applied Psycholinguistics, 3, 95-107.
Stanovich, K. E. (1980). Toward an interactive-compensatory model of individual differences in reading fluency. Reading Research Quarterly, 16, 32-71.
Staub, F. C., & Reusser, K. (1995). The role of presentational structures in
understanding and solving mathematical word problems. In C. A. Weaver,
S. Mannes, & C. R. Fletcher (Eds.), Discourse comprehension: Essays in honor
of Walter Kintsch (pp. 285-306). Hillsdale, NJ: Erlbaum.
Steinhart, D. J. (1996). Resolving lexical ambiguity: Does context play a role? MA thesis, University of Colorado.
Stern, E. (1993a). What makes certain arithmetic word problems involving comparison of sets so difficult for children? Journal of Educational Psychology, 85, 7-23.
(1993b). Die Entwicklung des mathematischen Verständnisses im Kindesalter [The development of mathematical understanding in childhood]. Habilitationsschrift, Universität München.
Stern, E., & Lehrndorfer, A. (1992). The role of situational context in solving
word problems. Cognitive Development, 7, 259-68.
Sternberg, S. (1969). Memory scanning: Mental processes revealed by reaction time experiments. American Scientist, 57, 421-57.
Stigler, J. W., Fuson, K. C., Ham, M., & Kim, M. S. (1986). An analysis of addi-
Whitney, P., Budd, D., Bramuci, R. S., & Crane, R. C. (1995). On babies, bathwater, and schemata: A reconsideration of top-down processes in comprehension. Discourse Processes, 20, 135-66.
Whitney, P., Ritchie, B. G., & Clark, M. B. (1991). Working memory capacity and
the use of elaborative inferences in text comprehension. Discourse Processes,
14, 133-46.
Wisniewski, E. J. (1995). Prior knowledge and functionally relevant features in concept learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 449-68.
Wolfe, M. B. W., & Kintsch, W. (in preparation). Story recall: Joining comprehension theory and memory theory.
Wolfe, M. B., Schreiner, M. E., Rehder, R., Laham, D., Foltz, P. W., Landauer,
T. K., & Kintsch, W. (in press). Learning from text: Matching reader and
text by Latent Semantic Analysis. Discourse Processes.
Yekovich, F. R., Walker, C. H., Ogle, L. T., & Thompson, M. A. (1990). The
influence of domain knowledge on inferencing in low-aptitude individuals.
In A. C. Graesser & G. H. Bower (Eds.), The psychology of learning and motivation (pp. 175-96). New York: Academic Press.
Zwaan, R. A., Langston, M. C., & Graesser, A. C. (1995). The construction of
situation models in narrative comprehension: An event-indexing model.
Psychological Science, 6, 292-97.
Zwaan, R. A., Magliano, J. P., & Graesser, A. C. (1995). Dimensions of situation
model construction in narrative comprehension. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 21, 386-97.
Name Index
Kitajima, M., 97, 388, 389, 390, 391, 392f, 422
Klatzky, R. L., 382, 383
Knowlton, B., 25, 26
Kohler, I., 22
Kolodner, J. L., 211
Korkel, J., 287
Kosslyn, S. M., 44, 45
Kotovsky, K., 397, 398
Kozminsky, E., 174, 221, 285
Kraut, R. E., 388
Kreuz, R. J., 193
Kruley, P., 198
Kucan, L., 329
Kunda, Z., 394, 402, 403, 404, 406, 408
Laham, D., 87, 88, 90, 324, 326f, 327
Lakoff, G., 43, 157, 413
Landauer, T. K., xv, xvi, 87, 88, 89, 90, 134, 324, 326f, 327
Langston, M. C., 198
Langston, W. E., 198
LaPorte, R. E., 297
Larkin, J. H., 31, 334, 339, 365, 381
Lave, J., 23
Lehnert, W. G., 211
Lehrndorfer, A., 346
Leibniz, G. W. F., 32
Leibowitz, H. W., 24
Lesgold, A. M., 148
Leslie, L., 287
Lewis, A. B., 348, 349, 352, 353, 354
Lewis, C., xv
Linde, C., 179
Long, D. L., 132, 194, 198, 284, 285, 286
Lorch, E. P., 176, 179
Lorch, R. F., 176, 179
Lord, A. B., 212
Loxterman, J. A., 312
McCandliss, B. D., 126, 127
McCarrell, N. S., 78, 135
Macauley, D., 420
McClelland, J. L., 35, 170, 171
McConkie, G. W., 283
McDaniel, M. A., 319
McDonald, J. L., 148
MacDonald, M. C., 242
McKeown, M. G., 312, 329
McKoon, G., 73, 78, 145, 146, 148, 156, 174, 193, 221, 226, 285
McNamara, D. S., xv, 97, 230, 309f, 311, 313, 315f, 316f, 317, 318, 319, 320, 323, 381, 384, 385f, 386, 388, 391
MacWhinney, B., 148
Magliano, J. P., 194, 198
Mandler, J. M., 211
Mani, K., 198
Mannes, S. M., xv, 37, 83, 84f, 85, 97, 320, 321f, 322, 372, 376, 379, 380, 388, 391
Marr, D., 5
Marshall, C. R., 235
Marshall, J., 24
Marslen-Wilson, W. D., 148
Mason, R. A., 181, 190, 194, 235
Masson, M. E. J., 239
Mathews, P. D., 176, 179
Mayer, R. E., 359