Davies Kreysa JCL 2018
doi:10.1017/S0305000918000120
ARTICLE
Abstract
Children’s ability to refer is underpinned by their developing cognitive skills. Using a
production task (n = 57), we examined pre-articulatory visual fixations to contrast
objects (e.g., to a large apple when the target was a small one) to investigate how visual
scanning drives informativeness across development. Eye-movements reveal that
although four-year-olds fixate contrast objects to a similar extent as seven-year-olds and
adults, this does not result in explicit referential informativeness. Instead, four-year-olds
frequently omit distinguishing information from their referring expressions regardless
of the comprehensiveness of their visual scan. In contrast, older children make greater
use of information gleaned from their visual inspections, like adults. Thus, we find a
barrier not to the INCIDENCE of contrast fixations by younger children, but to their USE
of them in referential informativeness. We recommend that follow-up work investigates
whether younger children’s immature executive skills prevent them from describing
referents in relation to contrast objects.
Introduction
Of the wide range of pragmatic phenomena developing throughout childhood, the
ability to refer unambiguously is a communicative priority, yet the component and
integrative skills driving this are still unclear. The current study focuses on the
development of a foundational prerequisite for unambiguous reference: the ability to
visually scan a scene and then integrate distinguishing information into felicitous
referring expressions. To complement the large body of existing work investigating
the later stages of reference production (e.g., assessing accessibility; perspective-taking:
Allen, Hughes, & Skarabela, 2015; Nadig & Sedivy, 2002), here we focus on
the earlier stages, when speakers collect the information they need to eventually
produce fully informative referring expressions.
© Cambridge University Press 2018
Downloaded from https://www.cambridge.org/core. IP address: 213.122.203.89, on 12 May 2018 at 09:45:06, subject to the Cambridge Core
terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0305000918000120
expressions? How much time is required to encode distinguishing features, and how
long in advance of articulation? What enables them to identify these distinctive
features and then encode them into their referential choices? To address these
questions, we investigate how the prerequisite of VISUAL SCANNING BEHAVIOUR affects
children’s referential informativeness.
Although few studies of children’s sentence production have used eye-movement
paradigms, existing research demonstrates the value of such methods in examining
links between children’s visual attention, speech planning, and referential production.
Bunger, Trueswell, and Papafragou (2012) recorded four-year-olds’ eye-movements as
they described motion events to ascertain whether children’s linguistic omissions are
due to attentional deficits (i.e., that children simply do not look at core aspects of a
scene) or due to constraints stemming from the developing linguistic system itself.
Like the adult comparison group in Bunger et al.’s study, the children fixated
multiple core elements of the scene (e.g., instrument, path). However, this did not
always lead them to mention these aspects, in contrast to the more explicit adults.
The authors conclude that the similar eye-movement patterns yet different linguistic
encoding between the two age groups reflect children’s developing interface between
attention and language production, or their developing linguistic production system
(the latter explanation was also put forward by Norbury, 2014, with respect to
children with language impairment). These findings leave open the possibility that
even if children fixate a crucial aspect of a scene, they may not go on to encode it in
their referring expressions.
Intuitively, in a referential communication paradigm, speakers must look at
competitor objects to identify which features distinguish the target from these other
objects, and to monitor potential ambiguity for the addressee. Deutsch and
Pechmann (1982, p. 178) appealed for research into the link between visual scanning
and referring, and Pechmann (1989, p. 98) proposed incomplete visual scanning as a
reason for failures in informativeness, though he did not provide developmental data to
support this. More recently, studies into adults’ pre-articulatory visual scanning
found that fully informative expressions are associated with fixations to a contrast
referent before articulation (Brown-Schmidt & Tanenhaus, 2006; Davies & Kreysa,
2017¹). In Davies and Kreysa (2017), we showed that speakers were more likely to be
informative when they had fixated the contrast object during multiple temporal
regions and for longer before starting to speak. However, such fixations were not
essential for producing a fully informative utterance: the cooperative adult speaker
has a pragmatic drive to be informative and can use information gleaned from a
number of sources (direct fixation, extrafoveal processing, previous exposure) in
order to provide their addressee with a felicitous referring expression.
Rabagliati and Robertson (2017) examined three- to five-year-olds’ monitoring
processes when producing informative or under-informative expressions to refer to
target objects accompanied by a foil and a distractor object. They investigated
proactive monitoring, i.e., saccades to target and contrast objects before naming.
Unlike the adult comparison group, children across the tested age range did not
typically monitor for potential ambiguity, although they did show some evidence of
¹ Like the current study, these investigations into adults’ pre-articulatory visual scanning use a paradigm
that successfully combines language production with eye-tracking, as validated by Griffin (2004), Griffin
and Bock (2000), Meyer, Sleiderink, and Levelt (1998), Vanlangendonck, Willems, Menenti, and
Hagoort (2016), i.a.
Method
Participants
Twenty-seven four-year-olds and 30 seven-year-olds were recruited from nurseries,
schools, and playschemes in Leeds. Table 1 contains participant profile information.
All were monolingual native speakers of British English, and all had normal or
corrected-to-normal vision and hearing. Each participated voluntarily with the
informed consent of their caregiver, and each child gave their assent before starting
the tasks. In addition, 24 adults were recruited from the University of Leeds for a
separate study with a similar methodology (reported in Davies & Kreysa, 2017). We
refer to this adult data as a comparison to the children’s patterns, and present this
control group data at relevant points to show fully developed referential and visual
behaviour.
Table 1. Participant Profiles for the Original Sample and after Exclusions from the Eye-movement
Analysis (see ‘Data cleaning’ for exclusion criteria)
Entire sample: analysed for production data and standardised test performance
Journal of Child Language 7
Figure 1. Four-object stimuli. Left panel shows a contrast-absent item and right panel shows a contrast-present
item. Target is highlighted in both panels.
Figure 2. Eight-object stimuli. Left panel shows a contrast-absent item and right panel shows a contrast-present
item. Target is highlighted in both panels.
mates by size (large vs. small); no other adjectives were required or would discriminate
the target from the contrast object. In the four-object displays, the contrast-absent items
contained three distractor objects and the contrast-present items contained two. In the
eight-object displays, the contrast-absent items contained seven distractors and the
contrast-present items contained six. The 16 critical items all appeared in four
pseudo-randomised lists, counterbalanced for target attribute and for block order.
Thus, half the participants saw, e.g., the small apple as the target, while the other
half saw the large apple as the target. No object appeared as target more than once
throughout the experiment, and the position of the target and the contrast objects
was rotated around each slot of the four- and eight-object displays. Stimuli were
presented and eye-movements recorded using Tobii Studio software, v. 3.1.6.
The 24 filler items were of four types: two-object picture displays, two-object
number displays, four-object picture displays, and eight-object picture displays. In
the four- and eight-object filler displays, targets differed from contrast mates by
pattern (stripy vs. spotty). The fillers were partly designed to mask the pattern
inherent in the critical trials, i.e., when a display contained a contrast set, the target
object in the critical trials was necessarily a member of this set. In order to reduce
the likelihood of the children predicting the identity of the critical target before it
was highlighted, half of the filler items featured a target object which was not a
member of the co-present contrast set.
The sequencing of each trial is depicted in Figure 3. The experiment was conducted
using a Tobii X120 remote desk-mounted eye-tracker, a Dell flat panel monitor visible
to the participant, and a Lenovo W540 laptop running the experimental software,
visible to the experimenter. Participants’ utterances were recorded using an
omnidirectional tabletop microphone. The adult design and procedure were
comparable to those of the child experiment, though there were twice as many items
and dimensions involved, and the exposure times for the preview and
target-highlighted displays were each 1000 ms shorter. For full details, see Davies and
Kreysa (2017).
Figure 3. Trial sequence: (1) the fixation cross was presented for 1000 ms, followed by (2) a preview of the
displays without the target highlight (3000 ms for four-object displays; 4000 ms for eight-object displays); (3)
a red fixation cross then appeared within the preview for a further 1000 ms; (4) the fixation cross
disappeared and the target was highlighted with a red frame around the object. This final display remained
visible for 5000 ms, during which time the participant produced their utterance using the form “click on the X”.
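The trial sequence in Figure 3 can be written out as a simple timeline. The following is an illustrative Python sketch only, not the Tobii Studio configuration used in the study: the function names `trial_phases` and `trial_duration_ms` and the phase labels are hypothetical, and only the durations are taken from the caption.

```python
def trial_phases(n_objects):
    """Return (phase, duration_ms) pairs for one child trial, following Figure 3.

    The caption gives preview durations only for four- and eight-object displays.
    """
    if n_objects not in (4, 8):
        raise ValueError("Figure 3 gives durations for 4- and 8-object displays only")
    preview_ms = 3000 if n_objects == 4 else 4000
    return [
        ("fixation_cross", 1000),        # (1) initial fixation cross
        ("preview", preview_ms),         # (2) display without target highlight
        ("red_cross_in_preview", 1000),  # (3) red fixation cross within the preview
        ("target_highlighted", 5000),    # (4) red frame; utterance produced here
    ]


def trial_duration_ms(n_objects):
    """Total scheduled duration of one trial in milliseconds."""
    return sum(duration for _, duration in trial_phases(n_objects))


for n in (4, 8):
    print(n, "objects:", trial_duration_ms(n), "ms")
```

Under these durations, the preview plus the red-cross phase gives children 4000 ms (four objects) or 5000 ms (eight objects) of exposure to the display before the target is revealed.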
Procedure
Participants were tested individually in a quiet room in their nursery, school, or
playscheme setting. The nursery children’s key worker sat with them during their
session. Children were welcomed, briefed on the content of the session, and then
gave their assent. The order of tests was the same for all participants and was as
follows (with approximate durations):
The children were debriefed, thanked, and received a certificate for their participation.
The whole testing session lasted approximately 30 minutes. The study was approved by
the Faculty Research Ethics Committee at the lead author’s institution.
Data preparation: utterance coding
The utterances were transcribed and coded from the audio-recording made during the
testing session. If a referring expression contained minimally sufficient information for
the addressee to uniquely identify the referent (i.e., with appropriate modification in the
contrast-present condition, e.g., “click on the big apple”), it was coded as OPTIMAL. If it lacked
such information (e.g., “click on the apple” in the contrast-present condition), it was
coded as UNDER-INFORMATIVE. Since we were interested in participants’ eye-movements
leading up to their first attempt at a referring expression, utterances which were initially
under-informative but subsequently self-corrected to an informative form were coded as
under-informative (e.g., “click on the glasses (.) the big ones”). This applied to six out
of the 432 critical referring expressions in the four-year-olds’ data (1%), and 17 out of
the 480 critical referring expressions in the seven-year-olds’ data (3.5%). Referring
expressions which contained more information than necessary for unique reference
resolution (e.g., “click on the little tie” in a display with a single tie) were coded as
OVER-INFORMATIVE. Utterances which referred to an INCORRECT TARGET were coded as such
and excluded from subsequent analysis: this applied to nine out of the 432 critical
referring expressions in the four-year-olds’ data (2%), and one out of the 480 critical
referring expressions in the seven-year-olds’ data (0.2%). Trials in which the
participants did not respond or gave incomprehensible responses were coded as NO
RESPONSE: this applied to 11 out of the 432 critical referring expressions in the
four-year-olds’ data (2.5%), and three out of the 480 critical referring expressions in the
seven-year-olds’ data (0.6%). Only the optimal, under-informative and over-informative
items went forward for analysis. The other response types were excluded, totalling 6%
of the four-year-olds’ data and 4% of the seven-year-olds’ data.
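The coding rules above can be summarised in a few lines of code. This is a minimal sketch under a simplifying assumption that size is the only modifier at issue and that the utterance names the correct target; the function names `code_utterance` and `share_pct` are hypothetical and are not part of the authors' coding procedure.

```python
def code_utterance(contrast_present, used_size_modifier):
    """Code a first-attempt expression that names the correct target.

    Simplifying assumption: size is the only modifier considered.
    """
    if contrast_present:
        # A size adjective is needed to single out the target from its contrast mate.
        return "optimal" if used_size_modifier else "under-informative"
    # Without a contrast mate, the bare noun already identifies the target.
    return "over-informative" if used_size_modifier else "optimal"


def share_pct(count, total):
    """Percentage of critical referring expressions falling in a category."""
    return 100 * count / total


print(code_utterance(contrast_present=True, used_size_modifier=False))

# Counts reported above for the four-year-olds (432 critical expressions).
for label, count in [("self-corrected", 6), ("incorrect target", 9), ("no response", 11)]:
    print(f"{label}: {share_pct(count, 432):.1f}%")
```

Note that a self-corrected utterance such as “click on the glasses (.) the big ones” would be passed to `code_utterance` with `used_size_modifier=False`, since only the first attempt is coded.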
Results
Referential communication task: production data
For measuring the form of referring expressions from participants’ production data
(hypothesis 1), the experiment had a 2 × 2 × 2 design (age group × contrast × display
² For fixations which spanned two or three temporal regions, each region was allocated half or a third of a
fixation, respectively. Fixation duration was defined to include individual fixations, gazes, and refixations of
an object within one temporal region.
³ Of the remaining 6%, 2% were over-informative and 4% were excluded due to references to the incorrect
target, no response, or incomprehensible response.
⁴ Of the remaining 9%, 8% were over-informative and 1% were excluded due to references to the incorrect
target, no response, or incomprehensible response.
⁵ The production data were also analysed using the reduced sample included in the eye-movement
analysis (see Table 1 for details of the subsample). Behavioural effects were not quantifiably changed in
this smaller sample. As in the full sample, the four-year-olds were largely under-informative in their
referential choices (84% under-informative and 14% informative), whereas the seven-year-olds were
more frequently informative (38% under-informative and 62% informative). Both main effects and the
interaction held in the original direction, i.e., age on informativeness (F(1,50) = 35.51, p < .001, ηp² = .42);
display complexity on informativeness (F(1,50) = 34.2, p < .001, ηp² = .41); age and complexity on
informativeness (F(1,50) = 10.01, p < .01, ηp² = .17).
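The partial eta squared values in this footnote can be cross-checked against the reported F ratios using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is a reader's check rather than the authors' analysis code; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F ratio, reported effect size) for each effect, all with df = (1, 50).
reported = {
    "age": (35.51, 0.42),
    "display complexity": (34.2, 0.41),
    "age x complexity": (10.01, 0.17),
}

for effect, (f_value, eta) in reported.items():
    computed = partial_eta_squared(f_value, 1, 50)
    print(f"{effect}: computed {computed:.2f}, reported {eta:.2f}")
```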
Table 2. Mean Rates of Referential Informativeness as a Percentage of all Expressions Produced.
Percentages Summing < 100 within Informativeness Group Are Due To Exclusions (see footnotes 3 and 4).
Figure 4. Mean rates of informativeness as a percentage of expressions produced, by age group and display
complexity; contrast-present condition only.
though not to the extent of adult speakers. Both child groups produced fewer
informative expressions when displays were complex, and this effect was more
pronounced in the older group.
Correlational analyses
Pearson correlation coefficients were computed to assess the relationship between
informativeness of referring expressions (contrast-present condition only) and
performance on the standardised tests. Within each child group, there were no
significant correlations between the proportion of referring expressions that were
under-informative and any of the standardised measures (all ps > .1; all rs < .3).
This was the case when correlations were run across the two levels of display
complexity, and when the four-object and eight-object conditions were analysed
separately. Thus, our second hypothesis was not supported. That is, the
informativeness of children’s referring expressions is not associated with their
receptive language ability, their narrative ability, or their visual search capabilities, as
measured using the selected tools. This lack of significant associations may have been
due to the minimal variance in the informativeness rates in both the four-year-old
and the seven-year-old groups.
Note that when correlations were run across the entire child sample (n = 57), we
found significant positive correlations between informativeness and scores on the
BPVS (raw) (r = 0.58, p < .001), scores on the DELV (r = 0.39, p < .01), and scores on
the Bug Search (r = 0.55, p < .001). No relationship was found between
informativeness and BPVS (standardised) (r = –0.19, p = .16). That is, the higher the
children scored on tests of receptive vocabulary, narrative ability, and visual search,
the higher their rates of informativeness. However, since these correlations did not
remain once age was controlled (all ps > .7; all rs < 0.05), nor were they significant
within each age group, age appears to be driving the relationship in the whole
sample: older children tend to be more informative and score higher on the tests
Table 3. Scores on Standardised Tests: Mean (SD).
                             Children
Measure                      Four-year-olds   Seven-year-olds   t       df   p        d      Adults
BPVS (raw)                   74.1 (11)        108 (14.4)        −10.0   55   < .001   2.65   161.3 (4.1)
  range                      54–99            83–140                                         151–167
BPVS (standardised)          109.3 (6.9)      102.6 (12)        2.60    47   < .05    0.68   111.4 (7.4)
  range                      91–124           81–126                                         96–124
DELV (narrative)             3.5 (1.6)        6 (1.1)           −6.77   47   < .001   1.82   5.8 (1.2)
  range                      1–7              2–7                                            3–7
WPPSI-IV Bug Search (raw)    21.9 (8.7)       41.5 (7.3)        −9.15   54   < .001   2.44   –
  range                      6–42             29–60                                          –
because their abilities in all areas improve with age, rather than their informativeness
and language/cognitive abilities being directly related.
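The logic of this argument (a whole-sample correlation that disappears once age is controlled) can be illustrated with a first-order partial correlation. The sketch below uses SYNTHETIC values constructed for illustration, not the study's data; the function names and the group baselines are hypothetical, and the source does not state which procedure the authors used to control for age.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# SYNTHETIC data (not the study's): both measures depend on age group only,
# plus small within-group variation that is unrelated across the two measures.
age = [4] * 27 + [7] * 30
informativeness = [(20 if a == 4 else 60) + i % 5 for i, a in enumerate(age)]
test_score = [(70 if a == 4 else 100) + 2 * (i % 3) for i, a in enumerate(age)]

print(f"whole-sample r = {pearson(informativeness, test_score):.2f}")
print(f"partial r (age controlled) = {partial_corr(informativeness, test_score, age):.2f}")
```

In this constructed example the whole-sample correlation is strong while the age-controlled correlation is close to zero, which is the pattern the text describes.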
Eye-movement data
For measuring the relationship between eye-movements and informativeness
(hypothesis 3), each analysis used a different combination of predictor and outcome
variables. Since the eye-movement analyses focused on looks to the contrast object
(which was of course absent in the contrast-absent condition), only the
contrast-present level of this variable was retained. The first analysis (proportion of
contrast-fixated trials resulting in informative expressions) took age, visual complexity,
and presence or absence of fixations to the contrast object during two temporal
regions (preview; pre-utterance) as predictors, and utterance type as outcome
(though only with two levels, under-informative and informative; over-informative
trials were excluded due to their low frequency in the data). The second analysis
(proportion of under-informative trials preceded by a contrast fixation) took age and
presence of contrast fixations as predictors (with the two temporal regions analysed
separately) and utterance type as outcome. The third analysis (contrast fixation
duration) took age and utterance type as predictors, and total fixation duration to
the contrast object during the same temporal regions as the outcome.
Data cleaning
Since the eye-movement analyses focus on fixations to the contrast object, the
contrast-absent condition is not considered here. Five participants (four from the
four-year-old group; one seven-year-old) were wholly excluded from the eye-tracking
analysis since in each of these cases fewer than 20% of the samples recorded by the
eye-tracker were usable, leaving the remaining participant samples at n = 23 and n =
29 for the younger and older groups respectively (see Table 1 for details). A more
conservative cut-off (< 50%) had previously been used in analysing the adult data;
thus four adult participants were also excluded from the eye-tracking analyses.
In addition, 19 individual trials from the four-year-olds’ data and 28 trials from the
seven-year-olds’ data had to be removed from the eye-movement analyses for one of five
reasons: (i) no oral response; (ii) early articulation, i.e., a participant’s utterance
occurred before the target was revealed; (iii) late articulation, i.e., the utterance
started after the offset of the target display; (iv) the incorrect target was referred to;
(v) over 50% of the samples in the eye-tracking data for a particular trial had validity
codes of 4-4, signalling that neither eye was found by the eye-tracker. After these
exclusions, 90% of the four-year-olds’ original dataset and 88% of the
seven-year-olds’ were included in the analyses.
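The sample-quality criteria described above could be applied along the following lines. This is a hedged sketch, not the authors' cleaning script: the function names are hypothetical, gaze samples are represented as (left, right) validity-code pairs, and treating every pair other than 4-4 as usable is an assumption, since the text does not define a usable sample.

```python
def proportion_usable(samples):
    """Share of gaze samples in which at least one eye was found.

    `samples` is a list of (left_validity, right_validity) pairs; a pair of
    (4, 4) means neither eye was found. Treating every other pair as usable
    is an assumption of this sketch.
    """
    if not samples:
        return 0.0
    usable = sum(1 for left, right in samples if (left, right) != (4, 4))
    return usable / len(samples)


def exclude_participant(samples, min_usable=0.20):
    """Child criterion: drop participants with under 20% usable samples."""
    return proportion_usable(samples) < min_usable


def exclude_trial(samples, max_lost=0.50):
    """Criterion (v): drop trials where over 50% of samples are coded 4-4."""
    return (1.0 - proportion_usable(samples)) > max_lost


# Hypothetical trial in which 6 of 10 samples were lost.
trial = [(0, 0)] * 4 + [(4, 4)] * 6
print(exclude_trial(trial))
```

Passing `min_usable=0.50` to `exclude_participant` would correspond to the stricter cut-off reported for the adult data.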
both of these temporal regions were excluded, leaving 70% of the four-year-olds’
original dataset and 69% of the seven-year-olds’.
This analysis allows us to examine the role of contrast fixations as a predictor of
informativeness. We focused on those trials which contained a contrast fixation in
either the preview region, the pre-utterance region, or both. This represented 80% of
the four-year-olds’ valid trials, 88% of the seven-year-olds’ valid trials, and 80% of
the adults’ valid trials (n = 102, n = 142, and n = 235, respectively).
As Figure 5 shows, when four-year-olds fixated the contrast object, they seldom went
on to use it in their referring expressions (only 17% of contrast-fixated trials were
informative across display complexity conditions). A clear difference can be seen in
the seven-year-olds, who frequently went on to use the information from the
contrast fixation in their expressions (69% of contrast-fixated trials were informative
across display complexity conditions). Adults almost always went on to use the
information from the contrast fixation in their expressions (83% of contrast-fixated
trials were informative across display complexity conditions). Importantly, although
the older children’s rate of informativeness is in line with the adults’ for the
four-object displays, they were significantly hampered from reaching adult levels by
the eight-object displays. A chi-square analysis reveals a significant association
between informativeness and display complexity in the seven-year-olds (χ²(1) =
11.13, p = .001, Cramér’s V = .28, odds ratio = 1.97), with no association between
informativeness and complexity for the four-year-olds (χ²(1) = 0.03, n.s.) or the
adults (χ²(1) = 0.007, n.s.).
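The effect size reported for the seven-year-olds can be related to the chi-square statistic through Cramér's V = sqrt(χ² / (n × min(r − 1, c − 1))). The sketch below assumes that the test was run on the 142 contrast-fixated trials of the seven-year-olds in a 2 × 2 table (informativeness × display complexity); that assumption, and the function name `cramers_v`, are ours rather than stated in the source.

```python
import math


def cramers_v(chi_square, n, n_rows=2, n_cols=2):
    """Cramér's V for an n_rows x n_cols contingency table with n observations."""
    return math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))


# Reported chi-square for the seven-year-olds; n = 142 is an assumption (see above).
print(f"V = {cramers_v(11.13, 142):.2f}")
```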
Figure 5. Proportion of all trials with pre-articulatory contrast fixations which resulted in informative or
under-informative referring expressions, by age and display complexity. Since the percentages are based on
an absolute frequency out of all trials (i.e., not averaged over participants or trials), there is no variance to
report.
This analysis suggests that the four-year-olds struggled to integrate the information
they gleaned from fixating the contrast during utterance planning. Despite looking at
the contrast object, they did not go on to provide fully informative expressions in the
same trial. On the other hand, contrast fixations boosted informativeness for the
seven-year-olds who, like adults, were able to use the information from the contrast
object in their ensuing informative expressions. However, in contrast to adults, the
older children’s informativeness was significantly compromised by display complexity.
Table 4. Frequency of Valid Trials of Each Fixation Pattern and Each Utterance Type
Temporal region containing   4-year-olds                                 7-year-olds
a contrast fixation          Informative  Under-informative  Total      Informative  Under-informative  Total
Neither                      1            25                 26         7            12                 19
Preview                      7            19                 26         17           14                 31
Pre-utterance                1            24                 25         13           13                 26
Both                         9            42                 51         68           17                 85
TOTALS                       18           110                128        105          56                 161
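The counts in Table 4 can be used to recompute the share of valid trials that contained a contrast fixation and the share of those trials that were informative. The following is a minimal Python sketch; the dictionary layout and the function name `contrast_fixated_summary` are hypothetical, while the counts are copied from the table.

```python
# (informative, under-informative) counts per fixation pattern, from Table 4.
TABLE_4 = {
    "4-year-olds": {"Neither": (1, 25), "Preview": (7, 19),
                    "Pre-utterance": (1, 24), "Both": (9, 42)},
    "7-year-olds": {"Neither": (7, 12), "Preview": (17, 14),
                    "Pre-utterance": (13, 13), "Both": (68, 17)},
}


def contrast_fixated_summary(rows):
    """Return (n_fixated, % of valid trials fixated, % of fixated trials informative)."""
    fixated = {name: counts for name, counts in rows.items() if name != "Neither"}
    n_valid = sum(inf + under for inf, under in rows.values())
    n_fixated = sum(inf + under for inf, under in fixated.values())
    n_informative = sum(inf for inf, _ in fixated.values())
    return n_fixated, 100 * n_fixated / n_valid, 100 * n_informative / n_fixated


for group, rows in TABLE_4.items():
    n_fixated, pct_fixated, pct_informative = contrast_fixated_summary(rows)
    print(f"{group}: n = {n_fixated}, {pct_fixated:.0f}% of valid trials, "
          f"{pct_informative:.0f}% informative")
```

The resulting trial counts (102 and 142) and rounded percentages agree with the figures reported for the two child groups in the first eye-movement analysis.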
Figure 6. Mean proportions of under-informative trials following contrast fixation patterns across preview and
pre-utterance temporal regions. Error bars show ± 1 SE.
fixation vs. no fixation across two temporal regions). Linear mixed effects models
investigated the influence of age and informativeness on fixation duration to the
contrast object during the preview and pre-utterance temporal regions combined.
Again, data from the adults is shown for comparison, though only the child groups
are included in the reported statistical analyses. Since there were 45 trials in which
children did not fixate the contrast object at all in these regions, we excluded those
trials from this analysis. Three outlying trials with fixation durations of > 3000 ms
were also excluded, leaving 83% of the prepared dataset. The model included the two
fixed factors (age and informativeness), their interaction, and random intercepts for
participants and items: fixation duration to contrast ~ age * informativeness + (1 |
ppt) + (1 | item).
During the combined preview and pre-utterance regions, the four-year-olds (M =
1037 ms, SD = 712) fixated the contrast for longer than the seven-year-olds (M = 887 ms,
SD = 587; age coefficient = –233.7, SE = 96.8, t = –2.41), regardless of informativeness.
Both age groups fixated the contrast object for longer before producing an informative
utterance (M = 1004 ms, SD = 643) than before producing an under-informative utterance
(M = 899 ms, SD = 643; informativeness coefficient = –211.1, SE = 93.8, t = –2.25).
Although Figure 7 suggests that this pattern is more marked in the seven-year-olds
(informative M = 988 ms, SD = 617; under-informative M = 664 ms, SD = 444) than
the four-year-olds (informative M = 1107 ms, SD = 801; under-informative M = 1023 ms,
SD = 698: informativeness coefficient = –216.2, SE = 120.7, t = –1.79), the interaction was
not significant (t = –0.99). However, it seems clear that longer looks to the contrast object
before speaking are associated with informativeness, particularly in the older children.
Figure 7. Mean total fixation duration to the contrast object during the preview and pre-utterance regions, by
age group and informativeness. Error bars show ± 1 SE.
As the contrast fixation analyses suggest, children at four and at seven years of age
marginally differed in how long they fixated contrast objects. Distractor fixations were
also monitored to provide a measure of how much the children were scanning the
display generally. On average, four-year-olds and seven-year-olds showed a similar
pattern of fixation durations between areas of interest, with distractor items being
fixated least (see Figure 8). Adults also fixated the distractors the least of all areas of
interest, though they showed a more marked preference for the target than the two
child groups.
Figure 8. Mean total fixation duration to each area of interest during the preview and pre-utterance regions, by
age. Error bars show ± 1 SE.
Discussion
How does children’s visual scanning behaviour influence the informativeness of their
referring expressions? As a first step to answering this question, we ascertained that
our sample of four-year-olds produced under-informative expressions 83% of the
time when referring to objects in a display containing a contrast, whereas their
seven-year-old peers did so just 37% of the time. Having to apprehend more
complex displays increased rates of under-informativeness in both age groups,
though it penalised the older children more heavily, since they had a higher baseline
rate to fall from. Both the age and complexity findings support our first hypothesis,
and replicate previous production studies, which found a developmental shift from
under-informativeness to full informativeness as children mature (Davies & Katsos,
2010; Matthews et al., 2007; Whitehurst & Sonnenschein, 1981, i.a.).
Of the various reasons proposed in the literature for younger children’s
under-informativeness, we focused on the association between visually scanning the
display during utterance planning – specifically looking at the contrast object – and
the ensuing informativeness of referring expressions. By examining children’s
eye-movements as they previewed visual stimuli and planned their expressions, we
have shown that, although children looked at the contrast object at least once in the
majority of trials, younger children did not encode the critical information in their
referring expressions. Thus, we discount the suggestion that it is a lack of contrast
fixations that causes referential under-informativeness in young children (Deutsch &
Pechmann, 1982; Pechmann, 1989). As our data show, younger children indeed
allocate attention to a contrasting object, but nevertheless these contrast fixations do
not appear to be associated with their informativeness in any way. Whether they
fixate the contrast object in both pre-articulatory regions or not at all, and regardless
of the length of their fixations, four-year-old children largely produce
under-informative referring expressions. However, this pattern changes by the time
children reach seven years of age, when rates of informativeness rise significantly in
our task (approaching adult levels for the simple displays), and contrast fixations and
referential informativeness become positively associated. Thus, we find that
four-year-olds omit critical linguistic information despite having inspected its visual
representation; a pattern in line with Bunger et al.’s (2012) findings on visual scene
inspection and the encoding of manner and path information. Our results also
accord with Rabagliati and Robertson’s findings that young children “fail to take
heed of any ambiguity in the world around them” (2017, p. 24). Children have a
latent ability to notice potential ambiguity, yet neglect to provide disambiguating
information for their addressee. The current study extends Rabagliati and
Robertson’s study by finding a developmental difference in the use of contrast
information during proactive monitoring, refining our third hypothesis to reveal a
developmental difference not in the incidence of contrast fixations, but in the use of
them in producing informative referring expressions.
Thus, in terms of behaviour during the early stages of reference production, the
critical skill for full informativeness is the integration of information from an initial
visual search. As shown by the second eye-movement analysis, seven-year-old
children are able to integrate information from a preview stage (i.e., even before the
identity of the target is known) to produce informative referring expressions.
Although this suggests they need a longer ‘run-up’ than adults (who find contrast
fixations just before the utterance as helpful for informativeness as fixating the contrast object in both
the preview and pre-utterance regions; Davies & Kreysa, 2017), perhaps due to
slower speed of processing or needing more time for speech planning, it highlights
older children’s ability to hold referential information in mind while attending to
visual information and planning their eventual referential form. However, this is
harder to achieve when displays are complex; in these cases, older children struggle
to encode the distinguishing information even when they have fixated the contrast
object. We suggest that the additional objects in the display impose extra processing
demands, which may cause children to revert to referring to target objects in
absolute rather than relative terms. The lack of ANY modifying adjective in these
trials – even incorrect or non-distinguishing ones – suggests that the extra visual
complexity may curtail the necessary linguistic complexity in spontaneous referring.
Interestingly, Whitehurst and Sonnenschein (1981) successfully elicited fully
informative expressions requiring comparisons of complex arrays from five-year-olds,
but only when the children were explicitly instructed to make such comparisons.
So what is it that prevents younger children from integrating visual information into
their expressions? One possibility is that these children are more likely to talk about an
element of a scene that has captured their attention. Recall that the target was
highlighted using a red square; a salient cue that may have overshadowed the rest of
the array even when the contrast object had been previously inspected. This
explanation is in line with Bunger et al.’s (2012, p. 147) suggestion that adults are
“able to suppress their excitement about particular event components in the interest
of providing fully informative event descriptions”. Here we can extend such an
explanation to children just three years older than those four-year-olds who could
not stop themselves describing the highlighted target on its own merits, rather than
relative to contrast objects, as required for felicitous referring. This susceptibility to a
‘see-it-say-it’ strategy may be caused by a tendency in younger children to use
adjectives descriptively rather than contrastively (though their low rate of
over-informative referring casts doubt on this as a sole explanation). More likely,
their narrow focus is related to immature executive function skills, e.g., inhibitory
control, which we turn to below. A more gradient, though complementary,
explanation is that children and adults differ in the AMOUNT of visual attention
required for eventual integration into informative utterances, as shown by our analysis
of fixation duration where both child groups spent almost twice as long as the adults
fixating the contrast object before producing an informative utterance. Interestingly, an
analysis of speech onset time between the child groups suggests that although
four-year-olds were slower (M = 1819 ms, SD = 607) to start producing their utterances
than the seven-year-olds were (M = 1520 ms, SD = 308; age coefficient = –333.9, SE =
98.9, t = –3.4), this didn’t enable them to match their older peers’ informativeness.
Follow-up work which increases the salience of the difference between target and
contrast, or that allows children more time to attend to it, would shed light on the role of
timing in informative reference.
Counter to our second hypothesis, we did not find a contributory role for receptive
vocabulary, narrative ability (both used as indices of language ability), or visual search
capabilities towards referential informativeness at either age-point. Note, however, that
there was limited within-group variance in the informativeness rates, which may have
contributed to the null results for the correlation analysis. We would welcome further
investigation of the role of linguistic and visual search skills in referential tasks designed
to elicit more variable rates of informativeness in older groups, e.g., referential
communication tasks that require two modifiers for unique disambiguation.
Additionally, the use of computational cognitive models that specify the relationships
between linguistic and cognitive processes would also be a productive means of
investigating the interplay of these factors, as well as the role of individual differences
(for example in ACT-R; Hendriks, 2016).
Although we didn’t measure our participants’ executive functioning skills, an
interesting future direction would be to assess whether executive functioning
moderates the relationship between contrast fixations and informativeness of the
referential phrase. That is, it may be the case that only those children with good
executive functioning are able to make use of the information gleaned from the
contrast object.6 Executive functioning is a set of cognitive skills which has been
frequently linked to performance in referential tasks, e.g., the ability to mentally
maintain or manipulate information (i.e., working memory), to withhold a dominant
response (inhibitory control), or to shift representations (i.e., cognitive flexibility)
(see De Cat, 2015, for a review). Studies by, e.g., Bacso and Nilsen (2017), Nilsen
and Graham (2009), and Nilsen, Varghese, Xu, and Fecica (2015) suggest that greater
working memory enables children to more effectively hold features of a target object
in mind and compare them with contrasting objects (see also Hendriks, 2016, for
supporting evidence from cognitive modelling). Similarly, previous research has
implied that stronger cognitive flexibility enables children to notice multiple
dimensions of an object (e.g., that a sock is both long and stripy) and to produce an
expression that captures the critical dimension(s) (Bacso & Nilsen, 2017). Inhibitory
control has also been found to relate to referential informativeness (Wardlow, 2013),
and although the current study does not have data to corroborate this, it is feasible
that the see-it-say-it strategy mentioned above might be minimised with better
inhibitory control as children get older. An age-related boost in executive function
skills might help children scan the critical objects, hold them in mind, suppress
prepotent responses, and then consistently encode relevant information to produce
felicitous expressions.
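To make the proposed moderation analysis concrete, the following is a minimal sketch of what it would mean for executive functioning to moderate the link between contrast fixations and informativeness. It assumes a logistic model with a fixation-by-executive-function interaction. The function names, the standardised executive function scores, and all coefficient values are hypothetical illustrations; they are not estimates from this or any other study.

```python
import math

def p_informative(fixated, ef, b0=-2.0, b_fix=1.0, b_ef=0.1, b_int=1.0):
    """Predicted probability of an informative expression.

    fixated: 1 if the contrast object was fixated before speaking, else 0.
    ef: hypothetical standardised executive function score.
    The coefficients are made-up illustrative values, not estimates.
    """
    logit = b0 + b_fix * fixated + b_ef * ef + b_int * fixated * ef
    return 1.0 / (1.0 + math.exp(-logit))

def fixation_benefit(ef):
    """Gain in predicted probability from fixating the contrast object."""
    return p_informative(1, ef) - p_informative(0, ef)

# Under moderation, the benefit of a contrast fixation depends on EF.
for ef in (-1.0, 0.0, 1.0):
    print(f"EF = {ef:+.1f}: benefit of contrast fixation = {fixation_benefit(ef):.2f}")
```

With these illustrative values, the benefit of a contrast fixation grows as the executive function score rises, which is the pattern a positive interaction term would express. A real analysis would estimate the coefficients from trial-level data, for instance with a mixed-effects logistic regression.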
Like many referential interactions, our task required use of a communicative partner’s
perspective. The interactive experimental set-up was designed to encourage participants
to describe the target object for the addressee rather than merely describing the scene
generally, e.g., the imperative sentence frame that the child was instructed to use
(“click on the X”), the presence of a live addressee, instructions emphasising that the
child’s job was to help the addressee, information about what the addressee could and
couldn’t see, and the addressee’s clear motivation to find the correct object in response
to the child’s instructions. Despite these aspects of the design, the children may not
have realised that the identity of the target object was unknown to the addressee
before they produced their referring expression. Indeed, the high frequency of
under-informativeness by the younger children in our sample accords with other work
finding that children overuse forms that imply accessibility of the referent to their
addressee (De Cat, 2015, p. 278). However, children may make these apparent
mis-estimates of accessibility, or fail to take their addressee’s perspective into account,
not for reasons of erroneous higher-level situation modelling, but due to problems in
integrating discourse information at a more basic level. That is, they may realise that
their partner needs a modified description, but are simply unable to maintain
activation of contrast information while planning their utterances. Consequently, they
fail to meet the pragmatic expectation and end up describing the target in absolute
terms. This may be exacerbated in situations where communicative demands are
higher, e.g., novel scenarios with less supportive contexts and more aspects to integrate
(Allen et al., 2015, p. 134). Experimental situations involve many of these demands;
testing between these artificial vs. more naturalistic contexts may reveal further
executive function-related explanations for children’s referential inadequacy.
6 We thank an anonymous reviewer for this suggestion.
One potential limitation of our study is that participants received no feedback other
than a mouse-click, regardless of the referential form they produced, to signal that a
referent had been found and that they could move on to the next item. This liberal
acceptance of any utterance they produced might have particularly encouraged the
resource-poorer younger speakers to use unmodified expressions over the course of
the experiment, because the addressee seemed to be satisfied with the given
descriptions. However, there was no difference in rates of unmodified expressions
between items in the first and in the second half of the experiment for either the
four-year-olds (t(26) = 0.47, p = .65) or the seven-year-olds (t(29) = –0.36, p = .72),
suggesting that lack of feedback was not a contributing factor in rates of
under-informativeness. Nevertheless, if we reframe informative reference as the
avoidance of MISUNDERSTANDING (Hendriks, 2017) instead of the avoidance of
ambiguity, children’s under-informative behaviour in this task starts to look more
rational than it initially appears. Further, since participants were always in the
speaker role, they did not receive effective models, or experience what it is like to
receive inadequate expressions. This is not just a methodological point. It has been
shown that children learn to avoid ambiguity from precise (caregiver) feedback
(Abbot-Smith et al., 2016; Bacso & Nilsen, 2017; Matthews et al., 2007, 2012;
Wardlow & Heyman, 2016), so even within the course of a single experiment that
includes feedback and/or modelling, increased rates of informativeness can emerge,
mediated by executive function skills. Such a paradigm could produce a rather
different picture with regard to the link between contrast fixations and
informativeness. However, despite the lack of incentive to be maximally informative
and the lack of effective modelling, the older children’s drive to be informative did
not appear to be compromised in our study (cf. Varghese & Nilsen, 2013).
Participants were instructed that their role was to help a real, physically co-present
addressee to find the objects, which may have compensated for the lack of feedback,
at least for the older children.
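The first-half versus second-half comparison reported above can be sketched as a paired-samples t-test on each child's rate of unmodified expressions. The reported degrees of freedom are consistent with a within-group paired comparison, but the exact procedure is an assumption here, and the function name and the data below are hypothetical.

```python
import math
import statistics

def paired_t(first_half, second_half):
    """Return (t, df) for a paired-samples t-test.

    Each list holds one value per child, e.g. the proportion of
    unmodified expressions in that half of the experiment.
    Raises ZeroDivisionError if all paired differences are identical.
    """
    if len(first_half) != len(second_half):
        raise ValueError("each child needs one rate per half")
    diffs = [a - b for a, b in zip(first_half, second_half)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the paired differences
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical per-child proportions of unmodified expressions.
first = [0.8, 0.9, 0.7, 1.0, 0.6]
second = [0.7, 0.9, 0.8, 0.9, 0.7]
t, df = paired_t(first, second)
print(f"t({df}) = {t:.2f}")
```

A t value close to zero, as in the results reported above, would indicate that rates of unmodified expressions did not drift between the two halves of the experiment.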
There is a trend in the results which calls into question the assumption that the
contrast object must be fixated for an informative expression to occur. As reported in
our second eye-movement analysis, 96% of the younger children’s and 63% of the
older children’s trials without a contrast fixation were under-informative. This means
that 4% of the younger and 37% of the older children’s trials were in fact informative
despite not having fixated the contrast object in either the preview or pre-utterance
temporal region. This suggests that, at least for the older children, it is possible to
produce an informative referring expression without having directly checked the
contrast before articulation. This pattern is even more pronounced for the adult
comparison group at 62% informativeness without a prior contrast fixation (discussed
in depth in Davies & Kreysa, 2017). This ability may be due to either (i) extrafoveal
processing of the contrast object or (ii) late fixations to it during articulation. Whilst
beyond the scope of the current paper, this line of reasoning points to a further
age-related difference in the use of contrast information, i.e., that contrast fixations
are helpful but not essential for full informativeness as speakers mature.
It has been repeatedly shown that young children are frequently under-informative
in their referential behaviour. At the same time, there is ample evidence that composite
skills for informative reference are in place from an early age. For example,
22-month-olds react to newness and communicate more about what is new (O’Neill
& Happé, 2000); two-year-olds adapt their communicative behaviour depending on
their assessment of the knowledge of others (O’Neill, 1996) and can be trained to
produce fully informative expressions (Matthews et al., 2007); and five-year-olds can
track what is accessible to their interlocutor (Nadig & Sedivy, 2002). The current
study has extended this list of prerequisite skills by showing that, by four years of
age, children are able to engage in comprehensive visual scanning. However, it may
take another three years for them to manage these skills in unison and alongside
fully-fledged linguistic output.
Acknowledgements. We thank the children, families, and staff of Oakwood Acorns and Holly House
nurseries, Ducklings childcare, Kerr Mackie Primary School, and Children’s Corner’s Chillout Club (all
of Leeds) for their participation. We gratefully acknowledge assistance from Clara Andrés-Roqueta for
creating the stimuli (originally used in Davies, Andrés-Roqueta, & Norbury, 2016), Tara Evans for data
collection, transcription, and preparation, Jessica Dealey for data transcription and preparation, and
Chris Norton for preparing the eye-tracking data. Many thanks to Pirita Pyykkönen-Klauck and Gerry
Altmann for guidance during the early stages of this study, Cécile De Cat for discussion and comments
on earlier drafts, and to two anonymous reviewers for their helpful comments. The study was funded by
a British Academy Quantitative Skills grant awarded to the first author (grant reference SQ120012).
References
Abbot-Smith, K., Nurmsoo, E., Croll, R., Ferguson, H., & Forrester, M. (2016). How children aged 2;6
tailor verbal expressions to interlocutor informational needs. Journal of Child Language, 43(6), 1277–91.
Allen, S., Hughes, M., & Skarabela, B. (2015). The role of cognitive accessibility in children’s referential
choice. In L. Serratrice & S. E. Allen (Eds.), The acquisition of reference (pp. 123–53) (Trends in
Language Acquisition Research, Vol. 15). Amsterdam: John Benjamins.
Audacity Team (2014). Audacity(R): free audio editor and recorder [Computer program]. Version 2.0.6,
retrieved 12 November 2014 from <http://audacity.sourceforge.net/>.
Bacso, S. A., & Nilsen, E. S. (2017). What’s that you’re saying? Children with better executive functioning
produce and repair communication more effectively. Journal of Cognition and Development, 18(4), 441–64.
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4.
Journal of Statistical Software, 67(1), 1–48.
Borovsky, A., Elman, J., & Fernald, A. (2012). Knowing a lot for one’s age: vocabulary skill and not age is
associated with the timecourse of incremental sentence interpretation in children and adults. Journal of
Experimental Child Psychology, 112(4), 417–36.
Brown-Schmidt, S., & Tanenhaus, M. K. (2006). Watching the eyes when talking about size: an investigation
of message formulation and utterance planning. Journal of Memory and Language, 54, 592–609.
Bunger, A., Trueswell, J., & Papafragou, A. (2012). The relation between event apprehension and
utterance formation in children: evidence from linguistic omissions. Cognition, 122, 135–49.
Christensen, D., Zubrick, S. R., Lawrence, D., Mitrou, F., & Taylor, C. L. (2014). Risk factors for low
receptive vocabulary abilities in the preschool and early school years in the longitudinal study of
Australian children. PLoS ONE, 9(7), e101476. doi:10.1371/journal.pone.0101476.
Davies, C., Andrés-Roqueta, C., & Norbury, C. F. (2016). Referring expressions and structural language
abilities in children with Specific Language Impairment: a pragmatic tolerance account. Journal of
Experimental Child Psychology, 144, 98–113.
Davies, C., & Katsos, N. (2010). Over-informative children: production/comprehension asymmetry or tolerance
to pragmatic violations? Lingua (Special Issue on Asymmetries in Child Language), 120(8), 1956–72.
Davies, C., & Kreysa, H. (2017). Looking at a contrast object before speaking boosts referential
informativeness, but is not essential. Acta Psychologica, 178, 87–99.
De Cat, C. (2015). The cognitive underpinnings of referential abilities. In L. Serratrice & S. E. Allen (Eds.),
The acquisition of reference (pp. 263–83). Amsterdam: John Benjamins.
Deutsch, W., & Pechmann, T. (1982). Social interaction and the development of definite descriptions.
Cognition, 11, 159–84.
Dickson, W. (1982). Two decades of referential communication research: a review and meta-analysis. In
C. J. Brainerd & M. Pressley (Eds.), Verbal processes in children (pp. 1–33). New York: Springer-Verlag.
Dunn, L. M., Dunn, D. M., Styles, B., & Sewell, J. (2009). The British Picture Vocabulary Scale, 3rd ed.
(BPVS-III). London: GL Assessment.
Girbau, D. (2001). Children’s referential communication failure: the ambiguity and abbreviation of
messages. Journal of Language and Social Psychology, 20(1/2), 81–9.
Glucksberg, S., Krauss, R. M., & Weisberg, R. (1966). Referential communication in nursery school
children: method and some preliminary findings. Journal of Experimental Child Psychology, 3, 333–42.
Graf, E., & Davies, C. (2014). The production and comprehension of referring expressions. In D. Matthews
(Ed.), Pragmatic development in first language acquisition: trends in language acquisition research (pp.
161–81). Amsterdam: John Benjamins.
Griffin, Z. M. (2004). Why look? Reasons for eye movements related to language production. In
J. Henderson & F. Ferreira, (Eds.), The integration of language, vision, and action: eye movements and
the visual world (pp. 213–47). New York: Taylor and Francis.
Griffin, Z. M., & Bock, K. (2000). What the eyes say about speaking. Psychological Science, 11, 274–9.
Hendriks, P. (2016). Cognitive modeling of individual variation in reference production and
comprehension. Frontiers in Psychology, 7, 506. doi: 10.3389/fpsyg.2016.00506.
Hendriks, P. (2017). Symposium discussion: processes underlying children’s reference production. 14th
International Congress for the Study of Child Language (IASCL), University Lyon 2, France, July 2017.
Krauss, R. M., & Glucksberg, S. (1969). The development of communication: competence as a function of
age. Child Development, 40, 255–66.
Matthews, D., Butcher, J., Lieven, E., & Tomasello, M. (2012). Two- and four-year-olds learn to adapt
referring expressions to context: effects of distractors and feedback on referential communication.
Topics in Cognitive Science, 4, 184–210.
Matthews, D., Lieven, E., & Tomasello, M. (2007). How toddlers and preschoolers learn to uniquely
identify referents for others: a training study. Child Development, 78(6), 1744–59.
Meyer, A. S., Sleiderink, A. M., & Levelt, W. J. M. (1998). Viewing and naming objects: eye movements
during noun phrase production. Cognition, 66(2), B25–B33.
Mueller, S. T. (2014). PEBL: the psychology experiment building language (Version 0.14) [Computer
experiment programming language]. Retrieved from <http://pebl.sourceforge.net> (last accessed June 2014).
Nadig, A. S., & Sedivy, J. C. (2002). Evidence of perspective-taking constraints in children’s on-line
reference resolution. Psychological Science, 13(4), 329–36.
Nicoladis, E. (2002). The cues that children use in acquiring adjectival phrases and compound nouns:
evidence from bilingual children. Brain and Language, 81, 635–48.
Nilsen, E. S., & Graham, S. (2009). The relations between children’s communicative perspective-taking
and executive functioning. Cognitive Psychology, 58, 220–49.
Nilsen, E. S., Varghese, A., Xu, Z., & Fecica, A. (2015). Children with stronger executive functioning and
fewer ADHD traits produce more effective referential statements. Cognitive Development, 36, 68–82.
Norbury, C. F. (2014). Sources of variation in developmental language disorders: evidence from
eye-tracking studies of sentence production. Philosophical Transactions of the Royal Society B:
Biological Sciences, 369(1634). doi:10.1098/rstb.2012.0393
O’Neill, D. K. (1996). Two-year-old children’s sensitivity to a parent’s knowledge state when making
requests. Child Development, 67, 659–77.
O’Neill, D. K., & Happé, F. (2000). Noticing and commenting on what’s new: differences and similarities
among 22-month-old typically developing children, children with Down syndrome, and children with
autism. Developmental Science, 3, 457–78.
Pechmann, T. (1989). Incremental speech production and referential overspecification. Linguistics, 27, 89–110.
R Core Team (2015). R: a language and environment for statistical computing. R Foundation for Statistical
Computing, Vienna. Retrieved from: <https://www.R-project.org>.
Rabagliati, H., & Robertson, A. (2017). How do children learn to avoid referential ambiguity? Insights
from eyetracking. Journal of Memory and Language, 94, 15–27.
Seymour, H. N., Roeper, T., & De Villiers, J. G. (2003). DELV-ST (Diagnostic Evaluation of Language
Variation) Screening Test. San Antonio TX: Psychological Corporation.
Vanlangendonck, F., Willems, R. M., Menenti, L., & Hagoort, P. (2016). An early influence of common
ground during speech planning. Language, Cognition and Neuroscience, 31(6), 741–50.
Varghese, A. L., & Nilsen, E. S. (2013). Incentives improve the clarity of school-age children’s referential
statements. Cognitive Development, 28, 364–73.
Wardlow, L. (2013). Individual differences in speakers’ perspective taking: the roles of executive control
and working memory. Psychonomic Bulletin & Review, 20(4), 766–72.
Wardlow, L., & Heyman, G. D. (2016). The roles of feedback and working memory in children’s reference
production. Journal of Experimental Child Psychology, 150, 180–93.
Wechsler, D. (2013). Wechsler Preschool and Primary Scale of Intelligence, 4th ed. (WPPSI-IV). London:
Pearson.
Whitehurst, G. J. (1976). Development of communication – changes with age and modeling. Child
Development, 47(2), 473–82.
Whitehurst, G. J., & Sonnenschein, S. (1981). The development of informative messages in referential
communication: knowing when vs. knowing how. In W. P. Dickson (Ed.), Children’s oral
communication skills (pp. 127–42). New York: Academic Press.
Cite this article: Davies C, Kreysa H (2018). Look before you speak: children’s integration of visual
information into informative referring expressions. Journal of Child Language 1–28.
https://doi.org/10.1017/S0305000918000120