Methods in Pragmatics

HoPs 10
Handbooks of Pragmatics

Editors
Wolfram Bublitz
Andreas H. Jucker
Klaus P. Schneider

Volume 10

De Gruyter Mouton
Methods in Pragmatics

Edited by
Andreas H. Jucker
Klaus P. Schneider
Wolfram Bublitz

De Gruyter Mouton
ISBN 978-3-11-043066-0
e-ISBN (PDF) 978-3-11-042492-8
e-ISBN (EPUB) 978-3-11-042752-3

Library of Congress Cataloging-in-Publication Data

Names: Jucker, Andreas H., editor. | Schneider, Klaus P., editor. | Bublitz,
Wolfram, editor.
Title: Methods in pragmatics / edited by Andreas H. Jucker, Klaus P.
Schneider, Wolfram Bublitz.
Description: Berlin ; Boston : De Gruyter Mouton, [2018] | Series: Handbooks
of pragmatics ; 10 | Includes index.
Identifiers: LCCN 2018009948 | ISBN 9783110430660 (hardback : acid-free paper)
Subjects: LCSH: Pragmatics--Handbooks, manuals, etc. | BISAC: LANGUAGE ARTS &
DISCIPLINES / Linguistics / General.
Classification: LCC P99.4.P72 M468 2018 | DDC 401/.45--dc23 LC record available at
https://lccn.loc.gov/2018009948

Bibliographic information published by the Deutsche Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2018 Walter de Gruyter GmbH, Berlin/Boston

Cover image: efetova / iStock / thinkstock
Typesetting: Dörlemann Satz GmbH & Co. KG, Lemförde
Printing and binding: CPI books GmbH, Leck
www.degruyter.com
Preface to the handbook series
Wolfram Bublitz, Andreas H. Jucker and Klaus P. Schneider

The series Handbooks of Pragmatics, which comprises thirteen self-contained
volumes, provides a comprehensive overview of the entire field of pragmatics.
It is meant to reflect the substantial and wide-ranging significance of pragmatics
as a genuinely multi- and transdisciplinary field for nearly all areas of language
description, and also to account for its remarkable and continuously rising popu-
larity in linguistics and adjoining disciplines.
All thirteen handbooks share the same wide understanding of pragmatics as the
scientific study of all aspects of linguistic behaviour. Its purview includes patterns
of linguistic actions, language functions, types of inferences, principles of commu-
nication, frames of knowledge, attitude and belief, as well as organisational prin-
ciples of text and discourse. Pragmatics deals with meaning-in-context, which for
analytical purposes can be viewed from different perspectives (that of the speaker,
the recipient, the analyst, etc.). It bridges the gap between the system side of lan-
guage and the use side, and relates both of them at the same time. Unlike syntax,
semantics, sociolinguistics and other linguistic disciplines, pragmatics is defined
by its point of view more than by its objects of investigation. The former precedes
(actually creates) the latter. Researchers in pragmatics work in all areas of lin-
guistics (and beyond), but from a distinctive perspective that makes their work
pragmatic and leads to new findings and to reinterpretations of old findings. The
focal point of pragmatics (from the Greek prãgma ‘act’) is linguistic action (and
inter-action): it is the hub around which all accounts in these handbooks revolve.
Despite its roots in philosophy, classical rhetorical tradition and stylistics, prag-
matics is a relatively recent discipline within linguistics. C. S. Peirce and C. Mor-
ris introduced pragmatics into semiotics early in the twentieth century. But it was
not until the late 1960s and early 1970s that linguists took note of the term and
began referring to performance phenomena and, subsequently, to ideas developed
and advanced by Wittgenstein, Ryle, Austin and other ordinary language philoso-
phers. Since the ensuing pragmatic turn, pragmatics has developed more rapidly
and diversely than any other linguistic discipline.
The series is characterised by two general objectives. Firstly, it sets out to
reflect the field by presenting in-depth articles covering the central and multifar-
ious theories and methodological approaches as well as core concepts and topics
characteristic of pragmatics as the analysis of language use in social contexts. All
articles are written specifically for this handbook series. They are both state of
the art reviews and critical evaluations of their topic in the light of recent
developments. Secondly, while we accept its extraordinary complexity and diversity
(which we consider a decided asset), we suggest a definite structure, which gives
coherence to the entire field of pragmatics and provides orientation to the user of
these handbooks. The series specifically pursues the following aims:

– it operates with a wide conception of pragmatics, dealing with approaches that
are traditional and contemporary, linguistic and philosophical, social and
cultural, text- and context-based, as well as diachronic and synchronic;
– it views pragmatics from both theoretical and applied perspectives;
– it reflects the state of the art in a comprehensive and coherent way, providing a
systematic overview of past, present and possible future developments;
– it describes theoretical paradigms, methodological accounts and a large
number and variety of topical areas comprehensively yet concisely;
– it is organised in a principled fashion reflecting our understanding of the
structure of the field, with entries appearing in conceptually related groups;
– it serves as a comprehensive, reliable, authoritative guide to the central issues
in pragmatics;
– it is internationally oriented, meeting the needs of the international pragmatic
community;
– it is interdisciplinary, including pragmatically relevant entries from adjacent
fields such as philosophy, anthropology and sociology, neuroscience and
psychology, semantics, grammar, discourse and media analysis as well as literary
studies;
– it provides reliable orientational overviews useful both to students and more
advanced scholars and teachers.

https://doi.org/10.1515/9783110424928-201

The thirteen volumes are arranged according to the following principles. The first
three volumes are dedicated to the foundations of pragmatics with a focus on micro
and macro units: Foundations must be at the beginning (volume 1), followed by
the core concepts in pragmatics, speech actions (micro level in volume 2) and
discourse (macro level in volume 3). The following six volumes provide cognitive
(volume 4), societal (volume 5) and interactional (volume 6) perspectives and
discuss variability from a cultural and contrastive (volume 7), a diachronic (vol-
ume 8) and a medial (volume 9) viewpoint. The remaining four volumes address
methodological (volume 10), sociomedial (volume 11), fictional (volume 12), and
developmental and clinical (volume 13) aspects of pragmatics:

1. Foundations of pragmatics
Wolfram Bublitz and Neal Norrick
2. Pragmatics of speech actions
Marina Sbisà and Ken Turner
3. Pragmatics of discourse
Klaus P. Schneider and Anne Barron
4. Cognitive pragmatics
Hans-Jörg Schmid
5. Pragmatics of society
Gisle Andersen and Karin Aijmer
6. Interpersonal pragmatics
Miriam A. Locher and Sage L. Graham
7. Pragmatics across languages and cultures
Anna Trosborg
8. Historical pragmatics
Andreas H. Jucker and Irma Taavitsainen
9. Pragmatics of computer-mediated communication
Susan Herring, Dieter Stein and Tuija Virtanen
10. Methods in pragmatics
Andreas H. Jucker, Klaus P. Schneider and Wolfram Bublitz
11. Pragmatics of social media
Christian R. Hoffmann and Wolfram Bublitz
12. Pragmatics of fiction
Miriam A. Locher and Andreas H. Jucker
13. Developmental and clinical pragmatics
Klaus P. Schneider and Elly Ifantidou
Preface

Pragmatics is no doubt an unusually large and diverse subfield of linguistics. Over
the last thirty or forty years it has grown from a small area for a few specialists to
one of the dominating approaches. There is an ever increasing number of dedicated
journals, textbooks and handbooks that testify to its importance and widespread
appeal. The series of handbooks in which this volume appears in itself comprises
13 volumes and a total of almost 9,000 pages of overviews of specific areas of
research within pragmatics. Each volume individually and the entire series as a
whole make a strong claim for the broad diversity of objects, theories and research
methods within the scope of pragmatics. And indeed, we strongly believe that this
diversity, which some might perhaps see as a lack of unity and coherence, is, in
fact, enriching and empowering. It is the opposite of a dogmatic adherence to one
single methodology, one single theoretical approach or one single type of data of
analysis. It is the aim of this volume to give an overview of the full breadth of
research methods in today’s pragmatics.
The handbook opens with three papers devoted to the basics of any pragmatic
investigation. They present general surveys of data types, methods and ethics of
data collection, and the different methods of transcribing spoken language. The
second part of the handbook comprises surveys of what we have decided to call
“introspectional pragmatics” (see the introduction to part 2 for a justification of
the term). Today’s pragmatic research relies mostly on empirical methods, but
important work is still being done within this research tradition, which goes back
to some of the early luminaries of the field, the philosophers of language John L.
Austin, John Searle and H. Paul Grice. The remaining three parts of the handbook
are devoted to empirical methods of pragmatic research. Part 3 comprises over-
views of experimentational methods in pragmatic research, such as discourse com-
pletion tasks, comprehension tasks and psycholinguistic production tasks. Part 4
on observational pragmatics looks at methods that focus on (usually relatively)
small sets of data, such as ethnomethodology, conversation analysis or discourse
analysis, while part 5 on corpus pragmatics looks at methods that rely on much
larger data sets and usually employ computer tools for pragmatic analysis.
As editors of the current volume and as general editors of the entire series of
handbooks it is our pleasure to thank Birgit Sievert and Barbara Karlson for their
enthusiasm and unfailing support for both this volume and the entire series. We
also thank Larssyn Staley for copy editing most of the current volume and Sophie
Decher for compiling the index of names, and above all we would like to thank our
contributors for their exemplary diligence, co-operation and patience.

Zurich, Bonn and Berlin, December 2017

https://doi.org/10.1515/9783110424928-202
Table of contents

Preface to the handbook series
Preface to Methods in Pragmatics

I. Introduction

1. Data in pragmatic research
Andreas H. Jucker
2. Methods and ethics of data collection
Klaus P. Schneider
3. The art of transcription: Systems and methodological issues
Roger J. Kreuz and Monica A. Riordan

II. Introspectional pragmatics

4. Introduction to part 2: Introspectional pragmatics
Wolfram Bublitz
5. Philosophical pragmatics
Marina Sbisà
6. Research methodology in classical and neo-Gricean pragmatics
Yan Huang
7. Cognitive pragmatics: Relevance-theoretic methodology
Billy Clark

III. Experimentational pragmatics

8. Introduction to part 3: Experimentational pragmatics
Klaus P. Schneider
9. Discourse completion tasks
Eva Ogiermann
10. Assessing the comprehension of pragmatic language: Sentence judgment tasks
Alma Veenstra and Napoleon Katsos
11. Psycholinguistic production tasks
Raymond W. Gibbs, Jr.
12. Role plays
J. César Félix-Brasdefer

IV. Observational pragmatics

13. Introduction to part 4: Observational pragmatics
Andreas H. Jucker
14. Ethnographic methods in pragmatics
Meredith Marra and Mariana Lazzaro-Salazar
15. Ethnomethodology and conversation analysis
Andrea Golato and Peter Golato
16. Discourse analysis
Anita Fetzer
17. Critical discourse analysis
Piotr Cap

V. Corpus pragmatics

18. Introduction to part 5: Corpus pragmatics
Andreas H. Jucker
19. Corpus construction
Gisle Andersen
20. Corpus annotation
Dawn Archer and Jonathan Culpeper
21. Historical corpus pragmatics
Irma Taavitsainen
22. Corpus pragmatics: From form to function
Karin Aijmer
23. Corpus-based function-to-form approaches
Anne O’Keeffe
24. Corpus-based metapragmatics
Michael Haugh

Bionotes
Name Index
Subject Index

I. Introduction
1. Data in pragmatic research
Andreas H. Jucker

Abstract: This introductory chapter gives a broad-brush overview of the various
types of data in the field of empirical research in pragmatics. It starts with a
discussion of the various types of analytical units in pragmatics, taking as its starting
point single utterances, which are contrasted to smaller units, such as deictic ele-
ments, stance markers, discourse markers, hedges and the like, as well as to larger
units, such as sequences of utterances and entire discourses. Data for pragmatic
research comes in different modalities. Spoken language and written language are
the most obvious modalities, but digital language with its own complexities, sign
language and non-verbal behaviour have recently become increasingly important
as data for pragmatic research. Moreover, research data can be categorised on
the basis of their location on four scalar dimensions. The first dimension
concerns the number of constraints on the interactants and the allowable
contributions. The second dimension scales the level of fictionality or factuality of the
language under observation. The third dimension assesses the amount of researcher
interference in the production of the data, and the fourth dimension, finally, situates
data according to the researcher's focus between the two poles of small amounts
of highly contextualized data and big data searches of largely decontextualized
phenomena.

1. Introduction

There is no research in pragmatics without data. Data – in one form or another –
form the essence of what pragmatic research is about. Research – at a very basic
level – consists in the search for generalizable patterns in the data. This is true
for large computer searchable corpora, it is true for transcriptions of multi-party
conversations and it is also true for thought experiments. Thus the researcher must
start by collecting data in order to answer a specific research question. The type
of data and the method of collecting the data are closely connected to the research
question that drives the analysis and to the theoretical framework within which the
analysis is carried out. A certain method of data collection will typically provide
a very specific type of data and lend itself to a specific way of analysing it, or –
viewed from the opposite direction – a certain research question will require a
specific set of data that needs to be collected and analysed with a specific method.
In general, we can distinguish four different aspects of research: 1) type of data,
2) method of data collection, 3) analysis of data and 4) theoretical framework.
This opening chapter will focus on the first of them, and the following chapter by
Schneider will be devoted to the second.

https://doi.org/10.1515/9783110424928-001
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 3–36. Berlin/Boston:
De Gruyter Mouton.

Often, researchers justify and defend a particular analytical and theoretical
framework while they take a certain type of data and a certain method of data col-
lection for granted. A certain type of data and a certain method of data collection
are regularly presented as the only viable option. Unbiased overviews of different
data collection methods and discussions of their inherent strengths and weaknesses
are relatively rare (but see, for instance, Kasper and Dahl 1991; Kasper 2000;
Jucker 2009a; Leech 2014: chapter 9). This volume starts from the premise that
there is no single best type of data and no single best method for collecting data for
pragmatic analyses. In fact, all four aspects of research mentioned above have to be
assessed in relation to specific research questions (Jucker 2009a; Golato 2017: 21).
In general, the data of any pragmatic research is language used in actual con-
texts, and language is ever pervasive. We interact with other people, we watch tel-
evision, attend lectures, read newspapers, consult user manuals, recite poems, surf
the Internet, interact via social media, look at advertising messages, listen to public
announcements on the train and so on and so forth. For each and every one of us
the mix of communicative situations that we encounter every day is different, but
every one of us is embedded in a flow of language. Even in our thoughts and in our
dreams language plays an important role. Potentially all these situations, all these
instances of language use could be the object of scholarly investigations. However,
pragmatics has a long history of preferring – explicitly or implicitly – some types
of data over other types, giving preference to unconstrained spoken interaction in
natural settings. Written language, on the other hand, has often been rejected as
unsuitable for pragmatic analyses because it is secondary (see section 3.1 below).
Fictional language, as for instance in novels or plays, has met even more resistance
because of its artificiality (see Jucker and Locher 2017). But even certain types
of spoken language are occasionally seen as less ideal or unsuitable for pragmatic
analyses, in particular language produced in highly constrained communicative
situations, such as courtroom or classroom interaction because of the clear assign-
ment of communicative roles and the constraints on the allowable contributions
for the different participants.
Such an approach to language data that distinguishes between more acceptable
and less acceptable types of data is based on an understanding of language as a
more or less coherent and homogeneous entity where variations are seen as devi-
ations. In such a framework, the linguist’s task is considered to be the description
of the common core of a language, and for this task only certain types of language
use, such as maximally unconstrained spoken interaction, qualify as legitimate
data. However, today many, perhaps most, pragmaticists adopt a very different
view of language. Language is inherently variable and heterogeneous, and linguists
are interested exactly in this variability. Every type of language has to be assessed
on its own merits, and every type of language, whether spoken or written, deserves
to be investigated. This shift in perspective from homogeneity to heterogeneity has
been identified as one of a number of paradigm shifts in linguistics (see Traugott
2008: 208; Jucker and Taavitsainen 2013: 6).
This handbook categorically adopts a perspective that focuses on the hetero-
geneity of language and on a diversity of research questions, data types, analytical
approaches and theoretical frameworks. Its contributions provide overviews of a
wide range of different methods of data collection and data analysis. In this first
chapter, however, I shall provide an overview of the different types of data that are
used by pragmaticists while the second chapter by Klaus P. Schneider provides an
overview of the different types of data collection methods.
The early philosophers of language and pragmaticists, Austin and Searle, relied
on their intuition in their seminal work on speech acts. Their data consisted of their
own intuition about the use of language. Like any competent native speaker, they
knew what it meant to make a promise, to ask a question or to give advice, and they
used this intuitive knowledge to dissect the relevant elements, what Austin (1962)
called the felicity conditions, of these speech acts. In the philosophical tradition,
intuition is an important source of data. According to Schneider,
the word ‘intuition’ designates an uninferred or immediate kind of knowledge or appre-
hension, as opposed to discursive knowledge, mediated by accepted methods of demon-
stration. (Schneider 1995: 606)

For philosophers, it is important to discuss the possible foundation of such intuitive
knowledge. “Introspection”, according to Schneider, is a special type of intuitive
knowledge. The objects of such knowledge are understood as being situated on
an inner stage of a person. From there they can be retrieved by “looking inside”
(Schneider 1995: 606). Feelings, emotions or the workings of our own native lan-
guage are examples of such intuitive knowledge.
In the field of pragmatics, it is useful to make a terminological distinction
between “intuition” or “intuitive knowledge” on the one hand and “introspection”
on the other. Intuition here refers to the knowledge that a researcher brings to the
task of investigating his or her native language, together with the ability to
fabricate test sentences that can be assessed on the basis of their grammaticality or
acceptability. The term “introspection”, on the other hand, has been used for a long
time in the fields of cognitive psychology and (applied) psycholinguistics to refer
to experimental methods, involving thinking-aloud protocols and other elicitation
techniques (see chapter 2 by Schneider). The papers in a volume edited by Færch
and Kasper (1987), for instance, use the term introspection for a range of methods
adopted from cognitive psychology, such as verbal reports by learners about their
thought processes (see also Clark, this volume).
The terminological distinction helps to differentiate between the work of the
language philosophers, who use their own intuitive knowledge about their native
language to theorize about the use of language, and the work of experimental
pragmaticists, who use a range of elicitation techniques to access the native speaker
introspection of the participants of their experiments. In a wider sense, all experi-
mental work can be seen as accessing the introspection of native speakers. In pro-
duction experiments, such as dialogue construction tasks or discourse completion
tasks the elicited data consist of the language use that the participants consider to
be typical or at least appropriate for a given situation. In comprehension and eval-
uation experiments, the introspective knowledge is accessed in a somewhat more
direct form.
Before I turn to the different types of naturally occurring data, I will provide an
outline of the different units of analysis in section 2. Section 3 will then be devoted
to the different modalities of naturally occurring data and the different ways of
conceptualizing these differences. It will cover not only the difference between
spoken and written language but also the status of online and digital data, and the
importance of sign language, i. e. the language systems used by deaf communi-
ties, and gestures as an additional layer of face-to-face communication. Section 4
deals with important dimensions or scales of observational data. It deals with con-
strained versus unconstrained language, and with the distinction between fictional
and factual data. It also addresses the question of researcher interference. And it
considers the difference between small snippets of data and huge corpora. This
last dimension does not really concern the type of data under investigation but the
research focus and whether the researcher attempts to discern communicative pat-
terns on a micro scale of a short extract of a conversation, for instance, or whether
the patterns are searched for across millions or even billions of words of running
text.

2. Units of analysis

Utterances are – in a sense – the most basic units of analysis in pragmatics. They
were the focus of the early language philosophers who asked how utterances can
be used to change the world. Words are used to build utterances, which are used
as speech acts to perform actions. Utterances are also the focus of researchers who
are interested in how conversationalists interpret what they hear. Grice (1975),
for instance, provides an account of how people systematically read between the
lines of the utterances they hear; and Sperber and Wilson (1995) develop a com-
prehensive theory of utterance interpretation. Utterances are also seen as the main
building blocks of larger structures, e. g. as turns-at-talk, where the focus is on the
micro context of utterances and on the question of how one utterance is shaped by
and helps to shape its immediate context. They are also seen as building blocks
in layered hierarchies of conversational interactions (e. g. Sinclair and Coulthard
1975; Schiffrin 1987). In some cases, the focus of the pragmatic analysis is on
units that are smaller than utterances, e. g. deictic elements, discourse markers,
stance markers and pragmatic noise. In other cases, it is on units that are larger than
utterances, e. g. on entire discourses or texts or even on discourse domains. In this
section, I would like to disentangle the different perspectives and give an overview
of the units of analysis in pragmatics (see also Jucker 2008).

2.1. Utterances
The pioneering work of the language philosophers and early pragmaticists, Austin
(1962) and Searle (1969), focused on what they called “speech acts”, i. e. on utter-
ances that are used to perform actions. Since this early work, the investigation of
speech acts has been one of the most important pillars of pragmatic research. The
early work relied on philosophical methods and the researcher’s intuition about
the nature of particular speech acts and how they are used to perform specific
actions. Later work employed experimental methods, such as discourse completion
tasks (e. g. Blum-Kulka, House and Kasper 1989), role plays and role enactments
(e. g. Trosborg 1995), and, more recently, also corpus-linguistic methods (e. g.
Deutschmann 2003). But in all cases the focus lies on single utterances and on how
these utterances are used to perform specific actions. In some cases, the focus is
extended to neighbouring speech acts. Compliments, for instance, regularly elicit
responses, and some research, therefore, focuses on both elements of the pair and
their sequential organisation (e. g. Golato 2005), but much of the research on com-
pliments and compliment responses nevertheless focuses exclusively on either one
or the other of the pair (see overview in Alfonzetti 2013).
Grice (1975) adopted a different perspective. He did not ask how utterances are
used to perform actions but rather how conversationalists interpret utterances. How
are listeners able to systematically read between the lines of what other people
say? Utterances regularly mean more than what they explicitly say; they implicate
additional meanings. Grice’s Cooperative Principle is an attempt to give a system-
atic account of how listeners figure out the implicatures of individual utterances.
Sperber and Wilson (1995), in their Relevance Theory, extended these questions to
utterance interpretation in general. Listeners use pragmatic reasoning not only to
recover implicatures from the utterances that they hear, but much more basically
to work out even the explicitly communicated meaning of utterances. Blakemore’s
(1992) introductory textbook is even entitled Understanding Utterances: An Intro-
duction to Pragmatics. Utterances, according to this theory, are underdetermined.
They are ambiguous and vague. Nevertheless in actual situations, conversation-
alists generally pick out the intended meaning. They disambiguate and enrich the
explicit content of utterances and come up with pragmatically meaningful inter-
pretations of these utterances.
Utterances are also the building blocks of larger units. On a micro level,
researchers have focused on the immediate context of utterances. It was the
ethnomethodologists Sacks, Schegloff and Jefferson (1974), in particular, who
initiated a large body of research on the minutiae of the turn-taking system. They were
interested in how one utterance – or turn-at-talk – is followed by another such unit
with a minimal gap and no or minimal overlap between the units. This strand of
research focuses on the transition between utterances and on the micro context in
which utterances occur.
Researchers in this theoretical framework also noted that certain types of utter-
ances tend to occur in pairs, so-called “adjacency pairs”. Questions are followed
by answers; greetings by greetings; invitations by acceptances or refusals and so
on. This kind of research focuses on the pairings of utterances and on preferred
or dispreferred combinations (see for instance Bilmes 1988; Schegloff 2007; Clift
2014). Dispreferred reactions, such as refusals or rejections, are generally clearly
marked, while preferred reactions, such as acceptances, are generally unmarked.
Thus, conversation analysis does not deal with utterance acts alone but with the
sequencing of such acts, their interaction and the principles of their ordering.
With a slight shift of focus, utterances can also be seen as the building blocks of
larger structures. Sinclair and Coulthard (1975), for instance, propose an analysis
of classroom interaction consisting of a layered hierarchy (see e. g. Edmondson
2014). In this system, single utterances, and sometimes even parts of utterances,
are the smallest units, the so-called acts. They combine to form moves, such as
“initiation”, “response” or “feedback”. Moves by different interlocutors combine
to form exchanges. The three moves initiation, response and feedback, for instance,
together form an exchange which is typical for classroom interaction. The teacher
asks a question or uses some other way to initiate a reaction by the pupils. One of
the pupils responds and the teacher gives feedback on the response. At a higher
level, several exchanges combine to constitute transactions, which typically start
with a preliminary exchange and – after a series of medial exchanges – end with a
terminal exchange. Several transactions together, finally, make up an entire lesson.
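
For readers who annotate classroom data computationally, the layered hierarchy described above can be sketched as a set of nested record types. The following Python fragment is only an illustration under stated assumptions: the class names, the field names and the sample exchange are invented here and are not part of Sinclair and Coulthard's own apparatus.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Act:
    """Smallest unit: a single utterance or part of one."""
    speaker: str
    text: str


@dataclass
class Move:
    """One or more acts; kind is 'initiation', 'response' or 'feedback'."""
    kind: str
    acts: List[Act] = field(default_factory=list)


@dataclass
class Exchange:
    """Moves by different interlocutors combine to form an exchange."""
    moves: List[Move] = field(default_factory=list)

    def pattern(self) -> str:
        # Sequence of move kinds, joined with hyphens
        return "-".join(move.kind for move in self.moves)


@dataclass
class Transaction:
    """Several exchanges (preliminary, medial, terminal)."""
    exchanges: List[Exchange] = field(default_factory=list)


@dataclass
class Lesson:
    """Several transactions together make up an entire lesson."""
    transactions: List[Transaction] = field(default_factory=list)

    def count_acts(self) -> int:
        # Walk the whole hierarchy down to the level of acts
        return sum(
            len(move.acts)
            for transaction in self.transactions
            for exchange in transaction.exchanges
            for move in exchange.moves
        )


# Invented sample data (hypothetical, not taken from any corpus)
exchange = Exchange(moves=[
    Move("initiation", [Act("teacher", "What is the capital of France?")]),
    Move("response", [Act("pupil", "Paris.")]),
    Move("feedback", [Act("teacher", "Yes, well done.")]),
])
lesson = Lesson(transactions=[Transaction(exchanges=[exchange])])

print(exchange.pattern())   # initiation-response-feedback
print(lesson.count_acts())  # 3
```

Nesting the units in this way mirrors the claim that each level is built from units of the level below; it makes no claim about how the boundaries between units are identified in real data.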

2.2. Micro units (smaller than utterances)

While utterances may be seen as the most basic units of analysis in pragmatics,
pragmaticists regularly focus on smaller elements as well. These elements have in
common that their description requires pragmatic explanations, i. e. explanations
that take into account the way in which these elements are used in actual situations
and how they link the utterance in which they occur to the communicative situa-
tion in which they are used. Typical examples are deictic elements, stance mark-
ers, discourse markers, hedges and pragmatic noise. Deictic elements include a
wide and diverse range of linguistic expressions which link the utterance in which
they occur to its larger context (Levinson 1983: chapter 2; Chapman 2011: 39–42;
Hanks 2011). Expressions like now, then, next Thursday or this evening connect
the utterance in which they occur to its temporal frame; expressions like here, on
this side, behind, come and go connect it to its spatial frame; and expressions like
but, therefore, however, in conclusion and anyway connect it to the discourse in
which it occurs, to mention just the most important categories.
Stance markers are linguistic elements by which speakers convey their evalua-
tions, personal attitudes and emotions as well as their level of commitment towards
propositions (for an overview see Biber et al. 1999: chapter 12; Keisanen and
Kärkkäinen 2014; Gray and Biber 2015; Landert 2017). They are a diverse group
and – depending on the specific perspective – have been known under a variety
of names, such as modality markers, subjectivity or intersubjectivity, hedges and
so on. Typical linguistic elements that convey stance are modal and semi-modal
verbs (e. g. might or have to), adverbials (adverbs, such as obviously or fortunately,
or prepositional phrases, such as in actual fact) and complement clauses (e. g. it’s
amazing that). But stance can also be expressed through evaluative word choice
and even with paralinguistic and non-linguistic means, including tone of voice,
loudness, body posture, facial expression and gestures.
Discourse markers, too, comprise a heterogeneous set of elements that have
received a lot of attention from pragmaticists with a range of different definitions
and different terms. Schiffrin (1987: 31) defines them as “sequentially dependent
elements which bracket units of talk”, while Fraser (1999: 931) defines them as
signalling “a relationship between the interpretation of the segment they introduce,
S2, and the prior segment, S1”. He further claims that “they have a core mean-
ing, which is procedural, not conceptual, and their more specific interpretation is
‘negotiated’ by the context, both linguistic and conceptual” (Fraser 1999: 931; see
Beeching 2016: chapter 1 for a discussion of different terms and definitions).
Pragmatic noise is a term that was introduced by Culpeper and Kytö (2010:
chapters 9 to 12). They use it to refer “to items such as AH, HA, HAH, O, OH, HO,
UM, HUM, as well as reduplicative forms like HA, HA or HA, HA, HA” (Culpeper
and Kytö 2010: 199). They acknowledge that the term overlaps with the category
of interjections, but it also includes laughter, pause-fillers and hesitation markers.
Culpeper and Kytö investigate a corpus of Early Modern English dialogues, which
means that they have to focus on the written representations of such elements in
their data of plays and court records. But such elements have recently received
more and more attention from researchers working on present-day materials (for
an overview of work on pauses and hesitations see, for instance, Stenström 2011).
Reber (2012) provides a detailed analysis of how speakers display affectivity in
social interaction with a range of elements that she calls “sound objects”, i. e.
interjections, such as oh, ooh and ah or paralinguistic signals, such as whistles and
clicks.
Thus pragmatic analyses regularly focus on linguistic elements that are smaller
than utterances and indeed on paralinguistic and non-linguistic elements. But there
is also a large body of work that focuses on entities larger than utterances.

2.3. Macro units (larger than utterances)

Utterances occur in contexts, and – as I have pointed out above – some research-
ers focus on the contextualisation of individual utterances into larger entities, be
that as pairs of utterances or as entire discourses that are made up of structured
sequences of utterances. Some pragmatic research, however, starts from a more
global perspective and focuses primarily on larger units, which are variously called
discourse or text. The term text is often restricted to written language while the
term discourse is used for spoken language, but both terms are notoriously incon-
sistent across different research traditions (see Fetzer 2014 for an overview of
different conceptualisations of the term discourse and Esser 2014 for an overview
of taxonomies of discourse types).
A particular strand of this research goes back to the 1970s and 1980s and was
originally labelled “textlinguistics”. It was an attempt to seek linguistic regularities
beyond the sentence boundaries, which manifested itself explicitly in book titles
such as A Text Grammar of English (Werlich 1982). Werlich develops a typology
of different types of text as well as an outline of the principles of text construction,
their function and the contexts of their occurrence. In a similar way, de Beaugrande
and Dressler (1981) investigate how texts are used in communication. Can we dis-
tinguish between acceptable and unacceptable texts in the same way that we can
distinguish between grammatical and ungrammatical sentences? Which elements
provide the cohesive ties that keep the sentences of a text together and render
the text coherent? This tradition was particularly strong among German-speaking
scholars (see for instance the numerous introductions to textlinguistics written in
German, e. g. Coseriu 1980; Sowinski 1983; Heinemann and Viehweger 1991; or
more recently Schubert 2008). Many scholars tried to apply textlinguistic ques-
tions about the structure and function of texts to specific genres or text types. Suter
(1993), for instance, focuses on wedding reports in local English newspapers. Wed-
ding reports are descriptions of local weddings that have taken place in the week
preceding the publication. The analysis focuses on the situational context in which
these articles appear, the text production process, their content, thematic structure
and their communicative function. Suter adds a diachronic dimension by contrast-
ing wedding reports published in the 1930s to reports published in the 1980s. Auf
dem Keller (2004) provides a similar analysis of textual structures in advertise-
ments for books and medical supplies in eighteenth-century English newspapers.
And Jacobs (2014) investigates press releases, i. e. texts from businesses, govern-
ment agencies or political parties issued to the media in the hope of wider publicity.
The term discourse can not only be used to refer to the macro unit that is larger
than the individual utterance, but also in a wider sense to refer to a discourse
domain. In this sense, it refers to the entire range of linguistic practices in a par-
ticular, socially defined domain, as, for instance, in the discourse of sports or the
discourse of science (Jucker 2008: 901; see also Henke 2005). Such domains are
large and overlapping. Historical pragmatics has a long tradition of investigating
such domains, in particular, the discourse of science and mass media discourse
(see, for instance, Jucker 2005; Claridge 2010; or the papers in Brownlees 2006
and Jucker 2009b).
In line with this chapter, this overview of units of analysis in pragmatics has
focused on data units, such as utterances, discourse markers or entire discourses.
Such a perspective does not cover the entire breadth of pragmatic research because
pragmatic research does not always take a particular linguistic unit as a starting
point. A good example would be the large area of politeness and impoliteness
research. Here, the starting point is not a particular linguistic unit and how speak-
ers use this unit in interaction, but rather particular types of interaction and the
effects that such interaction has on the participants. This kind of research looks into
the effects of communication and searches for elements that create these effects,
whether they are words, such as terms of address, specific mitigators or speech acts
(see e. g. Watts 2003). Cognitive approaches likewise do not take a linguistic unit
as a starting point. They ask about the interrelationship between language and cog-
nition (see Schmid 2012). Such approaches are interested in cognitive processes
and how they are reflected in linguistic structures. They do not set out to analyse
specific linguistic items, such as deictic elements, even if deictic elements may
play a prominent role in their argumentation (see for instance Levinson 2003).

3. Medium of transmission

According to a simplistic view of language, there is a straightforward distinction
between spoken language and written language. In one case we speak and listen,
in the other we write and read. However, the situation is considerably more complex,
in particular for research in pragmatics. In the case of communication via
electronic devices, such as computers, tablets and mobile phones, the complexity
increases even more. In addition to the spoken and the written mode, there is also
sign language, which uses hand shapes and movements to communicate, and when
we talk to each other, we also communicate with gestures, with our posture, with
facial expressions and so on. The current section gives an overview of the impor-
tant distinctions and introduces some of the models that have been developed to
conceptualise them.

3.1. Spoken versus written language

The relationship between spoken and written language can be and has been
described in many different ways. The written language, for instance, can be seen
as derivative and secondary. By and large, all living languages have a spoken form
but not all of them have a written form. Thus, the linguist’s main task – one might
argue – involves the description of the spoken language. However, the advances of
corpus linguistics have had the effect of shifting the primacy of description to the
written language because language that already exists in written form is much more
readily available for corpus compilation. Early corpora, such as the Brown Corpus
or the LOB Corpus, consisted entirely of written language, and even later corpora,
such as the BNC or COCA, only contain relatively small samples of transcribed
spoken language. Biber et al.’s (1999) standard grammar of the English language
treats the spoken language of conversation as a register next to fiction, news and
academic texts. Biber (1988) investigated the variation between speech and writing
in a systematic way. He contrasted large sections of the London-Lund corpus of
spoken English and the LOB corpus of written English on the basis of features with
specific discourse functions which he clustered into textual dimensions in order to
evaluate specific texts according to their informational density, or their affective
and interactional content.
It is probably fair to say that for a long time pragmaticists – in contrast to
corpus linguists – ignored written language because of its secondary nature. How-
ever, there were also early attempts to think more carefully about the relationship
between spoken and written language from a communicative or pragmatic point of
view. Koch and Oesterreicher (1985, 2007; see also Koch 1999; Jucker and Taavit-
sainen 2013: 21–22), for instance, developed a model to clarify and visualise the
distinction. They take the mode of transmission to be a dichotomy between phonic
and graphic. Language is transmitted either in the phonic code or in the graphic
code. In addition to this dichotomy, there is a scale between the opposite poles of
communicative immediacy and communicative distance. The two codes are not
restricted to one end of this scale but they have preferences. The graphic code has
a preference for situations and genres of communicative distance while the phonic
code has a preference for situations and genres of communicative immediacy. This
is schematically illustrated in Figure 1.

Figure 1: Koch and Oesterreicher’s model of communicative immediacy and distance
(Koch 1999: 400)

In this model, communicative immediacy is characterized by the parameters in
the following list (Koch 1999: 400):
(a) physical (spatial, temporal) immediacy
(b) privacy
(c) familiarity of the partners
(d) high emotionality
(e) context embeddedness
(f) deictic immediacy (ego-hic-nunc, immediate situation)
(g) dialogue
(h) communicative cooperation of the partners
(i) free topic development
(j) spontaneity
Communicative distance, on the other hand, is characterized by the opposite val-
ues of these parameters, i. e. physical distance, lack of familiarity, low emotion-
ality and so on. The four letters in the two triangles represent more typical or
less typical situations. The letter A stands for communicative exchanges in the
phonic code that are characterized by communicative immediacy, that is to say
typically a face-to-face interaction between conversationalists who know each
other well. The situation is informal, private, not public, and spontaneous. Topics
can be freely chosen and changed and so on. But there are also communicative
exchanges in the phonic code that are characterized by communicative distance;
area B in the lower triangle. This applies to monologues, such as lectures, in for-
mal, public situations with conversational partners who do not know each other
well, and in situations that define specific topics and topic developments. Commu-
nicative immediacy is more typical for the phonic code. This is represented by the
larger area A. Communicative distance is less typical, represented by the smaller
area B. Most communicative situations in the phonic code are situated somewhere
along the scale. Telephone conversations lack only a few of the communicative
immediacy features of face-to-face conversations, while job interviews already
show many of the communicative distance features of very formal, monologic
situations.
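
The two poles of the phonic code described above can be illustrated with a small sketch. It treats each of the ten parameters as a simple yes/no value and reports the share of parameters that sit at the immediacy pole. This numeric share is an assumption made purely for illustration: Koch and Oesterreicher describe a continuum and do not propose such a score, and the parameter names, the helper function and the two sample profiles are likewise hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

# The ten parameters (a) to (j) from the list above, as hypothetical field names
PARAMETERS = (
    "physical_immediacy", "privacy", "familiarity", "high_emotionality",
    "context_embeddedness", "deictic_immediacy", "dialogue",
    "cooperation", "free_topic_development", "spontaneity",
)


@dataclass
class Situation:
    label: str
    code: str                # "phonic" or "graphic"
    values: Dict[str, bool]  # True = value at the immediacy pole

    def immediacy_share(self) -> float:
        # Illustrative assumption: every parameter counts equally
        return sum(self.values[p] for p in PARAMETERS) / len(PARAMETERS)


def make_situation(label: str, code: str, lacking: Iterable[str] = ()) -> Situation:
    """Build a profile in which only the listed parameters are at the distance pole."""
    lacking = set(lacking)
    return Situation(label, code, {p: p not in lacking for p in PARAMETERS})


# Idealised poles, simplified for illustration
chat = make_situation("face-to-face talk between friends", "phonic")
lecture = make_situation("formal public lecture", "phonic", lacking=PARAMETERS)

print(chat.immediacy_share())     # 1.0
print(lecture.immediacy_share())  # 0.0
```

Intermediate situations, such as telephone conversations or job interviews, would receive values between these two extremes, but where exactly they fall depends on coding decisions that the model itself leaves open.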
The letters C and D stand for situations of communication in the graphic code.
C represents the less typical situation of graphically communicated messages in
situations of communicative immediacy. At the time when Koch and Oesterreicher
developed their model this referred mainly to printed interviews, private letters,
and entries in a personal diary. D represents the more typical situation of graphi-
cally communicated messages in situations of communicative distance. Legal texts,
academic writing or articles in high-brow newspapers would be typical examples.
Here, too, there are a lot of situations that are located between the two extremes.
Today, with a wealth of typed messages transmitted electronically, the
situation has changed considerably. Communication via hand-held devices, via
social media and so on provides an entirely new situation for area C of communicative
immediacy in the graphic code.
For researchers in historical pragmatics the relationship between the spoken
and the written language is particularly important. In the early days of historical
pragmatics, researchers often felt obliged to apologize for the use of written data
in pragmatically driven investigations. In the absence of genuinely spoken data,
they searched for instances of written language that were as close as possible to
spoken language, such as dialogues in plays or transcripts of courtroom interac-
tions. Rissanen (1986: 98), for instance, argued that “texts which record speech
for some reason or another, are closer to spoken language than texts which are not
based on actual speech”. In fictional writing, the situation is even more complex.
Authors regularly include oral features into their writings to give the dialogues of
their characters an air of authenticity even if the features do not directly correspond
to features attested in actual spoken discourse. Scripted and performed interactions
between actors in plays also differ from normal everyday interactions in systematic
ways (see Bublitz 2017).
In an early paper on historical pragmatics, Jucker (1998) sketched the various
ways in which written language can be related to spoken language. Even genuinely
written data can be classified into instances that tend to be monologic because there
is normally little opportunity for the readers to interact with the writer and dialogic
instances where such interaction is possible and expected.

Figure 2: Data in historical pragmatics: the “communicative view”
(Jucker 1998: 5, see also Jucker and Taavitsainen 2013: 23)

Written representations of spoken language can be separated into three different
types. Reports, protocols and diaries regularly report actual spoken interactions,
while conversation manuals and language textbooks record (invented) sample conversations
that are meant to be used by the readers on future occasions. A third
type is made up by fictional texts that record fictional conversations, for instance
in play texts or in narrative literature but also, historically, in academic texts that
were often written as fictional conversations between a master and a student (see
also Culpeper and Kytö 2000 for a similar model and Kytö 2010 for an overview).
Landert and Jucker (2011) build on Koch and Oesterreicher in order to develop
a model that adds two more dimensions to the dichotomy of phonic versus graphic
and the scale of linguistic immediacy: the scale of accessibility and the scale of
privacy. The distinction between phonic and graphic is not visually represented
in their model. They argue that their model applies both to messages transmitted
in the phonic code and to messages in the graphic code. In Figure 3, they provide
prototypical examples from the sphere of graphically transmitted messages.

Figure 3: Enriched communicative model (Landert and Jucker 2011: 1427)

The scale of accessibility is defined by the ease of access to a particular message
by others. In non-public situations, only very few people have access to a message.
A typical message would be a short text message transmitted via a mobile phone
intended for one single addressee. Such a message typically – but not necessarily –
deals with private topics, which Landert and Jucker (2011: 1427) define as topics
that “affect single individuals or very small groups of people”, while non-private
topics are topics “that lack this concentration on a private individual or a very
small group”. With this terminological move they disentangle the privacy of topics
from the accessibility of messages. This makes it possible, for instance, to describe
more accurately what may be seen as a tendency in some sectors of today’s mass
media to make the private lives of celebrities public. The topics and issues remain
private, according to this terminology, even if they are made public, i. e. publicly
accessible. Scientific articles are prototypical examples of messages in the graphic
code which deal with issues that are not restricted to a small group of individuals
and which are made publicly accessible for a larger range of people.

3.2. Online/digital data

The communicative affordances of computer technology that have been devel-
oping over the last few decades have added new dimensions to the distinction
between spoken and written language. To some extent there is still the dichotomy
of the phonic code and the graphic code. We use computers and handheld devices
to communicate with our voice (e. g. Internet telephony), and we use the same
devices for all sorts of communication in the graphic code. However, the new tech-
nology has added an additional layer of affordances and has, therefore, opened up
a large range of new research opportunities.
A clear terminology has not yet established itself for this type of communica-
tion. The most widespread term is probably “computer-mediated communication”.
It was already well-established in the 1990s and popularized by Herring (1996).
There is a journal which uses this designation, the Journal of Computer-Mediated
Communication, and a dedicated handbook (Herring, Stein and Virtanen 2013),
entitled Pragmatics of Computer-Mediated Communication. But there are a host of
other terms, such as “electronically mediated communication” or “electronic dis-
course”, “digitally mediated communication” or “digital communication”, “Inter-
net-mediated communication”, and “keyboard-to-screen communication” (see
Crystal 2011: 1–3; Jucker and Dürscheid 2012: 35–37; or Locher 2014: 555–557
for a discussion of terminology). The different terms focus on different aspects of
this special type of communication and they are not always entirely co-extensive in
what they include or exclude. Herring’s (2007: 1) definition of computer-mediated
communication as “predominantly text-based human-human interaction mediated
by networked computers or mobile telephone” explicitly includes communication
via mobile phones, which raises the question of whether mobile phones can be seen
as computers. At the time when Herring proposed this definition, this was perhaps
less clear than it is today. Terms such as “electronic discourse” (Locher 2014),
“digital communication” (Tagg 2015) or “keyboard-to-screen communication”
(Jucker and Dürscheid 2012) avoid the issue of classifying the electronic devices
used to send and receive messages as computers or not and focus on the way in
which the signals are transmitted or how they are encoded and received.
There are several important features that distinguish digital data from spoken
and from written data. Spoken communication typically takes place in a situation
of synchronicity. The interactants are co-present, if not spatially then at least tem-
porally (e. g. on the telephone). Messages are encoded and decoded at the same
time. Written communication, on the other hand, typically takes place in an asyn-
chronous situation. Messages are normally decoded only some time, perhaps even
a very long time, after having been encoded. Computer-mediated communication
uses the graphic code but it can be more or less synchronous. Jucker and Dürscheid
(2012: 39) argue that the term “quasi-synchronous” is more appropriate for this
type of communication. It covers all cases in which interactants exchange mes-
sages in quick succession, e. g. turn-by-turn in Facebook chat, or message-by-mes-
sage in WhatsApp conversations. As such the term “quasi-synchronous” has fuzzy
boundaries and coincides more or less with the term “synchronous” in cases where
messages are transmitted not turn-by-turn, but stroke-by-stroke. And it coincides
more or less with the term “asynchronous” in the case of, for instance, email mes-
sages that are exchanged in relatively quick succession.
Two further distinctions that are blurred in many forms of digital data are the
opposition between monologic and dialogic and the opposition between discourse
or text and utterance. Written communication tends to come in the form of
monologic texts, while spoken communication most frequently comes in the form
of dialogic utterances. For digital data, such a distinction is much less useful.
For chat contributions, to take one specific example, neither the term “text” nor the term
“utterance” seems to fit. They are realized in the graphic code, and thus may resemble a
text. But they are also spontaneous, unplanned, context embedded (e. g. “What are you
doing now?”), short and situated in a dialogic (more precisely: in a quasi-synchronous)
context, and thus are more like prototypical utterances. (Jucker and Dürscheid 2012: 40)

As an alternative, Jucker and Dürscheid (2012: 42–44) propose the term “commu-
nicative act”. Communicative acts can have a high expectation of being taken up
and responded to by an interactant (in which case they are more utterance-like) or –
at the other end of the scale – a small expectation of being taken up and responded
to (in which case they are more text-like). Examples are chat contributions, which
have a high expectation of uptake even if some contributions occasionally go unan-
swered, and user manuals, which have a very low expectation of uptake even if
some frustrated user might occasionally try to get in contact with the author of the
manual to complain about faulty or inscrutable instructions.
Digital data are further differentiated from traditional written data by their fluidity.
Written texts, and in particular printed texts, are characterized by a high level
of fixity. Once a text has been printed, it cannot easily be changed. Handwritten
corrections within a printed text are easily recognizable as such. New printings of
books are, of course, possible and common but each printing stays basically unal-
terable and fixed in its original form. This is not true for digital data. Texts that
are stored digitally can easily be modified. Online news media, for instance, can
update their texts on a minute-by-minute basis. This is why it has become standard
to add a time stamp to quotations of electronic texts. There is no guarantee that the
text is still the same when it is checked some time later.
Finally, digital data are characterized by a vastly increased multimodality.
Computer-mediated communication regularly combines language, images, memes,
sounds and music. Still pictures and video clips have become very important in
many forms of computer-mediated communication, especially on social-network
sites or instant messaging applications, such as Facebook, Instagram, WhatsApp
or Snapchat (see boyd 2014; Hoffmann and Bublitz 2017).

3.3. Sign language data

The term sign language is here used to refer to a class of languages used by deaf
communities, such as German Sign Language or American Sign Language. They
are as complex in their structural features as spoken languages, and, of course,
they are not to be confused with the improvised gestures used by tourists in attempts
to communicate with locals with whom they do not share a common language.
Contrary to popular opinion, sign language is conveyed not only through the hands but
also through body language and facial expressions (Sutton-Spence and Woll 1999:
81; Quinn 2017: 55).
Signs are, of course, a subset of human gestures, just as words are a subset of human
vocalizations. Signs are distinguished from gestures by having an internal structure
composed of elements which form a system of contrasts, and whose usage is rule-gov-
erned. (Woll and Kyle 1998: 855)

Like spoken language, sign language is ephemeral. If it is not recorded, it vanishes
without a trace. Both spoken language and sign language are encoded and decoded
at the same time, i. e. with synchronous production and reception. While the modal-
ity of spoken language is auditory, the modality of sign language is visual-spatial
(Quinn 2017: 55). Relatively little is still known about the history of sign language
in general and of specific sign languages. Recordings have only become available
during the twentieth century. There are older accounts of deaf people who used
signs to communicate (going back to Plato), but records or detailed descriptions of
the signs that were used are missing (Woll and Kyle 1998: 855). One problem for
the investigation of sign languages is that there is no generally accepted notation
system. Moreover, photographs and drawings can only reproduce still pictures, and
superimposed arrows can only provide a very limited rendering of the dynamics of
signing and the way in which hand signs are accompanied and supported by body
language and facial expressions (see, for instance, Sutton-Spence and Woll 1999:
xi–xxi).
Pragmatic research on sign languages covers a wide spectrum. Groeber and
Pochon-Berger (2014) as well as Cibulka (2016) deal with the peculiarities of
turn-taking in signed conversations in Swiss German Sign Language and in Swedish
Sign Language respectively. They focus on different types of holds, that is to say the
freezing of a sign in turn-final position. The movement of the hand is momentarily
suspended while hand shape and hand position are maintained. They show how
holds perform important functions in the taking of turns and in the projectability
of the next turn. Roush (2011), on the other hand, investigates issues of polite-
ness and impoliteness in American Sign Language. In contrast to Groeber and
Pochon-Berger (2014) and Cibulka (2016), who used a corpus of video recordings
of signed interactions, he used an ethnographic approach by observing native sign-
ers in public gatherings of deaf communities and taking copious field notes (Roush
2011: 338). He focused in particular on metadiscursive terms and markers which
were used to evaluate or describe the ongoing interaction. Mapson (2015) used
data collected through semi-structured group discussions in order to analyse the
ways in which professional interpreters developed their awareness of politeness in
British Sign Language.
Kearsy, Smith and Zwets (2013) analysed the framing of constructed actions
in British Sign Language narratives, and they used elicitation techniques in order
to collect their data. 15 participants with British Sign Language as their preferred
language were shown four short film clips and asked to retell the narratives to
another deaf native signer of British Sign Language (one of the authors of the
article) (Kearsy, Smith and Zwets 2013: 125).

3.4. Data of nonverbal behaviour

The importance of gestures and other forms of nonverbal behaviour in communi-
cation cannot be overestimated. As Kendon (2014) points out:
Willingly or not, humans, when in co-presence, continuously inform one another about
their intentions, interests, feelings and ideas by means of visible bodily action. For ex-
ample, it is through the orientation of the body and, especially, through the orientation
of the eyes, that information is provided about the direction and nature of a person’s
attention. (Kendon 2014: 1)

This opens up a vast range of research opportunities for pragmaticists, but there
are various ways in which the scope of research can be focused on a subset of the
visible bodily actions. The quotation above restricts the focus to those visible bod-
ily actions that have an informative effect on a co-present human being, whether
the effect was intended or not. The scope can be further reduced by restricting it
to bodily actions that come with a communicative intention by the producer, that
is to say actions that are meant to communicate. But this is a very fuzzy distinc-
tion and difficult to apply systematically. A more systematic restriction focuses on
gestures that are used as part of an utterance, as for instance the use of hands in
pointing to an object, in indicating the size or shape of an object or in emphasising
what is being said. Cienki (2017) draws the line in a similar way. He focuses on
“movement of the hands and forearms by speakers when the movement is not part
of an instrumental action (such as holding a pen and writing) and does not involve
touching oneself or another (as in scratching one’s head or patting someone on the
back)” (2017: 61).
We colour and flavour our speech with a variety of natural vocal, facial and bodily
gestures, which indicate our internal state by conveying attitudes to the propositions
we express or information about our emotions or feelings. Though we may be aware of
them, such behaviours are often beyond our conscious control: they are involuntary or
spontaneous. (Wharton 2009: 1)

Research on gestures and nonverbal behaviour shares some of the problems of
research on sign languages. There is no sufficiently established way of capturing
the dynamic, spatio-temporal nature of gestures and other bodily actions in sufficient
detail, and the problems are exacerbated for gestures because of the fuzzy
nature of bodily actions that are relevant for communication (see Kendon 2014:
Appendix 1 for a set of transcription conventions for gestural actions; see also
Streeck 2009).

4. Observational data: Four dimensions

In the previous section, I focused on the different modalities of language and their
relevance for pragmatic research. In this section, the focus shifts to four scalar
dimensions that characterize observational data. The first dimension is the situ-
ational dimension, which distinguishes between speech contexts that are highly
constrained in terms of what participants are expected – or indeed allowed – to say
at specific points in the interaction and speech contexts that impose few – if any
– such constraints on the contributions. The second, the fictionality dimension, distinguishes
between fictional texts on the one hand and factual texts on the other. The third
dimension concerns the level of researcher interference, which
ranges from data that came into existence without any researcher intervention to
data that were purposefully elicited by a researcher. The fourth dimension, finally,
concerns researcher perspectives, which range from a focus on very small snippets
of data to a focus on a new generation of mega corpora. The first two
dimensions are concerned with the nature of the data itself while the latter two
are concerned with the researchers and their influence or perspective on the data.
All these dimensions are often invoked – explicitly or implicitly – in discus-
sions about the suitability of certain types of data for specific research questions or
even for pragmatic research in general. Here, they are not presented in an evalua-
tive sense. There is no claim that one end of a particular scale is, in general, better
than the other end, even though it may turn out to be better suited to specific types
of research questions.

4.1. Situational dimension: Constrained versus unconstrained

Levinson (1979) defined the notion of “activity type” in terms of the allowable
contributions and the constraints it imposes on participants, setting and so on:
In particular I take the notion of an activity type to refer to a fuzzy category whose focal
members are goal-defined, socially constituted, bounded, events with constraints on
participants, setting, and so on, but above all on the kinds of allowable contributions.
Paradigm examples would be teaching, a job interview, a jural interrogation, a football
game, a task in a workshop, a dinner party and so on. (Levinson 1979: 368)

However, it seems clear that not all the activity types that he gives as examples are
subjected to the same level of constraints. They can conveniently, but admittedly
somewhat impressionistically, be situated on a scale from highly constrained situ-
ations to situations with relatively few constraints. At one end of the scale, we find
speech situations that assign clear roles to the different participants and impose a
large amount of restriction on the allowable contributions. Teaching, job interviews
and jural interrogations are obvious examples. In each case the participants are
assigned roles that come with very specific expectations as to the contributions that
they are to make in this situation. Who asks questions? Who answers them? Who
introduces new topics? And so on. At the other end of the scale we find speech
situations in which there are no discernible role differences assigned by the situ-
ation. The dinner party mentioned by Levinson may be situated close to this end
even though there are, of course, differences between the rights and obligations of
the host or hostess and the guests. Other obvious examples might be a chat among
friends on a long car drive, the locker room exchanges among the members of a
sports team before or after a match, or the interactions of a group of children on
the playground. In all these situations, there are also expectations as to what are
appropriate or inappropriate contributions to the interaction, and some participants
play a more important role while others play only subordinate roles. But the roles
the individuals adopt are the result of the constellation of participants. They are not
imposed by the speech situation in the way that an interview assigns differential
roles to the interviewer and the interviewee.
Between the extreme cases there are interesting intermediate cases, such as a
football game and a task in a workshop. A game of football assigns specific speaking
rights to the referee, the coach and the team captain and imposes sanctionable
restrictions on the allowable contributions by all the participants. But in contrast to
interviews, spoken contributions are of subordinate importance, and there are few
restrictions on the exchanges between the players themselves. A task in a workshop
might also impose some restrictions on the allowable contributions, depending on
the complexity of the task and the roles of the participants (e. g. supervisor and
apprentice, etc.). An additional example would be a chat during a coffee break at
a place of work. The situation itself may impose relatively few constraints but the
larger situation of the workplace with its differences in hierarchy may impose its
own constraints on who initiates new topics and who breaks up the coffee break
to go back to work.
The situational dimension is occasionally invoked in an evaluation of data in
that unconstrained data is considered to be more genuine and, therefore, more
likely to reveal the intricacies of conversational interaction without the interfer-
ence of constraints imposed by the speech situation. However, the suitability of
relatively constrained or relatively unconstrained data depends very much on the
research question at hand. Speech situations, or activity types, cannot be placed
on this scale with a high level of precision, but the scale itself helps to create an
awareness of the varying importance of such constraints for specific situations.

4.2. Fictionality dimension: Fictional versus factual

Fictional language comes in many different guises. Obvious cases of fictional lan-
guage are novels or short stories and other narratives that are the product of the
imagination of an author without any claims to depict actual people and actual
facts. It also includes theatre plays and telecinematic discourse, in which a script-
writer invents dialogues that are performed by actors. But there is no clear-cut dis-
tinction between fictional data and non-fictional or factual data. Historical novels,
for instance, may include depictions of historical figures next to invented figures
within events that are partly historically attested and partly invented by the author.
Television documentaries may include staged conversations performed by actors,
and reality television may include a mixture of scripted and improvised conversa-
tions (see Jucker and Locher 2017: 5). Everyday conversations may include anec-
dotes, jokes and even personal narratives that consist of a mixture of factual and
fictitious characters and events.
It is useful to draw a careful terminological distinction between the terms
“fictional” and “fictitious” (see Klauk and Köppe 2014: 5–6; Jucker and Locher
2017: 6). The former refers to utterances, texts, pictures, movies, comics and so on,
while the latter refers to characters, entities and events that have no correspondence
outside of the text and do not exist in the real world. Fictional texts, then, deal with
fictitious characters, entities and events. Factual texts, on the other hand, deal with
characters, entities and events that have an existence in the real world, and in this
sense texts can be factual even if they assert falsehoods about these characters,
entities and events.
For a long time, pragmatics was not interested in fictional data. Such data were
considered to be artificial, contrived and not sufficiently “real”, and, therefore, not
suitable for pragmatic analyses. Whenever pragmaticists, for instance in the area
of historical pragmatics, resorted to fictional data, they felt the need to apologize
for doing so (see for instance Brown and Gilman 1989: 159 or Salmon 1987: 265).
They pointed out that in the absence of any “real” conversational data, fictional
data seemed to be a reasonably good approximation especially in the case of a
skilful dramatist, such as William Shakespeare. Today, fictional data are seen as
sufficiently interesting in themselves. They no longer serve as a substitute for
“real” data but are analysed on their own terms. Many of Shakespeare’s characters
talk in iambic pentameters. It is safe to assume that at the turn from the sixteenth
to the seventeenth century – or indeed at any other time – nobody used
iambic pentameters in their everyday interactions. Shakespeare’s dialogues do
not represent real-life conversations but that does not make it less interesting to
investigate the ways in which Shakespeare chose to depict his characters, how his
characters interact, how they address each other, how they insult each other, how
they are polite or impolite to each other and so on and so forth (see the collection
of overviews of pragmatic approaches to fictional data in Locher and Jucker 2017).

4.3. Researcher interference dimension: Low versus high

The researcher interference scale relates to the amount of interference the researcher
exerts on the production of language data. At one end of the scale there are lan-
guage data that were produced entirely without the interference of a researcher. At
the other end there are language data that were carefully elicited by a researcher in
a highly controlled context. Figure 4 provides relevant examples along the scale.

Researcher                   Relevant examples
Interference    Control
Low             Low          1  Speech recording without researcher involvement
                             2  Surreptitious recording by researcher
                             3  Non-surreptitious recording by researcher
                             4  Participant observation recording
                             5  Semi-structured interview
                             6  Role play or role enactment
                             7  Dialog construction task
High            High         8  Oral DCTs

Figure 4: Researcher interference dimension

Speech recordings without any researcher involvement, number 1 in Figure 4,
may, of course, be considered to be the most authentic type of data (Kasper 2000:
316) and, therefore, ideal for pragmatic research. They may be argued to be as close
as possible to actual speech. However, with this type of recording the researcher
depends entirely on the previous availability of data that were recorded for some
non-research related purpose. Golato (2017) calls this “naturally occurring data”
and refers to Potter’s (2002: 541) “(conceptual) dead social scientist’s test”, which
asks whether the data would still exist even if the researcher got run over on the
way to work. The researcher would not be able to carry out an interview, but a
counselling session would take place even if the researcher failed to turn up.
Radio and television broadcasts are examples of recordings that do not depend
on the presence of a researcher and – as forms of public spoken language – they are
generally easily available. This makes them attractive as data for pragmaticists in
spite of the lack of the researcher’s control over the data. He or she cannot manip-
ulate the situation in order to elicit special types of language patterns, e. g. specific
speech acts and the like. The participants of such recordings are obviously aware of
the fact that they are being recorded. The recording situation and a potentially very
large audience are likely to constrain the language production of the participants
in many ways. Thus, in spite of their usefulness, such recordings cannot be used as
substitutes for unconstrained language use, and for many research questions such
speech recordings are not available at all. Much of the content of the spoken
component of corpora consists of such recordings. The Corpus of Contemporary
American English, for instance, contains 109 million words of spoken language (out of
a total of 520 million words), which consist entirely of transcripts of unscripted
conversations from television and radio programmes (http://corpus.byu.edu/coca/)
(see, for instance, Leech 2014: 256–260 on the inclusion of spoken language in
corpora, such as the BNC or ICE).
These limitations might make it attractive for researchers to collect the type of spoken
data that they are interested in by setting up surreptitious recordings, number 2 in
Figure 4. This would eliminate the observer’s paradox (Labov 1972: 209) that we
cannot observe behaviour when it is not being observed, but today’s standards of
ethical research – and in many countries even legal constraints – rule out such a
procedure (see, for instance, Duranti 1997: 117; Flöck 2016: 36). It is no longer
acceptable – as apparently it was in the early days of speech recordings – to record
people surreptitiously and only ask them for consent after the event (but see Hambling-Jones
and Merrison’s (2012: 1121) argument that in some situations surreptitious
recordings and retrospective consent might be superior to pre-obtained consent).
With non-surreptitious recordings, number 3 in Figure 4, the researcher has to
accept the observer’s paradox and the effects that the recording equipment might
have on the participants. This category, of course, comprises a rather large range
of possible situations from dinner table conversations to specifically elicited narra-
tives or service encounter recordings. In some cases, the researcher takes part in the
conversations that he or she records, which turns them into participant observation
recordings, number 4 in Figure 4. Schiffrin (1987), for instance, carried out what she called
sociolinguistic interviews with groups of people from her neighbourhood with whom
she shared an ethnic identity. She points out how her participation complicated
the observer’s paradox (Schiffrin 1987: 41). The analyst’s role might influence
the development of the interaction and it might influence the interpretation of
the results because the analyst is no longer a neutral outsider. Rüegg (2014), to
mention a more recent example, investigated thanks responses from a variational
perspective. She collected her data by recording visits to restaurants in Los Angeles
in three different price ranges. The recordings of the interactions between a waiter
and a small group of guests were not surreptitious but the interactions clearly had
a primary purpose that was outside the linguistic research questions. They had to
do with offering and ordering food and drinks and with the incidental necessities
of serving food and drinks, clearing the table and so on.
Numbers 1 to 4 on the researcher interference dimension can all still be
considered “naturally occurring data” but it is clear that there are differences in the
level of researcher interference and – concomitantly – in the level of researcher
control. With participant observations, the researcher can, of course, try to influ-
ence the flow of the conversation and thus take at least some control of what kind
of language the participants produce, especially if they manage to create speech
situations in which the pragmatic element under investigation is likely to occur in
a naturalistic way because of the necessities imposed by the situation.
The remaining numbers on the dimension shift the balance from naturally
occurring data to elicited data (dealt with in more detail in Schneider, this volume).
They impose more and more control on the language production of the participants.
While a semi-structured ethnographic interview, number 5, leaves some room for
a broader range of responses from participants, role plays or role enactment tasks,
number 6, ask for very specific behaviour, in which the responses depend – at least
to some extent – on the acting abilities of the participants and their willingness to
play along. Dialog construction tasks, number 7, ask participants to create – usually
in written form – an entire dialogue including the utterances by several participants
in order to elicit the participants’ intuition about typical or appropriate dialogues in
a given situation. Discourse completion tasks, number 8, finally impose the highest
level of control on the participants’ language production. Usually they are expected
to produce a speech act of a very specific type, such as an apology, a request or a
response to a compliment.

4.4. Researcher perspective dimension: Micro versus macro

The researcher perspective dimension relates to the amount of data that is being
investigated. It does not distinguish between different types of data as the three
dimensions outlined above do. It is concerned with the perspective adopted by the
researcher. At one end of the scale the researcher investigates a very small amount
of usually richly contextualized data, prototypically a single conversation or even
just a small extract of a conversation where the researcher knows a lot about the
participants and the context in which the conversation took place. At the other end
of the scale the researcher searches for patterns of language use in large corpora
consisting of millions or even billions of words. Bednarek (2011: 546) illustrates
this dimension with Figure 5:

Continuum of text/discourse data

Individual text(s):      Small-scale      Large-scale
case study/ies           corpus           corpus

Figure 5: Researcher perspective dimension (Bednarek 2011: 546)

Case studies of individual texts allow for rich contextualisations while large-scale
corpora only provide very minimal contextualisations, that is to say the amount
of data and the contextual richness can be seen – in a very abstract way – as a
reciprocal function. With an increasing amount of data, the contextual richness
becomes smaller and smaller. And, reciprocally, high contextual richness can
only be achieved if the amount of data is very small. The investigative preci-
sion, to use Leech’s term with a slightly different meaning, does not favour
one over the other. In fact, the investigative precision can only be increased by
increasing the amount of data with a given value of contextual richness or vice
versa.
A brief example may illustrate these interdependencies. The phrase I’m sorry
generally serves as an apology, whose occurrences can be investigated both in a
small-scale case study and in a large-scale corpus. In Barbara Kingsolver’s novel
Flight Behavior (2012), there are 20 instances of I’m sorry. Each and every one of
these instances is richly contextualized, and the reader can work out the level of
sincerity that is attached to each one, whether it is a token apology for an interrup-
tion as in (1) or whether it is a heart-felt apology for breaking up a marriage as in
(2). In an important sense, fictional examples provide a more complete contextu-
alization than real life conversations. In real life, conversationalists under obser-
vation from a researcher have a wealth of life experiences that are not accessible
to the researcher. In a novel, the depicted characters do not have any life experi-
ences outside of the novel. Whatever is relevant for the novel is depicted in the
novel.
(1) “I’m sorry for the interruption, Bobby,” Brenda’s mother said, cocking one hand on her
hip, doing a poor job of looking sorry. (Page 99, Location 1198)
(2) “I’m sorry,” she said. “I’m thankful for our children. But I’m not what you need.” (Page
527, Location 6500)
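
A small-scale search of this kind can be approximated with a keyword-in-context (KWIC) listing. The following is only a minimal sketch in Python and not the procedure used for this chapter: the function name `kwic`, the window size and the handling of apostrophes are assumptions made for illustration, and the sample text consists solely of the opening words of examples (1) and (2).

```python
import re

def kwic(text, phrase, window=40):
    """Return (left context, match, right context) for every hit of `phrase`."""
    # Assumption: the text may contain straight (') or curly (’) apostrophes,
    # so an apostrophe in the search phrase is allowed to match either form.
    pattern = "['\u2019]".join(re.escape(part) for part in phrase.split("'"))
    hits = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - window):m.start()]
        right = text[m.end():m.end() + window]
        hits.append((left, m.group(0), right))
    return hits

# Sample text: the beginnings of examples (1) and (2) only.
sample = (
    "“I’m sorry for the interruption, Bobby,” Brenda’s mother said. "
    "“I’m sorry,” she said. “I’m thankful for our children.”"
)

hits = kwic(sample, "I'm sorry")
for left, match, right in hits:
    # Right-align the left context so that the hits line up in a column.
    print(f"{left:>40} [{match}] {right}")
```

Each hit is returned together with its immediate co-text, which is the information that allows the analyst to judge, for instance, the sincerity of an apology.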

Figure 6 shows the frequency development of I’m sorry over two centuries of
American English. It is based on a corpus of digitized texts containing more than
five million books and a total of some 361 billion words in English texts (Michel et
al. 2010). But these instances are entirely decontextualized. For copyright reasons
the software does not access a database containing all these books but only indexed lists
of ngrams derived from these books. Each ngram in the database comes with an
indication of the year of publication and its language or language variety but it is
disconnected from its actual context. This is an extreme case of a decontextualized
database, and it is, therefore, usually shunned by corpus linguists except for some
very preliminary initial searches that can be used to ask more specific questions.
In this case, it is impossible to ascertain, for instance, whether the phrase I’m sorry
was indeed used as an apology or perhaps to perform another speech act, as for
instance the expression of condolences.
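
The limits of such a database can be made concrete with a small sketch. The Python code below assumes a hypothetical list of records of the form (ngram, year, match count, total words for that year). The record layout, the function name `per_million` and all numbers are invented for illustration and are not taken from the Google Books data. Under these assumptions, relative frequencies per year can be computed, but nothing in a record allows the analyst to recover the surrounding text or the speech act performed.

```python
from collections import defaultdict

# Hypothetical records: (ngram, year, match_count, total_words_in_year).
# All values are made up for illustration only.
records = [
    ("I'm sorry", 1900, 120, 60_000_000),
    ("I'm sorry", 1950, 900, 150_000_000),
    ("I'm sorry", 2000, 4_000, 400_000_000),
    ("I apologise", 2000, 40, 400_000_000),
]

def per_million(records, phrase):
    """Relative frequency of `phrase` per million words, keyed by year."""
    counts = defaultdict(int)
    totals = {}
    for ngram, year, count, total in records:
        if ngram == phrase:
            counts[year] += count
            totals[year] = total
    # Multiply before dividing to keep the arithmetic exact for these values.
    return {year: counts[year] * 1_000_000 / totals[year] for year in sorted(counts)}

freq = per_million(records, "I'm sorry")
for year, value in freq.items():
    print(year, value)
```

The result is a frequency curve of the kind plotted by such tools. Deciding whether a given instance is an apology or an expression of condolences would still require access to the contextualized source text.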
Figure 6 shows that the phrase had a very low frequency in the nineteenth
century. Its use increased in the first half of the twentieth century, followed by a noticeable
decrease in the 1960s and 1970s and a sharp increase after that. This raises the
interesting follow-up question of whether the decrease in the 1960s and 1970s could
in any way be related to social and cultural developments at the time. However, in
order to tackle such questions, the researcher would have to go back to contextualized
data samples (see also O’Keeffe, this volume, on the development of I’m sorry
versus I apologise).

5. Conclusion

Pragmatics studies the use of language in all its complexities and diversities, which
means that language in all its various forms, shapes and varieties provides the data
for pragmatic research. Pragmatics no longer focuses on a single type of data, such
as, for instance, spontaneous, multi-party conversations that take place in private
settings. Pragmatics is not restricted to the modality of spoken language. It is also
concerned with written language, with digital language, with sign language and
with all aspects of nonverbal communication. Different types of language data
invite different types of research questions, and different research questions require
different types of data, as well as different methods of collecting and analysing them
(see Félix-Brasdefer 2007; Jucker 2009a; Golato 2017).
In many cases, it is the triangulation of different types of data that provides
a better understanding of pragmatic issues. Félix-Brasdefer (2007: 163), for
instance, uses both role play data and naturally occurring interactions in his study
of requests in Mexican Spanish, and Flöck (2016: 84), who compares requests in
British English and in American English, uses both audio recordings of informal,
naturally occurring conversations and written production data elicited in discourse
completion tasks. In both studies the combination of data and methods provided
a more comprehensive view of requests than a reliance on one type of data would
have made possible.
This introductory chapter has given an overview of different types of data in
pragmatic research (data collection methods are covered by Schneider, this vol-
ume). Such a task is potentially boundless because virtually all the existing litera-
ture in pragmatic research could be situated within the scope of this paper. I have,
therefore, focused on the relevant modalities (spoken, written, digital, signed,
nonverbal) and their impact on pragmatic research as well as the relevant data
dimensions (level of constraints and fictionality) and researcher dimensions (interference/control
and research perspective/data size).

Figure 6: “I’m sorry” in American English from 1800 to 2000 (http://books.google.com/ngrams/)

Acknowledgment

A special word of thanks goes to Andrea Golato, Daniela Landert, Magdalena
Leitner, Miriam Locher, Mirjam Schmalz and Larssyn Staley as well as to my two
co-editors of this volume for their helpful and perceptive comments on draft ver-
sions of this chapter. The usual disclaimers apply.

Data sources

Kingsolver, Barbara
2012 Flight Behavior. A Novel. New York: Harper (Kindle Edition).

References

Alfonzetti, Giovanna
2013 Compliments. In: Marina Sbisà and Ken Turner (eds.), Pragmatics of Speech
Actions, 555–586. (Handbooks of Pragmatics 2.) Berlin: de Gruyter Mouton.
Auf dem Keller, Caren
2004 Textual Structures in Eighteenth-century Newspaper Advertising. A Cor-
pus-based Study of Medical Advertisements and Book Advertisements. Aachen:
Shaker.
Austin, John L.
1962 How to Do Things with Words. The William James Lectures Delivered at Har-
vard University in 1955. Oxford: Oxford University Press.
Beaugrande, Robert-Alain de and Wolfgang Ulrich Dressler
1981 Introduction to Text Linguistics. London: Routledge.
Bednarek, Monika
2011 Approaching the data of pragmatics. In: Wolfram Bublitz and Neal R. Norrick
(eds.), Foundations of Pragmatics, 537–559. (Handbooks of Pragmatics 1.)
Berlin: de Gruyter Mouton.
Beeching, Kate
2016 Pragmatic Markers in British English. Meaning in Social Interaction. Cam-
bridge: Cambridge University Press.
Biber, Douglas
1988 Variation across Speech and Writing. Cambridge: Cambridge University
Press.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad and Edward Finegan
1999 Longman Grammar of Spoken and Written English. London: Longman.
Bilmes, Jack
1988 The concept of preference in conversation analysis. Language in Society
17(2): 161–181.
Blakemore, Diane
1992 Understanding Utterances. An Introduction to Pragmatics. Oxford: Black-
well.
Blum-Kulka, Shoshana, Juliane House and Gabriele Kasper (eds.)
1989 Cross-Cultural Pragmatics: Requests and Apologies. Norwood, NJ: Ablex.
boyd, danah
2014 It’s Complicated: The Social Lives of Networked Teens. New Haven and Lon-
don: Yale University Press.
Brown, Roger and Albert Gilman
1989 Politeness theory and Shakespeare’s four major tragedies. Language in Society
18(2): 159–212.
Brownlees, Nicholas (ed.)
2006 News Discourse in Early Modern Britain. Selected Papers of CHINED 2004.
Bern: Peter Lang.
Bublitz, Wolfram
2017 Oral features in fiction. In: Miriam A. Locher and Andreas H. Jucker (eds.),
Pragmatics of Fiction, 235–263. (Handbooks of Pragmatics 12.) Berlin: de
Gruyter Mouton.
Chapman, Siobhan
2011 Pragmatics. (Palgrave Modern Linguistics.) Houndmills, Basingstoke: Pal-
grave Macmillan.
Cibulka, Paul
2016 On how to do things with holds. Manual movement phases as parts of inter-
actional practices in signed conversation. Sign Language Studies 16(4): 447–
472.
Cienki, Alan
2017 Gesture and pragmatics: From paralinguistic to variably linguistic. In: Anne
Barron, Peter Grundy and Gu Yueguo (eds.), Routledge Handbook of Pragmatics,
61–68. London: Routledge.
Claridge, Claudia
2010 News discourse. In: Andreas H. Jucker and Irma Taavitsainen (eds.), Histor-
ical Pragmatics, 587–620. (Handbooks of Pragmatics 8.) Berlin: de Gruyter
Mouton.
Clift, Rebecca
2014 Conversation analysis. In: Klaus P. Schneider and Anne Barron (eds.), Prag-
matics of Discourse, 97–124. (Handbooks of Pragmatics 3.) Berlin: de Gruyter
Mouton.
Coseriu, Eugenio
1980 Textlinguistik. Eine Einführung. (Tübinger Beiträge zur Linguistik 109.)
Tübingen: Gunter Narr.
Crystal, David
2011 Internet Linguistics: A Student Guide. London: Routledge.
Culpeper, Jonathan and Merja Kytö
2000 Data in historical pragmatics: Spoken discourse (re)cast as writing. Journal of
Historical Pragmatics 1(2): 175–199.
Culpeper, Jonathan and Merja Kytö
2010 Early Modern English Dialogues. Spoken Interaction as Writing. (Studies in
English Language.) Cambridge: Cambridge University Press.
Deutschmann, Mats
2003 Apologising in British English. (Skrifter från moderna språk 10.) Umeå: Insti-
tutionen för moderna språk, Umeå University.
Duranti, Alessandro
1997 Linguistic Anthropology. Cambridge: Cambridge University Press.
Edmondson, Willis J.
2014 The emergence of discourse analysis as a disciplinary field: philosophical,
pedagogic and linguistic approaches. In: Klaus P. Schneider and Anne Barron
(eds.), Pragmatics of Discourse, 65–96. (Handbooks of Pragmatics 3.) Berlin:
de Gruyter Mouton.
Esser, Jürgen
2014 Taxonomies of discourse types. In: Klaus P. Schneider and Anne Barron (eds.),
Pragmatics of Discourse, 443–462. (Handbooks of Pragmatics 3.) Berlin: de
Gruyter Mouton.
Færch, Claus and Gabriele Kasper (eds.)
1987 Introspection in Second Language Research. Clevedon: Multilingual Matters.
Félix-Brasdefer, J. César
2007 Natural speech vs. elicited data. Spanish in Context 4(2): 159–185.
Fetzer, Anita
2014 Conceptualising discourse. In: Klaus P. Schneider and Anne Barron (eds.),
Pragmatics of Discourse, 35–62. (Handbooks of Pragmatics 3.) Berlin: de
Gruyter Mouton.
Flöck, Ilka
2016 Requests in American and British English. (Pragmatics & Beyond New Series
265.) Amsterdam: John Benjamins.
Fraser, Bruce
1999 What are discourse markers? Journal of Pragmatics 31(7): 931–952.
Golato, Andrea
2005 Compliments and Compliment Responses: Grammatical Structure and Sequen-
tial Organization. (Studies in Discourse and Grammar 15.) Amsterdam: John
Benjamins.
Golato, Andrea
2017 Naturally occurring data. In: Anne Barron, Yueguo Gu and Gerard Steen (eds.),
The Routledge Handbook of Pragmatics, 21–26. London: Routledge.
Gray, Bethany and Douglas Biber
2015 Stance markers. In: Karin Aijmer and Christoph Rühlemann (eds.), Corpus
Pragmatics. A Handbook, 219–248. Cambridge: Cambridge University Press.
Grice, Paul H.
1975 Logic and conversation. In: Peter Cole and J.L. Morgan (eds.), Syntax and
Semantics 3: Speech Acts, 41–58. New York: Academic Press.
Groeber, Simone and Evelyne Pochon-Berger
2014 Turns and turn-taking in sign language interaction: A study of turn-final holds.
Journal of Pragmatics 65: 121–136.
Hambling-Jones, Oliver and Andrew John Merrison
2012 Inequity in the pursuit of intimacy: An analysis of British pick-up artist inter-
actions. Journal of Pragmatics. Special Issue: Im/politeness across Englishes
44(9): 1115–1127.
Hanks, William F.
2011 Deixis and indexicality. In: Wolfram Bublitz and Neal R. Norrick (eds.), Foundations
of Pragmatics, 315–346. (Handbooks of Pragmatics 1.) Berlin: de
Gruyter Mouton.
Heinemann, Wolfgang and Dieter Viehweger
1991 Textlinguistik. Eine Einführung. (Germanistische Linguistik 115.) Tübingen:
Niemeyer.
Henke, Christoph
2005 Diskursanalyse und Literatur: Michel Foucaults Anti-Hermeneutik. In: Hans
Vilmar Geppert and Hubert Zapf (eds.), Theorien der Literatur, Vol. 2, 243–
260. Würzburg: Königshausen & Neumann.
Herring, Susan C.
2007 A faceted classification scheme for computer-mediated discourse. language@
internet 1/2007 (http://www.languageatinternet.de)
Herring, Susan C. (ed.)
1996 Computer-Mediated Communication. Linguistic, Social and Cross-Cultural
Perspectives. (Pragmatics & Beyond New Series 39.) Amsterdam: John Ben-
jamins.
Herring, Susan, Dieter Stein and Tuija Virtanen (eds.)
2013 Pragmatics of Computer-Mediated Communication. (Handbooks of Pragmat-
ics 9.) Berlin: de Gruyter Mouton.
Hoffmann, Christian and Wolfram Bublitz (eds.)
2017 Pragmatics of Social Media. (Handbooks of Pragmatics 11.) Berlin: de Gruyter
Mouton.
Jacobs, Geert
2014 Press releases. In: Klaus P. Schneider and Anne Barron (eds.), Pragmatics of
Discourse, 583–600. (Handbooks of Pragmatics 3.) Berlin: de Gruyter Mou-
ton.
Jucker, Andreas H.
1998 Historical pragmatics: An interdisciplinary approach. In: Raimund Borgmeier,
Herbert Grabes and Andreas H. Jucker (eds.), Anglistentag 1997 Giessen. Pro-
ceedings, 3–7. Trier: Wissenschaftlicher Verlag.
Jucker, Andreas H.
2005 News discourse: Mass media communication from the seventeenth to the
twenty-first century. In: Janne Skaffari, Matti Peikola, Ruth Carroll, Risto
Hiltunen and Brita Wårvik (eds.), Opening Windows on Texts and Discourses
of the Past, 7–21. (Pragmatics & Beyond New Series 134.) Amsterdam: John
Benjamins.
Jucker, Andreas H.
2008 Historical pragmatics. Language and Linguistics Compass 2(5): 894–906.
Jucker, Andreas H.
2009a Speech act research between armchair, field and laboratory: The case of com-
pliments. Journal of Pragmatics 41(8): 1611–1635.
Jucker, Andreas H. (ed.)
2009b Early Modern English News Discourse. Newspapers, Pamphlets and Scientific
News Discourse. (Pragmatics & Beyond New Series 187.) Amsterdam: John
Benjamins.
Jucker, Andreas H. and Christa Dürscheid
2012 The linguistics of keyboard-to-screen communication: A new terminologi-
cal framework. Linguistik online 56, 6/12. http://www.linguistik-online.org/
56_12/
Jucker, Andreas H. and Miriam A. Locher
2017 Introducing Pragmatics of Fiction: Approaches, trends and developments. In:
Miriam A. Locher and Andreas H. Jucker (eds.), Pragmatics of Fiction, 1–21.
(Handbooks of Pragmatics 12.) Berlin: de Gruyter Mouton.
Jucker, Andreas H. and Irma Taavitsainen
2013 English Historical Pragmatics. (Edinburgh Textbooks on the English Lan-
guage.) Edinburgh: Edinburgh University Press.
Kasper, Gabriele and Merete Dahl
1991 Research methods in interlanguage pragmatics. Studies in Second Language
Acquisition 13(2): 215–247.
Kasper, Gabriele
2000 Data collection in pragmatics research. In: Helen Spencer-Oatey (ed.), Cul-
turally Speaking. Managing Rapport Through Talk Across Cultures, 316–341.
London: Continuum.
Cormier, Kearsy, Sandra Smith and Martine Zwets
2013 Framing constructed action in British Sign Language narratives. Journal of
Pragmatics 55: 119–139.
Keisanen, Tiina and Elise Kärkkäinen
2014 Stance. In: Klaus P. Schneider and Anne Barron (eds.), Pragmatics of Dis-
course, 295–322. (Handbooks of Pragmatics 3.) Berlin: de Gruyter Mouton.
Kendon, Adam
2014 Gesture. Visible Action as Utterance. Cambridge: Cambridge University Press.
Klauk, Tobias and Tilmann Köppe
2014 Bausteine einer Theorie der Fiktionalität. In: Tobias Klauk and Tilmann
Köppe (eds.), Fiktionalität. Ein interdisziplinäres Handbuch, 3–31. Berlin:
de Gruyter.
Koch, Peter
1999 Court records and cartoons: Reflections of spontaneous dialogue in Early
Romance texts. In: Andreas H. Jucker, Gerd Fritz and Franz Lebsanft (eds.),
Historical Dialogue Analysis, 399–429. (Pragmatics & Beyond New Series
66.) Amsterdam: John Benjamins.
Koch, Peter and Wulf Oesterreicher
1985 Sprache der Nähe – Sprache der Distanz: Mündlichkeit und Schriftlichkeit
im Spannungsfeld von Sprachtheorie und Sprachgeschichte. Romanistisches
Jahrbuch 36: 15–43.
Koch, Peter and Wulf Oesterreicher
2007 Schriftlichkeit und kommunikative Distanz. Zeitschrift für germanistische
Linguistik. Deutsche Sprache in Gegenwart und Geschichte 35(3): 346–375.
Kytö, Merja
2010 Data in historical pragmatics. In: Andreas H. Jucker and Irma Taavitsainen
(eds.), Historical Pragmatics, 33–67. (Handbooks of Pragmatics 8.) Berlin: de
Gruyter Mouton.
Labov, William
1972 Sociolinguistic Patterns. Philadelphia: University of Pennsylvania Press.
Landert, Daniela
2017 Stance in fiction. In: Miriam A. Locher and Andreas H. Jucker (eds.), Prag-
matics of Fiction, 489–514. (Handbooks of Pragmatics 12.) Berlin: de Gruyter
Mouton.
Landert, Daniela and Andreas H. Jucker
2011 Private and public in mass media communication. From letters to the editor to
online commentaries. Journal of Pragmatics 43: 1422–1434.
Leech, Geoffrey N.
2014 The Pragmatics of Politeness. (Oxford Studies in Sociolinguistics.) Oxford:
Oxford University Press.
Levinson, Stephen C.
1979 Activity types and language. Linguistics 17: 365–399.
Levinson, Stephen C.
1983 Pragmatics. Cambridge: Cambridge University Press.
Levinson, Stephen C.
2003 Space in Language and Cognition. Explorations in Cognitive Diversity. Cam-
bridge: Cambridge University Press.
Locher, Miriam A.
2014 Electronic discourse. In: Klaus P. Schneider and Anne Barron (eds.), Pragmat-
ics of Discourse, 555–581. (Handbooks of Pragmatics 3.) Berlin: de Gruyter
Mouton.
Locher, Miriam A. and Andreas H. Jucker (eds.)
2017 Pragmatics of Fiction. (Handbooks of Pragmatics 12.) Berlin: de Gruyter
Mouton.
Mapson, Rachel
2015 Paths to politeness: Exploring how professional interpreters develop an under-
standing of politeness norms in British Sign Language and English. In: Barbara
Pizziconi and Miriam A. Locher (eds.), Teaching and Learning (Im)politeness,
155–184. Berlin: de Gruyter Mouton.
Michel, Jean-Baptiste, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew
K. Gray, The Google Books Team, Joseph P. Pickett, Dale Hoiberg, Dan Clancy, Peter
Norvig, Jon Orwant, Steven Pinker, Martin A. Nowak and Erez Lieberman Aiden
2010 Quantitative analysis of culture using millions of digitized books. Science
(http://www.sciencemag.org/content/early/2010/12/15/science.1199644)
Potter, Jonathan
2002 Two kinds of natural. Discourse Studies 4(4): 539–542.
Quinn, Gary
2017 British Sign Language. In: Anne Barron, Yueguo Gu and Gerard Steen (eds.),
The Routledge Handbook of Pragmatics, 55–60. London: Routledge.
Reber, Elisabeth
2012 Affectivity in Interaction. Sound Objects in English. (Pragmatics & Beyond
New Series 215.) Amsterdam: John Benjamins.
Rissanen, Matti
1986 Variation and the study of English historical syntax. In: David Sankoff (ed.),
Diversity and Diachrony, 97–109. Amsterdam: John Benjamins.
Roush, Daniel R.
2011 Language between bodies: A cognitive approach to understanding linguistic
politeness in American Sign Language. Sign Language Studies 11(3): 329–
374.
Rüegg, Larssyn
2014 Thanks responses in three socio-economic settings: A variational pragmatics
approach. Journal of Pragmatics 71: 17–30.
Sacks, Harvey, Emanuel A. Schegloff and Gail Jefferson
1974 A simplest systematics for the organization of turn-taking for conversation.
Language 50(4): 696–735.
Salmon, Vivian
1987 Sentence structures in colloquial Shakespearian English. In: Vivian Salmon
and Edwina Burness (eds.), A Reader in the Language of Shakespearean
Drama, 265–300. Amsterdam: John Benjamins. First published in Transac-
tions of the Philological Society 64(1): 105–140 [1965].
Schegloff, Emanuel A.
2007 Sequence Organization in Interaction. A Primer in Conversation Analysis I.
Cambridge: Cambridge University Press.
Schiffrin, Deborah
1987 Discourse Markers. (Studies in Interactional Sociolinguistics 5.) Cambridge:
Cambridge University Press.
Schmid, Hans-Jörg (ed.)
2012 Cognitive Pragmatics. (Handbooks of Pragmatics 4.) Berlin: de Gruyter Mou-
ton.
Schneider, Hans Julius
1995 Intuition and introspection. In: Jef Verschueren, Jan-Ola Östman and Jan
Blommaert (eds.), Handbook of Pragmatics. Manual, 606–608. Amsterdam:
John Benjamins.
Schubert, Christoph
2008 Englische Textlinguistik. Eine Einführung. (Grundlagen der Anglistik und
Amerikanistik 30.) Berlin: Erich Schmidt.
Searle, John R.
1969 Speech Acts. An Essay in the Philosophy of Language. Cambridge: Cambridge
University Press.
Sinclair, John McH. and R. Malcolm Coulthard
1975 Towards an Analysis of Discourse. The English Used by Teachers and Pupils.
Oxford: Oxford University Press.
Sowinski, Bernhard
1983 Textlinguistik. Eine Einführung. Stuttgart: Kohlhammer.
Sperber, Dan and Deirdre Wilson
1995 Relevance. Communication and Cognition. Second edition. Oxford: Basil
Blackwell.
Stenström, Anna-Brita
2011 Pauses and hesitations. In: Gisle Andersen and Karin Aijmer (eds.), Prag-
matics of Society, 537–567. (Handbooks of Pragmatics 5.) Berlin: de Gruyter
Mouton.
Streeck, Jürgen
2009 Gesturecraft. The Manu-facture of Meaning. (Gesture Studies 2.) Amsterdam:
John Benjamins.
Suter, Hans-Jürg
1993 The Wedding Report. A Prototypical Approach to the Study of Traditional Text
Types. (Pragmatics & Beyond New Series 27.) Amsterdam: John Benjamins.
Sutton-Spence, Rachel and Bencie Woll
1999 The Linguistics of British Sign Language. An Introduction. Cambridge: Cam-
bridge University Press.
Tagg, Caroline
2015 Exploring Digital Communication. Language in Action. (Routledge Introduc-
tions to Applied Linguistics.) London: Routledge.
Traugott, Elizabeth Closs
2008 The state of English language studies: A linguistic perspective. In: Marianne
Thormählen (ed.), English Now. Selected Papers from the 20th IAUPE Confer-
ence in Lund 2007, 199–225. Lund: Lund Studies in English.
Trosborg, Anna
1995 Interlanguage Pragmatics. Requests, Complaints and Apologies. Berlin: Mou-
ton de Gruyter.
Watts, Richard J.
2003 Politeness. (Key Topics in Sociolinguistics.) Cambridge: Cambridge Univer-
sity Press.
Werlich, Egon
1982 A Text Grammar of English. (UTB 597.) Second edition. Heidelberg: Quelle &
Meyer.
Wharton, Tim
2009 Pragmatics and Non-Verbal Communication. Cambridge: Cambridge Univer-
sity Press.
Woll, Bencie and Jim G. Kyle
1998 Sign language. In: Jacob L. Mey (ed.), Concise Encyclopedia of Pragmatics,
854–878. Amsterdam: Elsevier.
2. Methods and ethics of data collection
Klaus P. Schneider

Abstract: This chapter provides an overview of methods and data collection pro-
cedures employed in research in pragmatics. Specifically, the focus is on using
a corpus and recording naturally occurring spoken discourse, and on production
tasks (eliciting conversation, role plays, interviews, discourse completion tasks),
and comprehension and judgement tasks (multiple choice tasks, rating scales).
In this survey of methods, an attempt has been made to include many different
approaches and research traditions, among them speech act analysis, conversation
analysis, discourse analysis, Gricean pragmatics, cross-cultural pragmatics, inter-
language pragmatics and (im)politeness research. It is emphasized that there is no
best method, that all methods have advantages and disadvantages, and that each
method can be used for some purposes but not for others. Therefore, the choice
of method depends on the type of research, the research goals and the research
questions. There is also a discussion of research ethics, notably the principles of
welfare, autonomy, privacy and indebtedness. It is stressed that research ethics
has a historical dimension and can be conceptualized as a process of increasing
sensitivity. Some practices which were permissible or commonly used in the twentieth
century are no longer acceptable and are considered unethical in pragmatics
research today.

1. Introduction

In surveys of methods in pragmatics research, data types and data collection proce-
dures are usually dealt with together. In this handbook, however, we have decided
to tease them apart analytically and treat them separately to offer complementary
perspectives on crucial methodological issues and thus provide a differentiated
view of topics researchers in the field should be aware of. The following example
is given to highlight and illustrate the range of issues addressed in this chapter.
Probably the best-known and most influential paper ever written about the
speech act of complimenting was authored by Manes and Wolfson (1981). On the
very first page of their paper, they include the following methodological statement:
It is our conviction that an ethnographic approach is the only reliable method for col-
lecting data about the way compliments, or indeed, any other speech act functions in
everyday interactions. (Manes and Wolfson 1981: 115)

https://doi.org/10.1515/9783110424928-002
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 37–93. Berlin/
Boston: De Gruyter Mouton.

This far-reaching methodological claim that ethnographic field work, i. e. taking
field notes, is “the only reliable method for collecting data” not only for the anal-
ysis of compliments, but of any speech act, has received a lot of criticism, as in
fact have the results of Manes and Wolfson’s study (e. g. Jucker 2009: 1621–1622).
They found that in their corpus of 686 American English compliments, gathered
by the two researchers and also the students taking their courses, three syntactic
constructions prevailed, of which the most frequent one alone accounted for more
than fifty per cent of all compliments collected. They furthermore found that in
more than two thirds of their compliments, the positive evaluation was expressed
through an adjective and that the same five adjectives were used in most cases.
In view of these findings on recurring syntactic and semantic patterns, Manes
and Wolfson described American English compliments as highly formulaic. This
conclusion has, however, been challenged by pointing to the possibility that the
two researchers or at least their students gathering the data may have noticed only
those formulaic compliments they collected, while less prototypical ones, i. e. less
formulaic, more original or more indirect ones, may have gone unnoticed. Moreover,
it has rightly been objected that hearing may not have been accurate, since it
has been shown that, while it is possible to remember individual words, routines
or short phrases with some accuracy, it is difficult to recall the exact wording
of entire utterances just overheard, even immediately after hearing them, as
short-term memory is much shorter than people commonly believe (cf. Yuan 2001:
288). These two problems, i. e. a focus on stereotypical realizations and inaccurate
hearing, cast serious doubt on the reliability of “the only reliable method” and
show that this particular approach definitely has some shortcomings. On the other
hand, there are also obvious strengths. One is that the ethnographic approach is
very unobtrusive. It can be used to record at least some aspects of naturally occur-
ring everyday interactions while avoiding the “observer’s paradox” (Labov 1972).
Further advantages include that the researcher does not depend on the availability
and functioning of electronic recording devices. Also, it is not necessary to ask the
people observed and recorded for their consent, either before or after taking field
notes, while getting consent prior to recording people electronically is an important
legal and also ethical issue (see section 4 below). All of the topics illustrated in
this example – i. e. strengths and weaknesses of data collection methods, reliabil-
ity, technical, legal and ethical concerns – will be discussed in detail in the present
chapter.
Regarding Manes and Wolfson’s article on American English compliments and
their formulaic nature and the methodological claim the authors make in it, it must
be borne in mind that their article appeared as early as 1981 and that the research
reported must have been conducted even earlier. In other words, their article was
published at a time when several alternative methods and data collection procedures
were not yet available. At that time, recording devices were, as a rule, much bigger
and used some kind of tape or disc, corpora of spoken language were much smaller
and not generally accessible, and experimental methods such as discourse completion
tasks had not yet been invented (cf. Ogiermann, this volume). This example
demonstrates that the inventory of data collection methods in pragmatics research
has developed and grown over time. It would in fact be intriguing to write a history
of methods in pragmatics. This, however, is not the aim of the present chapter.
The aim of this chapter is to provide an overview of the many different data
collection methods used in pragmatics research today. In this endeavour, a non-eval-
uative stance is adopted. In this handbook, we firmly believe that there is no best
method as such. As each and every method has advantages as well as disadvantages,
choice of method depends entirely on the research goals and the research questions
to be answered. One method may be better suited to provide an answer to a particu-
lar question than another method, but for a different question it may be just the other
way around. Therefore, researchers must be clear about the questions they address,
and, more generally, which overall approach they wish to adopt and which type of
research they want to carry out, and make their methodological choices accordingly.
We further believe that any discussion of data collection methods should be free of
ideologies. Hence, it is intended to do justice to each data collection procedure, to
highlight its respective merits, but to also point out its respective weaknesses so that
informed choices can be made. Finally, to compensate for the disadvantages of any
chosen method and thus increase its validity, triangulation is recommended, i. e. the
combination of different methods and a comparison of data from different sources.
In his monograph on Research Methods in Applied Linguistics, Dörnyei (2007)
devotes the last chapter to the question “How to choose the appropriate research
method” and gives two general recommendations. His first recommendation is
summarized as follows: “feel free to choose the research method that you think
will work best in your inquiry” (Dörnyei 2007: 307; original italics). He further
elaborates this recommendation:
At the end of the day, research is not a philosophical exercise but an attempt to find an-
swers to questions, and just as we can go about this in our everyday life in many differ-
ent ways, the varied palette of research methodology is clear evidence for the possibility
of diverse solutions in the scientific enquiry. (Dörnyei 2007: 307)

In this handbook we unreservedly subscribe to the position that research is about
finding answers to questions. On the other hand, we would like to stress the cru-
cial role that such questions play in the selection of a data collection method and
concerning the appropriateness and suitability of each method for addressing a
specific set of research questions. It has, however, to be conceded that other factors
also influence the choice of method, e. g., as Dörnyei (2007: 308–312) points out,
personal style, personal training and personal experience. And these factors, no
doubt, have also influenced the writing of the present chapter.
The second general recommendation which Dörnyei offers, namely that “it is
worth considering applying a mixed methods research design in every situation”
(2007: 313, original italics) cautions against the dangers of “monomethodologies”
(cf. Miles and Huberman 1994: 43) and underscores our recommendation of tri-
angulation to increase the validity of results. It has to be noted, however, that the
concept of “mixed methods research design”, which is currently popular in many
areas of applied linguistics and beyond, refers to a combination of qualitative and
quantitative methods (cf. Angouri 2009, Kim 2013), whereas “triangulation” refers
to any combination of different data collection procedures and of data from differ-
ent sources (cf. Bednarek 2011: 551–552).
The present chapter provides a general overview of data collection methods
used in a wide range of approaches and frameworks, with their strengths and weak-
nesses (section 3), as well as a discussion of practical, legal and ethical issues
involved in collecting data (section 4). These discussions are prefaced by general
considerations concerning the types of research which provide the coordinates for
any investigations in the vast field of pragmatics (section 2).

2. Types of research: Some basic distinctions

Given the centrality of research questions and their importance for the selection of
suitable methods and data collection procedures, it is necessary to first briefly con-
sider and discuss the nature of research and the types of research contexts in which
different types of questions are asked. Such types of research can be characterized
with reference to a number of fundamental distinctions captured by the follow-
ing dichotomies, which are essentially relevant to any kind of (language-based)
research, but will be made immediately relevant to research in pragmatics in the
ensuing discussion.
(1) Empirical versus non-empirical
(2) First order versus second order
(3) Inductive versus deductive
(4) Comparative versus non-comparative
(5) Longitudinal versus non-longitudinal
(6) Diachronic versus synchronic
(7) Representative versus non-representative
(8) Qualitative versus quantitative
(9) Micropragmatic versus macropragmatic
(10) Spoken versus written

2.1. Empirical versus non-empirical

It is fair to say that most work in pragmatics today is empirical, i. e. it uses data
collected in one way or another, by employing one or more of the procedures and
instruments which are discussed in more detail in section 3 below. In more precise terms,
empirical work is based on data gathered from people other than the researcher.
By contrast, non-empirical work does not involve data gathered from other people.
In this case, researchers rely exclusively on their own everyday communicative
experience and their pragmatic competence, usually as native speakers of the lan-
guage they are interested in, sometimes generalizing their specific experience and
competence and conceptualizing them as universal.
Initially, in the early days of pragmatics, work was not empirical. Language
philosophers such as Austin, Searle and Grice illustrated their theories with fabri-
cated examples. Austin and Searle were speech act theorists. Speech act analysis,
on the other hand, as an empirical discipline, was started when linguists, inspired
e. g. by Ervin-Tripp’s account of alternative realizations of requests in American
English (Ervin-Tripp 1976), began to systematically investigate the actual linguis-
tic and situationally appropriate realization of speech acts in large collections of
data gathered from people other than the researcher. Classical examples include
Manes and Wolfson’s and Holmes’ work in sociolinguistics, using the ethnographic
method (Manes and Wolfson 1981, Holmes 1986, 1988), or Blum-Kulka et al.’s
and Trosborg’s work in applied linguistics, and specifically in cross-cultural and
interlanguage pragmatics, using discourse completion tasks and role plays (Blum-
Kulka et al. 1989, Trosborg 1995).
After the “empirical turn in linguistics” (Taavitsainen and Jucker 2015),
non-empirical research has sometimes been referred to as “armchair linguistics”,
using Fillmore’s derogatory term (Fillmore 1992). Jucker (2009), however, points
out that some ground-breaking research in pragmatics was in fact non-empirical,
including the works of the philosophers Austin, Searle and Grice. This is also true
of early politeness theories developed in linguistics, such as Lakoff’s (1973) and
Leech’s (1983) theories as well as of other revolutionary work in twentieth-cen-
tury linguistics, including work by de Saussure (1916) and Chomsky (e. g. 1957,
1965). The “armchair method”, i. e. researchers relying on their own competence
and everyday experience as competent speakers of a language, should therefore
not be rated negatively.
It is often argued that “armchair pragmatics” is also data-based and, hence,
empirical. In this case, the data used are usually called “introspective”. This term is,
however, ambiguous. In “armchair pragmatics” it means that researchers tap their
own competence, whereas in psycholinguistics it means tapping the competence
of (a large number of) informants in an experiment (cf., e. g., Færch and Kasper
1987). To highlight the essential difference between psycholinguistic experiments
on the one hand and armchair pragmatics on the other hand, data in the latter
approach are referred to (maybe less respectfully) as “intuitive”, “fabricated” or
“invented”.
Armchair pragmaticists, using only their own individual communicative expe-
rience and pragmatic competence, are not only researchers illustrating their theo-
retical claims, but also playwrights and writers of prose fiction as well as authors
of textbooks for (foreign) language teaching. While textbooks for foreign lan-
guage teaching purposes are often written by teams including native speakers of
the learners’ target language and native speakers of the learners’ native language,
there are also some rare cases in which playwrights do not rely on their individual
communicative competence alone, but prefer to develop their plays from scratch
with their actors (cf. Clements 1983). It has been suggested that dramatic dialogue
provides competence data underlying actual performance, rather than actual per-
formance data (e. g. Lakoff and Tannen 1984), and this also applies to dialogue in
prose fiction. Needless to say, dramatic dialogue and dialogue in prose fiction are
mostly devoid of what has been termed “normal non-fluency” (Short 1996), i. e.
hesitations, backchannelling, interruptions, overlap, and so on (cf. Bublitz 2017,
also Jucker 2015). The same is true of the examples of language use produced by
researchers employing the armchair method.

2.2. First order versus second order

The distinction between first-order and second-order conceptualizations, originally
made in systems theory, is especially popular in (im)politeness research (cf. Watts,
Ide and Ehlich 1992: 3–4). Kádár and Haugh (2013) define this distinction in the
following way:
The terminology of first-order and second-order is used in various fields of linguistics,
as well as other areas. In general, a first-order conceptualization refers to the way in
which a phenomenon is perceived by its users, while second-order describes a more
abstract, scientific conceptualization of the given phenomenon. (Kádár and Haugh
2013: 41)

Brown and Levinson’s politeness theory (1978, 1987), for example, has been crit-
icized for being based on a second-order concept of politeness, while researchers
such as Watts (e. g. 2003) have called for an analysis of first-order conceptual-
izations, i. e. how ordinary language users interpret and understand politeness.
First-order conceptualizations of politeness, rudeness, appropriateness, and so on
can be elicited e. g. in perception studies in which judgement tasks and rating
scales are employed (cf. section 3.4.2 below). First-order conceptualizations of
speech acts, on the other hand, can be elicited e. g. in meta-pragmatic interviews,
in which ordinary language users may be asked to define particular speech acts
(e. g. compliments or threats) or to report events in which particular speech acts
occurred (cf. section 3.3.3 below). More generally, first-order conceptualizations
of pragmatic phenomena (e. g. speech acts, discourse genres, courtesy, banter) can
also be examined by analysing the use of meta-pragmatic terms (e. g. compliment,
small talk, face) in fictional and non-fictional discourse (cf. Culpeper 2011, Jucker
and Taavitsainen 2014, Schneider 2017).

2.3. Inductive versus deductive

A further relevant distinction is that between inductive and deductive research.
Researchers employing the armchair method usually work deductively. They fab-
ricate utterances as examples to prove a point or illustrate a theory. At the other end
of what can be seen as a continuum, researchers in conversation analysis and inter-
actional linguistics work radically inductively by subscribing to the ethnometh-
odological principle of “unmotivated looking” or, more generally, “ethnometh-
odological indifference” (Garfinkel and Sacks 1970: 345–346), i. e. approaching
data, by default audiotaped naturally occurring conversation, in an unprejudiced
manner and letting patterns emerge from the data. This approach is also referred to
as purely data-driven. Most work in other areas of pragmatics research is located
between the endpoints of the deductive-inductive continuum, usually closer to the
inductive end, analysing collections of data guided by theories and hypotheses.

2.4. Comparative versus non-comparative

Work in conversation analysis is comparative in a very general sense of the word.
Essentially this work is about structural or, more properly, “organisational” sim-
ilarities between speech events under comparable circumstances, e. g. telephone
calls to an emergency hotline as in Sacks’s early work. Such similarities include,
for instance, what is said at the very beginning or the very end of telephone con-
versations.
More commonly, however, comparative research in pragmatics is aimed at con-
trasting different languages and cultures, often for the purposes of foreign language
teaching and learning, and, more recently, at contrasting different varieties of a
language or social groups sharing the same language. Relevant disciplines are
contrastive pragmatics, cross-cultural pragmatics, interlanguage pragmatics and
variational pragmatics (cf. Blum-Kulka et al. 1989a, Barron and Schneider 2009,
Beeching and Woodfield 2015). Pragmatics research in sociolinguistics, by con-
trast, was originally non-comparative, focused on one language or one specific
variety of a language alone. Classical examples are Pomerantz’s study of compli-
ment responses (1978), Manes and Wolfson’s study on compliments (1981), and
Ervin-Tripp’s study of requests (1976). All of these studies are focused exclusively
on American English, but they are not interested in variation within American Eng-
lish, e. g. across regions, ethnic communities or age groups. More recent sociolin-
guistic studies in pragmatics are, however, comparative in the sense that they com-
pare their own empirical findings to the findings from earlier studies on the same
phenomena. An early example is Holmes’ (1995) study of compliments in New
Zealand English, in which Holmes explicitly compares her own results to those
by Manes and Wolfson on compliments in American English. Holmes furthermore
examines gender differences as well as situational variation, notably power differ-
ences and differences in social distance. The study of situational variation can also
be characterized as comparative, as different interpersonal relations and constellations
and different types of social context are contrasted. Sociolinguistics today is
no longer interested in examining speech acts in a national variety of a language,
or in gender variation and situational variation alone. Much work in sociolinguis-
tic pragmatics is now focused either on micro units such as discourse markers,
quotatives and question tags or on more global concepts such as politeness, rela-
tional work and discursive identity construction. Regional, socioeconomic, age and
ethnic variation are also taken into account in sociolinguistic pragmatics (cf. e. g.
Macaulay 2009, Holmes et al. 2012, Pichler 2013).
Early work in pragmatics was not interested in variation and, hence, not in
comparison. Speech act theorists and philosophers such as Austin, Searle and Grice,
while using examples from their native English (for which they were later accused
of an ethnocentric bias, cf. Wierzbicka 1985), wanted to explore the fundamen-
tal nature of human verbal communication. Similarly, politeness theorists of the
first generation were interested in the universals of language usage, e. g. Leech
(1983) and, most explicitly, Brown and Levinson (1978, 1987). This also applies
to early work in impoliteness theory, e. g. to Culpeper (1996), who based his ini-
tial approach on Brown and Levinson’s theory. Today, however, there is a general
awareness of differences between languages and cultures and also between varieties
of the language and between subcultures and social groups. This applies in particu-
lar to so-called Continental-European pragmatics (cf. Huang 2010), but not to the
Anglo-American approach, i. e. Gricean pragmatics, nor to conversation analysis.

2.5. Longitudinal versus non-longitudinal

Research in pragmatics is not, as a rule, longitudinal, i. e. does not follow the same
informants across a time span of several years. Most empirical work provides a
synchronic snapshot, that is to say an insight into how language users behave at a
given point in time. Exceptions include studies on the pragmatic development in a
foreign language during a year abroad, i. e. ten to twelve months spent by school-
children or, more commonly, college or university students in a foreign country in
which their target language is spoken natively (e. g. Barron 2003, Schauer 2009,
Ren 2015a). By contrast, studies interested in pragmatic age variation are not nor-
mally real-time longitudinal studies, but apparent-time cross-sectional studies, i. e.
comparing different age groups coexisting at the same time (e. g. Dinkin, in press).
This approach is also suitable for doing research on language change.

2.6. Diachronic versus synchronic

The distinction between diachronic and synchronic research is often not well
understood by students who mistake diachronic research for research on historical
language and synchronic research for research on present-day language. While it is true
that pragmatics research is predominantly focused on present-day language use,
historical pragmatics is also a burgeoning field (cf., e. g., Jucker and Taavitsainen
2010). Within this field, a distinction can be made between historical pragmatics
in a narrow sense, i. e. synchronically focused on a period of time in the history
of a language, and diachronic pragmatics interested in language change, i. e. com-
paring periods in the history of a language to examine variation in time (cf. Jacobs
and Jucker 1995). Diachronic pragmatics, although usually conceptualized as a
branch of historical pragmatics in the broad sense of the term, is not restricted to
the study of historical language, but may also deal with recent or ongoing changes
in language use. An example is Dinkin’s study of responses to thanks in Canadian
English (in press), in which he compares juvenile and older speakers and their use
of different response realizations, based on which he postulates ongoing language
change in responding behaviour. A further example is Chen’s partial replication of
an earlier study of compliment responses in American English and Chinese (Chen
1993). Chen replicated the Chinese part of this earlier study, employing the same
production questionnaire, which includes four discourse completion tasks, and
collecting his data in the same city in the People’s Republic of China, i. e. Xi’an,
to warrant immediate comparability (Chen and Yang 2010). He found that Chinese
speakers no longer overwhelmingly rejected the compliments, as Leech’s modesty
maxim (Leech 1983) would predict, but predominantly accepted them, in line with
Leech’s agreement maxim. After approximately seventeen
years, the Chinese responses had become more similar to the American responses
established in the earlier study, thus reflecting, Chen and Yang argue, the economic
and societal changes in mainland China. This example illustrates that diachronic
work is also a type of comparative research, and that comparability is an important
issue in this type of research and crucial for arriving at reliable results (cf. Schnei-
der 2014). A further example is Jucker and Landert (2015), who do not examine
speech acts in everyday conversation, but turn-taking and narrative structures in
radio talk shows. Overall, however, studies analysing recent and ongoing changes
in language behaviour are still relatively rare.

2.7. Representative versus non-representative

Students often ask whether an empirical study in pragmatics is representative or
not. What they usually mean is whether the population involved in an empirical
project and the sample used are large enough to yield reliable results. Yet, repre-
sentativeness is not a matter of quantity. Rather, the question is: representative of
what? Generally, empirical studies in pragmatics focused on a particular country,
e. g. the United States of America or the People’s Republic of China, or a national
variety of a language, e. g. American English or New Zealand English, are not
representative of the entire population in the respective nation-state. That is to say,
such studies do not normally work with carefully stratified samples reflecting the
overall demographic composition of the population in the nation-state in question.
In this regard, studies in pragmatics cannot compete with studies in e. g. sociology
or other social sciences. In fact, there is one particular sociological group whose
verbal behaviour pragmaticists know more about than about the behaviour of any
other group of society. This group is the group of college and university students,
as researchers often, and for obvious reasons, recruit their own students as “guinea
pigs” in their empirical work, and this does not apply to pragmatics alone, but also
to linguistics at large and many other disciplines interested in human behaviour,
e. g. psychology (cf. Kasper 1993). In other words, researchers frequently use what
is known as a “convenience sample”, which is understandable considering the
practical difficulties in recruiting informants for a study, and feasibility should not
be underestimated (Edmondson 1981: 78). On the other hand, students, who are
dependent on their teachers and lecturers, may not participate voluntarily, which is a
serious ethical issue (cf. section 4 below). Moreover, as students of linguistics and
pragmatics, these informants are not, strictly speaking, ordinary language users.
Accordingly, findings from studies involving students, and especially the research-
er’s own students, should be interpreted cautiously and not be overgeneralized.
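
The difference between a stratified sample and a convenience sample can be made
concrete with a small sketch. The age bands, the population percentages and the
composition of the convenience sample below are invented for illustration and are
not taken from any of the studies cited here; the function name is likewise
hypothetical.

```python
def proportional_quotas(percent_by_stratum, sample_size):
    """Allocate informants to strata in proportion to their population share.

    Percentages are given as integers to avoid floating-point rounding;
    any places left over after rounding down go to the strata with the
    largest remainders, so the quotas always sum to sample_size.
    """
    if sum(percent_by_stratum.values()) != 100:
        raise ValueError("stratum percentages must sum to 100")
    quotas = {s: pct * sample_size // 100 for s, pct in percent_by_stratum.items()}
    remainders = {s: pct * sample_size % 100 for s, pct in percent_by_stratum.items()}
    leftover = sample_size - sum(quotas.values())
    for stratum in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[stratum] += 1
    return quotas


# Hypothetical population shares (in per cent) of four age bands.
percent_by_age_band = {"18-29": 20, "30-44": 25, "45-64": 33, "65+": 22}
stratified = proportional_quotas(percent_by_age_band, sample_size=200)

# Hypothetical convenience sample of 200 informants, nearly all of them students.
convenience = {"18-29": 188, "30-44": 9, "45-64": 3, "65+": 0}

for band in percent_by_age_band:
    print(f"{band:>6}: stratified quota {stratified[band]:3d}, "
          f"convenience sample {convenience[band]:3d}")
```

Both samples contain 200 informants, which illustrates the point made above: the
convenience sample is not smaller, but it does not mirror the composition of the
population it is supposed to represent.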

2.8. Qualitative versus quantitative

By and large, pragmatics research used to be qualitative rather than quantitative.
Researchers are interested in, for instance, the communicative functions of dis-
course markers, the mechanisms of turn-taking, the options available for realis-
ing a particular speech act in a given language, or strategies of being polite or
impolite. At the same time, researchers use relatively large populations and data
sets and apply to them statistical analyses, mostly descriptive statistics (cf., e. g.,
Ogiermann and Sassenroth 2012). In the context of empirical pragmatics research,
“relatively large” usually means several hundred. For example, for their study
of apologies in email discussion lists, Harrison and Allton (2013) analysed 260
instances. Manes and Wolfson’s (1981) study of compliments in American English
was based on 686 ethnographically collected instances. Spencer-Oatey et al. (2008)
gathered 2,490 reactions to compliment responses by employing multiple choice
tasks. Blum-Kulka et al. (1989a: 16) used DCTs to collect over 30,000 instances
of requests and apologies, rendering their Cross-Cultural Speech Act Realization
Project one of the largest, if not the largest, empirical project in speech act research
to date. By contrast, in their recent study of apologies in Australian English and
Bahasa Indonesia, Jones and Adrefiza (2017), who were also interested in gender
differences, involved only 24 informants in total: six male and six female informants
for each of the two language varieties under study (Jones and Adrefiza 2017: 97).
These informants were given three orally administered discourse completion tasks,
which yields a maximum of 72 apologies (24 informants × 3 tasks) in all (although the
appendices seem to suggest that a much smaller number of instances was given; cf.
Jones and Adrefiza 2017: 113–118), rendering their design a case study approach
rather than anything else. Even though the two authors do not provide percentages
but raw numbers, great caution is required in drawing any conclusions from such
small datasets, given that some of the features analysed, e. g. intensifiers, occur
with frequencies between 0 and 3 instances (Jones and Adrefiza 2017: 106).
Despite the availability and accessibility of machine-readable corpora, some
of which are extremely large, the amount of data for empirical studies, especially
in the area of speech act research, cannot be easily increased beyond the numbers
given in the preceding paragraph. The main reason for this is that function-to-
form searches, taking illocutionary categories as their starting point to find real-
izations of a given speech act, are not, or only rudimentarily, available at present
(cf. O’Keeffe, this volume), since pragmatic corpus annotation is still in its infancy
(cf. Archer and Culpeper, this volume). However, as several attempts are currently
being made to improve this situation, it should soon be possible to make better use
in pragmatics research of the enormous quantities available in language corpora.

2.9. Micropragmatic versus macropragmatic

Very many studies in empirical pragmatics have a micropragmatic focus. These
studies are focused either on individual speech acts, as in, first and foremost,
contrastive, cross-cultural and intercultural studies, which predominantly employ
discourse completion tasks and also role-plays, or on units smaller than speech acts
(“micro units”, cf. chapter 1 of this volume), e. g. discourse markers, as in some
studies in variational pragmatics, in which corpus-linguistic methodology is
preferred (e. g. Aijmer 2013). Historical pragmatics also has a traditional
micropragmatic focus (e. g. Jucker and Taavitsainen 2008), as does work in the
Gricean tradition, which, while not interested in speech acts, predominantly
concentrates on utterance-size units in its analyses.
A macropragmatic focus, on the other hand, is found in areas sometimes consid-
ered outside the scope of pragmatics, especially from the perspective of researchers
working in the Gricean tradition. These areas are in particular bottom-up conversa-
tion analysis and top-down discourse analysis. In this handbook, and in the hand-
book series this volume belongs to, CA and DA are, however, considered integral
parts of and important disciplines in pragmatics (cf. Schneider and Barron 2014).
Scholars doing research in these two particular traditions are interested in, among
many other phenomena, speech act sequences and other units larger than individual
utterances (“macro units”, cf. chapter 1 of this volume) such as remedial inter-
changes (e. g. Owen 1983), conversational openings and closings (e. g. Schegloff
1972, Schegloff and Sacks 1973) as well as entire speech events such as service
encounters (e. g. Félix-Brasdefer 2015). As a default, research of this type is based
on self-compiled, and therefore relatively small, corpora of audio recordings or,
more recently, video recordings, especially when the focus of analysis includes
non-verbal communication and multimodality.

2.10. Spoken versus written

Given its roots in speech act theory and considering the great impact of ethnometh-
odology and conversation analysis, pragmatics has a traditional focus on spoken
rather than written language. Needless to say, however, written language is also
used communicatively, intentionally and for practical and social purposes, i. e. to
get things done and to manage interpersonal relations, and this includes hand-writ-
ten and machine-written texts as well as digital manifestations of written language
(for further differentiations, cf. chapter 1, this volume). The pragmatics of written
language use has been studied from various perspectives in discourse analysis,
critical discourse analysis (CDA), text linguistics, text analysis and genre anal-
ysis (for overviews, cf., e. g., Mahlberg 2014, Wodak 2011, Esser 2014, Tardy
and Swales 2014). Investigations have dealt with discourse types, genre conven-
tions, structural, functional and contextual features, sequential aspects, obligatory
and optional elements, and manipulative representations of events, to name but a
few focuses of analysis. These examples once again show that the nature of the
research questions depends on the respective theoretical background and discipli-
nary affiliation of the researchers, which impact the choice of method and data
collection procedure (cf. also Bednarek 2011: 546–551). Research on the pragmat-
ics of written language use was initially based on small self-compiled corpora of
texts, e. g. research articles, as, for instance, in Clyne (1987). Clyne examines 26
research articles authored by native speakers of Australian English and 26 research
articles by German scholars, of which nine were written in their native German
and 17 in English as a foreign language. His study is comparative in two ways,
as he contrasts academic styles in research articles not only across languages, but
also across disciplinary cultures, specifically in linguistics and sociology. Today,
researchers investigating the pragmatics of written discourse frequently employ
large machine-readable corpora, irrespective of the framework that they work in.
Yet, whether or not large machine-readable corpora are employed depends again
on the specific research questions researchers wish to answer, and in particular, of
course, whether a suitable corpus is actually available. Barron (2012), for example,
is a large-scale contrastive genre analysis of 34 public information messages, such
as government-initiated road safety or health campaigns, in Ireland and Germany,
which includes both written and spoken language (as well as visual communication
and music) and considers a total of 244 written or spoken texts (posters and
messages in print media, and clips on radio or television and in the cinema). For
this particular project, no suitable corpus was available; one had to be compiled
specifically.

3. Strategies and instruments for data collection

Against the background of the discussion of types of research in the preceding
section, the present section provides an overview of strategies and instruments
for data collection used in pragmatics (cf. also Kasper and Dahl 1991, Kasper
2000, 2008, Jucker 2009, Bednarek 2011, Golato and Golato 2013, Leech 2014:
247–260). Previous overviews are often focused on particular areas of pragmatics,
e. g. interlanguage pragmatics, or particular units of analysis, e. g. speech act anal-
ysis. Methods employed in the Gricean paradigm are, as a rule, not included (cf.,
however, the chapters by Clark and by Gibbs, this volume). This means that the
focus is mostly on production rather than on comprehension. Furthermore, there
is a strong bias towards spoken language in these overviews; methods of collecting
written data are not normally covered. Archer et al. (2012: 11–23), although
quite brief, and especially Bednarek (2011) are two exceptions. While most authors
provide a general overview, Cohen (2012), in his survey of research methods in
intercultural pragmatics, takes the example of doctor-patient interactions to discuss
methodological issues, including issues of research design, data collection, and
data analysis.
Data collection methods in pragmatics research can be subsumed under three
headers: intuition, observation and experimentation. For these three broad catego-
ries, Jucker (2009), in his survey of methods for speech act research, adopts the
metaphors “armchair”, “field” and “laboratory” (cf. Clark and Bangerter 2004).
Prototypical armchair research, in which researchers exclusively rely on their own
communicative experience and pragmatic competence, and which is therefore
defined as individual second-order introspection (cf. section 2), can be used to
deductively develop theories and to postulate e. g. principles and maxims of com-
munication. “Armchairing” has been used by language philosophers and theorists
to formulate e. g. speech act theory, relevance theory, and (early) politeness theo-
ries, the Co-operative Principle (CP) and Politeness Principle (PP), conversational
maxims and politeness maxims (e. g. Austin 1962, Searle 1969, Sperber and Wilson
1995, Grice 1975, Leech 1983). This method is not an empirical method. The
individual intuitions of researchers are not data in the sense in which this term is
usually used. No tools or specific procedures are available or employed to collect them.
Therefore, the armchair method is not further discussed in the present chapter (cf.
the chapters by Bublitz, Sbisà, Huang, and Clark in Part II of the present volume).
Jucker (2009: 1615), who calls the prototypical armchair method “philosophical
method”, also classifies the “interview method” as an armchair method, because it
is also based on intuitions, specifically the intuitions of the interviewees. Yet, since
interviewing involves collective (first-order) introspection and requires the recruit-
ment of informants, audio- or video-recording and transcription work, it is dealt
with in the present chapter as an experimentational method (cf. Félix-Brasdefer
and Hasler-Barker 2017). The general focus of the present section is on empirical
pragmatics and, since observational data are dealt with in more detail in chapter 1
of this volume, especially on experimentational methods of data collection. All
methods discussed are presented under the headings “Using a corpus” (section
3.1), “Recording naturally occurring spoken interaction” (3.2), “Production tasks”
(3.3), “Comprehension tasks and judgement tasks” (3.4), and “Further data collec-
tion methods” (3.5). Each experimentational method is exemplified by individual
studies, classical and recent, to highlight and illustrate crucial issues pertaining
to each method, especially problems of experimental design and problem-solving
strategies, as well as the suitability of a method for specific types of research and
the potential for providing answers to particular research questions.

3.1. Using a corpus

Empirical pragmatics, in contrast to armchair pragmatics, requires a data corpus,
in the general and broad sense of the term. This applies to both fieldwork and lab-
oratory work, respectively requiring a corpus of observational data and a corpus of
experimentational data. In linguistics today, the term “corpus” is used in a narrow
and very specific sense, referring only to very large electronic machine-readable
collections of spoken and/or written language, which were not, as a rule, compiled
for any particular research purpose, let alone any particular type of research in
pragmatics. Examples of corpora of this kind include the British National Corpus
(BNC), the Corpus of Contemporary American English (COCA), and the national
or regional corpora belonging to the International Corpus of English (ICE). In
general, however, any collection of data, small or large, and whether machine-read-
able or not, is a corpus. A corpus, in this broad and general sense, may already
exist prior to a research project, as is the case for the above examples, or it may
be compiled by the researchers themselves for a specific project (cf. Andersen,
this volume). Self-compiled corpora are usually much smaller than pre-existing
machine-readable corpora in the narrow technical sense, but are tailored to a
particular research purpose and permit better control of relevant contextual and
demographic variables. In pre-existing large corpora, information about contexts and
demographic features of interlocutors is only rarely provided as comprehensively
and systematically as in the Santa Barbara Corpus of Spoken American English, if
at all. Overall, corpora provide big data, but not big context (cf. Taavitsainen and
Jucker 2015: 18).
Using language material from large machine-readable corpora is usually clas-
sified as a field method, i. e. a method of gathering naturally occurring data (some-
times called “natural”, “naturalistic”, “authentic” or “observational”). However,
corpus data do not all qualify as observational data. They are naturally occurring
to the extent that their existence does not depend on a researcher. Yet there are
significant differences between the data types included in machine-readable
corpora, sometimes even in the same corpus. A corpus may include written and
spoken language, everyday conversation and institutional discourse, fictional material
such as novels, film scripts and drama, and nonfictional material such as naturally
occurring talk. While there is a long tradition of using drama in discourse analysis
(cf. Schneider 2011), there are, of course, also important differences between
fictional dialogue on the one hand and authentic conversation on the other. Fictional
dialogues in prose and drama are (primarily) written representations of talk
which do not include what has been termed “normal non-fluency” (cf. section 2.1).
COCA, for instance, to highlight a further relevant issue, is popular because of its
large size, up-to-dateness, general availability and ease of access, yet it includes
exclusively spoken and written media language, e. g. from radio and television
programmes, and newspapers and magazines. Some television programmes may be
scripted; others, for instance documentaries, may include casual everyday
conversation. Researchers must be aware of the specific nature of pre-existing
corpora and the data types they contain and select the material for their analyses
very carefully. It is definitely helpful that large machine-readable corpora are, as
a rule, subdivided into relevant categories, e. g. “spoken”, “conversation”, “aca-
demic”; these categories are, however, not well defined. They are mostly rather
broad, lumping together different discourse types and genres, and they vary across
corpora.
Large machine-readable corpora are most effectively used in work on the
micro-pragmatic level, notably in work on micro-units such as discourse markers
and similar phenomena. Form-based corpus searches for such units are quick and
exhaustive (cf. O’Keeffe, this volume). These units can then be studied in the
co-text of entire speech events. Searches for larger and, more importantly, more
variable units such as speech acts are less successful. This applies even to speech
acts whose realizations are relatively fixed. For example, in their search for com-
pliments in the BNC, Jucker et al. (2008) found that even the highly routinized
syntactic and semantic patterns identified by Manes and Wolfson (1981) in their
seminal ethnographic study of American English compliments, which were used
as search strings, did not yield the expected results. On the one hand, along with
compliments, utterances were retrieved which were not compliments although they
displayed the same structural properties. On the other hand, compliments struc-
turally not corresponding to the search strings were not found. To overcome these
problems of precision and recall, a certain amount of manual sifting was necessary.
A popular strategy in corpus-based speech act analysis is the employment of illo-
cutionary force indicating devices (IFIDs) such as performative verbs (e. g. invite,
offer, apologize) or other devices used in explicit realizations of speech acts such
as sorry in apologies. Harrison and Allton (2013) and Lutzky and Kehoe (2017)
are two recent studies that proceed in this fashion. Both studies examine apologies
in a written digital genre, namely in emails and blogs respectively. Harrison and
Allton (2013) base their analysis on a self-compiled corpus of email messages sent
to discussion lists on academic and professional topics; Lutzky and Kehoe (2017)
worked with a sub-corpus of the Birmingham Blog Corpus (somewhat confusingly
referred to as BBC) and compared their results to Deutschmann’s (2003)
BNC findings. For their searches, both Harrison and Allton (2013) and Lutzky and
Kehoe (2017) used a small inventory of IFIDs which included not only sorry and
apologize but also e. g. excuse, forgive and regret. Needless to say, lexemes such as
the latter three occur in a range of speech acts other than apologies, and even sorry
is not an unambiguous indicator of apologising as it is also used in commiserations
and condolences (e. g. I’m sorry to hear that …). It is also clear that less explicitly
marked and more indirect realizations cannot be retrieved by employing this pro-
cedure. This may not be critical for apologies, or thanking, greeting and farewells,
yet many other speech acts are rarely or never realized by employing a performa-
tive verb; this holds in particular for conflictive and intrinsically face-threatening
acts, among them requests, threats and insults. To remedy this situation and solve
these problems, Jucker et al. (2012) recommend searching corpora for speech act
verbs (e. g. invite, suggest, warn) as well as speech act nouns (e. g. invitation,
suggestion, warning) in both their performative and their discursive uses, i. e. not
only for realizing the respective speech acts, but also for talking about speech acts
(e. g. reporting, commenting, challenging; cf. Schneider 2017). However, as long
as hardly any pragmatically annotated corpora exist (cf. Archer and Culpeper, this
volume), a certain amount of manual sifting will be required in many corpus-based
studies in pragmatics research.
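
The limits of a form-based IFID search, and the reason why manual sifting remains
necessary, can be illustrated with a minimal sketch. The six utterances and their
manual coding are invented for illustration, and the function names are
hypothetical; the search pattern merely uses the lexemes mentioned above and does
not reproduce the search procedure of any of the studies cited.

```python
import re

# Small IFID inventory built from the lexemes named in the text.
IFID_PATTERN = re.compile(
    r"\b(sorry|apologi[sz]e[sd]?|apologi[sz]ing|excuse|forgive|regret)\b",
    re.IGNORECASE,
)


def ifid_hits(utterances):
    """Return the indices of all utterances that contain at least one IFID form."""
    return {i for i, text in enumerate(utterances) if IFID_PATTERN.search(text)}


def precision_recall(retrieved, relevant):
    """Precision: share of hits that are apologies. Recall: share of apologies found."""
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Invented mini-corpus: (utterance, manually coded as an apology?)
corpus = [
    ("I'm sorry I missed your talk.", True),
    ("I apologize for the late reply.", True),
    ("I'm sorry to hear that.", False),                       # commiseration
    ("There is no excuse for such a policy.", False),         # not an apology
    ("My mistake, that should never have happened.", True),   # apology without IFID
    ("Thanks for the update.", False),
]

utterances = [text for text, _ in corpus]
gold = {i for i, (_, is_apology) in enumerate(corpus) if is_apology}
hits = ifid_hits(utterances)
precision, recall = precision_recall(hits, gold)
print(f"retrieved {sorted(hits)}: precision {precision:.2f}, recall {recall:.2f}")
```

In this toy example the search retrieves four utterances, of which only two are
apologies (precision 0.50), and it misses the one apology that contains no IFID
(recall 0.67). The false hits can be removed by manual sifting of the retrieved
material, whereas the missed instance can only be found by reading the corpus itself.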
In general, the suitability of corpus data for comparative work is limited as cor-
pus data are, as a rule, not immediately comparable, especially not across corpora.
The ICE corpora are a notable exception. Currently thirteen ICE corpora are avail-
able, ranging from Canada and East Africa to Sri Lanka and the USA, and many
more are planned or under construction. These corpora enable direct comparison
due to their parallel design and composition. Each of these corpora consists of
approximately 60 per cent of spoken language and 40 per cent of written language,
covering dialogue and monologue, private and public, scripted and unscripted,
including face-to-face conversations, telephone calls and speeches, some of them
broadcast, as well as printed, typed and handwritten material, including journalistic
genres and prose fiction <http://ice-corpora.net/ice/design.htm>. The ICE corpora
are, however, not very large, each corpus containing approximately one million
words only, which by today’s standards is rather small, considering that the Brown
Corpus (1961), regarded as the first machine-readable corpus, also includes one
million words (of written language only), and corpora today often comprise several
hundred million words or more. For some of the existing ICE corpora, i. e. those for
Nigeria, Sri Lanka and the USA, only the written part is available to date. The Irish corpus,
ICE-Ireland, is exceptional in two ways. First, it is divided into two parts, one for
the Irish Republic and one for Northern Ireland, rendering each part only half the
size of the other corpora and thus even smaller. Secondly, there is also SPICE-Ire-
land, which is a pragmatically annotated version of the spoken part of ICE-Ireland
and one of the very few pragmatically annotated corpora accessible today (cf.
Archer and Culpeper, this volume). SPICE-Ireland is annotated for Searle’s illocu-
tionary types (Directives, Expressives, etc.) and for discourse markers and related
phenomena, and is thus particularly suitable for work on these units of analysis.
Unfortunately, none of the other ICE corpora is annotated in this way. Searching
SPICE-Ireland for individual illocutions (i. e. speech acts such as requests, complaints
or advice) requires, however, manual sifting, though this is facilitated by the anno-
tation of illocutionary types. A further obstacle to comparative work more gener-
ally is the lack of (sufficient) information about situations and about participants
in almost all corpora. This problem is especially acute for investigations aimed at
examining the impact of macro-social factors such as region, age or gender, e. g.
in variational pragmatics.

3.2. Recording naturally occurring spoken interaction

While collecting written language is relatively simple and straightforward, col-
lecting spoken language is much more demanding, and this holds in particular for
recording naturally occurring spoken interaction such as everyday conversation. In
this case, a high investment of time is required and a number of practical, technical,
legal and ethical problems have to be solved (cf. section 4 below), including the
acquisition of recording devices and transcription work. For researchers, audio- or
video-recording naturally occurring conversation in the truest sense of the word is
virtually impossible. Since consent of all participants is required prior to recording,
the observer’s paradox applies:
[…] the aim of linguistic research in the community must be to find out how people talk
when they are not being systematically observed; yet we can only obtain this data by
systematic observation. (Labov 1972: 209)

In other words, talk cannot be recorded without participants being aware of the fact
and, as a result, behaving less naturally. In some studies, it is, however,
reported that participants tend to forget about being recorded and behave increas-
ingly naturally the longer the recording takes and the speech event lasts, feeling
particularly at ease in familiar situations and among friends. Tannen’s study of con-
versation at a Thanksgiving dinner among six friends, which was audio-recorded
for two-and-a-half hours, is a case in point (Tannen 1984, 2005). Another example
is Rüegg’s (2014) quasi-replication of Labov’s (1966) famous department store
study. Rüegg, working on socioeconomic variation in American English responses
to thanks, audio-recorded talk in several Los Angeles restaurants belonging to three
categories reflecting social class differences and labelled as “up”, “middle” and
“low”. In each case, Rüegg had dinner with a group of friends and participated in
the dinner conversations as well as in the interactions with waiters and waitresses.
All people involved were informed about the recordings beforehand, including the
restaurant owners, but at least the waiters and waitresses seemed to forget it in the
situation as they were busy doing their normal job. In Rüegg’s study, the careful
choice of locations for the recording and the same activities in all the locations, i. e.
having dinner, were the tertium comparationis permitting immediate comparison
and, thus, the analysis of socioeconomic variation in speech act realization.
Félix-Brasdefer (2015) employed a similar strategy. For his contrastive study
of service encounters in Mexico and the United States, he selected four types of
commercial and non-commercial settings as his tertium comparationis (small shops,
supermarket delicatessens, open-air markets and visitor information centres). His
book-length study is based on 147 hours of naturally occurring face-to-face ser-
vice talk audio-recorded in the selected settings and analysed quantitatively and
qualitatively for a range of phenomena including speech act realization, bargain-
ing sequences, turn-taking, and prosodic features as well as cross-cultural and
intra-cultural variation – in short, genre conventions on the micro- and macro-level
and their invariant and variant features. Shop owners and authorities gave permis-
sion to make the recordings. Customers were informed in a written note displayed
on the counter that the recordings were being made and had the option to refuse
being recorded (cf. also Placencia 2008). The researcher did not participate in the
recorded discourse.
Selecting a particular type of spoken discourse and/or a particular type of set-
ting, institutional or otherwise, is also a strategy frequently employed by research-
ers in conversation analysis and interactional linguistics. For instance, Sacks based
early work on a collection of telephone calls which were made to the helpline
operated by The Los Angeles Suicide Prevention Center (cf. Schegloff 1992). By
focusing on a particular discourse type or context and collecting similar cases, it is
possible for researchers working in that ethnomethodological tradition to identify
recurrent patterns of speaking, pausing, interrupting, etc. and draw conclusions
about participant practices characteristic of the given discourse type or context
and more specifically about “systematic”, i. e. collective, solutions to interactional
problems. In this fashion, researchers can make “seen” what is generally “unseen”
and just taken for granted (cf. Garfinkel 1967). This approach underlines again the
fundamental importance of comparative work and the centrality of comparability.
Generalizations are not easily arrived at by comparing material that is not readily
comparable, as is sometimes the case in investigations which are based on pre-existing
machine-readable corpora, or by examining only one individual instance of a dis-
course type, e. g. a single everyday casual conversation, where it is not clear which
properties are recurring or invariable and which are accidental or idiosyncratic.
This approach focused on a particular context furthermore demonstrates the over-
all appropriateness of what is essentially a top-down strategy even in CA, which
is primarily concerned with local phenomena. Finally, this approach emphasizes
the role of context and the context-sensitivity of pragmatic phenomena, including
turn-taking and pre-sequences, but also e. g. speech act realization. For example,
Methods and ethics of data collection 55

the aforementioned study by Harrison and Allton (2013) shows that there are cru-
cial differences between apologies in email messages to discussion lists and in
face-to-face conversation.
In general, it seems much easier to gain access and the permission to record
naturally occurring spoken interaction if the researcher is a participant-observer,
as was the case in Tannen’s and Rüegg’s studies. Being a participant also provides
the researcher with the opportunity to steer the conversation in a particular direc-
tion, which may be crucial for the respective aim and research question. At the
same time, participant observation reduces the degree of naturalness or authen-
ticity of the talk. Researchers as external observers, on the other hand, may not
have access to relevant information and misjudge the situation and the relationship
between the interactants, especially in everyday conversation, but not to the same
degree perhaps in e. g. service encounters or institutional discourse. If the observed
interactants are strangers, i. e. not known to the researcher, and if what they talk
about presupposes knowledge of their shared history and prior encounters, then
researchers may not be able to fully understand the recorded discourse. This is a
danger when adopting a less obtrusive etic (i. e. an outsider’s), rather than an emic
(i. e. an insider’s) perspective (cf. also Markee 2013). In this case, a more adequate
interpretation may only be achieved if researchers have the option to interview the
conversationalists after the recording or, ideally, discuss with them the transcripts
at a later stage to obtain a fuller picture.
After audio- or video-recording naturally occurring spoken interaction, tran-
scription work is necessary to enable systematic analysis of the recorded mate-
rial. This work should, however, not be underestimated, because it can be very
time-consuming, the more so, the more fine-grained detail is to be transcribed. For
instance, researchers in conversation analysis and interactional linguistics, who
would not accept any data type other than audio- or video-recorded naturally
occurring spoken interaction, are interested in phenomena whose transcription
requires a lot of time and experience and presupposes specific training. Transcription work in
this case involves e. g. measuring pauses and accurately representing interruptions,
overlaps and simultaneous talk. Needless to say, transcribing non-verbal behaviour
in video-recordings, in addition to verbal behaviour, is even more demanding and
much more time-consuming (for further details about transcribing and systems and
conventions of transcription cf. Kreuz and Riordan, this volume).
An alternative method of collecting naturally occurring spoken data is the eth-
nographic method, i. e. overhearing what other people say and writing it down,
traditionally by hand. This method is also known as “taking field notes” and some-
times called “the notebook method” (Jucker 2009: 1616). The advantages of this
method include that it avoids the observer’s paradox as no consent of the people
overheard is required. Moreover, no electronic recording equipment is needed and
no transcription work involved. This method has been popular in sociolinguistic
research, e. g. in the classical studies by Manes and Wolfson (1981) on American
English compliments and by Holmes (1986 and 1990) on New Zealand English
compliments and apologies. Evidently, this method is considered suitable for
speech act analysis. However, there are at least two serious shortcomings. As men-
tioned above, at the beginning of this chapter, Manes and Wolfson were the target of
methodological criticism. Doubt was cast on their result that compliments in Amer-
ican English are highly routinized and predictable. It was speculated that the data
collectors had gathered only prototypical and explicit realizations not least because
the investigators’ students were among the collectors. More indirect and more cre-
ative and original compliments, it was argued, might have gone unnoticed. This
criticism applies in fact more generally to any use of the ethnographic method for
the purposes of investigations into speech act realizations. It is, mutatis mutandis, a
version of the recall problem, otherwise considered typical of corpus-based speech
act analysis, here specific to the method under inspection. A further shortcoming
derives from the limitations of accurate hearing and memorizing. Keeping in mind
what was actually said and how exactly it was said until it is written down is a chal-
lenging task. The longer, more complex and unpredictable an utterance, the harder
it is to reliably enter its wording into a (conventional paper) notebook (cf. section 1
above). Obviously, the notebook method is best suited to the purposes of research
on micro-units such as discourse markers, highly routinized speech acts and, per-
haps, the contents of more complex speech acts, e. g. what is requested, promised or
offered. The method is unsuitable for investigations of the exact wording of freely
formulated speech acts, speech act sequences, interaction or turn-taking.
A recent example in which the ethnographic method was employed is
Bieswanger’s (2015) study of responses to thanks in American English and Cana-
dian English. Focusing on a particular type of male and female informant (based
on apparent demographic categories), he asked directions in New York City and
in Vancouver. He paid special attention to his informants’ reactions to his acts
of thanking, after he had received the desired information, and manually wrote
down their responses when his interlocutors were out of sight. Given the more
limited inventory of realization strategies for responses to thanks, their brevity
and their high degree of routinization, his data are more robust and reliable than
the compliment and apology data collected by Manes and Wolfson and by Holmes.
Bieswanger was not an eavesdropping bystander, but a participant in all interac-
tions, invariably using the same type of question and of thanking, thus achieving
a high degree of data homogeneity and comparability, which was central for his
investigation conducted in the framework of variational pragmatics and compar-
ing two national varieties of English. By contrast, Manes and Wolfson as well as
Holmes were each focused on one particular national variety and addressed dif-
ferent research questions, among others how many different contexts compliments
and apologies occur in and what topics they refer to, i. e. respectively the entity
complimented and the offence apologized for. Bieswanger, who gathered his data
alone, collected a total of 120 instances (30 male and 30 female informants in each
of the two cities) and provides raw numbers and percentages. Dinkin (in press)
used the same strategy as Bieswanger, i. e. asking directions, in his study of age
variation and apparent time changes in Canadian English responses to thanks. Like
e. g. Manes and Wolfson and Holmes, Dinkin involved his students in the data col-
lection and gathered more than 1,500 responses to thanks, on which he performed
detailed quantitative and statistical analyses.
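The sampling arithmetic behind Bieswanger's balanced design can be made explicit in a minimal sketch. The cell size of 30 and the two cities are taken from the description above; the function and variable names are hypothetical and serve illustration only, and the sketch does not reproduce any of the actual response data.

```python
from itertools import product

# Design parameters as reported in the text (Bieswanger 2015).
CITIES = ["New York City", "Vancouver"]
GENDERS = ["male", "female"]
PER_CELL = 30  # informants per city-gender cell


def design_cells(cities, genders, per_cell):
    """Map each (city, gender) cell of the design to its target count."""
    return {cell: per_cell for cell in product(cities, genders)}


def percentages(counts):
    """Convert raw counts into percentages of the overall total."""
    total = sum(counts.values())
    return {key: 100 * value / total for key, value in counts.items()}


cells = design_cells(CITIES, GENDERS, PER_CELL)
total_informants = sum(cells.values())  # 2 cities x 2 genders x 30 informants
shares = percentages(cells)             # each cell's share of the sample
```

The same `percentages` helper could be applied to counts of response types once these have been coded, which is one way of arriving at the "raw numbers and percentages" mentioned above.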
The responses to thanks collected by Bieswanger and by Dinkin were naturally
occurring as the informants were not aware that their responses were recorded. On
the other hand, their responses were not naturally occurring in the strict sense of
the word, because they were elicited by the investigator. With explicit reference to
Labov’s (1972) department store study, Dinkin calls this procedure “rapid anony-
mous elicitation”. In other words, the adopted procedure can be seen as halfway
between truly naturally occurring spoken interaction, recorded by an observing
non-participant, and data elicited in interactions initiated by the investigator under
laboratory conditions. This procedure is thus similar to experimentational
methods, which will be discussed in section 3.3 below. As directions are
asked from total strangers (hence “anonymous elicitation”), the interactions do not
have any social consequences, which is a distinctive feature of almost all experi-
mentational work, whereas truly naturally occurring interaction typically does have
consequences in real life. Asking directions is an elicitation procedure also used in
psychology, where it is considered an experimental method (cf. Gibbs, this volume).

3.3. Production tasks

This section deals with a wide range of different methods used to collect language
data produced by language users. The common denominator of these different
methods is that they are all experimentational methods. This means that they all
meet the following five criteria:
(1) The language produced does not occur naturally, i. e. it does not arise from the
genuine needs and desires of language users, but occurs on the initiative of a
researcher.
(2) The language produced is elicited under conditions determined by the
researcher, sometimes referred to as “laboratory conditions”. That is to say,
the researcher usually decides on time, place and setting of the data elicitation.
(3) All language users serving as informants are consciously aware that they are
involved in an experiment and that their language productions are recorded,
not necessarily electronically, and then used for research purposes. To this,
they have given their consent, and they participate voluntarily (see section 4
on ethical issues).
(4) All informants follow instructions and complete a task designed by the
researcher.
(5) At least in most cases, the language produced does not have any social con-
sequences, unlike naturally occurring discourse. This lack of consequences
contributes to the often bemoaned artificiality of the elicitation situations and
the language produced therein.
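The five criteria can be read as a checklist. The following minimal sketch encodes them as Boolean fields and reports which criteria a given procedure fails to meet. All names are hypothetical. The values assigned to the two sample procedures are illustrative assumptions derived from the descriptions in this chapter, and criterion (5), which holds only "in most cases", is simplified here to a strict yes/no value.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Procedure:
    """One data collection procedure, described by criteria (1) to (5)."""
    researcher_initiated: bool        # (1) language does not occur naturally
    researcher_sets_conditions: bool  # (2) "laboratory conditions"
    informants_aware: bool            # (3) informed, consenting participants
    task_with_instructions: bool      # (4) informants complete a designed task
    no_social_consequences: bool      # (5) simplified: treated as strict here


def unmet_criteria(procedure):
    """Return the names of all criteria the procedure does not meet."""
    return [f.name for f in fields(procedure) if not getattr(procedure, f.name)]


def is_experimentational(procedure):
    """A procedure counts as experimentational only if no criterion is unmet."""
    return not unmet_criteria(procedure)


# Illustrative assumptions, not classifications made in the source.
role_play = Procedure(True, True, True, True, True)
rapid_anonymous_elicitation = Procedure(True, False, False, False, True)
```

On these assumed values, the role play meets all five criteria, whereas rapid anonymous elicitation meets only (1) and (5), which is consistent with its description in section 3.2 as "halfway between" naturally occurring and experimentally elicited data.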

The production tasks discussed in this section include elicited conversation,
role plays, interviews and discourse completion tasks. They can be seen to form
a continuum with decreasing interactionality and, at the same time, increasing
researcher control (cf. also chapter 1, this volume, section 4.3 on researcher inter-
ference). Most of the production task formats discussed here can be, and have been,
employed for testing purposes as well, specifically for assessing the pragmatic
competence of foreign language learners and second language users (cf., e. g.,
Roever 2005, 2013).

3.3.1. Eliciting conversation

The essential difference between elicited talk and naturally occurring talk is that
the former meets the above criteria. By comparison to other production tasks,
researcher interference is minimal. As a rule, two participants are involved, who
can be themselves, i. e. they are not requested to adopt social roles other than
their own. The participant constellation is usually symmetrical, the type of talk
elicited is conversation, and the topics are not predetermined. Under the heading
“elicited conversation”, Kasper (2008: 287) also subsumes tasks in which research-
ers specify topics, interactional goals or discourse roles. Such tasks are, however,
very similar, if not identical, to the type of role play commonly referred to as role
enactment (cf. 3.3.2).
Instructions which seem to work particularly well require participants who
are complete strangers to get to know each other (e. g. Svennevig 1999). These
instructions seem to work well not least because this situation is ecologically
valid, i. e. participants can relate to it and have relevant previous experience, and
may also be genuinely interested in getting to know somebody new. Thus, the
language produced in this situation may even have social consequences. Getting
acquainted was also the task in Haugh and Carbaugh’s (2015) comparative study
of initial encounters between speakers of American English and between speakers
of Australian English, which was focused on self-presentation and self-disclosure,
as the respective interactional practices were found inductively to differ across
the two varieties of English under inspection. A total of 46 dyadic interactions
was recorded audio-visually, amounting to a corpus of nineteen and a half hours.
The informants were invited to participate in a project about “communication in
English”; the specific research goal was not revealed. The participants were taken
to a room and told that they were being recorded for the purposes of this project.
They were allowed to talk about any topic they wished to discuss and to determine
the end of the recording themselves. The resulting recordings varied in length from
approximately fifteen minutes to close to two hours. On the issue of social conse-
quences, the investigators note:
That getting to know people was indeed the aim of many participants was also evidenced
by the fact that a number of these encounters resulted in further contact being made be-
tween those participants on their own initiative. (Haugh and Carbaugh 2015: 467)

Unlike in the anonymous elicitation reported in section 3.2, Haugh and Carbaugh
had relevant information about their participants that they retrieved from a back-
ground questionnaire, as commonly used in experimentational work, providing
demographic information such as age, regional affiliation and educational back-
ground.
Elicited conversation displays all features of naturally occurring conversation
relevant to the purposes of pragmatics research, i. e. prosodic, formal, actional,
interactional, organizational, etc. features. Collections of this data type may, there-
fore, serve as a corpus for examining a range of different phenomena, including
intonation, discourse markers, speech act realization, adjacency pairs, speech act
sequences, conversational openings and closings, turn-taking, interruptions and
silence, to name but a few.

3.3.2. Role plays

Role plays are a commonly known task format frequently employed e. g. in foreign
language teaching in schools or in communication training, including intercultural
training, in business contexts. Role plays can be defined as “simulations of
communicative encounters, usually (but not necessarily) conducted in dyads on
the basis of role descriptions or instructions” (Kasper 2008: 288). Several subtypes
have been distinguished with reference to the following parameters: nature and
status of the roles ascribed, length and detail of instruction, and amount of the data
elicited (cf. also Félix-Brasdefer, this volume).
Perhaps the most basic distinction is that between role plays (in a narrow and
specific sense of the term) on the one hand and role enactment on the other hand.
In the latter case, participants can simply be themselves and do not have to adopt
a social role different from their own. In this regard, role enactment resembles
elicited conversation (cf. section 3.3.1 above). By contrast, in role plays (narrowly
defined) participants, e. g. college students, may be requested to take on roles such
as teacher, bank manager or policeman, in other words roles for which they lack
qualifications and experience. Role plays of this kind are nonetheless recorded if
researchers are interested in particular role relations and scenarios, if only for the
practical reason that it is much easier to recruit students as participants in research
projects than actual teachers, bank managers or policemen. However, in this case,
researchers have to bear in mind that the ecological validity of their data is limited.
The instructions for role plays (in the broad sense) can be very short and rather
vague or extremely long and complex. The following examples (a) – (c) illustrate
the shorter type (short instructions are also used e. g. by Hassall 2012, longer ones
e. g. by Göy et al. 2012). The examples below are taken from an early project in
discourse analysis which was based entirely on a role play corpus (Edmondson
1981: 77):
(a) Sheila wishes to borrow records off a flatmate
(b) Two travellers discover by accident that they seek the same destination
(c) Librarian notices student returning book has copiously annotated the text

In this particular research project, the participants were university students. All
three of the above examples describe scenarios for which university students can
be expected to have relevant experience, and this also applies to the remaining 21
scenarios used in that project. The scenarios can, therefore, be considered ecologi-
cally valid. However, unlike the second example, which is gender neutral, the first
example explicitly requires male as well as female participants to act out Sheila’s
role. To facilitate the task, detailed background information on Sheila and her cur-
rent situation (Edmondson 1981: 184) was given in writing, prior to the recording
session, to the participants playing Sheila’s role. Detailed background information
was in fact given for all characters, including the assignment of male or female
identities also in those cases which seem to be gender neutral in the instructions,
e. g. flatmate, travellers, librarian and student in the above examples. Thus, the
brief instructions illustrated above are all complemented with detailed background
information about the characters involved. The total of 24 scenarios is not a random
number chosen by the investigator, but was achieved by systematically varying rel-
evant variables, including among others power, familiarity and whether something
a person had done was positive or negative for the interlocutor (e. g. copiously
annotating the text of a library book in example (c) above). The validity of the data
is limited in those situations in which the student participants were required to play
the role of e. g. a librarian, a landlady or a shopkeeper. While these roles do not cor-
respond to the participants’ identities in real life, it could be argued that university
students have sufficient experience with librarians, landladies and shopkeepers to
be able to play these roles more appropriately than social roles more alien to them.
How the scenarios used in this project were actually developed is not revealed. It
can be assumed that they were designed by the researcher himself employing the
armchair method, which is a procedure commonly employed in experimentational
studies. An alternative procedure is to ask members of the targeted participant
group about relevant encounters, e. g. in an interview (cf. section 3.3.3 below).
Edmondson made more than two audio-recordings of each of his scenarios,
which is a wise strategy if the time and means are available. He then chose two
recordings for each scenario. Selection criteria included communication break-
down in the role play and participants’ overacting or unnatural behaviour according
to the participant comments after the recording. The resulting collection comprised
48 role-plays, which were transcribed and analysed for discourse markers (termed
“fumbles” in this project) and individual illocutions, and, most importantly, how
these combine into interactional structures in spoken discourse.
Recently, Félix-Brasdefer (2009) carried out a role play-based investigation
of requests in three Latin American varieties of Spanish (Mexico, Costa Rica,
Dominican Republic). His population consisted of 18 participants per variety; all
54 participants were male. The three scenarios written for the role play record-
ings all included a symmetrical relationship between the interactants, but differing
degrees of social distance. The 162 interactions in the role-play corpus were used
to analyse aspects of requesting behaviour not normally examined. The focuses of
analysis included not only the realization of request head acts, but also sequential
features of requesting and request negotiations such as “initial” versus “post-initial
requests”, and also several types of downgrading including prosodic downgraders
such as tempo, loudness and rate of delivery. These features were compared across
the three Spanish-speaking cultures in the framework of variational pragmatics.
This example shows how role play data, which display essentially all features of
spoken interaction (albeit, perhaps, in a less natural way, due to the observer’s
paradox and the artificiality of the situation) may contribute to speech act analysis.
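The corpus figures reported for this study fit together arithmetically, as the following minimal sketch shows. It assumes that each participant was recorded once in each of the three scenarios; this reading is consistent with the reported totals but is an assumption, and the variable names are hypothetical.

```python
# Figures as reported in the text (Félix-Brasdefer 2009).
VARIETIES = ["Mexico", "Costa Rica", "Dominican Republic"]
PARTICIPANTS_PER_VARIETY = 18
SCENARIOS = 3

# Assumption: every participant is recorded once per scenario.
participants = len(VARIETIES) * PARTICIPANTS_PER_VARIETY
interactions = participants * SCENARIOS
interactions_per_variety = interactions // len(VARIETIES)
```

Under this assumption, the sub-corpora for the three varieties are equal in size, which is the kind of comparability that contrastive work in variational pragmatics depends on.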
In her monograph-length study of discourse markers in British English, specif-
ically well, just, you know, like, sort of and I mean, Beeching (2016) employed a
corpus of 81 role enactments involving undergraduate students, which were based
on the same instructions (Beeching 2016: 230). This corpus had been recorded at
her university for a different purpose between 2010 and 2014 (Beeching 2016:
30–31). The data from this corpus were triangulated with data from the British
National Corpus and the Old Bailey Corpus (Beeching 2016: 49–50). Additionally,
62 informants in two age groups (18–20 and 50–70) were asked to rate each of the
six discourse markers on a five-point scale on four dimensions (“polite/impolite”,
“direct/indirect”, “educated/uneducated”, “friendly/unfriendly”) (Beeching 2016:
40–41, 231–233; on rating tasks, cf. section 3.4.2 below).
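To give an idea of the size and shape of data from such a rating task, the following minimal sketch models a single rating and counts the ratings expected from a complete data set. It assumes that every informant rated every marker on every dimension and that the five-point scale is coded from 1 to 5; both are assumptions for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import product

# Markers, dimensions and number of informants as reported in the text.
MARKERS = ("well", "just", "you know", "like", "sort of", "I mean")
DIMENSIONS = (
    "polite/impolite",
    "direct/indirect",
    "educated/uneducated",
    "friendly/unfriendly",
)
N_INFORMANTS = 62
SCALE_MIN, SCALE_MAX = 1, 5  # assumed coding of the five-point scale


@dataclass(frozen=True)
class Rating:
    """One informant's score for one marker on one dimension."""
    informant_id: int
    marker: str
    dimension: str
    score: int

    def __post_init__(self):
        # Reject values that fall outside the design of the rating task.
        if self.marker not in MARKERS:
            raise ValueError(f"unknown marker: {self.marker!r}")
        if self.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {self.dimension!r}")
        if not SCALE_MIN <= self.score <= SCALE_MAX:
            raise ValueError(f"score out of range: {self.score}")


def rating_slots(n_informants):
    """List every (informant, marker, dimension) combination to be rated."""
    return list(product(range(1, n_informants + 1), MARKERS, DIMENSIONS))


expected_ratings = len(rating_slots(N_INFORMANTS))  # 62 x 6 x 4
```

Comparing the number of ratings actually collected with `expected_ratings` is a simple way of checking a data set of this kind for missing responses.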
Role plays with long and complex instructions for each participant are often
referred to as simulations. While role plays with shorter instructions are preferred for
teaching purposes in foreign language classrooms, simulations are often preferred
for training purposes in business contexts; they are also employed in pragmatics
for data collection in empirical research projects. A case in point is Pohle’s (2009)
study of Irish English business discourse, with a particular focus on offers and
offer sequences in the context of negotiating talk. For this project, Pohle employed
a simulation called Munster Trips-Grand Canal Hotel Negotiation Simulation,
which she had adapted from Groth’s (2001) Brit Trips-Midway Hotel Negotiation
Simulation, originally developed for educational purposes. The participants were
eight middle-aged Irish males with formal qualifications in Commerce or Business
Studies and practical experience in business negotiations (the demographic information
was provided in a post-simulation questionnaire). While strictly speaking
the participants took on identities different from their own and assumed the roles
of hotel manager or tour operator, there was a huge overlap with their own professional
identities and their experience as sellers and buyers. The simulated negotiations
were inconsequential, in contrast to simulations employed in research in
behavioural economics that involve real money. Pohle’s simulation did not involve
any gains or losses, apart, perhaps, from immaterial ones concerning professional
face and reputation. The simulations were video-recorded; additional audio-re-
cordings were made to enhance the quality of the recordings and to facilitate the
transcription work. The instructions for this simulation (commonly referred to as
“simulation briefs”) were three pages long for the hotel manager and four pages long
for the tour operator; the participants were given twenty minutes’ reading and
preparation time. Groth (2001) gave his participants one hour, but these participants
were students and not professionals. Simulations used in research can also be
much more complex and simulation briefs considerably longer than in Groth’s and
Pohle’s case. For instance, in Martin’s (2001) sales negotiation simulation participants
were given two weeks’ preparation time to work on the briefs.
At the other end of the complexity continuum is a production task format called
“closed role play”. Closed role plays involve only one participant and elicit only
single-turn responses. The scenarios which participants are asked to produce an
oral response to may be provided by the investigator in an oral or written descrip-
tion or by a computer as aural and visual input, as in a multimedia elicitation task
(MET) (Schauer 2004). Closed role plays do elicit spoken data, thus permitting
the analysis of specific features of spoken language use, e. g. the oral performance
of speech acts. Yet, since closed role plays yield only single-turn responses, they
cannot be used for research on interactional features, e. g. sequential aspects or
turn-taking phenomena. This type of role play is therefore more adequately labelled
“oral discourse completion task” (cf. section 3.3.4 below).

3.3.3. Interviews
Interviews are well known to the general public as a journalistic genre; they are also
widely employed in research to elicit language production, especially in
sociolinguistics (cf. Kasper 2008: 287). Interviews may be considered a subtype of
elicited talk, which is, however, much more constrained than elicited conversation.
The participant roles in interviews are fixed: one participant is the interviewer, the
other participant is the interviewee; there may also be more than one interviewee.
The relationship between interviewer and interviewee is an asymmetrical one, with
the interviewer in the more powerful position. The interviewer asks the questions
(requests for information), the interviewee gives the answers (provides the infor-
mation). The interviewer also controls the topics of the interview. Labov (1970)
took great care to make the relationship between interviewer and interviewee more
symmetrical and thus less intimidating than before. This was particularly important
in his work with Afro-American adolescents, who had been interviewed before by
much older interviewers who were European Americans, usually the researchers
themselves. Labov introduced interviewers who were also Afro-Americans and
much closer in age to the interviewees. Moreover, he created a more relaxed atmos-
phere by serving the interviewees soft drinks and snacks. These were some of
Labov’s successful methodological innovations. Under these new conditions, the
interviewees were more cooperative, less monosyllabic, and spoke more freely.
In sociolinguistic interviews, interviewees are standardly asked to tell their
life story, share some experience, narrate dangerous incidents or funny episodes.
In these cases, sociolinguists work on the assumption that the more emotional a
story, the more emotional the speech, which makes it possible for them to compare
standard and vernacular speech, and formal and informal styles. The language
produced in interviews can, of course, also be analysed to examine a range of phe-
nomena interesting to pragmaticists, e. g. backchannelling, turn-taking and repair
(e. g. Færch and Kasper 1982). Speech act realizations can also be studied. For
instance, Schneider (2007) compared responses to thanks in closing sequences of
ethnographic interviews (about attitudes towards regional dialects in England),
radio interviews and shop encounters to examine the impact of the discourse genre
on speech act realization, in this case treating ethnographic interviews, which are
similar to sociolinguistic interviews (cf. also Roulston 2013), as one discourse
genre among others.
Interviews as a method of data collection can also be employed for distinctly
different research purposes. They can, for instance, be used by researchers to elicit
first-order conceptualizations of such categories as politeness and rudeness or par-
ticular speech acts. For instance, interviewees may be asked to share their under-
standing of compliments, threats or insults, to explain the difference between sug-
gestions and proposals, or between requests, orders and commands, or they may
be asked to define small talk, gossip or banter. While lay persons’ interpretations
of such terms can also be induced from instances of these meta-pragmatic terms in
naturally occurring spoken or written discourse (cf. section 2.2 above, also Haugh,
this volume), lay persons may explicitly be asked what their understanding is in
an interview. In this case, the interview is not a sociolinguistic interview, but a
meta-pragmatic interview (cf. Kasper 2008: 296–297). First-order conceptualiza-
tions elicited in meta-pragmatic interviews can then be compared to second-or-
der conceptualizations of the same phenomena developed in armchair research
and may thus complement expert definitions and theoretical constructs (cf., e. g.,
Watts’s postulate that (im)politeness theories should be informed by lay persons’
interpretations of politeness and impoliteness; e. g. Watts 2003).
A further goal for which interviews can be used to collect data is to ask inter-
viewees to provide examples of a particular speech act, i. e. utterances realising e. g.
a complaint or an invitation. For instance, they may be invited to remember the last
compliment they received (e. g. Herbert 1989) or to tell the interviewer when and
how they were last exposed to aggression. Interviews used for this particular purpose
resemble oral discourse completion tasks (also referred to as closed role plays).
Additionally, interviewees may be requested to provide examples of e. g. situations
in which small talk is likely to occur, in what type of situations compliments are
expected, or what they would say in a particular context. In this manner, interviews
can also be employed to generate scenarios for role plays or discourse completion
tasks. Scenarios elicited in this way should be ecologically more valid than scenar-
ios thought up in an armchair way by the investigators themselves, especially if the
views of several interviewees on typical scenarios converge.
Retrospective interviews conducted after the completion of a production task,
e. g. a role play or a discourse completion task, may help the researcher to better
understand what the informants said, why they said what they said and how they
said it. Such an interview may provide additional information and may shed light
on some of the informants’ decisions, choices and realizations. For instance, in her
longitudinal study of request modification in graduate learners of English, Wood-
field (2012) used a role play method and after the last recording session conducted
retrospective interviews with her participants to elicit additional qualitative data.
In these interviews, she asked such questions as “What went through your mind
while you were doing the role-play?” and “How did you decide to say what you
did?” (Woodfield 2012: 48; cf. also Barron 2003).
After a role play, the audio or video recording may be replayed to the participants
in part or in total, and the informants may be interviewed by the researcher
about particular passages, either to achieve a better hearing of what had been said
or to elicit comments and explanations. For the same purposes, informants may
alternatively be shown the transcript of a recording. However, owing to the delay,
informants may no longer accurately remember some details, or they may tell the
interviewer what they think they said and why, thus providing unreliable data.
Finally, interviews known as “oral proficiency interviews” (OPIs) are an inte-
gral part of many standardized language tests that are used internationally (e. g.
IELTS). These interviews are included to assess the language proficiency of foreign
language learners, including their pragmatic competence in the target language.
Specifically, they are intended to reveal to what extent learners have acquired the
necessary oral skills and interactional competence to perform successfully in an
interview, e. g. to understand the interviewer’s questions and to react in an appro-
priate manner (e. g. Seedhouse 2013).

3.3.4. Discourse completion tasks

Discourse completion tasks (DCTs) are especially popular in contrastive, cross-cul-
tural and interlanguage pragmatics (cf. Ogiermann, this volume). DCTs are over-
whelmingly administered in writing, which contributes to the popularity of this
method, as written data do not require any time-consuming transcription work.
On the other hand, this feature has received a lot of criticism. There are, however,
studies, though not very many by comparison, in which DCTs are administered
orally to elicit genuine spoken language and not written representations thereof.
Jones and Adrefiza (2017) are a recent example (for a comparison of written and
oral discourse completion tasks, cf. Yuan 2001, whose comparison also involves
field notes and naturally occurring conversation).
A DCT consists of a brief description of a situation which requires a reaction
from informants. The situations described typically involve two interactants and
are comparable to role play scenarios. Informants are requested to give only a
single-turn answer to complete the discourse in each situation; as a rule they are
expected to produce a particular speech act, e. g. a complaint, or an apology as in
the following example (Tanaka et al. 2008: 90, original italics):
You have a meeting with your lecturer at 2.30 p.m. You arrive there at exactly 2.30 p.m.,
but he is cross with you, saying you promised to be there at 2.00 p.m. Your lecturer says
in an annoyed tone:
Your lecturer: You’re 30 minutes late! We agreed to meet at 2 o’clock. What happened?
You: ……………………………………………………………………………………

The informants’ turn may be preceded by a turn of their fictional interlocutor,
as in the above example. The informants’ turn may also be followed by a turn
of that interlocutor, termed “rejoinder”, whose function it is to limit the options
that informants have to complete the discourse and, thus, make it more likely that
informants produce the speech act the researcher is interested in. The occurrence
of a rejoinder does have an effect on what informants actually write and how they
write it (cf. Johnston et al. 1998).
As a rule, several DCTs are included in a production questionnaire. The number
of DCTs in a questionnaire may differ widely from study to study depending on
the respective research questions and the background of the research. For instance,
Blum-Kulka et al. (1989), in their paradigmatic Cross-Cultural Speech Act Realisation
Project (CCSARP), which focused on requests and apologies, employed a questionnaire
including sixteen DCTs. Ren (2015b), in his study of requests, compliments
and refusals used twenty DCTs, Mulo Farenkia (2015), in his study of invitation
refusals and other speech acts, used twenty-nine, and Chen (1993), in his study
of compliment responses, used only four. The length of the questionnaire is not
an arbitrary decision taken by the researchers. Chen based the design of his ques-
tionnaire on the topic categories identified by Holmes (1988) in her ethnographic
study of New Zealand English compliments, e. g. appearance and possession. The
length of the questionnaire obviously also depends on the number of speech acts
under study and, more importantly, on the number of situational variables that are
systematically varied. For example, Ogiermann (2009), in her study of apologies in
English, Polish and Russian, employs eight DCTs which are controlled for power,
distance, the apology receivers’ gender and their face harmed (Ogiermann 2009:
85).
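How the number of situational variables drives questionnaire length can be
illustrated with a minimal sketch. It assumes a fully crossed design, and the
variable names and levels are hypothetical; it does not reproduce the design of
any of the studies cited above.

```python
from itertools import product

# Hypothetical situational variables and levels (illustrative only).
variables = {
    "power": ["speaker higher", "equal", "hearer higher"],
    "distance": ["close", "distant"],
    "hearer_gender": ["female", "male"],
}

def crossed_scenarios(variables):
    """Return one scenario specification per combination of variable levels."""
    names = list(variables)
    return [dict(zip(names, combo)) for combo in product(*variables.values())]

scenarios = crossed_scenarios(variables)
print(len(scenarios))  # 3 * 2 * 2 = 12 scenarios
```

Each additional variable multiplies the number of scenarios, which is why fully
crossed designs quickly exceed what informants can be expected to complete.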
The length of a questionnaire is a factor not to be underestimated. The longer a
questionnaire, the lower the informants’ cooperativeness and their motivation to
complete all DCTs included. Also, informants tend to get bored once they realize what
a questionnaire is about, something researchers do not normally disclose to their
participants beforehand. The focus of interest, however, becomes more apparent the
longer a questionnaire is, particularly if all DCTs are designed to elicit the same
speech act. This effect can be mitigated by including distractors. Yet distractors
make a questionnaire even longer.
In the Questionnaire on English Usage (QEU) a different approach was adopted,
as different research questions were addressed. This questionnaire includes fifteen
tasks, of which nine are discourse completion tasks, four are multiple choice tasks,
and two are dialogue production tasks (also called dialogue construction tasks or
free discourse completion tasks; cf. Barron 2003). While multiple choice tasks are
well known and frequently used in many contexts, academic and otherwise (cf.
section 3.4 below), dialogue production tasks (DPTs), in which informants are
asked to individually write short dyadic conversations, are only rarely found in
pragmatics research. The three task formats included in the QEU occur in random
order. The fifteen tasks in this questionnaire were not designed to cover just one or
two but seven different speech acts (some occurring more than once) and two types
of phatic discourse. Among the seven speech acts there are initiating as well as
reacting acts, e. g. requests and responses to thanks, and polite as well as impolite
speech acts, e. g. apologies and responses to insults. All targeted phenomena also
occur in the questionnaire in a random order. With these features (three different
task formats and nine different pragmatic phenomena), the QEU is a written
mixed-task multi-focus questionnaire (cf. Schneider 2005: 110–111). Given this
diversity, no distractors were needed.
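The composition of such a mixed-task questionnaire can be sketched as follows.
The task counts are those reported for the QEU above; the function name, the task
labels and the fixed seed are assumptions made for illustration, and the actual
order of tasks in the QEU is not reproduced.

```python
import random

# Task counts as reported for the QEU: 9 DCTs, 4 MCTs, 2 DPTs.
TASK_COUNTS = {"DCT": 9, "MCT": 4, "DPT": 2}

def build_task_order(task_counts, seed=0):
    """Return task labels in a random order that is reproducible via the seed."""
    tasks = [f"{fmt}-{i + 1}" for fmt, n in task_counts.items() for i in range(n)]
    rng = random.Random(seed)  # local generator, so the global state is untouched
    rng.shuffle(tasks)
    return tasks

order = build_task_order(TASK_COUNTS)
print(len(order))  # 15 tasks in total
```

With a fixed seed every informant receives the same randomized order; passing a
different seed per informant would instead vary the order between informants.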
The Questionnaire on English Usage was obviously not designed to systemat-
ically study the impact of micro-social factors such as power and social distance
on the realization of one or two speech acts (although there is some degree of
situational variation between the tasks intended to elicit a speech act which occurs
more than once in the questionnaire). The QEU was originally developed for the
project “The Pragmatics of Irish English” (Barron and Schneider 2005). The initial
idea was to capture a pragmatic profile of a community of speakers, in this case
native speakers of Irish English. Later, a neutral version of the QEU, in which e. g.
typically Irish names such as Niamh and Sinead were replaced by names more
generally used in the English-speaking world, was used to collect data in further
Anglophone countries, specifically in England, the USA and Canada, from first
language speakers, and in Ghana from second language speakers. The QEU was
also used to collect data from German learners of English as a foreign language
at school and at university level and, with a German translation of the QEU, to collect
German first language data. It is thus possible to establish native speaker norms
and pragmatic profiles of individual speakers, and to conduct research in inter-
language pragmatics in the classical tripartite design of comparing learner perfor-
mance to native speaker performance in both the target language and the native
language of the learners. Recently, the QEU has been employed to collect data in
Namibia to obtain a pragmatic profile of English as it is used in this multilingual
country, in which it is spoken in second and foreign language varieties at different
levels of proficiency which seem to form a continuum. In this project on Namibia, the
QEU data are supplemented with meta-pragmatic focus group discussions (cf. also
Ho 2013), role plays, sociolinguistic interviews, field notes and public spoken and
written media discourse (cf. Schröder and Schneider, in press).
Discourse completion tasks have been subject to extensive criticism. First, DCTs
have been criticized for eliciting written data to gain insights into spoken
discourse, thus missing many features characteristic of oral communication, such
as hesitations, repair and intonation. This is, of course, an issue of data validity.
Second, critics have pointed out that writers have much more planning time than they
would ever have in conversation. A reaction to these two points of criticism is the
introduction of oral DCTs. Third, it has been found that informants feel obliged
to write something in the lines provided in the questionnaire, even though they
would rather remain silent in a similar situation in real-life contexts. In response
to this, a variant of the classical DCT format has been developed which provides
informants with a further option usually phrased as e. g. “In real-life, would you
prefer to say nothing in this situation?” A fourth point of criticism is that inform-
ants may imagine distinctly different scenarios given the brevity of the instruc-
tions. A solution to this problem is to ask informants to think aloud and verbalize
everything which goes through their mind while they are completing a produc-
tion questionnaire. Recordings of these verbal reports may reveal how informants
imagine the situations described in DCT instructions. An alternative method would
be to explicitly ask informants in retrospective interviews how they imagined the
situations (cf. section 3.3.3 above). A further criticism is that only single-turn
contributions are elicited, although in naturally occurring discourse many
speech acts are negotiated across a number of turns. An attempt to elicit dialogical
data is the creation of dialogue production tasks in which informants have to write
entire dialogues. DPTs are usually completed individually, but they could also
be completed by two participants jointly. Entire email messages were elicited by
employing a tool also referred to as a discourse completion task (Pan 2012). Pan’s
questionnaire, which included six such DCTs, was designed for a study on inter-
language requests in institutional discourse (Pan 2012: 160–161).
Finally, it has been pointed out that informants completing a production ques-
tionnaire do not write what they would actually say in real-life situations, but what
they think they would say or what they should say (or, indeed, what they think
would please the researcher). In response, it has been emphasized that data of this
type are evidence of and provide access to culture-specific social norms govern-
ing verbal behaviour and the expectations of discourse participants (cf. Schneider
2012).
Discourse completion tasks also have undisputed advantages. Written DCTs
can be used to collect large datasets from a large number of informants in a short
time, e. g. simultaneously from hundreds of students in a lecture hall. Produc-
tion questionnaires can also be distributed by email or in social media networks.
Even larger datasets, from thousands of pre-selectable informants, can be gathered
in a very short time by using platforms otherwise used for crowdsourcing, e. g.
CrowdFlower (www.crowdflower.com, cf. Renkwitz and Sickinger forthcoming).
As these are commercial platforms, informants have to be paid small amounts
of money. Yet this seems only fair (cf. section 4.4 below). While paying inform-
ants is the standard in many empirical disciplines, including e. g. psychology and
behavioural economics, this is unfortunately not common practice in pragmatics
and other fields of linguistics research. Finally, and most importantly, employing
DCTs provides immediately comparable sets of data, which are indispensable for
any comparative work in pragmatics, especially in inter-lingual, inter-varietal and
cross-cultural research.

3.4. Comprehension tasks and judgement tasks

Production tasks such as interviews, role plays and DCTs (cf. section 3.3 above)
require not only productive competencies, but also receptive competencies, i. e.
comprehension. The interview questions and the instructions and scenarios in role
plays and DCTs have to be understood on a very basic lexico-grammatical level,
before informants can give an answer or provide some other response. Further-
more, informants must be able to adequately interpret the respective social situa-
tion, the relationship between the participants involved in it and their respective
contributions to a discourse, before informants can come up with an appropriate
contribution of their own. Yet, there are also experimentational methods which
are focused on comprehension alone, specifically on comprehension beyond the
level of understanding words and grammatical constructions, i. e. on understanding
pragmatic meaning (in the multifarious senses of this term). Tasks of this type are
used to study e. g. how people understand particular utterances, especially how
people infer what is meant from what is said and what is implied (or implicated).
Tasks of this type are also used to study how young children acquire the neces-
sary abilities for understanding pragmatic meaning in their native language, or
how students learn these abilities in a foreign language. Comprehension tasks are
also employed for testing purposes, and not only for testing children and learners,
but also patients suffering from neuro-degenerative diseases, e. g. from dementia.
In other tasks of this type, informants are provided with systematically varied
linguistic input to find out whether or to what extent these variations impact the
informants’ linguistic output given in response. Comprehension tasks often involve
judgement. For example, informants may be asked to say whether utterances are
true or not, or to what degree they are acceptable, in order to establish whether the
informants understand the implicatures. In similar tasks, informants are requested
to assess how polite or impolite utterances are, or to what degree they are appro-
priate or inappropriate in a particular social situation.
As speech act theory is essentially speaker-oriented, empirical speech act
research is primarily focused on speech act performance, which is examined by
employing production tasks. As Gricean theory, on the other hand, is essentially
hearer-oriented, empirical research in the framework of relevance theory is pre-
dominantly focused on inferencing and, hence, comprehension tasks are preferred.
Finally, judgement tasks designed to elicit politeness ratings or the perception of
appropriateness are mostly used in (im)politeness research. Overall, investigations
in which comprehension and judgement tasks are employed concentrate predomi-
nantly on individual utterances and not on interactional discourse. Such tasks are,
therefore, generally not suitable for macro-pragmatic analysis, e. g. the examina-
tion of discursive sequences or entire speech events. In the following, two major
types of comprehension and judgement tasks are discussed. These are multiple
choice tasks (3.4.1) and rating scales (3.4.2).

3.4.1. Multiple choice tasks

Multiple choice tasks (MCTs) are commonly known to the general public, espe-
cially from testing contexts. In MCTs, informants are presented with several
options, usually four or five, from which they are asked to select one, especially
if MCTs are employed for the purposes of pragmatics research. In testing, by con-
trast, any number of the options presented may be correct answers. Yet, MCTs
for research purposes are not, as a rule, about correctness, but essentially about
appropriateness. Two subtypes can be distinguished; by default, both are adminis-
tered in writing. In one subtype, informants merely have to decide without context
which of the utterances presented is the most direct and/or the least direct one, or
which one is perceived as the most polite or the most impolite one. This subtype is
a judgement task suitable for perception studies.
The other subtype, which is probably used more frequently, includes a descrip-
tion of a situation similar to a scenario in a DCT (cf. section 3.3.4 above). Yet
unlike in DCTs, informants are not asked to produce a turn-at-talk including a
particular speech act; instead, they are asked to select from among four or five
alternative realizations of a speech act the one they consider most appropriate in
the given situation. So this subtype is basically a selection task. Here is an example
(Roth 2002: 279):
You are having a massive argument with your friend in the course of which she yells at
you: Oh you bloody liar! I can’t believe you’re being such a bitch!
Please circle the letter of the answer which best represents what you would say or do:
a) Sorry, I think you’ve misunderstood me.
b) Only because you are such a stupid cow.
c) Look who’s talking.
d) I would walk out.

Note that while the first three options in this particular example (a–c) include direct
speech (here represented in italics), the fourth option gives informants the opportu-
nity to select a non-verbal response. In another variant, an additional slot is added
to provide informants with the space to formulate an answer of their own, as in a
DCT, if they regard none of the alternative realizations as appropriate.
Basically, the options informants can select from may be fabricated by the
researcher or taken from previous research. Intuitive fabrication is, however, not
recommended as researchers may not be aware of crucial alternative realizations
(cf. Kasper 2000: 331). The options included in the above example are taken from
DCT data elicited in the same study about responses to insults in British English in
which discourse completion tasks were combined with retrospective interviews and
MCTs (cf. Roth 2002: 37–50). MCTs seem particularly suitable to investigations
into impoliteness and verbal aggression, e. g. swearing or insults, as informants
are often inhibited when they are requested to commit “foul language” to paper.
Moreover, Rose (1994) found that in non-Western contexts a questionnaire
including MCTs yielded more valid data than a questionnaire including DCTs,
specifically with informants from Japan. Further studies support these findings
(cf. Kasper 2000: 330–331). Postgraduates from China (personal communication)
have also repeatedly confirmed that Asian students feel uncomfortable with DCTs,
not knowing what to write, and clearly prefer MCTs. Obviously, the processing
demands posed by MCTs are lower in terms of cognitive cost than those of a free recall
task such as a DCT. Rose’s study was focused on request realization. More recent
studies have, however, challenged the reliability of MCTs in speech act production
research. This data collection method seems to be better suited to research into
comprehension (cf. Roever 2005).

3.4.2. Rating scales

Rating scales, specifically Likert scales, are used to elicit assessments of utterances
or situations in terms of correctness, appropriateness, politeness, formality, and
so on. For this purpose, scales are employed which predominantly range from 1
to 5 (cf. Dörnyei 2003: 36–39). For instance, in a study of compliment responses
among Malaysian multilinguals, Min (2015) devised a questionnaire (in four lan-
guages) in which the participants were asked to rank the five answers in each of
her multiple choice tasks on a five-point appropriateness scale, with 1 = “least
appropriate”, 5 = “most appropriate”.
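Responses on such a scale can be summarized per answer option. The following is a
minimal sketch with invented ratings (not data from Min 2015); the option labels
and the function name are assumptions. Because Likert ratings are ordinal, the
sketch reports the median alongside the mean.

```python
from statistics import mean, median

# Invented ratings for illustration: answer option -> ratings on a scale
# from 1 ("least appropriate") to 5 ("most appropriate").
ratings = {
    "a": [5, 4, 4, 5, 3],
    "b": [2, 1, 2, 3, 2],
    "c": [3, 3, 4, 2, 3],
}

def summarize(ratings, low=1, high=5):
    """Return the mean and median rating for each answer option."""
    summary = {}
    for option, values in ratings.items():
        # Reject values that fall outside the scale before summarizing.
        if any(v < low or v > high for v in values):
            raise ValueError(f"rating outside {low}-{high} for option {option!r}")
        summary[option] = {"mean": mean(values), "median": median(values)}
    return summary

summary = summarize(ratings)
print(summary["a"])
```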
In cross-cultural and interlanguage pragmatics, rating scales are used to elicit
pragmalinguistic assessments as well as sociopragmatic assessments. In the former
case, participants are asked to judge the appropriateness of linguistic realizations
of a particular speech act in a given social situation, in the latter case participants
are asked how they perceive and understand a situation in terms of e. g. power, dis-
tance, degree of imposition or severity of offence. Sociopragmatic assessments can
play a crucial role in the development of such instruments as role play tasks and
discourse completion tasks. Researchers may outline scenarios for such tasks and
then ask participants in a pre-test to rate the contextual variables involved. This is a
procedure that helps to increase the validity of a study (cf. Kasper 2008: 295–296).
In politeness research, rating scales are less popular today than they used to be
before the advent of discursive approaches according to which politeness cannot
be judged in individual utterances and out of context (cf. e. g. Watts 2010). Leech
(2014: 250–251), however, insists that what he calls “pragmalinguistic polite-
ness” can be assessed without context, e. g. by rating alternative realizations of a
speech act, and that participants’ ratings reflect a default interpretation. It should be
emphasized here that such default interpretations reveal first-order conceptualiza-
tions of politeness. Leech also rejects Kasper and Dahl’s (1991: 219) criticism that
participants imagine a specific situational context if no context is provided by the
researcher in a judgement task. He argues that there is no proof of their assumption
and claims that participants rely on a “generalized context”. Some proof might be
gained from think-aloud protocols simultaneously recorded during the completion
of judgement tasks (cf. section 3.3.4 above).
Rating scales are also used for triangulation purposes, specifically in support of
other methods. For instance, in a comparative study on perceptions of impoliteness
across five cultures (England, China, Finland, Germany and Turkey), Culpeper et
al. (2010) first asked their student participants to report impoliteness events and
then gave them a rating task to assess the severity of the offence in the events
reported.
In the Gricean paradigm, judgement tasks are used to examine pragmatic
comprehension in adults as well as in children. Characteristically, participants are
asked to rate on a binary scale whether or not an utterance is e. g. true or false,
correct or incorrect, appropriate or inappropriate. This task type is commonly used
e. g. in empirical work on scalar implicatures, especially truth value judgements
(e. g. Barner et al. 2011). Variants of this particular format include felicity judge-
ment tasks, in which participants have to decide which of two (or more) utterances
matches a picture, and picture selection tasks, in which participants have to decide
which of two (or more) pictures matches a given utterance (cf. Félix-Brasdefer
and Hasler-Barker 2017: 34–35, also for examples). Binary scales may, however,
not reliably assess pragmatic comprehension competence, as Katsos and Smith
(2010) demonstrate. In a series of experiments with young children, they found that
graded judgement tasks employing a five-point rating scale yielded more reliable
results (cf. Veenstra and Katsos, this volume, for more examples and discussion).
While the input in judgement tasks is predominantly written, Cohen (2012:
286) proposes that role plays could be used in intercultural pragmatics for both
self-assessment and peer-assessment. He further proposes to employ rating scales
in combination with video prompts for judging nonverbal behaviour, including the
appropriateness of target culture-specific gaze and gestures used by non-native
participants. Cohen (2012: 286–287) also refers to Roever’s (2010) suggestion to
judge the appropriateness of the overall speech style adopted in an entire speech
event. These examples show that rating scales and judgement tasks, while over-
whelmingly employed to assess individual written utterances, can also be used to
assess complete oral interactions including nonverbal behaviour.
Ratings can easily be quantified and subjected to statistical analysis (cf.
Félix-Brasdefer and Hasler-Barker 2017: 32). Yet the results from such analysis
may only be pseudo-objective. In a small-scale study, informants were asked to
assess the appropriateness of naturally occurring emails written by Australian stu-
dents to their lecturer (Schneider 2013). For the ratings, a five-point scale was
employed. Additionally, the informants were provided with space to say which
features of the emails they based their assessments on (cf. also Economidou-Ko-
getsidis 2011). More instructive than the numerical ratings were the participants’
comments on features of the emails, which in some cases were entirely incompatible
with the numerical ratings. It was not clear in these cases what the ratings
were actually based on. In a study of apologies in Japanese and English, Tanaka
et al. (2008) combined judgement tasks with discourse completion tasks, and in
a study of Chinese and British reactions to compliment responses, Spencer-Oatey
et al. (2008) combined judgement tasks with multiple choice tasks. In both these
studies, the participants had to assess aspects of each scenario on three different
five-point rating scales, e. g. on the responsibility for the problem (in the apology
study) or the conceitedness of the compliment responses. In each of the two stud-
ies, the participants were also asked to explain their ratings.

3.5. Further data collection methods

In addition to the methods discussed in the preceding sections, three further meth-
ods of data collection for pragmatics research are briefly mentioned here. These
are the philological method, the diary method, and a group of methods which can
be referred to collectively as psycho- and neurolinguistic methods.
The philological method is a time-honoured method of data collection employed
long before corpora in today’s technical sense were available. Essentially, research-
ers employing this method simply read, or more systematically mine, written texts
for particular phenomena that they are interested in, usually fictional texts, but
also letters and other non-fictional material. For many decades if not centuries,
this method was used e. g. in lexicography to find occurrences of words to be
included in dictionaries as examples. In pragmatics research, this method can be
used for finding occurrences of speech acts. Jucker (2009: 1616) classifies the
philological method as a field method, because the texts searched have occurred
naturally for a communicative goal and have not been elicited by investigators for
research purposes. Jucker highlights as a strength of this method that researchers
may go through the texts repeatedly and thus find all occurrences of a particular
speech act. He also mentions as a downside that this method is time consuming
and may not yield many instances of the speech act in question. Manually mining
a machine-readable corpus, or corpus samples, due to a lack of relevant pragmatic
annotation and in order to circumvent the problems of precision and recall (cf.
section 3.1 above) may also be considered as deploying the philological method.
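The two measures mentioned here can be made concrete with a short sketch. The
utterance identifiers are invented: "relevant" stands for the instances that a
manual reading identifies as the speech act in question, "retrieved" for the hits
of an automatic corpus search. The function name is an assumption.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of search hits against a manually established set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: share of the hits that are genuine instances.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of the genuine instances that the search found.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Invented identifiers of utterances in a corpus sample.
relevant = {"u01", "u04", "u07", "u09"}   # found by manual reading
retrieved = {"u01", "u02", "u04", "u05"}  # returned by a search string

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

Manual reading in effect aims at full recall and full precision, at the cost of
the time it takes to go through the texts.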
The diary method has been used most extensively in second language research.
Typically, foreign language learners are asked to keep a diary e. g. about their
learning progress during a year abroad. Such diaries may shed some light on topics
and issues that are relevant to pragmatics research, e. g. on the learners’ perceptions
of native speaker practices, behaviours and norms that learners find surprising
or annoying. Learners may also report situations and events they experienced as
particularly pleasant or threatening, or instances of miscommunication and mis-
understandings. As there are, as a rule, only general instructions and no specific
guidelines what to enter in such diaries, Kasper (2008: 297) calls these diaries
“the least pre-structured type of self-report”. Diaries can also be kept by research-
ers to more systematically collect instances of particular incidents or phenomena
(e. g. Zhu 2004). Diary entries may be used in qualitative research to generate new
ideas and new research questions, especially for investigations in cross-cultural
and interlanguage pragmatics.
A range of methods adopted from psycholinguistics and psychology and, increas-
ingly, neurolinguistics and neurology are currently popular in Gricean pragmatics,
notably in “Experimental Pragmatics” in a narrow and specific sense, also known
as “XPrag” (cf., e. g., Noveck and Sperber 2004). Some psycholinguistic methods
employed to study pragmatic production are, in fact, similar to experimentational
methods discussed in section 3.3 above (cf. Gibbs, this volume). Eye-tracking and
neurolinguistic methods, on the other hand, which are predominantly used to study
pragmatic comprehension, are more specific. These methods include neuroimaging,
and specifically event-related potential (ERP), electroencephalography (EEG), and
functional magnetic resonance imaging (fMRI) (cf. Golato and Golato 2013: 3–4,
and Félix-Brasdefer and Hasler-Barker 2017: 35–36 for examples and discussion,
also Clark, this volume, and Veenstra and Katsos, this volume). Eye-tracking is,
however, also used in other paradigms of pragmatics. For instance, Auer (2017)
reports a study of gaze in conversation from an interactionist perspective in which
mobile eye-trackers were worn by the participants in a triadic conversation.
It must be emphasized that neuroimaging methods cannot reveal what is said,
how it is said and why it is said or what is understood and how. In general, these
methods measure only blood flow in the brain and activation of specific brain areas
that correlate with particular linguistic activities. These methods thus provide indi-
rect and supportive evidence of some aspects of pragmatic processing.

4. Ethical issues in data collection

Methods and data collection procedures are not a given; they belong to research
traditions and have developed over time. This is why not only recent studies are
referred to in the above sections of this chapter, but also classical, pioneering and
groundbreaking work. Similarly, technical and ethical aspects of research as well
as relevant legislation all have a historical dimension. In his account of methods in
discourse analysis, Jones (2013) emphasizes that data collection and transcription
are cultural practices. His focus is this:
how these cultural practices have changed over the years as different cultural tools (tape
recorders, video cameras and computers) have become available to analysts, making new
kinds of knowledge and new kinds of disciplinary identities possible. (Jones 2013: 10;
original emphasis)

Yet data collection is not the only dimension of research which is subject to change
and cultural impact. Further relevant dimensions are ethics and related legislation.
In many countries, and particularly in the English-speaking world, there is explicit
legislation concerning research ethics (cf. Dörnyei 2007: 66, also Kono 2013).
Needless to say, legislation, and even entire legal systems, change over time and
vary across cultures. Ethical norms and ethical concepts are not invariant or uni-
versal either. The development of research ethics in particular can be described as
an ongoing process of increasing awareness and sensitivity, sometimes leading to
overreactions and undue rigidity, which has been referred to as “ethical correct-
ness” (Dörnyei 2007: 72). Furthermore, central ethical notions such as privacy and
ownership differ cross-culturally. However, over the past few decades standards
of ethical conduct and best practices in empirical research have emerged which
are generally accepted and subscribed to by most researchers in pragmatics. Any
kind of research, including armchair research, requires ethical conduct and respon-
sible behaviour on the part of the researchers, and this includes in particular sci-
entific integrity and academic rigour. As Dörnyei (2007: 66) puts it: “At the heart
of research ethics lies the moral character of the researcher.” In addition to such
general principles, empirical research involving the collection of data from people
other than the researcher makes it necessary to follow further more specific ethical
principles. These specific principles include first and foremost the principles of wel-
fare, of autonomy and of privacy (cf. e. g. Lazaraton 2013), and also the principle
of justice (cf. Kono 2013). The first three of these principles can be conceptualized
as macroethical principles, i. e. general principles standardly required by review
boards and ethics committees at universities and in other institutions. Microethics,
by contrast, concerns the more particular requirements in the specific context of an
investigation (on the distinction between macroethics and microethics, cf. Guille-
min and Gillam 2004). The three macroethical principles mentioned above will be
discussed in detail below (sections 4.1–4.3), with particular reference to empirical
research in pragmatics. The principle of justice will briefly be dealt with in the con-
text of the principle of welfare in section 4.1. A further principle worth mentioning
is the principle of indebtedness, which will also be discussed below (section 4.4).
Overall, the ethical principles surveyed in this chapter should not be understood as
rigid norms but as guidelines for responsible conduct in data collection.

4.1. The principle of welfare

The principle of welfare concerns the participants’ well-being. It is glossed by
Lazaraton (2013: 2) as “do no harm.” It should go without saying that no research
should ever inflict any harm on participating human subjects, neither physical or
psychological pain nor material damage, and participants should not be exposed
to any risk of suffering such harm. This basic ethical principle seems immediately
relevant to e. g. medical studies or research in pharmacology where such risks
exist and may be weighed against the benefits for patients. For research in prag-
matics, it is also the primary principle of research ethics, although physical pain
or material damage is not likely to occur in experimentational work, let alone in
field work. Psychological discomfort, on the other hand, may occur, especially in
laboratory settings. It is the researchers’ responsibility to create a non-threatening
atmosphere and make their participants comfortable (cf. section 3.3.3 above on
Labov’s measures to make interview situations less intimidating). More gener-
ally, participants may feel under pressure if they are the investigator’s students,
which is often the case in pragmatics research. The power differential may be
intimidating, the dependent relationship may be inhibiting and the students may
not participate entirely voluntarily. Similarly, asking colleagues to participate may
be problematic because they may think their professional face is at stake, and
involving family members and relatives may also have a negative effect on these
individuals as well as on the quality of the data. Kubanyiova (2008: 510) considers
such constellations “unethical at the micro-level”. Moreover, the so-called princi-
ple of justice is relevant here. According to this principle, “participants should be
selected based on their relevance to the research rather than convenience in order
to avoid manipulation of the study” (cf. Kono 2013: 2). As mentioned in section
2.7 above, involving one’s own students is a convenient way for academics and
school teachers to recruit participants for their research. What is needed, however,
are not convenience samples but relevance samples.

4.2. The principle of autonomy

Arguably, the principle of autonomy is the ethical principle most relevant to data
collection in pragmatics research. This principle refers to the researchers’ respon-
sibility to respect and protect the autonomy of the individuals participating in their
research. This means that participant consent must be obtained whenever research-
ers wish to collect data from people by observation or by experimentation. Pro-
spective participants must be given the choice of participation, i. e. they must be
able to decide whether or not they want to get involved, and if they do, they must
take part voluntarily. The decision to participate should be an informed decision,
i. e. based on information about all relevant issues concerning the investigation.
In other words, informed consent is required, which ideally is given explicitly and
in writing. Prospective participants must be informed about the following issues:
(1) the nature and overall aims of the research,
(2) the purposes for which the data are collected,
(3) the tasks participants will be asked to perform,
(4) the possible risks and consequences of participating,
(5) the right to withdraw from the study at any time,
(6) the extent to which confidentiality will be protected, and
(7) who they can contact if they have questions.
Information about these issues is usually provided in relatively standardized con-
sent forms, in which participants declare, by signing the form, that they have read
and understood the information, agree with the conditions specified and are willing
to participate.
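The seven points listed above lend themselves to a simple completeness check on a draft consent form. The following Python sketch is merely illustrative: the short keys (e. g. `aims`, `purposes`) and the function name `missing_items` are hypothetical labels introduced here, not part of any standardized form or review board requirement.

```python
# The seven points a consent form should cover, as listed in the text.
# The short keys are hypothetical labels chosen for this sketch.
CONSENT_ITEMS = {
    "aims": "nature and overall aims of the research",
    "purposes": "purposes for which the data are collected",
    "tasks": "tasks participants will be asked to perform",
    "risks": "possible risks and consequences of participating",
    "withdrawal": "right to withdraw from the study at any time",
    "confidentiality": "extent to which confidentiality will be protected",
    "contact": "who to contact with questions",
}


def missing_items(covered):
    """Return the descriptions of all points a draft form does not yet cover."""
    covered = set(covered)
    unknown = covered - set(CONSENT_ITEMS)
    if unknown:
        raise ValueError(f"unknown item keys: {sorted(unknown)}")
    return [text for key, text in CONSENT_ITEMS.items() if key not in covered]


# Hypothetical draft form that covers only four of the seven points.
draft = ["aims", "tasks", "withdrawal", "contact"]
for text in missing_items(draft):
    print("missing:", text)
```

Such a check can only confirm that each point is addressed at all; whether the wording is appropriate for a given participant group remains a matter of judgement.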
Providing detailed information about the nature and aims of the research may
be counter-productive. Disclosing the exact focus of a project may inappropriately
bias the participants and negatively impact the data quality (cf. Dörnyei 2007:
70). To avoid such influences, it is common practice to state the focus of a project
correctly but in more general terms. If, for instance, a project is aimed at exam-
ining the performance of a particular speech act, e. g. apologies or compliment
responses, then participants may be told that the project is about language use or
about oral communication (in a particular language).
Adult participants can give informed consent themselves. If, however, children
or adolescents under age are the targeted participant group, as e. g. in developmen-
tal pragmatics or interlanguage pragmatics, then consent must be obtained from
their parents or guardians (cf. also Hill 2005). Yet sometimes it is not clear whose
consent must be sought. In a school context, for instance, it is not always clear
whether it is sufficient or necessary to seek consent from the parents or guardians
of each child involved, or whether it is sufficient or necessary to obtain consent
from the teacher, the school administration or the school board (cf. Dörnyei 2007:
71 for some discussion). Relevant legislation may differ across countries.
In other contexts it may not be clear whether it is even necessary to obtain
consent. In general, review boards and ethics committees often exempt studies
for which data are collected from publicly accessible sources. There is, however,
some controversy whether public accessibility renders consent unnecessary, which
is especially relevant to investigations of language use on the internet and studies
of digital genres (cf. Kono 2013: 7). Accessibility in public contexts is sometimes
interpreted as informal consent. This interpretation is, however, challenged at least
by some experts who maintain that the fact that e. g. blog posts are publicly availa-
ble and can be accessed by anyone, and are indeed intended to be read by anyone,
does not generally entitle researchers to use and analyse these blog posts in
their research without seeking consent. Yet it may not be clear from whom con-
sent should be sought, from the blog owners or from the blog posters. In a more
general vein, it may not be clear how providers of internet data might be contacted
or whether it is possible at all to obtain consent. Markham and Buchanan (2015)
reject prescribed practices for this situation and recommend a case-based approach
according to which ethical problems are solved as they occur in a research project.
They also emphasize that ethical guidelines for internet research will continue to
develop and change.
Generally, informed consent must be acquired prior to data collection. Yet for
the collection of naturally occurring spoken discourse this is not deemed desirable,
as informing participants would influence the quality of the data (cf. the discussion
of the observer’s paradox in section 3.2 above). Therefore, to achieve uncompro-
mised data quality some researchers make their recordings surreptitiously and seek
consent only after a recording has been made.
Surreptitious recording is, however, a highly sensitive topic and may have not
only ethical but also legal implications, depending on when and where the record-
ings are made. In Germany today, the privately spoken word is protected by law,
and making audio recordings of people who are unaware of it is a violation of
privacy (cf. also section 4.3), and thus a criminal offence, which can be punished
with imprisonment of up to three years (cf. German Criminal Code, Section 201).
It could of course be argued that the privately spoken word pertains exclusively to
conversations held in private settings where they cannot be overheard by eaves-
droppers. Yet this seems to be an inappropriately literal interpretation. Conver-
sations in such public places as e. g. shops or cafés are not automatically public
conversations just because the general public has access to these places. Such con-
versations are not public, simply because they are not intended to be public. They
are not meant to be overheard by bystanders, unless it is obvious from loudness
and gaze etc. that non-participants are also addressed.
Surreptitious recording was not always forbidden by law. In and before the
1970s, audio recordings made for corpora of spoken language were often made
secretly, without the consent or awareness of the participants. The Freiburg Cor-
pus, including approximately half a million words of spoken German recorded in
a wide range of everyday contexts, is a case in point (Engel and Vogel 1975). A
further example seems to be the London-Lund Corpus of Spoken English, which
includes approximately half a million words of British English (Svartvik and Quirk
1980). However, while the recordings for the genre “spontaneous conversation”
were allegedly made surreptitiously, these conversations were neither spontaneous
nor were they surreptitiously recorded. In fact, “the recordings were made without
prior knowledge of the main participants” only, and, at least in some cases, “one or
more participants had knowledge of the recording (and had the task of keeping the
conversation going)” (Svartvik and Quirk 1980: 26). This procedure led to some
evidently unnatural contributions to the conversations. For more recent corpora of
spoken language, the participants were, as a rule, asked for their consent prior to the
recordings. This applies, for instance, to the Bergen Corpus of London Teenage
Language (COLT) (Stenström et al. 2002) and the Hong Kong Corpus of Spoken
English (Cheng et al. 2008).
Researchers who inform their participants that they will be recorded often claim
that the participants forget about the recording machine and about being recorded
(e. g. Tannen 1984/2005, Rüegg 2014). Yet this is very hard to establish and has
not been studied systematically. Warren (2006: 23–25), on the other hand, quotes
some evidence showing that the participants were well aware of being taped at a
relatively late stage in the speech events and explicitly referred to the tape recorder.
This may render the discourse less natural than intended, but truly surreptitious
recording is not an ethical alternative.

4.3. The principle of privacy

Surreptitious recording is considered unethical, because it impinges on partici-
pants’ basic rights of freedom and autonomy (cf. section 4.2) and because it rep-
resents an intrusion into the private sphere of the secretly taped speakers. It is,
however, the researchers’ responsibility to protect their participants’ privacy and
warrant anonymity and confidentiality (e. g. Lazaraton 2013). This means that
researchers must not disclose their participants’ identities.
By signing a consent form, participants give permission that the data collected
from them may be used for research purposes, and this usually involves publica-
tion. If data are published, they must be anonymized by the researcher, both when
reporting research findings and when presenting transcripts. No information may
be given that allows conclusions to be drawn about the exact identities of individuals.
In transcripts, all names must be replaced, proper names of persons as well as place
names and geographical or other references which allow ready identification of the
individuals involved. It is common practice to use pseudonyms in transcripts, often
pseudonyms which have the same initials and the same prosodic features, e. g. num-
ber of syllables. Instead of pseudonyms, letters (e. g. A, B, C) are also frequently
used to distinguish speakers in transcripts. If recordings of spoken language are
published not as transcripts but as sound files, it is much harder to anonymize them,
not only because names are not as easily replaced, but more importantly because
voices are more easily recognized in audio material than in written representations.
Pitch and tempo, for example, could be manipulated to impede recognition, but this
does not seem to be commonly done. The sound files of the Santa Barbara Corpus
of Spoken American English, for example, have not been manipulated in this way.
It could be argued that the detailed demographic information provided about all
participants in that particular corpus might facilitate their identification, yet this is
not very likely on any significant scale. Video recordings pose a different problem.
Not only are they much harder to anonymize, they also reveal aspects of a partici-
pant’s identity which, while they may not enable exact identification of individuals,
the participants would not wish to see published. Such aspects include, for exam-
ple, particular features of outer appearance that are concealed in sound files. To
anonymize videos, software is available which ensures non-recognizability, while
at the same time preserving gestures and some facial expression, which may be
relevant if non-verbal behaviour is included in the analysis.
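The transcript conventions described above, i. e. replacing names by pseudonyms or labelling speakers with letters, can be illustrated by a minimal Python sketch. The names, the pseudonym mapping and the function names are invented for illustration. Choosing pseudonyms with matching initials and syllable counts is assumed to be done by hand, and a script of this kind cannot detect indirect references that may still identify a participant.

```python
import re


def pseudonymize(transcript, mapping):
    """Replace each whole-word occurrence of a real name by its pseudonym."""
    if not mapping:
        return transcript
    # Try longer names first so that a short name never pre-empts a longer one.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    # A single pass avoids replacing text that is itself a pseudonym.
    return pattern.sub(lambda m: mapping[m.group(1)], transcript)


def speaker_letters(speakers):
    """Assign A, B, C, ... to speakers in order of first appearance.

    Only suitable for up to 26 speakers.
    """
    labels = {}
    for name in speakers:
        if name not in labels:
            labels[name] = chr(ord("A") + len(labels))
    return labels


# Hypothetical data: pseudonyms keep the initial and the number of syllables.
mapping = {"Sandra": "Sophie", "Bonn": "Bern"}
line = "Sandra: I met Sandra's brother in Bonn."
print(pseudonymize(line, mapping))
# prints: Sophie: I met Sophie's brother in Bern.

print(speaker_letters(["Sandra", "Tom", "Sandra"]))
# prints: {'Sandra': 'A', 'Tom': 'B'}
```

The mapping between real names and pseudonyms is itself confidential and would have to be stored separately from any published material.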
Finally, it must be emphasized that crucial ethical concepts such as privacy may
differ across cultures (cf. Elm 2009: 70). This is immediately relevant to comparative
work in cross-cultural pragmatics. Diverging perception of data ownership and
property rights, including corresponding legislation, may also be relevant here
(cf. Kono 2013: 8). Another important issue is that the distinction between private
contexts and public contexts becomes increasingly blurred in digital genres and
on the internet (e. g. Markham et al. 2012: 6–7), which may necessitate flexible
microethical decisions.

4.4. The principle of indebtedness

The principle of welfare is more generally referred to as the principle of beneficence
(cf., e. g., Kono 2013: 2), thus shifting the focus from risk avoidance to positive
benefits for the participants. While the interdependence of risks and benefits is
rather obvious in medical and pharmacological studies, as mentioned before in
section 4.1, the benefits may be even less clear than the risks in the case of prag-
matics research. In studies in pragmatics, and more generally linguistics, the cat-
egory “Risks and benefits”, invariably included in standardized consent forms, is
typically filled in by asserting that there are no benefits, although there should,
of course, always be the general benefit of an increase in knowledge from any
research project.
More particularly, a specific notion of benefit has developed in sociolinguis-
tics, and also applied linguistics, conceptualized today as “owing” or “giving
back”. This line of thinking goes back to Labov (1982), who formulated a ‘prin-
ciple of debt incurred’. The underlying idea is that researchers collecting data in
a particular speech community are indebted to this community for receiving the
data. It is considered the researchers’ ethical responsibility to return the favour and
pay back any insights from data analysis that are beneficial to the community in
question, whenever such findings are needed. Wolfram (1993), in his “principle of
gratuity”, ascribes a more active role to researchers, demanding that they should
positively strive to find ways to share research findings to profit the community
that provided the data (cf. Kono 2013: 5). Similarly, Johnstone (2000: 49–50) dis-
tinguishes three types of research, namely research on the researched, research on
and for the researched, and research on, for and with the researched (cf. Lazaraton
2013: 3–4). The first type is the traditional type of research that uses informants
to gain data. The second type involves advocacy for the informants, whereas the
third type is explicitly aimed at an empowerment of the informants. This third type
of research is primarily in the interest of the participants and not in the interest of
the researcher alone. Such motives and attitudes do not seem to be widespread in
pragmatics to date. It is, however, not difficult to imagine how results from prag-
matics research could profit informants. Such results could, for example, raise
an awareness of manipulative practices in discourse or effectively guard against
verbal aggression. Also, findings from work on pragmatic variation could make
miscommunication and its causes more transparent, while findings from interlan-
guage pragmatics might be made available more immediately to practitioners in
language teaching and testing as well as to authors of textbooks for foreign lan-
guage learning. More generally, pragmatics research could empower individuals
and social groups to more consciously manage interpersonal relations and use lan-
guage according to their needs and desires.

5. Conclusion: Methods and ethics of data collection

In conclusion, the stance taken in the present chapter, and indeed in this handbook,
can be summarized as follows. We firmly believe that there is no best method as
such, even though some researchers may claim that the method they have chosen
is generally superior to other methods. Trudgill is a case in point:
[…] if linguistics is not about language as it is actually being spoken and written by
human beings, then it is about nothing at all. (Trudgill 1996: xi)

As this quotation shows, Trudgill accepts only observational data, i. e. naturally
occurring spoken and written discourse, and thus, albeit implicitly, rejects all
experimentational methods. Arguably, his rather radical claim may hold true for
what his research is focused on in sociolinguistics. However, it does not hold true,
we maintain, for linguistics at large, or at least not for pragmatics. The selection
of data type (cf. chapter 1) and data collection method depends entirely on the
respective aims and purposes of a research project, its focus of analysis and its
research questions.
This may perhaps seem an idealistic position, as it must be conceded that feasi-
bility and the researcher’s qualifications cannot be ignored altogether. Feasibility,
in this case, pertains to the availability of time, money, equipment and personnel,
and the relevant qualifications include training, competence and experience, not
only in student projects.
A best method does not exist because each and every method has its specific
strengths and weaknesses, and this applies to field methods as well as laboratory
methods, and also the armchair method. So while a wide range of methods and data
collection procedures is available to investigators involved in pragmatics research,
each method and procedure may be suitable for addressing some research ques-
tions but not others. This means that investigators must be well aware of the nature
of their project, the aims and purposes pursued, and the questions posed. Given
this awareness, they can then consciously and carefully select the data collection
method best suited to their respective needs.
As far as the ethical dimension is concerned, investigators engaged in empirical
research by employing field methods or experimentational methods are responsi-
ble for the humans involved, i. e. their informants and participants. In particular,
the investigators must ask for their consent prior to any recording or experiment, and
exemptions do not seem acceptable any more in general. Furthermore, investi-
gators have to take care of their participants’ well-being, respect their autonomy
and protect their privacy. There may also be legal requirements that have to be
observed, depending on where and when data are collected. Overall, it is important
that researchers take informed decisions, that they are aware of the ethical impli-
cations and act responsibly.

Acknowledgements

I would like to thank Susanne Mohr, Stefanie Pohle and Friederike Sell for their
inspiring input and my two volume and series co-editors, Andreas H. Jucker and
Wolfram Bublitz, for their detailed comments and valuable advice. Any remaining
errors are, of course, my own responsibility alone.

References

Aijmer, Karin
2013 Understanding Pragmatic Markers: A Variational Pragmatic Approach. Edin-
burgh: Edinburgh University Press.
Angouri, Jo
2010 Quantitative, qualitative or both? Combining methods in linguistics research.
In: Lia Litosseliti (ed.), Research Methods in Linguistics, 29–45. London:
Continuum.
Archer, Dawn, Karin Aijmer and Anne Wichmann
2012 Pragmatics: An Advanced Resource Book for Students. Abingdon: Routledge.
Auer, Peter
2017 Turn allocation, addressee selection, and gaze. Plenary lecture given at the 15th
International Pragmatics Conference, Belfast, 16–21 July 2017.
Austin, John L.
1962 How to Do Things with Words: The William James Lectures Delivered at Har-
vard University in 1955. Oxford: Oxford University Press.
Barner, David, Neon Brooks and Alan Bale
2011 Accessing the unsaid: The role of scalar alternatives in children’s pragmatic
inference. Cognition 118(1): 84–93.
Barron, Anne
2003 Acquisition in Interlanguage Pragmatics. Amsterdam/Philadelphia: Benjamins.
Barron, Anne
2012 Public Information Messages: A Contrastive Genre Analysis of State-Citizen
Communication. Amsterdam/Philadelphia: Benjamins.
Barron, Anne and Klaus P. Schneider (eds.)
2005 The Pragmatics of Irish English. Berlin/New York: Mouton de Gruyter.
Barron, Anne and Klaus P. Schneider
2009 Variational pragmatics: Studying the impact of social factors on language use
in interaction. Intercultural Pragmatics 6(4): 425–442.
Bednarek, Monika
2011 Approaching the data of pragmatics. In: Wolfram Bublitz and Neal R. Norrick
(eds.), Foundations of Pragmatics, 537–559. (Handbooks of Pragmatics 1.)
Berlin/Boston: de Gruyter Mouton.
Beeching, Kate
2016 Pragmatic Markers in British English: Meaning in Social Interaction.
Cambridge: Cambridge University Press.
Bieswanger, Markus
2015 Variational pragmatics and responding to thanks – revisited. Multilingua
34(4): 527–546.
Blum-Kulka, Shoshana, Juliane House and Gabriele Kasper
1989a Investigating cross-cultural pragmatics: An introductory overview. In: Shos-
hana Blum-Kulka, Juliane House and Gabriele Kasper (eds.), Cross-Cultural
Pragmatics: Requests and Apologies, 1–34. Norwood, N.J.: Ablex.
Blum-Kulka, Shoshana, Juliane House and Gabriele Kasper (eds.)
1989b Cross-Cultural Pragmatics: Requests and Apologies. Norwood, N.J.: Ablex.
Brown, Penelope and Stephen C. Levinson
1978 Universals in language usage: Politeness phenomena. In: Esther Goody (ed.),
Questions and Politeness, 56–311. Cambridge: Cambridge University Press.
Brown, Penelope and Stephen C. Levinson
1987 Politeness: Some Universals in Language Usage. Cambridge: Cambridge Uni-
versity Press.
Bublitz, Wolfram
2017 Oral features in fiction. In: Miriam A. Locher and Andreas H. Jucker (eds.),
Pragmatics of Fiction, 235–263. (Handbooks of Pragmatics 12.) Berlin/Bos-
ton: de Gruyter Mouton.
Chen, Rong
1993 Responding to compliments: A contrastive study of politeness strategies
between American English and Chinese speakers. Journal of Pragmatics 20:
49–75.
Chen, Rong and Dafu Yang
2010 Responding to compliments in Chinese: Has it changed? Journal of Pragmat-
ics 42: 1951–1963.
Cheng, Winnie, Chris Greaves and Martin Warren
2008 A Corpus-Driven Study of Discourse Intonation: The Hong Kong Corpus of
Spoken English (Prosodic). Amsterdam/Philadelphia: Benjamins.
Chomsky, Noam
1957 Syntactic Structures. The Hague: Mouton.
Chomsky, Noam
1965 Aspects of the Theory of Syntax. Cambridge, Mass.: M.I.T. Press.
Clark, Herbert H. and Adrian Bangerter
2004 Changing ideas about reference. In: Ira A. Noveck and Dan Sperber (eds.),
Experimental Pragmatics, 25–49. Basingstoke: Palgrave Macmillan.
Clements, Paul
1983 The Improvised Play: The Work of Mike Leigh. London: Methuen.
Clyne, Michael
1987 Cultural differences in the organization of academic texts. Journal of Prag-
matics 11: 211–247.
Cohen, Andrew D.
2012 Research methods for describing variation in intercultural pragmatics for cul-
tures in contact and conflict. In: J. César Félix-Brasdefer and Dale A. Koike
(eds.), Pragmatic Variation in First and Second Language Contexts: Method-
ological Issues, 271–294. Amsterdam/Philadelphia: Benjamins.
Culpeper, Jonathan
1996 Towards an anatomy of impoliteness. Journal of Pragmatics 25: 349–367.
Culpeper, Jonathan
2011 Impoliteness: Using Language to Cause Offence. Cambridge: Cambridge Uni-
versity Press.
Culpeper, Jonathan, Leyla Marti, Meilian Mei, Minna Nevala and Gila Schauer
2010 Cross-cultural variation in the perception of impoliteness: A study of impo-
liteness events reported by students in England, China, Finland, Germany and
Turkey. Intercultural Pragmatics 7(4): 597–624.
Deutschmann, Mats
2003 Apologising in British English. (Skrifter från moderna språk 10.) Umeå: Insti-
tutionen för moderna språk, Umeå University.
Dinkin, Aaron
in press It’s no problem to be polite: Apparent-time change in responses to thanks.
Journal of Sociolinguistics.
Dörnyei, Zoltán
2003 Questionnaires in Second Language Research: Construction, Administration,
and Processing. Mahwah, N.J.: Erlbaum.
Dörnyei, Zoltán
2007 Research Methods in Applied Linguistics: Quantitative, Qualitative, and
Mixed Methodologies. Oxford: Oxford University Press.
Economidou-Kogetsidis, Maria
2011 “Please answer me as soon as possible”: Pragmatic failure in non-native
speakers’ e-mail requests to faculty. Journal of Pragmatics 43: 3193–3215.
Edmondson, Willis
1981 Spoken Discourse: A Model for Analysis. London: Longman.
Elm, Malin S.
2009 How do various notions of privacy influence decisions in qualitative Inter-
net research? In: Annette N. Markham and Nancy K. Baym (eds.), Internet
Inquiry: Conversations about Method, 69–87. Thousand Oaks: Sage.
Engel, Ulrich and Irmgard Vogel (eds.)
1975 Gesprochene Sprache [Spoken Language]. Tübingen: Narr.
Ervin-Tripp, Susan
1976 Is Sybil there? The structure of some American English directives. Language
in Society 5: 25–66.
Esser, Jürgen
2014 Taxonomies of discourse types. In: Klaus P. Schneider and Anne Barron (eds.),
Pragmatics of Discourse, 443–462. (Handbooks of Pragmatics 3.) Berlin: de
Gruyter Mouton.
Færch, Claus and Gabriele Kasper
1982 Phatic, metalingual and metacommunicative functions in discourse: Gambits
and repair. In: Nils E. Enkvist (ed.), Impromptu Speech, 71–103. Åbo: Åbo
Akademi.
Færch, Claus and Gabriele Kasper (eds.)
1987 Introspection in Second Language Research. Clevedon: Multilingual Matters.
Félix-Brasdefer, J. César
2009 Pragmatic variation across Spanish(es): Requesting in Mexican, Costa Rican
and Dominican Spanish. Intercultural Pragmatics 6(4): 473–515.
Félix-Brasdefer, J. César
2015 The Language of Service Encounters: A Pragmatic-Discursive Approach.
Cambridge: Cambridge University Press.
Félix-Brasdefer, J. César and Maria Hasler-Barker
2017 Elicited data. In: Anne Barron, Yueguo Gu and Gerard Steen (eds.), The Rout-
ledge Handbook of Pragmatics, 27–40. Abingdon/New York: Routledge.
Fillmore, Charles J.
1992 “Corpus linguistics” vs. “computer-aided armchair linguistics”. In: Jan Svartvik
(ed.), Directions in Corpus Linguistics, 35–60. Berlin/New York: Mouton
de Gruyter.
Garfinkel, Harold
1967 Studies in Ethnomethodology. Englewood Cliffs: Prentice-Hall.
Garfinkel, Harold and Harvey Sacks
1970 On formal structures of practical action. In: John C. McKinney and Edward
A. Tiryakian (eds.), Theoretical Sociology: Perspectives and Developments,
338–366. New York: Appleton-Century-Crofts.
Golato, Andrea and Peter Golato
2013 Pragmatics research methods. In: Carol A. Chapelle (ed.), The Encyclopedia of
Applied Linguistics, 1–6. Oxford: Blackwell.
Göy, Elif, Deniz Zeyrek and Bahar Otcu
2012 Developmental patterns in internal modification of requests: A quantitative
study of Turkish learners of English. In: Maria Economidou-Kogetsidis and
Helen Woodfield (eds.), Interlanguage Request Modification, 51–86. Amster-
dam/Philadelphia: Benjamins.
Grice, Paul H.
1975 Logic and conversation. In: Peter Cole and Jerry L. Morgan (eds.), Syntax and
Semantics 3: Speech Acts, 41–58. New York: Academic Press.
Groth, Brian Ibbotson
2001 Brit Trips-Midway Hotel: A simulated negotiation. Business Communication
Quarterly 64(1): 63–78.
Guillemin, Marilys and Lynn Gillam
2004 Ethics, reflexivity, and “ethically important moments” in research. Qualitative
Inquiry 10: 261–280.
Harrison, Sandra and Diane Allton
2013 Apologies in email discussions. In: Susan C. Herring, Dieter Stein and Tuija
Virtanen (eds.), Pragmatics of Computer-Mediated Communication, 315–337.
(Handbooks of Pragmatics 9.) Berlin/Boston: de Gruyter Mouton.
Hassall, Tim
2012 Request modification by Australian learners of Indonesian. In: Maria Econo-
midou-Kogetsidis and Helen Woodfield (eds.), Interlanguage Request Modifi-
cation, 203–242. Amsterdam/Philadelphia: Benjamins.
Haugh, Michael and Donal Carbaugh
2015 Self-disclosure in initial interactions amongst speakers of American and Aus-
tralian English. Multilingua 34(4): 461–493.
Herbert, Robert K.
1989 The ethnography of English compliments and compliment responses: A con-
trastive sketch. In: Wieslaw Oleksy (ed.), Contrastive Pragmatics, 3–35.
Amsterdam: Benjamins.
Hill, Malcolm
2005 Ethical considerations in researching children’s experiences. In: Sheila Greene
and Diane Hogan (eds.), Researching Children’s Experience: Approaches and
Methods, 61–86. London: Sage.
Ho, Debbie G. E.
2013 Focus groups. In: Carol A. Chapelle (ed.), The Encyclopedia of Applied Lin-
guistics, 1–7. Oxford: Blackwell.
Holmes, Janet
1986 Compliments and compliment responses in New Zealand English. Anthropo-
logical Linguistics 28(4): 485–508.
Holmes, Janet
1988 Paying compliments: A sex-preferential positive politeness strategy. Journal
of Pragmatics 12(3): 445–465.
Holmes, Janet
1990 Apologies in New Zealand English. Language in Society 19(2): 155–199.
Holmes, Janet
1995 Women, Men and Politeness. London: Longman.
Holmes, Janet, Meredith Marra and Bernadette Vine
2012 Politeness and impoliteness in ethnic varieties of New Zealand English. Jour-
nal of Pragmatics 44(9): 1063–1076.
Huang, Yan
2010 Pragmatics. In: Louise Cummings (ed.), The Pragmatics Encyclopedia, 341–
345. Abingdon/New York: Routledge.
Jacobs, Andreas and Andreas H. Jucker
1995 The historical perspective in pragmatics. In: Andreas H. Jucker (ed.), Histor-
ical Pragmatics: Pragmatic Developments in the History of English, 3–33.
Amsterdam: Benjamins.
Johnston, Bill, Gabriele Kasper and Steven Ross
1998 The effect of rejoinders in production questionnaires. Applied Linguistics 19:
157–182.
Johnstone, Barbara
2000 Qualitative Methods in Sociolinguistics. New York: Oxford University Press.
Jones, Rodney H.
2013 Data collection and transcription in discourse analysis. In: Ken Hyland and
Brian Paltridge (eds.), The Bloomsbury Companion to Discourse Analysis,
9–21. London: Bloomsbury.
Jones, Jeremy F. and Adrefiza
2017 Comparing apologies in Australian English and Bahasa Indonesia: Cultural
and gender perspectives. Journal of Politeness Research 13(1): 89–119.
Jucker, Andreas H.
2009 Speech act research between armchair, field and laboratory: The case of com-
pliments. Journal of Pragmatics 41(8): 1611–1635.
Jucker, Andreas H.
2015 Pragmatics of fiction: Literary uses of uh and um. Journal of Pragmatics 86:
63–67.
Jucker, Andreas H. and Daniela Landert
2015 Historical pragmatics and early speech recordings: Diachronic developments
in turn-taking and narrative structure in radio talk shows. Journal of Pragmat-
ics 79: 22–39.
Jucker, Andreas H., Gerold Schneider, Irma Taavitsainen and Barb Breustedt
2008 Fishing for compliments: Precision and recall in corpus-linguistic compliment
research. In: Andreas H. Jucker and Irma Taavitsainen (eds.), Speech Acts in
the History of English, 273–294. Amsterdam/Philadelphia: Benjamins.
Jucker, Andreas H. and Irma Taavitsainen (eds.)
2008 Speech Acts in the History of English. Amsterdam/Philadelphia: Benjamins.
Jucker, Andreas H. and Irma Taavitsainen (eds.)
2010 Historical Pragmatics. (Handbooks of Pragmatics 8.) Berlin/New York: de
Gruyter Mouton.
Jucker, Andreas H. and Irma Taavitsainen
2014 Complimenting in the history of American English: A metacommunicative
expression analysis. In: Irma Taavitsainen, Andreas H. Jucker and Jukka
Tuominen (eds.), Diachronic Corpus Pragmatics, 257–276. Amsterdam/Phil-
adelphia: Benjamins.
Jucker, Andreas H., Irma Taavitsainen and Gerold Schneider
2012 Semantic corpus trawling: Expressions of “courtesy” and “politeness” in the
Helsinki Corpus. In: Carla Suhr and Irma Taavitsainen (eds.), Developing Corpus
Methodology for Historical Pragmatics. Helsinki: Research Unit for Variation,
Contacts and Change in English.
http://www.helsinki.fi/varieng/series/volumes/11/jucker_taavitsainen_schneider/
(last accessed November 2017).
Kádár, Daniel Z. and Michael Haugh
2013 Understanding Politeness. Cambridge: Cambridge University Press.
Kasper, Gabriele
1993 Interkulturelle Pragmatik und Fremdsprachenlernen. In: Johannes-Peter Thimm
and Helmut J. Vollmer (eds.), Kontroversen in der Fremdsprachenforschung,
41–77. Bochum: Brockmeyer.
Kasper, Gabriele
2000 Data collection in pragmatics research. In: Helen Spencer-Oatey (ed.), Cul-
turally Speaking: Managing Rapport Through Talk Across Cultures, 316–341.
London: Continuum.
Kasper, Gabriele
2008 Data collection in pragmatics research. In: Helen Spencer-Oatey (ed.), Cultur-
ally Speaking: Culture, Communication and Politeness Theory. Second edi-
tion, 279–303. London: Continuum.
Kasper, Gabriele and Merete Dahl
1991 Research methods in interlanguage pragmatics. Studies in Second Language
Acquisition 13(2): 215–247.
Katsos, Napoleon and Nafsika Smith
2010 Pragmatic tolerance and speaker-comprehender asymmetries. In: Katie Franich,
Kate M. Iserman and Lauren L. Keil (eds.), The 34th Boston University Con-
ference in Language Development: Proceedings, 221–232. Boston: Cascadilla
Press.
Kim, Dan
2013 Mixed methods. In: Carol A. Chapelle (ed.), The Encyclopedia of Applied Lin-
guistics, 1–8. Oxford: Blackwell.
Kono, Nariyo
2013 Ethics in research. In: Carol A. Chapelle (ed.), The Encyclopedia of Applied
Linguistics, 1–10. Oxford: Blackwell.
Kubanyiova, Magdalena
2008 Rethinking research ethics in contemporary applied linguistics: The tension
between macroethical and microethical perspectives in situated research. The
Modern Language Journal 92(4): 503–518.
Labov, William
1966 The Social Stratification of English in New York City. Washington, D.C.:
Center for Applied Linguistics.
Labov, William
1970 The logic of non-standard English. In: James E. Alatis (ed.), Report of the
Twentieth Annual Round Table Meeting on Linguistics and Language Studies,
1–44. Washington, D.C.: Georgetown University Press.
Labov, William
1972 Sociolinguistic Patterns. Philadelphia: University of Pennsylvania Press.
Labov, William
1982 Objectivity and commitment in linguistic sciences: The case of the Black Eng-
lish trial in Ann Arbor. Language in Society 11: 165–201.
Lakoff, Robin
1973 The logic of politeness; or minding your p’s and q’s. In: Papers from the 9th
Regional Meeting of the Chicago Linguistic Society, 292–305. Chicago: Chi-
cago Linguistic Society.
Lakoff, Robin and Deborah Tannen
1984 Conversational strategy and metastrategy in a pragmatic theory: The example
of Scenes from a Marriage. Semiotica 49: 323–346.
Lazaraton, Anne
2013 Ethics in qualitative research. In: Carol A. Chapelle (ed.), The Encyclopedia of
Applied Linguistics, 1–5. Oxford: Blackwell.
Leech, Geoffrey N.
1983 Principles of Pragmatics. London: Longman.
Leech, Geoffrey N.
2014 The Pragmatics of Politeness. Oxford: Oxford University Press.
Lutzky, Ursula and Andrew Kehoe
2017 “I apologise for my poor blogging”: Searching for apologies in the Birming-
ham Blog Corpus. Corpus Pragmatics 1: 37–56.
Macaulay, Ronald K. S.
2009 Adolescents and identity. Intercultural Pragmatics 6(4): 597–612.
Mahlberg, Michaela
2014 Corpus linguistics and discourse analysis. In: Klaus P. Schneider and Anne
Barron (eds.), Pragmatics of Discourse, 215–238. (Handbooks of Pragmatics
3.) Berlin: de Gruyter Mouton.
Manes, Joan and Nessa Wolfson
1981 The compliment formula. In: Florian Coulmas (ed.), Conversational Routine,
115–132. The Hague: Mouton.
Markee, Numa
2013 Emic and etic in qualitative research. In: Carol A. Chapelle (ed.), The Encyclo-
pedia of Applied Linguistics, 1–4. Oxford: Blackwell.
Markham, Annette, Elizabeth Buchanan and The Association of Internet Researchers
(AoIR) Ethics Working Committee
2012 Ethical Decision-Making and Internet Research: Recommendations from the
AoIR Ethics Working Committee. (Version 2.0). http://aoir.org/reports/ethics2.pdf
(last accessed November 2017).
Markham, Annette and Elizabeth Buchanan
2015 Ethical considerations in digital research contexts. In: James Wright (ed.),
International Encyclopedia of the Social and Behavioral Sciences. Second
edition, 606–613. Amsterdam: Elsevier.
Martin, Gillian S.
2001 German-Irish Sales Negotiation: Theory, Practice and Pedagogical Implica-
tions. Frankfurt: Lang.
Miles, Matthew B. and A. Michael Huberman
2014 Qualitative Data Analysis: A Methods Sourcebook. Second edition. Thousand
Oaks, CA: Sage.
Min, Jennifer Quah Xiao
2015 Compliment responses among Malaysian multilinguals. In: Kate Beeching and
Helen Woodfield (eds.), Researching Sociopragmatic Variability: Perspec-
tives from Variational, Interlanguage and Contrastive Pragmatics, 119–148.
Basingstoke: Palgrave Macmillan.
Mulo Farenkia, Bernard
2015 Invitation refusals in Cameroon French and Hexagonal French. Multilingua
34(4): 577–603.
Noveck, Ira A. and Dan Sperber (eds.)
2004 Experimental Pragmatics. Basingstoke: Palgrave Macmillan.
Ogiermann, Eva
2009 On Apologizing in Negative and Positive Politeness Cultures. Amsterdam/
Philadelphia: Benjamins.
Ogiermann, Eva and Denise Sassenroth
2012 Statistics in contrastive pragmatics. In: Leyre Ruiz de Zarobe and Yolanda
Ruiz de Zarobe (eds.), Speech Acts and Politeness across Languages and Cul-
tures, 369–398. Bern: Lang.
Owen, Marion
1983 Apologies and Remedial Interchanges. Berlin: Mouton.
Pan, Ping Cathy
2012 Interlanguage requests in institutional e-mail discourse: A study in Hong Kong.
In: Maria Economidou-Kogetsidis and Helen Woodfield (eds.), Interlanguage
Request Modification, 119–161. Amsterdam/Philadelphia: Benjamins.
Pichler, Heike
2013 The Structure of Discourse-Pragmatic Variation. Amsterdam/Philadelphia:
Benjamins.
Placencia, Maria Elena
2008 Requests in corner shop transactions in Ecuadorian Andean and Coastal Span-
ish. In: Klaus P. Schneider and Anne Barron (eds.), Variational Pragmatics: A
Focus on Regional Varieties in Pluricentric Languages, 307–322. Amsterdam/
Philadelphia: Benjamins.
Pohle, Stefanie
2009 “I tell you what we could do, we could say, cut it to a hundred and ninety-five,
and offer you a significant discount on breakfast” – Expressing commitment
in business discourse: An empirical analysis of offers in Irish English negotiations.
Ph.D. dissertation, Department of English, University of Bonn.
<http://hss.ulb.uni-bonn.de/2009/1912/1912.htm>
Pomerantz, Anita
1978 Compliment responses: Notes on the cooperation of multiple constraints. In:
Jim Schenkein (ed.), Studies in the Organization of Conversational Interac-
tion, 79–112. New York: Academic Press.
Ren, Wei
2015a L2 Pragmatic Development in Study Abroad Contexts. Bern: Lang.
Ren, Wei
2015b Sociopragmatic variation in Mainland and Taiwan Chinese refusals. In: Kate
Beeching and Helen Woodfield (eds.), Researching Sociopragmatic Variabil-
ity: Perspectives from Variational, Interlanguage and Contrastive Pragmatics,
72–93. Basingstoke: Palgrave Macmillan.
Renkwitz, Katrin and Pawel Sickinger
forthcoming Learner or native speaker? Native speaker perceptions of learner status
and appropriate communicative behaviour.
Roever, Carsten
2005 Testing ESL Pragmatics. Frankfurt: Lang.
Roever, Carsten
2010 Assessing language use in social context: A new approach to testing second
language pragmatics. In: Tien-en Kao and Yaofu Lin (eds.), A New Look at
Language Teaching and Testing: English as Subject and Vehicle, 87–97. Tai-
pei: Language Teaching and Testing Center.
Roever, Carsten
2013 Testing in pragmatics research. In: Carol A. Chapelle (ed.), The Encyclopedia
of Applied Linguistics, 1–8. Oxford: Blackwell.
Rose, Kenneth R.
1994 On the validity of discourse completion tests in non-Western contexts. Applied
Linguistics 15: 1–14.
Roth, Ruth-Maria
2002 Responding to insults in British English. Teaching degree dissertation, Depart-
ment of English, University of Bonn.
Roulston, Kathryn
2013 Interviews in qualitative research. In: Carol A. Chapelle (ed.), The Encyclope-
dia of Applied Linguistics, 1–9. Oxford: Blackwell.
Rüegg, Larssyn
2014 Thanks responses in three socio-economic settings: A variational pragmatics
approach. Journal of Pragmatics 71: 17–30.
Saussure, Ferdinand de
1916 Cours de Linguistique Générale. Paris: Payot.
Schauer, Gila A.
2004 “May you speak louder maybe?” Interlanguage pragmatic development in
requests. EUROSLA Yearbook 4: 253–272.
Schauer, Gila A.
2009 Interlanguage Pragmatic Development: The Study Abroad Context. London/
New York: Continuum.
Schegloff, Emanuel
1972 Sequencing conversational openings. In: John Gumperz and Dell Hymes
(eds.), Directions in Sociolinguistics, 346–380. New York: Holt, Rinehart and
Winston.
Schegloff, Emanuel
1992 Introduction. In: Gail Jefferson (ed.), Harvey Sacks, Lectures on Conversation,
Vol. I, ix–lxii. Oxford: Blackwell.
Schegloff, Emanuel and Harvey Sacks
1973 Opening up closings. Semiotica 8: 289–328.
Schneider, Klaus P.
2005 “No problem, you’re welcome, anytime”: Responding to thanks in Ireland,
England, and the U.S.A. In: Anne Barron and Klaus P. Schneider (eds.), The
Pragmatics of Irish English, 101–139. Berlin/New York: Mouton de Gruyter.
Schneider, Klaus P.
2007 Genre matters: Textual and contextual constraints on contemporary English
speech behaviour. Anglia 125(1): 59–83.
Schneider, Klaus P.
2011 Imagining conversation: How people think people do things with words. Soci-
olinguistic Studies 5(1): 15–36.
Schneider, Klaus P.
2012 Pragmatic variation and cultural models. Review of Cognitive Linguistics
10(2): 346–372.
Schneider, Klaus P.
2013 Emerging e-mail etiquette: Lay perceptions of appropriateness in electronic
discourse. In: Katrin Röder and Ilse Wischer (eds.), Anglistentag 2012 Pots-
dam: Proceedings, 329–340. Trier: Wissenschaftlicher Verlag Trier.
Schneider, Klaus P.
2014 Comparability and sameness in variational pragmatics. In: Silvia Mergenthal
and Reingart M. Nischik (eds.), Anglistentag 2013 Konstanz: Proceedings,
361–372. Trier: Wissenschaftlicher Verlag Trier.
Schneider, Klaus P.
2017 Is that a threat? Forms and functions of metapragmatic terms in English dis-
course. Arbeiten aus Anglistik und Amerikanistik 42(2): 225–242.
Schneider, Klaus P. and Anne Barron (eds.)
2014 Pragmatics of Discourse. (Handbooks of Pragmatics 3.) Berlin/Boston: de
Gruyter Mouton.
Schröder, Anne and Klaus P. Schneider
in press Variational pragmatics, responses to thanks, and the specificity of English in
Namibia. English World-Wide [to appear 2019].
Searle, John R.
1969 Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge
University Press.
Seedhouse, Paul
2013 Oral proficiency interviews as varieties of interaction. In: Steven J. Ross and
Gabriele Kasper (eds.), Assessing Second Language Pragmatics, 199–219.
Basingstoke: Palgrave Macmillan.
Short, Mick
1996 Exploring the Language of Poems, Plays and Prose. London/New York: Long-
man.
Spencer-Oatey, Helen, Patrick Ng and Li Dong
2008 British and Chinese reactions to compliment responses. In: Helen Spen-
cer-Oatey (ed.), Culturally Speaking: Culture, Communication and Politeness
Theory. Second edition, 95–303. London: Continuum.
Sperber, Dan and Deirdre Wilson
1995 Relevance: Communication and Cognition. Second edition. Oxford: Basil
Blackwell.
Stenström, Anna-Brita, Gisle Andersen and Ingrid Kristine Hasund
2002 Trends in Teenage Talk: Corpus Compilation, Analysis and Findings. Amster-
dam/Philadelphia: Benjamins.
Svartvik, Jan and Randolph Quirk
1980 A Corpus of English Conversation. Lund: Gleerup.
Svennevig, Jan
1999 Getting Acquainted in Conversation: A Study of Initial Interactions. Amster-
dam/Philadelphia: Benjamins.
Taavitsainen, Irma and Andreas H. Jucker
2015 Twenty years of historical pragmatics: Origins, developments and changing
thought styles. Journal of Historical Pragmatics 16(1): 1–24.
Tanaka, Noriko, Helen Spencer-Oatey and Ellen Cray
2008 Apologies in Japanese and English. In: Helen Spencer-Oatey (ed.), Culturally
Speaking: Culture, Communication and Politeness Theory. Second edition,
73–94. London: Continuum.
Tannen, Deborah
1984 Conversational Style: Analyzing Talk Among Friends. Norwood, N.J.: Ablex.
Tannen, Deborah
2005 Conversational Style: Analyzing Talk Among Friends. New edition. Oxford:
Oxford University Press.
Tardy, Christine M. and John M. Swales
2014 Genre analysis. In: Klaus P. Schneider and Anne Barron (eds.), Pragmatics of
Discourse, 165–187. (Handbooks of Pragmatics 3.) Berlin/Boston: de Gruyter
Mouton.
Trosborg, Anna
1995 Interlanguage Pragmatics: Requests, Complaints and Apologies. Berlin:
Mouton de Gruyter.
Trudgill, Peter
1996 Series editor’s preface. In: Michael Stubbs, Text and Corpus Analysis: Com-
puter-Assisted Studies of Language and Culture, xi. Oxford: Blackwell.
Warren, Martin
2006 Features of Naturalness in Conversation. Amsterdam/Philadelphia: Benjamins.
Watts, Richard J.
2003 Politeness. Cambridge: Cambridge University Press.
Watts, Richard J.
2010 Linguistic politeness theory and its aftermath: Recent research trails. In: Mir-
iam A. Locher and Sage L. Graham (eds.), Interpersonal Pragmatics, 43–70.
(Handbooks of Pragmatics 6.) Berlin/New York: de Gruyter Mouton.
Watts, Richard J., Sachiko Ide and Konrad Ehlich
1992 Introduction. In: Richard J. Watts, Sachiko Ide and Konrad Ehlich (eds.),
Politeness in Language: Studies in its History, Theory and Practice, 1–20.
Berlin: de Gruyter.
Wierzbicka, Anna
1985 Different cultures, different languages, different speech acts: Polish vs. Eng-
lish. Journal of Pragmatics 9: 145–178.
Wodak, Ruth
2011 Critical discourse analysis: Overview, challenges, and perspectives. In: Gisle
Andersen and Karin Aijmer (eds.), Pragmatics of Society, 627–650. (Hand-
books of Pragmatics 5.) Berlin/Boston: de Gruyter Mouton.
Wolfram, Walt
1993 Ethical considerations in language awareness programs. Issues in Applied Lin-
guistics 4(2): 225–255.
Woodfield, Helen
2012 “I think maybe I want to lend the notes from you”: Development of request
modification in graduate learners. In: Maria Economidou-Kogetsidis and
Helen Woodfield (eds.), Interlanguage Request Modification, 9–49. Amster-
dam/Philadelphia: Benjamins.
Yuan, Yi
2001 An inquiry into empirical data gathering methods: Written DCTs, oral DCTs,
field notes, and natural conversations. Journal of Pragmatics 33: 271–292.
Zhu, Xiaoshu
2004 For a better understanding: An analysis of miscommunication between Amer-
icans and Chinese and Germans and Chinese. Ph.D. dissertation, Department
of English, University of Bonn.
3. The art of transcription: Systems and methodological issues
Roger J. Kreuz and Monica A. Riordan

Abstract: A faithful reproduction of the words, paralanguage, and gestures
employed in an interaction is essential for researchers in many disciplines. The
reduction of a conversation to a transcription, however, is a process fraught with
difficult choices and inevitable tradeoffs. A large number of different transcription
systems have been developed, and this chapter provides an overview of the more
widely employed frameworks. These systems vary considerably in terms of their
scope, focus, completeness, and forms of notation. Just as no particular tool is the
best choice for all building tasks, there is no universal transcription system that will
be suitable for all researchers and all research questions. The goal of this review
is to provide a survey of the terrain so that practitioners of transcription can make
informed choices about the best system for their particular purpose.

1. Introduction

The transcription of face-to-face interaction presents formidable challenges for
researchers who study pragmatics. Those venturing into this domain are confronted
by a wide variety of transcription systems that have been devised by researchers
from a diversity of disciplines over several decades. Although these systems are
not mutually exclusive, they often possess large differences in scope, emphasis,
and nomenclature, as well as in the symbols used to transcribe these dimensions.
It is beyond the scope of this chapter to reconcile these systems, and it is far
from clear that such a reconciliation would be desirable. Instead, we will provide
an overview of a number of these systems and make suggestions about their suita-
bility for various transcription needs. In addition, we will briefly address transcrip-
tion issues with regard to speech in diverse populations (e. g., children, aphasics,
and cognitively impaired individuals). We will also consider the challenges of
transcribing the non-acoustic (i. e., facial and gestural) signals that are of special
interest to pragmatics researchers, as well as the transcribers of sign language. It
is our hope that this overview and discussion will provide some guidance to those
who wish to practice the art of transcription.

https://doi.org/10.1515/9783110424928-003
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 95–120. Berlin/
Boston: De Gruyter Mouton.

2. Some preliminaries

2.1. Purpose of transcription

Researchers who study pragmatics create transcriptions in order to test their theo-
ries about discourse. In other words, no one creates transcriptions of face-to-face
interactions as an end in themselves; rather, transcription is always performed as
a means to an end. For example, President Richard Nixon’s Oval Office record-
ings (New York Times 1974) were transcribed for the purpose of determining what
the president knew about the Watergate cover-up. (The frequent use of the term
“expletive deleted” by the transcriber provided the public with some unintended
pragmatic insight into the speech of the head of the Executive Branch.) Other
researchers have gone on to use the Nixon administration’s transcripts for their
own research purposes (e. g., Novick, Walton and Ward 1996), but the original
transcriptions were not created with any other purpose in mind.
The fidelity of transcription is a matter of paramount importance in the fields
of medical and legal transcription. However, in these applied settings, the focus
is primarily on what was said or intended as opposed to how it was uttered.
Although there may be important commonalities between transcription in applied
and research domains, they will not be reviewed here. It should also be noted
that different academic disciplines have differing standards for what constitutes
a useful transcription. The relevant issues are discussed by Podesva and Sharma
(2013) with regard to linguistics; Widodo (2014) for second language research;
Gee (2011), Gee and Handford (2012) and Jones (2011) with regard to discourse
analysis; and Mallinson, Childs and Van Herk (2013) for sociolinguistics.

2.2. Transcription versus coding

Many issues in pragmatics may be profitably explored without recourse to tran-
scription. Because transcription is extremely time intensive (see section 2.5), it
may in fact be overkill for many purposes (Hammersley 2010). For instance, if
one simply wanted to compare the number of specific discourse markers (e. g.,
Schiffrin 1987) in two stretches of discourse, it might in fact be easier to tally
them directly from an audio- or videotape of the interaction. This approach might
also be more accurate, because it would avoid many of the complications that are
inevitable when an interaction is reduced to a transcription. Such coding should
be possible for many phenomena that are conceptually well-defined and macro-
scopic.
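
As a rough illustration of coding without transcription, the following sketch tallies discourse markers that a coder has logged by hand while listening to two recordings. The event format (a time offset in seconds plus a label), the marker list, and all names are assumptions made for this example; they are not taken from any published coding scheme.

```python
from collections import Counter

def tally_markers(events, markers):
    """Count how often each marker of interest appears in a coder's event log.

    events  -- iterable of (seconds_from_start, label) pairs logged by hand
    markers -- the labels to count; any other label in the log is ignored
    """
    wanted = {m.lower() for m in markers}
    return Counter(label.lower() for _, label in events if label.lower() in wanted)

# Hypothetical logs for two stretches of discourse.
recording_a = [(3.2, "well"), (7.9, "oh"), (12.4, "well"), (15.0, "laugh")]
recording_b = [(1.1, "oh"), (9.6, "y'know"), (20.3, "oh")]

markers = ["well", "oh", "y'know"]
counts_a = tally_markers(recording_a, markers)
counts_b = tally_markers(recording_b, markers)

# Side-by-side comparison; a Counter returns 0 for markers that never occurred.
for marker in markers:
    print(f"{marker:8} A={counts_a[marker]} B={counts_b[marker]}")
```

Because only the events of interest are logged, the time offsets allow a second coder to revisit each instance in the recording and check agreement without a full transcript.
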
In other cases, a researcher may be interested in classifying utterances accord-
ing to a particular taxonomy, such as illocutionary speech acts (e. g., Searle 1975).
In such cases, it would probably be easiest to work from a relatively broad tran-
scription of the interaction. The focus of this review will be the large range between
counts and coding schemes, which require the use of a particular transcription sys-
tem to produce a faithful record of an interaction.

2.3. Issues of terminology

Phonetic transcriptions are frequently characterized as “broad” or “narrow,” with
narrow transcription documenting the allophonic variation in speech sounds. Inter-
national standards exist for transcribing phonetic information at both the broad and
narrow levels (Pullum and Ladusaw 1996), although not surprisingly, reliability is
higher for broad transcriptions than for narrow ones (Shriberg and Lof 1991). In
a similar way, the coding of non-phonetic dimensions, such as pauses or gestures,
can be characterized as broad or narrow. A paradox of transcription is that, as a
transcript becomes narrower (and in theory more faithful to the discourse it rep-
resents) it becomes more difficult for others to read and interpret. In other words,
there may be a tradeoff between the fidelity of a transcript and its intelligibility.
Consider, for example, a few turns from the relatively narrow transcription
provided by Kyratzis (2001) in her analysis of the interactions of preschool friends:
1. Speaker 1: … if someone comes, then we **hi::de, really//
2. Speaker 2: ==yeah/ …
3. we’re *shy:: wizards// (Kyratzis 2001: 363)

The symbols provide important information about intonation, lengthened segments,
and latching, although most readers will need to refer to the author’s list of
transcription conventions to fully decode these features.
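
To make the tradeoff between fidelity and intelligibility concrete, the sketch below separates the notation in the excerpt from the words. The symbol meanings are assumptions made for illustration only ("==" for latching, "::" for lengthening, "*" or "**" for prominence, "/" or "//" for an intonation boundary); the authoritative key is the author's own list of conventions.

```python
import re

# Assumed symbol meanings; check them against the transcript's own conventions.
PATTERNS = [
    ("latching", re.compile(r"==")),
    ("lengthening", re.compile(r":{2,}")),
    ("prominence", re.compile(r"\*+")),
    ("intonation_boundary", re.compile(r"/{1,2}")),
]

def describe_line(line):
    """Return the line with notation stripped, plus a count of each feature."""
    features = {name: len(pattern.findall(line)) for name, pattern in PATTERNS}
    plain = line
    for _, pattern in PATTERNS:
        plain = pattern.sub("", plain)
    # Collapse whitespace left behind by removed symbols.
    plain = " ".join(plain.split())
    return plain, features

excerpt = [
    "… if someone comes, then we **hi::de, really//",
    "==yeah/ …",
    "we’re *shy:: wizards//",
]

for line in excerpt:
    plain, features = describe_line(line)
    marked = {name: n for name, n in features.items() if n}
    print(f"{plain!r:45} {marked}")
```

The stripped version is easier to read but no longer records where the lengthening or prominence fell, which is the information a narrow transcription exists to preserve.
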
It may also be helpful at this point to define the specific dimensions that are of
interest to transcribers of discourse. O’Connell and Kowal (1995a, 1999; see also
Kowal and O’Connell 2004) suggest that transcribed behaviors can be categorized as
belonging to one of four classes of features. Verbal features refer to the words
themselves (i. e., what was said), whereas prosodic features correspond to the ways
in which the words were spoken (e. g., pitch, duration, and loudness). Paralinguis-
tic features (such as laughter, breathing, sighing, or crying) may accompany the
spoken words, or they may occur independently. Finally, extralinguistic features
are behaviors defined as “nonvocal and nonverbal” (O’Connell and Kowal 1999:
109), but which are nevertheless germane to the discourse being described (e. g.,
facial expression, gaze, and gesture). The purpose of a transcription system is to
subdivide these features into a number of dimensions, which can then be labeled
with a particular code or set of codes. Transcription systems differ in their coverage
of these four feature classes, in the number of dimensions employed within each class,
and in the specific codes that are used.
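
One way to compare systems along these lines is to treat the four classes as a small controlled vocabulary and group each system's dimensions under them. The sketch below does this for two hypothetical systems; the dimension inventory simply reuses the examples given above, and the names `FeatureClass`, `DIMENSION_CLASS`, and `coverage` are invented for this illustration.

```python
from enum import Enum

class FeatureClass(Enum):
    VERBAL = "verbal"
    PROSODIC = "prosodic"
    PARALINGUISTIC = "paralinguistic"
    EXTRALINGUISTIC = "extralinguistic"

# Example dimensions from the description above; a real system defines its own inventory.
DIMENSION_CLASS = {
    "words": FeatureClass.VERBAL,
    "pitch": FeatureClass.PROSODIC,
    "duration": FeatureClass.PROSODIC,
    "loudness": FeatureClass.PROSODIC,
    "laughter": FeatureClass.PARALINGUISTIC,
    "breathing": FeatureClass.PARALINGUISTIC,
    "sighing": FeatureClass.PARALINGUISTIC,
    "crying": FeatureClass.PARALINGUISTIC,
    "facial expression": FeatureClass.EXTRALINGUISTIC,
    "gaze": FeatureClass.EXTRALINGUISTIC,
    "gesture": FeatureClass.EXTRALINGUISTIC,
}

def coverage(dimensions):
    """Group the dimensions a system transcribes by feature class."""
    grouped = {feature_class: [] for feature_class in FeatureClass}
    for dimension in dimensions:
        grouped[DIMENSION_CLASS[dimension]].append(dimension)
    return grouped

# Two hypothetical systems with different emphases.
system_a = ["words", "pitch", "laughter"]
system_b = ["words", "gaze", "gesture", "facial expression"]

for name, dims in [("A", system_a), ("B", system_b)]:
    summary = {fc.value: len(found) for fc, found in coverage(dims).items()}
    print(name, summary)
```

An empty list for a class makes explicit what a given system leaves untranscribed.
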

2.4. Transcription layout

Edwards (1993b) has reviewed how transcription systems differ in the spatial
arrangement of information as well as the type and level of description. Vertical
arrangement, in which speakers’ turns are arranged sequentially (as in a script)
may be the most common, although other formats may be more informative. Rep-
resenting the interaction between interlocutors in columns may be helpful in some
cases, and a partiture format, as in Ehlich’s (1993) HIAT system (see section 4.6),
is effective for capturing interactions with lots of simultaneous speech.
There are several different ways in which a transcriber can choose to arrange
prosodic, paralinguistic, and extralinguistic features within a transcript. One
choice, referred to by Edwards (1993b) as running text, places such information
following the words (e. g., the code “laughter” immediately after an utterance con-
taining laughter). This method preserves the temporal contiguity of the informa-
tion in the transcript. However, many transcription systems use an interspersed
format for recording prosodic information (Edwards 1993b). Changes in pitch, for
example, can be directly mapped onto the syllables themselves, or indicated by
specific codes. A third approach is to use a segment-plus-specification (SPS) for-
mat, in which one tier provides the verbal dimension, and other tiers or rows below
the first tier provide syntactic, semantic, or pragmatic codes (Edwards 1993b). A
fourth choice is referred to as utterance-plus-clarification. In this format, utter-
ances are broken apart and nonverbal or contextual information (e. g., gesture,
gaze, or behavior of the speaker) appears below each speaker’s turn. As Edwards
(1993b) notes, even something as mundane as the arrangement of speakers’ turns
can have important implications for how a transcript is analyzed and interpreted.
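
The difference between a vertical and a column arrangement can be seen by rendering the same turns both ways. The sketch below is a deliberate simplification: the turns are invented, speakers are reduced to single letters, and overlap is not represented at all, so it should not be read as an implementation of any of the systems reviewed here.

```python
# Invented turns: (speaker, words).
turns = [("A", "did you see it"), ("B", "yeah"), ("A", "really")]

def vertical(turns):
    """Script-like layout: one turn per line, in sequence."""
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)

def columns(turns, width=20):
    """One column per speaker; each row holds a single turn in its speaker's column."""
    speakers = sorted({speaker for speaker, _ in turns})
    rows = ["".join(s.ljust(width) for s in speakers).rstrip()]
    for speaker, text in turns:
        cells = [text if s == speaker else "" for s in speakers]
        rows.append("".join(cell.ljust(width) for cell in cells).rstrip())
    return "\n".join(rows)

print(vertical(turns))
print()
print(columns(turns))
```

In the vertical layout each speaker's contributions are interleaved with the other's, whereas the column layout lets the eye follow one speaker downward. Turns longer than `width` would break the alignment in this sketch.
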

2.5. Procedural issues

One issue on which all researchers in pragmatics can agree is that transcription is
very labor-intensive. MacWhinney (2001), for example, has estimated that it can
take over ten hours to transcribe one hour of discourse. Consequently, it makes
sense to transcribe as few dimensions as possible for the purposes at hand. For
example, if one wanted to study the use of discourse markers such as “uh” and
“um” (Clark and Fox Tree 2002), it might make little sense to transcribe head
movements of the listener. However, such choices are crucial, for as many have
argued, a transcription system is not theory-neutral: it already reflects the beliefs
and biases of the researcher (see, for example, Ochs 1979, Skukauskaite 2012 and
Gibson, Webb and vom Lehn 2014). By choosing not to transcribe a particular
dimension, the researcher has implicitly decided that the dimension plays no role
in the phenomenon in question. Considering the infinity of pragmatically sali-
ent dimensions that the researcher could transcribe, the choice of the dimensions
themselves is already something of a compromise. To put it another way, transcription
is always subjective and interpretive to some degree, and transcription
inevitably leads to data reduction. These issues are of fundamental importance, but
it is beyond the scope of this chapter to do them justice: please refer to Bucholtz
(2000, 2007a, 2007b); Green, Franquiz and Dixon (1997); Jaffe (2007); Mishler
(1991); Preston (1982); Roberts (1997); and Tilley (2003) for further discussion
and analysis.

2.6. The role of context

Unless it is occurring under unnatural laboratory conditions, face-to-face interac-
tion is deeply embedded within a social context, involving dimensions that may
not be readily apparent to the transcriber. The relationship that exists between the
conversational participants may be crucial to the interpretation of the interaction
and should be specified when possible. However, assessing the relevant param-
eters, such as the amount of shared common ground (Clark 1996), may be quite
difficult. Most transcribers have not paid much attention to specifying context,
although some of these issues have been explored by Cook (1995), Norris (2004),
and Bucholtz (2007b). Sensitivity to these issues can make for a more informative
transcript: acoustically, there is no difference in transcribing a brief absence of
speech as a “pause” or as “shocked silence,” but clearly the latter provides more
information (perhaps at the expense of complete objectivity).

2.7. Technological considerations

Technological developments have greatly affected the way in which researchers
record conversational interaction. Researchers now routinely employ digitally
recorded video, which makes it possible to record acoustic, facial, and gestural
information (a helpful review of these issues may be found in Bavelas, Kenwood
and Phillips 2002).
An important technological issue with regard to transcription is the conver-
sion of transcription systems into mark-up languages that can be interpreted by
computer programs. Such conversion is necessary to allow efficient data storage
and retrieval, but as with other aspects of transcription, a number of tradeoffs are
involved (see Leech, Myers and Thomas 1995, for a variety of such examples).
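The following is a minimal sketch of such a conversion, written in Python and not drawn from any of the systems reviewed in this chapter. It assumes that timed pauses are written as seconds in parentheses, e. g., “(0.5)”, and it uses hypothetical element names (<u> for an utterance, <pause> for a pause). It also illustrates one small tradeoff: characters that are harmless in a plain transcript, such as “&”, must be escaped in the marked-up version.

```python
import re
from xml.sax.saxutils import escape

# Assumed convention: a timed pause is a number of seconds in parentheses.
PAUSE = re.compile(r"\((\d+(?:\.\d+)?)\)")

def to_markup(line: str) -> str:
    """Wrap an utterance in <u> and turn timed pauses into <pause/> elements.

    The element and attribute names are hypothetical, chosen for illustration.
    """
    parts, last = [], 0
    for match in PAUSE.finditer(line):
        # Escape the ordinary text that precedes the pause.
        parts.append(escape(line[last:match.start()]))
        parts.append(f'<pause dur="{match.group(1)}"/>')
        last = match.end()
    parts.append(escape(line[last:]))
    return "<u>" + "".join(parts) + "</u>"

marked_up = to_markup("well (0.5) I dunno & stuff")
print(marked_up)
```

Once pauses are explicit elements, a program can retrieve them by duration, but the marked-up line is noticeably harder for a human reader to scan than the original.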
A variety of software packages exist for use in the creation of transcripts, and
the number of such programs has proliferated as more researchers have entered the
field. These programs can greatly lessen some of the tedium involved in repeatedly
playing a troublesome section of speech, or temporally aligning verbal and extra-
linguistic features. However, these programs vary considerably in their purposes,
design philosophies, platform compatibility, cost, and ease of use (for a discussion,
see Jenks 2013). In addition, the lines between transcription software, annota-
tion packages, coding programs, and full-blown computer-assisted qualitative data
analysis software (CAQDAS) have become blurred. It is beyond the scope of this
chapter to review the myriad possibilities, but helpful reviews may be found
elsewhere (e. g., Ide and Pustejovsky 2017; Silver and Lewins 2014). Popular
programs include MAXQDA (http://www.maxqda.com) and QDA Miner
(https://provalisresearch.com), which are designed for mixed methods researchers;
NVivo (http://www.qsrinternational.com/nvivo-product); Atlas.ti (http://atlasti.com;
see Paulus and Lester 2016), which has its origins in grounded theory; Dedoose
(http://www.dedoose.com), which is web-based; VoiceWalker, described as “a discourse
transcription utility” (Du Bois 2006:1); ANVIL, an annotation program (Kipp
2003); and Transana (Woods 2007), an analysis package available at
http://www.transana.org.
Widely used and free software packages include CLAN (described in section
4.9); ELAN (https://tla.mpi.nl/tools/tla-tools/elan/), used for annotating audio and
video; EXMARaLDA (http://exmaralda.org/de/; see Meißner and Slavcheva 2013);
and Praat (http://www.praat.org), which is designed primarily for researchers in
phonetics. RQDA is a CAQDAS package that operates within R, the open-source
statistical programming language (http://rqda.r-forge.r-project.org). Programs that
can assist in the coding of gestural components of language also exist (Neidle,
Sclaroff and Athitsos 2001).
When using any software program, some thought should be given to whether
its file format is proprietary and whether it supports data export to other programs.
Today’s software standard has a way of becoming tomorrow’s historical footnote,
with the unfortunate consequence that transcripts or entire coding projects may be
rendered inaccessible over time.

3. Design principles

A number of theorists and researchers have made proposals concerning what an
ideal transcription system should encompass. These ideas will be reviewed
chronologically in order to show their development over time.
Du Bois (1991) proposed five maxims for designing transcription systems and
23 design principles that follow from these maxims. The maxims are category defi-
nition (e. g., use categories that are explicit, necessary, and sufficient); accessibility
(e. g., use familiar and easily learned notations); robustness (e. g., avoid invisible
or fragile contrasts); economy (e. g., avoid verbose notations, use space meaning-
fully); and adaptability (e. g., allow for seamless integration of user-defined codes
and transcription categories).
Edwards (1993b) argued that the creators of transcripts should keep in mind
issues of category design, readability, and computational tractability. By cate-
gory design, she means that the dimensions must be systematically discriminable,
exhaustive, and contrastive. Readability refers to placing related events in close
proximity, the visual separation of unlike elements, time-space iconicity, logical
priority, mnemonic marking, and efficiency and compactness. Computational trac-
tability refers to systematicity and predictability in encoding dimensions. Failure
to consider computational tractability can lead to the underselection or overse-
lection of instances from a transcription. For example, a search of a transcript for
instances of “going to” would not necessarily identify instances of “gonna,” while
a search for the verb “bear” might also snare the ursine variety. Edwards (1995)
later expanded on these concerns, and also discussed issues of validity and relia-
bility when computerized language archives are used.
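The “going to” and “bear” examples can be sketched in a few lines of Python. This is an illustration only, not a procedure proposed by Edwards: the variant table, the function names, and the four-line transcript are all invented for the purpose.

```python
import re

# Hypothetical variant table: nonstandard spelling -> canonical form.
VARIANTS = {"gonna": "going to", "wanna": "want to"}
VARIANT_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, VARIANTS)) + r")\b", re.IGNORECASE
)

def normalize(utterance: str) -> str:
    """Replace each known variant spelling with its canonical form."""
    return VARIANT_PATTERN.sub(lambda m: VARIANTS[m.group(0).lower()], utterance)

def find_phrase(utterances, phrase, use_normalization=False):
    """Return the indices of utterances that contain the phrase as whole words."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    hits = []
    for index, utterance in enumerate(utterances):
        text = normalize(utterance) if use_normalization else utterance
        if pattern.search(text):
            hits.append(index)
    return hits

transcript = [
    "I'm going to call her",
    "we're gonna be late",
    "I can't bear it",
    "the bear ran off",
]

# Underselection: the literal search misses the "gonna" utterance.
raw_hits = find_phrase(transcript, "going to")
# Normalizing variants first recovers it.
normalized_hits = find_phrase(transcript, "going to", use_normalization=True)
# Overselection: a string search cannot tell the verb from the noun.
bear_hits = find_phrase(transcript, "bear")
```

Normalization addresses underselection only if every variant has been anticipated in the table. The overselection of “bear” cannot be solved by string matching at all and would require additional coding, such as part-of-speech tags, in the transcript itself.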
O’Connell and Kowal (1994, 1995a, 1999, 2008) point out that the ultimate
purpose of a transcription (i. e., the purpose of the researchers and the readership of
a transcript) must be kept in mind. In addition, these authors argue against inserting
transcription symbols within words to indicate prosodic features, because doing
so impairs the legibility of the transcript. A third point concerns the consistency
of notation: each symbol should encode only one dimension, and conversely, each
dimension should be encoded by only one symbol. O’Connell and Kowal also
suggest that conventional typographic elements, such as ampersands and ellip-
ses, should not be used as transcription symbols because their primary purpose is
already deeply ingrained in the minds of readers. Finally, the authors argue that
measures of continuous variables, such as amplitude and time, must be made with
accurate equipment, and not subjectively.
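The consistency requirement (one symbol per dimension, and one dimension per symbol) lends itself to a mechanical check. The Python sketch below assumes that a transcription key is available as a list of symbol and dimension pairs; the function name and the example key are hypothetical and do not reproduce any published system.

```python
from collections import defaultdict

def notation_conflicts(key):
    """Find violations of the one-symbol, one-dimension principle.

    key is an iterable of (symbol, dimension) pairs from a transcription key.
    Returns two dicts: symbols that encode more than one dimension, and
    dimensions that are encoded by more than one symbol.
    """
    dims_by_symbol = defaultdict(set)
    symbols_by_dim = defaultdict(set)
    for symbol, dimension in key:
        dims_by_symbol[symbol].add(dimension)
        symbols_by_dim[dimension].add(symbol)
    overloaded = {s: sorted(d) for s, d in dims_by_symbol.items() if len(d) > 1}
    redundant = {d: sorted(s) for d, s in symbols_by_dim.items() if len(s) > 1}
    return overloaded, redundant

# Invented key, for illustration only.
example_key = [
    (":", "lengthening"),
    ("/", "rising intonation"),
    ("?", "rising intonation"),
    ("?", "transcriber doubt"),
]
overloaded, redundant = notation_conflicts(example_key)
```

In this invented key, “?” is overloaded because it encodes two dimensions, and rising intonation is redundantly encoded by two symbols. Both kinds of conflict are reported separately because they call for different repairs.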
Dressler and Kreuz (2000) encouraged the developers of transcription systems
to keep in mind seven design principles: (1) specificity (the tradeoffs of broad
versus narrow transcription); (2) universality (not linking the conventions to a
particular language, such as “s” for “softly”); (3) consensus (using symbols as
others have used them in the past); (4) transparency (using intuitive symbols, such
as a rising line, “/”, to indicate rising intonation); (5) parsimony (the use of a small
number of codes); (6) conventionality (using codes that can be easily located on
a keyboard); and (7) extensibility (the system should be open-ended to allow new
dimensions to be transcribed).
Müller and Damico (2002), approaching these issues from the perspective of
clinical linguistics and phonetics, proposed six guiding principles that share much
in common with the points already raised. However, for dealing with the com-
plexities of disordered speech, they stress the importance of flexibility “to ensure
authenticity and individuality” (Müller and Damico 2002:312).

4. Review of transcription systems

Many researchers choose to reinvent the wheel and create their own notational sys-
tems when transcribing their data. Although this may be justifiable in some cases, it
is rarely necessary. There are, in fact, many disadvantages to this approach. An ad
hoc system is likely to be less comprehensive and may be employed inconsistently.
In addition, if examples are included in published research, the use of a new sys-
tem requires some mastery on the part of editors, reviewers, and readers. Finally,
a transcription key must be provided, which wastes the resource of journal space
(Dressler and Kreuz 2000).
The list of transcription systems provided here is not intended to be com-
prehensive; as noted above, many systems have been developed and used only
once. In addition, some well-known systems are not particularly comprehensive,
whereas others have been employed in relatively small geographic regions or for
only one language. Therefore, the goal in this review is to briefly describe the
systems that are (a) widely employed, (b) in current use or historically important,
and (c) reasonably comprehensive. Most of these systems have been developed to
transcribe prosodic and paralinguistic features; the transcription of extralinguistic
features, such as gesture and eye gaze, will be reviewed in section 5. Finally, each
transcription system is listed by name (if it has one), or by reference to researchers
and publication(s) that describe the system in detail.

4.1. Jeffersonian Transcription System, or Conversation Analysis

Gail Jefferson’s transcription system (Sacks, Schegloff and Jefferson 1974; Schen-
kein 1978; Atkinson and Heritage 1984; Jefferson 2002, 2004) has been widely
employed and refined over a forty-year period and has become a de facto standard
in the field of conversation analysis (often referred to as simply “CA”). The journal
Research on Language and Social Interaction, for example, uses the Jeffersonian
notation as its default transcription system.
In its 1978 formulation (Schenkein 1978), the Jeffersonian system provided
about 18 codes for tracking seven different categories of conversational phenomena:
simultaneous utterances, overlapping utterances, contiguous utterances, intervals
within and between utterances, characteristics of speech delivery, transcriptionist
doubt, and other transcript symbols. In a later formulation (Atkinson and Heritage
1984), about eight codes, such as shifts in intonation and quieter talk, were added,
as well as the extralinguistic dimensions of gaze direction and applause. More
recent formulations (e. g., Jefferson 2002) have introduced a handful of other codes
to mark slower speech or suppressed laughter. A helpful discussion of the issues
involved in employing this system may be found in Psathas and Anderson (1990).

4.2. Ochs
Following the lead of the seminal paper of Sacks et al. (1974), Elinor Ochs (1979)
proposed a transcription system for verbal and nonverbal features. For verbal fea-
tures, she proposed the coding of eight dimensions: utterance boundary, latching,
pause length, overlap, self-interruption, intonation or prosodic quality, audible
breathing, and metatranscription. Four additional dimensions were proposed for
nonverbal features: changes in gross motor activity, eye gaze, gestures, and body
orientation. In total, the system uses about 35 codes (for this and other systems,
an exact number is difficult to report, since some codes can be used iteratively: in
Ochs’ system, for example, lengthened syllables are indicated with a colon, and
additional colons can be used to indicate additional beats in time).
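Iterative codes of this kind are easy to process automatically. The Python sketch below assumes, as a simplification, that colons marking lengthening are attached directly to the lengthened sound, as in “so:::”; the function names are hypothetical.

```python
import re

# A run of one or more colons that directly follows a word character.
LENGTHENING = re.compile(r"(?<=\w):+")

def lengthening_beats(token: str) -> int:
    """Count the colons used to mark lengthening in a single token."""
    return sum(len(match.group(0)) for match in LENGTHENING.finditer(token))

def plain_form(token: str) -> str:
    """Strip the lengthening colons to recover the ordinary spelling."""
    return LENGTHENING.sub("", token)

beats = lengthening_beats("so:::")
spelling = plain_form("so:::")
```

Recovering the plain spelling matters for retrieval: a search for “so” in the raw transcript would otherwise miss every lengthened token.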

4.3. London-Lund Corpus transcription

The London-Lund Corpus of Spoken English (or LLC) (Svartvik and Quirk 1980;
Svartvik 1990) is of considerable historical importance. The project began in
London in 1959 and in Lund, Sweden, in 1975, and represents one of the first attempts to
gather a comprehensive corpus of spoken English. It was derived from the Survey
of English Usage (or SEU), a corpus which consists of a million words. Half of this
total was drawn from spoken English.
The LLC uses a transcription system that principally denotes prosodic fea-
tures. It stems from the “British School” of intonation analysis (see Kingdon 1958;
O’Connor and Arnold 1961). The “nucleus,” or main stressed syllable that has a
clearly perceptible movement of pitch, is divided into seven dimensions (e. g., fall,
level, fall-rise), and there are four codes for the “booster,” or range of pitch (e. g.,
higher than preceding syllable, very high). Stress is denoted as normal or heavy.
Pauses of varying durations, simultaneous talk, and laughter are also coded.
Although this system is not as comprehensive as some others, it has been
widely employed, and may be sufficient for researchers whose primary interest is
prosody.

4.4. Tannen
Deborah Tannen’s (1984/2005) work is well known in sociolinguistics, and her
system has frequently been employed by later researchers. Her system codes
for pauses, stress, pitch, intonation, vowel lengthening, and overlapping speech.
Amplitude is described using six codes drawn from musical notation (e. g., piano,
fortissimo), and appears under the transcription line. Brackets are used to demar-
cate paralinguistic or extralinguistic information (e. g., [laughter]). The system
uses about 30 codes altogether.

4.5. Discourse Transcription (Du Bois)

The Discourse Transcription system proposed by John Du Bois (1991; Du Bois
et al. 1992, 1993), and often referred to simply as DT, has been adopted by many
researchers and takes into account Du Bois’ design principles outlined above. The
system includes codes for pauses of various kinds, overlapping sequences, vocal
quality, and utterance boundaries. In addition, it includes several codes for prosody
(primarily accent and pitch), transcriber’s comments, and even “smile quality”.
The Discourse Transcription system uses about 40 codes.

4.6. HIAT (Ehlich)

HIAT, or Halbinterpretative Arbeitstranskriptionen (Heuristic Interpretative
Auditory Transcription; Ehlich 1993), represents a movement away from standard
orthography, which may lead to a loss of important information. A system of “lit-
erary transcription” is proposed instead. This system makes use of symbols from
the International Phonetic Alphabet and uses vertical space on the page to represent
simultaneous events, much like a musical score. The coding of intonation is rep-
resented in a similar way. The system excels at tracking multiparty conversations
and overlapping speech, although such transcriptions require a great deal of space.
Computer programs designed to facilitate transcription into the HIAT system are
available.

4.7. Gumperz and Berenz

Gumperz and Berenz (1993) approach transcription from a sociolinguistic perspec-
tive, emphasizing the situated interpretations of the conversational participants. As
they put it, “our main goal is to reveal the functioning of communicative signs in
the turn-by-turn interpretation of talk, not to record everything that can be heard or
to provide exact measures of duration and pitch” (Gumperz and Berenz 1993: 119).
The system uses about 22 codes and may be attractive to researchers who desire a
broader approach to transcription.

4.8. GAT (Selting et al.)

The Gesprächsanalytisches Transkriptionssystem (GAT) was created “to help
reduce the hitherto often unmotivated variation in transcripts” (Selting, Auer,
Barden, Bergmann, Couper-Kuhlen, Günthner, Meier, Quasthoff, Schlobinski and
Uhmann 1998: 91), and includes 14 dimensions and about 50 codes. Like Tannen’s,
this system provides many gradations for amplitude, using nomenclature derived
from musical notation (e. g., piano, forte, crescendo, and diminuendo).

4.9. CHILDES and CHAT (MacWhinney)

The Child Language Data Exchange System (CHILDES) was begun in 1981
in an attempt to gather together transcripts of child language (MacWhinney 2000).
The project, established by Brian MacWhinney, has grown and evolved over time
and now includes adult interactions in the TalkBank Project. Developments in
computer technology have greatly enhanced the utility of this resource for lan-
guage researchers. The transcripts themselves are freely available on the Internet
(http://talkbank.org/). In addition, tools for coding and analyzing these corpora
have been devised. Codes for the Human Analysis of Transcripts, or CHAT, is the
transcription system and coding format, and CLAN (for Computerized Language
Analysis) is the software tool developed to create and analyze CHILDES tran-
scripts. The current version of CHAT (MacWhinney 2000) provides researchers
with extensive sets of codes for use in transcription, and even accommodates other
notational schemes, such as the Jeffersonian Transcription System described in
section 4.1. However, the sophistication of this system may also be its principal
weakness because researchers may need to devote a considerable amount of time
and effort to mastering its intricacies.

4.10. Dressler and Kreuz

Dressler and Kreuz (2000) reviewed 24 papers employing transcription that
appeared in one journal (Discourse Processes) over a five-year period. They found
that 21 codes could accommodate the majority of the researchers’ dimensions, and
grouped these dimensions into five classes: intonation, temporal features, inten-
sity, breathing, and transcriber’s comments. Although this system employs some
higher-level categories, such as backchannel communication and paralinguistic
behavior, no attempt was made to include finer distinctions.

4.11. Powers
Powers (2005), an anthropologist, produced a transcription handbook to be used
by ethnographers. Not surprisingly, therefore, the focus is somewhat different than
for the other systems described here. Specifically, Powers’ system codes for a
smaller number of dimensions (about 18) and does not include notations for intona-
tion or breathing. On the other hand, this system explicitly accommodates a num-
ber of dimensions of paralinguistic and extralinguistic features, such as weeping,
reported speech, and irony.

5. Transcription of extralinguistic features

Communication in face-to-face dialogue is not limited to spoken words. Visible
actions such as facial expressions and gestures can serve to reinforce words and to
decrease ambiguity in interpretation.
Bavelas and Chovil (2000) argue that visible actions are only important when
they are part of a communication; for example, scratching one’s knee to emphasize
a conversation about a rash one had two days ago would be a communicative ges-
ture, but the same action during a conversation about the price of milk would not
be. The authors call these communicative actions “visible acts of meaning” (Bave-
las and Chovil 2000:165) and include among them facial displays such as eyebrow
raises, hand gestures such as circular motions to depict a circle, and communicative
body movements such as shrugging one’s shoulders.
Bavelas and Chovil (2000) outline four characteristics that define a visible
act of meaning: (1) the action must occur in face-to-face dialogue and be reduced
when the receiver of the action cannot see the action, (2) the action must stand
as a symbol for something that is not physically present at the moment, (3) the
meaning of the action must be expressed either in words or by a demonstration that
the receiver uses the information, and (4) the action must be integrated with the
spoken dialogue. The research questions at hand dictate how these visible actions
are transcribed. Facial expressions can be transcribed either as physical actions or
as meaning-based actions (Bavelas, Kenwood and Phillips 2002).
Several researchers have used Ekman and Friesen’s (1978) Facial Action Cod-
ing System (FACS), a transcription system based on physical actions that utilizes
44 “action units” such as “head turn right” and “lip stretcher,” several of which are
coded to varying degrees of intensity (for an evaluation of the system, see Sayette,
Cohen, Wertz, Perrott and Parrot 2001). Chovil (1989) developed a meaning-based
system to contrast with Ekman and Friesen’s (1978) physical transcription system.
Her system uses descriptions of the facial expression as a whole, such as
“sadness” and “skepticism.” Bavelas, Kenwood and Phillips (2002) argue that this
meaning-based approach may not only be more useful for discourse research, but
also less time-consuming for researchers, and indeed Chovil (1989) demonstrated
a higher interrater reliability than FACS. Some might argue, however, that a more
subjective system reduces validity.
These two extremes – musculature analysis and subjective ratings – may not
suit every researcher. For those looking for a middle ground, Louwerse et al.
(2007) devised an attractive alternative: they used a subset of Ekman, Friesen
and Hager’s (2002) Facial Action Coding System, coding just 20 facial movements
that were of interest for their research questions. Other researchers may wish to
employ this system or a different subset from Ekman et al. (2002), based on their
own particular research interests.

5.1. Coding of gesture

The transcription of gestures presents additional challenges to the discourse
researcher, because gestures occur simultaneously with talk and some means of
mapping the two in time must be considered (see Goldin-Meadow 2003). The fol-
lowing is a review of a subset of gestural coding systems.

5.1.1. Ochs

Elinor Ochs’ (1979) system, described in 4.2, includes five codes for gestures like
pointing, holding up, and offering.

5.1.2. Schegloff
In Emanuel Schegloff’s (1984) analysis of deictic gestures, he proposed indexing
hand and limb movements on a line above the transcription of the words being
uttered. The system utilizes eight codes, denoting, for example, the onset of move-
ment, maximum extension, and pointing, as well as temporal elements.

5.1.3. Bull
Peter Bull (1987, 1989) proposed a Body Movement Scoring System, in which
body contact and object contact are described in terms of (1) the body part making
the motion, (2) the type of motion, and (3) the body part or object with which con-
tact is made. One attractive aspect of this system is that its practitioners have been
able to achieve high interrater reliability (Bull and Connelly 1985).

5.1.4. Ehlich
The HIAT system (Ehlich 1993), described in 4.6 above, includes 25 codes for
referring to parts of the head, hands, arms, legs, and body.

5.1.5. CoGesT (Gut et al.)

CoGesT, or the Conversational Gesture Transcription system (Gut, Looks, Thies,
Trippel and Gibbon 2002), is an attempt to create a notational system based on
distinctions between categories of gestural form and function. The system makes
distinctions on a variety of dimensions, such as form, phase, location, and direc-
tionality. Specific examples include hand shapes, repetitions, and speed.

5.1.6. McNeill
Susan Duncan has developed a coding manual that has been employed by David
McNeill and his collaborators (McNeill 2005). She suggests making eight passes
through the interaction to be analyzed, and in addition to acoustic and prosodic
dimensions described earlier, adds the categories of handedness, hand orientation,
hand position, and phases (i. e., points in the gesture process).

5.2. Eye gaze

The eye gaze of interlocutors during an interaction can be pragmatically salient.
Speakers, for example, tend to establish eye contact with their partners at the end
of a turn (Levelt 1993). A number of the systems described in section 4 contain at
least some codes for eye gaze. An example would be the system proposed by Ochs
(1979), which provides six codes for looking up, down, left, right, and towards
and away from the camera. These codes can be paired with the person or object
being looked at. Damico and Simmons-Mackie (2002) have proposed a system in
which a layer of gaze and gesture information can be mapped onto a base layer of
broader transcription. This proposal is attractive because it allows extralinguistic
features to be represented separately from the prosodic and paralinguistic features
of discourse.

5.3. Body posture and orientation

Speakers and listeners rarely remain static during an interaction, although it may
be difficult to determine which body movements are pragmatically salient.
Most of the systems described earlier could accommodate such meaningful move-
ments as part of the transcriber’s comments. Ochs (1979) suggests using a U-shape
to indicate the direction of a speaker’s pelvis.

6. Child language transcription

Interpreting the language of adults is difficult enough, yet child language research-
ers must deal with all these issues and more. A good overview is provided by
Bloom (1993), who proposes a model system for the computer-aided transcription
of the speech of children. She highlights two issues in particular: the biases and
distortions that may be introduced by the observer, and the massive amount of
data reduction, from the recording of the interaction to the transcription process
itself.
The conversion of child language into forms that can be accessed electronically
has also been an issue in the transcription literature. Edwards (1992) proposed
four principles for the use of such “archived” data, which are similar to the design
principles discussed in section 3. However, one of her suggestions, the consistent
coding of the data, has been somewhat contentious. Edwards (1993a) noted that
the use of novel variations, such as “falld” and “falled,” might cause one or the
other to be overlooked in an electronic search for such instances. She argues that
this is an important issue because many forms used by children are rather rare. Her
concerns were expressed with regard to the early forms of CHAT, described above
(a discussion of these issues may be found in MacWhinney and Snow 1992).
The limitations of transcription also affect the accuracy of the transcription of
child speech, particularly as it relates to social rules. For example, children may
not follow the implicit turn-taking rules of conversation that adults do (Davidson
2010). These issues may lead to concerns about the validity or generalizability of
studies relying on such data. Davidson (2010) argues that a thorough understand-
ing of context, particularly aspects of social order that might be taken for granted
in adult discourse, is necessary for accurate transcription of child speech.
PhonBank, a shared corpus of child speech, and Phon, open-source software
that enables analysis of phonological elements of child speech, exist as an exten-
sion of CHILDES and use the transcription conventions of CHILDES. Descrip-
tions of PhonBank and Phon can be found in Rose and Stoel-Gammon (2015).

7. Signed language transcription

There are special concerns regarding transcription of sign language, since this
form of communication does not map precisely to spoken or written language.
These concerns make it difficult to construct a machine-readable corpus of
transcribed sign language using traditional transcription codes. Transcribing
sign language requires not only the same contextual information as verbal language
but also an accurate record of handshape, finger position, spatial location, position
of the non-dominant hand, and facial movements, to name a few. All of this is
time-consuming and very difficult, if not impossible, to transcribe using traditional
codes meant for verbal language. HamNoSys (Prillwitz and Zienert 1990) is a
popular transcription system that uses a font composed of various symbols to get
around the problem of attempting to transcribe sign language using a standard
alphabet.
The choice of a transcription system for sign language should be based on
the theoretical question at hand. The Berkeley Transcription System (Hoiting and
Slobin 2002), inspired in part by the CHILDES system (see section 4.9), codes
for morphological and semantic properties and has been widely used. Johnson and
Liddell (2010, 2011a, 2011b, 2012) introduced a transcription system that records
not just handshapes but also fine-grained information such as finger position, and
includes transitional movements between signs to promote the study of phonetics
within sign language.
Sign languages are composed not just of hand but also of facial movements, and
the Facial Action Coding System (Ekman and Friesen 1969) has been employed
to transcribe the facial movements of sign language users (e. g., Dachkovsky and
Sandler 2009). Tools such as ELAN (Wittenburg, Brugman, Russel, Klassmann
and Sloetjes 2006) and ANVIL (Bunt, Kipp and Petukhova 2012), which were
mentioned in section 2.7, as well as SignStream (Neidle 2002), are multimodal
transcription tools that are also used for sign language. Further advances in technology
may allow the construction of a transcription system that uses pixel coordinates to
determine hand position.

8. Transcribing cognitively impaired individuals

The difficulties involved in transcribing the interactions of adults and children
may pale in comparison to reproducing the productions of those with cognitive
impairments. Ball and Rahilly (2002) make some suggestions for transcribing the
prosodic features of disordered speech and propose a scheme that is similar to
the HIAT system reviewed in section 4.6 (Ehlich 1993). TalkBank (see 4.9) also
includes a section called Clinical Bank for the dissemination of transcriptions of
aphasic, dysfluent, and other forms of disordered speech.
Haravon, Obler and Sarno (1994) present a system for analyzing the discourse
of those with brain injury. They suggest that their approach has utility for studying
the productions of aphasics and those suffering from Alzheimer’s disease. Their
approach is notable in that it explicitly takes into consideration pragmatic issues
(in addition to morphology and syntax).
Müller and Guendouzi (2002) propose a multilayered approach in their system
for transcribing the discourse of Alzheimer’s patients. Specifically, they recom-
mend employing a baseline or orthographic layer, a layer addressing prosody and
voice issues, and a discourse layer. The codes they use are similar to those in other
transcription systems described above, but the multiple layers provide more clarity
and allow the reader a better chance of making sense of the disordered speech. This
approach is taken even further by Müller and Damico (2002), who propose six lay-
ers: in addition to the levels already described, they add gaze and gesture, speech
(phonetic transcription), and clinical analysis (analysis of specific behaviors).
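A layered transcript of this kind can be represented as a set of time-aligned annotations, each tagged with the layer it belongs to. The Python sketch below is one possible representation and is not taken from Müller and Guendouzi (2002) or Müller and Damico (2002); the class, the field names, the layer labels, and the data are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    layer: str    # hypothetical labels, e.g. "orthographic", "prosody"
    start: float  # seconds from the start of the recording
    end: float
    value: str

def overlapping(annotations, start, end, layers=None):
    """Return annotations that overlap the interval from start to end.

    If layers is given, only annotations on those layers are returned.
    """
    return [
        a for a in annotations
        if a.start < end and a.end > start
        and (layers is None or a.layer in layers)
    ]

# Invented data, for illustration only.
transcript = [
    Annotation("orthographic", 0.0, 1.2, "I went to the the shop"),
    Annotation("prosody", 0.8, 1.2, "falling intonation"),
    Annotation("gaze_gesture", 0.5, 2.0, "gaze at interlocutor"),
    Annotation("orthographic", 2.5, 3.0, "yesterday"),
]

first_turn = overlapping(transcript, 0.0, 1.2)
gaze_only = overlapping(transcript, 0.0, 3.0, layers={"gaze_gesture"})
```

Keeping the layers separate means that a reader, or a program, can consult the orthographic layer alone or add further layers as the research question requires.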

9. Critiques of transcription systems

Clearly, the most important attribute of a transcription system is the capability
to recreate an interaction with a high degree of fidelity. The degree to which this
fidelity is achieved will depend on many factors that exist outside of the system
being employed. The experience level and the care taken by the transcriber are
crucial, since even small errors can completely change the perceived meaning of
an utterance (Easton, McComish and Greenberg 2000). In addition, once an inter-
action has been transcribed according to one system, it may be difficult to transfer
it into a different system (Allwood et al. 2005).
In a series of empirical papers, Daniel O’Connell and Sabine Kowal have
explored a number of issues related to the validity, generalizability, and objectivity
of transcription systems in current use. Their pessimistic conclusion is that “tran-
scription itself is a limited and defective device” (O’Connell and Kowal 2008:93,
emphasis in the original). Although this gloomy assessment may seem overstated,
it is a conclusion that they have come to as the result of their research, which is
summarized below as a cautionary tale for the enterprising transcriber.
As a starting point, O’Connell and Kowal (1999) addressed the issue of stand-
ardization in transcription notation. In a review of three widely used transcription
systems, they found that a majority of the dimensions were used to transcribe pro-
sodic features, whereas codes for extralinguistic features made up between zero
and 22 % of the total for each system. Their conclusion, however, is that standard-
ization is not practical, or even warranted, given the diversity of behaviors that
researchers are interested in.
If standardization of transcription systems is not a realistic goal, then surely
at least the reproducibility of transcripts is achievable. However, O’Connell and
Kowal (2000) found that reproductions of transcripts in textbooks had, on average,
an error rate of one change per 6.6 syllables. They attribute this high error rate to
the density and relative unfamiliarity of transcription systems, which overload the
scholars and typesetters who reproduce the examples.
The idea of conceptual overload was further explored in a study by Romero,
O’Connell, and Kowal (2002). They asked undergraduate participants to reproduce
a 21-syllable question asked by a news reporter. Participants were assigned to a
variety of conditions in which they were provided with only the audio recording,
with an “ordinary” transcription (verbal features only), or with a transcript that had
been generated using one of three widely employed transcription systems in which
dimensions of prosodic features were explicitly coded. The participants’ task was
to reproduce the news reporter’s prosody as closely as possible. When the participants’
productions were compared to the original, only one of the three transcription
systems yielded reproductions that were better than those of participants
who heard only the original recording. In general, the participants found the prosodic
codes difficult to interpret.
Finally, O’Connell and Kowal (1995b), in their review of five of the tran-
scription systems mentioned above, conclude that all of these notational schemes
violate, to some degree, the seven design principles proposed in O’Connell and
Kowal (1994; see section 3).
It is also worth noting that the type of discourse can present considerable problems
for transcribers. Lindsay and O’Connell (1995) have shown that the fragmentary
nature of spontaneous speech (filled with incomplete sentences, hesitations,
and overlapping speech) can be particularly troublesome to transcribe because of
its complexity (see also Bucholtz 2007b).
Given the tedium of transcribing long stretches of video or audiotape, it should
come as no surprise that such tasks are frequently assigned to graduate or even
undergraduate students with little background in theories of discourse or training
in transcription. Some of the issues surrounding the use of such transcribers,
including their training, have been described by Tilley (2003) and by Davidson
(2009).
The fidelity of a given transcript to a particular notation system can be assessed
by comparing the work of two (or more) transcribers who have independently
applied the system to the same stretch of discourse. The measurements can range
from simple percentage agreement to more sophisticated statistics, such as
Cohen’s kappa, which corrects for chance agreement (Cohen 1960). A trade-off
exists between the number of dimensions employed by a particular transcription
system and interrater reliability: the more dimensions transcribers must code,
the harder it becomes for them to agree (for a more extended discussion, see
Roberts and Robinson 2004 and Stelma and Cameron 2007). It is worth noting that
some researchers have been critical of the quest to achieve high reliability, because
putative errors may in fact provide important information (Pye, Wilcox and Siren
1988).
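
As a rough illustration of the chance correction mentioned above, the following minimal Python sketch computes Cohen's kappa for two transcribers who have coded the same units with a single categorical dimension. Kappa is the observed agreement minus the agreement expected by chance, divided by one minus the chance agreement. The function name, the category labels and the ten coded units are invented for illustration and are not taken from any of the studies cited here.

```python
import math
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's (1960) kappa for two coders who labelled the same units."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must label the same non-empty set of units.")
    n = len(codes_a)
    # Observed agreement: proportion of units given the same code by both coders.
    p_observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: for each category, the product of the two coders'
    # marginal proportions, summed over all categories.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        # Kappa is undefined when both coders use one and the same category throughout.
        return math.nan
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical data: two transcribers code the final pitch movement of ten intonation units.
transcriber_a = ["rise", "fall", "fall", "level", "rise", "fall", "rise", "level", "fall", "fall"]
transcriber_b = ["rise", "fall", "level", "level", "rise", "fall", "fall", "level", "fall", "rise"]

kappa = cohens_kappa(transcriber_a, transcriber_b)
print(f"kappa = {kappa:.2f}")
```

In this invented example the transcribers agree on seven of the ten units (0.70), but an agreement of 0.35 would be expected by chance alone given how often each of them used each category, so kappa comes out at about 0.54. A system with more coded dimensions would require one such comparison per dimension.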

10. Conclusions

The range of issues and choices that confront the would-be discourse transcriber
may seem overwhelming. In reality, however, any research project involves a vari-
ety of choices and trade-offs, and viewed from this perspective, the selection of
a transcription system is no different from the choice of a statistical test. In both
cases, the ultimate goal is to illuminate the underlying systematicity that exists
within the data, and there may be a variety of legitimate ways to achieve this end.
Furthermore, even though transcription can be very labor intensive, it is possible
to find the process enjoyable (Bird 2005). It is our hope that the information we
have presented can provide guidance for those who wish to explore these issues
in greater depth.

Author note

This chapter is a revised and updated version of “The Transcription of Face-to-face
Interaction” by Roger Kreuz and Monica Riordan that appeared in Wolfram Bublitz
and Neal Norrick’s edited volume Foundations of Pragmatics (2011), pp. 657–679
(Berlin: Mouton de Gruyter). Comments and questions concerning this chapter
may be directed to Roger Kreuz (rkreuz@memphis.edu).

References

Allwood, Jens, Peter J. Henrichsen, Leif Grönqvist, Elisabeth Ahlsén and Magnus Gunnarsson
2005 Transliteration between spoken language corpora. Nordic Journal of Linguis-
tics 28: 5–36.
Atkinson, J. Maxwell and John Heritage
1984 Transcription notation. In: J. Maxwell Atkinson and John Heritage (eds.),
Structures of Social Action: Studies in Conversation Analysis, ix–xvi. Cambridge:
Cambridge University Press.
Ball, Martin J. and Joan Rahilly
2002 Transcribing disordered speech: The segmental and prosodic layers. Clinical
Linguistics and Phonetics 16: 329–344.
Bavelas, Janet B. and Nicole Chovil
2000 Visible acts of meaning: An integrated message model of language in face-to-
face dialogue. Journal of Language and Social Psychology 19: 163–194.
Bavelas, Janet B., Christine Kenwood and Bruce Phillips
2002 Discourse analysis. In: Mark L. Knapp and John A. Daly (eds.), Handbook of
Interpersonal Communication, Third ed., 102–129. Thousand Oaks, CA: Sage
Publications.
Bird, Cindy M.
2005 How I stopped dreading and learned to love transcription. Qualitative Inquiry
11: 226–248.
Bloom, Lois
1993 Transcription and coding for child language research: The parts are more than
the whole. In: Jane A. Edwards and Martin D. Lampert (eds.), Talking Data:
Transcription and Coding in Discourse Research, 149–166. Hillsdale, N.J.:
Lawrence Erlbaum Associates.
Bucholtz, Mary
2000 The politics of transcription. Journal of Pragmatics 32: 1439–1465.
Bucholtz, Mary
2007a Variation in transcription. Discourse Studies 9: 784–808.
Bucholtz, Mary
2007b Reply: Variability in transcribers. Discourse Studies 9: 837–842.
Bull, Peter E.
1987 Posture and Gesture. Oxford: Pergamon.
Bull, Peter E.
1989 Psychological approaches to transcription. In: Derek B. Roger and Peter E.
Bull (eds.), Conversation: An Interdisciplinary Approach, 150–165. Bristol:
Multilingual Matters.
Bull, Peter E. and Gerry Connelly
1985 Body movement and emphasis in speech. Journal of Nonverbal Behavior 9:
169–187.
Bunt, Harry, Michael Kipp and Volha Petukhova
2012 Using DiAML and ANVIL for multimodal dialogue annotation. In: Proceed-
ings of the Eighth International Conference on Language Resources and Eval-
uation (LREC), ELDA, Paris.
Chovil, Nicole
1989 Communicative functions of facial displays in conversation. Unpublished doc-
toral dissertation, Department of Psychology, University of Victoria, Victoria,
British Columbia, Canada.
Cohen, Jacob
1960 A coefficient of agreement for nominal scales. Educational and Psychological
Measurement 20: 37–46.
Clark, Herbert H.
1996 Using Language. Cambridge: Cambridge University Press.
Clark, Herbert H. and Jean E. Fox Tree
2002 Using uh and um in spontaneous speaking. Cognition 84: 73–111.
Cook, Guy
1995 Theoretical issues: Transcribing the untranscribable. In: Geoffrey A. Leech,
Greg Myers and Jenny Thomas (eds.), Spoken English on Computer: Tran-
scription, Mark-up and Application, 35–53. New York: Longman Publishing.
Dachkovsky, Svetlana and Wendy Sandler
2009 Visual intonation in the prosody of a sign language. Language and Speech 52:
287–314.
Damico, Jack S. and Nina N. Simmons-Mackie
2002 The base layer and the gaze/gesture layer of transcription. Clinical Linguistics
& Phonetics 16: 317–327.
Davidson, Christina
2009 Transcription: Imperatives for qualitative research. International Journal of
Qualitative Methods 8: 35–52.
Davidson, Christina
2010 Transcription matters: Transcribing talk and interaction to facilitate conversa-
tion analysis of the taken-for-granted in young children’s interactions. Journal
of Early Childhood Research 8: 115–131.
Dressler, Richard A. and Roger J. Kreuz
2000 Transcribing oral discourse: A survey and a model system. Discourse Pro-
cesses 29: 25–36.
Du Bois, John W.
1991 Transcription design principles for spoken discourse research. Pragmatics 1:
71–106.
Du Bois, John W.
2006 VoiceWalker: A discourse transcription utility. University of California
Regents.
Du Bois, John W., Susanna Cumming, Stephan Schuetze-Coburn and Danae Paolino
1992 Discourse transcription. Santa Barbara Papers in Linguistics 4: 1–225.
Du Bois, John W., Stephan Schuetze-Coburn, Susanna Cumming and Danae Paolino
1993 Outline of discourse transcription. In: Jane A. Edwards and Martin D. Lampert
(eds.), Talking Data: Transcription and Coding in Discourse Research, 45–89.
Hillsdale, N.J.: Lawrence Erlbaum Associates.
Easton, Kristen L., Judith Fry McComish and Rivka Greenberg
2000 Avoiding common pitfalls in qualitative data collection and transcription.
Qualitative Health Research 10: 703–707.
Edwards, Jane A.
1992 Computerized methods in child language research: Four principles for the use
of archived data. Journal of Child Language 19: 435–458.
Edwards, Jane A.
1993a Perfecting research techniques in an imperfect world: Response to MacWhinney
and Snow. Journal of Child Language 20: 209–216.
Edwards, Jane A.
1993b Principles and contrasting systems of discourse transcription. In: Jane A.
Edwards and Martin D. Lampert (eds.), Talking Data: Transcription and Cod-
ing in Discourse Research, 3–31. Hillsdale, N.J.: Lawrence Erlbaum Associ-
ates.
Edwards, Jane A.
1995 Principles and alternative systems in the transcription, coding, and mark-up
of spoken discourse. In: Geoffrey A. Leech, Greg Myers and Jenny Thomas
(eds.), Spoken English on Computer: Transcription, Mark-up and Application,
19–34. New York: Longman Publishing.
Ehlich, Konrad
1993 HIAT: A transcription system for discourse data. In: Jane A. Edwards and Mar-
tin D. Lampert (eds.), Talking Data: Transcription and Coding in Discourse
Research, 123–148. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Ekman, Paul and Wallace V. Friesen
1969 The repertoire of nonverbal behavior: Categories, origins, usage, and coding.
Semiotica 1: 49–68.
Ekman, Paul and Wallace V. Friesen
1978 Facial Action Coding System. Palo Alto: Consulting Psychologists Press.
Ekman, Paul, Wallace V. Friesen and Joseph C. Hager
2002 Facial Action Coding System. CD-ROM. Salt Lake City, U.T.: A Human Face.
Gee, James P.
2011 How to do Discourse Analysis: A Toolkit. London: Routledge.
Gee, James P. and Michael Handford (eds.)
2012 The Routledge Handbook of Discourse Analysis. London: Routledge.
Gibson, Will, Helena Webb and Dirk vom Lehn
2014 Analytic affordance: Transcripts as conventionalised systems in discourse
studies. Sociology 48: 780–794.
Goldin-Meadow, Susan
2003 Hearing Gesture: How Our Hands Help Us Think. Cambridge, M.A.: Harvard
University Press.
Green, Judith, Maria Franquiz and Carol Dixon
1997 The myth of the objective transcript: Transcribing as a situated act. TESOL
Quarterly 31: 172–176.
Gumperz, John J. and Norine Berenz
1993 Transcribing conversational exchanges. In: Jane A. Edwards and Martin
D. Lampert (eds.), Talking Data: Transcription and Coding in Discourse
Research, 91–121. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Gut, Ulrike, Karin Looks, Alexandra Thies, Thorsten Trippel and Dafydd Gibbon
2002 CoGesT: Conversational gesture transcription system version 1.0. University
of Bielefeld, Germany: Technical report 1.
Haravon, Anita, Loraine K. Obler and Martha T. Sarno
1994 A method for microanalysis of discourse in brain-damaged patients. In: Ron-
ald L. Bloom, Loraine K. Obler, Susan De Santi and Jonathan S. Ehrlich (eds.),
Discourse Analysis and Applications: Studies in Adult Clinical Populations,
47–80. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Ide, Nancy and James Pustejovsky (eds.)
2017 Handbook of Linguistic Annotation. Dordrecht: Springer.
Hammersley, Martyn
2010 Reproducing or constructing? Some questions about transcription in social
research. Qualitative Research 10: 553–569.
Hoiting, Nina and Dan I. Slobin
2002 Transcription as a tool for understanding: The Berkeley Transcription System
for sign language research (BTS). In: Gary Morgan and Bencie Woll (eds.),
Directions in Sign Language Acquisition, 55–75. Amsterdam: John Benjamins.
Jaffe, Alexandra
2007 Variability in transcription and the complexities of representation, authority
and voice. Discourse Studies 9: 831–836.
Jefferson, Gail
2002 Is “no” an acknowledgement token? Comparing American and British uses of
(+)/(-) tokens. Journal of Pragmatics 34: 1345–1383.
Jefferson, Gail
2004 Glossary of transcript symbols with an introduction. In: Gene H. Lerner (ed.),
Conversation Analysis: Studies from the First Generation, 13–31. Amsterdam:
John Benjamins.
Jenks, Christopher J.
2013 Working with transcripts: An abridged review of issues in transcription. Lan-
guage and Linguistics Compass 7: 251–261.
Johnson, Robert E. and Scott K. Liddell
2010 Towards a phonetic representation of signs: Sequentiality and contrast. Sign
Language Studies 11: 241–274.
Johnson, Robert E. and Scott K. Liddell
2011a A segmental framework for representing signs phonetically. Sign Language
Studies 11: 408–463.
Johnson, Robert E. and Scott K. Liddell
2011b Towards a phonetic representation of hand configuration: The fingers. Sign
Language Studies 12: 5–45.
Johnson, Robert E. and Scott K. Liddell
2012 Towards a phonetic representation of hand configuration: The thumb. Sign
Language Studies 12: 316–333.
Jones, Rodney H.
2011 Data collection and transcription in discourse analysis. In: Ken Hyland and
Brian Paltridge (eds.), The Bloomsbury Companion to Discourse Analysis,
9–21. London: Bloomsbury.
Kingdon, Roger
1958 The Groundwork of English Stress. London: Longman.
Kipp, Michael
2003 Anvil 4.0: Annotation of video and spoken language. University of the Saar-
land, Saarbrücken, Germany.
Kowal, Sabine and Daniel C. O’Connell
2004 The transcription of conversations. In: Uwe Flick, Ernst von Kardorff, Ines
Steinke and Bryan Jenner (eds.), A Companion to Qualitative Research: Par-
adigms, Theories, Methods, Practice and Context, 248–252. Thousand Oaks,
CA: Sage Publications.
Kyratzis, Amy
2001 Emotion talk in preschool same-sex friendship groups: Fluidity over time and
context. Early Education & Development 12: 359–392.
Leech, Geoffrey, Greg Myers and Jenny Thomas (eds.)
1995 Spoken English on Computer: Transcription, Mark-up and Application. New
York: Longman Publishing.
Levelt, Willem J. M.
1993 Speaking: From Intention to Articulation. Cambridge, M.A.: MIT Press.
Lindsay, Jean and Daniel C. O’Connell
1995 How do transcribers deal with audio recordings of spoken discourse? Journal
of Psycholinguistic Research 24: 101–115.
Louwerse, Max M., Nick Benesh, Mohammed E. Hoque, Patrick Jeuniaux, Gwyneth Lewis,
Jie Wu and Megan Zirnstein
2007 Multimodal communication in face-to-face computer-mediated conversations.
Proceedings of the 28th Annual Conference of the Cognitive Science Society,
1235–1240. Mahwah, N.J.: Lawrence Erlbaum Associates.
MacWhinney, Brian
n.d. TalkBank. Retrieved September 2017 from http://talkbank.org/.
MacWhinney, Brian
2000 The CHILDES Project: Tools for Analyzing Talk, Third Edition. Volume I: Tran-
scription Format and Programs. Mahwah, N.J.: Lawrence Erlbaum Associates.
MacWhinney, Brian
2001 From CHILDES to TalkBank. In: Margareta Almgren, A. Barreña, M. Ezeiz-
aberrena, I. Idiazabal and Brian MacWhinney (eds.), Research on Child Lan-
guage Acquisition, 17–34. Somerville, M.A.: Cascadilla.
MacWhinney, Brian and Catherine Snow
1992 The wheat and the chaff: Or four confusions regarding CHILDES. Journal of
Child Language 19: 459–471.
Mallinson, Christine, Becky Childs and Gerard Van Herk (eds.)
2013 Data Collection in Sociolinguistics: Methods and Applications. New York:
Routledge.
McNeill, David
2005 Appendix. In: David McNeill, Gesture & Thought, 259–288. Chicago: Univer-
sity of Chicago Press.
Meißner, Cordula and Adriana Slavcheva
2013 Review of EXMARaLDA. Language Documentation & Conservation 7:
31–40.
Mishler, Elliot G.
1991 Representing discourse: The rhetoric of transcription. Journal of Narrative
and Life History 1: 255–280.
Müller, Nicole and Jack S. Damico
2002 A transcription toolkit: Theoretical and clinical considerations. Clinical Lin-
guistics and Phonetics 16: 299–316.
Müller, Nicole and Jacqueline A. Guendouzi
2002 Transcribing discourse: Interactions with Alzheimer’s disease. Clinical Lin-
guistics and Phonetics 16: 345–359.
Neidle, Carol
2001 SignStream: A database tool for research on visual-gestural language. In: Brita
Bergman, Penny Boyes-Braem, Thomas Hanke and Elena Pizzuto (eds.), Sign
Language and Linguistics 4: 203–214.
Neidle, Carol, Stan Sclaroff and Vassilis Athitsos
2001 SignStream: A tool for linguistic and computer vision research on visual-ges-
tural language data. Behavior Research Methods, Instruments, & Computers
33: 311–320.
New York Times
1974 The White House Transcripts: Submission of Recorded Presidential Conver-
sations to the Committee on the Judiciary of the House of Representatives by
President Richard Nixon. New York: Viking.
Norris, Sigrid
2004 Analyzing Multimodal Interaction: A Methodological Framework. New York:
Routledge.
Novick, David G., Lisa Walton and Karen Ward
1996 Contribution graphs in multiparty conversations. In: Proceedings of the Inter-
national Symposium on Spoken Dialogue (ISSD-96), 53–56. Philadelphia, PA.
Ochs, Elinor
1979 Transcription as theory. In: Elinor Ochs and Bambi B. Schieffelin (eds.),
Developmental Pragmatics, 43–72. New York: Academic Press.
O’Connell, Daniel C. and Sabine Kowal
1994 Some current transcription systems for spoken discourse: A critical analysis.
Pragmatics 4: 81–107.
O’Connell, Daniel C. and Sabine Kowal
1995a Basic principles of transcription. In: Jonathan A. Smith, Rom Harré and Luk
Van Langenhove (eds.), Rethinking Methods in Psychology, 93–105. Thousand
Oaks, CA: Sage Publications.
O’Connell, Daniel C. and Sabine Kowal
1995b Transcription systems for spoken discourse. In: Jef Verschueren, Jan-Ola Öst-
man and Jan Blommaert (eds.), Handbook of Pragmatics, 646–656. Amster-
dam: John Benjamins.
O’Connell, Daniel C. and Sabine Kowal
1999 Transcription and the issue of standardization. Journal of Psycholinguistic
Research 28: 103–120.
O’Connell, Daniel C. and Sabine Kowal
2000 Are transcripts reproducible? Pragmatics 10: 247–269.
O’Connell, Daniel C. and Sabine Kowal
2008 Communicating with One Another: Toward a Psychology of Spontaneous Spo-
ken Discourse. New York: Springer.
O’Connor, Joseph D. and Gordon F. Arnold
1961 Intonation of Colloquial English. London: Longman.
Paulus, Trena M. and Jessica N. Lester
2016 ATLAS.ti for conversation and discourse analysis studies. International Jour-
nal of Social Research Methodology 19: 405–428.
Powers, Willow R.
2005 Transcription Techniques for the Spoken Word. Oxford: AltaMira Press.
Preston, Dennis R.
1982 ‘Ritin’ fowklower daun ‘rong: Folklorists’ failures in phonology. Journal of
American Folklore 95: 304–326.
Prillwitz, Siegmund and Heiko Zienert
1990 Hamburg notation system for sign language: Development of a sign writing
with computer applications. In: Siegmund Prillwitz and Tomas Vollhaber (eds.),
Current Trends in European Sign Language Research, Vol. 9, 355–379. Ham-
burg: Signum Press.
Psathas, George and Timothy Anderson
1990 The ‘practices’ of transcription in conversation analysis. Semiotica 78: 75–99.
Podesva, Robert J. and Devyani Sharma (eds.)
2013 Research Methods in Linguistics. Cambridge: Cambridge University Press.
Pullum, Geoffrey K. and William A. Ladusaw
1996 Phonetic Symbol Guide, Second ed. Chicago: University of Chicago Press.
Pye, Clifton A., Kim A. Wilcox and Kathleen A. Siren
1988 Refining transcription: The significance of transcriber ‘errors’. Journal of
Child Language 15: 17–37.
Roberts, Celia
1997 Transcribing talk: Issues of representation. TESOL Quarterly 31: 167–172.
Roberts, Felicia and Jeffrey D. Robinson
2004 Interobserver agreement on first-stage conversation analytic transcription.
Human Communication Research 30: 376–410.
Romero, Catherine, Daniel C. O’Connell and Sabine Kowal
2002 Notation systems for transcription: An empirical investigation. Journal of Psy-
cholinguistic Research 31: 619–631.
Rose, Yvan and Carol Stoel-Gammon
2015 Using PhonBank and Phon in studies of phonological development and disor-
ders. Clinical Linguistics & Phonetics 29: 686–700.
Sacks, Harvey, Emanuel A. Schegloff and Gail Jefferson
1974 A simplest systematics for the organization of turn-taking for conversation.
Language 50: 696–735.
Sayette, Michael A., Jeffrey F. Cohn, Joan M. Wertz, Michael A. Perrott and Dominic J.
Parrott
2001 A psychometric evaluation of the Facial Action Coding System for assessing
spontaneous facial expression. Journal of Nonverbal Behavior 25: 167–185.
Schegloff, Emanuel A.
1984 On some gestures’ relation to talk. In: J. Maxwell Atkinson and John Heritage
(eds.), Structures of Social Action: Studies in Conversation Analysis, 266–298.
Cambridge: Cambridge University Press.
Schenkein, Jim
1978 Explanation of transcript notation. In: Jim Schenkein (ed.), Studies in the
Organization of Conversational Interaction, xi–xvi. New York: Academic
Press.
Schiffrin, Deborah
1987 Discourse Markers. Cambridge: Cambridge University Press.
Searle, John
1975 Indirect speech acts. In: Peter Cole and Jerry L. Morgan (eds.), Syntax and
Semantics, Volume 3: Speech Acts, 59–82. New York: Academic Press.
Selting, Margret, Peter Auer, Birgit Barden, Jörg Bergmann, Elizabeth Couper-Kuhlen,
Susanne Günthner, Christoph Meier, Uta Quasthoff, Peter Schlobinski and
Susanne Uhmann
1998 Gesprächsanalytisches Transkriptionssystem (GAT). Linguistische Berichte
173: 91–122.
Shriberg, Lawrence D. and Gregory L. Lof
1991 Reliability studies in broad and narrow phonetic transcription. Clinical Lin-
guistics and Phonetics 5: 225–279.
Silver, Christina and Ann Lewins
2014 Using Software in Qualitative Research: A Step-by-Step Guide, Second ed.
Los Angeles: SAGE.
Skukauskaite, Audra
2012 Transparency in transcribing: Making visible theoretical bases impacting
knowledge construction from open-ended interview records. Forum Qualita-
tive Sozialforschung/Forum: Qualitative Social Research 13: article 14.
Stelma, Juurd H. and Lynne J. Cameron
2007 Intonation units in spoken interaction: Developing transcription skills. Text &
Talk 27: 361–393.
Svartvik, Jan (ed.)
1990 The London-Lund Corpus of Spoken English: Description and Research (Lund Studies
in English 82). Lund: Lund University Press.
Svartvik, Jan and Randolph Quirk
1980 A Corpus of English Conversation (Lund Studies in English 56). Lund: Lund
University Press.
Tannen, Deborah
1984/2005 Conversational Style: Analyzing Talk Among Friends. Oxford: Oxford
University Press.
Tilley, Susan A.
2003 “Challenging” research practices: Turning a critical lens on the work of tran-
scription. Qualitative Inquiry 9: 750–773.
Widodo, Handoyo P.
2014 Methodological considerations in interview data transcription. International
Journal of Innovation in English Language Teaching and Research 3: 101–
107.
Wittenburg, Peter, Hennie Brugman, Albert Russel, Alex Klassmann and Han Sloetjes
2006 ELAN: A professional framework for multimodality research. In: Proceedings
of the 5th International Conference on Language Resources and Evaluation
(LREC 2006), 1556–1559. ELRA (European Language Resources Associa-
tion).
Woods, David K.
2007 Transana. Retrieved September 2017 from http://www.transana.com.

II. Introspectional pragmatics

4. Introduction to part 2: Introspectional pragmatics
Wolfram Bublitz

It is the general objective of this handbook to provide a comprehensive and systematic
overview of the different ways of doing pragmatics, i.e. of the range of methodological
approaches to the description of data which is based on verbal and non-verbal
(prosodic, kinesic and, mutatis mutandis, signed) communicative occurrences.
Data, of course, is not phenomenologically given but conceptually construed, i.e.
entirely method- and theory-dependent. This truism aligns with the basic premise
of this handbook series (as spelt out in the Preface heading this volume) that prag-
matics is a research perspective (on how language is used to intentionally mean
and purposefully act in social contexts) which manifests itself in different theories
and methods that determine their descriptive objects in different ways. Pragmati-
cists, accordingly, must choose between various methods of defining, procuring and
analysing their data. The most prevalent methods, covered in this handbook, rest
on (deductive) introspection and intuition (part 2), experimentation and elicitation
(including laboratory and ethnographic field work) (part 3), (inductive) observation
(part 4) and corpus exploration (part 5). Unlike the competing methods, whose data
is empirically gathered and verified, introspectional pragmatics deals with data that
arises from the individual pragmaticist’s intuitive knowledge of language and how
to put it to use. Thus, the viability of introspectional data rests essentially on deductive
reasoning, which may on occasion be supplemented with abductive findings as
the result of accidental, non-systematic observation.¹

¹ We need to stress that the term “introspection” (or “introspective method”) is
potentially misleading in that it is used in different senses in different fields of study.
Klaus P. Schneider (pers. com.) points out that while it is widely employed in Relevance
Theory and related approaches to refer to the fabrication of examples and their intuition-based
analysis (cf. below), in cognitive psychology and (applied) psycholinguistics,
the term “introspection” (as opposed to “retrospection”) carries quite a different
sense. In these disciplines, “introspection” refers to experimental methods involving
thinking-aloud and protocol analysis with groups of ordinary, i.e. non-expert, language
users. For work in this field, cf. studies by psychologists K. Anders Ericsson and Herbert
A. Simon from the 1980s and 1990s (particularly two pivotal articles published in
1980 and 1993) or by applied linguists Claus Færch and Gabriele Kasper (e.g. their
volume Introspection in Second Language Research). The difference between these
two concepts of “introspection” is relevant, not least because researchers in Experimental
Pragmatics adopt experimental methods from psychology and psycholinguistics
and consider them markedly different from, and in fact superior to, “introspection” in
Relevance Theory. Cf. Part III of this handbook for the use of “introspection” in the
psychological sense in other traditions in pragmatics.

https://doi.org/10.1515/9783110424928-004
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 123–131. Berlin/
Boston: De Gruyter Mouton.

Arguably, introspection is the earliest method used in pragmalinguistic research,
favoured in particular by scholars (mostly with a background in ordinary language
philosophy) working on speech acts, presupposition and inference. By contrast, the study of
usage-bound phenomena such as indexicals required right from the beginning close
observation of actually produced language in authentic speech situations.² Thus,
pragmatic research is ab initio characterised by a critical methodological divide
between introspection and observation. At about the same time (i.e. approximately
in the 1950s and 1960s) that John L. Austin, John R. Searle and other speech act
theoreticians based their findings on the introspective interpretation of fabricated
examples in likewise fabricated potential contexts, other pragmatically oriented
linguists relied for their description of categories such as tense, mood, modality
and deictic reference on empirical evidence from the actual contexts in which they
occurred. Eventually, the increasing awareness of the need to supplement or rather
align introspectional with observational methodology

led to the pragmatic turn and the institutional establishment of linguistic pragmatics
in the ’70s, linguistics and philosophy moving from the analytically oriented study of
de-contextualizable regularities of language toward the empirical study of the contextualized
use of language […] [and allowing] the convergence of pragmatics with the
empirical social sciences, which are traditionally concerned with actions and interpretations
in context. (Koyama 2011: 139–140)

A methodological shift (as well as a conceptual widening of the field) took place in
the 1980s, when pragmatics linked up with the emerging interactional paradigms
in sociology (in general) and ethnomethodology (in particular). This methodological
repositioning had a considerable bearing on the redetermination of such
pragmatically sensitive concepts as context (from a static and autonomous to a
dynamic and collaborative concept which is cognitively as well as situationally
and socio-culturally much more refined) and speech act (from a unilateral act, as
advocated by pioneers like John L. Austin, to a bi- or, in some types of interactive
computer-mediated genres, multi-lateral “inter-act”). In hindsight, it is clearly
perceivable that with the advent of modern computer technology (which fostered
fast and easy compilation of large text corpora together with the development of
sophisticated software for their analysis), introspection as the dominant method of
pragmatics lost its significance and was replaced by empirical, usage-based modes
of gathering and analysing data.

² The significance of indexicality for semiotics in general had been acknowledged by
Charles Sanders Peirce and for pragmatics in particular by Yehoshua Bar-Hillel (1954
and 1971); cf. Koyama (2011: 141) and, for an authoritative overview of deixis and
indexicality, Hanks (2011), who justifies the central role of the study of deixis in pragmatics
by pointing out that “deictic systems define points of intersection between linguistic
structure and the social settings in which speech takes place” (2011: 315).

The second part of this handbook, on methods of introspectional pragmatics, covers
philosophical and cognitive approaches. In its opening chapter, chapter 5
(“Philosophical pragmatics”), Marina Sbisà sets the scene by critically
discussing in which way previous research into (varieties of) speech act theory,
models of interpretive inference and the multifaceted concept of context has contributed
to the methodology of pragmatics. As her guideline, she chooses a definition
of pragmatics put forward by Charles Travis not long ago:

Pragmatics […] is the study of properties of words which depend on their having been
spoken, or reacted to, in a certain way, or in certain conditions, or in the way, or conditions,
they were. (Travis 2000: 87)

Sbisà paraphrases Travis’s view of pragmatics as the study of “the ways in which
language is used by speakers in contexts” and claims that this “matches the oldest
definition of pragmatics, that of the pragmatist philosopher Charles Morris (1938)”
(this volume). Such an equation, however, needs to be taken cum grano salis. Morris’s
understanding of pragmatics has to be considered against the backdrop of a
general theory of signs. In his renowned triangular model of semiotics (which is
still widely adopted, though mostly in slightly restricted form; cf. the caveat in
Nöth 2011: 167), pragmatics features as one of three domains:

Sign behavior, according to Morris, involves three main factors: “that which acts as a
sign [the sign vehicle], that which the sign refers to [the designatum], and that effect
on some interpreter in virtue of which the thing in question is a sign to that interpreter”
[the interpretant] (1938: 3). Based on this triad, Morris (1938: 6–7) defines semiotics
as a field of study of the following three domains corresponding to three well-known
branches of modern linguistics: syntax (or syntactics), the study of the relation between
sign vehicles, semantics, the study of the relations between sign vehicles and their
designata, and pragmatics, the study of the relation between sign vehicles and their
interpreters […]. (Nöth 2011: 167)

Morris drew upon a similar proposal by the sign theoretician Charles Sanders Peirce,
whose concept of the linguistic sign is characterized by an irreducibly triadic relation
between the sign itself and the object (which can be another sign) it represents by
way of a mediation between the two, i. e. by way of interpretatively relating them.3

3
In Peirce’s own words, the semiotic triangle can be described in this way: “A sign, or
representamen, is something which stands to somebody for something in some respect
or capacity. It addresses somebody, that is, it creates in the mind of that person an
equivalent sign, or perhaps a more developed sign. That sign which it creates I call the
interpretant of the first sign. The sign stands for something, its object. It stands for that
object, not in all respects, but in reference to a sort of idea, which I have sometimes
called the ground of the representamen.” (Peirce 1955: 99)
126 Wolfram Bublitz

While for Morris (as well as for Peirce) pragmatics is thus essentially concerned
with the relation between the linguistic sign and its interpreting user, Travis designs
a much broader scope of pragmatics by relating the linguistic sign to the ways and
conditions of its actual use. His central claim is that in pragmatics, linguistic expres-
sions (their forms, meanings and functions) are studied and explained solely in
relation to the particular contexts and situations in which a speaker or writer, hearer
or reader actually uses them, the focal issue thus being their “occasion-sensitivity”,
to use a key notion featuring prominently in his work (cf. e. g. 2000: passim).
Against the backdrop of Travis’s broad definition of pragmatics, Sbisà works
out the methodological implications of research by Austin and other speech act
theorists, by Grice and those who modified his inference theory, as well as by
Stalnaker and others who investigated indexicals, presuppositions and the role of
context and common ground. Claiming that a theory which describes speaking as
acting inevitably has methodological implications, Sbisà in section 2 outlines and
critically evaluates Austin’s speech act theory. She argues that these implications
concern conditions of language use such as the essential distinction between the
four “kinds of uses” (Sbisà, this volume) locution, illocution, perlocution and aeti-
olation (with illocution as the central concept in Austin’s theory) and the “actual
execution of an illocutionary procedure and, if there are any, of its flaws generating
inappropriateness or infelicity” (Sbisà, this volume). Rounding off her account of
the implications of Austin’s theory for the methodology of pragmatics, she draws
the reader’s attention to the significance Austin assigned to ordinary language as a
means to gain “access to philosophical knowledge” (Sbisà, this volume). Section
3 is devoted to Paul Grice and the central role he plays in research on inferen-
tial meaning (which in its impact she judges to be comparable to Austin’s role in
speech act theory). Sbisà gives unreserved credit to Grice’s contribution to “reach
pacific coexistence” (this volume) of “logic and conversation” (thus the title of his
most influential study) as the two opposing mainstream approaches to meaning,
i. e. of truth-conditions based analysis of meaning in semantics (in Gricean terms
“what is said”), on the one hand, and analysis “of whatever else is meant or done
in speaking” (“what is meant / what is implicated”) in pragmatics, on the other
hand. Sbisà argues that Grice by relating meaning emerging in cooperative inter-
action to the interactants’ intentions manages to set pragmatics clearly apart from
semantics. Ultimately, this achievement helped to promote pragmatics as a field of
study in its own right, as did two other significant traits of his theory: rationality
and argumentation. Rationality, the defining feature of the Cooperative Principle,
must be taken as a “means-end relationship” in that “it is rational to make as much
sense as possible of one’s interlocutor’s conversational contributions, since this
can make the conversation more profitable” (Sbisà, this volume). And it is due to
rationality that observing the Cooperative Principle and its maxims is an ordinary,
but nonetheless optional and entirely context-dependent act:
the level at which the criteria of adequateness for quantity, quality, relation and manner
of information are set in individual cases, as aspects of speaker’s cooperativity, depends
on the context of the conversation and particularly on its “accepted purpose or direction”
(1989: 26). [This, she argues, is] different from the way in which the Principle
of Relevance is dealt with in Relevance Theory. (Sbisà, this volume; cf. also Clark,
chapter 7, this volume)

According to Grice, there is a tight connection between interactants behaving in a
rational way and argumentatively accounting for such behavior, which also applies
to the analyst, who is likewise expected to argumentatively track the creation of
meaning in context. Sbisà makes the methodological implication quite clear:
if what you are doing is […] the analysis of some discourse or conversation, or of some
recurrent fact in the use of language (such as e. g. the generation of a certain implicit
meaning), then the meaning you assign to the discourse or conversation, or to the kind of
utterance analyzed, should not be assigned without a reason, but be backed by argumen-
tation. This stance may guide the analyst in her setting limits to her interpretive activity,
against the temptations of so-called infinite semiosis […] and of deconstruction. (Sbisà,
this volume)
In section 4, Sbisà turns back to speech act theory or rather to the broad spectrum
of speech act theories developed in the wake of Austin’s original model. Of these
she singles out two exemplary ones with different philosophical and methodo-
logical implications, viz. John R. Searle’s philosophical and Kent Bach and R.M.
Harnish’s inferentialist and internalist approach. While the latter owes much to
Grice’s inference theory in that it regards the speech act as the expression of a com-
municative intention which the hearer needs to infer, the former is well-known for
its conformities with Austin’s original, but even more so for its differences. While
they agree on the structure of the speech act, their theories differ fundamentally as
to their ontological quality. Searle introduces rule-relatedness as an essential pre-
condition for the metamorphosis of any kind of verbal behaviour into the perfor-
mance of purposeful speech acts. Such rules that turn behaving into acting he calls
constitutive rules; they define the conditions under which an utterance X (“Have
you got a watch?”) counts as the speech act Y (request to tell me the time). Sbisà
draws our attention to the drawback that while his “rule-governed approach is elegant
in theory”, it nonetheless “does not yield a plausible picture of actual verbal
interaction” (this volume). Section 5 is devoted to Robert Stalnaker’s pragmatic
research into the context-dependency of meaning culminating in two main concepts
of description, viz. common ground (i. e. the participants’ mutual and continuously
developing and changing contextual beliefs) and pragmatic presupposition (i. e.
presupposition of a speaker, not a verbal item, which transmits known information
but may, on occasion, also convey new information). Sbisà discusses the implica-
tions for pragmatic research of both concepts, focusing in particular on the dynam-
ics they help to create in context. In the final section 6 (preceding her concluding
remarks) she turns to more philosophical pragmatic studies of context-dependency,
which range from mainstream pragmatism by, for instance, Travis (2000), to what
she calls “radical contextualism” by Recanati (2004, 2010).
In the following chapter 6 (“Research methodology in classical and neo-Gri-
cean pragmatics”), Yan Huang surveys major research methodologies used in clas-
sical and neo-Gricean pragmatics to deal with different ways of collecting, classi-
fying and analysing data. A characteristic feature of his overview is that it is both a
careful linguistic stocktaking of some of the respective models (including Grice’s
original theory and Huang’s own neo-Gricean pragmatic theory of anaphora) and
a philosophical discussion of the methodological principles of falsifiability and
reduction versus expansion of a theory.
Section 2 opens with a careful reconsideration of Grice’s classical principle
of cooperation (together with its concomitant maxims of conversation), which
Huang relates to the Gricean theories of non-natural (or intention-reflecting)
speaker meaning and conversational implicature, before turning to neo-Gricean
pragmatics, singling out Horn’s bipartite and Levinson’s tripartite theories. The
former, in which Grice’s maxims of conversation are reduced to just two basic
principles of Quantity and Relation, is challenged by Levinson on the grounds that
it fails to distinguish between “semantic minimization” and “expression minimi-
zation”, thus failing to set pragmatic principles that determine the linguistic form
of an utterance clearly apart from those that govern its content. Referring back
to Horn’s proposal and employing a plethora of enlightening examples, Huang
examines Levinson’s alternative theory. Sections 3 and 4 are devoted to two crucial
concepts of pragmatic methodology, viz. introspection as the dominant methodo-
logical principle in classical as well as neo-Gricean pragmatics and falsifiability as
an indispensable criterion of empirically-based science. Again basing his argumen-
tation on a great number of supporting examples, Huang discusses the advantages
of introspection in this field of pragmatics over other research methodologies and
also its disadvantages, which, however, are quite negligible in his view. Falsifia-
bility (in Karl Popper’s sense) is an obvious feature of classical and neo-Gricean
pragmatics whose findings can be tested for their truth or falsity, as Huang takes
pains to demonstrate, drawing on evidence from lexical narrowing and, in the pro-
cess, refuting counterexamples put forward in post-Gricean pragmatics.
Using the example of Grice’s Cooperative Principle with its accompanying
maxims, Huang considers (in section 5) two other methodological principles, viz.
reduction versus expansion, which can both be observed in the numerous attempts
at “improving” Grice’s theory. As mentioned above, Horn and Levinson, inter
alia, preferred the reductionist option, while Leech (1983), for instance, advanced
an alternative theory with an expanded number of principles and maxims. Both
options are critically evaluated from a methodological point of view, before Huang
turns to his final topic in section 6. At considerable length, he presents his version
of a neo-Gricean pragmatic theory of anaphora and binding (in which linguistic
characteristics are compared and contrasted in a great number of typologically
different languages) to demonstrate the advantages of such an alternative methodological
procedure.
The second part of this handbook on philosophical and cognitive methods of
introspectional pragmatics closes with Billy Clark’s chapter 7 on “Cognitive prag-
matics: Relevance-theoretic methodology”. It provides a systematic overview of
the early introspection-based (following Grice) as well as the more recent observation-based
cognitive methods of analysis in relevance theory, including techniques
developed in what came to be known as “experimental pragmatics” (cf. below
and, inter alia, Gibbs, this volume).
The structure of Clark’s chapter follows his claim that “the story of relevance
theory” is best accounted for by dividing it into three phases. In the first phase
(which he titles “Introspection”), the early protagonists of relevance theory, notably
Sperber and Wilson, “were engaged in demonstrating that pragmatic theories were
possible at all. There was an assumption that the domain of pragmatic inference
was so wide that it was not amenable to systematic study” (Clark, this volume).
In doing so, they naturally followed Grice in using (their own) intuitions as their
principal method of collecting data and providing possible and thus, in accord-
ance with Grice’s theory, logical interpretations. In great detail (and aided by a
wide range of instructive examples), Clark discusses the early relevance theorists’
examination and critique of Grice’s theory, which they based on a useful “combi-
nation of intuitions, logical argumentation and appeals to theoretical simplicity”
(this volume). The second phase of relevance theory is characterised by the rise of
experimental relevance-theoretic work (not in lieu of but alongside introspectional
analysis), due to a growing concern among relevance theorists (and pragmaticists
in general) about the “perceived limitations of reliance on introspective data” (this
volume). In the respective section titled “Experiments”, Clark outlines and criti-
cally evaluates the assemblage of diverse experimental methods developed in the
1990s (with a notable increase since the 2000s), which use, inter alia, evidence
from questionnaires, reading and response times, electroencephalography (EEG),
functional magnetic resonance imaging (fMRI) and eye-tracking technology. This
is followed by an elaborate and insightful report of the various ways in which such
experimental work has been applied in “clinical work, developmental pragmatics,
language acquisition, first and second language learning and teaching, and sty-
listics”, and has thus developed into a “standard way of investigating theoretical
claims” (this volume) in pragmatics.
Even though “introspection and experiment are still the most commonly used
methods in relevance-theoretic work”, there are other methods whose application
“might soon be seen as a third phase” (Clark, this volume) in the methodology
characterising current studies in relevance theory, which has become increasingly
eclectic. In the section on “Other methods”, Clark, adopting a very critical stance,
explores such eclecticism. In particular, he rejects the tendency to value experi-
mental methods higher than others on several grounds. He then proceeds to discuss
several key studies using alternative methods which are neither introspective
nor experimental but observational and corpus-based. The chapter is rounded off
with concluding comments on the usefulness of the different kinds of methods
presented and an outlook on possible future developments.

References

Bar-Hillel, Yehoshua
1954 Indexical expressions. Mind 63: 359–376.
Bar-Hillel, Yehoshua
1971 Out of the pragmatic wastebasket. Linguistic Inquiry 2: 401–407.
Bublitz, Wolfram and Neal Norrick
2011 Introduction: The burgeoning field of pragmatics. In: Wolfram Bublitz and
Neal Norrick (eds.). Foundations of Pragmatics. (Handbooks of Pragmatics
1.), 1–20. Berlin/Boston: de Gruyter Mouton.
Ericsson, K. Anders and Herbert A. Simon
1980 Verbal reports as data. Psychological Review 87: 215–251.
Ericsson, K. Anders and Herbert A. Simon
1993 Protocol Analysis: Verbal Reports as Data (revised edition). Cambridge, MA:
MIT Press.
Færch, Claus and Gabriele Kasper (eds.).
1987 Introspection in Second Language Research. Clevedon: Multilingual Matters.
Grice, Herbert Paul
1975 Logic and conversation. In: Peter Cole and Jerry L. Morgan (eds.). Syntax and
Semantics. Vol. 3: Speech Acts, 41–58. New York: Academic Press.
Hanks, William F.
2011 Deixis and indexicality. In: Wolfram Bublitz and Neal Norrick (eds.). Foun-
dations of Pragmatics. (Handbooks of Pragmatics 1.), 315–346. Berlin: de
Gruyter Mouton.
Koyama, Wataru
2011 The rise of pragmatics: A historiographic overview. In: Wolfram Bublitz and
Neal Norrick (eds.). Foundations of Pragmatics. (Handbooks of Pragmat-
ics 1.), 139–166. Berlin: de Gruyter Mouton.
Levinson, Stephen C.
1983 Pragmatics. Cambridge: Cambridge University Press.
Morris, Charles W.
1970 [1938] Foundations of the Theory of Signs (Foundations of the Unity of Science:
Towards an International Encyclopedia of Unified Science, vol. 1.2). Chi-
cago: University Press, 1970. First published in: Otto Neurath, Rudolf Carnap
and Charles W. Morris (eds.), International Encyclopedia of Unified Science,
77–138. Chicago: University Press.
Nöth, Winfried
2011 Semiotic foundations of pragmatics. In: Wolfram Bublitz and Neal Norrick
(eds.). Foundations of Pragmatics. (Handbooks of Pragmatics 1.), 167–202.
Berlin: de Gruyter Mouton.
Peirce, Charles Sanders
1955 [1902] Logic as semiotic: The theory of signs. In: Philosophical Writings, 98–119.
New York: Dover.
Recanati, François
2004 Literal Meaning. Cambridge: Cambridge University Press.
Recanati, François
2010 Truth-conditional Pragmatics. Oxford: Oxford University Press.
Sperber, Dan and Deirdre Wilson
1995 [1986] Relevance: Communication and Cognition. Second edition. Oxford:
Wiley-Blackwell.
Travis, Charles
2000 Unshadowed Thought: Representation in Thought and Language. Cambridge,
MA: Harvard University Press.
5. Philosophical pragmatics
Marina Sbisà

Abstract: This paper deals with the contributions that have been made by philos-
ophers to the methodological aspects of pragmatics (considered as the study of the
use of language by speakers in contexts). Among those contributions, there are the
implications of Austin’s conception of the speech act for the analysis of “uses of
language” and of conversational interaction, as well as the implications of Grice’s
conceptions of meaning and conversational cooperativity for the delimitation of
“communication” and for the role of argumentation in meaning attribution. The
paper also deals with various possible implications of speech act theory as reformulated
by Searle and by Bach and Harnish and discusses the philosophical notions
of pragmatic presupposition, common ground, and context-dependency, indicating
some ways in which they can be made relevant to the analysis of discourse.

1. Introduction

Methodological issues play a role in philosophy insofar as a philosophical approach
characterizes itself metaphilosophically, that is, authors working in its framework
devote some part of their reflections to questions about the nature of philosophy
and the task of the philosopher. Although at least one of the main authors in philosophical
pragmatics, namely John L. Austin, had precise and controversial metaphilosophical
views, I will not focus on metaphilosophical issues in this paper. I
will focus, instead, on the contributions that have been made by authors working
in philosophical pragmatics to the methodological aspects, not of philosophy, but
of pragmatics itself. For the purposes of this paper I characterize pragmatics, following
Charles Travis (himself a philosopher), as follows:
Pragmatics […] is the study of properties of words which depend on their having been
spoken, or reacted to, in a certain way, or in certain conditions, or in the way, or condi-
tions, they were. (Travis 1997: 87)

Tackling these issues, indeed, involves taking into consideration the ways in which
language is used by speakers in contexts: Travis’s characterization, therefore,
matches the oldest definition of pragmatics, that of the pragmatist philosopher
Charles Morris, as “the science of the relation of signs to their interpreters” (1938:
30). Austin, Grice, Searle and other speech act theorists, Stalnaker, and the philosophical
tradition studying indexicals or debating contextualism, all made important
theoretical contributions to these issues, with direct or indirect implications
for empirical research. In the next sections of this paper I will consider some of
these contributions and attempt to squeeze out of them their (actual or potential)
methodological implications for pragmatic research.

https://doi.org/10.1515/9783110424928-005
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 133–153. Berlin/
Boston: De Gruyter Mouton.

2. John L. Austin

Austin’s most relevant contribution to philosophical pragmatics is his outline of
speech act theory, to be found in his posthumously published How to Do Things
with Words ([1962] 1975). Albeit sketchy and here and there incomplete, it has a lot
of implications for pragmatic issues, beyond the obvious fact that it launched (or
contributed to launching) some of the key notions used in pragmatics since then, such
as speech act, force, presupposition (leaving aside more specifically Austinian terminology).
With his outline of speech act theory, Austin proposes to study speech as action
and to do so in three main respects, which he identified as the locutionary, the
illocutionary, and the perlocutionary act. The proposal to study speech as action has
philosophical implications, but, from the point of view of pragmatic research, is lit-
tle more than programmatic: just one more reason to develop the discipline besides,
and beyond, syntax and semantics. But the concepts put forward to back the claim
that speech should be studied as action have also methodological implications, since
they can direct the attention of scholars toward certain facts of language use and ver-
bal interaction and suggest a number of distinctions to be made in their description.

2.1. Austin’s distinctions

What are the “uses of language”? Scholars interested in how language is “used”
should not gather all ways and senses of “use of language” in one and the same
heterogeneous list (as suggested by Wittgenstein 1953: § 23). And perhaps they
should not rely on the assumption that the uses to be distinguished have a simple
one-to-one correspondence with a list of different features to be observed in the
speech situation, as in Jakobson ([1960] 1981: 21–27).
According to Austin, to study uses of language we have first to distinguish kinds
of uses of language, that is, locution, illocution, perlocution, and aetiolation. The
definitions of these terms are well known: I summarize them as follows (but see, on
this topic, Sbisà 2013: 25–37, and for the notion of illocutionary procedure, Sbisà
forthcoming): locution is the production of words in conformity to a language and
with a meaning which may consist of sense and reference; illocution is the execution
of a procedure comprising the utterance of words, which brings about a conventional
effect; perlocution is the production, by means of the utterance, of non-conventional
effects or consequences; aetiolation or “non serious” speech is the framing
of speech so as to suspend some of its goals, effects, or implications. The distinction
is grounded in an analysis of kinds of doings and of the ways we report them.
Is then meaning a “use of language”, an effect of using language in context?
Austin insists that even if there may be a sense in which locutionary meaning is
precisely this, it must be kept distinct from the illocutionary and perlocutionary
senses of “use of language”. He therefore rejects philosophical approaches reduc-
ing meaning to illocutionary force – as some ordinary language philosophers such
as Hart (1949) and Strawson (1949) were apparently keen to do, and as is now done,
in a highly sophisticated way, in the inferentialist semantics of Brandom (1994) –
or even to perlocutionary efficacy (as in the tradition of pragmatism).
Austin focuses upon illocution in particular. Illocutionary acts are, for him, the
kind of acts that can be performed by means of performative utterances, that is, in
issuing sentences with the verb in the first person present indicative active that do
not describe the speaker as doing something, but make explicit what the speaker
is doing in uttering them and, therefore, contribute to the performance of the act
they mention. But illocutionary acts can also be performed in uttering sentences
of various other forms. Perlocutionary acts are distinct from illocutionary acts
because their effects are a matter of causally affecting the psychological states or
actual behaviour of some of the participants in the interactional situation, while the
effects of illocutionary acts, according to Austin at least, are conventional. Since
Austin, while exemplifying this conventionality mainly by citing cases of ritual or
institutional illocutionary acts, raises the claim of the conventionality of illocution
in general terms, those willing to stick to his intuition are left the task of showing
what kind of states of affairs the “conventional effects” of illocution consist in.
This can be done, for example, by finding a suitable way in which to describe them.
So I have described these effects as bearing upon the deontic statuses associated
with each participant (rights, obligations, licenses, commitments, and so on) (Sbisà
1984, 2001, 2002). Austin’s main contribution to the description of the kinds of
procedures comprising the utterance of words and bringing about conventional
effects, the execution of which amounts to the performance of illocutionary acts, is
to be found in his outline of a classification of illocutionary acts (1975: 151–163).

2.2. How to attribute illocutionary force to an utterance

Austin also has suggestions about the ways in which we attribute illocutionary
force to utterances. He studied explicit performatives, of the form “I promise you
to …” or “I warn you that …”. Here the verb of the main clause designates an
illocutionary act and makes it explicit that the whole utterance has the correspond-
ing illocutionary force. But, more interestingly, there are words or expressions,
or syntactic forms, whose main function is to indicate illocutionary force: they
include sentence type, mood, modal verbs and adverbs, evaluative words, dis-
course connectives (Austin 1975: 73–76). So, on the one side, an analyst who
wants to make sense of what the participants in an interaction were doing should,
first of all, observe illocutionary force-indicating devices such as mood or sentence
type, modal verbs, evaluative words, and certain adverbs or connectives.
On the other side, a linguist trying to determine what these words or expressions,
or syntactic forms, mean, has to consider that they are not used merely, or are
not used at all, as descriptive or referential devices, but at least in part as force-
indicators.
A further task (supposing that descriptions of illocutionary procedures and their
expected effects are already available, see above, section 2.1) is the description of
the actual execution of an illocutionary procedure and, if there are any, of its flaws
generating inappropriateness or infelicity. The achievement of the expected effect
depends not only on the correctness of the performance and the appropriateness
of circumstances, but also on whether the speaker has made what she was doing
clear enough to enable the audience to recognize the act performed. So the analysis
of illocutionary acts in an interactional episode must include consideration of the
ways in which the speaker secures the audience’s uptake, or in which the achieved
securing of uptake is made manifest by the audience.

2.3. The role of ordinary language

When analyzing discourse, illocutionary force attribution can in the first place
avail itself of the illocutionary lexicon of the natural language in which the dis-
course is formulated. This is the positive recommendation we can gather from
Austin’s insistence on performative verbs as designating illocutionary acts (1975:
150): it is indeed their gamut (their semantic field) that offers the most nuanced
potentialities to the description of what is done in speaking. Of course, this is only
to begin with. Then we have to ask ourselves what exactly the verb we have intu-
itively chosen to describe the function of a certain conversational contribution, or
discourse move, means. Why did we choose it? What intentions and other attitudes
are we attributing to the speaker, what conditions are we presupposing that speaker
and speech situation satisfy, how is the speech situation consequently reshaped?
Here, theory comes in: we may go on in different directions, with different results,
depending on the speech act-theoretical model we are following (see section 2.1,
and below, sections 4.1 and 4.2). But the first step should exploit ordinary language
and not hurriedly apply simplified theoretical concepts, if we do not want to miss
the subtler aspects of what is happening.
For Austin, of course, quite beyond the remark I have just made (which is
commonsense enough and should be uncontroversial), the appeal to ordinary lan-
guage has a metaphilosophical significance. Austin is convinced that our access
to philosophical knowledge, if there is such a thing, is through accurate use and
reflection upon the accurate use of ordinary language. So philosophers clean up
their linguistic tools and, hopefully, gather insights and suggestions from the subtle
distinctions that the wisdom of centuries has embodied in the lexicon of the natural
language they speak (see Austin 1979: 181–189).

3. Paul Grice

If Austin set the basis for speech act theory, Paul Grice, known among scholars
in pragmatics above all for his theory of implicature, tackled a broad range of
issues relevant to pragmatic research and can be credited with a decisive move
that greatly stimulated its development. By the way, it should be noted that he
did not conceive of his work as being concerned with “pragmatics”: he was con-
cerned with meaning (the subject matter of semantics) and the attempt to trace even
word meaning or sentence meaning down to their roots in the speaker’s intentions
and hearer-directed activity. However, the opposition he discusses and attempts
to resolve between “logic” and “conversation” became the basis of what has been
for decades the mainstream way of distinguishing semantics from pragmatics:
truth-conditions based analysis of language on the one side, study of whatever else
is meant or done in speaking on the other side. In the title of his famous lectures,
Logic and Conversation (Grice 1989: 3–143), “logic” stands for truth-conditional
meaning, while “conversation” introduces meaning as it emerges in ordinary sit-
uations, that is, within a basically cooperative verbal interaction. From ordinary
language philosophy a pragmatic view of language was emerging that appeared to
be squarely opposed to the truth-conditions based analysis, claiming for example,
with Austin (1975), that the felicity or infelicity assessment takes priority with
respect to truth and falsity or, with Strawson (1950), that there can be truth-value
gaps. Grice showed how to reach pacific coexistence and relative autonomy, and
this undoubtedly created an environment in which it was easier for the new field
of pragmatics to develop. Although philosophers now question the precise way in which
he drew the line between semantics and pragmatics (or, in Gricean terms, between
“what is said” and “what is implicated”), there is still a wide consensus that some
such line has to be drawn. Even philosophers who intend to shift that line
considerably (as, for example, Recanati, who claims that the truth-conditional meaning of
utterances itself belongs to pragmatics because it is determined contextually, see
below, section 6) agree that there need not be any clash between pragmatic research
and truth-conditions based research.
I will turn now to sketching some Gricean themes of methodological import.

3.1. What is communication?

Grice’s view of meaning as a complex, open intention of the speaker (1989: 213–
223) contrasts with views of communication such as that of the so-called “prag-
matics of communication” (Watzlawick et al. 1967). Watzlawick has famously
claimed that one cannot not communicate: we communicate something all the time
(with words, but also e. g. with the way we utter our words, our gaze, our bodily
position, and the like) even if we do not intend to. In a Gricean perspective, we
need not deny that a lot of interior states and attitudes transpire from speech and
138 Marina Sbisà

its accompanying behaviour. But Grice distinguishes “natural meaning”, which is
the case when some fact works as an index or symptom of some other fact, from
“non-natural meaning” which is restricted to those cases in which it is possible, or
reasonable, to ascribe a complex meaning-intention to the speaker. The structure
of the relevant meaning-intention comprises the core intention of eliciting a certain
response in the audience, plus the intention that the core intention be recognized by
the audience and the intention that the audience be led to respond in the designed
way at least in part in virtue of their recognition of the speaker’s core intention.
If anything makes the ascription of such a complex intention impossible, as when
contextual knowledge makes it clear that the speaker cannot have that intention or
that the recognition of the speaker’s core intention does not contribute to its ful-
filment, we should not say the speaker expresses the corresponding meaning. So
Grice’s view of meaning urges us to keep distinct, on the one hand, recognition of
meaning (communication proper, in which the informative content grasped by the
hearer is meant by the speaker) and, on the other hand, inference from symptoms
(by which a hearer or observer may grasp informative content that the speaker did
not mean). For example, we have communication proper if I confess that I am anx-
ious or use words that presuppose or implicate that I am anxious, but not if anxiety
merely transpires from my words, apparently against my will.
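
The three-layered structure of the meaning-intention described above can be sketched as a simple check. This is only an illustrative toy under stated assumptions, not Grice's own formalization: the class `Utterance`, its three fields and the function `is_non_natural_meaning` are hypothetical names introduced here, and the question of how an analyst decides whether an intention is ascribable is left entirely open.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Hypothetical record of what can reasonably be ascribed to a speaker."""
    intends_response: bool         # core intention: elicit a certain response
    intends_recognition: bool      # intends the core intention to be recognized
    recognition_is_a_reason: bool  # intends recognition to help bring about the response


def is_non_natural_meaning(u: Utterance) -> bool:
    """True only if all three layers of the meaning-intention are ascribable."""
    return (u.intends_response
            and u.intends_recognition
            and u.recognition_is_a_reason)


# Confessing "I am anxious": all three intentions can be ascribed.
confession = Utterance(True, True, True)
# Anxiety merely transpiring from one's words: no such intention is ascribable.
symptom = Utterance(False, False, False)

print(is_non_natural_meaning(confession))  # True
print(is_non_natural_meaning(symptom))     # False
```

In the second case a hearer may still infer that the speaker is anxious, but on this view that is inference from symptoms rather than communication proper.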

3.2. How do we understand discourse?

Grice suggests that the fullest comprehension of an utterance or discourse comes
when we take it to be a cooperative contribution to some conversation. In this way
we understand not merely its truth-conditional meaning, but also a larger halo of
assumptions that, although belonging to what the speaker means, are not explicitly
specified in the sentences she utters. This is why the Cooperative Principle is held
to be “rational”: under an instrumental conception of rationality, the rationality of
the means-end relationship, it is rational to make as much sense as possible of one’s
interlocutor’s conversational contributions, since this can make the conversation
more profitable (Grice 1989: 29–30). Disbelief or objections may come later, once
the full meaning is grasped. Scholars in pragmatics who are fascinated by the Gri-
cean Cooperative Principle should bear in mind that it has neither the normative
force of a law nor of a politeness rule: it is not something that someone, or perhaps
the social group itself to which we belong, has decided and imposes on us. Nor
can it be assimilated to a constitutive rule for conversation. It is true that if we do
not assume that our interlocutor follows the Cooperative Principle, we are likely
to have little conversation with him, but this is a natural consequence of not tak-
ing the attitude that would make listening fruitful. While in cognitive pragmatics,
since Sperber and Wilson ([1986] 1995), the rules or principles that reformulate
the Cooperative Principle are not optional (they are meant to be natural features
of our minds), it is a peculiarity of the Gricean, philosophical way of conceiving
of the Cooperative Principle that, although it remains the best reference point for
all our understandings and interpretations, we are not forced to adopt it in any
circumstance and with any interlocutor. This makes the adoption of the Coopera-
tive Principle in one’s relationship to a certain speaker a significant, albeit quite
ordinary, move.
Another peculiarity of the Gricean, philosophical conception of conversation
is that the content of the maxims specifying the Cooperative Principle is not fixed,
but context-relative. That is, the audience expects from the cooperative speaker
information adequate in quantity, quality, relation and manner to their current
needs or interests. There is no requirement to say “all”, to choose the most inform-
ative sentence available, to be crystal-clear and explicit on everything. In Grice’s
view, the level at which the criteria of adequacy for quantity, quality, relation
and manner of information are set in individual cases, as aspects of speaker’s
cooperativity, depends on the context of the conversation and particularly on its
“accepted purpose or direction” (1989: 26). This too appears to mark a difference
from the way in which the Principle of Relevance is dealt with in Relevance The-
ory (Sperber and Wilson 1995; see Clark, this volume, chapter 7).

3.3. The role of argumentation

Grice illustrates some ways in which we can back our attributions of implicit mean-
ing: conventional implicature and the various kinds of conversational implicature,
which differ as to the structure of the inferential route, of which the recognition
of the implicature is the conclusion. But he does not claim that our processes of
comprehension actually function in the ways he describes. Grice’s theory may
suggest, and did in fact suggest, hypotheses about language processing, and this
task has been resumed, in the interdisciplinary field of cognitive pragmatics, by
neo-Gricean and post-Gricean theories (see Clark on Cognitive Pragmatics, chap-
ter 7, and Schneider on Experimentational Pragmatics, chapter 8, this volume). But
if Grice is not necessarily concerned with actual processes of comprehension, what
is, in his philosophy, the point of distinguishing saying from implicating, conven-
tional from conversational implicature, generalized from particularized implica-
ture? Grice thinks that it is typical of humans to be able to give reasons for what
they do: this may even be turned into a defining property of the human “person”
(Grice 1991: 84–90, 118). If this holds also in the field of speech production and
comprehension (and why should it not?), then Grice has a reason, internal to his
philosophy, for explaining meaning attribution, whenever possible, by making it
rely upon some argumentative activity. It is interesting to notice that Grice, when
it is under scrutiny whether the nature of a certain implicature is conversational
or conventional, favors the former solution (in which the implicature is backed by
an inferential route) and admits of the latter solution only if special justification
is provided (1989: 39). The methodological implication is clear: if what you are
doing is not research on language processing (which has other methods and crite-
ria), but the analysis of some discourse or conversation, or of some recurrent fact
in the use of language (such as e. g. the generation of a certain implicit meaning),
then the meaning you assign to the discourse or conversation, or to the kind of
utterance analyzed, should not be assigned without a reason, but be backed by
argumentation. This stance may guide the analyst in her setting limits to her inter-
pretive activity, against the temptations of so-called infinite semiosis (the infinite
chain of interpretations that arises from each sign, inspired by Peirce [1932] 1960:
156, 169–170), and of deconstruction.

4. Speech act theory

Several varieties of speech-act theoretical views, explicitly stated as theory or elaborated
in connection with research on some concrete pragmatic phenomenon, have
come into existence as a response to, a continuation of, or a redressing of Austin’s initial
outline. Here I cannot examine them all. But since two main models can be recog-
nized, which have different philosophical implications and may convey different
methodological messages to scholars in pragmatics, I will devote some consider-
ations to their main representatives, the philosophy of language of John R. Searle
and the inferentialism of Kent Bach and R.M. Harnish.

4.1. John Searle

As is well known, the fortunes of speech act theory largely depended on the publi-
cation of Searle’s volume Speech Acts in 1969. It is thanks to Searle’s work, clear,
systematic, captivating, that speech acts actually became a focus of attention for
many linguists, sociolinguists, literary theorists, besides (obviously) philosophers.
Later on, Searle continued to contribute to speech act theory, but also discussed
other philosophical topics, especially in the philosophy of mind. I will limit my
considerations to some aspects of his philosophy of language.
Searle’s approach to speech acts emphasizes the need to make principled dis-
tinctions, criticizing Austin for the ordinary-language driven flexibility he some-
times displays. Searle invites extreme explicitness in the analysis of illocutionary
acts, as well as in that of institutional facts. His main instruments are explicitly
stated, purportedly exhaustive sets of constitutive rules, that is, rules that must be
followed in order for one’s utterance to count as the performance of a certain act
(1969: 33–42). Constitutive rules, in turn, make counts-as rules possible, since
when something has the properties required by certain constitutive rules, it counts
as the act (or event or entity) that is so constituted. Searle uses counts-as rules in his
account of institutional facts in particular (1969: 51–52, 1995). This rule-governed
approach is elegant in theory, but does not yield a plausible picture of actual verbal
interaction (do speakers pause to check whether each constitutive rule pertinent to
what they intend to do, is satisfied?). The abstract character of this kind of speech
act theory is underscored when it comes to “illocutionary logic”, where forces are
septuples of properties defining the illocutionary act, and the notion of illocution-
ary entailment (Searle and Vanderveken 1985: 129–137) presupposes that there
are cases in which the speaker who performs a certain illocutionary act must also,
in the same utterance, be performing another (which will be the act illocutionarily
entailed by the illocutionary act she overtly performs). Searle’s account of social
reality, however, has the merit of showing how much linguistic-pragmatic work
lies beneath the many institutional facts that usually impose themselves on us as
ready-made.
Searle’s classification of illocutionary acts is based upon differences in illocu-
tionary point, combined with differences in direction of fit between language and
world, and expressed psychological state (Searle 1979: 12). Illocutionary points are
presented as universal and natural, being rooted in varieties of human intentionality
(Searle 1979: 29; 1983). This classification, its theoretical virtues notwithstanding,
does not yield particularly insightful results when applied to discourse analysis.
Illocutionary force attribution is based on the recognition of illocutionary force
indicators, especially mood or sentence type (Searle assumes a sharp distinction
between propositional indicators and illocutionary force indicators). But classing,
for example, almost all utterances of declarative sentences as assertives and all
utterances of imperative sentences as directives does not by itself say much about
the properties of the discourse or discourse genre under scrutiny. Results become
richer if resort is made to Gricean inferences, which enable hearers to assign “indi-
rect” illocutionary forces (Searle 1979: 30–57). The notion of indirect illocutionary
act balances the rigidity of the definitions of illocutionary classes and in general
of (types of) illocutionary acts in Searle’s theory. The pragmatic facts it highlights,
moreover, are worth consideration also beyond Searle’s theoretical framework,
as is shown by the fact that it has inspired much work in politeness theory (since
Brown and Levinson 1987; see in the field of the empirical analysis of speech acts,
Blum-Kulka et al. 1989) and in psycholinguistics (e. g. Gibbs 1979, 1986).
Searle is also to be credited with introducing “degrees of strength” of illo-
cutionary force (1979: 5; Searle and Vanderveken 1985: 15, 19). He illustrates
differences in degree of strength by giving examples such as the contrast between
“I solemnly swear that Bill stole the money” and “I guess Bill stole the money”.
Similar contrasts, no longer between explicit performatives (utterances of the form
“I V that p”, where “V” is a performative verb), but between utterances whose
forces are indicated by other lexical or syntactic indicators or by textual strategies,
constitute an important and widespread phenomenon that has been addressed by a
wide pragmatic literature under names such as “mitigation” and “reinforcement”
(see among others Caffi 2007, Fraser 1980, Holmes 1984, Sbisà 2001).

4.2. Kent Bach and R.M. Harnish

The speech act theory of Kent Bach and R.M. Harnish (1979), which owes much to
Grice’s philosophy of language, is inferentialist and internalist. For Bach and Har-
nish, the speech act is basically the expression of a communicative intention, and
hearers grasp the expressed communicative intention by means of inferences. They
focus on the task of reconstructing the processes by which a hearer comes to grasp
the speaker’s communicative intention, and elaborate a “Speech Act Schema”
illustrating these processes step by step. So goes the story (slightly simplified):
from hearing S utter e, H infers “S is uttering e”; from the fact that S is uttering e
(plus the assumption that there is a common understanding of the same language,
plus other salient contextual information), H infers “S means … by e” and then “S
is saying that (… p …)”, and then “S, if speaking literally, is F-ing that p” (where
“F” stands for a verb designating the literal illocutionary force of the uttered sen-
tence). From this last step H can proceed by resort to the Communicative Presump-
tion (the mutual belief of the members of the linguistic community that whenever
one of them S says something in L to another member H, she is doing so with
some recognizable illocutionary intent) and further mutual contextual beliefs, to
the assignment to S’s utterance either of a direct literal force, or of a literally
based indirect force (if S could not be merely performing the direct illocutionary
act), or of a direct non-literal force (if S could not, under the circumstances, be
performing the direct illocutionary act), or of a nonliterally based indirect force (if
S could not be performing an act with the direct non-literal force) (see Bach and
Harnish 1979: 3–37, 84–93). This reconstruction of the grasping of the speaker’s
communicative intention is, of course, theoretical and should be considered either
a merely conjectural account, or an idealized rationalization. With respect to the
cognitive-psychological study of language processing, the theoretical claims of
philosophers may appear at most as hypotheses in need of further elaboration and
testing. However, it is conceivable for a scholar engaged in discourse analysis to
be inspired by Bach and Harnish’s inferentialist proposal, in the attempt to make
explicit all the steps leading to understanding or misunderstanding a speaker or
even, all the steps ideally involved in discourse interpretation.
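
The final step of the schema, in which one of four kinds of force is assigned, can be sketched as a small decision procedure. This is a simplified reading made under stated assumptions: the function `assign_force` and its three Boolean parameters are hypothetical, and they stand in for judgments that, in Bach and Harnish's account, the hearer reaches on the basis of the Communicative Presumption and mutual contextual beliefs.

```python
def assign_force(literal_possible: bool,
                 literal_sufficient: bool,
                 nonliteral_sufficient: bool) -> str:
    """Toy decision procedure for the last step of the schema.

    literal_possible:      could S, under the circumstances, be F-ing that p literally?
    literal_sufficient:    could S be merely performing the literal act?
    nonliteral_sufficient: could S be merely performing the non-literal act?
    """
    if literal_possible:
        if literal_sufficient:
            return "direct literal"
        return "literally based indirect"
    if nonliteral_sufficient:
        return "direct non-literal"
    return "nonliterally based indirect"


# Hypothetical inputs, one per outcome (not examples taken from Bach and Harnish).
print(assign_force(True, True, False))    # direct literal
print(assign_force(True, False, False))   # literally based indirect
print(assign_force(False, False, True))   # direct non-literal
print(assign_force(False, False, False))  # nonliterally based indirect
```

Like the schema itself, such a sketch is an idealized rationalization: it makes the branching explicit without claiming that hearers actually run through it.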
The speech act-theoretical inferentialism of Bach and Harnish connects to other
kinds of inferentialism such as those of neo-Griceans and of Relevance theory.
However, the latter two trends in pragmatic research do not belong primarily to
philosophical pragmatics, but to cognitive pragmatics. It is, moreover, to be noted
that Bach and Harnish do not treat all speech acts according to one and the same
model, but following Strawson (1964), reserve separate treatment to “conven-
tional” illocutionary acts. With respect to research in pragmatics, this amounts to
a suggestion to separate the analysis of ritual, ceremonial, or institutional events
involving the use of language from the analysis of discourse and conversation.
This in turn amounts to giving up the project (implicit in Austin and endorsed by
Searle in his own way) of illocution as a flexible and truly transversal conceptual
tool across all kinds of speech situations.
The inferentialism of Bach and Harnish fits well with what may be called their
internalism (Harnish 2009). Both communicative intentions and beliefs about the
speaker’s communicative intention that are the effect of successful communicative
illocutionary acts are internal states of the participants’ minds. So what is high-
lighted is the relationship of speech acts to the expression and dynamics of mental
states.

5. Robert Stalnaker

Among the philosophers working within a formal-semantic paradigm that have
reflected upon issues belonging to pragmatics, a salient place is to be given to
Robert Stalnaker, in consideration of his fidelity to pragmatic themes throughout
his decades-long philosophical and logical research. He was among the first phi-
losophers who realized that what an utterance actually means cannot avoid being
context-dependent and gave rigorous theoretical form to this intuition in a paper
entitled “Pragmatics” in 1970 (Stalnaker 1999: 31–46). He became increasingly
concerned with the description of discourse dynamics and with the role that context
plays in it (1999: 96–113; 2014). He also launched the notion of pragmatic
presupposition (1999: 47–62), which became the standard conception of presupposition
referred to in pragmatic research.

5.1. Context as common ground

Stalnaker maintains that any conversational exchange takes place on the back-
ground of a body of information that is not merely believed by the participants
to be true, but also believed by each participant to be believed to be true by the
others. This shared and believed to be shared body of information is the context
of that conversation or its “common ground” (Stalnaker 1999: 99; 2002): any dis-
course-dynamic phenomenon occurs within it and thanks to it. Indeed, the common
ground is not a static entity, but changes as the conversation goes on. Every speech
act adds something to the common ground: the speech act of assertion, for exam-
ple, adds to the common ground the proposition that is the content of the assertion
(Stalnaker 1999: 78–95).
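
The effect of assertion on the common ground can be illustrated with a toy model. The sketch assumes, with Stalnaker, that propositions are modelled as sets of possible worlds, and represents the common ground by the set of worlds compatible with everything in it. The four worlds, the two atomic facts and the function names are hypothetical choices made for illustration only.

```python
from itertools import product

# Each toy world fixes the truth value of two atomic facts: (raining, cold).
WORLDS = frozenset(product([True, False], repeat=2))


def proposition(test):
    """A proposition is the set of worlds in which it is true."""
    return frozenset(w for w in WORLDS if test(w))


raining = proposition(lambda w: w[0])
cold = proposition(lambda w: w[1])


def update_with_assertion(context_set, prop):
    """An accepted assertion keeps only the worlds where its content is true."""
    return context_set & prop


context = WORLDS                                  # nothing is taken for granted yet
context = update_with_assertion(context, raining)
print(len(context))                               # 2
context = update_with_assertion(context, cold)
print(context)                                    # frozenset({(True, True)})
```

Each accepted assertion shrinks the set of live possibilities, which is one way of picturing the claim that the common ground changes as the conversation goes on.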
Conceiving of the context in the way Stalnaker does yields at least two impli-
cations for research in pragmatics. First, it suggests that the significance of what
participants in a conversation say to each other cannot be grasped completely,
unless we are also able to reconstruct their shared cognitive world (this implica-
tion is congruent with one of the implications of Bach and Harnish’s analysis of
the understanding of speech acts, that is, the indispensable role of mutual contextual
beliefs). Secondly, Stalnaker’s conception of the context as common ground
emphasizes the dynamic side of speech acts, often neglected by speech act theorists,
and suggests describing their effects in terms of changes in the context. This
aspect of Stalnaker’s philosophical pragmatics links with the tradition of dynamic
semantics that also uses the notion of context change or context update. It should
be noted, however, that since the context is made out of attitudes, its changes
too must concern attitudes, not material or social realities according to Stalnaker
(1999: 86; see, however, the discussion of commands and permissions in 2014:
128–147). So, the effects of speech acts upon the context are not, or not directly,
effects upon any level or aspect of the world.
A possible objection to the conception of context as common ground is that it
requires the participants to entertain complex beliefs that are not always psychologically
plausible. Each participant has to believe that all participants believe
that p if p is to be a member of the common ground. But in the course of a verbal
exchange, we do not usually reflect upon what exactly other participants believe.
Moreover, isn’t the belief that all participants believe that p also a member of the
common ground? If it is, each participant should also believe that all participants
believe that all participants believe that p. This scenario is common to the notion of “mutual contextual belief”
adopted, on a Gricean inspiration, by Bach and Harnish. So we should conclude
that, whatever the value of Stalnaker’s conception of context, its aim (as often hap-
pens in logical research) is not that of achieving psychological plausibility. Notice,
however, that the problem could shift from the definition of common ground to
what it means for a subject to “believe” something.
Finally, it should be noted that the conception of context as common ground is
not the only conception of context that may be relevant to pragmatics. If one wants
to focus attention on the hearer’s inferences, it is enough to consider the cogni-
tive context of the individual hearer’s mind: whether shared or not, that is what
actually makes them possible. In contrast, for the purpose of understanding indexicals,
what counts is who is (actually) speaking, when and where (as opposed to who is
believed to be speaking, when and where) (Kaplan 1989a; Kaplan 1989b: 591–
593). Also the felicity or infelicity of illocutionary acts, at least in Austin’s per-
spective, depends on whether the context of utterance satisfies certain conditions,
where the role of context is played by the relevant features of the actual speech
situation (see Sbisà 2002: 421–422). The speech situation as the objective (vs cognitive)
context of an utterance is also what Travis refers to in his characterization of
pragmatics, cited above, by mentioning the “conditions” in which the words to be
studied have actually been spoken or reacted to (on the contrast between objective
and cognitive context, see Gauker 1998). The methodological suggestion implicit
in Stalnaker’s philosophical view of context as common ground, that is, that words
should be understood as contextualized in the participants’ common ground and
contributing to it, should be weighed up against this more heterogeneous scenario.

5.2. Pragmatic presupposition

Stalnaker represents common ground as comprising propositions (which, in his
theory, are sets of possible worlds: each proposition is identified with the set of
possible worlds in which it is true) that, being believed by each participant to be
both true and believed by the others, are “presupposed” by them. In his view, prag-
matic presuppositions (1999: 47–62) are not presuppositions of the words used (as
presuppositions are often taken to be, and as all presuppositions should be accord-
ing to semantic conceptions), but of the speakers. They are those assumptions on
the basis of which participants in a conversation choose how to speak to each other
and which, therefore, an observer or analyst should grasp in order to understand
what is going on, regardless of whether or not they are linguistically indicated
in the sentences that the participants utter. The focus of this conception is, there-
fore, on the preconditions for the successfulness of communication as opposed to
the preconditions for the truth-evaluability of utterances or their successfulness as
performances of illocutionary acts. While according to previous conceptions of
presupposition, often dubbed “semantic” (but see the classic discussion of exis-
tential presupposition as a fact of language use in Strawson 1950) the falsity of
a presupposition could make the illocutionary act the utterance was designed to
execute “null and void” and therefore, in the case of assertion, would open up
a truth-value gap, Stalnaker follows Grice in holding that the falsity of a prag-
matic presupposition has none of these consequences (although it may well create
some trouble for the participants in their conversational as well as extra-linguistic
activities). Historically speaking, Stalnaker’s notion of pragmatic presupposition
contributed to the success of the Gricean program for separating semantics from
pragmatics, that is, matters of truth-conditions and truth values from matters of
speaker’s beliefs, intentions and actions.

5.3. Presupposition accommodation

Presupposition markers or triggers are not essential to Stalnaker’s pragmatic pre-
supposition. However, sentences may contain words or constructions that set pre-
supposition requirements. But then, those sentences are appropriately uttered only
if speakers and hearers are actually making the required presupposition, that is,
already entertain the relevant beliefs. It is an apparent flaw of Stalnaker’s concep-
tion of presupposition that it is simply not true that all presuppositions of appropri-
ately uttered sentences are already believed by all participants, let alone believed to
be shared, at the time at which the sentence requiring them is uttered. When I don’t
know that p, but you do, you may utter a sentence to which the presupposition that
p is associated, and I not only understand it but also grasp (and usually come to
share) your presupposition. We cannot limit the notion of presuppositions to beliefs
already held by the participants in a conversation. However, Stalnaker admits that
presuppositions may convey new (as opposed to old) content (1999: 51–52; see also
102–104). The phenomenon has been dubbed by David Lewis “accommodation”
and has been described by him as depending on a peculiar rule that requires hearers
to adapt their cognitive context in order to make what has been said true (Lewis
1979: 339–340). Stalnaker does not endorse Lewis’s account of accommodation as
governed by a specific rule and prefers to look for an explanation of the phenom-
enon within the dynamics of discourse. It is controversial whether his explanation
is fully plausible (for its various accounts, see Stalnaker 1999: 51–52, 102–104;
2014: 47–50). However, the issue of presupposition accommodation remains very
instructive for scholars in pragmatics: it shows that the study of the ways in which
presuppositions are linguistically indicated must not be neglected and alerts those
who are involved in discourse analysis to pay attention to the informative and per-
suasive uses of presupposition (see Sbisà 1999; for a recent, broader philosophical
discussion of the uses of presupposition accommodation see Langton forthcoming).
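
The effect of accommodation on the common ground can be sketched as follows. The sketch models only the outcome, namely that a presupposition not yet shared is added before the asserted content, and it is neutral between Lewis's rule-based account and Stalnaker's discourse-dynamic explanation. The function `utter`, the flag `hearer_objects` and the example sentences are hypothetical.

```python
def utter(common_ground, presupposed, asserted, hearer_objects=False):
    """Return the common ground after an utterance (toy model).

    common_ground: set of sentences taken to be shared by the participants
    presupposed:   content required by a presupposition trigger
    asserted:      content the speaker asserts
    """
    cg = set(common_ground)
    if presupposed not in cg:
        if hearer_objects:
            return cg        # the presupposition is challenged: no update
        cg.add(presupposed)  # accommodation: new content enters as presupposed
    cg.add(asserted)
    return cg


before = {"S and H are colleagues"}
after = utter(before,
              presupposed="S has a sister",
              asserted="S has to pick up her sister")
print(sorted(after))

challenged = utter(before,
                   presupposed="S has a sister",
                   asserted="S has to pick up her sister",
                   hearer_objects=True)
print(challenged == before)  # True
```

The first call shows how a presupposition can carry new information: the hearer did not previously believe that the speaker has a sister, yet ends up sharing that belief without its ever having been asserted.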

6. Contextualism

The basic idea that what is meant by an utterance or discourse is always at least
to some extent context-dependent is commonplace in pragmatic research. Con-
text-dependency and its multiple, pervasive manifestations are no doubt increas-
ingly explored in all the disciplines that are concerned with pragmatic phenomena
(linguistics, interactional sociology, linguistic anthropology, cognitive psychol-
ogy). With respect to all this, the philosophical discussion of context-dependency
has at least a scenario-setting and terminology-fixing role. It might, however, bear
some methodological relevance, since it proposes distinctions, not always accurate
and univocal but at any rate subtle and interesting, between different forms and
modes of context-dependency, which could provide useful guidelines to the recog-
nition and explanation of context-dependent meaning.
Context-dependency was noticed in the logical study of indexicals and demonstratives
long ago (Bar-Hillel 1954; Kaplan 1989a). It has also been recognized as
ineliminable: a sentence containing indexicals cannot be reduced to another free
from them, since some indexical element will always be present if the paraphrase
has to be correct and complete (Bar-Hillel 1954; see also Perry 1979). The connec-
tion of indexicals to context was already implicit in Charles S. Peirce’s definition
of indexical signs (Peirce 1960: 143, 160–164): in Peirce, however, such a connec-
tion did not extend to symbolic signs such as most linguistic expressions. Austin
recognized the context-dependency of illocutionary force attribution, a concept
that was resumed (albeit in different terms) in the theory of indirect illocutionary
acts. Implicit meaning from its very beginnings was recognized to draw on context.
Later on, also the context-dependency of the meaning of non-indexical and non-
demonstrative words, as well as of whole sentences, has been taken into account.
Contributions to contextualism include Bach (1994), Travis (2000), Carston (2002)
and Recanati (2004, 2010) (for an overview, see Bianchi 2011). Since Recanati
is perhaps the author who reaches the highest degree of systematicity in his way
of describing and classifying the kinds and levels of context-dependency, I will
illustrate what contextualism can do by summarizing his distinctions (as proposed
in Recanati 2004). It should be considered, however, that other contextualist phi-
losophers or theoretical linguists endorse slightly different distinctions. Obviously,
philosophers that do not share Recanati’s radical contextualism would accept some
of these kinds of context-dependency but not others.
So, to see how the meaning of an utterance is contextually determined, let us
start from sentence meaning, which is determined by the rules of language and
consists of a very sketchy structure that philosophers also call “logical form”. This
is the realm of semantics, which is, by the way, not enough to assign to the utter-
ance its truth-conditions. Once sentence meaning is acquired, various pragmatic
processes come in:
1) Saturation, which is the contextual assignment of values to indexicals and
other context-sensitive expressions. Consider “I read John’s book”: no definite
truth-conditions are expressed unless we assign a value to “John’s book” either as,
for example, “the book left here by John” or “the book owned by John” or “the
book published by John”. This is a bottom-up process, driven by the logico-linguis-
tic structure underlying the utterance, and is mandatory. It yields a truth-evaluable
content (Recanati 2004: 7–8, 52, 61–62).
2) Free enrichment, which consists of optional processes that specify the lin-
guistic meaning of the utterance, in consideration of contextual beliefs and needs
(Recanati 2004: 23–37). It too contributes to determining the truth-evaluable con-
tent of the utterance. There are various sorts: specifization, as when we say “rabbit”
but may mean either ‘rabbit fur’ or ‘rabbit meat’; strengthening, which consists of
restricting the application of a predicate by contextually providing further condi-
tions that are not linguistically encoded (as in the case of “All the books are on
the table”, where we understand “books” as “the books we need for the seminar”
and “table” as “the table in the seminar room”); loosening, which happens when a
condition of application belonging to the concept literally expressed by a predicate
is dropped to widen the application of the predicate (examples are cases of loose
use, up to metaphor); semantic transfer, which happens when the output of the
processing of a linguistic expression is a concept other than the concept literally
expressed by it, as in “The ham sandwich left without paying”, when by this it is
meant that the boy who ordered the ham sandwich did so. Free enrichment enables
us to process the same sentence in different ways depending on its context of utter-
ance. Sometimes the theory may even permit equivalent interpretive possibilities:
for example, the processing of “I finished the book” can be represented as involv-
ing the strengthening of “finish” as ‘finished reading’, or the transfer of “the book”
from referring to the concrete object to meaning ‘reading the book’.
3) Implicatures: they belong to “secondary” inferential processes that apply to
a contextually saturated and (when suitable) enriched proposition, and that again
involve context (Recanati 2004: 52, 70–71). They differ from the primary processes
such as those described in (1) and (2) in that interpreters must have the reflective
capacity to rationally justify their own interpretations. Implicatures, indeed, can be
understood intuitively, but it is also held to be part of their definition that speakers
must have the competence to work out arguments justifying them.
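
As a rough illustration of saturation alone (process 1 above), the following Python sketch treats sentence meaning as a template with open slots that the context must fill before a truth-evaluable content results. The `Context` class, the `saturate` function and the slot names are hypothetical devices introduced here for illustration only; they are not part of Recanati's account, and free enrichment and implicature are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical context of utterance (an assumption of this sketch)."""
    speaker: str              # value assigned to the indexical "I"
    possessive_relation: str  # relation picked out by "John's" in this context


def saturate(sentence_template: str, context: Context) -> str:
    """Fill every context-sensitive slot; this step is mandatory and bottom-up."""
    return sentence_template.format(
        speaker=context.speaker,
        relation=context.possessive_relation,
    )


# Sketchy sentence meaning of "I read John's book": the indexical and the
# relation between John and the book are still open.
TEMPLATE = "{speaker} read the book {relation} John"

owned = saturate(TEMPLATE, Context(speaker="Mary", possessive_relation="owned by"))
published = saturate(TEMPLATE, Context(speaker="Mary", possessive_relation="published by"))

print(owned)
print(published)
```

The same template yields different truth-evaluable contents in different contexts, which is the point of the “John’s book” example.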
This variety of ways and steps of the contextual processing of utterances may
well contribute to explaining and justifying the intuitions of discourse analysts
in assigning meaning to utterances in discourse. It can also bear, together with
research on presupposition (if the latter is meant to be linguistically indicated), on
the study of the pragmatic functions and assumptions associated with the use of
lexical items.

7. Concluding remarks

As a whole, philosophical pragmatics insists on principled ways of dealing with
varieties of pragmatic facts. One may choose among different models, but certainly
the distinction between illocution and perlocution, where it applies, is a decisive
contribution. Its main implication is that we should not, not even in empirical
research on corpora belonging to specific discourse genres, take the effectiveness
of discourse in producing beliefs or emotions or behavior as a direct result of the
words used and their content. There is something half-way between the two: it
is the power of discourse of shaping and reshaping the relationship between the
interlocutors, what they legitimately expect from one another, what they owe to
one another, what they are in a position to do to one another or to third parties. And
the exercise of this power is located at the level of illocution, which then appears as
intermediate between the uttering of words and the effectiveness of discourse. Illocution
belongs to language because it lives “in” acts of saying something, so that
it can be studied through its linguistic indicators, as well as by examining the ways in
which its interactional dynamics are made manifest in conversational exchanges: it
therefore helps us see how discourse can get its extra-linguistic effectiveness and why
it gets the extra-linguistic effectiveness it in fact has.
As to the details of their methodological implications, however, the three main
varieties of speech act theory I have presented in sections 2, 4.1 and 4.2 are not
equivalent. They suggest non-equivalent approaches to the analysis of illocution in
actual conversation or discourse. It is one thing to check the satisfaction of consti-
tutive rules, including those involving speaker’s intentions, another thing to relate
the words uttered and the participants’ behavior to the pattern of some procedure
(designed to have conventional effects that should be intersubjectively accepted). In
the former case, performance may at any moment be undermined by the lack of some
of the required participant’s intentions. This creates a circle between ascription of
intentions and understanding of speech acts, which makes the perspective methodologically
doubtful. In the latter case, performance, effects of action, and intentions
of the participants are all retrieved from the participants’ words and behavior. The
pattern executed may be recognizable even when not completely realized (as
happens with patterns of any kind) and this can license interpretations of what participants
do as well as of their associated intentions in a quite natural way.
Another emerging theme is the variety of kinds of interpretive inference that
have been identified by philosophers. Two aspects of this theme are worth spe-
cial consideration. The former underscores a positive, constructive relationship
between philosophy and pragmatic research. Whether the models of inference
are psychologically real or mere “rational reconstructions”, they can be useful to
the analyst anyway, helping her to find reasons for attributing a certain implicit
meaning to the utterances or discourses she analyzes. The latter aspect is more
problematic. In discussing varieties of interpretive inference, philosophers have
focused on whether and how to divide them into inferences contributing to the
contextually expressed proposition (“what is said”) and inferences which, on the
basis of the proposition expressed, make us retrieve what is implicated or more
generally communicated. The debate has been wearying and has produced no clear
winner. One may suspect that this aspect of philosophical pragmatics
cannot be really useful to pragmatic research. But something methodologically
very interesting has been proposed by Jennifer Saul (2012). In her research on
lying and misleading, she too discusses the distinction between what is said and
what is implicated, but relativizes her criteria to the aims she has in that context:
her distinction has to yield plausible results when it comes to the identification of
lies and their distinction from cases of misleading speech. This is a case of theo-
retical dispute which can be given a stable solution only when contextualized with
respect to certain aims, and a nice example of how theory and practical concerns
can throw light upon each other.
Finally, as to the notion of context, I would like to underscore that its uses in
pragmatic research have a double direction. On the one hand, if we have informa-
tion available about mutual contextual beliefs or the participants’ common ground,
this facilitates reaching a sound understanding of what they say and do. Factual
knowledge about the participants’ goals, their ongoing activity, and various
circumstances of the utterance also contributes to such an understanding. On the other
hand, albeit convinced of the importance of grasping the common ground of the
participants in a conversation, or of the indispensability of certain information
about the speech situation for understanding indexicals or illocutionary force, we
might not have access to them. We therefore need to develop strategies for getting
such access. Here so-called presupposition accommodation, together with the
retrieval of those implicatures that do not need particularized assumptions, appears
to play an essential role in enabling understanding, not merely of the utterance
or discourse itself, but also of its context. Context, whether defined as common
ground or as speech situation, is not only something that explains what happens in
a verbal exchange and provides utterances with their actual meanings and forces,
but also something that hearers or bystanders or analysts must very often recon-
struct, at least in part, on the basis of the words and actions of the participants.

References

Austin, John L.
1975 [1962] How to Do Things with Words. The William James Lectures delivered
at Harvard University in 1955. Ed. by James O. Urmson and Marina Sbisà.
Oxford: Oxford University Press. First published (edited by James O. Urm-
son) 1962.
Austin, John L.
1979 Philosophical Papers. 3rd ed. Oxford: Oxford University Press.
Bach, Kent
1994 Impliciture. Mind and Language 9: 124–162.
Bach, Kent and Robert M. Harnish
1979 Linguistic Communication and Speech Acts. Cambridge, MA: Harvard University
Press.
Bar-Hillel, Yehoshua
1954 Indexical expressions. Mind 63(251): 359–379.
Bianchi, Claudia
2011 Contextualism. In: Marina Sbisà, Jan-Ola Östman and Jef Verschueren (eds.),
Philosophical Perspectives for Pragmatics, 53–70. (Handbook of Pragmatics
Highlights 10.) Amsterdam: John Benjamins.
Blum-Kulka, Shoshana, Juliane House and Gabriele Kasper (eds.)
1989 Cross-cultural Pragmatics: Requests and Apologies. Norwood: Ablex.
Brandom, Robert
1994 Making it Explicit. Cambridge, MA: Harvard University Press.
Brown, Penelope and Stephen Levinson
1987 Politeness. Universals in Language Use. Cambridge: Cambridge University
Press.
Caffi, Claudia
2007 Mitigation. Amsterdam: Elsevier.
Carston, Robyn
2002 Thoughts and Utterances: The Pragmatics of Explicit Communication.
Oxford: Blackwell.
Fraser, Bruce
1980 Conversational mitigation. Journal of Pragmatics 4: 341–350.
Gauker, Christopher
1998 What is a context of utterance? Philosophical Studies 91: 149–172.
Gibbs, Raymond W., Jr
1979 Contextual effects in understanding indirect requests. Discourse Processes 2:
1–10.
Gibbs, Raymond W., Jr
1986 What makes some indirect speech acts conventional? Journal of Memory and
Language 25: 181–196.
Grice, Paul
1989 Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Grice, Paul
1991 The Conception of Value. Oxford: Oxford University Press.
Harnish, Robert M.
2009 Internalism and externalism in speech act theory. Lodz Papers in Pragmatics
5: 9–31.
Hart, Herbert L. A.
1949 The ascription of responsibility and rights. Proceedings of the Aristotelian
Society 49: 171–194.
Holmes, Janet
1984 Modifying illocutionary force. Journal of Pragmatics 8: 345–365.
Jakobson, Roman
1981 [1960] Linguistics and poetics. In: Selected Writings, vol. 3, ed. by Stephen Rudy,
18–51. The Hague: Mouton. First published 1960.
Kaplan, David
1989a Demonstratives. In: Joseph Almog, John Perry and Howard Wettstein (eds.),
Themes from Kaplan, 481–563. Oxford: Oxford University Press.
Kaplan, David
1989b Afterthoughts. In: Joseph Almog, John Perry and Howard Wettstein (eds.),
Themes from Kaplan, 565–614. Oxford: Oxford University Press.
Langton, Rae
forthcoming Blocking as counter-speech. In: Daniel Harris, Daniel Fogal and Matt
Moss (eds.), New Work on Speech Acts. Oxford: Oxford University Press.
Lewis, David
1979 Scorekeeping in a language game. Journal of Philosophical Logic 8: 339–359.
Morris, Charles W.
1938 Foundations of the Theory of Signs. (International Encyclopedia of Unified
Science, Vol. 1, n. 2.) Chicago, IL: The University of Chicago Press.
Peirce, Charles S.
1960 [1932] Speculative grammar. In: Collected Papers, vol. 2, 128–269. Cambridge,
MA: Harvard University Press. First published 1932.
Perry, John
1979 The problem of the essential indexical. Noûs 13: 3–21.
Recanati, François
2004 Literal Meaning. Cambridge: Cambridge University Press.
Recanati, François
2010 Truth-conditional Pragmatics. Oxford: Oxford University Press.
Saul, Jennifer
2012 Lying, Misleading, and What is Said. Oxford: Oxford University Press.
Sbisà, Marina
1984 On illocutionary types. Journal of Pragmatics 8: 93–112.
Sbisà, Marina
1999 Ideology and the persuasive use of presupposition. In: Jef Verschueren (ed.),
Language and Ideology. Selected Papers from the 6th International Pragmatics
Conference, 492–509. Antwerp: International Pragmatics Association.
Sbisà, Marina
2001 Illocutionary force and degrees of strength in language use. Journal of Prag-
matics 33: 1791–1814.
Sbisà, Marina
2002 Speech acts in context. Language & Communication 22: 421–436.
Sbisà, Marina
2013 Locution, illocution, perlocution. In: Marina Sbisà and Ken Turner (eds.),
Pragmatics of Speech Actions, 25–75. (Handbooks of Pragmatics 2.) Berlin:
Mouton de Gruyter.
Sbisà, Marina
forthcoming Varieties of speech act norms. In: Maciej Witek and Iwona Witczak-
Plisiecka (eds.), Normativity and Variety of Speech Actions. Leiden: Brill
(Poznań Studies in the Philosophy of the Sciences and the Humanities).
Searle, John R.
1969 Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge
University Press.
Searle, John R.
1979 Expression and Meaning. Cambridge: Cambridge University Press.
Searle, John R.
1983 Intentionality: An Essay in the Philosophy of Mind. Cambridge: Cambridge
University Press.
Searle, John R.
1995 The Construction of Social Reality. New York: The Free Press.
Searle, John R. and Daniel Vanderveken
1985 Foundations of Illocutionary Logic. Cambridge: Cambridge University Press.
Sperber, Dan and Deirdre Wilson
1995 [1986] Relevance: Communication and Cognition. Oxford: Blackwell. First pub-
lished 1986.
Stalnaker, Robert
1999 Context and Content. Oxford: Oxford University Press.
Stalnaker, Robert
2002 Common Ground. Linguistics and Philosophy 25: 701–721.
Stalnaker, Robert
2014 Context. Oxford: Oxford University Press.
Strawson, Peter F.
1949 Truth. Analysis 9: 83–97.
Strawson, Peter F.
1950 On referring. Mind 59: 320–344.
Strawson, Peter F.
1964 Intention and convention in speech acts. The Philosophical Review 73: 439–
460.
Travis, Charles
1997 Pragmatics. In: Bob Hale and Crispin Wright (eds.), A Companion to the Phi-
losophy of Language, 87–107. Oxford: Blackwell.
Travis, Charles
2000 Unshadowed Thought: Representation in Thought and Language. Cambridge,
MA: Harvard University Press.
Watzlawick, Paul, Janet Helmick Beavin and Don D. Jackson
1967 Pragmatics of Human Communication. New York: Norton.
Wittgenstein, Ludwig
1953 Philosophische Untersuchungen / Philosophical Investigations. Ed. by Eliza-
beth Anscombe and Rush Rhees. With English translation. Oxford: Blackwell.
6. Research methodology in classical
and neo-Gricean pragmatics
Yan Huang

This chapter is dedicated to Professor Stephen Levinson,
my PhD supervisor and mentor at Cambridge.

Abstract: Research methodology in linguistics can roughly be divided into three
types: (i) introspection, as in generative syntax (armchair), (ii) experimentation,
as in psycholinguistics (laboratory), and (iii) observation, as in sociolinguistics
(field). Needless to say, some of these research methods are also found in pragmat-
ics including classical and neo-Gricean pragmatics. In this article, I assess some of
the main research methodologies employed in classical and neo-Gricean pragmat-
ics, covering different types of data, different ways of collecting data, and different
ways in which data is analysed. The assessment is conducted from both a linguistic,
and a philosophical, methodological point of view. Topics that are addressed in this
chapter include introspection, falsifiability, reduction versus expansion of theories,
and the use of cross-linguistic data to compare linguistic characteristics across
typologically different languages in the formulation and development of theories.

1. Introduction

Research or analytical methodology in linguistics can roughly be divided into three
types: (i) introspection, as in generative syntax, (ii) experimentation, as in experimental
psycholinguistics, and (iii) observation, as in sociolinguistics. Given the
typical places where the three types of research methods are employed in linguis-
tics, the first may be called “the armchair method”, the second, “the laboratory
method”, and the third, “the field method” (e. g. Clark and Bangerter 2004, Jucker
2009). As mentioned in Talmy (2007a, b), these research methods in linguistics
include (i) introspection into the meanings and structures of linguistic expressions
and forms, (ii) comparison of linguistic characteristics across typologically dif-
ferent languages, (iii) investigation of how speech events interact with context,
(iv) analysis of audio and/or video recordings of naturally occurring, spontaneous
conversations, (v) (computer-aided) compilation and examination of (frequently
annotated) corpora, (vi) analysis of cumulatively recorded observations of linguis-
tic behaviour, (vii) experimental techniques, as used in psycholinguistics, (viii)
instrumental probes of the brain’s linguistic functioning, as deployed in neuroscience
including neuro-linguistics, and (ix) simulation of human linguistic behaviour
in artificial intelligence. Needless to say, some of these research methodologies
can also be found in pragmatics including classical and neo-Gricean pragmatics.

https://doi.org/10.1515/9783110424928-006
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 155–183. Berlin/
Boston: De Gruyter Mouton.

This chapter aims to assess some of the main research methods employed in
classical and neo-Gricean pragmatics, covering different types of data, different
ways of collecting data, and different ways in which data is analysed. The assess-
ment is conducted from both a linguistic, and a philosophical, methodological
point of view. The organization of the chapter is as follows. Section 2 outlines clas-
sical and neo-Gricean pragmatics, focusing on the classical Gricean co-operative
principle and its component maxims of conversation (e. g. Grice 1975, 1978, 1989)
(section 2.1) and the bipartite neo-Gricean model set forth by Horn (1984, 2012)
(section 2.2.1) and the tripartite neo-Gricean model proposed by Levinson (1987,
2000) (section 2.2.2). In section 3, I examine introspection – the primary linguistic
methodology found in classical and neo-Gricean pragmatics. This is followed by a
philosophical methodological discussion of falsifiability in section 4 and reduction
versus expansion of a theory in section 5. Finally, in section 6, I show how another
main linguistic method, namely, comparison of linguistic characteristics across
typologically different languages by means of cross-linguistic data contributes to
the formulation and development of (my version of) the neo-Gricean pragmatic
theory of anaphora (Huang 1991, [1994] 2007, 2000a, b, 2004, 2006, 2007, 2014a,
2016b, 2017b, e).

2. Classical and neo-Gricean pragmatics

2.1. Classical Gricean pragmatics

On a general Gricean account of meaning and communication (e. g. Grice 1975,
1978, 1989), there are two theories: a theory of meaning_nn (n[on]-n[atural] meaning) and a theory of
conversational implicature. In his theory of meaning_nn, Grice emphasized the conceptual
relation between natural meaning in the external world and non-natural,
linguistic meaning of utterances. He developed a reductive analysis of meaning_nn
in terms of the speaker’s reflexive intention, the essence of which is that meaning_nn
or speaker meaning is a matter of expressing and recognizing intention.
In his theory of conversational implicature, Grice suggested that there is an
underlying principle that determines the way in which language is used maximally
efficiently and effectively to achieve rational interaction in communication. He
called this overarching dictum the co-operative principle and subdivided it into
nine maxims of conversation classified into four categories: Quality, Quantity,
Relation, and Manner. These four categories are taken from the German philoso-
pher Immanuel Kant (Grice 1989: 26). The co-operative principle and its constitu-
ent maxims ensure that in an exchange of conversation, truthfulness, informative-
ness, relevance, and clarity are aimed at.
(1) Grice’s co-operative principle and its constituent maxims of conversation
(simplified) (e. g. Huang 2014a: 30)
a. The co-operative principle
Be co-operative.
b. The maxims of conversation
Quality: Be truthful.
(i) Belief: Don’t say what you believe to be false.
(ii) Evidence: Don’t say what you lack evidence for.
Quantity:
(i) Don’t say less than is required.
(ii) Don’t say more than is required.
Relation: Be relevant.
Manner: Be clear.
(i) Avoid obscurity.
(ii) Avoid ambiguity.
(iii) Be brief.
(iv) Be orderly.

Assuming that the co-operative principle and its associated conversational maxims
are normally adhered to by both the speaker and addressee in a conversational
interaction, Grice suggested that a conversational implicature – roughly, any mean-
ing or proposition expressed implicitly by a speaker in his or her utterance of a
sentence which is meant without being part of what is said in the strict sense – can
arise from either strictly observing or ostentatiously flouting the maxims. In Huang
(e. g. 2007: 27–31, 2014: 33–37, 2017c: 158), I called conversational implicatures
that are engendered by way of directly observing the maxims conversational implicatures_O,
as in (2); and conversational implicatures that are generated by way of the
speaker’s deliberately flouting the maxims conversational implicatures_F, as in (3).
(I use +> to stand for (ceteris paribus) conversationally implicates.)

(2) Conversational implicatures_O
The coffee is warm.
+> The coffee is not hot.
(3) Conversational implicatures_F
Tony Blair is no longer the Prime Minister of Britain, he is the Foreign Minister of the
United States. (Nelson Mandela, quoted in Susie Dent, Language Report 2003: 62)
+> e. g. Tony Blair has followed the American foreign policies too closely.

In (2), the conversational implicature results from the observation of Grice’s first
sub-maxim of Quantity. By contrast, in (3), the conversational implicature was
engendered by Nelson Mandela’s deliberately flouting or exploiting Grice’s first
sub-maxim of Quality. (Incidentally, the head of the US government department
that deals with foreign affairs is styled “the Secretary of State” rather than “the
Foreign Minister”.)
A second Gricean dichotomy, independent of the conversational implicature_O/
conversational implicature_F one, is between those conversational implicatures
which arise without requiring any particular contextual conditions and those which
do require such conditions. Grice (1989: 31–38) called the first kind generalized
conversational implicatures (GCIs), as in (4a); and the second kind particularized
conversational implicatures (PCIs), as in (4b).
(4) John: How did yesterday’s research seminar go?
Mary: Some of the faculty left before it ended.
+> (a) Not all of the faculty left before it ended. (GCI)
+> (b) The seminar didn’t go well. (PCI)

Finally, Grice designed a battery of tests to facilitate the identification of conversational
implicature. First, defeasibility or cancellability – conversational implicatures
can disappear in certain linguistic or non-linguistic contexts, as in (6). (I use
“~ +>” to signify “does not conversationally implicate”.) Secondly, non-detach-
ability – any linguistic expression with the same semantic content tends to carry
the same conversational implicature, as in (7). (A principled exception is those
conversational implicatures that arise via the maxim of Manner.) Thirdly, calcula-
bility – conversational implicatures can transparently be derived via the co-opera-
tive principle and its attendant maxims. Fourthly, non-conventionality – conversa-
tional implicatures, though dependent on the saying of what is said or coded, are
non-coded in nature, that is, they are not part of what is said. Fifthly, reinforce-
ability – conversational implicatures can be made explicit without producing too
much sense of redundancy, as in (8). Sixthly, some conversational implicatures
may be indeterminate. They can be taken as conveying an open-ended range of
implicitly-expressed meanings relating to matters in hand. This is illustrated in (9).
Finally, we have universality – conversational implicatures tend to be universal,
being rationally motivated rather than arbitrary. In Huang (2014: 41–42), I cited
the parallel examples from Arabic, Catalan, Chinese, Modern Greek, Kashmiri,
and Malagasy to show that if a language has, for instance, “all” and “some”, the
use of the semantically weaker “some” will universally carry the conversational
implicature “not all”.
(5) Helen is often late.
+> Helen is not always late.
(6) Helen is often, if not always, late.
~ +> Helen is not always late.
(7) Hillary almost/nearly won/came close to winning the American presidency.
+> Hillary didn’t quite win the American presidency.
(8) Helen is attractive.
+> Helen is not beautiful.
Helen is attractive, but not beautiful.
(9) Their new boss is a machine.
+> Their new boss is cold. Or
+> Their new boss is efficient. Or
+> Their new boss is a workaholic. Or
+> …
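
The cancellability test illustrated by (5) and (6) can be mimicked in a deliberately crude Python sketch. The function name and the string-matching heuristic are assumptions made purely for illustration: real cancellation depends on linguistic and non-linguistic context and cannot be reduced to spotting an “if not …” clause.

```python
from typing import Optional


def scalar_implicature(utterance: str, weaker: str, stronger: str) -> Optional[str]:
    """Return the 'not <stronger>' implicature of using <weaker>, unless suspended.

    Toy heuristic (an assumption of this sketch): an explicit
    'if not <stronger>' clause leaves the stronger alternative open,
    so the implicature does not arise.
    """
    text = utterance.lower()
    if weaker not in text:
        return None  # the weaker scalar item was not used at all
    if f"if not {stronger}" in text:
        return None  # implicature cancelled, as in (6)
    return f"not {stronger}"


plain = scalar_implicature("Helen is often late.", "often", "always")
cancelled = scalar_implicature("Helen is often, if not always, late.", "often", "always")

print(plain)      # implicature of (5)
print(cancelled)  # no implicature for (6)
```

The sketch only contrasts (5) with (6); the other tests in the battery (non-detachability, calculability, and so on) are not modelled.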

2.2. Neo-Gricean pragmatics

2.2.1. The Horn model

Horn (1984, 2012) put forward a bipartite model. In Horn’s view, all of Grice’s
maxims (except the maxim of Quality) can be replaced with two fundamental and
antithetical pragmatic principles: the Q[uantity]- and R[elation]-principles.
(10) Horn’s Q- and R-principles
The Q-principle (Addressee/hearer-based)
Make your contribution sufficient;
Say as much as you can (modulo the R-principle).
The R-principle (Speaker-based)
Make your contribution necessary;
Say no more than you must (modulo the Q-principle).
In terms of information structure, Horn’s Q-principle, which collects Grice’s first
sub-maxim of Quantity and his first two sub-maxims of Manner, is a lower-bound-
ing pragmatic principle which may be (and characteristically is) exploited to engen-
der upper-bounding conversational implicatures: a speaker, in saying “… p …”,
conversationally Q-implicates that (for all he or she knows) “… at most p …”. In
other words, as pointed out by Horn (2012), what is Q-implicated relies on what
is not (but could have been) said. The locus classicus here is those conversational
implicatures that arise from a semantic or lexical scale called a Q or Horn scale.
Given a Horn scale, if a speaker asserts a lower-ranked or semantically weaker
alternate (i. e. a rightwards linguistic expression in the ordered set), then he or
she conversationally Q-implicates that he or she is not in a position to assert any
of the higher-ranked or semantically stronger ones (i. e. leftwards expressions in
the ordered set) in the same set. Thus, the use of some in (11b) gives rise to the
Q-implicature in (11c).
(11) a. Horn scale <all, most, many, some>
b. Some of the research institutes are carrying out a study of potential vaccines against
the Zika virus.
c. +> Not many/most/all of the research institutes are carrying out a study of potential
vaccines against the Zika virus.
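
The way a Horn scale licenses the Q-implicature in (11c) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the scale is represented as a tuple ordered from semantically strongest to weakest, as in (11a), and the function name `q_implicated_negations` is hypothetical. The sketch ignores the epistemic qualification (“for all he or she knows”) and any contextual cancellation.

```python
from typing import Sequence, Tuple

# Horn scale from (11a), ordered from semantically strongest to weakest.
HORN_SCALE: Tuple[str, ...] = ("all", "most", "many", "some")


def q_implicated_negations(scale: Sequence[str], asserted: str) -> Tuple[str, ...]:
    """Return the stronger (leftwards) scale-mates whose negation is Q-implicated."""
    if asserted not in scale:
        raise ValueError(f"{asserted!r} is not on the scale {tuple(scale)}")
    position = list(scale).index(asserted)
    # Everything to the left of the asserted item is semantically stronger.
    return tuple(scale[:position])


# Asserting "some", as in (11b), Q-implicates the negation of each stronger item.
for stronger in q_implicated_negations(HORN_SCALE, "some"):
    print(f"+> not {stronger}")
```

Asserting the strongest item on the scale yields no Q-implicature on this sketch, since nothing stronger could have been said.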
Having discussed Horn’s Q-principle, let me turn to his countervailing R-principle.
The R-principle, which subsumes Grice’s second sub-maxim of Quantity, his
maxim of Relation, and his last two sub-maxims of Manner, and which is based
on Atlas and Levinson’s (1981) principle of informativeness, is an upper-bounding
pragmatic law which may be (and systematically is) exploited to invite lower-bounding
conversational implicatures: a speaker, in saying “… p …”, conversationally
R-implicates that (for all he or she knows) “… more than p …”. An example is
given in (12), adapted from Grice (1989: 38).
(12) John broke a finger yesterday.
+> The finger was one of John’s own.

However, more recently, Horn (2012) has been of the view that the R-principle is
not in itself subsumable under Grice’s co-operative principle, but under rationality.
Furthermore, Horn argued that the whole Gricean mechanism for pragmatically
contributed meaning can be derived from the dialectic interaction (in the classical
Hegelian sense) between the two mutually constraining mirror-image forces (Q
and R) in the following way.
(13) Horn’s division of pragmatic labour
The use of a marked (relatively complex and/or prolix) expression when a corre-
sponding unmarked (simpler, less “effortful”) alternate expression is available tends
to be interpreted as conveying a marked message (one which the unmarked alternative
would not or could not have conveyed).

In effect, what the communicative equilibrium in (13) basically says is this: the
R-principle generally takes precedence until the use of a contrastive linguistic form
induces a Q-implicature to the non-applicability of the pertinent R-implicature.

2.2.2. The Levinson model

Horn’s proposal to reduce Grice’s maxims to the Q- and R-principles was chal-
lenged by Levinson (1987, 2000). In Levinson’s opinion, Horn failed to draw a
distinction between what Levinson called semantic minimization (semantically
general expressions are preferred to semantically specific ones) and expression
minimization (“shorter” expressions are preferred to “longer” ones). Consequently,
inconsistency arises with Horn’s use of the Q- and R-principles. For example, in
Horn’s division of pragmatic labour, the Q-principle operates primarily in terms
of units of speech production whereas elsewhere, in Horn scales, for instance, it
operates primarily in terms of semantic informativeness.
Considerations along these lines led Levinson to argue for a clear separation
between pragmatic principles governing an utterance’s surface linguistic form and
pragmatic principles governing its informational content. He proposed that the
original Gricean program (the maxim of Quality apart) be reduced to three neo-Gricean
pragmatic principles: what he dubbed the Q[uantity]-, I[nformativeness]- and
M[anner]-principles. Each of the three principles has two sides: a speaker’s maxim,
which specifies what the principle enjoins the speaker to say and implicate and a
recipient’s corollary, which dictates what it allows the addressee to infer. Let me
take them one by one.
(14) Levinson’s Q-principle (simplified) (e. g. Huang 2014a: 50–51)
Speaker: Don’t say less than is required (bearing the I-principle in mind).
Addressee: What isn’t said isn’t the case.

The basic idea of the metalinguistic Q-principle is that the use of a linguistic
expression (especially a semantically weaker one) in a set of contrastive semantic
alternates (such as a Horn scale) Q-implicates the negation of the interpretation
associated with the use of another linguistic expression (especially a semantically
stronger one) in the same set. Seen the other way round, from the absence of a
semantically stronger linguistic expression, we infer that the interpretation associ-
ated with the use of that expression does not hold. Hence, the Q-principle is essen-
tially negative in nature. By way of illustration, see (2) and (11) above.
Next, there is Levinson’s I-principle.
(15) Levinson’s I-principle (simplified) (e. g. Huang 2014a: 57–58)
Speaker: Don’t say more than is required (bearing the Q-principle in mind).
Addressee: What is generally said is stereotypically and specifically exemplified.

Mirroring the effects of his Q-principle, Levinson’s I-principle is a pragmatic law
of semantic economy, the central tenet of which is that the use of a semantically
general linguistic expression I-implicates a semantically specific interpretation.
More accurately, in some cases, the implicature engendered by the I-principle
is one that accords best with the most stereotypical and explanatory expectation
given our background assumptions or real-world knowledge.
(16) John pressed the spring and the drawer opened.
+> John pressed the spring and then the drawer opened.
+> John pressed the spring and thereby caused the drawer to open.
+> John pressed the spring in order to make the drawer open.

Finally, we come to Levinson’s M-principle.

(17) Levinson’s M-principle (simplified) (e. g. Huang 2014a: 62–63)
Speaker: Don’t use a marked expression without reason.
Addressee: What is said in a marked way conveys a marked message.

Unlike the Q- and I-principles, which operate primarily in terms of semantic infor-
mativeness, the metalinguistic M-principle is operative primarily in terms of a set
of alternates that contrast in form. The crux of this pragmatic principle is that the
use of a marked linguistic expression M-implicates the negation of the interpretation
associated with the use of an alternative, unmarked linguistic expression in the
same set. This is exemplified in (19).
(18) John stopped the alarm clock.
+>I John stopped the alarm clock in a usual manner.
(19) John got the alarm clock to stop.
+>M John stopped the alarm clock in an unusual manner, e. g. by deliberately throwing
it to the floor.

Given the above tripartite classification of the neo-Gricean pragmatic principles,
the question that arises next is how inconsistencies arising from these potentially
conflicting conversational implicatures can be resolved. According to Levinson
(2000), they can be resolved by an ordered set of precedence, which encapsulates
in part the Hornian division of pragmatic labour.
(20) Levinson’s resolution schema for the interaction
of the Q-, I-, and M-principles
a. Level of genus: Q > M > I
b. Level of species: e. g. Q-clausal > Q-scalar
(By level of genus is meant the level where different types of conversational implicature
are placed; level of species refers to the level where different sub-types of the
same type of conversational implicature are placed.)
This is tantamount to saying that genuine Q-implicatures (where Q-clausal cancels
rival Q-scalar) supersede inconsistent I-implicatures, but otherwise I-implicatures
take precedence until the use of a marked linguistic expression triggers a comple-
mentary M-implicature to the negation of the applicability of the pertinent I-impli-
cature. Consider, for example, (21).

(21) Q > I
If Donald Trump gives you a gun for Christmas, it may be a real one.
a. Q <(since p, q), (if p, q)>
+>The gun may or may not be a real one.
b. I [a gun for Christmas]
+> The gun is a toy gun.
c. Q > I
Possibly the gun is a real gun.

In (21), there is a Q-clausal implicature due to the use of (if p, q). But there is also
a potential I-implicature to stereotype arising from the employment of a gun for
Christmas. The two conversational implicatures are inconsistent with each other.
Now, given (20a), the I-implicature is outdone by the Q-implicature, hence the
winning Q-implicature becomes the implicature of the whole sentence, as in (21c).
For an example illustrating Q > M, see (24) below, and for examples displaying
M > I and Q-clausal > Q-scalar, see Huang (e. g. 2014a: 65–66, 2015b, 2016a; see also
Huang 2017b, c, f).
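As a rough illustration only, the genus-level ordering in (20a) can be sketched in Python. The sketch assumes that the candidate implicatures have already been identified and that their mutual inconsistency has been annotated by hand; the names Implicature, PRECEDENCE, clash, and resolve are hypothetical and form no part of Levinson's own formulation. The species-level ordering in (20b) is not modelled.

```python
from dataclasses import dataclass

# Genus-level precedence from (20a): Q > M > I (a lower rank wins).
PRECEDENCE = {"Q": 0, "M": 1, "I": 2}


@dataclass(frozen=True)
class Implicature:
    principle: str  # "Q", "M" or "I"
    content: str    # informal gloss of what is implicated
    # Glosses this implicature is inconsistent with (hand-annotated assumption).
    inconsistent_with: frozenset = frozenset()


def clash(a, b):
    """True if either implicature is annotated as inconsistent with the other."""
    return b.content in a.inconsistent_with or a.content in b.inconsistent_with


def resolve(candidates):
    """Keep each candidate unless it clashes with a higher-ranked survivor."""
    survivors = []
    for cand in sorted(candidates, key=lambda c: PRECEDENCE[c.principle]):
        if not any(clash(cand, kept) for kept in survivors):
            survivors.append(cand)
    return survivors


# Example (21): the Q-clausal implicature defeats the stereotypical I-implicature.
q_imp = Implicature("Q", "The gun may or may not be a real one.")
i_imp = Implicature(
    "I",
    "The gun is a toy gun.",
    frozenset({"The gun may or may not be a real one."}),
)

winners = resolve([i_imp, q_imp])
print([w.content for w in winners])
```

On this toy encoding, only the Q-implicature of (21) survives, which corresponds to (21c). Consistent implicatures are all retained, since precedence matters only where two candidates clash.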

In addition to Horn’s and Levinson’s work, surveyed above, other important
research conducted in the neo-Gricean pragmatic framework includes that carried
out by Yan Huang and Elizabeth Traugott in linguistics, and Jay Atlas, Kent Bach
(see e. g. Bach 2012), Bart Geurts, and François Recanati (see e. g. Recanati 2010)
in the philosophy of language.

3. Introspection

As mentioned in section 1, a variety of research methods are used in classical and
neo-Gricean pragmatics, but of all these methodologies, introspection has been
central in the formulation and development of, and has remained the dominant
methodological means in, this school of thought in pragmatics. What, then, is intro-
spection? By introspection is meant roughly the process where a linguist or phi-
losopher of language utilizes his or her intuitions to invent linguistic expressions
or forms in his or her native language as linguistic examples and make judgements
about certain linguistic aspects such as meanings, uses, and structures of these
linguistic expressions or forms, either in isolation or in context. Introspection also
involves the comparison of the linguist’s or philosopher of language’s own intui-
tions/introspections with those reported by other native speakers of the language.
The data produced by introspection is often used by the linguist or the philosopher
of language to formulate, develop, and advance, and/or assess, test, confirm, or
disconfirm a particular theoretical argument or position (see e. g. Talmy 2007a,
b, Meyer and Nelson 2006). Returning to classical and neo-Gricean pragmatics,
introspection functions as the main method of data collection, introspective data
is the main type of data, and data analysis is mainly through introspection.
Grice (1975, 1989), for example, used the introspectively constructed sentence
in isolation (22) to develop his notion of GCI and the introspectively formulated
sentence in context (23) to illustrate his concept of conversational implicature.
(22) (Grice 1989: 37)
a. X is meeting a woman this evening.
b. +> The woman in question isn’t X’s wife, mother, sister, or perhaps even close
platonic friend
(23) (Grice 1989: 32)
a. (A is standing by an obviously immobilized car and is approached by B.)
A: I am out of petrol.
B: There is a garage round the corner.
+> The garage is, or at least may be, open, and A can buy petrol there

Why has introspection been the main research method in classical and neo-Gricean
pragmatics? In the first place, classical and neo-Gricean pragmatics is a philosophically
inspired pragmatic theory of language use, the main aim of which is to provide
a systematic account of meaning as it is intended by a speaker and understood
by the addressee in an attempt to work out what is meant (meaning_nn) from the
conjunction of what is said and what is conversationally implicated. In formulating
such a theory, Grice’s point of departure is logical particles (such as not, and, and
some) of natural language, that is, the natural language analogues or counterparts
of what he called “formal devices” (Grice 1989: 22) like ~, &, and ⋀ in logic and
the philosophy of language. It is natural, therefore, that not only did his research
agenda focus on some of the topics emerging from the principal concerns of, but
his main research methodology also followed the traditional methods adopted in,
twentieth-century Anglo-American analytic philosophy. Secondly, classical and
neo-Gricean pragmatics is a pragmatic theory that reflects not only a native speak-
er’s linguistic performance but his or her linguistic competence as well. Stated
thus, together with, for example, semantics, syntax, and phonology, it constitutes
an essential component of an overall theory of a native speaker’s linguistic ability.
Thirdly, and more importantly, sometimes introspection constitutes the only direct
means for obtaining and assessing certain aspects of meaning and use in language.
These linguistic semantic and pragmatic aspects are unlikely to be found in an
actual corpus of linguistic data, because they are unlikely to occur in real language
usage. While a linguistic corpus contains a record of linguistic expressions that a
speaker or writer actually utilizes to express meanings, it does not contain all the
linguistic expressions that he or she might potentially use to do so. In other words,
whatever we find in a linguistic corpus is restricted to what exists in that corpus,
and may not be representative of the entire potential of meaning and use of a given
language. One such example is given in (24), which Levinson (2000) constructed
introspectively to illustrate his mechanism for the interaction of his Q-, I- and M-principles
in (20).

(24) (Levinson 2000: 160, slightly simplified)

a. It’s not unlikely that Giant Stride will win the Derby, and indeed I think it likely.
b. +>M It’s less than fully likely that Giant Stride will win the Derby.
c. +>Q It is possible it is likely
d. +> Q defeats M
It is possible it is likely

On the other hand, as a research methodology, introspection suffers from a number of
weaknesses, two of which are particularly consequential. First, as pointed out by
Meyer and Nelson (2006), the data collected in an introspective way is usually
decontextualized or comes with a default context. It exists in the linguist’s or
philosopher’s brain/mind rather than in any real communicative context.
To remedy this weakness, in classical and neo-Gricean pragmatics, introspection is
sometimes complemented by other methodological means such as using observed
or attested data. This is the case in (25).

(25) She [Ally] looked me right in the eye and said, ‘I need to know how you feel about
me.’ I didn’t say anything for a good time … ‘I care deeply about you,’ I said. ‘But
you don’t love me?’ ‘I don’t know.’ She nodded. Tears streamed down her face. (Peter
David Marks: A Bad Case of Puppy Love, The New York Times)
I care deeply about you.
+> The speaker does not love the addressee.
(Huang 2014: 33)

Secondly, one linguist or philosopher’s intuitions or introspections may be different
from those of another linguist or philosopher of language, which may lead to
the formulation of different analyses or theories. In extreme cases, an analyst’s
introspections may yield a theory of a linguistic phenomenon that is reflective of
his or her own idiolect (see also Meyer and Nelson 2006). For example, according
to Chierchia and his associates, while a standard upper-bounding Q-scalar implicature
arises from a positive Horn scale, as, for example, in (2) above, in a negative
Horn scale and other downward entailing environments, it is quite weak and even
blocked, as in (26b).
(26) a. Negative Horn scale <not some, not many, not most, not all>
b. The earthquake didn’t kill many of the villagers.
c. +> The earthquake killed some of the villagers.

On the basis of this introspective judgment of the data, Chierchia (2004, 2013)
and Chierchia et al. (2012) argued that Q-scalar implicatures be computed composi-
tionally. Furthermore, he devised an interpretation procedure, according to which,
Q-scalar implicatures are calculated locally in the tree diagram of a sentence and
are integrated in the semantics where they occur. This has the consequence that
the computation of Q-scalar implicatures falls under compositional semantics and is hence
part of grammar. But this introspective judgment of the data is challenged by Horn
(2006). In Horn’s view, Q-scalar implicatures stemming from a negative Horn scale
are not less robust than those which are derived from its positive counterpart –
contra Chierchia. In fact, as pointed out by both Levinson (2000: 82, 254–255) and
Horn (2006), the alleged blockage of Q-scalar implicatures is due to the fact that a
positive Horn scale is reversed under negation and other downward entailing oper-
ators and consequently a different Q-scalar implicature is derived from the inverse
scale (see also Huang 2007, 2011, 2014a: 67).
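The scale-reversal point can be made concrete with a small Python sketch. It assumes a single hand-listed quantifier scale, ordered from strongest to weakest, and treats negation simply as prefixing "not"; the names POSITIVE_SCALE, negate, and q_scalar_implicatures are illustrative inventions rather than an established implementation.

```python
# Positive Horn scale, strongest item first (an assumption of this sketch).
POSITIVE_SCALE = ["all", "most", "many", "some"]


def negate(expr):
    """Toy negation: strip a leading 'not ', otherwise add one."""
    if expr.startswith("not "):
        return expr[len("not "):]
    return "not " + expr


def q_scalar_implicatures(item, negated=False):
    """Negate every alternate that is stronger than the item actually used."""
    if negated:
        # Under negation the scale is reversed, as in (26a):
        # <not some, not many, not most, not all>.
        scale = ["not " + x for x in reversed(POSITIVE_SCALE)]
        used = "not " + item
    else:
        scale = POSITIVE_SCALE
        used = item
    stronger = scale[:scale.index(used)]
    return [negate(alt) for alt in stronger]


print(q_scalar_implicatures("some"))                # ['not all', 'not most', 'not many']
print(q_scalar_implicatures("many", negated=True))  # ['some']
```

The second call mirrors (26): from "not many", the stronger alternate on the inverse scale is "not some", and negating it yields "some", as in (26c). On this view an implicature is derived in the negative case too, only from a different scale.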
Another case in point is concerned with so-called embedded (conversational)
implicatures in (27).
(27) John believes that some of the research institutes are carrying out a study of potential
vaccines against the Zika virus.
+> John believes that not many/most/all of the research institutes are carrying out a
study of potential vaccines against the Zika virus. (strong)
+> John does not believe that many/most/all of the research institutes are carrying out
a study of potential vaccines against the Zika virus. (weak)

Here, the introspections of Chierchia and other conventionalists – scholars who are
attempting to reduce embedded implicatures to the conventional, lexico-grammatical
content of a sentence – are different from those of Geurts (2010). According
to Geurts, contrary to the conventionalist continuity hypothesis or stance that the
upper-bounded reading of a Q-scalar implicature occurs across the board, be it at the
sentential level (unembedded) or at the sub-sentential level (embedded) and that it
“occurs systematically and freely in arbitrarily embedded positions” (Chierchia et
al. 2012), an embedded Q-scalar implicature requires special linguistic marking such
as a contrastive stress. It is marginal and rare and sometimes the upper-bounded
reading has to be forced. In other words, an embedded Q-scalar implicature consti-
tutes an exceptional and marked case. Furthermore, the use of embedded and unem-
bedded scalar expressions is computed differently. While the use of unembedded
scalar expressions invites Q-scalar implicatures, the use of embedded ones does not.
Moreover, embedded scalar expressions are frequently dealt with on a case-by-case
basis (see also Huang 2017c: 170–171). In an attempt to strengthen and confirm
each side’s introspection and analysis of embedded implicature, another weapon in
the pragmaticist’s arsenal, namely pragmatic experimentation, is resorted to. But the
results of these experimental tests are not conclusive: while conventionalism
set forth by Chierchia and his associates has received support from, for instance,
Clifton Jr. and Dube (2010) and Chemla and Spector (2011), the Gricean global
analysis produced by Geurts has been experimentally backed by, for example,
Geurts and Pouscoulous (2009) and Geurts and Tiel (2013).1
In summary, as pointed out by Talmy (2007a, b), each research method used in
linguistic investigation has a different profile for what it is better or worse at. This
is true of introspection when applied to pragmatics as well. Interestingly enough,
meaning is one of those aspects of language which the introspective methodology
is best at. Furthermore, introspection has the advantage over other research meth-
odologies in that it appears to be the only one that has unique access to meaning
directly. That is, perhaps, why introspection has remained the dominant research
and analytical methodology in the study of meaning and use in philosophically
motivated or inspired pragmatic theories including classical and neo-Gricean pragmatics.
Moreover, as I have briefly mentioned above, introspection can be, and
indeed is, complemented by other research methods such as the use of attested data
and experimentation.2
Linguistic introspection, according to Talmy (2007a, b), is both a natural and
an indispensable component of language cognition, carrying out certain necessary
functions. When linguists/philosophers including semanticists/pragmaticists and
philosophers of language utilize introspection to examine meaning and language
use, they are merely employing, perhaps slightly more systematically, a cognitive
faculty that is already in place for everyday (linguistic) functioning.

1
Notice the so-called “experimental paradox” – a well-known dilemma in experimental
psycholinguistics including experimental pragmatics. The dilemma is that the more
perfect an experiment, the less like the real speech situation it is, and the more likely
that subjects of the experiment will produce unnatural responses. On the other hand, the
more like the real speech situation the experiment, the less easy for the experimenters
to control the external factors that may interfere with the experiment. The consequence
of this paradox is that it is almost impossible to design a perfect experiment (see e. g.
Huang 2017a).

4. Falsifiability

Simply put, falsifiability refers to the Popperian thesis in the philosophy of science
that an empirically-based scientific theory (under which linguistics falls) can only
be refuted, but not be confirmed. This is because no matter how many confirming
observations (i. e. observations that are compatible with the empirical predictions
of the theory) can be achieved, potential disconfirming observations can never be
ruled out. This criterion of falsifiability is considered by Popper as the most funda-
mental methodology for an empirically-based science, according to which, while
physics, for example, is an empirically-based scientific theory, astrology, philo-
sophical metaphysics, and psychoanalysis, for instance, are not (e. g. Popper 1973).
As an empirically-based scientific theory of linguistic meaning and language
use, classical and neo-Gricean pragmatics is formulated in such a way that its
empirical predictions can be falsified. In other words, the claims of classical and
neo-Gricean pragmatics can be empirically assessed and tested for their truth or
falsity, that is, confirmed or disconfirmed. One such claim is that Grice’s co-operative
principle and its component maxims of conversation are universal. More
specifically, in regard to Grice’s first sub-maxim of Quantity or Horn’s or Levinson’s
Q-principle, the prediction is that the use of a semantically weaker expression
in a Horn scale in any language will create a Q-scalar implicature that the interpretations
associated with the alternate, semantically stronger expressions in the same set
do not hold in that language.3

2
Needless to say, there are cases in pragmatic research where introspection cannot be
used. This is largely the case, for example, in historical pragmatics. In historical
pragmatics, including its neo-Gricean oriented variety, attested textual data (collected
in corpora) is usually used. See, for instance, Traugott (2004).
3
In Sperber and Wilson’s (1986, 1995) relevance-theoretic framework, pragmatics is
reduced to a single notion of relevance, which is realized in two principles of relevance.
But unlike Grice’s co-operative principle and its attendant maxims of conversation,
the principles of relevance are not maxims addressed to a speaker, known by the
addressee, and obeyed or exploited in communication. Rather, grounded in a general
view of human cognition, they are an automatic reflex of the human mental capacity
that works without the communicators having any overt knowledge of it. How do a
speaker and the addressee follow the principles of relevance? They do not. According
to Sperber and Wilson (1995: 162), “[c]ommunicators and audience need no more know
the principle of relevance to communicate than they need to know the principles of
genetics to reproduce. Communicators do not “follow” the principle of relevance; and
they could not violate it even if they wanted to. The principle of relevance applies without
exception: every act of ostensive communication communicates a presumption of
relevance”. Relevance is thus a form of unconscious inference. In other words, the principles
of relevance are governing cognitive principles that are not themselves an object
of processing. This raises the larger issue of whether relevance theory, as formulated
thus, can be falsified or not. Given that relevance is an exceptionless generalization, it is
likely to be immune from any possible counterexamples (see e. g. Huang 2007, 2014a:
290–291, but see e. g. Wilson 2017 for spirited counterarguments).

This claim of universality for Grice’s co-operative principle and its associated
set of conversational maxims was called into question by Keenan (1976). On the
basis of the anthropological fieldwork she conducted in a small village in Madagascar,
Keenan argued that the Malagasy-speaking culture of the country is a
speech community in which Grice’s co-operative principle and, in particular, his
first sub-maxim of Quantity are not conformed to. For example, she noticed that in
talking to her son, a Malagasy mother once used (28) to refer to her husband.
(28) Mbola matory ve ny olona?
‘Is the person still asleep?’

Given Grice’s first sub-maxim of Quantity or Horn’s or Levinson’s Q-principle,
since the mother employed a semantically weaker expression olona (person), she
would conversationally imply that the person referred to is not her husband or her
son’s father.
On the basis of examples like (28), Keenan concluded that Grice’s pragmatic
theory is culture-specific rather than universal. However, if we examine the Malagasy
fact more closely, we find that the use of a “general animate noun referring
to some social category of which the referent is a member” (Keenan 1976:
72) is not just in conformity with Grice’s first sub-maxim of Quantity, it actually
requires the existence of this pragmatic sub-maxim/principle for the usage to be
interpreted (see also Brown and Levinson 1978: 288–289, Levinson 2000: 423). As
Keenan herself was aware, “[i]t would be misleading to conclude that the maxim
‘Be informative’ does not operate at all in a Malagasy community. We would not
be justified in proposing the contrary maxim ‘Be uninformative’ as a local axiom”
(Keenan 1976: 75–76). In fact, Grice’s first sub-maxim of Quantity or Horn’s or
Levinson’s Q-principle does generally hold for the Malagasy-speaking culture, as
is attested by (29).4
(29) Misy tanora tia ny hira malaza.
exist young like the song famous
‘Some young people like famous songs.’
+> “Not many/most/all young people like famous songs.”

4
This was confirmed to me by Larry Horn (personal communication), who checked the
Malagasy fact with Keenan after her paper was published.

What Keenan has shown, however, is that in the Malagasy society, Grice’s first
sub-maxim of Quantity or Horn’s or Levinson’s Q-principle can be overridden by
some sociolinguistic principle such as the one of avoiding bringing tsiny ‘guilt’ to
the speaker or henatra ‘shame’ to the speaker’s family. As pointed out by Keenan
(1976):

A second and perhaps more significant motivation for revealing less information than
would satisfy the addressee is the fear of committing oneself explicitly to some particular
claim. Individuals regularly avoid making explicit statements about beliefs and
activities. They do not want to be responsible for the information communicated. For
example, if someone asks, ‘Who broke the cup?’, most speakers would not like to be
the one to specify the culprit. Such a statement may have unforeseen unpleasant conse-
quences for him and his family, and he alone would have to shoulder the tsiny (the guilt)
for uttering such a claim (original emphasis). (Keenan 1976: 70)

Other factors that outweigh the operation of Grice’s first sub-maxim of Quantity
or Horn’s or Levinson’s Q-principle in the Malagasy-speaking community, mentioned
by Keenan, include the consideration that satisfying the principle “would be indiscrete,
impolite, unethical, loss of face, etc.” (Keenan 1976: 69). In the case of personal reference,
it is because of a particular Malagasy social taboo on identifying an individual
in utterances that the mother did not deploy a more informative term such
as “your father” to refer to her husband. Instead, she used a general noun. By
referring in this way, she succeeded in identifying him without bringing harm to him
or shame to herself. If Grice’s first sub-maxim of Quantity or Horn’s or Levinson’s
Q-principle did not work at some deeper level, her son would fail to recognize
the intended referent. Notice that Grice’s cooperative principle and its constitu-
ent maxims of conversation including his first sub-maxim of Quantity define an
“unmarked” presumptive framework for communication, the essential assumption
being “no deviation from rational efficiency without a reason”. The deviation here
is the culture-specific Malagasy taboo on exact identification, but the norm is the
universal, first sub-maxim of Quantity proposed by Grice. In other words, social or
cultural factors such as taboos are implicated in the classical way, with maximum
theoretical parsimony, from Grice’s co-operative principle and its component maxims
of conversation including the first sub-maxim of Quantity.5

If the counterexamples presented by Keenan above are “apparent” rather than
“real” ones, the question that arises next is whether or not there are “genuine”
counterexamples to Grice’s co-operative principle and its attendant maxims of conversation.6
The answer is yes. In my work on neo-Gricean lexical pragmatics, I
discussed lexical narrowing (e. g. Huang 2009, 2017b, d, see also Huang 2015a). By
lexical narrowing is meant the phenomenon whereby the use of a lexical expression
implicitly conveys a meaning that is more specific than the lexical item’s lexically
encoded meaning. Within the framework of neo-Gricean lexical pragmatics, lexical
narrowing can be grouped into two types. In the first, the use of the superordinate
term of a hyponymic taxonomy where there is a specific hyponym denotes more
narrowly the complement of the extension of the hyponym. This is the case for (30).
(30) John broke a finger.
+> John didn’t break a thumb.

Lexical narrowing of this type follows directly from Horn’s or Levinson’s Q-principle.
Notice that thumb and finger form a Horn scale. Given the Q-principle, from
the use of the semantically weaker finger, we obtain the pragmatically narrowed
meaning ‘not thumb’. This Q-based strengthening of meaning typically gives rise
to what Horn (1984) and Levinson (2000) called autohyponymy – the phenomenon
whereby a lexical item has two senses, one of which is included in the other. Other
examples include rectangle +> ‘not square’, gay +> ‘not lesbian’, and actor +>
‘not actress’.

5
More recently, Senft (2008) claimed that Grice’s maxim of Quality and his first and
second sub-maxims of Manner are not adhered to by the Kilivila-speaking Trobriand
Islanders of Papua New Guinea. This is the case with both their ritualized communication
and everyday conversation, especially with the use of the non-diatopical register/variety
called biga sopa (the joking or lying speech, the indirect speech, and the speech
which is not vouched for). While the details of Senft’s work need to be more carefully
studied, his claim seems to be another apparent counterexample. Heated debates
about the issue of universality in terms of the distinction between the etic versus emic
approach/grid have been going on also with regard to, for example, speech act theory,
politeness/impoliteness theory, and conversational structure.
6
In rejecting Keenan’s counterexamples as real ones, I am fully aware of what a recent
Editorial in Nature (2015) calls the human “cognitive bias”, namely, “[t]he human
brain’s habit of finding what it wants to find,” which “is a key problem for research”.
As pointed out by the Editorial, “One enemy of robust science is our humanity – our
appetite for being right, and our tendency to find patterns in noise, to see supporting
evidence for what we already believe is true, and to ignore the facts that do not fit”.
See also the other three relevant papers published in the same issue of Nature (vol. 526,
no. 7572). Somewhat related is that as pointed out by Kuhn (1962/2012), the research
methodology of observation, for example, is “strongly theory-laden”. This is because
when a researcher makes observations, he or she has already been significantly influ-
enced by his or her previously held theoretical and methodological assumptions. The
same can also be said of experimentation.

Secondly, there is the R/I-based lexical narrowing. The basic idea here is that
the use of a semantically general lexical item is R/I-implicated to a semantically
more specific interpretation. This is the case for (31), where the semantically gen-
eral term milk is R/I-narrowed to denote its culturally salient subset ‘cow’s milk’
(cf. goat’s milk, soy milk, almond milk, coconut milk, rice milk etc.).
(31) John had a glass of milk for breakfast this morning.
+> John had a glass of cow’s milk for breakfast this morning.

Other examples include nurse +> “female nurse”, relationship +> “sexual/roman-
tic relationship”, and drink +> “alcoholic drink”. Of these, Horn (1984) and Lev-
inson (2000) were of the view that while drink is an autohyponym, nurse is not
(see especially Huang 2017d for a wide variety of examples and detailed analyses).
While the analysis works quite well for English, it becomes problematic when
we turn to Chinese. For example, given the R/I-principle, and the same social
division of labour in China, it is predicted that hushi (nurse) rather than nü hushi
(female nurse) should be normally used in Chinese, but this prediction is falsified:
the latter is commonly employed in the language.

5. Methodological reductionism or expansionism?

As we saw in section 2.1, in his theory of conversational implicature, Grice put
forward an overarching co-operative principle and a set of nine attendant maxims
of conversation classified into four categories. Since its inception, the Gricean
mechanism has been subject to numerous attempts at revision. The revisions have
been of two types: reduction and expansion.
Harnish (1976) and Kasher (1976) made two early neo-Gricean reductionist
attempts. The former argued that Grice’s maxims of Quality and Quantity be collapsed
into a single maxim, namely, make the strongest relevant claim justifiable
by your evidence. In the latter, the entire Gricean machinery is taken to follow
from a “most effective, least effort” rationality principle of some sort. However,
as already discussed in section 2.2, of all the neo-Gricean reductionist models, the
most influential are the two-principled one proposed by Horn and the three-princi-
pled one posited by Levinson.7

7
It goes without saying that another influential, more radical, post-Gricean reductionist
model is relevance theory. Notice that in Sperber and Wilson (1986), there was only one
principle of relevance, namely, the communicative principle of relevance. In Sperber
and Wilson (1995), however, there were two principles of relevance: the cognitive and
the communicative principles of relevance. According to Sperber and Wilson (1995:
261), “[t]he change is, of course, expository and not substantive”. On Horn’s (2007)
view, one-principled relevance theory is implicitly dualistic in nature, given that rel-
172 Yan Huang

By contrast, in a quite contrary spirit to the reductionist approach, Leech (1983)

proposed that the Gricean maxims be revised upward, that is, be proliferated. In
particular, encouraged by Grice’s (1989: 28) remarks on “other maxims (aesthetic,
social, or moral in character), such as ‘Be polite’”, he argued that a politeness prin-
ciple be added to the Gricean programme, and that it should be taken as co-ordinate
in nature to Grice’s co-operative principle. The politeness principle is realised by
a set of maxims: tact/generosity, approbation/modesty, agreement, and sympathy
(Leech 1983: 131–132). In Leech (2007), his politeness principle was restyled as a
Grand Strategy of Politeness (GSP) and his set of attendant maxims was reformu-
lated as a set of paired pragmatic constraints.
(32) Leech’s (2007: 182) GSP and pragmatic constraints
a. GSP
In order to be polite, S expresses or implies meanings which associates a high value
with what pertains to O or associates a low value with what pertains to S. (S = self,
speaker; O = other, mainly addressee)
b. Pragmatic constraints
(i) Generosity: Place a high value on O’s wants.
Tact: Place a low value on S’s wants.
(ii) Approbation: Place a high value on O’s qualities.
Modesty: Place a low value on S’s qualities.
(iii) Obligation (of S to O): Place a high value on S’s obligations to O.
Obligation (of O to S): Place a low value on O’s obligations to S.
(iv) Opinion-agreement: Place a high value on O’s opinions.
Opinion-reticence: Place a low value on S’s opinions.
(v) Feeling-sympathy: Place a high value on O’s feelings.
Feeling-reticence: Place a low value on S’s feelings.

As pointed out in Huang (2007, 2014: 44), a number of arguments, however, can
be mounted against Leech’s expansionist analysis. In the first place, if we are
allowed to invent a pragmatic maxim/constraint for every regularity that is actu-
ally observed in the use of language, not only will we have an indefinite number
of maxims/constraints, but pragmatic theory will be too unconstrained to be fal-
sified. Secondly, if there are too many maxims/constraints in a theory, then it will
become very difficult, if not impossible, to tackle the projection problem, namely,
the problem of which maxim/constraint will override which under what circum-
stances. As the third argument against Leech’s expansionist approach, the distri-
bution of politeness/impoliteness (who can/has to be polite/impolite to whom) is
socially controlled. By contrast, language usage principles of the Gricean sort are

evance is measured in a minimax of give-take effort and effect. Note further that this
Cartesian principle of methodological reductionism has to some extent become the
orthodox methodological approach, and has been applied rather successfully to natural
sciences (cf. Popper 1945).
Research methodology in classical and neo-Gricean pragmatics 173

of a quite different status. As already mentioned, Grice’s cooperative principle
and its constituent maxims of conversation define an “unmarked” or socially neu-
tral (and indeed asocial) presumptive framework for communication, the essential
assumption being “no deviation from rational efficiency without a reason”. Polite-
ness/impoliteness considerations are, however, just such principled reasons for
deviation. Therefore, linguistic politeness/impoliteness is also implicated in the
classical way, with maximum theoretical parsimony, from Grice’s co-operative
principle and its component maxims of conversation. Fourthly, the assumption of
co-operative behaviour is hard to undermine: tokens of apparent non-co-operative
behaviour tend to get interpreted as in fact co-operative at a “deeper” level.
Now, if Leech’s politeness principle/GSP had maxim-like status, we would expect
the same robustness: it should be hard to be impolite. But this is clearly counterin-
tuitive (Brown and Levinson 1987: 4–5). Finally, from a methodological point of
view, unlike the reductionist approach, Leech’s expansionist approach runs directly
against the spirit of a meta-theoretical/meta-methodological desideratum known as
“modified Occam’s razor” (Grice 1989: 47), which dictates that theoretical entities
are not to be multiplied beyond necessity.8

6. Comparison of linguistic characteristics across typologically different languages by means of cross-linguistic data

Finally, let me show how the use of cross-linguistic data to compare linguistic char-
acteristics across typologically different languages contributes to the formulation
and development of (my version of) the neo-Gricean pragmatic theory of anaphora
(Huang 1991, 1994/2007, 2000a, b, 2004, 2007, 2013a, 2014a, 2016b, 2017b, e; see
also Levinson 1987, 1991, 2000).
Anaphora is definable as a relation between two or more linguistic elements,
in which the interpretation of one (called an anaphoric expression) is in some way
determined by the interpretation of the other (called an antecedent). Linguistic
expressions that can be employed as an anaphoric expression include gaps (or
empty categories), pronouns, reflexives, proper names, and definite descriptions.
Within the principles-and-parameters theory and its minimalist descendant,
Chomsky (e. g. 1995) postulated three binding principles in (33), providing an
account of the allegedly universal, syntactic distribution of three types of overt
anaphoric expressions, namely, lexical anaphors (such as reflexives and reciprocals
like himself and each other in English), pronominals (such as pronouns like he in

8
It should be noted that modified Occam’s razor does not necessarily require that entities
be maximally reduced. Put the other way round, it is not necessarily the case that “the
less entities, the better”. Rather, entities should be reduced in an “optimal” way.
174 Yan Huang

English), and r[eferential]-expressions (such as proper names and definite descriptions
like John and the President of the United States of America in English), in
language.
(33) Chomsky’s binding conditions
A. An anaphor is bound in a local syntactic domain.
B. A pronominal is free in a local syntactic domain.
C. An r-expression is free.

The paradigmatic patterns of binding are illustrated by (34) from English.

(34) a. Newton₁ admired himself₁.
b. Newton₁ admired him₂.
c. Newton₁ admired Newton₂.

In (34a), himself, being a reflexive, is an anaphor in the Chomskyan sense. As
such, it falls under binding condition A, according to which, it is bound to its local
antecedent Newton. Next in (34b), him, being a pronominal, is subject to binding
condition B. Given binding condition B, it cannot be bound in its local domain, and
there is thus disjoint reference between it and Newton. Finally, in (34c), the second
Newton, being a proper name, is an r-expression. By binding condition C, it cannot
be co-indexed with the first Newton. From English examples like these, Chomsky
concluded that the syntactic distribution of anaphors, pronominals, and r-expres-
sions is universally dictated by binding conditions A, B, and C, respectively. How-
ever, when confronted with a wider range of languages other than English, these
binding conditions run into serious difficulties.
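
The predictions that binding conditions A, B, and C make for (34) can be made concrete in a small sketch. The Python fragment below is an illustration only, under strong simplifying assumptions that are not part of the original analysis: the local syntactic domain is equated with a single clause containing a subject and an object, "bound" is reduced to co-indexation with the clause-mate subject (c-command is not modelled), and the names NP and satisfies_binding_conditions are hypothetical.

```python
from dataclasses import dataclass

# Toy encoding of binding conditions A-C in (33).
# Assumption: the local domain is one clause with a subject and an object,
# and "bound" simply means co-indexed with the clause-mate subject.

ANAPHOR, PRONOMINAL, R_EXPRESSION = "anaphor", "pronominal", "r-expression"


@dataclass(frozen=True)
class NP:
    form: str
    kind: str   # ANAPHOR, PRONOMINAL or R_EXPRESSION
    index: int  # referential index


def satisfies_binding_conditions(subject: NP, obj: NP) -> bool:
    """Return True if the object NP obeys its binding condition."""
    locally_bound = subject.index == obj.index
    if obj.kind == ANAPHOR:
        # Condition A: an anaphor must be bound in its local domain.
        return locally_bound
    # Conditions B and C: a pronominal or an r-expression must be free here
    # (condition C holds everywhere, hence also in the local domain).
    return not locally_bound


newton = NP("Newton", R_EXPRESSION, 1)
print(satisfies_binding_conditions(newton, NP("himself", ANAPHOR, 1)))      # cf. (34a)
print(satisfies_binding_conditions(newton, NP("him", PRONOMINAL, 2)))       # cf. (34b)
print(satisfies_binding_conditions(newton, NP("Newton", R_EXPRESSION, 2)))  # cf. (34c)
print(satisfies_binding_conditions(newton, NP("him", PRONOMINAL, 1)))       # locally bound pronoun
```

The first three calls mirror (34a) to (34c) and are accepted by the toy checker, while the last one, a locally bound pronominal, is rejected. A checker of this kind is deliberately rigid, which is what makes the conditions testable against cross-linguistic data.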
Let me take binding condition A first. The cross-linguistic, syntactic distribu-
tion of anaphors including reflexives violates this primitive rule of grammar in
both directions. On the one hand, many (and perhaps the majority of) languages
in the world “systematically” allow long-distance reflexives – reflexives that are
bound outside their local syntactic domain, and even across sentence boundaries
into discourse. These include most East, South, and Southeast Asian languages
(e. g. Chinese, Kannada, and Malay), some mainland and insular Scandinavian
languages (e. g. Norwegian, Swedish, and Icelandic), some Germanic (other than
Scandinavian) and Romance languages (e. g. Dutch, Italian, and Old Provençal),
some Slavonic languages (e. g. Czech, Polish, and Russian), and languages like
Finnish, Modern Greek, Inuit, KiNande, Marathi, Northern Pomo, Tuki, and Turk-
ish (see e. g. Huang 2000a: 19–20, 90–130 for examples from, and sources of, these
languages). An example from Chinese is given in (35).
(35) Xiaoming₁ shuo Xiaohua₂ kanbuqi ziji₁/₂.
Xiaoming say Xiaohua look down upon self
‘Xiaoming₁ says that Xiaohua₂ looks down upon him₁/himself₂’.

On the other hand, a reflexive may not be bound in its local syntactic domain, as
in Dutch, Norwegian, and Swedish (Huang 2000a: 20). This is shown by (36) from
Dutch.
(36) (Dutch, cited in Huang 2000a: 20)
*Rint veracht zich.
Rint despises self
‘Rint despises himself’.

Next, evidence from various languages in the world casts serious doubts on Chom-
sky’s binding condition B. First, many languages in the world have no reflexives,
and consequently utilize pronouns as one of the means to encode coreference.
These include some Low West Germanic languages (e. g. Old and Middle Dutch,
Old English, Old Frisian, and perhaps West Flemish and Modern Frisian), Bamako
Bambara, Biblical Hebrew, Isthmus Zapotec, the majority of Australian aboriginal
languages (e. g. Gumbaynggir, Jiwarli, and Nyawaygi), some Austronesian abo-
riginal languages (e. g. Chamorro, Kilivila, and Tahitian), some Papuan languages
(e. g. Harway), all Oceanic languages, and many pidgin and creole languages (e. g.
the Spanish-based Palenquero, and perhaps Bislama, Chinook Jargon, the French-
based Guadeloupe, the Arabic-based KiNubi, Kriyol, Martinique Creole, and
Negerhollands). Secondly, there are languages that lack first- and/or second-per-
son reflexives. In these languages, first- and second-person pronouns are instead
used as bound anaphors. Some Germanic (e. g. Danish, Dutch, and Icelandic) and
Romance (e. g. French and Italian) languages, for instance, belong to this type.
Thirdly, the use of a locally-bound third-person pronoun in syntactic structures
where its corresponding, third-person reflexive is not available is attested in a
range of languages. This is the case of, for example, Catalan, French, Galician,
Piedmontese, Portuguese, Rumanian, Russian, Sardinian, Spanish, and Tsaxur (see
e. g. Huang 2000a: 21–22 for examples from, and sources of, these languages).
Given the standard formulation of Chomsky’s binding conditions A and B, it
is predicted that anaphors (e. g. reflexives and reciprocals) and pronominals (e. g.
pronouns) be in strict complementary syntactic distribution, that is, anaphors can
occur only where pronominals cannot, and vice versa. This is because the two
binding conditions are precise mirror-images of each other. Cross-linguistically,
this predicted syntactic, distributional complementarity between anaphors and pro-
nominals, however, breaks down. Take bound possessive anaphora as an example.
(37) (Gimira, cited in Huang 2000a: 24)
Ba/yi dor gotue.
self’s/his sheep sold-3M-FIN
‘He₁ sold self’s₁/his₂ sheep’.

Here, languages in the world can be grouped into three types: (i) those allowing
anaphors but not pronominals (e. g. Basque, Chechen, Danish, Gimira, Hindi/Urdu,
Ingush, Kashmiri, Norwegian, Latin, Russian, and Telugu), as in (37) above; (ii)
those permitting pronominals but not anaphors (e. g. Akan, Arabic, English, Ger-
man, Guugu Yimidhirr, and Spanish), and (iii) those permitting both anaphors and
pronominals (e. g. Bangala, Bengali, Chinese, Japanese, Kannada, Korean, Malay,
Malayalam, Marathi, Oriya, Sinhala, Tamil, and Tuki) (see e. g. Huang 2000a:
24–25 for examples from, and sources of, these languages). Whereas Chomsky’s
binding conditions A and B may jointly make correct predictions for the distribution
of bound possessive anaphora in type (i), “anaphors only” and perhaps also
in type (ii), “pronominals only” languages, depending on how the local syntactic
binding domain is technically defined, they certainly make wrong predictions for
type (iii), “both anaphors and pronominals” languages.
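
The complementarity argument can be illustrated with a short sketch. The Python fragment below takes a handful of languages from the three types listed above and checks which of them conform to strict complementarity, that is, allow exactly one of the two forms in the bound possessive position. The dictionary layout and the function name is_complementary are assumptions made for illustration, and the check deliberately ignores the further question of how the local binding domain is defined.

```python
# Bound possessive anaphora: is an anaphor, a pronominal, or both available?
# The language samples come from types (i)-(iii) in the text; the data
# structure and function name are illustrative assumptions.

POSSESSIVE_BINDING = {
    "Gimira":  {"anaphor": True,  "pronominal": False},  # type (i)
    "Danish":  {"anaphor": True,  "pronominal": False},  # type (i)
    "English": {"anaphor": False, "pronominal": True},   # type (ii)
    "Akan":    {"anaphor": False, "pronominal": True},   # type (ii)
    "Chinese": {"anaphor": True,  "pronominal": True},   # type (iii)
    "Tuki":    {"anaphor": True,  "pronominal": True},   # type (iii)
}


def is_complementary(options: dict) -> bool:
    """Strict complementarity: exactly one of the two forms is available."""
    return options["anaphor"] != options["pronominal"]


# Languages in which the predicted complementarity breaks down.
counterexamples = [
    language
    for language, options in POSSESSIVE_BINDING.items()
    if not is_complementary(options)
]
print(counterexamples)
```

Only the type (iii) languages in the sample end up in the list of counterexamples, which is the pattern described in the paragraph above.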
Finally, even a cursory examination of some East, South, and Southeast Asian
languages such as Bangala, Chinese, Hindi/Urdu, Japanese, Malayalam, Sinhala,
Vietnamese, and Thai indicates that Chomsky’s binding condition C cannot be
taken as a primitive rule of grammar, either.
(38) (Thai, cited in Huang 2000a: 27)
Cɔɔn₁ chɔɔp Cɔɔn₁.
John likes John
‘John₁ likes John₁.’9

As an alternative to various syntactic and semantic approaches, a neo-Gricean
pragmatic theory of anaphora has been developed by Huang (1991, 1994/2007,
2000a, b, 2004, 2006, 2007, 2014a, 2016b, 2017b; see also Levinson 1987, 1991,
2000), based on a rich collection of data drawn from more than 550 of the
world’s languages, which represent a variety of areal, genetic, and typological
characteristics (see especially Huang 2000a). The central idea underlying the
theory is that certain patterns of anaphora can be produced and comprehended
by utilizing pragmatically enriched meaning such as conversational implicatures,
dependent on a language user’s knowledge of the range of options available in
the grammar, and of the systematic use or avoidance of particular anaphoric
expressions or structures on particular occasions.

9
One of the current developments in the Chomskyan syntactic analysis of binding is to
eliminate all the conditions that are postulated specifically for binding such as binding
conditions A, B and C, discussed above, and to reduce these specific conditions to ele-
mentary, general, and independent principles of the computational system of language
within Chomsky’s minimalist programme. Whereas this new development constitutes
a step forward in our understanding of anaphora and binding, it creates a number of
new conceptual and empirical problems of its own (see e. g. Huang 2014a: 350–351 for
further discussion).

(39) Huang’s revised neo-Gricean pragmatic apparatus for anaphora (simplified)

(i) The use of an anaphoric expression x I-implicates a local co-referential
interpretation, unless (ii) or (iii).
(ii) There is an anaphoric Q-scale <x, y>, in which case the use of y
Q-implicates the complement of the I-implicature associated with
the use of x in terms of reference.
(iii) There is an anaphoric M-scale {x, y}, in which case the use of y
M-implicates the complement of the I-implicature associated with the use
of x, in terms of either reference or expectedness.

Needless to say, any interpretation generated by (39) is subject to the general con-
sistency constraints applicable to conversational implicatures. These constraints
include real-world knowledge, contextual information, and semantic entailments.
There is substantial cross-linguistic evidence to show that empirically, the
revised neo-Gricean pragmatic theory of anaphora is more adequate than both a
syntactic and a semantic approach. Consider, for instance, Chomsky’s binding con-
ditions, as in (33), and their paradigmatic illustrations, as in (34) above. On the
neo-Gricean pragmatic account, Chomsky’s binding conditions B and C need not
be laid at the doorstep of generative syntax and can be reduced to pragmatics. In
somewhat simplified terms, this can be achieved in the following way. If binding
condition A is taken to be either grammatically constructed (as in the English-type
languages) or pragmatically specified via the I-principle (as in the Chinese-type
languages), then binding condition B can be pegged directly to the application of
the Q-principle. Given a speaker’s knowledge of grammar and/or the I-principle,
an anaphor/reflexive will be chosen if coreference is intended. This has the con-
sequence that if the anaphor/reflexive is not employed but a pronominal/pronoun
is used instead, a Q-implicature will arise, namely, no coreference is intended. In
other words, we have a Horn scale <anaphor/reflexive, pronominal/pronoun> here
such that the use of a semantically weaker pronominal/pronoun Q-implicates that
the more informative, coreferential interpretation associated with the use of the
anaphor/reflexive cannot be truthfully entertained, as in (34b). By the same rea-
soning, binding condition C can also be eliminated. Wherever an anaphor/reflexive
could occur, the use of a semantically weaker r-expression/proper name Q-impli-
cates the non-applicability of the more informative, coreferential interpretation
associated with the use of the anaphor/reflexive. This is exactly what has happened
in (34c). Furthermore, the revised neo-Gricean pragmatic theory can provide an
elegant account of many of the anaphoric patterns that have embarrassed a gen-
erative analysis such as the case where contra binding condition B, a pronominal/
pronoun is bound in its local syntactic domain. In the case of long-distance anaph-
ora/reflexivization where there is a referential overlap between a long-distance
anaphor/reflexive and a pronominal/pronoun, the concept of unexpectedness is
invoked to explain why such a marked anaphoric expression (that is, a long-distance
anaphor/reflexive) is used. Examined in a more careful way, cross-linguistically,
unexpectedness turns out to be mainly of three types: (i) contrastiveness/emphaticness,
(ii) logophoricity, and (iii) de se attitude/belief ascription. First, long-distance
anaphors/reflexives are commonly used for marking contrast and/or emphasis. A
second dimension of unexpectedness arising from the employment of long-distance
anaphors/reflexives involves logophoricity – the phenomenon whereby the ‘point
of view’ of an internal protagonist of a sentence or discourse, as opposed to that of
the current, external speaker, is being reported using some morphological and/or
syntactic means (see e. g. Huang 1994/2007, 2000a: 172–204, 2002, 2004, 2007,
2010, 2014b, and 2017b, e for detailed discussion of logophoricity). Thirdly and
finally, long-distance anaphors/reflexives can be utilized to encode a de se attitude/
belief – self-locating attitude/belief – ascription (see e. g. Huang 2013b for further
discussion). The use of long-distance anaphors/reflexives to mark unexpectedness
is accountable in terms of the M-principle. Since the grammar allows the unmarked
pronominal/pronoun to be employed to encode coreference, a speaker will use
it if such a reading is intended. On the other hand, if the unmarked pronominal/
pronoun is not used, but the marked long-distance anaphor/reflexive is employed
instead, then an M-implicature will be licensed. The conversational implicature is
that not only coreference but contrastiveness/emphaticness, logophoricity, and/or
de se attitude/belief ascription as well is intended by the speaker.
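
The reasoning in this section, from the apparatus in (39) to the M-principle account of long-distance reflexives, can be summarised in a toy model. The Python sketch below is a simplification under stated assumptions rather than an implementation of the theory: expressions are plain string labels, the syntactic domain ("local" or "long-distance") is supplied by hand, each scale is stored as a pair (x, y) in which the use of y triggers the implicature, and the general consistency constraints (real-world knowledge, context, semantic entailments) are not modelled. The label "I" also covers the case where local coreference is fixed by the grammar, as in English-type languages. All names are hypothetical.

```python
# Toy rendering of the apparatus in (39), limited to the cases discussed in
# this section. Assumptions (not from the source): string labels for
# expressions, a hand-supplied domain, and scales stored as pairs (x, y)
# in which the use of y triggers the implicature.

SCALES = {
    "local": {"Q": [("reflexive", "pronoun"), ("reflexive", "proper name")]},
    "long-distance": {"M": [("pronoun", "reflexive")]},
}


def implicature(expression: str, domain: str) -> dict:
    """Return the principle at work and the reading it yields."""
    scales = SCALES[domain]
    if any(expression == y for _, y in scales.get("Q", [])):
        # (39ii): complement of the I-implicature of x in terms of reference.
        return {"principle": "Q", "coreferential": False, "unexpected": False}
    if any(expression == y for _, y in scales.get("M", [])):
        # (39iii): complement in terms of expectedness; coreference is kept.
        return {"principle": "M", "coreferential": True, "unexpected": True}
    # (39i): default I-implicature of coreference.
    return {"principle": "I", "coreferential": True, "unexpected": False}


for expression, domain in [
    ("reflexive", "local"),          # cf. (34a)
    ("pronoun", "local"),            # cf. (34b)
    ("proper name", "local"),        # cf. (34c)
    ("pronoun", "long-distance"),    # unmarked coreference
    ("reflexive", "long-distance"),  # marked: contrast, logophoricity, de se
]:
    print(expression, domain, implicature(expression, domain))
```

Only the five combinations in the loop correspond to cases discussed in the text; other inputs fall through to the default branch and should not be read as predictions of the theory.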

7. Conclusion

In this chapter, I have evaluated some of the major research methodologies used
in classical and neo-Gricean pragmatics, covering different types of data, different
ways of data collection, and different ways in which data is adopted and analysed.
I have explained why introspection has been the principal research method in clas-
sical and neo-Gricean pragmatics, pointed out its strengths and weaknesses, and
shown that it should and can be complemented by other research methodologies
such as the use of attested data and experimentation. I have then moved to a discus-
sion of falsifiability and methodological reductionism versus expansionism with
regard to linguistic theory-building from a philosophical perspective. Finally, I
have shown how the employment of cross-linguistic data to contrast and compare
linguistic characteristics across a wide range of typologically different languages
can contribute to the formulation and development of a better theory of anaphora
and binding.10

10
I am grateful to Wolfram Bublitz and especially Andreas Jucker for their useful com-
ments on an early version of this article.

References

Atlas, Jay D. and Stephen C. Levinson
1981 It-clefts, informativeness and logical form: Radical pragmatics. In: Peter Cole
(ed.), Radical Pragmatics, 1–61. London: Academic Press.
Bach, Kent
2012 Context dependence. In: Manuel García-Carpintero and Max Kölbel (eds.),
The Continuum Companion to the Philosophy of Language, 153–184. London:
Continuum.
Brown, Penelope and Stephen C. Levinson
1978 Politeness: Some universals in language usage. In: Esther N. Goody (ed.),
Questions and Politeness, 56–310. Cambridge: Cambridge University Press.
Brown, Penelope and Stephen C. Levinson
1987 Politeness: Some Universals in Language Usage. Cambridge: Cambridge Uni-
versity Press.
Chemla, Emmanuel and Benjamin Spector
2011 Experimental evidence for embedded scalar implicatures. Journal of Seman-
tics 28: 359–400.
Chierchia, Gennaro
2004 Scalar implicatures, polarity phenomena, and the syntax/pragmatics interface.
In: Adriana Belletti (ed.), Structures and Beyond, 39–103. Oxford: Oxford
University Press.
Chierchia, Gennaro
2013 Logic in Grammar: Polarity, Free Choice, and Intervention. Oxford: Oxford
University Press.
Chierchia, Gennaro, Danny Fox and Benjamin Spector
2012 Scalar implicature as a grammatical phenomenon. In: Claudia Maienborn,
Klaus von Heusinger and Paul Portner (eds.), Semantics: An International
Handbook of Natural Language Meaning, 2297–2331. Berlin: de Gruyter
Mouton.
Chomsky, Noam
1995 The Minimalist Program. Cambridge, MA: The MIT Press.
Clark, Herbert H. and Adrian Bangerter
2004 Changing ideas about reference. In: Ira A. Noveck and Dan Sperber (eds.),
Experimental Pragmatics, 25–49. New York: Palgrave Macmillan.
Clifton Jr., Charles and Chad Dube
2010 Embedded implicatures observed. Semantics and Pragmatics 3: 1–13.
Geurts, Bart
2010 Quantity Implicature. Cambridge: Cambridge University Press.
Geurts, Bart and Nausicaa Pouscoulous
2009 Embedded implicatures?!? Semantics and Pragmatics 2: 1–34.
Geurts, Bart and Bob van Tiel
2013 Scalar expressions under embedding. Semantics and Pragmatics 6: 1–37.
Grice, H. Paul
1975 Logic and conversation. In: Peter Cole and Jerry Morgan (eds.), Syntax and
Semantics 3: Speech Acts, 41–58. London: Academic Press.
Grice, H. Paul
1978 Further notes on logic and conversation. In: Peter Cole (ed.), Syntax and
Semantics 9: Pragmatics, 113–128. London: Academic Press.
Grice, H. Paul
1989 Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Harnish, Robert M.
1976 Logical form and implicature. In: Thomas G. Bever, Jerrold J. Katz and D.
Terence Langendoen (eds.), An Integrated Theory of Linguistic Ability, 313–
392. New York: Crowell.
Horn, Laurence R.
1984 Toward a new taxonomy for pragmatic inference: Q-based and R-based impli-
cature. In: Deborah Schiffrin (ed.), Meaning, Form, and Use in Context: Lin-
guistic Applications, 11–42. Washington DC: Georgetown University Press.
Horn, Laurence R.
2006 The border wars: A neo-Gricean perspective. In: Klaus von Heusinger and Ken
Turner (eds.), Where Semantics Meets Pragmatics, 21–48. Oxford: Elsevier.
Horn, Laurence R.
2007 Neo-Gricean pragmatics: A Manichaean manifesto. In: Noel Burton-Roberts
(ed.), Pragmatics, 158–183. London: Palgrave Macmillan.
Horn, Laurence R.
2009 WJ-40: Implicature, truth, and meaning. International Review of Pragmatics
1: 3–34.
Horn, Laurence R.
2012 Implying and inferring. In: Keith Allan and Kasia Jaszczolt (eds.), The Cam-
bridge Handbook of Pragmatics, 69–86. Cambridge: Cambridge University
Press.
Horn, Laurence R. and Gregory Ward (eds.)
2004 The Handbook of Pragmatics. Oxford: Blackwell.
Huang, Yan
1991 A neo-Gricean pragmatic theory of anaphora. Journal of Linguistics 27: 301–
335.
Huang, Yan
[1994] 2007 The Syntax and Pragmatics of Anaphora: A Study with Special Reference
to Chinese. Cambridge: Cambridge University Press.
Huang, Yan
2000a Anaphora: A Cross-Linguistic Study. Oxford: Oxford University Press.
Huang, Yan
2000b Discourse anaphora: Four theoretical models. Journal of Pragmatics 32: 151–
176.
Huang, Yan
2002 Logophoric marking in East Asian languages. In: Tom Güldemann and Man-
fred von Roncador (eds.), Reported Discourse, 213–224. Amsterdam: John
Benjamins.
Huang, Yan
2004 Anaphora and the pragmatics-syntax interface. In: Laurence R. Horn and
Gregory Ward (eds.), The Handbook of Pragmatics, 288–314. Oxford: Black-
well.
Huang, Yan
2006 Anaphora, cataphora, exophora, logophoricity. In: Keith Brown (ed.), The
Encyclopedia of Languages and Linguistics, 231–238. Second edition, Vol-
ume 1 of 14. New York: Elsevier Science.
Huang, Yan
2007 Pragmatics. Oxford: Oxford University Press.
Huang, Yan
2009 Neo-Gricean pragmatics and the lexicon. International Review of Pragmatics
1: 118–153.
Huang, Yan
2010 Switch-reference in Amele and logophoric verbal suffix in Gokana: a general-
ized neo-Gricean pragmatic analysis. In: Dingfang Shu and Ken Turner (eds.),
Contrasting Meaning in Languages of the East and West, 75–101. Berlin: Peter
Lang.
Huang, Yan
2013a Bayesian probabilistic model of discourse anaphoric comprehension, linguis-
tic typology, and neo-Gricean pragmatics. Theoretical Linguistics 39: 95–108.
Huang, Yan
2013b De se attitude/belief attribution and neo-Gricean truth-conditional pragmatics:
logophoric expressions in West African languages and long-distance reflex-
ives in East, South, and Southeast Asian languages. In: Neil Feit and Alessan-
dro Capone (eds.), Attitudes De Se: Linguistics, Epistemology, Metaphysics,
185–209. Stanford: CSLI Publications.
Huang, Yan
2014a Pragmatics. Second edition. Oxford: Oxford University Press.
Huang, Yan
2014b Logophoricity and neo-Gricean truth-conditional pragmatics. In: Alessandro
Capone, Franco Lo Piparo and Marco Carapezza (eds.), Perspectives on Lin-
guistic Pragmatics, 217–242. Heidelberg: Springer.
Huang, Yan
2015a Lexical cloning in English: a neo-Gricean lexical pragmatic analysis. Journal
of Pragmatics 86: 80–85.
Huang, Yan
2015b Neo-Gricean pragmatic theory of conversational implicature. In: Bernd Heine
and Heiko Narrog (eds.), The Oxford Handbook of Linguistic Analysis, 615–
639. Second edition. Oxford: Oxford University Press.
Huang, Yan
2016a Pragmatics: language use in context. In: Keith Allan (ed.), The Routledge
Handbook of Linguistics, 205–220. London: Routledge.
Huang, Yan
2016b Aspects of anaphora in Chinese and in some Germanic, Romance, and Slavic
languages, the ‘syntactic’ versus ‘pragmatic’ language typology, and neo-Gri-
cean pragmatics. In: Keith Allan, Alessandro Capone, and Istvan Kecskes
(eds.), Pragmemes and Theories of Language Use, 21–43. Heidelberg and
New York: Springer.
Huang, Yan
2017a Introduction: what is pragmatics? In: Yan Huang (ed.), The Oxford Handbook
of Pragmatics, 1–8. Oxford: Oxford University Press.
Huang, Yan
2017b Neo-Gricean pragmatics. In: Yan Huang (ed.), The Oxford Handbook of Prag-
matics, 47–78. Oxford: Oxford University Press.
Huang, Yan
2017c Implicature. In: Yan Huang (ed.), The Oxford Handbook of Pragmatics,
155–179. Oxford: Oxford University Press.
Huang, Yan
2017d Implicitness in the lexis: Lexical narrowing and neo-Gricean pragmatics. In:
Piotr Cap and Marta Dynel (eds.), Implicitness: From Lexis to Discourse,
67–94. Amsterdam: John Benjamins.
Huang, Yan
2017e Pre-semantic pragmatic enrichment: the case of long-distance reflexivisation.
In: María de Ponte and Kepa Korta (eds.), Reference and Representation in
Language and Thought, 126–143. Oxford: Oxford University Press.
Huang, Yan
2017f Truth-condition-contributing conversational implicatures, intrusive construc-
tions, and neo-Gricean pragmatics. Waiyu Jiaoxue yu Yanjiu 49: 643–662.
Huang, Yan (ed.)
2017g The Oxford Handbook of Pragmatics. Oxford: Oxford University Press.
Jucker, Andreas H.
2009 Speech act research between armchair, field and laboratory: The case of com-
pliments. Journal of Pragmatics 41: 1611–1635.
Kasher, Asa
1976 Conversational maxims and rationality. In: Asa Kasher (ed.), Language in
Focus: Foundations, Methods and Systems, 197–216. Dordrecht: Reidel.
Keenan, Elinor Ochs
1976 The universality of conversational postulates. Language in Society 5: 67–80.
Kuhn, Thomas S.
[1962] 2012 The Structure of Scientific Revolutions. The 50th Anniversary Edition. Chi-
cago: The University of Chicago Press.
Leech, Geoffrey N.
1983 Principles of Pragmatics. London: Longman.
Leech, Geoffrey N.
2007 Politeness: Is there an East-West divide? Journal of Politeness Research 3:
167–206.
Levinson, Stephen C.
1987 Pragmatics and the grammar of anaphora. Journal of Linguistics 23: 379–434.
Levinson, Stephen C.
1991 Pragmatic reduction of the binding conditions revisited. Journal of Linguistics
27: 107–161.
Levinson, Stephen C.
2000 Presumptive Meanings. Cambridge, MA: The MIT Press.
Meyer, Charles F. and Gerald Nelson
2006 Data collection. In: Bas Aarts and April McMahon (eds.), The Handbook of
English Linguistics, 93–114. Oxford: Blackwell.
Nature
2015 Editorial: Let’s think about cognitive bias. Nature 526 (7572): 163.
Popper, Karl
1945 The Open Society and Its Enemies. London: Routledge.
Popper, Karl
1973 Objective Knowledge. Oxford: Oxford University Press.
Recanati, François
2010 Truth-conditional Pragmatics. Oxford: Oxford University Press.
Senft, Gunter
2008 The case: The Trobriand Islanders vs. H.P. Grice – Kilivila and the Gricean
maxims of quality and manner. Anthropos 103: 139–147.
Sperber, Dan and Deirdre Wilson
1986 Relevance: Communication and Cognition. Oxford: Blackwell.
Sperber, Dan and Deirdre Wilson
1995 Relevance: Communication and Cognition. Second edition. Oxford: Black-
well.
Talmy, Leonard
2007a Foreword. In: Monica Gonzalez-Marquez, Irene Mittelberg, Seana Coulson and
Michael J. Spivey (eds.), Methods in Cognitive Linguistics, xi–xxi. Amster-
dam: John Benjamins.
Talmy, Leonard
2007b Handout of Introspection as a methodology in linguistics, 10th International
Conference on Cognitive Linguistics.
Traugott, Elizabeth Closs
2004 Historical pragmatics. In: Laurence R. Horn and Gregory Ward (eds.), The
Handbook of Pragmatics, 538–561. Oxford: Blackwell.
Wilson, Deirdre
2017 Relevance theory. In: Yan Huang (ed.), The Oxford Handbook of Pragmatics,
79–100. Oxford: Oxford University Press.
7. Cognitive pragmatics: Relevance-theoretic methodology
Billy Clark

Abstract: Early work in relevance theory followed Grice’s approach in being
based mainly on evidence from introspection. Ideas were developed and tested
mainly by reference to the intuitions of researchers about examples, often invented
for the purposes of the investigation through thought experiments, logical argu-
ment and conceptual analysis. Sometimes, choices between competing ideas
were made based on theoretical simplicity. In the 1990s, there was a significant
increase in work based on data from experiments, leading to the development of
what is now referred to as the field of “experimental pragmatics”. Experimental
work since then has included questionnaire-based work (which often focuses on
the intuitions of participants), data from reading and response times, and, more
recently, evidence from electroencephalography (EEG), functional magnetic
resonance imaging (fMRI) and the use of eye-tracking technology. Other ways
of testing and developing ideas have included the use of data from corpora and
other observational work, and applications of the theory in clinical work, devel-
opmental pragmatics, language acquisition, first and second language learn-
ing and teaching, and stylistics. Applications vary in the extent to which they
restrict their focus to understanding phenomena in the light of the ideas being
applied or aim also to test theoretical ideas. While current research uses a wider
range of techniques, introspection and experimentation are still the most used
methods.

1. Introduction

Introspective and experimental methods are by far the ones most commonly used
by researchers aiming to test, develop or apply ideas from the perspective of rele-
vance theory (this is also true of work on neo-Gricean and post-Gricean pragmatics
in general). Introspective methods were the main ones used in early stages of the
development of the theory. During the 1990s, there was increased interest in using
other methods and there has been a sharp increase in experimental work since the
2000s. Other methods have been used, and recent work is beginning to reflect a
more general trend in linguistics to adopt a range of methods rather than to assume
a close connection between particular theoretical approaches or phenomena and
particular methods. Despite this, introspection and experiment are still the most
commonly used methods in relevance-theoretic work.

https://doi.org/10.1515/9783110424928-007
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 185–215. Berlin/
Boston: De Gruyter Mouton.

One way of telling the story of relevance theory would be to divide it into
three phases. In the first phase, Sperber and Wilson and other researchers were
engaged in demonstrating that pragmatic theories were possible at all. There was
an assumption that the domain of pragmatic inference was so wide that it was not
amenable to systematic study. Perhaps most significantly, this view was held by
Jerry Fodor, whose ideas about the modularity of mind (Fodor 1983) were adopted
in early relevance-theoretic work. In this phase, intuitions (often the researchers’
own intuitions) were the main source of data. In the second phase, which began
in the 1990s, reservations about the reliance on this kind of introspection led to
the development of experimental work and to the now large and growing field
of experimental pragmatics. In what might soon be seen as a third phase, rele-
vance-theoretic work is beginning to involve a wider range of methods but the
majority of research is still based on introspection and experiment.1
This chapter adopts the structure suggested by this way of telling the story
of relevance theory, even though it simplifies things to some extent. The next
section discusses some of the earlier developments based on introspection. This
is followed by a section discussing the rise of experimental work, and then by a
section considering other methods, including corpus-based work, observation, and
applications in a range of areas, including clinical work, developmental pragmat-
ics, language acquisition, first and second language learning and teaching, and
stylistics.2 Discussion of different methods is briefer here, reflecting the fact that
these methods have been used less often in relevance-theoretic work so far. While
the main focus of applications of the theory is often on developing accounts
of particular phenomena, they can also provide evidence to support or disconfirm
theoretical ideas. The concluding section comments on the usefulness of different
kinds of methods and speculates on possible future directions. One conclusion is
that adopting a wider range of methods has helped to develop the theory and to
develop understanding of particular phenomena. There is still much to explore and
we now have a good range of methods to use in doing so.

1
The term “introspection” can be used in more than one way. Here, I use it to refer to the
use of researchers’ own intuitions, perhaps alongside those of other researchers, rather
than to the use of experimental methods such as thinking-aloud and protocol analysis
with groups of non-expert language users. These approaches were significantly influ-
enced by the work of Ericsson and Simon (1980, 1984). For discussion of their use in
second language research, see Færch and Kasper (1987).
2
In considering work in each of the areas discussed here, there is space only to mention
a few illustrative studies. A comprehensive listing, organised under thematic headings,
is available online at the Relevance Theory Online Bibliographic Service (Yus 2017).
Cognitive pragmatics: Relevance-theoretic methodology 187

2. Introspection

Early work in relevance theory, like other work in pragmatics which built on
Grice’s ideas, also followed Grice in using largely introspective methods. Grice
(1975, 1989) focused explicitly on intuitions in developing his ideas, including the
two aspects of his thinking which have arguably been seen as most significant and
influential: the distinction between saying and implicating, and the suggestion that
utterance interpretation is guided by ultimately rational pragmatic principles which
play a key role in explaining pragmatic phenomena.
In developing relevance theory, Wilson and Sperber critically discussed the
details of Grice’s work, refining assumptions about the distinction between saying
and implicating and about the nature of pragmatic principles. The key insight they
retained was that pragmatic principles lie at the heart of communication:
The value of Grice’s work derives not so much from the detail of his analyses as from
the general claim that underlies them. Grice has shown that given an adequate set of
pragmatic principles – to which his conversational maxims are a first approximation –
a wide range of what at first sight seem to be arbitrary semantic facts can be seen as
consequences of quite general pragmatic constraints. (Wilson and Sperber 1981: 155)

Arguably, the use of introspective methods based on researcher intuitions is natural
given what Grice was aiming to do and given the nature of Wilson and Sperber’s
critical discussion. A key motivation for Grice was to show that we do not need to
assume that natural language expressions such as and, or, if … then … are ambig-
uous in order to account for the range of ways in which they can be understood in
context. This is essentially a logical argument which can then be further developed
and supported by investigations of how communicators use these expressions and
how they understand utterances containing them.
In Logic and Conversation, Grice (1975) showed that different interpretations
of such expressions could be explained by assuming a univocal semantics with
pragmatic principles accounting for different interpretations. Different ways of
understanding the utterances with and in (1) can be seen as following from differ-
ent pragmatic processes rather than from different encoded senses of and:
(1) a. Edinburgh is in Scotland and Newcastle is in England.
b. He got on his bike and cycled home.
c. He pressed the light switch and the bulb shattered.

Grice showed that we can explain the temporal interpretation of (1b) and the
causal interpretation of (1c) (which also includes a temporal interpretation) with-
out assuming that and is ambiguous. Instead, if we assume pragmatic principles
(“maxims” for Grice), we can assume that hearers infer temporality and causality
from an underlying “logical” sense in which and has the same meaning as the
logical symbol &. A key thing to notice here is that Grice’s point can be understood
without the need for support other than what can be inferred from logical
introspection.
In Grice’s work, evidence for the maxims themselves and for the different
interpretations comes from intuitions. It is intuitively plausible, for example, that
the response in example (2a) seems odd because it is overinformative and that the
response in (2b) seems odd because it is irrelevant:
(2) a. A: How do you get to the town hall from here?
B: First, lift one foot and place it in front of you. Next, lift the other foot
b. A: How are you getting on with your conference paper?
B: I hear there’s going to be a heat wave next month.

A Gricean account would assume that recognition of the surface overinformativeness
of B’s utterance in (2a) and the underinformativeness (or irrelevance) of the
response in (2b) leads to the inference of implicatures which are informative and
relevant (rudely implicating that A is not very bright in the former case and impli-
cating unwillingness to discuss the conference paper in the latter).
Evidence for contrasting interpretations of the conjunctions in (1) can be found
by swapping the conjuncts as in (3):
(3) a. Newcastle is in England and Edinburgh is in Scotland.
b. He cycled home and got on his bike.
c. The bulb shattered and he pressed the light switch.

Changing the order of conjuncts in (3a) does not seem to make much difference but
the ordering in (3b) and (3c) suggests markedly different interpretations.
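The logical half of this argument can be checked mechanically. The following is a minimal sketch in Python (the function name conj is chosen here purely for illustration). It confirms that truth-functional conjunction is indifferent to the order of its conjuncts. On the univocal analysis, then, any difference between the utterances in (1) and their reordered counterparts in (3) has to come from pragmatic inference rather than from the encoded meaning of and.

```python
from itertools import product


def conj(p: bool, q: bool) -> bool:
    """Truth-functional conjunction, i.e. the logical '&'."""
    return p and q


# Compare 'P & Q' with 'Q & P' for every assignment of truth values.
symmetric = all(
    conj(p, q) == conj(q, p)
    for p, q in product([True, False], repeat=2)
)
print(symmetric)
```

The check says nothing about temporal or causal readings, which is the point: these are not part of what the sketch treats as the meaning of and.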
The plausibility of the account in terms of Grice’s maxims is both logical and
based on intuitions. If we accept the existence of pragmatic principles such as the
maxims, we can construct a logical explanation for the interpretations. For Grice,
the principles and the explanation are rational and the specific interpretations arise
because of implicatures generated by the utterances. Given general assumptions
about the world (about people getting on their bikes and heading home, about
bulbs shattering, etc.), it is rational to assume that getting on the bike will have
preceded cycling home and that the bulb shattered as a result of electric current
passing through it after the switch was pressed. It would therefore be irrationally
underinformative to utter (1b) and (1c) if the speaker did not intend the temporal
and causal interpretations.
Grice appealed to one more key idea here, which can be understood as a guiding
methodological principle, and which he termed “Modified Occam’s Razor”:3
“Senses are not to be multiplied beyond necessity” (Grice 1978: 118–119; 1989:
47). Grice is hesitant in proposing this and hedges when discussing what it might
do. He says:
Like many regulative principles, it would be a near platitude, and all would depend on
what was counted as “necessity”. Still, like other regulative principles, it may guide
(Grice 1978: 118–119; 1989: 47).

3
For discussion of this idea, see Bontly 2005; Phillips 2012.

The role it plays here is that it suggests that we should not propose ambiguity for
expressions such as and when an alternative, pragmatic explanation is available.
Early work in relevance theory, like other work which built on Grice’s ideas,
was similarly based on intuitions about possible interpretations, logical reasoning
about how they might come about, and assumptions about theoretical simplicity.
In their early critique of Grice’s approach, Wilson and Sperber (1981) endorse
the notion of pragmatic principles and of a distinction between “saying” and
“implicating” while arguing for different kinds of principles and for a different
way of understanding “what is said” (soon to be replaced with the technical term
“explicature”). A combination of intuitions, logical argumentation and appeals
to theoretical simplicity was used to develop these and other key ideas of the
theory.
Amongst other things, these introspective methods support the technical defi-
nition of relevance in terms of cognitive effects and effort, the distinction between
explicature and implicature, the notion that implicatures can be stronger or weaker,
accounts of figurative language, semantic analyses of particular expressions, the
development of a distinction between conceptual and procedural meanings, and
accounts of the interpretations of particular utterances, some of which demon-
strate how the relevance-theoretic account of interpretation works.4 Here are some
examples of each.
Intuitions are used in accounts of particular interpretations, including ones
which are taken to demonstrate how considerations of relevance guide interpreta-
tions.
(4) a. It’s raining.
b. It’s raining now.
(5) There’s a cat outside.

Intuitions are used as evidence that (4b) communicates more than (4a). The presence
of the word now helps the hearer to assign a time reference but this is unlikely
to be different to that which would follow from (4a). The small amount of extra
effort involved in processing now gives rise to further effects than would have followed
from (4a), suggesting a contrast between the rain which is happening at the
time of utterance and either an earlier weather state or some other effects depending
on the context. This is taken to support the view that extra effort creates an
expectation of extra effects (more than just inferring when the raining is assumed
to be happening, which could have been inferred without the presence of now).
This kind of reasoning has been applied to a wide range of utterances.

4
The classic source for the theory is Sperber and Wilson (1986). Brief overviews include
Carston and Powell (2006), Clark (2011), Sperber and Wilson (2005), Wilson and Sperber
(2004), Yus (2006, 2010). Clark (2013) offers a comprehensive introduction.

Example (5) is used by Sperber and Wilson to demonstrate that the commu-
nicative principle of relevance limits what hearers will take an utterance to com-
municate. Intuitions provide evidence that a hearer in an environment such as a
city in England around the time when I am writing this chapter is most likely to
assume that the cat outside is a domestic cat (rather than a wild cat such as a tiger
or leopard). Logical argumentation is used alongside the intuition to argue that the
more surprising interpretation is ruled out by the communicative principle of rele-
vance. It would require unnecessary effort to expect a hearer to construct and enter-
tain a plausible interpretation and then go further to think of something new. The
existence of an easily reached plausible interpretation means that this must be the
one the speaker intended. The explanation in more recent work (e. g. Wilson and
Sperber 2004; Sperber and Wilson 2005) involves reference to a “relevance-guided
comprehension heuristic”:
(6) Relevance-Guided Comprehension Heuristic:
a. Follow a path of least effort in deriving cognitive effects: test interpretations (e. g.
disambiguations, reference resolutions, implicatures, etc.) in order of accessibility.
b. Stop when your expectations of relevance are satisfied.

This removes the suggestion of interpreters taking time to rationally work out
interpretations which earlier versions of the theory had inherited from Grice. The
key thing to notice at this stage is that these theoretical claims are based on a com-
bination of intuitions and logical argument. Many of the developments in relevance
theory have also been based on these.
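The procedure in (6) can be read as a simple search with an early stop. The sketch below is a toy illustration in Python, not a claim about how the theory is or should be implemented. The function name comprehend, the numeric effort scores and the acceptance test are all assumptions introduced here, and relevance theory does not itself assign numbers to effort or accessibility.

```python
from typing import Callable, Iterable, Optional


def comprehend(
    candidates: Iterable[str],
    effort: Callable[[str], float],
    satisfies_expectations: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate, in order of increasing effort, that is relevant enough."""
    for interpretation in sorted(candidates, key=effort):  # (6a): path of least effort
        if satisfies_expectations(interpretation):         # (6b): stop at the first success
            return interpretation
    return None  # no candidate met the hearer's expectations of relevance


# Hypothetical effort scores for "There's a cat outside" heard in an English city.
EFFORT = {"domestic cat": 1.0, "tiger": 5.0}

# Both readings are treated as acceptable here; the more accessible one still wins.
chosen = comprehend(EFFORT, lambda i: EFFORT[i], lambda i: True)
print(chosen)
```

Because the loop returns as soon as one interpretation is accepted, the less accessible reading is never tested. This mirrors the argument about example (5).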
Much work in relevance theory has also followed Grice in adopting assump-
tions similar to his “Modified Occam’s Razor”. In fact, this guiding principle can
be seen as a specific case of a more general one which is not only about assuming
senses. The more general aim is for theories to be as simple as possible. While it
is, of course, possible that phenomena, including human minds, are not as simple
as they could be, it is often assumed to be a good idea in developing theories to
keep things as simple as possible. The clearest cases where this occurs are in rele-
vance-theoretic work on aspects of linguistic semantics. Two examples are Wilson
and Sperber’s (1988) work on the semantics of declarative and non-declarative
sentences and Groefsema’s (1995) work on modal verbs. In both cases, very gen-
eral semantic analyses are assumed and relevance-theoretic pragmatic principles
interact with contextual assumptions to lead to a much wider range of interpreta-
tions in particular contexts.

A large amount of relevance-theoretic work is still based on introspection.
However, there is now a tendency for this to work alongside experimental, and
sometimes other kinds of empirical, work. One example of this is work on meta-
phor which has been developed and tested using introspection (see, for example,
Carston 2002, 2010a, 2010b) and also through experimental investigations (see, for
example, Rubio-Fernández, Wearing and Carston 2015; Rubio-Fernández, Cum-
mins and Tian 2016) and corpus-based work (Kolaiti and Wilson 2014).

3. Experiments

Wilson and Sperber (1993) characterise their work in the early stages of relevance
theory as having been focused on addressing what they term “Fodor’s challenge”:
[…] we had to address Fodor’s challenge that while decoding processes are quite well
understood, inferential processes are not only not understood, but perhaps not even
understandable (Wilson and Sperber 1993: 1).

Since pragmatics involves inferential processes, developing an account of pragmatics
would seem to require doing what Fodor assumed was not possible, i. e.
to explain something neither understood nor understandable. So the challenge
referred to here could be reframed as being about trying to show that it is possible
to develop pragmatic theories, given the wide range of things involved in prag-
matic processes.5

5
Pragmatic processes are not, of course, the only kinds of inferential processes, so the
development of relevance theory was not seen as addressing Fodor’s problem comprehensively.
In fact, one way of thinking about the claimed early success of relevance
theory was to say that the nature of the inferential processes involved in pragmatic
interpretation, and in particular the ways in which they are constrained, make them
amenable to explanation in ways in which other processes might not be. In early work,
it was assumed that pragmatic processes were a special sub-variety of central processes
with their own constraints. More recently, Sperber and Wilson (2002) have modified
their assumptions so that pragmatic processes are now seen as modular. This is consistent
with the assumptions of “massive modularity” which Sperber and others have
argued for (e. g. Carruthers 2006; Cosmides and Tooby 1992; Sperber 1994, 2001).
Work on massive modularity has been partly explored in experimental work but some
of the work, including early discussion, was speculative.

Arguably, relevance theorists have done a good job of showing that a pragmatic
theory is at least possible. While these developments were taking place, relevance
theorists (and other pragmaticists) were also concerned about the perceived limitations
of reliance on introspective data. This discussion tended to focus on general
questions about the reliability of intuitions. Noveck and Sperber (2007: 185–186)
point out that there are particular issues with pragmatic intuitions and that it is
important to notice how they are different from semantic intuitions. Semantic intui-
tions are intuitions about the meanings of linguistic expressions. To take one exam-
ple (discussed by Noveck and Sperber) an intuition about an entailment relation
(e. g. that John knows it is raining entails it is raining) is, they say, a “semantic
fact”. Pragmatic intuitions are different since they are about “hypothetical cases
involving imaginary or generic interlocutors”. They say:

Pragmatic intuitions on hypothetical utterances have proved useful in a variety of ways,
but it is important to keep in mind that these are not about how an utterance is interpreted,
but about how an utterance would be interpreted if it were produced in a specific
situation by a speaker addressing a listener, with referring expressions having actual
referents, and so on. These intuitions are educated guesses – and, no doubt, generally
good ones – about hypothetical pragmatic facts, but are not themselves pragmatic facts
and they may well be in error. That is, we may be wrong about how, in fact, we would
interpret a given utterance in a given context (Noveck and Sperber 2007: 186).

Since the 1990s, there has been a very significant increase in the use of experi-
mental methods to explore predictions of pragmatics, including ideas developed
within relevance theory. This range of work is often referred to as “experimental
pragmatics”. A history of its development might start during the 1990s when there
was informal discussion on email lists and at conferences of how to test ideas
from pragmatics with a wider range of empirical methods. Other significant steps
include publications of experimental results including by Happé (1993, 1995),
Sperber, Cara and Girotto (1995), Gibbs and Moise (1997), Nicolle and Clark
(1999) and Noveck (2001), each of which is mentioned below, a workshop at the
Linguistics Association of Great Britain in 1998, a European Science Foundation
workshop at Lyon in 2001, an influential collection (Noveck and Sperber 2004)
which resulted from the Lyon workshop, the European Science Foundation EURO-
XPRAG research funding programme which ran from 2009 to 2014, and the Ger-
man Research Foundation priority programme XPRAG.de which was established
in 2014 and is still awarding funding.6

6
For discussion of developments in experimental pragmatics, see Breheny 2011; Katsos
and Cummins 2010; Noveck and Reboul 2008.

The field of experimental pragmatics clearly has roots in a wider range of
experimental work which focused on questions relevant to pragmatics. Since the
1960s, psycholinguists have carried out a large number of experimental studies
on language and communication. As Sperber and Noveck (2004) point out, there
was little interaction between researchers in psycholinguistics and researchers in
pragmatics and little work which focused directly and explicitly on predictions of
specific pragmatic theories. Pioneers in experimental pragmatics who were active
before the new field developed include Herb Clark and Ray Gibbs who have carried
out a large number of experiments on aspects of pragmatics (as just a very
small sample, see Clark and Lucy 1975; Clark 1979; Gibbs 1979, 1981, 1983,
1986; see also Gibbs, this volume). A key consequence of the development of
the new field is that pragmaticists now regularly consider experimental evidence
alongside evidence from introspection.
An early experimental study focusing directly on an idea developed within
relevance theory was Jorgensen, Miller and Sperber (1984). This study tested pre-
dictions of the mention theory of irony developed by Sperber and Wilson, along-
side those of the traditional account which sees irony as involving the expression
of the opposite of what is intended. Their results were consistent with the mention
theory.7
Another early set of experimental studies was carried out by Francesca Happé
(1993, 1995) who set out to test predictions of relevance theory with regard to the
relationship between “theory of mind” abilities8 and autism spectrum disorder
(ASD). Her results were consistent with the assumption that individuals with ASD
are atypical with regard to “theory of mind” abilities. Her studies investigated
correlations between ability to pass different levels of false belief tasks (no ability,
ability to pass first-order tasks only, and ability to pass second-order tasks) and
the ability to understand simile, metaphor and irony (which, on relevance-theo-
retic accounts, differ with regard to how much theory of mind ability is required).
First-order false belief tasks require the ability to represent another individual’s
thoughts (e. g. thinking that somebody else believes that a box contains sweets).
Second-order false belief tasks require the ability to represent another individu-
al’s thoughts about somebody else’s thoughts (e. g. thinking that somebody else
believes that another person believes that a box contains sweets).
Another important experimental study was Sperber, Cara and Girotto’s (1995)
work which investigated performance on versions of Peter Wason’s selection task
(Wason 1966). This work directly tested central claims of the theory, suggesting
that performance on versions of the task could be explained by reanalysing what
the task involves and with reference to predictions of relevance theory. A standard
version of Wason’s selection task is presented in Figure 1.

7
The mention theory, in later versions usually termed an “echoic” account, is an account
which was developed within relevance theory but which does not depend on central
notions of the theory. It shares this with other work developed within this approach,
including Wilson and Sperber’s account of metaphor (for discussion of both, see Wilson
and Sperber 2012).
8
“Theory of mind”, often abbreviated to ToM, is a slightly problematic term used to refer
to abilities to attribute mental states to ourselves and others, to recognise that others
might have different mental states from our own, and to explain and predict other peo-
ple’s actions based on these.

Here are four cards. Each has a letter on one side and a number on the other side. Two
of these cards are with the letter side up, and two with the number side up:
[Cards shown: A, G, 7, 8]

Indicate which of these cards you need to turn over in order to judge whether the fol-
lowing rule is true:
if there is an A on one side, there is a 7 on the other side

Figure 1. A standard version of the Wason selection task

In repeated experiments involving this version of the task, most participants
(around 90 %) choose the A and 7 cards. The “correct” response is “A and 8” since
a card with an A on one side and something other than 7 (e. g. an 8) on the other
would show that the rule is false. A card with G on one side can have any figure
on the other side and still be consistent with the rule. A card with 7 on one side can
have any letter on the other side. Explanations of the logic of the task usually point
out that the rule is in the form “if P then Q”. The correct response is to look for
cases of “P and not Q”. In choosing A and 7, participants are choosing to turn over
a card with P on one side and a card with Q on one side (rather than the “correct”
choice of a “not Q” card).9
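The logic of the task can be made explicit by asking, for each visible face, whether any possible hidden face would produce a “P and not Q” card. The following Python sketch does this by brute force. As a simplifying assumption made here for illustration, hidden faces are restricted to the letters A and G and the numbers 7 and 8, and the function names are hypothetical.

```python
LETTERS = "AG"   # assumed set of possible letter faces
NUMBERS = "78"   # assumed set of possible number faces


def violates(letter: str, number: str) -> bool:
    """A card falsifies 'if A then 7' only if it pairs A with a non-7 (P and not Q)."""
    return letter == "A" and number != "7"


def can_falsify(visible: str) -> bool:
    """True if some hidden face would make this card a counterexample to the rule."""
    if visible in LETTERS:
        return any(violates(visible, n) for n in NUMBERS)
    return any(violates(l, visible) for l in LETTERS)


cards = ["A", "G", "7", "8"]
to_turn = [card for card in cards if can_falsify(card)]
print(to_turn)
```

Only the A and 8 cards can reveal a counterexample. Nothing on the back of the 7 card can falsify the rule, which is why selecting it counts as an error on the standard analysis.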
A vast number of experiments have investigated the selection task. While most
participants “fail” on standard versions of the task, performance is much better on
some versions, e. g. where the consequent is in a negative form (Evans 1972)
or in some deontic versions of the task (Johnson-Laird, Legrenzi and Legrenzi
1972). Sperber, Cara and Girotto (1995) suggested that performance on the task
could be explained by assuming that participants in the task infer testable conse-
quences of the rule and then look for ways of testing these by turning over cards,
and that they infer these consequences in line with predictions of relevance theory,
i. e. they infer them in order of accessibility and stop when their interpretation of
the rule meets their expectations of relevance. Based on these assumptions, they
came up with a “recipe” for creating easy versions of the task:

9
In fact, Sperber, Cara and Girotto (1995) point out that the rule stated in the task is a
general and not a particular conditional statement. This does not affect the discussion
here.

(7) Recipe for creating an easy selection task:

a. make it easier to represent “P-and-not-Q” than “P-and-Q”
b. create a context where more follows from knowing that there are “P-and-not-Q”
cases than from knowing that there are “P-and-Q” cases
c. present the “if-P-then-Q” rule in a pragmatically felicitous way
(adapted from Sperber, Cara and Girotto 1995: 60)

The first line in the recipe, (7a), is about effort. Following this will make “P-and-
not-Q” (which leads to correct selections) relatively easy to represent, which is in
line with the relevance-theoretic assumption that the more effort involved in some-
thing the less relevant it is. The second line, (7b), is about effects. Following this
will make “P-and-not-Q” have more effects than “P-and-Q”, which is in line with
the relevance-theoretic assumption that the more effects something has the more
relevant it is. (7c) is about the framing of the rule. It is there to discourage artificial
formulations of the rule which are likely to focus attention on the task’s experimen-
tal status and so encourage lines of reasoning about what the experimenter might
be hoping for participants to do.
Sperber, Cara and Girotto (1995) then created four selection tasks. One of them
followed all three lines of the recipe. One of the others departed from (7a), one
departed from (7b), and one departed from both (7a) and (7b). All of them followed
(7c). This meant that there was one version of the task where relevance-theoretic
considerations predicted good performance, two where performance should be less
good (one where “correct” reasoning involves extra effort and one where it leads to
fewer effects), and one where performance is predicted to be poor (the one where
“correct” reasoning involves more effort and fewer effects).
The results confirmed the relevance-theoretic predictions. Participants per-
formed best where all three lines of the recipe were followed, worse in the con-
ditions which departed from one of (7a) or (7b), and worst of all in the condition
which departed from both (7a) and (7b). This was a very influential paper, leading
to considerable discussion on reasoning and inference and also showing that exper-
imental work could test central claims of relevance theory.10

10
This was not the last word on the selection task, of course. A number of subsequent
papers have built on and debated Sperber et al.’s findings. See, for example, Fiddick,
Cosmides and Tooby 2000; Sperber and Girotto 2002.

Gibbs and Moise (1997) carried out questionnaire-based work which, they
claimed, showed that speaker intuitions about “what is said” by an utterance
are consistent with the relevance-theoretic view that explicit content consists of
enriched explicatures rather than the more minimal propositions which seem to
be assumed by Grice. Nicolle and Clark (1999) reviewed this work and carried
out new experiments. They argued that, while Gibbs and Moise had shown that
individuals have intuitions which can be manipulated experimentally, their experiments
did not provide direct evidence about intuitions about explicit content. They
argued that participants in the experiment were not accessing intuitions about what
is said but instead aiming to choose paraphrases which were likely to give rise to a
similar range of effects to those conveyed by the original utterance. This work was
built on by Ariel (2002, 2016) who developed a notion of “privileged interactional
interpretations” which she suggests constitute “what the speaker is taken to be
truthfully or sincerely committed to” and also what is taken to be “the speaker’s
relevant contribution to the discourse”. She suggests that privileged interactional
interpretations are often but not always explicatures in relevance-theoretic terms
(i. e. developments of a linguistically encoded logical form).
Another early study by Noveck (2001) investigated the phenomenon of “scalar
implicature” (Horn 1984, 1988, 1989) and compared the responses of children with
those of adults. In Horn’s view, scalar implicatures are conclusions associated with
particular linguistic forms which tend to be made in the absence of linguistic or
contextual indications which rule them out. Examples (my own) include those in
(8) and (9):
(8) Utterance:
a. Some of the loaves are stale.
Scalar implicature:
b. Not all of the loaves are stale.
(9) Utterance:
a. It’s possible that Andy will come to the party.
Scalar implicature:
b. It’s not certain that Andy will come to the party.

Hearers are likely to infer (8b) and (9b) in most cases when they hear utterances of
(8a) or (9a) respectively. The conclusions are not guaranteed and will not follow if
there are contextual reasons not to draw them or if they are ruled out by accompanying
linguistic material, e. g. by adding “in fact they all are” after (8a) or “in fact, it’s
certain that he will” after (9a). Horn suggested that these conclusions arise because
of logical scales where stronger items entail weaker ones (all entails some, must
entails might and so on). While an utterance containing a stronger item on a scale
entails propositions which would follow from utterances containing a weaker term
(all of the loaves are stale entails that some of them are), an utterance containing a
weaker term usually implicates the negation of a stronger proposition which would
have followed from a corresponding utterance containing the stronger term (as in
the examples above).
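The scale-based reasoning just described can be sketched as a lookup over ordered scales. The Python fragment below is illustrative only. The three scales are taken from the examples in the text, while the function names and the cancelled flag (standing in for any contextual or linguistic material which blocks the inference) are assumptions introduced here. The sketch is neutral on whether such inferences are defaults or are derived case by case in context.

```python
from typing import List

# Each scale is ordered from weaker to stronger term.
SCALES = [["some", "all"], ["might", "must"], ["possible", "certain"]]


def stronger_alternatives(term: str) -> List[str]:
    """Return the terms that outrank `term` on its scale (empty if none)."""
    for scale in SCALES:
        if term in scale:
            return scale[scale.index(term) + 1:]
    return []


def scalar_implicatures(term: str, cancelled: bool = False) -> List[str]:
    """A weaker term implicates the negation of each stronger alternative, unless blocked."""
    if cancelled:
        return []
    return [f"not {alt}" for alt in stronger_alternatives(term)]


print(scalar_implicatures("some"))
print(scalar_implicatures("some", cancelled=True))
```

A term at the top of its scale, such as all, has no stronger alternative and so gives rise to no implicature of this kind.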
Following Horn, many theorists (including Levinson 1987, 2000) assume that
scalar implicatures arise as “default” inferences, i. e. that they are automatically
computed and only rejected as a second step if there are reasons for them not
to follow. This approach is often taken to be an elaboration of Grice’s notion of
“generalised conversational implicature”. Relevance theorists, by contrast (see,
for example, discussion by Carston 1998, 2002; Noveck and Sperber 2007), do
not assume the existence of generalised conversational implicatures or of default
inferences like these. Rather, they assume that all implicatures depend on the inter-
action of particular contextual assumptions with linguistic meanings and pragmatic
processes.
Noveck’s paper (Noveck 2001) reports three experiments. Two of the experi-
ments focused on constructions with might and must. Participants were children,
aged 5, 7 and 9, and adults. These showed that younger children were more likely
than older children and adults to evaluate positively utterances containing the form
might be x where must be x was clearly true, and that adults could be “trained” to
accept these more often. The third experiment focused on the French expressions
certains (similar to English some) and tous (similar to English all). Here “linguis-
tically sophisticated” children (aged 8 and 10) were found typically to treat utter-
ances containing certains to be compatible with situations where tous would have
been appropriate while adults were equivocal.
This work shed light on the development of particular kinds of pragmatic pro-
cessing and inspired further experimental work focusing on scalar implicature.
In fact, scalar implicature is arguably the pragmatic phenomenon which has been
most often studied using experimental approaches. As Grossman and Noveck
(2015: 147–148) point out, the evidence (developmental evidence and evidence
from comparisons among adult participants) has tended to be against the view that
there are processing defaults here. Nevertheless, this topic is still being debated
and studied experimentally (for recent discussion, see Chierchia 2017, Skordos and
Papafragou 2016, van Tiel et al. 2016).
Since then, a much wider range of experimental work has been developed,
involving more varied techniques, including eye-tracking and the measurement of
event-related potentials (ERPs) using electroencephalography (EEG) (for an
overview discussion mentioning examples of these, see Sauerland and Schumacher
2016).
We have moved from a situation in the 1990s when there was scepticism about
whether predictions of pragmatic theories could be tested experimentally to one
where experimental techniques are a standard way of investigating theoretical
claims.

4. Other methods

Over the years, researchers on language and communication in general have used
an increasingly wide range of methods. Researchers have also become more eclectic
with regard to methods, not tying themselves in advance to one method, and
sometimes using more than one method to investigate a particular question.
This tendency has also been evident in work in pragmatics. At the same time, there
198 Billy Clark

are arguably tendencies among researchers to value particular kinds of methods
over others. If there is a general bias in this area, then it is towards what are perceived
as more solidly empirical methods, with experimental methods sometimes
seeming to be more valued than others. I would propose three reasons to resist this
assumption. First, any method is only as useful as its results, and it is always necessary
to take each research project on its merits, critically assessing its
findings and the nature of the evidence which supports them. Second, experimental
methods have their limitations just as all methods do; most obviously, experimen-
tal contexts are not natural contexts and the behaviour of participants should be
understood as different from what their behaviour would have been in more natural
contexts. Third, some methods which are seen as less empirical, or even as unem-
pirical, can provide evidence to help test aspects of a theory. It is often assumed
that applications of a theory should be assessed purely on the basis of what they
tell us about the phenomena they are being applied to. I would argue, however,
that applications can provide evidence which helps us to test particular theoretical
assumptions.
With these thoughts in mind, this section considers methods which are neither
introspective nor experimental. The methods discussed here are observational methods,
including corpus-based work, and applications of the theory in clinical contexts, in
developmental and pedagogical work, and in stylistics. Each of them is discussed
more briefly than the two kinds of methods discussed above, reflecting the fact that
most current work is still based either on intuitions or on experiments.

4.1. Corpus-based and observational methods

Given its focus on psychological processes and the assumption of generally Chom-
skyan and Fodorian ideas, it is not surprising that corpus-based approaches were
not explored in the early days of relevance theory. In fact, as Rühlemann and
Aijmer (2014: 1) point out, corpus pragmatics is a “relative newcomer” to prag-
matics with interest in using corpus methods increasing in recent years. As cor-
pus methods have become more widely adopted, and as linguists and others have
become more eclectic, it is also not surprising that relevance theorists have begun
to explore corpus methods more fully.
Gisle Andersen (2000, 2001, 2015) was one of the earliest researchers to use
corpus techniques in work based on relevance theory, looking in particular at dis-
course markers, which are usually seen by relevance theorists as encoding proce-
dural meanings. Andersen (2015) points out that corpus methods can allow us to
“observe the stimuli that speakers offer and listeners interpret […] and to apply
various types of qualitative and quantitative analyses to them” (Andersen 2015:
143). Andersen provides a useful reminder of the relationship between corpus data
and what pragmatic theories aim to explain. Like the evidence provided by intro-
spection, corpus data is removed to some extent from the pragmatic processes
involved in production and interpretation (with the latter, of course, having been
the main focus of much work in pragmatics). He goes on to show how evidence
from corpora can complement introspective and experimental work and in fact
suggests that it:
leads to new knowledge about functional properties of individual forms, and […] this
knowledge extends beyond what can be gained from a strictly theoretical or experimen-
tal approach (Andersen 2015: 143).

He points out that “corpora contain tangible evidence of speakers’ choices of overt
stimuli in specific speech situations” (Andersen 2015: 151). His own research has
included work on the English markers like and innit (2000, 2001). His 2015 chapter
considers the emerging markers as if and duh. It is, of course, hard to establish
when these expressions began to be used with the functions focused on by Andersen,
but it is clear that their use with these functions increased towards the end of
the twentieth century and that they began to be borrowed into other languages. Andersen
discusses them cross-linguistically, including exchanges in Norwegian as well as
English, and develops arguments about how they are used based on these. This
includes the claim that their uses are not significantly different in Norwegian and
in English.
One key point to notice is that Andersen’s analyses depend on his intuitions
about the data he discusses, demonstrating that there is an introspective element to
corpus-based work. It is trivially true, of course, that researchers always use their
intuitions in analysing data. Introspection is used in a specific way in corpus-based
work in pragmatics, since researchers are using their own pragmatic processes,
engaging in metapragmatic inference, and considering what individuals intended
by their use of particular expressions and how they were understood.
Kolaiti and Wilson (2014) report corpus-based work on lexical pragmatics,
and metaphor in particular, carried out as part of a research project at University
College London (http://www.phon.ucl.ac.uk/home/lexprag07/corpus.html). They
quote Michael Stubbs (2001: 71), agreeing with him that, in their words:
[…] corpus-based evidence provides a valuable complement to more traditional meth-
ods of investigation, by helping to sharpen intuitions, develop and test hypotheses and
reduce the possibility of intuitive data being mere artefacts of the linguist (Kolaiti and
Wilson 2014: 212).

They mention some specific issues they encountered in using corpus data for this
work. One is that their work took a different view from previous corpus-based
work on metaphor (e. g. Deignan and Potter 2004; Deignan 2005; Pragglejaz Group
2007; Steen 2007; Steen et al. 2010a,b; Hanks 2012) which aimed to find criteria
for distinguishing metaphors from other kinds of language usage. By contrast, their
relevance-theoretic approach assumes a continuum with utterances being more or
less metaphorical rather than definitively metaphorical or not. This meant that they
could not build straightforwardly on previous work on this topic. Another issue
was that they were looking at novel rather than conventionalised uses. Corpus data
are, in general, more useful for looking at recurring patterns. Nevertheless, they
show how corpus data provided significant evidence, including about the perva-
siveness of processes of lexical adjustment in comprehension, and some evidence
which contradicted their own intuitions, illustrated with discussion of the meaning
of the word raw. They claim that corpus evidence suggests the existence of an
encoded sense of ‘not processed’ for this word in English which has emerged from
relatively metaphorical uses.
Kolaiti and Wilson refer to the comments about semantic and pragmatic intui-
tions made by Sperber and Noveck (2007) which were mentioned above and sug-
gest that the intuitions which their work has focused on fall midway between the
kinds of semantic and pragmatic intuitions envisaged by Sperber and Noveck:
On the one hand, these intuitions are about actual utterances, produced in actual sit-
uations. On the other hand, those utterances were not addressed to us, which puts us
in the position of overhearers rather than actual addressees. As a result, the pragmatic
intuitions they give rise to are still to some extent about hypothetical pragmatic facts,
and are open to error or influence by our prior theoretical commitments. This seems to
be an unavoidable feature of the use of corpus data in lexical pragmatics (Kolaiti and
Wilson 2014: 236).
Like Andersen, then, Kolaiti and Wilson see corpora as offering a useful comple-
ment to introspective and other data, also emphasising the importance of thinking
about the nature of the data and how researchers interact with it, including how
they use their own intuitions in analysing it.
There are other researchers who have used corpus data. De Klerk (2005), for
example, uses corpus data to explore uses of well in Xhosa English, arguing that this
data provides support for Blakemore’s (1987, 2002) account in terms of procedural
meaning. Jary (2008) uses corpus data to explore aspects of complement-choice
for the verb believe. His approach assumes a relevance-theoretic framework and
he argues that the theoretical ideas and corpus data interact here to provide insights
which would not have arisen from either corpus data or the theoretical ideas
alone.
Not all observational data come from corpora and there has been some work by
relevance theorists which involves observing utterances not gathered into corpora.
Examples include Watts (1986, 1988), who uses data from recorded conversations
in his discussion of the semantics of well, actually, really and basically, and Jucker
(1993), who uses data elicited in interviews in his analysis of well.

4.2. Applications
Relevance-theoretic ideas have been applied in a range of areas, including clinical
work, work on first and second language acquisition and teaching, and stylistics.
In each case, the balance varies between application, either to understand a phe-
nomenon or to help address a particular issue, and testing or developing the theory.
A number of researchers have applied or explored aspects of the theory in
clinical contexts, including in work on autism spectrum disorder and linguistic
developmental disorders. Happé’s (1993, 1995) work on autism spectrum disorder,
mentioned above, has been followed by a large number of studies and discussion
(including Chevallier et al. 2009; Colombino 2006; Kissine 2012; Loukusa et al.
2007; Norbury 2005; Papp 2006; Reboul et al. 2012; Surian 1996; Wearing 2010).
There is ongoing research on, and debate about, exactly what the key features of
autism spectrum disorder are and about how exactly ideas from pragmatics in general,
and relevance theory in particular, play a role here. One debate is around whether,
as Happé’s early studies suggested, differences in theory of mind ability are the key
factors correlating with the ability to understand metaphorical utterances. Norbury
(2005) suggests that language level is a more important predictor. What is clear
overall, though, is that ideas from pragmatics can help in understanding the nature
of autism spectrum disorder and that evidence from individuals with autism spec-
trum disorder can help in the development of pragmatic theories.
Foster-Cohen and Wong (2017) present and discuss three studies looking at
strategies used by and taught to adults interacting with children with language
delay. They argue that relevance theory helps us to understand these interactions and
claim that the relevance-theoretic framework “works just as well for them as it
does for more familiar, adult to adult, ways of speaking” (Foster-Cohen and Wong
2017: 179). This is another example of applied work simultaneously helping to
explore aspects of the theory which is being applied.
Relevance theory has also been applied in looking at developmental pragmat-
ics, first language acquisition, and first and second language learning and teaching.
Most of this work has applied ideas from relevance theory in order to understand
the phenomena but some studies have aimed to develop interventions which can
help with the processes of acquisition and learning.
Most work on developmental pragmatics and language acquisition has applied
ideas from relevance theory and other approaches to account for the pragmatic
abilities of individuals at various stages of development, sometimes comparing
this with the behaviour of atypical individuals. An early paper by Bara, Bosco and
Bucciarelli (1999) explored the extent to which then current pragmatic theories
could account for pragmatic abilities of children with and without brain damage.
A study by Bezuidenhout and Sroda (1998) presented evidence suggesting that
children perform as well on some pragmatic inferential tasks as adults. Since then,
a large number of studies have explored the abilities of infants and children at
various stages. Some researchers have seen the abilities of infants as a problem for
pragmatic theories which assume that pragmatic processing requires very sophis-
ticated inferential ability (see, for example, Breheny 2006; Pfister 2010). More
recently, work with infants and children has been seen as providing evidence to
guide pragmatic theorists in developing ideas about what is involved in pragmatic
inference. Mascaro and Sperber (2016) review research on pragmatics in infancy
and argue for further experimental work in this area.
There has also been some work on the acquisition of specific aspects of language.
Wharton (2014) applies ideas from relevance theory in considering the processes
of lexical acquisition, arguing both that they involve pragmatic mind-reading and
that the processes share many properties with adult comprehension. Wałaszewska
(2015) applies ideas from Carston’s (1997, 2002) work on broadening and narrow-
ing in considering how children understand word meanings, including discussion
of underextension and overextension of meanings. Gundel (2011) considers the
role of determiners and pronouns (“procedural expressions” in relevance-theoretic
terms) in helping identify the referents of nominal expressions. She argues that
understanding the nature of these expressions helps to explain why children aged
3 and younger can use and understand them appropriately before they can per-
form tasks which require more sophisticated representational/mind-reading ability.
Gundel’s work here, then, simultaneously considers ideas about developmental
pragmatics and language acquisition.
There has been growing interest over the years in the role of ideas from rel-
evance theory in understanding processes involved in second language learning
and teaching. In a review article, Foster-Cohen (2000) argued that ideas from rel-
evance theory could be useful in this area since second language teachers and
learners are engaged in communication guided by pragmatic principles. She edited
an influential journal special issue (Foster-Cohen 2004a) which collected papers
on second language acquisition and argued (Foster-Cohen 2004b) specifically for
relevance theory as a competitor to, and a better approach than, Herbert Clark’s
(1996) Action Theory approach. De Paiva (2007) argued that the technical notions
of relevance and manifestness are particularly useful here.
Jodlowiec (2010) explores ways in which ideas from relevance theory can
be useful in work on second language acquisition. She focuses in particular on
the “emergentist” programme (discussed, for example, by O’Grady 2008 and
MacWhinney 2006), which argues for understanding properties of language with
reference to non-linguistic factors, suggesting that relevance-theoretic ideas can be
particularly useful here. She says that:
[…] the major concern of emergentism, that is ensuring that SLA theory accounts for the
mental representations underlying language processing, can find support from a rele-
vance-theoretic treatment of the metarepresentational abilities that speakers and hearers
use in communication (Jodlowiec 2010: 56).

She also gives a useful brief survey of earlier work (Jodlowiec 2010: 53–55).
Ifantidou (2014) develops a sustained account of the relevance of ideas from
relevance theory in accounting for the development of pragmatic abilities by sec-
ond language learners. She also reports some experimental work on this. In a more
recent paper (Ifantidou 2016), she focuses in particular on the role of “epistemic
vigilance” (Sperber et al. 2010; Mercier and Sperber 2017), i. e. cognitive mechanisms
which help individuals to avoid being misled or misinformed, in accepting
interpretations in second language contexts.
Stylistics is another area where ideas from relevance theory have been applied.
Work in stylistics involves the application of ideas from linguistics and other areas
in accounting for the production, interpretation and evaluation of texts. In practice,
work in stylistics has most often focused on how literary texts are interpreted, but
there is no principled reason for this relatively narrow focus and work in stylistics
has developed a broader focus recently, including work on non-literary texts and on
production and evaluation. As with other applications, this work can also be under-
stood as testing the ideas which are being applied. As one example (not, in fact,
presented as an example of stylistic analysis), Carston’s discussion of examples
such as the metaphor in Carl Sandburg’s Fog (e. g. in Carston 2002, 2010a, 2010b;
Carston and Wearing 2015; this metaphor is also discussed by Sperber and Wilson
2008) played a role in the development of new thinking about how to account for
creative and extended metaphors.
While work in stylistics in general is not confined to accounting for interpre-
tations, and not confined to accounting for literary texts, the majority of work
in stylistics has focused on these two areas, and this has also been true for rel-
evance-theoretic work on stylistics. Key examples include work by Pilkington
(2000) on “poetic effects” (the term usually used within relevance theory to refer
to cases where texts seem to give rise to impressionistic and aesthetic responses),
MacMahon (1996, 2009a, 2009b) on poetic voice and metarepresentation, Force-
ville (1996, 2014) and Yus (2008, 2009) on visual and multimodal communica-
tion,11 Blakemore (1993, 2009, 2013) on reformulations and on free indirect style,
Morini (2010) on irony, and Unger (2006) on genre.12 A special issue edited by
Pilkington (1996) and another edited by Caink and Clark (2012) collect articles
applying ideas from relevance theory in a range of ways.
As well as aiming to account for particular interpretations or texts, some work
focuses on particular interpretive phenomena. Work which does not identify itself
within stylistics is also relevant here, e. g. work on metaphor (e. g. Carston 2002,
2010a, 2010b, 2011; Wilson and Carston 2006, 2007; Vega Moreno 2007) and
irony (e. g. Wilson and Sperber 2007, 2012). Furlong (1996, 2001, 2007, 2011,
2012) and Fabb (1995, 1997, 2002, 2010) have explored the nature of “literariness”,
with Fabb’s work also focusing on what constitutes literary form and on
aesthetic experience more generally. Wilson (2011) explores the application of
relevance theory in accounting for literary interpretation more generally.

11 Wharton (2009) applies relevance theory in considering nonverbal communication but
does not focus on stylistic analysis.
12 For overview discussions of relevance theory and stylistics, see Clark (2014), MacMahon
(2006).

More recently, there has been an increase in work which focuses on production
and on evaluation. Clark and Owtram (2012) discuss ways of applying ideas from
relevance theory in teaching writing. Clark (2012) explores pragmatic processes
involved in the writing and editing of short stories by Raymond Carver. Clark
(2014) considers how texts are evaluated.
Work on stylistics so far has mainly aimed to account for intuitions about texts
with little other empirical investigation. One exception is B. Clark’s (1996) report
of exploratory work with students on the interpretation of a short story. There has
been an increase in empirical work in stylistics more generally in recent years (see,
for example, Macrae 2016; Miall 2006) and so work in relevance-theoretic stylis-
tics is likely to involve a broader range of empirical work in future.
While the aim in applying ideas from relevance theory in the areas just dis-
cussed is often mainly to develop understanding of the phenomena or to find
ways of addressing particular issues associated with them, this work also helps to
develop understanding of the theory and to develop new ideas.

5. Conclusions

Naturally, it is never possible to say in advance what kinds of methods will be
useful in addressing particular research questions. The usefulness of particular
methods is assessed by considering what findings they lead to and the nature of
the evidence they provide. Ideas from relevance theory have been developed and
applied in a wide range of areas and the theory has influenced thinking in a range
of disciplines. These developments have mainly been based on introspective and
on experimental methods (with some overlap, given that much experimental work
aims to gather evidence about the intuitions of participants). Given this, it seems
safe to assume that both kinds of methods will continue to be used in future work.
Other methods have also been used, including corpus methods, observation, and
applications of various types, notably in clinical work, pedagogy, text analysis and
stylistics. While the usefulness of various kinds of applications in developing the-
oretical ideas has not been much remarked upon, some findings have arisen based
on work in each of these. Work in relevance theory so far has explored a wide
range of topics and shown itself to be relevant to a wide range of areas. The theory
continues to be applied to further questions and topics. A good range of methods is
available to explore them with.

6. References

Andersen, Gisle
2000 The role of the pragmatic marker like in utterance interpretation. In: Gisle
Andersen and Thorstein Fretheim (eds.), Pragmatic Markers and Proposi-
tional Attitude, 17–38. Amsterdam: John Benjamins.
Andersen, Gisle
2001 Pragmatic Markers and Sociolinguistic Variation: A Relevance-theoretic
Approach to the Language of Adolescents. Amsterdam: John Benjamins.
Andersen, Gisle
2015 Relevance. In: Karin Aijmer and Christoph Rühlemann (eds.), Corpus Prag-
matics: A Handbook, 143–168. Cambridge: Cambridge University Press.
Ariel, Mira
2002 Privileged interactional interpretations. Journal of Pragmatics 34(8): 1003–
1044.
Ariel, Mira
2016 Revisiting the typology of pragmatic interpretations. Intercultural Pragmatics
13(1): 1–35.
Bara, Bruno G., Francesca M. Bosco and Monica Bucciarelli
1999 Developmental pragmatics in normal and abnormal children. Brain and Lan-
guage 68(3): 507–528.
Bezuidenhout, Anne and Mary Sue Sroda
1998 Children’s use of contextual cues to resolve referential ambiguity: An applica-
tion of relevance theory. Pragmatics and Cognition 6: 265–299.
Blakemore, Diane
1987 Semantic Constraints on Relevance. Oxford: Wiley-Blackwell.
Blakemore, Diane
1993 The relevance of reformulations. Language and Literature 2: 101–120.
Blakemore, Diane
2002 Relevance and Linguistic Meaning: The Semantics and Pragmatics of Dis-
course Markers. Cambridge: Cambridge University Press.
Blakemore, Diane
2009 Parentheticals and point of view in free indirect style. Language and Litera-
ture 18(2): 129–153.
Blakemore, Diane
2013 Voice and expressivity in free indirect thought representations: Imitation and
representation. Mind and Language 28(5): 579–605.
Bontly, Thomas
2005 Modified Occam’s razor: Parsimony, pragmatics and the acquisition of word
meaning. Mind and Language 20(3): 288–312.
Breheny, Richard
2006 Communication and folk psychology. Mind and Language 21(1): 74–107.
Breheny, Richard
2011 Experimentation-based pragmatics. In: Wolfram Bublitz and Neal Norrick
(eds.), Handbook of Pragmatics, Volume 1: Foundations of Pragmatics, 561–
586. Berlin: de Gruyter Mouton.
Caink, Andrew and Billy Clark (eds.)
2012 Inference and implicature in literary interpretation. Special issue of Journal of
Literary Semantics 41(2).
Carruthers, Peter
2006 The Architecture of the Mind. Oxford: Oxford University Press.
Carston, Robyn
1997 Enrichment and loosening: Complementary processes in deriving the proposi-
tion expressed. Linguistische Berichte 8: 103–127.
Carston, Robyn
1998 Informativeness, relevance and scalar implicature. In: Robyn Carston and Seiji
Uchida (eds.), Relevance Theory. Applications and Implications, 179–236.
Amsterdam: John Benjamins.
Carston, Robyn
2002 Thoughts and Utterances. Oxford: Wiley-Blackwell.
Carston, Robyn
2010a Lexical pragmatics, ad hoc concepts and metaphor: From a relevance theory
perspective. Italian Journal of Linguistics 22(1): 153–180.
Carston, Robyn
2010b Metaphor: Ad hoc concepts, literal meaning and mental images. Proceedings
of the Aristotelian Society 110(3): 295–321.
Carston, Robyn
2011 Metaphor and the literal/nonliteral distinction. In: Keith Allan and Kasia M.
Jaszczolt (eds.), The Cambridge Handbook of Pragmatics, 469–492. Cam-
bridge: Cambridge University Press.
Carston, Robyn and George Powell
2006 Relevance theory: New directions and developments. In: Ernest Lepore and
Barry Smith (eds.), The Oxford Handbook of Philosophy of Language, 341–
360. Oxford: Oxford University Press.
Carston, Robyn and Catherine Wearing
2015 Hyperbolic language and its relation to metaphor and irony. Journal of Prag-
matics 79: 79–92.
Chevallier, Coralie, Deirdre Wilson, Francesca Happé and Ira Noveck
2009 From acoustics to grammar: Perceiving and interpreting grammatical prosody
in adolescents with Asperger Syndrome. Research in Autism Spectrum Disor-
ders 3(2): 502–516.
Chierchia, Gennaro
2017 Scalar implicatures and their interface with grammar. Annual Review of Lin-
guistics 3: 245–264.
Clark, Billy
1996 Stylistic analysis and relevance theory. Language and Literature 5(3): 163–178.
Clark, Billy
2011 Recent developments in relevance theory. In: Peter Grundy and Dawn Archer
(eds.), The Pragmatics Reader, 129–137. London: Routledge.
Clark, Billy
2012 Beginning with ‘One more thing’: Pragmatics and editorial intervention in the
work of Raymond Carver. Journal of Literary Semantics 41(2): 155–173.
Clark, Billy
2013 Relevance Theory. Cambridge: Cambridge University Press.
Clark, Billy
2014 Stylistics and relevance theory. In: Michael Burke (ed.), Routledge Handbook
of Stylistics, 155–174. London: Routledge.
Clark, Billy and Nicola Owtram
2012 Imagined inference: Teaching writers to think like readers. In: Michael Burke,
Szilvia Czabo, Lara Week and Judit Zerkowitz (eds.), Current Trends in Ped-
agogical Stylistics, 126–141. London: Continuum.
Clark, Herbert H.
1979 Responding to indirect speech acts. Cognitive Psychology 11: 430–477.
Clark, Herbert H.
1996 Using Language. Cambridge: Cambridge University Press.
Clark, Herbert H. and Peter Lucy
1975 Understanding what was meant from what was said: A study in conversation-
ally conveyed requests. Journal of Verbal Learning and Verbal Behavior 14:
56–72.
Colombino, Tommaso
2006 Problems with a relevance-theoretic account of autism. Theory and Psychol-
ogy 16(2): 169–177.
Cosmides, Leda and John Tooby
1992 Cognitive adaptations for social exchange. In: Jerome Barkow, Leda Cosmides
and John Tooby (eds.), The Adapted Mind, 163–228. Oxford: Oxford University
Press.
Deignan, Alice
2005 Metaphor and Corpus Linguistics. Amsterdam: John Benjamins.
Deignan, Alice and Liz Potter
2004 A corpus study of metaphors and metonyms in English and Italian. Journal of
Pragmatics 36(7): 1231–1252.
Ericsson, K. Anders and Herbert A. Simon
1980 Verbal reports as data. Psychological Review 87(3): 215–251.
Ericsson, K. Anders and Herbert A. Simon
1984 Protocol Analysis: Verbal Reports as Data. Cambridge, MA: MIT Press.
Evans, Jonathan
1972 Interpretation and ‘matching bias’ in a reasoning task. Quarterly Journal of
Experimental Psychology 24: 193–199.
Fabb, Nigel
1995 The density of response: a problem for literary criticism and cognitive science.
In: Jonathan Payne (ed.), Linguistic Approaches to Literature: Papers in Literary
Stylistics, 143–157. University of Birmingham: English Language
Research.
Fabb, Nigel
1997 Linguistics and Literature. Oxford: Wiley-Blackwell.
Fabb, Nigel
2002 Language and Literary Structure. Cambridge: Cambridge University Press.
Fabb, Nigel
2010 Is literary language a development of ordinary language? Lingua 120(5):
1219–1232.
Færch, Claus and Gabriele Kasper
1987 Introspection in Second Language Research. Clevedon: Multilingual Matters.
Fiddick, Laurence, Leda Cosmides and John Tooby
2000 No interpretation without representation: The role of domain-specific rep-
resentations in the Wason selection task. Cognition 77: 1–79.
Fodor, Jerry A.
1983 Modularity of Mind. Cambridge, MA: MIT Press.
Forceville, Charles
1996 Pictorial Metaphor in Advertising. London: Routledge.
Forceville, Charles
2014 Relevance theory as model for analysing visual and multimodal communication.
In: David Machin (ed.), Visual Communication, 51–70. Berlin: Mouton
de Gruyter.
Foster-Cohen, Susan H.
2000 Review Article: Relevance: communication and cognition. Second Language
Research 16(1): 77–92.
Foster-Cohen, Susan H.
2004a Relevance theory and second language learning/behaviour. Second Language
Research 20(3): 189–192.
Foster-Cohen, Susan H.
2004b Relevance theory, action theory and second language communication strate-
gies. Second Language Research 20(3): 289–302.
Foster-Cohen, Susan H. and Tze-Peng Wong
2017 Early intervention at the interface: Semantic-pragmatic strategies for facili-
tating conversation with children with developmental disabilities. In: Ilse
Depraetere and Raphael Salkie (eds.), Semantics and Pragmatics: Drawing A
Line, 163–182. Berlin: Springer.
Furlong, Anne
1996 Relevance theory and literary interpretation. PhD thesis, University College
London.
Furlong, Anne
2001 ‘Is it a classic if no-one reads it?’ Proceedings of the 24th Annual Meeting
of the Atlantic Provinces Linguistics Association (APLA). Université de
Moncton: Moncton NB.
Furlong, Anne
2007 A modest proposal: linguistics and literary studies. Canadian Journal of
Applied Linguistics 10(3): 325–347.
Furlong, Anne
2011 The soul of wit. Language and Literature 20: 136–150.
Furlong, Anne
2012 ‘It’s not quite what I had in mind’: Adaptation, faithfulness, and interpretation.
Journal of Literary Semantics 41(2): 175–191.
Gibbs, Raymond
1979 Contextual effects in understanding indirect requests. Discourse Processes 2:
1–10.
Gibbs, Raymond
1981 Your wish is my command: Convention and context in interpreting indirect
requests. Journal of Verbal Learning and Verbal Behavior 20: 431–444.
Gibbs, Raymond
1983 Do people always process the literal meanings of indirect requests? Journal of
Experimental Psychology: Learning, Memory, and Cognition 9(3): 524–533.
Gibbs, Raymond
1986 Skating on thin ice: Literal meaning and understanding idioms in conversa-
tion. Discourse Processes 9(1): 17–30.
Gibbs, Raymond and Jessica Moise
1997 Pragmatics in understanding what is said. Cognition 62(1): 51–74.
Grice, H. Paul
1975 Logic and conversation. In: Peter Cole and Jerry Morgan (eds.), Syntax and
Semantics 3: Speech Acts, 41–58. New York: Academic Press.
Grice, H. Paul
1978 Further notes on logic and conversation. In: Peter Cole (ed.), Syntax and
Semantics 9: Pragmatics, 113–128. New York: Academic Press.
Grice, H. Paul
1989 Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Groefsema, Marjolein
1995 Can, may, must and should: A relevance-theoretic account. Journal of Linguis-
tics 31: 53–79.
Grossman, Eitan and Ira Noveck
2015 What can historical linguistics and experimental pragmatics offer each other?
Linguistics Vanguard 1(1): 145–153.
Gundel, Jeanette
2011 Child language, theory of mind, and the role of procedural markers in identi-
fying referents of nominal expressions. In: Victoria Escandell-Vidal, Manuel
Leonetti and Aoife Ahern (eds.), Procedural Meaning: Problems and Perspec-
tives, 205–231. Bingley: Emerald Group Publishing.
Hanks, Patrick
2012 How people use words to make meanings. In: Alex Boulton and James Thomas
(eds.), Input, Process and Product: Developments in Teaching and Language
Corpora, 52–67. Brno: Masaryk University Press.
Happé, Francesca
1993 Communicative competence and theory of mind in autism: A test of relevance
theory. Cognition 48(2): 101–119.
Happé, Francesca
1995 Understanding minds and metaphors: Insights from the study of figurative lan-
guage in autism. Metaphor and Symbolic Activity 10: 275–295.
Horn, Laurence R.
1984 Towards a new taxonomy for pragmatic inference: Q- and R-based implica-
ture. In: Deborah Schiffrin (ed.), Meaning, Form, and Use in Context: George-
town University Round Table on Languages and Linguistics, 11–42. Washing-
ton DC: Georgetown University Press.
Horn, Laurence R.
1988 Pragmatic Theory. In: Frederick J. Newmeyer (ed.), Linguistics: The Cambridge
Survey, Volume 1: Linguistic Theory: Foundations, 113–145. Cambridge:
Cambridge University Press.
Horn, Laurence R.
1989 A Natural History of Negation. Chicago: University of Chicago Press.
Ifantidou, Elly
2014 Pragmatic Competence and Relevance. Amsterdam: John Benjamins.
Ifantidou, Elly
2016 Relevance theory, epistemic vigilance and pragmatic competence. In: Manuel
Padilla Cruz (ed.), Relevance Theory: Recent Developments, Current Challenges
and Future Directions, 193–238. Amsterdam: John Benjamins.
Jary, Mark
2008 The relevance of complement choice: A corpus study of believe. Lingua 118:
1–18.
Jodlowiec, Maria
2010 The role of relevance theory in SLA studies. In: Martin Pütz and Laura Sicola
(eds.), Cognitive Processing in Second Language Acquisition, 49–66. Amster-
dam: John Benjamins.
Johnson-Laird, Philip N., Paolo Legrenzi and Maria Sonino Legrenzi
1972 Reasoning and a sense of reality. British Journal of Psychology 63(3): 395–
400.
Jorgensen, Julia, George Miller and Dan Sperber
1984 Test of the mention theory of irony. Journal of Experimental Psychology
113(1): 112–120.
Jucker, Andreas H.
1993 The discourse marker well: a relevance-theoretical account. Journal of Prag-
matics 19: 435–452.
Katsos, Napoleon and Chris Cummins
2010 Pragmatics: From theory to experiment and back again. Language and Lin-
guistics Compass 4(5): 282–295.
Kissine, Mikhail
2012 Pragmatics, cognitive flexibility and autism spectrum disorders. Mind and
Language 27(1): 1–28.
de Klerk, Vivian
2005 Procedural meanings of well in a corpus of Xhosa English. Journal of Prag-
matics 37(8): 1183–1205.
Kolaiti, Patricia and Deirdre Wilson
2014 Corpus analysis and lexical pragmatics: An overview. International Review of
Pragmatics 6(2): 211–239.
Levinson, Stephen C.
1987 Minimization and conversational inference. In: Jef Verschueren and Marcella
Bertuccelli Papi (eds.), The Pragmatic Perspective, 61–129. Amsterdam: John
Benjamins.
Levinson, Stephen C.
2000 Presumptive Meanings: The Theory of Generalised Conversational Implica-
ture. Cambridge MA: MIT Press.
Loukusa, Soile, Eeva Leinonen, Sanna Kuusikko, Katja Jussila, Marja-Leena Mattila, Nuala
Ryder, Hanna Ebeling and Irma Moilanen
2007 Use of context in pragmatic language comprehension by children with
Asperger syndrome or high-functioning autism. Journal of Autism and Devel-
opmental Disorders 37(6): 1049–1059.
MacMahon, Barbara
1996 Indirectness, rhetoric and interpretive use: Communicative strategies in
Browning’s My Last Duchess. Language and Literature 5(3): 209–223.
MacMahon, Barbara
2006 Relevance theory: Stylistic applications. In: E. Keith Brown (ed.), Interna-
tional Encyclopedia of Language and Linguistics, 519–522, Volume 10, Sec-
ond edition. Amsterdam: Elsevier.
MacMahon, Barbara
2009a Metarepresentation and decoupling in Northanger Abbey: Part I. English Stud-
ies 90(5): 518–544.
MacMahon, Barbara
2009b Metarepresentation and decoupling in Northanger Abbey: Part II. English
Studies 90(6): 673–694.
Macrae, Andrea
2016 You and I, past and present: Cognitive processing of perspective. Diegesis
5(1): 64–80. Available at: https://www.diegesis.uni-wuppertal.de/index.php/
diegesis/article/view/214.
MacWhinney, Brian
2006 Emergentism – use often and with care. Applied Linguistics 27(4): 729–740.
Mascaro, Olivier and Dan Sperber
2016 Pragmatic inference in infancy. In: Fabienne Salfner and Uli Sauerland (eds.),
Pre-proceedings of Trends in Experimental Pragmatics Workshop at Center
for General Linguistics, Berlin, Germany January 18–20, 2016: 95–102.
http://www.xprag.de/wp-content/uploads/2015/08/TiXPrag-preproc.pdf.
Mercier, Hugo and Dan Sperber
2017 The Enigma of Reason: A New Theory of Human Understanding. London:
Allen Lane.
Miall, David S.
2016 Literary Reading: Empirical and Theoretical Studies. Bern: Peter Lang.
Morini, Massimiliano
2010 The poetics of disengagement: Jane Austen and echoic irony. Language and
Literature 19(4): 339–356.
Nicolle, Steve and Billy Clark
1999 Experimental pragmatics and what is said: A response to Gibbs and Moise.
Cognition 69: 337–354.
Norbury, Courtenay Frazier
2005 The relationship between theory of mind and metaphor: Evidence from chil-
dren with language impairment and autistic spectrum disorder. British Journal
of Developmental Psychology 23(3): 383–399.
Noveck, Ira
2001 When children are more logical than adults: Experimental investigations of
scalar implicature. Cognition 78(2): 165–188.
Noveck, Ira and Anne Reboul
2008 Experimental pragmatics: A Gricean turn in the study of language. Trends in
Cognitive Sciences 12(11): 425–431.
Noveck, Ira and Dan Sperber
2004 Experimental Pragmatics. Basingstoke: Palgrave Macmillan.
Noveck, Ira and Dan Sperber
2007 The why and how of experimental pragmatics: The case of ‘scalar inferences’.
In: Noel Burton-Roberts (ed.), Pragmatics, 184–212. Basingstoke: Palgrave
Macmillan.
O’Grady, William
2008 The emergentist program. Lingua 118: 447–464.
de Paiva, Beatriz
2007 Integrating relevance: An evaluation of theoretical accounts for the acquisi-
tion of pragmatic abilities in a second language. In: Bettina Kraft and Ron-
ald Geluykens (eds.), Cross-Cultural Pragmatics and Interlanguage English,
91–104. Munich: Lincom.
Papp, Szilvia
2006 A relevance-theoretic account of the development and deficits of theory of
mind in normally developing children and individuals with autism. Theory &
Psychology 16(2): 141–161.
Pfister, Jonas
2010 Infant communication: A problem for relevance theory? International Review
of Pragmatics 2(1): 3–20.
Phillips, Ben
2012 Modified Occam’s razor. Australasian Journal of Philosophy 90(2): 371–382.
Pilkington, Adrian (ed.)
1996 Relevance theory and literary style. Special issue of Language and Literature
5(3).
Pilkington, Adrian
2000 Poetic Effects. Amsterdam: John Benjamins.
Pragglejaz Group
2007 MIP: A method for identifying metaphorically used words in a corpus. Meta-
phor and Symbol 22: 1–39.
Reboul, Anne, Sabine Manificat and Nadege Foudon
2012 Autism from a cognitive-pragmatic perspective. In: Hans-Jörg Schmid (ed.),
Cognitive Pragmatics (Handbooks of Pragmatics 4), 317–344. Berlin: de
Gruyter Mouton.
Rubio-Fernández, Paula, Chris Cummins and Ye Tian
2016 Are single and extended metaphors processed differently? A test of two rele-
vance-theoretic accounts. Journal of Pragmatics 94: 15–28.
Rubio-Fernández, Paula, Catherine Wearing and Robyn Carston
2015 Metaphor and hyperbole: Testing the continuity hypothesis. Metaphor and
Symbol 30: 24–40.
Rühlemann, Christoph and Karin Aijmer
2014 Corpus pragmatics: Laying the foundations. In: Karin Aijmer and Christoph
Rühlemann (eds.), Corpus Pragmatics, 1–26. Cambridge: Cambridge Univer-
sity Press.
Sauerland, Uli and Petra Schumacher
2016 Pragmatics: Theory and experiment growing together. Linguistische Berichte
245: 3–24.
Skordos, Dimitrios and Anna Papafragou
2016 Children’s derivation of scalar implicatures: Alternatives and relevance. Cog-
nition 153: 6–18.
Sperber, Dan
1994 The modularity of thought and the epidemiology of representations. In: Law-
rence A. Hirschfeld and Susan A. Gelman (eds.), Mapping the Mind: Domain
specificity in cognition and culture, 39–67. Cambridge: Cambridge University
Press. Revised version published in: Dan Sperber. 1996. Explaining Culture:
A Naturalistic Approach. Oxford: Wiley-Blackwell.
Sperber, Dan
2001 In defense of massive modularity. In: Emmanuel Dupoux (ed.), Language,
Brain and Cognitive Development: Essays in Honor of Jacques Mehler,
47–57. Cambridge MA: MIT Press.
Sperber, Dan, Francesco Cara and Vittorio Girotto
1995 Relevance theory explains the selection task. Cognition 57: 31–95.
Sperber, Dan and Vittorio Girotto
2002 Use or misuse of the selection task? Rejoinder to Fiddick, Cosmides, and
Tooby. Cognition 85: 277–290.
Sperber, Dan and Deirdre Wilson
[1986] 1995 Relevance: Communication and Cognition. Second edition. Oxford: Wiley-
Blackwell.
Sperber, Dan and Deirdre Wilson
2002 Pragmatics, modularity and mind-reading. Mind & Language 17: 3–23.
Sperber, Dan and Deirdre Wilson
2005 Pragmatics. In: Frank Jackson and Michael Smith (eds.), Oxford Handbook of
Contemporary Philosophy, 468–501. Oxford: Oxford University Press.
Sperber, Dan and Deirdre Wilson
2008 A deflationary account of metaphor. In: Raymond Gibbs (ed.), The Cambridge
Handbook of Metaphor and Thought, 84–105. Cambridge: Cambridge Univer-
sity Press.
Sperber, Dan, Fabrice Clément, Christophe Heintz, Olivier Mascaro, Hugo Mercier, Gloria
Origgi and Deirdre Wilson
2010 Epistemic Vigilance. Mind & Language 25(4): 359–393.
Steen, Gerard
2007 Finding Metaphor in Grammar and Usage: A Methodological Analysis of
Theory and Research. Amsterdam: John Benjamins.
Steen, Gerard, Aletta Dorst, J. Berenike Herrmann, Anna Kaal, Tina Krennmayr and Trijntje
Pasma
2010a A Method for Linguistic Metaphor Identification. Amsterdam: John Benjamins.
Steen, Gerard, Ewa Biernacka, Aletta Dorst, Anna Kaal, Clara Lopez-Rodriguez and Tri-
jntje Pasma
2010b Pragglejaz in practice: Finding metaphorically-used words in natural dis-
course. In: Graham Lowe, Zazie Todd, Alice Deignan and Lynne Cameron
(eds.), Researching and Applying Metaphor in the Real World, 165–184.
Amsterdam: John Benjamins.
Stubbs, Michael
2001 Words and Phrases: Corpus Studies of Lexical Semantics. Oxford: Wiley-Black-
well.
Surian, Luca
1996 Are children with Autism deaf to Gricean maxims? Cognitive Neuropsychiatry
1(1): 55–72.
van Tiel, Bob, Emiel van Miltenburg, Natalia Zevakhina and Bart Geurts
2016 Scalar diversity. Journal of Semantics 33(1): 137–175.
Unger, Christoph
2006 Genre, Relevance and Global Coherence. The Pragmatics of Discourse Type.
Basingstoke: Palgrave Macmillan.
Vega Moreno, Rosa
2007 Creativity and Convention. The Pragmatics of Everyday Figurative Speech.
Amsterdam: John Benjamins.
Wałaszewska, Ewa
2011 Broadening and narrowing in lexical development: How relevance theory can
account for children’s overextensions and underextensions. Journal of Prag-
matics 43: 314–326.
Wałaszewska, Ewa
2015 Relevance-Theoretic Lexical Pragmatics: Theory and Applications. Newcastle
upon Tyne: Cambridge Scholars.
Wason, Peter C.
1966 Reasoning. In: Brian M. Foss (ed.), New Horizons in Psychology, 135–151.
Harmondsworth: Penguin.
Watts, Richard J.
1986 Relevance in conversational moves: A reappraisal of well. Studia Anglica Pos-
naniensia 19: 37–59.
Watts, Richard J.
1988 A relevance-theoretic approach to commentary pragmatic markers: The case
of actually, really and basically. Acta Linguistica Hungarica 38: 235–260.
Wearing, Catherine
2010 Autism, metaphor and relevance theory. Mind and Language 25(2): 196–216.
Wharton, Tim
2009 Pragmatics and Nonverbal Communication. Cambridge: Cambridge Univer-
sity Press.
Wharton, Tim
2014 What words mean is a matter of what people mean by them. Linguagem em
(Dis)curso 14(3): 473–488.
Wilson, Deirdre
2011 Relevance and the interpretation of literary works. UCL Working Papers in
Linguistics 23: 69–80. http://www.ucl.ac.uk/psychlangsci/research/linguis-
tics/publications/uclwpl23.
Wilson, Deirdre and Robyn Carston
2006 Metaphor, relevance and the ‘emergent property’ issue. Mind and Language
21(3): 404–433.
Wilson, Deirdre and Robyn Carston
2007 Metaphor, relevance and the ‘emergent property’ problem. The Baltic Interna-
tional Yearbook of Cognition, Logic and Communication 3: 1–40.
Wilson, Deirdre and Dan Sperber
1981 On Grice’s theory of conversation. In: Paul Werth (ed.), Conversation and
Discourse, 155–178. London: Croom Helm.
Wilson, Deirdre and Dan Sperber
1988 Mood and the analysis of non-declarative sentences. In: Jonathan Dancy,
Julius M. E. Moravcsik and Christopher C. W. Taylor (eds.), Human Agency:
Language, Duty and Value, 77–101. Stanford, CA: Stanford University Press.
Wilson, Deirdre and Dan Sperber
1993 Linguistic form and relevance. Lingua 90(1): 1–25.
Wilson, Deirdre and Dan Sperber
2004 Relevance Theory. In: Gregory Ward and Laurence Horn (eds.), Handbook of
Pragmatics, 607–632. Oxford: Wiley-Blackwell.
Wilson, Deirdre and Dan Sperber
2007 On verbal irony. In: Raymond Gibbs and Herbert L. Colston (eds.), Irony in
Language and Thought: A Cognitive Science Reader, 35–55. New York: Law-
rence Erlbaum Associates.
Wilson, Deirdre and Dan Sperber
2012 Meaning and Relevance. Cambridge: Cambridge University Press.
Yus, Francisco
2006 Relevance Theory. In: E. Keith Brown (ed.), Encyclopedia of Language and
Linguistics. Volume 10. Second edition, 512–519. Amsterdam: Elsevier.
Yus, Francisco
2008 Inferring from comics: a multi-stage account. In: Pelegri Sancho Cremades,
Carmen Gregori Signes and Santiago Renard (eds.), El Discurs del Comic,
223–249. Valencia: University of Valencia.
Yus, Francisco
2009 Visual metaphor versus verbal metaphor: A unified account. In: Charley
Forceville and Eduardo Urios-Aparisi (eds.), Multimodal Metaphor, 147–172.
Berlin: Mouton de Gruyter.
Yus, Francisco
2010 Relevance theory. In: Bernd Heine and Heiko Narrog (eds.), Oxford Handbook
of Linguistic Analysis, 679–701. Oxford: Oxford University Press.
Yus, Francisco
2017 Relevance Theory Online Bibliographic Service. A Regularly Updated Bibli-
ography of Work on Relevance Theory. https://personal.ua.es/francisco.yus/
rt.html.
III. Experimentational pragmatics
8. Introduction to part 3: Experimentational pragmatics
Klaus P. Schneider

1. On experimentation in pragmatics research

While all methods discussed in part 2 of this handbook rely on the researcher’s own
intuitions, the methods discussed in the remaining parts 3, 4 and 5 all involve the
use of data provided by people other than the researcher. Approaches employing
these methods can therefore be subsumed under the label “empirical pragmatics”.
In both part 4 and part 5, the focus is on so-called naturally occurring data (on the
complexities of this concept cf. Jucker, this volume, chapter 13), i. e. instances of
verbal (and non-verbal) communication as an integral part of real life situations,
which are either immediately observed by the researcher and recorded by using pen
and paper or an electronic device (part 4), or retrieved from usually very large and
machine-readable (pre-existing) collections of such instances specifically referred
to as corpora (part 5). By contrast, the methods surveyed in the present part 3 are
all methods involving data which do not occur naturally in real life situations, but
are elicited under conditions created by the researcher for the specific purpose of
data collection, which can be referred to loosely and in many, if not most cases
rather metaphorically as “laboratory conditions” (cf. Jucker 2009), i. e. not usually
in the literal sense of laboratory conditions in the natural sciences such as, perhaps
prototypically, chemistry. Although in many academic disciplines scientific work
is unthinkable without experimentation (and this applies not only to the natural
sciences, but also to disciplines interested in human behaviour, including
communication, such as behavioural economics or psychology), experimentational work
in pragmatics is sometimes frowned upon, especially by researchers working in
the traditions of conversation analysis and interactional linguistics, who reject
experimentational data as “artificial” or “inauthentic”. Scholars who are dogmatic
about these issues should, however, remember that the choice of method depends
entirely on the respective research questions, and that methods which are well
suited to answering some questions may be ill suited to answering
others (cf. Schneider, this volume, chapter 2). The strongest advantage
of all experimentational methods is that they permit systematic manipulation and
control of relevant parameters such as situational variables. That is the
common denominator of all methods discussed in this part of the handbook.

https://doi.org/10.1515/9783110424928-008
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 219–228. Berlin/
Boston: De Gruyter Mouton.

In the present context, the term “experimentational” is preferred over the more
frequent term “experimental” to emphasize the broad notion of experimental
pragmatics advocated in this handbook, which goes well beyond the narrower and
more specific interpretation championed in the approach called “Experimental
Pragmatics” commonly referred to as “XPrag” (cf., e. g., Noveck and Sperber
2004), which developed in the relevance-theoretic tradition. While in the frame-
work of Relevance Theory researchers originally relied on their own intuitions
(e. g. Sperber and Wilson 1995), in XPrag approaches methods are adopted from
psycho- and neurolinguistics and more generally psychology and neuroscience,
including eye-tracking and neuroimaging, and specifically such methods as elec-
troencephalography (EEG) and functional magnetic resonance imaging (fMRI) (cf.
Clark in part 2 of this volume). Yet there are also other approaches in pragmatics,
older than XPrag, in which experimental methods are employed, including meth-
ods taken over from psychology, specifically in interlanguage pragmatics, applied
linguistics and second language research aimed at developing and assessing the
pragmatic competence and performance of learners in the context of foreign lan-
guage teaching and testing (cf., e. g., Kasper and Blum-Kulka 1993, Kasper and
Rose 2002, Ross and Kasper 2013). Interestingly, criticism of experimental meth-
ods has been levelled more against these more applied areas of pragmatics than
against approaches in the tradition of Relevance Theory.
This third part of the handbook includes a total of four chapters, each dealing
with a particular method, or set of methods, used in and representative of dif-
ferent approaches to pragmatics. Chapter 9, by Eva Ogiermann, deals with dis-
course completion tasks, and chapter 10, by Alma Veenstra and Napoleon Katsos,
focuses on sentence judgment tasks. In chapter 11, Raymond W. Gibbs, Jr. surveys
a wide range of psycholinguistic production tasks, whereas in chapter 12, J. César
Félix-Brasdefer discusses role plays. Role plays, discourse completion tasks and
the psycholinguistic tasks surveyed by Gibbs all serve the purpose of eliciting
language production, while the tasks examined by Veenstra and Katsos are used
to assess comprehension. These comprehension tasks, and also some of the pro-
duction tasks discussed by Gibbs, are typical of research carried out in that field
of pragmatics which is inspired by the work of Paul Grice and is characteristically
focused on individual utterances or sentences. In other words, these methods are
typical of that branch of pragmatics sometimes referred to as the “Anglo-Ameri-
can tradition” (Huang 2010), in which pragmatics is conceptualized essentially as
an extension of semantics (Huang 2007: 4). Role plays and especially discourse
completion tasks, on the other hand, are typically used in interlanguage pragmat-
ics and educational contexts to empirically establish native speaker norms of lan-
guage use and pragmatic features of learner performance as well as for assessing
pragmatic competence in a second language (cf. Edmondson et al. 1979; Blum-
Kulka, House, and Kasper 1989; Taguchi and Roever 2017). They are also used in
contrastive, cross-cultural and variational pragmatics to identify differences and
similarities between languages, cultures and social groups (cf., e. g., the contribu-
tions in Barron and Schneider 2009 and Beeching and Woodfield 2015). In these
fields of inquiry and beyond, discourse completion tasks (DCTs, originally termed
“discourse completion tests”) are an extremely popular elicitation method, not
least for the practical advantage that relatively large amounts of immediately com-
parable data can be gathered in a fairly short time. Hundreds if not thousands of
studies have been conducted employing this particular method for investigating
individual speech acts, most notably requests and apologies (Blum-Kulka, House,
and Kasper 1989, Trosborg 1995). Since DCTs have predominantly, though not
exclusively, been administered in writing (e. g. Jones and Adrefiza 2017 is a recent
exception), while the purpose is to study oral communication, specifically speech
acts, this method has received a great deal of criticism. Needless to say, several
features of spoken discourse such as prosody, intonation and features sometimes
collectively referred to as “normal non-fluency” (Short 1996: 176), i. e. hesitation,
backchannelling, interruptions, overlap, and so on, cannot be studied in written
DCTs. What can be studied, on the other hand, are the social norms and cultural
values underlying spoken discourse, which are reflected in this type of experimen-
tational data and can be conceptualized as “cultural models” (cf. Schneider 2012:
360–367 for some discussion). Normal non-fluency, intonation and prosody can,
however, be studied in role play data. Employing role plays is also a method of
generating data for the analysis of speech acts in context and sequences of inter-
actional behaviour.
While the four chapters in part 3 highlight methods which are typically and
extensively employed in experimentational pragmatics and have contributed and
are still contributing significantly to advancing the entire discipline, these chapters
are not intended to cover the full range of methods ever employed to elicit data for
pragmatic analysis. Further methods not addressed include various kinds of rating
tasks, perception tasks, individual thinking aloud, joint production, interviews,
and diaries. Most of these methods are discussed in detail in Kasper (2000) and
Kasper (2008)¹ (cf. also Félix-Brasdefer and Hasler-Barker 2017 and Schneider,
this volume, chapter 2).

¹ Essentially, these two papers are two versions of the same article, which appeared in the
first and second editions of the same book (Spencer-Oatey 2000 and 2008), two editions
which do not, however, include the same authors, although there is, of course, massive
overlap. What is worth noting about Kasper’s contributions are the apparent differences
between the two versions, which clearly demonstrate that the author has left
experimentational pragmatics and moved over to observational pragmatics, no longer interested
in the role plays and DCTs successfully employed in her earlier projects (cf., e. g., House
and Kasper 1981 and Blum-Kulka, House, and Kasper 1989), but now favouring CA
methodology (cf., e. g., Kasper 2009).
2. The chapters in this part of the handbook

Part 3 of this handbook includes chapters 9, 10, 11 and 12. In chapter 9, discourse
completion tasks (DCTs) are introduced and discussed. The DCT is an elicitation
format which consists of the description of a specific social situation to which
informants are requested to react, as a rule by producing a speech act, or more
precisely a missing turn-at-talk (in a dialogue excerpt). In this chapter, Eva Ogiermann
emphasizes the unique suitability of this particular data collection method
for the specific purposes of cross-cultural, variational and interlanguage pragmat-
ics, enabling large-scale systematic comparison of languages and varieties of lan-
guages as well as native-speaker and non-native-speaker productions by eliciting
large amounts of contextually varied data with a focus on speech acts and their
realizations. She provides an extensive overview of the wide range of languages
and first, second and foreign language varieties that have been studied with this
method, which include not only English and further Indo-European languages,
but also typologically unrelated languages from around the globe, among them
several which are still understudied in pragmatics research, e. g. Setswana, Lom-
bok Indonesian and Jordanian Arabic. She also gives an overview of the language
pairs that have been compared and the many speech acts that have been examined.
In this context, she refers to the Cross-Cultural Speech Act Realization Project
(CCSARP), which was a large-scale project investigating requests and apologies
in five different languages and several varieties of these languages (Blum-Kulka,
House, and Kasper 1989). This project has had, and still has, a huge impact on
later DCT studies, not only with the scenarios used in it, but also with the elaborate
coding scheme developed for the processing and analysis of DCT data.
Ogiermann furthermore describes the different design features and variations
of the DCT format, including the nature of the scenarios to which the informants
are requested to react verbally, the length and explicitness of the instructions, the
systematic manipulation and integration of social variables (e. g. power and par-
ticipant sex), and the presence or absence of a so-called rejoinder, i. e. a follow-up
turn to limit the options of discourse completion. She points out the respective
advantages and disadvantages of the available options and shows how different options
can be chosen to examine different types of speech acts, e. g. initiating acts such as
requests or responding acts such as compliment responses, formulaic acts such as
thanking or more complex acts such as apologies or complaints.
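The systematic manipulation of social variables described above can be pictured as a fully crossed design. The following Python sketch is purely illustrative and is not taken from Ogiermann's chapter: the variable levels, the class name DCTCondition and the functions build_conditions and render_item are all hypothetical, and the wording of each scenario would still have to be written by the researcher.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical variable levels, chosen only for illustration.
POWER = ("speaker_higher", "equal", "hearer_higher")
PARTICIPANT_SEX = ("female", "male")
REJOINDER = (False, True)


@dataclass(frozen=True)
class DCTCondition:
    """One cell of a fully crossed (factorial) DCT design."""
    power: str
    participant_sex: str
    rejoinder: bool


def build_conditions():
    """Cross every level of every variable so each combination occurs once."""
    return [DCTCondition(p, s, r)
            for p, s, r in product(POWER, PARTICIPANT_SEX, REJOINDER)]


def render_item(scenario: str, condition: DCTCondition,
                rejoinder_text: str = "") -> str:
    """Lay out a written item: scenario, blank turn, optional rejoinder."""
    lines = [scenario, "You: ____________________"]
    if condition.rejoinder:
        lines.append(f"Other: {rejoinder_text}")
    return "\n".join(lines)


conditions = build_conditions()
print(len(conditions))  # 3 x 2 x 2 = 12 conditions
```

Under these assumed levels the design has twelve cells, which shows how quickly the number of scenarios grows with each additional variable.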
Ogiermann then surveys studies comparing DCTs to other experimentational
methods, including in particular role plays and multiple choice questionnaires
(MCQs). It was found that, apart from obvious differences, DCT, MCQ and role
play data also display crucial similarities. Finally, Ogiermann deals with studies
comparing DCT data and naturally occurring discourse and specifically addresses
the criticism of the DCT method voiced in these studies. By discussing a number of
examples in detail, she shows that these comparisons are biased and unfair, in that
they take naturally occurring data as their starting point and typically focus
exclusively on those features of spoken discourse which, for obvious reasons, cannot
be investigated in DCT data, while ignoring the strengths of the DCT method, and
especially its suitability for comparative work, in which researchers working with
natural data are, however, not usually interested. Yet, Ogiermann concludes, for
the purposes of comparative work across languages and cultures, requiring large
sets of contextually varied and directly comparable data, no better data collection
method is currently available than the DCT method.
In chapter 10, Alma Veenstra and Napoleon Katsos give a critical account of
sentence judgment tasks, which are a particular type of comprehension task com-
monly used in the Gricean tradition, at the interface of semantics and pragmatics,
to assess the interpretation of utterance meaning. In sentence judgment tasks, par-
ticipants are asked to decide whether a sentence is true or false or rate it in terms
of e. g. (in)correctness or (in)appropriateness. Veenstra and Katsos are especially
critical of such binary decisions. They report recent studies which suggest that
sentence judgment tasks requiring such decisions do not reliably assess pragmatic
comprehension. They refer to investigations into scalar implicatures and how they
are understood which demonstrate that participants, e. g. young children, often
accept pragmatically incorrect sentences, although they perform successfully on
tasks designed to test other aspects of pragmatic competence. It was therefore
concluded that these participants do not display comprehension deficits concern-
ing the interpretation of scalar implicatures, as originally assumed, but are in fact
pragmatically competent, showing more tolerance concerning the correctness of
sentences, which they do not rate as correct or incorrect, but as more or less correct
or incorrect, accepting sentences they regard as sufficiently correct or not very
incorrect. To remedy the shortcomings of binary judgment tasks, alternative task
formats, considered more suitable for the assessment of pragmatic comprehension,
have been developed which Veenstra and Katsos discuss at the end of their chapter.
Veenstra and Katsos begin their account with a theoretical discussion of scalar
implicatures and their properties, which are exemplified with implicatures based
on the Gricean maxim of quantity. In this context, they briefly point to the con-
sequences of diverging theoretical positions for psycholinguistic studies on the
acquisition and development of pragmatic competence. They then survey a body of
research employing sentence judgment tasks, especially empirical studies in acqui-
sition research, in which decontextualized underinformative sentences have played
a crucial role. These studies seem to suggest that children acquire comprehension
of scalar implicatures at a relatively late age. This, however, does not appear to
make sense in the light of other acquisitional experiments which clearly demon-
strate that children at a younger age are well able to draw e. g. relevance inferences.
In the ensuing discussion, Veenstra and Katsos present a detailed description of a
large number of studies demonstrating the inadequacy of binary judgment tasks
and offering a more convincing picture of pragmatic comprehension competence
in young children arrived at by employing, for instance, graded judgment tasks.
Against this background, the authors introduce and elaborate their Pragmatic Tol-
erance Hypothesis, according to which at least some children notice pragmatic
violations, but accept such sentences all the same.
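The difference between binary and graded judgment tasks can be made concrete with a toy model. The Python sketch below is only an illustration under stated assumptions and is not the procedure used by Veenstra and Katsos: the acceptability values, the tolerance threshold and the three-point scale are hypothetical, as are the function names binary_judgment and graded_judgment.

```python
def binary_judgment(acceptability: float, threshold: float) -> bool:
    """Accept the sentence if it seems acceptable enough to this participant."""
    return acceptability >= threshold


def graded_judgment(acceptability: float, points: int = 3) -> int:
    """Map an acceptability value in [0, 1] onto a 1..points rating scale."""
    if not 0.0 <= acceptability <= 1.0:
        raise ValueError("acceptability must lie between 0 and 1")
    return min(points, int(acceptability * points) + 1)


# Hypothetical values: an underinformative sentence is assumed to be
# perceived as neither fully appropriate nor plainly wrong.
FALSE_SENTENCE = 0.1
UNDERINFORMATIVE = 0.5
OPTIMAL = 0.9
TOLERANT_THRESHOLD = 0.4  # a tolerant participant accepts "good enough"

for value in (FALSE_SENTENCE, UNDERINFORMATIVE, OPTIMAL):
    print(binary_judgment(value, TOLERANT_THRESHOLD), graded_judgment(value))
```

In this toy model the tolerant participant accepts both the underinformative and the optimal sentence in the binary task, so the two become indistinguishable, whereas the graded task assigns them different ratings and thus reveals that the violation was noticed.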
While graded sentence judgment tasks are shown to be a more suitable method
for assessing pragmatic comprehension than the much more frequently used binary
judgment tasks, the former involve the same problem as the latter, namely a com-
bination of a linguistic and a meta-linguistic component characteristic of all judg-
ment tasks. Veenstra and Katsos therefore propose to abandon the sentence judg-
ment paradigm and to adopt alternative methods in the examination of pragmatic
acquisition and the development of pragmatic comprehension, which they briefly
outline at the end of their chapter. These range from picture-matching and more
complex behavioural tasks to eye-tracking and neuroimaging techniques such as
ERP. In other words, Veenstra and Katsos advocate the methodological inventory
currently deployed in the experimental paradigm otherwise known as XPrag (see
above), although XPrag is not explicitly mentioned in their text.
In chapter 11, Raymond W. Gibbs, Jr. surveys a great number of experimen-
tational methods developed and employed in psycholinguistics for investigating
various pragmatic aspects of language production. These include a wide range of
diverse tasks, designed to investigate when and how people use particular prag-
matic phenomena in their language production, e. g. metaphors, irony, indirectness
or rhetorical questions.
Gibbs emphasizes that language production is not an isolated psychological
process. He points out that pragmatic language production is closely linked to lan-
guage comprehension and therefore most suitably conceptualized as coordinated
action involving the speaker as well as the hearer. Furthermore, language produc-
tion is interrelated with several non-linguistic processes within interacting humans
and also with their non-verbal behaviour. These insights underscore the necessity
to examine language production in multimodal contexts.
The chapter is organized according to the research questions commonly
addressed by psycholinguists interested in and working on pragmatic aspects of
language production. Specifically, these questions are: (1) What do people say in
a given social situation and how do they say it? (2) Why do people say what they
say and what are the reasons for the linguistic choices they make? (3) How do
people describe scenes they have watched and relate them to other people, and
how do people read stories to other people? (4) How do people answer questions
which require specific types of processing? (5) How do people use language to
coordinate their own actions and joint action with other people? (6) What role do
bodily processes play in the coordination of actions? (7) To what extent is pragmatic
language production a conscious strategic choice or automatized? To each of
these questions, Gibbs dedicates a section of his chapter, in which he reviews the
experimentational methods respectively employed to find answers to these questions.
Methods developed to find out what people say in particular situations and
how they say it (question 1) involve tasks that include the description of a scenario
to which the participants are asked to react. The examples reported, which were
designed to examine to what extent ordinary language users have tacit knowledge
of the felicity conditions speech act philosophers had postulated for e. g. promises
or in what situations they would prefer an indirect realization of e. g. a request,
are strikingly similar to the discourse completion tasks discussed in chapter 9.
However, neither psycholinguists nor applied linguists seem to be aware that
the particular method they employ is also employed for very similar purposes in
another discipline interested in essentially the same pragmatic phenomena. In his
report of studies involving such speech act production tasks, which were some-
times combined with rating scales, Gibbs emphasizes how difficult it would be to
identify systematically varied real life situations to research the same questions.
Researchers interested in the motivation of speakers to use particular types
of figurative language (cf. question 2), for example, asked participants directly for
the reasons for their choices and then organized their answers (e. g. “to be funny”,
“to be polite”) into an inventory of so-called discourse goals. To answer ques-
tion 3, participants were, for instance, shown videos whose contents they were to
tell to listeners who had not watched them; other participants were asked to read
aloud short stories to an audience. In either case, the focus of these studies was
on gestures and prosodic features, or more specifically on iconicity. Gibbs admits
that similar narrative elicitation tasks are used outside psycholinguistics, e. g. in
sociolinguistics, but maintains that in those cases the tasks are not carried out
under systematically varied experimental conditions or to test specific hypotheses.
Similar criticism could be levelled against recent work in variational pragmatics
(Bieswanger 2015), which involved asking people in the street for directions to elicit responses to
thanks. This same tool and similar tools (e. g. asking the time) were used, among
other instruments, by psycholinguists dealing with question 4 about question
answering. Their studies were concerned with beliefs about common ground and
relevance optimality. These examples show that psycholinguists interested in prag-
matic language production carry out their experiments not only under “artificial”
laboratory conditions, but also in controlled real life situations.
With J. César Félix-Brasdefer’s chapter 12 on role plays, this part of the hand-
book returns to more applied research contexts. Role plays, like discourse comple-
tion tasks (chapter 9), are commonly used in cross-cultural pragmatics and inter-
language pragmatics and specifically employed for investigating the pragmatic
competence of foreign language learners and second language users and also for
testing their pragmalinguistic and sociopragmatic knowledge. Crucial distinctions
made in this chapter include closed versus open role play, and role play versus role
enactment.
The term “closed role play”, as defined in this chapter, is in fact an alternative
label for oral discourse completion tasks. Hence, closed role plays elicit only
single-turn responses. They involve only one participant and are therefore
non-interactional. In some studies, participants, especially children in
developmental studies, “interact” with puppets. The instructions in closed role
plays and the scenarios to which the participants are expected to react verbally
may be given in writing, orally by the researcher or, in more recent work, by a
computer. A particular variant
is the computer-based multimedia elicitation task (MET) which provides partici-
pants with audio and visual input.
Open role plays, on the other hand, are prototypical role plays. They involve two
participants who are given written instructions including the description of a situa-
tion they are to act out, after reading, in face-to-face dialogical interaction, which is
audio- or video-taped. Such recordings enable the analysis not only of single-turn
speech acts, but also of multi-turn speech act negotiations, speech act sequences,
conversational openings and closings and further interactive features of spoken dis-
course as well as paralinguistic and prosodic features plus, in the case of video-re-
cordings, non-verbal behaviour. Both open and closed role plays enable systematic
control of relevant contextual variables such as setting and participant attributes and
identities and thus warrant comparability within collections of role play data.
While “role play” is, as a rule, used as a cover term for various subtypes, it
can also be used in a more specific sense and distinguished from “role enactment”.
This distinction pertains to the roles the participants are requested to play. In role
enactment, participants can be themselves and behave as they would in real life
situations. In role plays in the narrow sense, by contrast, participants have to pre-
tend to be somebody else and adopt roles for which they may lack immediate
experience. For instance, students may be asked to perform the role of a professor,
a doctor or a manager. Needless to say, role enactments generate more valid data,
but at the same time they limit the range of social roles and social situations which
can be examined by employing this particular method.
Special types of role play tasks and role enactment are integrated into test bat-
teries developed to assess the linguistic competencies of second language learners
and foreign language users. A prominent example is the “oral proficiency inter-
view” (OPI). Variants of the OPI are included in several standardized language
tests used around the globe, such as the well-known International English Lan-
guage Testing System (IELTS) test. OPIs are organized in various stages to espe-
cially elicit data for the analysis of many different aspects of that component of
pragmatic competence commonly referred to as interactional competence.
Félix-Brasdefer provides examples of entire role play transcripts to illustrate
their processing and analysis, specifically their segmentation and coding. At the
end of his chapter, he explicitly addresses issues of reliability and validity, and also
some ethical concerns. The specific limitations of the role play method include
its non-consequential nature. That is to say, role play interactions differ from
naturally occurring interactions in that they do not have any social consequences,
e. g. regarding the relationship between the interactants.
The four chapters in part 3 of this handbook thus deal with distinctly different
experimentational methods which are widely used, but used in different areas of
pragmatics research with different goals and purposes in mind and employed by
researchers who do not usually take notice of each other. Yet, as indicated above,
there seem to be some areas of overlap. The comprehension tasks and production
tasks introduced and discussed in chapters 10 and 11 are commonly used in the
overall framework of Gricean pragmatics, with a high degree of experimental rig-
our and explicit hypotheses to be tested. These tasks belong to the methodological
inventory of Experimental Pragmatics known as XPrag. Here the focus is often
on implicatures, metaphors and irony, but also many other aspects of utterances.
Discourse completion tasks and role play tasks, on the other hand, are frequently
employed to collect large comparable sets of data for contrasting languages and
varieties of languages, cultures and social groups, native speakers and language
learners in cross-cultural, variational and interlanguage pragmatics. Here the focus
is often on speech acts and speech act sequences, but also further features of spoken
discourse. Of all experimentational methods reviewed in this part of the handbook,
role play tasks seem best suited to the systematic study of a range of interactional
aspects. It is hoped that the chapters in the present part, and more generally in this
handbook, contribute to methodological cross-fertilization and facilitate interdis-
ciplinary work across the boundaries of what seem to be complementary research
areas within the vast field of pragmatics.

References

Barron, Anne and Klaus P. Schneider (eds.)
2009 Variational Pragmatics. Special Issue. Intercultural Pragmatics 6(4): 425–616.
Beeching, Kate and Helen Woodfield (eds.)
2015 Researching Sociopragmatic Variability: Perspectives from Variational, Inter-
language and Contrastive Pragmatics. Basingstoke: Palgrave Macmillan.
Bieswanger, Markus
2015 Variational pragmatics and responding to thanks – revisited. Multilingua
34(4): 527–546.
Blum-Kulka, Shoshana, Juliane House and Gabriele Kasper (eds.)
1989 Cross-Cultural Pragmatics: Requests and Apologies. Norwood, N.J.: Ablex.
Edmondson, Willis, Juliane House, Gabriele Kasper and Brigitte Stemmer
1979 Kurzbeschreibung des Projekts “Kommunikative Kompetenz als realisier-
bares Lernziel”. Linguistische Berichte 67: 50–57.
Félix-Brasdefer, J. César and Maria Hasler-Barker
2017 Elicited data. In: Anne Barron, Yueguo Gu and Gerard Steen (eds.), The Routledge
Handbook of Pragmatics, 27–40. London: Routledge.
Huang, Yan
2007 Pragmatics. Oxford: Oxford University Press.
Huang, Yan
2010 Pragmatics. In: Louise Cummings (ed.), The Pragmatics Encyclopedia, 341–
345. London: Routledge.
Jones, Jeremy F. and Adrefiza
2017 Comparing apologies in Australian English and Bahasa Indonesia: Cultural
and gender perspectives. Journal of Politeness Research 13(1): 89–119.
Jucker, Andreas H.
2009 Speech act research between armchair, field and laboratory: The case of com-
pliments. Journal of Pragmatics 41(8): 1611–1635.
Kasper, Gabriele
2000 Data collection in pragmatics research. In: Helen Spencer-Oatey (ed.), Cul-
turally Speaking: Managing Rapport through Talk across Cultures, 316–341.
London: Continuum.
Kasper, Gabriele
2008 Data collection in pragmatics research. In: Helen Spencer-Oatey (ed.), Cul-
turally Speaking: Managing Rapport through Talk across Cultures, 279–303.
(Second edition.) London: Continuum.
Kasper, Gabriele
2009 Categories, context and comparison in conversation analysis. In: Hanh thi
Nguyen and Gabriele Kasper (eds.), Talk-in-Interaction: Multilingual Per-
spectives, 1–28. Honolulu, HI: University of Hawai’i at Manoa, National For-
eign Language Resource Center.
Kasper, Gabriele and Shoshana Blum-Kulka (eds.)
1993 Interlanguage Pragmatics. New York: Oxford University Press.
Kasper, Gabriele and Kenneth R. Rose
2002 Pragmatic Development in a Second Language. Oxford: Blackwell.
Noveck, Ira A. and Dan Sperber (eds.)
2004 Experimental Pragmatics. Basingstoke: Palgrave Macmillan.
Ross, Steven J. and Gabriele Kasper (eds.)
2013 Assessing Second Language Pragmatics. Basingstoke: Palgrave Macmillan.
Schneider, Klaus P.
2012 Pragmatic variation and cultural models. Review of Cognitive Linguistics
10(2): 346–372.
Short, Mick
1996 Exploring the Language of Poems, Plays and Prose. London: Longman.
Spencer-Oatey, Helen (ed.)
2000 Culturally Speaking: Managing Rapport through Talk across Cultures. Lon-
don: Continuum.
Spencer-Oatey, Helen (ed.)
2008 Culturally Speaking: Managing Rapport through Talk across Cultures. (Sec-
ond edition.) London: Continuum.
Sperber, Dan and Deirdre Wilson
1995 Relevance: Communication and Cognition. (Second edition.) Oxford: Blackwell.
Taguchi, Naoko and Carsten Roever
2017 Second Language Pragmatics. Oxford: Oxford University Press.
Trosborg, Anna
1995 Interlanguage Pragmatics: Requests, Complaints and Apologies. Berlin/New
York: Mouton de Gruyter.
9. Discourse completion tasks
Eva Ogiermann

Abstract: The present chapter examines Discourse Completion Tasks (DCTs), a
data elicitation method that generates large amounts of contextually varied and
comparable cross-linguistic speech act data, used predominantly in cross-cul-
tural and interlanguage pragmatics. It discusses different features of DCT design,
including the formulation of scenarios, the incorporation of social variables and
format choice. The chapter then reviews studies comparing DCTs to other data
elicitation methods and to naturally occurring data. It shows that while the dif-
ferent data collection methods generate similar speech act realisation strategies,
the reported differences – mainly regarding directness, mitigation, and politeness
marking – are largely inconclusive, with the results depending on the speech acts
and groups of speakers under study.

1. Introduction

The Discourse Completion Task (DCT) is probably the most widely used data
collection instrument in cross-cultural pragmatics, a field of enquiry that compares
different speech acts across languages, and in interlanguage pragmatics, which
examines learners’ pragmatic competence and development. What makes DCTs
particularly valuable for these areas of investigation is that research aiming to
establish culture-specific patterns in speech act realisation or the pragmatic fea-
tures of a specific interlanguage needs to draw on large quantities of data, and the
DCT is the only available data collection instrument that generates sufficiently
large corpora of comparable, systematically varied speech act data. Since DCTs
can be translated into any language and distributed to large groups of informants
within a short period of time, they are the ideal instrument for the contrastive study
of speech acts (Aston 1995: 62; Barron 2003: 85).
Although DCT responses do not fully resemble naturally occurring data, the
administrative advantages make the DCT a valuable and effective data collection
method (Johnston, Kasper and Ross 1998: 157; Billmyer and Varghese 2000: 521;
Kasper 2000: 325; Barron 2003: 85), in particular for large-scale projects (Sasaki
1998: 479). DCTs can be designed to elicit multiple occurrences of any speech act
across a variety of situations, thus documenting a wide range of semantic formulae
by which a given speech act can be implemented (Beebe and Cummings 1996:
80; Johnston, Kasper and Ross 1998: 158; Kasper 2000: 325; Barron 2003: 84).
This is particularly useful “when investigating languages which have not yet been
described pragmatically and for speech acts which have not been described in
languages which are better documented” (Bardovi-Harlig 1999: 239). Accordingly,
one of the main merits of DCT-based research is that it has generated a vast amount
of cross-linguistic data and provided insights into the pragmatics of numerous
languages and language varieties.

https://doi.org/10.1515/9783110424928-009
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 229–255. Berlin/Boston: De Gruyter Mouton.

The next section of this chapter illustrates this by providing a brief overview of
DCT studies that have been conducted in the areas of cross-cultural and interlan-
guage pragmatics. Section 3 discusses DCT design, with a focus on sociolinguis-
tic variables (3.1.) and format choice (3.2.). Section 4 reviews studies comparing
DCTs with other elicitation methods (4.1.) and naturally occurring data (4.2.), and
section 5 concludes the chapter by evaluating the role of the DCT in contrastive
pragmatic research.

2. The impact of the DCT

The largest and most influential DCT study to date, the Cross-Cultural Speech
Act Realisation Project (CCSARP), was conducted by an international team of
linguists (Blum-Kulka, House and Kasper 1989). The project examined requests
and apologies in five languages (Canadian French, Danish, German, Hebrew and
English), with the last one represented by three varieties (American, Australian
and British).
The framework developed in the CCSARP has been replicated in numerous
speech act studies, resulting in a large body of comparable data from many more
languages. Many DCT studies have followed the design of the project closely, and
focused on requests and/or apologies. This was facilitated not only by replicating
the CCSARP DCT – or a modified version thereof – but also by the availability of
a detailed coding scheme for the two speech acts developed in the project.
As a result, the DCT has introduced many under-researched languages into the
field of pragmatics, with studies analysing apologies and requests in South African
Indian English (Bharuthram 2003), requests in Korean (Byon 2006) and apologies
in Lombok Indonesian (Wouk 2006), as well as in Sudanese (Nureddeen 2008) and
Tunisian Arabic (Jebahi 2011). Most DCT-based research, however, follows the
cross-cultural design of the CCSARP, i. e. it compares different languages (mainly
contrasting them with English), thus contributing to the debate on pragmatic uni-
versality vs. culture-specificity.
Apology studies have compared English with Hungarian (Suszczyńska 1994,
1999), Polish (Suszczyńska 1999, Ogiermann 2009a), Russian (Ogiermann 2008,
2009a), the South African variety of Setswana (Kasanga and Lwanga-Lumu 2007)
and Jordanian Arabic (Bataineh and Bataineh 2008). Requests have not only been
studied across languages such as French and Dutch (Van Mulken 1996) or English,
German, Polish and Russian (Ogiermann 2009b), but have also been the subject of
study in variational pragmatics, where they have been contrasted across different
varieties of English (e. g. Barron 2008), German (Warga 2008) and Spanish (Pla-
cencia 2008) inter alia.
Apart from apologies and requests, DCTs have been used to investigate a num-
ber of other speech acts, with the most popular ones being refusals, e. g. in Korean
and American English (Kwon 2004) and Mexican Spanish and American English
(Félix-Brasdefer 2008). There are also studies of compliments, e. g. Mulo Faren-
kia’s variational study of Cameroon and Canadian French (2012), and compliment
responses, e. g. comparing Mandarin with Australian English (Tang and Zhang
2009).
Another area where DCTs have been extensively used is the field of interlan-
guage pragmatics, which is closely related to cross-cultural pragmatics, in that
interlanguage studies typically elicit three sets of data, allowing for a compari-
son between the native and the target language, as well as an examination of the
pragmatic features of the interlanguage. Apart from examining learners’ pragmatic
transfer, thus documenting their difficulties in bringing across the intended illo-
cutionary force of a given speech act, interlanguage studies using DCTs have also
examined pragmatic development, albeit almost exclusively via a cross-sectional
design (but see Barron 2003).
As with cross-cultural studies, apologies and requests are among the most
researched speech acts in interlanguage pragmatics. DCT studies have examined
apologies produced by Thai (Bergman and Kasper 1993), Jordanian (Bataineh and
Bataineh 2006), and Catalan (Sabaté i Dalmau and Curell i Gotor 2007) learners of
English. Some other studies involved a wider range of participants, such as Al-Zu-
mor’s study (2011), which examined English apologies produced by learners from
five different Arab countries. While English continues to be the most researched
target language, there are also studies of apologies offered by Americans in Rus-
sian (Shardakova 2005), Austrians in French (Warga and Schölmberger 2007) and
by English learners of Greek (Bella 2014).
Request studies have investigated the pragmatic competence of learners of English
from countries as varied as the Netherlands (Hendriks 2008), Spain/Basque
Country (Cenoz 2003), Turkey (Otcu and Zeyrek 2008), Greece (Economidou-Kogetsidis
2009), Iran (Eslami and Noora 2008), Jordan (Al-Ali and Alawneh 2010),
and Germany and Japan (Woodfield 2008). Marti (2006) examined pragmatic
transfer in Turkish requests produced by Turkish/German bilinguals and Byon
(2004) analysed American speakers’ requests in Korean. Pinto (2005) studied the
acquisition of requests of English learners of Spanish, and Bella’s work on requests
(2012a, 2012b) examines the pragmatic development of learners of Greek from a
variety of L1 backgrounds.
Barron (2003) conducted a longitudinal study of Irish speakers’ acquisition
of German, focusing on requests, offers and refusals. Interlanguage studies using
DCTs to investigate refusals have also looked at Iranian (Allami and Naeimi 2011)
and Japanese (Beebe, Takahashi and Uliss-Weltz 1990) EFL learners. The prag-
matic competence of Japanese speakers of English was also studied on the basis of
complaints (Nakabachi 1996), as was that of Korean English learners (Murphy and
Neu 1996) and learners of Hebrew (Olshtain and Weinbach 1993).1
While the above review allows only a small glimpse into the wealth of DCT
studies and the broad variety of languages they have investigated, it illustrates the
international scope of the fields of cross-cultural and interlanguage pragmatics.

3. Designing a DCT

The DCT evolved from discourse completion exercises developed by Levenston
and Blum (1978), which were designed for the study of L2 lexical acquisition. One
of the advantages of these exercises was that they enabled researchers to compare
the performance of learners and native speakers or learners at different proficiency
levels. Participants completing the exercises were instructed to fill in a blank with
one word. The provided “discourse” was designed “to restrict as far as possible the
number of acceptable alternatives” (1978: 5) and consisted of one or maximally
two sentences.
Adapting this data collection instrument to investigate speech act realisation
(Blum-Kulka 1982) involved expanding the “discourse” to provide more context
and elicit complete conversational turns. Accordingly, DCTs consist of a number
of scenarios2 (typically between 8 and 12) describing different situations to which
the participants are asked to react, e. g.:
You are on your way to work but your car won’t start. You see your neighbour get into
his. He notices you and waves, so you decide to say …

The length of the scenarios varies across studies, with longer ones providing more
contextualisation and shorter ones having the advantage of being easier to process.
DCTs with particularly detailed descriptions of the scenarios (and more space for
responses!) are bound to produce longer responses, but their length does not seem
to affect speech act realisation (Billmyer and Varghese 2000).
The DCT usually contains instructions requesting the participants to respond
spontaneously, without much thinking or to write down the first thing that comes
to mind. There are, of course, limitations to how spontaneous one can be when
instructed to respond to hypothetical situations in written form. The spontaneity
and authenticity of the responses are also likely to be affected by the length of the
scenarios and the amount of detail to be processed.

1 For an extensive list of cross-cultural and interlanguage speech act studies, including many using DCTs, see the webpage of the Center for Advanced Research on Language Acquisition: http://carla.umn.edu/speechacts/.
2 Strictly speaking, each scenario constitutes an individual Discourse Completion Task, which is perhaps why alternative terms have been proposed to refer to this data collection instrument, such as Discourse Completion Test or production questionnaire.

In general, researchers agree that completing a written task involves different
cognitive processes than speaking (Cohen and Olshtain 1994: 148). It requires
participants to “recall pragmatic information from memory and report rather than
use it” (Barron 2003: 85). One of the main arguments against DCTs has, therefore,
been that the responses do not necessarily reflect what the speakers would say if
they found themselves in the presented situations, but rather what they think they
would say (Aston 1995: 62; Schneider 2011: 18). This, however, does not neces-
sarily invalidate DCT findings, given that the aim of cross-cultural pragmatic stud-
ies is to establish general, culture-specific patterns of language use. Whether the
participants would use exactly the same expressions once they found themselves
in the described situations is not crucial as long as they regard their responses as
socially and culturally appropriate.

3.1. Incorporating sociolinguistic variables

With the focus of cross-cultural and interlanguage pragmatic studies being on prag-
malinguistics (the linguistic formulation of illocutions) as well as sociopragmatics
(their contextual variation) (Leech 1983), DCT scenarios are designed to contain
certain social variables. Correlating these variables with preferences for particular
speech act features can establish how they impact on strategy choice and politeness
marking (Barron 2003: 85; Schauer and Adolphs 2006: 131). In order to investigate
their impact on speech act realisation, the variables under study are varied system-
atically and, ideally, those not examined are kept constant across scenarios.
The contextual variables that have been examined in cross-cultural and inter-
language pragmatic studies are mainly those proposed by Brown and Levinson
([1978]1987), i. e. social distance, social power and the degree of imposition, as
well as sex and (rarely) age. Social distance and power define the relationship
between two interlocutors. In the context of a DCT, the relationship is between the
character (the hearer) described in a given scenario and the participant filling in
the DCT (the speaker).
Social distance (D) has been defined as a symmetrical variable which indicates
the degree of familiarity and frequency of interaction between two interlocutors. In
DCT studies, this variable is generally represented on three levels: strangers (high
D), acquaintances (medium D) and friends (low D). Social power (P), on the other
hand, is an asymmetrical variable indicative of the degree to which a speaker can
impose his or her will on their interlocutor. As with social distance, this allows
for three constellations, with the interlocutors being either of equal status (S=H),
the DCT character being more powerful than the participant (S<H) or vice versa
(S>H).
Accordingly, a DCT consisting of eight scenarios could contain four situations
featuring status equal interlocutors (S=H) who know each other well (low D),
resulting in interactions between friends, and four situations combining high social
distance (high D) with equal status, which is generally assumed between strangers
(see Ogiermann 2009a: 83 for further discussion).
Assigning the same sex to all characters and keeping the degree of imposition
constant across all scenarios can then render results showing the impact of social
distance on speech act realisation, while distributing hearer sex symmetrically
across the two types of scenarios (see table 1) can provide additional insights into
how sex influences strategy choice. The more scenarios per category are included
in the DCT, the more reliable the findings regarding the impact of social variables
on strategy choice.

Table 1. Distribution of social variables across scenarios

              Social Distance   Social Power   Hearer Sex
Scenario 1    [low D]           [S=H]          Male
Scenario 2    [low D]           [S=H]          Male
Scenario 3    [low D]           [S=H]          Female
Scenario 4    [low D]           [S=H]          Female
Scenario 5    [high D]          [S=H]          Male
Scenario 6    [high D]          [S=H]          Male
Scenario 7    [high D]          [S=H]          Female
Scenario 8    [high D]          [S=H]          Female
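
The distribution shown in Table 1 can also be generated programmatically, which may be convenient when further scenarios per category are added. The following Python sketch is purely illustrative and not part of the original study design: the function name `build_design` and its parameters are assumptions introduced here. It crosses social distance with hearer sex, holds social power constant at S=H, and assigns two scenarios to each combination.

```python
from itertools import product


def build_design(distances=("low D", "high D"),
                 sexes=("Male", "Female"),
                 per_cell=2,
                 power="S=H"):
    """Cross social distance with hearer sex; social power is held constant.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    rows = []
    for distance, sex in product(distances, sexes):
        for _ in range(per_cell):
            rows.append({
                "scenario": len(rows) + 1,   # running scenario number
                "distance": distance,
                "power": power,
                "hearer_sex": sex,
            })
    return rows


design = build_design()
for row in design:
    print(f"Scenario {row['scenario']}  [{row['distance']}]  "
          f"[{row['power']}]  {row['hearer_sex']}")
```

Raising `per_cell` adds scenarios to each category while keeping the distribution of hearer sex symmetrical across the two distance levels.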

While the identity of the characters and their relationship with the participant are
described in the scenarios, information about the participants, such as their age
and sex are usually among the biographic information elicited through the DCT,
along with their native language. DCTs used in interlanguage studies also contain
questions regarding the participants’ proficiency in the tested L2. Demographic
information on the participants can thus serve for further comparisons, for instance
establishing differences between responses provided by male and female partici-
pants or across different proficiency levels.
The vast majority of subjects in DCT studies are university students, and in
most studies they retain their identity when responding to the scenarios. This gen-
erally restricts the choice of scenarios to students’ everyday life, though it also
has the advantage of increasing comparability across studies. However, since the
chosen situations need to be realistic, the power constellation (S>H) is under-re-
searched; students do not often adopt powerful roles so few studies use scenarios
where the described characters are of a lower status than the participants.
Some studies require the participants to adopt a range of different roles. For
instance, in the CCSARP, the students were asked to act out the role of a profes-
sor, police officer, waiter and even characters of both sexes (Rintell and Mitchell
1989: 252). The extent to which students can actually reproduce the pragmatic
features of these people’s speech will vary, but generally they are likely to resort
to stereotypes, which reduces the authenticity of the results. Ultimately, “if social
roles were interchangeable and anybody could act like anybody else, there would
be little need for sociolinguistic research” (Ogiermann 2009a: 77–78).
Another problematic aspect of DCT design is that although experimental data
collection methods allow for controlled variation of contextual factors, in the end,
all the situations are different and include additional factors influencing strategy
choice. Clearly, the most reliable way of determining the variable responsible for
the use of particular linguistic items would be by using different versions of the
same scenario, varied by one variable only, for instance: apologising for stepping
on the foot of a female stranger, a male stranger, a female friend and a male friend.
This, however, would give away the design of the study and the responses
could easily become mechanical.
Although this could be resolved by spreading the four scenarios over as many
versions of the DCT, each administered to a parallel group of participants, this is
not practised in cross-cultural and interlanguage pragmatics. What makes such a
design problematic is that the interactions we have with our interlocutors tend to
reflect the kind of relationships we have with them. Apologies between strangers,
for instance, are generally limited to space offences; and there are things that we
would only request from people we are close with. Hence, although using the same
scenario while varying one contextual variable would increase the comparability
and reliability of the findings, it would also considerably restrict the range of situ-
ations that could be examined.
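The parallel-group design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the template wording, the function names `build_versions` and `assign_groups`, and the rotation scheme are hypothetical and are not taken from any of the studies cited here.

```python
from itertools import product

# Hypothetical template based on the foot-stepping example discussed above.
TEMPLATE = "You accidentally step on the foot of a {sex} {relation}."


def build_versions():
    """Return one scenario version per combination of social distance and hearer sex."""
    relations = ["stranger", "friend"]  # high D vs. low D
    sexes = ["female", "male"]
    return [TEMPLATE.format(sex=s, relation=r) for r, s in product(relations, sexes)]


def assign_groups(participants, versions):
    """Rotate participants through the versions so that each one sees a single version."""
    groups = {version: [] for version in versions}
    for index, participant in enumerate(participants):
        groups[versions[index % len(versions)]].append(participant)
    return groups
```

Because each participant sees only one version, the manipulated variable is not revealed to them, while the four groups remain comparable in size.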
Furthermore, a careful analysis of responses to DCT scenarios shows that the
sociolinguistic variables incorporated into them are often insufficient when it
comes to interpreting the described context and that additional factors may impact
on how participants respond as well. The impact of P and D³ can be affected by
other situational factors; interacting with one’s boss in a professional setting will
be different from talking to him or her privately. Formal settings will differ from
informal ones, private from public ones, and even third parties present during an
interaction could make a huge difference to how we express ourselves. The
participants are also likely to be guided by their interpretations of certain situations,
based on their previous experiences in similar contexts.
Ultimately, it may not always be possible to fully determine which situational
factors have brought about the use of a particular strategy or politeness formula.
Therefore, DCT scenarios need to be carefully designed and subjected to thor-
ough pilot testing before data can be collected – and analyses correlating particu-
lar sociolinguistic categories with strategy choice need to carefully examine the
responses and look beyond the incorporated factors.

³ It has also been argued that the variables of social distance or power are too broad. For
instance, it has been suggested that social distance is made up of affect/liking as well
as familiarity (Slugoski and Turnbull 1988; Brown and Gilman 1989). Interacting with
somebody whom we know well and like will clearly be different from talking to some-
body we know well but dislike.

3.2. Choosing a DCT format

While all DCTs consist of a number of scenarios in response to which the par-
ticipants are expected to produce different realisations of the speech act(s) under
investigation, the exact format of the scenarios varies across studies. DCT scenar-
ios can be open, simply presenting the participant with a situation (e. g. Beebe and
Takahashi 1989; Ogiermann 2009a), often including a prompt or an initiating line
of dialogue in direct speech (e. g. Bardovi-Harlig and Hartford 1993), or they can
be closed, i. e. providing the hearer’s response to the speech act to be elicited (e. g.
Blum-Kulka 1982, Blum-Kulka 1989). Some researchers have used longer dia-
logues, with multiple slots to be filled in by the participant (e. g. Beebe, Takahashi,
and Uliss-Weltz 1990), while others asked the participants to construct an entire
dialogue between two speakers (e. g. Barron 2003; Schneider 2008).
Hence, DCT scenarios minimally consist of a description of a particular situ-
ation, such as:
Your flatmate is about to go to the grocer’s and asks you if you need anything. You real-
ise that you have run out of toothpaste.

This scenario specifies the relationship between the speakers (flatmates: S=H,
low D), describes the situation (flatmate goes shopping, participant needs tooth-
paste), and contains an offer inviting a request, i. e. the speech act under investiga-
tion. Adding a prompt, such as “What do you say?” can provide additional guid-
ance on what is expected from the participants, e. g. reminding them that a verbal
turn is required. The addition of “What do you say to her?”, on the other hand, also
specifies the hearer’s sex.⁴

⁴ The character’s sex can also be made explicit by including a pronoun in the description
(e. g. “and she asks you”), though this is more likely to be overlooked than a pronoun at
the end of the description. Some studies have also used first names (e. g. “your flatmate
Fiona”) to mark the sex of the hearer.

DCTs used across different studies tend to differ in terms of the explicitness of
the instructions they provide. While some researchers prefer not to reveal the speech
act under study, merely instructing the participants to “react” (e. g. Ogiermann
2009a), others provide much more specific information. In Barron’s study (2003:
90), for instance, the participants were explicitly told to produce a refusal. The
rationale behind this was that Barron was interested in eliciting refusals to offers,
and not making the focus explicit may have resulted in some participants accepting
rather than refusing (see e. g. Gass and Houck 1996). Hence, more explicit instruc-
tions may be needed in studies aiming to elicit a specific type of reactive speech
act, e. g. a dispreferred rather than a preferred response.
Whether to make the aim of the study explicit or not is also often a decision
between ensuring that sufficient instances of the studied speech act are elicited vs.
keeping the data maximally authentic. Clearly, using prompts such as “How do you
apologise?” or “How would you complain?” presumes that the participants would
indeed want to apologise or complain in the described situations.
Those stressing the importance of authenticity insist that the responses should
not be unnecessarily constrained and that participants should be allowed to produce
whatever response they see fit, including a non-verbal response, as well as to opt out
(e. g. Eisenstein and Bodman 1986, see also Bonikowska 1988). Leaving it open
for participants to opt out may require asking them to provide a reason for doing
so, in order to be able to distinguish genuine instances of opting out from scenarios
being left blank for other reasons. This information can generate valuable
metapragmatic data, allowing additional insights into participants’ politeness norms.
Guidance on how to respond can also be provided indirectly, by embedding the
turn to be elicited in a dialogue. The use of direct speech following the scenario
not only clarifies what is required but also considerably reduces the risk of
participants describing what they would say or do instead. In
studies of rejections of advice (Bardovi-Harlig and Hartford 1993; Bardovi-Harlig
1999), for instance, the advice to be rejected was given in the form of an initiating
turn by the rejection recipient:

Your advisor suggests that you take a course which you would rather not take because
you think that it will be too difficult for you.
Advisor: If you are registered in our program you must take Syntax.
You say:

(Bardovi-Harlig 1999: 242)

The inclusion of conversational turns preceding the turn to be elicited helps prompt
the targeted reactive speech act, but it may not be feasible if the speech act under
study is an initiating one. On the other hand, not all reactive speech acts require
a verbal first pair part (FPP). Apologies, for instance, may but do not have to be
preceded by a (verbal) complaint. The complaint becomes superfluous when both
parties are aware of the offence and the offender recognises the need for an apology.
More importantly, in many situations, a complaint would not only sound unnatural
but may even make the offender less inclined to apologise (see Owen 1983: 51).
In the “classic” DCT used in the CCSARP, the scenarios were constrained even
more, as they were followed by an initiating and a closing line of dialogue (also
referred to as a rejoinder):
A student has borrowed a book from her teacher, which she promised to return today.
When meeting her teacher, however, she realizes that she forgot to bring it along.
Teacher: Miriam, I hope you brought the book I lent you.
Miriam:
Teacher: OK, but please remember it next week.
(Blum-Kulka, House and Kasper 1989: 14)
Since the final turn expresses agreement, indirectly accepting the apology to be
elicited, it does not allow the participant to opt out or produce a different speech
act. When the apology has been accepted “it seems logical that the speaker has
previously offered an apology and/or assumed responsibility for the offense” (Rose
1992: 53). Hence, a design like the one used in the CCSARP can produce findings
on how people apologise in different languages but not whether they do or do not
apologise in comparable situations.
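The formats discussed so far differ only in which turns surround the slot to be filled. A minimal sketch of that difference, assuming a hypothetical `DCTItem` class and a blank marked by underscores (neither comes from the cited studies), might look as follows:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DCTItem:
    """One DCT scenario; the optional turns determine the format (illustrative only)."""
    scenario: str
    initiating_turn: Optional[str] = None  # first pair part, e.g. "Advisor: ..."
    rejoinder: Optional[str] = None        # closing line that follows the blank
    respondent: str = "You"

    def render(self) -> str:
        """Lay out the scenario, any initiating turn, the blank slot, and any rejoinder."""
        lines = [self.scenario]
        if self.initiating_turn:
            lines.append(self.initiating_turn)
        lines.append(f"{self.respondent}: ____")
        if self.rejoinder:
            lines.append(self.rejoinder)
        return "\n".join(lines)
```

An item with neither optional turn corresponds to the open format, one with only an initiating turn to the format used for rejections of advice, and one with both to the “classic” CCSARP format.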
Some DCT studies have expanded the dialogue even further, by including sev-
eral turns requiring the respondents to provide two answers. This design is more
likely to be used for the elicitation of speech acts that tend to evolve over sev-
eral turns. Invitations or offers, for instance, when rejected, may be reiterated to
provide the hearer with another opportunity to accept. The DCT used in Beebe,
Takahashi and Uliss-Weltz’s study of offer refusals (1990) consisted
of a four-turn dialogue, with two offers and two slots made available for refusals:
You are at a friend’s house for lunch.
Friend: How about another piece of cake?
You:

Friend: Come on, just a little piece?

You:

(Beebe, Takahashi and Uliss-Weltz 1990: 71)

This design requires the participants to produce at least one refusal, not leaving
them the choice to accept in the first turn. While the second turn could result in
acceptance of the offer, a study focusing on refusals is likely to explicitly instruct
the participants to refuse the second offer as well.
While providing an extended dialogue such as the one above acknowledges the inter-
active character of speech acts such as refusals, this design does not necessarily
allow for a cross-linguistic comparison of their sequential organisation. Previous
research has shown that the number of turns involved in accepting an offer is highly
culture-specific and can range from prompt acceptance, e. g. in north European
contexts, to extended rituals of rejecting and re-offering, in particular in Ara-
bic-speaking contexts (see e. g. Grainger, Kerkam, Mansor and Mills 2015). A DCT
scenario with two offer turns, therefore, while having the advantage of eliciting
two instances of refusing, is unable to capture the most salient culture-specific fea-
ture of offer-refusal sequences, namely their length and the amount of negotiation
required to make an acceptance acceptable in a particular socio-cultural context.
The constraints imposed by the format used by Beebe, Takahashi and Uliss-
Weltz have led Barron (2003) to develop an alternative format, the so-called Free
Discourse Completion Task (FDCT), which “requires respondents to write both
sides of an open role play” (2003: 90). In Barron’s study both offers and refusals
were elicited by providing a blank space of eight centimetres and asking the partic-
ipants to write as much as they deemed necessary (within the space provided). This
format then captures the sequential organisation of offers and refusals as it requires
the participants “to interact with an imaginary interlocutor until an appropriate
compromise is found” (2003: 91).
Schneider (2008, 2011) adopts a similar approach in that his Dialogue Pro-
duction Task (DPT) requires respondents to adopt the role of both interactants.
His work lies within variational pragmatics and the DPT has been employed to
compare the ways in which Irish, English and American speakers engage in small
talk when meeting a stranger at a party. One of the examples he provides runs as
follows:

1 A: This party is real cool, don’t you think?
2 B: Yeah, it rocks!
3 A: What’s your name?
4 B: I’m called Joan, what’s yours?
5 A: I’m Dorothy, but you can call me Dotty.
6 B: Anyway I’ll maybe see you later.
7 A: Bye.
(Schneider 2008: 108)

This design, as well as the choice of a longer, more flexible and yet highly recur-
rent interactional unit, allows Schneider to demonstrate that while speakers of all
varieties of English resort to the same range of moves, there are systematic differ-
ences in the order in which they appear.
However, while the DPT has the advantage of capturing the sequential proper-
ties of speech acts and eliciting schematic knowledge about entire speech events, it
moves away from the concept of a discourse completion task. As Schneider (2011)
himself states, the creation of dialogues is comparable to (and requires the skills
necessary for!) playwriting. It seems, therefore, that the high language proficiency
required to perform such tasks makes this instrument unsuitable for most interlan-
guage pragmatic studies.
And while the DPT comes closer to capturing the ways in which naturally
occurring conversations evolve than does a DCT, it requires imagining several
turns in advance, while turns in naturally occurring conversations evolve locally,
with speakers re-assessing the context at every turn and adjusting their responses
accordingly.
The above overview has illustrated that the different DCT formats that have
been used in cross-cultural, interlanguage and variational pragmatics reflect the
needs of the particular studies employing them. The choice of the most suitable
format will depend on the type of speech act under investigation: whether it is an
initiating (e. g. request) or a reactive speech act (e. g. refusal), whether it is for-
mulaic (e. g. thanking) or involves a wide range of formulations (e. g. complaint),
and whether it is generally performed in one turn (e. g. apology) or likely to be
negotiated over several turns (e. g. offer-refusal sequences).
Those who place emphasis on eliciting spontaneous, maximally authentic
responses will prefer vague instructions asking for people’s reactions, whatever
they are. They will also prefer open-ended scenarios over closed ones, given that
closing turns create an artificial setting which provides responses to turns that have
not yet been produced. They are also more likely to require the informants to react
to the scenarios as they would, rather than adopting different roles, so as to elicit
responses reflecting their politeness norms.
However, while this flexibility helps increase the authenticity of the data, it
inadvertently reduces its comparability. Among the elicited responses, there may
be other speech acts, instances of description of non-verbal behaviour and opting
out. Keeping the instructions explicit and restricting the respondents’ choices, on
the other hand, not only produces more instances of the desired speech act, but has
also been shown to facilitate the task for learners (e. g. Bardovi-Harlig and Hart-
ford 1993). Hence, more structured DCTs may be the better option for interlan-
guage pragmatic studies. In fact, there is an extensive pool of literature comparing
different types of DCTs and DCTs with other data elicitation instruments, which
shows that non-native speakers’ responses tend to be more affected by the different
elicitation methods than native ones.

4. Methodological comparisons

4.1. Studies comparing different types of DCTs and DCTs with other elicitation methods
Research revealing that the different DCT formats used in cross-cultural and inter-
language pragmatics affect the findings has triggered an abundance of publications
comparing different DCT formats, as well as DCTs with other elicitation methods,
such as oral role plays or multiple choice questionnaires.
Overall, the studies report a similar use of speech act strategies and mitiga-
tion across the methods, though differences have been found in length (with open
formats generally eliciting longer responses), level of directness and the range of
strategies. Comparisons of different DCT formats include Rose’s (1992) study,
which compares requests elicited with an open DCT with those elicited by means
of a DCT with a hearer response, and Bardovi-Harlig and Hartford’s study (1993),
which compares DCTs with and without an initiating line of dialogue used to elicit
rejections of advice.
While Rose found that both formats elicited very similar results (in terms of
the choice of strategies and level of directness), the only difference being that the
open format produced longer responses, the differences established by Bardovi-
Harlig and Hartford were more striking. The DCT with an initiating turn of dia-
logue not only elicited longer responses but also elicited responses containing more
oral features – and the initiating turn seemed to facilitate the task for non-native
participants (see also Rintell and Mitchell 1989, and Johnston, Kasper and Ross 1998). A
crucial difference between these two studies, however, is that the former examines
an initiating speech act, where the provided second pair part (SPP) confirms that
it has been successful, whereas the latter looks at a reactive speech act, where the
provision of the FPP helps contextualise the refusal to be elicited.
The impact of this difference has also been confirmed by Johnston, Kasper
and Ross (1998), who compared the realisations of complaints, requests and apol-
ogies in three different DCT formats: open-ended, with a preferred SPP, and with a
dispreferred SPP. Not surprisingly, they found that apologies “were most strongly
affected by rejoinder type” (1998: 170), with a dispreferred uptake eliciting
responses downgrading the offence.
Rose’s later study (1994) compared Japanese speakers’ use and perception of
requests, using open-ended DCTs and multiple choice questionnaires (MCQ). It
showed that the DCT responses were more direct than the MCQ responses, where
the participants chose opting out and hinting more often. This led Rose to suggest
that the DCT may not be suitable for studying speech acts in non-Western contexts.
The results were confirmed in a follow-up study (Rose and Ono 1995), which also
showed a reverse trend for speakers of American English, who were less direct on
the DCT and more direct on the MCQ.
Hinkel (1997) conducted a similar comparison between advice elicited via
DCTs and MCQs from American native speakers and Chinese speakers of Eng-
lish. Her results, however, are diametrically opposed to those established by Rose
and Ono as she found her non-native speakers to be more direct on the MCQ than
on the DCT, and the native speakers to be more direct on the DCT than on the MCQ.
Comparisons between DCTs and oral role plays (Rintell and Mitchell 1989;
Sasaki 1998; Yuan 2001; Félix-Brasdefer 2008, this volume), on the other hand,
all show that both instruments elicit similar expressions, but that oral responses
tend to be longer and to contain a wider range of speech act strategies. Not surpris-
ingly, oral role play responses have also been found to contain more features of
spoken language. Written requests have been found to be more direct (Rintell and
Mitchell 1989) while written refusals turned out to be more polite than oral ones
(Félix-Brasdefer 2008, this volume).
On the whole, these methodological studies confirm that the choice and design
of a DCT need to be adjusted to both the speech act and the groups of speakers
under study. While this research has shown that written DCT responses are overall
very similar to their oral counterparts, the comparisons with MCQs need to be
treated with caution, given that MCQs test the perception and not production of
speech acts.

4.2. Studies comparing DCTs with naturally occurring data

Cross-cultural and interlanguage studies based on DCT data work on the assump-
tion that DCTs elicit spoken language “indirectly through the written mode” (Sasaki
1998: 458); and while it is simply not possible for elicited, written responses to fully
resemble naturally occurring talk, it has been shown that DCT data “accurately
reflect the content expressed in natural speech” (Beebe and Cummings 1996: 75).
While there is no doubt that language use is best studied by analysing actual
speech, it is also evident that the large quantities of comparable speech act data
that can be obtained by means of a DCT could never be derived from recordings
of naturally occurring data. It has been argued that “with exception of highly rou-
tinised and standardized speech events, sufficient instances of cross-linguistically
and cross-culturally comparable data are difficult to collect through observation of
authentic conversation” (Kasper and Dahl 1991: 245).
Studies comparing DCT responses with naturally occurring data are different
from the methodological comparisons discussed above, since they are contrasting
two types of data typically used in different disciplines and for different purposes.
Most of these studies build on the authors’ previous research based on naturally
occurring data. Collecting some additional DCT data related to the original project
enables the researchers to conduct a methodological comparison. These compar-
isons tend to focus on features of natural data that are missing in the DCT data,
thus illustrating the shortcomings of DCTs and their limited potential to represent
naturally occurring conversations.
Although the main strength of the DCT is the amount of contextually varied
data it can generate, these studies use relatively low numbers of participants and
most of them only one DCT scenario in their comparisons. Hartford and Bar-
dovi-Harlig (1992), for instance, compared rejections produced during 39 aca-
demic advising sessions with rejections elicited via a DCT, which was distributed
to 24 participants (13 native and 11 non-native speakers). Golato (2003), on the
other hand, used the naturally occurring compliment sequences collected for her
PhD thesis (2005) to design a DCT, allowing her to compare DCT compliment
responses to spoken ones. The 50 tokens of compliment responses identified in
31 hours of recordings were contrasted with 20 DCTs.
Beebe and Cummings’s study (1996) compared request refusals produced dur-
ing eleven phone calls to an equal number of DCT responses. Similarly, Maíz-
Arévalo (2015) collected disagreements from students engaging in an online group
work assignment and derived one DCT scenario from this data. The 10 participants
who responded to it produced 15 instances of disagreements.
While other researchers involved higher numbers of participants, they still
asked them to respond to only one scenario taken over from their naturally occur-
ring data. Bou Franch and Lorenzo-Dus (2008), for instance, collected 60 student
email requests directed at lecturers (30 in Spanish and 30 in English) and picked
one of the recurrent requests to create a DCT scenario to which 58 Spanish
and 58 British speakers then responded. Similarly, Economidou-Kogetsidis (2013) used
requests for information received by a flight reservation centre to construct a DCT
scenario which was then distributed to 86 people.
The most comprehensive study comparing relatively large amounts of DCT
data to other types of speech act data is Turnbull’s (2001) methodological com-
parison of request refusals derived from both written and oral DCTs, role plays,
experiments, and naturally-occurring data. While the naturally occurring refusals
were produced during 113 phone calls, the DCTs were distributed to 80 students.
The telephone numbers used for the phone calls were provided by research assis-
tants who obtained them from students who had expressed a general interest in par-
ticipating in an experiment. The students whose refusals were used in Turnbull’s
study were, therefore, strictly speaking not aware of the study they were taking
part in – and they were only informed retrospectively that they had been recorded.
Turnbull advocates the use of pragmatic elicitation techniques that generate
data “in situations in which researchers can manipulate variables in the testing of
hypotheses and speakers can talk freely and spontaneously without awareness that
their talk is the object of study” (2001: 31). However, while his phone call data
come close to fulfilling all these criteria, the procedure employed was not fully
ethical, and while it has worked in the context of request refusals, it is difficult to
see how it could be used to elicit other speech acts.
On the whole, the studies discussed above have confirmed that DCTs and natu-
rally occurring data contain similar semantic formulae (e. g. Eisenstein and Bodman
1993; Beebe and Cummings 1996; Economidou-Kogetsidis 2013). DCT responses
were found to be longer (Golato 2005) or shorter (Beebe and Cummings 1996),
depending on the speech act under study. In some studies they were more formulaic
(Golato 2005; Maíz-Arévalo 2015), in others more direct and less polite (Hartford
and Bardovi-Harlig 1992), and in yet others the two types of data were similar in
terms of directness and lexical modification (Economidou-Kogetsidis 2013).
In comparison to email messages, DCT requests were described as bare (Bou
Franch and Lorenzo-Dus 2008: 261) because they lacked the opening and closing
sequences found in emails; though this was perhaps to be expected since the DCT
scenario did not request the respondents to write an email, instead eliciting face-to-
face requests. Some researchers found a smaller range of linguistic expressions in
the DCT data (e. g. Hartford and Bardovi-Harlig 1992; Maíz-Arévalo 2015). How-
ever, since the numbers of DCT responses collected in these studies were rather
low and the scenarios chosen for the DCT represented only a subset of the contexts
found in the natural data, it is not surprising that the DCT responses contained a
narrower range of linguistic formulae.
The main shortcoming repeatedly reported in relation to DCT data is that they
lack the interactional and prosodic features found in naturally occurring conver-
sations. Admittedly, written data cannot convey prosodic (e. g. pitch, intonation)
or kinesic (e. g. gesture, facial expressions, posture) features, which can be crucial
to the interpretation of the responses. It has been argued that only when working
with video-recorded data “every element of the interaction (hesitation, laughter,
silences, eye-contact, and body-movements) may be incorporated in the analysis”
(Golato 2003: 111).
The type of analysis described by Golato is conducted predominantly in the
discipline of Conversation Analysis, which takes a qualitative approach and exam-
ines relatively small amounts of data in great detail. Cross-cultural pragmatics, on
the other hand, takes a quantitative approach and analyses large amounts of data in
search of general patterns.
Likewise, that a written data collection method designed to elicit one-turn
responses lacks interactive features (but see the DPT) should not come as a surprise.
Hartford and Bardovi-Harlig’s comparison of recordings of advising sessions and
DCT data on rejections has led them to conclude that DCTs do not “promote the
turn-taking and negotiation strategies found in natural conversations” (1992: 47).
DCTs have been declared to “obscure the sequential and co-constructed nature of
talk” (Turnbull 2001: 35) and to be inappropriate for studies of “interactional rules
and patterns of actual language use” (Golato 2003: 110).
Cross-cultural and interlanguage pragmatic studies, however, do not study
interactional rules. Speech act studies, even if they are based on interactional data,
“isolate the focal speech act from its interactional environment, submit its linguis-
tic design to scrutiny, and relate the identified meaning and form conventions to
discourse-external context factors” (Kasper 2004: 125).
What also needs to be considered is that speech acts differ in the extent to
which they are likely to be performed over several conversational turns, which
makes the DCT more suitable for studying some speech acts than others. Refusals,
for instance, have been shown to consist of “multi-turn responses involving nego-
tiation, hedging and even reversal” (Houck and Gass 1996: 47). Compliments, in
contrast, are “most frequently packaged as single-turn utterances with a simple,
short, highly formulaic structure” (Kasper 2000: 319), and apologies “constitute a
complete segment of a speech event” (Coulmas 1981: 86).
As the above discussion has shown, research comparing DCTs with naturally
occurring data tends to be biased towards the latter by stressing the disadvantages
of DCTs and leaving their strengths unmentioned. What is generally taken for
granted is that the DCT has been developed to generate large amounts of compa-
rable data allowing for generalisations about speech act realisation patterns across
groups – something that could not be accomplished with the naturally occurring
data discussed.
While DCTs can elicit any speech act across a wide range of contexts, the
frequency and predictability of speech acts in naturally occurring talk vary greatly, which is
why speech act studies based on recordings of authentic conversations tend to be
restricted to a particular situation in which the speech act under investigation is
likely to recur. Aston’s contrastive study of thanking in English and Italian (1995),
for instance, is based on data collected during service encounters, with his insights
into the speech act of thanking being restricted to this very specific setting. He
admits that because of “their lack of situational variation” recordings of natural
conversations appear “excessively restricted and routine” (Aston 1995: 64) in com-
parison to experimentally elicited data.
While CA studies examine the sequential organisation of talk, including that
of speech acts, as, for instance, Robinson’s (2004) work on apologies, the focus
has overwhelmingly been on the structural properties of “responses” to speech
acts, such as compliment responses (Pomerantz 1978) or agreements and disa-
greements with assessments (Pomerantz 1984). An interest in the linguistic forms
used to implement speech acts only developed in the late 2000s (with the notable
exception of Wootton 1981, 1997), which saw the publication of numerous CA
studies on requests in both institutional and everyday settings. The fact that the
vast majority of these studies focus on requests reflects the ubiquitous and recur-
rent nature of this speech act. The available CA research covers a wide range of
languages, such as Swedish (Lindström 2005), Danish (Heinemann 2006), British
English (Curl and Drew 2008; Craven and Potter 2010; Antaki and Kent 2012),
American English (Mandelbaum 2014), Italian (Rossi 2012) and Polish (Zinken
and Ogiermann 2013). While most of them contrast the use of two request forms
in the analysed settings, cross-linguistic CA speech act studies are exceedingly rare
(but see Zinken and Ogiermann 2013).
What does seem to emerge from these studies, however, is that in comparison
to research on requests conducted in cross-cultural pragmatics, the requests ana-
lysed in CA studies exhibit an overall higher level of directness than the requests
elicited by means of DCTs, which show a very strong preference for conventional
indirectness across all languages examined. This is, however, likely to be related to
the types of requests examined in the two disciplines, with many of the CA request
studies looking at low imposition requests for immediate actions, such as requests
for objects to be passed at the dinner table, or produced during collaborative activ-
ities where the outcome benefits the speaker and the hearer alike. DCT scenarios,
on the other hand, almost exclusively depict requests solely benefiting the speaker
and requesting favours that lie in the future.

5. Discussion and conclusions

As the above discussion has shown, the DCT has not only been extensively applied
to the study of a wide range of speech acts in numerous languages, it has also been
subject to scrutiny, variation, comparison with other methods, and ample criticism.
The comparisons between different data elicitation methods are largely incon-
clusive, with the results varying according to the speech act examined as well as
the participants’ linguistic backgrounds and proficiency levels. There does seem to
be a general agreement, however, that DCT responses do contain a similar range
of linguistic expressions to those found in other types of data. With the focus in
cross-cultural and interlanguage pragmatics being on patterns of speech act real-
isation, the ability to elicit such realisations is the main criterion in choosing a
data collection method. The DCT not only provides this, but also fulfils these
disciplines’ requirement for large amounts of contextually varied and comparative
data – as no other data collection instrument does.
Even though DCT responses may differ from actual language performance,
they represent “a participant’s accumulated experience within a given setting”
(Golato 2003: 92), and it has been argued that “it is precisely this more stereotyped
aspect of speech behavior that we need for cross-cultural comparability” (Blum-
Kulka 1989: 13). It is by “abstracting away the uncontrollable accidentalities and
often inaccessible idiosyncrasies of actual performance” (Schneider 2011: 30) that
the data become maximally comparable. Importantly, cross-cultural and interlanguage
pragmatic studies do not examine the prosodic, non-verbal or sequential
properties of speech acts, and research that does would never use DCT data.
What has perhaps negatively affected these two fields of enquiry is the perceived
ease with which DCT data can be collected and analysed, resulting in a large body
of “quick” studies which often do not go beyond quantifying and comparing speech
act strategies. Designing a robust DCT is a laborious and time-consuming process.
In order to generate valid and reliable findings, the construction process should start
with observations of real-life interactions (see e. g. Eisenstein and Bodman 1993),
with the researcher also ensuring that the observed situations are likely to occur in
all languages examined, and continue with extensive pilot testing to confirm that
the incorporated variables have the desired impact.
The potential of the DCT to assemble large corpora of speech act data should be
fully exploited, so that the results are indeed representative and generalisable. The
quantitative analysis should ideally be backed up by statistical testing (see Ogier-
mann and Saßenroth (2012) for an overview of statistical tests used in contrastive
pragmatics), and complemented with qualitative analysis and careful interpretation
of the findings within the relevant theoretical framework.
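
As a minimal illustration of the kind of statistical backing mentioned above, the following sketch computes Pearson's chi-square statistic for a contingency table of strategy frequencies. The counts, the group labels and the function name chi_square are hypothetical and invented purely for illustration; they are not taken from any study cited here, and the choice of test would in practice depend on the design and the data.

```python
# Hypothetical counts, invented purely for illustration (not from any study):
# rows = language groups, columns = request strategy
# (conventionally indirect, direct).
observed = [
    [80, 20],  # hypothetical language A
    [60, 40],  # hypothetical language B
]


def chi_square(table):
    """Return Pearson's chi-square statistic and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected frequency if strategy choice were independent of group
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


stat, dof = chi_square(observed)
CRITICAL_05_DF1 = 3.841  # critical value for p = .05 with 1 degree of freedom
print(f"chi2 = {stat:.2f}, df = {dof}, exceeds critical value: {stat > CRITICAL_05_DF1}")
```

A statistic above the critical value would suggest that the distribution of strategies differs between the two groups; what that difference means still has to be established through qualitative analysis and interpretation.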
What complicates things is that the theoretical frameworks underlying cross-cul-
tural and interlanguage pragmatic studies have also met with ample criticism over
the last few decades. Speech act theory as well as politeness theory have both been
criticised for equating linguistic expressions with functions and overemphasising
the role of the speaker. Separating the analysed speech acts from their sequential
context (or placing them in a reduced context created within a DCT) means that
the analysis cannot take into account the hearers’ uptake, thus relying solely on
the linguistic content produced by the speaker. While this is untenable from a CA
perspective, where meaning is validated by the following turn, recent politeness
research has also moved away from equating linguistic structures with politeness
(e. g. Watts 2003). Politeness is increasingly viewed as something that is co-con-
structed and negotiable, with the focus shifting towards the hearer’s evaluations of
im/politeness. However, despite all the criticism directed at Brown and Levinson’s
theory and cross-cultural speech act research in recent years, no new framework
suitable for a cross-cultural comparison has been proposed thus far.
Ultimately, one could argue that if hundreds of speakers agree on using a
particular speech act formulation in a particular context, this formulation is likely
to be perceived as appropriate by these and other speakers of a language. And if
hundreds of speakers of another language prefer a different strategy in the same
context, then cross-cultural pragmatic differences have been established. The DCT
cannot capture all aspects of spoken language, but it does provide valuable data on
some of them. As long as we are aware of what it can and cannot provide, and of other
methods that enable us to analyse other aspects of interaction, and as long as those
methods cannot provide us with large amounts of contextually varied, comparable
data, the DCT has its place in pragmatic research.

References

Al-Ali, Mohammed Nahar and Rami Alawneh
2010 Linguistic mitigating devices in American and Jordanian students’ requests.
Intercultural Pragmatics 7: 311–339.
Al-Zumor, Abdul Wahed Qasem Ghaleb
2011 Apologies in Arabic and English: An inter-language and cross-cultural study.
Journal of King Saud University – Languages and Translation 23: 19–28.
Allami, Hamid and Amin Naeimi
2011 A cross-linguistic study of refusals: An analysis of pragmatic competence
development in Iranian EFL learners. Journal of Pragmatics 43: 385–406.
Antaki, Charles and Alexandra Kent
2012 Telling people what to do (and, sometimes, why): Contingency, entitlement
and explanation in staff requests to adults with intellectual impairments. Jour-
nal of Pragmatics 44: 876–889.
Aston, Guy
1995 Say “Thank you”. Some pragmatic constraints in conversational closings.
Applied Linguistics 16: 57–86.
Bardovi-Harlig, Kathleen
1999 Researching method. In: Lawrence F. Bouton (ed.), Pragmatics and Language
Learning 9, 237–264. Urbana, IL: University of Illinois at Urbana-Champaign.
Bardovi-Harlig, Kathleen and Beverly S. Hartford
1993 Refining the DCT: Comparing open questionnaires and dialogue completion
tasks. In: Lawrence F. Bouton and Yamuna Kachru (eds.), Pragmatics and
Language Learning 4, 143–165. Urbana, IL: University of Illinois at Urba-
na-Champaign.
Barron, Anne
2003 Acquisition in Interlanguage Pragmatics: Learning How to Do Things with
Words in a Study Abroad Context. Amsterdam: John Benjamins.
Barron, Anne
2008 Contrasting requests in Inner Circle Englishes: A study in variational prag-
matics. In: Martin Pütz and JoAnne Neff-van Aertselaer (eds.), Developing
Contrastive Pragmatics. Interlanguage and Cross-Cultural Perspectives,
355–402. Berlin/New York: Mouton de Gruyter.
Bataineh, Ruba Fahmi and Rula Fahmi Bataineh
2006 Apology strategies of Jordanian EFL university students. Journal of Pragmat-
ics 38: 1901–1927.
Bataineh, Rula Fahmi and Ruba Fahmi Bataineh
2008 A cross-cultural comparison of apologies by native speakers of American Eng-
lish and Jordanian Arabic. Journal of Pragmatics 40: 792–821.
Beebe, Leslie M. and Martha Clark Cummings
1996 Natural speech data versus written questionnaire data: How data collection
method affects speech act performance. In: Susan M. Gass and Joyce Neu
(eds.), Speech Acts Across Cultures: Challenges to Communication in a Sec-
ond Language, 65–86. Berlin/New York: Mouton de Gruyter.
Beebe, Leslie M., Tomoko Takahashi and Robin Uliss-Weltz
1990 Pragmatic transfer in ESL refusals. In: Robin Scarcella, Elaine Andersen and
Stephen D. Krashen (eds.), Developing Communicative Competence in a Sec-
ond Language, 55–73. New York: Newbury House.
Beebe, Leslie M. and Tomoko Takahashi
1989 “Do you have a bag?” Social status and patterned variation in second language
acquisition. In: Susan M. Gass, Carolyn G. Madden, Dennis Preston and Larry
Selinker (eds.), Variation in Second Language Acquisition: Discourse and
Pragmatics, 103–128. Clevedon: Multilingual Matters.
Bella, Spyridoula
2012a Length of residence and intensity of interaction: Modification in Greek L2
requests. Pragmatics 22: 1–39.
Bella, Spyridoula
2012b Pragmatic development in a foreign language: A study of Greek FL requests.
Journal of Pragmatics 44: 1917–1947.
Bella, Spyridoula
2014 A contrastive study of apologies performed by Greek native speakers and Eng-
lish learners of Greek as a foreign language. Pragmatics 24: 679–713.
Bergman, Marc L. and Gabriele Kasper
1993 Perception and performance in native and nonnative apology. In: Gabriele
Kasper and Shoshana Blum-Kulka (eds.), Interlanguage Pragmatics, 82–107.
New York: Oxford University Press.
Bharuthram, Sharita
2003 Politeness phenomena in the Hindu sector of the South African Indian English
speaking community. Journal of Pragmatics 35: 1523–1544.
Billmyer, Kristine and Manka Varghese
2000 Investigating instrument-based pragmatic variability: Effects of enhancing
discourse completion tests. Applied Linguistics 21: 517–552.
Blum-Kulka, Shoshana
1982 Learning how to say what you mean in a second language: A study of speech
act performance of learners of Hebrew as a second language. Applied Linguis-
tics 3: 29–59.
Blum-Kulka, Shoshana, Juliane House and Gabriele Kasper (eds.)
1989 Cross-cultural Pragmatics: Requests and Apologies. Norwood, NJ: Ablex.
Blum-Kulka, Shoshana, Juliane House and Gabriele Kasper
1989 Investigating cross-cultural pragmatics: An introductory overview. In: Shoshana
Blum-Kulka, Juliane House and Gabriele Kasper (eds.), Cross-cultural Pragmatics:
Requests and Apologies, 1–36. Norwood, NJ: Ablex.
Bonikowska, Małgorzata
1988 The choice of opting out. Applied Linguistics 9: 169–181.
Bou Franch, Patricia and Nuria Lorenzo-Dus
2008 Natural versus elicited data in cross-cultural speech act realization: The case
of requests in Peninsular Spanish and British English. Spanish in Context 5:
246–277.
Brown, Penelope and Stephen Levinson
[1978] 1987 Politeness. Some Universals in Language Usage. Cambridge: Cambridge
University Press.
Brown, Roger and Albert Gilman
1989 Politeness theory and Shakespeare’s four major tragedies. Language in Society
18: 159–212.
Byon, Andrew Sangpil
2004 Sociopragmatic analysis of Korean requests: Pedagogical settings. Journal of
Pragmatics 36: 1673–1704.
Byon, Andrew Sangpil
2006 The role of linguistic indirectness and honorifics in achieving linguistic polite-
ness in Korean requests. Journal of Politeness Research 2: 247–276.
Cenoz, Jasone
2003 The intercultural style hypothesis: L1 and L2 interaction in requesting behav-
iour. In: Vivian Cook (ed.), Effects of the Second Language on the First,
62–80. Clevedon: Multilingual Matters.
Craven, Alexandra and Jonathan Potter
2010 Directives: Entitlement and contingency in action. Discourse Studies 12: 419–
442.
Cohen, Andrew D. and Elite Olshtain
1994 Researching the production of second-language speech acts. In: Elaine E. Tar-
one, Susan M. Gass and Andrew D. Cohen (eds.), Research Methodology in
Second-Language Acquisition, 143–156. Hillsdale, NJ: Lawrence Erlbaum
Associates.
Coulmas, Florian
1981 “Poison to your soul”. Thanks and apologies contrastively viewed. In: Florian
Coulmas (ed.), Conversational Routine: Explorations in Standardized Com-
munication Situations and Prepatterned Speech, 69–91. The Hague: Mouton.
Curl, Traci and Paul Drew
2008 Contingency and action: A comparison of two forms of requesting. Research
on Language and Social Interaction 41: 129–153.
Economidou-Kogetsidis, Maria
2009 Interlanguage request modification: The use of lexical/phrasal downgraders
and mitigating supportive moves. Multilingua 28: 79–112.
Economidou-Kogetsidis, Maria
2013 Strategies, modification and perspective in native speakers’ requests: A com-
parison of WDCT and naturally occurring requests. Journal of Pragmatics 53:
21–38.
Eisenstein, Miriam and Jean Bodman
1986 “I very appreciate”: Expressions of gratitude by native and non-native speakers
of American English. Applied Linguistics 7: 167–185.
Eisenstein, Miriam and Jean Bodman
1993 Expressing gratitude in American English. In: Gabriele Kasper and Shoshana
Blum-Kulka (eds.), Interlanguage Pragmatics, 64–81. New York: Oxford
University Press.
Eslami, Zohreh R. and Aazam Noora
2008 Perceived pragmatic transferability of L1 request strategies by Persian learners
of English. In: Martin Pütz and JoAnne Neff-van Aertselaer (eds.), Develop-
ing Contrastive Pragmatics. Interlanguage and Cross-Cultural Perspectives,
301–334. Berlin/New York: Mouton de Gruyter.
Félix-Brasdefer, J. César
2008 Politeness in Mexico and the United States: A Contrastive Study of the Reali-
zation and Perception of Refusals. Amsterdam: John Benjamins.
Golato, Andrea
2003 Studying compliment responses: A comparison of DCTs and recordings of nat-
urally occurring talk. Applied Linguistics 24: 90–121.
Golato, Andrea
2005 Compliments and Compliment Responses. Grammatical Structure and Sequen-
tial Organisation. Amsterdam: John Benjamins.
Grainger, Karen, Zainab Kerkam, Fathia Mansor and Sara Mills
2015 Offering and hospitality in Arabic and English. Journal of Politeness Research
11: 41–70.
Hartford, Beverly S. and Kathleen Bardovi-Harlig
1992 Experimental and observational data in the study of interlanguage pragmatics.
In: Lawrence F. Bouton and Yamuna Kachru (eds.), Pragmatics and Language
Learning 3, 33–52. Urbana, IL: University of Illinois at Urbana-Champaign.
Heinemann, Trine
2006 “Will you or can’t you?” Displaying entitlement in interrogative requests.
Journal of Pragmatics 38: 1081–1104.
Hendriks, Berna
2008 Dutch English requests: A study of request performance by Dutch learners
of English. In: Martin Pütz and JoAnne Neff-van Aertselaer (eds.), Develop-
ing Contrastive Pragmatics. Interlanguage and Cross-Cultural Perspectives,
335–354. Berlin/New York: Mouton de Gruyter.
Hinkel, Eli
1997 Appropriateness of advice: DCT and multiple choice data. Applied Linguistics
18: 1–26.
Houck, Noel and Susan M. Gass
1996 Non-native refusals: A methodological perspective. In: Susan M. Gass and
Joyce Neu (eds.), Speech Acts Across Cultures: Challenges to Communication
in a Second Language, 45–64. Berlin/New York: Mouton de Gruyter.
Jebahi, Khaled
2011 Tunisian university students’ choice of apology strategies in a discourse com-
pletion task. Journal of Pragmatics 43: 648–662.
Johnston, Bill, Gabriele Kasper and Steven Ross
1998 Effect of rejoinders in production questionnaires. Applied Linguistics 19:
157–182.
Kasanga, Luanga A. and Joy-Christine Lwanga-Lumu
2007 Cross-cultural linguistic realization of politeness: A study of apologies in Eng-
lish and Setswana. Journal of Politeness Research 3: 65–92.
Kasper, Gabriele
2000 Data collection in pragmatics research. In: Helen Spencer-Oatey (ed.), Cul-
turally Speaking: Managing Rapport Through Talk Across Cultures, 316–334.
London: Continuum.
Kasper, Gabriele
2004 Speech acts in (inter)action: Repeated questions. Intercultural Pragmatics 1:
125–133.
Kasper, Gabriele and Merete Dahl
1991 Research methods in interlanguage pragmatics. Studies in Second Language
Acquisition 13: 215–247.
Kwon, Jihyun
2004 Expressing refusals in Korean and in American English. Multilingua 23: 339–
364.
Leech, Geoffrey N.
1983 Principles of Pragmatics. New York: Longman.
Levenston, Edward A. and Shoshana Blum
1978 Discourse completion as a technique for studying lexical features of interlan-
guage. Working Papers on Bilingualism 15: 13–21.
Lindström, Anna
2005 Language as social action: A study of how senior citizens request assistance
with practical tasks in the Swedish home help service. In: Auli Hakulinen and
Margret Selting (eds.), Syntax and Lexis in Conversation: Studies on the Use
of Linguistic Resources in Talk-in-interaction, 209–230. Amsterdam: John
Benjamins.
Maíz-Arévalo, Carmen
2015 DCTs versus Naturally Occurring Data in the realization of disagreement by
non-native speakers of English. In: Sara Gesuato, Francesca Bianchi and Winnie
Cheng (eds.), Teaching, Learning and Investigating Pragmatics: Principles,
Methods and Practices, 185–205. Newcastle: Cambridge Scholars Publishing.
Mandelbaum, Jenny
2014 “How to do things with requests.” Request sequences at the family dinner
table. In: Paul Drew and Elizabeth Couper-Kuhlen (eds.), Requesting in
Social Interaction, 215–242. Amsterdam: John Benjamins.
Marti, Leyla
2006 Indirectness and politeness in Turkish-German bilingual and Turkish monolin-
gual requests. Journal of Pragmatics 38: 1836–1869.
Mulo Farenkia, Bernard
2012 Compliment strategies and regional variation in French: Evidence from Cam-
eroon and Canadian French. Pragmatics 22: 447–476.
Murphy, Beth and Joyce Neu
1996 My grade’s too low: The speech act set of complaining. In: Susan M. Gass and
Joyce Neu (eds.), Speech Acts Across Cultures: Challenges to Communication
in a Second Language, 191–216. Berlin/New York: Mouton de Gruyter.
Nakabachi, Keiichi
1996 Pragmatic transfer in complaints: Strategies of complaining in English and
Japanese by Japanese EFL speakers. JACET Bulletin 27: 127–142.
Nureddeen, Fatima Abdurahman
2008 Cross cultural pragmatics: Apology strategies in Sudanese Arabic. Journal of
Pragmatics 40: 279–306.
Ogiermann, Eva
2008 On the culture-specificity of linguistic gender differences: The case of English
and Russian apologies. Intercultural Pragmatics 5: 259–286.
Ogiermann, Eva
2009a On Apologising in Negative and Positive Politeness Cultures. Amsterdam:
John Benjamins.
Ogiermann, Eva
2009b Politeness and in-directness across cultures: A comparison of English, Ger-
man, Polish and Russian requests. Journal of Politeness Research 5: 189–216.
Ogiermann, Eva and Denise Saßenroth
2012 Statistics in contrastive pragmatics. In: Leyre Ruiz de Zarobe and Yolanda
Ruiz de Zarobe (eds.), Speech Acts and Politeness across Languages and Cul-
tures, 369–398. Bern: Peter Lang.
Olshtain, Elite and Liora Weinbach
1993 Interlanguage features of the speech act of complaining. In: Gabriele Kasper
and Shoshana Blum-Kulka (eds.), Interlanguage Pragmatics, 108–122. New
York: Oxford University Press.
Otcu, Bahar and Deniz Zeyrek
2008 Development of requests: A study on Turkish learners of English. In: Martin
Pütz and JoAnne Neff-van Aertselaer (eds.), Developing Contrastive Prag-
matics. Interlanguage and Cross-Cultural Perspectives, 265–300. Berlin/New
York: Mouton de Gruyter.
Owen, Marion
1983 Apologies and Remedial Interchanges: A Study of Language Use in Social
Interaction. Berlin/New York: Mouton de Gruyter.
Pinto, Derrin
2005 The acquisition of requests by second language learners of Spanish. Spanish in
Context 2: 1–27.
Placencia, Maria Elena
2008 Requests in corner shop transactions in Ecuadorian Andean and coastal Spanish.
In: Klaus P. Schneider and Anne Barron (eds.), Variational Pragmatics: A
Focus on Regional Varieties in Pluricentric Languages, 307–332. Amsterdam:
John Benjamins.
Pomerantz, Anita
1978 Compliment responses: Notes on the cooperation of multiple constraints. In:
Jim Schenkein (ed.), Studies in the Organization of Conversational Interac-
tion, 79–112. New York: Academic Press.
Pomerantz, Anita
1984 Agreeing and disagreeing with assessments: Some features of preferred/dis-
preferred turn shapes. In: J. Maxwell Atkinson and John Heritage (eds.), Struc-
tures of Social Action: Studies in Conversation Analysis, 57–101. Cambridge:
Cambridge University Press.
Rintell, Ellen M. and Candace J. Mitchell
1989 Studying requests and apologies: An inquiry into method. In: Shoshana Blum-
Kulka, Juliane House and Gabriele Kasper (eds.), Cross-cultural Pragmatics:
Requests and Apologies, 248–272. Norwood, NJ: Ablex.
Robinson, Jeffrey D.
2004 The sequential organization of “explicit” apologies in naturally occurring Eng-
lish. Research on Language and Social Interaction 37: 291–331.
Rose, Kenneth R.
1992 Speech acts and questionnaires: The effect of hearer response. Journal of
Pragmatics 17: 49–62.
Rose, Kenneth R.
1994 On the validity of discourse completion tests in non-Western contexts. Applied
Linguistics 15: 1–14.
Rose, Kenneth R. and Reiko Ono
1995 Eliciting speech act data in Japanese: The effect of questionnaire type. Lan-
guage Learning 45: 191–223.
Rossi, Giovanni
2012 Bilateral and unilateral requests: The use of imperatives and Me X? interroga-
tives in Italian. Discourse Processes 49: 426–458.
Sabaté i Dalmau, Maria and Hortènsia Curell i Gotor
2007 From “Sorry very much” to “I’m ever so sorry”: Acquisitional patterns in L2
apologies by Catalan learners of English. Intercultural Pragmatics 4: 287–
315.
Sasaki, Miyuki
1998 Investigating EFL students’ production of speech acts: A comparison of pro-
duction questionnaires and role plays. Journal of Pragmatics 30: 457–484.
Schauer, Gila A. and Svenja Adolphs
2006 Expressions of gratitude in corpus and DCT data: Vocabulary, formulaic
sequences, and pedagogy. System 34: 119–134.
Schneider, Klaus P.
2008 Small talk in England, Ireland, and the USA. In: Klaus P. Schneider and Anne
Barron (eds.), Variational Pragmatics: A Focus on Regional Varieties in Pluri-
centric Languages, 99–138. Amsterdam: John Benjamins.
Schneider, Klaus P.
2011 Imagining conversation: How people think people do things with words.
Sociolinguistic Studies 5: 15–36.
Shardakova, Maria
2005 Intercultural pragmatics in the speech of American L2 learners of Russian:
Apologies offered by Americans in Russian. Intercultural Pragmatics 2: 423–
451.
Slugoski, Ben R. and William Turnbull
1988 Cruel to be kind and kind to be cruel: Sarcasm, banter and social relations.
Journal of Language and Social Psychology 7: 101–121.
Suszczyńska, Małgorzata
1994 A study in intercultural pragmatics: Apology. Studies in Applied Linguistics 1:
11–22.
Suszczyńska, Małgorzata
1999 Apologizing in English, Polish and Hungarian: Different languages, different
strategies. Journal of Pragmatics 31: 1053–1065.
Tang, Chen-Hsin and Grace Qiao Zhang
2009 A contrastive study of compliment responses among Australian English and
Mandarin Chinese speakers. Journal of Pragmatics 41: 325–345.
Turnbull, William
2001 An appraisal of pragmatic elicitation techniques for the social psychological
study of talk: The case of request refusals. Pragmatics 11: 31–61.
Van Mulken, Margot
1996 Politeness markers in French and Dutch requests. Language Sciences 18: 689–
702.
Warga, Muriel
2008 Requesting in German as a pluricentric language. In: Klaus P. Schneider and
Anne Barron (eds.), Variational Pragmatics: A Focus on Regional Varieties in
Pluricentric Languages, 245–266. Amsterdam: John Benjamins.
Warga, Muriel and Ursula Schölmberger
2007 The acquisition of French apologetic behaviour in a study abroad context.
Intercultural Pragmatics 4: 221–252.
Watts, Richard J.
2003 Politeness. Cambridge: Cambridge University Press.
Woodfield, Helen
2008 Interlanguage requests: A contrastive study. In: Martin Pütz and JoAnne Neff-
van Aertselaer (eds.), Developing Contrastive Pragmatics. Interlanguage and
Cross-Cultural Perspectives, 231–264. Berlin/New York: Mouton de Gruyter.
Wootton, Anthony
1981 Two request forms of four year olds. Journal of Pragmatics 5: 511–523.
Wootton, Anthony
1997 Interaction and the Development of Mind. Cambridge: Cambridge University
Press.
Wouk, Fay
2006 The language of apologizing in Lombok, Indonesia. Journal of Pragmatics 38:
1457–1486.
Yuan, Yi
2001 An inquiry into empirical pragmatics data-gathering methods: Written DCTs,
oral DCTs, field notes, and natural conversations. Journal of Pragmatics 33:
271–292.
Zinken, Jörg and Eva Ogiermann
2013 Responsibility and action: Invariants and diversity in requests for objects
in British English and Polish interaction. Research on Language and Social
Interaction 46: 256–276.
10. Assessing the comprehension of pragmatic
language: Sentence judgment tasks
Alma Veenstra and Napoleon Katsos

Abstract: Researchers have used several different types of comprehension tasks to
investigate how listeners interpret language pragmatically. This chapter focuses on
sentence judgment tasks in which participants typically judge utterances of other
speakers on a binary scale, e. g. for correctness/incorrectness or appropriateness/
inappropriateness. Using examples from the scalar implicature literature, we argue
that these sentence judgment tasks should be used with caution. In binary judgment
paradigms, where participants are asked to make a judgment as to whether a sen-
tence is correct/incorrect, the acceptance of pragmatically incorrect sentences has
often led researchers to conclude that the participant has not yet acquired the prag-
matic phenomenon under investigation. A growing number of studies, however,
suggest that these speakers, when tested with other paradigms, actually are com-
petent in pragmatics. We explain why the experimental investigation of pragmatic
phenomena is particularly sensitive to the type of task chosen, and we conclude this
chapter with an overview of alternative comprehension tasks that are less likely to
underestimate the performance of participants.

1. Introduction: Eliciting data about meaning

A widely held idea in the study of semantics and pragmatics is that the meaning of
a sentence can be described (at least in part) in terms of its truth-conditions (see
Davidson 1967; Dummett 1959). In this view, a good starting point for obtain-
ing empirical data for a theory of meaning in natural languages is the speaker’s
intuitive understanding of the conditions under which a sentence is true. When
it comes to quantifying and comparing intuitions about the truth-conditions of
different kinds of sentences (or of the same sentence in different conditions), a
seemingly straightforward method is to elicit judgments of truth and falsity from a
number of speakers using comprehension tasks. The fields of language acquisition
and language processing, and experimental research in linguistics in general, abound
with variations of such tasks. In its simplest and most frequent form, which
we call here the Sentence Judgment Task, this involves the presentation of a sentence
(either in spoken or written form) and a situation of evaluation which is usually
depicted pictorially (though of course the situation can be presented orally as well).
The participant is then asked for a binary judgment on the sentence’s truth or falsity
for the situation. Among many other phenomena, sentence judgment tasks (henceforth
SJTs) have been used to study quantifier meaning (Smith 1980), quantifier
scope ambiguities (Lidz and Musolino 2000), presupposition projection (Chemla
and Bott 2013), speech acts (Gibbs 1994), figures of speech (Glucksberg 2003) and
implicature (Noveck 2001).

https://doi.org/10.1515/9783110424928-010
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 257–279. Berlin/
Boston: De Gruyter Mouton.

In the last fifteen years, the study of implicature, and in particular of scalar
implicature, has enjoyed increasing and sustained focus as a paradigmatic case
of the interaction of semantic and pragmatic meaning. In this chapter, our focus
on implicature is from a methodological perspective, with the aim of scrutinizing
the felicity of SJTs as a paradigm for the study of pragmatic meaning.
The chapter is structured as follows: first, we will introduce the phenomenon in
broad terms and hint at (but not explore) the theoretical debates surrounding the
mechanism and nature of implicature. Second, we will review how SJTs have been
used to study the acquisition and processing of implicature. Next, we will outline
in what respects the binary version of SJTs can be misleading as regards a partici-
pant’s competence with implicature. To spill the proverbial beans, participants may
deem that while the situation they are presented with is not compatible with the
implicature they have drawn, this violation may not be grave enough to warrant
the outright rejection of the sentence in a binary sentence judgment task. In this
respect, participants in sentence judgment tasks are asked to take part in a meta-lin-
guistic judgment, which evaluates their tolerance towards pragmatic violations, in
addition to their actual comprehension of pragmatic meaning. While the object of
theoretical inquiry is the latter, namely what a participant understood, SJTs mostly
reveal the former, namely the participant’s disposition towards what they under-
stood. In the final section we will review several other paradigms of comprehen-
sion tasks that do not face this challenge. We will also acknowledge the conditions
under which using an SJT would still be felicitous. Our take-home message is a call
for a nuanced understanding of what comprehension tasks can and cannot reveal,
and for extra care in the choice of tasks so as to minimize (or at least
acknowledge) the meta-linguistic component of SJTs.

2. Quantity implicatures

In Gricean pragmatics, the first sub-maxim of Quantity, “make your contribution
as informative as is required (for the current purposes of the exchange)” (Grice
1989: 26), is expected to guide speakers and addressees to communicate and infer
a broad class of inferences. These inferences include scalar implicatures (SIs),
which are related to scales whose terms differ in informativeness, such as <some,
all>, as in (1) below.
(1) A: Did you eat the cake?
B: I ate some of it. (=> not all of it)
In (1), the use of an informationally weaker term, some, implies that the speaker
cannot make a statement with the stronger term, all. Assuming that it would have
been relevant to use all if she could, and that she was in a position to know whether
the statement with all was true (which is intuitive in this mock example), the speaker
can be taken to communicate that she did not eat all of it. Scalar implicatures involve expressions
that can form a scale of entailment which could be evoked without any context,
such as the quantifier scale mentioned already, modal scales <might, must>, dis-
junction <or, and>, verbs of affection <like, love> among many others. However,
scalar implicatures are but special cases of a larger set of inferences that are based
on the Quantity maxim. Let us take the following conversation where a bride is
talking to her former boss and lover, Bill, about her fiancé, Tommy:
(2) The Bride: Have you seen Tommy?
Bill: Big guy in the tux?
The Bride: Yes.
Bill: Then I saw him. I like his hair.
The Bride: You promised you’d be nice!
(Kill Bill, Volume 2; script by Quentin Tarantino;
Uma Thurman as the Bride; David Carradine as Bill)
And let us further assume that the following inferences can be drawn from Bill’s
last utterance:
(3) a. Bill doesn’t love, adore, or worship Tommy’s hair.
b. Bill doesn’t like anything else about Tommy besides his hair.

While (3a) involves the cancellation of the stronger scalar alternatives of like on
a scale such as <like, love, adore, worship …>, the inference in (3b)
relies on alternatives that are harder to pin down. An entailment scale for (3b)
could consist of sets of things that one can like about a prospective groom, <{hair},
{personality}, {…}, {hair, personality}, {hair, personality, …}, …>. Alternatively,
the scale for (3b) could involve a person’s characteristics, ranked in terms of note-
worthiness, e. g. <hair, …, overall physical appearance, …, personality …>. There
is not an entailment relation between the terms of such a scale, but we can still see
how the terms could be ranked according to this (subjective) criterion. Whichever
way the scale is formed, in both (3a) and (3b), the reasoning that derives the infer-
ence goes somewhat like this: the interlocutors are interested in establishing what
Bill thinks of Tommy. Hence any information to that effect is relevant to the topic
of the conversation. Now, the speaker said “I like [Tommy’s] hair”. If the speaker
wanted to say something more informative than what he said, he would have done
so. If we assume that he is cooperative and that he knows what he is talking about,
then the fact that he did not say anything more implies that he did not want to. In
plain terms, the fact that Bill only said that he liked Tommy’s hair implies that he
does not feel any stronger towards it, and the fact that Bill only said that he liked
Tommy’s hair implies that he does not like anything else about Tommy. This,
260 Alma Veenstra and Napoleon Katsos

arguably, justifies the Bride’s reproach. While the first inference is considered a
scalar implicature on the grounds that there exists a context-independent scale of
alternatives for verbs of affection, the second inference is substantially dependent
on the context of conversation. Notwithstanding this difference, both inferences
are considered implicatures, and more precisely quantity implicatures, as they rely
on the maxim of Quantity.
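The reasoning behind (3a) can be made concrete with a small sketch. The Python below is an illustration only and not part of the original discussion: it assumes that a scale can be represented as a list ordered from the weakest to the strongest term, and the function names stronger_alternatives and quantity_inferences are hypothetical. Given the term a speaker actually used, it lists the stronger alternatives that a hearer may take the speaker to have avoided.

```python
from typing import List


def stronger_alternatives(scale: List[str], term: str) -> List[str]:
    """Return the terms ranked above `term` on a scale ordered weakest to strongest."""
    position = scale.index(term)  # raises ValueError if the term is not on the scale
    return scale[position + 1:]


def quantity_inferences(scale: List[str], term: str, template: str) -> List[str]:
    """Negate each stronger alternative, mimicking the hearer's reasoning."""
    return [template.format(alt) for alt in stronger_alternatives(scale, term)]


# Context-independent scale for verbs of affection, as in (3a).
affection_scale = ["like", "love", "adore", "worship"]

for inference in quantity_inferences(
    affection_scale, "like", "Bill does not {} Tommy's hair"
):
    print(inference)
```

The same function would apply to the inference in (3b) only if the context supplied an ordered list of alternatives, which is exactly what makes that inference context-dependent.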
The precise mechanism by which quantity implicatures are generated has been
the subject of much linguistic debate (Carston 1998; Chierchia 2004; Hirschberg
1991; Horn 1984; Geurts 2010; Levinson 2000; Sauerland 2004; Sperber and Wil-
son [1986] 1995; among others) and corresponding psycholinguistic investigations
(Bott, Bailey, and Grodner 2012; Bott and Noveck 2004; Breheny, Katsos, and
Williams 2006; De Neys and Schaeken 2007; Dieussaert, Verkerk, Gillard, and
Schaeken 2011; Feeney et al. 2004; Grodner, Klein, Carbary, and Tanenhaus 2010;
Huang and Snedeker 2009; Noveck and Posada 2003; Panizza, Chierchia, and
Clifton Jr. 2009; among others).
From a linguistic perspective, the questions that arise concern whether these
inferences are generated post-propositionally, that is after the truth-conditions of
the sentence have been computed, or sub-propositionally, and whether there is
a substantial distinction between generalized conversational implicatures, which
according to Horn (1984) seem to be available unless the specific context of the
conversation cancels them (this would be the case for (1) and (3a)), and particular-
ized conversational implicatures, which are available only if the specific context
of conversation supports them (this would be the case of (3b)).
These considerations have led to two broad classes of psycholinguistic
accounts. Contextual accounts propose that participants only infer SIs when cer-
tain contextual conditions are met, such as that the inference would be relevant,
that the speaker is cooperative, and that the speaker would be in a position to assert
the more informative proposition with all if it were true. Because these conditions
are not necessary in order to access the plain meaning of some (= at least one
and possibly even all), contextual models often assume that additional processing
costs are incurred when an interpretation with an SI is generated compared to an
interpretation without an SI (e. g. see Bott et al. 2012). Default accounts on the
other hand predict that SIs are the preferred interpretation of words like some, and
that SIs are not generated by standard pragmatic mechanisms. Instead, SIs are
relatively context-independent inferences, and additional costs are incurred when-
ever the SI is not generated (if it transpires that some of the contextual conditions
were not met). In these cases, the preferred interpretation with the SI which was
generated by default will be cancelled, leading to processing costs associated with
backtracking and re-analysis. From a child language acquisition perspective, these
two issues arise in terms of the age at which children acquire the ability to infer
scalar and other quantity implicatures, and the contextual conditions in which they
do so (see Katsos 2014, for an overview).

3. Sentence judgment tasks and the study of implicature

As mentioned in the introduction, a method that is very commonly used to study
the pragmatic inferences that participants make is the comprehension task. In its
most common form, the sentence judgment task, or SJT, involves the participants
hearing or reading a sentence which is intended to be a description of a situation
that they are presented with (visually, as a picture, or in a narrative). The par-
ticipants are then asked to make a judgment on the felicity of the sentence as a
description of that situation, e. g. they may be asked if the sentence was “true” or
“false”, or if it was “correct” or “incorrect”, or if they “agree” or “disagree”. The
comprehension task is designed in such a way as to elicit one kind of response
if the participant does make the pragmatic inference that is associated with the
sentence they were presented with, and to give another kind of response if the
participant does not make the inference.
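As a rough illustration of how responses from a binary SJT are typically summarised, the sketch below computes the proportion of accepted sentences per condition. It is a minimal example under stated assumptions: the trial records are invented toy data rather than results from any study, and the condition labels and the function name acceptance_rates are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def acceptance_rates(trials: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Proportion of accepted sentences per condition.

    Each trial is a (condition, accepted) pair, where `accepted` is True
    when the participant judged the sentence "right"/"true".
    """
    counts = defaultdict(lambda: [0, 0])  # condition -> [accepted, total]
    for condition, accepted in trials:
        counts[condition][0] += int(accepted)
        counts[condition][1] += 1
    return {cond: acc / total for cond, (acc, total) in counts.items()}


# Invented toy data for one participant (not taken from any experiment).
toy_trials = [
    ("correct", True), ("correct", True),
    ("incorrect", False), ("incorrect", False),
    ("underinformative", True), ("underinformative", False),
]

rates = acceptance_rates(toy_trials)
print(rates)
```

A low acceptance rate in the critical condition is then read as evidence that the inference was made, and a high rate as evidence that it was not; the sections that follow question how safe that second reading is.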
When it comes to implicature, the most prolific strand of research has tested
the debate outlined in section 2 using SJTs where underinformative statements
are presented in order to elicit participants’ judgments. For example, Noveck and
Posada (2003) asked participants to perform a timed binary judgment task on sen-
tences such as (4).
(4) Some elephants have trunks.

This sentence is pragmatically infelicitous, given that all elephants (stereotypically)
have trunks. Noveck and Posada (2003) argued that rejection of (4) would
indicate the generation of the scalar implicature some but not all, whereas accept-
ance of (4) would indicate an interpretation without the SI. Consequently, by com-
paring rejection and acceptance times, Noveck and Posada were able to compare
interpretations with and without the SI respectively, and documented significantly
longer response times for rejection. This supports the contextual model, which
predicted a slow SI derivation. To further test these two models, Bott and Noveck
(2004) controlled for response type (acceptance or rejection), and also documented
a preference for SI responses when the permitted response time was increased. They
also reported that SI responses incurred a processing cost relative to responses to
semantically true or false control utterances.
SJTs have also been used in language acquisition research. In a paradigm that
kick-started much of the developmental research on implicature, Noveck (2001,
Experiment 3, 179–183) presented 8- and 10-year-old children and adults with
de-contextualized underinformative utterances, such as some giraffes have long
necks or some cars have motors. They found that while adults rejected these utter-
ances at rates around 40 %, 8- and 10-year-old children overwhelmingly accepted
them. It is critical to note that neither children nor adults had substantial difficulties
accepting or rejecting patently true or patently false sentences, indicating that the
difficulty was specifically to do with the pragmatically infelicitous sentences. It
is important to note that the purpose of the studies reported in Noveck (2001) was
not to investigate the youngest age at which children can generate implicatures,
but rather to highlight quantity implicature as a systematically challenging kind
of inference for children (and adults), using a range of paradigms and a range of
expressions.
A number of studies have since focused on when and under what conditions
children derive quantity implicatures. In most subsequent work, the context of
evaluation is not the world at large (as is the case with the de-contextualized
sentences in Noveck 2001, Experiment 3) but rather a more strictly delineated
situation, usually by visual depiction. Papafragou and Musolino (2003) among
others, again using a binary SJT, studied 5-year-old children’s performance on
quantified, numerical, and aspectual scales, and found that children often accepted
underinformative statements. Training and explicit instruction improved the chil-
dren’s performance, increasing the success on the numerical scales to a near-ceil-
ing, 90 %. Feeney et al. (2004) found that manipulation of the relevance of the
implicature also enhances rejections of underinformative utterances, finding higher
rejection rates in 7-year-olds than those reported in Noveck (2001, Experiment 3)
for 10-year-olds. Guasti et al. (2005) studied children’s pragmatic competence in
a binary SJT setting by adapting Papafragou and Musolino’s training and explicit
instructions. They found that the performance of 7-year-olds did improve, but this
effect did not persist over a longer period of time. Barner, Brooks and Bale (2011),
Foppolo, Guasti and Chierchia (2012), Papafragou and Tantalou (2004), and Skor-
dos and Papafragou (2016) shed more light on the discourse, task and scale-related
conditions under which SIs are generated, which go beyond the scope of the present
chapter.
Overall, the developmental trajectory of competence in informativeness can
be understood as one in which children initially interpret sentences in the literal,
semantic interpretation, with the pragmatic interpretation, the implicature, being
acquired later on. These studies all base their conclusions on the assumption that
the children who accept underinformative sentences in the SJT do not have the
competence to draw an implicature. But 5-year-old children’s difficulties with
deriving implicatures are surprising given their competence with many of the pre-
conditions for pragmatic inferencing at an age as young as two years. For example,
much younger children, 2- and 3-year-olds, can track their interlocutors’ attention,
epistemic state and cooperativity as well as the common ground for the on-going
communicative interaction (Baldwin 1993; Behne, Carpenter, and Tomasello 2005;
Grassmann, Stracke, and Tomasello 2009; Liebal et al. 2009; Southgate, Cheval-
lier, and Csibra 2010; Tomasello 1992).
Moreover, in the well-known case of excluding already-labelled referents in
novel word learning, 2-year-old children succeed with a form of related counter-
factual reasoning (Clark 1987, 1988; Grassmann et al. 2009; Markman 1989, 1990;
Markman and Wachtel, 1988). Specifically, they can reason that had their interlocutor
wished to refer to an object whose label is known, he or she would have used
the known label. Hence, the fact that it was not used, suggests that the object being
referred to has not yet been labelled.
Likewise, studies by Wynn (1992) and by Barner and colleagues (Barner and
Bachrach, 2010; Barner et al., 2009) show that children as young as two years of
age can compute exclusion inferences based on numerals: in a forced choice pic-
ture selection task, children who only know the meaning of one and are shown two
sets of one and four items, pick the set of four when asked to “point to four”. These
inferences are very similar in form to quantity implicatures.
Additionally, another study demonstrated 3-year-old children’s skills in draw-
ing relevance inferences (Schulze, Grassmann, and Tomasello 2013). In that study,
an adult answers a closed question (e. g. “Would you like cornflakes or a roll for
breakfast?”) with a seemingly irrelevant utterance (e. g. “The milk is gone”). Only
by inferring the adult’s intentions in the situational context can the child conclude
that the adult would prefer the roll. Moreover, even 18-month-old children have
been shown to make relevance inferences in indirect nonverbal communication
(Schulze and Tomasello, 2015).
Why, therefore, would quantity implicature generally, and scalar implicature
more specifically, be especially difficult for 5-year-old, let alone 7- or 10-year-old
children? In this chapter we will not attempt to give a comprehensive answer to
this question, but see Katsos and Wilson (forthcoming) who assume that there is
no single answer but rather a confluence of linguistic, cognitive and experimental
factors. Here, our contribution is to point out the importance of the methodology
used to study the inference, and especially the potentially misleading role of binary
SJTs because of the meta-cognitive component involved. To do so, we turn to a
series of studies that have directly taken issue with the SJT.

4. Sentence judgment tasks and children’s pragmatic competence

A number of recent studies have shown that SJTs are not the ideal paradigm to
test participants’ competence with pragmatics, especially in the context of child
language acquisition. This section will review these studies and touch lightly upon
the Pragmatic Tolerance Hypothesis which we will return to in more detail in the
next section. First of all, Katsos and Smith (2010) were interested in the acquisition
of informativeness from both a production and a comprehension perspective. The
literature on exhaustivity had shown that children start to produce fully informative
answers to questions (and thus appear to have acquired the first maxim of Quantity in
language production: do not be underinformative) around the age of 5 to 6 years
(Roeper 2004; Roeper, Schulz, Pearson, and Reckling 2006). For example, a fully
exhaustive response to the question ‘Who is holding a balloon’ would be to men-
tion all the people who are holding a balloon rather than just one of them. The
literature on the acquisition of informativeness, however, had shown that children
rejected underinformative statements (and thus appeared to have acquired the first maxim
in comprehension: speakers are not underinformative) only at a later age (e. g.
Noveck, 2001). Was this discrepancy in the age of acquisition a manifestation of
an interface or of a production/comprehension asymmetry? Or could it be that the
paradigm used for the investigation of comprehension somehow did not test the
children’s actual competence in informativeness?
The idea that binary SJTs might obscure participants’ actual competence in
informativeness was first put forward by Katsos and Bishop (2008). They argued
that underinformative sentences in SJTs not only require the participants to gen-
erate an implicature, but also require them to consequently reject the underin-
formative sentence. It might very well be possible that the children notice the
pragmatic violation or that they draw the implicature, but at the meta-cognitive
level, they do not think this is a violation grave enough to warrant a rejection of
the utterance (which semantic violations typically do). To explore if this is the
case, Katsos and Smith (2010) compared children’s performance on a binary SJT
and a graded SJT. By providing participants with more options than the binary
“right” or “wrong”, the authors predicted that if the participants’ true compe-
tence was obscured by the binary choice, their sensitivity to underinformativeness
might be shown with a graded rating scale: although the pragmatic violation of
an underinformative sentence is less “wrong” than a semantic violation, it might
still be less “right” than an unambiguously semantically and pragmatically correct
sentence.
In the first experiment, twenty 7-year-old English speaking children were
tested using a binary SJT (which also had a production component, but the focus
here is on the comprehension part). On a computer screen, the participants were
introduced to a fictional character, Mr. Caveman, who was “trying to learn Eng-
lish”. The experimenter would narrate a short story and end the story with a display
of a protagonist together with the objects he/she was manipulating and ask Mr.
Caveman what the protagonist had been doing in the story. Using pre-recorded
sentences, Mr. Caveman gave semantically correct, semantically incorrect, and,
critically, underinformative answers (the control conditions were there to confirm
that the participants were willing to reject or accept sentences in general). To help
Mr. Caveman learn English, participants had to judge whether he said it right or
wrong. Whenever the answer from Mr. Caveman was wrong, the participant was
invited to provide the correct answer themselves, thus, eliciting fully informative
sentences in the underinformative condition. Half of the implicatures were sca-
lar implicatures, based on context-independent so-called generalized scales (e. g.
the mouse picked up some of the carrots, while all of the carrots were in fact
picked up) and half of them were context-dependent quantity implicatures, based
on so-called particularized ad hoc scales (e. g. the dog painted the triangle, while
the dog painted both a triangle and a heart). The participants were at ceiling for
the semantically correct and incorrect sentences, both in production and compre-
hension. However, in the underinformative condition, their comprehension sig-
nificantly lagged behind the comprehension of semantically correct or incorrect
sentences, as well as their production.
In the second experiment, fifteen 6- to 7-year-old English speaking children
were tested on a five-point scale rating task. The same items as Experiment 1 were
used, but instead of providing a judgment in the form of right and wrong, now
participants had to reward the answers from Mr. Caveman with one, two, three,
four, or five strawberries. Again, the children were at ceiling for the semantically
correct and incorrect sentences. The underinformative sentences, however, were
rated significantly higher than the semantically incorrect sentences, but lower than
the correct ones. This was taken as evidence for the Pragmatic Tolerance Hypoth-
esis, which proposes that children at a certain age may have acquired the ability to
derive implicatures, but may not reliably show it in a binary SJT.
More evidence for the Pragmatic Tolerance Hypothesis was reported by Kat-
sos and Bishop (2011). The authors put the binary SJT to the test by comparing it
to a graded SJT and a sentence-to-picture matching task. In the first experiment,
twenty 5- to 6-year-old English speaking children, as well as twenty adults, were
tested in a binary SJT. Apart from the absent production component, this experi-
ment was identical to the procedure and items in Experiment 1 from Katsos and
Smith (2010), above, in which Mr. Caveman was right or wrong in describing
situations from a narrative. The children and adults were at ceiling for the correct
and incorrect control conditions, and whereas the adults never accepted any under-
informative sentences, the children accepted them on over 70 % of the trials. The
children thus performed significantly worse on the critical underinformative con-
dition compared to the control conditions, and also when compared to the adults’
performance on the critical condition.
In the second experiment, eighteen 5- to 6-year-old English speaking children,
as well as ten adults, were tested in a ternary SJT. This experiment was comparable
to Experiment 2 from Katsos and Smith (2010), but instead of judging them on a
five-point scale, here participants had to reward the answers from Mr. Caveman
with a small, large, or huge strawberry. Both children and adults rewarded the
semantically incorrect sentences with the small strawberry, the semantically cor-
rect sentences with the huge strawberry, and the underinformative sentences with
the intermediate, large strawberry. This clearly demonstrates that when given an
additional option that does not represent “right” or “wrong”, children as well as the
adults show their sensitivity to underinformativeness, by selecting the intermediate
option.
The final experiment was a sentence-to-picture matching task. Here, the same
fictional character from the previous experiments, Mr. Caveman, narrated by
means of pre-recorded utterances the same stories that were used in the previous
experiments. At the end of each story, four pictures appeared on the screen and the
participants were asked to indicate which picture matched the story. To test the
generalized scale <some, all>, Mr. Caveman would say “the mouse picked up some
of the carrots”. The corresponding pictures showed a mouse picking up three out
of five carrots, three out of five pumpkins, five out of five carrots, and five out of
five pumpkins. To test the ad hoc scale <triangle, triangle and heart>, Mr. Caveman
would say “the dog painted the triangle,” while the corresponding pictures showed
only a triangle, only a heart, a heart and a triangle, or a star and a triangle. Fifteen
5-year-old English speaking children participated in this experiment, as well as ten
adults. The adults were at ceiling for all conditions and the children did not perform
significantly differently from the adults.
To summarize the findings from Katsos and Bishop (2011) which are relevant
for our methodological review, with regard to underinformativeness, 5-year-old
children lag behind adult performance when tested with the binary SJT, but not
when tested with the graded SJT or the sentence-to-picture task. Not only did the
authors show that the 5- to 6-year-old children show sensitivity to underinforma-
tiveness in some (but not all) tasks, they also pointed out that even rejecting an
underinformative sentence does not automatically imply that an implicature has
been generated. If the dog painted a heart and a triangle, but only the triangle is
mentioned, you could argue the answer is wrong, because the character used the
less informative form on the scale <triangle, triangle and heart>. Using this weaker
term may be taken to imply that the speaker means that the dog did not paint any-
thing but what he said; therefore, not the triangle and the heart, which is not true
because he did. Therefore, generating an implicature could lead some participants
to reject the critical underinformative utterance. However, the mere observation
that the speaker was underinformative may lead participants to reject the underinformative
utterance as well. That is, even without enriching what Mr. Caveman said with an
implicature, and taking him to have said only that “the dog painted a triangle, and possibly
more items” or that “the mouse picked some and maybe all of the carrots”, the utterance
is still infelicitous, because Mr. Caveman has witnessed a situation where
a stronger term could be used.
Whereas the studies above investigated the acquisition and processing of the
first maxim of Quantity (do not be underinformative), another study investigated
the acquisition of the second maxim of Quantity (do not be overinformative).
Davies and Katsos (2010) studied whether children can be pragmatically tolerant
towards overinformativeness, in the same way as they were shown to be toler-
ant towards underinformativeness. They investigated children’s performance with
regard to both the production and comprehension of under- and overinformativeness.
Parallel to predictions about underinformativeness, the Pragmatic Tolerance
Hypothesis predicted that children should be adult-like in their production, but not
in their comprehension, if that is measured with a binary SJT. However, when
comprehension is measured with a graded SJT, the children’s competence is more
likely to be revealed.
In the first experiment, twenty-four 5-year-old English speaking children
and twenty-four adults participated in a production task. To create situations in
which a speaker can make under- and overinformative sentences, the study used
referring expressions in a context of four possible choices. The participants were
shown a display of four objects and had to instruct a fictional character to pick
up one of the objects. In one set of items, there were four different objects, in the
other set of items, among the four objects, there were two that only differed in
some attribute (e. g. a fresh and a moldy apple). The participants received a sep-
arate booklet with displays matching the ones on the computer screen. An arrow
in the paper display indicated the object that the participant had to ask for. For
example, when the display showed a fresh apple, a moldy apple, a comb, and
a sausage, and the participant had to instruct the character to pick up the fresh
apple, the adjective fresh was necessary to distinguish between the two apples.
The instruction “pick up the apple” would be underinformative in this case. How-
ever, when the display showed an ice cream cone, a fresh apple, a comb, and
a sausage, and the participant had to instruct the character to pick up the fresh
apple, the adjective fresh was not necessary. Therefore, the instruction “pick up
the fresh apple” would be overinformative. The adults were rarely overinform-
ative, but did produce a few underinformative sentences. The children were also
rarely overinformative, but produced underinformative sentences about half of the
time.
In the second experiment, the same children who had participated in Experi-
ment 1 and twelve new adults were asked to judge referring expressions in a binary
SJT. Participants heard pre-recorded descriptions of one object from a four-object
display. The descriptions were instructions of one fictional character to another
fictional character to pass them one of the objects. Similar to Experiment 1, some-
times there were four different objects, sometimes two of the objects only differed
in some attribute (e. g. a small and a large star). The description would either
contain an adjective, or no adjective. In a display with a pineapple, a toothbrush,
a frog, and a star, “pass me the small star” would be overinformative. In a dis-
play with a large star, a chick, a house, and a small star, “pass me the star” would
be underinformative. The adults accepted over 70 % of overinformative sentences
and about 40 % of underinformative sentences. The children accepted almost 90 %
of overinformative sentences and around 75 % of underinformative sentences. So
whereas the adults penalized underinformativeness, but not overinformativeness,
the children did not penalize either of them.
In the third experiment, the same children who had participated in Experiments 1
and 2 and the twenty-four adults who had participated in Experiment 1 were asked
to judge referring expressions on a magnitude estimation scale, which is a type of
graded SJT. The magnitude estimation scale is a scale set up by each individual
participant without upper or lower limits and has been argued to be sensitive to fine
distinctions that participants may make but not be able to express in pre-defined
Likert scales (Bard, Robertson, and Sorace 1996)1. Here, the participants were
instructed to award the speaker as many strawberries as they wished, with at least one
strawberry as a positive lower limit. The procedure was similar to Experiment 2,
where sentences (correct, incorrect, underinformative, and overinformative) had to
be judged. The adults rated the under- and overinformative sentences lower than
the correct ones, and the difference between the under- and overinformative ratings
was not significant. The children’s ratings were very similar to the adults’, with
lower ratings for under- and overinformative sentences than correct ones, and no
difference between the two critical conditions.
To summarize, Davies and Katsos (2010) showed that 5-year-old children have
acquired the second maxim of Quantity (do not be overinformative) in both production
and comprehension. However, as with the studies on underinformativeness, the
children’s performance was strongly dependent on whether a binary or a graded
SJT was used. In a strong demonstration that the type of SJT makes a difference
to participant performance, whereas the underinformativeness studies reviewed
above used different sets of 5-year-olds, this study used the same children on both
types of SJT and showed that the exact same children who in a binary SJT seemed
not sensitive to overinformativeness, did show sensitivity in the graded SJT.
The final study we will review is by Veenstra, Hollebrandse, and Katsos (sub-
mitted). They were interested in the development of Pragmatic Tolerance and
looked at underinformativeness in 4- to 9-year-olds with a binary SJT and a graded
SJT. In contrast to some of the earlier underinformativeness studies reviewed
above, here the same children participated in both the binary and the graded SJT.
Seventy-five 4- to 9-year-old Dutch speaking children participated in the first
study. A fictional character made statements about a display with three objects in
or next to a basket, which the participants had to judge with a press on a green
button (if the statement was “right”) or a press on a red button (if the statement
was “wrong”). In addition to the answers, the response times for the button presses
were recorded. Only ad hoc scales were used, so an example of an underinformative
statement would be in de mand ligt een bal ‘in the basket there is a ball’ when
in fact in the basket there are a ball and a shoe. The children were at ceiling for the
semantically correct and incorrect control sentences, but in the critical underinformative
condition, they accepted half of the sentences, and would therefore seem
to be only partially competent with informativeness.

1
A pre-defined Likert scale is a scale where the author has decided in advance how many
distinctions can be made. For example, a Likert scale of 1–3 allows participants to draw
up to three distinctions between the items in the experiment. A disadvantage of a pre-defined
Likert scale is that a participant may wish to make more distinctions, e. g. she may
judge some sentence as completely unacceptable, some as somewhat unacceptable, some
as mostly acceptable and some as perfectly acceptable. These distinctions cannot be
expressed straightforwardly if this participant is offered a scale with just three points. A
magnitude estimation task, in brief, allows participants to conjure their own scale, and
therefore it allows them to draw as many distinctions between the items as they think is
necessary, without forcing them to use the number of distinctions that the author believes
are sufficient. Using some statistical calculations, the responses can then be ‘normalised’
on a single scale.

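The ‘normalisation’ of magnitude estimation responses mentioned in footnote 1 is not spelled out in this chapter. The sketch below shows one common choice, offered purely as an assumption: each participant's ratings are log-transformed and then converted to z-scores within that participant, so that participants who chose very different ranges become comparable. The ratings are invented toy data and the function name normalise is hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import Dict, List


def normalise(ratings: List[float]) -> List[float]:
    """Log-transform one participant's ratings and z-score them.

    Assumes all ratings are positive (the task described here had a lower
    limit of one strawberry), so the logarithm is always defined.
    """
    logs = [math.log(r) for r in ratings]
    centre = mean(logs)
    spread = pstdev(logs)
    if spread == 0:  # participant gave the same rating throughout
        return [0.0 for _ in logs]
    return [(value - centre) / spread for value in logs]


# Invented toy ratings: two participants using very different ranges.
toy_ratings: Dict[str, List[float]] = {
    "participant_A": [1, 3, 9],
    "participant_B": [10, 100, 1000],
}

normalised = {pid: normalise(r) for pid, r in toy_ratings.items()}
for pid, scores in normalised.items():
    print(pid, [round(s, 3) for s in scores])
```

In this toy case both participants end up with the same normalised profile, because each spaced their three ratings by a constant ratio; only the relative distinctions survive, which is the point of the procedure.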
The same seventy-five 4- to 9-year-old children who had participated in Exper-
iment 1 participated in the second experiment. The procedure was similar to Exper-
iment 1, as participants had to judge the fictional character’s descriptions about
a display, now with three animals on or next to a sofa. Instead of providing a
binary judgment, now the participants had to reward the fictional character with a
small, large, or huge strawberry (see also Katsos and Bishop 2011). The children
rewarded almost all incorrect sentences with the small strawberry and the correct
sentences with the huge strawberry. The results for the underinformative sentences
were more mixed: 25 % of the underinformative utterances were rewarded with the
small strawberry, 40 % with the large strawberry, and 35 % with the huge straw-
berry.
Although the percentage of items that were accepted went down from 50 % in
the binary SJT to 35 % in the graded SJT, the results are even more revealing when
looking at individual participants. In Experiment 1, there were twenty-two
participants who never accepted underinformative statements. They have clearly acquired
sensitivity to informativeness and responded accordingly by rejecting the
underinformative utterances. There were also twenty participants who always accepted
underinformative statements. Out of these latter children, who would be categorized as
pragmatically oblivious by the binary SJT alone, six never awarded the huge strawberry in
Experiment 2, whereas seven always awarded the huge strawberry in Experiment
2. This is where the distinction can be made between pragmatically tolerant
children (the six who show sensitivity in the graded SJT) and pragmatically oblivious
children (the seven who do not show sensitivity in the graded SJT). Although the
difference was not statistically significant because of the small numbers, the pragmatically
oblivious children were younger than the pragmatically tolerant children.
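The categorization just described can be sketched as a simple decision rule. The sketch below is an illustration only: the function name, the category labels and the data format are assumptions, and participants with inconsistent response patterns are simply left as "mixed".

```python
def classify(binary_accepts, graded_rewards):
    """Classify one participant from responses to underinformative items.

    binary_accepts: list of bool, True if the item was accepted in the binary SJT.
    graded_rewards: list of 'small' | 'large' | 'huge' from the graded SJT.
    All labels returned here are hypothetical shorthand.
    """
    if not binary_accepts or not graded_rewards:
        raise ValueError("both response lists must be non-empty")
    if not any(binary_accepts):
        # Never accepted underinformative items in the binary task.
        return "rejects_consistently"
    if all(binary_accepts):
        if all(r == "huge" for r in graded_rewards):
            # No sensitivity in either task.
            return "oblivious"
        if all(r != "huge" for r in graded_rewards):
            # Accepts in the binary task, but never gives the top reward.
            return "tolerant"
    return "mixed"
```

For example, a child who accepts every underinformative item in the binary task but never awards the huge strawberry would be classified as tolerant under this rule.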
The graded SJT differentiated between pragmatically tolerant and pragmatically
oblivious children, and based on this categorization, the response times
provide additional evidence for the Pragmatic Tolerance Hypothesis. The hypothesis
predicts that pragmatically tolerant children notice the pragmatic violation, but
decide not to penalize it, whereas pragmatically oblivious children do not notice
the pragmatic violation to begin with. Accepting an underinformative sentence took
significantly longer than accepting a semantically correct sentence in
pragmatically tolerant children, whereas there was no difference in the time to accept
underinformative and correct sentences for pragmatically oblivious children.
270 Alma Veenstra and Napoleon Katsos

5. Implications for the interpretation of data from sentence judgment tasks

All in all, the studies reviewed in the previous section raise concerns about how to
treat results from a binary SJT when it comes to pragmatic competence. One thing to
keep in mind is that accepting and rejecting underinformative sentences in a binary
SJT can be the outcome of different kinds of competence. If a participant accepts
an underinformative sentence, such as “the mouse picked up some of the carrots”
in a situation where the mouse picked up all the carrots, this could be for one of
four reasons. First of all, it is possible the participant did not generate the scalar
implicature that is available from the critical utterance, namely that “the mouse
did not pick up all of the carrots”. Second, it is possible that the participant did not
even notice that there was a more informative expression that the speaker could
have used, namely “the mouse picked up all of the carrots”. In both these cases, the
participant is oblivious to some important aspect of pragmatics, either that there
exist more informative alternatives, or that a cooperative speaker who utters a less
informative expression usually implies the negation of the more informative
alternative. Therefore, in both cases the acceptance of the underinformative utterance
is due to the lack of some aspect of pragmatic competence. However, there is also a
third, distinct, possibility that the participant did generate an implicature or that,
fourth, they did at least notice that there was a more informative expression that the
speaker could have used, but in either case, they did not consider the violation to be
grave enough to categorically reject the critical utterance. In these two cases, the
participant is tolerant towards pragmatically infelicitous utterances but not
oblivious to some aspect of pragmatics. And in these two cases, the acceptance of the
underinformative utterance is due to a meta-cognitive disposition, rather than to their
linguistic competence, which is the actual focus of interest of linguists.
Turning to the case where a participant rejects an underinformative sentence,
we are on somewhat more certain ground as regards interpreting their competence,
but not completely. A rejection might be elicited because the participant generated
the implicature, compared it to the display that does not match the implicature
interpretation and decided that the difference is big enough to warrant the rejection
of the utterance. Alternatively, the participant simply notices that there is a more
informative expression that the speaker could have used, and rejects the utterance
on these grounds, without generating the implicature. These two scenarios involve
distinct levels of pragmatic processing, because noticing that there exists a more
informative statement is a precondition of generating an implicature.
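The ambiguity of binary responses discussed in this section can be made explicit with a small lookup table. The sketch below only restates the scenarios listed above; the wording of the entries and the function names are informal paraphrases introduced for illustration.

```python
# Candidate explanations for each binary response to an underinformative
# utterance, paraphrased from the scenarios discussed in the text.
EXPLANATIONS = {
    "accept": [
        ("did not generate the implicature", "lack of competence"),
        ("did not notice the more informative alternative", "lack of competence"),
        ("generated the implicature but did not penalize it", "tolerance"),
        ("noticed the alternative but did not penalize it", "tolerance"),
    ],
    "reject": [
        ("generated the implicature and judged the mismatch grave enough",
         "implicature generated"),
        ("noticed the alternative without generating the implicature",
         "alternative noticed only"),
    ],
}

def candidate_explanations(response):
    """Return every explanation compatible with an observed response."""
    return EXPLANATIONS[response]

def is_diagnostic(response):
    """A response is diagnostic only if exactly one explanation fits it."""
    return len(EXPLANATIONS[response]) == 1
```

Neither response is diagnostic in this sense, because each is compatible with more than one underlying state.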
The upshot of these observations is that if the main purpose of an experiment
is to study whether a speaker has acquired the ability to generate implicatures, then
the binary SJT leaves much to be desired. Acceptances of pragmatically infelicitous
utterances may be due to lack of some kind of pragmatic competence (competence
with generating alternatives or with actually generating implicatures), or to toler-
ance to pragmatic infelicity. These are categorically different mental and develop-
mental states, the former pertaining to linguistic competence while the latter relates
to a meta-linguistic judgment. Rejections of pragmatically infelicitous utterances
are also ambiguous, because it is not straightforwardly clear on what grounds the
participant has rejected the critical utterance: is it the fact that they noticed that
there was a more informative alternative that was not used? Or is it that they
actually generated an implicature? The binary outcome of an SJT does not suffice to
clarify this issue. What is the way forward then?

6. Alternatives for the sentence judgment paradigm

As Katsos and colleagues have done (Davies and Katsos 2010, Experiment 2;
Katsos and Bishop 2011, Experiment 2; Veenstra, Hollebrandse, and Katsos submitted,
Experiment 2), one way to side-step some of these challenges is to retain the format of
the SJT but move away from the binary scale. By introducing a graded scale, from
a ternary one as in Katsos and Bishop (2011, Experiment 2) to a magnitude
estimation task as used by Davies and Katsos (2010), participants who are tolerant but
not oblivious towards pragmatic violations are given the option to select a response
that is less penalizing than a categorical rejection. This minimal adaptation of the
binary SJT is quite simple and yet powerfully effective. However, while SJTs have
been the statistically dominant paradigm in the study of pragmatics, there is no
particular motivation for using a judgment paradigm in the first instance. At the
risk of stating the obvious, a paradigm where participants judge the felicity (or
grammaticality, or phonological or phonetic realization – to make this point more
generally) of a sentence will involve two components by definition: a linguistic
component where participants assign an interpretation to the utterance (or a syn-
tactic or other representation), and a non-linguistic, meta-linguistic component
where they evaluate the similarity between the representation they formed and the
given situation of evaluation and form a judgment. While graded SJTs allow more
than two possible judgments, and therefore allow for more nuanced judgments,
they do not by-pass the fundamental challenge of any SJT, namely that it involves
a meta-linguistic judgment.
Any paradigm where the participant is presented with the critical utterance
and is offered a range of situations among which they can select one to match
to the utterance avoids the meta-linguistic component of SJTs. There is a wealth
of variations of this paradigm, from the very simple sentence-to-picture-match-
ing task reported in Katsos and Bishop (2011, Experiment 3) to the visual-world
eye-tracking paradigm used by Huang and Snedeker (2009). As mentioned above,
Katsos and Bishop (2011, Experiment 3) presented participants with an utterance and
four pictures, of which only one matched with the pragmatic interpretation of the
utterance (e. g. the mouse picked up some but not all of the carrots). One other
alternative matched with the logical interpretation of the utterance (e. g. the mouse
picked up all of the carrots, which is compatible with the mouse picking up at
least some of the carrots), whereas the other two pictures did not match at all.
The 5-year-old children in this task scored at ceiling, always selecting the picture
which matches with the pragmatic interpretation (‘some but not all’) in stark con-
trast to their performance on the binary judgment task (Experiment 1) where their
responses seemed more in line with the logical interpretation. In another version
of this sentence-to-picture-matching task, Horowitz and Frank (2015) showed par-
ticipants three pictures of book covers and asked them to point out the book that
matched the experimenter’s description. In the underinformative condition, one
cover depicted the pragmatic interpretation, one cover the logical interpretation,
and one did not match the utterance at all. Only three pictures were used to reduce
the task demands. The authors used the paradigm to test both quantifier and ad-hoc
scalar implicatures and concluded that 4- to 5-year-olds had more difficulties with
the quantifier implicatures than with the ad-hoc ones but in both cases they per-
formed better than what would have been expected from SJT data.
Bill et al. (2016) wanted to investigate whether indirect scalar implicatures (the
implicature from not all to some) and presuppositions are generated in the same
way in adults and two groups of children. They used a covered box picture
selection task. Here, participants were presented with a context picture and a description
of it. Next, two pictures were shown, one with an actual scene and one that was
covered with a black screen. A new description was played and the participants had
to select the picture that matched this description. In the indirect scalar implicature
condition, the story was introduced with “Today a group of penguins and a group of
rabbits went to the park. All of the penguins brought balls”. Then the target pictures
were presented, and the experimenter said “But not all rabbits brought balls. Which
group of rabbits do you think I’m talking about?” (Bill et al. 2016: 63–64). The
visible picture showed the literal interpretation, a group of rabbits with no balls,
whereas the covered picture implied the pragmatic interpretation, on which some
rabbits would have a ball. Choosing the covered box meant that the implicature or
presupposition was generated. Adults more often selected the covered picture for
indirect scalar implicatures than for presuppositions. The opposite was evidenced
for 4- to 5-year-old children. Similarly, the adults were more likely to generate a
direct than an indirect scalar implicature, which was reversed for the children. The
group of 7-year-old children patterned with the 4- to 5-year-olds.
Huang and Snedeker (2009) used the visual-world paradigm to study the com-
prehension of scalar implicatures with some and all. Participants were presented
with a visual display which showed four pictures. Each picture depicted a person
and a set of items. For example, in the condition testing the comprehension of some,
one picture showed a boy with two socks, one picture showed a girl with two socks,
one picture showed a boy without any item, and the final picture showed a girl with
three balls. Participants heard the utterance “point to the girl that has some of the
socks”. While the utterance unfolds over time, the unconscious eye-movements
reflect how it is being interpreted: the word girl distinguishes between the pictures
with the boys and the girls. The word some distinguishes between the girl with some
of the socks and the girl with all the balls, but only participants that have generated
the implicature will show more looks to the girl with the socks at this point. On the
logical interpretation, the final decision can only be made upon hearing socks. This
paradigm allows researchers to study how the interpretation develops over time,
rather than looking only at the end results of the comprehension process. Although
Huang and Snedeker investigated implicature in 5-year-old children, the visual-
world paradigm is also very useful for studies with very young children. Yoon,
Wu, and Frank (2015) investigated the comprehension of ad-hoc implicatures in
2- to 5-year-old children using eye-tracking. The participants were presented with
two pictures and a pre-recorded utterance describing one of the pictures. In the
underinformative condition, one picture showed a plate with a carrot, whereas
the other picture showed a plate with a banana and a carrot. The corresponding
utterance was “Look at these plates. Elmo’s plate has a carrot”. The trial ended
with the character marking the correct plate. The participants were not required
to do anything other than look at the pictures. Anticipatory eye-movements to the
target and distracter pictures showed whether or not the participants interpreted
the utterance pragmatically. The results showed that the 4- and 5-year-olds
performed better (i. e. produced more looks toward the target picture) than the 2- and
3-year-olds.
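A typical dependent measure in such eye-tracking studies is the proportion of looks to the target picture within a time window. The sketch below illustrates the computation on a hypothetical sequence of gaze samples; the data format, region labels and window boundaries are assumptions and are not taken from the studies cited.

```python
def proportion_target_looks(samples, start_ms, end_ms):
    """Proportion of gaze samples on the target within [start_ms, end_ms).

    samples: list of (time_ms, region) pairs, where region is a label such as
    'target', 'competitor' or 'other' (hypothetical coding scheme).
    Returns None if no samples fall inside the window.
    """
    in_window = [region for t, region in samples if start_ms <= t < end_ms]
    if not in_window:
        return None
    return sum(1 for region in in_window if region == "target") / len(in_window)

# Hypothetical gaze record for one trial, sampled every 100 ms
trial = [(0, "other"), (100, "competitor"), (200, "target"), (300, "target")]
early = proportion_target_looks(trial, 0, 200)
late = proportion_target_looks(trial, 200, 400)
```

Comparing such proportions across successive windows shows how the interpretation develops as the utterance unfolds, rather than only its end result.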
In a similar vein, paradigms where the participant is invited to act upon the
situation they are presented with and to make it match the critical utterance also
bypass the meta-cognitive component of the SJTs. In a very simple, but elegant
Direct Instruction Task, Miller et al. (2005) presented participants with a sheet of
paper on which there were four faces, each of them lacking a mouth. They were
instructed to make some faces happy. The authors used this paradigm to study if
stressing the quantifier helped children generate implicatures, which turned out to
be the case for children as young as three-and-a-half years old. Using a slightly
different approach, Pouscoulous et al. (2007) designed an action-based task where
children are presented with physical containers and objects (e. g. tokens or toys)
that may be moved in or out of them. The participants were required to change the
position of the objects to match the sentence they heard. Critically, the starting
configuration of containers and objects for each trial of the experiment is differ-
ent. In the critical condition for implicature, in the starting configuration all of the
containers have an object inside them, while the participant hears “some” of the
containers have an object. In this paradigm, Pouscoulous et al. (2007) reported
substantially higher rates of implicature in 4-year-olds and 5-year-olds than
those reported to date with SJTs.
Another paradigm that does not require meta-linguistic judgment measures
brain activation during the comprehension of implicatures with the Event-Related
Potential technique (ERP). The advantage of this technique is that no overt task is
necessary as comprehension is tracked without the participant’s conscious control
while the utterance unfolds. Noveck and Posada (2003) had participants listen
to utterances that were semantically true, false, or pragmatically
underinformative. They focused on the N400, a negative-going deflection in the ERP signal
that peaks around 400 ms after the onset of a semantically incongruous
word. Both true and false utterances produced larger N400s than
the underinformative utterances, which was unexpected and difficult to explain,
as semantically true utterances are not expected to generate a large N400. In a
more recent study, Nieuwland, Ditman, and Kuperberg (2010) pointed out some
methodological issues with Noveck and Posada’s study (for instance, the
counterbalancing of the items and the timing of stimulus presentation). They conducted
an ERP study where participants read utterances that were presented on a monitor
in a word-by-word fashion. The N400 response on fully informative utterances
(e. g. “some people have pets”) was compared to underinformative utterances (e. g.
“some people have lungs”). Pragmatically weak participants, who scored high on
an autism-spectrum quotient questionnaire, showed no difference in N400 between
the two types of sentences. However, pragmatically strong participants, with a low
autism-spectrum quotient score, showed stronger N400s for underinformative than
informative utterances. Finally, Hunt et al. (2013) measured ERPs while their par-
ticipants saw pictures and heard utterances describing these pictures. They found
that the strength of the N400 was mediated by the type of violation: compared to
when the sentence was fully informative and matched the picture, the N400 was
strongest for semantic mismatches where the sentence completely mismatched
with the picture, but intermediate for sentences that were semantically appropriate
but pragmatically underinformative.
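Comparisons of N400 strength across conditions are commonly based on the mean amplitude of the averaged waveform in a time window around 400 ms. The following sketch illustrates this computation; the 300 to 500 ms window, the function names and the voltage values are assumptions chosen for illustration and are not taken from the studies reviewed above.

```python
def mean_amplitude(times_ms, voltages_uv, start_ms=300, end_ms=500):
    """Mean voltage (in microvolts) of samples within [start_ms, end_ms]."""
    window = [v for t, v in zip(times_ms, voltages_uv) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("no samples in the requested window")
    return sum(window) / len(window)

def n400_effect(times_ms, condition_uv, baseline_uv):
    """Difference in mean amplitude between a condition and a baseline.

    A negative value indicates a larger (more negative) N400 in the
    condition than in the baseline.
    """
    return mean_amplitude(times_ms, condition_uv) - mean_amplitude(times_ms, baseline_uv)

# Hypothetical averaged waveforms, sampled every 100 ms
times = [200, 300, 400, 500, 600]
informative = [0.0, -1.0, -2.0, -1.0, 0.0]
underinformative = [0.0, -2.0, -4.0, -2.0, 0.0]
effect = n400_effect(times, underinformative, informative)
```

On these invented values the effect is negative, which corresponds to the pattern of a stronger N400 for underinformative than for informative utterances.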

7. Concluding remarks on the use of sentence judgment tasks

In this chapter we do not advocate that, because of the challenges we identified for
SJTs, they should be abandoned wholesale. As long as researchers are aware that
the interpretation of acceptances and rejections in SJTs is not unambiguous, and
that a meta-linguistic judgment is involved, one can use SJTs when what is at stake
is whether a certain factor has a discernible effect on adult language processing
or child acquisition. For example, Skordos and Papafragou (2016) use an SJT to
investigate the effect of relevance and of the availability of the stronger alternative
on children’s judgments of underinformative utterances. Their findings show that
indeed these two factors are taken into account and they have a significant effect
on the rate of rejection of underinformative utterances. We argue that important
questions remain unanswered by these findings, e. g. about whether children use
relevance and the availability of alternatives to generate implicatures, or whether
these two factors simply raise the salience of the stronger alternative. Nevertheless,
the SJT findings do provide a robust demonstration that these two factors affect
some level of pragmatic computation, which is in fact a critical observation about
theories of implicature acquisition and processing. Similarly, one may use SJTs to
investigate group differences, e. g. between native speakers and learners of a lan-
guage, or typically- and atypically-developing populations among others, as well
as in all situations where the main question is whether a property of the test-popu-
lation or test-items is significant.
Having said that, when it comes to research whose aim is to identify age-ranges
and boundaries before or after which a certain competence is available to children,
or research that aims to unveil if participants unambiguously have the ability to
generate implicatures, then SJTs are ill-suited, for the reasons we discussed in
this chapter. This is especially so for the binary version of SJTs. In the preceding
section of this chapter, we reviewed alternative methods that can be employed to
measure a participant’s comprehension of pragmatics, from simple to more com-
plex behavioral tasks to neuroimaging techniques. These methods do not involve
a meta-linguistic judgment and are more suited to reveal the participant’s true
competence with pragmatics. A gentle shift towards these paradigms and a nuanced
understanding of the limits of SJTs would be a particularly helpful move in the
field of experimental pragmatics, and experimental linguistics in general.

References

Baldwin, Dare A.
1993 Early referential understanding: Infants’ ability to recognize referential acts
for what they are. Developmental Psychology 29(5): 832–843.
Bard, Ellen Gurman, Dan Robertson and Antonella Sorace
1996 Magnitude estimation of linguistic acceptability. Language 72(1): 32–68.
Barner, David and Asaf Bachrach
2010 Inference and exact numerical representation in early language development.
Cognitive Psychology 60(1): 40–62.
Barner, David, Katherine Chow and Shu-Ju Yang
2009 Finding one’s meaning: A test of the relation between quantifiers and integers
in language development. Cognitive Psychology 58(2): 195–219.
Barner, David, Neon Brooks and Alan Bale
2011 Accessing the unsaid: The role of scalar alternatives in children’s pragmatic
inference. Cognition 118(1): 84–93.
Behne, Tanya, Malinda Carpenter and Michael Tomasello
2005 One-year-olds comprehend the communicative intentions behind gestures in a
hiding game. Developmental Science 8(6): 492–499.
Bill, Cory, Jacopo Romoli, Florian Schwarz and Stephen Crain
2016 Scalar implicatures versus presuppositions: The view from acquisition. Topoi
35(1): 57–71.
Bott, Lewis, Todd M. Bailey and Daniel Grodner
2012 Distinguishing speed from accuracy in scalar implicatures. Journal of Memory
and Language 66(1): 123–142.
Bott, Lewis and Ira A. Noveck
2004 Some utterances are underinformative: The onset and time course of scalar
inferences. Journal of Memory and Language 51(3): 437–457.
Breheny, Richard, Napoleon Katsos and John Williams
2006 Are generalised scalar implicatures generated by default? An on-line investi-
gation into the role of context in generating pragmatic inferences. Cognition
100(3): 434–463.
Carston, Robyn
1998 Informativeness, relevance and scalar implicature. In: Robyn Carston and Seiji
Uchida (eds.), Relevance Theory: Applications and implications, 179–238.
Amsterdam/Philadelphia: John Benjamins Publishing.
Chemla, Emmanuel and Lewis Bott
2013 Processing presuppositions: Dynamic semantics vs pragmatic enrichment.
Language and Cognitive Processes 28(3): 241–260.
Chierchia, Gennaro
2003 Scalar implicatures, polarity phenomena, and the syntax/pragmatics inter-
face. In: Adriana Belletti (ed.), Structures and Beyond, 39–103. Oxford, UK:
Oxford University Press.
Clark, Eve V.
1987 The principle of contrast: A constraint on language acquisition. In: Brian
MacWhinney (ed.), Mechanisms of Language Acquisition, 1–33. Hillsdale,
NJ/London: Lawrence Erlbaum Associates.
Clark, Eve V.
1988 On the logic of contrast. Journal of Child Language 15(2): 317–335.
Davidson, Donald
1967 Truth and Meaning. Synthese 17(3): 304–323.
Davies, Catherine and Napoleon Katsos
2010 Over-informative children: Production/comprehension asymmetry or tolerance
to pragmatic violations? Lingua 120(8): 1956–1972.
De Neys, Wim and Walter Schaeken
2007 When people are more logical under cognitive load: Dual task impact on scalar
implicature. Experimental Psychology 54(2): 128–133.
Dieussaert, Katrien, Suzanne Verkerk, Ellen Gillard and Walter Schaeken
2011 Some effort for some: Further evidence that scalar implicatures are effortful.
The Quarterly Journal of Experimental Psychology 64(12): 2352–2367.
Dummett, Michael
1959 Truth. Proceedings of the Aristotelian Society 59(1): 141–162.
Feeney, Aidan, Susan Scrafton, Amber Duckworth and Simon J. Handley
2004 The story of some: Everyday pragmatic inference by children and adults.
Canadian Journal of Experimental Psychology 58(2): 121–132.
Foppolo, Francesca, Maria Teresa Guasti and Gennaro Chierchia
2012 Scalar implicatures in child language: Give children a chance. Language
Learning and Development 8(4): 365–394.
Geurts, Bart
2010 Quantity Implicature. Cambridge, UK: Cambridge University Press.
Gibbs, Raymond W.
1994 The Poetics of Mind: Figurative Thought, Language, and Understanding.
Cambridge, UK: Cambridge University Press.
Glucksberg, Sam
2003 The psycholinguistics of metaphor. Trends in Cognitive Sciences 7(2): 92–96.
Grassmann, Susanne, Marén Stracke and Michael Tomasello
2009 Two-year-olds exclude novel objects as potential referents of novel words
based on pragmatics. Cognition 112(3): 488–493.
Grice, H. Paul
1989 Studies in the Way of Words. Cambridge, MA: Harvard University Press.
Grodner, Daniel J., Natalie M. Klein, Kathleen M. Carbary and Michael K. Tanenhaus
2010 “Some,” and possibly all, scalar inferences are not delayed: Evidence for
immediate pragmatic enrichment. Cognition 116(1): 42–55.
Guasti, Maria Teresa, Gennaro Chierchia, Stephen Crain, Francesca Foppolo, Andrea
Gualmini and Luisa Meroni
2005 Why children and adults sometimes (but not always) compute implicatures.
Language and Cognitive Processes 20(5): 667–696.
Hirschberg, Julia
1991 A Theory of Scalar Implicature. New York: Garland.
Horn, Laurence
1984 Toward a new taxonomy for pragmatic inference: Q-based and R-based impli-
cature. In: Deborah Schiffrin (ed.), Meaning, Form, and Use in Context,
11–42. Washington, DC: Georgetown University Press.
Horowitz, Alexandra C. and Michael C. Frank
2015 Sources of developmental change in pragmatic inferences about scalar terms.
Proceedings of the 37th Annual Conference of the Cognitive Science Society.
Austin, TX: Cognitive Science Society.
Huang, Yi Ting and Jesse Snedeker
2009 Semantic meaning and pragmatic interpretation in 5-year-olds: Evidence from
real-time spoken language comprehension. Developmental Psychology 45(6):
1723–1739.
Hunt, Lamar, Stephen Politzer-Ahles, Linzi Gibson, Utako Minai and Robert Fiorentino
2013 Pragmatic inferences modulate N400 during sentence comprehension:
Evidence from picture–sentence verification. Neuroscience Letters 534: 246–251.
Katsos, Napoleon
2014 Scalar implicature. In: Danielle Matthews (ed.), Pragmatic Development in
First Language Acquisition, 183–198. Amsterdam: John Benjamins.
Katsos, Napoleon and Dorothy V. M. Bishop
2008 Pragmatic tolerance. Paper presented at the XI International Congress for the
Study of Child Language (IASCL), Edinburgh, 28 July – 1 August.
Katsos, Napoleon and Dorothy V. M. Bishop
2011 Pragmatic tolerance: Implications for the acquisition of informativeness and
implicature. Cognition 120(1): 67–81.
Katsos, Napoleon and Nafsika Smith
2010 Pragmatic tolerance and speaker-comprehender asymmetries. In: Katie
Franich, Kate M. Iserman and Lauren L. Keil (eds.), The 34th Boston Univer-
sity Conference in Language Development – Proceedings, 221–232. Boston:
Cascadilla Press.
Katsos, Napoleon and Elspeth Wilson
Forthcoming Acquiring implicatures. In: Klaus P. Schneider and Elly Ifantidou (eds.),
Developmental and Clinical Pragmatics. Berlin/Boston: de Gruyter Mouton.
Levinson, Stephen C.
2000 Presumptive Meanings: The Theory of Generalized Conversational Implica-
ture. Cambridge, MA: MIT Press.
Lidz, Jeffrey and Julien Musolino
2002 Children’s command of quantification. Cognition 84(2): 113–154.
Liebal, Kristin, Tanya Behne, Malinda Carpenter and Michael Tomasello
2009 Infants use shared experience to interpret pointing gestures. Developmental
Science 12(2): 264–271.
Markman, Ellen M.
1989 Categorization and Naming in Children. Cambridge, MA: MIT Press.
Markman, Ellen M.
1990 Constraints children place on word meanings. Cognitive Science 14(1): 57–77.
Markman, Ellen M. and Gwyn F. Wachtel
1988 Children’s use of mutual exclusivity to constrain the meanings of words. Cog-
nitive Psychology 20(2): 121–157.
Miller, Karen, Cristina Schmitt, Hsiang-Hu Chang and Alan Munn
2005 Young children understand some implicatures. In: The 29th Annual Boston
University Conference on Language Development – Proceedings, 389–400.
Boston: Cascadilla Press.
Nieuwland, Mante S., Tali Ditman and Gina R. Kuperberg
2010 On the incrementality of pragmatic processing: An ERP investigation of infor-
mativeness and pragmatic abilities. Journal of Memory and Language 63(3):
324–346.
Noveck, Ira A.
2001 When children are more logical than adults: Experimental investigations of
scalar implicature. Cognition 78(2): 165–188.
Noveck, Ira A. and Andres Posada
2003 Characterizing the time course of an implicature: An evoked potentials study.
Brain and Language 85(2): 203–210.
Noveck, Ira A. and Anne Reboul
2008 Experimental pragmatics: A Gricean turn in the study of language. Trends in
Cognitive Sciences 12(11): 425–431.
Panizza, Daniele, Gennaro Chierchia and Charles Clifton Jr.
2009 On the role of entailment patterns and scalar implicatures in the processing of
numerals. Journal of Memory and Language 61(4): 503–518.
Papafragou, Anna and Julien Musolino
2003 Scalar implicatures: Experiments at the semantics–pragmatics interface. Cog-
nition 86(3): 253–282.
Papafragou, Anna and Niki Tantalou
2004 Children’s computation of implicatures. Language Acquisition 12(1): 71–82.
Pouscoulous, Nausicaa, Ira A. Noveck, Guy Politzer and Anne Bastide
2007 A developmental investigation of processing costs in implicature production.
Language Acquisition 14(4): 347–376.
Roeper, Thomas
2004 Diagnosing language variations: Underlying principles for syntactic assess-
ment. Seminars in Speech and Language 25(1): 41–56.
Roeper, Tom, Petra Schulz, Barbara Z. Pearson and Ina Reckling
2007 From singleton to exhaustive: The acquisition of wh-. In: Michael Becker and
Andrew McKenzie (eds.), The 3rd Conference on the Semantics of Under-
represented Languages in the Americas – Proceedings, 87–10. Amherst, MA:
University of Massachusetts.
Sauerland, Uli
2004 Scalar implicatures in complex sentences. Linguistics and Philosophy 27(3):
367–391.
Schulze, Cornelia, Susanne Grassmann and Michael Tomasello
2013 3-year-old children make relevance inferences in indirect verbal communica-
tion. Child Development 84(6): 2079–2093.
Schulze, Cornelia and Michael Tomasello
2015 18-month-olds comprehend indirect communicative acts. Cognition 136:
91–98.
Sedivy, Julie C.
2003 Pragmatic versus form-based accounts of referential contrast: Evidence for
effects of informativity expectations. Journal of Psycholinguistic Research
32(1): 3–23.
Sedivy, Julie C., Michael K. Tanenhaus, Craig G. Chambers and Gregory N. Carlson
1999 Achieving incremental semantic interpretation through contextual representa-
tion. Cognition 71(2): 109–147.
Skordos, Dimitrios and Anna Papafragou
2016 Children’s derivation of scalar implicatures: Alternatives and relevance.
Cognition 153: 6–18.
Smith, Carol L.
1980 Quantifiers and question answering in young children. Journal of Experimental
Child Psychology 30(2): 191–205.
Southgate, Victoria, Coralie Chevallier and Gergely Csibra
2010 Seventeen-month-olds appeal to false beliefs to interpret others’ referential
communication. Developmental Science 13(6): 907–912.
Sperber, Dan and Deirdre Wilson
[1986] 1995 Relevance: Communication and Cognition. Oxford, UK: Blackwell.
Tomasello, Michael
1992 The social bases of language development. Social Development 1(1): 67–87.
Veenstra, Alma, Bart Hollebrandse and Napoleon Katsos
Submitted Why some children accept under-informative utterances: Lack of compe-
tence or Pragmatic Tolerance? Manuscript submitted for publication.
Wynn, Karen
1992 Addition and subtraction by human infants. Nature 358(6389): 749–750.
Yoon, Erica J., Yunan Charles Wu and Michael C. Frank
2015 Children’s online processing of ad-hoc implicatures. Proceedings of the 37th
Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive
Science Society.
11. Psycholinguistic production tasks
Raymond W. Gibbs, Jr.

Abstract: This chapter reviews several of the important experimental tasks
employed within psycholinguistics to study pragmatic language production.
These tasks range from asking what one would say in different contexts and why people
speak in certain ways, to describing scenes and reading stories, answering questions,
cooperative communication games, and language production in multimodal con-
texts. A major theme here is that language production is not an isolated psycholog-
ical process which is separate from language comprehension. Instead, pragmatic
language production must be situated and theoretically understood as a “joint” or
“coupled” activity involving both speakers and listeners as they attempt to coop-
erate and coordinate. Moreover, pragmatic language production is by no means a
modular linguistic activity as it is tightly linked to many ongoing non-linguistic,
bodily processes in human communicative interaction.

1. Introduction

Psycholinguistics has always suffered from an imbalance in its attention to language
comprehension and production. On the one hand, it is relatively easy to study
language understanding. Participants in an experiment can be presented with some
linguistic stimuli (e. g., phonemes, words, phrases, sentences, longer stretches of
discourse) and asked to perform some task that will reveal insights into the processes
and products of interpretation. There is a vast array of methods by which different
levels of language understanding may be examined and theoretically described.
On the other hand, language production is far more challenging to study in con-
trolled experimental conditions. This is especially true for topics within linguistic
pragmatics. For example, imagine that I am interested in the psycholinguistics of
figurative language production, such as when and how people produce metaphors,
idioms, irony, and so forth. It makes little sense to present someone with a phrase,
such as “John kicked the bucket”, and then have them repeat it as a method for exploring
the dynamics of how idioms, in this case, are accessed and uttered in context. After
all, we produce language on our own without some researcher telling us what to
say in advance. Describing the sequence of psychological operations leading from
idea to speech requires that we investigate how speakers produce language on their
own. How can we get people to say things related to different, specific pragmatic
phenomena and do so in a naturalistic way characteristic of ordinary language
production processes?

https://doi.org/10.1515/9783110424928-011
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 281–303. Berlin/
Boston: De Gruyter Mouton.

Much work on language production examines what people really say in dis-
course. Corpus linguistic studies provide a wealth of evidence about different prag-
matic phenomena, which is complemented by various scholarly analyses of prag-
matic language use by linguists and philosophers. Nonetheless, the examination of
complex segments of naturalistic discourse is limited in terms of what it can reveal
about specific psychological processes of language production. Corpus analyses,
for example, are less amenable to testing specific hypotheses about exactly
how and when people speak as they do in context. For this reason, different
methods have been created and employed that enable researchers to investigate
alternative hypotheses about people’s language use under controlled conditions.
My goal in this chapter is to describe some of the attempts to examine prag-
matic language production within experimental psycholinguistics. A major theme
of this review will be that language production is not an isolated psychological pro-
cess which is separate from language comprehension. Instead, pragmatic language
production must be situated and theoretically understood as a “joint” or “coupled”
activity involving both speakers and listeners as they attempt to cooperate and
coordinate. Moreover, language production is by no means a modular linguistic
activity as it is tightly linked to many ongoing non-linguistic, bodily behaviors
in human communicative interaction. I explore these issues through discussion of
different experimental tasks including what would you say in different contexts,
why people speak in certain ways, describing scenes and reading stories, answering
questions, cooperative communication games, and language production in multi-
modal contexts.

2. What would you say?

The first task of interest here is one that asks people to say something in very
particular social and pragmatic conditions. For example, we are all accustomed to
making promises to other people in our daily conversations. A speaker may say
“I promise to meet you for lunch at noon” or “I’ll meet you for lunch at noon” to
indicate that he or she will have lunch with the other person tomorrow at noon.
When a speaker utters either of these statements, is he or she obligated to actually
show up at the scheduled time? If so, where does the obligation to show up on time
actually come from?
Philosophers of language argue that there are rules guiding people’s production
of promises. Within the theory of speech acts, three conditions, known as felicity
conditions, must hold true for a promise to be made (Searle 1965). These are that
(1) the speaker intends his utterance to count as the undertaking of the obligation
referred to (the obligation condition), (2) the speaker believes that the listener would
prefer him to do what was promised (the hearer preference condition), and (3) it is
not obvious to either the speaker or the hearer that the speaker would do what he
has promised in the ordinary course of events (the non-evident condition). Each
of these conditions is seen as necessary for the felicitous production of a promise
and, taken collectively, the set of conditions is sufficient for a promise to have
been made.
Gibbs and Delaney (1987) experimentally investigated the pragmatic factors
that determine how people actually make and understand promises. We explored
whether people have tacit knowledge of the felicity conditions listed above, which
affects how promises are produced and interpreted. Participants were presented
with stories that depicted a person about to say something concerning a future
event. These stories were either consistent with the three felicity conditions (i. e.
obligation, hearer preference, and non-evident) or violated one of them. Partici-
pants read each story and produced an utterance that they would say in the situ-
ation. Afterwards, participants went back and rated, on a seven-point scale, their
utterances on the extent to which each one represented a promise.
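The design just described can be laid out as a brief sketch. The condition labels and the seven-point scale come from the description above; the function names and data structure are illustrative assumptions and are not part of the original study materials.

```python
# Felicity conditions for promising, as named in the text (after Searle 1965).
FELICITY_CONDITIONS = ("obligation", "hearer preference", "non-evident")

# Participants rated their own utterances on a seven-point scale.
RATING_SCALE = range(1, 8)


def story_conditions():
    """Return the story conditions: one 'normal' story type that is consistent
    with all three felicity conditions, plus one story type per violated
    condition (hypothetical representation, for illustration only)."""
    conditions = [{"name": "normal", "violated": None}]
    for condition in FELICITY_CONDITIONS:
        conditions.append({"name": f"{condition} violation", "violated": condition})
    return conditions


def is_valid_rating(rating):
    """Check that a promise rating falls on the seven-point scale."""
    return rating in RATING_SCALE


for condition in story_conditions():
    print(condition["name"])
```

Laid out this way, the study compares four story types: the normal condition and three violation conditions, each of which departs from the normal condition in exactly one felicity condition.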
We hypothesized that if people have tacit knowledge governing how they make
promises, then violating any of these conditions should affect participants’ ratings
of the utterances they produce as promises. For example, suppose that you have
been mowing the lawn once a week for the past three summers and that everyone
in your family expects you to do so. One day you say to a member of your family
“I’ll mow the lawn this afternoon”. According to Searle, this utterance should not
count as a promise because it is obvious to both you and the hearer that you would
have mowed the lawn in the normal course of events. This violates the non-evident
felicity condition in that there is no point in making a promise if the action to be
performed would have been done anyway.
The participants in this study produced a range of utterance types, including
most frequently statements of future acts (e. g., “I’ll take out the garbage”), state-
ments of fact (e. g., “I don’t mind taking out the garbage”) and reassurances alone
(e. g., “Don’t worry about the garbage, really”). These utterances generally fit into
Searle’s scheme of felicity conditions because people referred to one of the conditions
when making their promises, such as when a speaker predicates a future action.
An analysis of the promise ratings showed that the utterances produced in the
normal and different violation conditions were not equally promise-like. People
gave higher promise ratings to utterances generated in the normal condition than
in each of the obligation, hearer preference, and non-evident violation conditions.
But people gave higher ratings to utterances produced in the non-evident violation
condition than in the other two violation conditions, indicating that promises can be made in
situations where the speaker would have done the action in the normal course of
events. A second study in this same series found similar results when a different
group of participants rated the utterances produced by people from the first exper-
iment. Overall, the rating data are consistent with the idea that people implicitly
believe that certain conditions should hold for promises to be felicitous, and this
seems especially true for the obligation and hearer preference felicity conditions.
It would be ideal to conduct an experiment like the one above with people making
promises in real-life contexts in which they were actively involved, simply stating
what they intended to a real addressee (i. e. the recipient of the promise). However,
it would be very difficult to discover, or even create, systematically different
real-life situations that exemplify the key characteristics of Searle’s felicity
condition theory of promising. Asking individual participants to make promises in a
range of contexts and then reflect on their promise-like behaviors offers a good
compromise, allowing researchers to draw inferences about what people
implicitly believe is needed to make socially appropriate promises.
One long-standing debate within speech act theory is whether people can
implicitly produce different illocutionary acts (e. g., directives, assertives, commissives,
expressives, declaratives) without explicitly noting that they are giving
a direction, making an assertion, making a promise, and so forth. Thus, can one
make a promise by saying “I’m sure that I will have it finished tomorrow”, without
having to include the explicit performative phrase “I promise”? A series of experiments
examined whether participants could make different speech acts without
the inclusion of explicit performatives (Holtgraves 2005). Participants were asked
to imagine that they were speakers in specific situations and to indicate what they
would say in each one. Most importantly, participants were precluded from using a
relevant speech act verb (e. g., “I promise,” “I apologize,” “I request”) when stating
what they would say. Consider, for example, the following situation:
You and a close friend are roommates. Your friend is very forgetful. You know that your
roommate has a dentist appointment today which you are sure he has forgotten. You’re
eating breakfast together and you want to remind your roommate of the dentist appoint-
ment. What exactly would you say (please do not include “remind” in your remark)?
(Holtgraves 2005: 207)

Participants’ responses to situations depicting four major speech act types (i. e.
assertives, expressives, directives, and commissives) were then analyzed by a second
group of participants who wrote down a single word that best described the
specific action that they believed the speaker was performing with his remark. The
main finding was that speakers most typically referred to one of the most relevant
felicity conditions for a particular speech act. Thus, when people had to remind the
roommate of the dentist appointment, they made reference to a particular state
of affairs in the world (e. g., “You have a dentist appointment later”). Similarly,
when people made promises, again without being able to use the words “I prom-
ise,” they referred to a future action (e. g., “I swear I will see you tomorrow”),
which is one of the key felicity conditions associated with the appropriate making
of promises in discourse. It appears, then, that people are capable of producing
implicit performatives by referring to the felicity conditions that support a particu-
lar linguistic utterance as conveying a specific speech act meaning.
The main drawback of this production task is that it does not directly examine
whether people routinely produce implicit performatives even when they are free
to explicitly state the relevant speech act function (e. g., “I promise”). Placing
restrictions on how different speech acts are made (e. g., no explicit performatives)
may be unnatural, yet it enables researchers to more systematically investigate the
role that certain pragmatic features have in shaping speakers’ verbal behaviors.
A slightly different version of the “what would you say” paradigm explores the
pragmatics of making and understanding indirect requests. Imagine that you need
to find the time as you walk down a crowded street. You go up to one person and
say, “Excuse me,” followed by one of the following indirect requests:
“Do you have the time?”
“Can you tell me the time?”
“Would you mind telling me the time?”

These utterances are all indirect because they do not state a clear imperative (e. g.,
“Tell me the time”). Still, does it matter which of the above sentence forms you use
to make your request for the time? Making requests of others typically interrupts
what a listener is doing, and speakers must find a way of inserting their request into
the conversation to maximize the possibility of the listener fulfilling the request.
One possibility is that these indirect forms of making a request, along with many
others, are equally good, with some of them perhaps becoming quite conventional
to use for mostly arbitrary reasons. However, empirical research suggests that peo-
ple formulate their requests in specific ways to deal with the listener’s greatest
potential obstacles in complying with the request (Francik and Clark 1985; Gibbs
1986). For example, if a listener may not possess the desired information or object
then the “Do you have …?” form may be most appropriate. On the other hand, if
a person’s ability to fulfill the request is in question, then the form “Can you tell
me …?” may be preferable.
One experimental test of this obstacle hypothesis brought participants to dif-
ferent locations on a university campus, each of which was carefully designed to
highlight a different potential obstacle (Gibbs 1986). For example, an experimenter
and a participant went inside the university library and walked over to a table
where a student, who was specifically set up for this situation, was busily working
on a paper assignment. The participant was told to imagine sitting near the student
and also working on a paper when his pen suddenly ran out of ink. Participants
were then asked to state what they would say to the nearby student in order to get
that addressee to lend them a pen.
Overall, participants produced requests that specified the most likely obstacles
for addressees 74 percent of the time. Furthermore, other laboratory studies demon-
strated that people find it easiest to interpret indirect requests that appropriately
specify the main obstacle for addressees. People take less time, for example, to
understand “Do you have …?” requests when they are stated in possession obstacle
contexts than in ability obstacle contexts. People take less time to infer the
meanings of “Can you …?” in ability contexts than when the context highlighted
the possession obstacle. The obstacle hypothesis, therefore, provides a strong con-
straint on people’s production and interpretation of indirect requests.
There is one advantage in asking people to be physically present in real-world
scenes before stating what they would say to the addressees in these contexts.
Rather than asking people to imagine being in a situation by reading a short par-
agraph in which different pragmatic features are explicitly emphasized, putting
people in the actual scene forces them to assess in their own way what is most
appropriate to do or say (i. e. observing different obstacles). This method better
approximates real pragmatic language production, and does so in a controlled sys-
tematic manner that would not be possible to accomplish by simply observing
people making requests in real life.

3. Why would you say it?

A different approach for studying pragmatic language production explores the reasons
why speakers talk in specific ways. For example, why do people employ
pragmatic figures of speech such as metaphors, idioms, ironies and so forth in their
speech and writing? These different types of figurative language may be moti-
vated by different pragmatic purposes. One empirical attempt to examine why
people use a variety of figurative/indirect language presented participants with
10 examples each of eight kinds of “figurative” (or at least indirect) language:
hyperbole, idioms, indirect requests, ironies, understatements, metaphors, rhetor-
ical questions, and similes (Roberts and Kreuz 1994). Participants were asked to
read the examples, and to then generate three other examples of each figure. After
completing this task, participants listed reasons why speakers might use the
particular forms in discourse.
Participants’ listed reasons were organized into a taxonomy
of discourse goals. Care was taken in creating the taxonomy to identify unique
goals (e. g., “to be comical” and “to be funny” were seen as satisfying the same
goal of “to be humorous”), although it is not clear exactly how this was prac-
tically done by raters in this study. For example, the goals of “to add interest,”
“to get attention,” “to emphasize,” and “to provoke thought” show many sim-
ilarities, as perhaps do “to be conventional,” “to be polite,” and “to protect the
self.” Some goals may also reflect clearly identifiable cognitive or social motiva-
tions (e. g., “to be polite” appears to have a strong social motivation), while other
goals, such as “to clarify” and “to contrast differences” may be motivated by a
combination of cognitive, social, and even emotional factors. These classification
problems highlight one difficulty in enumerating specific reasons for speaking
figuratively.
Nonetheless, Roberts and Kreuz’s (1994) analysis revealed both a great deal
of similarity and variation in what different figures appear to socially accomplish.
Seemingly unrelated figures, such as irony and simile, were often shown to accom-
plish similar discourse goals. At the same time, different figures also accomplished
quite a large number of different goals. Indeed, out of the total number of 19
unique discourse goals reported for all the figures (not including miscellaneous
goals collectively labeled as “other”), the average number of goals accomplished
by each figure of speech was 14.6 (77 % of all the possible goals). This diversity of
discourse goals was not exclusive to a small group of tropes, because people listed
between 12 and 18 goals per figure (63 %–95 % of all the goals mentioned).
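The percentages reported here follow directly from the total of 19 unique discourse goals. As a quick check of the arithmetic, the sketch below recomputes them; the function name is an illustrative assumption, and only the figures stated in the text (19, 14.6, 12, and 18) are used.

```python
# Total number of unique discourse goals reported for all the figures,
# excluding the miscellaneous "other" category (figure taken from the text).
TOTAL_GOALS = 19


def share_of_goals(n_goals, total=TOTAL_GOALS):
    """Express a number of goals as a whole-number percentage of all goals."""
    return round(100 * n_goals / total)


mean_share = share_of_goals(14.6)  # average number of goals per figure
low_share = share_of_goals(12)     # fewest goals listed for any figure
high_share = share_of_goals(18)    # most goals listed for any figure

print(mean_share, low_share, high_share)
```

Rounded to whole numbers, these shares come to 77 %, 63 %, and 95 %, matching the values given in the text.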
One of the most notable commonalities among the figures was that metaphor
and simile, not surprisingly, shared the discourse goals of “to compare similarities,”
“to provoke thought,” and “to clarify.” Interestingly, these two figures
also differed, given people’s judgments that simile, but not metaphor, exhibited the
goals of “to be humorous” and “to deemphasize.” People also generally thought
that several figures accomplished the social, pragmatic goal of “to be humorous”,
namely hyperbole, irony, simile, and idioms. However, the goal of “to be polite”
was only noticeably indicated for indirect requests among all the tropes.
One problem with the Roberts and Kreuz (1994) study is that it did not com-
pare reasons for using different kinds of figurative language against nonfigurative
expressions. Although there are clearly problems in drawing principled distinc-
tions between figurative and nonfigurative language, it may still be useful to con-
trast people’s intuitions about why, for example, a metaphor may be differentially
informative in some context compared to some other nonmetaphoric expression.
Conducting this kind of study may provide data that point to a more global set of
reasons why figurative language could have special rhetorical purposes. For
example, several scholars have emphasized the importance of ambiguous, includ-
ing indirect and figurative, language for keeping language “expressive”, capable of
evoking a rich layer of propositional, affective, and social meanings, and enabling
speakers to convey mastery over some complex situation (Colston 2016). These
reasons for speaking figuratively are more general than those emerging from the
Roberts and Kreuz study (e. g., “to clarify,” “to be comical”), and depending on the
particular trope, may supersede people’s local aims when using one specific trope,
as opposed to another, in discourse.
A different concern with this study is that participants were asked
to speculate about the possible reasons for using a particular category of figurative
language in an abstract manner, apart from realistic discourse contexts. These
introspections are relevant to understanding people’s folk ideas about the functions of
figurative language, but are less informative about the complex motivations for why
people pragmatically produce figures of speech, and the specific functions of these
expressions, in realistic speech and writing contexts. For example, ironic language
is often disparaging and critical, but depending on the context and specific utterance,
irony can accomplish these goals while driving a large social wedge between
speakers and some listeners (i. e. addressees in the case of sarcasm) but not oth-
ers (i. e. overhearers who are friends of the speaker). Other kinds of irony (i. e.
jocularity with friends) can be humorous and actually increase intimacy between
speaker and listeners, despite their superficial negativity. The general problem is
that asking people to think about the reasons for what they may say apart from
real contexts opens up too much of a gap between the data obtained and a sensible
account of real-world pragmatic behavior. At the very least, people should be asked
for their communicative motivations in discourse contexts that are systematically
varied along different pragmatic dimensions.

4. Describing scenes and reading stories

One method for analyzing people’s language production is to have them watch
a short video and then verbally describe its contents to a listener. This task has
been quite useful for examining people’s productions of gestures, especially those
that exhibit metaphoric representations of concepts, called “metaphorics” (McNeill
1992: 73). For example, in one study participants were shown a cartoon and then
asked to describe what they saw to another person (McNeill 1992: 74). At an early
point, the speaker said:
(1) “It was a Sylvester and Tweety cartoon”.
(hands rising to offer an object)

The raising of the hands here suggests that the speaker is offering the listener
a material object, referring to the cartoon event or the cartoon genre. Thus, the
speaker makes the abstract idea concrete by forming an image of a bounded, spa-
tially localizable object supported in the hands and offered to the listener for her
consideration. This metaphorical mapping is motivated by the CONDUIT meta-
phor in which language, meaning, knowledge or works of art are presented as a
physical container into which substances are placed and the whole is moved along
a conduit (Reddy 1979). Asking people to describe the cartoon they saw elicits this
kind of pragmatic, metaphoric understanding of the video and its contents.
A related topic of interest to linguists and psychologists is the role that iconicity
plays in verbal language production. For example, linguists have long observed that
language often parallels the physical characteristics of real-world objects and events.
Many words convey semantic information through their forms, which suggests that
form-meaning pairings are far from arbitrary. Indeed, some research showed that
speakers modulate their prosody in iconic ways. One study asked participants to
describe the direction of a dot moving on a computer screen, either up, down, left
or right (Shintel, Nusbaum, and Okrent 2006). Speakers raised their pitch when
describing upward movements and lowered their pitch when referring to downward
movements of the dot. They also spoke faster when describing a fast-moving dot and
decreased their articulation rates when describing a slow-moving dot.
These findings on the iconic modulation of speech have been extended using a
production task in which people read different short stories that contrasted along
different elements of meaning. People modulated their pitch when reading stories
about high locations and upward movements versus low locations and downward
movements (Clark, Perlman and Johansson Falck 2014), and about small- versus
big-sized objects (Perlman, Clark and Johansson Falck 2015). Moreover, speakers
modulated their articulation rates when reading stories about fast versus slow-
paced events (Perlman et al. 2015), a finding that has also been observed when
people were asked to spontaneously describe depictions of fast versus slow events
on short video clips (Perlman 2010).
One other production study explored whether people modulate their prosody
when speaking about both concrete (e. g., fast driving) and abstract, metaphorical
(e. g., fast career progress) events (Perlman et al. 2015). Participants read aloud
stories referring to fast rates of speed more quickly than they did slow stories for
both the concrete and metaphorical events. They also read both types of stories in
lower pitch when these referred to events that were physically heavier (e. g., lifting
a heavy object) or metaphorically more important (e. g., having an important meeting).
These findings suggest that people’s metaphorical understanding of events
influences the vocal quality of their speech when talking about these events. For
example, noting that a metaphorical event refers to a fast “life is a journey” or a
heavy “importance is weight” situation alters the vocal quality of speakers’ language
productions.
Of course, asking people to read aloud stories is not the same as having them
produce language on their own in specific discourse contexts. But the reading task
is still useful for examining people’s in-the-moment appreciation of iconicity, and
other possible pragmatic constraints, in the speech planning process.
There is a significant literature on narrative language production in which peo-
ple see a film and then describe it to others, such as the project on the “Pear Stories”
(Chafe 1980). Examination of these narratives enables scholars to detail a range of
cultural, cognitive, and linguistic factors in immediate language production, especially
regarding how words unfold during a monologue. A good deal of this, and
related, work is devoted to examining people’s conceptualization processes when
they verbally describe events, including the use of conceptual metaphors in speaking
about personal and emotional experiences (Gibbs and Franks 2002), and how
different languages shape these event construals (von Stutterheim and Nüse 2003).
Some elicitation research has also, quite naturally, investigated how speakers pay
attention to listeners’ specific needs, or perspectives, when introducing topics or
referents (Smith et al. 2005).
There are a variety of narrative elicitation techniques, used primarily by linguists
and sociolinguists, in which participants are, once again, given specific stimuli
to watch and describe or are asked to recall past life events. However, most of these
techniques, and the resulting empirical findings, are not examples of psycholinguistic
production tasks per se. For example, most of the work on narrative elicitation is
not conducted within an experimental framework in which participants engage in
different tasks within different experimental conditions (e. g., with different ori-
enting instructions). At the same time, these studies do not typically analyze all the
data collected from all participants, but focus more selectively on certain examples
within the corpora created to illustrate specific theoretical points (e. g., how a
person speaking one language describes a short film differently from how another
person, speaking a different language, describes the same film). Moreover, few of these
studies test particular experimental hypotheses with explicit performance predictions.
These observations are not intended as criticisms of any of this line of
linguistic research. I only offer these comments to suggest that narrative elicitation
techniques require an analysis and evaluation that extend beyond what is typically
seen within experimental psycholinguistics.

5. Answering questions

Asking people to answer specifically worded questions in different contexts has
also been shown to be a very useful psycholinguistic production task. For example,
one enduring issue in the study of linguistic pragmatics is whether people analyze
the literal meaning of linguistic expressions as part of how these are processed and
interpreted (Gibbs 1994; Gibbs and Colston 2012). There is much debate about this
topic, with various reading time experiments offering results which suggest that
people can often understand what others pragmatically imply without first deci-
phering and then rejecting the literal or semantic meanings of utterances (Gibbs
and Colston 2012). One possibility, though, is that people analyze the literal mean-
ings of speakers’ utterances at some point during linguistic processing without it
being fully determined before pragmatic messages are inferred.
One set of studies provided some initial support for this idea using a naturalistic
question answering task (Clark 1979). An experimenter called local merchants on
the telephone and made simple indirect requests about the time these businesses
closed at the end of the day, such as “Can you tell me what time you close?” and
“Will you tell me what time you close?” Many merchants included “yes” in their
responses to these indirect requests as in “Yes, we close at 6 pm”. People presum-
ably included “yes” in their responses to adequately address the literal question
and “we close at 6 pm” to provide the information that was indirectly requested. It
appears that listeners ordinarily analyze the literal meanings of indirect requests,
perhaps in parallel to interpreting the indirect request message. This strategy seems
particularly useful to enable listeners to know when a speaker is being polite, and
therefore requires a polite response in turn.
Yet it is not clear that the inclusion of “yes” in people’s verbal responses to
certain indirect requests is due necessarily to their analysis of an indirect request’s
literal meaning. People may include “yes” simply because it is conventionally
polite to do so even though they do not actually analyze a statement’s literal interpretation.
One reason to suspect that this might be true is that merchants also
included “yes” when verbally responding to the indirect request “Would you mind
telling me what time you close?” People should have included “no” as in “No, we
close at 6 pm” if they were really responding to the literal question asked. In fact,
the mention of “yes” in people’s verbal responses to indirect requests may only signal
a willingness to comply with the implied request rather than reflect some
automatic analysis of speakers’ literal meanings. At least in this case, one must be
careful not to over-interpret people’s answers to questions as necessarily reflecting
different parts of the language interpretation process.
Still, do people produce language that is specifically designed to meet the pre-
sumed needs of their addressees? Psychologists and sociologists have long argued
that speakers design each utterance so that their addressees can figure out what they
intend by considering the utterance against their current common ground (Clark
1996). Some common ground information is cultural (i. e. information broadly
shared by members of a community), and some information is personal (i. e. infor-
mation uniquely shared by two or more speakers). When people converse, they
typically design their utterances to take into account the perspective of the listener,
which facilitates addressees’ understanding of speakers’ communicative intentions.
A simple demonstration of this is seen in a study looking at people’s assump-
tions about mutually known beliefs and knowledge when speaking with others
(Krauss 1987). In this study, an experimenter stopped people on the street in
downtown Boston, Massachusetts, and asked for directions to Jordan Marsh,
a large department store about six blocks away. The experimenter asked a third
of the people, “Can you tell me how to get to Jordan Marsh?” To another third,
the experimenter said, “I’m from out of town. Can you tell me how to get to Jordan
Marsh?” To the remaining third, the experimenter asked, “Can you tell me how to
get to Jordan Marsh?” but did so employing a rural Missouri accent, representative
of a speech style in a different part of the United States.
The addressees’ responses were secretly tape-recorded and analyzed for the
number of words spoken and the number of places en route that were referred to.
When the experimenter prefaced his question with “I’m from out of town,” people
responded with significantly more words and more place names than when asked
this question without the preface. Alerting the addressee to the fact that the speaker
does not share the same community knowledge clearly gets respondents to design
their answers differently. But the respondents also gave longer, more detailed,
answers when the experimenter asked his question without the “I’m from out of
town” preface but spoke with a Missouri accent. Again, people designed their
answers given their assumptions about how easily their addressee could
infer their communicative intentions. This study, therefore, showed in a realistic
situation how people’s language productions are constrained by assumptions
regarding common ground beliefs and information.
As noted earlier, much research debates whether people necessarily must hold
explicit common ground information in all aspects of language production and
understanding. One proposal claims that speakers aim to be optimally relevant in
saying what they do (Sperber and Wilson 1995; Wilson and Sperber 2012). Under
this view, called “relevance theory”, every act of ostensive behavior communicates
a presumption of its own optimal relevance, that is, a presumption that it will be
relevant enough to warrant the addressee’s attention and as relevant as compatible
with the communicator’s own goals and preferences (the Communicative principle
of relevance). Speakers design their utterances to maximize the number of cognitive
effects listeners infer while minimizing the cognitive effort needed to infer them.
Listeners understand speakers’ communicative intentions via the “relevance-the-
oretic comprehension procedure” (Wilson and Sperber 2012), by following a path
of least effort in computing cognitive effects. They do this by testing interpretive
hypotheses (e. g., disambiguations, reference resolutions, implicatures) in the order
of accessibility, and then stopping when their expectations of relevance are satis-
fied.
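The stopping rule in this comprehension procedure can be illustrated with a minimal sketch. It is only a toy rendering under strong assumptions: relevance theory does not assign numeric values to effort or effects, and the names Hypothesis, effort, effects, comprehend, and expected_relevance are hypothetical, introduced here purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Hypothesis:
    """One candidate interpretation of an utterance (hypothetical structure)."""
    reading: str
    effort: float   # assumed processing cost; lower means more accessible
    effects: float  # assumed cognitive effects the reading would yield


def comprehend(hypotheses: Iterable[Hypothesis],
               expected_relevance: float) -> Optional[Hypothesis]:
    """Follow a path of least effort and stop at the first satisfying reading."""
    # Test candidates in order of accessibility (least effort first).
    for candidate in sorted(hypotheses, key=lambda h: h.effort):
        # Stop as soon as expectations of relevance are satisfied.
        if candidate.effects >= expected_relevance:
            return candidate
    # No candidate met the expectation of relevance.
    return None


# Illustrative (invented) values for "Excuse me, do you have the time?"
candidates = [
    Hypothesis("request for the current time", effort=2.0, effects=5.0),
    Hypothesis("question about owning a watch", effort=1.0, effects=1.0),
    Hypothesis("opening move for a longer chat", effort=3.0, effects=6.0),
]
chosen = comprehend(candidates, expected_relevance=4.0)
print(chosen.reading if chosen else "no interpretation found")
```

In this sketch the most accessible reading is tested first but rejected because it yields too few effects, and the search stops at the second reading without ever evaluating the third, even though the third would yield more effects.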
Consider one detailed experimental test of this view in which people answered
simple questions about the time. Imagine a situation in which a stranger approaches
you on the street and says “Excuse me, do you have the time?” If you were wear-
ing a watch, how would you interpret and respond to this person’s request? Some
possible replies include the following:
(2) a. “It’s about 4.”
b. “It’s 3 minutes before 4.”
c. “It is um … 3:57.”

All these responses provide a reasonable answer to the person’s request. But the
three responses differ in the exactness of their time given, the form in which it is
given (minute-hour vs. hour-minute), and whether the answer was given directly
or included other paralinguistic information (pauses and filled pauses).
Although statement (b) provides the same cognitive effect as does (c), it likely
requires more cognitive effort to comprehend than (c), given the extra mental com-
putation needed to derive the exact time of 3:57 from the statement “It is 3 minutes
before 4.” Statement (b) is therefore less optimally relevant because greater effort
is expended than what is required to understand statement (c). At the same time,
the filled pause in (c) may work to signal that an answer is forthcoming which is
indeed worth the addressee’s continued attention. In this manner, statement (c) may
convey an additional cognitive effect over that seen in (b), namely that a highly
relevant answer is forthcoming, which clearly benefits the addressee and may facil-
itate her understanding of the speaker’s communicative intention.
Of course, statement (a), “It’s about 4,” may provide sufficient cognitive effects
with little cognitive effort, unless the questioner first mentions the fact that he
needs to reset his stopped watch. In that case, the approximator “almost” should
supply a highly relevant cognitive effect that the following numerical answer is
just good enough (e. g., “It is almost about 4”). How do people respond to time
requests given some of these considerations? Complicating the pragmatics of the
time answering situation is that some people may wear digital watches and oth-
ers analog watches. Although it may be ideal to answer any “Do you have the
time?” question with an exact answer, doing so when wearing an analog watch may
require more effort than when wearing a digital watch.
However, research shows that when people are asked “Do you have the time?”
they typically provide rounded answers, even when wearing digital watches (Gibbs
and Bryant 2008; van der Henst, Carles and Sperber 2002). The fact that respond-
ents tend to round their answers to time questions, even when wearing digital
watches that provide exact times, suggests that conversational exchanges are not
guided by an egocentric bias to state what is easiest, or to follow a maxim to
always speak truthfully (cf. Grice 1989), both of which would predict that digital
watch wearers should invariably give the exact time. Rather, people aim to provide
answers to questions that are optimally relevant for the circumstances, which in
most cases does not require an exact time.
In other research, people were approached and asked “Do you have the time?”
and their answers tape-recorded (Gibbs and Bryant 2008).1 An analysis of the
responses showed that speakers plan their answers to time questions in specific
ways by often including acknowledgments (“Yeah, it is 10 till 4”), approximators
(“It is about 3:30”), and filled pauses (“It is um 10 till 4”). These linguistic and
paralinguistic cues do not simply indicate that the speaker is experiencing produc-
tion problems, but may function as a green light for the addressee to continue with
the process of deriving relevant cognitive effects (see Finlayson and Corley 2012
for a discussion of this question).
Furthermore, people who wore digital watches and gave exact replies took
longer to plan these than did those who provided rounded answers. Thus, people
with digital watches who saw the exact time actually put more cognitive effort into
producing that exact time than when they produced a rounded answer. But digital
watch wearers did not take longer to produce exact replies in a context where
the original speaker asked “Excuse me, my watch has stopped. Do you have the
time?” This pattern of findings suggests that respondents most easily understood
that giving an exact time was optimally relevant in the case where it appeared
that the questioner wanted the time in order to reset his watch. On the other hand,
respondents were less sure that an exact time was relevant when the question only
stated “Do you have the time?” even though the exact time was easiest to retrieve
for digital watch wearers.

1 These tape-recordings, each one lasting less than 10 seconds, were collected without
asking participants’ permission beforehand. Many participants rushed off after providing
their time responses and so it was not always possible to ask for their explicit
permission to use their answers as data. Not all universities or countries approve of this
degree of lack of consent. More generally, though, there are complex ethical issues in
studying language production in real-world settings, especially in terms of obtaining
naturalistic language evidence without people being potentially biased by knowing that
they are participating in an experiment.

This research on answering time questions is unusual in that it explores peo-
ple’s pragmatic responses in a real-world context, while still measuring response
latencies as is done in typical laboratory psycholinguistic experiments. The beauty
of this method is that it is both naturalistic and produces very detailed information
about the speech planning process as it operates in real-time. Our results indicate
that people appear to be striving for optimal relevance when formulating their
pragmatic responses to people’s indirect time requests. Being optimally relevant
requires that speakers do not aim for the greatest efficiency in an abstract sense, but
instead take pragmatic considerations into account in their immediate evaluation of what to say.

6. Cooperative communication tasks

A central feature of many pragmatic theories is that people use language for coor-
dinating both their individual and joint actions. There have been notable exam-
ples of how coordination may possibly be accomplished in conversation within
the fields of linguistics and philosophy. However, a major development in our
understanding of pragmatic language use has occurred over the past 30 years with
the emergence of psycholinguistic studies employing cooperative communication
tasks. In these studies, experimental participants are asked to perform some task together,
usually with one person directing another to solve some problem, such as arrang-
ing cards or pictures in a certain order or constructing some toy building. The
participants’ performance on these tasks, and the dialogue they engage in to do so,
are then closely analyzed for evidence of coordinative, cooperative linguistic and
nonlinguistic behaviors. Most of these psycholinguistic studies aim to demonstrate
how the accrual of common ground enables speakers and listeners to more readily
coordinate their intentional meanings in discourse.
Consider, for example, an experiment in which two persons talk to each other,
but cannot see each other (Clark and Wilkes-Gibbs 1986). Both sit before sche-
matic drawings of cartoon figures, called tangrams, which are new to both parties.
One conversant describes a specific figure from her set of figures, and the other
identifies the correct picture from his set using the heard description alone. Unsur-
prisingly, participants get better at this task over time. Speakers initially provide
detailed descriptions of the figures to make initial identifications possible, but
over time each pair of dialogue partners eventually evolves a shared idiosyncratic
lingo specific to the given task environment allowing them to pick out figures more
quickly. Thus, on a first trial, one speaker referred to a figure by saying, “All right,
the next one looks like a person who is ice skating, except that they’re sticking
two arms out in front.” But on the sixth trial in this study, the same speaker simply
said, “The ice skater.” These results suggest that understanding what a speaker
intends to communicate, and the criteria by which listeners judge that they have
understood that message, are joint products requiring coordination and cooperation
between listeners and speakers.
Another version of this card-sorting task examined the role of expertise (Isaacs
and Clark 1987). In these studies, pairs of people, some being from New York
(experts) and some not (novices), attempted to arrange a set of postcards with
pictures of different buildings and places in New York City. To the extent that the
director and matcher could establish that each was from New York, more proper
nouns (e. g., the Chrysler Building, Rockefeller Center) would be used to describe
the postcard scenes. If both participants were novices (i. e. not from New York)
far fewer uses of proper names would be expected. If an expert and a novice were
paired, then the use of proper names would increase over time (or trials) as the
experts taught the novices about the names for different postcards.
These general predictions were shown to be correct. There was also an increase
in the efficiency of the conversations as shown by a decrease in the overall number
of words used and the number of turns required to complete the task. Thus, in con-
versations between experts, proper names were used about 80 % of the time while
proper names were used less than 20 % of the time between novices. When an
expert was talking to a novice, the number of proper names initially decreased as
it became clear to the expert that the novice did not know what some of the names
referred to. When novices talked to experts, the number of proper names increased
as some of the expertise “rubbed off” and the names of landmarks were learned
from the expert partner. Experts and novices seemed to have discovered that they
were talking to other experts or novices by the way the conversation proceeded,
because in only 6 of the 32 pairs did participants actually ask or tell the other per-
son whether they were New Yorkers.
Participants in real-life conversations sometimes design their utterances with the
intention of excluding some person from understanding their pragmatic meaning.
One set of experiments explored this type of situation. In this particular case, partic-
ipants in a card-arranging task had to communicate the ordering of photographs of
Stanford University scenes, but there was a third person in the room, provided with
the same set of pictures, and the two conversants had to try to ensure that the third
person did not succeed in the task (Clark and Schaefer 1987). Thus, the speaker
had to ensure that the addressee understood, but had to conceal his meaning from
the overhearer. All three participants were Stanford University students and thus
“experts”, but the two conversants were friends and the overhearer was a stranger.
Because the three participants had the same community membership, it was
expected that the conversants would use “private keys” or information that was
part of their particular common ground, but which was unknown to the overhearer.
Although there were certain instances when the speakers slipped up and uttered the
name of a scene, the vast majority of references contained these private keys. For
example, a speaker referred to a fountain on campus as “where someone wanted
to put my teddy bear.” Overall, the addressees were twice as successful in correctly
arranging the photographs as were the overhearers, suggesting that speakers and
listeners can often successfully hide their communicative intentions from some
people.
Producing language in these dialogue situations is not a ballistic process in
which speakers state what they mean and then hope for the best that addressees
will somehow understand them. A variety of psycholinguistic research demonstrates
that speakers actively, automatically monitor listeners’ reactions to ensure proper
understanding of what was expressed, both linguistically and gesturally. For example,
one study had pairs of participants assemble different Lego models with one
person acting as the director and the other as the builder (Clark and Krych 2004).
For one group of people, the director and builder could see each other and the
builder’s workspace. In a second group, the participants could hear but not see one
another, and in a third group, the director gave only audiotaped instructions to the
builder.
The participants performed the worst in constructing the specific Lego model
when they communicated using an audiotaped message, and somewhat better when
they could hear, but not see, each other. Not surprisingly, people performed the
assembly task best when they could both see and hear one another. Examination
of the discourse showed that directors engaged in a host of actions when speaking
with builders, including exhibiting, poising, and pointing, in addition to using eye
gazes and head nods to communicate their in-the-moment messages. These dif-
ferent linguistic and nonlinguistic actions were also exquisitely timed given what
the builders were doing at any moment. In many instances, directors altered their
utterances midcourse when they sensed that the builders needed to reorient their
specific actions to better complete the overall assembly task.
This version of the communication game task provides an excellent method
for exploring how pragmatic language production is a joint activity involving both
speakers and addressees, and sometimes overhearers. Speakers produce language
not only to express what they mean, but also to ordinarily ground what is said
through a variety of linguistic and nonlinguistic devices. Language production is
not just a matter of verbally articulating one’s private thoughts; it is fundamentally
used for continually updating common ground between individuals given different
real-world adaptive requirements. Furthermore, as Clark and Krych (2004)
emphasize, speech planning is opportunistic in taking advantage of the online
process of language production to alter what is said as problems arise. Finally,
language production is also multi-modal given the intricate blend of verbal and
gestural/bodily processes in dialogue.
The use of cooperative communication games offers systematic evidence on
the ways speakers and listeners, sometimes in the presence of overhearers, coordi-
nate what they say and do to reach joint goals. These experimental tasks are, conse-
quently, ideal for examining how language is used for purposes of both monitoring
and altering the always changing common ground between people in everyday
life. Systematic variations on the design of these experiments provide different
opportunities for testing specific hypotheses on pragmatic language production
that could not be done through the analysis of ordinary, unstructured speech.
There are, not surprisingly, challenges to the idea that speakers always
design their utterances to best meet their listeners’ understanding needs. Under
some circumstances, such as stress or high levels of cognitive burden, speakers can
be more egocentric in their productions than the traditional common ground view
would predict. Listeners do not consistently consider common ground in their com-
prehension (Barr and Keysar 2005). People frequently misjudge the effectiveness
of their own communication precisely because they do not correctly understand
what is, and is not, part of their common ground with others. Speakers who have
learned the meaning of opaque phrases (e. g., “as the goose hangs high” meaning
that something is to your liking), for instance, sometimes overestimate the likelihood
that other people know those meanings (Keysar and Bly 1995). Speakers
also sometimes think their own utterances are less ambiguous and more effective
than they actually are (Keysar and Henly 2002). Nonetheless, these various studies
do not contest the view that common ground exists and may constrain language
production and comprehension. Instead, the argument is over whether initial stages
of speech production and understanding are inherently egocentric, particularly in
moments when speakers experience cognitive stress in some manner.

7. Language production in multimodal contexts

There is an emerging body of research within psycholinguistics on coordination
during conversation that explores the embodied dynamics of how speakers and
listeners make use of common ground. These experimental studies employ tasks
that are often similar to what was described above, such as asking participants to
perform joint tasks together under different conditions. In other studies, however,
people are just given different topics to discuss, sometimes in the presence of spe-
cific visual cues. The notable feature of these studies is that they focus more on dif-
ferent aspects of bodily coordination than on the semantic content of what people
say. Nonetheless, this work offers important lessons on how psychologists and oth-
ers should conceive of pragmatic language productions as an embodied activity in
which spoken language is one part of a large repertoire of conversational behaviors.
For example, one study asked people to verbally describe pictures that were
arranged on an easel (Bangerter 2004). People employed a range of verbal forms
when making their descriptions. Most notably, participants pointed more often to
the pictures when they were physically closer to the easel. Indeed, this pointing
behavior increased as the distance decreased and pointing often replaced verbal
descriptions as people stood closer to the pictures. These data demonstrate how
pointing can be easily and opportunistically employed along with speech when
people describe objects in the world.
Similar to previous studies, some psycholinguistic experiments have explored
how speakers coordinate when talking about a given topic (Richardson, Dale and
Kirkham 2007). For example, one study had two people discuss their favorite char-
acters from one of two TV shows (e. g., “Friends,” and “The Simpsons”); they both
could see pictures of cast members, even though the two speakers could not see
one another. The main interest here was to monitor participants’ eye-movements as
they engaged in conversation. An analysis of when people looked at the pictures,
and for how long, revealed a tight coupling of the eye-movements of the two participants,
especially during the first few seconds after any cast member’s
name was mentioned. Furthermore, when participants were first given a background
story regarding the objects they were looking at in a separate study (e. g.,
looking at paintings by Salvador Dali), there was an even greater coupling of the
participants’ eye-movements. This latter finding suggests that having additional
common ground knowledge helps people exploit other types of common ground
information, namely the shared visual pictures presented. In general, people’s con-
versational behaviors may be coordinated at many levels beyond speech alone.
Many other experiments have clearly demonstrated how people synchronize
their verbal and non-verbal behaviors along multiple dimensions in conversation.
Paxton and Dale (2013) had participants, who did not know one another, talk about
one of two topics: one on which they agreed (affiliation group) or one on which
they disagreed (argumentation group), based on prior
survey results. These conversations were recorded and later analyzed for coordination
among many different verbal and non-verbal dimensions. Most generally,
there was a much greater degree of temporal coordination in their eye gazes, head
nods, body posture, prosody, and so on when the speakers agreed than when they
disagreed.
These studies are only a sample from a large body of research in psycholinguistics
and experimental psychology on the implicit bodily coordination that
arises between speakers in conversation. Most importantly, what people say, and
how they say it, are tightly coupled with a repertoire of nonverbal behaviors. This
linguistic and nonlinguistic coordination is inherently flexible to meet the changing
demands of different conversational contexts. Indeed, the diversity of empirical
findings suggests that no single mechanism drives conversational behaviors. For
this reason, “it is unlikely to be the case that conversational performance and linguistic
interactions can be accounted for in terms of a small single subset of mechanism”
(Dale et al. 2014: 80). Language production must, therefore,
always be characterized as a multimodal process in which language plays a contributing,
but not exclusive, role in what gets conveyed and interpreted. There is
clearly a great need in this emerging body of research on multimodal language use
to examine more of the semantic content and pragmatic implications of what peo-
ple produce with their language. Still, analysis of the pragmatic nature of speaker
meaning should be conceived of as part of embodied communication involving
multiple people working to achieve both individual and common goals.

8. Is some language production quite deliberate?

A widely-held belief within linguistic pragmatics is that speakers make choices
in strategic ways during online language production. The underlying idea is that
language production is not completely automatic because speakers are consciously
aware of what they are doing. People are believed to be occasionally quite thought-
ful when speaking and even produce very specific linguistic expressions with con-
scious deliberation, both when using conventional and novel word formations.
One example of this idea is the proposal on “deliberate metaphor theory” which
assumes that only a small select group of words or utterances really conveys met-
aphoric messages, namely those that are composed and delivered with a deliberate
aim to alert others to particular cross-domain mappings (e. g., Shakespeare’s Sonnet
18 line “Shall I compare thee to a summer’s day?”) (Steen 2008). Many
linguists and literary scholars maintain that different stretches of language must be
deliberately composed and produced for one reason or another, with deliberation
being closely tied to conscious intent.
However, there is actually no psycholinguistic evidence that supports the claim
that some metaphors are produced deliberately with all other so-called metaphors
not really conveying metaphoric, or other forms of, meaning (Gibbs 2015a, b).
Many studies employing different production tasks show that speakers are sensi-
tive to a wide range of constraints when articulating what they mean in context,
such as the studies described in the earlier sections. Yet these experimental findings
do not offer any support for the hypothesis that the context-sensitive production of
certain language is “necessarily” deliberate or conscious. Speakers and listeners
may be influenced by multiple constraints that push them toward adopting certain
verbal practices in different circumstances. This fact does not, in any way, show
that the speakers’ so-called strategic choices are due to special deliberative thought
processes. All language, to one degree or other, is intentional in the sense that
people aim for others to understand their implied intentions given what is stated.
But intentional language production is quite different from a unique process of
deliberate language or metaphor production.
I discuss this topic given the arguments over deliberation in many contexts
within the worlds of metaphor and linguistic pragmatics. At the same time, it is
questionable whether any single psycholinguistic production task will be capable
of unambiguously detecting conscious deliberation when people produce specific
language materials, such as particular metaphors. The main difficulty is that there
simply is no clear distinction between mental processes that are automatic and
those that are fully conscious. First, when people presume that they have performed
some action with deliberative forethought or full awareness, they often mistakenly
believe their behaviors are solely the product of conscious mental processes.
Experimental psychology has dozens of studies that drive home this important point
(Gibbs 2011).
At the same time, when people act automatically, or without deliberate thought,
they mistakenly believe that their actions are not shaped by many interacting per-
sonal, interpersonal, and environmental constraints. For example, skilled drivers
move around in their cars with little conscious thought, unless some problem is
encountered. Yet this so-called automatic behavior is really organized by a complex
set of cognitive, perceptual, and motor skills, all of which operate, again, without
much conscious awareness. For similar reasons, our intuition as pragmatic
scholars that people mostly produce language in one of two modes, automatic or
controlled (conscious), is far too simplistic and fails to acknowledge the dynamic
reality of how people really work, including when they speak or write. As more
and more psycholinguistic production tasks demonstrate, people’s speech planning
behaviors emerge from a constellation of interacting sub-personal, personal, and
contextual factors. No single force, such as a consciousness module, solely drives
the language production process. It is a mistake, then, to argue that some specific
aspects of language are only, or primarily, due to conscious, deliberate production
processes which are completely different from automatic speech planning behav-
iors.

9. Conclusions

Different psycholinguistic production tasks have been developed to systematically
examine various aspects of pragmatic language production. These tasks, which
vary considerably, enable scholars to test specific falsifiable hypotheses and inves-
tigate details of the speech planning process. Of course, there are always disad-
vantages in studying people’s linguistic behavior in scientific contexts, the most
notable one being that laboratory situations do not always approximate real-world
conversational behaviors. The distinct trend in psycholinguistics now is to examine
language production not as a solitary process, but as a joint activity. What speakers
say, and how they do so, is always constrained by their implicit attempts to coop-
erate and coordinate in order to achieve social, pragmatic tasks. At the same time,
language production is tightly linked with people’s nonverbal behaviors along a
variety of dimensions. Experimental findings in support of this conclusion point to
the need for scholars to create theories that are sensitive to the multimodal nature
of everyday language use, in which language itself plays an important, but not
solitary, role in shaping the course of human interaction.

12. Role plays
J. César Félix-Brasdefer

Abstract: This chapter offers a comprehensive account of the role-play method
commonly used in cross-cultural and interlanguage pragmatics. Role plays have
been used to investigate different aspects of the learners’ pragmatic competence
(e. g. pragmalinguistic and sociopragmatic knowledge). This chapter focusses on
the conceptualization of the role-play method for research and assessment pur-
poses, looks at the structure of the role-play task and instructions of the task for
the role-play taker and role-play conductor, reviews existing varieties of role plays,
and explains procedures for the coding and analysis of role play data. The distinc-
tion between closed and open role play is explained, as well as the relevance of the
role-enactment approach. Finally, this chapter ends with an overview of key issues
of validity and reliability, and methodological and ethical issues for researchers
using the role-play method.

1. Introduction

Role plays provide oral data, enable simulations of real-life interactions, and are
used for experimental purposes under controlled conditions. They are used to elicit
interactive data from native (NSs) and non-native speakers (NNSs) in different
interdisciplinary research fields. The role-play method was introduced in social
psychology research during the 1940s (Sarbin 1943), when participants in
psychodramatic experiments were asked to take on roles based on their
previous experience. This method was validated in subsequent experiments that further
examined the role behavior of patients under experimental conditions (Sarbin
and Jones 1955). From a psychological foreign language teaching perspective, the
concepts of social role and role play are complex because this “quasi-dramatic
device” is now used with learners “who do not have the linguistic
skills to express the conventional expectations for that role, in order to develop
just those skills” (McDonough 1986: 80). Within the field of linguistics, role plays
are widely used to collect data in second language acquisition (SLA) (Cohen and
Macaro 2010; Dörnyei 2007; Mackey and Gass 2005), cross-cultural and interlan-
guage pragmatics (Cohen 2012; Félix-Brasdefer 2010; Kasper 2000; Kasper and
Dahl 1991), and linguistic politeness (Félix-Brasdefer 2008). Role plays have been
used to investigate different aspects of the learners’ pragmatic competence, and are
employed for training, for assessment and testing purposes, and for the teaching of
pragmatics in the classroom.

https://doi.org/10.1515/9783110424928-012
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 305–331. Berlin/
Boston: De Gruyter Mouton.

This chapter provides a comprehensive review of the role-play method in the
service of pragmatics among NSs and NNSs in bilingual and multilingual contexts.
It focuses on the conceptualization of the role-play method, role play types, task
design, analysis of role play data, issues of validity and reliability, and methodo-
logical and ethical issues.

2. Role plays: Measurement of online (pragmatic) knowledge

The role-play method has been used to examine different aspects of pragmatic
knowledge among NSs and NNSs, including children, who assume roles in order to
produce and interpret communicative action. According to Crookall and Saunders
(1989), a role play can be defined as “a social or human activity in which partic-
ipants ‘take on’ and ‘act out’ specified ‘roles’, often within a predefined social
framework or situational blueprint (a ‘scenario’)” (1989: 15–16). Within the field
of interlanguage pragmatics (ILP), role plays can be of two types, each of which
has different formats: closed and open role plays (Kasper and Dahl 1991). With
role plays one can control for a series of contextual parameters: the description
of the setting, the degree of social distance and social power between the inter-
locutors, the weight of imposition, gender and age of the participants, context of
learning (foreign language [FL] vs second language [L2]), and proficiency level.
The description of the situation makes explicit or implicit reference to the prag-
matic target that is required of the participants, such as requesting, apologizing,
or refusing. Closed and open role plays differ mainly in the degree
of interaction between the participants and in the amount of contextual information
provided to the interactants.
The role-play method has been widely used in cross-cultural and ILP research.
For example, of the 39 studies in ILP which are reported in Kasper and Dahl
(1991), 33 % (13 of 39 studies) were carried out using role plays (open or closed).
And, of the 51 studies on refusals and rejections conducted in cross-cultural, sin-
gle-moment studies examined in Félix-Brasdefer (2008), 31 % used role plays
(open or closed) (16 of 51 studies).
Influenced by social psychology research (Sarbin and Jones 1955), a distinc-
tion is made between role playing and role enactment. Role playing is “pretending
to react as if one were someone else in a different situation”, while role enactment
is “performing a role that is part of one’s normal life or personality” (McDonough
1986: 80–81). In the former, participants are asked to take on a social role in a
setting that may or may not have happened to them, such as asking a university
student to pretend to be a boss, an employee, a doctor, or the manager of a res-
taurant; in the latter, the role plays are designed to fit participants’ characteristics
based on previous experience, containing characters and difficulties that are known
to be familiar to the participants. Although both types gather simulated data, the
role enactment approach is considered most effective, as it ensures a higher degree
of validity of the data based on previous experience. For example, the role enact-
ment approach was used by Trosborg (1995) in her study of learner data with open
role plays to elicit requests, complaints, and apologies. To increase the degree of
validity of the data, the role plays were “tailor-made to the participants or, at least,
contain problems and characters which were known beforehand to be familiar to
those involved” (1995: 144).
When adopting the role enactment approach, the aim is to construct role sim-
ulations that approximate real-life interactions using a variety of communicative
activities that are embedded in everyday interaction. However, one should keep
in mind that the role enactment approach limits the number of roles assumed by
participants, while role playing may include a wider variety of roles in formal and
informal situations to measure different aspects of pragmatic competence. Overall,
researchers should keep this distinction in mind when designing the role-play task,
in closed or open role plays.

2.1. Closed role plays

Closed role plays (also called Oral Discourse Completion Test [DCT]) elicit one-
turn responses in reaction to a situational prompt with an initiating or reacting
speech act. While there are different types of closed role-play formats, they are
characterized by oral data in non-face-to-face interaction. The participant’s oral
response is recorded and later transcribed and analysed. Walters (1980) represents
one of the early studies that examined children’s L2 requests using a closed role
play. Children were asked to make a request to a puppet which differed on varia-
bles such as age, sex, and race. This data collection technique, in which children
interact with puppets, is often used in L2 developmental pragmatics. Rintell and
Mitchell (1989) compared the performance of requests and apologies of learners
of English as a Second Language (ESL) and NSs of English using a DCT and a
closed role play. Differences were found with regard to the length and content
of the speech acts, with closed role plays producing longer responses that contained
verbal and non-verbal features often found in oral discourse. The responses
included features of oral discourse, namely, more supportive moves, hesitation,
and recycling in comparison to written DCTs, which produced features of written
discourse. In the classic closed role play, participants read the situation and are
asked to respond with either an initiating (compliment, request) or responding
act (refusal, compliment response). The description of the situation is limited to a
written stimulus and a brief description of the situation for which the interlocutor
has to provide an appropriate response.
In order to provide rich audiovisual and contextual information in the situation
prompt, Schauer (2004) designed the computer-based multimedia elicitation task
(MET) to examine ILP development of requests in 16 scenarios among German
learners of English during a year-long study-abroad program in Great Britain. The
MET controls the time and the nature of the audio and visual input, guarantees
equal conditions for every participant, elicits oral data, and is delivered by means
of a computerized presentation format with visual (photographic images) and
audio input (description of the situation). The MET is interactive in that participants
respond to a situational prompt with several kinds of stimuli (not to an interlocutor),
which increases the degree of construct validity. In her study of compliments and
compliment responses in L2 Spanish, Hasler-Barker (2013) used a revised ver-
sion of the MET to elicit compliments and compliment responses from learners of
Spanish in a FL context. In her cross-sectional study, learners at two proficiency
levels were provided with a situational description which included written and
aural stimuli, and visual information (delivered through PowerPoint slides on
the computer). After the learners read the situation, they were asked to respond
with a compliment or a compliment response. Following Hasler-Barker’s (2013)
experimental design, Félix-Brasdefer and Hasler-Barker (2015) used the computer-delivered
closed role play to elicit compliments and compliment responses from
learners of Spanish who studied abroad in Mexico for seven weeks. Example (1)
shows the instructions used prior to the task, example (2) shows the description of
the situation, and example (3) provides a sample response:
(1) Instructions for the computer-delivered closed role play
You will see a series of situations during your study abroad program in Guanajuato,
Mexico. You are studying Spanish for eight weeks this summer with Mexican profes-
sors from the Universidad de Guanajuato, and you are living with a host family. The
following situations take place with a host family, at school, or in the city. Following
the description, you will be prompted to speak your response to the situation out loud.
You will have twenty seconds to read the situation and respond to it. Please remember
to respond to the situations in Spanish. Your responses will be recorded.
Any questions?
(2) Sample of computer-delivered closed role play (complimenting host sister) (Félix-Brasdefer
and Hasler-Barker 2015)
[Photo of an attractive young woman with green eyes]
Your host sister picked you up on the city bus and is taking you to your host family’s house.
When you saw her, she was wearing sunglasses. When you get on the bus, she removes
them and you notice that she has striking eyes.
What do you say to her?
(3) Sample response of the closed role play (compliment in L2 Spanish)
Oh – tus ojos son muy bonitos quiero ahm ojos ahm verdes también porque el contraste
ahm entre la piel morena del mexicano es muy bonito.
‘Oh – your eyes are very pretty um I want um green eyes, too, because the contrast um
between the dark skin of the Mexican is very pretty’

The instructions in (1) direct the learner to read the situation. In this example, the
instructions were provided in English to avoid difficulties with comprehension,
but the responses were in the target language (L2 Spanish). The example in (2)
provides the prompt with written and visual stimuli, and the information of the
pragmatic target, a compliment. The sample in (3) shows an oral response from
the closed role play with some oral features: a non-verbal token to express sur-
prise (oh), repetitive hesitation markers that are common in oral discourse (um),
and prosodic features such as fast speech, low final intonation, and elongation of
non-verbal tokens to express surprise or emotion.
If the research goal is to analyze different dimensions of interaction (e. g. turn-taking,
speech act sequences, repair, collaborative talk), the open role play is
appropriate as it elicits data at the discourse level.

2.2. Open role plays

Open role plays (also called “discourse role play tasks” Brown 2008: 232) specify
the actors’ roles, but the course and outcome of the conversation are not predetermined.
During a role-play task, participants are asked to read a situational description
and to respond orally as they would in a real situation with an interlocutor
in face-to-face interaction. The (open) role-play technique has the advantage of
including interaction in a dyadic face-to-face format with another participant. Role
plays are generally audio-taped and/or video-recorded. Once the situations have
been recorded, the data are carefully transcribed according to a system of tran-
scription notation in order to capture the sequential organisation of discourse (e. g.
Jefferson 2004). Scarcella (1979) represents one of the early studies that used open
role plays to examine learners’ production of requests and invitations when speak-
ing to superiors, equal familiars, and subordinates. Example (4) shows a classic
situational role play description:

(4) Description of an open role play (Scarcella 1979: 277)

You are planning an office party. You invite your boss, the clerk who works under you,
and your good friend, a fellow employee. You request that each of your guests come
unaccompanied by his wife.

According to Scarcella, role plays (i) “obtain complete conversational interactions,
containing both conversational openings and closings”; (ii) allow the researcher
“to control the conversation to a certain extent”; (iii) provide “comparable samples
of speech”; and (iv) facilitate “videotaping, important when examining paralin-
guistic features of discourse” (1979: 277).
The data gathered by means of open role plays will depend on whether suffi-
cient contextualized information is included in the description of the situation. For
instance, the role-play situations used in the pilot study in Félix-Brasdefer (2002)
were tested for content validity with three groups: ten Americans, ten Mexicans,
and ten US learners of Spanish (all university-level students). Using an open role
play, the participants in each group were instructed to role play nine situations of
equal and unequal status (they included three distracters and six refusals to invita-
tions, suggestions, and requests) with another NS of English or Spanish (the mean
number of words in the situations was 51.6 words). An example of an archetypal
(unenhanced) role-play prompt is shown in (5) (36 words):
(5) Refusing a friend’s invitation to a birthday party
A friend of yours invites you to his birthday party next Friday evening. He is inviting a
select group of friends over to his house, and you are one of them, but you can’t make
it.
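
The length reported for prompt (5) can be checked with a short script. The following is a minimal sketch, assuming that a word is any whitespace-separated token; the source does not state how words were counted, and the helper name `count_words` is illustrative rather than part of the original study.

```python
def count_words(prompt: str) -> int:
    """Count whitespace-separated tokens in a role-play prompt.

    Assumption: a "word" is any run of non-whitespace characters, so
    punctuation attached to a word does not add to the count.
    """
    return len(prompt.split())


# Text of the unenhanced prompt in example (5).
PROMPT_5 = (
    "A friend of yours invites you to his birthday party next Friday evening. "
    "He is inviting a select group of friends over to his house, and you are "
    "one of them, but you can't make it."
)

print(count_words(PROMPT_5))
```

Under this counting assumption the result agrees with the 36 words reported for prompt (5). The same helper could be applied to a full set of prompts to compute a mean length, although other tokenization choices (e.g. for hyphenated forms or times such as 8:00 p.m.) may give slightly different figures.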

Immediately after the participants completed the nine role-play situations, they
were asked about the validity of their responses during the interactions. Regarding
the content of the situation as in example (5), participants commented that more
specific information was needed in each situation, such as detailed information
with respect to the situation, time, place, and more information regarding the degree
of formality of the relationship such as distant or close friends and the degree of
social power between the interlocutors in situations which involved a boss or pro-
fessor and an employee or student. In fact, most participants mentioned that their
responses in the role-play interaction would have varied based on whether their
relationship with the friend, boss or professor was close or distant. Thus, in light of
the observations of the participants, and following Billmyer and Varghese (2000),
the descriptions of the situations were enhanced to include enriched role-play sce-
narios (the mean number of words for the situations was 130.5 [Americans], 146.5
[Mexicans], and 140.5 [learners]). The role play in (5) was enhanced with con-
textual information about the setting, the relationship between the participants,
gender of the initiator, and time of the event. The enhanced role play is shown
in (6) (138 words) and a sample of the response that resulted from that situation
between two American college students is given in (7) (taken from Félix-Brasdefer
[2010: 48–49]):
(6) Enhanced role play: Refusing a friend’s invitation to attend a birthday party (-P, -D)
You are walking across campus when you run into a good friend of yours whom you
haven’t seen for about a month. You and he have been studying in the same program at
the University for three years, and have studied and written papers together in the past,
but you don’t have any classes together this semester since you have been doing an
internship off-campus. He invites you to his 21st birthday party at his house next Friday
night at 8:00 p.m. He tells you that a group of mutual friends that you both used to hang
out with and whom you haven’t seen since the semester started will also be there. You
know that this would be a good opportunity to see everyone again and to celebrate this
special occasion with him. Unfortunately, you cannot make it.
(7) Sample response of open role-play interaction (NSs of US English) (Félix-Brasdefer
2010: 49)
Role play interaction: Declining a friend’s invitation to a birthday party. Erin issues
invitation; Paul declines invitation.¹
Opening sequence (lines 1–8):
1 Erin: Hey Paul – how’s it going?
2 Paul: hey, Erin how are you?
3 Erin: I’m fanta::stic=
4 Paul: =I haven’t seen you in a long time – [where you been?
5 Erin: [I –
6 I’ve just been working – going to class
7 Paul: [oh good - good
8 Erin: [the usual –
Invitation-refusal sequence (lines 9–32):
9 I’m so glad that I saw you – I’ve been - trying to figure out
10 how to get in touch with you cuz –um – I just turned 21 -
11 yesterday – and I’m gonna have a party this Friday night
12 and I’m just trying to get in touch with everybody –um –
13 from last semester – that we were all in class together and
14 everything and I really wanted you to come –
15 it’s gonna be at eight o’clock at my house
16 Paul: ooh – this Friday?
17 Erin: yeah
18 Paul: ohh – my goodness – it’s my grandmother’s birthday this weekend
19 Erin: you’re kidding
20 Paul: and my grandmother lives out of town - too
21 Erin: oh [no:::
22 Paul: [and normally, you know, my parents go of course – you know
23 Erin: umhm
24 Paul: so – when we go, we spend the weekend with ‘em
25 Erin: yeah
26 Paul: because I live so far away –
27 we just can’t come back and forth on [a day
28 Erin: [yeah
29 when are you leaving?
30 Paul: Thursday night
31 Erin: oh man::
32 Paul: and we’re gonna get there Friday morning and stay until Sunday
Insistence-response sequence (lines 33–38):
33 Erin: and – there’s no way you can – like =
34 Paul: = oh, I wish I could – I – I wish I could make it
35 because, you know - I haven’t seen you for such a [long time
36 Erin: [yeah
37 Paul: and I’d like to get - you know –
38 I’d like to get back with you but –
Suggestion-response sequence (lines 39–45):
39 um – maybe next – are you busy next week?
40 I mean - I’ll take out for dinner or =
41 Erin: = ohh ((laughs)) that’s nice of you – um yeah we can just –
42 we can get together – that’s cool
43 Paul: would that work?
44 Erin: yeah
45 Paul: ok
Closing sequence (lines 46–51):
46 Erin: well, I’m sorry you can’t come,
47 but have a good time with your grandmother
48 Paul: alright – I’m sorry too –
49 Erin: alright
50 Paul: happy birthday
51 Erin: thank you.

¹ The following transcription notations are used:
Contiguous utterances
= Equal signs indicate no break-up or gap. They are placed when there is no interval
between adjacent utterances and the second utterance is linked immediately to the first.
[ A left bracket indicates the point of overlap onset.
] A right bracket indicates the point at which two overlapping utterances end, if they
end simultaneously, or the point at which one of them ends during the course of the
other. It is also used to parse out segments of overlapping utterances.
- A dash marks a short untimed pause within an utterance.
Characteristics of speech delivery
↑↓ The up and down arrows mark sharp rises or falls in pitch.
: A colon marks a lengthened syllable or an extension of a sound.
::: More colons prolong a sound or syllable.

The interaction in (7) is between two NSs of English (university students) from one
Southern region of the United States (North Carolina). Erin (female) initiates the
interaction and Paul (male) responds to the initiation. In this role-play interaction,
there are five sequences commonly used in American refusals to invitations from a
friend (Félix-Brasdefer 2010): an opening sequence (lines 1–8), the invitation-re-
fusal sequence (lines 9–32), the insistence-response sequence (lines 33–38), the
suggestion-response sequence (lines 39–45), and the closing of the interaction
(lines 46–51). After the invitation (lines 9–15), a pre-refusal response is realised
in one turn (line 16, ‘ooh – this Friday?’) followed by the refusal response, which
is accomplished by means of various turn-constructional units (TCUs) (Schegloff
2007) (lines 18, 20, 22, 24, 26–27). Notice that the insistence-response sequence
(lines 33–38) is shorter than the initial invitation-refusal sequence (lines 9–32).
One can also appreciate several instances of overlap (lines 21–22, 27–28), inter-
ruption (lines 40–41), and laughter particles which serve to reinforce the links of
solidarity between the participants (line 41). Finally, with role-play data research-
ers can examine the pragmatic effect of prosodic elements employed in an interac-
tion to express tentativeness, politeness, or degrees of directness or indirectness,
such as pitch direction, pitch range, pauses, loudness, tempo, and voice quality
(e. g. creaky voice) (Selting 2010). For example, in his contrastive study of prosodic
features in Mexican, Costa Rican, and Dominican Spanish, Félix-Brasdefer
elicited refusal (2011) and request (2009) exchanges using data from open role
plays. The author found prosodic patterns of low final intonation and loudness
(prosodic cues) that accounted for the realisation of directness and indirectness
and politeness features.
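
The sequential segmentation of interaction (7) can also be recorded as coded line ranges, which makes comparisons of sequence length explicit. The following is a minimal sketch: the line ranges are those given in the analysis above, while the data structure and the helper names `sequence_lengths` and `is_contiguous` are illustrative assumptions, not part of the original coding scheme.

```python
# Sequence boundaries for interaction (7), as inclusive transcript line ranges
# taken from the analysis in the text.
SEQUENCES_7 = {
    "opening": (1, 8),
    "invitation-refusal": (9, 32),
    "insistence-response": (33, 38),
    "suggestion-response": (39, 45),
    "closing": (46, 51),
}


def sequence_lengths(sequences):
    """Return the number of transcript lines in each coded sequence."""
    return {name: end - start + 1 for name, (start, end) in sequences.items()}


def is_contiguous(sequences):
    """Check that consecutive sequences neither overlap nor leave gaps."""
    spans = sorted(sequences.values())
    return all(
        prev_end + 1 == start
        for (_, prev_end), (start, _) in zip(spans, spans[1:])
    )


lengths = sequence_lengths(SEQUENCES_7)
print(lengths)
print(is_contiguous(SEQUENCES_7))
```

Coding the boundaries this way lets a researcher confirm that every transcript line belongs to exactly one sequence and that the insistence-response sequence is shorter than the invitation-refusal sequence. Line counts are only a rough proxy for length; counts of turns or words may be more appropriate for some research questions.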
Role-play data are widely used in formal and informal contexts, in different
learning contexts (at home vs abroad), and among learners at different proficiency
levels (Félix-Brasdefer 2008; Kasper and Dahl 1991). For instance, Félix-Brasde-
fer (2004) examined the sequential organization of dispreferred responses among
learners of Spanish with different lengths of study abroad in Spain or Latin Amer-
ica. The role-play task included three informal and three formal situations (two
refusals to invitations; two refusals to requests; two refusals to suggestions). The
examples in (8) and (9) show a refusal exchange between a boss (NS) and an employee
(+P, +D) during an invitation-refusal sequence (Félix-Brasdefer 2004: 648–49).
The boss was a NS of Mexican Spanish (a professor of Hispanic literature), and
the employee was an undergraduate student of Spanish who had studied in Mexico
for approximately two years. The role-play situation is given in (8) and the
invitation-refusal response is provided in (9):
(8) NS-learner role-play interaction: Refusing a boss’s invitation to attend a farewell party
(+P, +D)
Imagine that you are in (Spanish-speaking country of your preference). You have been
working at 3M as a sales representative for the last five years. You have a good working
relationship with your boss although you do not socialize together outside the office.
Your boss has always been supportive of your ideas and has been instrumental in your
receiving a recent promotion. After working for him for three years, he has recently been
promoted and will become the Manager of the Latin American Sales Division which
will require his relocation to Mexico City next month. He is having a party next Satur-
day evening at a restaurant and is inviting you and other members of his sales group to
celebrate his promotion and as a farewell, but you are unable to attend.
(9) Learner of Spanish (spent 24 months in Mexico: Refusing a boss’s inviation to attend a
farewell party)
(Learner role: Employee; NS of Spanish: Boss) (male-male interaction)
Boss: 1 Hola, Greg, ¿cómo te va? (‘Hi, Greg, how are you?’)
Learner: 2 muy bien, y tú? (‘very good, yourself?’) Open
Boss: 3 bien (‘good’)
4 no sé si has escuchado la noticia de que me han ascen-
dido de puesto,
5 me han reubicado en la ciudad de México, y tendré que Invitation-
translardame allá. Refusal
6 Pienso tener una reunión en un restaurante, tus colegas Sequence
estarán ahí,
7 y quería extenderte esta invitación a ti también, el próx-
imo sábado a las 6:00 p.m.
314 J. César Félix-Brasdefer

8 Me gustaría que estuvieras con nosotros.

(‘I don’t know if you’ve heard the news that I’ve been
promoted and relocated to Mexico City, and I’ll have to
move there. I plan to have a gathering at a restaurant,
your colleagues will be there, and I wanted to extend this
invitation to you too, next Saturday at 6:00 p.m. I’d like
you to be there with us.’)
Learner: 9 Gracias por tu invitación
(‘Thank you for your invitation’)
10 lo que pasa es que ya tengo una cita con mi novia y vamos
11 a ir a un concierto, ya tenemos, este, los boletos
12 y no vamos a poder asistir
(‘the thing is that I already have a date with my girlfriend
and we are going to go to a concert, we already have, um,
the tickets and we are not going to be able to attend’)
Boss: 13 qué pena, quizá un poco tarde, trae a tu novia también
(‘maybe pretty late, bring your girlfriend also’)
Learner: 14 y van a estar ahí hasta …?
(‘and you’re going to be there until …’)
Boss: 15 creo que estaremos hasta por las once y media
(‘I think we’ll be there until around 11:30’)
Learner: 16 pues quizás después del concierto podemos llegar y platicar, tomar algunas
17 sodas, pero primero tenemos que ir al concierto y luego llegamos.    Insistence-response
(‘well maybe after the concert we can come and chat,
have some sodas but first we have to go to the concert and
afterwards we’ll come’)
Boss: 18 me parece ideal
(‘sounds great’)
Learner: 19 ¿está bien?, haremos lo posible por llegar, no sé
(‘is that fine?, we’ll do our best to be there, I don’t know’)
Boss: 20 me daría mucho gusto verte por allá, aunque sea un
ratito.
(‘I would very much like to see you there, even if it’s for
a short time’)
Learner: 21 okay
Boss: 22 bueno, hasta luego, Closing
(‘well, see you later’)
Learner: 23 felicidades (‘congratulations.’).

The exchange in (9) is realized across 14 turns during four main sequences (or
joint-actions), namely, an opening (lines 1–3), an invitation-response (lines 4–12),
an insistence-response (lines 13–21), and a closing (lines 22–23). After the boss’s
invitation (lines 4–8), the learner expresses appreciation with the informal pronominal
form tú (‘you-informal’) (instead of the formal pronoun usted [‘you-formal’])
(line 9) and employs an indirect refusal by means of one justification (lines
10–12). Upon the boss’s insistence (line 13), the learner engages in a negotiation
process in search of a successful resolution and requests further information in
the next turn (line 14). In the learner’s next turn, he compromises with the boss
and provides a vague alternative (lines 16–17). In the following turn, the learner
offers a clarification request (and a vague response) to confirm that his previous
alternative has been well received by the boss (lines 19–21). The last two turns
(lines 22–23) serve to close the interaction politely and successfully. Overall, this
exchange shows the learner’s ability to negotiate a refusal across multiple turns,
with various mitigating strategies that comprise the speech act set of refusals,
expressions of politeness, as well as the learner’s ability to open, close, and nego-
tiate the interaction successfully.
The next section describes four varieties of role plays that examine different
aspects of the learner’s pragmatic competence: simulated role play, naturalized role
play, the Advanced Placement (AP) role-play task, and the oral-proficiency-inter-
view (OPI) role play. The aim of each role play is to enhance construct validity
in order to elicit data that approximates natural discourse (simulated role play, AP
role-play task, and the naturalized role play). The OPI role play and the AP role
play are mainly used for testing and assessment purposes. Each role play variety
has advantages and disadvantages and is employed for different purposes.

2.2.1. Simulated (role-play) task

Simulated tasks are similar to role enactments in that they ask participants to sim-
ulate a situation based on their own roles, such as giving directions. In her study
on giving directions in ESL, Lee (2014) utilized a simulated task to ask learners to
give directions to specific places on a university campus. According to Lee (2014:
76–77), role plays and simulated tasks share certain characteristics: both elicit
interactive and multi-turn discourse in face-to-face interactions, they are not conse-
quential, and interlocutors engage in the co-construction of speech act sequences.
Unlike role plays which ask participants to take on social roles that may or may not
have happened to them, in simulated tasks participants are given “discourse roles”
(e. g. direction-giver and direction-seeker). Further, in simulated tasks participants
do not take on social roles different from their own, as in Lee’s (2014) study where
the participants “take a role of direction-giver and direction-seeker” (2014: 77),
roles that are familiar to students. Similar to role plays, participants receive a card
with a description of the situation, and the role play conductor is trained prior to
the role-play interaction. Although this instrument is not frequently used in ILP
research, future studies should elicit interactive data using simulated tasks with a
variety of initiating and responding speech acts to enhance the validity of the data.

2.2.2. Naturalized role play

The naturalized role play (NRP) was proposed by Tran (2006) in order to increase
the validity of the traditional open role play. It elicits spontaneous data in con-
trolled settings and the participants are not aware of the research focus. Spontane-
ous data does not mean natural; instead, it refers to data that approximates natural
discourse. The researcher who uses data from the NRP also collects additional data
from other sources such as observation and field notes, thus, increasing the degree
of validity of the data. With regard to the design of the situations, the author noted
that the role-play description should be “carefully designed with attention to detail
in order to be elicited with real-life situations” (2006: 7). The NRP differs from
open role play in that it consists of distracting tasks for informants to perform.
Participants respond to spontaneous speech acts which are elicited without the
participants being aware of the pragmatic target. The instructions for the role-play
conductor include the following information shown in (10):
(10) Instructions in the card for the role-play conductors (Naturalized Role Play, Tran
2006: 23)
• Please ask for directions to [“place”].
• Please ask him/her what time the bookshop is closed today.
• Please accept the ride that he/she offers.
• When it is most natural during the talk, compliment him/her on:
– his/her article published last week
– his/her car.
• Please make the conversation as natural as possible. Speak as you would in real life.
It is very important that you compliment naturally and make your compliments a
part of the normal social talk. Do not make it obvious that the compliments are
among the tasks listed in the card for you.

While the data elicited through the NRP approximates natural discourse, the pro-
cess to conduct the interview for the desired pragmatic target requires creativity
on the part of the researcher as well as the ability to perform the task on the part of
the role-play conductor. One advantage of this method is that it triangulates data
from other sources, thus increasing the degree of construct validity. And while the
NRP seems to be more suitable to elicit data from responding speech acts such as
compliment responses (Tran 2006), future studies are needed to elicit data from
other responding speech acts such as refusals, or initiating speech acts (e. g. com-
pliments, invitations, complaints, or requests for action or information). The NRP
should be tested with other speech acts, in various learning contexts, and across
proficiency levels.

2.2.3. The Advanced Placement (AP) role-play task

The AP exam is administered by the US College Board to assess written and oral
proficiency among high school students who study a second language (e. g. French,
German, Italian, and Spanish). This exam is developed and scored by experienced
college and university faculty members, as well as by experienced AP (high school)
readers. For example, the Spanish AP Exam is one of the various proficiency tests
offered by Educational Testing Service (ETS) to measure the language proficiency
(speaking, written, and comprehension skills) of high school students who intend
to be placed in advanced Spanish language courses at the university level (5th and
6th semester or the equivalent). The AP exam committee of the College Board
incorporates rigorous practices of language instruction and language testing by
considering documents such as the ACTFL Proficiency Guidelines and the Standards
for Foreign Language Learning² in preparing their exams. The first part of the
oral section, in which students participate in a recorded role-play (conversation)
task with a NS of Spanish, measures interpersonal speaking skills. The AP role-
play task includes five or six prompts, and students have 20 seconds to respond.
The role-play task is based on a role-enactment approach in that it asks students to
take on roles in situations that are likely to happen to them in everyday life, such
as inviting a friend to a party, asking for information on the street, or an interac-
tion with a teacher. Some of the communicative functions students are expected
to accomplish on the exam include the ability to initiate (greetings), maintain,
and close a conversation (farewells) on a familiar topic, formulate questions for
information or action, and seek clarification information. Example (11) includes
the instructions for the role-play task and example (12) shows the structure of the
role-play task.³ In each turn, the student is provided with information.
(11) Instructions to the AP Spanish role-play task (instructions are provided in Spanish and
English)
You will participate in a conversation. First, you will have a minute to read a preview
of the conversation, including an outline of each turn in the conversation. Afterward,
the conversation will begin, following the outline. Each time it is your turn to speak,
you will have 20 seconds to record your response. You should participate in the con-
versation as fully and appropriately as possible.

2 See description of the ACTFL Proficiency Guidelines: https://www.actfl.org/publications/guidelines-and-manuals/actfl-proficiency-guidelines-2012
3 The instructions, structure, and audio of the 2016 Spanish role-play task can be accessed here: http://apcentral.collegeboard.com/apc/public/exam/exam_information/4554.html

(12) AP Spanish role-play task⁴

Context: This is a conversation with your friend Sonia about opportunities for community
service. You will participate in this conversation because you are interested in
doing community service. (instructions in Spanish)
Speaker | Test Booklet | Master Audio
Sonia: | Te saluda, te pide disculpas y te hace una pregunta ‘She greets you, apologizes and asks you a question’ | Hola. Siento haberme perdido tu fiesta de cumpleaños. Ese día me tocó trabajar como voluntaria en el Centro Social. ¿Cómo estuvo la fiesta? ‘Hello. I’m sorry I missed your birthday party. That day I had to volunteer at the Social Center. How was the party?’
Tú: | Responde, incluyendo detalles ‘Respond, including details’ | [Tone] 20 seconds [Tone]
Sonia: | Continúa la conversación y te hace una pregunta ‘She continues the conversation and she asks you a question’ | Ah … Lamento no haber asistido. Por cierto, ¿sabes que están buscando voluntarios para trabajar con niños? ¿Te interesaría participar? ‘Ah … sorry I couldn’t make it. By the way, do you know that they are looking for volunteers to work with kids? Would you be interested in participating?’
Tú: | Responde afirmativamente y explica por qué. ‘Respond affirmatively and explain why’ | [Tone] 20 seconds [Tone]
Sonia: | Continúa la conversación y te hace otra pregunta. ‘She continues the conversation and she asks you a question’ | ¡Qué bien! ¿Y … qué tipo de actividades podrías hacer con los niños? ‘That’s great! And … what kind of activities could you do with the kids?’
Tú: | Responde con detalles ‘Respond with details’ | [Tone] 20 seconds [Tone]
Sonia: | Continúa la conversación y te hace una propuesta. ‘She continues the conversation and makes a proposal’ | Ya veo … ¿Te gustaría colaborar todas las tardes? ‘I see … would you like to help out every afternoon?’
Tú: | Responde negativamente y explica por qué. ‘Respond negatively and explain why’ | [Tone] 20 seconds [Tone]
Sonia: | Te hace una pregunta. ‘She asks you a question’ | Bueno, entiendo, pero cualquier ayuda será bien recibida. ¿Y … qué más quisieras saber sobre el centro? ‘Well, I understand, but any help would be well received. And … what else would you like to know about the center?’
Tú: | Pide más información. ‘Ask for more information’ | [Tone] 20 seconds [Tone]

4 Role-play task: https://secure-media.collegeboard.org/digitalServices/pdf/ap/ap16_frq_spanish_language_script.pdf. Audio for AP Spanish (Task 3: Conversation): http://apcentral.collegeboard.com/apc/members/exam/exam_information/231929.html

In this open role-play task the student is asked to provide specific information
for each turn in response to each prompt. Students are required to provide details,
explain, agree or disagree with the interlocutor’s response, and make requests. The
AP role-play task measures on-line pragmatic knowledge in order to co-construct
a conversation across simulated multiple turns. Since all the students receive the
same stimulus, the data are comparable. This open role-play task elicits interac-
tional data for testing and assessment purposes.

2.2.4. The oral proficiency interview role play

The oral proficiency interview (OPI) consists of a semi-structured interview that
is used as a means of assessing the speaking ability of international students from
non-English-speaking countries who wish to study at an English-speaking univer-
sity, or the ability of employees who intend to demonstrate a sophisticated level of
speaking in the English language. OPIs are largely organized as question-answer
sequences in which the interviewer leads the interview by asking questions with
different degrees of complexity to measure different skills: grammatical, sociolin-
guistic, pragmatic, discourse, and strategic ability, and the candidate provides the
answers.
Some varieties of OPIs include a role-play component (e. g. the International
English Language Testing System [IELTS]; Foreign Service Institute/Interagency
Language Roundtable [FSI/ILR]; or the American Council for the Teaching of
Foreign Languages [ACTFL]). Researchers have examined the sequential struc-
ture of the OPI with respect to the interviewer’s questions and the candidate’s
ability to co-construct meaning at the discourse level (cf. Brown 2004; Ross and
Kasper 2013, see chapters 8–13). The OPI consists of various stages in order to
assess different aspects of the learner’s interactional competence. For example,
the IELTS interview encompasses five stages, of which the middle one is the role-
play task: introduction (Phase 1), extended discourse (Phase 2), elicitation role-
play task (Phase 3), speculation and attitudes (Phase 4), and conclusion (Phase 5).
The role-play task is based on “information gap” type activities. The ACTFL OPI
consists of four stages: warm-up, interactive process, role play, and wind-down. The
role play is used to test interactional skills that cannot be measured through the
interview. For instance, for the ACTFL OPI role play, taking place sometime in the
middle of the interview, the interviewer selects a card with a role-play task to test
additional aspects of the candidate’s speaking ability. The interviewer shifts roles
from speaking as himself/herself and indicates to the candidate that he/she will
engage in a simulated role-play task in which he/she takes on a role in a hypothet-
ical situation. The selection of the role-play task depends on the candidate’s initial
proficiency level identified by the interviewer (novice, intermediate, advanced,
superior, distinguished). The role-play task asks the candidate to take on a role to
solve a complication at the airport, apartment building, to ask for directions, to
express disagreement, to complain, or to issue a request for service or information.
The OPI role play measures sequential and organization skills such as the ability to
take turns, to make a request and respond to the request, to come up with a solution,
and finally to close the interaction appropriately. Immediately after the role play is
completed, the interviewer indicates that the role play has ended and he/she shifts
back to end the interview.
The OPI role play differs from the archetypal open role play in that it is embed-
ded in the OPI and signals the candidate when the role play begins and when it
ends. The OPI role play shares the following characteristics with the traditional
role play: the roles adopted by the candidate are not necessarily based on previous
experience; it measures different aspects of the learner’s interactional competence
such as turn-taking, repair, and sequential organization. However, in the OPI role play
the interviewer exerts more control (than the role-play conductor in the traditional
open role play) during the interaction as he/she indicates the beginning and end of
the interaction, the selection of the topic for the situation, and the role selected
for the candidate. In their discursive analysis of the OPI role play, Okada and
Greer (2013) provided a detailed sequential analysis of conversational practices of
the OPI role play, such as the interviewer’s formulation and reformulation of the
question and the use of silence to signal the candidate’s course of action during the
pursuit of a response. Specifically, the OPI role-play task is used to test one aspect
of the learner’s speaking skill embedded in a semi-structured interview in face-to-face
or telephone interaction.
Table 1 displays the main characteristics of the various types of role plays
analyzed in this section.

Table 1. Varieties of role plays in cross-cultural and interlanguage pragmatics

Role play type | Source in cross-cultural or ILP | Pragmatic/discourse targets | Characteristics
Archetypal (open) role play | Félix-Brasdefer (2008); Gass and Houck (1999); Hasler-Barker (2013); Márquez Reiter (2000); Márquez Reiter, Reiney, and Fulcher (2005); Scarcella (1979) | Requests, refusals, apologies, compliments and compliment responses | Face-to-face simulated dyadic interactions. Elicits multi-turn conversational interactions in face-to-face (or telephone) dialogic simulations. Open-ended interaction including opening, negotiating phase, and closing sequence. Participants are asked to take on social roles that may or may not be based on previous experience.
Role-enactment approach (based on Sarbin and Jones 1955) | Trosborg (1995) | Requests, complaints, and apologies | Face-to-face dyadic simulated interactions. Participants perform a role that is part of their everyday normal life or personality. Role plays are tailor-made to the participants containing problems and characters known beforehand.
Naturalized role play | Tran (2006) | Compliment responses | Face-to-face dyadic simulated interactions with spontaneous data. Consists of distracting tasks for informants to perform while the role play is in progress without the participants being aware. Participants respond to spontaneous speech acts which are elicited without the participants being aware of the pragmatic target. Triangulates with observational and field-note data to increase data validity.
Simulated role-play task | Lee (2014) | Direction-giving | Face-to-face dyadic simulated interactions. Elicits interactive data in face-to-face dialogic interactions. Similar to role enactments in that they ask participants to simulate a situation based on their own roles, such as giving directions. Participants are asked to take on social roles that are familiar to them (based on previous experience).
AP role-play task | http://apcentral.collegeboard.com/apc/public/exam/exam_information/4554.html | Openings and closings in conversations, requests, agree-disagree sequences, request-response sequences, and other speech act sequences | Simulated role-play interaction used for assessment purposes. Assesses interpersonal speaking skills of high school students who intend to be placed at advanced level at US colleges and universities. Students respond to prompts to co-construct a simulated conversation.
OPI role play (Role play in oral-proficiency interviews) | Brown (2004); Okada and Greer (2013) | Different conversational practices (repair, turn-taking, sequence organization) | Face-to-face or telephone dyadic interactive simulations. A component of the oral proficiency interview as a means of assessing various aspects of the learner’s interactional competence. Question-answer sequences in which interviewer asks questions and the candidate gives answers. Interviewer has control of topics for the development of interaction.
Closed role play | Hasler-Barker (2013); Félix-Brasdefer and Hasler-Barker (2015); Rintell and Mitchell (1989); Rose (2000); Schauer (2004) | Requests, compliments, apologies | Non-interactive oral data in one or two closed turns. Participants are asked to respond orally in one turn to initiating or responding acts. Other types of closed role plays vary with regard to the amount of contextual information in the written, aural, and visual stimuli. The Cartoon Oral Production Task (COPT) includes oral and visual stimuli. Each cartoon includes a brief caption to describe the scenario. After the participants are directed to the scenario, the researcher reads a brief situation, followed by an oral response that is recorded. The Multi-media Elicitation task (MET) (Schauer 2004) is delivered through a computerised presentation format with visual (photographic images) and audio input (description of the situation).

3. Issues of validity, reliability, and analysis of the data

Role plays have been compared to other methods such as production questionnaires
or multiple-choice questionnaires. The main issue is the degree of validity
and reliability of role-play data. Validity refers to the degree to which an instrument
(e. g. role plays) measures what it intends to measure, and consequently allows
adequate interpretation of the results. Three types of validity are discussed in the
literature: (1) content, (2) criterion-related, and (3) construct validity. Content
validity refers to the degree to which the instrument measures the content area in
two ways: item and sampling validity (Gay et al. 2009). Item validity refers to how
well the items of the instrument measure the intended content area; for example, if
the content of the situations employed in role plays is relevant for measuring the
intended aspect of pragmatic competence, namely, pragmalinguistic or sociopragmatic
knowledge (cf. section 4). Sampling validity refers to the representativeness
of the content of the items included in the overall instrument, such as the inclusion
of different types of situations to measure performance of one or two speech acts,
and in symmetrical and asymmetrical contexts. Criterion-related validity examines
whether the results of a test correlate with the findings obtained
from another instrument that measures the same aspect of pragmatic competence,
as done in Brown (2001) who used six different instruments to measure requests,
refusals, and apologies. Finally, construct validity is the most difficult form of
research validation, as it refers to the internal structure of the instrument and what
aspect of pragmatic competence it intends to measure (e. g. production, percep-
tion, interaction). In contrast, reliability has to do with the degree to which an
instrument consistently measures the intended hypothetical construct. In particular,
the reliability of a group of test scores, for example, reflects the “consistency of
measurement whether across time, forms, raters, items, etc” (Brown 2008: 228).
Validity and reliability are crucial methodological concepts that every researcher
needs to keep in mind during the conceptualisation of the role-play task. In their
study of sequential analysis of role-play requests, Al-Gahtani and Roever observed
that “role plays allow a decent degree of standardization while eliciting extended
interactive data” (2012: 44). And Demeter (2007) showed that role plays represent
a reliable and valid method to collect data for apologies. If the focus of the research
question is to analyze interactional aspects of communicative action (e. g. sequence
organization, turn-taking, overlap, repair), role plays are a better option than other
non-interactive methods such as DCTs.
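The idea behind criterion-related validity, checking whether scores from one instrument correlate with scores from another instrument that targets the same aspect of pragmatic competence, can be illustrated with a small computation. The following is only a minimal sketch: the choice of Pearson's r, the function name `pearson_r`, and the two score lists are assumptions made for illustration and are not taken from any of the studies cited above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two score lists of equal length (at least 2 items).")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("Correlation is undefined when one list has no variation.")
    return cov / sqrt(var_x * var_y)


# Hypothetical appropriateness ratings (1-5) for the same six learners,
# once from a role play and once from a DCT. Invented for illustration only.
role_play_scores = [3.0, 4.5, 2.5, 5.0, 3.5, 4.0]
dct_scores = [2.5, 4.0, 3.0, 4.5, 3.0, 4.5]

r = pearson_r(role_play_scores, dct_scores)
print(f"r = {r:.2f}")
```

A high positive value would suggest that the two instruments rank learners similarly. With samples this small the coefficient is unstable, so in practice it would be reported together with the sample size and an appropriate significance test.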
One of the main criticisms against the role-play method is that it gathers arti-
ficial or simulated data, as participants are asked to imagine a situation that may
or may not have happened to them in real life. It has been noted that an ethnographic
approach in naturalistic settings is more favorable for the study of speech acts than
elicited data (Duranti 2009; Wolfson 1989). Under this approach, data are often
collected during prolonged participant observation, audio and video-recording, and
field-note data. Nevertheless, the role-play method provides the researcher with
interactional data that approximates natural discourse with regard to the dynam-
ics of the interaction, such as sequence organization, turn-taking, prosodic cues
(e. g. intonation, tone, stress, and rhythm), and the realization of speech acts at
the discourse level (joint actions). Further, role-play data ensure comparability
across different situations and degrees of formality, and allow the researcher to
control for micro- (social power, distance, situation) and macro-social factors such
as region, sex, age, ethnicity, and the socioeconomic level of the participants. For
example, with regard to pragmatic variation, role plays are suitable for the analysis
of intra-lingual pragmatic variation because they allow for the comparison of one
or more speech acts across different situations and among participants from two or
more varieties of the same language (Félix-Brasdefer 2009; Schneider 2010), and
in cross-cultural pragmatics research contrasting the speech act realization in two
languages (Félix-Brasdefer 2008; Márquez Reiter 2000). They can also be used
in inter-cultural pragmatics for testing and assessment purposes; for example, in
doctor-patient scenarios between US NSs of English (monolinguals or learners)
and NSs of Spanish-speaking regions (Cohen 2012). And to increase the degree
of validity, other methods can be used to complement role-play data, such as Likert
scales or verbal reports, which provide insights into speakers’ perceptions with
regard to the planning, the selection of the language of thought, and delivery of
the speech act. Thus, despite the artificiality of the data, the role-play
method has many advantages. It is mainly used for experimental purposes where
the focus is the analysis of online pragmatic knowledge, face-to-face interaction,
assessment of learners’ pragmatic competence, and the control of micro- and mac-
ro-social factors in comparable formal and informal situations.

4. Methodological and ethical considerations

Role-play data allow the researcher to examine two aspects of the learner’s prag-
matic competence (Leech 1983; Thomas 1983): (1) pragmalinguistic competence –
knowledge about and performance of the conventions of language use or the lin-
guistic resources available in a given language that convey “particular illocutions”
in contextually appropriate situations (Leech 1983: 11), and (2) sociopragmatic
competence – knowledge about and performance consistent with the social norms
in specific situations in a given society, as well as familiarity with variables of
social power and social distance. More importantly, due to the interactional
nature of role plays, researchers can examine different dimensions of the learner’s interactional
competence, which concern their ability to use interactional resources necessary
for the co-construction of meaning in joint-action in formal and non-formal set-
tings: how learners open and close an interaction, how they negotiate meaning such
as refusing a professor’s advice not to take a class or complaining to the manager
of a restaurant about bad customer service, how learners initiate and accomplish
repair in conversation, and how they take turns and overlap in interactions with
native and non-native speakers.
If the objective of the research question is a sequential analysis across multiple
turns, an open role play is the preferred method. For example, Al-Gahtani and
Roever (2012) took a close look at the sequential analysis of request responses
among ESL learners to examine pragmatic development across four proficiency
levels. A CA approach (Schegloff 2007) was used to examine the role-play interac-
tions. Results showed that lower-level learners were less likely to project upcoming
requests and pre-requests. Advanced learners produced more pre-expansions and
insert-sequences, and a more sophisticated production and co-construction of the
request across multiple turns. Félix-Brasdefer (2007) analyzed role-play interactions
to examine learners’ ability to co-construct request sequences (pre-sequences,
insertions, and post-expansions) among FL learners of Spanish at three proficiency
levels. Gass and Houck (1999) also used a discursive approach to examine the
negotiation of refusal sequences among ESL learners. Hasler-Barker (2013) used
open role-plays to analyze the structure of compliments and compliment responses
in learner-learner and NS-NS interactions. The author showed how intermediate-level
learners negotiate these speech acts. Félix-Brasdefer (2006) also used
a CA approach to teach the ability to refuse across multiple turns using role-play
interactions. Learners were taught to identify speech act sequences, followed by
an awareness-raising approach. Overall, Kasper and Dahl (1991) stated that open
role plays “provide a much richer data source. They represent oral production, full
operation of the turn-taking mechanism, impromptu planning decisions contingent
on interlocutor input, and hence, negotiation of global and local goals, including
“negotiation of meaning” (in the SLA sense of the term), when required” (1991:
228; original emphasis).
When designing and conducting the role-play task, the researcher should be
aware of the following recommendations. With regard to the representativeness
of the situations, researchers need to include sufficient contextual information:
a description of the setting, the degree of social power and social distance of the
interlocutors, information about the degree of imposition, and specific information
about the pragmatic target, namely, asking the participant to perform a speech act
(issuing a request or an invitation, agreeing or disagreeing). In this case, the role
play is targeted at specific speech acts. Researchers should also make an effort to
include a balanced distribution of role-play situations in formal and informal set-
tings to analyze different aspects of the learner’s pragmatic competence. In addi-
tion, role plays are widely used to elicit joint-action in cross-cultural and inter-
cultural settings (cf. Félix-Brasdefer 2008). With regard to the administration of
the role-play task, researchers can match different participants for each role-play
situation, but this increases the number of participants. For example, ten NSs
can be paired with ten NNSs, with each pair performing the role-play situations. In this case, each pair
receives instructions prior to the interaction. A second option that is more frequent
in cross-cultural and ILP research is to train an interlocutor to conduct all role-play
interactions with different participants. In this case, the same interlocutor interacts
with each participant in formal and informal situations. The advantage of this alter-
native is that it ensures comparability of the role-play samples, especially when
examining pragmatic development across groups of various proficiency levels.
In addition to the researcher’s own analysis, a second, trained person should
code the data independently (e. g. request types, openings and closings, internal
modification of the request) to increase the reliability of the coding. And for a cross- and
inter-gender analysis, a statistical program (e. g. SPSS) can be used to examine the data
through the use of descriptive statistics and an analysis of independent (e. g. gender)
and dependent variables of the study (e. g. request type, stylistic forms). For
the analysis of interactional data, a qualitative method of discourse analysis is nec-
Role plays 327

essary to capture the dynamics of the structure of joint sequences, such as applied
CA (Ross and Kasper 2013).
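The gain in reliability from independent coding by a second person can be quantified with a chance-corrected agreement measure. The chapter does not name one; Cohen's kappa is used here as a common choice, not as the author's own procedure. The following Python sketch is illustrative only: the category labels (direct, conv_indirect, hint) and the ten coded items are hypothetical and are not taken from any of the studies cited.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders who labelled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two coders chose the same category
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's category frequencies
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both coders used one and the same category throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten request head acts (labels are assumptions)
researcher = ["direct", "conv_indirect", "conv_indirect", "hint", "direct",
              "conv_indirect", "direct", "hint", "conv_indirect", "direct"]
second_coder = ["direct", "conv_indirect", "direct", "hint", "direct",
                "conv_indirect", "direct", "conv_indirect", "conv_indirect", "direct"]

kappa = cohens_kappa(researcher, second_coder)
print(f"Cohen's kappa: {kappa:.2f}")
```

In this invented sample the coders agree on eight of ten items, but kappa is lower than the raw agreement rate because part of that agreement would be expected by chance.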
Triangulating data from two or more sources enhances the credibility of the
results and offers a broader understanding of the data from different perspectives.
For example, in Félix-Brasdefer (2004, 2008) the role-play interactions were sup-
plemented by retrospective verbal reports to examine learners’ perceptions with
regard to sociocultural information, directness and indirectness, and polite or
impolite behavior. To capture non-verbal features of the role-play interaction, some
researchers have videotaped the entire role-play interaction (Gass and Houck 1999;
Scarcella 1979; Walters 1980). If the research focuses on the prosodic patterns
of the role-play interactions, recordings should be made with a high-quality
digital recorder and a high-fidelity microphone in a soundproof room to ensure
high-quality recordings.
Finally, due to ethical considerations and in order to protect the rights of human
subjects, researchers collecting role-play data need to obtain approval from the
Institutional Review Board (IRB) at their institutions (or applicable offices in each
country) to collect audio- or video-recorded data. It should be noted, however,
that in some countries an office for the protection of human subjects may not be available at the
researcher’s institution; in that case, the researcher should seek out other offices
that protect the rights of research participants. Participants, including
NSs and NNSs, should be informed of the general objectives of the research project
and that their interactions will be recorded, and they must be assured that the data
will be treated confidentially. Adult participants must complete a consent
form to agree to participate in the research project. The consent form provides a
description of the project, and gives information about the rights of the partic-
ipants, risks and benefits of the project, and the participant’s right to withdraw
from the research project at any time. If role-play data are elicited from vulnerable
subjects in research (e. g. prisoners or children), consent must be obtained from a
third party. For minors, for instance, consent is provided by their parents or
guardians. This information must be reported in the description of the research
method of any study, and all data must be anonymized.

5. Conclusion

Role plays represent a reliable and appropriate method for collecting pragmatic and
interactional data in cross-cultural and ILP research. They can be used to examine a
variety of speech acts among NSs and NNSs in bilingual and multilingual contexts.
Role-play data, if elicited with care, represent reliable and valid data (to a degree)
that permit the examination of different aspects of the learner’s pragmatic compe-
tence (pragmalinguistic and sociopragmatic knowledge), including interactional
competence in face-to-face or telephone dyadic interactions. Role plays represent
a viable method for examining pragmatic development across learners of various
proficiency levels and in different learning contexts. The choice of the type of role
play (archetypal open role play, naturalized role play, simulated role-play tasks,
OPI role play) depends on the researcher’s ability to design a task that yields data
that approximate natural discourse. The OPI role play and the AP role-play tasks
are used for assessment and testing purposes to determine the learner’s level of
pragmatic and discourse competence. To increase the validity of role-play data,
researchers should triangulate them with data from other sources, such as retrospective
verbal reports, field notes, production questionnaires, and natural discourse.

References

Al-Gahtani, Saad and Carsten Roever
2012 Proficiency and sequential organization of L2 requests. Applied Linguistics
33(1): 42–65.
Billmyer, Kristine and Manka Varghese
2000 Investigating instrument-based pragmatic variability: Effects of enhancing
discourse completion tests. Applied Linguistics 21: 517–552.
Brown, Annie
2004 Discourse analysis and the oral interview: Competence or performance. In:
Diana Boxer and Andrew D. Cohen (eds.), Studying Speaking to Inform Sec-
ond Language Learning, 253–282. Clevedon: Multilingual Matters.
Brown, Jean D.
2001 Pragmatic tests: Different purposes, different tests. In: Kenneth Rose and
Gabriele Kasper (eds.), Pragmatics in Language Teaching, 301–325. Cam-
bridge: Cambridge University Press.
Brown, Jean D.
2008 Raters, functions, item types, and the dependability of L2 pragmatics tests. In:
Eva Alcón Soler and Alicia Martínez-Flor (eds.), Investigating Pragmatics in
Foreign Language Learning, Teaching and Testing, 224–248. Clevedon: Mul-
tilingual Matters.
Cohen, Andrew D. and Ernesto Macaro
2010 Research methods in second language acquisition. In: Ernesto Macaro (ed.),
Continuum Companion to Second Language Acquisition, 107–133. London:
Continuum.
Cohen, Andrew D.
2012 Research methods for describing variation in intercultural pragmatics for cultures
in contact and conflict. In: J. César Félix-Brasdefer and Dale A. Koike
(eds.), Pragmatic Variation in First and Second Language Contexts: Method-
ological Issues, 17–48. Amsterdam: John Benjamins.
Crookall, David and Danny Saunders
1989 Communication and Simulation. Clevedon, UK: Multilingual Matters.
Demeter, Gusztav
2007 Role-plays as a data collection method for research on apology speech acts.
Simulation and Gaming 38(1): 83–90.
Dörnyei, Zoltan
2007 Research Methods in Applied Linguistics. Oxford: Oxford University Press.
Duranti, Alessandro
2009 Linguistic Anthropology, 2nd ed. Oxford, UK: Wiley-Blackwell.
Félix-Brasdefer, J. César
2011 Cortesía, prosodia y variación pragmática en las peticiones de estudiantes uni-
versitarios mexicanos y dominicanos [Politeness, prosody and pragmatic vari-
ation in requests among Mexican and Dominican university students]. In: Car-
men García Fernández and Maria Elena Placencia (eds.), Estudios de variación
pragmática [Studies of Pragmatic Variation], 57–86. Buenos Aires: Dunken.
Félix-Brasdefer, J. César and Maria Hasler-Barker
2015 Complimenting in Spanish in a short-term study abroad context. System 48:
75–85.
Félix-Brasdefer, J. César
2002 Refusals in Spanish and English: A cross-cultural study of politeness strate-
gies among speakers of Mexican Spanish, American English, and American
learners of Spanish as a foreign language. Ph.D. dissertation, University of
Minnesota, Minneapolis, MN, USA.
Félix-Brasdefer, J. César
2004 Interlanguage refusals: Linguistic politeness and length of residence in the
target community. Language Learning 54: 587–653.
Félix-Brasdefer, J. César
2006 Teaching the negotiation of multi-turn speech acts: Using conversation-analytic
tools to teach pragmatics in the classroom. In: Kathleen Bardovi-Harlig,
César Félix-Brasdefer and Alwiya Omar (eds.), Pragmatics and Language
Learning, Volume 11, 165–197. Manoa, HI: Second Language Teaching and
Curriculum Center, University of Hawai‘i.
Félix-Brasdefer, J. César
2007 Pragmatic development in the Spanish as an FL classroom: A cross-sectional
study of learner requests. Intercultural Pragmatics 4(2): 253–286.
Félix-Brasdefer, J. César
2008 Politeness in Mexico and the United States: A Contrastive Study of the Reali-
zation and Perception of Refusals. Amsterdam: John Benjamins.
Félix-Brasdefer, J. César
2009 Pragmatic variation across Spanish(es): Requesting in Mexican, Costa Rican,
and Dominican Spanish. Intercultural Pragmatics 6(4): 473–515.
Félix-Brasdefer, J. César
2010 Data collection methods in speech act performance: DCTs, role plays, and
verbal reports. In: Esther Usó Juán and Alicia Martínez-Flor (eds.), Speech
Act Performance: Theoretical, Empirical, and Methodological Issues, 41–56.
Amsterdam: John Benjamins.
Gass, Susan M. and Noel Houck
1999 Interlanguage Refusals: A Cross-Cultural Study of Japanese-English. Berlin/
New York: Mouton de Gruyter.
Gay, Lorraine R., Mills, Geoffrey E. and Peter W. Airasian
2009 Educational Research: Competencies for Analysis and Applications (9th edi-
tion). Upper Saddle River, NJ: Pearson Education.
Hasler-Barker, Maria
2013 Effects of pedagogical intervention on the production of the compliment and
compliment response sequence by second language learners of Spanish. Ph.D.
dissertation, Indiana University Bloomington, Bloomington, IN, USA.
Jefferson, Gail
2004 Glossary of transcript symbols with an introduction. In: Gene H. Lerner (ed.),
Conversation Analysis: Studies from the First Generation, 13–31. Amsterdam/
Philadelphia: John Benjamins.
Kasper, Gabriele and Merete Dahl
1991 Research methods in interlanguage pragmatics. Studies in Second Language
Acquisition 13: 215–247.
Kasper, Gabriele
2000 Data collection in pragmatics research. In: Helen Spencer-Oatey (ed.), Cul-
turally Speaking: Managing Rapport through Talk across Cultures, 316–369.
London, UK: Continuum.
Lee, Jih-Ye
2014 Speech and gesture in route direction-giving interactions. PhD dissertation,
Department of Second Language Studies, Indiana University, Bloomington,
IN, USA.
Leech, Geoffrey
1983 Principles of Pragmatics. New York: Longman.
Mackey, Alison and Susan Gass
2005 Second Language Research: Methodology and Design. Mahwah, NJ: Law-
rence Erlbaum.
Márquez Reiter, Rosina
2000 Linguistic Politeness in Britain and Uruguay: A Contrastive Study of Requests
and Apologies. Amsterdam/Philadelphia: John Benjamins.
Márquez Reiter, Rosina, Isobel Rainey and Glenn Fulcher
2005 A comparative study of certainty and conventional indirectness: Evidence
from British English and Peninsular Spanish. Applied Linguistics 26: 1–31.
McDonough, Steven H.
1986 Psychology in Foreign Language Teaching. London: Allen and Unwin.
Okada, Yusuke and Tim Greer
2013 Pursuing a relevant response in oral proficiency interview role plays. In: Ste-
ven Ross and Gabriele Kasper (eds.), Assessing Second Language Pragmatics,
288–310. Basingstoke: Palgrave Macmillan.
Rintell, Ellen M. and Candace J. Mitchell
1989 Studying requests and apologies: An inquiry into method. In: Shoshana Blum-
Kulka, Juliane House and Gabriele Kasper (eds.), Cross-Cultural Pragmatics:
Requests and Apologies, 248–272. Norwood, NJ: Ablex.
Rose, Kenneth
2000 An exploratory cross-sectional study of interlanguage pragmatic development.
Studies in Second Language Acquisition 22: 27–67.
Ross, Steven and Gabriele Kasper
2013 Assessing Second Language Pragmatics. Basingstoke: Palgrave Macmillan.
Sarbin, Theodore R. and Donald S. Jones
1955 An experimental analysis of role behavior. Journal of Abnormal and Social
Psychology 51: 236–241.
Sarbin, Theodore R.
1943 The concept of role-taking. Sociometry 6(3): 273–285.
Scarcella, Robin
1979 On speaking politely in a second language. In: Carlos Yorio, Kyle Perkins
and Jacquelyn Schachter (eds.), TESOL ‘79: The Learner in Focus, 275–287.
Washington, DC: TESOL.
Schauer, Gila
2004 ‘May you speak louder maybe?’ Interlanguage pragmatic development in
requests. EUROSLA Yearbook 4: 253–272.
Schegloff, Emanuel
2007 Sequence Organization in Interaction: A Primer in Conversation Analysis I.
Cambridge: Cambridge University Press.
Schneider, Klaus
2010 Variational pragmatics. In: Mirjam Fried (ed.), Variation and Change: Prag-
matic Perspectives, (Handbook of Pragmatic Highlights 6.) 239–267. Amster-
dam: John Benjamins.
Selting, Margret
2010 Prosody in interaction: State of the art. In: Dagmar Barth-Weingarten, Elis-
abeth Reber and Margret Selting (eds.), Prosody in Interaction, 3–40. Amster-
dam: John Benjamins.
Thomas, Jenny
1983 Cross-cultural pragmatic failure. Applied Linguistics 4(2): 91–112.
Tran, Giao Q.
2006 The naturalized role-play: An innovative methodology in cross-cultural and
interlanguage pragmatics research. Reflections on English Language Teaching
5(2): 1–24.
Trosborg, Anna
1995 Interlanguage Pragmatics: Requests, Complaints, and Apologies. Berlin:
Mouton de Gruyter.
Walters, Joel
1980 Grammar, meaning and sociocultural appropriateness in second language
acquisition. Canadian Journal of Psychology 34: 337–345.
Wolfson, Nessa
1989 Perspectives: Sociolinguistics and TESOL. New York: Newbury House Pub-
lishers.
IV. Observational pragmatics
13. Introduction to part 4: Observational pragmatics
Andreas H. Jucker

1. Introduction

Parts 4 and 5 of this handbook are devoted to methods of analysis that rely on
observational data, that is to say on data that have an existence outside of the
research context and which have not been experimentally elicited or created by
the researcher. Part 4 focuses on methods of analysis that are mainly qualitative
and rely on relatively small sets of data, consisting, for instance, of transcriptions
of audio- or video-recorded data, field notes of various types or small samples
of written texts. Part 5, in turn, will focus on research methods that are mostly
quantitative in nature and depend on larger data samples, which require computer-assisted
retrieval techniques.
The distinction between qualitative and quantitative research is here used
mainly as a convenient structuring principle. It is not a distinction that can be
applied in any categorical manner. In a general sense, qualitative approaches focus
on functional aspects of linguistic entities; they focus on careful descriptions of
generally small sets of data without considering numerical data, such as frequency
figures or measurements (Andersen 2011: 587). Patterns and generalisations are
described on a small scale. Distributional differences based on statistical informa-
tion are less important. The focus is very much on the description of the details, on
meanings and functions in context.
Quantitative research, on the other hand, is based on numerical data, on meas-
urements and frequencies. Such approaches are generally based on large datasets.
Patterns and generalisations are described on a large scale and often different data-
sets are compared in terms of the frequencies of certain entities or other measure-
ments (see Rühlemann 2011). Quantitative research depends on countable or meas-
urable entities, and such entities depend on the classification of entities gained
through qualitative research. In this sense, quantitative research is not possible
without a qualitative foundation (see also chapter 18, the introduction to section 5
of this handbook). Qualitative research, on the other hand, appears to be possible
without any quantification of its categories, except that the qualitative description
of categories in a set of data always makes the, to some extent, quantitative point
that this category at least exists in this particular set of data.
In section 2 of this introductory chapter, I will briefly problematize the concept
of “naturally occurring”, which is often seen as the gold standard for observational
pragmatic research. The final section will introduce the four papers of this section
of the handbook. For more details on the different types of data in pragmatic
research the reader is referred to chapter 1 and for a more detailed introduction
of corpus pragmatic approaches to chapter 18, which introduces section 5 of this
handbook.

https://doi.org/10.1515/9783110424928-013
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 335–342. Berlin/
Boston: De Gruyter Mouton.

2. The concept of “naturally occurring”

For many approaches in pragmatics, “naturally occurring” data seem to be some
kind of gold standard. Data should be as uncontaminated by any researcher influence
as possible. While it is acknowledged that this is an ideal that is hard to
achieve, everything should be done to minimise the undesirable effects of research-
ers’ impact on the data. Schegloff (1996: 468), for instance, argues that only “natu-
rally occurring interactional environments which seem to be the natural, primordial
home for language use” can serve as data for a conversation analytical research
agenda. Have (2007) makes a similar point:
The general CA recommendation for making recordings is that these should catch “nat-
ural interaction” as fully and faithfully as is practically possible. The term “natural” in
this expression refers to the ideal that the interactions recorded should be “naturally
occurring”, that is “non-experimental”, not co-produced with or provoked by the re-
searcher. […] In other words, the ideal is to (mechanically) observe interactions as they
would take place without research observation, but one can never really verify this.
(Have 2007: 68)
The distinction between “naturally occurring” or “authentic” data on the one hand
and “contrived” or “researcher prompted” data on the other seems to be relatively
clear. It is the distinction between data that “would have occurred anyway without
the intervention of the researcher” and data that have been “deliberately elicited
by the researcher, by setting up conversations or speaking tasks for the purpose of
analysis” (Archer, Aijmer and Wichmann 2012: 12). Potter (2002) makes the same
point with his “(conceptual) dead social scientist’s test”, in which he asks:
Would the data be the same, or be there at all, if the researcher got run over on the way
to work? An interview would not take place without the researcher there to ask the
questions; a counselling session would take place whether the researcher turns up to
collect the recording or not. (Potter 2002: 541; see also Golato 2017: 21, and Golato
and Golato, this volume)

He suggests that the term “natural” should be replaced by “naturalistic” because
of the many ways in which data generally fall short of the ideal of being entirely
natural in the strict sense (Potter 2002: 540). This would then provide a “useful
contrast between data that are got up and data that are, at least ideally, not, while
recognizing the limits on that distinction” (Potter 2002: 541).
However, even in data that would exist without the intervention of the
researcher, there are different levels of “naturalness”. The dead social scientist’s
test focuses solely on the impact the researcher has on bringing about the communicative
event under observation. There are three additional dimensions or scales
along which speech data can be classified as being either naturally occurring or
contrived; these concern the purpose of the communicative event, the level of
researcher participation in the event and the manner in which it is transformed into
a written form as a basis for subsequent inspection and analysis. These dimensions
are partly interlinked but they cannot simply be subsumed under the dead social
scientist’s test.
On the dimension of the purpose of the communicative event, we can dis-
tinguish between those speech events that have a purpose outside the research
context and those whose purpose is entirely research centred. The counselling
session given as an example by Potter in the quotation above has a purpose in
itself. Both the counsellor and the client have communicative goals that are not
dictated by the research context. In a role play, at the other end of the spectrum,
the interactants take part as if play acting. The communicative goals are prescribed
by the researcher, and the complaints, requests or apologies acted out in these
situations do not have real-world consequences. However, communicative events
can also occupy some middle ground between these extremes. In Rüegg’s (2014)
study of restaurant interactions, for instance, the data consisted of interactions
recorded in different types of restaurants in Los Angeles. These interactions were
clearly staged for the purpose of the research but they had real-life consequences
in that the researcher and her assistants who acted as customers were served drinks
and food and were asked to pay for these services. The waiters who served the
researcher and her friends arguably interacted with them as they normally interact
with restaurant guests in spite of the fact that most of the recorded interactions
would presumably not have taken place without the research project.
The next dimension that needs to be considered concerns the researcher’s participation
or non-participation in the speech event under analysis. Here the spectrum
ranges from data that have been produced without any participation, and perhaps
even without any knowledge, of the researcher to data in whose production the
researcher plays an active part. The data appear to be most
“natural” if the researcher plays no part in the speech event at all. In Labov’s (1972:
209) terms, “the aim of linguistic research in the community must be to find out
how people talk when they are not being systematically observed, yet we can only
obtain these data by systematic observation”. According to Labov, this “Observer’s
Paradox” can be overcome in various ways, for instance by diverting an inter-
viewee’s attention away from speech, which will “allow the vernacular to emerge”
(Labov 1972: 209). In spite of the success that Labov had with this method, such
data would presumably still not count as entirely “natural” or even “naturalistic”.
Depending on the type of data being recorded, the researcher’s involvement
can vary considerably. In some cases, the researcher is a silent observer who tries
to behave as unobtrusively as possible, but even in this case his or her presence
might affect the speech event under observation. The researcher might be involved
as one of the participants with a more or less active role in the proceedings with a
correspondingly higher influence on the speech event. Or, in the case of role plays,
the researcher might even play the role of a movie director who assigns roles and
tasks that the participants are supposed to play act. It is difficult to decide at which
exact point between the extremes the situation is no longer “natural” and becomes
“contrived”.
And, finally, the speech situation under observation can only be analysed if
at least some aspects are recorded and made permanent. This ranges from field
notes to audio and video recordings. Field notes necessarily require the presence
of a researcher who observes the situation and decides on the aspects that need
to be written down for subsequent analysis. In many cases, field notes have the
advantage that they can be taken relatively unobtrusively, sometimes even after
the event. But field notes are necessarily highly selective. The researchers must
decide in advance what they want to focus on, and they have to be alert and quick
in order not to miss relevant parts while taking notes, and it may be very difficult to
remember the crucial aspects of an interaction in the necessary detail. As a result,
the field notes might be idealised rather than one hundred per cent accurate.
Recordings are more comprehensive than field notes, especially in the case of
video recordings. They are much richer in the details that they capture but their
comprehensiveness is also deceptive. Participants in the interaction, perhaps even
including the researcher as participant observer, may have background knowledge
that allows them to read between the lines of what is going on in the interaction.
These may be aspects that fail to show up on recordings made by the impartial
technical equipment. Microphones and cameras impose certain perspectives. They
highlight some aspects of what is going on and leave others in the dark, often
literally.
Ethical considerations are less restrictive for field notes than for recordings.
The anonymity of the participants obviously needs to be observed but informed
consent is not always necessary if the researcher only takes notes and does not
make any audio- or video-recordings. For such recordings informed consent has to
be obtained from all participants prior to their being recorded. This requirement in
effect rules out the possibility that any data can be truly “natural”. “From this perspective, then,
all data are researcher-prompted and thus contrived” (Speer 2002: 516; empha-
sis original). This is presumably the reason why Hambling-Jones and Merrison
(2012: 1121) argue that surreptitious recordings and retrospective consent might in
some situations be superior to pre-obtained consent, but it is doubtful whether the
majority of ethical review committees would agree to this position, and in many
countries this would be clearly illegal.
Recordings of speech data have to be transcribed to make them accessible to
analysis (see Kreuz and Riordan chapter 3, this volume). However, even a very
rich and detailed transcription is an idealisation and abstraction of the actual reality
that it represents. It imposes the transcriber’s perspective on the data and his
or her decisions about the details that are included and the details that have been
omitted. “Transcription is theory. […] How we transcribe doesn’t just reflect our
theories of language, it also shapes them, drawing our eyes to some phenomena
while leaving others in shadow” (Du Bois 1991: 71). As a result, we cannot expect
our transcriptions to be an unadulterated representation of reality. A transcription
is necessarily a somewhat distorted – or contrived – version of the communicative
reality it tries to represent.
Thus, we have to be aware of the many ways in which the pragmaticist’s data
fail to be truly “natural”. Generally, it is more important to carefully assess the
limitations of the available data and to evaluate its suitability for specific research
questions, rather than to aim for an unrealistic goal (see also Jucker 2009).

3. The papers in this section

There are four papers in part 4 of this handbook. In the first paper, Meredith Marra
and Mariana Lazzaro-Salazar present ethnographic methods. The term “ethnogra-
phy” covers a broad range of methods but they all go back to an approach developed
by cultural anthropologists. Researchers immerse themselves as much as possible
in a community in order to provide detailed, “thick” descriptions of community
activities. It is through this participation that the researcher gains a deeper insight
into a particular culture and its communicative practices. It provides an analysis
that combines an outside perspective (an etic or technical point of view) with an
inside perspective (the emic perspective, the point of view of the community mem-
bers themselves). Marra and Lazzaro-Salazar illustrate ethnographic methods with
a discursive approach to politeness in their work on language use in a workplace
context. They focus in particular on the prevalent data collection methods, field
notes, observations and interviews, and on the different ways of working with
such data. At the end of their contribution they also address several frequently
voiced critiques of ethnographic studies, for instance the critique that ethnographic
research can never be sufficiently objective because of the inevitable subjectivity
of the data gathering techniques. Further problems are the time commitment that
is necessary for data collection and the limited generalisability of the observed
patterns beyond the investigated communities.
The paper by Andrea and Peter Golato deals with ethnomethodology, conversa-
tion analysis and interactional linguistics, which they describe against a historical
backdrop and the seminal work of Erving Goffman, Harold Garfinkel and later
Harvey Sacks. Ethnomethodology, conversation analysis and interactional linguis-
tics share most of their underlying assumptions but there are also differences that
the authors carefully tease out. Ethnomethodology, for instance, focuses more on
how interactants engage in social actions through talk in interaction, while conversation
analysis focuses more on the underlying order of talk itself. Both of them
adopt the perspective of the interactants and investigate how they use language to
create meaning. Utterances are not seen in isolation but in the sequential context
in which they occur. Conversation analysis and interactional linguistics insist on
audio- and video-recorded naturally occurring data that conform to Potter’s (2002)
dead social scientist’s test (see above), and great care is taken with the transcription
process that turns the data into written representations. Golato and Golato’s outline
finishes with a discussion of the range of research topics that have been tackled
with the methodologies of conversation analysis and interactional linguistics, a
discussion of their strengths and weaknesses, as well as some brief comments on
current and future applications of these methods.
Anita Fetzer covers approaches under the general heading of discourse anal-
ysis. The two main issues, according to her, are the granularity of the discourse
units and the nature of their connectedness. Discourse is seen as a parts-whole
configuration in which the whole is more than the sum of its parts. It is the dis-
course units at whatever granularity they are proposed that form the constitutive
elements in the structuring and linearization of discourse. Fetzer also brings in the
terms quantity and quality. However, she uses them in a slightly different manner
from what has been outlined above. Here, quantity relates to the number of consti-
tutive parts of discourse, i. e. the number of discourse units, while quality relates
to the pragmatics of the discourse units, that is to say the way in which they are
integrated into their context and connected with neighbouring units. Quantitative
studies, therefore, tend to focus on the linear sequence of discourse units and their
connectedness, while qualitative studies tend to focus on how interlocutors co-con-
struct and negotiate discourse coherence.
The final paper in this section by Piotr Cap covers Critical Discourse Anal-
ysis (CDA). Cap uses the term Critical Discourse Analysis as a cover term for
a range of different approaches that vary in their underlying notions and in their
research methodology but have in common that they intend to be instrumental
in bringing about social change. In this, CDA approaches differ from almost all
other linguistic theories, which insist on being descriptive, impartial and detached.
CDA is unashamedly partisan. It tries to uncover social injustice and to highlight
how language is used to exert institutional power by the elite. Cap teases out the
interconnectedness of different branches of CDA and their methodological attractors,
that is to say the basic methodologies from which these branches draw their
research tools, and he discusses the ways in which CDA and pragmatics are related.
He also sketches out a CDA model, called a legitimization-proximization model
(Cap 2013), which he uses for a case study in which he analyses a speech by U.S.
President George W. Bush, given only weeks before U.S. and coalition troops
entered Iraq on March 19, 2003. The model helps to unravel the ways in which
Bush construes and manipulates closeness and remoteness in the political sphere
in order to create credibility and legitimization of the Iraq war and the subsequent
anti-terrorist campaigns.
Thus, in contrast to section 3 of this handbook, which was devoted to various
ways of eliciting relevant data for pragmatic research, this section focuses on
approaches that deal with pre-existing data. The emphasis is squarely on observa-
tion and analysis of what is already there, be it spoken communication or written
communication. All contributions in this section focus on approaches that prefer
qualitative methods of analysis with an insistence on careful attention to small
details and richly contextualised data samples. In this respect, they differ significantly
from the approaches reviewed in the contributions of section 5 of this handbook,
which seek generalisations at a higher level and across much larger data sets.

References

Andersen, Gisle
2011 Corpus-based pragmatics I: Qualitative studies. In: Wolfram Bublitz and Neal
R. Norrick (eds.), Foundations of Pragmatics, 587–627. (Handbooks of Prag-
matics 1.) Berlin: de Gruyter Mouton.
Archer, Dawn, Karin Aijmer and Anne Wichmann
2012 Pragmatics. An Advanced Resource Book for Students. (Routledge Applied
Linguistics.) London: Routledge.
Cap, Piotr
2013 Proximization. The Pragmatics of Symbolic Distance Crossing. (Pragmatics &
Beyond New Series 232.) Amsterdam: John Benjamins.
Du Bois, John W.
1991 Transcription design principles for spoken discourse research. Pragmatics
1(1): 71–106.
Golato, Andrea
2017 Naturally occurring data. In: Anne Barron, Yueguo Gu and Gerard Steen (eds.),
The Routledge Handbook of Pragmatics, 21–26. London: Routledge.
Hambling-Jones, Oliver and Andrew John Merrison
2012 Inequity in the pursuit of intimacy: An analysis of British pick-up artist inter-
actions. Journal of Pragmatics. Special Issue: Im/politeness across Englishes,
44(9): 1115–1127.
Have, Paul ten
2007 Doing Conversation Analysis. A Practical Guide. Second edition. (Introducing
Qualitative Methods.) London: Sage.
Jucker, Andreas H.
2009 Speech act research between armchair, field and laboratory: The case of com-
pliments. Journal of Pragmatics 41(8): 1611–1635.
Labov, William
1972 Sociolinguistic Patterns. Philadelphia: University of Pennsylvania Press.
Potter, Jonathan
2002 Two kinds of natural. Discourse Studies 4(4): 539–542.
Rüegg, Larssyn
2014 Thanks responses in three socio-economic settings: A variational pragmatics
approach. Journal of Pragmatics 71: 17–30.
342 Andreas H. Jucker

Rühlemann, Christoph
2011 Corpus-based pragmatics II: Quantitative studies. In: Wolfram Bublitz and
Neal R. Norrick (eds.), Foundations of Pragmatics, 629–656. (Handbooks of
Pragmatics 1.) Berlin: de Gruyter Mouton.
Schegloff, Emanuel A.
1996 Some practices for referring to persons in talk-in-interaction: A partial sketch
of a systematics. In: Barbara Fox (ed.), Studies in Anaphora, 437–485. (Typo-
logical Studies in Language 33.) Amsterdam: Benjamins.
Speer, Susan A.
2002 “Natural” and “contrived” data: A sustainable distinction? Discourse Studies
4(4): 511–525.
14. Ethnographic methods in pragmatics
Meredith Marra and Mariana Lazzaro-Salazar¹

Abstract: Researchers within the ethnographic paradigm strive for enhanced,
contextualised understandings of pragmatic phenomena as understood within the
wider social system. This typically involves incorporation of both emic (partici-
pant) and etic (analysts’ technical) perspectives. In this chapter we unpack some of
the principles underlying the paradigm by describing the foundations of ethnogra-
phy, the emergence and development of ethnographic methods within pragmatics
and a range of data collection techniques and analytic approaches used in ethno-
graphic research. By way of illustration we draw on our own experiences inves-
tigating pragmatic issues within workplace talk. In this chapter we also reflect on
the strengths and weaknesses of an ethnographic approach, focusing in particular
on the objectivity-subjectivity divide, the associated time commitment and the
ethnographer’s attitude to generalisability. We conclude by discussing the potential
benefits that the approach offers for future work on pragmatic phenomena.

1. Introduction

While those working in pragmatics make use of data collected in a number of
different ways (as described in the extensive range of chapters in this handbook),
analysts do not always capture, or even aim to capture, the inherent “messiness”
of real life interaction. For many, however, empirical evidence which supports an
understanding of everyday, naturally-occurring talk is a necessary component of
their work. These researchers typically aim for rich and detailed contextual infor-
mation to underpin their interpretation of linguistic data. The foundation on which
their approach is built is ethnography.

1 Mariana Lazzaro-Salazar is supported by CONICYT/FONDECYT nº 3160104.

https://doi.org/10.1515/9783110424928-014
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 343–366. Berlin/
Boston: De Gruyter Mouton.

Ethnography has been defined as “a whole cluster of methods for gathering
data, analyzing, interpreting, and writing” about the day-to-day interactions of a
group of people (Davis and Henze 1998: 400). An approach developed by cultural
anthropologists, ethnography has the broad goal of understanding cultural behaviour
and norms, as well as the beliefs and ways of living of a target community, through
deep, long-term engagement. Thus, ethnographers immerse themselves in a community
or culture to gain this detailed understanding. Early examples of ethnographers
include Fanny Wright and Harriet Martineau, who investigated societal
norms and everyday life in North America in the 19th century, as well as Margaret
Mead and Bronisław Malinowski, who famously worked in the Pacific in the 1920s,
increasing public awareness of the methodology in the process (see Griffin and
Bengry-Howell 2008: 15–16). As a research practice, ethnographers engage with
the target community over an extended period of time, conducting “unstructured
field research” (Burgess 1982) through their participation in community life. This
participation often amounts to years of involvement during which researchers learn
the language and take part in social events alongside their research participants.
This participatory approach provides greater access to the participants’ under-
standing of events. This allows the researcher to include both an etic (technical)
and an emic (community member) perspective (see Boyle 1994). The addition
of an emic perspective separates ethnographic approaches from those scientific
approaches purporting to be objective. The reason the community view is fore-
grounded is to bring to the surface cultural meanings that even the most expert
scientific eye may not be able to identify or understand without the help of com-
munity members. The weight given to the emic perspective constitutes the most
important pillar of ethnographic research, especially its use in minimising the
researcher’s own external assumptions and categorization schemes, thus providing
“warrants” for the interpretations of the researcher and strengthening the validity
of the findings. Rather than viewing etic and emic as two distinct sources of data,
however, ethnographers recognise the interplay between the two (e. g. Zhu and
Bargiela-Chiappini 2013): the ethnographer works to interpret events and mean-
ings through the eyes of the study participants, while, at the same time, making
links between these interpretations and relevant social theory to assist in explaining
social phenomena from a scientific point of view. In other words, as research-
ers within this paradigm, we strive for enhanced contextualised understandings
of pragmatic phenomena understood within the wider social system, as well as
an ability to incorporate both the perspective of the analyst and the reality of the
participants themselves in our interpretations.
Below we describe the emergence and development of ethnographic methods
within pragmatics and a range of data collection techniques and analytic approaches
used by researchers who work with ethnographic data. To exemplify our description
we draw on our own experiences of investigating pragmatic issues within workplace
talk. We conclude by discussing the strengths and weaknesses of an ethnographic
approach and the potential benefits that the approach offers for future work.

2. From ethnography to ethnographic methods

The participation involved in a pure ethnography clearly requires enormous
dedication and time. This commitment can be both logistically and financially
prohibitive, not to mention potentially intrusive, especially for those researchers
operating outside the disciplinary traditions of anthropology. As a compromise, many
researchers collect the kinds of information gained in ethnography, but without
extended fieldwork. Thus, instead of two years or more in situ, analysts might
spend months or carefully planned weeks with the community under investiga-
tion, often making use of the participant observation prioritised in ethnography
and semi-structured fieldwork alongside some non-participant observation (for a
consideration of participant and non-participant observation see section 4.2). The
resulting data is more typically labelled “ethnographic”, representing the flavour
of ethnography and acknowledging the more limited participation involved in the
data collection methods.
As a summary, Hammersley (1990: 1–2) suggests the following five features
for identifying ethnographic research:
1. Behaviour is studied in everyday contexts; there are no ‘unnatural’² or experimental
circumstances imposed by the researcher.
2. Observation is the primary means of data collection, although various other
techniques are also used.
3. Data collection is flexible and unstructured to avoid pre-fixed arrangements
that impose categories on what people say and do.
4. The focus is normally on a single setting or group and is small scale.
5. The data is analysed by attributing meanings to the human actions described
and explained.
One of the reasons this truncated approach produces feasible detail is a changing
conceptualisation of what counts as a “community” as well as the ways in which
we understand the relationship between a culture and the community members
who contribute to this culture. In the twentieth century, studies in ethnography
were guided by positivist approaches, which viewed the communities under inves-
tigation as relatively fixed cultural units, often described using static terms (see
Rosaldo 1989). In line with the constructionist turn in the social sciences (see Davis
and Henze 1998; de Volo and Schatz 2004), ethnographic approaches embrace the
idea that culture influences the behaviour of its people and also the idea that the
behaviour of the people contributes to the ongoing construction and (re)negotiation of
cultural norms. Current ethnographic views, guided by post-positivist principles,
support the belief that realities are multiple, and that there are multiple subcultures
and smaller social networks embedded in a wider cultural system.

2 See considerations regarding the impact of the presence of the researcher on the studied
community in section 7.2. Also see discussion on the “positionality” of the researcher
within the studied community in Denscombe (2014). Shanmuganathan (2005: 79) also
offers a rich discussion on this topic in the context of ethnographic methods of data
collection, reminding ethnographers that “the very act of observation itself affects
the phenomenon under study.” This, Shanmuganathan (2005) explains, is what social
researchers call the Observer’s Paradox (Labov 1972), which involves understanding
the ways in which the presence of the researcher affects the naturalness with which the
very activities under study are carried out.

The focus on subcultures is embraced by those who make use of a Community
of Practice approach (Wenger 1998). In this framework a community is described
as a group of people who come together and actively engage in mutual processes
of negotiation and meaning making, and who develop shared practices, norms
and repertoires which distinguish them from other communities (see Eckert and
McConnell-Ginet 1992; King forthcoming). Other relevant kinds of communities
(also stemming from Wenger’s work) include communities of purpose, communi-
ties of alignment and communities of imagination. Each term describes an identi-
fiable and distinct type of socially-relevant community.
As an example, we offer Lazzaro-Salazar’s (2013) ethnographic work with a
nursing team in a hospital in New Zealand. Here the community of nurses, “an
imagined world of nurses” to use Norton’s words (2001), bound by the same disci-
pline values and practices, was found to be highly salient to the participants in their
negotiation of meaning in face to face meetings. The notion of an imagined commu-
nity captures people who have variable access to activities and resources for partic-
ipation in the community, but share a sense of belonging (see also Anderson 1983).
Lazzaro-Salazar (2016) shows how her participants display membership of the
imagined professional community by drawing on values that bind nurses together,
regardless of where they come from and where they work, and values which also
set them apart from doctors and other health professionals. These include, among
other issues, an unwavering commitment to patient care, and complaints about the
arrangement of the roster and conditions on night shift. These key ideas are shared
by people across geographic borders and between people who have never, and might
never, interact with each other directly. This is a very different understanding of
community to the village or tribe which might be the focus in a standard ethnogra-
phy.
This changing conceptualisation of what counts as a community worthy of
investigation has prompted researchers to re-consider ethnography in its traditional
sense in favour of new methods that enable us to gain insiders’ perspectives of, for
instance, online communities and global communities. In these cases, researchers
simply cannot physically participate in the activities because the communities are
virtual or imagined. As an example, Locher (2006) investigated an internet-based
community of advice seekers and givers who use an online health forum. A digital
community was also the focus of Graham (2008), who analysed the discussion
threads in a Christian e-community to determine the members’ patterns of interac-
tion and norms for what counted as appropriate behaviour.
A methodological distinction is thus drawn between conducting a full “ethnog-
raphy” and doing “ethnographic research” (see Ramanathan and Atkinson 1999).
Using an ethnographic approach means researchers endeavour to access and interpret
social events of complex modern communities from multiple perspectives.
This involves a multiplicity of data collection techniques that allow for a holistic
approach to the study of culture. In this frame researchers attend to the emic point
of view, and yet sometimes do not actively participate as community members.
The role of the researcher and how they access and join their community is thus
a salient issue. In traditional ethnography, the researcher enters the community as an
outsider and as such makes “familiar” that which was previously unknown. A very
simple example might be recognizing important distinctions made in address forms
based on the relative age of the speakers, a dimension which is not consistently
relevant across communities. The lens of the newcomer highlights those aspects of
community behaviour which are distinctive, or at least different to the researcher’s
own cultural norms. Over time this status as an outsider becomes less clear-cut as
the researcher becomes more integrated into the community. There is considerable
debate about the merits of remaining distant or fully embracing membership of
the community. As a result, the importance of the outsider-insider perspective has
gained a lot of scholarly attention (see, for example, McKinley Brayboy and Dey-
hle 2000; Bonner and Tolhurst 2002). More recently some researchers have begun
taking an ethnographic approach when investigating communities to which they are
already insiders, thus requiring them to make the familiar “unfamiliar” for analytic
purposes. In these cases the researcher needs to recognize and explore their own
assumptions about what is considered normal for the community. Regardless of
the stance one takes in this wider discussion, the researcher must remain critically
reflexive of their own role in, and influence on, community practices.
A fruitful and illustrative example which demonstrates both the changing con-
ceptualisations of community as well as the relevance of the outsider-insider and
insider-outsider debate is provided by critical ethnographer Kidner. Kidner (2015)
examined the language used by industry and activist groups when debating the
virtue of the extraction of natural resources. As such she was working with two
communities focused on the same issue but from opposing “political” perspectives.
Conducting her research in New Zealand, Kidner’s Canadian citizenship made her
an outsider in the local industry and activist communities alike. She was, however,
an insider when it came to a wider imagined community through her extensive
involvement in activism in her home country. Early in her research she began her
participation in the local activist community by attending a festival in a region
where lignite mining was being proposed and where there was grass-roots resist-
ance. As well as participating in the event and keeping field notes, Kidner video
recorded interviews with community members, to be interpreted using multimodal
analysis, and created a collection of relevant artefacts (posters, signs, advertising
and protest leaflets) to support her research. On the first night of the event the
organisers tasked her with the job of installing composting toilets (having used
one in the past, she possessed more knowledge than the others). This activity was
spotted by a journalist in the small New Zealand town where the festival took place
and she became a short term media star for her efforts. The attention required much
reflection: had she become too involved to have analytical distance? Would her
unexpected infamy negatively affect efforts at deep community participation? Did
the expertise being attributed to her mark her as an authority in a way that was
unhelpful for her research? In fact, Kidner found that this attention resulted
in her acceptance within the group as a committed, involved member and challenged
the stereotypical work-shy student identity with which, as was later reported,
she had risked being labelled. She describes her understanding of the shift in her
identity in her ethnographic field notes:
My enthusiasm for the project earned me the title of ‘the toilet lady’ […], and I even gave
an interview about the toilets’ construction to a local radio station. In a matter of days,
I had become a composting toilet expert and a festival organiser: I had moved from the
classic participant ‘outsider’ to ‘insider’ in a most unexpected fashion. (Kidner 2015: 66)

As noted above, while Kidner was an outsider to the industry perspective, she
acknowledged her insider status within the wider activist community. To ensure
an appropriate degree of analytical distance, she made efforts to make her own
practices “unfamiliar” to be able to describe them for others. These practices align
with the assumptions and practices that guide the ethnographic approach as outlined
in the list from Hammersley (1990) reproduced above. She used several
methods of data collection to provide multiple lenses on the community norms;
she created a “thick”, detailed description of the community (following the original
conceptualisation by Geertz 1973) and then explored the research findings to
address social issues.
The reflection created unforeseen understandings including the identification
of a repeated pattern in the sequential ordering of strategies used by the communi-
ties to resist opposing public arguments (namely drawing first on discourses of the
environment vs. the economy, then regional identity, rights of indigenous people,
and, finally, the queer community), a finding which held across the two national
contexts in which she had become a community member. Intensive engagement
and reflection were crucial to the success of the work.
Unsurprisingly, the affordances of the approach have not been overlooked by
those working in pragmatics who increasingly look to ethnographic methods for
explaining various features of interaction.

3. The growth of ethnographic studies in pragmatics

For those interested in discursive pragmatics, emphasis is placed on the environment
in which the feature or phenomenon is situated. The adoption of ethnographic
methods into pragmatics can be traced to the theory of communicative competence
pioneered by Dell Hymes (working with other anthropologically-oriented linguists
in the 1970s) and the related framework known as “Ethnography of Speaking”
(later “of Communication”). This method makes use of detailed contextual infor-
mation, including physical setting and non-verbal communicative components, to
determine community “rules of interaction” and “norms for interpretation”. The
approach arose as a scholarly reaction to the prevailing theoretical linguistic mod-
els which were based on the notion of an ideal speaker-hearer who operated in a
homogeneous speech community (Chomsky 1965). Those models assume children
acquire linguistic competence through an internal, mental process. Hymes (1974)
argued instead that all linguistic systems are embedded within a social matrix which
requires children to acquire “a system of its use” regulated by contextual factors,
labelled as components of communicative events. The Hymesian approach has
influenced many different areas of research and is foundational within socio-cul-
tural approaches to linguistics in particular. The core argument, whereby contextual
information is privileged in interpretation, continues to shape our understandings
of meaning in interaction and is central in discursive approaches to pragmatics.
To illustrate the ways in which ethnographic methods have been applied within
pragmatics specifically, we have chosen two rich research areas, namely discursive
approaches to politeness and language use in the workplace context.
When Brown and Levinson ([1978] 1987) proposed “Politeness: Some Univer-
sals in Language Usage” in their seminal and highly influential monograph, they
created a productive, systematic framework for an etic approach to politeness. A
major criticism aimed at the work, however, was a Western bias and the lack of
applicability for investigating politeness across cultures. In response, Eelen (2001)
proposed the concepts of first order and second order politeness, which loosely
map to emic and etic understandings respectively (but see distinctions drawn in
Haugh 2012). The emphasis on the perspectives of both the analyst and the par-
ticipants themselves was also core to arguments proposed by Watts (who argued
for politeness1 and politeness2) and his work with Locher. In their framework for
discursive politeness, Locher and Watts (2005) promote the participants’ perspec-
tive on what is considered (un)marked, (non)politic and (in)appropriate. Positively
marked behaviour counts as both polite and appropriate, and negatively marked
behaviour (including both impolite and overly polite practices) is labelled as
inappropriate.
In order to ascertain what a community deems appropriate, the analyst needs
access to social norms. Restricting the focus to a particular Community of Practice
(as described above) has proven fruitful for determining these negotiated norms.
This has been a particularly popular method for those in pragmatics who align
with the field of workplace discourse research where the establishment of group
norms in workplace teams has been a regular focus (for example, Angouri 2012;
Schnurr and Chan 2009; Mullany 2007; Marra 2012). As an example, Holmes
and Marra (2011) compared the openings of meetings recorded by teams in New
Zealand workplaces, contrasting Communities of Practice which oriented to indigenous
Māori norms and those who recognised dominant Pākehā³ norms. The analysis,
which drew on emic understandings gathered through extensive recordings,
non-participant observation, interviews, debriefs with community insiders and
“member checking” (Guba and Lincoln 1989), highlighted the difference in the
structure of the meetings. The more elaborate and extended structure of the Māori-
aligned teams created “safe space in which cultural awareness can be mediated
and discussed” (see Boxer 2003: 62). The short, to the point, openings of the other
teams highlighted how much shared knowledge the teams had and suggested that
the longer opening was considered unnecessarily ceremonial.
A difficult task for the researcher in this context is to keep the balance between
the emic and the etic perspectives that account for their methodological and ana-
lytical design. Most, like Holmes and Marra or Kidner above, attempt to find this
balance using a range of data collection methods.

4. Data collection methods

Viewing cultures holistically as complex networks, ethnographic researchers aim
to access and to interpret social events from multiple perspectives. Below we introduce
a number of the methods (although not an exhaustive list; for a more comprehensive
consideration of data collection techniques and practical tips see Holmes
2014) which are regularly part of an ethnographic toolkit in pragmatics. The meth-
ods are ordered from those that most closely resemble “pure” ethnography to those
which require more limited engagement and investment from the researcher.

4.1. Field notes

The ethnographic researcher relies heavily on field notes collected as part of the
ongoing reflection which is characteristic of their work. There are countless vol-
umes providing advice on how to produce useful and useable field notes that allow
the researcher to capture thoughts, feelings and detail for later reflection (e. g.
Rizzo, Corsaro and Bates 1992). Some offer ideas on how to lay out your notes in
order to revisit and structure ideas from the outset (e. g. Emerson, Fretz and Shaw
2011). Others discuss software and freeware which can help organise ideas for
easy retrieval (e. g. Silver and Lewins 2014). They all share an emphasis on sys-
tematicity, thoroughness and the need to be able to undertake regular and repeated
reflection to establish cumulative knowledge.

3 People of European origin (see brief consideration of the term in Schnurr, Marra and
Holmes 2007).

The quote from Kidner above, collected at the time of her “toilet lady” adventures,
offers an example of field notes. Figure 1 is another example, this time from
research by Lazzaro-Salazar (2013). In this case rather than demonstrating her
reflections and intuitions, the diagram provides contextual information about the
layout of the room in which her workplace recordings took place. Knowing where
hospital staff were located, what was in their immediate environment and who was
able to overhear their conversations helped Lazzaro-Salazar interpret the recorded
interactions with which she worked.

Figure 1: Diagram of room set created from field notes taken while recording data in a
hospital. The details help the analyst identify speakers in multiparty recordings
and also provide access to important physical features which are made salient
during the recording, for example, the location of the white board.

4.2. Observations
Field notes often include unstructured observations of distinctive community
behaviour as researchers witness interesting differences, but can also include more
structured and deliberate procedures based on observation logs. There are various
forms of observation used in ethnographic research, including both participant
observation, as is standard in ethnography, as well as non-participant observation.
Researchers might work alongside the community in the fields, prepare and eat
with household members and attend community events and religious ceremonies as
a way of experiencing life from the insider perspective. Alternatively, they might
engage more selectively in key events or activities as an outside observer. As an
example, Baxter (2009) sat at the back of the room during meetings involving
female leaders in business settings but did not engage in the meeting. Observing
affords the researcher access to data which can be invisible to the participants
themselves, and long-term observation has the potential to build trust with com-
munity members and to reduce reactions to the presence of an outsider. The obser-
vation also helps to clarify the kinds of questions that a researcher might want to
ask (themselves or a participant) to check early intuitions and to gain confidence
in any interpretations (see Adato 2008).
In her investigations of the role of humour in organisations, Plester (2015)
worked within a number of organisations for several months. Her research design
included significant periods of participant observation in order to understand the
humour which she acknowledges was not funny at all to her at the outset. Only by
being the butt of the joke and by participating in the regular playing of pranks did
she get a real sense of what it meant to be an accepted member of the team and
how not to take offence at things that at first seemed very cruel to her. Fletcher
(2011) also chose to participate in the organisation with whom she was working
by spending one day a week sitting at a desk in the large open plan office through-
out the duration of her doctoral work. For Fletcher it was important that she was
not only physically present at the IT company, but also that she had access to the
company-wide email chain for notices. She found this data invaluable for finding
out information about what was going on as well as identifying ways of expressing
this information (e. g. going for a haircut was not only something that was worth
telling colleagues about, but could also provide a chance to exhibit your wit and
outdo the announcements of others).
In the cross-cultural research aimed at describing effective leadership patterns
in Māori and Pākehā organisations in New Zealand mentioned above, the Welling-
ton Language in the Workplace team spent several months working with members
of a Māori organisation who volunteered to record their everyday interactions (see
Holmes, Marra and Vine 2011). As part of the recording process, the team (which
comprised Pākehā academics who began from an etic perspective and Māori
research assistants who acted as a bridge to emic understandings) spent time in the
organisations, setting up cameras for recording larger meetings, having informal
debriefing chats with the participants and engaging in other non-participant obser-
vations to gather as much contextual information as possible. As part of this prac-
tice all research team members kept detailed notes to provide a thick description
(see Figure 2). The Figure shows that separating field notes from observations is
not always as clear cut as descriptions might suggest; it also shows the value of
revisiting ideas (or in this case collaborating by combining ideas). So while the “Māori
research assistants” note that the reception chooses to book their taxi through a
small, Māori-operated company, “Meredith” hypothesizes about a possible reason
for the choice made, that is to say, stated policy or an identity move.

Figure 2: Handwritten observation notes made on site by a team of researchers working
together. Notes include details of equipment set up, ethical procedures,
observations and hypotheses, as well as interaction between team members to
support or reject interpretations.

4.3. Interviews
Interviewing participants is a much more direct source of information than the
observations and reflections noted above, and perhaps also a more familiar data
collection technique for most researchers. While quantitative (and positivist)
researchers might prefer a more structured interview to gather the same informa-
tion from a number of participants (Johnstone 2007), those working with ethno-
graphic methods often use semi-structured interviews which allow for more open
lines of questioning and flexibility in the direction of the interview based on par-
ticipants’ responses and interests. The interviewers typically have a set of pre-pre-
pared questions to draw on as necessary or a checklist of topics to be covered
which is “designed to provide room for the exploration of emergent topics and for
follow-up questioning” (Adato 2008: 226).
The researcher’s interviewing techniques play a prominent role in the ethno-
graphic interview. For instance, an important factor is whether the interviewer asks
the questions as a community member or as an outsider. Are the questions biased
or misleading? Are they pushing the interviewee towards a preconceived choice?
More often than not, interviews in ethnographic studies serve the purpose of val-
idating the researcher’s interpretations or act as a way of gaining a deeper insight
into aspects of the social phenomena observed and for which they need more infor-
mation. Hammersley and Atkinson (1996: 151) comment that “all interviews, like
any other kind of social interaction, are structured by both researcher and inform-
ant”. This reflects a constructionist perspective on all interaction and, like other
aspects of ethnographic research, emphasizes the complexity and sophistication of
meaning making in interaction.
Interviews may be conducted with individuals or groups. The group interview
(and the related but distinct method of focus groups) has typically been used to
gather macro-group perceptions by workplace discourse researchers interested in
cross-cultural interaction. For example, Kingsley (2009) conducted focus groups
with employees in a range of banks in Luxembourg, a country known as an impor-
tant location for international financial institutions. Security and confidentiality
issues prevented her from using participant observations and recordings of natural-
ly-occurring talk to understand how multiple languages were used by the employ-
ees and how their use was interpreted by colleagues. The groups who discussed
this issue came up with norms about their practices which she was then able to
compare with surveys and with the discussions which took place with other groups
to determine what the use of various languages (other than the prescribed work-
ing language) seemed to signify, namely, solidarity, customer focus and linguistic
proficiency.
Murata (2011) also ran focus groups and group interviews to explore the use
of small talk in New Zealand and Japanese corporations, supplementing the audio
and video recordings she was able to capture. In her case the focus was politeness
and appropriacy through the lens of relational practice. She played samples of
interactions from New Zealand meetings to groups of Japanese business people
and asked them to rate the extracts on a number of relevant pragmatic scales.
For confidentiality reasons, the extracts were rerecorded by actors, but they were
produced to retain as many of the interactional features of the naturally-occurring
originals as possible. The groups had strong and replicated impressions about the
role of silence, the role of laughter and the importance of formality which were
very different from those of the New Zealand participants. Murata combined audio and video
recordings with observations and interviews to gain as rich a picture as possible. As
noted earlier, those using ethnographic approaches typically make use of multiple
methods to facilitate thorough understanding of the community and their pragmatic
practices.
The value of ethnographic methods is the attention to detail. As will be clear, an
ethnographic approach has the potential to generate an enormous amount of useful
data. Managing the data and identifying what is most relevant and useable can be
a daunting task for the experienced and inexperienced analyst alike.

Figure 3: Illustration of data coding. The left column has timestamps to indicate place
in the recording and descriptions of the interaction. The right hand boxes are
more processed themes that join together related pieces of data.

5. Working with ethnographic data

The volume of data collected via the methods described above necessitates sys-
tematic means of processing and categorising, from labelling recordings, to the
thorny issue of choosing a transcription system (see Ochs 1979), as well as making
connections across multiple data sources. Coding and finding themes helps sort
data in ways which begin to reveal and provide evidence for the analyst’s insights.
Figure 3 is an example of this coding.
Coding may be undertaken by hand (for example, with highlighting pens), in
standard software such as MS Word or spreadsheets such as Excel, or within specially
designed programmes such as NVivo. These dedicated programmes are designed to
combine and link transcripts, coding and notes, and most researchers
working with ethnographic data will make use of the affordances of such software.
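For readers who keep their coding in plain files rather than in a dedicated package, the logic of linking coded segments to broader themes can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a description of how NVivo or any other package works: the `Segment` structure, the `group_by_theme` helper, the timestamps, the code labels and the theme names are all hypothetical and were invented for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One coded stretch of a recording (hypothetical structure)."""
    timestamp: str                 # place in the recording, e.g. "00:01:10"
    description: str               # analyst's description of the interaction
    codes: set = field(default_factory=set)  # low-level codes applied by the analyst


def group_by_theme(segments, code_to_theme):
    """Collect segments under the broader theme each of their codes belongs to."""
    themes = defaultdict(list)
    for seg in segments:
        for code in sorted(seg.codes):
            theme = code_to_theme.get(code, "uncategorised")
            if seg not in themes[theme]:   # avoid listing a segment twice per theme
                themes[theme].append(seg)
    return dict(themes)


# Illustrative entries only: timestamps and code labels are invented.
segments = [
    Segment("00:01:10", "Teasing about the camera angle", {"teasing"}),
    Segment("00:01:25", "Colleague addressed as 'brother'", {"kinship term", "teasing"}),
    Segment("00:01:40", "Quiet swearing followed by laughter", {"swearing"}),
]

# Hypothetical mapping from codes to more processed themes.
code_to_theme = {
    "teasing": "informal and casual tone",
    "swearing": "informal and casual tone",
    "kinship term": "cultural identity",
}

themes = group_by_theme(segments, code_to_theme)
for theme, segs in themes.items():
    print(theme, [s.timestamp for s in segs])
```

A segment can carry several codes and therefore appear under more than one theme, which mirrors the way related pieces of data are joined together during coding.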
In pragmatics, our focus is typically micro-level detail which goes beyond the
themes identified via coding. Below we include two illustrations of the way in
which ethnographic data can be used to support the interpretation of interactional
data, namely data collected (1) as indirect information gathered through observa-
tions and (2) by asking direct questions of the participants as key informants on
their own interaction and meaning.

5.1. Gathering of indirect information

During data collection we keep notes about our impressions, without knowing
which information is going to be useful to us later. Below we provide an example,
again from the Māori/Pākehā research project on effective and successful leader-
ship conducted by Holmes, Marra and Vine (2011). The recording which is repre-
sented in the transcript was captured at the meeting being recorded at the time that
the observation notes reproduced in Figure 2 were made. In the extract, the meeting
participants make comments on the recording process and react to the video cam-
eras which had been placed in the corner of their meeting room for the first time.
(1) Being recorded
Context: First recorded meeting at Kiwi Consultations, a Māori organisation. This is
a company-wide meeting, where Hari and Mere are management team members and
Maureen is the Executive Assistant to the company’s Chief Executive Officer (CEO).4
1. Hari: they don’t want to see the back of your head
2. Mere: well they’re gonna see the back of my head ++
3. not gonna see my face ++
4. Hari: (spose) + they’re getting your best side
5. Mere: that’s right
6. Hari: your back side
7. Mere: (you’re lucky you’re over there brother)
[a couple of quiet comments – not transcribed]
8. Mere: who’s gonna be filming
9. Hari: oh john campbell’s here from t v three news [laughter]
10. Mere: (funny boy aren’t you)
11. Maureen: she’s gonna slap you [laughs] [laughter]
12. Mere: [quietly]: bloody shit: [laughter]
13. Jen: I hope that’s on camera

4 Transcription conventions
+ untimed pause of less than one second
(unclear) unclear utterance/transcriber’s best guess at an unclear utterance
[info]: : editorial and paralinguistic information. Colons indicate start and end.
… section of transcript omitted
All names are pseudonyms.

Throughout the extract there is evidence of antagonistic (from an outside perspective)
but good natured ribbing of various participants: Hari teases that the back of
Mere’s head is more attractive than her face ([her] best side), and her response is
a mock-threat of violence (line 7), a strategy which is endorsed through repetition
by Maureen later in the interaction (line 11) and is met with laughter from the
team, suggesting that this behaviour is humorous and welcome. When Mere wants
to know who is responsible for the recording, her colleagues (who the research
team had met the day before the meeting as indicated in the notes in Figure 2)
suggest a popular national news channel rather than giving her the real answer. Her
understanding of the team norms and the laughter means she knows not to believe
them.
While there is much more richness than this brief explanation allows, we high-
light in particular the ethnographic information which helps support this interpreta-
tion. Naturally-occurring recordings were made in this organisation over a number
of months, and the teasing and swearing were regular features of the interaction.
In conversation with the CEO we learnt that he actively aimed to make the workplace
casual (also commenting that sometimes things became too casual). The use
of the term brother to refer to a teammate indexes both informal conversation and
the Māori ethnicity with which this organisation aligns, officially and
in a number of its everyday practices. In the observation notes in Figure 2 (above)
we comment even in the earliest stages of data collection about the strong cultural
identity within the workplace and the “sleepiness” (later refined to “informal and
casual tone” in contrast to the more corporate feel of a similar organisation with a
different ethnic alignment).
This team closely aligns with a minority group within New Zealand. A majority
group understanding of the practices might suggest that the team is off topic, not
interested in business issues, playing around rather than working hard, violent, lack-
ing cohesion and/or unproductive. The information gathered during data collection
instead suggests they are tight-knit, productive, collaborative and successful.

5.2. Information from key participants

The next example demonstrates the value of information collected during inter-
views with key participants. In the recording, a group of nurses and their manager
are discussing the practice of carrying the ward’s portable phones while doing their
medication rounds. Lisa, a nurse coordinator, has explained that in her view this
practice is inappropriate and rude. Mandy supports Lisa’s opinion on this matter.
(2) Releasing time to care5
Context: Regular monthly meeting for nurses in a public hospital.
1. Mandy: [talking to Lisa]: you’re right:
2. I mean I wouldn’t answer the phone
3. if it was in my pocket …
4. I mean if you’re toileting a patient
5. the last thing you do is go
hello [laughs] …
7. this is all about releasing time [louder]: to care: …
8. so you’re not going to be disturbing your care with
9. with phone calls

5 This example also appears in Lazzaro-Salazar (2016), where the focus is on the
realization of the nursing culture through discourse.

In support of Lisa’s arguments for not answering the phone when with patients
(lines 1–2), Mandy puts herself in Lisa’s shoes (line 3), and gives an example of
how answering the phone would jeopardise their nursing practice (lines 4–6). Mandy
further supports her stance when she emphasises that this is all about “releasing time
to care” in line 7, and reinforces this idea when she explains that phone calls “dis-
turb” their care in line 8, summarizing the main point of the argument.
At first sight the analysis of Mandy’s reflection seems to be very straightfor-
ward and simple, at least at the content level. However, the ethnographic data
collected later in the study revealed that her words in line 7 provide added depth
to the reflection of whether nurses should answer the phone when doing their
rounds. For a lay audience, which included the researcher who is not a health-
care worker herself, the phrase “releasing time to care” seems self-explanatory,
pointing at prioritising caring duties over other tasks in the ward. Though this is
partly true, during an interview with the charge nurse manager of the ward some
months after this meeting was recorded, she explained that the phrase “releasing
time to care” actually embodied a trust-wide nursing programme with the full title
“Releasing Time to Care – the Productive Ward”. This programme is part of the
healthcare reform undertaken in the UK in 2007 with the aim of increasing staff
and patient satisfaction by releasing nurse time from “wasteful” activity (Wilson
2009) while focusing on “improving ward processes and environments to help
nurses and therapists spend more time on patient care thereby improving safety
and efficiency” (National Health Service, 2006–2013). In light of this information,
Mandy’s reflection may work to remind the other nurses present at the meeting
not only that the strongest professional commitment they have is to provide good
quality care but also that decisions like Lisa’s support the implementation of cur-
rent international nursing programmes which regulate their professional practices.
This allows Mandy to construct her professional identity not only at a local level
of nursing practice (as was first considered by the analyst) but also at a much wider
disciplinary level (as the ethnographic data collected in the interview revealed)
(see further discussion of identity considerations in this extract in Lazzaro-Salazar
2016).
These are just two short examples, which are explored very briefly, but which
demonstrate the added depth of understanding that we are able to gain from the
ethnographic information gathered as part of the data collection.

6. Criticism of ethnographic approaches

It is important to recognise that while the ethnographic approach offers many
advantages to those interested in pragmatic phenomena as described above, there
are also areas of potential disadvantage which need to be acknowledged.

6.1. Objectivity and subjectivity

One of the frequently discussed “critiques” of ethnographic studies is that objectivity
is never attained because the researcher’s bias is likely to influence their interpreta-
tions of the phenomena studied. Those using ethnographic approaches recognise this
subjective nature. Employing multiple methods of data collection and analysis can
be seen as one attempt to minimise possible shortcomings. But in other ways ethno-
graphic researchers embrace the subjectivity, emphasising engagement as opposed
to the detachment which is often the perceived goal of “objective” research. Taking
an ethnographic approach means immersing yourself in your community as much
as you can in order to provide detailed, “thick” descriptions of community activities.
We would argue that the complexity of understanding required for interpretation
can only be gained through these iterative processes, that is to say, the “sifting
and sorting through pieces of data to detect and interpret thematic categorisations,
search for inconsistencies and contradictions, and generate conclusions about what
is happening and why” (Thorne 2000: 69). The result is richer understandings of
social phenomena and appropriate warrants to enhance the validity of our findings.
Ethnographic researchers actively prioritise depth of understanding over breadth.

6.2. Time commitment

As noted above, this depth of understanding takes considerable time. Those who
enter communities as outsiders need time to discover the social categories that
matter and how they structure the ways that people use language. Detractors would
question if the result is really worth the time invested. A counter argument from
ethnographic researchers is that this time allows for initial misunderstandings to be
reconciled and for blinkered views to be overcome: “Field researchers must have
ample opportunity to develop relationships, to establish an identity (membership
status), and to acquire an appreciation for cultural norms and the interactive and
cognitive style and abilities of the individual participants” (Rizzo, Corsaro and
Bates 1992: 105). We must also be aware that our first impressions and initial
understandings are not always the best. Immersion in communities provides a path-
way to allow us to develop trust and also to test for misinformation or misunder-
standings. We must always recognise that there are no “facts” or absolute “truth”
in interpretive analysis, but rather we knowingly provide our best understanding
based on the data we have available.

6.3. Finding universals

Aiming for depth of understanding means giving up the breadth which is achiev-
able in other approaches. This breadth typically serves the purpose of affording
comparisons and identifying generalizable patterns. This goal aligns with a posi-
tivist approach which accepts fixed, essentialist groupings and stable entities, but is
far less appealing for those who question the bluntness of these categories. Never-
theless, some ethnographic researchers have taken on the challenge to form gener-
alisations based on the outcomes of studies which address issues of a similar nature
in different contexts. The validity of this generalisability has created a divide.
On the one hand, some ethnographers and ethnographic researchers question
the extent to which findings of an ethnographic study can be generalized when the
assumptions and interpretations of events and discourses are inevitably linked to
the social context in which they occur (de Volo and Schatz 2004). Advocates of
this view explain there is an issue of “sameness” in ethnographic research since
we cannot assume that social phenomena are perceived in the same way by all
participants and by different individuals across communities. Moreover, along
with cultural variation within and across communities, “ethnographers also have
to assume cultural change [as] no living culture ever stands still, and the forms and
processes of cultural invention are constantly in flux […]. Pragmatic strategies and
their meanings thus are subject to these shifting cultural tides, and studies must
acknowledge that findings may be true for a certain group today, but not necessar-
ily tomorrow” (Davis and Henze 1998: 403). Adopting this perspective, Ramana-
than and Atkinson (1999) argue that ethnographic research can only understand
the particular (that is to say, “particularizability”) in a given spatial and temporal
context.
The scholars who stand at the opposite end of this divide believe that ethnogra-
phy is by its very nature comparative; although groups and cultures cannot be com-
pared in every single detail, they can be compared at a broader level. As de Volo and
Schatz (2004: 270) explain, “ethnography is readily employed to test hypotheses
to determine whether and how well general theory applies to a specific case”. This
means that “rather than comparing pre-determined pragmatic categories across lan-
guages” the ethnographic approach is aiming “to determine culture-specific catego-
ries” which can then be compared across language communities (Davis and Henze
1998: 403). Nevertheless, providing a thick description of the community under
investigation is vital in helping to determine whether transfer is possible.
Elaborating on this notion, the ethnographic approach has important contri-
butions to make to our theorising in pragmatics through analytic generalisability.
Here our investigations of communities can support the development of dynamic
theoretical models by finding categories and dimensions which are relevant across
cultural groups and which demonstrate “the multiple possible outcomes or rela-
tionships that exist among factors” (Duff 2006: 50). Thus rather than searching for
abstract statements of law, we could be emphasising “naturalistic generalisation”
(O’Reilly 2009) to guide real world actions with more subtlety and flexibility.
In sum, it seems that the issues which are declared weaknesses by others (sub-
jectivity, the time commitment and lack of generalisability) are some of the core
strengths for those committed to ethnographic approaches. As ethnographic prac-
tices will remain in the field for some time, it is worth considering where these
methods might take us in the pragmatics research of the future.

7. Future directions

7.1. New kinds of communities

As noted above, those taking ethnographic approaches in pragmatics are already
recognising the changing nature of our understandings of “community” and “cul-
ture”. This also signifies a shift from a focus on a single speech act, marker or
perhaps style of discourse, to an approach where community and culture are seen
as the primary focus of analysis. Here the contribution to pragmatics must surely be
uncovering the meaning system that underlies the pragmatic features. Recognising
that the community has a major impact on the way in which meaning is negotiated
between participants places the community in a more central position than in earlier
investigations in pragmatics. We should therefore expect more research which
considers different conceptualisations of community, also reflecting changes to the
ways in which we organise ourselves in society. While online communities have al-
ready begun to feature, there are surely many more digital and virtual communities
which deserve attention. And social networks (in the more traditional sense) deserve
greater attention as the smaller sub-cultures with which we identify in our ongoing
negotiations of self. The notions of community and culture are in flux and as new
understandings emerge, so too will our understandings of related pragmatic features.

7.2. Researching “with” rather than “on” participants

In terms of the data collection methods to be applied, there needs to be greater
awareness of our relationships with participants. We anticipate increasing attention
to the ethics of our research methods. Our own research practices prioritise researching
“with” rather than researching “on” participants (see Cameron et al. 1992), that is,
involving the participants in research design decisions, as well as data collection and
interpretation processes (see also Roberts 2003; Sarangi 2006), including providing
regular feedback. Doing ethnographic research in this way involves the development
of an ongoing social relationship between the researcher(s) and the participants.
The result is a different understanding of the role of the researcher. Most recent
ethnographic research approaching cultural phenomena in the workplace adopts a
post-modern and socio-constructionist perspective. Within this paradigm we recognise
our researcher bias and acknowledge our position within the studied community
(see Denscombe 2014). As Johnstone (2007: 112–113) explains, “those taking
a postmodern or post postmodern approach will accept that they are themselves
very much a part of the social world they are studying, that it is therefore futile
to try to eliminate the effects of themselves as researchers, and that reflexivity is
the process through which they will seek to understand these effects”. Rather than
reacting and providing counter arguments to positivists and those who search for
objective truth, it is time for pragmatic researchers using ethnographic methods to
embrace the subjective nature of our work. Our methods reward us with complex-
ity and sophistication and also afford the ethical advantage of involving partici-
pants as co-researchers in the process (see Angouri 2012).
At the start of this chapter we commented that we needed ethnographic
approaches to give us access to the “messiness” of real life interaction. Harnessing
this mess through new conceptualisations of communities as well as by working
with our participants as collaborators in our quest to discover their realities offers
us important opportunities for new and more sophisticated understandings of how
meaning is negotiated in everyday practices.

References

Adato, Michelle
2008 Combining survey and ethnographic methods to improve evaluation of con-
ditional cash transfer programs. International Journal of Multiple Research
Approaches 2(2): 222–236.
Angouri, Jo
2012 Managing disagreement in problem solving meeting talk. Journal of Pragmat-
ics 44(12): 1565–1579.
Anderson, Benedict
1983 Imagined Communities. Reflections on the Origin and Spread of Nationalism.
London: Verso.
Baxter, Judith A
2009 The Language of Female Leadership. Basingstoke: Palgrave Macmillan.
Bonner, Ann and Gerda Tolhurst
2002 Insider-outsider perspectives of participant observation. Nurse Researcher 9(4):
7–19.
Boxer, Diana
2003 Critical issues in developmental pragmatics. In: Alicia Martínez Flor, Esther
Usó Juan and Ana Fernández Guerra (eds.), Pragmatic Competence in Foreign
Language Teaching, 45–67. Castelló: Publicacions Universitat Jaume I.
Boyle, Joyceen
1994 Styles of ethnography. In: Janice Morse (ed.), Critical Issues in Qualitative
Research Methods, 159–185. London: Sage.
Brown, Penelope and Stephen Levinson
1978 Universals in language usage: Politeness phenomena. In: Esther Goody (ed.),
Questions and Politeness. Strategies in Social Interaction, 56–311. Cam-
bridge: Cambridge University Press.
Brown, Penelope and Stephen Levinson
1987 Politeness. Some Universals in Language Usage. Cambridge: Cambridge Uni-
versity Press.
Burgess, Robert
1982 Field Research. A Sourcebook and Field Manual. London: Allen and Unwin.
Cameron, Deborah, Elizabeth Frazer, Penelope Harvey, Ben Rampton and Kay Richardson
(eds.)
1992 Researching Language. Issues of Power and Method. London: Routledge.
Chomsky, Noam
1965 Aspects of the Theory of Syntax. Cambridge: MIT Press.
Davis, Kathryn and Rosemary Henze
1998 Applying ethnographic perspectives to issues in cross-cultural pragmat-
ics. Journal of Pragmatics 30(4): 399–419.
Denscombe, Martyn
2014 The Good Research Guide. For Small-scale Social Research Projects, 5th ed.
England: McGraw-Hill Education.
de Volo, Lorraine and Edward Schatz
2004 From the inside out: Ethnographic methods in political research. PS: Political
Science & Politics 37(2): 267–272.
Duff, Patricia
2006 Beyond generalizability: Contextualization, complexity, and credibility in
Applied Linguistics. In: Micheline Chalhoub-Deville, Carol Chapelle and
Patricia Duff (eds.), Inference and Generalizability in Applied Linguistics,
65–95. Amsterdam: John Benjamins.
Eckert, Penelope and Sally McConnell-Ginet
1992 Think practically and look locally: Language and gender as community-based
practice. Annual Review of Anthropology 21: 461–490.
Eelen, Gino
2001 A Critique of Politeness Theories. Manchester: St. Jerome Publishing.
Emerson, Robert, Rachel Fretz and Linda Shaw
2011 Writing Ethnographic Fieldnotes. Chicago: University of Chicago Press.
Fletcher, Jeannie
2011 The role of discourse in establishing and enabling context for organizational
knowledge creation: An ethnographic study. Unpublished PhD thesis, Victoria
University of Wellington.
Geertz, Clifford
1973 Thick description: Toward an interpretive theory of culture. In: Clifford Geertz
(ed.), The Interpretation of Cultures, 3–30. New York: Basic Books.
Graham, Sage
2008 A manual for (im)politeness?: The impact of the FAQ in e-community forma-
tion and socialization. In: Derek Bousfield and Miriam Locher (eds.), Impo-
liteness in Language, 281–304. Berlin: Mouton de Gruyter.
Griffin, Christine and Andrew Bengry-Howell
2008 Ethnography. In: Carla Willig and Wendy Stainton-Rogers (eds.), The Sage
Handbook of Qualitative Research in Psychology, 15–31. London: Sage.
Guba, Egon and Yvonna Lincoln
1989 Fourth Generation Evaluation. Newbury Park: Sage.
Hammersley, Martyn
1998 Reading Ethnographic Research. London: Longman.
Hammersley, Martyn and Paul Atkinson
1996 Ethnography. Principles in Practice. Oxon: Routledge.
Haugh, Michael
2012 Epilogue: The first-second order distinction in face and politeness research.
Journal of Politeness Research 8(1): 111–134.
Holmes, Janet
2014 Doing discourse analysis in sociolinguistics. In: Janet Holmes and Kirk Hazen
(eds.), Research Methods in Sociolinguistics. A Practical Guide, 177–193.
West Sussex: Wiley-Blackwell.
Holmes, Janet and Meredith Marra
2011 Relativity rules: Politic talk in ethnicised workplaces. In: Bethan Davies,
Michael Haugh and Andrew Merrison (eds.), Situated Politeness, 27–52. Lon-
don: Continuum.
Holmes, Janet, Meredith Marra and Bernadette Vine
2011 Leadership, Discourse, and Ethnicity. Oxford: Oxford University Press.
Holmes, Janet, Angela Joe, Meredith Marra, Jonathan Newton, Nicky Riddiford and Ber-
nadette Vine
2011 Applying linguistic research to real world problems: The social meaning of talk
in workplace interaction. In: Christopher Candlin and Srikant Sarangi (eds.),
Handbook of Communication in Organisations and Professions. Handbooks of
Applied Linguistics [HAL] (No. 3), 533–550. Berlin: De Gruyter Mouton.
Hymes, Dell
1974 Foundations in Sociolinguistics. An Ethnographic Approach. Philadelphia:
University of Pennsylvania Press.
Johnstone, Bruce
2007 Ethnographic methods in entrepreneurship research. In: Helle Neergaard and
John P. Ulhøi (eds.), Handbook of Qualitative Research Methods in Entrepre-
neurship, 97–121. Cheltenham: Edward Elgar.
Kidner, Keely
2015 Beyond Greenwash. Environmental discourses of appropriation and resist-
ance. Unpublished PhD thesis, Victoria University of Wellington.
King, Brian
Forthcoming Communities of Practice in Language in the workplace research. To
appear in: Bernadette Vine (ed.), Routledge Handbook of Language in the
Workplace. Abingdon: Routledge.
Kingsley, Leilarna
2009 Explicit and implicit dimensions of language policy in multilingual banks in
Luxembourg: An analysis of top-down and bottom-up pressures on practices.
Language Problems & Language Planning 33(2): 153–173.
Labov, William
1972 Language in the Inner City. Philadelphia: University of Pennsylvania Press.
Lazzaro-Salazar, Mariana
2013 Investigating nurses’ professional identity construction in two health settings
in New Zealand. Unpublished PhD thesis, Victoria University of Wellington.
Lazzaro-Salazar, Mariana
2016 Downscaling culture in intercultural communication: The case of nurses’ pro-
fessional values in New Zealand. In: Dorottya Cserző, Argyro Kantara and
Jaspal Singh (eds.), The Journey is its Own Reward. Downscaling Culture
in Intercultural Communication Research, 114–140. Cambridge: Cambridge
University Press.
Locher, Miriam
2006 Advice Online. Advice-giving in an American Internet Health Column, Volume
149. Amsterdam: John Benjamins.
Locher, Miriam and Richard Watts
2005 Politeness theory and relational work. Journal of Politeness Research. Lan-
guage, Behaviour, Culture 1(1): 9–33.
Marra, Meredith
2012 Disagreeing without being disagreeable: Negotiating workplace communities
as an outsider. Journal of Pragmatics 44(12): 1580–1590.
McKinley Brayboy, Bryan and Donna Deyhle
2000 Insider-outsider: Researchers in American Indian communities. Theory into
Practice 39(3): 163–169.
Mullany, Louise
2007 Gendered Discourse in the Professional Workplace. Basingstoke, NY: Pal-
grave Macmillan.
Murata, Kazuyo
2011 A contrastive study of the discourse of business meetings in New Zealand and
in Japan. Unpublished PhD thesis, Victoria University of Wellington.
National Health Service, England, Institute for Innovation and Improvement
2006–2013 The productive ward: Releasing time to care. Available (23/11/2016) at:
http://www.institute.nhs.uk/quality_and_value/productivity_series/produc-
tive_ward.html.
Norton, Bonny
2001 Non-participation, imagined communities and the language classroom. Learner
Contributions to Language Learning: New Directions in Research 6(2): 159–
171.
Ochs, Elinor
1979 Transcription as theory. In: Elinor Ochs and Bambi B. Schieffelin (eds.),
Developmental Pragmatics, 43–73. New York: Academic Press.
O’Reilly, Karen
2009 Key Concepts in Ethnography. Los Angeles: Sage.
Plester, Barbara
2015 The Complexity of Workplace Humour. Laughter, Jokers and the Dark Side of
Humour. London/New York: Springer.
Ramanathan, Vai and Dwight Atkinson
1999 Ethnographic approaches and methods in L2 writing research: A critical guide
and review. Applied Linguistics 20(1): 44–70.
Rizzo, Thomas, William Corsaro and John Bates
1992 Ethnographic methods and interpretive analysis: Expanding the methodologi-
cal options of psychologists. Developmental Review 12(2): 101–123.
Roberts, Celia
2003 Applied linguistics applied. In: Srikant Sarangi and Theo van Leeuwen (eds.),
Applied Linguistics and Communities of Practice, 132–149. London: Contin-
uum.
Rosaldo, Renato
1989 Culture and Truth. The Remaking of Social Analysis. Boston: Beacon.
Sarangi, Srikant
2006 The conditions and consequences of professional discourse studies. In: Rich-
ard Kiely, Pauline Rea-Dickins, Helen Woodfield and Gerald Clibbon (eds.),
Language, Culture and Identity in Applied Linguistics, 199–220. London:
Equinox.
Schnurr, Stephanie, Meredith Marra and Janet Holmes
2007 Being (im)polite in New Zealand workplaces: Māori and Pākehā leaders. Jour-
nal of Pragmatics 39(4): 712–729.
Schnurr, Stephanie and Angela Chan
2009 Leadership discourse and politeness at work: A cross cultural case study of
New Zealand and Hong Kong. Journal of Politeness Research 5(2): 131–157.
Shanmuganathan, Thilagavathi
2005 Ethics and the Observer’s Paradox. RELC Journal 36(1): 73–84.
Silver, Christina and Ann Lewins
2014 Using Software in Qualitative Research. A Step-by-step Guide. Los Angeles:
Sage.
Thorne, Sally
2000 Data analysis in qualitative research. Evidence Based Nursing 3(3): 68–70.
Wenger, Etienne
1998 Communities of Practice. Learning, Meaning, and Identity. Cambridge: Cam-
bridge University Press.
Wilson, Gwyneth
2009 Implementation of releasing time to care – the productive ward. Journal of
Nursing Management 17(5): 647–654.
Zhu, Yunxia and Francesca Bargiela-Chiappini
2013 Balancing emic and etic: Situated learning and ethnography of communication
in cross-cultural management education. Academy of Management Learning
& Education 12(3): 380–395.
15. Ethnomethodology and conversation analysis
Andrea Golato and Peter Golato

Abstract: In this chapter we provide an overview of ethnomethodology (EM) and
conversation analysis (CA). We first provide a historical backdrop of sociology
within which EM emerged. This is followed by a brief discussion of EM’s meth-
odological underpinnings. We then describe how CA developed out of EM together
with CA’s main tenets and provide an example of how to conduct a CA analysis.
Since interactional linguists have adopted CA methodology, the chapter also pro-
vides a brief introduction to this particular approach to the study of language. Next,
we discuss the data typically used in CA and interactional linguistics (IL), after
which we discuss the research topics in both fields, the advantages and disadvan-
tages of the different methodologies, and their applications. Lastly, we provide a
brief outlook on the directions that the field might take in the future.

1. Historical backdrop

In Europe, Émile Durkheim established the first academic department of sociology
at the University of Bordeaux, France, in 1895. Durkheim’s work and that of his
students was largely concerned with understanding social facts, which were under-
stood to be societal norms and constraints upon individual behavior that existed
external to and independently of individual societal members. As a consequence of
their theorized nature, Durkheim and others working in his tradition held that social
facts were amenable to quantitative study (see e. g. Durkheim 1897). In the US,
beginning with the establishment of the first department of sociology at the Uni-
versity of Chicago in 1892, sociological research was influenced primarily by the
work of George Herbert Mead and his followers, most notably Herbert Blumer, and
by the development of symbolic interactionism. This was a perspective according
to which societal members’ understandings of given events, actions, or behaviors
are necessarily subjective, i. e. are invariably a product of members’ interpretations
of events, actions, or behaviors rather than of any objective truth about them, and
according to which members construct the societies in which they live (see e. g.
Blumer 1969). During the 1930s, Talcott Parsons also adopted this perspective, and
additionally developed two influential notions. The first, structural functionalism,
held that society’s order and overall structure is a function of how its individual
institutions interact with each other. The second notion, action theory, held that
social actions are the product of goal-seeking individuals who operate according
to internal and external constraints including their own evaluations of how a goal
might best be achieved, and any goal- or goal-seeking-related societal norms and
values (see e. g. Parsons 1937).

https://doi.org/10.1515/9783110424928-015
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 367–394. Berlin/
Boston: De Gruyter Mouton.

Amidst this theoretical backdrop, with its primary interest in the study of mac-
ro-level societal institutions, constraints, norms, and values within which members
operate, the work of both Erving Goffman and Harold Garfinkel gradually led to
an additional interest in more closely studying the in situ interactions of societal
members from the perspective of members themselves. For instance, Goffman’s
concept of face, that is, the positive self-image that societal members try to project
to and maintain with others, and his dramaturgical analysis, according to which
members actively seek to manage their everyday interactions as a function of par-
ticular participants and settings, were both part of his view that interaction was
itself a social institution with its own structure and orderliness, which he termed
its interactional order. Since there is orderliness and structure in interaction and
since interaction underlies all other social institutions, it follows that interaction is
a necessary object of inquiry in its own right (see e. g. Goffman 1959).
Similar to Goffman, Garfinkel also focused on the study of social interaction
as a means of understanding social order. However, while Goffman’s work was
concerned with how members seek to represent themselves and to understand oth-
ers’ self-presentations in the course of social interaction, Garfinkel’s work sought
to illuminate the relation between social interaction and the origin and structure of
social order itself.
From written reports of everyday, two-party conversations in which each par-
ticipant had annotated what was said with what they had thought was being talked
about, Garfinkel identified several general properties that seemed to underlie par-
ticipants’ mutual understanding during such conversations. These general, under-
lying properties included an expectation of mutual understanding, an acceptance of
widespread indexicality (i. e. context-dependent reference) and vagueness, and an
awareness that utterance meaning may depend upon what was previously said and
may later change depending upon what will be said. Garfinkel posited that these
properties “… furnish a background of seen but unnoticed features of common
discourse whereby actual utterances are recognized as events of common, reason-
able, understandable, plain talk” (Garfinkel 1967: 41). Working from the perspec-
tive that these seen but unnoticed background features would only be visible to
someone who was either a stranger to, or had become estranged from, everyday
social experience, Garfinkel further posited that an absence of these properties in
an interaction would lead to problems which participants would immediately seek
to rectify. One way in which Garfinkel offered support for his claims was through
breaching experiments, i. e. demonstrations during which his students intentionally
violated social conventions of various kinds (e. g. seeking clarifications about state-
ments for which a clarification would normally be neither warranted nor expected,
behaving as if one’s family members were strangers, moving one’s face close to
that of a coparticipant’s during face-to-face interaction, etc.) and then noted the
reactions to the violated conventions. These demonstrations were intended by Garfinkel
to involve deviations from the familiar, seen but unnoticed features of everyday
discourse, and thereby cause the unsuspecting “victims” to experience feelings
of estrangement from ordinary social experiences. The range of responses to the
violated social conventions included shock, confusion, and even anger as interact-
ants tried to make sense of their familiar-yet-unfamiliar social circumstances. As
Garfinkel’s breaching demonstrations revealed, society members seemed to use
everyday, common-sense reasoning in seeking to restore order to and/or otherwise
“make sense” of these intentionally derailed interactions. For Garfinkel, therefore,
the study of social facts and social order amounted to the study of how, through
common-sense reasoning within ordinary conversation, members produce and rec-
ognize everyday social actions. Accordingly, he proposed the term ethnomethodol-
ogy as a way of referring to “the investigation of the rational properties of indexical
expressions and other practical actions as contingent ongoing accomplishments of
organized practices of everyday life” (Garfinkel 1967: 11). Very briefly stated, EM
is the study of members’ methods for collaboratively producing and understanding
recognizable social actions.
Through their respective professional interests in ordinary conversation, both
Goffman’s and Garfinkel’s work helped set the stage for later researchers whose
interests would more closely concern the mechanics of conversation itself. For
instance, in his earliest published work Harvey Sacks (1963) adopted an ethnometh-
odological perspective when proposing that sociologists should not concern them-
selves with noting, clarifying, and/or evaluating members’ descriptions of their
social worlds, but rather should seek to describe the everyday, common-sense ways
in which members had produced them such that they were understood by other
members to be descriptions of social worlds (Sacks 1963: 7). In what appeared to
be an extension of what was then Garfinkel’s EM, part of Sacks’ proposed program
also involved a more rigorous description of the language that members used,
though he did not advocate a particular analytical framework for doing this (Sacks
1963: 4). In another apparent extension of EM as it was then practiced, Sacks ten-
tatively proposed the criterion of “recognition” for determining whether a given
utterance constituted an instance of a description of a social world (Sacks 1963: 4).
The essence of both of these early ideas of Sacks’, together with his observation
that there appears to be “order at all points” within any level of social order (Sacks
1992: 594), would undergo subsequent development and figure prominently in his
later work and in the work of others within what would become CA. While EM had
been broadly concerned with members’ common-sense reasoning as manifested
through talk in the production and recognition of social actions, early conversation
analytic studies focused more on identifying the underlying order of talk itself (see
e. g. Schegloff and Sacks 1973; Sacks, Schegloff, and Jefferson 1974).
CA constitutes an empirical, qualitative approach to the study of talk-in-inter-
action. Specifically, CA research views talk-in-interaction as the primordial site of
social action (Schegloff 1996: 468). The goal is to discover practices of talk, i. e.
particular turn and sequence designs, and the social actions which these practices
are used to accomplish. It is one of the insights of CA that much of what we do
in our everyday life is accomplished through talk (Drew and Heritage 2006), be it
raising children, conducting work, making friends, fighting with relatives, teaching
and learning, etc. In other words, when we use language, we are not just transmit-
ting or exchanging information, but we are simultaneously accomplishing other
actions, referring to people, objects and thoughts, negotiating our social relation-
ships and identities, etc. Moreover, we do this in an orderly fashion and in concert
with the actions, stances, and beliefs of our conversational partner(s). It is through
the back and forth of interaction that we organize our actions and make sense of
the world around us. Additionally, CA holds that because interactants must (and
can) make sense of the utterances of their coparticipants on a moment-by-moment
basis and then incorporate that understanding into their own utterances, researchers
can use these very same utterances and behaviors as resources in the analysis of
social action.
Thus, in line with the ethnomethodological roots of CA, researchers approach
their data from a member’s perspective. In other words, they never look at an indi-
vidual utterance in isolation but instead determine how members of the conversa-
tion orient to the utterance. The underlying assumption is that “… no empirically
occurring utterance ever occurs outside, or external to, some specific sequence”
(Heritage and Atkinson 1984: 6). For this reason, utterances are always analyzed
as actions which are placed in specific sequential contexts (Schegloff 1988, 2007).
Likewise, context is not viewed as an independent entity which influences partic-
ipants, but is instead considered to be locally managed and co-produced in situ by
the participants of the interaction (Auer and Di Luzio 1992; Duranti and Goodwin
1992; Schegloff 1992). The mechanism and advantages of this approach can be
demonstrated through the following example sentences uttered by members of a
string quartet during one of their practice sessions:
(1) That was great.
(2) Try not to retard.
(3) Oh, I’m sorry.

Viewed in isolation, utterance (1) is likely to be interpreted as a general assessment
or compliment, utterance (2) as criticism or admonishment, and utterance (3) as an
apology. However, when these utterances are viewed in the sequential context in
which they originally occurred, a slightly different picture emerges:
(4) [Quartet Material, 4/12/94] (Golato 2005: 89, with permission from E. Schegloff)
 1 Mik: okay, (0.6) (hit it)
 2      ((music 10.0))––––––––––
 3 Bob: [That’s the place,
 4      [––––––––
 5 Bob: (Mike)/(now), that’s beautiful sound
 6 Bob: but, (.) try not to retard.=
 7 Mar: =he didn’t.
 8      (0.2)
 9 Bob: °he didn’t?°=
10 Mar: =that was gr(h)(h)eat=
11 Bob: =oh I’m sorry
12      (0.5)

In line 1, Mike gives the others the signal to start playing. After about ten seconds
of music, Bob interrupts the playing by criticizing and correcting Mike in lines
3–6. Before Mike can respond, Marge overtly disagrees with Bob’s statement.
After a short silence, Bob questions and challenges this counter-opinion (line 9).
Marge responds with that was gr(h)(h)eat while looking back and forth between
Mike and Bob (not shown in this transcript). This utterance is one that we have
looked at before in isolation. It can be argued that with this turn, Marge is compli-
menting Mike and building solidarity with him while simultaneously disaffiliating
with Bob. However, by virtue of its position in the conversation, i. e. by being
placed in an environment in which it contradicts the opinion of a coparticipant, the
turn does more than merely compliment: it serves to reproach and criticize Bob.
That this is the case can be demonstrated by the reactions of the coparticipants: in
English, complimenting first pair parts normally generate a responding second pair
part in the form of an acceptance, rejection, deflection, or other reaction from the
compliment recipient (Pomerantz 1978). Note that here, however, Mike does not
respond at all. Instead, Bob responds with an apology (line 11). Mike’s behavior
indicates that the function of the turn in line 10 is not primarily to compliment,
while Bob’s turn in line 11 indicates that he perceives the turn as a reproach or
criticism which results in him apologizing for his prior behavior.
The paragraph above gives a brief illustration of a CA-style sequential anal-
ysis. Sequences, which are CA’s primary unit of analysis, are “courses of actions
implemented through talk” (Schegloff 2007: 3). The underlying idea is that turns-
of-talk that are positioned next to each other have “some organization” between
them (Sacks [1973] 1987: 54). Specifically, interlocutors typically hear a given
turn as directed to a prior turn. Thus, with each utterance, speakers display to their
interlocutors not only that they attended to the prior utterance(s), but also how
they understood it and how they orient to the actions accomplished by that prior
utterance (Schegloff 1984: 37). As a result, each turn is oriented to and shaped by
what was said before. Consequently, prior speakers can use the subsequent talk to
determine if and how their talk was understood. Similarly, an analyst can look at
the talk of a subsequent speaker to see what they perceived the action and meaning
of a prior turn to be. This is a procedure that Hutchby and Wooffitt have labelled the
“next-turn proof procedure” (Hutchby and Wooffitt 2008: 15). It is this next-turn
proof procedure that constitutes the ethnomethodological approach of CA.
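
The logic of the next-turn proof procedure can be made concrete with a small sketch. The Python fragment below is only an illustration and not a tool used in CA: it stores the verbal turns of lines 6 to 11 of excerpt (4) in a hypothetical Turn record and pairs each turn with the one that follows it, which is where the analyst looks for evidence of how the prior turn was understood. The names Turn, EXCERPT_4 and next_turn_pairs are assumptions introduced here, and the silences and the music are left out.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    line: int      # line number in the transcript of excerpt (4)
    speaker: str
    talk: str


# Verbal turns from lines 6-11 of excerpt (4); silences are omitted in this toy version.
EXCERPT_4 = [
    Turn(6, "Bob", "but, (.) try not to retard.="),
    Turn(7, "Mar", "=he didn’t."),
    Turn(9, "Bob", "°he didn’t?°="),
    Turn(10, "Mar", "=that was gr(h)(h)eat="),
    Turn(11, "Bob", "=oh I’m sorry"),
]


def next_turn_pairs(turns):
    """Pair every turn with the turn that immediately follows it."""
    return list(zip(turns, turns[1:]))


for prior, following in next_turn_pairs(EXCERPT_4):
    print(f"line {prior.line} ({prior.speaker}): {prior.talk}")
    print(f"  responded to in line {following.line} by {following.speaker}: {following.talk}")
```

The pairing merely lays the turns side by side. Deciding what a next turn displays about the prior one, for instance that Bob's apology in line 11 treats line 10 as a reproach, remains the analyst's interpretive work.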
When analyzing a particular action or specific linguistic feature, analysts com-
pile a large collection of single cases and work out the patterns that can be observed
in the collection. As Heritage (1988: 131) observes, “[a]t the core of this task is the
demonstration that these regularities are methodically produced and oriented to by
the participants as normative organizations of action.” Contrary to other method-
ologies, no instance can be treated as an exception in the traditional sense, and be
cast aside and disregarded in the analysis because it does not conform to the rules
that the other examples in the collection have been observed to follow. Instead,
researchers perform a so-called deviant case analysis; that is, they try to show how
participants orient to the behavior as being different from the norm (Heritage 1988)
and provide contextual explanations for the observed difference (for an example,
see Egbert 1996).
IL builds upon the work of CA. As noted above, CA is mostly interested in
describing social actions and the practices involved in accomplishing them. In CA
analyses, linguistic structures are considered to be one of several resources avail-
able to interactants for accomplishing social actions. In IL, however, CA’s figure
and ground are reversed; that is, in IL the focus is on the specific functions of lex-
ical, semantic, syntactic, phonetic, prosodic, and stylistic structures/resources and
the roles that they play in interaction. While Chomskyan or Saussurian linguists
study linguistic structures in isolated and often invented sentences, IL researchers
analyze such structures as they are used in naturally occurring interaction. Put
differently, IL “maintains that linguistic analysis should acknowledge the fact that
language is used in and for particular tasks and purposes of interaction, and that,
as a consequence, linguistic phenomena need to be analyzed with regard to the
conversational actions they are deployed for and the sequences they are embedded
in” (Kern and Selting 2013: 1012). Here then, linguistic phenomena are viewed as
both shaping, and being shaped by, interaction. Moreover, “the linguistic shaping
of an utterance is intertwined with the changing relationships among participants
over interactional time” (Schegloff, Ochs and Thompson 1996: 44). Interactional
linguists either start out with a linguistic structure (e. g. a particular particle or
response token) and investigate its function or meaning in interaction, or alterna-
tively they investigate a particular social action (e. g. a compliment) and analyze
which linguistic features are regularly employed in order to accomplish this action
(Kern and Selting 2013).
In terms of its methodology, IL uses the same sequential analysis as used by
conversation analysts. That is, IL researchers show that the specific linguistic phenomenon
under investigation is oriented to by the participants in the interaction
as having a specific interactional function. Additional evidence comes from other
regularly occurring linguistic elements in the same turn and from comparisons of
the phenomenon under investigation with similar/related phenomena in the same
or in other languages. A prime example of research in the IL tradition is Fox and
Thompson’s (2010) work on responses to wh-questions in American English. They
noted that wh-questions used as specifying questions (i. e. as questions seeking
particular pieces of information that have nothing to do with problems of hearing
or understanding) can receive either phrasal or clausal answers as displayed in
examples (5) and (6) below:
(5): Boise (adapted from Fox and Thompson 2010: 140)
    1 Terry:   (Well) her sister’s paying for it. yes.
    2 Maureen: >Where does< her sister live.
=>  3 Terry:   Boise Idaho.
    4 Maureen: (H)ho ↑ta:lk[awa:::y.
    5 Abbie:               [her sister ne:ver calls.
(6): Game night (adapted from Fox and Thompson 2010: 146)
    1 Terry:   WE’RE JUST TALKING ABOUT HER ARTI:STIC YOUNG SO:N.
    2          (.)
    3 Pam:     Oh that’s ri:(h)ght.
    4          (.)
    5 Maureen: Ye:ah. Whatsuhm (0.3) Whose turn [is it.
    6 Terry:                                    [somebody wants to bu:y
    7          that[:,
=>  8 Abbie:       [It’s yours.
    9          (0.4)
   10 Terry:   Yeah, you and Pa:m, (.) huh [huh
   11 Maureen:                             [O::h. Is it me: dra:wing?

The question in example 5, line 2 receives a phrasal response in line 3, while the
question in example 6, line 5 receives a clausal response in line 8. Fox and Thomp-
son found that the choice of answer format is systematic and is associated with a
specific interactional meaning. Specifically, they show that phrasal responses sim-
ply answer the question that was posed, while clausal answers indicate that either
the question itself or the sequence in which it is placed are problematic. This can
be illustrated using the data examples above. In example (5), Terry is explaining
why their hostess is not cutting an ongoing phone call short. The question in line
2 seeks more information about a person mentioned in line 1. In other words, the
question is topically related to the immediately prior talk. This question receives
a phrasal response that provides the sought-after information. Syntactically, the
answer is built symbiotically on the prior talk and it is produced without delay
and causes no interactional problems. In contrast, in example (6), the question
in line 5 is not topically related to the prior talk at all and thus does not develop
the ongoing sequence further. Fox and Thompson note that questions of this type
regularly receive a clausal response. Sequentially, however, there is more going
on: in the example above, the interactants have gathered to play board games and
to socialize. At the beginning of the transcript, they have temporarily abandoned
the game and are involved in a discussion. The turn in line 5, then, does more than
simply ask a specifying question (the speaker does not just want to know whose
turn it is); the question also serves as a prompt for the other interactants to return
to the game. However, it turns out that the person who asked the question is in fact
the player who is up next. Fox and Thompson argue that the clausal response treats
the question as problematic both because of its non-topical relatedness to the prior
talk, and because it is inapposite (i. e. it goes counter to the default assumption that
those wanting to return to a game know whose turn it is). Further evidence for the
problematic nature of the question lies in the ongoing discussion about whose turn
it is. Fox and Thompson note that systematically, wh-questions receiving a phrasal
response have in fact asked a specifying question, and in each case receive exactly
the sought-after information through the phrasal answer. In contrast, wh-questions
receiving a clausal response typically do more than merely ask a question. More
specifically, the clausal answer and the ensuing talk treat the prior question (and
its associated action) as problematic.
Fox and Thompson reached these findings using the following approach: After
first noticing in their data that wh-questions can either receive a phrasal or a clausal
response, they searched a corpus of approximately 500 minutes of recorded, natu-
rally-occurring conversation for instances of wh-questions. They then conducted a
sequential CA analysis of each instance, noting systematic features that hold across
the selection. They further conducted an analysis of the turn design of both phrasal
and clausal responses. In so doing, they noticed that many of the phrasal responses
were produced immediately and without delay, whereas all clausal responses were
delayed. Moreover, they also noted that phrasal responses were more frequent and
thus the more typical or unproblematic response type. These two observations
further supported their analysis of clausal responses as being a device for dealing
with sequential troubles.
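
The coding step in such a collection-based analysis can be sketched in a few lines of Python. This is a hedged illustration rather than Fox and Thompson's actual procedure: the Case record, its field names and the cross_tabulate helper are assumptions made here, and the collection contains only the two excerpts reproduced above, coded as the text describes them.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    excerpt: str
    question_line: int
    response_line: int
    response_format: str     # "phrasal" or "clausal"
    topically_related: bool  # is the question tied to the immediately prior talk?


# Only the two excerpts shown above, coded as described in the surrounding text.
COLLECTION = [
    Case("(5) Boise", 2, 3, "phrasal", True),
    Case("(6) Game night", 5, 8, "clausal", False),
]


def cross_tabulate(cases):
    """Count cases by (response format, topical relatedness of the question)."""
    return Counter((c.response_format, c.topically_related) for c in cases)


for (fmt, related), count in sorted(cross_tabulate(COLLECTION).items()):
    print(f"{fmt:8} topically related={related}: {count}")
```

A table of this kind only summarizes codes that were assigned by hand after a sequential analysis of each instance; it does not replace that analysis.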
More detailed discussions of the conceptual framework of IL along with early
examples of empirical work in this tradition can be found in Couper-Kuhlen and
Selting (1996), Ford and Wagner (1996), and Ochs et al. (1996).

2. Data

Given that both CA and IL are interested in how speakers use language to create
meaning and/or how certain actions are organized in their natural setting, the data
used for analysis need to correspond as closely as possible to the “naturally occur-
ring interactional environments which seem to be the natural, primordial home
for language use” (Schegloff 1996: 468). Moreover, it is necessary to be able to
repeatedly inspect the same data set. For these reasons, CA and IL researchers
audio- and video-record naturally occurring, non-elicited data in a variety of dif-
ferent settings. In order to clarify what naturally occurring data are, Potter (2002,
2009) suggests what he terms the “(conceptual) dead social scientist’s test: would
the data be the same, or be there at all, if the researcher got run over on the way to
work? An interview would not take place without the researcher there to ask the
questions; a counseling session would take place whether the researcher turns up
to collect the recording or not” (Potter 2002: 541). In addition, researchers try to
collect the data in a form that gives them the same access to all the information
that the interactants had. In other words, if interactants are talking to each other
on the phone, the data are audio-recorded; however, if interactants are engaged in
face-to-face interaction, the data are recorded with at least one camera. For a more
detailed discussion regarding data units and data collection, see chapters 1 and 2
of this volume.
In a next step, CA and IL researchers transcribe the data in great detail, includ-
ing not only the words uttered, but also all hesitations, laughter, in- and out-breaths
and, when relevant to the analysis, embodied behaviors and prosody. In other
words, “no order of detail is dismissed a priori, as disorderly, accidental or irrele-
vant” (Heritage 1984: 241). Most CA and IL researchers employ the transcription
conventions developed by Gail Jefferson, as described in Atkinson and Heritage
(1984: ix-xvi). However, it should be noted that several interactional linguists in
German-speaking countries employ GAT 2 (Selting et al. 2009). Examples (7) and
(8) below show the same stretch of talk transcribed using Jeffersonian transcription
conventions and GAT 2 conventions, respectively.
(7) Jeffersonian Transcription
    1 O: der ludwich schreibt da doch immer so glossen
         you know ludwich always writes stories
    2 O: in der zeitung,=das kenns- kanns dich vielleicht erinnern?
         in the newspaper, you know- perhaps you remember that?
    3 M: [wa–
         [wha–
    4 O: [mainzer anzeiger.
         [((name of newspaper.))
=>  5 M: was fürn lu:dwig.
         what lu:dwig.
    6 O: der joe ludwig da, w[eisste der–]
         the joe ludwig there, y[ou know the]
    7 M:                     [ach so.    ]
                             [i see.     ]
    8 M: der [fasnachter.     ] ja.
         the [mardi gras fool.]
    9 O:     [fasnachter.     ]
             [mardi gras fool.]
   10 O: und da steht heut die woch inner zeitung da
         un there is today this week in the newspaper there

(8) GAT 2 Transcription
    1 O: der ludwig SCHREIbt da doch immer so glossen in der
         zeitung,=
    2    =das kenns–
    3    kanns dich vielleicht erINnern?
    4 M: [wa–
    5 O: [MAInzer anzeiger.
=>  6 M: WAS für_n lu:dwig.
    7 O: der JOE ludwig da,=w[Eisste der– ]
    8 M:                     [achSO.      ]
    9 M: der [FASnachter.] ja.
   10 O:     [fasnachter.]
   11 O: und da steht heut die woch inner zeitung da

In both transcription systems, a monospaced font (typically Courier or Courier New)
is used in order to be able to line up utterances exactly with each other. This is
particularly useful for rendering talk that is produced in overlap, such as lines 3 and
4 in example 7 and lines 4 and 5 in example 8. In both transcription notations, the
beginning and end of the overlap are marked with square brackets [ ]. In both sys-
tems, periods, commas and question marks do not function as punctuation marks
but instead serve to notate intonation: a period indicates falling intonation, while
a comma indicates continuing intonation and a question mark indicates rising (or
questioning) intonation. When speakers cut themselves off, this is indicated with a
hyphen (das kenns- line 2 in both examples), while talk that is latched (i. e. with-
out a beat of silence between utterances) is connected with = (see line 2 of both
transcripts). In both transcription notations, silences are timed: micro silences are
rendered as (.), while longer silences are given in seconds, measured to tenths of a
second (e. g. (0.2); not illustrated in examples 7 and 8). In both transcription systems,
all talk is rendered
in lower case, as upper case is reserved for utterances that are produced with
greater amplitude in the Jeffersonian system, or that mark sentence stress in GAT 2
(note that the Jeffersonian system does not mark regular sentence stress, but only
marks utterances that are unusually stressed or lengthened. This is then marked
by underlining the stressed part of the utterance, see schreibt ‘writes’ in line 1 of
example 7). In both transcription systems, elements that are unusually lengthened
are marked by a colon (see the u in the name Ludwig in line 5 of example 7 and line 6
of example 8). The GAT 2 system
differs from the Jeffersonian system in terms of line numbering conventions. In
the Jeffersonian system, each segment of a turn at talk receives a numbered line in
the transcript. In the GAT 2 system, if a segment cannot be rendered in one line, it
is continued in the next line but does not receive a number (observe how the first
two lines of the Jeffersonian transcript are rendered in the GAT 2 transcript). More-
over, since both primary and secondary sentence accents are rendered in GAT 2
and since the system has more notational features for other intonational features
(not displayed above), the GAT 2 system allows for a more fine-grained rep-
resentation of prosodic features. For more detailed discussions of each system, see
Atkinson and Heritage (1984: ix–xvi), Selting et al. (2009), and chapter 3 of this
volume.
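
The conventions shared by both systems can be illustrated with a short script. What follows is a minimal sketch, not part of either transcription system: the label names, the INTONATION table, and the annotate function are hypothetical and introduced only for illustration, and the patterns cover just the symbols described above (overlap brackets, final intonation marks, cut-offs, latching, silences, and lengthening). The sketch assumes that both overlap brackets appear on the same line.

```python
import re
from typing import List, Tuple

# Intonation values as described in the text; the dictionary itself is illustrative.
INTONATION = {".": "falling", ",": "continuing", "?": "rising"}

# Hypothetical labels paired with patterns for the symbols discussed above.
PATTERNS = [
    ("timed_silence", re.compile(r"\(\d+\.\d+\)")),       # e.g. (0.2)
    ("micro_silence", re.compile(r"\(\.\)")),             # (.)
    ("overlap", re.compile(r"\[[^\]]*\]")),               # [ ... ] on one line
    ("latching", re.compile(r"=")),                       # no beat of silence
    ("cut_off", re.compile(r"\w-(?=\s|$)")),              # word-final hyphen
    ("lengthening", re.compile(r"\w:+")),                 # colon after a sound
    ("final_intonation", re.compile(r"[.,?](?=\s|$)")),   # intonation, not punctuation
]


def annotate(line: str) -> List[Tuple[str, str]]:
    """Return (label, matched text) pairs in order of appearance."""
    found = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(line):
            found.append((match.start(), label, match.group(0)))
    found.sort()
    return [(label, text) for _, label, text in found]


if __name__ == "__main__":
    # An invented line that combines several symbols; it is not taken from the data.
    for label, text in annotate("das kenns- (.) [ja] =Lu:dwig (0.4) gut."):
        print(label, text)
```

A sketch of this kind only locates symbols. It does not capture upper case, underlining, or the additional prosodic notation of GAT 2, and it is no substitute for the analyst's judgment in producing the transcript.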

3. Research topics

As mentioned above, the research questions in IL and CA center on either the func-
tion of particular linguistic features and embodied conduct in talk-in-interaction, or
the systematic organization of social actions. Given our page constraints, it will not
be possible to provide an exhaustive overview of all research topics in IL and CA.
Instead, we present some of the most salient topics and include some representative
studies. However, the IL/CA research community maintains a comprehensive and
current bibliographic database which can be sorted by author, topic, type and year
of publication, and which the interested reader can access at
http://emcawiki.net/EMCA_bibliography_database.
As previously noted, in IL grammatical and other linguistic features are viewed
as adapted to and shaped by conversation (Ford, Fox and Thompson 2003). For
instance, some of the first studies in IL considered the impact of prosody on
turn-taking (French and Local 1983; Local, Wells and Sebba 1985; Local, Kelly
and Wells 1986). These earlier studies were mostly undertaken by British scholars
working on English. Since then, however, the scope of the research has broadened
to include other languages and other interactional environments (e. g. Selting 1988,
1992, 1995; Couper-Kuhlen and Ford 2004; Barth-Weingarten, Dehé and Wich-
mann 2009; Barth-Weingarten and Szczepek Reed 2014).
In terms of grammar, a large variety of features have been studied. Earlier
research focused on sentence construction (Goodwin 1981, 1984, 1995), anaphora
(Fox 1987, 1996), and adverbial clauses (Ford 1993). In these studies, researchers
pointed out that these phenomena are locally and interactionally managed.
That is, the structures speakers use (1) depend on the knowledge of coparticipants,
(2) are influenced by their input and (3) depend on their placement
within the overall structure of the interaction. Over the years, the research in IL
has been broadened to include work on question design (e. g. Selting 1992; Koshik
2005; Sidnell 2010), response particles (e. g. Heritage 1998; Sorjonen 2001;
Heritage 2002; Betz and Golato 2008; Golato and Betz 2008; Golato 2010, 2012),
discourse markers (e. g. Günthner 1999; Barske and Golato 2010; Keevallik 2010a,
2010b), and tag questions (e. g. Jefferson 1973, 1980; Holmes 1982; Harren 2001;
Hepburn and Potter 2009; Enfield, Brown and de Ruiter 2012; D. Drake and V.
Drake 2015; V. Drake 2015).
In CA, the first two seminal papers focused on turn-taking (Sacks, Schegloff
and Jefferson 1974) and on self-repair (Schegloff, Jefferson and Sacks 1977).
Sacks et al. (1974) firmly established that there is a regularity in how participa-
tion in conversation is organized. Turns at talk consist of turn constructional units
(TCUs) which are utterances that are syntactically, prosodically, and pragmatically
complete. At the end of a given TCU, speaker change becomes relevant. Either the
current speaker can select to go on, or they can select another speaker, or another
speaker can self-select. These (normative) rules keep overlaps and silences brief
and also account for why speakers’ turns at talk have different lengths. In general,
the article showed that turn-taking is locally and interactionally managed. Given its
vital role in turn-taking, it is not surprising that as a construct, the TCU has been
extensively explored in later research (e. g. Ford, Fox and Thompson 1996; Selting
1998; Szczepek Reed and Raymond 2013). In addition, researchers have studied
collaborative completions of turns (e. g. Lerner 1991), pivot constructions (e. g.
Scheutz 2005; Betz 2008; Clayman and Raymond 2015), interruptions (French
and Local 1983; Schegloff 1987; Drummond 1989), silences (e. g. Jefferson 1989;
Roberts, Francis and Morgan 2006; Hoey 2015), and the role of embodied behavior
(e. g. Streeck 1993, 1994; Mondada 2007), as well as the overall turn-taking system
in other speech exchange systems (e. g. Greatbatch 1988; Egbert 1997; Mondada
2015).
Another topic that has received continued attention is repair, collectively under-
stood as the mechanisms available to interactants when dealing with problems
in speaking, hearing and understanding. When initiating repair, speakers stop the
ongoing action to deal with the problem before continuing on with the prior action;
in such instances, self-repair is preferred over other-initiated repair (Schegloff et
al. 1977). Research has been conducted on self- and other-initiated repair as well
as on individual repair initiators in a given language and across languages, various
functions of repair, the intersection of repair and phonetics, the intersection of
repair and embodied action, and repair in a variety of speech exchange systems.
Given the vast amount of research on this topic, the reader is referred to several
overview articles (Kitzinger 2012; Fox 2013; Fox, Benjamin and Mazeland 2013;
Mazeland and Benjamin 2013) and a recent collection (Hayashi, Raymond and
Sidnell 2013).
As mentioned above, one of the main contributions of CA is its practice of
analyzing talk as sequences of action and of viewing sequence organization as the
main structural feature of talk (Schegloff 1990, 2007). A variety of actions have
been investigated, for instance how interactants refer to persons and things. The
choice of reference forms has been shown to depend both on recipient design (i. e.
speakers orient to the knowledge that they presume the recipient(s) to have about
the referent) and on where in the course of an action a referring expression is placed
(e. g. Sacks and Schegloff 1979; Auer 1984; Fox 1987; Ford and Fox 1996; Fox
1996; Lerner 1996; Enfield and Stivers 2007; Egbert, Golato and Robinson 2009;
Enfield 2012; Kitzinger, Shaw and Toerien 2012; Lerner et al. 2012; Mondada
2014). Other actions that have received much attention across languages include
requests (e. g. Davidson 1984; Taleghani-Nikazm 2002b, 2005, 2006; Curl and
Drew 2008; Taleghani-Nikazm and Huth 2010; Drew and Couper-Kuhlen 2014;
Kendrick and Drew 2016) and both assessments and compliments (e. g. Pomerantz
1978; Auer and Uhmann 1982; Pomerantz 1984; Goodwin and Goodwin 1987,
1992; Antaki 2002; Golato 2005; Peräkylä and Ruusuvuori 2006; Lindström and
Mondada 2009; Mondada 2009; Golato 2011).

4. Strengths and weaknesses

The strengths and weaknesses of a particular research methodology always have
to be considered in relation to the types of research questions one wants to answer.
If one is interested in actual language use rather than in intuition or beliefs about
language, CA’s and IL’s obvious strengths lie in their focus on naturally occur-
ring data and their ethnomethodological approach. At the core of each analysis is
not the researcher’s interpretation regarding the possible function of a linguistic
structure, body movement, or facial expression in the interaction, but rather the
coparticipant’s orientation to the phenomenon in question. As demonstrated above,
these orientations are visible from the surrounding turns. Given that research arti-
cles always contain ample examples of the phenomenon under investigation and
given that these examples are in the form of transcripts of the entire surrounding
sequence (rather than just the isolated target phenomenon), it is possible for readers
to verify the line of argumentation for themselves. In other words, the approach
of having to make a case from an emic perspective, in combination with the
requirement to account for deviant cases (see above), makes for a solid analytical
approach that leaves virtually no room for ambiguities with regard to the function
of a given turn. When studying grammatical features that occur turn-internally, it
can be challenging to show the interactants’ orientation to the phenomenon under
investigation; turn-initial or turn-final positions typically carry more of a speaker’s
stance to surrounding turns.1

1 Sometimes, it can even be difficult to show an orientation to elements in turn-initial
position. For instance, together with other researchers, we were trying to explain how
German declarative utterances with the verb in initial position differed in function from
those in which the verb was in second position. This was impossible to do with a straight
sequential analysis, as interactants did not display an orientation to the positioning of
the verb.

Obviously, the strength of an argument in CA or IL terms depends in large part on
the quality and amount of data collected. It has been pointed out that the recording
equipment can influence the individuals being recorded (Kasper 2000). However,
it should be noted that a detailed sequential analysis will make any such orientation
to the recording apparent. Moreover, depending on the frequency of occurrence of the
type of action or linguistic feature under investigation, it can be rather painstaking
to assemble a large enough corpus (Kasper 2000). And lastly, transcribing data requires
training, is time-consuming, and can itself influence the analysis (e. g. Wagner 2013;
Auer 2014; Bolden 2015; Ogden 2015; Egbert, Yufu and Hirataka 2016). In this sense,
transcribing data is already a form of interpreting them. However, it should be
noted that this last issue is common to all research methodologies that investigate
language use. CA and IL have also been criticized because they do not allow for the
incorporation of external variables such as the gender or socio-economic status of
the speaker (Yuan 2001); in line with their ethnomethodological approach, in CA
and IL such variables are only discussed if they are recognizably relevant to the
members of the conversation.

5. Application

Since the late 1970s, CA and IL researchers have also been investigating non-ordinary,
i. e. institutional, talk such as news interviews, courtroom interaction, medical
encounters, classroom discourse, etc. One of the first such works is Atkinson and
Drew’s analysis of courtroom interaction (Atkinson and Drew 1979). As Heritage
(2013: 3–4) explains, in institutional settings, participants have institution-relevant
roles (e. g. doctor-patient) and related interactional goals, are subject to
institution-specific restrictions as to what a permissible contribution entails (e. g.
doctors do not commiserate with patients by providing stories of their own ailments),
and are subject to specific processes and inferential frameworks. In these settings,
turn-taking, repair, preference organization, sequence organization, etc. are still at
work, but one can notice departures from what can be observed in ordinary
conversation. For instance, in classroom settings, a question-answer sequence is
typically followed by a third turn that contains an evaluation. This is the case because
teachers ask known-answer questions (these are so-called “display questions”) and
then evaluate the (linguistic or content) accuracy of the response provided by the
student (McHoul 1978; Mehan 1985). It is a basic tenet in CA that the institutional
context of an interaction is talked-into-being (Heritage 1989), meaning that the
participants to the interaction create and uphold the institutional frame through
their own contributions. For conceptual articles on the difference between every-
day and institutional talk-in-interaction and how to analyze interaction in institu-
tional contexts, see for instance Heritage (2005, 2013) or Mondada (2012). Several
review articles and books provide an overview of research on medical interaction
(e. g. Gill and Roberts 2012; Koenig and Robinson 2014), institutional meetings
(e. g. Asmuß and Svennevig 2009; Svennevig 2012), classroom interaction (e. g.
Seedhouse 2005; Markee 2015), news interviews (Clayman 2012), and courtroom
interaction (e. g. Komter 2012).
CA and IL have also been applied to the study of second language acquisition
(SLA). Firth and Wagner (1997, 1998, 2007) were the first to criticize prior SLA
research for adopting an almost exclusively cognitive perspective. They called
for studies conducted from an emic perspective, in which language learning was
viewed as a socially distributed phenomenon/accomplishment. Since the publica-
tion of the special issue in which Firth and Wagner’s work appeared, CA-for-SLA
has developed as a promising approach to studying language acquisition (e. g. Mar-
kee 2000; Taleghani-Nikazm and Huth 2010; Hellermann 2012).

6. Prospects for future research

Both IL and CA have seen a tremendous amount of growth in the last decades in
terms of the number of researchers using these approaches, the number of studies
published, and the areas covered. At the same time, the approaches themselves
have remained remarkably consistent. We would expect these trends to continue.
In recent years, there has been a growing trend to include more comparative
work in the study of both actions and linguistic features. While work on languages
other than English has frequently made comparisons with prior work conducted on
English, more contrastive studies now exist between two non-English languages
(e. g. Taleghani-Nikazm 2002a; Betz 2008; A. Golato and P. Golato 2015), in addi-
tion to work that compares a group of features, such as turn initial particles, across
a variety of languages (e. g. Sorjonen and Heritage in press) and to work that inves-
tigates a specific practice or linguistic feature across many different languages
(e. g. Fox et al. 2009; Enfield et al. 2013; Dingemanse and Enfield 2015; Auer
and Maschler 2016). Again, as more becomes known about individual languages,
we expect more studies to adopt a comparative perspective and discuss language
typology and universals.
In addition, we anticipate further pedagogical applications of IL and CA
research. A number of theoretical papers (e. g. Wong 2002; Huth and Taleghani-Ni-
kazm 2006; Huth 2007; Barraja-Rohan 2011; Betz and Huth 2014) demonstrate how
findings from IL and CA can be applied to the teaching of foreign languages and to
the development of teaching materials for English (Barraja-Rohan and Pritchard
1997) and German (e. g. Burkert and Roitsch 2014; Huth 2014; D. Drake and V.
Drake 2015; Taleghani-Nikazm and Golato 2016). Similar pedagogical materials
exist for the training of medical staff (e. g. Heritage 2009), and for CA workshops
and seminars (Robinson and Heritage 2014). A small body of work also exists that
applies CA and IL research findings to improve interaction in other domains, and
there is work that combines CA with other research traditions (for review articles
and empirical research, see Antaki 2011, 2014).

References

Antaki, Charles
2002 “Lovely”: Turn-initial high grade assessments in telephone closings. Dis-
course Studies 4(1): 5–23.
Antaki, Charles (ed.)
2011 Applied Conversation Analysis: Intervention and Change in Institutional Talk.
Basingstoke, U.K.: Palgrave Macmillan.
Antaki, Charles (ed.)
2014 Research on Language and Social Interaction. Special Issue on Conversation
Analysis and Intervention 47(3).
Asmuß, Birte and Jan Svennevig
2009 Meeting talk: An introduction. Journal of Business Communication 46(1):
3–22.
Atkinson, J. Maxwell and Paul Drew
1979 Order in Court: The Organisation of Verbal Interaction in Judicial Settings.
London: Macmillan.
Atkinson, J. Maxwell and John Heritage (eds.)
1984 Structures of Social Action. Studies in Conversation Analysis. Cambridge:
Cambridge University Press.
Auer, Peter
1984 Referential problems in conversation. Journal of Pragmatics 8: 627–648.
Auer, Peter
2014 There’s no harm in glossing (but a need for a better understanding of the status
of transcripts). Research on Language and Social Interaction 47(1): 17–22.
Auer, Peter and Aldo Di Luzio (eds.)
1992 The Contextualization of Language, Volume 22. Amsterdam/Philadelphia:
John Benjamins.
Auer, Peter and Yael Maschler (eds.)
2016 “Nu” and “Nå”. A Family of Discourse Markers across the Languages of
Europe and Beyond. Berlin: de Gruyter.
Auer, Peter and Susanne Uhmann
1982 Aspekte der konversationellen Organisation von Bewertungen. Deutsche
Sprache 10: 1–32.
Barraja-Rohan, Anne-Marie
2011 Using conversation analysis in the second language classroom to teach inter-
actional competence. Language Teaching Research 15(4): 479–507.
Barraja-Rohan, Anne-Marie and Ruth Pritchard
1997 Beyond Talk: A Course in Communication and Conversation for Intermediate
Adult Learners of English. Melbourne, Australia: Western Melbourne Institute
of TAFE.
Barske, Tobias and Andrea Golato
2010 German so: Managing sequence and action. Text & Talk 30(3): 245–266.
Barth-Weingarten, Dagmar and Beatrice Szczepek Reed (eds.)
2014 Prosodie und Phonetik in der Interaktion. Mannheim: Verlag für
Gesprächsforschung.
Barth-Weingarten, Dagmar, Nicole Dehé and Anne Wichmann
2009 Where Prosody Meets Pragmatics. Bingley: Emerald.
Betz, Emma
2008 Grammar and Interaction: Pivots in German Conversation. Amsterdam/
Philadelphia: John Benjamins.
Betz, Emma and Andrea Golato
2008 Remembering relevant information and withholding relevant next actions: The
German token achja. Research on Language and Social Interaction 41(1):
58–98.
Betz, Emma and Thorsten Huth
2014 Beyond grammar: Teaching interaction in the German classroom. Die
Unterrichtspraxis 47(2): 140–163.
Blumer, Herbert
1969 Symbolic Interactionism: Perspective and Method. New Jersey: Prentice-Hall.
Bolden, Galina
2015 Transcribing as research: ‘Manual’ transcription and conversation analysis.
Research on Language and Social Interaction 48(3): 276–280.
Burkert, Anna Kristin and Julia Marisa Roitsch
2014 Und ich so: Nee, oder? Teaching two formats for reported discourse in German
interaction. Die Unterrichtspraxis 47(2): 193–207.
Clayman, Steven E.
2012 Conversation analysis in the news interview. In: Jack Sidnell and Tanya
Stivers (eds.), The Handbook of Conversation Analysis, 630–656. Oxford, U.K.:
Wiley-Blackwell.
Clayman, Steven E. and Chase Wesley Raymond
2015 Modular pivots: A resource for extending turns at talk. Research on Language
and Social Interaction 48(4): 388–405.
Couper-Kuhlen, Elizabeth and Cecilia E. Ford
2004 Sound Patterns in Interaction. Cross-Linguistic Studies from Conversation.
Amsterdam/Philadelphia: John Benjamins.
Couper-Kuhlen, Elizabeth and Margret Selting (eds.)
1996 Prosody in Conversation. Interactional Studies. Cambridge: Cambridge Uni-
versity Press.
Curl, Traci S. and Paul Drew
2008 Contingency and action: A comparison of two forms of requesting. Research
on Language and Social Interaction 41(2): 129–153.
Davidson, Judy Arlene
1984 Subsequent versions of invitations, offers, requests, and proposals dealing
with potential or actual rejection. In: J. Maxwell Atkinson and John Heritage
(eds.), Structures of Social Action. Studies in Conversation Analysis, 102–128.
Cambridge: Cambridge University Press.
Dingemanse, Mark and Nick J. Enfield (eds.)
2015 Open Linguistics. Special Issue on Other-initiated Repair Across Languages:
Towards a Typology of Conversational Structures 1(1).
Drake, Derek and Veronika Drake
2015 Tags are easy, ne? How to teach the use of tags in the German language class-
room. Die Unterrichtspraxis 48(1): 146–161.
Drake, Veronika
2015 Indexing uncertainty: The case of turn-final or. Research on Language and
Social Interaction 48(3): 301–318.
Drew, Paul and Elizabeth Couper-Kuhlen (eds.)
2014 Requesting in Social Interaction. Amsterdam/Philadelphia: John Benjamins.
Drew, Paul and John Heritage
2006 Editor’s introduction. In: Paul Drew and John Heritage (eds.), Conversation
Analysis, xxi–xxvii, Volume 1. London: Sage.
Drummond, Kent
1989 A backward glance at interruptions. Western Journal of Speech Communica-
tion. Special Issue on Sequential Organization of Conversational Activities 53:
150–166.
Duranti, Alessandro and Charles Goodwin (eds.)
1992 Rethinking Context: Language as an Interactive Phenomenon. Cambridge:
Cambridge University Press.
Durkheim, Emile
[1897] 1997 Suicide. A Study in Sociology. New York: The Free Press.
Egbert, Maria
1996 Context-sensitivity in conversation: Eye gaze and the German repair initiator
‘bitte’? Language in Society 25(4): 587–612.
Egbert, Maria
1997 Schisming: The collaborative transformation from a single conversation to
multiple conversations. Research on Language and Social Interaction 30(1):
1–51.
Egbert, Maria, Andrea Golato and Jeffrey Robinson
2009 Repairing reference. In: Jack Sidnell (ed.), Conversation Analysis: Comparative
Perspectives, 104–132. Cambridge: Cambridge University Press.
Egbert, Maria, Mamiko Yufu and Fumiya Hirataka
2016 An investigation of how 100 articles in the Journal of Pragmatics treat tran-
scripts of English and non-English languages. Journal of Pragmatics 94:
98–111.
Enfield, Nick J.
2012 Reference in conversation. In: Jack Sidnell and Tanya Stivers (eds.), The Hand-
book of Conversation Analysis, 433–454. Malden, MA: Wiley-Blackwell.
Enfield, Nick J., Penelope Brown and Jan P. de Ruiter
2012 Epistemic dimensions of polar questions: Sentence-final particles in compara-
tive perspective. In: Jan P. de Ruiter (ed.), Questions. Formal, Functional and
Interactional Perspectives, 193–221. Cambridge: Cambridge University Press.
Enfield, Nick J., Mark Dingemanse, Julija Baranova, Joe Blythe, Penelope Brown, Tyko
Dirksmeyer, Paul Drew, Simeon Floyd, Sonja Gipper, Rósa Signý Gisladottir,
Gertie Hoymann, Kobin H. Kendrick, Stephen C. Levinson, Lilla Magyari,
Elizabeth Manrique, Giovanni Rossi, Lila San Roque and Francisco Torreira
2013 Huh? What? – A first survey in 21 languages. In: Geoffrey Raymond, Jack Sid-
nell and Makoto Hayashi (eds.), Conversational Repair and Human Under-
standing, 343–380. Cambridge: Cambridge University Press.
Enfield, Nick J. and Tanya Stivers
2007 Person Reference in Interaction. Linguistic, Cultural, and Social Perspectives.
Cambridge: Cambridge University Press.
Firth, Alan and Johannes Wagner
1997 On discourse, communication, and (some) fundamental concepts in SLA
research. The Modern Language Journal 81(3): 285–300.
Firth, Alan and Johannes Wagner
1998 SLA property: No trespassing! The Modern Language Journal 82(1): 91–94.
Firth, Alan and Johannes Wagner
2007 Second/Foreign language learning as a social accomplishment: Elaborations
on a reconceptualized SLA. The Modern Language Journal 91 (Focus Issue):
800–819.
Ford, Cecilia E.
1993 Grammar in Interaction. Adverbial Clauses in American English Conversa-
tions. Cambridge/New York: Cambridge University Press.
Ford, Cecilia E. and Barbara A. Fox
1996 Interactional motivations for reference formulation: He had. This guy had,
a beautiful, thirty-two o:lds. In: Barbara A. Fox (ed.), Studies in Anaphora,
145–168. Amsterdam/Philadelphia: John Benjamins.
Ford, Cecilia E., Barbara A. Fox and Sandra A. Thompson
1996 Practices in the construction of turns: The “TCU” revisited. Pragmatics 6(3):
427–454.
Ford, Cecilia E., Barbara A. Fox and Sandra A. Thompson
2003 Social interaction and grammar. In: Michael Tomasello (ed.), The New Psy-
chology of Language. Cognitive and Functional Approaches To Language
Structure, Volume 2, 119–143. Mahwah: Lawrence Erlbaum.
Ford, Cecilia E. and Johannes Wagner (eds.)
1996 Interaction-based Studies of Language. Special issue of Pragmatics, Volume
6.
Fox, Barbara A.
1987 Discourse Structure and Anaphora. Cambridge: Cambridge University Press.
Fox, Barbara A.
2013 Conversation analysis and self-repair. In: Carol Chapelle (ed.), The Encyclo-
pedia of Applied Linguistics, 1105–1110. Oxford, U.K.: Wiley-Blackwell.
Fox, Barbara A. (ed.)
1996 Studies in Anaphora, Volume 33. Amsterdam/Philadelphia: John Benjamins.
Fox, Barbara A., Trevor Benjamin and Harrie Mazeland
2013 Conversation analysis and repair organization: Overview. In: Carol Chapelle
(ed.), The Encyclopedia of Applied Linguistics, 1094–1097. Oxford:
Wiley-Blackwell.
Fox, Barbara A. and Sandra A. Thompson
2010 Responses to wh-questions in English conversation. Research on Language
and Social Interaction 43(2): 133–156.
Fox, Barbara A., Fay Wouk, Makoto Hayashi, Steven Fincke, Liang Tao, Marja-Leena Sor-
jonen, Minna Laakso and Wilfridio Flores Hernandez
2009 A cross-linguistic investigation of the site of initiation in same-turn self-re-
pair. In: Jack Sidnell (ed.), Conversation Analysis: Comparative Perspectives,
60–103. Cambridge: Cambridge University Press.
French, Peter and John Local
1983 Turn-competitive incomings. Journal of Pragmatics 7: 17–38.
Garfinkel, Harold
1967 Studies in Ethnomethodology. Englewood Cliffs, NJ: Prentice-Hall.
Gill, Virginia Teas and Felicia Roberts
2012 Conversation analysis in medicine. In: Jack Sidnell and Tanya Stivers (eds.),
The Handbook of Conversation Analysis, 575–592. Oxford: Wiley-Blackwell.
Goffman, Erving
1959 The Presentation of Self in Everyday Life. Garden City, NY: Doubleday.
Golato, Andrea
2005 Compliments and Compliment Responses. Grammatical Structure and Sequen-
tial Organization. Amsterdam/Philadelphia: John Benjamins.
Golato, Andrea
2010 Marking understanding versus receipting information in talk: achso. and ach in
German interaction. Discourse Studies 12(2): 147–176.
Golato, Andrea
2011 Appreciatory sounds and expressions of embodied pleasure used as compli-
ments. In: Karin Aijmer and Gisle Andersen (eds.), Pragmatics of Society,
359–390. Berlin: de Gruyter Mouton.
Golato, Andrea
2012 German oh: Marking an emotional change-of-state. Research on Language
and Social Interaction 45(3): 1–24.
Golato, Andrea and Emma Betz
2008 German ach and achso in repair uptake: A resource to sustain or remove epis-
temic asymmetry. Zeitschrift für Sprachwissenschaft 27: 7–37.
Golato, Andrea and Peter Golato
2015 Reference repair in German and French. Journal of Pragmatics. Special Issue
Reference in Interaction 87: 218–237.
Goodwin, Charles
1981 Conversational Organization. Interaction between Speakers and Hearers.
New York: Academic Press.
Goodwin, Charles
1984 Notes on story structure and the organization of participation. In: J. Maxwell
Atkinson and John Heritage (eds.), Structures of Social Action. Studies in Con-
versation Analysis, 225–246. Cambridge: Cambridge University Press.
Goodwin, Charles
1995 Sentence construction within interaction. In: Uta Quasthoff (ed.), Aspects of
Oral Communication, 198–219. Berlin/New York: Walter de Gruyter.
Goodwin, Charles and Marjorie Harness Goodwin
1987 Concurrent operations on talk: Notes on the interactive organization of assess-
ments. IPRA Papers in Pragmatics 1(1): 1–54.
Goodwin, Charles and Marjorie Harness Goodwin
1992 Assessments and the construction of context. In: Alessandro Duranti and
Charles Goodwin (eds.), Rethinking Context: Language as an Interactive Phe-
nomenon, 147–190. Cambridge: Cambridge University Press.
Greatbatch, David
1988 A turn-taking system for British news interviews. Language in Society 17:
401–430.
Günthner, Susanne
1999 Entwickelt sich der Konzessivkonnektor obwohl zum Diskursmarker? Gram-
matikalisierungstendenzen im gesprochenen Deutsch. Linguistische Berichte
180: 409–446.
Harren, Inga
2001 “Ne?” in Alltagsgesprächen. Interaktive Funktionen und Positionierung in
Turn und Sequenz. Oldenburg: Universität Oldenburg.
Hayashi, Makoto, Geoffrey Raymond and Jack Sidnell (eds.)
2013 Conversational Repair and Human Understanding. Cambridge: Cambridge
University Press.
Hellermann, John
2012 Conversation analysis and language learning. In: Carol Chapelle (ed.), The
Encyclopedia of Applied Linguistics, 1027–1032. Oxford: Wiley-Blackwell.
Hepburn, Alexa and Jonathan Potter
2009 Interrogating tears: Some uses of tag questions in a child protection helpline.
In: Alice F. Freed and Susan Ehrlich (eds.), Why Do You Ask? The Function of
Questions in Institutional Discourse, 69–86. Oxford: Oxford University Press.
Heritage, John
1984 Garfinkel and Ethnomethodology. Cambridge: Polity Press, in association
with Basil Blackwell, Oxford.
Heritage, John
1988 Explanations as accounts: A conversation analytic perspective. In: Charles
Antaki (ed.), Analysing Everyday Explanation. A Casebook of Methods,
127–144. London/Newbury Park/Beverly Hills/New Delhi: Sage.
Heritage, John
1989 Current developments in conversation analysis. In: Derek Roger and Peter
Bull (eds.), Conversation. An Interdisciplinary Perspective, 21–47. Clevedon, UK:
Multilingual Matters.
Heritage, John
1998 Oh-prefaced responses to inquiry. Language in Society 27: 291–334.
Heritage, John
2002 Oh-prefaced responses to assessments: A method of modifying agreement/
disagreement. In: Cecilia E. Ford, Barbara A. Fox and Sandra A. Thompson
(eds.), The Language of Turn and Sequence, 196–224. Oxford: Oxford Univer-
sity Press.
Heritage, John
2005 Conversation analysis and institutional talk. In: Robert Sanders and Kristine
Fitch (eds.), Handbook of Language and Social Interaction, 103–146. Mah-
wah, N.J.: Erlbaum.
Heritage, John
2009 Conversation analysis as an approach to the medical encounter. In: John B.
McKinlay and Lisa Marceau (eds.), e-Source: Behavioral and Social Sci-
ence Research Interactive Textbook. Office of Behavioral and Social Science
Research. http://www.esourceresearch.org/
Heritage, John
2013 Language and social institutions: The conversation analytic view. Journal of
Foreign Languages 36(4): 2–27.
Heritage, John and J. Maxwell Atkinson
1984 Introduction. In: J. Maxwell Atkinson and John Heritage (eds.), Structures of
Social Action. Studies in Conversation Analysis, 1–15. Cambridge: Cambridge
University Press.
Hoey, Elliott M.
2015 Lapses: How people arrive at, and deal with, discontinuities in talk. Research
on Language and Social Interaction 48(4): 430–453.
Holmes, Janet
1982 The function of tag questions. English Language Research Journal 3: 40–65.
Hutchby, Ian and Robin Wooffitt
2008 Conversation Analysis. Second edition. Cambridge: Polity Press.
Huth, Thorsten
2007 Pragmatics revisited: Teaching with natural language data. Die
Unterrichtspraxis 40(1): 21–29.
Huth, Thorsten
2014 “When in Berlin …”: Teaching German telephone openings. Die
Unterrichtspraxis 47(2): 164–179.
Huth, Thorsten and Carmen Taleghani-Nikazm
2006 How can insights from conversation analysis be directly applied to teaching
L2 pragmatics? Language Teaching Research 10(1): 53–79.
Jefferson, Gail
1973 A case of precision timing in ordinary conversation: Overlapped tag-posi-
tioned address terms in closing sequences. Semiotica 9: 47–96.
Jefferson, Gail
1980 The abominable Ne? An exploration of post-response pursuit of response.
In: Peter Schröder and Hugo Steger (eds.), Dialogforschung. Jahrbuch 1980
des Instituts für deutsche Sprache, 53–88. Düsseldorf: Pädagogischer Verlag
Schwann.
Jefferson, Gail
1989 Notes on a possible metric which provides for a ‘standard maximum’ silence
of approximately one second in conversation. In: Derek Roger and Peter Bull
(eds.), Conversation. An Interdisciplinary Perspective, 166–194. Clevedon,
UK: Multilingual Matters.
Kasper, Gabriele
2000 Data collection in pragmatics research. In: Helen Spencer-Oatey (ed.),
Culturally Speaking. Managing Rapport through Talk across Cultures, 316–341.
London/New York: Continuum.
Keevallik, Leelo
2010a Marking boundaries between activities: The particle nii in Estonian. Research
on Language and Social Interaction 43(2): 157–182.
Keevallik, Leelo
2010b Pro-adverbs of manner as markers of activity transition. Studies in Language
34(2): 350–381.
Kendrick, Kobin H. and Paul Drew
2016 Recruitment: Offers, requests, and the organization of assistance in interac-
tion. Research on Language and Social Interaction 49(1): 1–19.
Kern, Friederike and Margret Selting
2013 Conversation analysis and interactional linguistics. In: Carol Chapelle (ed.),
The Encyclopedia of Applied Linguistics, 1012–1016. Oxford: Wiley-Black-
well.
Kitzinger, Celia
2012 Repair. In: Jack Sidnell and Tanya Stivers (eds.), The Handbook of Conversa-
tion Analysis, 229–256. Oxford: Wiley-Blackwell.
Kitzinger, Celia, Rebecca Shaw and Merran Toerien
2012 Referring to persons without using a full-form reference: Locally initial index-
icals in action. Research on Language and Social Interaction 45(2): 116–136.
Koenig, Christopher J. and Jeffrey Robinson
2014 Conversation analysis: Understanding the structure of health talk. In: Bryan
B. Whaley (ed.), Research Methods in Health Communication: Principles and
Application, 119–140. New York/London: Routledge.
Komter, Martha L.
2012 Conversation analysis in the courtroom. In: Jack Sidnell and Tanya Stivers
(eds.), The Handbook of Conversation Analysis, 612–629. Oxford: Wiley-
Blackwell.
Koshik, Irene
2005 Beyond Rhetorical Questions: Assertive Questions in Everyday Interaction.
Amsterdam/Philadelphia: John Benjamins.
Lerner, Gene H.
1991 On the syntax of sentences-in-progress. Language in Society 20: 441–458.
Lerner, Gene H.
1996 On the place of linguistic resources in the organization of talk-in-interaction:
“Second person” reference in multi-party conversation. Pragmatics 6: 281–
294.
Lerner, Gene H., Galina Bolden, Alexa Hepburn and Jenny Mandelbaum
2012 Reference recalibration repairs: Adjusting the precision of formulations for the
task at hand. Research on Language and Social Interaction 45(2): 191–212.
Lindström, Anna and Lorenza Mondada
2009 Assessments in social interaction: Introduction to the special issue. Research
on Language and Social Interaction 42(4): 299–308.
Local, John, John Kelly and Bill Wells
1986 Towards a phonology of conversation: Turn-taking in Tyneside English.
Journal of Linguistics 22(2): 411–437.
Local, John, Bill Wells and Mark Sebba
1985 Phonology for conversation: Phonetic aspects of turn-delimitation in London
Jamaican. Journal of Pragmatics 9(2–3): 309–330.
Markee, Numa
2000 Conversation Analysis. Mahwah, N.J.: Lawrence Erlbaum Associates.
Markee, Numa
2015 Introduction: Classroom discourse and interaction research. In: Numa Markee
(ed.), The Handbook of Classroom Discourse and Interaction, 3–20. Oxford:
Wiley-Blackwell.
Mazeland, Harrie and Trevor Benjamin
2013 Conversation analysis and other-initiated repair. In: Carol Chapelle (ed.), The
Encyclopedia of Applied Linguistics, 1068–1075. Oxford, U.K.: Wiley-Black-
well.
McHoul, Alexander W.
1978 The organization of turns at formal talk in the classroom. Language in Society
7: 183–213.
Mehan, Hugh
1985 The structure of classroom discourse. In: Teun van Dijk (ed.), Handbook of
Discourse Analysis, 119–131, Volume 3. London: Academic Press.
Mondada, Lorenza
2007 Multimodal resources for turn-taking: Pointing and the emergence of possible
next speakers. Discourse Studies 9(2): 194–225.
Mondada, Lorenza
2009 The methodical organization of talking and eating: Assessments in dinner con-
versations. Food Quality and Preference 20(8): 558–571.
Mondada, Lorenza
2012 Conversation analysis and institutional interaction. In: Carol Chapelle (ed.),
The Encyclopedia of Applied Linguistics, 1005–1011. Oxford: Wiley-Black-
well.
Mondada, Lorenza
2014 Pointing, talk, and the bodies: Reference and joint attention as embodied inter-
actional achievements. In: Mandana Seyfeddinipur and Marianne Gullberg
(eds.), From Gesture in Conversation to Visible Action as Utterance. Essays in
Honor of Adam Kendon, 95–124. Amsterdam/Philadelphia: John Benjamins.
Mondada, Lorenza
2015 The facilitator’s task of formulating citizens’ proposals in political meet-
ings: Orchestrating multiple embodied orientations to recipients. Gesprächs-
forschung. Online-Zeitschrift zur verbalen Interaktion 16: 1–62.
Ochs, Elinor, Emanuel A. Schegloff and Sandra A. Thompson (eds.)
1996 Interaction and Grammar. Cambridge: Cambridge University Press.
Ogden, Richard
2015 Data always invite us to listen again: Arguments for mixing our methods.
Research on Language and Social Interaction 48(3): 271–275.
Parsons, Talcott
1937 The Structure of Social Action. A Study in Social Theory with Special Refer-
ence to a Group of Recent European Writers. New York: McGraw-Hill.
Peräkylä, Anssi and Johanna Ruusuvuori
2006 Facial expression in an assessment. In: Hubert Knoblauch, Bernt Schnettler,
Jürgen Raab and Hans-Georg Soeffner (eds.), Video Analysis. Methodology
and Methods. Qualitative Audiovisual Data Analysis in Sociology, 127–142.
Frankfurt am Main: Peter Lang.
Pomerantz, Anita
1978 Compliment responses: Notes on the co-operation of multiple constraints. In:
Jim Schenkein (ed.), Studies in the Organization of Conversational Interac-
tion, 79–112. New York/San Francisco/London: Academic Press.
Pomerantz, Anita
1984 Agreeing and disagreeing with assessments: Some features of preferred/dis-
preferred turn shapes. In: J. Maxwell Atkinson and John Heritage (eds.), Struc-
tures of Social Action, 225–246. Cambridge: Cambridge University Press.
Potter, Jonathan
2002 Two kinds of natural. Discourse Studies 4(4): 543–548.
Potter, Jonathan
2009 Discourse analysis. In: Melissa Hardy and Alan Bryman (eds.), Handbook of
Data Analysis, 607–624. London: Sage.
Roberts, Felicia, Alexander L. Francis and Melanie Morgan
2006 The interaction of inter-turn silence with prosodic cues in listener perceptions
of “trouble” in conversation. Speech Communication 48: 1079–1093.
Robinson, Jeffrey D. and John Heritage
2014 Intervening with conversation analysis: The case of medicine. Research on
Language and Social Interaction 47(3): 201–218.
Sacks, Harvey
1963 Sociological description. Berkeley Journal of Sociology 8: 1–16.
Sacks, Harvey
1987 (1973) On the preferences for agreement and contiguity in sequences in conver-
sation. In: Graham Button and John Lee (eds.), Talk and Social Organization,
54–69. Clevedon: Multilingual Matters.
Sacks, Harvey
1992 Lectures on Conversation, edited by Gail Jefferson, introduction by Emanuel
A. Schegloff. Oxford: Blackwell.
Sacks, Harvey and Emanuel A. Schegloff
1979 Two preferences in the organization of reference to persons in conversation
and their interaction. In: George Psathas (ed.), Everyday Language. Studies in
Ethnomethodology, 15–21. New York: Irvington Publishers.
Sacks, Harvey, Emanuel A. Schegloff and Gail Jefferson
1974 A simplest systematics for the organization of turn-taking for conversation.
Language 50(4): 696–735.
Schegloff, Emanuel A.
1984 On some questions and ambiguities in conversation. In: J. Maxwell Atkinson
and John Heritage (eds.), Structures of Social Action. Studies in Conversation
Analysis, 28–52. Cambridge: Cambridge University Press.
Schegloff, Emanuel A.
1987 Recycled turn beginnings: A precise repair mechanism in conversation’s
turn-taking organisation. In: Graham Button and John R.E. Lee (eds.), Talk
and Social Organization, 70–85. Clevedon: Multilingual Matters.
Schegloff, Emanuel A.
sation. Journal of Pragmatics 12: 55–62.
Schegloff, Emanuel A.
1990 On the organization of sequences as a source of “coherence” in talk-in-inter-
action. In: Bruce Dorval (ed.), Conversational Organization and its Develop-
ment, 51–77. Norwood, N.J.: Ablex Publishing Corporation.
Schegloff, Emanuel A.
1992 In another context. In: Alessandro Duranti and Charles Goodwin (eds.),
Rethinking Context. Language as an Interactional Phenomenon, 191–228.
Cambridge: Cambridge University Press.
Schegloff, Emanuel A.
1996 Some practices for referring to persons in talk-in-interaction: A partial sketch
of a systematics. In: Barbara A. Fox (ed.), Studies in Anaphora, 437–485, Vol-
ume 33. Amsterdam/Philadelphia: John Benjamins.
Schegloff, Emanuel A.
2007 Sequence Organization in Interaction. A Primer in Conversation Analysis,
Volume 1. Cambridge: Cambridge University Press.
Schegloff, Emanuel A., Gail Jefferson and Harvey Sacks
1977 The preference for self-correction in the organization of repair in conversation.
Language 53(2): 361–382.
Schegloff, Emanuel A., Elinor Ochs and Sandra A. Thompson
1996 Introduction. In: Elinor Ochs, Emanuel A. Schegloff and Sandra A. Thompson
(eds.), Interaction and Grammar, 1–51. Cambridge: Cambridge University
Press.
Schegloff, Emanuel A. and Harvey Sacks
1973 Opening up closings. Semiotica 8(4): 289–327.
Scheutz, Hannes
2005 Pivot constructions in spoken German. In: Auli Hakulinen and Margret Selting
(eds.), Syntax and Lexis in Conversation, 103–128. Amsterdam/Philadelphia:
John Benjamins.
Seedhouse, Paul
2005 The Interactional Architecture of the Language Classroom. A Conversation
Analytic Perspective. Oxford: Blackwell.
Selting, Margret
1988 The role of intonation in the organization of repair and problem handling
sequences in conversation. Journal of Pragmatics 12: 293–322.
Selting, Margret
1992 Prosody in conversational questions. Journal of Pragmatics 15(6): 583–588.
Selting, Margret
1995 Prosodie im Gespräch. Tübingen: Niemeyer.
Selting, Margret
1998 TCUs and TRPs: The construction of ‘units’ in conversational talk. Interaction
and Linguistic Structures 4: 1–48.
Selting, Margret, Peter Auer, Dagmar Barth-Weingarten, Jörg Bergmann, Pia Bergmann,
Karin Birkner, Elizabeth Couper-Kuhlen, Arnulf Deppermann, Peter Gilles,
Susanne Günthner, Martin Hartung, Friederike Kern, Christine Mertzlufft,
Christian Meyer, Miriam Morek, Frank Oberzaucher, Jörg Peters, Uta Quast-
hoff, Wilfried Schütte, Anja Stukenbrock and Susanne Uhmann
2009 Gesprächsanalytisches Transkriptionssystem 2 (GAT 2). Gesprächsforschung.
Online-Zeitschrift zur verbalen Interaktion 10: 352–402.
Sidnell, Jack
2010 Questioning repeats in the talk of four-year old children. In: Hilary Gardner
and Michael Forrester (eds.), Analysing Interactions in Childhood. Insights
from Conversation Analysis, 103–127. Oxford: Wiley-Blackwell.
Sorjonen, Marja-Leena
2001 Responding in Conversation. A Study of Response Particles in Finnish, Vol-
ume 70. Amsterdam/Philadelphia: John Benjamins.
Sorjonen, Marja-Leena and John Heritage (eds.)
in press At the Intersection of Turn and Sequence. Turn-Initial Particles across Lan-
guages. Amsterdam/Philadelphia: John Benjamins.
Streeck, Jürgen
1993 Gesture as communication I: Its coordination with gaze and speech. Commu-
nication Monographs 60(4): 275–299.
Streeck, Jürgen
1994 Gesture as communication II: The audience as co-author. Research on Lan-
guage and Social Interaction 27(3): 239–267.
Svennevig, Jan
2012 Interaction in workplace meetings. Discourse Studies 14(1): 3–10.
Szczepek Reed, Beatrice and Geoffrey Raymond (eds.)
2013 Units of Talk – Units of Action. Amsterdam/Philadelphia: John Benjamins.
Taleghani-Nikazm, Carmen
2002a A conversation analytical study of telephone conversation openings between
native and nonnative speakers. Journal of Pragmatics 34: 1807–1832.
Taleghani-Nikazm, Carmen
2002b The sequence organization of request granting-declining in everyday German
conversation. Paper presented at the International Conference on Conversation
Analysis, Copenhagen, May 17–21, 2002.
Taleghani-Nikazm, Carmen
2005 Contingent requests: Their sequential organization and turn shape. Research
on Language and Social Interaction 38(2): 155–177.
Taleghani-Nikazm, Carmen
2006 Request Sequences. The Intersection of Grammar, Interaction and Social Context.
Amsterdam/Philadelphia: John Benjamins.
Taleghani-Nikazm, Carmen and Andrea Golato
2016 Jaja in spoken German: Managing knowledge expectations. Die
Unterrichtspraxis 49(1): 80–96.
Taleghani-Nikazm, Carmen and Thorsten Huth
2010 L2 requests: Preference structure in talk-in-interaction. Multilingua 29: 185–
202.
Wagner, Johannes
2013 Conversation analysis and transcription and data. In: Carol Chapelle (ed.), The
Encyclopedia of Applied Linguistics, 1114–1121. Oxford: Wiley-Blackwell.
Wong, Jean
2002 ‘Applying’ conversation analysis in applied linguistics: Evaluating dialogue
in English as a second language textbooks. International Review of Applied
Linguistics 40(1): 37–60.
Yuan, Yi
2001 An inquiry into empirical pragmatics data-gathering methods: Written DCTs,
oral DCTs, field notes, and natural conversations. Journal of Pragmatics 33:
271–292.
16. Discourse analysis
Anita Fetzer

Abstract: This chapter examines methods of discourse analysis, considering in
particular the structuring of discourse with respect to the fundamental questions of
granularity and of the nature of the connectedness between the constitutive parts
of discourse and its delimiting frames of reference. It addresses core concepts,
such as micro, meso and macro units of investigation, their concatenation and
linearization, and their statuses as carriers of content, force and discursive glue.
It highlights the interdependence of the conceptualization of discourse units and
appropriate methods for discourse analysis, such as quantitative and qualitative,
and bottom-up and top-down approaches. The dynamics of discourse – as both
process and product – requires discourse units to be relational, relating its
constitutive units with language users, discourse coherence, discourse common ground
and context as well as with discourse-as-a-whole.

1. Introduction

Discourse is a multifaceted and multilayered, and generally also a multi-modal
construct, which has been analyzed not only in the arts and humanities, but also in
the social sciences, in artificial intelligence and in information technology, to name
but the most prominent paradigms. The research community is heterogeneous and
utilizes varying, but not mutually exclusive perspectives and methodologies, which
address discourse from quantitative and qualitative perspectives. An analysis of
discourse needs to address two fundamental issues: (1) what is discourse, or rather
which necessary conditions need to obtain for a “stretch of language (use)” to
count as discourse? And (2) the question of granularity: what is the micro unit of
investigation? Is there a macro unit of investigation and are there in-between units?
And which necessary conditions need to be fulfilled for a linguistic unit to count
as a discourse unit? While there is general agreement about a quantity-anchored
conception of discourse as “language patterns above the sentence” (Widdowson
2004: 3), the question of granularity as regards the basic unit of investigation of
the constitutive parts of discourse remains controversial and depends strongly on
the respective research paradigm, and on its goals and methodology. This holds for
the discourse unit as well as for its delimiting frame, which spans from some kind
of concatenated sequence of discourse units to discourse genre. Classical discourse
units are propositions and illocutionary acts in formal theories of discourse; clause,
sentence and utterance in linguistics-based discourse analysis and text linguistics; turn
and turn-constructional unit in conversation analysis and interactional linguistics;
and speech act, conversational contribution, discursive contribution, and pract and
pragmeme (Mey 2001) in discourse pragmatics, which will be elaborated on in
section 2.2.2. But it is not only the form of the discourse unit which contributes
to the complexity of discourse analysis. There are further complicating issues: on
the one hand, discourse refers to a theoretical construct and to its actual linguistic
realization in context, and on the other hand, discourse is both product, that is a
delimited and static unit of concatenated discourse units, and process, that is the
dynamic linearization and sequential organization of discourse units which constitute
discourse-as-a-whole. Because of its dual status as product and process, discourse
can be approached from top-down and bottom-up perspectives, considering
the structuring of discourse-as-a-whole starting from the whole product and thus
top-down, or from one discourse unit and thus bottom-up, considering its concatenation
and sequential organization in order to construct larger units and constitute
discourse-as-a-whole.

https://doi.org/10.1515/9783110424928-016
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 395–423.
Berlin/Boston: De Gruyter Mouton.

Approaching discourse analysis from a granularity-based perspective focus-
ing on form presupposes that discourse is structured, and that its structure is the
result of concatenated and linearized discourse units. It is that patterned concatenation
and linearization which constitutes discourse and which allows for its
delimitation and framing, distinguishing it from its embedding social context (cf.
Fetzer 2012). However, discourse is more than just some concatenated, linearized
and delimited unit: interlocutors produce discourse with a certain goal and com-
municative intention, and they address other interlocutors with that discourse in
order to make manifest their communicative intention and achieve their commu-
nicative goal(s). Discourse is used strategically in context, and the concatenation
and linearization of its constitutive parts are planned (more or less consciously)
and executed (more or less consciously) by interlocutors, who may realize their
discourse in spoken, written, written-to-be-spoken modes, or in other modalities.
The discourse-as-product outlook concentrates on discourse-as-a-whole, while the
discourse-as-process perspective accounts for the production and interpretation of
discourse units as parts of a linearized and sequentially organized whole. Here,
discourse is an online production and conceived of as co-constructed, as is the case
with spoken dyadic and multi-party discourse and its digitally produced counter-
parts; this will be elaborated on in section 2.
The goal of this chapter is to examine methods of discourse analysis, consid-
ering in particular the structuring of discourse. The focus is on qualitative meth-
odological approaches, in particular empirical frameworks. The following section
“Discourse: quantity meets quality” addresses the question of granularity and its
implications for the analysis of discourse from qualitative methodological perspec-
tives. It compares different approaches to discourse units and relates them to carri-
ers of discursive glue, that is coherence strands, discourse relations and discourse
connectives, which contribute to the participants’ construal of discourse coherence.
The section also addresses the dynamics of discourse from bottom-up and top-
down perspectives, paying particular attention to discursive online production and
discursive online interpretation. Section 3 “Discourse as communicative action”
addresses the question of whether discourse describes the world in terms of true or
false, or whether it constitutes communicative action, and whether discourse can
be assigned the status of a macro speech act concatenated of micro speech acts,
with the illocutionary speech act type of expositive as discursive joint and carrier
of discursive glue. A conclusion summarizes the results of the discussion and pro-
vides an outlook on further research.

2. Discourse: Quantity meets quality

Discourse analysis is fundamentally concerned with the investigation of the nature
of the connectedness between parts and wholes, and for this reason methods of
discourse analysis need to account for discourse as a relational construct, relating
its constitutive parts locally at the level of adjacent positioning and linearization,
relating them not-so-locally at the level of sequence or episode, and relating them
globally with regard to the nature of their connectedness with the discourse-as-a-whole,
i. e. discourse genre.¹ Discourse is thus both quantity and quality: quantity,
captured by the number of its constitutive parts, its discourse units, and quality,
captured by the pragmatics of discourse units in context, and by the nature of
their connectedness. That is why any analysis of discourse needs to address both
granularity and discursive glue, that is the linguistic material which contributes to
making discourse units cohere.
Linguistics-based methods of discourse analysis have focused on the struc-
ture of discourse and thus on granularity, for instance Segmented Discourse Rep-
resentation Theory (Asher and Lascarides 2003) and Rhetorical Structure Theory
(Mann and Thompson 1988), and also on discursive glue captured by discourse (or
coherence) relations. Frequently, they have considered idealized, prototypical sce-
narios (but see contributions to Gruber and Redeker 2014). Contrastive methods of
discourse analysis have concentrated on the linguistic representation of discourse
relations based on quantitative and qualitative investigations of parallel corpora,
generally comprising translations of literary or institutional texts. They allow
for the examination not only of language-preferential realizations of discourse
units and of some of their constitutive constructions across languages, but also of
differences between spoken and written modes, and between selected data sets of
particular discourse genres (e. g. Fetzer and Speyer 2012; Speyer and Fetzer 2014),
accounting for local constraints, such as adjacency, and more global constraints,
such as the delimiting frame of discourse genre.

¹ In this chapter discourse genre is used as an umbrella term for delimiting frames of
reference, for instance activity type (Levinson 1979), communicative genre (Sarangi
2000), local and global communicative project (Linell 1998), to name but the most
prominent ones; it is functionally synonymous with discourse-as-a-whole.

The following section examines the implications of granularity for the analysis
of discourse units and their size, considering their embeddedness in discourse as
well as possible forms and boundaries. After that, the qualitative aspects of dis-
course units surfacing in the structuring of discourse are going to be analyzed.

2.1. Quantity and granularity

Functional and formal approaches to the analysis of discourse have utilized vari-
ous discourse units, spanning from clause, sentence and utterance to proposition,
illocutionary act, conversational contribution and turn. However, their application
within the different research paradigms has not always been consistent. In linguis-
tics-based analyses, discourse units are generally seen as functionally equivalent
to grammar-based terms of sentence or clause, and sometimes discourse units are
referred to with the usage-based term of utterance, as is the case in text linguistics
(e. g. De Beaugrande and Dressler 1981; Fabricius-Hansen and Ramm 2008) and
in the functional-grammar-anchored paradigms (e. g. Dik 1997; Givón 1993; Hall-
iday 1994; Heine, Kaltenböck, Kuteva and Long 2013; Hengeveld and Mackenzie
2008). The formal paradigm utilizes proposition and/or illocutionary act, but also
utterance in their analyses of both fabricated and non-fabricated data. The func-
tional paradigm employs speech act, turn, conversational contribution, discursive
contribution, pract and pragmeme, but also utterance. Utterance is thus the most
frequent term, but this does not mean that the different paradigms share a common
definition of utterance: the term has been used to refer to written as well as
spoken discourse, although the grammars of the two modes differ, as has been
proposed by Biber (1988), Chafe (1994) or Biber et al. (1999), for instance.
Ethnomethodological approaches and interactional linguistics base their investigations
on the discourse unit of turn, which may be segmented into the constitutive part of
turn-constructional unit (e. g. Schegloff 2007); sometimes conversation analysis
and interactional linguistics also use utterance, clause or sentence in their analyses
(cf. also Golato and Golato, this volume).
The question of granularity regarding the minimal unit of investigation has
been addressed comprehensively. However, the analysis of discourse also needs to
account for its status as a parts-whole configuration, in which the whole is more
than just a somewhat delimited sequence of minimal discourse units, which is
implicit in the rather general definition of discourse as “language patterns above
the sentence” referred to in the introduction. Widdowson’s very general definition
has been taken up by a number of researchers, supporting its line of argumentation,
but also qualifying it. Fabricius-Hansen and Ramm (2008) address the question of
granularity explicitly for their delimiting frame of text. By accommodating a larger
discourse unit, they refine the definition of the discourse unit “sentence”:
i. A text [original emphasis] consists of a finite number of sentences in succes-
sion, with one-sentence texts as a marked category.
ii. A sequence of syntactically independent clauses separated by commas is not
a sentence sequence but constitutes a single (complex) sentence, if properly
demarcated.
iii. If clause combining [original emphasis] […], or clause linkage [original
emphasis] […], is to be understood as connecting clauses rather than sentences
[…] it must be confined to the sentence level. That is, simple juxtaposition of
syntactically independent clauses separated by comma, without overt coordi-
nation, represents a special case of (paratactic) clause linkage. Corresponding
full-stop sentences, on the other hand, are, strictly speaking, not related by any
kind of clause combining; they simply succeed each other (Fabricius-Hansen
and Ramm 2008: 4–5).
It is not only the question of granularity which is relevant to methods of discourse
analysis and thus to a felicitous examination of discourse, but also the hierarchi-
cal and non-hierarchical structuring of discourse, as is put forward by Berens,
Fabricius-Hansen and Solfjeld (2012: 198): “It is generally acknowledged that
texts are (more or less) hierarchically structured by way of the discourse relations
[original emphasis] holding between neighboring discourse units, that is sentences
or sentence sequences.” Discourse relations do not only hold between neighboring
discourse units, but may also be positioned non-adjacently and span across dis-
course units, as has been shown by Fetzer and Speyer (2012) and Speyer and Fetzer
(2014). The argument that texts are hierarchically structured leads Berens, Fabri-
cius-Hansen and Solfjeld to the conclusion that discourse structure may be concep-
tualized analogously to syntactic structure: “we assume that syntax is a means of
building not only semantic units but also discourse units – and that the latter need
not be semantic units at the outset” (Berens, Fabricius-Hansen and Solfjeld 2012:
200). Arguing for discourse syntax allows them to offer a framework which may
capture the structuring and linearization of discourse.
The analysis of discourse thus needs to account for both quantity and quality. So
far, the sentence has counted as a prime candidate for the discourse unit, as put for-
ward by Widdowson (2004) and Fabricius-Hansen and Ramm (2008), for instance.
In functional grammars of English, however, the sentence is not seen as a gram-
matical unit but rather as an information-based, orthographic unit (Givón 1993;
Halliday 1994). To account for the discursive nature of language use, Dik’s func-
tional discourse grammar and Halliday’s systemic functional grammar differentiate
between “core-clausal domains” and “extra-clausal domains”. The core-clausal
domains contain experiential and ideational (or semantic) meaning in systemic
functional grammar, and topic and eventualities in functional discourse grammar,
while the extra-clausal domains contain textual and interpersonal themes (or: dis-
course-pragmatic meaning) in systemic functional grammar, and extra-clausal con-
stituents in functional discourse grammar. Extra-clausal constituents comprise par-
entheticals, adverbials of time and place or discourse connectives, for instance, and
their discursive functions are interaction management, discourse organization and
discourse execution, attitude specification, and formulation of content, as well as
the metacommunicative function of commenting on clause content. Extra-clausal
constituents are defined relative to the clause: they may be absolute (or free-stand-
ing), pre-clausal constituents, that is interpersonal and textual themes realized in
the theme zone as well as non-congruently configurated theme zones, post-clausal
constituents, viz. tails and tags, and mobile clause-internal constituents, such as
parentheticals. Because of their distribution and form, extra-clausal constituents
fulfill an important function in boundary marking, which connects them closely
with the theme zone in systemic functional grammar. Moreover, their discursive
function makes them a prime candidate for encoding and signaling discourse-prag-
matic meaning regarding the nature of the connectedness between discourse units,
including attitudinal specification.
The functional paradigm thus emphasizes the impact of context on the struc-
turing and linguistic realization of discourse. Widdowson himself qualifies his
quantitative definition discussed above, pointing out that the quantity-based defi-
nition “would seem to imply that discourse is sentence writ large: quantitatively
different but qualitatively the same phenomenon. It would follow, too, of course,
that you cannot have discourse below [original emphasis] the sentence” (Widdow-
son 2004: 3). Widdowson draws our attention to yet another fallacy in the purely
quantity-based definition: if “the difference between sentence and discourse is not
a matter of kind but only of degree, then they are presumably assumed to signal the
same kind of meaning. If sentence meaning is intrinsically encoded, that is to say,
a semantic property of the language itself, then so is discourse meaning” (2004: 3).
Functional-grammar-based analyses, and ethnographic and ethnomethodologi-
cal analyses also address granularity above the basic (or micro) discourse unit and
postulate in-between units, for instance sequences and episodes. In spite of these
divergent, but not necessarily mutually exclusive conceptualizations of discourse
units, all approaches share – more or less explicitly – the premise that discourse
is a parts-whole configuration in which the whole is more than the sum of its
constitutive parts (cf. Fetzer 2013), thus referring to the structuring of discourse
as regards discursive form captured by sequentiality and linearization, discursive
content captured by discourse pragmatics and discourse semantics, and discursive
glue captured by the nature of the connectedness between discursive forms and
their impact on discourse-as-a-whole.
Text linguistics (e. g. De Beaugrande and Dressler 1981) also addresses the
question of granularity. The syntactic unit of sentence counts as its micro discourse
unit and the macro unit is a text-type, which is classified according to discourse
domains and discourse functions. In functional discourse grammar (e. g. Givón
1993; Halliday 1994; Martin and Rose 2008) the syntactic unit of clause is the
micro unit of investigation and discourse is delimited and framed by episode and
by larger-scale genre, for instance. Discourse semantics considers the semantic
unit of proposition as its micro unit of analysis, while a concatenated sequence
of propositions is seen as a delimiting frame. More dynamic models also inte-
grate illocutionary force (e. g. Asher and Lascarides 2003; Moeschler 2002; Roulet
1991, 2006) and use speech act (or illocutionary act), proposition and utterance as
their unit of analysis as well as larger units composed of concatenated micro units
as delimiting frames. Ethnomethodological conversation analysis uses the unit of
turn, which is composed of smaller turn-constructional units, and the larger-scale
unit of sequence. Usage-based frameworks employ the unit of utterance and gen-
erally do not explicitly discuss possible delimiting frames. Discourse pragmat-
ics utilizes various units, such as utterance, discursive contribution or move. To
account for the duality of form and function, methods in pragmatic theories of dis-
course base their analyses on discourse units as carriers of content and force while
at the same time allowing for the accommodation of the dynamics of discourse
and thus for varying quantities, i. e. discourse connective, discursive contribution,
paragraph(s) or sequence(s). What is more, their discourse unit not only accom-
modates the duality of form and function, but also their instantiations in discursive
and sociocultural contexts.
From a more holistic perspective, granularity goes beyond discursive form and
discursive function, expanding the concept to “the granularity of language user”,
that is, to the question of whether speaker/producer and addressee/interpreter each constitute one
discursive unit, or whether the dyad is the micro unit of the participation framework. If discourse is
conceived of as co-constructed, then it is not individual speakers and individual
addressees who produce and interpret discourse units in individual acts of produc-
tion and interpretation, but rather the dyad who jointly constructs the discourse unit
and negotiates its communicative value. This has been put forward by Arundale
and Good (2002), for instance. They take the dyad as the micro unit of analysis for
studying talk-in-interaction and base their argument on the premise that using lan-
guage is fundamentally a conjoint activity, involving the continuing co-construct-
ing of a stream of talk and its local and not-so-local meanings. They demonstrate
that participants’ cognitive processes involve both anticipatory planning and, at the
same time, retrospective interpretation of what has just happened. Fetzer (2004)
takes the argument further and differentiates between individual presuppositions
and collective co-suppositions as well as between individual dialogue common
ground and collective dialogue common ground (cf. section 2.2.1). A dyad-based
analysis of discourse is also found in the Birmingham-type of discourse analysis
with exchange containing an initiating move and a response as its basic unit of
investigation (cf. Sinclair and Coulthard 1975), which in the context of classroom
talk is expanded to a triadic sequence of initiating move, responsive move and
follow-up move. Follow-ups have also been reconceptualized beyond strict structural
adjacency, spanning across a triadic sequence to account
for the sequential organization of discourse and across discourses (cf. Weizman
and Fetzer 2015).
In the previous sections it has already surfaced that there is no general agree-
ment in the heterogeneous discourse community about a definition of discourse,
except for the quantity-anchored “language patterns above the sentence”. This is
also true for the question of granularity, in particular for the basic unit of investiga-
tion, which may differ from paradigm to paradigm – in spite of the fact that the dis-
course unit and how it is conceived of, for instance as a carrier of content, a carrier
of force, a carrier of metacommunicative meaning, a carrier of content and force,
or a carrier of content, force and metacommunicative meaning, is indispensable to
discourse analysis in general and to the analysis of the structuring of discourse in
particular. This also holds for the production and reception framework.

2.2. The structuring of discourse

The structuring of discourse is based on the premise that discourse is a parts-
whole configuration with discourse-specific “patterns above the sentence”. The
concatenation and linearization of discourse units is captured by discourse syntax,
and the semantics and pragmatics of the connectedness of the units is accounted
for in discourse semantics and discourse pragmatics. Quantitatively oriented stud-
ies tend to focus on the linearization of discourse units as well as on the quality
of their connectedness, while qualitatively oriented discourse studies share the
assumption that discourse as a linearized whole comes in with the presumption of
being coherent (cf. Bublitz, Lenk and Ventola 1999; Gernsbacher and Givón 1995;
Chafe 1994). In qualitative studies it is not “language patterns above the sentence”
and their semantic and pragmatic well-formedness, which make them cohere, but
rather interlocutors who negotiate the meaning of discourse units and of discourse-
as-a-whole, thereby construing and negotiating discourse coherence. Hence, dis-
course coherence does not lie in the discourse itself but rather in the interlocutors’
minds and therefore is a socio-cognitive construct par excellence. This view is also
implicit in cohesion-based analyses of texture (e. g. Hasan and Halliday 1987), in
which discourse coherence is connected intrinsically with cohesion and cohesive
ties, that is linguistic items which signal, if not encode, the nature of the connect-
edness between the constitutive parts of discourse and discourse-as-a-whole.
The following sections examine the structuring of discourse with respect to
the question of how discursive glue is made manifest in the discourse unit, outside
the discourse unit and above the discourse unit. They discuss the connectedness
between discourse unit and sequential organization considering in particular meth-
odological implications for discourse analysis.

2.2.1. Discourse unit and discursive glue

In a discursive frame of reference, the question of granularity can be addressed
from top-down and bottom-up perspectives. As for the former, the discourse-as-a-
whole is considered as the macro unit, which is then segmented into smaller micro
units, which may be further segmented into yet smaller, minimal units. As for the
latter, a unit is generally adopted from another research paradigm, as is the case
with sentence and clause from different models of grammar, for instance clause
from discourse grammar, such as (systemic) functional grammar (Givón 1993;
Halliday 1994) or sentence from sentence-based models of grammar. Adopting
a discourse-dynamic perspective, it is not only the question of local granularity,
which needs to be considered, but also the concatenation of minimal units and
micro units to form larger constitutive units of discourse. To account for that chal-
lenge, the question of granularity needs to be addressed together with the question
of discursive glue, that is cohesion and coherence.
Discourse units are relational from both discourse-structuring and dis-
course-meaning perspectives. Adapting the conversation-analytic concept of dou-
bly contextual (Heritage 1984) to discourse analysis, adjacently positioned dis-
course units are doubly contextual in so far as they provide linguistic context for
the production and interpretation of neighboring discourse units. Linguistic context
is functionally equivalent to the linguistic realization of interlocutors’ communica-
tive intentions and therefore also contains references to their cognitive contexts,
i. e. mental representations, common ground and discourse common ground,2
and to the social and sociocultural contexts of discourse indexed in the discourse
and imported into the discourse (cf. Fetzer 2011 for the deictic forms ‘here’ and
‘there’). The effects of discourse units thus are considered explicitly with respect
to cognitive effects, i. e. recipient’s recognition of meaning and force, the construal
of discourse common ground and intersubjective reality, and with respect to social
effects, i. e. discourse expectations, and rights and obligations of particular dis-
course units and their felicity conditions. However, it is not only discourse units
that are situated in context, but also the context itself situates and conditions discourse
units. This is particularly true for discursively implicated meaning, which is
what the context makes it to be. Conversely, a discourse unit may create the context
for which it is appropriate (cf. Mey 2011), as is also argued for by Levinson (1983):
What makes some utterances after a question constitute an answer is not only the nature
of the utterance itself but also the fact that it occurs after a question with a particular
content – ‘answerhood’ is a complex property composed of sequential location and
topical coherence across two utterances, amongst other things; significantly there is no
proposed illocutionary force of answering (Levinson 1983: 293).

2
Discourse common ground is an interlocutor-, context- and genre-dependent variant
of common ground. It is anchored in a network structure and connected with other
types of discourse common ground. The network structure is functionally equivalent
to Background (Searle 2010). Discourse common ground is composed of mental representations,
propositions, and factual and contextual assumptions, which may vary in
strength. It undergoes continuous updating and continuous re-organisation as assumptions
are read, written and deleted, and contextual implications are raised in strength,
lowered in strength or erased (cf. Fetzer 2004, 2007). Changes resulting from the
administration of emergent discourse common ground may result in changes of other,
higher-level discourse common grounds.

The linearization of discourse is thus a multilayered, complex endeavor. It is based
on communicative intentionality, on the strategic use of language constrained by
the linguistic system, and on interlocutors acting in accordance – they may locally
also act in dis-accordance – with the contextual constraints and requirements of
discourse genre. The sequential organization and linearization of discourse is not
only a linguistic-surface phenomenon, but rather depends on the sociocognitive
construct of discourse common ground, which is updated continuously. Discourse
common ground is – like discourse – a dynamic construct, which is negotiated and
administered as the discourse proceeds, i. e. confirmed, modified or restructured,
by updating already stored information and storing new information. This may
require the restructuring of the interlocutors’ individual and collective discourse
common grounds. Individual discourse common ground administers an individu-
al’s discourse common ground, while collective discourse common ground admin-
isters negotiated and ratified discourse common grounds of the set of interlocutors;
both need to overlap, but may diverge to varying degrees (Fetzer 2007).
The structuring and linearization of discourse is connected intrinsically with
the question of granularity, i. e. size and conceptualization of discourse units, and
with the semantics and pragmatics of their connectedness. Minimal discourse units
may be realized as comment clauses, discourse connectives or elliptical construc-
tions, micro discourse units may be realized as clauses, utterances or discursive
contributions, and in-between-units, so-called meso discourse units, may be real-
ized as clause complexes, paragraphs, sequences, episodes, and also larger units.
The macro discourse unit is the discourse genre. Irrespective of their size, discourse
units carry discursive glue: they may carry both content and glue, or they
may carry discursive glue only. As a consequence, discourse units are relational
and doubly contextual, as is reflected in their adjacent positioning, that is structural
adjacency, in their semantic relations, that is adjacency relation, and in adjacency
expectations, that is pragmatic adjacency, as is captured by the discursive con-
straint of dovetailed – or dovetailedness (Fetzer 2004) – put forward in “Logic and
conversation” (Grice 1975).
Dovetailedness is a discursive-glue concept par excellence, which constrains
the production and interpretation of discourse units. While adjacency relates struc-
tural positioning, semantic relation and pragmatic expectation, dovetailedness
specifies the nature of the relatedness. Grice specifies the constraint for the unit of
conversational contribution as “such as is required, at the stage at which it occurs,
by the accepted purpose or direction of the talk exchange [the linearization of
discourse, A.F.] in which you are engaged” (Grice 1975: 45), implying that conver-
sational contributions are linked by one or more common goals manifest in prior
and succeeding contributions. In discourse, conversational contributions have the
status of a discursive contribution, which may be composed of smaller discourse
units, such as minimal discourse units and micro discourse units, or a combination
of both. The discursive constraint of dovetailedness, it has been argued (Fetzer
2014), holds for minimal discourse units, micro discourse units, for more complex
discourse units, such as sequences, and for discourse-genre-as-a-whole.
Dovetailedness is both semantic and pragmatic. It is implicit in the conver-
sation-analytic concept of conditional relevance and in the two-part sequence of
adjacency pair, which, following Mey, “is a case of coherent sequencing, but not
all sequencing needs to be defined strictly in terms of adjacency” (Mey 2001: 249).
Dovetailedness is fundamental to the construal of discourse coherence, which does
not mean that it is meaning-based only. Dovetailedness refers to two sides of a
coin, metaphorically speaking. On the one hand, it specifies structural adjacency
by adding precision thereby making adjacency relation and adjacency expectation
more precise, namely “such as is required”. Dovetailedness is also intrinsic in the
conversation-analytic concept of adjacency pairs, that is patterned co-occurrences
of two communicative actions produced by different speakers with a first part and
preferred/dispreferred second pair-parts, such as greeting and greeting/non-greet-
ing; request and compliance/non-compliance; offer (or invitation) and accept-
ance/refusal; assessment and agreement/disagreement; and question and expected
answer/unexpected answer or non-answer (cf. Levinson 1983: 336). The second
parts of the adjacency pairs just listed are not of equal standing, as one of them is
preferred and the other is dispreferred, as has been examined in the framework of
preference organization (cf. Pomerantz 1984). The classification as preferred and
dispreferred second is not based on the interlocutors’ psychological dispositions,
but rather on structural and distributional features and hence closely connected
with the linguistic concept of markedness (cf. Levinson 1983: 307).
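By way of illustration only, the adjacency pairs just listed can be written down as a small lookup table. The following Python sketch is not part of any conversation-analytic method; the names ADJACENCY_PAIRS and classify_second are hypothetical, and the labels are simply those used in the list above. It captures nothing more than the structural pairing of a first part with its preferred and dispreferred seconds, and it presupposes that the communicative actions have already been identified by the analyst.

```python
# Hypothetical sketch: each first pair-part is mapped to its
# (preferred, dispreferred) second pair-parts, using the labels from the text.
ADJACENCY_PAIRS = {
    "greeting": ("greeting", "non-greeting"),
    "request": ("compliance", "non-compliance"),
    "offer": ("acceptance", "refusal"),
    "assessment": ("agreement", "disagreement"),
    "question": ("expected answer", "unexpected answer or non-answer"),
}


def classify_second(first_part, second_part):
    """Label a second pair-part as preferred, dispreferred or unclassified."""
    preferred, dispreferred = ADJACENCY_PAIRS[first_part]
    if second_part == preferred:
        return "preferred"
    if second_part == dispreferred:
        return "dispreferred"
    # Anything else is left open, e.g. the first unit of an insertion sequence.
    return "unclassified"
```

A call such as classify_second("offer", "refusal") returns "dispreferred". The label is assigned on structural grounds alone, in line with the point that the classification is not based on the interlocutors’ psychological dispositions.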
Dovetailedness goes beyond structure-based positioning. It is a pragmatic con-
cept, which may be encoded in discourse and thus made explicit, or it may be
assigned a presuppositional status and thus would need to be inferred. It may have
a narrow scope and be assigned the status of a local constraint, as is the case with
adjacency pairs and their preferred and dispreferred seconds, or it may have a
wider scope and be assigned the status of a less-local constraint, as is the case with
insertion sequences and topical digression, and pre- and post-sequences in con-
versation. Closely related to dovetailedness is the cognitive concept of adjacency
expectation. It is the foundation against which two adjacent discourse units may be
classified as a particular adjacency pair with preferred and dispreferred seconds, or
against which the second discourse unit may be assigned the status of the first unit
of an insertion sequence. For instance, in the discourse genre of interview, commu-
nicative actions performed by the interviewer tend to count as requests for infor-
mation generally formatted as questions and communicative actions performed by
the interviewee tend to count as responses to the request for information. Should
an interviewee opt for the communicative action of requesting information, which
is generally formatted as a question, he or she needs to refer to the communica-
tive-action format of requesting information in an explicit manner, e. g. by saying
“may I ask you a question” (cf. Fetzer 2000). In discourse, dovetailedness may be
manifest in dovetailedness relation and dovetailedness expectation, as has been
shown above. However, it is also possible that structural dovetailedness conflates
neither with dovetailedness relation nor with dovetailedness expectation. In that
case, a conversational implicature is triggered and the nature of the connectedness
between the adjacently positioned discourse units is inferred.
Discourse relations (or coherence relations), which hold between two discourse
units, are a particular type of discursive glue. They have been
defined in the discourse semantic framework of Segmented Discourse Representation
Theory (Asher and Lascarides 2003), which analyses the logical relation
between two discourse segments. A discourse segment is one particular type of discourse
unit, i. e. a complex linguistic unit with propositional content and illocutionary
force of its own. Any discourse segment usually stands in a logical relation to at
least one other preceding segment (or rather: the addressee construes a logical
relation between them, in order to vouchsafe coherence). The propositions p1 and
p2 are in the discourse relation R if the inferences the addressee makes and the
logical connection s/he draws between p1 and p2 are in accordance with the ones
defined for R. As discourse is not a purely linear phenomenon, but is hierarchically
structured, two kinds of discourse relations are generally distinguished: coordinat-
ing relations that keep the discourse on the same level, and subordinating relations
that introduce a lower level in the discourse hierarchy.
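The difference between coordinating and subordinating relations can be made concrete with a toy computation. The Python sketch below is an assumption-laden illustration, not an implementation of Segmented Discourse Representation Theory: the names Link and segment_levels are hypothetical, each segment is assumed to attach to exactly one preceding segment, and Narration and Elaboration are used merely as assumed examples of a coordinating and a subordinating relation respectively.

```python
from dataclasses import dataclass

# Assumed example relations; not an inventory taken from the text.
COORDINATING = {"Narration"}
SUBORDINATING = {"Elaboration"}


@dataclass
class Link:
    source: int    # index of the preceding segment that is attached to
    target: int    # index of the segment being attached
    relation: str  # name of the discourse relation


def segment_levels(n_segments, links):
    """Return the hierarchical level of each segment (segment 0 is level 0)."""
    levels = [0] * n_segments
    # Process attachments in textual order so that a source level is known
    # before it is used (sources are assumed to precede their targets).
    for link in sorted(links, key=lambda l: l.target):
        if link.relation in SUBORDINATING:
            levels[link.target] = levels[link.source] + 1  # one level lower
        elif link.relation in COORDINATING:
            levels[link.target] = levels[link.source]      # same level
        else:
            raise ValueError(f"unknown relation: {link.relation}")
    return levels


example_links = [
    Link(0, 1, "Elaboration"),  # segment 1 elaborates on segment 0
    Link(1, 2, "Narration"),    # segment 2 continues at the level of segment 1
    Link(0, 3, "Narration"),    # segment 3 returns to the level of segment 0
]
```

For the four segments in example_links, segment_levels(4, example_links) yields [0, 1, 1, 0]: the subordinating relation opens a lower level, the first coordinating relation stays on it, and the last one returns to the top level.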
Discursive glue is also found in functional-grammar anchored coherence
strands, which are made manifest through (a) topic continuity, (b) tense and aspec-
tual coherence (including modality), (c) lexical coherence, and (d) default gram-
matical word order vs. pragmatic word order:
These strands are clearly the most concrete, salient, observable links between clauses
in coherent discourse. But the phenomenon of discourse coherence is richer yet. First,
coherence strands may connect – or ground [original emphasis] – the clause either to
the current text, to the current speech situation, or to generic-lexical knowledge. Sec-
ond, coherence strands may extend either locally, between adjacent clauses, or globally,
across larger text-structures. Third, coherence strands may be either semantic or prag-
matic in nature. Finally, the strands may ground the clause in either an anaphoric or a
cataphoric direction. (Givón 1993: 287, vol. 2)

A systematic analysis of coherence strands may not only explain higher or lower
degrees of glueyness (cf. Maier, Hofmockel and Fetzer 2016) and thus of dis-
course coherence, but also predict syntactic formatting: “The more thematically
connected a conjoined clause is with an adjacent clause – the more strands of
thematic coherence it shares with that adjacent clause – the more likely it is to
appear reduced, less finite, syntactically integrated with that other clause” (Givón
1993: 318, vol. 2), a claim which has been substantiated in grammaticalization and
pragmaticalization research (e. g. Aijmer 1997; Traugott 1988).

2.2.2. Discourse unit and sequentiality

The structuring of a particular type of discourse, that is conversation, has been
examined comprehensively in ethnomethodological conversation analysis and in
interactional sociolinguistics, considering the inherently dynamic nature of dis-
course. Both subscribe to the premise of indexicality of communicative action and
thus of discourse units – or turns and turn-constructional units in ethnomethodological
terms (cf. Golato and Golato, this volume) – and are, for this reason, appropriate
frames of reference for examining the connectedness between discourse
units and discourse-as-a-whole: “Sequential organization refers to that property
of interaction by virtue of which what is said at any time sets up expectations
about what is to follow either immediately afterwards or later in the interaction”
(Gumperz 1992: 304). Sequential organization may refer to local adjacency, that
is adjacently positioned units, which may be typified as adjacency pairs, as the
minimal unit for sequence construction, it may refer to expanded minimal units
with pre-expansions, which may be more and less conventionalized as references
to particular felicity conditions for speech acts (cf. Levinson 1983: 356–364), it
may refer to expanded sequences with insertions and post-expansion, and it may
refer to sequences of sequences including retro-sequences (cf. also section 3.2).
Irrespective of their quantity, sequences – be they minimal, expanded or sequences
of sequences – are related semantically via discourse relations holding locally and
via discourse topics spanning across larger units, and they are framed and delimited
by discourse genre with its genre-specific opening, topical and closing sequences,
which delimit discourse-as-a-whole, but may also delimit sequences within the
discourse genre.
Adopting a top-down perspective on the analysis of discourse requires its con-
stitutive units to be conceptualized as relational units, which are decomposed into
smaller units, which may again be decomposed into even smaller units, and it is
these smaller units, which constitute discourse-as-a-whole. The relational nature
of discourse units also holds for a bottom-up perspective in which (small) dis-
course units are connected to constitute larger units, which constitute discourse-
as-a-whole. The dynamic and relational outlook on discourse and on its sequential
organization is based on the premise that discourse units carry force, content and
metadiscursive meaning, with varying degrees of explicitness. What is more, discourse
units need to be indexical, expressing exophoric and endophoric reference.
Because of their relational conceptualization, discourse units are doubly contex-
tual. By contextualizing prior discourse units they pave the ground for the produc-
tion and interpretation of upcoming discourse units indicating how the discourse is
to proceed, i. e. whether there is some change in the intended direction, or whether
there is no intended change and the discourse is to proceed as planned. As for the
former, changes may be signaled with contrastive discourse connectives, they may
be encoded in contrastive discourse relations or they may be communicated with
an embedded sequence with one or more contrastive discourse topics. As for the
latter, non-changes may be signaled with continuative discourse connectives, they
may be encoded in continuative and elaborative discourse relations or they may
be communicated with an embedded sequence with one or more continuative or
elaborative discourse topics, for instance.
Granularity does not only refer to micro, but also to meso and macro units of
investigation (cf. Fetzer 2004, 2013). A relational conceptualization of a discourse
unit allows one to account for the extension of frame from minimal units, such as
discourse connectives, which carry force and metadiscursive meaning, to micro
units, such as discursive contributions, which carry force, content and metadiscur-
sive meaning, to meso, i. e. sequences or episodes, to macro, i. e. discourse-genre-
as-a-whole. A very broad notion of discourse, as is reflected, for instance, in the
discourse on context or the discourse on political correctness, could be captured
by a unit beyond macro.
Micro, meso and macro discourse units come in with the presumption of
being coherent as separate units as well as concatenated and linearized sequences.
However, it is not the language patterns above the sentence and their semantic
well-formedness which make them cohere but rather recipients who construe
discourse coherence both locally and globally. Discourse coherence is connected
intrinsically with cohesion and cohesive ties, viz. linguistic items which express
the nature of the connectedness between clauses and sentences, sentences and par-
agraphs, and paragraphs and discourse-as-a-whole (Hasan and Halliday 1987; Hal-
liday 1994). It is also connected intrinsically with the sociocognitive constructs of
discourse relation, coherence strand and discourse topic, and with the discursive
constraints of adjacency and dovetailedness. The delimiting frame of discourse
genre is a kind of blueprint, which constrains the production and interpretation of
discourse-as-a-whole, as pointed out by Thibault (2003):
Rather, genres are types. But they are types in a rather peculiar way. Genres do not
specify the lexicogrammatical resources of word, phrase, clause, and so on. Instead,
they specify the typical [original emphasis] ways in which these are combined and
deployed so as to enact the typical semiotic action formations of a given community
(Thibault 2003: 44).

Connected intrinsically with the “typical ways” of doing things with words in a
discourse genre – or in an activity type in Levinson’s parlance – are inferential
schemata:
[…] there is another important and related fact, in many ways the mirror image of the
constraints on contributions, namely the fact that for each and every clearly demarcated
activity there is a set of inferential schemata [original emphasis]. These schemata are
tied to (derived from, if one likes) the structural properties of the activity in question.
(Levinson 1979: 370).

The communicative value of discourse units is expressed in these “typical ways”
of doing things with words in discourse genres, and the corresponding “inferen-
tial schemata” feed on the discursive constraints discussed above. The constraint
of “typical ways” and their corresponding “inferential schemata” is based on the
differentiation between type and token, which is also found in Mey’s distinction
between the pragmatic units of pragmeme and its realization in discourse as pract.
A pragmeme is a “general situational prototype capable of being executed in a situation”
(Mey 2001: 221). It consists of an activity part and a textual part and, when
instantiated in a particular situated context, constitutes an “individuated, individual
pragmatic act” or “pract” (Mey 2001: 221). A pragmeme and its realization in
discourse as pract capture both a textual part (or “content”) and an activity part
(or “force”). To accommodate the constraints and requirements of discourse prag-
matics, the textual part additionally needs to contain so-called textual meaning, as
captured by one of the three metafunctions of Systemic Functional Grammar, the
textual metafunction, which administers cohesion as well as thematic progression,
i. e. the structured interplay of theme and rheme. Furthermore, pragmemes need
to be relational by definition, relating adjacently positioned pragmemes, relating
pragmemes with other pragmemes and with discourse-as-a-whole thus making
manifest the interlocutors’ communicative intentions in discourse.

3. Discourse as communicative action

It is impossible to conceptualize communicative action without the explicit accommodation
of context, in which it is embedded and to which it refers implicitly
and explicitly, and it seems impossible to conceptualize communicative action
in context without the explicit accommodation of discourse, which constrains its
production and interpretation, and delimits context. Discourse is composed of lin-
guistic context (including other semiotic codes), and it needs cognitive context to
account for discourse production, discourse processing, grounding and discourse
coherence. Discourse is embedded in sociocultural context, which is itself embedded in
social context and is seen as a particularization of social context in accordance with
sociocultural values (cf. Fetzer 2004). Against this background, the examination
of communicative action as discourse-dependent meaning in context seems
more appropriate, as is reflected in discourse as a higher-level communicative act,
or as pragmatic discourse.
Discourse – like context – has become more and more relevant to the analysis
of meaning, and like context the concept is used in diverging frameworks referring
to different theoretical constructs. Not only has the question whether discourse and
discourse analysis should be based on semantics or pragmatics been controversial
(e. g. Fetzer 2013), but also the questions of granularity as regards the basic unit of
investigation, that is the discourse unit on the one hand, and the delimiting frame
on the other, and whether they are discrete or fuzzy, and monadic or dyadic (or
collective). To accommodate both quantity and quality, a felicitous analysis of
discourse needs to go beyond the code model of language and accommodate the
premise that the whole, that is discourse, is more than the sum of its constitutive
parts. This also holds for the meaning of the whole, which is more than the sum
of the meanings of its separate parts. Against this background, discourse analysis
“has to do not with what texts mean, but with what might be meant by them, and
what they are taken to mean. In this view there is no ‘understanding’ of texts as
a semantic process, separate from, and prior to, a pragmatic ‘evaluation’, which
brings context into play” (Widdowson 2004: 35).
Few analyses have explicitly addressed the important methodological consequences
of the premise that discourse is pragmatic and therefore concerned with
communicative action in context. If discourse is pragmatic, it needs to be analyzed
within pragmatic theory and its fundamental premises of rationality, intentionality
of communicative action and cooperation, which does not only hold for discourse-
as-a-whole, but also for its constitutive parts. This has been done by the Birming-
ham School, for instance, Sinclair and Coulthard (1975), Montgomery (1977) or
Stubbs (1983), who have investigated the question of how illocutionary acts are
sequenced in connected speech. They conceive of discourse as consisting of a
series of exchanges between interlocutors. The exchange minimally comprises two
sequenced moves, an initiation and a response. The initiation-response-analysis is
further elaborated on by Edmondson and House (1981), who state that
[t]he underlying structure of a conversational episode is an interactional structure – i. e.
it is the sequential relevance of interactional acts which gives coherence to a conversa-
tion, and this is reflected in the textual cohesion of the substance of the conversation –
i. e. what is said (Edmondson and House 1981: 80).

Speech act sequences have been conceived of as rule-governed units, which are
well-formed and coherent, or ill-formed and incoherent (e. g. Labov and Fanshel
1977; Edmondson 1981). Tsui (1994) approaches responding acts from the notion
of “preferred” versus “dispreferred” second pair parts of adjacent turns. She rein-
terprets both as being “[…] two types of responding acts. One which responds
positively and the other negatively” (1994: 58). This allows her to combine conversation-analytic
findings with the I-R-F (initiate-response-feedback) model of
Sinclair and Coulthard (1975). Sbisà (2002, and this volume) has analyzed speech
act sequences within narrative semiotics (Greimas 1983), concentrating on the
conventional effects of speech acts, such as assignments of obligations or entitle-
ments, which obtain only on the basis of the interlocutors’ implicit or by-default
agreement, which involves the recipient’s decision to take the speech act in a certain
way. The Geneva model of discourse (Moeschler 2002; Roulet 1991) addresses
the connectedness between type and token, and points out that there is no straightforward
mapping between speech act and discourse unit (or text segment/constituent).
Its proponents conclude that, because of the complex and multifaceted
relationship between discursive parts and discourse-as-a-whole, the connecting
parts or discursive joints – or text relation markers (Roulet 2006) – require par-
ticular attention. They may indicate illocutionary relations between text segments
(or text constituents) and information stored in the discourse memory, which has
been referred to as discourse common ground. Text relation markers are markers
of illocutionary relations, and are functionally equivalent to illocutionary force
indicating devices, providing instructions for the interlocutors to facilitate access
to the relevant information.
Speech act theory has not yet been adapted comprehensively to the contextual
constraints and requirements of discourse even though the linguistic realization of
selected speech acts across cultures has been examined quite extensively, as have
face-threatening acts (Brown and Levinson 1987). Van Dijk (1980) conceptualizes
discourse as some kind of macro speech act, and Oishi and Fetzer (2016)
differentiate between classical speech acts and the higher-level illocutionary act
type of expositive.

3.1. Expositive as a higher-level illocutionary act type

In his analysis of speech acts, Austin discusses one group of illocutionary acts,
which contribute to making explicit the speaker’s attitude towards the communica-
tive status of her/his illocutionary act in discourse: “the expositive is the clarify-
ing of reasons, arguments, and communications” (Austin 1975: 163). Expositive
acts of expounding a view, conducting an argument, and clarifying a usage or a
reference (Austin 1975: 161) are different to ordinary speech acts. A necessary
condition for an ordinary speech act to be felicitous is a locution with a more-or-
less definite sense and reference as regards “naming” and “referring”. For the high-
er-level speech act of expositive, both illocution and locution are also higher-level
acts, and that is why expositives have higher-level locutionary meaning, which is
composed of the contextualization of prior discursive contribution(s) in accordance
with discourse-genre-specific expectations. In performing an expositive illocution-
ary act, the speaker manifests how illocutionary force and locutionary meaning are
intended to be contextualized discursively in context C in discourse D, at a particular
stage in discourse. In doing so, the speaker manifests his/her perlocutionary
intention of producing a perlocutionary object or sequel.3
Expositives manifest the speaker-intended concatenation of speech acts and
their linguistic realization as discursive contributions within a discourse and with
the discourse-as-a-whole. The expositive speech act type is thus different to ordi-
nary speech acts in that it has the function of making plain (1) how discursive
contributions are intended to fit into the course of discourse, (2) how the speakers
intend the words/linguistic strings to be taken, and (3) what they intend the words/
linguistic strings to count as in that discursive context. Because of this, exposi-
tives are metacommunicative devices par excellence. Their metacommunicative
function assigns expositives the status of higher-level illocutionary acts, which
are executed in discourse as generalized contextualization devices, requesting the
addressee(s) to contextualize a discursive contribution as the linguistic realization
of a speech act at a particular stage in the discourse in accordance with discursive
requirements. The contextualization of discursive contributions as requested by
expositive acts is indispensable to the interlocutors’ construal of discourse coherence.
Expositives count as requests to interpret embedded discursive contributions
in their embedding discursive context and therefore provide relevant discursive
glue.
The following excerpt from the discourse of Prime Minister’s Questions4
(PMQs) by the leader of the opposition Edward Miliband (LO) and Prime Minister
David Cameron (PM) at the July 10, 2013 session illustrates the form and function
of the expositive illocutionary act type, whose linguistic realization is printed in
italics:
Edward Miliband (Doncaster North) (Lab): Mister Speaker, let me (first) join the
Prime Minister in paying tribute to Andy Murray for his fantastic victory–following
Virginia Wade’s victory in 1977. It it was a, it was a fantastic achievement; he showed
extraordinary determination, and the whole country is incredibly proud of him. Mister
Speaker, as the Government considers the issue of party funding reform, can the Prime
Minister tell the House how much his party has received in donations from hedge funds?

3
In his analysis of expositives, Austin (1975: 162–163) provides the following list of
speech-act verbs:
1. affirm, deny, state, describe, class, identify; 2. remark, mention, ?interpose; 3. inform,
apprise, tell, answer, rejoin; 3a. ask; 4. testify, report, swear, conjecture, ?doubt, ?know,
?believe; 5. accept, concede, withdraw, agree, demur to, object to, adhere to, recognise,
repudiate; 6. postulate, deduce, argue, neglect, ?emphasise; 7. begin by, turn to, conclude
by; 7a. interpret, distinguish, analyse, define; 7b. illustrate, explain, formulate;
7c. mean, refer, call, understand, regard as.
4
Prime Minister’s Question Time (PMQs) is a televised weekly 30-minute parliamentary
session in Great Britain, in which the Prime Minister (PM) responds to questions from
Members of Parliament (MPs). The Speaker presides over the House’s debate. The data
provided by Hansard
(http://www.parliament.uk/business/news/2013/july/prime-ministers-questions-10-july-2013/)
have been checked against delivery and adapted accordingly.

In saying “Mr Speaker, let me (first) join the Prime Minister in paying tribute to
Andy Murray …”, the LO connects his upcoming discursive contribution with the PM’s
prior turn by echoing the act of congratulating performed by the PM. With the use of the expositive
“let me join” the LO does not only align with the PM by agreeing both with the
PM’s initial content and illocutionary force, but also provides discourse-structuring
information about how he intends to have his contribution discursively contextual-
ized with respect to embedding discourse and the discourse-as-a-whole, and how
he intends to structure his contribution; the latter is achieved by the combination of
the expositive with the cohesive device “first”. A discourse-structure-based analysis
of the hedged performative “let me + performative verb” (cf. Brown and Levinson
1987) goes beyond face management. Rather, it connects interpersonal aspects
of communication with the structuring of discourse in an explicit manner. In the
excerpt above, the hedged performative makes manifest that the behabitive act of
congratulation, that is reacting to other people’s success (Austin 1975: 160–161),
is forthcoming. As for the construal of discourse coherence, let me join refers
anaphorically to the PM’s prior turn while at the same time referring cataphorically to
an upcoming discursive contribution, thus exhibiting dual referencing potential, which
makes manifest the discourse-structuring function of expositives and thus their
Janus-like nature. In using an expositive, the speaker manifests how s/he intends
the addressee(s) to take up her/his discursive contribution and how s/he intends
them to contextualize it (Gumperz 1996) at that particular stage in discourse. In
performing the expositive act, the LO manifests his perlocutionary intention of
taking up the initiated sequel of offering congratulations and of continuing it. The
expositive act is signaled with the conventionalized performative let me join, which
is supplemented with the cohesive device first, implying that another speech act is
to follow, in this case the illocutionary act of directive realized by the convention-
alized performative can you do X, requesting the PM to provide information about
the quantity of donations received by the Conservative Party from hedge funds. In
performing this illocutionary act, the LO manifests his intention of producing the
perlocutionary sequel of initiating a debate about the transparency of donations to
political parties.
The differentiation between ordinary speech acts and expositives as a high-
er-level illocutionary act type allows speech act theory to extend its scope and
account for the nature of the connectedness between linearized speech-act sequences
and their linguistic realization as discursive contributions, considering not only the
status of individual speech acts but also the impact of their sequential position
on the structuring of discourse, thus contributing to a pragmatics-based theory of
discourse (cf. Sbisà, this volume). As higher-level illocutionary acts, expositives
directly influence the contextualization of the linguistic realizations of speech acts
and thus the structuring of discourse, contributing to the participants’ negotiation
and construal of discourse coherence (Gernsbacher and Givón 1995; Linell
1998), making discourse not only co-constructed, but also dynamic. Discourse
connectives have a very similar function.5 Being processed bottom-up, they fulfil
an important indexical function by connecting local domains of discourse with
global ones (Schiffrin 1987). They may connect discursive contributions locally,
as has been demonstrated for the cohesive device first analyzed above, signaling
the sequential status of the argumentative formatting of the contribution as well
as possible degrees of relevance of the discourse topics to the ongoing discourse.
The communicative meaning of discourse connectives can be frequently par-
aphrased by a performative verb or by a hedged performative, e. g. “as a result”
with the value of ‘I conclude’, “but” with the values of ‘I contrast’ or ‘I do not quite
agree’, and “like” with the value of ‘I quote’. Analogously to expositives, discourse
connectives can be seen as carriers of perlocutionary intentions of producing per-
locutionary sequels.
Speech acts have not only been distinguished with respect to their status as
ordinary speech acts and as higher-level speech acts, but also, as has been the case
with Widdowson’s definition of discourse (2004), with respect to quantity, that is
as micro and macro speech acts.

3.2. Macro speech acts

Speech act theory has paved the way for an examination of natural language
and other types of communication in context. It has not only influenced theoretical
pragmatics, but also applied linguistics, where the linguistic realization of speech
acts is examined in and across cultures, considering in particular different degrees
of (in)directness in sociocultural context. Since the focus has been on individual
speech acts, the context of the speech acts under investigation and the delimiting
frame, of which the speech acts under consideration have been a constitutive part,
for instance a formal or informal interview, have not been fully accounted for. This
does, however, not mean that speech act theory cannot be utilized for a felicitous
analysis of discourse, as has been shown by the contextualization of speech act
theory and the adaptation of the constitutive parts of a speech act, i. e. locutionary
act/propositional act, illocutionary act and perlocutionary act, and intended and
unintended perlocutionary effects, and their felicity conditions to an analysis of
discourse, accommodating the differentiation between direct and indirect speech
acts and their felicity conditions to the contextual and discursive embeddedness
of speech acts and to their sequential organization as single acts or as patterned
sequences with structured pre-, topical and post-sequences. Levinson (1983) has
shown this for the sequential organization of the speech acts of announcement,
invitation and request with respect to felicity-condition-based pre-sequences, that
is references to the preparatory conditions for requests and invitations, and references
to the preparatory or essential condition for announcements. Trosborg (1995)
has adapted the Cross-Cultural-Speech-Act-Realization-Project framework to the
sequential organization of requests, complaints and apologies with respect to discourse
structuring pre-, post- and head acts, with pre-acts being functionally equivalent
to references to felicity conditions, for instance references to the preparatory
conditions for request, head acts being functionally equivalent to the intended communicative
act to be performed, that is the request as such, and post-acts being
functionally equivalent to supportive acts accounting for the appropriateness of the
head act, such as accounts for the relevance of the request. Combining structure,
content and force, that is patterned sequences and their inherent hierarchical configuration
as pre-, post- and head acts, and propositional content and illocutionary
force and their felicity conditions provides synergetic effects, which can contribute
to a pragmatics-based theory of discourse.

5
Discourse connectives also support the contextualization of a discursive contribution
by indicating the speaker’s intended contextualization, as is the case with the strategic
use of the cohesive device first in the excerpt analyzed above. In addition to their interactional
and discourse-structuring function, they may also have attitudinal and
illocutionary-force intensifying functions.

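
The hierarchical configuration of pre-acts, head act and post-acts discussed above can be sketched as a simple well-formedness check. The Python fragment below is a minimal illustration under stated assumptions, not an implementation of Trosborg's or Levinson's analyses: the names ActRole and is_patterned_sequence are hypothetical, the sample request sequence is invented, and the restriction to exactly one head act per sequence is a simplifying assumption.

```python
from enum import Enum


class ActRole(Enum):
    """Roles within a patterned speech act sequence (hypothetical labels)."""
    PRE = "pre-act"
    HEAD = "head act"
    POST = "post-act"


def is_patterned_sequence(roles):
    """Return True if the roles form: zero or more pre-acts, one head act,
    zero or more post-acts.

    Assumption: a sequence contains exactly one head act.
    """
    head_positions = [i for i, role in enumerate(roles) if role is ActRole.HEAD]
    if len(head_positions) != 1:
        return False
    head = head_positions[0]
    before_ok = all(role is ActRole.PRE for role in roles[:head])
    after_ok = all(role is ActRole.POST for role in roles[head + 1:])
    return before_ok and after_ok


# Invented request sequence, for illustration only.
request = [
    ("Are you busy at the moment?", ActRole.PRE),      # checks a preparatory condition
    ("Could you read my draft?", ActRole.HEAD),        # the request as such
    ("The deadline is tomorrow.", ActRole.POST),       # accounts for the request
]
print(is_patterned_sequence([role for _, role in request]))
```

The check covers structure only; content, force and felicity conditions, which the combined approach above also requires, would need a richer representation.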
The dynamics of discourse can only be captured if fundamental pragmatic
premises are adapted to discursive linearization. This is because the sequencing
of discourse makes manifest the discursive contributions’ – referred to as “moves”
by Sbisà – perlocutionary effects: “[w]hen considering a sequence of moves, it
is reasonable to view the output of one move as coinciding with the input for
the next” (2002: 72). Bach goes further by connecting micro, meso and macro
domains of discourse with different types of intention: “communicative (illocu-
tionary) intentions generally are accompanied by perlocutionary intentions, and
individual utterances are usually parts of larger plans. So it is plausible to suppose
that identifying a speaker’s perlocutionary intentions and broader plans is often
relevant to identifying his communicative intention” (Bach 1992: 397). Perlocu-
tionary intentions are also inherent in Austin’s conception of perlocutionary act,
which manifests itself in the “achievement of a perlocutionary object (convince,
persuade) or the production of a perlocutionary sequel” (Austin 1975: 181). Thus,
the Austinian conception of speech act does not only account for force and con-
tent, but also for metadiscursive meaning, which is reflected in the reference to
“sequel”, that is some kind of continuation, connectedness, series or sequence.
This can be interpreted as a requirement to connect a speech act with adjacent
speech acts, and possibly with other more remote ones, bringing about the under-
standing of the content, force and metadiscursive meaning, contributing to the
construal of discourse coherence.
The extension of frame from speech act to discourse, and from communicative
intention to discourse purpose is a necessary step if discourse-as-a-whole is to be
examined, as has been done by Labov and Fanshel (1977) or by van Dijk (1980) for
instance. The former conceive of the performance of discourse (as-a-whole) as functionally
equivalent to the performance of a “matrix of utterances” (Labov and Fanshel
1977: 30). Van Dijk argues that complex sequences of speech acts are mapped
on more global macro acts in order to be able to plan them, execute them coher-
ently, and in order to understand them, memorize them, and talk about them. The
nature of the connectedness between (micro) speech acts and macro speech acts
is complex. This is because there is no straightforward mapping from discursive
contribution – or utterance in Labov and Fanshel’s terms – to micro speech act and
from micro speech acts to macro speech act. Rather, there are in-between-stages,
or more and less global macro speech acts and thus discursive contributions with
fuzzy boundaries, which also need to be considered in the corresponding mapping
operations. Once discursive contributions have been mapped onto micro speech
acts and once they have been accommodated in the discourse common ground,
they may be administered to form larger units.
Discourse purpose is a pragmatic concept, which is dialectically related to
the pragmatic premise of intentionality of communicative action (Cohen, Mor-
gan and Pollack 1992; Levinson 1995; Searle 1983). It is made manifest in the
speech-act-theoretic operationalization X counts as Y in context C with felicity
conditions as context categories (Sbisà 2002), which, if adapted to the contextual
constraints and requirements of discourse, result in X counts as Y in discourse
D in context C. Analogously to the felicity conditions of a (micro) speech act,
the felicity conditions for discourse can be classified as preparatory conditions,
which are specifications of the context of the (macro) illocutionary act, which can
be realized implicitly or explicitly in discourse. Essential conditions and propo-
sitional content conditions are specifications of direction of fit, which can also
be realized explicitly or implicitly in discourse by indicating how the discourse
is intended to proceed. Micro and macro speech acts “both rely on, and actively
create, the situation in which they are realized” (Mey 2001: 219). They have cog-
nitive effects regarding meaning and force, and they have social effects regarding
the assignment of obligations and force. Both micro and macro speech acts count
as context-changers. As a result, speech acts in discourse are doubly contextual and
therefore do not only change context, but also carry context, and they are multiply
discursive, connecting speech acts and their realizations as discursive contributions
locally, not-so-locally and globally.
Analogously to the performance of a (micro) speech act, which can be realized
as a direct, indirect or conventionally indirect speech act, discourse (as-a-whole)
can be realized as discourse with a direct, indirect or conventionally indirect force.
In discourse with a direct force, the communicative intent and its linguistic realization
as a sequence of one or more discursive contributions are represented explicitly
as regards force and content and thus are intended to be unambiguously clear,
as is the case in legal discourse, e. g. pronouncement of judgement or cross-ex-
amination, and institutional discourse, such as application forms for citizenship
or reminders. The linguistic realization of discourse with conventionally indirect
force depends strongly on cultural conventions, as is the case with small talk (cf.
Schneider 2008) and closing sections in ordinary conversations, or reviews, letters
of recommendation and obituaries, for instance. Analogously to indirect speech
acts, the communicative meaning of discourse with an indirect force depends
strongly on the context in which it is realized. Informal small talk or gossip may
simply have a phatic function, but it may also serve as some kind of briefing, com-
municating relevant information about something or somebody.
The macro speech act (or discourse genre, communicative genre, activity type,
communicative project) of interview, whose purpose is to elicit information, may
undergo discourse-purpose-specific particularization, according to the kind of
information elicited; discourse-specific particularization is, of course, also inter-
dependent on sociocultural constraints and requirements. For instance, political
interviews are used strategically to elicit and systematize political information, oral
examinations are used in educational contexts to assess the examinee’s expertise,
job interviews are used to evaluate a candidate’s suitability and expertise, and
health interviews are used to elicit information about patients’ conditions. The
macro speech act of interview is also used to elicit and systematize citizenship-ori-
ented information about relevant criteria for the (non)qualification for income sup-
port, housing benefit or political asylum, and it may also be used for various other
purposes.
Depending on their discourse-specific purpose, particularized interviews
are composed of discourse-specific realizations of discursive contributions con-
strained by linguistic and social style (e. g. lexicon, syntactic complexity, discur-
sive complexity, non-verbal code of conduct) and the participants’ face-wants
and face-needs (cf. Brown and Levinson 1987), for instance formal style with
negative politeness or informal style with positive politeness (cf. Fetzer 2000 for
the macro validity claim of political interview). Discourse-specific purpose may
also constrain the sequential organization of the macro speech act, such as elab-
orate opening or closing sections, ad-hoc pre- or side sequences, reformulation
sequences or deviations from the participant-specific employment of speech acts
with requestive force, such as interviewees asking questions to perform requests
for information. Deviations from the genre-specific constraints and requirements
need to be accounted for (“Can I ask you a question because that is important”).
The particularization of macro speech acts goes hand in hand with changes in
social norms and values, as is reflected in the emergence of new macro speech acts
or of discourse-purpose- or social-context-specific particularizations, for instance
in social media. This is found in dialogic formats, such as interviews, and in monologic
formats, e. g. lectures and their particularization as academic lecture, political
speech or sermon. It is also reflected in the more general process of conversationalization
of (British) institutional discourse (Fairclough 1992).
The analysis of discourse is fundamentally concerned with the nature of the
connectedness between parts and wholes, and for this reason discourse is a rela-
tional construct par excellence, relating separate parts locally as well as globally
with regard to their connectedness to discourse-as-a-whole. Discourse is thus not
only quantity, as is captured by the number of its constitutive parts, but also quality,
as is reflected in the force and nature of the connectedness of its constitutive parts.

4. Conclusion

Discourse has been described as a multifarious and multilayered construct, which
seems almost impossible to delimit. It has been defined as quantitatively larger than
one discourse unit, and thus is composed of a number of concatenated micro and
meso discourse units. The linearization of the constitutive units of discourse allows
for multiple combinations, whose ordering is constrained by discourse genre and
discursive purpose as well as by the interlocutors’ communicative goals. While
the constitutive units of discourse can be analyzed as grammatical or ungrammat-
ical, true or false, felicitous or infelicitous, or appropriate or inappropriate, their
ordering cannot be classified along those lines only. This is because discourse is
a parts-whole configuration in which the meaning of the whole is more than the
sum of its separate parts. If the ordering of the parts changes, so does the meaning
of the whole.
Micro, meso and macro discourse units and their linguistic realizations are
related dialectically in discourse. This holds for micro discourse units and their
linguistic realizations, for meso discourse units and their linguistic realizations,
and for macro discourse units and their linguistic realizations. Analogously to mul-
tilayered context, macro discourse units and their constitutive parts of meso and
micro discourse units are multilayered and doubly, if not multiply contextual, and
their order of inclusion, i. e. micro, meso and macro, corresponds to their order of
accessibility (cf. Fetzer 2012).
Discourse has been approached in different research paradigms, which have
implemented different methodological approaches considering formal, interpre-
tative and observational analyses as regards qualitative, quantitative and empiri-
cal issues. Irrespective of methodology and research framework, the fundamental
questions of (1) granularity regarding micro, meso and macro discourse units and
(2) the nature of the connectedness between their constitutive parts remain a chal-
lenge.

References

Aijmer, Karin
1997 I think – an English modal particle. In: Toril Swan and Olaf Jansen (eds.),
Modality in Germanic Languages. Historical and Comparative Perspectives,
1–47. Berlin: de Gruyter.
Arundale, Robert B. and David Good
2002 Boundaries and sequences in studying conversation. In: Anita Fetzer and
Christiane Meierkord (eds.), Rethinking Sequentiality. Linguistics meets Con-
versational Interaction, 121–150. Amsterdam: John Benjamins.
Asher, Nicholas and Alex Lascarides
2003 Logics of Conversation. Cambridge: Cambridge University Press.
Austin, John L.
1975 How to Do Things with Words. Cambridge: Cambridge University Press.
Bach, Kent
1992 Communicative intentions, plan recognition, and pragmatics: Comments on
Thomason and Littman and Allen. In: Philip R. Cohen, Jerry Morgan and Mar-
tha E. Pollack (eds.), Intentions in Communication, 389–400. Cambridge: MIT
Press.
Behrens, Bergljot, Cathrine Fabricius-Hansen and Kare Solfjeld
2012 Competing structures. The discourse perspective. In: Cathrine Fabri-
cius-Hansen and Dag Haug (eds.), Big Events, Small Clauses. The Grammar
of Elaboration, 179–225. Berlin: de Gruyter.
Biber, Douglas
1988 Variation across Speech and Writing. Cambridge: Cambridge University Press.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad and Edward Finegan
1999 Longman Grammar of Spoken and Written English. London: Longman.
Brown, Penelope and Stephen C. Levinson
1987 Politeness. Some Universals in Language Usage. Cambridge: Cambridge Uni-
versity Press.
Bublitz, Wolfram, Uta Lenk and Eija Ventola (eds.)
1999 Coherence in Spoken and Written Discourse: How to Create it and How to
Describe it. Selected Papers from the International Workshop on Coherence,
Augsburg, 24–27 April 1997. Amsterdam: John Benjamins.
Chafe, Wallace
1994 Discourse, Consciousness and Time. Chicago: University of Chicago Press.
Cohen, Philip R., Jerry Morgan and Martha E. Pollack (eds.)
1992 Intentions in Communication. Cambridge: MIT Press.
De Beaugrande, Robert and Wolfgang Dressler
1981 Einführung in die Textlinguistik. Tübingen: Niemeyer.
Dik, Simon
1997 The Theory of Functional Grammar (2 Volumes). Ed. Kees Hengeveld. Berlin:
de Gruyter.
Edmondson, Willis
1981 Spoken Discourse. A Model for Analysis. London: Longman.
Edmondson, Willis and Juliane House
1981 Let’s Talk and Talk about it. München: Urban and Schwarzenberg.
Fabricius-Hansen, Cathrine and Wiebke Ramm
2008 Editors’ introduction. In: Cathrine Fabricius-Hansen and Wiebke Ramm (eds.),
‘Subordination’ versus ‘Coordination’ in Sentence and Text, 1–30. Amster-
dam: John Benjamins.
Fairclough, Norman
1992 Discourse and Social Change. Cambridge: Polity Press.
Fetzer, Anita
2000 Negotiating validity claims in political interviews. Text 20(4): 1–46.
Fetzer, Anita
2004 Recontextualizing Context: Grammaticality Meets Appropriateness. Amsterdam:
John Benjamins.
Fetzer, Anita
2007 Reformulation and common grounds. In: Anita Fetzer and Kerstin Fischer
(eds.), Lexical Markers of Common Grounds, 157–179. London: Elsevier.
Fetzer, Anita
2011 “Here is the difference, here is the passion, here is the chance to be part of a
great change”: Strategic context importation in political discourse. In: Anita
Fetzer and Etsuko Oishi (eds.), Context and Contexts: Parts Meet Whole?,
115–146. Amsterdam: John Benjamins.
Fetzer, Anita
2012 Contexts in interaction: Relating pragmatic wastebaskets. In: Rita Finkbeiner,
Jörg Meibauer and Petra Schumacher (eds.), What is a Context? Linguistic
Approaches and Challenges, 105–127. Amsterdam: John Benjamins.
Fetzer, Anita
2013 The pragmatics of discourse. Topics in Linguistics 11: 5–12.
Fetzer, Anita
2014 Conceptualizing discourse. In: Klaus Schneider and Anne Barron (eds.), The
Pragmatics of Discourse, 35–61, Handbooks of Pragmatics 3. Berlin: de
Gruyter.
Fetzer, Anita and Augustin Speyer
2012 Discourse relations in context: Local and not-so-local constraints. Intercul-
tural Pragmatics 9(4): 413–452.
Gernsbacher, Morton-Ann and Talmy Givón (eds.)
1995 Coherence in Spontaneous Text. Amsterdam: John Benjamins.
Givón, Talmy
1993 English Grammar: A Function-Based Introduction (2 Volumes). Amsterdam:
John Benjamins.
Greimas, Algirdas J.
1983 Du sens II. Paris: Seuil.
Gruber, Helmut and Gisela Redeker (eds.)
2014 The Pragmatics of Discourse Coherence: Theories and Applications. Amster-
dam: John Benjamins.
Grice, Herbert Paul
1975 Logic and conversation. In: Peter Cole and Jerry L. Morgan (eds.), Syntax and
Semantics, 41–58. New York: Academic Press.
Gumperz, John J.
1992 Contextualization and understanding. In: Alessandro Duranti and Charles
Goodwin (eds.), Rethinking Context: Language as an Interactive Phenomenon,
229–252. Cambridge: Cambridge University Press.
Gumperz, John J.
1996 The linguistic and cultural relativity of inference. In: John J. Gumperz and
Stephen C. Levinson (eds.), Rethinking Linguistic Relativity, 374–406. Cam-
bridge: Cambridge University Press.
Halliday, Michael A.K.
1994 Introduction to Functional Grammar. London: Arnold.
Halliday, Michael A.K. and Ruqaiya Hasan
1987 Cohesion in English. London: Longman.
Heine, Bernd, Günther Kaltenböck, Tania Kuteva and Haiping Long
2013 An outline of Discourse Grammar. In: Shannon T. Bischoff and Carmen Jany
(eds.), Functional Approaches to Language, 155–206. Berlin: de Gruyter.
Hengeveld, Kees and Lachlan J. Mackenzie
2008 Functional Discourse Grammar. A Typologically-Based Theory of Language
Structure. Oxford: Oxford University Press.
Heritage, John
1984 Garfinkel and Ethnomethodology. Cambridge: Polity Press.
Labov, William and David Fanshel
1977 Therapeutic Discourse. Psychotherapy as Conversation. New York: Academic
Press.
Levinson, Stephen C.
1979 Activity types and language. Linguistics 17: 365–399.
Levinson, Stephen C.
1983 Pragmatics. Cambridge: Cambridge University Press.
Levinson, Stephen C.
1995 Interactional bias in human thinking. In: Esther Goody (ed.), Social Intelli-
gence and Interaction, 221–260. Cambridge: Cambridge University Press.
Linell, Per
1998 Approaching Dialogue. Amsterdam: John Benjamins.
Maier, Robert M., Carolin Hofmockel and Anita Fetzer
2016 The negotiation of discourse relations in context: Co-constructing degrees of
overtness. Intercultural Pragmatics 13(1): 71–105.
Mann, William C. and Sandra A. Thompson
1988 Rhetorical structure theory: Toward a functional theory of text organization.
Text 8(3): 243–281.
Martin, James R. and David Rose
2008 Genre Relations. Mapping Culture. London: Equinox.
Mey, Jacob
2001 Pragmatics. An Introduction. Oxford: Blackwell.
Mey, Jacob
2011 Speech acts in context. In: Anita Fetzer and Etsuko Oishi (eds.), Context and
Contexts: Parts meet Whole?, 171–180. Amsterdam: John Benjamins.
Moeschler, Jacques
2002 Speech act theory and the analysis of conversations. In: Daniel Vanderveken
and Susumu Kubo (eds.), Essays in Speech Act Theory, 239–261. Amsterdam:
John Benjamins.
Montgomery, Martin
1977 The structure of lectures. Unpublished M.A. thesis, University of Birmingham.
Oishi, Etsuko and Anita Fetzer
2016 Expositives in discourse. Journal of Pragmatics 96: 49–59.
Pomerantz, Anita
1984 Agreeing and disagreeing with assessments: Some features of preferred/dis-
preferred turn shapes. In: Max Atkinson and John Heritage (eds.), Structures
of Social Action, 57–101. Cambridge: Cambridge University Press.
Roulet, Eddy
1991 On the structure of conversation as negotiation. In: John R. Searle, Herman
Parret and Jef Verschueren (eds.), (On) Searle on Conversation, 91–100.
Amsterdam: John Benjamins.
Roulet, Eddy
2006 The description of text relation markers in the Geneva model of discourse
organization. In: Kerstin Fischer (ed.), Approaches to Discourse Particles,
115–132. Oxford: Elsevier.
Sarangi, Srikant
2000 Activity types, discourse types and interactional hybridity: The case of genetic
counselling. In: Srikant Sarangi and Malcolm Coulthard (eds.), Discourse and
Social Life, 1–27. London: Pearson.
Sbisà, Marina
2002 Cognition and narrativity in speech act sequences. In: Anita Fetzer and Christiane
Meierkord (eds.), Rethinking Sequentiality: Linguistics Meets Conversational
Interaction, 71–97. Amsterdam: John Benjamins.
Schegloff, Emanuel A.
2007 Sequence Organization in Interaction: A Primer in Conversation Analysis.
Cambridge: Cambridge University Press.
Schiffrin, Deborah
1987 Discourse Markers. Cambridge: Cambridge University Press.
Schneider, Klaus P.
2008 Small talk in England, Ireland, and the USA. In: Klaus P. Schneider and Anne
Barron (eds.), Variational Pragmatics. A Focus on Regional Varieties in Pluri-
centric Languages, 99–139. Amsterdam: John Benjamins.
Searle, John R.
1983 Intentionality. Cambridge: Cambridge University Press.
Searle, John R.
2010 Making the Social World: The Structure of Human Civilization. Oxford:
Oxford University Press.
Sinclair, John and Malcolm Coulthard
1975 Towards an Analysis of Discourse. London: Cambridge University Press.
Speyer, Augustin and Anita Fetzer
2014 The coding of discourse relations in English and German argumentative
discourse. In: Helmut Gruber and Gisela Redeker (eds.), The Pragmatics of
Discourse Coherence: Theories and Applications, 87–119. Amsterdam: John
Benjamins.
Stubbs, Michael
1983 Discourse Analysis: The Sociolinguistic Analysis of Natural Language. Chi-
cago: University of Chicago Press.
Thibault, Paul J.
2003 Contextualization and social meaning-making practices. In: Susan L. Eerdmans,
Carlo L. Prevignano and Paul J. Thibault (eds.), Language and Interaction.
Discussions with John J. Gumperz, 41–62. Amsterdam: John Benjamins.
Traugott, Elizabeth Closs
1988 Approaches to Grammaticalization. Amsterdam: John Benjamins.
Trosborg, Anna
1995 Interlanguage Pragmatics. Berlin: De Gruyter.
Tsui, Amy
1994 English Conversation. Oxford: Oxford University Press.
Van Dijk, Teun
1980 Macrostructures. Hillsdale, N.J.: Erlbaum.
Weizman, Elda and Anita Fetzer (eds.)
2015 Follow-Ups in Political Discourse. Explorations Across Contexts and Dis-
course Domains. Amsterdam: John Benjamins.
Widdowson, Henry
2004 Text, Context, and Pretext. Critical Issues in Discourse Analysis. Oxford:
Blackwell.
17. Critical Discourse Analysis
Piotr Cap

Abstract: This chapter gives an overview of the theoretical underpinnings and
current work in Critical Discourse Analysis (CDA). It defines CDA as a transdisciplinary,
text-analytical approach to critical social research, aimed at revealing
the power imbalance reflected in the use of language and patterns of dominance
imposed through the use of language. Describing the most important schools and
models in CDA, the chapter demonstrates how critical approaches draw on recent
developments in different areas of linguistics, such as pragmatics, cognitive lin-
guistics and corpus studies. At the same time, it shows how the interdisciplinary
research agenda of CDA attracts the “classic” theories and tools of linguistics to
new empirical territories in political/public discourse. The final part of the chap-
ter illustrates the explanatory power of the legitimization-proximization model in
CDA in a case study of the discourse of the war-on-terror.

1. What is Critical Discourse Analysis?1

Critical Discourse Analysis (CDA) has now firmly established itself as a field
within the humanities and social sciences, to the extent that the abbreviation CDA
is widely used to denote a recognizable approach to language study manifested
across a range of different disciplines (Breeze 2011; Hart and Cap 2014). In the
most recent handbooks, CDA is characterized as a “transdisciplinary, text-analyti-
cal approach to critical social research” (Hart and Cap 2014: 1; see also Wodak and
Meyer 2009, 2015; Flowerdew and Richardson 2016). Of course, this basic charac-
terization cannot possibly do justice to the vast body of work produced within the
field of CDA. It captures, however, one property that is central to all CDA research:
the commitment to a systematic, text-based exploration of language to reveal its
role in the workings of ideology and power in society (Fowler et al. 1979; Hodge
and Kress 1993; Fairclough 1989, 1995; van Dijk 1999, 2003, 2006; Wodak and
Meyer 2009; Wodak 2012; among others). It is exactly this core feature, or aspira-
tion, that underlies any strand of CDA practice.

1 Parts of sections 1 and 2 are based on Hart and Cap (2014).

https://doi.org/10.1515/9783110424928-017
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 425–451. Berlin/
Boston: De Gruyter Mouton.

As a self-conscious movement bringing together scholars of linguistic, sociological,
political scientific and other backgrounds, CDA abounds in declarations
of what it purports to do. These declarations range from the highly politicized: “to
explain existing conventions as the outcome of power relations and power struggle”
(Fairclough 1989: 2), to the almost anodyne “to answer questions about the
relationships between language and society” (Rogers et al. 2005: 365), depending
on the stance of the individual researcher (Breeze 2011). In an attempt to recon-
cile the different positions, Weiss and Wodak (2003) propose that “CDA takes
a particular interest in the relationship between language and power […]. This
research specifically considers more or less overt relations of struggle and conflict”
(2003: 12). Drawing on this perspective, and stressing the particular interest of
CDA in the asymmetrical nature of these relations, we can conclude that the aim of
CDA is to raise awareness of the power imbalance reflected in the use of language
and patterns of dominance imposed through the use of language (Chouliaraki and
Fairclough 1999; Reisigl and Wodak 2001; Weiss and Wodak 2003; Wodak and
Chilton 2005; among others).
As can be imagined from the above characterization, Critical Discourse Anal-
ysis is not confined to any specific methodology or area of research. On the con-
trary – it is and always has been multifaceted, dealing with data of very different
kinds and applying a broad spectrum of theories sourced from across the human-
ities, social and cognitive sciences (Hart and Cap 2014; Wodak and Meyer 2015;
Flowerdew and Richardson 2016). Hart and Cap (2014) note that, because of this
heterogeneity, both the “discourse” and the “analysis” in the CDA designation
tend to mean something different to different analysts. Discourse (see Fetzer in
this volume) is a multidimensional, multimodal and multifunctional phenomenon.
It is produced with reference to different dimensions of context, such as linguistic,
intertextual, historical and – notably for CDA practitioners – socio-cultural and
political. Functionally, it is used to represent, evaluate, argue for and against, and
ultimately to legitimate or delegitimate social actions. In this way, discourse is
socially constitutive as well as socially conditioned (Fairclough and Wodak 1997;
Wodak 2011). That is, on the one hand, all discourse is shaped by the situations,
institutions and social structures which surround it. At the same time, however,
discourse itself constitutes these situations and institutions, as well as the social
identities and relationships between their members or participants. Altogether, the
many faces of discourse preclude any uniform perception of how it can be inves-
tigated.
In CDA, analytic differences are conspicuously reflected in the amount of space that
different researchers devote to exploring the “micro” (linguistic) and the “macro”
(social) dimensions of discourse (Lemke 1995; Benke 2000). Some analysts focus
deductively on the macro-level social structures which facilitate or motivate dis-
cursive events, while others concentrate inductively on the micro-level, looking at
the particular chunks of language that make up these events. These preferences are,
of course, never mutually exclusive but are a matter of analytical emphasis. Fur-
thermore, many researchers steer a middle, “abductive” course. In Luke’s (2002)
words:
CDA involves a principled and transparent shunting backwards and forth between the
microanalysis of texts using various tools of linguistic, semiotic and literary analysis,
and the macroanalysis of social formations, institutions and power relations that these
texts index and construct (Luke 2002: 100).

Methods of studying discourse in CDA are thus diverse and depend on the domains
and dimensions of discourse under consideration, plus the theoretical goals of the
researcher. Analytic aspirations and the amount and kind of data available deter-
mine the tools analysts obtain from different macro- and micro-level theories. At
the micro-level, one of the most widely used models is Hallidayan systemic functional
linguistics (Halliday 1985, 1994), providing a viable handle on ideological properties
of written texts (Fowler 1991; Hodge and Kress 1993). At the other end of the
spectrum, cognitive approaches inform studies in the bottom-level lexico-gram-
matical structures of discourse in terms of the conceptual processes they invoke
(Hart 2014; Chilton 2014). Finally, one must not disregard the explanatory power
of hybrid approaches, such as critical metaphor analysis (Charteris-Black 2004;
Koller 2004; Musolff 2010), which offers CDA practitioners a rich, integrated
framework to capture the ideological import of metaphoric expressions occurring
in specific text patterns and phraseological sequences. Needless to say, such
diversity and fluidity make CDA a difficult discipline to pin down.
It seems that the best way to define CDA, though by no means ideal, is by
the word “critical” in its designation (Hart and Cap 2014). This involves seeing
CDA as a perspective, position or attitude, signposting a specific research agenda.
The concept of critical in CDA, however, is understood in as broad a sense as the
concept of discourse. For scholars working with a neo-Marxist notion of critique
(Fairclough 1995; Chouliaraki and Fairclough 1999), or following the Critical
Theory of the Frankfurt School (Wodak 2011; Reisigl and Wodak 2001), critique
presupposes a particular political stance on the part of the analyst and is intended
to be instrumental in bringing about social change (Hart and Cap 2014). Not-
withstanding its popularity, this attitude is often contested by researchers both
within (Luke 2002; Martin 2004) and outside (or half-outside) the community of
CDA (Widdowson 1998, 2005; Chilton 2005). Martin (2004) claims that it leads
to the essentially “negative” nature of analysis, which thus overlooks positive and
potentially transformative uses of discourse. In response, Martin and Rose (2003)
propose “positive discourse analysis” encouraging critical scholars to devote more
attention to the “discourse of positive change and discourse as the site of resist-
ance” (2003: 36).
For others still, critique comes not so much from a particular political perspec-
tive but is concerned more with abuses of language per se and the cognitive and
linguistic mechanisms involved (Hart and Cap 2014). At the same time, there are
traditions in post-structuralist discourse analysis, which adopt a critical perspective
(Slembrouck 2001) but which would not normally be considered as falling under
the banner of CDA. Criticality, then, is in a way a necessary condition for defining
CDA but it is not a sufficient condition. What sets CDA apart from other forms of
critical research is its focus on the micro-level analysis of texts, which are con-
sidered the prime source of attested data. In its analysis of texts, CDA relies quite
naturally on the field of linguistics – including pragmatics – though to different
degrees in different works. Here, although CDA is a huge and complex field which
is apparently without boundaries both methodologically and in terms of the type
of data it targets, some clear traditions can be identified and described. These tra-
ditions may be delineated in terms of particular methodological approaches (e. g.
Wodak and Meyer 2009; Hart and Cap 2014) and in terms of the discourse domains
targeted (e. g. Cap and Okulska 2013; Bhatia 2004; Martin and Rose 2008).

2. Approaches and domains in CDA

In one of the more recent and most comprehensive attempts at taking stock of the
field, Hart and Cap (2014) distinguish eleven approaches to CDA. Because of space
constraints, I will not describe each of these approaches in detail. Instead, I will
focus on how the different approaches interrelate, forming analytic handles dealing
with different types of data. Hart and Cap (2014) present the eleven approaches in
relation to their specific methodological attractors, which indicate the underlying
analytical traditions. Hart and Cap’s (2014) outline is reproduced in Figure 1. The
white ovals mark the approaches, and the shaded ovals mark their attractors. The
five constellations in the diagram demonstrate how different approaches are linked
by common objects of analysis.
The representation in Figure 1 illustrates the variety and interconnectedness
of different research traditions in CDA. For example, the discourse-historical
(Wodak 2011; Reisigl and Wodak 2001; etc.) and socio-cognitive (van Dijk 2008)
approaches are both related in their focus on argumentation, although the dis-
course-historical approach deals with argumentation in more detail, proposing
tools to locate and describe fallacy triggers and argumentative topoi (van Eemeren
and Grootendorst 1992) in different discourse domains. At the same time, the dis-
course-historical approach borrows in its framework of referential strategies from
the social actor model (Koller 2004; van Leeuwen 2005; etc.). In turn, the social
actor model is presented as a grammar in the format of Halliday’s functional net-
work (van Leeuwen 1996; Halliday 1994). We thus observe direct as well as indi-
rect connections between the particular models.

Figure 1. Approaches and methodological attractors in CDA (reproduced from Hart and
Cap 2014: 7)
(CL: Critical linguistics; DRA: Dialectical-relational approach; DA: Dispositive
analysis; SAM: Social actor model; DHA: Discourse-historical approach;
SCA: Socio-cognitive approach; CCP: Critical cognitive pragmatics; L/PM:
Legitimization-proximization model; CogLA: Cognitive linguistics approach;
CMA: Critical metaphor analysis; CorpLA: Corpus linguistics approach)

As Hart and Cap (2014) demonstrate, contemporary CDA is a genuine mix
of social and linguistic theory, lending itself to different typological procedures.
While different approaches can be mapped out according to the social theories they
are influenced by, they may equally be distinguished by the linguistic fields and
models that provide for their text-analytical methodologies. One model that has
proved particularly influential is Halliday’s systemic functional grammar (e. g. Halliday
1985, 1994), implementing analytic formalizations in much of the early CDA
and in critical linguistics in particular (Wodak 2011; Chilton 2005). It has thus
helped critical linguistics, or the East Anglian school (Fowler et al. 1979; Fowler
1991; Hodge and Kress 1993), to retain its central role in the development of CDA.
As noted by Fairclough and Wodak (1997), critical linguistics is more than a histor-
ical precursor to CDA. Influenced over years by text-analytical frameworks such
as systemic functional grammar, it has been able to upgrade its tools to produce
comprehensive, qualitative-quantitative studies (Hart and Cap 2014; Flowerdew
and Richardson 2016). As a result, it can be considered a major approach in the
landscape of modern CDA (Fairclough and Wodak 1997).
Notwithstanding the revisions of older theories, CDA has grown considerably
in the last few years to develop several completely new schools. This rapid
expansion can be understood as a response to recent advances in linguistics and
other communication sciences. The nature of this response is, first of all, that such
advances make it possible to address and, in many cases, offset certain criticisms
raised against CDA. Second, modern developments in linguistics and communi-
cation science provide new tools to better capture and document the ideological
potential of discourse. Third, there are new frameworks being developed or refined
to account for newly formed genres, such as, recently, genres of computer mediated
communication (Giltrow and Stein 2009; Yus 2011). One development in linguis-
tics that CDA incorporated almost immediately was, undoubtedly, corpus studies
(Stubbs 2002, 2004; Partington 2006; Baker 2006; Baker et al. 2008; O’Halloran
2010).2 Hart and Cap (2014) argue that the corpus linguistic approach in CDA helps
answer criticisms pertaining to possible bias in data selection and to the statistical
value of findings (Stubbs 1997; Widdowson 2004). It is, however, not just a problem
solver which can be applied together with other approaches to guard against
subjectivity and overgeneralization (Wodak and Meyer 2009). As noted recently
by Flowerdew and Richardson (2016), the corpus linguistic approach brings along
its own unique analytical techniques, such as collocation and prosody analysis,
which have been more and more productive in studying set chunks of texts for their
ideological properties (Baker 2006; Baker et al. 2008).
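The collocation analysis mentioned above can be illustrated with a minimal sketch. It is not the procedure of any of the cited studies: the function name `collocates`, the window size and the sample sentences are assumptions introduced here for illustration, and a real study would work with an attested corpus and a statistical association measure rather than raw counts.

```python
from collections import Counter
import re

def collocates(text, node, window=3):
    """Count words occurring within `window` tokens of each occurrence of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Collect neighbours on both sides, skipping the node position itself.
            counts.update(tokens[lo:i] + tokens[i + 1:hi])
    return counts

# Toy text invented for illustration only; not attested corpus data.
sample = (
    "The threat is growing. A growing threat demands action. "
    "The regime poses a grave threat to our security."
)

for word, freq in collocates(sample, "threat", window=2).most_common(3):
    print(word, freq)
```

Raw co-occurrence counts of this kind are only a starting point; whether a collocate carries ideological weight remains a matter of qualitative interpretation in context.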
Figure 1 includes four new approaches in CDA, which had not been acknowl-
edged prior to Hart and Cap’s (2014) work. These increasingly influential par-
adigms can be identified as: critical metaphor analysis (Charteris-Black 2004;
Koller 2004; Musolff 2004, 2010; Zinken 2007, among others); the cognitive
linguistic approach (Hart 2011a/b/c, 2013a/b; Marín Arrese 2011; Filardo Lla-
mas et al. 2016); the legitimization-proximization model (Cap 2006, 2008, 2013,
2016; Chilton 2004, 2011b; Dunmire 2011); and the “Neuchatel/Fribourg” school
of critical cognitive pragmatics (Saussure and Schulz 2005; Maillat and Oswald
2009, 2011; Lewiński and Oswald 2013). Each of these new agendas represents,
like most strands in CDA, an individual yet interdisciplinary research program.
Moreover, like other schools in CDA, each of them constitutes a specific line of
inquiry aiming to reveal the otherwise unexplored characteristics of discourse in its
socio-political, cultural and anthropological dimensions. Critical metaphor studies,
for instance, document the fundamental role that metaphor plays not only in our
understanding of the socio-political world we inhabit but also in the way we argue
about socio-political issues. They show that metaphorical expressions in language
cannot be treated as isolated entities but must, rather, be seen as manifestations of
knowledge networks in the form of conceptual metaphors, which provide structure and
coherence to our experience, including social experience (Goatly 2007).

2 It should be stressed that approaches in CDA do not simply borrow and apply ready-made
frameworks from linguistics. Rather, CDA scrutinizes, adapts and re-thinks linguistic
theories abductively in response to data and operationalization (Wodak and
Meyer 2009: 30). In this sense, one must be cautious about characterizing CDA as an
area of applied linguistics.

The second approach, cognitive linguistics, is more comprehensive and moves
beyond metaphor (Hart 2011b/c) to consider the ideological load of other linguistic
structures in terms of the conceptual processes they invoke. It focuses mainly on
categorization, modality, and deixis, which bring into effect a range of ideological
discursive strategies. The legitimization-proximization model is more concentrated
on a single conceptual operation – proximization – and the different forms of its
realization (spatial, temporal, axiological) which ensure the continuity of legitimi-
zation in changing geopolitical context. As will be demonstrated in a case study
later in this chapter, the focus of the legitimization-proximization model on the
dynamics of context and the resulting variability of legitimization patterns makes
this approach a truly pragmatic enterprise. The Neuchatel/Fribourg school pre-
sents, in turn, an almost exclusively explanatory framework in which the manipu-
lative facility of language, as manifested in fallacious arguments, is theorized as a
kind of cognitive illusion (Maillat and Oswald 2009). This form of manipulation
is made possible by the fact that “people are nearly-incorrigible ‘cognitive opti-
mists’” (Sperber et al. 1995: 11) who take for granted that their spontaneous cog-
nitive processes are highly reliable and that the output of these processes does not
need double checking (Maillat and Oswald 2009). The Neuchatel/Fribourg school
is thus, again, a timely response to modern developments in cognitive science.
Like the three other approaches, it treats the ideological and persuasive potential of
discourse not as a property of language itself but of the cognitive processes which
language reflects and mobilizes. Altogether, the new schools captured in Figure 1
provide a transdisciplinary, cognitive-scientific insight into the conceptual under-
pinnings of the social-linguistic interface and as such remain in the forefront of the
contemporary CDA (Hart and Cap 2014; Filardo Llamas et al. 2016; Flowerdew
and Richardson 2016).

3. CDA and pragmatics

The relationship between CDA and pragmatics is complex and difficult to capture.
This is because neither pragmatics nor CDA is confined to one specific methodology
or one particular area of study. Pragmatics is often understood as an analytic
stance, offering a unique, function-based account of all aspects of human commu-
nication (Verschueren 1999; Fetzer 2002). As noted by the editors of this hand-
book series, “pragmatics is defined by its point of view more than by its objects
of investigation”, which means that “researchers in pragmatics work in all areas
of linguistics (and beyond), but from a distinctive [functional] perspective that
makes their work ‘pragmatic’ and leads to new findings and to reinterpretations
of old findings” (Bublitz, Jucker and Schneider 2011: v). As such, pragmatics is
concerned with all facets of communicative acts, such as the speaker, his/her back-
ground knowledge and contextual assumptions, the lexical and grammatical con-
stituents of an utterance, the hearer’s interpretations and patterns of inferencing,
etc. All these are explored against a broad network of social factors, preconditions,
norms and expectations that govern communication, both within a culture and
across cultures. Since communicative acts involve linguistic units, whose choice
is dictated by language-internal rules, as well as their interpersonal, social and
cultural embedding, pragmatic studies bridge the system and the use side of lan-
guage. They examine what is lexically and grammatically available for a speaker to
accomplish a communicative goal, and at the same time explore the ways in which
the linguistic potential is realized in a specific social context.
The perspectivist view of pragmatics reveals several features which pragmatics
and CDA have in common. These include the fundamental interest in the func-
tionality of language, the sensitivity to the macro/social dimension of language
and discourse, as well as the interest in linguistic choices that speakers make to
carry out specific functional goals in particular social contexts. At the same time
there are differences, or at least asymmetries. The analytical focus of pragmatics
is still broader than the CDA focus, both in terms of the discourse domains which
it extends over and the levels of language organization it encompasses. While
pragmatics is concerned quite equally with the macro dimension of discourse and
the micro dimension of the lexico-grammatical features of individual utterances,
the interest of CDA has for a long time been primarily in the macro (social) level
of analysis. Pragmatics is preoccupied with the functions fulfilled by language in
real contexts, and with the relationships between form and social function; how-
ever, it also focuses on the detailed study of specific instances of language use.
In comparison, although CDA practitioners have long called for triangulation in
the sense of obtaining multiple perspectives on the phenomenon under scrutiny
(Reisigl and Wodak 2001; van Dijk 2006; etc.), or at least for “constant movement
back and forth between theory and data” (Meyer 2001: 27), there has been and still
is an observable trend for many research projects in CDA to operate in a top-down
manner. Presupposing a particular theory of social relations, they tend to single
out the most interesting aspects of language that tie in with a particular theoretical
approach, rather than embarking on an all-round, in-depth study covering the mul-
tiple dimensions of a text to determine how language works in a particular setting
(Blommaert 2001; Breeze 2011). If this trend has been changing recently, the credit
goes to the critique levelled at CDA by, indeed, pragmaticians, as well as conver-
sation analysts, ethnographers of communication and other scholars committed to
the notion that all interpretations should clearly emerge from the underlying data
(Breeze 2011; Verschueren 2011).
While work in linguistic pragmatics has helped CDA in the search for attested
textual data to support theoretical claims at the macro level, CDA attracts pragma-
ticians to new empirical territories, where discourse serves to (re-)enact, negotiate,
modify and/or reproduce ideology and individual as well as collective identity in
accordance with socio-political goals. There, pragmatics – and the pragmatics of
discourse (macropragmatics; see Cap 2011) in particular – benefit from the inter-
disciplinarity of CDA and its tendency to look for and engage new conceptual
frameworks in social research. The results are interdisciplinary studies bridging
different disciplines and approaches at the intersection of social and political sci-
ence and linguistics. The role of pragmatics in such studies is often to appropriate
findings in disciplines other than linguistics to the rigid requirements of linguistic
micro-analysis. For instance, findings in cognitive science and anthropology, the
disciplines frequently addressed in CDA, are used to build frameworks that serve
as conceptual handles on a specific kind of linguistic data (Chilton 2004, 2014; Cap
2013; Dunmire 2011; Hart 2014). These frameworks are pragmatic in the sense that
they elucidate the functional potential of lexical and grammatical choices drawn
from non-linguistic, cognitive domains, such as space or time. The best example of
such a framework seems to be the legitimization-proximization model, which has been
included in the panorama of contemporary CDA in Figure 1. In the remainder
of the chapter I discuss this model further as an instance of the dynamic interac-
tion between CDA and pragmatics. Apart from elucidating links that connect the
macro-social and micro-linguistic dimensions of research, the legitimization-prox-
imization model also illustrates the most important interdisciplinary elements of
modern CDA research in their typical configuration. The central principles of
this configuration involve the top-level position of cognitive and anthropological
categories and the bottom-level position of lexico-grammatical categories, with
pragmatics acting as an analytic mediator between the two positions.

4. The legitimization-proximization model in CDA

In its broadest sense, “proximization” can be defined as a discursive strategy of
presenting physically and temporally distant occurrences, events and states of
affairs (including “distant”, i. e. adversarial, ideologies) as increasingly and negatively
consequential to the political speaker and her addressee. Projecting the distant
entities as encroaching on the speaker-addressee territory (both physical and ideological),
the speaker seeks justification of actions and/or policies that she proposes
to neutralize the growing impact of the negative, “foreign”, “alien”, “antagonistic”
entities. Proximization is thus a cognitive-pragmatic strategy of legitimization of
interventionist policies.
The term proximization was first proposed by Cap to analyze coercion patterns
in the American anti-terrorist rhetoric following 9/11 (Cap 2006, 2008, 2010). Since
then it has been used within different discourse domains, though most commonly in
studies of state political discourses: crisis construction and war rhetoric (Chovanec
2010), anti-migration discourse (Hart 2010), political party representation (Cienki,
Kaal and Maks 2010), construction of national memory (Filardo Llamas 2010),
and design of foreign policy documents (Dunmire 2011, etc.). Findings from these
studies have been integrated in the legitimization-proximization model put forward
by Cap (2013). The model defines proximization as a forced construal operation
meant to evoke closeness of an external threat to solicit legitimization of preven-
tive measures. It presupposes a bipolar, dichotomous architecture of the political
Discourse Space (DS), in which meanings are construed from conceptual opposi-
tions between the in-group (DS-central) and the out-group (DS-peripheral). The
threat is posed by the DS-peripheral entities, which the model refers to as ODCs
(outside-deictic-center). The ODC entities are construed as moving across the DS
to invade the IDC (inside-deictic-center) entities, the speaker and her addressee.
Since the ODC threat can be conceptualized in spatio-temporal (physical) as well
as ideological terms, the strategy of proximization falls into three categories. Spa-
tial proximization is a forced construal of the DS-peripheral entities encroaching
physically upon the DS central entities (speaker, addressee). Temporal proximiza-
tion is a forced construal of the envisaged conflict as not only imminent, but also
momentous, historic and thus needing immediate response and unique preventive
measures. Spatial and temporal proximization involve fear appeals (becoming par-
ticularly strong in reactionary political projects) and typically use analogies to
conflate the growing threat with an actual disastrous occurrence in the past, to
endorse the current scenario. Lastly, axiological proximization involves construal
of a gathering ideological clash between the “home values” of the DS-central enti-
ties (IDCs) and the alien and antagonistic (ODC) values. Importantly, the ODC
values are construed to reveal potential to materialize (that is, prompt a physical
impact) within the IDC home territory.
In its conceptual design, the legitimization-proximization model subsumes
a dynamic view of the discourse space, which involves not only the opposition
between IDC and ODC entities, but also the discursively constructed movement of
the latter toward the deictic center of the DS (Figure 2). It thus focuses, from a lin-
guistic standpoint, on the lexical and grammatical deictic choices which speakers
make to, first, index the existing socio-political and ideological distinctions and,
second, demonstrate the capacity of the out-group (ODC) to erase these distinc-
tions by forcibly colonizing the in-group’s (IDC’s) space.
Furthermore, the legitimization-proximization model assumes that all three
strategies/aspects of proximization contribute to the continual narrowing of the
symbolic distance between the entities and values in the discourse space and their
negative impact on the speaker and her addressee. This does not mean, however,
that all three strategies are linguistically present (to the same degree) throughout
each stretch of the unfolding discourse. While any use of proximization principally
subsumes all of its strategies, spatial, temporal and axiological, the degree
or density of their actual linguistic representation is continually motivated by their
effectiveness in the evolving context. As will be shown in a case study below,
extralinguistic contextual developments may cause the speaker to limit the use of
one strategy and compensate for it with an increased use of another, in the interest
of the continuity of legitimization.

Figure 2. Proximization in Discourse Space (DS)

As a theoretical proposal in CDA, the legitimization-proximization model
makes a new contribution at two levels, (i) cognitive-pragmatic and (ii) linguistic,
or more precisely, lexico-grammatical. On the (i) cognitive-pragmatic conceptual
level, the Spatial-Temporal-Axiological (STA) paradigm revisits the ontological
status and the pragmatic function of deixis and deictic markers. While according to
classical views (Levinson 1983; Levelt 1989; etc.) deixis is considered primarily a
technical necessity and a formal tool for the coding of elements of context so com-
munication and interpretation could take place, the proximization approach makes
deixis an instrument of legitimization, persuasion and social coercion. Within the
legitimization-proximization model, the concept of deixis is not reduced to a finite
set of deictic expressions, but rather expanded to cover bigger lexico-grammatical
phrases and discourse chunks. As a result, the component deictic markers par-
take in forced conceptual shifts. An example of the legitimization-proximization
approach to deixis and deictic expressions is Cap’s (2013: 109) spatial proximization
framework (Table 1). It defines the main constituents and the mechanism of
proximization in the discourse space, and makes it possible to abstract the relevant
(i. e. “spatial”) lexico-grammatical items. It thus allows a quantitative analysis
of the lexical intensity of spatial proximization in a given discourse timeframe.

Table 1. Spatial proximization framework and its key lexico-grammatical items (reproduced from Cap 2013: 109)

Category | Key items
1. (Noun phrases (NPs) construed as elements of the deictic center of the DS (IDCs)) | [USA, United States, America]; [American people, Americans, our people/nation/country/society]; [free people/nations/countries/societies/world]; [democratic people/nations/countries/societies/world]
2. (Noun phrases (NPs) construed as elements outside the deictic center of the DS (ODCs)) | [Iraq, Saddam Hussein, Saddam, Hussein]; [Iraqi regime/dictatorship]; [terrorists]; [terrorist organizations/networks, Al-Qaeda]; [extremists/radicals]; [foreign regimes/dictatorships]
3. (Verb phrases (VPs) of motion and directionality construed as markers of movement of ODCs towards the deictic center) | [are determined/intend to seek/acquire WMD]; [might/may/could/can use WMD against an IDC]; [expand/grow in military capacity that could be directed against an IDC]; [move/are moving/head/are heading/have set their course toward confrontation with an IDC]
4. (Verb phrases (VPs) of action construed as markers of impact of ODCs upon IDCs) | [destroy an IDC]; [set aflame/burn down an IDC or IDC values]
5. (Noun phrases (NPs) denoting abstract concepts construed as anticipations of impact of ODCs upon IDCs) | [threat]; [danger]
6. (Noun phrases (NPs) denoting abstract concepts construed as effects of impact of ODCs upon IDCs) | [catastrophe]; [tragedy]

The six categories depicted in the left-hand column of Table 1 are a stable element
of the spatial proximization framework. The key items provided in the right-hand
column depend on the actual discourse under investigation. In Table 1, they come
from the domain of the anti-terrorist rhetoric, which has been widely analyzed
within the legitimization-proximization paradigm (Cap 2006, 2008, 2010). Table
1 includes the most frequent of the spatial proximization items in the 2001–2010
corpus of the US presidential addresses on the American anti-terrorist policies and
actions.3 Quantifiable items appear in square brackets and include combinations of
words separated by slashes with the head word. For example, the item [free peo-
ple/nations/countries/societies/world] includes the five following combinations,
all of which contribute to the general count of the first category: free people, free
nations, free countries, free societies, free world. The italicized phrases indicate
parts that allow synonymous phrases to fill in the item and thus increase its count.
For example, the item [destroy an IDC] in category 4 subsumes several quantifi-
able variations, such as destroy America, destroy our land or destroy the free and
democratic world.4
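The counting convention just described can be illustrated with a short sketch. It is not taken from Cap (2013): the function names expand_item and count_item, the whole-word, case-insensitive matching, and the use of two sentences from the AEI speech quoted in this chapter as sample text are all assumptions made for illustration. Placeholder elements such as “an IDC” are not resolved here.

```python
import re
from itertools import product


def expand_item(item):
    """Expand a bracketed key item into its countable variants.

    '[free people/nations/countries/societies/world]' yields
    'free people', 'free nations', and so on. Commas separate
    alternatives; slashes separate options for a single word slot.
    """
    variants = []
    for alternative in item.strip("[]").split(","):
        # Each word slot may offer several slash-separated options.
        options = [word.split("/") for word in alternative.split()]
        variants.extend(" ".join(combo) for combo in product(*options))
    return variants


def count_item(item, text):
    """Count whole-word, case-insensitive hits for all variants of an item."""
    total = 0
    for variant in expand_item(item):
        pattern = r"\b" + re.escape(variant) + r"\b"
        total += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return total


# Two sentences from the AEI speech as quoted in this chapter.
SAMPLE = (
    "The world has a clear interest in the spread of democratic values, "
    "because stable and free nations do not breed the ideologies of murder. "
    "Saddam Hussein and his weapons of mass destruction are a direct threat "
    "to our people and to all free people."
)

free_count = count_item("[free people/nations/countries/societies/world]", SAMPLE)
```

For the sample text, the item [free people/nations/countries/societies/world] is counted twice (free nations, free people). Overlapping variants, such as Saddam Hussein, Saddam and Hussein in category 2, would each be counted by this sketch; how such overlaps were treated in the original study is not stated here.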
The framework and its 6 categories capture not only the initial arrangement of
the Discourse Space (categories 1, 2), but also (in 3, 4) the shift leading to a clash
between the out-group (ODC) and the in-group (IDC), as well as the (anticipated)
effects of the clash (5, 6). The third category, central to the design of the frame-
work, sets “traditional” deictic expressions such as personal pronouns to work
pragmatically together with the other elements of the superordinate VP. The VP in
the third category holds a deictic status; apart from denoting the static DS entities
(marked by pronominals), it indexes their movement, which in turn establishes
the target perspective construed by the speaker. Category 3 can thus process and
yield counts from complex lexico-grammatical phrases, such as “they
[terrorists] have set their course to confront us and our civilization” (G.W. Bush, 17
March 2003). In this phrase, the person deixis (they) combines with the following
VP into a complex deictic structure marking both the antagonistic entity and its
movement toward home entities in the deictic center.
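How a person-deictic subject and a motion VP might be captured together as one complex deictic structure can be sketched with a regular expression. This is only an illustration under stated assumptions: the short lists ODC_SUBJECTS and MOTION_VPS are hand-picked from categories 2 and 3 of Table 1, the pattern (subject, optional bracketed gloss, motion VP) is hypothetical, and nothing here reproduces the procedure actually used in Cap (2013).

```python
import re

# Hypothetical, hand-picked fragments based on categories 2 and 3 of Table 1.
ODC_SUBJECTS = ["they", "terrorists", "the Iraqi regime", "Saddam Hussein"]
MOTION_VPS = ["have set their course", "are moving", "are heading", "move", "head"]


def build_pattern(subjects, verb_phrases):
    """Compile a regex: ODC subject, optional [gloss], then a motion VP."""
    # Longer alternatives come first so that multi-word phrases win.
    subj = "|".join(re.escape(s) for s in sorted(subjects, key=len, reverse=True))
    vp = "|".join(re.escape(v) for v in sorted(verb_phrases, key=len, reverse=True))
    return re.compile(
        rf"\b(?P<odc>{subj})\b(?:\s*\[[^\]]*\])?\s+(?P<vp>{vp})\b",
        flags=re.IGNORECASE,
    )


def find_deictic_structures(text):
    """Return (ODC subject, motion VP) pairs found in the text."""
    pattern = build_pattern(ODC_SUBJECTS, MOTION_VPS)
    return [(m.group("odc"), m.group("vp")) for m in pattern.finditer(text)]


# The example phrase quoted above (G.W. Bush, 17 March 2003).
EXAMPLE = "they [terrorists] have set their course to confront us and our civilization"
pairs = find_deictic_structures(EXAMPLE)
```

Applied to the quoted phrase, the sketch pairs the pronoun they with the VP have set their course. A pattern of this kind only approximates the analyst's judgment; pronoun reference in particular would still have to be checked manually.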
The spatial proximization framework (as well as the temporal and axiological
frameworks, Cap 2013) endorses the (ii) linguistic/lexico-grammatical contribu-
tion of the legitimization-proximization model. The model makes it possible to
extract quantifiable lexical evidence of the strategic use of different proximization
strategies within different timeframes of policy legitimization. Most importantly,
it can account quantitatively for cases, such as the one discussed below, where one
proximization strategy is dropped in favor of another for contextual reasons.
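The kind of quantitative comparison across timeframes mentioned here can be sketched as follows. The sketch rests on several assumptions: LEXICON is a hypothetical mini-lexicon (the spatial items are drawn from Table 1, the value-oriented items from words this chapter itself lists for the Whitehall speech, not from the axiological framework in Cap 2013), normalization per 1,000 words is my own choice, and the two sample sentences are taken from the speeches quoted in this chapter.

```python
import re

# Hypothetical mini-lexicon; a real study would use the full frameworks.
LEXICON = {
    "spatial": ["threat", "danger", "weapons of mass destruction"],
    "axiological": ["freedom", "justice", "dictatorship", "radicalism"],
}


def count_words(text):
    """Count word tokens with a simple regex tokenizer."""
    return len(re.findall(r"\b\w+\b", text))


def strategy_rates(text, lexicon=LEXICON, per=1000):
    """Return lexicon hits per `per` words for each strategy."""
    n_words = count_words(text)
    rates = {}
    for strategy, items in lexicon.items():
        hits = sum(
            len(re.findall(r"\b" + re.escape(item) + r"\b", text, flags=re.IGNORECASE))
            for item in items
        )
        rates[strategy] = per * hits / n_words if n_words else 0.0
    return rates


def dominant_strategy(text, lexicon=LEXICON):
    """Name the strategy with the highest normalized rate."""
    rates = strategy_rates(text, lexicon)
    return max(rates, key=rates.get)


# One sentence from each speech, as quoted in this chapter.
AEI = (
    "Saddam Hussein and his weapons of mass destruction are a direct threat "
    "to our people and to all free people."
)
WHITEHALL = (
    "By advancing freedom in the greater Middle East, we help end a cycle of "
    "dictatorship and radicalism that brings millions of people to misery and "
    "brings danger to our own people."
)

early, late = dominant_strategy(AEI), dominant_strategy(WHITEHALL)
```

On these two sentences the sketch labels the earlier text as predominantly spatial and the later one as predominantly axiological. Single sentences prove nothing, of course; the point is only to show the shape of a count that could be run over whole sub-corpora for set periods of time.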

3 The corpus contains 402 texts (601,856 words) of speeches and remarks, downloaded
from the White House website http://www.whitehouse.gov in January 2011. It includes
only the texts matching at least two of the three issue tags: defense, foreign policy,
homeland security.
4 See Cap (2013: 108–109) for details. See also the two other frameworks, temporal
(2013: 116) and axiological (2013: 122), which I do not have space to discuss here.

5. A case study

As has been mentioned, the main application of the legitimization-proximization
model so far has been to critical studies of state political discourse seeking legiti-
mization of interventionist preventive measures against an external threat. In what
follows, I give an example of this application, discussing instances of the American
discourse of the war-on-terror. Specifically, I outline what proximization strategies
were used to legitimize the US government’s decision to go to war in Iraq (March
2003), and what adjustments in the use of the strategies were made later (from
November 2003) as a result of contextual changes which had taken place in the
meantime.

5.1. Initiating legitimization through proximization

Below I look at parts of G.W. Bush’s speech at the American Enterprise Institute,
which was delivered on February 26, 2003. The speech took place only three weeks
before the first US and coalition troops entered Iraq on March 19, and has often
been considered (Silberstein 2004) a manifesto of the Iraq war. The goal of the
speech was to list direct reasons for the intervention, while also locating it in the
global context of the war-on-terror declared by G.W. Bush on the night of the 9/11
attacks. The realization of this goal involved a strategic use of various lexico-gram-
matical forms reflecting different proximization strategies.
Providing his rationale for war, President Bush had to confront the kind of pub-
lic reluctance faced by many of his White House predecessors: how to legitimize
the US involvement in military action in a far-away place, among a far-away peo-
ple, of whom the American people knew little (Bacevich 2010). The AEI speech
is remarkable in its consistent continuity of attempts to overcome this reluctance.
It applies spatio-temporal and axiological proximization strategies, which are per-
formed in diligently designed pragmatic patterns drawing from more general con-
ceptual premises for legitimization:
We are facing a crucial period in the history of our nation, and of the civilized world. […]
On a September morning, threats that had gathered for years, in secret and far away, led
to murder in our country on a massive scale. As a result, we must look at security in a
new way, because our country is a battlefield in the first war of the 21st century. […] We
learned a lesson: the dangers of our time must be confronted actively and forcefully,
before we see them again in our skies and our cities. And we will not allow the flames
of hatred and violence in the affairs of men. […] The world has a clear interest in the
spread of democratic values, because stable and free nations do not breed the ideologies
of murder. […] Saddam Hussein and his weapons of mass destruction are a direct threat
to our people and to all free people. […] My job is to protect the American people. When
it comes to our security and freedom, we really don’t need anybody’s permission. […]
We’ve tried diplomacy for 12 years. It hasn’t worked. Saddam Hussein hasn’t disarmed,
he’s armed. Today the goal is to remove the Iraqi regime and to rid Iraq of weapons of
mass destruction. […] The liberation of millions is the fulfillment of America’s found-
ing promise. The objectives we’ve set in this war are worthy of America, worthy of all
the acts of heroism and generosity that have come before (Bush 2003a).

In a nutshell, the AEI speech states that there are WMD5 in Iraq and that, given
historical context and experience, ideological characteristics of the adversary as
opposed to American values and national legacy, and G.W. Bush’s obligations as
sitting US president, there is a case for legitimate military intervention. This
complex picture involves historical flashbacks, as well as descriptions of the cur-
rent situation, which both engage proximization strategies. These strategies operate
at two interrelated levels, which can be described as diachronic and synchronic.
At the diachronic level, Bush evokes ideological representations of the remote
past, which are “proximized” to underline the continuity and steadfastness of pur-
pose, thus linking with and sanctioning current actions as acts of faithfulness to
long-accepted principles and values. An example is the final part: “[t]he liberation
is […] promise. The objectives […] have come before”. It launches a temporal
analogy axis which connects a past reference point (the founding of America) with
the present point, creating a common conceptual space for both the proximized
historical acts of heroism and the current and/or prospective acts construed as their
natural follow-ups. This kind of legitimization, performed by mostly temporal and
axiological proximization (the originally past values become the here and now
premises for prompt action6), draws, in many ways, upon the socio-psychological
predispositions of the US addressee (Dunmire 2011). On the pragmatic-lexical
plane, the job of establishing the link and thus winning credibility is performed by
sequences of assertions, which fall within the addressee’s “latitude of acceptance”
(Jowett and O’Donnell 1992).7 The assertions reveal different degrees of accepta-
bility, from being indisputably and universally acceptable (“My job is […]”; “The
liberation of millions […]”) to being acceptable due to the credibility developed
step-by-step within a “fact-belief series” (“We’ve tried diplomacy for 12 years
[FACT] […] he’s armed [BELIEF]”), but none of them is inconsistent with the key
predispositions of the addressee.

5 Weapons of mass destruction.
6 This is a secondary variant of axiological proximization. As will be shown, axiological
proximization mostly involves the adversary (ODC); antagonistic values are “dormant”
triggers for a possible ODC impact.
7 Jowett and O’Donnell (1992) posit that the best credibility and thus legitimization
effects can be expected if the speaker produces her message in line with the psychological,
social, political, cultural, etc., predispositions of the addressee. However, since
full compliance is almost never possible, it is essential that a novel message is at least
tentatively or partly acceptable; then, its acceptability and the speaker’s credibility tend
to increase over time.

At the synchronic level, the historical flashbacks are not completely aban-
doned, but they involve proximization of near history and the main legitimization
premise is not the (continuing) ideological commitments, but the direct physical
threats looming over the country (“a battlefield”, in President Bush’s words). As
the threats require a fast and strong pre-emptive response, the main proximization
strategy operating at the synchronic level is spatial proximization, often encompassing
a temporal element. Its task is to raise fears of the imminence of the threat,
which might appear external and distant, but is in fact able to materialize
at any time. The lexico-grammatical carriers of spatial proximization include such
items and phrases as “secret and far away”, “all free people”, “stable and free
nations”, “Saddam Hussein and his weapons of mass destruction”, etc., which force
dichotomous, “good against evil” representations of the IDCs (America, Western
[free, democratic] world) and the ODCs (Saddam Hussein, Iraqi regime, terrorists),
located at a relative distance from each other. This geographical and geopolitical
distance is symbolically construed as shrinking, as, on the one hand, the ODC
entities cross the DS towards its deictic center and, on the other, the center (IDC)
entities declare a reaction. The ODC shift is enacted by forced inference and met-
aphorization. The inference involves an analogy to 9/11 (“On a September morn-
ing […]”), whereby the event stage is construed as facing another physical impact,
whose (“current”) consequences are scrupulously described (“before we see them
[flames] again in our skies and our cities”). This fear appeal is strengthened by the
FIRE metaphor, which contributes the imminence and the speed of the external
impact (Hart 2010).
While all spatial proximization in the text draws upon the presumed WMD
presence in Iraq – and its potential availability to terrorists for acts far more
destructive than the 9/11 attacks – Bush does not disregard the possibility of having
to resort to an alternative rationale for war in the future. Thus the speech contains
supporting ideological premises, tied to the principal premise. An example is the
use of axiological proximization in “The world has a clear interest in the spread of
democratic values, because stable and free nations do not breed the ideologies of
murder”. This ideological argument is not synonymous with Bush’s proximization
of remote history we have seen before, since its current line subsumes acts of the
adversary rather than his and/or America’s own acts. It involves a more typical
axiological proximization, where an initially ideological conflict changes, over
time, into a physical clash. Notably, in its ideological-physical duality it forces
a spectrum of speculations over whether the current threat is still ideological or
already physical. Since any conclusion from these speculations can be denied in
the prospective discourse, the example quoted (“The world …”) shows how prox-
imization can interrelate, at the pragmalinguistic level, with the mechanism of
implicature (Grice 1975).

5.2. Maintaining legitimization through adjustments in proximization strategies

Political legitimization pursued in temporally extensive contexts – such as the
timeframe of the Iraq war – often involves redefinition of the initial legitimiza-
tion premises and coercion patterns and proximization is very well suited to enact
these redefinitions in discourse. This seems to promise a vast applicability of
the legitimization-proximization model as a truly dynamic cognitive-pragmatic
development in CDA. The legitimization obtained in the AEI speech and, mainly,
how the unfolding geopolitical context has put it to the test is an illuminating case in
point. Recall that although Bush has made the WMD factor the central premise
for the Iraq war, he has left an emergency door half-open to be able to reach for
an alternative rationale. Come November 2003 (just eight months into the Iraq
war), and Bush’s pro-war rhetoric adopts (or rather has to adopt) such an emer-
gency alternative rationale, as it becomes evident that there were never weapons of
mass destruction in Iraq, at least not in the ready-to-use product sense. The change
of Bush’s stance is a swift change from strong fear appeals and spatio-temporal
proximization to a more subtle ideological argument for legitimization, involving
predominantly axiological proximization. The following quote from G.W. Bush’s
Whitehall Palace address of November 19 is a good illustration:

By advancing freedom in the greater Middle East, we help end a cycle of dictatorship
and radicalism that brings millions of people to misery and brings danger to our own
people. By struggling for justice in Iraq, Burma, in Sudan, and in Zimbabwe, we give
hope to suffering people and improve the chances for stability and progress. Had we
failed to act, the dictator’s programs for weapons of mass destruction would continue
to this day. Had we failed to act, Iraq’s torture chambers would still be filled with vic-
tims, terrified and innocent. […] For all who love freedom and peace, the world without
Saddam Hussein’s regime is a better and safer place (Bush 2003b).

The now dominant axiological proximization involves a dense concentration of
ideological and value-oriented lexical items (such as freedom, justice, stability,
progress, peace, vs. dictatorship, radicalism) as well as items/phrases marking
the human dimension of the conflict (e. g. misery, suffering people, terrified vic-
tims, vs. the world [being] a better and safer place). All these lexico-grammatical
forms serve to construe, as in the case of the AEI address, clearly dichotomous
representations of the DS “home” and “peripheral/adversarial” entities (IDCs
vs. ODCs), and the vision of impact upon the DS “home” entities. In contrast to
the AEI speech, however, all the entities (both IDCs and ODCs) are construed in
abstract, rather than physical, tangible terms, as the particular lexical items (dicta-
torship, radicalism) are not explicitly but only inferentially attributed to concrete
groups. Proximization in the Whitehall speech is thus mainly a proximization of
antagonistic values, and not so much of physical entities recognized as embodiments
of these values. The consequences for maintaining the legitimization stance
which began with the AEI address are enormous.
First, there is no longer a commitment to material threat posed by a physical
entity. Second, the release from this commitment, although it leads to a new premise
for war, does not disqualify the original (WMD) premise, since the antagonistic
“peripheral” values retain a capacity to materialize within the deictic center (see
“… a cycle of dictatorship and radicalism that brings millions of people to misery
and brings danger to our own people”, reiterating “The world has a clear interest
in the spread of democratic values, because stable and free nations do not breed the
ideologies of murder” from the AEI speech). Third, as ideological principles pos-
sess a global appeal, the socio-ideological argument helps extend the spectrum of
the US (military) engagement (Burma, Sudan, Zimbabwe), which in turn forces the
construal of failure to detect WMD in Iraq as merely an unlucky incident amongst
other (successful) operations.
Add to these general factors the power of legitimization ploys in specific prag-
malinguistic constructs (“programs for weapons of mass destruction”8, the enumer-
ation of the “new” fields of engagement [Burma, etc.], the always effective appeals
for solidarity in compassion [“terrified victims” in “torture chambers”]) and there
are reasons to conclude that the fall 2003 change to essentially axiological discourse
(subsuming axiological proximization) has helped a lot toward saving credibility
and thus maintaining legitimization of not only the Iraq war, but the later anti-ter-
rorist campaigns as well. The flexible interplay and the discursive switches between
spatial and axiological proximization (aided by temporal projections) in the early
stages of the US anti-terrorist policy rhetoric have made a major contribution.

8 The nominal phrase “[Iraq’s] programs for WMD” is essentially an implicature able
to legitimize, in response to contextual needs, any of the following inferences: “Iraq
possesses WMD”, “Iraq is developing WMD”, “Iraq intends to develop WMD”, “Iraq
intended to develop WMD”, and more. The phrase was among G.W. Bush’s rhetorical
favorites in later stages of the Iraq war, when the original premises for war were called
into question.

6. Conclusion: Proximization as a method and territories for a pragmatic CDA

The legitimization-proximization model is where pragmatics, spatial cognition,
and CDA meet in a conspicuous way. While drawing on the essentially
cognitive-anthropological theories of discourse, proximization provides the conceptual
representation of discourse space with a pragmatic element involving speaker’s
awareness of the changing context. In its account of discourse, the model focuses
on the strategic, ideological and goal-oriented essence of construals of the near
and the remote. Specifically, it focuses on how the imagining of the closeness and
remoteness can be manipulated in the political sphere and bound up with fear, secu-
rity and conflict. At the linguistic level, it draws from critical-corpus approaches
(cf. Figure 1) to offer a rigorous scrutiny of the lexical and grammatical choices
which (political) speakers make to enact the conceptual affiliations and distinc-
tions. Along with the other modern developments in CDA (especially the cognitive
models, such as critical metaphor analysis; cf. Figure 1), the legitimization-proxi-
mization model is an example of how CDA realizes its commitments by engaging
cognitive, socio-psychological and anthropological concepts and approaches in a
joint work with a text-analytical pragmalinguistic apparatus. As a method, it struc-
tures these concepts and tools in a hierarchical analytic mechanism processing data
in a comprehensive, abductive manner. At the top level, cognitive and anthropo-
logical categories are responsible for the conceptual framework of analysis. This
involves defining two geopolitically and ideologically disparate camps (in-group
vs. out-group) in the Discourse Space and setting them at a relative distance from
each other. This distance is symbolically construed as shrinking; first, because the
out-group aims to encroach on the in-group’s territory (both physical and ideolog-
ical), second, because the in-group declares a preventive reaction. The ability to
capture this shift in the setup of the Discourse Space in linguistic terms constitutes
the central methodological advantage of the legitimization-proximization model.
As has been documented in the case study, the model expresses this conceptual
change in terms of pragmatically-minded variations, at the bottom level, in the
use of specific lexico-grammatical constructs, such as deictic builders of spatial
and ideological dichotomies. While the case study in the present chapter has been
essentially qualitative, the legitimization-proximization model opens up further
vistas to endorse the findings (such as the change from spatial to axiological prox-
imization, or, generally, from the rhetoric of direct physical threat to a milder
rhetoric of ideological conflict) in rigorous quantitative analysis. This is possible
by engaging the spatial proximization framework (cf. section 3), together with the
axiological proximization framework (Cap 2013), to produce counts of specific
lexico-grammatical items in set periods of time.
The landscape of discourses where such transdisciplinary, qualitative-quan-
titative projects are possible is huge. The domains addressed in CDA in the last
30 years have included racism, xenophobia, national identity, gender identity and
inequality, media discourse, discourses of national vs. international politics, and
many more. This list, by no means exhaustive, gives a sense of the spectrum of
discourses where models such as legitimization-proximization can contribute.
Since the central commitments of CDA include exploring the many ways in which
ideologies and identities are reflected, (re)-enacted, negotiated, modified, repro-
duced, etc., in discourse, any “doing” of CDA must involve studying, in conceptual
terms, the “original positioning” of the different ideologies and identities, and, in
the majority of cases, studying also the “target positioning”, that is the conceptual
change which the analyst claims is taking place through the speaker’s strategic use
of discourse. Doing CDA thus means handling issues of the original arrangement
of the Discourse Space, and most notably, the core issue of the DS symbolic re-ar-
rangement. As such, any CDA practice clearly needs a pragmalinguistic approach
to account for the original and later the target setup of the DS. At the heart of this
account are bottom-level, quantifiable lexico-grammatical choices responsible for
strategic enactment of the conceptual shifts. The anti-terrorist discourse, such as
analyzed in the case study, clearly contains a lot of lexical material that is used to
force such strategic shifts. Among other domains and discourses, the most analytically
promising appear to be those in which distinctions between different ideologies
and identities are enacted in a particularly clear-cut and appealing manner – to
construe strong oppositions between “better” and “worse” ideologies or identities.
This applies to the discourses of xenophobia, racism, nationalism or social exclu-
sion, all of which presuppose a rigid in-group vs. out-group distinction, arguing for
a growing threat from the out-group. Each of these discourses constitutes a fruitful
field for critical-pragmatic explorations. In that sense, CDA not only draws from
pragmatics, but also takes it to new and exciting territories.

References

Atkinson, J. Maxwell and John Heritage (eds.)
1984 Structures of Social Action: Studies in Conversation Analysis. Cambridge:
Cambridge University Press.
Bacevich, Andrew
2010 Washington Rules: America’s Path to Permanent War. New York, N.Y.: Metro-
politan Books.
Baker, Paul and Anthony McEnery
2005 A corpus-based approach to discourses of refugees and asylum seekers in UN
and newspaper texts. Journal of Language and Politics 4: 197–226.
Baker, Paul
2006 Using Corpora in Discourse Analysis. London: Continuum.
Baker, Paul, Costas Gabrielatos, Majid Khosravinik, Michał Krzyżanowski, Anthony
McEnery and Ruth Wodak
2008 A useful methodological synergy? Combining critical discourse analysis and
corpus linguistics to examine discourses of refugees and asylum seekers in the
UK press. Discourse & Society 19: 273–306.
Benke, Gertraud
2000 Diskursanalyse als sozialwissenschaftliche Untersuchungsmethode. SWS Rund-
schau 2: 140–162.
Bhatia, Vijay
2004 Worlds of Written Discourse. A Genre-Based View. London: Continuum.
Billig, Michael
2008 The language of critical discourse analysis: The case of nominalization. Dis-
course & Society 19: 783–800.
Blommaert, Jan
2001 Context is/as critique. Critique of Anthropology 21: 13–32.
Breeze, Ruth
2011 Critical discourse analysis and its critics. Pragmatics 21: 493–525.
Bublitz, Wolfram, Andreas H. Jucker and Klaus P. Schneider
2011 Preface to the handbook series. In: Wolfram Bublitz and Neal Norrick (eds.),
Handbooks of Pragmatics, Volume 1: Foundations of Pragmatics, v–vii. Berlin:
de Gruyter Mouton.
Bush, George W.
2003a The President’s address to the American Enterprise Institute, February 26,
2003. http://www.whitehouse.gov (accessed January 1, 2011).
Bush, George W.
2003b The President’s address at London Whitehall Palace, November 19, 2003.
http://www.whitehouse.gov (accessed January 1, 2011).
Cap, Piotr and Urszula Okulska (eds.)
2013 Analyzing Genres in Political Communication: Theory and Practice. Amster-
dam: John Benjamins.
Cap, Piotr
2006 Legitimization in Political Discourse: A Cross-disciplinary Perspective on the
Modern US War Rhetoric. Newcastle: Cambridge Scholars Press.
Cap, Piotr
2008 Towards the proximization model of the analysis of legitimization in political
discourse. Journal of Pragmatics 40: 17–41.
Cap, Piotr
2010 Axiological aspects of proximization. Journal of Pragmatics 42: 392–407.
Cap, Piotr
2011 Micropragmatics and macropragmatics. In: Wolfram Bublitz and Neal Nor-
rick (eds.), Handbooks of Pragmatics, Volume 1: Foundations of Pragmatics,
51–75. Berlin: de Gruyter Mouton.
Cap, Piotr
2013 Proximization: The Pragmatics of Symbolic Distance Crossing. Amsterdam:
John Benjamins.
Cap, Piotr
2017 The Language of Fear: Communicating Threat in Public Discourse. Basing-
stoke: Palgrave.
Charteris-Black, Jonathan
2004 Corpus Approaches to Critical Metaphor Analysis. Basingstoke: Palgrave.
Chilton, Paul and Christina Schaeffner
2002 Politics as Text and Talk: Analytic Approaches to Political Discourse. Amster-
dam: John Benjamins.
Chilton, Paul
2004 Analysing Political Discourse: Theory and Practice. London: Routledge.
Chilton, Paul
2005 Missing links in mainstream CDA: Modules, blends and the critical instinct.
In: Ruth Wodak and Paul Chilton (eds.), A New Agenda in (Critical) Discourse
Analysis, 19–51. Amsterdam: John Benjamins.
Chilton, Paul
2011a Still something missing in CDA. Discourse Studies 13: 769–781.
Chilton, Paul
2011b Deictic Space Theory (DST): The fundamental theory and its applications.
Paper at the 42nd Poznań Linguistic Meeting, Poznań, 1–3 May 2011.
Chilton, Paul
2014 Language, Space and Mind. Cambridge: Cambridge University Press.
Chouliaraki, Lilie and Norman Fairclough
1999 Discourse in Late Modernity. Rethinking Critical Discourse Analysis. Edin-
burgh: Edinburgh University Press.
Chovanec, Jan
2010 Legitimation through differentiation: Discursive construction of Jacques Le
Worm Chirac as an opponent to military action. In: Urszula Okulska and Piotr
Cap (eds.), Perspectives in Politics and Discourse, 61–82. Amsterdam: John
Benjamins.
Cienki, Alan, Bertie Kaal and Isa Maks
2010 Mapping world view in political texts using Discourse Space Theory: Met-
aphor as an analytical tool. Paper Presented at RaAM 8 Conference, Vrije
Universiteit Amsterdam.
Dunmire, Patricia
2011 Projecting the Future through Political Discourse: The Case of the Bush Doc-
trine. Amsterdam: John Benjamins.
Fairclough, Norman and Ruth Wodak
1997 Critical discourse analysis. In: Teun van Dijk (ed.), Discourse as Social Inter-
action, 258–284. London: Sage.
Fairclough, Norman
1989 Language and Power. London: Longman.
Fairclough, Norman
1995 Critical Discourse Analysis. London: Longman.
Fetzer, Anita and Peter Bull
2013 Political interviews in context. In: Piotr Cap and Urszula Okulska (eds.), Ana-
lyzing Genres in Political Communication: Theory and Practice, 73–100.
Amsterdam: John Benjamins.
Fetzer, Anita
2002 Communicative intentions in context. In: Anita Fetzer and Christiane Meier-
kord (eds.), Rethinking Sequentiality: Linguistics Meets Conversational Inter-
action, 37–69. Amsterdam: John Benjamins.
Filardo Llamas, Laura
2010 Discourse worlds in Northern Ireland: The legitimisation of the 1998 Agree-
ment. In: Katy Hayward and Catherine O’Donnell (eds.), Political Discourse
and Conflict Resolution. Debating Peace in Northern Ireland, 62–76. London:
Routledge.
Filardo-Llamas, Laura, Christopher Hart and Bertie Kaal (eds.)
2016 Space, Time and Evaluation in Ideological Discourse. London: Routledge.
Flowerdew, John and John Richardson (eds.)
2016 Routledge Handbook of Critical Discourse Studies. London: Routledge.
Fowler, Roger
1991 Language in the News. London: Routledge.
Fowler, Roger, Gunther Kress and Tony Trew
1979 Language and Control. London: Routledge.
Gabrielatos, Costas and Paul Baker
2008 Fleeing, sneaking, flooding: A corpus analysis of discursive constructions of
refugees and asylum seekers in the UK press 1996–2005. Journal of English
Linguistics 36: 5–38.
Giltrow, Janet and Dieter Stein (eds.)
2009 Genres in the Internet. Amsterdam: John Benjamins.
Goatly, Andrew
2007 Washing the Brain: Metaphor and Hidden Ideology. Amsterdam: John Benja-
mins.
Grice, Herbert Paul
1975 Logic and conversation. In: Peter Cole and Jerry L. Morgan (eds.), Syntax and
Semantics 3: Speech Acts, 41–58. New York, NY: Academic Press.
Halliday, Michael A.K.
1994 Introduction to Functional Grammar. London: Arnold.
Hart, Christopher (ed.)
2011a Critical Discourse Studies in Context and Cognition. Amsterdam: John Benja-
mins.
Hart, Christopher
2011b Moving beyond metaphor in the Cognitive Linguistic Approach to CDA: Con-
strual operations in immigration discourse. In: Christopher Hart (ed.), Critical
Discourse Studies in Context and Cognition, 171–192. Amsterdam: John Ben-
jamins.
Hart, Christopher
2011c Force-interactive patterns in immigration discourse: A Cognitive Linguistic
approach to CDA. Discourse & Society 22: 269–286.
Hart, Christopher
2013a Event-construal in press reports of violence in political protests: A Cognitive
Linguistic Approach to CDA. Journal of Language and Politics 12: 400–423.
Hart, Christopher
2013b Constructing contexts through grammar: Cognitive models and conceptualis-
ation in British Newspaper reports of political protests. In: John Flowerdew
(ed.), Discourse in Context, 159–184. London: Continuum.
Hart, Christopher
2014 Discourse, Grammar and Ideology: Functional and Cognitive Perspectives.
London: Bloomsbury.
Hart, Christopher and Piotr Cap
2014 Introduction. In: Christopher Hart and Piotr Cap (eds.), Contemporary Critical
Discourse Studies. London: Bloomsbury.
Hodge, Robert and Gunther Kress
1993 Language as Ideology. London: Routledge.
Jowett, Garth S. and Victoria O’Donnell
1992 Propaganda and Persuasion. Newbury Park, CA: Sage.
Koller, Veronika
2004 Metaphor and Gender in Business Media Discourse: A Critical Cognitive
Study. Basingstoke: Palgrave.
Kopytowska, Monika
2013 Blogging as the mediatization of politics and a new form of social Interaction:
a case study of ‘proximization dynamics’ in Polish and British political blogs.
In: Piotr Cap and Urszula Okulska (eds.), Analyzing Genres in Political Com-
munication: Theory and Practice, 379–422. Amsterdam: John Benjamins.
Kress, Gunther and Theo van Leeuwen
1996 Reading Images: The Grammar of Visual Design. London: Routledge.
Lemke, Jay
1995 Textual Politics: Discourse and Social Dynamics. London: Taylor & Francis.
Lewiński, Marcin and Steve Oswald
2013 When and how do we deal with straw men? A normative and cognitive prag-
matic account. Journal of Pragmatics 59: 164–177.
Luke, Allan
2002 Beyond science and ideological critique: Developments in critical discourse
analysis. Annual Review of Applied Linguistics 22: 96–110.
Maillat, Didier and Steve Oswald
2009 Defining manipulative discourse: The pragmatics of cognitive illusions. Inter-
national Review of Pragmatics 1: 348–370.
Maillat, Didier and Steve Oswald
2011 Constraining context: A pragmatic account of cognitive manipulation. In:
Christopher Hart (ed.), Critical Discourse Studies in Context and Cognition,
65–80. Amsterdam: John Benjamins.
Marin Arrese, Juana
2011 Effective vs. epistemic stance and subjectivity in political discourse: Legit-
imising strategies and mystification of responsibility. In: Christopher Hart
(ed.), Critical Discourse Studies in Context and Cognition, 193–224. Amster-
dam: John Benjamins.
Martin, James R. and David Rose
2003 Working with Discourse: Meaning beyond the Clause. London: Continuum.
Martin, James R. and David Rose
2008 Genre Relations: Mapping Culture. London: Equinox.
Martin, James R. and Peter White
2007 The Language of Evaluation: Appraisal in English. Basingstoke and New
York: Palgrave.
Martin, James R.
2004 Positive discourse analysis: Solidarity and change. Revista Canaria de Estu-
dios Ingleses 49: 179–202.
Mautner, Gerlinde
2007 Mining large corpora for social information: The case of ‘elderly’. Language
in Society 36: 51–72.
Meyer, Michael
2001 Between theory, method and politics: Positioning of the approaches to CDA.
In: Ruth Wodak and Michael Meyer (eds.), Methods of Critical Discourse
Analysis, 14–31. London: Sage.
Musolff, Andreas
2004 Metaphor and Political Discourse: Analogical Reasoning in Debates about
Europe. Basingstoke: Palgrave.
Musolff, Andreas
2010 Political metaphor and bodies politic. In: Urszula Okulska and Piotr Cap (eds.),
Perspectives in Politics and Discourse, 23–42. Amsterdam: John Benjamins.
O’Halloran, Kieran
2003 Critical Discourse Analysis and Language Cognition. Edinburgh: Edinburgh
University Press.
O’Halloran, Kieran
2010 How to use corpus linguistics in the study of media discourse. In: Anne
O’Keeffe and Michael McCarthy (eds.), The Routledge Handbook of Corpus
Linguistics, 563–576. Abingdon: Routledge.
Partington, Alan
2006 Metaphors, motifs, and similes across discourse types: Corpus assisted dis-
course studies (CADS) at work. In: Anatol Stefanowitsch and Stefan Gries
(eds.), Corpus-Based Approaches to Metaphor and Metonymy, 267–304. Ber-
lin: Mouton de Gruyter.
Reisigl, Martin and Ruth Wodak
2001 Discourse and Discrimination: Rhetorics of Racism and Anti-Semitism. Lon-
don: Routledge.
Richardson, John and Ruth Wodak
2009 The impact of visual racism: Visual arguments in political leaflets of Austrian
and British far-right parties. Controversia 6: 45–77.
Rogers, Rebecca, Elizabeth Malancharuvil-Berkes, Melissa Mosley, Diane Hui and Glynis
O’Garro
2005 Critical discourse analysis in education: A review of the literature. Review of
Educational Research 75: 365–416.
Saussure, Louis de and Peter Schulz (eds.)
2005 Manipulation and Ideologies in the Twentieth Century: Discourse, Language,
Mind. Amsterdam: John Benjamins.
Silberstein, Sandra
2004 War of Words. London: Routledge.
Slembrouck, Stef
2001 Explanation, interpretation and critique in the analysis of discourse. Critique
of Anthropology 21: 33–57.
Stocchetti, Matteo and Karin Kukkonen (eds.)
2011 Images in Use. Amsterdam: John Benjamins.
Stubbs, Michael
1997 Whorf’s children: Critical comments on critical discourse analysis. In: Ann
Ryan and Alison Wray (eds.), Evolving Models of Language, 100–116. Cleve-
don: Multilingual Matters.
Stubbs, Michael
2002 Two quantitative methods of studying phraseology in English. International
Journal of Corpus Linguistics 7: 215–244.
Stubbs, Michael
2004 Language corpora. In: Alan Davies and Catherine Elder (eds.), Handbook of
Applied Linguistics, 106–132. Oxford: Blackwell.
Titscher, Stefan, Michael Meyer, Ruth Wodak and Eva Vetter
2000 Methods of Text and Discourse Analysis. London: Sage.
Van Dijk, Teun
1999 Critical Discourse Analysis and Conversation Analysis. Discourse & Society
10: 459–470.
Van Dijk, Teun
2003 Critical discourse analysis? In: Deborah Schiffrin, Deborah Tannen and Heidi
Hamilton (eds.), The Handbook of Discourse Analysis, 352–371. Oxford:
Blackwell.
Van Dijk, Teun
2006 Discourse and manipulation. Discourse & Society 17: 359–383.
Van Dijk, Teun
2008 Discourse and Context: A Socio-Cognitive Approach. Cambridge: Cambridge
University Press.
Van Dijk, Teun
2009 Society and Discourse: How Social Contexts Influence Text and Talk. Cam-
bridge: Cambridge University Press.
Van Eemeren, Frans and Rob Grootendorst
1992 Argumentation, Communication, and Fallacies. A Pragma-Dialectical Per-
spective. Hillsdale, NJ: Lawrence Erlbaum.
Van Leeuwen, Theo
1996 The representation of social actors. In: Carmen Rosa Caldas-Coulthard and
Michael Coulthard (eds.), Texts and Practices: Readings in Critical Discourse
Analysis, 32–70. London: Routledge.
Van Leeuwen, Theo
1999 Speech, Music, Sound. London: Palgrave.
Van Leeuwen, Theo
2000 Visual racism. In: Martin Reisigl and Ruth Wodak (eds.), The Semiotics of
Racism – Approaches in Critical Discourse Analysis, 35–56. Vienna: Passagen
Verlag.
Van Leeuwen, Theo
2005 Introducing Social Semiotics. London: Routledge.
Ventola, Eija, Cassily Charles and Martin Kaltenbacher
2004 Perspectives on Multimodality. Amsterdam: John Benjamins.
Verschueren, Jef
1999 Understanding Pragmatics. London: Edward Arnold.
Verschueren, Jef
2001 Predicaments of criticism. Critique of Anthropology 21: 59–81.
Verschueren, Jef
2011 Ideology in Language Use: Pragmatic Guidelines for Empirical Research.
Cambridge: Cambridge University Press.
Weiss, Gilbert and Ruth Wodak
2003 Critical Discourse Analysis: Theory and Interdisciplinarity. Basingstoke: Pal-
grave.
Widdowson, Henry
1998 The theory and practice of Critical Discourse Analysis. Applied Linguistics 19:
136–151.
Widdowson, Henry
2005 Text, Context, Pretext: Critical Issues in Discourse Analysis. Oxford: Black-
well.
Wodak, Ruth (ed.)
2012 Critical Discourse Analysis (4 Volumes). London: Sage.
Wodak, Ruth and Michael Meyer (eds.)
2001 Methods of Critical Discourse Analysis. London: Sage.
Wodak, Ruth and Michael Meyer (eds.)
2009 Methods of Critical Discourse Analysis. (2nd edn.). London: Sage.
Wodak, Ruth and Michael Meyer (eds.)
2015 Methods of Critical Discourse Studies. London: Sage.
Wodak, Ruth and Paul Chilton (eds.)
2005 A New Agenda in (Critical) Discourse Analysis. Amsterdam: John Benjamins.
Wodak, Ruth
2001 The discourse-historical approach. In: Ruth Wodak and Michael Meyer (eds.),
Methods of Critical Discourse Analysis, 63–95. London: Sage.
Wodak, Ruth
2011 Critical linguistics and critical discourse analysis. In: Jan Zienkowski, Jan-Ola
Ostman and Jef Verschueren (eds.), Discursive Pragmatics, 50–70. Amster-
dam: John Benjamins.
Yus, Francisco
2011 Cyberpragmatics: Internet-mediated Communication in Context. Amsterdam:
John Benjamins.
Zagar, Igor
2010 Topoi in Critical Discourse Analysis. Lodz Papers in Pragmatics 6: 3–27.
Zinken, Joerg
2007 Discourse metaphors: The link between figurative language and habitual anal-
ogies. Cognitive Linguistics 18: 445–466.
V. Corpus pragmatics
18. Introduction to part 5: Corpus pragmatics
Andreas H. Jucker

1. Introduction

Part 5 of this handbook is devoted to methods in pragmatics that rely on corpus
searches. Corpus pragmatics is a relatively late addition to the various subfields of
pragmatics. Early work in pragmatics tended to be qualitative rather than quantita-
tive. It tended to focus on richly contextualised instances of language use, on small
sets of data and on the minutiae of spoken interaction, which precluded the use of
large-scale corpora. Early work in corpus linguistics, on the other hand, tended to
explore research questions in the area of lexico-grammatical, morphological and
syntactic patterns and other areas of the interaction between the lexicon and sentence
structure, which were amenable to being turned into search algorithms because
they concerned the surface manifestations of language.
Some work in corpus pragmatics, however, appeared as early as the late 1980s
and the 1990s (e. g. Aijmer 1987, 1996; Stenström and Andersen 1996; Schmied
1998 or Culpeper and Kytö 1999), but the field really took off only in the 2000s
with a series of monographs and edited volumes (e. g. Aijmer 2002; Deutschmann
2003; Aijmer and Stenström 2004; Baker 2006; Facchinetti and Rissanen 2006;
Adolphs 2008; Romero-Trillo 2008; Jucker, Schreier and Hundt 2009). In the
meantime, the field has already matured to such an extent that, in addition to a dedicated
journal (Corpus Pragmatics) and a handbook (Aijmer and Rühlemann 2015),
a series of survey articles have appeared (e. g. Andersen 2011; Rühlemann 2011;
Jucker 2013; Jucker and Taavitsainen 2014). Work in corpus pragmatics is prolifer-
ating at an increased pace at the moment. It combines the persisting interest in the
field of pragmatics in general with the increased reliance on empirical and above
all quantitative approaches and the explosion of available corpora and corpus tools
(Felder, Müller and Vogel 2012; Taavitsainen and Jucker 2014).
Corpus pragmatic approaches typically adopt a quantitative perspective.
Research questions often ask about the frequencies of certain elements in specific
text samples and, crucially, about differences of these frequencies in different text
samples. But – as I will argue in this introduction and as will become clear in the
contributions assembled in this section – a quantitative perspective requires a very
solid foundation in the preparation of the data base and in the analysis and catego-
risation of the data.

https://doi.org/10.1515/9783110424928-018
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). Methods in Pragmatics, 455–466. Berlin/Boston:
De Gruyter Mouton.

2. The scope of corpora

In a pre-theoretical sense, any collection of texts or even one single text can be
called a corpus. In the sense intended here, however, only electronically searchable
corpora are meant. In the definition of Andersen (2011: 590), “corpora are com-
pilations of naturally occurring spoken or written language that can be accessed
on a computer. Such compilations may be monolingual or multilingual and may
represent general language or specific domains (professional/academic corpora)”.
The earliest corpora in this sense date back to the 1960s. They were designed
to provide a more or less representative mirror image of an entire language, and
a lot of thought went into the balanced construction of these corpora: which text
genres should be represented? And how should the different genres be distributed?
According to Aarts’ (2011) useful typology, such corpora are, therefore, called bal-
anced corpora. Examples of such early balanced corpora are the London-Lund Cor-
pus of Spoken English (LLC), the Brown Corpus of written American English or
the Lancaster-Oslo-Bergen (LOB) Corpus of written British English. Aarts (2011)
stresses the intuitive basis on which the “balancing” was done. There is, as yet, no
established way to assess in any useful sense the overall composition of a language
as a whole, and, therefore, it can only be pure guesswork what kind of composition
of a sample corpus would best represent an entire language. To a large extent this is
also true for specialised corpora that try to represent a single variety of a language.
The corpus of Early Modern English Medical Texts (EMEMT), for instance, claims
to be a “representative sample of the entire field of English medical writings that
appeared in print between 1500 and 1700” (Taavitsainen and Pahta 2010: cover
blurb). However, from a strictly statistical point of view, such a claim would require a
full and comprehensive list of all the relevant texts of the entire field and a selection
principle which gives every single text of the field the same chance of being
included in the sample corpus, a condition which seems hard to meet even in a
limited field such as medical discourse. In the case of an entire language, there
is no way of establishing the limits of the entire set (or “population” in statistical
terms) that a corpus is supposed to represent. Corpora still try to be representative
of more than just themselves, and, therefore, the label “sample corpus” seems more
appropriate according to Aarts (2011). He mentions the British National Corpus
with 100 million words as the largest sample corpus of British English.
According to Aarts’ (2011) typology, there are also full-text corpora, which
contain one or more complete texts. Parallel corpora contain texts of more than one
language or more than one variety of the same language. The parallelism between
these texts can vary from direct translations of one language into the other to cor-
pora of different varieties or languages that have been compiled on the basis of
identical designs. The Brown and LOB corpora, for instance, consist of identical
samples of different genres drawn from American English and British English
respectively. Additional categories are diachronic or historical corpora representing
older stages of a language and learner corpora containing texts produced by
non-native speakers of a language.
In recent years, the number of available corpora and their size have increased at
an unprecedented rate. Back in the 1960s one-million-word corpora were consid-
ered to be large. In the meantime, many corpora are available extending to several
hundred million words. A dedicated website created by Mark Davies includes a
dozen different corpora, four of which contain more than one billion words
(http://corpus.byu.edu). It includes balanced corpora such as the Corpus of Contemporary
American English (COCA, 520 million words) but also corpora with a very narrow
focus on just one type of text, e. g. the Hansard Corpus with the proceedings of the
British Parliament from 1803 to 2005 (1.6 billion words) or the Corpus of Ameri-
can Soap Operas with transcripts from American soap operas from the early 2000s
(100 million words). The largest corpus, however, is provided by the Google Books
Ngram Viewer, which accesses a database of 361 billion words.
However, for research questions in pragmatics, corpus size is usually not the
decisive criterion. It is usually more important for pragmaticists to be able to
contextualize the individual search results, either in the immediate context sur-
rounding the search item or the larger context of the genre or text type in which it
occurs. The Ngram Viewer does not provide any context at all. In fact, the searches
are not performed on entire texts but on indexes derived from the texts. The ngrams
in these indexes carry only minimal information about the type of English and the
year of publication of the text in which they originally occurred. In other corpora,
it is usually possible to trace individual occurrences of search items back to their
original location but often this has to be done manually, which severely restricts
the amount of data that can be assessed in this way in spite of the ease of retrieving
many more occurrences from these large corpora. Thus, there is often a tension
between small but richly contextualised sets of data and large-scale corpora
with a lot of quantifiable material but a very limited amount of context for each of
the retrieved hits; the big data caveat in O’Keeffe’s terms (this volume; see also
Taavitsainen and Jucker 2015: 18).
One solution to this problem is the use of pragmatically annotated data (see
Archer and Culpeper, this volume). A subcorpus of the Michigan Corpus of Aca-
demic Spoken English (MICASE), for instance, has been tagged for some speech
acts, and the Corpus of Verbal Response Mode (VRM) Annotated Utterances has
been coded both for literal meaning and for pragmatic meaning (see Rühlemann
2011: 630). But such annotations are extremely labour intensive, which puts severe
limitations on the size of the corpora that can be annotated in this way.

3. Corpora, quantification and statistics

Corpus pragmatic approaches search for patterns and generalisations across large
amounts of data. Research questions typically ask for frequencies and differences
in frequencies in different samples or subsamples. They ask questions that can only
be answered with numerical results. However, any numerical claim depends on a
solid foundation consisting of several layers pertaining to the database, the identi-
fication and analysis of the data and so on. This can be visualised as a pyramid in
which each individual level depends on a solid foundation of all the lower levels,
and at the same time each level consists of a higher degree of abstraction and gen-
eralisation than its supporting level and thus the height of each level comes at the
cost of a further loss of detail (see Figure 1).
Figure 1 depicts the pyramid of quantitative research. At the bottom of any
quantitative research there is the selection and compilation of data. The researcher
can decide to make use of an existing corpus or to construct a corpus specifically
designed for the research question at hand (see chapter 19 by Gisle Andersen).
The decision is not trivial. Mistakes at this level may render all the work at higher
levels questionable or even meaningless. Considerations at this level will include
the question about which language varieties need to be included, whether they are
spoken or written, the degree of formality, the diachrony of the data and many
more. The second level of the pyramid very often consists of the pre-processing
of the data (see chapter 20 by Dawn Archer and Jonathan Culpeper). Present-day
corpora are often annotated with parts-of-speech tags. There are also speaker-iden-
tification tags and tags that identify different registers or modalities of the language
samples that are included. Some corpora even include pragmatic annotations. The
quality of these annotations again has an immediate bearing on the reliability of
all the work carried out at the higher levels in the pyramid. If the accuracy of the
parts-of-speech tags is less than one hundred per cent, for instance, the quantifi-
cations at the higher levels inherit these errors to the extent that they rely on the
parts-of-speech tagging.
The core of any research project is, of course, the identification and description
of a certain linguistic phenomenon. In the context of corpus pragmatic research this
can be a particular linguistic form or a range of such forms, such as a particular
discourse marker or an interjection, whose functions are to be investigated (see
chapter 22 by Karin Aijmer), or a range of speech functions, such as a specific
speech act or a class of speech acts, whose specific linguistic realisations are to
be investigated (see chapter 23 by Anne O’Keeffe). A precise description of these
phenomena is again an indispensable prerequisite in order to ensure the reliability
of the higher levels in the pyramid.
Once the elements have been identified, they need to be categorised. Different
uses of a discourse marker, for instance, or specific ways of realising a certain
speech act have to be distinguished. Without such a categorisation, the elements
cannot be counted and quantification is not possible. The items to be quantified
need to be identified in such a way that they can be reliably counted. This means
that individual occurrences of some phenomenon are claimed to be sufficiently
similar or even identical in order to be lumped together. Small differences that are
not relevant are abstracted away or ignored. In this sense quantification necessarily
involves a certain loss of detail of description. It is the price that has to be paid for
quantification. If we are prepared to pay the price, we can count the instances, and
we can compare different phenomena.

Figure 1: The pyramid of quantitative corpus research

It is also essential at this stage – and this is all too often ignored – that the
categories must be defined in such a way that another researcher would identify
the same elements as instantiations of this particular category. This stage, there-
fore, should include an interrater reliability test. This involves at least two raters,
or coders, who independently code a data sample and then compare their results.
The categorisation is only considered to be sufficiently robust if the coders come
up with a sufficiently high number of identical codes assigned to the data. If that
level is not achieved, the category descriptions have to be improved or the cate-
gories have to be adjusted before a new round of testing with fresh data samples
can be started. This process has to be repeated until the desired level of interrater
agreement has been achieved. Usually a level of 70 per cent is considered to be
adequate. Practical experience shows that such a level, which may appear to be
relatively modest, is often more difficult to achieve than might be expected,
especially if functional categories are involved. However, the reliability of category
counts critically depends on the reliability of category identification. If the
categories proposed by the researcher have not passed the test of interrater reliability,
the quantitative results have to be seen with a lot of scepticism, and even if they
have passed such a test, it should be clear that a level of a minimum of 70 per cent
interrater agreement means that the results are no more than approximations or
relatively accurate estimates. The nature of linguistic data generally does not lend
itself to high precision measurements.
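The agreement check described above can be sketched in a few lines of Python. This is a minimal illustration rather than a procedure prescribed in this chapter: the function name `percent_agreement`, the category labels and the ten coded items are all hypothetical, and simple percentage agreement is assumed as the measure, with the 70 per cent threshold taken from the text.

```python
def percent_agreement(codes_a, codes_b):
    """Return the percentage of items that two coders coded identically."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same items")
    if not codes_a:
        raise ValueError("no coded items supplied")
    matches = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return 100.0 * matches / len(codes_a)


# Hypothetical codes assigned independently to ten occurrences of a
# discourse marker (labels invented for illustration only).
coder_1 = ["hedge", "filler", "hedge", "repair", "filler",
           "hedge", "hedge", "repair", "filler", "hedge"]
coder_2 = ["hedge", "filler", "repair", "repair", "filler",
           "hedge", "filler", "repair", "filler", "hedge"]

agreement = percent_agreement(coder_1, coder_2)  # 8 of 10 items match
passes_threshold = agreement >= 70.0             # threshold named in the text

print(f"Interrater agreement: {agreement:.1f} per cent")
print("Categorisation accepted" if passes_threshold
      else "Revise category descriptions and test again on fresh data")
```

In this invented sample the coders agree on eight of ten items. If the figure fell below the threshold, the category descriptions would be revised and the test repeated on a fresh sample, as described above.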
This scepticism is essential whenever higher levels in the pyramid are consid-
ered. The counting of categories that forms the basis for the descriptive statistics
seems like a tedious task that can generally be done easily and quickly by the
computer. But the ease of computation should not be allowed to suggest a degree
of precision that is not supported by the approximate nature of the underlying data
categorisation.
At the level of descriptive statistics, researchers often have to work with nor-
malised frequencies. If the frequencies of a certain linguistic element are to be
compared in two or more different contexts, the actual figures have to be set in
relation to the size of these contexts. Normally this is done in terms of number of
words. The observed frequency of the element in each context is calculated as a
frequency per 10,000 words or per one million words or some other suitable level.
It seems straightforward to use the number of words as the category for normali-
sation but it is not without problems. Computers can count the number of words
very easily and quickly but they rely on a rather crude definition of what a word is
(something like a string of letters enclosed by blanks or punctuation marks). Even if
this is too simplistic for a linguistic definition of what a word is, for many purposes
it is good enough as a proxy, in particular if the word count is carried out in the
same way in all the relevant contexts. But in some instances the number of turns or
the duration of speaking may be more accurate measures for the normalisation of
frequency figures, and it must be realised that the results depend on such choices.
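The normalisation step can be illustrated with a short Python sketch. It is offered only under stated assumptions: the function names `count_words` and `normalised_frequency`, the hit counts and the corpus sizes are hypothetical, and the tokeniser deliberately implements the crude "string of letters" definition of a word mentioned above.

```python
import re


def count_words(text):
    """Crude word count: every uninterrupted run of letters is one word."""
    return len(re.findall(r"[^\W\d_]+", text))


def normalised_frequency(hits, corpus_size, per=10_000):
    """Observed frequency expressed per `per` words of running text."""
    if corpus_size <= 0:
        raise ValueError("corpus size must be positive")
    return hits * per / corpus_size


# The crude definition splits "don't" into two "words" (don + t).
sample_count = count_words("Well, I don't know.")

# Hypothetical figures: the same discourse marker in two subcorpora.
freq_a = normalised_frequency(hits=42, corpus_size=120_000)  # per 10,000 words
freq_b = normalised_frequency(hits=30, corpus_size=60_000)   # per 10,000 words

print(sample_count, freq_a, freq_b)
```

The raw count is higher in the first subcorpus (42 against 30), but the normalised frequency is higher in the second (5.0 against 3.5 per 10,000 words). Replacing `corpus_size` with a number of turns or a speaking time would change the figures, which is the dependence on the unit of normalisation noted above.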
The pinnacle of many research efforts seems to be reached when the researcher
can not only produce the frequencies for a particular element in different contexts
but can also confidently claim that the differences are significant. This
is done on the basis of inferential statistics. Many different statistical tests are
available for this purpose, and the computer will very quickly return a verdict of
whether different numerical patterns in the different contexts are likely to be ran-
dom or whether they are sufficiently large to exclude the possibility of being just
random and, therefore, must be assumed to be significant.
However, such results must always be treated with a healthy dose of caution.
Their validity crucially depends on the choice of an appropriate statistical test, and it
depends just as crucially on the reliability of the figures that have been fed into
the computer, which depends – as argued above – on the quality of the choices at
all the lower levels of the pyramid. But even with the best of intentions and the
highest level of care, the result at the top of the pyramid inherits all the unavoidable
limitations at the lower levels. It only applies to the data that was included in the
sample, it depends on the accuracy of the data annotations, the reliability of the
data categorisation and counting, and so on.
And ultimately, even if we accept – with sufficient caution – the significance
of our results, the statistical tests do not tell us anything about the reasons for this
significance. A distribution of the data that is highly unlikely to be random is just
that – a distribution that is highly unlikely to be random – no more, no less. Often
enough it is just the starting point for new questions to be asked.
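As a hedged illustration of the inferential step discussed in this section, the following Python sketch applies a chi-square test to a two-by-two table of hits and non-hits in two subcorpora. The choice of test is an assumption made for the example only; the chapter does not prescribe a particular test, and the function name `chi_square_2x2` and all figures are hypothetical. The test also treats every word as an independent observation, which is itself a simplification.

```python
import math


def chi_square_2x2(hits_a, size_a, hits_b, size_b):
    """Pearson chi-square (no continuity correction) for hits vs. non-hits
    in two samples; returns the statistic and its p-value (1 degree of freedom)."""
    observed = [
        [hits_a, size_a - hits_a],
        [hits_b, size_b - hits_b],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("expected frequencies would be zero")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With one degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical figures: 42 hits in 120,000 words vs. 30 hits in 60,000 words.
chi2, p_value = chi_square_2x2(42, 120_000, 30, 60_000)
significant = p_value < 0.05

print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}, significant: {significant}")
```

With these invented figures the difference between 3.5 and 5.0 occurrences per 10,000 words does not reach the conventional 0.05 level. Whatever the outcome, the result is only as reliable as the counts fed into the test, and it says nothing about the reasons for the distribution.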

4. The papers in this section

The first two papers in this section are concerned with the construction and anno-
tation of corpora. In chapter 19, Gisle Andersen discusses the various aspects that
need to be taken into consideration when researchers either choose an existing
corpus or decide to build their own corpus. He argues that the specifics of pragmatic
research often make it useful or even indispensable to go beyond ready-made,
off-the-shelf corpora, either by extracting relevant subparts, by annotating existing
corpora in various ways or by embarking on the construction of the researcher’s
own tailor-made corpora. Andersen focuses on the various selective processes,
or sampling frames, of corpus construction and on the effects these choices have
on the potential for corpus pragmatic investigations. He discusses the differences
between form-based approaches and function-based approaches and the distinction
between corpus-based and corpus-driven approaches. The sampling frame is
particularly challenging in the case of parallel corpora with data drawn from dif-
ferent languages or different time periods because the inventory of genres and text
types may be very different in these languages or time periods. He also discusses
some more technical aspects of corpus construction, such as the transcription of
spoken data and various types of annotations.
In chapter 20, Dawn Archer and Jonathan Culpeper argue that pragmatic anno-
tation for a long time lagged behind the annotation of other aspects in corpora.
They note that corpus pragmatic work so far has had a strong bias towards research
questions with a formal entity as a starting point. Pragmatic annotation offers a way
out of this restriction. They distinguish between different levels of pragmatic anno-
tation. At one level, there are annotation schemes that identify interactional phe-
nomena, such as speech acts, and at a second level, there are annotation schemes
for contextual phenomena, such as the gender or social status of the interactants.
Such contextual features are particularly important since pragmatic interpretations
are regularly based on contextual features. The annotation of pragmatic units is
difficult because of the problem of identifying adequate boundaries and because
pragmatic units are often ambiguous and indeterminate. Pragmatic annotations,
therefore, must often be applied manually, which seriously restricts the corpus
size for annotations. They also present their own annotation scheme, which they
used for the Sociopragmatic Corpus with its sophisticated and highly detailed tags
identifying for each segment the relevant combination of sociopragmatic variables
including speaker identification, addressee identification, and their relationship.
They argue that many pragmatic phenomena cannot easily be annotated automati-
cally but some annotation is possible with computational assistance.
The third paper in this section, by Irma Taavitsainen, chapter 21, is devoted to
the historical dimension of corpus pragmatics, where the challenges and problems
of corpus pragmatic research are exacerbated because of the historical nature of the
data. She provides an outline of the relevant corpora, from the pioneering Helsinki
Corpus to the single-register or single-variety corpora produced by the same
Helsinki team to more recent corpora. She focuses on some of the challenges of
historical corpus pragmatic work, such as the dilemma between broad generalisations
that cover a lot of data and the wish to focus on increasingly fine-grained
distinctions, which reduces the available data for each relevant distinction to such an
extent that useful generalisations are no longer possible, or the problem of spelling
variation in historical texts. The chapter also gives a brief introduction to the most
important corpus tools, such as concordances, keyword analysis, collocations and
statistical assessments, and it points out the importance of including the social and
cultural context as well as the genre context into the analysis. This makes it neces-
sary to switch back and forth between the frequency counts of corpus searches and
the actual contexts in which the search items occur. Finally, she identifies some
future directions for historical corpus pragmatics, for instance a growing trend
towards megacorpora, towards increasingly rich and sophisticated annotations
of corpora, and towards ever more sophisticated editing techniques that
are used to prepare historical material for inclusion in searchable corpora.
Chapters 22 and 23 consider the relationship between form and function in
corpus pragmatics. The chapter by Karin Aijmer looks specifically at research
approaches that take a linguistic form, such as a discourse marker, an interjection,
a term of address or a hesitation marker as a starting point in order to explore its
function across a large number of occurrences. This is the more common approach
in corpus studies because corpus searches depend on clearly specifiable strings of
linguistic material, i. e. on formal patterns. She draws attention to the problem of the
ambiguity of many linguistic forms. Discourse markers, for instance, often have lin-
guistic forms that coincide with forms in other word classes and even as discourse
markers they are multifunctional. She, too, draws attention to the importance of the
context for the interpretation of the various functions of the elements retrieved in
corpus searches. She also points out the connection to the variationist perspective,
in which search items are systematically correlated with different types of context
in order to explore, for instance, the influence of sociolinguistic factors on the usage of specific
elements. Moreover, she considers corpus pragmatic work in the context of selected
theoretical approaches, such as Thetical Grammar or Construction Grammar.
The paper by Anne O’Keeffe looks at approaches that take a speech function,
e. g. a specific speech act, as a starting point in order to explore its realisations in a
specific set of texts. This can be done by searching for elements that are regularly
associated with this function, as for instance sorry, which may function as an apol-
ogy or may accompany an apology. But not all apologies contain an instance of
sorry, and not all instances of sorry occur together with an apology. She also draws
attention to the dilemma in corpus research between large numbers of occurrences
of a particular phenomenon (“breadth of forms” in her words) and the contextual
depth that is available for each occurrence. The larger the number of occurrences,
the more restricted will be the contextual depth for each occurrence and vice versa.
In order to illustrate the problem, she traces the history of I’m sorry and I apologise
in the largest available corpus, the Google Books Ngram Viewer. She then presents
two case studies which contrast corpus linguistic methods and discourse comple-
tion tasks. The study by Schauer and Adolphs (2006), which analyses expressions
of gratitude in the Cambridge and Nottingham Corpus of Discourse in English
(CANCODE) and in a discourse completion task, finds that the corpus data gives
a broader contextual picture than the DCT data. In the corpus, expressions of grat-
itude often occur in clusters while in the DCT data single utterances expressing
gratitude are the norm. This result is supported by a study by Flöck and Geluykens
(2015), who compared directives in the British component of the International
Corpus of English (ICE) with response data of a written DCT and a small corpus
of business letters. In the final part of the chapter, O’Keeffe presents different
approaches that deal with the problem of searching for speech functions. The first
approach, one-to-one searching, is restricted to instances in which a specific form,
such as thank you, or a specific tag is searched for. This will provide a full recall
of all such forms. The second approach consists of a down-sampling of the corpus
to a manageable size and a manual analysis of the relevant search item. The third
approach makes systematic use of existing research findings, e. g. from DCT stud-
ies, to establish the relevant search items for corpus search. And finally she presents
four possible solutions that have been proposed for larger corpora together with
their advantages and limitations: the use of illocutionary force indicating devices;
the use of genre-specific search inventories established by manual searches of
small sample corpora; the use of typical lexical or grammatical features associated
with a speech act; and, finally, the use of metacommunicative expressions.
In the last chapter of this volume, chapter 24, Michael Haugh focuses
specifically on the corpus-pragmatic approaches that take metapragmatic elements
as a starting point. Such elements reflect the interactants’ awareness of what is
going on in the interaction and their comments about this. Haugh uses elements
such as just kidding, kidding, only joking and so on as examples with which the
speaker signals to the addressee that the surrounding talk should be treated as
non-serious, playful or jocular. He distinguishes between three different types
of acts and activities: first, pragmatic acts and activities (e. g. apologise, joke,
threaten); second, inferential acts and activities (e. g. allude, imply, sarcasm); and
third, evaluative acts and activities (e. g. aggressive, polite, rude). He identifies a
number of challenges of an analysis of metapragmatic elements. First, the analysis
must identify a sufficient number of tokens for an analysis, and these tokens must
be comparable across contexts. The same metapragmatic lexical item may well
be used in different ways on different occasions. And second, the accuracy of the
transcriptions is essential. A careful transcription often reveals details that are lost
in a less detailed rendering.
Part 4 of this handbook covered methods that were largely qualitative. They
focused on small data sets of richly contextualised communicative behaviour. In
the following chapters of part 5 of the handbook, the focus shifts to large scale
investigations that try to find generalisations across ever increasing data sets. But
the tension between such large-scale generalisation and the goal of paying attention
to the minute details of each individual occurrence remains a leitmotif in all the
chapters of part 5.

References

Aarts, Jan
2011 Corpus analysis. In: Jan-Ola Östman and Jef Verschueren (eds.), Handbook of
Pragmatics Manual. Amsterdam/Philadelphia: John Benjamins.
Adolphs, Svenja
2008 Corpus and Context. Investigating Pragmatic Functions in Spoken Discourse.
(Studies in Corpus Linguistics 30.) Amsterdam: John Benjamins.
Aijmer, Karin
1987 Oh and ah in English conversation. In: Willem Meijs (ed.), Corpus Linguistics
and Beyond. Proceedings of the Seventh International Conference on English
Language Research on Computerized Corpora, 61–86. Amsterdam: Rodopi.
Aijmer, Karin
1996 Conversational Routines in English. Convention and Creativity. London:
Longman.
Aijmer, Karin
2002 English Discourse Particles. Evidence from a Corpus. (Studies in Corpus Lin-
guistics 10.) Amsterdam: John Benjamins.
Aijmer, Karin and Christoph Rühlemann (eds.)
2015 Corpus Pragmatics. A Handbook. Cambridge: Cambridge University Press.
Aijmer, Karin and Anna-Brita Stenström (eds.)
2004 Discourse Patterns in Spoken and Written Corpora. (Pragmatics & Beyond
New Series 120.) Amsterdam: John Benjamins.
Andersen, Gisle
2011 Corpus-based pragmatics I: Qualitative studies. In: Wolfram Bublitz and Neal
R. Norrick (eds.), Foundations of Pragmatics, 587–627. (Handbooks of Prag-
matics 1.) Berlin: de Gruyter Mouton.
Baker, Paul
2006 Using Corpora in Discourse Analysis. London: Continuum.
Culpeper, Jonathan and Merja Kytö
1999 Modifying pragmatic force: Hedges in a corpus of Early Modern English dia-
logues. In: Andreas H. Jucker, Gerd Fritz and Franz Lebsanft (eds.), Historical
Dialogue Analysis, 293–312. Amsterdam: John Benjamins.
Deutschmann, Mats
2003 Apologising in British English. (Skrifter från moderna språk 10). Umeå: Insti-
tutionen för moderna språk, Umeå University.
Facchinetti, Roberta and Matti Rissanen (eds.)
2006 Corpus-based Studies of Diachronic English. (Linguistic Insights 31.) Bern:
Peter Lang.
Felder, Ekkehard, Marcus Müller and Friedemann Vogel (eds.)
2012 Korpuspragmatik. Thematische Korpora als Basis diskurslinguistischer
Analysen. Berlin: de Gruyter.
Flöck, Ilka and Ronald Geluykens
2015 Speech Acts in corpus pragmatics: A quantitative contrastive study of directives
in spontaneous and elicited discourse. In: Jesús Romero-Trillo (ed.), Yearbook
of Corpus Linguistics and Pragmatics 2015, 7–37. London: Springer.
Jucker, Andreas H.
2013 Corpus pragmatics. In: Jan-Ola Östman and Jef Verschueren (eds.), Handbook
of Pragmatics, 2–17. Amsterdam: Benjamins.
Jucker, Andreas H., Daniel Schreier and Marianne Hundt
2009 Corpus linguistics, pragmatics and discourse. In: Andreas H. Jucker, Daniel
Schreier and Marianne Hundt (eds.), Corpora: Pragmatics and Discourse.
Papers from the 29th International Conference on English Language Research
on Computerized Corpora (ICAME 29). Ascona, Switzerland, 14–18 May
2008, 3–8. (Language and Computers: Studies in Practical Linguistics 68.)
Amsterdam: Rodopi.
Jucker, Andreas H. and Irma Taavitsainen
2014 Diachronic corpus pragmatics: Intersections and interactions. In: Irma Taavit-
sainen, Andreas H. Jucker and Jukka Tuominen (eds.), Diachronic Corpus
Pragmatics, 3–26. (Pragmatics & Beyond New Series 243.) Amsterdam: John
Benjamins.
Romero-Trillo, Jesús (ed.)
2008 Pragmatics and Corpus Linguistics. A Mutualistic Entente. (Mouton Series in
Pragmatics 2.) Berlin: Mouton de Gruyter.
Rühlemann, Christoph
2011 Corpus-based pragmatics II: Quantitative studies. In: Wolfram Bublitz and
Neal R. Norrick (eds.), Foundations of Pragmatics, 629–656. (Handbooks of
Pragmatics 1.) Berlin: de Gruyter Mouton.
Schauer, Gila A. and Svenja Adolphs
2006 Expressions of gratitude in corpus and DCT data: Vocabulary, formulaic
sequences, and pedagogy. System 34: 119–134.
Schmied, Josef
1998 Discourse markers in the Lampeter Corpus of Early Modern English Tracts.
In: Raimund Borgmeier, Herbert Grabes and Andreas H. Jucker (eds.), Anglis-
tentag 1997 Giessen. Proceedings, 57–65. Trier: Wissenschaftlicher Verlag.
Stenström, Anna-Brita and Gisle Andersen
1996 More trends in teenage talk: A corpus-based investigation of the discourse
items cos and innit. In: Carol E. Percy, Charles F. Meyer and Ian Lancashire
(eds.), Synchronic Corpus Linguistics: Papers from the Sixteenth Interna-
tional Conference on English Language Research on Computerized Corpora
(ICAME 16), 189–203. Amsterdam: Rodopi.
Taavitsainen, Irma and Andreas H. Jucker
2015 Twenty years of historical pragmatics: Origins, developments and changing
thought styles. Journal of Historical Pragmatics 16(1): 1–25.
Taavitsainen, Irma and Päivi Pahta (eds.)
2010 Early Modern English Medical Texts. Corpus Description and Studies.
Amsterdam: John Benjamins.
19. Corpus construction
Gisle Andersen

Abstract: This chapter considers various aspects of corpus construction, i. e. the
collection, processing and annotation of texts for corpora that can be used in lin-
guistic analyses of speech or writing. It focuses on the range of selective pro-
cesses that shape various types of corpus construction and the effects of the choices
made. Corpus construction is illustrated with reference to recent studies in corpus
pragmatics which either directly address methodological issues or which illustrate
important aspects thereof. The issues dealt with include form- and function-based
approaches to pragmatics, corpus-based vs. corpus-driven studies, and various fac-
tors in research design, such as text type and domain, language variety and demog-
raphy, transcription and annotation of corpora, etc.

1. Introduction

Corpus construction is probably the most significant component of research design
in corpus pragmatics. This chapter outlines the parameters involved in the
construction of spoken and written corpora and the consequences of the choices made
for possible research questions in corpus pragmatics. The field of pragmatics is
notoriously wide, and the last couple of decades have accumulated a range of
corpus-based studies that would merit mentioning in this context. It is not my
intention to provide a historical account of the field of corpus linguistics or a
survey of all available corpora of English or other languages, as good overviews
of the field have been provided elsewhere (McEnery and Hardie 2012; Andersen
2010; O’Keeffe and McCarthy 2010). Nor is it my intention to survey research in
corpus pragmatics, as this has been done in two earlier chapters of this handbook
series (Andersen 2011; Rühlemann 2011). Rather, my focus will be on the range
of “selective processes” that shape various types of corpus construction and the
effects of the choices made. These will be outlined in section 2 and illustrated with
reference to recent studies in corpus pragmatics (widely defined) which either
directly address the issue of corpus construction or which illustrate important
aspects of it. Special emphasis will be on studies that illustrate the problematic
nature of corpus studies and of comparability across different corpora.
Finally, section 3 offers some concluding remarks and reflections on future devel-
opments within corpus pragmatics.

https://doi.org/10.1515/9783110424928-019
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 467–494. Berlin/
Boston: De Gruyter Mouton.

2. Issues in corpus construction and overview of literature in corpus pragmatics

The choice of a certain corpus as a basis for studies in linguistics has obvious
bearings on the kind of research questions that can be pursued and the outcome of
testing individual hypotheses about language. The research potential of a corpus
is constrained by its sampling frame, i. e. the totality of numerous choices made in
corpus construction, such as whether to document a certain variety of a language,
mode of communication, period or speaker group. This section considers a wide
range of parameters that may be relevant to consider in corpus construction. Many
of these parameters can be construed as dichotomies (highlighted in italics below),
and each will be illustrated in this section with examples of previous work in cor-
pus pragmatics.

2.1. Overall methodological considerations

In very general terms, research in corpus pragmatics is motivated by a specific
objective to investigate a functional category, say, a certain type of speech act
(Adolphs 2008), e. g. requests, or a particular form, say, the sequence you know,
in a certain language variety with the aim of exploring what kind of bearing it
has on communicative interaction. To this end, researchers may choose to use an
existing corpus vs. a purpose-built corpus. An existing corpus such as the British
National Corpus (BNC) provides a comprehensive and multifaceted snapshot of
British English at the time of its compilation (early 1990s) in the form of a gener-
ally accessible 100-million word sample of this variety. With such a rich source at
one’s fingertips, it is only natural that its construction has stimulated a vast amount
of research, although only a relatively small subset of this research is in the field of
pragmatics. For instance, Tottie (2011) used the BNC in a study of the two “fillers”
er/uh and erm/um, items which play a crucial role in the organisation of spoken
discourse (Swerts 1998; Kjellmer 2003; Corley, MacGregor and Donaldson 2007).
This is a principally quantitative sociolinguistic study, which demonstrates some
of the complexity and features of the BNC, showing that demographic factors of
speaker gender, socio-economic status and age have a bearing on the aggregate
use of the variant filler forms. In short, men use more fillers than women, people
from the higher socio-economic and better-educated strata of the population
use more than people from lower strata, and older speakers use more than younger
speakers. This research illustrates the importance of metadata in corpus design, and
how information about the background of the speakers can lead to new insights.
Metadata covers information about speakers and authors, texts, topic domain, date
of recording, region, etc. The data for this part of Tottie’s research is drawn from
what is known as the demographic component of the BNC, which contains a statis-
tically balanced sample of conversations of speakers of a variety of backgrounds.
Its counterpart, the context-governed component, contains a number of less
spontaneous spoken discourse types that were selected according to a priori
linguistically motivated categories, including lectures and talks, news commentaries and
classroom interaction. Together these two spoken components amount to about 10
million words of spoken English, which is outnumbered by the written component
of the BNC by roughly ten to one. The spoken component of the BNC constitutes
one of the resources most often exploited in corpus pragmatics (cf. Andersen 2011;
Rühlemann 2011 for surveys of studies), and British English is thus among the
better documented varieties of English.
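The role of metadata described above can be made concrete with a minimal sketch in Python. It assumes a hypothetical, hand-made sample in which every utterance is stored together with speaker attributes; the data, the field names and the function `fillers_per_thousand` are invented for illustration and reproduce neither the BNC format nor Tottie's (2011) procedure. The sketch counts the filler variants er/uh and erm/um and normalises the counts per 1,000 words for each value of a chosen metadata field.

```python
import re
from collections import defaultdict

# Invented toy data (NOT from the BNC): utterances paired with speaker metadata.
UTTERANCES = [
    {"sex": "m", "age": 52, "text": "er I think erm we should go"},
    {"sex": "f", "age": 23, "text": "we should go now"},
    {"sex": "m", "age": 31, "text": "well er that is fine"},
    {"sex": "f", "age": 60, "text": "erm I am not sure about that"},
]

# The four filler spellings mentioned in the text.
FILLER = re.compile(r"\b(?:er|uh|erm|um)\b", re.IGNORECASE)


def fillers_per_thousand(utterances, key):
    """Return the filler rate per 1,000 words for each value of a metadata field."""
    hits = defaultdict(int)
    words = defaultdict(int)
    for utt in utterances:
        group = utt[key]
        hits[group] += len(FILLER.findall(utt["text"]))
        words[group] += len(utt["text"].split())
    return {group: 1000 * hits[group] / words[group] for group in words}


rates = fillers_per_thousand(UTTERANCES, "sex")
for group, rate in sorted(rates.items()):
    print(group, round(rate, 1))
```

Without the metadata fields, only the overall filler rate could be computed; the grouping step is what makes a sociolinguistic comparison possible at all. Normalisation is needed because the subsamples for different speaker groups will rarely be of equal size.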
However, despite the pervasiveness of a corpus such as the BNC, there may be
research questions that require researchers to develop their own
corpus. A variant of the ready-made corpus procedure is applied by Rühlemann,
Bagoutdinov and O’Donnell (2011) and Rühlemann and O’Donnell (2012). The
authors were interested in the dynamics of narrative discourse and how it compares
with conversational discourse more generally. For this purpose they created the
Narrative Corpus (NC) by extracting texts from the BNC, specifically only those
parts of the spoken demographic BNC that contained conversational narratives.
This raises the obvious question of how narratives are defined, and this is addressed
by Rühlemann and O’Donnell (2012). The extraction is based on a priori assump-
tions about lexemes thought to be recurrent in English narratives, such as “‘it was
so (funny, weird, etc.)’, ‘did I tell you’, ‘reminds me’, the interjections ‘bloody
hell’ and ‘oh my god’, as well as one-word items such as ‘anyway’, ‘suddenly’,
‘happened’, and the lemma ‘remember’” (Rühlemann and O’Donnell 2012: 316).
This corpus construction procedure required substantial manual work in reading
concordance lines and browsing text files. Their extraction of a relevant BNC sub-
set also relied on other criteria in addition to these lexical ones, criteria that were
based on basic conversation analytical principles and accumulated knowledge
about the dynamics of narrative discourse. Typically, stories contain utterances
that are longer than utterances in turn-by-turn talk more generally, and therefore
the authors decided to include utterances of more than 15 words in length. They also
looked for stretches of conversation where one speaker occupied roughly every
third slot, a technique that is based on Sacks’ observation that “[f]ormally [a story]
can be said to be in the first instance an attempt to control a third slot in talk, from a
first” (Sacks 1992: 18). Given the laborious nature of this corpus construction task,
the authors express hopes that the extraction of narratives can be automatised or
done at least semi-automatically in the future. Among the interesting observations
made during the construction of the corpus is that narratives tend to trigger more
narratives, as stories are often responded to with other stories, thus forming parts of
narrative chains. Hence, the number of narratives is much higher than the number
of texts/conversations in the NC. Further, Rühlemann and O’Donnell (2012) have
developed an innovative system for discourse annotation (cf. section 2.4) which
enables new approaches to the analysis of discourse and pragmatics, applied for
instance in Rühlemann et al.’s (2011) study of the interaction of paralinguistic
features (filled pauses; cf. Tottie 2011 above) and discourse presentation.
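The hope that narrative extraction might be done semi-automatically can be illustrated with a rough sketch in Python. It implements only two of the criteria reported above: the lexical cues quoted from Rühlemann and O'Donnell (2012: 316) and the threshold of more than 15 words. The regular expressions, the representation of turns as (speaker, text) pairs and the function `narrative_candidates` are assumptions made for this illustration and are not the authors' tooling. The third-slot criterion is left out because its exact operationalisation is not specified here.

```python
import re

# Cue items as quoted in the text; the patterns themselves are illustrative
# assumptions, not the authors' actual search expressions.
CUES = [
    r"\bit was so\b", r"\bdid i tell you\b", r"\breminds me\b",
    r"\bbloody hell\b", r"\boh my god\b", r"\banyway\b",
    r"\bsuddenly\b", r"\bhappened\b", r"\bremember(?:s|ed|ing)?\b",
]
CUE_RE = re.compile("|".join(CUES), re.IGNORECASE)
MIN_WORDS = 16  # "more than 15 words"


def narrative_candidates(turns):
    """Return (index, speaker, reasons) for turns that merit manual inspection."""
    flagged = []
    for i, (speaker, text) in enumerate(turns):
        reasons = []
        if CUE_RE.search(text):
            reasons.append("cue")
        if len(text.split()) >= MIN_WORDS:
            reasons.append("long")
        if reasons:
            flagged.append((i, speaker, reasons))
    return flagged


# Invented example conversation.
turns = [
    ("A", "did I tell you what happened at work"),
    ("B", "no"),
    ("A", "well we were all sitting in the office and suddenly the lights "
          "went out and nobody knew what to do"),
]
candidates = narrative_candidates(turns)
print(candidates)
```

A filter of this kind can only propose candidates. Deciding whether a flagged stretch is in fact a conversational narrative still requires the manual reading of concordance lines and text files that the authors describe.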
The construction of the BNC alongside other influential corpora such as the
so-called Brown family of American and British comparable corpora (Francis
and Kucera 1979; Hofland and Johansson 1982; Mair 1997; Baker 2009) and the
International Corpus of English (Greenbaum 1996) has stimulated much research
and laid the foundations for corpus pragmatics as what is now a fledgling field
(Romero-Trillo 2008; Jucker, Schreier and Hundt 2009; Taavitsainen, Jucker and
Tuominen 2014; Rühlemann and Aijmer 2016). However, the choices made in the
compilation of such ready-made corpora naturally limit the kind of research
questions that can be pursued on their basis, as ready-made corpora do not necessarily
accord with one’s research objectives. For example, McIntyre and Walker (2011)
were interested in a discourse phenomenon commonly addressed in corpus stylis-
tics, namely discourse presentation (Semino and Short 2004), i. e. the presentation
of speech, writing and thought for instance by means of expressions such as he said
(that). However, rather than being content with available off-the-shelf corpora, the
authors decided to construct their own corpus tailored for their research purposes:
to compare such presentation in Early Modern English (EModE) with equivalent
phenomena in Present-Day English (PdE). They therefore built a relatively small
corpus of Early Modern English writing, as a point of comparison with studies
of Present-Day English. The overall aim of their pursuit is to study the degree to
which discourse presentation categories evolve over time. This study provides a
good illustration of why corpus construction is not a trivial task, as the authors
were forced to think carefully about such issues as comparability and delimitation
of time frame. Their corpus was designed to match the fiction and news sections
of a PdE corpus used as comparative basis, namely The Lancaster Speech, Writing
and Thought Presentation Spoken Corpus (SW&TP).1 This procedure raises issues
of comparability of historical and contemporary corpora, in that it is difficult to
find a textual basis from a historical period that exactly matches that of a contem-
porary corpus (a problem also noted by Jucker 2006), especially as “the newspaper
as a text type did not appear until the latter end of our time frame” (McIntyre and
Walker 2011: 109). Defining what counts as EModE was also problematic, since
“there is no common consensus among historical linguists, particularly with regard
to where EModE ends and Modern English begins” (McIntyre and Walker 2011:
109). To overcome such problems, the authors made a sensible pragmatic choice
to include a selection of data from 1500 to 1750 that matches the SW&TP corpus
to the extent possible.
1 http://www.lancaster.ac.uk/fass/projects/stwp/default.htm

During the corpus construction procedure, McIntyre and Walker (2011) made
an important methodological observation, namely that it is possible to apply Leech
and Short’s (1981, 2007) model of discourse presentation to Early Modern
English, a model which organises discourse presentation according to the amount of
involvement of the original speaker in the anterior discourse and the person in the
posterior discourse presenting what was said in the anterior discourse (Semino and
Short 2004: 10). Further, the manual annotation procedure allowed the authors
to improve on earlier work on discourse presentation in EModE journalism by
extending the categorical inventory of that of Jucker (2006). Although this is a
hypothesis-generating rather than a hypothesis-testing study, by quantifying the
various types of discourse presentation McIntyre and Walker (2011) point out sig-
nificant differences between EModE and PdE, such as the fact that overall there
is less speech, writing and thought presentation in EModE than in PdE, but also,
significantly, that maximal presentation forms occur more in EModE than PdE,
while the reverse is true for minimal presentation forms. This suggests that there is
more telling (diegesis) rather than showing (mimesis) in EModE, thus indicating
a general trend towards less narrator interference. In other words, their study sup-
ports the idea that there may indeed be a long-term evolution that affects the ways
in which speech, writing and thought are presented in journalistic prose.
Another overall consideration to make in corpus construction is what type of
corpus methodology to use, or put more simply, what to search for. Andersen (2011)
distinguishes the form-based and function-based approaches to corpus pragmatics.
Under the form-based approach, the point of departure is a previously recognised
form (a word, a phrase or a structural pattern, such as English it-clefts or wh-clefts
(Collins 2005)). An example of a form much studied over the last decades is the
discourse marker innit (Stenström and Andersen 1996; Andersen 2001; Stenström,
Andersen and Hasund 2002), which functions as a tag and as a response signal
(follow-up), especially in adolescent speech. Andersen (2001: 139 ff) showed
that the invariant British English form innit (from isn’t it or ain’t it) has extended
its function beyond tag position to serve as a marker of mutual manifestness
(common ground) directed towards the previous speaker’s utterance, in addition to its
more generally recognised use as a tag question which modifies a proposition of
the current speaker. Subsequent studies have uncovered a functional expansion
of this form, and especially Pichler (2013, 2016) demonstrated a wider range of
functions in more recent data than originally observed. This shows the value of
comparisons of forms across different corpora and underlines the need to replicate
studies in corpus pragmatics as new comparable corpora become available.
The methodological counterpart, the function-based approach, takes as its basis
a particular pragmatic function and describes its possible realisations in actual dis-
course. This can be exemplified by Torgersen et al. (2011), who investigate a cer-
tain class of discourse markers whose “overarching function […] is to (appear to)
involve the interlocutor by (appearing to be) eliciting responses indicating that the
interlocutor agrees with, remembers, understands or follows the thread” (Torgersen
et al. 2011: 96). The forms performing this function included in their study were
right, innit, ok, yeah, you know, you know what I mean, if you know what I mean,
do you know what I’m saying, you get me. Their analysis is based on the Bergen
Corpus of London Teenage Language (COLT; Stenström, Andersen and Hasund
2002) and the Linguistic Innovators Corpus (LIC), which contains transcriptions
of recordings made in connection with a sociolinguistic project in London (Chesh-
ire et al. 2008), containing interview data as well as self-recorded conversations.
Torgersen et al.’s comparative analysis of these forms in COLT and LIC revealed
that the forms innit and if you know what I mean occur with similar frequencies in
both corpora; ok, right, yeah and you know are more frequent in COLT; while you
get me, (do) (you) know what I mean and (do) (you) know what I’m saying are sig-
nificantly more frequent in LIC. This exemplifies a significant line of research in
corpus pragmatics, namely the variationist approach, where different realisations
of discourse functions are construed as a discourse-pragmatic variable (Pichler
2016). Other work within this research paradigm has documented innovation in the
system of general extenders (a.k.a. set-marking tags, items such as and things like
that), such as ongoing grammaticalisation in London (Cheshire 2007) and lexical
replacement with and stuff in other varieties (Denis 2011; Tagliamonte and Denis
2010; Tagliamonte 2016) as well as variability with regard to the use of quotatives
(Denis 2016).
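Comparing the frequency of a form across two corpora of different sizes involves two steps: normalising the raw counts and assessing whether the difference is larger than chance would predict. The Python sketch below shows one measure widely used in corpus linguistics for the second step, the log-likelihood statistic (G2). This is an assumption made for illustration: the test applied in the studies cited above is not stated here, and the function names and all figures are hypothetical.

```python
import math


def per_million(freq, corpus_size):
    """Normalise a raw frequency to occurrences per million words."""
    return 1_000_000 * freq / corpus_size


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood (G2) for one item.

    freq_a, freq_b: raw counts of the item in corpus A and corpus B
    size_a, size_b: total number of words in each corpus
    """
    total = freq_a + freq_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Hypothetical counts: 30 hits in a 1,000-word sample vs 10 in another.
g2 = log_likelihood(30, 1000, 10, 1000)
print(round(g2, 2), per_million(30, 1000), per_million(10, 1000))
```

G2 values are conventionally compared with chi-square critical values at one degree of freedom (3.84 for p < 0.05). As the introduction to this part stresses, such a result shows only that a distribution is unlikely to be random; it says nothing about the reasons for the difference.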
The final dichotomy to be mentioned in this section is the crucial division
between corpus-based vs. corpus-driven approaches in corpus pragmatics. This
distinction reflects two coherent and complementary ways of using corpora for the
study of language use, first laid out by Tognini-Bonelli (2001). In corpus-based
studies, researchers study predefined linguistic features based on their assumption
that a particular word form or set of forms are known to or likely to be found in a
corpus. This assumption is usually based on preliminary observations of the data or
hypotheses about a form’s occurrence in a particular language variety. Researchers
use the corpus to search for this form and to analyse its use and distribution in the
corpus. Corpus-driven research, by contrast, is a more inductive and exploratory
approach that makes no or minimal assumptions as to which word forms and cat-
egories a corpus contains and therefore “differs from the standard practice of lin-
guistics” (Biber 2009: 276). It generally involves calculating frequencies of indi-
vidual word forms and sequences of words within and across different corpora, thus
inductively “exploiting the potential of a corpus to identify linguistic categories
and units that have not been previously recognised” (Biber 2009: 278). All corpora
lend themselves easily to corpus-based study, which is methodologically simple
in that it involves searching for the relevant forms and then studying the
concordance lines retrieved by the search facility. This method can involve “one-
to-one searching” (Ädel and Reppen 2008: 2), where specific linguistic forms are
searched for in the corpus, but often needs to be followed by “sifting” of the data,
i. e. manually extracting relevant tokens from corpus concordances and discarding
irrelevant tokens. Given this relative methodological simplicity, it is not surprising
Corpus construction 473

that most work in corpus pragmatics is corpus-based rather than corpus-driven, and
a wide range of studies could have been mentioned (see for instance the studies in
Romero-Trillo 2008). However, a recent study by Andersen (2016) shows that the
corpus-driven approach is a valuable asset in corpus pragmatics as well. The main
advantage of the corpus-driven method is that it avoids the intuition-based selec-
tion of members of a certain category as candidates for analysis, thereby providing
a more accurate picture of variants and variables that may be undergoing change.
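
The search-and-sift procedure described above can be sketched in a few lines of
Python. This is a minimal illustration only, not the search facility of any
particular corpus interface: the function name kwic, the context width and the
sample sentences are assumptions introduced here, and the regular expression
covers just one of the response elicitor forms discussed earlier.

```python
import re

def kwic(text, pattern, width=30):
    """Return (left context, match, right context) for every match of pattern."""
    hits = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits

# Invented sample text standing in for a corpus file (assumption).
sample = ("It was a bit strange, if you know what I mean. "
          "Do you know what I mean when I say that? "
          "I know what I mean to do next.")

# One-to-one search for a predefined form, with an optional leading "do".
lines = kwic(sample, r"\b(?:do )?you know what I mean\b")

for left, match, right in lines:
    print(f"{left:>30} [{match}] {right}")
```

The third sample sentence contains know what I mean without you and is therefore
not retrieved. In real data, tokens that match the pattern but do not serve the
pragmatic function under study would still have to be discarded by hand, which is
the sifting step.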
Corpus-driven studies require either direct access to the full set of texts in a
corpus, on which statistical operations can be performed, or access to statistical
data derived from the corpus, such as word frequency lists, frequency-ranked lists
of n-grams (sequences of n words of varying length), or lists of collocations (sta-
tistically significant co-occurrences of words; cf. section 2.4). Such data may then
be used, for instance, as a basis for comparison between corpora or between sections
within one corpus. In other words, the corpus-driven approach is somewhat more
technically demanding than the corpus-based approach, as it presumes access to
statistical techniques that are not normally available directly from corpus websites
and that require additional computation. Andersen (2016), for example, applied
keyness analysis, a bottom-up statistical approach, to identify words and
sequences of words that are particularly frequent in one corpus and much less
frequent or non-existent in another. He performed a comparative analysis of two
London corpora (COLT and the London English Corpus) and the result of the
comparison made it evident that, within this variety of English, there is innova-
tion in several pragmatic categories, such as the use of interjections, vocatives,
text-organising discourse markers like at the end of the day and response elicitors.
The study also uncovered response elicitor variants that have not been accounted
for in Torgersen et al.’s (2011) comparison of the same corpora, mentioned above.
Andersen’s corpus-driven analysis showed that formal variation in response elic-
itation is greater than originally proposed by Torgersen et al., and that the enve-
lope of variation should be extended to include a number of forms left out in the
original study (do you get what I mean, do you get what I’m saying, if you get
what I mean, if you get what I’m saying, you get what I mean, you get what I’m
saying). In conclusion, in order to produce fully accountable results, it may be
necessary to combine corpus-based with corpus-driven methods, as reliance on
the corpus-based approach alone risks overlooking variants not previously docu-
mented in the literature and failing to uncover recent additions to the pool of avail-
able variants. Although somewhat more technically demanding, the corpus-driven
approach is now more accessible through the establishment of large international
infrastructures for language resources, such as CLARIN (https://www.clarin.eu/),
which houses a very large number of corpora and other language resources and makes
them available
474 Gisle Andersen

not just for search but also for downloading and performing statistical operations
on them, which can subsequently be used in corpus-driven studies.
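
To make the notions of n-gram list and keyness concrete, the sketch below ranks
n-grams that are relatively more frequent in a target corpus than in a reference
corpus. It uses the log-likelihood statistic, one measure commonly used for keyness
in corpus linguistics. The text above does not state which statistic Andersen (2016)
applied, so this choice, the function names and the toy token lists are all
assumptions made for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-word sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def log_likelihood(a, b, c, d):
    """Log-likelihood for an item occurring a times among c items in the
    target corpus and b times among d items in the reference corpus."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the target
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def keyness(target_tokens, reference_tokens, n=1):
    """Rank n-grams that are relatively more frequent in the target."""
    t = Counter(ngrams(target_tokens, n))
    r = Counter(ngrams(reference_tokens, n))
    c, d = sum(t.values()), sum(r.values())
    scores = []
    for item, a in t.items():
        b = r.get(item, 0)
        if a / c > b / d:  # keep positively key items only
            scores.append((item, log_likelihood(a, b, c, d)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Invented toy data, not drawn from COLT or any other corpus.
target_tokens = "you get me yeah you get me innit".split()
reference_tokens = "you know yeah right yeah ok you know".split()

ranked = keyness(target_tokens, reference_tokens, n=3)
print(ranked[0])
```

Because the ranking is computed over every n-gram in the target corpus rather than
over a predefined list of forms, sequences that the analyst did not anticipate can
surface at the top, which is the point of the corpus-driven approach.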

2.2. Mode of communication, text type and domain

Having overviewed some of the main methodological considerations, I now turn to
a set of factors that pertain to the content of corpora. Different research objectives
will lead to the use of corpora representing different modes of communication, i. e.
the “written” or the “spoken” mode. In the written mode,
the compilers of the first and second generation of corpora (the Brown family,
BNC, ICE, etc.) have made conscious efforts to include as wide a range of genres
as possible, including informative and imaginative writing. This connects with
another significant dichotomy in corpus construction, namely representativity vs.
balance. The former refers to the extent to which the texts in a corpus actually rep-
resent the discourse domains and linguistic distributions of the language variety we
aim to describe. In strict statistical terms, a corpus such as the BNC is not a repre-
sentative sample of the total text production in a language community (Leech 1991,
2007), since there is no way of knowing the extent of the population of possible
texts or the distribution between different text types within this population. Further
to this, genre taxonomies are sometimes intuitively defined and not based on any
well-defined criteria for what constitutes a coherent textual genre. These problems
illustrate what has been termed external representativeness, which concerns “the
extent to which [a sample of texts] is selected from the range of text types in the
target population” (Biber 1993: 243). This contrasts with internal representativeness,
which concerns “the extent to which [a sample of texts] includes the range of
linguistic distributions in the population” (Biber 1993: 243; on this issue, see also
Millar and Biber 2015). Besides, it is clear that a lot of text categories, especially
unpublished ones, are never represented in corpora, such as internal documents,
memos, reports, brochures, etc. Therefore, although representativity may be an
ideal, what compilers of corpora usually aim for is balance between the various
text types and linguistic categories that they decide to include. In the case of the
ICE-GB, for example, this balance is utilised in a study by Facchinetti (2002), who
charts the distribution and semantic and pragmatic values of the modal verbs can
and could in this corpus, showing that the statistical discrepancy between the verbs –
can outnumbering could by roughly seven to one – is stable in the various sections
of the spoken and written components, while only a handful of written categories
show a counter-tendency, notably academic and non-academic humanistic writing,
news reports and novels/stories. The study further shows that the two verbs
behave differently with respect to the inferences of modality they invite: epistemic
modality is by far the least common in can but the most common modality in
could; dynamic ability and dynamic possibility account for two thirds of all tokens
of can but are much less frequent with could, while deontic modality is of
similar frequency across the two verbs. The differences can in general be ascribed
to the modal meanings of the two verbs and their relevance for different types of
discourse; “for instance, the intrinsically interactive features of spoken data partly
justify the high incidence of deontic and dynamic implication values for both can
and could in this medium” (Facchinetti 2002: 241). Facchinetti’s study thus shows
how balance between different text types and relevant metadata are of significance
in accounting for a linguistic category in corpus data.
However, the quest for balance across text types is challenged by recent soci-
etal developments which have altered the very shape of the textual landscape
which written corpora seek to represent, most notably the emergence of new genres
within computer-mediated communication (CMC) and – doubtlessly concurrent,
though much less focused – the gradual decrease of the relevance of some other
written categories, letter writing being one obvious example. While the compilers
of the first generation corpora could restrict themselves to a finite set of published
and unpublished text categories, corpus construction in the internet era has to cater
for a wide range of new genres such as text messages, e-mails, blog posts, status
updates on social networking sites and web pages (Crystal 2001, 2006). These gen-
res share some features with written texts, such as the need for a physical medium
of communication other than the human voice, be it transferred on a computer, a
smartphone or an electronic advertising board. But some of the genres, such as
text messaging in social media or via an SMS service, contain language that much
resembles speech, in that it is highly colloquial, dialect-near and informal. The
emergence of new genres poses a challenge in particular to those studies which
use corpora to compare the development of a language over time. One branch of
corpus linguistics that uses this methodology has come to be termed “short-term
diachronic comparable corpus linguistics” (Leech et al. 2009: 24) and involves
investigations of comparable corpora which are collected at different times and
which together span a shorter period of time than is usual for historical linguistics.
One recent study within this paradigm is Baker (2017). He compares all corpora
in the Brown family in a study that covers a wide range of features, showing that
language use in the 20th and 21st centuries is characterised by broad tendencies
towards democratisation and colloquialisation. Democratisation of discourse is
observed as a collection of features that suggest that language is becoming less
authoritarian and increasingly reflects equality among people. These features
include a tendency to avoid unequal or face-threatening forms, seen in a shift
from the use of strong modals (must, should, shall, will) towards weak modals (can,
could, might) and in the avoidance of formal titles (e. g. Mr and Mrs). Colloquialisation
(informalisation) refers to the tendency for written language to follow spoken lan-
guage norms and thus appear more informal, including the increased use of active
verbs at the expense of passives, more use of first and second person pronouns and
increased frequency of colloquial forms such as kids, guy, okay, kind of, etc. At the
same time, Baker is reluctant to claim that British English is necessarily adopting
American norms (Americanisation), as has been alleged, but rather ascribes
recent changes to parallel developments in the two varieties.
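
Comparisons of this kind rest on relative rather than raw frequencies, since the
corpora being compared need not be of identical size. A minimal sketch of the
arithmetic follows. The function names are illustrative, and the counts are invented
placeholders, not figures from Baker (2017) or from any member of the Brown family.

```python
def per_million(count, corpus_size):
    """Normalise a raw count to a frequency per million words."""
    return count / corpus_size * 1_000_000

def percent_change(old, new):
    """Relative change from old to new, in per cent."""
    return (new - old) / old * 100

# Hypothetical counts for two comparable corpora sampled at different times.
earlier = {"tokens": 1_000_000, "must": 1_100, "can": 2_000}
later = {"tokens": 1_200_000, "must": 780, "can": 2_760}

for form in ("must", "can"):
    old = per_million(earlier[form], earlier["tokens"])
    new = per_million(later[form], later["tokens"])
    print(f"{form}: {old:.0f} -> {new:.0f} per million words "
          f"({percent_change(old, new):+.1f}%)")
```

Whether a change of a given size is statistically meaningful is a separate question
that requires a significance test and is not addressed by this sketch.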
With regard to corpus construction, it is of course a tremendous advantage
to have available this suite of corpora that have been compiled at different times
using exactly the same sampling frame, not least evidenced in the study by Leech
et al. (2009). But it remains a problem that the youngest members of the Brown
family of corpora fail to capture language usage within the new CMC genres. This
is especially true since much of present-day language change appears to be fuelled
by these new genres, such as the emergence of a new acronym-based vocabulary
in expressions such as lol and wtf and the rise of emoticons and emojis as a new
mode of attitudinal expression. In fact, it does not seem unreasonable to claim that
the trends of democratisation and colloquialisation that Baker is describing appear
precisely to be inspired by or even accelerated by these new CMC genres, where
the threshold for user participation is so much lower than in the traditional printed
media. Recent work has shown that the CMC genres also pose new technical chal-
lenges to corpus construction, for instance regarding how to deal with duplicate
texts, mass mailing and attachments to emails (Deutschmann et al. 2009). The
field of CMC is now maturing and after its “first wave” of studies concentrating
on the “features and strategies that are (assumed to be) specific to new media”
(Androutsopoulos 2008: 1), studies in the “second wave” acknowledge that CMC
increasingly takes place on mobile platforms and therefore concentrate on “situ-
ated language use and diversity” (Androutsopoulos 2008: 1). It has thus become
necessary for some researchers to investigate datasets that transcend the traditional
spoken/written divide, so-called “heterogeneous corpora” that incorporate
“not only text-based records but also video, audio and field notes” (Adolphs,
Knight and Carter 2011: 315) in order to capture the full complexity of language
users’ linguistic experience.
Not only has there emerged a set of new genres in corpus construction which
challenge the idea of genre balance over time (Renouf and Kehoe 2013), but one
could in fact argue that the whole idea of conventional genre classification is set
in motion in our (post-) postmodern society. The notion of text categorical balance
is tied up with another significant dichotomy in corpus construction: the distinc-
tion between static vs. monitor corpora. While the corpora described thus far are
static, providing a snapshot of a language variety at a certain point in time, one
of the mega-trends in corpus construction gaining speed around the turn of the
millennium has been the development of large monitor corpora (Renouf 2007),
which use a continuous sampling method by which the corpus is augmented with
new texts at regular intervals, yearly, as with the Corpus of Contemporary American
English (COCA; cf. Davies 2009), or daily, as with the Norwegian Newspaper
Corpus (NNC; cf. Andersen and Hofland 2012). Several of these monitor corpora
are internet-based, using web crawler technology, such as the NNC and WebCorp
LSE (Renouf and Kehoe 2013). Such monitor corpora allow for studies of lin-
guistic innovation, including lexical innovation, based on continuously updated
data (Gabrielatos et al. 2012). It should also be pointed out that the emergence of
the internet itself has triggered a discussion of whether the web is indeed a corpus and
how it can be approached and analysed as such (Kilgarriff and Grefenstette 2003).
This approach links to another crucial parameter in corpus construction, namely
whether the corpus is designed to account for diachrony or synchrony in the doc-
umentation of a language. Diachronic corpus studies entered the stage with the
launch of the Helsinki corpus in the early 1990s (Rissanen and Tyrkkö 2013) and
have since been supplemented by other significant corpora such as the ARCHER corpus
(Biber, Finegan and Atkinson 1994) and the Corpus of Historical American English
(COHA). Jucker and Taavitsainen (2014) use the historical COHA corpus and the
contemporary COCA corpus in a diachronic study of compliments in a time win-
dow from 1810 to 2010. Using an approach called “metacommunicative expression
analysis” (Jucker and Taavitsainen 2014: 257), by which speech acts are searched
for not via overt expressions but via expressions which talk “about” particular
speech acts, the authors show that a crucial distinction must be drawn between
ceremonious and personal compliments, and that, contrary to claims often made
in the literature, compliments in their corpus are more often paid and received by
men than by women.
With regard to the content of corpora, yet another dichotomy can be distin-
guished in corpus construction, between genre-diverse and monolithic corpora. As
has been suggested above, the large and balanced reference corpora incorporate a
wide variety of genres. However, it is often useful to construct categorically
monolithic corpora in order to zoom in on the language use of a particular genre.
The language of newspapers in particular has been widely documented (Renouf 2007;
Andersen and Hofland 2012). Compilers of such corpora are keenly aware that
journalistic text does not represent the totality of language use, but, as Renouf and
Kehoe (2013) argue, “[t]here are cases where it is appropriate to look at newspaper
data only […] [since] newspapers are usually at the forefront of linguistic change,
so [they] are promising starting points for the study of neology and productivity”
(Renouf and Kehoe 2013: 181). It has also become customary to document new
web-based genres with monolithic corpora, as with the Birmingham Blog Corpus
(Kehoe and Gee 2012) and the Networks of Texts and People (NTAP) corpus,
where a computer crawler looks for daily updated blog posts on climate change
discourse in the blogosphere (Salway et al. 2016). In an earlier investigation, Hal-
verson (2012) found that a monolithic corpus, namely the Norwegian Newspaper
Corpus, was appropriate for the study of metonymic extension and vagueness.
She looked into how new metonymical uses of place names, such as Kyoto and
Schengen, are utilised to refer not to the places themselves but to events located
there or to the participants or results of such events. Such metonymic uses also
exhibit signs of vagueness between different metonymic readings, although only
in a minority of cases. Interestingly, the metonymic uses far outnumber the
literal uses of such place names in contemporary newspaper discourse.
The content of corpora is also distinguishable with regard to domain, since
corpora may cover general or specific language use. While the large reference
corpora often contain parts that cover specific domains, such as the BNC’s sections
with informative writing in the arts, social sciences, commerce and finance, etc.,
it is also customary to construct domain-specific corpora that allow researchers to
explore language usage within particular scientific domains and to study academic,
legal or professional language (Connor and Upton 2004; Flowerdew 2002). Much
of this work places itself within the branches of Language for Specific Purposes or
Applied Linguistics. In one study, Walsh, Morton and O’Keeffe (2011) investigate
the Limerick Belfast Corpus of Academic Spoken English (LIBEL) and show how
a set of recurrent multiword units in academic language play a crucial role as mark-
ers of discourse aimed at orienting the hearer. Expressions such as as I was saying,
what you can do is, do you think you could, etc. are used to “signpost, manage,
demonstrate, sequence, set up activities/groups and they mark out shared and new
knowledge” (Walsh, Morton and O’Keeffe 2011: 332). Their study is methodolog-
ically interesting in that it combines corpus linguistics with conversation analysis,
which is an innovative approach to features of spoken academic discourse.
Finally, the construction of spoken corpora raises a number of issues pertaining
to the selection of speakers (cf. section 2.3) and the interface to data (section 2.4).
With regard to content, it is worth pointing out that conscious efforts have been
made to classify spoken interaction according to a number of parameters, in the
spoken component of the BNC and in subsequent corpora. Ideally, a general spo-
ken corpus should contain as wide a range of usage contexts as possible, although
the problem of representativity, mentioned at the beginning of this section, is cer-
tainly no less present in the case of spoken corpora. In fact, as argued by Čermák,
“the problem of what should be included [in spoken corpora] has hardly ever been
considered” (2009: 113). In order to achieve a comprehensive and balanced cov-
erage of spoken data and for spoken corpora to become a true counterpart of the
large written corpora, he argues, we need to identify relevant parameters that aim
towards representativity of the population of speech events from which they are
sampled. His typology incorporates a set of twelve design criteria (in addition to
traditional demographic factors such as speaker age, regional background, etc.;
cf. section 2.3). These include origin of the text, whether it is originally spoken or
written (as in the case of a read manuscript), dialogue or monologue, the proximity
of partners (friends/family vs. no proximity), private vs. public speech, interactive
vs. unidirectional (as in the case of lectures), spontaneous vs. prepared (scripted)
text, casual vs. official contexts, etc. However, it remains to be seen what implica-
tions this proposed typology has for practical work in corpus construction and for
the costs and efforts associated with this task.

2.3. Language and demography

We now turn to what we might consider the sociolinguistic factors associated
with corpus construction, i. e. factors which pertain to the linguistic and cultural
background of the speakers/writers represented in a corpus. First of all, corpora
may be monolingual or multilingual, where the latter can be further distinguished
into cross-lingual corpora with comparable data from different languages or par-
allel corpora which contain texts and their translations into other languages. The
studies reported above deal with (mostly English) monolingual corpora. Contras-
tive language studies are valuable because they shed light on what is specific and
what is more general or even universal in language use. Such studies may be based
on corpora with comparable sections representing different languages. One such
corpus is the KIAP corpus (Fløttum et al. 2009), which contains research articles
across three disciplines, economics, linguistics and medicine, in three languages,
English, French and Norwegian. This corpus has triggered text-linguistic and dis-
course-analytical research that looks into how authors manifest themselves in aca-
demic discourse. This research into scientific language shows that discipline is
more important than language in the identification of authors’ cultural identities.
To exemplify, the medical researcher is seen as textually “rather absent […]”, the
economist as “somewhat present but in a modest way […]”, and “the linguist as
clearly and polemically present in the text” (Fløttum et al. 2009: 141).
Contrastive work may also span several corpora representing different lan-
guages, such as Defranq and De Sutter’s (2010) work on the intersubjective func-
tion of equivalent expressions in Dutch, English and French. They focus specifi-
cally on so-called “contingency hedges” (Defranq and De Sutter 2010: 183), e. g.
the verbs depend in English, dépendre in French and afhangen/liggen and a modal
form zien in Dutch. The authors demonstrate that these verbs cross-linguistically
show similar signs of decategorialisation, having become markers of intersubjec-
tivity, but not all to the same extent, and that the choice between the Dutch verbs
depends on regional and functional parameters. A similar cross-corpus compara-
tive approach is taken by Drange, Hasund and Stenström (2014) in their study of
a highly specific type of swearing that is found cross-linguistically in adolescent
speech, so-called “swearing by mum”. They investigate this feature in three cor-
pora, COLT (cf. section 2.1) representing London English, COLA representing
Spanish speakers from Madrid and UNO representing Norwegian speakers from
Oslo (Drange, Hasund and Stenström 2014: 37). Demonstrating a certain degree
of cross-linguistic similarity alongside some statistical differences and culture-specific
patterns, this study illustrates the value of comparable corpus construction. The
three corpora have been created using the same sampling frame, COLT serving as
a model for the other two. The only crucial difference is time, which the authors
problematise but consider not to be a major obstacle to comparison in their case,
since “the swearing repertoire of a given language community changes very slowly”
(Drange, Hasund and Stenström 2014: 37; cf. also Fjeld 2002 and Ljung 2011). In
contrast, the issue of corpus comparability does pose a problem for Defranq and
De Sutter (2010), who acknowledge that their corpora are not fully comparable, in
that the Belgian French corpus mainly contains interview data, as opposed to the
spoken part of the BNC and the Corpus of Spoken Dutch. In other words, it cannot
be ruled out that observable differences may be a reflection of different sampling
strategies applied in the corpus construction.
A variant of contrastive studies compares not different languages but varieties
of the same language. Most such studies have compared the two “super-varieties”
(Collins and Yao 2013: 479) of English, namely British and American English,
such as Tottie’s (1991) work on backchannels and Tottie and Hoffmann’s (2006)
work on tag questions, but there are also studies which compare peninsular vs.
Latin American Spanish (Placencia and García 2007), hexagonal vs. Québécois
French (Dostie 2009), etc. The compilation of the ICE corpus has facilitated the
study of a much wider set of language varieties. Collins and Yao (2013) use ten
of the ICE corpora in an exploration of colloquialisation in world Englishes with
regard to a set of grammatical variables including contracted vs. full forms of
verbs, use of quasi-modals such as gonna, gotta and wanna, and let’s imperatives
as markers of directive illocutionary meaning. A consistent pattern of variation
emerges from their study: the South-East Asian (SEA) varieties of English (Sin-
gapore, Philippines, Hong Kong) are moving closer towards the colloquialisation
known to characterise the so-called inner circle varieties (BrE, AmE, CanE, AusE,
NZE) than the two non-SEA varieties (India and Kenya). The strength of this
research methodology stems directly from the major advantage of having a set of
similarly constructed corpora using the same sampling strategy as evidenced by
the ICE corpus project.
Further, the construction of parallel corpora has triggered much research,
although pragmatics has not been its prime focus. In Aijmer and Simon-Vandenbergen’s
(2006) collection of papers that look into pragmatic markers contrastively,
several contributions are based on parallel corpora, such as Hasselgård’s
(2006) comparison of Norwegian nå and English now in the English-Norwegian
Parallel Corpus (ENPC) and Johansson’s (2006) work on the translation of English
well in the ENPC and the Oslo Multilingual Corpus. A more recent study which
explores parallel corpora in a contrastive functional analysis is Ebeling and Ebe-
ling (2013). They consider Sinclair’s idea that a lexical item is characteristically
an “extended unit of meaning” (Sinclair 1996), rather than an individual word, and
how this is evident in translations. Their contrastive analyses incorporate a range of
phraseological items in English and Norwegian, and they conclude that “translators
strive for both idiomaticity and sameness of meaning along as many dimensions
as possible” (Ebeling and Ebeling 2013: 217), i. e. not only with regard to the
semantic denotation of a word. These dimensions include the semantic prosody
or attitudinal discourse function of words, as conceptualised in Sinclair’s model.
Ebeling and Ebeling’s (2013) study marks an important shift in the study of Sin-
clairian pragmatics and the corpus-based study of phraseology, in introducing the
cross-linguistic study of semantic prosody.
The next parameter to be considered concerns the selection of speakers on the
basis of demographic factors such as gender, age, educational level, region, soci-
oeconomic background and ethnicity. These factors constitute metadata that is not
always explicitly coded in spoken corpora, but they are nevertheless essential in
studies located at the intersection between corpus linguistics and sociolinguistics (cf.
Andersen 2010; Baker 2010 for methodological accounts), within the socio-prag-
matics or variational pragmatics paradigm (cf. Schneider and Barron 2008; Ander-
sen and Aijmer 2011; Murphy 2012). One topic which has received much attention
is the use of tag questions (Andersen 2001; Stenström, Andersen and Hasund 2002;
Tottie and Hoffmann 2006; Pichler 2013; Kimps, Davidse and Cornillie 2014; Bar-
ron, Pandarova and Muderack 2015). A fresh approach to tag questions is taken in
a recent study by Kimps (2016), which offers a detailed functional and prosodic
analysis of tag questions in three corpora, COLT, LLC and ICE-GB. She considers
whether there are particular functions that are associated with particular speakers
or corpora. Among a wide range of findings, Kimps shows that there are observa-
ble effects of age with regard to the functional variability of tag questions. Young
speakers below the age of 18 show preferences for tag questions used as responses,
and to some extent also use tag questions to denote desired actions, more so than
adults, while tag questions with questioning functions are significantly more typi-
cal of speakers between 18 and 45, or older (Kimps 2016: 191). Further, the study
shows that the impact of gender on the choice of tag questions is minimal. This is
an interesting observation, especially in light of earlier sociolinguistic studies on
tag questions, much of which “has concentrated on the different speech style of
men and women” (Kimps 2016: 193). Kimps’ work is methodologically significant
because it introduces a very comprehensive functional apparatus for the analysis of
tag questions in terms of their speech function and attitudinal stance.
Finally in this section, spoken corpora aiming at the documentation of a certain
language variety will mostly select native speakers, but in other studies the aim
is rather to document the language use of learners. Learner corpus research
has been gaining ground over the last couple of decades (Hasko 2013). The corpus-based
study of authentic collections of spoken or written learner language has been used
by scholars concerned with language acquisition and pedagogy (Aijmer 2009;
Granger 2009), with a view to studying learner behaviour including, but certainly
not limited to, errors made by learners. A special feature of corpus construction
applied in learner corpora is the comparison of learner data with a baseline corpus
containing similar texts (e. g. student essays) written by native speakers. A recent
study by Paquot (2013) uses this methodology to chart the collocational and colli-
gational preferences of French EFL learners in the International Corpus of Learner
English (ICLE), and compares them with nine other ICLE learner sub-corpora.
She finds that the features that appear to be transferred from the learners’
first language include discourse conventions such as French on peut dire, which boosts
“French-like” English phrases such as we may wonder and we can wonder, as
well as other cases that display “French-speaking novice writers’ reliance on phra-
seological cascades including je dirais to conclude their argumentative essays”
(Paquot 2013: 409).

2.4. Access to data, transcription, annotation and statistical methods

In this final subsection, I address issues in corpus construction which are more
technical in nature, especially regarding the construction of spoken corpora (cf.
also Culpeper and Archer this volume). There are different ways in which spoken
data can be made accessible in corpora, and we can distinguish between text-based
and multimodal spoken corpora. All speech corpora, from the ground-breaking
London-Lund Corpus (Svartvik 1990) onwards, contain texts with transcriptions
of speech. More recently compiled corpora are sometimes multimodal, in that they
also give users access to audio and/or video files containing the conversations
(Andersen 2010). This has an obvious advantage, especially to research in corpus
pragmatics, which in many contexts requires “a holistic approach to language data,
in which all aspects of an utterance are investigated” (Andersen 2011: 598). COLT
was among the first corpora to make audio data available to its users, and sound
files aligned with their transcriptions are now accessible as part of the large, inter-
national CLARIN infrastructure.3 The use of video data allows for not only the
concurrent study of words and acoustic features such as intonation, but it may also
enable the detailed study of gestures, facial expressions, movements and posture,
which also play prominent roles in communication. However, the access to video
data has obvious consequences for privacy protection of the speakers; this may be
an obstacle to research, and it can be argued that “the utilisation of video material
in corpus-based pragmatics is still in its infancy” (Andersen 2011: 598).

3 http://clarino.uib.no/korpuskel/landing-page?identifier=colt&view=short

But even the transcription of a spoken corpus raises crucial methodological
issues. In a recent special issue focusing on this topic, Kirk and Andersen (2016)
stress that, although fairly well-established conventions exist for how to transcribe
speech, transcriptions “amount to no more than selections of linguistic features
from what through utterances was intersubjectively communicated. Transcriptions
are abstractions from – or ‘idealisations’ (Cook 1995: 38) about – a given utterance”
(Kirk and Andersen 2016: 291). The special issue aims to take a step towards
best practice in corpus transcription and annotation. One contribution shows, for
example, that subjectivity is a general feature of transcription. Andersen (2016)
compares COLT and the London English Corpus, and the comparison unveils a
series of corpus-internal as well as corpus-external differences that are due not to
genuine differences between the two corpora or user groups within them, but due to
inconsistent patterns of transcription. The differences pertain to the transcription of
what Andersen calls “semi-lexical features” (Andersen 2016: 324), namely voiced
pauses, interjections, response signals, certain discourse markers and phonological
reductions, categories which were all characterised by considerable inconsistency
in their transcription in the two corpora. On the other hand, colloquialisms and
dialect forms were much less problematic, as they seemed to involve word forms
where conventional orthographies prevail. Thus, Andersen concludes, a lot can be
gained by better standardisation in English corpus transcription.
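
The effect of inconsistent transcription on corpus comparison can be made concrete with a small sketch. The Python fragment below is purely illustrative and is not part of Andersen's (2016) study: it maps variant spellings of voiced pauses and response signals to one canonical form before counting them, so that two differently transcribed samples become comparable. The mapping table, the function names and the toy data are all assumptions; a real project would derive the mapping from the transcription guidelines of the corpora concerned.

```python
from collections import Counter

# Hypothetical mapping from variant spellings to one canonical form.
# The variants are illustrative only; they are not taken from the
# transcription guidelines of COLT or the London English Corpus.
CANONICAL_FORMS = {
    "er": "er", "uh": "er",
    "erm": "erm", "um": "erm",
    "mhm": "mhm", "mm-hm": "mhm",
}


def normalise(tokens, mapping=CANONICAL_FORMS):
    """Replace known variant spellings by their canonical form.

    Tokens that are not in the mapping are returned unchanged.
    """
    return [mapping.get(token.lower(), token) for token in tokens]


def count_forms(tokens, forms):
    """Return the raw frequency of each of the given forms."""
    counts = Counter(tokens)
    return {form: counts[form] for form in forms}


# Toy data: the same utterance transcribed under two different conventions.
corpus_a = "er I think erm it was mhm".split()
corpus_b = "uh I think um it was mm-hm".split()

targets = ["er", "erm", "mhm"]
raw_a = count_forms(corpus_a, targets)
raw_b = count_forms(corpus_b, targets)
normalised_a = count_forms(normalise(corpus_a), targets)
normalised_b = count_forms(normalise(corpus_b), targets)
```

Before normalisation the two toy transcripts appear to differ in their use of er, erm and mhm; after normalisation the counts coincide, which suggests that the apparent difference was an artefact of transcription rather than of usage.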
Among the most valuable features of corpora is that they are commonly aug-
mented with layers of annotation of various types of linguistic information, such
as the word class of individual words, the lemma each word form belongs to, and
the syntactic structure, which is the output from automatic parsing of the sentences
in the corpus (Nelson, Wallis and Aarts 2002). Of particular importance to corpus
pragmatics is prosodic annotation, which is necessarily laborious, but which was
quintessential to Kimps’ (2016) study of tag questions, mentioned above. For two
of the corpora, COLT and LLC, Kimps could rely on existing annotation of tone
units, stress and intonation that was available as text files in the corpora. The two
corpora have similar annotation systems that are based on Crystal (1969), which
was strongly influenced by Halliday’s (1967) system for prosodic analysis. Spot
checks made by the author suggested that the transcriptions were accurate (Kimps
2016: 42). For ICE-GB, however, Kimps had to carry out the prosodic analysis of the tag
questions in the dataset. The methodological value of having access to prosodic transcription
is also obvious from a study by Kjellmer (2009). He considers it a major drawback
that the corpus he is using in his study of backchannels, the spoken component of
the Cobuild Direct corpus, does not contain prosodic annotation, “which, if given,
would have disambiguated a number of occurrences” (Kjellmer 2009: 85). Another
recent study that benefits greatly from prosodic annotation is Lin (2013). Based
on the IBM/Lancaster Spoken English Corpus, this is an innovative study in that it
examines the prosody of formulaic language, showing that “[w]hether a formulaic
expression receives the nucleus in its immediate context depends on its position in
the intonation unit, its ‘holisticity’, ‘pragmatic meaningfulness’ and ‘predictabil-
ity’” (Lin 2013: 580). In other words, prosody plays an important part in deter-
mining the status of formulaic expressions. Further, among the types of annotation
that pertain most directly to corpus pragmatics is discourse-pragmatic annotation,
which, like prosodic annotation, requires extensive manual work. Although their
exploitation in corpus pragmatics is still in its early stages, such systems
have been developed and utilised, not least in the context of the Irish component of
ICE, thanks to the efforts of John Kirk and his colleagues (Kallen and Kirk 2012).
The SPICE-Ireland scheme has annotations enabling the search and retrieval of
speech act types (directives, expressives), discourse markers, tags, quotatives, etc.,
and the system may well serve as a model for similar annotation pursuits in the
future. Another dimension of pragmatic annotation is introduced by Rühlemann
and O’Donnell (2012), mentioned above, who propose a system for the annotation
of narrative structure in discourse.
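
To illustrate what layered annotation makes possible, the following Python sketch stores word class and lemma at token level and a speech act label and discourse markers at utterance level, and then retrieves utterances by either layer. The data model, the tag labels and the function names are hypothetical and chosen only for illustration; they do not reproduce the SPICE-Ireland scheme or the format of any other corpus mentioned here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Token:
    form: str   # word form as transcribed
    lemma: str  # lemma the word form belongs to
    pos: str    # word class label (the tagset is an assumption)


@dataclass
class Utterance:
    speaker: str
    tokens: List[Token]
    speech_act: str = ""  # e.g. "directive"; empty if not annotated
    discourse_markers: List[str] = field(default_factory=list)

    def text(self) -> str:
        """Return the utterance as a plain string of word forms."""
        return " ".join(token.form for token in self.tokens)


def find_by_speech_act(utterances, speech_act):
    """Retrieve all utterances annotated with the given speech act type."""
    return [u for u in utterances if u.speech_act == speech_act]


def find_by_lemma(utterances, lemma):
    """Retrieve all utterances containing any word form of the given lemma."""
    return [u for u in utterances if any(t.lemma == lemma for t in u.tokens)]


# Toy sample with invented utterances and annotations.
sample = [
    Utterance("A", [Token("close", "close", "VERB"),
                    Token("the", "the", "DET"),
                    Token("door", "door", "NOUN")],
              speech_act="directive"),
    Utterance("B", [Token("oh", "oh", "INTJ"),
                    Token("lovely", "lovely", "ADJ")],
              speech_act="expressive", discourse_markers=["oh"]),
    Utterance("A", [Token("she", "she", "PRON"),
                    Token("closed", "close", "VERB"),
                    Token("it", "it", "PRON")]),
]

directives = [u.text() for u in find_by_speech_act(sample, "directive")]
close_hits = [u.text() for u in find_by_lemma(sample, "close")]
```

The lemma layer retrieves both close and closed with a single query, while the speech act layer retrieves utterances that no search for word forms could identify. This is the kind of form-independent retrieval that discourse-pragmatic annotation is meant to enable.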
The final issue, to be dealt with briefly, concerns the ways in which corpora
can be explored via quantitative techniques. Such techniques have been utilised
widely in other branches of linguistics, including phraseology, lexicography and
terminology research, but are now also gaining ground in corpus pragmatics. As
suggested in the discussion of corpus-driven approaches in section 2.1, state-of-the-art
corpus linguistics needs to let users go beyond simply searching corpora and carry out
a range of statistical analyses. This entails that in
corpus construction one should ideally make the corpus texts available for users to
perform statistical operations on them, or produce downloadable statistics which
users may subsequently explore. The range of statistical corpus methods includes
the analysis of word frequencies, n-grams, collocations and keyness. For example,
Clancy (2011) uses frequency analyses to “elucidate the benefits of synergy of
corpus linguistics and variational pragmatics” (Clancy 2011: 371) in an analysis
of hedging behaviour in two different home/family environments with data drawn
from the Limerick Corpus of Irish English. One family represents the middle class
mainstream culture and is contrasted with a family representing the Irish Traveller
Community. This study illustrates how frequency comparison can highlight differ-
ences between the two speaker groups with regard to hedging expressed through
markers such as like, I think, just, you know and actually, showing that “[t]he
Traveller Community exhibits some of the characteristics of East Asian collectivist
cultures” (Clancy 2011: 382). Another method that requires more sophisticated
calculation than a simple frequency analysis is the analysis of collocations. Two
words are said to collocate if they occur in combination more often than would be
expected given their individual frequencies (e. g. Sinclair 1991; Sag et al. 2002;
Lyse and Andersen 2012). Collocations are identified by means of a statistical
measure of association between words, such as the Mutual Information score,
which quantifies the collocational strength of two (or more) words and thus allows
candidate collocations to be ranked. This method is far less exploited
in pragmatics than its usefulness would suggest (but see Trommer 2011). In a
recent study, Andersen (2016) argues that many types of pragmatic innovation
have to do with the combination of existing word forms in new and innovative ways.
This can be seen, for instance, in the emergence of the expression you get me used
innovatively in London English with an interactional function akin to you know
what I mean (Torgersen et al. 2011). Collocations are particularly
relevant in the context of corpus pragmatics because pragmatic analyses often concern
the grammaticalisation and reanalysis of particular forms which take on new pragmatic
functions. Grammaticalisation necessarily leads to changes in the combinatory
possibilities of words and the degree to which particular words combine.
The emergence of a new discourse marker such as you get me has an inevitable
effect on the collocational behaviour of its constituent parts, you+get+me, which
can be seen to occur with increasing frequency as a result of its reanalysis into a
discourse marker once it catches on among the speakers in a community. Thus,
corpus pragmatics can gain a lot from a more systematic study of collocations, and
corpus constructors should make it possible for researchers to access collocation
data, and not just access the corpus via a search interface. Yet another statistical
method, the analysis of keyness, briefly touched upon in section 2.1, identifies
word forms that are used uniquely or significantly more frequently in one corpus
(or section of a corpus) than in another. These can be said to represent the “aboutness”
of a corpus in terms of its cultural or topical characteristics or its stylistic
features. The method is used in a study of the language of television by Bednarek
(2012), who shows that emotionality, expressed through “key” trigrams (sequences
of three words) such as no no no, what the hell and oh my god, is a crucial “defining
feature of the language of television, cutting across individual series and different
television genres” (Bednarek 2012: 59). Keyword analysis has been
especially salient in the study of political discourse (Johnson, Culpeper and Suhr
2003; Baker 2004; Archer 2009) and in corpus stylistics, where keywords are seen
as important markers of style. Examples are Culpeper’s (2009) study of character
talk in Shakespeare’s Romeo and Juliet and Fischer-Starke’s (2009) study of
Austen’s Pride and Prejudice, the latter showing that this research methodology has the
potential to “[uncover] meanings that are not discussed in literary critical second-
ary sources” (Fischer-Starke 2009: 492).
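
The two association-based methods discussed above can be sketched in a few lines of Python. The fragment below is a minimal illustration under stated assumptions, not the procedure used in any of the studies cited. It computes pointwise mutual information for adjacent word pairs, which is one common formulation of the Mutual Information score, and a log-likelihood value as a keyness statistic. The chapter does not name a particular keyness statistic, so the choice of log-likelihood is an assumption, as are the function names and the toy data.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Pointwise mutual information (log base 2) for adjacent word pairs.

    A positive score means the pair occurs more often than expected
    from the individual frequencies of its two words.
    """
    n = len(tokens)
    if n < 2:
        return {}
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_bigrams = n - 1
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable scores
        p_pair = count / n_bigrams
        p_w1 = unigrams[w1] / n
        p_w2 = unigrams[w2] / n
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood keyness of one item in corpus A versus corpus B.

    freq_* are raw frequencies of the item, size_* are corpus sizes in
    tokens. The value is 0 when the relative frequencies are equal.
    """
    total = size_a + size_b
    expected_a = size_a * (freq_a + freq_b) / total
    expected_b = size_b * (freq_a + freq_b) / total
    result = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:
            result += observed * math.log(observed / expected)
    return 2 * result


# Toy data, invented for illustration.
tokens = "you get me innit you get me yeah i get it".split()
pmi = bigram_pmi(tokens, min_count=2)
ranked = sorted(pmi.items(), key=lambda item: item[1], reverse=True)

# Keyness of an item occurring 30 times in 1,000 tokens versus
# 5 times in another 1,000 tokens.
keyness = log_likelihood(30, 1000, 5, 1000)
```

In the toy data only you get and get me pass the frequency threshold, and both receive a positive score. For the keyness value, a result above 3.84 corresponds to significance at the 5 per cent level with one degree of freedom, although on real corpora such thresholds should be applied with caution because many items are tested at once.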

3. Concluding remarks

In the preceding section I have chosen to present previous research and proce-
dures in corpus construction as a series of dichotomies, such as corpus-based vs.
corpus-driven studies, representative vs. balanced corpus, etc., and shown how
various studies in corpus pragmatics have contributed new insights into language
use by exploring corpora that were constructed according to different selective
processes. Some general issues are worth addressing at this time. From the above
discussion, an important lesson to be learnt is that, despite the availability of a wide
selection of ready-made corpora, embarking on a corpus construction process for
the purpose of a specific research project in corpus pragmatics (and beyond) may
well be worth the effort. A study such as McIntyre and Walker (2011) showed that
it may be fruitful to build and annotate a new corpus specifically designed for a
particular research objective, in their case in order to zoom in on a particular period
of English writing. Another overall observation is that corpus pragmatics research
is fundamentally interdisciplinary in its nature (Murphy 2012), and the corpus
linguistic approach provides an adequate methodological link to neighbouring
disciplines such as political discourse studies, literary studies and sociology. Some
of the research also stresses the need for a holistic approach to data that requires
access not just to transcriptions but to audio or video recordings, and for “more
fine-grained explorations of corpora, where indeed the corpus is small and lends
itself to such analyses” (Murphy 2012: 344). Yet other work highlights the need to
make more use of sophisticated statistical techniques in corpus pragmatics, which
have a great potential for unveiling pragmatic innovation and allowing for fuller
accountability of the data (Andersen 2016). A crucial methodological issue that
has received only limited attention in corpus pragmatics is the replicability of
studies. With regard to corpus construction, this means that compilers of corpora
must document their choices and procedures in research articles or as published
guidelines, showing for instance the choices made in the transcription procedure
(Kirk and Andersen 2016). There are also other issues in corpus construction that
would merit a fuller discussion than this chapter allows for. One problem that is
as yet unresolved in corpus linguistics is the study of absence, which relates to the
charge that is sometimes made against corpus linguistics that it cannot deal with
information not to be found in a corpus. As Partington (2014) puts it, “[corpus lin-
guistics] may have much to say about what is present in the corpus being examined
but it cannot enlighten us about what is absent, what is not found therein” (Parting-
ton 2014: 119), a criticism raised by proponents of Critical Discourse Analysis (cf.
discussion in Baker 2005). Another big issue is the relation between corpus prag-
matics and theoretical pragmatics and, by extension, whether corpora can be used
systematically for the study of implicit meaning. This is pointed out by Larrivee
and Duffley (2014), who state that, “[w]hile corpus pragmatics has been pursued
for the study of particular items […] and for more general pragmatic phenomena
such as speech acts, […] implicatures, presuppositions and similar pragmatic phe-
nomena remain to be investigated more fully with regard to their actual occurrence
in real language use” (Larrivee and Duffley 2014: 544). This raises a fundamental
question of how hypotheses about language grounded in pragmatic theory, such
as the principle of relevance, can be tested on the basis of corpora. These are
challenges that are not to be taken lightly and that will likely shape the
discussion of theory and methodology in corpus pragmatics in the years to come.

References

Ädel, Annelie and Randi Reppen
2008 The challenges of different settings: An overview. In: Annelie Ädel and Randi
Reppen (eds.), Corpora and Discourse: The Challenges of Different Settings,
1–6. Amsterdam: John Benjamins.
Adolphs, Svenja
2008 Corpus and Context: Investigating Pragmatic Functions in Spoken Discourse.
Amsterdam: John Benjamins.
Adolphs, Svenja, Dawn Knight and Ronald Carter
2011 Capturing context for heterogeneous corpus analysis. International Journal of
Corpus Linguistics 16(3): 305–324.
Aijmer, Karin (ed.)
2009 Corpora and Language Teaching. Amsterdam: John Benjamins.
Aijmer, Karin and Anne-Marie Simon-Vandenbergen (eds.)
2006 Pragmatic Markers in Contrast. Amsterdam: Elsevier.
Andersen, Gisle
2001 Pragmatic Markers and Sociolinguistic Variation. Amsterdam: John Benja-
mins.
Andersen, Gisle
2010 How to use corpus linguistics in sociolinguistics. In: Anne O’Keeffe and
Michael McCarthy (eds.), The Routledge Handbook of Corpus Linguistics,
547–562. London and New York: Routledge.
Andersen, Gisle
2011 Corpus-based pragmatics I: Qualitative studies. In: Wolfram Bublitz and
Neal Norrick (eds.), Foundations of Pragmatics, 587–627. Berlin: Mouton de
Gruyter.
Andersen, Gisle
2016 Semi-lexical features in corpus transcription: Consistency, comparability,
standardisation. International Journal of Corpus Linguistics 21(3): 324–348.
Andersen, Gisle
2016 Using the corpus-driven method to chart discourse-pragmatic change. In:
Heike Pichler (ed.), Discourse-Pragmatic Variation and Change in English:
New Methods and Insights, 21–40. Cambridge: Cambridge University Press.
Andersen, Gisle and Karin Aijmer (eds.)
2011 Pragmatics of Society. (Handbooks of Pragmatics 5.). Berlin: Mouton de
Gruyter.
Andersen, Gisle and Knut Hofland
2012 Building a large monitor corpus based on newspapers on the web. In: Gisle
Andersen (ed.), Exploring Newspaper Language: Using the Web to Create and
Investigate a Large Corpus of Modern Norwegian, 1–30. Amsterdam: John
Benjamins.
Androutsopoulos, Jannis K.
2008 Potentials and limitations of discourse-centred online ethnography. Lan-
guage@internet 5(8): 1–20.
Archer, Dawn (ed.)
2009 What’s in a Word List? Investigating Word Frequency and Keyword Extrac-
tion. London: Ashgate.
Baker, Paul
2004 Querying keywords: Questions of difference, frequency and sense in keywords
analysis. Journal of English Linguistics 32(4): 346–359.
Baker, Paul
2005 Public Discourses of Gay Men. London: Routledge.
Baker, Paul
2009 The BE06 Corpus of British English and recent language change. Interna-
tional Journal of Corpus Linguistics 14(3): 312–337.
Baker, Paul
2010 Sociolinguistics and Corpus Linguistics. Edinburgh: Edinburgh University
Press.
Baker, Paul
2017 British and American English: A Common Language? Cambridge: Cambridge
University Press.
Barron, Anne, Irina Pandarova and Karoline Muderack
2015 Tag questions across Irish English and British English: A corpus analysis of
form and function. Multilingua 34(4): 495–525.
Bednarek, Monika
2012 Key words and trigrams in TV series. International Journal of Corpus Lin-
guistics 17: 135–163.
Biber, Douglas
1993 Representativeness in corpus design. Literary and Linguistic Computing 5(4):
257–269.
Biber, Douglas
2009 A corpus-driven approach to formulaic language in English: Multi-word pat-
terns in speech and writing. International Journal of Corpus Linguistics 14(3):
275–311.
Biber, Douglas, Edward Finegan and Dwight Atkinson
1994 ARCHER and its challenges: Compiling and exploring A Representative Cor-
pus of Historical English Registers. In: Udo Fries, Peter Schneider and Gunnel
Tottie (eds.), Creating and Using English Language Corpora, 1–13. Amster-
dam: Rodopi.
Čermák, František
2009 Spoken corpus design: Their constitutive parameters. International Journal of
Corpus Linguistics 14(1): 113–123.
Cheshire, Jenny
2007 Discourse variation, grammaticalisation and stuff like that. Journal of Socio-
linguistics 11(2): 155–193.
Cheshire, Jenny, Sue Fox, Paul Kerswill and Eivind Torgersen
2008 Ethnicity, friendship network and social practices as the motor of dialect
change: Linguistic innovation in London. Sociolinguistica 22: 1–23.
Clancy, Brian
2011 Complementary perspectives on hedging behaviours in family discourse.
International Journal of Corpus Linguistics 16(3): 371–390.
Collins, Peter
2005 It-clefts and wh-clefts: Prosody and pragmatics. Journal of Pragmatics 38:
1706–1720.
Collins, Peter and Xinyue Yao
2013 Colloquial features in World Englishes. International Journal of Corpus Lin-
guistics 18(4): 479–505.
Connor, Ulla and Thomas A. Upton (eds.)
2004 Discourse in the Professions: Perspectives from Corpus Linguistics. Amsterdam:
John Benjamins.
Corley, Martin, Lucy J. MacGregor and David I. Donaldson
2007 It’s the way that you, er, say it: Hesitations in speech affect language compre-
hension. Cognition 105: 658–668.
Crystal, David
1969 Prosodic Systems and Intonation in English. Cambridge: Cambridge Univer-
sity Press.
Crystal, David
2001 Language and the Internet. Cambridge: Cambridge University Press.
Crystal, David
2006 Language and the Internet. 2nd ed. Cambridge: Cambridge University Press.
Culpeper, Jonathan
2009 Words, parts-of-speech and semantic categories in the character-talk of Shake-
speare’s Romeo and Juliet. International Journal of Corpus Linguistics 14(1):
29–59.
Davies, Mark
2009 The 385+ million word Corpus of Contemporary American English (1990–
2008+): Design, architecture, and linguistic insights. International Journal of
Corpus Linguistics 14(2): 159–190.
Defrancq, Bart and Gert De Sutter
2010 Contingency hedges in Dutch, French and English. International Journal of
Corpus Linguistics 15(2): 183–213.
Denis, Derek
2011 Innovators and innovation: Tracking the innovators of and stuff in York Eng-
lish. University of Pennsylvania Working Papers in Linguistics 17(2): 61–70.
Denis, Derek
2016 The role of children in the propagation of discourse-pragmatic change:
insights from the acquisition of quotative variation. In: Heike Pichler (ed.),
Discourse-Pragmatic Variation and Change in English: New Methods and
Insights, 160–182. Cambridge: Cambridge University Press.
Deutschmann, Mats, Annelie Ädel, Gregory Garretson and Terry Walker
2009 Introducing Mini-McCALL: A pilot version of the Mid-Sweden Corpus of
Computer-Assisted Language Learning. ICAME Journal 33: 21–44.
Dostie, Gaétane
2009 Discourse markers and regional variation in French: A lexico-semantic
approach. In: Kate Beeching, Nigel Armstrong and Françoise Gadet (eds.),
Sociolinguistic Variation in Contemporary French, 201–214. Amsterdam:
John Benjamins.
Drange, Eli-Marie Danbolt, Kristine Hasund and Anna-Brita Stenström
2014 Teenagers’ swearing by mothers in English, Spanish and Norwegian. Interna-
tional Journal of Corpus Linguistics 19(1): 29–59.
Ebeling, Jarle and Signe Oksefjell Ebeling
2013 Patterns in Contrast. Amsterdam: John Benjamins.
Facchinetti, Roberta
2002 Can and could in contemporary British English: A study of the ICE-GB cor-
pus. In: Pam Peters, Peter Collins and Adam Smith (eds.), New Frontiers of
Corpus Research, 229–246. Amsterdam: Rodopi.
Fischer-Starke, Bettina
2009 Keywords and frequent phrases of Jane Austen’s Pride and Prejudice. Interna-
tional Journal of Corpus Linguistics 14(4): 492–523.
Fjeld, Ruth Vatvedt
2002 Om banning og sverting. Maal og Minne 2: 152–166.
Flowerdew, John
2002 Academic Discourse. London: Longman.
Fløttum, Kjersti, Trine Dahl, Anders Alvsåker Didriksen and Anje Müller Gjesdahl
2009 KIAP – reflections on a complex corpus. In: Lidun Hareide, Christer Johans-
son and Michael Oakes (eds.), The Many Facets of Corpus Linguistics in Ber-
gen, 137–150. Bergen: BeLLs.
Francis, W. Nelson and Henry Kucera
1979 Brown Corpus Manual: Revised Version. Providence, Rhode Island: Brown
University.
Gabrielatos, Costas, Tony McEnery, Peter J. Diggle and Paul Baker
2012 The peaks and troughs of corpus-based contextual analysis. International
Journal of Corpus Linguistics 17(2): 151–175.
Granger, Sylviane
2009 Corpora and second-language acquisition. In: Karin Aijmer (ed.), Corpora
and Language Teaching, 13–32. Amsterdam: John Benjamins.
Greenbaum, Sidney (ed.)
1996 Comparing English Worldwide: The International Corpus of English. Oxford:
Clarendon Press.
Halliday, M.A.K.
1967 Intonation and Grammar in British English. The Hague: Mouton.
Halverson, Sandra
2012 Metonymic extension and vagueness: Schengen and Kyoto in Norwegian
newspaper language. In: Gisle Andersen (ed.), Exploring Newspaper Lan-
guage: Using the Web to Create and Investigate a Large Corpus of Modern
Norwegian, 285–306. Amsterdam: John Benjamins.
Hasko, Victoria
2013 Introduction to special issue: New frontiers in learner corpus research. Inter-
national Journal of Corpus Linguistics 18(3): 295–300.
Hasselgård, Hilde
2006 “Not now” – On non-correspondence between the cognate adverbs now and
nå. In: Karin Aijmer and Anne-Marie Simon-Vandenbergen (eds.), Pragmatic
Markers in Contrast, 91–113. Amsterdam: Elsevier.
Hofland, Knut and Stig Johansson
1982 Word Frequencies in British and American English. London: Longman.
Johansson, Stig
2006 How well can well be translated? On the English discourse particle well and
its correspondences in Norwegian and German. In: Karin Aijmer and Anne-Marie
Simon-Vandenbergen (eds.), Pragmatic Markers in Contrast, 115–138.
Amsterdam: Elsevier.
Johnson, Sally, Jonathan Culpeper and Stephanie Suhr
2003 From ‘politically correct councillors’ to ‘Blairite nonsense’: Discourses of
political correctness in three British newspapers. Discourse and Society 14(1):
28–47.
Jucker, Andreas H.
2006 “but ‘tis believed that …”: Speech and thought presentation in Early English
newspapers. In: Nicholas Brownlees (ed.), News Discourse in Early Modern
Britain, 105–125. Bern: Peter Lang.
Jucker, Andreas H., Daniel Schreier and Marianne Hundt (eds.)
2009 Corpora: Pragmatics and Discourse. Papers from the 29th International Con-
ference on English Language Research on Computerized Corpora (ICAME
29). Amsterdam: Rodopi.
Jucker, Andreas H. and Irma Taavitsainen
2014 Complimenting in the history of American English: A metacommunicative
expression analysis. In: Irma Taavitsainen and Andreas Jucker (eds.), Dia-
chronic Corpus Pragmatics, 257–276. Amsterdam: John Benjamins.
Kallen, Jeffrey L. and John M. Kirk
2012 SPICE-Ireland: A User’s Guide: Documentation to Accompany the SPICE-Ire-
land Corpus: Systems of Pragmatic Annotation in ICE-Ireland. Belfast/Dub-
lin: Queen’s University Belfast & Trinity College Dublin.
Kehoe, Andrew and Matt Gee
2012 Reader comments as an aboutness indicator in online texts: Introducing the
Birmingham Blog Corpus. In: Signe Oksefjell Ebeling, Jarle Ebeling and
Hilde Hasselgård (eds.), Aspects of Corpus Linguistics: Compilation, Annota-
tion, Analysis. Helsinki: University of Helsinki.
Kilgarriff, Adam and Gregory Grefenstette
2003 Introduction to the Special Issue on Web as Corpus. Computational Linguis-
tics 29(3): 1–15.
Kimps, Ditte
2016 English variable tag questions: A typology of their interpersonal meanings.
Unpublished PhD thesis, Departement Taalkunde, KU Leuven, Leuven.
Kimps, Ditte, Kristin Davidse and Bert Cornillie
2014 The speech function of tag questions and their properties: A comparison of
their distribution in COLT and LLC. In: Kristin Davidse, Caroline Gentens,
Lobke Ghesquière and Lieven Vandelanotte (eds.), Corpus Interrogation and
Grammatical Patterns, 321–350. Amsterdam: John Benjamins.
Kirk, John and Gisle Andersen
2016 Introduction: Compilation, transcription, markup and annotation of spoken
corpora. International Journal of Corpus Linguistics 21(3): 291–298.
Kjellmer, Göran
2003 Hesitation: In defence of er and erm. English Studies 84(2): 170–197.
Kjellmer, Göran
2009 Where do we backchannel? On the use of mm, mhm, uh huh and such like.
International Journal of Corpus Linguistics 14(1): 81–112.
Larrivee, Pierre and Patrick Duffley
2014 The emergence of implicit meaning: Scalar implicatures with some. Interna-
tional Journal of Corpus Linguistics 19(4): 530–547.
Leech, Geoffrey
1991 The state of the art in corpus linguistics. In: Karin Aijmer and Bengt Altenberg
(eds.), English Corpus Linguistics, 8–29. London: Longman.
Leech, Geoffrey
2007 New resources, or just better old ones? The holy grail of representativeness.
In: Marianne Hundt, Nadia Nesselhauf and Carolin Biewer (eds.), Corpus Lin-
guistics and the Web, 133–149. Amsterdam: Rodopi.
Leech, Geoffrey, Marianne Hundt, Christian Mair and Nicholas Smith (eds.)
2009 Change in Contemporary English. Cambridge: Cambridge University Press.
Lin, Phoebe M.S.
2013 The prosody of formulaic expressions in the IBM/Lancaster SEC. Interna-
tional Journal of Corpus Linguistics 18(4): 561–588.
Ljung, Magnus
2011 Swearing: A Cross-cultural Linguistic Study. Houndmills: Palgrave Macmillan.
Lyse, Gunn Inger and Gisle Andersen
2012 Collocations and statistical analysis of n-grams. In: Gisle Andersen (ed.),
Exploring Newspaper Language: Using the Web to Create and Investigate a
Large Corpus of Modern Norwegian, 79–110. Amsterdam: John Benjamins.
Mair, Christian
1997 Parallel corpora: A real-time approach to the study of language change in pro-
gress. In: Magnus Ljung (ed.), Corpus-based Studies in English, 195–209.
Amsterdam: Rodopi.
McEnery, Tony and Andrew Hardie
2012 Corpus Linguistics. Cambridge: Cambridge University Press.
McIntyre, Dan and Brian Walker
2011 Discourse presentation in Early Modern English writing: A preliminary cor-
pus-based investigation. International Journal of Corpus Linguistics 16(1):
101–130.
Millar, Don and Douglas Biber
2015 Evaluating reliability in quantitative vocabulary studies. International Journal
of Corpus Linguistics 20(1): 30–53.
Murphy, Bróna
2012 Exploring response tokens in Irish English – a multidisciplinary approach.
International Journal of Corpus Linguistics 17(3): 325–348.
Nelson, Gerald, Sean Wallis and Bas Aarts
2002 Exploring Natural Language: Working with the British Component of the
International Corpus of English. Amsterdam: John Benjamins.
O’Keeffe, Anne and Michael McCarthy (eds.)
2010 The Routledge Handbook of Corpus Linguistics. Abingdon: Routledge.
Paquot, Magali
2013 Lexical bundles and L1 transfer effects. International Journal of Corpus Lin-
guistics 18(3): 391–417.
Partington, Alan
2014 Mind the gaps: The role of corpus linguistics in researching absences. Interna-
tional Journal of Corpus Linguistics 19(1): 118–146.
Pichler, Heike
2013 The Structure of Discourse-pragmatic Variation. Amsterdam: John Benjamins.
Pichler, Heike
2016 Uncovering discourse-pragmatic innovations: ‘innit’ in Multicultural London
English. In: Heike Pichler (ed.), Discourse-Pragmatic Variation and Change
in English: New Methods and Insights, 59–85. Cambridge: Cambridge Univer-
sity Press.
Pichler, Heike (ed.)
2016 Discourse-Pragmatic Variation and Change in English: New Methods and
Insights. Cambridge: Cambridge University Press.
Placencia, María Elena and Carmen García (eds.)
2007 Research on Politeness in the Spanish-speaking World. Mahwah, N.J.: Law-
rence Erlbaum.
Renouf, Antoinette
2007 Corpus development 25 years on: From super-corpus to cyber-corpus. In:
Roberta Facchinetti (ed.), Corpus Linguistics 25 Years on, 27–49. Amsterdam/
New York: Rodopi.
Renouf, Antoinette and Andrew Kehoe
2013 WebCorp Linguist’s Search Engine. International Journal of Corpus Linguis-
tics 18(2): 167–198.
Rissanen, Matti and Jukka Tyrkkö
2013 The Helsinki Corpus of English Texts. VARIENG – Studies in Variation, Con-
tacts and Change in English 14.
Romero-Trillo, Jesús
2008 Pragmatics and Corpus Linguistics: A Mutualistic Entente. Berlin: Mouton de
Gruyter.
Rühlemann, Christoph
2011 Corpus-based pragmatics II: Quantitative studies. In: Wolfram Bublitz and
Neal Norrick (eds.), Foundations of Pragmatics, 629–656. Berlin: de Gruyter
Mouton.
Rühlemann, Christoph and Karin Aijmer (eds.)
2016 Corpus Pragmatics. Cambridge: Cambridge University Press.
Rühlemann, Christoph, Andreas Bagoutdinov and Matthew Brook O’Donnell
2011 Windows on the mind: Pauses in conversational narrative. International Jour-
nal of Corpus Linguistics 16(2): 198–232.
Rühlemann, Christoph and Matthew Brook O’Donnell
2012 Towards a corpus of conversational narrative: Construction and annotation of
the Narrative Corpus. Corpus Linguistics and Linguistic Theory 8(2): 313–
350.
Sacks, Harvey
1992 Lectures on Conversation, Volume 2. Cambridge: Blackwell.
Sag, Ivan A., Timothy Baldwin, Francis Bond, Ann Copestake and Dan Flickinger
2002 Multiword expressions: A pain in the neck for NLP. Lecture Notes in Com-
puter Science 2276: 1–15.
Salway, Andrew, Dag Elgesem, Knut Hofland, Øystein Reigem and Lubos Steskal
2016 Topically-focused Blog Corpora for Multiple Languages. Proceedings of the
10th Web as Corpus workshop (WAC-X): 17–26.
Schneider, Klaus P. and Anne Barron (eds.)
2008 Variational Pragmatics. Amsterdam: John Benjamins.
Semino, Elena and Mick Short
2004 Corpus Stylistics: Speech, Writing and Thought Presentation in a Corpus of
English Writing. London: Routledge.
Sinclair, John
1991 Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Sinclair, John
1996 The search for units of meaning. Textus IX: 75–106.
Stenström, Anna-Brita and Gisle Andersen
1996 More trends in teenage talk: A corpus-based investigation of the discourse
items cos and innit. In: Carol E. Percy, Charles F. Meyer and Ian Lancashire
(eds.), Synchronic Corpus Linguistics, 189–203. Amsterdam: Rodopi.
Stenström, Anna-Brita, Gisle Andersen and Kristine Hasund
2002 Trends in Teenage Talk: Corpus Compilation, Analysis and Findings. Amster-
dam: John Benjamins.
Svartvik, Jan (ed.)
1990 The London Corpus of Spoken English: Description and Research. (Lund
Studies in English.) Lund: Lund University Press.
Swerts, Marc
1998 Filled pauses as markers of discourse structure. Journal of Pragmatics 30:
485–496.
Taavitsainen, Irma, Andreas H. Jucker and Jukka Tuominen (eds.)
2014 Diachronic Corpus Pragmatics. Amsterdam: John Benjamins.
Tagliamonte, Sali
2016 Antecedents of innovation: Exploring general extenders in conservative dia-
lects. In: Heike Pichler (ed.), Discourse-Pragmatic Variation and Change in
English: New Methods and Insights, 115–138. Cambridge: Cambridge Univer-
sity Press.
Tagliamonte, Sali and Derek Denis
2010 The stuff of change: General extenders in Toronto, Canada. Journal of English
Linguistics 38(4): 335–368.
Tognini-Bonelli, Elena
2001 Corpus Linguistics at Work. Amsterdam: John Benjamins.
Torgersen, Eivind, Costas Gabrielatos, Sebastian Hoffmann and Sue Fox
2011 A corpus-based study of pragmatic markers in London English. Corpus Lin-
guistics and Linguistic Theory 7(1): 93–118.
Tottie, Gunnel
1991 Conversational style in British and American English: The case of backchan-
nels. In: Karin Aijmer and Bengt Altenberg (eds.), English Corpus Linguistics,
254–271. London: Longman.
Tottie, Gunnel
2011 Uh and Um as sociolinguistic markers in British English. International Jour-
nal of Corpus Linguistics 16(2): 173–197.
Tottie, Gunnel and Sebastian Hoffmann
2006 Tag questions in British and American English. Journal of English Linguistics
34: 283–311.
Trommer, Ann-Kathrin
2011 Wondering about the intersection of speech acts, politeness and deixis: I won-
dered and I was wondering in the BNC. ICAME Journal 35: 185–204.
Walsh, Steve, Tom Morton and Anne O’Keeffe
2011 Analysing university spoken interaction. International Journal of Corpus Lin-
guistics 16(3): 325–344.
20. Corpus annotation1
Dawn Archer and Jonathan Culpeper

Abstract: The corpus-based method does not seem to promise much reward for
pragmatics research, given its typical focus on form. The practice of annotating
corpora for pragmatic phenomena has not been prominent in the budding field of
corpus pragmatics either. In this chapter, we therefore argue for and demonstrate
the unrealised potential afforded by pragmatic annotation, especially for macro,
interactional and social areas of pragmatic research. We discuss the nature of cor-
pus annotation, the issues of segmentation and implementation, and the current
state-of-the-art with regard to pragmatic annotation schemes (especially for dia-
logue). We also reflect on promising areas of future development.

1. Introduction

This chapter concerns the addition of explicit interpretive information to a corpus
of electronic language data, usually in the form of tags or codes, to assist in the
analysis of pragmatic phenomena. Nearly ten years have elapsed since our last
attempt, along with Matthew Davies, to crystallize the world of corpus annotation
and pragmatics into a handbook chapter (cf. Archer, Culpeper, and Davies 2008).
Back then, we noted that pragmatics and corpus annotation lagged behind work
on other annotation aspects, notably, grammatical annotation. In fact, even some
of the schemes purporting to be “pragmatic annotation” schemes were prioritising
units that are more lexical or syntactic than they are pragmatic (e. g. anaphors,
modal verbs). That said, there had been some progress in developing annotation
schemes which focused on interactional pragmatic entities such as the act, and/or
the exchange, or on relevant aspects of context (e. g. Stiles 1992; Carletta et al.
1997b; Core and Allen 1997; Archer and Culpeper 2003). Since 2008, the academic
landscape has experienced some near seismic shifts, thanks to the development of
corpus pragmatics. Hence, the inclusion of corpus-related chapters in this pragmatics
and methods volume. Other notable works in the area of corpus pragmatics
include Romero-Trillo (2008), Jucker, Schreier and Hundt (2009), Taavitsainen,
Jucker, and Tuominen (2014) and Aijmer and Rühlemann (2014). A commensurate
increase in the attention given to pragmatic annotation has not materialized to date,
however. Consider the two most recent collected volumes on the topic of corpus
pragmatics. Pragmatic annotation gets no specific chapter in Taavitsainen, Jucker
and Tuominen (2014) and a single chapter on “speech act annotation” (Weisser
2014a) out of the sixteen in Aijmer and Rühlemann (2014).

1
We pitch this chapter as an update to our 2008 paper on pragmatic annotation. Some of
the summaries of the pre-2008 literature in that paper are re-used here. This is notably
the case for paragraphs that appear here as part of section 3, many with little change. We
gratefully acknowledge Matthew Davies’s contribution in bringing the original 2008
paper into existence.

https://doi.org/10.1515/9783110424928-020
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). Methods in Pragmatics, 495–525. Berlin/Boston:
De Gruyter Mouton.

The vast majority of extant corpus pragmatics studies have formal features
with pragmatic import (i. e. pragmalinguistic material) as their starting point and/
or primary focus. Although an important area of pragmatics, this bias matches
the bias of computer searches towards form (be it a letter, string of letters or
words), and the tendency within corpus linguistics to concentrate on the relation-
ships between those forms (i. e. text and co-text) at the expense of the (situational,
social and cultural) dynamics of context, particularly at the local, micro level.
This bias tends to mean that more global, macro areas of pragmatics, and espe-
cially the area of sociopragmatics, are neglected or made to be secondary concerns,
not least methodologically. This situation need not be so. Pragmatic annotation
offers a method by which we can tackle such areas directly, as this chapter will
demonstrate.
The field of corpus pragmatics is obviously interdisciplinary. As this chapter
sits in a pragmatics volume, our own bias will be towards pragmatics, not cor-
pus linguistics. Our main question is: what can corpus annotation methods do
for pragmatics, and how? In some cases, and specifically to help move the field
forward, we will discuss areas of pragmatics that have the potential to be explored
through corpus annotation methods, and not simply studies that have already been
undertaken using pragmatic annotation. We will not engage with computational
pragmatics, the neighbouring field of corpus pragmatics, in this chapter, because
of its different research agenda compared with mainstream pragmatics. Simply
put, computational pragmaticians are interested in “getting natural language pro-
cessing systems to reason in a way that allows machines to interpret utterances
in context” (McEnery 1995: 12, our emphasis); in other words, in building “arti-
ficial agents that can carry on conversations with humans in order to perform
tasks like answering questions, keeping schedules, or giving directions” (Jurafsky
2004: 579).
The remainder of this chapter is organized into four sections. The first outlines
some of the general aspects of pragmatic annotation, such as segmentation. The
second considers annotation schemes for interactional phenomena, such as speech
acts. The third considers annotation schemes for contextual phenomena, such as
the gender or social status of the participants. Note that the separation of interac-
tional and contextual reflects the emphasis of the schemes we review; in practice,
there is much overlap, and indeed a few schemes explicitly consider themselves
to be mixing both. The fourth section looks at the potential of new developments,
especially in corpus methods, to further the pragmatics research agenda. In writing
the third section, and to a lesser extent section 4.1, we will re-use some of the
literature summaries that appeared in our 2008 publication (Archer, Culpeper and
Davies 2008).

2. Corpus annotation: Two key issues

Leech’s definition of corpus annotation is “the practice of adding interpretative,
linguistic information to an electronic corpus of spoken and/or written language
data” (1997: 2, original emphasis). As this definition highlights, there are no dark
arts involved in corpus annotation. Rather, it can be as simple as going through a
text with a highlighter (electronic or otherwise), and highlighting every instance
of, for example, a request (where our interest is exploring requests). If we did this
for several texts in a corpus, we would be doing corpus annotation of a pragmatic
kind. Highlighting a text manually is not an optimal way of annotating, however,
if we wish to retrieve those annotations easily, using computers. For that reason,
it is more usual to add characters to an electronic text. The angle-bracket tags of
XML offer an easy solution, with a “switch-on” tag followed by a “switch-off”
tag, allowing a segment of text to be annotated with a code. Example (1) provides
an illustration:
(1) Barry: <req>Can you bring me a <pause> pint of lime and lemon <pause>
with some i– lots of ice?</req>
Ken: Yeah, course I can.
Barry: Cheers, thanks.
(BNC, spoken demographic)

Thanks to such “structural mark-up”, it becomes relatively easy to “pull out” exam-
ples of requests automatically, and examine their contexts via concordance lines,
and so on. It must be noted, though, that even an apparently simple annotation
scheme, such as this, encounters two knotty issues: how to segment the data into
pragmatic units, and how to consistently categorize those pragmatic units.
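
The retrieval step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular corpus tool: the function name `extract_spans` is hypothetical, the transcript is example (1), and a regular expression is used instead of an XML parser because the unclosed `<pause>` markers mean the fragment is not well-formed XML.

```python
import re

# Example (1), with the request marked by a "switch-on" and a "switch-off" tag.
# <pause> is a point marker rather than a span, so it is stripped separately.
text = (
    "Barry: <req>Can you bring me a <pause> pint of lime and lemon <pause> "
    "with some i– lots of ice?</req>\n"
    "Ken: Yeah, course I can.\n"
    "Barry: Cheers, thanks.\n"
)


def extract_spans(transcript, tag):
    """Return the text of every <tag>...</tag> span, without <pause> markers."""
    # Non-greedy match so that several spans in one transcript stay separate.
    pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)
    spans = []
    for match in pattern.finditer(transcript):
        inner = match.group(1).replace("<pause>", " ")
        spans.append(" ".join(inner.split()))  # normalise whitespace
    return spans


requests = extract_spans(text, "req")
print(requests)
```

Once the spans are retrievable in this way, they can be counted, sorted or displayed with their surrounding turns, which is the practical advantage of tags over manual highlighting.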

2.1. Segmenting language data into pragmatic units

Segmenting language into pragmatic units is a prerequisite for applying interpre-
tative tags. Scholars who annotate words or grammatical features have it relatively
easy compared with the pragmatics scholar, because those units have more defi-
nite and consistent formal correlates and thus are easier to segment. For example,
words have written orthographic correlates: “a string of uninterrupted non-punctu-
ation characters with white space or punctuation at each end” (Leech, Rayson and
Wilson 2001: 13–14). Such orthographic word entities are relatively definite and
tangible, and, moreover, a computer can find them. This is not to say that even with
a “word” things are totally straightforward, as we must decide whether pause-fill-
ers such as er and erm should be treated as “words”, and whether open compounds
or phrasal verbs should be classed as one word or more.
The problems we encounter in pragmatic segmentation will partly depend on
the nature of the pragmatic unit and our ability to identify it. Many pragmatic
annotation schemes orientate to the pragmatic unit of the speech act. Pragmalin-
guistic features seem to offer a means of identifying pragmatic phenomena such
as speech acts, and perhaps even give clues as to their boundaries. A word such
as please or a structure such as Can you [VERB X]? can be found relatively easily
with regular expressions, and then annotated with whatever pragmatic value it is
conventionally associated with (here, in both cases a request). The latter exam-
ple, Can you [VERB X]?, also seems to give some boundary clues: can occurs at
the beginning and a question mark occurs at the end. However, despite Searle’s
(1969: 30) claim that such interrogative structures count as “inference triggers” for
requests, much depends on how conventionalized an expression is for a particular
pragmatic phenomenon to be triggered. Culpeper and Gillings (2018) report that in
their BNC2014 data only 21 of their randomized sample of 100 hits for can you –
and we should bear in mind that Searle’s paradigm example of an indirect request
is “can you pass the salt?” – could be clearly construed as requests, as opposed
to literal questions about somebody’s ability to do something. In other words, a
computer operating with the form “can you” alone is unlikely to reliably identify
requests. This is not to decry the status of interrogative, and other pragmalinguistic
forms, as inference triggers; a human may well comprehend a can you expression
as a request inference trigger in relevant contexts (e. g. somebody known to like an
excess of salt in their food is seated out of reach of the salt).
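
To make the limits of form-based retrieval concrete, the following sketch searches for the can you ...? pattern with a regular expression. The utterances are invented for illustration and the pattern is only one possible formulation; the point is that the search returns candidates for human inspection, not requests.

```python
import re

# Hypothetical utterances; only the surface form "can you ...?" is searched for.
utterances = [
    "Can you pass the salt?",
    "Can you swim?",
    "I don't know if you can hear me.",
    "can you bring me a pint of lime and lemon?",
]

# "can you" as whole words, followed by anything up to the next question mark.
CAN_YOU = re.compile(r"\bcan you\b[^?]*\?", re.IGNORECASE)


def request_candidates(lines):
    """Return (index, matched text) pairs for lines containing 'can you ...?'."""
    hits = []
    for i, line in enumerate(lines):
        match = CAN_YOU.search(line)
        if match:
            hits.append((i, match.group(0)))
    return hits


candidates = request_candidates(utterances)
for index, hit in candidates:
    # Each hit still needs a human judgement: request or literal question?
    print(index, hit)
```

The search cannot separate "Can you swim?" from "Can you pass the salt?"; that distinction depends on context, which is why a manual pass over the hits remains necessary.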
A further problem is that speech acts are neither straightforwardly identified
nor delimited by pragmalinguistic expressions. In fact, they are often not even
wholly confined to the utterance in which the pragmalinguistic expression resides.
Consider example (1) and the utterances that follow it. They are connected to
Barry’s opening requestive utterance containing the conventional can you struc-
ture. Ken signals compliance with the request, and then Barry offers a follow-up
polite acknowledgement. This kind of triple conversational move is part of what
makes requests requests. An even clearer general example is the pre-request. Barry
might have initially said, “Are you going to the bar?” Such pre-requests are closely
associated with head requests: so much so that the addressee knows, very often,
what the full request is before it is performed in full. An annotation scheme, if it
is to do complete justice to requests, would need to connect all these parts up. In
particular, a pre-request would need to be connected to the head request. This is
not impossible to do, and indeed we will review one annotation scheme that does
this at the beginning of section 4.1, though it is clearly beyond current automated
annotation possibilities.
Thus far, we have assumed that one chooses a pragmatic unit and then proceeds
to segment the data accordingly. A more inductive approach would be to let the
nature of the pragmatic unit be determined by the nature of the data. One could
do this in an informal way. Dialogic data is clearly comprised of conversational
“turns”, an obvious unit to use for segmentation. However, a group of corpus lin-
guists have developed an inductive, data-driven approach for identifying discourse
segments within academic discourse (see, for example, Biber et al. 2004; Csomay
2005). Essentially, it works by comparing the first 50 words of text with the next
50, and then calculating a similarity value. The process is then repeated (i. e. the
following 50 words are compared with the next 50), as needed, after which simi-
larity scores can be plotted. Where there are troughs in the plot, that is, points of
less similarity between two 50-word segments, they become possible candidates
for marking discourse segment boundaries.
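
The windowing procedure just described can be sketched as follows. Two things here are our assumptions rather than details taken from Biber et al. (2004) or Csomay (2005): the similarity value is computed as cosine similarity over word frequencies, and the windows are non-overlapping blocks. The function names are hypothetical, and the toy data uses a window of 4 words instead of 50 so that the result is easy to check.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-frequency Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def window_similarities(words, size=50):
    """Compare each complete block of `size` words with the block after it."""
    starts = range(0, len(words) - size + 1, size)
    blocks = [Counter(words[i:i + size]) for i in starts]
    return [cosine(blocks[i], blocks[i + 1]) for i in range(len(blocks) - 1)]


def troughs(scores):
    """Indices of local minima, i.e. candidate boundaries between blocks."""
    return [
        i for i in range(1, len(scores) - 1)
        if scores[i] < scores[i - 1] and scores[i] < scores[i + 1]
    ]


# Toy text: eight words on one topic followed by eight on another.
words = ["cat", "dog"] * 4 + ["sun", "moon"] * 4
scores = window_similarities(words, size=4)
boundaries = troughs(scores)
print(scores, boundaries)
```

In the toy data the only trough falls between the second and third blocks, where the vocabulary changes completely; on real text the troughs are shallower and would be treated as candidates to inspect, not as confirmed boundaries.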

2.2. Implementing the annotation scheme

In 2008, we pointed out that pragmatic interpretations, leading to the imple-
mentation of a functional tag, such as a speech act, require a complex synthesis/
understanding of contextual information that is currently beyond the means of a
computer. This broadly remains the case. However, there are ways in which even
manual annotation can be assisted by a computer, some of which are mentioned in
upcoming sections.
The labour-intensive nature of manual annotation can also be circumscribed to
a degree by only annotating those feature(s) that are relevant to our research goals.
For instance, if one were only interested in the clauses that function as questions in
a discourse, it would be inane to annotate all of the clauses; the questions are likely
to make up a smaller proportion of the total, and we would be tagging material we
had no intention of using. It is an important issue for time and efficiency – and
often money – that we know what it is we seek when we approach a corpus, and
that only relevant work is undertaken.
Some key issues for the implementation of pragmatic annotation schemes are
as follows:
Ambiguity and indeterminacy. Pragmatic phenomena cannot be reduced to binary
choices. Ambiguity and indeterminacy are not “noise” or “errors” in pragmatics, but
often strategic choices. Ambiguity and indeterminacy need to be factored into any prag-
matic annotation scheme (see the brief discussion of Stiles 1992, for an example of how
this might be done).
Delicacy. Archer and Culpeper (2003: 52) point out the Catch-22 here: the more del-
icate a categorisation scheme the more accurate the description, yet the more delicate
the scheme, the less likely there will be enough evidence to apply a particular category,
and, consequently, the less likely there will be enough evidence to find statistically
meaningful results for a particular category.
Implementation evidence. There are various types of evidence that can justify the
implementation of a category, including language (e. g. vocatives), secondary sources
(e. g. sociological accounts) and inferences (e. g. networks of interaction) (see Archer
and Culpeper 2003: 53, for more detail on these sources of evidence). Often the appli-
cation of a pragmatic category is done on the basis of multiple sources of evidence, both
formal and interpretative.

Consistency is a particularly crucial feature of implementation. Annotators – and
given the size of corpora there is usually more than one – must apply the pragmatic
categories of the annotation scheme in a consistent fashion, both over time
(i. e. not shift practices after having coded a few items) and relative to each other.
One way of safeguarding against drift over time is to re-annotate the first part of
the data after having passed through all the data. One way of safeguarding against
disagreement between annotators is to check inter-rater reliability (e. g. compare
the coding of two or more annotators) (see Hallgren 2012, for an overview).
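
As an illustration of such a check, the sketch below computes Cohen's kappa, one common chance-corrected agreement measure for two annotators. The choice of kappa is our assumption, since the text does not name a specific statistic, and the two label sequences are invented.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical category throughout
    return (observed - expected) / (1 - expected)


# Invented codings of eight utterances as request ("req") or not ("other").
annotator_1 = ["req", "req", "other", "other", "req", "other", "other", "other"]
annotator_2 = ["req", "other", "other", "other", "req", "other", "req", "other"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 2))  # 0.47
```

Here the annotators agree on six of eight items (0.75), but because agreement of about 0.53 would be expected by chance, kappa is only about 0.47. Raw percentage agreement would therefore overstate the consistency of the coding.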

3. Pragmatic annotation schemes

In this section, we review extant pragmatic annotation schemes. The following
sub-section briefly overviews the pragmatic background, mostly revolving around
illocutionary force, to the “dialogue act” schemes, and then considers the schemes
themselves. The section concerning contextual meanings proceeds in a simi-
lar fashion – with, first, an overview of the pragmatic background, and then the
schemes themselves. Finally, we give an example of a mixed scheme.

3.1. Dialogue act schemes

3.1.1. The pragmatic background

The key pragmatic theory underpinning dialogue act schemes is speech act theory
(Austin 1962; Searle 1969, 1975). That this theory has become important in cor-
pus pragmatics is not entirely surprising, as for decades studies have attempted to
quantify speech acts and their realisations. For example, the Cross-Cultural Speech
Act Realization Project (CCSARP) (see Blum-Kulka, House and Kasper 1989)
is a study of data elicited by written discourse completion tasks, involving seven
different languages or language varieties and 1,088 informants, and that data is
almost invariably analyzed quantitatively. However, studies deploying question-
naires elicit short data samples, not long stretches of discourse. The point made
by McEnery and Wilson (1996: 99), though writing more than 20 years ago, thus
still holds: “quantitative accounts […] would be an important contribution to our
understanding of pragmatics”. Dialogue act schemes are a means of pursuing quan-
titative pragmatic analysis across long stretches of discourse.
Extant dialogue act studies typically involve the manual or semi-automated
tagging of speech act types, so that they can be placed in more generic groups, and
thereby reveal patterns in the discourse. Once a particular form has been assigned
to a speech act category, it is possible to investigate, for example, the formal char-
acteristics of that category – and also compare them with the formal characteristics
of the other categories. Although Austin’s (1962) classification of speech acts was
probably the first, most researchers draw on Searle (1976). His taxonomy consists
of five categories (1976: 10–15):
Representatives  committing the speaker to the truth of the expressed proposition, e. g. asserting, concluding [he later renamed this category Assertives]
Directives       attempts by the speaker to get the addressee to do something, e. g. advising, requesting
Commissives      committing the speaker to a future course of action, e. g. promising, threatening, offering
Expressives      expressing a psychological state, e. g. thanking, apologising, welcoming
Declaratives     effecting immediate changes in an institutional state of affairs, with extra-linguistic qualities, e. g. declaring war, christening

Alternative classifications, which have been proposed, include Bach and Harnish
(1979). One particular criticism made of these classifications is that, practically
speaking, they are classifications of the semantics of speech act “verbs”, which
cannot be assumed to map straightforwardly onto classifications of illocutionary
“acts” (Searle 1976: 8; Leech 1983: 177, 198). Broadly speaking, the earlier the
classification – Austin, Searle or Bach and Harnish – the more obviously this is the
case. Some researchers have therefore rejected attempts to devise a classification
of illocutionary acts in favour of a classification of speech act verbs drawn from
dictionaries: for example, Ballmer and Brennenstuhl (1981) identified 4,800 Eng-
lish verbs using dictionaries and classified them. Studies, whether corpus-based
or computational, have typically adapted such speech act classifications for their
particular datasets and annotated acts accordingly. The grand plan of devising a
classification that accommodates all kinds of speech act found in all kinds of dis-
course and at the right level of delicacy therefore seems impossible. The global
classifications that exist are best seen, then, as providing a useful starting point for
would-be annotators.
Readers will have noticed that we are using the label “dialogue act”. This brings
us into line with how pragmatic corpus annotators refer to their own schemes.2

2
Usage of the label is not necessarily consistent. For example, Bunt (1994) suggests that
a dialogue act is a speech act in the context of a dialogue, whilst Core and Allen (1997)
suggest that it is an act whose internal structure relates specifically to its dialogue func-
tion.

Moreover, many of the schemes do not strictly confine themselves to speech acts,
but encompass other interactional phenomena, especially aspects that have been a
focus of study for Conversation Analysis (CA) (e. g. Sacks, Schegloff, and Jeffer-
son 1974) and Discourse Analysis (DA) (e. g. Sinclair and Coulthard 1975). Many
aspects of CA would be amenable to, and, we would argue, would benefit from,
annotation, by which one could study conversational patterns over large datasets.
One attempt to combine a CA-based approach with annotation is the Linguistic
Interaction Database Exchange System (LIDES), which enables switches in code
from one linguistic variety to another to be tagged (see the LIDES Coding Manual
2000 for further details). Such work is extremely rare,
however. One reason for this relates to the philosophy behind CA. Engaging in CA
is an inductive matter – a matter of revealing the categories that are used. Annota-
tion generally involves the opposite: the imposition of preformed categories. This
being so, annotation can perhaps be more naturally equated with the DA model of
the Sinclair and Coulthard (1975) type – a speech-act based model, which focuses
on structural relationships between utterances, using terms such as “exchange” or
“move”. A Sinclair-Coulthard inspired approach has been used by both computa-
tional and corpus-based analysts (see, for example, Carletta et al. 1997a, 1997b;
Archer 2005; and sections 3.1.2, 3.1.3 and 3.3). Additional computational work
with respect to discourse structure that is not necessarily inspired by the Birming-
ham School (i. e. DA in the spirit of Sinclair and Coulthard 1975) includes Carl-
son, Marcu and Okurowski (2003), Stede (2004) and Baldridge and Lascarides
(2005).

3.1.2. Dialogue act schemes: Hand-coded schemes

Here, we focus on Stiles (1992) and SPICE-Ireland. Both are notable, but for dif-
ferent reasons.
Stiles (1992) must count as one of the very earliest pragmatic annotation
studies. He raises many of the key issues. His taxonomy, which he calls Verbal
Response Mode or VRM, was designed as a means to an end: he wanted to improve
psychologists’ interactions with their patients, and needed some means of prag-
matically analysing their interaction. The eight categories of VRM (disclosure,
edification, advisement, confirmation, question, acknowledgement, interpreta-
tion and reflection) are essentially groups of particular kinds of speech act. The
“form”/“function” distinction is central to speech act theory, in that words and
their force operate by complex means, and the relationship between the two is
often tangential or indirect. Stiles’s scheme takes this fully into consideration. Each
speech act group is designated by a letter code (e. g. Q = question, A = advisement).
When coding data, the first letter codifies the literal form of the utterance, and the
second its illocutionary function. So, in example (1), the request “Can you bring
me a <pause> pint of lime and lemon <pause> with some i– lots of ice?” would
be tagged QA. In contrast, a literal question, such as “Can you undo it?” (BNC;
said in the context of a struggle to remove the top from a Tippex bottle), would be
tagged QQ. Stiles (1992: 100) also includes the category “uncodable” (U), which
he details briefly. Unfortunately, he does not include it in his main description of
the taxonomy. He also restricts it to “utterances that coders cannot understand or
hear clearly” (1992: 15). As such, the scheme does not fully accommodate inde-
terminacy in the data.
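
The two-letter logic of the VRM codes can be represented as a small data structure. This is a sketch under stated assumptions: the class `VRMTag` and its attribute names are hypothetical, and only the letter codes spelled out above (Q, A and U) are listed, so the remaining categories would need to be added from Stiles (1992).

```python
from dataclasses import dataclass

# Only the letter codes given in the text; the other VRM categories are omitted.
VRM_CODES = {"Q": "question", "A": "advisement", "U": "uncodable"}


@dataclass(frozen=True)
class VRMTag:
    form: str    # first letter: literal form of the utterance
    intent: str  # second letter: illocutionary function

    def __post_init__(self):
        for letter in (self.form, self.intent):
            if letter not in VRM_CODES:
                raise ValueError(f"unknown VRM letter: {letter!r}")

    @property
    def code(self):
        """The two-letter tag, form first and function second."""
        return self.form + self.intent

    @property
    def is_indirect(self):
        """True when literal form and function differ, as in a QA request."""
        return self.form != self.intent


request = VRMTag(form="Q", intent="A")           # "Can you bring me ...?"
literal_question = VRMTag(form="Q", intent="Q")  # "Can you undo it?"
print(request.code, request.is_indirect)
print(literal_question.code, literal_question.is_indirect)
```

Keeping form and function as separate fields makes it straightforward to retrieve, for instance, all utterances with question form regardless of function, or all mismatches between the two.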
SPICE-Ireland (Systems of Pragmatic annotation for the spoken component of
ICE-Ireland) must currently count as the largest manually annotated corpus. Its
constructors, John Kirk and Jeffrey Kallen, added pragmatic annotations to the
Irish component of the International Corpus of English (626,597 words in total).
With over 54,612 speech act annotations, the result is a remarkably rich resource.
Annotations were added for the following features:
• utterance speech-act function
• prosody (pitch movements)
• utterance tags
• discourse markers
• quotatives
The speech act taxonomy was modelled on Searle (1976). The unit to which speech
act functions were applied was usually the utterance, but they allowed for a some-
what wider scope, and included pauses. Cases that appeared ambiguous were coded
according to the “most likely interpretation within the context of the conversation
as a whole” (Kallen and Kirk 2012: 28). Their reasoning for choosing a manual
route for annotation is that: “no simple algorithm exists for determining the speech
act status of an utterance” and, thus, “annotation is made on the basis of a detailed
analysis of language in use” (Kallen and Kirk 2012: 28). The resultant corpus
allows robust and detailed answers to questions about the distribution of speech
acts in Irish English. For example, they revealed representatives and directives to
be outstandingly frequent; other speech act types have much lower frequencies.
Further, they found that speech act types vary according to text type. For example,
representatives are especially frequent in face-to-face conversation, spontaneous
commentary and telephone conversation; directives are outstandingly frequent in
demonstrations, but also frequent in business transactions, classroom discussion,
face-to-face conversation, telephone conversation and legal cross-examination.
For more detail, see Kallen and Kirk (2012).

3.1.3. Dialogue act schemes: (Semi-)automated models

According to Jurafsky (2004), there are two (semi-)automated models of speech
act interpretation: the BDI (belief, desire and intention) model and the cue-based or
probabilistic model. BDI computational models (e. g. Perrault and Allen 1980) use
“belief logics” inspired by Searle’s (1975) explanation of indirect speech acts of
the “Can you pass the salt?” variety. In simple terms, they seek to mimic a hearer’s
chain of reasoning with respect to satisfactorily met pre-conditions. By contrast,
cue-based or probabilistic models (e. g. Jurafsky and Martin, 2000) are inspired
by Power’s (1979) concept of “conversational games and moves”, and Goodwin’s
(1996) work relating to the “microgrammar”, that is, the specific lexical, colloca-
tional, and prosodic features that characterise particular conversational moves. As
most well-known studies are in the latter cue-based tradition, cue-based models
will be the focus of this section.3
Work on the automatic detection of dialogue acts is quite advanced, such that
standards for shallow discourse structure annotation now exist. This said, these
standards are not commonly agreed upon, according to Weisser (2014a: 90). Here,
then, we focus on one of the better known: the Dialogue Act Markup in Several
Layers (DAMSL) tagset, designed by the natural language processing community
as part of the Discourse Resource Initiative (Core and Allen 1997). Of particular
interest, to us, is its utilisation of concepts outside the philosophical traditions that
first defined speech acts, with the result that we see the inclusion of Schegloff’s
concept of “repair” (Schegloff, Jefferson, and Sacks 1977), and “preceding and
succeeding discourse” (Schegloff 1968, 1988). Thus, the DAMSL tagset distin-
guishes between the forward-looking function of an utterance, which differentiates
between different speech act based phenomena (cf. statement = a claim made by
the speaker; info-request = a question by the speaker; check = a question by the
speaker for confirming information), and the backward-looking function, which
identifies some sort of pragmatic relationship between utterance U and previous
utterances (cf. accept = accepting the proposal; reject = rejecting the proposal;
repeat-rephrase = demonstrated via repetition or reformulation).
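
The two-layer idea can be sketched as a record with separate forward-looking and backward-looking fields. The sketch makes several assumptions: the class, field and function names are hypothetical, only the six tags named above are included (the DAMSL tagset itself is larger), the three-line dialogue is invented, and its labels are illustrative and were not coded against the DAMSL manual.

```python
from dataclasses import dataclass
from typing import Optional

# Only the tags mentioned in the text; the full DAMSL tagset is larger.
FORWARD = {"statement", "info-request", "check"}
BACKWARD = {"accept", "reject", "repeat-rephrase"}


@dataclass
class Utterance:
    speaker: str
    text: str
    forward: Optional[str] = None      # forward-looking function
    backward: Optional[str] = None     # backward-looking function
    responds_to: Optional[int] = None  # index of the earlier utterance


def validate(utterances):
    """Return (index, message) pairs for tags that break the sketch's rules."""
    problems = []
    for i, u in enumerate(utterances):
        if u.forward is not None and u.forward not in FORWARD:
            problems.append((i, f"unknown forward tag {u.forward!r}"))
        if u.backward is not None:
            if u.backward not in BACKWARD:
                problems.append((i, f"unknown backward tag {u.backward!r}"))
            # A backward-looking tag must point at an earlier utterance.
            if u.responds_to is None or not 0 <= u.responds_to < i:
                problems.append((i, "backward tag needs an earlier utterance"))
    return problems


dialogue = [
    Utterance("A", "The ten o'clock train is direct.", forward="statement"),
    Utterance("B", "The ten o'clock?", forward="check",
              backward="repeat-rephrase", responds_to=0),
    Utterance("A", "Is there a later one?", forward="info-request"),
]
print(validate(dialogue))
```

Allowing one utterance to carry both kinds of tag, as the second line does, reflects the point that an utterance can set up what follows while also relating to what came before.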
The SWBD-DAMSL annotation model (SWBD = Switchboard domain) pro-
vides us with an example of work that has utilised – and expanded – the DAMSL
tagset (Stolcke et al. 2000). It consists of approximately 50 basic tags (including
question, statement, opinion, backchannel, appreciation) which, when combined
with diacritics indicating related information, extend to 220; the model
distinguishes 42 mutually exclusive utterance types. Here is an example of a
conversation taken from the Switchboard Corpus of spontaneous human-to-human
telephone speech:

3
The reason that cue-based models are more prevalent than BDI models may relate to the
fact that computers can search for formal correlates of speech act-types more readily
than abstract logical aspects.

(2) Speaker  Dialogue Act4  Utterance
    B        statement      but, uh, we’re to the point now where our financial income
                            is enough that we can consider putting some away –
    A        backchannel    Uh-huh/
    B        statement      – for college,/
    B        statement      so we are going to be starting a regular payroll deduction –
    A        backchannel    Um./
    B        statement      – in the fall/
    B        statement      and then the money that I will be making this summer we’ll
                            be putting away for the college fund.
    A        appreciation   Um. Sounds good.
    (Adapted from Stolcke et al. 2000: 7)

This extract shows that each utterance is assigned a unique Discourse Act label.
By “utterance”, Stolcke et al. (2000: 4) mean a sentence-level unit, which may or
may not correspond to a speaker turn. The tagset is interesting for several reasons.
First, it classifies utterances according to a combination of pragmatic, semantic
and syntactic criteria. Second, it claims not to be “task-oriented”. Indeed, Stolcke
et al. (2000: 4) argue that it is generic in nature, having been applied to a corpus
of spontaneous conversational speech – albeit telephone speech. Their claim is
important, as similar work has tended to concentrate on specific tasks, which tend
to be formulaic and may often be easier to annotate.
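
The distinction between utterances and speaker turns can be made concrete with a minimal Python sketch. It assumes a simple in-memory representation of example (2); the class name `Utterance` and the function `turns` are our own illustrative choices and are not part of SWBD-DAMSL or of any published tool.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Utterance:
    speaker: str
    act: str   # dialogue act label, as in example (2)
    text: str


# Example (2), adapted from Stolcke et al. (2000: 7).
EXAMPLE_2 = [
    Utterance("B", "statement", "but, uh, we're to the point now where our financial "
                                "income is enough that we can consider putting some away -"),
    Utterance("A", "backchannel", "Uh-huh/"),
    Utterance("B", "statement", "- for college,/"),
    Utterance("B", "statement", "so we are going to be starting a regular payroll deduction -"),
    Utterance("A", "backchannel", "Um./"),
    Utterance("B", "statement", "- in the fall/"),
    Utterance("B", "statement", "and then the money that I will be making this summer "
                                "we'll be putting away for the college fund."),
    Utterance("A", "appreciation", "Um. Sounds good."),
]


def turns(utterances):
    """Group consecutive utterances by the same speaker into turns."""
    return [(speaker, [u.act for u in group])
            for speaker, group in groupby(utterances, key=lambda u: u.speaker)]


# Frequency of each dialogue act label in the extract.
act_counts = Counter(u.act for u in EXAMPLE_2)
```

In this representation the extract contains eight labelled utterances but only six turns, because two of B's turns each consist of two utterances.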
Carletta et al.’s (1997a, 1997b) taxonomy is an example of such a task-oriented
scheme, having been applied to Map Task dialogues (see Figure 1, below).5 Unlike
Stolcke et al. (2000), their scheme is based on conversational moves (i. e. utterance
function), game structure, and higher-level transaction structure. Consequently, it
shares similarities with the structure adopted by Sinclair and Coulthard (1975),
when analysing classroom discourse (see “interactional meaning” under section 2).
The “games” level is roughly equivalent to Sinclair and Coulthard’s “exchange”
level, in that it distinguishes between initiations and responses, etc. A “game”, in
turn, is made up of conversational moves, beginning with an initiation and contin-
uing until the purpose of the “game” has been achieved. Carletta et al. (1997b: 3)
provide a diagram which summarises the procedure followed when assigning the
moves of their scheme. As Figure 1 shows, formal and interactional aspects are
once again combined, as in Stiles (1992) (see, in particular, query/reply-yn and
query/reply-w).

4
Dialogue Act (as used here) is synonymous with our use of discourse act throughout the
paper.
5
A Map Task involves participant A’s duplication of a route that is present on B’s map,
but missing from his/her own. For further details see Carletta et al. (1997b: 2).

Figure 1: Conversational Move Categories (from Carletta et al. 1997b: 3; original emphasis)

Weisser (2014a: 85) has recently undertaken a useful comparison of DAMSL
and SWBD-DAMSL with his own task-oriented system – the Dialogue Annotation
& Research Tool [DART] – as a means of assessing “their relative merits”. His
main argument appears to be that SWBD-DAMSL is an improvement on DAMSL
because of “marking the speech act”, and hence identifying “the pragmatic force
of the unit”, “as the main dimension” (Weisser 2014a: 93). However, he is critical
of SWBD-DAMSL, in turn, for continuing to “hide” potentially relevant pragmatic
information “behind dimensions that are less linguistically motivated” (2014a: 94).
In contrast, his own system is designed to annotate for speech act (sp-act), polar-
ity, topic and mode, as well as to draw on punctuation as a means of “facilitating
further processing” (Archer, Culpeper, and Davies 2008: 647).6 Categories such
as these point to the overlap between DART (Weisser 2014b) and Weisser’s work
with Leech with respect to the Speech-Act Annotation Scheme (SPAAC). SPAAC
was devised to annotate service dialogues (see Leech and Weisser 2003). As its
name suggests, the key level of annotation within SPAAC relates to the tagging
of speech acts (the term Weisser 2014a still prefers over discourse act). SPAAC
draws from a tagset of 40 items for this purpose, including accept, acknowledge,
answer, answer elaborate, appreciate, bye, complete, confirm, etc. “Correct”
speech act assignment is aided, in turn, by five further dimensions (many of which
are also tagged within DART):
(a) segmentation (e. g. into utterances, C-units7 and discourse markers)
(b) syntactic form (e. g. declarative, interrogative, imperative, fragment)
(c) topic or subject matter (e. g. address, arrival, cancel, credit card, date, depar-
ture: see train booking dialogues)
(d) mode (e. g. alternative, condition, probability, expletive)
(e) polarity (i. e. positive vs. negative).
The form tagset within SPAAC consists of <decl> (= declarative clause), <q-yn>
(= yes-no question), <q-wh> (= wh-question), <imp> (= imperative), <frag> (= fragment,
i. e. a non-clausal unit or incomplete clause lacking a subject), <dm> (= discourse
marker), <yes> (= affirmative reply) and <no> (= negative reply). Within
DART, similar phenomena are captured under a syntactic category, but with the
addition of term of address <address> and exclamation <exclam> categories. As
we might expect, there are some obvious similarities between SPAAC, DART,
and those schemes already mentioned (cf. query-yn and query-w with <q-yn> and
<q-wh>, reply-y and reply-n with <yes> and <no>).

6
According to Weisser (2014a: 89), the use of punctuation in this way is still uncommon,
in spite of the usefulness of punctuation when marking up “prosodic information or
completeness status”.
7
C-units are Communication Units. They are usually defined as an independent clause,
along with any subordinate clauses attached to it.

Leech and Weisser (2003) go on to suggest that form labels other than ques-
tioning/answering share strong associations with particular speech act labels. For
example, an “inform” is said to be “normally conveyed by a <decl> or (less fre-
quently) a <frag>” (Leech and Weisser 2003: 22). Leech and Weisser (2003: 22)
define an inform as follows:
Typically speaker x has the goal of informing speaker y about something speaker x
believes that speaker y did not know or was not aware of before, generally without this
having been elicited, e. g. The last train leaves at 1650.

Informs can be difficult to distinguish in practice, because of their potential
overlap with other speech acts, including confirm, expressregret, expresswish,
expresspossibility, expressopinion, etc. In consequence, Leech and Weisser’s
(2003: 22) inform category is “flexible”. That is:
[Inform is] used where some element of conveying information or making the ad-
dressee aware is present. For example, after a longish period in which the telephone
is ringing, the operator may say to the caller: I’m sorry, there’s no reply (emphasis
original).

Here, their argument is that, although the information content of the utterance in
context is low, the operator is nevertheless making the caller “explicitly aware of
something”. Leech and Weisser’s need to justify the definition and use of their
“inform” category illustrates how even apparently straightforward speech act cat-
egories like “inform” are problematic to both define and apply to real data.
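
The association between form labels and speech act labels lends itself to a simple consistency check. The following Python sketch is illustrative only: the class `Unit`, its field names and the function `needs_review` are hypothetical and do not reproduce the SPAAC or DART file formats. It records one annotated unit along the dimensions listed above and flags any inform whose form is neither of the two forms that Leech and Weisser (2003: 22) describe as normal for that speech act.

```python
from dataclasses import dataclass
from typing import Optional

# Form labels of the SPAAC form tagset, written without angle brackets.
FORMS = {"decl", "q-yn", "q-wh", "imp", "frag", "dm", "yes", "no"}

# Forms said to "normally" convey an inform (Leech and Weisser 2003: 22).
INFORM_FORMS = {"decl", "frag"}


@dataclass
class Unit:
    text: str
    sp_act: str                      # speech act label, e.g. "inform"
    form: str                        # one of FORMS
    polarity: Optional[str] = None   # left unset unless the coder supplies it
    topic: Optional[str] = None
    mode: Optional[str] = None

    def __post_init__(self):
        # Reject labels that are not in the form tagset.
        if self.form not in FORMS:
            raise ValueError(f"unknown form label: {self.form}")


def needs_review(unit):
    """Flag informs realised by a form other than <decl> or <frag>."""
    return unit.sp_act == "inform" and unit.form not in INFORM_FORMS


example = Unit("The last train leaves at 1650.", sp_act="inform", form="decl")
```

A flagged unit is not necessarily wrong, given the flexibility of the inform category; the check merely directs the coder to cases that depart from the typical pattern.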

3.2. Contextual schemes

3.2.1. The pragmatic background

A crucial feature of pragmatics is that it accounts for meanings which are context
sensitive. Areas of context requiring some consideration include the co-text, phys-
ical context, personal/social context, cognitive context, cultural context and the
context in the situational model (i. e. the context that is projected by the language
itself, e. g. a fictional world projected by the words of a novel). Approaches to
context within pragmatics often combine several of the above areas, as in Dell
Hymes’ (e. g. 1972) “speech events” or Levinson’s ([1979]1992) “activity types”.
There is also an increasingly strong recognition that there are multiple contexts in
communication, as constructed by different participants, and that these are always
in a state of flux (e. g. somebody might speak to another in their capacity as “tutor”
and then speak to someone else in their capacity as “friend”). Unfortunately, many
researchers working outside pragmatics regularly employ an impoverished notion
of context. Often the co-text is taken to be the sum total of all there is to be said
about the context. This criticism also applies to corpus-based studies, which, if
they consider context at all, confine themselves to the inclusion of a few static
values (e. g. the sex of the participants) in the header of files (see section 3.2.2).
The major challenge for pragmatic annotation must be to take “adequate” account
of the context. By this we mean, identifying – so that coders might annotate –
what are perceived to be the most germane contextual variables, thereby provid-
ing an optimally relevant interpretation of an utterance (see, e. g. Clark, 1992:
105–6). This often means that coders have to annotate their datasets manually
(but see 4.2).

3.2.2. Context as a static construct: The example of the BNC headers

Most language data are analysed away from their original source, and linguists
of all types must re-create a sense of context so that valid interpretations of data
are possible. Of course, the issue for corpus annotation is what aspects of the
context must be selected for annotation, and the form that that annotation should
take. There is no universally-adopted scheme for contextual annotation, as yet
(or, indeed, for most types of annotation, but see the Text Encoding Initiative,
TEI). However, the Expert Advisory Group for Language Engineering Systems
(EAGLES) has surveyed dialogue annotation practices, with the aim of produc-
ing a set of guidelines (see http://www.ling.lancs.ac.uk/eagles/ and also Gibbon,
Moore and Winski 1998). One of the things highlighted by EAGLES is the issue
of where contextual information (e. g. speaker characteristics, channel character-
istics, activity types) should be placed. One option is to place such information in
the headers of individual files, as in the British National Corpus (or BNC), which
stores contextual information about the text in a header at the start of each file.
These headers, in turn, are structured according to the guidelines produced by the
TEI. According to EAGLES, researchers interested in spoken data can also provide
additional contextual information in external files, linked to the original dataset via
pointers (cf. Gibbon, Moore, and Winski 1998: 733–747).
Compared with the rich diversity of potential contextual inputs, as described
in the field of pragmatics, corpora like the BNC have taken a fairly minimalist
approach. As we have already argued, applying an empirical methodology whereby
one can quantify context in some way is a major challenge for a corpus-based
approach. This is especially so when one’s concern is not with the relatively static
characteristics of speakers, but with “face-to-face” interactions between speakers
and hearers. Work in the EAGLES tradition focuses almost exclusively on dyadic
interactions (the addressee is normally the other speaker), rather than multi-party
talk. One of the main interests of EAGLES is the automated analysis of dialogue.
Automatically identifying the addressee in multi-party talk is well beyond the
capabilities of current tagging programmes. Even less work has addressed the
relevant contextual properties of spoken interaction on a turn-by-turn basis. Putting
contextual information in file headers may be reasonable for the general research
purposes of corpora such as the BNC, but this practice is inadequate for full prag-
matics research. In the following section, we demonstrate how a more complex
scheme can go some way towards pursuing a full pragmatics agenda.

3.2.3. Context as a dynamic construct: An example of a sociopragmatic annotation scheme

Archer and Culpeper’s (2003) sociopragmatic annotation scheme, implemented in
the Sociopragmatic Corpus, seeks to identify important contextual factors relating
to both speaker and addressee at the level of the utterance (as opposed to the text),
and is designed to interface with three fields – pragmatics, corpus linguistics and
(historical) sociolinguistics. These disciplines have their own research goals and
methodological preferences. Consequently, when they are combined, they present
us with a particular set of difficulties (see Archer and Culpeper 2003: 38–42), at the
heart of which lies the issue of context. This scheme demonstrates that it is possi-
ble, nonetheless, to bridge the gap between text and contexts, such that we can (i)
accommodate the investigation of language set in various context(s), for example,
speaker/hearer relationships, social roles, and sociological characteristics such as
gender and (ii) treat context(s) as dynamic.
Following TEI guidelines, Archer and Culpeper (2003) transcribe individual
utterances using the <u> element, where <u> signals the beginning of the segment
to which the annotation pertains and </u> the end. In the BNC and similar corpora,
this <u> element tends to contain a limited amount of information, such as the
person id (see the TEI Guidelines, http://www.uic.edu/orgs/tei/p3/doc/p3.html), as
in example (3):
(3) <u id=1 who=POO1>How are you?</u>

The approach taken by Archer and Culpeper (2003) is slightly different, since
they opt for text-internal coding at the utterance level. This means that participant
information is given in the <u> element rather than in the header. Example (4) is
from the Trial of Charles I (1649):

(4) Lord President <u speaker=“s” spid=“s3tcharl001” spsex=“m”
sprole1=“j” spstatus=“1” spage=“9” addressee=“s”
adid=“s3tcharl002” adrole1=“d” adstatus=“0” adage=“9”>If
this be all you will say,</u>
<u speaker=“s” spid=“s3tcharl001” spsex=“m” sprole1=“j”
spstatus=“1” spage=“9” addressee=“m” adid=“x” adrole1=“n”
adstatus=“x” adage=“x”>then, Gentlemen, you that brought
the Prisoner hither, take charge of him back again.</u>

The annotation scheme is designed to identify the specific combination of
sociopragmatic variables affecting each segment. In particular, this means describing
who is talking to whom at a given point in time, and in what capacity (cf. the
annotation scheme in the BNC, which only describes the static properties of
speakers across the whole interaction). The first six tags identify the speaker, and
these are followed by corresponding tags (those beginning with ad) to identify the
addressee. The first tag (speaker=“s”) tells us that the speaker is a single individual
(rather than being representative of multiple speakers, which would be tagged as
speaker=“m”). The spid=“s3tcharl001” tag indicates that the speaker is the Lord
President of the courtroom. The spsex=“m” tag identifies this speaker as male, and
sprole1=“j” tells us that he is acting in the capacity of a judge. He is also of high
status (spstatus=“1”) and over the age of 45 (spage=“9”). The addressee tags tell
us that the judge is addressing an individual (addressee=“s”), who is identified as
Charles I (adid=“s3tcharl002”). The adrole1=“d” tag tells us that the King is acting
as defendant; adstatus=“0” that he is royal; and adage=“9” that he is over the age
of 45 (like the Lord President).8 Note that if the same speaker addresses a different
person, the values for the addressee then change. This can sometimes occur within
the same speaker turn, as shown above, where the second <u> tag marks the point
at which the judge addresses other hearers and the utterance continues. The ability
of Archer and Culpeper’s (2003) scheme to identify more than one role field in
any given interaction is especially important when it comes to multi-party inter-
action. Clearly, this kind of information cannot be encoded usefully in a header
file.
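
To illustrate how utterance-level coding of this kind can be exploited, the following Python sketch reads the two <u> segments of example (4) and locates the point at which the same speaker turns to a different addressee. It is a minimal illustration under stated assumptions: the attribute values are written with straight quotation marks, the function names are our own, and a regular expression stands in for a proper parser.

```python
import re

# Example (4), retyped with straight quotation marks (an assumption made for parsing).
SAMPLE = (
    '<u speaker="s" spid="s3tcharl001" spsex="m" sprole1="j" spstatus="1" '
    'spage="9" addressee="s" adid="s3tcharl002" adrole1="d" adstatus="0" '
    'adage="9">If this be all you will say,</u>'
    '<u speaker="s" spid="s3tcharl001" spsex="m" sprole1="j" spstatus="1" '
    'spage="9" addressee="m" adid="x" adrole1="n" adstatus="x" '
    'adage="x">then, Gentlemen, you that brought the Prisoner hither, '
    'take charge of him back again.</u>'
)

U_RE = re.compile(r'<u\s+([^>]*)>(.*?)</u>', re.DOTALL)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_utterances(markup):
    """Return a list of (attributes, text) pairs, one per <u> segment."""
    return [(dict(ATTR_RE.findall(attrs)), text)
            for attrs, text in U_RE.findall(markup)]


def addressee_shifts(utterances):
    """Indices of segments where the speaker is unchanged but the addressee differs."""
    shifts = []
    for i in range(1, len(utterances)):
        previous, current = utterances[i - 1][0], utterances[i][0]
        if previous["spid"] == current["spid"] and previous["adid"] != current["adid"]:
            shifts.append(i)
    return shifts


parsed = parse_utterances(SAMPLE)
```

Applied to example (4), `addressee_shifts(parsed)` picks out the second segment, where the Lord President stops addressing the King and addresses multiple hearers instead.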
Sceptics may point out that such a scheme is both time-consuming to apply,
and, due to its complexity, open to error. Our experience of applying it to a corpus
of 250,000 words suggests that it is time-consuming but nonetheless viable for
implementation in smaller, more focussed corpora. This is further confirmed by
the work of researchers such as Ursula Lutzky, who have successfully extended
the size of the drama component of the Sociopragmatic Corpus in order to study
discourse markers such as marry. In Lutzky’s (2008) case, she extended the corpus
by drawing upon more texts from the Corpus of English Dialogues and adding
texts from the Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME).
We would argue, in addition, that software tools can be developed to speed up the
implementation process of a non-automatic annotation scheme quite significantly.
For example, once a participant’s identity is clarified, the computer can enter static
values automatically, leaving the analyst to focus on the beginnings and endings of
utterances, as well as dynamic values.
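
A support tool of this kind might be sketched as follows. The sketch assumes, purely for illustration, that sex and age are treated as static values while roles are supplied by the analyst for each segment; the registry `STATIC` and the function `open_tag` are hypothetical and do not describe existing software.

```python
# Static values recorded once per participant id (illustrative assumption:
# sex and age are static; role values are entered by the analyst per segment).
STATIC = {
    "s3tcharl001": {"sex": "m", "age": "9"},
    "s3tcharl002": {"age": "9"},
}


def open_tag(spid, adid, dynamic):
    """Build an opening <u> tag, filling in static values from the registry."""
    attrs = {"spid": spid, "adid": adid}
    for key, value in STATIC.get(spid, {}).items():
        attrs["sp" + key] = value          # e.g. spsex, spage
    for key, value in STATIC.get(adid, {}).items():
        attrs["ad" + key] = value          # e.g. adage
    attrs.update(dynamic)                  # analyst-supplied values, e.g. roles
    return "<u " + " ".join(f'{k}="{v}"' for k, v in attrs.items()) + ">"


tag = open_tag("s3tcharl001", "s3tcharl002", {"sprole1": "j", "adrole1": "d"})
```

Here the analyst enters only the two participant ids and the role values; the remaining attributes are copied from the registry.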
Archer and Culpeper (2009: 288) have since exploited their annotation scheme.
They aimed to identify:

8
See Archer and Culpeper (2003), for a complete breakdown of the categories.

[…] linguistic material in the Sociopragmatic Corpus (containing play-texts and trial
proceedings from the period 1640–1760) that is statistically characteristic of particular
constellations of social categories (relative to other constellations of categories).

They used the pragmatic annotation in their corpus to extract the talk exchanges
of particular dyads (i. e. master/mistress-servant in plays, examiner-examinee in
trials). They then used Wmatrix3 to identify key features in this talk by comparing
word frequency lists of the various dyads with a reference corpus (in this case,
the Corpus of English Dialogues). Their aim, in so doing, was to introduce a new
approach – that of sociophilology. Sociophilology shares with sociopragmatics a
fundamental interest in “the ‘local’ conditions of language use” (Leech 1983: 10),
and thus has context as its starting point. It then draws on corpus-linguistic meth-
ods in order to determine how various contextual labels are shaping the language
used (and vice versa). In section 4.2, we introduce another means of using
Wmatrix3 to explore pragmatic phenomena.
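
The logic of a keyness comparison between word frequency lists can be sketched in a few lines of Python. The sketch assumes the log-likelihood statistic, which is widely used for keyness in corpus linguistics; the text above does not specify which statistic was applied, and the function names and the toy data in the usage note are our own inventions.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood for a word occurring a times in a study corpus of c tokens
    and b times in a reference corpus of d tokens."""
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / expected_study)
    if b > 0:
        ll += b * math.log(b / expected_ref)
    return 2 * ll


def keywords(study_tokens, reference_tokens, top=5):
    """Words relatively more frequent in the study corpus, ranked by log-likelihood."""
    study, reference = Counter(study_tokens), Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    scored = []
    for word, a in study.items():
        b = reference.get(word, 0)
        if a / c > b / d:                  # keep overused items only
            scored.append((log_likelihood(a, b, c, d), word))
    return [word for _, word in sorted(scored, reverse=True)[:top]]
```

With invented toy data, `keywords(["sir"] * 8 + ["the"] * 12, ["sir"] + ["the"] * 19)` returns `["sir"]`, since only "sir" is relatively more frequent in the study list than in the reference list.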

3.3. Mixed schemes: An example

Archer (2005) is an example of a scheme that combines various strands of prag-
matic meaning, including the formal, illocutionary, interactional and contextual.
Based on an analysis of courtroom interaction from the later Early Modern Eng-
lish period (1640–1760), Archer’s work extends the Sociopragmatic Annotation
Scheme to include an “interactional intent” field, a “force” field, and, where appli-
cable, a “(grammatical) form” field. As the “form” field is relatively self-explan-
atory, in that it consists of the range of question-types (e. g. yes-no, disjunctive,
wh-, etc.) used in the historical courtroom, we will comment, instead, on the more
pragmatically-oriented phenomena, the interactional intent and force fields. The
interactional intent field “stfunc” relates to the position an utterance occupies in
the discourse. In other words, it assesses the interactional/structural purpose of an
utterance, so that we have a better understanding of the ways in which (trial) talk
is organised. Possible values include:
“initiation”: initiating a new exchange by means of an eliciting device.
Prototypical examples include question, request, requirement.
“response”: providing information that has been directly elicited by another
participant, usually by responding verbally. Prototypical examples include
answer, acceptance, refusal, denial.
“response-initiation”: responding to a direct elicitation of another participant by
using and/or following it with an eliciting device. Prototypical examples include
an answer immediately followed by a request.
“report”: stating information which has not been directly elicited by another
participant. Prototypical examples include statement, explanation.
“follow up”: providing follow-up/feedback to a preceding utterance in some
way. Prototypical examples include comment, evaluation.
“follow up-initiation”: providing follow-up/feedback to a preceding utterance by
using and/or following it with an eliciting device. Prototypical examples include
a comment immediately followed by a question.

The extract below, taken from the Trial of Giles (1680), provides an illustration of
the respective values. The recorder (labelled Record. and Recorder, in the transcript)
was questioning a witness, Elizabeth Crook. When she contradicted evidence given
by an earlier witness, William Richmond, the recorder intervened. Shortly after,
other participants also became involved. They included the King’s Counsel:
(5)
Record. You made the Bed, did not you? [initiation]
Crook, I did. [response]
Recorder, Upon your Oath, what time of Night was it? [follow up-initiation]
Crook I think it was nearer Eleven than Ten. [response]
[text omitted]
Kings Coun. What time of Night was it that he was making love to you? [initiation]
Crook, I think about Ten a Clock. [response]
Kings Coun. Time passed merrily away with you then. [follow up]
Rich. It was Twelve a Clock. [report]

Archer’s (2005) interactive/structural elements clearly show some resemblance to
Sinclair and Coulthard (1975). Carletta et al. (1997b) also show an IRF influence,
though their approach is more computational (see section 3.1.2).
Stenström (1984) and Carletta et al. (1997b) account for many more values
at their “move” level than Archer (2005) does at her “interactional intent” level
(cf. Stenström’s (1984: 83–6) “framing”, “focusing”, “checking” and “supporting”
moves and Carletta et al.’s (1997b) “instruct”, “explain”, “check”, “align”, “query”
and “acknowledge” moves, etc.). There is a necessary balance in any categorisa-
tion between usefulness and ease or consistency of coding. The primary purpose
of the “interactional intent” field is to distinguish between utterances that elicit,
respond to, comment on, and terminate an exchange. Archer (2005) argues that
further classifications would make the field cumbersome and, thus, potentially
more problematic to implement, and that the kind of distinctions that Carletta et
al. (1997b) and Stenström (1984) make at this level can be adequately accounted
for at a different level (i. e. via the force field).
Archer’s (2005) force field [force=“”] is inspired by Searle (1969, 1975) and
Wierzbicka (1987), in that it assigns utterances to one (or more) of seven macro
categories: “counsel” (= “w”), “question” (= “q”), “request” (= “r”), “require”
(= “c”), “sentence” (= “v”), “express” (= “e”) and “inform” (= “h”). These are
viewed as macro categories, and the values they subsume, as “reasonably accurate
approximations of the prototypical instances of verbal behaviour describable by
means of the English verbs used as labels” (Verschueren 1999: 131–2). The defi-
nitions of four macro-categories are as follows:
Counsel S wants to convey something to A which will help prevent/result in Y [= an
event not in A’s best interest], e. g. “My advise to you is, that you would put
your self upon your Tryal [sic] … [text omitted] … If you will deal ingenu-
ously with the Court, I think that is best”.
Question S wants A to supply a missing variable by saying/confirming/clarifying some-
thing about X [= an action/event/behaviour/person], e. g. “Shall I withdraw?”
Request S wants Z [= an action/event] to happen and hopes to do it/get A to do (or get
others to do) it, e. g. “I do humbly move, that I may have time allowed me by
this court to send for my Witnesses”.
Require S wants (and expects) A to do something, even though A may be reluctant, or
to do something him/herself, in spite of A’s (probable) reluctance, e. g. “My
Lord, I demand this, to hear the Commission read”.

As the force of some utterances may remain indeterminate (due to contextual
factors such as status, power and discourse sequencing, for example), Archer’s
design also allows for the inclusion of multiple and, indeed, indeterminate forces by
using the “m” and “p” values respectively (cf. Stiles 1992; see also section 3.1.1).
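
The value codes of the force field can be summarised as a lookup table. The Python sketch below simply restates the codes given above; the function name `describe_force` is our own, and the sketch makes no claim about how multiple forces are spelled out in the corpus files themselves.

```python
# Macro categories of the force field and their value codes.
FORCE = {
    "w": "counsel",
    "q": "question",
    "r": "request",
    "c": "require",
    "v": "sentence",
    "e": "express",
    "h": "inform",
}

# Additional values for utterances with multiple or indeterminate forces.
SPECIAL = {"m": "multiple", "p": "indeterminate"}


def describe_force(code):
    """Return the label for a force value code, or raise for an unknown code."""
    if code in FORCE:
        return FORCE[code]
    if code in SPECIAL:
        return SPECIAL[code]
    raise ValueError(f"unknown force value: {code}")
```

For instance, `describe_force("q")` yields "question", the category illustrated by “Shall I withdraw?”, while `describe_force("p")` yields "indeterminate".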

4. Furthering the pragmatics research agenda with corpus annotation

4.1. Areas for exploitation within pragmatics

Many areas within pragmatics could benefit from pragmatic
annotation. We will confine ourselves to some examples relating to speech
acts and Gricean implicature.
The manual segmentation of speech act phenomena is well-established, notably
through the Cross-Cultural Speech Act Realization Project, CCSARP (see Blum-
Kulka, House, and Kasper 1989). This project devised a way of coding speech
acts in a large body of elicited data. The data was elicited by discourse completion
tasks, a type of questionnaire that requires the informant to produce a speech act
appropriate to a particular context. The vast majority of studies applying this meth-
odology have focussed on requests or apologies. The key issue relating to segmen-
tation using the CCSARP framework is making the distinction between head act
(HA) and alerter (AL) or supportive move (SM). The head act is regarded as the
core of the speech act, and usually contains a verb and its grammatically related
elements, for example, “Pass me the salt”, “I’d like the salt” or “Can you pass
me the salt?”. Note that the head act can vary, particularly in terms of directness.
The three salt examples just mentioned would be categorised as “mood derivable”,
“want statement” and “preparatory (ability)”, respectively. Alerters (e. g. “excuse
me”) alert the addressee to the up-coming speech act.9 Supportive moves are independent
elements which pragmatically support the head act in some way (e. g. mitigate
the face threat of the act). They include grounders (e. g. “I really need it”) and
disarmers (e. g. “I know you are really busy but …”). The CCSARP framework
also encompasses the internal modification (IM) of head acts through downgraders
(e. g. politeness markers and minimisers) and upgraders (e. g. intensifiers). The
CCSARP framework in its entirety has not been applied to corpus data, as far as
we are aware. However, it is not difficult to envisage its application. The scheme
could lead to segmentation and annotation in a corpus in the way that is illustrated
in example (6), taken from the BNC:
(6) <al>Excuse me</al> <ha-preparatory>could you speak up</ha-preparatory>
<im-minimiser>just a little bit</im-minimiser>?

This example is designed to be illustrative only (the tag labels are unnecessarily
long).
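
Once a corpus is segmented in this way, the segments can be retrieved mechanically. The following Python sketch assumes the tag names of example (6), written without internal spaces (al, ha-preparatory, im-minimiser); the regular expression and the function `segments` are our own illustration, not an existing CCSARP tool.

```python
import re

# Example (6), with tag names written without internal spaces.
SAMPLE = ("<al>Excuse me</al> <ha-preparatory>could you speak up</ha-preparatory> "
          "<im-minimiser>just a little bit</im-minimiser>?")

# Category (al, ha, sm, im), optional subtype after a hyphen, then the segment text.
SEG_RE = re.compile(r"<(\w+)(?:-(\w+))?>(.*?)</\1(?:-\w+)?>")


def segments(markup):
    """Return (category, subtype or None, text) for each tagged segment."""
    return [(category, subtype or None, text)
            for category, subtype, text in SEG_RE.findall(markup)]


parsed = segments(SAMPLE)
head_acts = [text for category, _, text in parsed if category == "ha"]
```

A retrieval of this kind would allow, for example, all head acts of a given directness category to be listed together with their accompanying alerters and modifiers.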
Culpeper and Archer’s (2008) study of requests is relevant here. Their primary
aim was to determine which features were more prototypical of late Early Modern
English requests (based, once again, on the Sociopragmatic Corpus). Their second-
ary aim is more pertinent for this paper, however, as they also assessed the appli-
cability of parts of the CCSARP framework to requests in this period. They were
able to confirm that “Blum-Kulka, House, and Kasper’s (1989) ‘broad’ categories
of directness are indeed applicable […] and thus can be seen as ‘universal’ in some
sense” (Culpeper and Archer 2008: 79). But they go on to highlight a “fundamental
problem” with associating CCSARP categories of directness with head acts (to
the neglect of supportive moves) – in this period at least – especially “if we are
concerned with explicitness as opposed to directness in Searle’s sense” (Culpeper
and Archer 2008: 79). This is because “support moves can become
pragmaticalized so that they not only ‘support’ another element signaling the illocutionary
force but they themselves actually signal the illocutionary force”. They argue, for
example, that it is difficult to imagine that prithee (which was strongly associated
with requestive acts in the late Early Modern period) “would not have been taken
in many contexts as evidence of the illocutionary force [of a request] on its own”
(Culpeper and Archer 2008: 79).

9
The label “alerter” is unfortunate for a category that includes terms of address. Terms of
address do far more than alert the addressee; they are more like support moves in that
they often pragmatically support the head act.

Turning to another pragmatic area, virtually no corpus annotation work has
tackled what the speaker implies and what the hearer infers, that is to say, the
inferential processes usually discussed with reference to Grice’s Cooperative Principle
(1975) or Sperber and Wilson’s Relevance Theory (1986). However, Archer (2002)
offers an analytical framework that allows for a quantitative investigation of (the
various ways in which participants broke) the Gricean maxims within the histor-
ical courtroom. Grice (1975) identified four maxims (Quality, Quantity, Relation
and Manner) that, taken together, specify what participants have to do in order to
converse in a maximally efficient, rational, co-operative way (i. e. speak sincerely,
relevantly and clearly, while providing sufficient information). Grice also sug-
gests that, as interactants, we can manipulate these maxims in order to generate an
implicature (i. e. cause our interlocutor to look for an additional meaning beyond
the surface level meaning). Archer (2002: 10) uses Gricean maxims as annotation
categories. She also adds two further categories, “coop” and “ambig”, which
signal surface-level cooperation and uncertainty, respectively. This scheme was
used to investigate surface and deeper level “cooperation” (or the lack of) in his-
torical courtroom discourse.
Of course, the lack of formal correlates and the sheer complexity of the infer-
ential system place a very heavy burden on the corpus-based analyst. One thing
to bear in mind here is that the annotator is not supplying absolute values to seg-
ments of data, but an interpretation that it has a particular value. McEnery’s (1995)
argument, in the context of computational pragmatics, is therefore relevant here.
Developing an idea first posited by Leech (1983), McEnery (1995: 37) contends,
first, that inferencing should be seen as probabilistic, and, second, that Grice’s
(1982) own “deeming” process appears to confirm this:
Given that complete understanding is impossible, Grice says that the generation of ap-
proximations to meaning will lead to the interpretation of an utterance being the opti-
mum realisation of the meaning of that utterance. Grice says that these approximations
are ones that ought to be deemed to “satisfy a given ideal even though they do not,
strictly speaking, exemplify it” – that is utterances may be taken to mean one thing, even
though some uncertainty may exist as to whether they definitely do have that meaning.
So one may not, strictly speaking, in any model of intention, say that A meant C. But
contextually it is legitimate for us to deem that A meant C […]

4.2. Corpus tools for pragmatics exploitation

Throughout this paper, we have hinted at the difficulty of automatically identifying
pragmatic phenomena (see, e. g. 2–2.1, 3.1.2–3.1.3, and 3.2.2–3.2.3). This does
not mean that all language phenomena are impossible to annotate automatically.
A focus on formal categories, found using procedures such as (hidden) Markov
modelling for instance, is helping to ensure that POS taggers become increasingly
robust (Jurafsky et al. 2014). In contrast, many of the pragmatic phenomena
we have discussed to date are functional categories. This tends to mean that
they are “fuzzy” and/or can be realized by a range of different linguistic forms.
Automatic annotation is much more difficult, in consequence – if not impossible
in some instances. There are what we might call “back-door” opportunities for
finding pragmatic phenomena using automated tools, however. As noted in 3.2.3,
Archer and Culpeper (2009) have used Wmatrix3, along with their sociopragmatic
annotation scheme, to investigate keyness features, for example. Such explorations
are possible at three levels within Wmatrix3 – word level, part-of-speech level
and semantic-field level – as texts uploaded to this content analysis and compar-
ison tool are automatically assigned part-of-speech tags and semantic field tags
prior to the keyness method being applied. The part-of-speech tagger makes use of
CLAWS4 software (a list of the 137 parts of speech is available at:
http://ucrel.lancs.ac.uk/claws/). The semantic field tagger (known as USAS) consists of
twenty-one macro categories, which expand into 232 semantic fields (a list of the 232
“semtags” is available at: http://ucrel.lancs.ac.uk/usas/USASSemanticTagset.pdf).
In more recent work, Archer has suggested that the keyness approach is not the
only means by which a tool initially designed to enable semantic field analysis might
also be used to investigate pragmatic phenomena. In Archer (2014), she outlines
an approach for investigating verbal aggression and related discursive phenomena
based on targeting utterances or sentences using specific semtags: namely, speech
acts (Q2.2), “good/bad” evaluation (A5.1+/–), “true/false” evaluation (A5.2+/–),
“anger/violence” (E3–), (im)politeness (S1.2.4+/–) and “respect/lack of” (S7.2+/–).
Archer is careful to point out that, although this approach has proven to be very
fruitful, it provides potential indicators of verbal aggression only. As such, words
and phrases captured by the semtags must be viewed in their context-of-use (i. e.
re-contextualised by the researcher, using an expand context facility within Wma-
trix3) so that false positives can be differentiated from genuine examples of verbal
aggression. The approach also assumes that semantic fields share several similar-
ities with a pragmatic space of inter-related speech acts, such as both being “ana-
lysed in relation to neighboring expressions” (Jucker and Taavitsainen 2000: 74).
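The retrieval step of this approach can be sketched as follows. This is an illustration only, not Wmatrix3's own interface: it assumes that a text is already available as a list of (word, semtag) pairs, writes the minus sign in tags with an ASCII hyphen, and uses the placeholder tag "OTHER" for every token outside the target fields. The sample sentence is invented, loosely modelled on the Old Bailey usage of politely discussed in this section.

```python
# Semtag prefixes targeted in the approach described above; matching on the
# prefix captures both the positive and the negative pole of a field.
TARGET_PREFIXES = ("Q2.2", "A5.1", "A5.2", "E3-", "S1.2.4", "S7.2")


def candidate_hits(tokens, window=5):
    """Return (index, word, semtag, context) for tokens in a target field."""
    hits = []
    for i, (word, semtag) in enumerate(tokens):
        if semtag.startswith(TARGET_PREFIXES):
            left = max(0, i - window)
            # Keep the surrounding words so the researcher can judge the hit.
            context = " ".join(w for w, _ in tokens[left:i + window + 1])
            hits.append((i, word, semtag, context))
    return hits


# Invented, pre-tagged sample; "OTHER" is a hypothetical placeholder tag.
sample = [
    ("he", "OTHER"), ("picked", "OTHER"), ("the", "OTHER"),
    ("pocket", "OTHER"), ("very", "OTHER"), ("politely", "S1.2.4+"),
]

for hit in candidate_hits(sample):
    print(hit)
```

Every hit returned in this way is a potential indicator only. The context string is there to support the manual re-contextualisation step, not to replace it.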
A problem with using Wmatrix3 to explore diachronic data pragmatically is
that it was designed with modern data in mind. As such, it cannot cope well with
meaning change over time. In her own study of Old Bailey trial texts dating from
the late eighteenth century, for instance, Archer (2014: 278) noted a speaker’s
use of politely to describe the deftness with which he saw a thief pick someone’s
pocket. In this case, the meaning of politely aligned with the now obsolete sense,
“smoothly”, but Wmatrix3 wrongly assigned politely to the semtag S1.2.4+,
which encompasses extant senses such as having “consideration for the feelings
of others”.10 One way of alleviating the problem of tag mis-assignment due to
meaning change over time is to draw upon the Historical Thesaurus Semantic Tagger
(HTST).11 The HTST combines the aforementioned CLAWS and USAS
annotation tools within Wmatrix3 with (i) a VARiant Detector (VARD) designed
to link variant spellings to their modern equivalents (Baron and Rayson 2008),
and (ii) HT codes derived from the Historical Thesaurus of the Oxford English
Dictionary (HTOED). The VARD helps to eradicate tag mis-assignments due to
spelling differences. The HT codes provide HTST users with access to 700,000
word senses arranged into 225,000 time-sensitive categories, thus ensuring that
their annotation results are demonstrably more accurate over time than when
reliant on Wmatrix3 alone.

10
“politely, adv.” OED Online. Oxford University Press, September 2016. 29 November
2016.
518 Dawn Archer and Jonathan Culpeper
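The principle of linking variant spellings to modern equivalents before tagging can be illustrated with a toy lookup. The sketch below is not VARD's actual method or interface: the function name and the three-entry variant list are assumptions made purely for illustration, and a real tool would draw on far richer resources than a fixed dictionary.

```python
# Toy variant list (hypothetical); keys are lower-cased historical spellings.
VARIANTS = {"doe": "do", "beleeve": "believe", "sayd": "said"}


def normalise(tokens, variants=VARIANTS):
    """Pair each original token with a modernised form for the tagger."""
    # The original spelling is kept alongside the normalised one, so that
    # the source text is never overwritten by the annotation step.
    return [(token, variants.get(token.lower(), token)) for token in tokens]


for original, modern in normalise(["I", "doe", "beleeve", "it"]):
    print(original, "->", modern)
```

Keeping both forms means that a tagger can work on the modernised spelling while the analyst still sees what was actually written.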
Archer and Malory (2017) have recently developed three innovative means
of analyzing instances of (im)politeness and other facework phenomena semi-au-
tomatically using the CQPweb interface of HTST. The first method makes use
of portmanteau tags of two or more USAS semtags, which point users to, for
example, language characterised by threats and invective (via Q2.2/E3–), insult,
mockery, ridicule, sarcasm and taunts (via Q2.2/S1.2.4–) or boasts and brag-
ging (via S1.2.3+/Q2.2), etc. An example of the S1.2.3+/Q2.2 portmanteau tag,
taken from Archer and Malory (2017), involved the British Member of Parliament
(MP), Michael Portillo, who stated “that the [then] Chancellor like[d] to swagger
and boast about the working families tax credit” (S6CV0347P0, 6/04/2000) in
the UK House of Commons. The second method makes use of specific HTOED
classifications and/or HT codes, and can be effective in tracing name-calling that
is based on a certain characteristic. By way of illustration, the HTOED classifi-
cation, 01.02.01.02.01.03.02.01, can be used to highlight instances of the term,
moron, in datasets such as (Historic) Hansard. Some uses of moron by British
MPs were descriptive (i. e. signalled mental impairment) but others functioned as
a name-calling strategy (see, e. g. “The hon. Member is a moron”, S5CV0623P,
16/05/1960). The third method highlighted by Archer and Malory engages in what
they call “meaning constellation searches”. These are searches made up of USAS
semtags, HT codes and/or CLAWS part-of-speech tags, that is, combinations of
semantic fields and/or parts-of-speech. They include the “bad behaviour/accusation/speech act”
meaning constellation, which identified hits such as the following in (Historic)
Hansard, where an MP indirectly accused one of his counterparts of bias:
When the hon. Member wants to throw cold water upon the stories of atrocities in Belgium,
why he should always drag in his sneers about the Belgian atrocities in the Congo
I leave it for the House and country to judge […] (S5CV0068P0_01653, 16/11/1914)

11
HTST was developed as part of the cross-university, AHRC/ESRC funded, SAMUELS
project (grant reference AH/L010062/1).

Archer and Malory found the meaning constellation search method to be the most
effective of the three as it tended to point to more true positives, in percentage
terms.12 Meaning constellations made up of semantic fields with an explicit
im/politeness association (e. g. (dis)respect, praise, bad behaviour, contempt, etc.)
were especially likely to have a high incidence of true positives. As with Wmatrix3,
it remains important for the researcher to check each hit returned automatically,
regardless of the search method used, by viewing it in its specific context-of-use.
This is because interactional phenomena like face threatening or face enhancing acts
cannot be understood properly without fully appreciating how they are being used
by the speaker, and received by relevant others, in their context-of-use.
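Two steps in the procedure just described lend themselves to a short sketch: matching a portmanteau tag against a set of required semtags, and expressing the outcome of the manual check as a percentage of true positives. The code below is a simplified illustration and does not reproduce the CQPweb query syntax; the function names, the tagged sample and the judgements are invented, and minus signs in tags are written with an ASCII hyphen.

```python
def has_all(tag, required):
    """True if a slash-separated portmanteau tag contains every required part."""
    return set(required) <= set(tag.split("/"))


def share_true_positives(judgements):
    """Proportion of manually checked hits that were judged genuine."""
    return sum(judgements) / len(judgements) if judgements else 0.0


# Invented tagged tokens; "OTHER" is a hypothetical placeholder tag.
tokens = [("swagger", "S1.2.3+/Q2.2"), ("and", "OTHER"), ("boast", "S1.2.3+/Q2.2")]
boasts = [word for word, tag in tokens if has_all(tag, ["S1.2.3+", "Q2.2"])]
print(boasts)

# Invented outcome of checking four hits in their context-of-use.
print(share_true_positives([True, False, True, True]))
```

The second function only summarises the researcher's own judgements; it cannot stand in for reading each hit in context.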

5. Conclusion

In our 2008 paper on pragmatic annotation, we noted that “pragmatic annotation can-
not be fully automated, though tagging can be computationally-assisted” (Archer,
Culpeper, and Davies 2008: 25). This is still the case. We also noted that a means
of disseminating information “about models people have devised, so that research-
ers can build on each others’ work rather than, as it were, reinventing the wheel”
(2008: 25) was largely lacking. We wrote the 2008 article in the hope of making a
small contribution to “spreading the word”. It has been good, therefore, to see that
there is growing evidence of researchers drawing upon, extending and/or comparing
their own pragmatic annotation schemes with those of others (e. g. Lutzky 2008;
Weisser 2014a). There is still much to do, however. A positive next step, for exam-
ple, would be to see more chapters on pragmatic annotation in handbooks. Since
2008, a gold standard has not emerged, when it comes to pragmatic annotation, in
spite of organisations such as EAGLES pressing for one. This is largely because
researchers are still devising “pragmatic annotation schemes that meet their per-
sonal research objectives” (2008: 25). The fear that this will have the “unfortunate
consequence that research efforts do not interface with one another” (2008: 25) is
perhaps not well founded within this particular area of pragmatics, however, given
our comment above about researchers increasingly using, further enhancing and
comparing different pragmatic annotation schemes. Our own position, today, is that,
if we are to continue seeking a “basic reference model”, or models acting as a kind
of gold standard, the criteria capturing “interpretative” categories or concepts must,
first and foremost, be flexible. Such guidelines might emphasise the need to be
systematic enough to ensure replicability (and, by so doing, ensure its usefulness to
others), the need to balance delicacy of categorisation with the ability to fill categories
with a statistically meaningful quantity of members, and so on.

12
The desire to eradicate false positives (i. e. results which do not turn out to be genuine
examples of facework) is partly what prompted the authors to begin simultaneously
searching for a combination of (consecutive) HT_codes, or HT_codes plus semtags
and/or POS tags.

If we are called upon to write another pragmatics annotation paper ten years
from now, we would like to be able to report that the types of schemes and pro-
cedures discussed in this paper are being applied to languages other than Eng-
lish. We would also like to be able to report many more approaches that focus on
sociopragmatic features (as well as pragmalinguistic features). This means moving
away from the co-text and/or form alone to investigate pragmatic functions, turn
by turn, across whole interactions. As we highlighted in 2008, corpus annotation
provides an extremely effective means of tagging pragmatic phenomena so that
they can be analysed both systematically and rigorously, across interactions, with
the aid of computers. We believe now, as we did in 2008, that this can only serve
to strengthen the relationship between corpus linguistics and pragmatics, whilst
nonetheless maintaining the view “that annotation should be regarded as an inter-
pretative record only, so as to ensure that we do not over-state the importance of
the annotation in relation to the text” (2008: 24). Last but not least, we would like
to be able to report that some of the areas of pragmatics, flagged here as having
the potential to be explored through corpus annotation methods, have realised this
potential through the creation of innovative annotation schemes. This will ensure
that the work around pragmatic annotation (and, perhaps, corpus pragmatics more
broadly) maintains the links with – and, importantly, is seen as being relevant to –
more mainstream pragmatics.

References

Aijmer, Karin and Christoph Rühlemann

2014 Corpus Pragmatics: A Handbook. Cambridge: Cambridge University Press.
Archer, Dawn
2002 “Can innocent people be guilty?” A sociopragmatic analysis of examination
transcripts from the Salem Witchcraft Trials. Journal of Historical Pragmatics
3(1), 1–30.
Archer, Dawn
2005 Questions and Answers in the English Courtroom (1640–1760): A Socioprag-
matic Analysis. (Pragmatics and Beyond Series.) Amsterdam: John Benjamins.
Archer, Dawn
2014 Exploring verbal aggression in English historical texts using USAS: The pos-
sibilities, the problems and potential solutions. In: Irma Taavitsainen, Andreas
H. Jucker and Jukka Tuominen (eds.), Diachronic Corpus Pragmatics, 277–
301. Amsterdam/Philadelphia: John Benjamins.
Archer, Dawn and Jonathan Culpeper
2003 Sociopragmatic annotation: New directions and possibilities in historical
corpus linguistics. In: Andrew Wilson, Paul Rayson and Anthony M. McEn-
ery (eds.), Corpus Linguistics by the Lune: A Festschrift for Geoffrey Leech,
37–58. Frankfurt Main: Peter Lang.
Archer, Dawn and Jonathan Culpeper

2009 Identifying key sociophilological usage in plays and trial proceedings (1640–
1760): An empirical approach via corpus annotation. Journal of Historical
Pragmatics 10(2): 260–283.
Archer, Dawn and Beth Malory
2017 Tracing facework over time using (semi)automated methods. International
Journal of Corpus Linguistics 22(1): 27–56.
Archer, Dawn, Jonathan Culpeper and Matthew Davies
2008 Pragmatic annotation. In: Merja Kytö and Anke Lüdeling (eds.), Corpus Lin-
guistics: An International Handbook, 613–642. Berlin: Mouton de Gruyter.
Austin, John L.
1962 How to do Things with Words. Oxford: Oxford University Press.
Bach, Kent and Robert M. Harnish
1979 Linguistic Communication and Speech Acts. Cambridge, MA: M.I.T. Press.
Ballmer, Thomas T. and Waltraud Brennenstuhl
1981 Speech Act Classification: A Study of the Lexical Analysis of English Speech
Activity Verbs. Berlin/New York: Springer-Verlag.
Baron, Alistair and Paul Rayson
2008 VARD 2: A tool for dealing with spelling variation in historical corpora. Pro-
ceedings of the Postgraduate Conference in Corpus Linguistics, Aston Univer-
sity, Birmingham, UK, 22 May 2008.
Biber, Douglas, Eniko Csomay, James K. Jones and Casey Keck
2004 A corpus linguistic investigation of vocabulary-based discourse units in uni-
versity registers. In: Ulla Connor and Thomas A. Upton (eds.), Applied Corpus
Linguistics: A Multidimensional Perspective, 53–72. Amsterdam: Rodopi.
Baldridge, Jason and Alex Lascarides
2005 Annotating discourse structures for robust semantic interpretation. In: Pro-
ceedings of the 6th International Workshop on Computational Semantics.
IWCS-6. Tilburg, Netherlands.
Blum-Kulka, Shoshana, Juliane House and Gabriele Kasper (eds.)
1989 Cross–Cultural Pragmatics: Requests and Apologies. Norwood, N.J.: Ablex
Publishing Corporation.
Bunt, Harry C.
1994 Context and dialogue control. Think 3: 19–31.
Carletta, Jean, Nils Dahlbäck, Norbert Reithinger and Marilyn A. Walker
1997a Standards for Dialogue Coding in Natural Language Processing. Technical
Report no. 167, Dagstuhl Seminars. Report from Dagstuhl seminar number
9706.
Carletta, Jean, Amy Isard, Stephen Isard, Jacqueline C. Kowtko, Gwyneth Doherty–Sned-
don and Anne H. Anderson
1997b The reliability of a dialogue structure coding scheme. Computational Linguis-
tics 23: 13–32.
Carlson, Lynn, Daniel Marcu and Mary E. Okurowski
2003 Building a discourse–tagged corpus in the framework of rhetorical structure
theory. In: Jan van Kuppevelt and Ronnie W. Smith (eds.), Current and New
Directions in Discourse and Dialogue, 85–112. Dordrecht, Boston and Lon-
don: Springer.
Clark, Herbert H.
1992 Arenas of Language Use. Chicago/London: University of Chicago Press.
Core, Mark and James F. Allen
1997 Coding dialogs with the DAMSL annotation scheme. In: Working Notes of the
AAAI Fall Symposium on Communicative Action in Humans and Machines,
28–35. Boston, MA: AAAI.
Csomay, Eniko
2005 Linguistic variation in the lexical episodes of university classroom talk. In:
Andrea Tyler, Mari Takada, Yiyoung Kim and Diana Marinova (eds.), Language
in Use. Cognitive and Discourse Perspectives on Language and Language
Learning, 302–324. (Georgetown University Round Table on Languages and
Linguistics, GURT 2003). Georgetown: Georgetown University Press.
Culpeper, Jonathan and Dawn Archer
2008 Requests and directness in Early Modern English trial proceedings and play
texts, 1640–1760. In: Andreas H. Jucker and Irma Taavitsainen (eds.), Speech
Acts in the History of English, 45–84. Amsterdam/Philadelphia: John Benja-
mins.
Culpeper, Jonathan and Mathew Gillings
2018 Politeness variation in England: A north-south divide? In: Vaclav Brezina,
Robbie Love and Karin Aijmer (eds.), Corpus Approaches to Contemporary
British Speech: Sociolinguistic Studies of the Spoken BNC2014. London: Rou-
tledge.
Gibbon, Dafydd, Roger Moore and Richard Winski
1998 Handbook of Standards and Resources for Spoken Language Systems. Berlin:
Mouton de Gruyter.
Goodwin, Charles
1996 Transparent vision. In: Elinor Ochs, Emanuel A. Schegloff and Sandra A.
Thompson (eds.), Interaction and Grammar, 370–404. Cambridge: Cam-
bridge University Press.
Grice, H. Paul
1975 Logic and Conversation. In: Peter Cole and Jerry L. Morgan (eds.), Syntax and
Semantics 3: Speech Acts, 41–58. New York: Academic Press.
Grice, H. Paul
1982 Meaning revisited. In: Neilson V. Smith (ed.), Mutual Knowledge, 223–245.
London: Academic Press.
Hallgren, Kevin A.
2012 Computing inter–rater reliability for observational data: An overview and
tutorial. Tutorials in Quantitative Methods for Psychology 8(1): 23–34.
Houghton, George
1986 The production of language in dialogue: A computational model. Ph.D. thesis,
University of Sussex.
Hymes, Dell
1972 Towards ethnographies of communication: The analysis of communicative
events. In: Pier P. Giglioli (ed.), Language and Social Context, 21–44. Lon-
don: Penguin.
Jucker, Andreas H. and Irma Taavitsainen
2000 Diachronic speech act analysis: Insults from flyting to flaming. Journal of
Historical Pragmatics 1(1): 67–95.
Jucker, Andreas H., Daniel Schreier and Marianne Hundt (eds.)

2009 Corpora: Pragmatics and Discourse. Papers from the 29th International Con-
ference on English Language Research on Computerized Corpora (ICAME
29). Ascona, Switzerland, 14–18. (Language and Computers: Studies in Prac-
tical Linguistics 68). Amsterdam: Rodopi.
Jurafsky, Daniel
2004 Pragmatics and computational linguistics. In: Laurence R. Horn and Gregory
Ward (eds.), Handbook of Pragmatics, 578–604. Malden, MA: Blackwell Pub-
lishing.
Jurafsky, Daniel and James H. Martin
2000 Speech and Language Processing: An Introduction to Natural Language Pro-
cessing, Computational Linguistics and Speech Recognition. New Jersey:
Prentice Hall.
Jurafsky, Daniel, James H. Martin, Peter Norvig and Stuart Russell
2014 Speech and Language Processing: An Introduction to Natural Language Pro-
cessing, Computational Linguistics and Speech Recognition. Second edition.
New Jersey: Prentice Hall.
Kallen, Jeffrey L. and John M. Kirk
2012 SPICE–Ireland: A User’s Guide. Belfast: Queen’s University Belfast, Trinity
College Dublin and Cló Ollscoil na Banríona.
Leech, Geoffrey N.
1983 Principles of Pragmatics. London: Longman.
Leech, Geoffrey N.
1997 Introducing corpus annotation. In: Roger Garside, Geoffrey Leech and Tony
McEnery (eds.), Corpus Annotation, 1–18. London/New York: Longman.
Leech, Geoffrey and Martin Weisser
2003 Generic speech act annotation for task-oriented dialogues. In: Dawn Archer,
Paul Rayson, Andrew Wilson and Tony McEnery (eds.), Proceedings of the
Corpus Linguistics 2003 Conference, 441–446. Lancaster, University Centre
for Computer Corpus Research on Language Technical Papers 16(1).
Levinson, Stephen C.
[1979] 1992 Activity types and language. In: Paul Drew and John Heritage (eds.), Talk
at Work, 66–100. Cambridge: Cambridge University Press.
Lutzky, Ursula
2008 The discourse marker marry – a sociopragmatic analysis. Vienna English
Working Papers 17(2): 3–20.
Leech, Geoffrey N., Paul Rayson and Andrew Wilson
2001 Frequencies in Written and Spoken English. Harlow: Pearson.
McEnery, Tony
1995 Computational pragmatics: Probability, deeming and uncertain references.
Unpublished PhD thesis. Lancaster, Lancaster University.
McEnery, Tony and Andrew Wilson
1996 Corpus Linguistics. Edinburgh: Edinburgh University Press.
Perrault, C. Raymond and James F. Allen
1980 A plan–based analysis of indirect speech acts. American Journal of Computa-
tional Linguistics 6: 167–182.
Power, Richard
1979 The organization of purposeful dialogs. Linguistics 17: 105–152.
Romero–Trillo, Jesús (ed.)
2008 Pragmatics and Corpus Linguistics. Berlin: Mouton de Gruyter.
Sacks, Harvey, Emanuel A. Schegloff and Gail Jefferson
1974 A simplest systematics for the organisation of turn-taking in conversation.
Language 50(4): 696–735.
Schegloff, Emanuel A.
1968 Sequencing in conversational openings. American Anthropologist 70: 1075–
1095.
Schegloff, Emanuel A.
1988 Presequences and indirection: Applying speech act theory to ordinary conver-
sation. Journal of Pragmatics 12: 55–62.
Schegloff, Emanuel A., Gail Jefferson and Harvey Sacks
1977 The preference for self–correction in the organization of repair in conversa-
tion. Language 53: 361–382.
Searle, John R.
1969 Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge
University Press.
Searle, John R.
1975 Indirect speech acts. In: Peter Cole and Jerry L. Morgan (eds.), Syntax and
Semantics 3: Speech Acts, 59–82. New York: Academic Press.
Searle, John R.
1976 A classification of illocutionary acts. Language in Society 5: 1–23.
Sinclair, John and Malcolm Coulthard
1975 Towards an Analysis of Discourse: The English Used by Teachers and Pupils.
Oxford: Oxford University Press.
Sperber, Dan and Deirdre Wilson
1986 Relevance: Communication and Cognition. Oxford: Blackwell.
Stede, Manfred
2004 The Potsdam Commentary Corpus. In: Bonnie Webber and Donna Byron
(eds.), Proceedings of the ACL Workshop on Discourse Annotation, 96–102.
Barcelona, Spain.
Stenström, Anna-Brita
1984 Questions and Responses in English Conversation. Malmö: Liber Förlag.
Stiles, William B.
1992 Describing Talk: A Taxonomy of Verbal Response Modes. London: Sage Pub-
lications.
Stolcke, Andreas, Klaus Ries, Noah Coccaro, Elizabeth Shriberg, Rebecca Bates, Dan-
iel Jurafsky, Paul Taylor, Rachel Martin, Carol Van Ess–Dykema and Marie
Meteer
2000 Dialogue act modeling for automatic tagging and recognition of conversa-
tional speech. Computational Linguistics 26(3): 339–371.
Taavitsainen, Irma, Andreas H. Jucker and Jukka Tuominen (eds.)
2014 Diachronic Corpus Pragmatics. (Pragmatics & Beyond New Series 243).
Amsterdam: John Benjamins.
Verschueren, Jef
1999 Understanding Pragmatics. London: Arnold.
Weisser, Martin
2014a Speech act annotation. In: Karin Aijmer and Christoph Rühlemann (eds.),
Corpus Pragmatics: A Handbook, 84–113. Cambridge: Cambridge University
Press.
Weisser, Martin
2014b The Dialogue Annotation and Research Tool (DART). Version 1.0. Available
at: Martinweisser.org/ling_soft.html#DART
Wierzbicka, Anna
1987 English Speech Act Verbs: A Semantic Dictionary. Sydney: Academic Press.
Historical corpus pragmatics 527

21. Historical corpus pragmatics

Irma Taavitsainen

Abstract: Historical corpus pragmatics had its beginning in the mid-1990s. The
development of the field has been rapid, and the field at present is very different
from its initial stages. New tools and innovative methods of analysis make it pos-
sible to answer more ambitious research questions, and sizes of historical corpora
have grown from one and a half million to hundreds of millions of words.
Contextualizing is essential in pragmatics: assessments ranging from the narrow linguistic
co-text to the broad cultural context and sociohistorical developments are of special
importance. At present, new challenges are being posed by Digital Humanities,
paving the way to future trends.

1. Introduction

Corpus pragmatics is a methodological framework that can adopt either synchronic
(including historical stages of language) or diachronic perspectives. It relies on
empirical assessments of authentic language use employing digital corpora, i. e.
principled collections of natural texts, compiled according to some clearly defined
criteria, usually aimed at being representative of the target language variety. His-
torical corpus pragmatics investigates real language use of the past, as recorded
in historical corpora, and corpus compilation and annotation are its cornerstones.
Studies make use of computer techniques in different ways, utilising different
retrieval tools and quoting supporting evidence. The scope is broad: from mainly
qualitative corpus-based studies to quantitative assessments relying on advanced
statistical methods. The range extends from lexico-grammatical features and col-
locations to semantic and pragmatic aspects of language use, unfolding discourse
and representations of attitudes and ideology. Real language examples need con-
textualisation and explications of the sociohistorical circumstances of text produc-
tion and parties of communication, which are essential for all pragmatic studies.
Consequently, the patterns of language use under scrutiny are considered in their
multilayered context for interpretation.
Historical approaches to corpus pragmatics differ somewhat from corpus
pragmatic studies on present-day materials (see Aijmer and Andersen in
this volume). This chapter discusses characteristics, pitfalls and achievements
of historical corpus pragmatics and points out relevant differences between
modern and historical studies. It takes time before the potential of new methods
is developed in full and researchers learn to apply them in an optimal
way.1 The available databases have developed a great deal within recent years (see
7.1 below) and provide an abundance of material with easy access. Metadata are
sometimes given, but mostly the linguistic forms are retrieved in isolation from
their textual and sociohistorical contexts, which require special study. Applications
to historical pragmatics need developing, as contextualisation may present even
more problems in historical studies than in assessments of present-day language
use. Since the early days, the methodology has advanced and research questions
have become more sophisticated. Indeed, in 2015 historical corpus pragmatics
was described as “one of the latest (and most fruitful) synergies of pragmatics
and corpus linguistics” (Clancy and O’Keeffe 2015: 236). In the following, I shall
survey how the field has developed since the early days, and how corpus linguistic
methods have been applied to historical data for pragmatic research questions. I
shall demonstrate their range, give some case studies, discuss the present state of
the art and finish with future challenges.

https://doi.org/10.1515/9783110424928-021
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 527–553. Berlin/
Boston: De Gruyter Mouton.

2. A survey of the early stages

2.1. First steps of historical corpus pragmatics

Modern corpus linguistic studies were first conducted in the 1960s, but pragmatic
studies came to corpus linguistics later in the nineties and after the turn of the
millennium (Rühlemann and Aijmer 2015), and historical pragmatics on a larger
scale even later (Jucker and Taavitsainen 2014). Historical topics were introduced
into corpus linguistics with the Helsinki Corpus of English Texts (HC hereafter),
compiled in the 1980s and released in 1991. The new principled and structured dig-
ital database was received with enthusiasm, as it opened up new possibilities and
inspired a wide range of pioneering studies, but applications of corpus linguistic
methods to pragmatic research questions had to wait till 1995. Frequency counts
were an innovative feature in pragmatic studies of the mid-1990s, and although
the applications were simple, they marked a new way of doing linguistic research.2
Statistical methodology was borrowed from the hard sciences with their criteria
of replicability and objectivity, but do these requirements really apply to language
studies on human communication? Corpus-linguistic searches are replicable, but
interpretations of the results always contain a subjective element, and although
frequency counts can provide an answer to pragmatic research questions it may
be with a societal or cultural slant. They are better suited for some other branches,
e. g. at the interface between semantics and pragmatics where they can show the
pace of pragmatic processes when tracing changes of meaning. Most importantly,
the two key concepts of pragmatics, variability and negotiability (Verschueren
1999: 59), bring their own dynamics to corpus linguistic studies. The underlying
theoretical view in corpus linguistics is the variationist approach to language, in
which language is considered a constantly changing entity, with current options
providing the basis of language use (see e. g. Milroy 1992). The pragmatic notion
of variability can be defined as the range of possibilities from which choices can be
made at any given moment in the course of interaction. Such options are sensitive
to momentary and situational changes that may alter or even reverse the meanings.
Furthermore, interpersonal relations may change in the course of the interaction as
each new utterance creates new context. In historical studies, negotiated meanings
may not be replicable as such, and early periods have their own peculiarities that
should be taken into account. From the pragmatic point of view, two different
realisations of an utterance, even if similar in their surface structure, may not be saying
the same thing at all, as subtle shades of meaning are created in each new context.3

1
For example, at the initial phase the first frequency counts were often given without
context. There was a break with the philological tradition, perhaps on purpose to
emphasise the “scientific” nature of corpus studies. But this has been amended and
contextual assessments are an essential component in all corpus studies now.
2
However, only a few studies on digital corpora with frequency data are found in the
inaugural volume of the new branch of study, Historical Pragmatics: Pragmatic
Developments in the History of English (1995, ed. by Jucker).
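The kind of simple frequency count mentioned above is usually made comparable across subcorpora of different sizes by normalising it to a common base. The following sketch assumes a base of 10,000 words; the function name, the period labels and all figures are invented for illustration and do not come from any of the corpora discussed here.

```python
def per_10k(raw_count, corpus_size):
    """Normalise a raw frequency to occurrences per 10,000 words."""
    return raw_count / corpus_size * 10_000


# Invented figures: (raw hits, subcorpus size in words).
subcorpora = {"period A": (30, 150_000), "period B": (45, 450_000)}

for label, (hits, size) in subcorpora.items():
    print(f"{label}: raw {hits}, {per_10k(hits, size):.1f} per 10,000 words")
```

In this invented case the later period has more hits in absolute terms but a lower normalised frequency, which is why raw counts alone can mislead. Such a figure remains a starting point for interpretation rather than an answer to a pragmatic research question.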

2.2. Historical corpora and their developments

We have come a long way to reach the present state of historical corpus develop-
ments. The pioneer was the HC, and after it several single-register or single-variety
corpora were created by the HC team members, with correspondence, medical
and scientific texts and Older Scots being the most important ones (see below and
CoRD4). These corpora include several features that were novel at the time,
such as sociolinguistic awareness of corpus design, metatextual apparatus, and
even the “visual prosody”, e. g. lay-out and hand(s) on the page, in the source
material of the Corpus of Scottish Correspondence (Meurman-Solin 2013). Early
milestones of historical corpus compilation were the Lampeter Corpus (1999), which
focuses on pamphlets of various topics from 1640 to 1740, a time that is marked
by the rise of mass publication, and the Zurich English Newspaper Corpus (ZEN),
a collection of early English newspaper texts from the late 17th and 18th centuries.
ZEN inspired studies on communicative aspects of language use already in the
1990s, prior to its release in 2004. The historical dialogue corpus, The Corpus of
English Dialogues (2006, CED) records texts of interactive face-to-face communication
and is specially designed for pragmatic research tasks (see Culpeper and
Kytö 2010). It became available to researchers in 2006. Part of it has been enriched
with sociopragmatic annotation and will be discussed in connection with “rich”
corpora; this is an area where a great deal has happened in recent years (see below
and Archer and Culpeper in this volume).

3
The notion of pragmatic variables (Jucker and Taavitsainen 2012) could prove useful
in a researcher’s toolkit e. g. by correlating pragmatic principles to background factors,
such as sociolinguistic parameters and contextual information.
4
The Corpus Research Database CoRD at http://www.helsinki.fi/varieng/CoRD/

The three corpora of early English medical writing, Middle English Medical
Texts 1375–1500 (MEMT, 2005), Early Modern English Medical Texts 1500–1700
(EMEMT, 2010) and Late Modern English Medical Texts 1700–1800 (LMEMT,
forthcoming), serve as examples of the development of corpus thinking and rethink-
ing, as each part is composed according to different principles, partly depending
on the source material and partly on the advances in digital humanities. MEMT is
mainly based on editions, as it was not possible to transcribe texts in manuscript
repositories for several reasons. The corpus compilers checked the reliability, and
transcribed some early texts made for historians’ use. EMEMT is mainly based on
Early English Books Online (EEBO) texts. This corpus was a pioneer in taking the
first steps towards multimodality among historical corpora with facsimiles of title
pages and an image gallery of book illustrations. It also includes links to EEBO
(subject to subscription) enabling the end user to have direct access to the original
pages. Marginal notes, underlinings and pointing fingers in the margins, to draw
the reader’s attention, are rare but they give direct evidence of reader response
where extant. Such marks facilitate the pragmatic analysis of the appropriation of
texts and ideas. Metadata in the text catalogue give contextual information, and
links to, e. g., the online Oxford Dictionary of National Biography prompt corpus
users to make their own explorations into background data. The third part of the
corpus series, LMEMT, is mainly based on Eighteenth Century Collections Online
(ECCO) texts in collaboration with the Text Creation Partnership (TCP). The cor-
pus has been developed in a new direction, as the texts are XML coded with meta-
data, and its annotation system allows searches according to various parameters
(manual in preparation by Hiltunen and Tyrkkö).5

3. Pitfalls of historical corpus linguistics at the initial stages and at present

It is pertinent to begin this survey by taking stock of the earlier phases of the
discipline. Its “fallacies” serve to illustrate how much historical corpus research

5
The medical corpus has inspired further corpora on other branches of science. A Corpus
of English Texts on Astronomy (CETA) was released in 2012, and the Málaga Corpus of
Late Middle English Scientific Prose can be accessed, too (see the Appendix).
Historical corpus pragmatics 531

has developed since the early days. The present pitfalls are very different, as will
be shown below.

3.1. Early fallacies

An inspirational article by Matti Rissanen (1989), a pioneer in historical corpus
linguistics, deals with problems associated with work on diachronic corpora. He
formulated three core axioms: “The philologist’s dilemma”, i. e. the fear that
corpora “might mean the wane of philologically oriented language studies” (1989:
16), make studies one-sidedly quantitative and discourage the study of original
texts; “God’s truth fallacy”, or the ill-guided trust that a corpus gives an accurate
reflection of the reality “as we are not intuitively aware of its limitations in the
same way we are with corpora containing present-day language” (1989: 17); and
finally “The mystery of vanishing reliability”, which is connected with corpus
annotation and needs further explanation. The HC is divided according to chronology and text
type, and sociolinguistic parameters are also given. This fine-meshed coding is
“inversely proportional to the amount of evidence in each information area sam-
pled” (1989: 18–19), and the reliability of the quantitative analysis may suffer if
the occurrences in each category are only few.
These points were relevant warnings at the early stage against, on the one hand,
undue optimism and, on the other hand, undue worry that the corpus revolution
would turn into mere number-crunching. With hindsight we can say that developments
have not gone in the envisaged directions. Philology has made a comeback,
but in a different form, relying on the newest digital technologies. Basic philolog-
ical work in editing manuscripts has experienced a new renaissance (see below).
Corpus linguists have learned the limitations of their methods and do not place
unwarranted trust in the results. The methodology has developed greatly, our pres-
ent toolkit contains several applications, and researchers have learned to combine
various methods for more reliable results. Quantitative and qualitative studies are
both applied to data in the same study. This interaction serves as an important meth-
odological innovation combining statistical assessments with data contextualised
as thoroughly as possible. Rissanen’s third point remains valid and is taught to
learners of corpus linguistics as a preliminary warning that reliable conclusions
cannot be drawn from too few occurrences. Thus the old pitfalls have largely been
remedied, but new ones have come to light instead and new worries have replaced
the early ones.

3.2. Present pitfalls
Pragmatic studies on historical data pose several challenges. Change is an inherent
feature of language, and the assumption that identical lexical items express the
same concepts in the past as now often proves erroneous: identical
surface forms can have different meanings and connotations in different periods.
532 Irma Taavitsainen

For example, the adjective outlandish only started to develop a negative sense in
the mid-sixteenth century; in Sir Thomas Wilson’s The arte of Rhetorique its mean-
ing is simply “foreign” (Kay and Allen 2015: 16). Functions change and meanings
change, and several different processes have been identified. Grammaticalisation,
pragmaticalisation and discursisation work in different ways, and they have to be
taken into account already when planning a historical corpus pragmatic study. Lists
of linguistic features in modern grammar books cannot be relied on as the sole point
of departure, but rather the repertoire of features for past periods needs critical scru-
tiny. The items may not be relevant for the time under investigation and new features
may have been added to earlier items. The Oxford English Dictionary (OED hereaf-
ter) provides a reliable point of departure, especially if combined with the Historical
Thesaurus of the Oxford English Dictionary (Kay et al. 2009), which lists changes
according to the semantic fields and records entries and exits of their lexical items.
Another major pitfall is connected with the fact that the early stages of lan-
guages exhibit a great deal of spelling variation. Early studies on HC had great
difficulties with this feature, as tools like A Linguistic Atlas of Late Mediaeval
English (LALME)6 were not available and could not be consulted. Middle English
shows geographical variation to the extreme, personal pronouns and other deictic
elements display dozens of different forms in various parts of Britain, e. g. the
second person singular occurs in more than 420 manuscripts from all over the
country with up to seven different manifestations in several of them.7 For corpus
linguists, variation in spelling makes it difficult to retrieve specific constructions
from the original versions of texts in a reliable way (unless each possible spelling
is checked against wordlists). The current remedy to this problem is offered by
normalised versions of texts. Corpus software programs designed for this purpose
are now available and can be run through the data, but manual checking is required
to ensure that the process works correctly and that the right forms are given as
replacements for non-standard forms. The standardised texts are then used, instead
of the originals, to ensure that the computer performs the frequency calculations
correctly. Automatic
or almost automatic new tools have been developed for this purpose. The Vari-
ant Detector program (VARD)8 was developed for Early Modern English, but the
recent trend is to apply normalisation to later texts as well (e. g. in ARCHER, A
Representative Corpus of Historical English Registers, up to 1850; personal
communication with the compilers). The default values work well for late eighteenth-
and nineteenth-century texts, e. g. for standardising abbreviated endings (’d), but

6
The opus came out in four volumes in late 1986 and the electronic version was released
in 2013. With a “fit technique” written texts can be localised with precision on the basis
of their co-occurring spelling forms.
7
Other medieval vernaculars have the same characteristic.
8
By Alistair Baron. Version 2.5.4 available in 2016, http://ucrel.lancs.ac.uk/vard/userguide/

for the earlier periods the program needs manual training and sometimes even
translation at the initial stage. At the final stage, for quoting illustrative examples,
the researcher should go back to the originals.9
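The principle of searching a normalised layer while quoting from the original can be illustrated with a minimal sketch. It does not reproduce the VARD algorithm; the variant list, the function names and the sample sentence are all invented for illustration, and a real study would compile and check such a list manually.

```python
import re

# Hypothetical variant list (an assumption for illustration only):
# each historical spelling is mapped to a modern standard form.
VARIANTS = {
    "vertue": "virtue",
    "bloud": "blood",
    "purg'd": "purged",
    "doe": "do",
}

# Words (with internal apostrophes) or runs of punctuation.
TOKEN = re.compile(r"[A-Za-z']+|[^A-Za-z'\s]+")


def normalise(text, variants=VARIANTS):
    """Return (original_token, normalised_token) pairs.

    Both layers are kept side by side, so that searches and counts can
    use the normalised form while quotations are taken from the original.
    """
    pairs = []
    for token in TOKEN.findall(text):
        # Tokens without an entry in the list are passed through unchanged.
        pairs.append((token, variants.get(token.lower(), token)))
    return pairs


def changed(pairs):
    """List the replacements that were made, for manual checking."""
    return [(orig, norm) for orig, norm in pairs if orig != norm]


# Invented sample sentence, not taken from any corpus.
sample = "The bloud is purg'd by the vertue of this herbe."
layers = normalise(sample)
print(" ".join(norm for _, norm in layers))   # normalised layer, for searching
print(" ".join(orig for orig, _ in layers))   # original layer, for quoting
print(changed(layers))                        # replacements to verify by hand
```

Unlisted forms such as herbe pass through unchanged in this sketch, which is one reason why manual checking remains necessary.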

4. Research questions

In the twenty years of historical corpus pragmatics, research questions have developed
from simple tasks to more elaborate and more diversified research agendas
(see Taavitsainen and Jucker 2015). At first, studies were descriptive and focused
mostly on the uses and meanings of single items, but soon topics expanded to larger
entities, like pragmatic markers (Brinton 1996). Studies on pragmatic functions
and motivations for language change are in accordance with the Anglo-American
approach to historical pragmatics.10 Research questions of this type are common
in historical corpus linguistics and ratios of newer and older variants indicate the
pace of change and help to pinpoint the time when the frequencies were reversed
and periods of accelerated change occurred.11 Lexical items provide the search
words for these studies, as well as the starting point for other kinds of corpus
linguistic searches in the early period. Most pragmatic research questions, however,
pose severe problems and do not lend themselves to numerical measurements and
statistical calculations for several reasons. The broader European view, also called
the perspective view, emphasises the societal context of language use in human
interaction and states that any language feature can gain pragmatic uses.12 Gradually
in the 21st century the repertoire widened to utterances and larger pragmatic units
like speech acts and responses to them. In these studies, methodologies for tracing
utterances and their perlocutionary effects were developed in various ways. The
first object of study was insults (Jucker and Taavitsainen 2000) and responses to
them proved important, because they showed whether the utterance was perceived
as an insult. Utterance pairs with the second part in kind were common in rituals.
In another study, attention was paid to metacomments as a means of detecting relevant
loci for speech acts in discourse. An ethnomethodological approach charts

9
It is important to keep the original and the normalised versions clearly apart, and this
has been taken into account in EMEMT by carefully designing the corpus so that the
layers cannot be confused.
10
Traugott defines the field as “usage-based approach to language change” (2004: 538).
11
The focus is on the history of pragmatic units in language and pragmatic explanations
for language change especially in the processes of grammaticalisation. The Invited
Inferencing Theory of Semantic Change (Traugott and Dasher 2005) charts predictable
paths for semantic change across time.
12
Jucker (2008: 895) enhances the societal context; see also the discussion on historical
(socio)pragmatics in Włodarczyk and Taavitsainen (2017).

how speech acts were talked about; relevant examples were detected by examining
loci where the speech acts were mentioned by their labels, even if the words were
not used performatively but descriptively (Taavitsainen and Jucker 2007).13 Other
expressives (greetings, compliments, apologies), requests and commissives (prom-
ises) followed in articles by various scholars in a collective volume on speech
act history (ed. by Jucker and Taavitsainen 2008). Valkonen (2008) focuses on
performative instances of promises with the appropriate speech act verbs in eight-
eenth-century prose, and Kohnen (2008) presents a report on directives throughout
the history of English. This line of enquiry proved a good source of inspiration
and is increasingly active.14
Preliminary results of historical corpus pragmatic studies already show a few
tendencies. In meaning change the general trend is from more concrete to more
abstract. Its two important stages are subjectification and intersubjectification; sub-
jectification being the expression of the self, and intersubjectification the speaker’s
awareness of the other participant (Traugott and Dasher 2005: 20). In studies on
more societal issues, emphasis is laid on discursive assessments with each addi-
tional utterance creating new context. The objects of study have also become more
abstract than before, towards ideological representations and politeness studies
(see below; Rayson 2015; Taavitsainen 2015).

5. Points of departure and retrieval tools

5.1. Top-down and bottom-up methodologies

Corpus linguistic studies on historical topics have mostly been conducted with
the “top-down” (deductive) method. The point of departure is a linguistic feature
identified as relevant for the research task in grammar books, thesauri and earlier
studies. The occurrences of these features are then verified in previously unex-
plored data. Modal verbs or adverbs of various kinds are typical research items,
and the questions may concern their increasing or decreasing ratios in tracing lan-
guage change by relevant processes (cf. above). The alternative is the “bottom-up”
(inductive) method, which is also called corpus-driven, and relies on what the material
yields. This method has gained ground in recent years, and it is also possible
to combine the two.
In a recent study on stance by Whitt (2016), the research aim was to find out
how writers encode their evidence for asserted propositions in EMEMT (1500–

13
This method gave an incentive to develop the “sliding window” method for retrieving
stance devices (see Landert forthcoming).
14
The most recent contribution is an article by Jucker (2017).

1700) and to trace the decline of medical scholasticism in more detail than had
been done in earlier pilot studies. In order to identify evidential markers in the
EMEMT corpus, Whitt employed a combination of top-down and bottom-up anal-
yses (see Pahta and Taavitsainen 2010: 563). A list of items identified in previous
research on the topic provided the search words for the top-down method. The
corpus categories that form the structure of the corpus provided an analytical grid
for the top-down corpus searches. The assessment was complemented by a bottom-up
survey through qualitative close reading to identify additional items that
carry the same or related meanings and to ensure that nothing went unnoticed. For
this bottom-up study, 2,000-word extracts in fifty-year slots from each corpus
category were read closely. The retrieved frequencies were normalised to a
rate per 10,000 words to make them comparable. The results revealed frequently
occurring constructions involving that clauses and as X says (cf. Gray, Biber and
Hiltunen 2011). The corpus was searched for them as well, and a high number of
relevant examples were found. In addition, morphologically related variants of
the identified items were searched for. An analysis of variance (ANOVA) test was
used to determine whether the differences in frequency among the periods were
statistically significant (p < .05). In addition, Levene’s test of homogeneity of
variance, Kruskal-Wallis one-way analysis of variance, the Shapiro-Wilk test of
normality, and the Mann-Whitney U-test were applied when pertinent (for further
information on these tests, see the bibliography of Whitt’s article). The hypothesis
was that a gradual decline in markers of mediated scholastic information would be
complemented by a significant increase in the use of markers of direct observation
and inference through markers relating to information mediated by non-scholastic
authorities and hearsay. The process was, however, more complicated. Scholastic
thought declined significantly during the two-hundred-year period under investi-
gation, but remained influential at least to some extent till the end. Old authorities
came under increasing scrutiny among a variety of choices in the medical mar-
ketplace, but this matter warrants further investigation, in longitudinal studies of
evidential markers in medical writing beyond the period in focus.
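The two quantitative steps described above, normalisation to a rate per 10,000 words and a one-way analysis of variance across periods, can be sketched as follows. The figures are invented and do not come from Whitt’s study; the function names are likewise assumptions. The sketch stops at the F statistic, as a statistics package would normally supply the corresponding p-value.

```python
def per_10k(raw_count, word_count):
    """Normalise a raw frequency to a rate per 10,000 words."""
    return raw_count / word_count * 10_000


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of numbers."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Invented data: (raw hits, word count) for three texts in each of three periods.
periods = {
    "1500-1549": [(30, 20_000), (22, 18_000), (41, 25_000)],
    "1550-1599": [(25, 21_000), (19, 19_000), (28, 24_000)],
    "1600-1649": [(12, 20_000), (15, 23_000), (9, 18_000)],
}

rates = {
    period: [per_10k(hits, words) for hits, words in texts]
    for period, texts in periods.items()
}
for period, values in rates.items():
    print(period, [round(v, 2) for v in values])
print("F =", round(anova_f(list(rates.values())), 2))
```

Normalisation has to precede the test, since the texts differ in length and raw counts would not be comparable.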

5.2. Corpus linguistic tools

Sophisticated modern corpus-linguistic tools have made new research paradigms
possible with a rich array of applications. The repertoire extends from frequency
analysis and wordlists to concordances and collocations, key-words, and n-grams
that are not restricted to word pairs only but find strings of co-occurring words of
varying lengths. They are among the most important tools in a corpus linguist’s
tool kit today. Wordlists provide a preliminary way of exploring the data and are
often useful for researchers to glean the vocabulary for variants and synonyms as
a preliminary orientation to their research tasks.
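As a minimal sketch of n-gram retrieval with strings of varying lengths, the following function counts all sequences of two to four words and keeps those that recur. The function names, the thresholds and the sample string are assumptions made for illustration; dedicated corpus tools offer far more options.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous sequences of n tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequent_ngrams(tokens, min_n=2, max_n=4, min_count=2):
    """Count n-grams of every length from min_n to max_n and keep
    those that occur at least min_count times."""
    counts = Counter()
    for n in range(min_n, max_n + 1):
        counts.update(ngrams(tokens, n))
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Invented sample string, lower-cased and split on whitespace.
tokens = "as galen saith and as galen saith also".split()
for gram, count in sorted(frequent_ngrams(tokens).items()):
    print(" ".join(gram), count)
```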

5.2.1. KWIC concordances

Keyword-In-Context (KWIC) concordances have surpassed numerical frequency
counts. They have gained wide currency as a ready and useful way of getting
acquainted with the material and demonstrating the narrow linguistic co-text. For
example, we can see at a glance whether a word, such as wish, is a noun or a verb,
as in
u.^) $] O fie woman, what a wish is that? if (^Abigail^) had D2HOSNAW
Fortunes, I repent em, And wish I could new ioynt and stren D2CWILKI15

A concordance also shows quickly whether the form under study occurs in a direct
speech quotation, an indirect quote or a narrative passage, and whether it occupies
an initial position in a sentence or is added as a tag. These are essential pieces of
information for pragmatic analyses. Negotiations of meaning become evident, e. g.
in the following lines, which show whether the word sorry is used to describe an
emotional state, performatively in the speech act of apology, or ambiguously between
the two:
thee; alass poor Man, I am sorry for him^). [$Miss then de D5WBLAND
were going to gag us. I am sorry Sir, [$ (said he to (^Call D5FDAVYS
im.^) $] Farewel, Sir. I am sorry I must leave you so soon, D5CMILLE
leman! Why, she says she is sorry she could not send them so D5CHOADL

Tools like AntConc (http://www.laurenceanthony.net/software) and WordSmith
(http://www.lexically.net/wordsmith) allow navigation between text and search
findings and their interpretations, which is needed for demonstration and guarantee
of the accuracy of the quotation. In general, technical and methodological inno-
vations have enabled researchers to carry out more complicated analytical tasks,
sometimes resulting in “radically different perspectives on language variation”
(Biber and Reppen 2015: 2). This statement is not specified more closely in the
source, but it can be taken to refer to the historical dimension, especially historical
corpus pragmatics as it is mentioned as one of the most versatile fields of corpus
linguistics in the same book (see section 1).
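How a KWIC display is built can be shown with a short sketch. It is not the implementation of AntConc or WordSmith; the function name and the width parameter are assumptions, and the sample string is pieced together from two of the concordance lines quoted above for illustration only.

```python
import re


def kwic(text, keyword, width=30):
    """Return concordance lines with the keyword aligned in one column.

    Each line consists of `width` characters of left co-text, the
    keyword as it appears in the text, and up to `width` characters
    of right co-text.
    """
    pattern = re.compile(r"\b%s\b" % re.escape(keyword), flags=re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left co-text so that the keywords line up.
        lines.append(f"{left:>{width}} {match.group(0)} {right:<{width}}")
    return lines


# Illustrative string assembled from the concordance lines cited above.
sample = ("alass poor Man, I am sorry for him. "
          "Farewel, Sir. I am sorry I must leave you so soon.")
for line in kwic(sample, "sorry", width=20):
    print(line)
```

Aligning the search word in a fixed column is what makes the immediate co-text readable at a glance; for quotation, the researcher would still return to the full text.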

5.2.2. Keyword analysis

Keyword analysis has already become one of the most frequently applied methods
of corpus linguistics and has been applied to historical studies e. g. at the inter-
face between historical sociolinguistics and pragmatics. The method is based on
significance tests to distinguish significantly more frequent or significantly less

15
The KWIC lines come from the CED (see the Appendix).

frequent words in the target corpus than in a reference corpus. The application
of the method is easy as the computer performs the calculations, but the corpus
user’s own role begins by designing the study in a competent way, and it continues
with the challenge of selecting the most appropriate reference corpus for the target
corpus so that the comparison is sensible (like with like, not apples and oranges).
After retrieving the evidence, the researcher’s task is to interpret the keyword list,
as the mere words do not tell much: they need to be grouped and their meanings need
to be discussed within their multilayered contexts (see below).
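The calculation that the computer performs can be sketched as follows. The text above does not name a particular significance test; the sketch assumes the log-likelihood statistic, which is one common choice for keyness, with 3.84 as the critical value for p < .05 at one degree of freedom. The function names and the token lists are invented for illustration.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score.

    a: frequency of the word in the target corpus
    b: frequency of the word in the reference corpus
    c: size of the target corpus in words
    d: size of the reference corpus in words
    """
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    score = 0.0
    if a:  # a zero frequency contributes nothing to the sum
        score += a * math.log(a / expected_target)
    if b:
        score += b * math.log(b / expected_reference)
    return 2 * score


def keywords(target_tokens, reference_tokens, threshold=3.84):
    """Return (word, direction, score) for words above the threshold.

    "+" marks words that are relatively more frequent in the target
    corpus, "-" words that are relatively less frequent.
    """
    target, reference = Counter(target_tokens), Counter(reference_tokens)
    c, d = len(target_tokens), len(reference_tokens)
    results = []
    for word in set(target) | set(reference):
        a, b = target[word], reference[word]
        score = log_likelihood(a, b, c, d)
        if score >= threshold:
            direction = "+" if a / c > b / d else "-"
            results.append((word, direction, round(score, 2)))
    return sorted(results, key=lambda item: -item[2])


# Invented token lists standing in for a target and a reference corpus.
target_corpus = ["urine"] * 20 + ["the"] * 980
reference_corpus = ["urine"] * 5 + ["the"] * 995
print(keywords(target_corpus, reference_corpus))
```

The choice of reference corpus determines the expected frequencies, which is why the comparison has to be like with like.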

5.2.3. Collocations
Collocations, i. e. the combinations of words that attract each other, have become
one of the main objects of study at the interface between corpus linguistics,
sociohistory, semantics and pragmatics. Collocations reveal word meanings and
through them we can have access to semantic shifts, as well as to the negative or
positive semantic prosody of a linguistic item. Consistent co-occurrence patterns
of a word (to a degree greater than chance) also permit an assessment of sociohis-
torical attitudes.16 Several ambitious research projects have been launched in this
field lately. Phrasal structures and ready-made chunks are of interest to cognitive
linguistics, and collocations can also be analysed to achieve insights into underly-
ing ideologies.
In a study combining historical sociolinguistics with pragmatic contextual
analyses of the examples to anchor them to their users, McEnery and Baker (forth-
coming) study seventeenth-century collocations as expressions of attitudes used
with reference to the criminalised poor. The first step was to identify words refer-
ring to the group in Early Modern English. Intuition is an unreliable source, and
therefore official records were consulted, including British History Online, and
the frequencies of relevant words in the corpus itself were taken as a guideline
for inclusion into the list of search items. The data of the study came from EEBO
that currently offers access to over 39,212 texts from the seventeenth century,
amounting to just under one billion words,17 and can be accessed via the Corpus
Query Processor (CQP).18 A technique was developed to trace meaning changes
by looking at collocates over time. The four words beggar, vagabond, vagrant
and rogue were repeatedly mentioned, not only in state legislation but in sessions
rolls, state papers and county records, and they were also used to describe the
criminalised poor. The frequencies were high enough to allow an analysis by the

16
See McEnery and Hardie (2012: chapter six).
17
The precise figure is 996,472,953 words, as available for the seventeenth century in
version 3 of the EEBO-TCP corpus as used in this paper.
18
See Hardie (2012).

decades throughout the century. The occurrences were normalised to frequencies
per one million words as a necessary step for comparisons of decades against one
another. The results showed three different kinds of collocates. First, consistent
collocates have a stable relationship with the node word and always occur together
with it. Second, collocates that a word acquires in the course of time are called
initiating collocates. The third kind was named terminating collocates, i. e.
collocates lost within the study period. Further
observations indicate that transient collocates cause a concept to develop, but are
discarded soon after the debates are over. From the beginning of the seventeenth
century onwards, beggar, vagabond and rogue all appear as strong collocates of
one another, and they also collocate with vagrant throughout the second half of the
century. At first sight the words may appear to be synonyms, but textual evidence
of context shows them to be near-synonyms instead, as subtle distinctions can be
discerned by exploring them in more detail. The method is versatile and can be
developed further for even more purely pragmatic purposes.
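The classification of collocates over time can be sketched in a simplified form. This is not the procedure of the study discussed above: the sketch assumes that the significant collocates of a node word have already been identified for each decade, and it merely labels each collocate by the periods in which it appears. The function name, the extra label "intermittent" and the collocate sets are invented for illustration.

```python
def classify_collocates(collocates_by_period):
    """Label each collocate according to the periods in which it occurs.

    collocates_by_period: list of sets of collocates, in chronological
    order, one set per period (e.g. per decade).
    """
    periods = collocates_by_period
    last_index = len(periods) - 1
    labels = {}
    for word in sorted(set().union(*periods)):
        present = [word in period for period in periods]
        first = present.index(True)
        last = last_index - present[::-1].index(True)
        if not all(present[first:last + 1]):
            label = "intermittent"   # gaps between first and last attestation
        elif first == 0 and last == last_index:
            label = "consistent"     # attested throughout
        elif first > 0 and last == last_index:
            label = "initiating"     # acquired during the study period
        elif first == 0:
            label = "terminating"    # lost during the study period
        else:
            label = "transient"      # appears and disappears again
        labels[word] = label
    return labels


# Invented collocate sets for one node word across four decades.
decades = [
    {"sturdy", "idle"},
    {"sturdy", "idle", "wandering"},
    {"sturdy", "wandering", "pamphlet"},
    {"sturdy", "wandering"},
]
for word, label in classify_collocates(decades).items():
    print(word, label)
```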

5.2.4. Methods of statistical assessment

Methodological issues also include various statistical tests used in evaluating the
significance of corpus findings. Mastering the discipline involves awareness of
the tools to operationalise the research questions with appropriate methodology,
which “corpus linguistics has only begun to develop” (Gries 2015: 50). A great
deal has happened in this area since the early days. Simple significance tests with
binary research settings, comparing A with B, were typical of the early phase, but
sophistication has increased in statistical measurements and researchers use tri-
angulation with several different methods to achieve more reliable results. A wel-
come property of the programs to many researchers is their readiness to perform
statistical calculations and significance tests for us. The exploitation of corpus
linguistic techniques has been made easy for the end-user, as it suffices to under-
stand the purpose and the mechanism and to plan the research in a proper way; the
more technical aspects are taken care of by the programs. In studies designed to
answer research questions of diachronic pragmatics, scholars have been looking
at the increase or decrease of linguistic features over time and their dispersions to
indicate how evenly or unevenly the items are distributed in the corpus. For exam-
ple, a recent corpus study on diachronic pragmatics19 focused on phraseological
variation of the adverb so followed by a delayed declarative content clause with
adjectival, nominal or adverbial heads in EMEMT (1500–1700) (Hiltunen 2012).

19
This branch of historical pragmatics investigates functional changes of linguistic pat-
terns over time from the perspective of diachronic pragmatics, i. e. specific linguistic
patterns are identified and their frequencies and co-occurrence patterns are studied as
evidence for semasiological change (see Traugott and Dasher 2005: 100).

The results showed that the pattern is used for indicating degree, extent or manner
and the trend was typical of learned texts, in descriptions rather than instruction.
Dispersion studies of linguistic patterns seem to be a rising topic in historical
corpus pragmatics, especially when combined with phraseological units in special
registers or genres.
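How evenly an item is distributed can be quantified in several ways. The sketch below assumes one of them, the deviation of proportions (DP), which compares the share of occurrences found in each corpus part with the share of the corpus that the part represents; the text above does not specify which measure the studies used. A value of 0 indicates a perfectly even distribution, and values approaching 1 indicate concentration in a small part of the corpus. The function name and the counts are invented.

```python
def deviation_of_proportions(counts, part_sizes):
    """Dispersion of an item across corpus parts (DP).

    counts: occurrences of the item in each corpus part
    part_sizes: size of each corpus part in words
    """
    total_hits = sum(counts)
    total_words = sum(part_sizes)
    if total_hits == 0:
        raise ValueError("the item does not occur in the corpus")
    return 0.5 * sum(
        abs(hits / total_hits - size / total_words)
        for hits, size in zip(counts, part_sizes)
    )


# Invented counts for one pattern in four corpus categories of equal size.
sizes = [1000, 1000, 1000, 1000]
print(deviation_of_proportions([10, 10, 10, 10], sizes))  # evenly spread
print(deviation_of_proportions([40, 0, 0, 0], sizes))     # concentrated in one part
```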

6. Contextualising language use

Pragmatics takes people into account: utterance context includes participant iden-
tities, their beliefs and intentions as well as their shared common ground, and
provides clues to the interpretations of meaning (Levinson 1983: 5). Therefore it
is of importance to ask the wh-questions who, what, where, when, and why (and
how, too). Situational constraints determine which of the available variants is
chosen (see above, variability and negotiability). The same expression can have
different meanings in different situations, and knowledge of the circumstances as
well as of speaker and addressee parameters is essential. Contextual mappings and
illustrative examples are necessary in historical corpus pragmatic assessments and
the overlap with sociolinguistics is considerable. But, as pointed out above, it is
important to master the background facts of texts even prior to the study itself in
order to formulate relevant research questions. Historical corpora are being refined
with metadata that give ready access to sociolinguistic textual coordinates. The
discourse locus of the feature is essential for the textual context, and therefore
whole or at least longer stretches of texts are needed for assessing unfolding dis-
course, as the context changes with each new addition, and there may be several
layers and several genres embedded in texts that lend their own colourings to the
interpretations.

6.1. Genre context

In contrast to the concrete contexts of lexical co-occurrence shown e. g. in KWIC
concordances, more abstract levels have become increasingly important for his-
torical corpus pragmatics. Genres can be defined as groupings of individual texts
that provide models for production and interpretation, guiding authors in their
writing processes as well as audiences in making sense of the text, e. g. genre labels
like “jests” guide the readers’ expectations. One reason why genres are taken into
account more prominently than before is the insight that various developments
take place at different rates within different genres (Taavitsainen 2016a).
Furthermore, different genres may lend different colourings to word and utter-
ance meanings. Thus genre contexts give essential clues for interpreting meanings.
Irony and sarcasm are good examples of context sensitivity. Eighteenth-century
prose abounds in ironical and satirical readings building on subtle techniques of

reversing the surface meanings by contextual clues, e. g. Jonathan Swift takes the
satirical vein to its extreme, and in the following century Oscar Wilde was a
master of adding ironical shades, turning compliments into insults and thanks into
shrewd or even nasty comments with sugar coating.

6.2. Cultural context

In search of explanations for language change, the larger and more abstract con-
texts play an important role. Cultural aspects, including the thought style and the
world view, with the position of man and his relation to the surrounding world, take
the centre stage here: stance expressions changed as medieval scholasticism rely-
ing on the logocentric mode of acquiring knowledge by studying ancient authori-
ties gave way to empiricism and observation as the mode of knowing in the early
modern period.20 It is, however, often difficult to find such correlations as past
periods are more or less foreign to modern researchers.21 The strangeness of past
cultures and their textual products has received attention in the earlier literature,
with reference to their “surprising otherness” and to literary texts that surpass “the
original communicative situation” and acquire universal meaning (Jauss 1979: 182). Another
(partly overlapping) quality of medieval texts is their “openness” (Bergner 1995),
i. e. their inherent vagueness of meaning.22 The quality of openness and vagueness
is exactly what the above-mentioned strangeness of past culture refers to and it is
not only a medieval feature but extends to more recent centuries. Shakespeare’s
plays exhibit the same fuzziness of sentence boundaries, and alternative readings
may be possible as medieval texts often escape clear-cut clause divisions (e. g.
Jucker and Taavitsainen 2013: 16). This feature presents one of the caveats that text
editions pose as they may define the “correct” readings by editorial decisions (for
a remedy, see below). Even later eighteenth-century irony and sarcasm remain
hidden or obscure without explanations. With the demand for contextualisation and
an understanding of the special quality of medieval and other past literatures, we
come close to philological scholarship that relies on textual interpretation and the
researchers’ profound knowledge of the language form and culture.

20
See the Scientific thought styles homepage for publications
https://www.helsinki.fi/en/researchgroups/varieng/scientific-thought-styles-the-evolution-of-english-medical-writing.
21
From this point of view historical studies are related to cross-cultural research.
22
This vagueness is largely due to the loss of contextual knowledge and is a common
feature in medieval texts. Old French was discussed by Fleischman (1990) and more
recently by Schrott and Völker (2005).

7. Present trends of historical corpora and new challenges

Historical corpora were described as “long and thin” in contrast to “small and
fat” or “small and tidy” in contrast to “big and messy” (Kohnen 2007, 2009),
but the scene has radically changed in recent years. Two largely contradictory
directions of corpus development have emerged.23 Small and often purpose-built
corpora are advocated as ideally suited for corpus pragmatics because of their con-
textual anchoring, especially as the analyses are often conducted by compilers of
the data with “unique insight into the context” (Clancy and O’Keeffe 2015: 244).
In historical pragmatic studies, philologically-oriented multi-layered corpora are
somewhat comparable to modern spoken or multimodal pragmatic corpora which
rely on rich contextualisations that enable micro-level assessments. The opposite
trend is towards megacorpora that defy contextualisation in the traditional way,
but open up new possibilities at the same time to historical corpus pragmatics with
methodological innovations. In the following, I shall deal with these opposing
trends and proceed according to a three-fold division into big data, rich data and
uncharted data.24

7.1. Big data

In recent years megacorpora of large electronic text collections have become avail-
able to linguists. The sheer amount of data may pose practical problems and it is
not possible to master the contents of texts in the same way as of smaller data sets.
Contextualisation becomes more difficult with increasing size. HC contained some
1.6 million words while the Corpus of Historical American English (COHA) has
400 million words of text extracts from the 1810s to the 2000s.25 Even larger histor-
ical datasets are available. The Ngram Viewer (http://books.google.com/ngrams/)
relies on a database consisting of 361 billion words from millions of books from

23
This discrepancy is what we called the “double binds” in our introduction to Diachronic
corpus pragmatics (Jucker and Taavitsainen 2014: 12).
24
This formulation echoes the three trends as the conference theme in “From data to evi-
dence” in Helsinki in October 2015. I shall proceed in this order, although annotation
has recently been pointed out as the “holy grail” of corpus pragmatics (Clancy and
O’Keeffe 2015: 251), perhaps as the pragmatic research applications to big corpora had
not been conducted at the time of writing.
25
Whole texts are not available for copyright reasons; pragmatic analysis often needs
more context. See Mark Davies’s homepage http://davies-linguistics.byu.edu/personal/
and the corpus page http://corpus.byu.edu/.
542 Irma Taavitsainen

1500 to 2000 scanned by the Google Books project (see Michel et al. 2010).26 It
has proved useful for overviews of developments.
As discussed above, researchers of historical corpus pragmatics have come
to accept the demand of reading the original texts with care. McEnery and Baker
(forthcoming), who focus on changes of collocations and constructions in a hun-
dred-years’ perspective (see 5.2.3), discuss the demand of close reading for con-
textual assessment in connection with the “big data” approaches to humanities.
They advocate rich interaction between the examples retrieved by corpus linguistic
techniques and large-scale characterisation of the data, cycling between close and
distant reading. In their view, corpus searches lend a bird’s eye view to the mate-
rial, but the retrieved examples need to be scrutinised by qualitative reading.
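
The cycle between distant and close reading can be illustrated with a keyword-in-context (KWIC) listing, which retrieves every hit of a search item together with a window of co-text for qualitative inspection. The following is a minimal sketch only: the function name kwic, the window width and the sample sentences are assumptions made for illustration and are not taken from any of the corpora or tools discussed here.

```python
import re

def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for every whole-word match of `keyword`."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        # Slice a fixed-width window of co-text on either side of the hit.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

# Invented sample text, not taken from any corpus.
sample = ("I pray you, sir, tell me the truth. "
          "Pray sit down, madam. They pray daily in the chapel.")

hits = kwic(sample, "pray", width=20)
for line in hits:
    print(line)
```

The listing gives the bird's eye view; deciding whether a given hit is a politeness marker or a lexical verb still requires reading each line in its context.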
The first big collections of digital historical texts were included in large liter-
ary databases in the Chadwyck-Healey collections (1996–2011). They provided
interesting data for historical pragmatic research tasks, especially for historical
speech act studies (see above), but there are obvious problems that hinder the
application of corpus linguistic methodology in full. These literary corpora can
provide data for lexical uses, but it is impossible to count the relative frequen-
cies of linguistic features or to apply more advanced statistical methods. Thus the
applicable methodology is necessarily a qualitative corpus-aided study. Arnovick’s
pioneer study (1999) traces the development of good-bye by the process of dis-
cursisation and presents exciting new evidence of the development by moving at
the interface between language and literature. Data retrieval for
studies on historical speech acts remains difficult (see above and Taavitsainen and Jucker
2008b). Literary databases are also being developed towards more user-friendly
corpus-linguistic applications. Corpus descriptors or descriptive meta-data, with
standardised text headers, are innovative devices created to aid researchers in finding
appropriate data for their studies in the genre-based and genre-balanced, and
even part-of-speech tagged version of the 34-million-word Corpus of Late Modern English
Texts 1710–1920 (CLMET 3.0; see Diller, de Smet and Tyrkkö 2010).
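
Relative frequencies presuppose a known corpus size, which is why they cannot be computed for collections whose word totals are unavailable. The sketch below shows the usual normalisation to occurrences per million words. The corpus sizes are those cited for HC and COHA earlier in this section; the raw hit counts and the function name per_million are invented for illustration.

```python
def per_million(raw_count, corpus_size):
    """Normalise a raw count to occurrences per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return raw_count / corpus_size * 1_000_000

# Corpus sizes as cited in the text; the hit counts are invented for illustration.
corpora = {
    "HC": {"size": 1_600_000, "hits": 80},
    "COHA": {"size": 400_000_000, "hits": 12_000},
}

normalised = {name: per_million(c["hits"], c["size"]) for name, c in corpora.items()}
for name, value in normalised.items():
    print(f"{name}: {value:.1f} per million words")
```

Only normalised figures of this kind allow corpora of very different sizes to be compared; a raw count on its own says little.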
Even more comprehensive data is provided by EEBO (v. 3, see above). Meth-
ods are being developed to collect, manage and interpret the data in their historic
contexts (see above). ECCO provides an equally large database, and both have
the advantage of bringing page images of almost all published books in English
to the researchers’ desks (subject to subscription). Other mega-data include online
dictionaries, compendia and other electronic collections that open up huge visions
as they allow access to almost all texts that have survived from a historical period.
From the time before book printing, we have the whole extant Anglo-Saxon literature
in electronic form as the Dictionary of Old English Corpus, and most of Middle
English literature is available as text in the Middle English Compendium, although
the most recently published editions are not included. The OED can also be used as a
corpus, though its examples are short, consisting of one clause or sentence in isolation
from their larger context.27 It provides an important addition to English electronic
resources and is particularly useful for historical pragmatic research tasks at
the initial stages for checking the search items listed e. g. in modern grammars to
avoid the pitfalls (see above).

26
The Ngram Viewer lists ngrams derived from these texts without any context. Pitfalls
with the process of automatic scanning (OCR) may give rise to erroneous forms (see
Jucker, Taavitsainen and Schneider 2012).
Historical corpus pragmatics 543

7.2. Rich data

The term “rich data” refers to annotated corpora that contain more than the text
itself for added value to the end-user.28 Ambitious schemes of rich annotations to
grasp the subtleties of language use have already been realised, to some extent,
in several corpora to facilitate data retrieval for pragmatic purposes. Researchers
have come up with innovative solutions: metadata about sociolinguistic
parameters and information about pragmatic units have been integrated into corpora
to allow direct access to relevant material. A section of the Corpus of English
Dialogues 1560–1760 with annotations is called the Sociopragmatic Corpus.29
Tagging includes sociolinguistic information such as age, status and gender of the
speaker and the addressee as well as of participant roles and relations. In contrast
to part-of-speech analysis, which can be performed automatically,30 pragmatic and
sociolinguistic annotation has to be performed by hand (see Archer 2005). A sim-
ilar tagging model has been applied to drama texts in the Drama Corpus of Early
Modern English, comprising 242,561 words from 1500 to 1760 (see Lutzky and
Demmen 2013). It has been used for some research tasks, e. g. for tracing the fre-
quency distribution of the different forms of pray with the variables of social status
and gender taken into account. The 50-million-word Old Bailey Corpus of court
trials documents spoken English from 1720 to 1913 and allows similar research
tasks to be performed thanks to its sociolinguistic tagging and mark-up of partici-
pant roles (see Huber 2007).
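
Once each token carries sociolinguistic metadata, frequency distributions of the kind mentioned for pray can be obtained by simple cross-classification. The following sketch assumes a hypothetical list of hand-annotated records; the field names (form, gender, status), their values and the function name distribution are invented and do not reproduce the tag set of the Sociopragmatic Corpus or of any other corpus named above.

```python
from collections import Counter

# Hypothetical annotated tokens: each record pairs a form of "pray" with speaker metadata.
# Field names and values are invented and do not reproduce any corpus's tag set.
tokens = [
    {"form": "pray", "gender": "f", "status": "gentry"},
    {"form": "prithee", "gender": "m", "status": "servant"},
    {"form": "pray", "gender": "m", "status": "gentry"},
    {"form": "I pray you", "gender": "f", "status": "servant"},
    {"form": "pray", "gender": "f", "status": "gentry"},
]

def distribution(records, *variables):
    """Count forms cross-classified by the given metadata variables."""
    return Counter(
        (rec["form"],) + tuple(rec[var] for var in variables) for rec in records
    )

by_gender = distribution(tokens, "gender")
by_gender_status = distribution(tokens, "gender", "status")
for key, count in sorted(by_gender_status.items()):
    print(key, count)
```

The counting itself is trivial; the labour lies in the manual annotation that makes each record available in the first place.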

27
Thus it was necessary to go beyond it to the original publications to verify, e. g. the genre
context and more subtle shades of meaning in the OED material provided by the His-
torical Thesaurus on the metaphorical uses of address terms (Taavitsainen 2016b).
28
HC was the pioneer in this respect, too, as it has sociolinguistic and generic information
attached to it as descriptive metadata.
29
The principles of pragmatic annotation are discussed by Culpeper and Archer in this
volume; see its bibliography for earlier work.
30
These taggers have been developed for historical corpora by the same team that devised
the automatic spelling normalization tool VARD.

We are also witnessing the rise of a novel philological trend of historical prag-
matics. Various types of annotation and mark-up have been added to manuscript
texts, and in an ideal case the manuscript page images are also given. Multimodal
assessments rely on textual lay-out features like line spacing, types of hands,
graphs and other visual aids. Contextual assessments range from the narrow lin-
guistic co-text to anchoring the text to the original audience and subsequent users
through possible marginal notes and signs of wear, all the way to the cultural con-
text. More recent innovations in corpus compilation and mark-up can be found,
for example, in An Electronic Text Edition of Depositions 1560–1760 (ETED, see
Kytö, Grund and Walker 2011). It is a novel type of corpus, based directly on man-
uscript sources. Physical features of the text have also been included. This branch
of study has inspired some projects as well (see Carroll et al. 2013). Manuscript
repositories are adopting more liberal policies with image reproductions on their
webpages, which encourages the study of manuscript pages in the digital age.

7.3. Uncharted data

A great amount of uncharted material still exists in manuscript repositories that
offer plenty of opportunities for original research. Sources that have not been
systematically charted may contain features that call for revisions in our present
notions e. g. of language history periodisation and styles of writing. As an example
of what uncharted data can yield, it is pertinent to refer to a scholastic commentary
that had remained unnoticed until fairly recently. It has brought to light additional
characteristics of highly learned vernacular scholastic argumentation (Taavitsainen
and Schneider forthcoming). Another kind of uncharted material can be found in
data that has been “edited” for other purposes than linguistic research and needs
to be revised for linguistic investigations, e. g. the Old Bailey corpus was first
used for historical research but the corpus for linguists has undergone meticulous
checking for accuracy.
Most corpora including medieval materials rely on edited texts for their sources.
Editorial practices are the target of a heated debate at present as it is the editors’
judgments that are given in these texts, e. g. syntactic structures are framed with
punctuation, but alternative readings are possible as well (Smith and Kay 2011;
Kytö and Peikola 2014). The renaissance of manuscript studies extends to editing
texts, and the field has become an important branch of digital humanities backed up
by the most modern digital techniques. Philologically-oriented corpora are being
created and remodelled with fresh principles as an answer to the demands of pre-
serving the quality of “openness” that makes multiple interpretations possible; in
an ideal case the editions would allow researchers to see the original manuscript
pages and judge for themselves, but unfortunately this is not always possible for
copyright reasons and high costs of image reproduction. In recent editions, simple
transcriptions without editorial intervention are included, but an edited version of
the text that gives an informed interpretation is also given (see Taavitsainen and
Fitzmaurice 2007; Honkapohja, Kaislaniemi and Marttila 2009; Marttila 2014).

8. Future trends: From present-day to historical

Diachrony includes the past, the present and the future. Language is an ever-chang-
ing entity, and corpora recording modern language use become historical in the
course of time. The pioneer corpora of the 1960s, Brown and London-Lund, have
changed from surveys of language use in the present to language use in the past,
and new time slices have been added with the same compilation principles to show
how language changes within 75 years (1931, 1961, 1991 and 2006). The British
National Corpus is becoming historical as well, and the artificial division into pres-
ent-day and historical corpora is being blurred even more than some decades ago.
Present-day large corpora of different genres and varieties (e. g. COCA, GloWbE),
and corpora created by web crawling (e. g. EnTenTen, UKWaC) will soon undergo the same
transition into historical corpora.31 The pace seems to be even faster with internet genres,
and newspaper materials are gathered into monitor corpora, where yesterday’s
papers present the most recent historical items of the past. These developments,
among others, have brought forth a new way of looking at language where dia-
chrony and synchrony are no longer separated, but seen as “essentially overlapping
processes, and one cannot be understood without the other” (Aitchison 2012: 19).
Researchers have come to realise that there is diachronic depth in present-day prac-
tices and language use needs context as it undergoes a dynamic process of variation
and change every moment. Trends pointing to the future can only be predicted if
placed into perspective with the present and the past.32

31
For details of modern corpora, see Andersen in this volume.
32
Creating a long-line perspective with the past, the present and the future is perhaps the
ultimate goal of historical linguistics and thus relevant for historical pragmatics, too
(see Lass 1997: chapter 1).

References

A Linguistic Atlas of Late Mediaeval English (LALME). M. Benskin, M. Laing, V. Karaiskos
and Keith Williamson. An Electronic Version of A Linguistic Atlas of Late
Mediaeval English. http://www.lel.ed.ac.uk/ihd/elalme/elalme.html
Aitchison, Jean
2012 Diachrony vs synchrony: The complementary evolution of two (ir)reconcilable
dimensions. In: Juan M. Hernández-Campoy and J. Camilo Conde-Silvestre
(eds.), The Handbook of Historical Sociolinguistics, 11–21. Oxford:
Wiley-Blackwell.
Archer, Dawn
2005 Questions and Answers in the English Courtroom (1640–1760). Amsterdam/
Philadelphia: John Benjamins.
Arnovick, Leslie K.
1999 Diachronic Pragmatics. Seven Case Studies in English Illocutionary Develop-
ment. Amsterdam/Philadelphia: John Benjamins.
Bergner, Heinz
1995 The openness of medieval texts. In: Andreas H. Jucker (ed.), Historical Prag-
matics. Pragmatic Developments in the History of English, 37–54. Amster-
dam/Philadelphia: John Benjamins.
Biber, Douglas and Randi Reppen (eds.)
2015 The Cambridge Handbook of English Corpus Linguistics. Cambridge: Cam-
bridge University Press.
Brinton, Laurel J.
1996 Pragmatic Markers in English. Grammaticalization and Discourse Functions.
Berlin/New York: Mouton de Gruyter.
Carroll, Ruth, Matti Peikola, Hanna Salmi, Mari-Liisa Varila, Janne Skaffari and Risto
Hiltunen
2013 Pragmatics on the page. European Journal of English Studies 17(1): 54–71.
Clancy, Brian and Anne O’Keeffe
2015 Pragmatics. In: Douglas Biber and Randi Reppen (eds.), The Cambridge
Handbook of English Corpus Linguistics, 235–251. Cambridge: Cambridge
University Press.
Culpeper, Jonathan and Dawn Archer
2008 Requests and directness in Early Modern English trial proceedings and play
texts, 1640–1760. In: Andreas H. Jucker and Irma Taavitsainen (eds.), Speech
Acts in the History of English, 45–84. Amsterdam/Philadelphia: John Benja-
mins.
Culpeper, Jonathan and Merja Kytö
2010 Early Modern English Dialogues. Spoken Interaction as Writing. Cambridge:
Cambridge University Press.
Diller, Hans-Jürgen, Hendrik de Smet and Jukka Tyrkkö
2010 A European database of descriptors of English electronic texts. The European
English Messenger 19(2): 29–35.
Fitzmaurice, Susan M. and Irma Taavitsainen (eds.)
2007 Methods in Historical Pragmatics. Berlin/New York: Mouton de Gruyter.
Fleischman, Suzanne
1990 Philology, linguistics, and the discourse of the medieval text. Speculum 65:
19–37.
Gray, Bethany, Douglas Biber and Turo Hiltunen
2011 The expression of stance in early (1665–1712) publications of the Philosoph-
ical Transactions and other contemporary medical prose: Innovations in a
pioneering discourse. In: Irma Taavitsainen and Päivi Pahta (eds.), Medical
Writing in Early Modern English, 221–257. Cambridge: Cambridge Univer-
sity Press.
Gries, Stefan Th.
2015 Quantitative designs and statistical techniques. In: Douglas Biber and Randi
Reppen (eds.), The Cambridge Handbook of English Corpus Linguistics,
50–71. Cambridge: Cambridge University Press.
Hardie, Andrew
2012 CQPweb – combining power, flexibility and usability in a corpus analysis tool.
International Journal of Corpus Linguistics 17(3): 380–409.
Hiltunen, Turo
2012 So ADJ/ADV that clause patterns in Early Modern English medical writing.
Journal of Historical Pragmatics 13(2): 313–335.
Hiltunen, Turo and Jukka Tyrkkö
Forthcoming Manual. In: Irma Taavitsainen and Turo Hiltunen (eds.), Late Modern
English Medical Texts: Corpus Description and Studies. Amsterdam: John
Benjamins.
Honkapohja, Alpo, Samuli Kaislaniemi and Ville Marttila
2009 Digital Editions for Corpus Linguistics: Representing manuscript reality in
electronic corpora. In: Andreas H. Jucker, Daniel Schreier and Marianne Hundt
(eds.), Corpora: Pragmatics and Discourse. Papers from the 29th International
Conference on English Language Research on Computerized Corpora (ICAME
29). Ascona, Switzerland, 14–18 May 2008, 451–475. Amsterdam: Rodopi.
Huber, Magnus
2007 The Old Bailey Proceedings, 1674–1834: Evaluating and annotating a corpus
of 18th- and 19th-century spoken English. In: Anneli Meurman-Solin and Arja
Nurmi (eds.), Annotating Variation and Change. Studies in Variation, Con-
tacts and Change in English, Volume 1. http://www.helsinki.fi/varieng/series/
volumes/01/index.html. Research Unit for Variation, Contacts and Change in
English (VARIENG), University of Helsinki.
Jauss, Hans Robert
1979 The alterity and modernity of medieval literature. New Literary History 10:
181–229.
Jucker, Andreas H. (ed.)
1995 Historical Pragmatics. Pragmatic Developments in the History of English.
Amsterdam/Philadelphia: John Benjamins.
Jucker, Andreas H.
2008 Historical pragmatics. Language and Linguistics Compass 2(5): 894–906.
Jucker, Andreas H.
2017 Speech acts and speech act sequences: Greetings and farewells in the history
of American English. In: Merja Kytö, Jeremy J. Smith and Irma Taavitsainen
(eds.), Interfacing Individuality and Collaboration in English Language
Research World. Studia Neophilologica, Special Issue, 39–58.
Jucker, Andreas H., Irma Taavitsainen and Gerold Schneider
2012 Semantic corpus trawling: Expressions of “courtesy” and “politeness” in the
Helsinki Corpus. In: Carla Suhr and Irma Taavitsainen (eds.), Developing Cor-
pus Methodology for Historical Pragmatics. (Studies in Variation, Contacts
and Change in English 11). Helsinki: Research Unit for Variation, Contacts
and Change in English. Available online at <http://www.helsinki.fi/varieng/
series/volumes/11/jucker_taavitsainen_schneider/>
Jucker, Andreas H., Daniel Schreier and Marianne Hundt (eds.)
2009 Corpora: Pragmatics and Discourse. Papers from the 29th International Con-
ference on English Language Research on Computerized Corpora (ICAME
29). Ascona, Switzerland, 14–18 May 2008. Amsterdam: Rodopi.
Jucker, Andreas H. and Irma Taavitsainen
2000 Diachronic speech act analysis: Insults from flyting to flaming. Journal of
Historical Pragmatics 1(1): 67–95.
Jucker, Andreas H. and Irma Taavitsainen (eds.)
2008 Speech Acts in the History of English. Amsterdam/Philadelphia: John Benja-
mins.
Jucker, Andreas H. and Irma Taavitsainen (eds.)
2010 Historical Pragmatics. Berlin/New York: Mouton de Gruyter.
Jucker, Andreas H. and Irma Taavitsainen
2012 Pragmatic variables. In: Juan M. Hernández-Campoy and J. Camilo Conde-Sil-
vestre (eds.), The Handbook of Historical Sociolinguistics, 303–317. Oxford:
Wiley-Blackwell.
Jucker, Andreas H. and Irma Taavitsainen
2013 English Historical Pragmatics. (Edinburgh Textbooks on the English Lan-
guage). Edinburgh: Edinburgh University Press.
Jucker, Andreas H. and Irma Taavitsainen
2014 Diachronic corpus pragmatics: Intersections and interactions. In: Irma Taavit-
sainen, Andreas H. Jucker and Jukka Tuominen (eds.), Diachronic Corpus
Pragmatics, 3–26. Amsterdam: John Benjamins.
Kay, Christian and Kathryn Allen
2015 English Historical Semantics. (Edinburgh Textbooks on the English Lan-
guage). Edinburgh: Edinburgh University Press.
Kohnen, Thomas
2007 From Helsinki through the centuries: The design and development of English
diachronic corpora http://www.helsinki.fi/varieng/journal/volumes/02/koh-
nen/. In: Päivi Pahta, Irma Taavitsainen, Terttu Nevalainen and Jukka Tyrkkö
(eds.), Towards Multimedia in Corpus Studies, Studies in Variation, Contacts
and Change in English, Volume 2. Research Unit for Variation, Contacts and
Change in English (VARIENG), University of Helsinki.
Kohnen, Thomas
2008 Tracing directives through text and time: Towards a methodology of a cor-
pus-based diachronic speech-act analysis. In: Andreas H. Jucker and Irma
Taavitsainen (eds.), Speech Acts in the History of English, 295–310. Amster-
dam/Philadelphia: John Benjamins.
Kohnen, Thomas
2009 Historical corpus pragmatics: Focus on speech acts and texts. In: Andreas H.
Jucker, Daniel Schreier and Marianne Hundt (eds.), Corpora: Pragmatics and
Discourse. Papers from the 29th International Conference on English Lan-
guage Research on Computerized Corpora (ICAME 29). Ascona, Switzerland,
14–18 May 2008, 13–36. Amsterdam: Rodopi.
Kytö, Merja, Peter J. Grund and Terry Walker
2011 Testifying to Language and Life in Early Modern England: An Electronic Text
Edition of Depositions 1560–1760 (ETED). Amsterdam/Philadelphia: John
Benjamins.
Kytö, Merja and Matti Peikola
2014 Philology on the move: Manuscript studies at the dawn of the 21st century.
Studia Neophilologica 86: 1–8.
Landert, Daniela
Forthcoming Evidentiality in Early Modern English. In: Carla Suhr, Terttu Nevalainen
and Irma Taavitsainen (eds.), From Data to Evidence in English Language
Research. Leiden: Brill.
Lass, Roger
1997 Historical Linguistics and Language Change. (Cambridge Studies in Linguis-
tics). Cambridge: Cambridge University Press.
Levinson, Stephen C.
1983 Pragmatics. Cambridge: Cambridge University Press.
Lutzky, Ursula and Jane Demmen
2013 Pray in Early Modern English drama. Journal of Historical Pragmatics 14(2):
263–284.
Marttila, Ville
2014 Creating digital editions for Corpus Linguistics: The case of Potage Dyvers, a
family of six Middle English recipe collections. PhD Dissertation. University
of Helsinki.
McEnery, Tony and Helen Baker
Forthcoming Language surrounding poverty in early modern England: A corpus-based
investigation of how people living in the seventeenth-century perceived the
criminalised poor. In: Carla Suhr, Terttu Nevalainen and Irma Taavitsainen
(eds.), From Data to Evidence in English Language Research. Leiden: Brill.
McEnery, Tony and Andrew Hardie
2012 Corpus Linguistics. Cambridge: Cambridge University Press.
Meurman-Solin, Anneli and Jukka Tyrkkö
2013 Introduction. In: Anneli Meurman-Solin and Jukka Tyrkkö (eds.), Principles
and Practices for the Digital Editing and Annotation of Diachronic Data http://
www.helsinki.fi/varieng/journal/volumes/14/introduction.html. Research Unit
for Variation, Contacts and Change in English (VARIENG), University of Hel-
sinki.
Michel, Jean-Baptiste, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K. Gray,
The Google Books Team, Joseph P. Pickett, Dale Hoiberg, Dan Clancy, Peter
Norvig, Jon Orwant, Steven Pinker, Martin A. Nowak and Erez Lieberman Aiden
2010 Quantitative analysis of culture using millions of digitized books. Science
http://www.sciencemag.org/content/early/2010/12/15/science.1199644
Milroy, James
1992 Linguistic Variation and Change: On the Historical Sociolinguistics of Eng-
lish. Oxford: Blackwell.
Pahta, Päivi and Irma Taavitsainen
2010 Scientific discourse. In: Andreas H. Jucker and Irma Taavitsainen (eds.), Historical
Pragmatics, 549–586. Berlin: Mouton de Gruyter.
Rayson, Paul
2015 Computational tools and methods for corpus compilation. In: Douglas Biber
and Randi Reppen (eds.), The Cambridge Handbook of English Corpus Lin-
guistics, 32–49. Cambridge: Cambridge University Press.
Rissanen, Matti
1989 Three problems connected with the use of diachronic corpora. ICAME Journal
13: 16–22.
Rühlemann, Christoph and Karin Aijmer
2015 Corpus pragmatics: Laying the foundation. In: Karin Aijmer and Christoph
Rühlemann (eds.), Corpus Pragmatics, 1–26. Cambridge: Cambridge University
Press.
Schrott, Angela and Harald Völker
2005 Historische Pragmatik und historische Varietätenlinguistik: Traditionen, Methoden
und Modelle in der Romanistik. In: Angela Schrott and Harald Völker
(eds.), Historische Pragmatik und historische Varietätenlinguistik in den
romanischen Sprachen, 1–22. Göttingen: Universitätsverlag.
Smith, Jeremy and Christian Kay
2011 The pragmatics of punctuation in Older Scots. In: Päivi Pahta and Andreas
H. Jucker (eds.), Communicating Early English Manuscripts, 212–225. Cam-
bridge: Cambridge University Press.
Suhr, Carla, Terttu Nevalainen and Irma Taavitsainen (eds.)
Forthcoming From Data to Evidence in English Language Research. Leiden: Brill.
Taavitsainen, Irma
2015 Historical pragmatics. In: Douglas Biber and Randi Reppen (eds.), The Cam-
bridge Handbook of English Corpus Linguistics, 252–268. Cambridge: Cam-
bridge University Press.
Taavitsainen, Irma
2016a Genre dynamics in the history of English. In: Merja Kytö and Päivi Pahta
(eds.), Cambridge Handbook of Historical Linguistics, 271–285. Cambridge:
Cambridge University Press.
Taavitsainen, Irma
2016b The case of address terms. In: Wendy Anderson, Ellen Bramwell, and Carole
Hough (eds.), Mapping English Metaphor Through Time, 260–280. Oxford:
Oxford University Press.
Taavitsainen, Irma and Susan M. Fitzmaurice
2007 Historical pragmatics: What it is and how to do it. In: Susan M. Fitzmaurice
and Irma Taavitsainen (eds.), Methods in Historical Pragmatics, 11–36. Ber-
lin/New York: Mouton de Gruyter.
Taavitsainen, Irma and Andreas H. Jucker
2007 Speech act verbs and speech acts in the history of English. In: Susan M. Fitz-
maurice and Irma Taavitsainen (eds.), Methods in Historical Pragmatics, 107–
138. Berlin/New York: Mouton de Gruyter.
Taavitsainen, Irma and Andreas H. Jucker
2008a “Methinks you seem more beautiful than ever”: Compliments and gender in
the history of English. In: Andreas H. Jucker and Irma Taavitsainen (eds.),
Speech Acts in the History of English, 195–228. Amsterdam/Philadelphia:
John Benjamins.
Taavitsainen, Irma and Andreas H. Jucker
2008b Speech acts now and then: Towards a pragmatic history of English. In: Andreas
H. Jucker and Irma Taavitsainen (eds.), Speech Acts in the History of English,
1–23. Amsterdam/Philadelphia: John Benjamins.
Taavitsainen, Irma and Andreas H. Jucker
2010 Trends and developments in historical pragmatics. In: Andreas H. Jucker and
Irma Taavitsainen (eds.), Historical Pragmatics, 3–30. Berlin/New York:
Mouton de Gruyter.
Taavitsainen, Irma and Andreas H. Jucker
2015 Twenty years of historical pragmatics: Origins, developments and changing
thought styles. Journal of Historical Pragmatics 16(1): 1–24.
Taavitsainen, Irma, Andreas H. Jucker and Jukka Tuominen (eds.)
2014 Diachronic Corpus Pragmatics. Amsterdam: John Benjamins.
Taavitsainen, Irma and Gerold Schneider
Forthcoming Scholastic argumentation in early English medical writing and its afterlife:
New corpus evidence. In: Carla Suhr, Terttu Nevalainen and Irma Taavitsainen
(eds.), From Data to Evidence in English Language Research. Leiden: Brill.
Traugott, Elizabeth Closs
2004 Historical pragmatics. In: Laurence R. Horn and Gregory Ward (eds.), The
Handbook of Pragmatics, 538–561. Oxford: Blackwell.
Traugott, Elizabeth Closs and Richard B. Dasher
2005 Regularity in Semantic Change. Cambridge: Cambridge University Press.
Valkonen, Petteri
2008 Showing a little promise: Identifying and retrieving explicit illocutionary acts
from a corpus of written prose. In: Andreas H. Jucker and Irma Taavitsainen
(eds.), Speech Acts in the History of English, 247–272. Amsterdam/Philadel-
phia: John Benjamins.
Verschueren, Jef
1999 Understanding Pragmatics. London: Arnold.
Whitt, Richard Jason
2016 Evidentiality in Early Modern English medical treatises (1500–1700). Journal
of Historical Sociolinguistics 2(2): 235–263.
Włodarczyk, Matylda and Irma Taavitsainen
2017 Introduction: Historical (socio)pragmatics at present. Journal of Historical
Pragmatics 18(2): 159–174.

Appendix: Historical corpora available for research

For a more complete list of historical corpora, see CoRD,
http://www.helsinki.fi/varieng/CoRD/corpora/index.html.

General:
A Representative Corpus of Historical English Registers (ARCHER) Compiled under
the supervision of Douglas Biber and Edward Finegan at Northern Arizona
University, University of Southern California, University of Freiburg, Uni-
versity of Heidelberg, University of Helsinki, Uppsala University, University
of Michigan, University of Manchester, Lancaster University, University of
Bamberg and University of Zurich; see www.llc.manchester.ac.uk/research/
projects/archer/.
Corpus of Late Modern English Texts (CLMET) Compiled by Hendrik De Smet, Hans-Jür-
gen Diller and Jukka Tyrkkö; see https://perswww.kuleuven.be/~u0044428/.
The Corpus of Historical American English (COHA) see http://corpus.byu.edu/coha/.

Eighteenth Century Collections Online (ECCO) see http://www.gale.cengage.com/Digital-
Collections/products/ecco/about.htm.
Early English Books Online (EEBO) see http://eebo.chadwyck.com/home.
Helsinki Corpus = The Helsinki Corpus of English Texts
1991 Compiled by Matti Rissanen (project leader), Merja Kytö (project secretary);
Leena Kahlas-Tarkka, Matti Kilpiö (Old English); Saara Nevanlinna, Irma
Taavitsainen (Middle English); Terttu Nevalainen, Helena Raumolin-Brun-
berg (Early Modern English); see http://www.helsinki.fi/varieng/CoRD/cor-
pora/HelsinkiCorpus/index.html.

Correspondence:
Corpus of Early English Correspondence (CEEC) and its extensions
1998 Compiled by Terttu Nevalainen, Helena Raumolin-Brunberg, Jukka Keränen,
Minna Nevala, Arja Nurmi and Minna Palander-Collin (Department of Eng-
lish, University of Helsinki); see http://www.helsinki.fi/varieng/CoRD/cor-
pora/CEEC/index.html.
Corpus of Scottish Correspondence (CSC) Compiled by Anneli Meurman-Solin (Uni-
versity of Helsinki); see http://www.helsinki.fi/varieng/CoRD/corpora/CSC/
index.html.

Science:
Corpus of Early English Medical Writing (CEEM) see www.helsinki.fi/varieng/CoRD/
corpora/CEEM/index.html.
Middle English Medical Texts 1375–1500 (MEMT)
2005
Early Modern English Medical Texts 1500–1700 (EMEMT)
2010 see Taavitsainen and Pahta (eds.)
Late Modern English Medical Texts 1700–1800 (LMEMT).33 See Taavitsainen and Hiltunen
(eds.) forthcoming.
A Corpus of English Texts on Astronomy (CETA)
2012 See Moscovich, Isabel and Begoña Crespo (eds.) Amsterdam: John Benja-
mins.
The Málaga Corpus of Late Middle English Scientific Prose Javier Calle-Martín and
Antonio Miranda-García; see http://hunter.uma.es/.

Dialogues:
Corpus of English Dialogues (CED) see www.engelska.uu.se/Research/English_Lan-
guage/Research_Areas/Electronic_Resource_Projects/A_Corpus_of_Eng-
lish_Dialogues/.

33
More information about LMEMT in the ICAME Journal 38, March 2014, 137–153
http://www.degruyter.com/view/j/icame.2014.38.issue-1/icame-2014-0007/icame-
2014-0007.xml

Literature:
Chadwyck-Healey Literature Collections, Online (LION) see http://lion.chadwyck.com./.

Court room and legal settings:

An Electronic Text Edition of Depositions 1560–1760 (ETED)
2011 Kytö, Grund and Walker (eds.)
The Old Bailey corpus see http://www.uni-giessen.de/oldbaileycorpus/.

Newspapers and pamphlets:

The Lampeter Corpus of Early Modern English Tracts (LC) see http://www.helsinki.fi/
varieng/CoRD/corpora/LC/index.html.
The Zurich English Newspaper Corpus (ZEN), see http://www.helsinki.fi/varieng/CoRD/
corpora/ZEN/index.html.

Dictionaries and thesauri:

Dictionary of Old English Corpus in Electronic Form (DOEC)
2004 Compiled by Antonette diPaolo Healey, Dorothy Haines, Joan Holland, David
McDougall, Ian McDougall and Xin Xiang (University of Toronto); see http://
www.doe.utoronto.ca/pub/corpus.html; for earlier versions, see http://www.
doe.utoronto.ca/pub/pub.html; see, also, http://www.doe.utoronto.ca/.
The Historical Thesaurus of English, version 4.2.
2016 Kay, Christian, Jane Roberts, Michael Samuels, Irené Wotherspoon, and Marc
Alexander (eds.) Glasgow: University of Glasgow; see http://historicalthesau-
rus.arts.gla.ac.uk/.
Middle English Compendium see http://quod.lib.umich.edu/m/mec/.
Oxford English Dictionary Online (OED Online) see http://www.oed.com/.
Oxford Dictionary of National Biography see www.oxforddnb.com.
22. Corpus pragmatics: From form to function
Karin Aijmer

Abstract: Corpus-pragmatic studies, in general, are form-based and they start with
mapping words or constructions onto a range of functions. Examples of functional
categories which need to be described in this way are discourse markers, interjec-
tions, address terms and hesitation markers. The availability of spoken corpora
has now made it possible to study how structural and prosodic properties correlate
with function and with the speech situation. The form-to-function relationship has
been addressed in different theoretical frameworks both synchronically and dia-
chronically.

1. Introduction

Pragmatics is concerned with how people use linguistic resources to address “prob-
lems of speaking, hearing and understanding” (Dingemanse, Blythe and Dirks-
meyer 2014: 5). Assuming that there are lexical words or constructions which are
oriented to these problems as their starting-point, we can ask questions such as:
what are the communicative functions associated with a linguistic expression and
how has the form or structure been adjusted to better fit the interactive functions
the expression is intended to perform? Such questions are inspired by the availabil-
ity of real, interactional data. According to Schegloff et al. (1996: 11), “real-time
data have inspired a radical shift in the kind of question being asked. Scholars
interested in the relation between form (grammar) and function are beginning to
examine the probability that categories of grammatical description need to be made
responsible to the categories appropriated to describing communicative inter-
action”.
We now have access to a number of spoken corpora which can be used to study
pragmatic phenomena on the basis of naturally spoken interaction (see for example
Rühlemann and Aijmer 2015: 4). Corpus-based pragmatic studies are generally
form-based, and they start by mapping words or constructions onto a range of
functions. Examples of functional categories which need to be described in this
way are discourse markers, interjections, vocatives, hesitation markers (er, erm),
address forms, and expletives. An advantage of the corpus-based approach is that
the forms can be studied with great precision with regard to frequency distribution,
position, prosody, collocation and function. On the other hand, the method may
perform badly “in terms of recall” (identifying all the examples of a particular
function) (Rühlemann and Aijmer 2015: 10). In the following example (simplified

https://doi.org/10.1515/9783110424928-022
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 555–585. Berlin/
Boston: De Gruyter Mouton.

from Dingemanse, Blythe and Dirksmeyer 2014: 6) who? has the function of ini-
tiating a repair of what was said in the preceding turn:
(1) A: oh Sibbie’s sister had a baby boy
B: who?
A: Sibbie’s sister
B: oh really

Speaker A produces a turn at talk that B treats as problematic by initiating repair in
the next turn. The method has its advantages and disadvantages. While the form-
to-function approach excludes by definition the ability to search for alternative
ways in which the repair function can be realised (such as huh?; what?; you mean
+ noun), it has the advantage of providing a rich description of the function(s) of a
particular form in many different situations and activities, and in different syntactic
positions.
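
The recall limitation can be made concrete with a small sketch. It assumes a hypothetical, hand-annotated list of turns in which each turn has been labelled for whether it initiates repair; apart from the turns taken from example (1), the turns and all the labels are invented for illustration, and the function names are not taken from any corpus tool. A search for the single form who? retrieves only part of the turns that perform the repair function:

```python
# Toy data (hypothetical): (turn text, annotated as repair initiation?)
turns = [
    ("oh Sibbie's sister had a baby boy", False),
    ("who?", True),
    ("Sibbie's sister", False),
    ("huh?", True),
    ("what?", True),
    ("you mean Sibbie?", True),
    ("oh really", False),
]

def form_search(turns, form):
    """Return the turns whose text is exactly the searched form."""
    return [(text, label) for text, label in turns if text.lower() == form]

def precision(hits):
    """Share of retrieved turns that really are repair initiations."""
    if not hits:
        return 0.0
    return sum(1 for _, label in hits if label) / len(hits)

def recall(hits, turns):
    """Share of all annotated repair initiations that the search retrieved."""
    relevant = sum(1 for _, label in turns if label)
    found = sum(1 for _, label in hits if label)
    return found / relevant if relevant else 0.0

hits = form_search(turns, "who?")
print(f"precision={precision(hits):.2f}, recall={recall(hits, turns):.2f}")
# prints "precision=1.00, recall=0.25"
```

On this toy sample every hit is relevant, but three of the four repair initiations are missed, which is the trade-off described above.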
My contribution to this volume focuses on corpus pragmatics, taking as its example
corpus-based form-to-function approaches, primarily to discourse markers and
secondarily to interjections, hesitation markers and address forms.
I will consider the type of research question the approach merits, and I will dis-
cuss, by means of examples, the strengths and weaknesses of the method. In this
approach, the overriding research question is how we can account for all the dif-
ferent interpretations that a specific linguistic form can receive in the speech sit-
uation, and to do this, we have to take a closer look at function and the context in
which the discourse marker occurs in the corpus data. From a different angle, we
also have to deal with the contribution of position, prosody, non-verbal features
such as gesture, and turn-taking phenomena to the construction and interpretation
of meaning.
The rest of the article is structured as follows. In section 2, I discuss the
form-to-function method in more detail, relating it to corpus-based analysis and,
in particular, its application to discourse markers. Section 3 argues and demonstrates
that formal properties provide an entry into understanding the function of prag-
matically interpreted elements. Section 4 discusses what we mean by function
with regard to elements which are formally “inserts”. Section 5 is concerned with
the interaction between multifunctionality and context, and section 6 deals with
the theoretical models which have been proposed to describe the complex relation
between form and function. Section 7 discusses advantages (and shortcomings) of
the corpus-pragmatic approach in relation to interjections, hesitation markers and
address forms. Section 8 concludes with a summary and a discussion of the pros-
pects of the corpus-pragmatic method for future research.

2. A corpus-pragmatic approach going from form to function

We have recently witnessed dramatic developments in the creation of new corpora
and the use of corpus-linguistic methods. These developments have coincided with
an interest in studying elements which have pragmatic or discourse functions in
spoken language.
Many spoken phenomena such as discourse markers, interjections, vocatives and
hesitation markers have defied grammatical analysis because of their formal
and functional properties. They are, for example, not integrated syntactically into
the clause to which they are attached, they do not have a special form and they
get their functions from the context where they are found. (See Biber et al. 1999:
1083–1095 for a more detailed classification of “out-of-the-clause elements”.) On
the other hand, they provide a challenge for linguists using corpora and corpus-lin-
guistic methods. Corpora make it possible to describe the lexical elements on the
basis of authentic examples and to describe their distribution and function in dif-
ferent contexts. However, in order to tackle the challenges, we need to define the
elements which require a corpus-pragmatic method and establish some boundaries
between the different types.
Discourse markers, for example, are notoriously difficult to define and to
delimit from other spoken elements. Moreover, in spite of the recent boost in stud-
ies of discourse markers there is no agreement about terminology and what to
include as discourse markers. All of the following terms (partly based on Clark
2004: 376), and many more, have been used to describe them: discourse mark-
ers (Schiffrin 1987; Östman 1981; Lenk 1998; Jucker and Ziv 1998), discourse
particles (Schourup 1985; Aijmer 2002), pragmatic expressions (Erman 1987),
disjunct markers (Jefferson 1978), discourse operators (Polanyi 1985; Redeker
1986, 1990), clue words (Reichman 1978), cue words (Grosz and Sidner 1986),
cue phrases (Hirschberg and Litman 1987), discourse signals (Stenström 1989) and
functional markers (Ghezzi and Molinelli 2014).
Another problem is what elements belong here. Heine (2013: 1208) states that
the following items “are amongst the ones most commonly discussed and least con-
troversial items classified as English DMs [discourse markers]”: after all, anyway,
as it were, besides, however, indeed, in fact, instead, I mean, now, okay, so, then, I
think, well, what else, you know, you see. Discourse markers, however, are a fuzzy
category and it is difficult to draw boundaries to other groups of spoken phenomena.
The use of spoken corpora has added to our knowledge of discourse elements
and contributed to the development of a theoretical framework where they can be
analysed. The area of research is broadening and steps are also being taken to use
corpora and corpus-pragmatic practices to make a more systematic analysis of
elements such as address forms (vocatives), interjections and hesitation forms in
order to address deeper questions about the relationship between form, function
and the communicative interaction. Corpus-linguistic methods, for example, are
newcomers in the field of hesitation markers and they show convincingly that it is
not enough to analyse spoken phenomena on the basis of form or function alone
(see further section 7).

3. Formal properties

3.1. Position
Until quite recently few studies looked at the formal features of discourse markers
and related elements in detail, probably because it has been thought to be difficult
to establish a link between formal features and function. Although it is possible that
formal properties are unrelated to the functions of a lexical item, a more promising
way forward is to assume that they are associated in some ways.
Thus looking at their position provides a way of analysing what lexical ele-
ments are doing in the discourse. Example (2) illustrates a common position of
discourse markers initially at the edge of the clause:
(2) A: How long did you stay there
B: Well I had a month’s study tour and then three months’ exchange
(adapted from ICE-GB1)

Well is placed outside the clause without any syntactic relations to the clause it
introduces. This does not mean that it does not have a function. Quoting Kalten-
böck and Heine (2014: 350), at a particular point in the discourse “the speaker may
choose to step out of the confines of syntax and create an extra place of communi-
cation which caters for the immediate demands of the situation.”
Corpus-pragmatic methods are suitable for analysing what lexical elements are
doing in the clausal environment. On the other hand, the methods are less suitable
for discussing the position of discourse markers in relation to turn-taking and the
sequential discourse. The agenda for explaining the placement of out-of-the-clause
elements in an interactional perspective comes instead from Conversation Analysis
(CA) (e. g. Schegloff 1996; Sacks, Schegloff and Jefferson 1974) and interactional
linguistics (Ono and Thompson 1995; Linell 1998; Golato and Golato this volume).
In a CA perspective, discourse markers (e. g. well, I think, I mean) are inextricably
associated with spoken discourse and have functions which can be explained with
reference to the fact that language is produced by speakers in real time on a turn-
by-turn basis. In the interactional perspective, the focus is transferred from the

1 The British component of the International Corpus of English. See http://ice-corpora.net/ice/

tasks that discourse markers perform in a single utterance, for example to signal
hesitation, to their role in structuring whole chunks of text. Their global discourse
function can be illustrated by the following example:
(3) A: The funny thing is that none of the sort of
Nancy Mitford stuff <,>
Do I mean Nancy
I can never remember which Mitford is which
But anyway none of the U and non-U stuff
seems to have washed off on your mother at all
(adapted from the ICE-GB)
Anyway comes in at a point in the interaction where the speaker wants to resume
the conversational thread after an interruption. She has just mentioned “the sort of
Nancy Mitford stuff” and then becomes uncertain about which Mitford is meant
(“Do I mean Nancy”). The interpretation of anyway as a marker of dismissal of
what has just been said depends on its position in the larger sequential structure
and not only on its position initiating a clause. In such cases a corpus can provide
examples of extended discourse which can be the basis for analysing the marker
in a particular function.
Although there may be a tension between the interactional and the corpus-based
approach, it seems that the two methodologies can benefit from each other’s com-
pany. Corpora can be used for investigating (and providing frequency information
about) what a lexical element is doing in larger contexts in naturally occurring
discourse, while the interactional discourse perspective helps us to understand why
it performs certain communicative tasks.
Corpora can also be used to test and evaluate hypotheses about the uses and
functions of pragmatic elements derived from what we know about spoken lan-
guage and how it functions dialogically. One such hypothesis is that discourse
markers are doing different things depending on where they are placed in the utter-
ance. The example below illustrates actually in two pragmatically interesting posi-
tions, initially and finally, in the clause:
(4) Actually she is not as pretty as she might have been
She is not as pretty as she might have been actually
(adapted from ICE-GB)

The initial position has been regarded as the normal one for discourse markers
(Schiffrin 1987: 328). The rightmost position (at the end of the clause), on the other
hand, has often been neglected but is typical of certain discourse markers, such as
tag questions in English, and many discourse markers, such as actually, anyway and
then, have a variable position. We can take the issue of position one step further by
considering why a discourse marker occurs in a particular position.
An assumption following from the property of spoken discourse, that it pro-
gresses from “left” to “right”, is that the leftmost position (the left periphery)
should be preferred for other tasks than the rightmost one (Beeching and Detges
2014: 1). A discourse marker in the left periphery provides an opportunity for
speakers to indicate that they are willing to take the turn, or it may be used to create
topic-shift. In the rightmost position (the right periphery) a discourse marker may
be a sign that the speaker wants to cede the conversational floor to the hearer or
express hearer-oriented affect:
[just looking at English] there does appear to be a basic difference between elements in
the two peripheries. The left periphery is for elements with responsive functions […].
Markers in this position (e. g. well, so, indeed) seem to acquire text-structuring func-
tions and (inter)subjective meanings, the latter marking affective components integrated
into a reaction. Elements in final position, however, do not provide an initial guide to
the hearer, but rather signal how the utterance is to be interpreted in a specific context
(Haselow 2013: 418).
Beeching and Detges (2014) set out to test, on data from a wider range of languages,
the hypothesis that discourse markers are used in different positions depending on
general discourse principles, such as subjectivity or intersubjectivity, and
associated interactional usages.
The hypothesis that linguistic elements should pattern in the same way across
languages was, however, shown to be too simple. The authors found that “some
kind of asymmetry between left and right periphery does exist; it was shown that
in most cases this asymmetry is a matter of frequency (and hence of degree) rather
than being categorical and that the LP/RP (left periphery/right periphery) position
interacts with both the core meanings of the items and prosody” (Beeching and
Detges 2014: 19).
Beeching and Detges concluded that more empirical research needs to be car-
ried out to establish the complex relationship between position and function, which
may involve several factors and frequencies rather than categorical differences.
However, there seems to be sufficient corpus-based evidence (including cross-lin-
guistic data) that the demands imposed by the circumstances in which the dialogic
interaction takes place can be responsible for where in the utterance a discourse
marker is placed.
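
The frequency-based view of the left/right asymmetry lends itself to a simple count. The following sketch is a minimal illustration and not the procedure used by Beeching and Detges: it assumes that utterances are already segmented and tokenized on white space, treats the first and the last token as the left and right periphery, and adds two invented utterances to those of example (4). The function name position and the labels LP, RP and medial are assumptions made for the illustration.

```python
from collections import Counter

def position(tokens, marker):
    """Classify the marker as LP (first token), RP (last token) or medial.

    Returns None if the marker does not occur in the utterance.
    """
    if marker not in tokens:
        return None
    if tokens[0] == marker:
        return "LP"
    if tokens[-1] == marker:
        return "RP"
    return "medial"

# Toy utterances: the first two follow example (4), the others are invented.
utterances = [
    "actually she is not as pretty as she might have been",
    "she is not as pretty as she might have been actually",
    "actually I had a month's study tour",
    "she actually stayed for three months",
]

positions = [position(utt.split(), "actually") for utt in utterances]
counts = Counter(p for p in positions if p is not None)

total = sum(counts.values())
for pos, n in counts.most_common():
    print(f"{pos}: {n}/{total} ({n / total:.0%})")
```

A study of real data would need a richer representation than this: multi-word markers such as of course, prosodic boundaries and turn structure are all ignored here, and the resulting proportions are only as reliable as the segmentation into utterances.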

3.2. Prosody
It is possible to study prosodic phenomena on the basis of corpora, such as the
London-Lund Corpus of Spoken English, which contains a prosodic transcription
of the conversations, and the ICE-GB corpus (the British component of the International
Corpus of English), which preserves the original audio recordings of the
conversations. Using these resources it has been possible to describe many spoken
phenomena with regard to accent status and intonation.
Out-of-the-clause elements or “inserts” as a class tend to be prosodically
non-integrated in the utterance to which they belong, separated by boundaries that
can be marked by a pause or by the assignment of a nuclear tone. According to
Kaltenböck and Heine (2014: 352) “they [outside-the-clause elements or what the
authors refer to as “theticals”] are prosodically set off from the rest of the utterance
by a separate intonation contour and pauses”. On the other hand, we cannot gen-
eralize about individual inserts. An investigation of the discourse marker well by
Altenberg (1987: 137), on the basis of ten spoken texts in the London-Lund Corpus
of Spoken English, showed, for example, that the marker was pronounced with a
nuclear tone only in about half of the examples.
However, we need to delve deeper and study quantitatively how prosodic infor-
mation (accent status, pausing, phrasing) is related to the meaning and function
of discourse phenomena. The link between prosody and function is complicated
because of the multifunctionality of discourse markers and related phenomena. Dis-
course markers, for instance, have been regarded as “cue phrases” which together
with information about prosody and position can help the hearer to distinguish
between different functions. An example is Hirschberg and Litman (1987), who
examined the prosodic features of now to find out which features best disambiguated
between the sentential and discourse uses. The importance of prosody for distinguishing
between different functions is illustrated by Ferrara’s (2001) study of different
types of anyway.2 Ferrara used a corpus of spoken narratives from sociolinguistic
interviews, focusing on the intonational patterns of anyway. She found that different
tonal patterns can distinguish between three types of anyway, only one of which
was defined as a discourse marker. The intonational differences between the three
uses of anyway were both “recognizable and measurable” (Ferrara 2001: 129). The
discourse marker anyway was most frequently used as a digression marker. It was
pronounced with a low pitch signalling a dramatic and attention-getting function.
Wichmann, Simon-Vandenbergen and Aijmer (2010) also focused on a spe-
cific discourse marker, viz. of course in order to study the interaction between
form, prosody and function on the basis of corpora. The examples were analysed
syntactically, semantically and pragmatically. The tokens were then categorized
according to position in the tone group and intonational pattern. The results of
the analysis showed that of course can have several different prosodic realisations
consistent with the fact that it is multifunctional and can be placed in different
positions. The authors also showed that prosodic choices can depend on a number
of different contextual factors, such as speech style, situation, rhetorical goals,
previous knowledge, etc. (Wichmann, Simon-Vandenbergen and Aijmer 2010: 47).
Moreover, the corpus-based approach makes it possible to go beyond the syn-
chronic analysis and establish a connection between form, prosody (for example
accent status), position in the utterance, function and grammaticalization changes.
Of course as a grammaticalized item (with the interpersonal meaning “as we/you

2 Ferrara’s (2001) analysis is based on research published in Ferrara (1997).

know”) was, for example, shown to be unstressed, reflecting the fact that prosodic
attenuation accompanies a loss or a weakening of meaning characteristic of gram-
maticalization processes.
In addition to prosody, non-verbal features are important for constructing and
interpreting meanings and provide a challenge for spoken corpora and corpus
analysis. The existence of multimodal corpora now makes it possible to describe
non-verbal elements and how they are linked to form and function. As Carter and
Adolphs (2008) point out,
[…] gesture and prosody are not forms of language in the same way as words or syntac-
tic patterns or structural features of discourse organization, but they are both comple-
mentary and integral in several respects to forms of language. They play a significant
role in the creation of meaning and, as we have shown, can be incorporated alongside
forms of language as data for corpus analysis. (Carter and Adolphs 2008: 179)

The future looks bright for corpora taking into account gestures, head nods and body
movements. Knight and Adolphs (2008)3 used a multi-modal approach to study
back-channels on the basis of the Nottingham Multi-Modal Corpus. Back-channels
were defined as “any short item that did not appear to take over a speaker turn, and
was not a response to a question” and illustrated by yeah (O’Keeffe and Adolphs
2008; quoted from Knight and Adolphs 2008: 180). The backchannels were clas-
sified into categories such as convergence tokens and information receipt tokens
depending on whether they were used to mark convergence and to help maintain
good relations, or as a response signal where one of the speakers controlled the con-
versation. In order to identify different types of head nods a coding scheme was de-
veloped which could account for the different types. Head nods, for example, were
classified with regard to duration and a decision had to be made whether a small
number of head nods in succession counted as the same nod or whether they should
be coded as different nods. Knight and Adolphs’ analysis showed that back-channels
were more frequent with head nods than without them, and that when the backchan-
nel had the function of a convergence token, the head nod was coded for longer
duration than when it marked information receipt (Knight and Adolphs 2008: 186).

3.3. Collocations
Another aspect of the “external syntax” associated with inserts which needs to be
taken into account in a corpus-based approach is their collocational properties. The
co-selection of discourse markers is systematic and serves to disambiguate
markers which have many different functions, or to reinforce a function which
is only weakly grammaticalized or emergent (Linell 2009: 322). It is therefore

3 Based on research published in Carter and Adolphs (2008).

important to study not only single “out-of-the-clause” elements but also their combinations
with other markers in different contexts and corpora. In the conversations
in the ICE-GB (Aijmer 2013: 29), well as a discourse marker tends to collocate
with okay, now, at least, anyway, as in (5), where the acceptance sense of well is
qualified by at least:
(5) Oh well at least it looks better for us when there’s nobody there. (ICE-GB)

In example (6), the collocation with I mean suggests that well is being used with
the same meaning (self-monitoring or correcting):
(6) A: Well Xepe seems to love this idea of having a picnic but I’m not too sure about this
B: Not if you’ve had lunch
A: Because I’ll have eaten anyway
Well I mean part part of the reason I am eating will be so that we don’t have a picnic
(ICE-GB; quoted from Aijmer 2013: 33)

Certain discourse markers almost seem to require a collocating discourse marker to
express a certain function. An example of this is anyway which was used together
with another marker (but anyway, well anyway, so anyway, and anyway) in nearly
all the examples in the left periphery in the ICE-GB (Aijmer 2016b).
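
Co-selection of this kind can be counted directly. The sketch below is a minimal illustration under stated assumptions, not the analysis reported in Aijmer (2016b): the inventory MARKERS is a hypothetical list chosen for the example, the utterances are invented apart from the first, which echoes example (3), and only the token immediately preceding anyway is inspected.

```python
from collections import Counter

# Hypothetical inventory of markers; which items to include is an analytic choice.
MARKERS = {"but", "well", "so", "and", "oh", "okay", "now"}

def preceding_marker(tokens, node="anyway"):
    """Yield the marker directly before each occurrence of the node word,
    or None if the node is not preceded by an item from MARKERS."""
    for i, tok in enumerate(tokens):
        if tok == node:
            prev = tokens[i - 1] if i > 0 else None
            yield prev if prev in MARKERS else None

# Toy utterances, lower-cased and tokenized on white space.
utterances = [
    "but anyway none of the u and non-u stuff seems to have washed off",
    "well anyway we left early",
    "so anyway what happened next",
    "anyway that was that",
    "i went anyway",
]

pairs = Counter()
for utt in utterances:
    for marker in preceding_marker(utt.split()):
        pairs[marker] += 1

for marker, n in pairs.most_common():
    label = f"{marker} anyway" if marker else "anyway (no preceding marker)"
    print(f"{label}: {n}")
```

The sketch does not separate the left periphery from other positions; combining it with a positional classification would be needed before the counts could be compared with the finding reported above.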

4. Function

Function is at the core of a corpus-pragmatic approach to language. The corpus-based
form-to-function method must, therefore, be evaluated with regard to its success in
retrieving functions on the basis of form. However, we cannot search for
a particular lexical form and expect to get only relevant hits (“one-to-one search-
ing” Ädel and Reppen 2008: 2). Well, for example, is both a discourse marker and
a manner adverbial with grammatical rather than pragmatic function. In this case
the discourse uses can easily be distinguished from the grammatical uses. How-
ever, many pragmatic elements are multifunctional in the sense that they can have
several different functions depending on the context or even several functions in
the same context.
From a theoretical perspective multifunctionality raises the question whether
functional interpretations should be dealt with at the level of langue (or seman-
tics) or parole (pragmatics) (Hansen 2014: 152).4 At the level of langue we con-
sider conventionalized functions of a particular discourse marker and how they
can be organized in a polysemous way around one or more core meanings. This

4 For a discussion of other theoretical approaches to multifunctionality, see Fischer (2006a: 13–14).

is in line with the principle that “languages tend to avoid homonymy, and reserve
one meaning for one form. There is no reason why a similar principle should not
also be operative with respect to ‘functions’” (Östman 1995: 102–103). At the
level of parole a discourse marker can have a large number of different functions
depending on its uses in the context (for example its occurrence in different activity
types).
The corpora only provide raw material for the corpus-pragmatic analysis. In
addition we need to define the functions that spoken phenomena can have. There
seems to be a fairly general agreement that functions of discourse markers or
related elements should be defined in pragmatic and discourse terms and that the
functions have their origin in the tasks performed in the communication situation.
However, it is far from clear what these functions are, what criteria should be used
to identify them or the number of functions which provide the best descriptions.
The functions identified depend on the corpus data used for the analysis and
on the particular “insert” we discuss. A framework for analysing the functions of
discourse markers (or for other inserts) does not yet exist, and there is no agree-
ment about the number of functions such a theoretical model should contain. Nev-
ertheless many scholars have attempted to characterize the “macro-functions” or
types of communicative tasks associated with the interpretation of the linguistic
elements. The purpose is to define “a plausible number of well-defined identifiable
readings” (Fischer 2006a: 3) (“paradigm functions” in Heine et al. 2013: 173) and
the communicative domains to which they belong. Existing functional typologies
suggest that attention should be given to at least interpersonal and textual func-
tions although “richer” typologies have also been proposed. Östman (1995) has
developed a model of discourse markers based on three different macro-functions
(parameters “in accordance with which communication takes place: Coherence,
Politeness and Involvement”; 1995: 104). Coherence has to do with cultural and
social constraints we have to take into account when communicating (Östman
1995: 104). Politeness and Involvement (affective stance) are other general-be-
havioural functions of the model.
Fischer (2006b) considers a wider range of functions that discourse markers
can have “commonly, and often cumulatively”, including “functions with respect
to the turn-taking system, the indication of discourse relations, discourse structur-
ing, the regulation of interpersonal relationships, speech management, or polite-
ness” (2006b: 430).
An influential framework for understanding the functions of discourse markers
is inspired by Halliday’s typology of language functions in the theory of Sys-
temic Functional Linguistics (Halliday and Hasan 1976). Brinton (1996) argues
that it is possible to describe the function of discourse markers on the basis of the
interpersonal and textual functions identified by Halliday. Discourse markers in
the interpersonal function would be concerned with “the social, expressive and
conative functions of language” while elements with a textual function belong to
the text-forming component of language (cf. Halliday and Hasan 1976: 26–27).
Speakers orient to these domains differently depending on the speech situation. In
communication among friends, speakers may pay more attention to the interper-
sonal function than to textual coherence.
A related question which needs to be debated is how the functional domains
or taxonomies should be applied to the empirical analysis of discourse markers in
different text types and different languages or in other words, “which constraints
should be grouped under which domain” (Fried and Östman 2005: 1760). Which
functions, for example, belong to the textual domain and which functions are pref-
erably placed under the interpersonal umbrella? Brinton (2008) suggests the fol-
lowing model:
the textual functions include those of claiming the attention of the hearer, initiating
and ending discourse, sustaining discourse, marking boundaries, including topic shifts
and episode boundaries, constraining the relevance of adjoining clauses, and repair-
ing discourse. Among the interpersonal functions are expressing responses, reactions,
attitudes, understanding, tentativeness, or continued attention, as well as interactive
functions, such as expressing intimacy, cooperation, shared knowledge, deference, or
face-saving (politeness). (Brinton 2008: 17–18)
The categories proposed, the “right” number, and the labels used to describe them
are based on our interpretation of what pragmatic elements are doing in discourse.
However, many problems remain. As Lewis points out (2006: 57), some functions
can be expressed by a large number of forms while others have only a few realiza-
tions. The functions proposed may be difficult to distinguish from each other and
more work needs to be done on linking function to formal properties such as position
or stress and intonation.

5. The context

Many spoken corpora also contain information about activity type (text types) and
sociolinguistic features relating to the speakers’ age, social class and gender. As
a result, we can take a further step and consider how function is associated with
contextual variables. For example, discourse markers as well as hesitation markers
and other inserts can be used in a variety of activities and situations. Moreover,
their function is linked to the age, class and gender of the speakers.

5.1. Context defined

To begin with, the notion of context needs to be discussed. What do we mean by
context? How much of the context, and what features of the context should be
included in a corpus-pragmatic approach to linguistic forms? In linguistic terms,
context is defined narrowly in terms of the words surrounding a lexical item and
determining its meaning. However, we need a broader description of the context
in order to describe the situational and sociolinguistic variability of pragmatic
expressions. Context, however, is an obscure notion which is referred to differently
depending on the theory and the phenomena it is used to explain. According to
Schiffrin (1987: 3) contexts range “from cultural contexts of shared meanings and
world views to social contexts through which definitions of self and situation are
constructed, to cognitive contexts of past experience and knowledge”. In a cognitivist
framework, Croft and Cruse (2004: 102) propose that
context puts constraints on utterance interpretation corresponding to what Clark
(1996) refers to as the common ground of the speakers (the shared knowledge,
beliefs and assumptions of the participants in the conversation). The following
aspects of context have in common that they constrain the interpretation of the
utterance.
(i) Previous discourse (what has been said immediately prior to a given utterance)
(ii) Immediate linguistic environment (co-text)
(iii) Type of discourse (the type of activity)
(iv) Physical context (the immediate surroundings in which the speech situation
takes place)
(v) Social context (this refers to the kind of situation the participants are in and the
social relations between them)
(vi) Stored knowledge (remembered experiences and knowledge which can have
an effect on interpretation)
In the anthropological literature we find a rich description of the socio-cultural dimen-
sions of the communication situation, such as the social identities of the interact-
ants, their relationships to each other, activities (debating, story-telling), attitudes
and feelings (Ochs 1996: 410). Social identity includes “all dimensions of social
personae, including roles (e. g. speaker, over-hearer, master of ceremonies, doc-
tor, teacher, coach), relationships (e. g. kinship, occupational, friendship, recrea-
tional relations), group identity (gender, generation, class, ethnic, religious, educa-
tional group membership) and rank (e. g. titled and untitled person, employer and
employee)” (Ochs 1996: 410).5 When a meaning is chosen in the interaction all
of these contextual factors are potentially relevant. However, this categorization
may be less suitable or too fine-grained for the analysis of discourse markers and
related lexical elements.
Moreover, neither the cognitivist nor the anthropological definitions are corpus-based
and therefore mainly provide a wish-list of features which should be
analysed on the basis of a corpus. Luckily, however, present-day spoken corpora
provide a great deal of information about contextual factors which can be annotated
and used for the study of lexical elements which are interpreted in the context.
Large spoken corpora such as the British National Corpus contain demographic
information about the social situation and about who the speakers are, making it possible
to study the distribution and frequency of lexical elements in relation to the coded
social factors. Of particular importance in this regard is the association between
discourse markers and the type of activity. As I will show below, it would in fact
be difficult to understand many functions of linguistic elements in spoken language
without considering their link to usage in different activities.

5 On the linguistic indexing of activity (types), see section 5.2.

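Relating the frequency of a lexical element to coded social factors can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions: the utterance records, their field names (`sex`, `text`), the toy data and the helper `per_million` are all hypothetical and do not reflect the annotation format of the British National Corpus or of any other corpus.

```python
import re
from collections import defaultdict

# Hypothetical, hand-made utterance records; a real corpus would supply
# speaker metadata through its own annotation scheme.
UTTERANCES = [
    {"sex": "f", "text": "well I think it was like ten minutes"},
    {"sex": "f", "text": "it was like really boring"},
    {"sex": "m", "text": "well we went to the pub"},
    {"sex": "m", "text": "I was there"},
]

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def per_million(utterances, form, factor):
    """Return the frequency of `form` per million words for each value of `factor`."""
    hits = defaultdict(int)
    words = defaultdict(int)
    for utt in utterances:
        tokens = tokenize(utt["text"])
        words[utt[factor]] += len(tokens)
        hits[utt[factor]] += tokens.count(form)
    return {group: hits[group] / words[group] * 1_000_000 for group in words}

rates = per_million(UTTERANCES, "like", "sex")
for group, rate in sorted(rates.items()):
    print(group, round(rate))
```

The sketch counts every token of the form, whatever its function. Separating, say, discourse-marker like from the verb or preposition would still require manual functional classification of the hits.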
In the following sections, I will consider in more detail the contextual factors
which are emphasised in corpus studies and the methods used by corpus linguists
to describe discourse markers and other inserts in the context.

5.2. Form, function and the type of activity

The activity plays an important role in how discourse markers are used. The sit-
uational description can be conceptualized as a frame (“communicative background
frame”, Fischer 2006b: 442) or as a “(communicative) activity type” (Levinson
1979; Linell 2010). Levinson (1979) defines activity type as follows:
In particular, I take the notion of an activity type to refer to a fuzzy category whose
focal members are goal-defined, socially constituted, bounded, events with constraints
on participants, setting, and so on, but above all on the kinds of allowable contributions.
(1979: 368, italics in the original)

Examples of activity types are informal conversation, telephone conversation and
institutionalized activities such as classroom lesson and broadcast discussion.
The speaker and hearer have special social roles in the discourse (e. g. as teacher
and pupil) and the turns follow each other according to an agenda specifying who
says what to whom. Several spoken corpora, such as the London-Lund Corpus
of Spoken English (LLC) and the corpora of the International Corpus of English (ICE),
provide a categorization of text types which can be linked to activity types such as
conversation, classroom lesson, broadcast discussion, parliamentary debates and legal
cross-examinations. As a result, the frequency and distribution of pragmatic phenomena
can be studied in different activities and compared across the sub-corpora.
The following example from a broadcast discussion which is a part of the ICE-GB
illustrates the use of the discourse marker well by the moderator of the discussion
to invite a new speaker to take the conversational floor (cf. Aijmer 2013: 59).
(7) Moderator: Well our next witness is Judith Dawson who’s a principal senior social
worker in Nottinghamshire (ICE-GB Broadcast discussion, simplified)

The use of well initiating a turn in which a new speaker is brought into the dis-
cussion is linked to the speaker’s role as moderator. The invited speaker, who is
a social worker, has witnessed a case of child abuse, and the topic to be debated
is whether in such situations children should be removed from their parents. Well
has a specialized meaning along with its general meanings which are described in
relation to textual or interpersonal meanings.
A corpus-based approach demonstrates other functions of well constrained by
the goals of the activity. In (8) well is used in a cross-examination with a spe-
cialized function. Well introduces a question asked by the examiner who already
knows the answer:
(8) Examiner: Well did you understand from Mr Sainsbury that if you didn’t have the
money by the third of February it would cause problems (from ICE-GB)

Moreover, well was used for “activity-based” discourse functions with a punctuat-
ing function in sports commentaries on the radio (data from ICE-GB):
(9) Commentator: Dixon gets that cross in headed away well there by Kalatsakas and well
finally hammered away deep into the Arsenal (ICE-GB)

A corpus-study can be integrated with approaches which are more closely asso-
ciated with sociolinguistics and discourse analysis. Innes (2010) illustrates how a
corpus analysis can be combined with Conversation Analysis and an ethnographic
analysis to study pragmatic phenomena. Her data consisted of criminal jury trials
in a New Zealand setting and featured many different participants (police wit-
nesses, judges, counsel for the defense and for the prosecution). Following the lead
from Conversation Analysis she found that the discourse marker well was more
often used in initiations (e. g. challenges) than in responses (e. g. justifications).
Well was used differently depending on the speaker’s professional identity and
women used it more than men.

5.3. Form, function and pragmatic variability

The interaction between form and the (sociolinguistic) context has generally been
studied from a sociolinguistic variationist perspective rather than as a pragmatic
phenomenon. The variationist approach has, for example, been used successfully
to study the influence of sociolinguistic factors such as age and social class on
pronunciation or grammatical variation. Attempts have also been made to widen
the analysis to discourse phenomena. Dines (1980) proposed that the variationist
approach could be extended to the analysis of “set-marking tags” such as and that,
and stuff like that. The “sticking-point” for taking the variationist approach is that
few variants can be distinguished which are semantically and pragmatically equiv-
alent (Beeching and Woodfield 2015: 9). A more successful approach has therefore
been to study the influence of sociolinguistic factors such as social class, age and
gender on discourse markers in corpora which make such information available.
The use of corpora and a form-to-function approach are also compatible with
the importance of studying pragmatic phenomena from the perspective of variabil-
ity. Pragmatic variability has been observed in many areas and levels of language
and has come on the agenda recently within the new branch of linguistics referred
to as variational pragmatics (Barron and Schneider 2009). Barron and Schneider
draw attention to the fact that pragmatic variation (for example between different
variants of the same speech act) can be related to the sociolinguistic context (in
particular different regional varieties). Discourse markers have been much less
discussed than speech acts from a variational perspective but can also be system-
atically related to sociolinguistic factors. Jucker and Taavitsainen (2012: 296), for
example, suggested that a linguistic variable can be regarded as a “pragmatic vari-
able” or a pragmatic unit which needs to be analysed with regard to sociolinguistic
and other contextual factors:
Realizational pragmatic variables are a special type of the linguistic variables described
above, but with a focus on a pragmatic unit of a language instead of a phonological,
morphological, or syntactic unit. Relevant examples are address terms (tu versus vous
in French, for instance), discourse markers, different types of speech acts or different
types of politeness strategies. (Jucker and Taavitsainen 2012: 296)

Pragmatic variables are indexically linked to sociolinguistic features in the
communication situation such as age and gender and indirectly to features such as
“youth” and establishing solidarity with the peer group. They need to be studied
on the basis of corpora which enable the researcher to establish patterns of usage
which connect formal and functional factors to sociolinguistic aspects such as
regional provenance, age, gender and social class of the speakers.
The sociolinguistic angle has been present since the early days of corpus linguistics.
Holmes (1986), for example, showed that you know requires both a functional
analysis and a description of how the marker is used differently by men and
by women, basing herself on a corpus of New Zealand English (Holmes 1986: 1).
However, we are now beginning to get more and broader corpus documentation
of the relationship between function and sociolinguistic factors. Andersen (2001)
studied how teenagers from different London schools used the discourse marker
like on the basis of the conversations in the Bergen Corpus of London Teenage
Language (COLT). The discourse marker is illustrated in the example below
(Andersen 2001: 208):

(10) Starts off a bit boring. First like twenty minutes and then it gets good (from COLT;
Andersen 2001: 209)

The examples of like were classified with regard to the social factors gender, age,
social class, ethnicity, and location (different London boroughs). This made it possible
to link the function of like to the context. The pattern which emerged was
that the prototypical user of like with a discourse marker function was “a white
17-year-old girl from the highest social class who attends the boarding school in
Hertfordshire” (Andersen 2001: 294). In a corpus-pragmatic approach we can go
beyond the corpus data to explanation.
The teenagers’ use of like can also be indexically linked to a particular social
identity and values or norms associated with that identity. Andersen mentions, for
instance, that the marker can invoke politeness or solidarity, since like may
help the speaker to avoid sounding abrupt (Andersen 2001: 295). Like
is also capable of invoking a set of more general socio-cultural values associated
with adolescence such as post-modern, ironic and non-committal (Andersen 2001).
Corpus-pragmatic methods and the access to new sociolinguistic corpora have
also made it possible to study innovative or emergent discourse markers according
to several different sociolinguistic aspects. Torgersen and Gabrielatos (2009) stud-
ied innit and you get me in the Linguistic Innovators Corpus: the English of ado-
lescents (LIC). The methodology combined corpus linguistics and sociolinguistics.
The analysis took into account the relative frequency of the use of the markers as
well as the proportion of speakers using the variants. The variables included were
age, sex, ethnicity and place of residence (in London). Torgersen and Gabrielatos’
analysis showed that innit and you get me were the most frequent tags and that
they were used most frequently by male, non-Anglo, Hackney (a London borough)
residents. You get me was most frequent in multi-cultural friendship groups.
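The two measures mentioned above, the relative frequency of a marker and the proportion of speakers who use it, answer different questions and are easy to conflate. The following Python sketch shows one way of computing both. It is an illustration only: the speaker identifiers, the toy turns and the function `usage_measures` are hypothetical and are not taken from the Linguistic Innovators Corpus or from Torgersen and Gabrielatos’ procedure.

```python
from collections import Counter

# Hypothetical (speaker_id, tokens) pairs standing in for transcribed turns.
TURNS = [
    ("S1", ["it's", "cold", "innit"]),
    ("S1", ["yeah", "innit"]),
    ("S2", ["I", "know"]),
    ("S3", ["that's", "mad", "innit"]),
    ("S4", ["see", "you", "later"]),
]

def usage_measures(turns, form):
    """Return (hits per 1,000 words, share of speakers who use `form` at least once)."""
    total_words = 0
    hits_by_speaker = Counter()
    speakers = set()
    for speaker, tokens in turns:
        speakers.add(speaker)
        total_words += len(tokens)
        hits_by_speaker[speaker] += tokens.count(form)
    total_hits = sum(hits_by_speaker.values())
    users = sum(1 for s in speakers if hits_by_speaker[s] > 0)
    return total_hits / total_words * 1000, users / len(speakers)

rate, share = usage_measures(TURNS, "innit")
print(f"{rate:.1f} per 1,000 words; used by {share:.0%} of speakers")
```

A high relative frequency can be produced by a few very frequent users, which is why the share of speakers is reported alongside it. The sketch handles single-word forms only; a multi-word marker such as you get me would need matching over token sequences.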
More recently, Beeching (2016) employed corpus-linguistic and sociolinguistic
methods to analyse the discourse markers well, just, you know, like, sort of, I mean.
In order to analyse their distribution and frequencies with respect to the sociolin-
guistic variables social class, age and gender, she used the demographic part of
the British National Corpus. The full BNC was, however, used for a genre-based
analysis of the markers. By using the KWIC function it was, for example, possible
to investigate bundles with discourse markers and perceive patterns in the way they
were used in different genres (Beeching 2016: 61).
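A KWIC (keyword in context) display of the kind referred to above simply lines up every occurrence of a search word with a fixed window of context on either side. A minimal Python sketch follows, assuming a pre-tokenized text. The function `kwic`, the window size and the sample sentence are illustrative inventions and do not reproduce any particular concordancer or BNC data.

```python
def kwic(tokens, node, width=3):
    """Return (left context, node, right context) for every occurrence of `node`."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# Invented sample text, not taken from the BNC.
SAMPLE = "well I mean it was sort of fine you know well sort of".split()

for left, node, right in kwic(SAMPLE, "sort"):
    print(f"{left:>20} | {node} | {right}")
```

Sorting such lines by the left or right context is what makes recurrent bundles (for example well sort of) visible to the analyst.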
The choice of contextual features which can be studied by means of corpus
linguistic methods is also becoming broader. Regional variation, for example, has
been investigated on the basis of the corpora of national varieties of English within
the ICE-project (International Corpus of English). The corpora, which include both
speech and writing, are of the same size and compiled in the same way in order
to make it possible to compare the frequency and function of the items examined.
The comparison of actually in several sub-corpora of spoken English showed, for
example, that the marker was more frequent in Hong Kong English and Singapore
English than in British or New Zealand English and that it was used in different
positions and functions in the varieties (Aijmer 2016a). On a deeper level, the
study of the variation between national varieties gives rise to new questions con-
cerning the importance of social and cultural norms to explain variation.

6. Combinations of form and function in selected theoretical approaches

In this section we go from corpus-based observations of pragmatic phenomena in
spoken language to theory. Construction Grammar, for example, seems to be well
equipped to give a rich description of both form and function of spoken phenomena
in different contexts. Other approaches which are compatible with a corpus-pragmatic
approach are thetical grammar and, more generally, descriptive grammars of
spoken language. In Biber et al.’s (1999: chapter 14) “Grammar of conversation”,
elements which are placed outside the clause are described collectively as a class
of inserts. Thetical grammar aims to describe phenomena which do not fit
into Sentence Grammar. Biber et al.’s analysis of inserts and thetical grammar
have in common that they are mainly taxonomic, while the constructional model
is compatible with a dialogical or interactional view of spoken language.

6.1. Taxonomic approaches

6.1.1. Lexical expressions as inserts

Biber et al.’s (1999) approach is corpus-based and descriptive. The authors describe
inserts as “a class of words” characterized by the fact that they are “stand-alone”
elements which are unable to enter into syntactic relations with other structures
(Biber et al. 1999: 1082). They are also referred to in semantic-pragmatic terms:
“[s]emantically, they have no denotative meaning: their use is defined rather by
their pragmatic functions” (Biber et al. 1999: 1082). Biber et al. also point out that
they are difficult to translate and are often omitted since they are not part of the
propositional content.
The defining features of (the most common members of) the insert category
are then as follows:
(1) They may appear on their own, i. e. not as part of a larger grammatical structure.
(2) On the other hand, they may appear attached (…) to a larger structure, which may be a
clausal unit or a non-clausal unit.
(3) They rarely occur medially in a syntactic structure.
(4) They are morphologically simple.
(5) They are not homonyms of words in other classes.
(6) Semantically, they have no denotative meaning: their use is defined rather by their
pragmatic function.
(Biber et al. 1999: 1082, italics in the original)

Biber et al. (1999) discuss several classes of inserts, such as interjections, voca-
tives and expletives, both with regard to formal and functional features. They also
make the observation that inserts are used with different frequencies in British and
American English. However, they do not go beyond British and American English
to other varieties.

6.1.2. Thetical Grammar

Thetical Grammar is another model analysing pragmatic phenomena both formally
and functionally. The “theticals” are extra-clausal units such as vocatives, imper-
atives, formulae of social exchange and interjections (Heine et al. 2013: 155).
The assumptions underlying the thetical view of (spoken) grammar are inspired
by discourse analysis rather than by conversation analysis.6 In other words, the
focus is on “orthodox linguistic taxonomy” rather than on the tasks performed by
linguistic elements in the discourse sequence. In line with this objective, Heine et
al. are concerned with describing characteristic formal and functional features of
the theticals. The grammar also has to reconcile the dichotomy between the linear
progression of spoken communication and the interactional exigencies imposed by
the speech situation. While sentence grammar is suited for presenting information
in a linear format, thetical grammar is said to have “the entire situation in its scope:
the speaker, the hearer, their relation to one another, to the text, and to the situation
in which discourse takes place” (Heine et al. 2013: 194–195). The meanings of
discourse markers and other theticals can for example refer to the speaker-hearer
interaction or to the speaker’s attitudes depending on which component is fore-
grounded in the situation.

6.2. Discourse markers as constructions

A constructional approach is used increasingly to study spoken phenomena (on
Construction Grammar see e. g. Kay and Fillmore 1999; Croft 2001; Goldberg
1995). It can be combined with a pragmatic analysis of spoken phenomena in
interactional and social contexts. The idea that discourse markers are linguistic
constructions combining formal and functional properties has been taken up by
Fried and Östman (2005), who welcome Construction Grammar as an opportunity
to combine a description of the meaning (or functions) of discourse markers with
a dialogical approach to spoken language.
Construction Grammar can be “easily enriched by introducing the parameters
that are necessary for incorporating discourse-level information” and be used as
a framework to give a “communicatively and grammatically adequate treatment
of discourse markers (pragmatic particles)” (Fried and Östman 2005: 1753; italics
in the original). Fried and Östman’s procedure is to consider how Conversation
Analysis (CA) can be combined with Construction Grammar. Formally, as we have
seen, discourse markers must be described with regard to their position in the
utterance, collocational patterns and prosody. On the meaning side, the construction
can include information about constraints on the preceding and following discourse,
the immediate discourse, types of activity in which a construction occurs,
etc. The authors illustrate the constructional approach by drawing a functional map
of what speakers need to know about the structure and meanings of question particles
in a Finland-Swedish dialect (Solv) and in Czech. The following information,
according to Östman (2006: 244), illustrates contextual constraints on the usage
of the question particle då (‘then’) found in Solv: it is acceptable as a question in
the linguistic community, men do not use it in wh-questions, the expected response
is new, it has (im)polite functions with regard to distance/deference and it is not
stressed. The construction can be thought of as a fairly abstract representation of
the potential meanings of a particular discourse marker. In the concrete speech
situation the construction’s meaning or functional potential is exploited for carrying
out recurrent discourse tasks by selecting the appropriate features.

6 “With reference to the distinction between discourse analysis and conversation analysis
as proposed by Levinson (1983: 286), our concern is exclusively with the former”
(Heine et al. 2013: 157).

To conclude, while corpus linguistic analysis aims to describe the formal, func-
tional and contextual features of spoken phenomena, Construction Grammar adds
a deeper understanding of how formal and functional features are motivated by a
theory of spoken communication.
A constructional approach is also useful to compare discourse markers and
related elements which belong to the same functional class and share some for-
mal, functional or contextual features. We can for instance contrast the adversative
markers actually and in fact on the basis of the example below from the ICE-GB
Corpus (actually occurs in the original text). The markers are semantically related
since they refer to actuality and truth but are used in ways where this meaning
seems to have disappeared:
(11) A: But working in this group
It’s different in terms of uhm the way that you have to dance
Actually you have to be much more honest about what you’re doing
(adapted from ICE-GB; Aijmer 2013: 112)

Actually and in fact would both have the meaning elaboration in this example fur-
ther indicated by their structural position and stress. In a constructional analysis we
can show both in what respects they are similar and how they differ.

                      In fact                              Actually
Formal features
Position              Most frequently initial              Most frequently medial
Prosody               Both stressed and unstressed         Both stressed and unstressed
                      Occurs with pauses                   Does not occur with pauses
Typical collocations  But                                  Well, and, but
Function
Textual               Primarily elaborative                Primarily adversative
Interpersonal         Taking up an argumentative stance    Tentativeness
Style                 Formal                               Solidarity
Specialized meanings  The use in cross-examinations by     The use in the classroom by
                      the examiner to ask questions to     the tutor to present explanations
                      which both the speaker and the       in a particular order
                      hearer know the answer
(based on Aijmer 2013: 124)
Figure 1: Formal and functional features of in fact and actually

There is not a categorical difference between in fact and actually, but they are
related in complex ways which also involve frequencies. With regard to their for-
mal properties in fact and actually are both positionally variable, but in fact is
more frequent initially. Prosodically only in fact occurs with pauses. But is most
typical as a collocate of in fact while well frequently co-occurs with actually. The
contexts in which in fact and actually are used are both textual and interpersonal.
The markers can be elaborative, as in the example above, or adversative, although
actually is more frequently used with adversative function. On the interpersonal
level, in fact is rhetorical and argumentative while actually tends to be tentative.
Other differences have to do with politeness or style. A distinction can, for exam-
ple, be made between the more formal in fact and actually which marks solidarity.
In addition, both markers have specialized meanings depending on the activity
type, as illustrated by the use of in fact in cross-examination questions and actually
introducing explanations in the classroom.
Summing up, there has been a growing interest in establishing a framework
which is compatible with corpus-linguistic descriptions of spoken elements and
what we know about interaction and the use of language in an ethnographic and
sociolinguistic perspective. The theoretical models discussed in this section have
in common that they do not focus either on form or function but analyse discourse
markers and related elements as units of form and function (inserts, theticals, con-
structions). Constructional approaches, in particular, have also been influenced by
directions taking a conversation analytic or interactional perspective on language
function.

7. Extending the analysis to other areas of research where the corpus-pragmatic framework is suitable

The discussion in this section will turn to some other spoken phenomena which
are important in pragmatics, discourse and interaction. These are interjections,
address terms and hesitation signals. They have in common with discourse markers
that they are best described in a corpus-pragmatic perspective taking into account
formal, functional and contextual factors.

7.1. Interjections
Interjections have been described as “relatively conventionalised vocal gestures
(or more generally, linguistic gestures) which express a speaker’s mental state,
action, attitude or reaction to a situation” (Ameka 1992: 106). They are special
because they can represent “sounds” (tskt, ouch) and can be prosodically pro-
longed as illustrated by ooh and oooh. From a different perspective interjections
have been referred to as “those little words or ‘non-words’ whose main character-
istic is being (phonologically and morphologically) anomalous” (Cuenca 2000:
29 with reference to Ameka 1992). Their anomalous properties can, however, be
explained if they are treated as units of form and function in a corpus-pragmatic
approach.
The features needed to define them range from formal ones (morphological,
prosodic, position in the utterance) to a specification of their functions and con-
textual constraints. Like discourse markers they are “non-conformist” in that they
do not enter into syntactic relations with other word classes (Crystal 1992: 190;
cf. also Markus 2014: 118). They can stand alone or occur in the left periphery of the
utterance as inserts (Biber et al. 1999). Another feature characteristic of the way
they are used is that they can combine with other interjections or discourse markers
(e. g. oh god, oh I see).
Oh seems to be a good example of how function can be linked to formal and
contextual factors and explained in an interactional perspective:

(12) A: I was in a pub we were at the Covent Garden Festival
B: m
A: and he was there and he was with- he was working for the chap who wrote Martin
Luther’s crusade for the People
B: O\h (pause) Edward Somerset
A: that’s right
(simplified example from the London-Lund Corpus of Spoken English; Aijmer 2002:
121).

Oh is typically associated with the expression of emotions such as surprise or
pleasure. However, this is only one of the functions that it can have. The fact that
it occurs initially in the utterance pointing backwards in the context goes a long
way towards determining other functions of oh in the interaction. In the example
above, for instance, it occurs after a statement when a previously uninformed
speaker (Speaker B) receives new information or suddenly remembers something
(Heritage 1984). Oh in this function is also linked to prosody; it is a separate tone
unit (as indicated by the following pause) and it is pronounced with a falling tone
(cf. Aijmer 2002: 109). Similarly, Local (1996: 180–183) draws attention to the
relationship between oh as a news receipt and a falling pitch movement.
As with discourse markers we also need to take into account how features of
the social activity affect the ways in which the interjection is used. Norrick (2008),
for example, found that interjections were used in conversational narratives by
the teller of the story in prefaces justifying the “tellability” of the story and by the
listener as a stand-alone marker signalling ‘active listenership’ as in the following
example:
(13) Gloria: didn’t give the people enough time to get off the train.
Elizabeth and about four or five other people.
Matthew: gosh
Gloria: couldn’t get off and they had to go to the next station
(example taken from Norrick 2008: 448)

7.2. Hesitation markers

Hesitation markers have been studied mainly by psycholinguists. If we want to
find out how “hesitators” are actually used we need a corpus-based approach taking
into account frequencies, formal features, distribution and pragmatic interpretation.
The example below illustrates the use of a hesitation marker as an insert outside
the clause:
(14) Oh yes Oh erm, but er, you know, wh – whether it’ll be a good thing (British National
Corpus)

As pointed out by Clark and Fox Tree (2002: 79; see also Tottie 2011: 175), lexicographers
have been slow or unwilling to recognize the status of er (uh) and erm (um) as
words. It can, however, be argued that hesitators are conventional just like words
and that the patterns relating forms to their recurring function should be described
in a systematic way on the basis of corpora.
Both formal and functional features are needed to characterize hesitators. They
perform functions related to turn-taking in the discourse activity and they can be
described both by their invariant meaning ‘I am thinking’ (Fischer 2006b: 432) and
with regard to placement in the utterance and prosody. Like discourse markers and
interjections, they frequently co-occur with other discourse items (er you know,
well erm). They are found in many different positions depending on their function
in the discourse.
Corpus-based methods have shown that hesitation markers or “planners” (Tot-
tie 2011), such as er and erm, perform a wide range of uses controlling and organ-
izing discourse depending on formal factors such as position. Time needed for
planning or word search is an obvious function of er/erm in the online production
of discourse, but er/erm can also have interactive functions as suggested by previ-
ous work. Kjellmer (2003) referred to the role of hesitation markers in turn-taking,
attracting attention, highlighting significant elements in the utterance and correct-
ing part of the utterance (cf. Gilquin and de Cock 2011: 153).
Formal factors such as position are important clues to function. In a study
of hesitation phenomena in the British National Corpus, Rühlemann (2007: 161)
observed that “the different distributions across utterance positions are correlated
to the functions er and erm perform in discourse”. In utterance-initial position er/
erm is said to serve as a “turn-bidder” by means of which the speaker signals his/
her willingness to take the turn. Utterance-internal er/erm is a means for holding
the turn while turn-final er/erm can be seen as a “turn-yielder”.
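The positional distribution that underlies this kind of analysis can be approximated by tallying hesitators according to where they occur in the utterance. The Python sketch below is a simplified illustration, not Rühlemann’s procedure: the utterances are invented, the function `position_counts` is hypothetical, and the three-way split into initial, medial and final position is an assumption made for the example.

```python
from collections import Counter

HESITATORS = {"er", "erm"}

def position_counts(utterances):
    """Count hesitators by their position (initial, medial, final) in each utterance."""
    counts = Counter()
    for tokens in utterances:
        last = len(tokens) - 1
        for i, token in enumerate(tokens):
            if token.lower() not in HESITATORS:
                continue
            if i == 0:
                # A one-word utterance is counted as initial in this sketch.
                counts["initial"] += 1
            elif i == last:
                counts["final"] += 1
            else:
                counts["medial"] += 1
    return counts

# Invented utterances, already split into tokens.
DATA = [
    ["erm", "I", "think", "so"],
    ["it", "was", "er", "quite", "late"],
    ["so", "we", "left", "er"],
    ["er"],
]

print(position_counts(DATA))
```

Position is only a clue to function. Whether an initial er really bids for the turn, or a final one yields it, still has to be established by inspecting the surrounding interaction.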
The functions of hesitators are also sensitive to interaction type. Tottie (2016)
found a wide spread of er/erm frequencies over the texts in the Santa Barbara
Corpus of Spoken American English (SBCSAE). However, there were differences
depending on the type of activity and the speaker. A task-related interaction in a
small-claims court with the judge “summarizing claims and striving for precision,
and litigants also weighing their words” contained more examples of er/erm for
planning than conversation (Tottie 2016: 102).
Hesitators can also have a special role when they are used by fictional char-
acters. Jucker (2015a) discussed uh and um7 in their functions as planners in the
Corpus of Historical American English (COHA) in examples where the authors
used these elements in a more deliberate way to characterize speakers. The most
important function of uh and um was hesitation and planning but they were also
used in some examples by the authors to signal that “certain utterances are some-
what embarrassing or awkward for the speaker” or “that a speaker who uses a
planner actually is lying” (Jucker 2015a: 176).
In another study Jucker (2015b) investigated the high frequency of hesitation
signals in a mock science fiction text known for its eccentric characters and bizarre
conversations. The many hesitations and drawn-out spellings of uh and um were
shown to reinforce the characterizations of the fictional characters rather than
describe their speech thus providing a rich data source for literary pragmatics.

7 Uh and um are the American counterparts of British English er and erm.

7.3. Address terms

Address terms or vocatives are words used to address a person in the interaction.
They are often neglected since they can so easily be omitted from the sentence
without any change of the propositional content. They have special formal and
functional properties which can be explained by considering them as constellations
of form and function.
Both formally and functionally they are distinguished from discourse markers
and interjections. Formally they look like nouns, but as address terms they have
a special position in the utterance. Address terms such as man, darling, sir, guys,
mate, dude are typically added to the beginning or end of the clause as inserts. They
can also stand alone and they frequently occur with other elements as in hey man.
The interpretation of address terms depends on what they are doing in the
local discourse as well as their function in the broader social situation. Biber et al.
(1999: 1112) take a closer look at “vocatives” (their term) and suggest that they
can be classified as endearments (darling, dear), kinship terms (mom),
familiarizers (mate), first names, title and names, and honorifics (sir, madam). The types
differ in frequency, with first names the most frequent, and they are placed
initially or finally depending on the length of the clausal unit to which they are
attached. They have an attention-getting function, but they may also serve
to maintain and reinforce relationships between the speaker and the hearer. In
terms of sequential positioning, they are used both to open a communicative act (or
a sequence of acts) and in closing sequences.
An initial address form can, for instance, be used in the context of greetings. It
can also have “the function of clearing space for a lengthy turn” (Leech 1999: 117)
or introduce a new topic. In the face of challenges or disagreements the vocatives
have a softening function.
More generally, we also need to describe, in sociolinguistic terms, who uses a
form such as mate, man or guys and the situations in which it is used.
Rendle-Short (2010: 1216), for example, says that mate as a form of address in
Australian English “can be used to most people, whether you know them or not, but one
should be more cautious addressing women over fifty by ‘mate’ as they report that
they do not like the term”.
Summing up, there is a large number of phenomena which are unique to spoken
language but would be abnormal from the perspective of written grammar. Such
phenomena can, however, be described in a corpus-pragmatic approach which
explains lexical expressions and constructions in the light of principles according
to which interaction takes place. Here belong a large number of items which have
in common that they have certain formal and functional properties and need to be
described in a form-to-function perspective (as inserts, theticals or constructions).

8. Conclusion

For a long time corpus linguistics and pragmatics were separated by their differ-
ent areas of research. Corpus linguists were mainly occupied with grammar and
lexicography and developed statistical methods to better deal with large amounts
of data. Scholars interested in pragmatics, on the other hand, were concerned with
describing the use of language for communication on the basis of conversational
data. More recently there has been a “rapprochement” between the two disciplines
reflected in the use of corpora and corpus-linguistic methods to describe how lexi-
cal units are assigned functions in the speech situation. The shared area of research
has broadened as witnessed by the overview in this article where I have aimed to
show how corpus methods can be used to study the forms and functions of prag-
matic elements and how new corpora provide data for the sociolinguistic turn in
pragmatics. The corpus-pragmatic method can be used both to analyse spoken
data and to address problems of methodology involving their form and function.
Corpus findings have much to contribute to the analysis of pragmatic phenomena
by providing information about their distribution in different positions, colloca-
tions, prosodic patterns, frequencies in different text types, etc. In particular, the
corpus-pragmatic procedure is capable of dealing with complex form-to-function
relations which are not categorical but involve frequencies.
The drift of this article has been to argue that we need to combine corpus find-
ings with a dialogic view of the interaction in order to find a motivation for the
similarities or symmetries between form and function. However, this may well be
a long-term project requiring spoken corpora for many different languages and the
analysis of the use and functions of many different pragmatic items.

References

Ädel, Annelie and Randi Reppen
2008 Corpora and Discourse: The Challenges of Different Settings. Amsterdam/
Philadelphia: John Benjamins.
Aijmer, Karin
2002 English Discourse Particles: Evidence from a Corpus. Amsterdam/Philadel-
phia: John Benjamins.
Aijmer, Karin
2013 Understanding Pragmatic Markers: A Variational Pragmatic Approach. Edin-
burgh: Edinburgh University Press.
Aijmer, Karin
2016a Revisiting actually in different positions in some national varieties of English.
In: Francisco Alonso Almeida, Laura Cruz García, and Víctor González Ruiz
(eds.), Corpus-based Studies on Language Varieties, 115–144. Frankfurt am
Main: Peter Lang.
Aijmer, Karin
2016b Pragmatic markers as constructions: The case of anyway. In: Gunther Kalten-
böck, Evelien Keizer and Arne Lohmann (eds.), Outside the Clause, 29–57.
Amsterdam/Philadelphia: John Benjamins.
Altenberg, Bengt
1987 Prosodic Patterns in Spoken English. Studies in the Correlations Between
Prosody and Grammar for Text-to-Speech Conversion, (Lund Studies in Eng-
lish 76). Lund: Lund University Press.
Ameka, Felix K.
1992 Interjections: The universal yet neglected part of speech. Journal of Pragmat-
ics 18: 101–118.
Andersen, Gisle
2001 Pragmatic Markers and Sociolinguistic Variation. Amsterdam/Philadelphia:
John Benjamins.
Barron, Anne and Klaus P. Schneider
2009 Variational pragmatics: Studying the impact of social factors on language use
in interaction. Intercultural Pragmatics 6(4): 425–442.
Beeching, Kate
2016 Pragmatic Markers in British English. Meaning in Social Interaction. Cam-
bridge: Cambridge University Press.
Beeching, Kate and Ulrich Detges
2014 Discourse Functions at the Left and Right Periphery. Cross-linguistic Investi-
gations of Language Use and Language Change, 1–23. Leiden: Brill.
Beeching, Kate and Helen Woodfield
2015 Introduction. In: Kate Beeching and Helen Woodfield (eds.), Researching
Sociopragmatic Variability. Perspectives from Variational, Interlanguage and
Contrastive Methodology, 1–16. Basingstoke: Palgrave Macmillan.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad and Edward Finegan
1999 The Longman Grammar of Spoken and Written English. London: Longman.
Brinton, Laurel J.
1996 Pragmatic Markers in English. Grammaticalization and Discourse Functions.
Berlin and New York: Mouton de Gruyter.
Brinton, Laurel J.
2008 The Comment Clause in English. Syntactic Origin and Pragmatic Develop-
ment. Cambridge: Cambridge University Press.
Carter, Ronald and Svenja Adolphs
2008 Spoken Corpus Linguistics. From Monomodal to Multimodal. New York/Lon-
don: Routledge.
Clark, Herbert H.
1996 Using Language. Cambridge: Cambridge University Press.
Clark, Herbert H.
2004 Pragmatics of language performance. In: Laurence R. Horn and Gregory Ward
(eds.), Handbook of Pragmatics, 365–382. Oxford: Blackwell.
Clark, Herbert H. and Jean E. Fox Tree
2002 Using uh and um in spontaneous speaking. Cognition 84: 73–111.
Croft, William
2001 Radical Construction Grammar: Syntactic Theory in Typological Perspective.
Oxford: Oxford University Press.
Croft, William and D. Alan Cruse
2004 Cognitive Linguistics. Cambridge: Cambridge University Press.
Crystal, David
1992 An Encyclopedic Dictionary of Language and Languages. Harmondsworth:
Penguin.
Cuenca, Maria Josep
2000 Defining the indefinable? Interjections. Syntaksis 3: 29–44.
Dines, Elizabeth R.
1980 Variation in discourse –“and stuff like that”. Language in Society 9: 13–33.
Dingemanse, Mark, Joe Blythe and Tyko Dirksmeyer
2014 Formats for other-initiation of repair across languages: An exercise in prag-
matic typology. Studies in Language 38(1): 5–43.
Erman, Britt
1987 Pragmatic Expressions in English: A Study of you know, you see and I mean
in Face-to-Face Conversation. (Stockholm Studies in English 69). Stockholm:
Almqvist and Wiksell.
Ferrara, Kathleen
1997 Form and function of the discourse marker anyway: Implications for discourse
analysis. Linguistics 35: 345–378.
Ferrara, Kathleen
2001 Sample analysis: Intonation in discourse markers – The case of anyway. In:
Ann Wennerstrom (ed.), The Music of Everyday Speech: Prosody and Dis-
course Analysis, 117–129. New York: Oxford University Press.
Fischer, Kerstin
2006a Towards an understanding of the spectrum of approaches to discourse par-
ticles. Introduction to the volume. In: Kerstin Fischer (ed.), Approaches to
Discourse Particles, 1–20. Amsterdam: Elsevier.
Fischer, Kerstin
2006b Frames, constructions, and invariant meanings: The functional polysemy of
discourse particles. In: Kerstin Fischer (ed.), Approaches to Discourse Parti-
cles, 427–447. Amsterdam: Elsevier.
Fried, Mirjam and Jan-Ola Östman
2005 Construction grammar and spoken language: The case of pragmatic particles.
Journal of Pragmatics 37: 1752–1778.
Ghezzi, Chiara and Piera Molinelli
2014 Discourse and pragmatic markers from Latin to the Romance languages: New
insights. In: Chiara Ghezzi and Piera Molinelli (eds.), Discourse and Prag-
matic Markers from Latin to the Romance Languages, 1–9. Oxford: Oxford
University Press.
Gilquin, Gaëtanelle and Sylvie De Cock
2011 Introduction. Errors and disfluencies in spoken corpora. Setting the scene.
International Journal of Corpus Linguistics 16(2): 141–172.
Golato, Andrea and Peter Golato
[this volume] Ethnomethodology and conversation analysis.
Goldberg, Adèle
1995 A Construction Grammar Approach to Argument Structure. Chicago: Univer-
sity of Chicago Press.
Grosz, Barbara and Candace L. Sidner
1986 Attention, intentions, and the structure of discourse. Computational Linguis-
tics 12: 175–204.
Halliday, M.A.K. and Ruqaiya Hasan
1976 Cohesion in English. London: Longman.
Hansen, Maj-Britt Mosegaard
2014 Cyclicity in semantic/pragmatic change: The medieval particle ja between
Latin IAM and Modern French déjà. In: Chiara Ghezzi and Piera Molinelli
(eds.), Discourse and Pragmatic Markers from Latin to the Romance Lan-
guages, 139–165. Oxford: Oxford University Press.
Haselow, Alexander
2013 Arguing for a wide conception of grammar: The case of final particles in spo-
ken discourse. Folia Linguistica 47(2): 375–424.
Heine, Bernd
2013 On discourse markers: Grammaticalization, pragmaticalization, or something
else? Linguistics 51(6): 1205–1247.
Heine, Bernd, Gunther Kaltenböck, Tania Kuteva and Haiping Long
2013 An outline of discourse grammar. In: Shannon T. Bischoff and Carmen Jany
(eds.), Functional Approaches to Language, 155–206. Berlin: de Gruyter
Mouton.
Heritage, John H.
1984 A change-of-state token and aspects of its sequential placement. In: J. Max-
well Atkinson and John Heritage (eds.), Structures of Social Action. Studies in
Conversation Analysis, 299–345. Cambridge: Cambridge University Press.
Hirschberg, Julia and Diane Litman
1987 Now let’s talk about “now”: Identifying cue phrases intonationally. ACL–87,
163–171. Stanford, CA.
Holmes, Janet
1986 Functions of you know in women’s and men’s speech. Language in Society 15:
1–22.
Innes, Bronwen
2010 “Well, that’s why I asked the question sir”: Well as a discourse marker in court.
Language in Society 39: 95–117.
Jefferson, Gail
1978 Sequential aspects of storytelling in conversation. In: Jim Schenkein (ed.),
Studies in the Organization of Conversational Interaction, 219–248. New
York: Academic Press.
Jucker, Andreas H.
2015a Uh and um as planners in the Corpus of Historical American English. In: Irma
Taavitsainen, Merja Kytö, Claudia Claridge and Jeremy Smith (eds.), Devel-
opments in English: Expanding Electronic Evidence, 162–177. Cambridge:
Cambridge University Press.
Jucker, Andreas H.
2015b Pragmatics of fiction: Literary uses of uh and um. Journal of Pragmatics 86:
63–67.
Jucker, Andreas H. and Irma Taavitsainen
2012 Pragmatic variables. In: Juan Manuel Hernández-Campoy and Juan Camilo
Conde-Silvestre (eds.), The Handbook of Historical Sociolinguistics,
293–306. Oxford: Blackwell.
Jucker, Andreas H. and Yael Ziv (eds.)
1998 Discourse Markers: Description and Theory. Amsterdam and Philadelphia:
John Benjamins.
Kaltenböck, Gunther and Bernd Heine
2014 Sentence grammar vs. thetical grammar: Two competing domains? In: Brian
MacWhinney, Andrej Malchukov and Edith Moravcsik (eds.), Competing Moti-
vations in Grammar and Usage, 348–363. Oxford: Oxford University Press.
Kay, Paul and Charles J. Fillmore
1999 Grammatical constructions and linguistic generalizations: The What’s X doing
Y construction? Language 75: 1–33.
Kjellmer, Göran
2003 Hesitation. In defence of ER and ERM. English Studies 84: 170–198.
Knight, Dawn and Svenja Adolphs
2008 Multi-modal corpus pragmatics: The case of active listenership. In: Jesús
Romero-Trillo (ed.), Pragmatics and Corpus Linguistics. A Mutualistic
Entente, 175–190. Berlin/New York: Mouton de Gruyter.
Leech, Geoffrey
1999 The distribution and function of vocatives in American and British English
conversation. In: Hilde Hasselgård and Signe Oksefjell (eds.), Out of Corpora:
Studies in Honour of Stig Johansson, 107–118. Amsterdam: Rodopi.
Lenk, Uta
1998 Marking Discourse Coherence. Functions of Discourse Markers in Spoken
English. Tübingen: Narr.
Levinson, Stephen C.
1979 Activity types and language. Linguistics 17: 365–379.
Levinson, Stephen C.
1983 Pragmatics. Cambridge: Cambridge University Press.
Lewis, Diana
2006 Discourse particle: a discourse-pragmatic category. In: Kerstin Fischer (ed.),
Approaches to Discourse Particles, 43–60. Oxford: Elsevier.
Linell, Per
1998 Approaching Dialogue. Talk, Interaction and Contexts in Dialogical Perspec-
tives. Amsterdam: John Benjamins.
Linell, Per
2009 Rethinking Language, Mind, and World Dialogically. Interactional and Contex-
tual Theories of Human Sense-making. Charlotte, N.C.: Information Age Publ.
Linell, Per
2010 Communicative activity types as organisations in discourses and discourses
in organisations. In: Sanna-Kaisa Tanskanen, Marja-Liisa Helasvuo,
Marjut Johansson and Mia Raitaniemi (eds.), Discourses in Interaction, 22–59.
Amsterdam/Philadelphia: John Benjamins.
Local, John
1996 Conversational phonetics: Some aspects of news receipts in everyday talk. In:
Elizabeth Couper-Kuhlen and Margaret Selting (eds.), Prosody in Conversa-
tion, 177–230. Cambridge: Cambridge University Press.
Markus, Manfred
2014 Spoken features of interjections in English dialect (based on Joseph Wright’s
English Dialect Dictionary). In: Irma Taavitsainen, Merja Kytö, Claudia Clar-
idge and Jeremy Smith (eds.), Developments in English. Expanding Electronic
Evidence, 116–134. Cambridge: Cambridge University Press.
Norrick, Neal
2008 Using large corpora of conversation to investigate narrative: The case of inter-
jections in conversational storytelling performance. International Journal of
Corpus Linguistics 13(4): 438–464.
Ochs, Elinor
1996 Linguistic resources for socializing humanity. In: John J. Gumperz and Ste-
phen C. Levinson (eds.), Rethinking Linguistic Relativity, 407–437. Cam-
bridge: Cambridge University Press.
O’Keeffe, Anne and Svenja Adolphs
2008 Response tokens in British and Irish discourse: Corpus, context and varia-
tional pragmatics. In: Anne Barron and Klaus P. Schneider (eds.), Variational
Pragmatics, 69–98. Amsterdam: John Benjamins.
Ono, Tsuyoshi and Sandra A. Thompson
1995 What can conversation tell us about syntax? In: Philip W. Davis (ed.),
Alternative Linguistics: Descriptive and Theoretical Modes, 213–271. Philadelphia:
John Benjamins.
Östman, Jan-Ola
1981 You Know. A Discourse-functional Approach. Amsterdam: John Benjamins.
Östman, Jan-Ola
1995 Pragmatic particles twenty years after. In: Brita Wårvik, Sanna-Kaisa Tan-
skanen and Risto Hiltunen (eds.), Organization in Discourse, 95–108. Turku:
University of Turku.
Östman, Jan-Ola
2006 Constructions in cross-linguistic research: Verbs as pragmatic particles in
Solv. In: Karin Aijmer and Anne-Marie Simon-Vandenbergen (eds.), Prag-
matic Markers in Contrast, 237–257. Amsterdam: Elsevier.
Polanyi, Livia
1985 Conversational storytelling. In: Teun van Dijk (ed.), Handbook of Discourse
Analysis, Volume 3: Discourse and Dialogue, 183–202. New York: Academic
Press.
Redeker, Gisela
1986 Language use in informal narratives: Effects of social distance and listener
involvement. Ph.D dissertation, University of California, Berkeley.
Redeker, Gisela
1990 Ideational and pragmatic markers of discourse structure. Linguistics 29: 1139–
1172.
Reichman, Rachel
1978 Conversational coherency. Cognitive Science 2: 283–327.
Rendle-Short, Johanna
2010 “Mate” as a term of address in ordinary interaction. Journal of Pragmatics 42:
1201–1218.
Rühlemann, Christoph
2007 Conversation in Context. A Corpus-driven Approach. London/New York:
Continuum.
Rühlemann, Christoph and Karin Aijmer
2015 Corpus pragmatics: Laying the foundations. In: Karin Aijmer and Christoph
Rühlemann (eds.), Corpus Pragmatics. A Handbook, 1–26. Cambridge: Cam-
bridge University Press.
Sacks, Harvey, Emanuel Schegloff and Gail Jefferson
1974 A simplest systematics for the organization of turn-taking for conversation.
Language 50: 696–735.
Schegloff, Emanuel
1996 Turn-organization: On the intersection of grammar and interaction. In: Elinor
Ochs, Emanuel Schegloff and Sandra A. Thompson (eds.), Interaction and
Grammar, 52–133. Cambridge: Cambridge University Press.
Schegloff, Emanuel, Elinor Ochs and Sandra A. Thompson
1996 Introduction. In: Elinor Ochs, Emanuel Schegloff and Sandra A. Thompson
(eds.), Interaction and Grammar, 1–51. Cambridge: Cambridge University
Press.
Schiffrin, Deborah
1987 Discourse Markers. Cambridge: Cambridge University Press.
Schourup, Lawrence
1985 Common Discourse Particles in English Conversation. New York: Garland.
Stenström, Anna-Brita
1989 Discourse signals: Towards a model of analysis. In: Harald Weydt (ed.), Spre-
chen mit Partikeln, 561–574. Berlin: de Gruyter.
Torgersen, Eivind and Costas Gabrielatos
2009 A corpus-based study of invariant tags in London English. In: Meeting of the
Corpus Research Group. Lancaster University. (Unpublished).
http://eprints.lancs.ac.uk/id/eprint/26026.
Tottie, Gunnel
2011 Uh and um as sociolinguistic markers in British English. The International
Journal of Corpus Linguistics 16: 173–196.
Tottie, Gunnel
2016 Planning what to say: Uh and um among the pragmatic markers. In: Gunther
Kaltenböck, Evelien Keizer and Arne Lohmann (eds.), Outside the Clause.
Amsterdam/Philadelphia: John Benjamins.
Wichmann, Anne, Anne-Marie Simon-Vandenbergen and Karin Aijmer
2010 How prosody reflects semantic change: A synchronic case study of of course.
In: Hubert Cuyckens, Kristin Davidse and Lieven Vandelanotte (eds.), Sub-
jectification, Intersubjectification and Grammaticalisation, 103–154. Berlin:
Mouton de Gruyter.
23. Corpus-based function-to-form approaches
Anne O’Keeffe

Abstract: This chapter sets out to explore the options for function-to-form research
in the context of corpus pragmatics. Corpus-based function-to-form research
approaches are used in pragmatics research to explore speech acts and related
phenomena, using the function rather than the form as the starting point. Corpus
studies more commonly begin with a form and, in pragmatic studies, work towards
the functional analysis of these forms (i. e. form-to-function approach). However,
when looking at a particular speech act, it can be challenging to find it in a corpus
using form-based searches. It is possible to look at a dataset manually so as to
code all instances of the speech act in the corpus; however, there is a threshold
of corpus size beyond which this becomes impractical. Other systematic options
and solutions have emerged such as using Illocutionary Force Indicating Devices
(IFIDs) (e. g. sorry for apologies), typical features (e. g. positive adjectives, such
as beautiful, for compliments) and metacommunicative expressions (e. g. using
the word compliment to retrieve compliments). The paper will also look at some
emerging approaches based on using collocational profiles of IFIDs to identify
speech acts in very large corpora.

1. Introduction

Within what is termed the “empirical turn” in linguistic research (Taavitsainen
and Jucker 2015), corpus linguistics (CL) has spread its application to many
sub-fields as well as remaining a robust sub-field in its own right. As Ädel and Reppen
note, however, in relation to CL’s paradigmatic dominance, “some subfields are
more amenable to corpus-linguistic methodology than others” (2008: 1). Pragmat-
ics is one of the sub-fields to take on this data-driven empirical methodology even
though it already had established means of collecting empirical (elicited) data,
mainly through Discourse Completion Tasks (DCTs) and role-plays, which are
especially widespread in the context of the study of contrastive second language
pragmatic competence (Blum-Kulka et al. 1989; Sasaki 1998; Billmyer and Vargh-
ese 2000) (for more on DCTs, see chapter 9, this volume). Bringing a CL method-
ology to pragmatic studies is not without its challenges, as this paper will discuss.
The default analytical approach inherent in CL is to move from frequencies of
forms to their functions (via an inductive process). In other words, it takes a pri-
marily form-to-function approach to analysing data (see Aijmer, this volume). For
those involved in the study of pragmatics, and especially speech acts, and related
phenomena, the norm is to work in the opposite direction, starting with a specific
pragmatic function and, through means of carefully designed elicitation tasks, to
work from the function under investigation to the forms which are typically used.
This is referred to as a function-to-form approach.

https://doi.org/10.1515/9783110424928-023
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 587–618. Berlin/
Boston: De Gruyter Mouton.

Aijmer (this volume) notes that, because of its inductive process, a
form-to-function approach allows forms to be studied with great precision
with regard to frequency, distribution, positions, and collocations with different
functions. Rühlemann and Aijmer (2015) point out, however, that the
form-to-function approach can be weak at identifying all of the instances of a particular
function, as it is form driven. So, while CL aligns well with the core
principle of pragmatics that meaning is not a stable counterpart of linguistic form,
this is also its weakness when using a form-to-function approach. This challenge is
referred to by Taavitsainen and Jucker (2015: 12), who say that while pragmatics
has embraced the “empirical turn”, and other developments in linguistics over the
years, “corpus linguistics came into pragmatics later” because “core features of
pragmatics studies, such as negotiation of meanings, speech functions, and varia-
bility of language use with momentary shifts in interpersonal relations, are harder
to catch with corpus methodology than lexical or morpho-syntactic features” (see
also Romero-Trillo 2008; Brinton 2012; Rühlemann and Aijmer 2015). Rome-
ro-Trillo (2008: 2) refers to CL and pragmatics as being fields that were “par-
allel but often mutually exclusive”. However, as more CL researchers draw on
pragmatics to help analyse their data, and more pragmatics research questions are
addressed using corpus data, we are now at the point where we talk about “cor-
pus pragmatics” as an emerging field (see Jucker 2013; Rühlemann and Aijmer
2015).
Within the new coinage of “corpus pragmatics”, more consideration is being
given to how best to use CL for pragmatics research. Rühlemann and Aijmer
(2015) explain that corpus pragmatics combines the key methodologies of both
fields. They point out that the traditional vertical reading of corpus data (typically
in concordances) needs to be balanced with the more horizontal reading of the con-
textual details that are required to fully understand pragmatic phenomena (see also
Rühlemann and Clancy forthcoming). However, this vertical and horizontal bal-
ance presupposes that one begins by searching for a form and that one then works
towards the balanced and contextualised analysis of its function(s) (i. e. form-to-
function). This paper takes as its starting point the more traditional function-to-
form research route of pragmatics analysis and considers this opposite methodo-
logical route in the context of corpus pragmatics. We will consider whether CL
is fit for purpose for this traditional approach within pragmatics research.
Essentially, given the importance of continuing the functional investigation of language
in use, especially through the study of speech acts, there is a need to consider
whether, and how best, this work can be done using CL methodologies. Lutzky and
Kehoe (2017a) problematize this in relation to the study of speech acts and other
pragmatic phenomena in large corpora. They say that, for the most part, speech acts
cannot be identified automatically due to the fact that:
1) forms may be produced in a potentially infinite number of ways and,
2) forms which are prototypically associated with a specific speech act (e. g. sorry) may
also be attested with other functions (e. g. a sorry state). (Lutzky and Kehoe 2017a: 38)

As a result, according to Lutzky and Kehoe, corpus studies of speech acts and
related phenomena tend to be conducted using smaller, manually annotated corpora
and tend to “resort to manual forms of analysis, or to adopt eclectic approaches,
focusing for instance on specific speech act verbs” (2017a: 38).
In recent studies, this conundrum is being addressed and solutions and
workarounds are emerging, as we shall discuss below. To begin with, we shall cast
a cautious eye on the use of corpus data in the study of pragmatics. Then, we shall
explore possible approaches for function-to-form research within the context of
corpus pragmatics for both small and large datasets.

2. Some caveats of corpus data for the study of pragmatics

At first sight, it might seem obvious that anyone wanting to investigate a
pragmatic feature nowadays would first go to a corpus and start by looking at forms
and frequencies related to that feature. CL seems to offer so much more in terms of
language range and distribution across speakers or writers than data elicited from
role-plays or DCTs. Often these corpus data are readily (and often freely) available,
in abundance. There are some caveats, however, in terms of the seeming wealth of
naturally-occurring language that is available for pragmatics research.

2.1. The challenge of functional diversity and ambiguity

With the abundance of naturally-occurring language data (in electronic form)
comes the downside for pragmatics research in the form of functional diversity
and ambiguity. A corpus, by its nature, is a sizeable sample of language. A corpus
of one million words of language is considered “small” (O’Keeffe, McCarthy, and
Carter 2007: 4). Corpora of fewer than one million words are usually individual
enterprises where one researcher has gathered data of a very specific nature to
address a particular research question. The boon of corpus quantity brings with it
the downside of having a greater remove, as a researcher, from contextual detail
and richness which is core to the analysis of pragmatics. Let us consider a brief
example: if we opt to look at the speech act of apology, we could immediately
look up the direct speech act by searching for a prototypical Illocutionary Force
Indicating Device (IFID) for apologising, such as sorry, in a corpus. For the pur-
poses of this example, I will use The Limerick Corpus of Irish English (LCIE), a
one-million-word corpus of spoken Irish English, mostly comprising recordings of
casual conversations between family and friends (see Farr, Murphy and O’Keeffe
2004 for a detailed description).
In the sample of one million words, corpus software will instantly find 363
occurrences of sorry. In so doing, we have taken one form associated with a speech
act and we hope that it generates, or “recalls”, instances of the act. Unfortunately,
this is only the beginning of the challenge. Because pragmatics takes as its start-
ing point the notion that meaning is not “a stable counterpart of linguistic form.
Rather it is dynamically generated in the process of using language” (Verschueren
1999: 10), we cannot, of course, assume a direct correlation between the form and
its function as an apology. As example (1) illustrates, the IFID proves unreliable
as a means of recalling all, and only, instances of apologies. In this example, the
search word, or IFID, sorry, is functioning not as an apology but as a request for
clarification used by the listener:
(1) <$1> and <$2> mark speakers one and two, respectively. Two sisters are talking. One
sister, <$2>, is telling a story about a derelict house.
<$2> In the window when I was down there.
<$1> Sorry?
<$2> There was these kinds of bags of sugar in the window.
<$1> Yeah yeah.
(LCIE)
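
The form-based search described above can be illustrated with a minimal keyword-in-context (KWIC) sketch. It is not the software used for LCIE; the function name kwic, the toy data structure and the two lines marked as invented are assumptions made purely for illustration. The first four turns reproduce example (1), and "a sorry state" echoes the non-apology use cited from Lutzky and Kehoe (2017a).

```python
import re

def kwic(utterances, form, width=30):
    """Return (speaker, left context, hit, right context) for each whole-word match."""
    pattern = re.compile(rf"\b{re.escape(form)}\b", re.IGNORECASE)
    rows = []
    for speaker, text in utterances:
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            rows.append((speaker, left, m.group(0), right))
    return rows

# Toy data: the turns of example (1) plus two invented lines for contrast.
utterances = [
    ("<$2>", "In the window when I was down there."),
    ("<$1>", "Sorry?"),
    ("<$2>", "There was these kinds of bags of sugar in the window."),
    ("<$1>", "Yeah yeah."),
    ("<$3>", "I'm so sorry I'm late."),           # invented: an apology
    ("<$3>", "The house was in a sorry state."),  # invented: adjectival use
]

hits = kwic(utterances, "sorry")
for speaker, left, hit, right in hits:
    print(f"{speaker} {left:>30}[{hit}]{right}")
```

All three hits share the form sorry, but only one of them is an apology; the search itself cannot tell them apart, which is why the manual steps discussed next are needed.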
In order to analyse apologies further in this corpus, there is a need to find a
workaround. It may mean: 1) manually sifting through all the instances of the form
sorry to eliminate any that are not related to an apology routine, or 2) “down-sam-
pling”, that is, taking a smaller sample of the data and reading this manually to
identify all instances of apologies (extended over a number of turns, possibly)
and then annotating these so that they can be analysed with the aid of automated
tools as well as through qualitative functional analysis. In essence, in taking a
pragmatic function, in this example a speech act, as a starting point, it might seem
like one has a head start with a large corpus of data (relative to traditional datasets
in pragmatics), but because of the lack of a one-to-one relationship between form
and function, it is far from straightforward. This challenge, as noted by Lutzky and
Kehoe (2017a: 54), has meant that “scholars resorted to smaller data samples (e. g.
Koester 2002; Garcia McAllister 2015)” as well as “eclectic analyses of common
forms or patterns associated with a speech act (see e. g. Aijmer 1996; Deutschmann
2003; Taavitsainen and Jucker 2007; Adolphs 2008)”. Alternatively, others have
used metacommunicative expression analysis (see e. g. Jucker et al. 2012; Jucker
and Taavitsainen 2014, who use the term “compliment” to retrieve performative
instances of compliments) (see Lutzky and Kehoe 2017a: 54). These processes,
Lutzky and Kehoe (2017a: 54) point out, generally demand “stages of manual
microanalysis to separate unwanted hits from examples with specific pragmatic
functions”. As we shall detail below, Lutzky and Kehoe (2017a; 2017b), Jucker
and Taavitsainen (2014) and Deutschmann (2003), among others, offer plausible
solutions for analysing speech acts in large corpora. First, however, it is important
to consider the longer established approach of eliciting speech act data in the field
of pragmatics, using Discourse Completion Tasks (DCTs) and how these compare
with corpus data.
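
The two workarounds mentioned in this section, manual sifting and down-sampling, can be sketched as follows. This is only an illustration under stated assumptions: the hit identifiers, the sample size of 50, the fixed seed and the function names down_sample and share_of_target_act are hypothetical, and the manual codes shown are invented rather than taken from LCIE. Only the figure of 363 hits comes from the text.

```python
import random

def down_sample(hit_ids, k, seed=0):
    """Draw a reproducible random sample of k hits for manual reading."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    if k >= len(hit_ids):
        return sorted(hit_ids)
    return sorted(rng.sample(list(hit_ids), k))

def share_of_target_act(codes):
    """Proportion of manually coded hits that perform the target speech act."""
    if not codes:
        return 0.0
    return sum(1 for is_target in codes.values() if is_target) / len(codes)

# 363 hits for "sorry", as reported for the one-million-word sample.
hit_ids = list(range(1, 364))
sample = down_sample(hit_ids, 50)

# Invented manual codes for four hits: True = apology, False = other function.
manual_codes = {1: True, 2: False, 3: True, 4: True}
print(len(sample), share_of_target_act(manual_codes))
```

The proportion obtained from the coded sample gives a rough estimate of how many of the form-based hits actually realise the speech act, but it says nothing about apologies that do not contain the search form at all.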

2.2. Breadth of form at the expense of contextual depth

DCTs have long been the orthodox method of investigating speech acts (Flöck
and Geluykens 2015). They elicit responses to given situational prompts. This
methodology, moving from function to form, has been the norm in pragmatics
and, as Flöck and Geluykens (2015) note, this longevity is for good reason. Using
a DCT means there is no ambiguity of context because the functional scope of the
instrument will have been predefined and will therefore control the context and
conditions very carefully, including the gender, age, social and interpersonal rela-
tionship, and so on, of the speakers. For example, the DCT could be streamlined
to gather apologies in the context of a student apologising to a college professor
for being late to class. It could say that you have never met the professor before
or that you have met before and that this is not the first time that you have been
late with an assignment. This gives a contextual concentration and richness that
provides a narrowed range of the forms used in this specific context, with confined
conditions. Some would argue that this concentration, or narrowness, of DCT data
is its weakness (see Schauer and Adolphs 2006; Flöck and Geluykens 2015) and
that it is in stark contrast to using a corpus where one can avail oneself of a much
broader range of forms and contexts in a much larger sample of naturally-occurring
data. However, despite the abundance of data usually available in a corpus, it is
often at the cost of being far removed from the context unless the data has actually
been collected by the researcher. In large corpora, there will be detailed metadata
on each recording, but this may not be readily accessible and may not be fully
completed.
In terms of illustrating the contextual challenges of looking at speech acts in
a corpus, let us again take as our example the speech act of apology and look at
it using the LCIE. If we use the IFID, sorry, as a “way in” and sift through the
occurrences so as to identify all instances of sorry functioning as the speech act
of apologising, we are at the mercy of the corpus design as to how much back-
ground data we can access about who made the apology, what their interpersonal
relationship with their interlocutor(s) was, what the power semantic was between
the speaker and interlocutor(s) (e. g. symmetrical or asymmetrical), what led to
the apology (it may or may not be obvious from the data), plus a variety of other
possible contextual data. LCIE has metadata on each of the interactions which were
recorded so we know certain details such as gender, age, relationship, educational
background, place of birth, place where currently living, and so on. However, in
592 Anne O’Keeffe

the meanderings of casual conversation, as an outside reader of a conversation, one
might struggle to contextualise some instances of apologies. Extract 2 is not
untypical of what one will find in a corpus of casual conversation. The researcher, as well
as finding out the contextual information from the speaker-information metadata
database, needs to read a lot of the preceding interactional context to work out
that there is a story being told, among friends, amid the interruptions, background
noises, overlapping turns, unintelligible syllables, truncated words. In extract 2,
with most of the mark-up removed to aid legibility here, it is still challenging to
reconstruct the context of the apology, but we can guess that the three friends were
chatting. One speaker, <$1>, is trying to tell a funny tale, speaker 3, <$3>, is aiding
her narrative with response tokens to show listenership (e. g. yeah), and speaker
4, <$4>, is distracted by something (most likely what’s on the television in the
background) and interrupts the telling of the story by making an aside comment
in relation to her observation (Ah look what yer one’s makin aren’t they lovely?).
She (speaker <$4>) then apologises for this interruption (Sorry Joanne finish your
story) and the story continues:
(2) <$N> represents a speaker in order of appearance in the recording, + represents an
interrupted utterance, = represents a truncated utterance
<$3> Yeah.
<$1> < clanging sound > <unintelligible word>
<$4> What were ya doin in Tramore?
<$1> Oh we just went down we ended up just ya know+
<$3> Goin out and havin a few drinks.
<$1> +yeah wha= ?
<$3> I said Joanne and Philip do funny things.
<$1> We do weird things.
<$4> Ah look what yer one’s makin aren’t they lovely?1
<$3> Wha= ?
<$4> Sorry Joanne finish your story.
<$1> So < laughing > so < two syllables unintelligible > went in and said can I have a
breast of chicken without the bra < laughter >.
(LCIE)

This example illustrates the contextual lacuna that a researcher can experience
when working with corpus data, while at the same time clearly showing its richness.
The main advantage of using a corpus is the immense breadth it can offer in terms
of the range of forms that are used across so many contexts, individual language
users in their different roles, their varying statuses, educational and social
backgrounds, ages, genders, and so on. However, though you have ready access to the
form (which you elect to search for), you may not have access to the contextual
variables from whence it came.

1
Yer one is an Irish English slang form of your one, meaning ‘that woman’, which is
functioning here as a personal deictic reference.

Clearly there is a trade-off between the breadth of forms that corpus data can
offer a researcher and the details of context and conditions within which these
forms occurred. In contrast, by using a DCT, one can carefully define the context
and its conditions, but this is at the expense of breadth of form, as Figure 1 illus-
trates:

Figure 1: Form versus context in corpus versus elicited data

We will return briefly to this point below when we look at two studies that have
directly compared DCT and corpus data (Schauer and Adolphs 2006 and Flöck and
Geluykens 2015).
The temptation to move away from even attempting to find solutions for using
function-to-form approaches is strong given the allure of big data. Let us now con-
sider some caveats about big data options in the study of speech acts and related
phenomena.

2.3. Big data caveats

Taavitsainen and Jucker (2015: 18) issue an important warning amid the big data
trend: “[t]his unprecedented increase in data size accentuates the problem of the
right balance between the amount of data and the contextualization of the data.
Often the researcher has to opt for one and sacrifice the other”. With such data
stores at one’s fingertips, it is easy to see how form-focused research, driven by the
weight of data sample size, could become the preferred route for researchers inter-
ested in investigating some aspects of pragmatics. Given the importance of under-
standing the contextual provenance of a form in the study of pragmatics, it is crucial
that the limitations of big data results be understood. Tantalisingly large databases
can give immediate results across centuries of data though without the metadata that
one would associate with a corpus. The best-known example, at the time of writing,
is the Google Books Ngram Viewer, which gives instant access to the frequency of
ngrams of up to five words in a corpus of over 5 million books (500 billion words),
published over the last 500 years, or so (currently from 1500 – 2008).

Figure 2: I apologise and I apologize in American English books 1800 – 2000 Google
Books Ngram Viewer2

As Taavitsainen and Jucker (2015) note, for historical pragmaticists, it offers a
fascinating exploratory tool. For example, we can instantly look up the frequencies
of I apologise and I apologize, between 1800 and 2000, in both American and
British English books. We can see that I apologize has a frequency of 0.7 per million
words (PMW) in American and 0.25 PMW in British English (Figures 2 and 3).
However, this is a database, and we must be mindful of the major limitation that
it has: we are without any context for the occurrences of these forms, and so while
it is interesting as an exploratory tool, it is clearly contextually devoid. It is best
treated as an interesting starting point, a “ready reckoner” of forms over time but it
is of little or no value to the investigation of how these forms actually function(ed).
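
Frequencies of this kind are normalised rates rather than raw counts, which is what makes corpora of different sizes comparable. The following is a minimal sketch of how a per-million-words figure is derived. The function name and the counts and corpus sizes are invented for illustration and are not taken from the Google Books data.

```python
def per_million(raw_count: int, corpus_size: int) -> float:
    """Normalise a raw frequency to occurrences per million words (PMW)."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be a positive number of words")
    return raw_count / corpus_size * 1_000_000


# Hypothetical counts for one search string in two corpora of different sizes.
# These figures are invented for illustration only.
corpus_a = per_million(raw_count=120, corpus_size=100_000_000)
corpus_b = per_million(raw_count=30, corpus_size=60_000_000)

print(f"Corpus A: {corpus_a:.2f} PMW")
print(f"Corpus B: {corpus_b:.2f} PMW")
```

A normalised rate of this kind says nothing about how each occurrence functions in context, which is the limitation discussed above.
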
Another corpus that is of use for diachronic analyses is the Corpus of Historical
American English (COHA), developed by Mark Davies at Brigham Young University.
It was launched in 2010 and covers data from 1810 to 2009. As Friginal et al.
(2014) note, COHA is a “smaller” mega-corpus, standing at 400 million words, and
its creator argues that the substantial difference in size does not affect reliability of
results when these corpora are compared. The COHA comprises data from the registers
of fiction, non-fiction, magazine and newspaper. It is accessible, free of charge,
via the Corpus of Contemporary American English interface. As Taavitsainen and
Jucker (2015: 18) point out: “This allows for detailed and fascinating information
on the frequency of even extremely rare ngrams”. However, in respect of pragmatics
research, it comes with similar caveats in terms of the degree of contextual
information the researcher has available to them.

Figure 3: I apologise and I apologize in British English books 1800 – 2000 Google
Books Ngram Viewer3

2
Source code: <iframe name="ngram_chart" src="https://books.google.com/ngrams/interactive_chart?content=I+apologise%2CI+apologize&year_start=1800&year_end=2000&corpus=17&smoothing=3&share=&direct_url=t1%3B%2CI%20apologise%3B%2Cc0%3B.t1%3B%2CI%20apologize%3B%2Cc0" width=900 height=500 marginwidth=0 marginheight=0 hspace=0 vspace=0 frameborder=0 scrolling=no></iframe>

We will now examine two studies that focus in detail on the impact of how
data was collected on the output and findings in relation to function and form in
the study of speech acts.

3. Evidence from studies comparing speech act data from DCTs and
corpora sources

Schauer and Adolphs (2006) and Flöck and Geluykens (2015) are two studies
which compare the benefits and challenges of using corpus data versus DCTs in
the investigation of pragmatic function. These studies help us better understand the
complexities of the issue.

3
Source code: <iframe name="ngram_chart" src="https://books.google.com/ngrams/interactive_chart?content=I+apologise%2CI+apologize&year_start=1800&year_end=2000&corpus=18&smoothing=3&share=&direct_url=t1%3B%2CI%20apologise%3B%2Cc0%3B.t1%3B%2CI%20apologize%3B%2Cc0" width=900 height=500 marginwidth=0 marginheight=0 hspace=0 vspace=0 frameborder=0 scrolling=no></iframe>

Schauer and Adolphs (2006) investigated expressions of gratitude using a DCT
of eight scenarios with 16 native speakers. They then used the forms that emerged
in the DCTs as a basis for corpus searches. They envisaged the corpus data as being
able to provide detailed insights into expressions of gratitude employed by “a wide
part of the population in casual conversations between friends and family, while
the DCT scenarios were designed to represent situations that a specific group (in
this case university students) were likely to come across during a sojourn in the
target environment” (Schauer and Adolphs 2006: 123–124). They used the Cambridge and
Nottingham Corpus of Discourse in English (CANCODE), a five-million-word database that was
collected between 1994 and 2001 (see McCarthy 1998 for a description of this
corpus). In all, nine forms emerged from the DCT for the expression of gratitude:
Thanks, Cheers, Ta, Thank you, Thanks a lot, Thanks very much, Thank you so
much, Nice one, and Cheers sweetie (Schauer and Adolphs 2006: 125). All but one
of these forms, Cheers sweetie, were found in the corpus data though to differing
degrees (in terms of frequencies). The three most frequent forms that appeared in
the DCT were Thanks, Cheers and Thank you and these forms were also the most
frequent, though in reverse order, in the corpus data where Thank you was by far
the most frequently used form.
Schauer and Adolphs (2006) cite the length of the turn in which gratitude is
being expressed as the main difference between the elicited and the corpus data.
Importantly, they note that because the DCT is so focused and controlled within
predetermined conditions, as discussed, it usually generates single utterances rather
than stretches of interaction. The corpus data gives a broader contextual picture of
the stretch of discourse that involves the act of expressing gratitude rather than a
single utterance of gratitude. For example, as mentioned above, thank you is one of
the top three DCT forms identified and the most frequent form in the corpus when
compared to the other DCT-generated forms and yet, the corpus also tells us that
its use can stretch over a number of turns in what Schauer and Adolphs (2006) call
a “gratitude cluster”.
Extract (3) from the BNC shows an example of a gratitude cluster in a conver-
sation between a parent and a child. While it is not clear what the thanking relates
to, it is interesting to observe how the thanking spreads across many speaker turns:

(3) <$1> … Thanks very much.

<$2> Cheers Dad. Put away your er luggage. <$E> pause </$E>
<$1> <$E> unclear </$E>.
<$2> Er cheers. <$E> whispering </$E> … <$E> unclear </$E>. <$E> laugh </$E>
<$E> pause </$E>
<$1> Cheers. Right. Okay.
<$2> Thanks very much.
(BNC)
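
The spread of thanking across turns seen in extract (3) can be approximated computationally. The sketch below is a heuristic only: Schauer and Adolphs (2006) do not give an operational definition of a gratitude cluster, so the thresholds `max_gap` and `min_hits`, and the function name, are assumptions made here for illustration. The search forms are those reported for the DCT.

```python
import re

# Gratitude forms reported for the DCT in Schauer and Adolphs (2006: 125);
# "thank you" and "thanks" also cover the longer variants listed there.
GRATITUDE = re.compile(r"\b(thank you|thanks|cheers|ta|nice one)\b", re.IGNORECASE)


def gratitude_clusters(turns, max_gap=1, min_hits=2):
    """Return (start, end) turn indices of runs of gratitude turns.

    Assumption: a cluster is a run with at least `min_hits` gratitude turns,
    where consecutive gratitude turns are at most `max_gap` other turns apart.
    """
    hits = [i for i, text in enumerate(turns) if GRATITUDE.search(text)]
    clusters, run = [], []
    for i in hits:
        if run and i - run[-1] - 1 > max_gap:
            if len(run) >= min_hits:
                clusters.append((run[0], run[-1]))
            run = []
        run.append(i)
    if len(run) >= min_hits:
        clusters.append((run[0], run[-1]))
    return clusters


# Turns of extract (3), with mark-up removed; the unclear turn is left empty.
extract_3 = [
    "Thanks very much.",
    "Cheers Dad. Put away your er luggage.",
    "",
    "Er cheers.",
    "Cheers. Right. Okay.",
    "Thanks very much.",
]
print(gratitude_clusters(extract_3))
```

With one intervening turn tolerated, the six turns of extract (3) are returned as a single cluster. Forms such as cheers and nice one have other uses, so the hits would still need to be sifted manually.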

Schauer and Adolphs (2006) note that because DCT data is normally based around
single utterances, this can distort the overall reality of speech acts which are typ-
ically negotiated and developed over a number of turns in a dynamic discourse
event. An important point to bring in here is that Taavitsainen and Jucker (2015:
17) forecast that, in the future, speech act analyses will more consistently focus on
the interaction between participants and how speech act values are jointly negotiated
and established in the interaction, moving from a one-dimensional focus on
single utterances and their meaning to negotiated meaning within the dynamics of
real-time interaction.
Schauer and Adolphs’ (2006) finding that the forms generated by the DCT
methodology were less complex in nature in comparison to corpus data is borne out
in Bodman and Eisenstein (1988) and Yuan (2001). Additionally, Hartford and
Bardovi-Harlig’s (1992) study of rejections, which compared DCTs with authentic
discourse from academic advising sessions, and Beebe and Cummings’ (1996) study of
refusals found that DCTs contained fewer semantic formulae and negotiating strategies
and were overall less complex and more direct. By contrast, Flöck and Geluykens
(2015), in their study of directives, found DCT data to be more indirect and to
contain more downgrading modifiers than the real spoken data with which they were
compared. Indeed, having reviewed the findings from eight comparative studies
(including those cited above), Flöck and Geluykens (2015) concluded that the findings
were far from convergent.

Flöck and Geluykens (2015) investigate directive speech acts in three datasets:
– A sample of spoken data taken from the British component of the International
Corpus of English (ICE): they manually retrieved instances of directive speech
acts in the spoken component of the ICE-GB, which consists of 100 transcripts
(each 2,000 words) of face-to-face and telephone conversations, between par-
ticipants of mostly “low social distance”.
– Elicited written data collected using DCTs: these were elicited in scenarios
where fictional characters had low social status and low power relations. Flöck
and Geluykens (2015) suggest that these elicited data and the corpus data
are maximally comparable because they have a close genre and micro-social
match-up.
– A small corpus of business letters: these are part of the Antwerp Corpus of
Institutional Discourse (Geluykens and Van Rillaer 1995) and due to confiden-
tiality constraints, there is no demographic information available.
All of Flöck and Geluykens’ (2015) data are from native British English speakers
and were collected within the same time span. They randomly selected 235 direc-
tive speech acts from each data set and these were then categorised according to
a uniform coding system (encoding a pragmatic profile of the act). They conclude
that the DCT data exhibited significant differences compared with the spontaneous
data. They note the greatest degree of difference from the conversational direc-
tive speech acts in almost all aspects of their investigation (e. g. percentage of
direct head acts, conventionally indirect head acts, indirect head acts, downgraded
head acts, ratio of downgraders per head act, percentage of mood imperatives with
please, number of downgrading modifiers and total number of upgrading modifi-
ers). Interestingly and importantly, they note that spontaneous non-elicited data is
far from homogeneous. They found strong evidence of the influence of the condi-
tions of use and genre (though further investigation was beyond the scope of their
study). This led them to stress that “we should at least allow for the possibility
that the type of illocution influences the production choices language users make”
(Flöck and Geluykens 2015: 34). They go on to note, however, in relation to speech
act variation that other speech acts, such as thanking, might be more routinized
and stereotypical. They say that, “what seems clear is that corpus pragmatics in
the widest sense of the word has a major role to play in unravelling some of these
complex issues” (Flöck and Geluykens 2015: 34).
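
The design described above (equal-sized random samples from each dataset, coded with a uniform scheme and then compared) can be sketched as follows. This is an illustration only: the single coding field `head_act`, the function name and the toy data are assumptions, and the study itself sampled 235 directives per dataset and coded many more features.

```python
import random
from collections import Counter


def sample_and_profile(coded_acts, sample_size, seed=0):
    """Randomly sample coded speech acts and return category proportions."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    sample = rng.sample(coded_acts, sample_size)
    counts = Counter(act["head_act"] for act in sample)
    return {label: n / sample_size for label, n in counts.items()}


# Invented toy data: each dict stands for one manually coded directive.
dct_data = [{"head_act": "conventionally indirect"}] * 6 + [{"head_act": "direct"}] * 4
conversation_data = [{"head_act": "direct"}] * 7 + [{"head_act": "conventionally indirect"}] * 3

for name, data in [("DCT", dct_data), ("conversation", conversation_data)]:
    print(name, sample_and_profile(data, sample_size=10))
```

Drawing the same number of items from each dataset keeps the proportions directly comparable, although any differences would still need to be tested statistically.
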
It is of great importance to the evolution of corpus pragmatics that we see
continued research of this nature, in which the output from different methodologies
for data collection is closely scrutinised so as to arrive at an enhanced understanding
of the value and limitations of methodologies within the area of pragmatics.
Leaving aside how data is collected at this point, we turn now to the practicalities
of how best to analyse data in a function-to-form approach.

4. Function-to-form approaches to corpus research

Ädel and Reppen (2008: 2–3), in the introduction to their edited volume, summarised
four approaches to using a corpus for corpus-based form-to-function investigations
of discourse (listed below). Ädel and Reppen (2008) point out that these
approaches often overlap and that there is iteration within any of these strategies.
Nonetheless, they are useful to consider as core investigative strategies. More
pertinent to the present study, we need to consider what the equivalent strategies
or approaches might be if one is interested in the opposite investigative
route, namely function-to-form. Ädel and Reppen’s (2008: 2–3) four approaches
to form-to-function corpus analysis are as follows:
– One-to-one searching: where there is a 100 % match (or recall) from the search
item to relevant hits; for example, you seek to investigate the use of noun
phrases in a sample of data. In a Part-of-Speech (POS) tagged corpus, this will
generate full recall of all noun phrases. If you wished to look at all instances of
Thank you, again this search would result in a full recall of forms.
– Sampling: this involves using one or more search item(s) that are good exam-
ples of the linguistic phenomenon in question. In pragmatic terms, this means
using IFIDs, for example. As discussed earlier, one could search for sorry so
as to sample possible instances of apologies.
– Sifting: if you engage in sampling, you will most likely need to sift through the
sample to isolate the forms/instances that you are interested in. For example,
through sifting you can eliminate any instances of sorry that are not functioning
as apologies. However, this process is limited in that you will not find instances
of apologies that do not use sorry.
– Frequency-based listing: this is the purest corpus approach where you take a
bottom-up approach and start by looking at the frequencies of forms in your
corpus and work from there in terms of their patterns and meanings. Many
frequency-based studies of corpus data end up with pragmatic conclusions to
explain differences in frequencies and patterns across contexts of use but they
set out from the baseline of frequency results of forms.
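
The sampling and sifting strategies listed above can be sketched in a few lines. This is a minimal illustration, not a description of any particular concordancer: the function name is hypothetical, the first turn is taken from extract (2) and the other two are invented, and the judgements in the sifting step are entered by hand.

```python
import re


def concordance(turns, search_item, width=30):
    """Return (turn index, left context, hit, right context) for each match."""
    pattern = re.compile(rf"\b{re.escape(search_item)}\b", re.IGNORECASE)
    lines = []
    for index, text in enumerate(turns):
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            lines.append((index, left, match.group(), right))
    return lines


# Sampling: retrieve every hit for the IFID "sorry" in a toy set of turns.
# The first turn is from extract (2); the other two are invented.
turns = [
    "Sorry Joanne finish your story.",
    "I felt sorry for him really.",
    "Oh that was my fault entirely.",
]
hits = concordance(turns, "sorry")

# Sifting: the analyst reads each line and records a manual judgement.
# These judgements are hand-coded here; they are not computed.
is_apology = {0: True, 1: False}
apologies = [hit for hit in hits if is_apology[hit[0]]]

for index, left, hit, right in apologies:
    print(f"{index}: {left}[{hit}]{right}")
```

The third turn shows the limitation noted in the list: it could plausibly be read as an apology, but because it does not contain sorry it is never retrieved.
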
Here we will attempt to lay out the possibilities for function-to-form corpus analy-
sis. As with the aforementioned strategies for form-to-function research, they will
often overlap.

4.1. Approach 1: One-to-one searching in a pragmatically annotated corpus

In the case of function-to-form analyses, being able to conduct a one-to-one search
for a pragmatic function, in a pragmatically annotated corpus, so as to recall all
instances of a given speech act is what O’Keeffe, Clancy and Adolphs (2011)
referred to as the “holy grail” for corpus pragmatic research. Now, corpus tools and
annotation systems are emerging which show that this is, and will increasingly be,
possible (cf. Culpeper and Archer, this volume). It ultimately means that instances
of a pragmatic function, for example a speech act such as an offer or an apology,
could be recalled automatically because they have been annotated within the corpus
and are thus retrievable, in one-to-one searches, using the appropriate tools.
As Rühlemann and Aijmer (2015) summarise, the phenomena annotated in the growing
body of pragmatically annotated corpora include:
– speech acts (Stiles 1992; Garcia 2007; Kallen and Kirk 2012; Kirk 2016)
– discourse markers (Kallen and Kirk 2012; Kirk 2016)
– quotatives (Kallen and Kirk 2012; Kirk 2016; Rühlemann and O’Donnell 2012)
– participation role (Rühlemann and O’Donnell 2012)
– politeness (Danescu-Niculescu-Mizil et al. 2013)
Rühlemann and Aijmer (2015) speculate that the reason why pragmatic annotation
is not yet widely used is that the form-function mismatch of most pragmatic
phenomena means that automatic assignment of tags will often lack precision
and laborious manual annotation is unavoidable. The work of Weisser (2015)
offers some hope in the form of semi-automating the process of speech act iden-
tification using the Dialogue Annotation and Research Tool (DART). This tool,
through carefully determined multiple syntactic structure features and mode (e. g.
modals, adverbials, conditionals, etc.) as well as complex computational tagging,
can identify speech acts in task-oriented dialogues from the Trains and Trainline
corpora (see Weisser 2015). Weisser shows that the tool is able to generate a high
number of accurately labelled speech acts, within this very defined context. The
analysis yielded a speech act taxonomy that included: conventionalized,
dialogue-managing, information- or option-seeking, information-providing/responding,
directive-seeking/providing, (dis)agreeing/acknowledging, informing, and
commitment-indicating. For this tool to be further developed, Weisser stresses the
need for corpora to have more information available at the transcription phase (e. g.
syntactic structure, roles of the interlocutors, and prosodic description). This is
borne out by the work of Kallen and Kirk (2012) on the ICE Ireland corpus, which
we will look at in greater detail below.
Kirk and Andersen (2016: 294–295) outline some of the challenges of prag-
matic annotation, not least of all the fact that when real spoken language is tran-
scribed, it is reduced into a pragmatically-bereft form (as alluded to above). Kirk
(2016: 300) notes that transcriptions record “the locutionary act of producing forms
and constructions, but ‘what is heard’ (i. e. the illocutionary force or intent, and
its processing as the perlocutionary effect) is only extrapolable from the transcrip-
tion”. These deficiencies make it even more challenging to superimpose pragmatic
annotation onto existing corpora of spoken language.
What is not encoded in conventional lexico-syntactic transcriptions are indica-
tions of the pragmatics operating in an utterance: the illocutionary force or intent
(the speech act status), the perlocutionary effect, the upholding or breaching of the
Gricean co-operative principle, the politeness strategy invoked, the attitude of a
speaker to the message of the utterance being made (pragmatic stance) or to the
hearer of that utterance (face negotiation), and so too its potential impact. Much
of what speakers utter is determined by a speaker’s attitude towards what they are
saying and towards the person(s) to whom they are saying it (Kirk and Andersen
2016: 294–295).
Crucially, they note that understanding these deficiencies is a key to the ongo-
ing development of pragmatic annotation: “The more linguists come to understand
about those interpersonal, intersubjective, communicative ways, the more new
layers may be added to the linguistic structures which have been conventionally
represented hitherto” (Kirk and Andersen 2016: 295).
Of interest is the SPICE Ireland corpus because it is an example of a spo-
ken corpus which has been pragmatically annotated and so it offers a model for
how one-to-one searching can be made possible in a function-to-form approach to
corpus pragmatics. SPICE Ireland is part of the International Corpus of English
suite (Kirk et al. 2011; Kallen and Kirk 2012). It contains just over one million
words, entailing 15 discourse situations, as well as 17 written domains. The 15
discourse situations comprise 626,597 words and all were annotated pragmati-
cally. The annotation scheme comprises five components: the speech act status of
each utterance in the corpus, based on Searle’s (1976) categories of illocutionary
acts, tone movements, discourse markers, utterance tags, and quotatives (see Kirk
2016: 306). Speech act status, for instance, is marked with pairs of opening and
closing codes in angle brackets, following COCOA conventions (see below). The
annotation surrounds the span of an utterance which contains a speech act, i. e. with
a code in angle brackets before the utterance and a matching closing code, marked
with a forward slash, after it. An appropriate code is used to represent the type of
act based on Searle’s (1976) taxonomy (Kirk 2016: 302):
<rep> … </rep> for “representatives”;
<dir> … </dir> for “directives”;
<com> … </com> for “commissives”;
<exp> … </exp> for “expressives”;
<decl> … </decl> for “declaratives”

Four other codes were deemed necessary (Kirk 2016: 302):
<icu> … </icu> for “indeterminate conversationally-relevant utterances”

These are used to mark a broad range of minimal responses, back-channel utter-
ances, or “other elements of speech which are relevant to the maintenance of dis-
course coherence or continuity, but which lack a discernible function as a speech
act” (Kirk 2016: 309).
<soc> … </soc> for “social expressions”

This code is used for social expressions such as greetings, leave-takings and other
interactive expressions (for example, the closing exchange in a telephone
conversation).
<xpa> … </xpa> for utterances not analysable at a pragmatic level

Kirk (2016: 310) notes that the SPICE annotation tool requires every utterance to
be glossed for pragmatic value, “yet it is inevitable in a large corpus of natural-
ly-occurring data that many utterances will be impossible to categorise as speech
acts or conversational moves of one kind or another”. In such cases, this code is
used to show that an utterance lies outside the pragmatic frame of analysis.
<K …> … </K …> for “keyed” utterances.

Kirk (2016) notes that the data of ICE-Ireland provide clear examples where speak-
ers are not being literal, but rather use the form of one type of speech act to commit
an act of a different type. Kirk followed the work of Goffman (1974) on frame
analysis, and devised a <K> code for such utterances, where they are treated as
“keyings” of a primary speech act. He provides the following example which, Kirk
(2016: 310) notes, “takes the syntactic form of a commissive (undertaking to send
the listener a bill), but it is not intended as one”; rather, it is uttered by a judge
who has just given off-the-record advice to a barrister. Kirk (2016: 310) provides
the interpretation “that it has the function of a directive – an utterance made in or-
der to provoke laughter. The humour itself derives from the speaker’s intentionally
anomalous use of the syntactic form of a commissive when it is understood that the
commissive is not in this case genuine”:
<ICE-NI-LEC-P2A-061$B> <#> <dirK> Yeah* <,> I’ll 1sEnd you my 2bIll% </dirK>
<&> laughter </&> (Kirk 2016: 310)
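
Once speech acts are marked up in this way, one-to-one retrieval becomes a matter of pattern matching. The sketch below assumes that each act is enclosed in a matching pair of codes, with an optional K for keyed utterances, as in the example just quoted. The actual SPICE Ireland files may differ in detail, the function name is hypothetical, and the second and third sample lines are invented.

```python
import re
from collections import Counter

# Speech act codes listed in Kirk (2016: 302), with an optional "K" for keyings.
TAG = re.compile(r"<(rep|dir|com|exp|decl|icu|soc|xpa)(K?)>(.*?)</\1\2>", re.DOTALL)


def speech_acts(text):
    """Yield (code, keyed, utterance) for each annotated span in the text."""
    for match in TAG.finditer(text):
        code, keyed, utterance = match.groups()
        yield code, bool(keyed), utterance.strip()


# The first line is the example quoted above; the other two are invented.
sample = """
<#> <dirK> Yeah* <,> I'll 1sEnd you my 2bIll% </dirK>
<#> <soc> Bye bye now </soc>
<#> <dir> Close the door </dir>
"""

acts = list(speech_acts(sample))
directives = [utterance for code, keyed, utterance in acts if code == "dir"]
print(Counter(code for code, _, _ in acts))
print(directives)
```

Filtering on the code alone recalls every directive, keyed or not, regardless of its surface form. The precision of the search therefore rests entirely on the manual annotation.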

The scale of pragmatically annotating such a substantial sample as SPICE Ireland
(in terms of spoken language) seems challenging, to say the least, but there are
also examples of work where researchers who are using much smaller and more
contextualised datasets have been able to engage with a similar level of pragmatic
annotation for their particular purposes. A case in point is the work of Milà-Garcia
(in press). In this work, agreement and disagreement in spoken Catalan are the
focus and the data has been annotated for this purpose. This allows the researcher
total recall of all stretches of discourse (which have been coded) involving either
an agreement or a disagreement. Garcia McAllister (2015) offers further examples
of studies in which speech acts have been investigated using corpora and
various workarounds have been found, especially the use of smaller samples, which we
will consider in greater detail below.
In sum, pragmatic annotation offers a possible solution for function-to-form
research but it comes with limitations: 1) it is enormously time-consuming and
labour-intensive (and thus expensive) and, realistically, this will be a major barrier
to its mainstreaming; 2) due to the inherently fuzzy and discursive nature of
speech acts, coding decisions depend on the annotator’s interpretation
within the bounds of his/her understanding of the contextual conditions of
the speech event; and 3) because of these constraints, it is best applied in
small-scale studies, where the researcher is conducting the annotation and has an in-depth
understanding of the contextual variables and conditions.

4.2. Approach 2: Sampling, searching and sifting

As Rühlemann and Aijmer (2015) point out, pragmatics researchers are used to
dealing with small amounts of text and analysing these “horizontally” (taking in all
contextual factors) but, they note, “even small specialized corpora contain far more
words than could possibly be read and analysed by any one researcher in the same
way as the select texts which pragmaticists are used to working with” (Rühlemann
and Aijmer 2015: 6).
An approach that can make function-to-form research more manageable is to
randomly sample from a corpus so as to analyse pragmatic function within that
smaller dataset. When the dataset has been made more manageable, the researcher
can then read it qualitatively and sift through it to find all instances of a particu-
lar pragmatic phenomenon. Garcia McAllister (2015) details a study where she
down-sampled a percentage of different data types from a larger corpus so as to
investigate the speech act category of directives. Garcia McAllister drew down
data from the spoken component (1.6 million words) of the TOEFL 2000 Spoken
and Written Academic Language Corpus (T2K-SWAL) (2.7 million words in total),
which was collected via audiotaped recordings of conversations, business inter-
actions, and lectures that took place in a university setting (see Biber et al. 2002,
2004). She narrowed her dataset to the following contexts of use (percentages
of her down-sample are in brackets): service encounters (39.3 %), office hours
(32.3 %) and study groups (28.3 %). She then had a data sample of 42,797 words,
which she manually sifted (and listened to the audio recordings) to identify, code
and annotate all instances of directive speech acts. Through her coding system
(see Garcia McAllister 2015), she was able to then apply corpus tools to assign
further linguistic and contextual information to each utterance that she had identi-
fied as relevant to her study and to provide a data set listing each utterance and its
corresponding descriptors. Among other findings, she identified the role of each
situational context in predicting the type of speech act used, for example, service
encounters were found to be characterised by a high frequency of requests for
information, services and payment, suggestions and putting interlocutors on hold.
Reflecting on the process and methodology, she notes that the most difficult part
was identifying speech acts in corpora and annotating them: “It took many hours
of listening to audiotapes and reading transcripts to code all of the utterances ana-
lyzed in this study” (Garcia McAllister 2015: 45).
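The down-sampling step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Garcia McAllister's actual procedure: the corpus structure (a list of texts carrying a `context` label), the function names `down_sample` and `context_shares`, the fixed random seed and the toy texts are all hypothetical.

```python
import random


def down_sample(texts, contexts, target_words, seed=0):
    """Randomly pick whole texts from the chosen contexts until the
    running word count reaches target_words (or the pool is exhausted)."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    pool = [t for t in texts if t["context"] in contexts]
    rng.shuffle(pool)
    sample, total = [], 0
    for text in pool:
        if total >= target_words:
            break
        sample.append(text)
        total += len(text["words"])
    return sample


def context_shares(sample):
    """Percentage of the sample's words contributed by each context."""
    totals = {}
    for text in sample:
        totals[text["context"]] = totals.get(text["context"], 0) + len(text["words"])
    grand_total = sum(totals.values())
    return {c: round(100 * n / grand_total, 1) for c, n in totals.items()}


# Toy corpus with invented texts (not T2K-SWAL data).
corpus = [
    {"id": i, "context": ctx, "words": ["w"] * size}
    for i, (ctx, size) in enumerate(
        [("service encounter", 40), ("office hours", 30), ("study group", 30),
         ("lecture", 200), ("service encounter", 20), ("study group", 25)]
    )
]
wanted = {"service encounter", "office hours", "study group"}
sample = down_sample(corpus, wanted, target_words=100)
print(context_shares(sample))
```

Sampling whole texts rather than isolated utterances keeps the surrounding context available for the qualitative reading that follows, and reporting the share of each context documents the composition of the sample.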
McCarthy and O’Keeffe (2003) offer another example of a study where
researchers used a down-sample from a larger corpus and then sifted manually
through the sample data to identify and pragmatically categorise the item which
was the focus of their study. This paper sought to explore the pragmatic functions
of vocatives in conversation. One of the datasets under scrutiny, a small 55,000
word corpus of radio phone-ins, was manageable enough in size to manually sift
through to find and functionally classify all vocative occurrences. The other dataset
was the Cambridge and Nottingham Corpus of Discourse in English (CANCODE),
a five-million word corpus of spoken English (see McCarthy 1998). It is obviously
not feasible to look manually at all instances of vocatives in such an amount of data. The
solution was to generate a word frequency list to find the most commonly used
vocatives in the corpus. Kinship terms, Mum(my) and Dad(dy), were also included
since a good deal of the casual data was family-based. The next step was to run
concordances of the high frequency names/address forms. A cut-off of a maxi-
mum of five uses of any one name/address form as vocative was set as a restric-
tion on the corpus search (McCarthy and O’Keeffe 2003). Through this process
of sifting, a total of 100 extracts involving vocatives were identified for further
analysis. Among other findings, they noted a high degree of use of vocatives in
the context of hedging. The vocative was neither syntactically nor semantically
604 Anne O’Keeffe

necessary, but it often served to build relationships (“relational”) or to downtone
challenges, adversative comments or disagreements. Vocatives were also often
found to be a core feature of badinage (McCarthy and O’Keeffe 2003),
especially in the casual conversational data. The functional results from the CANCODE
data down-sample could then be compared with the radio phone-in results
once the latter were normalised to percentage results (i. e. so that results were both
out of 100):

Table 1. Breakdown of functional types of vocatives across a random sample
of 100 from CANCODE and a percentage ratio of all vocatives in Liveline
radio phone-in (McCarthy and O’Keeffe 2003: 177)

Function            CANCODE (out of a 100-vocative sample)   Radio phone-in (%)
Relational          30                                        7.7
Topic Management    21                                        9.0
Badinage            19                                        3.0
Mitigator           15                                       10.3
Turn Management     11                                       11.2
Summons              4                                        0.0
Call Management      0                                       58.6
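
The comparison in Table 1 rests on putting both datasets on the same scale. The following sketch, offered as an illustration rather than as the authors' procedure, converts counts to percentages and lists the percentage-point difference for each function. The figures are those of Table 1; the helper name `to_percent` and the variable names are assumptions made for this example.

```python
def to_percent(counts):
    """Convert raw counts per function into percentages of the total."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Figures copied from Table 1.
cancode_counts = {
    "Relational": 30, "Topic Management": 21, "Badinage": 19, "Mitigator": 15,
    "Turn Management": 11, "Summons": 4, "Call Management": 0,
}
radio_percent = {
    "Relational": 7.7, "Topic Management": 9.0, "Badinage": 3.0, "Mitigator": 10.3,
    "Turn Management": 11.2, "Summons": 0.0, "Call Management": 58.6,
}

cancode_percent = to_percent(cancode_counts)  # already out of 100, so unchanged
differences = {
    f: round(cancode_percent[f] - radio_percent[f], 1) for f in cancode_counts
}

# List the functions, largest absolute difference first.
for function, diff in sorted(differences.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{function:<18}{cancode_percent[function]:>6.1f}{radio_percent[function]:>7.1f}{diff:>+8.1f}")
```

Sorting by the size of the difference makes the contrast between the two settings visible at a glance, with the call management function of the institutional data standing apart from the casual conversational data.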

In the following examples, (4) to (6), the radio phone-in caller uses the presenter’s
name (Marian). They illustrate vocatives that are superfluous to the transactional
context of the radio phone-in but that aid the smooth pragmatic running of
agreements and evaluations:
(4) <$2> Yes indeed Marian ah I’d I’d have to agree wholeheartedly with him. (LCIE)
(5) <$2> That’s right Marian. (LCIE)
(6) <$2> It is indeed Marian because ah you know again I think that people are … (LCIE)

In the institutional data (radio phone-in), vocatives had an important call manage-
ment function, which included changes in footing (Goffman 1979) from the audi-
ence to the caller. Example (7) illustrates this function. When the caller’s name is
used (Austin), this is the point at which the presenter (Marian) changes her footing
(speaker alignment) from the audience to the caller:

(7) <$1> Now to a couple that had very very difficult Christmas this year however all’s
well that ends well ah Austin good afternoon to you.
<$2> Good afternoon Marian.
<$1> Your little boy went back to playschool yesterday?
<$2> Yesterday that’s right.

(LCIE)

In a follow-up study, Clancy and O’Keeffe (2015) used the results of the functions
of vocatives identified in the 55,000 word radio phone-in and compared these with
an even smaller dataset of 12,500 words of conversations between friends and
family (see Clancy 2015). This dataset was small enough to allow for the sifting
and sorting of all 161 instances of vocatives in the data. Once these were classified
according to their function, it allowed for the comparison of vocative use in the
institutional context of radio phone-in (where pseudo-intimacy was replicated, see
O’Keeffe 2006) and the intimate discourse of family and friends. Again, the results
from the family and friends data, in line with McCarthy and O’Keeffe (2003),
showed that vocative use was much more common in casual conversation between
family and friends and that it played a key downtoning role in the context of miti-
gation, among other relational functions.
These case studies, along with many others, show the benefit of careful and
principled sampling from existing corpora, especially where you can access metadata
about the speakers and the situation. Such sampling makes it feasible to sift
manually through the data so as to pragmatically categorise all instances of your
research focus.

4.3. Approach 3: Using existing research findings as “seeds”

Another important means of looking at pragmatic functions is to use existing
research findings as the “seeds” or starting points. It is important to stress that there
is so much research output already in existence on so many aspects of pragmatics,
not least of all speech acts, from the many years of work that has preceded corpus
pragmatics. These studies provide very useful starting points for search items in
corpora. The first example, Schauer and Adolphs (2006), has already been
discussed above, in terms of its comparative findings. Here, we will focus on its
methodology. Schauer and Adolphs (2006) take the DCT output from eight scenarios
involving 16 native speakers as their starting point for corpus searches of
the speech act of expressing gratitude. In other words, they used the forms that
emerged in the DCTs as a basis for their corpus searches. They say that they opted
to start with the DCTs as their source of corpus search items because they wanted
to control the variables of the context for the scenarios. In doing so, they were able
to make their output from the corpus comparable with that of the DCTs and this,
as we have discussed, led them to some important methodological insights. The
key methodological point that we can draw from this study is that DCT
results for a given speech act, routine or situation can offer seeds for searching
corpus data. Doing so generates a comparable dataset and offers
insights into how these forms are used across turns.
Another example of this use of existing research as a seed is a study by Cheng
and O’Keeffe (2015), who sought to investigate vague language approximator
forms (e. g. about seven, seven or so, at least seven) within one corpus
(inter-culturally) and also to compare these forms with those in another variety of
English (cross-culturally):
• Inter-cultural comparison of two sub-corpora of the Hong Kong Corpus of Spo-
ken English (HKCSE) (a total of 216,942 words): a Native Speaker sub-corpus
of 108,760 words and a Hong Kong Chinese sub-corpus of 108,182 words;
• Cross-cultural comparison of the results from the inter-cultural comparison of
Hong Kong data with results from Irish English, using the one-million word
Limerick Corpus of Irish English (LCIE).

Cheng and O’Keeffe were keen to investigate the degree to which these forms and
their pragmatic functions were universal within and across two varieties of English.
This task was too large to undertake for all vague language (VL) items, which
are not tagged in either corpus, so they narrowed their focus to one type of vague
language which was already described in previous research, namely Channell’s
(1994) approximator + number (n). They used the search items from Channell’s
research: about, around, round, approximately, or, or so, at least, at most, less,
more, under and over. These searches had to be disambiguated through manual
concordance sorting so as to arrive at only the relevant structures that contain the
search items and “n” and/or “m” (where “n” refers to a number and “m” refers to
a multiplier of the number, e. g. five (n) or ten (m) minutes). Following Channell’s
(1994) model, the HKCSE sub-corpora and LCIE were examined in detail. In sum-
mary, they found that on the surface, approximator + number (n) seemed to be a
universal feature in terms of form and distribution, with no significant quantitative
differences emerging either from the inter- or cross-cultural analysis. However,
they note that when they looked qualitatively at what the approximators were refer-
ring to in their context of use, they found variation in terms of their distribution
(e. g. approximation with time and calendar periods was the most common con-
text). For this phase of the analysis, the researchers used a random sample of 100
items from each of the three datasets (in the manner detailed in approach 2). This
close qualitative analysis also led to insights about cultural implicitness (especially
within family interaction).
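A seeded search of this kind can be approximated with regular expressions. The sketch below is not the procedure used by Cheng and O’Keeffe; it simply shows how a list of approximators taken from earlier research might be turned into search patterns. The short list of number words, the treatment of less and more as less than and more than, the pattern labels and the function name `find_candidates` are all assumptions made for illustration.

```python
import re

# A deliberately small set of number expressions; a real search would need more.
NUMBER = r"(?:\d+|one|two|three|four|five|six|seven|eight|nine|ten|twenty|hundred)"
# Approximators that precede the number.
PRE = r"(?:about|around|round|approximately|at least|at most|less than|more than|under|over)"

PATTERNS = {
    "approximator + n": re.compile(rf"\b{PRE} {NUMBER}\b", re.IGNORECASE),
    "n or so": re.compile(rf"\b{NUMBER} or so\b", re.IGNORECASE),
    "n or m": re.compile(rf"\b{NUMBER} or {NUMBER}\b", re.IGNORECASE),
}


def find_candidates(utterance):
    """Return (pattern label, matched string) pairs for one utterance."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(utterance):
            hits.append((label, match.group(0)))
    return hits


for line in ["it took about seven minutes",
             "seven or so people came",
             "wait five or ten minutes",
             "turn it over one more time"]:
    print(line, "->", find_candidates(line))
```

The last toy utterance is retrieved even though over one is not an approximation there, which is exactly the kind of hit that the manual concordance sorting described above has to remove.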
Reflecting on their methodology of using an existing model of forms based on
existing research, Cheng and O’Keeffe say that it allowed them to work within its
syntactic parameters to search through corpora for instances of one specific form
of VL. They say that “while it did involve a lot of manual sorting through concord-
ance lines to eliminate non-VL instances, it was not by any means an insurmount-
able task” (2015: 374).

4.4. Approach 4: Solutions for larger corpora

Solutions proposed above are limited to smaller scale corpora or to small samples
drawn down from larger datasets. Let us now showcase some studies that have used
strategies to identify speech acts in larger datasets.

4.4.1. Using Illocutionary Force Indicating Devices (IFIDs)

The seminal work of Deutschmann (2003) set out to examine apologies in British
English using the 10 million word spoken component of the BNC. As he details,
these spoken data involve a total of 4,705 speakers. From this, he isolated only
those dialogues produced by speakers whose age and gender were available in the
metadata. This sub-corpus comprised 5,139,082 words produced by over 1,700
speakers.
As Deutschmann (2003: 17–18) explains, the investigation was limited to
explicit apologies which appeared in the form of IFIDs. Thus, his study focused on
“expressions containing variants of the words afraid, apologise, apology, excuse,
forgive, pardon, regret and sorry”. Using the BNCweb Query System, he then
downloaded the results to an Excel database for manual analysis. By sifting through
the results, he identified the utterances which functioned as explicit expressions of
apology. Once the data was “cleaned” of all non-apologies, each instance was
analysed in the context of the conversation where it was originally uttered so as
to classify it functionally and pragmatically. Where available, speaker metadata,
such as gender, age, social class and the person being addressed, were also noted
in the database for each apology. In addition, where possible, other contextual
variables were also logged, such as the conversational setting (formality level),
conversation type and the number of participants in the given interaction. Details
were also entered for each apology on the power relationship and social distance of
the interlocutors. With this level of meta-detail, Deutschmann was able to generate
some very detailed results on how, when and by whom apologies were performed.
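The general workflow (retrieve every IFID candidate, store it with the speaker metadata, then code it by hand) can be sketched as follows. This is an illustration only: it does not use BNCweb, and the regular expression, the `Hit` record, the field names and the three toy utterances are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Variants of the eight IFID words listed above (a rough approximation).
IFID_PATTERN = re.compile(
    r"\b(?:afraid|apolog\w*|excus\w*|forg[ai]v\w*|pardon\w*|regret\w*|sorry)\b",
    re.IGNORECASE,
)


@dataclass
class Hit:
    text: str
    ifid: str
    speaker_age: str
    speaker_sex: str
    is_apology: Optional[bool] = None  # filled in later by the analyst


def find_ifids(utterances):
    """Retrieve every IFID candidate together with its speaker metadata."""
    hits = []
    for utt in utterances:
        for match in IFID_PATTERN.finditer(utt["text"]):
            hits.append(Hit(utt["text"], match.group(0).lower(), utt["age"], utt["sex"]))
    return hits


# Invented utterances and metadata, for illustration only.
utterances = [
    {"text": "Sorry I'm late everyone", "age": "25-34", "sex": "F"},
    {"text": "I'm afraid of spiders", "age": "45-59", "sex": "M"},
    {"text": "I do apologise for the noise", "age": "35-44", "sex": "M"},
]
hits = find_ifids(utterances)

# Manual step: these judgements stand in for the analyst's reading of each hit.
hits[0].is_apology = True
hits[1].is_apology = False  # "afraid" here expresses fear, not apology
hits[2].is_apology = True
apologies = [h for h in hits if h.is_apology]
print(len(hits), "candidates,", len(apologies), "apologies")
```

Keeping the metadata on the same record as the coded hit is what later allows apologies to be cross-tabulated against variables such as age or gender.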
Deutschmann’s analysis identifies three overall functional types of apologies:
real (prototypical) apologies, formulaic apologies and face attack apologies. What
is significant also in this study is that the author sheds light on the link between his
corpus-based approach and its results in comparison to other studies which used
different approaches. In other words, he explores the correlation between research
design and scope of results. The scale of Deutschmann’s study allows for robust
correlations between apologies and variables, such as gender, age, social class,
formality level, group size and genre. For instance, he was able to show that:
• Younger speakers apologised far more often than older speakers; and
• Speakers from middle class backgrounds apologised more than working class
counterparts.

As Woodman (2005: 316) notes, one of Deutschmann’s most novel findings was
the correlation between group size and apologies:
the more participants in an interaction, the higher the rate of apology. This meant there-
fore that genres such as meetings, classroom contexts, job interviews had more frequent
rates of apologies than genres associated with smaller sizes, for example medical con-
sultations and historical interviews (see Deutschmann 2003: 161).

Deutschmann’s findings in relation to power relations and social distance showed
that the more powerful the speaker, the higher their rate of apology and conversely,
the lower the power of the speaker, the lower the rate of apology.
Woodman (2005: 316) in reviewing Deutschmann’s (2003) methodology sum-
marises the advantages of this approach by saying, “[t]he obvious advantages of
using a computerized database such as the BNC are the sheer scale of the data and
the fact that the language occurred naturally”. Woodman (2005) continues, “[t]he
disadvantages lie in the lack of crucial information in connection with the delivery
of the apologies (such as body language and prosodic features), in the inevitable
inaccuracies involved in the transcription process, and in the lack of any psycho-
logical contextual information about the participants (e. g. perceived gravity of
offense, degree of affection between participants)”. The important achievement
of the painstaking work of Deutschmann is that it showed the scale of what can
be done in using a large corpus for the analysis of a speech act in a systematic
manner. Additionally, other studies can now “stand on the shoulders” of the work
of Deutschmann because he has offered such a detailed starting point for anyone
interested in looking at apologies in corpus data.
Lutzky and Kehoe (2017a, 2017b) are two examples of studies which
have built on the work of Deutschmann (2003). They both explore apologies in the
diachronically-structured Birmingham Blog Corpus (BBC), which spans 2000–
2010 and is 630 million words in total. In Lutzky and Kehoe (2017a), for example,
they begin with Deutschmann’s eight core apology IFIDs (see above) and their
goal is to arrive at a collocational profile of these items so that they can be used for
automatic attestation of apologies within their very large corpus.
Lutzky and Kehoe (2017a) begin with a sub-corpus of 95 million words of
blogs, plus 86 million words of readers’ comments. Using the apology IFIDs and
their lemmas (e. g. pardon/pardons/pardoned/pardoning), they generate all occur-
rences in the data, without distinguishing between apology and non-apology at this
stage. This meant that their initial findings included many non-apology items, for
example, all instances of afraid, not just I’m afraid in the context of an apology.
Their next step was to conduct a detailed word frequency profile of the collocates
of each of their initial search items. To be included, a collocate had to be among the
top 100 most frequent collocates and it had to be within a span of four words to the left
or right of the IFID. For example, the collocates of the IFID apologise included
items that occurred next to the search word, such as profusely, as well as collocates
that were up to four words to the left or right of it, for example, inconvenience or
advance. They used a z-score to rank the significance of collocational pairings relative
to collocate frequency and corpus size. For instance, they give the example
of profusely, which has the highest z-score although its raw frequency is relatively
low: profusely is a relatively rare word (occurring 348 times in the dataset), but
30 % of all of its occurrences are as a collocate of apologise and thus it has a high
z-score.
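To see why a rare collocate can outrank a frequent one, it helps to work through a z-score calculation. The source does not give the formula that was used, so the sketch below assumes one common textbook formulation of the collocational z-score (in the style of Berry-Rogghe). The corpus size is the sum of the two sub-corpus figures reported above, the figure of 104 co-occurrences is 30 % of 348 rounded, and the node frequency and the counts for the frequent word are invented for illustration.

```python
import math


def z_score(observed, collocate_freq, node_freq, corpus_size, span=8):
    """Collocational z-score (assumed formulation, not taken from the source).

    observed        co-occurrences of collocate and node within the span
    collocate_freq  frequency of the collocate in the whole corpus
    node_freq       frequency of the node word in the whole corpus
    span            positions examined around each node (4 left + 4 right)
    """
    p = collocate_freq / (corpus_size - node_freq)  # chance of the collocate at any position
    expected = p * node_freq * span                 # co-occurrences expected by chance
    return (observed - expected) / math.sqrt(expected * (1 - p))


CORPUS_SIZE = 95_000_000 + 86_000_000  # blogs plus readers' comments
NODE_FREQ = 20_000                     # hypothetical frequency of "apologise"

# profusely: 348 occurrences, about 30 % of them (104) near the node.
z_profusely = z_score(104, 348, NODE_FREQ, CORPUS_SIZE)
# A very frequent word with invented counts, for contrast.
z_frequent = z_score(5_000, 1_500_000, NODE_FREQ, CORPUS_SIZE)

print(round(z_profusely, 1), round(z_frequent, 1))
```

Under these assumptions the rare word receives the higher score even though it co-occurs with the node far less often in absolute terms, because almost none of its co-occurrences would be expected by chance.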
By building a profile of the collocates of all of the IFIDs in this manner, Lutzky
and Kehoe (2017a) were then able to aggregate the collocates across all of the
IFIDs to identify the “shared collocates” (see Lutzky and Kehoe 2017a: 46–47).
This showed some interesting patterns, for example, the pronoun I was a shared
collocate of seven of the eight IFIDs (it was not a collocate of apology within the
four word parameters set for the study) while ignorance only collocated with par-
don, excuse and forgive. Interestingly, the reason for apologising within this genre
(blogs) was reflected in this list of shared collocates, such as spelling, typos, poor,
quality and English. Additionally, Lutzky and Kehoe (2017a) identified the items
which were strong collocates with a given IFID but did not appear within the top
100 most frequent collocates of any other IFID. Among their findings in this set of
results were the strikingly colloquial items that uniquely collocated with sorry, such
as oops, aww, hugs, sucks, hon/hun (short for the endearment honey). They further
investigated oops in Lutzky and Kehoe (2017b) and asserted that it could be added
to the list of IFIDs for apologies in blogs.
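The aggregation of collocate profiles into shared and unique collocates amounts to a simple set operation. The following sketch uses a reduced toy profile for four of the eight IFIDs; the memberships follow the examples mentioned above, but the sets are far from complete and the function name `shared_and_unique` is an assumption.

```python
from collections import defaultdict


def shared_and_unique(profiles, min_ifids=2):
    """Split collocates into those shared by several IFIDs and those
    found in the profile of one IFID only."""
    owners = defaultdict(set)
    for ifid, collocates in profiles.items():
        for collocate in collocates:
            owners[collocate].add(ifid)
    shared = {c: sorted(i) for c, i in owners.items() if len(i) >= min_ifids}
    unique = {c: next(iter(i)) for c, i in owners.items() if len(i) == 1}
    return shared, unique


# Toy profiles: top collocates per IFID (incomplete, for illustration only).
profiles = {
    "pardon": {"I", "ignorance"},
    "excuse": {"I", "ignorance"},
    "forgive": {"I", "ignorance"},
    "sorry": {"I", "oops", "hugs"},
}
shared, unique = shared_and_unique(profiles)
print(shared)
print(unique)
```

Collocates that turn up in the `unique` group, such as oops with sorry in this toy profile, are the candidates that might then be examined as possible additions to the search inventory.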
Lutzky and Kehoe (2017a) and (2017b) offer a fascinating insight into new
and evolving ways of investigating speech act phenomena in very large corpora.
By profiling the similarities and differences in the collocational patterns of several
IFIDs, Lutzky and Kehoe (2017a: 54) show that “functional overlaps” and “diver-
gences can be revealed, which can in turn be used to increase the incidence of
relevant examples in the search output”. This ultimately means arriving at greater
precision in the automated retrieval of speech acts in large corpora. The authors
strongly advocate the place and merits of manual analysis, but they note, “our
methodological approach allowed us to streamline the search for the fairly routi-
nized speech act of apology in our blog data” (Lutzky and Kehoe 2017a: 54).

4.4.2. Using genre-specific search inventories from smaller samples

Kohnen (2008), in his work on directives over text and time, offers an interesting
bottom-up methodology which essentially involves moving from a search inven-
tory drawn from a micro-analysis of a representative genre sample of sermons to
extracting forms from a large-scale diachronic corpus. For Kohnen, the first step
was crucial. It involved manually sifting through a pilot dataset of church sermons
to identify all possible forms of directives. Kohnen concedes that this “by hand”
approach is “extremely labour-intensive” (2008: 296) but it generated a plausible
search inventory that formed the basis of his study and which is useful for others
who wish to investigate this speech act. In the initial process, there was some
iteration as the pilot micro-analysis was scaled up to a broader representation of
the genre – e. g. also looking at prayers, church letters, etc. Kohnen (2008: 296–7)
notes, “[t]his microanalysis will probably reveal similar as well as different mani-
festations of the speech act, enriching the initial list of manifestations. It will also
give an account of their frequencies and distribution across time”.
The next step was to select manifestations of directives and their distribution
in larger multi-genre corpora so as to further refine the inventory and test their fre-
quency and distribution. This iterative process led to a principled inventory of forms
in a genre-specific historic context. Over time, it shows the profile of a speech act
within a genre and ultimately offers a robust means of moving from a micro-analysis
of a speech act in a small representative sample of a genre to a large-scale analysis
in a diachronic dataset. Kohnen (2008: 297) reflects that by using this approach,
“we could find out about genre-specific profiles and about speech-act conventions
which may or may not apply in certain genres, and we could trace the development
of these phenomena in the history of English”. Jucker and Taavitsainen (2013)
observe that Kohnen’s method will provide the most reliable results for those patterns
that are most frequent and most conventionalised. They note, however, that it is far
less reliable for rare and creative patterns and that it also relies on the availability of
a sufficient amount of data which is relevant and which is spread across the period
of investigation. For Kohnen (2008), this approach worked well because there was
a consistent sample of sermons, and related texts, over time.

4.4.3. Using searches of typical lexical or grammatical features associated with a speech act

Another interesting approach to analysing a speech act in a corpus is found in
Taavitsainen and Jucker (2008). They sought to examine compliments in three
historical corpora. Faced with the challenge of how to retrieve these, they used
as their workaround adjectives that express positive evaluations, that is search
items such as: beautiful, nice, great, lovely, and lexical strings, such as really nice,
really great, well done, like/love your, what a, you look/’re looking. Reflecting on
the process, Jucker and Taavitsainen (2013: 107) note that while this process did
provide relevant hits, it also returned a lot of passages that were not relevant to the
research focus.
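One way of quantifying this trade-off is to compute, for each search item, the proportion of retrieved passages that turned out to be relevant once they had been read and coded. The sketch below is not part of the original study; the coded hits are invented and the function name `precision_by_pattern` is hypothetical.

```python
from collections import Counter


def precision_by_pattern(coded_hits):
    """Share of hits per search item that the analyst coded as compliments."""
    totals, relevant = Counter(), Counter()
    for pattern, is_compliment in coded_hits:
        totals[pattern] += 1
        if is_compliment:
            relevant[pattern] += 1
    return {pattern: relevant[pattern] / totals[pattern] for pattern in totals}


# Invented manual codings: (search item, coded as a compliment?).
coded_hits = [
    ("lovely", True), ("lovely", False),
    ("what a", False), ("what a", False), ("what a", True),
    ("well done", True),
]
precision = precision_by_pattern(coded_hits)
for pattern, value in precision.items():
    print(f"{pattern:<10}{value:.2f}")
```

A figure of this kind only measures how many retrieved passages were relevant; it says nothing about the compliments that none of the search items retrieved.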
The scale of their research, in terms of breadth of sources and span of time,
meant that they were able to make a number of interesting statements about com-
pliments in an historical context. For example, in Early Modern and Late Modern
English, compliments were found to be gendered. Both male and female authors
were found to use compliments in their writing but female characters received
praise for their look(s) and often turned these down as flattery. Males who
received compliments, on the other hand, accepted them by bowing. The authors
note that this aligned with social norms of the time.
While most compliments were related to physical appearance and possessions,
interestingly, Taavitsainen and Jucker (2008: 218) found only a few instances of
compliments about food and they speculate that perhaps this is due to “the social
norms, protagonists being mostly upper class and not directly associated with the
preparation of meals”.

4.4.4. Using metacommunicative expressions

Focusing on the development of compliments in American English from a diachronic
perspective of almost two hundred years, Jucker and Taavitsainen (2014) used both
the 400-million-word Corpus of Historical American English (COHA) and the
425-million-word Corpus of Contemporary American English (COCA). Building
on Taavitsainen and Jucker (2008), they developed a systematic approach to the
analysis of compliments using a “metacommunicative expression analysis” (Jucker
and Taavitsainen 2014: 258). In essence, this approach entailed using the search
term compliment to retrieve performative, descriptive and discursive instances of
the act. This method was a useful starting point even though the term was mostly
not used to perform a compliment; more often, it served to negotiate the
value of a given act or to describe someone as having paid a compliment. In this
process, their first step was to narrow their sample across a spread of five sub-cor-
pora of selected decades within the sample period. These were sampled based on
the number of instances of compliment. Subsequently, coders used the extended
contexts of the node occurrence to categorise and code compliments for variables
such as type, complimenter, complimentee, object of the compliment and compliment
response, as well as logging the genre in which each occurred. In their analysis,
they distinguished between personal compliments and ceremonious compliments
and found that in the historical data, more than 90 % were personal compliments.
Within that profile, they noted a steady decline in the use of ceremonious com-
pliments, over time. Their analysis offers details on the distribution of the gender
of the complimenter and complimentee and compares that with the contemporary
data sample from COCA. This showed that males were in the role of compli-
menter between 70 and 85 percent of the time in the historical data while in the
COCA sample, this statistic fell to 67.1 percent. In terms of the gender of com-
plimentee role, there was a balance, apart from the earliest dataset (1820/1830),
where males also received the majority of the compliments. The object of the
compliment showed a consistent pattern across the centuries where most com-
pliments were given on people’s personality/friendship and on their ability/
performance.
In looking at how compliments are responded to in contemporary American
English, it has been shown that they are normally accepted (see Chen 1993). Jucker
and Taavitsainen (2014) were keen to test this historically through their coding
of the response to the compliment within their dataset. They found that, “accept-
ance of compliments remained more or less stable for the first four periods under
investigation [1820/1830, 1870, 1900] but it is clearly higher in the most recent
period [1990/2000], in which it has reached more than 70 per cent” (Jucker and
Taavitsainen 2014: 273). They speculate that this significant rise in acceptance
may be connected with social and cultural changes, or perhaps a change in literary
styles.
Reflecting on the metacommunicative expression analysis methodology that
they deployed in this study, Jucker and Taavitsainen (2014) note that it has strengths
and weaknesses:
It allows the systematic analysis of a specific speech act in large corpora, and thus it
provides a way to investigate synchronic differences or diachronic developments which
would be inaccessible to other methods of investigation. On the other hand, the method
mostly retrieves accounts of a particular speech act rather than the actual speech acts,
and statistical results based on such accounts may be misleading. In the case of com-
pliments, for instance, the retrieved passages may contain a disproportionate amount of
problematic compliments, such as utterances whose status is unclear to the participants.
Such problematic compliments may, of course, differ in systematic ways from a large
number of unproblematic compliments that are given and received in a graceful manner
without any need to explicitly talk about them (Jucker and Taavitsainen 2014: 274).

5. Conclusion

There is no going back to the days before corpora; corpus pragmatics will only
grow stronger amid advances in annotation models, resources and tools. It is
important that, within this rapid stream of progress, we are not tempted to see
easily generated computations of forms as a substitute for the qualitative depth that
is needed to fully understand how the meaning of these forms manifests in context.
Taavitsainen (this volume) recalls the caveats of Rissanen (1989) in the early days
of historical corpus linguistics. Rissanen could see that diachronic corpora offered
so much to the field of philology and had so much optimism in terms of what the
power and scope of CL could bring but he flagged the concern that “the corpus
revolution would turn to mere number-crunching” (Rissanen 1989: 17). Though
Rissanen’s fears did not materialise for historical corpus linguistics, it was and still
is healthy to be mindful of these words. Corpus pragmatics is at a relatively early
stage and there is so much potential for both form-to-function and function-to-form
approaches (and indeed a combination of both). It is important for this developing
sub-field that we reflect more on how methodological approaches can be enhanced
through the development of more pragmatic annotation tools, search and retrieval
protocols and resources. Amid the endless growth in data size and ease of availability,
we need to remain mindful of the fact that pragmatic insight often starts with
small-scale scoping work, such as we have seen in the work of Kohnen (2008),
Weisser (2015) and Garcia McAllister (2015). We also see that “corpus toiling”
pays off. The painstaking work of Deutschmann (2003) has facilitated development
in IFID collocational profiling by Lutzky and Kehoe (2017a and 2017b). Similarly, the
insights which Taavitsainen and Jucker (2008) gained in the analysis of compliments,
using typical features such as positive adjectives to aid recall, led to further refinement
in later work on metacommunicative expressions (see Jucker and Taavitsainen
2014).
All of these “small-scale” steps, in the larger scheme of mega-corpora, are
leaps in our understanding of the intricacies of building corpora that are fit for the
purpose of both form-to-function and function-to-form approaches to research.
An important lesson from successful examples of corpus-based function-to-form
work to date is the link between the level of detail and consistency of the corpus
metadata and the depth and scope of the results that have been generated about a
given speech act, or related phenomenon (Deutschmann 2003 is a good example
of this). The importance of gathering context-rich metadata should not be overlooked
by those who are designing a corpus of any scale. The capturing of the subtleties
of any given context will make the dataset more fruitful for pragmatics research
for centuries to come.

References

Adolphs, Svenja
2008 Corpus and Context. Investigating Pragmatic Functions in Spoken Discourse.
Amsterdam/Philadelphia: John Benjamins.
Aijmer, Karin
1996 Conversational Routines in English: Convention and Creativity. London:
Longman.
Aijmer, Karin
[this volume] Corpus pragmatics: From form to function.
Ädel, Annelie and Randi Reppen
2008 The challenges of different settings: An overview. In: Annelie Ädel and Randi
Reppen (eds.), Corpora and Discourse: The Challenges of Different Settings,
1–6. Amsterdam/Philadelphia: John Benjamins.
Beebe, Leslie M. and Martha C. M. Cummings
1996 Natural speech act data versus written questionnaire data: How data collection
method affects speech act performance. In: Susan M. Gass and Joyce Neu
(eds.), Speech Acts across Cultures, 65–86. Berlin: Mouton de Gruyter.
Biber, Douglas, Susan Conrad, Randi Reppen, Pat Byrd and Maria Helt
2002 Speaking and writing in the university: A multi-dimensional comparison.
TESOL Quarterly 36: 9–48.
Biber, Douglas, Susan Conrad, Randi Reppen, Pat Byrd, Maria Helt, Victoria Clark,
Viviana Cortes, Eniko Csomay and Alfredo Urzua
2004 Representing Language Use in the University: Analysis of the TOEFL 2000
Spoken and Written Academic Language Corpus. Princeton: Educational Test-
ing Service.
Billmyer, Kristine and Manka Varghese
2000 Investigating instrument-based pragmatic variability: Effects of enhancing
discourse completion tests. Applied Linguistics 21(4): 517–552.
Blum-Kulka, Shoshana, Juliane House and Gabriele Kasper (eds.)
1989 Cross-Cultural Pragmatics: Requests and Apologies. Norwood: Ablex.
Brinton, Laurel J.
2012 Historical pragmatics and corpus linguistics: Problems and strategies. Language
and Computers 76: 101–131.
Bodman, Jean W. and Miriam Eisenstein
1988 May God increase your bounty: The expression of gratitude in English by
native and non-native speakers. Cross Currents 15: 1–21.
Channell, Joanna
1994 Vague Language. Oxford: Oxford University Press.
Chen, Rong
1993 Responding to compliments: A contrastive study of politeness strategies
between American English and Chinese speakers. Journal of Pragmatics 20
(1): 49–75.
Cheng, Winnie and Anne O’Keeffe
2015 Vagueness. In: Karin Aijmer and Christoph Rühlemann (eds.), Corpus Prag-
matics: A Handbook, 360–378. Cambridge: Cambridge University Press.
Clancy, Brian
2015 Investigating Intimate Discourse: Exploring the Spoken Interaction of Fami-
lies. Abingdon: Longman.
Clancy, Brian and Anne O’Keeffe
2015 Pragmatics. In: Douglas Biber and Randi Reppen (eds.), The Cambridge
Handbook on Corpus Linguistics, 235–251. Cambridge: Cambridge Univer-
sity Press.
Danescu-Niculescu-Mizil, Cristian, Moritz Sudhof, Daniel Jurafsky, Jure Leskovec and
Christopher Potts
2013 A computational approach to politeness with application to social factors. In:
Proceedings of ACL 2013. Online at: www.mpi-sws.org/~cristian/Politeness.
html (accessed January 2017).
Deutschmann, Mats
2003 Apologising in British English. (Skrifter från moderna språk 10). Umeå: Insti-
tutionen för moderna språk, Umeå University.
Farr, Fiona, Bróna Murphy and Anne O’Keeffe
2004 The Limerick Corpus of Irish English: Design, description and application.
Teanga 21: 5–30.
Friginal, Eric, Marsha Walker and Janet Beth Randall
2014 Exploring mega corpora: Google Ngram Viewer and the Corpus of Historical
American English. EuroAmerican Journal of Applied Linguistics and Lan-
guages 1(1): 48–68.
Corpus-based function-to-form approaches 615

Flöck, Ilka and Ronald Geluykens
2015 Speech acts in corpus pragmatics: A quantitative contrastive study of directives
in spontaneous and elicited discourse. In: Jesús Romero-Trillo (ed.),
Yearbook of Corpus Linguistics and Pragmatics 2015, 7–37. London: Springer.
Garcia, Paula
2007 Pragmatics in academic contexts: A spoken corpus study. In: Mari C. Campoy
and María J. Luzón (eds.), Spoken Corpora in Applied Linguistics, 97–128.
Bern: Peter Lang.
Garcia McAllister, Paula
2015 Speech acts: A synchronic perspective. In: Karin Aijmer and Christoph Rühlemann
(eds.), Corpus Pragmatics: A Handbook, 29–51. Cambridge: Cambridge
University Press.
Goffman, Erving
1979 Forms of Talk. Philadelphia: University of Pennsylvania Press.
Goffman, Erving
1974 Frame Analysis: An Essay on the Organization of Experience. Cambridge,
MA: Harvard University Press.
Geluykens, Ronald and Gert Van Rillaer
1995 Introducing ACID: The Antwerp Corpus of Institutional Discourse. Interface.
Journal of Applied Linguistics 10(1): 83–101.
Hartford, Beverly S. and Kathleen Bardovi-Harlig
1992 Experimental and observational data in the study of interlanguage pragmatics.
In: Lawrence F. Bouton and Yamuna Kachru (eds.), Pragmatics and Language
Learning 3, 33–52. University of Illinois, Urbana-Champaign: Division of
English as an International Language.
Jucker, Andreas H.
2013 Corpus pragmatics. In: Jan-Ola Östman and Jef Verschueren (eds.), Handbook
of Pragmatics, 2–17. Amsterdam/Philadelphia: John Benjamins.
Jucker, Andreas H. and Irma Taavitsainen
2013 English Historical Pragmatics. Edinburgh: Edinburgh University Press.
Jucker, Andreas H. and Irma Taavitsainen
2014 Complimenting in the history of American English: A metacommunicative
expression analysis. In: Irma Taavitsainen, Andreas H. Jucker and Jukka
Tuominen (eds.) Diachronic Corpus Pragmatics, (Pragmatics & Beyond New
Series 243.) 257–276. Amsterdam/Philadelphia: John Benjamins.
Jucker, Andreas H., Irma Taavitsainen and Gerold Schneider
2012 Semantic corpus trawling: Expressions of ‘courtesy’ and ‘politeness’ in the
Helsinki Corpus. In: Carla Suhr and Irma Taavitsainen (eds.), Developing cor-
pus methodology for historical pragmatics (Studies in variation, contacts and
change in English 11). Helsinki: VARIENG. http://www.helsinki.fi/varieng/
series/volumes/11/jucker_taavitsainen_ schneider/.
Kallen, Jeffrey L. and John M. Kirk
2012 SPICE-Ireland: A User’s Guide. Belfast: Cló Ollscoil na Banríona.
Kirk, John M.
2012 Beyond the structural levels of language: An introduction to the SPICE-Ire-
land Corpus and its uses. In: Janet Cruickshank and Robert McColl Millar
(eds.), After the Storm: Papers from the Forum for Research on the Languages
of Scotland and Ulster Triennial Meeting, Aberdeen 2012, 207–232. Aber-
deen: Forum for Research on the Languages of Scotland and Ireland.
Kirk, John M.
2016 The Pragmatic Annotation Scheme of the SPICE-Ireland Corpus. International
Journal of Corpus Linguistics 21(3): 299–322.
Kirk, John M. and Gisle Andersen
2016 Compilation, transcription, markup and annotation of spoken corpora. Inter-
national Journal of Corpus Linguistics 21(3): 291–298.
Kirk, John M., Jeffrey L. Kallen, Orla Lowry, Anne Rooney and Margaret Mannion
2011 The SPICE-Ireland Corpus: Systems of Pragmatic Annotation for the Spoken
Component of ICE-Ireland. [Version 1.2.2.] Belfast: Queen’s University Bel-
fast and Dublin: Trinity College Dublin.
Koester, Almut J.
2002 The performance of speech acts in workplace conversations and the teaching
of communicative functions. System 30: 167–184.
Kohnen, Thomas
2008 Tracing directives through text and time: Towards a methodology of a cor-
pus-based diachronic speech-act analysis. In: Andreas H. Jucker and Irma
Taavitsainen (eds.), Speech Acts in the History of English, (Pragmatics &
Beyond New Series 176.) 295–310. Amsterdam/Philadelphia: John Benja-
mins.
Lutzky, Ursula and Andrew Kehoe
2017a “I apologise for my poor blogging”: Searching for apologies in the Birming-
ham Blog Corpus. Corpus Pragmatics 1: 37–56.
Lutzky, Ursula and Andrew Kehoe
2017b “Oops, I didn’t mean to be so flippant”: A corpus pragmatic analysis of apolo-
gies in blog data. Journal of Pragmatics 116: 27–36.
McCarthy, Michael J.
1998 Spoken Language and Applied Linguistics. Cambridge: Cambridge University
Press.
McCarthy, Michael J. and Anne O’Keeffe
2003 “What’s in a name?” – vocatives in casual conversations and radio phone-in
calls. In: Pepi Leistyna and Charles F. Meyer (eds.), Corpus Analysis: Language
Structure and Language Use, 153–185. Amsterdam: Rodopi.
Milà-Garcia, Alba
in press Pragmatic annotation for a multilayered analysis of speech acts: A methodo-
logical proposal. Corpus Pragmatics 2 (4).
O’Keeffe, Anne
2006 Investigating Media Discourse. Abingdon: Routledge.
O’Keeffe, Anne, Michael J. McCarthy and Ronald A. Carter
2007 From Corpus to Classroom: Language Use and Language Teaching. Cam-
bridge: Cambridge University Press.
O’Keeffe, Anne, Brian Clancy and Svenja Adolphs
2011 Introducing Pragmatics in Use. Abingdon: Routledge.
Rissanen, Matti
1989 Three problems connected with the use of diachronic corpora. ICAME Journal
13: 16–22.
Romero-Trillo, Jesús (ed.)
2008 Pragmatics and Corpus Linguistics: A Mutualistic Entente. Berlin/New York:
Mouton de Gruyter.
Rühlemann, Christoph and Brian Clancy
Forthcoming Corpus linguistics and pragmatics. In: Neal Norrick and Cornelia Ilie
(eds.), Pragmatics and its Interfaces. Amsterdam/Philadelphia: John Benja-
mins.
Rühlemann, Christoph and Karin Aijmer
2015 Corpus pragmatics: Laying the foundations. In: Karin Aijmer and Christoph
Rühlemann (eds.), Corpus Pragmatics: A Handbook, 1–26. Cambridge: Cam-
bridge University Press.
Rühlemann, Christoph and Matthew B. O’Donnell
2012 Introducing a corpus of conversational narratives: Construction and annota-
tion of the Narrative Corpus. Corpus Linguistics and Linguistic Theory 8(2):
313–350.
Sasaki, Miyuki
1998 Investigating EFL students’ production of speech acts: A comparison of pro-
duction questionnaires and role plays. Journal of Pragmatics 30: 457–484.
Schauer, Gila and Svenja Adolphs
2006 Expressions of gratitude in corpus and DCT data: vocabulary, formulaic
sequences, and pedagogy. System 34(1): 119–134.
Searle, John R.
1976 A classification of illocutionary speech acts. Language in Society 5(1): 1–23.
Stiles, William B.
1992 Describing Talk: A Taxonomy of Verbal Response Modes. Newbury Park:
Sage.
Taavitsainen, Irma
[this volume] Historical corpus pragmatics.
Taavitsainen, Irma and Andreas H. Jucker
2007 Speech act verbs and speech acts in the history of English. In: Susan M. Fitz-
maurice and Irma Taavitsainen (eds.), Methods in Historical Pragmatics, 107–
137. Berlin: Mouton de Gruyter.
Taavitsainen, Irma and Andreas H. Jucker
2008 “Methinks you seem more beautiful than ever”: Compliments and gender in
the history of English. In: Andreas H. Jucker and Irma Taavitsainen (eds.),
Speech Acts in the History of English, 195–228. Amsterdam/Philadelphia:
John Benjamins.
Taavitsainen, Irma and Andreas H. Jucker
2015 Twenty years of historical pragmatics: Origins, developments and changing
thought styles. Journal of Historical Pragmatics 16(1): 1–24.
Verschueren, Jef
1999 Understanding Pragmatics. London: Arnold.
Weisser, Martin
2015 Speech act annotation. In: Karin Aijmer and Christoph Rühlemann (eds.),
Corpus Pragmatics: A Handbook, 84–110. Cambridge: Cambridge University
Press.
Woodman, Gill
2005 Review of Mats Deutschmann, Apologising in British English. Language in
Society 34: 314–317.
Yuan, Yi
2001 An inquiry into empirical pragmatics data-gathering methods: Written DCTs,
oral DCTs, field notes, and natural conversations. Journal of Pragmatics 33:
271–292.
24. Corpus-based metapragmatics
Michael Haugh

Abstract: Metapragmatics encompasses the study of displays of awareness on
the part of users and observers of language about their use of language. In this
chapter, it is argued that corpus-based approaches to metapragmatics are uniquely
positioned to advance such studies, as they allow for interpretive (horizontal) and
statistical (vertical) methods of analysis to be successively applied to the analysis
of metapragmatic data. The different ways in which corpora have been used to
study metapragmatic labels with respect to the broader metacommunicative lexi-
con of which they form a part are discussed, alongside studies of metapragmatics
in use, that is, the use of these metapragmatic labels in situated contexts. It is then
argued, building on a case study illustrating how claims of non-serious intent (e. g.
just kidding, only joking) can be studied from a corpus-based perspective, that
studies in corpus-based metapragmatics necessarily involve carefully interweaving
horizontal and vertical methods of analysis. The chapter concludes with a call for
further corpus-based metapragmatic studies across different languages.

1. Introduction

Metapragmatics is concerned with the study of reflexive awareness on the part of
participants in interactions, and observers of interactions, about the language that
is being used in those interactions. In other words, it involves analysing the ways
in which we display awareness of our use of language through the various ways in
which we use language to refer to our use of language.
Consider the following excerpt from a phone call taken from the CallFriend
corpus, in which Jess (F1) has called her friend Cathy (F2). Jess is telling Cathy
about an assignment she is completing for a class at college.1 Jess has already
established with Cathy prior to this excerpt that their conversation is being recorded
for research purposes.

1
http://talkbank.org/browser/index.php?url=CABank/CallFriend/engn/engn6015.cha.
The excerpt is reproduced here according to CHAT transcription conventions (see
Appendix) as they are used in Talkbank (MacWhinney 2000). We will revisit the same
example transcribed using more detailed CA (Conversation Analysis) transcription con-
ventions in example (3) in section 5.

https://doi.org/10.1515/9783110424928-024
In: A. H. Jucker, K. P. Schneider and W. Bublitz (eds.). (2018). Methods in Pragmatics, 619–643. Berlin/
Boston: De Gruyter Mouton.

(1) CallFriend: engn6015: 3:08

102 F1: I wrote a children’s book (0.4) and (0.6)
103 I’m really happy with the entry notes, and like,
104 I did this for my final project, in this class, and ⌈you can⌉.
105 F2: ⌊uh huh⌋.
106 F1: basically you can do anything, it’s for this children’s literature class
107 so the rule, of thumb→
108 F2: +≈they’ll hear→
109 F2: they’re recording us→
110 F2: I’m just kidding ⌈Go on, hhh hhh hhh I’m just kidding⌉→
111 F1: ⌊ hhh and ⌋ and now I’m going to illustrate it↗

What is notable here is that after warning Jess in lines 108–109 that others will
hear what Jess has just said in line 106, Cathy goes on to claim she is just kidding
(line 110). This claim to be just kidding refers back to her prior warning in lines
108–109, and thus constitutes a display of metapragmatic awareness on Cathy’s
part. In claiming to be just kidding she construes this warning as a tease that should
not be taken seriously, with shared laughter in lines 110–111 displaying a joint
understanding on both their parts of just that. However, this kind of interpretive
analysis, albeit one grounded in the responses of participants to (just) prior talk
through which they display their understandings of that talk, raises a number of
broader questions for linguistic pragmaticians.
Cathy is certainly not the first person to claim to be just kidding. A search
of any corpus of spoken or digitally-mediated talk reveals this to be a common
collocation in English. This raises questions about what it “means” to be just kid-
ding, and whether it differs from the meaning of other related terms, such as only
kidding, kidding, just joking, only joking and so on. What Cathy and Jess take just
kidding to mean here likely depends on their own encounters with this particular
collocation, and how it is used by speakers of the variety of English with which
they are evidently both familiar, namely, General American English, specifically
that variety used in the Upper Midwest where the recording took place. The sec-
ond question raised here concerns what Cathy is (taken to be) “doing” through
her claim to be just kidding. From Jess’s response in line 111, in which she first
laughs and then proceeds to return to her prior telling about the class project, it
appears that this claim to be just kidding not only construes the prior utterance as
a form of non-serious teasing, but also invites laughter and a subsequent return to
serious talk (Haugh 2016b). Given the split-second timing here in the to-and-fro
of this interaction, it appears that what this claim to be just kidding is doing is
readily recognisable to both these interactants. The broader questions for research
in metapragmatics are what underpins the recognisability of these kinds of metapragmatic
labels, not only for these participants but also for us as observers of this interaction,
and how their use in these kinds of metapragmatic comments shapes
our understandings not only of particular interactions but of our social reality more
broadly. Given that metapragmatics is centrally concerned with studying language use
that reflexively takes language use as its object of interest, it is no surprise that the
analysis of various forms of metalinguistic awareness plays a key role in proceeding
to answer such questions.
In this chapter, we consider the contribution that corpus pragmatics can make
to advancing our understanding of the metapragmatics of different languages and
varieties therein. The contention will be that methods and approaches developed
in corpus pragmatics are perhaps of even greater import to metapragmatics than they are
to the study of pragmatics more generally. This is because the use of various metalinguistic
“forms” is the primary means by which participants and observers can display
metapragmatic “awareness” about their use of language. The identification and
analysis of the functions of linguistic forms across large tracts of data is, of course,
the raison d’être of corpus pragmatics. For that reason, corpus-based metaprag-
matics is arguably uniquely placed to contribute to the ongoing advancement of
metapragmatics more generally, particularly as we move towards a pragmatics that
treats pragmatic variation both within and across languages as a serious object of
analysis in its own right (Schneider 2017; Schneider and Barron 2008).
We begin, in the following section, by briefly introducing metapragmatics as
a field and its growing inter-relationship with corpus pragmatics. In section three,
we move to consider studies that use corpora to examine what different tokens in
the metacommunicative lexicon are taken to mean both within and across interac-
tions. We then discuss, in section four, metapragmatics in use, that is, studies that
use corpora to examine what invoking metapragmatic labels in situated contexts is
taken to be doing in those interactions. In section five, the various methodological
strands that were identified in the two prior sections are drawn together in a brief
case study that discusses how claims to non-serious intent (e. g. just kidding, only
joking etc.) can be studied from a corpus-based perspective, and the issues that
arise in undertaking such studies. The chapter concludes by outlining the impor-
tance of undertaking further research in corpus-based metapragmatics.

2. Metapragmatics and corpus pragmatics

In its broadest formulation, metapragmatics is concerned with “the language user’s
reflexive awareness of what is involved in a usage event” (Verschueren [1995]
2010: 1). This reflexive awareness is what affords multiple layers of interpretation
and evaluation in communicative interaction, as first outlined in Ruesch and
Bateson (1951), and subsequently in Bateson’s (1972) highly influential theory of
metacommunication. This broader sense of metapragmatics amounts to a (meta)
theory of pragmatics itself (Caffi 1994, 1998; Hübler 2011; Mey 2001). However,
in a more narrowly focused sense, metapragmatics encompasses studying the ways
in which users display reflexive awareness of their use of language through uses
of language that reflexively refer to their use of language (Culpeper and Haugh
2014; Kádár and Haugh 2013; Hübler and Bublitz 2007; Hübler and Busse 2012).
In the latter case, the focus is on how we can gain insights into how people both
conceptualise and evaluate aspects of their social worlds through studying metapragmatic
uses of language in which users are explicitly directing attention to their
use of language through their use of language.
Reflexivity and awareness are thus two key concepts in metapragmatics.
Reflexivity refers to the way in which one level of interpretation of language
use by users is interdependent with other levels of interpretation, while aware-
ness involves directed attention on the part of users towards particular pragmatic
objects. Niedzielski and Preston (2009) suggest the latter varies in its degree of
salience and accessibility for those users (and indeed observers) of language in use.
Various forms of metapragmatic awareness have been discussed (Coupland and
Jaworski 2004; Hübler 2011; Hübler and Bublitz 2007; Mertz and Yovel 2009),2
but two of particular relevance to corpus pragmatics mirror those originally noted
by Bateson (1955) himself, namely, metalinguistic awareness and metacommu-
nicative awareness. Metalinguistic awareness refers more generally to the ability
to treat language itself as an object of reflection through recourse to metalanguage
(that is, language about language), while metacommunicative awareness refers to
the ability to treat communication itself as an object of reflection through recourse
to metacommunication (that is, communication about communication).
In the example above, for instance, the claim to be just kidding involves a
particular form of metalinguistic awareness where the focus is on the ways in
which language is used to communicate with others, or what is sometimes termed
metapragmatic or metacommunicative awareness. The collocation just kidding we
discussed in section one is thus representative of a particular “form” of meta-
language that is concerned with our use of language, which scholars have vari-
ously termed metapragmatic labels (Culpeper and Haugh 2014), metacommunica-
tive expressions (Jucker and Taavitsainen 2014), or metacommunicative lexicon
(Hübler 2011). Notably, the “use” of this particular instance of metalanguage in
this situated context contributes to shaping the interaction in particular ways, both
in relation to the social action being accomplished here (i. e. a tease), and how the
producer proposes it should be interpreted (i. e. as non-serious, playful, jocular
etc.) and thus evaluated (i. e. as “non-impolite”). Uses of metapragmatic labels
to accomplish social actions in this way are variously termed metapragmatic comments
(Culpeper 2011), metalinguistic comments (Davies 2011), meta-utterances
(Hübler and Bublitz 2007), and so on.

2
See Culpeper and Haugh (2014: 240–258) for a useful summary.

The importance of metalinguistic awareness on the part of users for metapragmatics
is premised on the fact that “metalanguage also creates, structures, and
forms language and ongoing speech” (Mertz and Yovel 2009: 250, original emphasis).
The use of metalanguage by both participants in, and observers of, interaction
thus plays a key role in structuring understandings of our social world. As Jucker
(2013) points out, “discourse on these elements, the discourse on politeness or the
discourse on a particular speech act, such as compliments, can give us important
insights of an ethnographic nature. It tells us how people evaluate these elements”
(2013: 15, original emphasis). This is arguably important since to reflexively refer
to one’s own use of language or that of others is itself a social action where users
are directing our attention to issues of (moral) accountability and evaluation. The
moral bases of these evaluations are, of course, not conceptualised or practised in
the same ways by users of different languages, and varieties therein, and so what is
regarded as “(in)appropriate”, “(im)proper” or “(im)polite” use of language is sub-
ject to considerable synchronic and diachronic variation both cross-linguistically
and cross-culturally.
As Hübler and Busse (2012) note, metapragmatic uses of language arise through
people “abstracting from individual phenomena and treating them as tokens of a
type” (2012: 1), specifically, where people become “aware of what they do when
they communicate” and wish to “share details [of this] with other group members”
(2012: 2). This means that over time the metacommunicative lexicon itself both
shapes and is shaped by ongoing interactions across users who identify with a par-
ticular language, or variety therein. Systematic study of metapragmatic labels and
metapragmatic comments thus offers us a way forward out of a pragmatics that has
been dominated to date by the scientific metalanguage of English (Culpeper and
Haugh 2014; Haugh 2016a).
At this point, the potential for corpus pragmatics to contribute to this endeavour
should be becoming apparent. Corpus pragmatics in the broadest sense involves
“studies of language use that employ large, computer-readable collections of lan-
guage” (Jucker 2013: 1, this volume). An important characteristic of studies in
corpus pragmatics is that they typically integrate horizontal and vertical forms of
analysis (Rühlemann and Aijmer 2015: 12). Horizontal analysis involves employ-
ing qualitative methods to examine the function(s) of particular linguistic forms
in their locally situated, sequential contexts. Vertical analysis, on the other hand,
involves employing quantitative methods to identify recurrent patterns in the use
of particular linguistic forms across different discourse contexts. The integration of
these perspectives is generally accomplished in an iterative manner, in which the
focus is on examining the co-textual patterns of a linguistic item or items (Clancy
and O’Keeffe 2015: 235). The iterative process by which the analyst examines
the functions of particular linguistic forms arguably allows him or her to move
“beyond important but surface observations of lexico-grammatical patterns to
allow a more nuanced interpretation of these patterns taking into consideration
who uses them, where they were used, for what purposes, and how this use has
changed over time” (Clancy and O’Keeffe 2015: 235–236). This iterative research
process confers significant advantage on studies in metapragmatics where what
is accomplished through locally situated metapragmatic comments is shaped, in
part, by what those participants understand to be meant by the metapragmatic
label being used, and in part, by what using that metapragmatic label in that type
of discourse context typically accomplishes.
In sum, corpus-based metapragmatics involves the study of metapragmatic
labels and metapragmatic comments as they arise in corpora using a combina-
tion of interpretive (horizontal) and statistical (vertical) methods of analysis. The
advantage therein is that we are able to not only examine how particular persons
on certain occasions conceptualise and evaluate what they are doing with language,
but also how such conceptualisations and evaluations are accomplished across
groups of persons. In the following two sections, we move to discuss studies of
metapragmatic labels and metapragmatic comments that exemplify just how this
may be done.
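
The combination of horizontal and vertical analysis summarised above can be illustrated with a minimal sketch. It assumes a tiny invented corpus (the text identifiers and utterances in CORPUS are hypothetical, not drawn from CallFriend or any other real corpus) and a simple regular expression for claims of non-serious intent. The kwic function gives a horizontal view, each hit in its immediate co-text, while the frequencies function gives a vertical view, counts of each form per text. It is an illustration of the general idea only, not the procedure followed in any of the studies cited.

```python
import re
from collections import Counter

# Hypothetical mini-corpus: (text_id, utterance) pairs invented for illustration.
CORPUS = [
    ("call_a", "they're recording us I'm just kidding go on"),
    ("call_a", "no I was only joking about that"),
    ("blog_b", "Just kidding, I would never say that."),
    ("blog_b", "I am serious about this one."),
]

# Claims of non-serious intent: optional "just"/"only" followed by "kidding"/"joking".
PATTERN = re.compile(r"\b(?:(just|only)\s+)?(kidding|joking)\b", re.IGNORECASE)


def kwic(corpus, pattern, width=20):
    """Horizontal view: each hit with its immediate left and right co-text."""
    lines = []
    for text_id, utterance in corpus:
        for m in pattern.finditer(utterance):
            left = utterance[max(0, m.start() - width):m.start()]
            right = utterance[m.end():m.end() + width]
            lines.append((text_id, left, m.group(0), right))
    return lines


def frequencies(corpus, pattern):
    """Vertical view: how often each (lower-cased) form occurs in each text."""
    counts = Counter()
    for text_id, utterance in corpus:
        for m in pattern.finditer(utterance):
            form = " ".join(m.group(0).lower().split())
            counts[(text_id, form)] += 1
    return counts


for text_id, left, hit, right in kwic(CORPUS, PATTERN):
    print(f"{text_id:8} {left:>20} [{hit}] {right}")
print(frequencies(CORPUS, PATTERN))
```

In practice the analyst would move back and forth between the two views: the counts suggest where to look, and the concordance lines show what the form is doing in each sequential context.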

3. Corpus-based approaches to the metacommunicative lexicon

A key focus of research in corpus-based metapragmatics has been on the metacommunicative
lexicon, that is, expressions that denote communicative concepts
(Hübler and Busse 2012: 2; Jucker and Taavitsainen 2014: 12). The key foci of
the metacommunicative lexicon can be broadly summarised as falling within one
of the following three groups (Culpeper and Hardaker 2016: 126; cf. Jucker and
Taavitsainen 2014: 12):
1. Pragmatic acts and activities (e. g. apologise, compliment, joke, tease, threaten)
2. Inferential acts and activities (e. g. allude, hint, imply, mean, insinuate)
3. Evaluative acts and activities (e. g. aggressive, considerate, friendly, polite,
rude)
Pragmatic acts and activities refer to the social actions, and structured sequences of
actions within more extended activities, that participants are (taken to be) accom-
plishing. Inferential acts and activities refer to pragmatic meanings, that is, what
participants are (taken to be) referring to, presuming, saying, implicating, inferring
and so on. Finally, evaluative acts and activities refer to the interpersonal attitudes
and evaluations participants are (taken to be) instantiating (Culpeper and Haugh
2014: 267).
There is, of course, inevitably some degree of overlap and such lists are not
exhaustive, given metapragmatic labels represent first-order, lay categorisations
of the various kinds of acts and activities that can be accomplished in conversa-
tional interaction. However, while one cannot generate a theoretically coherent
taxonomy from corpus-based analyses of metapragmatic labels, such studies do
nevertheless offer penetrating insights into the conceptual and evaluative fields
which do much to structure social reality for users. Hübler and Busse (2012) sug-
gest, for instance, that these sorts of metapragmatic labels “may encapsulate cul-
tural models of communication rooted in particular practices of socio-culturally
defined people”, and so “unpacking past contextual meaning of metacommunica-
tive lexemes comes close to what Geertz (1973) would call a ‘thick description’”
(2012: 8). Such points have also long been made by scholars working within the
Natural Semantic Metalanguage (NSM) tradition (Wierzbicka [1991] 2003). Such
meanings are thus clearly of relevance when we start to examine various kinds of
pragmatic phenomena across languages, and varieties therein.
Taylor (2016a), for example, demonstrates that the conceptual scope of irony
and sarcasm in (British) English and ironia and sarcasmo in Italian are not syn-
onymous by explicating how their meanings not only overlap but also differ. Such
linguistic nuances are important. If being ironic or sarcastic amongst (British)
speakers of English is not regarded as exactly the same thing as being ironico or
sarcastico amongst speakers of Italian, for instance, then studies that attempt to
examine cross-cultural differences can be confounded by underlying conceptual
differences in our object of study. There is evidently a pressing need for further
studies that map the conceptual scope of metapragmatic labels across different
languages and varieties therein if pragmatics is to avoid the ontological limitations
that inevitably arise from relying on English (or even other major languages) as the
scientific metalanguage of choice (Haugh 2016a).
Notably, work to date in corpus-based metapragmatics has largely been focused
on English. This is due, in part, to the ready availability of corpora in English. But
it also reflects an attempt by those researchers to draw attention to the culturally
loaded nature of the metacommunicative lexicon in English (as is the case
with all languages). Such studies have offered insights into the concepts indexed
by particular metapragmatic labels in different varieties of English by examining
their relationships with other expressions in the co-text (i. e. syntagmatic relation-
ships), or with other expressions in the same semantic field (i. e. paradigmatic rela-
tionships). In both cases, the key tools for analysis include examining the relative
frequency of occurrence of different metapragmatic labels, concordance analyses
of the textual environments in which they occur, and cluster analyses of the other
expressions with which their collocational behaviour co-varies in systematic ways.
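
As a rough illustration of how relationships with other expressions in the co-text might be examined, the following sketch counts the tokens that occur within a fixed window around a node word. The sample sentence, the whitespace tokenisation and the window size are all assumptions made for the sake of the example; dedicated tools such as those mentioned in this section use lemmatised, grammatically parsed corpora and statistical association measures rather than raw counts.

```python
from collections import Counter


def collocates(tokens, node, window=3):
    """Count tokens occurring within `window` positions of each `node` token."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the node itself
                counts[tokens[j]] += 1
    return counts


# Invented toy text, lower-cased and whitespace-tokenised for simplicity.
TEXT = "he was very rude to the waiter and she was rude to me too"
tokens = TEXT.split()
print(collocates(tokens, "rude", window=2).most_common(3))
```

Raw window counts of this kind are dominated by frequent function words, which is one reason why corpus tools normally rank collocates by an association score rather than by frequency alone.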
A syntagmatic perspective examines collocates of the expression in question
found in the co-text. In so doing it offers insights into its “semantic preference”,
that is, the set of lexical items with which it recurrently collocates (Bednarek 2008:
121), and thus its “semantic prosody”, that is, the “aura of meaning with which a
form is imbued by its collocates” (Rühlemann 2013: 291, citing Louw 1993: 157).
Culpeper’s (2009) study of “impoliteness”-related metalanguage in the 2-billion-word
Oxford English Corpus offers an exemplar of this kind of perspective. Culpeper
(2009) initially found that while rude is a very high frequency lexeme (18,387
tokens), impolite is a very low frequency one (871 tokens). What this suggests is
that these terms vary significantly in their degree of salience for ordinary speakers
of English, and thus so do their connotations. Impolite, for instance, appears to
have a “more formal, a more highbrow flavour” (Culpeper 2009: 77). Systematic
examination of collocates of rude and impolite using Word Sketch (Kilgarriff et al.
2014) – which forms part of a suite of analytical tools offered in Sketch Engine3
– highlights further differences between these two metapragmatic labels. While
rude tends to be associated with items that link speakers and their talk, impolite is
associated with items that link hearers with someone else’s talk (Culpeper 2009:
77). In other words, the focus of rude is on the speaker and their behaviour, while
in the case of impolite, the focus is on the effects on the hearer of the speaker’s talk
or behaviour. What this goes to show is that attempts to assign technical meanings
to particular terms in pragmatics can be undermined by the conceptual baggage that
such metapragmatic labels inevitably carry in ordinary discourse, a problem that
has long been noted by politeness researchers (Watts, Ide and Ehlich 1992).
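
The kind of collocate counting that underlies a syntagmatic analysis can be illustrated with a minimal sketch. It is not how Word Sketch works (that tool groups collocates by grammatical relation); it assumes a simple window of tokens on either side of the node word, and the function name `window_collocates` and the sample sentences are invented purely for illustration.

```python
import re
from collections import Counter


def window_collocates(text: str, node: str, window: int = 3) -> Counter:
    """Count words occurring within `window` tokens either side of `node`."""
    # Crude tokenisation: lower-cased alphabetic strings (apostrophes kept).
    tokens = re.findall(r"[a-z']+", text.lower())
    counts: Counter = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            counts.update(left + right)
    return counts


# Invented sentences, for illustration only (not corpus data).
sample = "He was rude to the waiter. It is rude to stare."
print(window_collocates(sample, "rude", window=2).most_common(3))
```

In practice, raw co-occurrence counts of this kind would normally be weighted by an association measure before any claims about semantic preference are made.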
A paradigmatic perspective identifies recurrent patterns in the semantic prosody
of members of a semantic field. In so doing, it offers insights into how members of
a semantic field “co-determine one another semantically” (Hübler and Busse 2012:
5). Such relationships are generally examined along two axes: semantic opposition
(antonyms) and semantic similarity (synonyms). Systematic studies of semantic
fields are thus important since metapragmatic labels clearly do not exist in iso-
lation, but are inevitably related in various ways to other labels. Such research is
also important because the relative diversity of such sets is arguably an indication
of what social value is placed on the communicative aspect to which they refer.
Hübler and Busse (2012) suggest that “as a rule of thumb, we could say the more
diversified the set is, the higher is the social significance of the communicative
aspect of its members” (2012: 5). This is likely because a more diverse set of
metapragmatic labels enables users to make increasingly fine-grained distinctions
in their descriptions of a particular aspect of social reality.
A recent study by Culpeper, O’Driscoll and Hardaker (forthcoming) in which
they compare clusters of collocates associated with polite in British and Ameri-
can English offers an exemplar of the paradigmatic perspective on the metacom-
municative lexicon. Using the Distributional Thesaurus (Kilgarriff et al. 2014) –
another tool offered in Sketch Engine – they examined similarities and differences
in the collocational clusters associated with occurrences of polite in the American
and British English sub-corpora of the Oxford English Corpus. As Kilgarriff et

3 For further information about how Word Sketch works, see
https://www.sketchengine.co.uk/user-guide/user-manual/word-sketch/. An important feature
of Word Sketch is that it enables the analyst to aggregate concordances automatically
into collocational groups rather than reviewing them manually. However, it is currently
limited to the analysis of single lexemes rather than phrases.
Corpus-based metapragmatics 627

al. (2014) explain, the Distributional Thesaurus tool identifies and aggregates the
number of grammatical relationships a particular word shares with its collocates.
On that basis, a “word cloud” of collocates can be produced, as well as a numer-
ical index of the relative extent to which these different words can be clustered
together.4 While similarities between the clusters of collocates associated with
polite in the American and British data were evident in Culpeper et al.’s (forthcom-
ing) study, interesting differences also emerged. For instance, the sensible cluster
(straightforward, reasonable, convincing etc.) was closely associated with polite
in the British data, but not in the American data. They also noted that respectful
constituted its own distinct cluster with respect to polite in the American data, with
that cluster including compassionate, supportive, constructive, humane, a finding
that in Culpeper et al.’s (forthcoming) view suggests that respectful has somewhat
different connotations in American English to those in British English.
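
The general idea of grouping words by the grammatical relationships they share with their collocates can be sketched as follows. This is only an illustration under stated assumptions: it uses plain Jaccard overlap, which is a stand-in and not the scoring used by the Distributional Thesaurus, and the collocational profiles, relation names and function names are all invented.

```python
def jaccard(a: set, b: set) -> float:
    """Proportion of (relation, collocate) pairs shared by two words."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def nearest_neighbours(word: str, profiles: dict) -> list:
    """Rank all other words by the overlap of their profile with that of `word`."""
    scores = [
        (other, jaccard(profiles[word], profile))
        for other, profile in profiles.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Invented profiles: each word maps to a set of (grammatical relation, collocate) pairs.
profiles = {
    "polite": {("modifies", "request"), ("modifies", "manner"),
               ("and/or", "friendly"), ("modifies", "tone")},
    "respectful": {("modifies", "manner"), ("and/or", "friendly"),
                   ("modifies", "tone"), ("modifies", "silence")},
    "sensible": {("modifies", "decision"), ("modifies", "request"),
                 ("and/or", "practical")},
}

for neighbour, score in nearest_neighbours("polite", profiles):
    print(f"{neighbour}: {score:.2f}")
```

Running the same ranking separately on profiles drawn from two sub-corpora would be one way of approximating the kind of cross-varietal comparison described above.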
Much of the most important work in corpus-based metapragmatics on the meta-
communicative lexicon to date has been undertaken by historical pragmaticians,
with a particular focus on metapragmatic labels associated with interpersonal eval-
uation. Nevalainen and Tissari (2010), for instance, analysed the collocational prop-
erties of three sets of “politeness” words in the 2.2-million-word Corpus of Early
English Correspondence (CEEC): (a) civil, civility and related words; (b) polite,
politeness and related words, and (c) courteous, courtesy and related words, while
Jucker, Taavitsainen and Schneider (2012) analysed politeness-related vocabulary
across the eleven centuries of the Helsinki Corpus. A notable finding in the latter
study was that courtesy-related vocabulary was most prominent in the period of
Middle English (1250–1350). This offers clear evidence that the relative salience
of particular metapragmatic labels within the metacommunicative lexicon can vary
significantly over time, and is suggestive of larger cultural changes that can be
traced over time through methods in corpus-based metapragmatics.
However, despite the common focus on “politeness” in such studies, there is
also an emerging strand that has focused on other dimensions of the metacommu-
nicative lexicon. Taylor (2015, 2016a, 2016b), for instance, has examined inferen-
tial acts, in particular understandings of irony and sarcasm across British English
and Italian, as we previously noted. In another important strand of work, Culpeper
and Hardaker (2016) studied speech acts that were named by users in a corpus con-
structed from Yahoo Q&A using the UCREL Semantic Analysis System (USAS)
tool.5 What emerged was that apart from the frequent use of speech act labels

4 For further information about how the Distributional Thesaurus works, see
https://www.sketchengine.co.uk/user-guide/user-manual/thesaurus/. Similar to Word Sketch,
one current limitation of the Distributional Thesaurus is that it is limited to the analysis
of single lexemes rather than phrases.
5 For further information on USAS, see http://ucrel.lancs.ac.uk/usas/.

associated with ask, question and tell, as might be expected in this discourse con-
text, other more interpersonally sensitive speech act labels emerged as also being
relatively frequent, including blame, advice, and apologise (2016: 128). They then
went on to examine, in more detail, specific instances in which these speech act
labels arose in order to tease out possible differences across the different varieties
of English represented in the Yahoo Q&A corpus.
Yet despite the existence of these seminal studies, corpus-based studies of the
metacommunicative lexicon are still on the whole relatively small in number, and
so there is clearly much more cross-linguistic work required. What has clearly
emerged from studies to date, from a methodological perspective at least, is that
in order to undertake studies of metapragmatic labels, relatively large corpora
are required. In some instances, specialised corpora can be constructed in which
there are expected to be a greater than average number of tokens of the expres-
sion in question. The generalisability of findings about the metapragmatic label(s)
studied in the latter case remains, however, more open to question. Another key
methodological point to have emerged is that while traditional frequency counts
and semi-manual examination of concordances can offer us useful insights, tools
that enable researchers to statistically aggregate collocates enable analyses across
much larger datasets to be undertaken. For that reason their use is advocated where
possible. Finally, while most corpus-based studies of the metacommunicative lexi-
con tend to start with vertical analyses of large sets of data, almost all of them sup-
plement these more quantitatively-oriented analyses with an analysis of the usage
of particular tokens in situated contexts. The latter analyses have demonstrated that
users may accomplish different understandings of particular metapragmatic labels
that vary in their degree of granularity for particular, locally situated purposes
(Haugh forthcoming).
Corpus-based studies of the metacommunicative lexicon evidently have much
to contribute to better understanding the conceptual and evaluative fields through
which we co-constitute our social worlds. However, given that understandings of
metapragmatic labels vary in their degree of granularity across different speakers and
different occasions, corpus-based studies of the metacommunicative lexicon using
relatively abstracted sets of data arguably need to be complemented by studies that
examine how such metapragmatic labels are put to work in interaction. It is thus to
a consideration of corpus-based studies of metapragmatics in use that we now turn.

4. Corpus-based approaches to metapragmatics in use

Metapragmatics in use involves studying the ways in which participants utilise
metapragmatic comments to explicitly “intervene” (Hübler and Bublitz 2007: 1)
in ongoing talk. In so doing, a variety of different kinds of pragmatic work can be
accomplished. Metapragmatic comments may be deployed, for instance, in order
to “influence and negotiate how an utterance is or should have been heard, or try
to modify the values attributed to it” (Jaworski, Coupland and Galasiński 2004:
4). The example we considered in the introduction to this chapter, where one of
the interactants claimed to be just kidding, constitutes an example of just that.
Notably, these kinds of metapragmatic interventions not only contribute to struc-
turing understandings of what is being accomplished by particular talk, but have
important moral implications as well. To accomplish something as non-serious by
claiming one is just kidding, for instance, amounts to a proposal that one is not
accountable (or at least less accountable) for the real-world implications of what
is meant here (i. e. the content of the tease), and should be evaluated by the target
accordingly (e. g. as “non-impolite”) (Haugh 2016b). To study metapragmatics in
use thus involves examining both the sequential mechanics of such interventions,
as well as their interpersonal and moral implications. A guiding question in any
study of metapragmatics in use is thus what is being accomplished through the
employment of a particular metapragmatic comment in a locally situated context.
What corpus-based studies of metapragmatics in use can add to this line of research
are systematic ways of identifying recurrent practices associated with different
kinds of metapragmatic comments.
Hübler (2011) suggests that there are a number of different analytical foci
available to researchers undertaking corpus-based studies of metapragmatics in
use. The first concerns the “object” of the metapragmatic comments. These range
from more generic topics, such as the “norms of conversation”, including “partic-
ipation in conversation”, through to more specific objects, such as the pragmatic
acts, meanings, attitudes and evaluations, and so forth that are being accomplished
by those users at that moment of interaction (or in a prior interaction). The second
concerns the interactional “function” of the metapragmatic comment, that is, what
is being accomplished through intervening at that point in the interaction. While
metapragmatic comments are commonly associated with attempts to prevent or
repair misunderstandings, or to secure particular understandings, their functions
are, as Hübler and Bublitz (2007) point out, much more varied than that. Indeed,
they are often associated with attempts to manage identities, evaluations of self
and others, and interpersonal (dis)affiliation. The third locus of analysis concerns
the “target” of the metapragmatic comment, that is, whose talk or conduct is the
object of attention. As with all communicative interaction, there can, of course, be
multiple targets at play. Finally, analysts can focus on the “form” of the metaprag-
matic comments themselves. These range, according to Hübler (2011: 111–113)
from categorical or modalised assertions and questions through to reported talk,
including echoic repeats of prior talk.6

6 On most accounts of metapragmatics, adverbials and other pragmatic markers are also
included in this list. However, given that corpus-based approaches to discourse markers

We can see how these different analytical loci coalesce in the following excerpt
from an interaction where two Australians are getting acquainted. Emma has been
talking about her acupuncture business for some time up until this point in their
conversation.7
(2) AGA: ERCH: 13:31
263 E: much better he’s gonna get we’ll just keep going with
264 it and see how (0.3) how we go
265 C: mmm
266 (0.4)
267 E: mmmm
268 C: right. ↑what got you into it? like (0.8) what made you
269 think acupuncture [( )]
270 E: [THIS IS ALL ABOUT ME] THIS
271 CONVERSA(H)TION
272 C: yeah well whatever
273 E: .hhhh u:m
274 (1.5)
275 E: ↑oh. (0.6) >is that a< timer?

Emma’s utterance in lines 270–271, which is delivered at a markedly louder volume,
is an example of a metapragmatic comment directed at norms of conversation,
particularly those associated with the activity of getting acquainted, where
there is a general orientation to reciprocity (Haugh and Carbaugh 2015: 481–482).
What is notable here is that while the object of the metapragmatic comment is a
fairly generic one, the interactional work it accomplishes here is clearly locally
situated. In this case, Emma seems to be attempting to head off the inference by
Chris that she likes talking about herself and has no interest in him. What is also
interesting to note here is Chris’s response in line 272 where he displays “unease”
with the way in which Emma has topicalised his ongoing questioning of her. The
conversation breaks down at this point (lines 273–274), with Emma subsequently
asking whether their recording time is up (line 275). This is all the more notable
given there has been no breakdown in progressivity in their interaction thus far. It
appears, then, that metapragmatic comments not only orient to potential interac-
tional “troubles” in conversation, but may themselves occasion such “troubles”.
Corpus-based studies of metapragmatics in use can draw from either pur-
pose-built corpora or pre-existing corpora (or a combination of both). Tanskanen’s

and other such forms are examined in depth elsewhere in this volume (Aijmer, this
volume), they are not discussed further here.
7 This example is taken from the Australians Getting Acquainted (AGA) corpus. See the
appendix for a list of the CA transcription conventions (Jefferson 2004) being used
here.

(2007) study of metapragmatic comments that arise in electronic mailing lists and
discussion boards is an example of the former. In her study, Tanskanen assembled
a corpus of interactions from two mailing lists (Linguist List and Women’s Studies
List) and two discussion boards (Dachsie’s Bulletin Board and Yahoo! Message
Boards). She then identified instances of metapragmatic comments that oriented
to the “effective” management of discourse in her corpus through a primarily dis-
course analytic-driven process. A key finding from her study was that in comput-
er-mediated interactions, at least in these contexts, participants use metapragmatic
comments to either assess the degree of (in)appropriateness of their own posts or
those of others, or to clarify their own posts where they perceive some misunder-
standing to have occurred.
Increasingly, however, researchers are drawing from pre-existing corpora as
well. In cases where the corpus in question is large and well-structured, this confers
a greater degree of generalisability on the findings of the study in question. Jucker
and Taavitsainen (2014), for instance, undertook an analysis of the occurrence
of the speech-act label compliment in the 400 million word Corpus of Historical
American English (COHA) (Davies 2012), and in a sample of the 425 million
word Corpus of Contemporary American English (COCA) (Davies 2009).8 These
tokens were then double-coded as to whether they were referring to instances of
“personal” or “ceremonious” compliments, along with noting the gender of the
complimenter and the complimentee. One intriguing finding from this study was
that in the historical dataset taken from COHA there was a greater proportion of
reports of men issuing compliments. This contrasts with previous findings that
women compliment more frequently than men in American (Placencia and Lower
2013; Wolfson 1983) and New Zealand (Holmes 1988) varieties of English.
In another study, Skalicky, Berger and Bell (2015) examined the functions of
claims to be just kidding (and related expressions) in interactions amongst Amer-
ican speakers of English. Their collection of 1,200 tokens was assembled from a
number of different corpora, including the 425 million word Corpus of Contem-
porary American English (COCA) (Davies 2009), the 385 million word Ameri-
can English component of the 1.9 billion word Global Web-based English Corpus
(GloWbE) (Davies 2015),9 the 250,000 word Santa Barbara Corpus of Spoken
American English (SBCSAE) (Du Bois et al. 2000–2005),10 the 250,000 word CallFriend
American English corpus (Canaven and Zipperlen 1996a, 1996b),11 and the

8 These corpora are freely available at http://corpus.byu.edu/coha/ and
http://corpus.byu.edu/coca/, respectively.
9 This corpus is freely available at http://corpus.byu.edu/glowbe/.
10 This corpus is freely available at
http://www.linguistics.ucsb.edu/research/santa-barbara-corpus.
11 This corpus is freely available at https://talkbank.org/access/CABank/CallFriend/engn.html
and https://talkbank.org/access/CABank/CallFriend/engs.html.

1.8 million word Michigan Corpus of Academic Spoken English (MICASE) (Simp-
son et al. 2002).12 The various functions of claims to be just kidding were then
coded as instances of “inoculation” (i. e. against negative reactions from the tar-
get), “repair” of failed humour, “return to serious”, and “setting up a new joke”. A
notable finding from the quantitative analysis of these different functions was that
“repair” was not as common a function of the expression as generally expected,
with “inoculation” being the most common across the different corpora. Another
interesting finding was that using just kidding to set up a new joke by subverting
the target’s expectations through following this claim with an extension of the pre-
vious joke was largely restricted to communications in digitally-mediated settings
(i. e. the GloWbE corpus).
There is now a growing body of corpus-based studies of metapragmatics in
use. For the most part, corpora are used primarily as a source of data in such stud-
ies, given the relative ease with which metapragmatic comments can be identified
across large tracts of data. Once a collection of metapragmatic comments in a
particular discourse context is assembled, the next step typically involves qualita-
tive analysis of their functions, and subsequently coding of these functions across
the collection. The latter is generally expected to involve more than one coder to
avoid overly idiosyncratic interpretations of the dataset. Once the dataset is coded,
the researcher can then examine whether there are recurrent patterns in the metap-
ragmatic comments themselves, or explore correlations with particular social or
discourse variables.
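
Where more than one coder is involved, the consistency of their coding can be checked before any patterns are reported. The sketch below computes Cohen's kappa, a commonly used chance-corrected agreement statistic; its use here is an assumption for illustration and is not attributed to any of the studies discussed. The codings are invented, with category names borrowed loosely from the functions of just kidding mentioned above.

```python
from collections import Counter


def cohen_kappa(coder_a: list, coder_b: list) -> float:
    """Chance-corrected agreement between two coders over the same tokens."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of tokens.")
    n = len(coder_a)
    # Proportion of tokens on which the two coders chose the same category.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's category frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Invented codings of eight tokens, for illustration only.
coder_a = ["inoculation", "inoculation", "repair", "return",
           "inoculation", "new_joke", "repair", "inoculation"]
coder_b = ["inoculation", "repair", "repair", "return",
           "inoculation", "new_joke", "inoculation", "inoculation"]

print(round(cohen_kappa(coder_a, coder_b), 3))
```

Tokens on which the coders disagree are usually the ones worth returning to in the qualitative analysis, since they often point to categories that need sharper definitions.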
It is perhaps inevitable that corpus-based studies of metapragmatics in use are
more qualitatively-driven than the studies of the metacommunicative lexicon that
we discussed in the previous section. However, like all corpus-based metaprag-
matic studies, the analytical process is nevertheless firmly iterative, as it generally
involves a combination of vertical and horizontal methods of analysis. The extent
to which the former or latter drives the analytical process is a function of the spe-
cific research questions of the analyst (Jucker 2009). Striking a balance between
the two, however, is not always a straightforward matter.
In the following section, we move to consider some of the methodological
challenges that arise when undertaking corpus-based metapragmatics, as well as to
discuss in more detail the iterative analytical process that is a feature of such stud-
ies. The vehicle for this is a brief case study that focuses on the metapragmatics of
claims to non-serious intent amongst American and Australian speakers of English.

12 This corpus is freely available at https://quod.lib.umich.edu/m/micase/.

5. The metapragmatics of claims to non-serious intent: A case study

A key feature of the methods underpinning corpus-based metapragmatics, as with
corpus pragmatics more generally (Clancy and O’Keeffe 2015: 235; Rühlemann
and Aijmer 2015: 12), is that it should be a fundamentally iterative process that
integrates horizontal and vertical approaches to analysis. One issue that we have
skirted around, however, is “which” analytical approaches should we be attempting
to integrate? Such choices are often driven by practicalities as one can only choose
from methods with which one is familiar with (or at least has the resources to
learn), as well as by one’s own individual proclivities as a researcher (Walsh 2013).
However, as the other chapters in the volume in which this chapter appears attest,
there is a wide range of methods from which one can choose. Indeed, drawing from
a range of different methods is invariably a necessity in pragmatics, given it is not
realistic to hope that a single method can address all the possible research questions
in the field (Jucker 2009).
In the case of corpus-based metapragmatics, while the array of quantitative
methods is perhaps more clearly circumscribed, as we discussed in the previous
two sections, it is less obvious which qualitative approaches to the interpretation
of metapragmatic data might be employed alongside these vertical approaches.
It is also not always made clear by researchers the extent to which quantitative
analyses drive qualitative analyses, and vice-versa, in their studies. Finally, the
ecological validity of drawing from different methods can come into question in
some cases. As researchers it is important to bear in mind the epistemological
assumptions underpinning different methods, and to assess the extent to which they
are consistent with each other. Indeed, in cases where the latter are not consistent,
problems can result in relation to the extent to which data are readily transferable
and analyses readily reconcilable between the different methods.
In order to begin considering some of these issues in more concrete terms, in
the remainder of this section we will use previous studies of claims to non-serious
intent (just kidding, just joking, only kidding, only joking etc.) in everyday interac-
tions amongst American and Australian speakers of English (Haugh 2016b, 2017;
Skalicky, Berger and Bell 2015) as a focal point. We will begin by first consider-
ing their status as metapragmatic labels, and what a corpus-based approach might
offer to furthering our understanding of them. We will then move to consider how
the use of different qualitative approaches in these particular studies generated
somewhat different, albeit broadly complementary findings, and the more general
challenges this raises for corpus-based metapragmatics.
One issue that none of these three studies directly addressed was that of the
status of these metapragmatic labels with respect to the broader “humour”-related
metacommunicative lexicon. While it was acknowledged that there may well be
nuanced differences in “meanings” amongst these different metapragmatic labels,
in practice their “use” in metapragmatic comments was treated as more or less
interchangeable in these studies. Yet ascertaining what those meanings might be
is no straightforward matter. For instance, while there is work that attests to differences
in the semantics of kidding and joking (Goddard forthcoming), it remains
subject to further empirical study whether these shades of meaning are retained
across different discourse contexts, including when used in such phrases as just
kidding, only kidding, just joking, only joking and so on.
Of course, if one assumes that the meanings of (specific senses of) words are
more or less invariant across contexts, such a question might seem somewhat puz-
zling. However, it has long been argued that language is not a determinate object
(Harris 1980; Garfinkel and Sacks 1970; Wittgenstein 1972). While “meanings”
can, of course, be abstracted out from the usage of words across contexts, it does
not necessarily follow that the participants’ understandings of that word in a particular
locally situated context are determined by such abstractions. In practice,
users can accomplish understandings of the meanings of particular words to vary-
ing degrees of granularity (Bilmes 2011; Haugh 2016a, forthcoming; Rowen and
Haugh 2017), as their accomplishment as meanings is invariably interlinked with
“handl[ing] some local, situational, contingent matter” (Eglin 2015: 142). In order
to address such issues empirically, then, corpus-based analyses of these sorts of
metapragmatic labels are a must.
Two key challenges facing any researcher attempting to do this, however, are:
(1) identifying a sufficiently large number of tokens that affords just such an anal-
ysis; and (2) applying tools to any such dataset which enable a proper statistical
analysis of their collocational behaviour. While Skalicky, Berger and Bell (2015),
for instance, identified 1,200 tokens, 90 % of these tokens involved occurrences of
kidding. The remaining 10 % of tokens involved occurrences of joking (2015: 21),
but it was not reported what proportion of these 120 tokens were instances of just
joking, only joking or joking. It thus remains questionable whether an empirical
analysis of possible differences between these different metapragmatic labels is
even feasible using their dataset given the small number of tokens involving jok-
ing, despite their search involving more than 800 million words across the different
corpora. In order to address such issues, then, it appears we must have recourse to
large web-based corpora, such as the 1.9 billion word Global Web-based English
Corpus (GloWbE) (Davies 2015) or the larger, but relatively unstructured, 23 bil-
lion word enTenTen13 Corpus (Jakubíček et al. 2013).13 Searches for these metap-
ragmatic labels in GloWbE reveal some potentially interesting differences in their
relative frequency of occurrence in the American and Australian English compo-
nents of the corpus, which are echoed in the results of a search in the enTenTen13

13 The latter is freely available to subscribed users of Sketch Engine. See
https://www.sketchengine.co.uk/ententen-corpus/ for further information.

corpus, as outlined in Table 1. The raw frequencies are reported along with nor-
malised frequencies (per million words) in brackets.

Table 1: Relative frequency of tokens of [just/only] [joking/kidding] in the GloWbE and enTenTen13 corpora

                GloWbE AmE     GloWbE AusE    enTenTen13
just kidding    741 (1.92)     124 (0.84)     12,419 (0.54)
only kidding     49 (0.13)      20 (0.13)      1,393 (0.06)
just joking     108 (0.28)      35 (0.24)      2,461 (0.11)
only joking      53 (0.14)      44 (0.30)      1,675 (0.07)

It is readily apparent that just kidding is the collocation most frequently used by
both American and Australian users of English, although it appears to be used more
frequently by Americans. Only joking, on the other hand, seems to be used more
frequently by Australians than Americans, albeit at lower levels overall. How-
ever, more detailed analyses of these metapragmatic labels remain more challeng-
ing. The Word Sketch and Distributional Thesaurus tools in Sketch Engine, for
instance, can only be applied to kidding and joking as individual lexemes, not to
analysing their collocations.
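
The normalised frequencies reported in Table 1 can be reproduced from the raw counts with a short calculation. The sketch below assumes the rounded corpus sizes cited in this chapter (385 million words for the American English component of GloWbE and 23 billion words for enTenTen13); the size of the Australian English component is not stated here, so that column is left out. The function and variable names are illustrative only.

```python
def per_million(raw_count: int, corpus_size_in_words: int) -> float:
    """Normalise a raw token count to a frequency per million words."""
    return raw_count / corpus_size_in_words * 1_000_000


# Rounded corpus sizes as cited in the text (assumed, not exact word counts).
GLOWBE_AME_WORDS = 385_000_000
ENTENTEN13_WORDS = 23_000_000_000

# Raw counts from Table 1: (GloWbE AmE, enTenTen13).
raw_counts = {
    "just kidding": (741, 12_419),
    "only kidding": (49, 1_393),
    "just joking": (108, 2_461),
    "only joking": (53, 1_675),
}

for label, (ame, ententen) in raw_counts.items():
    print(label,
          round(per_million(ame, GLOWBE_AME_WORDS), 2),
          round(per_million(ententen, ENTENTEN13_WORDS), 2))
```

Normalising in this way is what allows counts from corpora of very different sizes to be compared directly.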
A second key issue for corpus-based metapragmatics that becomes apparent
when one compares these studies concerns their approach to analysing the func-
tions of these metapragmatic labels and the data itself. While Skalicky, Berger and
Bell (2015) coded the functions of just kidding and variants using the data as it was
presented in the various corpora in question, Haugh (2016b, 2017) undertook more
detailed CA transcriptions of the relevant excerpts using the audio files that were
available in order to undertake a CA-informed analysis. The latter, more detailed
transcription lent itself to a close sequential analysis that revealed additional features
of the use of just kidding and variants that were not apparent from coding
the data as it is made available in the various corpora in question, although, as a
consequence, it involved analysis of a smaller sample of data.14
Consider the following more detailed transcription of the example we discussed
in the introduction to this chapter.

14 It is important to note that Skalicky, Berger and Bell (2015) also used web-based data
(GloWbE), in which case transcription is not required.

(3) CallFriend: engn6015: 3:08

102 F1: I wrote a children’s book (0.4) and (0.6)
103 I’m really happy with the way it turned out? and like,
104 >I ↑did it for my final project< for a- for a ↑final
105 project, in this .hh in this ↑class, like [you can].
106 F2: [uh huh ].
107 F1: basically you can do anything, it’s like a
108 children’s >literature class< .hh so I wro:te a (.)
109 a story.=
110 F2: =talk clearer they’re recording us I’m just kid-=
111 F1: =ohh [huh huh .hhh and ]=
112 F2: [Go o(hh)n I’(hh)m just kidding]=
113 F1: =no:w I’m starting to illustrate it? .hh

If one compares the transcript in excerpt (1) with this one, it becomes readily appar-
ent that they do not record exactly the same thing.15 Of particular relevance to our
understanding of what just kidding is doing here in this excerpt is the occurrence
of oh in line 111 – which does not feature in the more basic transcript – whereby
Jess displays she has reached a new understanding (Heritage 1984), alongside the
proposal that Jess talk clearer (line 110). This is followed by laughter through
which Jess displays an orientation to Cathy’s utterance in line 110 as a non-serious
tease (Drew 1987), and then a return to continuing her telling (lines 111, 113), fol-
lowing prompting by Cathy to do so (line 112). This display of a change in under-
standing indicated through the particle oh is arguably significant, as it is consistent
with a recurrent pattern in teases where the claim to be just kidding (and variants)
is constitutive of the prior action as a tease (Haugh 2016b). In other words, we
have evidence from Jess’s response that Cathy’s tease was designed to be initially
heard as “serious” before its “non-seriousness” is subsequently revealed through
this claim to be just kidding. The use of this metapragmatic label here is thus
not so much an instance of repair, although it can certainly be construed as such
by the analyst, as it is a constitutive part of the teasing sequence itself. If one
is interested in understanding the different practices by which teasing is accom-
plished in interaction then such details are important. However, if one’s focus is
on what such metapragmatic labels do more globally then such details are less
relevant.
Ultimately, then, the method of qualitative analysis one should employ, in
conjunction with a more quantitatively focused one in undertaking studies in

15 It is worthwhile noting that this is not by any means the only case in which such
discrepancies have arisen. However, it is important to also bear in mind that such
discrepancies are a consequence, at least in part, of the affordances of different transcription
systems, alongside what appear to be errors on the part of the transcriber(s). The latter
are, of course, difficult to completely eliminate when constructing spoken corpora.

corpus-based metapragmatics depends upon one’s research focus and attendant
research questions. It is well worth noting, however, that transcriptions in spoken
corpora are not always straightforward records of the original interaction, despite
considerable effort on the part of those who create such corpora. What the exist-
ence of such discrepancies highlights is the importance of making available the
original recordings of spoken interactions alongside transcripts held in the corpus
(Haugh 2009). In so doing, spoken corpora will arguably become more useful for
those wishing to undertake corpus-based metapragmatics.

6. Concluding remarks

The focus in corpus-based metapragmatics is not on identifying instances of the
pragmatic phenomenon that is being described by the metapragmatic label in ques-
tion,16 but rather on talk about pragmatic phenomena as an important object of
study in its own right. What corpus-based metapragmatics offers us is a way of
studying this kind of talk in a systematic way across large tracts of data, which, as
a result, confers on the results of the analysis a greater degree of generalisability
than they might otherwise have. Metapragmatic studies are important because they
enable us to tap into the conceptual and evaluative field through which we constitute
our understandings of social reality. This is of particular importance when one
wishes to take pragmatic variation more seriously. It is clear that pragmatics is
subject not only to significant cross-linguistic variation, but also to variation at other
levels of order, and to variation over time, both across languages and within languages
themselves. What is needed, therefore, is the development of a variational
metapragmatics that treats this variability as a serious object of study. While important
studies have already been undertaken, as we have seen in this chapter, there is
nevertheless still a pressing need for many more studies of the metacommunica-
tive lexicon and metapragmatics in use across different languages, and varieties
therein, as well as over time. In this way, we can go beyond the constraints of a
pragmatics that is largely dominated by English as the scientific metalanguage.
Corpus-based metapragmatics is arguably uniquely placed to make a critical con-
tribution to the development of this field.

16
See Jucker (this volume) for a useful discussion of this approach in corpus pragmatics.

References

Bateson, Gregory
1955 A theory of play and fantasy. In: Approaches to the Study of Human Personal-
ity, 39–51. Washington, D.C.: American Psychiatric Association.
Bateson, Gregory
1972 Steps to an Ecology of Mind. Chicago, IL: Chicago University Press.
Bednarek, Monika
2008 Semantic preference and semantic prosody re-examined. Corpus Linguistics
and Linguistic Theory 4(2): 119–139.
Bilmes, Jack
2011 Occasioned semantics: A systematic approach to meaning in talk. Human
Studies 34: 129–153.
Caffi, Claudia
1984 Some remarks on illocution and metacommunication. Journal of Pragmatics
8: 449–467.
Caffi, Claudia
1998 Metapragmatics. In: Jacob Mey (ed.), Concise Encyclopedia of Pragmatics,
581–586. Amsterdam: Elsevier.
Canavan, Alexandra and George Zipperlen
1996a CALLFRIEND American English-Non-Southern Dialect LDC96S46. Phila-
delphia: Linguistic Data Consortium.
Canavan, Alexandra and George Zipperlen
1996b CALLFRIEND American English-Southern Dialect LDC96S47. Philadelphia:
Linguistic Data Consortium.
Clancy, Brian and Anne O’Keeffe
2015 Pragmatics. In: Douglas Biber and Randi Reppen (eds.), Cambridge Hand-
book of English Corpus Linguistics, 235–251. Cambridge: Cambridge Univer-
sity Press.
Culpeper, Jonathan
2009 The metalanguage of impoliteness: Using Sketch Engine to explore the Oxford
English Corpus. In: Paul Baker (ed.), Contemporary Corpus Linguistics,
66–88. London: Continuum.
Culpeper, Jonathan
2011 Impoliteness: Using Language to Cause Offence. Cambridge: Cambridge Uni-
versity Press.
Culpeper, Jonathan and Claire Hardaker
2016 Pragmatics and corpus linguistics. In: Paul Baker and Jesse Egbert (eds.), Tri-
angulating Methodological Approaches in Corpus Linguistic Research, 124–
137. London: Routledge.
Culpeper, Jonathan and Michael Haugh
2014 Pragmatics and the English Language. Basingstoke: Palgrave Macmillan.
Culpeper, Jonathan, Jim O’Driscoll and Claire Hardaker
Forthcoming The neglected west: Notions of politeness in Britain and America. In: Eva
Ogiermann and Pilar Garces-Conejos (eds.), From Speech Acts to Lay Con-
cepts of Politeness. A Multilingual and Multicultural Perspective. Cambridge:
Cambridge University Press.
Davies, Bethan
2011 Discursive histories, personalist ideology and judging intent: Analysing the
metalinguistic discussion of Tony Blair’s ‘slave trade apology’. In: Linguis-
tic Politeness Research Group (eds.), Discursive Approaches to Politeness,
189–219. Berlin: de Gruyter Mouton.
Davies, Mark
2009 The 385+ Million Word Corpus of Contemporary American English (1990–
2008+): Design, Architecture, and Linguistic Insights. International Journal
of Corpus Linguistics 14: 159–190.
Davies, Mark
2012 The 400 million word Corpus of Historical American English (1810–2009).
In: Irén Hegedűs and Alexandra Fodor (eds.), English Historical Linguistics
2010: Selected Papers from the Sixteenth International Conference on English
Historical Linguistics (ICEHL 16), 231–262. Amsterdam: John Benjamins.
Davies, Mark
2015 Introducing the 1.9 billion word Global Web-based English Corpus (GloWbE).
21st Century Text 15.
Du Bois, John W., Wallace L. Chafe, Charles Meyer, Sandra A. Thompson, Robert Engle-
bretson and Nii Martey
2000–2005 Santa Barbara Corpus of Spoken American English, Parts 1–4. Philadel-
phia: Linguistic Data Consortium.
Drew, Paul
1987 Po-faced receipts of teases. Linguistics 25: 219–253.
Eglin, Peter
2015 Language, culture, and interaction. In: Farzad Sharifian (ed.), The Routledge
Handbook of Language and Culture, 141–153. London: Routledge.
Garfinkel, Harold and Harvey Sacks
1970 On formal structures of practical action. In: John C. McKinney and Edward A.
Tiryakian (eds.), Theoretical Sociology, 337–366. New York: Appleton-Cen-
tury-Crofts.
Geertz, Clifford
1973 The Interpretation of Cultures. New York: Basic Books.
Goddard, Cliff
Forthcoming “Joking, kidding, teasing”: Slippery categories for cross-cultural compar-
ison but key words for understanding Anglo conversational humour. Intercul-
tural Pragmatics.
Harris, Roy
1980 The Language-Makers. Ithaca, NY: Cornell University Press.
Haugh, Michael
2009 Designing a multimodal spoken component of the Australian National Cor-
pus. In: Michael Haugh, Kate Burridge, Jean Mulder and Pam Peters (eds.),
Selected Proceedings of the HCSNet Workshop on Designing the Australian
National Corpus: Mustering Languages, 74–86. Somerville, MA: Casca-
dilla Proceedings Project.
Haugh, Michael
2016a The role of English as a scientific metalanguage for research in pragmatics:
Reflections on the metapragmatics of ‘politeness’ in Japanese. East Asian
Pragmatics 1(1): 39–71.
Haugh, Michael
2016b “Just kidding”: Teasing and claims to non-serious intent. Journal of Pragmat-
ics 95: 120–136.
Haugh, Michael
2017 Mockery and non-seriousness in initial interactions amongst American and
Australian speakers of English. In: Donal Carbaugh (ed.), Handbook of Com-
munication in Cross-Cultural Perspective, 104–117. London: Routledge.
Haugh, Michael
Forthcoming The metapragmatics of consideration in (Australian and New Zealand)
English. In: Eva Ogiermann and Pilar Garces-Conejos (eds.), From Speech
Acts to Lay Concepts of Politeness. A Multilingual and Multicultural Perspec-
tive. Cambridge: Cambridge University Press.
Haugh, Michael and Donal Carbaugh
2015 Self-disclosure in initial interactions amongst speakers of American and Aus-
tralian English. Multilingua 34(4): 461–493.
Heritage, John
1984 A change of state token and aspects of its sequential placement. In: J. Max-
well Atkinson and John Heritage (eds.), Structures of Social Action, 299–345.
Cambridge: Cambridge University Press.
Holmes, Janet
1988 Paying compliments: A sex-preferential politeness strategy. Journal of Prag-
matics 12(4): 445–465.
Hübler, Axel
2011 Metapragmatics. In: Wolfram Bublitz and Neal Norrick (eds.), Foundations of
Pragmatics, 107–136. Berlin: Mouton de Gruyter.
Hübler, Axel and Wolfram Bublitz
2007 Introducing metapragmatics in use. In: Wolfram Bublitz and Axel Hübler
(eds.), Metapragmatics in Use, 1–26. Amsterdam: John Benjamins.
Hübler, Axel and Ulrich Busse
2012 Introduction. In: Ulrich Busse and Axel Hübler (eds.), Investigations into the
Meta-Communicative Lexicon of English. A Contribution to Historical Prag-
matics, 1–16. Amsterdam: John Benjamins.
Jakubíček, Miloš, Adam Kilgarriff, Vojtěch Kovář, Pavel Rychlý and Vít Suchomel
2013 The TenTen corpus family. Paper presented at the 7th International Corpus
Linguistics Conference (UCREL), Lancaster University, 22–26 July.
Jaworski, Adam, Nikolas Coupland and Dariusz Galasiński
2004 Metalanguage: Why now? In: Adam Jaworski, Nikolas Coupland and Dar-
iusz Galasiński (eds.), Metalanguage. Social and Ideological Perspectives,
3–8. Berlin: Mouton de Gruyter.
Jefferson, Gail
2004 Glossary of transcript symbols with an introduction. In: Gene Lerner (ed.),
Conversation Analysis: Studies from the First Generation, 13–23. Amsterdam:
John Benjamins.
Jucker, Andreas H.
2009 Speech act research between armchair, field and laboratory. Journal of Prag-
matics 41: 1611–1635.
Jucker, Andreas H.
2013 Corpus pragmatics. In: Jan-Ola Östman and Jef Verschueren (eds.), Handbook
of Pragmatics 2013 Installment, 1–18. Amsterdam: John Benjamins.
Jucker, Andreas H. and Irma Taavitsainen
2014 Complimenting in the history of American English: A metacommunicative
expression analysis. In: Irma Taavitsainen, Andreas H. Jucker and Jukka
Tuominen (eds.), Diachronic Corpus Pragmatics, 257–276. Amsterdam: John
Benjamins.
Jucker, Andreas H., Irma Taavitsainen and Gerold Schneider
2012 Semantic corpus trawling: Expressions of “courtesy” and “politeness” in the
Helsinki Corpus. In: Carla Suhr and Irma Taavitsainen (eds.), Developing Cor-
pus Methodology for Historical Pragmatics (Studies in Variation, Contacts
and Change in English 11). Helsinki: Research Unit for Variation, Contacts
and Change in English.
Kádár, Dániel Z. and Michael Haugh
2013 Understanding Politeness. Cambridge: Cambridge University Press.
Kilgarriff, Adam, Vít Baisa, Jan Bušta, Miloš Jakubíček, Vojtěch Kovář, Jan Michelfeit,
Pavel Rychlý and Vít Suchomel
2014 The Sketch Engine: Ten years on. Lexicography 1: 7–36.
Louw, Bill
1993 Irony in the text or insincerity in the writer? The diagnostic potential of seman-
tic prosodies. In: Mona Baker, Gill Francis and Elena Tognini-Bonelli (eds.),
Text and Technology: In Honour of John Sinclair, 157–176. Amsterdam: John
Benjamins.
MacWhinney, Brian
2000 The CHILDES Project: Tools for Analyzing Talk. Third edition. Mahwah, NJ:
Lawrence Erlbaum.
Mertz, Elizabeth and Jonathan Yovel
2009 Metalinguistic awareness. In: Dominiek Sandra, Jan-Ola Östman and Jef Ver-
schueren (eds.), Cognition and Pragmatics, 250–271. Amsterdam: John Ben-
jamins.
Mey, Jacob
2001 Pragmatics. An Introduction. Second edition. Oxford: Blackwell.
Nevalainen, Terttu and Heli Tissari
2010 Contextualising eighteenth-century politeness: Social distinction and meta-
phorical levelling. In: Raymond Hickey (ed.), Eighteenth-Century English:
Ideology and Change, 133–158. Cambridge: Cambridge University Press.
Niedzielski, Nancy and Dennis R. Preston
2009 Folk pragmatics. In: Gunter Senft, Jan-Ola Östman and Jef Verschueren (eds.),
Culture and Language Use, 146–155. Amsterdam: John Benjamins.
Placencia, Maria E. and Amanda Lower
2013 Your kids are so stinkin’ cute:-): Complimenting behaviour on Facebook
among family and friends. Intercultural Pragmatics 10(4): 617–646.
Rowen, Roslyn and Michael Haugh
2017 Bogans, lawyers and teachers: On the interactional achievement of word mean-
ings. Intercultural Pragmatics 14(3): 327–359.
Ruesch, Jurgen and Gregory Bateson
1951 Communication: The Social Matrix of Psychiatry. New York: Norton.
Rühlemann, Christoph
2013 What can a corpus tell us about pragmatics? In: Michael McCarthy (ed.), Rou-
tledge Handbook of Corpus Linguistics, 288–301. London: Routledge.
Rühlemann, Christoph and Karin Aijmer
2015 Introduction. Corpus pragmatics: Laying the foundations. In: Karin Aijmer
and Christoph Rühlemann (eds.), Corpus Pragmatics. A Handbook, 1–26.
Cambridge: Cambridge University Press.
Schneider, Klaus P.
2017 Pragmatic competence and pragmatic variation. In: Rachel Giora and Michael
Haugh (eds.), Doing Pragmatics Interculturally. Cognitive, Philosophical and
Sociopragmatic Perspectives, 315–333. Berlin: Mouton de Gruyter.
Schneider, Klaus P. and Anne Barron
2008 Where pragmatics and dialectology meet: Introducing variational pragmat-
ics. In: Klaus P. Schneider and Anne Barron (eds.), Variational Pragmatics.
A Focus on Regional Varieties in Pluricentric Languages, 1–32. Amsterdam:
John Benjamins.
Skalicky, Stephen, Cynthia M. Berger and Nancy D. Bell
2015 The functions of ‘just kidding’ in American English. Journal of Pragmatics
85: 18–31.
Simpson, Rita, Sarah Briggs, Janine Ovens and John Swales
2002 The Michigan Corpus of Academic Spoken English. Ann Arbor, MI: The
Regents of the University of Michigan.
Tanskanen, Sanna-Kaisa
2007 Metapragmatic utterances in computer-mediated interaction. In: Wolfram
Bublitz and Axel Hübler (eds.), Metapragmatics in Use, 87–106. Amsterdam:
John Benjamins.
Taylor, Charlotte
2015 Beyond sarcasm: The metalanguage and structures of mock politeness. Jour-
nal of Pragmatics 87: 127–141.
Taylor, Charlotte
2016a Mock Politeness in English and Italian. A Corpus-Assisted Metalanguage
Analysis. Amsterdam: John Benjamins.
Taylor, Charlotte
2016b Mock politeness and culture: Perceptions and practice in UK and Italian data.
Intercultural Pragmatics 13(4): 463–498.
Verschueren, Jef
[1995] 2000 Notes on the role of metapragmatic awareness in language use. Pragmat-
ics 10(4): 439–456.
Walsh, Steve
2013 Corpus linguistics and conversation analysis at the interface: Theoretical
perspectives, practical outcomes. In: Jesús Romero-Trillo (ed.), Yearbook of
Pragmatics and Corpus Linguistics 2013, 37–51. Berlin: Mouton de Gruyter.
Watts, Richard, Sachiko Ide and Konrad Ehlich
1992 Introduction. In: Richard Watts, Sachiko Ide and Konrad Ehlich (eds.), Polite-
ness in Language. Studies in its History, Theory and Practice, 1–17. Berlin:
Mouton de Gruyter.
Wierzbicka, Anna
[1991] 2003 Cross-cultural Pragmatics. The Semantics of Human Interaction, 2nd ed.
Berlin: Mouton de Gruyter.
Wittgenstein, Ludwig
1972 Philosophical Investigations. Third edition. Transl. G. Elizabeth M. Ans-
combe. Oxford: Basil Blackwell.
Wolfson, Nessa
1983 An empirically based analysis of complimenting in American English. In:
Nessa Wolfson and Elliot Judd (eds.), Sociolinguistics and Language Acquisi-
tion, 82–95. Rowley, MA: Newbury House.

Appendix

CHAT-CA transcription conventions (MacWhinney 2000)

. final intonation
, continuing intonation
→ level intonation
↗ rising to mid intonation
hhh aspiration
(0.5) pause length
+≈ no break continuation (i.e. latched talk)
⌈ top begin overlap
⌉ top end overlap
⌊ bottom begin overlap
⌋ bottom end overlap

CA transcription conventions (Jefferson 2004)

. final intonation
, continuing intonation
? rising intonation
↓ ↑ sharp falling/rising intonation
: elongation of vowel/consonant sound
underlining contrastive stress or emphasis
> < talk is compressed or rushed
(.) micropause
(0.5) timed gap/pause
= latched talk
- cut-off talk
[ ] overlapping talk
hhh aspiration
(hh) interpolated aspiration
.hhh inbreathing
Bionotes

Karin Aijmer is professor emerita in English linguistics at the University of Goth-
enburg, Sweden. Her research interests focus on pragmatics, discourse analysis,
modality, corpus linguistics and contrastive analysis. Her publications include
Conversational Routines in English: Convention and Creativity (1996), English
Discourse Particles. Evidence from a Corpus (2002), The Semantic Field of Modal
Certainty: A Study of Adverbs in English (with co-author) (2007), Understanding
Pragmatic Markers. A Variational Pragmatic Analysis (2013). She is co-editor of
Pragmatics of Society (2011) and of a Handbook of Corpus Pragmatics (2014) and
co-author of Pragmatics. An Advanced Resource Book for Students (2012).

Gisle Andersen is Professor of English linguistics at NHH Norwegian School of
Economics in Bergen, Norway. His publications include Pragmatic Markers and
Sociolinguistic Variation (2001), Trends in Teenage Talk (with A.-B. Stenström and
I.K. Hasund, 2002) and Pragmatics of Society (ed. with Karin Aijmer, 2012). He
has recently studied pragmatics from a language contact perspective, developing
the notion of pragmatic borrowing, exemplified through the study of the influx of
English on Norwegian pragmatics.

Dawn Archer is a Professor of Pragmatics and Corpus Linguistics at Manches-
ter Metropolitan University, England. She has been involved in the creation of
pragmatic annotation schemes, with Jonathan Culpeper, as well as the develop-
ment of annotation tools, with Alistair Baron and Paul Rayson (Lancaster Univer-
sity). Recent pragmatic publications by Archer include Pragmatics: An Advanced
Resource Book for Students (2012), co-authored with Aijmer and Wichmann, and
The Pragmatics Reader (2011), co-edited with Grundy.

Wolfram Bublitz is Professor Emeritus of English Linguistics at the University
of Augsburg, Germany. His research interests concern most areas of text analysis,
(constructivist) pragmatics and computer-mediated communication. His (edited)
books include Pragmatics of Social Media (with C.R. Hoffmann, 2017), The Prag-
matics of Quoting Now and Then (with J. Arendholz and M. Kirner-Ludwig, 2015),
Foundations of Pragmatics (with N. Norrick, 2011), Englische Pragmatik (2nd ed.,
2009), Metapragmatics in Use (with A. Hübler, 2007), Coherence in Spoken and
Written Discourse (with U. Lenk and E. Ventola, 1999).

https://doi.org/10.1515/9783110424928-025

Piotr Cap is Professor of Linguistics at the University of Łódź, Poland. His inter-
ests are in pragmatics, critical discourse studies, political linguistics and genre the-
ory. His publications include Perspectives in Politics and Discourse (Benjamins,
2010), Proximization: The Pragmatics of Symbolic Distance Crossing (Benjamins,
2013), Analyzing Genres in Political Communication (Benjamins, 2013), Contem-
porary Critical Discourse Studies (Bloomsbury, 2014) and The Language of Fear:
Communicating Threat in Public Discourse (Palgrave, 2017). He is Managing Edi-
tor of International Review of Pragmatics.

Billy Clark is Professor of English Language and Linguistics at Northumbria Uni-
versity in Newcastle. His research and teaching interests cover a wide range of
topics in linguistics and linguistic theory, with a particular focus on semantics and
pragmatics. This has included work on lexical and syntactic meaning, semantic
change, phatic communication, prosodic meaning, multimodality and stylistics. He
has a long-standing interest in connections between work at school and at univer-
sity. He is a member of the UK Linguistics Olympiad committee and, with Mar-
cello Giovanelli and Andrea Macrae, coordinates the Integrating English project
(http://integratingenglish.org).

Jonathan Culpeper is Professor of English Language and Linguistics in the
Department of Linguistics and English Language at Lancaster University, UK. His
work spans pragmatics, stylistics and the history of English, and is often under-
pinned by corpus methods. His recent major publications include Impoliteness:
Using Language to Cause Offence (2011, CUP) and Pragmatics and the English
Language (2014, Palgrave; with Michael Haugh).

J. César Félix-Brasdefer is Professor of Linguistics and Spanish at Indiana Uni-
versity, Bloomington, USA. His research interests include pragmatics, discourse
analysis, cross-cultural and interlanguage pragmatics, and (im)politeness. He has
published numerous research articles in scholarly journals and handbooks. His
most recent book is entitled The Language of Service Encounters: A Pragmat-
ic-Discursive Approach (2015, Cambridge University Press), and he has a forth-
coming co-edited volume (with Kathleen Bardovi-Harlig) on Pragmatics and
Language Learning Vol 14. Honolulu: University of Hawai‘i, National Foreign
Language Resource Center.

Anita Fetzer is a full professor of Applied Linguistics at the University of Augs-
burg, Germany. Her research interests focus on pragmatics, discourse analysis and
functional grammar. She has had a series of articles published on context, political
discourse, discourse relations, and the communicative act of rejection. Her most
recent publications are Expositives in Discourse (2016), with Etsuko Oishi, and
The Dynamics of Political Discourse: Forms and Functions of Follow-Ups (2015),
with Elda Weizman and Lawrence N. Berlin. Anita Fetzer is Editor of the book
series Pragmatics & Beyond: New Series (John Benjamins). She is a member of
several editorial boards, including Functions of Language, Pragmatics & Cogni-
tion, Journal of Language and Politics (John Benjamins), Journal of Pragmatics
(Elsevier), Research on Language and Social Interaction (Taylor and Francis) and
Text & Talk (de Gruyter), and she is an elected member of the Consultation Board
of the International Pragmatics Association.

Raymond W. Gibbs, Jr. is Distinguished Professor of Psychology at the Univer-
sity of California, Santa Cruz. His research interests focus on embodied cognition,
pragmatics and figurative language. He is the author of several books, including
The Poetics of Mind: Figurative Thought, Language and Understanding (1994),
Intentions in the Experience of Meaning (1999), Embodiment and Cognitive Sci-
ence (2006), Metaphor Wars: Conceptual Metaphor in Human Life (2017), and
(with Herb Colston) Interpreting Figurative Meaning (2012), all published by
Cambridge University Press. He is also editor of the Cambridge Handbook of Met-
aphor and Thought (2008) (CUP), and editor of the journal Metaphor and Symbol.

Andrea Golato is Professor of German and Dean of The Graduate College at
Texas State University. Her research interests lie primarily in conversation analy-
sis, specifically in culture and communication and grammar in interaction. She has
published extensively on compliments and compliment responses, the function of
various response tokens, and repair.

Peter Golato is Associate Professor of French at Texas State University. His
research interests lie primarily in applied linguistics, second language studies, and
conversation analysis. He has published on word- and sentence-level processing in
French and on research methodologies. In conversation analysis, he has published
on repair and response tokens in French.

Michael Haugh is Professor of Linguistics in the School of Languages and Cul-
tures at the University of Queensland. His research interests lie in pragmatics, con-
versation analysis and intercultural communication. He is particularly interested
in the ways in which spoken corpora can support studies of the role of language in
social interaction.

Yan Huang (PhD Cambridge, DPhil Oxford) is Professor of Linguistics at the
University of Auckland, and Changjiang Scholar Chair Professor (appointed by
the Ministry of Education, China) at Beijing Foreign Studies University. He has
previously taught linguistics at the universities of Cambridge, Oxford, and Read-
ing, where he was Professor of Theoretical Linguistics. He has also spent his
sabbaticals at Yale, Harvard, Cambridge, and Oxford universities, and at a num-
ber of top universities in Australia and China. His books include The Syntax and
Pragmatics of Anaphora (Cambridge University Press, 1994, re-issued in 2007),
Anaphora: A Cross-linguistic Study (Oxford University Press, 2000), Pragmatics
(Oxford University Press, 2007), The Oxford Dictionary of Pragmatics (Oxford
University Press, 2012) and Pragmatics 2nd edition (Oxford University Press,
2014). He is also the editor of The Oxford Handbook of Pragmatics (Oxford Uni-
versity Press, 2017). In addition, he has published numerous articles and reviews
in leading international journals of linguistics. He is on the editorial board of a
number of international linguistics journals and research monograph series. He has
been invited to give research lectures/seminars at more than 120 universities and
research institutes in Europe, North America, East and Southeast Asia, Australasia,
and North Africa.

Andreas H. Jucker is Professor of English Linguistics at the University of Zurich,
Switzerland. His current research interests include historical pragmatics, polite-
ness theory, speech act theory, and the grammar and history of English. His recent
publications include English Historical Pragmatics (Edinburgh University Press,
2013) co-authored with Irma Taavitsainen, Communities of Practice in the History
of English (Benjamins, 2013) co-edited with Joanna Kopaczyk and Diachronic
Corpus Pragmatics (Benjamins, 2014) co-edited with Irma Taavitsainen and Jukka
Tuominen.

Napoleon Katsos is a Reader in Experimental Pragmatics at the Department of
Theoretical and Applied Linguistics, at the University of Cambridge, UK. His
research focuses on typically- and atypically-developing monolingual and bilin-
gual children, with a special interest in pragmatics and cognition.

Roger Kreuz is a professor of psychology and an associate dean in the College of
Arts and Sciences at the University of Memphis. He received his PhD in cognitive
psychology from Princeton University. He has broad research interests within the
psychology of language and has primarily conducted research on pragmatics, dis-
course processing, and nonliteral language production and comprehension. With
Richard Roberts, he is the author of Becoming Fluent: How Cognitive Science Can
Help Adults Learn a Foreign Language and Getting Through: The Pleasures and
Perils of Cross-cultural Communication (both with MIT Press).

Mariana Lazzaro-Salazar is a Research Associate of the Language in the Work-
place Project team. She is also a Postdoctoral Fellow and a member of the Ethics
Committee at Universidad Católica del Maule, Chile. Her research has focused on
healthcare communication, including nurses’ construction of professional identity,
an evaluation of doctors’ feedback and communication barriers experienced by
migrant doctors in the Chilean healthcare system.

Meredith Marra is Associate Professor at Victoria University of Wellington
and Director of the Wellington Language in the Workplace Project. She has been
involved in collecting and analysing naturally-occurring data in New Zealand
organisations since 1998. Her primary research interest is the language of business
meetings, and she has published in the areas of humour, gender and ethnicity in
workplace interactions.

Eva Ogiermann is Senior Lecturer in English Language and Applied Linguistics
at King’s College London. Her work investigates culture-specific perceptions and
conceptualizations of politeness in English, German, Polish and Russian and ana-
lyses interactions in English and Polish families. Her publications include a mono-
graph on apologising and articles in Intercultural Pragmatics, Journal of Polite-
ness Research, Journal of Pragmatics, Multilingua and Research on Language and
Social Interaction. She is also associate editor of the Journal of Pragmatics.

Anne O’Keeffe is Senior Lecturer in Applied Linguistics, at Mary Immaculate
College, University of Limerick, Ireland. Much of her research explores prag-
matics in spoken corpora (e.g. everyday conversation, academic discourse and
media discourse). She also researches the grammatical competency development
of English learners, as part of the English Grammar Profile (Cambridge Univer-
sity Press). She is author of a number of papers and books, including Introducing
Pragmatics in Use (Routledge).

Monica Riordan is an assistant professor of psychology in the College of Arts,
Science, and Business at Chatham University. She received her PhD in experimen-
tal psychology from the University of Memphis. She has research interests in the
psychology of human interaction and has primarily conducted research regarding
how interlocutors construct meaning both verbally and nonverbally in online con-
texts such as texting, email, and social media. A list of her works can be found
at blogs.chatham.edu/mriordan.

Marina Sbisà is Professor of Philosophy of Language at the University of Trieste,
Italy. She has done research in the philosophy of language, semiotics, discourse
analysis and gender studies, with particular attention to pragmatic issues such as
speech acts, presupposition, implicature, and context. She collaborated on the
revised edition of J.L. Austin, How to Do Things with Words (1975). She authored
many publications in Italian and English and is co-editor (with Ken Turner) of
Pragmatics of Speech Actions (Handbooks of Pragmatics, volume 2, de Gruyter,
2013).

Klaus P. Schneider is Professor of Applied English Linguistics at the University
of Bonn. He studied English, Russian and Education in Marburg, Edinburgh and
Moscow and received his doctoral degree, his post-doctoral degree and a teach-
ing degree from the University of Marburg. Before coming to Bonn, he taught at
the universities of Marburg, Hamburg, and Rostock, and at University College
Dublin. He is especially interested in pragmatics and sociolinguistics. His current
research focuses on pragmatic variation across languages and cultures, pragmatic
competence and pragmatic assessment, and perceptions of (im)politeness. His pub-
lications include the monographs Small Talk: Analysing Phatic Discourse (1988)
and Diminutives in English (2003), the volumes The Pragmatics of Irish Eng-
lish (2005), Variational Pragmatics (2008), and Pragmatics of Discourse (2014),
co-edited with Anne Barron, and the special issues of Intercultural Pragmatics
on “Variational Pragmatics” (2009, also with Anne Barron) and of the Journal of
Pragmatics on “Im/politeness across Englishes” (2012, with Michael Haugh).

Irma Taavitsainen is Professor Emerita of English Philology at the University
of Helsinki and Deputy Director of the Research Unit for Variation, Contacts and
Change in English. Her research focuses on historical pragmatics, corpus linguis-
tics, genre and register variation and the evolution of scientific thought styles in
medical writing. She has published widely in these fields. Her most recent co-ed-
ited volumes are Diachronic Corpus Pragmatics (Benjamins 2014), Developments
in English: Expanding Electronic Evidence (CUP 2015), Diachronic Develop-
ments in English News Discourse (Benjamins 2017) and two special issues (2017):
Studia Neophilologica (89: SI) and Journal of Historical Pragmatics (18:2).

Alma Veenstra is a Research Associate at the Department of Theoretical and
Applied Linguistics, at the University of Cambridge, UK. Her research focuses on
language processing and development, with a special interest in pragmatics and
morpho-syntax.
Name index

A
Aarts, Bas 483, 492
Aarts, Jan 456, 464
Adato, Michelle 352–353, 362
Ädel, Annelie 472, 486, 489, 563, 579, 587, 598, 613
Adolphs, Svenja 233, 253, 455, 463–465, 468, 476, 486–487, 562, 580, 583–584, 590–591, 593, 595–597, 599, 605, 613, 616–617
Adrefiza 46–47, 65, 86, 221, 228
Ahlsén, Elisabeth 113
Aiden, Aviva Presser 34, 549
Aijmer, Karin 47, 82, 198, 212, 336, 341, 407, 419, 455, 464, 470, 480–481, 487, 493, 496, 520, 528, 550, 555, 557, 561, 563, 567, 570, 573–576, 579–580, 585, 588, 590, 599, 602, 613, 617, 623, 633, 642
Airasian, Peter W. 329
Aitchison, Jean 545
Al-Ali, Mohammed Nahar 231, 247
Al-Gahtani, Saad 324, 326, 328
Al-Zumor, Abdul Wahed Qasem Ghaleb 231, 247
Alawneh, Rami 231, 247
Alfonzetti, Giovanna 7, 28
Allami, Hamid 231, 247
Allen, James F. 495, 501, 503–504, 522–523
Allen, Kathryn 532, 548
Allton, Diane 46, 51–52, 55, 85
Allwood, Jens 110, 113
Altenberg, Bengt 561, 580
Alvsåker Didriksen, Anders 490
Ameka, Felix K. 575, 580
Andersen, Gisle 50, 92, 198–199, 205, 335, 341, 455–456, 464, 466–467, 469, 471–473, 476–477, 481–484, 486–487, 491–492, 494, 569–570, 580, 600, 616
Anderson, Anne H. 521
Anderson, Benedict 346, 362
Anderson, Timothy 102, 119
Andrews, Steven 303
Androutsopoulos, Jannis K. 476, 487
Angouri, Jo 40, 82, 349, 362
Antaki, Charles 245, 247, 379, 382
Archer, Dawn 49, 82, 336, 341, 485, 487, 495, 497, 499–500, 502, 507, 510–522, 543, 546
Ariel, Mira 196, 205
Arnold, Gordon F. 103, 118
Arnovick, Leslie K. 542, 546
Arundale, Robert B. 401, 419
Asher, Nicholas 397, 401, 406, 419
Asmuß, Birte 381–382
Aston, Guy 229, 233, 245, 248
Athitsos, Vassilis 100, 118
Atkinson, Dwight 346, 360, 365, 477, 488
Atkinson, J. Maxwell 102, 113, 370, 375, 377, 380, 382, 388, 444
Atkinson, Paul 354, 364
Atlas, Jay D. 160, 179
Auer, Peter 73, 82, 104, 120, 370, 379–383, 392
Auf dem Keller, Caren 10, 28
Austin, John L. 5, 7, 29, 49, 82, 134–137, 150, 411–413, 415, 419, 500–501, 521

B
Bacevich, Andrew 438, 444
Bach, Kent 142, 147, 150, 163, 179, 415, 419, 501, 521
Bachrach, Asaf 263, 276
Bagoutdinov, Andreas 469, 493
Bailey, Todd M. 260, 275
Baisa, Vít 641
Baker, Helen 537, 542, 549
Baker, Paul 430, 444, 447, 455, 465, 470, 475, 481, 485–488, 490
Baldridge, Jason 502, 521
Baldwin, Dare A. 262, 275
Baldwin, Timothy 493
Bale, Alan 82, 262, 275
Ball, Martin J. 110, 113
Ballmer, Thomas T. 501, 521
Bangerter, Adrian 83, 155, 179, 301
Bar-Hillel, Yehoshua 124, 130, 146, 150
Bara, Bruno G. 201, 205

https://doi.org/10.1515/9783110424928-026

Baranova, Julija 385 Biber, Douglas 9, 12, 29, 31, 398, 419, 472,
Bard, Ellen Gurman 268, 275 474, 477, 488, 492, 499, 521, 535–536,
Barden, Birgit 104, 120 546, 557, 571, 575, 578, 580, 603, 613–
Bardovi-Harlig, Kathleen 230, 236–237, 614
240–244, 248, 250, 597, 615, 646 Biernacka, Ewa 213
Bargiela-Chiappini, Francesca 344, 366 Bieswanger, Markus 56–57, 82, 225, 227
Barner, David 71, 82, 262–263, 275 Bill, Cory 272, 275
Baron, Alistair 518, 521, 532 Billig, Michael 444
Barr, Dale 297, 301 Billmyer, Kristine 229, 232, 249, 310, 328,
Barraja-Rohan, Anne-Marie 382–383 587, 614
Barron, Anne 43–44, 47–48, 64, 66, 82, 91, Bilmes, Jack 8, 29, 634, 638
220, 227, 229, 231, 233, 236–237, 239, Bird, Cindy M. 112–113
248, 481, 488, 493, 569, 580, 621, 642 Birkner, Karin 392
Barske, Tobias 378, 383 Bishop, Dorothy V. M. 264–266, 269, 271,
Barth-Weingarten, Dagmar 377, 383, 392 277
Bastide, Anne 278 Blakemore, Diane 7, 29, 200, 203, 205
Bataineh, Ruba Fahmi 230–231, 248 Blommaert, Jan 432, 445
Bataineh, Rula Fahmi 230–231, 248 Bloom, Lois 108, 113
Bates, John 350, 359, 366 Blum-Kulka, Shoshana 7, 29, 41, 43, 46, 65,
Bates, Rebecca 524 82, 141, 150–251, 220–223, 227–228, 230,
Bateson, Gregory 621–622, 638, 642 232, 236, 238, 246, 249, 500, 514–515,
Bavelas, Janet B. 99, 105–106, 113 521, 587, 614
Baxter, Judith A. 352, 362 Blumer, Herbert 367, 383
Beaugrande, Robert-Alain de 10, 29, 398, Bly, Brigitte 297, 302
400, 419 Blythe, Joe 385, 555–556, 581
Beavin, Janet Helmick 153 Bodman, Jean 237, 243, 246, 250, 597, 614
Bednarek, Monika 25–26, 29, 40, 48–49, 82, Bolden, Galina 380, 383, 389
485, 488, 625, 638 Bond, Francis 493
Beebe, Leslie M. 229, 232, 236, 238–239, Bonikowska, Małgorzata 237, 249
242–243, 248, 597, 613 Bonner, Ann 347, 362
Beeching, Kate 9, 29, 43, 61, 82, 220, 227, Bontly, Thomas 188, 205
560, 568, 570, 580 Bosco, Francesca M. 201, 205
Behne, Tanya 262, 275, 278 Bott, Lewis 258, 260–261, 276
Behrens, Bergljot 419 Bou Franch, Patricia 243–244, 249
Bell, Nancy D. 631, 633–635, 642 Boxer, Diana 350, 362
Bella, Spyridoula 231, 248 Boyd, Danah 18, 29
Benesh, Nick 117 Boyle, Joyceen 344, 362
Bengry-Howell, Andrew 344, 364 Brandom, Robert 135, 150
Benjamin, Trevor 379, 385, 390 Breeze, Ruth 425–426, 432, 445
Benke, Gertraud 426, 444 Breheny, Richard 192, 201, 205, 260, 276
Berenz, Norine 104, 115 Brennenstuhl, Waltraud 501, 521
Berger, Cynthia M. 631, 633–635, 642 Breustedt, Barb 86
Bergman, Marc L. 231, 249 Briggs, Sarah 642
Bergmann, Jörg 104, 120, 392 Brinton, Laurel J. 533, 546, 564–565, 580,
Bergmann, Pia 392 588, 614
Bergner, Heinz 540, 546 Brooks, Neon 82, 262, 275
Betz, Emma 378, 381–383, 386 Brown, Annie 320, 322, 328
Bezuidenhout, Anne 201, 205 Brown, Jean D. 309, 324, 328
Bharuthram, Sharita 230, 249 Brown, Penelope 42, 44, 83, 141, 150, 168,
Bhatia, Vijay 428, 444 173, 179, 233, 249, 349, 363, 378, 385,
Bianchi, Claudia 147, 150 411, 413, 417, 419
Brown, Roger 22, 29, 235, 249 Chemla, Emmanuel 166, 179, 258, 276
Brownlees, Nicholas 11, 29 Chen, Rong 45, 65, 83, 611, 614
Brugman, Hennie 109, 120 Cheng, Winnie 78, 83, 606, 614
Bryant, Gregory 293, 302 Cheshire, Jenny 472, 488
Bublitz, Wolfram 14, 18, 29, 31, 42, 83, Chevallier, Coralie 201, 206, 262, 279
130, 402, 419, 432, 445, 622, 628–629, Chierchia, Gennaro 165–166, 179, 197, 206,
640 260, 262, 276–278
Bucciarelli, Monica 201, 205 Childs, Becky 96, 117
Buchanan, Elizabeth 77, 88–89 Chilton, Paul 426–427, 429–430, 433, 445–
Bucholtz, Mary 99, 111, 113 446, 451
Bull, Peter 107, 113, 446 Chomsky, Noam 41, 83, 173, 179, 349, 363
Bunt, Harry C. 109, 113, 501, 521 Chouliaraki, Lilie 426–427, 446
Burgess, Robert 344, 363 Chovanec, Jan 434, 446
Burkert, Anna Kristin 382–383 Chovil, Nicole 105–106, 113–114
Bush, George W. 437, 439, 441, 445 Chow, Katherine 275
Busse, Ulrich 622–626, 640 Cibulka, Paul 18–19, 29
Bušta, Jan 641 Cienki, Alan 19, 29, 434, 446
Byon, Andrew Sangpil 230–231, 249 Clancy, Brian 484, 488, 528, 541, 546, 588,
Byrd, Pat 613–614 599, 605, 614, 616–617, 623, 633, 638
Clancy, Dan 34, 549
C Claridge, Claudia 11, 30, 582, 584
Caffi, Claudia 141, 150, 621, 638 Clark, Billy 189, 192, 195, 202–204, 206–
Caink, Andrew 203, 206 207, 211
Cameron, Deborah 361, 363 Clark, Eve V. 262, 276
Cameron, Lynne J. 112, 120 Clark, Herbert H. 49, 83, 98–99, 114, 155,
Canavan, Alexandra 638 179, 193, 207, 285, 290–291, 294–296,
Cap, Piotr 340–341, 425–431, 433–434, 301–302, 509, 522, 557, 566, 576, 580
436–437, 443, 445, 447 Clark, Nathaniel 289, 301, 303
Cara, Francesco 192–195, 213 Clark, Victoria 614
Carbary, Kathleen M. 260, 277 Clayman, Steven E. 378, 381, 383
Carbaugh, Donal 58–59, 85, 630, 640 Clément, Fabrice 231
Carles, Laura 293, 303 Clements, Paul 42, 83
Carletta, Jean 495, 502, 505–506, 513, 521 Clift, Rebecca 8, 30
Carlson, Gregory N. 279 Clifton Jr., Charles 166, 179, 260, 278
Carlson, Lynn 502, 521 Clyne, Michael 48, 83
Carpenter, Malinda 262, 275, 278 Coccaro, Noah 524
Carroll, Ruth 32, 544, 546 Cohen, Andrew D. 49, 72, 83, 233, 249, 305,
Carruthers, Peter 191, 206 325, 328
Carston, Robyn 147, 150, 189, 191, 197, Cohen, Jacob 112, 114
202–203, 206, 212, 214, 260, 276 Cohen, Jeffrey F. 106, 119
Carter, Ronald 476, 487, 562, 580, 589, 616 Cohen, Philip R. 416, 419
Cenoz, Jasone 231, 249 Collins, Peter 471, 480, 488
Čermák, František 478, 488 Colombino, Tommaso 201, 207
Chafe, Wallace 289, 398, 402, 419, 639 Colston, Herbert 287, 290, 301–302
Chambers, Craig G. 279, 441 Connelly, Gerry 107, 113
Chan, Angela 349, 366 Connor, Ulla 478, 488
Chang, Hsiang-Hu 278 Conrad, Susan 29, 419, 580, 613–614
Channell, Joanna 606, 614 Cook, Guy 99, 114, 482
Chapman, Siobhan 8, 29 Copestake, Ann 493
Charles, Cassily 450 Core, Mark 495, 501, 504, 522
Charteris-Black, Jonathan 427, 430, 445 Corley, Martin 293, 301, 468, 488
Cornillie, Bert 481, 491 Davis, Kathryn 343, 345, 360, 363
Corsaro, William 350, 359, 366 De Cock, Sylvie 577, 581
Cortes, Viviana 614 De Neys, Wim 260, 276
Coseriu, Eugenio 10, 30 De Sutter, Gert 479–480, 489
Cosmides, Leda 191, 195, 208 Defrancq, Bart 479–480, 489
Coulmas, Florian 245, 250 Dehé, Nicole 377, 383
Coulthard, Malcolm 6, 8, 35, 401, 410–411, Deignan, Alice 199, 207
422, 502, 505, 513, 524 Delaney, Suzanne 283, 302
Couper-Kuhlen, Elizabeth 104, 120, 374, Demeter, Gusztav 324, 328
377, 379, 383–384, 392 Demmen, Jane 543, 549
Coupland, Nikolas 622, 629, 640 Denis, Derek 472, 489, 494
Crain, Stephen 275, 277 Denscombe, Martyn 345, 362–363
Craven, Alexandra 245, 249 Dent, Susie 157
Cray, Ellen 92 Deppermann, Arnulf 392
Croft, William 566, 572, 580–581 Detges, Ulrich 560, 580
Crookall, David 306, 328 Deutschmann, Mats 7, 30, 52, 84, 455, 465,
Cruse, D. Alan 566, 581 476, 489, 590–591, 607–608, 613–614
Crystal, David 16, 30, 475, 483, 489, 575, Deyhle, Donna 347, 365
581 Di Luzio, Aldo 370, 382
Csibra, Gergely 262, 279 Dieussaert, Katrien 260, 276
Csomay, Eniko 499, 521–522, 614 Diggle, Peter J. 490
Cuenca, Maria Josep 575, 581 Dik, Simon 398, 419
Culpeper, Jonathan 9, 15, 30, 42, 44, 71, 83, Diller, Hans-Jürgen 542, 546
455, 465, 485, 489–490, 495, 497–500, Dines, Elizabeth R. 568, 581
507, 510–511, 515, 517, 519–522, 530, Dingemanse, Mark 381, 384–385, 555–556,
546, 599, 622–627, 638 581
Cumming, Susanna 114 Dinkin, Aaron 44–45, 57, 84
Cummings, Martha C. M. 229, 242–243, 248, Dirksmeyer, Tyko 385, 555–556, 581
597, 613 Ditman, Tali 274, 278
Cummins, Chris 191–192, 210, 212 Dixon, Carol 99, 115
Curell i Gotor, Hortènsia 231, 253 Doherty-Sneddon, Gwyneth 521
Curl, Traci 245, 250, 379, 384 Donaldson, David I. 468, 488
Dong, Li 91
D Dörnyei, Zoltán 39, 70, 74, 76, 84, 305, 329
Dachkyovsky, Svetlana 109, 114 Dorst, Aletta 213
Dahl, Merete 4, 32, 49, 71, 87, 242, 251, Dostie, Gaétane 480, 489
305–306, 313, 326, 330 Drake, Derek 378, 382, 384
Dahl, Trine 490 Drake, Veronika 378, 382, 384
Dahlbäck, Nils 521 Drange, Eli-Marie Danbolt 479–480, 489
Dale, Rick 298–299, 301, 303 Dressler, Richard A. 101–102, 105, 114
Damico, Jack S. 101, 108, 110, 114, 117 Dressler, Wolfgang 10, 29, 398, 400, 419
Danescu-Niculescu-Mizil, Cristian 599, 614 Drew, Paul 245, 250, 370, 379–380, 382,
Dasher, Richard B. 533–534, 538, 551 384–385, 389, 636, 639
Davidse, Kristin 481, 491 Drummond, Kent 378, 384
Davidson, Christina 109, 112, 114 Du Bois, John W. 100, 103, 114, 339, 341,
Davidson, Donald 257, 276 631, 639
Davidson, Judy Arlene 379, 384 Dube, Chad 166, 179
Davies, Bethan 622, 639 Duckworth, Amber 276
Davies, Catherine 266, 268, 271, 276 Duff, Patricia 360, 363
Davies, Mark 476, 489, 631, 634, 639 Duffley, Patrick 486, 491
Davies, Matthew 495, 497, 507, 519, 521 Dummett, Michael 257–276
Dunmire, Patricia 430, 433–434, 439, 446 Feeney, Aidan 260, 262, 276
Duran, Richard 301 Felder, Ekkehard 455, 465
Duranti, Alessandro 24, 30, 324, 329, 370, Félix-Brasdefer, J. César 27, 30, 47, 49, 54,
384 61, 71–73, 84, 221, 231, 250, 305–306,
Durkheim, Emile 367, 384 308–313, 321, 323–327, 329
Dürscheid, Christa 16–17, 32 Ferrara, Kathleen 561, 581
Fetzer, Anita 10, 30, 396, 398–411, 417–418,
E 420–423, 431, 446
Easton, Kristen L. 110, 114 Fiddick, Laurence 195, 208
Ebeling, Hanna 210 Filardo Llamas, Laura 430–431, 434, 446
Ebeling, Jarle 480–481, 489 Fillmore, Charles J. 41, 84, 572, 583
Ebeling, Signe Oksefjell 480–481, 489 Fincke, Steven 386
Eckert, Penelope 346, 363 Finegan, Edward 29, 419, 477, 488, 580
Economidou-Kogetsidis, Maria 72, 84, 231, Finlayson, Ian 293, 301
243, 250 Fiorentino, Robert 277
Edmondson, Willis J. 8, 30, 46, 60, 84, 220, Firth, Alan 381, 385
227, 410, 419 Fischer-Starke, Bettina 485, 489
Edwards, Jane A. 98, 100–101, 108, 115 Fischer, Kerstin 563–564, 567, 576, 581
Eelen, Gino 349, 363 Fitzmaurice, Susan M. 545–546, 550
Egbert, Maria 372, 378–380, 384 Fjeld, Ruth Vatvedt 480, 489
Eglin, Peter 634, 639 Fleischman, Suzanne 540, 546
Ehlich, Konrad 42, 92, 98, 104, 107, 110, Fletcher, Jeannie 352, 363
115, 626, 642 Flickinger, Dan 493
Eisenstein, Miriam 237, 243, 246, 250, 597, Flöck, Ilka 24, 27, 30, 463, 465, 591, 593,
614 595, 597–598, 615
Ekman, Paul 106, 109, 115 Fløttum, Kjersti 479, 490
Elgesem, Dag 493 Flowerdew, John 425–426, 429–431, 446,
Elm, Malin S. 79, 84 478, 490
Emerson, Robert 350, 363 Floyd, Simeon 385
Enfield, Nick J. 378–379, 381, 384–385 Fodor, Jerry A. 186, 208
Engel, Ulrich 78, 84 Foppolo, Francesca 262, 276–277
Englebretson, Robert 639 Forceville, Charles 203, 208
Ericsson, K. Anders 123, 130, 186, 207 Ford, Cecilia E. 374, 377–379, 383, 385
Erman, Britt 557, 581 Foster-Cohen, Susan H. 201–202, 208
Ervin-Tripp, Susan 41, 84 Foudon, Nadege 212
Eslami, Zohreh R. 231, 250 Fowler, Roger 425, 427, 429, 447
Esser, Jürgen 10, 30, 48, 84 Fox, Barbara A. 373, 377–379, 381, 385–386
Evans, Jonathan 194, 207 Fox, Danny 179
Fox, Sue 488, 494
F Fox Tree, Jean E. 98, 114, 576, 580
Fabb, Nigel 203, 207 Francik, Ellen 285, 301
Fabricius-Hansen, Cathrine 398–399, 419– Francis, Alexander L. 378, 391
420 Francis, W. Nelson 470, 490
Facchinetti, Roberta 455, 465, 474–475, 489, Frank, Michael C. 272–273, 277, 279
493 Franks, Heather Mair 289, 302
Færch, Claus 5, 30, 41, 63, 84, 123, 130, 186, Franquiz, Maria 99, 115
207 Fraser, Bruce 9, 31, 141, 150
Fairclough, Norman 418, 420, 425–427, 429, Frazer, Elizabeth 363
446 French, Peter 377–378, 386
Fanshel, David 410, 416, 421 Fretz, Rachel 350, 363
Farr, Fiona 590, 614 Fried, Mirjam 565, 572, 581
Friesen, Wallace V. 106, 109, 115 Goddard, Cliff 634, 639
Friginal, Eric 594, 614 Goffman, Erving 368, 386, 601, 604, 615
Fulcher, Glenn 321, 330 Golato, Andrea 4, 7, 23, 27, 31, 49, 73, 85,
Furlong, Anne 203, 208 242–244, 246, 250, 336, 341, 371, 378–
Fusaroli, Ricardo 301 379, 381–384, 386, 393
Golato, Peter 49, 73, 85, 381, 386
G Goldberg, Adèle 572, 581
Gabrielatos, Costas 444, 447, 477, 490, 494, Goldin-Meadow, Susan 106, 115
570, 585 Good, David 401, 419
Galasiński, Dariusz 629, 640 Goodwin, Charles 370, 377, 379, 384, 386–
Garcia McAllister, Paula 590, 602–603, 613, 387, 504, 522
615 Goodwin, Marjorie Harness 379, 387
García, Carmen 480, 493 Göy, Elif 60, 85
Garcia, Paula 599, 615 Graham, Sage 346, 363
Garfinkel, Harold 43, 54, 85, 368–369, 386, Grainger, Karen 239, 250
634, 639 Granger, Sylviane 481, 490
Garretson, Gregory 489 Grassmann, Susanne 262–263, 277, 279
Gass, Susan M. 237, 244, 251, 305, 321, Gray, Bethany 9, 31, 535, 546
326–327, 329–330 Gray, Matthew K. 34, 549
Gauker, Christopher 144, 150 Greatbatch, David 378, 387
Gay, Lorraine R. 323, 329 Greaves, Chris 83
Gee, James P. 96, 115 Green, Judith 99, 115
Gee, Matt 477, 491 Greenbaum, Sidney 470, 490
Geertz, Clifford 348, 363, 625, 639 Greenberg, Rivka 110, 114
Geluykens, Ronald 463, 465, 591, 593, 595, Greer, Tim 320, 322, 330
597–598, 615 Grefenstette, Gregory 477, 491
Gernsbacher, Morton-Ann 402, 414, 420 Greimas, Algirdas J. 411, 420
Gerrig, Richard 302 Grice, H. Paul 6–7, 31, 49, 85, 130, 137–
Geurts, Bart 166, 179, 213, 260, 276 139, 151, 156, 158, 160, 163–164, 172–
Ghezzi, Chiara 557, 581 173, 179–180, 187–189, 209, 258, 277,
Gibbon, Dafydd 107, 115, 509, 522 293, 302, 404–405, 420, 440, 447, 516,
Gibbs, Raymond 141, 150, 192–193, 195, 522
208–209, 258, 277, 283, 285, 289–290, Gries, Stefan Th. 538, 547
293, 299–302 Griffin, Christine 344, 364
Gibson, Linzi 277 Grodner, Daniel 260, 276–277
Gibson, Will 98, 115 Groeber, Simone 18–19, 31
Gill, Virginia Teas 381, 386 Groefsema, Marjolein 190, 209
Gillam, Lynn 75, 85 Grönqvist, Leif 113
Gillard, Ellen 260, 276 Grootendorst, Rob 428, 450
Gilles, Peter 392 Grossman, Eitan 197, 209
Gillings, Mathew 498, 522 Grosz, Barbara 557, 582
Gilman, Albert 22, 29, 235, 249 Groth, Brian Ibbotson 61–62, 85
Gilquin, Gaëtanelle 577, 581 Gruber, Helmut 397, 420
Giltrow, Janet 430, 447 Grund, Peter J. 544, 548, 553
Gipper, Sonja 385 Gualmini, Andrea 277
Girotto, Vittorio 192–195, 213 Guasti, Maria Teresa 262, 276–277
Gisladottir, Rósa Signý 385 Guba, Egon 350, 364
Givón, Talmy 398–399, 401–403, 406–407, Guendouzi, Jacqueline A. 110, 118
414, 420 Guillemin, Marilys 75, 85
Glucksberg, Sam 258, 277 Gumperz, John J. 104, 115, 407, 413, 420–
Goatly, Andrew 431, 447 421
Gundel, Jeanette 202, 209 Heinemann, Trine 245, 250
Gunnarsson, Magnus 113 Heinemann, Wolfgang 10, 31
Günthner, Susanne 104, 120, 378, 387, 392 Heintz, Christophe 213
Gut, Ulrike 107, 115 Hellermann, John 381, 387
Helt, Maria 613–614
H Hendriks, Berna 231, 251
Hager, Joseph C. 106, 115 Hengeveld, Kees 398, 421
Hallgren, Kevin A. 500, 522 Henke, Christoph 10, 31
Halliday, M. A. K. 398–399, 401–403, 408, Henly, Anne 297, 302
421, 427–429, 447, 483, 490, 564–565, Henrichsen, Peter J. 113
582 Henze, Rosemary 343, 345, 360, 363
Halverson, Sandra 398, 477, 490 Hepburn, Alexa 378, 387, 389
Hambling-Jones, Oliver 24, 31, 338, 341 Herbert, Robert K. 64, 85
Hammersley, Martyn 96, 116, 345, 348, 354, Heritage, John 102, 113, 370, 372, 375, 377–
364 378, 380–382, 384, 387–388, 391, 393,
Handford, Michael 96, 115 403, 421, 444, 576, 582, 636, 640
Handley, Simon J. 276 Hernandez, Wilfridio Flores 386
Hanks, Patrick 199, 209 Herring, Susan C. 16, 31
Hanks, William F. 8, 31, 124, 130 Herrman, J. Berenike 213
Hansen, Maj-Britt Mosegaard 563, 582 Hill, Malcolm 76, 85
Happé, Francesca 192–193, 201, 206, 209 Hiltunen, Risto 546
Haravon, Anita 110, 116 Hiltunen, Turo 535, 538, 546–547, 552
Hardaker, Claire 624, 626–627, 638 Hinkel, Eli 241, 251
Hardie, Andrew 467, 492, 537, 547, 549 Hirataka, Fumiya 380, 384
Harnish, Robert M. 142–143, 150–151, 171, Hirschberg, Julia 260, 277, 557, 561, 582
180, 501, 521 Ho, Debbie G. E. 67, 85
Harren, Inga 378, 387 Hodge, Robert 425, 427, 429, 447
Harris, Roy 634, 639 Hoey, Elliott M. 378, 388
Harrison, Sandra 46, 51–52, 55, 85 Hoffmann, Christian 18, 31
Hart, Christopher 425–431, 433–434, 440, Hoffmann, Sebastian 480–481, 494
446–447 Hofland, Knut 470, 476–477, 487, 490, 493
Hart, Herbert L. A. 135, 151 Hofmockel, Carolin 407, 421
Hartford, Beverly S. 236–237, 240–244, 248, Hoiberg, Dale 34, 549
250, 615 Hoiting, Nina 109, 116
Hartung, Martin 392 Hollebrandse, Bart 268, 271, 279
Harvey, Penelope 363 Holmes, Janet 41, 43–44, 56, 65, 86, 141,
Hasan, Ruqaiya 402, 408, 421, 564–565, 582 Haselow, Alexander 560, 582
Haselow, Alexander 560, 582 388, 569, 582, 631, 640
Hasko, Victoria 481, 490 Holtgraves, Thomas 284, 302
Hasler-Barker, Maria 49, 71–73, 84, 221, Honkapohja, Alpo 545, 547
227, 308, 321, 323, 326, 329–330 Hoque, Mohammed E. 117
Hassall, Tim 60, 85 Horn, Laurence R. 156, 159–160, 165, 170–
Hasselgård, Hilde 480, 490 171, 180, 196, 209, 260, 277
Hasund, Kristine 92, 471–472, 479–481, 489, Horowitz, Alexandra C. 272, 277
494 Horton, William 302
Haugh, Michael 42, 58–59, 85, 87, 349, 364, Houck, Noel 237, 244, 251, 321, 326, 327,
620, 622–625, 628–630, 633–641 329
Have, Paul ten 336, 341 Houghton, George 522
Hayashi, Makoto 379, 386–387 House, Juliane 7, 29, 82, 150, 220–222, 227,
Heine, Bernd 398, 421, 557–558, 561, 564, 230, 238, 249, 410, 419, 500, 514–515,
572, 582–583 521, 614
Hoymann, Gertie 385 Jones, Donald S. 305–306, 321, 330
Huang, Yan 44, 86, 156–158, 161–162, Jones, James K. 521
165–166, 168, 170–176, 178, 180–182, Jones, Jeremy F. 46–47, 65, 86, 221, 228
220, 227–228 Jones, Rodney H. 74, 86, 96, 116
Huang, Yi Ting 260, 271, 272, 277 Jorgensen, Julia 193, 210
Huber, Magnus 543, 547 Jowett, Garth S. 439, 447
Huberman, A. Michael 40, 89 Jucker, Andreas H. 4–5, 7, 10–12, 14–17, 22–
Hübler, Axel 621–626, 628–629, 640 23, 27, 32–33, 38, 41–42, 45, 47, 49–52,
Hui, Diane 449 55, 73, 86–87, 92, 155, 182, 200, 210, 219,
Hundt, Marianne 455, 465, 470, 491–492, 228, 303, 339, 341, 432, 445, 455, 457,
496, 523, 548 465–466, 470–471, 477, 490–491, 494,
Hunt, Lamar 274, 277 496, 517, 522–524, 528–529, 533–534,
Hutchby, Ian 372, 388 540–542, 547–548, 550–551, 557, 569,
Huth, Thorsten 379, 381–383, 388, 393 577, 582–583, 587–588, 590, 593–595,
Hymes, Dell 349, 364, 508, 522 597, 610–613, 615–617, 622–624, 627,
631–633, 640–641
I Jurafsky, Daniel 496, 503–504, 516, 523–
Ide, Nancy 100, 116 524, 614
Ide, Sachiko 42, 92, 626, 642 Jussila, Katja 210
Ifantidou, Elly 202–203, 209–210
Innes, Bronwen 568, 582 K
Isaacs, Ellen 295, 302 Kaal, Anna 213
Isard, Amy 521 Kaal, Bertie 434, 446
Isard, Stephen 521 Kádár, Daniel Z. 42, 87, 622, 641
Kaislaniemi, Samuli 545, 547
J Kallen, Jeffrey L. 483, 491, 503, 523, 599–
Jackson, Don D. 153 600, 615–616
Jacobs, Andreas 35, 86 Kaltenbacher, Martin 450
Jacobs, Geert 10, 32 Kaltenböck, Gunther 398, 421, 558, 561,
Jaffe, Alexandra 99, 116 582–583
Jakobson, Roman 134, 151 Kaplan, David 144, 146, 151
Jakubíček, Miloš 634, 640–641 Kärkkäinen, Elise 9, 32
Jary, Mark 200, 210 Kasanga, Luanga A. 230, 251
Jauss, Hans Robert 540, 547 Kasher, Asa 171, 182
Jaworski, Adam 622, 629, 640 Kasper, Gabriele 4–5, 7, 23, 29–30, 32, 41,
Jebahi, Khaled 230, 251 46, 49, 58–59, 62–63, 70–71, 73, 82, 84,
Jefferson, Gail 7, 34, 102, 116, 119, 309, 330, 86–87, 123, 130, 150, 186, 207, 220–222,
369, 378, 388, 391–392, 502, 504, 524, 227–231, 238, 241–242, 244–245, 249,
557–558, 582, 585, 630, 640, 643 251, 305–306, 313, 320, 326–327, 330,
Jenks, Christopher J. 99, 116 380, 388, 500, 514–515, 521, 614
Jeuniaux, Patrick 117 Katsos, Napoleon 71–72, 192, 210, 260,
Jodlowiec, Maria 202, 210 263–266, 268–269, 271, 276–277, 279
Joe, Angela 364 Kay, Christian 532, 544, 548, 550, 553
Johansson, Stig 29, 419, 470, 480, 490, 580 Kay, Paul 572, 583
Johansson Falck, Marlene 289, 301, 303 Kearsy, Cormier 19, 32
Johnson-Laird, Philip N. 194, 210 Keck, Casey 521
Johnson, Robert E. 109, 116 Keevallik, Leelo 378, 389
Johnson, Sally 485, 490 Kehoe, Andrew 51–52, 88, 476–477, 491,
Johnston, Bill 65, 86, 229, 241, 251 493, 588–590, 608–609, 613, 616
Johnstone, Barbara 80, 86 Keisanen, Tiina 9, 32
Johnstone, Bruce 353, 362, 364 Kelly, John 377, 389
Kendon, Adam 19–20, 33 Krych, Meredyth 296, 301
Kendrick, Kobin H. 379, 385, 389 Krzyżanowski, Michał 444
Kent, Alexandra 245, 247 Kubanyiova, Magdalena 75, 87
Kenwood, Christine 99, 106, 113 Kucera, Henry 470, 490
Kerkam, Zainab 239, 250 Kuhn, Thomas S. 170, 182
Kern, Friederike 372, 389, 392 Kui Shen, Yuan 34, 549
Kerswill, Paul 488 Kukkonen, Karin 449
Keysar, Boaz 297, 301–302 Kuperberg, Gina R. 274, 278
Khosravinik, Majid 444 Kuteva, Tania 398, 421, 582
Kidner, Keely 347–348, 364 Kuusikko, Sanna 210
Kilgarriff, Adam 477, 491, 626, 640 Kwon, Jihyun 231, 251
Kim, Dan 40, 87 Kyle, Jim G. 18, 36
Kimps, Ditte 481, 483, 491 Kyratzis, Amy 97, 117
King, Brian 346, 364 Kytö, Merja 9, 15, 30, 33, 455, 465, 530,
Kingdon, Roger 103, 116 544, 546, 548, 553
Kingsley, Leilarna 354, 364
Kingsolver, Barbara 26, 28 L
Kipp, Michael 100, 109, 113, 116 Laakso, Minna 386
Kirk, John M. 482–483, 486, 491, 503, 523, Labov, William 24, 33, 38, 53, 57, 62, 79, 88,
599–602, 615–616 337, 341, 345, 365, 410, 416, 421
Kirkham, Natasha 298, 303 Ladusaw, William A. 97, 119
Kissine, Mikhail 201, 210 Lakoff, Robin 41–42, 88
Kitzinger, Celia 379, 389 Landert, Daniela 9, 15, 33, 45, 86, 534, 549
Kjellmer, Göran 468, 483, 491, 577, Langton, Rae 146, 151
583 Larrivee, Pierre 486, 491
Klassmann, Alex 109, 120 Lascarides, Alex 397, 401, 406, 419, 502,
Klauk, Tobias 22, 33 521
Klein, Natalie M. 260, 277 Lass, Roger 545, 549
Klerk, Vivian de 200, 210 Lazaraton, Anne 74–75, 78, 88
Knight, Dawn 476, 487, 562, 583 Lazzaro-Salazar, Mariana 339, 346, 351,
Koch, Peter 12–13, 33 357–358, 365
Koenig, Christopher J. 381, 389 Lee, Jih-Ye 315, 322, 330
Koester, Almut J. 590, 616 Leech, Geoffrey 4, 24, 29, 33, 41, 44–45, 49,
Kohnen, Thomas 534, 541, 548, 609–610, 71, 88, 99, 117, 128, 172, 182, 233, 251,
613, 616 325, 330, 419, 470–471, 474–476, 491–492,
Kolaiti, Patricia 191, 199–200, 210 497, 501, 507–508, 512, 516, 523, 578,
Koller, Veronika 427–428, 430, 448 580, 583
Komter, Martha L. 381, 389 Legrenzi, Maria Sonino 194, 210
Kono, Nariyo 74–75, 77, 79–80, 87 Legrenzi, Paolo 194, 210
Köppe, Tilmann 22, 33 Lehn, Dirk vom 98, 115
Kopytowska, Monika 448 Leinonen, Eeva 210
Koshik, Irene 378, 389 Lemke, Jay 426, 448
Kovář, Vojtěch 640–641 Lenk, Uta 402, 419, 557, 583
Kowal, Sabine 97, 101, 111, 117–119 Lerner, Gene H. 378–379, 389
Kowtko, Jacqueline C. 521 Leskovec, Jure 614
Koyama, Wataru 124, 130 Lester, Jessica N. 100, 118
Krauss, Robert 291, 302 Levelt, Willem J. M. 108, 117, 435
Krennmayr, Tina 213 Levenston, Edward A. 232, 251
Kress, Gunther 425, 427, 429, 447–448 Levinson, Stephen C. 8, 11, 21, 33, 42, 44,
Kreuz, Roger 101–102, 105, 114, 286–287, 83, 130, 141, 150, 156, 160, 162, 164–165,
303 168, 170–171, 173, 176, 179, 182, 196,
210, 233, 249, 260, 278, 349, 363, 385, Magyari, Lilla 385
397, 404–405, 407, 409, 411, 413, 415– Mahlberg, Michaela 48, 88
417, 419, 421, 435, 508, 523, 539, 549, Maier, Robert M. 407, 421
567, 572, 583 Maillat, Didier 430–431, 448
Lewins, Ann 100, 120, 350, 366 Mair, Christian 470, 492
Lewiński, Marcin 430, 448 Maíz-Arévalo, Carmen 243–244, 251
Lewis, David 146, 151 Maks, Isa 434, 446
Lewis, Diana 565, 583 Malancharuvil-Berkes, Elizabeth 449
Lewis, Gwyneth 117 Mallinson, Christine 96, 117
Liddell, Scott K. 109, 116 Malory, Beth 518, 521
Lidz, Jeffrey 258, 278 Mandelbaum, Jenny 245, 252, 389
Liebal, Kristin 262, 278 Manes, Joan 37, 41, 43, 46, 51, 55, 88
Lieberman Aiden, Erez 34, 549 Manificat, Sabine 212
Lin, Phoebe M. S. 483, 492 Mann, William C. 397, 421
Lincoln, Yvonna 350, 364 Mannion, Margaret 616
Lindsay, Jean 111, 117 Manrique, Elizabeth 385
Lindström, Anna 245, 251, 379, 389 Mansor, Fathia 239, 250
Linell, Per 397, 414, 421, 558, 562, 567, 583 Mapson, Rachel 19, 34
Litman, Diane 557, 561, 582 Marcu, Daniel 502, 521
Ljung, Magnus 480, 492 Marin Arrese, Juana 430, 448
Local, John 377–378, 386, 389–390, 576, Markee, Numa 55, 88, 381, 390
583 Markham, Annette 77, 79, 88–89
Locher, Miriam A. 4, 16, 22–23, 32–33, 346, Markman, Ellen M. 262, 278
349, 365 Markus, Manfred 575, 584
Lof, Gregory L. 97, 120 Márquez Reiter, Rosina 321, 325, 330
Long, Haiping 398, 421, 582 Marra, Meredith 86, 349–350, 352, 356,
Looks, Karin 107, 115 364–366
Lopez-Rodriguez, Clara 213 Martey, Nii 639
Lorenzo-Dus, Nuria 243–244, 249 Marti, Leyla 83, 231, 252
Loukusa, Soile 201, 210 Martin, Gillian S. 62, 89
Louw, Bill 625, 641 Martin, James R. 401, 421, 427–428, 448,
Louwerse, Max M. 106, 117 504, 523
Lower, Amanda 631, 641 Martin, Rachel 524
Lowry, Orla 616 Marttila, Ville 545, 547, 549
Lucy, Peter 193, 207, 488 Mascaro, Olivier 202, 211, 213
Luke, Allan 426–427, 448 Maschler, Yael 381–382
Lutzky, Ursula 51–52, 88, 511, 519, 523, Mattila, Marja-Leena 210
543, 549, 588–590, 608–609, 613, 616 Mautner, Gerlinde 448
Lwanga-Lumu, Joy-Christine 230, 251 Mazeland, Harrie 379, 386, 390
Lyse, Gunn Inger 484, 492 McCarthy, Michael 467, 492, 589, 596,
603–605, 616
M McComish, Judith Fry 110, 114
Macaro, Ernesto 305, 328 McConnell-Ginet, Sally 346, 363
Macaulay, Ronald K. S. 44, 88 McDonough, Steven H. 305–306, 330
MacGregor, Lucy J. 468, 488 McEnery, Tony 444, 467, 490, 492, 496, 500,
Mackenzie, Lachlan J. 398, 421 516, 523, 537, 542, 549
Mackey, Alison 305, 330 McHoul, Alexander W. 381, 390
MacMahon, Barbara 203, 210–211 McIntyre, Dan 470–471, 485, 492
Macrae, Andrea 204, 211 McKinley Brayboy, Bryan 347, 365
MacWhinney, Brian 98, 104–105, 108, 117, McNeill, David 107, 117, 288, 302
202, 211, 619, 641, 643 Mehan, Hugh 381, 390
Mei, Meilian 83 Müller, Nicole 101, 110, 117–118
Meier, Christoph 104, 120 Munn, Alan 278
Meißner, Cordula 100, 117 Murata, Kazuyo 354, 365
Mercier, Hugo 203, 211, 213 Murphy, Beth 232, 252
Meroni, Luisa 277 Murphy, Bróna 481, 485–486, 492, 590, 614
Merrison, Andrew John 24, 31, 338, 341 Musolff, Andreas 427, 430, 449
Mertz, Elizabeth 622–623, 641 Musolino, Julien 258, 262, 278
Mertzlufft, Christine 392 Myers, Greg 99, 117
Meteer, Marie 524
Meurman-Solin, Anneli 529, 549, 552 N
Mey, Jacob 396, 404–405, 409, 416, 421, Naeimi, Amin 231, 247
621, 641 Nakabachi, Keiichi 232, 252
Meyer, Charles 163–165, 182, 639 Neidle, Carol 100, 118
Meyer, Christian 393 Nelson, Gerald 163–165, 182, 483, 492
Meyer, Michael 425–426, 428, 430, 432, 448, Neu, Joyce 232, 252
450–451 Nevala, Minna 83
Miall, David S. 204, 211 Nevalainen, Terttu 550, 552, 627, 641
Michel, Jean-Baptiste 26, 34, 542, 549 Newton, Jonathan 364
Michelfeit, Jan 641 Ng, Patrick 91
Milà-Garcia, Alba 602, 616 Nicolle, Steve 192, 195, 211
Miles, Matthew B. 40, 89 Niedzielski, Nancy 622, 641
Millar, Don 474, 492 Nieuwland, Mante S. 274, 278
Miller, George 193, 210 Noda, Hiromi Pat 303
Miller, Karen 273, 278 Noora, Aazam 231, 250
Mills, Geoffrey E. 329 Norbury, Courtenay Frazier 201, 211
Mills, Sara 239, 250 Norrick, Neal 112, 130, 576, 584
Milroy, James 529, 549 Norris, Sigrid 99, 118
Min, Jennifer Quah Xiao 70, 90 Norton, Bonny 346, 365
Minai, Utako 277 Norvig, Peter 34, 523, 549
Mishler, Elliot G. 99, 117 Nöth, Winfried 125, 130
Mitchell, Candace J. 235, 241–242, 253, 307, Noveck, Ira 73, 89, 191–192, 196–197, 200,
323, 330 206, 209, 211, 220, 228, 258, 260–262,
Moeschler, Jacques 401, 411, 421 264, 274, 276, 278
Moilanen, Irma 210 Novick, David G. 96, 118
Moise, Jessica 192, 195, 209 Nowak, Martin A. 34, 549
Molinelli, Piera 557, 581 Nureddeen, Fatima Abdurahman 230, 252
Mondada, Lorenza 378–379, 381, 389–390 Nusbaum, Howard 288, 303
Montgomery, Martin 410, 422 Nüse, Ralf 289, 303
Moore, Roger 509, 522
Morek, Miriam 393 O
Morgan, Jerry 416, 419 O’Connell, Daniel 97, 101, 110–111, 117–
Morgan, Melanie 378, 391 119
Morini, Massimiliano 203, 211 O’Connor, Joseph D. 103, 118
Morris, Charles W. 125, 130, 133, 151 O’Donnell, Matthew B. 469, 484, 493, 599,
Morton, Tom 478, 494 617
Mosley, Melissa 449 O’Donnell, Victoria 439, 447
Muderack, Karoline 481, 488 O’Driscoll, Jim 626, 638
Mullany, Louise 349, 365 O’Garro, Glynis 449
Mulo Farenkia, Bernard 65, 89, 231, 250, 252 O’Grady, William 202, 212
Müller Gjesdahl, Anje 490 O’Halloran, Kieran 430, 449
Müller, Marcus 455, 465 O’Keeffe, Anne 467, 478, 492, 494, 528, 541,
546, 562, 584, 589–590, 599, 603–606, Peräkylä, Anssi 379, 391
614, 616, 623, 633, 638 Perlman, Marcus 289, 301, 303
O’Reilly, Karen 361, 365 Perrault, C. Raymond 503, 523
Oberzaucher, Frank 393 Perrott, Michael A. 106, 119
Obler, Loraine K. 110, 116 Perry, John 146, 151
Ochs (Keenan), Elinor 98, 102, 107–108, Peters, Jörg 393
118, 168–169, 182, 355, 365, 372, 374, Petukhova, Volha 109, 113
390, 392, 566, 584–585 Pfister, Jonas 201, 212
Oesterreicher, Wulf 12, 33 Phillips, Ben 188, 212
Ogden, Richard 380, 390 Phillips, Bruce 99, 106, 113
Ogiermann, Eva 46, 65–66, 89, 230, 234– Pichler, Heike 44, 89, 471–472, 481, 492
237, 245–246, 252, 255 Pickett, Joseph P. 34, 549
Oishi, Etsuko 411, 422 Pilkington, Adrian 203, 212
Okada, Yasuke 320, 330 Pinker, Steven 34, 549
Okrent, Arika 288, 303 Pinto, Derrin 231, 253
Okulska, Urszula 428, 445 Placencia, Maria Elena 54, 89, 231, 253, 480,
Okurowski, Mary E. 502, 521 493, 631, 641
Olshtain, Elite 232–233, 249, 252 Plester, Barbara 352, 365
Ono, Reiko 241, 253 Pochon-Berger, Evelyne 18–19, 31
Ono, Tsuyoshi 558, 584 Podesva, Robert J. 96, 119
Origgi, Gloria 213 Pohle, Stefanie 61, 89
Orwant, Jon 34, 549 Polanyi, Livia 557, 584
Östman, Jan-Ola 557, 564–565, 572–573, Politzer-Ahles, Stephen 277
581, 584 Politzer, Guy 278
Oswald, Steve 430–431, 448 Pollack, Martha E. 416, 419
Otcu, Bahar 85, 231, 252 Pomerantz, Anita 43, 90, 245, 253, 371, 379,
Ovens, Janine 642 391, 405, 422
Owen, Marion 47, 89, 238, 252 Popper, Karl 167, 172, 183
Owtram, Nicola 204, 207 Posada, Andres 260–261, 274, 278
Potter, Jonathan 23, 34, 245, 249, 336, 340–
P 341, 375, 378, 387, 391
Pahta, Päivi 456, 466, 535, 549, 552 Potter, Liz 199, 207
Paiva, Beatriz de 202, 212 Potts, Christopher 614
Pan, Ping Cathy 67, 89 Pouscoulous, Nausicaa 166, 179, 273, 278
Pandarova, Irina 481, 488
Panizza, Daniele 260, 278 Powell, George 189, 206
Paolino, Danae 114 Power, Richard 504, 524
Papafragou, Anna 197, 212, 262, 274, Powers, Willow R. 105, 119
278–279 Pragglejaz Group 199, 212
Papp, Szilvia 201, 212 Preston, Dennis R. 99, 119, 622, 641
Paquot, Magali 481–482, 492 Prillwitz, Siegmund 109, 119
Parrot, Dominic J. 106, 119 Pritchard, Ruth 382–383
Parsons, Talcott 367, 390 Psathas, George 102, 119
Partington, Alan 430, 449, 486, 492 Pullum, Geoffrey K. 97, 119
Pasma, Trijntje 213 Pustejovsky, James 100, 116
Paulus, Trena M. 100, 118 Pye, Clifton A. 112, 119
Paxton, Alexandra 298, 303
Pearson, Barbara Z. 263, 279 Q
Peikola, Matti 544, 546, 549 Quasthoff, Uta 104, 120, 393
Peirce, Charles S. 124–125, 131, 140, 146, Quinn, Gary 18, 34
151 Quirk, Randolph 78, 92, 103, 120
R Romoli, Jacopo 275

Rahilly, Joan 110, 113 Rooney, Anne 616
Rainey, Isobel 330 Roque, Lila San 385
Ramanathan, Vai 346, 360, 365 Rosaldo, Renato 345, 366
Ramm, Wiebke 398–399, 420 Rose, David 401, 421, 427–428, 448
Rampton, Ben 363 Rose, Kenneth R. 70, 90, 220, 228, 238, 241,
Randall, Janet Beth 614 253, 323, 330
Raymond, Chase Wesley 378, 383 Rose, Yvan 109, 119
Raymond, Geoffrey 378–379, 387, 393 Ross, Steven 86, 220, 228–229, 241, 251,
Rayson, Paul 497, 518, 521, 523, 534, 549 320, 327, 330
Reber, Elisabeth 9, 34 Rossi, Giovanni 245, 253, 385
Reboul, Anne 192, 201, 211–212, 278 Roth, Ruth-Maria 69–70, 90
Recanati, François 128, 131, 147–148, 151, Roulet, Eddy 401, 411, 422
163, 183 Roulston, Kathryn 63, 90
Reckling, Ina 263, 279 Roush, Daniel R. 19, 34
Reddy, Michael J. 288, 303 Rowen, Roslyn 634, 641
Redeker, Gisela 397, 420, 557, 584 Rubio-Fernández, Paula 191, 212
Reichman, Rachel 557, 584 Rüegg, Larssyn 24, 34, 53, 78, 90, 337, 341
Reigem, Øystein 493 Ruesch, Jurgen 621, 642
Reisigl, Martin 426–428, 432, 449 Rühlemann, Christoph 198, 212, 335, 342,
Reithinger, Norbert 521 455, 457, 464–465, 467, 469–470, 484,
Ren, Wei 44, 65, 90 493, 496, 520, 528, 550, 555, 577, 585,
Rendle-Short, Johanna 578, 584 588, 599, 602, 617, 623, 625, 633, 642
Renkwitz, Katrin 68, 90 Ruiter, Jan P. de 378, 385
Renouf, Antoinette 476–477, 493 Russel, Albert 109, 120
Reppen, Randi 472, 486, 536, 546, 563, 579, Russell, Stuart 523
587, 598, 613–614 Ruusuvuori, Johanna 379, 391
Richardson, Daniel 298, 301, 303 Rychlý, Pavel 640–641
Richardson, John 425–426, 429–431, 446 Ryder, Nuala 210
Richardson, Kay 363
Riddiford, Nicky 364 S
Ries, Klaus 524 Sabaté i Dalmau, Maria 231, 253
Rintell, Ellen M. 235, 241–242, 253, 307, Sacks, Harvey 7, 34, 43, 47, 85, 91, 102, 119,
323, 330 369, 371, 378–379, 391–392, 469, 493,
Rissanen, Matti 14, 34, 455, 465, 477, 493, 502, 504, 524, 558, 585, 634, 639
531, 550, 552, 612, 616 Sag, Ivan A. 484, 493
Rizzo, Thomas 350, 359, 366 Salmi, Hanna 546
Roberts, Celia 99, 119, 361, 366 Salmon, Vivian 22, 34
Roberts, Felicia 112, 119, 378, 381, 386, 391 Salway, Andrew 477, 493
Roberts, Richard 286–287, 303 Sandler, Wendy 109, 114
Robertson, Dan 268, 275 Sarangi, Srikant 361, 366, 397, 422
Robinson, Jeffrey 112, 119, 245, 253, 379, Sarbin, Theodore R. 305–306, 321, 330–331
381–382, 384, 389, 391 Sarno, Martha T. 110, 116
Roeper, Thomas 263, 279 Sasaki, Miyuki 229, 241–242, 253, 587, 617
Roever, Carsten 58, 70, 72, 90, 220, 228, Sassenroth, Denise 46, 89
324–325, 328 Sauerland, Uli 197, 212, 260, 279
Rogers, Rebecca 426, 449 Saul, Jennifer 149, 151
Roitsch, Julia Marisa 382–383 Saunders, Danny 306, 328
Romero-Trillo, Jesús 455, 465, 470, 473, Saussure, Ferdinand de 41, 90
493, 496, 524, 588, 617 Saussure, Louis de 430, 449
Romero, Catherine 111, 119 Sayette, Michael A. 106, 119
Sbisà, Marina 134–135, 141, 144, 146, 151– Seedhouse, Paul 64, 91, 381, 392
152, 411, 415–416, 422 Selting, Margaret 104, 120, 312, 331, 372,
Scarcella, Robin 309, 321, 327, 331 374–375, 377–378, 383, 389, 392
Schaefer, Edward 295, 301 Semino, Elena 470–471, 493
Schaeffner, Christina 445 Senft, Gunter 169, 183
Schaeken, Walter 260, 276 Shanmuganathan, Thilagavathi 345, 366
Schatz, Edward 345, 360, 363 Shardakova, Maria 231, 254
Schauer, Gila A. 44, 62, 83, 90, 233, 253, Sharma, Devyani 96, 119
307, 323, 331, 463, 465, 591, 593, 595– Shaw, Linda 350, 363
597, 605, 617 Shaw, Rebecca 379, 389
Schegloff, Emanuel 7–8, 34, 47, 54, 90–91, Shintel, Hadas 288, 303
102, 107, 119, 312, 325, 331, 336, 342, Short, Mick 42, 91, 221, 228, 470–471, 493
369–372, 375, 378–379, 390–392, 398, Shriberg, Elizabeth 524
422, 502, 504, 524, 555, 558, 585 Shriberg, Lawrence D. 97, 120
Schenkein, Jim 102, 119 Sickinger, Pawel 68, 90
Scheutz, Hannes 378, 392 Sidnell, Jack 378–379, 387, 393
Schiffrin, Deborah 6, 9, 24, 34, 96, 119, 414, Sidner, Candace L. 557, 582
422, 557, 559, 566, 585 Silberstein, Sandra 438, 449
Schlobinski, Peter 104, 120 Silver, Christina 100, 120, 350, 366
Schmid, Hans-Jörg 11, 34 Simmons-Mackie, Nina N. 108, 114
Schmied, Josef 455, 465 Simon-Vandenbergen, Anne-Marie 480, 487,
Schmitt, Cristina 278 561, 585
Schneider, Gerold 86–87, 542, 544, 547, 551, Simon, Herbert A. 123, 130, 186, 207
615, 627, 641 Simpson, Rita 632, 642
Schneider, Hans Julius 5, 34 Sinclair, John 6, 8, 35, 401, 410–411, 422,
Schneider, Klaus P. 42–43, 45, 47, 51–52, 63, 480, 484, 493, 502, 505, 513, 524
66–68, 72, 82, 91, 220–221, 227–228, 233, Siren, Kathleen A. 112, 119
236, 239, 246, 254, 324, 331, 417, 422, Skaffari, Janne 546
432, 445, 481, 493, 569, 580, 621, 642 Skalicky, Stephen 631, 633–635, 642
Schnurr, Stephanie 349–350, 366 Skordos, Dimitrios 197, 212, 262, 274, 279
Schölmberger, Ursula 231, 254 Skukauskaite, Audra 98, 120
Schourup, Lawrence 557, 585 Slavcheva, Adriana 100, 117
Schreier, Daniel 455, 465, 470, 491, 496, Slembrouck, Stef 427, 449
523, 548 Slobin, Dan I. 109, 116
Schröder, Anne 67, 91 Sloetjes, Han 109, 120
Schrott, Angela 540, 550 Slugoski, Ben R. 235, 254
Schubert, Christoph 10, 35 Smet, Hendrik de 542, 546
Schuetze-Coburn, Stephan 114 Smith, Carol L. 258, 279
Schulz, Peter 430, 449 Smith, Jeremy 544, 550
Schulz, Petra 279 Smith, Nafsika 71–72, 87, 263–265, 277
Schulze, Cornelia 263, 279 Smith, Nicholas 492
Schumacher, Petra 197, 212 Smith, Sandra 19, 32
Schütte, Wilfried 393 Smith, Sara 289, 303
Schwarz, Florian 275 Snedeker, Jesse 260, 271–272, 277
Sclaroff, Stan 100, 118 Snow, Catherine 108, 117
Scrafton, Susan 276 Solfjeld, Kare 399, 419
Searle, John R. 7, 35, 49, 91, 96, 120, 140– Sorace, Antonella 268, 275
141, 152, 282, 303, 403, 416, 422, 498, Sorjonen, Marja-Leena 378, 381, 386, 393
500–501, 503–504, 513, 524, 600–601, 617 Southgate, Victoria 262, 279
Sebba, Mark 377, 390 Sowinski, Bernhard 10, 35
Sedivy, Julie C. 279 Spector, Benjamin 166, 179
Speer, Susan A. 338, 342 587–588, 590–591, 593–595, 597, 610–
Spencer-Oatey, Helen 46, 72, 91–92, 221, 613, 615, 617, 622, 624, 627, 631, 641
228 Tagg, Caroline 16, 35
Sperber, Dan 6–7, 35, 49, 73, 89, 92, 131, Tagliamonte, Sali 472, 494
138–139, 152, 167–168, 171, 183, 187, Taguchi, Naoko 220, 228
189, 190–195, 197, 200, 202–203, 210– Takahashi, Tomoko 232, 236, 238, 248
215, 220, 228, 260, 279, 292–293, 303, Taleghani-Nikazm, Carmen 379, 381–382,
431, 516, 524 388, 393
Speyer, Augustin 398–399, 420, 422 Talmy, Leonard 155, 163, 166–167, 183, 420
Sroda, Mary Sue 201, 205 Tanaka, Noriko 65, 72, 92
Stalnaker, Robert 143–144, 146, 152 Tanenhaus, Michael K. 260, 277, 279
Stede, Manfred 502, 524 Tang, Chen-Hsin 231, 254
Steen, Gerard 199, 213, 299, 303 Tannen, Deborah 42, 53, 78, 88, 92, 103, 120,
Stein, Dieter 16, 31, 430, 447 450
Stelma, Juurd H. 112, 120 Tanskanen, Sanna-Kaisa 630–631, 642
Stemmer, Brigitte 227 Tantalou, Niki 262, 278
Stenström, Anna-Brita 9, 35, 78, 92, 455, Tao, Liang 386
464, 466, 471–472, 479–481, 489, 494, Tardy, Christine M. 48, 92
513, 524, 557, 585 Taylor, Charlotte 625, 627, 642
Steskal, Lubos 493 Taylor, Paul 524
Stiles, William B. 495, 499, 502–503, 505, Thibault, Paul J. 408, 423
514, 524, 599, 617 Thies, Alexandra 107, 115
Stivers, Tanya 379, 385 Thomas, Jenny 99, 117, 325, 331
Stocchetti, Matteo 449 Thompson, Sandra A. 372–373, 377–378,
Stoel-Gammon, Carol 109, 119 385–386, 390, 392, 397, 421, 558, 584–
Stolcke, Andreas 504–505, 524 585, 639
Stracke, Marén 262, 277 Thorne, Sally 359, 366
Strawson, Peter F. 135, 137, 142, 145, 152 Tian, Ye 191, 212
Streeck, Jürgen 20, 35, 378, 393 Tilley, Susan A. 99, 112, 120
Stubbs, Michael 199, 213, 410, 422, 430, 449 Tissari, Heli 627, 641
Stukenbrock, Anja 393 Titscher, Stefan 450
Stutterheim, Christiane von 289, 303 Toerien, Merran 379, 389
Suchomel, Vít 640–641 Tognini-Bonelli, Elena 472, 494
Sudhof, Moritz 614 Tolhurst, Gerda 347, 362
Suhr, Carla 550 Tomasello, Michael 262–263, 275, 277–279
Suhr, Stephanie 485, 490 Tooby, John 191, 195, 208
Surian, Luca 201, 213 Torgersen, Eivind 471, 473, 484, 488, 494,
Suszczyńska, Małgorzata 230, 254 570, 585
Suter, Hans-Jürg 10, 35 Torreira, Francisco 385
Sutton-Spence, Rachel 18, 35 Tottie, Gunnel 468, 470, 480–481, 494,
Svartvik, Jan 78, 92, 103, 120, 482, 494 576–577, 585
Svennevig, Jan 58, 92, 381–382, 393 Tran, Giao Q. 316, 321, 331
Swales, John 48, 92, 642 Traugott, Elizabeth Closs 5, 35, 167, 183,
Swerts, Marc 468, 494 407, 423, 533–534, 538, 551
Szczepek Reed, Beatrice 377–378, 383, 393 Travis, Charles 125–126, 128, 131, 133, 147,
152–153
T Trew, Tony 447
Taavitsainen, Irma 5, 12, 14, 32, 41–42, 45, Trippel, Thorsten 107, 115
47, 50, 86–87, 92, 455–457, 465–466, Trommer, Ann-Kathrin 484, 494
470, 477, 491, 494, 496, 517, 522, 524, Trosborg, Anna 7, 35, 41, 92, 221, 228, 307,
528–529, 533–535, 539–551, 569, 582, 321, 331, 415, 423
Trudgill, Peter 80, 92 Walker, Marilyn A. 521

Tsui, Amy 410, 423 Walker, Marsha 614
Tuominen, Jukka 470, 494, 496, 524, 551 Walker, Terry 489, 544, 548, 553
Turnbull, William 235, 243–244, 254 Wallis, Sean 483, 492
Tyrkkö, Jukka 477, 493, 542, 546–547, 549 Walsh, Steve 478, 494, 633, 642
Walters, Joel 307, 327, 331
U Walton, Lisa 96, 118
Uhmann, Susanne 104, 120, 379, 383, 393 Ward, Gregory 180
Uliss-Weltz, Robin 232, 236, 238, 248 Ward, Karen 96, 118
Unger, Christoph 203, 214 Warga, Muriel 231, 254
Upton, Thomas A. 478, 488 Warren, Martin 78, 83, 92
Urzua, Alfredo 614 Wason, Peter C. 193, 214
Watts, Richard J. 11, 35, 42, 63, 71, 92, 200,
V 214, 247, 254, 349, 365, 626, 642
Valkonen, Petteri 534, 551 Watzlawick, Paul 137, 153
Van der Henst, John-Baptist 293, 303 Wearing, Catherine 191, 201, 203, 206, 212,
Van Dijk, Teun 411, 416, 423, 425, 428, 432, 214
450 Webb, Helena 98, 115
Van Eemeren, Frans 428, 450 Weinbach, Liora 232, 252
Van Ess-Dykema, Carol 524 Weiss, Gilbert 426, 450
Van Henk, Gerard 117 Weisser, Martin 496, 504, 507–508, 519, 523,
Van Leeuwen, Theo 366, 428, 448, 450 525, 599–600, 613, 617
van Miltenburg, Emiel 213 Weiss, Gilbert 426, 450
Van Mulken, Margot 230, 254 Wells, Bill 377, 389–390
Van Rillaer, Gert 597, 615 Wenger, Etienne 346, 366
van Tiel, Bob 179, 197, 213 Werlich, Egon 10, 35
Vanderveken, Daniel 141, 152 Wertz, Joan M. 106, 119
Varghese, Manka 229, 232, 249, 310, 328, Wharton, Tim 20, 35, 202–203, 214
587, 614 White, Peter 448
Varila, Mari-Liisa 546 Whitt, Richard Jason 534, 551
Veenstra, Alma 268, 271, 279 Wichman, Anne 82, 336, 341, 377, 383, 561,
Vega Moreno, Rosa 203, 214 585
Ventola, Eija 402, 419, 450 Widdowson, Henry 395, 399–400, 410, 414,
Veres, Adrian 34, 549 423, 427, 430, 450–451
Verkerk, Suzanne 260, 276 Widodo, Handoyo P. 96, 120
Verschueren, Jef 431–432, 450, 514, 525, Wierzbicka, Anna 44, 92, 513, 525, 625, 643
529, 551, 590, 617, 621, 642 Wilcox, Kim A. 112, 119
Vetter, Eva 450 Wilkes-Gibbs, Deanna 294, 301
Viehweger, Dieter 10, 31 Williams, John 260, 276
Vine, Bernadette 86, 352, 356, 364 Wilson, Andrew 497, 500, 523
Virtanen, Tuija 16, 31 Wilson, Deirdre 6–7, 35, 49, 92, 131,
Vogel, Friedemann 455, 465 138–139, 152, 167–168, 171, 183, 187,
Vogel, Irmgard 78, 84 189–191, 193, 199–200, 203–204, 206,
Völker, Harald 540, 550 210, 213–215, 220, 228, 260, 279, 292,
Volo, Lorraine de 345, 360, 363 303, 516, 524
Wilson, Elspeth 263, 278
W Wilson, Gwyneth 358, 366
Wachtel, Gwyn F. 262, 278 Winski, Richard 509, 522
Wagner, Johannes 374, 380–381, 385, 393 Wittenburg, Peter 109, 120
Wałaszewska, Ewa 202, 214 Wittgenstein, Ludwig 134, 153, 634, 643
Walker, Brian 470–471, 485, 492 Włodarczyk, Matylda 533, 551
Wodak, Ruth 48, 93, 425–430, 432, 444, 446, Yao, Xinyue 480, 488
449–451 Yoon, Erica J. 273, 279
Wolfram, Walt 80, 93 Yovel, Jonathan 622–623, 641
Wolfson, Nessa 37, 41, 43, 46, 51, 55, 88, Yuan, Yi 38, 65, 93, 241, 255, 380, 394, 597,
324, 331, 631, 643 618
Woll, Bencie 18, 35–36 Yufu, Mamiko 380, 384
Wong, Jean 382, 394 Yus, Francisco 186, 189, 203, 215, 430,
Wong, Tze-Peng 201, 208 451
Woodfield, Helen 43, 64, 93, 220, 227, 231,
254, 568, 580 Z
Woodman, Gill 608, 618 Zagar, Igor 451
Woods, David K. 100, 120 Zevakhina, Natalia 213
Wooffitt, Robin 372, 388 Zeyrek, Deniz 85, 231, 252
Wootton, Anthony 245, 254 Zhang, Grace Qiao 231, 254
Wouk, Fay 230, 255, 386 Zhu, Xiaoshu 73, 93
Wu, Jie 117 Zhu, Yunxia 344, 366
Wu, Yunan Charles 273, 279 Zienert, Heiko 109, 119
Wynn, Karen 263, 279 Zinken, Jörg 245, 255, 430, 451
Zipperlen, George 631, 638
Y Zirnstein, Megan 117
Yang, Dafu 45, 83 Ziv, Yael 557, 583
Yang, Shu-Ju 275 Zwets, Martine 19, 32
Subject index

A B
activity type 21–22, 397, 409, 417, 508–509, behavior
564–565, 567, 574 nonverbal ~ 19–20, 72, 97–98, 298, 301
address form 347, 555–557, 578, 603 see also communication
aetiolation 126, 134
analysis C
Conversation ~ (CA) see method Child Language Data Exchange System
Critical Discourse ~ (CDA) see method (CHILDES) 104–105, 109
Discourse ~ see method coding 96–97, 100, 102, 104–108, 222, 226,
horizontal ~ 619, 623–624, 632–633 230, 355, 435, 500, 502, 509–510, 513–
qualitative ~ 40, 46, 54, 64, 73, 99, 198, 514, 531, 562, 597, 603, 612, 632, 635
244, 247, 325–326, 335, 340–341, facial action ~ system 106, 109
369, 395–398, 400, 402, 418, 429, coherence 10, 340, 395–397, 402–410, 412–
443, 455, 464, 527, 531, 535, 542, 418, 474, 564–565, 601
590, 602, 606, 612, 623, 632–633, cohesion 10, 357, 402–403, 408–410, 413–
636 414
quantitative ~ 40, 46, 54, 57, 198, 244, collocation 430, 462, 473, 481, 484–485,
246, 335, 340, 353, 367, 395, 397, 504, 527, 535, 537, 542, 555, 562–563,
400, 402, 418, 429, 436–437, 443, 573, 579, 587–588, 608–609, 613, 620,
455, 458–460, 468, 484, 500, 516, 622, 625–627, 634–635
527, 531, 561, 606, 623, 628, 632– common ground 99, 126–127, 133, 143–145,
633, 636 149–150, 225, 262, 291–292, 294, 296–
reliability of ~ see reliability 298, 401, 403, 471, 539, 566
sequential ~ 320, 324–325, 371–372, 374, discourse ~ 395, 403–404, 411, 416
380, 635 communication
speech act ~ see method computer-mediated ~ (CMC) 16–18, 430,
vertical ~ 619, 623–624, 628, 632–633 475–476
annotation face-to-face ~ see conversation
corpus ~ 47, 457, 461–462, 467, 482, nonverbal ~ 27, 48, 203, 219, 263 see also
495–497, 502–503, 509, 514–515, communication
519–520, 531 oral ~ 67, 76, 221
pragmatic ~ 47, 73, 458, 461, 483–484, spoken ~ see language
495–496, 498, 500, 512, 514, 520, 543, written ~ see language
599–600, 602, 612 communicative
approach ~ act/action/activity 17, 106, 306–307,
corpus-based versus corpus-driven ~ 461, 324, 397, 405–407, 409–410, 416, 432,
467, 472–473, 485 578
form-based ~ 51, 461, 467, 471, 555, ~ distance 12–13
587 ~ function 10, 46, 317, 555
function-based ~ 431, 461, 467, 471 ~ immediacy 12–14
keyness ~ 473, 484–485, 517 ~ intention 127, 142–143, 291–292, 296,
see also method 396, 403–404, 409, 415–416
see also model ~ situation 4, 8, 13, 540
Community of Practice 346, 349

https://doi.org/10.1515/9783110424928-027
computer-mediated communication (CMC) meta~ 468, 475, 481, 528, 530, 539, 543,
see communication 591–593, 605, 607, 613
Construction Grammar see grammar natural/naturally occurring ~ 6, 23, 25,
conversation 50, 78, 219, 223, 229–230, 242–245,
~ Analysis (CA) see method 315–316, 324, 328, 336–340, 375, 379,
everyday ~ see discourse 591, 601
face-to-face ~ 6, 13, 52, 54–55, 95–96, observational ~ 6, 20, 50, 80, 200–201,
99, 105–106, 226, 244, 307, 309, 315, 335
321–322, 325, 327, 346, 368, 375, 503, ~ reliability see reliability
509, 530, 597 representativeness of ~ see representativeness
Cooperative Principle (CP) 7, 49, 126, 128,
138–139, 169, 173, 516 rich ~ 541, 543, 577
corpus spoken ~ see language
~ annotation see annotation written ~ see language
~ compilation 12, 124, 155, 456, 468, 470, deictic/deixis 3, 6, 8, 11, 13, 107, 124, 403,
480, 527, 529, 544–545 431, 434–437, 440, 442–443, 532, 592
~ construction 52, 456, 461, 467–471, demographic/demography 46, 50, 56, 59, 61,
474–482, 484–486 79, 234, 467–469, 478–479, 481, 567, 570,
historical ~ 456, 462, 470, 527–534, 597
536, 539, 541–543, 545, 577, 594, digital
610–611 ~ data see data
~ linguistics see linguistics ~ genre see genre
monitor ~ 476, 545 ~ humanities 527, 530, 544
parallel ~ 397, 456, 461, 479–480 discourse
purpose-built ~ 468, 541, 630 ~ Analysis see method
~ pragmatics see pragmatics ~ common ground see common ground
representativeness of ~ see representativeness ~ Completion Task (DCT) see task
everyday ~ 14, 22–23, 37–38, 45, 51,
spoken ~ 478, 481–482, 555, 557, 562, 53–55, 170, 307, 343, 352, 368–369,
565–567, 579, 600, 636–637 381, 633
~ tools 455, 462, 516, 599, 603 ~ genre see genre
written ~ 467, 478, 603 institutional ~ 51, 55, 67, 380–381, 397,
corpus-based versus corpus-driven approach 417–418, 597
see approach ~ marker 3, 6, 8–9, 11, 44, 46–47, 51, 53,
critical 56, 59, 61, 96, 98, 198, 378, 458, 462,
~ linguistics see linguistics 471, 473, 483, 485, 503, 507, 511, 555–
~ metaphor analysis 427, 429–430, 443 570, 572–576, 578, 599, 601, 629
cue-based model see model natural ~ see natural/naturally occurring
data
D ~ relation 396–397, 399, 406–408, 564
data ~ space 434–437, 442–444
big ~ 3, 50, 457, 541–542, 593 spoken ~ see language
contextualized ~ 3, 25, 27, 309 workplace ~ 349, 354
cross-linguistic ~ 155–156, 173, 178, 230, written ~ see language
560 discursive glue 395–397, 400, 402–404, 406,
decontextualized ~ 3, 26–27, 164 412
digital ~ 6, 16–18, 528
elicited ~ 6, 25, 245, 324, 514, 587, 593, E
597 emic 55, 170, 339, 343–344, 347, 349–350,
experimental ~ see method 352, 379, 381
fictional ~ 22–23 see also etic
ethics/ethical issues/research 24, 37–38, 40, implicature 7, 69, 71, 128, 137, 139, 148–
46, 53, 57, 74–81, 169, 226, 243, 293, 149, 156–163, 165–167, 171, 176–178,
305–306, 325, 327, 338, 353, 361–362 188–190, 196–197, 213, 227, 257–266,
ethnographic/ethnography see method 270, 272–275, 292, 406, 440, 442, 486,
ethnomethodological/ethnomethodology see 514, 516
method quantity ~ 258, 260, 262–264
etic 55, 170, 339, 343–344, 349–350, 352 scalar ~ 71, 196–197, 223, 257–261, 263,
see also emic 270, 272
everyday conversation/discourse/interaction inference 124–127, 129, 138, 140–144, 149,
see discourse 168, 186, 188, 195–197, 199, 202, 223,
258–263, 284, 406, 440, 474, 489, 500,
F 535
falsifiability/falsifiable 128, 155–156, 167, informative/informativeness 156, 160–161,
178, 300 168, 177, 258–264, 266, 269–271, 274
felicity condition 5, 137, 144, 225, 282–284, over~ 188, 223, 266–268
403, 407, 415–416 under~ 168, 188, 261–270, 272–274
field notes 19, 38, 55, 65, 67, 316, 321, 324, institutional discourse see discourse
328, 335, 338–339, 347–348, 350–352, intention see communicative intention
476 interaction
form-based approach see approach everyday ~ see discourse
Functional Grammar see grammar face-to-face ~ see conversation
function-based approach see approach interjection 9, 458, 462, 469, 473, 483, 555–
557, 571–572, 575–576, 578
G interview 13, 21, 23–25, 37, 42, 49, 55, 58,
genre 10, 12, 42, 48, 51–52, 54, 62–63, 78, 62–64, 68, 75, 226, 315–316, 319–320,
124, 141, 148, 203, 288, 395, 397–398, 336, 353–354, 358, 375, 406, 414, 417,
401, 403–409, 411, 417–418, 430, 472, 480
456–457, 461–463, 474–477, 485, 539, introspection/introspective see method
542–543, 545, 570, 597–598, 607–611 intuition/intuitive see method
digital ~ 51, 77, 79 irony 105, 193, 203, 224, 227, 281, 287–288,
gestural/gesture 6, 9, 11, 18–20, 72, 79, 95, 539–540, 625, 627
97–100, 102–103, 105–108, 110, 225, 244,
288, 296–297, 482, 556, 562, 575 K
grammar keyness approach see approach
Construction ~ 462, 571–573
Functional ~ 398–400, 403, 406, 409, 429 L
Thetical ~ 462, 571–572 language
granularity 395–404, 408, 410, 418, 628, 634 ~ change 44–45, 476, 533–534, 540, 545
figurative ~ 189, 225, 281, 286–287
H sign ~ 3, 6, 11, 18–20, 27, 95, 109
hesitation marker 9, 309, 462, 555–558, 565, spoken ~ 3–4, 6, 10–12, 14, 16, 18, 24, 27,
576–577 37–38, 48–50, 52–55, 61–63, 65, 67,
72, 77–78, 80, 109, 221, 223, 226–227,
I 241–242, 247, 297, 341, 398, 456,
illocution/illocutionary 47, 53, 61, 96, 126, 461, 468–469, 475, 478, 482, 497, 509,
134–136, 140–146, 148–149, 231, 233, 557–559, 567, 571–572, 578–579, 597,
284, 325, 395, 397–398, 401, 404, 406, 600, 602, 607
410–416, 480, 500–502, 512, 515, 589, written ~ 3–4, 6, 10–12, 14, 16–17, 27,
600 48–50, 52–53, 63, 65, 67, 72, 80, 109,
~ force indicating device (IFID) 51, 411, 244, 307, 335, 341, 398, 427, 456, 475,
463, 587, 589–591, 598, 607–609, 613 497, 500, 532, 597
legitimization-proximization model see model experimentational ~ 39, 49–50, 57, 59–60,

linguistics 68, 73, 75, 80–81, 219, 221–222, 224,
corpus ~ 12, 429, 467, 475, 478, 481, 227 see also pragmatics
484, 486, 496, 510, 520, 528–531, Functional Grammar see grammar
533, 536–538, 569–570, 579, 587–588, Gricean pragmatic ~ see pragmatics
612 introspection/introspective ~ 5–6, 49,
text ~ 10, 48, 395, 398, 400, 479 123–125, 128–129, 155–156, 163–167,
locution/locutionary 126, 134–135, 411, 414, 178, 185–188, 191, 193, 198–199
600 intuition/intuitive ~ 5, 7, 25, 49, 123,
129, 163, 165, 185–192, 195–196,
M 198–200, 204, 219–220, 257, 300,
meaning 379, 473, 537
literal/nonliteral ~ 77, 142, 147, 219, 262, Neo-Gricean pragmatic ~ see pragmatics
272, 290–291, 457, 478, 498, 502–503, observational ~ 124, 130, 185, 198, 418
601 see also observation
natural/nonnatural ~ 138, 156, 164 qualitative ~ see analysis
truth-conditional ~ 137–138 quantitative ~ see analysis
metacommunication/metacommunicative Relevance Theory 7, 49, 69, 123–124,
400, 402, 412, 419, 421–428, 432–433, 127, 129, 139, 142, 168, 171, 185–
437, 463, 477, 587, 590, 611–613, 621– 187, 189–195, 198, 201–204, 220,
622 292, 516
method role
Conversation Analysis (CA) 8, 37, 43–44, ~ enactment 7, 23, 25, 58–59, 225–226,
47–48, 54–55, 102, 219, 244–245, 247, 305–307, 315, 317, 321–322
325–327, 336, 339–340, 367, 369–372, ~ play 7, 23, 25, 27, 37, 41, 47, 58–62,
374–375, 377–382, 396, 398, 401, 407, 64–65, 67–68, 71–72, 220–222, 225–
478, 502, 558, 568, 572, 619, 630, 635 227, 239, 241–243, 305–313, 315–328,
Construction Grammar see grammar 337–338, 587, 589
corpus linguistic ~ see linguistics see also role play scenario
Critical Discourse Analysis (CDA) 48, speech act analysis 37, 41, 49, 51, 56, 61,
340, 425–433, 435, 441–444 597
Discourse Analysis 37, 47–48, 51, 60, 74, text linguistic ~ see linguistics
96, 141–142, 146, 326, 340, 395–397, Thetical Grammar see grammar
399, 401–403, 410, 427, 502, 568, variational ~ see pragmatics
572 see also approach
discourse completion ~ (DCT) see task see also model
ethnographic ~/ethnography 19, 25, 37– see also task
38, 41, 46, 51, 55–56, 63, 65, 123, 324, methodological
339, 343–351, 353–355, 357–362, 400, ~ comparison 240, 242–243
568, 574, 623 ~ expansionism versus ~ reductionism 171–173, 178
ethnomethodological ~/ethnomethodology 7, 43, 48, 54, 124, 339, 367, model
369–370, 372, 379–380, 398, 400–401, cue-based ~ 503–504
407, 533 legitimization-proximization ~ 340, 425,
experimental ~ 5–7, 123, 129–130, 155, 429–431, 433–435, 437–438, 441–443
166, 185–186, 191–193, 195, 197– social actor ~ 428–429
199, 202, 204, 220, 225, 227, 235, see also approach
245, 257, 263, 281–283, 285, 290, see also method
292, 294, 297–301, 305, 308, 325, multimodal/multimodality 18, 48, 109, 203,
335–336, 345 224, 281–282, 297, 299, 301, 347, 426,
see also pragmatics 482, 530, 541, 544, 562
N interlanguage ~ 27, 41, 43, 49, 64, 67, 71,

negotiability 529, 539 73, 76, 80, 220, 222, 225, 227, 229–232,
nonverbal 235, 240, 246, 305–306, 321
~ behavior see behavior meta~ 619–622, 624–625, 627–630, 632–
~ communication see communication 633, 635, 637
Neo-Gricean ~ 128, 155–156, 159, 163–
O 164, 166–167, 178, 185
observation observational ~ 221, 335
non-participant ~ 345, 350–352 variational ~ 43, 47, 53, 56, 61, 220, 225,
participant ~ 23–25, 55, 324, 345, 351– 231, 239–240, 481, 484, 569
352, 354 presupposition 124, 126–127, 133–134, 143,
observer’s paradox 24, 38, 53, 55, 61, 77, 145–146, 148–149, 258, 272, 401, 405,
337 486
oral communication see communication prosodic/prosody 54, 59, 61, 78, 97–98,
organization 101–104, 107–108, 110–111, 123, 221,
sequential ~ 7, 239, 245, 309, 313, 320, 225–226, 244, 246, 288–289, 298, 309,
396, 402, 404, 407, 415, 417 312–313, 324, 327, 372, 375, 377–378,
overinformative/overinformativeness see 430, 481, 483, 503–504, 507, 529, 555–
informative/informativeness 556, 560–562, 573–576, 579, 600, 608
see also semantic prosody
P
performative 51–52, 135–136, 141, 284–285, Q
413–414, 534, 536, 590, 611 qualitative analysis/method see analysis
perlocution/perlocutionary 126, 134–135, quantitative analysis/method see analysis
148, 412–415, 533, 600
politeness/impoliteness 11, 19, 37, 41–42, R
44, 49, 63, 69–71, 138, 141, 170, 172– Relevance Theory see method
173, 229, 232, 236–237, 240, 247, 305, reliability of data/analysis 38, 70, 97, 101,
312–313, 315, 339, 349, 354, 417, 515, 106–107, 112, 191, 226, 235, 305–306,
517–519, 534, 564–565, 569–570, 574, 323–324, 326, 458–461, 500, 530–531,
599–600, 623, 625–627 594
pragmatic repair 63, 67, 309, 320, 322, 324–325, 378,
~ annotation see annotation 380, 504, 556, 565, 629, 632, 636
~ noise 7–9 representative/representativeness of corpus/
~ Tolerance Hypothesis 224, 263, 265– data 40, 45, 164, 246, 298, 324, 326, 456,
266, 268 474, 478, 485, 527, 532, 609–610
pragmatics role
corpus ~ 198, 455, 462, 467–473, 482– ~ enactment see method
486, 495–496, 500, 520, 527–529, 533, ~ play see method
536, 539, 541–542, 555–556, 587–589, ~ play scenario 59–62, 64–65, 67–69,
598, 600, 605, 612, 621–623, 633, 637 71–72, 222, 225–226, 229, 232–240,
cross-cultural ~ 37, 43, 79, 225, 229, 231, 242–244, 246, 270, 306–307, 310, 323–
244–245, 314 325, 595–597, 605
developmental ~ 76, 129, 185–186, 201–
202, 307 S
experimental ~ 73, 123, 129, 166, 185– scenario see role play
186, 192, 219–220, 224, 227, 275 see second language acquisition (SLA) 200, 202,
also method 305, 326, 381
experimentational ~ 139, 219, 221 see also semantic
method ~ field 136, 517–519, 532, 625–626
Gricean ~ 37, 44, 73, 156, 227, 258 ~ prosody 480–481, 537, 625–626
sequential organization see organization transcription 3, 49, 53, 55, 62, 65, 74,
sign language see language 95–112, 335, 338–340, 380, 464, 472, 482–
significance test see test 484, 486, 544, 560, 600, 608, 635, 637
social actor model see model ~ conventions/notations 309, 311, 356,
speech act analysis see method 375–376, 461, 467, 619, 630, 643
spoken ~ systems 95, 97–112, 309, 355, 376–377,
~ communication/data/discourse/language 636
see language turn-taking 8, 18, 45–46, 54, 56, 59, 62–63,
~ corpus see corpus 109, 244, 309, 320, 322, 324, 326, 377–
378, 380, 556, 558, 564, 576–577
T
tag/tagging (annotational) 457–458, 462–463, U
491, 497, 499, 501–504, 507, 509, 511, underinformative/underinformativeness see
515–520, 536, 542–543, 598–599, 606 informative/informativeness
task
comprehension ~ 50, 68–69, 220, 223, V
227, 257–258, 261 validity of data/analysis 39–40, 59–60, 67,
discourse completion ~ (DCT) 6–7, 23, 71, 101, 106, 109–110, 226, 305–308,
25, 27, 37, 39, 41, 45–47, 58, 62, 64–72, 310, 315–316, 321, 323–325, 328, 344,
220–223, 225, 227, 229–247, 307, 324, 359–360, 417, 633
463, 500, 514, 587, 589, 591, 593, variables
595–597, 605 contextual ~ 50, 71, 226, 233, 509, 565,
judgment ~ 220, 223–224, 257–258, 261, 593, 602, 605, 607
263, 270–272, 274 situational ~ 65, 219
multiple choice ~ 37, 46, 66, 69, 71–72, social/sociolinguistic ~ 222, 229–230,
222, 241, 323 233–235, 325, 380, 543, 570, 607,
psycholinguistic production ~ 220, 281, 632
290, 300 (socio)pragmatic ~ 462, 511, 529, 569
selection ~ 69, 71, 193–195, 263, 272 variability 4, 431, 472, 481, 529, 539, 566,
teasing 357, 620, 636 568–569, 588, 637
test variational pragmatics see pragmatics
reliability ~ 459–460 see also reliability of
data/analysis W
significance ~ 461, 475, 536, 538, 609 workplace discourse see discourse
text written
~ linguistics see linguistics ~ communication/data/discourse/language
~ type see genre see language
written ~ see language ~ corpus see corpus
Thetical Grammar see grammar

100% (1)
The Structure of Discourse-Pragmatic Variation
299 pages
Linguistic Theory: The Discourse of Fundamental Works BY Robert de Beaugrande
100% (6)
Linguistic Theory: The Discourse of Fundamental Works BY Robert de Beaugrande
396 pages
Essay On My Hero
100% (2)
Essay On My Hero
3 pages
Communicative Functions and Linguistic Forms in Speech Interaction
100% (3)
Communicative Functions and Linguistic Forms in Speech Interaction
320 pages
Mind Style As An Interdisciplinary Approach To PDF
No ratings yet
Mind Style As An Interdisciplinary Approach To PDF
18 pages
Mini-Thesis Template
No ratings yet
Mini-Thesis Template
14 pages
Some Practices For Referring To Persons in Talk-in-Interaction
No ratings yet
Some Practices For Referring To Persons in Talk-in-Interaction
51 pages
Swales Notas PDF
No ratings yet
Swales Notas PDF
36 pages
3.the graph provides some interesting data regarding... 该图为我们提供了有关... 有趣数据。 depicts
No ratings yet
3.the graph provides some interesting data regarding... 该图为我们提供了有关... 有趣数据。 depicts
20 pages
Sociolinguistics and Language Education
No ratings yet
Sociolinguistics and Language Education
2 pages
Language and Society. Halliday
No ratings yet
Language and Society. Halliday
319 pages
Basics of Linguistic Research Methodology
86% (7)
Basics of Linguistic Research Methodology
11 pages
Lexico Grammar
No ratings yet
Lexico Grammar
4 pages
On Chomsky's Appraisal of Skinner's Verbal Behavior
No ratings yet
On Chomsky's Appraisal of Skinner's Verbal Behavior
16 pages
ME 111 Thermodynamics 1
No ratings yet
ME 111 Thermodynamics 1
8 pages
Anatomy of A Dictionary Entry
No ratings yet
Anatomy of A Dictionary Entry
7 pages
2024 Advanced Diploma Pro-Forma Invoice Full Time
No ratings yet
2024 Advanced Diploma Pro-Forma Invoice Full Time
1 page
Interview With Gunther Kress: by Fredrik Lindstrand
No ratings yet
Interview With Gunther Kress: by Fredrik Lindstrand
7 pages
08 Pragmatics
No ratings yet
08 Pragmatics
3 pages
Style and Stylistics
100% (1)
Style and Stylistics
132 pages
Oracle Export and Import Utility
No ratings yet
Oracle Export and Import Utility
11 pages
Cambridge Language As Hope
100% (1)
Cambridge Language As Hope
202 pages
Simple Future Tense: Presented by Henny Septia Utami, M.PD
100% (2)
Simple Future Tense: Presented by Henny Septia Utami, M.PD
10 pages
Operation Guide 3294: About This Manual
No ratings yet
Operation Guide 3294: About This Manual
3 pages
Practice Problems: Paul Dawkins
No ratings yet
Practice Problems: Paul Dawkins
75 pages
Bardovi Harlig, K. (2019) .
No ratings yet
Bardovi Harlig, K. (2019) .
7 pages
A Data-Driven Online Prediction Model For Battery
No ratings yet
A Data-Driven Online Prediction Model For Battery
17 pages
Luna Filipovic Talking About Motion
100% (2)
Luna Filipovic Talking About Motion
197 pages
DPKG Command Cheat Sheet For Debian Linux
No ratings yet
DPKG Command Cheat Sheet For Debian Linux
2 pages
HBRI Brochure
0% (1)
HBRI Brochure
8 pages
Psychology Research Proposal Template
No ratings yet
Psychology Research Proposal Template
8 pages
BS en 50164-6-2009
No ratings yet
BS en 50164-6-2009
18 pages
PNB STMT (Kavit)
No ratings yet
PNB STMT (Kavit)
6 pages
Sneha Sarkar, 127, B, Beta and Gamma Function
No ratings yet
Sneha Sarkar, 127, B, Beta and Gamma Function
12 pages
Power Press
100% (1)
Power Press
7 pages
Pages From 2512
No ratings yet
Pages From 2512
3 pages
Overcurrent Protection Device Basis
No ratings yet
Overcurrent Protection Device Basis
10 pages
Pragmatics
No ratings yet
Pragmatics
9 pages
Electronics AND Communication Engineers: Indian Society OF
No ratings yet
Electronics AND Communication Engineers: Indian Society OF
2 pages
Webview
No ratings yet
Webview
3 pages
10024947D00 - Turbine Control Board Requirements Specification, PB 540
No ratings yet
10024947D00 - Turbine Control Board Requirements Specification, PB 540
8 pages
VCO Non-Adjusting PLL FM MPX Stereo Demodulator With FM Accessories
No ratings yet
VCO Non-Adjusting PLL FM MPX Stereo Demodulator With FM Accessories
16 pages
3.-GE11 EntrepreneurialMind FINAL
100% (4)
3.-GE11 EntrepreneurialMind FINAL
15 pages
The Role of Academic Libraries in The Digital Transformation of The Universities
No ratings yet
The Role of Academic Libraries in The Digital Transformation of The Universities
5 pages
FLege Et Al (1999)
No ratings yet
FLege Et Al (1999)
27 pages
Basfiber For Construction Market (US Customary Units) .
No ratings yet
Basfiber For Construction Market (US Customary Units) .
4 pages
Ministry of Corporate Affairs: Only For Pay Later Payment. Not For Payment at Branch Counter E-Challan For Paying Later
No ratings yet
Ministry of Corporate Affairs: Only For Pay Later Payment. Not For Payment at Branch Counter E-Challan For Paying Later
2 pages
HuangYan 2007 1introduction Pragmatics
No ratings yet
HuangYan 2007 1introduction Pragmatics
20 pages
Neubert and Shreve, Textuality
No ratings yet
Neubert and Shreve, Textuality
30 pages
Practicing Theory in Second Language Writing
From Everand
Practicing Theory in Second Language Writing
CSPtrade2
No ratings yet



