
BECAUSE DATA CAN’T SPEAK FOR ITSELF

A Practical Guide to Telling Persuasive Policy Stories

DAVID CHRISINGER and LAUREN BRODSKY

JOHNS HOPKINS UNIVERSITY PRESS


Baltimore

© 2023 David Chrisinger and Lauren Brodsky
All rights reserved. Published 2023
Printed in the United States of America on acid-free paper
2 4 6 8 9 7 5 3 1

Johns Hopkins University Press


2715 North Charles Street
Baltimore, Maryland 21218
www.press.jhu.edu

Library of Congress Cataloging-in-Publication Data is available.

ISBN 978-1-4214-4584-7 (paperback)


ISBN 978-1-4214-4585-4 (ebook)

A catalog record for this book is available from the British Library.

Special discounts are available for bulk purchases of this book. For more information, please contact
Special Sales at specialsales@jh.edu.

A worldview is not a Lego set where a block is added here, removed there. It’s a fortress
that is defended tooth and nail, with all possible reinforcements, until the pressure
becomes so overpowering that the walls cave in.

Rutger Bregman, Utopia for Realists

CONTENTS

Foreword, by Ethan Bueno de Mesquita

INTRODUCTION
Why Should You Learn to Tell Stories with Data?

PART I PEOPLE
Telling Stories with Data about People for People

PART II PURPOSE, THEN PROCESS
Finding Meaning in the Data and Making It Work for You

PART III PERSISTENCE
Using Data to Solve Wicked Problems with Integrity

Conclusion

Acknowledgments
Tips to Help You Write More Effectively with Data
Notes
Index

FOREWORD
Ethan Bueno de Mesquita

We live in an age of data triumphalism. Evidence-based decision-making, evidence-based medicine, evidence-based parenting, evidence-based policing, evidence-based policy, even evidence-based sports management; these are the watchwords of our times.
But there is no guarantee that a world saturated with data analysis will be
a better place. Stories of bias baked into quantitative algorithms, replication
crises in scientific literatures, misinformed citizens, and decision makers
guided by dubious interpretations of quantitative evidence are
commonplace. This should give us pause. What will it take for all this data
to make the world a better place?
David Chrisinger and Lauren Brodsky provide us with a key part of the
answer in this essential book. More data is not enough. Nor is good
analysis. Harnessing the power of data requires clear, accurate, and
compelling communication. Because Data Can’t Speak for Itself teaches us
how to achieve those goals. It does so through examples and fundamental
principles, perhaps undersold as mere “tips.”
A central theme of Chrisinger and Brodsky’s book is that anyone writing
about data should stop. If you want to make an impact on your reader, don’t
write about data. Write with data. Your job as a writer is to help “the reader
make sense of the world,” as they tell us in the introduction. You make
sense of a complicated world by telling a story. Data can help tell the story.
But data should almost never be the story.
I can’t overemphasize how much I agree with this point. Too often, our
breathless discourse presents data as the answer to our problems. But that is
ridiculous. No matter how big your data set or how fancy your statistical
model, your computer can’t do your thinking for you. Or, as Chrisinger and
Brodsky put it, “data can’t speak for itself.”
Making progress on hard problems requires hard thinking. If you want to
change the world, you need a theory of how the world works. Data can help
you probe that theory and propose a change. But the data isn’t the theory
itself. The theory is an idea, or a story, that helps you make sense of things.
A person trying to persuade with data (or without data for that matter) must
have a story, of course. Imagine trying to persuade someone without
knowing what idea you want to persuade them of.
A good story does lots of things. A good story motivates, helping your
reader see why they should care. A good story clarifies, helping your reader
see what you think is going on or how you think the world works. And a
good story helps both you and your reader assess how compelling the
evidence provided by the data is.
Let me explain what I mean. In the 1980s, the New York City Police
Department infamously implemented “broken windows policing.” The
department tried fighting serious crime by increasing law enforcement for
low-level infractions. The idea was to reduce the environment of
lawlessness (symbolized by broken windows) in which crime flourishes.
Here’s what the data said: neighborhoods where the police implemented
broken windows policing experienced a decrease in crime compared with
other neighborhoods. What story does that tell?
You could tell a triumphant story with that data: broken windows
policing worked! Or you could dig deeper and find there is a different story
to tell. For instance, you might ask which neighborhoods got broken
windows policing. According to the data, the answer is neighborhoods with
a recent surge in violence. That leads to another interesting question. What
do we know about what usually happens in neighborhoods that see a surge
in crime? It turns out that the data has an answer to that question, too. Such
neighborhoods usually experience a “reversion to the mean.” The
idiosyncrasies that caused the surge are often fleeting. Thus, crime tends to
go down after a surge, even without a change in policing. So maybe the
story we want to tell isn’t one about the success of a new policing tactic but
rather one about the selective targeting of those tactics. That story suggests
different questions we should ask the data and different conclusions we
could reach.
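To make the reversion-to-the-mean logic concrete, here is a minimal simulation sketch (ours, not the book's; every number in it is hypothetical). Each neighborhood gets the same stable long-run crime level plus random year-to-year noise; the neighborhoods that happen to surge in year one fall back toward the average in year two even though nothing about policing changes.

import random

random.seed(0)

# Hypothetical setup: each neighborhood has the same long-run crime level,
# and observed yearly counts are that level plus idiosyncratic noise.
baseline = 100   # long-run average incidents per neighborhood per year
noise = 30       # typical size of the year-to-year swings

year1 = [baseline + random.gauss(0, noise) for _ in range(10_000)]
year2 = [baseline + random.gauss(0, noise) for _ in range(10_000)]

# "Targeted" neighborhoods: the top 10 percent in year 1, as if police had
# chosen them for extra enforcement because of the surge.
cutoff = sorted(year1)[int(0.9 * len(year1))]
surged = [(y1, y2) for y1, y2 in zip(year1, year2) if y1 >= cutoff]

avg1 = sum(y1 for y1, _ in surged) / len(surged)
avg2 = sum(y2 for _, y2 in surged) / len(surged)

# Crime in the surge neighborhoods drops in year 2 with no intervention at all,
# which a careless reading could credit to the new policing tactic.
print(f"Surge neighborhoods, year 1 average: {avg1:.1f}")
print(f"Surge neighborhoods, year 2 average: {avg2:.1f}")

With these made-up numbers, the year 1 average for the targeted neighborhoods comes out well above the baseline of 100, while the year 2 average falls back to roughly 100. That is the reversion described above, and no policing tactic was needed to produce it.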
Importantly, neither the data nor our computers can tell us which story to
tell. We must think about the stories we might tell, and then the data can
help us probe which story we should tell. If we try to leave out the thinking
and storytelling, we might still persuade. But we are also likely to mislead.
So how do you tell a clear, accurate, and compelling story with data?
Well, to state the obvious, read this book to find out! But let me highlight a
few of Chrisinger and Brodsky’s most important ideas.
If you want to write effectively with data, start with a good question: ask
better research questions (tip #10), because “good questions drive good
stories.” What makes a good question? It had better be something with an
answer that matters. And it had better be possible to make some progress in
answering it. Of course, what counts as a question whose answer matters is
in the eye of the beholder. You must always bear in mind whom you are
writing for and what you hope they will do with your story (tip #12).
These two requirements—a question whose answer matters and that is
answerable—are often in tension. For instance, many studies provide
convincing evidence about how some educational intervention affects
standardized test scores in primary and secondary schooling. But readers
may not care much about standardized test scores. They may want to know
about the effect on “real” outcomes, say college attendance, employment, or
wages. Those outcomes are super hard to study because they are some time
off in the future. A good question must thread the needle of the interesting
and the answerable. That’s hard. So, a writer must choose carefully which
stories they do and don’t tell.
Another key point of the coauthors is that persuasion requires trust. A
persuasive writer ought to be honest and transparent with their readers.
Don’t blind them with science, using fancy data analyses to hide
weaknesses in your story. Be forthright about what you do and don’t know
from and about the data (tips #25 and #26). Be cautious when interpreting
evidence of correlation; it need not imply causation (tip #30). Inform your
readers when the outcome measured in the data doesn’t fully capture the
outcome of interest. For instance, in our example of standardized testing,
lay out the limits of such test scores as a measure of educational outcomes
(tip #11). Present data in ways that help readers understand its substantive
meaning. This might involve switching from percentages to percentage
points, converting numbers to ratios, or making an analogy for a quantity of
unimaginable size (tips #3 and #18–#20). And help your readers think for
themselves by presenting other kinds of evidence, beyond the quantitative
data.
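As a minimal, hypothetical illustration of why that presentation choice matters (the figures here are ours, not the authors'): the very same change can be reported as a modest-sounding number of percentage points or as a dramatic-sounding percentage increase.

# Hypothetical figures: one change in an unemployment rate, described two ways.
old_rate = 4.0   # rate before, in percent
new_rate = 6.0   # rate after, in percent

points_change = new_rate - old_rate                        # 2.0 percentage points
relative_change = (new_rate - old_rate) / old_rate * 100   # 50.0 percent increase

print(f"Unemployment rose {points_change:.0f} percentage points.")
print(f"Unemployment rose {relative_change:.0f} percent.")

Both statements describe the same hypothetical change; which framing you lead with shapes what the reader takes away, which is why being explicit about units is part of honest communication.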
That’s a lot of good advice. But I might be willing to trade it all for tip
#29: “Manage the tension between timeliness and accuracy.” I want to
interpret this tip as a warning about the risks of hunting for a good story and
being too persuasive.
Novelty often makes for the best stories. A credible-looking study with a
surprising result is hard to resist. I’m thinking of stories about research on
topics like the power pose, extrasensory perception, or miracle cures for a
virus. But any scientific finding comes with uncertainty. Does the finding
reflect something real in the world, or is it spurious, the result of
meaningless noise? The more surprising or unlikely a finding, the more
likely it is to be spurious. We should almost never change our beliefs much
in response to any single finding.
Those who write with data have an important role to play here. Don’t be
too persuasive when telling a story based on a single surprising result.
When a substantial body of evidence contradicts the conventional wisdom,
then, by all means, persuade away. But when you see that first surprising
result, remember that it is probably wrong. So, managing the tension
between timeliness and accuracy might mean not telling that story. At the
very least it means being transparent about the uncertainty.
Let me end where I started. We live in a data-driven age. Data analysis
has the potential to improve our lives. But to realize that potential, we all
need to become better at thinking about and communicating with data.
David Chrisinger and Lauren Brodsky have provided us with a vital tool in
service of that noble mission. I hope you enjoy it as much as I did.

ETHAN BUENO DE MESQUITA is the Sydney Stein Professor and deputy dean at the Harris
School of Public Policy at the University of Chicago. His research focuses on applications of
game theoretic models to a variety of political phenomena including conflict, political violence,
and electoral accountability. He is the author of Thinking Clearly with Data and Political
Economy for Public Policy, both published by Princeton University Press, as well as many
articles in leading journals in both political science and economics. His research has been
supported by the National Science Foundation, the Office of Naval Research, and the United
States Institute of Peace. Before arriving at the University of Chicago, Ethan taught in the
political science department at Washington University in St. Louis. He received a BA from the
University of Chicago in 1996 and an MA and a PhD from Harvard University in 2003.

INTRODUCTION

Why Should You Learn to Tell Stories with Data?

Because data can’t speak for itself—that’s why. Case in point: In the spring
of 1944, as the Allies prepared to invade Hitler’s Fortress Europe, two
psychologists from Smith College—Fritz Heider and Marianne Simmel—
published the results of a study that forever changed our understanding of
how people make sense of new information. Heider and Simmel created a
90-second film of three black geometric shapes moving across a white, two-
dimensional background and then had three groups of college students
watch the film. They asked the first group to describe what they saw
without any further prompting. The second group was told to interpret the
moving shapes as though they were people acting in the real world. The
researchers gave the third group the same instructions as the second group,
only the film the third group watched was shown in reverse. You can watch
the video for yourself on YouTube; search for “Heider and Simmel (1944)
animation.”1
At the start of the film, a triangle gets locked inside a larger rectangle
before a smaller triangle and a circle enter the scene from the top of the
screen. The bigger triangle then leaves the confines of the rectangle when
the left side of it opens outward on a hinge, like a door. The two triangles,
now out in the open, repeatedly bash into each other as the circle moves
toward the rectangle. The action continues until the bigger triangle finds
itself locked inside the rectangle again, like it was at the beginning of the
film.
Heider and Simmel found that of the 114 students they tested, only 3
(1 in the first group and 2 in the third group) didn’t interpret what they saw
as characters acting out a story. The vast majority, in fact, invented quite
elaborate stories to explain what they saw. Some students saw the triangles
as two men fighting, and they saw the circle as a woman trying to evade the
bigger triangle, which was clearly an aggressive bully in their minds. Many
students perceived the smaller triangle and the circle as “innocent young
things,” while the bigger triangle was “blinded by rage and frustration.”2 To
make sense of an exceedingly complicated world, Heider and Simmel
argued, most people must turn facts, data points, observations, and other
aspects of life into a story with characters who have different needs and
who must confront one another to get whatever it is they desire.

Tip #1. Use reporting to convey information. Use stories to create an experience. Stories can transport the reader by creating an experience that helps them see what you’re trying to say. Information alone is not a story.

Since Heider and Simmel’s experiment, research in a variety of disciplines has confirmed their findings—and expanded on them. In 2003,
Michael D. Slater, a professor of social and behavioral science at Ohio State
University, published the results of a study showing that people were
persuaded to eat more fruits and vegetables when they were told stories
with characters they could identify with. By contrast, Slater and his
coauthors found that giving people an article detailing evidence on how
much eating fruits and vegetables can improve health was much less
persuasive.3
In 2007, a professor of journalism and international media systems,
Marcel Machill, and two coauthors studied what could make people
watching television news understand and retain information more
effectively. They found that people who watched a short segment about a
local baker who suffered from health problems because of poor air quality
better understood and remembered information on the dangers of air
pollution than did people who watched a “typical” news segment devoid of
storytelling. Additionally, they found that “adopting a narrative form for TV
news also gives a clearer distance and perspective to the news content,
which has advantages for social communication.”4
Dan P. McAdams, who teaches psychology at Northwestern University,
wrote a book in 2005 titled The Redemptive Self: Stories Americans Live
By, in which he reported two important findings from his years of studying
the impact that stories have on readers. First, people remember facts much
longer and can make better sense of what they read when those facts are
part of a story. And second, people are persuaded more quickly and
effectively when information and ideas are presented to them in story form.5
There are countless other studies on storytelling we could cite here, but
we won’t. Not because such a discussion wouldn’t be interesting but
because none of you really need stacks of proof to be persuaded that stories
are a far superior form of communication to a simple recitation of facts. You likely
wouldn’t have even bought this book if you didn’t already agree with its
premise: to persuade a reader with data, you must tell a story with the data
that helps the reader make sense of the world.

Tip #2. Communicate, don’t complicate. The last thing people need is
more information. They have far too much of it already. What they need is
help making sense of all that information and to understand the difference
between what’s important and what’s just noise.
This book is filled with tips to help you write persuasive stories
with data
Try to think of communicating well with data as a craft not unlike the
skilled trades of carpentry, masonry, or blacksmithing. These skilled trades,
like communicating with data, require both technical and artistic skill to
fulfill a specific purpose. The carpenter who designs and builds your
favorite chair, for instance, would have much more difficulty doing so
without having all the right tools and techniques at their fingertips.
The tips of this book—32 in all, woven throughout and made prominent
—are tools you should store in your own writer’s toolbox for future use. As
you notice these tools being used in the real world and as you learn more
about them—and practice using them, too—communicating well with data
will eventually become second nature to you, like a carpenter driving a 16-
penny nail with a single thwack! of their framing hammer. Not every
writing tool will be needed at all times, but they will be ready to use when
called for.
If you bought this book hoping to learn how to analyze the great
multitude of data out there, we’re sorry to say this isn’t the book for you. At
least not yet. But once you’ve grasped how to crunch the numbers, build
your models, and run those regressions, you’ll be ready to dust this baby off
and learn all you need to know—and what you should avoid—when
communicating the so what? of your data analysis to readers who may not
know a regression from an inkblot test. They will need your communication
skills to help them get it, and to care about it.
This book is also for anyone who is tasked with writing about other
people’s research in ways that are accurate and persuasive, especially for
readers who are more interested in having answers to their questions than
they are in learning the technical details of how those answers were
obtained. If that describes you, you’re going to benefit especially from “Part
III. Persistence,” which will teach you all you need to know to write with
integrity every time you write with data.
For a complete list of all our tips, please flip to page 107.
While nobody ever said this was easy, there is hope!
Most people—including those in the highest political offices in the land—
simply do not have the time or expertise to properly interpret and assess the
credibility and usefulness of available data and the countless reports,
studies, and analyses of data released every day. It’s not that most people
aren’t smart enough. Far from it. Whether you’re a policy analyst,
consultant, journalist, academic, politician, or CEO, the reality is that if
telling effective stories with data were easy, you’d be doing it already.
Sadly, nothing about it is easy—not collecting and sifting data; not
confronting its contradictions and conflicts; and not creating a framework
for describing what we know from it, evaluating what works, or devising
next steps for corrective action.
But have no fear! Because data can’t speak for itself, each day we
commit to this work, we are afforded an incredible opportunity to think
critically about what we know, what we don’t, and why anyone should care
either way. Claims we make about the world, when supported by credible
evidence, have the power to change the way a reader sees the world—the
first step in a long journey to creating positive and lasting change.
Our interest in communicating effectively with data stems from our
combined decades of teaching public policy students and practitioners to
use their data in support of a story that helps readers make sense of
something. Sometimes the people we teach must learn how to use more
data, other times less. Some need to learn how to explain and contextualize
the evidence they have, while others need to figure out how to collect data
that would help them say something valuable. Above all else, nearly every
writer we’ve ever taught or consulted with has needed help figuring out
how to tell stories about data that meet the needs specific to their readers.
That’s what we’ll be covering in “Part II. Purpose, Then Process.” We’ll
show you how to think about why you do the work you do and how
knowing the why can lead you not only to finding meaning in your data but
also to gathering and describing that data in a way that’s useful. It is often
quite useful to make your reader understand how data pertains to real
people, which is what we’ll discuss in “Part I. People.”
Part III of the book, as we mentioned above, deals with matters of
integrity. How, for example, do we tell accurate stories with data that can
actually help solve some of society’s most intransigent problems, not just
describe them? By the end of this book, you will understand what separates
strong data narratives from weak ones and will have a much better sense of
how to turn the latter into the former.
We wrote this book because we wanted something useful (and succinct)
to share with our students and colleagues that covers all we’ve learned over
our many years of teaching. It’s our hope that this short book will equip you
with enough writing tools that you’ll feel more confident the next time you
write in support of a change you’d like to see happen in the world.
What are we waiting for? Let’s get started!

PART I

PEOPLE

Telling Stories with Data about People for People

“They never got real work again”


At the beginning of the Great Recession, which officially began in
December 2007, almost every line graph you could find representing the
gravity and scope of the economic downturn in the United States showed
one of two trends. For the number of layoffs and the unemployment rate,
the line graphs showed an alarmingly steep trend upward. At the height of
the recession, the United States had to contend with a 10.6 percent
unemployment rate—the highest it had been since the Great Depression of
the 1930s. On the other hand, graphs of the numbers of hours worked per
employee, wage inflation, the number of job vacancies, and the country’s
gross domestic product (GDP) moved in the opposite direction.
After President Barack Obama took the oath of office in January 2009,
he and his administration set out to correct course. In February, Obama
signed into law the American Recovery and Reinvestment Act, an $831
billion stimulus package that was designed to save existing jobs and create
new ones as quickly as possible. The next month, the Obama administration
bailed out the automotive industry by loaning General Motors and Chrysler
about $80 billion so they could stay in business while they reorganized their
bankrupted companies. The auto industry bailout saved an estimated 1.34
million jobs and avoided billions in lost revenue, according to the Center
for Automotive Research, an independent research group that gets some
funding from automakers.1
Five months after Obama became president, the Great Recession
officially ended, according to the one and only group of experts in the
United States that gets to decide when a recession begins and ends: the
National Bureau of Economic Research. Nevertheless, in the spring of
2010, Pew Research Center released survey results showing that nearly nine
out of ten people in the United States believed the economy was either in
poor shape (49 percent) or in fair shape (39 percent). Moreover, Pew found
that most Americans didn’t think the Obama administration’s stimulus
package had improved the job situation for those who had been laid off at
the height of the recession.2

Tip #3. Ratios can help readers make sense of large numbers. Saying
“one in four people” is much easier for readers to picture than “7,526,333 of
30,111,489 people.”
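A minimal sketch of that conversion, using the tip's own counts (the rounding to a simple "1 in N" phrasing is our own suggestion, not a method from the book):

part, whole = 7_526_333, 30_111_489   # the counts from the tip above

# Round the share to the nearest simple "1 in N" phrasing a reader can picture.
n = round(whole / part)

print(f"About 1 in {n} people ({part / whole:.0%})")   # -> About 1 in 4 people (25%)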

Peter Cappelli summed up nicely what all this data tells us. “One in five
employees lost their jobs at the beginning of the great recession,” the
Wharton professor of management told Penn Today in 2018. “Many of
those people never recovered; they never got real work again.”3

It’s not your job to tell the reader everything there is to know
Who exactly are these people who never recovered, who were still
unemployed—through no fault of their own—four years after the recession
started? Mostly they were what the federal government calls “older
workers,” meaning people aged 55 years or older who could and want to
work. These folks had an especially hard time recovering after being laid
off at the height of the recession. Taking a look at figure 1.1, you’ll see that
the number of unemployed older workers peaked in February 2010 at
2.3 million. Fast-forward nearly two years, to December 2011, and the number was still quite high, at 1.9 million.
To make sense of what was happening to unemployed older workers, the
US Government Accountability Office (GAO) analyzed multiple sets of
data related to unemployment from the Bureau of Labor Statistics,
including data from the Current Population Survey, Job Openings and
Labor Turnover Survey, and Displaced Worker Supplement. GAO also
analyzed data on retirement savings from the 2007 Survey of Consumer
Finances. Lastly, GAO modeled microsimulations to estimate retirement
income for workers who stopped working at different ages. For those
unfamiliar with GAO’s work, you should know that GAO is considered the
“supreme audit institution of the United States,”4 meaning that policy
analysts from all levels of government—both domestic and international—
look to GAO as an exemplar of how to evaluate governmental performance
objectively.
Here’s what GAO found after crunching all those numbers: the rate of long-term unemployment among older workers rose at a greater pace than it did for younger workers, and by 2011, more than half of all unemployed
older workers had been actively looking for work for longer than six
months.5 Bear in mind that the Bureau of Labor Statistics counts people as
unemployed only if they are still looking for work. Those who give up for
whatever reason no longer count—at least as far as the unemployment rate
is concerned. Therefore, the true number of older unemployed workers was
even greater.
Figure 1.1. Millions of older workers were still unemployed more than two years after the recession
ended, according to the Government Accountability Office’s analysis of 2007–2011 Current
Population Survey data.
Source: Government Accountability Office, Unemployed Older Workers: Many Experience
Challenges Regaining Employment and Face Reduced Retirement Security, GAO-12-445
(Washington, DC: GAO, April 2012), figure 3

“The data is the prop”


The process of communicating effectively with data begins with a firm
understanding of your reader. For whom are you writing? What do they care
about? What do they already know? What do they want? Is that different
from what they really need? Does any fear keep them up at night? How
persuadable are they? If they are persuadable, what do they need to know to
see things the same way you do? Writing a story with data is not for you; it
is for your reader. You can’t make a change in the world if you don’t
understand who it is you’re addressing. Knowing your reader is an
important first step to crafting a story with data for them.
Sometimes the answers to questions about the reader will be well known
or easy to find, while other times you’ll have to do some digging and make
educated guesses. Even just imagining a conversation with your reader can
help you write more strategically to meet that reader’s particular needs.
“You need to dazzle people,” says Paul Von Chamier, a research officer
at New York University’s Center on International Cooperation (CIC).
Because there are so many competing policy issues on any given decision
maker’s agenda, writers who want to persuade their readers must recognize
this and look for the most efficient and effective ways of saying what needs
to be said. In Von Chamier’s work, knowing his reader is essential. He must
collaborate with elected officials and appointees to build coalitions of
organizations to address income inequality, reduce exclusion, and develop
evidence-based solutions, and the best way he’s found to communicate with
such readers is through stories rooted in robust data.6
But not just any story.
Von Chamier collaborated with a coalition of 12 countries from around
the world that were eager to “put in the extra effort to achieve equality and
inclusion—financial and social inclusion.” One of the coalition countries,
Ethiopia, invited Von Chamier and his CIC colleagues to visit the country
and help Ethiopians improve their inclusion efforts. Ultimately, Von
Chamier and his team were tasked with developing a master plan to guide
Ethiopia’s development in this area. By his own account, the trip was a
tremendous success, although soon after he returned to the United States,
Von Chamier ran into a delicate problem: How could he and his team
communicate with the Ethiopian officials in a way they would find most
useful—without offending or turning their client away from the hard truths?
In working with practitioners, we often hear about this concern:
representing a problem—whether one with poverty, diversity and inclusion,
or financial solvency—is difficult. Even when clients have intimate
knowledge of their own problems, it’s not always clear how to represent the
depth of problems in a way that doesn’t put them on the defensive. What we
have found in our own work is that, when put delicately, or maybe when
framed as an opportunity, clients are usually more than happy to have their
organization and other stakeholders coalesce around a problem defined in
such a way that everyone wants to see it solved. It’s a mobilizing effort, not
an offense.
As a show of respect to their Ethiopian hosts, CIC began its story by
praising the country’s efforts to improve outcomes for its children. This was
a key source of pride for the officials: because of the steps the country had
taken in recent years, Ethiopian children at that time were healthier, stayed
in school longer, and had a longer life expectancy than previous
generations. And rather than simply spouting a litany of data points about
life expectancy and graduation rates, Von Chamier’s team wrote about the
children themselves, using the data as mere support for that story.
“Data is not the thing in the center,” he says. “The thing in the center is
the political narrative. The data is the prop.”
From there, Von Chamier and his team pivoted to discussing ways in
which Ethiopia was not progressing as the officials had intended. Some of
the country’s initiatives, it turned out, weren’t helping to reduce inequality,
nor were they having a positive impact on the country’s struggling
economy. To help his readers understand that the problems the country
faced were not insurmountable, Von Chamier turned to another story he felt
confident would resonate with the Ethiopians.
“Ethiopia has an ideology to be a beacon of light for Africa, it’s their
goal,” he says. “In each section of the report we had ‘policy beacons’:
policies that were successful and could be a guide to share with Africa.
These were examples of success.” From there, all Von Chamier had to do
was frame Ethiopia’s struggles as “opportunities for action”—beacons to a
better future.
Von Chamier knew that the officials he was communicating with were
already well aware of the problems they faced, so there wasn’t much need
to prove how bad things were to them. “In the past there would be eye
rolls,” Von Chamier explains, “because they know their own problems—
high inflation, etc.” That’s why he and his team started with acknowledging
the progress the country had made and reminded the Ethiopians of their
goal to be a leader and trend-setter. Given what they knew about their
readers, Von Chamier and his team didn’t have to dedicate much space to
laying out all the negative outcome data; instead, they were able to give
over more space to analyzing one of the root causes of Ethiopia’s
misfortune—the country’s massive exchange rate problem, which reduces
its access to hard currency. Because the Ethiopian government had taken on
massive debt, CIC determined that the country simply could not lower the
exchange rate, which severely hampered its ability to trade with other
countries.

Tip #4. Don’t forget there are real people behind all those numbers
you’re crunching. Readers will care a hell of a lot more about people than
about data points, so if your goal is to get the reader to care, find the people
in the numbers and tell a story about how those people are affected.

What did the Ethiopians really want to know? They wanted to know
whether they should pay down their debt or if there was another strategy
that would work better to even out the country’s exchange rate. “We
analyzed the dynamics of the situation,” says Von Chamier, “and found out
that government-owned companies were using a parallel exchange because
international companies refused to use an overvalued rate. No one used
official values. In other words, some of the ‘hit’ had already been factored
in.” Therefore, the CIC report explained—quite delicately—that, in Von
Chamier’s words, “the cost of making inflation realistic is not as hard as
you think. This is a fresh fact. The cost has been organically absorbed by a
parallel market.”
To improve the palatability of the story he needed to tell, Von Chamier
mimicked language used by the Ethiopians to describe their situation. His
readers cared much more about the story than they did about the data.
That’s not to say data isn’t important; it’s incredibly important. But it’s not
more important than the story. The story is about Ethiopia as a beacon for
what is possible in the future.

Quantitative data doesn’t always help us understand why something is happening
Why was the Great Recession especially hard on older workers? The
quantitative data from the Bureau of Labor Statistics can’t really tell us the
reasons. That’s why GAO decided to talk directly with older workers who
had lost their jobs during the recession and were struggling to find other
work years later. Specifically, GAO conducted ten focus groups with a total
of 77 long-term unemployed workers in Virginia, Maryland, California, and
Missouri. GAO also interviewed officials at the Social Security
Administration, the US Department of Labor, and front-line staff members
at one-stop career centers, where people experiencing unemployment can
access a variety of employment and training services. Lastly, GAO analysts
interviewed a couple dozen experts about issues related to older workers
and unemployment. While the policy analysts who work for GAO do run
regression analyses, a larger chunk of their time is spent talking to people—
those who make decisions and pull the levers of power, as well as those
who are either helped or harmed by the decision-making and the lever-
pulling.

Tip #5. If you want to be an exceptional data analyst, you must learn
how to talk to people. And we mean really talk to people—and listen, too.

What GAO heard in their focus groups was much more revealing—and
moving—than what they gleaned from government databases.7 Many of the
older workers perceived that employers were reluctant to hire them because
of their age. One 57-year-old man, for example, said, “I had a hand in some
of the hiring. You know, it wasn’t for publication, but the guy said, ‘Don’t
hire anybody older than me or fatter than me.’ ” A woman a year younger
remarked, “I have a job interview tomorrow for a job at 50 percent of my
salary for $25,000 a year. And you know what? I’ll take it if they offer it to
me because I can keep looking while I have [the job]—if they want me after
they see how old I am when I walk in the door.”

Tip #6. When you want your readers to remember your story, use
striking imagery that will stick with them over time. When looking for
details to include in data-driven stories, pay attention to your gut reactions.
If you feel like you’ve been punched in the gut after reading a statistic or a
quote from an interview, take note of that. Try to re-create the experience
for the reader. Chances are if you felt something, they’ll feel something too.

Others in the focus groups spoke about the personal challenges of looking for work after months and months of rejection and discouragement.
“You don’t feel like you’re part of the community,” one 65-year-old woman
explained. “It’s like you get older, and people just toss you aside.” Another
woman, a few years younger, confided in the focus group leader: “When
you’re not working, you don’t feel very good; you’re depressed. You’re,
you know—you feel discouraged. Your self-esteem is about, you know, an
inch high.”
Is being unemployed especially bad for older workers? Again, the
quantitative data can’t tell us as much as we’d like to know. People, on the
other hand, generally have no problem telling stories about how something
has negatively affected them. “My son actually moved back in with us so he
could give me rent money each week because he felt bad for us,” one 55-
year-old man told the group. “He got rid of his apartment and moved back
in with us.” Another participant, a 56-year-old woman, said that after she
was laid off, she developed a fear of visiting the doctor. “I don’t even want
to go to the doctor to find out there’s something wrong because then we
can’t afford to get it fixed,” she explained. Perhaps the most heart-
wrenching of all the points made during the focus groups came from a 61-
year-old man: “Hopefully, I don’t live … for more than about another ten
years because I will be broke in ten years.”
Testimonials are useful in explaining data because they
1. ground big concepts, such as income inequality, in the specific
circumstances of individuals;
2. introduce readers to points of view they might never have otherwise
considered; and
3. afford people who may have been largely ignored an opportunity to
share their stories, perceptions, and opinions with a wider audience.

We especially like to include quotations from real people whenever they present a distinctive take on a complex issue or when we want to compare
one point that’s been made to another—either because we want to critique
or, conversely, to amplify what was said. We also like to use quotes
whenever we want to present a well-stated passage whose meaning might
be lost if paraphrased instead.

Tip #7. Help your reader understand abstractions by comparing them to concrete things they can picture. Here’s an example: Why should
anyone care about net neutrality? How many Americans even know what
net neutrality is? Back in 2014, comedian John Oliver waded into the
“boring” and obscure issue and explained to his audience that net neutrality
protects start-up companies from being swallowed by bigger companies on
the internet. Here’s how he explains the impact of the Federal
Communications Commission’s proposed rule changes: “Ending net
neutrality would allow big companies to buy their way into the fast
[broadband] lane, leaving everyone else in the slow lane.” He leavens this
potentially boring subject by turning to humor: Without net neutrality, “how
else is my start-up video streaming service Nutflix going to compete? It’s
going to be America’s one-stop resource for videos of men getting hit in the
nuts.” In other words, Nutflix and other real start-ups would be at risk of
falling victim to anticompetitive tactics.8

Data is about people and is used by people


In 2013, the Obama administration decided to grant the American public
access to data collected by the federal government that had previously been
nearly impossible for them to get their hands on. Obama’s hope was that by
granting public access to more than 100,000 federal government data sets,
innovative companies and organizations could use them to benefit society,
grow the economy, and create jobs. This wasn’t the first time something
like this had been tried. Decades ago, the federal government made weather
data and the Global Positioning System, or GPS, available to the public.
Once that data was shared, entrepreneurs used it to create navigation
systems, warning systems for severe weather, and precision farming tools,
among many other applications. This is data for the public good.
To lead his administration’s “Open Data Initiative,” President Obama
turned to Nick Sinai, who served as the US deputy chief technology officer.
Sinai’s first step was to tell the public the good news. In the early
announcements about the Open Data Initiative, Sinai’s team focused on
how much data the federal government controlled and did not worry as
much about the stories the data could be used to tell. It took a few failed
attempts before Sinai realized that companies and organizations were not as
concerned about how much data was accessible as they were about what
kinds of problems could be solved using the data.

Tip #8. Try starting with the main finding—your message—not facts or
your methodology. Instead of this: One study probed the relationship
between parental education and income and participation in postsecondary
education and found that young people from moderate- and low-income
families were no less likely to attend college in 2001 than they were in
1993. Try this: Young people from moderate- and low-income families were
no less likely to attend college in 2001 than they were in 1993, according to
one study.

“We like to talk about inputs,” Sinai recalls, “not outcomes,” but it’s
outcomes that most interest entrepreneurs and innovators. The focus of the
story he needed to tell, Sinai realized, had to be on the fact that there were
countless ways to use all the data the government had collected to make
real, lasting, and positive change in people’s lives.9
Sinai was later able to apply this lesson to another data-communication
challenge when working on the Obama administration’s initiative to bring
fast broadband to students around the country. Rather than focus on how
much the administration was going to improve broadband infrastructure,
Sinai focused his communication to the public on explaining outcomes and
impact. He led with the goal—that 99 percent of students would soon have
access to fast broadband—rather than how they were going to accomplish
that goal (that is, the method). The campaign’s slogan became “Fast
Broadband for All.”
“When you have a presidential policy goal,” Sinai explains, “you need to
do a good job of articulating that goal, externally and internally, in a way
that is measurable and realistic.” To do this, you can think about how a
journalist might write about the policy initiative to give readers “actionable,
timeboxed, and specific metrics,” which can be used to evaluate the
effectiveness of a given initiative. According to Sinai, the “timebox” is
when the initiative will happen. Journalists and the public also need to
know which metrics will be used to evaluate it. But they want that
information conveyed in a story—preferably one that can fit in a headline.
When readers want fresh baked cookies, sharing a recipe from your
mother’s side of the family won’t immediately curb their appetite.
Sinai also applied this strategy when crafting messages for President
Obama’s energy policy proposals. Rather than relaying quantitative data
and projections chockful of scientific jargon—all too common in energy
policy communications—Sinai’s message was simple, and it focused on
impact: “Solar as cheap as coal by 2020.” This slogan was particularly
attractive because it makes its point clearly, even to the reader who doesn’t
know how expensive coal is—or solar, for that matter. What does matter is
the potential impact on people who likely would adopt solar technology if it
were as cost-effective (in the short term, anyway) as burning coal to
produce electricity. Even though the Obama administration wasn’t able to
achieve its goal in this area, Sinai’s use of a story-first strategy was the right
tack. It’s important to remember that in the world of public policy, some
ideas take years—even decades—to catch on. Just because your solution is
not instantly adopted by the readers you’ve addressed doesn’t mean it never
will be.
Another good example of starting with the story (and using data to
support that story) comes from David Leonhardt and Yaryna Serkez. In July
2020, they published an opinion essay in the New York Times titled “The
U.S. Is Lagging behind Many Rich Countries. These Charts Show Why.”
The first two paragraphs of the essay are worth quoting in full:
The United States is different. In nearly every other high-income country, people have both
become richer over the last three decades and been able to enjoy substantially longer lifespans.
But not in the United States. Even as average incomes have risen, much of the economic
gains have gone to the affluent—and life expectancy has risen only three years since 1990.
There is no other developed country that has suffered such a stark slowdown in lifespans.10

How great is that first line? “The United States is different.” Different how?
Keep reading to find out! Leonhardt and Serkez then present their data in a
series of 11 data visualizations that cover everything from life expectancy,
GDP per capita, and rates of union membership to health expenditures as a
share of GDP and the distribution of national income across the economy.
The authors’ goal, as the title of the essay indicates, is not just to point
out problems but rather to offer an explanation as to why the United States
can’t seem to produce the same positive outcomes for its citizens that other
rich countries have managed to do: Britain, Denmark, Japan, Canada, and
Germany among them. What they found was that multiple contributing factors worked together to make corporations and rich people in the United States wealthier and more powerful over time, mostly at the expense of middle- and
working-class families. The data they present, Leonhardt and Serkez argue,
shows that most American workers and their families “receive a smaller
share of society’s resources than they once did and often have less control
over their lives.” Moreover, their “lives are generally shorter and more
likely to be affected by pollution and chronic health problems.”11
By focusing on who is impacted, where that impact occurs, and how that
impact is felt, the authors show a “disturbing new version of American
exceptionalism” that acts as a frame to help the reader make sense of all the
data that follows.
Tip #9. The tone of your writing matters—a lot. If you want your reader
to see you as objective, use an objective tone and present your findings as
objectively as possible. Avoid judgmental words such as failure or
incompetence.

What happened to those older workers GAO interviewed?


There’s no easy way for us to know for sure what happened to GAO’s focus
group participants; we personally would like to know what became of the
man who worried about outliving his savings. (If you’re reading this, sir,
look us up online and let us know how you’re doing.) What we do know,
however, is that GAO’s report made an impression on those occupying seats
of power.
One person so moved was Herb Kohl, a Democratic senator from
Wisconsin, who at the time served as the chairman of the Senate’s Special
Committee on Aging. The same was true for the US Department of Labor.
“To foster the employment of older workers,” GAO had recommended, “the
Secretary of Labor should consider what strategies are needed to address
the unique needs of older job seekers, in light of economic and
technological changes.”12 The secretary of labor at the time, Hilda Solis,
directed the agency to host webinars, update its protocols, and train its one-stop career center staff on how better to serve unemployed older workers. The
Department of Labor also awarded $170 million in grants from the Ready
to Work Partnership to “expedite the employment of Americans struggling
with long-term unemployment,”13 with a special emphasis on workers aged
55 years and older. These grants were designed to help those experiencing
long-term unemployment through a range of training and specialized
supportive services. To our knowledge, GAO has not yet evaluated the
impact of these grants, so we can’t be sure what the outcomes look like for
older workers. What we do believe, though, is that some of those interview
quotations from GAO’s report planted the seeds from which these new
policies grew. Stories, supported by data, sowed the change to come.

PART II

PURPOSE, THEN PROCESS

Finding Meaning in the Data and Making It Work for
You

Start with why


In her role as the deputy director of parliamentary relations for Canada’s
foreign ministry—known as Global Affairs Canada—Rebecca Barnes
became increasingly frustrated every time a member of the Canadian
Parliament requested a data point that her office should have been able to
supply without a lot of effort. In an ideal system, Barnes believed, the
foreign ministry’s data would be centrally located and easy to access; that
is, the data from each division, each program, and even each person would
be stored in the same place. That way, when Barnes was asked how one
country or another viewed its diplomatic ties to Canada, she would be able
to access the information she needed. But it didn’t work that way at Global
Affairs Canada. The ministry’s data was siloed. Different departments
collected different kinds of data in different ways, and much of the data was
stored on individual computers, not in a central system.
But let’s set that problem aside for a moment.
Even if Barnes had access to all the data at Global Affairs Canada, she
would need to figure out what it all meant. What does it mean, for example,
that a Canadian diplomat in one country had participated in, let’s say, five
engagements with the host country’s leadership? At best, Barnes could use
her data to show trends over time: the numbers were going up, going down,
or staying the same. But that doesn’t tell us anything about impact or about
what these outputs mean.
These issues are not unique to Global Affairs Canada, of course. Many
organizations have siloed data collection, and many struggle to make
meaning from the data they have.
After the terrorist attacks of September 11, 2001, some American
officials became fixated on evaluating how the rest of the world viewed the
United States, which raises an interesting question: How do you quantify
how much the people of a foreign country like America? You could conduct
a poll, assuming you can reach your target population—and granting they
feel secure enough to answer honestly. Or you could look for clues in a
foreign country’s mass media. But what meaning can we glean if one
country’s major newspapers published, say, 72 articles favorable to the
United States and 35 unfavorable ones last year? Does the ratio say
something about public opinion or about freedom of the press?
A 2014 US Department of State report, Data-Driven Public Diplomacy,
characterized the challenge this way: “Often, public diplomacy officers are
under pressure not just to produce immediate outputs, but also immediately
to demonstrate their results. Yet public diplomacy, like traditional
diplomacy, is a long game. Impact measurement takes rigorous and
consistent data collection, pre-, mid- and post-activity, for extended
analysis.”1 Public diplomacy—in this case, engaging with foreign publics
and understanding their opinions of the United States—makes its gains
gradually. Data may be siloed, but even when it isn’t, making meaning from
it can take time.
How can we accurately determine impact when all too often it’s much,
much easier simply to conflate outputs with outcomes? The counts (5
events, 72 articles) are not outcomes; they are outputs. To find meaning
in numbers, start with why. Find your purpose for using data to begin with.
Then, check to see if your process supports that purpose after all.

Asking why will help you develop better research questions, which lead to more interesting and accurate stories
Data can help you tell an effective story to meet three main goals:

1. to understand and describe what is happening (and how we got here);
2. to determine what reforms or interventions work best—and which ones don’t work; or
3. to demonstrate what should be done next to address a challenge, issue, or problem.

You may want to accomplish all three of these goals. And you absolutely
can. But first you have to decide which goal(s) to pursue because, once you
do, you’ll then be able to focus your attention on asking better research
questions—and getting better answers.
Let’s start with the first goal you may have for using your data to tell an
effective story: you want to understand and describe what is happening (and
how we got here). Imagine that you’re a deputy-level policy professional in
charge of communicating the impact of America’s efforts to spread
democracy around the world. How’s that going? Well, let’s say you want to
know specifically about North Africa and the Middle East, so you find out
how many events the US embassy in Tunisia, for example, has sponsored
that are somehow related to improving conditions favorable to democracy.
And let’s say there were 35 such events. Now let’s ask ourselves, Why were
those events held? Was the goal to help change public opinion about
something? To serve as inspiration of some kind? Or maybe the point of the
events was to show Tunisians what the United States stands for and what
we support (and will not support). It’s hard to say without having more
information, but let’s assume the 35 events at least show what matters to the
embassy. That’s the story. That’s the why. And that’s where you should
begin.

Tip #10. Ask better research questions. Good questions drive good
stories, and the most common types of questions we see answered in public
policy writing are these: (1) Descriptive: What’s happening? (2) Evaluative:
What’s working? What’s not? (3) Prescriptive: What should be done next?

If you only report outputs (such as how many events were held and how
many people attended them), you’re missing an opportunity to tell a
compelling story—maybe one about the embassy’s goals. Also, your reader
will surely be left feeling starved for meaning. Communicating well with
data requires more than serving a data point or two to a hungry reader. It
requires—if you’ll indulge another food metaphor—that we collect our
ingredients, follow the recipe, cook something delicious, feed it to our
guests, and tell them a story about what that food is going to do to improve
their health, boost their energy, or whatever our goal may be. Notice the
difference between “The US embassy in Tunisia hosted 35 events last year”
and “To show its commitment to supporting and celebrating democracy in
North Africa, the US embassy in Tunisia hosted 35 democracy-themed
events.” With the former, we have a point devoid of meaning, and with the
latter, we have a story about a goal and value as demonstrated by outputs.
We have a descriptive data point. And the reader can then understand the
why of our story.

Tip #11. Don’t confuse descriptions of outputs with policy outcomes


and impact. Measuring outputs is important to explain what is happening
but not to explain what is working.

Now let’s talk about impact. What’s working? What isn’t? And what
should we do next? These are the important evaluative and prescriptive research questions
that so many in the policy world tend to stress. With just the outputs—35
democracy-themed events at the US embassy in Tunisia attended by 1,000-
plus people—it’s hard to tell what impact, if any, these efforts had. To
answer an evaluative question, we need more. Again, as GAO did, we may
need to turn to the people, to ask about their opinions of the events and their
level of engagement; perhaps we could employ a before-and-after poll to
canvass their views on democracy. Or perhaps it would be wise to see what
related stories may be trending on popular social media platforms.
Whatever method of data collection we choose, a persuasive argument will
need to include specific language to tell the reader the purpose we are trying
to serve.

Tip #12. Nearly every decision you need to make as a writer depends on
two things: Whom are you writing for, and what do you want them to
do with what you have written? Understanding your reader’s goals will
help you determine everything from what kinds of data (and how much) to
use to how you should frame the implications of your research. Knowing
what you want your writing to accomplish is equally important. Are you
trying to educate and inform or to persuade and inspire the reader to act?
Are you trying to comfort the disturbed or disturb the comfortable?
Everything you write depends on your answers to these sorts of questions,
and once you know the answers, you can use data to support your message
effectively.

Using data to talk about outcomes (by moving beyond outputs) can feel like “sticking your neck out”
The work of Dr. Ranjana Srivastava, an oncologist in Melbourne, Australia,
is a good example of what to do when you want to move past outputs and
focus on evaluating what’s happening in order to offer viable, evidence-
based ideas for what to do next. One of Dr. Srivastava’s biggest fears when
she began writing about health and medicine as a columnist for the
Guardian of Australia was that she would be “sticking her neck out”
whenever she offered opinions on what should be done next. It would be
easier instead simply to describe what she saw at work, rather than
advocating for systemic change in health care. It didn’t take her long,
however, to realize that her writing was not the liability she feared it would
be if she viewed it as an “extension of public service,” as she puts it. Her
purpose as a communicator shifted and came into focus. Engaging the
public through a newspaper column was like “bedside medicine,” she
muses. “If 200,000 people are reading this,” she says, “I have a duty to
inform.”2 And inform with a purpose. She had found her why.
To fulfill this new goal, Dr. Srivastava learned to “distill scientific
information down” to language that the average reader could understand and use in their everyday life. She quickly established a valuable intermediary role, filling the gap between doctors and patients by “deconstructing” medical papers and discussing their
implications. In doing that, her writing process began to serve her larger
purpose.
For example, what does it mean that Australia recently saw a resurgence
of scurvy—the “old-world disease” caused by a lack of vitamin C—in for-
profit residential nursing homes? In a column on this topic, Dr. Srivastava
could have spent the bulk of her space describing what the data says is
happening, but that wouldn’t tell readers what could be done about it.
Instead, she told a story, complete with vivid, immersive details, to show
readers what was happening and then suggest what might be done about it.
To figure out how this anachronistic disease could reemerge, Dr.
Srivastava followed the money. According to a two-year royal
commission’s inquiry into the state of Australia’s nursing facilities, the
average amount of money spent on a resident’s three meals and snacks per
day was a paltry $6.08. “This figure is significantly lower,” Dr. Srivastava
explains by comparison, “than that found in community-dwelling adults
($18.29) and prisoners ($8.25). Spending on fresh produce has declined and
spending on supplements gone up.” “What’s needed,” Dr. Srivastava
concludes, “is a multidisciplinary transparent and accountable structure of
malnutrition screening, assessment, nutritional planning, provider education
and indeed, a shift in the way we imagine what elderly and vulnerable
residents ‘deserve.’ ”3
In another column, this one on the politicization of the COVID-19
vaccine rollout in Australia, Dr. Srivastava again used a story and followed
her own sense of why. The goal of this column was to show how important
it is for Australian society to trust health care professionals to act ethically
when faced with novel dilemmas.
She begins the story by recounting an experience she had while on an
airplane. One of her fellow passengers broke out in panic. Another
passenger offered to give the woman medication that would calm her down.
When Dr. Srivastava told the pilot there was no need to divert the plane and
that the medication would likely help, the pilot’s only response was “You
can give it, but you’re responsible for the consequences.” “The pilot’s
words seem like a warning,” Dr. Srivastava continues, “so I decide not to
medicate her.” After her condition worsened, the pilot diverted the plane. In
an odd attempt at consoling Dr. Srivastava, another passenger tells her,
“You must protect yourself. Sometimes, helping others isn’t worth the
trouble.”
From there, she pivots to another story, this one about a physician in
Texas who lost his job because he administered leftover coronavirus
vaccines that would have otherwise expired to patients from his contact list.
He vaccinated a “nonagenarian, a dementia patient, a medical receptionist,
and the mother of a child on a ventilator.” He also vaccinated his wife, a
woman who suffers from a chronic lung disease. What the doctor thought
was a practical and even morally justified move ended up costing him more
than he could have imagined. The county he worked for fired him, claiming
that he should have thrown away the unused doses. Then the district
attorney pressed charges against him for vaccine theft and abuse of process
and power.
What lesson should we glean from such a story? For Dr. Srivastava, the
lesson is obvious: “Professionals who have had a front seat to the deadly
toll of the virus are resolute in their motivation to move the public towards
safety. The best thing governments can do is to get out of the way, not
micromanage the rollout, and allow the people doing the administering
discretion over their decisions. Blaming individuals won’t help;
continuously improving processes will.”4 Make it possible, she argues, for
people to help without getting into trouble—that’s what we should do next.
Dr. Srivastava’s work aims to inform the public. That is her purpose.
Data supports her work but doesn’t drive it, and story is always out in front.
That is her process.
For all the benefits that storytelling can bring to your writing, many
experts don’t follow the course set by good science writers and other
professional communicators. In our experience, scientists, analysts, economists, and other experts rarely center their writing on a story. Perhaps this is because they’re not sure what their purpose is.
They may not have asked themselves why they are communicating, after
all.
The coronavirus crisis, a case study in communicating with
data
The COVID-19 pandemic may, unfortunately, be analyzed for years to
come as a case study in the tension between communication and science.
Articles on this subject are numerous. “The scientific community,” one
scientist warned in an opinion piece published by the Boston Globe in July
2020, “through an over-abundance of caution, has not consistently provided
clarity to an understandably worried and confused public” when it comes to
coronavirus research and its implications. “While it is technically correct to
say that we don’t know if antibodies will protect us from coronavirus
infection,” Laurence Turka continues, “the likelihood is overwhelming that
they will.” If that’s true, then “why are so few of us communicating what
we know?” Probably because these writers lost sight of their why.
Turka ends his piece by imploring fellow experts to be clearer about
purpose and to stop being obtuse when presenting research findings. Yes,
it’s good to be cautious—maybe humble is the better word—and, no, we
don’t want experts making claims that aren’t supported by the evidence. But
we do want experts who are willing to fulfill the purpose of helping nervous
readers who need answers to questions they can use to make decisions,
address problems, and chart a plan for the future. This is especially true in a
crisis. “Can I absolutely guarantee that antibodies as a result of infection
will protect someone?” Turka asks. “Every rule can have exceptions, so I
suppose I cannot. But I also cannot guarantee that the sun will rise
tomorrow morning. Yet, we should plan our day around it nonetheless.”5

Tip #13. When deciding how many examples to include, remember the
power of three. Use one example if you want to show the reader how
powerful the example is. Use two examples if you want to compare and
contrast them. And to give the reader a sense of roundness and
completeness, use three. Some news organizations share “three things to
know” with their readers, and they include one data point for each. More
information would crowd the story. Readers love threes.
In February 2021, Zeynep Tufekci wrestled with similar concerns about
science writing in the Atlantic. Data wasn’t being used well to light a path
forward for an anxious public wanting guidance. Tufekci directed her
analysis at the World Health Organization (WHO), and what she derived
from the data was rather critical. At the beginning of the coronavirus
pandemic, in January 2020, the WHO—the international body responsible
for communicating effectively about health—said there was “no clear
evidence of human-to-human transmission.” What the WHO should have
said, according to Tufekci, was “there is increasing likelihood that human-
to-human transmission is taking place, but we haven’t yet proven this,
because we have no access to Wuhan, China,” the purported epicenter of
the virus.6
A similar criticism could be leveled at WHO for how it communicated
about antibodies’ capability of protecting people from contracting the virus
a second time. In the spring of 2020, WHO officials reported that there was
“currently no evidence that people who have recovered from COVID-19
and have antibodies are protected from a second infection.” The result? We’re
sure you remember: a profusion of news articles and commentary animated
by trepidation and dismay. “Instead,” Tufekci writes, WHO “should have
said, ‘We expect the immune system to function against this virus, and to
provide some immunity for some period of time, but it is still hard to know
specifics because it is so early.’ ”7 In other words, WHO officials forgot
their purpose—to collect data, use it to support a claim, and communicate
that supported claim clearly to people around the world who don’t know
how to keep themselves and their families safe. After more than two years
of living through a global pandemic, we’ve been shown time and again that
variants of the virus develop and spread, cases surge and subside, and the
knowledge we have about what works to treat the virus—and what doesn’t
—expands every day. This is not to say, of course, that WHO should have
claimed something was proven back in 2020 even when it hadn’t been, but
if WHO had started with purpose—to help people navigate uncertainties
during a crisis—then its communications might have helped us understand
that no one policy response would likely stay effective for long and that we
would need to be ready to adapt as we collected more data and followed the
science.

What data do you have? What data do you need?


The truth is that you’re not always going to have the data you need to tell
the story you want, which has been a problem for scientists since the
earliest days of the COVID pandemic. But that doesn’t mean you shouldn’t
say anything. And it definitely doesn’t mean you should hide behind a veil
of unnecessary qualifications and unfounded speculations. No reader could
possibly appreciate reading a sentence that boils down to “something could
possibly happen if this other thing that might happen also happens—
maybe.”
This was a concern at the Atlantic in the spring of 2020. What was clear
to the magazine’s editors was that the federal government didn’t have all the
data it needed at that time to communicate clearly with the public. In
response, a team of researchers began collecting and reporting what they
could. Soon enough they realized their “temporary volunteer effort” had in
fact become “a de facto source of pandemic data,” once the White House
began using it in an official capacity.8
A year later, Robinson Meyer and Alexis C. Madrigal wrote an article to
explain “why the pandemic experts failed” and how we were still not
thinking about pandemic data in a way that made sense. “Many stressed the
importance of data-driven decision making,” the authors write. “Yet these
plans largely assumed that detailed and reliable data would simply …
exist.” In addition to being unable to speak for itself, data can’t collect
itself, either, let alone organize itself into a coherent story that can be
understood by folks who haven’t studied statistics.

Tip #14. If you don’t have any data, try articulating to the reader what
kind of data would help and how it could be collected. Some refer to this
practice as “evidence-building,” which takes time, money, and inclination.
Not every problem we face will have all three things going for it.
What story can you tell with your data (or your lack of data)?
This is a question that has captivated Carl Zimmer, a science writer for the
New York Times, for the better part of three decades. “While I’m not a
scientist myself,” Zimmer wrote in the summer of 2020, “I’ve gotten pretty
comfortable navigating around them. One lesson I’ve learned is that it can
take work to piece together the story underlying a paper. If I call a scientist
and simply ask them to tell me about what they’ve done, they can offer me
a riveting narrative of intellectual exploration. But on the page, we readers
have to assemble the story ourselves.”9
In our experience working with policy students, economists, military
folks, lawyers, and others, we’ve seen this sort of thing play out countless
times. What it boils down to is the writer having lost their sense of purpose,
and when that happens, the words on the page won’t ever have much
meaning for the reader. One of the first questions we usually ask about a
first draft (other than Who are you writing this for? and What do you want
the reader to do with this information?) is simply Why is this important?
Nearly every time we ask this question, something magical happens. The
writer’s eyes twinkle. Sometimes a smile breaks across their face. They
then tell us a story. They don’t just give us a mound of data punctuated with
academic jargon.
A way of enhancing your own narrative that we use all the time is to
imagine yourself having a conversation with a reasonable person (however
you define that) who is interested in your topic. What do they know
already? What do they need to know to form an opinion, make a decision,
or take action? How does the story you tell them help them achieve their
goals? Perhaps you can answer some of these questions with the data you
have. But other times you may need to fill in the gaps with specific
examples or by describing what we don’t yet know but may reasonably
predict.

Not all policy goals are created equal


Some policy goals are so broad that measuring progress toward meeting
them is next to impossible. Take, for example, the 17 Sustainable
Development Goals (SDGs) of the United Nations (UN). These goals were
adopted by all UN member states in 2015 as part of the 2030 Agenda for
Sustainable Development. According to the UN, the SDGs provide “a
shared blueprint for peace and prosperity for people and the planet, now and
into the future.”10 Like we said, broad. Here’s how Tim Harford, author of
The Data Detective, puts it: “development experts are calling attention to a
problem: we rarely have the data we would need to figure out whether these
[SDGs] have been met.”11 In other words, the SDGs were rooted in purpose
but suffered from a procedural problem.
The Ministry for Foreign Affairs of Finland also drew attention to this
issue in 2017. How can the UN possibly track progress, the ministry mused,
when there are so many organizations and countries responsible for
collecting the relevant data? As we saw with Rebecca Barnes of Global
Affairs Canada, not everyone will gather evidence in the same way, ask the
same research questions, or have access to the same information. Making
comparisons will be difficult, if not impossible.12 Other countries,
nonprofits, and international organizations noticed the same thing: lots of
data points that didn’t necessarily speak the same language, metaphorically.
To cut through the noise and doubt, a group of statisticians from around
the world came together and decided that one target under the goal of
education deserved priority. “By 2030,” Target 4.2 of SDG 4 reads, “ensure
that all girls and boys have access to quality early childhood development,
care and pre-primary education so that they are ready for primary
education.”13
But even when concentrating on that one target, the statisticians
struggled to agree on an indicator to measure satisfactory development in
early childhood. The statisticians originally decided that data collection
should focus on determining the “percentage of children under 5 who are
developmentally on track in health, learning and psychological well-being.”
But even with that focus, they discovered, it was still too difficult to define
and measure success. Instead, the focus shifted to measuring how many
children participated in early learning programs. A potential problem with
that indicator, according to the nonpartisan Brookings Institution, was that
measuring outputs of learning programs might lead some countries to ramp
up programming—even poor-quality programming—so they would meet
the criterion. Outputs, as we have seen, don’t tell us much about outcomes.
Eventually, the statisticians compromised. “It was thus decided to keep
the developmentally on track indicator but also add the one on
participation,” Anderson later wrote.14 What story could all this data tell us?
What do we hope to achieve by tracking SDGs in the first place? Here’s
how Anderson makes sense of it all: “It is important to keep focus on those
who benefit from the SDGs the most: the children, the poor, and the
marginalized. Unless these indicators are used to significantly improve their
lives and opportunities, it won’t matter how we define a mountain, a basic
service or a city—we will have failed in this ambitious agenda any way you
calculate it.”15 The UN must, in other words, remain focused on its purpose
and encourage the collection of data that can be used to answer the
questions for which we most want answers. Once that purpose is clear, the
UN can measure various indicators and explain what these numbers tell us
and what they do not.

Make sure whatever you’re comparing is commensurable


If a researcher is aiming to understand how education around the world has
changed over time, it may make sense to investigate whether developing
countries are catching up to developed countries in educational outcomes. If
so, the researcher might choose to examine data collected by the Programme for International Student Assessment (PISA), administered under the auspices of the Organisation for Economic Co-operation and Development, which uses data from PISA to measure the ability of 15-year-old students
“to use their reading, mathematics and science knowledge and skills to meet
real life challenges.”16 Because the data is collected the same way for every
country, the researcher could compare PISA scores for students in the
United States with scores of students living in other countries. That
comparative analysis should work fine.
As another source of data, the US Department of Education tracks
educational achievement through the National Assessment of Educational
Progress (NAEP). NAEP scores measure how well elementary and
secondary students in the United States perform in subjects ranging from
civics and reading to US history. Perhaps our hypothetical researcher sees a
trend in NAEP scores and wants to compare the United States with another
country, so the researcher pulls data from PISA. That comparison won’t tell
the researcher what they want to know, however, and may confuse their
readers if presented. Because PISA and the NAEP ask different questions,
study students at different ages, and use different indicators, comparing
their incommensurable data sets would result in flawed findings. This
flawed process won’t serve the researcher’s purpose.

Tip #15. Before comparing data sets, check first to see if the data sets
were collected and analyzed in similar ways. Consider whether your data
points will “speak” well to one another; that is, were they measured in the
same manner, in the same time period, by the same organization? If they
were not (and bringing them together would amount to “comparing apples
to oranges”), explain to the reader what comparison or contrast can be
reasonably made—and what cannot.

If our researcher felt an urgency nonetheless to say something about the two data sets, they could tell a story explaining what we can learn from the
NAEP data and what we can learn from PISA data, separately. It boils down
to this: if you’re honest with your readers throughout your research and
writing process, and transparent about where your data comes from and
what it means, your readers will appreciate the nuance you provide.

Instead of “dumping” data, try layering it


When writing with data, two paradoxical needs often arise. On one hand,
we want our sources of data to be comparable; on the other hand, we don’t
want to overwhelm our readers with too much data and analysis. Finding
the right balance between these needs, for many writers, is easier said than
done. But remember, because busy readers can take in only so many
numbers at a time, we must make strategic decisions about which data to
use. Earlier we said that readers like threes, but what we should have added
is that readers don’t need or want more than three. Sadly, even writers who
know to tell stories driven by purpose sometimes resort to dumping giant
piles of data on the reader because they think that’s what transparency and
credibility look like. For an example of this, let’s look at a 2016 report
published by Pew Research Center titled Many Americans Believe Fake
News Is Sowing Confusion. Here’s how the authors begin:
In the wake of the 2016 election, everyone from President Obama to Pope Francis has raised
concerns about fake news and the potential impact on both political life and innocent
individuals. Some fake news has been widely shared, and so-called “pizzagate” stories led a
North Carolina man to bring a gun into a popular Washington, D.C. pizza restaurant under the
impression that it was hiding a child prostitution ring.17

Then, in a misguided effort to share their most important research findings, the authors pile up the data points, thereby limiting their potential impact on the reader:
According to a new survey by Pew Research Center, most Americans suspect that made-up
news is having an impact. About two-in-three U.S. adults (64%) say fabricated news stories
cause a great deal of confusion about the basic facts of current issues and events. This sense is
shared widely across incomes, education levels, partisan affiliations and most other
demographic characteristics. These results come from a survey of 1,002 U.S. adults conducted
from Dec. 1 to 4, 2016.
Though they sense these stories are spreading confusion, Americans express a fair amount
of confidence in their own ability to detect fake news, with about four-in-ten (39%) feeling
very confident that they can recognize news that is fabricated and another 45% feeling
somewhat confident. Overall, about a third (32%) of Americans say they often see political
news stories online that are made up.18

What are the authors trying to accomplish with this story? We think it’s
safe to assume they want to educate the reader about an important policy-
related problem. Or perhaps the point is to raise alarm about a set of
disconcerting developments. If their purpose is to convince the reader that
there truly is a problem here, adding more data points isn’t necessary.
Individual data points are like ice cream. One scoop is probably plenty for
most of us. Two scoops will fill you up. Three might give you a
stomachache. What about more than three? Well, nobody needs more than
three scoops of ice cream.
“What I try to avoid is too many numbers,” Dr. Srivastava says: “not
more than one or two figures, and not too many graphs.” Applying this
thinking to the Pew passage, note the difference it makes when only two
data points are mentioned:
The majority of Americans feel they can recognize fake news—39 percent are very confident
and 45 percent are somewhat confident in their ability to do so.

Clear and simple.


As we have seen, data itself is not the story. The story is that most
Americans think they can spot fake news. How do we know that? a
skeptical reader may ask. Here are two data points that support the story.
Now we can move on to more important questions: How many Americans
do recognize fake news? How many think they can spot fake news and yet
will read it anyway?

Tip #16. No “data dumping” allowed! There’s a tremendous difference between what you could say with all your data and what you should say.
Much of what you find in your research will most likely serve the reader
better as subtext that informs the core message of your story.

The temptation many of us face when writing with data is to show the
multiple layers of what we found. Researchers working on issues related to
income disparities, for example, may be tempted to detail many sorts of
income gaps—between men and women, white and Black people, rural
residents and urban, those with a college education and those without one—
to provide what they believe is a fuller picture. The risk they run, however,
is that taking this comprehensive approach could end up working against
their intended purpose by losing the point of the story in details.
One important question all communicators need to ask before they add
more data to their story is this: Does an additional layer of data advance the
story, or does it simply restate or reinforce a point already made?
Tip #17. When layering on data in your story, make sure each
additional data point expands the story you’re telling. Try not to
unnecessarily reiterate a point you’ve already made.

Giving your reader too many reasons to accept your argument can
actually make it less persuasive to some, according to psychologist and
author Adam Grant. In his 2021 book, Think Again: The Power of Knowing
What You Don’t Know, Grant shows that providing more reasons to a reader
can make you seem more credible but only to a reader who has already
been persuaded or who is sympathetic to your perspective. If you want to
reach a reader who may be skeptical of, or even hostile to, your argument,
you should focus on quality over quantity. “If they’re resistant to
rethinking,” Grant writes, “more reasons simply give them more
ammunition to shoot your views down.”19
The same can be true when communicating with data. Each additional
data point you offer increases the likelihood that the reader will get stuck in
the weeds or lose sight of your meaning and purpose. We see this all the
time in well-intended policy writing. In one example, from a research report
about gender disparities in education among African nations, the authors
had an abundance of data to choose from:
Education levels are rising—almost all girls and boys are enrolled in primary schools in all
regions, and globally more than 70 percent of children are enrolled in secondary schools.
Completion rates at the primary level are also on the rise globally. Of 173 countries with data,
almost half have completion rates of 95 percent or higher. Over the last decade, completion
rates rose from 78 to 87 percent for girls, and from 84 to 90 percent for boys. However, in the
least developed countries, around 41 percent of children are enrolled in secondary school and
fewer girls than boys are enrolled (14 percent).20

While all this data is interesting and important, no doubt, it is too much
for a busy reader to take in all at once. If the authors had clarified
their purpose before committing words to page, they’d be much less likely
to bury their story under all those layers of data. They would instead be able
to focus on meeting readers where they are. For example, assuming the
purpose of this paragraph was simply to show where progress has been
made and where more is needed, the authors could have written something
along these lines instead:
While education levels are rising globally—more than 70 percent of children are enrolled in
secondary schools—the developing world continues to lag behind. In the least developed
countries, only 41 percent of children are enrolled in secondary schools. And fewer girls are
enrolled than boys.

If your purpose is impact, your process should be to simplify what you know.
Simplicity is important, but so too is having a full, coherent view of a
situation. To demonstrate this point, let us tell you a story about the United
States Information Agency (USIA). Before it was disbanded after the
collapse of the Soviet Union, the USIA used polls to help policy makers
understand public opinion of America abroad. In one interesting poll,
conducted soon after the Cuban Missile Crisis in 1962, the USIA asked
citizens in the allied countries of Great Britain, West Germany, France, and
Italy a seemingly simple question: “Which country do you think is ahead in
military strength in nuclear weapons?”21 Across the board, our allies
reported that they believed the United States was beating the Soviets in this
regard. But that was not the whole story. When it came to the peaceful
outcome of the Cuban Missile Crisis, more respondents believed it was
“moderation on the part of the Russians” that made a larger contribution
than the “greater military strength” of the United States.22
If pollsters had asked about military strength only, the story would have
been incomplete, owing to nuance left uncaptured. Many people living in
allied countries might have agreed that US military strength had grown over
time, but that strength didn’t necessarily improve international security,
which was undoubtedly a far more important consideration for Western
Europeans than whether the United States was winning the arms race of the
Cold War. Western Europeans could well have supported such military
growth yet have also questioned, at the same time, the capacity of US
nuclear weapons to keep them safe in the absence of a moderate Soviet
government. This whole story is necessary, especially because the pollsters
working for the USIA were communicating with the US Department of
Defense and, directly or indirectly, with President John F. Kennedy.

What do you make of multidimensional views?


In their book Anti-Americanisms in World Politics, Peter J. Katzenstein and
Robert O. Keohane write about this idea of “multidimensional views,”
meaning that people can hold two views, seemingly in conflict or at least in
variance, at the same time. Such multidimensional views of the United
States were on full display around the world after the terrorist attacks of
September 11, 2001. “People can simultaneously say that they dislike the
United States,” Katzenstein and Keohane maintain, “and believe that
emigrants from their country to the United States generally have a better
life than those who remain.”23 Such multidimensional outlooks may require
multiple polls (and other methodologies as well), and multiple data points,
before a viable story can be crafted with the results.
When asking a leading question—What is your opinion of the United
States?—pollsters may miss out on nuance. Such leading questions,
Katzenstein and Keohane explain, could elicit overstated “levels of pro-
American support because subjects might deem it inappropriate to display
wholesale rejection of a country publicly, thus hiding the predispositions
that truly condition their political behavior.”24 (Recall tip #10: ask better
research questions.) This is what researchers refer to as “the politeness
norm.” In addition, be warned that polls can “even create the ‘attitudes’
they report, since people wish to provide answers to questions that are
posed.”25
Instead of priming survey subjects’ attitudes with leading questions, look
for multidimensional views and the stories they support. For example, polls
taken in eight Islamic countries after 9/11, but before the United States
invaded Iraq in March 2003, demonstrated multidimensionality: “Almost 82
percent of respondents held favorable opinions about U.S. science and
technology. About 65 percent thought positively about U.S. education,
movies and TV, and commercial products. Only 47 percent held favorable
views of U.S. ideas of freedom and democracy or the American people.”26
In other words, one can like scientific invention more than the institutions
where scientists are free to invent. To tell this story, one data set simply will
not do.
When he took office in January 2021, President Joseph R. Biden stated
that winning back the favorable opinions of allies was a priority for his
administration. The thinking, we’re sure, goes something like this: If our
allies won’t work with us, who will? But is this thinking quite right? Is it
based on a faulty research question? Looking back at the war in Iraq,
Katzenstein and Keohane note that there seemed to be “no indication that
European-American cooperation in organizations such as the IMF
[International Monetary Fund] or World Bank, on development issues more
generally, [had] been stymied by European anti-Americanism.”27 In other
words, some European officials may not have been happy with American
military action, but they were still willing to work with America on
important global issues such as development.
Just like the Iraq War, the 2022 Russian invasion of Ukraine
demonstrates that alliances are elastic and multidimensional. After the
tumultuous relationship President Donald J. Trump had with America’s
allies in the North Atlantic Treaty Organization, it is not surprising that
tending to these alliances was a priority for President Biden. But even so, it
was a surprise to many to see how quickly the alliance could deploy
military equipment and supplies for Ukraine and enact unprecedented
economic sanctions against Russia. This was likely a shock to President
Putin. His miscalculation of the multidimensional attitudes of alliances was
one of many causes for the suffering and destruction of the war in Ukraine
—a wicked problem the world has not yet solved as we write these words.

Tip #18. Know the difference between percentage changes and proportions. A percentage change and a percentage point change are two different things. When you subtract numbers expressed as proportions, the result is a percentage point difference, not a percentage change.
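
To see the difference in practice, here is a minimal back-of-the-envelope sketch in Python; the dropout figures are hypothetical, chosen only to show how the two calculations diverge:

# Two hypothetical dropout rates, expressed as proportions
rate_two_years_ago = 0.091   # 9.1 percent
rate_now = 0.096             # 9.6 percent

# Subtracting proportions gives a percentage POINT difference
point_change = (rate_now - rate_two_years_ago) * 100

# Dividing the change by the starting rate gives a percentage change
percent_change = (rate_now - rate_two_years_ago) / rate_two_years_ago * 100

print(f"{point_change:.1f} percentage point increase")   # 0.5 percentage points
print(f"{percent_change:.1f} percent increase")          # about 5.5 percent

The same underlying change reads differently depending on which figure you report (half a percentage point versus a 5.5 percent increase), so tell the reader which one you mean.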
Another perplexing example of multidimensional data comes from a
September 2020 study conducted by Pew Research Center. What Pew found
was that 82 percent of respondents agreed that government investments in
scientific research are usually worthwhile. Pew also reported, however, that
a median of only 36 percent of respondents said they have “ ‘a lot’ of trust
in scientists.” What do such incongruent data points mean? An effective
communicator won’t assume their reader knows how to process such
information. They will instead tell a story that shows the reader that people
are complex (and sometimes behave in ways that seem irrational); when
they are polled, they often express opposing views, such as wanting more
scientific invention but worrying about what scientists do. Here’s how the
authors of the Pew report made sense of it: “While there is generally a
positive tilt toward public trust in scientists, trust often varies with ideology.
In general, those on the left express more trust in scientists than those on the
right.”28 Now that’s the story.
Katzenstein and Keohane noticed a somewhat similar correlation in their
studies of foreign opinions of the United States: “respondents that had a
positive view of modernity were usually more inclined to positive attitudes
towards the United States.”29 And not surprisingly, “dislike of American
movies, television and music … was significantly higher among the
respondents with traditional views about the place of women in society” in
most regions.30 But let’s not confuse cause and effect. Promoting more
American movies abroad in regions with traditional views will not
necessarily change minds. Promoting movies is an output, not an outcome.
Another example of puzzling multidimensional views occurred in the
spring and summer of 2020, when many scientists believed that, after
suffering through the pandemic in a variety of ways, most people would
gladly get vaccinated when an effective vaccine became available. But
that’s not how many felt—not by a long shot, partly because many people
around the world hold multidimensional views on vaccination. Take France
as an example. According to Professor Antoine Bristielle, who studies how
new means of communication affect young people’s relationship to politics
in France, those on the Far Left and the Far Right tend to be more hesitant
about vaccination than those in the center; additionally, “two other factors
largely explain the (lower level of) acceptance of a vaccine against COVID-
19 within the French population: confidence in political institutions and
confidence in scientists.”31 The story, it seems, is more complex than French
anti-vaxxers not wanting to get vaccinated because they mistrust science.
To get a fuller picture of what lay behind vaccine hesitancy in France,
we again must turn to purpose. If the purpose is to persuade more French
people to get vaccinated, omitting what is known about the role of science
and government in French society—and the French people’s relationship to
science and government—will not be persuasive to already mistrusting
individuals. Acknowledging multidimensional views will likely come
across as more credible and have more impact as well.
Leaders from around the world suffered from ups and downs when
communicating about the pandemic to their people. France’s president,
Emmanuel Macron, was no exception. In February 2021, he said that
AstraZeneca’s vaccine was “quasi-ineffective” in people over the age of
65.32 The British press piped up quickly, as that vaccine was widely
administered in the United Kingdom to advance the populace toward herd
immunity. The issue, in fact, was not “effectiveness” but rather
that too few people aged 65 and older had been included in the trial to prove
effectiveness for that age group. That does not mean, though, that the
vaccine doesn’t work in older people. The United Kingdom pressed its case
further, and Macron was criticized for misleading people about what the
data supported.

Sometimes not using all the data you have makes the most
sense
In 2017, the Ministry of Foreign Affairs in Finland convened a roundtable
to discuss the role of data in international affairs. At the time there was a
sense that “data obtained from social media platforms [could] serve as a
basis for sentiment analysis towards particular issues, regions or countries.”
It now seemed possible, in other words, for government officials from
around the world to log on to Twitter or Instagram and gauge public opinion
on issues that previously required expensive and time-consuming polls.
What the researchers discovered, of course, was that it wasn’t that simple.
As the resulting report noted, “although big data can pinpoint trends … it
has limited predictive power.” The report’s authors went on to claim that
“big data does not always paint an accurate picture of society, as it usually
over-represents those who have access to the Internet and digital devices.
The representative bias might become even more prominent when
analyzing data from certain online platforms … as the analysis will over-
represent certain demographics that are active in framing online
discussions.”33 That is to say, data pulled from social media to understand
opinion on international issues illustrates the opinions of those who chose to
post on their social media accounts. Who are they? Are they even real
people? If you gather your data from social media, what sort of people are
likely omitted from the sample? Does telling a story based on such data
help you fulfill your purpose? In the end, sometimes the right thing to do is
not use all the data you have, especially if it doesn’t help you achieve your
purpose.

When in doubt about data layering, remember that context is key
When you are trying to tell a nuanced story with data, give the reader the
information they need to make sense of the data. How large is that number?
Compared to what? How bad is this problem? Compared to what? How
long has it been getting worse? Compared to what? By contextualizing our
data in this way, we can clarify what it means to the reader and why they
should care. For example, on April 9, 2020, British media outlets reported
that 887 people had died from COVID-19 that day. At the time, that number
sounded like a lot of deaths, but when Tim Harford read that statistic, his
statistician’s mind became skeptical. The total number of deaths, he later
mused, was probably closer to 1,500. Why such a discrepancy? Because the
context around that 887 figure was missing: “Partly because some people
died at home and statistics represent only those who died in a hospital,” he
writes in The Data Detective. “But mostly because these overstretched
hospitals are reporting deaths with a delay of several days.”34 And what
about people who died at home that day from other illnesses, afraid to go to
the hospital on account of COVID?
Here’s a similar example from one of our classes: A student began a
policy brief with a sentence she thought would grab the reader’s attention:
“Our state’s high school dropout rate is 9.6 percent of all students, up from
9.1 percent two years ago.” In her mind, she had succeeded in telling the
story because she showed how the dropout rate was rising. To her, the
increase in the rate of high school dropouts was alarming—just like that
887 figure was at first glance. But feedback from her peers was telling.
They asked about base rates. What do the dropout rates in neighboring
states look like? What have they been historically? Perhaps the rate had an
uptick after having decreased over decades prior. They also asked about raw
numbers. How many kids are we talking about in terms of the change over
the last two years? Does the number of dropouts tend to fluctuate across
years? How bad is this, really? Simply citing two numbers doesn’t tell the
whole story because—you guessed it—data can’t speak for itself. It’s the
context that will help your readers see what you mean.
In The Data Detective, where Harford lays out his rules of thumb for
understanding statistics, he notes that big numbers don’t usually tell much
of a story. For example, he reflects on President Trump’s proposed border
wall between the United States and Mexico, which at the time he proposed
it was estimated to cost some $25 billion to build. Harford asks his readers
whether that’s a big number—or not. “It certainly sounds biggish,” he
writes, “but to really understand the number you need something to
compare it with. For example, the U.S. defense budget is a little under $700
billion, or $2 billion a day. The wall would fund about two weeks of U.S.
military operations.”35 Now the story becomes not just about the cost of the
wall but about whether building the wall is a reasonable way to spend two
weeks’ worth of the defense budget. Even for those who were firmly
against building the wall, that $25 billion price tag seems much smaller
when juxtaposed with yearly US military spending.

Tip #19. If a huge, difficult-to-grasp number is important to your story, help the reader visualize its epic size by converting it into something
they can more easily comprehend. In a story about water waste in
Arizona, for example, we might want to point out that the state’s annual
groundwater overdraft (the amount sucked out of the aquifers in excess of
natural recharge) is about 2.5 million acre-feet. But what is an “acre-foot”?
This large, incomprehensible number becomes a little easier to understand
when we tell the reader that an acre-foot equals just under 326,000 gallons
of water. But even that is hard to picture. So, what if we tell the reader that’s
enough water to fill about 1.2 million Olympic-sized swimming pools? No
one’s ever seen 1.2 million Olympic-sized swimming pools, though most of
us have seen at least one, and we know that 1.2 million is A LOT of pools.
Better, right?
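
If you do this kind of conversion often, it helps to keep the arithmetic explicit so a reader (or an editor) can audit it. Here is a minimal sketch in Python of the Arizona example above; the conversion factors are standard approximations, and the pool volume of roughly 660,000 gallons is an assumption, not a figure from the original:

GALLONS_PER_ACRE_FOOT = 325_851      # one acre-foot is just under 326,000 gallons
GALLONS_PER_OLYMPIC_POOL = 660_000   # an Olympic-sized pool holds roughly 660,000 gallons

overdraft_acre_feet = 2_500_000      # Arizona's annual groundwater overdraft, from the example

total_gallons = overdraft_acre_feet * GALLONS_PER_ACRE_FOOT
pools = total_gallons / GALLONS_PER_OLYMPIC_POOL

print(f"about {total_gallons / 1e9:.0f} billion gallons")       # about 815 billion gallons
print(f"about {pools / 1e6:.1f} million Olympic-sized pools")   # about 1.2 million pools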

We often hear from writers that they choose not to provide context
because they’re afraid they’ll bore the reader or tell them something they
already know. In our experience, such concerns are mostly unfounded.
Neither one of us has yet met a decision maker who got annoyed over the
inclusion of a sentence or two that explained the context around a data
point. In fact, providing context can serve as a useful reminder to the reader
or give them an opportunity to understand the trend you identified in a new
way. Ultimately, decision makers expect your analysis to have some
background and context. So don’t shortchange them.
As an illustration of this point, CBS ran a story on 60 Minutes in November 2020 about Operation Warp Speed, the American public-health moonshot to expedite the development and distribution of a COVID-19
vaccine. Journalist David Martin interviewed four-star general Gustave F.
Perna, who had been tapped by the Trump administration to run the
operation. Martin followed Perna around his office, learning about the
wicked challenges he faced. When looking at Perna’s desk, Martin noticed a
cheat sheet with bureaucratic acronyms. Martin was bemused to learn that
the former army supply officer had a “steep learning curve to master the
jargon of the pharmaceutical industry.” Perna was not shy about it. “I listen
every day to what is said,” he explained, “and then I spend a good part of
my evening googling these words.” Googling the words! Perna was the
decision maker, and he was having to bridge his own area of expertise
(logistics and personnel for the US Army) with something new (Big
Pharma). Surely he would have appreciated receiving definitions and
context from his staff to help him learn the lingo.36 They may have wrongly
assumed, though, that the boss already knew.

The need for context extends to data visualizations


Context is not only essential when writing with quantitative data but also
when representing data in graphs, figures, and other types of visualizations.
Just as writers often dump their data into strings of sentences with little
contextualization, they may likewise dump their data into a pie chart or
histogram and expect the graph to do the talking for them. They’ve
forgotten their purpose and the need to communicate with readers on their
level. Writers often overestimate how clear the story they’re trying to tell is
to a reader who doesn’t share their knowledge.
When a data visualization substitutes for a lengthy written explanation, use figure captions to explain what story you’re telling with
the visualization. What does the graph mean? Why should we care? In
short: What’s the story?

Tip #20. Humanize the scale of the math for your reader. Change “Of the $246.8 billion in retail spending last year, consumers spent $86.4 billion on cars and car parts” to something like “Of every $100 spent in retail last year, consumers spent $35 on cars and car parts.”
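
The underlying move is simple rescaling: divide the part by the whole and express it against a base the reader can picture, such as $100. Here is a minimal sketch in Python using the retail figures from the tip:

total_retail = 246.8e9   # total retail spending last year, in dollars
cars = 86.4e9            # spending on cars and car parts, in dollars

per_hundred = cars / total_retail * 100
print(f"Of every $100 spent in retail last year, about ${per_hundred:.0f} went to cars and car parts")
# prints: Of every $100 spent in retail last year, about $35 went to cars and car parts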

One good example of how to contextualize data visualizations comes from an opinion piece by Admiral James Stavridis, the former supreme
allied commander of NATO, in Bloomberg News. His piece is about rare
earth elements, which may sound off-puttingly technical, but he does not
presume his reader understands or cares about his topic: “You could be
forgiven if you are confused about what’s going on with rare-earth
elements,” he begins.37 Not to worry, though; he will soon fill in the missing
pieces. The problem, as Stavridis explains it, is that China controls roughly
80 percent of rare-earth-mineral markets, and if China wanted to, it could
restrict the world’s supply, which “it has repeatedly threatened to do.” If
China took such a bold step, according to Stavridis, it would “create a
significant challenge for manufacturers and a geopolitical predicament for
the industrialized world.” (Rare earth minerals are used in manufacturing
smartphones—something Stavridis probably mentioned to strike home the
message to his readers; many of whom probably read his words on a
smartphone.) This story leads the reader to the numbers and frames the data
in a way that helps them understand.
One graph included in the piece is a pie chart of leading rare-earths-
producing nations. The graph employs a useful caption: “China dominates
global production of mined rare earths.” Good captions, like this one, often
have a subject (China) and an action verb (dominates). The data represented
in the chart, in turn, serves to support the claim made in the caption. The
next graph in the piece is a bar chart comparing China’s production to other
countries, from 2009 to 2020, and again the caption tells the story: “China’s
2010 attempt to use rare earths as a political weapon inspired other
countries to ramp up production.” A caption like that tells the reader the
purpose of sharing the graph, the why, rather than what the data is. An
alternative caption, such as “Rare earths production, 2009–2020,” would
not only miss an opportunity to tell a story; it also would leave readers
unmoved. The choices you make in collecting and communicating with data
—your process—should always serve your purpose.

Tip #21. Make sentence subjects concrete and verbs action-oriented whenever you can. Readers grasp complexity more easily when they can
picture who is doing what, and the best way at the sentence level to show
clearly who is doing what is to name a person or other entity as your subject
and pair that subject with a verb conveying action.

When you don’t have the data you need, collect it


Well-collected and clearly communicated data can shine a light on
important global issues. But what happens when data is out of reach? Do
we know the true number of deaths from COVID in April 2020? And what
about Rebecca Barnes from Global Affairs Canada? How can she fulfill her
diplomatic purpose to communicate with Parliament when she can’t access
information across various computers?
Not surprisingly, one of Barnes’s main goals has been to create a culture
of evidence building in Canadian diplomatic circles. But in her desire to
create “one centralized place” to collect data, Barnes ran into another
problem: the fear of her colleagues. Many diplomats are afraid to use data,
Barnes told us, as their profession is built on collaboration and
relationships, not quantification. “When you put down a number,” Barnes
explains, “you worry that managers and colleagues will have a different
number.” This fear can lead to hedging, or “vagueness in policy making,”
Barnes says.38 Quantifying something incorrectly is perhaps the biggest fear
of all. It can feel safer to avoid data altogether and argue for solutions based
on hunches and past practices.
South of the Canadian border, the same problem exists. Various reports
assessing the work of the US State Department have called for the
executive-branch agency to tackle its risk-averse nature when it comes to
collecting and using data. “As is the case with almost all bureaucracies,”
one report notes, “suggestions of limited or negative outcomes may inhibit
future funding and administrative support. This creates a climate that
inhibits realistic evaluations, and evaluations in general. In the current
environment, it is hard to imagine how critical, forward-looking research
designs could be implemented given existing cultures of fear and risk-
aversion.”39
We are not recommending an evidence-building culture where data
trumps all reasoning. Not at all. It is, in fact, incredibly important to
distinguish where data supports purpose from where it may not capture
reality or lead to creative solutions. Some trials in becoming evidence based
are expensive, and learning from them can be painful. The State
Department also admits to shortcomings of transparency in its past use of
data: “Evaluations should be written in a balanced manner that highlights
the successes and failures of particular campaigns and activities. Research
units need the authority to make such guidance, and leadership must
encourage analytical products to be seen as constructive rather than
punitive.”40
Barnes believes that agencies should “teach managers to encourage data
in policy decisions. This is useful because data forces you to make
decisions; it doesn’t let you waffle.” For Barnes, data can allow a
communicator to feel credible and be concerned, simultaneously. She
argues that the purpose of data collection and reporting shouldn’t be to
replace human decision-making; rather, “it should be part of it, adding a
quantitative element, but without over-categorizing everything.”
Aversion to data comes with a cost, especially when data gathered in the
field of diplomacy can take time to evolve. The Cold War provides
examples of how painfully slow this process can be. Only when the Berlin
Wall fell did the US government gain full access and begin to study what
had happened on the other side of the Iron Curtain, where approximately
“30%–40% of the adult population had heard Western radio broadcasts.”41
The USIA, the agency responsible for this broadcasting, had been criticized
“for failing to measure the contribution of its exchanges to actual foreign
policy goals” and had its duties absorbed by the State Department in 1999.42
The data demonstrating its impact did not become available until two
decades later. Certainly, this information would have been helpful in
making a case to Congress for why international broadcasting was so
important (and needed more funding) in a world divided by bipolarity. But
back then, that data was unavailable—or “dark,” in Tim Harford’s parlance.
Harford explains that dark data happens when “we know the people are out
there and we know that they have opinions, but we can only guess at what
those opinions are.”43 When evidence building, make sure you’re thinking
about what data out there may be dark. We don’t know what we don’t know,
but that doesn’t mean we shouldn’t keep looking. That too is our purpose.

PART III

PERSISTENCE

Using Data to Solve Wicked Problems with Integrity

Statistics can’t tell you everything you need to know


One day in October 2012, a 40-year-old man named Skandaraj Navaratnam
went missing. At the time of his disappearance, he was living in Toronto. A
month later, the Toronto Police Service convened a task force with the goal
of finding out what had happened to Navaratnam. They began by
questioning his friends and acquaintances—anyone who might have a clue
as to where he went or who might have wanted to hurt him. Navaratnam,
the detectives learned, had been working as a landscaper, so they questioned
his boss, an unassuming 60-year-old landscape designer named Bruce
McArthur. More than just coworkers, McArthur and Navaratnam had also
been on-again, off-again lovers for years. In Canada, from 2014 to 2019,
law enforcement solved nearly 2,000 murders of non-Aboriginal people.
About 63 percent of the time, the killers were family members (including
intimate partners) or close friends (including authority figures and business
partners) of the victims.1
McArthur’s neighbors knew him as “Santa” because each holiday
season, with landscaping jobs being few and far between, McArthur would
dress up as the jolly fat man and take toy requests from hundreds of
neighborhood kids at a local shopping mall. Those who knew him well told
the task force that McArthur was a twinkle-eyed grandfatherly type—the
“kindest person I’ve ever known,” one said—who gifted roses to his friends
on their birthday.2 After questioning McArthur and uncovering no evidence
linking him to Navaratnam’s disappearance, Toronto police ruled him out as
a suspect. In April 2014, the task force was disbanded.
According to the Royal Canadian Mounted Police, from 2015 to 2018,
nearly 300,000 Canadians were reported missing.3 In 2019 alone, nearly
33,000 adults disappeared, though about 90 percent were found or returned
home within a week or so. But that still leaves thousands of people each
year who simply vanish, never to be found.

Tip #22. Avoid presenting proportions in parentheses. Instead of writing,
“In 2004, older people still in the workforce were somewhat more likely
than younger people to report doing unpaid family work (12 percent vs. 4
percent),” try “In 2004, about 12 percent of older people who are still
working reported doing unpaid family work, compared with 4 percent of
younger people.”

All too often, this is where analysis ends. Once all the missing persons
reports are entered in a database, with data sorted by province and by city,
populated by searchable variables, like gender, race, and age, the only story
we can feel totally confident telling is one of crisis. Plain and simple. Too
many people go missing each year; many cases remain unsolved. In order to
find solutions, what’s needed is persistence.

Good data can help solve the problem, not just describe it
What if in Canada and elsewhere more data analysts used what they could
find out about missing people to tell a more helpful, albeit more
complicated, story? That’s the sort of question Sasha Reid was asking
herself three years after Navaratnam disappeared, while she was building
two databases: one for missing persons and another for unsolved homicide
cases in Canada. Reid teaches psychology at the University of Calgary, and
of all the interesting phenomena a psychologist could study, Reid chose
missing persons and serial killers. For as long as she can remember, she’s
wanted to know how many of Canada’s missing persons may have fallen
victim to a yet-to-be-discovered serial killer. She’s fascinated by serial
killers, mostly because they didn’t strike Reid as irrational. There was a
method to their madness, and she needed somehow to uncover what makes
them tick. How do serial killers perceive the world around them? she
wondered. How do their perceptions affect their motivations to kill?
Perhaps most importantly, what can be done to stop them once they’ve
started? Can we use data to find the solutions, not just describe the
problems?

Tip #23. Continually remind yourself that to be persuasive, solutions
need to address a specific problem the reader cares about solving. Many
policy writers feel compelled to lead off their writing with innovative
solutions. But those solutions will fail to entice a reader until they realize
there’s a problem they should care about. Save your solutions for after
you’ve built urgency around the problem you’ve identified.

Among all the data Reid had collected—thousands of names,
demographic details, GPS coordinates from last-known locations, and more
—she noticed something out of the ordinary one day: three names and three
photos of Brown, bearded men, all from a neighborhood in Toronto known
as the Gay Village. Skandaraj Navaratnam was the first name. The other
two were a 58-year-old man named Majeed Kayhan and a 42-year-old man
named Abdulbasir Faizi. Both had immigrated to Canada from Afghanistan.
After googling the men’s names, she found some obscure news articles and
several blog posts written by people from Toronto’s LGBTQ+ community.
By then, five years had passed since Navaratnam’s disappearance.
Reid entered what she noticed in her database. “It gives me a good
scope,” she says of the mixed quantitative and qualitative data she collects.
“It humanizes the data to add that qualitative information. These are people
with stories and lives—and it’s so important that they don’t get lost in the
numbers.”4
Reid then turned to the other database she created, which includes
hundreds of data points on serial killers from around the world, dating back
to the fifteenth century. The list of 6,000-plus killers she’s collected data on
begins alphabetically with “Abbott, John Henry,” an American author who
fatally stabbed two people, and ends with “Zwanziger, Anna,” a German
housekeeper who confessed to poisoning people with arsenic in the early
1800s. Besides basic demographic information, Reid and a team of nearly
40 student researchers have entered more than 600 other data points,
including information on the killers’ medical diagnoses, reasons for their
first arrests, and the site(s) where they disposed of their victims’ bodies.
“We collect data on all manner of things,” Reid says, “from conception to
crime, arrest to death.”
It didn’t take long for Reid to develop a data-driven profile for a killer
who may be targeting Brown gay men in Toronto. The killer, Reid predicted
from her analysis, would be gay himself, a male a “little older” than 30
years (because many of the gay serial killers in Reid’s database started
killing later in life than heterosexual serial killers), and he would work a
blue-collar job and have a history of violence. Reid also believed, based on
a statistical comparison, that the killer would likely be sexually motivated
and that he would be a person of color, considering that most serial killers
select victims of their own race. Once her preliminary profile was complete,
Reid called the Toronto Police Service. The officer she spoke to was mostly
unimpressed with what she had deduced. And like the residents of Toronto’s
Gay Village who had repeatedly raised concerns about a potential serial
killer in their midst, Reid was not taken seriously. The officer thanked her
for the tip and said it would be taken into consideration. She never heard
another word about it.

Creating data-informed solutions is important but not easy


If it were easy to use rigorous data analysis to solve problems like those
posed by missing persons and serial killers, everybody would already be
doing it. But it’s not. The Canadian government can attest to that. Back in
the 1990s, a police unit known as Project KARE created a database to aid
investigations into suspicious deaths in Edmonton, Alberta. That database
was limited, however, by its restriction to jurisdictional boundaries. “The
data I’ve collected,” explains Reid, “shows that serial killers and sex
offenders rarely confine themselves to a single police jurisdiction.” When
offenders cross jurisdictional lines, it becomes difficult for law enforcement
to make the links necessary for identifying a single serial offender at work,
if data collection and analysis are limited to their respective jurisdictions.
To address such limitations, the Royal Canadian Mounted Police
implemented the Violent Crime Linkage and Analysis System (ViCLAS).
Since its implementation in 1995, police have used ViCLAS, a
computerized investigative aid, to increase information sharing across
jurisdictions. “But it too has a serious limitation,” Reid says. “The
reliability of ViCLAS depends on the quality-control processes associated
with what variables are coded.” In Canada, police officers code the
variables themselves. In contrast, European countries that use ViCLAS
employ specially trained crime analysts working in a specialized unit to do
this work. “Coding of human behavior into quantifiable and standardized
responses is a complex and difficult process,” Reid explains. “And when
there is less reliability in how the data was coded, there is less reliability in
the modeling police use to connect crimes.”
The second database Reid created to address analytical shortcomings in
policing she calls the Serial Homicide Database. The purpose of this
database is to identify the origin of the psychopathology that underlies
serial killers’ homicidal motivations and to recommend action plans to
manage and mitigate the threat posed by such individuals. Did the future
killer grow up in a house with lead-based paint? Was the father an
alcoholic? Was the mother using drugs or drinking during her pregnancy?
When looking into the future killer’s childhood, Reid seeks information on
whether they were born with any abnormalities. Were there complications
during birth? Reid’s team, which is made up of students from diverse
backgrounds and disciplines—including history, Native studies, sociology,
psychology, and law—supplies as many variables as possible. “We collect data
hundreds of data points,” Reid says, “enable researchers to examine the
mental health status of the offenders longitudinally, which may be
important for developing risk assessments for serious violent offenders in
the future.”

Data is a tool; it can be used for good or bad


In 2011, Pasco County sheriff Chris Nocco presented his Florida
community with a data-driven plan to reduce crime and bias in policing by
using data to stop crimes from being committed in the first place. What he
promised and what he delivered, however, were two different things. “What
he actually built,” write Kathleen McGrory and Neil Bedi, “was a system
to continuously monitor and harass Pasco County residents.”5 Here’s how it
worked: Like Sasha Reid, Nocco and his team of analysts collected data
from arrest records and “unspecified intelligence” they believed could tell
them who in their county were liable to break the law. Once the analysts
decided the probability was high enough, Nocco sent deputies to interrogate
the extrapolated culprit, often with no discernable probable cause, search
warrant, or hard evidence that a crime had been or would be committed.
“They swarm homes in the middle of the night,” McGrory and Bedi
write of the Pasco County deputies, “waking families and embarrassing
people in front of their neighbors. They write tickets for missing mailbox
numbers and overgrown grass, saddling residents with court dates and fines.
They come again and again, making arrests for any reason they can.”
Nocco’s data-driven solution to crime in his county ensnared nearly 1,000
people who might have otherwise had no interaction with law enforcement.
While it’s true that cases of burglary, larceny, and auto theft in Pasco
County declined after Nocco’s program was implemented, such declines
also occurred in the seven largest nearby police jurisdictions. Moreover,
from 2011 to 2016, violent crime actually increased under Nocco’s watch.
Not surprisingly, two of the nation’s largest law enforcement agencies
scrapped similar programs once the public learned of their serious flaws.
David Kennedy is a renowned criminologist at the John Jay College of
Criminal Justice. Pasco County used some of his research on crime
prevention to justify the predictive analytics they were using to harass
people. It was “one of the worst manifestations of the intersection of junk
science and bad policing,” Kennedy said of Nocco’s program, “and an
absolute absence of common sense and humanity.”6 Collecting data and
solving problems are not the same thing, especially when data collection
overrides human judgment and compassion.

Tip #24. Make sure that data supports your solution but that it doesn’t
create the need for it. There is a difference between cause and effect
with data. Don’t let cognitive dissonance or your own belief system lead
you to use data to manufacture a problem.

We often see confusion over cause and effect in policy writing. In a
paper about the use of force as a tool to fight terrorism, one student argued
that, since 9/11, the United States has designated more terrorist
organizations than before that fateful day. Therefore, the student concluded,
the use of force is likely not working to fight terrorism. But is this right?
Might it not be more likely that the United States has designated more
terrorist organizations because of a heightened perception of threat, and so
the growing list of terrorist organizations only indicates as much? It is not
proof that the current solution isn’t working. To make that case, the student
would need different evidence.
One way to be true to the world around you is to make space for
complex data inside systems. Reid’s Serial Homicide Database is unique
because of the depth of the data it contains. By including robust qualitative
data, the database allows for a more complete description of the
developmental and criminological events under consideration. “Think of it
this way,” Reid explains: “Quantitative databases may code data points in a
binary form (1 = something happened; 0 = something didn’t happen). This
type of coding lacks the flexibility and depth that developmental science
requires.” Instead of, for example, noting whether a serial killer was abused
as a child or not, either 1 or 0, Reid’s database allows her team to record
when that abuse occurred, how often it occurred, who abused the future
killer, and how the killer was affected by the abuse. Each murder, in other
words, was preceded by individual circumstances, stories the killers told
themselves, and myriad reasons and justifications for their actions.
“Including all of these additional qualitative data points,” Reid says, “helps
us develop a more complete understanding of the development of each
serial killer. Human behavior is messy and hard to predict. The more
information we have the better—as long as we do something useful with it.”
“At the end of the day,” Reid continues, “there was a very complicated
series of accumulated risk factors at play that led a serial killer down the
path they ended up on—to an expression of deep, maladjusted
psychopathology. Then they fell through the cracks.” Reid’s goal, in turn, is
to use data—not intuition—to uncover what happened. This work will take
time, patience, and persistence, no doubt. Reid is undeterred. Above all
else, she wants to learn how people facing similar circumstances today
might be helped before they too fall through those cracks. We believe her
nuanced understanding of the problem, paired with data-driven reform
proposals, will help policy makers craft better solutions … someday.

Act with integrity and you won’t have to worry about being
“wrong”
Here are a few truths about data: One, data can come in many, many forms
—both quantitative and qualitative—and no single kind of data is inherently
better than any other; it all depends on your reader and purpose. Two, the
data you need may not exist, may take time to be collected, or may be, as
we noted in part II, “dark.” Three, more often than not, the data you do have
access to can be used to support divergent stories about what the data
means. We told you this stuff wasn’t easy!
When you don’t have the data you need—or the numbers don’t quite line
up in the way you’d hoped—be honest about it. Tell your reader what you
have and what you don’t have. Lean into the complexity. Contrary to what
some may believe, owning up to the limitations of your conclusions can be
a tremendously persuasive communication strategy. While it may
sometimes feel like walking into a sword fight without any armor, it’s
important to remember that there’s no such thing as a perfect idea or
proposal or policy recommendation—and your reader knows that
intuitively. So instead of pretending like this isn’t the case, you should let
yourself admit it when your data has limitations and perhaps discuss other
interpretations you’ve entertained and why you decided to reject them.
Bring up potential trade-offs and explain to your reader how you would try
to avert or mitigate any unintended consequences. This sort of transparency
can show your reader that you hold yourself (and your analysis) to a high
standard and that you can be persuaded by new evidence, which in our
experience is the first step to showing a reader that it’s all right for them to
be persuaded, too.
This approach to data places it in a supporting role to your own honest,
human interpretations. And it can help those who are nervous about using
data, such as the diplomats mentioned in the previous chapter, to feel
supported, not limited, by data.

Tip #25. If the data you have is suggestive but not necessarily
representative, say that. Don’t overstate your data’s meaning, but also
don’t be afraid to use it for fear that you might later be proven wrong.
Instead, tell the reader where the data comes from, what it represents, and
what it indicates to you, the communicator.

Being humble about your findings isn’t just persuasive—it’s also a
practice that leads to better analysis in the long term. After decades of
research, Philip Tetlock, a professor of management and psychology at the
University of Pennsylvania’s Wharton School, believes that the most
accurate analysts share three characteristics: they are (1) skeptical about
grand theories, (2) hesitant to predict the future with complete confidence,
and (3) willing and able to adjust their ideas when conflicting evidence is
presented. He calls these types of people “foxes,” in homage to a poem by
the ancient Greek warrior Archilochus, which includes this line: “The fox
knows many things; the hedgehog one great thing.”7 In Tetlock’s
understanding, hedgehogs prefer to champion a grand political or
psychological theory, and they rely almost exclusively on that view of the
world to make sense of its multitudinous parts. Hedgehogs, in turn, are less
likely to incorporate contradictory data into the story they tell themselves.
Occasionally, hedgehogs find success, mostly because their readers seem to
appreciate their supreme confidence. But more often than not, Tetlock
found, hedgehogs don’t really know what they’re talking about.
We should all strive to be more like the fox, acknowledging the
complexity of the issues we face. If you’re more like the hedgehog, have no
fear; Tetlock has three key pieces of advice to help you change course:
1. Don’t fall victim to the belief that the world can be explained by one
or two ideas. Reality is full of contradictions and complexity that
aren’t easily explained.
2. When forming conclusions or judgments, consult a diverse range of
sources, continue to review your assumptions, and update your
thinking accordingly.
3. Don’t be afraid to use words and phrases like however, but, although,
and on the other hand when they are warranted.

Above all else, Tetlock says, when the facts change, foxes change their
minds. To better persuade others, be more open to persuasion yourself. You
might be surprised by the reaction you receive, especially if your reader is
one of the “exasperated majority” that’s fed up with political polarization
and divisive, falsely simplistic rhetoric.
Even when we do our best to be foxes and tell the truth about what we
know and what we don’t, errors and miscommunication happen; like death
and taxes, they’re inevitable. Fortunately, if we learn to recognize some of
the more common errors people make, that awareness should help keep us
out of trouble, at least most of the time.

Tip #26. Be honest about what data you have—and what you still don’t
know. Honesty and accuracy are the most important virtues when it comes
to telling stories with data—more important than timeliness. If you don’t
have good data, accept that reality and don’t include it in your writing.

These are lessons Hong Qu teaches his students about data visualization
at the Harvard Kennedy School. As part of his research and work, Qu
follows data visualization experts who advocate for presenting data well to
a general audience. At the height of the spring 2020 surge of coronavirus
cases in the United States, Qu noticed what he described as a “burst of
critiques of the health department of Georgia.”8 Data experts from around
the world were using social media to slam Georgia’s visual representation
of county-by-county data in a bar chart that made it appear, misleadingly,
that the number of new COVID cases in all five of the state’s most-affected
counties was trending downward by early May (figure 3.1).
Qu was intrigued by the reactions of other data analysts, so he examined
the bar chart. The first issue he noticed was that the x-axis in the chart,
created by the Georgia Department of Public Health, wasn’t chronologically
arranged, which struck him as incredibly odd for a chart purporting to track
a trend over time. He further noticed that its nonchronological order made it
appear as though the numbers told a different story from what was actually
happening.

Figure 3.1. Top five counties in Georgia with the greatest number of confirmed COVID-19 cases.
The chart represents the most affected counties over the past 15 days and the number of cases over
time.
Source: Georgia Department of Public Health

In looking at the bar clusters for April and May, “it seemed like cases
were going down,” Qu observed, “but, in reality, cases were not going
down. They were hitting a plateau and would eventually go up.” The
downward trend was caused by a delay in reporting cases. In short,
Georgians weren’t out of the woods just yet.
Data visualization expert Alberto Cairo noticed the same problem. On
his website, he published a redesigned chart to show how much the story
seemed to change when the data was presented chronologically.9
“Visualization books, including mine,” Cairo explains in the paragraph
beneath his redrawn chart (figure 3.2), “spend many pages discussing how
to choose encodings to match the intended purpose of every graphic, but we
pay too little attention to the nuances of sorting: should we do it
alphabetically, by geographic unit, by time, from highest to lowest, from
lowest to highest—or do we need an ad-hoc criterion? Or should we make
the graphic interactive and let people choose? As always, the answer will
depend on what we want the reader to get from the visualization.”10

Figure 3.2. Alberto Cairo’s redrawing of the chart from the Georgia Department of Public Health (see
fig. 3.1).
Source: Alberto Cairo, “About That Weird Georgia Chart,” Cairo (blog), May 20, 2020

A problem with the original chart was that data collection for COVID
cases was occurring with a delay. Therefore, the most recent numbers were
the least likely to be accurate. Including them would falsely demonstrate a
downward shift and misinform the reader, especially a busy one who only
scans the chart. Excluding the most recent numbers would prioritize
accuracy over timeliness. But in the early days of the pandemic, timing was
everything.
Is there a compromise here? We think there is: more storytelling. The
Georgia Department of Public Health could have provided chronologically
arranged data (Cairo’s correction), and the chart’s caption could have
informed readers that the most recent data is the least accurate owing to
delays in the reporting of new cases.
Captions can add a layer of honesty to visualizations. Qu teaches his
students to use captions, which he says are “underutilized in data
visualizations because it’s an afterthought by the designer.” He encourages
designers to use them because they are “one of the most effective
techniques to guide the audience’s attention, as well as to explain the key
takeaways by adding a signpost that tells a mini-story.” Done well, captions
“convey clearly the reasoning behind the data insights” represented in the
graphic.
Adding a sentence to the caption of Georgia’s chart stating that recent
COVID case counts are likely understated because of incomplete reporting is
an excellent and effective compromise.
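For readers who build their own charts, here is a minimal sketch of that compromise in Python, using the pandas and matplotlib libraries. The file name, column names, and caption wording are hypothetical; the point is simply to sort by date before plotting and to put the caveat in the caption.

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical daily case counts by county; file and column names are invented.
df = pd.read_csv("county_cases.csv", parse_dates=["report_date"])
df = df.sort_values("report_date")  # chronological order, not order by case totals

fig, ax = plt.subplots(figsize=(8, 4))
for county, grp in df.groupby("county"):
    ax.plot(grp["report_date"], grp["new_cases"], label=county)

ax.set_xlabel("Report date")
ax.set_ylabel("New confirmed cases")
ax.legend(title="County")

# Put the caveat where a busy reader will see it: in the caption.
fig.text(0.01, 0.01,
         "Note: Counts for the most recent days are incomplete because of "
         "reporting delays and will likely be revised upward.",
         ha="left", va="bottom", fontsize=8)
fig.subplots_adjust(bottom=0.25)
plt.show()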

Tip #27. When presenting data visualizations, use the graph’s caption
to tell the reader what the point of it is. Instead of a caption that reads,
“Number of older workers who report not having enough money to retire,”
try something like this: “Twice as many older workers today report not
having enough money to retire than older workers reported two decades
ago.” Don’t expect your reader to interpret the data themselves. They may
derive a different story from it than what you intended.

This sort of communication error—whether deliberate or not—has real
impact on policy and the people who depend on policy. Qu believes
Georgia’s county-by-county COVID-19 data was either purposefully
manipulated or unknowingly bungled: that is, health department officials
may have “picked a sorting option and had no idea they were going against
the basic principles of time.” Either way, Georgia was one of the first states
to end pandemic lockdowns and reopen for business, at least partially
because the people in power believed the threat had passed.
Inaccuracies may be innocent, of course, but they can still appear to be
deceptive. And it’s incredibly embarrassing to the writer, and their
organization, when inaccuracies are published. If you ever find yourself in a
situation where you feel forced to choose between timeliness and accuracy
when communicating with data, choose accuracy.

Tip #28. Avoid unnecessary attribution. Lots of student writing—and
plenty of professional writing too—suffers from too much attribution, likely
stemming from a writer’s fear of being accused of plagiarism or their
having little confidence in their own conclusions. While attribution is
necessary at times, especially if you’re quoting directly from a source, too
much attribution may undermine your authority with readers. If every
sentence you write either begins or ends with “according to [name of
expert],” your prose will come across as a patchwork of others’ expert
opinion—likely to be less persuasive than clearly stated facts and logical
claims and conclusions. We understand that attribution can be a cultural and
academic norm, but we still warn against having too much of it.

Reading bad or disproven data, it turns out, can repel new and better data
encountered later on. Several studies of communication show that once
readers finish reading a text and have had time for its content to sink in, the
human brain has a way of making that content seem true. Later, when it’s
time to correct a flawed argument or data point, some readers will struggle
with cognitive dissonance. Wait, they say. You told me not to wear a mask
when the pandemic started. Now you’re telling me I have to wear a mask in
every indoor public place I go? That can’t be right.
An interesting example of this phenomenon from the world of politics
comes from the contentious 2000 US presidential election between George
W. Bush and Albert “Al” Gore. In their book The Press Effect, Kathleen
Hall Jamieson and Paul Waldman explain what happened: “When networks
called the election for Bush at 2:20 a.m., televisions were on in fifteen
million homes … Graphics with Bush’s picture and the words ‘George W.
Bush—the 43rd President of the United States’ flashed on the screen …
When the call was retracted at 3:50 a.m., 8.5 million homes still had their
televisions on. In 6.5 million homes, viewers went to bed thinking Bush had
won but awoke to find an unsettled election.”11 That news would bring
cognitive dissonance to many who had slept on—and made peace with—
the election results. Regardless of their political leanings, many of those
Americans were more likely to believe that Bush had won the election and
that it would have been “stolen” had Gore been declared the winner in a reversal.

Tip #29. Manage the tension between timeliness and accuracy. Consider
your readers and then answer their questions, honestly, rather than
succumbing to the desire to have a definite answer as quickly as
possible. That is how mistakes are made.

News organizations used predictive models in 2000 that contained what
executives later called “bad data.” But it was late at night, and time was of
the essence. Five weeks after election day, the US Supreme Court, in a five-
to-four ruling in the case of Bush v. Gore, blocked further recounting of
votes in Florida, thereby awarding the state’s 25 electoral votes—and the
presidency itself—to Bush.

Does the data actually support your conclusions? Are you sure?
In August 2019, a research team, led by senior author Joseph Cesario, had
the results of their study on fatal police shootings published in the
prestigious Proceedings of the National Academy of Sciences (PNAS).
“Concerns that White officers might disproportionately fatally shoot racial
minorities can have powerful effects on police legitimacy,” they declare
about the implications of their work, which was based on a “near-complete
database” of more than 900 fatal shootings in 2015 that included
demographic information on each officer who did the shooting.12
What did they find? Despite “recent high-profile police shootings of
Black Americans,” the researchers “did not find evidence of anti-Black or
anti-Hispanic disparity in police use of force across all shootings.” Put
simply, “White officers are not more likely to shoot minority civilians than
non-White officers.”13 In most cases of fatal shootings, “the person killed
was armed and posed a threat or had opened fire on officers,” said the
study’s senior author when interviewed for a reporter’s story about the
research.14
According to the coauthors, police reformers’ calls to diversify police
departments may not have any effect on the frequency of officer-involved
fatalities. “If this study is right,” NPR reasons, “just hiring more black cops
will not mean fewer black people get shot.”15 Instead, the best way to
reduce these fatal shootings, the study’s findings suggest, might be to
redress “the socio-historical factors that lead [Black and Hispanic] civilians
to commit violent crime,” which is what led to their increased likelihood of
being killed by police, according to the researchers. In other words, if
you’re Black or Hispanic, don’t commit violent crime because, if you do,
you’re more likely to be killed by police. And it won’t be because you’re
not white.
Several months after the results of this study were published, two
Princeton professors demonstrated mathematically that the study was
“based on a logical fallacy and erroneous statistical reasoning and sheds no
light on whether police violence is racially biased.” In an op-ed published in
the Washington Post, Dean Knox and Jonathan Mummolo explain what was
so faulty about the study:
It takes no technical expertise to understand the core problem of the study. The authors used
data on fatal police shootings to determine the likelihood of officers shooting minority
civilians, ignoring the fact that most police encounters with people do not result in a fatal
shooting. Under this fallacious approach, an officer who encountered one minority civilian and
fatally shot him or her (a 100 percent fatal shooting rate) would appear identical to an officer
who shot one minority civilian out of a thousand similar encounters (a 0.1 percent fatal
shooting rate). Data on fatal shootings alone cannot tell us which officers are more likely to
pull the trigger, let alone account for all relevant differences between incidents to allow us to
isolate the role of race.16
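
A toy calculation, using numbers we made up purely for illustration, shows why a dataset containing only fatal shootings cannot answer the question the study claimed to answer.

# Two hypothetical officers with very different behavior per encounter.
officers = {
    "Officer A": {"encounters": 1, "fatal_shootings": 1},
    "Officer B": {"encounters": 1000, "fatal_shootings": 1},
}

for name, counts in officers.items():
    rate = counts["fatal_shootings"] / counts["encounters"]
    print(f"{name}: {rate:.1%} of encounters ended in a fatal shooting")

# A database built only from fatal shootings contains one row for each officer,
# so the two look identical there, even though their per-encounter shooting
# rates differ by a factor of 1,000. Without the denominator (total encounters),
# the data cannot tell us who is more likely to pull the trigger.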

How could PNAS, one of the most cited peer-reviewed journals in the
world, publish such flawed research? Soon after Knox and Mummolo
notified the journal of the glaring errors they had discovered, an editor
responded in defense of the article. The
“clear logical errors” that Knox and Mummolo pointed out were, according
to the editor, a matter of preference over how best to study how race
influences officer-involved shootings; moreover, the tone of the critique
was “intemperate.”17
Knox and Mummolo then took their concerns to the digital marketplace
of ideas we call Twitter. As the likes and retweets piled up, senior author
Cesario and first author Johnson published a reply to the critique. “Though
they still largely stood by their study,” Knox and Mummolo later wrote in
their op-ed, “they admitted their central claim—that white officers are not
more likely to shoot minority civilians than their nonwhite peers—was
unsupported by their analysis.”18
By that point, though, it was largely too late to walk it back. The study’s
flawed findings had already been widely covered in the media. They were
even presented during testimony in an oversight hearing on policing
practices convened by the US House Committee on the Judiciary.19
“There are endless examples of bad research designs producing flawed
findings, followed by uncritical media reports touting the results,” write
Stephen Soumerai and Ross Koppel, which “can result in costly, ineffective
and even harmful national policies.”20 For example, a 2015 study claimed to
show that better-trained paramedics with more sophisticated lifesaving
equipment actually caused more deaths than their lesser-trained colleagues
when responding to emergency calls in nonrural areas for people receiving
Medicare benefits.21 But, as Soumerai and Koppel explain, “the authors
confused cause and effect: Ambulance dispatchers send the better-equipped
ambulances to dying patients in an effort to save them before transporting
them to the hospital. These patients are already more likely to die on the
way to the hospital than patients in basic ambulances.”22
Even though it was clear to many that the researchers’ design rested on an
assumption that violates basic logic, ScienceDaily,
a website that proclaims to share the “latest science news,” published a
summary subtitled “Advanced Life Support Ambulance Transport Increases
Mortality.”23 Similarly, the Portland (ME) Press Herald reprinted a
Washington Post article under the headline “Ambulances with Less-
Sophisticated Gear May Be Better for Patients.”24 “Such distortions,”
introduced into public discourse, Soumerai and Koppel say, “have
potentially life-threatening consequences to patients and policy.”25
Persistence, patience, and honesty are required not only to speak well
with data but also to speak accurately with data. It is important to slow
down and make sure you are not confusing cause and effect.

We know you know, but remember: correlation does not prove causation!
Distinguishing between correlation and causation is an essential skill for
any writer because, when we mistake correlation for causation, we reach
falsely firm conclusions that can have bad consequences. When two
features of the world occur together with some regularity, we say they are
correlated. Being able to determine correlation is useful if you want to
predict when and where to expect something to occur in the world, given
your knowledge of other situational characteristics.
In his 1998 book, More Guns, Less Crime: Understanding Crime and
Gun Control Laws, John R. Lott Jr. argues that “states with the largest
increases in gun ownership also have the largest drops in violent crimes.”
More specifically, in 31 states that allowed adults to carry concealed
handguns (if they had no criminal record or history of significant mental
illness), rates of violent crime dropped. “For each additional year that a
concealed handgun law is in effect,” Lott explains, “the murder rate
declines by 3 percent, rape by 2 percent, and robberies by over 2 percent.”
After analyzing crime data from every county in the United States that
permitted adults to carry a concealed handgun from 1977 to 2005, Lott
concludes in the third edition of his book that the effects of concealed-carry
legality on crime were even more pronounced in “high crime urban areas
and neighborhoods with large minority populations.”26
The next question we must ask ourselves, in trying to tease apart
correlation and causality, is whether having more guns caused less crime or
whether, in fact, both features merely occurred together and had no
deterministic effect on each other. Lott argued for the former: that more
guns in a community was the primary reason why rates of crime
subsequently dropped. This happened for two reasons, he says: “First, they
reduce the number of attempted crimes because criminals are uncertain
which potential victims can defend themselves. Second, victims who have
guns are in a much better position to defend themselves.” Furthermore, Lott
contends, when states passed concealed-carry laws, “the number of
multiple-victim shootings declined by 84 percent. Deaths from these
shootings plummeted on average by 90 percent, and injuries by 82 percent.”
What’s more, “there is no evidence,” he continues, “that increasing the
number of concealed handguns increases accidental shootings.”27

Tip #30. Do not claim a direction of cause to effect when correlation
supports no clear direction other than coincidence. Making that error
will harm your own credibility and limit your ability to persuade.

In the two decades after Lott’s book was published, at least two dozen
empirical studies made convincing counterarguments, namely, that
concealed-carry laws had little or no effect on the reduction of crime. For
example, in a 2012 article published in the American Statistician, Patricia
Grambsch showed that there isn’t any evidence to support Lott’s claim that
concealed-carry laws “have beneficial effects in reducing murder rates.”
The real culprit, according to Grambsch, is a phenomenon known as
“regression to the mean.” When a data point differs significantly from other
observations at first, but then in subsequent observations is found to be
closer to average, we can say that it regressed to the mean, or average.
Because this phenomenon absolutely must be considered when designing
scientific experiments and interpreting data, Grambsch used random and
fixed effects models to do just that. How those models work is less
important than what they tell us in this case, which is that when regression
to the mean is factored in, the effects that legalizing concealed carry has on
crime shift. In the end, Grambsch found that when states passed concealed-
carry laws, doing so had no effect on rates of murder.28 Other researchers
found that concealed-carry laws have resulted, if anything, in an increase of
certain types of violent crime, including adult homicide29 and aggravated
assault.30
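If you want to see regression to the mean for yourself, a short simulation is enough. The sketch below uses entirely invented numbers: every hypothetical county has the same underlying murder rate, and no law changes anything.

import numpy as np

rng = np.random.default_rng(0)
true_rate = 50  # hypothetical underlying annual murder count per county

before = rng.poisson(true_rate, size=1000)  # observed counts, period 1
after = rng.poisson(true_rate, size=1000)   # observed counts, period 2

# Suppose the counties with the worst period-1 counts (top decile) pass a new
# law. Their counts fall in period 2 purely because extreme draws tend to be
# followed by more typical ones.
worst = before >= np.percentile(before, 90)
print("Mean count before, worst decile:", before[worst].mean())
print("Mean count after, worst decile: ", after[worst].mean())

The worst decile’s average falls from the low 60s back to about 50, a drop that looks like a policy effect but is nothing more than regression to the mean.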
It’s true that sometimes correlation, a co-occurrence of phenomena, is in
fact an instance of causation. But you can’t trust an interpretation of
causality in one direction (A causes B) without testing for reverse causality
(B causes A) and for what are known as confounders.
Let’s start with reverse causality. When the outcome, or the anticipated
outcome, affects the treatment (that is, the presumed effect is, in fact, acting
on the purported cause), we call that reverse causality. Identifying reverse
causality is sometimes a matter of common sense. For example, a study
might find that brown spots on the skin and sunbathing are linked. It’s
plausible of course to hypothesize that sunbathing can cause brown skin
spots, while it’s all but impossible to suppose, inversely, that brown spots
cause sunbathing.
To take another, less obvious, example, let’s say a study finds that
smoking cigarettes and depression are linked. We could conclude, perhaps,
that smoking causes depression, though it’s also possible that the causality
runs in the other direction: depression could cause people to smoke
cigarettes. Follow this line of reasoning: smokers may feel depressed over
their inability to quit a habit that’s socially stigmatized or depressed about
the deterioration of their fitness from smoking, so to relieve their
depression, they turn to cigarettes for the familiar uplift of nicotine. In all
likelihood, smoking and depression reinforce each other, with causality running
in both directions. When this sort of mutual relationship between two features of the
world exists, we call that “simultaneity.”
Let’s now take up confounders. Confounding occurs when some third feature
of the world affects both the treatment and the outcome, so that the two move
together even if the treatment has no effect of its own. To remove the influence
of a confounder statistically, we need to measure it and control for it. Before
making a causal inference, you must try to determine whether any other factor
may be influencing both the treatment and the outcome of your analysis.
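Here is a minimal simulation, again with invented numbers, of how a confounder can manufacture a correlation between a treatment and an outcome that are not causally related at all, and how controlling for the confounder makes the spurious relationship disappear.

import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical confounder: say, household resources (standardized).
resources = rng.normal(size=n)

# Both treatment and outcome depend on the confounder; the treatment has
# no effect on the outcome in this simulation.
treatment = resources + rng.normal(size=n)
outcome = resources + rng.normal(size=n)

print("Raw correlation:", round(np.corrcoef(treatment, outcome)[0, 1], 2))

def residualize(y, x):
    """Remove the part of y that is linearly explained by x."""
    slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return y - slope * x

t_resid = residualize(treatment, resources)
o_resid = residualize(outcome, resources)
print("Correlation after controlling for the confounder:",
      round(np.corrcoef(t_resid, o_resid)[0, 1], 2))

The first number comes out near 0.5; the second is essentially zero. The confounder, not the treatment, was doing all the work.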

Tip #31. Whenever you’re writing about something complex, use
shorter words, sentences, and paragraphs. Simple language and syntax
can make even the most complex facts understandable to readers.

A good example of confounding factors comes from research sponsored
by the Bill and Melinda Gates Foundation into how class size affects
students’ performance in school. By January 2009, the foundation had
invested more than $2 billion to reduce class sizes in high schools with the
objective of improving student achievement and graduation rates. The
results of the research, however, were more than a little disappointing. As it
turns out, smaller class size does not necessarily lead to higher student
achievement. If class size is reduced in well-resourced schools with
excellent teachers who instruct well-supported students, that reduction may
lead to higher achievement. But if class sizes are reduced in under-
resourced, mostly rural schools, the same reduction may not have the same
effect. In this case, school funding, teacher quality, and student support
were confounders that evidently had not been adequately controlled for in
decades of research showing that smaller class sizes were better for student
learning than larger ones.31 According to the foundation’s then CEO, Jeff
Raikes, “Almost by definition, good philanthropy means we’re going to
have to do some risky things, some speculative things to try and see what
works and what doesn’t.”32 At least he was honest about it.
If you’d like to avoid throwing good money after bad (or at least neutral)
solutions, here are two questions to ask yourself when presented with a
purported causal relationship:

1. Could the outcome—or anticipation of it—cause the treatment instead
of the other way around?
2. Are there factors correlated with both the treatment and the outcome
that have not yet been controlled for?

Be particularly skeptical of a result whenever (a) there are few scientific
studies on the topic, (b) the results are surprisingly counter to common
sense, or (c) you suspect the result would not have been published if the
finding had gone the other way. Digging in deeper to understand these
relationships—of confounding factors or causation—takes persistence. But
that step is worth it to ensure that the story you create with data is one
worth telling.

Tip #32. Randomized controlled trials aren’t always the silver bullet
they’re made out to be. Not everything can be measured; not every
question can be answered, even with the best data; and research findings
cannot always be generalized.

In response to profuse pushback, Lott wrote a follow-up book in 2003—
The Bias against Guns: Why Almost Everything You’ve Heard about Gun
Control Is Wrong—in which he argued that various psychological biases
prevent other researchers from accepting the results of his work. John V.
Pepper, a professor of economics at the University of Virginia, disagreed in
his review of Lott’s follow-up book, writing that “Lott distorts anecdotal
evidence about biases, misrepresents the relevant empirical literature, and
presents evidence that cannot be used to draw credible conclusions about
the effects of gun laws on crime.”33 Similarly, David Hemenway, director of
the Harvard Injury Control Research Center, concludes, “much that Lott
writes is either wrong or misleading.”34
“In many ways,” Knox and Mummolo argue in their op-ed on officer-
involved shootings and race, “the current era represents a golden age for the
scientific study of social problems.” New sources of granular data and
sophisticated techniques for differentiating correlation from causation are
allowing unprecedented insight into human behavior. And politicians have
expressed interest in evidence-based policy, leaving the door open for
academics to make a real-world impact. “But the promise of this moment
will be wasted if scientists cannot be relied upon to separate fact from
fiction. We must do better, or risk receding into justified irrelevance.”35

How many people might still be alive today had the Toronto
police been more open to using data?
Let’s return to the cold case in Toronto. Unbeknownst to Reid, the police
had assembled a small group of detectives to look into Navaratnam’s case.
They called themselves Project Prism, and their mission was to take a closer
look at Bruce McArthur to determine whether he was responsible not just
for Navaratnam’s disappearance but for the other disappearances as well. It
seemed to the detectives that McArthur had a disturbing routine. First, he’d
meet gay men who had immigrated to Canada. Then he’d hook up with
them and hire them to work for his landscaping business. Not too long after
that, the men would disappear. McArthur, they were coming to realize, was
possibly the thread that tied it all together.
The task force started surveilling McArthur around the clock, and one
day the officers parked outside watched a young man enter McArthur’s
apartment. Fearing the worst, they decided to act. Once they had forced
themselves inside, the officers found the young man “bound, restrained to a
bed, but unharmed,” according to the Toronto Star.36 The man was freed,
and police arrested and charged McArthur with two counts of first-degree
murder.
Using McArthur’s list of landscaping clients, the Toronto Police Service
began searching dozens of properties across the city. On the neatly
landscaped lot of one small home, police officers found the remains of six
men stuffed into large planters.
It didn’t take long for the Toronto Police Service to come under fire from
multiple fronts. Why wasn’t McArthur arrested sooner? How could the
police have let him go after first interviewing him in 2013? “All serial
homicide cases have their fair share of systemic and human errors,” Reid
explains. “The reason I and my team do what we do is because of these
errors. The more we can bring objective analysis to what has traditionally
been a very subjective profession, and the more we look for and listen to
people’s stories—especially people who have traditionally been silenced—
the quicker the police will be able to stop the Bruce McArthurs of the world
from hurting more people.”
Based on all the evidence she’s collected on McArthur since his arrest,
Reid believes there may be more victims who haven’t yet been found. “I’ve
done a developmental profile of Bruce,” she told a journalist soon after
McArthur was convicted. “I’ve gone into his past and looked at his entire
development from essentially conception until the time he was arrested. It is
possible that there are more.”37
On January 29, 2019, Bruce McArthur pleaded guilty to eight counts of
murder in Ontario’s Superior Court of Justice. At 66 years old, he became
the oldest convicted serial killer in Canada. He was subsequently sentenced
to life imprisonment with no eligibility for parole for 25 years.
Empirical work like Sasha Reid’s depends on assumptions, beliefs, and
judgments about what data points are acceptable and useful, which
relationships should be examined, how variables are defined, and what the
findings mean. While we often hear calls to make public policy data-driven
or evidence-based, these calls seem to ignore the fact that empirical
research relies on subjective human judgment and framing. When we
quantify something, when we measure it, we shine a light on it, and when
we fail to quantify or measure something, we leave it in the dark.
Sometimes we avoid collecting certain data because it’s too hard to do, too
expensive, or there’s little money to be made from it. On top of that, the
data sets we do have available to us are constructed by human beings and
are, therefore, subject to human biases, errors, and manipulations, as are all
things humanmade. In every case of drawing on data, we must be honest
with ourselves about what the data can let us say with integrity and what it
cannot. That’s the only way your reader will know whether they should
trust what you have to say.

Conclusion

Throughout this book, we’ve shared stories from our work to demonstrate
the challenges we all face in writing effectively with data. What we’ve
concluded after all these years is that effectiveness depends mostly on a
writer’s ability to understand how their reader tends to make sense of data.
Once you’ve got an understanding of your reader’s goals, tell them a story
that gives them the information they need to form an opinion, make a
decision, or take steps to address a problem. Use only the data that must be
presented to convince your reader that the story you’re telling is logical and
appropriate. This can be hard, we know; but now that you’ve been exposed to the
frameworks, tips, and tactics we’ve presented, it is our sincere hope that
you feel much more prepared to take on such an important challenge.
Another lesson we hope you’ve gleaned is that when you do craft a story,
elevate the people impacted to the headline. Remembering that your reader
cares more about people than they do about statistics will serve you well.
Instead of focusing on the numbers alone, provide a context around the
numbers. It will take your voice and your description to convey what is
known, and what is unknown, to your reader. The effort this work requires
will serve your purpose, which is ultimately to make important and positive
change in the world.
Lastly, writing effectively with data depends on your ability to be both
accurate and honest. Explain limitations. Check your work. Accuracy and
honesty won’t persuade everyone in the short term, no doubt, but
inaccuracy and dishonesty will persuade far fewer people in the long term.
Be mindful of using data to develop strong stories that don’t confuse
causation and correlation. Don’t stretch and bend the numbers. Cultivate
your credibility instead. And remember that demonstrating empathy and
respect for your reader has the power to elicit emotion that can draw them
in and help them really see.
Effectiveness can’t be forced; it comes only from doing good work as
well as one can. Ultimately, it is up to you to speak for data.

ACKNOWLEDGMENTS

DAVID CHRISINGER
This book is the product of our collective experiences as students, practitioners, and instructors of
effective communication. Thinking back on how we learned to write stories with data, we realized
that we acquired most of the ideas detailed in this book from patient advisors, helpful mentors,
thoughtful collaborators, and other communication experts. This book, in turn, was conceived out of
a sincere desire to share frameworks, principles, and tools for writing with more people than our
classrooms can accommodate.
First of all, I’d like to thank my students, colleagues, and mentors at the University of Chicago’s
Harris School of Public Policy for helping to plant the seeds of this book. The author of our
foreword, Ethan Bueno de Mesquita, and his frequent collaborator Anthony Fowler—both professors
at the Harris School—were the source of several concepts captured in this book and the inspiration
for others. I’m also in great debt to all the other faculty who have invited me into their classrooms to
help teach Harris students how best to communicate quantitative analysis to meet the particular needs
of readers. Ever since I arrived at the Harris School in February 2019, these colleagues have shared
with me, either directly or through their writings, myriad lessons learned from their own experiences
—many of which found their way into this book. In particular, I would like to thank Dan Black,
Christopher Blattman, Sorcha Brophy, Chad Broughton, John Burrows, Isabeau Dasho, Matthew
Fleming, James A. Leitzel, John A. List, Jens Ludwig, Luis Martinez, Roger Myerson, Konstantin
Sonin, Brian Williams, Rebecca Wolfe, Kimberly Wolske, Paula Worthington, Austin Wright, and
Adam Zelizer. My sincerest thanks as well to Ranjan Daniels, Andie Ingram Eccles, Jenny Erickson,
Shilin Liu, Sakshi Parihar, Sam Schmidt, and others for all the support, encouragement, and
connection.
Second, I would like to extend my warmest thanks to the dean of the Harris School, Katherine
Baicker; our dean of students, Kate Shannon Biddle; and the rest of the executive team at the Harris
School for believing in my work and supporting my efforts to make our students the most effective
communicators they can possibly be.
Third, I want to express my gratitude to my students, past and present, for offering a steady stream
of ideas and inspiration about what to include in this book—as well as plenty of opportunities to try
out and refine the lessons and tools featured throughout.
Lastly, without my wife, Ashley, and our three beautiful children, this work of mine would feel
much, much less fulfilling. It is my honor and privilege to share the world with them.

LAUREN BRODSKY
I would like to thank my students at the Harvard Kennedy School, who are the inspiration for this
book. Students come to the Kennedy School with passions for a variety of policy issues. While their
interests may differ, what they have in common is a desire to make an impact. To make the world a
better place. And to improve the lives of citizens around the globe. Communicating well with
evidence is an important skill to make that change. I am so thankful to my students for trusting me to
lead them through that work and for helping me learn along the way. I am also thankful to the
leadership at the Kennedy School for supporting me in growing my own knowledge and expertise in
policy writing.
Many of the tips and lessons of this book have grown from conversations with colleagues and
with students who became alumni and practitioners. Thank you to those who shared their stories with
me, including Hong Qu, Ranjana Srivastava, Nick Sinai, Rebecca Barnes, and Paul Von Chamier.
Through our discussions I was able to see trends and best practices of persuasive policy
communications. I was inspired by the way you model honesty, compassion, and perseverance in the
work you do.
I am lucky to work with the most supportive colleagues in the Communications Program at the
Kennedy School. They are thoughtful and gifted teachers who continue to inspire me. Our
conversations on teaching and learning are often the highlight of my day. Topping that list is Jeffrey
Seglin. There is not a more dedicated professor out there, and without his faith in me, this book, and
my work that informs it, would not have been possible. Thank you also to Alison Kommer, our
program coordinator, who steps right in to help in ways that are always above and beyond. You have
both been such an important support system to my work over the years.
Lastly, I want to thank my family and especially my husband, Gregg. Two careers in our selected
fields is a juggling act. But you have pushed me every step of the way and helped our two wonderful
kids see what is possible.

TIPS TO HELP YOU WRITE MORE EFFECTIVELY WITH
DATA

Below are the 32 tips we shared throughout the book, conveniently in one
place so you can return to this list and check your work whenever you set
out to write effectively with data. We don’t all write effectively with data on
the first try—or even the second or third try. And that’s okay. Once you
know how to tell stories with your data, and what resonates with readers
and compels them to care about those stories, you can revisit these pages
and revise your writing with the tips in mind.

Tip #1. Use reporting to convey information. Use stories to create an
experience. Stories can transport the reader by creating an experience that
helps them see what you’re trying to say. Information alone is not a story.

Tip #2. Communicate, don’t complicate. The last thing people need is
more information. They have far too much of it already. What they need is
help making sense of all that information and to understand the difference
between what’s important and what’s just noise.

Tip #3. Ratios can help readers make sense of large numbers. Saying
“one in four people” is much easier for readers to picture than “7,526,333 of
30,111,489 people.”

Tip #4. Don’t forget there are real people behind all those numbers
you’re crunching. Readers will care a hell of a lot more about people than
about data points, so if your goal is to get the reader to care, find the people
in the numbers and tell a story about how those people are affected.

Tip #5. If you want to be an exceptional data analyst, you must learn
how to talk to people. And we mean really talk to people—and listen, too.

Tip #6. When you want your readers to remember your story, use
striking imagery that will stick with them over time. When looking for
details to include in data-driven stories, pay attention to your gut reactions.
If you feel like you’ve been punched in the gut after reading a statistic or a
quote from an interview, take note of that. Try to re-create the experience
for the reader. Chances are if you felt something, they’ll feel something too.

Tip #7. Help your reader understand abstractions by comparing them
to concrete things they can picture. Here’s an example: Why should
anyone care about net neutrality? How many Americans even know what
net neutrality is? Back in 2014, comedian John Oliver waded into the
“boring” and obscure issue and explained to his audience that net neutrality
protects start-up companies from being swallowed by bigger companies on
the internet. Here’s how he explains the impact of the Federal
Communications Commission’s proposed rule changes: “Ending net
neutrality would allow big companies to buy their way into the fast
[broadband] lane, leaving everyone else in the slow lane.” He leavens this
potentially boring subject by turning to humor: Without net neutrality, “how
else is my start-up video-streaming service Nutflix going to compete? It’s
going to be America’s one-stop resource for videos of men getting hit in the
nuts.” In other words, start-ups like the fictional Nutflix, and plenty of real ones, would be at risk of falling victim to anticompetitive tactics.

Tip #8. Try starting with the main finding—your message—not facts or
your methodology. Instead of this: One study probed the relationship
between parental education and income and participation in postsecondary
education and found that young people from moderate- and low-income
families were no less likely to attend college in 2001 than they were in
1993. Try this: Young people from moderate- and low-income families were
no less likely to attend college in 2001 than they were in 1993, according to
one study.

Tip #9. The tone of your writing matters—a lot. If you want your reader
to see you as objective, use an objective tone and present your findings as
objectively as possible. Avoid judgmental words such as failure or
incompetence.

Tip #10. Ask better research questions. Good questions drive good
stories, and the most common types of questions we see answered in public
policy writing are these: (1) Descriptive: What’s happening? (2) Evaluative:
What’s working? What’s not? (3) Prescriptive: What should be done next?

Tip #11. Don’t confuse descriptions of outputs with policy outcomes and impact. Measuring outputs is important to explain what is happening
but not to explain what is working.

Tip #12. Nearly every decision you need to make as a writer depends on
two things: Whom are you writing for, and what do you want them to
do with what you have written? Understanding your reader’s goals will
help you determine everything from what kinds of data (and how much) to
use to how you should frame the implications of your research. Knowing
what you want your writing to accomplish is equally important. Are you
trying to educate and inform or to persuade and inspire the reader to act?
Are you trying to comfort the disturbed or disturb the comfortable?
Everything you write depends on your answers to these sorts of questions,
and once you know the answers, you can use data to support your message
effectively.

Tip #13. When deciding how many examples to include, remember the
power of three. Use a single example when you want its power to carry the point on its own. Use two examples if you want to compare and
contrast them. And to give the reader a sense of roundness and
completeness, use three. Some news organizations share “three things to
know” with their readers, and they include one data point for each. More
information would crowd the story. Readers love threes.

Tip #14. If you don’t have any data, try articulating to the reader what
kind of data would help and how it could be collected. Some refer to this
practice as “evidence-building,” which takes time, money, and inclination.
Not every problem we face will have all three things going for it.

Tip #15. Before comparing data sets, check first to see if the data sets
were collected and analyzed in similar ways. Consider whether your data
points will “speak” well to one another; that is, were they measured in the
same manner, in the same time period, by the same organization? If they
were not (and bringing them together would amount to “comparing apples
to oranges”), explain to the reader what comparison or contrast can be
reasonably made—and what cannot.

Tip #16. No “data dumping” allowed! There’s a tremendous difference between what you could say with all your data and what you should say.
Much of what you find in your research will most likely serve the reader
better as subtext that informs the core message of your story.

Tip #17. When layering on data in your story, make sure each
additional data point expands the story you’re telling. Try not to
unnecessarily reiterate a point you’ve already made.

Tip #18. Know the difference between percentage changes and proportions. A percentage change and a percentage point change are two
different things. When you subtract numbers expressed as proportions, the
result is a percentage point difference, not a percentage change.
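If it helps to see the arithmetic side by side, here is a minimal Python sketch with made-up numbers (an approval rating rising from 40 percent to 50 percent):

old, new = 0.40, 0.50
point_change = (new - old) * 100           # 10 percentage points
percent_change = (new - old) / old * 100   # a 25 percent increase
print(f"{point_change:.0f} percentage points vs. {percent_change:.0f} percent change")

The same shift reads as a modest 10-point gain or a dramatic 25 percent jump, which is exactly why the two must not be conflated.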

Tip #19. If a huge, difficult-to-grasp number is important to your story, help the reader visualize its epic size by converting it into something they can more easily comprehend. In a story about water waste in Arizona, for example, we might want to point out that the state’s annual groundwater overdraft (the amount sucked out of the aquifers in excess of natural recharge) is about 2.5 million acre-feet. But what is an “acre-foot”? This large, incomprehensible number becomes a little easier to understand when we tell the reader that an acre-foot equals just under 326,000 gallons of water. But even that is hard to picture. So, what if we tell the reader that the total overdraft is enough water to fill about 1.2 million Olympic-sized swimming pools? No one’s ever seen 1.2 million Olympic-sized swimming pools, though most of us have seen at least one, and we know that 1.2 million is A LOT of pools. Better, right?
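For readers who want to check the arithmetic behind that 1.2 million figure, here is a back-of-the-envelope Python sketch; the pool volume of roughly 660,000 gallons is our assumption, while the other numbers come from the example above:

overdraft_acre_feet = 2_500_000       # Arizona's annual groundwater overdraft
gallons_per_acre_foot = 326_000       # just under 326,000 gallons per acre-foot
gallons_per_olympic_pool = 660_000    # assumed volume of an Olympic-sized pool

pools = overdraft_acre_feet * gallons_per_acre_foot / gallons_per_olympic_pool
print(f"Roughly {pools / 1_000_000:.1f} million Olympic-sized pools")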

Tip #20. Humanize the scale of the math for your reader. Change “Of
the $246.8 billion in retail spending last year, consumers spent $86.4 billion
on cars and car parts” to something like “Of every $100 spent in retail last
year, consumers spent about $35 on cars and car parts.”
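The rescaling itself is a single division. A quick Python sketch with the retail figures from the example:

total_retail = 246.8e9   # total retail spending last year, in dollars
cars = 86.4e9            # spending on cars and car parts
per_hundred = cars / total_retail * 100
print(f"Of every $100 spent in retail, about ${per_hundred:.0f} went to cars and car parts")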

Tip #21. Make sentence subjects concrete and verbs action-oriented whenever you can. Readers grasp complexity more easily when they can
picture who is doing what, and the best way at the sentence level to show
clearly who is doing what is to name a person or other entity as your subject
and pair that subject with a verb conveying action.

Tip #22. Avoid presenting proportions in parentheses. Instead of writing, “In 2004, older people still in the workforce were somewhat more likely than younger people to report doing unpaid family work (12 percent vs. 4 percent),” try “In 2004, about 12 percent of older people who were still working reported doing unpaid family work, compared with 4 percent of younger people.”

Tip #23. Continually remind yourself that to be persuasive, solutions need to address a specific problem the reader cares about solving. Many
policy writers feel compelled to lead off their writing with innovative
solutions. But those solutions will fail to entice a reader until they realize
there’s a problem they should care about. Save your solutions for after
you’ve built urgency around the problem you’ve identified.

Tip #24. Make sure that data supports your solution but that it doesn’t
create the need for it. There is a difference between cause and effect with
data. Don’t let cognitive dissonance or your own belief system lead you to use data to manufacture a problem.
Tip #25. If the data you have is suggestive but not necessarily
representative, say that. Don’t overstate your data’s meaning, but also
don’t be afraid to use it for fear that you might later be proven wrong.
Instead, tell the reader where the data comes from, what it represents, and
what it indicates to you, the communicator.

Tip #26. Be honest about what data you have—and what you still don’t
know. Honesty and accuracy are the most important virtues when it comes
to telling stories with data—more important than timeliness. If you don’t have good data, accept that reality and leave the weak data out of your writing.

Tip #27. When presenting data visualizations, use the graph’s caption
to tell the reader what the point of it is. Instead of a caption that reads,
“Number of older workers who report not having enough money to retire,”
try something like this: “Twice as many older workers today report not
having enough money to retire than older workers reported two decades
ago.” Don’t expect your reader to interpret the data themselves. They may
derive a different story from it than what you intended.
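If you generate charts in code, the same advice applies to the title you set programmatically: state the takeaway, not just the variable. Here is a minimal sketch using Python’s matplotlib, with invented numbers standing in for the older-worker data:

import matplotlib.pyplot as plt

years = ["Two decades ago", "Today"]
share = [12, 24]  # invented percentages, for illustration only

plt.bar(years, share)
plt.ylabel("Older workers reporting they cannot afford to retire (%)")
# State the point, not just the variable being plotted
plt.title("Twice as many older workers report not having enough money to retire as two decades ago")
plt.savefig("retirement_caption_example.png")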

Tip #28. Avoid unnecessary attribution. Lots of student writing—and plenty of professional writing too—suffers from too much attribution, likely
stemming from a writer’s fear of being accused of plagiarism or their
having little confidence in their own conclusions. While attribution is
necessary at times, especially if you’re quoting directly from a source, too
much attribution may undermine your authority with readers. If every
sentence you write either begins or ends with “according to [name of
expert],” your prose will come across as a patchwork of others’ expert
opinion—likely to be less persuasive than clearly stated facts and logical
claims and conclusions. We understand that attribution can be a cultural and
academic norm, but we still warn against having too much of it.

Tip #29. Manage the tension between timeliness and accuracy. Consider
your readers and then answer their questions honestly, rather than succumbing to the desire to have a definite answer as quickly as possible. That is how mistakes are made.

Tip #30. Do not claim that one thing causes another when the correlation supports no clear causal direction and may be mere coincidence. Making that error will harm your own credibility and limit your ability to persuade.
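To see how easily a strong correlation can appear between series that have nothing to do with each other causally, consider this small Python sketch with entirely invented data; both series simply trend upward over the same years:

import numpy as np

rng = np.random.default_rng(42)
years = np.arange(2010, 2020)
ice_cream_sales = 100 + 5 * (years - 2010) + rng.normal(0, 2, size=10)
broadband_subscribers = 50 + 3 * (years - 2010) + rng.normal(0, 2, size=10)

r = np.corrcoef(ice_cream_sales, broadband_subscribers)[0, 1]
print(f"Correlation: {r:.2f}")  # high, yet neither series causes the other; both just grew over time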

Tip #31. Whenever you’re writing about something complex, use shorter words, sentences, and paragraphs. Simple language and syntax
can make even the most complex facts understandable to readers.

Tip #32. Randomized controlled trials aren’t always the silver bullet
they’re made out to be. Not everything can be measured; not every
question can be answered, even with the best data; and research findings
cannot always be generalized.

NOTES

Introduction

1. “Heider and Simmel (1944) animation,” YouTube video, 1:32, posted by Kenjirou, July 26,
2010, https://www.youtube.com/watch?app=desktop&v=VTNmLt7QX8E.
2. Fritz Heider and Marianne Simmel, “An Experimental Study of Apparent Behavior,” American
Journal of Psychology 57, no. 2 (April 1944): 243–59, https://www.jstor.org/stable/1416950?
seq=1#metadata_info_tab_contents.
3. Michael D. Slater, David B. Buller, Emily Waters, Margarita Archibeque, and Michelle
LeBlanc, “A Test of Conversational and Testimonial Messages versus Didactic Presentations of
Nutrition Information,” Journal of Nutrition Education & Behavior 35, no. 5 (September/October
2003): 255–59, https://pubmed.ncbi.nlm.nih.gov/14521825/.
4. Marcel Machill, Sebastian Köhler, and Markus Waldhauser, “The Use of Narrative Structures in
Television News: An Experiment in Innovative Forms of Journalistic Presentation,” European
Journal of Communication 22, no. 2 (2007): 185–205,
https://journals.sagepub.com/doi/10.1177/0267323107076769.
5. Dan P. McAdams, The Redemptive Self: Stories Americans Live By, rev. and expanded ed. (New
York: Oxford University Press, 2013).
Part I. People

1. Tom Kertscher, “Obama Auto Rescue Saved 28,000 ‘Middle-Class’ Jobs in Wisconsin, 1
Million in U.S., Ex–Michigan Governor Says,” Politifact, September 14, 2012,
https://www.politifact.com/factchecks/2012/sep/14/jennifer-granholm/obama-auto-rescue-saved-
28000-middle-class-jobs-wi/.
2. Republicans Draw Even with Democrats on Most Issues: Pessimistic Public Doubts
Effectiveness of Stimulus (Washington, DC: Pew Research Center, April 28, 2010), sec. 2, “The
National Economy and Economic Policies,”
https://www.pewresearch.org/politics/2010/04/28/section-2-the-national-economy-and-economic-
policies/.
3. “How the Great Recession Changed American Workers,” Penn Today, September 12, 2018,
https://penntoday.upenn.edu/news/how-great-recession-changed-american-workers.
4. For one, the Organisation for Economic Co-operation and Development said so in one of its
reports: OECD, Relations between Supreme Audit Institutions and Parliamentary Committees,
SIGMA Papers No. 33 (Paris: OECD, December 9, 2002), 82.
5. Government Accountability Office, Unemployed Older Workers: Many Experience Challenges
Regaining Employment and Face Reduced Retirement Security, GAO-12-445 (Washington, DC:
GAO, April 2012), https://www.gao.gov/assets/gao-12-445.pdf.
6. Paul Von Chamier (Kennedy School graduate), interview with coauthor Lauren Brodsky,
February 19, 2020.
7. Government Accountability Office, “GAO: Excerpts from Focus Groups and Interviews with
Unemployed Older Workers, June and July 2011,” YouTube video, 4:30, posted by GAO,
https://www.youtube.com/watch?v=HdZbVKcloYI.
8. HBO, Last Week Tonight with John Oliver, excerpted in the video “Net Neutrality: Last Week Tonight with John Oliver (HBO),” YouTube, 13:17, uploaded June 2, 2014,
https://www.youtube.com/watch?v=fpbOEoRrHyU.
9. Nick Sinai (Kennedy School adjunct professor), interview with coauthor Lauren Brodsky,
September 25, 2019.
10. David Leonhardt and Yaryna Serkez, “The U.S. Is Lagging behind Many Rich Countries.
These Charts Show Why.” New York Times, July 2, 2020,
https://www.nytimes.com/interactive/2020/07/02/opinion/politics/us-economic-social-
inequality.html.
11. Leonhardt and Serkez, “U.S. Is Lagging behind Many Rich Countries.”
12. Government Accountability Office, Unemployed Older Workers, 57.
13. “Ready to Work,” Employment and Training Administration, US Department of Labor,
created November 19, 2013, https://www.doleta.gov/readytowork/.
Part II. Purpose, Then Process
1. Public diplomacy is the art of influencing and communicating with foreign publics, in order to
impact foreign policy. United States Advisory Commission on Public Diplomacy, US Department of
State, Data-Driven Public Diplomacy: Progress towards Measuring the Impact of Public Diplomacy
and International Broadcasting Activities, September 16, 2014, https://2009-
2017.state.gov/documents/organization/231945.pdf.
2. Ranjana Srivastava (Kennedy School graduate), interview with coauthor Lauren Brodsky,
March 21, 2021.
3. Ranjana Srivastava, “ ‘Could It Be Scurvy?’ It’s a Travesty So Many Australian Aged Care
Patients Are Malnourished,” Guardian (Australia), March 10, 2021,
https://www.theguardian.com/commentisfree/2021/mar/10/could-it-be-scurvy-its-a-travesty-so-
many-australian-aged-care-patients-are-malnourished.
4. Ranjana Srivastava, “Despite Some Errors, Australia Shouldn’t Politicise the Process of the
Vaccine Rollout,” Guardian (Australia), February 25, 2021,
https://www.theguardian.com/commentisfree/2021/feb/25/despite-some-errors-australia-shouldnt-
politicise-the-process-of-the-vaccine-rollout.
5. Laurence Turka, “Scientists Are Failing Miserably to Communicate with the Public about the
Coronavirus,” Boston Globe, July 27, 2020,
https://www.bostonglobe.com/2020/07/27/opinion/scientists-are-failing-miserably-communicate-
with-public-about-coronavirus/.
6. Zeynep Tufekci, “5 Pandemic Mistakes We Keep Repeating,” Atlantic, February 26, 2021,
https://www.theatlantic.com/ideas/archive/2021/02/how-public-health-messaging-backfired/618147/.
7. Tufekci, “5 Pandemic Mistakes We Keep Repeating.”
8. Robinson Meyer and Alexis C. Madrigal, “Why the Pandemic Experts Failed,” Atlantic, March
15, 2021, https://www.theatlantic.com/science/archive/2021/03/americas-coronavirus-catastrophe-
began-with-data/618287/.
9. Carl Zimmer, “How You Should Read Coronavirus Studies, or Any Science Paper,” New York
Times, June 1, 2020, https://www.nytimes.com/article/how-to-read-a-science-study-
coronavirus.html?referringSource=articleShare.
10. “The 17 Goals,” Department of Economic and Social Affairs, United Nations,
https://sdgs.un.org/goals.
11. Tim Harford, The Data Detective: Ten Easy Rules to Make Sense of Statistics (New York:
Riverhead Books, 2021), 142.
12. Ministry for Foreign Affairs of Finland, Data Diplomacy: Mapping the Field; Summary
Report of the Geneva Data Diplomacy Roundtable, April 2017, https://www.diplomacy.edu/wp-
content/uploads/2017/03/DataDiplomacyreport.pdf.
13. See Target 4.2, “Sustainable Development Goal 4 (SDG 4),” Global Education Cooperation
Mechanism, https://sdg4education2030.org/the-goal.
14. Kate Anderson, “We Have SDGs Now, but How Do We Measure Them?” Brookings
Institution, November 3, 2015, https://www.brookings.edu/blog/education-plus-
development/2015/11/03/we-have-sdgs-now-but-how-do-we-measure-them/.
15. Anderson, “We Have SDGs Now, but How Do We Measure Them?”
16. “What Is PISA?,” Organisation for Economic Co-operation and Development,
https://www.oecd.org/pisa/.
17. Michael Barthel, Amy Mitchell, and Jesse Holcomb, Many Americans Believe Fake News Is
Sowing Confusion (Washington, DC: Pew Research Center, December 15, 2016), 3 in PDF,
https://www.journalism.org/2016/12/15/many-americans-believe-fake-news-is-sowing-confusion/.
18. Barthel, Mitchell, and Holcomb, Many Americans Believe Fake News Is Sowing Confusion, 3
in PDF.
19. Adam Grant, Think Again: The Power of Knowing What You Don’t Know (New York: Viking,
2021), 110–11.
20. Jeni Klugman and Sarah Twigg, “Gender at Work in Africa: Legal Constraints and
Opportunities for Reform,” Working Paper No. 3, 10–11,
https://wappp.hks.harvard.edu/files/wappp/files/oxhrh-working-paper-no-3-klugman.pdf.
21. United States Information Agency, West European Trends on U.S. and Soviet Union Strength,
February 1963, p. 13, digital identifier JFKPOF-091-006-p0003, Papers of John F. Kennedy,
Presidential Papers, President’s Office Files, John F. Kennedy Library and Museum,
https://www.jfklibrary.org.
22. United States Information Agency, Reactions to the European Situation, March 1, 1963, p. 71,
digital identifier JFKPOF-091-006-p0003, Papers of John F. Kennedy, Presidential Papers,
President’s Office Files, John F. Kennedy Library and Museum, https://www.jfklibrary.org.
23. Peter J. Katzenstein and Robert O. Keohane, Anti-Americanisms in World Politics (Ithaca, NY:
Cornell University Press, 2007), 17.
24. Katzenstein and Keohane, Anti-Americanisms in World Politics, 108.
25. Katzenstein and Keohane, Anti-Americanisms in World Politics, 19.
26. Katzenstein and Keohane, Anti-Americanisms in World Politics, 16.
27. Katzenstein and Keohane, Anti-Americanisms in World Politics, 288.
28. Cary Funk, Alec Tyson, Brian Kennedy, and Courtney Johnson, Science and Scientists Held in
High Esteem across Global Publics (Washington, DC: Pew Research Center, September 29, 2020), 8
in PDF, https://www.pewresearch.org/science/2020/09/29/science-and-scientists-held-in-high-
esteem-across-global-publics/.
29. Katzenstein and Keohane, Anti-Americanisms in World Politics, 119.
30. Katzenstein and Keohane, Anti-Americanisms in World Politics, 121.
31. Holly Ellyatt, “France’s Vaccine-Skepticism Is Making Its Covid Immunization Drive Much
Harder,” CNBC, January 13, 2021, https://www.cnbc.com/2021/01/13/france-swhy-france-is-the-
most-vaccine-skeptical-nation-on-earth.html.
32. “Macron: AstraZeneca Vaccine ‘Quasi-ineffective’ for Over-65s,” France 24, January 29,
2021, https://www.france24.com/en/live-news/20210129-macron-astrazeneca-vaccine-quasi-
ineffective-for-over-65s.
33. Ministry for Foreign Affairs of Finland, Data Diplomacy, 3.
34. Harford, Data Detective, 68.
35. Harford, Data Detective, 93–94.
36. CBS, 60 Minutes, “Operation Warp Speed: Planning the Distribution of a Future COVID-19
Vaccine,” YouTube video, 13:25, uploaded November 9, 2020, https://www.youtube.com/watch?
v=240DMmhgp4M.
37. James Stavridis, “U.S. Needs a Strong Defense against China’s Rare-Earth Weapon,”
Bloomberg News, March 4, 2021, https://www.bloomberg.com/opinion/articles/2021-03-04/u-s-
needs-a-strong-defense-against-china-s-rare-earth-weapon.
38. Rebecca Barnes (Kennedy School graduate), interview with coauthor Lauren Brodsky,
September 25, 2019.
39. United States Advisory Commission on Public Diplomacy, US Department of State, Data-
Driven Public Diplomacy, 20.
40. United States Advisory Commission on Public Diplomacy, US Department of State, Data-
Driven Public Diplomacy, 22.
41. R. Eugene Parta, Discovering the Hidden Listener: An Assessment of Radio Liberty and
Western Broadcasting to the USSR during the Cold War (Stanford, CA: Hoover Institution Press /
Stanford University Press, 2007), xx.
42. United States Advisory Commission on Public Diplomacy, US Department of State, Data-
Driven Public Diplomacy, 11.
43. Harford, Data Detective, 146. Harford credits statistician David Hand for the concept of “dark
data.”
Part III. Persistence
1. “Number and Rate of Victims of Solved Homicides, by Sex, Aboriginal Identity and Type of
Accused-Victim Relationship” from 2014 to 2019, Statistics Canada, released July 27, 2021,
available from https://www150.statcan.gc.ca.
2. Zander Sherman, “Bruce McArthur, Toronto’s Accused Landscaper Killer, Was Hiding in Plain
Sight All Along,” Vanity Fair, July 3, 2018, https://www.vanityfair.com/style/2018/07/toronto-serial-
killer-bruce-mcarthur-accused-landscaper.
3. Office of the Federal Ombudsman for Victims of Crime, “Submission to the Independent
Civilian Review into Missing Persons Investigations Conducted by the Toronto Police Service,”
submitted by Heidi Illingsworth, Ombudsperson Office of the Federal Ombudsman for Victims of
Crime, Government of Canada (website), November 2019, https://www.victimsfirst.gc.ca/vv/MPI-
RPD/index.html.
4. Sasha Reid (University of Calgary sessional instructor), interview with coauthor David
Chrisinger, September 2019.
5. Kathleen McGrory and Neil Bedi, “Targeted,” Tampa Bay Times, September 3, 2020,
https://projects.tampabay.com/projects/2020/investigations/police-pasco-sheriff-targeted/intelligence-
led-policing/.
6. Kennedy quoted in McGrory and Bedi, “Targeted.”
7. Philip Tetlock, “Why Foxes Are Better Forecasters than Hedgehogs,” Long Now Foundation,
Seminars about Long-Term Thinking, January 26, 2007,
https://longnow.org/seminars/02007/jan/26/why-foxes-are-better-forecasters-than-hedgehogs/.
8. Hong Qu (Kennedy School adjunct lecturer), interview with coauthor Lauren Brodsky,
September 15, 2020.
9. Alberto Cairo, “About That Weird Georgia Chart,” Cairo (blog), May 20, 2020,
http://www.thefunctionalart.com/2020/05/about-that-weird-georgia-chart.html; Willoughby Mariano,
“ ‘It’s Just Cuckoo’: State’s Latest Data Mishap Causes Critics to Cry Foul,” Atlanta Journal-
Constitution, May 13, 2020, https://www.ajc.com/news/state--regional-govt--politics/just-cuckoo-
state-latest-data-mishap-causes-critics-cry-foul/182PpUvUX9XEF8vO11NVGO/.
10. Cairo, “About That Weird Georgia Chart.”
11. Kathleen Hall Jamieson and Paul Waldman, The Press Effect: Politicians, Journalists, and the
Stories That Shape the Political World (New York: Oxford University Press, 2002), 97.
12. David J. Johnson, Trevor Tress, Nicole Burkel, Carley Taylor, and Joseph Cesario, “Officer
Characteristics and Racial Disparities in Fatal Officer-Involved Shootings,” Proceedings of the
National Academy of Sciences 116, no. 32 (2019), 15877–82, 15880.
13. Johnson, Tress, Burkel, Taylor, and Cesario, “Officer Characteristics and Racial Disparities in
Fatal Officer-Involved Shootings,” 15880, 15877.
14. Alex Dobuzinskis, “More Racial Diversity in U.S. Police Departments Unlikely to Reduce
Shootings: Study,” Reuters, July 22, 2019, https://www.reuters.com/article/us-usa-police-race/more-
racial-diversity-in-u-s-police-departments-unlikely-to-reduce-shootings-study-idUSKCN1UI017.
15. Martin Kaste, “New Study Says White Police Officers Are Not More Likely to Shoot Minority
Suspects,” NPR, July 26, 2019, https://www.npr.org/2019/07/26/745731839/new-study-says-white-
police-officers-are-not-more-likely-to-shoot-minority-suspe.
16. Dean Knox and Jonathan Mummolo, “It Took Us Months to Contest a Flawed Study on Police
Bias. Here’s Why That’s Dangerous,” op-ed, Washington Post, January 28, 2020,
https://www.washingtonpost.com/opinions/2020/01/28/it-took-us-months-contest-flawed-study-
police-bias-heres-why-thats-dangerous/.
17. Knox and Mummolo, “It Took Us Months.”
18. Knox and Mummolo, “It Took Us Months.”
19. The hearing, set for September 19, 2019, was announced on the committee’s website:
“Oversight Hearing on Policing Practices,” House Committee on the Judiciary,
https://judiciary.house.gov/calendar/eventsingle.aspx?EventID=2278.
20. Stephen Soumerai and Ross Koppel, “How Bad Science Can Lead to Bad Science Journalism
—and Bad Policy,” Washington Post, June 7, 2017,
https://www.washingtonpost.com/posteverything/wp/2017/06/07/how-bad-science-can-lead-to-bad-
science-journalism-and-bad-policy/.
21. Prachi Sanghavi, Anupam B. Jena, Joseph P. Newhouse, and Alan M. Zaslavsky, “Outcomes
of Basic versus Advanced Life Support for Out-of-Hospital Medical Emergencies,” Annals of
Internal Medicine 163, no. 9 (November 3, 2015): 681–91.
22. Soumerai and Koppel, “How Bad Science Can Lead to Bad Science Journalism.”
23. “Advanced Care, Increased Risk: Advanced Life Support Ambulance Transport Increases
Mortality,” ScienceDaily, October 13, 2015,
https://www.sciencedaily.com/releases/2015/10/151013102416.htm.
24. Lena H. Sun, “Ambulances with Less-Sophisticated Gear May Be Better for Patients,”
Portland (ME) Press Herald, October 12, 2015,
https://www.pressherald.com/2015/10/12/ambulances-with-less-sophisticated-gear-may-be-better-
for-patients/.
25. Soumerai and Koppel, “How Bad Science Can Lead to Bad Science Journalism.”
26. “An Interview with John R. Lott, Jr.,” University of Chicago Press (website), 1998,
https://press.uchicago.edu/Misc/Chicago/493636.html.
27. “An Interview with John R. Lott, Jr.,” University of Chicago Press (website).
28. Patricia Grambsch, “Regression to the Mean, Murder Rates, and Shall-Issue Laws,” American
Statistician 62, no. 4 (2008): 289–95.
29. Jens Ludwig, “Concealed-Gun-Carrying Laws and Violent Crime: Evidence from State Panel
Data,” International Review of Law and Economics 18, no. 3 (1998), 239–54.
30. Ian Ayres and John J. Donohue III, “Shooting Down the ‘More Guns, Less Crime’
Hypothesis,” Stanford Law Review 55, no. 4 (2003): 1193–312.
31. The Regional Educational Laboratory at Florida State University provides an annotated
bibliography of this research in a section of its website named “Ask a REL Response”:
https://ies.ed.gov/ncee/edlabs/regions/southeast/aar/u_03-2019.asp.
32. Donna Gordon Blankinship and the Associated Press, “New CEO: Gates Foundation Learns
from Experiments,” Hartford Courant, May 28, 2009, https://www.courant.com/sdut-us-gates-
foundation-raikes-052809-2009may28-story.html.
33. John V. Pepper, review of The Bias against Guns: Why Almost Everything You’ve Heard about
Gun Control Is Wrong, by John R. Lott Jr., Journal of Applied Econometrics 20, no. 7 (2005), 931–
42, 931.
34. David Hemenway, unpublished review of The Bias against Guns: Why Almost Everything
You’ve Heard about Gun Control Is Wrong, by John R. Lott Jr., https://cdn1.sph.harvard.edu/wp-
content/uploads/sites/247/2013/02/Hemenway-Book-Review.pdf.
35. Knox and Mummolo, “It Took Us Months.”
36. Jacques Gallant, Paul Hunter, and Vjosa Isai, “How Alleged Serial Killer Bruce McArthur Hid
in Plain Sight for Years,” Toronto Star, March 16, 2018,
https://www.thestar.com/news/gta/2018/03/16/how-alleged-serial-killer-bruce-mcarthur-hid-in-plain-
sight-for-years.html.
37. David Bell, “U of C Serial Killer Expert Says There May Be More Bruce McArthur Victims,”
Canadian Broadcasting Corporation News, February 12, 2019,
https://www.cbc.ca/news/canada/calgary/sasha-reid-serial-killer-database-university-of-calgary-
bruce-mcarthur-1.5016846.

INDEX

abstractions, 24, 108


accuracy, 83, 102
timeliness vs., 83–87, 88, 112
Africa, 19, 35, 37, 53–54
although, when to use, 82
ambulances, life-support equipment, 91–92
Anderson, Kate, 48
Archilochus, 81–82
assumptions, 82, 91–92, 99
AstraZeneca, COVID vaccine, 59
Atlantic, 42–43, 44–45
attribution, unnecessary, 87–88, 112
audience. See readers
Australia, 38–41
authority, 67, 87, 112
authority figures, 71

Barnes, Rebecca, 33–34, 47, 66–68


base rates, 61–62
Bedi, Neil, 77–78
bias, ix, 100
in policing, 89–91, 97
psychological, 97
representative, 60
Biden, Joseph R., 56–57
big data, 60
Bill and Melinda Gates Foundation, 95–96
Bloomberg News, 65
Boston Globe, 41–42
Bristielle, Antoine, 59
Brookings Institution, 47–48
Bureau of Labor Statistics, 15–17, 21
Bush, George W., 88
but, when to use, 82

Cairo, Alberto, 84–85


Canada, 28
Global Affairs Canada, 33–34, 47, 66–67
homicide and missing persons cases, 71–75, 98–99
Project KARE, 75–76
Violent Crime Linkage Analysis System (ViCLAS), 76
Cappelli, Peter, 14
causality/causation, 78–80
vs. correlation, 92–97, 98, 102
reverse, 94–95, 96
CBS, 60 Minutes, 63–64
Cesario, Joseph, 89, 91
charts, 27
bar, 65–66, 83–85
pie, 64, 65
China: COVID-19 pandemic epicenter, 43
rare-earth-mineral market control, 65–66
chronological data presentation, 83–85
coalition building, 18–20
cognitive dissonance, 78, 87–88, 111
coincidence, 93–95, 113
Cold War, 55, 68
college attendance, parental education and income factors, 25, 108–9
communication: as data storytelling, 5–6
doctor-patient, 39
errors and miscommunication, 83, 86–87
face-to-face, 22
of scientific information, 41–43, 59–60
communication skills, of data analysts, 7, 22, 108
comparability, of data, 47, 48–50, 58, 87–88, 110
complexity, of issues: data limitations and, 80, 82
readers’ understanding of, 66, 111, 113
concealed-carry laws, 92–94
conclusions: acknowledging the limitations of, 80–81
basis for, 82
correlation-based, 92
flawed data–based, 89–91, 97
reversal of, 81–82
unnecessary attributions and, 87–88, 112
confounders, 94, 95–97
contextualization, of data, 8, 61–66, 101–2
corporations, 28, 82
correlation, distinguished from causation, 92–95, 113
COVID-19 pandemic, 41–45, 61, 83–85, 87
COVID-19 vaccine, 58–60, 66
Operation Warp Speed, 63–64
politicization, 40–41, 59
credibility, 8, 53, 59, 67–68, 93, 97
crime: gun-related, 92–94
violent, 89–90. See also law enforcement
critical research designs, 67
critical thinking, 8
Cuban Missile Crisis, 54–55
Current Population Study, 15

“dark data,” 68, 80


data: appropriate, 101
aversion to, 67, 68
coding, 76, 79, 84–85
comparability, 47, 48–50, 58, 87–88, 110
contradictory, 80–83, 87–88
excessive (“data dumping”), 50–53, 64, 107, 110
lack of, 44–45, 46–48, 80, 110, 112
layering, 50–55, 61–64, 110
limitations, 80–86, 100
as narrative prop, 17–21
negative, presentation of, 18–21
nonrepresentative, 81, 112
open access to, 24–25
reasons for using, 35–38
as subtext, 52, 110
suggestive, xii, 81–83, 112. See also data points
data analysis, 6–7
commensurable, 49, 110
by readers, 86, 101
of social media–sourced data, 60–61
data analysis errors, 83, 100
causal inference–related, 93–95, 113
errors in logic, 89–92
misleading data visualizations, 83–87
in serial homicide cases, 99
data analysts: communication skills, 108
“fox” and “hedgehog” approaches of, 81–83
listening skills, 108
data collection, 8
commensurable, 49, 110
delayed, 68, 85
determination of indicators in, 47–48
in diplomacy, 33–34, 66–68
inadequate, 100
in law enforcement, 75–78
data dumping, 50–53, 64, 107, 110
data points, 19, 20, 33, 36, 53, 107–8, 110
descriptive, 37
in empirical research, 99
for examples, 42, 109–10
flawed, 87
number of, 42, 47, 50–52, 56, 74–75, 77, 107
quantitative vs. qualitative, 79
reiteration, 53–55, 110
in serial homicide database, 77
story expansion function, 53–54, 110
data silos, 33–34, 35, 66
data storytelling, 3–9
divergent stories, 80
health care–related, 38–41
persuasion by, 4–5. See also narratives; people-based data
storytelling
data visualizations, 27, 83–86
captions to, 84–86, 112
contextualization, 64–66
decision-making, 21, 46, 63–64, 67–68
data selection for, 50, 101
evidence-based, ix
expert opinion–based, 42
in public health care policy, 40–41, 44–45
in writing, 38, 50, 109
democracy, 36, 37, 56
demographic data, 51, 60, 74–75, 89
descriptive data points, 37
development: in education, 47–49, 53–54
international cooperation for, 57
sustainable, 46–48
Displaced Worker Supplement, 15

education: class size–student performance relationship, 95–96


development goals of UN, 47–49
gender disparities, 53–54
parental education–college attendance relationship, 25–26, 108–9
standardized test scores, xii
effectiveness, of data storytelling, 101–2. See also tips, for data
storytelling
empirical research, 93–94, 97, 99–100
energy policy communications, 26–27
entrepreneurship, 24–25
Ethiopia, 18–21
evaluative research questions, 36, 37, 109
evidence, conflicting, 81–83
evidence-based claims, 8
evidence-based policy, ix, 97–98
evidence-based solutions, 17–18, 38
evidence building, 45, 66–68, 110
examples, 46
ideal number of, 42, 109–10
experiences, data storytelling–based, 4, 40, 107
imagery of, 22, 108
re-creation for readers, 22, 107, 108
expert opinion / experts, 14, 21, 41, 42, 44–45, 63–64
unnecessary use of, 87–88, 112
explanations, 27–28, 64

facts, 4, 5, 87
complexity, 95, 113
unnecessary attribution vs., 87, 112
Faizi, Abdulbasir, 74
fake news, 50–51, 52
Federal Communications Commission, 24, 108
Finland, Ministry for Foreign Affairs, 47, 60–61
first drafts, 46
focus groups, 21–23, 28
France, 54, 59–60

General Accounting Office (GAO), 15–17, 21–22, 28–29, 37


Georgia Department of Public Health, COVID data, 83–86
Global Affairs Canada, 33–34, 47, 66–67
Global Positioning System (GPS), 25, 74
goals: for data use, 35–38
goals, of data storytelling, 20, 35–36, 39–40, 107–8, 109
readers’ goals, 38, 46, 101, 109. See also policy goals
Gore, Albert “Al,” 88
Grambsch, Patricia, 94
Grant, Adam, Think Again: The Power of Knowing What You Don’t
Know, 53
graphs, 51–52, 64, 65–66, 112
captions, 85–86, 112
data visualizations, 84–86
line, 13
Great Recession, unemployment data, 13–17, 21–23, 28–29
gross domestic product (GDP), 13, 27–28
Guardian, 38
guns: concealed-carry laws, 92–94
gun control laws, 97–98
gut reactions, to data-driven stories, 22, 108

Harford, Tim, The Data Detective, 46–47, 61, 62, 68


Harvard Injury Control Research Center, 97
Harvard University, Kennedy School, 83
headlines, 26, 92, 101
health care, science-based information about, 38–41
during COVID-19 pandemic, 38–41
health care professionals, ethical decision-making, 40–41
Heider, Fritz, 3–5
Hemenway, David, 97
histograms, 64
homicide, 71
gun-related, 92–94, 97
serial, 73–77, 98–99
honesty, xii, 83–86, 102, 112
however, when to use, 82
humanization, of data scale, xii–xiii, 64–66, 74, 111
humor, 24, 108

imagery, 22, 108


impact, 50, 54, 68, 86, 88, 97, 108
focus on, 26–27, 28, 101
measurement, 34–36, 37
multidimensional views and, 59
relation to outcome, 29, 37, 109
inflation, 13, 19, 20–21
information: audience’s wants and needs for, 17–20, 26, 61, 101, 107
excessive, 6, 42, 107, 110
processing of, 3–4, 6, 58
retention, 5. See also data
input, relation to outcome, 25–27, 37
integrity, 8, 80–88
International Monetary Fund (IMF), 57
internet, 60
broadband access, 26
net neutrality, 24, 108
interventions: evaluation of, 35. See also solutions
interviews, 21, 22, 29, 63–64, 89, 99, 108
Iraq War, 57

Jamieson, Kathleen Hall, The Press Effect, 88


Job Openings and Labor Turnover Study, 15
John Jay College of Criminal Justice, 78
Johnson, David J., 91

Katzenstein, Peter J., Anti-Americanisms in World Politics, 55–56, 57, 58
Kayhan, Majeed, 74
Kennedy, David, 78
Kennedy, John F., 55
Keohane, Robert O., Anti-Americanisms in World Politics, 55–56, 57, 58
Knox, Dean, 90–91, 97–98
Kohl, Herb, 29
Koppel, Ross, 91–92

language, simplicity of, 39, 95, 113


law enforcement: crime prevention programs, x–xi, 77–78
missing persons and serial killer cases, 73–80
racially biased police violence, 89–91, 97
Leonhardt, David, “The U.S. Is Lagging behind Many Rich Countries.
These Charts Show Why,” 27–28
LGBTQ+ community, 74–75
life expectancy, 27–28
line graphs, 13
listening skills, 22, 108
logical fallacy, 90–91
Lott, John R., Jr.: The Bias against Guns: Why Almost Everything
You’ve Heard about Gun Control Is Wrong, 97
More Guns, Less Crime: Understanding Crime and Gun Control
Laws, 92–94

Machill, Marcel, 5
Macron, Emmanuel, 59–60
Madrigal, Alexis C., 44–45
main finding. See message (main finding)
manipulation, of data, 100
Martin, David, 63–64
McAdams, Dan P., The Redemptive Self: Stories Americans Live By, 5
McArthur, Bruce, 71–72, 98–99
McGrory, Kathleen, 77–78
meaning, in data, 8, 33–68
paraphrasing and, 24
medical writing, 38–41
message (main finding), 38, 65, 108–9
impact-focused, 26–27
placement, 25–26, 108–9
subtext, 52, 110
methodology, 25, 108
Meyer, Robinson, 44–45
missing persons cases, 71–77
multidimensional views, 55–60
Mummolo, Jonathan, 90–91, 97–98

narratives, 45
enhancement of, 46
political, 19
strong differentiated from weak, 8–9
as TV news format, 5
National Assessment of Educational Progress (NAEP), 49–50
National Bureau of Economic Research, 14
Navaratnam, Skandaraj, 71–75, 98–99
net neutrality, 24, 108
New York Times, 27, 45
New York University, Center on International Cooperation (CIC), 17–
21
Nocco, Chris, 77–78
North Atlantic Treaty Organization (NATO), 57, 65
NPR (National Public Radio), 89
numbers: difficult-to-grasp, xii–xiii, 62–63, 110–11
ratios, 14, 34, 107

Obama, Barack, 13–14


energy policy proposals, 26–27
“Fast Broadband for All” initiative, 26
“Open Data Initiative,” 24–26
objectivity, 15, 28, 99, 109
Ohio State University, 4–5
older people: COVID vaccinations, 59–60
employment/unemployment data, 15–17, 21–23, 28–29, 72, 86, 111,
112
Oliver, John, 24, 108
on the other hand, when to use, 82
open access, to data, 24–26
“Open Data Initiative,” 24–26
Operation Warp Speed, 63–64
opportunity, problems framed as, 18–21
Organisation for Economic Co-operation and Development (OECD),
48–49
outcome: confounding factors, 94, 95–96
data-driven discussions, 38–41
negative, 20, 67
relation to impact, 29, 37, 109
relation to input, 25–27, 37
relation to output, 35, 37, 38, 48, 58, 109
reverse causality, 94–95, 96
output, 34, 36–37
goals demonstrated by, 37
relation to impact, 37, 109
relation to outcome, 35, 37, 38, 48, 58, 109

paragraphs, 54
with message (main finding), 27–28
and simplicity, 95, 113
paramedics, 91–92
paraphrasing, 24
parental education–college attendance relationship, 25–26, 108–9
parentheses, proportions in, 72, 111
Pasco County, FL, crime prevention program, 77–78
Pepper, John V., 97
peer review, 90
Penn Today, 14
people-based data storytelling, 13–29, 37, 101–2, 107–8
abstractions, 24, 108
focus groups and, 22–23, 28
imagery, 22, 108
personal stories, 22–24
qualitative data, 74, 79
quotations, 23–24, 29
testimonials, 22–24
percentage change, 57, 110
percentage point difference, xii–xiii, 57, 110
Perna, Gustave F., 63–64
persuasion, 4–5, 7, 53, 82
excessive data and, 53–54
multidimensional views and, 59
problem-specific solutions and, 73–74, 111
purpose and, 37, 59
susceptibility to, 17
uncertain data and, xiii, 80–82
unnecessary attribution and, 87, 112
Pew Research Center, 14, 58
Many Americans Believe Fake News Is Sowing Confusion, 50–51
plagiarism, 87, 112
points of view, 23
policy beacons, 19–21
policy goals, 46–48
broad vs. targeted, 46–48
foreign policy, 68
presidential, 26–27
“politeness norm,” 56
polls, 34, 37, 54–55, 56, 58, 60
Portland (ME) Press Herald, 92
predictive analysis, 60, 77–78, 81, 88, 92
presidential elections, 88
Princeton University, 90
problems: manufactured, 78–79, 111
presentation to clients, 18–21
problem solving: data-informed solutions, 6, 25, 73–80
identification of specific problem, 73–74
Proceedings of the National Academy of Sciences (PNAS), 89–91
process, relation to purpose, 35, 49, 66
Programme for International Student Assessment (PISA), 48–49, 50
progress, emphasis on, 19–21
proportions: differentiated from percentage changes, xii–xiii, 57, 110
in parentheses, 72, 111
props, narrative, 17–21
public diplomacy, 33–37, 54–57, 58, 68, 81
culture of evidence, 66–68
definition, 117n1
public good, 24–27
public opinion: foreign, of United States, 33–37, 54–57, 58, 68
social media as indicator of, 60–61
public policy, 8
implementation time, 27
public service, 38
purpose, 6, 8, 57, 67–68, 80, 102
data dumping and, 50–54, 60–61, 64
data indicators as focus, 48
data visualizations and, 84, 86
loss of, 45
persuasion and, 37, 59
policy goals and, 46–47, 48
readers’ goals and, 38
relation to process, 35, 49, 66
in science writing, 38–39, 41–42, 43–44
Putin, Vladimir, 57

Qu, Hong, 83–86


quantitative data, 21, 23, 26, 68, 80
combined with qualitative data, 74, 79
context, 64
quotations, 23–24, 29, 87, 108

racial minorities, racially biased police violence toward, 89–91, 97


Raikes, Jeff, 96
random and fixed effects models, 94
randomized controlled trials, 97, 113
ratios, 14, 34, 107, xii–xiii
readers: data interpretation by, 86, 101
data requirements/needs, 46
empathy and respect for, 102
as “exasperated majority,” 82
goals of, 38, 46, 101, 109
targeting data-based stories to, 17–20, 46, 109, xii
understanding of, 17–20, 101
regression analysis, 7, 21
regression to the mean, xi, 94
Reid, Sasha, 73–77, 79–80, 98–99
reporting, to convey information, 4–6, 44, 61, 68, 84, 85, 107
research, writing about, 7
research design: flawed, 89–92
regression to the mean and, 94
research findings: clarity of presentation, 42
generalization, 113
objective presentation, 28. See also message (main finding)
research questions, xi–xii, 35, 36–37, 47, 109
descriptive, 36, 109
faulty, 56, 57
prescriptive, 36, 37, 109
what’s happening?, 35–36, 109
what should be done next?, 35, 36, 37, 38–41, 109
what’s working / what’s not working?, 36, 37, 109
risk aversion, 67
Royal Canadian Mounted Police, 72
Violent Crime Linkage Analysis System (ViCLAS), 76

science and scientists, trust/mistrust in, 58, 59


ScienceDaily, 91–92
science writing, 38–45
scurvy, 39
sentences: simplicity, 95, 113
subject-verb structure, 66, 111
September 11, 2001, terrorist attacks, 34, 55–56, 78–79
Serial Homicide Database, 76–77, 79–80
serial homicides, 73–77, 98–99
Serkez, Yaryna, “The U.S. Is Lagging behind Many Rich Countries.
These Charts Show Why,” 27–28
Simmel, Marianne, 3–5
simplicity, in data storytelling, 26–27, 51–52, 54–55, 72–73, 89, 95,
113
simultaneity, 95
Sinai, Nick, 25–27
Slater, Michael D., 4–5
slogans, 26–27
Smith College, 3
social communication, 5
social media, 37, 60–61, 83
Social Security Administration, 21
solar energy, 26–27
Solis, Hilda, 29
solutions, 72–73
adoption of, 27
bad or neutral, 96
causation issue, 78–79, 96, 111
data-informed, 75–80
evidence-based, 18, 67, 68, 72
not based on data, 27
problem-specific, 73–74, 111
Soumerai, Stephen, 91–92
Srivastava, Ranjana, 38–41, 51–52
start-up companies, 24, 108
statistics, 45, 61, 62, 71–77
Stavridis, James, 65–66
student achievement, class size relationship, 95–96
Survey of Consumer Finances, 15
syntax, simplicity of, 95

television news, 5
testimonials, 22–24
Tetlock, Philip, 81–82
theories, data analysts’ approach to, x, 81–82
timeliness, accuracy vs., xiii, 83–87, 88, 112
tips: abstractions (#7), 24, 108
communication skills (#5), 7, 22, 108
comparability of data (#15), xii, 47, 48–50, 58, 87–88, 110
correlation distinguished from causation (#30), xii, 92–95, 113
data dumping (#16), 50–53, 64, 107, 110
data layering (#17), 50–55, 61–64, 110
data-supported solutions (#24), 78, 111
decision-making in writing (#12), xi, 38, 109
difficult-to-grasp numbers (#19), xii–xiii, 62–63, 110
evidence building (#14), 45, 66–68, 110
excessive data (#2), 50–53, 64, 107, 110
expansion of story function (#17), 53–54, 110
graph captions (#27), 85–86, 112
honesty and accuracy in data use (#26), xii, 83–86, 112
humanization of data scale (#20), xii–xiii, 64–66, 111
imagery (#6), 22, 108
manufactured problems (#24), 78–79, 111
message (main finding) (#8), 25–27, 38, 52, 65, 108–9
number of examples (#13), 42, 109–10
output vs. outcome and impact (#11), xi, xii, 37, 109
people-based data storytelling (#4), 107–8
percentage changes differentiated from proportions (#18), xii–xiii,
57, 110
problem-specific solutions (#23), 73–74, 111
proportions in parentheses (#22), 72, 111
randomized controlled trials (#32), 97, 113
ratios (#3), xii–xiii, 14, 34, 107
reporting to convey information (#1), 4–6, 44, 61, 68, 84, 85, 107
research questions (#10), xi–xii, 35, 36–37, 47, 109
selection of examples (#13), 42, 109–10
sentence subjects and verbs (#21), 66, 111
simplification of language (#31), 95, 113
suggestive/nonrepresentative data (#25), xii, 81–83, 112
timeliness vs. accuracy (#29), xiii, 83–87, 88, 112
tone of writing (#9), 28, 90, 109
unnecessary attribution (#28), 87–88, 112
tone, of writing, 28, 90, 109
Toronto Police Service, missing person / serial killer cases, 71–75, 98–
99
Toronto Star, 98
transparency, xii, xiii, 50, 67, 80–81
trends, 13, 20, 34, 37, 49, 60
context, 61–64
data visualizations of, 83–86
in education, 48–49
Trump, Donald J., 57, 62, 63–64
trust, 100
persuasion based on, xii
in science and scientists, 58, 59
Tufekci, Zeynep, 42–43
Turka, Laurence, 41–42
Twitter, 91

Ukraine, Russian invasion of, 57


unemployment rate data, 13–17
among older workers, 15–17, 21–23, 28–29, 72, 86, 111, 112
United Kingdom, 59–60
United Nations: Sustainable Development Goals (SDGs), 46–48
2030 Agenda for Sustainable Development, 46
United States: international public opinions of, 34–35, 54–57, 58, 68
wealth distribution in, 27–28
University of Calgary, 73
University of Pennsylvania, Wharton School, 14, 81
University of Virginia, 97
US Army, 64
US Department of Defense, 55
US Department of Education, National Assessment of Educational
Progress (NAEP), 49–50
US Department of Labor, 21
Ready to Work Partnership, 29
US Department of State, 67, 68
Data-Driven Public Diplomacy, 34
US House Committee on the Judiciary, 91
US Information Agency (USIA), 54–55, 68
US Senate, Special Committee on Aging, 29

verbs, action-oriented, 66, 111


Violent Crime Linkage Analysis System (ViCLAS), 76
Von Chamier, Paul, 17–21

Waldman, Paul, The Press Effect, 88


Washington Post, 90, 92
wealth distribution, 27–28
weather data, 25
words: judgmental, 28, 109
simplicity, 95, 113
World Bank, 57
World Health Organization (WHO), 43–44
writer’s toolbox, 6

YouTube, 3

Zimmer, Carl, 45

About the Authors

David Chrisinger is the executive director of the Harris Public Policy Writing Workshop at the University of Chicago, where he teaches policy
design and communication. He is the author of Public Policy Writing That
Matters and also serves as the director of writing seminars and teaches
memoir writing for The War Horse, an award-winning nonprofit newsroom
educating the public on military service, war, and its impact. He wrote a
book based on his teaching, also published by Johns Hopkins University
Press, titled Stories Are What Save Us: A Survivor’s Guide to Writing about
Trauma. Before coming to the University of Chicago in 2019, David spent
nearly a decade working as a communications specialist for the US
Government Accountability Office, where he helped write and edit public
policy reports and testimonies for Congress, turning complex data into
relatable narratives. The topics that his research and writing covered
included primary, secondary, and higher education; retirement security;
social policy; and issues related to military veterans’ transitions back to
civilian life. For six years David also taught public policy writing to
graduate students in the Master of Public Policy Program in the Bloomberg
School of Public Health at Johns Hopkins University.

Lauren Brodsky is a lecturer in public policy at the Harvard Kennedy School, where she teaches degree program courses on persuasive
communications, policy analysis, and writing. Lauren is also the faculty
chair of an executive education program and created the website Policy
Memo Resource, which is both a database and a blog about effective
writing: policymemos.hks.harvard.edu. Lauren’s articles have appeared in
Harvard Business Review and Fast Company. Before coming to HKS in
2014, she taught courses on communications and international relations at
Northeastern University, Tufts University, University at Albany–SUNY, and
Skidmore College. She is a former Theodore Sorensen Research Fellow at
the John F. Kennedy Presidential Library and Museum, where she
conducted archival research on public diplomacy programs during the
Kennedy administration. Lauren holds a PhD from the Fletcher School at
Tufts University, where she studied the use of US international broadcasting
to promote democracy abroad.
