BECAUSE DATA CAN’T SPEAK FOR ITSELF
© 2023 David Chrisinger and Lauren Brodsky
All rights reserved. Published 2023
Printed in the United States of America on acid-free paper
2 4 6 8 9 7 5 3 1
A catalog record for this book is available from the British Library.
Special discounts are available for bulk purchases of this book. For more information, please contact
Special Sales at specialsales@jh.edu.
A worldview is not a Lego set where a block is added here, removed there. It’s a fortress
that is defended tooth and nail, with all possible reinforcements, until the pressure
becomes so overpowering that the walls cave in.
CONTENTS
FOREWORD
INTRODUCTION
Why Should You Learn to Tell Stories with Data?
PART I PEOPLE
Telling Stories with Data about People for People
PART II PURPOSE, THEN PROCESS
Finding Meaning in the Data and Making It Work for You
PART III PERSISTENCE
Using Data to Solve Wicked Problems with Integrity
Conclusion
Acknowledgments
Tips to Help You Write More Effectively with Data
Notes
Index
FOREWORD
Ethan Bueno de Mesquita
ETHAN BUENO DE MESQUITA is the Sydney Stein Professor and deputy dean at the Harris
School of Public Policy at the University of Chicago. His research focuses on applications of
game theoretic models to a variety of political phenomena including conflict, political violence,
and electoral accountability. He is the author of Thinking Clearly with Data and Political
Economy for Public Policy, both published by Princeton University Press, as well as many
articles in leading journals in both political science and economics. His research has been
supported by the National Science Foundation, the Office of Naval Research, and the United
States Institute of Peace. Before arriving at the University of Chicago, Ethan taught in the
political science department at Washington University in St. Louis. He received a BA from the
University of Chicago in 1996 and an MA and a PhD from Harvard University in 2003.
INTRODUCTION
Why Should You Learn to Tell Stories with Data?
Because data can’t speak for itself—that’s why. Case in point: In the spring
of 1944, as the Allies prepared to invade Hitler’s Fortress Europe, two
psychologists from Smith College—Fritz Heider and Marianne Simmel—
published the results of a study that forever changed our understanding of
how people make sense of new information. Heider and Simmel created a
90-second film of three black geometric shapes moving across a white, two-
dimensional background and then had three groups of college students
watch the film. They asked the first group to describe what they saw
without any further prompting. The second group was told to interpret the
moving shapes as though they were people acting in the real world. The
researchers gave the third group the same instructions as the second group,
only the film the third group watched was shown in reverse. You can watch
the video for yourself on YouTube; search for “Heider and Simmel (1944)
animation.”1
At the start of the film, a triangle gets locked inside a larger rectangle
before a smaller triangle and a circle enter the scene from the top of the
screen. The bigger triangle then leaves the confines of the rectangle when
the left side of it opens outward on a hinge, like a door. The two triangles,
now out in the open, repeatedly bash into each other as the circle moves
toward the rectangle. The action continues until the bigger triangle finds
itself locked inside the rectangle again, like it was at the beginning of the
film.
Heider and Simmel found that of the 114 students they tested, only 3
(1 in the first group and 2 in the third group) didn’t interpret what they saw
as characters acting out a story. The vast majority, in fact, invented quite
elaborate stories to explain what they saw. Some students saw the triangles
as two men fighting, and they saw the circle as a woman trying to evade the
bigger triangle, which was clearly an aggressive bully in their minds. Many
students perceived the smaller triangle and the circle as “innocent young
things,” while the bigger triangle was “blinded by rage and frustration.”2 To
make sense of an exceedingly complicated world, Heider and Simmel
argued, most people must turn facts, data points, observations, and other
aspects of life into a story with characters who have different needs and
who must confront one another to get whatever it is they desire.
Tip #2. Communicate, don’t complicate. The last thing people need is
more information. They have far too much of it already. What they need is
help making sense of all that information and to understand the difference
between what’s important and what’s just noise.
This book is filled with tips to help you write persuasive stories with data
Try to think of communicating well with data as a craft not unlike the
skilled trades of carpentry, masonry, or blacksmithing. These skilled trades,
like communicating with data, require both technical and artistic skill to
fulfill a specific purpose. The carpenter who designs and builds your
favorite chair, for instance, would have much more difficulty doing so
without having all the right tools and techniques at their fingertips.
The tips of this book—32 in all, woven throughout and made prominent
—are tools you should store in your own writer’s toolbox for future use. As
you notice these tools being used in the real world and as you learn more
about them—and practice using them, too—communicating well with data
will eventually become second nature to you, like a carpenter driving a 16-
penny nail with a single thwack! of their framing hammer. Not every
writing tool will be needed at all times, but they will be ready to use when
called for.
If you bought this book hoping to learn how to analyze the great
multitude of data out there, we’re sorry to say this isn’t the book for you. At
least not yet. But once you’ve grasped how to crunch the numbers, build
your models, and run those regressions, you’ll be ready to dust this baby off
and learn all you need to know—and what you should avoid—when
communicating the so what? of your data analysis to readers who may not
know a regression from an inkblot test. They will need your communication
skills to help them get it, and to care about it.
This book is also for anyone who is tasked with writing about other
people’s research in ways that are accurate and persuasive, especially for
readers who are more interested in having answers to their questions than
they are in learning the technical details of how those answers were
obtained. If that describes you, you’re going to benefit especially from “Part
III. Persistence,” which will teach you all you need to know to write with
integrity every time you write with data.
For a complete list of all our tips, please flip to page 107.
While nobody ever said this was easy, there is hope!
Most people—including those in the highest political offices in the land—
simply do not have the time or expertise to properly interpret and assess the
credibility and usefulness of available data and the countless reports,
studies, and analyses of data released every day. It’s not that most people
aren’t smart enough. Far from it. Whether you’re a policy analyst,
consultant, journalist, academic, politician, or CEO, the reality is that if
telling effective stories with data were easy, you’d be doing it already.
Sadly, nothing about it is easy—not collecting and sifting data; not
confronting its contradictions and conflicts; and not creating a framework
for describing what we know from it, evaluating what works, or devising
next steps for corrective action.
But have no fear! Because data can’t speak for itself, each day we commit to this work we are afforded an incredible opportunity to think critically about what we know, what we don’t, and why anyone should care
either way. Claims we make about the world, when supported by credible
evidence, have the power to change the way a reader sees the world—the
first step in a long journey to creating positive and lasting change.
Our interest in communicating effectively with data stems from our
combined decades of teaching public policy students and practitioners to
use their data in support of a story that helps readers make sense of
something. Sometimes the people we teach must learn how to use more
data, other times less. Some need to learn how to explain and contextualize
the evidence they have, while others need to figure out how to collect data
that would help them say something valuable. Above all else, nearly every
writer we’ve ever taught or consulted with has needed help figuring out
how to tell stories about data that meet the needs specific to their readers.
That’s what we’ll be covering in “Part II. Purpose, Then Process.” We’ll
show you how to think about why you do the work you do and how
knowing the why can lead you not only to finding meaning in your data but
also to gathering and describing that data in a way that’s useful. It often helps to show your reader how data pertains to real people, which is what we’ll discuss in “Part I. People.”
Part III of the book, as we mentioned above, deals with matters of
integrity. How, for example, do we tell accurate stories with data that can
actually help solve some of society’s most intransigent problems, not just
describe them? By the end of this book, you will understand what separates
strong data narratives from weak ones and will have a much better sense of
how to turn the latter into the former.
We wrote this book because we wanted something useful (and succinct)
to share with our students and colleagues that covers all we’ve learned over
our many years of teaching. It’s our hope that this short book will equip you
with enough writing tools that you’ll feel more confident the next time you
write in support of a change you’d like to see happen in the world.
What are we waiting for? Let’s get started!
PART I
PEOPLE
Telling Stories with Data about People for People
Tip #3. Ratios can help readers make sense of large numbers. Saying
“one in four people” is much easier for readers to picture than “7,526,333 of
30,111,489 people.”
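If you’re ever unsure how to reduce a raw count to a memorable ratio, the arithmetic is simple enough to automate. Here is a minimal sketch in Python (the helper function is our own illustration, not anything from the book’s sources):

```python
from fractions import Fraction

def humanize_ratio(part: int, whole: int) -> str:
    """Turn a raw count like '7,526,333 of 30,111,489' into a
    reader-friendly phrase like 'about 1 in 4'."""
    # Cap the denominator at 10 so the ratio stays easy to picture.
    approx = Fraction(part, whole).limit_denominator(10)
    return f"about {approx.numerator} in {approx.denominator}"

print(humanize_ratio(7_526_333, 30_111_489))  # -> about 1 in 4
```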
Peter Cappelli summed up nicely what all this data tells us. “One in five employees lost their jobs at the beginning of the Great Recession,” the Wharton professor of management told Penn Today in 2018. “Many of those people never recovered; they never got real work again.”3
It’s not your job to tell the reader everything there is to know
Who exactly are these people who never recovered, who were still
unemployed—through no fault of their own—four years after the recession
started? Mostly they were what the federal government calls “older
workers,” meaning people aged 55 years or older who could and want to
work. These folks had an especially hard time recovering after being laid
off at the height of the recession. Taking a look at figure 1.1, you’ll see that the number of unemployed older workers peaked in February 2010 at 2.3 million. Fast-forward 22 months, to December 2011, and the number was still quite high at 1.9 million.
To make sense of what was happening to unemployed older workers, the
US Government Accountability Office (GAO) analyzed multiple sets of
data related to unemployment from the Bureau of Labor Statistics,
including data from the Current Population Survey, Job Openings and
Labor Turnover Survey, and Displaced Worker Supplement. GAO also
analyzed data on retirement savings from the 2007 Survey of Consumer
Finances. Lastly, GAO ran microsimulations to estimate retirement income for workers who stopped working at different ages. For those
unfamiliar with GAO’s work, you should know that GAO is considered the
“supreme audit institution of the United States,”4 meaning that policy
analysts from all levels of government—both domestic and international—
look to GAO as an exemplar of how to evaluate governmental performance
objectively.
Here’s what GAO found after crunching all those numbers: the rate of long-term unemployment among older workers rose at a greater pace than it did for younger workers, and by 2011, more than half of all unemployed
older workers had been actively looking for work for longer than six
months.5 Bear in mind that the Bureau of Labor Statistics counts people as
unemployed only if they are still looking for work. Those who give up for
whatever reason no longer count—at least as far as the unemployment rate
is concerned. Therefore, the true number of older unemployed workers was
even greater.
Figure 1.1. Millions of older workers were still unemployed more than two years after the recession
ended, according to the Government Accountability Office’s analysis of 2007–2011 Current
Population Survey data.
Source: Government Accountability Office, Unemployed Older Workers: Many Experience
Challenges Regaining Employment and Face Reduced Retirement Security, GAO-12-445
(Washington, DC: GAO, April 2012), figure 3
Tip #4. Don’t forget there are real people behind all those numbers
you’re crunching. Readers will care a hell of a lot more about people than
about data points, so if your goal is to get the reader to care, find the people
in the numbers and tell a story about how those people are affected.
What did the Ethiopians really want to know? They wanted to know
whether they should pay down their debt or if there was another strategy
that would work better to even out the country’s exchange rate. “We
analyzed the dynamics of the situation,” says Von Chamier, “and found out
that government-owned companies were using a parallel exchange because
international companies refused to use an overvalued rate. No one used
official values. In other words, some of the ‘hit’ had already been factored
in.” Therefore, the CIC report explained—quite delicately—that, in Von
Chamier’s words, “the cost of making inflation realistic is not as hard as
you think. This is a fresh fact. The cost has been organically absorbed by a
parallel market.”
To improve the palatability of the story he needed to tell, Von Chamier
mimicked language used by the Ethiopians to describe their situation. His
readers cared much more about the story than they did about the data.
That’s not to say data isn’t important; it’s incredibly important. But it’s not
more important than the story. The story is about Ethiopia as a beacon for
what is possible in the future.
Tip #5. If you want to be an exceptional data analyst, you must learn
how to talk to people. And we mean really talk to people—and listen, too.
What GAO heard in their focus groups was much more revealing—and
moving—than what they gleaned from government databases.7 Many of the
older workers perceived that employers were reluctant to hire them because
of their age. One 57-year-old man, for example, said, “I had a hand in some
of the hiring. You know, it wasn’t for publication, but the guy said, ‘Don’t
hire anybody older than me or fatter than me.’ ” A woman a year younger
remarked, “I have a job interview tomorrow for a job at 50 percent of my
salary for $25,000 a year. And you know what? I’ll take it if they offer it to
me because I can keep looking while I have [the job]—if they want me after
they see how old I am when I walk in the door.”
Tip #6. When you want your readers to remember your story, use
striking imagery that will stick with them over time. When looking for
details to include in data-driven stories, pay attention to your gut reactions.
If you feel like you’ve been punched in the gut after reading a statistic or a
quote from an interview, take note of that. Try to re-create the experience
for the reader. Chances are if you felt something, they’ll feel something too.
Tip #8. Try starting with the main finding—your message—not facts or
your methodology. Instead of this: One study probed the relationship
between parental education and income and participation in postsecondary
education and found that young people from moderate- and low-income
families were no less likely to attend college in 2001 than they were in
1993. Try this: Young people from moderate- and low-income families were
no less likely to attend college in 2001 than they were in 1993, according to
one study.
“We like to talk about inputs,” Sinai recalls, “not outcomes,” but it’s
outcomes that most interest entrepreneurs and innovators. The focus of the
story he needed to tell, Sinai realized, had to be on the fact that there were
countless ways to use all the data the government had collected to make
real, lasting, and positive change in people’s lives.9
Sinai was later able to apply this lesson to another data-communication
challenge when working on the Obama administration’s initiative to bring
fast broadband to students around the country. Rather than focus on how
much the administration was going to improve broadband infrastructure,
Sinai focused his communication to the public on explaining outcomes and
impact. He led with the goal—that 99 percent of students would soon have
access to fast broadband—rather than how they were going to accomplish
that goal (that is, the method). The campaign’s slogan became “Fast
Broadband for All.”
“When you have a presidential policy goal,” Sinai explains, “you need to
do a good job of articulating that goal, externally and internally, in a way
that is measurable and realistic.” To do this, you can think about how a
journalist might write about the policy initiative to give readers “actionable,
timeboxed, and specific metrics,” which can be used to evaluate the
effectiveness of a given initiative. According to Sinai, the “timebox” is
when the initiative will happen. Journalists and the public also need to
know which metrics will be used to evaluate it. But they want that
information conveyed in a story—preferably one that can fit in a headline.
When readers want fresh-baked cookies, sharing a recipe from your mother’s side of the family won’t immediately curb their appetite.
Sinai also applied this strategy when crafting messages for President
Obama’s energy policy proposals. Rather than relaying quantitative data
and projections chock-full of scientific jargon—all too common in energy
policy communications—Sinai’s message was simple, and it focused on
impact: “Solar as cheap as coal by 2020.” This slogan was particularly
attractive because it makes its point clearly, even to the reader who doesn’t
know how expensive coal is—or solar, for that matter. What does matter is
the potential impact on people who likely would adopt solar technology if it
were as cost-effective (in the short term, anyway) as burning coal to
produce electricity. Even though the Obama administration wasn’t able to
achieve its goal in this area, Sinai’s use of a story-first strategy was the right
tack. It’s important to remember that in the world of public policy, some
ideas take years—even decades—to catch on. Just because your solution is
not instantly adopted by the readers you’ve addressed doesn’t mean it never
will be.
Another good example of starting with the story (and using data to
support that story) comes from David Leonhardt and Yaryna Serkez. In July
2020, they published an opinion essay in the New York Times titled “The
U.S. Is Lagging behind Many Rich Countries. These Charts Show Why.”
The first two paragraphs of the essay are worth quoting in full:
The United States is different. In nearly every other high-income country, people have both
become richer over the last three decades and been able to enjoy substantially longer lifespans.
But not in the United States. Even as average incomes have risen, much of the economic
gains have gone to the affluent—and life expectancy has risen only three years since 1990.
There is no other developed country that has suffered such a stark slowdown in lifespans.10
How great is that first line? “The United States is different.” Different how?
Keep reading to find out! Leonhardt and Serkez then present their data in a
series of 11 data visualizations that cover everything from life expectancy,
GDP per capita, and rates of union membership to health expenditures as a
share of GDP and the distribution of national income across the economy.
The authors’ goal, as the title of the essay indicates, is not just to point
out problems but rather to offer an explanation as to why the United States
can’t seem to produce the same positive outcomes for its citizens that other
rich countries have managed to do: Britain, Denmark, Japan, Canada, and
Germany among them. What they found was that multiple contributing factors worked together to make corporations and rich people in the United States wealthier and more powerful over time, mostly at the expense of middle- and
working-class families. The data they present, Leonhardt and Serkez argue,
shows that most American workers and their families “receive a smaller
share of society’s resources than they once did and often have less control
over their lives.” Moreover, their “lives are generally shorter and more
likely to be affected by pollution and chronic health problems.”11
By focusing on who is impacted, where that impact occurs, and how that
impact is felt, the authors show a “disturbing new version of American
exceptionalism” that acts as a frame to help the reader make sense of all the
data that follows.
Tip #9. The tone of your writing matters—a lot. If you want your reader
to see you as objective, use an objective tone and present your findings as
objectively as possible. Avoid judgmental words such as failure or
incompetence.
PART II
PURPOSE, THEN PROCESS
Finding Meaning in the Data and Making It Work for You
You may want to accomplish all three of these goals. And you absolutely
can. But first you have to decide which goal(s) to pursue because, once you
do, you’ll then be able to focus your attention on asking better research
questions—and getting better answers.
Let’s start with the first goal you may have for using your data to tell an
effective story: you want to understand and describe what is happening (and
how we got here). Imagine that you’re a deputy-level policy professional in
charge of communicating the impact of America’s efforts to spread
democracy around the world. How’s that going? Well, let’s say you want to
know specifically about North Africa and the Middle East, so you find out
how many events the US embassy in Tunisia, for example, has sponsored
that are somehow related to improving conditions favorable to democracy.
And let’s say there were 35 such events. Now let’s ask ourselves, Why were
those events held? Was the goal to help change public opinion about
something? To serve as inspiration of some kind? Or maybe the point of the
events was to show Tunisians what the United States stands for and what
we support (and will not support). It’s hard to say without having more
information, but let’s assume the 35 events at least show what matters to the
embassy. That’s the story. That’s the why. And that’s where you should
begin.
Tip #10. Ask better research questions. Good questions drive good
stories, and the most common types of questions we see answered in public
policy writing are these: (1) Descriptive: What’s happening? (2) Evaluative:
What’s working? What’s not? (3) Prescriptive: What should be done next?
If you only report outputs (such as how many events were held and how
many people attended them), you’re missing an opportunity to tell a
compelling story—maybe one about the embassy’s goals. Also, your reader
will surely be left feeling starved for meaning. Communicating well with
data requires more than serving a data point or two to a hungry reader. It
requires—if you’ll indulge another food metaphor—that we collect our
ingredients, follow the recipe, cook something delicious, feed it to our
guests, and tell them a story about what that food is going to do to improve
their health, boost their energy, or whatever our goal may be. Notice the
difference between “The US embassy in Tunisia hosted 35 events last year”
and “To show its commitment to supporting and celebrating democracy in
North Africa, the US embassy in Tunisia hosted 35 democracy-themed
events.” With the former, we have a data point devoid of meaning; with the latter, we have a descriptive data point in service of a story about a goal and a value, as demonstrated by outputs. The reader can then understand the why of our story.
Now let’s talk about impact. What’s working? What isn’t? And what
should we do next? These are the important evaluative and prescriptive research questions that so many in the policy world tend to stress. With just the outputs—35
democracy-themed events at the US embassy in Tunisia attended by 1,000-
plus people—it’s hard to tell what impact, if any, these efforts had. To
answer an evaluative question, we need more. Again, as GAO did, we may
need to turn to the people, to ask about their opinions of the events and their
level of engagement; perhaps we could employ a before-and-after poll to
canvass their views on democracy. Or perhaps it would be wise to see what
related stories may be trending on popular social media platforms.
Whatever method of data collection we choose, a persuasive argument will
need to include specific language to tell the reader the purpose we are trying
to serve.
Tip #12. Nearly every decision you need to make as a writer depends on
two things: Whom are you writing for, and what do you want them to
do with what you have written? Understanding your reader’s goals will
help you determine everything from what kinds of data (and how much) to
use to how you should frame the implications of your research. Knowing
what you want your writing to accomplish is equally important. Are you
trying to educate and inform or to persuade and inspire the reader to act?
Are you trying to comfort the disturbed or disturb the comfortable?
Everything you write depends on your answers to these sorts of questions,
and once you know the answers, you can use data to support your message
effectively.
Tip #13. When deciding how many examples to include, remember the
power of three. Use one example if you want to show the reader how
powerful the example is. Use two examples if you want to compare and
contrast them. And to give the reader a sense of roundness and
completeness, use three. Some news organizations share “three things to
know” with their readers, and they include one data point for each. More
information would crowd the story. Readers love threes.
In February 2021, Zeynep Tufekci wrestled with similar concerns about
science writing in the Atlantic. Data wasn’t being used well to light a path
forward for an anxious public wanting guidance. Tufekci directed her
analysis at the World Health Organization (WHO), and what she derived
from the data was rather critical. At the beginning of the coronavirus
pandemic, in January 2020, the WHO—the international body responsible
for communicating effectively about health—said there was “no clear
evidence of human-to-human transmission.” What the WHO should have
said, according to Tufekci, was “there is increasing likelihood that human-
to-human transmission is taking place, but we haven’t yet proven this,
because we have no access to Wuhan, China,” the purported epicenter of
the virus.6
A similar criticism could be leveled at WHO for how it communicated
about antibodies’ capability of protecting people from contracting the virus
a second time. In the spring of 2020, WHO officials reported that there was
“currently no evidence that people who have recovered from COVID-19
and have antibodies are protected from a second infection.” The result? I’m
sure you remember: a profusion of news articles and commentary animated
by trepidation and dismay. “Instead,” Tufekci writes, WHO “should have
said, ‘We expect the immune system to function against this virus, and to
provide some immunity for some period of time, but it is still hard to know
specifics because it is so early.’ ”7 In other words, WHO officials forgot
their purpose—to collect data, use it to support a claim, and communicate
that supported claim clearly to people around the world who don’t know
how to keep themselves and their families safe. After more than two years
of living through a global pandemic, we’ve been shown time and again that
variants of the virus develop and spread, cases surge and subside, and the
knowledge we have about what works to treat the virus—and what doesn’t
—expands every day. This is not to say, of course, that WHO should have
claimed something was proven back in 2020 even when it hadn’t been, but
if WHO had started with purpose—to help people navigate uncertainties
during a crisis—then its communications might have helped us understand
that no one policy response would likely stay effective for long and that we
would need to be ready to adapt as we collected more data and followed the
science.
Tip #14. If you don’t have any data, try articulating to the reader what
kind of data would help and how it could be collected. Some refer to this
practice as “evidence-building,” which takes time, money, and inclination.
Not every problem we face will have all three things going for it.
What story can you tell with your data (or your lack of data)?
This is a question that has captivated Carl Zimmer, a science writer for the
New York Times, for the better part of three decades. “While I’m not a
scientist myself,” Zimmer wrote in the summer of 2020, “I’ve gotten pretty
comfortable navigating around them. One lesson I’ve learned is that it can
take work to piece together the story underlying a paper. If I call a scientist
and simply ask them to tell me about what they’ve done, they can offer me
a riveting narrative of intellectual exploration. But on the page, we readers
have to assemble the story ourselves.”9
In our experience working with policy students, economists, military
folks, lawyers, and others, we’ve seen this sort of thing play out countless
times. What it boils down to is the writer having lost their sense of purpose,
and when that happens, the words on the page won’t ever have much
meaning for the reader. One of the first questions we usually ask about a
first draft (other than Who are you writing this for? and What do you want
the reader to do with this information?) is simply Why is this important?
Nearly every time we ask this question, something magical happens. The
writer’s eyes twinkle. Sometimes a smile breaks across their face. They
then tell us a story. They don’t just give us a mound of data punctuated with
academic jargon.
A way of enhancing your own narrative that we use all the time is to
imagine yourself having a conversation with a reasonable person (however
you define that) who is interested in your topic. What do they know
already? What do they need to know to form an opinion, make a decision,
or take action? How does the story you tell them help them achieve their
goals? Perhaps you can answer some of these questions with the data you
have. But other times you may need to fill in the gaps with specific
examples or by describing what we don’t yet know but may reasonably
predict.
Tip #15. Before comparing data sets, check first to see if the data sets
were collected and analyzed in similar ways. Consider whether your data
points will “speak” well to one another; that is, were they measured in the
same manner, in the same time period, by the same organization? If they
were not (and bringing them together would amount to “comparing apples
to oranges”), explain to the reader what comparison or contrast can be
reasonably made—and what cannot.
What are the authors trying to accomplish with this story? We think it’s
safe to assume they want to educate the reader about an important policy-
related problem. Or perhaps the point is to raise alarm about a set of
disconcerting developments. If their purpose is to convince the reader that
there truly is a problem here, adding more data points isn’t necessary.
Individual data points are like ice cream. One scoop is probably plenty for
most of us. Two scoops will fill you up. Three might give you a
stomachache. What about more than three? Well, nobody needs more than
three scoops of ice cream.
“What I try to avoid is too many numbers,” Dr. Srivastava says: “not
more than one or two figures, and not too many graphs.” Applying this
thinking to the Pew passage, note the difference it makes when only two
data points are mentioned:
The majority of Americans feel they can recognize fake news—39 percent are very confident
and 45 percent are somewhat confident in their ability to do so.
The temptation many of us face when writing with data is to show the
multiple layers of what we found. Researchers working on issues related to
income disparities, for example, may be tempted to detail many sorts of
income gaps—between men and women, white and Black people, rural
residents and urban, those with a college education and those without one—
to provide what they believe is a fuller picture. The risk they run, however,
is that taking this comprehensive approach could end up working against
their intended purpose by losing the point of the story in details.
One important question all communicators need to ask before they add
more data to their story is this: Does an additional layer of data advance the
story, or does it simply restate or reinforce a point already made?
Tip #17. When layering on data in your story, make sure each
additional data point expands the story you’re telling. Try not to
unnecessarily reiterate a point you’ve already made.
Giving your reader too many reasons to accept your argument can
actually make it less persuasive to some, according to psychologist and
author Adam Grant. In his 2021 book, Think Again: The Power of Knowing
What You Don’t Know, Grant shows that providing more reasons to a reader
can make you seem more credible but only to a reader who has already
been persuaded or who is sympathetic to your perspective. If you want to
reach a reader who may be skeptical of, or even hostile to, your argument,
you should focus on quality over quantity. “If they’re resistant to
rethinking,” Grant writes, “more reasons simply give them more
ammunition to shoot your views down.”19
The same can be true when communicating with data. Each additional
data point you offer increases the likelihood that the reader will get stuck in
the weeds or lose sight of your meaning and purpose. We see this all the
time in well-intended policy writing. In one example, from a research report
about gender disparities in education among African nations, the authors
had an abundance of data to choose from:
Education levels are rising—almost all girls and boys are enrolled in primary schools in all
regions, and globally more than 70 percent of children are enrolled in secondary schools.
Completion rates at the primary level are also on the rise globally. Of 173 countries with data,
almost half have completion rates of 95 percent or higher. Over the last decade, completion
rates rose from 78 to 87 percent for girls, and from 84 to 90 percent for boys. However, in the
least developed countries, around 41 percent of children are enrolled in secondary school and
fewer girls than boys are enrolled (14 percent).20
While all this data is interesting and important, no doubt, it is too much
for a busy reader to take in all at once. If the authors had clarified their purpose before committing words to the page, they’d be much less likely
to bury their story under all those layers of data. They would instead be able
to focus on meeting readers where they are. For example, assuming the
purpose of this paragraph was simply to show where progress has been
made and where more is needed, the authors could have written something
along these lines instead:
While education levels are rising globally—more than 70 percent of children are enrolled in
secondary schools—the developing world continues to lag behind. In the least developed
countries, only 41 percent of children are enrolled in secondary schools. And fewer girls are
enrolled than boys.
Sometimes not using all the data you have makes the most sense
In 2017, the Ministry of Foreign Affairs in Finland convened a roundtable
to discuss the role of data in international affairs. At the time there was a
sense that “data obtained from social media platforms [could] serve as a
basis for sentiment analysis towards particular issues, regions or countries.”
It now seemed possible, in other words, for government officials from
around the world to log on to Twitter or Instagram and gauge public opinion
on issues that previously required expensive and time-consuming polls.
What the researchers discovered, of course, was that it wasn’t that simple.
As the resulting report noted, “although big data can pinpoint trends … it
has limited predictive power.” The report’s authors went on to claim that
“big data does not always paint an accurate picture of society, as it usually
over-represents those who have access to the Internet and digital devices.
The representative bias might become even more prominent when
analyzing data from certain online platforms … as the analysis will over-
represent certain demographics that are active in framing online
discussions.”33 That is to say, data pulled from social media to understand
opinion on international issues illustrates the opinions of those who chose to
post on their social media accounts. Who are they? Are they even real
people? If you gather your data from social media, what sort of people are
likely omitted from the sample? Does telling a story based on such data
help you fulfill your purpose? In the end, sometimes the right thing to do is
not use all the data you have, especially if it doesn’t help you achieve your
purpose.
We often hear from writers that they choose not to provide context
because they’re afraid they’ll bore the reader or tell them something they
already know. In our experience, such concerns are mostly unfounded.
Neither one of us has yet met a decision maker who got annoyed over the
inclusion of a sentence or two that explained the context around a data
point. In fact, providing context can serve as a useful reminder to the reader
or give them an opportunity to understand the trend you identified in a new
way. Ultimately, decision makers expect your analysis to have some
background and context. So don’t shortchange them.
To illustrate this point, CBS’s 60 Minutes ran a story in November 2020 on Operation Warp Speed, the American public-health
moonshot to expedite the development and distribution of a COVID-19
vaccine. Journalist David Martin interviewed four-star general Gustave F.
Perna, who had been tapped by the Trump administration to run the
operation. Martin followed Perna around his office, learning about the
wicked challenges he faced. When looking at Perna’s desk, Martin noticed a
cheat sheet with bureaucratic acronyms. Martin was bemused to learn that
the former army supply officer had a “steep learning curve to master the
jargon of the pharmaceutical industry.” Perna was not shy about it. “I listen
every day to what is said,” he explained, “and then I spend a good part of
my evening googling these words.” Googling the words! Perna was the
decision maker, and he was having to bridge his own area of expertise
(logistics and personnel for the US Army) with something new (Big
Pharma). Surely he would have appreciated receiving definitions and
context from his staff to help him learn the lingo.36 They may have wrongly
assumed, though, that the boss already knew.
Tip #20. Humanize the scale of the math for your reader. Change “Of the $246.8 billion in retail spending last year, consumers spent $86.4 billion on cars and car parts” to something like “Of every $100 spent in retail last year, consumers spent $35 on cars and car parts.”
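The rescaling behind that rewrite is a one-liner. Here is a quick sketch in Python, using the tip’s own numbers (the helper function is ours, invented for illustration):

```python
def per_hundred_dollars(part_billions: float, total_billions: float) -> str:
    """Restate a share of an enormous total in 'of every $100' terms."""
    share = part_billions / total_billions * 100
    return f"Of every $100 spent, consumers spent ${share:.0f}"

print(per_hundred_dollars(86.4, 246.8))
# -> Of every $100 spent, consumers spent $35
```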
PART III
PERSISTENCE
Using Data to Solve Wicked Problems with Integrity
All too often, this is where analysis ends. Once all the missing persons
reports are entered in a database, with data sorted by province and by city,
populated by searchable variables, like gender, race, and age, the only story
we can feel totally confident telling is one of crisis. Plain and simple. Too
many people go missing each year; many cases remain unsolved. In order to
find solutions, what’s needed is persistence.
Good data can help solve the problem, not just describe it
What if in Canada and elsewhere more data analysts used what they could
find out about missing people to tell a more helpful, albeit more
complicated, story? That’s the sort of question Sasha Reid was asking
herself three years after Navaratnam disappeared, while she was building
two databases: one for missing persons and another for unsolved homicide
cases in Canada. Reid teaches psychology at the University of Calgary, and
of all the interesting phenomena a psychologist could study, Reid chose
missing persons and serial killers. For as long as she can remember, she’s
wanted to know how many of Canada’s missing persons may have fallen
victim to a yet-to-be-discovered serial killer. She’s fascinated by serial killers, mostly because they have never struck her as irrational. There is a method to their madness, and she wanted to uncover what makes
them tick. How do serial killers perceive the world around them? she
wondered. How do their perceptions affect their motivations to kill?
Perhaps most importantly, what can be done to stop them once they’ve
started? Can we use data to find the solutions, not just describe the
problems?
Tip #24. Make sure that data supports your solution but that it doesn’t create the need for it. There is a difference between cause and effect in data. Don’t let cognitive dissonance or your own belief system lead you to use data to manufacture a problem.
Act with integrity and you won’t have to worry about being “wrong”
Here are a few truths about data: One, data can come in many, many forms
—both quantitative and qualitative—and no single kind of data is inherently
better than any other; it all depends on your reader and purpose. Two, the
data you need may not exist, may take time to be collected, or may be, as
we noted in part II, “dark.” Three, more often than not, the data you do have
access to can be used to support divergent stories about what the data
means. We told you this stuff wasn’t easy!
When you don’t have the data you need—or the numbers don’t quite line
up in the way you’d hoped—be honest about it. Tell your reader what you
have and what you don’t have. Lean into the complexity. Contrary to what
some may believe, owning up to the limitations of your conclusions can be
a tremendously persuasive communication strategy. While it may
sometimes feel like walking into a sword fight without any armor, it’s
important to remember that there’s no such thing as a perfect idea or
proposal or policy recommendation—and your reader knows that
intuitively. So instead of pretending this isn’t the case, you should let
yourself admit it when your data has limitations and perhaps discuss other
interpretations you’ve entertained and why you decided to reject them.
Bring up potential trade-offs and explain to your reader how you would try
to avert or mitigate any unintended consequences. This sort of transparency
can show your reader that you hold yourself (and your analysis) to a high
standard and that you can be persuaded by new evidence, which in our
experience is the first step to showing a reader that it’s all right for them to
be persuaded, too.
This approach to data places it in a supporting role to your own honest,
human interpretations. And it can help those who are nervous about using
data, such as the diplomats mentioned in the previous chapter, to feel
supported, not limited, by data.
Tip #25. If the data you have is suggestive but not necessarily
representative, say that. Don’t overstate your data’s meaning, but also
don’t be afraid to use it for fear that you might later be proven wrong.
Instead, tell the reader where the data comes from, what it represents, and
what it indicates to you, the communicator.
Above all else, Tetlock says, when the facts change, foxes change their
minds. To better persuade others, be more open to persuasion yourself. You
might be surprised by the reaction you receive, especially if your reader is
one of the “exasperated majority” that’s fed up with political polarization
and divisive, falsely simplistic rhetoric.
Even when we do our best to be foxes and tell the truth about what we
know and what we don’t, errors and miscommunication happen; like death
and taxes, they’re inevitable. Fortunately, if we learn to recognize some of
the more common errors people make, that awareness should help keep us
out of trouble, at least most of the time.
Tip #26. Be honest about what data you have—and what you still don’t know. Honesty and accuracy are the most important virtues when it comes to telling stories with data—more important than timeliness. If you don’t have good data, accept that reality and leave the questionable data out of your writing.
These are lessons Hong Qu teaches his students about data visualization
at the Harvard Kennedy School. As part of his research and work, Qu
follows data visualization experts who advocate for presenting data well to
a general audience. At the height of the spring 2020 surge of coronavirus
cases in the United States, Qu noticed what he described as a “burst of
critiques of the health department of Georgia.”8 Data experts from around
the world were using social media to slam Georgia’s visual representation
of county-by-county data in a bar chart that made it appear, misleadingly,
that the number of new COVID cases in all five of the state’s most-affected counties was trending downward by early May (figure 3.1).
Qu was intrigued by the reactions of other data analysts, so he examined
the bar chart. The first issue he noticed was that the x-axis in the chart,
created by the Georgia Department of Public Health, wasn’t chronologically
arranged, which struck him as incredibly odd for a chart purporting to track
a trend over time. He further noticed that its nonchronological order made it
appear as though the numbers told a different story from what was actually
happening.
Figure 3.1. Top five counties in Georgia with the greatest number of confirmed COVID-19 cases.
The chart represents the most affected counties over the past 15 days and the number of cases over
time.
Source: Georgia Department of Public Health
In looking at the bar clusters for April and May, “it seemed like cases
were going down,” Qu observed, “but, in reality, cases were not going
down. They were hitting a plateau and would eventually go up.” The
downward trend was caused by a delay in reporting cases. In short,
Georgians weren’t out of the woods just yet.
Data visualization expert Alberto Cairo noticed the same problem. On
his website, he published a redesigned chart to show how much the story
seemed to change when the data was presented chronologically.9
“Visualization books, including mine,” Cairo explains in the paragraph
beneath his redrawn chart (figure 3.2), “spend many pages discussing how
to choose encodings to match the intended purpose of every graphic, but we
pay too little attention to the nuances of sorting: should we do it
alphabetically, by geographic unit, by time, from highest to lowest, from
lowest to highest—or do we need an ad-hoc criterion? Or should we make
the graphic interactive and let people choose? As always, the answer will
depend on what we want the reader to get from the visualization.”10
Figure 3.2. Alberto Cairo’s redrawing of the chart from the Georgia Department of Public Health (see fig. 3.1).
Source: Alberto Cairo, “About That Weird Georgia Chart,” Cairo (blog), May 20, 2020
A problem with the original chart was that data collection for COVID
cases was occurring with a delay. Therefore, the most recent numbers were
the least likely to be accurate. Including them would falsely demonstrate a
downward shift and misinform the reader, especially a busy one who only
scans the chart. Excluding the most recent numbers would prioritize
accuracy over timeliness. But in the early days of the pandemic, timing was
everything.
Is there a compromise here? We think there is: more storytelling. The
Georgia Department of Public Health could have provided chronologically
arranged data (Cairo’s correction), and the chart’s caption could have
informed readers that the most recent data is the least accurate owing to
delays in the reporting of new cases.
Captions can add a layer of honesty to visualizations. Qu teaches his
students to use captions, which he says are “underutilized in data
visualizations because it’s an afterthought by the designer.” He encourages
designers to use them because they are “one of the most effective
techniques to guide the audience’s attention, as well as to explain the key
takeaways by adding a signpost that tells a mini-story.” Done well, captions
“convey clearly the reasoning behind the data insights” represented in the
graphic.
Adding to the caption of Georgia’s chart a sentence stating that COVID
cases are likely higher because of incomplete reporting is an excellent and
effective compromise.
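To make that compromise concrete, here is a hedged sketch in Python with pandas and matplotlib. The dates, counts, and column names are invented for illustration; this is not the Georgia Department of Public Health’s actual pipeline:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy daily case counts; the real chart drew on county-level
# COVID-19 reports (values here are invented).
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-05-03", "2020-05-01", "2020-05-02"]),
    "cases": [95, 120, 110],
})

# Cairo's correction: sort by date so the x-axis reads chronologically.
df = df.sort_values("date")

fig, ax = plt.subplots()
ax.bar(df["date"], df["cases"])
ax.set_title("New confirmed cases by report date")

# Our suggested compromise: a caption that flags the reporting lag.
fig.text(0.01, -0.05,
         "Note: counts for the most recent days are likely incomplete, "
         "owing to delays in case reporting.",
         ha="left", fontsize=8)
fig.savefig("cases.png", bbox_inches="tight")
```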
Tip #27. When presenting data visualizations, use the graph’s caption
to tell the reader what the point of it is. Instead of a caption that reads,
“Number of older workers who report not having enough money to retire,”
try something like this: “Twice as many older workers today report not having enough money to retire as did older workers two decades ago.” Don’t expect your reader to interpret the data themselves. They may
derive a different story from it than what you intended.
Reading bad or disproven data, it turns out, can make readers resistant to new and better data encountered later on. Several studies of communication show that once readers finish reading a text and have had time for its content to sink in, the brain has a way of making that content seem true. Later, when it’s
time to correct a flawed argument or data point, some readers will struggle
with cognitive dissonance. Wait, they say. You told me not to wear a mask
when the pandemic started. Now you’re telling me I have to wear a mask in
every indoor public place I go? That can’t be right.
An interesting example of this phenomenon from the world of politics
comes from the contentious 2000 US presidential election between George
W. Bush and Albert “Al” Gore. In their book The Press Effect, Kathleen
Hall Jamieson and Paul Waldman explain what happened: “When networks
called the election for Bush at 2:20 a.m., televisions were on in fifteen
million homes … Graphics with Bush’s picture and the words ‘George W.
Bush—the 43rd President of the United States’ flashed on the screen …
When the call was retracted at 3:50 a.m., 8.5 million homes still had their
televisions on. In 6.5 million homes, viewers went to bed thinking Bush had
won but awoke to find an unsettled election.”11 That news brought cognitive dissonance to many who had slept on—and made peace with—the election results. Regardless of their political leanings, those Americans were more likely to believe that Bush had won the election and that, had Gore been declared the winner in a reversal, it would have been “stolen.”
Tip #29. Manage the tension between timeliness and accuracy. Consider your readers and then answer their questions honestly, rather than succumbing to the desire to have a definite answer as quickly as possible. That is how mistakes are made.
Does the data actually support your conclusions? Are you sure?
In August 2019, a research team, led by senior author Joseph Cesario, had
the results of their study on fatal police shootings published in the
prestigious Proceedings of the National Academy of Sciences (PNAS).
“Concerns that White officers might disproportionately fatally shoot racial
minorities can have powerful effects on police legitimacy,” they declare
about the implications of their work, which was based on a “near-complete
database” of more than 900 fatal shootings in 2015 that included
demographic information on each officer who did the shooting.12
What did they find? Despite “recent high-profile police shootings of
Black Americans,” the researchers “did not find evidence of anti-Black or
anti-Hispanic disparity in police use of force across all shootings.” Put
simply, “White officers are not more likely to shoot minority civilians than
non-White officers.”13 In most cases of fatal shootings, “the person killed
was armed and posed a threat or had opened fire on officers,” said the
study’s senior author when interviewed for a reporter’s story about the
research.14
According to the coauthors, police reformers’ calls to diversify police
departments may not have any effect on the frequency of officer-involved
fatalities. “If this study is right,” NPR reasons, “just hiring more black cops
will not mean fewer black people get shot.”15 Instead, the best way to
reduce these fatal shootings, the study’s findings suggest, might be to
redress “the socio-historical factors that lead [Black and Hispanic] civilians
to commit violent crime,” which is what led to their increased likelihood of
being killed by police, according to the researchers. In other words, if
you’re Black or Hispanic, don’t commit violent crime because, if you do,
you’re more likely to be killed by police. And it won’t be because you’re
not white.
Several months after the results of this study were published, two
Princeton professors demonstrated mathematically that the study was
“based on a logical fallacy and erroneous statistical reasoning and sheds no
light on whether police violence is racially biased.” In an op-ed published in
the Washington Post, Dean Knox and Jonathan Mummolo explain what was
so faulty about the study:
It takes no technical expertise to understand the core problem of the study. The authors used
data on fatal police shootings to determine the likelihood of officers shooting minority
civilians, ignoring the fact that most police encounters with people do not result in a fatal
shooting. Under this fallacious approach, an officer who encountered one minority civilian and
fatally shot him or her (a 100 percent fatal shooting rate) would appear identical to an officer
who shot one minority civilian out of a thousand similar encounters (a 0.1 percent fatal
shooting rate). Data on fatal shootings alone cannot tell us which officers are more likely to
pull the trigger, let alone account for all relevant differences between incidents to allow us to
isolate the role of race.16
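Knox and Mummolo’s two-officer hypothetical takes only a few lines of Python to verify (the officers and their counts come from the quoted example, not from real data):

```python
# Two hypothetical officers with wildly different propensities to shoot.
officers = {
    "Officer A": {"encounters": 1, "fatal_shootings": 1},      # 100% rate
    "Officer B": {"encounters": 1_000, "fatal_shootings": 1},  # 0.1% rate
}

for name, record in officers.items():
    # What a fatal-shootings-only database records:
    rows_in_data = record["fatal_shootings"]
    # What you would need encounter data to compute:
    true_rate = record["fatal_shootings"] / record["encounters"]
    print(f"{name}: {rows_in_data} row(s) in the data, "
          f"true shooting rate {true_rate:.1%}")

# Both officers contribute exactly one row, so the fatal-shootings
# data alone cannot tell their propensities apart.
```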
How could PNAS, one of the most cited peer-reviewed journals in the
world, publish such flawed research? Soon after Knox and Mummolo
notified the journal of the glaring errors they had discovered, an editor at
the journal responded to Knox and Mummolo in defense of the article. The
“clear logical errors” that Knox and Mummolo pointed out were, according
to the editor, a matter of preference over how best to study how race
influences officer-involved shootings; moreover, the tone of the critique
was “intemperate.”17
Knox and Mummolo then took their concerns to the digital marketplace
of ideas we call Twitter. As the likes and retweets piled up, senior author
Cesario and first author Johnson published a reply to the critique. “Though
they still largely stood by their study,” Knox and Mummolo later wrote in
their op-ed, “they admitted their central claim—that white officers are not
more likely to shoot minority civilians than their nonwhite peers—was
unsupported by their analysis.”18
By that point, though, it was largely too late to walk it back. The study’s
flawed findings had already been widely covered in the media. They were
even presented during testimony in an oversight hearing on policing
practices convened by the US House Committee on the Judiciary.19
“There are endless examples of bad research designs producing flawed
findings, followed by uncritical media reports touting the results,” write
Stephen Soumerai and Ross Koppel, which “can result in costly, ineffective
and even harmful national policies.”20 For example, a 2015 study claimed to
show that better-trained paramedics with more sophisticated lifesaving
equipment actually caused more deaths than their lesser-trained colleagues
when responding to emergency calls in nonrural areas for people receiving
Medicare benefits.21 But, as Soumerai and Koppel explain, “the authors
confused cause and effect: Ambulance dispatchers send the better-equipped
ambulances to dying patients in an effort to save them before transporting
them to the hospital. These patients are already more likely to die on the
way to the hospital than patients in basic ambulances.”22
Even though it was clear to many that the researchers had failed to account for this obvious selection effect, ScienceDaily, a website that claims to share the “latest science news,” published a
summary subtitled “Advanced Life Support Ambulance Transport Increases
Mortality.”23 Similarly, the Portland (ME) Press Herald reprinted a
Washington Post article under the headline “Ambulances with Less-
Sophisticated Gear May Be Better for Patients.”24 “Such distortions,”
introduced into public discourse, Soumerai and Koppel say, “have
potentially life-threatening consequences to patients and policy.”25
Persistence, patience, and honesty are required not only to speak well with data but also to speak accurately with it. It is important to slow
down and make sure you are not confusing cause and effect.
In the two decades after Lott’s book was published, at least two dozen empirical studies made a convincing counterargument: that concealed-carry laws have little or no effect on crime. For example, in a 2008 article published in the American Statistician, Patricia Grambsch showed that there isn’t any evidence to support Lott’s claim that concealed-carry laws “have beneficial effects in reducing murder rates.”
The real culprit, according to Grambsch, is a phenomenon known as “regression to the mean.” When a measurement is unusually extreme at first but subsequent measurements land closer to the average, we say it has regressed to the mean. Because this phenomenon must be accounted for when designing experiments and interpreting data, Grambsch used random and fixed effects models to do exactly that. How those models work is less important here than what they show: once regression to the mean is factored in, the apparent effect of legalizing concealed carry on crime disappears. Grambsch found that states that passed concealed-carry laws saw no resulting change in murder rates.28 Other researchers
found that concealed-carry laws have resulted, if anything, in an increase of
certain types of violent crime, including adult homicide29 and aggravated
assault.30
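Regression to the mean is easy to see in a toy simulation. In the sketch below (made-up numbers, not Grambsch’s data or models), every “state” shares the same underlying murder rate, differing only by random year-to-year noise; the states that look worst in year one reliably look closer to average in year two, even though no law changed anything:

import random

random.seed(42)

N_STATES = 50
TRUE_RATE = 5.0  # identical underlying murder rate for every state
NOISE = 1.5      # random year-to-year fluctuation

# Two years of noisy observations around the same true rate.
year1 = [TRUE_RATE + random.gauss(0, NOISE) for _ in range(N_STATES)]
year2 = [TRUE_RATE + random.gauss(0, NOISE) for _ in range(N_STATES)]

# The ten states that look worst in year one -- the ones most likely
# to pass a law in response to "rising" crime.
worst = sorted(range(N_STATES), key=lambda i: year1[i], reverse=True)[:10]

print("Worst ten states, year 1:", round(sum(year1[i] for i in worst) / 10, 2))
print("Same ten states, year 2: ", round(sum(year2[i] for i in worst) / 10, 2))
# Their average falls back toward 5.0 with no intervention at all; any law
# passed between the two years would look effective purely by accident.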
It’s true that sometimes correlation, a co-occurrence of phenomena, is in
fact an instance of causation. But you can’t trust an interpretation of
causality in one direction (A causes B) without testing for reverse causality
(B causes A) and for what are known as confounders.
Let’s start with reverse causality. When the outcome, or the anticipated
outcome, affects the treatment (that is, the presumed effect is, in fact, acting
on the purported cause), we call that reverse causality. Identifying reverse
causality is sometimes a matter of common sense. For example, a study
might find that brown spots on the skin and sunbathing are linked. It’s
plausible of course to hypothesize that sunbathing can cause brown skin
spots, while it’s all but impossible to suppose, inversely, that brown spots
cause sunbathing.
To take another, less obvious, example, let’s say a study finds that
smoking cigarettes and depression are linked. We could conclude, perhaps,
that smoking causes depression, though it’s also possible that the causality
runs in the other direction: depression could cause people to smoke
cigarettes. Follow this line of reasoning: smokers may feel depressed over
their inability to quit a habit that’s socially stigmatized or depressed about
the deterioration of their fitness from smoking, so to relieve their
depression, they turn to cigarettes for the familiar uplift of nicotine. Most likely, the causality runs in both directions: smoking feeds depression, and depression feeds smoking. When this sort of mutual relationship between two features of the world exists, we call it “simultaneity.”
Let’s now take up confounders. Confounding occurs when some feature of the world affects both the treatment and the outcome, above and beyond any effect of the treatment itself. To remove a confounder’s influence statistically, we need to measure it and control for it. Before making a causal inference, you must try to determine whether any other factor may be influencing the outcome of your analysis.
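Here is a minimal simulation of what controlling for a confounder buys you (the variables and numbers are invented for illustration, not drawn from any study discussed in this chapter). A hidden factor drives both the treatment and the outcome, the treatment itself does nothing, and only the regression that includes the confounder recovers the truth:

import numpy as np

rng = np.random.default_rng(0)
n = 10_000

confounder = rng.normal(size=n)                  # e.g., underlying severity
treatment = confounder + rng.normal(size=n)      # confounder drives treatment...
outcome = 2.0 * confounder + rng.normal(size=n)  # ...and the outcome; the
                                                 # treatment has no effect at all

# Naive estimate: regress the outcome on the treatment alone.
X_naive = np.column_stack([np.ones(n), treatment])
naive = np.linalg.lstsq(X_naive, outcome, rcond=None)[0][1]

# Adjusted estimate: also control for the measured confounder.
X_adj = np.column_stack([np.ones(n), treatment, confounder])
adjusted = np.linalg.lstsq(X_adj, outcome, rcond=None)[0][1]

print(f"Naive 'effect' of treatment: {naive:+.2f}")     # roughly +1.0, spurious
print(f"Controlling for confounder:  {adjusted:+.2f}")  # roughly  0.0, correct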
Tip #32. Randomized controlled trials aren’t always the silver bullet
they’re made out to be. Not everything can be measured; not every
question can be answered, even with the best data; and research findings
cannot always be generalized.
How many people might still be alive today had the Toronto
police been more open to using data?
Let’s return to the cold case in Toronto. Unbeknownst to Reid, the police
had assembled a small group of detectives to look into Navaratnam’s case.
They called themselves Project Prism, and their mission was to take a closer
look at Bruce McArthur to determine whether he was responsible not just
for Navaratnam’s disappearance but for the other disappearances as well. It
seemed to the detectives that McArthur had a disturbing routine. First, he’d
meet gay men who had immigrated to Canada. Then he’d hook up with
them and hire them to work for his landscaping business. Not too long after
that, the men would disappear. McArthur, they were coming to realize, was
possibly the thread that tied it all together.
The task force started surveilling McArthur around the clock, and one
day the officers parked outside watched a young man enter McArthur’s
apartment. Fearing the worst, they decided to act. Once they had forced their way inside, the officers found the young man “bound, restrained to a
bed, but unharmed,” according to the Toronto Star.36 The man was freed,
and police arrested and charged McArthur with two counts of first-degree
murder.
Using McArthur’s list of landscaping clients, the Toronto Police Service
began searching dozens of properties across the city. On the neatly
landscaped lot of one small home, police officers found the remains of six
men stuffed into large planters.
It didn’t take long for the Toronto Police Service to come under fire from
multiple fronts. Why wasn’t McArthur arrested sooner? How could the
police have let him go after first interviewing him in 2013? “All serial
homicide cases have their fair share of systemic and human errors,” Reid
explains. “The reason I and my team do what we do is because of these
errors. The more we can bring objective analysis to what has traditionally
been a very subjective profession, and the more we look for and listen to
people’s stories—especially people who have traditionally been silenced—
the quicker the police will be able to stop the Bruce McArthurs of the world
from hurting more people.”
Based on all the evidence she’s collected on McArthur since his arrest,
Reid believes there may be more victims who haven’t yet been found. “I’ve
done a developmental profile of Bruce,” she told a journalist soon after
McArthur was convicted. “I’ve gone into his past and looked at his entire
development from essentially conception until the time he was arrested. It is
possible that there are more.”37
On January 29, 2019, Bruce McArthur pleaded guilty to eight counts of
murder in Ontario’s Superior Court of Justice. At 66 years old, he became
the oldest convicted serial killer in Canada. He was subsequently sentenced
to life imprisonment with no eligibility for parole for 25 years.
Empirical work like Sasha Reid’s depends on assumptions, beliefs, and
judgments about what data points are acceptable and useful, which
relationships should be examined, how variables are defined, and what the
findings mean. While we often hear calls to make public policy data-driven
or evidence-based, these calls seem to ignore the fact that empirical
research relies on subjective human judgment and framing. When we
quantify something, when we measure it, we shine a light on it, and when
we fail to quantify or measure something, we leave it in the dark.
Sometimes we avoid collecting certain data because it’s too hard to do, too
expensive, or there’s little money to be made from it. On top of that, the
data sets we do have available to us are constructed by human beings and
are, therefore, subject to human biases, errors, and manipulations, as are all things human-made. Whenever we draw on data, we must be honest
with ourselves about what the data can let us say with integrity and what it
cannot. That’s the only way your reader will know whether they should
trust what you have to say.
Conclusion
Throughout this book, we’ve shared stories from our work to demonstrate
the challenges we all face in writing effectively with data. What we’ve
concluded after all these years is that effectiveness depends mostly on a
writer’s ability to understand how their reader tends to make sense of data.
Once you understand your reader’s goals, tell them a story that gives them the information they need to form an opinion, make a decision, or take steps to address a problem. Use only the data that must be presented to convince your reader that the story you’re telling is logical and appropriate. This can be hard, we know; but now that you’ve seen the frameworks, tips, and tactics we’ve presented, we sincerely hope you feel much better prepared to take on such an important challenge.
Another lesson we hope you’ve gleaned is that when you do craft a story,
elevate the people impacted to the headline. Remembering that your reader
cares more about people than they do about statistics will serve you well.
Instead of focusing on the numbers alone, provide a context around the
numbers. It will take your voice and your description to convey what is
known, and what is unknown, to your reader. The effort this work requires
will serve your purpose, which is ultimately to make important and positive
change in the world.
Lastly, writing effectively with data depends on your ability to be both
accurate and honest. Explain limitations. Check your work. Accuracy and
honesty won’t persuade everyone in the short term, no doubt, but
inaccuracy and dishonesty will persuade far fewer people in the long term.
Be mindful of using data to develop strong stories that don’t confuse
causation and correlation. Don’t stretch and bend the numbers. Cultivate
your credibility instead. And remember that demonstrating empathy and
respect for your reader has the power to elicit emotion that can draw them
in and help them really see.
Effectiveness can’t be forced; it comes only from doing good work as
well as one can. Ultimately, it is up to you to speak for data.
ACKNOWLEDGMENTS
DAVID CHRISINGER
This book is the product of our collective experiences as students, practitioners, and instructors of
effective communication. Thinking back on how we learned to write stories with data, we realized
that we acquired most of the ideas detailed in this book from patient advisors, helpful mentors,
thoughtful collaborators, and other communication experts. This book, in turn, was conceived out of
a sincere desire to share frameworks, principles, and tools for writing with more people than our
classrooms can accommodate.
First of all, I’d like to thank my students, colleagues, and mentors at the University of Chicago’s
Harris School of Public Policy for helping to plant the seeds of this book. The author of our
foreword, Ethan Bueno de Mesquita, and his frequent collaborator Anthony Fowler—both professors
at the Harris School—were the source of several concepts captured in this book and the inspiration
for others. I’m also in great debt to all the other faculty who have invited me into their classrooms to
help teach Harris students how best to communicate quantitative analysis to meet the particular needs
of readers. Ever since I arrived at the Harris School in February 2019, these colleagues have shared
with me, either directly or through their writings, myriad lessons learned from their own experiences
—many of which found their way into this book. In particular, I would like to thank Dan Black,
Christopher Blattman, Sorcha Brophy, Chad Broughton, John Burrows, Isabeau Dasho, Matthew
Fleming, James A. Leitzel, John A. List, Jens Ludwig, Luis Martinez, Roger Myerson, Konstantin
Sonin, Brian Williams, Rebecca Wolfe, Kimberly Wolske, Paula Worthington, Austin Wright, and
Adam Zelizer. My sincerest thanks as well to Ranjan Daniels, Andie Ingram Eccles, Jenny Erickson,
Shilin Liu, Sakshi Parihar, Sam Schmidt, and others for all the support, encouragement, and
connection.
Second, I would like to extend my warmest thanks to the dean of the Harris School, Katherine
Baicker; our dean of students, Kate Shannon Biddle; and the rest of the executive team at the Harris
School for believing in my work and supporting my efforts to make our students the most effective
communicators they can possibly be.
Third, I want to express my gratitude to my students, past and present, for offering a steady stream
of ideas and inspiration about what to include in this book—as well as plenty of opportunities to try
out and refine the lessons and tools featured throughout.
Lastly, without my wife, Ashley, and our three beautiful children, this work of mine would feel
much, much less fulfilling. It is my honor and privilege to share the world with them.
LAUREN BRODSKY
I would like to thank my students at the Harvard Kennedy School, who are the inspiration for this
book. Students come to the Kennedy School with passions for a variety of policy issues. While their
interests may differ, what they have in common is a desire to make an impact: to make the world a better place and to improve the lives of citizens around the globe. Communicating well with evidence is an essential skill for making that change. I am so thankful to my students for trusting me to
lead them through that work and for helping me learn along the way. I am also thankful to the
leadership at the Kennedy School for supporting me in growing my own knowledge and expertise in
policy writing.
Many of the tips and lessons of this book have grown from conversations with colleagues and
with students who became alumni and practitioners. Thank you to those who shared their stories with
me, including Hong Qu, Ranjana Srivastava, Nick Sinai, Rebecca Barnes, and Paul Von Chamier.
Through our discussions I was able to see trends and best practices of persuasive policy
communications. I was inspired by the way you model honesty, compassion, and perseverance in the
work you do.
I am lucky to work with the most supportive colleagues in the Communications Program at the
Kennedy School. They are thoughtful and gifted teachers who continue to inspire me. Our
conversations on teaching and learning are often the highlight of my day. Topping that list is Jeffrey
Seglin. There is not a more dedicated professor out there, and without his faith in me, this book, and
my work that informs it, would not have been possible. Thank you also to Alison Kommer, our
program coordinator, who steps right in to help in ways that are always above and beyond. You have
both been such an important support system to my work over the years.
Lastly, I want to thank my family and especially my husband, Gregg. Balancing two careers in our chosen fields is a juggling act. But you have pushed me every step of the way and helped our two wonderful
kids see what is possible.
TIPS TO HELP YOU WRITE MORE EFFECTIVELY WITH
DATA
Below are the 32 tips we shared throughout the book, conveniently in one
place so you can return to this list and check your work whenever you set
out to write effectively with data. We don’t all write effectively with data on
the first try—or even the second or third try. And that’s okay. Once you
know how to tell stories with your data, and what resonates with readers
and compels them to care about those stories, you can revisit these pages
and revise your writing with the tips in mind.
Tip #2. Communicate, don’t complicate. The last thing people need is
more information. They have far too much of it already. What they need is
help making sense of all that information and understanding the difference between what’s important and what’s just noise.
Tip #3. Ratios can help readers make sense of large numbers. Saying
“one in four people” is much easier for readers to picture than “7,526,333 of
30,111,489 people.”
Tip #4. Don’t forget there are real people behind all those numbers
you’re crunching. Readers will care a hell of a lot more about people than
about data points, so if your goal is to get the reader to care, find the people
in the numbers and tell a story about how those people are affected.
Tip #5. If you want to be an exceptional data analyst, you must learn
how to talk to people. And we mean really talk to people—and listen, too.
Tip #6. When you want your readers to remember your story, use
striking imagery that will stick with them over time. When looking for
details to include in data-driven stories, pay attention to your gut reactions.
If you feel like you’ve been punched in the gut after reading a statistic or a
quote from an interview, take note of that. Try to re-create the experience
for the reader. Chances are if you felt something, they’ll feel something too.
Tip #8. Try starting with the main finding—your message—not facts or
your methodology. Instead of this: One study probed the relationship
between parental education and income and participation in postsecondary
education and found that young people from moderate- and low-income
families were no less likely to attend college in 2001 than they were in
1993. Try this: Young people from moderate- and low-income families were
no less likely to attend college in 2001 than they were in 1993, according to
one study.
Tip #9. The tone of your writing matters—a lot. If you want your reader
to see you as objective, use an objective tone and present your findings as
objectively as possible. Avoid judgmental words such as failure or
incompetence.
Tip #10. Ask better research questions. Good questions drive good
stories, and the most common types of questions we see answered in public
policy writing are these: (1) Descriptive: What’s happening? (2) Evaluative:
What’s working? What’s not? (3) Prescriptive: What should be done next?
Tip #12. Nearly every decision you need to make as a writer depends on
two things: Whom are you writing for, and what do you want them to
do with what you have written? Understanding your reader’s goals will
help you determine everything from what kinds of data (and how much) to
use to how you should frame the implications of your research. Knowing
what you want your writing to accomplish is equally important. Are you
trying to educate and inform or to persuade and inspire the reader to act?
Are you trying to comfort the disturbed or disturb the comfortable?
Everything you write depends on your answers to these sorts of questions,
and once you know the answers, you can use data to support your message
effectively.
Tip #13. When deciding how many examples to include, remember the
power of three. Use one example if you want to show the reader how
powerful the example is. Use two examples if you want to compare and
contrast them. And to give the reader a sense of roundness and
completeness, use three. Some news organizations share “three things to
know” with their readers, and they include one data point for each. More
information would crowd the story. Readers love threes.
Tip #14. If you don’t have any data, try articulating to the reader what
kind of data would help and how it could be collected. Some refer to this
practice as “evidence-building,” which takes time, money, and inclination.
Not every problem we face will have all three things going for it.
Tip #15. Before comparing data sets, check first to see if the data sets
were collected and analyzed in similar ways. Consider whether your data
points will “speak” well to one another; that is, were they measured in the
same manner, in the same time period, by the same organization? If they
were not (and bringing them together would amount to “comparing apples
to oranges”), explain to the reader what comparison or contrast can be
reasonably made—and what cannot.
Tip #17. When layering on data in your story, make sure each
additional data point expands the story you’re telling. Try not to
unnecessarily reiterate a point you’ve already made.
Tip #20. Humanize the scale of the math for your reader. Change “Of
the $246.8 billion in retail spending last year, consumers spent $86.4 billion
on cars and car parts” to something like “Of every $100 spent in retail last year, consumers spent $35 on cars and car parts.”
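The conversion behind this tip is a single division. A quick sketch, using the figures from the example:

total_retail = 246.8e9  # total retail spending last year
cars = 86.4e9           # spending on cars and car parts
per_hundred = cars / total_retail * 100
print(f"Of every $100 spent in retail, about ${per_hundred:.0f} went to cars.")
# Of every $100 spent in retail, about $35 went to cars.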
Tip #24. Make sure that data supports your solution but that it doesn’t
create the need for it. There is a difference between cause and effect with
data. Don’t let cognitive dissonance or your own belief system lead you to use data to manufacture a problem.
Tip #25. If the data you have is suggestive but not necessarily
representative, say that. Don’t overstate your data’s meaning, but also
don’t be afraid to use it for fear that you might later be proven wrong.
Instead, tell the reader where the data comes from, what it represents, and
what it indicates to you, the communicator.
Tip #26. Be honest about what data you have—and what you still don’t
know. Honesty and accuracy are the most important virtues when it comes
to telling stories with data—more important than timeliness. If you don’t
have good data, accept that reality and don’t include it in your writing.
Tip #27. When presenting data visualizations, use the graph’s caption
to tell the reader what the point of it is. Instead of a caption that reads,
“Number of older workers who report not having enough money to retire,”
try something like this: “Twice as many older workers today report not
having enough money to retire than older workers reported two decades
ago.” Don’t expect your reader to interpret the data themselves. They may
derive a different story from it than what you intended.
Tip #29. Manage the tension between timeliness and accuracy. Consider
your readers and then answer their questions, honestly, rather than
succumbing to the desire to have a definite answer as quickly as
possible. That is how mistakes are made.
Tip #32. Randomized controlled trials aren’t always the silver bullet
they’re made out to be. Not everything can be measured; not every
question can be answered, even with the best data; and research findings
cannot always be generalized.
NOTES
Introduction
1. “Heider and Simmel (1944) animation,” YouTube video, 1:32, posted by Kenjirou, July 26,
2010, https://www.youtube.com/watch?app=desktop&v=VTNmLt7QX8E.
2. Fritz Heider and Marianne Simmel, “An Experimental Study of Apparent Behavior,” American
Journal of Psychology 57, no. 2 (April 1944): 243–59, https://www.jstor.org/stable/1416950?
seq=1#metadata_info_tab_contents.
3. Michael D. Slater, David B. Buller, Emily Waters, Margarita Archibeque, and Michelle
LeBlanc, “A Test of Conversational and Testimonial Messages versus Didactic Presentations of
Nutrition Information,” Journal of Nutrition Education & Behavior 35, no. 5 (September/October
2003): 255–59, https://pubmed.ncbi.nlm.nih.gov/14521825/.
4. Marcel Machill, Sebastian Köhler, and Markus Waldhauser, “The Use of Narrative Structures in
Television News: An Experiment in Innovative Forms of Journalistic Presentation,” European
Journal of Communication 22, no. 2 (2007): 185–205,
https://journals.sagepub.com/doi/10.1177/0267323107076769.
5. Dan P. McAdams, The Redemptive Self: Stories Americans Live By, rev. and expanded ed. (New
York: Oxford University Press, 2013).
Part I. People
1. Tom Kertscher, “Obama Auto Rescue Saved 28,000 ‘Middle-Class’ Jobs in Wisconsin, 1
Million in U.S., Ex–Michigan Governor Says,” Politifact, September 14, 2012,
https://www.politifact.com/factchecks/2012/sep/14/jennifer-granholm/obama-auto-rescue-saved-
28000-middle-class-jobs-wi/.
2. Republicans Draw Even with Democrats on Most Issues: Pessimistic Public Doubts
Effectiveness of Stimulus (Washington, DC: Pew Research Center, April 28, 2010), sec. 2, “The
National Economy and Economic Policies,”
https://www.pewresearch.org/politics/2010/04/28/section-2-the-national-economy-and-economic-
policies/.
3. “How the Great Recession Changed American Workers,” Penn Today, September 12, 2018,
https://penntoday.upenn.edu/news/how-great-recession-changed-american-workers.
4. For one, the Organisation for Economic Co-operation and Development said so in one of its
reports: OECD, Relations between Supreme Audit Institutions and Parliamentary Committees,
SIGMA Papers No. 33 (Paris: OECD, December 9, 2002), 82.
5. Government Accountability Office, Unemployed Older Workers: Many Experience Challenges
Regaining Employment and Face Reduced Retirement Security, GAO-12-445 (Washington, DC:
GAO, April 2012), https://www.gao.gov/assets/gao-12-445.pdf.
6. Paul Von Chamier (Kennedy School graduate), interview with coauthor Lauren Brodsky,
February 19, 2020.
7. Government Accountability Office, “GAO: Excerpts from Focus Groups and Interviews with
Unemployed Older Workers, June and July 2011,” YouTube video, 4:30, posted by GAO,
https://www.youtube.com/watch?v=HdZbVKcloYI.
8. HBO, Last Week Tonight with John Oliver, excerpted in the video “Net Neutrality: Last Week
Tonight with John Oliver (HBO),” YouTube, 13:17, uploaded June 2, 2014,
https://www.youtube.com/watch?v=fpbOEoRrHyU.
9. Nick Sinai (Kennedy School adjunct professor), interview with coauthor Lauren Brodsky,
September 25, 2019.
10. David Leonhardt and Yaryna Serkez, “The U.S. Is Lagging behind Many Rich Countries.
These Charts Show Why.” New York Times, July 2, 2020,
https://www.nytimes.com/interactive/2020/07/02/opinion/politics/us-economic-social-
inequality.html.
11. Leonhardt and Serkez, “U.S. Is Lagging behind Many Rich Countries.”
12. Government Accountability Office, Unemployed Older Workers, 57.
13. “Ready to Work,” Employment and Training Administration, US Department of Labor,
created November 19, 2013, https://www.doleta.gov/readytowork/.
Part II. Purpose, Then Process
1. Public diplomacy is the art of influencing and communicating with foreign publics, in order to
impact foreign policy. United States Advisory Commission on Public Diplomacy, US Department of
State, Data-Driven Public Diplomacy: Progress towards Measuring the Impact of Public Diplomacy
and International Broadcasting Activities, September 16, 2014, https://2009-
2017.state.gov/documents/organization/231945.pdf.
2. Ranjana Srivastava (Kennedy School graduate), interview with coauthor Lauren Brodsky,
March 21, 2021.
3. Ranjana Srivastava, “ ‘Could It Be Scurvy?’ It’s a Travesty So Many Australian Aged Care
Patients Are Malnourished,” Guardian (Australia), March 10, 2021,
https://www.theguardian.com/commentisfree/2021/mar/10/could-it-be-scurvy-its-a-travesty-so-
many-australian-aged-care-patients-are-malnourished.
4. Ranjana Srivastava, “Despite Some Errors, Australia Shouldn’t Politicise the Process of the
Vaccine Rollout,” Guardian (Australia), February 25, 2021,
https://www.theguardian.com/commentisfree/2021/feb/25/despite-some-errors-australia-shouldnt-
politicise-the-process-of-the-vaccine-rollout.
5. Laurence Turka, “Scientists Are Failing Miserably to Communicate with the Public about the
Coronavirus,” Boston Globe, July 27, 2020,
https://www.bostonglobe.com/2020/07/27/opinion/scientists-are-failing-miserably-communicate-
with-public-about-coronavirus/.
6. Zeynep Tufekci, “5 Pandemic Mistakes We Keep Repeating,” Atlantic, February 26, 2021,
https://www.theatlantic.com/ideas/archive/2021/02/how-public-health-messaging-backfired/618147/.
7. Tufekci, “5 Pandemic Mistakes We Keep Repeating.”
8. Robinson Meyer and Alexis C. Madrigal, “Why the Pandemic Experts Failed,” Atlantic, March
15, 2021, https://www.theatlantic.com/science/archive/2021/03/americas-coronavirus-catastrophe-
began-with-data/618287/.
9. Carl Zimmer, “How You Should Read Coronavirus Studies, or Any Science Paper,” New York
Times, June 1, 2020, https://www.nytimes.com/article/how-to-read-a-science-study-
coronavirus.html?referringSource=articleShare.
10. “The 17 Goals,” Department of Economic and Social Affairs, United Nations,
https://sdgs.un.org/goals.
11. Tim Harford, The Data Detective: Ten Easy Rules to Make Sense of Statistics (New York:
Riverhead Books, 2021), 142.
12. Ministry for Foreign Affairs of Finland, Data Diplomacy: Mapping the Field; Summary
Report of the Geneva Data Diplomacy Roundtable, April 2017, https://www.diplomacy.edu/wp-
content/uploads/2017/03/DataDiplomacyreport.pdf.
13. See Target 4.2, “Sustainable Development Goal 4 (SDG 4),” Global Education Cooperation
Mechanism, https://sdg4education2030.org/the-goal.
14. Kate Anderson, “We Have SDGs Now, but How Do We Measure Them?” Brookings
Institution, November 3, 2015, https://www.brookings.edu/blog/education-plus-
development/2015/11/03/we-have-sdgs-now-but-how-do-we-measure-them/.
15. Anderson, “We Have SDGs Now, but How Do We Measure Them?”
16. “What Is PISA?,” Organisation for Economic Co-operation and Development,
https://www.oecd.org/pisa/.
17. Michael Barthel, Amy Mitchell, and Jesse Holcomb, Many Americans Believe Fake News Is
Sowing Confusion (Washington, DC: Pew Research Center, December 15, 2016), 3 in PDF,
https://www.journalism.org/2016/12/15/many-americans-believe-fake-news-is-sowing-confusion/.
18. Barthel, Mitchell, and Holcomb, Many Americans Believe Fake News Is Sowing Confusion, 3
in PDF.
19. Adam Grant, Think Again: The Power of Knowing What You Don’t Know (New York: Viking,
2021), 110–11.
20. Jeni Klugman and Sarah Twigg, “Gender at Work in Africa: Legal Constraints and
Opportunities for Reform,” Working Paper No. 3, 10–11,
https://wappp.hks.harvard.edu/files/wappp/files/oxhrh-working-paper-no-3-klugman.pdf.
21. United States Information Agency, West European Trends on U.S. and Soviet Union Strength,
February 1963, p. 13, digital identifier JFKPOF-091-006-p0003, Papers of John F. Kennedy,
Presidential Papers, President’s Office Files, John F. Kennedy Library and Museum,
https://www.jfklibrary.org.
22. United States Information Agency, Reactions to the European Situation, March 1, 1963, p. 71,
digital identifier JFKPOF-091-006-p0003, Papers of John F. Kennedy, Presidential Papers,
President’s Office Files, John F. Kennedy Library and Museum, https://www.jfklibrary.org.
23. Peter J. Katzenstein and Robert O. Keohane, Anti-Americanisms in World Politics (Ithaca, NY:
Cornell University Press, 2007), 17.
24. Katzenstein and Keohane, Anti-Americanisms in World Politics, 108.
25. Katzenstein and Keohane, Anti-Americanisms in World Politics, 19.
26. Katzenstein and Keohane, Anti-Americanisms in World Politics, 16.
27. Katzenstein and Keohane, Anti-Americanisms in World Politics, 288.
28. Cary Funk, Alec Tyson, Brian Kennedy, and Courtney Johnson, Science and Scientists Held in
High Esteem across Global Publics (Washington, DC: Pew Research Center, September 29, 2020), 8
in PDF, https://www.pewresearch.org/science/2020/09/29/science-and-scientists-held-in-high-
esteem-across-global-publics/.
29. Katzenstein and Keohane, Anti-Americanisms in World Politics, 119.
30. Katzenstein and Keohane, Anti-Americanisms in World Politics, 121.
31. Holly Ellyatt, “France’s Vaccine-Skepticism Is Making Its Covid Immunization Drive Much
Harder,” CNBC, January 13, 2021, https://www.cnbc.com/2021/01/13/france-swhy-france-is-the-
most-vaccine-skeptical-nation-on-earth.html.
32. “Macron: AstraZeneca Vaccine ‘Quasi-ineffective’ for Over-65s,” France 24, January 29,
2021, https://www.france24.com/en/live-news/20210129-macron-astrazeneca-vaccine-quasi-
ineffective-for-over-65s.
33. Ministry for Foreign Affairs of Finland, Data Diplomacy, 3.
34. Harford, Data Detective, 68.
35. Harford, Data Detective, 93–94.
36. CBS, 60 Minutes, “Operation Warp Speed: Planning the Distribution of a Future COVID-19
Vaccine,” YouTube video, 13:25, uploaded November 9, 2020, https://www.youtube.com/watch?
v=240DMmhgp4M.
37. James Stavridis, “U.S. Needs a Strong Defense against China’s Rare-Earth Weapon,”
Bloomberg News, March 4, 2021, https://www.bloomberg.com/opinion/articles/2021-03-04/u-s-
needs-a-strong-defense-against-china-s-rare-earth-weapon.
38. Rebecca Barnes (Kennedy School graduate), interview with coauthor Lauren Brodsky,
September 25, 2019.
39. United States Advisory Commission on Public Diplomacy, US Department of State, Data-
Driven Public Diplomacy, 20.
40. United States Advisory Commission on Public Diplomacy, US Department of State, Data-
Driven Public Diplomacy, 22.
41. R. Eugene Parta, Discovering the Hidden Listener: An Assessment of Radio Liberty and
Western Broadcasting to the USSR during the Cold War (Stanford, CA: Hoover Institution Press /
Stanford University Press, 2007), xx.
42. United States Advisory Commission on Public Diplomacy, US Department of State, Data-
Driven Public Diplomacy, 11.
43. Harford, Data Detective, 146. Harford credits statistician David Hand for the concept of “dark
data.”
Part III. Persistence
1. “Number and Rate of Victims of Solved Homicides, by Sex, Aboriginal Identity and Type of
Accused-Victim Relationship” from 2014 to 2019, Statistics Canada, released July 27, 2021,
available from https://www150.statcan.gc.ca.
2. Zander Sherman, “Bruce McArthur, Toronto’s Accused Landscaper Killer, Was Hiding in Plain
Sight All Along,” Vanity Fair, July 3, 2018, https://www.vanityfair.com/style/2018/07/toronto-serial-
killer-bruce-mcarthur-accused-landscaper.
3. Office of the Federal Ombudsman for Victims of Crime, “Submission to the Independent
Civilian Review into Missing Persons Investigations Conducted by the Toronto Police Service,”
submitted by Heidi Illingsworth, Ombudsperson Office of the Federal Ombudsman for Victims of
Crime, Government of Canada (website), November 2019, https://www.victimsfirst.gc.ca/vv/MPI-
RPD/index.html.
4. Sasha Reid (University of Calgary sessional instructor), interview with coauthor David
Chrisinger, September 2019.
5. Kathleen McGrory and Neil Bedi, “Targeted,” Tampa Bay Times, September 3, 2020,
https://projects.tampabay.com/projects/2020/investigations/police-pasco-sheriff-targeted/intelligence-
led-policing/.
6. Kennedy quoted in McGrory and Bedi, “Targeted.”
7. Philip Tetlock, “Why Foxes Are Better Forecasters than Hedgehogs,” Long Now Foundation,
Seminars about Long-Term Thinking, January 26, 2007,
https://longnow.org/seminars/02007/jan/26/why-foxes-are-better-forecasters-than-hedgehogs/.
8. Hong Qu (Kennedy School adjunct lecturer), interview with coauthor Lauren Brodsky,
September 15, 2020.
9. Alberto Cairo, “About That Weird Georgia Chart,” Cairo (blog), May 20, 2020,
http://www.thefunctionalart.com/2020/05/about-that-weird-georgia-chart.html; Willoughby Mariano,
“ ‘It’s Just Cuckoo’: State’s Latest Data Mishap Causes Critics to Cry Foul,” Atlanta Journal-
Constitution, May 13, 2020, https://www.ajc.com/news/state--regional-govt--politics/just-cuckoo-
state-latest-data-mishap-causes-critics-cry-foul/182PpUvUX9XEF8vO11NVGO/.
10. Cairo, “About That Weird Georgia Chart.”
11. Kathleen Hall Jamieson and Paul Waldman, The Press Effect: Politicians, Journalists, and the
Stories That Shape the Political World (New York: Oxford University Press, 2002), 97.
12. David J. Johnson, Trevor Tress, Nicole Burkel, Carley Taylor, and Joseph Cesario, “Officer
Characteristics and Racial Disparities in Fatal Officer-Involved Shootings,” Proceedings of the
National Academy of Sciences 116, no. 32 (2019): 15877–82, 15880.
13. Johnson, Tress, Burkel, Taylor, and Cesario, “Officer Characteristics and Racial Disparities in
Fatal Officer-Involved Shootings,” 15880, 15877.
14. Alex Dobuzinskis, “More Racial Diversity in U.S. Police Departments Unlikely to Reduce
Shootings: Study,” Reuters, July 22, 2019, https://www.reuters.com/article/us-usa-police-race/more-
racial-diversity-in-u-s-police-departments-unlikely-to-reduce-shootings-study-idUSKCN1UI017.
15. Martin Kaste, “New Study Says White Police Officers Are Not More Likely to Shoot Minority
Suspects,” NPR, July 26, 2019, https://www.npr.org/2019/07/26/745731839/new-study-says-white-
police-officers-are-not-more-likely-to-shoot-minority-suspe.
16. Dean Knox and Jonathan Mummolo, “It Took Us Months to Contest a Flawed Study on Police
Bias. Here’s Why That’s Dangerous,” op-ed, Washington Post, January 28, 2020,
https://www.washingtonpost.com/opinions/2020/01/28/it-took-us-months-contest-flawed-study-
police-bias-heres-why-thats-dangerous/.
17. Knox and Mummolo, “It Took Us Months.”
18. Knox and Mummolo, “It Took Us Months.”
19. The hearing, set for September 19, 2019, was announced on the committee’s website:
“Oversight Hearing on Policing Practices,” House Committee on the Judiciary,
https://judiciary.house.gov/calendar/eventsingle.aspx?EventID=2278.
20. Stephen Soumerai and Ross Koppel, “How Bad Science Can Lead to Bad Science Journalism
—and Bad Policy,” Washington Post, June 7, 2017,
https://www.washingtonpost.com/posteverything/wp/2017/06/07/how-bad-science-can-lead-to-bad-
science-journalism-and-bad-policy/.
21. Prachi Sanghavi, Anupam B. Jena, Joseph P. Newhouse, and Alan M. Zaslavsky, “Outcomes
of Basic versus Advanced Life Support for Out-of-Hospital Medical Emergencies,” Annals of
Internal Medicine 163, no. 9 (November 3, 2015): 681–91.
22. Soumerai and Koppel, “How Bad Science Can Lead to Bad Science Journalism.”
23. “Advanced Care, Increased Risk: Advanced Life Support Ambulance Transport Increases
Mortality,” ScienceDaily, October 13, 2015,
https://www.sciencedaily.com/releases/2015/10/151013102416.htm.
24. Lena H. Sun, “Ambulances with Less-Sophisticated Gear May Be Better for Patients,”
Portland (ME) Press Herald, October 12, 2015,
https://www.pressherald.com/2015/10/12/ambulances-with-less-sophisticated-gear-may-be-better-
for-patients/.
25. Soumerai and Koppel, “How Bad Science Can Lead to Bad Science Journalism.”
26. “An Interview with John R. Lott, Jr.,” University of Chicago Press (website), 1998,
https://press.uchicago.edu/Misc/Chicago/493636.html.
27. “An Interview with John R. Lott, Jr.,” University of Chicago Press (website).
28. Patricia Grambsch, “Regression to the Mean, Murder Rates, and Shall-Issue Laws,” American
Statistician 62, no. 4 (2008): 289–95.
29. Jens Ludwig, “Concealed-Gun-Carrying Laws and Violent Crime: Evidence from State Panel
Data,” International Review of Law and Economics 18, no. 3 (1998): 239–54.
30. Ian Ayres and John J. Donohue III, “Shooting Down the ‘More Guns, Less Crime’
Hypothesis,” Stanford Law Review 55, no. 4 (2003): 1193–312.
31. The Regional Educational Laboratory at Florida State University provides an annotated
bibliography of this research in a section of its website named “Ask a REL Response”:
https://ies.ed.gov/ncee/edlabs/regions/southeast/aar/u_03-2019.asp.
32. Donna Gordon Blankinship and the Associated Press, “New CEO: Gates Foundation Learns
from Experiments,” Hartford Courant, May 28, 2009, https://www.courant.com/sdut-us-gates-
foundation-raikes-052809-2009may28-story.html.
33. John V. Pepper, review of The Bias against Guns: Why Almost Everything You’ve Heard about
Gun Control Is Wrong, by John R. Lott Jr., Journal of Applied Econometrics 20, no. 7 (2005): 931–
42, 931.
34. David Hemenway, unpublished review of The Bias against Guns: Why Almost Everything
You’ve Heard about Gun Control Is Wrong, by John R. Lott Jr., https://cdn1.sph.harvard.edu/wp-
content/uploads/sites/247/2013/02/Hemenway-Book-Review.pdf.
35. Knox and Mummolo, “It Took Us Months.”
36. Jacques Gallant, Paul Hunter, and Vjosa Isai, “How Alleged Serial Killer Bruce McArthur Hid
in Plain Sight for Years,” Toronto Star, March 16, 2018,
https://www.thestar.com/news/gta/2018/03/16/how-alleged-serial-killer-bruce-mcarthur-hid-in-plain-
sight-for-years.html.
37. David Bell, “U of C Serial Killer Expert Says There May Be More Bruce McArthur Victims,”
Canadian Broadcasting Corporation News, February 12, 2019,
https://www.cbc.ca/news/canada/calgary/sasha-reid-serial-killer-database-university-of-calgary-
bruce-mcarthur-1.5016846.
INDEX
facts, 4, 5, 87
complexity, 95, 113
unnecessary attribution vs., 87, 112
Faizi, Abdulbasir, 74
fake news, 50–51, 52
Federal Communications Commission, 24, 108
Finland, Ministry for Foreign Affairs, 47, 60–61
first drafts, 46
focus groups, 21–23, 28
France, 54, 59–60
Machill, Marcel, 5
Macron, Emmanuel, 59–60
Madrigal, Alexis C., 44–45
main finding. See message (main finding)
manipulation, of data, 100
Martin, David, 63–64
McAdams, Dan P., The Redemptive Self: Stories Americans Live By, 5
McArthur, Bruce, 71–72, 98–99
McGrory, Kathleen, 77–78
meaning, in data, 8, 33–68
paraphrasing and, 24
medical writing, 38–41
message (main finding), 38, 65, 108–9
impact-focused, 26–27
placement, 25–26, 108–9
subtext, 52, 110
methodology, 25, 108
Meyer, Robinson, 44–45
missing persons cases, 71–77
multidimensional views, 55–60
Mummolo, Jonathan, 90–91, 97–98
narratives, 45
enhancement of, 46
political, 19
strong differentiated from weak, 8–9
as TV news format, 5
National Assessment of Educational Progress (NAEP), 49–50
National Bureau of Economic Research, 14
Navaratnam, Skandaraj, 71–75, 98–99
net neutrality, 24, 108
New York Times, 27, 45
New York University, Center on International Cooperation (CIC), 17–
21
Nocco, Chris, 77–78
North Atlantic Treaty Organization (NATO), 57, 65
NPR (National Public Radio), 89
numbers: difficult-to-grasp, xii–xiii, 62–63, 110–11
ratios, 14, 34, 107
paragraphs, 54
with message (main finding), 27–28
and simplicity, 95, 113
paramedics, 91–92
paraphrasing, 24
parental education–college attendance relationship, 25–26, 108–9
parentheses, proportions in, 72, 111
Pasco County, FL, crime prevention program, 77–78
Pepper, John V., 97
peer review, 90
Penn Today, 14
people-based data storytelling, 13–29, 37, 101–2, 107–8
abstractions, 24, 108
focus groups and, 22–23, 28
imagery, 22, 108
personal stories, 22–24
qualitative data, 74, 79
quotations, 23–24, 29
testimonials, 22–24
percentage change, 57, 110
percentage point difference, xii–xiii, 57, 110
Perna, Gustave F., 63–64
persuasion, 4–5, 7, 53, 82
excessive data and, 53–54
multidimensional views and, 59
problem-specific solutions and, 73–74, 111
purpose and, 37, 59
susceptibility to, 17
uncertain data and, xiii, 80–82
unnecessary attribution and, 87, 112
Pew Research Center, 14, 58
Many Americans Believe Fake News Is Sowing Confusion, 50–51
plagiarism, 87, 112
points of view, 23
policy beacons, 19–21
policy goals, 46–48
broad vs. targeted, 46–48
foreign policy, 68
presidential, 26–27
“politeness norm,” 56
polls, 34, 37, 54–55, 56, 58, 60
Portland (ME) Press Herald, 92
predictive analysis, 60, 77–78, 81, 88, 92
presidential elections, 88
Princeton University, 90
problems: manufactured, 78–79, 111
presentation to clients, 18–21
problem solving: data-informed solutions, 6, 25, 73–80
identification of specific problem, 73–74
Proceedings of the National Academy of Sciences (PNAS), 89–91
process, relation to purpose, 35, 49, 66
Programme for International Student Assessment (PISA), 48–49, 50
progress, emphasis on, 19–21
proportions: differentiated from percentage changes, xii–xiii, 57, 110
in parentheses, 72, 111
props, narrative, 17–21
public diplomacy, 33–37, 54–57, 58, 68, 81
culture of evidence, 66–68
definition, 117n1
public good, 24–27
public opinion: foreign, of United States, 33–37, 54–57, 58, 68
social media as indicator of, 60–61
public policy, 8
implementation time, 27
public service, 38
purpose, 6, 8, 57, 67–68, 80, 102
data dumping and, 50–54, 60–61, 64
data indicators as focus, 48
data visualizations and, 84, 86
loss of, 45
persuasion and, 37, 59
policy goals and, 46–47, 48
readers’ goals and, 38
relation to process, 35, 49, 66
in science writing, 38–39, 41–42, 43–44
Putin, Vladimir, 57
television news, 5
testimonials, 22–24
Tetlock, Philip, 81–82
theories, data analysts’ approach to, x, 81–82
timeliness, accuracy vs., xiii, 83–87, 88, 112
tips: abstractions (#7), 24, 108
communication skills (#5), 7, 22, 108
comparability of data (#15), xii, 47, 48–50, 58, 87–88, 110
correlation distinguished from causation (#30), xii, 92–95, 113
data dumping (#16), 50–53, 64, 107, 110
data layering (#17), 50–55, 61–64, 110
data-supported solutions (#24), 78, 111
decision-making in writing (#12), xi, 38, 109
difficult-to-grasp numbers (#19), xii–xiii, 62–63, 110
evidence building (#14), 45, 66–68, 110
excessive data (#2), 50–53, 64, 107, 110
expansion of story function (#17), 53–54, 110
graph captions (#27), 85–86, 112
honesty and accuracy in data use (#26), xii, 83–86, 112
humanization of data scale (#20), xii–xiii, 64–66, 111
imagery (#6), 22, 108
manufactured problems (#24), 78–79, 111
message (main finding) (#8), 25–27, 38, 52, 65, 108–9
number of examples (#13), 42, 109–10
output vs. outcome and impact (#11), xi, xii, 37, 109
people-based data storytelling (#4), 107–8
percentage changes differentiated from proportions (#18), xii–xiii,
57, 110
problem-specific solutions (#23), 73–74, 111
proportions in parentheses (#22), 72, 111
randomized controlled trials (#32), 97, 113
ratios (#3), xii–xiii, 14, 34, 107
reporting to convey information (#1), 4–6, 44, 61, 68, 84, 85, 107
research questions (#10), xi–xii, 35, 36–37, 47, 109
selection of examples (#13), 42, 109–10
sentence subjects and verbs (#21), 66, 111
simplification of language (#31), 95, 113
suggestive/nonrepresentative data (#25), xii, 81–83, 112
timeliness vs. accuracy (#29), xiii, 83–87, 88, 112
tone of writing (#9), 28, 90, 109
unnecessary attribution (#28), 87–88, 112
tone, of writing, 28, 90, 109
Toronto Police Service, missing person / serial killer cases, 71–75, 98–
99
Toronto Star, 98
transparency, xii, xiii, 50, 67, 80–81
trends, 13, 20, 34, 37, 49, 60
context, 61–64
data visualizations of, 83–86
in education, 48–49
Trump, Donald J., 57, 62, 63–64
trust, 100
persuasion based on, xii
in science and scientists, 58, 59
Tufekci, Zeynep, 42–43
Turka, Laurence, 41–42
Twitter, 91
YouTube, 3
Zimmer, Carl, 45
About the Authors