Bring Human Values To AI
Summary. When it launched GPT-4, in March 2023, OpenAI touted its superiority
to its already impressive predecessor, saying the new version was better in terms of
accuracy, reasoning ability, and test scores—all of which are AI-performance
metrics that have been used for some time. However, most striking was OpenAI’s
characterization of GPT-4 as “more aligned”—perhaps the first time that an AI
product or service has been marketed in terms of its alignment with human values.
In this article a team of five experts offer a framework for thinking through the
development challenges of creating AI-enabled products and services that are safe
to use and robustly aligned with generally accepted and company-specific values.
The challenges fall into five categories, corresponding to the key stages in a typical
innovation process from design to development, deployment, and usage
monitoring. For each set of challenges, the authors present an overview of the
frameworks, practices, and tools that executives can leverage to face those
challenges.
When it launched GPT-4, in March 2023,
OpenAI touted its superiority to its already
impressive predecessor, saying the new
version was better in terms of accuracy,
reasoning ability, and test scores—all of
which are AI-performance metrics that
have been used for some time. However, most striking was
OpenAI’s characterization of GPT-4 as “more aligned”—perhaps
the first time that an AI product or service has been marketed in
terms of its alignment with human values.
1. Define Values for Your Product
The first task is to identify the people whose values must be taken
into account. Given the potential impact of AI on society,
companies will need to consider a more diverse group of
stakeholders than they would when evaluating other product
features. These may include not only employees and customers
but also civil society organizations, policymakers, activists,
industry associations, and others. The picture can become even
more complex when the product market encompasses
geographies with differing cultures or regulations. The
preferences of all these stakeholders must be understood, and
disagreements among them bridged.
This challenge can be approached in two ways.
Embed established principles. In this approach companies draw
directly on the values of established moral systems and theories,
such as utilitarianism, or those developed by global institutions,
such as the OECD’s AI principles. For example, the Alphabet-
funded start-up Anthropic based the principles guiding its AI
assistant, Claude, on the United Nations’ Universal Declaration of
Human Rights. Other companies have done much the same;
BMW’s principles, for example, resemble those developed by the
OECD.
2. Write the Values into the Program
Beyond establishing guiding values, companies need to think
about explicitly constraining the behavior of their AI. Practices
such as privacy by design, safety by design, and the like can be
useful in this effort. Anchored in principles and assessment tools,
these practices embed the target value into an organization’s
culture and product development process. The employees of
companies that apply these practices are motivated to carefully
evaluate and mitigate potential risks early in designing a new
product; to build in feedback loops that customers can use to
report issues; and to continually assess and analyze those reports.
Online platforms typically use this approach to strengthen trust
and safety, and some regulators are receptive to it. One leading
proponent of this approach is Julie Inman Grant, Australia's eSafety
Commissioner and a veteran of public policy in the industry.
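To make "constraining the behavior of the AI" concrete, the short sketch below shows one way a product team might encode values as an explicit, reviewable policy check that runs on AI outputs before they reach users. It is a minimal illustration in Python, not the practice of any company named here; the rule names and trigger phrases are hypothetical stand-ins for the trained classifiers and human-review workflows a real trust-and-safety system would use.

from dataclasses import dataclass
from typing import List

# Hypothetical value policy: each rule pairs a protected value with simple
# keyword triggers. A production system would rely on trained classifiers
# and human review, not keyword matching.
@dataclass
class ValueRule:
    value: str            # e.g., "privacy", "safety"
    triggers: List[str]   # phrases suggesting the output may violate the value

POLICY = [
    ValueRule("privacy", ["social security number", "home address"]),
    ValueRule("safety", ["disable the smoke detector"]),
]

def review_output(text: str) -> dict:
    """Check one AI-generated output against the value policy before release."""
    lowered = text.lower()
    violations = [rule.value for rule in POLICY
                  if any(trigger in lowered for trigger in rule.triggers)]
    return {"approved": not violations, "violations": violations}

# Example: an output that leaks personal information is held back and logged.
print(review_output("Sure, here is the customer's home address: ..."))
# -> {'approved': False, 'violations': ['privacy']}

The point of such a check is less the code itself than the discipline it enforces: the values the product protects are written down, versioned, and testable, so they can be audited and updated as customer reports come in.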
3. Assess the Trade-offs
In recent years we have seen companies struggle to balance
privacy with security, trust with safety, helpfulness with respect
for others’ autonomy, and, of course, values with short-term
financial metrics. For example, companies that offer products to
assist the elderly or to educate children must consider not only
safety but also dignity and agency: When should AI not assist
elderly users so as to strengthen their confidence and respect
their dignity? When should it help a child to ensure a positive
learning experience?
Given the challenges they face, managers are forced to make very
nuanced judgments. For example, how do they decide whether
certain content generated or recommended by AI is harmful? If an
autonomous vehicle narrowly misses hitting a pedestrian, is that
a safety failure or a sign that the vehicle’s safety system is
working? In this context organizations need to establish clear
communication processes and channels with stakeholders early
on, to ensure continual feedback, alignment, and learning.
A good example of what companies can do, although not
specifically focused on AI, is provided by Meta. In 2020, amid
growing public concern about how online platforms moderate
content, the company established its Oversight Board to help it
make value-driven decisions. The board is a group of
independent, experienced people from a variety of countries and
backgrounds who not only make some difficult decisions but also
help the company hear the views of its diverse stakeholders.
4. Align Your Partners' Values
On the podcast In Good Company, OpenAI's CEO, Sam Altman, described a
challenge his company faces: How much flexibility should it give people
of differing cultures and value systems to customize OpenAI's products?
He was referring to a trend whereby
companies take pretrained models, such as GPT-4, PaLM,
LaMDA, and Stable Diffusion, and fine-tune them to build their
own products.
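As a rough illustration of what that customization involves, the hypothetical sketch below fine-tunes a small open model on a partner's own examples, using the open-source Hugging Face transformers and datasets libraries and the publicly available distilgpt2 model purely for illustration. The data file values_examples.jsonl and its prompt/response format are assumptions made for the example, not a real dataset or a recipe endorsed by any provider mentioned here.

# Hypothetical sketch: a partner fine-tunes a small pretrained model on its
# own vetted prompt/response pairs. File name and format are assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "distilgpt2"  # small open model, used purely for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Each record pairs a prompt with a response the partner has vetted against
# its own values, e.g., {"prompt": "...", "response": "..."}.
data = load_dataset("json", data_files="values_examples.jsonl", split="train")

def tokenize(batch):
    texts = [p + "\n" + r for p, r in zip(batch["prompt"], batch["response"])]
    return tokenizer(texts, truncation=True, max_length=256)

tokenized = data.map(tokenize, batched=True, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="values-tuned",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

Because the fine-tuning data and objectives are chosen by the partner rather than the model provider, questions about how much customization to allow, and how to review it, sit at the heart of partner alignment.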
5. Ensure Human Feedback
Embedding values in AI requires enormous amounts of data—
much of which will be generated or labeled by humans, as noted
earlier. In most cases it comes in two streams: data used to train
the AI, and data from continuous feedback on its behavior. To
ensure values alignment, new processes for feedback must be set
up.
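What such a feedback process can look like at the data level is sketched below: a minimal, hypothetical Python example that captures one stream of human feedback, structured ratings of individual AI responses tagged with the value they touch, so reviewers can aggregate reports by value and feed them back into evaluation or retraining. The field names and file format are illustrative assumptions, not a standard.

import json
import time
from dataclasses import dataclass, asdict

# Hypothetical feedback record: one human judgment about one AI response,
# tagged with the value it concerns so reports can be aggregated by value.
@dataclass
class FeedbackRecord:
    response_id: str   # identifier of the AI output being rated
    value_tag: str     # e.g., "privacy", "dignity", "safety"
    rating: int        # 1 (violates the value) to 5 (fully aligned)
    comment: str       # free-text explanation from the rater

def log_feedback(record: FeedbackRecord, path: str = "feedback.jsonl") -> None:
    """Append one feedback record to a JSON-lines file for later analysis."""
    entry = asdict(record)
    entry["timestamp"] = time.time()
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

# Example: a rater flags an eldercare assistant's reply as undermining dignity.
log_feedback(FeedbackRecord(
    response_id="resp-001",
    value_tag="dignity",
    rating=2,
    comment="Assistant took over the task instead of coaching the user.",
))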
6. Prepare for Surprises
AI programs are increasingly displaying unexpected behaviors.
For example, an AI simulation tool used in a recent experiment by
the U.S. Air Force reportedly recommended that the pilot of an
aircraft be killed to ensure that the aircraft’s mission was properly
executed. In another example, the Go-playing program AlphaGo
invented new moves that Go experts deemed “superhuman and
unexpected.” Perhaps the best-known example involved
Microsoft’s Bing chatbot, which began to show aggressive and
even threatening behavior toward users shortly after launch,
stopping only after Microsoft reduced the possible length of
conversation significantly. Similarly unforeseen experiences will
increase in frequency, especially because ChatGPT and other
large AI models can now perform tasks that they weren’t explicitly
programmed for—such as translating from languages that were
not included in any training data.
...
In a world where AI values alignment may determine competitive
outcomes and even become a requirement for product quality, it
is critical to recognize the risks and the opportunities for product
differentiation and to embrace new practices and processes to
stay ahead of the game. Customers—and society more broadly—
expect companies to operate in accordance with certain values. In
this new world they can’t afford to launch AI-enabled products
and services that misbehave.
A version of this article appeared in the March–April 2024 issue of Harvard Business Review.
Jacob Abernethy is an associate professor at
the Georgia Institute of Technology and a
cofounder of the water analytics company
BlueConduit.
François Candelon is a managing director and
senior partner at Boston Consulting Group
(BCG), and the global director of the BCG
Henderson Institute.
Theodoros Evgeniou is a professor at INSEAD
and a cofounder of the trust and safety
company Tremau.
Abhishek Gupta is the director for responsible
AI at Boston Consulting Group, a fellow at the
BCG Henderson Institute, and the founder and
principal researcher of the Montreal AI Ethics
Institute.
Yves Lostanlen has held executive roles at and
advised the CEOs of numerous companies,
including AI Redefined and Element AI.