
DATA & AI
TRANSFORMATION
FOR BUSINESS

WE ACCELERATE DATA AND AI ADOPTION
TO POSITIVELY IMPACT
PEOPLE AND ORGANIZATIONS.

Offices: The Netherlands, UK, Germany, France, Switzerland, Spain, New York, Los Angeles, South Korea, Lebanon, Chengdu, Morocco, Shanghai, Dubai, Mexico, Saudi Arabia, India, Senegal, Colombia, Malaysia, Singapore, Brazil, South Africa

21 COUNTRIES | 1,500 EMPLOYEES | +1,000 CLIENTS

Artefact is a global leader in consulting services, specialized in data & AI transformation and data-driven digital marketing, from strategy to the deployment of AI solutions. We offer a unique combination of innovation (Art) and cutting-edge AI technologies (Fact).

DATA READINESS | AI ACCELERATION | DATA & DIGITAL MARKETING | TECHNOLOGIES


TABLE OF CONTENTS

Data & AI transformation for Business
4 Introduction by Vincent Luciani, CEO & co-founder of Artefact

Data Readiness
8 Enterprise Governance in the data age
11 ORANGE France — AI solution of visual recognition at the service of Orange France technical intervention quality
14 A Practical Approach to Business Impact from Data & AI | Ten Proven Tactics from the Battlefield
17 From idea to implementation: Becoming an AI Factory
20 CARREFOUR — Google Data Lab: Using AI to drive value in store based on the AI Factory's operating model of Artefact
22 HEINEKEN Brazil — Using the Data Factory methodology as a Revenue Generation Center
24 Data Governance, a prerequisite for AI project success
26 The vital role data governance plays in achieving sustainability goals
30 PIERRE & VACANCES CENTER PARCS — How data governance and data quality can boost digital marketing and activation performance
33 Data Mesh: Principles, promises and realities of a decentralized data management model
37 Why is the "data as a product" concept central to data mesh?

AI Industry Solutions

Demand Forecasting
39 Demand forecasting: Using machine learning to predict retail sales
41 L'OREAL — Trend Detection: Innovating tomorrow's products today thanks to AI trend detection by Artefact
43 Scoring Customer Propensity using Machine Learning Models on Google Analytics Data

AI for Call Centre
50 Powering your call centre with artificial intelligence
54 MAIF — Using Topic Modelling to reduce contact centre bottlenecks
56 Using NLP to extract quick and valuable insights from your customers' reviews
63 HOMESERVE — Using speech analytics to improve customer satisfaction

AI for Finance & Industry
66 Unlocking the Future: How financial institutions can prepare to scale AI
68 Gaining buy-in for data & analytics initiatives in financial services
74 The road ahead: data-driven sales is critical for the evolving car industry
77 Interview: How Nissan is transforming in the digital world

Data for Impact
81 Use data to measure and reduce your environmental impact with Artefact
83 Industrializing carbon footprint measurement to achieve neutrality
85 Applying machine learning algorithms to satellite imagery for agriculture applications
The outlook for data and AI transformation, today and tomorrow.

An interview with Vincent Luciani

The generative AI technology revolution has been a paradigm shift for all industries and sectors. Artefact sees AI as an incredible opportunity that, if used properly and ethically, will lead to economic, social, and democratic progress.

How is generative AI profoundly transforming society and businesses?

We are at the beginning of a new era. The generative AI revolution is reshaping societal and economic landscapes. After an experimentation phase, generative AI will continue to change the game for the global community. It's a technology with the potential to improve the world in many ways, as long as solid checks and balances are in place to ensure its responsible and beneficial development.

• Economically, it offers undeniable productivity gains that will spur innovation and new business growth.

• Socially, generative AI will streamline administrative tasks, freeing up more valuable and creative time, which could lead to innovative job opportunities and the development of new skills.

• Democratically, the accessibility of GenAI to all will provide deep knowledge and solutions to address specific societal and educational inequalities and advance the cause of social justice.

"At Artefact, we hold an optimistic vision, viewing AI as an incredible chance that, if used properly and ethically, will lead to economic, social, and democratic progress."

How is Artefact leading the generative AI transformation for enterprises?

Since the availability of the first LLMs (Large Language Models), even before the official public launch of ChatGPT in November 2022, we at Artefact have been one of the key global pioneers using this powerful technology, designing and deploying many generative AI use cases with our clients throughout 2023.

As certified experts with major Clouds and open source GenAI, we've already acquired strong expertise and developed a solid ecosystem. In this context, we recently announced our official strategic collaboration with Mistral AI, the most powerful LLM platform and a European counterpart to OpenAI.

Despite achieving notable reductions in development time and enhanced employee adoption, scalability of GenAI projects remains a challenge, emphasizing the need for ethical and secure environments grounded in robust data foundations.

For more than 10 years, Artefact has prioritized the crucial role of data in AI success for enterprises. Initiating data acceleration programs, we focus on elevated data quality, governance, and interconnected platforms, adhering to ethical and responsible guidelines.

Anticipating substantial growth thanks to these new LLM technologies, companies are urged to embrace AI for a competitive edge. This transformative year will necessitate new organizational models and widespread AI deployment across business value chains, with Artefact accompanying its clients from strategy to full operations.

However, the success of technology shifts depends on fostering trust and enthusiasm among all employees, requiring consultation and support from top to bottom, an area where hackathons and training can be instrumental.
What is Artefact's mission? What initiatives have you taken to anchor your AI adoption strategy to accelerate business growth and efficiency?

Artefact's primary mission is to accelerate the adoption of data and AI to positively impact people and organizations.

To translate this purpose, our motto is «AI is about people».

The companies that will endure are those that successfully foster a data culture with access to knowledge and data for all.

We've undertaken several initiatives in this area that are highly strategic to Artefact's positioning as a major player in data democratization, in order to fully realize its potential for positive transformation.

• The development of the Artefact School of Data, a key pillar in our strategy of providing clients with training adapted to the constantly evolving skills of the data industry. We are also developing «à la carte» e-learning platforms for clients to quickly share knowledge of data and AI with all of their employees. We've expanded our Artefact School of Data from France to Dubai and New York City, and soon to other cities, to educate organizations about data and AI while creating new job opportunities in this domain.

• We've also launched many generative AI hackathons at major companies to empower and inspire their employees with these new innovative technologies.

• For over five years, we've organized large-scale conferences such as AI for Finance & Industry and AI for Health. We also successfully launched the first edition of AI for Luxury in NYC and AI for Life in Geneva, bringing together top-level AI ecosystem stakeholders, including major corporations, startups, and universities, to disseminate knowledge about data and AI.

"With significant growth expected thanks to generative AI technologies, we advise all organizations to embrace AI to gain a competitive advantage."

Business data maturity has advanced rapidly over the past decade. How has Artefact evolved as a global leader in data and AI consulting services?

Companies have implemented data governance policies, which are a prerequisite for any transformation, but there are still sectors that lag far behind in terms of their data processing, with a real potential for efficiency.

We started by transforming marketing departments, making them more profitable and relevant in their multi-channel media investments with pioneering targeting, measurement and personalisation solutions. For the past few years, we have also been deploying acceleration programs in all business areas (Sales, Supply Chain, Operations, Call Centers, HR and Finance, etc.).

We create value wherever there is data, and work with our clients to improve their processes and create customized business AI applications.

Can you give us concrete examples that show how Artefact designs AI solutions that improve business competitiveness?

Data is the key to understanding customers, developing better products and services, and streamlining internal operations to reduce costs and waste. Artefact supports over 1,000 clients worldwide, including 300 international brands in sectors from consumer goods, retail & e-commerce, and healthcare, to bancassurance, telecoms, industry, energy - and more.

For example, we've been working with the Orange Telecommunications group for over six years, and among the many use cases for leveraging the company's automation and AI potential, we deployed a solution with their teams to optimize their technicians' interventions on the fiber network. The solution is based on visual recognition technology that helps operators improve the quality of their installations or repairs. This application, available on a tablet, is currently used by more than 10,000 Orange technicians throughout the country - a resounding success!
This case perfectly illustrates Artefact's firm belief that to achieve true data maturity, companies have no choice but to make data accessible to everyone: not only to experts, but also to operational staff in the field. This will lead to new forms of augmented work, where applications and their interfaces put intelligent information in everyone's hands to work more efficiently and with more autonomy.

Artefact also helped the Carrefour Group reduce the carbon impact of its e-commerce branch with a solution that can be implemented by the company and consumers. Carrefour's aim is to become the world leader in food system transformation for all by committing to four major objectives, including achieving carbon neutrality by 2030 for its e-commerce activities. The challenge for Artefact was to enable Carrefour to reliably measure all greenhouse gas emissions from data storage, transport and logistics activities, from first click to final delivery.

Our solution measured greenhouse gas emissions generated by e-commerce orders, then collected activity data to convert it into carbon emissions. All Carrefour business teams helped obtain the data – which is why the operation was a success, as it allowed all stakeholders to become ambassadors for the group's "carbon neutrality 2030" objective.

How is Artefact able to always be at the forefront of AI through core research and advanced technology?

At Artefact, we've implemented major projects to ensure that we always leverage the best of data science and AI technologies for our clients:

• The launch of the Artefact Research Center, which fosters a robust data and AI R&D ecosystem by connecting PhD talent at Artefact with esteemed professors from top universities (Polytechnique, Sorbonne University, CentraleSupélec, University of Paris-Saclay) and leading enterprises including Orange, Société Générale and Decathlon, with other companies joining us soon. Through the developments and publications of the Artefact Research Center, we aspire to shape a future where AI is not only a powerful tool but is also tailored to the needs of businesses with ethics and responsibility, thereby facilitating its adoption.

• The creation of the SKAFF technology platform, an open source developer portal that includes a central software components catalog supporting TechDocs and a scaffolder for automating engineering processes. This platform enhances efficiency by swiftly delivering high-quality outcomes through the consolidation of technical assets, convictions, and tutorials focused on our core technologies.

After a decade of exponential growth, what is Artefact's ambition for the coming years?

First of all, our gratitude goes to our clients for entrusting us; they are a cornerstone of our success.

I believe that our success also stems from our unique ability to transform data and AI into value for companies. We offer our 1,000+ clients a unique combination of innovation (Art) and data science (Fact).

By creating multidisciplinary teams and breaking down silos between business and technology departments, we generate real, immediate impact for clients. Artefact has become one of the first and few consolidated pure data & AI players in the market, with the most comprehensive set of data-driven services and AI applications.

We offer data acceleration programs, industry-specific AI solutions, and data-driven marketing services. Our engineers build tech-agnostic solutions, combining custom code with open source and proprietary software, backed by strong partnerships with leading cloud providers, to create exactly what you need for your data and AI transformation.

Today, Artefact is present in 20 countries across Europe, Asia, the Americas (North & South), the Middle East and Africa, with 23 offices and 1,500 employees. And we have robust plans for geographical expansion as well as an ambitious M&A policy that will continue.

We're also continuously hiring new consulting Partners and Directors, experts in their respective fields, orchestrating collaboration across Artefact's regions. They provide dedicated support and industry-specific services. While strengthening our positions in CPG, Retail, and Luxury, we've also intensified our development in Financial Services, Healthcare & Pharmaceuticals, and Manufacturing, reinforcing human resources.

We're excited about the promising future that AI holds for individuals and organizations. The excellence of AI technology will be realized through the collective capabilities of human talent.
Data Readiness

8 Enterprise Governance in the data age
11 ORANGE France — AI solution of visual recognition at the service of Orange France technical intervention quality
14 A Practical Approach to Business Impact from Data & AI
17 From idea to implementation: Becoming an AI Factory
20 CARREFOUR — Google Data Lab: Using AI to drive value in store based on the AI Factory's operating model of Artefact
22 HEINEKEN Brazil — Using the Data Factory methodology as a Revenue Generation Center
24 Data Governance, a prerequisite for AI project success
27 The vital role data governance plays in achieving sustainability goals
30 PIERRE & VACANCES CENTER PARCS — How data governance and data quality can boost digital marketing and activation performance
33 Data Mesh: Principles, promises and realities of a decentralized data management model
37 Why is the "data as a product" concept central to data mesh?
Enterprise Governance in the data age

Vincent Luciani
Co-founder and CEO
ARTEFACT

Data enables better decisions that improve business performance. But how can the company's decision-making body and governance be adapted to take better advantage of the data age?

First, an anecdote about the pitfalls of not using data.

In 1984, a young Michael Jordan – today considered to be the best basketball player of all time – had just won the American college basketball championship in North Carolina. A shooting guard, he was one of the top three picks of the NBA draft, which selects the best American college players. But at the time, shooting guards were considered smaller and less impressive than pivots. Because of this reasoning, and regardless of the data that indicated his extraordinary potential, Adidas twice refused to sponsor Jordan.

The result? Nike – Adidas' biggest competitor – approached Jordan to design his own line of shoes. To date, 100 million pairs of Air Jordans have been sold. The huge commercial success of the Jordan brand still generates a turnover of 3 billion dollars today, thanks to the man who was six times voted best player in the NBA. That's a lesson every business should take to heart: always make the most of your data.

How data enables better decision making

Data helps you understand the past: data analysis offers a clearer way of seeing the root causes of problems in a multifactorial world.

Data helps you predict the future: data can forecast consumer demand. A good example is the way we helped Carrefour Group predict sales in their bakery and pastry department. The goal was to sell more by avoiding stockouts, and to sell better by reducing food waste. The idea was to provide the managers of the fresh produce department, who make bread and pastries on site every day, with an accurate estimate of volumes. We used a technique called machine learning, based on learning from historical data. It worked very well: we improved turnover by a few points by avoiding stockouts, while dividing waste by three. (A minimal sketch of this kind of forecasting model follows below.)

Data helps you optimise what already exists: UPS software gives each of its drivers specific routes to follow, but they aren't always the shortest routes: they don't factor in the distance to be covered, but rather the fewest left turns to be made on each route. By analysing the data, they realised that 60% of all accidents were caused by taking left turns, and only 3% by taking right turns (which require more waiting time).

Analysis, prediction and optimisation: with these, data becomes a 'production factor' rather than an 'innovation factor'. All these initiatives are linked to significant ROI, whose stakes can be in hundreds of points of turnover or incremental margins.
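The Carrefour bakery example rests on a standard pattern: learn daily volumes from historical sales enriched with calendar and promotion features, validated in time order. Below is a minimal sketch of such a forecasting model; the file and column names are assumptions for illustration, not Carrefour's actual data or code.

```python
# Minimal demand-forecasting sketch (illustrative only, not Carrefour's model).
# Assumes a daily sales history with simple calendar and promotion features.
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

# Hypothetical training data: one row per product per day.
df = pd.read_csv("bakery_sales.csv", parse_dates=["date"])
df["day_of_week"] = df["date"].dt.dayofweek
df["month"] = df["date"].dt.month
df["lag_7"] = df.groupby("product_id")["units_sold"].shift(7)  # sales a week ago
df = df.dropna()

features = ["day_of_week", "month", "is_promo", "is_holiday", "lag_7"]
X, y = df[features], df["units_sold"]

model = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05)

# Time-aware validation: never train on the future to predict the past.
cv = TimeSeriesSplit(n_splits=5)
scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_absolute_error")
print(f"MAE across folds: {-scores.mean():.1f} units")

model.fit(X, y)  # final model used to estimate tomorrow's volumes
```

A per-product estimate of tomorrow's volume, rather than a single aggregate figure, is what would let department managers bake to demand — the mechanism behind selling more while wasting less.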
Data must be treated as a strategic asset

Companies need to resolve their 'data debt' – a technology debt accumulated around the lifecycle of data. We have a multiplicity of very complex IT systems, superimposed over time, with data sources that are very often poorly documented, sometimes inconsistent, difficult to access, and which don't comply with the rules in force (internal, or external such as GDPR). This 'debt' wastes a lot of time in mobilising reliable information for analysis.

A good illustration of this is the case of one of Artefact's major pharmaceutical clients. Until recently, they were unable to obtain their turnover-by-product figures due to an inability to cross-reference their production plant product repository with their transactional repository, which contains financial data organised by sales outlet: an irrational situation where the company's own data was unavailable for it to use.

The longer data debt is allowed to accumulate, the more costly it becomes to rectify. Treating data as a strategic asset means agreeing to invest in a program to improve data quality, documentation and accessibility, and to do so in a sustainable manner as sources multiply.

A data-driven company must become a talent development factory

Talent has become the decisive factor in the digital age. Access to technology is a commodity so universally accessible that the emergence of no-code, for example, and cloud computing are increasingly associated with turnkey services such as database storage and operation or automatic algorithm building.

The recruitment war is very serious – among the seven million available job offers in the US posted on LinkedIn (70% of the total), for example, two out of seven are for data-related positions. The pace of technological change is so rapid that it's impossible to establish a competency framework at any given time.

This acceleration is being driven by GAFA-backed big-budget cloud technology frameworks, research labs and free algorithms from a global network of 100k researchers, plus the almost immediate adoption and widespread use of open source in the start-up world.

In an environment where technology is changing so rapidly, companies must be able to build their own talent. For large companies in particular, extensive training/retraining programmes will be needed:

• The World Economic Forum (WEF) predicted that by 2025, 85 million jobs will disappear and 97 million new ones involving data will be created.

• AT&T now invests around $250 million annually in T University, which enables existing employees to develop in-demand expertise in areas such as data science and cybersecurity.

When considering the talents that will be needed tomorrow, it's tempting to focus only on technology and assume the next generation will be exclusively composed of engineers and data scientists. Clearly, there will be a need for them, and there will even be a shortage of them for the next few years, but this is only part of the story. In a world where data and algorithms can automate manual, repetitive and time-consuming tasks, and where technology is ever more accessible, there's plenty of room for other types of talents: problem-solving, creative, interpersonal, etc.

In a modern company, the decision-making process must be decentralised

In technology, there is a major progressive movement towards decentralisation, which began with technological breakthroughs (e.g., cryptocurrency or the metaverse, which are decentralised systems), but also IT systems (the cloud, where we share our machines, or distributed computing architectures such as Hadoop, a world-renowned framework for distributing calculations on different servers).

Decentralisation is also valid in governance. Why? Because centralisation is impossible: there's simply too much data, with poorly controlled sources, which can easily be poorly interpreted without contextual knowledge. Some benefits of decentralisation in governance include:

1. Rapid decision-making and less time spent going back and forth

2. Letting the 'one who knows' make the best decision

3. Empowering decision-makers with a mandate – with limits and a control loop, of course.

This implies a deep organisational change oriented around knowledge: the organisation as an autonomous whole, constituted of cognisant communities organised around knowledge. At Artefact, for example, we made sure that certain entities (chapters, tribes, guilds) could decide on their own very critical things, like salaries, bonuses, prices and offers, and even staffing! We have created a fully decentralised governance.
Leading is measuring

Some measurements are unnecessary in decision-making and can easily be dispensed with in favour of other, more proactive management tools. Here's an example to demonstrate this concept:

A Profit & Loss statement is nothing more than the company's report card. It's useless in decision-making and doesn't serve the investor relationship at all. (We almost never look at turnover and EBITDA, which is like looking at past performance, not future performance.)

Instead, at Artefact, we spent time constructing a very advanced data-driven finance department. We built a financial data warehouse connected in real time to a data platform, with our key Salesforce tools for pipeline estimation, a staffing tool, our HR database, and our ERP for customer contracts. Out of this, we've created a real-time file containing seven key indicators summarising the operational health of the company, looking forward and shared with investors – a tool far more meaningful and helpful for better decision making.

For example, we can respond to and engage with our shareholders on issues such as what our expected demand will be in six months, or what the composition of our workforce is supposed to be (critical in a context of rapid growth and challenging recruitment).

KPIs are broken down into Objectives and Key Results (OKRs) shared among the top partners of the company, and we all share them too. They are invaluable tools which break any strategy down into measurable objectives, then into two or three sub-objectives shared by all employees. Andy Grove taught them to John Doerr, who in turn wrote the book 'Measure What Matters' about the process. OKRs allow employees to be valued for their accomplishments, not merely their backgrounds, degrees, or titles.

Conclusion: A new perception of data and data governance

Data changes the way companies are governed, and the role of managers – and directors in particular – from one of making the best decisions in the company's interest to one of creating a system so that everyone contributes to making the best decisions in the company.

As a Board member, you have an important role to play in addressing issues that have a tangible impact on the results of the company, but it will require paradigm shifts. And it is precisely because data shakes up governance that it must be appropriated by governance itself.

As your data governance policy matures, you will need to ensure that your priorities encompass responsible and ethical data processing and energy sobriety initiatives within its framework. It's crucial to operate in ways that reduce the economic, social and environmental footprint of digital technology.

In times of doubt or radical change, the human tendency is to shut down and turn inward. People build self-defence systems (confirmation bias) to avoid being shaken in their convictions, by creating their own systems of truth. Properly used, data is the truth that should allow us to make better, more reliable and independent decisions for the benefit of customers, users, employees and the company as a whole.
CASE STUDY

ORANGE FRANCE
AI solution of visual recognition
at the service of Orange France
technical intervention quality
CHALLENGES

AI-based application lets fiber installers easily verify intervention compliance

Orange, a leading telecommunications company in France and Europe, undertook an Artificial Intelligence-based transformation in 2020. This new vision for data and AI aims to help the thousands of technicians, engineers and other professionals employed by the group with tasks that are too complex for humans to manage alone. At the same time, an overhaul of data product governing regulations was also launched.

To implement their new strategy, Orange was accompanied by data service consulting firm Artefact. Together, the two businesses industrialized numerous use cases to support the group's Business Units, facilitate technical decision-making and transform the business by realizing the potential of data and AI.

Among these jointly constructed use cases is an AI solution designed to assist Orange technicians in connecting clients to the fiber network. This solution, integrated into the technician's application, verifies that none of their interventions on network equipment generate "defects" or "non-conformities", which are often the cause of the growing degradation of the fiber network in France.

In 2022, the number of connections increased by 23% compared to 2021. This sustained growth has led to an increase in reported malfunctions. This is why ARCEP made it mandatory, in summer 2021, to take pictures before and after each intervention. These images have a triple objective: to monitor, to intervene as quickly as possible in the event of a problem, and to penalize the operators found at fault.

"For Orange, this regulation requires the analysis of 20,000 photos daily. A task impossible to perform quickly and faultlessly without AI assistance."
Médéric Chomel, VP Data, AI & Automation, ORANGE FRANCE
An AI solution based on image recognition

Asking technicians to verify numerous control points at the job site, or having enough human resources dedicated to analyzing the 20,000 pictures generated every day, is not a feasible solution. This would be too time-consuming, too costly, and would not be error-proof. In addition, sampling is not an option, as each and every intervention must be verified.

To review these tens of thousands of photos daily (10,000 interventions x 2–10 photos), Orange and Artefact developed an algorithmic model using image recognition (computer vision). Technicians, via their mobile app, send their photos to an artificial intelligence engine which checks in near real-time whether their work is in conformity. If the technician disagrees with the machine's recommendations, they are free to ignore them. The AI is perfectly integrated into the technician's workflow.

Success factors: quality data, multidisciplinary teams, transfer learning model

The project was led by a multidisciplinary team with a mix of profiles from both Orange and Artefact. A feature team was created, composed of the Product Owner, data scientists, engineers, users, and experts from other professions to work on delivering the solution.

Time was the first issue faced by the team in charge of the project. Orange only had nine months to deploy the first version of their solution. This is why they decided to base part of the project on "transfer learning", a method of using pre-existing models either already in use within the company, or available as open source. Artefact teams then reworked these models via retraining, labeling and pre-processing to shorten delivery time, and also developed several others from scratch. (An illustrative sketch of this approach follows below.)

The team then looked at the response times of different computer vision solutions. Some solutions on the market processed images in seven or even eight minutes, while the target time was three seconds. The application should be launched when the technician is about to leave the site; it is impractical to ask the person to wait 10 minutes to check the conformity of their installation. To reduce latency as much as possible, calculations are parallelized: several models are executed at the same time to obtain results in quasi real time.

The third challenge, but certainly not the least, was analysis precision. In order to ensure maximum accuracy, the algorithm had to be supplied with a huge quantity of compliant and non-compliant photos. The labeling of these images was carried out by a partner company called Isahit, which was able to process 80,000 photos in three months of development while respecting data confidentiality.

"This project is part of Orange's long-term AI transformation strategy. We have packaged the code so that it can be reused in future use cases where image recognition is needed. This AI product has already been reused to support fiber technicians in another field of operations."
Vincent Luciani, co-founder and CEO of the ARTEFACT group
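To make the transfer-learning and parallelization ideas concrete, here is a minimal sketch of fine-tuning a pretrained vision backbone for a compliant/non-compliant classification and running several control-point models concurrently. The control-point names, model choice and latency strategy are our assumptions for illustration, not Orange's production code.

```python
# Illustrative transfer-learning sketch (not Orange's production system).
from concurrent.futures import ThreadPoolExecutor

import torch
import torch.nn as nn
from torchvision import models

def build_classifier(num_classes: int = 2) -> nn.Module:
    """Transfer learning: reuse a pretrained backbone, freeze its generic
    features, and retrain only a small head on labeled intervention photos."""
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for param in model.parameters():
        param.requires_grad = False                 # keep pretrained features
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new task head
    return model  # fine-tuning loop on compliant/non-compliant photos omitted

# Hypothetical set-up: one specialised model per control point.
checkpoints = {name: build_classifier() for name in ("cabling", "splice", "labeling")}

@torch.no_grad()
def run_check(name: str, photo: torch.Tensor) -> tuple[str, int]:
    pred = checkpoints[name].eval()(photo).argmax(dim=1).item()
    return name, pred  # 0 = compliant, 1 = non-conformity detected

def verify_intervention(photo: torch.Tensor) -> dict[str, int]:
    """Run all control-point models in parallel, mirroring the idea of
    executing several models at once to stay within a few-second budget."""
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(lambda n: run_check(n, photo), checkpoints))

# Usage: verify_intervention(photo_tensor) with photo_tensor of shape (1, 3, 224, 224)
```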
SOLUTION

Change management: encouraging end-user adoption of the application

To understand their way of working, technicians have been part of the project from the very beginning. This allowed the development team to identify several points, one of which is crucial: the application should not be perceived as a means of controlling the work of technicians, but as a tool to facilitate their daily work.

So, to ensure that end-users are comfortable with the application and that it was ethically designed, the team worked on two aspects.

First, technicians must be able to maintain control over the machine and go against its recommendations. This is why the explainability of the results returned by the model was a core value. If the model finds one or more non-conformities, the AI must specify which area or areas are affected.

Then, once a first version of the application was ready, the team had it tested by 50 volunteer technicians. This allowed the team to collect relevant feedback so they could improve the models. As an example, the conditions in which the photos are taken can lead to confusion between orange-colored cables (from Orange) and red-colored cables (competitors). The recurrence of this error led the feature team to improve the algorithm's acceptability: the model's performance was reduced in order to avoid contradicting what the human sees.

For Vincent Luciani, co-founder and CEO of the Artefact group:

"All of our AI projects are designed to respect the seven fundamental principles for ethical AI use established by a group of European Commission experts. The first of these values is human control. We have placed technicians at the heart of the project to ensure that this new solution makes their daily lives easier and doesn't hinder their autonomy. This has also been crucial for its adoption by all Orange installers."

RESULTS

A success which is part of a global transformation strategy through AI

In just nine months, this new application was designed, tested, corrected and industrialized on a large scale. It was a real technical and human feat, as the tool has now been adopted by the 10,000 technicians deployed every day in France.

This application is just one of the 150 use cases developed by Orange over the last two years as part of its transformation through AI. Since then, Orange – together with Artefact – has put 15 new models into production to support other functions, such as sales or customer service.

Médéric Chomel, VP Data & AI Automation, ORANGE FRANCE:

"Regulations on artificial intelligence are in the process of being developed. Our transformation, using data and AI, is intended to be respectful of privacy, to benefit humans and their environment, and to be unbiased. This is why this strategy anticipates future regulatory changes as far as possible. We must remember that this type of project is not just technical; humans are the greatest factor in their success."
A Practical Approach to Business Impact from Data & AI
Ten Proven Tactics from the Battlefield

Oussama Ahmad
Data Consulting Partner, Global Travel & Tourism Lead
ARTEFACT

Karim Hayek
Data Consulting Senior Manager
ARTEFACT

1 - Building data solutions should be driven by the business, for the business

In today's rapidly-evolving business landscape, data intelligence has become an essential tool for companies looking to remain competitive. Organizations that fail to adopt data-driven models risk falling behind their rivals by missing out on valuable insights and opportunities for expansion, optimization and innovation. In short, leveraging data intelligence for business is no longer a luxury, but a necessity for sustainability and evolution, and business leaders should be the ones spearheading the identification, prioritization, and development of data & AI solutions. Contrary to the common belief that business stakeholders are just "consumers" of data solutions, we believe that they should lead the entire process, supported by data and technology experts.

2 - Identifying the "right" data solutions requires in-depth analysis of the business value chain and business processes

A thorough analysis of the business value chain and key business processes is best carried out by business stakeholders themselves. This analysis identifies areas where data solutions can drive significant business impact in the form of revenue growth, cost optimization, customer experience enhancement or operational excellence. During this process, it's essential to identify business opportunities that align with the company's overall business strategy. For example, analysis of the supply chain and its key ratios can help identify potential gaps and inefficiencies that can benefit from data analytics and intelligence.
3 - Prioritizing a few data solutions will ultimately have the most business impact

The goal shouldn't be to impress with a long list of data solutions, but rather to identify the most critical business areas that can benefit from data-driven insights. By avoiding the temptation to pursue too many data solutions, organizations can stay focused and increase their chances of building successful data solutions. It's also important to identify the value-added capabilities of data solutions beyond simple reporting. While reporting is valuable in providing a summary of business performance, it only provides a retrospective view of data, leaving little room for analysis and decision-making. To fully leverage the power of data, organizations must identify data solutions that provide diagnostic analytics, which automatically identify the root causes of performance, and predictive analytics, which anticipate future trends.

"As organizations seek to achieve tangible business results from their investments in data analytics and artificial intelligence, it's critical to adopt a focused approach that builds the right solutions and sets the right expectations. Through this approach, business leaders spearhead the development of data & AI solutions 'for the business by the business' - prioritizing the most impactful solutions, building quick POCs with data experts, scaling data solutions that work, and accepting 'failure' on those that don't. Having business teams lead the whole process ensures business buy-in and adoption by design."
Oussama Ahmad, Data Consulting Partner, Global Travel & Tourism Lead - ARTEFACT

4 - Assessing feasibility of data solutions requires a full understanding of data sources and technologies

Before embarking on the development of a data solution, it is vital to conduct a detailed feasibility study that examines the availability and quality of the required data sources, as well as the cost of the technologies and expertise required to collect and process these data sources. This includes examining the hardware and software requirements, as well as the human skills needed to implement and maintain the technology. This also helps to set realistic expectations for data solutions that are consistent with the maturity of the required data sources, technologies, and capabilities.

5 - Building data solutions efficiently needs a scalable AI Factory and an agile development process

Building and scaling data solutions for businesses requires a new operating model - an AI Factory - made up of feature teams led by business experts supported by data scientists, engineers, analysts and software engineers. This team structure ensures that data solutions are always built with a business objective in mind. Adopting an agile test-and-learn process that attempts to build a successful POC in a short time span is also essential to achieve faster time-to-build.

6 - Accepting that some data solutions will fail, and scaling and maintaining those that work

Not all data solutions will succeed; some will fail, due to technical or data limitations, despite careful planning and execution. It is crucial for organizations to recognize that failure is a natural part of the development process: it should not discourage them from pursuing future projects. Instead, companies should focus on industrializing successful data use cases, scaling them to full data domains, and optimizing their algorithms and data sources. This also includes ongoing monitoring and improvement of the use case to ensure that it continues to meet the needs of the business users.

7 - Sharing knowledge is necessary but not sufficient for wide data solution adoption

Providing data solution training and easy-to-use documentation for business users is necessary, but usually not sufficient for widespread adoption of data use cases. Widespread adoption of data solutions by business users is best achieved by having users lead the development process, integrating these solutions into the organization's learning curriculum, and including adoption and impact KPIs in business user scorecards. By aligning business user scorecards with the organization's data strategy, organizations can create a culture of data-driven decision making and ensure that the adoption of data solutions leads to tangible business impact.

8 - Improving data solutions is continuous; prioritizing enhancements that matter is key

To achieve continuous enhancement of data solutions, it is vital to regularly collect feedback from business users, evaluate their needs and requirements, and make the necessary adjustments. The Scrum methodology provides an effective approach for gathering and implementing improvements in an iterative and incremental manner. Users of data solutions should log continuous feedback on the accuracy and usability of data solutions, as well as required improvements to business processes. It's important to (1) implement improvements that increase the accuracy of the solution's output, (2) expand its features and functionality, and (3) improve its usability and user experience.

9 - Maintaining robust governance of data solutions ensures accurate results with minimal oversight

Maintaining high-quality data sources for data solutions is crucial for achieving automated, accurate results with minimal oversight. To achieve this, organizations should implement a robust data quality framework that enforces clear guidelines and standards for data collection and transformation. In addition, organizations should implement strong data security and privacy policies for secure and compliant data processing. This approach ensures that input data is accurate, current, and consistent, which reduces the risk of errors and improves the overall efficiency of the data processing workflow.

10 - Tracking the business impact of data solutions requires defining direct impact KPIs and assigning incremental business impact

Identifying the commercial or operational KPIs that are directly improved by a data solution is essential to measuring its business impact. Once these KPIs are identified, the next step is to develop a formula to measure the incremental impact of the data solution on each of these KPIs. This formula should take into account the baseline of these KPIs before (or without) the implementation of the data solution and compare it to the performance of these KPIs after (or with) the implementation of this solution, while accounting for other factors that may have contributed to the change. Once the incremental impact on each KPI has been calculated, it should be translated into financial terms, such as reduced costs or increased revenues. Finally, it's always recommended to automate the business impact measurement of data solutions to ensure unbiased and timely measurement. (A minimal sketch of such a calculation follows this section.)

"Data acceleration projects have been surging in the MENA region in recent years, as organizations embrace the power of data for business growth. While certain challenges persist, such as maintaining data quality, especially with legacy systems, organizations are actively seeking solutions to overcome these obstacles. Building the right data capabilities within business teams and the right operating model is the single most important way to ensure the successful implementation and adoption of data solutions and the realization of tangible business impact."
Karim Hayek, Data Consulting Senior Manager, ARTEFACT
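As referenced in tactic 10, below is a minimal sketch of an incremental-impact calculation for a single KPI. The figures, the adjustment term and the margin are hypothetical illustrations, not Artefact's methodology.

```python
# Hypothetical incremental-impact calculation for one KPI (illustrative only).

def incremental_impact(kpi_after: float,
                       baseline: float,
                       external_adjustment: float = 0.0) -> float:
    """Uplift attributable to the data solution: observed KPI minus the
    counterfactual baseline, net of other known factors."""
    return kpi_after - baseline - external_adjustment

# Example: better forecasts reduced stockouts, so more units were sold.
extra_units = incremental_impact(
    kpi_after=12_400,        # units sold after deployment
    baseline=11_800,         # expected units without the solution
    external_adjustment=150, # e.g. a promotion running the same week
)

margin_per_unit = 2.30  # translate the uplift into financial terms
print(f"Incremental impact: {extra_units:.0f} units, "
      f"~EUR {extra_units * margin_per_unit:,.0f} in margin")
```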

From idea to implementation: becoming an AI factory

Alexandre Thion de la Chaume
Managing Partner
Data Factory - Industries
ARTEFACT

Formulating a coherent AI strategy, and deploying value-adding and efficient use cases, is a struggle for many businesses. Alexandre Thion de la Chaume, Partner, Data Consulting at Artefact, explains how these processes can be streamlined through the AI Factory model.

Artificial Intelligence (AI) is seen as the major lever of competitive advantage. The data doesn't lie: there's been an almost 25% year-on-year increase in business use of AI, with 63% of executives agreeing it has led to revenue increases. The global pandemic has only put this into sharper focus. The businesses that thrive and survive will be those able to adopt the right AI solutions and deploy and scale them quickly and efficiently.

Yet, as with all game-changers, AI initiatives raise new challenges. Implementation comes with many questions – chief among them, how can you adopt the right data approach to deploy AI initiatives rapidly and efficiently, without failure and sustainably over the long term? The 'AI Factory' approach has been developed for precisely this reason.

The AI Factory is an organisational operating model – combining different talents, capabilities and processes in a systematised way – to deliver success in AI deployment and scalability. It has been effectively used by industry leaders like Carrefour and ENGIE to deliver transformative AI projects across their businesses. Yet setting up an effective AI Factory from scratch can be daunting. You need expert teams and a clear vision to make the process work.

Planning makes perfect

The vital first step is to define a vision and use cases for your AI Factory. This will be your data strategy. Use cases offering the highest business potential of transforming the company must be identified. Whether it's supply chain optimisation or compliance management, opportunities exist at all levels.

The company's AI vision should also be considered. It's important to have the ability to imagine how it could develop, to plan for it and to reach a clear-sighted idea of the future. From a preliminary overall view, draw a refined version applicable to data and AI.
Next, concrete business opportunities must be assessed, through the identification and sorting of use cases. This is done by assessing business impact and implementation complexity. A focus on mindsets is important throughout, to manage change on a large scale and involve everyone, from company leadership to front-line team members.

The four pillars of the AI Factory

Once the company's data strategy and AI vision are defined, you should have a prioritised list of use cases to implement. But how can you start working on them? An effective AI Factory implementation is founded on four distinct pillars:

ONE SINGLE GOVERNANCE

To be efficient, governance must be high-level, dedicated and tailored. A highest-instance AI Factory Board – comprising key C-suite data leaders – is extremely important in providing overall sponsorship and direction, as it shares the AI vision and aligns teams and the roadmap with it. At the programme management level, an AI Factory Director role should be established, involving business, operations, legal, security and IT data experts. Their role should be to review, arbitrate and validate progress.

Finally, at the operational level, there need to be agile teams. Feature Teams are responsible for the delivery of use cases with AI products. They're close-knit units working collaboratively to ensure permanent information flow and transparency. Most importantly, they should be multidisciplinary, combining skills and expertise from across the business. They are achievement-oriented, each one created with a single objective: to deliver one use case measured by a unique goal.

ORGANISED, DIVERSE AND EXPERT TEAMS

To drive efficiencies, structured organisations should gather business, data, software and digital tech skills in hybrid teams based on agile methods.
Agility ensures a flexible and adaptive way of working and avoids issues linked to a silo approach, such as isolated departments within the same structure or overly rigid procedures. This requires a good blend of business and technical profiles, to ensure that what is developed on the technical side always has a useful purpose that addresses business needs.

Scalability is an important overall characteristic of a team's makeup. The idea is that its structure can be easily duplicated, like Lego bricks. With a fully scalable model, more teams can be added to address additional use cases.

ADVANCED AI TECHNOLOGIES

Of course, effective AI deployment needs a foundation of AI-enabling technologies. An AI Factory uses a combination of open-source, proprietary and cloud solutions. They should be standardised across the whole data pipeline – from ingestion to visualisation, from beginning to end – according to best practices.

SYSTEMATIC & PROVEN METHODOLOGIES

Systematisation is needed to make sure a series of steps are always taken in a specific order, each with its own defined objective. The benefits are twofold. First, this gives an overall structure of common references throughout, creating a backbone that guarantees consistency. Second, this makes methodologies replicable and scalable, considerably accelerating the deployment of the industrialisation phase.

MLOPS: KEEPING THE FACTORY RUNNING

Alongside a set use case methodology, MLOps (Machine Learning Operations) practices must be deployed to close the gap between the concept phase and production. Inspired by the DevOps process, this should combine software development and IT operations to shorten the development life cycle.

The purpose of MLOps is to tackle challenges that traditional coded systems do not have. The first challenge is collaboration between teams: different units are often siloed and own different parts of the process. This stifles the unity needed to go into production.

The second is pipeline management, as ML pipelines are more complex than traditional ones. They have specific characteristics, including bricks that must be tested and monitored throughout production.

The final obstacle is that ML models usually need several iterations – when put into production in a manual, ad-hoc way, they become rigid and difficult to update.

Instead, an MLOps approach should embed all ML assets in a Continuous Integration and Continuous Delivery (CI/CD) pipeline to secure fast and seamless rollouts. All data, features and models should be tested before every new release to prevent quality or performance drift. All stakeholders should work on the same canvas and apply software engineering best practices to data science projects – versioning, deployment environments, testing. (A minimal sketch of such a release gate appears after this article.)

Ultimately, MLOps is the discipline of consistently managing ML projects in a way that's unified with all other production elements. It secures an efficient technical delivery from use case early stage (first models) to use case industrialisation.

A FRAMEWORK FOR SUCCESS

AI holds tremendous promise, but also great risk for organisations unable to deploy it properly. The real benefit of the AI Factory model is that it establishes a core framework for swift and successful implementation. Processes, teams and tools are transferable and repeatable by nature, meaning a company can remain agile in pursuing its AI vision. Once the process is established and supported by MLOps, a business has what it needs to become an AI powerhouse.
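As one possible shape for the release gate described in the MLOps section, a small test suite run by the CI/CD pipeline can block deployment when a candidate model regresses against the version in production. The file names, metrics and thresholds are assumptions for illustration, not a prescribed toolchain.

```python
# Minimal CI/CD quality-gate sketch (illustrative; names and thresholds assumed).
# Executed by the pipeline (e.g. with pytest) before every release: a failing
# assertion blocks deployment and prevents quality or performance drift.
import json

MAX_ACCURACY_DROP = 0.01  # tolerate at most a one-point drop vs production

def load_metrics(path: str) -> dict:
    with open(path) as f:
        return json.load(f)

def test_candidate_does_not_regress():
    prod = load_metrics("metrics/production.json")   # assumed artifacts
    cand = load_metrics("metrics/candidate.json")
    assert cand["accuracy"] >= prod["accuracy"] - MAX_ACCURACY_DROP
    assert cand["latency_ms_p95"] <= prod["latency_ms_p95"] * 1.2

def test_input_schema_is_stable():
    # Features the model was trained on must still be produced upstream.
    expected = {"day_of_week", "is_promo", "lag_7"}
    assert expected <= set(load_metrics("metrics/candidate.json")["features"])
```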
CASE STUDY

CARREFOUR
Using AI to drive value in store based on the AI Factory's operating model of Artefact

CHALLENGES

AI as a corporate strategy.

AI offers incredible opportunities in the retail space. Global retailer Carrefour is going through a digital transformation and has partnered with Google and Artefact to leverage the power of AI and capture value in several departments: assortment, pricing, supply chain, store operations, ecommerce, and marketing.

"We aim to build Artificial Intelligence and Machine Learning solutions to better serve our customers and employees"
Elina Ashkinazi-Ildis — Director, Carrefour-Google Data Lab

Carrefour's ambition is to sift through its vast trove of data (4 billion annual transactions, 1 million daily visits to digital platforms) to identify unaddressed issues, define use cases, scale AI solutions, spread the adoption of AI within the company and conduct training and upskilling.

"We are really trying to inject innovation, agility, extra-collaboration"
Amélie Oudéa-Castéra — Head of E-Commerce, Data and Digital, Carrefour

Carrefour chose to set up a multidisciplinary hub of internal and external data experts.

Key figure: 40% additional revenue — achieved on some single items in some stores (see Results).
SOLUTION

AI Factory by Artefact: a robust framework that turns AI technology into valuable AI projects and solid competitive advantages.

Artefact devised an operating model through an agile methodology composed of several steps: Structure, Discovery, Minimum Viable Model (MVM), Prototype, Scale and Optimization.

"The key is to carefully select the right use cases that bring value to the business."
Vincent Luciani — Co-CEO, Artefact

Carrefour selected a dedicated AI lead and worked with its partners Google and Artefact to establish guidelines. Artefact also assembled Feature Teams, each working on a single unique key performance indicator. They are made of a business owner, an AI product owner, a data engineer and a data scientist.

"When a challenge is huge, our ambition is to break it into many sub-problems that we will solve one after the other"
Vincent Luciani — Co-CEO, Artefact

Use cases were developed in several departments, such as assortment optimization, dynamic pricing, relevant promotions, sales prediction, inventory management, out-of-stock prevention, fraud optimization, customization of marketing, churn reduction, and algorithmic product recommendation.

"Our rule is the golden KPI: to define a numbered objective that is very concrete. For example, regarding supply chain, our KPI is forecast accuracy."
Fabrice Henry — Managing Partner, Data Consulting, Artefact

Relying on a gradual approach to gain speed and scale while enabling innovation is key to turn experimentation into innovation.

RESULTS

Making data-driven decisions and unlocking value at scale.

Artefact's initial experiments proved successful and are being scaled through the organization and deployed across different product categories, store formats, and countries.

Carrefour developed an assortment recommendation tool that helped the chain support a more personalized selection at the store level, giving store directors the autonomy to influence inventory needs. Some stores saw up to 40% additional revenue on some single items.

"Carrefour needed to ensure it had the right products, in front of the right shoppers, at the right store location"
Stephane Spinella — Retail Director, Google Cloud

The models made to optimize operations with precision across the supply chain were able to detect stockouts in just an hour, when it used to take two days, and to accurately predict the volume of curbside pickup sales down to the half-hour, allowing store managers to staff their operations teams accordingly.

"We are redefining the ways of shopping, developing a truly omnichannel value proposition for our customers … Across the continent, there is value creation in our core business"
Amélie Oudéa-Castéra — Head of E-Commerce, Data and Digital, Carrefour

AI factories are a combination of talented individuals, methods, and technologies in the service of brands looking for scalable operational efficiencies and business successes.
CASE STUDY

HEINEKEN
Using the Data Factory methodology as
a Revenue Generation Center
CHALLENGES

HEINEKEN Brazil had an ambitious challenge: add business


value through the use of data and advanced analytics.
So, Artefact joined us to accelerate, and make this happen.

SOLUTION

Rafael Melo – partner phase and its solution, through data mapping,
ARTEFACT collection, and exploration, creation of machine
We implement our Data learning model and a final product to activate
Factory methodology, which this model.
are hybrid teams composed Finally, we test the solution and industrialize this
of business experts, data product for larger scopes. For this, we always rely
scientists and engineers, to on agile principles: We start with a reduced scope,
deliver a product that is quickly actionable. to quickly show business value to stakeholders,
The teams are a mix of people from Artefact and and developing the solution incrementally.
HEINEKEN, to help in the Data Driven acculturation In this partnership with HEINEKEN we created
which was one of HEINEKEN objectives. data products in practically the entire value chain:
Artefact delivers data products from start to Such as finance, HR; production; distribution &
finish, from the business problem prioritization logistics; marketing and trade, as well as sales
and e-commerce.

Daniel Guimarães
Logistics & Planning Manager
HEINEKEN BRAZIL
We had a challenge in the area of planning and logistics related to allocating products in distribution centers and making short-term decisions. The challenge was both extracting the information and creating the intelligence to generate the insights needed daily.
In this way, we developed a stockout prediction model, which consists of the complete automation of data, modeling, and the creation of a dashboard which generates the necessary insights for our decision making. We started with a few products and a few distribution centers, but quickly saw the value of the solution and scaled it to the rest. Today, this model is one of the main decision-making tools in the area.

Camila Moreno
Data Scientist - ARTEFACT
HEINEKEN invests heavily in the automation of its factories, and one of the projects with Artefact was to use sensor data, like temperature, pressure and volume, to create a machine learning model that makes adjustments to production, even during the process, ensuring quality metrics such as the color of the beer. The interesting thing about this type of project is that the automation and financial gains can be easily scaled to other breweries.

RESULTS

Fábio Criniti
Data & Analytics Director
HEINEKEN BRAZIL
The biggest benefit of this
partnership with Artefact is the
speed at which we are able to
deliver value to the business,
and build a revenue generation center for HEINEKEN.
Hybrid teams are very good at connecting the
problem with a data solution. For us this is very
important, as we were able to prove value and
consequently invest more in innovative projects
like these.

Data Governance, a prerequisite for AI project success

Justine Nerce
Managing Partner
ARTEFACT

Data and its applications are being increasingly integrated into business activities. They’re at the heart of the search to improve productivity and overall efficiency. Through specific processes and an adapted organizational structure, data governance enables companies to organize data, enhance its quality and meet the ethical and regulatory challenges of data processing. An interview by the Hub Institute with Justine Nerce, Partner at Artefact.

What are the challenges of data governance today?

The amount of data and the number of use cases around data is constantly increasing. First, companies have to deal with the challenge of getting the most possible value out of their data and democratizing it. Good quality, well-documented data should be accessible to the end user.

All of this applies within the framework of ethics and data protection. Data governance is becoming essential to ensure compliance with certain privacy laws. In Europe, the General Data Protection Regulation (GDPR) is in effect and is tending towards becoming the global standard. This means that the organization must first be able to demonstrate that it knows what data is flowing through its infrastructure. It must be fully transparent about what data it is collecting from its users and be able to delete all data linked to any individual immediately.

Second, migration to the cloud is essential. Three main uses are emerging: Business Intelligence, Artificial Intelligence, and data exploration. By structuring data based on data governance, businesses will be able to offer these three types of uses as a service.

These data products constitute a common, cross-disciplinary good, which requires a dedicated team. These products must be of high quality, but also visible and usable by all. The challenge for companies is data democratization. These data products must also be secure and protected to comply with various regulatory and ethical issues.

How does Artefact support companies in implementing data governance?

At Artefact, we act as a consulting firm. We support all our clients throughout
data governance implementation, from strategy to deployment. First, we perform an audit to see where they stand, then define a roadmap to identify areas to work on. Finally, we build a data asset structure into data products and help them choose the technical tools they need.

In our consulting approach, we insist on the importance of data as a vector of value for the company, then we work on deployment, quality tool selection and documentation of governance to give substance to the strategy and make it feasible.

We’ve also set up our own Artefact School of Data, which lets us train data stewards and data owners, essential roles in the implementation of data governance for businesses. Along with this professional training, we also intervene directly in companies to acculturate them to the need for advanced and supported data governance in order to succeed in their AI projects.

What is unique about Artefact’s global vision?

Our strength is that we propose a global data governance model, focusing on end-use cases first. We position data governance as an “asset” of this transformation. We’re able to transcribe use cases into tangible value and be part of a global transformation program.

We also have multidisciplinary experts. There are about 20 of us in France who specialize in data governance, with profiles from different backgrounds: data product owners who model products, data stewards who document and improve quality, but also data engineers and data analysts.

We also have an ecosystem of technology partners with whom we collaborate in an agnostic way. We’re proficient in all the new tools that appear on the market. We have both technical and strategic DNA, and are able to link all of these subjects together to treat them in a holistic and comprehensive way and deploy them to many clients.

Have you got a concrete example of support that you’ve provided?

We assisted one of our major clients with very extensive data assets in their data transformation. The project concerned a redesign of their data governance. When we arrived in mid-2017, we saw that their governance had been approached from a too-technical and not sufficiently “business” perspective. This resulted in a lack of adoption of the necessary tools. To correct this, we linked their governance to their strategic use cases. To do so, we documented the use cases, democratized their access, and improved data quality to ensure good results. The first pilots were a success! We then faced the challenge of scaling up.

In 2020, we assisted this same company in launching a program to accelerate Artificial Intelligence programs and migration to the Google Cloud Platform (GCP). Governance had been positioned as a strategic asset of their transformation and this launch was performed in two stages:

• Structuring their “Data Governance Office” and setting up an operating model with data stewards, data custodians, etc.

• Structuring their data assets into large “business domains”, with the choice of tools to operate, etc.

We’re now entering a third phase of industrialization and extension of this AI program. As part of the migration to the cloud, we’re analyzing how we can structure, rationalize and pool our data assets. At the moment, we’ve moved on to the second stage, which consists of structuring our data assets according to these major business families. Next, we’re going to start thinking about the development of tomorrow’s data products, which will serve different categories of use cases.

What can we expect in the future, once everyone has implemented their data governance?

The availability of data will allow the implementation of even more use cases, particularly in the area of Artificial Intelligence. This will accelerate value creation within organizations. It will also allow us to support all the issues surrounding data democratization and decentralization, especially in terms of bringing data closer to the business. Artefact’s mission is to create this bridge between data and business, and we carry it out on a daily basis with our clients. If the data is well structured and clean, if the products are available, and if we have the push-button tools to manipulate them, theoretically in five years, everyone will be able to use data in their daily work!

The vital role data governance
plays in achieving sustainability
goals
In this article, we will present how to define sustainability goals and how to include them in data governance strategies.

Manuela Mesa
Director
ARTEFACT

Natacha Zouein
Senior Data Consultant
ARTEFACT

Pauline Billerot
Data Consultant
ARTEFACT

Sustainability is a key focus for today’s organisations, and with consumers’ purchase decisions increasingly based on ‘green’ credentials, it can be a critical element in remaining competitive. Businesses are starting to improve their sustainable practices by addressing the products and services they provide, the processes they use, the waste they generate as a by-product, and the supply chain that facilitates their operations. But while 90% of executives believe that sustainability is essential, only 60% of organisations have sustainability strategies in place.

In the data-driven world, companies have a wide range of effective tools at their disposal that can turn data into value to accelerate the implementation of a sustainability strategy. They can collect and examine data on a wide range of sustainability-related issues – from energy use to carbon emissions – to reveal key insights that drive initiatives. In addition to enabling green capabilities, analysis shows that, on average, every dollar invested in data results in $32 in economic benefit. In other words, ensuring that data is accurate and reliable is essential for organisations.

Focusing on creating, maintaining and securing high-quality data is key, but equally important is ensuring that this data is accessible for the analysis that enables data-driven decision making. Consequently, it is crucial to adopt strong data governance – a set of processes and policies that can be implemented to ensure data is reliable and trustworthy.


Defining sustainability goals: a challenging road for companies

In 2015, the United Nations presented its 17 Sustainable Development Goals (SDGs) as the blueprint to achieve a better and more sustainable future for all; it expects companies to have established sustainability strategies and implementations in response to them by 2030.

SDG-oriented business models have the potential to create significant market opportunities. In 2019, McKinsey estimated that global sustainable investment had exceeded $30 trillion, a tenfold increase since 2004.

Failure to define SDG goals and apply them to their business model puts companies at risk in three core areas:

1. Financial: Companies can face enormous costs due to environmental risks that affect their supply chain. For example, Unilever estimated an annual loss of €300 million due to climate change endangering agricultural productivity; the company is currently working on a pilot project (using SAP’s GreenToken supply chain transparency technology) to further increase traceability and transparency of its global palm oil supply chain.

2. Legal, compliance and risk management: Different countries have different regulations, which may lead to confusion and even risk. In the UK, various laws and frameworks require organisations to be transparent in areas such as diversity, equal pay, carbon emissions and modern slavery. The Competition and Markets Authority (CMA) guidelines, released in January 2021, help businesses understand the rules that apply to their operations and how to achieve sustainability goals without breaching competition law.

3. Customer trust: Today’s consumers are actively choosing brands based on their ethical behaviour and their initiatives linked to sustainability and climate change – although 48% of UK adults say they do not trust the information companies provide around sustainable products, indicating a risk factor for the relevant companies. In addition, 36% of people in the UK believe further regulation to make companies improve sustainable lifestyle choices for consumers should be introduced.

However, firms are facing challenges in defining and implementing their sustainability goals. One recurring obstacle is ensuring that the adoption of a sustainable strategy will not impact their profitability. Businesses need a quick return on investment, and a company must be profitable to be sustainable. At the same time, the work of measuring ESG scores may prevent some executives from fully investing in sustainability initiatives, as 63% of CEOs struggle to measure ESG across the value chain, representing a barrier to sustainability in their industry.

Companies need to identify relevant KPIs to create valuable sustainable insights. By measuring these KPIs, companies will have opportunities to achieve their ESG goals, such as carbon footprint reduction, energy consumption, waste and pollution tracking (i.e. within the supply chain), and social impact. But to accurately measure these KPIs, organisations must be able to rely on trusted data to create tangible results, and accessibility to the relevant data can be hard to gain. For example, if greenhouse gas (GHG) emissions reduction is identified as a KPI, sustainability teams will need to access hard-to-get financial data, such as travel mileage, and combine it with human resources data to calculate the GHG emissions of individual employees.
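To make the mileage example concrete, here is a minimal Python sketch of what such a per-employee calculation might look like. The column names and emission factors are illustrative assumptions, not figures from this article:

import pandas as pd

# Hypothetical extracts: travel mileage from expense systems and an HR roster.
travel = pd.DataFrame({
    "employee_id": [1, 1, 2],
    "km_travelled": [120.0, 340.0, 80.0],
    "mode": ["car", "plane", "car"],
})
hr = pd.DataFrame({"employee_id": [1, 2], "department": ["Sales", "Finance"]})

# Assumed emission factors in kg CO2e per km, for illustration only.
FACTORS = {"car": 0.17, "plane": 0.25}
travel["kg_co2e"] = travel["km_travelled"] * travel["mode"].map(FACTORS)

# Per-employee GHG estimate, enriched with HR data for reporting by department.
report = (travel.groupby("employee_id", as_index=False)["kg_co2e"].sum()
                .merge(hr, on="employee_id"))
print(report)

The hard part in practice is rarely the arithmetic; it is the governance work of making the travel and HR sources reliable and joinable in the first place.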

Sustainability strategies and goals are crucial for companies, and if reliable data isn’t available and accessible, their societal, environmental and legal requirements won’t be met. Companies cannot implement sustainability strategies without data governance that offers transparent and valuable data for better data-driven decisions.
Data governance: what it
is and why every company
needs it

Data governance is the approach companies take to set standards and policies on how data is ingested, processed and used in a way that makes it secure, available, accurate and usable. It includes aligning the people, processes and technologies needed to support those standards. Putting a data governance policy in place provides businesses with a formal strategy with which to access, monitor and use data to support employees and business units. It highlights data’s role as a valuable asset that is essential to respond to strategic needs and enable data-driven decision-making, resulting in the following benefits:

• BETTER DATA QUALITY: Accurate and reliable data provides companies with a tangible business asset. Using clean data brings business processes across the company into line with each other; this compatibility results in reliable performance measurement and dependable KPIs.

• COST AND TIME SAVINGS: By applying greater data management discipline through better visibility and standardisation of processes, companies can redeploy 35% of their data spend. Moreover, reliable and accessible data saves time by reducing manual tasks.

• BREAKING DOWN SILOS: Avoiding data duplication, outdated or incorrect information and silos (collections of data that are isolated across different business groups) reduces storage costs and, most importantly, increases operational efficiency. (Artefact’s experience shows that data scientists can spend more than 30% of their time on understanding and accessing data.)

• COMPLIANCE: A data governance framework provides companies with data security and enables them to meet compliance regulations (such as GDPR) and stay on top of their legal obligations; data governance is designed to help companies operate more efficiently.

Implementing a data governance strategy

There is no such thing as a one-size-fits-all approach to data governance. Each strategy is unique to the organisation it serves and requires a different solution. To best define the optimum data governance for the company in question, a framework should be followed; based on its long experience with clients, Artefact proposes the following approach:

1. Vision and business requirements: Define the company’s business priorities and objectives, as well as its vision for data strategy in the short, medium and long term.

2. Data infrastructure: Identify where the organisation’s data currently sits, whether the infrastructure is designed to facilitate operations, and whether data is constantly updated.

3. Current data governance: Establish whether the company’s data is supervised, stored securely and easily accessible, and if it is uniform across the organisation.

4. Data applications: Define what is required to achieve the organisation’s vision. Solid application development processes are essential (development, prototyping, industrialisation), and all applications need to be linked to business objectives and add value across the company.

5. Monitoring and evaluation: Ensure the continuous checking of the data objectives through clear KPIs and targets.

6. People and processes: Put the right processes in place to implement the data strategy.

7. Tools and capabilities: Ensure that the right tools are used, and up to date, to facilitate data processes and enable the required changes.

Organisations need to set up a data governance programme, which should involve structuring data governance assets (definition of the operating model, tooling and roadmap of the data governance initiative) and the deployment of data governance within each domain (data quality, standardisation and accessibility).
Once companies have set up solid data governance and defined their sustainability goals, the next step is to identify how to leverage the first to achieve the second.

Using data governance to achieve sustainability goals

Advanced technologies can use data to uncover deep insights, opening a world of innovative ways to support sustainable practices across the enterprise. Artefact has worked with several organisations to build data/AI products and strategies, all of which are based on strong data governance, that integrate with business processes to tackle energy and environmental issues.

ENERGY CONSUMPTION REDUCTION:
Artefact supported a leading European telecommunications provider to address an environmental initiative to decommission copper across its network by 2030. The sustainability goals were to deploy a copper shutdown programme, minimise the energy consumption of copper while it was still being used, and quantify the risk levels of the project using AI. A tool was created to prioritise geographical areas for the work to be carried out and optimise costs for the initiative. As a result, the company estimated that it could save €1.15M in costs, 1.65 GWh in energy consumption, and 111 tonnes of eCO2 on average per year between 2025 and 2030.

INCLUDING SUSTAINABILITY USE CASES IN DATA STRATEGIES:
Companies are incorporating sustainability objectives within their data strategies, such as gathering energy and utility data from their facilities to work towards carbon emission reduction. For example, Artefact helped two leading property management companies in the United Arab Emirates (UAE) to understand their data and define clear sustainability objectives to achieve their goals in this area. The projects looked specifically at facility, utility and energy waste. One company partnered with third-party providers to implement AI and smart technology across shopping malls to track air conditioning and electricity consumption and pinpoint how to reduce usage and save money. Additionally, dashboards to monitor and manage utility and energy waste were created, helping one of the companies with its main goal of identifying where solar panels could be installed to save costs in shopping malls and residential properties. Identifying sustainable objectives within their data strategies provided both companies with significant benefits; one forecasted 6% revenue growth over the next eight years.

WASTE REDUCTION:
French retail giant Carrefour had issues with stock availability/shortages and shrinkage in its bakery department due to a limited ability to predict consumption on any given day. Artefact provided Carrefour with a forecasting model that generates daily predictions for each store and product line. Integrated into the current tool as enriched information, it provides each store manager with reliable predictions so that daily production of baked goods can be adjusted accordingly. The project let Carrefour reduce waste on fresh bakery goods by 12%.

Additionally, Artefact developed a tool allowing Carrefour to measure and model the carbon emissions of its e-commerce sales in 2021, from click to delivery. It was able to measure 100% of emissions in four weeks and design a dashboard for simulation and monitoring.

SUSTAINABILITY STRATEGIES NEED STRONG DATA GOVERNANCE:
Data is a vital lever for achieving sustainability goals, but it needs proactive management if organisations are to accurately measure their impact in this area.

Structured data governance should therefore be an integral part of any sustainability strategy; once in place, companies will be able to lay the foundations for transparent and accurate decision making and derive real business value.
CASE STUDY

PIERRE & VACANCES CENTER PARCS
How data governance and data
quality can boost digital marketing
and activation performance
CHALLENGES

Improving data quality across the entire enterprise.


PVCP needed to be able to exploit all the data available in their ecosystem. To do so, they had to be able to:

• Improve their data quality to make it more reliable

• Join and structure data so it’s clean and can be shared

• Help businesses democratise data use and exploit this shared data for personalisation to build enhanced customer relationships.

Artefact knew that PVCP was going to need help to recoup revenues lost due to the COVID-19 crisis: a 3% drop in worldwide tourism industry growth was forecast for 2020, while there was a total of 40B€ in estimated lost revenue during the lockdown in the tourism sector in France over the last year. But some regions are expecting to see modest rises in tourism in the coming year.

“An interesting point to highlight about this project is the use of this unique period we’re living through – it’s an ideal time to invest in data quality and data governance – because these are prerequisites and fundamentals we all need if we are to use this period intelligently and anticipate the future,”
added Fabien Cros, Data Consulting Director at Artefact.

To meet the objectives of PVCP, it was necessary to:
• Prioritise Data Quality and Governance above
all other subjects to work efficiently and reliably
in order to have an immediate impact on the
business
• Form a committed SWAT team composed of
experts ready to work hand-in-hand with PVCP
on complex subjects in order to correct current
data quality issues – but also to prevent new
ones
• Create a new Data Steward role with a network
of SPOCs (Single Points of Contact) to reduce
quality problems and produce high-performance
analytics available to all departments.
“The objectives of the data quality project with Artefact
were simply to have a better cohesion in the quality of
the data and to set up processes to help us be more
efficient in the way we deal with the different subjects,”
clarifies Julien SOULARD,
PVCP’s newly-appointed Tracking & Data Collection
Specialist.

SOLUTION

A SWAT team to roll out a Data Governance project.


PVCP and Artefact worked closely for several months on these complex data science projects.

To lay the essential groundwork, each company needed to be able to work in a SWAT team configuration. SWAT is an American military term meaning Special Weapons And Tactics, but in the business world, a SWAT team is composed of experts in various fields who come together to rapidly and efficiently validate new business ideas within an organisation.

The PVCP-Artefact SWAT team:

• Prioritised and ultimately delivered value more quickly

• Mobilised different skills to optimise global knowledge

• Targeted only one value chain at a time for optimal efficiency

• Delivered with a stronger focus on results

The PVCP-Artefact SWAT team was dedicated to generating more high-quality sales leads, optimising the media budget, and improving processes and workflow.

With the team in place, a major Data Quality Project was begun, composed of Data Governance, Tool Creation and Utilisation, and Monitoring, with speed and efficiency as driving factors.

To kick-start the Data Governance facet, a PVCP – Artefact Data Steward pairing was proposed, while the Data Steward and SPOCs were trained on using data quality tools and methodologies.

The Data Steward is responsible for, among other things, defining data quality standards, monitoring KPIs, and holding a weekly update to monitor and investigate all current data issues with the data quality SPOCs.

To address these issues, several tools were created, most importantly the “golden source” – a data quality dashboard which tracks every KPI and signals every anomaly. There’s also a ticketing tool set up which enhances communication when it comes to the way queries are posed.

“Today the teams are 100% autonomous in all their roles,”
affirmed Clara Mendes Sampaio, Artefact’s Senior Data Consultant who paired with Julien in his new role as Data Steward.
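As an illustration of the kind of automated check a “golden source” dashboard can be built on (a Python sketch with assumed table and rule names, not PVCP’s actual implementation):

import pandas as pd

# Hypothetical analytics extract to be validated before it feeds dashboards.
events = pd.DataFrame({
    "session_id": ["a1", "a2", "a2", None],
    "revenue": [59.0, -10.0, 120.0, 30.0],
})

# Each rule counts offending rows; any non-zero count would signal an
# anomaly to investigate (e.g. via a ticket assigned to the Data Steward).
checks = {
    "missing_session_id": int(events["session_id"].isna().sum()),
    "duplicate_session_id": int(events["session_id"].dropna().duplicated().sum()),
    "negative_revenue": int((events["revenue"] < 0).sum()),
}
for rule, failures in checks.items():
    print(rule, "OK" if failures == 0 else f"{failures} anomalies")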

RESULT

Time & money savings,


greater accuracy, happier
employees.
The project rapidly showed appreciable benefits:
• +28% improvement in algorithm accuracy
for improved decision-making
• 44% increase in time saved on data
comprehension, location and accessibility
• +30 employee NPS points added after
implementation of the data governance
frame (shift to higher added-value
activities for people)
The project also improved the data quality on
PVCP’s analysis and piloting tools, thanks to the
implementation of different processes, including
weekly appointments, Mantis tickets and a
dashboard for monitoring various tools.
“Before, our data was a bit like Swiss cheese: full
of holes. […] We’ll continue to work with PVCP
on new projects over the coming year: Google
Analytics 4, GTM Server Side, that sort of thing.
But now, we have a much more sound basis that
will allow us to approach data problems more
efficiently.”
Julien was pleased with the results achieved:
“Thanks to this project, we’ve cleaned up data we
don’t need, and best of all, we’ve saved 15% on
our Google Analytics consumption. I think that
working with Artefact allows us to guarantee our
leadership and competitiveness.”
At PVCP, the outlook is positive.


Data Mesh: Principles, promises and realities of a decentralized data management model

Amine Mokhtari
Customer Engineer, Data Analytics Specialist
GOOGLE CLOUD

Justine Nerce
Managing Partner
ARTEFACT

Killian Gaumont
Senior Consulting Manager
ARTEFACT

On 27 September at the Big Data & AI Paris 2022 Conference, Justine Nerce, Data Consulting Partner at Artefact, and Killian Gaumont, Data Consulting Manager at Artefact, along with Amine Mokhtari, Data Analytics Specialist at Google Cloud, conducted a Data Mesh Workshop. Data mesh is one of the hottest topics in the data industry today. But what is it? What are its business benefits? And above all, how can companies successfully deploy it across their organizations?

Data mesh is a new organizational and technological model for decentralized data management. A distributed architecture approach for managing analytical data, it allows users to easily access and query data where it resides, without first transporting it to a data lake or warehouse. Data mesh is based on four core principles:

• Domain-oriented data ownership,
• Data as a product,
• Federated data governance,
• Self-serve data as a platform.

The workshop was divided into three parts:

1. Business value: Why adopt a product/mesh approach? How does it serve the company’s business objectives?

2. Deployment approach: How to achieve success? What steps should be taken and what organizational model should be used?

3. Technology stack: Why choose
Google as a technology solution?

To kick off the Business value discussion, Justine Nerce explained: “One of the best reasons for adopting a product/mesh approach is that it eliminates two vicious circles. The first is ‘reinventing the wheel’ each time a new use for data emerges: a new team is formed that creates its own data pipeline to serve its specific needs. The result? Zero shareability, zero reusability for the technologies chosen. The second is ‘building a monolith’, when a new use for data ends up in the backlog of a central data team, then gets handed off to non-data specialist teams that carry out massive data collection, generic transformation and use case development, with the risk of not responding to user needs.”

But with a product approach, the vicious circle becomes a virtuous one. When a new use for data emerges, instead of building something new, data mesh seeks out what already exists and can be reused. It identifies domains already in charge of handling given subjects and looks for existing data products that can accelerate the creation and development of new needs, either as they are or in iterative processes to create new, customized products. And all of these products can be published in the company catalog.
How data products create business value

Data products have existed in enterprises for a long time, but in data mesh, the uses and qualifications of data are essentially different, explains Killian Gaumont: “Today’s data product is a combination of data made available to the business for business use and specific features that facilitate the use and the reusability of data”.

To be included in data mesh, a data product must be:

1. Governed by a team of dedicated owners;
2. End-user oriented and widely adopted;
3. Of quality throughout its life cycle;
4. Reusable as is or for building other products;
5. Accessible to all users;
6. Standardized so that everyone speaks the same language.

At Artefact, data products are categorized into three different product families. “There are raw products such as databases used for business processes – which are data products nonetheless”, assures Killian. “Next, there are data products enriched with customized algorithms or product recommendations, such as Interaction 360°. At the top are finished products aligned with use, such as dashboards. These are consumer-line products, designed to create value by linking product development to business strategy.”


Prerequisite #3: Continuously define data domains and scale up as soon as the model has proven its value

1. How to define data domains and associated responsibilities? Start from data usage and existing systems:
• Map systems and processes and the people attached to these systems
• Map business uses around data (BI solutions, applications) and main users
• Establish a first mapping and a common vocabulary

2. How to measure success on a first domain? Measure success around 2 key axes:
• Business use: increase in the average number of users of data products
• Time to insight: time to deploy a new data product or access information made available

3. When and how to scale up the model? The value of a data mesh organization lies by definition in the number of users and in scaling up. Several signs that scaling up is possible:
• The first data products are widely distributed and reusable
• A data domain needs a new data product
• A domain has the resources to build its own feature teams and products

Deploying data mesh across the enterprise

Artefact’s approach to data mesh deployment starts small, by prioritizing the business’s use cases and pain points. All the domains and data products needed for each prioritized business use case (from raw data to finished products) are then identified. A future team is assembled to develop the first products and set standards. Then, related products to be built in the future can be identified.

There are three prerequisites for data mesh deployment. The first: breaking down silos.

“If data mesh is to be a success, we must move towards an organizational model that breaks down the silos between IT, data and business to have platform teams composed of cross domain and cross product teams, across all entities”, says Killian. “It won’t happen overnight, obviously. But we’ve already begun breaking down silos by integrating business teams into IT data teams so that product teams developing data products can work more efficiently.”

The second prerequisite is the Data Product Owner, who plays a key role in coordinating data mesh implementation. The data product owner has three missions: to design, build and promote data products. The first two missions are self-explanatory; the third is equally important, as the strength of a data product lies in the fact that it is adopted and used by the business. “The data product owner is responsible for ensuring that the data product is documented, understandable and accessible to users, and aligned with business needs. The criteria of his success are his KPIs: usage, technical performance, data quality”, adds Killian.

The last prerequisite is that the business be able to clearly and continuously define its data domains and, once the model has proven its value, be capable of scaling up.

These are three of the most frequently asked questions by clients implementing data mesh, along with Artefact’s recommendations for successfully defining domains, measuring success, and knowing when it’s opportune to scale up.

The tech stack: managing data mesh with Google Cloud

“The first thing data and IT teams need to implement data mesh is the ability to make their data discoverable and accessible by publishing it in a data catalog”, begins Amine Mokhtari. “To achieve this, Google has a first pillar, Big Query, which enables the creation of shareable datasets. The second pillar, the catalog itself, is

The deployment approach used by Artefact clients consists of demonstrating the value of the model on an initial perimeter or domain:

1. Prioritization of business use cases and pain point analysis (#businessfirst #startsmall)

2. Identification of domains and data products needed for prioritized business use cases (from raw data to finished product) (#endtoend)

3. Staffing of 1st feature teams, development of 1st products, creation of 1st standards & alignment with tools (#MostViable(Data)Product #continuousimprovement)

4. Within a domain, identification of related products to be built, opening of new domains, adaptation of standards (#interoperability #scale)

made possible by Analytics Hub, which creates links to all the datasets created by various members of the organization or its partners so that subscribers may easily access them.”

“It’s important to understand that only links to data are made – never copies. Thanks to this system, subscribers can use data as if it belongs to them, even though it remains in its original physical location. This remains true even when you have data sets stored in a different cloud”, assures Amine.

User experience is a major principle of the system and is reflected in all aspects of data mesh, not only in facilitating data sharing and data composition, but by keeping data permanently available, no matter how many users are active.

As for data security and governance, Google has it covered with Dataplex, its intelligent data fabric that helps unify distributed data and automate data management and governance across that data to power analytics at scale. Along with an Identity and Access Management (IAM) framework to assign a unique identity to each data consumer, “Dataplex offers companies a set of technical pillars that allow them to carry out any implementation of governance in the simplest way possible”, explains Amine. “At Google Cloud, our aim is to provide you with a serverless data platform that will allow your data teams to focus on areas such as processes and business use cases, where they have added value no one else can produce.”

Google’s Dataplex gives users a 360° view of published data products and their quality.
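From the consumer side, a dataset shared through Analytics Hub appears as a linked dataset that can be queried like any other. As a rough sketch using the BigQuery Python client (the project and table names are hypothetical):

from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client(project="my-consumer-project")  # hypothetical project

# The linked dataset points at the publisher's data: a link, never a copy.
query = """
    SELECT product_id, SUM(quantity) AS units_sold
    FROM `my-consumer-project.shared_sales_domain.orders`
    GROUP BY product_id
    ORDER BY units_sold DESC
    LIMIT 10
"""
for row in client.query(query).result():
    print(row.product_id, row.units_sold)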

Conclusion: three pitfalls to avoid when implementing data mesh

DON’T > Stay stuck in a project vision instead of a product vision
DO > Define priority data products according to different uses

DON’T > Scale up the new model too rapidly
DO > Test the model with a well-defined operating model

DON’T > Deploy an overly complex technical ecosystem
DO > Keep the tech stack small to have as many players as possible.

Why is the “data as a product” concept central to data mesh?

Violaine Berland
Data Consulting Director
ARTEFACT

Killian Gaumont
Senior Consulting Manager
ARTEFACT

The data mesh model has been gaining traction in recent years as a way to approach data management in a more modular and decentralized manner. The idea behind a data mesh is to treat data as a product, rather than a by-product (for each use case), and to build data products that are owned and maintained by specific business domains within an organization.

The data product approach: business advantages

One of the main benefits of this approach is that it allows for greater flexibility and agility in how data is used and accessed. Rather than having a centralized team responsible for the ingestion and management of data pipelines, data mesh enables each data domain to be responsible for their own data and to build data products that are tailored to business needs. Decentralizing data ownership reduces bottlenecks: new data sources can be integrated more quickly, and changes to data can be made more easily in response to changes in the business.

Another benefit of this “product-thinking” philosophy is that it encourages collaboration between domains and thus business units. By treating data as a product, it becomes easier for different teams to share and use the data of others, in ways that are meaningful to them. This can also lead to new insights and opportunities that would not have been possible with a more siloed approach to data management. For example, new data value appears when a product’s sales can be compared with its raw materials composition and pricing evolution.

Implementing data as a product: organizational concerns

However, this new approach does come with a few challenges. One of the most important is how to manage the dependencies between different data products. As each business domain is responsible for their own data, changes made by one domain can have an impact on others. This means clear policies and coordination between domain teams must be established to ensure that data is being used and managed consistently. This is why a central office must coordinate the implementation of “federated governance principles”. Another challenge is ensuring that data is of sufficient quality and trustworthiness to be used as a product. This requires a strong focus on data governance and quality control, as well as regular monitoring and testing of data products to ensure that they are meeting the needs of the business.

Data mesh: a strategic approach for data management

Despite these challenges, the data mesh approach has the potential to bring significant value to organizations. By treating data as a product, and by building data products that are tailored to the specific needs of different business objectives, organizations can unlock new insights and opportunities that would not have been possible with a more siloed approach to data management. With the right governance and quality control in place, the data mesh approach can help organizations to navigate the ever-increasing volume of data and to turn that data into a strategic asset for the business.
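The article prescribes no implementation, but the “product” framing can be made tangible in code. Below is an illustrative Python sketch (all names are hypothetical) of a data product that ships its ownership, documentation and quality checks together with the data itself:

from dataclasses import dataclass, field
from typing import Callable, List
import pandas as pd

@dataclass
class DataProduct:
    """A domain-owned dataset published with its contract, not just its rows."""
    name: str
    domain: str          # owning business domain, e.g. "sales"
    owner: str           # accountable data product owner
    description: str     # documentation for would-be consumers
    data: pd.DataFrame
    quality_checks: List[Callable[[pd.DataFrame], bool]] = field(default_factory=list)

    def is_healthy(self) -> bool:
        # A product stays publishable only while every check passes.
        return all(check(self.data) for check in self.quality_checks)

orders = DataProduct(
    name="orders_daily",
    domain="sales",
    owner="jane.doe@example.com",
    description="One row per order, refreshed daily.",
    data=pd.DataFrame({"order_id": [1, 2], "amount": [50.0, 20.0]}),
    quality_checks=[lambda df: df["order_id"].is_unique,
                    lambda df: bool((df["amount"] >= 0).all())],
)
print(orders.is_healthy())  # True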

AI Industry
Solutions
Demand Forecasting

39 Demand forecasting: Using machine learning


to predict retail sales

41 L’OREAL — Trend Detection Innovating


tomorrow’s products today thanks to AI trend
detection by Artefact

43 Scoring Customer Propensity using Machine


Learning Models on Google Analytics Data


Demand forecasting: Using machine learning to predict retail sales
Jérôme Petit
Managing Partner Retail & eCommerce
ARTEFACT

Pascal Coggia
CEO
ARTEFACT UK

All industries aim to manufacture just the right number of products at the right time, but for retailers this issue is particularly critical as they also need to manage perishable inventory efficiently. Too many items and too few items are both scenarios that are bad for business. (Estimates suggest that poor inventory management costs US retailers close to two billion dollars per year.)

Looking beyond past sales to accurately predict future sales

Massive incremental profit can be unlocked by retailers managing orders and inventory effectively. But as this requires the processing of data for a huge number of stock keeping units (SKUs), which often include perishable goods and items that are ordered daily, it is also a significant challenge.

Retailers used to rely solely on the data from previous years to predict future sales (and therefore manage their inventory), but this method is only useful up to a point. However, machine learning has now evolved to the stage that it can provide accurate predictive models using different signals based on how they influence purchases.

Predicting sales is complex because, in any given period, purchases are affected by many factors: weather, shopping trends, regulation, new products, buying behaviours, a pandemic… And predictions based on previously recorded data don’t factor in specific events, making monthly sales appear evenly distributed when this is unlikely to be the case.

For example, an item that is often out


of stock might cause a slowdown in
the sales of that particular product
or category, but it won’t show in
the monthly reports. Even worse,
poor figures are often regarded as
a mark of buyers’ disinterest, when
the opposite is true; consumers’ over-
purchase of an item has caused it to
sell out.

Or a product missing from the store
might actually be in stock – just
not yet out on the shelves. Big box
retailers often struggle to restock in
real time, so an instantly popular item
might disappear from the shelves
very quickly, and thus not perform
as well as expected, despite it being
available in inventory. This calls for
technology that can help retailers
seamlessly align supply and demand.
Using machine learning and
multiple signals to assess
inventory levels

Machine learning provides a solution to these challenges. Predictive models can forecast sales months in advance by using a number of the signals that affect them (seasonality, consumption trends, price levels, etc). To be as accurate as possible, it’s important that the models use more indicators than the standard day, product and store that it is usual to factor in.

To illustrate this, a retailer might analyse the seasonality to predict sales for the forthcoming period. However, the data will be skewed because using dates is not 100% accurate; a certain date can be a weekday one year, but the weekend the following year, causing sales to vary greatly. Other factors, such as whether that date falls on a holiday (Christmas, Easter, etc) or a major sporting event, also influence consumer buying patterns.
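As an illustration of what “using more indicators than the raw date” can mean in practice, calendar signals might be encoded along these lines in Python (the holiday list and column names are invented for the example):

import pandas as pd

sales = pd.DataFrame({
    "date": pd.to_datetime(["2023-12-24", "2023-12-25", "2024-01-02"]),
    "units": [310, 95, 120],
})
HOLIDAYS = pd.to_datetime(["2023-12-25", "2024-01-01"])

# Encode the calendar as signals instead of raw dates, so that a date is
# comparable across years even when its weekday shifts.
sales["day_of_week"] = sales["date"].dt.dayofweek
sales["is_weekend"] = sales["day_of_week"] >= 5
sales["is_holiday"] = sales["date"].isin(HOLIDAYS)
sales["days_to_next_holiday"] = sales["date"].apply(
    lambda d: min((h - d).days for h in HOLIDAYS if h >= d) if (HOLIDAYS >= d).any() else -1
)
print(sales)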
It is a similar story with price level signals. Promotions at store level can markedly affect the sales of a product from a given category or even make the store as a whole more attractive.

Both these examples illustrate why it is necessary to take many different signals and indicators into account to accurately forecast sales: a task that was a pain in the neck before machine learning and advanced artificial intelligence models made it achievable.

Adopting machine learning for inventory management

The technology is there, but for retailers to use it effectively and make accurate predictions, they need to collect and analyse huge amounts of data. Much of this is in different data sources, and it can be complex to try to process multiple Excel and PDF files that contain previous reports and media plans. Big data tools are needed to process this information into the clean and readable format required to create predictive models that can prevent inventory issues.

Past sales data for a given store may be ‘inaccurate’ due to one-off events (promotions, adverse weather, traffic congestion, etc). To remove this bias, predictive models combine past sales numbers with those of similar stores.

The other big challenge is preventing items being unavailable on the shelves while they are in stock (caused by it being almost impossible for employees to monitor shelves in real time and restock them immediately).

Technology solutions using surveillance cameras and weight sensors do exist but are a huge investment. However, readily available information such as real-time sales at SKU level may be leveraged to detect ‘empty shelf’ situations. Models can analyse the usual flow of sales of an item, so the normal time between two sales of a product in a given store is known. Human intervention can be used to review and resolve statistical anomalies.
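The “time between two sales” idea lends itself to a very small anomaly check. A minimal sketch, with invented timestamps and an assumed alert threshold:

import pandas as pd

# Hypothetical real-time sales log for one SKU in one store.
sales_times = pd.Series(pd.to_datetime([
    "2024-03-01 09:05", "2024-03-01 09:20", "2024-03-01 09:33",
    "2024-03-01 09:47", "2024-03-01 10:02",
])).sort_values()

# Learn the item's normal rhythm from the gaps between consecutive sales.
typical_gap = sales_times.diff().dropna().median()

# Flag a likely empty shelf when the silence since the last sale is far
# longer than usual; the 4x threshold is an assumed tuning choice.
now = pd.Timestamp("2024-03-01 11:15")
silence = now - sales_times.iloc[-1]
if silence > 4 * typical_gap:
    print(f"Possible empty shelf: no sale for {silence}, typical gap {typical_gap}")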
Predictive analysis is but one of the many ways that traditional retailers can benefit from machine learning. They have a lot to gain from relying on advanced technology for better inventory management to increase store revenue. Processing vast amounts of data can also help them to optimise the assortment, offer more attractive and profitable promotions, and set prices more efficiently.

Well-devised tools can undertake complex and time-consuming tasks and quickly deliver accurate reports. This is the real value creation lever of artificial intelligence in retail: freeing managers from tedious multi-source comparison analyses and allowing them to focus on the continued improvement of the customer experience.

CASE STUDY

L’ORÉAL Trend Detection
Innovating tomorrow’s products today thanks to AI trend detection by Artefact

Charles Besson, Global Social Insights & AI Director - L’ORÉAL
Fabrice Henry, Managing Partner - ARTEFACT

Charles Besson, Global Social Insights & AI Director at L’Oréal, and Fabrice
Henry, Managing Partner at Artefact, discuss how L’Oréal Trend Detection,
deployed with Artefact’s AI trend detection solution, is predicting what
cosmetics products consumers are going to want tomorrow.

CHALLENGES

Predicting new trends before the competition.


L’Oréal is the world’s leading beauty company, present in 150 countries, offering a rich portfolio of iconic brands for every type of consumer. The company’s socially responsible programme, Sharing Beauty With All, is dedicated to shaping the future of beauty through major, sustainable product innovations.

“There’s a quest at L’Oréal to constantly reinvent the business, the brands… it’s part of our DNA, it’s an obsession,”
says Charles Besson,
Global Social Insights & AI Director at L’Oréal.

L’Oréal has great tools for product innovation: a Prospective Consumer Intelligence department, a Digital department, an IT department. Their latest programme, the L’Oréal Beauty Tech accelerator, enables the company to select the most promising innovative products for incubation and acceleration. But how to make L’Oréal the world’s number one beauty tech company? Charles wondered: “Given the abundance of public data we can collect, and using this extraordinary thing that is artificial intelligence, could we invent an algorithmic crystal ball capable of predicting the future?”

After consulting with several other companies, he chose Artefact to build his dream machine.

SOLUTION

A co-creation that can forecast emerging consumer trends.

L’Oréal’s ambitious project needed to go a step further with AI technology, in comparison with traditional market research models. That’s where Artefact leveraged its advanced expertise in digital marketing and data science to help L’Oréal detect and predict new trends emerging in the digital space. Because discovering what consumers want – almost before they know they want it – is the Holy Grail every marketer seeks.

Developing an innovative and reliable trend prediction solution was both exciting and challenging for the project team. As soon as they started brainstorming, they realised that tracking influencers wasn’t the answer.

“Sure, when Kim Kardashian wears a new lipstick, everyone starts buying the same colour, but by then it’s already too late. The million-dollar question is what happens before that?”,
explains Fabrice Henry, Managing Partner at Artefact.

So we went deeper, and asked upstream questions to find out where trends originate and how they propagate. Once a trend is born, how does it spread? Does it spread differently according to geography or community? What are the big sources – YouTube, blogs, Instagram, Facebook, etc. – from which data can be extracted in order to train algorithms?

“We found different approaches for each of these subjects and proposed a final one to Charles Besson. And that’s where our collaboration began, where we started this project,” adds Fabrice.

The project co-created by L’Oréal and Artefact was based on three key success factors: co-development of an employee-centric solution, validation of the solution via an MVP (Minimum Viable Product) prior to scaling, and strong collaboration based on trust – a vital element when sharing sensitive information with your partners.

RESULT

This project is a predictive intelligence machine with three main components:

• DETECT: Using Natural Language Processing (NLP) algorithms, this feature can digest a database composed of millions of documents and extract weak signals – keywords that are relevant but rare (e.g. emerging terms) in the beauty domain.

• PREDICT: Once new, atypical, or relevant beauty terms have been detected, we have to see if they have staying power. To find out, we train machine learning algorithms based on predictive variables that have reliably demonstrated whether a given trend is going to grow or not, using factors such as number of mentions, commitment score, co-occurrence of author citations, etc.

• ILLUSTRATE: Building a number of visualisations to demonstrate the power of the trend, along with a variety of contextual elements (brands or authors that were talking about it, articles and visuals that mentioned it…), and letting all of this appear in the tool’s interface.

“I’m really happy with it, we’ve launched a beta version, and if all goes well, the next steps will be the launch, adoption, and training. We’ve already had lots of positive feedback!”
concludes Charles Besson, Global Social Insights & AI Director at L’Oréal.
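The article gives the components but not the code. As a toy sketch of the weak-signal idea behind DETECT, one could compare term frequencies between an older and a more recent corpus to surface rare but fast-growing terms (the data and thresholds here are invented for illustration):

from collections import Counter

old_docs = ["matte lipstick review", "classic red lipstick"]
new_docs = ["glass skin routine", "glass skin serum", "matte lipstick review"]

def term_counts(docs):
    return Counter(word for doc in docs for word in doc.split())

old, new = term_counts(old_docs), term_counts(new_docs)

# A weak signal here: a term that is still rare overall but growing fast.
for term, freq in new.items():
    growth = freq / (old.get(term, 0) + 1)  # +1 smoothing for unseen terms
    if freq <= 3 and growth >= 2:
        print(f"emerging term candidate: {term!r} (new={freq}, old={old.get(term, 0)})")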


Scoring customer propensity using machine learning models on Google Analytics data
Antoine Aubay
Data Science Manager
ARTEFACT

A deep-dive on how we built state-of-the-art custom machine learning models to estimate customer propensity to buy a product using Google Analytics data.

• Propensity modeling can be used to increase the impact of your communication with customers and optimize your advertising budget spending.

• Google Analytics data is a well-structured data source that can easily be transformed into a machine-learning-ready dataset.

• Backtests on historical data and technical metrics can give you a first sense of your model’s performance, while live tests and business metrics will allow you to confirm your model’s impact.

• Our custom machine learning model outperformed existing baselines during live tests in terms of ROAS (Return on advertising spend): +221% vs a rule-based model and +73% vs off-the-shelf machine learning (Google Analytics session quality score).

This article assumes basic fundamentals in machine learning and marketing.

What is propensity modeling?

Propensity modeling is estimating how likely a customer is to perform a given action. There are several actions that can be useful to estimate:

• Purchasing a product
• Churn
• Unsubscription
• etc.

In this article we will focus on estimating the propensity to purchase an item on an e-commerce website.

But why estimate propensity to purchase? Because it allows us to adapt how we want to interact with a customer. For example, suppose we have a very simple propensity model that classifies customers as “Cold”, “Warm” and “Hot” for a given product (“Hot” being customers with the highest chance of buying and “Cold” the least). Based on this classification, you can have a specific targeted response for each class. You might want to have a different marketing approach with a customer that is very close to buying than with one who might not even have heard of your product. Also, if you have a limited media budget, you can focus it on customers that have a high likelihood to buy and not spend too much on the ones that are long shots.

Figure: Simple rule-based propensity model. Has the customer performed an Add to Cart action? If yes, classify as “Hot”. If not, has the customer performed a Product View action? If yes, classify as “Warm”; otherwise, classify as “Cold”.
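In code, this whole baseline fits in a few lines; the Python function below is a direct transcription of the figure:

def rule_based_propensity(has_add_to_cart: bool, has_product_view: bool) -> str:
    # Add to Cart is the stronger signal, so it is checked first.
    if has_add_to_cart:
        return "Hot"
    if has_product_view:
        return "Warm"
    return "Cold"

print(rule_based_propensity(True, True))    # Hot
print(rule_based_propensity(False, True))   # Warm
print(rule_based_propensity(False, False))  # Cold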

This simple type of rule-based classification can give good results and is usually better than not having any, but it has several limitations:

• It is likely not exploiting all the data you have at your disposal, whether it be more precise information on the customer journey on your website or other data sources you may have, like CRM data.

• While it seems obvious that customers classified as “Hot” are more likely to purchase than “Warm”, which are more likely to purchase than “Cold”, this approach does not give us any specific figures on how likely they are to purchase. Do “Warm” customers have a 3% chance to purchase? 5%? 10%?

• Using simple rules, the number of classes you can obtain is limited, which limits how customized your targeted response can be.

To cope with those limitations we can use a more data-driven approach: use machine learning on our data to predict a probability of purchase for each customer.

Understanding Google Analytics data

Google Analytics is an analytics web service that tracks usage data and traffic on websites and applications. Google Analytics data can be easily exported to Big Query (Google Cloud Platform’s fully managed data warehouse service), where it can be accessed via an SQL-like syntax.

Note that the Big Query export table with Google Analytics data is a nested table at session level:

• Sessions are a list of actions a specific customer does within a given timeframe. They start when a customer visits a page and end after 30 minutes of inactivity.

• Each customer can have several sessions.

• Each session can be made of several hits (i.e. events), and each hit can have several attributes or custom metrics (this is why the table is nested; for instance, if you want to look at the data at hit level you will need to flatten the table).

For example, in this query we are only looking at session-level features:

SELECT
  VisitId,
  fullVisitorId,
  totals.hits,
  totals.pageviews,
  totals.timeOnSite,
  device.browser,
  geoNetwork.country,
  device.operatingSystem,
  channelGrouping
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801`
WHERE totals.hits > 1

And in this query we have used an UNNEST function to query the same information at hit level:

SELECT
  VisitId,
  hits.hitNumber,
  hits.page.hostname,
  hits.page.pagePath,
  hits.eventInfo.eventAction
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801`,
  UNNEST(hits) AS hits
WHERE totals.hits > 1

Note that our project was developed on GA360, so if you are using the latest version, GA4, there will be some slight differences in the data model; in particular, the table will be at event level. There are public sample tables of GA360 and GA4 data available on Big Query.

Now that we have access to our raw data source, we need to perform feature engineering before we can feed our table to a machine learning algorithm.

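One possible way to bring the raw export into Python for the next step (a sketch; it assumes the google-cloud-bigquery client library with its pandas extras installed and default GCP credentials):

from google.cloud import bigquery

client = bigquery.Client()

sql = """
    SELECT fullVisitorId, totals.hits, totals.pageviews, totals.timeOnSite
    FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20170801`
    WHERE totals.hits > 1
"""
# to_dataframe() returns the query result as a pandas DataFrame.
sessions = client.query(sql).to_dataframe()
print(sessions.head())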

Crafting the right features

The aim of the feature engineering step is to transform the raw Google Analytics data (extracted from Big Query) into a table ready to be used for Machine Learning.

GA data is very well structured and will require minimal data cleaning steps. However, there is still a lot of information present in the table, much of which is not useful for machine learning or cannot be used as is, so selecting and crafting the right features is important. For this, we built the features that seemed to be the most correlated with buying a product.

We crafted 4 types of features (see the figure “Features used in the model”). Note that we are computing all those features at customer level, which means that we are aggregating information from multiple sessions for each customer (using the fullVisitorId field as a key).

GENERAL FEATURES

These are numerical features that give general information about the customer’s sessions. Note that bounce rate is defined as the percentage of sessions in which the customer visited only one webpage.

It was also important to include information on the recency of events: for instance, a customer that just visited your website is probably more keen to purchase than one that visited it 3 months ago. For more information on this topic you can check the theory on RFM (recency, frequency, monetary value). So we added a feature, Recency since last session = 1 / (number of days since last session), which allows the value to be normalized between 0 and 1.

FAVORITE FEATURES

We also wanted to include some information on the key categorical data available, such as browser or device. Since that information is at session level, there can be several different values for a single customer, so we only take the one that occurs the most per customer (i.e. the favorite). Also, to avoid having categorical features with too high cardinality, we only keep the 5 most common values for each feature and replace all the other values with an “Other” value.

PRODUCT FEATURES

While the first two types of features are definitely useful in helping us answer the question “Is a customer going to buy on my website?”, they are not specific enough if we need to know “Is the customer going to buy a specific product?”. To help answer this question we built product-specific features that only include the product for which we are trying to predict the purchase.

45
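As an illustration, here is a minimal pandas sketch of this recency feature, assuming a sessions_df DataFrame with one row per session and fullVisitorId / session_date columns (the names are ours, not the project's actual code):

import pandas as pd

def recency_feature(sessions_df, snapshot_date):
    # Last session date per customer.
    last_session = sessions_df.groupby("fullVisitorId")["session_date"].max()
    # Days since last session, floored at 1 so the feature is capped at 1.
    days_since = (snapshot_date - last_session).dt.days.clip(lower=1)
    return (1 / days_since).rename("recency_last_session")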
FAVORITE FEATURES

We also wanted to include information on the key categorical data available, such as browser or device. Since that information is at session level, there can be several different values for a single customer, so we only take the one that occurs most often per customer (i.e. the favorite). Also, to avoid categorical features with too high a cardinality, we only keep the five most common values for each feature and replace all other values with an "Other" value.

PRODUCT FEATURES

While the first two types of features are definitely useful in helping us answer the question "Is a customer going to buy on my website?", they are not specific enough if we need to know "Is the customer going to buy a specific product?". To help answer this question, we built product-specific features that only include the product for which we are trying to predict the purchase.

For Recency since last session with at least one interaction with this product, we use the same formula as for session recency in the General Features. However, there can be cases with zero sessions containing at least one interaction with the product, in which case we fill the feature with 0. This makes sense from a business perspective, since our highest possible value is 1 (when the customer had a session since yesterday).

SIMILAR PRODUCT FEATURES

In addition to looking at the customer's interactions with the product for which we are trying to predict the probability of purchase, knowing that the customer interacted with other products of similar function and price range can definitely be useful (i.e. substitute products). For this reason, we added a set of Similar Product features that are identical to the Product features, except that we also include similar products in the variable scope. The similar products for a given product were defined using business inputs.

We now have our feature-engineered dataset on which we can train our machine learning model.

Training the model

Since we want to know whether a customer is going to purchase a specific product or not, this is a binary classification problem.

For our first iteration, we did the following to create our machine learning dataset (which had one row per customer):

• Compute the features using the sessions in a three-month time window for each customer.

• Compute the target using the sessions in a three-week time window subsequent to the feature time window. If there is at least one purchase of the product in that window, the target is equal to 1 (defined as Class 1); otherwise the target is equal to 0 (defined as Class 0).

• Split the data between a train set and a test set using an 80/20 random split.

However, some initial data exploration quickly showed that there was a strong class imbalance issue: the Class 1 / Class 0 ratio was over 1:1000, and we did not have enough Class 1 customers. This can be very problematic for machine learning models.

To cope with these issues, we made several modifications to our approach:

• We switched the target variable from making a purchase to making an add-to-cart. Our model loses a little in terms of business signification, but the increased volume of Class 1 more than compensates.

• We trained the model on several shifting windows, each of three months + three weeks, instead of a single one. In addition to increasing our volume of data, this improves the generalization capacity of the model by training on various times of the year, when customers can have different purchase behaviours. Note that, due to this, the same customer can be present several times in the dataset (at different periods). To avoid data leakage, we make sure each customer is always either in the training set or in the test set.

• We undersampled our Class 0 so that the Class 1 / Class 0 ratio is 1. Undersampling is a good way to deal with the class imbalance issue, compared to other options such as oversampling or SMOTE, because we were already able to increase the volume of Class 1 considerably with the first two changes. Only the training set is rebalanced, since we want the test set to have the same class ratios as the future data we will apply the model to. Note that we tested higher ratios such as 5 or 10, but 1 was optimal in model evaluation.

Using this dataset, we tested several classification models (a linear model, Random Forest and XGBoost), finetuning hyperparameters with a grid search, and ended up selecting an XGBoost model.
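To make the last two points concrete, here is a minimal pandas/sklearn sketch of a leakage-safe, customer-grouped split followed by training-set undersampling, assuming a features_df DataFrame with fullVisitorId and target columns (illustrative names, not the project's actual code):

import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

def train_test_split_by_customer(df, test_size=0.2, seed=42):
    # Keep all rows of a given customer in the same split (no leakage).
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    train_idx, test_idx = next(splitter.split(df, groups=df["fullVisitorId"]))
    return df.iloc[train_idx], df.iloc[test_idx]

def undersample_majority(train_df, target_col="target", seed=42):
    # Downsample Class 0 so the Class 1 / Class 0 ratio is 1 (training set only).
    positives = train_df[train_df[target_col] == 1]
    negatives = train_df[train_df[target_col] == 0].sample(n=len(positives), random_state=seed)
    return pd.concat([positives, negatives]).sample(frac=1, random_state=seed)

train_df, test_df = train_test_split_by_customer(features_df)
train_df = undersample_majority(train_df)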


Evaluating our model

When evaluating a propensity model, there are two main types of evaluation that can be performed:

• Backtest evaluation
• Livetest evaluation

BACKTEST EVALUATION

First, we performed a backtest evaluation: we applied our model to past historical data and checked that it correctly identifies the customers who go on to perform an add-to-cart. Since we are using a binary classifier, the model produces a probability score between 0 and 1 of being Class 1 (add-to-cart).

A first step in evaluating a binary classification model is to create a confusion matrix and compute the precision and recall (or their combined form, the F1 score). However, there are two issues with these simple metrics:

• Some can be hard to interpret because the dataset is imbalanced (for instance, the precision metric will generally be very low because we have so few Class 1 examples).

• They require choosing a probability threshold to discriminate between Class 0 and Class 1.

[Figure: Confusion matrix example for our class-imbalanced problem]

So we decided to use two metrics that are more interpretable:

• PR AUC: the area under the precision/recall curve (see this explanation for more details). Essentially, this metric gives us a global evaluation over every possible threshold. It is well suited to unbalanced datasets where the priority is to maximize precision and recall on the minority class, Class 1 (contrary to its cousin, the ROC AUC).

• Uplift: we sort customers by their probability score and divide the results into 20 ventiles. Uplift is defined as the Class 1 rate in the top 5% divided by the Class 1 rate across the whole dataset. So, for instance, if we have a 21% add-to-cart rate in the top 5% of the dataset versus a 3% add-to-cart rate in the whole dataset, we have an uplift of 7, which means our model is 7 times more effective than a random model.
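As an illustration, here is a minimal sketch of both metrics, assuming y_test holds the true labels and y_pred_proba the predicted Class 1 probabilities (hypothetical variable names):

import numpy as np
from sklearn.metrics import average_precision_score

def top_share_uplift(y_true, y_score, top_fraction=0.05):
    # Class 1 rate in the top 5% of scores divided by the overall Class 1 rate.
    y_true = np.asarray(y_true)
    order = np.argsort(y_score)[::-1]
    top_n = max(1, int(len(y_true) * top_fraction))
    return y_true[order[:top_n]].mean() / y_true.mean()

pr_auc = average_precision_score(y_test, y_pred_proba)  # close proxy for PR AUC
uplift = top_share_uplift(y_test, y_pred_proba)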
Results on those metrics were rather positive; in particular, the uplift was around 13.5.

Backtest evaluation is a risk-free method for a first assessment of a propensity model, but it has several limitations:

• Since it is only done on the past, the model output is not actually being used to impact the media budget strategy.

• With our metrics, we only assessed whether the model was able to correctly identify customers who would make an add-to-cart; we did not assess how identifying those customers would generate a sales uplift.

LIVETEST EVALUATION

So, to get a better idea of our model's business value, we need to perform a livetest evaluation. Here we activate our model and use it to prioritize advertising budget spending.

The results we obtained on the livetest were very solid:

• Compared to a simple rule-based approach for evaluating propensity, our model's ROAS was +221%.

• Furthermore, we also compared our performance to a strong contender in the form of Google's Session Quality Score, a score provided by Google in the Google Analytics dataset; in that case, our model was still at +73% ROAS. This shows how a custom ML approach can bring considerable business value.

Conclusion

In addition to reaching solid performance, a strong side benefit of our approach is that our feature engineering is very generic. Almost none of the feature engineering steps need to be adapted to apply our model to a different country scope or product scope. In fact, following our first success in the livetest, we were able to roll out our model to multiple countries and products in a very efficient manner.

AI Industry Solutions
AI for Call Centre

Powering your call centre with artificial intelligence
MAIF — Using Topic Modelling to reduce contact centre bottlenecks
Using NLP to extract quick and valuable insights from your customers' reviews
HOMESERVE — Using speech analytics to improve customer satisfaction

Powering your call centre with artificial intelligence

Matthieu Myszak
VP Data Consulting
ARTEFACT

In today's competitive business environment, customer experience is becoming a key differentiator for organizations well aware that it is more cost-effective to keep an existing customer than to acquire a new one, and that a disgruntled customer, well handled, can be turned into an advocate for your brand.

Recent advancements in technology and artificial intelligence can be applied to your own customer interactions and provide tremendous support to help your organization profit from productivity gains, improve customer retention and create additional revenue.

Having your customers talk to a robot can seem straight out of a sci-fi novel. But it is already a reality for hundreds of millions of customers who have regular interactions with interfaces powered by artificial intelligence.

Ultimately, the goal of an optimal integration is to have customers talk as they usually do when interacting with a conversational agent that can analyse their sentiment, provide useful information, and answer recurrent standard requests as well as complex issues. The virtual robot can also pass the caller on to a human agent when needed. Beyond the conversational agent itself, machine learning technologies record each interaction to further improve and refine problem-solving protocols.


1 — The main advantages of using artificial intelligence for call centres

What differentiates customer service boils down to the quality of relationships and the overall user experience. In that regard, AI can be a useful tool to help organizations meet previously unmet customer expectations.

An artificial intelligence-powered customer centre is a crucial asset for three reasons:

• Reduce operating costs (via automation and a reduction of average handle time).

• Improve the quality of service and customer satisfaction (via increased reactivity and availability).

• Offer opportunities for cross-selling and upselling customers.

The revolution of conversational technologies has already begun. Today, internet users are becoming more and more familiar with voice commands and chatbot interactions thanks to natural language understanding capabilities (NLP) wired into mainstream digital products such as Google Voice Search, Google Assistant or Google Home.

According to a recent Gartner study, by 2023, 70% of consumers will prefer to interact with a vocal interface rather than a real person, and 40% of all customer interactions will be fully automated.

2 — How does it work

Google Cloud has been developing virtual assistant capabilities for years and has created a product that can be used for business purposes.

Artificial intelligence is not always meant to replace humans; it can also be used to augment real agents. Thanks to the robot, call centre operators can concentrate on complex, higher-value situations and be freed from small, repetitive, low-value tasks. The chatbot is also critical in providing assistance to the operator, because in certain situations maintaining human contact is key.

With the support of artificial intelligence, the customer service agent becomes an "augmented agent": the virtual assistant listens to calls in real time and provides contextual assistance, letting the human stay focused on the conversation and express empathy towards the customer.

3 — Concrete business cases showing tangible benefits in improving customer satisfaction

Businesses of all sorts can benefit from powering their customer service with artificial intelligence, with a positive impact on both sides:

FOR CALL SERVICE OPERATORS

Improve efficiency and focus, reduce churn, and provide opportunities to upsell and cross-sell.

FOR CUSTOMERS

Improve user experience with a customer service available 24/7 with no waiting time, and accelerate retention and customer loyalty.

"If a customer asks for the status of his order, the virtual assistant has to identify the right request and give the correct information."

4 — Seamless integration into your legacy system

For optimal performance, the Google Contact Centre AI must be integrated into the call centre workflow and work with the existing databases, documentation (via APIs) and front desk interfaces.

Organizations need to bring together a multidisciplinary team to carry out this project according to their needs and their own IT architecture.

Before being set up, a chatbot needs to be fed with customer interaction data. The bot is trained by listening to and analyzing past customer interactions, which enables the virtual assistant to provide value immediately with high levels of customer satisfaction. Existing data can be emails, chat messages or voice calls. This data helps train the model according to the customer journey and the expected optimisations.

"If we have to manage use cases for a client from the banking industry, we will not train the AI tool the same way as if it were for an ecommerce brand, for example."

In the event that a business doesn't have any data to analyze or precise use cases to aim for, it is possible to implement a working solution by asking each caller a prompt such as "Could you please tell us the reason for your call?" and then letting them access the traditional customer experience journey. By analyzing the initial answer and the human agents' interactions, the artificial intelligence will be trained to qualify future interactions.

It can take from one to three months to integrate the artificial intelligence solution into an existing customer service, depending on the number and complexity of the use cases and the number of integration points to access.

"Deploying a virtual agent for an insurance company that can automatically file a damage claim is more complex, for example, than asking for the status of your order on an ecommerce website."

5 — Why rely on a partner

Before taking on the task of implementing a virtual assistant solution in your architecture, it can be useful to bring in an experienced partner to assist you in the different steps of the project and help you maximise value.

Artefact has been helping clients in various industries turbocharge their call centres with artificial intelligence. We provide assistance in different ways:

• Identification and prioritization of use cases

• Setup and training of artificial intelligence solutions

• Development of integrations to collect relevant data

Our company has extensive experience working with both partners and service providers from the digital, data and artificial intelligence industries. Artefact's method is centered on feature teams, composed of members with complementary skills, from business analysts to data scientists and software engineers, who can help projects come to life.

"We don't usually think of adding a UX Designer to feature teams, but this role is important, as it helps give a personality to the virtual agent that reveals your unique brand image."

Setting up a bot that is both well embedded into your data architecture and useful to consumers is a goal that can take some effort, but it can be reached when organizations make it a customer experience priority.

Conclusion

An artificial intelligence such as Google Contact Centre AI, integrated into the data architecture of your customer service, will supercharge your customer experience.

The machine learning capabilities and recordings of interactions provide a constant feedback loop that improves the performance of the virtual assistant and expands the use cases that can be addressed.

The right partner for an artificial intelligence project can help organizations smooth out the definition and implementation phases and achieve immediate performance and business gains, while providing useful industry benchmarks and key learnings from previous experiences.

CASE STUDY
MAIF
Using topic modelling to reduce contact centre bottlenecks

CHALLENGES

MAIF is one of France's largest home and automotive insurance companies, with more than 3 million members. One of the challenges facing its customer services team was managing the volume of calls coming into its call centre — on average, some 8 million a year.

With no way of vetting calls before they reached an operator, the team was wasting precious time responding to questions customers could easily find the answers to on the MAIF website.

To improve efficiency, we needed to filter out unnecessary calls and free up more time for MAIF's customer service teams to process more complicated requests.

SOLUTION

To understand why customers were calling MAIF's call centre, we developed Natural Language Processing (NLP) algorithms to analyse transcripts of more than 4 million calls.

We then used topic modelling to categorise every call into one of 35 different request types.

We liaised with MAIF's business teams to identify which questions could be solved online and which needed a human response or presented a sales opportunity. Where calls did not represent an opportunity, we advised how to answer these questions online.

RESULTS

Our analysis showed that 32% of inbound calls were 'low added value requests' — questions that could easily be answered online.

As a result, we built a roadmap advising MAIF how to solve these questions online and direct people to this content to avoid calling.

Digitising these queries has let MAIF's customer services team prioritise cases that require a human touch, improving efficiency and its round-the-clock service.

Using NLP to extract quick and valuable insights from your customers' reviews

Louise Morin
Senior Data Scientist
ARTEFACT

Everyone talks about BERT, GPT-3, XLNet… but did you know that with some simple NLP 101 preprocessing you can already extract valuable insights from your data?

Understanding customers' feedback and knowing what your strengths and weaknesses are is key to any business. Nowadays, companies have access to a lot of information that can give them those insights: website reviews, chat interactions, conversation transcripts, social media comments…

This article explains how you can quickly extract insights from textual data, using consumer reviews as an example. We will present three different approaches:

• Unsupervised data exploration

• Sentiment analysis with feature importance

• Analyzing the correlation between ratings and predefined business themes

(Topic modeling could be a fourth option to go further.)

Please note that the data behind this article was artificially generated to ensure the confidentiality of our initial project.


Customer Reviews Analysis

We are trying to find insights in our product reviews in order to understand their main issues and main strengths. The products are camera devices and accessories, rated from 1 (bad) to 5 (excellent).

We will use three different approaches here to gather insights from our data. The point is to have complementary views:

• Data mining or sentiment analysis is more exploratory: it will find out what matters most, and what could be the main reasons driving a review to be positive or negative.

• Theme impact is used to associate score distributions with already defined business concepts (zoom, battery, …).

Get a global look at the data you have collected

Whenever you're starting a new data project, the first step is always to get the global picture on the data you have (is it imbalanced? is there enough data? are there a lot of missing values?).

HOW MANY REVIEWS DO I HAVE FOR EACH PRODUCT CATEGORY?

[Figure: Number of reviews per product category]

The fact that there are not as many Tripod reviews should be kept in mind if we analyze reviews for this specific category of product. The more data we have, the better, in order to draw unbiased and relevant conclusions.

HOW MANY REVIEWS DO I HAVE FOR EACH RATING?

[Figure: Number of reviews per score]

This is important. We see that our dataset is quite imbalanced: we have a lot more positive reviews than negative reviews. This kind of information needs to be taken into account when training dedicated models (e.g. a classification model for sentiment analysis).

WHAT'S THE RATING DISTRIBUTION OF EACH CATEGORY?

[Figure: Average rating & distribution of each product category]

We can see here that Lenses have the highest average rating, while there are a lot of negative reviews (especially with a score of 1) for Drones and Aerial Imaging.


Using NLP to understand your customers' concerns

Now, to understand what the reviews are about, we will implement the different NLP approaches mentioned previously.

DATA CLEANING

Before doing anything else, we need to clean the text data to make it usable by the different NLP methods (this step is not always required, depending on the algorithms you want to use).

We applied standard pre-processing functions that were relevant to our data (removing HTML, punctuation, phone numbers, …), and we implemented a custom list of stop words that we remove from reviews (for instance, the word "camera" does not bring much information to our analysis).

You can find a lot of these functions in our NLPretext Github repository.

MINING INSIGHTS IN A FEW LINES OF CODE

Now we have, for each review:

• A product category

• The review's original text

• The review's cleaned text

• The review's cleaned text split into tokens

• The product rating

We can start by simply looking at our most frequent words (single words, bi-grams, tri-grams…). It's a simple analysis, but it gives you an immediate vision of what the main topics are for each score and category.

from collections import Counter

import matplotlib.pyplot as plt
import wordcloud

plt.rcParams["figure.figsize"] = [16, 9]


def create_ngrams(token_list, nb_elements):
    """Create n-grams from a list of tokens.

    Parameters
    ----------
    token_list : list
        List of strings.
    nb_elements : int
        Number of elements in the n-gram.

    Returns
    -------
    Generator
        Generator of all n-grams.
    """
    ngrams = zip(*[token_list[index_token:] for index_token in range(nb_elements)])
    return (" ".join(ngram) for ngram in ngrams)


def frequent_words(list_words, ngrams_number=1, number_top_words=10):
    """Count the most frequent n-grams in a list of tokens.

    Parameters
    ----------
    ngrams_number : int
        Number of elements in the n-gram.
    number_top_words : int
        Output list length.

    Returns
    -------
    list
        List of (n-gram, frequency) tuples.
    """
    if ngrams_number == 1:
        pass
    elif ngrams_number >= 2:
        list_words = create_ngrams(list_words, ngrams_number)
    else:
        raise ValueError("number of n-grams should be >= 1")
    counter = Counter(list_words)
    return counter.most_common(number_top_words)


def make_word_cloud(text_or_counter, stop_words=None):
    """Display a word cloud from raw text or a collection of words."""
    if isinstance(text_or_counter, str):
        word_cloud = wordcloud.WordCloud(stopwords=stop_words).generate(text_or_counter)
    else:
        if stop_words is not None:
            text_or_counter = Counter(word for word in text_or_counter if word not in stop_words)
        word_cloud = wordcloud.WordCloud(stopwords=stop_words).generate_from_frequencies(text_or_counter)
    plt.imshow(word_cloud)
    plt.axis("off")
    plt.show()
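For example, with these helpers in place, surfacing the top bi-grams and a word cloud for low-rated camera reviews takes only a couple of lines; reviews_df and its category / score / tokens columns are assumed names for the cleaned dataset described above:

# Flatten the token lists of low-rated camera reviews into one list of words.
low_score_tokens = [
    token
    for tokens in reviews_df.query("category == 'Cameras' and score <= 2")["tokens"]
    for token in tokens
]

print(frequent_words(low_score_tokens, ngrams_number=2, number_top_words=10))
make_word_cloud(Counter(low_score_tokens))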

WordCloud

Leveraging these functions, we can easily display a word cloud of the most frequent words, using reviews for Cameras with a score between 1 and 2, and then a similar word cloud using reviews for Cameras with a score between 4 and 5. We can easily identify the main points brought up in both cases.

For reviews with low scores, there are a lot of mentions of the battery, the device screen, its price, or even mentions of a real bug encountered.

For reviews with high scores, we see that the photo quality, the functionalities and the design are brought up often.

We could do this exercise for each product our company has, in order to see the specificities of each and draw conclusions at a more granular level.

N-grams Count

We can also use the frequent_words function to display the most frequent words, bi-grams or tri-grams.

To go further, you could then put in place a function displaying the reviews associated with a keyword, in order to zoom in on n-grams you find interesting. You could also look at the n-grams with the highest or lowest TF-IDF (easy to compute with the sklearn library), since it lets you identify important words based on a different metric than a simple frequency count.

Sentiment Analysis

Next, we move on to a sentiment analysis approach. Usually, it is used to predict whether a text is positive or negative. In our case, we already have this information (the score between 1 and 5 gives us the sentiment behind the review), but training a model to predict this rating will help us find which words (features) are key for customers.

What we can do is train a sentiment analysis classifier on this data, and then use libraries like SHAP or LIME to understand which features (i.e. words) have the most impact on a review being classified as positive or negative.

CLASSIFIER

To train a classifier, there are a lot of possible algorithms you can use, ranging from the classic sklearn LogisticRegression to ULM-fit models (see this notebook to train a French ULM-fit model, and this article to understand more about ULM-fit) or the Ludwig classifier developed by Uber.

You might want to start with a simple one first, to see if it already answers your needs, before putting in place more complex algorithms. Make sure to take into consideration the fact that your dataset is probably imbalanced (more positive than negative reviews, in our case).

FEATURE IMPORTANCE

Once your classifier is implemented, you can move on to the most important step: getting insights from feature importances.

Applying SHAP to our model (here, a simple sklearn LogisticRegression), we can see that the functionalities, photo quality and zoom features have a really positive impact on our clients' satisfaction, while the flash, memory card or batteries tend to have a really negative impact when mentioned in a review.

[Figure: SHAP summary plot of feature impacts]

Words like "excellent", "perfect" or "bad" were removed from this analysis (before training the classifier), because they would be considered the most important features, while in our case we want to focus on finding insights about our products, not on improving our classifier's performance.

See this notebook for an example of how to use SHAP with a public dataset.
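As a minimal sketch of this step (not our production code), one could fit a TF-IDF + LogisticRegression pipeline and feed it to SHAP; reviews and labels are assumed to be the cleaned texts and their 0/1 sentiment labels:

import shap
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Vectorize the cleaned reviews and fit a simple, imbalance-aware classifier.
vectorizer = TfidfVectorizer(max_features=5000)
X = vectorizer.fit_transform(reviews)
model = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, labels)

# Explain the linear model: which words push a review towards positive/negative.
explainer = shap.LinearExplainer(model, X)
X_sample = X[:1000]
shap_values = explainer.shap_values(X_sample)
shap.summary_plot(
    shap_values,
    X_sample.toarray(),
    feature_names=vectorizer.get_feature_names_out(),
)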


BUSINESS THEMES IMPACT

Our third approach is somewhat different from the previous ones, as it starts from business-related themes chosen by someone knowledgeable about the products.

The point is to analyse how predefined business themes impact product ratings, to understand whether they are a source of strength or an issue to solve.

DETERMINING THEMES

The first step is to classify the reviews into the thematic categories, either by labelling your dataset manually (you could then train a classifier if you want to automatically classify new reviews into themes), or with a rule-based model.

In our case, we used a rule-based model because it can already bring good results at low cost (e.g. if you're curious about your lens quality or your after-sales service, it can be simple to establish rules that determine whether a review mentions those or not).

THEME IMPACT

In a second step, you can compute your global average score, then the average score of reviews mentioning a specific theme. By subtracting the two scores, you can deduce the impact each theme has on your global score.

Here, we should worry about our after-sales service, because it is often mentioned in a negative way (though this could also be because people contacting the after-sales service often had an issue in the first place; this is why you should then look in detail at the reviews mentioning this theme, to really understand why it was brought up).

Here again, business knowledge is essential to make sense of your results.

On the other hand, when our designs or lenses are mentioned, it's often in a review with a high score, which could mean they are among our strengths.
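A minimal sketch of this computation, assuming a reviews_df DataFrame with text and score columns and an illustrative keyword map for the rule-based tagging (both are assumptions, not the project's actual rules):

# Map each business theme to the keywords used by the rule-based tagger.
themes = {
    "after_sales": ["after-sales", "customer service", "support"],
    "lenses": ["lens", "lenses"],
}

global_avg = reviews_df["score"].mean()
for theme, keywords in themes.items():
    mentions = reviews_df["text"].str.contains("|".join(keywords), case=False)
    impact = reviews_df.loc[mentions, "score"].mean() - global_avg
    print(f"{theme}: {impact:+.2f} points vs the global average")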

CASE STUDY
HOMESERVE
Using speech analytics to improve customer satisfaction

"The detection of non-compliance in sales calls use case analysis allowed us to prove that AI can be leveraged to better orient the work of the compliance team."

CHALLENGES

Present in France for 20 years, HomeServe is the world leader in home insurance services, with 8 million customers and over one billion in revenue.

When it comes to home emergencies, the most common channel used by customers is the phone: 9 out of 10 customers prefer it. This particularity places the call centre at the heart of every step of the insurance value chain, from sales to customer service, and ultimately assistance.

Although HomeServe has already developed AI-based conversational solutions and is present on Google Assistant and Amazon Alexa, they wanted to explore new ways in which AI could improve efficiency and customer experience in their existing phone channel. They were especially interested to see what impact speech analytics could have on the vast amounts of unexploited customer data they collected.

SOLUTION

Artefact began by helping HomeServe opt for a "make" over "buy" strategy, as only a proprietary asset tailored to their organisation, combining technology and skills, could meet their many objectives. We also laid out a plan for developing HomeServe's expertise in natural language, data science algorithms, and AI data-treatment technical structures.

Next, Artefact set up a long-term multidisciplinary team with HomeServe, comprised of a business team, a core data team, and an IT team, to assess the maturity of speech analytics, the value and feasibility of relevant use cases, and improvements to customer experience and efficiency.

Because we couldn't build the entire architecture right away, we needed to quickly demonstrate the value of speech analytics to all stakeholders via a minimum viable product (MVP), able to expand after its validation with business experts.

To do this, we analysed two high-value use cases in a four-week cross-company workshop:

1. Refining understanding of customer contact root causes
2. Detecting risks of non-compliance within sales calls

We developed several microservices for data collection and processing, packaged so that these use cases could be developed and reused in the future, should the MVP phase prove successful.

RESULTS

The most important conclusion for Artefact is that we proved the technology is mature. Speech analytics is ready to produce value for companies right now.

The customer contact root cause use case analysis produced three actionable insights, which could help call centre agents to perform better, sell more contracts, and benefit from a less tedious workload.

The detection of non-compliance in sales calls use case analysis allowed us to prove that AI can be leveraged to better orient the work of the compliance team.

AI Industry Solutions
AI for Finance & Industry

Unlocking the Future: How financial institutions can prepare to scale AI
Gaining buy-in for data & analytics initiatives in financial services
The road ahead: data-driven sales is critical for the evolving car industry
Interview: How Nissan is transforming in the digital world


Unlocking the Future: How financial institutions can prepare to scale AI

Athena Sharma
Consulting Director & Global Financial Services Lead
ARTEFACT

According to The Economist, some 54% of large financial institutions (FIs) had already adopted artificial intelligence back in 2020, so imagine where those numbers stand today. To add to that proliferation, 86% of financial executives say that they plan on increasing AI investment through 2025. And in another survey, 81% said that unlocking value from AI would be the key differentiator between winners and losers in the banking industry.

"There's clearly a very strong value case to be made for AI in financial institutions", said Athena. "Investment banks are perhaps the earliest adopters and beneficiaries of machine learning technology in the algorithmic trading space. After all, 70% of FIs now use machine learning for fraud detection, credit scoring, or predicting cash flow events, and conversational AI is commonly used in retail banking and insurance. Yet despite this, many FIs fall short when it comes to productionising their AI projects to deliver concrete, enterprise-wide value."

Athena explained the main challenges to AI project success and how to overcome them:

• Number one requires investing in core technology and data management.

• Number two involves implementing a future-oriented operating model.

• Number three concerns proactively considering AI ethics and regulation.

Investing in core technology and data management

For Athena, one of the key difficulties FIs face is that their core technology is built for traditional operations, such as payments, lending and claims management. "Legacy IT stacks don't have the flexibility to deploy AI skills. The computational capacity for data management and analytics you need in a closed loop VR application just isn't there, and testing and developing AI technologies can take days or even months – prohibitive when you're trying to be innovative." The solution? Change core technologies: move to cloud computing.

"A cloud environment can reduce the time it takes to test and develop AI solutions down to a few minutes, thanks to managed services", assures Athena. "A bank I worked with started transitioning into the cloud two years ago, and their innovation rate has increased by about 49% according to their own KPIs. That might seem small, but for an incumbent, monolithic institution, it's quite revolutionary."

Another facet of this challenge is investing in data management – both in terms of data quality and data access. In FIs, data is siloed across various business units and divisions. As a result, data isn't standardised, quality is hard to manage, and there's no single source of truth, so stakeholders are unsure if the underlying data of proposed projects is trustworthy. "Investment in modern data governance and data management practices is crucial for FIs", insists Athena. "And a key component of that is what we call an Enterprise Data Model, or EDM. It's not an IT concept, but a way of describing and logically organising your data – all of your data – in business relevant language – a kind of business glossary, if you will, that streamlines data quality management for all certified users."

The final part of this challenge is data access. "Data is the most valuable raw material any organisation possesses; key to leveraging its value is to have access to analytics at scale, at the point of decision making. It's especially difficult in banks due to data confidentiality. An innovative solution is to create API-enabled databases for more effective and secure data access, at scale and in real time, to fulfil your business objectives."

Implementing a future-oriented operating model

The second challenge for financial institutions lies in the operating model they use. Most are organised according to business divisions, often with centralised IT functions, impeding their ability to innovate. Business leaders set their own agendas and AI strategies, resulting in fragmented teams and a waterfall approach that leads to delays, cost overruns, suboptimal performance and a total lack of a test-and-learn mindset. FIs must be able to work in an iterative manner to continuously innovate and improve – a necessity in order to scale AI, because no one ever gets it right the first time.

"Instead, we at Artefact propose a more agile and flexible future-oriented operating model based on data products. A data product is essentially a set of data solutions that directly address a business challenge or business outcome. Each data product is developed by a dedicated team that has its own budget, assets, and KPIs."

"For example, say you have a client 360 team of business, IT and data stakeholders. They can provide several data products to the business, as well as to external customers, so you obtain a customer 360 analytics layer. Data scientists and engineers can use this analytics layer to test and learn AI and ML solutions. You could also have a client 360 dashboard with relevant KPIs for your frontline sales colleagues and use it to improve customer lifetime value. You could also provide data to your marketing team about optimization and personalization to help them better spend their budgets."

The possibilities are endless, but in essence, a modular operating model allows your teams to collaborate better and work towards a common strategic goal, rather than in the silos that currently divide FIs – as well as a myriad of companies across all sectors where product teams are not yet a reality.

Proactively considering AI ethics and regulation

Investment in AI ethics and regulation is crucial for financial institutions right now. In reviewing the European Commission's proposed Artificial Intelligence Act, the European Data Protection Supervisor (EDPS) considers that stronger protection of fundamental rights is necessary, including the rights to privacy and to the protection of personal data.

Regulatory restrictions are to be imposed on anyone who uses software associated with biometric technology in financial institutions, human capital management or credit assessment of individuals. As things stand, this will affect almost all FIs.

While the full extent of future AI regulation is not yet clear to anyone, what is evident is that regulations will be ethics-based. But many leaders in the financial services industry feel their companies don't understand the ethical issues associated with AI.

Artefact proposes developing an in-house AI governance framework that covers all aspects of AI ethics, including data management, model training and retraining, and AI explainability. To do this, expert advice may be useful, but what's really needed is a two-part mindset shift.

The first shift requires large-scale stakeholder buy-in, obliging stakeholders to let go of the siloed mentality, divisions and operating models that are preventing you from productionising AI. The second is moving from a risk-averse to a pioneering mindset. This requires a deep cultural change where the entire organisation attains a high level of literacy on the impact of AI, its applications and its ethics, in order to be innovative without being irresponsible.

"It isn't easy, especially in an industry where risk aversion is deeply embedded. But ultimately, when it comes to AI adoption, I don't think financial institutions have much optionality. It's not about how or if AI can add value to your business; it's about how you can embed AI in your day-to-day operations in order to remain relevant and competitive in a rapidly changing global marketplace."

Gaining buy-in for data & analytics initiatives in financial services

Akhilesh Kale, Director – US FSI Lead, ARTEFACT
Corentin Boinnot, Senior Consultant, ARTEFACT
John Ly, Senior Consultant, ARTEFACT

Modern day da(ta) Vincis

Picture Da Vinci's Mona Lisa. The world has been captivated by the inherent mystery tied to this work of art: her elusive smile and eyes that seem to follow you. What is less obvious is the science, math and geometry Da Vinci used to anchor this masterpiece. Today's data leaders are the modern day Da(ta) Vincis. They apply complex data and AI machinery behind the scenes to deliver compelling business outcomes. This analogy underscores the complexities within Renaissance paintings and data ecosystems alike. When it comes to data, however, it is the business outcomes that matter most.

Setting the scene

ADVANCED ANALYTICS IN FINANCIAL SERVICES

Financial institutions were some of the earliest to invest in their data capabilities, driven by regulatory mandates and huge cost saving opportunities. Industry surveys continue to show that investments in data and AI, and the associated expectations, are on the rise:

• 94% of companies are increasing their data investments in 2023 (Data & Analytics leadership annual executive survey, New Vantage Partners, 2023)

• There is an estimated US$447B cost saving opportunity for banks in 2023 (Data For Finance, Artefact Research, 2022)

• All banks, regardless of size, are widely adopting AI and machine learning (The state of AI in 2022, McKinsey & Co, 2022)

Strategic decision-making is a delicate dance between decisions driven by gut feel and those driven by machines and data. It is paramount to anchor to fundamental value drivers before any conceptualizing and pitching take place.

This article outlines some practical keys to leveraging the power of data & analytics to transform your organization. The objective is twofold: you want to receive buy-in and harness maximum value from data projects.

Speak the language of the business

The majority of data leaders have a technical background and possess a genuine passion for their area of expertise. Nonetheless, it is key to resist the urge to drop complex data concepts when pitching a data transformation proposal to business leadership. Business leaders do not always understand data jargon, and might be thrown off by a technical approach to solving their problems. They care about the business outcomes and impacts: increasing sales, reducing operating costs, freeing up human resources and mitigating risks. Be sure to speak their language!

EXAMPLE

If you are trying to secure buy-in to launch an enterprise-wide data transformation project:

– Do not: talk about microservice architectures, data observability, domain driven ownership or common data governance principles.

– Instead: reference the cross-selling opportunities, the time saved by teams cleaning data and the opportunity to reduce customer churn.

[Figure: Advanced analytics in service of your strategy. Defense: risk monitoring & mitigation, regulatory compliance, reduced costs. Offense: increased revenue, more retention, competitive advantage.]

Bring business and technology onboard

Just as our Renaissance hero combined art and science, modern day Da(ta) Vincis need to recruit experts from across the business to instill a data-driven culture in an organization. Understanding the strengths of colleagues – and what motivates them – can help pave the way for effective collaboration. For example:

> Business talent is needed to spearhead the vision and quantify the value impact:

• Pique their interest with compelling use cases made possible by a new data platform.

• Make it clear they would own the project roadmap.

> Engineering & IT excel at building complex tools:

• To ensure they are onboard and invested, offer them the opportunity to roll out the latest data & AI technology.

EXAMPLE

Let's consider a scenario where you are trying to build a next generation Customer Data Platform and need to attract talent from elsewhere in the business to drive the project:

• Have a clear plan of who you need and what they can do.

• Understand what will drive and motivate individual experts to work on the project.

• Provide incentives to work with the data team; make sure everyone benefits.

Business could be hooked by the use cases available through a new data platform, and they would own the roadmap. IT would love to roll out the latest data & AI tech.


Unite the ranks, mobilize the leaders

In addition to the right mix of talent, long term project success relies on support from leadership as well as collaboration from on-the-ground teams to drive the project forward:

• Receive buy-in from leadership. Highlight the key issues the project can solve and bring them into a vision they can relate to.

• Identify data champions within the business to facilitate collaboration with your data teams. Their knowledge of both data and business will allow them to educate colleagues, clarify directives and identify relevant use cases.

• Establish a deep working relationship between data professionals and their business counterparts. Encourage the following key conversations between data and business teams:

> Help business teams understand the real cost of delivering current data insights.

> Have your data teams illustrate the benefits of their initiatives to the business to help ensure a smooth, self-driven onboarding process.

> Rely on pilots and proof-of-concepts today to demonstrate the feasibility of capabilities at scale in the future.

BRINGING LEADERSHIP TEAMS ON BOARD FOR A DATA QUALITY PROGRAM

Get creative to launch new initiatives. Help leaders and their teams understand current challenges and the feasibility of practical solutions.

[Figure: Situation: poor data quality, ineffective data products, no buy-in from leadership. Top-down: the CDO briefed the C-suite on lost revenues and increasing costs due to poor data quality, and each leader was asked to clean a real-world data set in Excel to demonstrate the time wasted; the high cost of poor data quality was understood and a practical solution was socialized. Bottom-up: the CDO team engaged business analysts (hybrid pods, sprint retrospects, ask-me-anything hours, coffee chats, surveys, Slack) to understand the impacts of poor data quality, and demonstrated impact through a data quality engine pilot. Outcome: three domains prioritized for launching the data quality program.]

Aim big, start small

LAYING THE GROUNDWORK FOR SUCCESS

The Mona Lisa wasn't painted in a day. Data leaders won't be able to reorient their whole company to make it data driven overnight. Have your Da(ta) Vincis invest their energy and resources into first sketching out the story. Use this to build a high value use case that highlights all the advantages of leveraging your data insights.

No matter how flashy the use case, the transformation from sketch to masterpiece won't take if you don't provide the business with the means to get invested in data. They will need:

• Clean data, or the means to clean it

• Business user friendly tools

• The data team's time, attention and resources

ARTEFACT CLIENT ILLUSTRATION

How one useful dashboard convinced our client to go 'all in' on data visualization: our client team (Commercial Sales Operations) spent 5 hours a week manually building their weekly B2B sales report. Using a lean data visualization pod team, we built and deployed an automated B2B Sales Power BI dashboard in three weeks, leveraging the existing data ecosystem with no disruption. Our client then reorganized their sales operations to leverage data from this dashboard; Marketing wanted a dashboard to visualize omni-channel campaign results on a unified interface; and we helped the client build a BI center of excellence to backlog, prioritize and execute on enterprise needs.

Discovering analytics gold

To complement a strong case – or generate buy-in from the business without one – identifying existing pots of "analytics gold" in your organization is a good approach. There are almost certainly passionate individuals in the business who have developed impactful analytics solutions locally. Find them and industrialize their initiatives so the whole company can benefit.


Here are a few tips to help unearth and identify these hidden treasures:

• Build an enticing platform (newsletter, monthly data meetup, …) so people will want to share their ideas.

• Establish recognition and rewards to better acknowledge their ideas and work.

• Work with HR to ensure data practices are measured in managerial performance reviews.

• Build guilds and organize dedicated chat channels to share knowledge.

[Figure: Finding pots of (analytics) gold hidden in the business. Find: organize business hackathons; organize Slack/Teams channels for idea sharing. Encourage: publish success stories in newsletters; bring teams to leadership townhalls; push program branding. Reward: get leadership to show appreciation; extend to performance reviews and internal incentives.]

Data & analytics leaders, true renaissance thinkers

Being a remarkable "sales" person is a quality that makes one stand out, no matter the profession. However, being remarkable at "selling" is less about the act of sales and more about the ability to solve complex problems. The role of a data leader encompasses all of these traits. They need to be an outstanding salesperson, a smooth operator, and possess the ability to understand their organization's numerous challenges.

Going back to where we started this blog, we see today's analytics leaders as modern day Da(ta) Vincis. Powered by the tools of the 21st century – data, analytics and AI – these true Renaissance thinkers are creatively painting the business impact story through the complex machinery of indisputable facts.

[Figure: The data & analytics leader: sell & convince, discover analytics gold, execute & scale, by speaking the language of the business, bringing business and technology onboard, uniting the ranks and mobilizing the leaders, and aiming big while starting small.]

The road ahead: data-driven sales is critical for the evolving car industry

Axel Tasciyan
Data Consulting Director, Automotive Lead
ARTEFACT

The commercial model for selling cars is shifting, with manufacturers adopting the direct-to-consumer trend (also known as the agency model) being witnessed across a wide variety of business sectors as people look to simplify the buying process.

This is a sea change for vehicle producers. It's also complicated to navigate in terms of the agreements that will still be required with dealerships.

This requires a three-pronged approach: sell more, sell better, seek out new sales streams.
sell better, seek out new sales streams.
The net result is that automakers are
experiencing major and sustained
pressure on their bottom lines.

Tackling the industry’s shifting vehicle


retail sands requires a three-pronged
approach: sell more; sell better; seek
out new sales streams.

Of course, it looks obvious on paper.


But far from being a rousing sales
team pep talk, these concrete steps
form the basis of the modern motor
trade. Let’s look at them in detail.

Sell more

This is not the obvious admonishment


to sell more vehicles (although that of
course is important); rather it focuses
on increasing the ‘extras’ sold to
someone who has purchased a car.

Currently a buyer visiting the


manufacturer’s website a week
after placing their order is likely to
be offered another car. This doesn’t
work for anyone.

Contrast that with a consumer buying


a car who, on returning to the website,


is shown a range of available ancillary options — extended warranties, accessories, service specials etc (and even specific offerings such as seasonal items, based on their geographic location — snow-tires in winter, for example).

This personalized approach, undertaken in close collaboration with dealers, extends the carmaker's relationship with the buyer from the current period of one to two years, potentially to between five and eight years. And by increasing this customer lifetime value, it opens up a range of revenue-generating opportunities that add value to the end user.

Looking outside the automotive sector provides further inspiration on how to 'sell more'. Telecom company Orange, for example, sells a wide variety of accessories such as earbuds and phone cases as a result of the personalization of its website and after-sales advertising campaigns.

And travel specialist Pierre & Vacances Center Parcs, operating a voucher system as the pandemic forced people to cancel their holidays, encouraged travelers to supplement trips planned for the future with incentives such as bigger rooms and additional activities. Making these as appealing as possible required the company to minutely analyse its customer data in order to offer experiences that were relevant to different customers; leveraging the initial 'big' purchase to position the follow-up messages in order to sell more to already acquired customers and realise more profit is a key piece of the most effective digital strategy.

Sell better

Currently, carmakers control only a small part of the purchasing experience. A potential buyer might approach the manufacturer, only to be guided to the nearest car dealership, where they test drive the car, negotiate the price, agree a sale, provide their details, and arrange subsequent services. This information is all vital — but it belongs to the distributor.

However, as DTC selling becomes more prevalent, data ownership will evolve and change, and manufacturers will preside over increasing amounts of valuable insight about their customers and prospects.

Armed with this information, the manufacturer can improve their sales conversion rate because they know more about who buys their cars and why — and therefore how to attract them and retain their interest throughout the sales funnel. At the same time, this knowledge sharpens their targeting ability and enables them to decrease their acquisition costs.

Ultimately, "selling better" is about marketing, and within that, truly understanding how to use digital tracking tools (such as Google Analytics) in order to mine the wealth of information that already exists. Armed with these details, digital ad spend can be optimised based on real and accurate business and customer data (such as customer lifetime value, actual purchase details and online behaviours), rather than generic media KPIs.

Toyota Canada is one manufacturer actively demonstrating an advanced approach.

Using a scoring model based on first party data such as online behavior, it identified the people with the highest propensity to buy a vehicle; targeting these new audiences drove a conversion rate six times higher than the previous tactic of re-engaging website visitors, while reducing the cost-per-acquisition by 80 percent. More advanced activity will include using increasing amounts of first party data to steer advertising campaigns.

Seek out new sales streams

Extending the car-buying experience gives manufacturers the opportunity to build up a picture of their customers — the environment they live in (rural or urban), the type of car they need, how they use it and potentially where they go.

Anonymizing these details and adding them to the vast amounts of first-party data that the industry already owns puts manufacturers in charge of a lucrative, mainly unrealised revenue stream that can be unlocked via data partnerships.

Data partnerships enable an organization to access the first-party data of another organization, either paying for it directly or reciprocating with its own first-party data. The strategy enables both enterprises to explore new routes to new customers. For the automotive industry, partnerships with insurance companies are an obvious link, but this approach is easily broadened — hotels and holiday companies are key contenders for understanding more about people's driving and travelling habits, for example.

The structure and organization of data partnerships is still new. It's therefore a value-generating differentiator for manufacturers that are prepared to be trailblazers and invest time and effort getting it right now, before it reaches maturity.

In short, finding more value is about manufacturers complementing the traditional way of selling cars by using creative thinking to make money from everything they know about the people that buy their products, in a way that enhances the customer experience.

The data 'glue'

So what is the link that holds together the modus operandi outlined above? Data. Data, data and more data.

This calls for a change of mindset in the car industry. Instead of one organization selling new cars, another selling used ones, another offering spare parts and yet another selling services, with each keeping their data separate, the vision of tomorrow relies on one central data hub that benefits everyone. Sometimes referred to as a Customer Data Platform (CDP), today's cloud solutions for the different tech components keep costs manageable and make it feasible and relatively uncomplex for most enterprises.

With a CDP established, data management and analysis can deliver relevant insight that adds value and generates revenue opportunities throughout the complete customer lifetime.

However, this isn't a "tick box" exercise; the work of a data-marketer is never done… Data, and especially business data, is continually added to the CDP, and with it the picture of customers becomes clearer, richer and more granular — giving manufacturers and dealers ever more accurate tools with which to sell more, sell smarter and develop new sales streams.

74
AI INDUSTRY SOLUTIONS — AI FOR FINANCE & INDUSTRY

INTERVIEW

How Nissan is transforming in the digital world

In this Q+A, Dév Rishi Sahani, Nissan’s Global Head of Customer Experience Data Analytics & Reporting, chats with Pascal Coggia, Artefact Partner and UK Managing Director, to explain how the Japanese car giant has accelerated its digital transformation over the last few years and is now using data and BI hubs to drive operational efficiencies and sales around the world.

Pascal Coggia
CEO
ARTEFACT UK

Pascal Coggia: What does digital transformation mean for Nissan?

Dév Rishi Sahani: The term digital transformation is overused to an
extent, so it depends on how you
define it. At Nissan, we embarked
on our digital transformation journey
several years ago. We’ve done well
on the technology transformation,
and we will continue to do that, but
the most important thing for us is
keeping up with customers. Our entire
focus is on what we call customer
experience transformation, and if you
look at it from that perspective then
the challenges are very clear.

The first challenge concerns the organisational or operating structure.
Thinking about customer journeys
as a linear path is a traditional way to
visualise how your customers interact
with your brand, but, ultimately, it limits
itself to being very transactional and
locks you into silos. To move from
transactions to relationships with
customers, we must stop looking
at the step conversions or handoffs
between these channels and connect
those silos.

Pascal: Ok, can you tell us how you are using data? How are you becoming customer-centric?

Dév: Over my career, I hold a firm belief that the amount of data available will always be ahead of organisational appetite to consume and action those insights. This is not a bad or a good thing, it’s just reality. Sometimes the battle is simply because the data conflicts with the intended course of action you wanted to take.

But more importantly, it’s when analysis does not present clear implications or recommendations — if the data is not telling you what to do next, it’s pretty useless. This is where the word ‘utilisation’ becomes relevant; it shifts the focus from how many people are using the data to the utility. We’ve been leading the adoption of data within our organisation by keeping a clear definition of its usefulness for our markets, business functions and digital teams.

At one end is the input data, like dashboards and the support systems that enable upstream planning and decision making; on the other end is output data, which are the predictive models and the data science that shows the operational results of our initiatives. What links the two of these together is a data-driven, hypothesis-led test-and-learn culture.

Pascal: Can you give us an example of a data project that has resonated at scale within your organisation?

Dév: In one way it’s been our entire journey for the last two or three years… developing CEDAR, our internal brand for the data analytics function. It stands for Customer Experience Data Analytics and Reporting. It helps us turn data into information, knowledge, and wisdom. That’s where the actionability and usability come in.

For the first couple of years, we were kind of hardcore, getting everything organised. The first ‘Eureka!’ moment was when the whole organisation realised they could look at the data and see consumer trends across 147 markets. Rather than make assumptions based on samples of 5,000 people, say, we can now see how one million visitors interact, every single minute of every single day.

A couple of years ago, we launched a new car in a specific market and spotted a lot of cross-segment buying going on. We could see when customers changed their minds from the model we thought they would buy to the one they actually did. We could also determine who would buy an automatic transmission or a manual transmission, for example. We can now consume data and see significant patterns in the consumer buying journey. We want to know what our customers want and respond to that.

Pascal: Can you tell me more about this CEDAR dashboard — this BI hub? Why does it exist?

Dév: It wasn’t the first thing we built. We started by focussing on data – what questions can we ask the data? What answers will the data give? We got our head around that in the first year of the program. Then it was about ‘democratising’ this data. We realised that having the data was not the challenge, but making sense of it was. This is where we developed our partnership with Artefact to create CEDAR home.

CEDAR is an independent platform that works across Nissan, and across
functions, and contains lots of
different dashboards and insights. It’s
our hub of experience and consumer
insights – where our people go to
see the world as our customers do.
Anyone in the company can go there
and if you don’t have a login, you just
sign up. It should feel just as easy as
browsing our website.

Most importantly, as a data person, we can look at the analytics in CEDAR to understand how it is being used, where the points of friction are, where people get lost, and what parts are useful or not useful. Rather than just ask, we know.

Pascal: Have you seen people using CEDAR to make decisions they were unable to before? Or is it just helping them save time?

Dév: We have a lot of people who are making good decisions, but the evidence of people using CEDAR to make different decisions or decisions that they could not take before is quite satisfying.

We started this journey from a customer experience perspective. We wanted anyone who was working on customer experience to use these insights. But very quickly it has spread to teams from other functions such as advanced product planning and market intelligence. All these different functions, at global and market levels, know about CEDAR. That itself is giving us a lot of context about how data is becoming more useful across our organisation and is being used by different types of business functions.

Pascal: Tell me a little bit about the team. Who’s in the driving seat?

Dév: Data is not everything; you need to know how to unlock insights from it. This is where having the right team helps. When I joined Nissan in November 2017, there were two people in the global data practice and from there we grew the team across three core pillars — called Measure, Optimise and Predict.

The Measure pillar is tasked with understanding how we organise the data and turn it into information. We grew up in a very collaborative way with partnerships, like with Artefact and other companies. We knew we needed the machine to build that up. The second pillar is called Optimise and is responsible for running data-driven experiments and testing and learning – which we do a lot of. The third pillar is called Predict, which is the data science pillar. We were very careful to add that at the end to ensure we didn’t rush ahead and bring in talent that would get very bored, very quickly, because we didn’t have the right use cases lined up.

Now, the direct team is mostly based in the global customer experience centre in London, along with some folks in our headquarters in Yokohama, Japan, and we work with embedded analytics teams across all markets.

Pascal: So, what does success look like? How do you see your data projects progressing over the next couple of years?

Dév: Success is about making the data we generate useful for both Nissan and its customers. That involves tying everything we do back to our organisational KPIs. We can attribute where we are starting to make a difference from an organisational perspective, and, equally, we can tie it back to customer quality metrics that show how we get short-term wins or long-term longevity. That’s what success of the customer experience program looks like. The success of the data program is about continuing to make it very simple and keeping it growing.
Data for Impact

Use data to measure and reduce your environmental impact with Artefact
Industrializing carbon footprint measurement to achieve neutrality
Applying machine learning algorithms to satellite imagery for agriculture applications

DATA INDUSTRY SOLUTIONS — DATA FOR IMPACT

Use data to measure and reduce your environmental impact with Artefact

Léonard Cahon
Senior Consulting Manager
ARTEFACT

It’s challenging for companies to consistently measure their environmental impact and prioritize the actions they need to take to reduce it. But with data and artificial intelligence, it’s possible to accurately measure the impact of an activity and effectively guide decision making while adopting best practices for increased energy sobriety in digital business. This is what Artefact, a leading international data transformation and consulting company, demonstrates.

Companies often struggle to measure their environmental impact. It may be difficult for them to gather all the data they need, or they may simply not have control over it. They can’t know, for example, how their customers will use their products once they’ve purchased them and what impact this will have on the environment. To better estimate this impact, companies can turn to Artefact’s data experts. They advise large companies on how to turn data into business value and focus more on the potential of data to positively impact the environment.

An increasingly restrictive legislative framework

By 2024, companies will be required to produce and publish their non-financial information, including their annual carbon footprint.

“The legislative framework is increasingly restrictive, but it sets objectives and constraints without a common frame of reference for the same sectors to achieve these objectives and respect these constraints,” explains Margot Millory, Consulting Manager in charge of Sustainability at Artefact.
international data transformation and consulting Two companies might end up
company, demonstrates. with extremely different orders of
magnitude simply because they don’t
have the same measurement methods
or don’t take the same emission
factors into account. Artefact offers
to assist companies by capitalizing
on the entire data value chain, from
data strategy to data governance
to implementation, all with real
consulting support to guide decision
making towards greater efficiency. The
company is already working with many
sectors (consumer products, luxury
goods, telecommunications, etc.), but
more and more business activities
will be impacted by these issues, as
evidenced by the CSRD (Corporate
Sustainability Reporting Directive),
which now sets the standards for
non-financial reporting for 50,000
companies in Europe.

Data governance for energy
sobriety in digital activity
As with many of the challenges facing
companies, the creation of a reliable,
sustainable database is an essential
prerequisite for implementing a
strategy to reduce environmental
impact. Artefact’s sustainable data
governance offer enables its clients
to benefit from a clean and structured
data repository, a real added value for
the organization.

“This data is often spread across several databases that are rarely or never consulted, with no clearly established governance. To retrieve them on a regular basis and track them over time, companies need strong data governance skills, which is the hallmark of a business like Artefact’s. You can’t measure if you don’t have a solid, sustainable data structure,” explains Vincent Luciani, CEO of Artefact.

This exercise is essential for companies to identify missing data and get up to speed in order to collect and structure it and improve its quality. As a result, they’ll be able to develop automated and sustainable reporting tools to serve as the foundation for building a realistic environmental impact reduction trajectory.

Best practices for responsible digital activity

Artefact’s clients are increasingly integrating extra-financial logic into their reasoning. In response to this evolution, the company also assists them in their efforts to reduce their carbon footprint through use cases based on data science and AI: inventory optimization, waste reduction and improved delivery times.

Digital technology is currently responsible for 2.5% of national GHG (greenhouse gas) emissions, and these could increase by 60% by 2040. Artefact wants to minimize the negative external factors linked to its activity through the promotion of best practices to implement data/AI projects responsibly, in a logic of energy sobriety. To this end, Vincent Luciani and Margot Millory have joined the Institut Numérique Responsable (Responsible Digital Institute), a think tank that focuses, among other things, on reducing the economic, social and environmental footprint of digital technology.

“It’s crucial to operate in both an energy-efficient and ethical way, whether in terms of data collection, computing power or the way we build and deploy our algorithms,” explains Vincent Luciani.

Environmental initiatives at the heart of future work models

Today, Google, Microsoft and Amazon are all capable of completely transforming the market with new technologies. Artefact is optimistic about these changes, convinced that technology can offer many benefits, and that those who produce it are equally aware of environmental issues.

“We believe that the environmental transformation wave will be as big as the digital wave,” explains Vincent Luciani.

It’s likely that companies will launch environmental initiatives or place the environment at the heart of their work model. Artefact sees this as a motivating factor: a new way of thinking with different constraints, different stakeholders, and multiple priorities, as different KPIs are involved. The companies supported by Artefact used to have a customer-centric approach, which puts the customer at the heart of the company’s activity. But their priorities are changing, driven by growing pressure from consumers who now demand full transparency and concrete action from the companies whose goods and services they consume. “Our business is evolving towards an approach that puts the citizen and accountability at the center of the company’s activity,” explains Vincent Luciani.

Education: a pillar of sustainable digital transformation

Companies are seeking to recruit trained people to work on data-related projects; there is also a real need for training in sustainable development and environmental impact. Through its Artefact School of Data, which teaches data-related jobs to people in professional retraining, Artefact tackles important notions such as adopting best practices in order to implement frugal algorithms, measuring the carbon impact of an e-commerce chain or establishing an accurate sales forecast in order to prevent food waste, as carried out in the bakery and pastry department of Carrefour hypermarkets.


Industrializing carbon footprint measurement to achieve neutrality

To achieve carbon neutrality, the challenge for large companies is first to track their carbon footprints. Some large companies have initiated a change in the culture of data processing to achieve the industrialization of this data, which is massive, heterogeneous and rarely prioritized.

Vincent Blaclard
Partner
ARTEFACT

The climate emergency has become a major issue for our society. Recent events, in particular the multiple shortages and repeated heat waves, only confirm the acceleration of current and future difficulties that must be overcome. Today, many European companies listed on the stock exchange are announcing their commitment to climate transition. 30% have made a real commitment to reduce their carbon emissions, but it is estimated that only 5% of them are on track to do so. It is not a simple exercise. Reducing emissions in a sustainable way requires accurate measurement of their carbon footprint, in order to develop concrete actions. At Artefact, we believe that exploiting the data to its full potential is a major asset for the success of this approach.

Achieving carbon neutrality with three objectives thanks to data

Let’s take the example of the Carrefour group, for whom we are carrying out an assignment. Carrefour’s ambition is to become the world leader in food transition, particularly in e-commerce. One of its major objectives is to make e-commerce carbon-neutral by 2030. Three main levers of action have been identified in order to reach these objectives: reducing Carrefour’s own emissions, engaging its service providers to reduce their emissions and finally encouraging its customers to adopt eco-responsible behaviors. This ambition, in addition to responding to the climate emergency, also has a strong economic impact.

We must meet the expectations of consumers, who are increasingly committed, and anticipate the tightening of the legislative framework to come, such as the eco-score that will become mandatory from 2023 for certain players. In order to face these challenges, Carrefour understood that it was necessary to have a measure of the carbon footprint: to make a quantified inventory of the starting point, to determine the impact of reduction initiatives and to be able to communicate both internally and externally on the successes, and also on the challenges to come. This measure will be the compass for the 2030 neutrality trajectory. It will have to meet the requirements of reliability and transparency, and allow for the implementation of concrete actions.

The major challenge of prioritizing data

A large part of the project’s efforts consisted of collecting a large amount of very heterogeneous data from multiple sources (for example, mileage data of delivery services or IT infrastructure emissions data), in order to orchestrate them and build a consolidated carbon footprint measurement. The goal is to obtain a comprehensive measurement of all emission items for each individual order. The main difficulty with any project of this type is the complexity of accessing data that can be used quickly. Most large groups have already launched significant programs to better govern the data, addressing quality and accessibility issues first. These programs are often very large and obviously cannot handle all the data created in a company. Prioritization of the data domains closest to the core business is necessary, such as sales, supplier or consumer data.

“Reducing your emissions in a sustainable and lasting way requires you to accurately measure your carbon footprint.”
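As a simplified illustration of what such a consolidation produces: the footprint of an order is the sum, over its emission items, of an activity quantity multiplied by its emission factor. The pandas sketch below shows that logic on toy data; all column names, quantities and emission factors are hypothetical, not Carrefour’s actual figures.

```python
import pandas as pd

# Hypothetical activity data gathered from heterogeneous sources,
# one row per (order, emission item)
activity = pd.DataFrame({
    "order_id": ["A1", "A1", "A2"],
    "item":     ["delivery_km", "packaging_kg", "delivery_km"],
    "quantity": [12.0, 0.4, 3.5],
})

# Hypothetical emission factors, in kgCO2e per unit of activity
factors = pd.Series(
    {"delivery_km": 0.25, "packaging_kg": 1.1}, name="kgco2e_per_unit"
)

# Footprint per order = sum over items of quantity x emission factor
footprint = (
    activity.join(factors, on="item")
            .assign(kgco2e=lambda df: df["quantity"] * df["kgco2e_per_unit"])
            .groupby("order_id")["kgco2e"]
            .sum()
)
print(footprint)  # A1: 3.44 kgCO2e, A2: 0.875 kgCO2e
```

Industrializing the measurement means running this kind of join continuously over governed data sources, rather than as a one-off, static expert exercise.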
Unfortunately, data related to sustainable development is rarely prioritized in such initiatives, as it is rarely used in an industrial way by large groups. Today, a team of experts needs several weeks of project time to calculate a carbon footprint measurement that is often static. It is certain that tomorrow all companies will have to be able to calculate this carbon footprint at any time, in the same way that companies are required to be financially transparent.

The parallel with the data market

We can take the parallel further with the evolution of the data market. Ten years ago, awareness of data in large companies was still limited. It was the exclusive territory of small teams within the IT or digital departments who worked on use cases, without the capacity to bring their solution to scale. Today, the importance of data is heard at the executive committee level of large groups, and is perceived as a strategic priority at all levels. This evolution has been, over the last ten years, the result of a collective awareness of the importance of data, notably through geopolitical and strategic issues, as well as tensions between major powers and large technology groups. This awareness has gradually taken hold in all organizations, even those less advanced in digital technology. It has been accelerated by the arrival of new generations (millennials) in decision-making positions, who have been aware of digital issues since their childhood.

Measuring the carbon footprint of all activities

This evolution is not going smoothly, and the use of data does not always give the expected results, often because robust foundations have not been put in place. The major groups have now understood the importance of this fundamental work and are launching numerous programs on the subject.

We are undoubtedly at a crossroads as far as the ecological transition in companies is concerned. The successive disasters of the summer of 2022 are helping to accelerate this awareness, while a new generation of workers who are highly aware of these issues is entering the job market. Nearly 76% of Gen Yers place CSR above salary in their job search criteria, and 70% are willing to pay up to 35% more for a sustainable, low-carbon product or service.

The market is still at this stage: there is a strong will to move forward, but the foundations needed to achieve these goals in a sustainable way often have to be built, which Carrefour has understood well. It is therefore crucial that companies equip themselves with the capabilities and tools to match their ambition, in particular the measurement of the carbon footprint of all their activities. This measurement must be industrialized, calculated in real time, accessible and integrated into all business processes. For example, the carbon footprint could be integrated into budgets and used to assess the impact of new projects, along with the revenues generated and the associated CapEx and OpEx costs.

Consolidating data governance

Once these foundations are built and consolidated, large corporations will be able to leverage their data much better to accelerate their green transition. Strong data foundations are a major prerequisite for deploying AI solutions at scale; the same is true for the green transition, where AI will certainly play a role once these foundations are consolidated. It’s often more appealing to talk about AI than data governance, but I am convinced that the success of these initiatives lies in the ability to move forward on both fronts: delivering impact through targeted initiatives, while building the right foundations to sustain those impacts.


Applying machine learning algorithms to satellite imagery for agriculture applications

Paul Devienne
Senior Data Scientist
ARTEFACT

This article will:

• Show you various applications of machine learning and computer vision to satellite images for agriculture.
• Present a series of algorithms to successfully detect and label agricultural plots.
• Suggest alternative methods depending on the availability of data.

This article assumes basic fundamentals in data science and computer vision.

Business motivation

A solution able to automatically detect and label crops can have a wide range of business applications. Computing the number of plots, their average size, the density of vegetation, the total surface area of specific crops, and plenty more indicators could serve various purposes. For example, public organizations could use these metrics for national statistics, while private farming companies could use them to estimate their potential market with a great level of detail.

Naturally, satellite imagery was considered and identified as a very viable data source for three specific reasons:

Scalability: A bank of images covering the whole world is available right away and is updated regularly.

Data richness: Satellite images can provide a lot more information than simple pictures. Instead of a 3-band image of Red, Green and Blue pixels, some satellites can provide more than 15 features per pixel.

Cost: Even though satellite imagery can be quite costly, some options are fully free, such as Sentinel-2, which we ended up selecting as our main data source.

Step 1 — Detecting agricultural areas on satellite images

After retrieving and preprocessing Sentinel-2 images, our first challenge was to locate the plots and limit ourselves to specific areas of interest. Because each image has a very high resolution, it would not be realistic to apply the whole processing chain to full-size images. Instead, the first step was to crop large images into smaller fragments and identify the areas where the plots were located on these smaller images (cf. image 2).
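To make the cropping step concrete, here is a minimal sketch, assuming the scene has already been loaded as a NumPy array; the tile size of 256 pixels is illustrative, not the exact value used in the project:

```python
import numpy as np

def tile_image(scene: np.ndarray, tile_size: int = 256) -> list:
    """Split a large (H, W, bands) scene into square fragments.

    Incomplete fragments on the right/bottom edges are simply discarded.
    The (top, left) offset is kept so that detections can later be mapped
    back to the full scene.
    """
    tiles = []
    height, width = scene.shape[:2]
    for top in range(0, height - tile_size + 1, tile_size):
        for left in range(0, width - tile_size + 1, tile_size):
            tiles.append(((top, left),
                          scene[top:top + tile_size, left:left + tile_size]))
    return tiles

# Example: a mock 4-band scene the size of a Sentinel-2 tile (10,980 px square
# at 10 m resolution) yields 42 x 42 = 1,764 fragments of 256 x 256 pixels
scene = np.zeros((10980, 10980, 4), dtype=np.uint16)
fragments = tile_image(scene, tile_size=256)
```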

83
Solution 1A
TRAINING A PIXEL CLASSIFIER

The first solution for detecting agricultural zones on large images is to build a pixel classifier. For each pixel, this machine learning model would predict whether the pixel belongs to a forest, a city, water, a farm… and therefore, to an agricultural zone or not.

Because a lot of resources can be found for Sentinel-2, we were able to find labeled images with over 10 different classes of ground truth (forest, water, tundra, …). However, if the climate of your area of study is different from the area you trained your model on, you might have to reevaluate the classes attributed to each pixel.

For example, after training a model on temperate climate countries and applying it to more arid regions of the world, we observed that what the model was seeing as forests and tundras were in fact agricultural crops.

Image 2: Our desired output: fragments containing only agricultural areas (Copernicus Sentinel data 2019)

Once your pixels are classified, you can drop all images that don’t contain any agricultural areas.

Solution 1A pros:
• Most reliable and granular results (pixels)

Solution 1A cons:
• A dataset of labelled pixels is required
• Classifying each pixel generates a high computational cost

Out of all available methods to detect agricultural zones, this one was the most accurate (a minimal sketch follows). However, if you do not have access to labeled images, we have identified two alternative solutions.

Illustration of pixel classification with 3 visible classes of pixels (Copernicus Sentinel data 2019)
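As an illustration, such a per-pixel classifier can be prototyped with scikit-learn by flattening each fragment into one row per pixel. The class list and the random-forest choice are assumptions for the sketch, not necessarily the exact model used in the project:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# X: one row per pixel, one column per spectral band (Sentinel-2 provides 13)
# y: one ground-truth class per pixel, e.g. 0=water, 1=forest, 2=city, 3=farm
X_train = np.random.rand(10_000, 13)        # placeholder for real band values
y_train = np.random.randint(0, 4, 10_000)   # placeholder for real labels

clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
clf.fit(X_train, y_train)

def agricultural_ratio(fragment: np.ndarray, farm_class: int = 3) -> float:
    """Share of pixels predicted as farmland in a (H, W, bands) fragment."""
    h, w, bands = fragment.shape
    predictions = clf.predict(fragment.reshape(-1, bands))
    return float(np.mean(predictions == farm_class))

# Keep only fragments that contain at least some agricultural pixels
# fragments = [(pos, f) for pos, f in fragments if agricultural_ratio(f) > 0]
```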


You can design your own polygons on Google Maps, thus focusing on a specific area of choice while drawing around obstacles (water, cities…)

Solution 1B
MAPPING GEO COORDINATES TO PIXEL COORDINATES

If coordinates for your zone of interest have been labeled, or if you’re labeling coordinates yourself, it is possible to map these geo coordinates (latitude and longitude) to your images.

For example, if you have the coordinates associated with large farming areas, or if you draw large polygons on Google Maps yourself, you can easily obtain geo coordinates of agricultural areas. Then, all that is left to do is map those coordinates to your satellite images and filter your images to only cover the zones within your polygons. A minimal converter is sketched after the list below.

Solution 1B pros:
• Also a reliable method

Solution 1B cons:
• You need a list of coordinates associated with agricultural regions
• Manually creating those coordinates can be time consuming

Visual representation of the NDVI on an agricultural zone and a desert
(Copernicus Sentinel data 2019)

Solution 1C
USING A VEGETATION INDEX

It is possible to compute a vegetation index from the color bands provided by the satellite images. A vegetation index is a formula combining multiple color bands, often highly correlated with the presence or density of vegetation (or other indicators such as the presence of water).

Multiple indices exist, but one of the most commonly used in an agricultural context is the NDVI (Normalized Difference Vegetation Index). This index is used to estimate the density of vegetation on the ground, which could serve to detect agricultural areas over a large image.

After computing NDVI values for each pixel, you can set a threshold to quickly eliminate pixels with no vegetation (see the sketch below). We used NDVI as an example, but experimenting with various indices could help achieve better results.

Note that computing a vegetation index can provide useful information to enrich your analysis, even if you have already implemented another way to detect agricultural areas.

Solution 1C pros:
• Absolutely no labelled data required

Solution 1C cons:
• Not very accurate: for example, it could be hard to differentiate agricultural crops from forests
• The thresholds have to be fine-tuned depending on climate and other specificities
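For reference, NDVI is computed per pixel as (NIR − Red) / (NIR + Red). A vectorised sketch, with an illustrative threshold that would need tuning per climate:

```python
import numpy as np

def ndvi(red: np.ndarray, nir: np.ndarray) -> np.ndarray:
    """NDVI = (NIR - Red) / (NIR + Red), ranging from -1 to 1.

    High values indicate dense vegetation; bare soil and water score low.
    """
    red = red.astype(np.float32)
    nir = nir.astype(np.float32)
    return (nir - red) / (nir + red + 1e-6)  # epsilon avoids division by zero

# For Sentinel-2, Red is band 4 (B04) and NIR is band 8 (B08)
# vegetation_mask = ndvi(b04, b08) > 0.3   # illustrative threshold
```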


An example of edge detection on agricultural plots using OpenCV (Copernicus Sentinel data 2019)

Step 2 — Detecting and outlining agricultural plots

BUILDING AN UNSUPERVISED EDGE DETECTOR

Once you have determined the location of your agricultural zones, you can start focusing on outlining individual plots in these specific areas.

In the absence of labeled data, we decided to go for an unsupervised approach based on OpenCV’s Canny edge detection. Edge detection consists in looking at a specific pixel and comparing it to the ones around it. If the contrast with neighboring pixels is high, then the pixel can be considered as an edge.

Illustration of the full process of outlining plots (Copernicus Sentinel data 2019)

Once all the pixels that could potentially be true edges have been identified, we can start smoothing out the edges and try to form polygons. As expected, the performance of the edge detection algorithm proved to be much better when applied to large plots.

This method allowed us to automatically identify close to 7,000 plots in our area of interest. Because we used the pixel classification method (see Step 1A), we were able to separate real farm plots from other polygons, thus only retaining relevant data.

Polygons consisting of a minority of “farm pixels” were eliminated (Copernicus Sentinel data 2019)
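A minimal OpenCV sketch of this step; the Canny thresholds and the minimum contour area are illustrative values that must be tuned to the imagery:

```python
import cv2
import numpy as np

def outline_plots(fragment_rgb: np.ndarray, min_area: float = 50.0) -> list:
    """Detect candidate plot polygons on an RGB fragment with Canny + contours."""
    gray = cv2.cvtColor(fragment_rgb, cv2.COLOR_RGB2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)          # smooth out noise first
    edges = cv2.Canny(blurred, threshold1=50, threshold2=150)

    # Close small gaps so that edges form closed shapes
    kernel = np.ones((3, 3), np.uint8)
    closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)

    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Discard contours too small to be plausible plots
    return [c for c in contours if cv2.contourArea(c) > min_area]
```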

Experimenting with contrast, saturation or sharpness can help improve the efficiency of the edge detection (Copernicus Sentinel data 2019)

OPTIMIZING THE PERFORMANCE OF THE EDGE DETECTION ALGORITHM

In order to get the best possible results, it can prove useful to apply modifications to your image, notably by playing around with contrast, saturation or sharpness.

Another critical success factor is forcing the polygons to be convex. Since most plots follow regular shapes, forcing convex polygons can usually yield much better results. Both ideas are sketched below.

Forcing convex shapes fits most plots much better (Copernicus Sentinel data 2019)
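Both optimizations can be prototyped in a few lines, reusing the outline_plots helper from the Step 2 sketch; the contrast gain and the placeholder fragment are assumptions:

```python
import cv2
import numpy as np

fragment_rgb = np.zeros((256, 256, 3), dtype=np.uint8)  # placeholder fragment

# 1. Boost contrast before edge detection
#    (alpha = contrast gain, beta = brightness offset; values to be tuned)
enhanced = cv2.convertScaleAbs(fragment_rgb, alpha=1.5, beta=0)

# 2. Replace each detected contour by its convex hull, which fits the
#    regular shape of most agricultural plots much better
hulls = [cv2.convexHull(c) for c in outline_plots(enhanced)]
```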


Step 3 — Classifying each parcel to detect specific crops

Once all plots have been identified, you can crop each of them and save them as individual image files. The next step is to train a classification model in order to distinguish each parcel based on its crop — in other words, to identify tomato crops from cereals, or potatoes.

Illustration of the external crop data source

BUILDING A LABELLED TRAINING SET

Because we did not have an already labelled dataset available, and because manually labelling hundreds of images would be too time consuming, we looked for complementary datasets containing information about the crops on specific plots at a given time and place.

The ideal scenario would be to have pre-labelled images, but in our case we only had the geo coordinates and crops of a few hundred farm plots in our area of interest. This dataset contained a list of plots, the latitude and longitude of each plot’s centre, and the crop planted on it at a specific time of the year.

In order to build our training set, we used our geo-coordinates-to-pixel-coordinates converter to identify the specific plots for which we had a label (the crop) in our image bank. The matching logic is sketched below.

Out of the 7,000 plots identified in Step 2, we managed to label around 500 plots thanks to our external data source. These 500 labelled plots served to train and evaluate the classification model.

Dozens of models were trained on datasets generated with various data preparation techniques
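A possible implementation of this matching, reusing the geo_to_pixel helper from Solution 1B; the dataframe columns and the variables carried over from earlier sketches (bounds, width, height, detected_plots, scene) are assumptions, not the project’s actual names:

```python
import cv2
import pandas as pd

# Hypothetical external dataset: one row per known farm plot
labels = pd.DataFrame({
    "lat":  [48.85, 48.86],
    "lon":  [2.35, 2.37],
    "crop": ["wheat", "potato"],
})

training_set = []  # pairs of (cropped plot image, crop label)
for _, plot in labels.iterrows():
    row, col = geo_to_pixel(plot["lat"], plot["lon"], bounds, width, height)
    for contour in detected_plots:  # polygons outlined in Step 2
        # pointPolygonTest returns > 0 when the point lies inside the polygon
        if cv2.pointPolygonTest(contour, (float(col), float(row)), False) > 0:
            x, y, w, h = cv2.boundingRect(contour)
            training_set.append((scene[y:y + h, x:x + w], plot["crop"]))
            break
```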

When working with farm plots, just a few weeks can make a large
difference (Copernicus Sentinel data 2019)

MODELLING

We chose to use a convolutional neural network built with the fastai library, as it was an efficient way to classify our images.

In order to find the best possible classifier, we experimented with the input data:

• Selecting various combinations of color bands (Red, Green, Blue, Near-infrared…)
• Handling neighboring pixels in different ways: making them transparent, white, black… or leaving them untouched

After experimenting with various classification models, we reached 78% accuracy and 74% recall when performing binary classification on the smallest plots (and thus the hardest to classify due to the low number of pixels). A minimal training sketch is shown at the end of this article.

CHALLENGES TO KEEP IN MIND

When working with farm plots, even a few weeks can make a substantial difference. Within a few weeks, wheat crops can go from green to gold to harvested.

Thus, there are two things to keep in mind in order to replicate this project throughout the year:

• You have to build a model for each period of the year.
• Your labelled data containing information about the crops needs to be refreshed regularly.

Conclusion

Working with satellite images opens up an endless range of possibilities. Considering how each satellite provides different features, and how the availability and format of complementary data can vary throughout the world depending on your area of study, every single project will end up as a unique use case.
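As a closing illustration of the modelling step above, here is a minimal training sketch using the fastai library mentioned in the article; the folder layout, image size and number of epochs are assumptions for illustration, not the project’s actual configuration:

```python
from fastai.vision.all import *

# Assumed layout: one sub-folder per crop, e.g. plots/wheat/001.png
path = Path("plots")
dls = ImageDataLoaders.from_folder(
    path,
    valid_pct=0.2,          # hold out 20% of the labelled plots for validation
    item_tfms=Resize(64),   # plot crops are small; 64 px is illustrative
)
# vision_learner is called cnn_learner in older fastai releases
learn = vision_learner(dls, resnet18,
                       metrics=[accuracy, Recall(average="macro")])
learn.fine_tune(10)         # transfer learning from ImageNet weights
```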
WE OFFER END-TO-END
DATA & AI SERVICES

FMCG • RETAIL & ECOMMERCE • LUXURY & COSMETICS • HEALTHCARE • BANKING & INSURANCE • TELECOMMUNICATIONS • SPORTS & ENTERTAINMENT • TRAVEL & TOURISM • PUBLIC & GOVERNMENT • REAL ESTATE • MANUFACTURING & UTILITIES
CONTACT
hello@artefact.com
artefact.com/contact-us

ARTEFACT HEADQUARTERS
19, rue Richer
75009 — Paris
France

artefact.com
