Data Trends and Predictions 2022

Organizations are accelerating culture transformation programs to build data-driven decision making. Data culture involves setting governance, investing in literacy, and hiring skilled talent. Chief data officers lead these efforts to start a virtuous cycle of data use. In 2022, companies will double down on roles, departments, and support for cultural changes. Organizations are also scaling data governance to improve usability and trust in data. Bad data costs trillions annually. As data volumes and complexity grow, teams use data observability to monitor quality in real-time and catch issues earlier. Observability is gaining momentum as companies seek to manage quality at massive scales.

2022 


Data trends
and predictions

2022 Data trends and predictions

In an age of rapid digitization, data literacy is shaping the
dawn of a new era

COVID-19 has led to a massive leap in digital
transformation across all industries.

During the pandemic, companies reacted to the dramatic shift
of consumers towards digital channels by digitizing their
internal operations, supply-chain interactions, and their
product offerings. A survey by McKinsey revealed that
COVID-19 accelerated the adoption of digital technology by
several years, and those changes are here to stay.

Now more than ever, this rapid digitization is leading to an
explosion of data. According to Accenture, by 2025, the world
will produce 463 exabytes of data in a day. This is 92 times the
amount generated daily in 2021, and approximately the size of
the entire world’s storage capacity in 2009.

[Figure: data generated per day, 5 exabytes in 2021 versus 463 exabytes in 2025]

DATACAMP | 01

Companies sitting on treasure troves of data are feeling the
pressure to become data-driven. Not only are data-driven
organizations 23 times more likely to acquire customers,
but they are also 19 times more likely to be profitable.

This pressure will animate many of the stories we’ll see in
the upcoming decade. Similar to how organizations
adapted to the rise of personal computing and the internet
over the past 40 years, they will need to adapt to a new
era that will be defined by data literacy.

[Figure: timeline of literacy eras: computer literacy (1980), software/internet literacy (2000), data literacy (today)]

In 2022, companies embarking on data transformation
programs in hopes of reaping the rewards of a data-driven
culture will find themselves adopting new tools,
technologies, and ways of working, and embracing talent
and culture transformation at scale.

As such, here are nine data science trends and
predictions that you can expect in 2022.

Contents

01 Organizations accelerate culture transformation programs
02 Organizations will scale data governance
03 NLP ushers in a new generation of low-code data tools
04 L&D becomes a fabric of company culture
05 MLOps will continue to mature within organizations
06 Responsible AI becomes more operationalized
07 The rise of the data mesh
08 New generation of tooling will improve the data team’s productivity
09 The talent crunch and flexible work will broaden and improve the search for data talent

01 Organizations accelerate
culture transformation programs

Companies hoping to collect dividends from their wealth of data
might be disappointed when their effort to become data-driven
falls flat. According to the NewVantage Partners 2021 Survey,
the lack of data culture is the culprit that impedes a large
majority of organizations from becoming data-driven.

29% of organizations are experiencing transformational outcomes
with data science and AI.
99% of respondents who are not experiencing transformational
outcomes cite lack of culture and skills as the biggest
impediment for change.
(NewVantage Partners CXO Survey 2021)

Defined as “the collective behaviors and beliefs of people who
value, practice, and encourage the use of data to improve
decision-making”, data culture is the basis to capture
opportunities and value from a company’s data.

Building an organization-wide data culture takes time and
discipline. It involves setting up proper data governance,
investing in data literacy programs, and training or sourcing
talent with specialized data skill sets. More importantly, it
requires data to be integrated seamlessly into business
operations and processes so that data-driven decision-making
becomes the norm.

The onus of building a data culture is on the organization’s
Chief Data Officer (CDO), who is in the driver’s seat of the
culture transformation program. The CDO leads by example,
starting a virtuous cycle of data-driven decision-making that
propagates through the organization.


This year, we highlighted how Allianz Benelux is investing in
data literacy and culture transformation programs. As a strong
data culture becomes ingrained, data-driven decision-making
takes over from gut-based guesswork. Similarly, Gulf Bank
created a data ambassador program to propagate data-driven
decision-making throughout different departments.

In 2022, expect organizations to double down on these culture
transformation programs, by creating dedicated roles and
departments for culture transformation, and by investing in
the supporting levers that help scale a data culture.

“Building a data culture is not an option; it is
business-critical.”
Building Data Cultures Webinar

Sudaman Thoppan Mohanchandralal
Regional Chief Data & Analytics Officer
Allianz Benelux


02 Organizations will scale
data governance

Today, bad data costs $3.1 trillion per year to the US economy
alone, lending truth to the adage of “garbage in, garbage out”.

Data quality is the key ingredient in establishing the data
trust needed for making data-driven decisions.

Organizations are further internalizing the importance of data
quality and will look to further establish proper data
governance, which involves managing the availability,
usability, integrity, and security of an organization’s data.

The size of the data governance market will experience 5X
growth by 2025.
(Data Governance Market - Growth, trends, COVID-19 impact, and
forecasts (2021 - 2026))

Data governance is key to ensuring that internal stakeholders
can access quality data that is compliant, actionable, and
usable. Moreover, the need for data governance grows in tandem
with the demand for self-serve analytics.

Today, companies measure data quality on its accuracy,
completeness, validity, and timeliness. As the data grows in
volume and complexity, data practitioners find it ever more
challenging to monitor these metrics and maintain high-quality
data pipelines in real time.

As such, expect the solidification and emergence of new domains
and categories such as data observability, which addresses that
exact pain point. Simply put, data observability is a collection
of technologies that identify, troubleshoot, and resolve data
issues in near real time. With multiple start-ups (e.g., Monte
Carlo, Databand, and Observe.ai) offering data observability
solutions, implementing real-time observability is now possible.

Data observability allows data teams to detect system-level
changes to the data set and potentially catch data quality
issues as early as possible. This translates to reduced data
downtime and higher data quality.
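
The quality dimensions named in this section can be made concrete with a few ratio checks. The following is a minimal Python sketch, assuming a hypothetical list of order records. The field names (`amount`, `updated_at`), the validity rule, the 24-hour freshness window, and the 0.9 alert threshold are all illustrative assumptions and do not reflect any vendor's product. Accuracy is left out because it requires a trusted reference to compare against.

```python
from datetime import datetime, timedelta

# Hypothetical records; field names and values are illustrative only.
NOW = datetime(2022, 1, 1, 12, 0)
records = [
    {"order_id": 1, "amount": 25.0, "updated_at": NOW - timedelta(minutes=5)},
    {"order_id": 2, "amount": None, "updated_at": NOW - timedelta(minutes=10)},
    {"order_id": 3, "amount": -4.0, "updated_at": NOW - timedelta(hours=30)},
    {"order_id": 4, "amount": 12.5, "updated_at": NOW - timedelta(hours=2)},
]


def completeness(rows, field):
    """Share of rows where the field is present (not None)."""
    return sum(r[field] is not None for r in rows) / len(rows)


def validity(rows, field, rule):
    """Share of non-missing values that satisfy a business rule."""
    values = [r[field] for r in rows if r[field] is not None]
    return sum(rule(v) for v in values) / len(values)


def timeliness(rows, field, now, max_age):
    """Share of rows refreshed within the allowed age."""
    return sum(now - r[field] <= max_age for r in rows) / len(rows)


report = {
    "completeness": completeness(records, "amount"),
    "validity": validity(records, "amount", lambda v: v >= 0),
    "timeliness": timeliness(records, "updated_at", NOW, timedelta(hours=24)),
}

# Flag every metric that falls below the assumed alert threshold.
THRESHOLD = 0.9
alerts = sorted(name for name, score in report.items() if score < THRESHOLD)

print(report)
print(alerts)
```

Running checks like these on a schedule and alerting when a score drops is a simplified picture of what data observability tooling aims to automate at scale.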


As companies seek to amass more data and manage their data
quality at scale, data observability will continue to gain
momentum in 2022.

“The number one challenge for companies that want to become
truly data-driven is to build the trust in data… for companies
that truly want to become data-driven, data quality and data
observability have to be a top priority.”
Creating Trust in Data with Data Observability

Barr Moses
CEO and co-founder
Monte Carlo


03 NLP ushers in a new generation
of low-code data tools

As evidenced by the rapid development of large language models
and the rise of startups such as Hugging Face, the past few
years saw the NLP space progress tremendously.

Over the next year and more, the increasing scalability of such
models will generate an arms race for the biggest and most
effective large language model. For example, 2020’s GPT-3 was
hailed as the largest and most powerful generative language
model to date, only to be recently out-sized by the
Megatron-Turing NLG 530B model, which, at 530 billion
parameters, is roughly three times the size of GPT-3
(175 billion parameters).

[Figure: Large language models get larger over the years]

As large language models grow, they develop groundbreaking
capabilities with transformative impacts on science, society,
and organizations. For example, GPT-3 and Megatron-Turing NLG
demonstrated their ability to perform tasks that they are not
explicitly trained on and generate alternative types of texts,
including computer code and guitar tabs. It is only a matter of
time before the next large language model pushes the boundary
of what natural language processing can do.

Arguably the most impactful use case of large language models
for organizations will be the rise of NLP-based low-code and
no-code tools.

For instance, the integration of GPT-3 into Microsoft’s Power
Apps will empower non-technical users to build an app using
conversational language. OpenAI’s Codex will increase the
productivity of data scientists and empower them to focus on
the data problems they’re trying to solve, and not on
boilerplate code. As no-code and low-code tools lower the
barriers to coding, expect to see a rising influx of citizen
developers and citizen data scientists within organizations.


04 L&D becomes a fabric
of company culture

According to a 2021 PwC survey, 74% of CEOs are concerned by
the lack of key skills within their organizations. This is
especially telling in an age where health crises, uncertain
economic growth, and technological changes are driving systemic
uncertainty for organizations. Learning and development is
proving to be the silver bullet for the lack of key skills, with
the World Economic Forum predicting 38% of additional global
GDP gained from upskilling by 2030.

Over the past year, the shift to digital drastically changed
the views of executives and leaders on learning and
development. As learning programs helped employees stay
productive at home during the pandemic, many C-suite leaders
began to recognize the value of L&D. As a result, the
percentage of L&D departments with leadership reporting to the
C-suite climbed from 24% at the start of the pandemic (March
2020) to 63% a year later (March 2021).

As the world settles into the new normal of remote work and the
need for culture and skill transformation rises, L&D budgets
will be dedicated to creating vibrant learning ecosystems that
provide effective learning and communities of practice to drive
the era of data literacy.

As such, look for companies to scale their upskilling and
reskilling programs by building internal data science skill
academies that drive a learning culture within the
organization.

“As we’re entering the year, organizations are facing the need
for the biggest upskilling and reskilling programs ever.”
Learning and Development in the Data-Driven Age

Marcus Robertson
Global Curriculum Lead
NatWest Group


05 MLOps will continue to
mature within organizations

MLOps entered the vernacular of the machine learning community
in 2019 and gained tremendous traction in the years since, and
rightfully so. Creating a proof of concept with machine
learning today is simpler than ever, but 87% of models never
make it into production. Companies wanting to extract value
from machine learning cannot do so at scale without
production-level AI systems. This is where MLOps comes into
play.

MLOps is a set of practices that combine machine learning, data
engineering, and DevOps skill sets. More broadly, it is a
combination of tooling, culture, and processes that help ensure
machine learning models continuously deliver the experience
that they were designed to deliver. It involves standardized
processes that automate machine learning deployment workflows,
from data management to model training and deployment. For
instance, Uber has successfully implemented MLOps principles to
provide real-time predictions at scale.

“Data scientists will have to think about their models in
post-production since that’s when machine learning starts
generating value.”
Operationalizing Machine Learning with MLOps

Alessya Visnjic
CEO and Founder
WhyLabs

A report by Cognilytica estimated that the MLOps market will be
valued at $126.1 billion by 2025. The MLOps market is currently
dominated by startups, which have collectively raised $3.4
billion. Kubeflow, Algorithmia, and MLflow are a few examples
of MLOps tools at the frontier of scaling enterprise-level AI.

In the next year and beyond, the MLOps stack will grow in
maturity and become more standardized, and more data science
teams will begin to adopt MLOps for deploying, managing, and
monitoring their machine learning models in production.


06 Responsible AI becomes
more operationalized
A 2021 investigation by The Markup found that lenders were 80%
more likely to reject Black mortgage applicants than comparable
white applicants, pointing to bias in AI algorithms.

Separately, an internal study by Twitter in 2021 revealed that
its image-cropping algorithm favored white faces over Black
faces. These are but two examples of AI learning and amplifying
existing human biases in 2021 alone.

Companies seeking to extract value from AI must ensure that
their implementation of AI remains fair and responsible.
Companies that fail to do so are doing themselves a disservice
with massive damage to their brands and reputation. More
importantly, not only can biases in AI become a PR disaster,
but they also reproduce existing prejudices and exacerbate
inequality.

Regulators globally are paying more attention to AI systems.
The European Union is the first governmental body to issue a
draft of comprehensive AI regulations. Since 2020, any
automated decision-making system delivered to the Canadian
federal government must go through a thorough impact
assessment. It is only a matter of time before other nations
follow suit and mandate that any use of AI is responsible.

That is why companies are increasingly adopting and
operationalizing principles of Responsible AI, which aim to
ensure that AI remains fair, interpretable, privacy-preserving,
and secure. Google, Microsoft, and Capgemini have published AI
codes of ethics and taken steps towards such goals. Companies
looking to operationalize principles of Responsible AI should
look into adopting frameworks, tools, and processes. An example
of a framework is PwC’s Responsible AI Toolkit, which addresses
the various dimensions of Responsible AI, including governance,
privacy, security, safety, and robustness, among others.


“We need to be thinking about ethics, risk identification, and
accountability. We need to make sure that we not only enjoy the
benefits of AI, but also remain in control, understand what can
go wrong, and how best to deal with it with responsible AI.”
The Future of Responsible AI

Maria Luciana Axente
Responsible AI and AI for Good Lead
PwC UK
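
One way to make the fairness goal in this section concrete is to compare outcome rates across groups. The following is a minimal Python sketch, assuming a hypothetical list of approval decisions labeled by group. The group names and decisions are invented for illustration and are not real lending data. It computes two commonly used summary measures, the demographic parity difference and the disparate impact ratio. Both are raw comparisons that do not control for other applicant characteristics, so they are a starting point for an audit and not a verdict.

```python
from collections import defaultdict

# Hypothetical decisions as (group, approved) pairs; not real lending data.
decisions = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]


def approval_rates(rows):
    """Return the share of approved decisions for each group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in rows:
        totals[group] += 1
        approved[group] += ok  # True counts as 1, False as 0
    return {group: approved[group] / totals[group] for group in totals}


rates = approval_rates(decisions)

# Demographic parity difference: gap between highest and lowest approval rate.
parity_gap = max(rates.values()) - min(rates.values())

# Disparate impact ratio: lowest approval rate divided by the highest.
impact_ratio = min(rates.values()) / max(rates.values())

print(rates, parity_gap, impact_ratio)
```

A parity gap near zero and an impact ratio near one indicate similar approval rates across groups. What counts as an acceptable value is a policy decision that depends on context.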


07 The rise of the data mesh

Today, the data lake is one of the most widely adopted data
architectures. Yet, its shortcomings have prompted the rise of
the data mesh.

The data lake is centralized, has highly coupled pipeline
architectures, and is operated by siloed data engineers. Such a
centralized architecture cannot meet the needs of enterprises
with rich business domains and diverse data sources. Not only
that, the high degree of coupling between the various stages of
a data pipeline creates bottlenecks, leaving existing
infrastructure struggling to keep up with new data sources.
Moreover, data engineers are often disconnected from users.

That is why some companies are moving away from data lakes in
favor of data mesh. A relatively new concept coined by Zhamak
Dehghani, a data mesh is a data architecture that addresses the
limitations of existing data lakes. A monolithic data
infrastructure has a central data lake, responsible for the
consumption, storage, transformation, and output of all data.
In contrast, a data mesh has distributed “data products”, each
handled by a cross-functional team of data engineers and
product owners.

Organizations that implement data mesh can benefit from the
higher speed of data delivery, stronger data governance, and
greater business domain agility. This is especially true for
large enterprises with varied data sources, large data teams,
and rich data domains.

Data teams at Zalando, Intuit, and JPMorgan Chase have already
implemented the data mesh architecture. As the benefits of a
data mesh become apparent, we expect more enterprises to
experiment with the data mesh architecture. Since implementing
a distributed data mesh requires a paradigm shift from existing
architectures, such a change will be gradual over 2022 and
beyond.


08 New generation of tooling will
improve the data team’s productivity

As data scientists become more sought after and the modern data
stack evolves, we’ll see a new generation of tools that will
increase the data team’s productivity. These tools can
eliminate the need for manual work, freeing up data scientists’
time to perform higher-value non-routine tasks like data
pre-processing, feature engineering, and model deployment.

AutoML Tools

Many machine learning projects share similar processes of
hyperparameter tuning and model selection. Recognizing that
these tasks are repetitive and time-consuming, creators of
AutoML tools promise to automate these tasks efficiently.

Data science teams are spoilt for choice for AutoML tools, each
with its strengths. While some like H2O AutoML and Python’s
auto-sklearn are focused on modeling traditional machine
learning algorithms, others like Auto-PyTorch allow for tuning
deep learning architectures.

Data science collaboration tools

Data science teams lament the difficulty in collaborating
across the data science workflow. It is often challenging for
multiple data scientists to collaboratively write code when
performing data exploration and ML modeling. As a result, data
teams need to manually hand off code, which can be error-prone
and time-consuming.

A new wave of data science collaboration tools addresses these
pain points. For instance, Databricks offers a unified platform
to collaboratively run analytics workloads across the data
science workflow, and DataCamp Workspace will in the future
allow data scientists to collaborate both asynchronously and in
real time.
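
To illustrate the kind of repetitive search that AutoML tools set out to automate, here is a minimal Python sketch of hyperparameter tuning by grid search with a holdout set. It is a hand-rolled example and does not use the API of H2O AutoML, auto-sklearn, or Auto-PyTorch. The toy data, the k-nearest-neighbors model, and the candidate values of `k` are all assumptions made for illustration.

```python
import random


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (one feature)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(train, valid, k):
    """Share of validation points that the k-NN model classifies correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)


# Hypothetical toy data: the label is 1 when the feature exceeds 0.5,
# and roughly 10% of labels are flipped to simulate noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.1:
        y = 1 - y
    data.append((x, y))

# Hold out the last 50 points for validation.
train, valid = data[:150], data[150:]

# Grid search: score every candidate k (odd values avoid tied votes)
# and keep the one with the best validation accuracy.
candidates = [1, 3, 5, 11, 21]
scores = {k: accuracy(train, valid, k) for k in candidates}
best_k = max(scores, key=scores.get)

print(scores)
print("best k:", best_k)
```

AutoML tools generalize this loop by searching over many model families and settings at once, typically with smarter search strategies and more robust validation than a single holdout split.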


Synthetic data generation tools

On the other hand, the data-centric AI approach popularized by
Andrew Ng has ushered in a new generation of tools for
improving data quality. Most notably, synthetic data generation
tools are built to generate labeled data at scale. Not only
does synthetic data hold the promise to eliminate the need for
manual data labeling or data collection, but it can also remove
biases that arise with real-world data.

“With AutoML, a lot of the problems that data scientists work
on are likely going to be streamlined. Therefore to remain
competitive, data teams need to improve their skillset to be
ahead of the curve so that the value that they can bring is
more than just .fit() .predict()”
Preventing Fraud and Boosting eCommerce with Data Science

Elad Cohen
VP of Data Science
Riskified


09 The talent crunch and flexible
work will broaden and improve
the search for data talent

The Great Resignation saw 4.3 million Americans quit their jobs
in August 2021 alone. This problem is particularly pressing for
the tech industry, where resignations rose by 4.5% since last
year.

As employees leave their jobs in droves, employers face the
formidable task of retaining existing employees and hiring new
ones, including data talent.

Employers are doing what they can to stop the mass exodus. They
are offering not just better compensation, but also greater
flexibility to work remotely. Apple, Google, Facebook, and
Amazon are some examples of tech companies that have delayed
their return to the office until 2022. As the pandemic
stretches on, it is clear that remote work is here to stay,
according to Laura Boudreau, a Columbia University economics
professor.

As working from home becomes more prevalent, companies are
becoming less restricted by geographical boundaries in hiring
talent. Companies seized the opportunity to widen their talent
pool during the pandemic, as evident from the 280% increase in
remote job postings on LinkedIn since March 2020.

In response, 46% of remote workers are planning to relocate in
2021, according to a survey by Microsoft. Against the backdrop
of remote work and distributed teams in 2022, we foresee that
companies will prioritize skills over zip codes in their hiring
policies.

Looking to hire data talent?
Sign up for DataCamp Talent and simplify your hiring process.

Want to succeed in the era of data literacy?

Bridge your team's data literacy gap and become more data-driven.

Explore DataCamp for Business
