MIT Dremio A New Paradigm For Managing Data
A new paradigm for managing data
2 MIT Technology Review Insights
Key takeaways

1. Business leaders recognize the imperative to build a data-driven culture, but they are challenged by enormous amounts and varying types of data, as well as their legacy data management systems.

2. Functioning as a single environment to capture all types of data while also enabling business intelligence and analytics, a data lakehouse can be a “best-of-both-worlds” data architecture solution.

3. A data lakehouse unites disparate data types and use cases, providing simple, self-service data access across the organization, while also simplifying IT workloads.

Regeneron Pharmaceuticals, a biotechnology company that develops life-transforming medicines, found itself inundated with vast volumes of data during the peak of the covid-19 pandemic. In order to derive actionable information from these disparate data sets, which ranged from clinical trial data to real-time supply chain information, the company needed new ways to join and relate them, regardless of what format they were in or where they came from.

Shah Nawaz, chief technology officer and vice president of digital technology and engineering at Regeneron, says, “At the time, everybody in the world was reporting on their covid-19 findings from different countries and in different languages.” The challenge was how to make sense of these massive data sets in a timely manner, assisting researchers and clinicians, and ultimately getting the best treatments to patients faster. After all, he says, “when you’re dealing with large-scale data sets in hundreds, if not thousands, of locations, connecting the dots can be a complex problem.”

Regeneron isn’t the only company eager to derive more value from its data. Despite the enormous amounts of data they collect and the amount of capital they invest in data management solutions, business leaders are still not benefitting from their data. According to IDC research, 83% of CEOs want their organizations to be more data driven, but they struggle with the cultural and technological changes needed to execute an effective data strategy.

In response, many organizations, including Regeneron, are turning to a new form of data architecture as a modern approach to data management. In fact, by 2024, more than three-quarters of current data lake users will be investing in this type of hybrid “data lakehouse” architecture to enhance the value generated from their accumulated data, according to Matt Aslett, a research director with Ventana Research.

“Data lakehouse” is the term for a modern, open data architecture that combines the performance and optimization of a data warehouse with the flexibility of a data lake. But achieving the speed, performance, agility, optimization, and governance promised by this technology also requires embracing best practices that prioritize corporate goals and support enterprise-wide collaboration.
A look inside data lakehouse capabilities
At the core of data lakehouse technology are five key features.
[Figure: 56% of organizations manage 10+ data sources; 32% of organizations manage 20+ data sources; 58% of organizations use “big data.” Source: The Evolving World of Analytics and Data: Market Insights from Benchmark Research, Ventana Research, 2022]

to better support data from multiple sources and to be used for multiple use cases and applications.”

Nawaz attests to the value of bringing data together at Regeneron. He explains, “By joining cross-functional domain data, regardless of where it sits, data lakehouse technology enables us to create an entire value chain story from early discovery all the way to commercialization of new products.”

From simplification to greater collaboration

This data management architecture can benefit business leaders and IT teams alike. Chief among the advantages is a data lakehouse’s ability to deliver secure, self-service data access, liberating data with live, interactive queries directly on Amazon S3, Microsoft Azure Data Lake Storage, HDFS, or another S3 storage solution. The
result is greater self-sufficiency and faster insights for data consumers at a time when organizations can’t afford to waste time preparing data for analysis.

This architecture provides users with easy access to data for a wide variety of tasks. A marketing manager, for example, may wish to reduce customer churn while a data analytics team leverages data to predict factory maintenance issues. In the case of Regeneron, the company improves the lives of patients by combining data, both structured and unstructured, in a single, centralized repository. “If we’re trying to address our patients’ needs, we strongly feel that there has to be a connected data ecosystem so that we can respond much quicker,” says Nawaz. Whatever the scenario, employees at Regeneron are empowered to discover, curate, analyze, and share datasets with a distinctly self-service mindset.

IT teams also benefit from data lakehouse technology. By simplifying infrastructure, a data lakehouse can significantly ease the burden on time-strapped IT teams. “There are advantages in only having one environment to manage,” says Aslett. In fact, he says, not only does data lakehouse technology “consolidate multiple different data spread across the organization,” but it can also “consolidate the numerous platforms companies have, reduce data silos, improve knowledge sharing, and enhance information flows.”

Another advantage of a data lakehouse is its power to encourage enterprise-wide collaboration. “When all this technology was separate, people and processes were separate,” says Shiran. “There was a separate warehouse team, a separate BI team, a separate data science team, and a separate data lake team. However, once you merge technology in a way that works for all these use cases, it can change a company’s culture. It’s really an opportunity to bring together people and processes.”

In fact, “by bringing data together” while encouraging “knowledge sharing across the organization,” Aslett says data lakehouse technology can drive innovation, enabling teams “to develop new projects, new initiatives, and new ideas” for a distinctly competitive edge.

Strategies for data management success

To fully realize the benefits of modern data architecture, organizations must establish best practices. One such practice is viewing the modernization of IT infrastructure as not only a technological feat but also as a critical cultural shift.

“It’s a people, process, and technology issue,” says Shiran. As a result, he says, “Organizations have to embrace cultural changes, especially well-established companies that have legacy systems and IT processes and architecture.” Nawaz agrees. Creating a data lakehouse “is a paradigm shift,” he says. “We’re helping to shape how our organization thinks about analytics as a whole.”

For many organizations, this means adopting a data-centric approach to all aspects of the business. “Organizations that are more successful are making
cultural changes to make data the focal point of the organization, in terms of driving the development of new products and initiatives,” says Aslett.

But facilitating any type of cultural shift requires C-suite commitment. “Cultural change has to come from the top and it has to be driven by leaders,” says Aslett. “A lack of leadership buy-in can be a real impediment to success with data analytics.”

Another essential is developing a deep understanding of how your data architecture will serve your business needs. “People talk about data and it sounds great, but at the end of the day, what’s in it for the business? Is it really making an impact?” asks Nawaz.

To answer these important questions, companies must consider what they hope to achieve from their data management efforts. Nawaz says Regeneron invested heavily in the “thought process and design thinking” around building its modern data architecture.

“From the laboratory system to the shop floor system, nearly twenty different systems contain supply chain data,” he says. “But how do you streamline supply chain analytics? In order to run the supply chain business, we need to join all these data streams together. That’s one area where data lakehouse technology can be applied.”

Not only are use cases growing, so too are the potential beneficiaries of this technology. In the future, Aslett says, data lakehouse capabilities will be “more suitable for supporting self-service analytics where business leaders and senior executives will access data themselves rather than have reports and dashboards created for them.”

For now, though, data lakehouse technology is helping companies like Regeneron unite disparate data types and a wide variety of workloads in a single, big-data storage solution. After all, says Nawaz, “To respond to patient needs effectively, we believe that all these dots need to be connected.”
“A new paradigm for managing data” is an executive briefing paper by MIT Technology Review Insights. We would like to
thank all participants as well as the sponsor, Dremio. MIT Technology Review Insights has collected and reported on all
findings contained in this paper independently, regardless of participation or sponsorship. Laurel Ruma and Teresa Elsey
were the editors of this report, and Nicola Crepaldi was the publisher.
Illustrations
Cover art by Adobe Stock and spot illustrations created by Chandra Tallman with icons by The Noun Project and Adobe Stock.
While every effort has been taken to verify the accuracy of this information, MIT Technology Review Insights cannot accept any responsibility or liability for reliance by any person on this report or any of the information, opinions, or conclusions set out in this report.