Oracle to Snowflake
MIGRATION GUIDE
Migration strategies and best practices
Why Migrate?
Strategy — Thinking About Your Migration
Migrating Your Existing Oracle Warehouse
Need Help Migrating?
Appendix A — Migration Tools
Appendix B — Data Type Conversion Table
Appendix C — SQL Considerations
About Snowflake
CHAMPION GUIDES
WHY MIGRATE?

Oracle has had a role in relational databases and data warehouses for over 30 years. With the introduction of engineered systems such as Exadata, Exalytics, Exalogic, SuperCluster, and the 12c Database, the tight integration of storage and compute enabled faster processing of larger amounts of data with on-premises infrastructure. However, the volume, velocity, and variety of data have increased dramatically, and the cloud has enabled greater possibilities with modern data analytics. For example, by separating compute from storage, Snowflake has developed a modern cloud data platform that automatically and instantly scales compute and storage in a way not possible with Oracle, whether the current Oracle system is on-premises or hosted in the cloud. Snowflake Cloud Data Platform accomplishes this with its multi-cluster, shared data architecture.

The industry has shifted to on-demand, as-a-service models with minimal management or intervention, but Oracle’s recent database cloud options are not quite ready for prime time. At a minimum, they are little more than existing server technologies hosted in Oracle data centers.

New data sources and workloads are already in the cloud: The cloud also allows for new types of analytics to be assessed and refined without a long-term commitment to infrastructure or specific tools.

The cost is affordable and predictable: Snowflake allows for true pay-as-you-go storage and compute scalability without the need for complex reconfiguration as your data or workloads grow.

WHY SNOWFLAKE?

Snowflake’s innovations break down the technology and architecture barriers that organizations still experience with other data warehouse vendors. Only Snowflake has achieved all six of the defining qualities of an effective cloud data platform. Snowflake delivers the performance, concurrency, and simplicity needed to store and analyze all data available to an organization, in one location and at a fraction of the cost of traditional solutions.

NEAR-ZERO MAINTENANCE: Snowflake reduces complexity with built-in performance, so there’s no infrastructure to tweak and no tuning required.

FASTER ANALYST ACCESS TO DATA: Snowflake’s elastic, near-unlimited scale and speed means analysts have fast access to all current and historical data at any time to make quicker, more accurate decisions.
STRATEGY — THINKING ABOUT YOUR MIGRATION

WHAT SHOULD YOU CONSIDER?

There are several things to contemplate when choosing your migration path. Many organizations pilot the migration on a subset of the data and processes. Then they migrate in stages, reducing risk and showing value sooner. However, you must balance risk mitigation against the need to maintain program momentum and minimize the period of running systems in parallel. In addition, your approach may be constrained by interrelationships within the data, such as data marts that rely on references to data populated via a separate process in another schema.

Questions to ask about your workloads and data:

• What workloads and processes can you migrate with minimal effort?

• Which processes have issues today and would benefit from re-engineering?

• What workloads are outdated and require a complete overhaul?

• What new workloads would you like to add that would deploy easier in Snowflake?

The answers will depend on the nature of your current data analytics platform, the types and number of data sources, and your future ambitions and time frames.

Consider moving data in one bulk transfer if you have any of the following:

• Highly integrated data across the existing warehouse

• A single independent, standalone data mart

• Well-designed data and processes using standard ANSI SQL

• A need to move off legacy equipment quickly
WHAT YOU DON’T NEED TO WORRY ABOUT

When migrating to Snowflake from Oracle, you can ignore the following factors because they are no longer relevant:

Data distribution and primary indexes
In Snowflake, you don’t need to worry about primary key indexes, data distribution, or data skewing. Snowflake also eliminates the need to manage table partitions or sub-partitions. Because compute is separate from storage in Snowflake’s architecture, massively parallel processing (MPP) compute nodes do not rely on the data being distributed ahead of time.

Indexing and query optimization
Snowflake has an optimizer built from the ground up and architected for MPP and the cloud. Snowflake understands parallel execution plans and automatically optimizes them, relieving you of this task. Because Snowflake does not use indexes, you don’t have to migrate your indexes and partitions.

Workload management
Workload management is unnecessary in a Snowflake environment due to its multi-cluster architecture, which allows you to create separate compute clusters for your disparate workloads to avoid resource contention completely. Additionally, Snowflake scales compute up or down automatically based on demand.

Statistics collection
Snowflake automatically captures statistics, relieving DBAs from the mundane tasks of setting up jobs to collect statistics for performance tuning. Snowflake performs these tasks in real time as data is loaded, so the metadata is always up to date. You no longer have to add new tables to the process as your data grows.

Capacity planning
With Snowflake, you pay for only what you use. Snowflake is a SaaS product that is further enhanced for efficiency with per-second, usage-based pricing. Under this model, Snowflake also offers further cost reductions for customers who want to pre-purchase usage. On the flip side, with capacity planning for on-premises Oracle engineered systems (Exadata and SuperCluster), you run the risk of over- or under-configuring your system, especially with high-availability (HA) configurations such as Oracle Real Application Clusters (RAC), where multiple servers are required. Even with Oracle Database Cloud Services, you have a similar capacity planning risk because compute and storage are fixed per instance. If you need more capacity within an Oracle cloud service, you must buy in predefined increments. Snowflake’s elastic storage and compute architecture eliminates this risk, so you can save money and avoid the time previously spent on extensive planning.
Disaster recovery
The Oracle database has several disaster recovery
scenarios, such as Active Data Guard and the Zero
Data Loss Recovery Appliance (ZDLRA). Many
of them require you to buy additional hardware,
software licenses, and network infrastructure.
But Snowflake leverages the built-in features of its
cloud infrastructure providers. By design, Snowflake
is automatically synced across multiple availability
zones. No work is required on your part to
establish this.
MIGRATING YOUR EXISTING ORACLE WAREHOUSE

To successfully migrate your enterprise data warehouse to Snowflake Cloud Data Platform, develop and follow a plan that includes the steps presented in this section.

MOVING YOUR DATA MODEL

As a starting point for your migration, you need to move your database objects, including databases, tables, views, and sequences, from Oracle to Snowflake. In addition, you may want to include all of your user account names, roles, and object grants. At a minimum, create the user who owns the Oracle database or schema on the target Snowflake system before you migrate data. Your choice of which objects to move depends on the scope of your initial migration.

Using a data modeling tool
If you generate DDL scripts from a data modeling tool, most of that DDL will execute in Snowflake without change. Keep in mind that Snowflake is self-tuning and has a unique architecture. You won’t need to generate code for any indexes, partitions, or storage clauses that you may have needed in an Oracle database. You need only basic DDL, such as CREATE TABLE, CREATE VIEW, and CREATE SEQUENCE. After you have these scripts, you can log into your Snowflake account to execute them through the UI or the command-line tool SnowSQL.

If you have a data modeling tool, but the model is not current, we recommend you reverse engineer the current design into your tool, then follow the approach outlined above. Follow this link to learn how to connect Oracle SQL Developer Data Modeler to Snowflake.

Using existing DDL scripts
If you don’t have a data modeling tool, you can begin with the most recent version of your existing DDL scripts (in a version control system). Edit these scripts to remove code for extraneous features and options not needed in Snowflake, such as indexes, tablespace assignments, and other storage or distribution-related clauses. Depending on the data types you used in Oracle, you may also need to do a search
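To illustrate, here is a minimal sketch of the kind of basic DDL that carries over (the table, sequence, and view names are hypothetical examples, not objects from this guide):

```sql
-- Basic Snowflake DDL: no indexes, tablespaces, or storage clauses needed.
-- Object names below are hypothetical; adjust to your own model.
CREATE SEQUENCE order_seq START = 1 INCREMENT = 1;

CREATE TABLE orders (
    order_id   NUMBER(38,0) DEFAULT order_seq.NEXTVAL,
    customer   VARCHAR(200),
    order_date DATE,
    amount     NUMBER(12,2)
);

CREATE VIEW recent_orders AS
    SELECT order_id, customer, amount
    FROM orders
    WHERE order_date > DATEADD(day, -30, CURRENT_DATE);
```

Note what is absent: no STORAGE clause, no TABLESPACE assignment, and no CREATE INDEX statements.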
and replace in the scripts to change some of the data types to Snowflake-optimized types. For a list of these data types, see Appendix B.

Creating new DDL scripts
If you don’t have a data modeling tool or current DDL scripts, you will need to extract the metadata needed from the Oracle data dictionary to generate these scripts. This task is somewhat simplified for Snowflake since you won’t need to extract metadata for indexes and storage clauses. Depending on the data types in your Oracle design, you may also need to change some of the data types to Snowflake-optimized types. You will likely need to write a SQL extract script to build the DDL scripts. Rather than do a search and replace after the script is generated, you can code these data type conversions directly into the metadata extract script, which lets you automate the extract process and execute the move iteratively. Plus, you will save time editing the script later. Additionally, coding the conversions into the script is less error-prone than any manual clean-up process, especially if you are migrating hundreds or even thousands of tables.

MOVING YOUR EXISTING DATA SET

After building your objects in Snowflake, move the historical data loaded in your Oracle system to Snowflake. You can use a third-party migration tool (see Appendix A), an ETL tool, or a manual process. When choosing an option, consider how much data you have to move. For example, to move tens or hundreds of terabytes up to a few petabytes of data, a practical approach is to extract the data to files and move it via a service such as AWS Snowball or Azure Data Box. If you have to move hundreds of petabytes or even exabytes of data, AWS Snowmobile or Azure Data Box are available options.

If you choose to move your data manually, you will need to extract the data for each table to one or more delimited flat files in text format. Use one of the many methods available to the Oracle database, such as PL/SQL routines using utl_file or SQLcl, to pump the data out to the desired format. Then upload these files using the PUT command to an internal stage, or place them in an external Amazon S3 staging bucket. These files should be between 100 MB and 1 GB to take advantage of Snowflake’s parallel bulk loading. After you have extracted the data and moved it to the stage, you can begin loading the data into your table in Snowflake using the COPY command. See more details about the COPY command in the Snowflake online documentation.
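The metadata-extract approach described above under "Creating new DDL scripts" can be sketched as a query against the Oracle data dictionary that emits Snowflake column definitions with the type conversions coded in. The schema and table names below are hypothetical, and the mappings shown are a small illustrative subset (see Appendix B for more):

```sql
-- Run against Oracle: emits one Snowflake column definition per row.
-- Schema/table names are hypothetical; type mappings are illustrative only.
SELECT column_name || ' ' ||
       CASE data_type
           WHEN 'VARCHAR2'  THEN 'VARCHAR(' || data_length || ')'
           WHEN 'NVARCHAR2' THEN 'VARCHAR(' || data_length || ')'
           WHEN 'NUMBER'    THEN 'NUMBER(' || NVL(data_precision, 38)
                                 || ',' || NVL(data_scale, 0) || ')'
           WHEN 'DATE'      THEN 'DATE'
           WHEN 'CLOB'      THEN 'VARCHAR'
           ELSE data_type
       END || ','
FROM all_tab_columns
WHERE owner = 'MY_SCHEMA'       -- hypothetical schema
  AND table_name = 'MY_TABLE'   -- hypothetical table
ORDER BY column_id;
```

Wrapping this in a loop over all tables in the schema is what makes the extract repeatable across hundreds or thousands of tables.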
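The manual file-move flow described above might look like the following in SnowSQL, shown here as a sketch with hypothetical file and table names (this uses the table’s internal stage; an external S3 stage would skip the PUT step):

```sql
-- In SnowSQL: upload the extracted flat file to the table's internal stage
-- (PUT compresses to .gz by default), then bulk-load it with COPY.
PUT file:///tmp/orders_extract.csv @%orders;

COPY INTO orders
FROM @%orders/orders_extract.csv.gz
FILE_FORMAT = (TYPE = 'CSV' FIELD_DELIMITER = '|' SKIP_HEADER = 1);
```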
Another common change relates to the formatting of date constants used for comparisons in predicates. In Oracle it looks like this:

where my_date_datatype > '01-JAN-17';

Or

where to_char(my_date_datatype, 'YYYY-MM-DD') > '2017-01-01';

Or

where my_date_datatype > to_date('2017-01-01', 'YYYY-MM-DD');

In Snowflake it looks like this:

where my_date_datatype > cast('2017-01-01' as date)

Alternatively, in Snowflake you can also use this form:

where my_date_datatype > '2017-01-01'::date

Migrating BI tools
Many of your queries and reports are likely to use an existing BI tool. Therefore, you’ll need to account for migrating those connections from Oracle to Snowflake. You’ll also have to test those queries and reports to be sure you’re getting the expected results. This should be simple since Snowflake supports standard ODBC and JDBC connectivity, which most modern BI tools use. Many of the mainstream tools have native connectors to Snowflake. Check the Snowflake website to see if your tools are supported. Don’t worry if your tool of choice is not listed. You should be able to establish a connection using either ODBC or JDBC. If you have questions about a specific tool, your Snowflake contact will be happy to help.

Handling workload management
As stated earlier, the workload management required in Oracle is unnecessary with Snowflake. The multi-cluster architecture of Snowflake allows you to create separate virtual warehouses for your disparate workloads to avoid resource contention completely. Your workload management configuration from the Oracle Database Resource Manager, the I/O Manager, and any Instance Caging settings will give you a good idea of how to set up Snowflake compute resources. However, you’ll need to consider the optimal way to distribute these in Snowflake. As a starting point, create a separate compute resource for each workload, then size the resource according to the resources required to meet the SLA for that workload. Consider the following:

• Is there a specific time period in which this workload needs to complete? Between certain hours? You can easily schedule any Snowflake virtual warehouse to turn on and off or to automatically suspend and resume when needed.

• How much compute will you need to meet that window? Use that estimate to determine the appropriate compute resource size.

• How many concurrent connections will this workload need? If you normally experience bottlenecks, you may want to use the Snowflake multi-cluster resource for those use cases to enable automatic scale-out during peak workloads.

• Think about dedicating at least one large compute resource for tactical, high-SLA workloads.

• If you discover a new workload, you can easily add it on demand with Snowflake’s ability to instantly provision a new compute resource.

MOVING THE DATA PIPELINE AND ETL PROCESSES

Snowflake is optimized for an ELT approach. However, Snowflake supports many traditional ETL and data integration solutions. We recommend a basic migration of all existing data pipelines and ETL processes to minimize the impact on your project unless you are planning to significantly enhance or modify them. Because testing and data validation are key elements of any changes to the data pipeline, a basic migration keeps that effort small: repoint the existing processes at Snowflake, and you should see a dramatic improvement in performance.

For data pipelines that require re-engineering, you can leverage Snowflake’s scalable compute and bulk-loading capabilities to modernize your processes and increase efficiency. You may consider taking advantage of Snowpipe for loading data continuously as it arrives to your cloud storage provider of choice, without any resource contention or impact to performance. Snowflake makes it easy to bring in large data sets and perform transformations at any scale.

Snowpipe enables loading data from files as soon as they’re available in a stage. This means you can load data from files in micro-batches, making it available to users within minutes, instead of manually executing COPY statements on a schedule to load larger batches.

A pipe is a named, first-class Snowflake object that contains a COPY statement used by the Snowpipe REST service. The COPY statement identifies the source location of the data files (a named stage) and a target table. Snowflake supports structured and semi-structured data, including semi-structured data types such as JSON and Avro.

[Figure: continuous data flow — data files arriving in a stage are loaded into Snowflake by Snowpipe.]
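A pipe as described above might be defined like this; the pipe, stage, and table names are hypothetical examples:

```sql
-- A pipe wraps a COPY statement; Snowpipe executes it as new files
-- arrive in the stage. Object names are hypothetical.
CREATE PIPE orders_pipe
    AUTO_INGEST = TRUE
AS
    COPY INTO orders
    FROM @orders_stage
    FILE_FORMAT = (TYPE = 'JSON');
```

With AUTO_INGEST enabled, cloud storage event notifications trigger the load; otherwise the Snowpipe REST API can be called to register new files.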
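The warehouse-per-workload starting point described earlier under "Handling workload management" can be sketched as follows; the warehouse names, sizes, and thresholds are hypothetical and should be tuned to your own SLAs:

```sql
-- One virtual warehouse per workload, sized and scheduled independently.
-- Names, sizes, and thresholds below are hypothetical examples.
CREATE WAREHOUSE etl_wh
    WAREHOUSE_SIZE = 'LARGE'
    AUTO_SUSPEND   = 300     -- suspend after 5 idle minutes
    AUTO_RESUME    = TRUE;

CREATE WAREHOUSE bi_wh
    WAREHOUSE_SIZE    = 'MEDIUM'
    MIN_CLUSTER_COUNT = 1    -- multi-cluster: scale out under peak
    MAX_CLUSTER_COUNT = 3    -- concurrency, scale back in when idle
    AUTO_SUSPEND      = 60
    AUTO_RESUME       = TRUE;
```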
CUT OVER

After you migrate your data model, data, loads, and reporting to Snowflake, plan your switch from Oracle to Snowflake. Here are the steps:

1. Execute a historic, one-time load to move all of the existing data.

[Figure: How to migrate to Snowflake — migrate schema (maintain existing schema), migrate data, build data pipeline, build metadata catalog, migrate users (role-based security).]
NEED HELP MIGRATING?

Snowflake is available to accelerate your migration, structure and optimize your planning and implementation activities, and apply customer best practices to meet your technology and business objectives. Snowflake‘s Engagement, Delivery, and Advisory Services Team deploys a powerful combination of data architecture expertise and advanced technical knowledge of the platform to deliver high-performing data strategies, proofs of concept, and migration projects.

Our global and regional solution partners also have extensive experience performing proofs of concept and platform migrations. They offer services ranging from high-level architectural recommendations to manual code conversions. Many Snowflake partners have also built tools to automate and accelerate the migration process.

Whether your organization is fully staffed for a migration or you need additional expertise, Snowflake and our solution partners have the skills and tools to accelerate your journey to Snowflake Cloud Data Platform, so you can reap the full benefits quickly. To find out more, contact the Snowflake sales team or visit Snowflake’s Customer Community Lodge.
APPENDIX A:
MIGRATION TOOLS
Snowflake ecosystem partners may have offerings that can help with your migration from Oracle to
Snowflake. For more information, or to engage these partners, contact your Snowflake representative.
WHERESCAPE® AUTOMATION FOR SNOWFLAKE
WhereScape automation for Snowflake accelerates your ability to start using Snowflake for new and ongoing cloud data infrastructure projects. WhereScape can help you migrate data warehouses, data vaults, data lakes, and data marts.

ANALYTIX DS
Mapping Manager
If you need to migrate from one ETL tool or method to a new one, consider Mapping Manager. This tool uses a metadata-driven approach to migrate ETL/ELT code from one tool vendor to another.

WIPRO
Wipro has developed a tool to assist with migration efforts: the CDRS Self Service Cloud Migration Tool (patent pending). It provides an end-to-end, self-service data migration solution for migrating your on-premises data warehouse to Snowflake. This includes Snowflake-specific optimizations.
APPENDIX B:
DATA TYPE CONVERSION TABLE
This appendix contains a sample of some of the data type mappings you need to know when moving
from Oracle to Snowflake. Many mappings are the same, but you will need to change a few of them.
ORACLE                      SNOWFLAKE          NOTES
NUMBER                      BYTEINT
NUMBER                      SMALLINT
NUMBER                      INTEGER
NUMBER                      BIGINT
NUMBER                      DECIMAL
NUMBER                      NUMERIC
FLOAT                       FLOAT
FLOAT                       REAL
CHAR                        CHAR               Up to 16 MB
VARCHAR2/NVARCHAR2          VARCHAR            Up to 16 MB
CHAR(n)                     CHAR VARYING(n)
DATE                        DATE
TIMESTAMP                   TIME               Only HH:MI:SS
TIMESTAMP WITH TIME ZONE    TIMESTAMP_LTZ      Alias: TIMESTAMPLTZ
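To illustrate a few of these mappings side by side, here is a hypothetical table definition in Oracle and one possible Snowflake equivalent (a sketch, not a complete conversion recipe):

```sql
-- Oracle original (hypothetical table):
--   CREATE TABLE events (
--       id         NUMBER(10),
--       label      VARCHAR2(100),
--       weight     FLOAT,
--       created_at TIMESTAMP WITH TIME ZONE
--   );

-- One possible Snowflake equivalent:
CREATE TABLE events (
    id         INTEGER,
    label      VARCHAR(100),
    weight     FLOAT,
    created_at TIMESTAMP_LTZ
);
```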
APPENDIX C:
SQL CONSIDERATIONS
Below are examples of some of the changes you may need to make to your Oracle SQL
queries to have them run correctly in Snowflake. Note that this is not an exhaustive list.
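As a sketch of the kinds of changes involved, here are a few well-known Oracle-to-Snowflake rewrites (illustrative additions with hypothetical table names, not an exhaustive list):

```sql
-- Oracle requires FROM dual for expression-only queries; Snowflake does not.
--   Oracle:    SELECT sysdate FROM dual;
SELECT CURRENT_TIMESTAMP();

-- Oracle uses ROWNUM to limit rows; Snowflake uses LIMIT.
--   Oracle:    SELECT * FROM orders WHERE ROWNUM <= 10;
SELECT * FROM orders LIMIT 10;

-- Oracle's legacy (+) outer-join syntax becomes an ANSI join in Snowflake.
--   Oracle:    SELECT a.id, b.val FROM a, b WHERE a.id = b.id(+);
SELECT a.id, b.val FROM a LEFT OUTER JOIN b ON a.id = b.id;
```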
ABOUT SNOWFLAKE
Snowflake Cloud Data Platform shatters the barriers that prevent organizations from unleashing the true value from their data.
Thousands of customers deploy Snowflake to advance their businesses beyond what was once possible by deriving all the insights
from all their data by all their business users. Snowflake equips organizations with a single, integrated platform that offers the only
data warehouse built for any cloud; instant, secure, and governed access to their entire network of data; and a core architecture to
enable many other types of data workloads, including a single platform for developing modern data applications.
Snowflake: Data without limits. Find out more at snowflake.com.