
Azure DW

The document discusses the differences between Azure Databricks and Azure Synapse Analytics. It notes that Azure Synapse has both SQL and Spark engines that can be used together on the same data, while Azure Databricks is purely Spark. It recommends using Databricks for more complex analytics and auto-scaling needs, while Synapse Spark pool can be used for medium complexity aggregations when using Spark with a data warehouse. Sample tutorials and datasets are provided to help users get hands-on experience with Azure Synapse SQL pool and Spark pool.

Good morning. Is this webinar being recorded?

Will the recording or slides be available later?

No. Unfortunately, the recording is not available for this virtual session. The
slides are not available either,
but all the topics are covered at these links:
https://docs.microsoft.com/en-us/learn/
https://docs.microsoft.com/en-us/learn/paths/azure-fundamentals/

Most of my customers get confused between Azure Databricks and Azure Synapse, as
there is an overlap to an extent.
Could you please explain the simple difference?

Well, yes. There is indeed an overlap between the two.


The first difference is that Azure Synapse Analytics has several major components,
including two compute engines: a T-SQL one (SQL Pool)
and a Spark one (Spark Pool). These are independent but can be used in parallel on the
same set of data stored in Azure Data Lake Storage Gen2.
In Azure Databricks there is no separate SQL (MPP / DW) engine. Azure Databricks
is "pure" Spark.
I would recommend using Azure Databricks over Azure Synapse Spark Pool if you
have:
1. A requirement to go into production immediately (as Synapse Spark Pool is in the
Public Preview phase).
2. Plans for very complex aggregations, mathematical computations, or
AI / ML combined with complex pre-processing.
3. Very specific requirements for auto-scaling, scaling up and down, or
performance tuning.
On the other hand, I would recommend Azure Synapse Spark Pool for:
1. medium-complexity aggregations,
2. well-defined routine processing,
3. using Spark in conjunction with DW technologies.
Please be aware that these recommendations only hold until
Azure Synapse Analytics Spark Pools are released as a Generally Available
service. We will review them at that moment.
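
To make the criteria above easier to apply, here is a minimal sketch that encodes them as a simple rule. It only restates the recommendation from this answer; the class name, field names, and function name are hypothetical and are not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a Spark workload (illustrative names only)."""
    needs_production_now: bool = False        # criterion 1: must go live immediately
    complex_analytics_or_ml: bool = False     # criterion 2: heavy math, AI / ML, complex pre-processing
    specific_scaling_or_tuning: bool = False  # criterion 3: specific auto-scaling or tuning needs


def recommend_spark_service(workload: Workload) -> str:
    """Return the service suggested in the answer above for this workload."""
    # Any one of the three Databricks criteria is enough to prefer Databricks.
    if (
        workload.needs_production_now
        or workload.complex_analytics_or_ml
        or workload.specific_scaling_or_tuning
    ):
        return "Azure Databricks"
    # Otherwise: medium-complexity aggregations, routine processing,
    # or Spark used together with DW technologies.
    return "Azure Synapse Spark pool"
```

For example, `recommend_spark_service(Workload(complex_analytics_or_ml=True))` returns "Azure Databricks", while a workload with none of the three flags set maps to the Synapse Spark pool. As noted above, this rule is only meant to hold while Spark Pools are in Public Preview.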

Please could you point us to sample data and practice exercises to develop our
expertise?

Well, I would recommend attending the Modern Data Warehouse Hackathon (it is free of
charge and very hands-on focused):
https://openhack.microsoft.com/
There are also several Quick Start Guides available.
They are useful for acquiring some hands-on knowledge very fast, and they already
refer to public datasets.
1. Creating an Azure Synapse SQL Pool:
https://docs.microsoft.com/en-us/azure/synapse-analytics/quickstart-create-sql-pool-portal
NOTE: I would recommend using the smallest possible size (DW100) and pausing /
terminating it after you are done experimenting.
2. Loading data into Azure Synapse SQL Pools (using the COPY command, PolyBase
backed):
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/quickstart-bulk-load-copy-tsql
3. Using Azure Data Factory to copy data:
https://docs.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-copy-data-tool
4. Using SQL On-demand to query data directly from a storage account:
https://docs.microsoft.com/en-us/azure/synapse-analytics/quickstart-sql-on-demand
5. Working with Spark Pool:
https://docs.microsoft.com/en-us/azure/synapse-analytics/quickstart-apache-spark-notebook
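
As a rough illustration of what the COPY command from guide 2 looks like, the sketch below assembles a T-SQL COPY INTO statement as a string. The helper name, the table name, and the source URL are hypothetical placeholders; the statement shown assumes a comma-separated CSV file with one header row and leaves out authentication options, so treat the linked quickstart as the authoritative version.

```python
def build_copy_statement(table: str, source_url: str, first_row: int = 2) -> str:
    """Build a T-SQL COPY INTO statement for a comma-separated CSV file.

    `table` is inserted as-is, so pass only a trusted, already-valid table name.
    """
    # Double any single quotes so the URL stays a valid T-SQL string literal.
    safe_url = source_url.replace("'", "''")
    return (
        f"COPY INTO {table}\n"
        f"FROM '{safe_url}'\n"
        "WITH (\n"
        "    FILE_TYPE = 'CSV',\n"
        "    FIELDTERMINATOR = ',',\n"
        f"    FIRSTROW = {first_row}\n"  # 2 skips a single header row
        ")"
    )


if __name__ == "__main__":
    # Hypothetical table and storage URL, for illustration only.
    print(build_copy_statement(
        "dbo.Trip",
        "https://<account>.blob.core.windows.net/<container>/trips.csv",
    ))
```

The generated text can be pasted into a query window connected to the SQL Pool; running it against a DW100 pool that is paused afterwards keeps the cost of experimenting low.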
