Session-10-Data Loading in Snowflake
--------
- Load types
- Bulk Loading / Continuous Loading
- COPY Command
- Transforming Data
Bulk loading:
- Bulk loading loads batches of data from files already available in cloud
storage (external stages)
(or)
- copies data files from a local machine to an internal stage (i.e. inside Snowflake)
before loading the data into a table.
- Bulk loading uses virtual warehouses.
- Users are required to size the warehouse appropriately to accommodate the expected
load when using the COPY command.
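The local-machine path described above can be sketched roughly as follows. This is only a minimal illustration: the warehouse `load_wh`, the stage `my_int_stage`, the table `my_table`, and the file path are all hypothetical names that are not part of these notes, and the warehouse size is a placeholder to adjust for the real load.

```sql
-- Hypothetical warehouse, sized for the expected load (assumption: SMALL is enough).
CREATE WAREHOUSE IF NOT EXISTS load_wh
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;
USE WAREHOUSE load_wh;

-- Hypothetical named internal stage.
CREATE OR REPLACE STAGE my_int_stage;

-- PUT uploads a local file to the internal stage.
-- It is run from a client such as SnowSQL; the path is a placeholder.
PUT file:///tmp/data/orders.csv @my_int_stage AUTO_COMPRESS = TRUE;

-- Load the staged file into a (hypothetical, already existing) table.
COPY INTO my_table
  FROM @my_int_stage
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
```

For files that are already in cloud storage, the PUT step is skipped and the COPY statement reads from an external stage instead.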
COPY Command:
COPY INTO TABLENAME
FROM @STAGE
FILE_FORMAT = (...)
FILES = ('filename1', 'filename2')
(or)
PATTERN = '.*filepattern.*'
other_optional_props;
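As an illustration of the template above, the two variants might look like this. The table `orders`, the stage `my_stage`, and the file names are hypothetical. `ON_ERROR` is shown as one example of an optional property, not as a recommended setting.

```sql
-- Variant 1: load an explicit list of staged files (file names are quoted strings).
COPY INTO orders
  FROM @my_stage
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
  FILES = ('orders_2023_01.csv', 'orders_2023_02.csv');

-- Variant 2: load every staged file whose path matches a regular expression.
COPY INTO orders
  FROM @my_stage
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
  PATTERN = '.*orders.*[.]csv'
  ON_ERROR = 'CONTINUE';   -- optional property: skip bad rows instead of aborting
```

FILES is convenient for a small, known set of files, while PATTERN fits a naming convention that produces many files.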
***********************************************************************************
*********************************************************************
COPY COMMAND:
Location of files:
Local environment    ----------> Files are first staged in a Snowflake stage,
                                 then loaded into a table.
Amazon S3            ----------> Files can be loaded directly from any
                                 user-supplied S3 bucket.
Google Cloud Storage ----------> Files can be loaded directly from any
                                 user-supplied Cloud Storage container.
Microsoft Azure      ----------> Files can be loaded directly from any
                                 user-supplied Azure container.
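For the Amazon S3 case, a minimal sketch of an external stage could look like the following. The stage name `my_s3_stage` is hypothetical; the bucket URL is the one that appears later in these notes. A private bucket would also need credentials or a storage integration, which are left out here.

```sql
-- Hypothetical external stage that points at an S3 bucket.
-- Assumption: the bucket is publicly readable, so no credentials are supplied.
CREATE OR REPLACE STAGE my_s3_stage
  URL = 's3://bucketsnowflake3';

-- Show which files the stage can see before loading anything.
LIST @my_s3_stage;
```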
File formats:
Delimited files (CSV, TSV, etc.) --------> Any valid delimiter is supported; the
                                           default is comma (i.e. CSV).
JSON
Avro    --------> Includes automatic detection and processing of staged Avro
                  files that were compressed using Snappy.
ORC     --------> Includes automatic detection and processing of staged ORC
                  files that were compressed using Snappy or zlib.
Parquet --------> Includes automatic detection and processing of staged Parquet
                  files that were compressed using Snappy.
XML     --------> Supported as a preview feature.
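A file format can be defined once as a named object and then referenced from COPY. The sketch below assumes hypothetical names (`my_csv_format`, `my_json_format`, `my_table`, `my_stage`) and a pipe-delimited file, only to show that a delimiter other than the comma can be used.

```sql
-- Hypothetical delimited format that uses '|' instead of the default comma.
CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = CSV
  FIELD_DELIMITER = '|'
  SKIP_HEADER = 1
  FIELD_OPTIONALLY_ENCLOSED_BY = '"';

-- Hypothetical JSON format; STRIP_OUTER_ARRAY loads each array element as its own row.
CREATE OR REPLACE FILE FORMAT my_json_format
  TYPE = JSON
  STRIP_OUTER_ARRAY = TRUE;

-- Reference the named format from COPY instead of repeating the options inline.
COPY INTO my_table
  FROM @my_stage
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');
```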
***********************************************************************************
*********************************************************************
Simple transformations during data load:
Snowflake supports transforming data while loading it into a table using the COPY
command. Options include:
- Column reordering
- Column omission
- String operations
- Other functions
- Sequence numbers
- Auto-increment fields
1. Create database
2. Use database
3. Create table
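The three setup steps above might be written as follows. The database `demo_db` and the table `orders_demo` are hypothetical names chosen for illustration. The `ROW_ID` column is added only to show an auto-increment field, which Snowflake fills in when the column is left out of the load.

```sql
-- 1. Create database (hypothetical name).
CREATE DATABASE IF NOT EXISTS demo_db;

-- 2. Use database.
USE DATABASE demo_db;

-- 3. Create table, with an auto-increment column that is populated automatically.
CREATE OR REPLACE TABLE orders_demo (
  ROW_ID   NUMBER AUTOINCREMENT START 1 INCREMENT 1,
  ORDER_ID VARCHAR(30),
  AMOUNT   INT
);
```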
***********************************************************************************
*********************************************************************
//Create a schema for external stage
url='s3://bucketsnowflake3';
ORDER_ID VARCHAR(30),
PROFIT INT,
AMOUNT INT,
CAT_SUBSTR VARCHAR(5),
CAT_CONCAT VARCHAR(60),
PFT_OR_LOSS VARCHAR(10)
);
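The column names above suggest a load that derives values with string functions and a profit/loss flag. A hedged sketch of such a COPY with transformations follows. Everything not shown in these notes is an assumption: the table name `orders_ex`, the stage name `my_s3_stage`, the file name, and the positions of the source columns (`$1` order id, `$2` amount, `$3` profit, `$5` category, `$6` subcategory).

```sql
-- Assumed source layout: $1 = order id, $2 = amount, $3 = profit,
-- $5 = category, $6 = subcategory. Adjust positions to the real file.
COPY INTO orders_ex (ORDER_ID, PROFIT, AMOUNT, CAT_SUBSTR, CAT_CONCAT, PFT_OR_LOSS)
FROM (
  SELECT
    s.$1,                                  -- column reordering: choose by position
    CAST(s.$3 AS INT),
    CAST(s.$2 AS INT),
    SUBSTRING(s.$5, 1, 5),                 -- string operation: first 5 characters
    CONCAT(s.$5, '-', s.$6),               -- string operation: concatenation
    CASE WHEN CAST(s.$3 AS INT) < 0
         THEN 'LOSS' ELSE 'PROFIT' END     -- derived flag, fits VARCHAR(10)
  FROM @my_s3_stage s
)
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
FILES = ('OrderDetails.csv');              -- hypothetical file name
```

Source columns that are not selected (here `$4`) are simply omitted from the load, which is the column-omission case listed earlier.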