Bi 1 230411 163415

This document provides steps to create a staging database and extract-transform-load (ETL) collaboration in Data Integrator. It involves selecting file and JDBC data sources, defining table and column properties, generating a target database, and mapping source and target tables. Key steps include selecting the ETL loader type, choosing or creating a staging database, importing file and database table metadata, customizing column properties, generating the target database structure, and mapping source tables to target tables.


Practical 1: Import the legacy data from different sources such as Excel, SQL Server, Oracle, etc., and load it into the target system. (You can download a sample database such as AdventureWorks, Northwind, or FoodMart.)

Step 1: Open Power BI

Step 2: Click on Get Data; the following list will be displayed → select Excel
Step 3: Select required file and click on Open, Navigator screen appears

Step 4: Select the file and click on Edit


Step 5: Power query editor appears

Step 6: Again, go to Get Data and select OData feed

Step 7: Paste the URL http://services.odata.org/V3/Northwind/Northwind.svc/ and click on OK
Step 8: Select the Orders table and click on Edit

Note: If you just want to see a preview, you can click on the table name without selecting its checkbox.

Click on Edit to view the table

Practical 2: Perform the Extraction, Transformation and Loading (ETL) process to construct the database in SQL Server.
Software requirements: SQL SERVER 2012 FULL VERSION (SQLServer2012SP1-FullSlipstream-ENU-x86)

Steps (assuming SQL Server 2012 Full Version is already installed):

Step 1: Open SQL Server Management Studio to restore backup file


Step 2: Right click on Databases → Restore Database

Step 3: Select Device → click on (...) icon towards end of device box
Step 4: Click on Add → Select path of backup files

Step 5: Select both files at a time

Step 6 : Click ok and in select backup devices window Add both files of Adventure Works
Step 7: Open SQL Server Data Tools Select File → New → Project → Business Intelligence → Integration
Services Project & give appropriate project name.
Step 8: Right click on Connection Managers in solution explorer and click on New Connection Manager. Add
SSIS connection manager window appears.
Step 9: Select OLEDB Connection Manager and Click on Add

Step 10: Configure OLE DB Connection Manager window appears → Click on New

Step 11: Select the Server name (as per your machine) and the database name from the drop-downs, and click on Test Connection. If the test connection succeeds, click OK
Step 12: Click on OK
Connection is added to connection manager

Step 13: Drag and drop Data Flow Task in Control Flow tab

Step 14: Drag OLE DB Source from Other Sources and drop into Data Flow tab

Step 15: Double click on OLE DB Source → the OLE DB Source Editor appears → click on New to add the connection manager. Select the [Sales].[Store] table from the drop-down → OK
Step 16: Drag OLE DB Destination into the Data Flow tab and connect both components

Step 17: Double click on OLE DB Destination → click on New; this runs a CREATE TABLE query and fills [OLE DB Destination] into “Name of the table or the view”.

Click on OK

Step 18: Click on start.


Step 19: Go to SQL Server Management Studio → in the Databases tab → AdventureWorks → right click on [dbo].[OLE DB Destination] → Script Table as → SELECT To → New Query Editor Window

Step 20: Execute following query to get output.

USE [AdventureWorks2012]
GO

SELECT [BusinessEntityID], [Name], [SalesPersonID],
       [Demographics], [rowguid], [ModifiedDate]
FROM [dbo].[OLE DB Destination]
GO
Practical 3a: Creating the Staging Database and ETL Collaboration

This procedure describes how to create the staging database using the automated wizard. Depending on the type of data
source and the options you use, the wizard skips certain unnecessary steps.
To Create the Staging Database and ETL Collaboration
Before You Begin
Complete the steps under Connecting to the Source Database.

1. On the Select Type of ETL Loader on the New File Wizard, select Advanced Extract – Transform – Load (ETL).
2. Click Next.
The Select or Create Database window appears.
3. To select a staging database to use for external data sources (for this project only), do one of the following:
a. Select an existing database to use from the DB URL field.
b. Select Create and Use New Database, enter a name for a new database in the DB Name field, and then
click Create Database. Select the new database in the DB URL field.
Note –
This database is required and is used for internal processing only.

4. Click Next.
The Choose Data Source window appears.
5. Do any of the following:
• If you do not have any file data sources, click Next and skip to step 15 (choosing JDBC data sources).
• To specify a file data source using a URL, enter the URL and click Add.
• To specify a file data source that is stored on your network, browse for and select a file containing source
data in the Choose a File box, and then click Add.
• Repeat the above two steps until all file data sources are selected.
6. Click Next.
The Enter Table Details window appears, with the information for the first data file displayed.
7. If necessary, modify the table name, the type of data encoding, and the type of document that contains the source
data.
Data Integrator automatically fills in these fields based on the information from the previous window, so the existing
values should be correct.
8. Click Next.
If the data file is a spreadsheet, the Choose a Sheet window appears; otherwise, the Import Table MetaData
window appears.
9. If the Choose a Sheet window appears, select the name of the sheet in the spreadsheet that contains the source
data, and then click Next.
10. When the Import Table Metadata window appears, modify the information about the data file as needed.
11. Data Integrator automatically fills in this information, but you might need to customize it. Preview the information in the bottom portion of the window, and then click Next.
12. The Enter Column Properties window appears.
13. In the upper portion of the window, customize any of the column properties.
14. Preview the information in the lower portion of the window, and then click Next.
15. Do one of the following:
c. If you selected multiple file data sources, the wizard returns to the Enter Table Details window with the
attributes for a different file displayed. Repeat the above steps beginning with step 7.
d. If all the files you specified are configured, a dialog box appears confirming the database table creation.
Click OK on the dialog box and continue to the next step.
e. The Select JDBC Source Tables window appears.
16. If you specified file data sources, they are already listed under Selected Tables. Click Next if you have no JDBC
data sources to specify, or do the following to specify a JDBC data source:
f. Under Available Connections, select the database that contains the source data.
g. If there are multiple schemas in the database, select the schema to use.
h. Under Schemas, select the tables that contain the source data and then click Select.
i. Click Next.
j. If there are tables to join, the Select Source Tables for Join window appears; otherwise, the Generate
Target Database window appears.
17. To define join conditions, do the following. If there are no join conditions, click Next and skip to step 18.
k. Under Available Tables, select the tables to join, and then click the right arrow to add them to the
Selected Tables list.
l. In the Preview panel, click the drop-down menu at the top of the join box and select the type of join to use
from one of the following options:
▪ Inner – Use this if all tables to be joined contain the same column.
▪ Left Outer – Use this if the results should always include the records from the left table in the
join clause.
▪ Right Outer – Use this if the results should always include the records from the right table in the
join clause.
▪ Full Outer – Use this if the results should always include the records from both the right and left
tables in the join clause.
m. To specify columns to exclude from each joined table, click the Select Column tab in the Preview pane
and deselect any columns to exclude.
n. Click Next.
o. The Generate Target Database Master Index Model window appears.
18. To create the staging database, do the following:
p. Deselect the check box for Use Existing Database Target Tables.
q. In the Object Definition File field, browse to and select the object.xml file generated for the Master
Index project.
Note –This file is located in NetBeansProjects_Home/Project_Name/src/Configuration.

r. In the Target Database Folder field, select or enter the path where you want to store the database.
s. In the Target Database Name field, enter a name for the database.
t. Click Generate Database.
19. Click Next.
20. The Select JDBC Target Tables window appears. The target tables to load the extracted data into are already listed under Available Connections. Changing these is not recommended.
21. Click Next.
22. The Map Selected Collaboration Tables window appears.
23. To map source and target data, do the following:
u. To disable constraints on the target tables, select Disable Target Table Constraints.
v. Select the SQL statement type to use for the transfer. You can select insert, update, or both.
w. For each target table listed on the right, select one or more source tables from the list directly to the left of
the target table. These are the source tables that will be mapped to the target in the collaboration.
Note – If you do not specify a mapping here, the source tables do not appear in the ETL collaboration.
You can add the source tables directly to the collaboration using the Select Source and Target Tables
function. To select multiple source tables for one target, hold down the Control key while you select the
required source tables. If you select multiple source tables for one target, the source tables are
automatically joined.

x. Click Finish.
y. The new ETL collaboration appears in the Projects window. If multiple collaborations are created, they are given the name you specified for the collaboration with a target table name appended.

Practical 3b: Create the cube with suitable dimension and fact tables based on OLAP.
Step 1: Creating Data Warehouse
Let us execute a T-SQL script to create the data warehouse with fact and dimension tables and populate them with appropriate test values.
Download T-SQL script attached with this article for creation of Sales Data Warehouse or download from this article
“Create First Data Warehouse” and run it in your SQL Server.

After downloading, extract the file into a folder.


Follow the given steps to run the query in SSMS (SQL Server Management Studio).
1. Open SQL Server Management Studio 2012
2. Connect Database Engine

Password for sa : admin123 (as given during installation)


Click Connect.
3. Open New Query editor
4. Copy paste Scripts given below in various steps in new query editor window one by one
5. To run the given SQL Script, press F5
6. It will create and populate “Sales_DW” database on your SQL Server
OR
1. Go to the extracted .sql file and double click on it.
2. A new SQL Query Editor opens containing the Sales_DW script.
3. Execute the statements one by one by selecting each and pressing F5, or directly click on Execute.
4. After execution completes, save and close SQL Server Management Studio, then reopen it to see Sales_DW in the Databases tab.
Step 2: Start SSDT environment and create New Data Source

Go to SQL Server Data Tools → right click and run as administrator

Click on File → New → Project

In Business Intelligence → Analysis Services Multidimensional and Data Mining models → appropriate project name
→ click OK

Right click on Data Sources in solution explorer → New Data Source

Data Source Wizard appears

Click on New

Select Server Name → select Use SQL Server Authentication → Select or enter a database name (Sales_DW)
Note : Password for sa : admin123 (as given during installation of SQL 2012 full version)

Click Next

Select Inherit → Next

Click Finish

Sales_DW.ds gets created under Data Sources in Solution Explorer

Step 3: Creating New Data Source View


In Solution explorer right click on Data Source View → Select New Data Source View

Click Next, then click Next again

Select FactProductSales (dbo) from Available objects and move it to Included objects by clicking on ">"

Click on Add Related Tables.

Click Next

Click Finish

Sales_DW.dsv appears in Data Source Views in Solution Explorer.

Step 4: Creating new cube

Right click on Cubes → New Cube

Select Use existing tables in Select Creation Method → Next

In Select Measure Group Tables → Select FactProductSales → Click Next

In Select Measures → check all measures → Next

In Select New Dimensions → Check all Dimensions → Next

Click on Finish

Sales_DW.cube is created

Step 5: Dimension Modification

In dimension tab → Double Click Dim Product.dim

Drag and Drop Product Name from Table in Data Source View and Add in Attribute Pane at left side

Step 6: Creating Attribute Hierarchy in Date Dimension

Double click on Dim Date dimension → drag and drop fields from the table shown in Data Source View to Attributes → drag and drop attributes from the leftmost Attributes pane to the middle Hierarchy pane.

Drag fields in sequence from Attributes to Hierarchy window (Year, Quarter Name, Month Name, Week of the Month,
Full Date UK)

Step 7: Deploy Cube


Right click on Project name → Properties

The properties window appears

Make the following changes and click on Apply & OK

Right click on project name → Deploy

Deployment successful

To process cube right click on Sales_DW.cube → Process

Click run

Browse the cube for analysis in solution explorer

Practical 4a : Create the ETL map and setup the schedule for execution.
This post will help you create a simple step by step ETL process flow within Adeptia.
File Source Activity: The File Source provides the ability to specify any file that is located on the local hard
disk, as a source.
Polling Service Activity: Polling Services allow the process flow to ‘wait’ and ‘listen’ at a defined location, at which a specific file is to arrive or be modified, before the execution of the next activity. The Polling Services perform the ‘listen’ action at a frequency specified while creating the Polling activity.
File Trigger Activity: Trigger Events are used to schedule and trigger a process flow. Trigger Events enable you
to specify when and how frequently the process flow should be executed on a recurring basis. The File Event
enables you to specify when and how frequently a process flow should be executed based on either creation
of a new file, or existence of a file(s) in a pre-defined location or upon its modification.
Here are the simple ETL Process Flow steps for transferring a file from any source to target after
transformation:
Step 1: If your file is on the local machine, create a new file source activity under Configure > Services > Source
> File. Configure the full path of the source file name in the File Path field and the source file name in the File
Name field. Save it. For more help click on Creating Source Activity and then click on Creating File Source
Activity in the Developer guide.
Step 2: Create a new schema activity under Configure > Services > Schema > for the source file. A Schema is
the structure of a file format and it specifies information about different data fields and record types that a
message or a data file may contain. You can create different types of Schemas according to the file structure.
For more help click on Creating Schema Activity in the Developer guide.
Step 3: Create a new schema activity under Configure > Services > Schema > for the target file. If the target file structure is the same as the source file structure, you don’t need to create a new schema.

Step 4: Create a new Data Mapping activity under Configure > Services > Data Transform > Data Mapping. Data
Mapping is used to map source schema elements to target schema elements. You can map one source schema
element to a target schema element directly using the drag-and-drop approach. The process of mapping
elements comprises these steps:
• Load the Source and Target Schemas
• Map the Source and Target Elements
• Save the Mapping and Exit Data Mapper

Step 5: Create a new file target activity under Configure > Services > Target > File. Specify the name and path of the target file to be created. For more help click on Creating Target Activity and then click on Creating File Target Activity in the Developer guide.
As you have created all the activities now you need to create a process flow. The process flow is a set of
activities arranged in a sequence to perform a specific task by combining various activities i.e. Source, Target,
Schema or Transformer etc.

Start Event > File Source (Step1) > Source Schema (Step 2) > Data Mapping (Step 4) > Target Schema (Step 3)
> File Target (Step 5) > End Event
Note: You must change the “transformer” property of the target schema (Step 3) to “XMLStream2stream” in the process flow by double-clicking on it.
Step 6: Go to Design > Process Flow and select the above process flow and click on execute.
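The Start Event → File Source → Source Schema → Data Mapping → Target Schema → File Target flow above can be sketched in plain Python. This is not Adeptia's API, just a minimal stand-in where the schema is a CSV header and the mapping renames hypothetical source columns to hypothetical target columns:

```python
import csv
import os
import tempfile

def etl(source_path, target_path, mapping):
    """Minimal ETL sketch: read a CSV source, rename columns per the
    source-to-target mapping, and write the target file."""
    # Extract: the source schema is implied by the CSV header
    with open(source_path, newline="") as src:
        rows = list(csv.DictReader(src))
    # Transform + Load: apply the column mapping and write the target
    with open(target_path, "w", newline="") as tgt:
        writer = csv.DictWriter(tgt, fieldnames=list(mapping.values()))
        writer.writeheader()
        for row in rows:
            writer.writerow({mapping[k]: row[k] for k in mapping})

# Hypothetical source file and mapping
src = os.path.join(tempfile.gettempdir(), "src.csv")
dst = os.path.join(tempfile.gettempdir(), "dst.csv")
with open(src, "w", newline="") as f:
    f.write("id,name\n1,Widget\n2,Gadget\n")
etl(src, dst, {"id": "product_id", "name": "product_name"})
print(open(dst).read().splitlines()[0])  # product_id,product_name
```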

To schedule the ETL:


1. Click Tools on the global toolbar.
2. The ADMINISTRATION TOOLS page appears.
3. Click the ETL Scheduler tab.
4. Define how often you want to run the ETL by selecting one of the following options:

• Hourly every so many hours, starting at a specific time


• Daily at a specific time
• Weekly at a specific time on a set day of the week
• Monthly on a set day of the month at a specific time
5. Specify the other parameters, such as the hour of the day when the ETL will run. Use the 24-hour
format (HH:MM:SS) when specifying a time value.
6. Click Save Schedule to save the settings.
To run the ETL immediately, click Run ETL Now. Running the ETL immediately does not affect the scheduled
ETL.

Practical 4b : Execute the MDX queries to extract the data from the datawarehouse.
Step 1: Open SQL Server Management Studio and connect to Analysis Services.
Server type: Analysis Services Server Name: (according to base machine) Click on connect
Step 2 : Click on New Query and type following query based on Sales_DW

SELECT [Measures].[Sales Time Alt Key] ON COLUMNS
FROM [Sales DW]

Click on Execute
Practical 5
Import the data warehouse data in Microsoft Excel and create the Pivot table
and Pivot Chart

Being able to analyze all the data can help you make better business decisions. But sometimes it’s hard
to know where to start, especially when you have a lot of data that is stored outside of Excel, like in a
Microsoft Access or Microsoft SQL Server database, or in an Online Analytical Processing (OLAP) cube
file. In that case, you’ll connect to the external data source, and then create a PivotTable to summarize,
analyze, explore, and present that data.

Here’s how to create a PivotTable by using an existing external data connection:

1. Click any cell on the worksheet.


2. Click Insert > PivotTable.
3. In the Create PivotTable dialog box, click From External Data Source.

4. Click Choose Connection.


5. On the Connections tab, in the Show box, keep All Connections selected, or pick the
connection category that has the data source you want to connect to.

Create a pivot chart

1. Select a cell in your table.


2. Select Insert > PivotChart .
3. Select where you want the PivotChart to appear.
4. Select OK.
5. Select the fields to display in the menu.
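Outside Excel, the summarization a PivotTable performs can be sketched with pandas pivot_table. The sales data below is hypothetical, standing in for the external warehouse source:

```python
import pandas as pd

# Hypothetical sales data standing in for the warehouse source
sales = pd.DataFrame({
    "Region":  ["East", "East", "West", "West"],
    "Product": ["A", "B", "A", "B"],
    "Amount":  [100, 150, 200, 250],
})

# Rows = Region, Columns = Product, Values = sum of Amount,
# mirroring a PivotTable's row/column/values field layout
pivot = pd.pivot_table(sales, values="Amount", index="Region",
                       columns="Product", aggfunc="sum")
print(pivot.loc["East", "A"])  # 100
```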

Practical 6
Apply the what – if Analysis for data visualization. Design and generate necessary reports based on the data
warehouse data.
A book store has 100 books in storage. You sell a certain % for the highest price of $50 and a certain % for the lower price of $20.
If you sell 60% for the highest price, cell D10 calculates a total profit of 60 * 50 + 40 * 20 = 3800.

Create Different Scenarios

But what if you sell 70% for the highest price? What if you sell 80%? Or 90%, or even 100%? Each different percentage is a different scenario. You can use the Scenario Manager to create these scenarios.

Note: We use what-if analysis to type different percentages into cell C4 and see the corresponding result of each scenario in cell D10.

What-if analysis enables you to easily compare the results of different scenarios.

Step 1: In Excel, on the Data tab, in the Data Tools group, click What-If Analysis.
Step 2: Select Scenario Manager.
The Scenario Manager dialog box appears.
Step 3: Add a scenario by clicking on Add.

Step 4: Type a name (60 percent), select cell F10 (% sold for the highest price) as the Changing cells, and click on OK. Click on the icon which is circled.
Select cell F10.

Click back on the icon again and then click OK


Step 5: Enter the corresponding value 0.6 and click on OK again.

Step 6: To apply a scenario, click on Show


Step 7: Next, add 4 other scenarios (70%, 80%, 90% and 100%)
Finally, your Scenario Manager should be consistent with the picture below:
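The five scenarios above reduce to one formula: profit = high-price units × $50 + low-price units × $20. A quick sketch computing all of them with the example's numbers:

```python
BOOKS = 100
HIGH_PRICE, LOW_PRICE = 50, 20  # prices from the example

def profit(pct_high):
    """Total profit if pct_high of the 100 books sell at the high price."""
    high_units = BOOKS * pct_high
    low_units = BOOKS - high_units
    return high_units * HIGH_PRICE + low_units * LOW_PRICE

# The 60% scenario matches cell D10: 60 * 50 + 40 * 20 = 3800
for pct in (0.6, 0.7, 0.8, 0.9, 1.0):
    print(f"{pct:.0%} at the high price -> profit {profit(pct):.0f}")
```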

Practical 7
Perform the data classification using classification algorithm.
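The practical does not name a specific algorithm or dataset, so here is one minimal sketch using scikit-learn's decision tree classifier on the bundled Iris dataset; any classifier and any warehouse extract could be substituted.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load a sample labeled dataset and hold out 30% for testing
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Fit the classifier on the training split and score on the test split
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
print(f"accuracy: {acc:.2f}")
```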
Practical 8
Perform the data clustering using clustering algorithm.
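A minimal clustering sketch, again assuming scikit-learn and the Iris measurements; k = 3 is chosen here to match the three known species, but any k and any warehouse extract could be used.

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

# Cluster the unlabeled measurements into 3 groups
X, _ = load_iris(return_X_y=True)
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
labels = km.labels_  # one cluster id per row

print(len(set(labels)))  # 3
```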

Practical 9
Perform the Linear regression on the given data warehouse data.
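A minimal linear-regression sketch with scikit-learn on a made-up extract (units sold vs. revenue, with revenue = 20 × units by construction, so the fitted slope should recover 20):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical warehouse extract: units sold vs. revenue
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([20, 40, 60, 80, 100])

model = LinearRegression().fit(X, y)
print(model.coef_[0], model.intercept_)  # slope ≈ 20, intercept ≈ 0
print(model.predict([[6]])[0])           # ≈ 120
```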
Practical 10
Perform the logistic regression on the given data warehouse data.
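A minimal logistic-regression sketch with scikit-learn on a made-up, well-separated extract (usage hours vs. a binary churn flag); real warehouse data would replace the arrays below.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical extract: hours of usage vs. whether the customer churned
X = np.array([[1], [2], [3], [10], [11], [12]])
y = np.array([0, 0, 0, 1, 1, 1])

# Fit a binary classifier and predict for two new customers
model = LogisticRegression().fit(X, y)
print(model.predict([[2], [11]]))  # [0 1]
```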
