Step 1: Download a sample database and load it into the target system. (You can download sample databases such as AdventureWorks, Northwind, or FoodMart.)
Step 2: Click on Get Data; the following list will be displayed → select Excel.
Step 3: Select the required file and click Open; the Navigator screen appears.
Click OK.
Step 8: Select the Orders table.
Note: If you just want to see a preview, click the table name without selecting its checkbox.
To restore the AdventureWorks backup in SQL Server Management Studio:
Step 3: Select Device → click the (...) icon at the right end of the Device box.
Step 4: Click Add → select the path of the backup files.
Step 6: Click OK and, in the Select Backup Devices window, add both AdventureWorks backup files.
Step 7: Open SQL Server Data Tools. Select File → New → Project → Business Intelligence → Integration
Services Project and give the project an appropriate name.
Step 8: Right-click Connection Managers in the Solution Explorer and click New Connection Manager. The Add
SSIS Connection Manager window appears.
Step 9: Select OLEDB Connection Manager and click Add.
Step 10: The Configure OLE DB Connection Manager window appears → click New.
Step 11: Select the server name (as per your machine) and the database name from the drop-downs, then click Test Connection.
If the test connection succeeds, click OK.
Step 12: Click OK. The connection is added to the Connection Managers pane.
Step 13: Drag and drop a Data Flow Task into the Control Flow tab.
Step 14: Drag an OLE DB Source from Other Sources and drop it into the Data Flow tab.
Step 15: Double-click the OLE DB Source → the OLE DB Source Editor appears → click New to add a connection manager.
Select the [Sales].[Store] table from the drop-down → OK.
Step 16: Drag an OLE DB Destination into the Data Flow tab and connect the two components.
Step 17: Double-click the OLE DB Destination. Click New to run the table-creation query; this puts [OLE DB Destination] in the "Name of the table or the view" field.
Click OK.
USE [AdventureWorks2012]
GO
SELECT [BusinessEntityID],
       [Name]
FROM [Sales].[Store]
GO
Practical 3a: Creating the Staging Database and ETL Collaboration
This procedure describes how to create the staging database using the automated wizard. Depending on the type of data
source and the options you use, the wizard skips certain unnecessary steps.
To Create the Staging Database and ETL Collaboration
Before You Begin
Complete the steps under Connecting to the Source Database.
1. On the Select Type of ETL Loader window of the New File Wizard, select Advanced Extract – Transform – Load (ETL).
2. Click Next.
The Select or Create Database window appears.
3. To select a staging database to use for external data sources (for this project only), do one of the following:
a. Select an existing database to use from the DB URL field.
b. Select Create and Use New Database, enter a name for a new database in the DB Name field, and then
click Create Database. Select the new database in the DB URL field.
Note –
This database is required and is used for internal processing only.
4. Click Next.
The Choose Data Source window appears.
5. Do any of the following:
• If you do not have any file data sources, click Next and skip to step 16 (choosing JDBC data sources).
• To specify a file data source using a URL, enter the URL and click Add.
• To specify a file data source that is stored on your network, browse for and select a file containing source
data in the Choose a File box, and then click Add.
• Repeat the above two steps until all file data sources are selected.
6. Click Next.
The Enter Table Details window appears, with the information for the first data file displayed.
7. If necessary, modify the table name, the type of data encoding, and the type of document that contains the source
data.
Data Integrator automatically fills in these fields based on the information from the previous window, so the existing
values should be correct.
8. Click Next.
If the data file is a spreadsheet, the Choose a Sheet window appears; otherwise, the Import Table MetaData
window appears.
9. If the Choose a Sheet window appears, select the name of the sheet in the spreadsheet that contains the source
data, and then click Next.
10. When the Import Table Metadata window appears, modify the information about the data file as needed.
11. Data Integrator automatically fills in this information, but you might need to customize it. Preview the
information in the bottom portion of the window, and then click Next.
12. The Enter Column Properties window appears.
13. In the upper portion of the window, customize any of the column properties.
14. Preview the information in the lower portion of the window, and then click Next.
15. Do one of the following:
c. If you selected multiple file data sources, the wizard returns to the Enter Table Details window with the
attributes for a different file displayed. Repeat the above steps beginning with step 7.
d. If all the files you specified are configured, a dialog box appears confirming the database table creation.
Click OK on the dialog box and continue to the next step.
e. The Select JDBC Source Tables window appears.
16. If you specified file data sources, they are already listed under Selected Tables. Click Next if you have no JDBC
data sources to specify, or do the following to specify a JDBC data source:
f. Under Available Connections, select the database that contains the source data.
g. If there are multiple schemas in the database, select the schema to use.
h. Under Schemas, select the tables that contain the source data and then click Select.
i. Click Next.
j. If there are tables to join, the Select Source Tables for Join window appears; otherwise, the Generate
Target Database window appears.
17. To define join conditions, do the following. If there are no join conditions, click Next and skip to step 18.
k. Under Available Tables, select the tables to join, and then click the right arrow to add them to the
Selected Tables list.
l. In the Preview panel, click the drop-down menu at the top of the join box and select the type of join to use
from one of the following options:
▪ Inner – Use this if all tables to be joined contain the same column.
▪ Left Outer – Use this if the results should always include the records from the left table in the
join clause.
▪ Right Outer – Use this if the results should always include the records from the right table in the
join clause.
▪ Full Outer – Use this if the results should always include the records from both the right and left
tables in the join clause.
m. To specify columns to exclude from each joined table, click the Select Column tab in the Preview pane
and deselect any columns to exclude.
n. Click Next.
o. The Generate Target Database Master Index Model window appears.
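The four join types above can be illustrated with a small self-contained sketch. This is a minimal pure-Python illustration with hypothetical two-column tables (the wizard's actual tables depend on your source data), not the wizard's own mechanism:

```python
# Two tiny tables keyed on a shared "id" column.
left = {1: "Alice", 2: "Bob", 3: "Carol"}        # e.g. a customers table
right = {2: "Laptop", 3: "Phone", 4: "Tablet"}   # e.g. an orders table

def join(left, right, kind):
    """Return {key: (left value, right value)} for the chosen join type."""
    if kind == "inner":           # only keys present in both tables
        keys = left.keys() & right.keys()
    elif kind == "left":          # always keep every left-table record
        keys = left.keys()
    elif kind == "right":         # always keep every right-table record
        keys = right.keys()
    elif kind == "full":          # keep records from both tables
        keys = left.keys() | right.keys()
    return {k: (left.get(k), right.get(k)) for k in sorted(keys)}

print(join(left, right, "inner"))  # keys 2 and 3 only
print(join(left, right, "full"))   # keys 1-4, None where unmatched
```

Unmatched rows appear with `None` on the missing side, which is exactly why the outer joins return more rows than the inner join.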
18. To create the staging database, do the following:
p. Deselect the check box for Use Existing Database Target Tables.
q. In the Object Definition File field, browse to and select the object.xml file generated for the Master
Index project.
Note –This file is located in NetBeansProjects_Home/Project_Name/src/Configuration.
r. In the Target Database Folder field, select or enter the path where you want to store the database.
s. In the Target Database Name field, enter a name for the database.
t. Click Generate Database.
19. Click Next.
20. The Select JDBC Target Tables window appears. The target tables into which the extracted data will be loaded
are already listed under Available Connections. It is not recommended that you change these.
21. Click Next.
22. The Map Selected Collaboration Tables window appears.
23. To map source and target data, do the following:
u. To disable constraints on the target tables, select Disable Target Table Constraints.
v. Select the SQL statement type to use for the transfer. You can select insert, update, or both.
w. For each target table listed on the right, select one or more source tables from the list directly to the left of
the target table. These are the source tables that will be mapped to the target in the collaboration.
Note – If you do not specify a mapping here, the source tables do not appear in the ETL collaboration.
You can add the source tables directly to the collaboration using the Select Source and Target Tables
function. To select multiple source tables for one target, hold down the Control key while you select the
required source tables. If you select multiple source tables for one target, the source tables are
automatically joined.
x. Click Finish.
y. The new ETL collaboration appears in the Projects window. If multiple collaborations are created, they are
given the name you specified for the collaboration with a target table name appended.
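The staging flow the wizard automates (create a staging table, load the file data source into it, then map it onto a target table) can be sketched with Python's built-in sqlite3 module standing in for the staging database. All table and column names here are hypothetical:

```python
import sqlite3

# An in-memory SQLite database stands in for the wizard's staging database.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 1. Create a staging table matching the file data source's columns.
cur.execute("CREATE TABLE stg_customer (id INTEGER, name TEXT, city TEXT)")

# 2. Extract: rows as parsed from a (hypothetical) file data source.
source_rows = [(1, "Alice", "Pune"), (2, "Bob", "Mumbai")]
cur.executemany("INSERT INTO stg_customer VALUES (?, ?, ?)", source_rows)

# 3. Load: map staging columns onto a target table (insert statement type).
cur.execute("CREATE TABLE dim_customer (customer_id INTEGER, customer_name TEXT)")
cur.execute("INSERT INTO dim_customer SELECT id, name FROM stg_customer")
conn.commit()

print(cur.execute("SELECT * FROM dim_customer").fetchall())
```

The staging table is used for internal processing only, mirroring the note in step 3 above.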
Practical 3b: Create the cube with suitable dimension and fact tables based on OLAP.
Step 1: Creating Data Warehouse
Let us execute our T-SQL script to create the data warehouse with fact tables and dimensions, and populate them with appropriate
test values.
Download T-SQL script attached with this article for creation of Sales Data Warehouse or download from this article
“Create First Data Warehouse” and run it in your SQL Server.
Go to SQL Server Data Tools → right-click and select Run as administrator.
In Business Intelligence → Analysis Services Multidimensional and Data Mining models → give an appropriate project name
→ click OK.
Click on New
Select Server Name → select Use SQL Server Authentication → Select or enter a database name (Sales_DW)
Note : Password for sa : admin123 (as given during installation of SQL 2012 full version)
Click Next
Click Finish
Select FactProductSales (dbo) from Available objects and move it to Included objects by clicking ">".
Click Next
Click Finish
Click on Finish
Sales_DW.cube is created
Drag and drop Product Name from the table in the Data Source View and add it to the Attributes pane on the left side.
Double-click the Dim Date dimension → drag and drop fields from the table shown in the Data Source View to Attributes → drag
and drop attributes from the leftmost Attributes pane to the middle Hierarchy pane.
Drag the fields from Attributes to the Hierarchy window in this sequence: Year, Quarter Name, Month Name, Week of the Month,
Full Date UK.
Deployment successful
Click run
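The measure aggregation the cube performs over the Year → Quarter hierarchy can be approximated with a simple GROUP BY. This sketch uses Python's sqlite3 with made-up rows, not the actual Sales_DW schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A tiny stand-in for the fact table, with the date hierarchy denormalized.
cur.execute("CREATE TABLE FactProductSales "
            "(DateKey TEXT, Year INTEGER, QuarterName TEXT, SalesAmount REAL)")
rows = [
    ("20130101", 2013, "Q1", 100.0),
    ("20130215", 2013, "Q1", 150.0),
    ("20130410", 2013, "Q2",  75.0),
    ("20140105", 2014, "Q1", 200.0),
]
cur.executemany("INSERT INTO FactProductSales VALUES (?, ?, ?, ?)", rows)

# Roll the measure up the Year -> Quarter hierarchy, as the cube browser would.
cube = cur.execute(
    "SELECT Year, QuarterName, SUM(SalesAmount) FROM FactProductSales "
    "GROUP BY Year, QuarterName ORDER BY Year, QuarterName"
).fetchall()
print(cube)
```

A real cube pre-aggregates and stores these totals at every hierarchy level; the GROUP BY just shows what the numbers mean.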
Practical 4a : Create the ETL map and setup the schedule for execution.
This post will help you create a simple step by step ETL process flow within Adeptia.
File Source Activity: The File Source provides the ability to specify any file that is located on the local hard
disk, as a source.
Polling Service Activity: Polling Services allow the process flow to 'wait' and 'listen' at a defined location where
a specific file is to arrive or be modified before the next activity executes. The Polling Services
perform the 'listen' action at a frequency specified while creating the Polling activity.
File Trigger Activity: Trigger Events are used to schedule and trigger a process flow. Trigger Events enable you
to specify when and how frequently the process flow should be executed on a recurring basis. The File Event
enables you to specify when and how frequently a process flow should be executed based on the creation
of a new file, the existence of file(s) in a pre-defined location, or the modification of a file.
Here are the simple ETL Process Flow steps for transferring a file from any source to target after
transformation:
Step 1: If your file is on the local machine, create a new file source activity under Configure > Services > Source
> File. Configure the full path of the source file name in the File Path field and the source file name in the File
Name field. Save it. For more help click on Creating Source Activity and then click on Creating File Source
Activity in the Developer guide.
Step 2: Create a new schema activity under Configure > Services > Schema for the source file. A schema is
the structure of a file format: it specifies information about the different data fields and record types that a
message or a data file may contain. You can create different types of schemas according to the file structure.
For more help click on Creating Schema Activity in the Developer guide.
Step 3: Create a new schema activity under Configure > Services > Schema for the target file. If the target file
structure is the same as the source file structure, you do not need to create a new schema.
Step 4: Create a new Data Mapping activity under Configure > Services > Data Transform > Data Mapping. Data
Mapping is used to map source schema elements to target schema elements. You can map a source schema
element to a target schema element directly using the drag-and-drop approach. The process of mapping
elements comprises several steps:
• Load the Source and Target Schemas
• Map the Source and Target Elements
• Save the Mapping and Exit Data Mapper
Step 5: Create a new file target activity under Configure > Services > Target > File. Specify the name
and path of the target file to be created. For more help click on Creating Target Activity and then click
on Creating File Target Activity in the Developer guide.
As you have created all the activities, you now need to create a process flow. A process flow is a set of
activities arranged in a sequence to perform a specific task by combining various activities, i.e. Source, Target,
Schema, Transformer, etc.
Start Event > File Source (Step1) > Source Schema (Step 2) > Data Mapping (Step 4) > Target Schema (Step 3)
> File Target (Step 5) > End Event
Note: You must change the "transformer" property of the target schema (Step 3) to "XMLStream2stream" in
the process flow by double-clicking it.
Step 6: Go to Design > Process Flow, select the above process flow, and click Execute.
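The Source → Schema → Mapping → Target flow above can be sketched in plain Python using the standard csv module. The field names and the mapping rule below are hypothetical stand-ins for your own schemas:

```python
import csv
import io

# File Source: the source data (an in-memory CSV instead of a file on disk).
source = io.StringIO("first_name,last_name,amount\nAlice,Smith,100\nBob,Jones,250\n")

# Source Schema: parse records according to the source structure.
records = list(csv.DictReader(source))

# Data Mapping: map source elements onto target elements.
mapped = [
    {"full_name": f"{r['first_name']} {r['last_name']}", "amount": int(r["amount"])}
    for r in records
]

# Target Schema + File Target: write the records in the target structure.
target = io.StringIO()
writer = csv.DictWriter(target, fieldnames=["full_name", "amount"])
writer.writeheader()
writer.writerows(mapped)
print(target.getvalue())
```

Each comment corresponds to one activity in the process flow, so the script reads in the same Start Event → Source → Schema → Mapping → Target order.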
Practical 4b : Execute the MDX queries to extract the data from the datawarehouse.
Step 1: Open SQL Server Management Studio and connect to Analysis Services.
Server type: Analysis Services
Server name: (according to base machine)
Click Connect.
Step 2 : Click on New Query and type following query based on Sales_DW
Practical 5: Import the data warehouse data in Microsoft Excel and create the Pivot table and Pivot Chart.
Being able to analyze all the data can help you make better business decisions. But sometimes it’s hard
to know where to start, especially when you have a lot of data that is stored outside of Excel, like in a
Microsoft Access or Microsoft SQL Server database, or in an Online Analytical Processing (OLAP) cube
file. In that case, you’ll connect to the external data source, and then create a PivotTable to summarize,
analyze, explore, and present that data.
Practical 6
Apply the what – if Analysis for data visualization. Design and generate necessary reports based on the data
warehouse data.
A book store has 100 books in storage. You sell a certain percentage for the highest price of $50 and a certain percentage for the
lower price of $20.
If you sell 60% for the highest price, cell D10 calculates a total profit of 60 × $50 + 40 × $20 = $3,800.
Create Different Scenarios But what if you sell 70% for the highest price? And what if you sell 80% for the highest
price? Or 90%, or even 100%? Each different percentage is a different scenario. You can use the Scenario
Manager to create these scenarios.
Note: Rather than typing different percentages into cell C4 one at a time to see the corresponding result in cell D10, we use
what-if analysis.
What-if analysis enables you to easily compare the results of different scenarios.
Step 1: In Excel, on the Data tab, in the Data Tools group, click What-If Analysis.
Step 2: Select Scenario Manager.
The Scenario Manager Dialog box appears.
Step 3: Add a scenario by clicking on Add.
Step 4: Type a name (60 percent), select cell C4 (% sold for the highest price) for the Changing cells, and click
OK. Click the icon that is circled in the screenshot.
Select cell C4.
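The same scenarios can be checked outside Excel with a short script that mirrors the worksheet formula (100 books, $50 high price, $20 low price):

```python
BOOKS, HIGH, LOW = 100, 50, 20

def total_profit(pct_high):
    """Profit when pct_high% of the books sell at the high price."""
    high_units = BOOKS * pct_high / 100
    low_units = BOOKS - high_units
    return high_units * HIGH + low_units * LOW

# One row per scenario, like the Scenario Manager's summary report.
scenarios = {pct: total_profit(pct) for pct in (60, 70, 80, 90, 100)}
print(scenarios)  # 60% -> 3800.0, ..., 100% -> 5000.0
```

The 60% scenario reproduces the $3,800 figure from cell D10 above; the other entries are the scenarios the Scenario Manager would compare.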
Practical 7
Perform data classification using a classification algorithm.
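One simple classification algorithm that fits this practical is nearest-neighbour classification. The sketch below is a minimal pure-Python 1-NN on made-up labelled data, offered as an illustration rather than the syllabus's prescribed tool:

```python
import math

def nearest_neighbor_classify(train, point):
    """Classify `point` by the label of its nearest training example (1-NN)."""
    # Pick the (features, label) pair whose features are closest to `point`.
    return min(train, key=lambda item: math.dist(item[0], point))[1]

# Toy labelled data: (feature vector, class label).
train = [((1.0, 1.0), "low"), ((1.2, 0.8), "low"),
         ((8.0, 9.0), "high"), ((9.0, 8.5), "high")]

print(nearest_neighbor_classify(train, (1.1, 1.0)))  # -> "low"
print(nearest_neighbor_classify(train, (8.5, 9.0)))  # -> "high"
```

In practice you would extract the feature vectors and labels from the warehouse tables instead of hard-coding them.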
Practical 8
Perform data clustering using a clustering algorithm.
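A minimal k-means sketch in pure Python (1-D data and fixed initial centroids, so the result is deterministic); the data points are made up for illustration:

```python
def kmeans_1d(points, centroids, iterations=10):
    """Simple 1-D k-means: alternate assignment and centroid update."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[idx].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

points = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0]
centroids, clusters = kmeans_1d(points, centroids=[1.0, 10.0])
print(centroids)  # -> [1.5, 10.5]
```

The two obvious groups in the data end up in separate clusters, with each centroid at its cluster's mean.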
Practical 9
Perform linear regression on the given data warehouse data.
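Linear regression by ordinary least squares can be computed directly from the closed-form formulas. The spend/sales numbers below are hypothetical, not taken from any particular warehouse:

```python
def linear_regression(xs, ys):
    """Ordinary least squares fit: y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical warehouse extract: advertising spend vs. sales amount.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 5.0, 7.1, 8.9, 11.0]
slope, intercept = linear_regression(spend, sales)
print(round(slope, 2), round(intercept, 2))
```

The fitted line then predicts the sales measure for unseen spend values via `slope * x + intercept`.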
Practical 10
Perform logistic regression on the given data warehouse data.
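A minimal logistic-regression sketch trained by stochastic gradient descent; the single-feature extract and class labels below are made up and linearly separable, so the fitted model recovers them exactly:

```python
import math

def train_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit P(y=1 | x) = sigmoid(w*x + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(w * x + b)))  # predicted probability
            w -= lr * (p - y) * x                 # gradient step for the weight
            b -= lr * (p - y)                     # gradient step for the bias
    return w, b

# Hypothetical warehouse extract: a numeric feature vs. a 0/1 class label.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)

def predict(x):
    """Classify as 1 when the predicted probability reaches 0.5."""
    return 1 if 1 / (1 + math.exp(-(w * x + b))) >= 0.5 else 0

print([predict(x) for x in xs])  # should recover the training labels
```

Unlike linear regression, the output is a probability squashed through the sigmoid, which is why this model suits yes/no targets.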