Lab Manual BI

2023-24

Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)

Business Intelligence (AL801)

Submitted To: Dr. Mayur Rathi

Submitted By:
Harsh Khichi
Enrollment No.: 0827AL201022
Class/Year/Sem: AL_F-1 / 4th / 8th

[LAB ASSIGNMENT BUSINESS INTELLIGENCE (AL-801)]

The objective of this laboratory work is to give students a working knowledge of Business Intelligence and its applications: how to extract knowledge from data and information, draw conclusions, make predictions, and take forward-looking actions.
ACROPOLIS INSTITUTE OF TECHNOLOGY & RESEARCH,
INDORE

Department of CSE (Artificial Intelligence & Machine Learning)

CERTIFICATE

This is to certify that the experimental work entered in this journal, as per the B.Tech. IV year syllabus prescribed by RGPV, was done by Mr. Harsh Khichi, B.Tech. IV year, VIII semester, in the Business Intelligence Laboratory of this institute during the academic year 2023-24.

Signature of the Faculty


ABOUT THE LABORATORY

In this lab, students will learn and develop applications using Business Intelligence concepts. Students can expand their skill set by deriving practical solutions using predictive analytics. Moreover, this lab provides an understanding of the importance of various algorithms in Data Science. A business intelligence environment offers decision makers information and knowledge derived from data processing, through the application of mathematical models and algorithms. The latest platforms and compilers are provided to the students to run their programs.
GENERAL INSTRUCTIONS FOR LABORATORY CLASSES

➢ DO’S

✓ Do not enter the laboratory without prior permission.

✓ Wear your ID card while entering the lab.

✓ Come in proper uniform.

✓ Sign the LOGIN REGISTER before entering the laboratory.

✓ Bring your observation and record notebooks to the laboratory.

✓ Maintain silence inside the laboratory.

✓ After completing the laboratory exercise, make sure to shut down the system properly.

➢ DON'TS

✓ Do not bring bags inside the laboratory.

✓ Do not use the computers improperly.

✓ Do not scribble on the desks or mishandle the chairs.

✓ Do not use mobile phones inside the laboratory.

✓ Do not make noise inside the laboratory.


SYLLABUS
Course: AL801 (Business Intelligence)
Branch/Year/Sem: Artificial Intelligence & Machine Learning / IV / VIII

Module 1: Effective and timely decisions – Data, information and knowledge – Role of
mathematical models – Business intelligence architectures: Cycle of a business
intelligence analysis – Enabling factors in business intelligence projects – Development
of a business intelligence system – Ethics and business intelligence.

Module 2: The business intelligence user types, Standard reports, Interactive Analysis
and Ad Hoc Querying, Parameterized Reports and Self-Service Reporting, dimensional
analysis, Alerts/Notifications, Visualization: Charts, Graphs, Widgets, Scorecards and
Dashboards, Geographic Visualization, Integrated Analytics, Considerations:
Optimizing the Presentation for the Right Message.

Module 3: Efficiency measures – The CCR model: Definition of target objectives – Peer
groups – Identification of good operating practices; cross efficiency analysis – virtual
inputs and outputs – Other models. Pattern matching – cluster analysis, outlier analysis.

Module 4: Marketing models – Logistic and Production models – Case studies.

Module 5: Future of business intelligence – Emerging Technologies, Machine Learning, Predicting the Future, BI Search & Text Analytics – Advanced Visualization – Rich Report, Future beyond Technology.

HARDWARE AND SOFTWARE REQUIREMENTS:

S.No.  Name of Item       Specification
1      Computer System    Hard disk: min 5 GB; RAM: 4 GB / 8 GB; Processor: Intel i3 or above
2      Operating system   Windows XP or 2000
3      Editor             Python 3.7 IDLE, Google Colab, or Spyder (Anaconda)
RATIONALE:
The purpose of this subject is to study the fundamental strengths and limits of predictive
analysis as well as how these interact with mathematics, data science, and other
disciplines.

PREREQUISITE:-
Experience with Python is suggested. Prior knowledge of Data Science, Machine Learning algorithms, and foundations of Mathematics is helpful.

COURSE OBJECTIVES AND OUTCOMES

➢ Course Objectives
The student should be made to:

1. Be exposed to the basic rudiments of business intelligence systems.
2. Understand the modeling aspects behind Business Intelligence.
3. Understand the business intelligence life cycle and the techniques used in it.
4. Be exposed to different data analysis tools and techniques.

➢ Course Outcomes

At the end of the course student will be able to:

1. Explain the basic concepts of business intelligence and make effective and timely decisions.
2. Deal with data and information, and convert them into knowledge to be stored in business intelligence systems for use by knowledge workers.
3. Measure efficiency using different models.
4. Apply the learned concepts in different real-time applications.
5. Understand the future of business intelligence through emerging technologies.

Index

S.No  Name of the Experiment                                          Date of Exp.  Page No.  Date of Submission  Grade & Sign of the Faculty

1   Import the legacy data from different sources (Excel, SQL Server, Oracle, etc.) and load it into the target system.
2   Perform the Extraction, Transformation and Loading (ETL) process to construct the database in SQL Server / Power BI.
3   Data Visualization from the ETL process
4   Apply what-if analysis for data visualization. Design and generate necessary reports based on the data warehouse data.
5   Implementation of a classification algorithm in R programming
6   Practical implementation of a Decision Tree using the R tool
7   k-means clustering using R
8   Prediction using Linear Regression
9   Data analysis using Time Series Analysis
10  Data analysis and visualization using Advanced Excel
Program Outcome (PO)

The engineering graduate of this institute will demonstrate:


a) Apply knowledge of mathematics, science, computing and engineering fundamentals to computer
science engineering problems.
b) Able to identify, formulate, and demonstrate with excellent programming, and problem solving skills.
c) Design solutions for engineering problems including design of experiment and processes to meet
desired needs within reasonable constraints of manufacturability, sustainability, ecological,
intellectual and health and safety considerations.
d) Propose and develop effective investigational solutions to complex problems using research methodology, including design of experiments, analysis and interpretation of data, and synthesis of information to provide suitable conclusions.
e) Ability to create, select and use the modern techniques and various tools to solve engineering
problems and to evaluate solutions with an understanding of the limitations.
f) Ability to acquire knowledge of contemporary issues to assess societal, health and safety, legal and
cultural issues.
g) Ability to evaluate the impact of engineering solutions on individual as well as organization in a
societal and environmental context, and recognize sustainable development, and will be aware of
emerging technologies and current professional issues.
h) Capability to possess leadership and managerial skills, and understand and commit to professional
ethics and responsibilities.
i) Ability to demonstrate the team work and function effectively as an individual, with an ability to
design, develop, test and debug the project, and will be able to work with a multi-disciplinary team.
j) Ability to communicate effectively on engineering problems with the community, such as being able
to write effective reports and design documentation.
k) Flexibility to feel the recognition of the need for, and have the ability to engage in independent and
life- long learning by professional development and quality enhancement programs in context of
technological change.
l) A practice of engineering and management principles, and the ability to apply these to one's own work, as a member and leader in a team, to manage projects and entrepreneurship.
Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)
Lab: Business Intelligence Title:
(AL801)
EVALUATION RECORD Type/ Lab Session:
Name Harsh Khichi Enrollment No. 0827AL201022
Performing on First submission Second submission
Extra Regular

Grade and Remarks by the Tutor


1. Clarity about the objective of experiment
2. Clarity about the Outcome
3. Submitted the work in desired format
4. Shown capability to solve the problem
5. Contribution to the team work

Additional remarks

Grade: Cross the grade.


A B C D F

Tutor

1 Title
Import legacy data from different sources (Excel, SQL Server, Oracle, etc.) and load it into the target system.
2 Neatly Drawn and labeled experimental setup
NA
3 Theoretical solution of the instant problem
3.1 Algorithm
1. Identify legacy data sources.
2. Establish connections to each data source (Excel, SQL Server, Oracle).
3. Retrieve data from each source.
4. Transform data if required (e.g., format conversion, cleansing).
5. Load the transformed data into the target system.
3.2 Program
import pandas as pd
import pyodbc
import cx_Oracle

# Function to read data from Excel
def read_excel_data(file_path):
    return pd.read_excel(file_path)

# Function to fetch data from SQL Server
def fetch_sql_data(server, database, query):
    conn_str = (f'DRIVER={{SQL Server}};SERVER={server};'
                f'DATABASE={database};Trusted_Connection=yes;')
    conn = pyodbc.connect(conn_str)
    cursor = conn.cursor()
    cursor.execute(query)
    columns = [column[0] for column in cursor.description]
    data = [dict(zip(columns, row)) for row in cursor.fetchall()]
    cursor.close()
    conn.close()
    return pd.DataFrame(data)

# Function to fetch data from an Oracle database
def fetch_oracle_data(user, password, dsn, query):
    conn = cx_Oracle.connect(f'{user}/{password}@{dsn}')
    cursor = conn.cursor()
    cursor.execute(query)
    columns = [column[0] for column in cursor.description]
    data = [dict(zip(columns, row)) for row in cursor.fetchall()]
    cursor.close()
    conn.close()
    return pd.DataFrame(data)

# Function to load data into the target system
def load_data_into_target(data):
    # Add your implementation here to load data into the target system
    print("Data loaded into the target system.")

# File path for Excel data
excel_file_path = "path_to_excel_file.xlsx"

# SQL Server connection details and query
sql_server = "your_sql_server"
sql_database = "your_database"
sql_query = "SELECT * FROM your_table"

# Oracle connection details and query
oracle_user = "your_user"
oracle_password = "your_password"
oracle_dsn = "your_dsn"
oracle_query = "SELECT * FROM your_table"

# Read data from Excel
excel_data = read_excel_data(excel_file_path)

# Fetch data from SQL Server
sql_data = fetch_sql_data(sql_server, sql_database, sql_query)

# Fetch data from Oracle
oracle_data = fetch_oracle_data(oracle_user, oracle_password, oracle_dsn, oracle_query)

# Combine all sources and load into the target system
all_data = pd.concat([excel_data, sql_data, oracle_data], ignore_index=True)
load_data_into_target(all_data)
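The load_data_into_target function above is deliberately left as a stub. One possible concrete implementation, shown only as a sketch, assumes a SQLite database as the target; the database path and table name here are illustrative, not prescribed by the exercise:

```python
import sqlite3
import pandas as pd

def load_data_into_target(data: pd.DataFrame, db_path: str = "target.db",
                          table: str = "legacy_data") -> int:
    """Write a DataFrame into a SQLite table and return the number of rows loaded."""
    conn = sqlite3.connect(db_path)
    try:
        data.to_sql(table, conn, if_exists="replace", index=False)
        count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    finally:
        conn.close()
    return count

# Quick check against a tiny in-memory database
demo = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
print(load_data_into_target(demo, db_path=":memory:"))  # → 3
```

Returning the row count gives the caller a simple way to verify the load succeeded before reporting "Data loaded into the target system."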
4 Tabulation Sheet

INPUT OUTPUT
Excel, SQL Server, Oracle data Data loaded into the target system

5 Results

Legacy data from Excel, SQL Server, and Oracle was successfully imported and loaded into the target
system.

Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)
Lab: Business Intelligence Title:
(AL801)
EVALUATION RECORD Type/ Lab Session:
Name Harsh Khichi Enrollment No. 0827AL201022
Performing on First submission Second submission
Extra Regular

Grade and Remarks by the Tutor


1. Clarity about the objective of experiment
2. Clarity about the Outcome
3. Submitted the work in desired format
4. Shown capability to solve the problem
5. Contribution to the team work

Additional remarks

Grade: Cross the grade.


A B C D F

Tutor

1 Title
Perform the Extraction, Transformation and Loading (ETL) process to construct the database in SQL Server / Power BI.
2 Neatly Drawn and labeled experimental setup
NA
3 Theoretical solution of the instant problem
3.1 Algorithm
1) Extraction:
• Retrieve data from various sources (e.g., files, databases).
• Use appropriate tools (e.g., SSIS for SQL Server, Power Query for Power BI)
2) Transformation:
• Cleanse and validate data.
• Perform data transformations (e.g., filtering, joining, aggregating).
• Apply business rules and logic.

3) Loading:
• Insert transformed data into the target database.
• Ensure data integrity and consistency.
• Handle errors and logging.
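The three phases above can also be sketched end-to-end in plain Python with pandas and SQLite. This is only an illustration: the sample records, column names, and target table name are made up, and SQLite stands in for the SQL Server / Power BI target described in this experiment.

```python
import sqlite3
import pandas as pd

# 1) Extraction: read raw records (an in-memory stand-in for a file or database source)
raw = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "country": ["India", "India", "USA", None],
    "amount": [100.0, 250.0, 80.0, 40.0],
})

# 2) Transformation: cleanse (drop rows with no country) and aggregate per country
clean = raw.dropna(subset=["country"])
summary = clean.groupby("country", as_index=False)["amount"].sum()

# 3) Loading: insert the transformed table into the target database (SQLite here)
conn = sqlite3.connect(":memory:")
summary.to_sql("sales_by_country", conn, index=False)
print(conn.execute("SELECT * FROM sales_by_country ORDER BY country").fetchall())
# → [('India', 350.0), ('USA', 80.0)]
conn.close()
```

The same extract → cleanse/aggregate → load shape is what SSIS packages and Power Query applied steps express graphically.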
3.2 Program
ETL Process in SQL Server

Step 1 − Open either BIDS\SSDT based on the version from the Microsoft SQL Server programs
group. The following screen appears.

Step 2 − The above screen shows SSDT has opened. Go to file at the top left corner in the above
image and click New. Select project and the following screen opens.

Step 3 − Select Integration Services under Business Intelligence on the top left corner in the above
screen to get the following screen.

Step 4 − In the above screen, select either Integration Services Project or Integration Services Import Project Wizard, based on your requirement, to develop/create the package. There are two modes − Native Mode (SQL Server Mode) and SharePoint Mode − and two models − the Tabular Model (for team and personal analysis) and the Multidimensional Model (for corporate analysis).

ETL Process in Power BI


Step 1- Extract Data:
- Open Power BI Desktop.
- Click on "Get Data" in the Home tab.
- Choose the data source(s) from which you want to extract data (e.g., Excel, SQL Server, CSV, Web,
etc.).
- Connect to the data source and import the relevant data into Power BI.

Step 2- Transform Data:


- Once the data is imported, it will open in the Power Query Editor.
- Perform data cleaning, filtering, shaping, and other transformations as needed using the various tools
available in the Power Query Editor.
- You can remove duplicates, replace values, merge tables, create calculated columns, etc.
- Ensure that the data is structured in a way that aligns with your database schema and analysis
requirements.

Step 3- Load Data:


- After transforming the data, click on "Close & Apply" to load the data into the Power BI data model.

- Power BI will load the transformed data into its internal data model, which you can then use to create
visualizations and reports.

Step 4- Data Modeling (optional):


- If necessary, you can further enhance your data model by creating relationships between tables,
defining measures, adding calculated columns, and optimizing the model for better performance.

Step 5- Visualize Data:


- Once the data is loaded into the data model, you can create interactive visualizations and reports
using the available visualization tools in Power BI.

Step 6- Publish to Power BI Service:


- If you want to share your reports and dashboards with others, you can publish your Power BI
Desktop file to the Power BI service.
- From there, you can share it with colleagues, collaborate on reports, and schedule data refreshes to
keep your reports up-to-date.

Step 7- Schedule Data Refresh (if applicable):


- If your data source is dynamic and changes over time, you can schedule automatic data refreshes in
the Power BI service to keep your reports updated with the latest data.

4 Tabulation Sheet

INPUT OUTPUT
NA NA

5 Results
• Document the successful execution of the ETL process.
• Include any issues encountered and their resolutions.
• Provide insights gained from the constructed database in SQL Server/Power BI.

Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)
Lab: Business Intelligence Title:
(AL801)
EVALUATION RECORD Type/ Lab Session:
Name Harsh Khichi Enrollment No. 0827AL201022
Performing on First submission Second submission
Extra Regular

Grade and Remarks by the Tutor


1. Clarity about the objective of experiment
2. Clarity about the Outcome
3. Submitted the work in desired format
4. Shown capability to solve the problem
5. Contribution to the team work

Additional remarks

Grade: Cross the grade.


A B C D F

Tutor

1 Title
Data Visualization from ETL Process
2 Neatly Drawn and labeled experimental setup
NA
3 Theoretical solution of the instant problem
3.1 Algorithm
1. Create Charts
• Drag UnitsInStock onto the canvas to create a Table visualization.
• Set ProductName as the axis and sort the table by UnitsInStock
• Drag OrderDate onto the canvas, then drag LineTotal to create a Line Chart.
• Drag ShipCountry onto the canvas to create a map.
• Set LineTotal as the values for the map.
2. Interact with the Visuals
• Click the light blue circle on a map point to filter the other visuals to that point's data.

3.2 Program
Step 1: Create charts showing Units in Stock by Product and Total Sales by Year
• Drag UnitsInStock from the Fields pane (along the right of the screen) onto a blank space on the canvas. A Table visualization is created. Next, drag ProductName to the Axis box, found in the bottom half of the Visualizations pane, then select Sort By > UnitsInStock from the menu in the top right corner of the visualization.

• Drag OrderDate to the canvas beneath the first chart, then drag LineTotal (again, from the Fields pane) onto the visual, and select Line Chart.

• Next, drag ShipCountry to a space on the canvas in the top right. Because you selected a geographic field, a map is created automatically. Now drag LineTotal to the Values field; the circles on the map for each country are now sized relative to the LineTotal for orders shipped to that country.

Step 2: Interact with your report visuals to analyze further
• Click on the light blue circle centered in Canada. Note how the other visuals are filtered to
show Stock (ShipCountry) and Total Orders (LineTotal) just for Canada.
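Outside Power BI, the first two of these views can be prototyped with pandas and matplotlib (the map visual is omitted here, since geographic plotting needs extra libraries). This is only a sketch: the order records below are made up, and the field names simply mirror the ones used in the steps above.

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the script runs without a display
import matplotlib.pyplot as plt

# Made-up order records mirroring the fields used in the Power BI steps
orders = pd.DataFrame({
    "OrderDate": pd.to_datetime(["2023-01-15", "2023-02-10", "2023-02-20", "2023-03-05"]),
    "ProductName": ["Chai", "Chang", "Chai", "Aniseed Syrup"],
    "UnitsInStock": [39, 17, 39, 13],
    "LineTotal": [180.0, 304.0, 90.0, 200.0],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Analogue of the sorted UnitsInStock-by-ProductName table/bar visual
stock = orders.drop_duplicates("ProductName").sort_values("UnitsInStock", ascending=False)
ax1.bar(stock["ProductName"], stock["UnitsInStock"])
ax1.set_title("UnitsInStock by ProductName")

# Analogue of the LineTotal-over-OrderDate line chart
by_month = orders.set_index("OrderDate").resample("MS")["LineTotal"].sum()
ax2.plot(by_month.index, by_month.values, marker="o")
ax2.set_title("LineTotal by OrderDate")

fig.savefig("etl_visuals.png")
```

Cross-filtering, of course, remains a Power BI feature; the script only reproduces the static charts.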

4 Tabulation Sheet

INPUT OUTPUT
NA NA

5 Results

Present the results of your ETL process. This could include visualizations generated from the transformed
data. Describe the insights gained from the visualization and how it helps in understanding the data better.

Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)
Lab: Business Intelligence Title:
(AL801)
EVALUATION RECORD Type/ Lab Session:
Name Harsh Khichi Enrollment No. 0827AL201022
Performing on First submission Second submission
Extra Regular

Grade and Remarks by the Tutor


1. Clarity about the objective of experiment
2. Clarity about the Outcome
3. Submitted the work in desired format
4. Shown capability to solve the problem
5. Contribution to the team work

Additional remarks

Grade: Cross the grade.


A B C D F

Tutor

1 Title
Apply what-if analysis for data visualization. Design and generate necessary reports based on the data warehouse data.
2 Neatly Drawn and labeled experimental setup
NA
3 Theoretical solution of the instant problem
You run a book store and have 100 books in storage. You sell a certain percentage for the highest price of $50 and the rest for the lower price of $20. If you sell 60% at the highest price, cell D10 calculates a total profit of 60 * $50 + 40 * $20 = $3800. Create Different Scenarios: but what if you sell 70% at the highest price? What about 80%, 90%, or even 100%? Each different percentage is a different scenario. You can use the Scenario Manager to create these scenarios. Note: you can simply type a different percentage into cell C4 to see the corresponding result in cell D10; however, what-if analysis lets you easily compare the results of different scenarios.

3.1 Algorithm
• Open Excel and load the data table.
• Go to Data tab and click What-If Analysis.
• Select Scenario Manager.
• Add scenarios by naming and assigning values.
• Verify scenarios and Apply scenarios to data.
• Analyze and save updated data.
• Close Excel.
3.2 Program
Step 1 - On the Data tab, in the Forecast group, click What-If Analysis.

Step 2 - Click Scenario Manager.

Step 3 - Add a scenario by clicking on Add.

Step 4 - Type a name (60% highest), select cell C4 (% sold for the highest price) for the Changing
cells and click on OK.

Step 5 - Enter the corresponding value 0.6 and click on OK again.

Step 6 - Next, add 4 other scenarios (70%, 80%, 90% and 100%). Finally, your Scenario Manager
should be consistent with the picture below:

4 Tabulation Sheet

INPUT OUTPUT
60 3800
70 4100
80 4400
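The scenario arithmetic behind this tabulation can be cross-checked with a short Python sketch. The $50/$20 prices and the 100-book stock come from the worked example above; the function name is ours:

```python
HIGH_PRICE = 50    # highest selling price per book, $
LOW_PRICE = 20     # lower selling price per book, $
TOTAL_BOOKS = 100  # books in storage

def total_profit(pct_high: float) -> float:
    """Profit when pct_high percent of the stock sells at the highest price."""
    sold_high = TOTAL_BOOKS * pct_high / 100
    sold_low = TOTAL_BOOKS - sold_high
    return sold_high * HIGH_PRICE + sold_low * LOW_PRICE

for pct in (60, 70, 80, 90, 100):
    print(pct, total_profit(pct))
# 60 → 3800.0, 70 → 4100.0, 80 → 4400.0 (matching the tabulation), 90 → 4700.0, 100 → 5000.0
```

Each percentage corresponds to one Scenario Manager scenario; the loop plays the role of Excel's scenario summary.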

5 Results
Discuss the findings and insights gained from the analysis and Interpret the results and their
implications for decision-making or further analysis.

Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)
Lab: Business Intelligence Title:
(AL801)
EVALUATION RECORD Type/ Lab Session:
Name Harsh Khichi Enrollment No. 0827AL201022
Performing on First submission Second submission
Extra Regular

Grade and Remarks by the Tutor


1. Clarity about the objective of experiment
2. Clarity about the Outcome
3. Submitted the work in desired format
4. Shown capability to solve the problem
5. Contribution to the team work

Additional remarks

Grade: Cross the grade.


A B C D F

Tutor

1 Title
Implementation of Classification Algorithm in R Programming
2 Neatly Drawn and labeled experimental setup
NA
3 Theoretical solution of the instant problem
3.1 Algorithm
• Start:
• Define the rainfall data points for each month starting from January 2012.
• Create a time series object using the defined rainfall data points.
• Set the start date of the time series to January 2012 and frequency to monthly (12
months).
• Print the time series data to display the rainfall values for each month.
• Open a file to save the plot chart as an image (e.g., PNG format).
• Plot a graph of the time series, with months on the x-axis and rainfall values on the y-axis.
• Save the plotted graph as an image file.
• Close the file.
• End.
3.2 Program
Consider the annual rainfall details at a place starting from January 2012. We create an R time series
object for a period of 12 months and plot it.
# Get the data points in form of a R vector.
rainfall <-
c(799,1174.8,865.1,1334.6,635.4,918.5,685.5,998.6,784.2,985,882.8,1071)
# Convert it to a time series object.
rainfall.timeseries <-
ts(rainfall,start = c(2012,1),frequency = 12)
# Print the timeseries data.
print(rainfall.timeseries)
# Give the chart file a name.
png(file = "rainfall.png")
# Plot a graph of the time series.
plot(rainfall.timeseries)
# Save the file.
dev.off()

4 Tabulation Sheet

INPUT: 799, 1174.8, 865.1, 1334.6, 635.4, 918.5, 685.5, 998.6, 784.2, 985, 882.8, 1071

OUTPUT:
        Jan     Feb    Mar     Apr    May    Jun    Jul    Aug    Sep    Oct    Nov     Dec
2012  799.0  1174.8  865.1  1334.6  635.4  918.5  685.5  998.6  784.2  985.0  882.8  1071.0

5 Results
The algorithm above visualizes the annual rainfall data starting from January 2012; we obtain a time series plot depicting the variation in rainfall over the 12-month period.

Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)
Lab: Business Intelligence Title:
(AL801)
EVALUATION RECORD Type/ Lab Session:
Name Harsh Khichi Enrollment No. 0827AL201022
Performing on First submission Second submission
Extra Regular

Grade and Remarks by the Tutor


1. Clarity about the objective of experiment
2. Clarity about the Outcome
3. Submitted the work in desired format
4. Shown capability to solve the problem
5. Contribution to the team work

Additional remarks

Grade: Cross the grade.


A B C D F

Tutor

1 Title
Practical Implementation of Decision Tree using R Tool
2 Neatly Drawn and labeled experimental setup
NA
3 Theoretical solution of the instant problem
3.1 Algorithm
Using ctree() Function from the party Package
1. Load the Required Packages:
• Load the "party" package, which contains the ctree() function.
2. Prepare Input Data:
• Load or create the dataset containing variables like "nativeSpeaker", "age",
"shoeSize", and "score".
• Subset the dataset if needed.
3. Create the Decision Tree:
• Use the ctree() function:

• Specify the formula: nativeSpeaker ~ age + shoeSize + score.
• Provide the input data using the 'data' parameter.
4. Plot the Tree:
• Generate a graphical representation of the decision tree using the plot() function.
• Optionally, save the tree visualization as an image file.
3.2 Program
Input Data
We will use the R built-in data set named readingSkills to create a decision tree. For each person it records the variables "age", "shoeSize" and "score", along with whether the person is a native speaker; the tree predicts nativeSpeaker from the other three variables.

Here is the sample data.


# Load the party package. It will automatically load other
# dependent packages.
library(party)
# Print some records from data set readingSkills.
print(head(readingSkills))
When we execute the above code, it produces the following result and chart –

We will use the ctree() function to create the decision tree and see its graph.
# Load the party package. It will automatically load other
# dependent packages.
library(party)
# Create the input data frame.
input.dat <- readingSkills[c(1:105),]
# Give the chart file a name.

png(file = "decision_tree.png")
# Create the tree.
output.tree <- ctree( nativeSpeaker ~ age + shoeSize + score, data = input.dat)
# Plot the tree.
plot(output.tree)
# Save the file.
dev.off()
Output:-
null device 1
Loading required package: methods
Loading required package: grid
Loading required package: mvtnorm
Loading required package: modeltools
Loading required package: stats4
Loading required package: strucchange
Loading required package: zoo
Attaching package: ‘zoo’
The following objects are masked from ‘package:base’: as.Date, as.Date.numeric
Loading required package: sandwich

4 Tabulation Sheet

INPUT OUTPUT
NA NA

5 Results
Upon executing the algorithm to create and visualize the decision tree using the ctree()
function from the party package in R, we obtained the following outcome:

Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)
Lab: Business Intelligence Title:
(AL801)
EVALUATION RECORD Type/ Lab Session:
Name Harsh Khichi Enrollment No. 0827AL201022
Performing on First submission Second submission
Extra Regular

Grade and Remarks by the Tutor


1. Clarity about the objective of experiment
2. Clarity about the Outcome
3. Submitted the work in desired format
4. Shown capability to solve the problem
5. Contribution to the team work

Additional remarks

Grade: Cross the grade.


A B C D F

Tutor

1 Title
K-Means Clustering Using R
2 Neatly Drawn and labeled experimental setup
NA
3 Theoretical solution of the instant problem
3.1 Algorithm
1. Initialize centroids:
- Randomly select k data points from the dataset as initial centroids.

2. Repeat until convergence:


a. Assignment step:
- Assign each data point to the nearest centroid.
b. Update step:
- Update each centroid to be the mean of the data points assigned to it.

c. Check convergence:
- If centroids do not change significantly or a maximum number of iterations is reached,
exit.
3. Output:
- Return the final cluster centroids and cluster assignments.
3.2 Program

(The original program and its output appear as screenshots: R code that runs k-means, compares the Species label with the clustering result, and plots the clusters and their centres.)

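Since the original R listing survives only as screenshots, here is a hedged Python analogue using scikit-learn that follows the same workflow: cluster the measurements into k = 3 groups, compare against the Species label, and report the centroids. The use of the iris data is an assumption, inferred from the Species comparison mentioned above.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Load the iris measurements (assumed data set; the Species column suggests iris)
iris = load_iris(as_frame=True)
X = iris.data

# Fit k-means with k = 3, matching the three species
km = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = km.fit_predict(X)

# Compare the Species label with the clustering result
species = iris.target_names[iris.target]
print(pd.crosstab(species, labels, rownames=["Species"], colnames=["Cluster"]))

# Final cluster centroids: one row per cluster, one column per measurement
print(pd.DataFrame(km.cluster_centers_, columns=X.columns).round(2))
```

The cross-tabulation plays the same role as comparing the Species label with the clustering result in the R version, and the centroid table corresponds to the cluster centres drawn in the plot.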
4 Tabulation Sheet

INPUT OUTPUT
NA NA

5 Results
After running the k-means clustering algorithm on the provided dataset with the chosen number of clusters, we obtain the following results:

• Cluster centroids: These are the final centroids of each cluster, representing the center points
around which data points in each cluster are grouped.

• Cluster assignments: Each data point is assigned to one of the clusters based on its proximity to the
centroids. The cluster assignments indicate which cluster each data point belongs to.

These results help in understanding how the data points are grouped into clusters and the central
tendencies of each cluster represented by their centroids.

Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)
Lab: Business Intelligence Title:
(AL801)
EVALUATION RECORD Type/ Lab Session:
Name Harsh Khichi Enrollment No. 0827AL201022
Performing on First submission Second submission
Extra Regular

Grade and Remarks by the Tutor


1. Clarity about the objective of experiment
2. Clarity about the Outcome
3. Submitted the work in desired format
4. Shown capability to solve the problem
5. Contribution to the team work

Additional remarks

Grade: Cross the grade.


A B C D F

Tutor

1 Title
Prediction Using Linear Regression
2 Neatly Drawn and labeled experimental setup
NA
3 Theoretical solution of the instant problem
3.1 Algorithm
1. Import the necessary libraries: NumPy, Pandas, Matplotlib, and scikit-learn.
2. Load the dataset into a Pandas DataFrame.
3. Preprocess the data: handle missing values and encode categorical variables, if any.
4. Split the data into training and testing sets.
5. Create a linear regression model object.
6. Train the model using the training dataset.
7. Make predictions on the testing dataset.
8. Evaluate the model's performance using appropriate metrics such as Mean Squared Error (MSE) or R-squared.
9. Plot the actual vs. predicted values to visualize the model's performance.
3.2 Program
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Example dataset
data = {'X': [1, 2, 3, 4, 5],
'Y': [2, 4, 5, 4, 5]}

df = pd.DataFrame(data)

# Splitting the dataset into independent (X) and dependent (Y) variables
X = df[['X']]
Y = df['Y']

# Splitting the dataset into training and testing sets


X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

# Creating a linear regression model object


model = LinearRegression()

# Training the model


model.fit(X_train, Y_train)

# Making predictions
Y_pred = model.predict(X_test)

# Evaluating the model
mse = mean_squared_error(Y_test, Y_pred)
print("Mean Squared Error:", mse)

# Plotting actual vs. predicted values


plt.scatter(X_test, Y_test, color='blue', label='Actual')
plt.scatter(X_test, Y_pred, color='red', label='Predicted')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Linear Regression: Actual vs. Predicted')
plt.legend()
plt.show()

4 Tabulation Sheet

INPUT OUTPUT
X Y
1 2
2 4
3 5
4 4
5 5

5 Results
• Mean Squared Error: [calculated value]
• The graph shows the relationship between the independent variable (X) and the dependent variable (Y),
along with the regression line indicating the model's predictions.

Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)
Lab: Business Intelligence (AL801)    Title:
EVALUATION RECORD Type/ Lab Session:
Name Harsh Khichi Enrollment No. 0827AL201022
Performing on First submission Second submission
Extra Regular

Grade and Remarks by the Tutor


1. Clarity about the objective of experiment
2. Clarity about the Outcome
3. Submitted the work in desired format
4. Shown capability to solve the problem
5. Contribution to the team work

Additional remarks

Grade: Cross the grade.


A B C D F

Tutor

1 Title
Data Analysis using Time Series Analysis
2 Neatly Drawn and labeled experimental setup
NA
3 Theoretical solution of the instant problem
3.1 Algorithm
Steps:
1. Start
2. If the 'data' parameter is not provided, return an error message indicating missing data.
3. If the 'start' parameter is not provided, assume the start time as 1.
4. If the 'end' parameter is not provided, assume the end time as the length of the data.
5. If the 'frequency' parameter is not provided, assume the default frequency as 1.
6. Create the time series object using the ts() function with the provided parameters.
7. Store the created time series object in the variable 'timeseries.object.name'.

8. Return the 'timeseries.object.name'.
9. End
3.2 Program
Consider the monthly rainfall recorded at a place, starting from January 2012. We create an R time series object covering a period of 12 months and plot it.
# Get the data points in the form of an R vector.
rainfall <- c(799, 1174.8, 865.1, 1334.6, 635.4, 918.5,
              685.5, 998.6, 784.2, 985, 882.8, 1071)

# Convert it to a time series object.
rainfall.timeseries <- ts(rainfall, start = c(2012, 1), frequency = 12)

# Print the time series data.
print(rainfall.timeseries)

# Give the chart file a name.
png(file = "rainfall.png")

# Plot a graph of the time series.
plot(rainfall.timeseries)

# Save the file.
dev.off()
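For readers working in Python rather than R, a comparable sketch using pandas — a monthly `DatetimeIndex` starting January 2012 plays the role of `ts(start = c(2012, 1), frequency = 12)`:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script also runs headless
import matplotlib.pyplot as plt
import pandas as pd

rainfall = [799, 1174.8, 865.1, 1334.6, 635.4, 918.5,
            685.5, 998.6, 784.2, 985, 882.8, 1071]

# Monthly index starting January 2012 ("MS" = month start).
index = pd.date_range(start="2012-01", periods=12, freq="MS")
series = pd.Series(rainfall, index=index, name="rainfall")

# Print the time series data.
print(series)

# Plot and save, analogous to png()/plot()/dev.off() in the R program.
series.plot(title="Monthly Rainfall, 2012")
plt.savefig("rainfall.png")
```

The resulting line chart shows the same 12-month variation in rainfall as the R plot.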

4 Tabulation Sheet

INPUT
799, 1174.8, 865.1, 1334.6, 635.4, 918.5, 685.5, 998.6, 784.2, 985, 882.8, 1071

OUTPUT
        Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
2012  799.0 1174.8  865.1 1334.6  635.4  918.5  685.5  998.6  784.2  985.0  882.8 1071.0

5 Results

The program visualizes the monthly rainfall data starting from January 2012. We obtain a time series plot depicting the variation in rainfall over the 12-month period.

Acropolis Institute of Technology and Research, Indore
Department of CSE (Artificial Intelligence & Machine Learning)
Lab: Business Intelligence (AL801)    Title:
EVALUATION RECORD Type/ Lab Session:
Name Harsh Khichi Enrollment No. 0827AL201022
Performing on First submission Second submission
Extra Regular

Grade and Remarks by the Tutor


1. Clarity about the objective of experiment
2. Clarity about the Outcome
3. Submitted the work in desired format
4. Shown capability to solve the problem
5. Contribution to the team work

Additional remarks

Grade: Cross the grade.


A B C D F

Tutor

1 Title
Data Modelling and Analytics with Pivot Table in Excel.
2 Neatly Drawn and labeled experimental setup
NA
3 Theoretical solution of the instant problem
3.1 Algorithm
• Identify the dataset: Gather the data you want to analyze using a pivot table.
• Open Excel: Launch Microsoft Excel on your computer.
• Insert Pivot Table: Go to the "Insert" tab and click on "PivotTable".
• Select Data Range: Choose the range of data you want to analyze.
• Design Pivot Table: Drag and drop fields into the Rows, Columns, and Values areas
to design your pivot table.
• Customize Pivot Table: Apply filters, sort data, and format as needed.
• Analyze Data: Use the pivot table to summarize and analyze your data effectively.
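The same drag-and-drop design can also be expressed programmatically. A minimal sketch using `pandas.pivot_table` on a small made-up medals dataset (the column names and values below are illustrative, not taken from the Events database used in the walkthrough):

```python
import pandas as pd

# Made-up dataset in the spirit of the Olympics example that follows.
df = pd.DataFrame({
    "Country":    ["USA", "USA", "CHN", "CHN", "RUS"],
    "Discipline": ["Diving", "Fencing", "Diving", "Archery", "Fencing"],
    "Medal":      [10, 5, 8, 3, 7],
})

# Rows = Discipline, Columns = Country, Values = sum of Medal:
# the programmatic analogue of dragging fields into ROWS/COLUMNS/VALUES.
pivot = pd.pivot_table(df, index="Discipline", columns="Country",
                       values="Medal", aggfunc="sum", fill_value=0)
print(pivot)
```

`fill_value=0` fills combinations with no data, just as an Excel PivotTable shows blanks (or a configured default) for missing row/column pairs.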

Page 42
3.2 Program
A Data Model is created automatically when you import two or more tables simultaneously from a database. The existing database relationships between those tables are used to create the Data Model in Excel.
Step 1 − Open a new blank Workbook in Excel.
Step 2 − Click on the DATA tab.
Step 3 − In the Get External Data group, click on the option From Access. The Select Data Source
dialog box opens.
Step 4 − Select Events.accdb, Events Access Database file.

Step 5 − The Select Table window, displaying all the tables found in the database, appears.

Step 6 − Tables in a database are similar to the tables in Excel. Check the ‘Enable selection of
multiple tables’ box, and select all the tables. Then click OK.

Step 7 − The Import Data window appears. Select the PivotTable Report option. This option imports
the tables into Excel and prepares a PivotTable for analyzing the imported tables. Notice that the
checkbox at the bottom of the window - ‘Add this data to the Data Model’ is selected and disabled.

Step 8 − The data is imported, and a PivotTable is created using the imported tables.

Explore Data Using PivotTable


Step 1 − You know how to add fields to PivotTable and drag fields across areas. Even if you are not
sure of the final report that you want, you can play with the data and choose the best suited report.
In PivotTable Fields, click on the arrow beside the table - Medals to expand it to show the fields in
that table. Drag the NOC_CountryRegion field in the Medals table to the COLUMNS area.
Step 2 − Drag Discipline from the Disciplines table to the ROWS area.
Step 3 − Filter Discipline to display only five sports: Archery, Diving, Fencing, Figure Skating, and
Speed Skating. This can be done either in PivotTable Fields area, or from the Row Labels filter in
the PivotTable itself.
Step 4 − In PivotTable Fields, from the Medals table, drag Medal to the VALUES area.
Step 5 − From the Medals table, select Medal again and drag it into the FILTERS area.

Step 6 − Click the dropdown list button to the right of the Column labels.
Step 7 − Select Value Filters and then select Greater Than…
Step 8 − Click OK.

Step 9 − Type 80 in the rightmost field.


Step 10 − Click OK.

The PivotTable now displays only those regions that have more than 80 total medals.
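The "Greater Than" value filter in Steps 7–10 has a direct pandas analogue: keep only the entries whose totals exceed the threshold. A small sketch on made-up medal totals (the regions and counts are illustrative):

```python
import pandas as pd

# Made-up total medal counts per region (illustrative values only).
totals = pd.Series({"USA": 120, "CHN": 95, "RUS": 78, "GER": 60})

# Keep only regions with more than 80 total medals,
# mirroring the Value Filters > Greater Than... dialog in Excel.
filtered = totals[totals > 80]
print(filtered)
```

Boolean indexing with `totals > 80` selects the matching labels, just as the Excel filter hides the columns that fall below the threshold.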

4 Tabulation Sheet

INPUT OUTPUT
NA NA

5 Results
• Present the analyzed data with key insights and findings.
• Use charts or graphs to visualize the data where helpful.
