Tasbi Ul Hasan-20023247

Assignment 01

TASBI UL HASAN
ID: 20023247
Answer to the question no. 01

1. Web Data:

o Description: Data collected from the company’s website, capturing user interactions,
click-through rates, session durations, and navigation paths.

o Relevance: This data is particularly valuable for the Marketing Department to
understand customer engagement, track trends in product interest, and develop
targeted marketing strategies.

o Potential Data Types:

 Text logs of customer interactions.

 Numerical values representing metrics (e.g., page views, session time).

 Timestamps to record interaction times.

2. Legacy Systems:

o Description: These are older systems that store historical data across different company
operations, potentially containing records on product versions, customer service logs, or
past transactions.

o Relevance: Legacy data is essential for the Design Department to track the
development and performance of products over time and understand long-term trends.
It may also support Marketing with historical customer data.

o Potential Data Types:

 Mixed data types, including text fields for product descriptions and service logs.

 Numerical data such as sales figures or performance metrics.

 Potentially complex structures (e.g., relational data with linked records).

3. Plant Operations Data:

o Description: Metrics from production and manufacturing processes, covering details
such as machine performance, production volumes, and efficiency measures.

o Relevance: Engineering and Operations departments rely on this data to monitor
productivity, ensure machine efficiency, and assess production workflows.

o Potential Data Types:

 Numeric values for production counts, efficiency rates, and downtimes.

 Timestamps to track production cycles.

 Categorical data indicating machine status (e.g., active, idle, maintenance).

4. Sales Transactions:

o Description: Information on customer purchases, including order details, transaction
amounts, and product information.

o Relevance: This data is crucial for the Marketing Department to analyze sales
performance, identify purchasing patterns, and track high-performing products.

o Potential Data Types:

 Numeric values such as transaction amounts and item quantities.

 Text data for customer names and product descriptions.

 Timestamps to record the timing of each transaction.

5. Employee Health and Safety:

o Description: Data regarding employee health metrics, incident reports, safety
compliance records, and workplace conditions.

o Relevance: The Operations Department requires this data to ensure employee well-
being, maintain a safe working environment, and monitor compliance with safety
regulations.

o Potential Data Types:

 Text data for incident descriptions and safety reports.

 Numerical values indicating health metrics or safety scores.

 Categorical fields for safety status (e.g., compliant, non-compliant).


Going forward, this data will need to go through the following ETL steps:

1. Extract:

o Data will be extracted from each source system:

 Web Data: Retrieved from web logs or API endpoints, capturing session data
and user interactions.

 Legacy Systems: Extracted from databases or file systems where historical data
is stored, often requiring specialized connectors due to legacy formats.

 Plant Operations Data: Pulled from machine logs or IoT sensors in
manufacturing, providing real-time and historical metrics.

 Sales Transactions: Collected from the point-of-sale or transactional databases.

 Employee Health and Safety: Retrieved from HR and safety compliance
systems, storing records of incidents and health metrics.

2. Transform:

o Data from each source needs to be standardized and cleaned:

 Data Cleaning: Remove duplicates, handle missing values, and standardize
formats (e.g., consistent date and time formats).

 Integration: Link related data across sources. For example, employee data from
Health and Safety records might be connected to Plant Operations to track
safety incidents by location.

 Business Rules: Apply transformations specific to business needs, such as
converting currency or categorizing transaction values (e.g., low vs. high
transaction).

 Schema Alignment: Ensure data fields are consistently named and structured to
fit the data warehouse’s schema.

3. Load:
o The transformed data is loaded into the central data warehouse, typically organized in
tables and schemas that match business domains (e.g., Sales, Operations, Marketing).

o Data Partitions: The data might be partitioned by time or department for faster access
and analysis.

o Data Marts: Department-specific views or “marts” are created based on the central
warehouse to allow departments to access only relevant data.

o Database Structure:

 Fact Tables: Store transactional data, such as sales transactions, manufacturing outputs, or
safety incidents.

 Dimension Tables: Store reference data (e.g., customer details, product info, employee records)
to allow efficient joining with fact tables.

o Storage and Access Design:

 Data is organized by business domains (e.g., Sales, Operations, Health and Safety).

 Implement indexes on frequently queried fields (e.g., date, transaction ID) to improve
performance.

 Historical Data Retention: Data from legacy systems can be retained in a separate archival
schema if needed for long-term reference.

o Data Security and Access Control:

 Implement role-based access controls so that each department can only view and access
relevant data.
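
The fact and dimension layout described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the in-memory SQLite database stands in for the real warehouse, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL
);
CREATE TABLE fact_sales (
    transaction_id INTEGER PRIMARY KEY,
    product_id     INTEGER NOT NULL REFERENCES dim_product(product_id),
    sale_date      TEXT NOT NULL,      -- ISO 8601 date string
    quantity       INTEGER NOT NULL,
    amount         REAL NOT NULL
);
-- Index on a frequently queried field (the date).
CREATE INDEX idx_fact_sales_date ON fact_sales(sale_date);
""")

# Illustrative sample rows only.
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "Widget"), (2, "Gadget")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
                 [(100, 1, "2024-01-15", 2, 20.0),
                  (101, 2, "2024-01-16", 1, 35.0),
                  (102, 1, "2024-02-01", 3, 30.0)])

def sales_by_product(connection, start_date, end_date):
    """Join the fact table to its dimension and total the sales per product."""
    return connection.execute(
        """
        SELECT p.product_name, SUM(f.amount) AS total_amount
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        WHERE f.sale_date BETWEEN ? AND ?
        GROUP BY p.product_name
        ORDER BY p.product_name
        """,
        (start_date, end_date),
    ).fetchall()

january_totals = sales_by_product(conn, "2024-01-01", "2024-01-31")
```

Keeping descriptive attributes in the dimension table means the fact table stores only keys and measures, which is what makes the join on product_id cheap.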

Create Data Marts for Departmental Access

Data marts provide department-specific access to the data stored in the central warehouse. Here’s
how each data mart will be structured for the company’s departments:

1. Engineering Department Data Mart:

o Data Sources: Primarily sourced from Plant Operations data.


o Purpose: Focus on machine performance, production rates, and efficiency metrics.

o Structure: Fact tables for production metrics, dimension tables for machine details and
operators.

2. Operations Department Data Mart:

o Data Sources: Plant Operations and Employee Health and Safety.

o Purpose: Access to safety compliance, incident reports, and overall operational
efficiency.

o Structure: Fact tables for incident reports, dimension tables for employee and machine
information.

3. Design Department Data Mart:

o Data Sources: Legacy Systems and Web Data.

o Purpose: Review historical data on product versions and customer engagement trends.

o Structure: Fact tables for product data and website interactions, dimension tables for
product features.

4. Marketing Department Data Mart:

o Data Sources: Sales Transactions and Web Data.

o Purpose: Analyzing customer behavior, sales performance, and marketing trends.

o Structure: Fact tables for sales and web interactions, dimension tables for customer
demographics and product details.
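
One possible way to realize department-specific marts is as views over the central tables, with an allow-list deciding which department may read which view. The sketch below assumes an in-memory SQLite database, hypothetical table and view names, and an application-level check in place of the database's own role-based permissions (SQLite has none).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical central warehouse tables with illustrative rows.
CREATE TABLE fact_sales (transaction_id INTEGER PRIMARY KEY, product TEXT, amount REAL);
CREATE TABLE fact_production (run_id INTEGER PRIMARY KEY, machine TEXT, units INTEGER);
INSERT INTO fact_sales VALUES (1, 'Widget', 20.0), (2, 'Gadget', 35.0);
INSERT INTO fact_production VALUES (1, 'Press-A', 500), (2, 'Press-B', 450);

-- Each data mart is modeled as a view over the central tables.
CREATE VIEW mart_marketing_sales AS
    SELECT product, SUM(amount) AS total_amount FROM fact_sales GROUP BY product;
CREATE VIEW mart_engineering_production AS
    SELECT machine, SUM(units) AS total_units FROM fact_production GROUP BY machine;
""")

# Hypothetical mapping from department to the views it is allowed to read.
MART_ACCESS = {
    "marketing": {"mart_marketing_sales"},
    "engineering": {"mart_engineering_production"},
}

def read_mart(connection, department, view_name):
    """Return all rows of a mart view if the department is allowed to read it."""
    if view_name not in MART_ACCESS.get(department, set()):
        raise PermissionError(f"{department} may not read {view_name}")
    # Interpolating view_name is acceptable here only because it was just
    # checked against the fixed allow-list above.
    return connection.execute(f"SELECT * FROM {view_name} ORDER BY 1").fetchall()

marketing_rows = read_mart(conn, "marketing", "mart_marketing_sales")
```

In a production warehouse the same restriction would more likely be enforced with the database's own roles and grants rather than in application code.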

The company uses Python and JMP Pro as API tools to enable users to access the data warehouse:

1. Python API:

o Allows for data querying, extraction, and manipulation for analytics.

o Can be used for creating automated scripts that pull data for specific analyses or
visualizations.

2. JMP Pro API:


o Offers advanced statistical analysis and visualization tools.

o Departments can use JMP Pro to directly access and analyze their data mart, creating
customized reports and visual insights.

These APIs will facilitate easy and secure data access, enabling each department to use the data
warehouse efficiently without manual data handling.

Answer to the question no. 02

| Feature | Model-driven DSS | Data-driven DSS |
| --- | --- | --- |
| Focus | Uses analytical models to simulate scenarios and make decisions based on predictions rather than raw data. | Leverages large datasets to identify trends and patterns, providing insights directly from the data itself. |
| Data Requirements | Needs minimal real-world data, as it relies primarily on models (like simulations or optimizations) to predict outcomes. | Requires extensive historical or real-time data; often integrates with databases or data warehouses. |
| Best for | Suitable for “what-if” analysis or scenarios where data might be scarce but models can help simulate possible outcomes. | Ideal for analyzing patterns in existing data, such as tracking customer behavior or financial performance trends. |
| Example | Optimizing a supply chain by simulating inventory needs without detailed sales data. | Analyzing customer transactions to forecast purchasing trends and inform marketing strategies. |
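
The contrast between the two approaches can be made concrete with a small sketch. The first function is model-driven: it runs a “what-if” inventory simulation from an assumed demand model and needs no recorded sales. The second is data-driven: it summarizes transactions that already exist. The demand distribution, the reorder rule, and every parameter value are assumptions chosen for illustration.

```python
import random
from collections import defaultdict

def simulate_stockouts(start_stock, reorder_qty, mean_daily_demand, days=30, seed=0):
    """Model-driven: draw daily demand from an assumed model and count stockout days."""
    rng = random.Random(seed)  # fixed seed so a scenario can be re-run identically
    stock, stockout_days = start_stock, 0
    for _ in range(days):
        # Assumed demand model: normal, standard deviation 20% of the mean.
        demand = max(0, round(rng.gauss(mean_daily_demand, mean_daily_demand * 0.2)))
        if demand > stock:
            stockout_days += 1
            stock = 0
        else:
            stock -= demand
        # Assumed reorder rule: replenish instantly below three days of demand.
        if stock < mean_daily_demand * 3:
            stock += reorder_qty
    return stockout_days

def monthly_totals(transactions):
    """Data-driven: total recorded (ISO date, amount) transactions by month."""
    totals = defaultdict(float)
    for date, amount in transactions:
        totals[date[:7]] += amount  # "YYYY-MM" prefix of the ISO date
    return dict(totals)

# What-if comparison of two hypothetical reorder quantities.
scenarios = {qty: simulate_stockouts(50, qty, mean_daily_demand=10) for qty in (20, 60)}

# Summary of a few illustrative transaction records.
totals = monthly_totals([("2024-01-05", 10.0), ("2024-01-20", 5.0), ("2024-02-01", 7.5)])
```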

Answer to the question no. 03

1. Descriptive Analytics:

o Description: Focuses on summarizing and describing historical data to understand what
has happened.

o Example: A retail store analyzing monthly sales reports to identify popular products.

o Purpose: Looks at past data to identify trends and understand historical performance.

o Techniques Used: Reporting, data aggregation, and data visualization (e.g., dashboards).

o Benefit: Helps businesses get a clear picture of what has happened, allowing them to
recognize successes or spot issues.

2. Predictive Analytics:

o Description: Uses historical data and statistical techniques to forecast what might
happen in the future.

o Example: Using past sales data to predict customer demand for the upcoming holiday
season.

o Purpose: Applies statistical models and machine learning algorithms to historical data
to make predictions about future events.

o Techniques Used: Regression analysis, time-series forecasting, and classification.

o Benefit: Allows businesses to anticipate future trends and adjust their strategies
accordingly.

3. Prescriptive Analytics:

o Description: Recommends actions by analyzing possible outcomes, aiming to suggest the
best course of action to achieve desired results.

o Example: An inventory system that suggests optimal stock levels based on predicted
demand and supplier delivery times.

o Purpose: Builds on predictive insights to recommend specific actions that can help
optimize outcomes.

o Techniques Used: Optimization algorithms, decision analysis, and simulation.

o Benefit: Provides actionable recommendations, helping businesses to make decisions
that align with their goals.
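
The three types of analytics can be shown side by side on one small dataset. The sketch below uses made-up monthly sales figures, a hand-written least-squares line as a stand-in for the regression techniques mentioned, and a hypothetical ordering rule (forecast plus safety stock minus current stock) as the prescriptive step.

```python
from statistics import mean

monthly_sales = [100, 110, 120, 130, 140, 150]  # illustrative figures only

# Descriptive: summarize what has already happened.
average_sales = mean(monthly_sales)
best_month = monthly_sales.index(max(monthly_sales)) + 1  # 1-based month number

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Predictive: extend the fitted trend one month ahead.
months = list(range(1, len(monthly_sales) + 1))
slope, intercept = fit_line(months, monthly_sales)
forecast_next = slope * (len(monthly_sales) + 1) + intercept

def recommended_order(forecast, current_stock, safety_stock):
    """Prescriptive: hypothetical rule that orders enough to cover forecast plus buffer."""
    return max(0, round(forecast + safety_stock - current_stock))

order_qty = recommended_order(forecast_next, current_stock=40, safety_stock=20)
```

Each stage consumes the output of the one before it: the summary describes the history, the fitted line turns that history into a forecast, and the ordering rule turns the forecast into an action.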

Answer to the question no. 04

| Aspect | Business Intelligence (BI) | Decision Support Systems (DSS) |
| --- | --- | --- |
| Purpose | Provides insights and reports to support strategic decisions by analyzing historical and current data. | Directly assists with decision-making processes, often for operational or tactical decisions based on real-time needs. |
| Data Source | Primarily relies on data warehouses containing large, structured datasets for strategic analysis. | Can use various data sources, including real-time data, data warehouses, or external sources, for flexible problem-solving. |
| Orientation | Generally oriented towards strategic decision-making, often used by executives and upper management. | Often focused on analytical tasks for managers and analysts at various organizational levels. |
| Development | Typically uses commercially available tools and dashboards for easy access and visualization of data. | Often involves custom solutions and models to solve unstructured, complex problems specific to the organization. |
| Examples | Tools like Tableau or Power BI that provide data dashboards to monitor KPIs and support long-term planning. | An inventory management system that helps a manager decide on optimal reorder points based on current stock levels. |

Answer to the question no. 05

The ETL process (Extract, Transform, Load) is a crucial part of data warehousing, enabling data from
various sources to be integrated, standardized, and stored for analysis.

| Step | Description |
| --- | --- |
| Extract | This step involves retrieving raw data from various source systems, such as databases, applications, or files. Data can come from multiple formats (e.g., CSV, SQL databases, web data). |
| Transform | In this step, the data is cleaned and formatted to ensure consistency and accuracy. Transformations may include removing duplicates, handling missing values, and converting formats. |
| Load | The final step, where the cleaned and standardized data is loaded into the data warehouse. Here, data is organized into tables or structures for efficient storage and easy retrieval. |
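
The three steps can be sketched end to end using only the Python standard library. Everything specific here is an assumption for illustration: the CSV text stands in for a source system, the in-memory SQLite database stands in for the warehouse, and the column names, the two accepted date formats, and the choice to keep missing amounts as NULL are hypothetical.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Illustrative raw export: a duplicate row, a second date format, a missing amount.
RAW_CSV = """order_id,order_date,amount
1,2024-01-15,20.00
1,2024-01-15,20.00
2,15/01/2024,
3,2024-01-17,35.50
"""

def extract(text):
    """Extract: read raw rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def parse_date(value):
    """Convert the assumed source date formats to one ISO 8601 format."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date: {value!r}")

def transform(rows):
    """Transform: drop duplicate orders, standardize dates, keep missing amounts as None."""
    seen, clean = set(), []
    for row in rows:
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # duplicate order_id, skip it
        seen.add(order_id)
        amount = float(row["amount"]) if row["amount"] else None
        clean.append((order_id, parse_date(row["order_date"]), amount))
    return clean

def load(connection, rows):
    """Load: write the cleaned rows into a warehouse table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    connection.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute("SELECT * FROM sales ORDER BY order_id").fetchall()
```

Keeping each step as its own function makes it possible to test the cleaning rules separately from the source and the target.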

Answer to the question no. 06

| Aspect | Classification | Prediction |
| --- | --- | --- |
| Purpose | Assigns data to predefined categories or classes based on certain attributes. | Forecasts a continuous or ordered value based on historical data. |
| Output Type | Categorical (e.g., Yes/No, High/Low risk). | Numerical or continuous (e.g., future sales amount, stock price). |
| Example | Classifying loan applicants as low, medium, or high risk based on income, credit score, and other attributes. | Predicting next month’s sales volume based on historical monthly sales data. |
| Use Cases | Often used in tasks where groups need to be identified, like spam detection, fraud detection, or customer segmentation. | Common in forecasting scenarios, such as predicting demand, estimating future costs, or projecting revenue growth. |
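
The difference in output type is easy to see in code. In the sketch below, a hand-written rule set stands in for a trained classifier and a simple moving average stands in for a forecasting model; the thresholds, the window size, and the sample figures are all illustrative assumptions.

```python
def classify_risk(income, credit_score):
    """Classification: return one of the predefined labels low / medium / high.

    The thresholds are illustrative, not taken from any real lending policy.
    """
    if credit_score >= 700 and income >= 50_000:
        return "low"
    if credit_score >= 600:
        return "medium"
    return "high"

def predict_next_month(sales_history, window=3):
    """Prediction: estimate a continuous value as the mean of the last `window` months."""
    recent = sales_history[-window:]
    return sum(recent) / len(recent)

applicant_label = classify_risk(income=80_000, credit_score=750)
next_month_sales = predict_next_month([100, 120, 140, 160])
```

The classifier can only ever return one of three labels, while the predictor can return any number, which is the distinction the table draws between categorical and continuous output.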
