BI AnsBank
UNIT 1
1. Draw and explain the architecture of IBM business intelligence system.
• Decision Support System (DSS): A Decision Support System (DSS) is a computer-
based information system designed to assist decision-makers in solving complex problems and making
effective decisions. It provides interactive tools and analytical capabilities to support decision-making
processes at various levels of an organization.
• Effectiveness: DSS enhances decision-making effectiveness by providing decision-makers with timely access
to relevant information, analytical models, and decision-making tools. By integrating data from multiple
sources and enabling dynamic analysis, DSS helps in generating insights and evaluating alternative courses of
action to make informed decisions.
• Mathematical Models: DSS incorporates mathematical models, such as optimization, simulation, and
forecasting models, to analyze data and predict outcomes. These models allow decision-makers to simulate
different scenarios, evaluate potential outcomes, and assess the impact of decisions on organizational
objectives.
• Integration in Decision-Making Process: DSS integrates seamlessly into the decision-making process by
providing support at each stage, from problem identification to solution implementation. It assists in
identifying relevant data sources, analyzing data, generating insights, evaluating alternatives, and monitoring
outcomes, thus facilitating a structured and systematic decision-making approach.
• Organizational Role: DSS plays a crucial role in enhancing organizational decision-making capabilities across
various functional areas, including finance, marketing, operations, and strategic planning. It empowers
decision-makers at all levels of the organization, from frontline employees to top executives, by providing
tailored decision support tools and information access.
• Flexibility: DSS offers flexibility in terms of customization and adaptation to diverse decision-making contexts
and user preferences. It allows users to tailor analytical models, reports, and dashboards to their specific
needs and preferences, ensuring that decision support capabilities align with organizational requirements
and decision-making styles.
1. Nature of Decision:
• Structured Decision: Structured decisions are routine, repetitive decisions that follow a predefined
set of rules or procedures. These decisions are well-defined and involve clear inputs, processes, and
outputs.
• Example: A retail store uses BI to analyze daily sales data and automatically reorder inventory
when stock levels fall below a certain threshold, following a structured decision-making
process based on predefined inventory management rules.
• Unstructured Decision: Unstructured decisions are complex, non-routine decisions that lack specific
guidelines or predefined solutions. These decisions often involve ambiguity and uncertainty,
requiring creativity and judgment.
• Example: A marketing team uses BI to analyze market trends and consumer behavior to
develop a new advertising campaign targeting a niche market segment. The decision-making
process is unstructured, as there are no predefined rules or guidelines for creating the
campaign.
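The structured inventory decision described above can be written as a fixed rule. The following is a minimal sketch, not a description of any particular BI product: the item names, thresholds, and order quantities are made up for illustration, and the function name items_to_reorder is hypothetical.

```python
# Hypothetical stock data: item names, thresholds, and quantities are illustrative only.
stock_levels = {"milk": 12, "bread": 40, "eggs": 5}
reorder_points = {"milk": 20, "bread": 30, "eggs": 10}
order_quantities = {"milk": 50, "bread": 60, "eggs": 30}


def items_to_reorder(stock, thresholds, quantities):
    """Return {item: quantity to order} for every item whose stock is below its threshold."""
    return {
        item: quantities[item]
        for item, level in stock.items()
        if level < thresholds[item]  # the predefined rule: reorder when stock falls below the threshold
    }


orders = items_to_reorder(stock_levels, reorder_points, order_quantities)
print(orders)
```

Because the inputs, rule, and output are all fixed in advance, the decision needs no human judgment, which is what makes it structured. The unstructured campaign decision has no comparable rule to encode.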
2. Scope of Decision:
• Strategic Decision: Strategic decisions are long-term decisions made by top-level management to
achieve organizational objectives and gain a competitive advantage. These decisions have a
significant impact on the organization's overall direction and require a broad perspective.
• Example: A CEO uses BI to analyze market trends, competitor performance, and internal
capabilities to formulate a long-term growth strategy for the company, such as expanding
into new markets or diversifying product offerings.
• Tactical Decision: Tactical decisions are medium-term decisions made by middle-level management
to implement strategic plans and improve operational efficiency. These decisions focus on optimizing
resources and processes to achieve specific goals.
• Operational Decision: Operational decisions are short-term decisions made by front-line employees
or supervisors to support day-to-day operations and ensure smooth execution of tasks. These
decisions are often repetitive and routine.
• Example: A sales representative uses BI to access real-time sales data and customer
information to personalize interactions and address customer inquiries, enabling them to
make on-the-spot decisions to improve customer satisfaction and drive sales.
The business intelligence (BI) cycle is a continuous process that organizations use to gather, analyze, and
interpret data to make informed decisions and drive business growth. It consists of four key
stages: analysis, insight, decision, and evaluation. Each stage is described below:
1. Analysis:
• In the analysis stage, raw data from various sources such as databases, data warehouses,
and external sources are collected and processed.
• Data is cleaned, transformed, and structured to make it suitable for analysis. This may
involve removing duplicates, handling missing values, and aggregating data.
• Analytical techniques such as statistical analysis, data mining, and machine learning are
applied to uncover patterns, trends, and relationships within the data.
• Visualization tools are often used to represent the analyzed data in the form of charts,
graphs, dashboards, or reports, making it easier to interpret and understand.
2. Insight:
• In the insight stage, the analyzed data is interpreted to gain meaningful insights and
actionable information.
• Patterns, trends, anomalies, and correlations discovered during analysis are examined to
understand their implications for the business.
• Data is contextualized and interpreted in the context of business objectives, industry trends,
and market conditions.
• Insights may reveal opportunities for improvement, areas of risk, or emerging trends that
could impact business performance.
3. Decision:
• In the decision stage, stakeholders use the insights gained from the analysis to make
informed decisions that drive business strategy and operations.
• Decision-makers consider the implications of the insights on various aspects of the business,
such as marketing, sales, operations, finance, and customer service.
• Decisions may involve strategic planning, resource allocation, product development,
marketing campaigns, pricing strategies, and risk management.
• BI tools and platforms often provide decision support capabilities, such as scenario analysis
and predictive modeling, to assist decision-makers in evaluating different options and their
potential outcomes.
4. Evaluation:
• In the evaluation stage, the impact of decisions made based on BI insights is assessed to
determine their effectiveness and success.
• Key performance indicators (KPIs) and metrics are monitored to measure the outcomes of
implemented strategies and initiatives.
• Performance metrics may include revenue growth, cost savings, customer satisfaction,
market share, and operational efficiency.
• Continuous feedback loops are established to refine strategies, adjust tactics, and improve
decision-making processes based on the evaluation of results.
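The data preparation steps named in the analysis stage (removing duplicates, handling missing values, and aggregating data) can be sketched as below. This is only an illustration under stated assumptions: the records and field names are invented, duplicates are identified by a hypothetical order_id field, and missing amounts are filled with the mean of the known amounts, which is one of several possible strategies.

```python
from collections import defaultdict

# Hypothetical raw sales records; the field names are illustrative.
raw_records = [
    {"order_id": 1, "region": "North", "amount": 120.0},
    {"order_id": 2, "region": "South", "amount": None},   # missing value
    {"order_id": 1, "region": "North", "amount": 120.0},  # duplicate of order 1
    {"order_id": 3, "region": "South", "amount": 80.0},
]


def clean_and_aggregate(records):
    # Step 1: remove duplicates, keeping the first record seen for each order_id.
    seen, unique = set(), []
    for rec in records:
        if rec["order_id"] not in seen:
            seen.add(rec["order_id"])
            unique.append(rec)

    # Step 2: handle missing values by substituting the mean of the known amounts.
    known = [r["amount"] for r in unique if r["amount"] is not None]
    mean_amount = sum(known) / len(known)
    cleaned = [
        dict(r, amount=mean_amount if r["amount"] is None else r["amount"])
        for r in unique
    ]

    # Step 3: aggregate the cleaned amounts per region.
    totals = defaultdict(float)
    for r in cleaned:
        totals[r["region"]] += r["amount"]
    return dict(totals)


totals_by_region = clean_and_aggregate(raw_records)
print(totals_by_region)
```

The aggregated totals are the kind of structured input that the later stages (insight, decision, evaluation) work from.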
6. Describe the BI system. Explain the importance of effective and timely decisions for
business.
1. BI System Overview:
• A BI (Business Intelligence) system acts as a strategic advisor for companies.
• It collects data from various departments such as sales, marketing, finance, and operations.
• This data is then transformed into meaningful insights presented through reports,
dashboards, and visualizations.
• BI streamlines decision-making processes by providing accurate, relevant, and up-to-date
information.
• It enhances strategic planning by identifying emerging trends, market opportunities, and
potential threats.
7. Describe the detailed structure of a DSS with the help of a diagram and appropriate
labelling. (Extended version)
Extended Structure:
1. Data Management:
• Data management in DSS involves collecting, storing, and managing data from various internal and
external sources. It includes processes such as data integration, cleansing, and transformation to
ensure data accuracy and consistency.
• Example: A retail company uses DSS to integrate sales data from POS systems, customer data from
CRM systems, and market data from external sources to analyze sales trends and customer behavior.
2. Model Management:
• Model management in DSS involves developing, storing, and updating the analytical models (such as
optimization, simulation, and forecasting models) that are used to evaluate alternative decisions.
• Example: A financial institution uses DSS to develop risk assessment models for loan approval,
regularly updating the models based on historical performance and changing market dynamics.
3. Interactions:
• Interactions in DSS refer to the user interface and interaction design that enable intuitive and user-
friendly access to decision support tools and functionalities. It focuses on providing seamless
navigation, visualization, and collaboration features for effective decision-making.
• Example: A healthcare organization uses DSS with a user-friendly interface that allows doctors to
interactively explore patient data, visualize medical imaging results, and collaborate with colleagues
in real-time.
4. Knowledge Management:
• Knowledge management involves capturing, organizing, and sharing knowledge and expertise within
an organization to support decision-making processes. It includes storing and retrieving documents,
best practices, and lessons learned to facilitate knowledge transfer and learning.
• Example: A consulting firm uses DSS with knowledge management capabilities to access case studies,
research reports, and expert insights, helping consultants make informed recommendations to
clients based on previous experiences and industry knowledge.
Business Intelligence (BI) encompasses a range of tools, processes, and methodologies aimed at
transforming raw data into meaningful and actionable insights to support decision-making within an
organization. The main components of BI include:
1. Decisions: At the heart of BI is the decision-making process. BI solutions aim to provide decision-
makers with timely, accurate, and relevant information to support strategic, tactical, and
operational decisions across various functions and levels within the organization.
2. Optimization: Optimization involves selecting the best alternative or course of action based on
data-driven analysis and predefined criteria. BI tools and techniques help identify opportunities for
improvement, resource allocation, cost reduction, revenue maximization, and overall efficiency
enhancement.
3. Data Mining: Data mining refers to the process of discovering patterns, trends, and insights from
large datasets using statistical and machine learning algorithms. BI leverages data mining
techniques to extract valuable knowledge from structured, semi-structured, and unstructured data
sources, enabling organizations to uncover hidden relationships and make informed decisions.
4. Data Exploration: Data exploration involves analyzing and visualizing data to gain a better
understanding of its characteristics, relationships, and underlying patterns. BI solutions facilitate
exploratory data analysis through interactive dashboards, reports, charts, and graphs, allowing users
to explore data intuitively and derive actionable insights.
5. Data Warehouse/Data Mart: A data warehouse or data mart serves as a centralized repository of
integrated, cleansed, and transformed data from various sources within the organization. It stores
historical and current data in a structured format optimized for querying and analysis, enabling
users to access consistent, reliable, and up-to-date information for BI purposes.
6. Multidimensional Cube Analysis: Multidimensional cube analysis, also known as OLAP (Online
Analytical Processing), enables users to analyze data from multiple dimensions or perspectives. It
allows for complex queries, drill-downs, roll-ups, and slicing-and-dicing operations to explore data
at different levels of granularity and gain deeper insights into business performance.
7. Data Sources: Data sources provide the raw material for BI processes and analysis. These sources
include operational systems (e.g., CRM, ERP, SCM), internal databases, spreadsheets, documents,
and external data from third-party sources, market research firms, social media platforms, and
more. BI solutions integrate and consolidate data from diverse sources to create a unified view of
organizational information.
UNIT 2
1. Elaborate on the concept of online analytical processing and the types of online
analytical processing.
OLAP, or On-Line Analytical Processing, is a method used in computing to swiftly answer Multi-Dimensional
Analytical (MDA) queries. It's a key component of business intelligence, which encompasses various
technologies like relational databases, report writing, and data mining. OLAP is essential for businesses
seeking to extract valuable insights from their data quickly and efficiently.
Types of Online Analytical Processing:
1. MOLAP (Multidimensional OLAP):
• In MOLAP, data is stored in a multidimensional cube format, fulfilling the needs of analytical
applications where access to summarized data is required.
• Example: A retail company utilizes MOLAP to analyze sales performance across different
product categories and regions, enabling them to identify trends and optimize inventory
management.
• Advantages:
• MOLAP cubes are designed for quick data retrieval, making them ideal for slicing
operations.
• Complex calculations can be performed rapidly.
• Disadvantages:
• Limited scalability due to all calculations being performed during cube construction.
• Adoption of MOLAP may require additional investments in terms of human resources
and capital.
2. ROLAP (Relational OLAP):
• ROLAP relies on manipulating data stored in relational databases, where detailed level
values are present.
• Example: A financial institution utilizes ROLAP to analyze customer transactions and identify
patterns of fraudulent activity, helping them prevent financial losses.
• Advantages:
• Can handle large volumes of data efficiently.
• Leverages functionalities inherent in relational databases.
• Disadvantages:
• Performance may be slower compared to MOLAP, especially for large datasets.
• Limited by SQL functionalities, which might not cover all analytical needs.
3. HOLAP (Hybrid OLAP):
• HOLAP technologies combine the benefits of both MOLAP and ROLAP.
• Example: A healthcare organization uses HOLAP to analyze patient data stored in a
combination of multidimensional cubes and relational databases, allowing for quick access
to summarized data and detailed patient records.
• Products like Microsoft Analysis Services, Oracle Database OLAP Option, and MicroStrategy
offer HOLAP storage solutions.
• HOLAP allows for the flexibility of MOLAP's quick data access and ROLAP's ability to handle
large datasets efficiently.
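The difference between ROLAP and MOLAP can be sketched with a toy dataset. This is an illustrative sketch only, not how any of the products named above are implemented: the table name, column names, and data are invented. The ROLAP-style function aggregates detail rows with SQL at query time, while the MOLAP-style function looks up a total that was pre-aggregated when the cube was built.

```python
import sqlite3
from collections import defaultdict

# Hypothetical detail-level sales rows: (product, region, amount).
rows = [
    ("laptop", "north", 1200.0),
    ("laptop", "south", 900.0),
    ("phone", "north", 600.0),
    ("phone", "north", 400.0),
]

# ROLAP style: keep the detail rows in a relational table and aggregate with SQL per query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)


def rolap_total(product, region):
    cur = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE product = ? AND region = ?",
        (product, region),
    )
    return cur.fetchone()[0]


# MOLAP style: aggregate once while building the cube; each query is then a simple lookup.
cube = defaultdict(float)
for product, region, amount in rows:
    cube[(product, region)] += amount


def molap_total(product, region):
    return cube.get((product, region), 0.0)


print(rolap_total("phone", "north"), molap_total("phone", "north"))
```

Both functions return the same total. The trade-off is where the work happens: the cube pays the aggregation cost up front and answers quickly, while the relational version keeps the detail-level values and computes on demand. A HOLAP design would keep both and choose per query.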
OLAP, or Online Analytical Processing, is a method of quickly analyzing large volumes of data to gain
insights for decision-making. It allows users to interactively analyze multidimensional data from different
perspectives. OLAP systems are crucial for businesses to understand trends, patterns, and relationships
within their data.
Types of OLAP:
1. MOLAP (Multidimensional OLAP): Data is stored in multidimensional cubes, optimized for fast data
retrieval and complex calculations.
2. ROLAP (Relational OLAP): Data is stored in relational databases, allowing for flexibility and
scalability but may be slower in performance.
3. HOLAP (Hybrid OLAP): Combines aspects of both MOLAP and ROLAP, providing the best of both
worlds in terms of speed and flexibility.
Architecture of OLAP:
1. Data Warehouse:
• A central repository that stores structured data from various sources.
• Data warehouse collects, integrates, and organizes data to support OLAP analysis.
2. ETL Tools (Extract, Transform, Load):
• ETL tools extract data from different sources, transform it into a consistent format, and load
it into the data warehouse.
• These tools ensure data quality, consistency, and reliability for OLAP analysis.
3. OLAP Server:
• The OLAP server manages and organizes multidimensional data for analysis.
• It provides functionalities for querying, retrieving, and manipulating data stored in OLAP
cubes.
4. OLAP Database (OLAP DB):
• OLAP databases store pre-aggregated data in a multidimensional format.
• These databases optimize data storage and retrieval for OLAP queries.
5. OLAP Cubes:
• OLAP cubes are multidimensional structures that store aggregated data organized into
dimensions and measures.
• Dimensions represent the different perspectives or attributes of data, while measures are
the numerical values being analyzed.
6. OLAP Analytical Tools:
• OLAP analytical tools provide interfaces for users to interact with OLAP cubes and analyze
data.
• These tools allow users to slice, dice, drill down, and pivot data to explore trends and
patterns easily.
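The slice, dice, and roll-up operations can be illustrated on a very small cube held in a dictionary. This is a sketch under stated assumptions rather than a real OLAP engine: the dimensions (year, region, product), the cell values, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical cube: each cell is keyed by (year, region, product) and holds a sales measure.
DIMENSIONS = ("year", "region", "product")
cube = {
    (2023, "north", "laptop"): 100,
    (2023, "south", "laptop"): 80,
    (2023, "north", "phone"): 60,
    (2024, "north", "laptop"): 120,
    (2024, "south", "phone"): 90,
}


def slice_cube(cells, dimension, value):
    """Slice: fix one dimension to a single value."""
    i = DIMENSIONS.index(dimension)
    return {key: v for key, v in cells.items() if key[i] == value}


def dice_cube(cells, allowed):
    """Dice: keep cells whose coordinates fall inside the allowed values of several dimensions."""
    return {
        key: v
        for key, v in cells.items()
        if all(key[DIMENSIONS.index(d)] in values for d, values in allowed.items())
    }


def roll_up(cells, keep):
    """Roll-up: sum the measure over every dimension that is not listed in keep."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for key, v in cells.items():
        totals[tuple(key[i] for i in idx)] += v
    return dict(totals)


sales_2023 = slice_cube(cube, "year", 2023)
north_laptops = dice_cube(cube, {"region": {"north"}, "product": {"laptop"}})
sales_by_year = roll_up(cube, ["year"])
print(sales_by_year)
```

Drill-down is the reverse of roll-up: calling roll_up with more dimensions in keep, for example ["year", "region"], returns the same totals at a finer level of granularity.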
5. Describe the concept of data reduction and its methods.
Data Reduction:
Data reduction in the context of Business Intelligence refers to the process of efficiently reducing the size
and complexity of large datasets while maintaining their quality and usefulness for analysis. It involves
applying various techniques such as sampling, attribute selection, and aggregation to streamline data
processing, improve computation speed, enhance accuracy, and simplify model interpretation. By reducing
the volume of data to its most essential components, data reduction enables organizations to extract
meaningful insights and make informed decisions more effectively.
Goals of Data Reduction (each illustrated with a method):
1. Efficiency:
• Making the dataset smaller helps learning algorithms work faster.
• Shorter computation time means quicker analyses and results.
• Example: A retail company collects customer transaction data from its stores nationwide. By
applying sampling techniques, the company selects a representative sample of transactions
instead of analyzing the entire dataset. This reduces computation time, allowing analysts to
quickly identify trends and patterns in customer behavior, such as popular products or peak
shopping times.
2. Accuracy:
• Data reduction techniques should not compromise the accuracy of models generated.
• Some techniques can even improve the model's ability to generalize to new data.
• Example: A marketing team wants to analyze customer demographics to target advertising
campaigns effectively. Using attribute selection techniques, they identify the most relevant
demographic factors (such as age, gender, and income level) from a large dataset containing
various customer attributes. By focusing on these key attributes, they ensure that their
marketing strategies are based on accurate and meaningful insights.
3. Simplicity:
• Simplifying models is important for easier interpretation by experts.
• Decision makers may accept a slight decrease in accuracy for simpler, more understandable
rules.
• Example: An insurance company analyzes claims data to identify fraud patterns. To create
interpretable models for fraud detection, they apply data reduction techniques such as
discretization and aggregation. By grouping similar claim characteristics (such as claim
amount and type of injury) into categories and summarizing them, they develop simpler
rules for identifying suspicious claims that can be easily understood by claims investigators.
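Two of the techniques mentioned in the examples above, sampling and discretization followed by aggregation, can be sketched as follows. The claim amounts are randomly generated for illustration, and the bin boundaries (1,000 and 5,000) and category names are assumptions, not values from any real insurer.

```python
import random
from collections import Counter

# Hypothetical claim amounts, generated with a fixed seed so the run is repeatable.
rng = random.Random(42)
claims = [rng.randint(100, 10_000) for _ in range(1_000)]

# Sampling: analyze a random subset of 100 claims instead of all 1,000.
sample = rng.sample(claims, k=100)


def discretize(amount):
    """Discretization: map a numeric claim amount to one of three categories."""
    if amount < 1_000:
        return "low"
    if amount < 5_000:
        return "medium"
    return "high"


# Aggregation: summarize the sample as a count per category.
counts = Counter(discretize(amount) for amount in sample)
print(counts)
```

The sample is a tenth of the original size (efficiency), and the three categories are easier to turn into readable rules than raw amounts (simplicity). Whether accuracy is preserved depends on how representative the sample is and how well the bins are chosen.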
10. Describe the concept of supervised and unsupervised learning models in detail.
Refer to Question no. 6.
• Linear regression is a statistical method used to model the relationship between a dependent variable
(target) and one or more independent variables (predictors).
• It assumes a linear relationship between the independent variables and the dependent variable.
Formula:
• For multiple linear regression (one dependent variable and several independent variables): Y = β0 + β1X1 + β2X2 + ... +
βnXn, where:
• Y is the dependent variable and X1, X2, ..., Xn are the independent variables.
• β0 is the intercept (the value of Y when all independent variables are zero).
• β1, β2, ..., βn are the coefficients (slopes) representing the change in Y for a unit change in each
independent variable, holding the others fixed.
Example: Let's say we want to predict house prices based on their size (in square feet). In this case:
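A minimal sketch of fitting a line to house sizes and prices with ordinary least squares is given below. The sizes and prices are made-up numbers chosen to lie exactly on a line, and the function name is hypothetical. With one predictor the model reduces to Y = β0 + β1X1.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (beta0, beta1) minimizing the sum of squared errors for y = beta0 + beta1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    # Intercept: forces the fitted line through the point (mean_x, mean_y).
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


# Hypothetical data: size in square feet and price, exactly price = 50,000 + 100 * size.
sizes = [1000, 1500, 2000, 2500]
prices = [150_000, 200_000, 250_000, 300_000]

beta0, beta1 = fit_simple_linear_regression(sizes, prices)
predicted_price = beta0 + beta1 * 1800  # predicted price of an 1,800 sq ft house
print(beta0, beta1, predicted_price)
```

Here β1 is the estimated price increase per additional square foot and β0 is the fitted price at size zero, which has no practical meaning but anchors the line. Real data would not fit exactly, so the line would only approximate the prices.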
Logistic Regression: Logistic regression is a statistical method used for binary classification. It predicts the probability
of occurrence of an event by fitting data to a logistic function, also known as the sigmoid function.
Formula:
• σ(x) = 1 / (1 + e^(-x))
• Where x is the linear combination of independent variables.
Key Points:
• It maps the linear combination of independent variables to a probability between 0 and 1 using the
logistic function.
• A threshold value (typically 0.5) is used to classify the outcome into two classes.
Types of Logistic Regression:
• Binomial:
• Only two possible types of the dependent variable (e.g., Pass or Fail).
• Multinomial:
• Three or more possible unordered types of the dependent variable (e.g., "cat", "dog",
"sheep").
• Ordinal:
• Three or more possible ordered types of dependent variables (e.g., "low", "medium",
"high").
Assumptions:
• Logistic regression assumes independent observations, a binary dependent variable, a linear relationship
between independent variables and log odds, the absence of outliers, and a large sample size to ensure
reliable estimates.
• Logistic regression predicts the output of a categorical dependent variable, producing probabilistic values
between 0 and 1, unlike linear regression which predicts continuous values.
• While linear regression fits a straight regression line, logistic regression fits an "S" shaped logistic function to
predict binary outcomes (0 or 1).
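The mapping from a linear combination to a probability, and then to a class using a threshold, can be sketched as follows. The weight and bias are hypothetical hand-picked values for a pass/fail example based on hours studied. They are not fitted from data, and no training step is shown.

```python
import math


def sigmoid(x):
    """Logistic (sigmoid) function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_probability(features, weights, bias):
    # x is the linear combination of the independent variables.
    x = bias + sum(w * f for w, f in zip(weights, features))
    return sigmoid(x)


def classify(features, weights, bias, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, otherwise 0."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0


# Hypothetical model: probability of passing as a function of hours studied.
weights = [1.2]
bias = -6.0

print(predict_probability([8], weights, bias), classify([8], weights, bias))
print(predict_probability([2], weights, bias), classify([2], weights, bias))
```

The output of predict_probability traces the "S" shaped curve as the hours increase, and the threshold turns that probability into a binary outcome.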
Naive Bayes: Naive Bayes is a supervised learning algorithm based on Bayes’ Theorem and used for
classification tasks. Despite its simplifying assumption of feature independence, it is popular due to its
simplicity and efficiency in machine learning.
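• Formula: P(A|B) = P(B|A) × P(A) / P(B)
• Where: P(A|B) is the posterior probability of class A given the observed features B, P(B|A) is the
likelihood of the features given the class, P(A) is the prior probability of the class, and P(B) is the
probability of the features (the evidence).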
• Example:
• In sentiment analysis, a Naive Bayes classifier can determine whether a customer review is positive or
negative based on the presence of certain keywords, such as "good" or "bad". Similarly, in medical
diagnosis, it can predict the likelihood of a patient having a particular disease based on symptoms
like fever, cough, or headache.
• Assumption:
• Naive Bayes assumes feature independence, meaning each feature is independent of others given
the class label.
• For continuous features, it assumes a normal distribution within each class, and for discrete features,
it assumes a multinomial distribution.
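The sentiment analysis example can be sketched as a small multinomial Naive Bayes classifier over word counts. The training reviews are invented, the function names are hypothetical, and add-one (Laplace) smoothing is an added assumption so that a word unseen in one class does not give a zero probability.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled reviews used as training data.
train = [
    ("good product works great", "positive"),
    ("really good value", "positive"),
    ("bad quality broke fast", "negative"),
    ("bad service very bad", "negative"),
]


def train_naive_bayes(examples):
    class_counts = Counter()            # number of reviews per class (for the prior)
    word_counts = defaultdict(Counter)  # word frequencies per class (for the likelihood)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(text, class_counts, word_counts, vocab):
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Start from the log prior, then add one log likelihood per word.
        # Summing logs corresponds to multiplying probabilities, which relies on
        # the feature independence assumption.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen during training
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


model = train_naive_bayes(train)
print(predict("good value", *model))
print(predict("bad service", *model))
```

The evidence term P(B) is omitted because it is the same for every class and does not change which class has the highest score.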
Example: Consider a retail store that records daily sales data over the past year. Each data point represents the total
sales revenue for a specific day. Using time series analysis, the store can:
1. Identify Patterns and Trends: By plotting the sales data over time, the store can visually identify patterns and
trends, such as seasonal fluctuations, weekly sales cycles, or long-term growth trends. For instance, the store
may observe higher sales during holiday seasons or weekends compared to regular weekdays.
2. Forecast Future Sales: Time series analysis allows the store to develop predictive models that forecast future
sales based on historical patterns. Using techniques like moving averages, exponential smoothing, or
autoregressive integrated moving average (ARIMA) models, the store can estimate future sales trends and
adjust inventory levels or marketing strategies accordingly.
3. Detect Anomalies or Outliers: Time series analysis helps in detecting anomalies or outliers in the data that
deviate significantly from expected patterns. For example, a sudden spike or drop in sales may indicate a
special promotion or a supply chain disruption, prompting the store to investigate further and take
appropriate action.
4. Evaluate Intervention Effects: If the store implements changes or interventions, such as launching a new
marketing campaign or changing pricing strategies, time series analysis can assess the impact of these
interventions on sales performance over time. By comparing actual sales data with forecasted values, the
store can measure the effectiveness of its initiatives and make data-driven decisions for future strategies.
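Two of the forecasting techniques named above, moving averages and exponential smoothing, can be sketched in a few lines. The daily sales figures, the window of 3, and the smoothing factor of 0.5 are illustrative assumptions. ARIMA is not shown because it requires considerably more machinery.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: each value blends the new observation with the previous smoothed value."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical daily sales revenue for one week.
daily_sales = [100, 110, 120, 130, 140, 150, 160]

ma = moving_average(daily_sales, window=3)
es = exponential_smoothing(daily_sales, alpha=0.5)

# A naive one-step forecast uses the most recent smoothed value.
next_day_forecast = ma[-1]
print(ma, es, next_day_forecast)
```

A larger window or a smaller alpha smooths more aggressively but reacts more slowly to genuine changes. Comparing actual sales against such a forecast is also a simple way to flag the anomalies and intervention effects described in points 3 and 4.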
UNIT 5
1. What is AI? What are the major capabilities of AI.
2. List and explain the characteristics of AI.
3. What are the major advantages of AI over the natural intelligence?
4. What are the disadvantages of AI over the natural intelligence?
5. Explain the different applications of AI.