
IBM Slide CHAPTER 3

This document discusses key concepts in business analytics and optimization (BAO). It describes how BAO technologies help transform raw data into insights and decisions. Organizations gather massive amounts of data from various sources that BAO turns into information through business intelligence, and further analyzes to generate insights and foresights to support decisions. The document outlines essential BAO capabilities like trusted information, reporting and analysis, business performance management, and predictive analysis that organizations need.

Uploaded by

Wen Xin Gan

IBM ICE (Innovation Centre for Education)

Business Analytics and Optimization

Key Business Analytics and Optimization Concepts

• Figure 1 shows the flow and transformation of data: BAO technology drives the
refinement of raw data into insights and actionable decisions.
• As Figure 1 shows, organizations today gather massive amounts of raw data from various
sources, which can be dynamic. Information that is available to businesses has grown
exponentially in the past several decades. Although most executives are convinced of its
value, their enterprises need sophisticated analytics to capture that value.
• This data feeds business intelligence and performance management capabilities to
generate information that promotes awareness and measures the state of the business.
Additional analysis of information and data can generate insights.
• Advanced analytics can further process the information to provide foresight that
supports key decisions by business users. Input to transactional systems drives action
within the business environment.
• Enterprise information and content management solutions can help define
performance measurements and key performance indicators (KPIs) that help
management and personnel take action on the data.
The Need for BAO Now

• The application of business analytics is opening up important new possibilities for clients
and promises to transform the way consulting is practiced.

• BAO has defined the following competency areas that bring together critical skills that
are necessary to define and drive IBM leadership in the growing analytics market.

• Through these competencies, our clients can operate at a new level of intelligence and
achieve “breakaway” levels by using the following:

• BAO Strategy.
• Business Intelligence and Performance Management.
• Advanced Analytics and Optimization.
• Enterprise Information Management
• Enterprise Content Management

• By using the BAO Strategy, clients can achieve business objectives faster, with
less risk, and at a lower cost by defining and helping to implement
improvements in how information is identified and acted upon. Applied
enterprise-wide and deep within a business function, this strategy addresses
both what to do and how to do it with actions that span policy, analytics,
business process, organization, applications, and data.

• Business Intelligence and Performance Management empowers decision
making and improves business performance through timely access, analysis,
and reporting of actionable, accurate, and personalized information. This way,
organizations can translate strategies into actionable forecasts or plans,
monitor key financial and operational metrics, and improve insight,
corresponding actions, and ultimately performance across the enterprise.

• Advanced Analytics and Optimization enhances organizational performance
by applying advanced mathematical modeling, deep computing, simulation,
data analytics, and optimization techniques to improve operational efficiency.
Operational efficiency is accomplished by using analytical engines, data
mining, and statistical models that address specific business-process areas.

• Enterprise Information Management applies methods, techniques, and
technologies that address data architecture, extraction, transformation,
movement, storage, integration, and governance of enterprise information
and master data.

• Enterprise Content Management includes services, technologies, and processes
that are used to improve the capture, management, storage, access, preservation,
and electronic discovery of unstructured content. It focuses on management of
unstructured content and on drawing value from content through improved
information management, business processes, and advanced analytics. These
capabilities help clients improve the performance of their businesses by reducing
costs and driving efficiencies.
Essential Capabilities of BAO

• Trusted information – The organization must be able to provide information that is
based on trusted, high-quality data and sources.
• Reporting and analysis – The reporting and analysis environment in the organization
should be well established.
• Business performance management – The organization must have processes and
technologies for business performance management (BPM), and BPM must be a
strategic goal of the organization. BPM must be implemented at all business levels
and internal functional levels of the organization.
• Predictive analysis and mining – Organizations will increasingly use predictive analysis
and data mining techniques to gain advanced analytical capabilities.
• Real-time analytics – Analytics must reach real-time and near-real-time capability.
The organization must build up data streaming, analytical sandboxes, and real-time
analytical capabilities.
• Business rules (process) integration – A high level of business process integration
across the organization and its systems is needed to meet optimization goals.
BAO Capabilities: Business Performance Management

• An organization is made up of people with different skills and roles, all trying to “pull in the same direction” with
the goal of optimizing business performance. Each of these people requires a different level of information
and detail in order to make decisions that impact performance. Only IBM offers the complete range of
integrated Business Analytics capabilities to address the needs of your people.
• Through highly visual scorecards, dashboards, reports and real-time activity monitoring, decision makers
gain immediate insights regarding the health of the business and can understand what is happening in their
area of the business.
• By analyzing trends, statistics, correlation, and context, decision makers can understand what leads to the best
outcomes and discover why things are on or off track.
• Knowing what is likely to happen equips decision makers with the foresight they need to intervene.
Simulation through predictive modeling and “what-if” analysis enables decision makers to predict and act:
change the course to improve the outcomes. Financial and operational planning, budgeting, and forecasting
put resources in the right place and set targets for those allocations.
• Everyone in the organization can be confident in common, consistent, and trusted data. IBM allows you to
pull data from a range of systems and makes it easier to turn this data into information. Whatever their
knowledge level, everyone will be able to consume the information in a manner that is relevant to
them.
• The right information, in the right way, to the right people at the right time leads to optimized decision
making.
BAO Capabilities: Predictive Analysis and Mining
What is Cluster Analysis?

• Cluster: A collection of data objects that are
  • similar (or related) to one another within the same group
  • dissimilar (or unrelated) to the objects in other groups
• Cluster analysis
  • Finding similarities between data according to the characteristics found in the
    data and grouping similar data objects into clusters
• Unsupervised learning: no predefined classes
• Typical applications
  • As a stand-alone tool to get insight into data distribution
  • As a preprocessing step for other algorithms
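
The grouping idea above can be sketched with k-means, one common distance-based clustering method. The slides do not name a specific algorithm, so the choice of k-means, the function names, the sample points, and the starting centroids below are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        updated = [
            # New centroid = per-coordinate mean; an empty cluster keeps its old centroid.
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical 2-D data: two visibly separate groups, no class labels given.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 2.0), (8.0, 8.0), (9.0, 9.0), (8.0, 9.0)]
centroids, clusters = k_means(points, initial_centroids=[points[0], points[3]])
print(clusters)
```

No labels are supplied anywhere, which is what makes this unsupervised: the two groups come only from distances between the points.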
Clustering: Application Examples

• Biology: taxonomy of living things: kingdom, phylum, class, order, family,
genus and species
• Information retrieval: document clustering
• Land use: Identification of areas of similar land use in an earth observation
database
• Marketing: Help marketers discover distinct groups in their customer bases,
and then use this knowledge to develop targeted marketing programs
• City-planning: Identifying groups of houses according to their house type,
value, and geographical location
• Climate: understanding earth climate, find patterns of atmospheric and
ocean
Considerations for Cluster Analysis

• Partitioning criteria
  • Single level vs. hierarchical partitioning (often, multi-level hierarchical partitioning is
    desirable)
• Separation of clusters
  • Exclusive (e.g., one customer belongs to only one region) vs. non-exclusive (e.g., one
    document may belong to more than one class)
• Similarity measure
  • Distance-based (e.g., Euclidean) vs. connectivity-based (e.g., density)
• Clustering space
  • Full space (often when low dimensional) vs. subspaces (often in high-dimensional clustering)
What are Association Rules?

• Study of “what goes with what”
  • “Customers who bought X also bought Y”
  • What symptoms go with what diagnosis
• Transaction-based or event-based
• Also called “market basket analysis” and “affinity analysis”
• Originated with the study of customer transaction databases to
  determine associations among items purchased
• Frequent pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs
  frequently in a data set
• Motivation: Finding inherent regularities in data
  • What products were often purchased together? Beer and diapers?!
  • What are the subsequent purchases after buying a PC?
  • What kinds of DNA are sensitive to this new drug?
  • Can we automatically classify web documents?
• Applications
  • Basket data analysis, cross-marketing, catalog design, sale campaign analysis, Web log (click
    stream) analysis, and DNA sequence analysis.
Generating Rules - Terms

• “IF” part = antecedent
• “THEN” part = consequent
• “Item set” = the items (e.g., products) comprising the antecedent or consequent
• Antecedent and consequent are disjoint (i.e., have no items in common)
Basic Concepts: Frequent Patterns

Tid  Items bought
10   Beer, Nuts, Diaper
20   Beer, Coffee, Diaper
30   Beer, Diaper, Eggs
40   Nuts, Eggs, Milk
50   Nuts, Coffee, Diaper, Eggs, Milk

[Figure: Venn diagram of customers who buy beer, customers who buy diapers, and customers who buy both]

• itemset: A set of one or more items
• k-itemset: X = {x1, …, xk}
• (absolute) support, or support count, of X: frequency or number of occurrences of an
  itemset X
• (relative) support, s: the fraction of transactions that contain X (i.e.,
  the probability that a transaction contains X)
• An itemset X is frequent if X’s support is no less than a minsup threshold
Basic Concepts: Frequent Patterns

Tid  Items bought
10   Beer, Nuts, Diaper
20   Beer, Coffee, Diaper
30   Beer, Diaper, Eggs
40   Nuts, Eggs, Milk
50   Nuts, Coffee, Diaper, Eggs, Milk

[Figure: Venn diagram of customers who buy beer, customers who buy diapers, and customers who buy both]

• Find all the rules X → Y with minimum support and confidence
  • support, s: probability that a transaction contains X ∪ Y
  • confidence, c: conditional probability that a transaction having X also contains Y
• Let minsup = 50%, minconf = 50%
  • Frequent patterns: Beer:3, Nuts:3, Diaper:4, Eggs:3, {Beer, Diaper}:3
• Association rules (many more!):
  • Beer → Diaper (60%, 100%)
  • Diaper → Beer (60%, 75%)
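
As a check on the numbers above, support and confidence can be computed directly from the five transactions. This is a minimal sketch; the function names `support_count`, `support`, and `confidence` are chosen here for illustration and do not come from the slides.

```python
# The five transactions from the slide, keyed by Tid.
transactions = {
    10: {"Beer", "Nuts", "Diaper"},
    20: {"Beer", "Coffee", "Diaper"},
    30: {"Beer", "Diaper", "Eggs"},
    40: {"Nuts", "Eggs", "Milk"},
    50: {"Nuts", "Coffee", "Diaper", "Eggs", "Milk"},
}


def support_count(itemset):
    """Absolute support: number of transactions containing every item in itemset."""
    return sum(1 for items in transactions.values() if set(itemset) <= items)


def support(itemset):
    """Relative support: fraction of transactions containing the itemset."""
    return support_count(itemset) / len(transactions)


def confidence(antecedent, consequent):
    """Share of transactions containing the antecedent that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support_count(both) / support_count(antecedent)


for x, y in [({"Beer"}, {"Diaper"}), ({"Diaper"}, {"Beer"})]:
    print(f"{sorted(x)} -> {sorted(y)}: "
          f"support={support(x | y):.0%}, confidence={confidence(x, y):.0%}")
```

The two printed rules match the slide: Beer → Diaper at (60%, 100%) and Diaper → Beer at (60%, 75%).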
Apriori Algorithm

Generating Frequent Item Sets


For k products…
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the
support criterion
3. Use the list of one-item sets to generate list of
two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-
item sets
5. Continue up through k-item sets
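
The level-wise steps above can be sketched as follows. This is an illustrative version, not a reference implementation: the function name `apriori` and the use of an absolute count threshold are assumptions, and the pruning of candidates whose subsets are infrequent is the usual Apriori refinement rather than something the steps above spell out. The sample data is the beer-and-diaper transaction table used earlier in the deck.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: support count} for every itemset that meets min_count."""

    def count(candidates):
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # Steps 1-2: frequent one-item sets.
    items = {item for t in transactions for item in t}
    singles = count({frozenset([i]) for i in items})
    current = {c: n for c, n in singles.items() if n >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        # Steps 3-5: build k-item candidates from the frequent (k-1)-item sets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Keep a candidate only if all of its (k-1)-item subsets are frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = {c: n for c, n in count(candidates).items() if n >= min_count}
        frequent.update(current)
        k += 1
    return frequent


baskets = [
    {"Beer", "Nuts", "Diaper"},
    {"Beer", "Coffee", "Diaper"},
    {"Beer", "Diaper", "Eggs"},
    {"Nuts", "Eggs", "Milk"},
    {"Nuts", "Coffee", "Diaper", "Eggs", "Milk"},
]
# minsup = 50% of 5 transactions, so an itemset needs at least 3 occurrences.
result = apriori(baskets, min_count=3)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

Because each level is built only from the frequent itemsets of the previous level, most of the possible itemsets are never counted at all.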
Association Rule Mining
• Given a set of transactions, find rules that will predict the
  occurrence of an item based on the occurrences of other items
  in the transaction

Market-basket transactions:

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Examples of association rules:
• {Diaper} → {Beer}
• {Milk, Bread} → {Eggs, Coke}
• {Beer, Bread} → {Milk}

Implication means co-occurrence, not causality!
Definition: Frequent Itemset

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

• Itemset
  • A collection of one or more items
  • Example: {Milk, Bread, Diaper}
• k-itemset
  • An itemset that contains k items
• Support count (σ)
  • Frequency of occurrence of an itemset
  • E.g., σ({Milk, Bread, Diaper}) = 2
• Support
  • Fraction of transactions that contain an itemset
  • E.g., s({Milk, Bread, Diaper}) = 2/5
• Frequent itemset
  • An itemset whose support is greater than or equal to a minsup threshold
Definition: Association Rule

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

• Association rule
  • An implication expression of the form X → Y, where X and Y are itemsets
  • Example: {Milk, Diaper} → {Beer}
• Rule evaluation metrics
  • Support (s): fraction of transactions that contain both X and Y
  • Confidence (c): measures how often items in Y appear in transactions that contain X

Example: {Milk, Diaper} → {Beer}

$$s = \frac{\sigma(\{\text{Milk, Diaper, Beer}\})}{|T|} = \frac{2}{5} = 0.4$$

$$c = \frac{\sigma(\{\text{Milk, Diaper, Beer}\})}{\sigma(\{\text{Milk, Diaper}\})} = \frac{2}{3} \approx 0.67$$
Association Rule Mining Task

• Given a set of transactions T, the goal of association rule mining is to
  find all rules having
  • support ≥ minsup threshold
  • confidence ≥ minconf threshold
• Brute-force approach:
  • List all possible association rules
  • Compute the support and confidence for each rule
  • Prune rules that fail the minsup and minconf thresholds
  ⇒ Computationally prohibitive!
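
To get a feel for why listing every rule is prohibitive, the sketch below counts the rules X → Y (X and Y non-empty and disjoint) that can be formed from d items and compares the count with the closed form 3^d − 2^(d+1) + 1. The function name `count_rules` is an assumption, and the six items are those appearing in the market-basket transactions used in this deck.

```python
from itertools import combinations


def count_rules(items):
    """Count rules X -> Y with X and Y non-empty and disjoint."""
    items = list(items)
    total = 0
    for size in range(2, len(items) + 1):
        for _itemset in combinations(items, size):
            # Every non-empty proper subset X of the itemset gives one rule
            # X -> (itemset minus X), so an itemset of this size yields 2**size - 2 rules.
            total += 2 ** size - 2
    return total


items = ["Bread", "Milk", "Diaper", "Beer", "Eggs", "Coke"]
d = len(items)
total_rules = count_rules(items)
closed_form = 3 ** d - 2 ** (d + 1) + 1
print(d, total_rules, closed_form)
```

The count grows roughly as 3^d, so even a modest product catalog makes full enumeration impractical.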
Mining Association Rules

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Example of rules:
{Milk, Diaper} → {Beer} (s=0.4, c=0.67)
{Milk, Beer} → {Diaper} (s=0.4, c=1.0)
{Diaper, Beer} → {Milk} (s=0.4, c=0.67)
{Beer} → {Milk, Diaper} (s=0.4, c=0.67)
{Diaper} → {Milk, Beer} (s=0.4, c=0.5)
{Milk} → {Diaper, Beer} (s=0.4, c=0.5)

Observations:
• All the above rules are binary partitions of the same itemset:
  {Milk, Diaper, Beer}
• Rules originating from the same itemset have identical support but
  can have different confidence
• Thus, we may decouple the support and confidence requirements
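
The observations above can be reproduced by splitting {Milk, Diaper, Beer} into every antecedent/consequent pair and computing the two metrics for each. This is a small sketch; the helper names `sigma` and `rules_from` are assumptions, and it presumes the itemset occurs in at least one transaction so that no antecedent has a zero count.

```python
from itertools import combinations

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def sigma(itemset):
    """Support count: number of transactions that contain the whole itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)


def rules_from(itemset):
    """All binary partitions X -> Y of itemset, with (support, confidence) for each."""
    itemset = frozenset(itemset)
    s = sigma(itemset) / len(transactions)  # identical for every rule from this itemset
    rules = {}
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            x = frozenset(antecedent)
            rules[(x, itemset - x)] = (s, sigma(itemset) / sigma(x))
    return rules


rules = rules_from({"Milk", "Diaper", "Beer"})
for (x, y), (s, c) in rules.items():
    print(f"{sorted(x)} -> {sorted(y)}  s={s:.1f}  c={c:.2f}")
```

Support is computed once per itemset, while confidence depends only on the support counts of the itemset and its antecedent. That is why the two thresholds can be handled in separate passes.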

• Two-step approach:
  1. Frequent itemset generation: generate all itemsets whose support ≥ minsup
  2. Rule generation: generate high-confidence rules from each frequent itemset, where each
     rule is a binary partitioning of a frequent itemset
• Frequent itemset generation is still computationally expensive


Frequent Itemset Generation

Itemset lattice for five items (A to E):

null
A B C D E
AB AC AD AE BC BD BE CD CE DE
ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE
ABCD ABCE ABDE ACDE BCDE
ABCDE

Given d items, there are 2^d possible candidate itemsets.
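
The lattice can be generated level by level with the standard library, which also confirms the 2^d count for d = 5. A brief sketch; the function name `candidate_itemsets` is an assumption.

```python
from itertools import combinations


def candidate_itemsets(items):
    """Yield every subset of items, one lattice level at a time (empty set first)."""
    items = sorted(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield combo


lattice = list(candidate_itemsets("ABCDE"))
levels = [sum(1 for c in lattice if len(c) == k) for k in range(6)]
print(len(lattice), levels)
```

The level sizes are the binomial coefficients for d = 5, and they sum to 2^5 = 32 including the empty (null) itemset.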
• Brute-force approach:
  • Each itemset in the lattice is a candidate frequent itemset
  • Count the support of each candidate by scanning the database
  • Match each transaction against every candidate

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

[Figure: the N transactions, each at most w items wide, are matched against a list of M candidates]

  • Complexity ~ O(NMw) => expensive, since M = 2^d!
Prediction Problems: Classification vs. Numeric Prediction

• Classification
  • predicts categorical class labels (discrete or nominal)
  • classifies data (constructs a model) based on the training
    set and the values (class labels) in a classifying attribute, and
    uses the model to classify new data
• Numeric prediction
  • models continuous-valued functions, i.e., predicts unknown
    or missing values
• Typical applications
  • Credit/loan approval
  • Medical diagnosis: whether a tumor is cancerous or benign
  • Fraud detection: whether a transaction is fraudulent
  • Web page categorization: which category a page belongs to
Classification: A Two-Step Process
• Model construction: describing a set of predetermined classes
  • Each tuple/sample is assumed to belong to a predefined class, as determined by the
    class label attribute
  • The set of tuples used for model construction is the training set
  • The model is represented as classification rules, decision trees, or mathematical
    formulae
• Model usage: for classifying future or unknown objects
  • Estimate the accuracy of the model
    • The known label of each test sample is compared with the classified result from the
      model
    • Accuracy rate is the percentage of test set samples that are correctly classified by
      the model
    • The test set is independent of the training set (otherwise overfitting inflates the estimate)
  • If the accuracy is acceptable, use the model to classify new data
• Note: If the test set is used to select models, it is called a validation (test) set
Process (1): Model Construction

Training data:

NAME  RANK            YEARS  TENURED
Mike  Assistant Prof  3      no
Mary  Assistant Prof  7      yes
Bill  Professor       2      yes
Jim   Associate Prof  7      yes
Dave  Assistant Prof  6      no
Anne  Associate Prof  3      no

Training data → Classification algorithms → Classifier (model):

IF rank = ‘professor’ OR years > 6
THEN tenured = ‘yes’
Process (2): Using the Model in Prediction

Testing data:

NAME     RANK            YEARS  TENURED
Tom      Assistant Prof  2      no
Merlisa  Associate Prof  7      no
George   Professor       5      yes
Joseph   Assistant Prof  7      yes

Unseen data: (Jeff, Professor, 4). Tenured?
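
The two steps can be traced in a few lines: encode the learned rule as a function, estimate its accuracy on the testing data, then apply it to the unseen tuple. This is a sketch for illustration only; the function name `predict_tenured` and the tuple layout are assumptions.

```python
def predict_tenured(rank, years):
    """The classifier from model construction: professor OR more than 6 years."""
    return "yes" if rank == "Professor" or years > 6 else "no"


# Testing data from the slide: (name, rank, years, known label).
test_set = [
    ("Tom", "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George", "Professor", 5, "yes"),
    ("Joseph", "Assistant Prof", 7, "yes"),
]

# Accuracy = share of test samples whose predicted label equals the known label.
correct = sum(predict_tenured(rank, years) == label for _, rank, years, label in test_set)
accuracy = correct / len(test_set)
print(f"accuracy on test set: {accuracy:.0%}")

# Unseen data: (Jeff, Professor, 4)
jeff = predict_tenured("Professor", 4)
print("Jeff tenured?", jeff)
```

The rule misclassifies Merlisa (7 years but not tenured), so it scores 3 out of 4 on the test set. For Jeff it predicts tenure because his rank is Professor.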
Linear Regression
• Linear dependence: constant rate of increase of one variable with respect
  to another (as opposed to, e.g., diminishing returns).
• Regression analysis describes the relationship between two (or more)
  variables.
• Examples:
  • Income and educational level
  • Demand for electricity and the weather
  • Home sales and interest rates
• Our focus:
  • Gain some understanding of the mechanics:
    • the regression line
    • regression error
  • Learn how to interpret and use the results.
  • Learn how to set up a regression analysis.
Two Main Questions:
• Prediction and forecasting
  • Predict home sales for December given the interest rate for this month.
  • Use time series data (e.g., sales vs. year) to forecast future performance
    (next year's sales).
  • Predict the selling price of houses in some area.
    • Collect data on several houses (number of bedrooms, number of bathrooms, square
      footage, lot size, property tax) and their selling price.
    • Can we use this data to predict the selling price of a specific house?
• Quantifying causality
  • Determine factors that relate to the variable to be predicted; e.g., predict
    growth for the economy in the next quarter: use past history on
    quarterly growth, index of leading economic indicators, and others.
  • Determine advertising expenditure and promotion for the 1999
    Ford Explorer.
    • Sales over a quarter might be influenced by: ads in print, ads on radio, ads on
      TV, and other promotions.
Motivating Example
• Predict the selling prices of houses in the region.
  • Intuitively, we should compare the house for which we need a predicted selling price with houses
    that have sold recently in the same area, of roughly the same size, same style, etc.
  • Idea: Treat it as a multiple sample problem.
  • Unfortunately, the list of houses meeting these criteria may be quite small, or there may not be a house
    with exactly the same characteristics.
• Alternative approach: Consider the factors that determine the selling price of a house in this region.
  • Collect recent historical data on selling prices, and a number of characteristics
    about each house sold (size, age, style, etc.).
• Idea: one sample problem
  • To predict the selling price of a house without any particular knowledge of the house, we use the
    average selling price of all of the houses in the data set.
• Better idea:
  • One of the factors that cause houses in the data set to sell for different amounts of money is the fact that
    houses come in various sizes.
  • A preliminary model might posit that the average value per square foot of a new house is $40 and that
    the average lot sells for $20,000. The predicted selling price of a house of size x (in square feet) would be:
    20,000 + 40x.
  • A house of 2,000 square feet would be estimated to sell for 20,000 + 40(2,000) = $100,000.

• Probability model:
  • We know, however, that this is just an approximation, and the selling price of this
    particular house of 2,000 square feet is not likely to be exactly $100,000.
  • Prices for houses of this size may actually range from $50,000 to $150,000.
  • In other words, the deterministic model is not really suitable. We should therefore
    consider a probabilistic model.
• Let Y be the actual selling price of the house. Then
    Y = 20,000 + 40x + ε,
  where ε (Greek letter epsilon) represents a random error term (which
  might be positive or negative).
  • If the error term ε is usually small, then we can say the model is a good one.
  • The random term, in theory, accounts for all the variables that are not part of the
    model (for instance, lot size, neighborhood, etc.).
  • The value of ε will vary from sale to sale, even if the house size remains constant.
    That is, houses of the exact same size may sell for different prices.
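
In practice the intercept and slope are not posited but estimated from sales data. A common way to do this is ordinary least squares, which picks the line that minimizes the sum of squared errors. The sketch below assumes that method; the function name `fit_line` and the five sales records are made up for illustration and are not taken from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical sales: house size in square feet and selling price in dollars.
sizes = [1000, 1500, 2000, 2500, 3000]
prices = [65000, 77000, 100000, 116000, 142000]

intercept, slope = fit_line(sizes, prices)
# Each residual is the estimated error term for one sale: actual minus fitted price.
residuals = [y - (intercept + slope * x) for x, y in zip(sizes, prices)]
print(f"price = {intercept:.0f} + {slope:.2f} * size")
print([round(r) for r in residuals])
```

The residuals play the role of ε: they are positive for some sales and negative for others, and with an intercept in the model they sum to zero up to rounding.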
IBM Business Analytics Maturity Model

• The IBM Business Analytics Maturity Model identifies five stages of information and analytics
  maturity and maps each one to the maturity of business operations within an
  organization:
  • Ad hoc – At this stage, information lives mainly in spreadsheets and extracts, and most
    analysis is ad hoc. Business operations follow a strong command-and-control
    structure.
  • Foundational – At this stage, the organization sets up data warehouses, governance models, and
    production reporting processes. Business operations are generally driven by task integration
    (most likely through ERP systems).
  • Competitive – At this stage, master data management (MDM), dashboards, and
    scorecards are used heavily within the organization, and most business operations run on
    process automation and workflow methodologies.
  • Differentiating – At this stage, organizations start using predictions, contextual business rules, and
    patterns. Business operations run on complete business process integration and
    collaboration (such as CRM).
  • Breakaway – At this highest stage of maturity, organizations use a very high level of
    internally built analytical capability, and usage is prescriptive, real-time, and pattern-based,
    with strategies that take situational context into account.
Advantages to Implementing BAO Solutions

• Organizations that are just beginning to develop analytical decision support often
start with ad hoc tools, such as spreadsheets and SQL. These tools have the
advantages of simplicity, low cost, and flexibility, and therefore, encourage
experimentation.

• Successful implementation gives the organization the confidence to pursue
  analytical approaches with more vigor. However, such tools often prove difficult
  to maintain, and they do not scale well. As an organization’s appetite for analytics
  increases, it has several options to consider:
  • Buy a prepackaged application for a specific purpose
  • Build a custom application from components that are knit together with hand-coded
    software
  • Build a custom application that is based on a platform with standardized components

• Buy a prepackaged application for a specific purpose.
  • These kinds of applications often embody industry-specific knowledge and best
    practices. When they are applicable to an organization’s business, they can be fast and
    economical to implement. However, they might not match the organization’s business
    model or processes. Therefore, adopting one of them might require compromises.
• Build a custom application from components that are knit together with hand-coded
  software.
  • These types of applications can mirror the organization’s business model or processes,
    but they can be costly, time-consuming, and risky to develop.
• Build a custom application that is based on a platform with standardized
  components.
  • This approach allows for customization but reduces the cost, time, and risk of
    development by reusing standard components and minimizing the amount of custom
    software engineering that is required. Although the platform is standard, any type of
    custom application has some up-front costs associated with it.
Summary