
School of Computing Science and Engineering

Course Code : …………….. Course Name: Data Analytics

UNIT 2
MULTIVARIATE ANALYSIS
AND
BAYESIAN MODELING

Name of the Faculty: Ms. Kirti Program Name: BTech


Multivariate Analysis
There are many different techniques for multivariate analysis, and they can be divided into two categories:
• Dependence techniques
• Interdependence techniques
Multivariate analysis techniques: Dependence vs. interdependence
When we use the terms “dependence” and “interdependence,” we’re referring to different types of
relationships within the data. To give a brief explanation:
Dependence methods
• Dependence methods are used when one or more of the variables are dependent on others. Dependence
looks at cause and effect: can the values of two or more independent variables be used to explain, describe,
or predict the value of another, dependent variable? To give a simple example, the dependent variable
"weight" might be predicted by independent variables such as "height" and "age."
• In machine learning, dependence techniques are used to build predictive models. The analyst enters
input data into the model, specifying which variables are independent and which ones are dependent:
in other words, which variables they want the model to predict, and which variables they want the
model to use to make those predictions.
Cont…..
Interdependence methods
• Interdependence methods are used to understand the structural makeup
and underlying patterns within a dataset. In this case, no variables are
dependent on others, so you’re not looking for causal relationships.
Rather, interdependence methods seek to give meaning to a set of
variables or to group them together in meaningful ways.
• So: One is about the effect of certain variables on others, while the
other is all about the structure of the dataset.
Cont……..
Some useful multivariate analysis techniques are:
• Multiple linear regression
• Multiple logistic regression
• Multivariate analysis of variance (MANOVA)
• Factor analysis
• Cluster analysis
Multiple linear regression

Multiple linear regression is a dependence method which looks at the relationship between
one dependent variable and two or more independent variables. A multiple regression
model will tell you the extent to which each independent variable has a linear relationship
with the dependent variable. This is useful as it helps you to understand which factors are
likely to influence a certain outcome, allowing you to estimate future outcomes.
Example of multiple regression:
As a data analyst, you could use multiple regression to predict crop growth. In this
example, crop growth is your dependent variable and you want to see how different factors
affect it. Your independent variables could be rainfall, temperature, amount of sunlight, and
amount of fertilizer added to the soil. A multiple regression model would show you the
proportion of variance in crop growth that each independent variable accounts for.
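The crop-growth example above can be sketched with ordinary least squares. All numbers below are invented for illustration; a real analysis would typically use a library such as statsmodels or scikit-learn, but fitting with `numpy.linalg.lstsq` shows the idea of estimating one slope per independent variable plus the proportion of variance explained (R²):

```python
import numpy as np

# Hypothetical crop data: each row is one field's
# [rainfall (mm), temperature (°C), sunlight (hours), fertilizer (kg)].
X = np.array([
    [120.0, 22.0, 8.0, 3.0],
    [100.0, 25.0, 9.0, 2.5],
    [140.0, 20.0, 7.5, 4.0],
    [ 90.0, 27.0, 9.5, 2.0],
    [130.0, 21.0, 8.5, 3.5],
    [110.0, 24.0, 8.0, 3.0],
])
y = np.array([30.0, 28.0, 33.0, 25.0, 32.0, 29.0])  # crop growth (cm)

# Add an intercept column and solve the least-squares problem
# y ≈ b0 + b1*rainfall + b2*temperature + b3*sunlight + b4*fertilizer.
X1 = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
print("intercept:", coef[0])
print("slopes:", coef[1:])

# R²: the proportion of variance in crop growth the model accounts for.
resid = y - X1 @ coef
r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
print("R^2:", round(r2, 3))
```

Each entry of `coef[1:]` estimates the linear effect of one independent variable on crop growth, holding the others fixed.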
Multiple logistic regression

Logistic regression analysis is used to calculate (and predict) the probability
of a binary event occurring. A binary outcome is one where there are only two
possible outcomes: either the event occurs (1) or it doesn't (0). So, based on a
set of independent variables, logistic regression can predict how likely it is
that a certain scenario will arise. It is also used for classification.
Example of logistic regression:
Let’s imagine you work as an analyst within the insurance sector and you
need to predict how likely it is that each potential customer will make a claim.
You might enter a range of independent variables into your model, such as
age, whether or not they have a serious health condition, their occupation, and
so on. Using these variables, a logistic regression analysis will calculate the
probability of the event (making a claim) occurring. Another cited example is
the filters used to classify email as “spam” or “not spam.”
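The insurance example can be sketched with a logistic model fitted by plain gradient descent. The data below is made up, and a production analysis would use a proper solver (e.g. scikit-learn's `LogisticRegression`), but the sketch shows the key step: squashing a linear combination of the independent variables through the sigmoid to get a claim probability between 0 and 1.

```python
import numpy as np

# Hypothetical customers: [age, has_serious_condition (0/1)];
# label = 1 if the customer made a claim.
X = np.array([[25, 0], [30, 0], [45, 1], [50, 1],
              [35, 0], [60, 1], [28, 0], [55, 1]], dtype=float)
y = np.array([0, 0, 1, 1, 0, 1, 0, 1], dtype=float)

# Standardise the features and add an intercept column.
X = (X - X.mean(axis=0)) / X.std(axis=0)
X = np.column_stack([np.ones(len(X)), X])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit the weights by gradient descent on the log-loss.
w = np.zeros(X.shape[1])
for _ in range(5000):
    p = sigmoid(X @ w)
    w -= 0.1 * X.T @ (p - y) / len(y)

probs = sigmoid(X @ w)
print("claim probabilities:", probs.round(2))
print("predicted classes:", (probs >= 0.5).astype(int))
```

Thresholding the probabilities at 0.5 turns the regression into a classifier, which is exactly how the spam/not-spam filters mentioned above work.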
Multivariate analysis of variance (MANOVA)

• Multivariate analysis of variance (MANOVA) is used to measure the effect
of multiple independent variables on two or more dependent variables.
With MANOVA, it’s important to note that the independent variables are
categorical, while the dependent variables are metric in nature. A
categorical variable is a variable that belongs to a distinct category—for
example, the variable “employment status” could be categorized into
certain units, such as “employed full-time,” “employed part-time,”
“unemployed,” and so on. A metric variable is measured quantitatively and
takes on a numerical value.
• In MANOVA analysis, you’re looking at various combinations of the
independent variables to compare how they differ in their effects on the
dependent variable.
Example of MANOVA:

Let's imagine you work for an engineering company that is on a mission to
build a super-fast, eco-friendly rocket. You could use MANOVA to measure the
effect that various design combinations have on both the speed of the rocket
and the amount of carbon dioxide it emits. In this scenario, your categorical
independent variables could be:
Engine type, categorized as E1, E2, or E3
Material used for the rocket exterior, categorized as M1, M2, or M3
Type of fuel used to power the rocket, categorized as F1, F2, or F3
Your metric dependent variables are speed in kilometers per hour, and carbon
dioxide measured in parts per million. Using MANOVA, you’d test different
combinations (e.g. E1, M1, and F1 vs. E1, M2, and F1, vs. E1, M3, and F1, and
so on) to calculate the effect of all the independent variables. This should help
you to find the optimal design solution for your rocket.
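A full MANOVA computes multivariate test statistics such as Wilks' lambda (e.g. with statsmodels), which is beyond a short sketch. The snippet below only illustrates, with synthetic numbers, how the rocket data might be laid out — every combination of the three categorical factors against the two metric outcomes — and summarises both dependent variables per engine type, which is the kind of group comparison MANOVA tests formally:

```python
import itertools
import random

random.seed(0)

# Hypothetical rocket trials: 3 × 3 × 3 categorical design combinations.
engines = ["E1", "E2", "E3"]
materials = ["M1", "M2", "M3"]
fuels = ["F1", "F2", "F3"]

trials = []
for e, m, f in itertools.product(engines, materials, fuels):
    # Fake metric outcomes: speed (km/h) and CO2 (ppm), with noise.
    speed = 20000 + 500 * engines.index(e) - 200 * materials.index(m) + random.gauss(0, 50)
    co2 = 400 - 30 * fuels.index(f) + random.gauss(0, 5)
    trials.append(((e, m, f), speed, co2))

# Mean of BOTH dependent variables per engine type.
for e in engines:
    rows = [(s, c) for (combo, s, c) in trials if combo[0] == e]
    mean_speed = sum(s for s, _ in rows) / len(rows)
    mean_co2 = sum(c for _, c in rows) / len(rows)
    print(e, "mean speed:", round(mean_speed), "mean CO2:", round(mean_co2, 1))
```

The 27 rows correspond to the combinations (E1, M1, F1), (E1, M2, F1), and so on, that the slide describes testing against each other.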
Factor analysis

• Factor analysis is an interdependence technique which seeks to reduce
the number of variables in a dataset. If you have too many variables, it
can be difficult to find patterns in your data. At the same time, models
created using datasets with too many variables are susceptible to
overfitting. Overfitting is a modeling error that occurs when a model
fits too closely and specifically to a certain dataset, making it less
generalizable to future datasets, and thus potentially less accurate in
the predictions it makes.
• Factor analysis works by detecting sets of variables which correlate
highly with each other. These variables may then be condensed into a
single variable. Data analysts will often carry out factor analysis to
prepare the data for subsequent analyses.
Example:

• Let's imagine you have a dataset containing data pertaining to a
person's income, education level, and occupation. You might find a
high degree of correlation among each of these variables, and thus
reduce them to the single factor “socioeconomic status.” You might
also have data on how happy they were with customer service, how
much they like a certain product, and how likely they are to
recommend the product to a friend. Each of these variables could be
grouped into the single factor “customer satisfaction” (as long as they
are found to correlate strongly with one another). Even though you’ve
reduced several data points to just one factor, you’re not really losing
any information—these factors adequately capture and represent the
individual variables concerned. With your “streamlined” dataset,
you’re now ready to carry out further analyses.
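The socioeconomic-status example can be sketched numerically. Proper factor analysis estimates loadings (e.g. with scikit-learn's `FactorAnalysis`); as a simplified stand-in, the snippet below generates synthetic income, education, and occupation scores driven by one latent factor, shows that they correlate highly, and condenses them into a single variable using the first principal component of the standardised data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: three observed variables all driven by one latent
# "socioeconomic status" (SES) factor, plus independent noise.
n = 200
ses = rng.normal(size=n)                      # latent factor (unobserved)
income     = 1.0 * ses + 0.3 * rng.normal(size=n)
education  = 0.9 * ses + 0.3 * rng.normal(size=n)
occupation = 0.8 * ses + 0.3 * rng.normal(size=n)
X = np.column_stack([income, education, occupation])

# The three observed variables correlate highly with each other...
print(np.corrcoef(X, rowvar=False).round(2))

# ...so condense them into one factor: the first principal component
# of the standardised data (a common factor-extraction shortcut).
Z = (X - X.mean(axis=0)) / X.std(axis=0)
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
factor = Z @ Vt[0]

# The extracted factor tracks the latent SES variable closely,
# so little information is lost by the reduction.
print("corr(factor, ses):", round(abs(np.corrcoef(factor, ses)[0, 1]), 2))
```

This is the sense in which the slide says the condensed factor "adequately captures" the individual variables: one column now carries almost all the shared signal of three.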
Cluster analysis

• Another interdependence technique, cluster analysis is used to group
similar items within a dataset into clusters.
• When grouping data into clusters, the aim is for the variables in one
cluster to be more similar to each other than they are to variables in
other clusters. This is measured in terms of intracluster and intercluster
distance. Intracluster distance looks at the distance between data
points within one cluster. This should be small. Intercluster distance
looks at the distance between data points in different clusters. This
should ideally be large. Cluster analysis helps you to understand how
data in your sample is distributed, and to find patterns.
Example:
• A prime example of cluster analysis is audience segmentation. If you
were working in marketing, you might use cluster analysis to define
different customer groups which could benefit from more targeted
campaigns. As a healthcare analyst, you might use cluster analysis to
explore whether certain lifestyle factors or geographical locations are
associated with higher or lower cases of certain illnesses. Because it’s
an interdependence technique, cluster analysis is often carried out in
the early stages of data analysis.
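The audience-segmentation idea can be sketched with a plain k-means loop. The customer data below is synthetic, and real work would use a library implementation (e.g. scikit-learn's `KMeans`), but the sketch makes the intracluster/intercluster distance point concrete:

```python
import math
import random

random.seed(1)

# Two hypothetical customer groups in (age, monthly spend) space.
data = ([(random.gauss(25, 3), random.gauss(200, 20)) for _ in range(30)] +
        [(random.gauss(55, 3), random.gauss(600, 20)) for _ in range(30)])

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Plain k-means with k=2: assign each point to its nearest centroid,
# then move each centroid to the mean of its points, and repeat.
centroids = [data[0], data[-1]]
for _ in range(10):
    clusters = [[], []]
    for p in data:
        clusters[0 if dist(p, centroids[0]) <= dist(p, centroids[1]) else 1].append(p)
    centroids = [
        (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
        for c in clusters
    ]

print("cluster sizes:", [len(c) for c in clusters])
# Intracluster distance should be small, intercluster distance large.
intra = max(dist(p, centroids[i]) for i, c in enumerate(clusters) for p in c)
inter = dist(centroids[0], centroids[1])
print("max intracluster distance:", round(intra, 1))
print("intercluster distance:", round(inter, 1))
```

On well-separated groups like these, the intercluster distance dwarfs the intracluster distances, which is exactly the pattern a good segmentation aims for.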
Bayes Theorem in Machine Learning
Introduction:
Bayes' theorem is named after Thomas Bayes, an English statistician, philosopher, and
Presbyterian minister who formulated it in the 18th century.
Bayes contributed to decision theory, a field that relies extensively on
probability, one of the most important concepts in mathematics.
Bayes' theorem is also widely used in machine learning, where we need to
predict classes precisely and accurately.
The Bayesian method built on Bayes' theorem is used to
calculate conditional probability in machine learning applications,
including classification tasks.
Further, a simplified application of Bayes' theorem (naïve Bayes classification)
is also used to reduce computation time and cost.
Bayes' theorem is also extensively applied in health, medicine, and research.
Cont……….
• Bayes' theorem is one of the most popular machine learning concepts. It helps to calculate the probability of
one event occurring, under uncertain knowledge, given that another event has already occurred.
• Bayes' theorem can be derived using the product rule and the conditional probability of event X with known event Y:
• According to the product rule, we can express the probability of events X and Y occurring together as follows:
P(X ∩ Y) = P(X|Y) P(Y) {equation 1}
• Similarly, conditioning on event X:
P(X ∩ Y) = P(Y|X) P(X) {equation 2}
• Mathematically, Bayes' theorem follows by equating the right-hand sides of both equations and dividing by P(Y):

P(X|Y) = P(Y|X) P(X) / P(Y)

• Here the events X and Y need not be independent; in fact, the theorem is most useful when
they are dependent, since it lets us reverse the direction of conditioning.
• The above equation is called Bayes' Rule or Bayes' Theorem.
Cont…….
• The formula: P(X|Y) = P(Y|X) P(X) / P(Y)

• P(X|Y) is called the posterior, which we need to calculate. It is defined as
the updated probability after considering the evidence.
• P(Y|X) is called the likelihood. It is the probability of the evidence when the
hypothesis is true.
• P(X) is called the prior probability: the probability of the hypothesis before
considering the evidence.
• P(Y) is called the marginal probability. It is defined as the probability of the
evidence under any consideration.
Hence, Bayes' theorem can be written as:
posterior = likelihood × prior / evidence
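The four quantities can be checked with a small numeric example. The numbers below (a 1% prior, a 95% likelihood, a 5% false-positive rate) are invented for illustration of a disease-testing scenario:

```python
# X = "patient has the disease", Y = "test is positive".
p_x = 0.01              # prior P(X): 1% of people have the disease
p_y_given_x = 0.95      # likelihood P(Y|X): test sensitivity
p_y_given_not_x = 0.05  # false-positive rate P(Y|~X)

# Marginal probability of the evidence:
# P(Y) = P(Y|X) P(X) + P(Y|~X) P(~X)
p_y = p_y_given_x * p_x + p_y_given_not_x * (1 - p_x)

# Posterior: P(X|Y) = P(Y|X) P(X) / P(Y)
p_x_given_y = p_y_given_x * p_x / p_y
print(round(p_x_given_y, 3))  # → 0.161
```

Note how a small prior keeps the posterior low (about 16%) even with a fairly accurate test; this is the effect of weighting the likelihood by the prior.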
Prerequisites for Bayes Theorem

While studying Bayes' theorem, we need to understand a few important concepts. These are
as follows:
1. Experiment
• An experiment is defined as a planned operation carried out under controlled conditions, such
as tossing a coin, drawing a card, or rolling a die.
2. Sample Space
• Each result of an experiment is called an outcome, and the set of all
possible outcomes of an experiment is known as the sample space. For example, if we are rolling a die, the
sample space will be:
• S1 = {1, 2, 3, 4, 5, 6}
• Similarly, if our experiment consists of tossing a coin and recording its outcome, then the sample
space will be:
• S2 = {Head, Tail}
Cont……….
3. Event
• An event is defined as a subset of the sample space of an experiment; in other words, it is a set of
outcomes.
Assume in our experiment of rolling a die, there are two events A and B such that:
• A = Event that an even number is obtained = {2, 4, 6}
• B = Event that a number greater than 4 is obtained = {5, 6}
• Probability of event A: P(A) = Number of favourable outcomes / Total number of possible
outcomes
P(A) = 3/6 = 1/2 = 0.5
• Similarly, probability of event B: P(B) = Number of favourable outcomes / Total number of
possible outcomes
= 2/6
= 1/3
= 0.333
• Union of events A and B:
A∪B = {2, 4, 5, 6}
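Because events are just sets of outcomes, Python's set operations mirror these definitions directly. A small sketch of the die-rolling example:

```python
from fractions import Fraction

# The die-rolling experiment above, as Python sets.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event: even number
B = {5, 6}      # event: greater than 4

def p(event):
    # Favourable outcomes over total possible outcomes.
    return Fraction(len(event), len(sample_space))

print(p(A))    # → 1/2
print(p(B))    # → 1/3
print(A | B)   # union of A and B
print(A & B)   # intersection of A and B
```

Using `Fraction` keeps the probabilities exact (1/2, 1/3) instead of rounded decimals like 0.333.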
Cont……
Intersection of events A and B:
A∩B = {6}
4. Disjoint Event: If the intersection of events A and B is an empty set, then the events are known
as disjoint events, or mutually exclusive events.
5. Exhaustive Event: As the name suggests, a set of events of which at least one must occur each time the
experiment is run is called exhaustive for that experiment. Thus, two events A and B are exhaustive if either A or B
must occur; for example, when tossing a coin, the outcome must be either a
Head or a Tail.
6. Independent Event:
• Two events are said to be independent when the occurrence of one event does not affect the occurrence of
the other. In simple words, the probability of the outcome of one event does not depend on
the other.
Mathematically, two events A and B are said to be independent if:
P(A ∩ B) = P(AB) = P(A) * P(B)
7. Conditional Probability: Conditional probability is defined as the probability of an event A, given that
another event B has already occurred (i.e. A conditional on B). This is represented by P(A|B), and we can define
it as:
P(A|B) = P(A ∩ B) / P(B), provided P(B) > 0
Cont……
8. Marginal Probability:
• Marginal probability is defined as the probability of an event A occurring,
irrespective of the outcome of another event B. It is considered the probability of the
evidence under any consideration. Here ~B represents the event that B does not
occur:
• P(A) = P(A|B)*P(B) + P(A|~B)*P(~B)
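The marginal formula (the law of total probability) is a one-line calculation. The numbers below are made up for a simple weather scenario:

```python
# A = "person carries an umbrella", B = "it rains".
p_b = 0.3               # P(B)
p_a_given_b = 0.9       # P(A|B)
p_a_given_not_b = 0.2   # P(A|~B)

# P(A) = P(A|B)*P(B) + P(A|~B)*P(~B)
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)
print(round(p_a, 2))  # → 0.41
```

This is the same computation that produces the denominator P(Y) in Bayes' theorem: the evidence averaged over both ways it can arise.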
How to apply Bayes Theorem or Bayes rule in Machine Learning?

• Bayes' theorem helps us to calculate the single term P(B|A) in terms of P(A|B),
P(B), and P(A). This rule is very helpful in scenarios where we have
good estimates of P(A|B), P(B), and P(A) and need to determine the fourth
term.
• The naïve Bayes classifier is one of the simplest applications of Bayes' theorem. It
is used in classification algorithms that must separate data into classes
accurately and quickly.
• Let's understand the use of Bayes' theorem in machine learning with the
example below.
• Suppose we have a feature vector A with i attributes. That is,
• A = A1, A2, A3, A4, ..., Ai
• Further, we have n classes represented as C1, C2, C3, C4…………Cn.
Cont….
• Given these two conditions, our machine learning classifier has to predict the class of A by
choosing the best possible class. With the help of Bayes' theorem, we can write this as:
P(Ci|A) = [ P(A|Ci) * P(Ci) ] / P(A)
Here:
• P(A) is a class-independent quantity.
• P(A) remains constant across the classes; it does not change its value when the
class changes. To maximize P(Ci|A), we therefore only have to maximize the term P(A|Ci) * P(Ci).
• With n classes on the probability list, let's assume that each class is equally likely to be the right answer.
Considering this factor, we can say that:
P(C1) = P(C2) = P(C3) = P(C4) = ... = P(Cn)
This assumption reduces the computation cost as well as time. This is how Bayes' theorem plays a significant
role in machine learning, and the naïve Bayes classifier further simplifies the conditional probability calculation,
without greatly affecting precision, by assuming the attributes are conditionally independent given the class:
P(A|Ci) = P(A1|Ci) * P(A2|Ci) * P(A3|Ci) * ... * P(Ai|Ci)
Hence, by using Bayes' theorem in machine learning, we can describe the probability of a complex event in terms of the probabilities of smaller, simpler events.
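The naïve factorisation above can be sketched as a tiny spam classifier. The toy training emails and word lists below are invented; real classifiers (e.g. scikit-learn's `MultinomialNB`) work the same way at scale, multiplying per-attribute probabilities P(word|class) in log space with add-one smoothing:

```python
import math
from collections import Counter, defaultdict

# Toy training data: word lists with class labels.
train = [
    (["win", "money", "now"], "spam"),
    (["free", "money", "offer"], "spam"),
    (["meeting", "tomorrow", "agenda"], "ham"),
    (["project", "meeting", "notes"], "ham"),
]

class_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)
vocab = set()
for words, label in train:
    word_counts[label].update(words)
    vocab.update(words)

def predict(words):
    best, best_score = None, float("-inf")
    for c in class_counts:
        # log P(C) + sum of log P(word|C), with add-one smoothing:
        # the naive factorisation P(A|C) ≈ P(A1|C) * P(A2|C) * ...
        total = sum(word_counts[c].values())
        score = math.log(class_counts[c] / len(train))
        for w in words:
            score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = c, score
    return best

print(predict(["free", "money"]))     # → spam
print(predict(["meeting", "notes"]))  # → ham
```

Working in log space avoids multiplying many tiny probabilities into floating-point underflow, and the smoothing keeps unseen words from zeroing out a class.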
