
2 - Univariate and Multivariate

Copyright
© © All Rights Reserved
Univariate vs. Multivariate Analysis

Univariate analysis:
• Refers to the analysis of one variable; "uni" means "one."
• It is the simplest form of analysis, since the information deals with only one quantity that changes.
• It does not deal with causes or relationships; the main purpose of the analysis is to describe the data and find patterns that exist within it.
• An example of univariate data is height.

Multivariate analysis:
• Refers to the analysis of more than one variable; "multi" means "more than one."
• The two-variable case is usually called bivariate analysis, so multivariate data typically involves three or more variables.


Objectives of multivariate data analysis (MVDA)

Identify Patterns:
Discover hidden patterns or relationships within complex datasets to see how variables are
connected.

Reduce Complexity:
Simplify high-dimensional data to focus on the most important information while minimizing noise.

Visualize Relationships:
Create visual representations to make sense of complex interactions among multiple variables.

Predict Outcomes:
Build models that use several variables to make predictions or classify data into categories.

Segment Data:
Group similar observations or entities to gain insights and make tailored decisions or
recommendations.

Advantages and disadvantages of multivariate data analysis (MVDA)

Advantages:
1. Comprehensive Insights: Provides a holistic view of data, uncovering complex relationships.
2. Reduced Dimensionality: Techniques like PCA simplify high-dimensional data while preserving information.
3. Improved Decision-Making: MVDA aids informed decision-making across various fields.

Disadvantages:
1. Complexity: Can be challenging to understand and implement for non-experts.
2. Data Requirements: Large datasets are often needed for meaningful MVDA, making it data-intensive.
3. Interpretation Challenges: Interpreting multivariate results can be difficult due to intricate relationships.

There are three common ways to perform univariate analysis:

1. Summary Statistics:
• We can calculate measures of central tendency, like the mean or median, for one variable.
• We can also calculate measures of dispersion, such as the standard deviation, for one variable.

2. Frequency Distributions: Create a frequency distribution, which describes how often each value occurs for one variable.

3. Charts: Create charts like boxplots, histograms, density curves, etc. to visualize the distribution of values for one variable.
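
The three approaches above can be sketched with Python's standard library. The heights below are hypothetical values invented for illustration (they are not from this document), and the text bars are only a rough stand-in for a real histogram.

```python
from collections import Counter
import statistics

# Hypothetical sample: heights in cm for eight people (illustrative values only).
heights_cm = [160, 165, 165, 170, 172, 165, 180, 170]

# 1. Summary statistics: central tendency and dispersion.
mean_height = statistics.mean(heights_cm)
median_height = statistics.median(heights_cm)
stdev_height = statistics.stdev(heights_cm)  # sample standard deviation (n - 1)

# 2. Frequency distribution: how often each value occurs.
frequency = Counter(heights_cm)

# 3. Text-only stand-in for a histogram: one '#' per observation.
for value in sorted(frequency):
    print(f"{value} cm | {'#' * frequency[value]}")

print(f"mean={mean_height}, median={median_height}, stdev={stdev_height:.2f}")
```

For real charts, a plotting library such as matplotlib would replace the text bars in step 3.
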

There are two common ways to perform multivariate analysis:

1. Scatterplot Matrix: Visualize the relationship between each pairwise combination of variables in a dataset.

2. Machine Learning Algorithms:
• Supervised learning: fit a model, such as multiple linear regression, that quantifies the relationship between multiple predictor variables and a response variable.
• Unsupervised learning: use a method such as principal components analysis (PCA) to find structure and relationships among multiple variables in a dataset at once.

Multivariate analysis extends bivariate analysis to more variables; in the stricter sense, it involves more than one dependent variable.
Some of the techniques are regression analysis, path analysis, factor analysis and multivariate
analysis of variance (MANOVA).
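
As an illustration of the supervised case, here is a minimal sketch of multiple linear regression fitted by ordinary least squares in plain Python. It uses the student data that appears later in this document, with age and study hours as predictors and exam score as the response. The helper name `solve_linear_system` is hypothetical, and solving the normal equations directly is an assumption made to keep the example dependency-free; a statistics library would normally be used instead.

```python
# Student data from this document: two predictors and one response.
age = [18, 20, 19, 22, 21]
study_hours = [2.5, 3.0, 2.8, 3.5, 3.2]
exam_score = [85, 88, 90, 92, 91]

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations update both sides together.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

# Design matrix with a leading 1.0 for the intercept.
X = [[1.0, a, h] for a, h in zip(age, study_hours)]

# Normal equations: (X^T X) beta = X^T y.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
xty = [sum(row[i] * y for row, y in zip(X, exam_score)) for i in range(3)]
intercept, b_age, b_hours = solve_linear_system(xtx, xty)

fitted = [intercept + b_age * a + b_hours * h for a, h in zip(age, study_hours)]
residuals = [y - f for y, f in zip(exam_score, fitted)]
print(f"score = {intercept:.2f} + {b_age:.2f}*age + {b_hours:.2f}*hours")
```

With only five rows and two closely related predictors, the fitted coefficients are unstable and should be read as a demonstration of the mechanics, not as a finding about students.
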

Performing both univariate and multivariate analysis on a dataset

This involves exploring the characteristics of individual variables (univariate) and examining relationships between
variables (multivariate).
Take a dataset about students, including their age, study hours, and exam scores.
Let's calculate the univariate statistics for the "Age" variable and perform a simple multivariate analysis by calculating the
correlation between "Age" and "Exam Score" using the dataset:

SID   Age   Study Hours   Exam Score
1     18    2.5           85
2     20    3.0           88
3     19    2.8           90
4     22    3.5           92
5     21    3.2           91

Univariate Analysis for "Age":

Mean Age:
Sum of ages: 18 + 20 + 19 + 22 + 21 = 100
Mean Age: 100 / 5 = 20 years

Median Age:
Sorted ages: 18, 19, 20, 21, 22. Since there is an odd number of observations (5), the median
is the middle value, which is 20 years.

Standard Deviation of Age (Sample):
Calculate the squared differences between each age and the mean age:
(18 - 20)² = 4, (20 - 20)² = 0, (19 - 20)² = 1, (22 - 20)² = 4, (21 - 20)² = 1
Sum of squared differences: 4 + 0 + 1 + 4 + 1 = 10
Sample variance: 10 / (5 - 1) = 2.5
Sample standard deviation: √2.5 ≈ 1.58 years
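
The univariate statistics for "Age" can be checked with a short script. This is a minimal sketch using only the standard library; the variable names are chosen here for illustration.

```python
import math
import statistics

age = [18, 20, 19, 22, 21]

mean_age = sum(age) / len(age)                          # 100 / 5 = 20.0
median_age = statistics.median(age)                     # middle of the sorted values
squared_diffs = [(a - mean_age) ** 2 for a in age]      # [4.0, 0.0, 1.0, 4.0, 1.0]
sample_variance = sum(squared_diffs) / (len(age) - 1)   # 10 / 4 = 2.5
sample_std = math.sqrt(sample_variance)                 # about 1.58

# Cross-check against the standard library's sample standard deviation.
assert math.isclose(sample_std, statistics.stdev(age))

print(mean_age, median_age, sample_variance, round(sample_std, 3))
# prints: 20.0 20 2.5 1.581
```

Dividing by n - 1 rather than n gives the sample standard deviation, which is what `statistics.stdev` computes; `statistics.pstdev` would give the population version.
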

Multivariate Analysis

Correlation between "Age" and "Exam Score":

To do this, we can use Pearson's correlation coefficient formula:

$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}} $$

Where:

x and y are the individual data points.

x̄ and ȳ are the means of the two variables.

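Calculations for the correlation coefficient look like this, with x = "Age" and y = "Exam Score":

Means: x̄ = 100 / 5 = 20 and ȳ = 446 / 5 = 89.2
Deviations in age (x - x̄): -2, 0, -1, 2, 1
Deviations in score (y - ȳ): -4.2, -1.2, 0.8, 2.8, 1.8
Sum of products: 8.4 + 0 - 0.8 + 5.6 + 1.8 = 15.0
Sum of squared age deviations: 4 + 0 + 1 + 4 + 1 = 10
Sum of squared score deviations: 17.64 + 1.44 + 0.64 + 7.84 + 3.24 = 30.8
r = 15.0 / √(10 × 30.8) = 15.0 / 17.55 ≈ 0.855

The correlation coefficient between "Age" and "Exam Score" is
approx. 0.855, which is close to +1 and indicates a strong positive
linear relationship between a student's age and their exam score in
this dataset.

A "strong positive linear relationship" between a student's age and
their exam score means that, in the given dataset, exam scores tend
to increase as a student's age increases. With only five observations
the estimate is very imprecise, and a correlation on its own does not
show that age causes higher scores.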
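
As a check, the coefficient can be recomputed directly from the five rows of the dataset. The sketch below implements the Pearson formula in plain Python; `pearson_r` is a hypothetical helper name introduced for this example.

```python
import math

age = [18, 20, 19, 22, 21]
exam_score = [85, 88, 90, 92, 91]

def pearson_r(x, y):
    """Pearson correlation: sum of co-deviations divided by the
    square root of the product of the summed squared deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

r = pearson_r(age, exam_score)
print(f"r = {r:.3f}")  # prints: r = 0.855
```

On Python 3.10 or later, `statistics.correlation(age, exam_score)` should give the same value without a hand-written helper.
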

To visualize the relationship in a scatterplot:
import matplotlib.pyplot as plt
# Data
age = [18, 20, 19, 22, 21]
exam_score = [85, 88, 90, 92, 91]
# Create a scatterplot
plt.figure(figsize=(8, 6))
plt.scatter(age, exam_score, color='blue', marker='o')
plt.title('Scatterplot of Age vs. Exam Score')
plt.xlabel('Age')
plt.ylabel('Exam Score')
plt.grid(True)
# Show the plot
plt.show()
