HMX7001 Analysis of Data Using SPSS - Advanced Level

This document provides an outline for exercises in analyzing data using SPSS. It includes 8 exercises that demonstrate various SPSS functions including plotting data, descriptive statistics, correlations analysis, t-tests, ANOVA, cluster analysis, regression models, and principal component analysis. The exercises will use example databases and dummy data to demonstrate preprocessing data, exploring data, and performing both initial and multivariate analyses in SPSS. The document also discusses how statistical analysis fits into typical research frameworks and modeling air pollution and health impacts.

Uploaded by

Lim Kok Ping

Advanced Research Methodology (HVX8001)

Analysis of Data Using SPSS

– Advanced Level

Dr. Md Firoz Khan

Department of Chemistry, Faculty of
Science, University of Malaya
HP: 0162645381
Outlines: The List of Exercises
Exercise: I
SPSS: Demonstration with an example Database for plotting
Exercise: II
Demonstration: Summary Descriptive Statistics
Exercise: III
Demonstration: Correlations analysis, paired t-test, ANOVA
Exercise: IV
Demonstration: Cluster Analysis
Exercise: V
Demonstration: Multiple regression model
Exercise: VI
Demonstration: PCA procedure
Exercise: VII
Demonstration (PCR): Dummy Data
Exercise: VIII
Demonstration: PCA-APCS
Flow of the data analysis using SPSS
◼ Input data
◼ Preprocessing
  - Removal of outliers
  - Correction of the missing data
  - Replacing the data below detection limit with appropriate values
  - Conversion of data dimension or normalization, if appropriate
◼ Data analysis procedures (initial and multivariate)
◼ Output
Data analysis by SPSS
Exploratory Data Analysis
◼ Initial analysis
  - Basic summary statistics (mean, median, standard deviation, etc.)
  - Correlation analysis, paired t-test, ANOVA, etc.
  - Time-series
◼ Multivariate analysis
  - CA: cluster analysis
  - PCA/APCS: principal component analysis/absolute principal component scores
  - MLR: multiple linear regression
  - PCR: principal component regression
  - PLS: partial least squares
A typical research framework and statistical input!!
◼ Air pollution monitoring (PM2.5); assessment of MM power plant; lung function performances
◼ Exp. set up: chemical analysis (trace metals, ionic and carbon compositions); biological monitoring
◼ Database
◼ Statistical analysis & air pollution modeling
  - Descriptive statistics, correlation, t-test, ANOVA, p value, cluster analysis, regression
  - PCA-APCS, PMF, CMB
◼ Health risk assessment (HRA); toxicity test (cytotoxicity and DNA damage)
◼ Validation of the emission sources by bivariate rose plot/Potential Source Contribution Function (PSCF)/Concentration Weighted Trajectory (CWT)/HYSPLIT density model/wind vector by GrADS
◼ Establishment of appropriate emission sources (hotspots)
◼ Strategic mitigation plan for stakeholder (TNBR)
◼ Research output; impact
Exercise: I

SPSS: Demonstration with an

example Database for plotting

Basic of the statistics: Practice

from the previous Lecture
Practice with dummy data

Prepare plotting
in SPSS
95% of values lie within 1.96 × SD of the mean
mean − (1.96 × SD) = 100 − (1.96 × 15.3) = 70
mean + (1.96 × SD) = 100 + (1.96 × 15.3) = 130
P(score > 130) = 0.025
95% of people have an IQ between 70 and 130
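
The 95% interval for the IQ example can be checked numerically. The following is a minimal sketch in Python (used here only for illustration; the slides themselves work in SPSS and Excel). The function names are hypothetical, and the scores are assumed to be normally distributed with mean 100 and SD 15.3.

```python
import math

def interval_95(mean, sd, z=1.96):
    """Return the (lower, upper) bounds holding about 95% of a normal distribution."""
    return mean - z * sd, mean + z * sd

def upper_tail_probability(x, mean, sd):
    """P(score > x) for a normal distribution, using the complementary error function."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

lower, upper = interval_95(100, 15.3)
tail = upper_tail_probability(upper, 100, 15.3)
print(round(lower), round(upper), round(tail, 3))
```

The remaining 5% of values is split evenly between the two tails, which is why the upper tail alone holds about 0.025.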
Example use of lognormal distribution in our published work
Shape of Data

◼ Shape of data is measured by

◼ Skewness
◼ Kurtosis
Skewness
◼ Measures asymmetry of data
◼ Positive or right skewed: Longer right tail
◼ Negative or left skewed: Longer left tail
Let x1, x2, ..., xn be n observations with mean x̄. Then,

$$\text{Skewness} = \frac{\sqrt{n}\,\sum_{i=1}^{n} (x_i - \bar{x})^{3}}{\left(\sum_{i=1}^{n} (x_i - \bar{x})^{2}\right)^{3/2}}$$
Kurtosis
◼ Measures peakedness of the distribution of data. The kurtosis of a normal distribution is 0.

Let x1, x2, ..., xn be n observations with mean x̄. Then,

$$\text{Kurtosis} = \frac{n\,\sum_{i=1}^{n} (x_i - \bar{x})^{4}}{\left(\sum_{i=1}^{n} (x_i - \bar{x})^{2}\right)^{2}} - 3$$
• Positive or right skewed: Longer right tail
• Negative or left skewed: Longer left tail
• Large kurtosis: peaky distribution.
• Low kurtosis: ‘flatter’ distribution.
• Rule of thumb for acceptable data: skewness between −1 and +1 and kurtosis between −3 and +3
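
The two shape measures can be computed directly from their moment-based definitions. This is a minimal Python sketch under the assumption that skewness is the third central moment divided by the 3/2 power of the second, and kurtosis is the fourth central moment divided by the square of the second, minus 3. The function names are hypothetical, and statistical packages may apply small-sample corrections, so their reported values can differ slightly.

```python
import math

def skewness(values):
    """Moment-based skewness: sqrt(n) * sum(d^3) / (sum(d^2))^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((x - mean) ** 2 for x in values)
    s3 = sum((x - mean) ** 3 for x in values)
    return math.sqrt(n) * s3 / s2 ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution gives about 0."""
    n = len(values)
    mean = sum(values) / n
    s2 = sum((x - mean) ** 2 for x in values)
    s4 = sum((x - mean) ** 4 for x in values)
    return n * s4 / s2 ** 2 - 3

symmetric = [1, 2, 3, 4, 5]          # dummy data, symmetric about 3
right_skewed = [1, 1, 1, 2, 10]      # dummy data with a long right tail
print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed))
```

A symmetric sample gives a skewness of zero, while the sample with one large value gives a positive skewness, matching the "longer right tail" description.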
Exercise: II

Demonstration: Summary
Descriptive Statistics
Practice the basic statistics using
dummy data
Correlation
◼ Strength and direction of the relationship
between variables
◼ Scattergrams (Y plotted against X): positive correlation, negative correlation, no correlation

Example use of correlation plots-Khan et al 2017. JGR

Linearity of r value

r > 0: linear, positive relationship
r < 0: linear, negative relationship
r = 0: no linear relationship
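
The sign and size of r can be illustrated with a short calculation. The sketch below computes the Pearson correlation coefficient in plain Python; the function name and the dummy data are assumptions made for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance of x and y scaled by both spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]
print(pearson_r(x, [2 * v + 1 for v in x]))   # perfectly linear, rising
print(pearson_r(x, [-3 * v for v in x]))      # perfectly linear, falling
print(pearson_r(x, [v ** 2 for v in x]))      # curved relationship
```

The last case is worth noting: y depends entirely on x, yet r is zero, because r only measures the linear part of a relationship.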
Exercise: III

Demonstration: Correlations
analysis, paired t-test, ANOVA
Practice correlation analysis with dummy data
Paired t test
ANOVA test
Cluster Analysis (CA)
◼ Unsupervised pattern recognition
◼ Could involve hierarchical and non-hierarchical clustering
◼ Unlike PCA, dimensionality is not reduced
◼ Generally views objects as points in n-dimensional measurement space
◼ Objects aggregated step-wise according to the
similarity of their features
◼ Searches for the distance between objects in the
measurement space
◼ Developed primarily by biologists to determine
similarities between organisms
CA
Hierarchical cluster analysis (HCA), whose primary purpose is to group objects based on the characteristics they possess, was performed in this study using Ward’s method with Euclidean distance as the measure of similarity. This common technique produces a number of clusters that can be presented in a chart called a ‘dendrogram’, also known as a hierarchical tree.

A number of common numerical measures of similarity are available:

◼ Correlation
◼ Mahalanobis distance
◼ Manhattan distance
◼ Euclidean distance (most common)
◼ Chebyshev distance
◼ Minkowski distance (unifies Euclidean, Manhattan and Chebyshev distances)
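
How the Minkowski distance unifies the other three can be shown with one function. This is a minimal Python sketch; the function name is hypothetical. The order p = 1 gives the Manhattan distance, p = 2 the Euclidean distance, and p tending to infinity the Chebyshev distance.

```python
import math

def minkowski(a, b, p):
    """Minkowski distance of order p between two points of equal dimension."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        # Limit case: only the largest coordinate difference matters (Chebyshev).
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

a, b = (0, 0), (3, 4)
print(minkowski(a, b, 1))          # Manhattan
print(minkowski(a, b, 2))          # Euclidean
print(minkowski(a, b, math.inf))   # Chebyshev
```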
Exercise: IV

Demonstration:
Cluster Analysis
Cluster analysis
General Linear Model
◼ Linear regression is actually a form of the
General Linear Model where the parameters
are b, the slope of the line, and a, the
intercept.
y = bx + a +ε
◼ A General Linear Model is just any model that
describes the data in terms of a straight line
An example use of the Linear Model
[Khan et al. 2015]
Multiple regression
◼ Multiple regression is used to determine the effect of a
number of independent variables, x1, x2, x3 etc., on a
single dependent variable, y
◼ The different x variables are combined in a linear way
and each has its own regression coefficient:

y = b0 + b1x1 + b2x2 + … + bnxn + ε

◼ The b parameters reflect the independent contribution of each independent variable, x, to the value of the dependent variable, y,
◼ i.e. the amount of variance in y that is accounted for by each x variable after all the other x variables have been accounted for
Multiple Linear Regression
• Regression refers to the value of a response variable as a
function of the value of an explanatory variable.
• A regression model is a function that describes the
relationship between response and explanatory variables.
• Commonly referred to as the predictor-predictand method in earth/environmental sciences.
• A simple linear regression has one explanatory variable and
the regression line is straight.
• The linear relationship of variable Y and X can be written as
in the following regression model form
Y = b0 + b1X + e
where ‘Y’ is the response variable, ‘X’ is the explanatory variable, ‘e’ is the residual (error), and b0 and b1 are two parameters. Basically, b0 is the intercept and b1 is the slope of the straight line Y = b0 + b1X.
• By linear, we are referring to the parameters, not the
variables.
Multiple Linear Regression
❖ Response variable is normally distributed.
❖ Relationship between the two variables is
linear.
❖ Observations of response variable are
independent.
❖ Residual error is normally distributed with
mean 0 and constant standard deviation.
◼ Y is expressed as a function of X (deterministic portion, Ŷ) plus the random errors εi, which should sum to 0.
◼ There are two parameters that need to be estimated.
◼ α: slope; β: intercept.
◼ Method: least squares method (LSM), which minimizes the sum of squared errors.

$$\mathrm{SSE} = \sum_i (Y_i - \hat{Y}_i)^2 = \sum_i \left(Y_i - (\alpha X_i + \beta)\right)^2$$

• Involves solving sets of simultaneous equations (linear algebra)
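
For a single explanatory variable, the least squares solution has a closed form, so the simultaneous equations can be solved by hand. The sketch below is a minimal Python illustration with hypothetical function names and dummy data. Here `slope` plays the role of α and `intercept` the role of β.

```python
def fit_line(x, y):
    """Least squares estimates of slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(x, y, slope, intercept):
    """Observed minus fitted values for each data point."""
    return [b - (slope * a + intercept) for a, b in zip(x, y)]

def sse(x, y, slope, intercept):
    """Sum of squared errors, the quantity that least squares minimizes."""
    return sum(r ** 2 for r in residuals(x, y, slope, intercept))

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]   # dummy data, roughly y = 2x
slope, intercept = fit_line(x, y)
print(slope, intercept, sse(x, y, slope, intercept))
print(sum(residuals(x, y, slope, intercept)))   # close to zero, up to rounding
```

Moving the slope or intercept away from the fitted values can only increase the SSE, which is what "minimizes" means in practice.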
Exercise: V

Simple example use of MLR model

Y = A1*X1 + A2*X2 + A3*X3 + … + An*Xn + C

[measured PM10 (μg m-3)] = A1 [measured NOx (μg m-3)] + A2 [measured sulphate (μg m-3)] + C (μg m-3). [Stedman et al. 2001]

Demonstration: Multiple regression model

A simple Linear Regression
model
Output of MLR model

Coefficients (a)
Model 1       Unstandardized B   Std. Error   Standardized Beta   t        Sig.
(Constant)    14.427             1.124                            12.839   .000
SO4           1.313              .174         .341                7.549    .000
NO3           1.908              .359         .240                5.311    .000
a. Dependent Variable: Mass

Thus, the reconstructed MLR model:

[measured PM10 (μg m-3)] = 1.908 × [measured NOx (μg m-3)] + 1.313 × [measured sulphate (μg m-3)] + 14.427 (μg m-3). [Stedman et al. 2001]
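
Once the coefficients are known, the reconstructed model is a one-line calculation. This minimal Python sketch uses the three coefficients from the output table; the function name and the input concentrations are hypothetical and chosen only to show the arithmetic.

```python
def predict_pm10(nox, sulphate, a_nox=1.908, a_sulphate=1.313, constant=14.427):
    """Predicted PM10 (ug m-3) from the reconstructed MLR model.

    The default coefficients are the B values from the output table;
    the input concentrations are expected in ug m-3.
    """
    return a_nox * nox + a_sulphate * sulphate + constant

# Hypothetical concentrations, for illustration only.
print(predict_pm10(nox=10.0, sulphate=5.0))
print(predict_pm10(nox=0.0, sulphate=0.0))   # the constant term alone
```

With both predictors at zero the prediction equals the constant, which is the part of the mass not explained by either predictor.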
An example multiple linear regression model

Practice with Dummy mass closure data

P values
◼ P value = the probability of obtaining a result at least as extreme as the observed one by chance
◼ i.e. when the null hypothesis is true

◼ α level is set a priori (usually 0.05)

◼ If p < α level then we reject the null hypothesis and accept the experimental hypothesis
◼ A result this extreme would occur less than 5% of the time if the null hypothesis were true
◼ If, however, p > α level then we fail to reject the null hypothesis, and the experimental hypothesis is not supported
When to use a non-parametric method and when not to?

◼ Visually normal: use parametric
◼ Moderately skewed: use parametric
◼ Severely skewed: use non-parametric
◼ Outliers: use non-parametric
◼ Uniformly distributed: use non-parametric
Data Reduction using SPSS
[To be demonstrated in the Advanced Lecture for MSc and PhD students]
Basics of multivariate modeling

Receptor modeling in environmental forensics involves the inference of sources and their contributions through analysis of chemical data from the ambient environment.

The objectives are to determine:

➢ the number of chemical fingerprints in the system;
➢ the chemical composition of each fingerprint;
➢ the contribution of each fingerprint in each sample
Multivariate Receptor Modeling
1. Positive Matrix Factorization Model for environmental data analyses
https://www.epa.gov/air-research/positive-matrix-factorization-model-environmental-data-analyses

2. Chemical Mass Balance (CMB) Model
https://www3.epa.gov/scram001/receptor_cmb.htm

3. Unmix 6.0 Model for environmental data analyses
https://www.epa.gov/air-research/unmix-60-model-environmental-data-analyses

4. Principal Component Analysis/Absolute Principal Component Scores (PCA/APCS)
http://www.sciencedirect.com/science/article/pii/0004698185901325
Data Mining/Conversion of large data to smaller group

Widely Used Models
◼ PCA/Absolute principal component score (APCS)
  - PCA/APCS: simplified model
  - Weighted APCS: deals with the “zero score” but lacks a non-negativity requirement
◼ Positive Matrix Factorization (PMF)
  - PMF is a complicated and robust model
  - PMF: lower uncertainty and stops producing zero factor scores; requires component loadings and scores to be non-negative
  - Capable of identifying sources without any prior knowledge of sources

Other Available Models
◼ EPA’s Chemical Mass Balance (CMB)
◼ Unmix
◼ Artificial Neural Networks: source receptor modelling
Principal Component Analysis (PCA)

❑ It is a way of identifying patterns in data, and expressing the data in such a way as to highlight their similarities and differences.

❑ Principal component analysis (PCA) is also a technique used to emphasize variation and bring out strong patterns in a dataset. It's often used to make data easy to explore and visualize.
Objectives of PCA
a) To transform an original set of variables into a new
set of uncorrelated variables called principal
components
b) To rank components in order of the amount of
variance that they account for
c) To see if the first few components account for most
of the variation in the original data
d) If (c) is true, then to make use of a smaller number
of transformed variables
e) If (c) is true, subsequent data analysis can be
simplified because the data set is smaller
f) To seek an underlying meaning of the first few
components (must be approached with care)
PCA/MLRA

Addressed with the following formula
[Formula image; labelled terms: normalized data, source contribution, source profile, measurement error]

Data matrix
[Figure panels: Data matrix, Source contribution, Profiles]
Factor loading using PCA procedure

❑ A large set of data was used
❑ Four small groups were obtained
❑ Variables are highly correlated within their respective group
❑ The least correlation is observed among the groups
❑ Each group indicates similar properties, nature, sources, etc.
PCA

◼ The first PC (PC1) is the best fit straight line in the multi-
dimensional space, the scores represent the distance along the
line and the loadings the angle (direction) of the straight line
◼ PC1 explains the largest amount of data variance & subsequent
PCs explain decreasing amounts of data variance
◼ The lower the PC number, the greater the signal and the lower the noise.
◼ Each PC describes a portion of the data so that all PCs add up to 100%
◼ If data reduction is good, you need fewer PCs to explain all the relevant data
◼ PC plots can simplify large or difficult datasets & show the main
trends and are easier to visualize than tables of numbers
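
These properties can be seen in the smallest possible case, two correlated variables. The sketch below is a minimal Python illustration, not the SPSS procedure. For two standardized variables with correlation r, the correlation matrix has eigenvalues 1 + |r| and 1 − |r|, with directions along the two diagonals. The function names and dummy data are hypothetical.

```python
import math

def standardize(values):
    """Convert values to z-scores using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pca_two_variables(x, y):
    """PCA on the correlation matrix of two variables (closed-form solution)."""
    zx, zy = standardize(x), standardize(y)
    n = len(zx)
    r = sum(a * b for a, b in zip(zx, zy)) / (n - 1)
    sign = 1.0 if r >= 0 else -1.0
    c = 1.0 / math.sqrt(2)
    eigenvalues = [1 + abs(r), 1 - abs(r)]             # PC1 first, then PC2
    directions = [(c, sign * c), (c, -sign * c)]       # unit-length eigenvectors
    scores = [[a * u + b * v for (u, v) in directions] for a, b in zip(zx, zy)]
    explained = [100 * ev / 2 for ev in eigenvalues]   # percent of total variance
    return eigenvalues, directions, scores, explained

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
eigenvalues, directions, scores, explained = pca_two_variables(x, y)
print(eigenvalues)
print(explained, sum(explained))
```

PC1 carries more than half of the variance and PC2 the remainder, so the two percentages add up to 100%. The scores are the positions of each sample along the two directions.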
Preparation of database
Common problems:
◼ Systematic bias: analysis by different labs or different methods
◼ Presence of data below detection limit (DL)
◼ Presence of coelution (non-target analytes that elute at the same time as a target analyte)
◼ Data entry errors; identify outliers
◼ Noisy data
◼ Missing data
◼ Exclude variables if missing >50%
Preparation of database (continued)

- Replace data below DL with DL/2
- Replace missing data with the average value of nearby data, or simply the average of the variable concentration
- Data normalization, or conversion of the data into unitless values with a zero-centred mean
- Adequate number of data points and variables
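
The preparation steps for a single variable can be sketched as follows. This is a minimal Python illustration with hypothetical function names and dummy data. It assumes missing values are marked as `None`, that a missing value is filled with the average of the variable (the simpler of the two options), and that normalization means z-scores based on the sample standard deviation.

```python
import math

def missing_fraction(values):
    """Share of missing entries; variables above 0.5 are candidates for exclusion."""
    return sum(v is None for v in values) / len(values)

def prepare_variable(values, detection_limit):
    """Return (filled, z_scores) for one variable after the three clean-up steps."""
    # Step 1: replace values below the detection limit with DL/2.
    replaced = [detection_limit / 2 if (v is not None and v < detection_limit) else v
                for v in values]
    # Step 2: fill missing values with the average of the available data.
    observed = [v for v in replaced if v is not None]
    average = sum(observed) / len(observed)
    filled = [average if v is None else v for v in replaced]
    # Step 3: normalize to unitless z-scores with mean 0 and SD 1.
    n = len(filled)
    mean = sum(filled) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in filled) / (n - 1))
    return filled, [(v - mean) / sd for v in filled]

raw = [0.1, 2.0, None, 4.0]   # dummy concentrations; DL assumed to be 0.5
filled, z_scores = prepare_variable(raw, detection_limit=0.5)
print(missing_fraction(raw))
print(filled)
print(z_scores)
```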
Adequate number of data set

◼ Number of data points must be more than the number of variables
◼ Number of data points should be 5 times the number of variables
◼ N ≥ 100 samples (PK Hopke)
◼ N > (30+p+3)/2 (Henry et al 1984)
◼ N = 50 (source unknown)
◼ N = 30 (magic number!)
◼ Suitability test (KMO and Bartlett’s test): Our suggestions!!
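
Several of these rules of thumb can be checked at once for a given data set. The sketch below is a minimal Python illustration with a hypothetical function name. It assumes that p is the number of variables and applies the Henry et al. criterion exactly as it is written on the slide. The KMO and Bartlett tests are not reproduced here, since they are run in SPSS.

```python
def sample_size_checks(n_samples, n_variables):
    """Evaluate the rule-of-thumb criteria for an adequate number of samples."""
    p = n_variables
    return {
        "more_samples_than_variables": n_samples > p,
        "five_times_variables": n_samples >= 5 * p,
        "hopke_100_samples": n_samples >= 100,
        "henry_1984_as_on_slide": n_samples > (30 + p + 3) / 2,
    }

# Hypothetical data set sizes, for illustration only.
print(sample_size_checks(n_samples=120, n_variables=20))
print(sample_size_checks(n_samples=40, n_variables=20))
```

The rules do not always agree, which is one reason the slide recommends a suitability test on the data themselves.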
Optimization of factor number

◼ Eigenvalue > 1
◼ Variance (%) ~ 10 or > 10
◼ Interpretable factor profiles
◼ At least one variable should respond significantly
◼ Exclude a variable if it does not respond to any factor!
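
The first two criteria can be applied mechanically to a list of eigenvalues. This is a minimal Python sketch with a hypothetical function name and made-up eigenvalues. It treats "about 10%" as a strict threshold of 10%, which is an assumption; interpretability of the factor profiles still has to be judged by the analyst.

```python
def retained_components(eigenvalues, min_eigenvalue=1.0, min_variance_pct=10.0):
    """Return (component number, % variance) for components passing both criteria."""
    total = sum(eigenvalues)
    kept = []
    for number, eigenvalue in enumerate(eigenvalues, start=1):
        variance_pct = 100 * eigenvalue / total
        if eigenvalue > min_eigenvalue and variance_pct >= min_variance_pct:
            kept.append((number, round(variance_pct, 1)))
    return kept

# Made-up eigenvalues for six standardized variables (they sum to 6).
eigenvalues = [2.8, 1.4, 0.9, 0.5, 0.3, 0.1]
print(retained_components(eigenvalues))
```

In this example the third component explains 15% of the variance but has an eigenvalue below 1, so it is not retained.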
Exercise: VI

Activities: PCA procedure

- Follow the example data and run them through PCA to reduce the data into a small number of groups, with the least correlation observed among the groups

Demonstration: PCA procedure

PCA – PCR – APCS - MLR
STEP BY STEP
Step 1: Get Data

◼ Suitable data (N)

◼ Missing value
Step 2: Normalize the Data in Excel
Step 3: Upload the normalised data into SPSS
◼ Upload the data file into SPSS
Step 4: Make Sure the Data Are Numeric
Step 5: Suitability of the Data
◼ KMO and Bartlett’s test
Step 6: Check KMO Value in Output File
Step 7: Run PCA for Normalised Data
◼ Check all important info one by one
◼ Select the covariance method
◼ Rotation: Varimax
PCA Results
◼ Eigenvalue > 1: 7 components!
◼ Unrotated factor loading
◼ Rotated factor loading (important info)
Step 8: Explanation of Factor Loading

◼ Factor loading > 0.7
◼ Explain based on the significant variables
◼ Need to refer to published papers to explain the sources; this needs a lot of reading
Step 9: Copy and paste the Factor Scores into an Excel Sheet
Principal component regression (PCR)
Principal component regression (PCR) analysis combines PCA (principal component analysis) with OLS (ordinary least squares) regression. PCR is well suited to studying the statistical relationships between air pollutants and meteorological factors. It reduces multicollinearity in the dataset; multicollinearity among the independent variables would otherwise produce invalid results for the model’s predictions and for determining which independent variables are significant. Factors with eigenvalues greater than 1.0 are chosen because they are considered significant and capture the correlation structure among the variables. The significant factors obtained from the PCA, which are composed of the independent variables, are then regressed against the dependent variable using OLS regression.
Exercise: VII

Demonstration (PCR): Dummy Data

Limitation of PCR
◼ Input data: normalization
◼ Execution of PCA
◼ Rotation by Varimax to obtain meaningful PCs
◼ Factor loadings for PC1, PC2, PC3…; factor scores for PC1, PC2, PC3…
◼ MLR: PC1, PC2, PC3… vs a dependent variable
◼ Limitation: negative mass concentrations appear (unrealistic)

Corrections for PCA
◼ Induction of an artificial sample with zero concentration for the variables
◼ Calculate APCS for each PC
◼ Regress APCS against the dependent variable
◼ Determine the contribution of each PC with a lower uncertainty value
APCS-MLR Step by Step

Step 10: Prepare a New Raw Data Set

Add a zero sample as the last row
Step 11: Normalise the zero sample

= (X-Mean)/SD

Use “$” to fix the row references for Average and Standard Deviation

Paste formula e.g. = (H3-H$632)/H$633
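
The same normalisation can be written outside Excel. The sketch below is a minimal Python illustration with a hypothetical function name and dummy concentrations. It assumes that the mean and standard deviation are taken from the real samples only, and that the standard deviation is the sample (n − 1) form; check both assumptions against the spreadsheet actually used.

```python
import statistics

def normalise_with_zero_sample(concentrations):
    """Z-scores for the real samples plus the z-score of an artificial zero sample."""
    mean = statistics.mean(concentrations)
    sd = statistics.stdev(concentrations)        # sample standard deviation
    z_samples = [(x - mean) / sd for x in concentrations]
    z_zero = (0.0 - mean) / sd                   # the zero sample, same formula with X = 0
    return z_samples, z_zero

z_samples, z_zero = normalise_with_zero_sample([2.0, 4.0, 6.0])
print(z_samples, z_zero)
```

The zero sample always lands below the mean on the normalised scale, which is why its factor score is later subtracted from every real sample.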
Step 12: Run PCA for the Second Time
Exercise: VIII

Demonstration: PCA-APCS
Step 13: Copy and paste the Factor Scores (0 Sample) into the Excel Sheet from Step 9
Step 14: Subtract the Factor Score for the Zero Sample (Step 13) from Each Sample in Step 9

◼ The revised factor scores are recognized here as APCS

APCS = factor score (Step 9) minus zero-sample factor score (Step 13)
Step 15: Run MLR using PM2.5 mass as the Dependent Variable and Each of the APCS as an Independent Variable.
Step 16: Convert the APCS into Factor Mass by Multiplying by the Respective Regression Coefficient
Conversion of APCS into Mass Concentration

APCS × Regression Coefficient (B Column)

Delete “Negative Mass” from the Data Set
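
Steps 14 and 16 amount to a subtraction followed by a multiplication. This is a minimal Python sketch with hypothetical function names and made-up numbers; the regression coefficients would come from the B column of the MLR output in Step 15.

```python
def absolute_scores(factor_scores, zero_scores):
    """APCS: each sample's factor score minus the zero-sample score of that factor."""
    return [[score - zero for score, zero in zip(row, zero_scores)]
            for row in factor_scores]

def factor_mass(apcs, coefficients):
    """Mass contribution of each factor: APCS times its regression coefficient."""
    return [[value * coef for value, coef in zip(row, coefficients)]
            for row in apcs]

def rows_with_negative_mass(mass):
    """Indices of samples in which any factor has a negative mass contribution."""
    return [i for i, row in enumerate(mass) if any(value < 0 for value in row)]

# Made-up values: two samples, two factors.
factor_scores = [[0.5, -1.0], [1.5, 0.0]]
zero_scores = [-1.0, -2.0]
coefficients = [2.0, 3.0]

apcs = absolute_scores(factor_scores, zero_scores)
mass = factor_mass(apcs, coefficients)
print(apcs)
print(mass)
print(rows_with_negative_mass(mass))
```

Samples flagged by the last function are the ones the slide asks to delete from the data set before plotting predicted against input mass.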
A correlation of input and predicted PM2.5 mass
% Distribution of PM2.5 mass contributed by F1, F2, F3, F4, F5, and F6
Assignment:
A review on current perspectives of principal component analysis followed by
an absolute principal component analysis in environmental application

Thank you for your attendance

Any further inquiry, please contact me:

mdfirozkhan@um.edu.my, mdfiroz.khan@gmail.com
Acknowledgement

www.utsc.utoronto.ca/~phanira/WebResearchMethods/
https://www.nemoursresearch.org/open/StatClass/January200
https://www.stat.auckland.ac.nz/~balemi/Multivariate
