Final Cheat Sheet 2

The document provides an overview of statistical concepts, including types of random variables, data collection methods, sampling techniques, and hypothesis testing. It discusses the importance of probability distributions, sampling distributions, and the Central Limit Theorem. Additionally, it outlines various statistical experiments, definitions, and rules related to data visualization and analysis.

Uploaded by

Michael Abou zeid

Data:
- Cross Sectional: not dependent on time
- Time Series: sequence over time matters
- Panel: Cross sectional + Time series

Data Collection:
- Experiments, Observational studies
- Prospective: design - collect - analyze
- Retrospective: collect - design - analyze

Sampling:
- Random: choose n individuals from population with equal chance
- Simple: one group with equal chance
- Stratified: split into groups then do simple random sample (e.g. gender)
- Convenient: choose individuals that are easy to access (not random)

Stats Definitions:
- Population: group we are interested in
- Parameter: numeric value describing the population ($\mu, \sigma, p, N$)
- Sample: subset of a population
- Statistic: known value describing the sample ($\bar{x}, s, \hat{p}, n$)

Continuous Random Variables:
- Probability Density Function: $f(x)$
- Cumulative Distribution Function: $F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
- Probability of a random variable: $P(a < X < b) = \int_a^b f(x)\,dx$
- Rules: $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (area under the curve) and $f(x) \ge 0$ for all $x \in \mathbb{R}$

Discrete Random Variables:
- Probability Mass Function: $f(x) = P(X = x)$
- Cumulative Distribution Function: $F(x) = P(X \le x) = \sum_{t \le x} f(t)$
- Sum of probability: $\sum_x f(x) = 1$

Expectation:
- $E(X) = \mu = \int x f(x)\,dx$ (continuous) or $\sum_x x f(x)$ (discrete)
- $E(g(X)) = \int g(x) f(x)\,dx$
- Properties with constants: $E(b) = b$, $E(aX + b) = aE(X) + b$

Variance ("spread"):
- $\mathrm{Var}(X) = \sigma^2 = E((X - \mu)^2) = E(X^2) - (E(X))^2$
- $\mathrm{Var}(g(X)) = E((g(X) - \mu_{g(X)})^2)$
- Std. Dev. $= \sqrt{\mathrm{Var}(X)}$
- $\mathrm{Var}(aX \pm bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y) \pm 2ab\,\mathrm{cov}(X, Y)$

Covariance and Correlation:
- Covariance: $\mathrm{cov}(X, Y) = \sigma_{xy} = E((X - \mu_x)(Y - \mu_y)) = E(XY) - \mu_x\mu_y$
- Correlation: $\rho_{xy} = \sigma_{xy} / (\sigma_x \sigma_y)$ (covariance standardized, like a z-score)
- Relationship between X and Y: $\rho_{xy} > 0$: direct; $\rho_{xy} < 0$: inverse; $\rho_{xy} = 0$ when $X \perp Y$ (independent)

Data Visualization:
- Scatter plot: relationship for 2 numeric variables
- Histogram: how frequently values appear (frequency or rel. frequency). Good for visualization. Doesn't show exact value of data points. All data must be collected before making histogram
- Stem Leaf Plot: shows exact values. Hard to choose stems
- Box plot: visualizes one factor in data; compare multiple data sets

Distributions:
- Standard Normal: mean ($\mu = 0$), variance ($\sigma^2 = 1$)
- Normal: mean ($\mu$), variance ($\sigma^2$)
- Binomial: success prob ($p$), # of trials ($n$)
- Poisson: event rate ($\lambda$)
- t-Distribution: degrees of freedom ($v$)
- Chi-squared ($\chi^2$): degrees of freedom ($k$)
- F-Distribution: degrees of freedom $v_1, v_2$

Measures of spread:
- Range: max - min
- Variance: Sample: $s^2 = \sum (x_i - \bar{x})^2 / (n - 1)$; Population: $\sigma^2 = \sum (x_i - \mu)^2 / N$
- Standard Dev.: Sample: $s = \sqrt{s^2}$; Population: $\sigma = \sqrt{\sigma^2}$
- Degrees of freedom: $v = n - 1$, the # of independent pieces of info available to compute variability
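As a minimal sketch of the sample variance, covariance, and correlation formulas, the snippet below computes them by hand and checks the rule $\mathrm{Var}(aX + bY) = a^2\mathrm{Var}(X) + b^2\mathrm{Var}(Y) + 2ab\,\mathrm{cov}(X, Y)$ numerically. The data and the constants `a`, `b` are made up for illustration and are not from the cheat sheet.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    """s^2 = sum((x_i - x_bar)^2) / (n - 1)."""
    x_bar = mean(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


def sample_covariance(xs, ys):
    """Sample covariance with the same n - 1 divisor as the variance."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    return sample_covariance(xs, ys) / math.sqrt(
        sample_variance(xs) * sample_variance(ys)
    )


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2, 3

combo = [a * x + b * y for x, y in zip(xs, ys)]
lhs = sample_variance(combo)
rhs = (a**2 * sample_variance(xs) + b**2 * sample_variance(ys)
       + 2 * a * b * sample_covariance(xs, ys))

print(sample_variance(xs), sample_covariance(xs, ys), correlation(xs, ys))
print(lhs, rhs)  # both sides of the variance rule agree
```

The same identity holds for the population versions as long as one divisor (n - 1 or N) is used consistently.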

Probability Definitions:
- Statistical Experiment: process that generates data (different to Experiment as it is about testing data instead of hypothesis)
- Sample Space (S): set of all possible outcomes in statistical experiment
- Event: subset of a sample space
- Partition: two subsets that combine to form sample space (e.g. events A and A' form S)
- Random Variable: function associating a number with each outcome in S
- Law of Large Numbers: as sample size increases, sample mean converges to population mean
- Sampling Distribution: distribution of sample statistics from many samples
- Central Limit Theorem: as the # of independent, identically distributed observations (n) increases, the sampling distribution of the mean converges to a normal distribution $N(\mu, \sigma^2/n)$ (if sample size $n \ge 30$: assume normality)
- Paired Samples: look at before and after of the same sample

Box plot:
- IQR: $Q_3 - Q_1$
- Lower whisker: $Q_1 - 1.5(\mathrm{IQR})$
- Upper whisker: $Q_3 + 1.5(\mathrm{IQR})$
- Outliers: values > upper whisker or < lower whisker

Measures of centrality:
- Mode: highest peak (most frequent #). * Not affected by outliers
- Mean: balance point. * Affected by outliers
- Median: area is split 50/50. * Not affected by outliers

Measures of shape:
- Modality
- Symmetry
- Skewness (coeff. of skewness)

Point Estimate:
- Estimate a population parameter ($\theta$) (e.g. $\mu$) by finding a point estimate ($\hat{\theta}$) (e.g. $\bar{x}$), which is a single value from a statistic ($\hat{\Theta}$) (e.g. $\bar{X}$ computed from $x_1, x_2, ..., x_n$)
- Statistic ($\hat{\Theta}$) is an unbiased estimator of parameter ($\theta$) if: $\mu_{\hat{\Theta}} = E[\hat{\Theta}] = \theta$, e.g. $E[\bar{X}] = \mu$
- Sampling Distributions: point estimates have sampling distributions since they have variance
- Efficient Estimator: point estimator ($\hat{\Theta}$) that has small variance in the point estimate distribution (small $\sigma^2_{\hat{\Theta}}$)
- Most Efficient Estimator: statistic that is in the same category as the parameter (e.g. sample variance is the most efficient estimator for the population variance). It has the least variance, meaning it is the best statistic for getting a point estimate ($\hat{\theta}$) close to the parameter of interest ($\theta$)

Testing Errors:
- Type I: reject H0 when it is true
- P(Committing Type I error) = $\alpha$ = level of significance; Confidence Level = $1 - \alpha$
- P-value < $\alpha$: Reject H0; P-value > $\alpha$: fail to Reject H0
- Decrease $\alpha$: 1. Increase sample size 2. Decrease critical region (Increase fTR region)
- Type II: fail to reject H0 when it is false
- P(Committing Type II Error) = $\beta$
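The box plot whisker rule ($Q_1 - 1.5\,\mathrm{IQR}$ and $Q_3 + 1.5\,\mathrm{IQR}$) can be sketched with the standard library as follows. The quartile convention is an assumption: `statistics.quantiles` with `method="inclusive"` is one of several common definitions, and the cheat sheet does not say which one it uses. The data set is hypothetical.

```python
import statistics


def box_plot_summary(data):
    """Quartiles, IQR, whisker limits, and points outside the whiskers."""
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_whisker = q1 - 1.5 * iqr
    upper_whisker = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_whisker or x > upper_whisker]
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": iqr,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
summary = box_plot_summary(data)
print(summary)

# The mean is pulled toward the extreme value; the median is not.
print(statistics.mean(data), statistics.median(data))
```

With this data the single value 100 lies above the upper whisker, and the mean (about 15.1) sits far from the median (5), which matches the notes on which centrality measures are affected by outliers.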
Hypothesis:
- Statistical Hypothesis: assertion about one or more populations and a parameter value
- 2-Sided Hypothesis: specifies that the parameter is exactly equal to a value
- 1-Sided Hypothesis: specifies that the parameter is at least or at most a value
- Null Hypothesis (H0): hypothesis that we can accept in the absence of data. We are looking to reject the null hypothesis.
- Alternate Hypothesis (Ha): hypothesis that is the opposition of the null. Statistical evidence and analysis is looking to be used to support the alternate and reject the null.
- H0 + Ha = all possible outcomes
- Assume that the null is true
- E.g. Mean movie length is 2 hours: H0: mean movie length = 2 hours; Ha: mean movie length $\ne$ 2 hours
- E.g. Mean height of boys is at most 177 cm: H0: mean height $\le$ 177 cm; Ha: mean height > 177 cm
- P-Value: probability of observing data as extreme as the data found while assuming null is true
- Critical region: region where H0 is rejected; everything else is the fail to reject (fTR) region
- Low P-Value (inside critical region) means sufficiently low probability of getting that data given the null is true -> Reject Null
- High P-Value (outside critical region) means not sufficiently low probability of getting that data given the null is true -> Fail to Reject Null

Statistical Power:
- Power = $1 - \beta$ = P(correctly rejecting H0 when it's false)
- $\beta$ can only be found when we have a value for Ha
- The farther Ha is from H0, the smaller $\beta$ is
- Decrease $\beta$: 1. Increase sample size 2. Increase critical region (Decrease fTR region)
- 2-factor experiment ANOVA (notation): mean of level i for factor A is taken across all j and k

Testing Goodness of Fit:
- Given an expected distribution with K bins and expected frequency ($e_i$) per bin, and an observed distribution with K bins and observed frequency ($o_i$) per bin
- H0: observed and expected distributions are the same (observed data follows expected distribution)
- Ha: observed and expected distributions are not the same (observed data does not follow expected distribution)
- Test Statistic: $\chi^2 = \sum_{i=1}^{k} (o_i - e_i)^2 / e_i$ with $v = k - 1$
- Reject H0 if $\chi^2 > \chi^2_{\alpha, v}$
- * Only can use if $e_i \ge 5$ for all bins (i); can combine bins to get $e_i \ge 5$

Test on Categorical Data:
- Contingency Table: shows frequency of levels of categorical data according to variables/factors
- Expected Value Table: $e_{ij} = (\text{row}_i\ \text{total})(\text{col}_j\ \text{total}) / N$
- Special Case: Yates' correction. If 2x2 Table (|I| = 2 and |J| = 2) -> would end up with $v = 1$ -> use Yates' correction

Hypothesis Testing (worked example):
- E.g. mean height is claimed to be 68 in. $\sigma = 3.6$ in. Sample size $n = 36$
- H0: $\mu = 68$; Ha: $\mu \ne 68$ -> Critical Region: $\bar{x} < 67$ and $\bar{x} > 69$ (Reject H0 | Don't reject H0 | Reject H0)
- a) find $\alpha$: $\alpha = P(\text{Type I error}) = P(\bar{X} < 67 \mid \mu = 68) + P(\bar{X} > 69 \mid \mu = 68) = P(Z < \frac{67 - 68}{3.6/\sqrt{36}}) + P(Z > \frac{69 - 68}{3.6/\sqrt{36}}) = P(Z < -1.67) + P(Z > 1.67) = 0.095$
- b) find $\beta$ (fail to reject null) given $\mu = 70$: $\beta = P(67 < \bar{X} < 69 \mid \mu = 70) = P(Z < -1.67) - P(Z < -5) = 0.0475 - 0 = 0.0475$
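The Type I and Type II error probabilities in the height example (claimed mean 68 in., $\sigma = 3.6$, $n = 36$, critical region $\bar{x} < 67$ or $\bar{x} > 69$, alternative mean 70) can be reproduced with `statistics.NormalDist`. This is a sketch of the calculation, not the sheet's own method; the variable names are illustrative, and the results differ slightly from the hand values because the z-score is not rounded to 1.67.

```python
import math
from statistics import NormalDist

mu_null = 68.0     # claimed mean under H0
mu_alt = 70.0      # specific alternative used to compute beta
sigma = 3.6        # population standard deviation
n = 36             # sample size
lower, upper = 67.0, 69.0   # fail-to-reject region for the sample mean

standard_error = sigma / math.sqrt(n)

# Type I error: the sample mean lands in the critical region although H0 is true.
null_dist = NormalDist(mu_null, standard_error)
alpha = null_dist.cdf(lower) + (1 - null_dist.cdf(upper))

# Type II error: the sample mean lands in the fail-to-reject region although mu = 70.
alt_dist = NormalDist(mu_alt, standard_error)
beta = alt_dist.cdf(upper) - alt_dist.cdf(lower)

power = 1 - beta

print(round(alpha, 4), round(beta, 4), round(power, 4))
```

Moving `mu_alt` farther from 68 shrinks `beta`, which is the "farther Ha is from H0" rule in numeric form.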
Yates' correction (2x2 contingency table, $v = 1$):
- Test Statistic: $\chi^2 = \sum_i \sum_j (|o_{ij} - e_{ij}| - 0.5)^2 / e_{ij}$

P-value and conclusion:
- P-value > 0.10: No evidence to reject the null
- 0.05 < P-value < 0.10: Weak evidence against the null
- 0.01 < P-value < 0.05: Moderate evidence against the null
- 0.001 < P-value < 0.01: Strong evidence against the null
- P-value < 0.001: Very strong evidence against the null

Testing Homogeneity of Categorical Data (Independence of levels):
- * Marginal total values are fixed
- H0: L/R handedness does not affect the distribution of the favourite pets
- Ha: Does affect distribution
- E.g. "Type of handedness does / doesn't affect pet distribution"

Testing Independence of Categorical Data:
- Is there an association between two factors?
- * Fix sample size when collecting the data
- H0: They are independent; Ha: They are not independent
- E.g. "name and snack are independent"

Test statistic for both tests:
- $\chi^2 = \sum_i \sum_j (o_{ij} - e_{ij})^2 / e_{ij}$, requires $e_{ij} \ge 5$
- Reject H0 if $\chi^2 > \chi^2_{\alpha, v}$ where $v = (|I| - 1)(|J| - 1)$
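A minimal sketch of the contingency-table test: expected counts come from the row and column totals, and the statistic is summed over all cells, optionally with Yates' correction for a 2x2 table. The observed counts are hypothetical, and the function names are illustrative. The p-value lookup is left out because it needs a chi-squared table or a library such as SciPy.

```python
def expected_counts(observed):
    """e_ij = (row_i total) * (col_j total) / N for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed, yates=False):
    """Return (chi-squared statistic, degrees of freedom)."""
    expected = expected_counts(observed)
    statistic = 0.0
    for o_row, e_row in zip(observed, expected):
        for o, e in zip(o_row, e_row):
            diff = abs(o - e)
            if yates:
                diff -= 0.5  # Yates' continuity correction
            statistic += diff ** 2 / e
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Hypothetical 2x2 table (e.g. handedness by favourite pet).
observed = [[10, 20],
            [30, 40]]

expected = expected_counts(observed)
plain_stat, dof = chi_square_statistic(observed)
yates_stat, _ = chi_square_statistic(observed, yates=True)

print(expected)
print(round(plain_stat, 4), round(yates_stat, 4), dof)
```

The corrected statistic is always smaller than the uncorrected one, so Yates' correction makes the test more conservative.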
Confidence Intervals:
- Interval Estimate: centrality estimate $\pm$ k (spread estimate)
- * As n increases, spread decreases
- Sample Size: $n \ge 30$: normally distributed by CLT; $n < 30$: assume normally distributed

Hypothesis Testing:
- * Assuming null is true, how likely is it to observe data as "extreme" as what is observed

Variance ($\sigma^2$):
- 1 sample: CI for $\sigma^2$: $\frac{(n-1)s^2}{\chi^2_{\alpha/2, v}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2, v}}$ where $v = n - 1$ and $\chi^2_{p, v}$ leaves area p to its right
- * $\chi^2$ is not symmetric
- 2 Sample Variance Ratio: Test Statistic $f = s_1^2 / s_2^2$ where $v_1 = n_1 - 1$, $v_2 = n_2 - 1$
- CI for $\sigma_1^2 / \sigma_2^2$: $\frac{s_1^2}{s_2^2} \frac{1}{f_{\alpha/2}(v_1, v_2)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2} f_{\alpha/2}(v_2, v_1)$ where $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$ (order changes!)
- * If Ratio = 1 -> no difference

Means:
- 1 sample mean ($\bar{x}$) + known variance ($\sigma^2$): CI for $\mu$: $\bar{x} \pm z_{\alpha/2} (\sigma / \sqrt{n})$
- Test statistic: $z = (\bar{x} - \mu_0) / (\sigma / \sqrt{n})$
- 1 sample + unknown variance ($s^2$), $n < 30$: CI for $\mu$: $\bar{x} \pm t_{\alpha/2, v} (s / \sqrt{n})$ where $v = n - 1$
- 1 sample + unknown variance ($s^2$), $n \ge 30$: CI for $\mu$: $\bar{x} \pm z_{\alpha/2} (s / \sqrt{n})$

Prediction Interval (predicting a next observation):
- 1 sample + known variance ($\sigma^2$): PI for future observation ($x_0$): $\bar{x} \pm z_{\alpha/2}\, \sigma \sqrt{1 + 1/n}$
- 1 sample + unknown variance ($s^2$): PI for next observation ($x_0$): $\bar{x} \pm t_{\alpha/2, v}\, s \sqrt{1 + 1/n}$ where $v = n - 1$

Binary Linear Regressions:
- Response variable is 0 or 1
- Mean: logistic function $f(x) = \frac{1}{1 + e^{-x}}$
- Logistic Regression Model (fitted): $p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + ... + \beta_k x_k)}}$, the probability of getting a success
- Non linear -> cannot find $b_i$ easily
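As a small illustration of the known-variance formulas, the sketch below computes a confidence interval for $\mu$ and a prediction interval for the next observation, using `statistics.NormalDist().inv_cdf` for $z_{\alpha/2}$. The inputs ($\bar{x} = 68$, $\sigma = 3.6$, $n = 36$) are assumed values for the example; the function names are illustrative.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """z_{alpha/2}: upper alpha/2 critical value of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def mean_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """x_bar +/- z * sigma / sqrt(n), population variance known."""
    margin = z_critical(confidence) * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


def prediction_interval(x_bar, sigma, n, confidence=0.95):
    """x_bar +/- z * sigma * sqrt(1 + 1/n) for one future observation."""
    margin = z_critical(confidence) * sigma * math.sqrt(1 + 1 / n)
    return x_bar - margin, x_bar + margin


# Assumed example values.
ci = mean_confidence_interval(x_bar=68.0, sigma=3.6, n=36)
pi = prediction_interval(x_bar=68.0, sigma=3.6, n=36)

print(tuple(round(v, 3) for v in ci))
print(tuple(round(v, 3) for v in pi))
```

The prediction interval is much wider than the confidence interval because it includes the variability of a single new observation, not only the uncertainty in the mean.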
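Means (continued):
- 1 mean ($\bar{x}$) + unknown variance ($s^2$): Test Statistic $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$
- * can use z-statistic if the sample is large ($n \ge 30$)

2 Samples: Look at difference in means
- Known variance ($\sigma^2$): CI for $(\mu_1 - \mu_2)$: $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
- Unknown variance ($s^2$): CI for $(\mu_1 - \mu_2)$: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2, v} \sqrt{s_1^2/n_1 + s_2^2/n_2}$ where $v = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1 - 1) + (s_2^2/n_2)^2/(n_2 - 1)}$
- * If zero is in the interval -> no difference
- Difference in Means, known variance: Test statistic $z = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
- Paired Samples: CI for $\mu_d$: $\bar{d} \pm t_{\alpha/2, v} (s_d / \sqrt{n})$ where $\bar{d}$ = mean difference

Proportions (Binomial):
- Success Proportion: $\hat{p} = X / n$ (X = # of successes)
- Sample size: $n\hat{p} \ge 5$ and $n(1 - \hat{p}) \ge 5$
- 1 Sample True Proportion: CI for true proportion (p): $\hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1 - \hat{p}) / n}$
- 2 Samples Difference in Proportions (both samples must meet sample size): CI for true $p_1 - p_2$: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\hat{p}_1(1 - \hat{p}_1)/n_1 + \hat{p}_2(1 - \hat{p}_2)/n_2}$

Logistic regression odds:
- Odds of success: $p / (1 - p)$
- Odds Ratio: $\frac{p_2 / (1 - p_2)}{p_1 / (1 - p_1)} = e^{b_i}$: factor by which the odds of success changes when moving from $p_1$ to $p_2$
- As $X_i$ increases by 1: odds increase by a factor of $e^{b_i}$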
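To make the odds-ratio statement concrete, the sketch below evaluates a one-variable logistic model at two inputs one unit apart and confirms that the odds change by a factor of $e^{b_1}$. The coefficients `b0` and `b1` are hypothetical, not fitted values from the sheet.

```python
import math


def logistic_probability(x, b0, b1):
    """p = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def odds(p):
    """Odds of success: p / (1 - p)."""
    return p / (1.0 - p)


# Hypothetical coefficients for illustration only.
b0, b1 = -1.0, 0.5

p1 = logistic_probability(2.0, b0, b1)   # x = 2
p2 = logistic_probability(3.0, b0, b1)   # x = 3, one unit higher

odds_ratio = odds(p2) / odds(p1)

print(round(p1, 4), round(p2, 4))
print(round(odds_ratio, 6), round(math.exp(b1), 6))  # the two values agree
```

The probability itself does not change by a constant amount per unit of x; only the odds scale by a constant factor, which is why the coefficients are usually read as odds ratios.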

Hypothesis tests (test statistics):
- Difference in Means, Unknown Variance ("2 Sample T-Test": unknown population variance): $t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$, where $d_0$ = null difference in means $(\mu_1 - \mu_2)$ and v is rounded to the nearest integer
- Difference in Means for Paired Samples: $t = \frac{\bar{d} - d_0}{s_d / \sqrt{n}}$ with $v = n - 1$, where $\bar{d}$ = mean difference
- Proportions: * Large sample size: $np \ge 5$ and $n(1 - p) \ge 5$
- 1 Proportion ("1 proportion Z-Test"): H0: $p = p_0$; $z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$
- Difference in Proportions ("2-proportion z-Test"): $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})(1/n_1 + 1/n_2)}}$ where $\hat{p} = (x_1 + x_2)/(n_1 + n_2)$
- 1 Sample Variance: $\chi^2 = (n - 1)s^2 / \sigma_0^2$ with $v = n - 1$
- Variance: H0 and Ha must be about the variance ($\sigma^2$), not the SD ($\sigma$)

Bartlett's Test (test homogeneity of variance):
- H0: $\sigma_1^2 = \sigma_2^2 = ... = \sigma_k^2$; Ha: at least two of $\sigma_1^2, ..., \sigma_k^2$ aren't equal
- 1. Find K sample variances: $s_1^2, s_2^2, ..., s_k^2$
- 2. Find pooled variance estimate: $s_p^2 = \frac{\sum_{i=1}^{k} (n_i - 1) s_i^2}{N - k}$
- 3. Find Bartlett statistic b: $b = \frac{[(s_1^2)^{n_1 - 1} (s_2^2)^{n_2 - 1} \cdots (s_k^2)^{n_k - 1}]^{1/(N - k)}}{s_p^2}$
- 4. Find critical value $b_k(\alpha; n_1, ..., n_k)$: with equal sample sizes it can be found directly from the Bartlett table; otherwise $b_k(\alpha; n_1, ..., n_k) \approx \frac{\sum_i n_i\, b_k(\alpha; n_i)}{N}$
- Reject H0 if $b < b_k(\alpha; n_1, ..., n_k)$; F.T.R. H0 if $b \ge b_k(\alpha; n_1, ..., n_k)$

One factor experiments (Completely Randomized Design):
- Factor: variable that separates the conditions
- Levels/Treatments: value of the factor
- E.g. factor: coffee (levels such as Tim Hortons)
- Within Sample Variance: variance within one level itself
- Between Sample Variance: variance between every combination of two levels
- Assumptions, for the populations of the K levels:
  - Independent: one factor experiment with no repeated measure or blocking
  - Have no outliers
  - Normally Distributed: the sample data for each level is normally distributed with individual mean $\mu_i$ and common variance $\sigma^2$ (hypothesis test: Bartlett test)

Statistical Model: $Y_{ij} = \mu + \alpha_i + \epsilon_{ij}$
- $Y_{ij}$: value of level i, observation j
- $\mu$: population grand mean (mean of all $\mu_i$)
- $\alpha_i$: effect of level i (how data points change by a common factor, as a result of the level you are on); $\sum_i \alpha_i = 0$
- $\epsilon_{ij}$: deviation of level i, observation j from the mean of level i
- Null assumptions: the $\epsilon_{ij}$'s are independent and normally distributed with mean 0 and variance $\sigma^2$ for all i's

One Way ANOVA:
- N = overall # of observations; k = # of levels
- Upper-Tailed Test at Sig. Level $\alpha$
- H0: $\alpha_1 = ... = \alpha_k = 0$ (none of the levels have an effect on the observed value); Ha: at least one $\alpha_i \ne 0$
- Equivalently H0: $\mu_1 = ... = \mu_k$; Ha: at least one $\mu_i$ is different
- $SSA = \sum_i n_i (\bar{y}_{i.} - \bar{y}_{..})^2$ (sample mean of level i vs grand mean; accounts for sample size of level i)
- $SSE = \sum_i \sum_j (y_{ij} - \bar{y}_{i.})^2$ (data point j for level i vs sample mean of level i)
- $MSE = SSE / (N - k)$, DOF of error = $N - k$
- F-Test: Test Statistic $f = \frac{SSA / (k - 1)}{SSE / (N - k)}$; Reject H0 if $f > f_\alpha(k - 1, N - k)$; F.T.R. H0 otherwise

Pairwise comparison: $\binom{k}{2}$ pairs
- Only use when the original ANOVA H0 is rejected
- H0(i,j): $\mu_i = \mu_j$, i.e. $\mu_i - \mu_j = 0$
- The more pairwise comparisons done without a priori reasoning, the higher the chance of Type I error
- P(1 or more Type I errors) $= 1 - (1 - \alpha)^{\#\,\text{pairwise}}$

Planned comparisons (a priori):
- Comparison of two levels chosen before performing the ANOVA test, using qualitative analysis
- Look at data before the ANOVA test and identify two levels that may have differences (E.g. look at box plots before doing ANOVA test)

Tukey's Test:
- Look at potential differences after ANOVA test (only posterior knowledge)
- Increases probability of Type II error (higher p-values)
- H0(i,j): $\mu_i = \mu_j$, i.e. $\mu_i - \mu_j = 0$
- Studentized range standard error: $SR_{SE} = \sqrt{\frac{MSE}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$ where $n_i, n_j$ = sample size of groups i, j
- q-statistic: $q = |\bar{y}_i - \bar{y}_j| / SR_{SE}$
- $q^*$: critical value from the Tukey table with DOF of error $(N - k)$
- If $q > q^*$: Reject H0: "Difference between i and j is significant"

Linear Contrasts: aggregate groups (cluster levels together into contrasts)
- $w = \sum_i c_i \mu_i$ where $\sum_i c_i = 0$
- $c_i$: coefficient indicating the side and weight of mean i
- H0: $\sum_i c_i \mu_i = 0$; Ha: $\sum_i c_i \mu_i \ne 0$
- E.g. Room 1,2,3,5 vs Room 4: $w = (1)\mu_1 + (1)\mu_2 + (1)\mu_3 - 4\mu_4 + (1)\mu_5$; H0: $\mu_1 + \mu_2 + \mu_3 + \mu_5 - 4\mu_4 = 0$
- E.g. Room 1,2 vs Room 3,5 (Room 4 not participating): $w = (1)\mu_1 + (1)\mu_2 + (-1)\mu_3 + (0)\mu_4 + (-1)\mu_5$; H0: $\mu_1 + \mu_2 - \mu_3 - \mu_5 = 0$
- $SS_w$: sum of squares due to contrast w, a subset of SSA: $SS_w = \frac{(\sum_i c_i \bar{y}_i)^2}{\sum_i c_i^2 / n_i}$
- Test Statistic: $f = SS_w / MSE$; Reject H0 if $f > f_\alpha(1, N - k)$ (use same $\alpha$)
- If rejected: "The contrast is significant in explaining why the original H0 was rejected"
- % of variability explained by contrast w: $SS_w / SSA$
- Orthogonality: two contrasts $w_a = \sum_i c_i \mu_i$ and $w_b = \sum_i d_i \mu_i$ are orthogonal if $\sum_i (c_i d_i)/n_i = 0$; then $SS_{w_a}$ and $SS_{w_b}$ are independent subsets of SSA (no overlap)
- At most k - 1 orthogonal linear contrasts: $SSA = SS_{w_1} + SS_{w_2} + ... + SS_{w_{k-1}}$

One Factor ANOVA with Repeated Measures:
- Observations are not completely independent of each other
- E.g. take 10 subjects and give three treatments to each -> 30 data points
- H0: $\mu_1 = ... = \mu_k$; Ha: at least 1 mean is different, where k = # of treatments
- Statistical Model: $Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$
  - $\mu$ = total mean
  - $\alpha_i$ = effect of treatment i on mean
  - $\beta_j$ = effect of subject j on mean
  - $\epsilon_{ij}$ = deviation of level i, observation j from the mean of level i
  - * $\sum_i \alpha_i = 0$ and $\sum_j \beta_j = 0$
- H0: $\alpha_1 = ... = \alpha_k = 0$; Ha: at least one $\alpha_i$ different
- Assumptions: same as one factor, plus Sphericity: variance of the differences are equal for all pairs of treatments: $\sigma^2_{1-2} = \sigma^2_{1-3} = \sigma^2_{2-3}$
- The subject effect explains variability that is left in the error of the one-way ANOVA
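The one-way ANOVA decomposition can be sketched in a few lines of plain Python. The three groups are hypothetical, and the function returns only the sums of squares and the f statistic; comparing f with $f_\alpha(k - 1, N - k)$ still requires an F table or a statistics library.

```python
def one_way_anova(groups):
    """Return SSA, SSE, their mean squares, and the f statistic."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    level_means = [sum(g) / len(g) for g in groups]

    # Between-level variation, weighted by the sample size of each level.
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, level_means))
    # Within-level variation: each data point against its own level mean.
    sse = sum((y - m) ** 2 for g, m in zip(groups, level_means) for y in g)

    msa = ssa / (k - 1)
    mse = sse / (n_total - k)
    return {"SSA": ssa, "SSE": sse, "MSA": msa, "MSE": mse,
            "f": msa / mse, "dof": (k - 1, n_total - k)}


# Hypothetical data: three levels with three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
result = one_way_anova(groups)
print(result)
```

Here the level means are 2, 3, and 6, so most of the variation is between levels and f is large (13 with 2 and 6 degrees of freedom).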
Simple Linear Regression

Variables:
- Response variable (Dependent): only 1
- Explanatory variable (Independent): can have multiple
- Deterministic Relationship: when response variable directly relates to explanatory variable, E.g. Celsius to Fahrenheit

Line of Best Fit:
- Simple Linear Regression: $Y = \beta_0 + \beta_1 X + \epsilon$ ($\beta_0$: Y intercept, $\beta_1$: slope, $\epsilon$: error RV)
- Fitted Regression: $\hat{y} = b_0 + b_1 x$, where $\hat{y}$ is the predicted value
- $b_0$ and $b_1$ are unbiased point estimators of $\beta_0$ and $\beta_1$
- n = sample size. More data -> better estimates

Residual / Error:
- $e_i = y_i - \hat{y}_i$ (observed value - predicted value)

Quality of fit (optimal line from residual values):
- $\sum e_i$: negative and positive values can cancel out and give a sum of zero even when the line is not a good fit
- $\sum e_i^2$: easier to analyze and compute compared to $\sum |e_i|$
- * Bigger value is worse

Ordinary Least Squares Method:
- $\min \sum e_i^2 = \sum (y_i - b_0 - b_1 x_i)^2$ given the points
- Derivative set to zero is used to find the optimal point
- $b_1 = \frac{S_{xy}}{S_{xx}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
- $b_0 = \bar{y} - b_1 \bar{x}$
- $S_{xx} = \sum (x_i - \bar{x})^2$, $S_{yy} = \sum (y_i - \bar{y})^2$, $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$
- Estimate of the variance of residuals: $s^2 = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2} = \frac{S_{yy} - b_1 S_{xy}}{n - 2}$

$\beta_1$ Confidence Interval (slope):
- CI for $\beta_1$: $b_1 \pm t_{\alpha/2, v} \frac{s}{\sqrt{S_{xx}}}$ where $v = n - 2$

$\beta_1$ Hypothesis Test (slope):
- H0: $\beta_1 = (\beta_1)_0$; Ha: $\beta_1 \ne (\beta_1)_0$
- Test Statistic: $t = \frac{b_1 - (\beta_1)_0}{s / \sqrt{S_{xx}}}$, $v = n - 2$

$\beta_0$ Confidence Interval (y-intercept):
- CI for $\beta_0$: $b_0 \pm t_{\alpha/2, v}\, s \sqrt{\frac{\sum x_i^2}{n S_{xx}}}$ where $v = n - 2$ (same as for CI of $\beta_1$)

$\beta_0$ Hypothesis Test (y-intercept):
- H0: $\beta_0 = (\beta_0)_0$; Ha: $\beta_0 \ne (\beta_0)_0$
- Test statistic: $t = \frac{b_0 - (\beta_0)_0}{s \sqrt{\sum x_i^2 / (n S_{xx})}}$
- Gives information on if there is a baseline value when X = 0

Analysis of Variance (ANOVA): only for $\beta_1$
- Easier to find SSR and SSE and is easier to implement with software
- Divide by degrees of freedom -> ratio of variance
- Hypothesis Test: H0: $\beta_1 = 0$; Ha: $\beta_1 \ne 0$
- Reject H0 if $f > f_\alpha(v_1 = 1, v_2 = n - 2)$. This is only an upper-tailed one-sided test
- Reject H0: $\beta_1 \ne 0$ with statistical significance (a true non-zero slope)
- Fail to reject H0: rethink if linear regression is appropriate for the data
- Rejecting H0 does not always guarantee a linear slope, E.g. sinusoidal data

Correlation:
- Population correlation coeff: $\rho = \frac{\mathrm{cov}(X, Y)}{\sigma_x \sigma_y}$, bounded between [-1, +1]
- Estimated Correlation Coeff: $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
- * As $|r| \to 1$: association between X and Y increases
- Sign of r tells the direction of association
- Might have same slope estimate but different r-values
- Coeff of Determination: $r^2 = 1 - \frac{SSE}{SST}$, from [0, +1]
- Proportion of variation on Y that is explained by the regression model
- Higher $r^2$ -> better fitted model
- Better at determining how good of a fit the model is compared to r since error is squared
- Correlation $\ne$ causation: a good $r^2$ and a slope does not always mean an increase in X -> increase in Y

Outlier Analysis:
- Anscombe's Quartet: 4 plots with same linear regression line, variances, and correlation, but that look very different -> must look at data
- Outlier: observation that is substantially different to other observations
- 1. Data entry / measurement error (E.g. typo when entering data) -> Remove / correct the outlier
- 2. Sampling error (not part of normal process) (E.g. selecting participants that are not a fit for the population of interest) -> Remove / correct the outlier
- 3. Natural variation (occurs by chance, not from an error) -> keep outlier but consider data transformation (log, 1/x)

Detect Outliers:
- 1. Fit data to a regression line $\hat{y} = b_0 + b_1 x$
- 2. Find St. dev. of residuals: $s_{residuals} = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$
- Standardized Residual: $SR_i = e_i / s$. Like the z-score for residuals. * Potential outlier: $|SR_i| > 3$
- Studentized Residual: $StR_i = e_i / s_{(i)}$, where $s_{(i)}$ is calculated without point i. * Potential outlier: $|StR_i| > 3$

Assumptions of Linear Regression:
- 1. Linearity: data should look reasonably linear. Check using scatter plot with X and Y
- 2. Normally Distributed Residuals / Errors ($e_i$): $\epsilon \sim N(0, \sigma^2)$. Check using goodness-of-fit test, plot data, Normal Q-Q plots / quantile plot
- 3. Constant Variance: variance around true points should not change drastically; similar variance regardless of x-value. Check using residual plot: residuals that don't follow a clear pattern
- 4. Independence: residuals are independent of one another with no trend. Check using residual vs observation order plot

Categorical variables (i.e. not numerical):
- Use Dummy / Indicator Variables
- # of categories n -> n - 1 Indicator Vars.
- E.g. Yeast Type = {A, B, C} -> $z_A$, $z_B$

  Yeast Type | $z_A$ | $z_B$
  A          | 1     | 0
  B          | 0     | 1
  C          | 0     | 0

- Default case is yeast C
- Model: $Y = \beta_0 + \beta_{pH}(pH) + \beta_A z_A + \beta_B z_B + \epsilon$, $\epsilon \sim N(0, \sigma^2)$, $v = n - k - 1$
- (H0)full: $\beta_{pH} = \beta_A = \beta_B = 0$ -> Reject since p-value < 0.05
- (H0)pH: $\beta_{pH} = 0$ -> Reject at $\alpha = 0.001$
- (H0)A: $\beta_A = 0$ -> Reject at $\alpha = 0.001$
- (H0)B: $\beta_B = 0$ -> Reject at $\alpha = 0.05$
- (H0)0: $\beta_0 = 0$ -> Reject at $\alpha = 0.001$
- Interpretation:
  - Yeast A -> $z_A = 1, z_B = 0$: $\hat{y} = -161.897 + 59.294(pH) + 89.998$
  - Yeast B -> $z_A = 0, z_B = 1$: $\hat{y} = -161.897 + 59.294(pH) + 24.166$
  - Yeast C -> $z_A = 0, z_B = 0$: $\hat{y} = -161.897 + 59.294(pH)$
- Changing yeast changes the Y-intercept
- * Interactions between categorical and numeric data will cause slope to change, E.g. $\beta(pH)(z_A) + ...$
- Two-factor experiment ANOVA: total DOF = abn - 1
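The ordinary least squares formulas for simple linear regression translate directly into code. The sketch below computes $S_{xx}$, $S_{yy}$, $S_{xy}$, the estimates $b_0$ and $b_1$, the residual variance $s^2$, $r^2$, and the standardized residuals used for outlier screening. The data points are hypothetical, and the function name is illustrative.

```python
import math


def fit_simple_regression(xs, ys):
    """Least squares fit of y = b0 + b1 * x using the Sxx / Sxy formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)
    s2 = sse / (n - 2)                 # estimated variance of residuals
    r2 = 1 - sse / syy                 # coefficient of determination
    standardized = [e / math.sqrt(s2) for e in residuals]
    return {"b0": b0, "b1": b1, "SSE": sse, "s2": s2, "r2": r2,
            "standardized_residuals": standardized}


# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
fit = fit_simple_regression(xs, ys)

print(round(fit["b0"], 4), round(fit["b1"], 4))
print(round(fit["s2"], 4), round(fit["r2"], 4))
# Flag potential outliers with |SR_i| > 3 (none in this small example).
print([i for i, sr in enumerate(fit["standardized_residuals"]) if abs(sr) > 3])
```

For this data the fitted line is $\hat{y} = 2.2 + 0.6x$ with $r^2 = 0.6$, and the residuals sum to zero, which illustrates why the squared residuals, not the raw ones, measure the quality of fit.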
Multiple Linear Regression

Model: K explanatory variables, # of $\beta$ coeffs = k + 1
- Multiple Linear Regression: $Y = \beta_0 + \beta_1 x_1 + ... + \beta_k x_k + \epsilon$
- Fitted Regression (predicted): $\hat{y} = b_0 + b_1 x_1 + ... + b_k x_k$
- Only one response var
- Data points: $\{(x_{1i}, x_{2i}, ..., x_{ki}, y_i)\}$, $i = 1, 2, ..., n$ where $n > k$; i is each data point (E.g. a person) and k is an explanatory variable (e.g. height, weight, ...)
- * Assume every explanatory variable (k) is independent
- Fitted Model: $\hat{y}_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + ... + b_k x_{ki}$
- Each $b_i$ is an unbiased estimate of $\beta_i$

Linear Model Assumptions:
- 1. Have a power of 1 for each term ($x_1, x_2, x_3, ...$)
  - $\hat{y} = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + ...$ is still considered a linear model since each term is added linearly
  - Can transform each term to get a linear model: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + ...$
  - A fit such as $\hat{y} = b_0 b_1^x$ is not a linear model
- 2. Have no interaction with each other (Independence)
  - $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2$ is still considered a linear model
  - Can transform each term to get a linear model: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3'$ where $x_3' = x_1 x_2$

Ordinary Least Squares Method:
- $\min \sum e_i^2 = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - (b_0 + b_1 x_{1i} + ... + b_k x_{ki}))^2 = SSE$
- General MLR Equation: $y = Xb + e$, where $y = [y_1, ..., y_n]^T$, X has rows $[1, x_{1i}, ..., x_{ki}]$, $b = [b_0, ..., b_k]^T$, and $e = [e_1, ..., e_n]^T$
- Goal is to find b that minimizes SSE: $b = (X^T X)^{-1} X^T y$ (use R to solve)

Variance Covariance Matrix: $C = (X^T X)^{-1}$
- Variance: $c_{ii}$ (diagonals); Covariance: $c_{ij}$
- Variance of $b_i$ = $c_{ii}\sigma^2$; covariance of $b_i, b_j$ = $c_{ij}\sigma^2$
- Variance of residuals: $s^2 = \frac{SSE}{n - k - 1} = \frac{\sum (y_i - \hat{y}_i)^2}{n - k - 1}$ (degrees of freedom: n - k - 1)

$\beta_i$ Confidence Interval:
- CI for $\beta_i$ is: $b_i \pm t_{\alpha/2, v}\, s\sqrt{c_{ii}}$ where $v = n - k - 1$

$\beta_i$ Hypothesis Test:
- H0: $\beta_i = (\beta_i)_0$; Ha: $\beta_i \ne (\beta_i)_0$
- Test statistic: $t = \frac{b_i - (\beta_i)_0}{s\sqrt{c_{ii}}}$, where $s\sqrt{c_{ii}}$ is the standard error

ANOVA: only for $\beta_1, ..., \beta_k$ (upper tail test)
- H0: $\beta_1 = \beta_2 = ... = \beta_k = 0$
- Ha: at least one $\beta_i \ne 0$
- Reject H0 -> model significant: at least one $\beta_i \ne 0$, $i = 1, ..., k$
- $v_1 = k$ (explanatory vars.), $v_2 = n - k - 1$

Quality of fit:
- Coeff of Determination: $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
- Gives the proportion of variability in data that can be explained by the regression model
- As # of explanatory variables increases: $R^2 \to 1$, because SSE decreases as SST is constant
- More variability gets captured -> SSR increases -> SSE decreases
- Adding more explanatory variables will push $R^2 \to 1$ even if the variables are not statistically significant
- Leads to an overfitted and overly complex model
- Adjusted $R^2$: $R^2_{adj} = 1 - \frac{SSE / (n - k - 1)}{SST / (n - 1)}$
- If the decrease in SSE as more vars are added does not match the decrease in DOF -> $R^2_{adj}$ decreases, meaning poorer fit

Full vs. Partial Model:
- Check if certain explanatory vars should be in the fitted model or not by forming partial models
- Use ANOVA on full and partial models to compare which is the better representation of the data
- * Look for: 1. Which $R^2_{adj}$ is higher 2. Which model is more simple
- E.g. full: $\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$; partial: $\hat{y} = \beta_0 + \beta_1 x_1 + \beta_3 x_3$

R output (E.g. 13 seeds are tested with 3 types of fertilizer):
- Estimate: $b_0, b_1, ..., b_k$
- Std. error: $SE_i$, used in CI: $b_i \pm t_{\alpha/2, v}\, SE_i$
- t-value: $t = (b_i - (\beta_i)_0) / SE_i$
- P-value: probability of statistic (from table)
- $x_1$ and $x_2$: significant at 0.001 level; $x_3$: not significant at 0.1 level
- Residual standard error: s ($s^2$ = residual variance)
- F-statistic: (H0)full: $\beta_1 = \beta_2 = \beta_3 = 0$ -> use the F-statistic to test this; it cannot be done with the t-statistics since each of those is a t-test, not an F-test
- $f(v_1 = 3, v_2 = 9) = 30.98$, p-value = $0.996 \times 10^{-…}$
- Since p-value < $\alpha$ = 0.001 we can reject (H0)full and say that the model is significant
- (H0)0, (H0)1, (H0)2: since p-value < 0.001: Reject. So $\beta_0, \beta_1, \beta_2$ are significantly non-zero coefficients
- (H0)3: fail to reject since p-value > 0.001: $\beta_3$ is likely a zero coefficient
- Partial Model: $x_3$ is not significant since its p-value is very high -> fail to reject H0: $\beta_3 = 0$ -> re-run with partial model $\hat{y} = b_0 + b_1 x_1 + b_2 x_2$:
  - Individual P-values are still significant (< 0.001)
  - Total P-value is even more significant (smaller)
  - Higher Adjusted $R^2$, lower Multiple $R^2$ because variability is explained by one fewer variable
  - * Should remove $x_3$ from the model

Assumptions of Multiple Linear Reg.:
- Linearity
- Normality of Residuals
- Constant variance
- Independence of Errors
- No Multicollinearity: explanatory variables should not be correlated to each other or else they will compete for the same statistical effect
- Matrix Plot: check for multicollinearity (should look random)
- Avoid correlation $|r| > 0.7$

Interaction Terms:
- * Interaction Term: E.g. $\beta_{12} x_1 x_2$ ...
- Non-Interaction Terms: Main Effect (ME) terms
- # of Interaction Terms: based on # of main effects
- Types: 1. First Order: 2-variables interaction 2. Second Order: 3-variables interaction
- E.g. $\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 + \beta_{123} x_1 x_2 x_3$ (ME terms, then first order, then second order)
- Do not fit an interaction term if its ME is not significant
- A significant ME may appear insignificant since interaction terms can lead to multicollinearity -> dilutes significance
- Picking to use a model with interaction terms still depends on high $R^2_{adj}$ value and complexity (interpretability)

RCBD ANOVA: same SSA, SSB, SSE, SST as Repeated Measures, with one test statistic for H01 and one for H02
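The difference between $R^2$ and adjusted $R^2$ is easy to see numerically. In the sketch below the sums of squares are invented: adding a fourth explanatory variable lowers SSE only slightly, so $R^2$ rises while adjusted $R^2$ falls. Only the formulas are standard; the function names and numbers are illustrative.

```python
def r_squared(sse, sst):
    """R^2 = 1 - SSE / SST."""
    return 1 - sse / sst


def adjusted_r_squared(sse, sst, n, k):
    """R^2_adj = 1 - (SSE / (n - k - 1)) / (SST / (n - 1))."""
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))


n = 13          # number of data points (hypothetical)
sst = 100.0     # total sum of squares, fixed for a given data set

# Hypothetical fits: a 3-variable model and a 4-variable model whose extra
# variable barely reduces the error sum of squares.
small = {"k": 3, "sse": 10.0}
large = {"k": 4, "sse": 9.9}

for model in (small, large):
    model["r2"] = r_squared(model["sse"], sst)
    model["r2_adj"] = adjusted_r_squared(model["sse"], sst, n, model["k"])
    print(model["k"], round(model["r2"], 4), round(model["r2_adj"], 4))
```

The larger model wins on $R^2$ (0.901 vs 0.900) but loses on adjusted $R^2$ (about 0.852 vs 0.867), which is the situation where the extra variable should be dropped.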
Multi-factor experiment

One Factor ANOVA with Repeated Measures:
- k = # of Treatments
- b = # of subjects or Blocks

Randomized Block Design:
- Account for variation from an additional factor
- Stratify by the additional factor
- Randomize subjects into treatments for the original factor
- Blocks are independent
- Subjects are randomly assigned treatments within a block
- * Randomized Complete Block Design (RCBD): assign each Treatment once to each Block

RCBD Assumptions:
- Equality of variances
- Normality
- No Outliers
- Independence of Observations
- * Factors have no interaction effect (assume factor 1 and 2 are independent)

Statistical Model: $Y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$
- $\mu$ = total mean
- $\alpha_i$ = effect of Treatment i on $\mu$
- $\beta_j$ = effect of Block j on $\mu$
- $\epsilon_{ij}$ = random noise
- H01: $\alpha_1 = ... = \alpha_k = 0$ -> Treatment has no effect
- Ha1: at least one $\alpha$ is different -> Treatment has some effect
- H02: $\beta_1 = ... = \beta_b = 0$ -> Block has no effect
- Ha2: at least one $\beta$ is different -> Block has some effect

Why Use RCBD:
- More variability can be explained since SSB takes away from SSE
- Increases Power: can get same conclusions with smaller sample size

Two-factor experiment:
- Account for interaction between factors
- Interaction: value of factor 1 influences value of factor 2 (positive or negative)

Two-Way ANOVA:
- a: # Levels in factor A
- b: # Levels in factor B
- $y_{ijk}$: observation k of combination (i, j): level i of factor A, level j of factor B
- n: # of observations in each (i, j) (n stays same)
- (a)(b): # of groups
- (a)(b)(n): # of total observations
- E.g. rooms: $n_i$: sample size of room i; $y_{ij}$: data points from room i; $\bar{y}_i$: mean of room i; $\bar{y}$: grand mean

Statistical Model: $Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}$
- $\alpha_i, \beta_j$: effects of factor A and B on $\mu$
- $(\alpha\beta)_{ij}$: interaction effect on $\mu$
- $\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$, $\sum_i (\alpha\beta)_{ij} = 0$, $\sum_j (\alpha\beta)_{ij} = 0$
- H0(A): $\alpha_1 = ... = \alpha_a = 0$; Ha(A): at least one $\alpha$ isn't equal
- H0(B): $\beta_1 = ... = \beta_b = 0$; Ha(B): at least one $\beta$ isn't equal
- H0(AB): $(\alpha\beta)_{1,1} = ... = (\alpha\beta)_{a,b} = 0$; Ha(AB): at least one $(\alpha\beta)_{i,j}$ isn't equal
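The claim that blocking takes variability away from the error term can be checked with a small RCBD table. The sketch below splits the total sum of squares into treatment (SSA), block (SSB), and error (SSE) parts; the table values are hypothetical, and the error degrees of freedom $(k - 1)(b - 1)$ are the standard RCBD choice, which the sheet does not spell out.

```python
def rcbd_sums_of_squares(table):
    """table[i][j] = response for treatment i in block j (one value per cell)."""
    k = len(table)          # number of treatments
    b = len(table[0])       # number of blocks
    grand_mean = sum(sum(row) for row in table) / (k * b)
    treatment_means = [sum(row) / b for row in table]
    block_means = [sum(table[i][j] for i in range(k)) / k for j in range(b)]

    ssa = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssb = k * sum((m - grand_mean) ** 2 for m in block_means)
    sst = sum((y - grand_mean) ** 2 for row in table for y in row)
    sse = sum(
        (table[i][j] - treatment_means[i] - block_means[j] + grand_mean) ** 2
        for i in range(k) for j in range(b)
    )
    mse = sse / ((k - 1) * (b - 1))
    f_treatment = (ssa / (k - 1)) / mse
    return {"SSA": ssa, "SSB": ssb, "SSE": sse, "SST": sst,
            "f_treatment": f_treatment}


# Hypothetical data: 3 treatments (rows) by 2 blocks (columns).
table = [[1, 3],
         [2, 4],
         [7, 7]]
ss = rcbd_sums_of_squares(table)
print({name: round(value, 4) for name, value in ss.items()})
```

Without blocking, SSB would be folded into the error term (SSE + SSB instead of SSE), giving a larger error and a smaller f statistic for the treatments.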

