Probability and Statistics
a<ol2023
Probability Theory
* Applications to Statistics
Statistics is a collection of methods which helps us to describe, summarize, interpret and analyse the data.
It is used to research:
1) How medical interventions help us to reduce the burden of certain viral diseases.
2) How personality relates to decision making.
3) Whether a new fertilizer increases the yield of a crop.
Random Experiment:
An experiment in which the outcome is not certain. There is no single outcome, but there is a finite number of possibilities that may happen.
e.g. Tossing of a coin
Non-random experiment:
An experiment in which the outcome is certain.
Single outcome.
e.g. H2 + O → H2O (water)
Population -
The collection of all units in the experiment is called the population.
It is denoted by Ω.
Note: Units in an experiment are all possible outcomes of the experiment.
e.g. We are interested in collecting information on the students participating in a statistics course.
Variable
If we have specified the population of interest for a specific research question, then we can think of the characteristics of our observations.
A particular feature of these observations is called a variable.
e.g. Consider this population:
1) S = {people of a city}
where X1 is gender
2) {30, 106}
where X2 is the age of the person
3) {431602, 411006}
where X3 is the pin code
4) {Black, Brown, Gray}
where X4 is the eye colour
5) X5
where X5 is the no. of vehicles
e.g. colour of the car,
i.e. X(ω) = colour: white, gray, ...
Qualitative Variable
Nominal Variable -
Categorical variable:
If the data is available in grouped form, we call the respective variable capturing this information a categorical variable or grouped variable.
e.g. status of an application
[yes/no kind of]
Every binary variable is a categorical variable, but the converse is not true.
Test. o0e markg of studert Ais student in
c|ass.
Binary variable -
It is a categorical variable which can take only two values.
2) Quantitative variables can be both discrete and continuous.
True
3) Discrete variables are always quantitative.
False
e.g. eye colour
4) Nominal variables are always qualitative and discrete.
True
5) Continuous variables are always quantitative.
True
6) Categorical variables can be both qualitative or quantitative.
True
7) Categorical variables are never continuous.
False
e.g. height of a person (in categories)
Representation of data -
1) Bar diagram
2) Histogram
3) Pie chart / pie diagram
Histogram -
Categorise the data into different groups.
Plot a bar for each category with the height $h_j$, where
$h_j = f_j / d_j$
($f_j$ is the relative frequency and $d_j$ the width of class j).
Consider the following data of marks gained by 20 people in an examination out of 100:
28, 35, 42, 90, 70, 56, 75, 66, 30, 89, 75, 64, 81, 69, 55, 83, 72, 68, 79, 16.
Draw a histogram.
An absolute frequency histogram or a relative frequency histogram can be drawn.
Dividing the data into 5 categories:

Class interval   Absolute frequency   f_j     h_j
0-20             1                    1/20    1/(19×20)
20-40            3                    3/20    3/(19×20)
40-60            3                    3/20    3/(19×20)
60-80            9                    9/20    9/(19×20)
80-100           4                    4/20    4/(19×20)
Total            20
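Pie chart: the angle of sector j is $f_j \times 360$ degrees.
for e.g. 1:
$f_1 \times 360 = \frac{1}{20} \times 360 = 18$
for e.g. 2:
$f_2 \times 360 = \frac{3}{20} \times 360 = 54$
$\frac{3}{20} \times 360 = 54$
$\frac{9}{20} \times 360 = 162$
$\frac{4}{20} \times 360 = 72$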
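The frequency table and the pie-chart angles can be checked with a short script. This is only a sketch: the marks are typed in as they were read from the notes (a few digits were hard to read), and the convention that class 0 is (0, 20], class 1 is (20, 40] and so on is an assumption, as are all the variable names.

```python
# Marks as read from the notes; some digits were unclear, so treat the list as an assumption.
marks = [28, 35, 42, 90, 70, 56, 75, 66, 30, 89,
         75, 64, 81, 69, 55, 83, 72, 68, 79, 16]

CLASS_WIDTH = 20
N_CLASSES = 5


def class_index(mark):
    """Class 0 is (0, 20], class 1 is (20, 40], ... (assumed boundary convention)."""
    return max(0, min((mark - 1) // CLASS_WIDTH, N_CLASSES - 1))


# Absolute frequency of each class.
counts = [0] * N_CLASSES
for m in marks:
    counts[class_index(m)] += 1

n = len(marks)
relative = [c / n for c in counts]        # relative frequencies f_j
angles = [c * 360 / n for c in counts]    # pie-chart angles f_j * 360

for j, (c, f, a) in enumerate(zip(counts, relative, angles)):
    low, high = j * CLASS_WIDTH, (j + 1) * CLASS_WIDTH
    print(f"{low:>3}-{high:<3}  n_j={c}  f_j={f:.2f}  angle={a:.0f}")
```

The angles add up to 360, which is a quick sanity check on the frequencies.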
Note 1:
When variables consist of a large number of different values, no bar chart will be useful; we should choose a histogram over a bar chart.
Note 2:
A bar chart is useful when the no. of categories is not too large.
Note 3:
A pie chart is used to visualise absolute and relative frequencies of nominal and ordinal variables.
Measures of Central Tendency and Dispersion:
Arithmetic Mean (Average)
Median
Quantiles
Mode
Variance
Standard deviation
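Arithmetic Mean -
Suppose the given data is of size n, represented by $x_1, x_2, \ldots, x_n$; then the arithmetic mean of this data is denoted by $\bar{x}$:
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Properties -
1) The sum of the deviations of each variable about the arithmetic mean is zero.
Justify: $\sum_{i=1}^{n} (x_i - \bar{x}) = \sum_{i=1}^{n} x_i - n\bar{x} = n\bar{x} - n\bar{x} = 0$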
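Median -
To calculate the median of the data, arrange the given data in increasing order.
The median is defined as follows -
Notation for the median: $\tilde{x}_{0.5}$
$\tilde{x}_{0.5} = x_{((n+1)/2)}$ if n is odd
$\tilde{x}_{0.5} = \frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right)$ if n is even
e.g. 1) 2, 3, 4, 5, 6
e.g. 2) 2, 3, 4, 5, 6, 7, n = 6
Median $= \frac{4 + 5}{2} = 4.5$
The median divides the observations into two equal parts: at least 50% of the values are greater than or equal to the median, and at least 50% of the values are less than or equal to the median.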
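The odd/even rule for the median can be sketched in a few lines. The function name `median_of` is illustrative, not taken from the notes.

```python
def median_of(data):
    """Median by the rule above: middle value if n is odd, mean of the two middle values if n is even."""
    xs = sorted(data)          # arrange the data in increasing order
    n = len(xs)
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]                     # x_((n+1)/2), 0-based index mid
    return (xs[mid - 1] + xs[mid]) / 2     # (x_(n/2) + x_(n/2+1)) / 2


print(median_of([2, 3, 4, 5, 6]))      # n odd
print(median_of([2, 3, 4, 5, 6, 7]))   # n even
```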
Quantiles -
LA
= 5
18.5
th
e.g. Find the 0.25 quantile of the data (25%).
Arrange the data in increasing order:
9, 22, 23, 24, 28, 29, 30, 80, 80, 81
$n\alpha = 0.25 \times 10 = 2.5$
$n\alpha$ is not an integer, so k = 3
(smallest integer greater than 2.5)
$\tilde{x}_{0.25} = x_{(3)} = 23$
The median is the 50% quantile of the data.
If $\alpha \times 10$ is an integer, the quantiles are called deciles (dividing the data into 10 parts).
If $\alpha \times 100$ is an integer, the quantiles are called percentiles, because the data is divided into 100 equal parts.
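The quantile rule used in the example can be written as a small function. This is a sketch under two assumptions: when $n\alpha$ is an integer the quantile is taken as the mean of $x_{(n\alpha)}$ and $x_{(n\alpha+1)}$ (the usual textbook convention, not legible in the notes), and the first data value is read as 9.

```python
import math


def quantile(data, alpha):
    """alpha-quantile (0 < alpha < 1) by the ordered-data rule used in the example."""
    xs = sorted(data)
    n = len(xs)
    k = n * alpha
    if abs(k - round(k)) < 1e-9:
        # n*alpha is an integer: average x_(n*alpha) and x_(n*alpha + 1) (assumed convention)
        k = int(round(k))
        return (xs[k - 1] + xs[k]) / 2
    # otherwise take x_(k) with k the smallest integer greater than n*alpha
    return xs[math.ceil(k) - 1]


data = [9, 22, 23, 24, 28, 29, 30, 80, 80, 81]  # as read from the notes
print(quantile(data, 0.25))  # n*alpha = 2.5, so x_(3)
print(quantile(data, 0.5))   # the median, as the 50% quantile
```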
e,ast pohahility thoony ?
frponlnent
5. Simple event:
It is an event which contains only one element of a sample space.
e.g. Consider S to be S = {Heart, Club, Spade, Diamond}
Consider
A = {a card of king of diamond}
A is the simple event.
Compound Event
It is one that can be expressed as a union of the simple events.
e A is the conipound event of wstat S2
Empty space / Null space:
A subset of the sample space containing no element is called a null space or empty space ($\phi$).
e.g. 1) Consider the event of detecting a micro-organism with the naked eye.
A o?
e2) is a set of al such Prime no of fator
thet,is non-
Intersection of A and B: $A \cap B \subset S$
Union of A and B: $A \cup B$
Complement of an event A: $A'$ or $A^c$
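${}^{n}P_{r} = \frac{n!}{(n-r)!}$
${}^{n}C_{r} = \frac{n!}{(n-r)!\,r!}$
e.g. Two lottery tickets are to be chosen out of 20 for first and second prize. How many ways are possible?
|S| = total no. of sample points in S
$= {}^{20}P_{2} = \frac{20!}{(20-2)!} = \frac{20!}{18!} = 20 \times 19 = 380$
Therefore, 380 ways are possible.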
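The lottery count can be verified with the standard library (`math.perm` and `math.comb` need Python 3.8 or later). The comparison with the unordered count is added only for contrast.

```python
import math

# Ordered choice of 2 tickets out of 20 (first prize, second prize): 20P2
ways_ordered = math.perm(20, 2)

# Same value from the factorial formula n! / (n - r)!
ways_formula = math.factorial(20) // math.factorial(20 - 2)

# For contrast: if the two prizes were identical, order would not matter (20C2)
ways_unordered = math.comb(20, 2)

print(ways_ordered, ways_formula, ways_unordered)
```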
Theorem 1:
The number of distinct permutations of n objects, out of which $n_1$ objects are of one kind, $n_2$ objects are of a second kind, ..., and $n_k$ objects are of the kth kind, is
$\frac{n!}{n_1!\, n_2! \cdots n_k!}$
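As an illustration of Theorem 1, the sketch below counts the distinct arrangements of the letters of a word. The word "STATISTICS" and the function name are hypothetical examples, not from the notes.

```python
import math
from collections import Counter


def distinct_permutations(objects):
    """n! / (n_1! n_2! ... n_k!) where n_j is the number of objects of kind j."""
    total = math.factorial(len(objects))
    for count in Counter(objects).values():
        total //= math.factorial(count)   # each partial quotient is still an integer
    return total


# S appears 3 times, T 3 times, I 2 times, A and C once each: 10! / (3! 3! 2! 1! 1!)
print(distinct_permutations("STATISTICS"))
```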
Nouw et b = C =
1)
c s fA0g
Let A and B be two events with $A \cap B = \phi$,
i.e. both events cannot occur simultaneously; then A and B are called
disjoint events.
e.g. Tossing a die
Complete decomposition of S:
Subsets of the sample space are events.
$A_1, A_2, \ldots, A_m$ is called a complete decomposition of S if the union $A_1 \cup A_2 \cup \cdots \cup A_m = S$ and $A_i \cap A_j = \phi$ for $i \ne j$.
e.g. Consider an experiment of tossing a coin.
S = {H, T}
Event $A_1$ = {H} $\subset$ S (S has 4 subsets)
$A_2$ = {T} $\subset$ S
Consider the collection of events $A_1, A_2$.
{$A_1, A_2$} is a complete decomposition of S.
e.g. Tossing a die.
$A_1$: getting a non-prime factor of 7, i.e. {1}
$A_2$: getting an odd number, i.e. {1, 3, 5}
{$A_1, A_2$} is not a complete decomposition of S as a sample space.
Probability of an event:
Let an experiment have possible events $A_1, A_2, \ldots, A_m$. If the experiment is repeated n times, we can count how many times each possible event has occurred.
$n(A_i)$ = no. of times $A_i$ has occurred in n trials
Then the relative frequency is given by:
$f(A_i) = \frac{n(A_i)}{n}$
Probability of $A_i$: $P(A_i) = \lim_{n \to \infty} \frac{n(A_i)}{n}$
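Axiomatic definition
of probability, given by Kolmogorov:
Axiom 1: Every random event A has its probability in the closed interval [0, 1], i.e. $0 \le P(A) \le 1$.
Axiom 2: The sure event S has probability 1, i.e. $P(S) = 1$.
Axiom 3: If $A_1$ and $A_2$ are disjoint events, then $P(A_1 \cup A_2) = P(A_1) + P(A_2)$.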
by hziorn 143
tlote'
hpplying probability to beth sidt,
by
So:
iii) Probability that he misses coffee (trip time X is normal with mean 24 and standard deviation 3.8):
$P(X > 25) = 1 - P(X < 25) = 1 - P\left(Z < \frac{25 - 24}{3.8}\right) = 1 - P(Z < 0.26)$
$= 1 - 0.6026 = 0.3974$
iv) Find the length of time above which we find the slowest 15% of the trips.
$P(X > x) = 0.15$
$1 - P(X < x) = 0.15$, so $P(X < x) = 0.85$
From the table, $z = 1.04$.
$x = \mu + \sigma z = 24 + 3.8 \times 1.04 = 27.952$
v) Success = a trip taking at least half an hour
Prob(Success) = p = 0.0571
X: no. of trips, out of the next 3 trips, with success (binomial random variable)
$P(X = 2) = {}^{3}C_{2}\,(0.0571)^2 (1 - 0.0571)$
$= 3 \times (0.0571)^2 \times (0.9429)$
$P(X = 2) = 0.009222$
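The normal-distribution parts of this problem can be cross-checked without tables using `statistics.NormalDist` from the standard library (Python 3.8+). This is a sketch: the mean 24 and standard deviation 3.8 are the values that appear in the working, p = 0.0571 is copied from the notes rather than recomputed, and small differences from the table-based answers come from rounding z to two decimals.

```python
from math import comb
from statistics import NormalDist

trip = NormalDist(mu=24, sigma=3.8)   # trip time in minutes, values from the working

# Probability that a trip takes more than 25 minutes (missing coffee)
p_miss_coffee = 1 - trip.cdf(25)

# Time above which the slowest 15% of trips lie: P(X < x) = 0.85
x_slowest_15 = trip.inv_cdf(0.85)

# Binomial part: 2 successes in the next 3 trips, with p taken from the notes
p_success = 0.0571
p_two_of_three = comb(3, 2) * p_success**2 * (1 - p_success)

print(round(p_miss_coffee, 4), round(x_slowest_15, 3), round(p_two_of_three, 5))
```

The exact quantile comes out slightly below 27.952 because the table lookup rounds z up to 1.04.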
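$P(X < 3) = 1 - e^{-3/4} = 0.5276$
Now,
n = 6
Success: getting served in less than 3 min
Y: no. of days with success
$P(Y \ge 4)$
$= 1 - P(Y < 4)$
$= 1 - P(Y \le 3)$
$= 1 - \sum_{x=0}^{3} b(x; 6, 0.5)$ (taking $p \approx 0.5$ from the binomial table)
$= 1 - 0.6563$
$P(Y \ge 4) = 0.3437$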
The number of calls received per hour by a telephone answering service is a Poisson r.v. X with $\lambda = 6$. The time in hours, Y, between calls is exponentially distributed. Find the probability that the time between 2 calls is greater than 15 min.
X: Poisson random variable
$\lambda = 6$
Y: time between 2 calls (exponential r.v.)
P(Y > 15 min)
$P(Y > 15) = e^{-15/10}$ (mean time between calls = 60/6 = 10 min)
$= 0.2231$
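Both exponential calculations above can be reproduced with the standard library. The sketch assumes, as the working suggests, a mean service time of 4 minutes in the first problem; it also shows the binomial sum with the exact p = 0.5276 next to the table value obtained with p rounded to 0.5. The helper name `binom_pmf` is illustrative.

```python
from math import comb, exp


def binom_pmf(k, n, p):
    """b(k; n, p) = nCk * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Exponential service time with mean 4 minutes (assumed from the working e^(-3/4))
p_served = 1 - exp(-3 / 4)                       # P(X < 3)

# At least 4 successes in 6 days
p_at_least_4_exact = sum(binom_pmf(k, 6, p_served) for k in range(4, 7))
p_at_least_4_table = sum(binom_pmf(k, 6, 0.5) for k in range(4, 7))  # p rounded to 0.5

# Calls: Poisson rate 6 per hour, so the gap between calls is exponential with rate 6 per hour
rate_per_hour = 6
p_gap_over_15_min = exp(-rate_per_hour * 15 / 60)  # P(Y > 1/4 hour)

print(round(p_served, 4), round(p_at_least_4_exact, 4),
      round(p_at_least_4_table, 4), round(p_gap_over_15_min, 4))
```

Rounding p to 0.5 makes the table lookup easy but shifts the answer noticeably, so the exact value is worth keeping alongside it.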
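Distribution of $\bar{X}$:
Let
$X \sim N(\mu, \sigma^2)$
$X_i = X_1, X_2, \ldots, X_n$
$\bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}$
$E(\bar{X}) = \mu$
$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$
Conclusion:
$\bar{X}$ is a normal r.v. with mean $\mu$ and variance $\sigma^2/n$; therefore, the standard deviation of $\bar{X}$ is $\sigma/\sqrt{n}$.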
(Aa)
X+ X2
tben
Definition:
Any function / combination of random variables is called a statistic.
$\bar{X}$ is a statistic.
+2
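Properties of the $\chi^2$ distribution -
Theorem -
Consider two independent random variables which follow $\chi^2$ distributions with m and n degrees of freedom; then their sum follows a $\chi^2$ distribution with m + n degrees of freedom.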
Imp erample of dístribytion e onsiqered the
Sqmple Var(s
Sample
Sqmple rgiance cs Sarople
paramefer
fopy lotion =
ECs)
populatior standard devíation
populaign variance
318|3]23
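$\sum_{i=1}^{n} (x_i - \mu)^2$
Consider,
$(x_i - \mu)^2 = \left[(x_i - \bar{x}) + (\bar{x} - \mu)\right]^2$
$= (x_i - \bar{x})^2 + (\bar{x} - \mu)^2 + 2(\bar{x} - \mu)(x_i - \bar{x})$
Summing over i:
$\sum (x_i - \mu)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2 + 2(\bar{x} - \mu)\sum (x_i - \bar{x})$
But $\sum (x_i - \bar{x}) = \sum x_i - n\bar{x} = 0$.
Therefore, this implies
$\sum (x_i - \mu)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2$
Divide throughout by $\sigma^2$:
$\sum \left(\frac{x_i - \mu}{\sigma}\right)^2 = \frac{(n-1)S^2}{\sigma^2} + \left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right)^2$
Rearranging:
$\frac{(n-1)S^2}{\sigma^2} = \sum \left(\frac{x_i - \mu}{\sigma}\right)^2 - \left(\frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\right)^2$
where $(n-1)S^2 = \sum (x_i - \bar{x})^2$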
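The sum-of-squares identity behind this derivation, $\sum (x_i - \mu)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2$, is easy to check numerically. The data values and $\mu$ below are hypothetical, chosen only for illustration.

```python
data = [2, 4, 4, 5, 7]   # hypothetical observations
mu = 3                   # hypothetical population mean

n = len(data)
xbar = sum(data) / n

lhs = sum((x - mu) ** 2 for x in data)                          # sum of (x_i - mu)^2
rhs = sum((x - xbar) ** 2 for x in data) + n * (xbar - mu) ** 2  # decomposition

# The cross term vanishes because the deviations about the mean sum to zero
cross = sum(x - xbar for x in data)

print(lhs, round(rhs, 10), round(cross, 10))
```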
t- random varlable
)
find
794)4
.fC4&2 <? I0194 )
<49-176
r('>84) - r(X'
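t-distribution:
Definition:
Let X and Y be two independent random variables such that $X \sim N(0, 1)$ and $Y \sim \chi^2_n$;
then the ratio
$T = \frac{X}{\sqrt{Y/n}}$
is a t-random variable, which follows the t-distribution with n degrees of freedom.
Therefore,
if $\sigma$ is unknown,
$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$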
o 975
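F-distribution
Theorem
Ratio of 2 sample variances
(reduce 1 from each sample size)
$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{m-1,\,n-1}$
where $S_1^2$ and $S_2^2$ are the variances of independent samples of sizes m and n from normal populations with variances $\sigma_1^2$ and $\sigma_2^2$.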
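Sampling distributions
A manufacturer of car batteries guarantees that his batteries will last on an average 3 years with a standard deviation of 1 year. If 5 of these batteries have lifetimes of 1.9, 2.4, 3.0, 3.5 and 4.2 years, is the manufacturer still convinced that his batteries have a standard deviation of 1 year? Assume that battery lifetime is normally distributed.
$\bar{x} = \frac{1.9 + 2.4 + 3.0 + 3.5 + 4.2}{5} = 3$
Sample variance:
$s^2 = \frac{(1.9-3)^2 + (2.4-3)^2 + (3.0-3)^2 + (3.5-3)^2 + (4.2-3)^2}{5-1} = 0.815$
$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{4 \times 0.815}{1} = 3.26$, with $n - 1 = 4$ degrees of freedom
Otherwise, we can test the manufacturer's claim:
Rejection area: 5%
$\chi^2_{0.05} = 9.488$
Since 3.26 < 9.488, the computed value does not fall in the rejection area, so the data do not contradict the manufacturer's claim.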
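The battery calculation can be reproduced as follows. The critical value 9.488 is simply copied from the notes (chi-square table, 4 degrees of freedom, upper 5%); it is not computed here, and the variable names are illustrative.

```python
from statistics import mean, variance

lifetimes = [1.9, 2.4, 3.0, 3.5, 4.2]   # years, from the problem
sigma0_sq = 1.0                          # claimed variance (standard deviation of 1 year)

n = len(lifetimes)
xbar = mean(lifetimes)
s2 = variance(lifetimes)                 # sample variance, divisor n - 1
chi2 = (n - 1) * s2 / sigma0_sq          # chi-square statistic with n - 1 degrees of freedom

critical = 9.488                         # table value quoted in the notes
in_rejection_area = chi2 > critical

print(round(xbar, 3), round(s2, 3), round(chi2, 2), in_rejection_area)
```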
-to.os
S = 40
oX
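We want to know which distribution to follow when checking a claim about $\mu$:
Case i) $\sigma$ is known; then
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
Case ii) $\sigma$ is unknown; then
$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$
$\bar{X}$ is an unbiased estimator of $\mu$:
$E(\bar{X}) = \mu$
# To prove: $S^2$ is an unbiased estimator of $\sigma^2$.
Recall:
$\mathrm{Var}(\bar{X}) = E(\bar{X} - \mu)^2 = \frac{\sigma^2}{n}$
$\sum (x_i - \mu)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2$
Divide throughout by (n-1):
$\frac{\sum (x_i - \mu)^2}{n-1} = S^2 + \frac{n(\bar{x} - \mu)^2}{n-1}$
$E(S^2) = E\left[\frac{\sum (x_i - \mu)^2}{n-1}\right] - \frac{n}{n-1}\,E(\bar{x} - \mu)^2 = \frac{n\sigma^2}{n-1} - \frac{n}{n-1}\cdot\frac{\sigma^2}{n}$
$E(S^2) = \sigma^2$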
Testing of Hypothesis
In this procedure of testing of hypothesis, we examine whether a statement about a parameter, or a given research hypothesis, is true or false, i.e. we decide whether to accept the given hypothesis or to reject it.
One sample problem:
If we use only one type of sample to compare a sample parameter with some hypothesized value of the parameter, it is called a one sample problem.
The data originates as one sample from a well-defined population.
Two sample problem:
Sol: Let,
$\sigma_1^2$: variance for the men's group
$\sigma_2^2$: variance for the women's group
Hypothesis to check: $\sigma_1^2 > \sigma_2^2$
Alternative hypothesis $H_1$:
The research hypothesis which we want to check or to prove is called the alternative hypothesis.
It is denoted by $H_1$.
$H_0$ is false and is rejected: correct decision
Type II error -
The hypothesis $H_0$ is false but it is accepted; then we say that we have made a type II error.
e.g. Suppose an allergist is willing to test the hypothesis that at least 30% of the public is allergic to some cheese product. Explain how the allergist could make a type II error.
Tut-4
A large manufacturing firm is being charged with discrimination in its hiring practices.
a) What hypothesis is being tested if a jury commits a type I error by finding the firm guilty?
Sol: Given,
Type I error by finding guilty.
$H_0$: firm is not guilty
$H_1$: firm is guilty
b) What hypothesis is being tested if a jury commits a type II error by finding the firm guilty?
Given,
Type II error by finding guilty.
$H_0$: firm is guilty
$H_1$: firm is not guilty
’ttest
Testog
of
(TOH)
’-testv
f- test
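Significance level $\alpha$:
It is the probability of a type I error,
i.e. $\alpha$ = prob(type I error)
= prob($H_0$ is true but it is rejected)
$\alpha = P[H_1 \mid H_0]$
$\beta$ is the probability of a type II error,
i.e. $\beta$ = prob(type II error)
= P[$H_0$ is false, but is accepted]
$\beta = P[H_0 \mid H_1]$
(here $P[H_a \mid H_b]$ denotes the probability of accepting $H_a$ when $H_b$ is true)
Power of test
= 1 - P(type II error)
= $1 - P[H_0 \mid H_1]$
= $P[H_1 \mid H_1]$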
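It is the probability of making the correct decision, i.e. in favour of the research hypothesis $H_1$ if $H_1$ is true;
it is the probability of detecting a correct research hypothesis.
In general, we follow the following steps to test the hypothesis:
Step 1: Choose $H_0$ and $H_1$ appropriately.
Note that $H_1$ is generally the statement we want to prove, and $H_0$ is the alternative to $H_1$, stated in terms of equality.
Step 2: Choose a fixed value of $\alpha$
(generally it is given as 5% or 1%).
Step 3: Choose the test statistic appropriately, and the critical region based on $\alpha$.
Confidence interval:
$\chi^2$, F - not symmetric
Z, t - symmetric
F-test
$F = \frac{S_1^2}{S_2^2}$
The F test is used when we have a two sample problem.
Two tailed: $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 \ne \sigma_2^2$
One tailed (right tailed): $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 > \sigma_2^2$
One tailed (left tailed): $H_0: \sigma_1^2 = \sigma_2^2$ against $H_1: \sigma_1^2 < \sigma_2^2$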
Test the bi patbesis that '= 6 agains the aten
alh
(s-3)
$- (G-1)l+32
(5.3)2
notheroay '
Ho 6 =6
2.
(thoo tailed )
)
3) fvaue remaîos Same
( i : is one tailed two tailed )
Critieal regon:
dost haxe ftable for o.00s
ontos -So chanqe the K
take
Hence
f>foo5,
Sogs95 o13) 1
Galical
Therefore, J'32 cHtical region. Heoce, we fa |
o ejet Ho
Z- -
7l-3 70
Critca egio
1Zegs
Hence Z 65 CriHcal regioo
As 7=)022 E
-therefore, jec Ho.
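1) $H_0: \mu = 10$
2) $H_1: \mu > 10$
3) $t = \frac{\bar{x} - 10}{0.03/\sqrt{10}} = 3.1622$
4) Critical region: $t > t_{\alpha}$
5) As $t_{\alpha} = 2.821$,
$3.1622 \in$ critical region.
Hence, reject $H_0$.
Also, with $t > t_{\alpha}$ and $t_{0.05} = 1.833$, the value t = 3.1622 lies in the critical region.
Hence we reject $H_0$.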
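$\chi^2$-test - To make inferences about the population standard deviation.
Test statistic:
$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$
Critical region:
Two tailed: $H_0: \sigma^2 = \sigma_0^2$ against $H_1: \sigma^2 \ne \sigma_0^2$
One tailed (left tailed): $H_0: \sigma^2 = \sigma_0^2$ against $H_1: \sigma^2 < \sigma_0^2$
One tailed (right tailed): $H_0: \sigma^2 = \sigma_0^2$ against $H_1: \sigma^2 > \sigma_0^2$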
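e.g. 1) Relation between age and body weight with distance jumped (multivariate)
e.g. 2) Recovery time of patients depends upon many factors, e.g. weight, haemoglobin, sugar level, body temp. (multivariate)
If x increases then y decreases.
Scatter plot
Choose $\alpha$ and $\beta$ to minimise the sum of squared errors:
$\min \sum e_i^2$
$S = \min \sum (y_i - \alpha - \beta x_i)^2$
($x_i$ and $y_i$ are constants (given); $\alpha$ and $\beta$ are parameters)
$\frac{\partial S}{\partial \alpha} = -2\sum (y_i - \alpha - \beta x_i) = 0 \Rightarrow \sum y_i = n\alpha + \beta\sum x_i$   (I)
$\frac{\partial S}{\partial \beta} = -2\sum x_i (y_i - \alpha - \beta x_i) = 0 \Rightarrow \sum x_i y_i = \alpha\sum x_i + \beta\sum x_i^2$   (II)
Eq. (I) and (II) can be solved simultaneously to find $\alpha$ and $\beta$.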
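Given
Sample size = 4
$\sum y_i = 22$, $\sum x_i = 6$, $\sum x_i y_i = 48$, $\sum x_i^2 = 14$
$\sum y_i = n\alpha + \beta\sum x_i$, i.e.
$22 = 4\alpha + 6\beta$   (I)
$\sum x_i y_i = \alpha\sum x_i + \beta\sum x_i^2$, i.e.
$48 = 6\alpha + 14\beta$   (II)
Solving: multiply (I) by 6 and (II) by 4,
$132 = 24\alpha + 36\beta$   (III)
$192 = 24\alpha + 56\beta$   (IV)
Subtracting (III) from (IV): $20\beta = 60$, so $\beta = 3$, and then $\alpha = 1$ from (I).
Hence
the regression line is $Y = 1 + 3x$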
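The two normal equations can be solved in closed form (Cramer's rule) from the four sums. This is a sketch: the sums $\sum x_i = 6$ and $\sum x_i^2 = 14$ are inferred from the elimination steps rather than clearly legible in the notes, and the function name is illustrative.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_xy, sum_x2):
    """Solve  sum_y  = n*alpha     + beta*sum_x
              sum_xy = alpha*sum_x + beta*sum_x2   for (alpha, beta)."""
    det = n * sum_x2 - sum_x**2            # determinant of the 2x2 system
    alpha = (sum_y * sum_x2 - sum_x * sum_xy) / det
    beta = (n * sum_xy - sum_x * sum_y) / det
    return alpha, beta


alpha, beta = solve_normal_equations(n=4, sum_x=6, sum_y=22, sum_xy=48, sum_x2=14)
print(f"Y = {alpha:g} + {beta:g} x")
```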
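In general,
$\hat{\beta} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2} = \frac{S_{xy}}{S_{xx}}$, $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$
where
$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$
$S_{xx} = \sum (x_i - \bar{x})^2$
$S_{yy} = \sum (y_i - \bar{y})^2$
$\hat{y} = \hat{\alpha} + \hat{\beta}x$ is called the fitted regression line or fitted model.
When $\hat{\beta}$ is negative, as x increases y decreases, and we say that x and y have a negative relation between them.
If $\hat{\beta}$ is positive, as x increases y also increases, and we say that the relation between x and y is positive.
Residual:
Let
$(x_i, y_i)$ be an observed data point and
$(x_i, \hat{y}_i)$ the point predicted by the regression line.
Then
the difference $(y_i - \hat{y}_i)$ is called the residual.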
e.g.
x: 24, 35, 64, 20, 33
y: 90, 65, 30, 60, 60
n = 5

x_i    y_i    (x_i - x̄)   (y_i - ȳ)   (x_i - x̄)(y_i - ȳ)   (x_i - x̄)²
24     90     -11.2        29          -324.8                125.44
35     65      -0.2         4            -0.8                  0.04
64     30      28.8       -31          -892.8                829.44
20     60     -15.2        -1            15.2                231.04
33     60      -2.2        -1             2.2                  4.84
Σ 176  305                             -1201                1190.8

Here
$\bar{x} = 176/5 = 35.2$, $\bar{y} = 305/5 = 61$
and $S_{xy} = -1201$, $S_{xx} = 1190.8$
Therefore $\hat{\beta} = \frac{S_{xy}}{S_{xx}} = \frac{-1201}{1190.8} = -1.0086$
$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} = 61 - (-1.0086)(35.2) = 96.50$
$\hat{y} = 96.50 - 1.0086\,x$
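The fit for these five points can be recomputed directly from the definitions of $S_{xy}$ and $S_{xx}$. The sketch below uses the x and y values as read from the table; the function name `fit_line` is illustrative.

```python
xs = [24, 35, 64, 20, 33]
ys = [90, 65, 30, 60, 60]


def fit_line(xs, ys):
    """Least-squares line y = alpha + beta*x via beta = Sxy/Sxx, alpha = ybar - beta*xbar."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = ybar - beta * xbar
    return alpha, beta, sxy, sxx


alpha_hat, beta_hat, sxy, sxx = fit_line(xs, ys)

# Residuals y_i - yhat_i and their sum of squares
residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(xs, ys)]
ssr = sum(r**2 for r in residuals)

print(round(sxy, 2), round(sxx, 2), round(beta_hat, 4), round(alpha_hat, 2), round(ssr, 2))
```

A negative slope here matches the rule above: as x increases, the fitted y decreases.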
12]4|23
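* Multilinear regression:
Given
$y_i = \alpha + \beta_1 x_{1i} + \beta_2 x_{2i} + e_i$
where i = 1, 2, ..., n
$SSE = \sum (y_i - \hat{y}_i)^2$
($y_i$: true value, $\hat{y}_i$: approximate value)
To minimise SSE, this implies
$\frac{\partial (SSE)}{\partial \alpha} = 0$, $\frac{\partial (SSE)}{\partial \beta_1} = 0$, $\frac{\partial (SSE)}{\partial \beta_2} = 0$
e.g. $\frac{\partial (SSE)}{\partial \alpha} = -2\sum (y_i - \alpha - \beta_1 x_{1i} - \beta_2 x_{2i}) = 0$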
Tut
65 50 55 65 55 70
7 5 3
437
360
S f = 24
2190O
Si. =1960
I934
|24
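Summary (A and B are the estimators of $\alpha$ and $\beta$):
1) B is an unbiased estimator of $\beta$, i.e. $E(B) = \beta$.
2) B is a random variable.
3) Variance of B: $\mathrm{Var}(B) = \frac{\sigma^2}{S_{xx}}$
4) A is an unbiased estimator of $\alpha$, i.e. $E(A) = \alpha$.
5) $\mathrm{Var}(A) = \frac{\sigma^2 \sum x_i^2}{n\left(\sum x_i^2 - n\bar{x}^2\right)}$
6) True value of $y_i$ - estimated value of $y_i$ = residual (error).
SSE = SSR (sum of squares of errors / residuals):
$SSR = \sum (y_i - A - B x_i)^2$
$\frac{SSR}{\sigma^2}$ follows a chi-square distribution with n - 2 degrees of freedom (the degrees of freedom are reduced by 2 because two parameters, $\alpha$ and $\beta$, are involved).
$\frac{SSR}{n-2}$ is an unbiased estimator of $\sigma^2$ (the population variance).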
x_i    y_i    ŷ_i      y_i - ŷ_i
24     90     72.30     17.70
35     65     61.20      3.80
64     30     31.95     -1.95
20     60     76.33    -16.33
33     60     63.22     -3.22
$SSR = \sum (y_i - \hat{y}_i)^2 \approx 608.71$
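Inference about $\beta$:
$H_0$: $\beta = 0$ (no relation between x and y)
$H_1$: $\beta \ne 0$ (there is a relation between x and y)
TOH (regression)
$\alpha$ = 5%
Test statistic:
$T = \sqrt{\frac{(n-2)\,S_{xx}}{SSR}}\;|B|$
which, under $H_0$, follows a t-distribution with n - 2 degrees of freedom.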
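A sketch of the test statistic for $H_0: \beta = 0$, applied to the five-point regression data from earlier in the notes. The form $T = \sqrt{(n-2)S_{xx}/SSR}\,|B|$ and the table value $t_{0.025,\,3} = 3.182$ (two-tailed, 5%) are stated here as assumptions; the critical value is not computed by the code.

```python
from math import sqrt

xs = [24, 35, 64, 20, 33]
ys = [90, 65, 30, 60, 60]
n = len(xs)

xbar = sum(xs) / n
ybar = sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))

b = sxy / sxx                      # slope estimate B
a = ybar - b * xbar                # intercept estimate A
ssr = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))

# Test statistic for H0: beta = 0
t_stat = sqrt((n - 2) * sxx / ssr) * abs(b)

t_critical = 3.182                 # t table, 3 degrees of freedom, 2.5% in each tail (assumed)
reject_h0 = t_stat > t_critical

print(round(t_stat, 3), reject_h0)
```

With only five points the statistic stays below the critical value, so under these assumptions $H_0$ would not be rejected at the 5% level.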
Quality Control
# Introduction
Almost every manufacturing process results in some random variation in the items it produces, and there is always going to be some variation between the items produced; this variation is called chance variation. It is inherent to the process.
However, there is another type of variation, which is far from being inherent to the process and which is due to some assignable cause.
Due to an assignable cause, there are some adverse effects on the quality of the items produced.
The latter cause of variation may be due to a faulty machine setting, poor quality of raw material, incorrect software, etc.; it may also be due to human error.
When the only variation present is due to chance and not to an assignable cause, we say that the process is in control.
The key problem is to determine whether a process is in or out of control. The determination of whether a process is in or out of control is greatly facilitated by the use of control charts (statistical tools).
Also, it is determined by two numbers:
Upper Control Limit (UCL)
Lower Control Limit (LCL)
To employ such a chart, the data generated by the manufacturing process is divided into subgroups, and subgroup statistics such as the subgroup mean and the subgroup std. deviation are computed.
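(i.e. inferences on $\sigma$, the std. deviation)
Inferences on $\mu$ (mean):
$\bar{X}_1 = \frac{X_1 + X_2 + \cdots + X_n}{n}$
$\bar{X}_2 = \frac{X_{n+1} + X_{n+2} + \cdots + X_{2n}}{n}$
$\bar{X}_3 = \frac{X_{2n+1} + X_{2n+2} + \cdots + X_{3n}}{n}$
$E(\bar{X}_i) = \mu$
$\mathrm{Var}(\bar{X}_i) = \frac{\sigma^2}{n}$
Normal distribution: $\frac{\sqrt{n}\,(\bar{X}_i - \mu)}{\sigma} \sim N(0, 1)$
$P(-3 < Z < 3) = 0.9974$
i.e. if the process is in control throughout the production of the i-th subgroup, we would certainly expect that
$\mu - \frac{3\sigma}{\sqrt{n}} < \bar{X}_i < \mu + \frac{3\sigma}{\sqrt{n}}$
LCL $= \mu - \frac{3\sigma}{\sqrt{n}}$, UCL $= \mu + \frac{3\sigma}{\sqrt{n}}$
e.g. $\mu = 3$, $\sigma = 0.1$,
n = 4
LCL $= 3 - \frac{3(0.1)}{\sqrt{4}} = 2.85$
UCL $= 3 + \frac{3(0.1)}{\sqrt{4}} = 3.15$
[Control chart: subgroup means plotted against subgroup no., with LCL = 2.85 and UCL = 3.15]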
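The control limits and the in/out-of-control check can be sketched as below. The values $\mu = 3$, $\sigma = 0.1$ and n = 4 come from the example; the list of subgroup means and both function names are hypothetical, made up only to show how the limits would be used.

```python
from math import sqrt


def control_limits(mu, sigma, n):
    """3-sigma limits for subgroup means: mu -/+ 3*sigma/sqrt(n)."""
    half_width = 3 * sigma / sqrt(n)
    return mu - half_width, mu + half_width


def out_of_control(subgroup_means, lcl, ucl):
    """1-based subgroup numbers whose mean falls outside [LCL, UCL]."""
    return [i for i, m in enumerate(subgroup_means, start=1) if not lcl <= m <= ucl]


lcl, ucl = control_limits(mu=3, sigma=0.1, n=4)

# Hypothetical subgroup means, for illustration only
means = [2.95, 3.10, 3.20, 2.80]
flagged = out_of_control(means, lcl, ucl)

print(round(lcl, 2), round(ucl, 2), flagged)
```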