Probability and Statistics - Book (DR Hari Arora)
- Chapter
5.1. INTRODUCTION
The general problem of finding equations of approximating curves which fit given data is called curve fitting.
Consider n pairs of values $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ of two variables x and y. To get a rough idea about their relationship, if any, we plot the values of x and y on a suitable scale. The points $(x_i, y_i)$, $i = 1, 2, \ldots, n$, constitute a diagram called a scatter diagram, and the given data $(x_i, y_i)$, $i = 1, 2, \ldots, n$, is said to be bivariate. Finding an exact relationship between the variables x and y, i.e., of the form $y = f(x)$, which fits the given sets of data is called curve fitting. Generally it is not possible to find a curve which passes through all the given points. We can obtain a relationship between x and y in the form of straight lines, curves of second degree, third degree, etc., which may give the best representation of the bivariate distribution.
The method of least squares can be used to get that representation. It is probably the best method to fit a unique curve to given data. The other methods are: graphical method, method of group averages and method of moments. Here we will discuss the method of least squares only.
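Let $E_i = y_i - (a + bx_i + cx_i^2)$ denote the error at the i-th point. We square each of these and form their sum, i.e.,
$E = \sum E_i^2 = \sum (y_i - a - bx_i - cx_i^2)^2$ ... (1)
Now if E = 0, then all the points lie on the curve. Otherwise the minimum of E gives the best fitting curve to the data.
For maximum and minimum values of E, its partial derivatives w.r.t. a, b and c must be zero:
$\frac{\partial E}{\partial a} = \frac{\partial E}{\partial b} = \frac{\partial E}{\partial c} = 0$
Differentiating (1) partially w.r.t. a, b and c respectively:
$\frac{\partial E}{\partial a} = 2\sum (y_i - a - bx_i - cx_i^2)(-1) = 0$
$\frac{\partial E}{\partial b} = 2\sum (y_i - a - bx_i - cx_i^2)(-x_i) = 0$
$\frac{\partial E}{\partial c} = 2\sum (y_i - a - bx_i - cx_i^2)(-x_i^2) = 0$
Simplify and re-write the equations as
$\sum y_i = na + b\sum x_i + c\sum x_i^2$
$\sum x_iy_i = a\sum x_i + b\sum x_i^2 + c\sum x_i^3$
$\sum x_i^2y_i = a\sum x_i^2 + b\sum x_i^3 + c\sum x_i^4$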
Solve the normal equations simultaneously for a' and b' and put these values in $x = a' + b'y$.
Cor. 2. Plausible values of variables. Let the n equations in x and y for determining the most plausible (best) values of two variables x and y be $a_1x + b_1y = c_1$, $a_2x + b_2y = c_2$, ..., $a_nx + b_ny = c_n$.
Solving the normal equations simultaneously for a and b, we get
$a = 0, \quad b = 13.6$
Putting the values of a and b in eqn. (i), we get
$y = 13.6x$
Example 5.2. Fit a straight line to the following data considering y as the dependent variable:
x    1    2    3    4    5
y    5    7    9    10   11
Solution. Let the best fitting line be
$y = a + bx$ ... (i)
Normal equations for a and b are
$\sum y = na + b\sum x$ ... (ii)
$\sum xy = a\sum x + b\sum x^2$ ... (iii)
4 6 8
4 5 6
Solution. Let the best fitting line be
$y = a + bx$ ... (i)
Normal equations for a and b are
$\sum y = na + b\sum x$
$y = 1.976 + 0.506x$
Example 5.4. Fit a straight line to the following data regarding x as the independent variable:
x    y      x^2   xy
1    1200   1     1200
2    900    4     1800
3    600    9     1800
4    200    16    800
5    110    25    550
6    50     36    300
$\sum x = 21$, $\sum y = 3060$, $\sum x^2 = 91$, $\sum xy = 6450$
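As a cross-check on the tabulated sums, the two normal equations for a straight line can be solved in closed form. The following is a minimal Python sketch and not part of the original text; the function name `fit_line` is hypothetical, and the data are the six (x, y) pairs tabulated above.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + b*x obtained from the two normal equations."""
    n = len(xs)
    sx = sum(xs)                              # sum of x
    sy = sum(ys)                              # sum of y
    sxx = sum(x * x for x in xs)              # sum of x^2
    sxy = sum(x * y for x, y in zip(xs, ys))  # sum of x*y
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    return a, b


xs = [1, 2, 3, 4, 5, 6]
ys = [1200, 900, 600, 200, 110, 50]
a, b = fit_line(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}")
```

With these data the sketch gives a close to 1362.0 and b close to -243.43, which agrees to rounding with the answer y = 1361.97 - 243.42x printed in Exercise 5.1.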
The normal equations are
$\sum y = na + b\sum x$ ... (ii)
Here n = 8.
x    y    xy     x^2
71   69   4899   5041
68   72   4896   4624
73   70   5110   5329
69   70   4830   4761
67   68   4556   4489
65   67   4355   4225
66   68   4488   4356
67   64   4288   4489
$y = 39.5454 + 0.4242x$
Show that the line of best fit passes through the mean point $(\bar{x}, \bar{y})$.
Solution. Let the best fitting linear function be
$y = a + bx$ ... (i)
Then the normal equations are
$\sum y_i = na + b\sum x_i$ ... (ii)
$\sum x_iy_i = a\sum x_i + b\sum x_i^2$ ... (iii)
These equations may be rewritten as
$bx - y + a = 0$
$b\sum x_i - \sum y_i + na = 0$
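One short way to see why the mean point lies on the fitted line, offered here as a supplementary derivation and not as the book's own working, is to divide the first normal equation by n. The symbols $\bar{x}$ and $\bar{y}$ denote the sample means of the $x_i$ and $y_i$.

```latex
\begin{aligned}
\sum y_i &= na + b\sum x_i \\
\frac{1}{n}\sum y_i &= a + b\cdot\frac{1}{n}\sum x_i \\
\bar{y} &= a + b\bar{x}
\end{aligned}
```

The last line says that $(\bar{x}, \bar{y})$ satisfies $y = a + bx$, so the least-squares line passes through the mean point.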
SOLVED EXAMPLES
Example 5.7. Fit a straight line to the following data:
Let $u = \frac{x - a}{h} = \frac{x - 2}{1} = x - 2$
x    y     u     u^2   uy
0    1.0   -2    4     -2.0
1    1.8   -1    1     -1.8
2    1.3   0     0     0.0
3    4.5   1     1     4.5
4    6.3   2     4     12.6
x    0    5    10   15   20   25
y    12   15   17   22   24   30
Let $u = \frac{x - 12.5}{2.5}$, $v = y - 22$.
x    y    u    v     uv   u^2
0    12   -5   -10   50   25
5    15   -3   -7    21   9
10   17   -1   -5    5    1
15   22   1    0     0    1
20   24   3    2     6    9
25   30   5    8     40   25
$\sum v = na + b\sum u$ ... (ii)
$\sum uv = a\sum u + b\sum u^2$ ... (iii)
Putting the above values in eqns. (ii) and (iii),
$-12 = 6a + b \cdot 0$ ... (iv)
$122 = a \cdot 0 + b \cdot 70$ ... (v)
Solving (iv) and (v) for a and b, we get
$a = -2, \quad b = 1.74$
Putting the values of a and b in (i), we get
$v = -2 + 1.74u$
$y - 22 = -2 + \frac{1.74}{2.5}(x - 12.5)$
or
$y = 0.7x + 11.285$
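The change of origin and scale used in this example can be checked numerically. The sketch below is an illustration only and not part of the original text: it uses the substitution u = (x - 12.5)/2.5, v = y - 22 with the data x = 0, 5, ..., 25 and y = 12, 15, 17, 22, 24, 30, and the variable names are hypothetical.

```python
xs = [0, 5, 10, 15, 20, 25]
ys = [12, 15, 17, 22, 24, 30]
x0, h, y0 = 12.5, 2.5, 22  # origin and scale taken from the substitution in the text

# Shifted and scaled variables u and v.
us = [(x - x0) / h for x in xs]
vs = [y - y0 for y in ys]

n = len(us)
su, sv = sum(us), sum(vs)
suu = sum(u * u for u in us)
suv = sum(u * v for u, v in zip(us, vs))

# Solve the normal equations for v = a + b*u.
b = (n * suv - su * sv) / (n * suu - su * su)
a = (sv - b * su) / n

# Transform back to y = intercept + slope * x.
slope = b / h
intercept = y0 + a - b * x0 / h
print(round(a, 4), round(b, 4), round(slope, 4), round(intercept, 4))
```

Because the u values are symmetric about zero, the sum of u vanishes and the two normal equations decouple, which is the practical reason for choosing the origin at the middle of the x values.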
Example 5.9. The following table shows the number of salesmen working for a certain concern:
Year      1998   1999   2000   2001   2002   2003
Number    28     38     46     40     56     60
Use the method of least squares to fit a straight line trend.
Solution. Here n = 6 (even) and h = 1.
Let $u = \frac{x - (\text{mean of two middle terms})}{h/2} = \frac{x - \frac{2000 + 2001}{2}}{1/2} = 2x - 4001$
and $v = y - 46$,
so that the equation $y = a + bx$ is transformed to $v = A + Bu$.
x      y    u    v     uv   u^2
1998   28   -5   -18   90   25
1999   38   -3   -8    24   9
2000   46   -1   0     0    1
2001   40   1    -6    -6   1
2002   56   3    10    30   9
2003   60   5    14    70   25
Total: $\sum u = 0$, $\sum v = -8$, $\sum uv = 208$, $\sum u^2 = 70$
Putting these values in the normal equations, we get
$-8 = 6A$ and $208 = 70B$
$A = -1.3333$ and $B = 2.9714$
The equation of the straight line is
$v = -1.3333 + 2.9714u$ ... (i)
Putting $u = 2x - 4001$ and $v = y - 46$ in (i), we get
$y - 46 = -1.3333 + 2.9714(2x - 4001)$
$y = 5.9428x - 11843.9047$
which is the required straight line.
■ EXERCISE 5.1
l: \ I I I I -
1°0
-
1
29
-
.
48
2. Fit a straight line of regression of x and y.
.
2
.
67
3
•
8
4
I: I , : I : I : I I: I
3. Fit a straight line to the following data regarding x as the independent variable:
[Ans. $y = 2x$]
I: \ I ~ 16~ I I I 5: I
1200 2~ 1:0
4. Fit a straight line to the following data considering x as a dependent variable:
[Ans. $y = 1361.97 - 243.42x$]
1
'
I~ /,~
5. Using the method of least squares, fit a linear relation of the form $P = a + bW$ to the following data, where P is the pull required to lift a weight W. Also estimate P when W is 150 kg.
W (kg)    50    70    100   120
P (kg)                21    25
[Ans. $P = 2.275 + 0.187W$ and 30.4635 kg]
6. The weights of a calf taken at weekly intervals are given below. Fit a straight line using the method of least squares and calculate the average rate of growth per week.
Age       1    2    3    4      5      6      7      8      9       10
Weight              65   70.2   75.4   81.1   87.2   95.5   102.2   108.4
[Ans. $y = 45.74 + 6.16x$ and 6.16]
Step I. Substitute the given values.
Step II. Write the normal equations for each constant, i.e.,
$\sum y = na + b\sum x + c\sum x^2$
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$
and $\sum x^2y = a\sum x^2 + b\sum x^3 + c\sum x^4$
Step III. Solve the normal equations as simultaneous equations for a, b, c.
Step IV. Put these values of a, b and c in the equation $y = a + bx + cx^2$, which is the required parabola of best fit.
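The four steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the book's procedure verbatim: the helper names `solve_linear` and `fit_parabola` are hypothetical, the 3x3 system is solved by ordinary Gaussian elimination, and the sample data (x = 1, 2, 3, 4 and y = 6, 11, 18, 27) are chosen to match the sums tabulated for Example 5.10.

```python
def solve_linear(coeffs, rhs):
    """Solve a small square linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [list(row) + [r] for row, r in zip(coeffs, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (m[r][n] - tail) / m[r][r]
    return sol


def fit_parabola(xs, ys):
    """Least-squares parabola y = a + b*x + c*x^2 from the three normal equations."""
    def sx(p):   # sum of x^p
        return sum(x ** p for x in xs)

    def sxy(p):  # sum of x^p * y
        return sum((x ** p) * y for x, y in zip(xs, ys))

    coeffs = [
        [len(xs), sx(1), sx(2)],
        [sx(1), sx(2), sx(3)],
        [sx(2), sx(3), sx(4)],
    ]
    return solve_linear(coeffs, [sxy(0), sxy(1), sxy(2)])


xs = [1, 2, 3, 4]
ys = [6, 11, 18, 27]
a, b, c = fit_parabola(xs, ys)
print(round(a, 6), round(b, 6), round(c, 6))
```

These four points happen to lie exactly on y = 3 + 2x + x^2, so the fitted coefficients reproduce that parabola up to floating-point error.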
■ SOLVED EXAMPLES
Example 5.10. Fit a second degree curve of regression of y on x to the following data :
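x    1    2    3    4
y    6    11   18   27
Solution. The equation of second degree parabola is given by
$y = a + bx + cx^2$ ... (i)
Normal equations are
$\sum y = na + b\sum x + c\sum x^2$ ... (ii)
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$ ... (iii)
$\sum x^2y = a\sum x^2 + b\sum x^3 + c\sum x^4$ ... (iv)
x    y    x^2   x^3   x^4   xy    x^2y
1    6    1     1     1     6     6
2    11   4     8     16    22    44
3    18   9     27    81    54    162
4    27   16    64    256   108   432
$\sum x = 10$, $\sum y = 62$, $\sum x^2 = 30$, $\sum x^3 = 100$, $\sum x^4 = 354$, $\sum xy = 190$, $\sum x^2y = 644$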
x    0    1    2    3    4
y    1    4    10   17   30
Solution. The equation of second degree parabola is given by
$y = a + bx + cx^2$ ... (i)
Normal equations are
$\sum y = na + b\sum x + c\sum x^2$ ... (ii)
$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$ ... (iii)
$\sum x^2y = a\sum x^2 + b\sum x^3 + c\sum x^4$ ... (iv)
Putting the values in (ii), (iii) and (iv),
$62 = 5a + 10b + 30c$ ... (v)
$195 = 10a + 30b + 100c$ ... (vi)
$677 = 30a + 100b + 354c$ ... (vii)
Solving (v), (vi) and (vii) simultaneously for a, b and c, we get
$a = 1.2, \quad b = 1.1, \quad c = 1.5$
Putting the values of a, b and c in (i), we get
$y = 1.2 + 1.1x + 1.5x^2$
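The three simultaneous equations 62 = 5a + 10b + 30c, 195 = 10a + 30b + 100c and 677 = 30a + 100b + 354c can be verified exactly. The sketch below is an illustration, not the book's working: it uses Cramer's rule with Python's `fractions.Fraction` so that no rounding occurs, and the helper names `det3` and `solve3` are hypothetical.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(coeffs, rhs):
    """Solve a 3x3 system exactly by Cramer's rule."""
    d = det3(coeffs)
    if d == 0:
        raise ValueError("system has no unique solution")
    solution = []
    for k in range(3):
        m = [row[:] for row in coeffs]
        for r in range(3):
            m[r][k] = rhs[r]  # replace column k by the right-hand side
        solution.append(Fraction(det3(m), d))
    return solution


coeffs = [[5, 10, 30], [10, 30, 100], [30, 100, 354]]
rhs = [62, 195, 677]
a, b, c = solve3(coeffs, rhs)
print(float(a), float(b), float(c))
```

The exact solution is a = 6/5, b = 11/10, c = 3/2, in agreement with the values 1.2, 1.1 and 1.5 quoted in the solution.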
Example 5.12. Fit a second degree parabola to the following data taking X = (x - 5) as the independent variable and Y = y - 7 as the dependent variable:
x    1    2    3    4    5    6    7    8    9
y    2    6    7    8    10   11   13   10   9
Solution. Given X = x - 5 and Y = y - 7. Let the best fit parabolic curve be
$Y = a + bX + cX^2$ ... (i)
The normal equations for a, b and c are
$\sum Y = na + b\sum X + c\sum X^2$ ... (ii)
$\sum XY = a\sum X + b\sum X^2 + c\sum X^3$ ... (iii)
$\sum X^2Y = a\sum X^2 + b\sum X^3 + c\sum X^4$ ... (iv)
$\sum X = 0$, $\sum Y = 13$, $\sum XY = 55$, $\sum X^2 = 60$, $\sum X^3 = 0$, $\sum X^4 = 708$, $\sum X^2Y = -1$
Substituting the values of $\sum X$, $\sum Y$, $\sum XY$ etc. and also n = 9 in (ii), (iii) and (iv), we get
$13 = 9a + b \times 0 + c \times 60$ or $9a + 60c = 13$ ... (v)
$55 = a \times 0 + b \times 60 + c \times 0$ or $60b = 55$, so $b = 0.917$ ... (vi)
$-1 = 60a + b \times 0 + 708c$ or $60a + 708c = -1$ ... (vii)
Solving (v), (vi) and (vii) simultaneously, we get
$a = 3.344, \quad b = 0.917, \quad c = -0.285$
and put in (i). The curve of best fit in X and Y is
$Y = 3.344 + 0.917X - 0.285X^2$
Putting X = x - 5, Y = y - 7,
$y - 7 = 3.344 + 0.917(x - 5) - 0.285(x - 5)^2$
$= 3.344 + 0.917x - 4.585 - 0.285(x^2 - 10x + 25)$
$y = 7 + 3.344 - 4.585 - 7.125 + 0.917x + 2.85x - 0.285x^2$
Hence, the required parabolic curve of best fit is
$y = -1.366 + 3.767x - 0.285x^2$
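The shifted-variable calculation can be reproduced with a short script. This is a sketch under stated assumptions and not the book's own working: the y values used (2, 6, 7, 8, 10, 11, 13, 10, 9) are the ones consistent with the tabulated sums (sum of Y = 13, sum of XY = 55, sum of X^2 Y = -1), and all variable names are hypothetical. Since the script does not round the intermediate values, its coefficients differ slightly in the third decimal place from the rounded figures quoted in the solution.

```python
xs = list(range(1, 10))
ys = [2, 6, 7, 8, 10, 11, 13, 10, 9]  # assumed data, consistent with the tabulated sums

X = [x - 5 for x in xs]
Y = [y - 7 for y in ys]
n = len(X)

# The X values are symmetric about 0, so the odd power sums vanish.
assert sum(X) == 0 and sum(u ** 3 for u in X) == 0

s2 = sum(u ** 2 for u in X)
s4 = sum(u ** 4 for u in X)
sY = sum(Y)
sXY = sum(u * v for u, v in zip(X, Y))
sX2Y = sum(u * u * v for u, v in zip(X, Y))

# The normal equations decouple: b alone, then a 2x2 system for a and c.
b = sXY / s2
det = n * s4 - s2 * s2
a = (sY * s4 - s2 * sX2Y) / det
c = (n * sX2Y - s2 * sY) / det

# Back-substitute X = x - 5, Y = y - 7 to get y = c0 + c1*x + c2*x^2.
c0 = 7 + a - 5 * b + 25 * c
c1 = b - 10 * c
c2 = c
print(round(c0, 3), round(c1, 3), round(c2, 3))
```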
Solution. The number of x values is 9. Now take the middle value 1933 as the assumed mean, then
$u = \frac{x - a}{h}$ and $v = \frac{y - b}{h}$
i.e., $u = \frac{x - 1933}{1}$ and $v = y - 357$
or $u = x - 1933$ and $v = y - 357$
The various calculations are shown in the tabular form:
x      u    y     v    uv   u^2   u^2v   u^3   u^4
1929   -4   352   -5   20   16    -80    -64   256
1930   -3   356   -1   3    9     -9     -27   81
1931   -2   357   0    0    4     0      -8    16
1932   -1   358   1    -1   1     1      -1    1
1933   0    360   3    0    0     0      0     0
1934   1    361   4    4    1     4      1     1
1935   2    361   4    8    4     16     8     16
1936   3    360   3    9    9     27     27    81
1937   4    359   2    8    16    32     64    256
Total: $\sum u = 0$, $\sum v = 11$, $\sum uv = 51$, $\sum u^2 = 60$, $\sum u^2v = -9$, $\sum u^3 = 0$, $\sum u^4 = 708$
Let the equation of second degree parabola be
$v = a + bu + cu^2$ ... (i)
The normal equations are
$\sum v = na + b\sum u + c\sum u^2$
$\sum uv = a\sum u + b\sum u^2 + c\sum u^3$
$\sum u^2v = a\sum u^2 + b\sum u^3 + c\sum u^4$
■ EXERCISE 5.2
1. By the method of least squares, fit a parabola $y = a + bx + cx^2$ to the following data:
x    -1   0    0    1
y    2    0    1    2
[Ans. $y = 0.5 + 1.5x^2$]
2. Fit a second degree parabola to the following data :
1 4 5
·~ I 3 I 3~6 1 13 6~4 I 12 62 20.86
6
31.53
[Ans. $y = 4.9 - 3.11x + 1.257x^2$]
3. Fit a parabola to the following observations :
X
)'
\ -2 -1 I
\ -3. 1s : -1.39
I
2.88 5.378 1
[Ans. $y = 0.621 + 2.123x + 0.123x^2$]
4. Fit a second degree parabola to the following data taking x as
2
I: 6
3
7
4
8
5
10
6
11
[Hint: Put X = x - 5] [Ans. $y = -1 + 3.35x - 0.27x^2$]
5. The table below gives the resistance R to the motion of a train at a speed V. If the law connecting R and V is of the form $R = a + bV^2$, find by the method of least squares the most plausible values of a and b.
V    10   20   30   40   50
R    8    10   15   21   30
[Ans. a = 6.7, b = 0.0092]
6. Fit a parabola $y = a + bx + cx^2$ to the following observations:
x    1    2    3    4    5
y    25   28   33   39   46
[Hint: Take u = x - 3, v = y - 33]
[Ans. $v = -0.086 + 5.3u + 0.643u^2$ or $y = 22.801 + 1.442x + 0.643x^2$]
.~I
Q_ \
-I
2
I
1
O
5
I
\
I
3
I I
-
2
~
FITTING OF THE CURVE $y = ab^x$
Step I. Taking logarithms on both sides, we get
$\log_{10} y = \log_{10} a + x \log_{10} b$
i.e., $Y = A + Bx$
where $Y = \log_{10} y$, $A = \log_{10} a$, $B = \log_{10} b$.
Step II. Write the normal equations, i.e.,
$\sum Y = nA + B\sum x$
and $\sum xY = A\sum x + B\sum x^2$
Step III. Solve the normal equations as simultaneous equations for A and B, then find a and b from the relations
a = Antilog (A) and b = Antilog (B)
Step IV. Put these values of a and b in the equation $y = ab^x$, which is the required curve of best fit.
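5.5. FITTING OF THE CURVE $xy^a = b$
Taking logarithms on both sides, we get
$\log_{10} x + a \log_{10} y = \log_{10} b$
or $\log_{10} y = \frac{1}{a}\log_{10} b - \frac{1}{a}\log_{10} x$
i.e., $Y = A + BX$ where $Y = \log_{10} y$, $A = \frac{1}{a}\log_{10} b$, $B = -\frac{1}{a}$ and $X = \log_{10} x$.
As the problem reduces to $Y = A + BX$, the rest of the procedure is similar to the above.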
■ SOLVED EXAMPLES
Example 5.15. Fit a curve of the best fit of the type $y = ae^{bx}$ to the following data by the
method of least squares:
X I 7 9
y 10 12 15
Solution. To fit a curve of the form
$y = ae^{bx}$ ... (i)
by the method of least squares, we take logarithms on both sides to get $\log_{10} y = \log_{10} a + bx \log_{10} e$, i.e., $Y = A + Bx$, and use the following normal equations:
$\sum Y = nA + B\sum x$ ... (ii)
and $\sum xY = A\sum x + B\sum x^2$ ... (iii)
where $Y = \log_{10} y$, $A = \log_{10} a$, $B = b\log_{10} e$.
The various calculations are shown in the tabular form:
Example 5.16. Fit a relation of the form $y = ab^x$ for the following data by the method of least squares:
x    2     3      4      5      6
y    8.3   15.4   33.1   65.2   127.4
Solution. To fit a curve of the form
$y = ab^x$ ... (i)
by the method of least squares, we use the following normal equations:
$\sum Y = nA + B\sum x$ ... (ii)
$\sum xY = A\sum x + B\sum x^2$ ... (iii)
where $A = \log_{10} a$, $B = \log_{10} b$ and $Y = \log_{10} y$.
The various calculations are shown in the tabular form:
x    y       Y = log10 y   x^2   xY
2    8.3     0.9191        4     1.8382
3    15.4    1.1872        9     3.5616
4    33.1    1.5198        16    6.0792
5    65.2    1.8142        25    9.0710
6    127.4   2.1052        36    12.6312
$\sum x = 20$, $\sum Y = 7.5455$, $\sum x^2 = 90$, $\sum xY = 33.1812$
Putting these values in the normal equations, we get
$7.5455 = 5A + 20B$ and $33.1812 = 20A + 90B$
Solving for A and B and taking antilogarithms, the required curve is
$y = 2.04(1.995)^x$
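The logarithmic reduction used for y = ab^x can be sketched in a few lines of Python. This is an illustration only: the function name `fit_ab_x` is hypothetical, and the last y value is taken as 127.4 on the assumption that this is the value whose common logarithm is the tabulated 2.1052.

```python
import math


def fit_ab_x(xs, ys):
    """Fit y = a * b**x by a least-squares line through the points (x, log10 y)."""
    logs = [math.log10(y) for y in ys]
    n = len(xs)
    sx = sum(xs)
    sY = sum(logs)
    sxx = sum(x * x for x in xs)
    sxY = sum(x * Y for x, Y in zip(xs, logs))
    B = (n * sxY - sx * sY) / (n * sxx - sx * sx)  # B = log10 b
    A = (sY - B * sx) / n                          # A = log10 a
    return 10 ** A, 10 ** B                        # antilogarithms


xs = [2, 3, 4, 5, 6]
ys = [8.3, 15.4, 33.1, 65.2, 127.4]
a, b = fit_ab_x(xs, ys)
print(round(a, 3), round(b, 3))
```

Note that this minimises the squared errors in log y and not in y itself, which is the same convention as the normal equations in Y used in the solution.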
Example 5.17. By using the method of least squares, find a relation of the form $y = ax^b$ that fits the data:
x    2      3      4     5
y    27.8   62.1   110   161
Solution. To fit a curve of the form
$y = ax^b$ ... (i)
by the method of least squares, we use the following normal equations:
$\sum Y = nA + b\sum X$ ... (ii)
and $\sum XY = A\sum X + b\sum X^2$ ... (iii)
where $X = \log_{10} x$, $Y = \log_{10} y$, $A = \log_{10} a$.
The various calculations are shown in the tabular form:
x    y      X = log10 x   Y = log10 y   X^2      XY
2    27.8   0.3010        1.4440        0.0906   0.4346
3    62.1   0.4771        1.7931        0.2276   0.8555
4    110    0.6021        2.0414        0.3625   1.2291
5    161    0.6990        2.2068        0.4886   1.5426
$y = 7.376x^{1.9311}$
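A power law y = ax^b becomes a straight line when both variables are replaced by their logarithms. The sketch below illustrates that reduction for the data x = 2, 3, 4, 5 and y = 27.8, 62.1, 110, 161; it is not the book's own working, and the function name `fit_power_law` is hypothetical. Because it uses full-precision logarithms instead of four-figure tables, the coefficients agree with the printed answer only to about three significant figures.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by a least-squares line through (log10 x, log10 y)."""
    X = [math.log10(x) for x in xs]
    Y = [math.log10(y) for y in ys]
    n = len(X)
    sX, sY = sum(X), sum(Y)
    sXX = sum(u * u for u in X)
    sXY = sum(u * v for u, v in zip(X, Y))
    b = (n * sXY - sX * sY) / (n * sXX - sX * sX)  # slope is the exponent b
    A = (sY - b * sX) / n                          # intercept is log10 a
    return 10 ** A, b


xs = [2, 3, 4, 5]
ys = [27.8, 62.1, 110, 161]
a, b = fit_power_law(xs, ys)
print(round(a, 3), round(b, 4))
```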
Example 5.18. The pressure p of the gas corresponding to various volumes V is measured, given by the following data:
V (cm^3)       50     60     70     90     100
p (kg cm^-2)   64.7   51.3   40.5   25.9   78
Fit the data to the equation $pV^\gamma = C$.
Solution. Here, we have
$pV^\gamma = C$ ... (i)
$p = CV^{-\gamma}$
$\log p = \log C - \gamma \log V$
$Y = A + BX$
where $Y = \log_{10} p$, $A = \log_{10} C$, $B = -\gamma$, $X = \log_{10} V$.
The normal equations are
$\sum Y = nA + B\sum X$ ... (ii)
and $\sum XY = A\sum X + B\sum X^2$ ... (iii)
The various calculations are shown in the tabular form:
V     p      X = log10 V   Y = log10 p   XY        X^2
50    64.7   1.69897       1.81090       3.07666   2.88650
60    51.3   1.77815       1.71012       3.04085   3.16182
70    40.5   1.84510       1.60746       2.96592   3.40439
90    25.9   1.95424       1.41330       2.76193   3.81903
100   78     2.00000       1.89209       3.78418   4.00000
$\sum X = 9.27646$, $\sum Y = 8.43387$, $\sum XY = 15.62954$, $\sum X^2 = 17.27176$
$pV^{0.28997} = 167.78765$
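The same log-log reduction gives the constants of the gas law. The sketch below is an illustration under stated assumptions and not the book's working: it uses the pressures whose logarithms appear in the table (64.7, 51.3, 40.5, 25.9 and 78 for V = 50, 60, 70, 90, 100), and the variable names are hypothetical.

```python
import math

V = [50, 60, 70, 90, 100]
p = [64.7, 51.3, 40.5, 25.9, 78]  # values corresponding to the tabulated logarithms

X = [math.log10(v) for v in V]
Y = [math.log10(q) for q in p]
n = len(X)
sX, sY = sum(X), sum(Y)
sXX = sum(u * u for u in X)
sXY = sum(u * w for u, w in zip(X, Y))

# Fit Y = A + B*X, where A = log10 C and B = -gamma.
B = (n * sXY - sX * sY) / (n * sXX - sX * sX)
A = (sY - B * sX) / n

gamma = -B
C = 10 ** A
print(round(gamma, 4), round(C, 2))
```

The slope here is the ratio of two small differences of nearly equal quantities, so the fitted exponent is sensitive to rounding in the logarithms; carrying five decimal places, as the table does, is a reasonable precaution.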
■ EXERCISE 5.7
1.
x    0.0    0.5    1.0    1.5    2.0     2.5
y    0.10   0.45   2.15   9.15   40.35   180.75
[Ans. $y = 0.1019e^{2.9963x}$]
(1 8
--
10
2. - ---
I1 .0~.i 30.1'.!H 4~o s
.ion 8 1.89 7 222. 62 7'41,~
''t-
·" \'
T
I
1.65
....
_.__
2~7 f 3
4.5
I
4
7.35
I
l
4
j
4. \ 0 2 6 8
'
150 63
I 28
I 12
I 5 .6
5. \
I 1 2 3 4 5 6
,. 1.6 4.5 13.8 40.2 125 300
-
Fit an exponentiaJ curve of the form y = air to the
follo wi ng data : rAns . y -
- 0,5580 I n,
e ·"llll
6. X
,.
1 2 3 4 5 6 7 - 1
I 87 97 11 3 129 202 195 193
-
-
[Ans. Y == 73.9656
7. X 2 3 4 5 6 (l.1681)1
I
,. 144 172.8 207.4 24 8.8 298. 5
8.
;
X
\' I 1.0
I
2
1.2
I
3
1.8
I
4
2.5
5
3.6
6 7 r~~I 86(\2~
4 .7
I 6 .6
2 \ 3 4 5
2 I 4.5 8 12.5
012l9'J l
77
[Ans . Y = O. 5
- 365
( # ~ __,..- -3 4 5 l 6 I
CUµv€ ~--1f-----:--:--1-:-:---1
4 .26 5.21 6. 1 6.8
t--
7.5
tZ• . - ~-__,
!Ans. 2.9nJ>~ 1~1 ]
13. The pressure and the volume of a gas are related by the equation $pV^\gamma = k$, $\gamma$ and k being constants. Fit this equation to the following set of observations:
p (kg/cm^2)   0.5    1.0    1.5    2.0    2.5    3.0
V             1.62   1.00   0.75   0.62   0.52   0.46
[Ans. $pV^{1.42} = 0.99$]