Ps Darshan
Edition
Computer Engineering
✓ The words PROBABLE and POSSIBLE CHANCES are quite familiar to us. We use these
words when we are not sure of the result of certain events. These words convey the sense of
uncertainty about the occurrence of events.
✓ Probability is the measure we use to quantify the degree of certainty of events.
❖ RANDOM EXPERIMENT
✓ A random experiment is an experiment whose outcome cannot be predicted with certainty,
although all of its possible outcomes are known in advance.
❖ SAMPLE SPACE
✓ The set of all possible outcomes of a random experiment is called a sample space.
✓ It is denoted by “S”. If a sample space is in one-one correspondence with a finite set,
then it is called a finite sample space; otherwise it is known as an infinite sample space.
✓ Examples:
✓ Infinite Sample Space: Experiment of tossing a coin until a head comes up for the first time.
❖ EVENT
✓ A subset of a sample space is known as an Event. Each member of the sample space is called a Sample Point.
✓ Example:
❖ DEFINITIONS
✓ The subset ∅ of a sample space is called the “Impossible Event”.
✓ The set of all elements of the sample space that are not in A is called the “Complementary
Event” of A. It is denoted by A’.
✓ The Union of events A and B is the union of sets A and B (as per set theory).
✓ The Intersection of events A and B is the intersection of sets A and B (as per set theory).
➢ Set Notation: A ∪ B = { x | x ∈ A OR x ∈ B }
❖ PROBABILITY OF AN EVENT
✓ If a finite sample space associated with a random experiment has “n” equally likely
(equiprobable) outcomes (elements) and of these “m” (0 ≤ m ≤ n) outcomes are
favorable for the occurrence of an event A, then the probability of A is defined as below.
$P(A) = \frac{\text{favorable outcomes}}{\text{total outcomes}} = \frac{m}{n}$
❖ EQUIPROBABLE EVENTS
✓ Let U = {x1 , x2 , . . . , xn } be a finite sample space. If P{x1 } = P{x2 } = P{x3 } = ⋯ = P{xn }, then
the elementary events {x1 }, {x2 }, {x3 }, … , {xn } are called Equiprobable Events.
❖ RESULTS
✓ For the Impossible Event ∅, P(∅) = 0.
❖ PERMUTATION
✓ Suppose that we are given ‘n’ distinct objects and wish to arrange ‘r’ of these objects in a
line. Since there are ‘n’ ways of choosing the 1st object, then ‘n − 1’ ways of choosing the
2nd object, and finally ‘n − r + 1’ ways of choosing the rth object, it follows by the
fundamental principle of counting that the number of different arrangements (or
PERMUTATIONS) is given as below.
$^nP_r = n(n-1)(n-2)\cdots(n-r+1) = \frac{n!}{(n-r)!}$
❖ RESULTS ON PERMUTATION
✓ Suppose that a set consists of ‘n’ objects of which n1 are of one type, n2 are of a second type,
…, and nk are of the kth type. Here n = n1 + n2 + ⋯ + nk. Then the number of different
permutations of the objects is
$\frac{n!}{n_1!\, n_2! \cdots n_k!}$.
$\frac{11!}{1!\, 4!\, 4!\, 2!} = 34650.$
✓ If ‘r’ objects are to be arranged out of ‘n’ objects and if repetition of an object is allowed,
then the total number of permutations is $n^r$.
➢ The number of different three-digit numbers that can be formed from the digits 4, 5, 6, 7, 8
(with repetition) is $5^3 = 125$.
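The three counting rules above can be checked with exact integer arithmetic. The following is a minimal sketch; the function names are illustrative choices made here and do not come from the notes.

```python
from math import factorial


def permutations(n: int, r: int) -> int:
    """Arrangements of r objects out of n distinct objects: n! / (n - r)!."""
    if not 0 <= r <= n:
        raise ValueError("need 0 <= r <= n")
    return factorial(n) // factorial(n - r)


def multiset_permutations(counts) -> int:
    """Permutations of n = sum(counts) objects, counts[i] of them alike of type i."""
    total = factorial(sum(counts))
    for c in counts:
        total //= factorial(c)  # exact: every partial quotient is an integer
    return total


def permutations_with_repetition(n: int, r: int) -> int:
    """Arrangements of r objects out of n when repetition is allowed: n ** r."""
    return n ** r


print(permutations(5, 3))                   # 5 * 4 * 3 = 60
print(multiset_permutations([1, 4, 4, 2]))  # 11! / (1! 4! 4! 2!) = 34650
print(permutations_with_repetition(5, 3))   # 5 ** 3 = 125
```

The last two printed values reproduce the worked figures 34650 and 125 given above.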
❖ COMBINATION
✓ In a permutation we are interested in the order of arrangement of the objects. For example,
ABC is a different permutation from BCA. In many problems, however, we are interested
only in selecting or choosing objects without regard to order. Such selections are called
combinations.
✓ The total number of combinations (selections) of ‘r’ objects selected from ‘n’ objects is
denoted and defined by
$^nC_r = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}$.
❖ EXAMPLES ON COMBINATION
✓ The number of ways in which 3 cards can be chosen from 8 cards is
$\binom{8}{3} = \frac{8!}{3!\,(8-3)!} = \frac{8 \times 7 \times 6}{3 \times 2 \times 1} = 56.$
✓ A club has 10 male and 8 female members. A committee composed of 3 men and 4 women
is formed. In how many ways can this be done?
$\binom{10}{3}\binom{8}{4} = 120 \times 70 = 8400$
✓ Out of 6 boys and 4 girls, in how many ways can a committee of five members be formed in
which at most 2 girls are included?
$\binom{4}{2}\binom{6}{3} + \binom{4}{1}\binom{6}{4} + \binom{4}{0}\binom{6}{5} = 120 + 60 + 6 = 186$
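As a quick check, the three combination examples can be recomputed with Python's standard `math.comb`. This is only a sketch; the variable names are chosen here for illustration.

```python
from math import comb

# 3 cards chosen from 8 cards
cards = comb(8, 3)

# committee of 3 men (from 10) and 4 women (from 8)
committee = comb(10, 3) * comb(8, 4)

# committee of 5 from 6 boys and 4 girls with at most 2 girls:
# sum over g = 0, 1, 2 girls, the remaining 5 - g members being boys
at_most_two_girls = sum(comb(4, g) * comb(6, 5 - g) for g in range(3))

print(cards, committee, at_most_two_girls)  # 56 8400 186
```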
C 1 If the probability of event A is 9/10, what is the probability of the event “not A”?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟏
H 7 One card is drawn at random from a well shuffled pack of 52 cards. Find
probability that the card will be
(a) an ace, (b) a card of black color, (c) a diamond, (d) not an ace.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟎𝟕𝟔𝟗, 𝟎. 𝟓, 𝟎. 𝟐𝟓, 𝟎. 𝟗𝟐𝟑𝟏
H 9 Four cards are drawn from the pack of cards. Find the probability that
(a) all are diamonds, (b) there is one card of each suit, (c) there are two
spades and two hearts.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟎𝟎𝟐𝟔, 𝟎. 𝟏𝟎𝟓𝟓, 𝟎. 𝟎𝟐𝟐𝟓
T 10 Consider a poker hand of five cards. Find the probability of getting four of
a kind (i.e., four cards of the same face value) assuming the five cards are
chosen at random.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏/𝟒𝟏𝟔𝟓
H 11 4 cards are drawn at random from a pack of 52 cards. Find probability that
(a) They are a king, a queen, a jack and an ace.
(b) Two are kings and two are queens.
(c) Two are black and two are red.
(d) There are two cards of hearts and two cards of diamonds.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟎𝟎𝟎𝟗𝟓, 𝟎. 𝟎𝟎𝟎𝟏𝟑, 𝟎. 𝟑𝟗𝟎𝟐, 𝟎. 𝟎𝟐𝟐𝟓
H 12 A box contains 5 red, 6 white and 2 black balls. The balls are identical in all
respects other than color. (a) One ball is drawn at random from the box. Find
the probability that the selected ball is black. (b) Two balls are drawn at
random from the box. Find the probability that one ball is white and one is
red.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐/𝟏𝟑, 𝟓/𝟏𝟑
H 13 There are 5 yellow, 2 red, and 3 white balls in the box. Three balls are
randomly selected from the box. Find the probability of the following
events. (a) all are of different color, (b) 2 yellow and 1 red color, (c) all are
of same color.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟐𝟓, 𝟎. 𝟏𝟔𝟔𝟕, 𝟎. 𝟎𝟗𝟏𝟕
C 14 An urn contains 6 green, 4 red and 9 black balls. If 3 balls are drawn at
random, find the probability that at least one is green.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟔𝟖𝟑/𝟗𝟔𝟗
T 15 A box contains 6 red balls, 4 white balls, 5 black balls. A person draws 4
balls from the box at random. Find the probability that among the balls
drawn there is at least one ball of each color.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟓𝟐𝟕𝟓
T 16 A machine produces a total of 12000 bolts a day, which are on the average
3% defective. Find the probability that out of 600 bolts chosen at random, 12
will be defective.
𝐀𝐧𝐬𝐰𝐞𝐫: $\frac{\binom{360}{12}\binom{11640}{588}}{\binom{12000}{600}}$
C 17 If 5 of 20 tires in storage are defective and 5 of them are randomly chosen
for inspection (that is, each tire has the same chance of being selected),
what is the probability that exactly two of the defective tires will be included? [W-19 (4)]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟐𝟗𝟑𝟓
C 18 A room has three lamp sockets. From a collection of 10 light bulbs, of which
only 6 are good, a person selects 3 at random and puts them in the sockets.
What is the probability that the room will have light?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐𝟗/𝟑𝟎
H 19 Do as directed:
(a) Find the probability that there will be 5 Sundays in the month of July.
(b) Find the probability that there will be 5 Sundays in the month of June.
(c) What is the probability that a non-leap year contains 53 Sundays?
(d) What is the probability that a leap year contains 53 Sundays?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑/𝟕, 𝟐/𝟕, 𝟏/𝟕, 𝟐/𝟕
H 20 If A and B are two mutually exclusive events with P(A) = 0.30, P(B) =
0.45. Find the probability of A′ , A ∩ B, A ∪ B, A′ ∩ B ′ .
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟕, 𝟎, 𝟎. 𝟕𝟓, 𝟎. 𝟐𝟓
C 21 The probability that a student passes a physics test is 2/3 and the probability
that he passes both physics and English tests is 14/45. The probability that he
passes at least one test is 4/5. What is the probability that he passes the
English test?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟒/𝟗
C 23 Two dice are thrown together. Find the probability that the sum is divisible
by 2 or 3.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟔𝟔𝟔𝟕
H 25 An integer is chosen at random from the first 200 positive integers. What
is the probability that the integer is divisible by 6 or 8?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟐𝟓
T 27 Four letters of the word THURSDAY are arranged in all possible ways. Find
the probability that the word formed is HURT.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏/𝟏𝟔𝟖𝟎
H 28 A class has 10 boys and 5 girls. Three students are selected at random one
after the other. Find the probability that
(a) First two are boys and third is girl.
(b) First and third of same gender and second is of opposite gender.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏𝟓/𝟗𝟏, 𝟓/𝟐𝟏
H 30 A market survey was conducted in four cities to find out the preference for
brand A soap. The responses are shown below:
H 31 If 3 balls are “randomly drawn” from a bowl containing 6 white and 5 black
balls, what is the probability that one of the balls is white and the other
two black? [W-19 (3)]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟑𝟔𝟑𝟔
T 32 A card from a pack of 52 cards is lost. From the remaining cards of pack,
two cards are drawn and are found to be hearts. Find the probability of the
missing card to be a heart.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏𝟏/𝟓𝟎
❖ CONDITIONAL PROBABILITY
✓ Let S be a sample space and A and B be any two events in S. Then the probability of the
occurrence of event A when it is given that B has already occurred is defined as
$P(A/B) = \frac{P(A \cap B)}{P(B)}; \quad P(B) > 0.$
Similarly,
$P(B/A) = \frac{P(B \cap A)}{P(A)}; \quad P(A) > 0.$
✓ Properties:
P(A1 ∪ A2 ⁄B) = P(A1 ⁄B) + P(A2 ⁄B) − P(A1 ∩ A2 ⁄B); P(B) > 0.
P(A ∩ B) = P(A) ⋅ P(B⁄A); P(A) > 0 or P(A ∩ B) = P(B) ⋅ P(A⁄B); P(B) > 0.
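The definition and the multiplication rule can be illustrated by brute-force enumeration. The sketch below assumes an experiment that is not taken from the notes (two fair dice, A = "sum is at least 10", B = "first die shows 6"); all names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered outcomes of two fair dice (36 equiprobable points).
sample_space = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
    """Classical probability: favorable outcomes / total outcomes."""
    favorable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favorable, len(sample_space))


def conditional(event_a, event_b) -> Fraction:
    """P(A/B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    p_b = prob(event_b)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(lambda o: event_a(o) and event_b(o)) / p_b


def sum_at_least_10(o):
    return o[0] + o[1] >= 10


def first_is_six(o):
    return o[0] == 6


p_a_given_b = conditional(sum_at_least_10, first_is_six)
p_a_and_b = prob(lambda o: sum_at_least_10(o) and first_is_six(o))

print(p_a_given_b)                                   # 1/2
print(p_a_and_b == prob(first_is_six) * p_a_given_b)  # True (multiplication rule)
```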
❖ INDEPENDENT EVENTS
✓ Let A and B be any two events of a sample space S, then A and B are called independent
events if P(A ∩ B) = P(A) ⋅ P(B) .
✓ This means that the probability of A does not depend on the occurrence or nonoccurrence
of B, and conversely.
❖ REMARKS
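✓ Events A, B and C are said to be mutually independent if
➢ P(A ∩ B) = P(A) ⋅ P(B), P(B ∩ C) = P(B) ⋅ P(C), P(C ∩ A) = P(C) ⋅ P(A) and
P(A ∩ B ∩ C) = P(A) ⋅ P(B) ⋅ P(C). The first three conditions alone give only pairwise independence.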
C 1 If P(A) = 1/3, P(B) = 3/4 and P(A ∪ B) = 11/12, find P(A⁄B).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐/𝟗
H 2 If P(A) = 1/3, P(B) = 1/4, P(A ∪ B) = 1/2, then find P(B/A), P(A/B’).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏/𝟒, 𝟏/𝟑
C 3 If P(A) = 1/3, P(B′) = 1/4, P(A ∩ B) = 1/6, then find P(A ∪ B), P(A′ ∩ B′) and
P(A′⁄B′).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏𝟏/𝟏𝟐, 𝟏/𝟏𝟐, 𝟏/𝟑
C 4 A card is drawn from a well-shuffled deck of 52 cards and then second card
is drawn, find the probability that one card is a spade and then second card
is club if the first card is not replaced.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏𝟑/𝟐𝟎𝟒
H 5 In a group of 200 students 40 are taking English, 50 are taking math, 12 are
taking both. (a) if a student is selected at random, what is the probability
that the student is taking English? (b) a student is selected at random from
those taking math. What is the probability that the student is taking
English? (c) a student is selected at random from those taking English,
what is the probability that the student is taking math?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟐𝟎, 𝟎. 𝟐𝟒, 𝟎. 𝟑
H 6 In a box, 100 bulbs are supplied out of which 10 bulbs have defects of type
A, 5 bulbs have defects of type B and 2 bulbs have defects of both the type.
Find the probability that (a) a bulb to be drawn at random has a B type
defect under the condition that it has an A type defect, (b) a bulb to be
drawn at random has no B type defect under the condition that it has no A
type defect.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟐, 𝟎. 𝟗𝟔𝟔𝟕
H 8 Two integers are selected at random from 1 to 11. If the sum is even, find
the probability that both the integers are odd.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟔
C 9 From a bag containing 4 white and 6 black balls, two balls are drawn at
random. If the balls are drawn one after the other without replacements,
find the probability that one is white and one is black.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟒/𝟏𝟓
C 10 In producing screws, let A mean “screw too slim” and B “screw too short”.
Let p(A) = 0.1 and P(B ∩ A) = 0.02. A screw, selected randomly, is of type
A, what is probability that a screw is of type B.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟐
H 11 A bag contains 6 white, 9 black balls. 4 balls are drawn at a time. Find the
probability for first draw to give 4 white & second draw to give 4 black balls
in each of following cases.
(a) The balls are replaced before the second draw.
(b) The balls are not replaced before the second draw.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟔/𝟓𝟗𝟏𝟓, 𝟑/𝟕𝟏𝟓
H 12 For two independent events A & B if P(A) = 0.3, P(A ∪ B) = 0.6, find P(B).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟒𝟐𝟖𝟔
H 13 If A, B are independent events and P(A) = 1/4, P(B) = 2/3. Find P(A ∪ B).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟕𝟓
C 14 If A and B are independent events, with P(A) = 3/8, P(B) = 7/8. Find
P(A ∪ B), P(A⁄B) and P(B⁄A).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟓𝟗/𝟔𝟒, 𝟑/𝟖, 𝟕/𝟖
T 18 If A and B are independent events with P(A) = 0.26 , P(B) = 0.45, find
(a) P(A ∩ B); (b) P(A ∩ B̅); (c) P(A̅ ∩ B̅).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟏𝟏𝟕, 𝟎. 𝟏𝟒𝟑, 𝟎. 𝟒𝟎𝟕
H 19 Show that A and B are independent events if P(A) = 0.20 , P(B) = 0.40 and
P(A ∪ B) = 0.50.
❖ TOTAL PROBABILITY
✓ If B1 & B2 are two mutually exclusive and exhaustive events of sample space S
and P(B1), P(B2) ≠ 0, then for any event A,
$P(A) = P(B_1)\,P(A/B_1) + P(B_2)\,P(A/B_2).$
✓ If B1, B2 and B3 are mutually exclusive and exhaustive events and P(B1), P(B2), P(B3) ≠ 0,
then for any event A,
$P(A) = P(B_1)\,P(A/B_1) + P(B_2)\,P(A/B_2) + P(B_3)\,P(A/B_3).$
❖ BAYES’ THEOREM
✓ Let B1, B2, B3, …, Bn be n mutually exclusive and exhaustive events of a sample space S and
let A be any event such that P(A) ≠ 0, then
$P(B_i/A) = \frac{P(B_i)\,P(A/B_i)}{P(B_1)\,P(A/B_1) + P(B_2)\,P(A/B_2) + \cdots + P(B_n)\,P(A/B_n)}.$
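Total probability and Bayes' theorem can be sketched together in a few lines. The function names below are illustrative assumptions, and the sample numbers (a population that is 40% male and 60% female, with 50% of males and 30% of females smoking) are used only to exercise the formula.

```python
from fractions import Fraction


def total_probability(priors, likelihoods):
    """P(A) = sum of P(B_i) * P(A/B_i) over a mutually exclusive, exhaustive B_1..B_n."""
    return sum(p * l for p, l in zip(priors, likelihoods))


def bayes_posteriors(priors, likelihoods):
    """Return [P(B_1/A), ..., P(B_n/A)] by Bayes' theorem; requires P(A) != 0."""
    evidence = total_probability(priors, likelihoods)
    if evidence == 0:
        raise ValueError("P(A) must be non-zero")
    return [p * l / evidence for p, l in zip(priors, likelihoods)]


priors = [Fraction(40, 100), Fraction(60, 100)]       # P(male), P(female)
likelihoods = [Fraction(50, 100), Fraction(30, 100)]  # P(smoker/male), P(smoker/female)

print(total_probability(priors, likelihoods))    # 19/50
print(bayes_posteriors(priors, likelihoods)[0])  # 10/19 = P(male/smoker)
```

Using `Fraction` keeps the arithmetic exact, so the posterior comes out as 10/19 rather than a rounded decimal.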
C 1 Consider two boxes, first with 5-green & 2-pink and second with 4-green
& 3-pink balls. Two balls are selected from random box. If both balls are
pink, find the probability that they are from second box.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑/𝟒
H 3 There are three boxes. Box I contains 10 light bulbs of which 4 are
defective. Box II contains 6 light bulbs of which 1 is defective and box III
contains 8 light bulbs of which 3 are defective. A box is chosen and a bulb
is drawn. Find the probability that the bulb is defective.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟑𝟏𝟑𝟗
T 4 An urn contains 10 white and 3 black balls, while another urn contains 3
white and 5 black balls. Two balls are drawn from the first urn and put into
the second urn and then a ball is drawn from the latter. What is the
probability that it is a white ball?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟓𝟗/𝟏𝟑𝟎
C 5 Suppose that the population of a certain city is 40% male & 60% female.
Suppose also that 50% of males & 30% of females smoke. Find the
probability that a smoker is male.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏𝟎/𝟏𝟗
H 6 A microchip company has two machines that produce the chips. Machine-I
produces 65% of the chips, but 5% of its chips are defective. Machine-II
produces 35% of the chips, but 15% of its chips are defective. A chip is
selected at random and found to be defective. What is the probability that
it came from Machine-I? [W-19 (4)]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟑𝟖𝟐𝟒
H 9 There are two boxes A and B containing 4 white, 3 red and 3 white, 7 red
balls respectively. A box is chosen at random and a ball is drawn from it, if
the ball is white, find the probability that it is from box A.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟒𝟎/𝟔𝟏
H 10 Urn A contain 1 white, 2 black, 3 red balls; Urn B contain 2 white, 1 black,
1 red balls; Urn C contain 4 white, 5 black, 3 red balls. One urn is chosen at
random & two balls are drawn. These happen to be one white & one red.
What is probability that they come from urn A?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟐𝟕𝟗𝟕
C 11 Three hospitals contain 10%, 20% and 30% of diabetes patients. A Patient
is selected at random who is diabetes patient. Determine the probability
that this patient comes from first hospital.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟏𝟔𝟔𝟕
C 12 In a computer engineering class, 5% of the boys and 10% of the girls have
an IQ of more than 150. In this class, 60% of student are boys. If a student
is selected random and found to have IQ more than 150, find the
probability that the student is a boy.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑/𝟕
H 13 A factory has three machines X, Y, Z producing 1000, 2000, 3000 bolts per
day respectively. Machine X produces 1% defective bolts, Y produces 1.5%,
Z produces 2% defective bolts. At end of the day, a bolt is drawn at random
and it is found to be defective. What is the probability that this defective
bolt has been produced by the machine X?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟏
T 14 Suppose there are three chests each having two drawers. The first chest
has a gold coin in each drawer, the second chest has a gold coin in one
drawer and a silver coin in the other drawer and the third chest has a silver
coin in each drawer. A chest is chosen at random and a drawer opened. If
the drawer contains a gold coin, what is the probability that the other
drawer also contains a gold coin?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐/𝟑
C 16 An insurance company insured 2000 bike drivers, 4000 car drivers and
6000 truck drivers. The probability of an accident involving a bike driver,
a car driver and a truck driver is 0.10, 0.03 and 0.15 respectively. One of
the insured persons meets with an accident. What is the probability that he
is a bike driver?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟏𝟔𝟑𝟗
❖ RANDOM VARIABLE
✓ A random variable is a variable whose value is unknown or a function that assigns values
to each of an experiment’s outcomes. Random variables are often designated by capital
letters X, Y.
1) Discrete random variables, which are variables that take specific, countable values.
2) Continuous random variables, which are variables that can take any value within
a continuous range.
X x1 x2 x3 … … xn
P(X) p(x1 ) p( x 2 ) p(x 3 ) … … p(xn )
✓ Example: Two balanced coins are tossed, find the probability distribution for heads.
➢ P(X = 2) = P(two heads) = 1/4 = 0.25.
X 0 1 2
P(X=x) 0.25 0.5 0.25
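The table above can be reproduced by listing the four equally likely outcomes and counting heads in each. This is a small illustrative sketch; the variable names are assumptions made here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of tossing two balanced coins: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# X = number of heads in each outcome.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# P(X = x) = (number of outcomes with x heads) / (total outcomes).
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(head_counts.items())}

for x, p in distribution.items():
    print(x, float(p))  # 0 0.25 / 1 0.5 / 2 0.25
```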
✓ A discrete random variable is one which can assume any of a set of possible values which can be
counted or listed.
✓ A discrete random variable is a random variable with a finite (or countably infinite) range.
✓ A continuous random variable is one, which can assume any of infinite spectrum of
different values across an interval which cannot be counted or listed.
❖ PROBABILITY FUNCTION
✓ If for random variable X, the real valued function f(x) is such that P(X = x) = f(x), then f(x)
is called Probability function of random variable X.
✓ Probability function f(x) gives the measures of probability for different values of X say
x1 , x2 , … . , xn .
✓ Probability functions can be classified as (1) Probability Mass Function (P. M. F.) or (2)
Probability Density Function (P. D. F.).
✓ Conditions:
➢ $\sum_{i=1}^{n} p(x_i) = 1.$
✓ Conditions:
➢ $P(a < x < b) = \int_a^b f(x)\,dx.$
❖ MATHEMATICAL EXPECTATION
✓ If X is a discrete random variable having various possible values x1, x2, …, xn & if P(X) is
the probability mass function, the mathematical expectation of X is defined & denoted by
$E(X) = \sum_{i=1}^{n} x_i \cdot P(x_i).$
✓ E(X) is also called the mean value of the probability distribution of x and is denoted by μ.
✓ Properties:
➢ If X and Y are two independent random variables, then E(X ∙ Y) = E(X) ∙ E(Y).
✓ If X is a discrete random variable (or continuous random variable) with probability mass
function P(X) (or probability density function), then the expected value of $[X - E(X)]^2$ is called
the variance of X and it is denoted by V(X).
✓ Properties:
➢ If X and Y are the independent random variables, then V(X + Y) = V(X) + V(Y).
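The definitions of E(X) and V(X) for a discrete random variable can be sketched as follows. The helper names are illustrative, and the sample distribution is the number of heads in two tosses of a balanced coin (values 0, 1, 2 with probabilities 0.25, 0.5, 0.25).

```python
from fractions import Fraction


def expectation(values, probs):
    """E(X) = sum of x_i * P(x_i) for a discrete random variable."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probs))


def variance(values, probs):
    """V(X) = E[(X - E(X))^2]."""
    mu = expectation(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))


values = [0, 1, 2]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

print(expectation(values, probs))  # 1
print(variance(values, probs))     # 1/2
```

Exact fractions are used so that the check `sum(probs) != 1` is not disturbed by floating-point rounding.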
$\sum_{i=1} P(x_i) = 1.$
✓ Where x is any integer. The function F(x) is also called the cumulative distribution function.
The set of pairs {xi , F(x)}, i = 1,2, … is called the cumulative probability distribution.
X x1 x2 …
F(x) P(x1 ) P(x1 ) + P(x2 ) …
➢ P (X > x) = 1 − F(x).
➢ $F'(x) = \frac{d}{dx} F(x) = f(x), \quad f(x) \ge 0.$
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐲𝐞𝐬, 𝐧𝐨
X −5 −1 0 1 5 8
P(X = x) 0.12 0.16 0.28 0.22 0.12 0.1
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟖𝟔
C 3 The following table gives the probabilities that a certain computer will
malfunction 0, 1, 2, 3, 4, 5 or 6 times on any one day.
Number of malfunctions x   0     1     2     3     4     5     6
Probability f(x)           0.17  0.29  0.27  0.16  0.07  0.03  0.01
Find the mean and variance of this probability distribution.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏. 𝟖, 𝟏. 𝟖
X     0   1    2     3   4     5
P(x)  p   1/5  1/10  p   1/20  1/20
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐏 = 𝟎. 𝟑𝟎, 𝐄(𝐱) = 𝟏. 𝟕𝟓𝟎𝟎
X         0  1  2   3   4   5    6     7
P(X = x)  0  k  2k  2k  3k  k²   2k²   7k² + k
Find the value of k and then evaluate P(X < 6), P(X ≥ 6) and P(0 < X < 5).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟏, 𝟎. 𝟖𝟏, 𝟎. 𝟏𝟗, 𝟎. 𝟖
X 1 2 3 4
P(X) 0.1 0.2 0.5 0.2
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐. 𝟖, 𝟎. 𝟕𝟔, 𝟎. 𝟖𝟕𝟏𝟖, 𝟏𝟎. 𝟒, 𝟔. 𝟖𝟒
X 1 2 3 4
P(X) 0.1 0.2 0.5 0.2
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟗, 𝟎. 𝟕
X     −2    −1   0  1    2
P(x)  1/12  1/3  a  1/4  1/6
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏/𝟔, 𝟏/𝟏𝟐, 𝟏𝟗/𝟔, 𝟒𝟑/𝟏𝟐, 𝟐𝟐𝟕/𝟏𝟒𝟒, 𝟐𝟐𝟕/𝟏𝟔
X     −1    0     1  2     3
P(x)  3/10  1/10  k  3/10  1/10
Find k, E(X), E(4X + 3), E(X²), V(X), V(2X + 3).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏/𝟓, 𝟒/𝟓, 𝟑𝟏/𝟓, 𝟏𝟑/𝟓, 𝟒𝟗/𝟐𝟓, 𝟏𝟗𝟔/𝟐𝟓
C 10 If P(x) = (2x + 1)/48, x = 1, 2, 3, 4, 5, 6, verify whether P(x) is a probability
function.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐲𝐞𝐬
H 11 If P(X = x) = x/15, x = 1 to 5, find P(1 or 2) & P({0.5 < X < 2.5}/{X > 1}).
C 14 Three balanced coins are tossed, find the mathematical expectation of tails.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏. 𝟓
T 15 4 raw mangoes are mixed accidentally with the 16 ripe mangoes. Find the
probability distribution of the raw mangoes in a draw of 2 mangoes.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟔𝟎/𝟗𝟓, 𝟑𝟐/𝟗𝟓, 𝟑/𝟗𝟓
H 17 In a business, the probability that a trader can get profit of Rs. 5000 is 0.4
and probability for loss of Rs. 2000 is 0.6. Find his expected gain or loss.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟖𝟎𝟎
C 18 There are 8 apples in a box, of which 2 are rotten. A person selects 3 Apples
at random from it. Find the expected value of the rotten Apples.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟕𝟓
C 19 There are 3 red and 2 white balls in a box and 2 balls are taken at random
from it. A person gets Rs. 20 for each red ball and Rs. 10 for each white ball.
Find his expected gain.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑𝟐
H 20 There are 10 bulbs in a box, out of which 4 are defectives. If 3 bulbs are
taken at random, find the expected number of defective bulbs.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏. 𝟐
C 21 (a) A contestant tosses a coin and receives $5 if head appears and $1 if tail
appears. What is the expected value of a trial?
(b) A contestant receives $4.00 if a coin turns up heads and pays $3.00 if it
turns up tails. What is the expected value of a trial?
𝐀𝐧𝐬𝐰𝐞𝐫: $𝟑. 𝟎𝟎, $𝟎. 𝟓𝟎
C 22 Find the constant c such that the function
$f(x) = \begin{cases} c x^2 & ; 0 < x < 3 \\ 0 & ; \text{elsewhere} \end{cases}$
is a probability density function and compute P(1 < X < 2).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏/𝟗, 𝟕/𝟐𝟕
T 27 For the probability function $f(x) = \frac{k}{1 + x^2}$, −∞ < x < ∞, find k.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏/𝛑
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐘𝐞𝐬
T 29 The life in hours of a certain kind of radio tube has the probability density
$f(x) = \begin{cases} \frac{100}{x^2} & ; \text{for } x \ge 100 \\ 0 & ; \text{elsewhere} \end{cases}$
Find the distribution function and use it to
determine the probability that the life of a tube is more than 150 hrs.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐅(𝐱) = 𝟏 − 𝟏𝟎𝟎/𝐱, 𝐏(𝐱 > 𝟏𝟓𝟎) = 𝟐/𝟑
✓ Example: Consider the experiment of tossing a coin twice. The sample space is S = {HH, HT,
TH, TT}. Let X denote the number of heads obtained in the first toss and Y denote the number
of heads obtained in the second toss. Then
S HH HT TH TT
X(S) 1 1 0 0
Y(S) 1 0 1 0
✓ Here, (X, Y) is a two-dimensional random variable and the range space of (X, Y) is {(1, 1),
(1, 0), (0, 1), (0, 0)} which is finite & so (X, Y) is a two-dimensional discrete random
variable. Further,
$P\left(x - \frac{dx}{2} \le X \le x + \frac{dx}{2},\ y - \frac{dy}{2} \le Y \le y + \frac{dy}{2}\right) = f(x, y)\,dx\,dy.$
✓ Here f(x, y) is called the joint probability density function of (X, Y), provided f(x, y) ≥ 0
for all (x, y) ∈ D, where D is the range space, and the integral of f(x, y) over D equals 1.
✓ Example: Joint probability density function of two random variables X & Y is given by
$f(x, y) = \begin{cases} \frac{x^2 - xy}{8} & ; 0 < x < 2 \text{ and } -x < y < x \\ 0 & ; \text{otherwise} \end{cases}$
❖ CONDITIONAL DENSITY
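✓ Conditional density of X given Y = y:
$f(x \mid y) = \frac{f(x, y)}{f_Y(y)}$, where $f_Y(y) > 0$ is the marginal probability density function of Y
and f(x, y) is the joint probability density function of (X, Y).
✓ Conditional density of Y given X = x:
$f(y \mid x) = \frac{f(x, y)}{f_X(x)}$, where $f_X(x) > 0$ is the marginal probability density function of X
and f(x, y) is the joint probability density function of (X, Y).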
✓ Continuous case:
$E(X) = \iint_R x\, f(x, y)\,dx\,dy$ and $E(Y) = \iint_R y\, f(x, y)\,dy\,dx$ (where R is the given region)
H 1 X, Y are two random variables with joint mass function P(x, y) = (1/27)(2x +
H 3 Let P(X = 0, Y = 1) = 1/3, P(X = 1, Y = −1) = 1/3, P(X = 1, Y = 1) = 1/3. Is it
the joint probability mass function of X and Y? If yes, find the marginal
probability function of X and Y.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐘𝐞𝐬, 𝐏𝐗 (𝟎) = 𝟏/𝟑, 𝐏𝐗 (𝟏) = 𝟐/𝟑 & 𝐏𝐘 (−𝟏) = 𝟏/𝟑, 𝐏𝐘 (𝟏) = 𝟐/𝟑
1 2 2 3
X=0 0 0
32 32 32 32
1 1 1 1 1 1
X=1
16 16 8 8 8 8
1 1 1 1 2
X=2 0
32 32 64 64 64
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏/𝟏𝟔, 𝟕/𝟖, 𝟐𝟑/𝟔𝟒, 𝟗/𝟏𝟔, 𝐍𝐨
1 2 2 3
X=0 0 0
32 32 32 32
1 1 1 1 1 1
X=1
16 16 8 8 8 8
1 1 1 1 2
X=2 0
32 32 64 64 64
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏𝟏/𝟔𝟒, 𝟑/𝟏𝟔, 𝟑/𝟏𝟔
H 8 Three balanced coins are tossed. Let X denote the number of heads on the
first two coins and Y denote the number of tails on the last two coins. Find
the joint distribution of X and Y.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐏(𝟎, 𝟏) = 𝟏/𝟖, 𝐏(𝟎, 𝟐) = 𝟏/𝟖, 𝐏(𝟏, 𝟎) = 𝟏/𝟖, 𝐏(𝟏, 𝟏) = 𝟏/𝟒,
𝐏(𝟏, 𝟐) = 𝟏/𝟖, 𝐏(𝟐, 𝟏) = 𝟏/𝟖, 𝐏(𝟐, 𝟐) = 𝟏/𝟖
C 9 Check whether
$f(x, y) = \begin{cases} \frac{1}{8}(6 - x - y) & ; 0 \le x < 2,\ 2 \le y < 4 \\ 0 & ; \text{otherwise} \end{cases}$
is a probability density function or not.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐲𝐞𝐬
𝐀𝐧𝐬𝐰𝐞𝐫: (𝐚) 𝟏, (𝐛) 𝐗: 𝟑/𝟒 + 𝐱/𝟐 ; (𝟎 ≤ 𝐱 < 𝟏) & 𝐘: 𝟑/𝟒 + 𝐲/𝟐 ; (𝟎 ≤ 𝐲 < 𝟏), (𝐜) 𝟏𝟑/𝟔𝟒
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑/𝟖, 𝟏/𝟏𝟎
⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆
❖ BERNOULLI TRIALS
✓ Suppose a random experiment has two possible outcomes, which are complementary, say
success (S) and failure (F). If the probability p(0 < p < 1) of getting success at each of the
n trials of this experiment is constant, then the trials are called Bernoulli trials.
❖ BINOMIAL DISTRIBUTION
✓ A random experiment consists of n Bernoulli trials such that
➢ Each trial results in only two possible outcomes, labeled as success and failure.
✓ The random variable X that equals the number of trials that result in a success is a
binomial random variable with parameters 0 < p < 1, q = 1 − p and n = 1, 2, 3, …. The
probability mass function of X is
$P(X = x) = \binom{n}{x} p^x q^{n-x}; \quad x = 0, 1, 2, \ldots, n.$
➢ Number of oil wells yielding natural gas in a group of n wells test drilled.
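The binomial probability mass function can be evaluated directly from its formula. The sketch below uses illustrative function names and, as sample input, n = 4 trials with success probability p = 0.2, asking for at most 2 successes.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)


def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x), obtained by summing the pmf from 0 to x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


print(round(binomial_cdf(2, 4, 0.2), 4))  # 0.9728
```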
✓ NOTE:
C 1 12% of the tablets produced by a tablet machine are defective. What is the
probability that out of a random sample of 20 tablets produced by the
machine, 5 are defective?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟎𝟓𝟔𝟕
C 2 20% of the bulbs produced are defective. Find the probability that at most 2
bulbs out of 4 bulbs are defective.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟗𝟕𝟐𝟖
C 4 The probability that India wins a cricket test match against Australia is
given to be 1/3. If India and Australia play 3 test matches, what is the
probability that (a) India will lose all the three test matches? (b) India will
win at least one test match?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟐𝟗𝟔𝟑, 𝟎. 𝟕𝟎𝟑𝟕
H 8 The probability of a man hitting a target is 1/3. If he fires 6 times, what is the
probability of hitting (a) at most 5 times? (b) at least 5 times? (c) exactly
once?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟗𝟗𝟖𝟔, 𝟎. 𝟎𝟏𝟕𝟗, 𝟎. 𝟐𝟔𝟑𝟒
H 12 Out of 2000 families with 4 children each, how many would you expect to
have (a) at least 1 boy, (b) 2 boys, (c) 1 or 2 girls, (d) no girls? Assume
equal probabilities for boys and girls.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏𝟖𝟕𝟓, 𝟕𝟓𝟎, 𝟏𝟐𝟓𝟎, 𝟏𝟐𝟓
C 13 Out of 800 families with 4 children each, how many would you expect to
have (a) 2 boys and 2 girls? (b) at least 1 boy? (c) at most 2 girls? (d) no
girls? Assume equal probabilities for boys and girls. [W-19 (7)]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑𝟎𝟎, 𝟕𝟓𝟎, 𝟓𝟓𝟎, 𝟓𝟎
C 17 Find the probability that in five tosses of a fair die, 3 will appear (a) twice,
(b) at most once, (c) at least two times.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟔𝟐𝟓/𝟑𝟖𝟖𝟖, 𝟑𝟏𝟐𝟓/𝟑𝟖𝟖𝟖, 𝟕𝟔𝟑/𝟑𝟖𝟖𝟖
C 21 For the binomial distribution with n = 20, p = 0.35. Find mean, variance
and standard deviation.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟕, 𝟒. 𝟓𝟓, 𝟐. 𝟏𝟑𝟑𝟏
H 22 If the probability of a defective bolt is 0.1, find mean and standard deviation
of the distribution of defective bolts in a total of 400.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟒𝟎, 𝟔
H 24 Each sample of water has a 10% chance of containing a particular organic
pollutant. Assume that the samples are independent with regard to the
presence of the pollutant. Find the probability that in the next 18 samples,
at least 4 samples contain the pollutant. [W-19 (3)]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟎𝟗𝟖𝟐
H 25 The probability that one of the ten telephone lines is busy at an instant is
0.2. (a) What is the probability that 5 of the lines are busy? (b) What is the
probability that all the lines are busy?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟎𝟐𝟔𝟒, 𝟎. 𝟎𝟎𝟎𝟎𝟎𝟎𝟏
T 26 A safety engineer feels that 30% of all industrial accidents in her plant are
caused by failure of employees to follow instructions. If this figure is
correct, find approximately the probability that among 84 industrial
accidents in this plant anywhere from 20 to 30 (inclusive) will be due to
failure of employees to follow instructions.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟖𝟏𝟎𝟐
❖ POISSON DISTRIBUTION
✓ A discrete random variable X is said to follow a Poisson distribution if it assumes only
non-negative integer values and its probability mass function is given by
$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}; \quad x = 0, 1, 2, 3, \ldots$ & λ = mean of the Poisson distribution.
➢ The mean and variance of the Poisson distribution with parameter λ are both equal to λ:
Mean = λ and Variance = λ.
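The Poisson probability mass function is equally direct to evaluate. The sketch below assumes, as is usual for rare events, that λ is taken as n·p; the sample figures (3000 persons, probability 0.001 each) give λ = 3, and the function name is an illustrative choice.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) = e^(-lam) * lam^x / x! for x = 0, 1, 2, ..."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 3000 * 0.001  # assumed parameter: lam = n * p = 3

p_exactly_3 = poisson_pmf(3, lam)
# P(X > 2) = 1 - [P(0) + P(1) + P(2)]
p_more_than_2 = 1 - sum(poisson_pmf(k, lam) for k in range(3))

print(round(p_exactly_3, 4), round(p_more_than_2, 4))  # 0.224 0.5768
```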
H 4 100 Electric bulbs are found to be defective in a lot of 5000 bulbs. Find the
probability that at the most 3 bulbs are defective in a box of 100 bulbs.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐏(𝐗 ≤ 𝟑) = 𝟎. 𝟖𝟓𝟕𝟏
H 6 The probability that a person catches corona virus is 0.001. Find the
probability that out of 3000 persons (a) exactly 3, (b) more than 2 persons
will catch the virus.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟐𝟐𝟒𝟎, 𝟎. 𝟓𝟕𝟔𝟖
H 9 A car hire firm has two cars, which are hired out day by day. The number
of demands for a car on each day follows a Poisson distribution
with mean 1.5. Calculate the proportion of days on which neither car is
used and the proportion of days on which some demand is refused.
($e^{-1.5} = 0.2231$).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐏(𝐗 = 𝟎) = 𝟎. 𝟐𝟐𝟑𝟏, 𝟏 − 𝐏(𝐗 ≤ 𝟐) = 𝟎. 𝟏𝟗𝟏𝟐
C 13 For Poisson variant X, if P(X = 3) = P(X = 4), then find P(X = 0).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐏(𝐗 = 𝟎) = $e^{-4}$
H 14 For Poisson variant X, if P(X = 1) = P(X = 2). Find mean and standard
deviation of this distribution. Also, find P(X = 3).
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐, √𝟐 , 𝟎. 𝟏𝟖𝟎𝟒
H 15 Assume that the probability that a wafer contains a large particle of
contamination is 0.01 and that the wafers are independent; that is, the
probability that a wafer contains a large particle is not dependent on the
characteristics of any of the other wafers. If 15 wafers are analyzed, what
is the probability that no large particles are found? [W-19 (3)]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟖𝟔𝟎𝟕
H 16 A publisher of nontechnical books takes great pains to ensure that its
books are free of typographical errors, so that the probability of any given
page containing at least one such error is 0.005 and errors are independent
from page to page. What is the probability that one of its 400-page novels
will contain (a) exactly one page with errors? (b) at most three pages with
errors? [W-19 (7)]
𝐀𝐧𝐬𝐰𝐞𝐫: (𝐚) 𝟎. 𝟐𝟕𝟎𝟕, (𝐛) 𝟎. 𝟖𝟓𝟕𝟏
T 18 The number of flaws in a fiber optic cable follows a Poisson process with
an average of 0.6 per 100 feet.
(i) Find the probability of exactly 2 flaws in a 200 feet cable.
(ii) Find the probability of exactly 1 flaw in the first 100 feet and exactly 1
flaw in the second 100 feet.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟐𝟏𝟔𝟗, 𝟎. 𝟑𝟐𝟗𝟑
❖ EXPONENTIAL DISTRIBUTION
✓ A random variable X is said to have an Exponential distribution with parameter θ > 0, if its
probability density function is given by
$f(x) = \begin{cases} \theta e^{-\theta x} & ; x \ge 0 \\ 0 & ; \text{otherwise} \end{cases}$
✓ Here, θ = 1/mean, or mean = 1/θ, and variance = 1/θ².
✓ Exponential distribution can be used to describe (waiting) times between Poisson events.
17000? Assume that the income tax is levied at the rate of 15% on the
income above Rs. 15000.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐞−𝟏𝟎𝟎
❖ GAMMA DISTRIBUTION
✓ A random variable X is said to have a Gamma distribution with parameters r, θ > 0, if its
probability density function is given by
$f(x) = \begin{cases} \frac{\theta^r x^{r-1} e^{-\theta x}}{\Gamma(r)} & ; x \ge 0 \\ 0 & ; \text{otherwise} \end{cases}$
➢ Here, mean = r/θ and variance = r/θ².
The city has daily stock of 30000 liters. What is the probability that the
stock is insufficient on a particular day?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟕𝟑𝟔
Compute the probability that you will have to wait between 2 to 4 hours
before you catch 4 fish.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟏𝟐𝟑𝟗
❖ NORMAL DISTRIBUTION
✓ A continuous random variable X is said to follow a normal distribution if its probability
density function is given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right]; \quad -\infty < x < \infty \ \&\ \sigma > 0$
✓ If X is a normal random variable with mean μ and standard deviation σ, and if we define the
random variable $Z = \frac{X - \mu}{\sigma}$, which has mean 0 and standard deviation 1, then Z is called the
standard normal variate.
✓ The probability density function for the normal distribution in standard form is given by
$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^2}; \quad -\infty < z < \infty.$
✓ The distribution of any normal variate X can always be transformed into the distribution
of the standard normal variate Z.
$P(x_1 \le X \le x_2) = P\left(\frac{x_1 - \mu}{\sigma} \le \frac{X - \mu}{\sigma} \le \frac{x_2 - \mu}{\sigma}\right) = P(z_1 \le Z \le z_2).$
➢ P(−∞ ≤ z ≤ 0) = P (0 ≤ z ≤ ∞) = 0.5.
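Instead of reading areas from a table, the standardisation step can be sketched numerically. The code below assumes the standard identity Φ(z) = ½[1 + erf(z/√2)] for the standard normal distribution function (it is not stated in the notes); the function names and the sample values μ = 18.2, σ = 1.25 are illustrative.

```python
from math import erf, sqrt


def standard_normal_cdf(z: float) -> float:
    """Phi(z) = P(Z <= z) for the standard normal variate Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_interval_probability(x1: float, x2: float, mu: float, sigma: float) -> float:
    """P(x1 <= X <= x2) via z = (x - mu) / sigma."""
    z1 = (x1 - mu) / sigma
    z2 = (x2 - mu) / sigma
    return standard_normal_cdf(z2) - standard_normal_cdf(z1)


mu, sigma = 18.2, 1.25
p_below = standard_normal_cdf((16.5 - mu) / sigma)                # z = -1.36
p_between = normal_interval_probability(16.5, 18.8, mu, sigma)   # z from -1.36 to 0.48

print(round(p_below, 4))
print(round(p_between, 4))
```

Values computed this way can differ from table-based answers in the fourth decimal place, because table entries are rounded.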
C 1 For a random variable having the normal distribution with μ = 18.2 and
σ = 1.25, find the probabilities that it will take on a value (a) less than 16.5,
(b) between 16.5 and 18.8.[P(z = 1.36) = 0.4131, P(z = 0.48) = 0.1843]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟎𝟖𝟔𝟗, 𝟎. 𝟓𝟗𝟕𝟒
H 4 A sample of 100 dry battery cell tested & found that average life is 12 hours
& standard deviation 3 hours. Assuming data to be normally distributed
what % of battery cells are expected to have life (a) more than 15 hrs.? (b)
less than 6 hrs.? (c) between 10 & 14 hrs.?
[P(z = 1) = 0.3413, P(z = 2) = 0.4773, P(z = 0.67) = 0.2486]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏𝟓. 𝟖𝟕%, 𝟐. 𝟐𝟕%, 𝟒𝟗. 𝟕𝟐%
H 8 Distribution of height of 1000 soldiers is normal with mean 165 cm &
standard deviation 15 cm. How many soldiers are of height (a) less than
138 cm? (b) more than 198 cm? (c) between 138 & 198 cm? W-19 (7)
[P(z = 1.8) = 0.4641, P(z = 2.2) = 0.4861]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑𝟔, 𝟏𝟒, 𝟗𝟓𝟎
T 9 Assuming that the diameters of 1000 brass plugs taken consecutively from
a machine form a normal distribution with mean 0.7515 cm and standard
deviation 0.002 cm. Find the number of plugs likely to be rejected if the
approved diameter is 0.752 ± 0.004 cm.
[P(z = 1.75) = 0.4599, P(z = 2.25) = 0.4878]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟓𝟐
C 11 In a normal distribution, 31% of items are below 45 & 8% are above 64.
Determine the mean and standard deviation of this distribution.
[P(z = 0.22) = 0.19, P(z = 1.41) = 0.42]
𝐀𝐧𝐬𝐰𝐞𝐫: 𝛍 = 𝟒𝟗. 𝟗𝟕𝟑𝟖, 𝛔 = 𝟗. 𝟗𝟒𝟕𝟔
❖ BOUNDS ON PROBABILITIES
✓ If the probability distribution of a random variable is known, E(X) & V(X) can be computed.
Conversely, if only E(X) & V(X) are known, the probability distribution of X cannot be
reconstructed and quantities such as P{|X − E(X)| ≤ K} cannot be evaluated exactly.
✓ Several approximation techniques have been developed to yield upper and/or lower
bounds to such probabilities. The most important such technique is Chebyshev’s
inequality.
❖ CHEBYSHEV’S INEQUALITY
✓ If X is a random variable with mean μ and variance σ², then for any positive number k,
$$P\{|X - \mu| \ge k\sigma\} \le \frac{1}{k^2} \quad \text{OR} \quad P\{|X - \mu| < k\sigma\} \ge 1 - \frac{1}{k^2}$$
T 5 Two unbiased dice are thrown. If X is the sum of the numbers showing up,
prove that $P\{|X - 7| \ge 3\} < \frac{35}{54}$. Compare this with actual probability.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏/𝟑
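The comparison asked for in this exercise can be reproduced by enumerating the 36 equally likely outcomes. This is a sketch only; the variable names are illustrative, and exact fractions are used so the bound and the actual probability can be compared without rounding.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely sums of two fair dice
sums = [a + b for a, b in product(range(1, 7), repeat=2)]
n = len(sums)

mean = Fraction(sum(sums), n)                      # E(X) = 7
var = sum((s - mean) ** 2 for s in sums) / n       # V(X) = 35/6

# |X - 7| >= 3 corresponds to k*sigma = 3, so 1/k^2 = sigma^2 / 9
chebyshev_bound = var / 9
actual = Fraction(sum(1 for s in sums if abs(s - mean) >= 3), n)

print(chebyshev_bound, actual)
```

The Chebyshev bound is much looser than the actual probability, which is typical because the inequality uses only the mean and variance.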
⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆
❖ INTRODUCTION
✓ Statistics is the branch of science where we plan, gather and analyze information about a
particular collection of objects under investigation. Statistical techniques are used in every
other field of science, engineering and the humanities, ranging from computer science to
industrial engineering to sociology and psychology.
✓ For any statistical problem, the initial information collected from the sample may look
messy, and hence confusing. This initial information needs to be organized first before we
can make any sense out of it.
✓ Quantitative data in a mass exhibit certain general characteristics, and they differ from each
other in the following ways:
➢ They show a tendency to concentrate values, usually somewhere in the center of the
distribution. Measures of this tendency are called measures of Central Tendency.
➢ The data vary about a measure of Central tendency and these measures of deviation
are called measures of variation or Dispersion.
❖ UNIVARIATE ANALYSIS
✓ Univariate analysis involves the examination across cases of one variable at a time. There
are five major characteristics of a single variable that we tend to look at:
➢ Distribution
➢ Central Tendency
➢ Dispersion
➢ Skewness
➢ Kurtosis
❖ DISTRIBUTION
✓ Distribution of a statistical data set (or a population) is a listing or function showing all the
possible values (or intervals) of the data and how often they occur.
➢ Let us arrange the marks in ascending order as: 25, 36, 42, 55, 60, 62, 73, 75, 78, 95
➢ We can clearly see that the lowest marks are 25 and the highest marks are 95. The
difference of the highest and the lowest values in the data is called the range of the
data. So, the range in this case is 95 – 25 = 70.
✓ Consider the marks obtained (out of 100 marks) by 30 students of Class-XII of a school:
10, 20, 36, 92, 95, 40, 50, 56, 60, 70, 92, 88, 80, 70, 72
70, 36, 40, 36, 40, 92, 40, 50, 50, 56, 60, 70, 60, 60, 88
➢ Recall that the number of students who have obtained a certain number of marks is
called the frequency of those marks. For instance, 4 students got 70 marks. So the
frequency of 70 marks is 4. To make the data more easily understandable, we write it
in a table, as given below:
x 10 20 36 40 50 56 60 70 72 80 88 92 95
f 1 1 3 4 3 2 4 4 1 1 2 3 1
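The frequency table above can be rebuilt directly from the raw marks. A minimal sketch using the 30 marks listed in the text; `Counter` is only one convenient way to tally the values.

```python
from collections import Counter

marks = [10, 20, 36, 92, 95, 40, 50, 56, 60, 70, 92, 88, 80, 70, 72,
         70, 36, 40, 36, 40, 92, 40, 50, 50, 56, 60, 70, 60, 60, 88]

frequency = Counter(marks)               # value -> number of students
for x in sorted(frequency):
    print(x, frequency[x])

data_range = max(marks) - min(marks)     # highest value minus lowest value
```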
✓ 100 plants each were planted in 100 schools during Van Mahotsav. After one month, the
number of plants that survived were recorded as:
95 67 28 32 65 65 69 33 98 96
76 42 32 38 42 40 40 69 95 92
75 83 76 83 85 62 37 65 63 42
89 65 73 81 49 52 64 76 83 92
93 68 52 79 81 83 59 82 75 82
86 90 44 62 31 36 38 42 39 83
87 56 58 23 35 76 83 85 30 68
69 83 86 43 45 39 83 75 66 83
92 75 89 66 91 27 88 89 93 42
53 69 90 55 66 49 52 83 34 36
➢ These groupings are called ‘classes’ or ‘class-intervals’, and their size is called the
class-size or class width, which is 10 in this case. In each of these classes, the least
number is called the lower-class limit and the greatest number is called the upper-
class limit, e.g., in 20-29, 20 is the ‘lower class limit’ and 29 is the ‘upper class limit’.
➢ Also, recall that using tally marks, the data above can be condensed in tabular form as
follows:
❖ SOME DEFINITIONS
✓ Exclusive Class: If classes of frequency distributions are 0 − 2, 2 − 4, 4 − 6, … (the upper
limit of each class equals the lower limit of the next), such classes are called Exclusive Classes.
✓ Inclusive Class: If classes of frequency distributions are 0 − 2, 3 − 5, 6 − 8, … (both limits
belong to the class), such classes are called Inclusive Classes.
❖ CENTRAL TENDENCY
✓ The central tendency of a distribution is an estimate of the "center" of a distribution of
values. There are three major types of estimates of central tendency:
➢ Mean (x̅)
➢ Median (M)
➢ Mode(Z)
❖ MEAN
✓ The Mean or Average is probably the most commonly used method of describing central
tendency. To compute the mean, add up all the values and divide by the number of values.
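$$\bar{x} = \frac{\sum x_i}{n}$$
$$\bar{x} = \frac{\sum f_i x_i}{n}$$
➢ Mean by assumed mean method: $\bar{x} = A + \frac{\sum f_i d_i}{n}$ ; Where $d_i = x_i - A$
$$\bar{x} = \frac{\sum f_i x_i}{n} \; ; \quad \text{Where } x_i = \text{Mid value of the respective class}$$
➢ Mean by assumed mean method: $\bar{x} = A + \frac{\sum f_i d_i}{n}$ ; Where $d_i = x_i - A$
➢ Mean by step deviation method: $\bar{x} = A + \frac{\sum f_i u_i}{n} \cdot C$ ; Where $u_i = \frac{x_i - A}{C}$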
Marks obtained 20 9 25 50 40 80
Number of students 6 4 16 7 8 2
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑𝟐. 𝟐𝟑
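The three methods give the same mean, which can be confirmed on the table above. This is a sketch under stated assumptions: n is taken as the total frequency Σf, and the assumed mean A = 25 and step size C = 5 are arbitrary illustrative choices.

```python
marks = [20, 9, 25, 50, 40, 80]
freq = [6, 4, 16, 7, 8, 2]
n = sum(freq)                                   # total frequency

# Direct method: sum(f * x) / n
direct = sum(f * x for x, f in zip(marks, freq)) / n

# Assumed mean method: A + sum(f * d) / n, with d = x - A
A = 25
assumed = A + sum(f * (x - A) for x, f in zip(marks, freq)) / n

# Step deviation method: A + (sum(f * u) / n) * C, with u = (x - A) / C
C = 5
step = A + sum(f * ((x - A) / C) for x, f in zip(marks, freq)) / n * C

print(round(direct, 2), round(assumed, 2), round(step, 2))
```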
Marks obtained 18 22 30 35 39 42 45 47
Number of students 4 5 8 8 16 4 2 3
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑𝟒. 𝟓
x 10 20 36 40 50 56 60 70 72 80 88 92 95
f 1 1 3 4 3 2 4 4 1 1 2 3 1
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟓𝟗. 𝟑
H 6 Find the mean if Survey regarding the weights (kg) of 45 students of class X of
a school was conducted and the following data was obtained:
C 7 Find the mean using direct method, assumed mean method and step deviation
method:
T 8 Find the missing frequency from the following data if mean is 19.92.
T 10 A car runs at a speed of 60 km/h over 50 km; the next 30 km at a speed of 40 km/h;
the next 20 km at a speed of 30 km/h; the final 50 km at a speed of 25 km/h. What is the
average speed?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑𝟓. 𝟐𝟗
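Average speed here is total distance divided by total time (a distance-weighted harmonic mean of the speeds), not the arithmetic mean of the four speeds. A short sketch of that calculation:

```python
distances = [50, 30, 20, 50]     # km
speeds = [60, 40, 30, 25]        # km/h

total_distance = sum(distances)
total_time = sum(d / s for d, s in zip(distances, speeds))   # hours
average_speed = total_distance / total_time

print(round(average_speed, 2))
```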
❖ MEDIAN
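✓ The Median is the value found at the exact middle of the set of values. To compute the
median, list all observations in numerical order and then locate the value in the center
of the sample.
$$M = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ observation} \quad (n \text{ odd})$$
$$M = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ observation} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ observation}}{2} \quad (n \text{ even})$$
➢ In case of discrete group data, the position of median i.e., $\left(\frac{n+1}{2}\right)^{\text{th}}$ item can be located
from the cumulative frequencies.
$$M = L + \left(\frac{\frac{n}{2} - F}{f}\right) \times C$$
➢ C = class length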
Marks 20 9 25 50 40 80
No. of students 6 4 16 7 8 2
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐𝟓
H 3 Obtain the median size of shoes sold from the following data:
Pair 30 40 50 150 300 600 950 820 750 440 250 150 40 39
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟖. 𝟓
x 1 2 3 4 5 6 7 8 9
f 8 10 11 16 20 25 15 9 6
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟓
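For discrete frequency data the median position can be found by walking the cumulative frequencies. A minimal sketch, checked against the two tables above; the function name is illustrative.

```python
def median_from_frequencies(values, freqs):
    """Median of discrete data given as (value, frequency) pairs."""
    pairs = sorted(zip(values, freqs))
    n = sum(freqs)

    def item(k):
        # k-th observation (1-indexed) located via cumulative frequency
        cumulative = 0
        for x, f in pairs:
            cumulative += f
            if cumulative >= k:
                return x

    if n % 2 == 1:
        return item((n + 1) // 2)
    return (item(n // 2) + item(n // 2 + 1)) / 2

m1 = median_from_frequencies([20, 9, 25, 50, 40, 80], [6, 4, 16, 7, 8, 2])
m2 = median_from_frequencies(list(range(1, 10)), [8, 10, 11, 16, 20, 25, 15, 9, 6])
print(m1, m2)
```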
❖ MODE
✓ The Mode is the most frequently occurring value in the set. To determine the mode, you
might again order the observations in numerical order and then count each one. The most
frequently occurring value is the mode.
➢ Most repeated observation among given data is called Mode of Ungrouped data.
$$Z = L + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times C$$
x 1 2 3 4 5 6 7 8 9
f 8 10 11 16 20 25 15 9 6
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟔
x 11 22 33 44
f 15 20 19 10
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐𝟐
x 8 9 10 11 12 13 14 15
f 5 6 8 7 9 8 9 6
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏𝟐, 𝟏𝟒
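For a discrete frequency table the mode is simply the value (or values) with the highest frequency. A small sketch applied to the three tables above; returning a list is a design choice so that bimodal data such as the last table is handled.

```python
def modes(values, freqs):
    """All values whose frequency equals the maximum frequency."""
    top = max(freqs)
    return [x for x, f in zip(values, freqs) if f == top]

mode_a = modes(list(range(1, 10)), [8, 10, 11, 16, 20, 25, 15, 9, 6])
mode_b = modes([11, 22, 33, 44], [15, 20, 19, 10])
mode_c = modes(list(range(8, 16)), [5, 6, 8, 7, 9, 8, 9, 6])
print(mode_a, mode_b, mode_c)
```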
❖ DISPERSION
✓ Dispersion refers to the spread of the values around the central tendency. There are two
common measures of dispersion, the range and the standard deviation.
✓ Range is simply the highest value minus the lowest value. In our example distribution, the
high value is 36 and the low is 15, so the range is 36 − 15 = 21.
✓ Standard Deviation (𝛔) is a measure that is used to quantify the amount of variation or
dispersion of a set of data values.
✓ Sample Standard Deviation (𝐒) is root mean square of the difference between observation
and sample mean. It is defined by
$$S = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}, \text{ where } \bar{x} \text{ is a sample mean}$$
✓ Sample Variance (𝐒²) is the average of squared difference from the mean. It is defined by
$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}, \text{ where } \bar{x} \text{ is a sample mean}$$
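Direct Method: $\sigma = \sqrt{\frac{\sum x_i^2}{n} - \left(\frac{\sum x_i}{n}\right)^2}$ (ungrouped data) ; $\sigma = \sqrt{\frac{\sum f_i x_i^2}{n} - \left(\frac{\sum f_i x_i}{n}\right)^2}$ (grouped data)
Assumed Mean Method: $\sigma = \sqrt{\frac{\sum d_i^2}{n} - \left(\frac{\sum d_i}{n}\right)^2}$ (ungrouped data) ; $\sigma = \sqrt{\frac{\sum f_i d_i^2}{n} - \left(\frac{\sum f_i d_i}{n}\right)^2}$ (grouped data)
Step Deviation Method: $\sigma = \sqrt{\frac{\sum u_i^2}{n} - \left(\frac{\sum u_i}{n}\right)^2} \times c$ (ungrouped data) ; $\sigma = \sqrt{\frac{\sum f_i u_i^2}{n} - \left(\frac{\sum f_i u_i}{n}\right)^2} \times c$ (grouped data)
M.D. about Median: $M.D. = \frac{\sum |x_i - M|}{n}$ (ungrouped data) ; $M.D. = \frac{\sum f_i |x_i - M|}{n}$ (grouped data)
M.D. about Mode: $M.D. = \frac{\sum |x_i - Z|}{n}$ (ungrouped data) ; $M.D. = \frac{\sum f_i |x_i - Z|}{n}$ (grouped data)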
C 1 Find the standard deviation for the following data: 6, 7, 10, 12, 13, 4, 8, 12.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟑. 𝟎𝟒𝟏𝟒
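The direct-method formula can be checked on this data set. A sketch, assuming the divisor n (population form), which is what reproduces the stated answer; `statistics.pstdev` from the Python standard library is used as a cross-check.

```python
import math
import statistics

data = [6, 7, 10, 12, 13, 4, 8, 12]
n = len(data)

# Direct method: sqrt( sum(x^2)/n - (sum(x)/n)^2 )
sigma = math.sqrt(sum(x * x for x in data) / n - (sum(data) / n) ** 2)

# Cross-check with the standard library (divisor n)
sigma_check = statistics.pstdev(data)

print(round(sigma, 4))
```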
C 2 Find the standard deviation and variance for the following distribution:
H 5 The article “A Thin-Film Oxygen Uptake Test for the Evaluation of
Automotive Crankcase Lubricants” reported the following data on
oxidation-induction time (min) for various commercial oils: W-19 (4)
87, 103, 130, 160, 180, 195, 132, 145, 211,
105, 145, 153, 152, 138, 87,99, 93, 119, 129
(i) Calculate the sample variance and standard deviation.
(ii) If the observations were re-expressed in hours, what would be the
resulting values of the sample variance and sample standard deviation?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟏𝟏𝟗𝟖. 𝟏𝟗𝟖𝟐, 𝟑𝟒. 𝟔𝟏𝟓𝟎, 𝟏𝟐𝟔𝟒. 𝟕𝟔𝟔𝟎, 𝟑𝟓. 𝟓𝟔𝟑𝟓
A 85 20 62 28 74 5 69 4 13
B 72 4 15 30 59 15 49 27 26
Which of the batsmen is more consistent?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐁
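Consistency questions of this kind are usually settled with the coefficient of variation, C.V. = (σ / mean) × 100, where the smaller value indicates the more consistent performer. That criterion is an assumption here, since the formula is not stated in the surrounding text. A minimal sketch on the scores of A and B above:

```python
import statistics

def coefficient_of_variation(scores):
    """C.V. in percent, using the standard deviation with divisor n."""
    return statistics.pstdev(scores) / statistics.mean(scores) * 100

a = [85, 20, 62, 28, 74, 5, 69, 4, 13]
b = [72, 4, 15, 30, 59, 15, 49, 27, 26]

cv_a = coefficient_of_variation(a)
cv_b = coefficient_of_variation(b)
more_consistent = "A" if cv_a < cv_b else "B"
print(round(cv_a, 2), round(cv_b, 2), more_consistent)
```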
A 32 28 47 63 71 39 10 60 96 14
B 19 31 48 53 67 90 10 62 40 80
𝐀𝐧𝐬𝐰𝐞𝐫: 𝛔𝐀 = 𝟐𝟓. 𝟒𝟗𝟓𝟎, 𝛔𝐁 = 𝟐𝟒. 𝟒𝟐𝟗𝟎. 𝐓𝐡𝐞𝐫𝐞 𝐢𝐬 𝐥𝐞𝐬𝐬 𝐯𝐚𝐫𝐢𝐚𝐛𝐢𝐥𝐢𝐭𝐲 𝐢𝐧
𝐭𝐡𝐞 𝐩𝐞𝐫𝐟𝐨𝐫𝐦𝐚𝐧𝐜𝐞 𝐨𝐟 𝐭𝐡𝐞 𝐦𝐚𝐜𝐡𝐢𝐧𝐞 𝐁.
H 8 Goals scored by two team A and B in a football season were as shown in the
table. Find out which team is more consistent?
H 10 The runs scored by two batsmen A and B in 10 matches are given in the
following table:
A 14 13 26 53 17 29 79 36 84 49
B 37 22 56 52 14 10 37 48 20 4
Who is more consistent?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐁
H 11 Find the mean deviation about the mean for the following data:
12, 3, 18, 17, 4, 9, 17, 19, 20, 15, 8, 17, 2, 3, 16, 11, 3, 1, 0, 5.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟔. 𝟐
C 12 Find mean deviation about the mean for the following data:
x 2 5 6 8 10 12
f 2 8 10 7 8 5
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐. 𝟑
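Mean deviation about the mean for grouped data is Σf|x − x̄| / n with n = Σf. The helper below is a sketch that takes the centre as a parameter, so the same function also covers deviation about the median or mode.

```python
def mean_deviation(values, freqs, center):
    """sum(f * |x - center|) / sum(f)."""
    n = sum(freqs)
    return sum(f * abs(x - center) for x, f in zip(values, freqs)) / n

x = [2, 5, 6, 8, 10, 12]
f = [2, 8, 10, 7, 8, 5]

mean = sum(fi * xi for xi, fi in zip(x, f)) / sum(f)
md_about_mean = mean_deviation(x, f, mean)
print(mean, md_about_mean)
```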
H 13 Find mean deviation about the mean for the following data:
x 5 10 15 20 25
f 7 4 6 3 5
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟔. 𝟑𝟐𝟎𝟎
C 14 Find out mean deviation about median for the following series:
Size 4 6 8 10 12 14 16
Freq. 2 1 3 6 4 3 1
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐. 𝟒
H 15 Find out mean deviation about median for the following series:
Size 4 6 8 10 12 14 16
Freq. 1 2 4 5 4 3 1
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐. 𝟒
❖ MOMENTS
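✓ Moment is a familiar term from mechanics, where it refers to the measure of a force with
respect to its tendency to produce rotation. In statistics, a moment is the arithmetic mean of
the various powers of the deviations of items from their assumed mean or actual mean. If the
deviations of the items are taken from the arithmetic mean of the distribution, it is known as
a central moment.
$$\mu_r = \frac{\sum f_i (x_i - \bar{x})^r}{n} ; \quad r = 1, 2, 3, 4, \ldots \quad \text{(moments about actual mean)}$$
$$\mu'_r = \frac{\sum f_i (x_i - a)^r}{n} ; \quad r = 1, 2, 3, 4, \ldots \quad \text{(moments about assumed mean } a)$$
$$v_r = \frac{\sum f_i x_i^r}{n} ; \quad r = 1, 2, 3, 4, \ldots \quad \text{(moments about zero)}$$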
➢ Moments about actual mean in terms of moments about assumed mean (μ in μ′)
$$\mu_1 = \mu'_1 - \mu'_1 = 0$$
$$\mu_2 = \mu'_2 - (\mu'_1)^2$$
$$\mu_3 = \mu'_3 - 3\mu'_2 \mu'_1 + 2(\mu'_1)^3$$
$$\mu_4 = \mu'_4 - 4\mu'_3 \mu'_1 + 6\mu'_2 (\mu'_1)^2 - 3(\mu'_1)^4$$
$$v_1 = a + \mu'_1 = \bar{x}$$
$$v_2 = \mu_2 + (v_1)^2$$
$$v_3 = \mu_3 + 3 v_1 v_2 - 2 v_1^3$$
✓ NOTE: The first moment about zero (v1) is the MEAN of the data, the second moment about
actual mean (μ2) is the VARIANCE of the data, the third moment about actual mean (μ3) is used
to find SKEWNESS and the fourth moment about actual mean (μ4) is used to find KURTOSIS.
❖ SKEWNESS
✓ It is a measure of the asymmetry of the probability distribution of a real-valued random
variable about its mean.
$$\text{Skewness} = \frac{\text{mean} - \text{mode}}{\text{standard deviation}}$$
➢ Method of moments
$$\text{Skewness} = \beta_1 = \frac{(\mu_3)^2}{(\mu_2)^3}$$
➢ Negative: The left tail is longer; the mass of the distribution is concentrated on the
right.
➢ Positive: The right tail is longer; the mass of the distribution is concentrated on the
left
❖ KURTOSIS
✓ The measure of peakedness of a distribution (i.e., measure of convexity of a frequency
curve) is known as Kurtosis. It is based on fourth moment and is defined as
$$\beta_2 = \frac{\mu_4}{(\mu_2)^2}$$
✓ When the value of β2 > 3, the curve is more peaked than normal curve and the distribution
is called leptokurtic.
✓ When the value of β2 < 3, the curve is less peaked than normal curve and the distribution
is called Platykurtic.
✓ The normal curve and other curves with β2 = 3 are called Mesokurtic.
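The moment-based measures β1 and β2 can be computed directly from a frequency table. The sketch below uses illustrative data (x = 0..8 with binomial-shaped frequencies); the function name and the classification strings are assumptions made for the example.

```python
def central_moment(xs, fs, r):
    """r-th moment about the actual mean: sum(f * (x - mean)^r) / n."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    return sum(f * (x - mean) ** r for x, f in zip(xs, fs)) / n

xs = list(range(9))
fs = [1, 8, 28, 56, 70, 56, 28, 8, 1]

mu2 = central_moment(xs, fs, 2)
mu3 = central_moment(xs, fs, 3)
mu4 = central_moment(xs, fs, 4)

beta1 = mu3 ** 2 / mu2 ** 3      # skewness by the method of moments
beta2 = mu4 / mu2 ** 2           # kurtosis

if beta2 > 3:
    kind = "leptokurtic"
elif beta2 < 3:
    kind = "platykurtic"
else:
    kind = "mesokurtic"
print(mu2, mu3, mu4, beta1, beta2, kind)
```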
C 1 (a) Find the first four moments about the mean for data 1, 3, 7, 9, 10.
(b) Find the first four central moments for the data 11, 12, 14, 16, 20.
𝐀𝐧𝐬𝐰𝐞𝐫: [𝟎, 𝟏𝟐, − 𝟏𝟐, 𝟐𝟎𝟖. 𝟖], [𝟎, 𝟏𝟎, 𝟏𝟗. 𝟏𝟓𝟐𝟎, 𝟐𝟏𝟑. 𝟓𝟖𝟕𝟐]
C 2 Calculate the first four moments about the mean for following distribution:
x 2 3 4 5 6
f 1 3 7 3 1
𝐀𝐧𝐬𝐰𝐞𝐫: [𝟎, 𝟎. 𝟗𝟑𝟑, 𝟎, 𝟐. 𝟓𝟑𝟑]
H 3 Calculate the first four moments about the mean for the following data:
x 5 10 15 20 25
f 6 10 14 6 4
𝐀𝐧𝐬𝐰𝐞𝐫: [𝟎, 𝟑𝟒, 𝟒𝟎𝟗. 𝟓, 𝟐𝟕𝟎𝟐. 𝟗𝟓]
H 4 Calculate the first four moments about the mean of the following data:
x 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1
𝐀𝐧𝐬𝐰𝐞𝐫: [𝟎, 𝟐, 𝟎, 𝟏𝟏]
H 5 Calculate the moments about actual mean and zero for following
distribution:
x 1 2 3 4 5 6
f 5 4 3 7 1 1
𝐀𝐧𝐬𝐰𝐞𝐫: [𝟐. 𝟗𝟎𝟒𝟕𝟔, 𝟏𝟎. 𝟓𝟐𝟑𝟖𝟏, 𝟒𝟑. 𝟏𝟗𝟎𝟒𝟖, 𝟏𝟗𝟏. 𝟔𝟔𝟔𝟔𝟕],
𝐀𝐧𝐬𝐰𝐞𝐫: [𝟎, 𝟐. 𝟎𝟖𝟔𝟏𝟕, 𝟎. 𝟓𝟎𝟏𝟔𝟕, 𝟗. 𝟎𝟐𝟗𝟖𝟖]
H 6 (a) Calculate the first four moments about the mean for following
distribution:
C 7 Calculate the moments about assumed mean 25, actual mean and zero for
following:
C 8 The first four moments about a = 4 are 1, 4, 10, 45. Find moments about
actual mean.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎, 𝟑, 𝟎, 𝟐𝟔
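The conversion from moments about an assumed mean to moments about the actual mean is purely mechanical, so it is easy to verify. A minimal sketch of the relations listed in the MOMENTS section; the function name is illustrative.

```python
def central_from_assumed(m1, m2, m3, m4):
    """Moments about the actual mean from moments about an assumed mean."""
    mu1 = 0
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu1, mu2, mu3, mu4

about_mean = central_from_assumed(1, 4, 10, 45)   # moments about a = 4
print(about_mean)
```

The actual mean follows from v1 = a + μ′1, which gives 4 + 1 = 5 for this data.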
H 9 The first four moment about a = 5 are −4, 22, −117, 560. Find moments
about actual mean and origin.
𝐀𝐧𝐬𝐰𝐞𝐫: [𝟎, 𝟔, 𝟏𝟗, 𝟑𝟐], [𝟏, 𝟕, 𝟑𝟖, 𝟏𝟒𝟓]
C 11 (a) Compute the coefficient of skewness for the data; 25, 15, 23, 40, 27, 25,
23, 25, 20.
(b) The pH of a solution is measured 7 times by one operator using a same
instrument are 7.15, 7.20, 7.18, 7.19, 7.21, 7.16 and 7.18. Find skewness.
𝐀𝐧𝐬𝐰𝐞𝐫: −𝟎. 𝟎𝟑, 𝟎. 𝟎𝟒𝟗𝟔
x 2 3 4 5 6
f 1 3 7 3 1
(b) Find skewness from the following table.
x 35 45 55 60 75 80
f 12 18 10 6 3 11
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟓𝟕𝟖𝟖
H 15 Find skewness by the method of moments for 38.2, 40.9, 39.5, 44, 39.6,
40.5, 39.5.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟒𝟐𝟔𝟏
x 2 3 4 5 6
f 1 3 7 3 1
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎
H 18 Find the coefficient of skewness based on the method of moments for the
following data:
x 5 15 25 35
f 1 4 3 2
❖ COEFFICIENT OF CORRELATION
✓ Correlation is the relationship that exists between two or more variables. Two variables
are said to be correlated if a change in one variable is accompanied by a change in the other
variable. Such data connecting two variables is called bivariate data.
✓ When two variables are correlated with each other, it is important to know the amount or
extent of correlation between them. The numerical measure of correlation, or degree of
relationship existing between two variables, is called the coefficient of correlation. It is
denoted by r and always lies between −1 and 1.
➢ When the value of r is ±0.9 or ±0.8 etc., it shows a high degree of relationship between
the variables, and when r is small, say ±0.2 or ±0.1 etc., it shows a low degree of
correlation.
❖ TYPES OF CORRELATIONS
✓ Positive and negative correlations
➢ If both the variables vary in the same direction, the correlation is said to be positive.
➢ If both the variables vary in the opposite direction, correlation is said to be negative.
➢ When only two variables are studied, the relationship is described as simple
correlation.
➢ When more than two variables are studied, the relationship is multiple correlation.
➢ When more than two variables are studied excluding some other variables, the
relationship is termed as partial correlation.
➢ If the ratio of change between two variables is constant, the correlation is said to be
linear.
➢ If the ratio of change between two variables is not constant, the correlation is
nonlinear.
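$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{n \, \sigma_x \sigma_y} \quad \text{OR} \quad r = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{n \sum x^2 - (\sum x)^2} \, \sqrt{n \sum y^2 - (\sum y)^2}}$$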
➢ Rank correlation is based on the rank or the order of the variables and not on the
magnitude of the variables. Here, the individuals are arranged in order of proficiency.
➢ If the ranks are assigned to the individuals range from 1 to n, then the correlation
coefficient between two series of ranks is called rank correlation coefficient.
$$\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
➢ Where d is difference between the ranks R1 & R 2 given by two judges, n = number of
pairs.
➢ If there is a tie between individuals’ ranks, the rank is divided among equal
individuals.
➢ For example, if two items have fourth rank, the 4th and 5th rank is divided between
them equally and is given as $\frac{4+5}{2} = 4.5$th rank to each of them.
➢ If three items have the same 4th rank, each of them is given $\frac{4+5+6}{3} = 5$th rank.
➢ So, if m is number of items having equal ranks, then the factor $\frac{1}{12}(m^3 - m)$ is added
to $\sum d^2$. If there are more than one cases of these types, this factor is added
corresponding to each case.
$$\rho = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) + \cdots\right]}{n(n^2 - 1)}$$
x 54 57 55 57 56 52 59
y 36 35 32 34 36 38 35
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐫 = −𝟎. 𝟒𝟓𝟕𝟓
H 2 Compute the coefficient of correlation between X and Y using the following
data: W-19 (3)
x 2 4 5 6 8 11
y 18 12 10 8 7 5
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐫 = −𝟎. 𝟗𝟐𝟎𝟑
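The second form of the correlation formula needs only the five sums Σx, Σy, Σxy, Σx², Σy². A minimal sketch applied to the data of this exercise; the function name is illustrative.

```python
import math

def pearson_r(x, y):
    """r = (n*Sxy - Sx*Sy) / (sqrt(n*Sxx - Sx^2) * sqrt(n*Syy - Sy^2))."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / (
        math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    )

x = [2, 4, 5, 6, 8, 11]
y = [18, 12, 10, 8, 7, 5]
r = pearson_r(x, y)
print(round(r, 4))
```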
x 65 66 67 67 68 69 70 72
y 67 68 65 68 72 72 69 71
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐫 = 𝟎. 𝟔𝟎𝟑𝟎
x 1100 1200 1300 1400 1500 1600 1700 1800 1900 2000
y 0.30 0.29 0.29 0.25 0.24 0.24 0.24 0.29 0.18 0.15
Age of husband 35 34 40 43 56 20 38
Age of wife 32 30 31 32 53 20 33
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐫 = 𝟎. 𝟗𝟑𝟕𝟏
Age 20 21 22 23 24 25
No. of students 500 400 300 240 200 160
Regular players 400 300 180 96 60 24
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐫 = −𝟎. 𝟗𝟕𝟑𝟖
H 8 Find the correlation coefficient between the serum diastolic B.P. and serum
cholesterol levels of 10 randomly selected data of 10 persons.
Person 1 2 3 4 5 6 7 8 9 10
Cholesterol 307 259 341 317 274 416 267 320 274 336
B.P. 80 75 90 74 75 110 70 85 88 78
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟎. 𝟖𝟎𝟖𝟖
C 11 Find rxy from given data if n = 10, ∑(x − x̅)(y − y̅) = 66, σx = 5.4, σy =
6.2.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐫 = 𝟎. 𝟏𝟗𝟕𝟏
H 12 Find rxy from given data n = 10, ∑(x − x̅)(y − y̅) = 1650, σ2x = 196, σ2y =
225.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐫 = 𝟎. 𝟕𝟖𝟓𝟕
and ∑ xy = 508. Later on, it was found that two of the points (8, 12) and
2
(6, 8) were wrongly entered as (6, 14) and (8, 6). Prove that r = .
3
1st judge 3 5 8 4 7 10 2 1 6 9
2nd judge 6 4 9 8 1 2 3 10 5 7
𝐀𝐧𝐬𝐰𝐞𝐫: 𝛒 = −𝟎. 𝟐𝟗𝟕𝟎
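When the data are already ranks with no ties, the rank correlation coefficient follows directly from the rank differences. A short sketch for the two judges above:

```python
def spearman_rho(rank1, rank2):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) for untied ranks."""
    n = len(rank1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank1, rank2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

judge1 = [3, 5, 8, 4, 7, 10, 2, 1, 6, 9]
judge2 = [6, 4, 9, 8, 1, 2, 3, 10, 5, 7]
rho = spearman_rho(judge1, judge2)
print(round(rho, 4))
```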
1st judge 1 2 3 4 5 6 7 8 9 10 11 12
2nd judge 12 9 6 10 3 5 4 7 8 2 11 1
What degree of agreement is there between the Judges?
𝐀𝐧𝐬𝐰𝐞𝐫: 𝛒 = −𝟎. 𝟒𝟓𝟒𝟓
1st judge 1 5 4 8 9 6 10 7 3 2
2nd judge 4 8 7 6 5 9 10 3 2 1
3rd judge 6 7 8 1 5 10 9 2 3 4
Use rank correlation to discuss which pair of judges has nearest approach
to beauty.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐𝐧𝐝 𝐚𝐧𝐝 𝟑𝐫𝐝 𝐣𝐮𝐝𝐠𝐞𝐬 𝐡𝐚𝐬 𝐧𝐞𝐚𝐫𝐞𝐬𝐭 𝐚𝐩𝐩𝐫𝐨𝐚𝐜𝐡
𝐀𝐧𝐬𝐰𝐞𝐫: [𝛒 = 𝟎. 𝟓𝟓𝟏𝟓, 𝟎. 𝟕𝟑𝟑𝟑, 𝟎. 𝟎𝟓𝟒𝟓]
Roll no. 1 2 3 4 5 6 7 8 9
Marks in Math. 78 36 98 25 75 82 90 62 65
Marks in Chem. 84 51 91 60 68 62 86 58 53
𝐀𝐧𝐬𝐰𝐞𝐫: 𝛒 = 𝟎. 𝟖𝟑𝟑𝟑
Sales 45 56 39 54 45 40 56 60 30 36
Cost 40 36 30 44 36 32 45 42 20 36
𝐀𝐧𝐬𝐰𝐞𝐫: 𝛒 = 𝟎. 𝟕𝟔𝟑𝟔
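With tied values, ranks are averaged and the correction (m³ − m)/12 is added to Σd² for every group of m tied items. The sketch below applies this to the Sales/Cost table; ranking the largest value as 1 is an arbitrary choice that does not affect ρ, and all function names are illustrative.

```python
from collections import Counter

def average_ranks(data):
    """Rank 1 = largest value; tied values share the mean of their positions."""
    positions = {}
    for pos, value in enumerate(sorted(data, reverse=True), start=1):
        positions.setdefault(value, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in data]

def tie_correction(data):
    """Sum of (m^3 - m) / 12 over every group of m tied values."""
    return sum((m ** 3 - m) / 12 for m in Counter(data).values() if m > 1)

def spearman_with_ties(x, y):
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    adjusted = sum_d2 + tie_correction(x) + tie_correction(y)
    return 1 - 6 * adjusted / (n * (n ** 2 - 1))

sales = [45, 56, 39, 54, 45, 40, 56, 60, 30, 36]
cost = [40, 36, 30, 44, 36, 32, 45, 42, 20, 36]
rho_tied = spearman_with_ties(sales, cost)
print(round(rho_tied, 4))
```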
x 68 64 75 50 64 80 75 40 55 64
y 62 58 68 45 81 60 68 48 50 70
𝐀𝐧𝐬𝐰𝐞𝐫: 𝛒 = 𝟎. 𝟓𝟔𝟑𝟔
❖ REGRESSION ANALYSIS
✓ The regression analysis is concerned with the formulation and determination of algebraic
expressions for the relationship between the two variables.
✓ We use the general form regression line for these algebraic expressions. The algebraic
expressions of the regression lines are called Regression equations.
❖ REGRESSION LINES
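✓ Line of regression of y on x is $y - \bar{y} = b_{yx}(x - \bar{x})$, where $b_{yx} = r\frac{\sigma_y}{\sigma_x}$
✓ Line of regression of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$, where $b_{xy} = r\frac{\sigma_x}{\sigma_y}$
✓ Here, bxy and byx are the regression coefficients & σx and σy are the standard deviations &
x̅ and y̅ are the means & r is the coefficient of correlation of x, y.
✓ Both the regression coefficients will have the same sign. They are either both positive or
both negative.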
✓ The Sign of the coefficient of correlation is same as of the regression coefficients. It means,
✓ The arithmetic mean of the regression coefficients is greater than or equal to the
correlation coefficient (for positive coefficients).
$$\frac{b_{xy} + b_{yx}}{2} \ge r$$
x 2 3 4 4 5 6 6 7 7 8 10 10
y 1 3 2 4 4 4 6 4 6 7 9 10
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐲 = 𝟎. 𝟗𝟗𝐱 − 𝟎. 𝟗𝟐
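The regression line of y on x can be fitted from deviations about the means, with slope b_yx = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)². A minimal sketch on the data above; the function name is illustrative.

```python
def regression_y_on_x(x, y):
    """Return (slope, intercept) of the line y = slope * x + intercept."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / sxx                       # b_yx
    intercept = y_bar - slope * x_bar
    return slope, intercept

x = [2, 3, 4, 4, 5, 6, 6, 7, 7, 8, 10, 10]
y = [1, 3, 2, 4, 4, 4, 6, 4, 6, 7, 9, 10]
slope, intercept = regression_y_on_x(x, y)
print(round(slope, 2), round(intercept, 2))
```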
x 65 66 67 67 68 69 70 72
y 67 68 65 68 72 72 69 71
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐱 = 𝟎. 𝟓𝟒𝐲 + 𝟑𝟎. 𝟕𝟒, 𝐲 = 𝟎. 𝟔𝟔𝟓𝐱 + 𝟐𝟑. 𝟕𝟖
H 4 Obtain the two lines of regression for the following data: W-19 (7)
Sales (No. of tablets) 190 240 250 300 310 335 300
Advertising expenditure (Rs.) 5 10 12 20 20 30 30
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐲 = 𝟎. 𝟏𝟕𝟔𝟔𝐱 − 𝟑𝟎. 𝟒𝟐𝟐𝟏, 𝐱 = 𝟒. 𝟕𝟑𝟓𝟕𝐲 + 𝟏𝟖𝟗. 𝟎𝟖𝟎𝟕
H 5 The amount of chemical compound (y), which were dissolved in 100 grams
of water at various temperatures (x):
x 15 15 30 30 45 45 60 60
y 12 10 25 21 31 33 44 39
Find the equation of the regression line of y on x and estimate y if x = 50°C.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐲 = 𝟎. 𝟔𝟕𝐱 + 𝟏. 𝟕𝟓, 𝟑𝟓. 𝟐𝟓
Operator 1 2 3 4 5 6
Performance rating 78 36 98 25 75 82
Experience 84 51 91 60 68 62
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐱 = 𝟏𝟏. 𝟒𝟐𝟖 𝐲 − 𝟐𝟗. 𝟑𝟖, 𝟗𝟔. 𝟑𝟑
H 8 The following data regarding the height (y) and weight (x) of 100 students
are given: ∑ x = 15000 , ∑ y = 6800 , ∑ x 2 = 2272500 , ∑ y 2 = 463025,
∑ xy = 1022250. Find the equation of regression line of height on weight.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐲 = 𝟎. 𝟏𝐱 + 𝟓𝟑
C 9 The following values are available for the variable x & y. Obtain regression
lines.
n = 10, ∑ x = 30 , ∑ y = 40 , ∑ x 2 = 222 , ∑ y 2 = 985 , ∑ xy = 384.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐲 = 𝟐𝐱 − 𝟐, 𝐱 = 𝟎. 𝟑𝟐𝐲 + 𝟏. 𝟕𝟐
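When only the summary sums are given, both regression coefficients follow from the same numerator n·Σxy − Σx·Σy. A sketch using the values of this exercise; the variable names are illustrative.

```python
n, sx, sy, sxx, syy, sxy = 10, 30, 40, 222, 985, 384

x_bar, y_bar = sx / n, sy / n
numerator = n * sxy - sx * sy

b_yx = numerator / (n * sxx - sx ** 2)     # slope of y on x
b_xy = numerator / (n * syy - sy ** 2)     # slope of x on y

# y = b_yx * x + c_yx  and  x = b_xy * y + c_xy
c_yx = y_bar - b_yx * x_bar
c_xy = x_bar - b_xy * y_bar
print(b_yx, c_yx, b_xy, c_xy)
```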
x y
Mean 60 67.5
Standard deviation 15 13.5
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐲 = 𝟒𝟎. 𝟓 + 𝟎. 𝟒𝟓𝐱, 𝐱 = 𝟐𝟐. 𝟒𝟕 + 𝟎. 𝟓𝟓𝟔𝐲
Hapur(Rs) Kanpur(Rs)
Average price/kg 2.463 2.797
Standard deviation 0.326 0.207
Correlation coefficient between prices at Hapur and Kanpur is 0.774.
Estimate the most likely price at Hapur corresponding to the price of 3.052
per kilo at Kanpur.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐. 𝟕𝟕𝟒
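The estimate uses the regression line of the Hapur price on the Kanpur price, with coefficient r·σ(Hapur)/σ(Kanpur). A minimal sketch with the figures from the table above; the variable names are illustrative.

```python
mean_hapur, mean_kanpur = 2.463, 2.797
sd_hapur, sd_kanpur = 0.326, 0.207
r = 0.774

# Regression coefficient of Hapur price on Kanpur price
b = r * sd_hapur / sd_kanpur

kanpur_price = 3.052
hapur_estimate = mean_hapur + b * (kanpur_price - mean_kanpur)
print(round(hapur_estimate, 3))
```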
PART-IV MISCELLANEOUS
Mid value 15 20 25 30 35 40 45 50 55
Frequency 2 22 19 14 3 4 6 1 1
Cumulative 2 24 43 57 60 64 70 71 72
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟐𝟓. 𝟖𝟒𝟕𝟐, 𝟐𝟏. 𝟖𝟒𝟕𝟖, 𝟐𝟓. 𝟔𝟓𝟕𝟗
C 2 Obtain the mean, mode and median for the following information:
H 3 Obtain the mean, mode and median for the following information:
Firm A Firm B
Number of workers 500 600
Average daily wage 186 175
Variance of distribution of wages 81 100
(a) Which firm has a larger wage bill?
(b) In which firm, is there greater variability in individual wages?
(c) Calculate average daily wages of all the workers in the firms A & B taken
together.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐁, 𝐁, 𝟏𝟖𝟎
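All three parts reduce to simple arithmetic on the table. The sketch below assumes that variability is compared through the coefficient of variation (σ / mean × 100), which is the usual criterion but is not stated explicitly in the text.

```python
import math

firms = {
    "A": {"workers": 500, "mean_wage": 186, "variance": 81},
    "B": {"workers": 600, "mean_wage": 175, "variance": 100},
}

# (a) total wage bill = number of workers * average wage
wage_bill = {k: v["workers"] * v["mean_wage"] for k, v in firms.items()}

# (b) coefficient of variation in percent
cv = {k: math.sqrt(v["variance"]) / v["mean_wage"] * 100 for k, v in firms.items()}

# (c) combined mean = total wage bill / total workers
combined_mean = sum(wage_bill.values()) / sum(v["workers"] for v in firms.values())

larger_bill = max(wage_bill, key=wage_bill.get)
more_variable = max(cv, key=cv.get)
print(larger_bill, more_variable, combined_mean)
```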
H 9 An analysis of monthly wages paid to the workers of two firms A and B
belonging to the same industry gives the following results: W-19 (4)
Firm A Firm B
Number of workers 586 648
Average daily wage 52.5 47.5
Variance of distribution of wages 100 121
(a) Which firm has a larger wage bill?
(b) In which firm, is there greater variability in individual wages?
(c) Calculate average daily wages of all the workers in the firms A & B taken
together.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐁, 𝐁, 𝟒𝟗. 𝟖𝟕
x 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1
H 16 Find the mean, median and mode for the following frequency distribution:
x 1 2 3 4 5 6 7 8 9 10
f 4 7 8 10 6 6 4 2 2 1
𝐀𝐧𝐬𝐰𝐞𝐫: 𝟒. 𝟒, 𝟒, 𝟒
H 17 An insurance company obtained the following data for accident claims (in
thousand rupees) from a particular region. Find its mean, median and
mode.
⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆
❖ POPULATION OR UNIVERSE
✓ An aggregate of objects (animate or inanimate) under study is called population or
universe. It is thus a collection of individuals or of their attributes (qualities) or of results
of operations which can be numerically specified.
✓ A universe with infinite number of members is known as an infinite universe. For example,
the universe of pressures at various points in the atmosphere.
✓ In some cases, we may be even ignorant whether or not a particular universe is infinite,
e.g., the universe of stars.
✓ The universe of concrete objects is an existent universe. The collection of all possible ways
in which a specified event can happen is called a hypothetical universe. The universe of
heads and tails obtained by tossing an infinite number of times is a hypothetical one.
❖ SAMPLING
✓ A finite sub-set of a universe or population is called a sample. A sample is thus a small
portion of the universe. The number of individuals in a sample is called the sample size.
The process of selecting a sample from a universe is called sampling.
✓ The theory of sampling is a study of relationship between a population and samples drawn
from the population. The fundamental object of sampling is to get as much information as
possible of the whole universe by examining only a part of it.
✓ Sampling is quite often used in our day-to-day practical life. For example, in a shop we
assess the quality of sugar, rice or any commodity by taking only a handful of it from the
bag and then decide whether to purchase it or not.
❖ TEST OF SIGNIFICANCE
✓ An important aspect of the sampling theory is the study of tests of significance, which
enable us to decide, on the basis of the results of the sample, whether
✓ the deviation between the observed sample statistic and the hypothetical parameter value, or
✓ the deviation between two sample statistics, is significant or might be attributed to
chance or the fluctuations of sampling.
✓ For applying the tests of significance, we first set up a hypothesis which is a definite
statement about the population parameter called null hypothesis denoted by H0 .
✓ Any hypothesis which is complementary to the null hypothesis (H0 ) is called an alternative
hypothesis denoted by H1 .
✓ For example, if we want to test the null hypothesis that the population has a specified mean
μ0 , then we have H0 ∶ μ = μ0
✓ Hence alternative hypothesis helps to know whether the test is two tailed or one tailed test.
❖ STANDARD ERROR
✓ The standard deviation of the sampling distribution of a statistic is known as the standard
error.
✓ It plays an important role in the theory of large samples and it forms a basis of testing of
hypothesis. If t is any statistic, then for a large sample
$$z = \frac{t - E(t)}{S.E.(t)}$$
is approximately distributed as N(0, 1).
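✓ For large samples, the standard errors of some of the well-known statistics are listed below
(statistic : standard error)
1 $\bar{x}$ : $\frac{\sigma}{\sqrt{n}}$
2 $S$ : $\sqrt{\frac{\sigma^2}{2n}}$
3 Difference of two sample means $\bar{x}_1 - \bar{x}_2$ : $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
4 Difference of two sample standard deviations $s_1 - s_2$ : $\sqrt{\frac{\sigma_1^2}{2n_1} + \frac{\sigma_2^2}{2n_2}}$
5 Difference of two sample proportions $p_1 - p_2$ : $\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}$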
❖ ERRORS IN SAMPLING
✓ The main aim of the sampling theory is to draw a valid conclusion about the population
parameters on the basis of the sample results. In doing this we may commit the following
two types of errors.
$$P(\text{Reject } H_0 \text{ when it is true}) = P(\text{Reject } H_0 \mid H_0) = \alpha$$
Where α is called the size of the type I error, also referred to as producer’s risk.
$$P(\text{Accept } H_0 \text{ when it is wrong}) = P(\text{Accept } H_0 \mid H_1) = \beta$$
Where β is called the size of the type II error, also referred to as consumer’s risk.
$$z = \frac{t - E(t)}{S.E.(t)}$$
✓ Step 6: Conclusion.
➢ Compare the computed value of z with critical value zα at the level of significance (α).
➢ It means if test statistic value belongs to critical Region, then we reject H0 otherwise
we accept H0 .
➢ $V(p) = V\left(\frac{X}{n}\right) = \frac{1}{n^2} V(X) = \frac{nPQ}{n^2} = \frac{PQ}{n}$
➢ $S.E.(p) = \sqrt{\frac{PQ}{n}}$ ; $z = \frac{p - E(p)}{S.E.(p)} \sim N(0,1)$
i.e. $z = \frac{p - P}{\sqrt{PQ/n}}$
✓ This z is called the test statistic, which is used to test the significance of the difference
between the sample and population proportions.
✓ The probable limit for the observed proportion of successes is $p \pm z_\alpha \sqrt{\frac{PQ}{n}}$, where $z_\alpha$ is the
critical value at the level of significance α.
✓ If P is not known, the limits for proportion in the population are $p \pm z_\alpha \sqrt{\frac{pq}{n}}$, q = 1 − p.
✓ Hence, confidence limits for observed proportion p are $p \pm 3\sqrt{\frac{PQ}{n}}$.
✓ The confidence limits for the population proportion p are $p \pm \sqrt{\frac{pq}{n}}$.
C 1 A political party claims that 45% of the voters in an election district prefer
its candidate. A sample of 200 voters include 80 who prefer this candidate.
Test if the claim is valid at the 5% significance level.(z0.05 = 1.96)
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐓𝐡𝐞 𝐩𝐚𝐫𝐭𝐲’𝐬 𝐜𝐥𝐚𝐢𝐦 𝐦𝐢𝐠𝐡𝐭 𝐛𝐞 𝐯𝐚𝐥𝐢𝐝.
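The single-proportion test in this exercise can be sketched as follows. The function name is illustrative, and the decision rule assumes a two-tailed test with the critical value 1.96 quoted in the problem.

```python
import math

def proportion_z(successes, n, P):
    """z = (p - P) / sqrt(P * Q / n), with p the sample proportion."""
    p = successes / n
    Q = 1 - P
    return (p - P) / math.sqrt(P * Q / n)

z = proportion_z(80, 200, 0.45)
critical = 1.96
reject_h0 = abs(z) > critical       # H0: P = 0.45
print(round(z, 4), reject_h0)
```

Since |z| is below the critical value, H0 is not rejected, which agrees with the stated answer.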
H 2 In a sample of 400 parts manufactured by a factory; the number of
defective parts found to be 30. The company, however, claims that only 5%
of their product is defective. Is the claim tenable? (Take level of significance
5%)(z0.05 = 1.645)
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐓𝐡𝐞 𝐜𝐥𝐚𝐢𝐦 𝐨𝐟 𝐦𝐚𝐧𝐮𝐟𝐚𝐜𝐭𝐮𝐫𝐞𝐫 𝐢𝐬 𝐧𝐨𝐭 𝐭𝐞𝐧𝐚𝐛𝐥𝐞(𝐚𝐜𝐜𝐞𝐩𝐭𝐚𝐛𝐥𝐞).
C 3 A certain cubical die was thrown 9000 times and 5 or 6 was obtained 3240
times. On the assumption of random throwing, do the data indicate an
unbiased die? (z0.05 = 1.96)
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐓𝐡𝐞 𝐝𝐢𝐞 𝐢𝐬 𝐛𝐢𝐚𝐬𝐞𝐝.
H 4 A coin was tossed 400 times and the head turned up 216 times. Test the
hypothesis that the coin is unbiased.
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐓𝐡𝐞 𝐜𝐨𝐢𝐧 𝐢𝐬 𝐮𝐧𝐛𝐢𝐚𝐬𝐞𝐝. (𝐳𝟎.𝟎𝟓 = 𝟏. 𝟗𝟔)
✓ Under the null hypothesis H0 that there is no significant difference between the two
sample proportions, the test statistic is
$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \text{ where } P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} \text{ and } Q = 1 - P.$$
✓ To test whether a given sample of size n has been drawn from a population with mean μ, i.e.,
to test whether the difference between the sample mean and population mean is significant
or not, under the null hypothesis that there is no difference between the sample mean and
population mean, the test statistic is
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}, \text{ where } \sigma \text{ is the standard deviation of the population.}$$
✓ If σ is not known, we use test statistic $z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$, where s is standard deviation of the sample.
✓ If the level of significance is α and $z_\alpha$ is the critical value, $-z_\alpha < |z| = \left|\frac{\bar{x} - \mu}{\sigma / \sqrt{n}}\right| < z_\alpha$
✓ The limits of the population mean μ are given by $\bar{x} - z_\alpha \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_\alpha \frac{\sigma}{\sqrt{n}}$.
✓ Confidence limits:
➢ At 5% level of significance, 95% confidence limits are $\bar{x} - 1.96 \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}}$.
➢ At 1% level of significance, 99% confidence limits are $\bar{x} - 2.58 \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 2.58 \frac{\sigma}{\sqrt{n}}$.
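$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
✓ Under the null hypothesis that the samples are drawn from the same population where
σ1 = σ2 = σ i.e., μ1 = μ2 the test statistic is given by
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
✓ If σ is not known and σ1 = σ2 we use $\sigma^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}$ to calculate σ,
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left(\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2}\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$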
C 1 Random samples drawn from two countries gave the following data
relating to the heights of adult males:
Country A Country B
Standard deviation 2.58 2.5
Number in samples 1000 1200
Is the difference between the standard deviation significant? (z0.05 = 1.96)
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐓𝐡𝐞 𝐬𝐚𝐦𝐩𝐥𝐞 𝐬𝐭𝐚𝐧𝐝𝐚𝐫𝐝 𝐝𝐞𝐯𝐢𝐚𝐭𝐢𝐨𝐧𝐬 𝐝𝐨 𝐧𝐨𝐭 𝐝𝐢𝐟𝐟𝐞𝐫 𝐬𝐢𝐠𝐧𝐢𝐟𝐢𝐜𝐚𝐧𝐭𝐥𝐲.
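The test for the difference of two standard deviations uses the standard error from the table of standard errors. The sketch below assumes, as is usual for large samples, that the sample standard deviations are substituted for the unknown σ1 and σ2; the function name is illustrative.

```python
import math

def z_sd_difference(s1, n1, s2, n2):
    """z = (s1 - s2) / sqrt(s1^2 / (2*n1) + s2^2 / (2*n2))."""
    standard_error = math.sqrt(s1 ** 2 / (2 * n1) + s2 ** 2 / (2 * n2))
    return (s1 - s2) / standard_error

z_sd = z_sd_difference(2.58, 1000, 2.5, 1200)
significant = abs(z_sd) > 1.96
print(round(z_sd, 4), significant)
```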
H 2 Intelligence test of two groups of boys and girls gives the following results:
n S.D.
Girls 121 10
Boys 81 12
Is the difference between the standard deviations significant?
(z0.05 = 1.96)
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐓𝐡𝐞 𝐬𝐚𝐦𝐩𝐥𝐞 𝐬𝐭𝐚𝐧𝐝𝐚𝐫𝐝 𝐝𝐞𝐯𝐢𝐚𝐭𝐢𝐨𝐧𝐬 𝐝𝐨 𝐧𝐨𝐭 𝐝𝐢𝐟𝐟𝐞𝐫 𝐬𝐢𝐠𝐧𝐢𝐟𝐢𝐜𝐚𝐧𝐭𝐥𝐲.
C 3 The mean yield of two plots and their variability are as given below:
40 plots 60 plots
S.D. 34 28
Check whether the difference in the variability in yields is
significant. (z0.05 = 1.96)
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐓𝐡𝐞 𝐬𝐚𝐦𝐩𝐥𝐞 𝐬𝐭𝐚𝐧𝐝𝐚𝐫𝐝 𝐝𝐞𝐯𝐢𝐚𝐭𝐢𝐨𝐧𝐬 𝐝𝐨 𝐧𝐨𝐭 𝐝𝐢𝐟𝐟𝐞𝐫 𝐬𝐢𝐠𝐧𝐢𝐟𝐢𝐜𝐚𝐧𝐭𝐥𝐲.
H 4 The yield of wheat in a random sample of 1000 farms in a certain area has
a S.D. of 192 kg. Another random sample of 1000 farms give a S.D. of 224
kg. Are the S. Ds significantly different? (z0.05 = 1.96)
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐓𝐡𝐞 𝐬𝐚𝐦𝐩𝐥𝐞 𝐬𝐭𝐚𝐧𝐝𝐚𝐫𝐝 𝐝𝐞𝐯𝐢𝐚𝐭𝐢𝐨𝐧𝐬 𝐚𝐫𝐞 𝐬𝐢𝐠𝐧𝐢𝐟𝐢𝐜𝐚𝐧𝐭𝐥𝐲 𝐝𝐢𝐟𝐟𝐞𝐫𝐞𝐧𝐭.
✓ H0 : The samples have been drawn from the normal population with means μ1 and μ2
✓ i.e., H0 ∶ μ1 = μ2 .
✓ Let $\bar{X}$, $\bar{Y}$ be the means of the two samples.
$$t = \frac{\bar{X} - \bar{Y}}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
Institution A 65 69 73 71 75 66 71 68 68 74
Institution B 78 69 72 77 84 70 73 77 75 65
Test the claim that institute B is more effective.
(The value of t for 18 degree of freedom at 5% level is 1.734)
𝐀𝐧𝐬𝐰𝐞𝐫: 𝐓𝐡𝐞 𝐜𝐥𝐚𝐢𝐦 𝐢𝐬 𝐯𝐚𝐥𝐢𝐝.
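The two-sample t statistic for this exercise can be sketched as below. The pooled estimate σ² = [Σ(x − x̄)² + Σ(y − ȳ)²] / (n1 + n2 − 2) is an assumption here, since its definition does not appear in the surrounding text; it is the standard choice for 18 degrees of freedom with two samples of size 10.

```python
import math

def pooled_t(sample1, sample2):
    """t = (mean2 - mean1) / (s * sqrt(1/n1 + 1/n2)) with pooled s."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((v - m1) ** 2 for v in sample1)
    ss2 = sum((v - m2) ** 2 for v in sample2)
    s = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))     # pooled estimate of sigma
    return (m2 - m1) / (s * math.sqrt(1 / n1 + 1 / n2))

institution_a = [65, 69, 73, 71, 75, 66, 71, 68, 68, 74]
institution_b = [78, 69, 72, 77, 84, 70, 73, 77, 75, 65]

t = pooled_t(institution_a, institution_b)
claim_supported = t > 1.734        # one-tailed critical value given in the problem
print(round(t, 3), claim_supported)
```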
H 4 Random samples of specimens of coal from two mines A & B are drawn and
their heat producing capacity (in millions of calories per ton) was
measured, yielding the following result: (7)
Sample I 25 30 28 34 24 20 13 32 22 38
Sample II 40 34 22 20 31 40 30 23 36 17
Analyze whether the samples have been drawn from populations with
equal means. [t at 5% level of significance for 18 d.f. is 2.1] Test whether
the means of the two populations are the same at 5% level (t at 0.05 = 2.0739).
Answer: The samples have been drawn from populations with equal
means. Also, the means of the two populations are the same.
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$ with v = n − 2 degrees of freedom.
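A minimal sketch of this statistic in Python is given below. The values r = 0.6 and n = 27 are hypothetical and serve only to show the arithmetic.

```python
import math


def correlation_t(r, n):
    """Return (t, degrees of freedom) for a sample correlation coefficient r."""
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return t, n - 2


# Hypothetical illustration: r = 0.6 observed in a sample of n = 27 pairs.
t_value, dof = correlation_t(0.6, 27)
print(round(t_value, 2), dof)  # 3.75 25
```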
✓ To test whether these estimates are significantly different, or whether the samples may be
regarded as drawn from the same population or from two populations with the same
variance σ², we set up the null hypothesis H0: σ1² = σ2² = σ².
$$F = \frac{s_1^2}{s_2^2}, \quad \text{where } s_1^2 > s_2^2.$$
A 17 27 18 25 27 29 13 17
B 16 16 20 27 26 25 21 −
Test whether the samples are drawn from the same normal population.
(F for 7 and 6 d.f. = 1.19)
Answer: The population variances are equal.
H 4 Two independent samples of sizes 7 and 6 had the following values:
A 28 30 32 33 31 29 34
B 29 30 30 24 27 28 −
Examine whether the samples have been drawn from normal populations
having the same variance. (F for 5 and 6 d.f. = 4.39)
Answer: The samples have been drawn from normal populations
with the same variance.
H 5 Two independent samples of 8 and 7 items respectively had the following
values of the variable (weight in kg): W-19 (4)
Sample I 9 11 13 11 15 9 12 14
Sample II 10 12 10 14 9 8 10 −
Do the two estimates of population variance differ significantly? Given that
for (7,6) d.f. the value of F at 5% level of significance is 4.20 nearly.
Answer: There is no significant difference between the variances of
the two populations.
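The F ratio for this exercise can be checked with the sketch below. It assumes that s1² and s2² are the estimates of population variance obtained by dividing the sum of squared deviations by n − 1; the helper name is illustrative.

```python
# Data from the exercise above (weights in kg).
sample_1 = [9, 11, 13, 11, 15, 9, 12, 14]
sample_2 = [10, 12, 10, 14, 9, 8, 10]


def variance_estimate(values):
    """Estimate of population variance: sum of squared deviations / (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


s1_sq = variance_estimate(sample_1)
s2_sq = variance_estimate(sample_2)

# The larger estimate goes in the numerator so that F >= 1.
f_value = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)
print(f"F = {f_value:.4f}")  # F = 1.2108

# Compare with the tabulated value 4.20 for (7, 6) d.f.
differs_significantly = f_value > 4.20
```

Since F is about 1.21, well below 4.20, the two estimates do not differ significantly, in agreement with the stated answer.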
H 6 Two samples of sizes 9 and 8 give the sums of squares of deviations from
their respective means equal to 160 square inches and 91 square inches respectively. Can
they be regarded as drawn from two normal populations with the same
variance? (F for 8 and 7 d.f. = 3.73). W-19 (3)
Answer: There is no significant difference between the variances of
the populations.
✓ Part-2
➢ H0: The given probability distribution fits the given data well; that is, there is no
significant difference between the observed frequencies (Oi) and the expected frequencies
(ei).
➢ H1: The given probability distribution does not fit the given data well; that is, there
is a significant difference between the observed frequencies (Oi) and the expected frequencies
(ei).
✓ Note that the value of degree of freedom v for binomial, exponential and normal
distribution is n − 1, n − 2 and n − 3, respectively.
C 1 Suppose that a die is tossed 120 times and the recorded data is as follows:
Face Observed(x) 1 2 3 4 5 6
Frequency 20 22 17 18 19 24
Test the hypothesis that the die is unbiased at α = 0.05.
[χ² at 5% level of significance for 5 df is 11.070]
Answer: The die is unbiased.
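The die exercise can be verified with the following sketch. It assumes the usual goodness-of-fit statistic χ² = Σ (Oi − ei)² / ei, with every expected frequency equal to 120/6 = 20 for an unbiased die.

```python
# Observed frequencies for faces 1..6 from the exercise above.
observed = [20, 22, 17, 18, 19, 24]

# Under H0 (unbiased die) each face is expected equally often.
expected = [sum(observed) / len(observed)] * len(observed)

# Assumed statistic: sum of (O - e)^2 / e over all faces.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1

print(round(chi_square, 2), dof)  # 1.7 5

# H0 is retained when the statistic stays below the tabulated 11.070.
die_is_unbiased = chi_square < 11.070
```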
H 2 The following table gives the number of accidents that took place in an
industry during various days of the week. Test if accidents are uniformly
distributed over the week.
(a) 1 5 20 28 42 22 15 5 2
(b) 1 6 18 25 40 25 18 6 1
Apply the χ²-test of goodness of fit.
[χ² at 5% level of significance for 4 df is 9.488]
Answer: This normal distribution provides a good fit.
➢ Construct a contingency table on the basis of given information and find expected
frequency for each cell using
✓ Part-2
⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆
✓ Let (x1, y1), (x2, y2), …, (xn, yn) be the data showing corresponding values of the
variables x and y under consideration. If we plot these data points on a rectangular
coordinate system, the resulting set of points forms a scatter diagram.
✓ From this diagram, it is sometimes possible to visualize a smooth curve approximating the
data. Such a curve is called an approximating curve.
✓ In particular, if the data are approximated well by a straight line, we say that a linear relationship
exists between the variables. In general, the process of finding a relationship of the form y = f(x)
between the two variables x and y, which gives the approximating curve fitting the given data,
is called curve fitting.
❖ CURVE FITTING
✓ Curve fitting is the process of finding the ‘best-fit’ curve for a given set of data. It is the
representation of the relationship between two variables by means of an algebraic
equation.
✓ Suppose that the data points are (x1, y1), (x2, y2), …, (xn, yn), where x is the independent and y
is the dependent variable. Let the fitting curve f(x) have the following deviations (or errors or
residuals) from the data points: di = yi − f(xi), i = 1, 2, …, n.
✓ Clearly, some of the deviations will be positive and others negative. Thus, to give equal
weightage to each error, we square each of these and form their sum; that is,
$$D = d_1^2 + d_2^2 + \cdots + d_n^2$$
✓ Now, according to the method of least squares, the best fitting curve has the property that
$$D = d_1^2 + d_2^2 + \cdots + d_n^2 = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} [y_i - f(x_i)]^2 = \text{a minimum}.$$
✓ For the general point (xi , yi ), the vertical distance of this point from the line y = a + bx is
the deviation di , then di = yi − f(xi ) = yi − a − bxi .
✓ Applying method of least squares, the values of a and b are so determined as to minimize
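$$D = \sum_{i=1}^{n} (y_i - a - b x_i)^2$$
✓ Setting ∂D/∂a = 0 and ∂D/∂b = 0 gives
$$\sum_{i=1}^{n} y_i = a \sum_{i=1}^{n} 1 + b \sum_{i=1}^{n} x_i \quad \text{and} \quad \sum_{i=1}^{n} x_i y_i = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} x_i^2$$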
✓ Which implies
$$\sum_{i=1}^{n} y_i = an + b \sum_{i=1}^{n} x_i \quad \text{and} \quad \sum_{i=1}^{n} x_i y_i = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} x_i^2$$
✓ We obtain following normal equations for the best fitting straight line y = a + bx.
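$$\sum y = an + b \sum x$$
$$\sum xy = a \sum x + b \sum x^2$$
✓ Similarly, for the straight line y = ax + b the normal equations are
$$\sum y = a \sum x + bn$$
$$\sum xy = a \sum x^2 + b \sum x$$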
C 1 By the method of least squares, find the straight line that best fits the
following data:
x 1 2 3 4 5
y 14 27 40 55 68
Answer: y = 13.6x
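The normal equations for y = a + bx can be solved directly for a and b. The sketch below does this for the data of this exercise; the function name fit_line is illustrative.

```python
def fit_line(xs, ys):
    """Solve the normal equations of y = a + b*x and return (a, b)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminating a from the two normal equations gives b, then a follows.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


# Data from the exercise above.
a, b = fit_line([1, 2, 3, 4, 5], [14, 27, 40, 55, 68])
print(f"y = {a:.4f} + {b:.4f}x")
```

For this data the intercept a comes out as zero (up to rounding) and b = 13.6, which reproduces the stated answer y = 13.6x.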
H 2 Fit a straight line to the following data:
x 71 68 73 69 67 65 66 67
y 69 72 70 70 68 67 68 68
Answer: y = 46.9394 + 0.3232x
H 3 The weights of a calf taken at weekly intervals are given below. Fit a straight
line using the method of least squares.
Age (x) 1 2 3 4 5 6 7 8 9 10
Weight (y) 52.5 58.7 65 70.2 75.4 81.1 87.2 95.5 102.5 108.4
x −2 −1 0 1 2
y 1 2 3 3 4
Answer: y = 0.7x + 2.6
x 6 7 7 8 8 8 9 9 10
y 5 5 4 5 4 3 4 3 3
Answer: y = −0.5x + 8
H 7 If P is the pull required to lift a load W by means of a pulley block, find a
linear approximation of the form P = mW + c connecting P and W, using
the following data:
P 13 18 23 27
W 51 75 102 119
Answer: P = 0.2028W + 2.6580
H 8 The following show the gain in reading speed of 3 students in a speed-
reading program, and the number of weeks they have been in the program:
No. of weeks 3 5 2 8 6 9 3 4
Speed gain 86 118 49 193 164 232 73 109
Find a straight line by the method of least squares.
Answer: y = 3.3409 + 24.9318x
H 9 Fit a straight line for following data. Also, find y when x = 2.8.
x 2 5 6 9 11
y 2 4 6 9 10
Answer: y = −0.0244 + 0.9431x, y(2.8) = 2.6163
C 10 By the method of least squares, fit a linear relation of the form P = a + bW to
the following data, where P is the pull required to lift a weight W. Also estimate P
when W is 150.
P 50 70 100 120
W 12 15 21 25
Answer: P = −11.8005 + 5.3041W, P(150) = 783.8145
❖ FITTING A PARABOLA
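✓ Consider a set of n pairs of the given values (x, y) for fitting the curve y = a + bx + cx². The
residual R = y − (a + bx + cx²) is the difference between the observed and estimated
values of y. We have to find a, b, c such that the sum of the squares of the residuals,
∑R² = ∑[y − (a + bx + cx²)]², is minimum.
✓ The normal equations for the parabola y = a + bx + cx² are
$$\sum y = na + b \sum x + c \sum x^2$$
$$\sum xy = a \sum x + b \sum x^2 + c \sum x^3$$
$$\sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4$$
✓ Similarly, for the parabola y = ax² + bx + c the normal equations are
$$\sum y = a \sum x^2 + b \sum x + nc$$
$$\sum xy = a \sum x^3 + b \sum x^2 + c \sum x$$
$$\sum x^2 y = a \sum x^4 + b \sum x^3 + c \sum x^2$$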
x 50 70 100 120
y 12 15 21 25
Answer: y = 5.5259 + 0.1029x + 0.0005x²
x 1 2 3 4 5 6
y 3.13 3.76 6.94 12.62 20.86 31.53
Answer: y = 4.982 − 3.1199x + 1.2579x²
H 3 Fit a parabola y = a + bx + cx² to the following data:
x 1 2 3 5 6
y 1.1 5.8 17.5 55.9 86.7
Answer: y = 2.7227 − 4.5528x + 3.0771x²
H 4 Fit a parabola y = a + bx + cx² to the following data:
x 0 1 2 3 4
y 1 4 10 17 30
Answer: y = 1.2 + 1.1x + 1.5x²
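The three normal equations for y = a + bx + cx² form a 3 × 3 linear system in a, b and c. The sketch below builds that system for the data of this exercise and solves it with a small Gauss-Jordan elimination routine; both helper names are illustrative, and any other linear solver could be substituted.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [rv - factor * cv for rv, cv in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_parabola(xs, ys):
    """Return (a, b, c) of y = a + b*x + c*x^2 from the normal equations."""
    n = len(xs)
    sx = sum(xs)
    sx2 = sum(x ** 2 for x in xs)
    sx3 = sum(x ** 3 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x ** 2 * y for x, y in zip(xs, ys))
    coefficients = [[n, sx, sx2], [sx, sx2, sx3], [sx2, sx3, sx4]]
    return solve_linear_system(coefficients, [sy, sxy, sx2y])


# Data from the exercise above.
a, b, c = fit_parabola([0, 1, 2, 3, 4], [1, 4, 10, 17, 30])
print(round(a, 4), round(b, 4), round(c, 4))
```

For this data the system is 5a + 10b + 30c = 62, 10a + 30b + 100c = 195, 30a + 100b + 354c = 677, whose solution a = 1.2, b = 1.1, c = 1.5 matches the stated answer.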
H 5 Fit a second degree parabola y = a + bx + cx² to the following data:
x −1 0 1 2 3
y 5 6 21 50 93
Answer: y = 7x² + 8x + 6
x −3 −2 −1 0 1 2 3
y 12 4 1 2 7 15 30
Answer: y = 2.1190x² + 2.9286x + 1.6667
H 9 Fit a polynomial of degree two using the least squares method for the following
experimental data. Also, estimate y(2.4).
x 1 2 3 4 5
y 5 12 26 60 97
Answer: y = 10.4 − 11.0857x + 5.7143x², y(2.4) = 16.7087
C 10 Fit a relation of the form R = a + bV + cV² to the following data, where V
is the velocity in km/hr. and R is the resistance in km/quintal. Estimate R
when V = 90.
V 20 40 60 80 100 120
R 5.5 9.1 14.9 22.8 33.3 46.0
Answer: R = 4.35 + 0.0024V + 0.0029V², R(90) = 28.0560
H 11 The following are the data on the drying time of a certain varnish and the
amount of an additive that is intended to reduce the drying time: W-19 (7)
Amount of varnish additive (grams) "x" 0 1 2 3 4 5 6 7 8
Drying time (hr.) "y" 12 10.5 10 8 7 8 7.5 8.5 9
I. Fit a second degree polynomial by the method of least squares.
II. Use the result to predict the drying time of the varnish when 6.5 g of
the additive is used.
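✓ y = ax^b
✓ y = ab^x
✓ y = a + bx²
✓ pv^γ = C
➢ $v = \left(\frac{C}{p}\right)^{1/\gamma} \Rightarrow v = C^{1/\gamma}\, p^{-1/\gamma}$
➢ Taking logarithms on both sides, $\log v = \frac{1}{\gamma} \log C - \frac{1}{\gamma} \log p$.
➢ Denoting $\log v = Y$, $\frac{1}{\gamma} \log C = A$, $-\frac{1}{\gamma} = B$ and $\log p = X$, we obtain Y = A + BX.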
C 1 Fit a curve of the best fit of the type y = ae^(bx) to the following data:
x 1 5 7 9 12
y 10 15 12 15 21
Answer: y = 9.4754 e^(0.059x)
H 2 Fit a curve of the best fit of the type y = ae^(bx) to the following data:
x 1 2 3 4
y 1.65 2.7 4.5 7.35
Answer: y = 1.0001 e^(0.4993x)
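A curve of the type y = ae^(bx) can be fitted with the straight-line method after taking natural logarithms: ln y = ln a + bx, which is linear in x with Y = ln y. The sketch below applies this to the data of this exercise; the linearization by natural logarithms and the function name are assumptions of this sketch.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) via a least-squares line through (x, ln y)."""
    n = len(xs)
    log_ys = [math.log(y) for y in ys]
    sum_x = sum(xs)
    sum_ly = sum(log_ys)
    sum_x_ly = sum(x * ly for x, ly in zip(xs, log_ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope and intercept of the line ln y = ln a + b*x.
    b = (n * sum_x_ly - sum_x * sum_ly) / (n * sum_x2 - sum_x ** 2)
    log_a = (sum_ly - b * sum_x) / n
    return math.exp(log_a), b


# Data from the exercise above.
a, b = fit_exponential([1, 2, 3, 4], [1.65, 2.7, 4.5, 7.35])
print(round(a, 4), round(b, 4))
```

The values obtained agree with the stated answer (a ≈ 1.0001, b ≈ 0.4993) to the precision given.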
H 3 The population (p) of a small community on the outskirts of a city grows
rapidly over a 20-year period:
t 0 5 10 15 20
p 100 200 450 950 2000
As an engineer working for a utility company, you must forecast the
population 5 years into the future in order to anticipate the demand for
power. Employ an exponential model and linear regression to make this
prediction.
Answer: p = 97.915 e^(0.151t)
H 4 Fit a curve of the best fit of the type y = ax^b to the following data:
x 2 3 4 5
y 27.8 62.1 110 161
Answer: y = 7.3802 x^1.9302
C 5 Fit a curve of the best fit of the type y = ax^b to the following data:
x 1 2 3 4 5
y 0.5 2 4.5 8 12.5
Answer: y = 0.5x²
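For y = ax^b, taking logarithms gives log y = log a + b log x, so a straight line Y = A + BX is fitted with X = log x and Y = log y. The sketch below does this for the data of this exercise; the use of base-10 logarithms and the function name are choices made for this sketch.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b via a least-squares line through (log x, log y)."""
    n = len(xs)
    log_xs = [math.log10(x) for x in xs]
    log_ys = [math.log10(y) for y in ys]
    sum_lx = sum(log_xs)
    sum_ly = sum(log_ys)
    sum_lx_ly = sum(lx * ly for lx, ly in zip(log_xs, log_ys))
    sum_lx2 = sum(lx * lx for lx in log_xs)
    # Slope B = b and intercept A = log10(a) of the line Y = A + B*X.
    b = (n * sum_lx_ly - sum_lx * sum_ly) / (n * sum_lx2 - sum_lx ** 2)
    log_a = (sum_ly - b * sum_lx) / n
    return 10 ** log_a, b


# Data from the exercise above.
a, b = fit_power_law([1, 2, 3, 4, 5], [0.5, 2, 4.5, 8, 12.5])
print(round(a, 4), round(b, 4))
```

The data lie exactly on y = 0.5x², so the fit returns a = 0.5 and b = 2 up to rounding.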
C 6 Fit a curve of the best fit of the type y = ab^x to the following data:
x 2 3 4 5 6
y 8.3 15.4 33.1 65.2 126.4
Answer: y = 2.0495 (1.9917)^x
H 7 Fit a curve of the best fit of the type y = ab^x to the following data:
x 2 3 4 5 6
y 144 172.8 207.4 248.8 298.5
Answer: y = 100.0230 (1.2)^x
C 8 Find the least squares fit of the form y = a0 + a1x² to the following data:
x −1 0 1 2
y 2 5 3 0
Answer: y = 4.1667 − 1.1111x²
H 9 Using the least squares method, fit the curve y = ax² + b/x to the following data:
x 1 2 3 4
y −1.51 0.99 3.88 7.66
Answer: y = 0.5108x² − 2.0826/x
H 10 The pressure P of a gas corresponding to various volumes V is
given by the following data. Fit the data to the equation PV^γ = C.
P 50 60 70 80 90
V 64.7 51.3 40.5 25.9 78
Answer: PV^3.0931 = 11303240.36
⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆ ⋆
Rationale: Systematic study of uncertainty (randomness) by probability - statistics and curve fitting by
numerical methods
Content:
GUJARAT TECHNOLOGICAL UNIVERSITY
Bachelor of Engineering
Subject Code: 3130006
Suggested Specification table with Marks (Theory):
Note: This specification table shall be treated as a general guideline for students and teachers. The actual
distribution of marks in the question paper may vary from above table. This subject will be taught by
Maths faculties.
Reference Books:
(1) P. G. Hoel, S. C. Port and C. J. Stone, Introduction to Probability Theory, Universal Book
Stall.
(2) S. Ross, A First Course in Probability, 6th Ed., Pearson Education India.
(3) W. Feller, An Introduction to Probability Theory and its Applications, Vol. 1, Wiley.
(4) D. C. Montgomery and G. C. Runger, Applied Statistics and Probability for Engineers,
Wiley.
(5) J. L. Devore, Probability and Statistics for Engineering and the Sciences, Cengage
Learning.
Course Outcome:
Sr. No.   CO statement   Marks % weightage
CO-4   Apply statistics for testing the significance of the given large and small sample data by using the t-test, F-test and Chi-square test   25 %
Seat No.: ________ Enrolment No.___________
Q.1 (a) In how many different ways can 4 of 15 laboratory assistants be chosen to assist with an 03
experiment?
(b) If 5 of 20 tires in storage are defective and 5 of them are randomly chosen for inspection 04
(that is, each tire has the same chance of being selected), what is the probability that the
two of the defective tires will be included?
(c) The following are the data on the drying time of a certain varnish and the amount of an 07
additive that is intended to reduce the drying time?
Amount of varnish additive (grams) "x" 0 1 2 3 4 5 6 7 8
Drying time (hr) "y" 12.0 10.5 10.0 8.0 7.0 8.0 7.5 8.5 9.0
(i) Fit a second degree polynomial by the method of least square.
(ii) Use the result of (i) to predict the drying time of the varnish when 6.5 gms of the
additive is being used.
Q.2 (a) If 3 balls are “randomly drawn” from a bowl containing 6 white and 5 black balls, what 03
is the probability that one of the balls is white and the other two black?
(b) The article “A Thin-Film Oxygen Uptake Test for the Evaluation of Automotive 04
Crankcase Lubricants” reported the following data on oxidation-induction time (min) for
various commercial oils:
87, 103, 130, 160, 180, 195, 132, 145, 211, 105, 145, 153, 152, 138, 87, 99, 93, 119,
129
(i) Calculate the sample variance and standard deviation.
(ii) If the observations were re-expressed in hours, what would be the resulting values of
the sample variance and sample standard deviation?
(c) In an examination, minimum 40 marks for passing and 75 marks for distinction are 07
required. In this examination 45% students passed and 9% obtained distinction. Find
average marks and standard deviation of this distribution of marks.
[P(z=0.125)=0.05 and P(z=1.34)=0.41]
OR
(c) Distribution of height of 1000 students is normal with mean 165 cms and standard 07
deviation 15 cms. How many soldiers are of height
(i) less than 138 cms (ii) more than 198 cms (iii) between 138 and 198 cms.
[P(z=1.8)=0.4641, P(z=2.2)=0.4861]
Q.3 (a) Compute the coefficient of correlation between X and Y using the following data: 03
X 2 4 5 6 8 11
Y 18 12 10 8 7 5
(b) An analysis of monthly wages paid to workers in two firms A and B belonging to the same 04
industry gave the following results.
Firm A Firm B
No. of wages earners 986 548
Average monthly wages Rs. 52.5 Rs. 47.5
Variance of distribution of wages 100 121
(a) Which firm pays out the larger amount as its wage bill?
(b) In which firm is there greater variability in individual wages?
(c) Obtain the two lines of regression for the following data: 07
Sales (No. of tablets) 190 240 250 300 310 335 300
Advertising expenditure (Rs.) 5 10 15 20 20 30 30
OR
Q.3 (a) A sample of 20 items has mean 42 units and standard deviation 5 units. Test the hypothesis 03
that it is a random sample from a normal population with mean 45 units.[t at 5% level for
19 d.f. is 2.09.]
(b) A university warehouse has received a shipment of 25 printers, of which 10 are laser 04
printers and 15 are inkjet models. If 6 of these 25 are selected at random to be checked by
a particular technician, what is the probability that exactly 3 of those selected are laser
printers (so that the other 3 are inkjets)?
(c) Find the regression equation showing the capacity utilization on production from the 07
following data:
Average Standard deviation
Production (in lakh units) 35.6 10.5
Capacity utilization (in %) 84.8 8.5
Correlation coefficient r = 0.62
Estimate the production when capacity utilization is 70%.
Q.4 (a) Each sample of water has a 10% chance of containing a particular organic pollutant. 03
Assume that the samples are independent with regard to the presence of the pollutant.
Find the probability that in the next 18 samples, at least 4 samples contain the pollutant.
(b) Goals scored by two teams A and B in a football season were as follows: 04
(b) A microchip company has two machines that produce the chips. Machine I produces 65% 04
of the chips, but 5% of its chips are defective. Machine II produces 35% of the chips and
15% of its chips are defective. A chip is selected at random and found to be defective.
What is the probability that it came from Machine I?
(c) If a publisher of nontechnical books takes great pains to ensure that its books are free of 07
typographical errors, so that the probability of any given page containing at least one such
error is 0.005 and errors are independent from page to page, what is the probability that
one of its 400-page novels will contain (i) exactly one page with errors? (ii) at most three
pages with errors?
Q.5 (a) Samples of sizes 10 and 14 were taken from two normal populations with standard 03
deviation 3.5 and 5.2. The sample means were found to be 20.3 and 18.6. Test whether
the means of the two populations are the same at 5% level. [ t 0.05 =2.0739].
(b) Two independent samples of 8 and 7 items respectively had the following values of the 04
variable (weight in kg):
Sample I : 9 11 13 11 15 9 12 14
Sample II: 10 12 10 14 9 8 10
Do the two estimates of population variance differ significantly? Given that for (7,6) d.f.
the value of F at 5% level of significance is 4.20 nearly.
(c) Records taken of the number of male and female births in 830 families having four 07
children are as follows:
Number of male births 0 1 2 3 4
Number of female births 4 3 2 1 0
Number of families 32 178 290 236 94
Test whether the data are consistent with the hypothesis that the Binomial law holds and
the chance of male birth is equal to that of female birth, namely p = q = 1/2. [χ² at 5%
level of significance for 4 df is 9.49]
OR
Q.5 (a) Two samples of sizes 9 and 8 give the sums of squares of deviations from their respective 03
means equal to 160 square inches and 91 square inches respectively. Can they be regarded as drawn
from two normal populations with the same variance?
(F for 8 and 7 d.f. = 3.73).
(b) A die is thrown 276 times and the results of these throws are given below: 04
Number appeared on the die 1 2 3 4 5 6
Frequency 40 32 29 59 57 59
Test whether the die is biased or not. [χ² at 5% level of significance for 5 df is 11.09]
(c) The following figures refer to observations in live independent samples: 07
Sample I: 25 30 28 34 24 20 13 32 22 38
Sample II: 40 34 22 20 31 40 30 23 36 17
Analyse whether the samples have been drawn from the population of equal means.
[t at 5% level of significance for 18 d.f is 2.1] Test whether the means of two population
are same at 5% level (t at 0.05=2.0739)
*************************