Hypothesis

Uploaded by

vivek sharma
Copyright
© © All Rights Reserved
HYPOTHESIS

HYPOTHESIS TESTING
LEARNING OUTCOMES

• Discuss the concepts used in hypothesis testing
• Discuss the steps involved in a hypothesis testing exercise
• Carry out the test of significance of the mean of a single population using both t and z tests
• Illustrate the test of significance of the difference between two population means using t and z tests
• Use SPSS software to conduct hypothesis testing
HYPOTHESIS
• A premise or claim that we want to test or investigate.
• It is an assumption about an unknown population parameter.
• A hypothesis is always framed before the sample is drawn and data are collected.

As per Claude Bernard: the research method will not give new and productive ideas to those who do not have them; it will only help in guiding the ideas of those who have them and in developing them so as to draw the best possible results.

Hypothesis testing is a well-defined procedure which helps us to decide objectively whether to accept or reject the hypothesis based on the information available from the sample.
TYPES OF HYPOTHESIS
• NULL HYPOTHESIS: Set as no difference or status quo, and considered true until it is proved wrong by the sample collected.
• ALTERNATE HYPOTHESIS: The logical opposite of the null hypothesis.
NULL HYPOTHESIS
A few things to keep in mind:
– It should be stated in clear and specific terms.
– It is the one which the researcher is trying to disprove or reject.
– Hypothesis testing is done on the basis of a null hypothesis.
WRITING HYPOTHESIS

• A company makes pipes with a 4 mm diameter. Employees claim that, after servicing, the machine is no longer making pipes of 4 mm diameter. To check this claim, a sample of 100 pipes was taken and tested at the 99% confidence level.

H0 : µ = 4 mm.
Ha : µ ≠ 4 mm.
• Doctors believe that the average teen sleeps no longer than 10 hours per day. A researcher believes that teens on average sleep longer.

H0 : µ ≤ 10 hours.
Ha : µ > 10 hours.
EXERCISE
• An ice cream vendor has an average sale of Rs. 500 per day. Due to the establishment of a school in the locality, he expects ice cream sales to increase. Create the null and alternate hypotheses.

• A company manufacturing air conditioners has to meet a quality rating of 5000 points, and the standard deviation allowed is 280. A sample of 100 customers reveals that they give the air conditioners a rating of 4950. Create the null and alternate hypotheses.
• From a population of 250 labourers working in a village, it has been seen that the average weekly wage of labourers is Rs. 500 with a standard deviation of Rs. 75. A sample of 50 labourers is drawn and their average weekly wage comes out to be Rs. 513. Create the null and alternate hypotheses.
• Two-tailed test: It is a non-directional test where the alternate hypothesis is expressed as “not equal to”.
• One-tailed test: It is used when the alternate hypothesis specifies the population mean to be higher or lower than the hypothesized mean.
– Right-tailed test
– Left-tailed test
• It is the alternate hypothesis that determines whether a test will be two-tailed or one-tailed.
RIGHT TAILED TEST
LEFT TAILED TEST
TYPE I AND TYPE II ERROR
• Type I error: It is rejecting a null hypothesis when it is correct. The probability of a Type I error is known as the significance level and is determined in advance.
• Type II error: It is accepting a null hypothesis that is false.
• Level of significance: It is the probability of rejecting a true null hypothesis.
• Power of a test: It is the probability of rejecting a null hypothesis that is false, that is, 1 − β, where β is the probability of a Type II error.
HYPOTHESIS TESTING
PROCEDURE
• Set null and alternate hypotheses
• Determine the appropriate statistical test
• Set the level of significance
• Set the decision rule
• Collect the sample data
• Analyse the data
• Arrive at a statistical conclusion and business implication.
STEP 1: SET NULL AND ALTERNATIVE HYPOTHESES

• The null hypothesis, generally denoted by H0 (H sub-zero), is the hypothesis that is tested for possible rejection under the assumption that it is true. Theoretically, a null hypothesis is set as no difference or status quo and is considered true until it is proved wrong by the collected sample data.
• Symbolically, a null hypothesis is represented as: H0 : µ = µ0 (where µ0 is the hypothesized value of the population mean)

• The alternative hypothesis, generally denoted by H1 (H sub-one), is the logical opposite of the null hypothesis. In other words, when the null hypothesis is found to be true, the alternative hypothesis must be false, and when the null hypothesis is found to be false, the alternative hypothesis must be true.
• Symbolically, the alternative hypothesis is represented as: H1 : µ ≠ µ0
STEP 2: DETERMINE THE APPROPRIATE
STATISTICAL TEST

• The type, number, and level of the data provide a basis for deciding the statistical test.
• Apart from these, the statistic used in the study (mean, proportion, variance, etc.) must also be considered when a researcher decides on the appropriate statistical test to apply for hypothesis testing.
STEP 3: SET THE LEVEL OF SIGNIFICANCE

• The level of significance, generally denoted by α, is the probability of rejecting the null hypothesis when it is in fact true.
• The level of significance is also known as the size of the rejection region or the size of the critical region.
• The levels of significance generally applied by researchers are 0.01, 0.05, and 0.10.
STEP 4: SET THE DECISION RULE
Figure 10.2: Acceptance and rejection regions of null hypothesis (two-tailed test)

The area under the normal curve is divided into two mutually exclusive regions: the acceptance region (where the null hypothesis is accepted) and the rejection region or critical region (where the null hypothesis is rejected).
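The decision rule can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, assuming a z test. The helper names `z_critical` and `reject_null` are hypothetical and do not come from the slides.

```python
from statistics import NormalDist


def z_critical(alpha: float, tail: str = "two") -> float:
    """Return the positive critical z value for significance level alpha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # A two-tailed test splits alpha equally between the two tails.
        return std_normal.inv_cdf(1 - alpha / 2)
    # A one-tailed test puts all of alpha in a single tail.
    return std_normal.inv_cdf(1 - alpha)


def reject_null(z: float, alpha: float, tail: str = "two") -> bool:
    """Decision rule: True when z falls in the rejection (critical) region."""
    crit = z_critical(alpha, tail)
    if tail == "two":
        return abs(z) > crit
    if tail == "right":
        return z > crit
    return z < -crit  # tail == "left"


print(round(z_critical(0.05, "two"), 2))    # 1.96
print(round(z_critical(0.05, "right"), 3))  # 1.645
```

The two printed values are the table values quoted in the z-test examples of this deck (1.96 for a two-tailed test and 1.645 for a one-tailed test at the 5% level).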
STEP 5: COLLECT THE SAMPLE DATA

• In this stage of sampling, data are collected and the appropriate sample statistics are computed.
• The first four steps should be completed before collecting the data for the study.
• It is not advisable to collect the data first and then decide on the stages of hypothesis testing.
STEP 6: ANALYSE THE DATA

• In this step, the researcher has to compute the test statistic. This involves selection of an appropriate probability distribution for a particular test.
• Some of the commonly used testing procedures are z, t, F, and χ².
STEP 7: ARRIVE AT A STATISTICAL CONCLUSION
AND BUSINESS IMPLICATION

• In this step, the researchers draw a statistical conclusion. A statistical conclusion is a decision to accept or reject a null hypothesis.
• Statisticians present the information obtained using the hypothesis-testing procedure to the decision makers. Decisions are made on the basis of this information. Ultimately, a decision maker decides whether a statistically significant result is a substantive result that needs to be implemented for meeting the organization’s goals.
HYPOTHESIS TESTING FOR A SINGLE
POPULATION USING THE Z STATISTIC
EXAMPLE
• A marketing research firm conducted a survey 10 years ago and found that the average household income of a particular geographic region was Rs. 10,000. Mr. Gupta, who has recently joined the firm as a vice president, has expressed doubts about the accuracy of the data. To verify the data, the firm has decided to take a random sample of 200 households, which yields a sample mean (for household income) of Rs. 11,000. Assume that the population standard deviation of the household income is Rs. 1,200. Verify Mr. Gupta’s doubts using the seven steps of hypothesis testing. Take 5% level of significance. Table value: 1.96
• Z = 11.79
• Null hypothesis rejected
• A cable TV network company wants to provide modern facilities to its consumers. The company has five-year-old data which reveal that the average household income is Rs. 120,000. Company officials believe that due to the fast development in the region, the average household income might have increased. The company takes a random sample of 40 households to verify this assumption. From the sample, the average household income is calculated as Rs. 125,000. From historical data, the population standard deviation is obtained as Rs. 1,200. Use 5% level of significance to verify the finding. Z table value is 1.645.
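The z statistic for a single population mean, z = (x̄ − µ0) / (σ / √n), can be checked with a short sketch. The function name `one_sample_z` is hypothetical. The figures are the ones given in the two examples above.

```python
import math


def one_sample_z(sample_mean, hypothesized_mean, sigma, n):
    """z statistic for one mean when the population standard deviation is known."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Household income example: two-tailed test, table value 1.96.
z_income = one_sample_z(11_000, 10_000, 1_200, 200)
print(round(z_income, 2))    # 11.79
print(abs(z_income) > 1.96)  # True, so the null hypothesis is rejected

# Cable TV example: right-tailed test, table value 1.645.
z_cable = one_sample_z(125_000, 120_000, 1_200, 40)
print(round(z_cable, 2))     # 26.35
print(z_cable > 1.645)       # True
```

With the figures as stated, the cable TV statistic lies far inside the right-tail rejection region, which would support the officials' belief that income has increased.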
HYPOTHESIS TESTING FOR A SINGLE
POPULATION USING THE T STATISTIC
EXAMPLE
• Royal Tyres has launched a new brand of tyres for tractors and claims that under normal circumstances the average life of the tyres is 40,000 km. A retailer wants to test this claim and has taken a random sample of 8 tyres. He tests the life of the tyres under normal circumstances. Table value: 2.365. The results obtained are presented below:
• S = 2618.61
• t = -0.27
• Null hypothesis accepted
• Prices of shares of a company on different days in a month were found to be as follows:
• 572, 545, 575, 570, 580, 565, 568, 571, 572, 592.
• Test at 5% level of significance whether the average price of the shares is Rs. 575.
• t table value is 2.262
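A minimal sketch of the one-sample t statistic, t = (x̄ − µ0) / (s / √n), applied to the share prices above. The variable names are illustrative. The table value 2.262 (9 degrees of freedom, two-tailed, 5%) is the one quoted in the exercise.

```python
import math
from statistics import mean, stdev

prices = [572, 545, 575, 570, 580, 565, 568, 571, 572, 592]
hypothesized_mean = 575

n = len(prices)
x_bar = mean(prices)  # sample mean
s = stdev(prices)     # sample standard deviation (divisor n - 1)
t_stat = (x_bar - hypothesized_mean) / (s / math.sqrt(n))

print(x_bar)                # 571
print(round(t_stat, 2))     # -1.07
print(abs(t_stat) > 2.262)  # False
```

Since |t| is below the table value, this calculation suggests the null hypothesis (average price of Rs. 575) is not rejected at the 5% level.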
HYPOTHESIS TESTING FOR THE
DIFFERENCE BETWEEN TWO POPULATION
MEANS USING THE Z STATISTIC
EXERCISE 1.2

• Given the following information relating to two places, A and B, test whether there is any significant difference between their mean wages:

                  A       B
Mean wages        47      49
SD                28      40
No. of workers    1000    1500

Z tabulated value: 1.96

• Z cal = -1.47
• H0 accepted at 0.05.
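The calculated z can be reproduced with the large-sample formula z = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2). The sketch below assumes that the sample standard deviations stand in for the population values, which is the usual large-sample treatment. The function name `two_sample_z` is hypothetical.

```python
import math


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for the difference between two means from large samples."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


z_wages = two_sample_z(47, 28, 1000, 49, 40, 1500)
print(round(z_wages, 2))    # -1.47
print(abs(z_wages) > 1.96)  # False, so H0 is not rejected at the 5% level
```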
INDEPENDENT SAMPLE T-TEST

• A group of seven chickens reared on a high protein diet weigh 12, 15, 11, 16, 14, 14, and 16 ounces; a second group of five chickens, similarly treated except that they receive a low protein diet, weigh 8, 10, 14, 10 and 13 ounces. Test at 5 per cent level whether there is significant evidence that additional protein has increased the weight of the chickens.
• Taking the null hypothesis that additional protein has not increased the weight of the chickens, we can write:
H0 : μ1 = μ2
Ha : μ1 > μ2 (as we want to conclude that additional protein has increased the weight of chickens)
• Since the variances of the populations are not known and the samples are small, we shall use the t-test for difference in means, assuming the populations to be normal.
• Degrees of freedom = (n1 + n2 – 2) = (7 + 5 – 2) = 10
• As Ha is one-sided, we shall apply a one-tailed test (in the right tail, because Ha is of the “greater than” type) for determining the rejection region at 5 per cent level at 10 degrees of freedom.

• H0 is rejected and we can conclude that additional protein has increased the weight of chickens, at 5 per cent level of significance.
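The slides state the conclusion without showing the computed t. A minimal sketch of the pooled-variance t statistic for the chicken data follows. It assumes equal population variances (consistent with n1 + n2 − 2 degrees of freedom), and it takes 1.812 as the one-tailed 5% table value for 10 degrees of freedom, a standard t-table entry that does not appear in the slides. The function name `pooled_t` is hypothetical.

```python
import math
from statistics import mean


def pooled_t(sample1, sample2):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    ss1 = sum((x - m1) ** 2 for x in sample1)  # sum of squared deviations, group 1
    ss2 = sum((x - m2) ** 2 for x in sample2)  # sum of squared deviations, group 2
    dof = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / dof
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, dof


high_protein = [12, 15, 11, 16, 14, 14, 16]
low_protein = [8, 10, 14, 10, 13]

t_chickens, dof_chickens = pooled_t(high_protein, low_protein)
print(dof_chickens)          # 10
print(round(t_chickens, 2))  # 2.39
print(t_chickens > 1.812)    # True, which agrees with rejecting H0
```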
PAIRED T TEST
• Applicable when samples are related or dependent.
• In the independent-samples t test, the two samples are assumed to be independent, as the value of one observation does not depend on the other.
• When the elements in one sample are related to the observations in the other sample, they are said to be dependent samples.
• While using the paired t test, observations collected from the two samples are in the form of matched pairs, e.g. before-and-after treatment observations.
• While carrying out the paired t test, the mean and standard deviation of the “difference” are calculated.
• The best selling product of a consumer durables manufacturer has reached the saturation stage in its product life cycle. The company is not willing to withdraw the product from the market and has decided to motivate its sales executives to take the personal selling route. The company organized a three-day workshop to motivate its sales executives. Three months later, the company selected nine sales executives randomly and collected data on the average number of productive sales calls in a day before and after the training. Test at 5% level of significance. t value: 2.306. The data collected are provided in the following table:

Sales executive    Productive sales calls (before training)    Productive sales calls (after training)
1                  3                                           6
2                  4                                           7
3                  2                                           5
4                  5                                           7
5                  3                                           2
6                  4                                           6
7                  6                                           5
8                  5                                           8
9                  4                                           6
SOLUTION
Sales executive    Before training (X)    After training (Y)    d = X - Y    d^2
1                  3                      6                     -3           9
2                  4                      7                     -3           9
3                  2                      5                     -3           9
4                  5                      7                     -2           4
5                  3                      2                     1            1
6                  4                      6                     -2           4
7                  6                      5                     1            1
8                  5                      8                     -3           9
9                  4                      6                     -2           4
Total (n = 9)                                                   -16          50
• Mean(D) = -16/9 = -1.78
• Standard deviation = 1.6415
• t value = -3.25
• Null hypothesis is rejected.
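The paired t statistic is t = d̄ / (s_d / √n), where d̄ and s_d are the mean and standard deviation of the differences. A short sketch using the table above (variable names are illustrative):

```python
import math
from statistics import mean, stdev

before = [3, 4, 2, 5, 3, 4, 6, 5, 4]  # X: calls before training
after = [6, 7, 5, 7, 2, 6, 5, 8, 6]   # Y: calls after training

differences = [x - y for x, y in zip(before, after)]  # d = X - Y
n = len(differences)
d_bar = mean(differences)
s_d = stdev(differences)  # sample standard deviation of d (divisor n - 1)
t_paired = d_bar / (s_d / math.sqrt(n))

print(sum(differences))       # -16
print(round(d_bar, 2))        # -1.78
print(round(s_d, 4))          # 1.6415
print(round(t_paired, 2))     # -3.25
print(abs(t_paired) > 2.306)  # True, so the null hypothesis is rejected
```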
• A shopkeeper has shifted from using a manual typewriter to a computer to do his job. The number of mistakes he makes with each method is given below. Is the computer helpful in reducing mistakes? Test at 5% level of significance. Table t value: 2.57

Page    Mistakes before using computer    Mistakes after using computer
1       58                                53
2       29                                28
3       30                                31
4       55                                48
5       56                                50
6       45                                42
QUICK QUIZ
Q1: The form of the alternative hypothesis can be:
A) one-tailed
B) two-tailed
C) neither one nor two-tailed
D) one or two-tailed
Q2) By taking a level of significance of 5% it is the same as saying:

a) We are 5% confident the results have not occurred by chance


b) We are 95% confident that the results have not occurred by chance
c) We are 95% confident that the results have occurred by chance
d) None of the above
Q3: One-tailed alternatives are phrased in terms of:
A) 
B) < or >
C)  or =
D) None of the above
Q4: A two-tailed test is one where:

A) results in only one direction can lead to rejection of the null hypothesis
B) negative sample means lead to rejection of the null hypothesis
C) results in either of two directions can lead to rejection of the null hypothesis
D) no results lead to the rejection of the null hypothesis
LEARNING OUTCOME

Chi Applications
• Test of Independence
F Test
One Way ANOVA
CHI SQUARE TEST
• The chi-square test was developed by Karl Pearson in 1900.
• It is a non-parametric test.
• This test deals with categorical data.
• Categorical data is defined as the counting of frequencies from one or more variables.
• A categorical variable (sometimes called a nominal variable) is one that has two or more categories, but there is no intrinsic ordering to the categories. For example, gender is a categorical variable having two categories (male and female), and there is no intrinsic ordering to the categories.
• Eg: The company has a total of 40,000 officers and it selected a random sample of 650
officers across four departments to assess the representativeness across departments in the
seminar. Out of 650 randomly selected officers, 150 officers are from the production
department, 200 officers are from the marketing department, 160 from the finance
department and remaining 140 from the human resource department.

• A research variable “representativeness from the departments” does not require any rating
scale.
• Here the research question is the frequency count from each department and can be
analysed using the chi square technique.
DEFINING CHI-SQUARE TEST STATISTIC

• Non-parametric test for testing of hypotheses.

Applications of Chi-Square:
• Goodness of Fit
• Test of Independence

Copyright © Dorling Kindersley India Pvt. Ltd

CHI-SQUARE GOODNESS-OF-FIT TEST

Chi-square test provides a platform that can be used to ascertain whether theoretical
probability distributions coincide with empirical sample distributions.

Example 13.1: A company is concerned about the increasing violent altercations between its employees. The number of violent incidents recorded by the management during six randomly selected months is given in Table 13.2.
Determine whether the data fit a uniform distribution at 5% level of significance.
COMPUTATION OF EXPECTED FREQUENCIES AND
CHI-SQUARE STATISTIC FOR EXAMPLE 13.1
CHI-SQUARE TEST OF INDEPENDENCE:

In many business situations, a market researcher might be interested in understanding the relationship between two variables or in checking whether they are independent of each other.

Eg: An edible oil company may be interested in knowing whether the purchase of oil is independent of the customer’s age or whether it is dependent on the customer’s age.

Eg2: The HRD Manager of a company may be interested in ascertaining whether the rate of employee turnover is independent of employee qualification.
Example 13.2

The Vice President (Sales) of a garment company wants to determine whether sales of the company’s brand of jeans is independent of age group. He has appointed a marketing researcher for this purpose. This marketing researcher has taken a random sample of 703 consumers who have purchased jeans. The researcher conducted a survey for three brands of jeans, namely Brand 1, Brand 2, and Brand 3. The researcher has also divided the age groups into four categories: 15 to 25, 26 to 35, 36 to 45, and 46 to 55. The observations of the researcher are provided in Table 13.6:
TABLE 13.6: CONTINGENCY TABLE FOR EXAMPLE 13.2

Determine whether brand preference is independent of age group. Use alpha = 0.05.

TABLE 13.7: CONTINGENCY TABLE OF THE OBSERVED AND EXPECTED FREQUENCIES FOR EXAMPLE 13.2
TABLE 13.8: COMPUTATION OF EXPECTED FREQUENCIES AND CHI-SQUARE STATISTIC FOR EXAMPLE 13.2
EXERCISE 2.2
• A certain drug was administered to 456 males out of a total of 720 in a certain locality to test its efficacy against typhoid. The incidence of typhoid is shown below. Find out the effectiveness of the drug against the disease.
• Chi cal = 113.6
• Chi tab = 3.841
• Dof = 1
• In a survey of 1000 students on whether they watch the Discovery Channel and their IQ level, the following information is revealed:
• Test at the 5% level of significance whether students watching the Discovery Channel have high IQ.

            Watching Discovery    Not watching Discovery
High IQ     415                   185
Low IQ      65                    335

• Chi cal = 269.24
• Chi tab = 3.841
• Reject the null hypothesis.
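For a test of independence, each expected frequency is (row total × column total) / grand total, and the statistic is the sum of (O − E)² / E over all cells. A minimal sketch on the Discovery Channel table follows. The function name `chi_square_independence` is hypothetical.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


observed = [
    [415, 185],  # High IQ: watching, not watching
    [65, 335],   # Low IQ:  watching, not watching
]
chi2_iq, dof_iq = chi_square_independence(observed)
print(dof_iq)             # 1
print(round(chi2_iq, 2))  # 269.25
print(chi2_iq > 3.841)    # True, so the null hypothesis is rejected
```

The unrounded statistic is about 269.247, which matches the slide's 269.24 when truncated rather than rounded.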
F TEST
• It is particularly useful when multiple sample cases are involved and the data have been measured on an interval or ratio scale.
• It can be used to test the equality of variances of two normal populations, i.e. to find whether two samples can be regarded as drawn from normal populations having the same variance.
• It can also be used to analyse the variance of more than two independent samples.
• Two random samples have been drawn from two normal populations.

Sample 1    75    68    65    70    84    66    55
Sample 2    42    44    56    52    46

Test using the variance ratio at 5% significance level whether the two populations have the same variance.
• F calculated = 2.37
• F table = 6.16
• Accept the null hypothesis
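The variance ratio is the larger sample variance divided by the smaller one. A short sketch for the two samples above; the variable names are illustrative, and the table value 6.16 (6 and 4 degrees of freedom) is the one quoted in the slide.

```python
from statistics import variance

sample1 = [75, 68, 65, 70, 84, 66, 55]
sample2 = [42, 44, 56, 52, 46]

var1 = variance(sample1)  # sample variance, divisor n - 1
var2 = variance(sample2)

# By convention the larger variance goes in the numerator, so F >= 1.
f_ratio = max(var1, var2) / min(var1, var2)

print(round(var1, 2))     # 80.67
print(round(var2, 2))     # 34 (may display as 34 or 34.0 depending on Python version)
print(round(f_ratio, 2))  # 2.37
print(f_ratio > 6.16)     # False, so the null hypothesis is not rejected
```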
CORRELATION
• Correlation is concerned with identifying the association between two or more variables.
• Correlation identifies the degree of relationship between two variables and regression is
used to study the nature of relationship and develop a cause and effect relationship.

• Types of Correlation

Positive and Negative Correlation

Simple, Partial and Multiple Correlation

Linear and Non Linear Correlation


MEASURES TO CALCULATE
CORRELATION
• Karl Pearson’s Coefficient of Correlation
EXERCISE
Calculate the correlation coefficient:

Roll no. of students    1     2     3     4     5
Marks in accountancy    48    35    17    23    47
Marks in statistics     45    20    40    25    45

• R = 0.428
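Karl Pearson's coefficient is r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²). A minimal sketch applied to the marks above follows. The function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Karl Pearson's coefficient of correlation for paired observations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


accountancy = [48, 35, 17, 23, 47]
statistics_marks = [45, 20, 40, 25, 45]

r_marks = pearson_r(accountancy, statistics_marks)
print(round(r_marks, 4))  # 0.4286
```

The result agrees with the slide's 0.428 once truncation to three decimals is taken into account.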
PRACTICE
• Calculate Karl Pearson’s coefficient of correlation from the following data
X: 8 11 15 10 12 16
Y: 6 9 11 7 9 12
SPEARMAN COEFFICIENT OF
CORRELATION
• When ranks are given
EXERCISE
• Two ladies were asked to rank 7 different types of lipsticks. The ranks given by them are as follows. Calculate Spearman rank correlation.

LIPSTICKS    A    B    C    D    E    F    G
NEELU        2    1    4    3    5    7    6
NEENA        1    3    2    4    5    6    7

• R = 0.786
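When ranks are given and there are no ties, Spearman's coefficient is R = 1 − 6Σd² / (n(n² − 1)), where d is the difference between the two ranks of each item. A short sketch for the lipstick ranks follows. The function name `spearman_from_ranks` is hypothetical.

```python
def spearman_from_ranks(ranks1, ranks2):
    """Spearman rank correlation for two rankings without ties."""
    n = len(ranks1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks1, ranks2))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))


neelu = [2, 1, 4, 3, 5, 7, 6]
neena = [1, 3, 2, 4, 5, 6, 7]

r_lipsticks = spearman_from_ranks(neelu, neena)
print(round(r_lipsticks, 3))  # 0.786
```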
WHEN RANKS ARE NOT GIVEN
• Calculate the Spearman coefficient of correlation between the marks assigned to students by two judges in a competitive test.

Student    1     2     3     4     5     6     7     8     9     10
Judge 1    52    53    42    60    45    41    37    38    25    27
Judge 2    65    68    43    38    77    48    35    30    25    50

• R = 0.539
WHEN RANKS ARE EQUAL
EXERCISE
• Obtain the rank correlation coefficient between the variables X and Y from the
following pairs of observed values.
• X : 50 55 65 50 55 60 50 65 70 75
• Y: 110 110 115 125 140 115 130 120 115 160
• R= 0.155
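When ranks are not given, the scores are ranked first; when values tie, each tied value receives the average of the ranks it occupies, and a correction of (m³ − m)/12 is added to Σd² for every group of m tied values. The sketch below follows that textbook correction-factor approach. The function names are hypothetical, and the code is an illustration rather than the method shown on the original slides.

```python
from collections import Counter


def average_ranks(values):
    """Rank from smallest (rank 1) upward, averaging the ranks of tied values."""
    first_position = {}
    for position, value in enumerate(sorted(values), start=1):
        first_position.setdefault(value, position)
    counts = Counter(values)
    # A value tied m times occupies positions p..p+m-1, whose average is p + (m - 1) / 2.
    return [first_position[v] + (counts[v] - 1) / 2 for v in values]


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m equal values."""
    return sum((m**3 - m) / 12 for m in Counter(values).values())


def spearman(x, y):
    """Spearman rank correlation from raw scores, with the tie correction."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    adjusted = sum_d2 + tie_correction(x) + tie_correction(y)
    return 1 - 6 * adjusted / (n * (n**2 - 1))


# Tied-ranks exercise.
x = [50, 55, 65, 50, 55, 60, 50, 65, 70, 75]
y = [110, 110, 115, 125, 140, 115, 130, 120, 115, 160]
r_tied = spearman(x, y)
print(round(r_tied, 3))  # 0.155

# Judges exercise (no ties, so the correction term is zero).
judge1 = [52, 53, 42, 60, 45, 41, 37, 38, 25, 27]
judge2 = [65, 68, 43, 38, 77, 48, 35, 30, 25, 50]
r_judges = spearman(judge1, judge2)
print(round(r_judges, 3))  # 0.539
```

Ranking from the smallest value rather than the largest does not change the result, because only the squared rank differences enter the formula.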
REGRESSION
• Regression analysis is the process of developing a statistical model which is used to predict the value of a dependent variable from at least one independent variable.
• In a simple regression analysis, there are two types of variables:
– Dependent variable: The variable whose value is influenced or is to be predicted. It is also called the regressed or explained variable.
– Independent variable: The variable which influences the value or is used for prediction. It is also called the regressor, predictor, or explanatory variable.
Simple linear regression analysis is focused on developing a regression model by which the value of the dependent variable can be predicted with the help of the independent variable.
• From the following data
REGRESSION
• A random sample of eight drivers insured with a company and having
similar auto insurance policies was selected. The following table lists their
driving experiences(in years) and monthly auto insurance premiums.

Predict the monthly auto insurance premium for a driver with 10 years
of driving experience.
• b = -1.5476
• a = 76.6605
• Regression equation:
• Y = 76.6605 - 1.5476X
• When X = 10, then
• Y = 76.6605 - 1.5476 × 10 = 61.18
CORRELATION VS REGRESSION

Correlation
• In correlation analysis, the degree and direction of relationship between the variables are studied.
• If the value of one variable is known, the value of the other variable cannot be estimated.
• The correlation coefficient lies between -1 and +1.
• Correlation does not always assume a cause and effect relationship.

Regression
• In regression analysis, the nature of the relationship is studied.
• If the value of one variable is known, the value of the other variable can be estimated using the functional relationship.
• Only one regression coefficient can be greater than 1.
• Regression always expresses the cause and effect relationship.
PRACTICE

• Determine the regression equations of Y on X and X on Y by using the following information.

X    55    60    65    70    80
Y    52    54    56    58    62

• ΣX = 330
• ΣY = 282
• ΣX^2 = 22150
• ΣXY = 18760
• a = 30
• b = 0.4
Y on X
Y = 30 + 0.4X
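The coefficients come from the least-squares formulas b = (nΣXY − ΣXΣY) / (nΣX² − (ΣX)²) and a = (ΣY − bΣX) / n. A minimal sketch follows. The function name `fit_line` is hypothetical, and swapping the arguments gives the X on Y line that the exercise also asks for.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for the line y = a + b * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


x = [55, 60, 65, 70, 80]
y = [52, 54, 56, 58, 62]

a_yx, b_yx = fit_line(x, y)  # regression of Y on X
print(round(a_yx, 2), round(b_yx, 2))  # 30.0 0.4

a_xy, b_xy = fit_line(y, x)  # regression of X on Y
print(round(a_xy, 2), round(b_xy, 2))  # -75.0 2.5
```

With these data the points lie exactly on a line, so the two equations, Y = 30 + 0.4X and X = -75 + 2.5Y, describe the same relationship.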
ONE WAY ANOVA
• ANOVA stands for Analysis of Variance.
• This technique was developed by Sir Ronald Aylmer Fisher when he worked at the Rothamsted
Agricultural Experimental Station from 1919 to 1933.
• It is a technique of testing hypotheses about the significant difference in several population
means.
• The statistic that is computed and tested for statistical significance in this technique is an F
ratio.
• Sawyer (2009) stated that ANOVA is a useful statistical tool for drawing inferential conclusions about how one or more independent variables influence a parametric dependent variable.
• In analysis of variance, the total variation in the sample data can be on account of two
components, namely, variance between the samples and variances within the samples.
• Variance between samples is attributed to the difference among the sample means. This
variation is due to some assignable causes.
• Variance within the samples is the difference due to chance or experiment errors.
• It is called one-way design because there is only one independent variable, although
any number of groups or levels representing that independent variable can be
subsumed.
• This technique is used to determine whether there exist any statistically significant
differences between the samples of two or more independent groups.
• Using ANOVA, we test for differences among the means of the population by
comparing the amount of variation between samples to amount of variation within
each of these samples.
• OBJECTIVE: The objective of ANOVA is not to test the significance of the
difference between sample variances but to test for the significance of
difference among sample means.
• Fifteen students undergoing training are randomly assigned to three different types of instruction modules. At the end of the training period their test scores are as follows:

Instruction module    Test 1    Test 2    Test 3    Test 4    Test 5
A                     86        79        81        70        84
B                     90        76        88        82        89
C                     82        68        73        71        81

Use analysis of variance to test that there is no significant difference in the mean scores of the three instruction modules using 5% significance level.
• NULL HYPOTHESIS: There is no difference in the mean scores of the three instruction modules.
• STEP 1:
• Mean of A = 80
• Mean of B = 85
• Mean of C = 75
STEP 2: Mean of sample means = 80
STEP 3: SS between = 250
STEP 4: SS within = 448

SOURCE OF VARIATION       SS     d.f.         MS              F-RATIO             5% F-LIMIT
I) BETWEEN SAMPLES        250    (3-1)=2      250/2=125       125/37.33=3.35      F(2,12)=3.89
II) WITHIN SAMPLES        448    (15-3)=12    448/12=37.33
TOTAL                     698    (15-1)=14

• Fcal = 3.35
• Ftab = 3.89
• Null hypothesis is accepted


• In order to test the significance of variation in the retail prices of a commodity in three cities, four shops were chosen at random from each city and the prices observed (in rupees) were as follows. Do the data indicate that the prices in the three cities are significantly different?

City a    16    8     12    12
City b    14    10    10    6
City c    4     10    8     10

• Ftab = 4.26
• F cal = 1.63
• Null hypothesis accepted
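Both one-way ANOVA exercises above follow the same arithmetic: SS between = Σ n_j (x̄_j − grand mean)², SS within = Σ Σ (x − x̄_j)², and F = (SS between / (k − 1)) / (SS within / (N − k)). The sketch below is a minimal illustration. The function name `one_way_anova` is hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SS between, SS within, F ratio) for a list of samples."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_ratio


# Instruction modules A, B, C: table value F(2, 12) = 3.89.
modules = [
    [86, 79, 81, 70, 84],
    [90, 76, 88, 82, 89],
    [82, 68, 73, 71, 81],
]
ssb_m, ssw_m, f_modules = one_way_anova(modules)
print(ssb_m, ssw_m, round(f_modules, 2))  # 250 448 3.35

# Retail prices in three cities: table value F(2, 9) = 4.26.
cities = [
    [16, 8, 12, 12],
    [14, 10, 10, 6],
    [4, 10, 8, 10],
]
ssb_c, ssw_c, f_cities = one_way_anova(cities)
print(ssb_c, ssw_c, round(f_cities, 3))  # 32 88 1.636
```

In both cases the calculated F is below the table value, which agrees with the slides' decision to accept the null hypothesis. The slide's 1.63 for the cities is the same 1.636 truncated.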
Example 12.1

Vishal Foods Ltd is a leading manufacturer of biscuits. The company has launched a
new brand in the four metros; Delhi, Mumbai, Kolkata, and Chennai. After one
month, the company realizes that there is a difference in the retail price per pack of
biscuits across cities. Before the launch, the company had promised its employees
and newly-appointed retailers that the biscuits would be sold at a uniform price in the
country. The difference in price can tarnish the image of the company. In order to
make a quick inference, the company collected data about the price from six
randomly selected stores across the four cities. Based on the sample information, the
price per pack of the biscuits (in rupees) is given in Table 12.5:
Example 12.1: Continued

Use one-way ANOVA to analyse the significant difference in the prices. Take 95% as the confidence level.
TABLE 12.7: ANOVA TABLE FOR EXAMPLE 12.1
TABLE 12.11: ANOVA SUMMARY TABLE FOR EXAMPLE 12.2
