RM Unit3 Slides
Unit-03:
Testing of hypotheses
Dr.Roopa Ravish
Department of CSE
RESEARCH METHODOLOGY
∙Hypothesis testing is a commonly used strategy for deciding whether sample data offer
enough support for a hypothesis that a generalization can be made.
∙Ordinarily, when one talks about a hypothesis, one simply means a mere
assumption or supposition to be proved or disproved. For a researcher, however,
a hypothesis is a formal question that they intend to resolve.
What is hypothesis testing?
∙ Mere assumption or some supposition to be proved or disproved.
∙ Defined as a
“Proposition or a set of proposition set forth as an explanation for the occurrence
of some specified group of phenomena either asserted merely as a provisional
conjecture to guide some investigation or accepted as highly probable in the light
of established facts.”
a. “Students who receive counselling will show a greater increase in creativity than
students not receiving counselling”
These are hypotheses capable of being objectively verified and tested. Thus, we may
conclude that a hypothesis states what we are looking for and it is a proposition which
can be put to a test to determine its validity.
Characteristics of Hypothesis
1) Should be clear and precise.
2) Should be capable of being tested.
a) A hypothesis is testable if other deductions can be made from it
which, in turn, can be confirmed or disproved by observation.
3) Should state the relationship between variables.
4) Should be limited in scope and must be specific.
5) Should be stated in simple terms and be easily
understandable.
6) Should be consistent with most known facts.
7) Should be amenable to testing within a reasonable time.
Basic concepts: Null Hypothesis and Alternate Hypothesis
In context of Statistical Analysis:
Null Hypothesis – If we compare method A with method B and assume that both are
equally good, this assumption is termed the null hypothesis (H0).
Example : “No difference between coke and diet coke”.
As against this, if we think that method A is superior or that method B is
inferior, we are stating what is termed the alternative hypothesis. The null
hypothesis is generally symbolized as H0 and the alternative hypothesis as Ha.
∙Suppose we want to test whether the population mean µ is equal to a hypothesized mean
µH0 = 100. The null hypothesis is then that the population mean equals the hypothesized
mean, which we can express symbolically as:
H0: µ = µH0 = 100
If our sample results do not support this null hypothesis, we should conclude that something
else is true.
What we conclude on rejecting the null hypothesis is known as the alternative hypothesis; in other
words, the set of alternatives to the null hypothesis is referred to as the alternative hypothesis.
If we accept H0, then we are rejecting Ha; if we reject H0, then we are accepting Ha.
Possible alternate hypothesis
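For H0: µ = µH0 = 100, we may consider three possible
alternative hypotheses as follows:
Ha: µ ≠ µH0 (the population mean is not equal to 100; it may be more or less than 100)
Ha: µ > µH0 (the population mean is greater than 100)
Ha: µ < µH0 (the population mean is less than 100)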
Possible alternate hypothesis
In the choice of null hypothesis, the following considerations are usually kept in
view:
(a) Alternative hypothesis is usually the one which one wishes to prove and the
null hypothesis is the one which one wishes to disprove.
Thus, a null hypothesis represents the hypothesis we are trying to reject, and
alternative hypothesis represents all other possibilities.
(b) If the rejection of a certain hypothesis when it is actually true involves great
risk, it is taken as null hypothesis because then the probability of rejecting it when
it is true is α (the level of significance) which is chosen very small.
(c) The null hypothesis should always be a specific hypothesis, i.e., it should not state
"about" or "approximately" a certain value.
Statistically Significant
The level of significance is a very important concept in the context of hypothesis testing. It is always some
percentage (usually 5%), which should be chosen with great care, thought and reason.
In case we take the significance level at 5 per cent, then this implies that H0 will be
rejected when the sampling result (i.e., observed evidence) has a less than 0.05
probability of occurring if H0 is true.
In other words, the 5% level of significance means that researcher is willing to take as
much as a 5% risk of rejecting the null hypothesis when it (H0 ) happens to be true.
Thus the significance level is the maximum value of the probability of rejecting H0 when
it is true and is usually determined in advance before testing the hypothesis.
Decision rule or test of hypothesis
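A Type I error means rejecting H0 when it is true; a Type II error means accepting H0 when it is false.
The probability of a Type I error is denoted by α (alpha), known as the α error and also called the level of
significance of the test; the probability of a Type II error is denoted by β (beta), known as the β error.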
Type I and Type II errors
But with a fixed sample size, n, when we try to reduce Type I error, the probability of
committing Type II error increases. Both types of errors cannot be reduced
simultaneously.
There is a trade-off between two types of errors.
To deal with this trade-off in business situations, decision-makers decide the
appropriate level of Type I error by examining the costs or penalties attached to both
types of errors.
Hence, in the testing of hypothesis, one must make all possible effort to strike an
adequate balance between Type I and Type II errors.
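The trade-off described above can be illustrated numerically. The sketch below is only an illustration under assumed values that are not part of the slides: an upper-tailed z-test of H0: µ = 100 with a hypothetical true mean of 103, σ = 10 and n = 25. The helper name `type_ii_error` is likewise hypothetical. It shows that, with n fixed, lowering α raises β.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu1, sigma, n):
    """Return beta for an upper-tailed z-test of H0: mu = mu0
    when the true mean is mu1 (> mu0). All inputs are illustrative."""
    std_normal = NormalDist()               # standard normal distribution
    z_crit = std_normal.inv_cdf(1 - alpha)  # rejection threshold for the z statistic
    shift = (mu1 - mu0) / (sigma / sqrt(n)) # mean of the z statistic under mu1
    # Type II error: the z statistic falls below the threshold although H0 is false
    return std_normal.cdf(z_crit - shift)


# Hypothetical scenario: sample size fixed at n = 25
betas = {}
for alpha in (0.10, 0.05, 0.01):
    betas[alpha] = type_ii_error(alpha, mu0=100, mu1=103, sigma=10, n=25)
    print(f"alpha = {alpha:.2f}  ->  beta = {betas[alpha]:.3f}")
```

In this assumed scenario, β grows steadily as α is tightened from 10% to 1%, which is why the choice of α should reflect the relative cost of the two errors.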
One tailed and two tailed test
We test 3 types of Hypotheses given by:
Note:
• The null hypothesis is that the lubricant does not meet the
specification, and that the difference between the sample mean
of 673.2 and 675 is due to chance.
$z = \frac{673.2 - 675}{2.22} = -0.81$
Solution:
• The P-value is 0.209.
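The P-value quoted here can be checked with a few lines of Python. This is a minimal sketch, assuming the P-value is the lower-tail area of the standard normal distribution at the rounded z-score (that assumption reproduces the figure of 0.209); the variable names are illustrative.

```python
from statistics import NormalDist

sample_mean = 673.2     # sample mean from the example
hypothesized = 675      # value specified under H0
standard_error = 2.22   # standard error used in the example

# z-score, rounded to two decimals as in the slide
z = round((sample_mean - hypothesized) / standard_error, 2)

# Lower-tail area under the standard normal curve (assumed one-tailed test)
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, P-value = {p_value:.3f}")
```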
We assume H0 is true.
$z = \frac{1000.6 - 1000}{0.258} = 2.32$
Solution:
H0: µ = 67.39 inches
H1: µ ≠ 67.39 inches
$z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{67.47 - 67.39}{1.30 / \sqrt{400}} = 1.231$
H0 is accepted.
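The two-tailed test above can be reproduced as follows. This is a sketch under two assumptions not stated on the slide: σ = 1.30 inches (the value that reproduces z = 1.231 with n = 400) and a 5% level of significance. The function name `two_tailed_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test of H0: mu = mu0 with known sigma."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    # Two-tailed P-value: area in both tails beyond |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# sigma = 1.30 and alpha = 0.05 are assumptions (see the lead-in)
z, p_value, reject_h0 = two_tailed_z_test(67.47, 67.39, sigma=1.30, n=400)
print(f"z = {z:.3f}, P-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Because |z| is below the two-tailed 5% critical value of 1.96, the sketch agrees with the slide's conclusion that H0 is accepted.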
Drawing Conclusions from the Results of Hypothesis Tests
Statistical Significance:
Example:
Solution:
• The result is statistically significant at any level greater than
or equal to 3%.
Tests of Hypothesis
• Parametric tests usually assume certain properties of the parent population from which we draw
samples.
• Assumptions such as the observations coming from a normal population, the sample size being large,
and assumptions about population parameters like the mean, variance, etc., must hold good before
parametric tests can be used.
• But there are situations when the researcher cannot or does not want to make such assumptions. In
such situations we use statistical methods for testing hypotheses which are called non-parametric
tests.
• Besides, most non-parametric tests assume only nominal or ordinal data, whereas parametric tests
require measurement equivalent to at least an interval scale.
• Non-parametric tests need more observations than parametric tests to achieve the same size of Type
I and Type II errors.
z-test vs t-test
Eg: t-test
The specimens of copper wire drawn from a large lot have the following breaking strengths (in kg.
weight):
578, 572, 570, 568, 572, 578, 570, 572, 596, 544
Test (using Student’s t-statistic) whether the mean breaking strength of the lot may be taken to be
578 kg. weight (test at the 5 per cent level of significance).
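One way to work this example is sketched below with the Python standard library only. It assumes a two-tailed test and takes the critical value 2.262 (t-table, 9 degrees of freedom, 5% level) as a hard-coded constant; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Breaking strengths of the 10 specimens (kg. weight)
strengths = [578, 572, 570, 568, 572, 578, 570, 572, 596, 544]
mu0 = 578                      # hypothesized mean under H0

n = len(strengths)
x_bar = mean(strengths)        # sample mean
s = stdev(strengths)           # sample standard deviation (divisor n - 1)
t_stat = (x_bar - mu0) / (s / sqrt(n))

# Two-tailed critical value from a t-table: df = n - 1 = 9, alpha = 0.05
T_CRIT = 2.262
reject_h0 = abs(t_stat) > T_CRIT

print(f"mean = {x_bar}, s = {s:.3f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Since |t| is smaller than the table value, the sample gives no reason to reject the hypothesis that the mean breaking strength of the lot is 578 kg. weight.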
Chi-Square tests
A chi-square goodness of fit test determines whether sample data match a population.
Chi-square (χ²) can be used (i) as a test of goodness of fit and (ii) as a test of independence.
As a test of goodness of fit, the χ² test enables us to see how well the assumed theoretical
distribution (such as the Binomial, Poisson or Normal distribution) fits the
observed data.
If the calculated value of χ² is less than the table value at a certain level of significance, the fit is
considered to be a good one, which means that the divergence between the observed and expected
frequencies is attributable to fluctuations of sampling. But if the calculated value of χ² is greater
than its table value, the fit is not considered to be a good one.
Chi-Square tests
As a test of independence, the χ² test enables us to explain whether or not two attributes are associated (Independent
Variable/Dependent Variable).
We start from the null hypothesis that the two attributes are independent. On this basis we first calculate the
expected frequencies and then work out the value of χ². If the calculated value
of χ² is less than the table value at a certain level of significance for the given degrees of freedom, we conclude that
the null hypothesis stands, which means that the two attributes are independent or not associated.
But if the calculated value of χ² is greater than its table value, our inference would be that the null hypothesis
does not hold good, which means the two attributes are associated and the association is not because of some chance
factor but exists in reality. It may, however, be stated here that χ² is not a measure of the degree of relationship
or the form of relationship between two attributes, but is simply a technique of judging the significance of such
association or relationship between two attributes.
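The procedure for the test of independence can be sketched as below. The 2 x 2 table of observed frequencies is purely hypothetical (it does not come from the slides), the function name `chi_square_independence` is illustrative, and 3.841 is the χ² table value for 1 degree of freedom at the 5% level.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected frequency if the two attributes are independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, dof


# Hypothetical observed frequencies for two attributes with two categories each
observed = [[20, 30],
            [30, 20]]
chi2, dof = chi_square_independence(observed)

CHI2_CRIT = 3.841              # table value for df = 1 at the 5% level
associated = chi2 > CHI2_CRIT  # True means H0 (independence) is rejected

print(f"chi-square = {chi2:.2f}, df = {dof}, attributes associated: {associated}")
```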
Conditions for chi-square test.
STEPS INVOLVED IN APPLYING CHI-SQUARE TEST
Problem - 3
Genetic theory states that children having one parent of blood type A and the other of blood type
B will always be of one of three types, A, AB, B, and that the proportions of the three types will on
average be 1 : 2 : 1. A report states that out of 300 children having one A parent and one B
parent, 30 per cent were found to be type A, 45 per cent type AB and the remainder type
B. Test the hypothesis by the χ² test.
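A possible way to set up this goodness of fit calculation is sketched below. It assumes a 5% level of significance, which the problem does not state, and uses the χ² table value 5.991 for 2 degrees of freedom; the variable names are illustrative.

```python
total_children = 300

# Observed frequencies from the report: 30% type A, 45% type AB, remainder (25%) type B
observed = [total_children * pct // 100 for pct in (30, 45, 25)]

# Expected frequencies under the genetic theory, ratio 1 : 2 : 1
ratio = (1, 2, 1)
expected = [total_children * r / sum(ratio) for r in ratio]

# Chi-square statistic: sum of (O - E)^2 / E over the three blood types
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1

CHI2_CRIT = 5.991              # table value for df = 2 at the assumed 5% level
reject_h0 = chi2 > CHI2_CRIT

print(f"observed = {observed}, expected = {expected}")
print(f"chi-square = {chi2:.2f}, df = {dof}, reject H0: {reject_h0}")
```

Under these assumptions the calculated value is below the table value, so the reported proportions would be consistent with the 1 : 2 : 1 theory.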
Home work
ANOVA: Analysis of Variance
• This technique is used when multiple sample cases are involved.
• The significance of the difference between the means of two samples can be judged
through either the z-test or the t-test, but difficulty arises when we need to examine the
significance of the difference amongst more than two sample means at the same time.
• The ANOVA technique enables us to perform this simultaneous test and as such is
considered to be an important tool of analysis in the hands of a researcher. Using this
technique, one can draw inferences about whether the samples have been drawn from
populations having the same mean.
• The ANOVA technique is important in the context of all those situations where we want to
compare more than two populations such as in comparing the yield of crop from several
varieties of seeds, the gasoline mileage of four automobiles, the smoking habits of five
groups of university students and so on.
• Therefore, one quite often utilizes the ANOVA technique and through it investigates the
differences among the means of all the populations simultaneously.
• “The essence of ANOVA is that the total amount of variation in a set of data is broken down
into two types, that amount which can be attributed to chance and that amount which can be
attributed to specified causes.”
• There may be variation between samples and also within sample items. ANOVA consists in
splitting the variance for analytical purposes.
Hence, it is a method of analysing the variance to which a response is subject into its various
components corresponding to various sources of variation.
Through this technique one can explain whether various varieties of seeds or fertilizers or soils
differ significantly so that a policy decision could be taken accordingly, concerning a particular
variety in the context of agriculture researches.
Thus, through ANOVA technique one can, in general, investigate any number of factors which are
hypothesized or said to influence the dependent variable.
If we take only one factor and investigate the differences amongst its various
categories having numerous possible values, we are said to use one-way ANOVA; in case we
investigate two factors at the same time, we use two-way ANOVA. In a two-way or higher
ANOVA, the interaction (i.e., the inter-relation between two independent variables/factors), if any,
affecting a dependent variable can also be studied for better
decisions.
THE BASIC PRINCIPLE OF ANOVA
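We have to make two estimates of population variance, viz., one based on the between-samples
variance and the other based on the within-samples variance.
These two estimates of population variance are then compared with the F-test, wherein we work
out:
F = (estimate of population variance based on between-samples variance) / (estimate of population variance based on within-samples variance)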
Example data (three samples, four observations each):

Observation   Sample 1   Sample 2   Sample 3
1             6          5          5
2             7          5          4
3             3          3          3
4             8          7          4
Mean (x̄)      6          5          4

ANOVA
Source of Variation   SS   df   MS         F     P-value    F crit
Between Groups        8    2    4          1.5   0.274016   4.256495
Within Groups         24   9    2.666667
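The ANOVA table for this example can be reproduced with a short one-way ANOVA sketch. The three samples are the columns of the data table (6, 7, 3, 8; 5, 5, 3, 7; 5, 4, 3, 4); the F crit value 4.256495 is copied from the slide's table rather than computed, and the variable names are illustrative.

```python
# Columns of the example data: one list per sample
samples = [
    [6, 7, 3, 8],
    [5, 5, 3, 7],
    [5, 4, 3, 4],
]

k = len(samples)                              # number of samples
n_total = sum(len(s) for s in samples)        # total number of observations
grand_mean = sum(sum(s) for s in samples) / n_total
sample_means = [sum(s) / len(s) for s in samples]

# Variation between samples: sample means around the grand mean
ss_between = sum(len(s) * (m - grand_mean) ** 2
                 for s, m in zip(samples, sample_means))
# Variation within samples: observations around their own sample mean
ss_within = sum((x - m) ** 2
                for s, m in zip(samples, sample_means) for x in s)

df_between, df_within = k - 1, n_total - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

F_CRIT = 4.256495                             # taken from the ANOVA table above
reject_h0 = f_ratio > F_CRIT

print(f"SS between = {ss_between}, SS within = {ss_within}")
print(f"F = {f_ratio:.2f}, reject H0: {reject_h0}")
```

Because the F ratio is smaller than F crit, the differences among the three sample means are not significant at the level used in the table.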
Research Methodology
Scientific Publishing- Data representation
PRESENTATION OF DATA
Advantages of table:
Numerical Tables:
Visualization, as the word suggests, is the art of representing
information in visual form, such as diagrams, charts or images. The
visuals are usually supported by narration from the presenter.
Step by step:
An effective DISCUSSION
Results - Findings
Function:
• To answer questions posed in the Introduction,
• Explain how the results support the answers and
• How the answers fit in with existing knowledge on the
topic.
Discussion
Not mere details about the results;
interpret and explain the results.
1. (Un)expected results
2. Reference to previous research
3. Explanation
4. Exemplification
5. Deduction and hypothesis
6. Recommendation
Provide
a commentary and not a reiteration of the results
Discussion
• Begin by briefly summarizing the previous chapters,
then discuss what you found.
5. Describe the patterns, principles, and relationships
shown by each major finding/result and put them in
perspective.
The sequencing:
First - state the answer,
Second - support with relevant results,
Third - cite the work of others.
Discussion- Technique
6. Defend your answers by explaining both why your
answer is satisfactory and why others are not.
Summary, Conclusions &
Recommendation
• Advantages
• Novelty
• Limitations
• Suggestions
The content of a good conclusion
• A good conclusion is a logical ending that synthesizes what has been previously
discussed; it never contains any new information or
material.
Recommendation for future research: Further research that has not been
covered but is worthwhile to investigate in the near future.
Limitation of the study: Identify the various limitations which were encountered
during the sampling, lab work, data collection and analysis stages of the research
or project.
Different Styles of Referencing
Agenda ……
• Objective.
• What is reference style.
• Why to reference.
• Types of references.
• Different styles of writing reference.
A. Harvard style of referencing.
C. Vancouver style.
• Difference between reference list and bibliography.
It is an act of referring.
Reference :
• The action of mentioning or alluding to something or,
• The use of a source of information in order to ascertain
something.
Why to reference??
Book Reference
Internet Reference
Reference Elements
•Author's name
•Article title
•Journal name
•Year
•Volume
•Page numbers
Different styles of writing references:
Harvard style.
Example
1. Padda, J. (2003) ‘Creative Writing in Coventry’. Journal of Writing Studies 3
(2), 44-59.
2. Lennernas, H. (1995) ‘Experimental estimation of the effective unstirred
water layer thickness in the human jejunum & its importance in oral drug
absorption’. Eur. J. Pharm. Sci. (3), 247-253.
Vancouver style.
Example
1. Haas AN, Susin C, Albandar JM, et al. "Azithromycin as an adjunctive
treatment of aggressive periodontitis: 12-months randomized clinical
trial". N Engl J Med. 2008 Aug; 35(8):696-704.
Vancouver Style does not use the full journal name, only the commonly-
used abbreviation: “New England Journal of Medicine” is cited as “N Engl J
Med”.
MLA (Modern Language Association) citation style
• Author's name.
• Title of article.
• Name of journal.
• Volume number followed by decimal & issue no.
• Year of publication.
• Page numbers.
• Medium of publication.
Example
1. Matarrita-Cascante, David. "Beyond Growth: Reaching Tourism-Led
Development." Annals of Tourism Research 37.4 (2010): 1141-63. Print
American Psychological Association style
Example
1. Alibali, M. W., Phillips, K. M., & Fischer, A. D. (2009). Learning
new problem-solving strategies leads to changes in problem
representation. Cognitive Development, 24, 89-101.
The Chicago manual of style
Name of author.
Article title in double quotation mark.
Title of journal in italic.
Volume.
Year of publication.
Page no.
Example
1. Joshua I. Weinstein, “The Market in Plato’s Republic,” Classical
Philology 104 (2009): 440.
Royal society of chemistry styling
INITIALS. Author’s surname.
Title of journal (abbreviated).
Year of publication.
Volume number.
Page numbers.
Example
H. Yano, K. Abe, M. Nogi, A. N. Nakagaito, J. Mater. Sci.,
2010, 45, 1–33.
Difference between Reference List and Bibliography
A reference list contains the sources cited in our text, arranged in the order they appeared within the text.
It is usually put at the end of our work, but it can also appear as a
footnote (at the bottom of the page) or endnote (at the end of each
chapter), which serves a similar purpose.