LM08 Hypothesis Testing IFT Notes
1. Introduction
2. Hypothesis Tests for Finance
3. Tests of Return and Risk in Finance
4. Parametric Versus Nonparametric Tests
Summary
Required disclaimer: IFT is a CFA Institute Prep Provider. Only CFA Institute Prep Providers are
permitted to make use of CFA Institute copyrighted materials which are the building blocks of the
exam. We are also required to create / use updated materials every year and this is validated by CFA
Institute. Our products and services substantially cover the relevant curriculum and exam and this is
validated by CFA Institute. In our advertising, any statement about the numbers of questions in our
products and services relates to unique, original, proprietary questions. CFA Institute Prep Providers
are forbidden from including CFA Institute official mock exam questions or any questions other than
the end of reading questions within their products and services.
CFA Institute does not endorse, promote, review or warrant the accuracy or quality of the product and
services offered by IFT. CFA Institute®, CFA® and “Chartered Financial Analyst®” are trademarks
owned by CFA Institute.
© Copyright CFA Institute
Version 1.0
1. Introduction
This learning module covers:
• Hypothesis testing process
• Impact of errors in the hypothesis testing process
• Parametric and nonparametric tests
2. Hypothesis Tests for Finance
Hypothesis testing is the process of making judgments about a larger group (a population)
on the basis of observing a smaller group (a sample). The results of such a test then help us
evaluate whether the sample evidence supports or contradicts our hypothesis.
For example, let’s say you are a researcher and you believe that the average return on all
Asian stocks was greater than 2%. To test this belief, you can draw samples from a
population of all Asian stocks and employ hypothesis testing procedures. The results of this
test can tell you if your belief is statistically valid.
In this learning module we will look at hypothesis tests concerning the mean and variance.
The Process of Hypothesis Testing
A hypothesis is defined as a statement about one or more populations. In order to test a
hypothesis, we follow these steps:
1. State the hypothesis.
2. Identify the appropriate test statistic and its probability distribution.
3. Specify the significance level.
4. State the decision rule.
5. Collect data and calculate the test statistic.
6. Make a decision.
1. Stating the Hypothesis
For each hypothesis test, we always state two hypotheses: the null hypothesis (H0) and the
alternative hypothesis (Ha).
Null hypothesis (H0): It is the hypothesis that the researcher wants to reject.
Alternative hypothesis (Ha): It is the hypothesis that the researcher wants to prove. If the
null hypothesis is rejected then the alternative hypothesis is considered valid.
Suppose you are a researcher and believe that the average return on all Asian stocks was
greater than 2%. In this case, you are making a statement about the population mean (µ) of
all Asian stocks.
For this example, the null and alternative hypotheses are:
H0: µ ≤ 2%
Ha: µ > 2%
(The value 2% is known as µ0, the hypothesized value of the population mean.)
Instructor’s Note:
An easy way to differentiate between the two hypotheses is to remember that the null
hypothesis always contains some form of the equal sign.
Two-Sided vs. One-Sided Hypotheses
The alternative hypothesis can be one-sided or two-sided depending on the proposition
being tested. A one-sided test is also called a one-tailed test, and a two-sided test is also
called a two-tailed test.
If we want to determine whether the value of a population parameter is less than
(or greater than) a hypothesized value, we use a one-tailed test. However, if we want to
determine whether the value of a population parameter is different from a
hypothesized value, we use a two-tailed test.
Two-sided test: Suppose we want to test if the population mean is equal to 2%. The null and
alternative hypotheses can be expressed as:
H0: µ = 2%
Ha: µ ≠ 2%
Since the alternative hypothesis contains a ≠ sign this is a two-sided test.
One-sided test (right side): Suppose we want to test if the population mean is greater than
2%. The null and alternative hypotheses can be expressed as:
H0: µ ≤ 2%
Ha: µ > 2%
Since the alternative hypothesis contains a > sign this is a one-sided test, and we are
interested in the right side.
Instructor’s Note
The sign in the alternative hypothesis points to the direction of the tail that we should use in
our test. Since in our example the alternative hypothesis has a ‘>’ sign it points to the right,
therefore we are interested in the right tail.
One-sided test (left side): Suppose we want to test if the population mean is less than 2%.
The null and alternative hypotheses can be expressed as:
H0: µ ≥ 2%
Ha: µ < 2%
Since the alternative hypothesis contains a < sign this is a one-sided test, and we are
interested in the left side.
Continuing with our Asian stocks example, suppose we want to test if the population mean is
greater than a particular hypothesized value. We draw 36 observations and get a sample
mean of 4. We are also told that the standard deviation of the population is 4. If the
hypothesized value of the population mean is 2, the test statistic is calculated as:
$$\text{test statistic} = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{4 - 2}{4 / \sqrt{36}} = 3$$
However, if the hypothesized value of the population mean is 0, then the test statistic is
calculated as:
$$\text{test statistic} = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{4 - 0}{4 / \sqrt{36}} = 6$$
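The two calculations above can be checked with a few lines of Python. This is a minimal sketch; the function name `z_statistic` is an illustrative choice, not something defined in the curriculum.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Test statistic for a mean when the population standard deviation is known."""
    standard_error = sigma / math.sqrt(n)      # sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error

# Asian stocks example: sample mean 4, population sd 4, n = 36
z_mu0_2 = z_statistic(4, 2, 4, 36)   # hypothesized mean of 2
z_mu0_0 = z_statistic(4, 0, 4, 36)   # hypothesized mean of 0
print(round(z_mu0_2, 2), round(z_mu0_0, 2))   # prints 3.0 6.0
```

The same sample produces a larger test statistic when the hypothesized value lies further from the sample mean.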
The probability of a Type II error is denoted by ‘β’. The power of a test is calculated as (1 - β).
It represents the probability of correctly rejecting the null when it is false.
The different probabilities associated with the hypothesis testing decisions are presented in
the table below:
Decision               | True condition: H0 true    | True condition: H0 false
Do not reject H0       | Confidence level (1 - α)   | β
Reject H0 (accept Ha)  | Level of significance (α)  | Power of the test (1 - β)
The most commonly used levels of significance are: 10%, 5% and 1%.
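To make α, β and power concrete, the sketch below computes them for a right-tailed z-test with a known population standard deviation. The true mean of 4 is purely an assumption chosen for illustration (the notes do not specify one), and `power_right_tailed_z` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def power_right_tailed_z(mu0, mu_true, sigma, n, alpha):
    """Probability of rejecting H0: mu <= mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)               # rejection point
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))     # true mean in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)

alpha = 0.05
# Assumption for illustration only: the true population mean is 4
power = power_right_tailed_z(mu0=2, mu_true=4, sigma=4, n=36, alpha=alpha)
beta = 1 - power                                          # probability of a Type II error
# When the true mean equals mu0, the rejection probability is alpha itself
size = power_right_tailed_z(mu0=2, mu_true=2, sigma=4, n=36, alpha=alpha)
print(round(power, 3), round(beta, 3), round(size, 3))
```

Under these assumptions the power is roughly 0.91, so β is roughly 0.09. Power depends on how far the true mean is from the hypothesized value, which is why it cannot be read off from α alone.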
4. State the Decision Rule
A decision rule involves determining the critical values based on the level of significance;
and comparing the test statistic with the critical values to decide whether to reject or not
reject the null hypothesis. When we reject the null hypothesis, the result is said to be
statistically significant.
Determining Critical Values
One-tailed test:
Continuing with our Asian stocks example, suppose we want to test if the population mean is
greater than 2%. Say we want to test our hypothesis at the 5% significance level. This is a
one-tailed test and we are only interested in the right tail of the distribution. (If we were
trying to assess whether the population mean is less than 2%, we would be interested in the
left tail.)
The critical value is also known as the rejection point for the test statistic. Graphically, this
point separates the acceptance and rejection regions for a set of values of the test statistic.
This is shown below:
The region to the left of the critical value is the ‘acceptance region’. This represents the set of
values for which we do not reject (accept) the null hypothesis. The region to the right of the
critical value is known as the ‘rejection region’.
Using the Z-table at the 5% level of significance, the critical value is z0.05 = 1.645.
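Instead of a Z-table, the right-tail critical value can be obtained from the standard normal inverse CDF. The sketch below uses Python's `statistics.NormalDist`; the helper `reject_right_tailed` is a hypothetical name, and the test statistic of 3 is the one computed earlier for the Asian stocks example.

```python
from statistics import NormalDist

alpha = 0.05
# Right-tailed test: the critical value leaves alpha in the right tail
z_critical = NormalDist().inv_cdf(1 - alpha)

def reject_right_tailed(test_statistic, critical_value):
    """Decision rule for Ha: mu > mu0. Reject H0 if the statistic exceeds the critical value."""
    return test_statistic > critical_value

decision = reject_right_tailed(3.0, z_critical)
print(round(z_critical, 3), decision)   # prints 1.645 True
```

For a left-tailed test the critical value would be the negative of this number, and the rule would reject when the statistic falls below it.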
Two-tailed test:
In a two-tailed test, two critical values exist, one positive and one negative. For a two-sided
test at the 5% level of significance, we split the level of significance equally between the left
and right tails, i.e., 0.05/2 = 0.025 in each tail.
This corresponds to rejection points of +1.96 and -1.96. Therefore, we reject the null
hypothesis if we find that the test statistic is less than -1.96 or greater than +1.96. We fail to
reject the null hypothesis if -1.96 ≤ test statistic ≤ +1.96. Graphically, this can be shown as:
The standard error of the sample mean is:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{36}} = 0.67$$
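The two-tailed rejection points and the standard error can be combined into one small routine. This is a sketch under the assumption that we test H0: µ = 2 against Ha: µ ≠ 2 with the Asian stocks figures (sample mean 4, σ = 4, n = 36); the function name is illustrative.

```python
import math
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Return the standard error, z statistic, rejection points and decision."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    upper = NormalDist().inv_cdf(1 - alpha / 2)   # alpha/2 in each tail
    lower = -upper
    reject = z < lower or z > upper
    return standard_error, z, (lower, upper), reject

se, z, (lower, upper), reject = two_tailed_z_test(4, 2, 4, 36, 0.05)
print(round(se, 2), round(z, 2), round(lower, 2), round(upper, 2), reject)
# prints 0.67 3.0 -1.96 1.96 True
```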
Example
Fund Alpha has been in existence for 20 months and has achieved a mean monthly return of
2% with a sample standard deviation of 5%. The expected monthly return for a fund of this
nature is 1.60%. Assuming monthly returns are normally distributed, are the actual results
consistent with an underlying population mean monthly return of 1.60%?
Solution:
The null and alternative hypotheses for this example will be:
H0: µ = 1.60 versus Ha: µ ≠ 1.60
$$\text{test statistic} = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{2 - 1.60}{5 / \sqrt{20}} = 0.36$$
Using this formula, we see that the value of the test statistic is 0.36.
The critical values at a 0.05 level of significance can be calculated from the t-distribution
table. Since this is a two-tailed test, we should look at a 0.05/2 = 0.025 level of significance
with df = n - 1 = 20 – 1 = 19. This gives us two values of -2.1 and +2.1.
Since our test statistic of 0.36 lies between -2.1 and +2.1, i.e., the acceptance region, we do
not reject the null hypothesis.
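The Fund Alpha calculation can be sketched as follows. Python's standard library has no t-distribution, so the critical value is entered by hand as a table lookup (about 2.093 for df = 19 with 0.025 in each tail, which the solution rounds to 2.1). The names below are illustrative.

```python
import math

def t_statistic(sample_mean, mu0, sample_sd, n):
    """Test statistic for a mean when the population variance is unknown."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))

# Assumption: value read from a t-table for df = 19, 0.025 in each tail
T_CRITICAL_DF19 = 2.093

t = t_statistic(sample_mean=2.0, mu0=1.60, sample_sd=5.0, n=20)
reject = abs(t) > T_CRITICAL_DF19        # two-tailed decision rule
print(round(t, 2), reject)               # prints 0.36 False
```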
Test Concerning Difference Between Means with Independent Samples
Instructor’s Note:
Focus on the basics of this topic, the probability of being tested on the details is low.
In this section, we will learn how to test the difference between the means of two
independent and normally distributed populations. We perform this test by drawing a
sample from each group. If it is reasonable to believe that the samples come from normally
distributed populations and are also independent of each other, we can proceed with the test.
The population variances may be assumed to be equal or unequal. However, the curriculum
focuses on tests under the assumption that the population variances are equal.
The term $s_p^2$ is known as the pooled estimator of the common variance. It is calculated by the
following formula:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
The number of degrees of freedom is n1 + n2 – 2.
Example
(This is based on Example 1 from the curriculum.)
An analyst wants to test if the returns for an index are different for two different time
periods. He gathers the following data:
                    | Period 1  | Period 2
Mean                | 0.01775%  | 0.01134%
Standard deviation  | 0.31580%  | 0.38760%
Sample size         | 445 days  | 859 days
Note that these periods are of different lengths and the samples are independent; that is,
there is no pairing of the days for the two periods.
Test whether there is a difference between the mean daily returns in Period 1 and in Period
2 using a 5% level of significance.
Solution:
The first step is to formulate the null and alternative hypotheses. Since we want to test
whether the two means were equal or different, we define the hypotheses as:
H0: µ1 - µ2 = 0
Ha: µ1 - µ2 ≠ 0
We then calculate the test statistic:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(445 - 1)(0.09973) + (859 - 1)(0.15023)}{445 + 859 - 2} = 0.1330$$
Here 0.09973 and 0.15023 are the squares of the two sample standard deviations, 0.31580 and
0.38760.
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\left(\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}\right)^{1/2}} = \frac{(0.01775 - 0.01134) - 0}{\left(\frac{0.1330}{445} + \frac{0.1330}{859}\right)^{1/2}} = 0.3009$$
For a 0.05 level of significance, we find the t-value for 0.05/2 = 0.025 using df = 445 + 859 -
2 = 1302. The critical t-values are ±1.962. Since our test statistic of 0.3009 lies in the
acceptance region, we fail to reject the null hypothesis.
We conclude that there is insufficient evidence to indicate that the returns are different for
the two time periods.
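The pooled-variance test above can be reproduced with the sketch below. The function names are illustrative, and the critical value of 1.962 is the table value quoted in the solution rather than something computed here.

```python
import math

def pooled_variance(n1, s1, n2, s2):
    """Pooled estimator of the common variance from two sample standard deviations."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

def pooled_t_statistic(mean1, s1, n1, mean2, s2, n2, hypothesized_diff=0.0):
    """t statistic and degrees of freedom, assuming equal population variances."""
    sp2 = pooled_variance(n1, s1, n2, s2)
    standard_error = math.sqrt(sp2 / n1 + sp2 / n2)
    df = n1 + n2 - 2
    return ((mean1 - mean2) - hypothesized_diff) / standard_error, df

# Index returns in percent: Period 1 versus Period 2
sp2 = pooled_variance(445, 0.31580, 859, 0.38760)
t, df = pooled_t_statistic(0.01775, 0.31580, 445, 0.01134, 0.38760, 859)
reject = abs(t) > 1.962                   # critical value taken from the solution
print(round(sp2, 4), df, round(t, 2), reject)   # prints 0.133 1302 0.3 False
```

The statistic of roughly 0.3 is far inside the ±1.962 band, which matches the conclusion that the two periods' mean returns are not significantly different.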
Test Concerning Differences between Means with Dependent Samples
Instructor’s Note:
Focus on the basics of this topic, the probability of being tested on the details is low.
In the previous section, in order to perform hypothesis tests on differences between means
of two populations, we assumed that the samples were independent. What if the samples are
not independent? For example, suppose you want to conduct tests on the mean monthly
return on Toyota stock and mean monthly return on Honda stock. These two samples are
believed to be dependent, as they are impacted by the same economic factors.
In such situations, we conduct a t-test that is based on data arranged in paired
observations. Paired observations are observations that are dependent because they have
something in common.
We will now discuss the process for conducting such a t-test.
Example:
Suppose that we gather data regarding the mean monthly returns on stocks of Toyota and
Honda for the last 20 months, as shown in the table below:
Month    | Mean monthly return of Toyota stock | Mean monthly return of Honda stock | Difference in mean monthly returns (di)
1        | 0.5%                                | 0.4%                               | 0.1%
2        | 0.7%                                | 1.0%                               | -0.3%
3        | 0.3%                                | 0.7%                               | -0.4%
…        | …                                   | …                                  | …
20       | 0.9%                                | 0.6%                               | 0.3%
Average  | 0.750%                              | 0.600%                             | 0.075%
Here is a simplified process for conducting the hypothesis test:
Step 1: Define the null and alternate hypotheses
We believe that the mean difference is not 0. Hence the null and alternate hypotheses are:
H0 : µd = µd0 versus Ha : µd ≠ µd0
µd stands for the population mean difference and µd0 stands for the hypothesized value for
the population mean difference.
Step 2: Calculate the test-statistic
Determine the sample mean difference using:
$$\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i$$
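A minimal sketch of the paired-comparisons calculation follows. It uses only the four differences visible in the table (months 1, 2, 3 and 20), so it does not reproduce the 20-month average; the elided months are not reconstructed. The steps after the mean difference (sample standard deviation of the differences, its standard error, and the t statistic with n - 1 degrees of freedom) are the standard paired t-test and are included here as an assumption about how the procedure continues.

```python
import math
import statistics

def paired_t_statistic(differences, mu_d0=0.0):
    """Mean difference, t statistic and degrees of freedom for paired observations."""
    n = len(differences)
    d_bar = statistics.mean(differences)        # sample mean difference
    s_d = statistics.stdev(differences)         # sample sd of differences (n - 1 divisor)
    standard_error = s_d / math.sqrt(n)         # standard error of the mean difference
    return d_bar, (d_bar - mu_d0) / standard_error, n - 1

# Illustration only: the four visible differences, in percent
visible_differences = [0.1, -0.3, -0.4, 0.3]
d_bar, t, df = paired_t_statistic(visible_differences)
print(round(d_bar, 3), round(t, 3), df)
```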
In tests concerning the variance of a single normally distributed population, we use the chi-
square test statistic, denoted by χ2.
Properties of the chi-square distribution
The chi-square distribution is asymmetrical and like the t-distribution, is a family of
distributions. This means that a different distribution exists for each possible value of
degrees of freedom, n - 1. Since the variance is a squared term, the minimum value can only
be 0. Hence, the chi-square distribution is bounded below by 0. The graph below shows the
shape of a chi-square distribution:
There are three hypotheses that can be formulated (σ² represents the true population
variance and σ0² represents the hypothesized variance):
1. H0: σ² = σ0² versus Ha: σ² ≠ σ0². This is used when we believe the population
variance is different from the hypothesized variance. It is a two-tailed test.
2. H0: σ² ≥ σ0² versus Ha: σ² < σ0². This is used when we believe the population
variance is less than the hypothesized variance. It is a one-tailed test.
3. H0: σ² ≤ σ0² versus Ha: σ² > σ0². This is used when we believe the population variance
is greater than the hypothesized variance. It is a one-tailed test.
After drawing a random sample from a normally distributed population, we calculate the
test statistic, which has n - 1 degrees of freedom, using the following formula:
$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$
where:
n = sample size
s² = sample variance
We then determine the critical values using the level of significance and degrees of freedom.
The chi-square distribution table is used to look up the critical value.
Example
Consider Fund Alpha which we discussed in an earlier example. This fund has been in
existence for 20 months. During this period the standard deviation of monthly returns was
5%. You want to test a claim by the fund manager that the standard deviation of monthly
returns is less than 6%.
Solution:
The null and alternate hypotheses are: H0: σ² ≥ 36 versus Ha: σ² < 36
Note that the hypothesized standard deviation is 6%. Since we are dealing with population
variance, we square this number to arrive at a variance of 36 (in squared percentage units).
We then calculate the value of the chi-square test statistic:
χ² = (n - 1) s² / σ0² = 19 × 25/36 = 13.19
Next, we determine the rejection point based on df = 19 and significance = 0.05. Using the
chi-square table, we find that this number is 10.117.
Since the test statistic (13.19) is higher than the rejection point (10.117) we cannot reject H0.
In other words, the sample standard deviation is not small enough to validate the fund
manager’s claim that population standard deviation is less than 6%.
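The chi-square calculation for Fund Alpha can be sketched as below. The rejection point of 10.117 is the table value quoted in the solution; it is entered by hand because Python's standard library has no chi-square quantile function. The names are illustrative.

```python
def chi_square_statistic(n, sample_variance, hypothesized_variance):
    """Chi-square test statistic for a single variance, with n - 1 degrees of freedom."""
    return (n - 1) * sample_variance / hypothesized_variance

# Table value from the solution: df = 19, 5% in the left tail
CHI2_CRITICAL_LEFT = 10.117

chi2 = chi_square_statistic(n=20, sample_variance=5**2, hypothesized_variance=6**2)
reject = chi2 < CHI2_CRITICAL_LEFT       # left-tailed test: reject only if below the rejection point
print(round(chi2, 2), reject)            # prints 13.19 False
```

Because the alternative is σ² < 36, the rejection region is in the left tail, so a statistic above the rejection point means H0 is not rejected.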
Test Concerning the Equality of Two Variances
Instructor’s Note:
Focus on the basics of this topic, the probability of being tested on the details is low.
In order to test the equality or inequality of two variances, we use an F-test which is the ratio
of sample variances.
The assumptions for an F-test to be valid are:
• The samples must be independent.
• The populations from which the samples are taken are normally distributed.
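The ratio of sample variances described above can be sketched as follows. The sample variances and sample sizes are hypothetical numbers chosen only for illustration, and `f_statistic` is an illustrative name. The critical value would still need to be looked up in an F-table using the two degrees-of-freedom values returned.

```python
def f_statistic(s1_squared, n1, s2_squared, n2):
    """F statistic as the ratio of two sample variances.

    Returns the ratio together with the numerator and denominator
    degrees of freedom (n1 - 1 and n2 - 1).
    """
    return s1_squared / s2_squared, n1 - 1, n2 - 1

# Hypothetical inputs: variances of 25 and 16 from samples of 20 and 25 observations
f, df_num, df_den = f_statistic(25.0, 20, 16.0, 25)
print(f, df_num, df_den)                 # prints 1.5625 19 24
```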
Properties of the F-distribution
The F-distribution, like the chi-square distribution, is a family of asymmetrical distributions
bounded from below by 0. Each F-distribution is defined by two values of degrees of
freedom, called the numerator and denominator degrees of freedom. As shown in the figure
below, the F-distribution is skewed to the right and is truncated at zero on the left hand side.
Summary
LO: Explain hypothesis testing and its components, including statistical significance,
Type I and Type II errors, and the power of a test.
LO: Construct hypothesis tests and determine their statistical significance, the
associated Type I and Type II errors, and power of the test given a significance level.
A hypothesis is a statement about the value of a population parameter developed for the
purpose of testing a theory.
In order to test a hypothesis, we follow these steps:
1. State the hypothesis.
2. Identify the appropriate test statistic and its probability distribution.
3. Specify the significance level.
4. State the decision rule.
5. Collect data and calculate the test statistic.
6. Make a decision.
A test statistic is a quantity, calculated on the basis of a sample, and is used to decide
whether to reject or not to reject the null hypothesis. The formula for computing the test
statistic is:
sample statistic − value of the parameter under H0
test statistic =
standard error of the sample statistic
In reaching a statistical decision, we can make two possible errors: We may reject a true null
hypothesis (a Type I error), or we may fail to reject a false null hypothesis (a Type II error).
The level of significance of a test is the probability of a Type I error. As α gets smaller the
critical value gets larger and it becomes more difficult to reject the null hypothesis.
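The link between α and the critical value can be seen by computing right-tail z critical values at the three commonly used significance levels. This is a small sketch using `statistics.NormalDist`; a one-tailed z-test is assumed for simplicity.

```python
from statistics import NormalDist

levels = [0.10, 0.05, 0.01]
# Right-tail critical value for each significance level
critical_values = {a: NormalDist().inv_cdf(1 - a) for a in levels}

for a in levels:
    print(f"alpha = {a:.2f}: reject H0 if z > {critical_values[a]:.3f}")
```

The critical value rises from about 1.28 at the 10% level to about 2.33 at the 1% level, so a larger test statistic is needed to reject the null as α shrinks.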
The power of a test is the probability of correctly rejecting the null (rejecting the null when it
is false). It is expressed as:
Power of a test = 1 – P (Type II error)
LO: Compare and contrast parametric and nonparametric tests, and describe
situations where each is the more appropriate type of test.
A parametric test is a hypothesis test concerning a parameter or a hypothesis test based on
specific distributional assumptions. In contrast, a nonparametric test is either not concerned
with a parameter or makes minimal assumptions about the population from which the
sample is drawn.
A nonparametric test is primarily used in three situations: when data do not meet
distributional assumptions, when data are given in ranks, or when the hypothesis we are
addressing does not concern a parameter.