
Stats Mod6 18

The document explains the use of z-tests and t-tests for hypothesis testing concerning population means, detailing when to use each based on sample size and standard deviation knowledge. It provides formulas for calculating test statistics and examples illustrating their application at various significance levels. Additionally, it discusses how to interpret the results of hypothesis tests and the implications of rejecting or failing to reject the null hypothesis.

Uploaded by

Jennah Naguit

Statistics and

Probability
Quarter 4 – Module 6:
Computing Test Statistic on
Population Mean
There are two specific test statistics used for hypothesis testing
concerning means: z-test and t-test.

If the sample size is large (𝑛 ≥ 30) and the population standard
deviation (𝜎) is known, use the z-test.

To find the z-value, use the formula below:

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

where: 𝑥̅ = sample mean 𝜇 = population mean
𝑛 = sample size 𝜎 = population standard deviation

On the other hand, the t-test is used when 𝑛 < 30, the population is normal
or nearly normal, and the population standard deviation (𝜎) is unknown, so
the sample standard deviation (𝑠) is used in its place.

The formula for the t-value is:

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$

where: 𝑥̅ = sample mean 𝜇 = population mean
𝑛 = sample size 𝑠 = sample standard deviation

The degrees of freedom are 𝑑𝑓 = 𝑛 − 1.
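The two formulas can be written as small functions. This is a minimal sketch, not part of the module; the function and parameter names are illustrative assumptions.

```python
import math


def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """z = (sample mean - population mean) / (sigma / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))


def t_statistic(sample_mean, pop_mean, sample_sd, n):
    """Return (t, df), where t = (sample mean - population mean) / (s / sqrt(n))
    and df = n - 1."""
    t = (sample_mean - pop_mean) / (sample_sd / math.sqrt(n))
    return t, n - 1
```

Both functions share the same arithmetic; they differ only in which standard deviation is supplied (𝜎 for z, 𝑠 for t) and in the degrees of freedom returned with t.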

Study the following examples.


Example 1: Compute the z-value given the following information. Use a
one-tailed test and the 0.05 level of significance.
𝑥̅ = 71.5 𝜇 = 70 𝜎 = 8 𝑛 = 100

Solution: Since σ is known and n ≥ 30, we will use the z-test. Thus, we have:

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ (Use the formula for the z-test.)

$z = \dfrac{71.5 - 70}{8 / \sqrt{100}}$ (Substitute the given values into the formula.)

$z = \dfrac{1.5}{8 / 10} = \dfrac{1.5}{0.8}$ (Simplify.)

z = 1.875

Therefore, the computed z-value is 1.875.

Example 2: In the first semester of the school year, a random sample of 200
students got a mean score of 81.72 in a Statistics and Probability test. The
population standard deviation is 15 and the population mean is 79.83. Compute
the z-value. Use the 0.05 level of significance.

Solution: To answer the problem, let us first identify the given. We have:
𝑥̅ = 81.72 𝜇 = 79.83 𝜎 = 15 𝑛 = 200
Since σ is known and n ≥ 30, we will use the z-test.

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ (Use the formula for the z-test.)

$z = \dfrac{81.72 - 79.83}{15 / \sqrt{200}}$ (Substitute the given values into the formula.)

$z = \dfrac{1.89}{15 / 14.14} = \dfrac{1.89}{1.06}$ (Simplify.)

z = 1.783

Therefore, the computed z-value is 1.783.

By the Central Limit Theorem, when the sample is large (n ≥ 30), the sample
standard deviation (𝑠) may be used as an estimate of the population standard
deviation (𝜎) when the value of 𝜎 is unknown.

Consider the given examples below:


Example 3: In the past, the average length of an outgoing call from a business
office has been 140 seconds. A manager wishes to check whether that average
has decreased after the introduction of policy changes. A sample of 150
telephone calls produced a mean of 135 seconds, with a standard deviation of
30 seconds. Perform the relevant test at the 1% level of significance.

Solution: Let us first identify the given. We have:
𝑥̅ = 135 𝜇 = 140 𝑠 = 30 𝑛 = 150
Since n ≥ 30, we will use the z-test, replacing 𝝈 with its estimate s.

$z = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$ (Use the formula for the z-test, with s in place of σ.)

$z = \dfrac{135 - 140}{30 / \sqrt{150}}$ (Substitute the given values into the formula.)

$z = \dfrac{-5}{30 / 12.25} = \dfrac{-5}{2.45}$ (Simplify.)

z = −2.041

Therefore, the computed z-value is −2.041.

Example 4: Compute the t-value given the following information:
𝑥̅ = 129.5 𝜇 = 127 𝑠 = 5 𝑛 = 12

Solution: Since σ is unknown and n < 30, we will use the t-test. Thus, we have:

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$ (Use the formula for the t-test.)

$t = \dfrac{129.5 - 127}{5 / \sqrt{12}}$ (Substitute the given values into the formula.)

$t = \dfrac{2.5}{5 / 3.46} = \dfrac{2.5}{1.44}$ (Simplify.)

t = 1.736

Therefore, the computed t-value is 1.736.

Example 5: The government claims that the monthly expenses of a Filipino
family with four members is P10,000. A sample of 26 families has mean
expenses of P10,900 and a standard deviation of P1,250. Is there enough
evidence to reject the government’s claim at 𝛼 = 0.01?

Solution: Let us first identify the given, so we have:
𝑥̅ = P10,900 𝜇 = P10,000 𝑠 = P1,250 𝑛 = 26
Since σ is unknown and n < 30, we will use the t-test.

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$ (Use the formula for the t-test.)

$t = \dfrac{10\,900 - 10\,000}{1\,250 / \sqrt{26}}$ (Substitute the given values into the formula.)

$t = \dfrac{900}{1\,250 / 5.10} = \dfrac{900}{245.10}$ (Simplify.)

t = 3.671

Therefore, the computed t-value is 3.671.
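The worked solutions round intermediate values (for example, √12 ≈ 3.46), so the last decimal place can differ slightly from an unrounded computation. The sketch below recomputes the five test statistics without intermediate rounding, using the values as substituted in the worked solutions; the helper name `standardized` is an illustrative assumption. Example 4, for instance, comes out near 1.732 rather than 1.736.

```python
import math


def standardized(sample_mean, pop_mean, sd, n):
    """(sample mean - population mean) / (sd / sqrt(n)).

    sd is sigma for a z-test, or s when s stands in for sigma.
    """
    return (sample_mean - pop_mean) / (sd / math.sqrt(n))


# (sample mean, population mean, standard deviation, n) from Examples 1 to 5
examples = {
    "Example 1 (z)": (71.5, 70, 8, 100),
    "Example 2 (z)": (81.72, 79.83, 15, 200),
    "Example 3 (z)": (135, 140, 30, 150),
    "Example 4 (t)": (129.5, 127, 5, 12),
    "Example 5 (t)": (10900, 10000, 1250, 26),
}

results = {name: standardized(*given) for name, given in examples.items()}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```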

Statistics and
Probability
Quarter 4 – Module 7:
Drawing Conclusion About
Population Mean Based on
Test Statistic Value and
Table 1: z – Critical Value

Type of Test       Level of Significance
                   𝜶 = 1%       𝜶 = 2.5%     𝜶 = 5%       𝜶 = 10%
one-tailed test    𝑐 = ±2.326   𝑐 = ±1.960   𝑐 = ±1.645   𝑐 = ±1.28
two-tailed test    𝑐 = ±2.575   𝑐 = ±2.241   𝑐 = ±1.960   𝑐 = ±1.645
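The entries of a z critical-value table can be reproduced from the standard normal distribution. A minimal sketch, assuming Python 3.8 or later (for `statistics.NormalDist`); the function name `z_critical` is an illustrative assumption.

```python
from statistics import NormalDist


def z_critical(alpha, tails=1):
    """Positive critical z-value for a one-tailed (tails=1) or
    two-tailed (tails=2) test at significance level alpha.

    A two-tailed test splits alpha equally between the two tails.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)


for alpha in (0.01, 0.025, 0.05, 0.10):
    print(alpha, round(z_critical(alpha, 1), 3), round(z_critical(alpha, 2), 3))
```

For a left-tailed test the critical value is the negative of the one-tailed value; for a two-tailed test both signs are used.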

Table 2: t – Critical Value


𝜶 for one-tailed test 0.05 0.025 0.01 0.005

𝜶 for two-tailed test 0.10 0.05 0.02 0.01

df = (n – 1)
1 6.314 12.706 31.821 63.657
2 2.920 4.303 6.965 9.925
3 2.353 3.182 4.541 5.841
4 2.132 2.776 3.747 4.604
5 2.015 2.571 3.365 4.032
6 1.943 2.447 3.143 3.707
7 1.895 2.365 2.998 3.499
8 1.860 2.306 2.896 3.355
9 1.833 2.262 2.821 3.250
10 1.812 2.228 2.764 3.169
11 1.796 2.201 2.718 3.106
12 1.782 2.179 2.681 3.055
13 1.771 2.160 2.650 3.012
14 1.761 2.145 2.624 2.977
15 1.753 2.131 2.602 2.947
16 1.746 2.120 2.583 2.921
17 1.740 2.110 2.567 2.898
18 1.734 2.101 2.552 2.878
19 1.729 2.093 2.539 2.861
20 1.725 2.086 2.528 2.845
21 1.721 2.080 2.518 2.831
22 1.717 2.074 2.508 2.819
23 1.714 2.069 2.500 2.807
24 1.711 2.064 2.492 2.797
25 1.708 2.060 2.485 2.787
26 1.706 2.056 2.479 2.779
27 1.703 2.052 2.473 2.771
28 1.701 2.048 2.467 2.763
29 1.699 2.045 2.462 2.756
30 1.697 2.042 2.457 2.750
31 1.695 2.040 2.453 2.744
32 1.694 2.037 2.449 2.738
33 1.692 2.035 2.445 2.733
34 1.691 2.032 2.441 2.728
35 1.690 2.030 2.438 2.724
36 1.688 2.028 2.434 2.719
37 1.687 2.026 2.431 2.715
38 1.686 2.024 2.429 2.712
39 1.685 2.023 2.426 2.708
40 1.684 2.021 2.423 2.704
42 1.682 2.018 2.418 2.698
44 1.680 2.015 2.414 2.692
46 1.679 2.013 2.410 2.687
48 1.677 2.011 2.407 2.682
50 1.676 2.009 2.403 2.678
60 1.671 2.000 2.390 2.660
Infinity 1.645 1.960 2.326 2.576

In general, if the absolute value of the computed value is greater than
the absolute value of the critical value, we reject the null hypothesis and
support the alternative hypothesis. But if the absolute value of the computed
value is less than the absolute value of the critical value, we fail to reject
the null hypothesis and the alternative hypothesis is not supported.

In a right-tailed test, if the computed value is greater than the critical
value, we reject the null hypothesis and support the alternative hypothesis.
But if the computed value is less than the critical value, we fail to reject
the null hypothesis and the alternative hypothesis is not supported.

In a left-tailed test, if the computed value is less than the critical value,
we reject the null hypothesis and support the alternative hypothesis. But if
the computed value is greater than the critical value, we fail to reject the
null hypothesis and the alternative hypothesis is not supported.

Rejecting the null hypothesis does not prove that it is false or that the
alternative hypothesis is true. It means only that the collected data provide
sufficient evidence against the null hypothesis, hence we reject it.

Similarly, a failure to reject the null hypothesis does not mean that it is
true, only that the test did not show it to be false. There is insufficient
evidence against the null hypothesis; hence we do not reject it.
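The three decision rules above can be sketched as a single function. The name `decide` and the tail labels are illustrative assumptions; the critical value may be passed with either sign, since only its magnitude is used.

```python
def decide(computed, critical, tail):
    """Apply the decision rule for a 'left', 'right', or 'two'-tailed test."""
    c = abs(critical)
    if tail == "right":
        reject = computed > c          # rejection region lies to the right of +c
    elif tail == "left":
        reject = computed < -c         # rejection region lies to the left of -c
    elif tail == "two":
        reject = abs(computed) > c     # rejection region lies in both tails
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return "reject H0" if reject else "fail to reject H0"
```

A computed value of −1.736 against a left-tailed critical value of −2.718, for example, does not fall in the rejection region, so the function returns "fail to reject H0".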

Study the examples below.

Example 1: Compute the test statistic given the following information. Use 𝛼 =
0.05. Interpret the result.
𝐻𝑜: 𝜇 = 70 𝑥̅ = 71.5 𝜇 = 70
𝐻𝑎: 𝜇 > 70 𝜎 = 8 𝑛 = 100

Solution: It is a one-tailed (right-tailed) test, since the alternative hypothesis
states a direction (it uses the symbol >). Since σ is known and n ≥ 30, we will use
the z-test. The level of significance is 0.05. From Table 1, the z-critical value
is 1.645. Thus, we have:

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{71.5 - 70}{8 / \sqrt{100}} = \dfrac{1.5}{8 / 10} = \dfrac{1.5}{0.8} = 1.875$

[Figure: normal curve with the non-rejection region to the left of the critical value 1.645 and the rejection region to its right.]

Decision:

The computed z-value is 1.875, which is greater than the critical value of 1.645.
Therefore, we reject the null hypothesis and support the alternative hypothesis.
Example 2: Compute the test statistic given the following information. Use 𝛼 =
0.01. Interpret the result.
𝐻𝑜: 𝜇 = 127 𝑥̅ = 124.5 𝜇 = 127
𝐻𝑎: 𝜇 < 127 𝑠 = 5 𝑛 = 12

Solution: It is a left-tailed test, since the alternative hypothesis states a
direction (it uses the symbol <). Since σ is unknown and n < 30, we will use the
t-test. The degrees of freedom (df = n − 1) are 11 and 𝛼 = 0.01. Therefore, the
t-critical value from Table 2 is −2.718. Thus, we have:

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{124.5 - 127}{5 / \sqrt{12}} = \dfrac{-2.5}{5 / 3.46} = \dfrac{-2.5}{1.44} = -1.736$

[Figure: t-distribution curve with the rejection region to the left of the critical value −2.718 and the non-rejection region to its right.]

Decision:

The computed t-value is greater than the t-critical value at 𝛼 = 0.01 (i.e.,
−1.736 > −2.718). Since this is a left-tailed test, the computed value does not
fall in the rejection region, so we fail to reject the null hypothesis.

Example 3: The government claims that P10,000 is the monthly expenses of a
Filipino family with four members. A sample of 26 families has mean monthly
expenses of P10,900 and a standard deviation of P1,250. Is there enough
evidence to reject the government’s claim at 𝛼 = 2.5%?

Solution: Let us first identify the given. So we have:
𝐻𝑜: 𝜇 = P10,000 𝑥̅ = P10,900 𝑠 = P1,250
𝐻𝑎: 𝜇 ≠ P10,000 𝜇 = P10,000 𝑛 = 26

It is a two-tailed test, since the claim does not mention a direction. Since σ
is unknown and n < 30, we will use the t-test. The degrees of freedom
(df = n − 1) are 25 and 𝛼 = 2.5%. The critical value used here is the Table 2
entry 2.485 (df = 25, one-tailed 0.01 column). This cutoff is slightly stricter
than the exact two-tailed 2.5% cutoff, so a rejection based on it still holds at
𝛼 = 2.5%. Thus, we have:

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{10\,900 - 10\,000}{1\,250 / \sqrt{26}} = \dfrac{900}{1\,250 / 5.10} = \dfrac{900}{245.10} = 3.671$

[Figure: t-distribution curve with rejection regions to the left of −2.485 and to the right of 2.485, and the non-rejection region between them.]

Decision:

The absolute value of the computed t-value is greater than the absolute value of
the critical t-value (i.e., |3.671| > |2.485|). Therefore, we reject the null
hypothesis.

Conclusion:

We can conclude that there is enough evidence to reject the claim of the
government that P10,000 is the monthly expenses of a Filipino family with four
members.
Statistics and
Probability
Quarter 4 – Module 8: Solving
Problems Involving Test of
Hypothesis on Population
In testing hypothesis on the population means, follow the steps below:

1. State the null hypothesis 𝐻𝑜 and the alternative hypothesis 𝐻𝑎.


2. Determine the test statistic that will be used to conduct the hypothesis test.
Then, calculate its value.
3. Find the critical value for the test and draw the critical region.
4. Decide and draw a conclusion based on the comparison of the calculated
value of the test statistic and the critical value of the test.
In general, if the absolute value of the computed value is greater than the
absolute value of the critical value, we reject the null hypothesis and support
the alternative hypothesis. But if the absolute value of the computed value is
less than the absolute value of the critical value, we fail to reject the null
hypothesis and the alternative hypothesis is not supported.
In a right-tailed test, if the computed value is greater than the critical
value, we reject the null hypothesis and support the alternative
hypothesis. But if the computed value is less than the critical value, we fail
to reject the null hypothesis and the alternative hypothesis is not
supported.

In a left-tailed test, if the computed value is less than the critical


value, we reject the null hypothesis and support the alternative
hypothesis. But if the computed value is greater than the critical value, we
fail to reject the null hypothesis and the alternative hypothesis is not
supported.

Study the given examples below.


Example 1: According to a study conducted by the Grade 12 students, ₱155 is
the average monthly expense for cell phone loads of high school students in
their province. A Statistics student claims that this amount has increased since
January of this year. Is his claim acceptable if a random sample of 50
students has an average monthly expense of ₱165 for cell phone loads? Use
the 5% level of significance and assume that the population standard
deviation is ₱52.
Solution:

Given: 𝑥̅ = 165 𝜇 = 155 𝜎 = 52 𝑛 = 50 𝛼 = 0.05

Step 1: State the null and alternative hypotheses.


𝐻𝑜: 𝜇 = 155 𝐻𝑎: 𝜇 > 155

Step 2: Determine the test statistic, then compute its value.


Since the population mean is being tested, the population standard deviation 𝜎
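is known, and 𝑛 > 30, the appropriate test statistic is the z-test.

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{165 - 155}{52 / \sqrt{50}} = \dfrac{10}{52 / 7.07} = \dfrac{10}{7.35}$

z = 1.361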
Step 3: Find the critical value and draw the critical region. Use the z-critical
value table.

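The alternative hypothesis is directional. Hence, the one-tailed test
(right-tailed test) shall be used. From the z-value table at 0.05 level of
significance, the critical value is 1.645.

[Figure: normal curve with the non-rejection region to the left of the critical value 1.645 and the rejection region to its right; the computed value 1.361 lies in the non-rejection region.]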

Step 4: Draw a conclusion.


The z-computed value is 1.361 and it lies within the non-rejection region,
so we fail to reject the null hypothesis. Therefore, there is not enough
evidence to support the claim that the average monthly expense for cell phone
loads is more than ₱155. The result is not significant at the 𝛼 = 0.05 level.

Example 2: Blood glucose levels for obese teenagers have a mean of 120. A
researcher thinks that a diet high in raw cornstarch will have a positive or
negative effect on blood glucose levels. A sample of 25 patients who have tried
the raw cornstarch diet has a mean glucose level of 135 with a standard
deviation of 38. Test the hypothesis at 𝛼 = 0.10 that the raw cornstarch had an
effect.
Solution:
Given: 𝑥̅ = 135 𝜇 = 120 𝑠 = 38 𝑛 = 25 𝛼 = 0.10 𝑑𝑓 = 24
Step 1: State the null and alternative hypotheses.
𝐻𝑜: 𝜇 = 120 𝐻𝑎: 𝜇 ≠ 120
Step 2: Determine the test statistic, then compute its value.
Since it is the population mean being tested, the population standard deviation
is unknown, and 𝑛 < 30, the appropriate test statistic is the t-test.

$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}} = \dfrac{135 - 120}{38 / \sqrt{25}} = \dfrac{15}{7.6}$

t = 1.974

Step 3: Find the critical value and draw the critical region.
The alternative hypothesis is non-directional. Hence, the two-tailed test
shall be used. From the t-value table at 0.10 level of significance with
df = 24, the critical value is ±1.711.

[Figure: t-distribution curve with rejection regions to the left of −1.711 and to the right of 1.711, and the non-rejection region between them.]

Step 4: Draw a conclusion.

Since the t-computed value is 1.974 which is greater than the critical value
of 1.711, we reject the null hypothesis and support the alternative hypothesis.
We can conclude that there is enough evidence to support the claim that the raw
cornstarch had an effect on blood glucose levels.
Example 3: The average IQ of Senior High School students is 99 with a standard
deviation of 15. A researcher believes that the average IQ of Senior High School
students is lower. A random sample of 40 students was tested and got an average
of 95. Is there enough evidence to suggest that the average IQ is lower? Test the
hypothesis at 0.05 level of significance.
Solution:
Given: 𝑥̅ = 95 𝜇 = 99 𝜎 = 15 𝑛 = 40 𝛼 = 0.05
Step 1: State the null and alternative hypotheses.
𝐻𝑜: 𝜇 = 99 𝐻𝑎: 𝜇 < 99
Step 2: Determine the test statistic, then compute its value.
Since the population mean is being tested, the population standard deviation 𝜎
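is known, and 𝑛 > 30, the appropriate test statistic is the z-test.

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{95 - 99}{15 / \sqrt{40}} = \dfrac{-4}{15 / 6.32} = \dfrac{-4}{2.37}$

z = −1.688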

Step 3: Find the critical value and draw the critical region. Use the
z-critical value table.

The alternative hypothesis is directional. Hence, the one-tailed test
(left-tailed test) shall be used. From the z-value table at 0.05 level of
significance, the critical value is -1.645.

[Figure: normal curve with the rejection region to the left of the critical value −1.645 and the non-rejection region to its right.]

Step 4: Draw a conclusion.


The z-computed value is -1.688 and it lies within the rejection region, so
we reject the null hypothesis. Therefore, there is enough evidence to support
the claim that the IQ level of Senior High School students is lower than 99.
This result is significant at 𝛼 = 0.05 level.
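The four steps can be sketched end to end for the z-test cases (Examples 1 and 3). This is an illustrative sketch, not the module's procedure verbatim: the function name and return format are assumptions, and the critical value is computed with Python's `statistics.NormalDist` (Python 3.8 or later) instead of being read from the table. Because no intermediate rounding is done, the computed z may differ from the worked values in the third decimal place.

```python
import math
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, alpha, tail):
    """Run the four steps of a z-test on a population mean.

    tail is 'left', 'right', or 'two' (the direction of Ha).
    """
    # Step 1: state the hypotheses.
    symbol = {"left": "<", "right": ">", "two": "≠"}[tail]
    hypotheses = (f"Ho: mu = {mu0}", f"Ha: mu {symbol} {mu0}")

    # Step 2: compute the test statistic.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))

    # Step 3: find the (positive) critical value.
    tail_area = alpha / 2 if tail == "two" else alpha
    critical = NormalDist().inv_cdf(1 - tail_area)

    # Step 4: compare and decide.
    if tail == "right":
        reject = z > critical
    elif tail == "left":
        reject = z < -critical
    else:
        reject = abs(z) > critical

    decision = "reject H0" if reject else "fail to reject H0"
    return {"hypotheses": hypotheses, "z": z,
            "critical": critical, "decision": decision}


print(z_test_mean(165, 155, 52, 50, 0.05, "right"))  # Example 1 (cell phone loads)
print(z_test_mean(95, 99, 15, 40, 0.05, "left"))     # Example 3 (IQ)
```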
Statistics and
Probability
Quarter 4 – Module 9:
Formulating Appropriate
Null and Alternative Hypotheses
on a Population Proportion

Once you already know that you are dealing with a population proportion,
you can conduct the hypothesis test. You can start with the first step of a
hypothesis test which is to determine the hypotheses. In order to formulate null
and alternative hypotheses concerning population proportions, you can write
them in sentence form or you can use different symbols. Here, you will use the
symbol p for the population proportion.
Remember that the hypotheses are claims about the population
proportion, p. The null hypothesis states that the proportion is equal to a
specific value or the hypothesized proportion, po. On the other hand, the
alternative hypothesis is the competing claim that the population proportion is
less than, greater than, or not equal to po.
As a reminder, the null hypothesis is always a statement of equality. The
alternative hypothesis is always a statement of inequality, using the symbols <,
>, or ≠. Moreover, the hypotheses are stated in such a way that they are mutually
exclusive. That is, if one is true, the other must be false; and vice versa.

If you are going to write the null hypothesis in sentence form, you will
usually use “is” or “is equal to”. In symbols, you are going to use:

Ho : p = po

Meanwhile, to formulate the alternative hypothesis in sentence form or in
symbols, just remember the following:
➢ When testing for population proportions, there are three (3) possible
alternative hypotheses. They are based on the wording of the question
instructing you what to hypothesize. (See illustrative examples below.)

Alternative Hypotheses        Clues/Words Used
(symbols to be used)

a. Ha : p < po                smaller, less, decreased, fewer, lower
b. Ha : p > po                larger, greater, more, increased
c. Ha : p ≠ po                different, not equal to, changed

where: p = population proportion
       po = hypothesized proportion

In the symbols shown above, letters a and b are used in a one-tailed or
one-sided test (directional), while letter c is used for a two-tailed
test (non-directional).
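The clue-word table can be sketched as a simple lookup. This is only a naive illustration: the function name is an assumption, the matching is by whole word or phrase, and real problem statements still need to be read in context.

```python
import re

# Clue words from the table above, keyed by the symbol used in Ha.
CUE_WORDS = {
    "<": ("smaller", "less", "decreased", "fewer", "lower"),
    ">": ("larger", "greater", "more", "increased"),
    "≠": ("different", "not equal to", "changed"),
}


def alternative_symbol(problem_text):
    """Return '<', '>', or '≠' for the first clue word found, else None."""
    text = problem_text.lower()
    for symbol, cues in CUE_WORDS.items():
        for cue in cues:
            # \b keeps 'less' from matching inside a longer word such as 'unless'
            if re.search(r"\b" + re.escape(cue) + r"\b", text):
                return symbol
    return None
```

A result of "<" or ">" corresponds to a one-tailed test, and "≠" to a two-tailed test.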

As you might recall, the differences between one-tailed test


(directional) and two-tailed test (non-directional) were already explained to you
in the previous modules. And for the purpose of this lesson, the table below
shows the differences between one-tailed test and two-tailed test.

One-Tailed                                    Two-Tailed

Alternative hypothesis contains the           Alternative hypothesis contains the
greater than (>) or less than (<) symbol.     inequality (≠) symbol.

It is directional (either right-tailed        It has no direction.
or left-tailed).

The next table below shows the null and alternative hypotheses stated
together with the types of hypothesis tests.
                         Two-Tailed Test   Right-Tailed Test            Left-Tailed Test
Null Hypothesis          Ho: p = po        Ho: p = po or Ho: p ≤ po     Ho: p = po or Ho: p ≥ po
Alternative Hypothesis   Ha: p ≠ po        Ha: p > po                   Ha: p < po

Illustrative Examples:
Example 1. It has been claimed that 40% of students in a particular senior
high school dislike Mathematics. When a survey was conducted by a researcher,
it showed that 145 of 800 students dislike Mathematics. Test whether the true
proportion is different from the claim at the α = 0.05 level.

Null Hypothesis (Ho):

In this example, the hypothesized proportion is 40% or 0.40. Hence, the


null hypothesis will be,
The proportion of students who dislike Mathematics is 40%.
In symbols, you can write,
Ho: p = 0.40

Alternative Hypothesis (Ha):

Our cue word here is “different” which means “not the same” or “not
equal”. Therefore the alternative hypothesis is,
The proportion of students who dislike Mathematics is not equal to
40%.
In symbols, you can write,
Ha: p ≠ 0.40

Since the word “different” is used in the given problem,


the symbol to be used in alternative hypothesis is “ ≠ ”.

Note: This is a two-tailed test or non-directional.


Example 2. A certain senior high school plans to open STEM (Science,
Technology, Engineering, and Mathematics) as an academic track only if 60% of
the students in their junior high school will enroll in the following academic
year. A survey conducted among a random sample of students revealed that
450 out of 1000 students will enroll. Is the expected enrollment significantly
lower than the desired enrollment? Test at the α = 0.05 level.

Null Hypothesis (Ho):


The hypothesized proportion here is 60%, therefore the null
hypothesis will be,
The proportion of students who will enroll in the STEM track is 60%.
In symbols, it can be written as,
Ho: p = 0.60

Alternative Hypothesis (Ha):

Your hint in formulating the alternative hypothesis in this example is the


phrase “lower than” which means “less than”. So, your alternative hypothesis
will be,
The proportion of students who will enroll in the STEM track is lower
than 60%.
which can be written as,
Ha: p < 0.60

Since the word “lower” is used in the given problem, the


symbol to be used in alternative hypothesis is “<”.

Note: This is a one-tailed test or directional.

Example 3. It has been claimed that 40% of qualified applicants passed a
particular job interview. When a survey was conducted by a researcher of a
certain company, it showed that 90 of 145 applicants passed the job interview.
Test whether the true proportion is larger than claimed at the α = 0.05 level.

Null Hypothesis (Ho):

40% is the hypothesized proportion; hence you have the null hypothesis
stated as
The proportion of qualified applicants who passed a particular job
interview is 40%.
And it can be written in symbols as
Ho: p = 0.40

Alternative Hypothesis (Ha):


The word “larger” is synonymous to “greater” hence your alternative
hypothesis will be,
The proportion of qualified applicants who passed a particular job
interview is larger than 40%.
Or in symbols
Ha: p > 0.40

Since the word “larger” is used in the given problem, the


symbol to be used in alternative hypothesis is “ > “.

Note: This is a one-tailed test or directional.

Statistics and
Probability
Quarter 4 – Module 10:
Identifying Appropriate Test
Statistic Involving Population
Proportion

Dealing with various problems or situations oftentimes leads to


confusion. In this section, take note that problems involving proportions,
unlike in population mean and sample mean, never use terms such as
“average” and “mean” but “percentage” instead. Let us first define what
population proportion is.

Population Proportion and Sample Proportion


Population proportion (p) is a part of the population with a
particular attribute or trait expressed as a fraction, decimal, or percentage
of the whole population. In symbols:

$p = \dfrac{\text{number of members in the population with a particular attribute}}{\text{number of members in the population}}$

p = ____ %

Notice that in Matapat City, 10% (percentage is used) of the entire


residents are senior citizen. Therefore, the percentage of the senior citizen
residents represents the population proportion or percentage which
makes p = 10% = 0.10.
Similarly, among these senior citizens, what percentage owns a cell
phone? That illustrates the sample proportion, in symbol 𝒑̂ (read as “p
hat”) which is computed as follows:

𝒑̂ = 0.84

Sometimes, the sample proportion ( 𝒑̂) is stated directly, such as:


- “20% of the respondents” = 0.20
- “5% of the defective bulbs” = 0.05
- “50% of the Grade 12 students” = 0.50

To change percent to
decimal, see examples
below:
1. 12% = 0.12
2. 5% = 0.05
3. 12.5% = 0.125

On the other hand, there are cases where we still need to calculate 𝒑̂.
Examples of these kinds are:

- “70 out of 200 residents are married.”


- “150 out of 500 listeners are interviewed.”
- “10 out of 1000 bulbs are defective.”

In this case, we need to solve for the value of the sample proportion
𝒑̂ (read as “p hat”).

Sample proportion (p̂) is the ratio of the number of elements in the
sample possessing the characteristic of interest to the number of
elements in the sample, n. It is computed by the formula:

$\hat{p} = \dfrac{\text{number of successes in } n \text{ trials}}{\text{number of trials or the size of the sample}} = \dfrac{x}{n}$

where: p̂ (read as “p hat”) is the proportion of successes in the sample;

x represents the number of “successes” in the sample; and

n represents the size of the sample.

The example below will help you understand better how we can easily
estimate the value of the sample proportion.
Remember that in a situation describing a population proportion or a
sample proportion, the words “mean” or “average” are not used.

Illustrative Example:
For a class project, a Grade 12 STEM student wants to estimate the
percentage of students in his school who are registered voters. From 45%
Grade 12 students, he surveys 500 students and finds that 200 are
registered voters. Determine the value of p and compute for the sample
proportion.

Solution:
The population proportion is the rate or percent used from the entire
Grade 12 students. Therefore:

Population Proportion, p = 45% = 0. 45

To find the sample proportion ( 𝒑̂ ), identify the ff:


Surveyed Grade 12 students = n = 500
Registered Grade 12 students = x = 200

Therefore, the sample proportion will be computed as follows:

Sample Proportion,

$\hat{p} = \dfrac{x}{n} = \dfrac{200}{500} = 0.4$

Using the Central Limit Theorem in Testing Population Proportion


When testing situations involving proportion, a percentage, or a
probability, the following assumptions must be considered:

1. The conditions for binomial experiment are met. That is, there is a fixed
number of independent trials with constant probabilities and each trial
has two outcomes that we usually classify as “success” (p) and
“failure” (q). The sum of p and q is 1. Hence, we can write p + q = 1
or q = 1 – p.
2. The conditions np ≥ 5 and nq ≥ 5 are both satisfied so that the binomial
distribution can be approximated by a normal distribution with 𝜇 = 𝑛𝑝
and $\sigma = \sqrt{npq}$. (However, the specific number varies from source
to source; some authors use 10 instead of 5, depending on how good an
approximation one wants.)

Likewise, the second assumption serves as the basis for determining
whether the sample size is sufficiently large or not. Remember that this
time, the condition for a large sample is not that n be at least 30;
instead, the sample must satisfy the second assumption. For sufficiently
large samples, the Central Limit Theorem (CLT) can be used. Bear in mind
that if the sample size is sufficiently large, then the mean of a random
sample from a population has a sampling distribution that is approximately
normal, even when the original population is not normally distributed.
Now, let us check the assumptions from the previous situation:

1. It is evident that the responses have only two outcomes: “registered


voter” (success) or “not registered voter” (failure). Therefore, the first
assumption is met.

2. To be able to satisfy the second condition, we find the hypothesized


value of the population proportion p = 0.45 while n = 500. To get q, q
= 1 – p which makes q = 1 – 0.45 = 0.55.

Through substitution, it shows that the second assumption is also


met, since:
np ≥ 5 and nq ≥ 5
500 (0.45) ≥ 5 and 500 (0.55) ≥ 5
225 ≥ 5 and 275 ≥ 5

Since we have shown that np ≥ 5 and nq ≥ 5, all conditions are met


where the sample size is truly large enough to use CLT. In this condition,
the test statistic to be used is the z-test statistic for proportions denoted by
Zcom or the computed z-value.
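The check of the second assumption can be sketched as a short function. The name and the `threshold` parameter are illustrative assumptions; the default of 5 follows the condition stated above, and 10 can be passed for the stricter convention.

```python
def normal_approximation_ok(n, p, threshold=5):
    """Return True when both n*p and n*q meet the threshold, where q = 1 - p."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold


# Registered-voter situation: n = 500, p = 0.45, so np = 225 and nq = 275
print(normal_approximation_ok(500, 0.45))
```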

The z-Test Statistic for Population Proportion

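Recall that the z-score formula for a sample mean is

$z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$

With np ≥ 5 and nq ≥ 5, the standard deviation of the sample proportion is

$\sigma_{\hat{p}} = \sqrt{\dfrac{pq}{n}}$

Substituting $\hat{p}$ for $\bar{x}$, $p$ for $\mu_{\bar{x}}$, and
$\sqrt{pq/n}$ for $\sigma_{\bar{x}}$, the formula for the value of the z-test
statistic for population proportion would be:

$z_{com} = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$ or $z_{com} = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$

where:
zcom is the z-test statistic for proportion.
p̂ is the sample proportion (p̂ = x/n).
p is the hypothesized value of the population proportion.
n is the sample size or the number of observations in the sample.
q is equal to 1 – p.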

Remember this formula because you are going to use this in Module
12 where the actual computation for the test statistic involving population
proportion will be held.
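As a preview, the formula can be sketched in a few lines. The function name is an illustrative assumption, and the numbers reuse the registered-voter situation (x = 200, n = 500, hypothesized p = 0.45).

```python
import math


def z_proportion(successes, n, p0):
    """z = (p_hat - p0) / sqrt(p0 * q0 / n), with p_hat = x / n and q0 = 1 - p0."""
    p_hat = successes / n
    q0 = 1 - p0
    return (p_hat - p0) / math.sqrt(p0 * q0 / n)


print(round(z_proportion(200, 500, 0.45), 3))
```

The standard error uses the hypothesized proportion p, not the sample proportion, because the test statistic is computed under the assumption that the null hypothesis is true.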

Statistics and
Probability
Quarter 4 – Module 11:
Identifying Appropriate
Rejection Region Involving
Population Proportion

There are two ways to test the hypothesis: with a p-value approach and
with a critical value approach. Here, we will consider the rejection region with
the critical value approach. The critical value enables us to reject or not the
null hypothesis. Also, it is calculated through alpha ( α ) levels and symbolized
by Z or Ztab.
This is the first statement in Activity 2: “The hypothesis that less than
20% of the population are right-handed” wherein Ha: p < 0.20 and it indicates a
left-tailed rejection region. Illustrating it in the normal curve, we will come up
with the picture below:
[Figure: normal curve with the rejection region (α) shaded to the left of the critical value Ztab and the non-rejection region to its right.]
The illustration above is for you to visualize how the statement would
look like when put into the normal curve. Notice that the line represented by
ztab separates the curve into two regions. The shaded part is the rejection region
while the non-shaded part is the non-rejection region or the acceptance
region/area. Therefore, it is important that we determine the value of ztab or the
critical value. Now, let us proceed!

Let us now describe the following important terms that we will be needing
in our discussion.

Critical Value, ztab


- separates the rejection region from the acceptance region
- derived from the level of significance and expressed as standard
z-values
- symbolized as ztab

We can use the table of critical values for the commonly used levels of
significance presented in the previous modules.

Test Type            Level of Significance
                     𝛼 = 0.01   𝛼 = 0.025   𝛼 = 0.05   𝛼 = 0.10
left-tailed test     −2.33      −1.96       −1.645     −1.28
right-tailed test    2.33       1.96        1.645      1.28
two-tailed test      ±2.575     ±2.24       ±1.96      ±1.645

Level of Significance, 𝜶 (Greek letter, alpha)


- refers to the degree of significance in which we reject or do not reject the
null hypothesis
- the basis for the critical or the rejection region dictated by the alternative
hypothesis

The following are the common values of statistical significance:


➢ 0.01 highly significant
➢ 0.05 statistically significant
➢ 0.10 significant

For instance, if we use the 0.05 level of significance, then the size of
the rejection region is 0.05 or 5%. For α = 0.01, the size of the rejection
region is 1%, and for α = 0.10 it is 10%.

Rejection Region
- the range of the values of the test value which indicates that there is a
significant difference and that the null hypothesis (Ho) should be rejected

Non-Rejection Region
- the range of the values of the test value which indicates that the difference
was statistically insignificant and that we failed to reject the null
hypothesis (Ho)

Illustrative Example 1:
A sample of 100 students is randomly selected from Pinagpala High
School and 18 of them said they are left-handed. Test the hypothesis that less
than 20% of the students are left-handed by using 𝛼 = 0.05 as the level of
significance.

What to do:
a. Identify the level of significance.
b. Formulate the alternative hypothesis, Ha.
c. Determine the critical value, ztab.
d. Illustrate the rejection region in the normal curve.

Solution:
a. The level of significance is 𝛼 = 0.05.
b. The alternative hypothesis is Ha: p < 0.20.
It is one directional or left-tailed as determined by the term “less than”.
c. To determine the critical value using the table, we consider the intersection of
the row for the left-tailed test and the column for 𝛼 = 0.05. Hence, the table tells
us that the critical value is −1.645.
d. Illustrating it under the normal curve gives:

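[Figure: normal curve with the rejection region (𝛼 = 0.05) shaded to the left of
the critical value −1.645 and the non-rejection region to its right. The
horizontal axis is marked −3, −2, −1.645, 0, 1, 2, 3.]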

From here, you will decide whether the null hypothesis will be rejected
or not, although that part will be discussed in the next module.

Illustrative Example 2:
The claim is made that 40% of tax filers use computer software to file
their taxes. In a sample of 50 tax filers, 14 used computer software to file their
taxes. Suppose Ha: p < 0.40 is tested at α = 0.025, where p is the population
proportion of tax filers who use computer software to file their taxes.
Determine the critical value, ztab, and illustrate the rejection region in the
normal curve.

Solution:
At the α = 0.025 level of significance, with Ha: p < 0.40 (left-tailed), the
table of critical values shows that ztab = −1.96.

Illustrating the rejection region, we have

[Figure: normal curve with the rejection region (α = 0.025) shaded to the left of
ztab = −1.96 and the non-rejection region to its right.]

Illustrative Example 3:

In Kalinga Special Education School, a sample of 144 students was chosen and
among them, 48 are diagnosed with Attention Deficit Hyperactivity Disorder
(ADHD). At 𝛼 = 0.01, test the hypothesis that the proportion of ADHD students
in the school is not 0.40.

Note: When a statement does not specify any cue word that describes direction,
then it is non-directional or two-tailed.

What to do:
a. Identify the level of significance.
b. Formulate the alternative hypothesis, Ha: p ≠ po.
c. Determine the critical value.
d. Illustrate the rejection region in the normal curve.

Solution:
a. The level of significance is 𝛼 = 0.01.
b. The alternative hypothesis is Ha: p ≠ 0.40 due to the expression “is not 0.40”.
This explains why it is non-directional or two-tailed.
c. To determine the critical value using the table, we consider the intersection
of the row for the two-tailed test and the column for 𝛼 = 0.01. Hence, the
table tells us that the critical value is ±2.575.
d. Illustrating the rejection region in the normal curve gives:

[Figure: normal curve with a rejection region of area 𝛼/2 = 0.01/2 = 0.005 shaded
in each tail, beyond ztab = −2.575 and ztab = 2.575, and the acceptance region
between them.]

Statistics and
Probability
Quarter 4 – Module 12:
Computing Test Statistic Value
Involving Population Proportion

It is observable that the previously cited situation did not use nor mention
words like “mean” or “average” but “percentage” instead. Also, it utilized count
data. Problems such as this involve population proportion. Inferences
involving proportions are made in the context of probability of “success”, p, in a
binomial distribution.

From the situation that we presented in the above activity, the respondents
have only two possible options for their responses:

Option 1: They own their house. (“success” or p)
Option 2: They do not own their house. (“failure” or q)

To show that the sample size is large enough for the Central Limit Theorem to
apply, we need to satisfy two assumptions. First, the responses have only two
possible outcomes: “owned” (success) or “not owned” (failure). Therefore, the
condition for a binomial experiment is met. Second, we must satisfy the
condition that np ≥ 5 and nq ≥ 5. The hypothesized value of the population
proportion is p = 0.35 and n = 240. Since q = 1 − p, we get q = 1 − 0.35 = 0.65.

Through substitution, we can show that the second condition is also met, since:

np ≥ 5 and nq ≥ 5
240(0.35) ≥ 5 and 240(0.65) ≥ 5
84 ≥ 5 and 156 ≥ 5
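
The substitution above can be checked with a short sketch. The function name
`large_enough` and its `threshold` parameter are hypothetical names chosen for
this illustration; the rule itself (np ≥ 5 and nq ≥ 5) is the one stated in the
text.

```python
def large_enough(n, p, threshold=5):
    """Check the conditions np >= threshold and nq >= threshold, with q = 1 - p."""
    q = 1 - p
    return n * p >= threshold and n * q >= threshold


# Values from the situation above: n = 240 and hypothesized p = 0.35
print(large_enough(240, 0.35))
```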

Since we have shown that np ≥ 5 and nq ≥ 5, all conditions are met and the
sample size is large enough to use the Central Limit Theorem. Under this
condition, the test statistic to be used is the z-test statistic for
proportions, denoted by Zcom or the computed z-value.

Again, the problems presented here contain sample sizes that are large enough
to apply the Central Limit Theorem or CLT. Thus, in solving these problems,
there is no need to show these assumptions.

Z-Test Statistic for Population Proportion

Remember that the formula for the value of the z-test statistic for population
proportion is:

    zcom = (p̂ − p) / √(pq/n)    or    zcom = (p̂ − p) / √(p(1 − p)/n)

where:
    zcom is the z-test statistic for proportion.
    p̂ is the sample proportion (p̂ = x/n).
    p is the hypothesized value of the population proportion.
    n is the sample size or the number of observations in the sample.
    q is equal to 1 − p.

We will use this formula in the examples that follow.

Illustrative Example 1:
Let us now determine the z-value in the situation presented previously. To be
able to solve it, we need to identify first the values of the following:

zcom = ?
p̂ = 78/240 = 0.325
p = 35% = 0.35
n = 240
q = 1 − p = 1 − 0.35 = 0.65

Then, substitute these values in the formula:

zcom = (p̂ − p) / √(pq/n) = (0.325 − 0.35) / √((0.35)(0.65)/240) ≈ −0.812

Therefore, the computed z-value is zcom = −0.812.
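
As a quick check of the substitution, here is a minimal sketch of the z-test
statistic for a proportion. The function name `z_com` is an assumed name for
this illustration, and the sample proportion is taken as 78/240, reading the
78 listed above as the number of successes in the sample of 240.

```python
from math import sqrt


def z_com(p_hat, p, n):
    """z-test statistic for a proportion: (p_hat - p) / sqrt(p * q / n)."""
    q = 1 - p
    return (p_hat - p) / sqrt(p * q / n)


# Illustrative Example 1: 78 successes out of n = 240, hypothesized p = 0.35
z = z_com(78 / 240, 0.35, 240)
print(round(z, 3))
```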


If you are still a bit confused, here is another example.

Illustrative Example 2:
Determine the value of zcom given the following information:
Hypothesized Population Proportion: p = 0.42
Sample Size: n = 150
Sample Proportion: p̂ = 0.45

Solution:

To start your solution, identify first the values of the following:

zcom = ?
p̂ = 0.45
p = 0.42
n = 150
q = 1 − p = 1 − 0.42 = 0.58

Then, substitute these values in the formula:

zcom = (p̂ − p) / √(pq/n) = (0.45 − 0.42) / √((0.42)(0.58)/150)
zcom = 0.7444

Illustrative Example 3:
The claim is made that 40% of tax filers use computer software to file
their taxes. In a sample of 50, 14 used computer software to file their taxes.
We wish to test Ho: p = 0.4 versus Ha: p > 0.4 at α = 0.05, where p is the
population proportion who use computer software to file their taxes. The test
can be carried out using the binomial distribution or using the normal
approximation to the binomial distribution. Determine first the value of zcom.

Solution:
First, determine the value of the following:
zcom = ?
p̂ = 14/50 = 0.28
p = 40% = 0.40
n = 50
q = 1 − p = 1 − 0.40 = 0.60

Then, substitute these values in the formula:

zcom = (p̂ − p) / √(pq/n) = (0.28 − 0.40) / √((0.40)(0.60)/50)

Therefore, the computed z-value is zcom ≈ −1.732.

Statistics and
Probability
Quarter 4 – Module 13:
Drawing Conclusions About Population
Proportion Based on Test

In drawing conclusions, there are two different approaches that you may
apply: the critical value approach (which compares the computed z-value with
the critical value) and the P-value approach.

CRITICAL VALUE APPROACH

In applying the first approach, which is determining the critical value (which you
were already taught in the previous modules), you need to consider the following:

a. Null and Alternative Hypotheses;
b. Level of Significance (α);
c. Computed Test Statistic, Critical Value (including rejection region);
and
d. Decision (whether to reject or fail to reject the null hypothesis (Ho)).

Determine if the test statistic falls in the rejection region. If it does,
reject the null hypothesis. If it does not, do not reject the null
hypothesis.
❖ For a right-tailed test, reject the null hypothesis (Ho) if the computed
z-statistic (zcom) is greater than the tabular value (ztab). For a left-tailed
test, reject Ho if zcom is less than ztab. For a two-tailed test, reject Ho
if zcom is less than the negative tabular value or greater than the
positive tabular value.
❖ If the computed z-statistic (zcom) falls in the rejection region, reject
the null hypothesis (Ho).
❖ If the computed z-statistic (zcom) does not fall in the rejection
region, fail to reject the null hypothesis (Ho).
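
The decision rule can be written as a small function. This is only a sketch:
the name `reject_null` and the tail labels "left", "right", and "two" are
assumptions for illustration, and `z_tab` is the tabular critical value for the
chosen α (its sign is ignored so that either −1.645 or 1.645 may be passed).

```python
def reject_null(z_com, z_tab, tail):
    """Return True when z_com falls in the rejection region.

    tail is "left", "right", or "two" (labels assumed for this sketch).
    """
    bound = abs(z_tab)
    if tail == "left":
        return z_com < -bound
    if tail == "right":
        return z_com > bound
    if tail == "two":
        return abs(z_com) > bound
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Left-tailed test at alpha = 0.05 (z_tab = -1.645) with a hypothetical z_com
print(reject_null(-1.80, -1.645, "left"))
```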

Illustrative Example:

Example 1

a. Ho : p = 0.85
   Ha : p < 0.85
b. Level of Significance: α = 0.01
c. Computed Test Statistic:

   Given: x = 325   p = 0.85   n = 400

   p̂ = x/n = 325/400 ≈ 0.81

   z = (p̂ − p) / √(p(1 − p)/n) = (0.81 − 0.85) / √((0.85)(0.15)/400)

   z = −2.24

   The alternative hypothesis is directional. Hence, a one-tailed test shall
   be used.

   Using the Areas Under the Normal Curve Table, the critical value
   is −2.326 at the α = 0.01 level. The value is negative because the
   alternative hypothesis points to the left tail.

d. DECISION: Since the computed test statistic zcom = −2.24 does not fall
in the rejection region (it is greater than −2.326), fail to reject the null
hypothesis (Ho).

CONCLUSION: Therefore, at the 0.01 level of significance, there is not enough
evidence to conclude that there is a decrease in the number of students who
prefer male rather than female candidates.

P-VALUE APPROACH

What is P-value?
In the critical value approach, a test statistic is compared with a critical
value. In the p-value approach (p-value is short for probability value),
probabilities or areas are compared. The P-value measures how consistent the
sample statistic is with the null hypothesis. A high P-value means that the
sample results are consistent with a true null hypothesis, while a low P-value
means that they are not. If the P-value is small enough, we conclude that the
sample is incompatible with the null hypothesis, and we reject the null
hypothesis for the entire population.

The P-value approach uses the following basic procedure:

1. State the null hypothesis Ho and the alternative hypothesis Ha.
2. Set the level of significance α.
3. Calculate the test statistic.
4. Calculate the p-value.
5. Make a decision. Check whether to reject the null hypothesis by comparing
the p-value to α.
❖ If the p-value < α, then reject Ho. Otherwise, do not reject Ho.
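
Steps 4 and 5 can be sketched in code as follows. The function name `p_value`
and the tail labels are assumptions for this illustration; the areas come from
the standard normal distribution in Python's `statistics.NormalDist`.

```python
from statistics import NormalDist


def p_value(z, tail):
    """Area under the standard normal curve at least as extreme as z.

    tail is "left", "right", or "two" (labels assumed for this sketch).
    """
    cdf = NormalDist().cdf
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


alpha = 0.05
p = p_value(-1.80, "left")  # hypothetical test statistic
print("reject Ho" if p < alpha else "do not reject Ho")
```

Comparing the p-value with α gives the same decision as comparing the test
statistic with the critical value.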
Illustrative Example:
Given:
Ho: p = 0.5     α = 0.05     n = 25,468
Ha: p > 0.5

Solution:

Using the formula:

z = (p̂ − p) / √(p(1 − p)/n)

z = 5.49

The p-value is the area under the standard normal curve to the right of
z = 5.49:

P = P(Z ≥ 5.49) = 0.0000… ≈ 0

CONCLUSION: Because the p-value is smaller than the significance level
α = 0.05, we can reject the null hypothesis. Again, we would
say that there is sufficient evidence to conclude
that boys are more common than girls in the entire
population at the α = 0.05 level.

As should always be the case, the two approaches (critical value approach
and p-value approach) lead to the same conclusion.

OTHER ILLUSTRATIVE EXAMPLES USING TWO-TAILED TEST

Example 1
Given:
a. n = 50
b. α = 0.01 significance level
c. Ho: The proportion of students that want to go to the zoo is 85%.
(Ho: p = 0.85)
Ha: The proportion of students that want to go to the zoo is not 85%.
(Ha: p ≠ 0.85)
d. p-value = 0.7554

DECISION/CONCLUSION: Because the p-value (0.7554) is greater than α (0.01), we
fail to reject the null hypothesis. There is insufficient evidence to suggest
that the proportion of students that want to go to the zoo is not 85%.

Example 2

Given:
a. n = 150
b. α = 0.1 significance level
c. Ho: The proportion of households that have three or more cell phones
is 30%. (Ho: p = 0.3)
Ha: The proportion of households that have three or more cell phones
is different from 30%. (Ha: p ≠ 0.3)
d. p̂ = 0.287
e. zcom = −0.347

[Figure: normal curve with critical values −1.645 and 1.645. The computed value
zcom = −0.347 lies between them, in the non-rejection region.]

DECISION/CONCLUSION: Fail to reject the null hypothesis (Ho). There is
insufficient evidence supporting that the proportion of households with three or
more cell phones is different from 30%.

NOTE:
Conclusions are answers in sentence form which include: 1) whether there
is enough evidence or not (based on the decision); 2) the level of significance; and
3) whether the original claim is supported or rejected.
Conclusions are based on the original claim which may be the null or
alternative hypothesis. The decisions are always based on the null hypothesis.

Decision: Reject Ho ("SUFFICIENT")
   Original claim is Ho ("REJECT"): There is sufficient evidence at the alpha
   level of significance to reject the claim that (insert original claim here).
   Original claim is Ha ("SUPPORT"): There is sufficient evidence at the alpha
   level of significance to support the claim that (insert original claim here).

Decision: Fail to reject Ho ("INSUFFICIENT")
   Original claim is Ho ("REJECT"): There is insufficient evidence at the alpha
   level of significance to reject the claim that (insert original claim here).
   Original claim is Ha ("SUPPORT"): There is insufficient evidence at the alpha
   level of significance to support the claim that (insert original claim here).

NOTE:

If the null hypothesis isn’t rejected, this doesn’t necessarily mean that it’s
true. It simply means that there is not enough evidence to justify rejecting it.

Ideally, the hypothesis-testing procedure leads to the acceptance of Ho when Ho
is true and the rejection of Ho when Ho is false. Unfortunately, since hypothesis
tests are based on sample information, the possibility of errors must be
considered. A Type I error corresponds to rejecting Ho when Ho is actually true,
while a Type II error corresponds to accepting (failing to reject) Ho when Ho
is false.

Statistics and
Probability
Quarter 4 – Module 14:
Solving Problems Involving Test
of Hypothesis on Population

Just like a puzzle, a problem can be approached in different ways before it is
solved. The same is true when solving problems involving tests of hypotheses on
population proportions: you need to follow important steps in order to arrive at
the correct answer.

Here are the five (5) steps in solving problems for a test of hypothesis on
the population proportion.

STEP 1. HYPOTHESES: State the null and alternative hypotheses (either
in sentence/statement form or in symbols).
        Ho : p = po
        Ha : p < po or Ha : p > po or Ha : p ≠ po

STEP 2. LEVEL OF SIGNIFICANCE (α): Choose a level of significance like
the α = 0.01 level.

STEP 3. TEST STATISTIC: Calculate the appropriate test statistic.

    p̂ = x/n

    z = (p̂ − p) / √(pq/n)    or    z = (p̂ − p) / √(p(1 − p)/n)

where: x = number of sample units that possess the characteristic of interest
       p = population proportion
       q = 1 − p
       p̂ = sample proportion
       n = sample size

Remember:
A test statistic is a random variable calculated from a sample. You
can use test statistics to determine whether to reject the null hypothesis
or not. The test statistic compares your data with what is expected under
the null hypothesis. The test statistic is used to calculate the p-value.
A test statistic measures the degree of agreement between a sample
of data and the null hypothesis. Its observed value changes randomly from
one random sample to a different sample. A test statistic contains
information about the data relevant to deciding whether to reject the null
hypothesis or not.

STEP 4. CRITICAL VALUE/P-VALUE: Determine the critical value or p-value.

Remember:
The critical value is the point compared with the test statistic, and the
p-value is the probability compared with α, in order to make the final decision
on whether to reject the null hypothesis or not.
STEP 5. DECISION/CONCLUSION:

➢ The decision will be either to reject or fail to reject the null
hypothesis (Ho).

➢ Draw your conclusion about the population proportion based on
the test statistic value and the rejection region.

❖ For a right-tailed test, reject the null hypothesis (Ho) if the computed
z-statistic (zcom) is greater than the tabular/critical value (ztab). For a
left-tailed test, reject Ho if zcom is less than ztab. For a two-tailed
test, reject Ho if zcom is less than the negative critical value or
greater than the positive critical value.
❖ If the computed z-statistic (zcom) falls in the rejection region,
reject the null hypothesis (Ho).
❖ If the computed z-statistic (zcom) does not fall in the rejection
region, fail to reject the null hypothesis (Ho).

NOTE:

(These conditions were already mentioned in the previous module on
drawing conclusions on population proportions.)

To solve problems involving population proportions, just follow the
5-step procedure mentioned above.

Illustrative Examples

Example 1: Every year, the assigned teachers determine the Body Mass Index
(BMI) of students. In a certain public junior high school, a study
finds that 10% of Grade 7 students observed are underweight. A
sample of 780 Grade 7 students was randomly chosen and it was
found that 125 of them are underweight. Is the proportion of
underweight Grade 7 students different from the claimed 10%? Use
the 0.05 level of significance.

SOLUTION:

STEP 1: State the null and alternative hypotheses.

Ho : p = 0.10
Ha : p ≠ 0.10

STEP 2: Choose a level of significance. α = 0.05

STEP 3: Compute the test statistic.

Given: x = 125   p = 0.10   n = 780

p̂ = x/n = 125/780 ≈ 0.16

z = (p̂ − p) / √(p(1 − p)/n) = (125/780 − 0.10) / √((0.10)(0.90)/780)

zcom = 5.61

STEP 4: Determine the critical value.

NOTE: Since the alternative hypothesis is non-directional, the two-tailed test
shall be used. Divide α by 2, then subtract the quotient from 0.5.

Therefore, 0.05/2 = 0.025 and 0.5 − 0.025 = 0.475.

[Figure: normal curve with a rejection region of area α/2 = 0.025 shaded in each
tail.]

NOTE: Using the Areas Under the Normal Curve Table, the critical values at the
0.05 level of significance are ±1.96.

STEP 5: Make a decision whether to reject or fail to reject the null
hypothesis. Draw a conclusion.

DECISION: Since the computed test statistic zcom = 5.61 is greater than the critical
value 1.96, it falls in the rejection region. Reject the null hypothesis.
CONCLUSION: Therefore, we conclude that at the 0.05 level of significance, there is
enough evidence that the percentage of Grade 7 students who are
underweight is different from 10%.
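
The five steps can be combined into one short routine. The sketch below is an
illustration under stated assumptions, not a prescribed method: the function
name `proportion_z_test` and the tail labels are hypothetical, and the decision
is made with the p-value approach, which gives the same decision as the
critical value approach. It is applied to the Grade 7 data above (x = 125,
n = 780, p = 0.10, α = 0.05, two-tailed).

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alpha, tail):
    """Return (z, p_value, reject) for a one-sample z-test on a proportion.

    tail is "left", "right", or "two" (labels assumed for this sketch).
    """
    p_hat = x / n                                # sample proportion
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # test statistic (Step 3)
    cdf = NormalDist().cdf
    if tail == "left":
        p_val = cdf(z)
    elif tail == "right":
        p_val = 1 - cdf(z)
    elif tail == "two":
        p_val = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p_val, p_val < alpha               # decision (Step 5)


z, p_val, reject = proportion_z_test(125, 780, 0.10, 0.05, "two")
print(round(z, 2), reject)
```

Because the unrounded sample proportion 125/780 is used, the statistic may
differ slightly from a hand computation that rounds p̂ to 0.16.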
Statistics and Probability
Quarter 4 – Module 15:
Illustrating the Nature of Bivariate Data
Data that involve one variable are called univariate data. Univariate data are often
described using the measures of central tendency (mean or average, mode, and median),
variations, or other descriptive statistics. Here are examples of univariate data:

Example: The Department of Health (DOH) recorded the number of infected
COVID-19 cases from April 14 to May 21, 2020 in the Philippines.
Variable involved: number of infected cases

Example: The World Health Organization (WHO) summarized the number of
COVID-19 recoveries around the world.
Variable involved: number of COVID-19 recoveries

Data that involve two variables are called bivariate data. The statistical procedure used
to determine and describe the relationship between two variables is called correlation
analysis.

Example: In Tayabas City public market, a consumer observed that the fewer
the supply of vegetables, the higher the price gets.
Variables involved: supply and price of vegetables

Example: The Quezon provincial government emphasized that limiting the
number of household members going outside to purchase essential goods will
help decrease the rate of COVID-19 infection in the province.
Variables involved: number of household members and rate of COVID-19 infection
Statistics and Probability
Quarter 4 – Module 16:
Constructing a Scatter Plot
A scatter plot (also called a scatter graph, scatter diagram, or scattergram) is a
graphical representation that shows the relationship or the correlation of two
variables of bivariate data.

A scatter plot shows how points collected from a set of bivariate data are scattered on a
Cartesian plane. It gives a good visual picture of how two variables are related or
associated with one another in terms of form, trend, and variation of correlation. The form
of points in the scatter plot determines the shape of the correlation of the variables. The
trend determines the direction of the points, that is, whether the variables have positive,
negative, or no correlation. The variation or strength of correlation is based on the
closeness of the points to a trend line and it determines whether the variables have no,
weak, moderate, strong, or perfect correlation.
In constructing a scatter plot, you should know how to plot points in a
Cartesian plane. The independent variable will assume the values of x or abscissa while
the dependent variable will assume the values of y or ordinate.

Example 1:

The given numbers are the age of a person in years and his/her corresponding weight.

Age of a person (x)   11  12  13  14  15  16  17  18  19  20
Weight (y)            40  42  38  35  45  51  48  48  50  47

Since the weight of an individual depends on his/her age, the independent variable
is the age of the person which is plotted horizontally. The dependent variable is the weight
of the person, which is plotted vertically as shown in the scatter plot below.
Example 2:
A Math teacher conducted a study regarding the performance of Grade 11 students
in General Mathematics. Their average grades were taken at different periods. The
data are given below.

Order of period of the subject   1   2   3   4   5   6   7   8
Average grades                   86  88  84  82  82  81  80  79

From the data given, the independent variable is the order of the period of the
subject and the dependent variable is the average grade. Thus, the order of the
period will be plotted on the x-axis and the grades will be plotted on the y-axis
as illustrated below.

Example 3:
A researcher asked for the weight of 10 students together with the weight of their mother
(biological) and created a scatter plot as presented below.

Weight of mother 65 69 74 78 59 81 76 80 81 75
Weight of student 52 55 62 63 47 66 63 69 68 65

In the given data, the independent variable is the weight of the mother while the
dependent variable is the weight of the student. The scatter plot is presented below.
Statistics and Probability
Quarter 4 – Module 17:
Describing the Shape (Form), Trend
(Direction), and Variation (Strength) Based
on a Scatter Plot

The correlation of the variables can be described in terms of form (shape), trend
(direction), and variation (strength) of the scatter plot. The form of correlation can be
determined by the shape of the points on a scatter plot, categorized as linear or
curvilinear. The form of correlation is linear if the points on the scatter plot follow
the trend of a straight line. The form is curvilinear (non-linear) if the points follow
the trend of a curved line. Sample scatter plots showing curvilinear form of correlation
are given below.
The correlation of variables can also be described in terms of its trend or direction.
The trend of correlation can be positive, negative, or zero/negligible depending on the
direction of the points. The trend of correlation is summarized in the table that follows.

Trend: Positive Correlation
   Direction of the points: The points follow a trend rising from left to right.
   Description: A positive correlation exists when high values of one variable
   correspond to high values of another variable or low values of one variable
   correspond to low values of another variable.

Trend: Negative Correlation
   Direction of the points: The points follow a trend falling from left to right
   (that is, rising from right to left).
   Description: A negative correlation exists when high values of one variable
   correspond to low values of another variable or low values of one variable
   correspond to high values of another variable.

Trend: No Correlation/Negligible Correlation
   Direction of the points: The points are neither rising from left to right nor
   from right to left.
   Description: A negligible correlation exists when high values of one variable
   correspond to either high or low values of another variable.

The closeness of the points around the trend line determines the variation or
strength of the correlation between the variables involved. The closer the points are to
the trend line, the stronger the correlation of the variables is. The strength of
correlation between two variables can be perfect, strong, moderate, weak, or
no/negligible correlation. To summarize the strength of correlation, refer to the table
below.

Strong Positive Correlation: This correlation exists when almost all of the points
are on the line or the points are closely scattered on the trend line that rises
from left to right.

Weak Positive Correlation: Compared to strong positive correlation, the points in this
correlation are scattered a bit far from the trend line that rises from left to right.

No Correlation or Negligible Correlation: The points in this correlation do not
follow any trend line. The points are just scattered around the Cartesian plane.

Weak Negative Correlation: The points in this correlation are scattered a bit far
from the trend line that rises from right to left.

Moderate Negative Correlation: This correlation exists when the points are
moderately scattered around a trend line that rises from right to left.

Strong Negative Correlation: This correlation exists when almost all of the points
are on the line or the points are closely scattered on the trend line that rises
from right to left.

Statistics and Probability


Quarter 4 – Module 18:
Calculating the Pearson’s
Sample Correlation Coefficient

The Pearson’s sample correlation coefficient (also known as Pearson r), denoted by r,
is a test statistic that measures the strength of the linear relationship between two
variables. To find r, the following formula is used:

    r = [n(∑XY) − (∑X)(∑Y)] / √{[n(∑X²) − (∑X)²][n(∑Y²) − (∑Y)²]}

The correlation coefficient (r) is a number between −1 and 1 that describes both
the strength and the direction of correlation. In symbols, we write −1 ≤ r ≤ 1.
Illustrative Example:
Teachers of Pag-asa National High School instilled among their students the value
of time management and excellence in everything they do. The table below shows the time
in hours spent in studying (X) by six Grade 11 students and their scores in a test (Y).
Solve for the Pearson’s sample correlation coefficient r.

X 1 2 3 4 5 6
Y 5 10 10 15 25 30

The next section will guide you on how to compute the Pearson product moment
correlation r.

STEPS AND SOLUTION

1. Construct a table as shown below.

   X    Y     XY    X²    Y²
   1    5
   2    10
   3    10
   4    15
   5    25
   6    30

2. Complete the table.
   a. Multiply entries in the X and Y columns. Put them under the XY column.
   b. Square all the entries in the X column. Put them under the X² column.
   c. Square all the entries in the Y column. Put them under the Y² column.

   X    Y     XY    X²    Y²
   1    5     5     1     25
   2    10    20    4     100
   3    10    30    9     100
   4    15    60    16    225
   5    25    125   25    625
   6    30    180   36    900

3. Get the sums of the columns.
   a. Get the sum of all entries in the X column. This is ∑X.
   b. Get the sum of all entries in the Y column. This is ∑Y.
   c. Get the sum of all entries in the XY column. This is ∑XY.
   d. Get the sum of all entries in the X² column. This is ∑X².
   e. Get the sum of all entries in the Y² column. This is ∑Y².

   X        Y        XY         X²        Y²
   1        5        5          1         25
   2        10       20         4         100
   3        10       30         9         100
   4        15       60         16        225
   5        25       125        25        625
   6        30       180        36        900
   ∑X = 21  ∑Y = 95  ∑XY = 420  ∑X² = 91  ∑Y² = 1,975

4. Substitute the values obtained from Step 3 in the formula. Here n = 6 because
   there are six (6) pairs of values.

   r = [n(∑XY) − (∑X)(∑Y)] / √{[n(∑X²) − (∑X)²][n(∑Y²) − (∑Y)²]}

     = [6(420) − (21)(95)] / √{[6(91) − (21)²][6(1,975) − (95)²]}

     = (2,520 − 1,995) / √{[546 − 441][11,850 − 9,025]}

     = 525 / √[(105)(2,825)]

   You may use your calculator here!

   r ≈ 0.96395 or 0.96

   The value of r is a positive number. Therefore, we can say that there is a
   positive correlation between the hours spent in studying and the scores in
   a test.

Note: For consistency of our answer, round your final answer to two
decimal places.
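
The hand computation can be verified with a short sketch that follows the same
steps: build the column sums, then substitute them in the formula. The function
name `pearson_r` and the variable names are assumptions chosen for this
illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient from the column sums."""
    n = len(xs)                                # number of (X, Y) pairs
    sum_x = sum(xs)                            # ∑X
    sum_y = sum(ys)                            # ∑Y
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # ∑XY
    sum_x2 = sum(x * x for x in xs)            # ∑X²
    sum_y2 = sum(y * y for y in ys)            # ∑Y²
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


hours = [1, 2, 3, 4, 5, 6]         # X: time in hours spent in studying
scores = [5, 10, 10, 15, 25, 30]   # Y: scores in a test
r = pearson_r(hours, scores)
print(round(r, 2))
```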
