EXP5
SCORE
SECTION/SCHEDULE: ECA3 - FRIDAY 11:00 AM - 2:00 PM
INSTRUCTOR: DR. VICTOR HAFALLA JR.
EXPERIMENT NO. 5
Chi-square Test
INTRODUCTION
The chi-square goodness-of-fit test evaluates whether the distribution of frequencies within k categories of a
single variable is the same as in the theoretical distribution. The term ‘goodness-of-fit’ refers to how well the
observed (sample) frequencies ‘fit’ the expected (theoretical) frequencies.
A goodness-of-fit test between the observed and expected frequencies is based on the quantity:
\[ \chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i} \]
where oi and ei are the observed and expected frequencies of the ith cell, respectively. The χ2 is a value of a random
variable whose sampling distribution is approximated by the Chi-squared distribution with v=k-1 degrees of freedom,
where k is the number of paired cells.
The null hypothesis is Ho: o=e (the observed frequencies are equal to the expected frequencies) while the
alternative is Ha: o≠e (the observed frequencies are not equal to the expected frequencies). The critical region will
lie on the right tail of the chi-squared distribution. Thus, at significance level α, we reject Ho if
\[ \chi^2 > \chi^2_{\alpha} \]
with v=k-1 degrees of freedom.
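The chi-square may also be used to test the independence of two categorical variables using the same
statistic. For a 2 × 2 contingency table, Yates' correction factor is applied:
\[ \chi^2_{\text{corrected}} = \sum_{i} \frac{(|o_i - e_i| - 0.5)^2}{e_i} \]
For expected frequencies less than 5, the Fisher-Irwin exact test should be used instead of Yates' correction
factor.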
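Yates' correction subtracts 0.5 from each absolute difference |oi - ei| before squaring. The sketch below illustrates this on a 2 × 2 table; the counts are hypothetical and chosen only for illustration, they are not part of this experiment.

```python
# Yates-corrected chi-square for a 2 x 2 table.
# NOTE: these counts are hypothetical, used only to illustrate the correction.
observed = [
    [12, 8],
    [6, 14],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Flatten the table into (observed, expected) pairs, one per cell.
cells = [
    (o, e)
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
]

uncorrected = sum((o - e) ** 2 / e for o, e in cells)
corrected = sum((abs(o - e) - 0.5) ** 2 / e for o, e in cells)

print(f"uncorrected chi-square = {uncorrected:.3f}")
print(f"Yates-corrected chi-square = {corrected:.3f}")
```

The corrected value is smaller than the uncorrected one, which makes the test more conservative for small tables.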
In this laboratory experiment, you will be able to analyze experiments involving frequency data using SPSS.
I. LEARNING OUTCOMES
● Google Documents
SCORE:_________
III. PROCEDURE
1. The data pertain to a die tossed 120 times and the recorded outcomes. Test whether the die is balanced or
not using Chi-square goodness of fit test and SPSS. Properly type the following data in SPSS/statistics
software/statistics website. Print all relevant outputs of the analysis.
Table 1. Outcomes of die tossed 120 times
X    1   2   3   4   5   6
oi   19  18  22  25  16  20
ei   20  20  20  20  20  20
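As a hand check independent of the SPSS output, the goodness-of-fit statistic for Table 1 can be computed directly from the definition. This is a minimal sketch in plain Python; the 0.05 significance level and the tabulated critical value for v = 5 are assumptions, since the procedure does not state a significance level.

```python
# Goodness-of-fit check for the die data in Table 1 (plain Python, no SPSS).
observed = [19, 18, 22, 25, 16, 20]   # oi from Table 1
expected = [20, 20, 20, 20, 20, 20]   # ei for a balanced die: 120 tosses / 6 faces

# Sum of (oi - ei)^2 / ei over the k = 6 cells.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1

# Tabulated chi-squared critical value for alpha = 0.05 and v = 5.
# ASSUMPTION: alpha = 0.05 is not given in the procedure.
CRITICAL_VALUE_05_DF5 = 11.070

reject_h0 = chi_square > CRITICAL_VALUE_05_DF5
print(f"chi-square = {chi_square:.3f}, v = {degrees_of_freedom}, reject Ho: {reject_h0}")
```

Under these assumptions the statistic falls well below the critical value, so Ho (the die is balanced) would not be rejected.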
0.942. The Linear-by-Linear Association shows no significant trend, with a p-value of 0.471. With 240 valid
cases and no expected cell counts less than 5, the assumptions for the Chi-square test are met, but the data
2. The following data pertain to a time-bound study to determine whether the incidence of traffic jams at a
busy highway intersection depends on the type of vehicles traversing the said intersection.
Set up a Chi-square test of independence of the aforementioned variables using SPSS/other statistics apps.
Print/paste all relevant outputs and interpret the results.
Table 2. Incidence of Traffic vs. Frequency of Vehicles
Incidence of      Frequency of Vehicles Traversing the Intersection
Traffic (min)     Jeepney   SUV   Truck   Private Cars
0-10              210       169   112     210
11-20             311       259   147     224
21-30             289       245   198     196
More than 30      322       265   221     249
Were there expected cell counts below 5? (2 pts.) None; the minimum expected count is 131.04.
Pearson Chi-square: (2 pts.) 30.977
p-value: (2 pts.) .000 (as displayed by SPSS, i.e., p < .001)
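The reported values can be cross-checked by computing the test of independence for Table 2 by hand. The sketch below uses plain Python; the 0.05 significance level and the tabulated critical value for v = 9 are assumptions, not part of the SPSS output.

```python
# Chi-square test of independence for Table 2 (plain Python, no SPSS).
# Rows: incidence of traffic (0-10, 11-20, 21-30, more than 30 min).
# Columns: Jeepney, SUV, Truck, Private Cars.
observed = [
    [210, 169, 112, 210],
    [311, 259, 147, 224],
    [289, 245, 198, 196],
    [322, 265, 221, 249],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
min_expected = min(min(row) for row in expected)

chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)

# Tabulated chi-squared critical value for alpha = 0.05 and v = 9.
# ASSUMPTION: alpha = 0.05 is not stated in the procedure.
CRITICAL_VALUE_05_DF9 = 16.919

print(f"minimum expected count = {min_expected:.2f}")
print(f"chi-square = {chi_square:.3f}, v = {degrees_of_freedom}")
print(f"reject Ho of independence: {chi_square > CRITICAL_VALUE_05_DF9}")
```

The minimum expected count and the Pearson statistic agree with the values reported above, and the statistic exceeds the assumed critical value.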
a significant relationship between the Incidence of traffic and the occurrence of accidents on the
intersection.
SCORE:_________
3. Troubleshooting/Reflection
a. What menu and submenus did you click to do a Chi-square goodness-of-fit test? (2 pts.)
Data
- Weight Cases
Analyze
- Descriptive Statistics
- Crosstabs
● Statistics (Chi-square)
● Cells (Rows, Columns, Total)
SCORE:_________
Also, what menu and submenus did you click to do a Chi-square test of independence of categorical
variables? (2 pts.)
Data
Analyze
- Descriptive Statistics
Crosstabs
- Row
- Column
Statistics
- Chi-square
assessing non-frequency data since they maintain the data's integrity and yield more precise results regarding the
relationships between the variables. In the end, a thorough understanding of the data type is necessary for efficient
statistical analysis.
(5 pts.)
SCORE:_________
4. QUESTIONS
1. What do we usually do when cell frequencies fall below 5? ( 5 pts)
When cell frequencies fall below a certain threshold, typically below 5, several strategies can be employed to
address the issue. One common approach is to combine categories, which involves merging rows or columns in the
dataset to increase the expected frequency in each cell. This helps ensure that the statistical analysis remains valid.
Alternatively, researchers may opt to use Fisher's Exact Test, particularly for small sample sizes, as it is more
appropriate than the chi-square test when dealing with low expected frequencies. Another effective solution is to
increase the sample size by collecting more data, which can help ensure that all cells meet the minimum frequency
requirement. By implementing these strategies, researchers can maintain the integrity of their statistical analyses and
2. React on the statement, “Chi-square test is a non-parametric test”. (10 pts.)
The statement “Chi-square test is a non-parametric test” is accurate and highlights an important characteristic
of this statistical method. Non-parametric tests, such as the chi-square test, do not assume a specific distribution for
the data, making them particularly useful when the data does not meet the assumptions required for parametric tests,
like normality. The chi-square test is commonly employed to assess relationships between categorical variables,
allowing researchers to analyze a wide range of datasets flexibly. One of its key advantages is its robustness; it can
be applied to nominal and ordinal data and can be used even when the underlying data are skewed.
However, it’s important to note that the chi-square test does have limitations, such as requiring sufficient sample sizes
to ensure that expected frequencies in each cell are adequate, typically at least 5. Overall, recognizing the chi-square
test as a non-parametric test underscores its versatility and applicability in statistical analysis involving categorical
data.
SCORE:_________
Criteria, with descriptors for scores 5 / 3 / 1 and RATING
Group Interaction (RATING: /5)
  5: Initiates the performance of the laboratory activity. Discusses with group members.
  3: Sometimes observes the group members perform the laboratory activity. Sometimes discusses with group members.
  1: Allows the group members to complete the laboratory activity. Does not interact with group members.
Data collection (RATING: /5)
  5: Follows and interprets the procedure to collect data.
  3: Follows the procedure and asks questions to collect data.
  1: Does not follow the procedure and asks questions to collect data.
Data Presentation (RATING: /5)
  5: All data can be easily understood and interpreted.
  3: Some data are hardly understood and interpreted.
  1: The data cannot be understood and interpreted.
TOTAL SCORE /84