Chi-square-Lesson

The Chi-square test is a statistical method used to assess the association between categorical variables by comparing observed frequencies with expected frequencies. It can be applied in one-variable tests (goodness-of-fit) and two-variable tests (test of independence), requiring that expected frequencies are above 5 and that categories are independent. The document outlines the process for conducting the test, including formulating hypotheses, calculating the Chi-square statistic, and making decisions based on critical values.

Uploaded by

alabamark70
CHI-SQUARE TEST

The Chi-square (χ²) test is similar to tests of correlation in that it assesses the association between variables. It can be used to test associations in one or more groups, and it does this by comparing the actual (observed) numbers in each group with those that would be expected according to theory or simply by chance.

The Chi-square test requires that the data be expressed as frequencies, i.e. numbers in each category; this is the nominal level of measurement. Almost any data can be reduced to categorical or frequency data, but it is not always wise to do this because information is invariably lost in the process. For example, the weight (interval measurement) of each member of a group may be different, but the individuals could be assigned to one of two categories (overweight and underweight) by use of a suitable cut-off point in the data. The data would then be categorical, in that you would have the number of people in the category "overweight" and the number in the category "underweight"; but in doing this the researcher has lost a lot of information about the weight of individuals in the group.

To be reliable, the Chi-square statistic also requires that the expected frequency in each category should not fall below 5; this can cause problems when the sample size is relatively small. Finally, the different categories used must be independent of each other. This means that it must not be possible for data to fall into more than one category. For example, if the effectiveness of two different treatments was being compared and some of the patients were actually receiving both treatments, then Chi-square could not be used for the analysis.

We will now consider a widely used non-parametric test, Chi-square, which we can use with data at the nominal level, that is, data that are classificatory.
For example, suppose we know the frequency with which entering freshman computer science students, when required to purchase a computer for their personal use, select Macintosh computers, IBM computers, or some other brand of computer. We want to know whether there is a difference among the frequencies with which these three brands are selected, or whether the students choose basically equally among the three brands. This is a problem for which we can use the chi-square statistic.

The chi-square statistic is used to compare the observed frequency of some observation (such as the frequency of buying different brands of computers) with an expected frequency (such as buying equal numbers of each brand of computer). The comparison of observed and expected frequencies is used to calculate the value of the chi-square statistic, which in turn can be compared with the distribution of chi-square to make an inference about a statistical problem. The symbol for chi-square and the formula are as follows:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where:
O is the Observed frequency
E is the Expected frequency

The degrees of freedom (df) for the one-dimensional chi-square statistic is:

df = C - 1

Where:
C is the number of categories or levels of the independent variable.

ONE-VARIABLE CHI-SQUARE (GOODNESS-OF-FIT TEST)

We can use the Chi-square statistic to test the distribution of measures over the levels of a variable, to indicate whether the distribution of measures is the same for all levels. This is the first use of the one-variable chi-square test. This test is also referred to as the goodness-of-fit test.

Consider again the example of the frequency with which entering freshmen, when required to purchase a computer for college use, select Macintosh computers, IBM computers, or some other brand of computer. We want to know whether there is a significant difference among the frequencies with which these three brands of computers are selected, or whether the students select equally among the three brands.
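As a rough illustration of the formula and the df = C - 1 rule, the calculation can be sketched in a few lines of Python. The function name and the three counts below are hypothetical and chosen only to make the arithmetic easy to follow; equal expected frequencies are assumed when none are supplied.

```python
def chi_square_goodness_of_fit(observed, expected=None):
    """Return (chi_square, df) for a one-variable chi-square test.

    If `expected` is omitted, the same expected frequency is assumed
    for every category (total count divided by number of categories).
    """
    n_categories = len(observed)
    if expected is None:
        expected = [sum(observed) / n_categories] * n_categories
    # Sum of (O - E)^2 / E over all categories.
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = n_categories - 1  # df = C - 1
    return chi_square, df


# Hypothetical counts for three categories (60 observations in total).
observed = [30, 20, 10]
chi_square, df = chi_square_goodness_of_fit(observed)
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

With these made-up counts each expected frequency is 20, so the three terms are 5, 0 and 5. Passing an explicit `expected` list would cover the case where theory predicts unequal frequencies.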
The data for 100 students are recorded in the table below (the observed frequencies). We have also indicated the expected frequency for each category. Since there are 100 observations and three categories (Macintosh, IBM, and Other), the expected frequency for each category is 100/3, or 33.333. In the third column of the table we have calculated the square of the observed frequency minus the expected frequency, divided by the expected frequency. The sum of the third column is the value of the chi-square statistic.

Brand        Observed Frequency (O)   Expected Frequency (E)   (O - E)²/E
IBM          47                       33.333                   5.604
Macintosh    36                       33.333                   0.213
Other        17                       33.333                   8.003
Total        100                                               13.820 (chi-square)

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

χ² = 5.604 + 0.213 + 8.003 = 13.820

df = C - 1 = 3 - 1 = 2

We can compare the obtained value of chi-square with the critical value for the .05 level with 2 degrees of freedom, obtained from the Appendix Table (Distribution of Chi Square). Looking under the column for .05 and the row for df = 2, we see that the critical value for chi-square is 5.991.

APPLICATION OF THE STATISTICAL TEST

We now have the information we need to complete the six step process for testing statistical hypotheses for our research problem.

1. State the null hypothesis and the alternative hypothesis based on your research question.

H₀: O = E
H₁: O ≠ E

Note: Our null hypothesis, for the chi-square test, states that there are no differences between the observed and the expected frequencies. The alternative hypothesis states that there are significant differences between the observed and expected frequencies.

2. Set the alpha level.

α = .05

Note: As usual we will set our alpha level at .05, so we have 5 chances in 100 of making a type I error.

3. Calculate the value of the appropriate statistic. Also indicate the degrees of freedom for the statistical test if necessary.

χ² = 13.820
df = C - 1 = 2

4. Write the decision rule for rejecting the null hypothesis.

Reject H₀ if χ² >= 5.991.
Note: To write the decision rule we had to know the critical value for chi-square, with an alpha level of .05 and 2 degrees of freedom. We can find it by looking at the Appendix Table and noting the tabled value in the column for the .05 level and the row for 2 df.

5. Write a summary statement based on the decision.

Reject H₀, p < .05

Note: Since our calculated value of χ² (13.820) is greater than 5.991, we reject the null hypothesis and accept the alternative hypothesis.

6. Write a statement of results.

There is a significant difference among the frequencies with which students purchased three different brands of computers.

TWO-VARIABLE CHI-SQUARE (TEST OF INDEPENDENCE)

Now let us consider the case of the two-variable chi-square test, also known as the test of independence.

For example, we may wish to know if there is a significant difference in the frequencies with which males come from small, medium, or large cities as contrasted with females. The two variables we are considering here are hometown size (small, medium, or large) and gender (male or female). Another way of putting our research question is: Is gender independent of size of hometown?

The data for 30 females and 6 males are in the following table.

Frequency with which Males and Females come from Small, Medium, and Large cities

          Small (S)   Medium (M)   Large (L)   Total
Female    10          14           6           30
Male      4           1            1           6
Total     14          15           7           36

The formula for chi-square is the same as before:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where:
O is the Observed frequency, and
E is the Expected frequency.

The degrees of freedom (df) for the two-dimensional chi-square statistic is:

df = (C - 1)(R - 1)

Where:
C is the number of columns or levels of the first variable
R is the number of rows or levels of the second variable.

In the table above we have the observed frequencies (six of them). Now we must calculate the expected frequency for each of the six cells.
For two-variable chi-square we find the expected frequencies with the formula:

Expected Frequency for a Cell = (Column Total × Row Total) / Grand Total

In the table above we can see that the Column Totals are 14 (small), 15 (medium), and 7 (large), while the Row Totals are 30 (female) and 6 (male). The grand total is 36.

Using the formula we can thus find the expected frequency (E) for each cell.

1. E for the S female cell is 14 × 30/36 = 11.667
2. E for the M female cell is 15 × 30/36 = 12.500
3. E for the L female cell is 7 × 30/36 = 5.833
4. E for the S male cell is 14 × 6/36 = 2.333
5. E for the M male cell is 15 × 6/36 = 2.500
6. E for the L male cell is 7 × 6/36 = 1.167

We can put these expected frequencies in our table and also include the values for (O - E)²/E. The sum of all these will of course be the value of chi-square.

          Small (S)                  Medium (M)                 Large (L)                 Total
          O    E        (O-E)²/E     O    E        (O-E)²/E     O    E       (O-E)²/E
Female    10   11.667   0.238        14   12.500   0.180        6    5.833   0.005        30
Male      4    2.333    1.191        1    2.500    0.900        1    1.167   0.024        6
Total     14                         15                         7                         36

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

χ² = 0.238 + 0.180 + 0.005 + 1.191 + 0.900 + 0.024 = 2.538

df = (C - 1)(R - 1) = (3 - 1)(2 - 1) = (2)(1) = 2

APPLICATION OF THE STATISTICAL TEST

We now have the information we need to complete the six step process for testing statistical hypotheses for our research problem.

1. State the null hypothesis and the alternative hypothesis based on your research question.

H₀: O = E
H₁: O ≠ E

2. Set the alpha level.

α = .05

3. Calculate the value of the appropriate statistic. Also indicate the degrees of freedom for the statistical test if necessary.

χ² = 2.538
df = (C - 1)(R - 1) = (2)(1) = 2

4. Write the decision rule for rejecting the null hypothesis.

Reject H₀ if χ² >= 5.991.

Note: To write the decision rule we had to know the critical value for chi-square, with an alpha level of .05 and 2 degrees of freedom. We can find it by looking at the Appendix Table and noting the tabled value in the column for the .05 level and the row for 2 df.

5. Write a summary statement based on the decision.

Fail to reject H₀

Note: Since our calculated value of χ² (2.538) is not greater than 5.991, we fail to reject the null hypothesis and are unable to accept the alternative hypothesis.

6. Write a statement of results.

There is not a significant difference in the frequencies with which males come from small, medium, or large towns as compared with females. The data give no evidence that hometown size depends on gender.

Critical Chi-Square Values Table

df\area   .050       .025       .010
1         3.84146    5.02389    6.63490
2         5.99146    7.37776    9.21034
3         7.81473    9.34840    11.34487
4         9.48773    11.14329   13.27670
5         11.07050   12.83250   15.08627
6         12.59159   14.44938   16.81189
7         14.06714   16.01276   18.47531
8         15.50731   17.53455   20.09024
9         16.91898   19.02277   21.66599
10        18.30704   20.48318   23.20925
11        19.67514   21.92005   24.72497
12        21.02607   23.33666   26.21697
13        22.36203   24.73560   27.68825
14        23.68479   26.11895   29.14124
15        24.99579   27.48839   30.57791
16        26.29623   28.84535   31.99993
17        27.58711   30.19101   33.40866
18        28.86930   31.52638   34.80531
19        30.14353   32.85233   36.19087
20        31.41043   34.16961   37.56623
21        32.67057   35.47888   38.93217
22        33.92444   36.78071   40.28936
23        35.17246   38.07563   41.63840
24        36.41503   39.36408   42.97982
25        37.65248   40.64647   44.31410
26        38.88514   41.92317   45.64168
27        40.11327   43.19451   46.96294
28        41.33714   44.46079   48.27824
29        42.55697   45.72229   49.58788
30        43.77297   46.97924   50.89218
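The two-variable calculation can be checked with a short Python sketch. The function names are hypothetical, and the male counts are taken as 4, 1 and 1, the only values consistent with the stated row total of 6 and column totals of 14, 15 and 7. The critical value is copied from the table above (.050 column, df = 2) rather than computed.

```python
def expected_frequencies(observed):
    """Expected count per cell: column total * row total / grand total."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[c * r / grand_total for c in column_totals] for r in row_totals]


def chi_square_independence(observed):
    """Return (chi_square, df, expected) for a rows-by-columns table."""
    expected = expected_frequencies(observed)
    chi_square = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed[0]) - 1) * (len(observed) - 1)  # (C - 1)(R - 1)
    return chi_square, df, expected


# Observed frequencies: columns are small, medium, large hometowns.
observed = [
    [10, 14, 6],  # female
    [4, 1, 1],    # male (assumed from the row and column totals)
]
CRITICAL_VALUE = 5.99146  # table value for alpha = .05, df = 2

chi_square, df, expected = chi_square_independence(observed)
reject_null = chi_square >= CRITICAL_VALUE
# Count cells that fall below the expected-frequency guideline of 5.
small_cells = sum(e < 5 for row in expected for e in row)

print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_null}")
print(f"cells with expected frequency below 5: {small_cells}")
```

Working from unrounded terms gives a chi-square of about 2.537, against 2.538 when the six rounded terms are added by hand; the decision is the same either way. The sketch also reports three cells with an expected frequency below 5, so by the requirement stated at the start of the lesson this particular result should be read with caution.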
