
Chapter 7

This document discusses various techniques for summarizing and analyzing quantitative and categorical data. It describes key terms like parameters, statistics, and different types of numerical data. It also covers techniques for summarizing quantitative data such as frequency distributions, histograms, the normal curve, measures of variability, standard deviation, z-scores, and correlation. For categorical data, it discusses frequency tables, bar graphs, pie charts, and crossbreak tables. Examples are provided to illustrate concepts like frequency polygons, box plots, scatterplots, and calculating standard deviation.

Uploaded by

javed765
Copyright
© Attribution Non-Commercial (BY-NC)

Chapter Seven

Data Analysis

Statistics vs. Parameters

A parameter is a characteristic of a population. It is a numerical or graphic way to summarize data obtained from the population.

A statistic is a characteristic of a sample. It is a numerical or graphic way to summarize data obtained from a sample.

Types of Numerical Data

There are two fundamental types of numerical data:


1) Quantitative data: obtained by determining placement on a scale that indicates amount or degree.
2) Categorical data: obtained by determining the frequency of occurrences in each of several categories.

Techniques for Summarizing Quantitative Data


Frequency Distributions
Histograms/Stem and Leaf Plots
Distribution curves
Averages/Spread
Variability/Correlations

Frequency Polygons

A frequency distribution places data in order, listing scores from high to low along with how often each score occurs. When the scores are combined into intervals, the result is a grouped frequency distribution. Because such a table is not very visual, a graphical display called a frequency polygon can help.

Frequency polygons can be negatively or positively skewed. They can be useful in comparing two or more groups.

Example of a Frequency Distribution


Raw Score   Frequency
64          2
63          1
61          2
59          2
56          2
52          1
51          2
38          4
36          3
34          5
31          5
29          5
27          5
25          1
24          2
21          2
17          2
15          1
6           2
3           1
            n = 50

Example of a Grouped Frequency Distribution


Raw Score (Intervals of Five)   Frequency
60-64                           5
55-59                           4
50-54                           3
45-49                           0
40-44                           0
35-39                           7
30-34                           10
25-29                           11
20-24                           4
15-19                           3
10-14                           0
5-9                             2
0-4                             1
                                n = 50
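The step from the ungrouped to the grouped distribution can be sketched in a few lines of Python. This is only an illustration: the function name `group_frequencies` and its parameters are hypothetical, and the interval boundaries (width 5, starting at 0) are taken from the grouped table.

```python
from collections import Counter

# Raw score -> frequency, copied from the frequency distribution example.
raw_frequencies = {
    64: 2, 63: 1, 61: 2, 59: 2, 56: 2, 52: 1, 51: 2, 38: 4, 36: 3, 34: 5,
    31: 5, 29: 5, 27: 5, 25: 1, 24: 2, 21: 2, 17: 2, 15: 1, 6: 2, 3: 1,
}


def group_frequencies(frequencies, width=5, low=0, high=64):
    """Return (interval label, frequency) pairs, highest interval first."""
    grouped = Counter()
    for score, count in frequencies.items():
        # Integer division assigns each score to its interval index.
        grouped[(score - low) // width] += count
    rows = []
    for index in range((high - low) // width, -1, -1):
        start = low + index * width
        # Counter returns 0 for intervals that contain no scores.
        rows.append((f"{start}-{start + width - 1}", grouped[index]))
    return rows


grouped_rows = group_frequencies(raw_frequencies)
for label, count in grouped_rows:
    print(f"{label:>5}  {count}")
```

Empty intervals such as 45-49 are kept with a frequency of 0 so that the gap in the distribution stays visible.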

Example of a Frequency Polygon

Example of a Positively Skewed Polygon

Example of a Negatively Skewed Polygon


Two Frequency Polygons Compared


Histograms

A histogram is a bar graph used to display quantitative data at the interval or ratio level of measurement.


The Normal Curve


A distribution curve shows a generalized distribution of scores as a smooth curve rather than the straight line segments of a frequency polygon. Distributions of data often tend to follow a specific shape called a normal distribution. This distribution is bell shaped, and the following averages can be plotted on it:

Mean ($\bar{X}$): the average of all the scores in a distribution.
Median: the midpoint, that is, the point below and above which 50% of the scores in a distribution fall.
Mode: the most frequent score in a distribution.


The Normal Curve


Example of the Mode, Median and Mean in a Distribution


Raw Score   Frequency
98          1
97          1
91          2
85          1
80          5
77          7
72          5
65          3
64          7
62          10
58          3
45          2
33          1
11          1
5           1
            n = 50

mode = 62; median = 64.5; mean = 66.7
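The three averages for this distribution can be checked with Python's standard `statistics` module. The sketch below expands the frequency table into the 50 individual scores first; the variable names are illustrative only.

```python
import statistics

# Raw score -> frequency, copied from the example distribution (n = 50).
score_frequencies = {
    98: 1, 97: 1, 91: 2, 85: 1, 80: 5, 77: 7, 72: 5, 65: 3,
    64: 7, 62: 10, 58: 3, 45: 2, 33: 1, 11: 1, 5: 1,
}

# Repeat each score as many times as its frequency.
scores = [score for score, freq in score_frequencies.items() for _ in range(freq)]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # midpoint of the 25th and 26th ordered scores
mean = statistics.mean(scores)      # sum of all scores divided by n

print(mode, median, round(mean, 1))
```

With an even number of scores, the median is the average of the two middle scores (64 and 65 here), which is why it is not itself one of the raw scores.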


Variability

Two distributions may have identical means and medians.
Distribution A: 19, 20, 25, 32, 39
Distribution B: 2, 3, 25, 30, 75

The mean in both distributions is 27 and the median is 25, yet the two distributions differ. In distribution A the scores are close together; in distribution B they are much more spread out. The two distributions differ in what statisticians call variability.

Different Distributions Compared


Variability

Variability refers to the extent to which the scores on a quantitative variable in a distribution are spread out.
The range represents the difference between the highest and lowest scores in a distribution.
A five number summary reports the lowest score, the first quartile (25th percentile), the median (50th percentile), the third quartile (75th percentile), and the highest score.
Five number summaries are often portrayed graphically by the use of box plots.
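As a rough illustration, the range and five number summary can be computed for Distributions A and B (19, 20, 25, 32, 39 and 2, 3, 25, 30, 75) from the earlier variability example. Quartile conventions differ between textbooks and software; the `"inclusive"` method of `statistics.quantiles` used here is an assumption, not necessarily the convention the chapter follows, and the function names are hypothetical.

```python
import statistics


def score_range(scores):
    """Difference between the highest and lowest scores."""
    return max(scores) - min(scores)


def five_number_summary(scores):
    """Return (lowest, first quartile, median, third quartile, highest)."""
    ordered = sorted(scores)
    # n=4 splits the data into quarters and returns the three cut points.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], q1, median, q3, ordered[-1])


distribution_a = [19, 20, 25, 32, 39]
distribution_b = [2, 3, 25, 30, 75]

for name, scores in (("A", distribution_a), ("B", distribution_b)):
    print(name, score_range(scores), five_number_summary(scores))
```

Both distributions share a mean of 27 and a median of 25, but the range is 20 for A and 73 for B, which is the difference in spread that a box plot would make visible.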


Box plots
[Figure: box plot marking the lowest score, 1st quartile, median, 3rd quartile, and highest score]


Standard Deviation

The standard deviation (SD) is considered the most useful index of variability. It is a single number that represents the spread of a distribution. If a distribution is normal, then the mean plus or minus 3 SD will encompass about 99.7% of all scores in the distribution.


Calculation of the Standard Deviation of a Distribution


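Raw Score (X)   Mean ($\bar{X}$)   $X - \bar{X}$   $(X - \bar{X})^2$
85              54                 31              961
80              54                 26              676
70              54                 16              256
60              54                 6               36
55              54                 1               1
50              54                 -4              16
45              54                 -9              81
40              54                 -14             196
30              54                 -24             576
25              54                 -29             841

$$\text{Variance } (SD^2) = \frac{\sum (X - \bar{X})^2}{n} = \frac{3640}{10} = 364$$

$$\text{Standard deviation } (SD) = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{364} \approx 19.08$$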

Dividing by n tends to underestimate the population SD when the sample size is small or moderate. For this reason, the sample SD, which divides by n - 1, is used:

$$SD = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
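The calculation above can be reproduced with a short Python sketch that computes the SD both ways, dividing by n and by n - 1. The function names are illustrative; the ten raw scores are the ones in the table.

```python
import math

raw_scores = [85, 80, 70, 60, 55, 50, 45, 40, 30, 25]


def sum_of_squared_deviations(scores):
    """Sum of (X - mean)^2 over all scores."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def sd_dividing_by_n(scores):
    """SD with n in the denominator, as in the worked table."""
    return math.sqrt(sum_of_squared_deviations(scores) / len(scores))


def sd_dividing_by_n_minus_1(scores):
    """Sample SD with n - 1 in the denominator."""
    return math.sqrt(sum_of_squared_deviations(scores) / (len(scores) - 1))


print(sum_of_squared_deviations(raw_scores))           # 3640.0
print(round(sd_dividing_by_n(raw_scores), 2))          # about 19.08
print(round(sd_dividing_by_n_minus_1(raw_scores), 2))  # about 20.11
```

The n - 1 version is always slightly larger; the gap narrows as the number of scores grows.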


Standard Deviations for Boys' and Men's Basketball Teams


Facts about the Normal Distribution

50% of all the observations fall on each side of the mean. (Figure 10.11)
68% of scores fall within 1 SD of the mean in a normal distribution.
27% of the observations fall between 1 and 2 SD from the mean, so about 95% fall within 2 SD of the mean.
99.7% of all scores fall within 3 SD of the mean.
This is often referred to as the 68-95-99.7 rule.
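These percentages can be checked numerically with `statistics.NormalDist` from the Python standard library. The sketch assumes a standard normal distribution (mean 0, SD 1); the helper name `proportion_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def proportion_within(k):
    """Proportion of scores lying within k SD of the mean, on both sides."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_1 = proportion_within(1)
within_2 = proportion_within(2)
within_3 = proportion_within(3)
# Scores between 1 and 2 SD from the mean, counting both tails.
between_1_and_2 = within_2 - within_1

print(round(within_1, 3), round(within_2, 3), round(within_3, 3))
print(round(between_1_and_2, 3))
```

The 27% figure is the difference between the 95% and 68% bands, split evenly between the two sides of the mean.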


Fifty Percent of All Scores in a Normal Curve Fall on Each Side of the Mean


Standard Scores

Standard scores use a common scale to indicate how an individual compares to other individuals in a group.
The simplest form of a standard score is a z score. A z score expresses how far a raw score is from the mean in standard deviation units:

$$z = \frac{\text{raw score} - \text{mean}}{SD}$$

Standard scores provide a better basis for comparing performance on different measures than do raw scores.
A probability is a percent stated in decimal form and refers to the likelihood of an event occurring.
T scores are z scores expressed in a different form: T = (z score x 10) + 50.
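A minimal sketch of the two conversions follows. The function names and the example values (a raw score of 65 on a test with mean 50 and SD 10) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def z_score(raw_score, mean, sd):
    """Distance of a raw score from the mean, in SD units."""
    return (raw_score - mean) / sd


def t_score(z):
    """Re-express a z score on a scale with mean 50 and SD 10."""
    return z * 10 + 50


# Hypothetical example: raw score 65, mean 50, SD 10.
z = z_score(65, mean=50, sd=10)
print(z, t_score(z))  # 1.5 65.0
```

A score exactly at the mean has a z score of 0 and therefore a T score of 50, so T scores avoid the negative values and decimals that z scores produce.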


z scores associated with the normal curve


Comparisons of Raw Scores and z Scores on Two Sets


Test        Raw Score   Mean   SD   z Score   Percentile Rank
Biology     60          50     5    +2        98
Chemistry   80          90     10   -1        16

In which subject is the student doing better, biology or chemistry?


The student's raw score in biology is 2 SD above the mean, whereas the raw score in chemistry is 1 SD below the mean. The raw scores would suggest that the student is doing better in chemistry, but the z scores indicate that the student is actually doing better in biology.
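The z scores and percentile ranks in the table can be reproduced with the sketch below. It assumes that scores on both tests are normally distributed, so that a percentile rank is the area under the normal curve below the z score; the function name is hypothetical.

```python
from statistics import NormalDist

# Test name -> (raw score, mean, SD), copied from the comparison table.
tests = {
    "Biology": (60, 50, 5),
    "Chemistry": (80, 90, 10),
}


def z_and_percentile(raw_score, mean, sd):
    """Return the z score and the rounded percentile rank under a normal curve."""
    z = (raw_score - mean) / sd
    percentile = round(NormalDist().cdf(z) * 100)
    return z, percentile


results = {name: z_and_percentile(*values) for name, values in tests.items()}
for name, (z, percentile) in results.items():
    print(f"{name}: z = {z:+.0f}, percentile rank = {percentile}")
```

The higher raw score (80 in chemistry) ends up with the lower percentile rank because it sits below that test's mean.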


Probabilities under the normal curve

A probability is a percent stated in decimal form and refers to the likelihood of an event occurring.

Probability Areas Between the Mean and Different Z Scores


Examples of Standard Scores


Correlation

Researchers seek to determine whether a relationship exists between two or more quantitative variables.
A scatterplot is a pictorial representation of the relationship between two quantitative variables.
Outliers are scores that deviate or fall considerably outside most of the other scores in a distribution or pattern. They indicate an unusual exception to a general pattern.
Correlation coefficients express the degree of relationship between two sets of scores.
Pearson Product-Moment Correlation Coefficient (known as Pearson r, ranging between -1 and +1). Formula:

$$r = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n \sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n \sum Y_i^2 - \left(\sum Y_i\right)^2\right]}}$$
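The Pearson r formula can be translated term by term into Python, as in the sketch below. The function name and the two small data sets are hypothetical and serve only to show the calculation; they do not come from the chapter.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation using the raw-score formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when one variable has no variability")
    return numerator / denominator


# Hypothetical paired scores for five students.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 1, 4, 3, 5]

print(pearson_r(x_scores, y_scores))          # a strong positive relationship
print(pearson_r(x_scores, [10, 8, 6, 4, 2]))  # a perfect negative relationship
```

When every point falls exactly on a straight line, r reaches +1 or -1; scatter around the line pulls it toward 0.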


Correlation


Scatterplot of Data


Relationship Between Family Cohesiveness and School Achievement in a Hypothetical Group of Students


Examples of Scatterplots


Techniques for Summarizing Categorical Data


The Frequency Table
Bar Graphs and Pie Charts
The Crossbreak Table


Frequency and Percentage of Responses to Questionnaire


Response                    Frequency   Percentage of Total (%)
Lecture                     15          30
Class discussions           10          20
Demonstrations              8           16
Audiovisual presentations   6           12
Seatwork                    5           10
Oral reports                4           8
Library research            2           4
Total                       50          100
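The percentage column is each frequency divided by the total number of responses. A small sketch, with a hypothetical function name and the frequencies from the questionnaire table:

```python
# Response -> frequency, copied from the questionnaire table.
response_frequencies = {
    "Lecture": 15,
    "Class discussions": 10,
    "Demonstrations": 8,
    "Audiovisual presentations": 6,
    "Seatwork": 5,
    "Oral reports": 4,
    "Library research": 2,
}


def percentage_of_total(frequencies):
    """Convert each category frequency into a percentage of all responses."""
    total = sum(frequencies.values())
    return {label: 100 * count / total for label, count in frequencies.items()}


percentages = percentage_of_total(response_frequencies)
for label, percent in percentages.items():
    print(f"{label:<26} {response_frequencies[label]:>3} {percent:>6.0f}%")
```

The same percentages are what a bar graph or pie chart of these responses would display.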


Example of a Bar Graph


Example of Pie Chart


Crossbreak Table (Contingency Table)


                              Male   Female   Total
Junior high school teachers   40     60       100
High school teachers          60     40       100
Total                         100    100      200
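The row and column totals of a crossbreak table follow directly from the cell counts. The sketch below assumes the four cells read as junior high (40 male, 60 female) and high school (60 male, 40 female), which is an interpretation of the table layout; the function name is hypothetical.

```python
from collections import Counter

# (row, column) -> count; the assignment of cells is an assumed reading of the table.
cell_counts = {
    ("Junior high school teachers", "Male"): 40,
    ("Junior high school teachers", "Female"): 60,
    ("High school teachers", "Male"): 60,
    ("High school teachers", "Female"): 40,
}


def marginal_totals(cells):
    """Return row totals, column totals, and the grand total of a crossbreak table."""
    row_totals, column_totals = Counter(), Counter()
    for (row, column), count in cells.items():
        row_totals[row] += count
        column_totals[column] += count
    return row_totals, column_totals, sum(cells.values())


row_totals, column_totals, grand_total = marginal_totals(cell_counts)
print(dict(row_totals))
print(dict(column_totals))
print(grand_total)
```

The grand total is the sum of all four cells, and it should equal both the sum of the row totals and the sum of the column totals.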
