AP Stats Semester 1 Finals Prep
1.6 Describing the Distribution of a Quantitative Variable
- Descriptions of the distribution of quantitative data: shape, center, and variability (spread) (+ outliers, gaps, clusters, or multiple peaks)
- Outliers for one-variable data: data points that are unusually small or large relative to the rest of the data.
- Skewed
  - Skewed to the right (positive): if the right tail is longer than the left
  - Skewed to the left (negative): if the left tail is longer than the right
  - Symmetric: if the left half is the mirror image of the right half
- Peaks
  - Unimodal: univariate graphs with one main peak
  - Bimodal: graphs with two prominent peaks
  - Uniform: each bar height is almost the same (no prominent peaks)
- A gap is a region of a distribution between two data values where there are no observed data.
- Clusters are concentrations of data usually separated by gaps.
- Descriptive statistics does not attribute properties of a data set to a larger population, but may provide the basis for conjectures for subsequent testing.

Describing distribution (SOCV + Context)
● Shape
○ left/right-skewed, symmetric, unimodal, bimodal, uniform
● Outliers
○ If skewed: 1.5 IQR method
■ Low: < Q1 - 1.5(IQR)
■ High: > Q3 + 1.5(IQR)
○ If symmetric: SD method
■ More than 2 SD above/below the mean
● Center
○ If skewed: median
○ If symmetric: mean
● Variability
○ Range: max - min
○ Standard deviation: $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
■ “The [context] typically varies by [SD] from the mean of [x̄].”
○ Interquartile range (IQR): Q3 - Q1
● Context
● Use “ly” words
○ Approximately, comparatively
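The two outlier rules in the summary above (1.5 IQR for skewed data, 2 SD for symmetric data) can be sketched in Python as follows. This is a minimal illustration, not a required method: the function names and the data are made up, the quartiles are taken as medians of the lower and upper halves (leaving the overall median out when the count is odd, which is an assumption about the convention), and the SD rule uses the sample standard deviation.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of values, the overall median is
    left out of both halves. Needs at least two data values.
    """
    xs = sorted(data)
    half = len(xs) // 2
    return statistics.median(xs[:half]), statistics.median(xs[-half:])

def iqr_outliers(data):
    """Values below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR)."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

def sd_outliers(data):
    """Values more than 2 sample standard deviations from the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample SD (n - 1 divisor), an assumption
    return [x for x in data if abs(x - mean) > 2 * sd]

scores = [2, 3, 4, 5, 6, 7, 8, 30]  # made-up data
print(iqr_outliers(scores))  # [30]
print(sd_outliers(scores))   # [30]
```

Here Q1 = 3.5 and Q3 = 7.5, so the fences are -2.5 and 13.5, and only 30 falls outside them.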
1.7 Summary Statistics for a Quantitative Variable
EU: Graphical representations and statistics allow us to identify and represent key features of data.
- A statistic is a numerical summary of sample data.
- A parameter is a numerical summary of a population.
- Mean: the sum of all the data values divided by the number of values.
  - Sample: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
- Median: the middle value when data are ordered.
  - Even number of data points → any value between the two middle values (usually, the average of the two middle values)
- Q1: the median of the ordered data set from the min to the median
- Q3: the median of the ordered data set from the median to the max
- Q1 and Q3 form the boundaries for the middle 50% of values in an ordered data set.
- The pth percentile is interpreted as the value that has p% of the data less than or equal to it.

● Percentile
○ “[Percentage] of [students] are at or below [value].”
● Cumulative relative frequency
○ Graph reaches 100% at the end
○ “[Percentage] of the [context] had the same or lower [value].”
○ Quartiles on the graph
■ Q1 = 25th percentile
■ Median = 50th percentile
■ Q3 = 75th percentile
○ Steep slope → many values
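As a quick check of the definitions above, here is a small Python sketch that computes the mean, median, quartiles, and the percentile of a value. The data and the helper names are made up for illustration, and the quartile helper assumes the overall median is left out of both halves when the count is odd.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of values, the overall median is
    left out of both halves. Needs at least two data values.
    """
    xs = sorted(data)
    half = len(xs) // 2
    return statistics.median(xs[:half]), statistics.median(xs[-half:])

def percentile_of(data, value):
    """Percent of data values less than or equal to `value`."""
    return 100 * sum(x <= value for x in data) / len(data)

data = [4, 8, 15, 16, 23, 42]  # made-up data

mean = sum(data) / len(data)      # 108 / 6 = 18.0
median = statistics.median(data)  # average of 15 and 16 = 15.5
q1, q3 = quartiles(data)          # medians of [4, 8, 15] and [16, 23, 42]

print(mean, median, q1, q3)
print(percentile_of(data, 16))    # 4 of 6 values are at or below 16
```

With these values, 16 sits at about the 67th percentile because four of the six values are less than or equal to it.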
- Variability
  - Range: difference between the maximum and minimum data values
  - Interquartile range (IQR): the difference between the third and first quartiles: Q3 − Q1
  - Standard deviation: $s_x = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$
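The three measures of variability above can be computed directly from their definitions. The sketch below uses made-up data and a hypothetical helper name, and it compares the hand-written sample standard deviation with Python's `statistics.stdev`, which also uses the n − 1 divisor.

```python
import math
import statistics

def sample_sd(data):
    """Sample standard deviation: sqrt of sum((x_i - x_bar)^2) / (n - 1)."""
    n = len(data)
    x_bar = sum(data) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

data = sorted([2, 4, 4, 4, 5, 5, 7, 9])  # made-up data, n = 8

data_range = max(data) - min(data)       # 9 - 2 = 7

half = len(data) // 2
q1 = statistics.median(data[:half])      # median of [2, 4, 4, 4] = 4
q3 = statistics.median(data[-half:])     # median of [5, 5, 7, 9] = 6
iqr = q3 - q1                            # 6 - 4 = 2

print(data_range, iqr)
print(sample_sd(data), statistics.stdev(data))  # the two values agree
```

The mean is 5 and the squared deviations sum to 32, so the sample standard deviation is sqrt(32 / 7), roughly 2.14.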
EU: The normal distribution can be used to represent some population distributions.
- A normal curve (approximately normal) is mound-shaped and symmetric
- Parameters: population mean (µ) and population standard deviation (σ)
- Empirical Rule (approximate, for normal distributions)
  - 68% of observations are within 1 standard deviation of the mean
  - 95% of observations are within 2 standard deviations of the mean
  - 99.7% of observations are within 3 standard deviations of the mean

● z-score
○ “[Context] is [z-score] standard deviations above/below the mean.”
● Linear Transformation of Data

                 Shape   Center   Variability
Add (+a)         same    +a       same
Subtract (-a)    same    -a       same
Multiply (×a)    same    ×a       ×a
Divide (÷a)      same    ÷a       ÷a
Standardize      same    0        1

(The multiply and divide rows assume a > 0.)
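The linear transformation rules above can be checked numerically. The following sketch uses made-up data and a hypothetical positive constant `a`; it compares the mean and sample standard deviation before and after adding, multiplying, and standardizing.

```python
import statistics

def center_and_spread(values):
    """Return (mean, sample standard deviation) of the values."""
    return statistics.mean(values), statistics.stdev(values)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
a = 3.0  # hypothetical positive constant

mean, sd = center_and_spread(data)

# Adding a shifts the center by a and leaves the spread unchanged.
added = center_and_spread([x + a for x in data])

# Multiplying by a (a > 0) multiplies both center and spread by a.
multiplied = center_and_spread([x * a for x in data])

# Standardizing (subtract the mean, divide by the SD) gives mean 0 and SD 1.
standardized = center_and_spread([(x - mean) / sd for x in data])

print("original:    ", mean, sd)
print("add a:       ", added)
print("multiply a:  ", multiplied)
print("standardized:", standardized)
```

Because of floating-point rounding, the standardized mean and SD may print as values extremely close to 0 and 1 rather than exactly 0 and 1.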
- z-score: measures how many standard deviations a data value is from the mean
  - $z = \frac{x_i - \mu}{\sigma}$
- Percentiles and z-scores may be used to compare relative positions of points within a data set or between data sets.

How to find proportion (given a boundary value)
1. Find z-score value (2 decimal places)
2. Draw a normal distribution
   a. N(µ, σ)
3. Use a table / normal CDF
4. Find proportion (4 decimal places)

How to find boundary value (given a proportion)
1. Find z-score value (2 decimal places)
2. Draw a normal distribution
   a. N(µ, σ)
3. Use inverse normal CDF
4. Find boundary
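The two procedures above can be sketched with `statistics.NormalDist` from the Python standard library (Python 3.8 or later), which plays the role of the table or calculator. The distribution N(100, 15), the boundary value 115, and the proportion 0.90 are made-up numbers for illustration only. The last lines also check the Empirical Rule percentages.

```python
from statistics import NormalDist

mu, sigma = 100, 15        # hypothetical N(µ, σ)
standard = NormalDist()    # standard normal: mean 0, SD 1

# Proportion from a boundary value: z-score first, then normal CDF.
x = 115
z = round((x - mu) / sigma, 2)              # z-score to 2 decimal places
proportion_below = round(standard.cdf(z), 4)  # proportion to 4 decimal places
print(z, proportion_below)                  # 1.0 0.8413

# Boundary value from a proportion: inverse normal CDF, then un-standardize.
p = 0.90
z_boundary = round(standard.inv_cdf(p), 2)  # 1.28
boundary = mu + z_boundary * sigma          # about 119.2
print(z_boundary, boundary)

# Empirical Rule check: proportion within k standard deviations of the mean.
for k in (1, 2, 3):
    within = standard.cdf(k) - standard.cdf(-k)
    print(k, round(within, 4))              # 0.6827, 0.9545, 0.9973
```

Rounding z to two decimal places mirrors table lookup; calling `NormalDist(mu, sigma).cdf(x)` or `.inv_cdf(p)` directly skips the rounding and can give slightly different final digits.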