Mat220semesterproject Templatesu23-1
Mortality Data
Analysis Presentation
Subtitle: Semester Project
Name: Taylor Doedman
Class: MAT220-03
Date: 04/25/2025
Abstract
This project investigates the 2013 Melanoma Mortality dataset using a
systematic sample of 35 values. The sample was constructed by
selecting every 2nd value, starting at the 2nd position, from the 89-value
dataset. The principal statistical measures are a frequency distribution
with 10 classes, measures of central tendency, the standard deviation, a 90%
confidence interval, and a hypothesis test. Results show a positively
skewed distribution with a mean of 84.4, a median of 55, and notable
variability due to outliers, indicating regional disparities in mortality
rates.
Sample Set
1) The data was scanned row by row to produce a list that had 89 values. A
systematic sample was then taken by selecting every 2nd value beginning at
the 2nd position, which produced 35 values.
2) Sample Values:
294, 23, 93, 103, 245, 56, 55, 40, 27, 6, 151, 18, 37, 44, 42, 109, 47, 314, 13, 95, 24, 13, 51, 72,
55, 15, 13, 48, 85, 142, 21, 180, 189, 144, 90
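The selection rule in step 1 can be sketched in Python. This is a minimal illustration, not the procedure actually used for the project: the function name `systematic_sample` and the short `population` list are hypothetical stand-ins, not the melanoma data.

```python
def systematic_sample(values, start, step):
    """Return every `step`-th value, beginning at the 1-based position `start`."""
    if start < 1 or step < 1:
        raise ValueError("start and step must be positive")
    # Convert the 1-based starting position to a 0-based slice index.
    return values[start - 1::step]


# Hypothetical stand-in list, NOT the melanoma data.
population = [10, 11, 12, 13, 14, 15, 16, 17, 18]
sample = systematic_sample(population, start=2, step=2)
print(sample)
```

Slicing keeps the values in their scanned order, which matches reading the list row by row.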
Frequency Distribution
A frequency distribution was created with 10 classes. The range (314 - 6 = 308) divided by 10 is 30.8, which rounds up to a class width of 31.
Frequency Table:
Class Frequency Relative Frequency
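6 - 36 10 0.2857
37 - 67 10 0.2857
68 - 98 5 0.1429
99 - 129 2 0.0571
130 - 160 3 0.0857
161 - 191 2 0.0571
192 - 222 0 0.0000
223 - 253 1 0.0286
254 - 284 0 0.0000
285 - 315 2 0.0571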
The table shows that the data are positively skewed: more than half the values (20 of 35, or
57.14%) fall in the two lowest classes (6-36 and 37-67). Larger values are scattered, with only a few
extreme values (245, 294, 314) in the highest classes. Most areas therefore have fairly low
melanoma death rates while a few have much higher rates, a pattern that was not apparent before
the data were grouped into classes.
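As a rough cross-check of the table, the class counts can be tallied in Python. The sketch assumes the 35 values are those entered into list L1 for the mean calculation and that the class width is the next whole number above range / 10; the variable names are illustrative.

```python
# Sample values as summed on the Descriptive Statistics slide (assumed to be list L1).
sample = [
    294, 23, 93, 103, 245, 56, 55, 40, 27, 6, 151, 18, 37, 44, 42, 109, 47, 314,
    13, 95, 24, 13, 51, 72, 55, 15, 13, 48, 85, 142, 21, 180, 189, 144, 90,
]

num_classes = 10
low, high = min(sample), max(sample)
# Next whole number above (high - low) / num_classes, so the maximum fits in the last class.
width = (high - low) // num_classes + 1

counts = [0] * num_classes
for x in sample:
    counts[(x - low) // width] += 1

for i, count in enumerate(counts):
    lower = low + i * width
    upper = lower + width - 1
    print(f"{lower:>3} - {upper:<3} {count:>2} {count / len(sample):.4f}")
```

Each printed row gives the class limits, the frequency, and the relative frequency (frequency / 35).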
Descriptive Statistics
Using the 35 sample values, I calculated the following descriptive statistics, mirroring a TI-84 calculator's 1-Var Stats function
with the data entered into list L1.
Mean (x̄):
Sum = 294 + 23 + 93 + 103 + 245 + 56 + 55 + 40 + 27 + 6 + 151 + 18 + 37 + 44 + 42 + 109 + 47 + 314 + 13 + 95 + 24 + 13 +
51 + 72 + 55 + 15 + 13 + 48 + 85 + 142 + 21 + 180 + 189 + 144 + 90 = 2954
x̄ = 2954 / 35 = 84.4
Standard Deviation (Sx):
Sx = √[Σ(xᵢ - x̄)² / (n - 1)]
• x̄ = 84.4, n = 35, so n - 1 = 34
• Compute Σ(xᵢ - x̄)²:
(294-84.4)² + (23-84.4)² + … + (90-84.4)²
= 209.6² + (-61.4)² + 8.6² + 18.6² + 160.6² + … + 5.6²
= 43932.16 + 3769.96 + 73.96 + 345.96 + 25792.36 + 806.56 + 864.36 + 1971.36 + 3294.76 + 6146.56 + 4435.56 + 4408.96
+ 2246.76 + 1632.16 + 1797.76 + 605.16 + 1398.76 + 52716.16 + 5097.96 + 112.36 + 3648.16 + 5097.96 + 1115.56 + 153.76
+ 864.36 + 4816.36 + 5097.96 + 1324.96 + 0.36 + 3317.76 + 4019.56 + 9139.36 + 10941.16 + 3552.16 + 31.36
Sum = 214570.4
• Sx = √(214570.4 / 34) = √6310.8941 ≈ 79.4411 (rounded to 4 decimals)
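The 1-Var Stats arithmetic above can be reproduced with a short Python sketch. It assumes the `sample` list matches the values entered into L1 and uses the same n - 1 formula for Sx; the variable names are illustrative.

```python
import statistics

# Sample values as listed in the Sum line above (assumed to be list L1).
sample = [
    294, 23, 93, 103, 245, 56, 55, 40, 27, 6, 151, 18, 37, 44, 42, 109, 47, 314,
    13, 95, 24, 13, 51, 72, 55, 15, 13, 48, 85, 142, 21, 180, 189, 144, 90,
]

n = len(sample)
mean = sum(sample) / n

# Sum of squared deviations from the mean, then the sample standard deviation.
squared_deviations = sum((x - mean) ** 2 for x in sample)
sx = (squared_deviations / (n - 1)) ** 0.5

# Cross-check the hand formula against the standard library.
library_sx = statistics.stdev(sample)

print(f"n = {n}, sum = {sum(sample)}, mean = {mean:.1f}")
print(f"sum of squared deviations = {squared_deviations:.1f}")
print(f"Sx = {sx:.4f} (statistics.stdev gives {library_sx:.4f})")
```

Comparing the hand formula with `statistics.stdev` is a cheap way to catch an arithmetic slip in the long sum of squared deviations.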
Measures of Center
Using the sample data:
• Mean: 84.4
• Median: 55
• Mode: 13 (appears three times)
The median (55) is the best measure of center because the distribution
is right-skewed: the mean (84.4) is pulled above the median by outliers
such as 245, 294, and 314.
5 Number Summary & IQR
• Minimum: 6
• Q1: 24
• Median: 55
• Q3: 109
• Maximum: 314
IQR:
IQR = Q3 - Q1 = 109 - 24 = 85
Outliers:
• Lower fence: 24 - 1.5 × 85 = -103.5 (no values fall below it; the minimum is 6)
• Upper fence: 109 + 1.5 × 85 = 236.5
• Outliers: 245, 294, 314 (values > 236.5)
• These values exceed the upper fence (236.5), indicating very high mortality rates. The 5-number
summary shows a wide overall spread (range = 308), while the IQR (85) shows that the middle 50% of
values is much more tightly grouped. Most areas have moderate mortality rates, but a few have
extreme values, pointing to regional contrasts. The outliers could reflect regional
factors such as greater UV exposure or limited access to medical care.
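The quartiles and fences can be checked with a short Python sketch. It assumes the TI-84 convention of taking Q1 and Q3 as the medians of the lower and upper halves, with the overall median excluded when n is odd; other software may use different quartile rules and give slightly different values.

```python
import statistics

# Sample values as summed on the Descriptive Statistics slide (assumed to be list L1).
sample = [
    294, 23, 93, 103, 245, 56, 55, 40, 27, 6, 151, 18, 37, 44, 42, 109, 47, 314,
    13, 95, 24, 13, 51, 72, 55, 15, 13, 48, 85, 142, 21, 180, 189, 144, 90,
]

data = sorted(sample)
n = len(data)
half = n // 2

# Split around the median; drop the middle value itself when n is odd.
lower_half = data[:half]
upper_half = data[half + 1:] if n % 2 else data[half:]

q1 = statistics.median(lower_half)
med = statistics.median(data)
q3 = statistics.median(upper_half)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("5-number summary:", data[0], q1, med, q3, data[-1])
print("IQR:", iqr, "fences:", lower_fence, upper_fence)
print("Outliers:", outliers)
```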
Standard Deviation
90% Confidence Interval for the
Population Mean
Using TI-84 TInterval:
• x̄ = 84.4, Sx = 79.4411, n = 35, C-Level = 0.90
• t-critical (df = 34) ≈ 1.691
• Margin of Error: 1.691 × (79.4411 / √35) ≈ 22.71
• CI: 84.4 ± 22.71 = (61.69, 107.11)
• We are 90% confident the population mean lies between 61.69 and 107.11. The
t-distribution is used because the population standard deviation is unknown.
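A minimal Python sketch of the TInterval arithmetic follows. The standard library has no t-quantile function, so the critical value 1.691 is copied from the slide rather than computed, which is an assumption of this sketch.

```python
import math
import statistics

# Sample values as summed on the Descriptive Statistics slide (assumed to be list L1).
sample = [
    294, 23, 93, 103, 245, 56, 55, 40, 27, 6, 151, 18, 37, 44, 42, 109, 47, 314,
    13, 95, 24, 13, 51, 72, 55, 15, 13, 48, 85, 142, 21, 180, 189, 144, 90,
]

n = len(sample)
mean = statistics.mean(sample)
sx = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Assumption: t-critical for df = 34 at 90% confidence, taken from the slide.
t_critical = 1.691

standard_error = sx / math.sqrt(n)
margin = t_critical * standard_error
lower, upper = mean - margin, mean + margin

print(f"standard error = {standard_error:.3f}")
print(f"margin of error = {margin:.2f}")
print(f"90% CI = ({lower:.2f}, {upper:.2f})")
```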
Hypothesis Test
• Null Hypothesis (H₀): μ = 88
• Alternative Hypothesis (Hₐ): μ ≠ 88
Using TI-84 T-Test with data in L1:
• μ₀ = 88, x̄ = 84.4, Sx = 79.4411, n = 35
• Test statistic: t = (84.4 - 88) / (79.4411 / √35) = -3.6 / 13.428 ≈ -0.268
• Degrees of freedom: df = 35 - 1 = 34
• p-value (two-tailed, t ≈ -0.268) ≈ 0.790 (from t-distribution table or
calculator)
• Since the p-value (0.790) is greater than α = 0.05, we fail to reject the null
hypothesis. There is insufficient evidence to conclude that the
true population mean of 2013 melanoma mortality rates differs from 88.
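The T-Test can be sketched in Python as well. Since the standard library has no t-distribution CDF, the sketch approximates the two-tailed p-value by integrating the t density with Simpson's rule; the helper names `t_pdf` and `two_tailed_p` are hypothetical, and a calculator or statistics package should be treated as the reference.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|) as 1 minus the area between -|t| and |t|."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # the density is symmetric about 0
    return 1 - central_area


# Sample values as summed on the Descriptive Statistics slide (assumed to be list L1).
sample = [
    294, 23, 93, 103, 245, 56, 55, 40, 27, 6, 151, 18, 37, 44, 42, 109, 47, 314,
    13, 95, 24, 13, 51, 72, 55, 15, 13, 48, 85, 142, 21, 180, 189, 144, 90,
]

mu_0 = 88      # hypothesized population mean (H0)
alpha = 0.05   # significance level used on the slide

n = len(sample)
df = n - 1
t_stat = (statistics.mean(sample) - mu_0) / (statistics.stdev(sample) / math.sqrt(n))
p_value = two_tailed_p(t_stat, df)

print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Integrating only the central region works well for small |t| like this one; for test statistics far in the tails, a dedicated CDF routine would be more accurate.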
Conclusions
The data are skewed to the right (mean 84.4 > median 55) and highly
variable (Sx ≈ 79.44), largely because of three outliers (245, 294, 314). The
90% confidence interval (61.69, 107.11) contains 88, and the hypothesis test
(p ≈ 0.790) fails to reject H₀, so the sample is consistent with a population
mean of 88 but does not prove it. The regional disparities in mortality rates
suggest that more research into contributing factors, such as UV exposure or
access to health care, is needed.