Engineering Data Analysis
Engineering Data Analysis
COLLEGE OF ENGINEERING
Page 0 of 11
Republic of the Philippines
CAMARINES NORTE STATE COLLEGE
F. PimentelAvenue, Brgy. 2, Daet, Camarines Norte
– 4600, Philippines
COLLEGE OF ENGINEERING
1) T-test
2) Types of t-tests
3) Calculation for t-test
3.1) Formula for t-tests
3.2) Formula for standard deviation
4) Testing for Hypothesis
4.1) Proving the Hypothesis
4.2) Critical t-value
4.3) p-value
4.4) Example of reading for Critical t-value
And p-value
INTRODUCTION
In this exploration of the t-test, we will unravel its principles, applications, and underlying
assumptions. We will delve into comprehending how this statistical method empowers us to
discern meaningful patterns in the midst of data variability. By the end of this learning material,
you will not only grasp the mechanics of the t-test but also be equipped to apply this knowledge to
make informed decisions in your own analytical endeavors.
Page 1 of 11
Republic of the Philippines
CAMARINES NORTE STATE COLLEGE
F. PimentelAvenue, Brgy. 2, Daet, Camarines Norte
– 4600, Philippines
COLLEGE OF ENGINEERING
1. T-TEST
WHAT IS A T-TEST
The t – test is one of the tools you can use as a procedure in statistical testing. It
analyzes whether there is a significant difference between mean values of two groups. A t-test
(also known as Student's t-test) is a tool for evaluating the means of one or two populations using
hypothesis testing.
In context, the t-test being capable being dependent or independent refers to its
relationship between the samples compared. In this way, these terms describe the design of the
study and the nature of how the data is collected.
Not only does it extend its applicability across different fields, but in summary the t-test
method is very versatile as a statistical analysis for different kinds and means of scenarios,
simple yet provides accurate results with the use of parametric analysis or normal distribution
covering the sample data.
T-test accounts for the variability within each group making it more suitable for small
sample sizes as it adjusts for the uncertainty with assuming the estimated population parameters
due to limited data. Its distribution assumption is more suited as an accurate representation of
limited sampling distribution of the mean.
Hypothesis: Population:
However, as we cannot observe all the bachelor graduates, we draw a sample that is as
representative as possible. Thus, the t-test as a procedure helps us organize and limit our scope
to an observable value that can somehow still be representative into an accurate data. This is
where we split our population sample into men and women as the concept of t-test, where it
makes it easier to create answers.
The maximum sample size for a t-test is not strictly defined or limited by the statistical
method itself. Instead, the maximum sample size depends on practical considerations, such as
computational resources, time, cost, and the goals of the study.
Degrees of
In theory, you can perform a t-test with any sample size, but as the
Freedom tells us
sample size increases, the distribution of sample means approaches a
how many items
normal distribution due to the Central Limit Theorem. This makes the t-
can be randomly
distribution very similar to the standard normal distribution for large
selected before
samples.
constraints must
Considered small sample size (n < 30) be put in place
o Tends to favor the use of a t-test due to the impact of sample
size
Page 2 of 11
Republic of the Philippines
CAMARINES NORTE STATE COLLEGE
F. PimentelAvenue, Brgy. 2, Daet, Camarines Norte
– 4600, Philippines
COLLEGE OF ENGINEERING
A one-sample t-test is a statistical test used to determine if the mean of a single sample is
significantly different from a known or hypothesized population mean. It is appropriate when you
have a single group of observations and you want to compare the mean of that group to a specific
value. It is used when you want to compare a mean value of a sample to a known mean value of
a sample
This test needs one sample (dependent) and a reference value (independent)
30 SAMPLES
50 GRAMS 48 GRAMS
SAMPLES
KNOWN UNKNOWN/ RESULTS
We take 30 samples FROM SAMPLES
A chocolate bar
of chocolate bar from
weighs an The average weight of a
the factory and
average of 50 chocolate bar measured
measure their average
grams. from the 30 samples is 48
weight
grams.
It is used when you want to compare the means of two independent groups or samples
and then we want to know the significant difference between these two means. The independent
Page 3 of 11
Republic of the Philippines
CAMARINES NORTE STATE COLLEGE
F. PimentelAvenue, Brgy. 2, Daet, Camarines Norte
– 4600, Philippines
COLLEGE OF ENGINEERING
samples t-test is appropriate when you want to compare the means of two independent groups to
determine if there is a statistically significant difference between them.
DRUG A
30 SAMPLE
60 PEOPLE
DRUG B
30 SAMPLE
The sample was divided into two independent samples where the
first group is the one that took Drug A while the other group is the
one that took Drug B
We are to compare the effectiveness of 2 painkillers, Drug A & Drug B. The sample of 60
people would be randomly divided into two groups where one would take the Drug A and the
other group with Drug B. After compiling the results, we can now analyze and compare the results
how similar or different their effectiveness is.
Null hypothesis – the mean values in both groups are the same
Alternate hypothesis – the mean values in both groups are not equal
It is used when you want to compare the means of two dependent groups or samples. It
can be also correlated to the one sample t-test as it only uses one subject however has two
different measured values.
30 PEOPLE UNDERGONE
IN THIS PROCESS
WEIGHT
DIFFERNCCE
When we want to know how effective is a diet through measuring the weight of a
sample of 30 people before diet and measuring their weight again after diet. Now we can
compare and look at the difference of their weight on before and after their diet for each
Null hypothesis – the mean difference between the pairs is zero
subject. Then we can use the paired sample t-test where the samples are available in
pairs and test whether
Alternate the mean
hypothesis – the value that resulted
mean difference in thethe
between samples
pairs is deviate
not zerofrom the
reference value which is 0.
Page 4 of 11
Republic of the Philippines
CAMARINES NORTE STATE COLLEGE
F. PimentelAvenue, Brgy. 2, Daet, Camarines Norte
– 4600, Philippines
COLLEGE OF ENGINEERING
The variable which we want to test whether there is a different between the means must be
metric. (Ex: age, bodyweight, income, temperature, etc.…) It shall also be normally distributed to
all of the types of t-tests. For more accurate data in the independent samples for the independent
t-test, you can use Levene’s test to check if the given two independent samples are
approximately equal.
Difference between mean values Standard deviation from the mean can
t= themean ¿ also be denoted as Standard Error
Standard deviation ¿
s = standard deviation
from the collected data
x̄ = sample mean
x̄−μ n = number of cases or
t= µ = known reference mean
s sample
√n s
= Standard Error
√n
√
2 2 x̄ 2 = Mean value of sample 2
s s2
1
+
n1 n 2 s1
= Standard error of sample
√ n1
1
s2
= Standard error of sample 2
√ n2
FORMULA for dependent t-test (dependent samples)
Note that when the t value would be greater if we have a greater difference between
the means and the t value would be smaller if we have a small difference between the
means
Page 5 of 11
Republic of the Philippines
CAMARINES NORTE STATE COLLEGE
F. PimentelAvenue, Brgy. 2, Daet, Camarines Norte
– 4600, Philippines
COLLEGE OF ENGINEERING
The t value would also become smaller if we have a larger disperse of the mean
(Larger Standard Error) which means that the more a data is scattered the less
meaningful a given mean difference is.
√
n
s = The standard deviation, also can be denoted as (σ ¿
∑ (¿ xi −x̄ )2
s= i
¿ x i = The value of each sample
n
x̄ = The mean value of all samples
n = The number of samples
√
n
√
n n1 +n 2−2 = The degrees of freedom
∑ (¿ xi −x̄ ) 2
Note that there are two independent samples to
i
s= ¿ account for in the independent sample t-test leading
n1+ n2−2
us to add the number of the two independent
samples and a total of 2 samples used for the
statistical analysis.
The standard deviation is a measure that indicates how much data is scattered
around the mean. In a more precise explanation, the data sampled have different values
and deviates from the mean sample value. However, we are not interested in the
deviation of each individual samples from the mean value, we want to know how much
the sample deviates from the mean value on an average scale, this is what the standard
deviation is used for.
In practice, if you have data for the entire population, it's appropriate to use the
population standard deviation formula. If you have a sample, it's often recommended
to use the sample standard deviation formula to provide a more unbiased estimate of
the population standard deviation.
The common choice of a significance level of 0.05 (5%) is somewhat arbitrary, but it has
become a standard in many scientific disciplines. It represents a balance between being
stringent enough to avoid making too many Types I errors (false positives) and being
lenient enough to allow for reasonable sensitivity to detect true effects.
It's important to note that the choice of significance level does not determine whether a
study is "correct" or "incorrect." It is a convention used to establish a standard for
evidence in hypothesis testing, and researchers should interpret their results in the
context of the specific study and field of inquiry
Page 6 of 11
Republic of the Philippines
CAMARINES NORTE STATE COLLEGE
F. PimentelAvenue, Brgy. 2, Daet, Camarines Norte
– 4600, Philippines
COLLEGE OF ENGINEERING
There are two ways to use the t-value to test the hypothesis
The critical t-value is a specific value from the t-distribution that is used in hypothesis
testing. It is compared to the calculated t-statistic to determine whether to reject the null
hypothesis in a t-test. In a t-test, you typically compare the means of two groups or test the
difference between a sample mean and a population mean. The critical t-value is chosen based
on the desired significance level (σ ), degrees of freedom, and whether it is a one-tailed or two-
tailed test.
One-tailed test
A one-tailed t-test is used when the research hypothesis specifies the direction of the
effect. The critical region for rejection is on one side of the distribution. There are two
possible directions for a one tailed test, these are:
Page 7 of 11
Republic of the Philippines
CAMARINES NORTE STATE COLLEGE
F. PimentelAvenue, Brgy. 2, Daet, Camarines Norte
– 4600, Philippines
COLLEGE OF ENGINEERING
Two-tailed test
A two-tailed t-test is used when the research hypothesis is non-directional, simply testing
whether there is a significant difference. The critical region for rejection is on both sides of the
distribution.
H 0 : μ1=μ2
H a : μ1 ≠ μ2
o Reading a Critical T-value - The critical t-value is chosen based on the desired
significance level (σ ), degrees of freedom, and whether it is a one-tailed or two-tailed
test. You can know the critical value based from these components and plotting them
from the t-distribution table.
4.3 P- VALUE
The P stands for probability and measures how likely it is that any observed difference
between groups is due to chance. Initially the t-test would always test the null hypothesis so we
can assume first that the two groups have no difference. However, if were to draw only a sample
from the group, the sample would deviate from the null hypothesis by a certain amount.
The P-VALUE tells us how likely it is that we would draw a sample that deviates from the
group by the same amount or more than the sample we drew. The more the sample deviates
from the null hypothesis, the smaller the p-value becomes, P-values are determined by
adding up probabilities.
o Reading a P-value – The p-value can also be determined vice versa using the t-
distribution table where in this case, it is chosen based on the given t-value, degrees
of freedom and whether it was a one-tailed or two-tailed t-test.
4.4 EXAMPLES
df = Degrees of freedom
Page 8 of 11
Republic of the Philippines
CAMARINES NORTE STATE COLLEGE
F. PimentelAvenue, Brgy. 2, Daet, Camarines Norte
– 4600, Philippines
COLLEGE OF ENGINEERING
EXAMPLE:
If the two tailed test resulted with a t-value of 2.5 to a given degrees of freedom is 10
with a significant level of 5% or 0.05. Find the critical t-value on the t-distribution table
and identify whether the null hypothesis should be rejected or accepted.
t=2.5
σ =0.05
df =10 t c =2.262 if t >t c =true
Reject H 0
2.5>2.262 = true
If the two tailed test resulted with a value of 2.764 to a given degrees of freedom is
10 with a significant level of 5% or 0.05. Find the p-value on the t-distribution table
and identify whether the null hypothesis should be rejected or accepted
Page 9 of 11
Republic of the Philippines
CAMARINES NORTE STATE COLLEGE
F. PimentelAvenue, Brgy. 2, Daet, Camarines Norte
– 4600, Philippines
COLLEGE OF ENGINEERING
REFERENCES
https://www.scribbr.com/statistics/t-test/
https://www.investopedia.com/terms/t/t-test.asp#:~:text=A%20t%2Dtest%20is
%20an,flipping%20a%20coin%20100%20times.
https://www.jmp.com/en_ph/statistics-knowledge-portal/t-test.html
https://www.statology.org/population-vs-sample-standard-deviation/
https://youtu.be/VekJxtk4BYM?si=45sUWaNrnxce2tsv
https://youtu.be/5wJUUgnMGWA?si=2suqXkuyo2ZSd_Ov
https://youtu.be/2ARvj-8tJBs?si=BUmcenc1jBxw0PUV
https://www.sciencedirect.com/topics/social-sciences/statistical-significance
https://byjus.com/maths/t-test-table/
https://statisticsbyjim.com/hypothesis-testing/one-tailed-two-tailed-hypothesis-tests/
Page 10 of 11