Stats CHP 1 Notes
The purpose of inferential statistics is to draw a conclusion (an inference) about conditions that exist in a
population (the complete set of observations) by studying a sample (a subset) drawn from the population.
Inferential statistics is a branch of statistics used to make predictions or generalizations about a
population based on sample data. Unlike descriptive statistics, which simply summarize data, inferential
statistics allow you to make informed decisions or conclusions with a degree of certainty.
IMPORTANT TERMINOLOGIES
1. Assumption of Normality
The assumption of normality is fundamental in parametric statistics and dictates that the data being
analyzed should follow a normal distribution. A normal distribution, often referred to as a bell curve, is
symmetrical about the mean, where most observations cluster around the central peak and frequencies
taper off equally on both sides. This distribution is crucial because many statistical tests, like the t-test and
ANOVA, rely on this pattern to make accurate inferences. When data significantly deviates from normality,
statistical results may become misleading, potentially leading to false conclusions.
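As a quick illustration (not part of the original notes), the normality assumption can be screened with the Shapiro-Wilk test from Python's scipy library; the sample values below are invented:

# A minimal sketch of a normality check with the Shapiro-Wilk test (scipy);
# the sample values are invented for illustration.
from scipy import stats

sample = [4.2, 5.1, 4.8, 5.5, 4.9, 5.0, 4.7, 5.3, 4.6, 5.2]

w_stat, p_value = stats.shapiro(sample)
print(f"W = {w_stat:.3f}, p = {p_value:.3f}")
# A small p-value (e.g., p <= .05) suggests the data depart from a normal distribution.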
2. Absence of Outliers
Outliers are data points that deviate significantly from the rest of the dataset and can skew the results of
parametric tests, making them unreliable. These extreme values can inflate or deflate measures of central
tendency and dispersion, such as the mean and standard deviation, leading to inaccurate statistical
interpretations. Detecting outliers is typically done using box plots, scatter plots, or calculating Z-scores to
identify values that lie beyond three standard deviations from the mean. Once identified, it’s essential to
assess whether these outliers result from errors, rare events, or valid but unusual observations.
Depending on the context, outliers may be transformed, removed, or analyzed separately to ensure they
do not disproportionately affect the analysis.
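A minimal Python sketch of the z-score screening described above, using an invented dataset in which the value 80 is a deliberately planted extreme score:

# A minimal sketch of z-score outlier screening on made-up data.
import numpy as np

data = np.array([12, 13, 14, 15, 14, 13, 15, 16, 14, 13,
                 12, 15, 14, 16, 13, 14, 15, 13, 14, 80])  # 80 is the planted extreme value

mean = data.mean()
std = data.std(ddof=1)          # sample standard deviation
z_scores = (data - mean) / std  # standardize each observation

# Flag observations more than 3 standard deviations from the mean
outliers = data[np.abs(z_scores) > 3]
print("z-scores:", np.round(z_scores, 2))
print("flagged outliers:", outliers)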
Why These Assumptions Matter
• Parametric tests are generally more powerful than non-parametric tests because they make
specific assumptions about the data.
• Violating these assumptions can increase the chances of Type I or Type II errors.
• If the assumptions are not met, consider using non-parametric tests (like the Mann-Whitney U test
or the Kruskal-Wallis test), which make fewer assumptions about the data distribution.
NON-PARAMETRIC TESTS do not rely on strict assumptions about the data distribution. They are
suitable for ordinal or nominal data and are robust when data is non-normally distributed or when sample
sizes are small. Examples include the Chi-square test, Mann-Whitney U test, and Kruskal-Wallis test.
Although less powerful than parametric tests, they provide a reliable alternative when parametric assumptions are violated.
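For illustration only, a short scipy sketch of a Mann-Whitney U comparison of two invented groups of ordinal-style scores:

# A minimal sketch of a non-parametric two-group comparison with the
# Mann-Whitney U test (scipy); the scores below are invented.
from scipy import stats

group_a = [3, 5, 4, 6, 2, 5, 4]   # e.g., ordinal ratings from condition A
group_b = [7, 6, 8, 5, 7, 9, 6]   # e.g., ordinal ratings from condition B

u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.4f}")
# If p <= 0.05, we reject H0 that the two groups come from the same distribution.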
Level of Significance (α)
• α = 0.05 (5% significance level): The most commonly used level.
• α = 0.01 (1% significance level): Used when stronger evidence is required.
• α = 0.10 (10% significance level): Used in exploratory research or when less precision is acceptable.
Region of Rejection
The region of rejection is the portion of the sampling distribution that contains values of the test
statistic that are unlikely to occur by chance if the null hypothesis is true.
• If the calculated test statistic falls into this region, the null hypothesis (H₀) is rejected.
• If it does not fall into the rejection region, we fail to reject the null hypothesis.
When deciding whether to retain or reject the null hypothesis, researchers follow a systematic approach
based on the level of significance and the evidence gathered from the data. The null hypothesis (H₀) is a
statement that assumes no effect, no difference, or no relationship between variables. On the other hand,
the alternative hypothesis (Hₐ) is the statement that contradicts the null hypothesis, indicating that there is
an effect or difference.
The decision to retain or reject the null hypothesis is based on a criterion known as the level of
significance. This level of significance represents the probability of making a Type I error, which means
rejecting the null hypothesis when it is actually true. The most commonly used level of significance is 0.05
(5%), although stricter levels like 0.01 (1%) can also be used in cases where more precision is required.
Researchers calculate a test statistic from the sample data and compare it to a critical value determined
by the chosen level of significance. If the test statistic falls within the region of rejection (the area of the
sampling distribution where outcomes are considered unlikely under the null hypothesis), the null
hypothesis is rejected. This indicates that the observed difference or effect is statistically significant and
unlikely to have occurred by chance.
However, if the test statistic does not fall within the region of rejection, the null hypothesis is retained,
meaning there is not enough evidence to conclude that a difference or effect exists. It is important to
understand that retaining the null hypothesis does not prove it true, but simply indicates that the data
does not provide strong enough evidence to reject it.
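A small Python sketch of this decision rule, using assumed numbers (hypothesized mean 100, known σ = 15, n = 36, sample mean 105.5) rather than data from the notes:

# A minimal sketch of the retain-vs-reject decision using a critical value.
from scipy import stats
import math

mu0 = 100        # value stated by the null hypothesis H0
sigma = 15       # population standard deviation (assumed known here)
n = 36
sample_mean = 105.5
alpha = 0.05

# Standard error and z test statistic
se = sigma / math.sqrt(n)
z = (sample_mean - mu0) / se

# Two-tailed critical value: the rejection region is |z| > z_crit
z_crit = stats.norm.ppf(1 - alpha / 2)

print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}")
if abs(z) > z_crit:
    print("Test statistic falls in the rejection region: reject H0.")
else:
    print("Test statistic is outside the rejection region: fail to reject H0.")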
Why Use 0.05 as the Level of Significance?
The 0.05 level of significance is widely used because it provides a balance between risk and
reliability. It implies that there is only a 5% chance of rejecting a true null hypothesis. In some cases
where higher accuracy is needed, 0.01 may be chosen to minimize errors.
3. Standard Error (SE):
The standard error of the mean is SE = σ/√n, where σ is the standard deviation of the population and n is the sample size. As sample size increases, the SE decreases, making the sample mean a more accurate estimate of the population mean (see the simulation sketch after this list).
4. Less Variability than Population
The sampling distribution is less spread out than the population distribution because the SE is smaller
than the population standard deviation. This is because averages tend to be more stable than individual
scores.
5. Symmetry
If the population distribution is normal or the sample size is large, the sampling distribution of means will
also be symmetrical, centered around the population mean.
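A short simulation sketch (values assumed: μ = 50, σ = 10) illustrating the points above: sample means vary less than individual scores, and their spread matches SE = σ/√n and shrinks as n grows:

# A minimal simulation of the sampling distribution of the mean.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 50, 10

for n in (4, 25, 100):
    # Draw many samples of size n and record each sample mean
    sample_means = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)
    print(f"n={n:3d}  theoretical SE={sigma/np.sqrt(n):5.2f}  "
          f"observed SD of means={sample_means.std(ddof=1):5.2f}")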
Assumptions Associated with Inference about the Difference Between Two Independent Means
1. Random Sampling:
Each sample must be drawn randomly from its respective population. This ensures that every individual
has an equal chance of being selected, which leads to more reliable and unbiased results that are
generalizable to the population.
2. Independence of Samples:
The two samples must be independently selected, meaning the selection of one does not influence the
other. Independence ensures that the results from one sample do not affect or bias the results of the other
sample.
3. Sampling with Replacement:
Samples should be drawn with replacement, meaning that after selecting an individual, it is put back into
the population before selecting the next one. This keeps the probability of selection the same for each
observation, maintaining sample representativeness.
4. Normality of Sampling Distribution:
The sampling distribution of the difference between the two means should follow a normal distribution.
This assumption is important for applying normal-based statistical tests, which require the data to
approximate normality to make valid inferences.
5. Use of the t-Statistic:
When population standard deviations are unknown, the t-distribution is used instead of the normal
distribution. This allows us to estimate the population standard deviations from the sample data, adjusting
for this uncertainty when calculating test statistics.
6. Homogeneity of Variance (Equal Variance Assumption):
The assumption is that the variances of both populations are roughly equal. If this assumption is violated,
the results may not be reliable, but the effect is smaller with large sample sizes or when the sample sizes
are equal between the groups.
7. Central Limit Theorem (CLT):
The CLT states that as sample size increases, the sampling distribution of the mean approaches a normal
distribution, even if the population is not normally distributed. For sample sizes over 25, moderate
skewness can usually be tolerated without a significant effect on the results.
8. When to Use Nonparametric Tests:
If the parent population is highly non-normal or if the sample size is small, nonparametric tests are
preferred. These tests do not rely on normality assumptions, making them suitable for data that doesn't
meet the conditions for parametric tests.
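To make the test these assumptions support concrete, a minimal scipy sketch of an independent-samples t-test on two invented groups of scores:

# A minimal sketch of inference about the difference between two independent
# means; the scores are invented for illustration.
from scipy import stats

group_1 = [23, 20, 25, 22, 21, 24, 26, 22]
group_2 = [18, 21, 19, 20, 17, 22, 19, 18]

# equal_var=True applies the homogeneity-of-variance assumption (pooled variance);
# set equal_var=False for Welch's t-test when that assumption is doubtful.
t_stat, p_value = stats.ttest_ind(group_1, group_2, equal_var=True)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")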
ONE-TAILED AND TWO-TAILED HYPOTHESES
When conducting hypothesis testing, researchers choose between two types of tests: one-tailed tests
and two-tailed tests. The choice between these tests depends on the nature of the research question
and the expected direction of the effect.
ONE-TAILED HYPOTHESIS (DIRECTIONAL TEST)
A one-tailed hypothesis is used when the researcher expects the difference or effect to occur in a specific
direction. This means that the alternative hypothesis (Hₐ) states that the population parameter is either
greater than or less than the hypothesized value, but not both. In other words, the region of rejection (the
area where the null hypothesis is rejected) is entirely located on one side (tail) of the sampling
distribution.
One-tailed tests are typically used when there is a clear and justified expectation about the direction of
the outcome. For example:
• A psychologist studying aggression might hypothesize that children show more aggressive behavior
after watching violent TV shows. Here, the hypothesis is one-directional because it expects an
increase in aggression.
• A fitness test for schoolchildren might hypothesize that their physical fitness levels are lower than the
national standard. Again, the expectation is one-directional (lower).
The advantage of a one-tailed test is that it is more statistically powerful for detecting an effect in the
specified direction. However, its major disadvantage is that it does not account for the possibility of an
effect occurring in the opposite direction. If the effect turns out to be opposite to what was predicted, a
one-tailed test fails to detect it, even if the difference is significant.
TWO-TAILED HYPOTHESIS (NONDIRECTIONAL TEST)
A two-tailed hypothesis is used when the researcher does not predict the direction of the effect or
difference, only that there is an effect. The alternative hypothesis (Hₐ) states that the parameter may be
either greater than or less than the hypothesized value. In this case, the region of rejection is divided
equally between both tails of the sampling distribution.
Two-tailed tests are appropriate when it is important to detect any difference, regardless of direction. For
example:
• A study comparing student performance between two teaching methods might not predict
whether one method is better or worse than the other but simply tests for a difference.
• An experiment testing a new drug might not specify whether it will improve or worsen a patient’s
condition, only that it will have an impact compared to the standard treatment.
The advantage of a two-tailed test is that it is more conservative and less likely to lead to false positives
because it accounts for effects in both directions. However, it is generally less statistically powerful than a
one-tailed test since the level of significance (alpha) is split between the two tails.
Choosing Between One-Tailed and Two-Tailed Tests
The choice between a one-tailed and two-tailed test should be made before data collection and should be
based on the logic and objective of the study. It is not appropriate to decide on the type of test after
analyzing the data, as this could introduce bias and inflate error rates.
• Use a one-tailed test if you have a specific directional prediction and are confident that an effect will
occur only in that direction.
• Use a two-tailed test if you want to detect any difference, regardless of whether it is positive or
negative.
• Making a premature or unjustified choice can lead to inaccurate conclusions, so it’s crucial to
understand the research context and hypothesis before deciding.
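A minimal scipy sketch contrasting the two approaches with a one-sample t-test; the fitness-style scores and the hypothesized value of 50 are invented for illustration:

# Two-tailed vs. one-tailed (directional) testing with a one-sample t-test.
from scipy import stats

scores = [52, 55, 48, 60, 57, 53, 58, 54]   # hypothetical fitness scores
mu0 = 50                                    # hypothesized standard under H0

# Two-tailed: Ha says the mean differs from 50 in either direction
t2, p_two = stats.ttest_1samp(scores, popmean=mu0, alternative="two-sided")

# One-tailed: Ha says the mean is greater than 50 (direction chosen in advance)
t1, p_one = stats.ttest_1samp(scores, popmean=mu0, alternative="greater")

print(f"two-tailed p = {p_two:.4f}, one-tailed p = {p_one:.4f}")
# With the effect in the predicted direction, the one-tailed p is half the
# two-tailed p, which is why the directional test is more powerful there.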
ASSUMPTIONS OF Z TEST
A Z-test is a statistical test used to determine whether there is a significant difference between a sample
mean and a population mean when the population variance is known, or to compare the means of two
large samples. It helps assess whether the observed data is likely to have occurred by chance under the
assumption that the null hypothesis is true.
1. Random Sampling:
The first assumption of the Z test is that the sample drawn from the population is random. This means
that every individual in the population has an equal chance of being selected. Random sampling helps
ensure that the sample is representative of the population, reducing bias and increasing the reliability of
the results. If the sample is not random, it can lead to skewed results that do not accurately reflect the
population. However, achieving truly random samples can be challenging in practice.
2. Sampling with Replacement:
The second assumption is that sampling is done with replacement. This means that after selecting an
individual, they are placed back into the population, so they can be chosen again. Sampling with
replacement helps maintain the same probability of selection throughout the process. Although sampling
without replacement is more common in practice, the error introduced is minimal if the sample size is
small compared to the population size.
3. Normal Distribution of Sampling Means:
The third assumption is that the sampling distribution of the sample mean follows a normal distribution.
This assumption is crucial because the Z test relies on the properties of the normal distribution to
calculate probabilities and critical values. If the population data is approximately normal, the sampling
distribution will also be normal, especially when the sample size is large (usually 25 or more), thanks to
the Central Limit Theorem. This theorem states that the distribution of sample means will be
approximately normal, regardless of the population’s distribution, when the sample size is sufficiently
large.
4. Known Population Standard Deviation (σ):
The fourth assumption is that the population standard deviation (σ) is known. In most real-world
scenarios, however, this information is rarely available. Knowing the population standard deviation means
that we have precise information about the variability within the population. If σ is unknown, the t-test is
usually preferred over the Z test. The formula for the Z test includes σ, which is used to calculate the
standard error of the mean.
Conclusion:
Meeting these assumptions is essential to obtain valid and reliable results from a Z test. If any of these
assumptions are violated, the conclusions drawn from the test may be inaccurate or misleading.
Therefore, it is important to carefully assess whether the data and sampling methods meet these criteria
before applying the Z test.
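Assuming illustrative numbers (μ₀ = 100, σ = 15, n = 36, sample mean 104), a minimal Python sketch of the one-sample Z test described above:

# A minimal sketch of a one-sample z test when sigma is known; all numbers
# are assumed for illustration.
from scipy import stats
import math

mu0, sigma = 100, 15      # hypothesized mean and known population SD
n, sample_mean = 36, 104  # sample size and observed sample mean

se = sigma / math.sqrt(n)             # standard error of the mean
z = (sample_mean - mu0) / se          # z test statistic
p_two_tailed = 2 * (1 - stats.norm.cdf(abs(z)))

print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.4f}")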
STUDENT’S DISTRIBUTION OF t
The student’s t-distribution is a probability distribution used in statistics when the sample size is small,
and the population standard deviation is unknown. It is similar to the normal distribution but has heavier
tails, making it useful for estimating population parameters and conducting hypothesis tests when data is
limited.
William S. Gosset (1876–1937) was a British mathematician who developed the t-test. It was originally
called Student’s distribution of t because Gosset published under the name ‘Student’ while working for the
Guinness Brewing Company, which did not allow their employees to publish papers.
Characteristics of Student’s t-Distribution
1. Family of Distributions:
The Student’s t-distribution is not a single distribution but a family of distributions. Each distribution within
this family is defined by its degrees of freedom (df), which is closely related to the sample size. The shape
of the t-distribution changes with different degrees of freedom, becoming more similar to the normal
distribution as the sample size increases.
2. Relation to Sample Size:
o When the sample size is large, the sample standard deviation closely approximates the population
standard deviation, making the t-distribution almost identical to the normal (z) distribution.
o When the sample size is small, standard deviation may vary significantly from population standard
deviation, causing the t-distribution to deviate from the normal distribution.
3. Symmetry and Shape:
o The Student’s t-distribution is symmetrical and unimodal, meaning it has one peak and is mirrored
on either side of the mean.
o The mean of the distribution is zero, similar to the normal distribution.
4. Platykurtic Nature:
The t-distribution is platykurtic, meaning it has a flatter peak and thicker tails compared to the normal
distribution. This characteristic indicates a higher probability of obtaining extreme values, making it more
suitable for small sample sizes or situations where the population standard deviation is unknown.
5. Larger Standard Deviation:
The standard deviation of the t-distribution is larger than that of the normal distribution, reflecting the
greater variability expected when the sample size is small.
6. Effect of Degrees of Freedom:
o The shape and spread of the t-distribution are influenced by the degrees of freedom.
o As the degrees of freedom increase, the t-distribution becomes more similar to the normal
distribution.
o When the degrees of freedom are infinite, the t-distribution and normal distribution are identical.
o As the degrees of freedom decrease, the tails become thicker and the peak becomes lower,
emphasizing greater variability.
7. Critical Values and Significance Levels:
Due to the thicker tails, critical values for hypothesis testing are larger for the t-distribution compared to
the z-distribution. This means that to reject the null hypothesis with the same level of significance (e.g.,
α=0.05), a t-value must be more extreme than a corresponding z-value.
o For example, with infinite degrees of freedom, the critical t-value equals the critical z-value of
±1.96.
o For smaller degrees of freedom, the critical t-value is significantly larger, indicating a stricter
criterion for rejecting the null hypothesis.
In summary, the Student’s t-distribution is designed to handle situations where the sample size is small
and the population standard deviation is unknown. It accounts for additional uncertainty by having thicker
tails and greater variability, making it a robust choice for estimating population parameters from limited
data.
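A short scipy sketch of the thicker-tails point above: at α = 0.05 (two-tailed), the critical t-value is larger than the critical z-value of 1.96 and shrinks toward it as the degrees of freedom grow:

# Comparing two-tailed critical values of t and z at alpha = .05.
from scipy import stats

alpha = 0.05
z_crit = stats.norm.ppf(1 - alpha / 2)
print(f"z critical value: {z_crit:.3f}")

for df in (5, 10, 30, 120):
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    print(f"df={df:4d}  t critical value: {t_crit:.3f}")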
ASSUMPTIONS OF THE T-TEST:
1. Random Sampling:
The data should be collected using a random sampling method to ensure that every individual in the
population has an equal chance of being selected. This helps minimize sampling bias and makes the
results more generalizable. Without random sampling, the results may not accurately represent the
population.
2. Normality:
The population from which the sample is drawn should be approximately normally distributed. This
assumption is crucial when the sample size is small (usually n < 30). If the data are not normally
distributed, the test results may be inaccurate, but the Central Limit Theorem helps when the sample
size is large.
3. Equal Variance (for Independent t-Test):
When performing an independent t-test, the variances of the two populations being compared should
be equal, a condition known as homogeneity of variance. This is often tested using Levene’s Test.
If variances are unequal, a modified version of the t-test (Welch’s t-test) is used (see the sketch below).
4. Independence:
The observations in the sample must be independent of each other, meaning that the value of one
observation should not influence another. This assumption is crucial for ensuring the accuracy of the
test statistics. Violating this assumption can lead to misleading conclusions.
5. Unknown Population Standard Deviation:
The standard deviation of the population is generally unknown, and instead, the sample standard
deviation is used as an estimate. This makes the t-test more applicable in real-world situations, as
population parameters are rarely known.
When to Use Each Test:
• Z-Test: When the sample size is large (n > 30) and the population variance is known.
• T-Test: When the sample size is small (n ≤ 30) and the population variance is unknown.
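Tying assumption 3 and the usage guidance above together, a minimal scipy sketch (with invented scores) that screens the equal-variance assumption with Levene's test and falls back to Welch's t-test when it looks violated:

# Check homogeneity of variance, then run the appropriate t-test.
from scipy import stats

group_1 = [23, 20, 25, 22, 21, 24, 26, 22]
group_2 = [30, 12, 28, 15, 35, 10, 27, 14]   # visibly more spread out

lev_stat, lev_p = stats.levene(group_1, group_2)
equal_var = lev_p > 0.05   # retain the equal-variance assumption if p > .05

t_stat, p_value = stats.ttest_ind(group_1, group_2, equal_var=equal_var)
test_name = "pooled t-test" if equal_var else "Welch's t-test"
print(f"Levene p = {lev_p:.4f} -> {test_name}: t = {t_stat:.3f}, p = {p_value:.4f}")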
DEGREES OF FREEDOM
1. Definition:
Degrees of freedom (df) in psychological statistics refer to the number of independent values that can
vary in an analysis while estimating a statistical parameter. It represents the flexibility or freedom to
vary when calculating statistical measures.
2. Significance:
Degrees of freedom are crucial because they influence the shape of statistical distributions, like the t-
distribution. They help determine the accuracy and reliability of hypothesis testing, making it possible
to make valid inferences from sample data.
df = n − 1, where n is the sample size (see the sketch after this list).
3. Why Important:
o Ensures accurate hypothesis testing by adjusting for sample size and variability.
o Helps choose the correct distribution (like t or F distribution) when making inferences.
o Reduces bias by accounting for the number of estimated parameters.
4. Impact on Distribution:
The fewer the degrees of freedom, the more the t-distribution deviates from the normal curve, making
it flatter with thicker tails. As the degrees of freedom increase, the t-distribution becomes more similar
to the normal distribution.
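A small numpy sketch of the df = n − 1 idea: because the deviations from the mean must sum to zero, only n − 1 of them are free to vary, which is why the sample variance divides by n − 1 (the scores are invented):

# Degrees of freedom and the sample variance.
import numpy as np

scores = np.array([4.0, 7.0, 6.0, 5.0, 8.0])
n = len(scores)

deviations = scores - scores.mean()
print("deviations sum to ~0:", round(deviations.sum(), 10))  # the constraint

variance = (deviations ** 2).sum() / (n - 1)   # divide by df = n - 1, not n
print("sample variance:", variance, "==", np.var(scores, ddof=1))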
LEVELS OF SIGNIFICANCE VS. P-VALUES
1. Level of Significance (Alpha, 𝛼):
o The level of significance (usually denoted as 𝛼) is the probability of rejecting the null hypothesis
(H₀) when it is actually true.
o It represents the risk of making a Type I error, meaning falsely concluding that there is an effect
when none exists.
o Commonly used values of 𝛼 are 0.05, 0.01, and 0.001, where a lower value indicates stricter
criteria for rejecting H₀.
o Researchers decide the significance level before conducting the test to control the risk of error.
2. p-Value:
o The p-value measures the probability of obtaining a sample result as extreme as, or more
extreme than, the one observed, assuming that H₀ is true.
o It indicates the rarity of the observed outcome under the null hypothesis.
o A smaller p-value suggests stronger evidence against H₀, while a larger p-value indicates weak
evidence.
o It is not pre-set like the significance level but is calculated after data analysis.
3. Comparison of Level of Significance and p-Value:
o If the p-value ≤ 𝛼, the result is considered statistically significant, and H₀ is rejected.
o If the p-value > 𝛼, the result is not statistically significant, and H₀ is not rejected.
o For example, if 𝛼 = 0.05 and the p-value is 0.012, the p-value is less than 0.05, so H₀ is
rejected.
4. Reporting p-Values:
o Researchers often do not report exact p-values but rather indicate whether they are below
critical levels like 0.05, 0.01, or 0.001.
o A statement like "p < 0.05" or "p < 0.01" is commonly used to indicate statistical significance.
In summary, the level of significance is a pre-determined threshold for decision-making, while the p-value
is a post-analysis measure of how likely the observed data would occur if the null hypothesis were true.
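A tiny Python sketch of this comparison, reusing the example figures from point 3 above (α = 0.05, p = 0.012):

# Decision rule: compare the p-value to the pre-set significance level.
alpha = 0.05
p_value = 0.012

if p_value <= alpha:
    print(f"p = {p_value} <= alpha = {alpha}: reject H0 (statistically significant)")
else:
    print(f"p = {p_value} > alpha = {alpha}: fail to reject H0")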