8614 - Assignment 2 Solved (AG)
ASSIGNMENT-02
Q.1 How is mode calculated? Also discuss its merits and demerits.
Ans-
Mode is the value that occurs with the greatest frequency. For
example, if the given set of values is 2, 3, 2, 4, 5, 2, 3, 1, 6, the mode
is 2, which appears three times. However, when two or more values
appear with the same highest frequency, the mode is said to be ill-defined.
Such a series is called bi-modal or multi-modal.
Under certain circumstances, the mode is a more appropriate measure than the
mean or median. For instance, while studying the income earned by the workers
in a company, the mode reflects the wage earned by the largest number of
workers. The average income of the workers, on the other hand, may be much
higher simply because a few employees in higher positions earn very high
incomes.
The mode is also applied in majority-vote decision making, where it identifies
the choice preferred by the largest number of people.
1. Individual Observations:
Example 1:
Calculate the mode from the data given below showing the marks obtained by
10 students.
75, 80, 82, 76, 82, 74, 75, 79, 82, 70
Solution:
The mode here is 82 as it appears with the highest frequency.
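As a quick check, the mode of an individual series can be computed with Python's standard library; this sketch is an illustration, not part of the original solution:

```python
from statistics import multimode

# Marks obtained by 10 students (Example 1)
marks = [75, 80, 82, 76, 82, 74, 75, 79, 82, 70]

# multimode returns every value tied for the highest frequency,
# so a bi-modal or multi-modal series yields more than one value
print(multimode(marks))  # [82]
```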
2. Discrete Series:
Example 2:
Calculate the mode for the data pertaining to the size of shoes.
Course: Educational Statistics (8614) Semester: Autumn, 2022
Solution:
The mode here is 6 as it has the highest frequency.
3. Continuous Series:
Mode for data in the form of a continuous series is calculated using the
formula

Mode = L + ((f1 - f0) / (2f1 - f0 - f2)) × h

where L is the lower boundary of the modal class, f1 is the frequency of the
modal class, f0 is the frequency of the class preceding it, f2 is the frequency
of the class following it, and h is the class width.
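The standard grouped-data mode formula can be sketched in Python; the class boundaries and frequencies below are hypothetical, since the original data table is not reproduced here:

```python
def grouped_mode(L, f1, f0, f2, h):
    """Mode of a continuous series:
    L  = lower boundary of the modal class
    f1 = frequency of the modal class
    f0 = frequency of the preceding class
    f2 = frequency of the succeeding class
    h  = class width
    """
    return L + (f1 - f0) / (2 * f1 - f0 - f2) * h

# Hypothetical distribution with modal class 40-50 (frequency 12),
# preceded by frequency 7 and followed by frequency 5, class width 10
print(round(grouped_mode(L=40, f1=12, f0=7, f2=5, h=10), 2))  # 44.17
```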
Example 3:
Calculate the mode from the data given below pertaining to the marks
obtained by the students in a test.
Solution:
By observation, it is known that the modal class is 40 – 50 as this class has the
highest frequency.
The frequencies are grouped in two’s in the second column. In the third
column, the first frequency is left out and the remaining frequencies are
grouped in two’s. In the fourth column, the frequencies are grouped in three’s.
In the fifth column, the first frequency is left out and the remaining
frequencies are grouped in three’s. In the sixth column, the first two
frequencies are left out and the remaining frequencies are grouped in three’s.
In each of these columns the maximum value is observed.
The analysis table is prepared taking the column numbers on the left and the
probable values of mode on the right. The probable values of mode are those
values against which the frequencies are the highest in the grouping table. The
values are entered by means of a bar in the analysis table. The column total is
then taken and the one which has the maximum value is the modal value.
Example 4:
Calculate the value of mode for the following data:
4. When the mode is computed through different methods, the resulting value
may differ from method to method.
When you have collected data from a sample, you can use inferential statistics
to understand the larger population from which the sample is taken. Inferential
statistics have two main uses:
• making estimates about populations (for example, the mean SAT score
of all 11th graders in the US);
• testing hypotheses to draw conclusions about populations (for example,
the relationship between SAT scores and family income).
Descriptive statistics
Using descriptive statistics, you can report characteristics of your data, such
as its central tendency and variability.
Inferential statistics
Most of the time, you can only acquire data from samples, because it is too
difficult or expensive to collect data from the whole population that you’re
interested in.
With inferential statistics, it’s important to use random and unbiased sampling
methods. If your sample isn’t representative of your population, then you can’t
make valid statistical inferences or generalize.
Sampling error arises any time you use a sample, even if your sample is
random and unbiased. For this reason, there is always some uncertainty in
inferential statistics. However, using probability sampling methods reduces this
uncertainty.
There are two important types of estimates you can make about the
population: point estimates and interval estimates.
Both types of estimates are important for gathering a clear idea of where a
parameter is likely to lie.
Confidence intervals
A confidence interval uses the variability around a statistic to come up with an
interval estimate for a parameter. Confidence intervals are useful for
estimating parameters because they take sampling error into account.
While a point estimate gives you a precise value for the parameter you are
interested in, a confidence interval tells you the uncertainty of the point
estimate. They are best used in combination with each other.
A 95% confidence interval means that if you repeat your study with a new
sample in exactly the same way 100 times, you can expect your estimate to lie
within the specified range of values 95 times.
Although you can say that your estimate will lie within the interval a certain
percentage of the time, you cannot say for sure that the actual population
parameter will. That’s because you can’t know the true value of the population
parameter without collecting data from the full population.
However, with random sampling and a suitable sample size, you can
reasonably expect your confidence interval to contain the parameter a certain
percentage of the time.
Example: Point estimate and confidence interval
You want to know the average number of paid vacation days that employees at
an international company receive. After collecting survey responses from a
random sample, you calculate a point estimate and a confidence interval.
Your point estimate of the population mean paid vacation days is the sample
mean of 19 paid vacation days.
With random sampling, a 95% confidence interval of [16, 22] means you can be
reasonably confident that the average number of vacation days is between 16
and 22.
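A minimal sketch of this calculation, using hypothetical survey responses and the normal approximation (z = 1.96) for a 95% interval:

```python
import math
from statistics import mean, stdev

# Hypothetical survey responses: paid vacation days in a random sample
days = [15, 22, 19, 18, 21, 24, 16, 17, 20, 18]

m = mean(days)                           # point estimate of the population mean
se = stdev(days) / math.sqrt(len(days))  # standard error of the mean
margin = 1.96 * se                       # normal approximation, 95% confidence

print(f"point estimate: {m:.1f}")
print(f"95% CI: [{m - margin:.1f}, {m + margin:.1f}]")
```

With a sample this small, a t-multiplier would be more appropriate than 1.96; the normal approximation keeps the sketch dependency-free.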
Hypothesis testing
Hypothesis testing is a formal process of statistical analysis using inferential
statistics. The goal of hypothesis testing is to compare populations or assess
relationships between variables using samples.
When your data violates the assumptions of parametric tests (for example,
when it is not normally distributed), non-parametric tests are more suitable.
Non-parametric tests are called “distribution-free tests” because they don’t
assume anything about the distribution of the population data.
Comparison tests
Comparison tests assess whether there are differences in means, medians or
rankings of scores of two or more groups.
To decide which test suits your aim, consider whether your data meets the
conditions necessary for parametric tests, the number of samples, and
the levels of measurement of your variables.
Means can only be found for interval or ratio data, while medians and rankings
are more appropriate measures for ordinal data.
Correlation tests
Correlation tests determine the extent to which two variables are associated.
The chi-square test of independence is the only test that can be used
with nominal variables. Spearman’s r, by contrast, is a non-parametric test
used with ordinal, interval, or ratio variables.
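As an illustration, Spearman's r works by ranking the data first; the sketch below uses hypothetical, tie-free ordinal scores and the shortcut formula 1 - 6Σd²/(n(n² - 1)):

```python
def ranks(xs):
    """Rank from 1 (smallest) to n; assumes no tied values."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rho(xs, ys):
    """Shortcut formula 1 - 6*sum(d^2) / (n*(n^2 - 1)) applied to the ranks."""
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    n = len(xs)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical ordinal scores: teacher ranking vs test ranking of 5 students
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(spearman_rho(x, y))  # 0.8
```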
Regression tests
Regression tests demonstrate whether changes in predictor variables cause
changes in an outcome variable. You can decide which regression test to use
based on the number and types of variables you have as predictors and
outcomes.
Most of the commonly used regression tests are parametric. If your data is not
normally distributed, you can perform data transformations.
Data transformations help you make your data normally distributed using
mathematical operations, like taking the square root of each value.
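For instance, a square-root transform compresses large values more than small ones, which can pull a right-skewed variable closer to normality (the data below are made up for illustration):

```python
import math

# Hypothetical right-skewed values with a long tail of large observations
raw = [1, 4, 4, 9, 9, 16, 25, 100]

transformed = [math.sqrt(x) for x in raw]  # sqrt shrinks the tail
print(transformed)  # [1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 5.0, 10.0]
```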
Correlation tells us whether two variables have any kind of relationship,
whereas the p-value tells us whether the result of an experiment is
statistically significant. In this section, we will take a look at how they are
calculated and how to interpret the numbers obtained.
What is correlation?
The correlation coefficient is used in statistics to measure how strong a
relationship is between two variables. There are several types of correlation
coefficients (e.g. Pearson, Kendall, Spearman), but the most commonly used is
Pearson’s correlation coefficient. This coefficient is a number between -1 and
1, with 1 being the strongest possible positive correlation and -1 being the
strongest possible negative correlation.
A positive correlation means that as one number increases the second number
will also increase. A negative correlation means that as one number increases
the second number decreases. However, correlation does not always imply
causation — correlation does not tell us whether change in one number is
directly caused by the other number, only that they typically move together.
To understand how correlation works, let’s look at a chart of height
vs weight.
We can observe that as weight increases, height also increases, which
indicates they are positively correlated. The correlation coefficient in this
case is 0.88, which supports our finding.
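Pearson's coefficient can be computed directly from its definition: the covariance of the two variables divided by the product of their spreads. The height/weight figures below are hypothetical:

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance of x and y over the product of their spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical height (cm) vs weight (kg) observations
heights = [150, 155, 160, 165, 170, 175, 180]
weights = [52, 55, 61, 60, 68, 72, 77]

print(round(pearson_r(heights, weights), 2))  # 0.98, a strong positive correlation
```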
What is a p-value?
The p-value evaluates how strongly your data contradicts the null hypothesis,
which states that there is no relationship between the two compared groups.
Successfully rejecting this hypothesis tells you that your results may be
statistically significant. In academic research, the p-value is defined as the
probability of obtaining results ‘as extreme’ or ‘more extreme’ than those
observed, given that the null hypothesis is true: essentially, how likely it is
that you would obtain your results (or more dramatic ones) assuming that
there is no correlation or relationship among the subjects. To understand what
this means, let us look at an example.
Suppose we toss a coin 10 times and want to test whether it is fair. Assuming
the null hypothesis is true (the coin is fair), let’s calculate the
probabilities of the various possible outcomes, i.e. 0 heads & 10 tails, 1 head
& 9 tails, 2 heads & 8 tails, and so on.
Each outcome is given by the binomial probability formula

P(r) = nCr × p^r × (1 - p)^(n - r)

where
n = no. of trials = 10
r = no. of successes (heads)
p = probability of a success = 1/2
1 - p = probability of a failure = 1/2
Let’s consider a ‘success’ to be when heads appears in a coin toss; it makes
no difference whether ‘success’ is taken to be heads or tails. Let’s first
calculate the probability of obtaining 5 heads and 5 tails in 10 coin flips:
P(5) = 10C5 × (1/2)^10 = 252/1024 ≈ 0.246.
Let’s plot the probabilities to understand the intuition behind the above
calculation:
We can observe from the chart that the probability of getting 5 heads is the
highest, and the probabilities of getting 0 heads or 0 tails are the lowest.
Now, let’s assume the outcome of this experiment is “9 heads and 1 tail”.
We then add the probabilities of all possible outcomes that are as probable
as, or less probable than, ‘9 heads and 1 tail’:
P-value = P(9 heads and 1 tail) + P(10 heads and 0 tails) + P(9 tails and 1
head) + P(10 tails and 0 heads) = (10 + 1 + 10 + 1)/1024 = 22/1024 ≈ 0.0215
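The whole calculation can be reproduced from the binomial formula; under the null hypothesis of a fair coin, each term is a binomial probability with p = 1/2:

```python
from math import comb

n, p = 10, 0.5  # 10 tosses of a fair coin (the null hypothesis)

def prob(r):
    """Binomial probability of exactly r heads in n tosses."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

print(round(prob(5), 4))  # 0.2461: 5 heads & 5 tails is the likeliest outcome

# Outcomes as extreme or more extreme than 9 heads & 1 tail, from both tails
p_value = prob(9) + prob(10) + prob(1) + prob(0)
print(round(p_value, 4))  # 0.0215, below the usual alpha of 0.05
```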
Now, we need to check whether the p-value is significant or not. This is done
by specifying a significance cutoff, known as the alpha value. Alpha is usually
set to 0.05, meaning we accept up to a 5% probability of observing results this
extreme or more extreme when the null hypothesis is true. If the p-value is
less than the specified alpha value, we reject the null hypothesis. Here, the
p-value of 22/1024 ≈ 0.0215 is below 0.05, so we reject the hypothesis that
“the coin is fair with equal probability of heads and tails” and conclude that
the coin is biased.
Conclusion
Though correlation and the p-value provide us with information about the
relationship between variables, care should be taken to interpret them
correctly. Correlation tells us whether two variables have any sort of
relationship; it does not imply causation. If two variables A and B are highly
correlated, there are several possible explanations: (a) A influences B; (b) B
influences A; (c) A and B are influenced by one or more additional variables;
or (d) the observed relationship between A and B was a chance error. Similarly,
the p-value should not be misused to manufacture a statistically significant
result. If an analysis exhaustively searches various combinations of variables
for correlation until a significant one appears, this is known as p-hacking.
The t- and z-test methods developed in the early 20th century were used for
statistical analysis until 1918, when Ronald Fisher created the analysis of
variance method. ANOVA is also called the Fisher analysis of variance, and
it is an extension of the t- and z-tests. The term became well known in 1925,
after appearing in Fisher’s book, “Statistical Methods for Research
Workers.” It was employed in experimental psychology and later expanded
to subjects that were more complex.
The ANOVA test allows a comparison of more than two groups at the same
time to determine whether a relationship exists between them. The result of
the ANOVA formula, the F statistic (also called the F-ratio), allows for the
analysis of multiple groups of data to determine the variability between
samples and within samples.
If no real difference exists between the tested groups (a scenario called the
null hypothesis), the result of the ANOVA’s F-ratio statistic will be close to
1. The distribution of all possible values of the F statistic is the
F-distribution. This is actually a family of distribution functions,
characterized by two numbers: the numerator degrees of freedom and the
denominator degrees of freedom.
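The F-ratio itself is the between-group variance divided by the within-group variance. A minimal sketch, using hypothetical test scores for three teaching methods:

```python
def f_statistic(groups):
    """One-way ANOVA F-ratio: between-group over within-group variability."""
    k = len(groups)                  # number of groups (numerator df = k - 1)
    n = sum(len(g) for g in groups)  # total observations (denominator df = n - k)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # between-group and within-group sums of squares
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical test scores under three teaching methods
a = [85, 86, 88, 75, 78]
b = [68, 70, 72, 74, 65]
c = [90, 95, 88, 92, 94]
print(round(f_statistic([a, b, c]), 2))  # far above 1, so the groups likely differ
```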
Chi-square tests are often used to test hypotheses. The chi-square statistic
compares the size of any discrepancies between the expected results and the
actual results, given the size of the sample and the number of variables in the
relationship.
For these tests, degrees of freedom are used to determine if a certain null
hypothesis can be rejected based on the total number of variables and
samples within the experiment. As with any statistic, the larger the sample
size, the more reliable the results.
Independence
When considering student gender and course choice, a χ2 test for
independence could be used. To do this test, the researcher would collect
data on the two chosen variables (gender and courses picked) and then
compare the frequencies at which male and female students select among the
offered classes using the χ2 formula and a χ2 statistical table.
If there is no relationship between gender and course selection (that is, if they
are independent), then the actual frequencies at which male and female
students select each offered course should be expected to be approximately
equal, or conversely, the proportion of male and female students in any
selected course should be approximately equal to the proportion of male and
female students in the sample.
A χ2 test for independence can tell us how likely it is that random chance can
explain any observed difference between the actual frequencies in the data
and these theoretical expectations.
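The χ2 statistic sums (observed − expected)² / expected over every cell of the contingency table, where the expected counts assume the two variables are independent. A self-contained sketch with hypothetical counts:

```python
# Hypothetical counts: gender (rows) by course choice (columns)
observed = [[30, 20],   # male:   [science, arts]
            [20, 30]]   # female: [science, arts]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Sum (observed - expected)^2 / expected over every cell; the expected
# count in each cell assumes gender and course choice are independent
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (o - expected) ** 2 / expected

print(chi2)  # 4.0 here, to be compared against a chi-square table with 1 df
```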
Goodness-of-Fit
χ2 provides a way to test how well a sample of data matches the (known or
assumed) characteristics of the larger population that the sample is intended
to represent. This is known as goodness of fit.
If the sample data do not fit the expected properties of the population that
we are interested in, then we would not want to use this sample to draw
conclusions about the larger population.
Example
For example, consider an imaginary coin with exactly a 50/50 chance of
landing heads or tails, and a real coin that you toss 100 times. If the real
coin is fair, then it will also have an equal probability of landing on either
side, and the expected result of tossing it 100 times is that heads will come
up 50 times and tails will come up 50 times.
In this case, χ2 can tell us how well the actual results of 100 coin flips
compare to the theoretical model that a fair coin will give 50/50 results. The
actual tosses could come up 50/50, or 60/40, or even 90/10. The farther the
actual results of the 100 tosses are from 50/50, the worse the fit of this set
of tosses to the theoretical expectation of 50/50, and the more likely we are
to conclude that the coin is not actually fair.
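This goodness-of-fit calculation is the same (observed − expected)² / expected sum; a 60/40 split is used below as a hypothetical result of the 100 tosses:

```python
# Observed heads/tails in 100 tosses of the real coin
observed = [60, 40]
# Fair-coin expectation for 100 tosses
expected = [50, 50]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2)  # 4.0; larger values mean a worse fit to the fair-coin model
```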
A chi-square test is appropriate for this when the data being analyzed are
from a random sample and when the variable in question is categorical. A
categorical variable is one that consists of selections such as type of car,
race, educational attainment, male or female, or how much somebody likes a
political candidate (from very much to very little).
In addition, the chi-square test cannot establish whether one variable has a
causal relationship with another. It can only establish whether two variables
are related.