Statistics

The document discusses various statistical concepts including: 1) There are four levels of measurement for types of data: nominal, ordinal, interval, and ratio. 2) There are five types of variables: dependent, independent, moderating, intervening, and extraneous. 3) Distributions can be discrete or continuous depending on whether values are restricted to certain points or can be any real number.

Q1) Types of data:

There are four main levels of data measurement:


1) Nominal level:
 The lowest level of data measurement is the nominal level.
 Numbers representing nominal-level data can be used only to classify or categorize. The numbers
differentiate items; they do not express an amount or value.
 Employee identification numbers are an example of nominal data.
2) Ordinal level:
 Ordinal-level data measurement is higher than the nominal level.
 Ordinal-level measurement can be used to rank or order objects.
 For example, using ordinal data, a supervisor can evaluate three employees by ranking their
productivity with the numbers 1 through 3.
3) Interval level:
 Interval-level data measurement is the second-highest level of data measurement.
 The distances between consecutive numbers have meaning and the data are always numerical.
 The distances represented by the differences between consecutive numbers are equal; that is, interval
data have equal intervals.
 An example of interval measurement is Fahrenheit temperature.
4) Ratio level:
 Ratio-level data measurement is the highest level of data measurement.
 Ratio data have the same properties as interval data, but ratio data have an absolute zero, and the ratio of
two numbers is meaningful.
 The notion of absolute zero means that zero is fixed, and the zero value in the data represents the
absence of the characteristic being studied.
 Examples: height, weight, time, volume and Kelvin temperature.
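
The difference between interval and ratio data can be illustrated with temperature. The sketch below uses two made-up Fahrenheit readings (40 and 80, chosen only for illustration) and a hypothetical helper `fahrenheit_to_kelvin`. It shows that differences are meaningful on both scales, while a ratio is meaningful only on the Kelvin scale, which has an absolute zero.

```python
def fahrenheit_to_kelvin(temp_f: float) -> float:
    """Convert a Fahrenheit reading to Kelvin."""
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15


# Illustrative readings (assumed values, not from the notes).
cool_f, warm_f = 40.0, 80.0

# Interval property: the difference between readings is meaningful.
difference_f = warm_f - cool_f

# The Fahrenheit ratio is 2.0, but 80 F is not "twice as hot" as 40 F,
# because 0 F does not mean "no temperature".
naive_ratio = warm_f / cool_f

# Ratio property: Kelvin has an absolute zero, so this ratio is meaningful.
kelvin_ratio = fahrenheit_to_kelvin(warm_f) / fahrenheit_to_kelvin(cool_f)

print(f"Difference: {difference_f} F")
print(f"Fahrenheit ratio (not meaningful): {naive_ratio:.2f}")
print(f"Kelvin ratio (meaningful): {kelvin_ratio:.3f}")
```

The Kelvin ratio comes out only slightly above 1, which is why the Fahrenheit ratio of 2 is misleading.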

Q2) Types of Variables:


1) Dependent:
 A variable that relies on and can be changed by other factors that are measured.
 A grade someone gets on an exam depends on factors such as how much sleep they got and how long
they studied.
2) Independent:
 A variable that stands alone and isn't changed by the other variables or factors that are measured.
 Age: Other variables such as where someone lives, what they eat or how much they exercise are not
going to change their age.
3) Moderating:
 Changes the relationship between the dependent and independent variables by strengthening or weakening
the independent variable’s effect on the dependent variable.
 Age: If a study is looking at the relationship between economic status (independent variable) and
how frequently people get physical exams from a doctor (dependent variable), age is a moderating
variable.
4) Intervening:
 A theoretical variable used to explain a cause or connection between other study variables.
 Access to health care: If wealth is the independent variable, and a long life span is a dependent
variable, a researcher might hypothesize that access to quality health care is the intervening variable
that links wealth and life span.
5) Extraneous:
 Factors that affect the dependent variable but that the researcher did not originally consider when
designing the experiment.
 Parental support, prior knowledge of a foreign language or socioeconomic status are extraneous
variables that could influence a study assessing whether private tutoring or online courses are more
effective at improving students' Spanish test scores.
Q3) Discrete distribution vs continuous distribution:

Basis: Meaning
 Discrete distribution: One in which the data can take on only certain values, for example integers.
 Continuous distribution: One in which the data can take on any value within a specified range.

Basis: Probability
 Discrete distribution: Probabilities can be assigned to the individual values in the distribution.
 Continuous distribution: There is an infinite number of possible values, and the probability associated
with any particular single value is zero.
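
The contrast can be sketched in Python with the standard library. The binomial and standard normal distributions below are illustrative choices, not ones named in the notes, and `binomial_pmf` is a hypothetical helper. A discrete distribution assigns a positive probability to an individual value, whereas a continuous distribution assigns probability only to intervals.

```python
from math import comb
from statistics import NormalDist


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Discrete: number of sixes in 10 rolls of a fair die.
# A single value (exactly 2 sixes) has a positive probability.
p_two_sixes = binomial_pmf(2, 10, 1 / 6)

# The probabilities of all possible values sum to 1.
total_probability = sum(binomial_pmf(k, 10, 1 / 6) for k in range(11))

# Continuous: standard normal distribution.
z = NormalDist(mu=0.0, sigma=1.0)

# Probability is assigned to an interval, as a difference of CDF values.
p_interval = z.cdf(1.0) - z.cdf(-1.0)

# The "interval" from 1.0 to 1.0 has zero width, so its probability is 0.
p_single_point = z.cdf(1.0) - z.cdf(1.0)

print(f"P(exactly 2 sixes)  = {p_two_sixes:.4f}")
print(f"P(-1 <= Z <= 1)     = {p_interval:.4f}")
print(f"P(Z == 1)           = {p_single_point}")
```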

Q4) Correlation and its types:


 Correlation refers to a process for establishing the relationships between two variables.
 Correlation coefficients can be calculated and interpreted for ordinal-level and interval-level
scales.
 The statistic r is the Pearson product-moment correlation coefficient, named after Karl Pearson (1857-
1936), an English statistician who developed several coefficients of correlation along with other
significant statistical concepts.
 The term r is a measure of the linear correlation of two variables.
Types:
 Positive Correlation: when the values of the two variables move in the same direction so that an
increase/decrease in the value of one variable is followed by an increase/decrease in the value of the
other variable.
 Negative Correlation: when the values of the two variables move in the opposite direction so that an
increase/decrease in the value of one variable is followed by decrease/increase in the value of the other
variable.
 No Correlation: when there is no linear dependence or no relation between the two variables.
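
As a minimal sketch, the Pearson coefficient r can be computed as the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations. The function name `pearson_r` and the three small data sets are made up for illustration, one for each type of correlation listed above.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson product-moment correlation coefficient of two samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / sqrt(sxx * syy)


# Illustrative data (assumed values).
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # both move in the same direction
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # move in opposite directions
r_none = pearson_r(x, [3, 1, 4, 1, 3])        # no linear relationship

print(f"positive: {r_positive:.2f}")
print(f"negative: {r_negative:.2f}")
print(f"none:     {r_none:.2f}")
```

Values of r lie between -1 and +1. A value near 0 indicates only the absence of a linear relationship, so a curved relationship can still exist.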
Q5) Hypothesis testing:
 The 'testing of hypothesis' starts with an assumption or guess, termed a hypothesis, that is made about a
population parameter.
 The 'testing of hypothesis' is a process of testing the significance of a parameter of the population on
the basis of a sample.
 It is always possible that the value of the sample statistic differs from the assumed value. If the difference is
small, the hypothesized value is likely to be correct; if the difference is large, the hypothesis is in doubt.
Null hypothesis:
 A hypothesis stated in the hope of being rejected is called a null hypothesis and is denoted by H0.
Alternative hypothesis:
 If H0 is rejected, this may lead to acceptance of an alternative hypothesis, denoted by H1. In other words,
if the sample results fail to support the null hypothesis, we must conclude that something else is true;
this is termed the alternative hypothesis.
 e.g., A die is suspected to be loaded. Roll the die a number of times to test it.
 The null hypothesis H0: P = 1/6 for showing a six.
 The alternative hypothesis H1: P ≠ 1/6.
Research Hypothesis (H1):
 It is a statement that predicts a relationship or difference between variables.
 Example: "There is a significant difference in test scores between Group A and Group B."
Statistical Hypothesis:
 Involves statements about population parameters that can be tested using statistical methods.
 Example: "The mean score of the population is equal to 50."
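
The die example can be worked through as an exact two-sided binomial test. This is one possible way to test H0: P = 1/6, not a method prescribed by the notes. The roll counts (60 rolls, 20 sixes), the 0.05 significance level and the helper names are all assumptions made for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_p_value(successes: int, trials: int, p_null: float) -> float:
    """Two-sided p-value: total probability, under H0, of every outcome
    that is no more likely than the observed one."""
    probabilities = [binomial_pmf(k, trials, p_null) for k in range(trials + 1)]
    observed = probabilities[successes]
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(q for q in probabilities if q <= observed * (1 + 1e-9))
    return min(1.0, p_value)


ALPHA = 0.05          # assumed significance level
P_NULL = 1 / 6        # H0: P(six) = 1/6

# Hypothetical experiment: 60 rolls, 20 sixes (10 would be expected under H0).
p_suspect = exact_binomial_p_value(successes=20, trials=60, p_null=P_NULL)
reject_suspect = p_suspect < ALPHA

# Hypothetical experiment: 60 rolls, 10 sixes, exactly the expected count.
p_fair = exact_binomial_p_value(successes=10, trials=60, p_null=P_NULL)
reject_fair = p_fair < ALPHA

print(f"20 sixes in 60 rolls: p = {p_suspect:.4f}, reject H0: {reject_suspect}")
print(f"10 sixes in 60 rolls: p = {p_fair:.4f}, reject H0: {reject_fair}")
```

A small p-value means the observed count would be unusual if H0 were true, so H0 is rejected in favour of H1.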
Q6) Assumptions in simple linear regression model:
1) Linearity: The relationship between X and Y must be linear. Check this assumption by examining a
scatterplot of x and y.
2) Independence of errors: There is no relationship between the residuals and the fitted values of Y; in other words,
the errors are independent of the predicted values. Check this assumption by examining a scatterplot of "residuals versus fits"; the
correlation should be approximately 0. In other words, the plot should not show any apparent relationship.
3) Normality of errors: The residuals must be approximately normally distributed. Check this assumption by
examining a normal probability plot; the observations should be near the line. You can also examine a histogram
of the residuals; it should be approximately normally distributed.
4) Equal variances: The variance of the residuals is the same for all values of X. Check this assumption by
examining the scatterplot of "residuals versus fits"; the variance of the residuals should be the same across all
values of the x-axis. If the plot shows a pattern, then variances are not consistent, and this assumption has not
been met.
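
The residuals and fitted values used in these checks can be obtained as follows. This is a minimal sketch that fits the line by ordinary least squares on a small made-up data set; `fit_line`, `xs` and `ys` are illustrative names, and in practice the resulting pairs would be drawn as a "residuals versus fits" scatterplot rather than printed.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data (assumed values): roughly linear with small noise.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

slope, intercept = fit_line(xs, ys)

# Fitted (predicted) values and residuals (observed minus fitted).
fits = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fits)]

# Points for a "residuals versus fits" plot: look for curvature
# (non-linearity) or a funnel shape (unequal variances).
for fitted, residual in zip(fits, residuals):
    print(f"fit = {fitted:6.3f}   residual = {residual:+.3f}")
```

The same residuals can be fed into a normal probability plot or a histogram to check the normality assumption.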

Q7) ANOVA:
 ANOVA (analysis of variance) is an important tool of statistical analysis.
 Professor R. A. Fisher developed ANOVA.
 It is used when there are two or more samples drawn from the population.
Assumptions of ANOVA:
 All the samples drawn from the population must be independent.
 The populations from which the samples are drawn should be normally distributed.
 All the samples must be random samples.
 The populations must have the same variance.
Uses of ANOVA:
 Test of significance of the difference between two or several sample means.
 Test of significance in correlation and regression.
 Test of significance between the variances of several samples.
Techniques:
 1) One-way: i) direct method ii) short-cut method iii) coding method
 2) Two-way
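
A minimal sketch of the one-way direct method follows. The three groups are made-up data and `one_way_anova` is an illustrative function name. The F statistic is the ratio of the between-group mean square to the within-group mean square.

```python
def one_way_anova(groups: list[list[float]]) -> tuple[float, int, int]:
    """Return (F statistic, df between groups, df within groups)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# Illustrative samples (assumed values), one list per group.
samples = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
f_statistic, df_between, df_within = one_way_anova(samples)

print(f"F({df_between}, {df_within}) = {f_statistic:.2f}")
```

The computed F is then compared with the critical value of the F distribution for the same degrees of freedom; a larger computed F suggests that the sample means differ significantly.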

Q8) Write a short note on statistical graphs and charts:


1. Histogram:
 Represents the distribution of a continuous variable.
 Bars are used to show the frequency of data within predefined intervals or bins.
 Useful for identifying patterns, central tendency, and spread of the data.

2. Ogives:
 Graphical representation of cumulative frequency distribution.
 Cumulative frequencies are plotted against the upper or lower class boundaries.
 Can be used to find percentiles and analyze the distribution's shape.

3. Pie Chart:
 Circular chart divided into slices, each representing a proportion of the whole.
 Ideal for displaying the relative contribution of each category to the total.
 Effective for presenting categorical data and percentages.

4. Bar Graph:
 Uses bars of equal width to represent different categories or groups.
 Height of each bar corresponds to the quantity or frequency of the data.
 Can be arranged either horizontally or vertically based on preference.

5. Pareto Chart:
 Combines a bar graph and a line graph to prioritize issues or problems.
 Bars represent the frequency or impact of issues in descending order.
 Used in quality control and decision-making to focus on the most critical issues.
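
The ordering behind a Pareto chart can be sketched without any plotting library. The defect list below is invented for illustration. The counts, sorted in descending order, give the bar heights, and the running cumulative percentage gives the line.

```python
from collections import Counter

# Hypothetical inspection log: one entry per defect found.
defects = [
    "scratch", "dent", "scratch", "misalignment", "scratch",
    "dent", "scratch", "stain", "dent", "scratch",
]

counts = Counter(defects)
total = sum(counts.values())

# Bars: categories in descending order of frequency.
# Line: cumulative percentage after each category.
pareto_rows = []
cumulative = 0
for category, count in counts.most_common():
    cumulative += count
    pareto_rows.append((category, count, round(100 * cumulative / total, 1)))

for category, count, cumulative_pct in pareto_rows:
    print(f"{category:<14}{count:>3}{cumulative_pct:>8.1f}%")
```

In this made-up data the first two categories account for 80% of all defects, which is the kind of concentration a Pareto chart is meant to expose.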
Q9) Comparison of various measures of dispersion:

1. Range:
 Definition: The difference between the maximum and minimum values in a dataset.
 Advantage: Simple and easy to calculate.
 Limitation: Sensitive to outliers, making it less robust.

2. Interquartile Range (IQR):
 Definition: The range between the first quartile (Q1) and the third quartile (Q3).
 Advantage: Less sensitive to extreme values compared to the range.
 Limitation: Ignores the spread of data within the interquartile range.

3. Variance:
 Definition: The average of the squared differences from the mean.
 Advantage: Provides a comprehensive measure of variability.
 Limitation: Units are squared, making interpretation difficult. Sensitive to outliers.

4. Standard Deviation:
 Definition: The square root of the variance.
 Advantage: More interpretable than the variance and widely used in statistical analysis.
 Limitation: Still sensitive to outliers.

5. Mean Absolute Deviation (MAD):
 Definition: The average of the absolute differences between each data point and the mean.
 Advantage: Easy to interpret and less affected by extreme values compared to variance.
 Limitation: Ignores the squared differences, potentially downplaying the impact of outliers.

6. Coefficient of Variation (CV):
 Definition: The ratio of the standard deviation to the mean, expressed as a percentage.
 Advantage: Allows for the comparison of variability relative to the mean.
 Limitation: Inappropriate for datasets with a mean close to zero.
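
The six measures can be computed side by side with Python's standard `statistics` module. The data set is made up for illustration. The variance here is the population form (the average of squared differences, as defined above), and the quartiles use the module's default method, so other quartile conventions may give a slightly different IQR.

```python
from statistics import mean, pstdev, pvariance, quantiles

# Illustrative data (assumed values).
data = [2, 4, 4, 4, 5, 5, 7, 9]
center = mean(data)

# Range: maximum minus minimum.
data_range = max(data) - min(data)

# Interquartile range: Q3 minus Q1 (default "exclusive" quartile method).
q1, _median, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Variance: average of squared differences from the mean (population form).
variance = pvariance(data)

# Standard deviation: square root of the variance.
std_dev = pstdev(data)

# Mean absolute deviation: average absolute difference from the mean.
mad = mean([abs(x - center) for x in data])

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv = std_dev / center * 100

print(f"Range    = {data_range}")
print(f"IQR      = {iqr}")
print(f"Variance = {variance}")
print(f"Std dev  = {std_dev}")
print(f"MAD      = {mad}")
print(f"CV       = {cv:.1f}%")
```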
