Interval estimation involves constructing an interval, or range of values, that we are fairly confident contains the true population parameter. In the context of regression, we often construct confidence intervals for the slope and intercept of the regression line, using the standard errors of the estimated coefficients. These standard errors tell us how much the estimates are likely to vary from sample to sample: the smaller the standard error, the more confident we can be that our estimate is close to the true population value.
Hypothesis testing is a way to make inferences, or draw conclusions, about the population from our sample. In the context of regression, we might want to test whether the slope of the regression line is significantly different from zero, which would tell us whether there is a significant relationship between our two variables. "Significant" in the statistical sense just means unlikely to have occurred by chance; it does not necessarily mean the relationship is large or important. We set up a null hypothesis, which typically states that there is no relationship between the variables (i.e., the slope is zero), calculate a test statistic such as the t-statistic, and compare it to a critical value. If the test statistic lies beyond the critical value, we reject the null hypothesis.
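As a minimal sketch of both ideas, assuming Python with numpy and statsmodels and purely hypothetical data, the following fits a simple regression, builds 95% confidence intervals for the intercept and slope, and reports the t-statistic and p-value for the hypothesis that the slope is zero:

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: X is the explanatory variable, y the dependent variable.
rng = np.random.default_rng(0)
X = rng.normal(size=50)
y = 2.0 + 1.5 * X + rng.normal(scale=1.0, size=50)

# Fit the simple regression y = b0 + b1*X + u by OLS.
model = sm.OLS(y, sm.add_constant(X)).fit()

# 95% confidence intervals for the intercept and slope
# (estimate +/- critical t-value * standard error).
print(model.conf_int(alpha=0.05))

# t-statistic and p-value for H0: slope = 0 against H1: slope != 0.
print(model.tvalues[1], model.pvalues[1])
```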
Level of Significance: The level of significance, denoted by the Greek symbol α (alpha), is the probability of rejecting the null hypothesis when it is true. It sets the threshold for statistical significance and helps determine whether the null hypothesis is rejected or not.
Limits of Interval: In the context of statistics, the limits of an interval generally refer to the upper and lower bounds of a confidence interval or other interval. These limits bound the range of values that are considered plausible for the parameter being estimated.
One Tail and Two Tail Tests: A one-tailed test is a hypothesis test in which the critical region of the distribution lies on one side only, so it detects departures in a single direction (either greater than or less than a certain value). A two-tailed test, on the other hand, checks for a departure from the hypothesized value in both directions.
Acceptance and Rejection Region: In hypothesis testing, the acceptance region is the range of values of the test statistic that leads to acceptance (non-rejection) of the null hypothesis. The rejection region is the range of values that leads to rejection of the null hypothesis.
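The critical values that separate these regions can be illustrated with a short sketch, assuming Python with scipy and an arbitrary example of 48 degrees of freedom:

```python
from scipy import stats

alpha = 0.05   # chosen level of significance
df = 48        # degrees of freedom (e.g. n - 2 in a simple regression)

# Two-tailed test: the rejection region sits in both tails,
# with alpha/2 of the probability in each tail.
t_crit_two = stats.t.ppf(1 - alpha / 2, df)
print(f"two-tailed: reject H0 if |t| > {t_crit_two:.3f}")

# One-tailed (right-tail) test: the whole rejection region is in one tail.
t_crit_one = stats.t.ppf(1 - alpha, df)
print(f"one-tailed: reject H0 if t > {t_crit_one:.3f}")
```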
Importance of t-test, F-test and p-value: The t-value tells us how many standard errors the estimated coefficient of an explanatory variable lies from zero, and hence how strong the evidence is of a relationship with the dependent variable. By the usual rule of thumb, if the absolute t-value is greater than 2, the variable is statistically significant (at roughly the 5% level). The p-value is the exact (observed) level of significance, i.e., the exact probability of committing a Type I error if the null hypothesis is rejected; conventionally it should be below 0.10, and preferably below 0.05. The F-test in regression compares the fit of the overall model against a model with no independent variables: the null hypothesis states that the intercept-only model fits the data as well as your model. If the p-value of the F-test is less than the significance level, your model fits the data better than the model with no independent variables.
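A hedged illustration, assuming statsmodels and simulated data (the variable names, seed and coefficients are made up for the example), showing where the t-statistics, p-values and the overall F-test appear in a fitted model:

```python
import numpy as np
import statsmodels.api as sm

# Simulated data: two regressors, only the first actually matters.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = 1.0 + 0.8 * X[:, 0] + rng.normal(size=100)

res = sm.OLS(y, sm.add_constant(X)).fit()

print(res.tvalues)               # t-statistics: coefficient / standard error
print(res.pvalues)               # exact significance level of each coefficient
print(res.fvalue, res.f_pvalue)  # overall F-test against an intercept-only model
```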
Test of Normality: A normality test is used to determine whether sample data has been drawn from a normally distributed population. It is generally performed to verify whether the data involved in the research have a normal distribution. Three main checks are the histogram of residuals, the normal probability plot, and the Jarque-Bera test.
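As one example, the Jarque-Bera test is available in scipy; the sketch below applies it to a stand-in array of residuals (in practice you would pass the residuals of your fitted regression):

```python
import numpy as np
from scipy import stats

# Stand-in for regression residuals; in practice use the residuals
# from your fitted model (e.g. res.resid in statsmodels).
rng = np.random.default_rng(2)
residuals = rng.normal(size=200)

# Jarque-Bera test: H0 = the residuals are normally distributed
# (skewness 0 and kurtosis 3).
jb_stat, jb_pvalue = stats.jarque_bera(residuals)
print(jb_stat, jb_pvalue)   # a large p-value means normality is not rejected
```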
Type I and Type II Error: A Type I error occurs when a true null hypothesis is incorrectly rejected (false positive). A Type II error happens when a false null hypothesis is not rejected (false negative). The two have an inverse relationship: reducing the Type I error rate by lowering the level of significance increases the chance of a Type II error.
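This trade-off can be seen in a small simulation, sketched here under an arbitrary one-sample t-test setting with numpy and scipy: as alpha falls, the Type I error rate falls but the Type II error rate rises.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, reps = 30, 2000

def rejection_rate(true_mean, alpha):
    """Share of simulated samples in which H0: mean = 0 is rejected."""
    rejections = 0
    for _ in range(reps):
        sample = rng.normal(loc=true_mean, size=n)
        _, p = stats.ttest_1samp(sample, 0.0)
        rejections += p < alpha
    return rejections / reps

for alpha in (0.10, 0.05, 0.01):
    type1 = rejection_rate(0.0, alpha)        # H0 true: rejection rate ~ alpha
    type2 = 1 - rejection_rate(0.5, alpha)    # H0 false: non-rejection rate
    print(alpha, type1, type2)  # lower alpha -> fewer Type I, more Type II errors
```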
R-squared (R²): R-squared is a statistical measure that determines the proportion of variance in the dependent variable that can be explained by the independent variable(s). In other words, R-squared shows how well the data fit the regression model.
Adjusted R-squared: Adjusted R-squared is a modified version of R-squared that adjusts for the number of predictors in a regression model. It tells you the percentage of variation explained only by the independent variables that actually affect the dependent variable; unlike R-squared, it does not automatically rise when extra predictors are added.
Explained Sum of Squares (ESS): ESS, also known as the model sum of squares or sum of squares due to regression (SSR), measures how much of the total variation in the dependent variable is explained by the regression model.
Residual Sum of Squares (RSS): RSS is a statistical measure of the variation in the dependent variable that is not explained by the regression model; it is the sum of the squared residuals.
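These quantities fit together as TSS = ESS + RSS, with R² = ESS/TSS = 1 − RSS/TSS. A minimal sketch, assuming statsmodels and simulated data, computes them by hand and checks them against the library's own R² and adjusted R²:

```python
import numpy as np
import statsmodels.api as sm

# Simulated data with two regressors (all names and values are hypothetical).
rng = np.random.default_rng(4)
X = rng.normal(size=(60, 2))
y = 1.0 + 0.7 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(size=60)

res = sm.OLS(y, sm.add_constant(X)).fit()

tss = np.sum((y - y.mean()) ** 2)   # total sum of squares
rss = np.sum(res.resid ** 2)        # residual (unexplained) sum of squares
ess = tss - rss                     # explained sum of squares

r2 = ess / tss                      # equivalently 1 - rss / tss
n, k = 60, 3                        # observations and estimated parameters
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)

print(r2, res.rsquared)             # should agree
print(adj_r2, res.rsquared_adj)     # should agree
```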