Unit III Poriyan Notes
Syllabus
Correlation - Scatter plots - correlation coefficient for quantitative data - computational formula
for correlation coefficient - Regression - regression line - least squares regression line - Standard
error of estimate - interpretation of R2 - multiple regression equations - regression towards the
mean.
Correlation
• When one measurement is made on each observation, univariate analysis is applied. If more than one measurement is made on each observation, multivariate analysis is applied. Here we focus on bivariate analysis, where exactly two measurements are made on each observation.
• The two measurements will be called X and Y. Since X and Y are obtained for each
observation, the data for one observation is the pair (X, Y).
• Some examples:
1. Height (X) and weight (Y) are measured for each individual in a sample.
2. Stock market valuation (X) and quarterly corporate earnings (Y) are recorded for each
company in a sample.
3. A cell culture is treated with varying concentrations of a drug; the growth rate (X) and drug concentration (Y) are recorded for each trial.
4. Temperature (X) and precipitation (Y) are measured on a given day at a set of weather
stations.
• There is a difference between bivariate data and two-sample data. In two-sample data, the X and Y values are not paired, and there need not be the same number of X and Y values.
• Correlation refers to a relationship between two or more objects. In statistics, the word
correlation refers to the relationship between two variables. Correlation exists between two
variables when one of them is related to the other in some way.
• Examples: One variable might be the number of hunters in a region and the other variable
could be the deer population. Perhaps as the number of hunters increases, the deer population
decreases. This is an example of a negative correlation: As one variable increases, the other
decreases.
• A positive correlation is where the two variables react in the same way, increasing or decreasing together. Temperature in Celsius and temperature in Fahrenheit have a positive correlation.
• The term "correlation" refers to a measure of the strength of association between two variables.
• The correlation coefficient r is a function of the data, so it really should be called the sample correlation coefficient. The (sample) correlation coefficient r estimates the population correlation coefficient ρ.
• If either the X or the Y values are constant (i.e. all have the same value), then one of the sample standard deviations is zero and therefore the correlation coefficient is not defined.
Types of Correlation
• Positive correlation : Association between variables such that high scores on one variable tend to go with high scores on the other variable. A direct relation between the variables.
• Negative correlation : Association between variables such that high scores on one variable tend to go with low scores on the other variable. An inverse relation between the variables.
• Simple correlation : When the study involves only two variables, the relationship is described as simple correlation.
• Partial correlation : Analysis recognizes more than two variables but considers only two
variables keeping the other constant. Example: Price and demand, eliminating the supply side.
• Total correlation is based on all the relevant variables, which is normally not feasible. In total
correlation, all the facts are taken into account.
• Linear correlation : Correlation is said to be linear when the amount of change in one variable
tends to bear a constant ratio to the amount of change in the other. The graph of the variables
having a linear relationship will form a straight line.
• Nonlinear correlation : The correlation is said to be nonlinear if the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable.
Classification of correlation
1. Graphic methods
2. Mathematical methods.
• Graphic methods contain two sub methods: Scatter diagram and simple graph.
Coefficient of Correlation
Correlation : The degree of relationship between the variables under consideration is measured through correlation analysis.
• The measure of correlation is called the correlation coefficient. The degree of relationship is expressed by a coefficient which ranges from -1 to +1 (-1 ≤ r ≤ +1). The direction of change is indicated by the sign of the coefficient.
• The correlation analysis enables us to have an idea about the degree and direction of the
relationship between the two variables under study.
• Correlation is a statistical tool that helps to measure and analyze the degree of relationship
between two variables. Correlation analysis deals with the association between two or more
variables.
• Correlation denotes interdependency among the variables. For two phenomena to be correlated, it is essential that they have a cause-and-effect relationship; if such a relationship does not exist, the two phenomena cannot be correlated.
• If two variables vary in such a way that movements in one are accompanied by movements in the other, the variables are said to have a cause-and-effect relationship.
Properties of Correlation
1. Positive r indicates positive association between the variables and negative r indicates negative association.
2. The correlation coefficient measures clustering about a line, but only relative to the SDs.
3. Correlation measures association. But association does not necessarily show causation.
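• As an illustration of these ideas, the following Python sketch computes the correlation coefficient r for a small made-up data set (the values are hypothetical, not taken from the textbook tables):

```python
import math

# Hypothetical paired data (x = age in years, y = weight in kg);
# these values are made up for illustration only.
x = [7, 6, 8, 5, 6, 9]
y = [12, 8, 12, 10, 11, 13]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Computational formula:
# r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(f"r = {r:.4f}")  # always a value between -1 and +1
```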
Example 3.1.1: A sample of 6 children was selected, data about their age in years and
weight in kilograms was recorded as shown in the following table. It is required to find the
correlation between age and weight.
Solution :
• Because the relationship between two sets of data is seldom perfect, the majority of correlation coefficients are fractions (0.92, -0.80 and the like).
• If r = +1, then the correlation between the two variables is said to be perfect and positive.
• If r = -1, then the correlation between the two variables is said to be perfect and negative.
Example 3.1.2: A sample of 12 fathers and their elder sons gave the following data about
their heights in inches. Calculate the coefficient of rank correlation.
Solution:
Example 3.1.3: Calculate coefficient of correlation between age of cars and annual
maintenance and comment.
Solution : (The data table and intermediate sums are not reproduced here.)
Mean = 12600 / 7 = 1800
r = 3700 / 4427.188 = 0.8357
r = 46 / (5.29 × 9.165) = 0.9488
3. EXPLAIN SCATTER PLOT IN DETAIL. NOV / DEC 2023
Scatter Plots
• When two variables x and y have an association (or relationship), we say there exists
a correlation between them. Alternatively, we could say x and y are correlated. To find such an
association, we usually look at a scatterplot and try to find a pattern.
• Scatterplot (or scatter diagram) is a graph in which the paired (x, y) sample data are plotted
with a horizontal x axis and a vertical y axis. Each individual (x, y) pair is plotted as a single
point.
• One variable is called independent (X) and the second is called dependent (Y).
Example:
1. Positive relationship
2. Negative relationship
3. No relationship.
• The scattergram can indicate a positive relationship, a negative relationship or
a zero relationship.
Merits of scatter diagram :
1. It is a simple to implement and attractive method to find out the nature of correlation.
2. It is easy to understand.
3. User will get a rough idea about correlation (positive or negative correlation).
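• A minimal Python sketch of a scatter diagram using matplotlib (the data points are hypothetical):

```python
import matplotlib.pyplot as plt

# Hypothetical paired (x, y) data showing a roughly positive relationship.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 3.8, 4.2, 5.1, 5.8, 7.0, 7.9]

plt.scatter(x, y)  # each (x, y) pair is plotted as a single point
plt.xlabel("X (independent variable)")
plt.ylabel("Y (dependent variable)")
plt.title("Scatter diagram showing a positive relationship")
plt.show()
```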
• The product moment correlation, r, summarizes the strength of association between two
metric (interval or ratio scaled) variables, say X and Y. It is an index used to determine whether a
linear or straight-line relationship exists between X and Y.
• As it was originally proposed by Karl Pearson, it is also known as the Pearson correlation
coefficient. It is also referred to as simple correlation, bivariate correlation or merely the
correlation coefficient.
• The correlation coefficient between two variables will be the same regardless of their
underlying units of measurement.
• It measures the nature and strength of association between two variables of the quantitative type. The sign of r denotes the nature of the association, while the magnitude of r denotes the strength of the association.
• If the sign is positive this means the relation is direct (an increase in one variable is associated
with an increase in the other variable and a decrease in one variable is associated with a decrease
in the other variable).
• If the sign is negative, this means an inverse or indirect relationship (an increase in one variable is associated with a decrease in the other).
• The value of r ranges between (-1) and (+1). The magnitude of r denotes the strength of the association: the closer |r| is to 1, the stronger the association. (The illustrative diagram is not reproduced here.)
• Pearson's r is the most common correlation coefficient. Karl Pearson's coefficient of correlation, denoted by r, measures the degree of linear relationship between two variables, say x and y.
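• For reference (a standard identity, since the notes do not reproduce it), the computational formula for the correlation coefficient is:
r = [n Σxy - (Σx)(Σy)] / √{[n Σx² - (Σx)²] [n Σy² - (Σy)²]}
Equivalently, in deviation form, r = Σ(x - x̄)(y - ȳ) / √[Σ(x - x̄)² Σ(y - ȳ)²].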
Example 3.3.1 : (The data table is not reproduced here.)
n = 10
X = maintenance cost
Y = sales cost
Example 3.3.2 : A random sample of 5 college students is selected and their grades in operating system and software engineering are recorded (the table is not reproduced here). Calculate Pearson's correlation coefficient.
Solution:
Example 3.3.3: Find Karl Pearson's correlation coefficient for the following paired data.
Solution: Let
Example 3.3.4: Find Karl Pearson's correlation coefficient for the following paired data.
Solution:
Regression
• For an input x, if the output is continuous, this is called a regression problem. For example, based on historical information of demand for toothpaste in your supermarket, you are asked to predict the demand for the next month.
• Regression is concerned with the prediction of continuous quantities. Linear regression is the
oldest and most widely used predictive model in the field of machine learning. The goal is to
minimize the sum of the squared errors to fit a straight line to a set of data points.
• It is one of the supervised learning algorithms. A regression model requires the knowledge of
both the dependent and the independent variables in the training data set.
• Simple Linear Regression (SLR) is a statistical model in which there is only one independent variable, and the functional relationship between the dependent variable and the regression coefficients is linear.
• Regression line is the line which gives the best estimate of one variable from the value of any
other given variable.
• The regression line gives the average relationship between the two variables in mathematical
form. For two variables X and Y, there are always two lines of regression.
• Regression line of Y on X : Gives the best estimate for the value of Y for any specific given values of X :
Y = a + bX
where
a = Y-intercept
b = Slope of the line
Y = Dependent variable
X = Independent variable
• By using the least squares method, we are able to construct a best fitting straight line to the
scatter diagram points and then formulate a regression equation in the form of:
ŷ = a + bx
ŷ = ȳ + b(x- x̄)
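• For reference, the standard least squares estimates of the slope and intercept (a well-known result, stated here without derivation) are:
b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b x̄
so that the fitted line always passes through the point (x̄, ȳ).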
• Regression analysis is the art and science of fitting straight lines to patterns of data. In a linear
regression model, the variable of interest ("dependent" variable) is predicted from k other
variables ("independent" variables) using a linear equation.
• If Y denotes the dependent variable and X1, ..., Xk are the independent variables, then the assumption is that the value of Y at time t in the data sample is determined by the linear equation:
Yt = β0 + β1X1t + β2X2t + ... + βkXkt + εt
where the betas are constants and the epsilons are independent and identically distributed normal random variables with mean zero.
Regression Line
• The regression line is a way of making a somewhat precise prediction based upon the relationship between two variables. The regression line is placed so that it minimizes the predictive error.
• The regression line does not go through every point; instead it balances the difference between
all data points and the straight-line model. The difference between the observed data value and
the predicted value (the value on the straight line) is the error or residual. The criterion to
determine the line that best describes the relation between two variables is based on the residuals.
• A negative residual indicates that the model is over-predicting. A positive residual indicates
that the model is under-predicting.
Linear Regression
• The simplest form of regression to visualize is linear regression with a single predictor. A linear
regression technique can be used if the relationship between X and Y can be approximated with a
straight line.
• Linear regression with a single predictor can be expressed with the equation:
y = θ2x + θ1 + e
• The regression parameters in simple linear regression are the slope of the line (θ2), which gives the change in y per unit change in x, and the y-intercept (θ1), the point where the line crosses the y-axis (x = 0).
• In this model, Y is a linear function of X : the value of Y increases or decreases in a linear manner as the value of X changes.
Nonlinear Regression:
• Often the relationship between x and y cannot be approximated with a straight line. In this case,
a nonlinear regression technique may be used.
• Alternatively, the data could be preprocessed to make the relationship linear. Fig. 3.4.2 (not reproduced here) shows nonlinear regression.
• If data does not show a linear dependence we can get a more accurate model using a nonlinear
regression model.
• The generalized linear model is the foundation on which linear regression can be extended to model categorical response variables.
Advantages:
a. Training a linear regression model is usually much faster than methods such as neural
networks.
b. Linear regression models are simple and require minimum memory to implement.
c. By examining the magnitude and sign of the regression coefficients you can infer how
predictor variables affect the target outcome.
Limitations:
1. Predictive ability : The linear regression fit often has low bias but high variance. Recall that expected test error is a combination of these two quantities. Prediction accuracy can sometimes be improved by sacrificing a small amount of bias in order to decrease the variance.
• The method of least squares is about estimating parameters by minimizing the squared
discrepancies between observed data, on the one hand and their expected values on the other.
• The Least Squares (LS) criterion states that the sum of the squares of errors is minimum. The
least-squares solutions yield y(x) whose elements sum to 1, but do not ensure the outputs to be in
the range [0, 1].
• How do we draw such a line based on observed data points? Suppose an imaginary line y = a + bx.
• Imagine the vertical distance between the line and a data point, e = Y - E(Y). This error is the deviation of the data point from the imaginary line, the regression line. Then what are the best values of a and b? The a and b that minimize the sum of such errors.
• Raw deviations do not have good properties for computation (positive and negative deviations cancel out), so instead we find the a and b that minimize the sum of squared deviations rather than the sum of deviations. This method is called least squares.
• The least squares method minimizes the sum of squares of errors. Such a and b are called least squares estimators, i.e. estimators of the parameters α and β.
• The process of obtaining parameter estimators (e.g. a and b) is called estimation. This least squares method of estimation is called Ordinary Least Squares (OLS).
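• The following Python sketch carries out the least squares computation just described, using the closed-form estimators for a and b (the data points are hypothetical):

```python
# Least squares fit of y = a + b*x using the closed-form estimators.
# The data points below are hypothetical.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 4.3, 5.9, 8.2, 9.8]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# b = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2),  a = ybar - b*xbar
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a = y_bar - b * x_bar

print(f"least squares line: y = {a:.3f} + {b:.3f} x")
```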
Example 3.4.1: Fit a straight line to the points in the table. Compute m and b by least
squares.
Standard Error of Estimate
• The standard error of estimate represents a special kind of standard deviation that reflects the magnitude of predictive error. The standard error of estimate, denoted S, tells us approximately how large the prediction errors (residuals) are for our data set, in the same units as Y.
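• For simple linear regression, the standard error of estimate is commonly defined (a standard formula, stated here for reference) as:
S = √[Σ(Y - Ŷ)² / (n - 2)]
where Ŷ is the predicted value and the divisor n - 2 reflects the two estimated parameters (a and b).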
Example 3.4.2: Define linear and nonlinear regression using figures. Calculate the value of
Y for X = 100 based on linear regression prediction method.
Solution
Interpretation of R2
• The following measures are used to validate the simple linear regression models:
1. Coefficient of determination (R-square).
2. Hypothesis test (t-test) for the regression coefficient.
3. Analysis of variance for overall model validity (relevant more for multiple linear regression).
• The primary objective of regression is to explain the variation in Y using the knowledge of X. The coefficient of determination (R-square) measures the percentage of variation in Y explained by the model (β0 + β1X).
Characteristics of R-square :
1. R2 always lies between 0 and 1.
2. If R2 = 1, all of the data points fall perfectly on the regression line. The predictor x accounts for all of the variation in y!
3. If R2 = 0, the estimated regression line is perfectly horizontal. The predictor x accounts for none of the variation in y!
• In general, a high R2 value indicates that the model is a good fit for the data, although
interpretations of fit depend on the context of analysis. An R2 of 0.35, for example, indicates that
35 percent of the variation in the outcome has been explained just by predicting the outcome
using the covariates included in the model.
• That percentage might be a very high portion of variation to predict in a field such as the social
sciences; in other fields, such as the physical sciences, one would expect R2 to be much closer to
100 percent.
• The theoretical minimum R2 is 0. However, since linear regression is based on the best possible
fit, R2 will always be greater than zero, even when the predictor and outcome variables bear no
relationship to one another.
• R2 increases when a new predictor variable is added to the model, even if the new predictor is
not associated with the outcome. To account for that effect, the adjusted R2 incorporates the
same information as the usual R2 but then also penalizes for the number of predictor variables
included in the model.
• As a result, R2 increases as new predictors are added to a multiple linear regression model, but the adjusted R2 increases only if the increase in R2 is greater than one would expect from chance alone. In such a model, the adjusted R2 is the most realistic estimate of the proportion of the variation that is predicted by the covariates included in the model.
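• The following Python sketch computes R2 and adjusted R2 from observed values and model predictions (both hypothetical), using the standard definitions R2 = 1 - SSres/SStot and adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1):

```python
# R-square and adjusted R-square for a fit with k = 1 predictor.
# Hypothetical observed values and model predictions.
y     = [2.0, 4.1, 6.2, 7.9, 10.1]
y_hat = [2.1, 4.0, 6.0, 8.0, 10.0]

n, k = len(y), 1
y_bar = sum(y) / n

ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual sum of squares
ss_tot = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares

r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalizes extra predictors
print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}")
```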
Spurious Regression
• The regression is spurious when we regress one random walk onto another independent random
walk. It is spurious because the regression will most likely indicate a non-existing relationship:
1. The coefficient estimate will not converge toward zero (the true value). Instead, in the limit the coefficient estimate will follow a non-degenerate distribution.
2. The t-statistics will be very large, suggesting a significant relationship where none exists.
• Granger and Newbold (1974) pointed out that, along with the large t-values, strong evidence of serially correlated errors will appear in regression analysis; they state that when a low value of the Durbin-Watson statistic is combined with a high value of the t-statistic, the relationship is not true.
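• The spurious regression effect is easy to reproduce. The sketch below (assuming the numpy and statsmodels libraries are available) regresses one simulated random walk on another, independent one; a typical run shows a large t-statistic together with a Durbin-Watson statistic far below 2, exactly the warning sign described above:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
n = 500
x = np.cumsum(rng.standard_normal(n))  # random walk 1
y = np.cumsum(rng.standard_normal(n))  # independent random walk 2

model = sm.OLS(y, sm.add_constant(x)).fit()
print("slope t-value     :", model.tvalues[1])          # often spuriously large
print("Durbin-Watson stat:", durbin_watson(model.resid))  # near 0, not 2
```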
• The regression coefficient (β1) captures the existence of a linear relationship between the response variable and the explanatory variable.
• If β1 = 0, we can conclude that there is no statistically significant linear relationship between the two variables.
• Using the Analysis of Variance (ANOVA), we can test whether the overall model is statistically significant. However, for a simple linear regression, the null and alternative hypotheses in ANOVA and the t-test are exactly the same, and thus there will be no difference in the p-value.
Residual analysis
• Residual (error) analysis is important to check whether the assumptions of regression models have been satisfied. It is performed to check, in particular, whether the residuals are normally distributed, have constant variance (homoscedasticity) and are independent (no autocorrelation).
Multiple Regression
• In a multiple regression model, two or more independent variables, i.e. predictors, are involved in the model. The simple linear regression model and the multiple regression model assume that the dependent variable is continuous.
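• A minimal Python sketch of a multiple regression fit with two predictors, using numpy's least squares solver (the data and the model y = β0 + β1x1 + β2x2 are hypothetical):

```python
import numpy as np

# Hypothetical data: one response y and two predictors x1, x2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y  = np.array([4.9, 5.1, 9.0, 9.2, 13.1, 12.8])

# Design matrix with an intercept column of ones.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least squares estimates of (beta0, beta1, beta2).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("beta0, beta1, beta2 =", beta)
```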
Regression towards the Mean
• Regression toward the mean refers to a tendency for scores, particularly extreme scores, to
shrink toward the mean. Regression toward the mean appears among subsets of extreme
observations for a wide variety of distributions.
• The rule goes that, in any series with complex phenomena that are dependent on many
variables, where chance is involved, extreme outcomes tend to be followed by more moderate
ones.
• The effects of regression to the mean can frequently be observed in sports, where the effect
causes plenty of unjustified speculations.
• It basically states that if a variable is extreme the first time we measure it, it will be closer to
the average the next time we measure it. In technical terms, it describes how a random variable
that is outside the norm eventually tends to return to the norm.
• For example, our odds of winning on a slot machine stay the same. We might hit a "winning
streak" which is, technically speaking, a set of random variables outside the norm. But play the
machine long enough and the random variables will regress to the mean (i.e. "return to normal")
and we shall end up losing.
• Consider a sample taken from a population. The value of the variable will be some distance from the mean. For instance, we could take a sample of people (it could be just one person), measure their heights and then determine the average height of the sample. This value will be some distance away from the average height of the entire population of people, though the distance might be zero.
• Regression to the mean usually happens because of sampling error. A good sampling technique is to randomly sample from the population. If we sample asymmetrically, then results may be abnormally high or low relative to the average and would therefore regress back to the mean. Regression to the mean can also happen because we take a very small, unrepresentative sample.
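• A simple simulation illustrates the effect. Under a hypothetical test-retest model (each score = a stable "ability" component plus independent noise), the group that scores highest on the first test scores closer to the mean on the second test, even though nothing about the individuals has changed:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

ability = rng.standard_normal(n)          # stable underlying component
test1 = ability + rng.standard_normal(n)  # first measurement (with noise)
test2 = ability + rng.standard_normal(n)  # second measurement (fresh noise)

top = test1 >= np.quantile(test1, 0.95)   # the extreme scorers on test 1
print("top group, test 1 mean:", test1[top].mean())  # far above 0
print("top group, test 2 mean:", test2[top].mean())  # closer to 0 (the mean)
```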
Regression fallacy
• Regression fallacy assumes that a situation has returned to normal due to corrective actions
having been taken while the situation was abnormal. It does not take into consideration normal
fluctuations.
• An example of this could be a business program that fails, causes problems and is then cancelled. The return to "normal", which might be somewhat different from the original situation (a "new normal"), could fall into the category of regression fallacy. This is considered an informal fallacy.
Two Marks Questions with Answers
Q.1 What is correlation?
Ans. : Correlation refers to a relationship between two or more objects. In statistics, the word
correlation refers to the relationship between two variables. Correlation exists between two
variables when one of them is related to the other in some way.
Q.2 Define positive correlation and negative correlation.
Ans. :
• Positive correlation : Association between variables such that high scores on one variable tends
to have high scores on the other variable. A direct relation between the variables.
• Negative correlation: Association between variables such that high scores on one variable
tends to have low scores on the other variable. An inverse relation between the variables.
Q.3 What is a cause and effect relationship?
Ans. : If two variables vary in such a way that movements in one are accompanied by movements in the other, these variables are said to have a cause and effect relationship.
Q.4 What are the merits of a scatter diagram?
Ans. : 1. It is a simple to implement and attractive method to find out the nature of correlation.
2. It is easy to understand.
3. User will get rough idea about correlation (positive or negative correlation).
Q.5 What is a regression problem?
Ans. : For an input x, if the output is continuous, this is called a regression problem.
Q.6 What are the key assumptions of linear regression?
Ans. : The regression has five key assumptions : linear relationship, multivariate normality, no or little multicollinearity, no autocorrelation and homoscedasticity.
Q.7 What is regression analysis used for?
Ans. : Regression analysis is a form of predictive modelling technique which investigates the
relationship between a dependent (target) and independent variable (s) (predictor). This
technique is used for forecasting, time series modelling and finding the causal effect relationship
between the variables.
Q.8 What are the types of regression?
Ans. : Types of regression are linear regression, logistic regression, polynomial regression,
stepwise regression, ridge regression, lasso regression and elastic-net regression.
Q.9 What is the least squares method?
Ans. : Least squares is a statistical method used to determine a line of best fit by minimizing the
sum of squares created by a mathematical function. A "square" is determined by squaring the
distance between a data point and the regression line or mean value of the data set.
Q.10 How are correlation and regression related?
Ans. : Correlation is a statistical analysis used to measure and describe the relationship between
two variables. A correlation plot will display correlations between the values of variables in the
dataset. If two variables are correlated, X and Y then a regression can be done in order to predict
scores on Y from the scores on X.
Q.11 What is multiple linear regression?
Ans. : Multiple linear regression is an extension of linear regression, which allows a response
variable, y to be modelled as a linear function of two or more predictor variables. In a multiple
regression model, two or more independent variables, i.e. predictors are involved in the model.
The simple linear regression model and the multiple regression model assume that the dependent
variable is continuous.