An Intuitive Geometric Approach To The Gauss Markov Theorem
Leandro da Silva Pereira, Lucas Monteiro Chaves & Devanil Jaques de Souza
The American Statistician (2016), DOI: 10.1080/00031305.2016.1209127
Abstract
Algebraic proofs of the Gauss Markov theorem are very disappointing from an intuitive point
of view. An alternative is to use geometry, which emphasizes the essential statistical ideas behind
the result. A truly geometric, intuitive approach to the theorem is presented, based only on elementary concepts: matrices viewed as linear transformations and orthogonal projections onto linear subspaces.
1 UTFPR – Federal Technological University of Parana, DAMAT, Apucarana-PR, Brazil, 86812-460. E-mail: leandropereira@utfpr.edu.br
2 UFLA – Federal University of Lavras, DEX, Lavras-MG, Brazil, 37200-000. E-mail: lucas@dex.ufla.br
3 UFLA – Federal University of Lavras, DEX, Lavras-MG, Brazil, 37200-000. E-mail: devaniljaques@dex.ufla.br
1 – Introduction
There are few general results in statistics. Often, particular and very restrictive
assumptions are necessary, such as normality. One of these few general results is the
Gauss Markov theorem, which has highly practical consequences. The strength of the Gauss Markov
theorem lies in the fact that it holds regardless of the distribution of the random variable
considered. This result gives statistical meaning to a pure mathematical fact: the least squares
method. Since the theorem is a basic result, it is taught in beginning statistics courses. However,
the proofs presented in most of the textbooks (Rencher 2008, Rao 1999, Casella and Berger
2002) are based only on algebraic properties of positive definite matrices. The experience in
class seems to lead us to conclude that this kind of demonstration does not improve student
comprehension of the result. Proving the Gauss Markov theorem by algebraic methods seems to
be at most innocuous. There are demonstrations which adopt a geometric flavor, but they usually
rely on some algebraic results (Ruud 2000, Gruber 1998, Saville and Wood 1991). First of all,
we have to be clear about what is meant by a geometric demonstration. In general, this will not
be a totally rigorous proof from a mathematical point of view; it requires some degree of
intuition. The tricky part of our geometric approach is to interpret matrices as linear
transformations: they map vectors to vectors and linear subspaces to linear subspaces. In this
sense, a matrix is a very concrete geometrical object. Another basic geometric concept is the
projection of vectors onto linear subspaces, in particular the orthogonal projection. The purpose of this
article is to provide an intuitive geometric demonstration of the Gauss Markov theorem, using only these two concepts.
2 – The linear model and the least squares method
Consider Y a random vector with unknown mean vector $\mu = E[Y]$. For reasons related to the
random experiment that generates Y, we can suppose some linear relations among the
components of the unknown mean vector μ and, therefore, assume that the vector μ belongs to
some known linear subspace W; this, in essence, characterizes a linear model. The vector Y
lives in the data space, in general the n-dimensional Euclidean space $\mathbb{R}^n$. Since the dimension
of the data space is higher than the dimension of W, that is, there are more data than characteristics
to be estimated, it is convenient to describe the subspace W explicitly.
Such a procedure is called parameterization and can be done in the following way: consider W to
be the image of a linear transformation X defined on another vector space, which will be called
the parameter space. The transformation X can be considered injective, that is, for each vector w in W there exists a unique vector β in the
parameter space such that $w = X\beta$. In practical situations, the experiment defines the matrix X,
the design matrix, and the column space of X defines W. Then it is possible to make clear the relations among the data space, the parameter space and the subspace W, as represented in Figure 1.
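As a concrete numerical illustration (ours, not part of the original article), the following minimal sketch builds a hypothetical design matrix for a straight-line model, checks that X is injective (full column rank), and exhibits a mean vector μ = Xβ inside W:

```python
import numpy as np

# Hypothetical design matrix X for n = 5 observations and p = 2 parameters
# (a simple straight-line model); any full-column-rank X would do.
X = np.column_stack([np.ones(5), np.arange(5.0)])

# X is injective on the parameter space exactly when its columns are
# linearly independent, i.e. rank(X) equals the number of parameters.
assert np.linalg.matrix_rank(X) == X.shape[1]

beta = np.array([1.0, 2.0])   # illustrative "true" parameter vector
mu = X @ beta                 # mu = X beta is a point of the subspace W
print(mu)                     # [1. 3. 5. 7. 9.]
```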
The greatest advantage of representing the linear model as in Figure 1 is the description
of the estimation process. What is the estimation process? It is a very simple decision: after a
data vector y is observed, we have only to choose a vector in W which we believe to be a good
representative of $E[Y]$. If y belongs to the subspace W, since this is the space to which the
mean of the random vector Y is restricted, there is no reason not to estimate $E[Y]$ by the
observed vector y itself. But this seldom occurs, since the observed vector y is affected by random
errors. Therefore, almost surely, the vector y does not belong to the subspace W, and a natural
procedure for estimating the vector $E[Y]$ is to take some kind of projection of y onto W. This
process, when specific restrictions on the projection are considered, explains more
sophisticated estimation methods, such as ridge regression or elastic net regression (Hoerl and
Kennard 1970, Zou and Hastie 2005). Let us consider the simplest idea: to choose the vector in W
closest to y. This estimation procedure is called the least squares method. How can this be done? The
answer is to use orthogonal linear projections. If $P_W$ is the orthogonal linear projection onto W,
the chosen vector is $P_W(y)$ (Rencher 2008, pp. 43, 228). Since the linear transformation X is
injective, there is only one β̂ such that $P_W(y) = X\hat{\beta}$. This equation, in its algebraic form, is
called the normal equation, and β̂ is the least squares estimate of the parameter vector β.
To express β̂ in terms of the data vector y it is necessary to write the projection
$P_W$ as a matrix. This can be done with a little linear algebra, giving $P_W = X(X'X)^{-1}X'$ and
$\hat{\beta} = (X'X)^{-1}X'y$, where $X'$ denotes the transpose of X. But that linear algebra is not necessary to follow the geometric argument.
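The following minimal numerical sketch (ours, using the same toy design matrix as above) verifies the two expressions and the normal equation; numpy's own least squares routine is used only as an independent check:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(5), np.arange(5.0)])       # toy design matrix
y = X @ np.array([1.0, 2.0]) + rng.standard_normal(5)   # observed data vector

# Orthogonal projection onto W = col(X):  P_W = X (X'X)^{-1} X'
P_W = X @ np.linalg.inv(X.T @ X) @ X.T

# Least squares estimate:  beta_hat = (X'X)^{-1} X' y
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y

# P_W(y) = X beta_hat, and it agrees with numpy's least squares routine.
assert np.allclose(P_W @ y, X @ beta_hat)
assert np.allclose(beta_hat, np.linalg.lstsq(X, y, rcond=None)[0])
```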
3 – The Gauss Markov theorem
The statistical properties of this estimation method are established by the Gauss Markov
theorem: if the dispersion matrix of Y is $D(Y) = E\big[(Y - E[Y])(Y - E[Y])'\big] = \sigma^2 I$, then the least
squares estimator β̂ has the smallest total variance, where the total variance is the sum of the
variances of each component, among all linear unbiased estimators of β. The matrix D(Y) is also
known as the "covariance matrix". If a lot of values of the random vector Y are observed, they form a cloud of points in
the data space. We will call this cloud the dispersion cloud. The matrix D(Y) tells us about the
shape of this cloud. This can be seen in the following way. Consider a unit vector x. The
inner product $\langle Y, x \rangle = \|Y\| \cos(\theta)$, where θ is the angle between x and Y, defines a
one-dimensional random variable in the direction of x, centered on $\langle \mu, x \rangle = \|\mu\| \cos(\alpha)$,
where α is the angle between μ and x (Figure 2).
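To make the "centered on" claim explicit, here is a one-line computation (ours, not in the original text) that uses only the linearity of expectation and the fact that x is a unit vector:

$$E\big[\langle Y, x \rangle\big] = E[x'Y] = x'E[Y] = x'\mu = \langle \mu, x \rangle = \|\mu\|\,\|x\|\cos(\alpha) = \|\mu\|\cos(\alpha).$$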
The variance of this random variable gives a good idea of the width of the dispersion cloud in the
direction of x: $\mathrm{var}(\langle x, Y \rangle) = x' D(Y)\, x = \sigma^2 x'x = \sigma^2$. That is, the variability does not depend on the direction
of x, so all directions in the data space show the same variability. This means that the width of the
dispersion cloud must be almost the same in any direction, that is, the dispersion cloud has
approximate spherical symmetry. To delimit the cloud, we can take a sphere that contains, say,
approximately 75% of the cloud points. Furthermore, with high probability, such a cloud should be
closely centered at μ. Observe that, since $\mu \in W$, this spherical cloud must be almost
symmetric in relation to W. If the observed data vectors in the cloud are orthogonally projected
onto the subspace W, the projected cloud will also have almost spherical symmetry, centered close to μ.
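A small simulation (ours; the normal errors and the particular σ are illustrative assumptions, only D(Y) = σ²I matters) makes the spherical shape tangible: the projection of the cloud onto any unit direction x has mean ⟨μ, x⟩ and variance close to σ²:

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 5, 2.0
X = np.column_stack([np.ones(n), np.arange(float(n))])
mu = X @ np.array([1.0, 2.0])                      # mu = X beta lies in W

# Cloud of observed data vectors: Y = mu + error, with D(Y) = sigma^2 I.
Y = mu + sigma * rng.standard_normal((10_000, n))

# Project the cloud onto a few random unit directions x.
for _ in range(3):
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)                         # unit vector
    proj = Y @ x                                   # <x, Y> for each cloud point
    print(np.mean(proj) - mu @ x, np.var(proj))    # ~0 and ~sigma^2 = 4
```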
The task now is to visualize the dispersion cloud of the estimator β̂. The dispersion cloud
of the least squares estimator β̂ is obtained by taking, in the parameter space, the pre-image,
under the transformation X, of the dispersion cloud projected onto W. Since, in general, the
transformation X does not preserve distances, the pre-image will no longer have spherical
symmetry. With basic linear algebra it is possible to prove that this pre-image has
elliptical symmetry. Then the dispersion cloud of the least squares estimator β̂ is, with
high probability, approximately an ellipsoid centered at the true vector β, as shown in Figure 3.
Indeed, the pre-image of a sphere of constant radius centered at Xβ in W satisfies the equation
of an ellipsoid centered at β: $(X\hat{\beta} - X\beta)'(X\hat{\beta} - X\beta) = (\hat{\beta} - \beta)'X'X(\hat{\beta} - \beta) = \text{const}$.
The symmetry of the dispersion cloud of β̂ gives an intuitive notion that the least squares
estimator β̂ is unbiased, as in fact it is. Students should take away from this development that
unbiasedness has a simple geometric meaning: the dispersion cloud of an unbiased estimator is symmetric around the true parameter vector β.
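Continuing the same toy model (our sketch, not from the article), a simulation shows both facts at once: the cloud of β̂ values is centered at β, and its sample covariance is close to σ²(X′X)⁻¹, a matrix that is not a multiple of the identity, hence the elliptical rather than spherical symmetry:

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma = 5, 2.0
X = np.column_stack([np.ones(n), np.arange(float(n))])
beta = np.array([1.0, 2.0])
XtX_inv = np.linalg.inv(X.T @ X)

# Simulate many data vectors and collect the least squares estimates.
Y = X @ beta + sigma * rng.standard_normal((20_000, n))
beta_hat = Y @ (XtX_inv @ X.T).T                  # each row is one beta_hat

# Centered at beta (unbiasedness), with elliptical (not spherical) shape:
print(beta_hat.mean(axis=0))                      # ~ beta
print(np.cov(beta_hat, rowvar=False))             # ~ sigma^2 * (X'X)^{-1}
print(sigma**2 * XtX_inv)
```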
Now let us analyze the behavior of other linear unbiased estimators. Such estimators
have the form $\tilde{\beta} = LY$ for some matrix L, and unbiasedness requires $E[LY] = LX\beta = \beta$.
Since β is unknown, the equality must hold for every β; hence $LX = I$. It follows that the matrix XL is idempotent and is therefore
a projection onto W (Gentle 2007, p. 286). So, any other linear unbiased estimator is obtained in
the same way as the least squares estimator; that is, it is obtained through a linear projection onto W.
The distinction between them is that the projection is no longer orthogonal, as in the least squares
case. A non-orthogonal projection is called oblique. We will not give its precise
mathematical definition, but it is possible to get a good idea of how it works by looking at Figure 4.
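A quick numerical check (ours; the alternative left inverse L below, built from an arbitrary positive definite matrix A, is purely illustrative) confirms that any L with LX = I yields an idempotent matrix XL, hence a projection, which is in general not symmetric, that is, oblique:

```python
import numpy as np

X = np.column_stack([np.ones(5), np.arange(5.0)])

# One arbitrary alternative left inverse of X (illustrative choice only):
# L = (X' A X)^{-1} X' A for some positive definite A different from I.
A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
L = np.linalg.inv(X.T @ A @ X) @ X.T @ A

assert np.allclose(L @ X, np.eye(2))   # LX = I, so L Y is unbiased for beta
P = X @ L                              # XL maps the data space into W
assert np.allclose(P @ P, P)           # idempotent: P is a projection
print(np.allclose(P, P.T))             # False: not symmetric -> oblique
```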
Remembering once more that the dispersion cloud of Y is a sphere approximately centered at
the vector Xβ in W, the oblique projection of this sphere onto W is no longer spherical but elliptical.
The following analogy helps to see this. The shadow of a sphere projected by the sun at noon is
circular, with the same radius as the sphere. As the sun sets, the shadow becomes an ellipse
of increasing eccentricity. Another good idea is to perform an experiment using a flashlight to
make the projection and a styrofoam sphere to represent the dispersion cloud (Figure 5).
To compare the total variance of the least squares estimator with the total variance of other linear
unbiased estimators, consider the dispersion clouds obtained by the orthogonal and the oblique
projections onto the subspace W. The spherical cloud related to the least squares estimator is
entirely contained in the elliptical cloud obtained by the oblique projection.
Taking the pre-image under X, the same occurs for the dispersion clouds in the parameter space.
This shows, intuitively, that the total variance of the least squares estimator is lower than
the total variance of any other linear unbiased estimator. Therefore, we have an intuitive geometric demonstration of the Gauss Markov theorem.
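Continuing the toy example (our sketch), the total variances, that is, the traces of the covariance matrices σ²LL′ of the two linear unbiased estimators, can be compared directly, and the least squares estimator indeed attains the smaller value:

```python
import numpy as np

X = np.column_stack([np.ones(5), np.arange(5.0)])
sigma2 = 4.0

# Least squares: beta_hat = L0 Y with L0 = (X'X)^{-1} X'.
L0 = np.linalg.inv(X.T @ X) @ X.T

# Alternative linear unbiased estimator from the oblique projection above.
A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
L1 = np.linalg.inv(X.T @ A @ X) @ X.T @ A

# With D(Y) = sigma^2 I, the covariance of L Y is sigma^2 L L', and the
# total variance is its trace (sum of the component variances).
tv0 = sigma2 * np.trace(L0 @ L0.T)
tv1 = sigma2 * np.trace(L1 @ L1.T)
print(tv0, tv1, tv0 <= tv1)            # least squares wins: True
```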
4 – Conclusions
Although well known, geometry is not often used, even though it is the natural context for problems
related to least squares methods. The use of geometrical arguments is intuitive and illuminates the
statistical concepts. We believe that the geometrical approach to visualizing the Gauss Markov
theorem presented here improves students' comprehension of this fundamental result.
References:
Hoerl, A. E., and Kennard, R. W. (1970), "Ridge Regression: Biased Estimation for Nonorthogonal Problems," Technometrics, 12, 55–67.
Rao, C. R., and Toutenburg, H. (1999), Linear Models: Least Squares and Alternatives (2nd ed.), New York: Springer.
Zou, H., and Hastie, T. (2005), "Regularization and Variable Selection via the Elastic Net," Journal of the Royal Statistical Society, Series B, 67, 301–320.