SESQUILINEAR FORMS
BRIAN OSSERMAN
This is an alternative presentation of most of the material from 8.1, 8.2, 8.3, 8.4, 8.5 and 8.8
of Artin's book. Any terminology (such as sesquilinear form or complementary subspace) which
is discussed here but not in Artin is optional for the course, and will not appear on homework or
exams.
1. Sesquilinear forms
The dot product is an important tool for calculations in $\mathbb{R}^n$. For instance, we can use it to measure distance and angles. However, it doesn't come from just the vector space structure on $\mathbb{R}^n$: to define it implicitly involves a choice of basis. In Math 67, you may have studied inner products on real vector spaces and learned that this is the general context in which the dot product arises. Now, we will make a more general study of the extra structure implicit in dot products and, more generally, inner products. This is the subject of bilinear forms. However, not all forms of interest are bilinear. When working with complex vector spaces, one often works with Hermitian forms, which toss in an extra complex conjugation. In order to handle both of these cases at once, we'll work in the context of sesquilinear forms.
For convenience, we'll assume throughout that our vector spaces are finite-dimensional.
We first set up the background on field automorphisms.
Definition 1.1. Let $F$ be a field. An automorphism of $F$ is a bijection from $F$ to itself which preserves the operations of addition and multiplication. An automorphism $\sigma : F \to F$ is an involution if $\sigma \circ \sigma : F \to F$ is the identity map.
Example 1.2. Every field has at least one involution: the identity automorphism!
Example 1.3. Since for $z, z' \in \mathbb{C}$ we have $\overline{z + z'} = \bar{z} + \bar{z}'$ and $\overline{z z'} = \bar{z}\,\bar{z}'$, complex conjugation is an automorphism. Since $\bar{\bar{z}} = z$, it is an involution.
Definition 1.4. Fix an involution $c \mapsto \bar{c}$ of $F$, and let $V$ be a vector space over $F$. A sesquilinear form on $V$ is a map $\langle\,,\rangle : V \times V \to F$ satisfying
\[ \langle v_1 + v_2, w_1 \rangle = \langle v_1, w_1 \rangle + \langle v_2, w_1 \rangle, \qquad \langle v_1, w_1 + w_2 \rangle = \langle v_1, w_1 \rangle + \langle v_1, w_2 \rangle, \]
and
\[ \langle c v_1, w_1 \rangle = \bar{c} \langle v_1, w_1 \rangle, \qquad \langle v_1, c w_1 \rangle = c \langle v_1, w_1 \rangle, \]
for all $c \in F$ and $v_1, v_2, w_1, w_2 \in V$.
A special case of sesquilinear forms that works for any field arises when the involution is the
identity map.
Definition 1.7. If the chosen involution on F is the identity map, a sesquilinear form is called a
bilinear form.
You should keep the bilinear case in mind as the main situation of interest throughout.
A simple induction implies that sesquilinear forms are compatible with arbitrary linear combinations, as follows:
Proposition 1.8. If $\langle\,,\rangle$ is a sesquilinear form on $V$, then
\[ \Big\langle \sum_{i=1}^{n} b_i v_i, \sum_{j=1}^{m} c_j w_j \Big\rangle = \sum_{i,j} \bar{b}_i c_j \langle v_i, w_j \rangle. \]
Definition 1.9. Given a basis $(v_1, \ldots, v_n)$ of $V$, the matrix of $\langle\,,\rangle$ with respect to this basis is the $n \times n$ matrix $A = (a_{i,j})$ with $a_{i,j} = \langle v_i, v_j \rangle$.
Proposition 1.10. Let $A$ be the matrix of $\langle\,,\rangle$ with respect to the basis $(v_1, \ldots, v_n)$. If $v = \sum_i b_i v_i$ and $w = \sum_j c_j v_j$, then
\[ \langle v, w \rangle = \sum_{i,j} \bar{b}_i c_j \langle v_i, v_j \rangle = \sum_{i,j} \bar{b}_i c_j a_{i,j} = \begin{pmatrix} \bar{b}_1 & \cdots & \bar{b}_n \end{pmatrix} \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}. \]
Example 1.11. If our involution is the identity map, this gives a correspondence between bilinear
forms and matrices. In this case, the dot product associated to a given basis is simply the bilinear
form corresponding to the identity matrix.
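To make this concrete, here is a minimal computational sketch (Python with NumPy; the particular matrix, vectors, and helper name are illustrative choices, not part of the notes) evaluating a bilinear form on $\mathbb{R}^3$ from its matrix via the formula of Proposition 1.10, with the identity involution:

import numpy as np

# Illustrative sketch: a bilinear form on R^3 given by a symmetric matrix A,
# evaluated via <v, w> = v^T A w (Proposition 1.10 with the identity involution).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 1.0]])

def form(v, w, A=A):
    return v @ A @ w

v = np.array([1.0, 0.0, 2.0])
w = np.array([0.0, 1.0, -1.0])
print(form(v, w))                    # evaluates the form on v and w
print(form(v, w, np.eye(3)), v @ w)  # the identity matrix recovers the dot product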
Since the matrix A depends on a choice of basis, it is natural to consider what happens if we
change basis. This is described as follows.
Proposition 1.12. Let $\langle\,,\rangle$ be a sesquilinear form on $V$. Let $B = (v_1, \ldots, v_n)$ and $B' = (v_1', \ldots, v_n')$ be two bases of $V$, and suppose that $\langle\,,\rangle$ is represented by the matrix $A$ for the basis $B$, and by $A'$ for the basis $B'$. If $P$ is the change of basis matrix from $B$ to $B'$, so that $v_i' = \sum_{j=1}^{n} P_{j,i} v_j$ for $i = 1, \ldots, n$, then
\[ A' = P^* A P. \]
In particular, in the bilinear case we have $A' = P^t A P$.
Proof. According to the definition, we need to verify that $\langle v_i', v_j' \rangle = (P^* A P)_{i,j}$. By Proposition 1.10, we have
\[ \langle v_i', v_j' \rangle = \begin{pmatrix} \bar{P}_{1,i} & \cdots & \bar{P}_{n,i} \end{pmatrix} A \begin{pmatrix} P_{1,j} \\ \vdots \\ P_{n,j} \end{pmatrix}, \]
but the right-hand side is precisely $(P^* A P)_{i,j}$.
Warning 1.13. Since, given a choice of basis of an $n$-dimensional vector space $V$, every bilinear form on $V$ can be represented uniquely by an $n \times n$ matrix, and also every linear map from $V$ to itself can be represented uniquely by an $n \times n$ matrix, one might think that the theory of bilinear forms is no different from the theory of linear maps from $V$ to itself. However, they are not the same. We see this by considering what happens to the matrices in question if we change basis via an invertible matrix $P$. In the case of a linear map from $V$ to itself given by a matrix $A$, the matrix changes to $P^{-1} A P$. However, by the previous proposition, if instead $A$ is the matrix associated to a bilinear form, it changes to $P^t A P$.
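As a quick numerical illustration of the warning (a sketch with made-up matrices, not taken from the notes), the two transformation rules really do give different answers:

import numpy as np

# The same matrix A transforms differently depending on whether it represents
# a linear map (conjugate by P) or a bilinear form (congruence by P).
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
P = np.array([[1.0, 1.0],
              [0.0, 2.0]])            # an invertible change-of-basis matrix

print(np.linalg.inv(P) @ A @ P)       # P^{-1} A P
print(P.T @ A @ P)                    # P^t A P  (a different matrix)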
A more abstract way of expressing the difference is that instead of a linear map $V \to V$, a bilinear form naturally gives a linear map $V \to V^*$, where $V^*$ is the dual space of $V$, defined as the collection of linear maps $V \to F$.
A natural condition to consider on sesquilinear forms is the following:
Definition 1.14. A sesquilinear form $\langle\,,\rangle$ is symmetric if $\langle v, w \rangle = \overline{\langle w, v \rangle}$ for all $v, w \in V$.
Note that when the involution isn't the identity map, a symmetric sesquilinear form isn't quite symmetric, but it is as close as possible given the asymmetric nature of the definition of sesquilinear form. The special case of primary interest (other than bilinear forms) is the following:
Definition 1.15. If $F = \mathbb{C}$ and the involution is complex conjugation, then a symmetric sesquilinear form is called a Hermitian form.
Why did we introduce the complex conjugation rather than simply sticking with bilinearity? One reason is that with it, we see that for any $v \in V$, we have $\langle v, v \rangle = \overline{\langle v, v \rangle}$, so we conclude that $\langle v, v \rangle$ is fixed under complex conjugation, and is therefore a real number. This means we can impose further conditions by considering whether $\langle v, v \rangle$ is always positive, for instance. Another reason is that with this definition, if we multiply both vectors by a complex number of length 1, such as $i$, we find that $\langle v, w \rangle$ doesn't change.
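For instance, if $\lambda \in \mathbb{C}$ has $|\lambda| = 1$ (such as $\lambda = i$), then for any sesquilinear form
\[ \langle \lambda v, \lambda w \rangle = \bar{\lambda} \lambda \langle v, w \rangle = |\lambda|^2 \langle v, w \rangle = \langle v, w \rangle, \]
whereas for a bilinear form over $\mathbb{C}$ the corresponding factor would be $\lambda^2$, which is $-1$ for $\lambda = i$.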
Example 1.16. On $\mathbb{C}^n$ we have the standard Hermitian form defined by
\[ \langle (x_1, \ldots, x_n), (y_1, \ldots, y_n) \rangle = \bar{x}_1 y_1 + \cdots + \bar{x}_n y_n. \]
This has the property that $\langle v, v \rangle > 0$ for any nonzero $v$.
In fact, if we write $x_j = a_j + i b_j$ and $y_j = c_j + i d_j$ (thus identifying $\mathbb{C}^n$ with $\mathbb{R}^{2n}$), we see that
\[ \langle (x_1, \ldots, x_n), (y_1, \ldots, y_n) \rangle = (a_1 - i b_1)(c_1 + i d_1) + \cdots + (a_n - i b_n)(c_n + i d_n) \]
\[ = (a_1 c_1 + b_1 d_1) + \cdots + (a_n c_n + b_n d_n) + i \big( (a_1 d_1 - b_1 c_1) + \cdots + (a_n d_n - b_n c_n) \big). \]
Notice that the real part of this is just the usual dot product on $\mathbb{R}^{2n}$. In particular, we get
\[ \langle (x_1, \ldots, x_n), (x_1, \ldots, x_n) \rangle = (a_1^2 + b_1^2) + \cdots + (a_n^2 + b_n^2) \]
(the imaginary term cancels out). This is just the usual dot product of the vector with itself in $\mathbb{R}^{2n}$, which calculates the square of the length of the vector.
Here is a different example of a symmetric bilinear form.
Example 1.17. Using the standard basis on $F^2$, if we consider the bilinear form we obtain from the matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, we see that
\[ \left\langle \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}, \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \right\rangle = b_1 c_1 - b_2 c_2. \]
This is still a symmetric form, but looks rather different from the dot product, since $\left\langle \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\rangle = -1$.
To study how symmetry of a sesquilinear form is reflected in its associated matrix, we define:
Definition 1.18. A matrix $A$ is self-adjoint if $A^* = A$, where $A^*$ denotes the adjoint of $A$ (the transpose with the involution applied to each entry). In the special case that $F = \mathbb{C}$ and the involution is complex conjugation, a self-adjoint matrix is also called Hermitian.
Example 1.19. If the involution is the identity map, being self-adjoint is the same as being
symmetric.
Just as with the transpose, the adjoint satisfies the property that $(AB)^* = B^* A^*$ for any matrices $A, B$.
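(To check this, one can expand entries: $((AB)^*)_{i,j} = \overline{(AB)_{j,i}} = \overline{\sum_k A_{j,k} B_{k,i}} = \sum_k \overline{B_{k,i}}\,\overline{A_{j,k}} = \sum_k (B^*)_{i,k} (A^*)_{k,j} = (B^* A^*)_{i,j}$, using that the involution preserves sums and products.)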
The following proposition describes how symmetry of a form carries over to its associated matrix.
Proposition 1.20. Let $\langle\,,\rangle$ be a sesquilinear form described by a matrix $A$ for a basis $(v_1, \ldots, v_n)$. Then the following are equivalent:
(1) $\langle\,,\rangle$ is symmetric;
(2) $A$ is self-adjoint;
(3) For all $i, j$ we have $\langle v_i, v_j \rangle = \overline{\langle v_j, v_i \rangle}$.
Proof. If we write $A = (a_{i,j})$, then by definition $\langle v_i, v_j \rangle = a_{i,j}$ for all $i, j$, so it is immediate that (2) and (3) are equivalent.
Next, it is clear from the definition that (1) implies (3). Finally, if we assume (3), and we have vectors $v = \sum_i b_i v_i$ and $w = \sum_i c_i v_i$ in $V$, we have
\[ \langle v, w \rangle = \Big\langle \sum_i b_i v_i, \sum_j c_j v_j \Big\rangle = \sum_{i,j} \bar{b}_i c_j \langle v_i, v_j \rangle, \]
while
\[ \overline{\langle w, v \rangle} = \overline{\Big\langle \sum_i c_i v_i, \sum_j b_j v_j \Big\rangle} = \overline{\sum_{i,j} \bar{c}_i b_j \langle v_i, v_j \rangle} = \sum_{i,j} c_i \bar{b}_j \overline{\langle v_i, v_j \rangle}, \]
and assuming (3), we see that this is equal to $\langle v, w \rangle$. Thus, we conclude that (3) implies (1), so all of the conditions are equivalent.
We now discuss skew-symmetry. Although there is a notion of skew-symmetry for Hermitian
forms, it has rather different behavior from the case of bilinear forms, so whenever we talk about
skew-symmetry, we will restrict to the bilinear case.
Definition 1.21. A bilinear form is skew-symmetric if $\langle v, v \rangle = 0$ for all $v \in V$.
This terminology is justified by the following.
Proposition 1.22. If $\langle\,,\rangle$ is a skew-symmetric bilinear form, then $\langle v, w \rangle = -\langle w, v \rangle$ for all $v, w \in V$. The converse is true if $2 \neq 0$ in $F$.
Proof. If $\langle\,,\rangle$ is skew-symmetric, then for any $v, w$ we have
\[ 0 = \langle v + w, v + w \rangle = \langle v, v \rangle + \langle v, w \rangle + \langle w, v \rangle + \langle w, w \rangle = \langle v, w \rangle + \langle w, v \rangle, \]
so $\langle v, w \rangle = -\langle w, v \rangle$.
Conversely, suppose that $\langle v, w \rangle = -\langle w, v \rangle$ for all $v, w \in V$. Then for any $v \in V$, we have $\langle v, v \rangle = -\langle v, v \rangle$, so $2 \langle v, v \rangle = 0$, and as long as $2 \neq 0$ in $F$ we can cancel the 2 to get $\langle v, v \rangle = 0$.
Warning 1.23. We follow Artin's terminology for skew-symmetric forms (see 8.8). However, some other sources (such as, for instance, Wikipedia) define a form to be skew-symmetric if $\langle v, w \rangle = -\langle w, v \rangle$ for all $v, w \in V$, and call it alternating if $\langle v, v \rangle = 0$ for all $v \in V$. Of course, according to the proposition these are the same as long as $2 \neq 0$ in $F$.
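For example, over the field $\mathbb{F}_2$ with two elements (where $2 = 0$), the bilinear form on $\mathbb{F}_2^2$ given by $\langle (b_1, b_2), (c_1, c_2) \rangle = b_1 c_1 + b_2 c_2$ satisfies $\langle v, w \rangle = \langle w, v \rangle = -\langle w, v \rangle$ for all $v, w$ (since $1 = -1$), yet $\langle (1, 0), (1, 0) \rangle = 1 \neq 0$; so the converse direction of Proposition 1.22 genuinely needs $2 \neq 0$.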
Example 1.24. Still using the standard basis on $F^2$, if we consider the bilinear form we obtain from the matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, we see that
\[ \left\langle \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}, \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \right\rangle = b_1 c_2 - b_2 c_1. \]
\[ \langle v, v \rangle = \sum_{i,j} b_i b_j \langle v_i, v_j \rangle = \sum_{i<j} b_i b_j \big( \langle v_i, v_j \rangle + \langle v_j, v_i \rangle \big) + \sum_i b_i b_i \langle v_i, v_i \rangle = 0. \]
Thus, we see that (3) implies (1), so all the conditions are equivalent.
What do we mean by this? We mean that even allowing for going from real to complex numbers,
we will not obtain any non-real eigenvalues of a real symmetric matrix. Equivalently, all complex
roots of the characteristic polynomial are in fact real. We will prove a stronger version of this
statement soon.
Note that even though the statement of the corollary is just in terms of real numbers, it is natural
to work with complex vectors (and therefore Hermitian forms) in proving the statement, since one
has to consider the possibility of complex eigenvectors and eigenvalues in order to prove that in the
end, everything is real.
3. Orthogonality (8.4)
In this section, we have the following basic situation:
Situation 3.1. We have a sesquilinear form $\langle\,,\rangle$ with the property that $\langle v, w \rangle = 0$ if and only if $\langle w, v \rangle = 0$.
This condition is satisfied if the form is symmetric (even in the general sesquilinear sense), or if it is a skew-symmetric bilinear form. Later, we will specialize to the symmetric case.
Definition 3.2. Two vectors $v, w$ are orthogonal (written $v \perp w$) if $\langle v, w \rangle = 0$.
Thus, by hypothesis we have $\langle v, w \rangle = 0$ if and only if $\langle w, v \rangle = 0$.
In this generality, we may well have $v \perp v$ even if $v \neq 0$ (indeed, this will always be the case for skew-symmetric forms, but it can occur also in the symmetric or Hermitian cases). Thus, it is better not to try to place too much geometric significance on the notion of orthogonality, even though it is very useful.
Definition 3.3. If $W \subseteq V$ is a subspace, the orthogonal space $W^\perp$ to $W$ is defined by
\[ W^\perp = \{ v \in V : v \perp w \text{ for all } w \in W \}; \]
this is a subspace of $V$.
Definition 3.4. A basis $(v_1, \ldots, v_n)$ of $V$ is orthogonal if $v_i \perp v_j$ for all $i \neq j$.
Definition 3.5. A null vector in $V$ is a vector which is orthogonal to every $v \in V$. The nullspace is the set (which is a subspace) of all null vectors.
We see that the nullspace can also be described as $V^\perp$.
Definition 3.6. The form $\langle\,,\rangle$ is nondegenerate if its nullspace is $\{0\}$. If the form is not nondegenerate, it is degenerate.
Thus, the form is nondegenerate if and only if for every nonzero $v \in V$, there is some $v' \in V$ with $\langle v, v' \rangle \neq 0$.
Definition 3.7. Given a subspace $W \subseteq V$, the form $\langle\,,\rangle$ is nondegenerate on $W$ if $W \cap W^\perp = \{0\}$.
Thus, $\langle\,,\rangle$ is nondegenerate on $W$ if, for every nonzero $w \in W$, there is some $w' \in W$ such that $\langle w, w' \rangle \neq 0$. We may reexpress this as follows: we can define the restriction of $\langle\,,\rangle$ to $W$ to be the form on $W$ obtained by the inclusion of $W$ into $V$, forgetting what happens for vectors not in $W$. Then $\langle\,,\rangle$ is nondegenerate on $W$ if and only if the restriction to $W$ is a nondegenerate form.
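For example, take $V = \mathbb{R}^2$ with the symmetric bilinear form of Example 1.17, $\langle (b_1, b_2), (c_1, c_2) \rangle = b_1 c_1 - b_2 c_2$, and let $W$ be the span of $(1, 1)$. Then $\langle (1, 1), (1, 1) \rangle = 0$, so the restriction of the form to $W$ is the zero form: the form is degenerate on $W$, even though it is nondegenerate on $V$.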
The following is immediate from the definitions:
Proposition 3.8. The matrix of the form with respect to a basis B is diagonal if and only if B is
orthogonal, and in this case the form is nondegenerate if and only if none of the diagonal entries
are equal to 0.
Remark 3.9. Note that this means that a nonzero skew-symmetric form can never have an orthogonal basis, since the matrix for a skew-symmetric form always has all 0's on the diagonal.
The following proposition gives us a method for testing when two vectors are equal.
Proposition 3.10. Suppose that $\langle\,,\rangle$ is nondegenerate, and $v, v' \in V$. If $\langle v, w \rangle = \langle v', w \rangle$ for all $w \in V$, then $v = v'$.
Proof. If $\langle v, w \rangle = \langle v', w \rangle$, then $\langle v - v', w \rangle = 0$, so $(v - v') \perp w$. If this is true for all $w \in V$, we conclude that $v - v' \in V^\perp$, so by nondegeneracy of $\langle\,,\rangle$, we must have $v - v' = 0$.
Next we relate null vectors and nondegeneracy to the matrix describing the form.
Proposition 3.11. Suppose that we have a basis $(v_1, \ldots, v_n)$ of $V$, and $A$ is the matrix for $\langle\,,\rangle$ in terms of this basis. Then:
(1) A vector $v = \sum_i b_i v_i$ is a null vector if and only if
\[ A \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = 0. \]
(2) The form $\langle\,,\rangle$ is nondegenerate if and only if $A$ is invertible.
Proof. (1) Since for $w = \sum_i c_i v_i$, we have
\[ \langle w, v \rangle = \begin{pmatrix} \bar{c}_1 & \cdots & \bar{c}_n \end{pmatrix} A \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}, \]
it is clear that if $A (b_1, \ldots, b_n)^t = 0$, then $v$ is a null vector. Conversely, if $w = v_i$, then we see that $\langle w, v \rangle$ is equal to the $i$th coordinate of $A (b_1, \ldots, b_n)^t$, so if $v$ is a null vector, we conclude that each coordinate of $A (b_1, \ldots, b_n)^t$ is equal to 0, and therefore that $A (b_1, \ldots, b_n)^t = 0$.
(2) By definition, $\langle\,,\rangle$ is nondegenerate if and only if there is no nonzero null vector, and by part (1) this is equivalent to $AY = 0$ not having any nonzero solutions ($Y$ an $n \times 1$ column vector). But this in turn is equivalent to $A$ being invertible, since it is a square matrix.
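As a computational aside (a Python/NumPy sketch with an illustrative matrix, not part of the notes), Proposition 3.11 turns both questions into standard matrix computations:

import numpy as np

# A symmetric bilinear form on R^3 whose matrix is singular, hence degenerate.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])      # third row = first row + second row

print(np.linalg.matrix_rank(A))      # 2 < 3, so A is not invertible: degenerate

# Null vectors solve A b = 0; read one off from the singular value decomposition.
_, s, Vt = np.linalg.svd(A)
b = Vt[-1]                           # right singular vector for the zero singular value
print(np.allclose(A @ b, 0))         # True: b is the coordinate vector of a null vector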
Since we have not yet specialized to the symmetric/Hermitian case, the following theorem includes both Theorem 8.4.5 and Theorem 8.8.6 of Artin. I have attempted to give a more conceptual
proof than Artin does.
Theorem 3.12. Let $W \subseteq V$ be a subspace.
(1) $\langle\,,\rangle$ is nondegenerate on $W$ if and only if $V$ is the direct sum $W \oplus W^\perp$.
(2) If $\langle\,,\rangle$ is nondegenerate on $V$ and on $W$, then it is nondegenerate on $W^\perp$.
If $W_1, W_2 \subseteq V$ are subspaces, what does it mean to say $V$ is $W_1 \oplus W_2$? The more abstract way to say it is that we always have the vector space $W_1 \oplus W_2$, and this always has a natural map to $V$, coming from the inclusions of $W_1$ and $W_2$ into $V$. Namely, if $(w_1, w_2) \in W_1 \oplus W_2$, then it maps to $w_1 + w_2$ in $V$. We say that $V$ is $W_1 \oplus W_2$ if this natural map is an isomorphism. More concretely, this is the same thing as saying that every vector in $V$ can be written uniquely as $w_1 + w_2$ for $w_1 \in W_1$, $w_2 \in W_2$. This breaks down into two statements: first, that $W_1 \cap W_2 = \{0\}$, and second, that $W_1 + W_2 = V$.
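For example, take $V = F^2$ with the symmetric bilinear form given by the matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, and let $W$ be the span of $e_1 = (1, 0)$. Then $\langle e_1, e_1 \rangle = 0$, and in fact $W^\perp = W$, so $W \cap W^\perp = W \neq \{0\}$ and $V$ is not $W \oplus W^\perp$. This is consistent with Theorem 3.12: the restriction of the form to $W$ is the zero form, so the form is degenerate on $W$.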
We will prove a couple of preliminary facts before giving the proof of the theorem.
Proposition 3.13. If $W \subseteq V$ is a subspace, and $(w_1, \ldots, w_m)$ is a basis of $W$, then $v \in V$ is in $W^\perp$ if and only if $v \perp w_i$ for $i = 1, \ldots, m$.
Proof. Certainly if $v \in W^\perp$, then $v \perp w_i$ for all $i$. Conversely, if $v \perp w_i$ for all $i$, then for any $w \in W$, write $w = \sum_i c_i w_i$, and we have
\[ \langle v, w \rangle = \sum_i c_i \langle v, w_i \rangle = 0. \]
Lemma 3.14. If $W \subseteq V$ is a subspace, then
\[ \dim W^\perp \geq \dim V - \dim W. \]
Proof. Suppose $(w_1, \ldots, w_m)$ is a basis of $W$. We use $\langle\,,\rangle$ to construct a linear map $\varphi : V \to F^m$ by sending $v$ to $(\langle w_1, v \rangle, \ldots, \langle w_m, v \rangle)$. Since $\langle\,,\rangle$ is linear on the right-hand side, this is indeed a linear map. Moreover, we see that the kernel of the map is precisely the set of vectors $v \in V$ such that $v \perp w_i$ for all $i$, which by Proposition 3.13 is exactly $W^\perp$. Now, the image of this map is contained in $F^m$, so has dimension at most $m$. Since $\dim \ker \varphi + \dim \operatorname{im} \varphi = \dim V$, we conclude $\dim W^\perp + \dim \operatorname{im} \varphi = \dim V$, so
\[ \dim W^\perp \geq \dim V - m = \dim V - \dim W, \]
as desired.
Proof of Theorem 3.12. For (1), first recall that $\langle\,,\rangle$ is nondegenerate on $W$ if and only if $W \cap W^\perp = \{0\}$. Thus, if $V = W \oplus W^\perp$, then $\langle\,,\rangle$ is nondegenerate on $W$, and to prove the converse, we need to check that if $\langle\,,\rangle$ is nondegenerate on $W$, then the map $W \oplus W^\perp \to V$ is an isomorphism. But the kernel of this map is the set of vectors of the form $(w, -w)$, where $w \in W \cap W^\perp$, so if $\langle\,,\rangle$ is nondegenerate on $W$, we have that the map is injective. Thus, we have that the dimension of the image is equal to $\dim(W \oplus W^\perp) = \dim W + \dim W^\perp$. But by Lemma 3.14, this is at least $\dim W + \dim V - \dim W = \dim V$. On the other hand, the image is contained in $V$, so it can have dimension at most $\dim V$, and we conclude that the image dimension is exactly $\dim V$, and hence that the image is all of $V$, which is what we wanted to show.
For (2), given any nonzero $w \in W^\perp$, we wish to show that there is some $w' \in W^\perp$ such that $\langle w, w' \rangle \neq 0$. Because we have assumed nondegeneracy on $V$, there is some $v \in V$ such that $\langle w, v \rangle \neq 0$. By (1) (using nondegeneracy on $W$), we can write $v = v' + w'$, where $v' \in W$ and $w' \in W^\perp$. But then since $v' \in W$ and $w \in W^\perp$, we have
\[ 0 \neq \langle w, v \rangle = \langle w, v' + w' \rangle = \langle w, v' \rangle + \langle w, w' \rangle = 0 + \langle w, w' \rangle = \langle w, w' \rangle, \]
as desired.
Here is a different point of view on the first part of the theorem: it says that if a form $\langle\,,\rangle$ is nondegenerate on $W$, then we always have a natural projection from $V$ to $W$. Here are the relevant definitions:
Definition 3.15. If $W \subseteq V$ is a subspace, a projection $\pi : V \to W$ is a linear map such that $\pi(w) = w$ for all $w \in W$.
A complementary subspace for $W$ is a subspace $W' \subseteq V$ such that $V = W \oplus W'$.
Thus, the theorem says that if $\langle\,,\rangle$ is nondegenerate on $W$, then $W^\perp$ is a complementary subspace for $W$. This is related to projections as follows:
\[ \langle w + w', w + w' \rangle = \langle w, w \rangle + 2 \langle w, w' \rangle + \langle w', w' \rangle, \]
and since $2 \neq 0$ we have that $2 \langle w, w' \rangle \neq 0$, so at least one other term in the equation must be nonzero. Thus we see that we can get the desired $v$ as (at least one of) $w$, $w'$, or $w + w'$.
Theorem 4.2. $\langle\,,\rangle$ has an orthogonal basis.
Proof. We prove the statement by induction on $\dim V$. For the base case $V = 0$, we may use the empty basis. Suppose then that $\dim V = n$, and we know the theorem for all vector spaces of dimension less than $n$. If $\langle\,,\rangle$ is the zero form, then any basis is orthogonal. Otherwise, by Lemma 4.1, there is some $v \in V$ with $\langle v, v \rangle \neq 0$. Then let $W$ be the subspace of $V$ spanned by $v$. Since this is 1-dimensional and $\langle v, v \rangle \neq 0$, we see that $\langle\,,\rangle$ is nondegenerate on $W$. By Theorem 3.12, we have $V = W \oplus W^\perp$. Now, $W^\perp$ has dimension $n - 1$, so by induction it has an orthogonal basis $(v_1, \ldots, v_{n-1})$. But then if we set $v_n = v$, we see that we get an orthogonal basis of $V$.
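The proof is constructive, and can be turned into an algorithm. Here is a minimal sketch (Python with NumPy, not part of the notes, written only for a real symmetric matrix $A$; the function name and tolerance are illustrative) which mimics the induction: find $v$ with $\langle v, v \rangle \neq 0$, split off its span, and recurse on its orthogonal complement.

import numpy as np

def orthogonal_basis(A, tol=1e-10):
    # Sketch: given a real symmetric matrix A (the matrix of a symmetric bilinear
    # form on R^n), return a basis v_1, ..., v_n with v_i^T A v_j = 0 for i != j,
    # following the inductive proof of Theorem 4.2.
    n = A.shape[0]

    def recurse(S):
        # Columns of S form a basis of the current subspace W.
        k = S.shape[1]
        if k == 0:
            return []
        M = S.T @ A @ S                          # matrix of the form restricted to W
        if np.allclose(M, 0, atol=tol):
            return [S[:, i] for i in range(k)]   # zero form: any basis is orthogonal
        # Lemma 4.1: some e_i or e_i + e_j has nonzero <v, v> for the restricted form.
        coords = [np.eye(k)[:, i] for i in range(k)]
        coords += [c + d for c in coords for d in coords]
        c = next(c for c in coords if abs(c @ M @ c) > tol)
        v = S @ c                                # vector in W with <v, v> != 0
        # Orthogonal complement of span(v) inside W: solve (v^T A S) x = 0.
        row = (v @ A @ S).reshape(1, k)
        _, _, Vt = np.linalg.svd(row)
        complement = S @ Vt[1:].T                # k - 1 independent vectors orthogonal to v
        return [v] + recurse(complement)

    return recurse(np.eye(n))

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])                       # symmetric, with all diagonal entries zero
B = np.column_stack(orthogonal_basis(A))
print(np.round(B.T @ A @ B, 6))                  # diagonal matrix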
Remark 4.3. Although we pointed out that as a consequence of Proposition 3.8, a nonzero skew-symmetric form can never have an orthogonal basis, there is a sort of substitute which allows us to put the matrix for the form into a standard, close-to-diagonal form. This is off topic for us, but see Theorem 8.8.7 of Artin if you're curious.
We now examine how orthogonal bases give us explicit projection formulas.
That is, for any orthogonal basis, there is a simple method of finding how to express any vector in terms of the basis, just using the given form. (Compare to the usual case, where figuring out how to express a vector in terms of a basis involves inverting the change-of-basis matrix.)
Example 4.6. Say we want to find the formula for orthogonal projection to the line $(t, 2t, 3t)$ in $\mathbb{R}^3$ under the dot product. That is, given $(x, y, z)$, give a formula for the value of $t$ such that $(t, 2t, 3t)$ is closest to $(x, y, z)$. We have implicitly chosen $(1, 2, 3)$ as the basis for the line, so according to Theorem 4.4, we find that the value of $t$ we want is
\[ \frac{\langle (1, 2, 3), (x, y, z) \rangle}{\langle (1, 2, 3), (1, 2, 3) \rangle} = \frac{x + 2y + 3z}{1 + 4 + 9} = \frac{x + 2y + 3z}{14}. \]
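As a quick numerical check (a sketch with an arbitrary sample point, not from the notes), the residual $(x, y, z) - t(1, 2, 3)$ is indeed orthogonal to the line:

import numpy as np

u = np.array([1.0, 2.0, 3.0])     # basis vector for the line
p = np.array([4.0, -1.0, 7.0])    # a sample point (x, y, z)

t = np.dot(u, p) / np.dot(u, u)   # (x + 2y + 3z) / 14
print(t)                          # 23/14
print(np.dot(u, p - t * u))       # 0 (up to rounding): the residual is orthogonal to u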
Definition 4.7. An orthonormal basis is an orthogonal basis $(v_1, \ldots, v_n)$ such that $\langle v_i, v_i \rangle = 1$ for all $i$.
Thus, a given bilinear form has an orthonormal basis if and only if it can be thought of as being
the dot product with respect to that basis (and for Hermitian forms, the same is true with the
standard Hermitian form in place of the dot product).
Note that given an orthonormal basis, the formulas of Theorem 4.4 and Corollary 4.5 simplify
further, because the denominators are all equal to 1.
We now specialize even further, to the case that we either have a real symmetric bilinear form,
or a Hermitian form. In this situation, we can normalize our basis further. We first define:
Definition 4.8. A real symmetric bilinear form or Hermitian form is positive definite if $\langle v, v \rangle > 0$ for all nonzero $v \in V$.
We then have:
Corollary 4.9. If $\langle\,,\rangle$ is a real symmetric bilinear form or a Hermitian form on $V$, then there exists an orthogonal basis $(v_1, \ldots, v_n)$ such that $\langle v_i, v_i \rangle$ is either $1$, $-1$, or $0$.
We have $\langle\,,\rangle$ positive definite if and only if it has an orthonormal basis.
Proof. If we scale $v_i$ by $c$, then $\langle v_i, v_i \rangle$ scales by $c^2$ in the real case, and by $\bar{c} c$ in the Hermitian case. Thus, we can scale $\langle v_i, v_i \rangle$ by an arbitrary positive real number. Since $\langle v_i, v_i \rangle$ starts off as a real number, we get what we want.
For the next assertion, if $(v_1, \ldots, v_n)$ is an orthonormal basis, and $v = \sum_i c_i v_i$, then
\[ \langle v, v \rangle = \begin{cases} \sum_i c_i^2 & \text{real case} \\ \sum_i \bar{c}_i c_i & \text{Hermitian case.} \end{cases} \]
In either case, it is visibly positive definite.
Conversely, if $\langle\,,\rangle$ is positive definite, and $(v_1, \ldots, v_n)$ is a basis as in Corollary 4.9, then we see that we must have $\langle v_i, v_i \rangle = 1$ for all $i$, so $(v_1, \ldots, v_n)$ is an orthonormal basis.
Thus, being positive definite is equivalent to being the dot product (respectively, standard Hermitian form) with respect to some basis.
Finally, we also see that for positive definite forms, we don't need to worry about degeneracy:
Corollary 4.10. If $\langle\,,\rangle$ is a positive definite real symmetric bilinear or Hermitian form on $V$, then for every subspace $W \subseteq V$, we have $\langle\,,\rangle$ nondegenerate on $W$, and in particular, $V = W \oplus W^\perp$.
Proof. We see directly that the only null vector in $W$ is 0: if $w \in W$ is nonzero, then by the definition of positive definite, we have $\langle w, w \rangle \neq 0$, so $w$ is not a null vector. Thus, $\langle\,,\rangle$ is nondegenerate on $W$, so $V = W \oplus W^\perp$ by Theorem 3.12.