MITOCW - MITRES - 18-007 - Part4 - Lec1 - 300k.mp4: Herbert Gross
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
HERBERT GROSS: Hi. As I was standing here wondering how to begin today's lesson, an old story came to mind of the professor who passed out an examination to his class, and one
of the students said, "Professor, this is the same test you gave us last week". And
the professor said, "I know, but this time I changed the answers." And I was thinking
of this in terms of the fact that much of the new mathematics is essentially the old
mathematics with some of the answers changed.
One of the topics that we used to belittle in the traditional curriculum, because it was too easy, was the topic called linear equations. And it turns out that in the study of modern mathematics, linear equations have taken on a new importance.
I could've called today's lesson something old, something new, meaning that the old topic that we were going to revisit would be that of linear functions, and the new topic would be how it manifests itself in the modern curriculum, in the sense that one introduces a subject called linear algebra, or matrix algebra, as a standard portion of a modern calculus course, whereas in the traditional calculus courses it essentially never appeared.
Instead I picked a more conservative title for today's lesson, I simply call it "Linearity
Revisited". And as I say, it goes back to when we were in junior high school or high
school, when we were taught that linear functions were very nice. For example,
given the equation y equals mx plus b-- the linear equation meaning what? It graphs as a straight line, and the two variables are related linearly: y is a constant multiple of x, plus a constant. We were told to solve for x in terms of y. And what we found was that if y equals mx
plus b, this was true if and only if x was equal to y minus b over m. What we showed was that to a given value of x there corresponded a unique value of y, and conversely, to a given value of y there corresponded a unique value of x.
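In symbols-- my summary here, not the professor's words-- the claim is that, provided m is not 0,

$$y = mx + b \iff x = \frac{y - b}{m},$$

so a linear function with m not equal to 0 is invertible, and its inverse is again linear.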
And to put this into the language of functions, what we were saying was that if f(x)
equals mx plus b, then f inverse exists. In other words, what we're saying is that no
two different x values can give you the same y value if the function has the form y
equals mx plus b. And just about the time that we were learning to enjoy this kind of
an equation our dream world was shattered, and we were told it's too bad, but most
functions aren't linear.
We were given things like y equals x to the seventh plus x to the fifth, and we found
that we couldn't solve for x very conveniently in terms of y. And that's what began
our intermediate algebra and advanced algebra courses. In other words, the fact
that most functions are non-linear. Now an interesting thing occurred though. Let
me just emphasize this. And this is the key point.
In terms of calculus, we discovered-- and here's a key word coming up-- most functions are locally linear. Now that sounds a little bit like a tongue twister, but actually we saw this back in the first part of the course, when we talked about delta y sub tan. We were saying that to study f(x) near x equals a, we saw that f(a + delta x) minus f(a) was equal to f prime of a times delta x, plus k delta x, where the limit of k, as delta x went to 0, was 0 itself-- provided, of course, that f was differentiable at x equals a; otherwise you couldn't write down f prime of a here.
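Written out-- my transcription of the board equation-- this says:

$$\Delta f = f(a + \Delta x) - f(a) = f'(a)\,\Delta x + k\,\Delta x, \qquad \lim_{\Delta x \to 0} k = 0.$$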
The interesting point is this. If you look just at the first term over here, it expresses delta f as a linear function of delta x. The part that makes this thing non-linear is the term called k delta x, but that's the term that's going to 0 as a second-order infinitesimal. So what we're really saying is this: provided that f is differentiable at x equals a, then locally-- meaning near x equals a-- we can say that delta f is approximately f prime of a times delta x. That's what we call delta f sub tan, recall. And what we mean by approximately here
is that error k delta x goes to 0 very, very rapidly as delta x goes to 0. And what we
mean by locally is this: suppose f prime exists also when x equals b. We can again compute delta f near x equals b. Now delta f is equal to what? Approximately f prime of b times delta x, plus that error term which goes to 0 very rapidly.
We again call this thing here delta f tan, but the thing to keep in mind is that since f prime of a need not equal f prime of b, delta f tan is different at a and at b. In other words, even though it's always true, where f is differentiable, that delta f is approximately delta f tan, the value of delta f tan depends on the value of x that we're near. And that's what we mean by saying that approximating delta f by delta f tan is only a local approximation.
Now I think that sometimes by putting these things into words it sounds harder than
it really is. So I think what might be nice is if we just look at a specific illustration, a
problem which I deliberately picked to be as simple a nonlinear example as I can
think of.
Let me come back to our old friend, the function f(x) equals x squared, which as I
say, is about as simple a non-linear function we can get into. Now we know that f(x)
equals x squared plots as the curve y equals x squared, the parabola. Let's take a
couple of points on this parabola-- say the point (1, 1) and the point (2, 4)-- and draw in the tangent lines to the curve at these two points. And we know what? That
the equation of the tangent line to the curve at (1, 1) is y minus 1 over x minus 1
equals the slope. Since y is equal to x squared the slope is 2x, when x is 1 the slope
is 2. So the equation of this tangent line is given by y minus 1 over x minus 1 equals
2.
Similarly, the equation of the tangent line to the curve at (2, 4) is y minus 4 over x minus 2 equals 4. So now I've introduced three functions that I can talk about: my original function, f(x) is x squared; the straight line tangent at (2, 4), which, solving for y in terms of x, is the linear function g(x) equals 4x minus 4; and the straight line tangent at (1, 1), which corresponds to the function h(x) equals 2x minus 1.
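A quick numeric sanity check-- my own sketch in Python, not something from the lecture-- shows the tangent lines tracking the parabola:

```python
# Compare f(x) = x^2 with its tangent lines: h at (1, 1), g at (2, 4).

def f(x): return x * x        # the non-linear function
def h(x): return 2 * x - 1    # tangent line at x = 1
def g(x): return 4 * x - 4    # tangent line at x = 2

for dx in (0.1, 0.01, 0.001):
    err_h = f(1 + dx) - h(1 + dx)   # error near x = 1
    err_g = f(2 + dx) - g(2 + dx)   # error near x = 2
    print(dx, err_h, err_g)         # both errors are exactly dx**2
```

In both cases the error is dx squared: it goes to 0 like a second-order infinitesimal, which is the sense in which each tangent line can replace the curve locally.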
Now the interesting point, of course, is that these two functions here are linear.
They are completely different functions. Notice not only pictorially are they different,
but algebraically their slopes are different and their y-intercepts are different. Back in part one of our course, we talked about this geometrically, saying that near the point of tangency, the tangent line serves as a good approximation to the curve itself.
What were we really saying then? What we were saying was that near the point of
tangency, g(x), which was a linear function, could replace f(x) which was a nonlinear
function.
Of course, when we moved too far away from a given point, then if we wanted to say that f(x) still had a linear approximation, we had to pick a different linear function. By the
way, again because we were dealing with one independent variable and one
dependent variable, it was very easy to invent the concept of a graph. As we shall show in a little while, the concept of linearity extends to several variables, but you can't
draw the graph as nicely.
So let me now revisit the same result here, only without reference to the graph.
What we're saying is that our function is mapping the real number line into the real
number line. In other words instead of putting x and y at right angles to each other,
let's put x and y horizontally, parallel to one another. And what we're saying is that f maps the interval from 0 to 2 on the x-axis onto the interval from 0 to 4 on the y-axis.
Now what does h do? Remember, h is the function 2x minus 1. h maps the interval from 0 to 2 onto the interval from minus 1 to 3. And you see, this is all this diagram means: f maps 0 into 0, it maps 1 into 1, it maps 2 into 4. f is the function which
squares the input to yield the output. And correspondingly, h maps 0 into minus 1, it
maps 1 into 1, and it maps 2 into 3.
Now the interesting point is that f and h are very different. In fact, the only time f and
h have the same output is when x equals 1. Which of course we knew from before, because of how h(x) was constructed: h(x) was constructed to be the line tangent to the parabola y equals x squared at the point x equals 1, y equals 1. So that should be no great surprise.
But if we didn't know that, notice that algebraically we could equate f(x) to h(x) and conclude, therefore, that x squared must equal 2x minus 1. We then transpose, and get that the quantity x minus 1, squared, must be 0, whence x must equal 1. And what we have is that near x equals 1, x squared "behaves like" 2x minus 1-- and I put this in quotation marks because making that precise is the hardest part of the course that's going to follow.
And what we mean by that is this-- at least in terms of a picture. If I pick a small
interval surrounding x equals 1 on the x-axis, and a small interval-- like a thick dot--
surrounding y equals 1 on the y-axis here. Then as a mapping from this domain into
this range, I can essentially not distinguish f from h.
The error is so small that as the size of the interval shrinks, the error goes to 0 even
faster. And therefore, if I stay close enough locally to the point in question-- I cannot
tell the difference between the non-linear function and the linear function.
But what I have to be careful about is this: whereas x squared can be replaced by 2x minus 1 near x equals 1, near x equals 2, x squared can be replaced only by 4x minus 4.
You might say, well look, don't these two straight lines intersect at a particular point? The answer is yes, they do. But even at the point where they intersect, there is no neighborhood in which these lines can serve as approximations for one another.
Those are two straight lines that intersect at a constant angle, and as soon as you leave the point of intersection there is a significant error-- meaning an error which does not go to 0 more rapidly than the change in x. You don't have that higher-order infinitesimal over here. At any rate, leaving this to the exercises and the supplementary notes for you to get more out of, in summary let's just say this: if f is continuously differentiable at x equals a, then locally-- meaning near x equals a-- f behaves linearly.
f(x) is approximately f(a) plus f prime of a times the quantity x minus a. And you see, once a is chosen, this is a number and this is a number; delta x-- that is, x minus a-- is the only variable on the right-hand side.
So what we're saying is that f(x) is what? A linear function of delta x. And since this is all review, the more interesting point is this: we did this simply to refresh your memories as to how linearity was playing a big role in calculus of a single variable.
Now what we're going to do is extend the result to several variables. Let me just say
that at the outset. That this concept does extend to n variables, but n equals 2
yields a particularly good geometric insight.
For example, let's suppose I look at two equations in two unknowns. Well actually, I'll use u and v instead; let those be the variables. We can also think of this as a function: I have u(x, y) is x squared minus y squared, whereas v(x, y) is 2xy. Notice that these are not linear, because here we have things appearing to the second power-- squares-- and here we have what? The variables multiplying one another. These are not linear equations, but the beautiful point-- if you look at it this way-- is
even without a picture, I can think of this as a mapping which maps two dimensional
space into two dimensional space. And how does this mapping take place? It maps
the point or the pair, or the 2-tuple-- whichever way you want to say it-- (x, y) into
the 2-tuple (u, v), where u is x squared minus y squared, and v is 2 xy.
In other words, f-bar-- and notice I put the bar underneath simply to indicate that E2
is a vector space, and we have a function that's mapping what? A vector into a
vector, so I indicate that f is a vector function here. It maps a vector into a vector.
And how does the mapping take place? It maps the 2-tuple (x, y) into the 2-tuple (x squared minus y squared, 2xy)-- that is, into (u, v).
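In symbols-- my shorthand for what's on the board-- the mapping is:

$$\bar{f} : E^2 \to E^2, \qquad \bar{f}(x, y) = (x^2 - y^2,\ 2xy) = (u, v).$$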
Now the thing is that as long as we only have n equals 2, we can still draw a picture,
but not a picture as nice as what existed when n was equal to 1. See, pictorially, f-bar maps the xy-plane into what we can call the uv-plane. And notice that since the domain of f-bar has two degrees of freedom-- a two-dimensional vector space-- the domain of f-bar is the entire xy-plane, whereas the range of f-bar is the entire uv-plane.
In other words, I can now view f-bar as a mapping which carries points in the xy
plane into points in the uv plane, and this will be exploited more later in the course,
but the idea is this. Let's take a look for the time being. Let's see what f-bar does to
the point (2, 1).
Remember u is x squared minus y squared, so at the point (2, 1), u becomes what?
2 squared minus 1 squared, which is 3. On the other hand, 2xy is 2 times 2 times 1,
which is 4. So f-bar can be viewed as mapping the point (2, 1) into the point (3, 4).
Now you recall that calculus isn't interested so much in what's happening at a particular point as near the point. In other words, the natural question is: what is f-bar of (2 + delta x, 1 + delta y), when delta x and delta y are quite small? That's the local question.
What we're saying is we know that (2, 1) maps into (3, 4). We also know or we'd like
to believe that a point near (2, 1) maps into a point near (3, 4). Well if we call this
point (2 + delta x, 1 + delta y), then the corresponding image over here should be (3 + delta u, 4 + delta v).
What we can say is that whatever the image of (2 + delta x, 1 + delta y) is it has the
form (3 + delta u, 4 + delta v), and all we have to do is find delta u and delta v. This
is the pictorial idea of what's happening. Now the point is that delta u and delta v are
very difficult to find exactly; after all, u and v are non-linear functions, and to invert them is not easy. The thing that's easy to find is delta u tan and delta v tan. Remember, delta u tan was the partial of u with respect to x times delta x, plus the partial of u with respect to y times delta y. Since u is x squared minus y squared, that means delta u tan is 2x delta x minus 2y delta y. We're interested in this at the point (2, 1).
Letting x be 2 and y be 1, we see that delta u tan is 4 delta x minus 2 delta y. Since v is equal to 2xy, the partial of v with respect to x is 2y, and the partial of v with respect to y is 2x. Therefore, delta v sub tan is 2y delta x plus 2x delta y. Since we're evaluating this at x equals 2, y equals 1, we see that delta v tan is 2 delta x plus 4 delta y.
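Here is a small numeric check-- my sketch, not part of the lecture-- that these linear parts really do capture delta u and delta v near (2, 1):

```python
# Compare the true changes in u = x^2 - y^2 and v = 2xy near (2, 1)
# with the linear parts delta u tan = 4 dx - 2 dy and
# delta v tan = 2 dx + 4 dy computed in the lecture.

def u(x, y): return x * x - y * y
def v(x, y): return 2 * x * y

x0, y0 = 2.0, 1.0
for dx, dy in ((0.1, 0.05), (0.01, 0.005), (0.001, 0.0005)):
    du = u(x0 + dx, y0 + dy) - u(x0, y0)   # true change in u
    dv = v(x0 + dx, y0 + dy) - v(x0, y0)   # true change in v
    err_u = du - (4 * dx - 2 * dy)         # equals dx**2 - dy**2
    err_v = dv - (2 * dx + 4 * dy)         # equals 2 * dx * dy
    print(err_u, err_v)                    # second-order small
```

Both errors shrink like the square of the step size-- exactly the second-order behavior being described.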
Now here's the key point. This is always delta u tan; this is always delta v tan. Where the local part comes in is that we know that, because u and v are continuously differentiable functions of x and y, near the point (2, 1) we can replace delta u by delta u sub tan and delta v by delta v sub tan, and we wind up with what? Delta u is approximately 4 delta x minus 2 delta y, and delta v is approximately 2 delta x plus 4 delta y. But the key point now is that this is a system of linear equations. You see, delta u is a linear combination of delta x and delta y, and delta v is also a linear combination of delta x and delta y.
Notice how linear systems come into play. Now I've been emphasizing the case n equals 2 just so we could draw a picture, but notice that the same thing happens no matter how many variables we have. In fact, let me just summarize this in terms of x and y first, and then we'll generalize it to n variables in a minute.
The key point is this for two variables-- and what happens for two variables happens for any number. As we've often done in this course, we emphasize the two-variable case because we can still visualize the picture, even though the graph idea is hard to see, because we're mapping two dimensions into two dimensions.
But at least the domain and the range are easy to see separately. Now, if u is a continuously differentiable function of x and y near the point (x0, y0), then delta u is exactly the partial of u with respect to x times delta x, plus the partial of u with respect to y times delta y, plus an error term, k1 delta x plus k2 delta y, where k1 and k2 go to 0 as delta x and delta y go to 0.
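In symbols-- my transcription of the statement, with the partials evaluated at (x0, y0):

$$\Delta u = \frac{\partial u}{\partial x}\,\Delta x + \frac{\partial u}{\partial y}\,\Delta y + k_1\,\Delta x + k_2\,\Delta y, \qquad k_1, k_2 \to 0 \ \text{ as } \ (\Delta x, \Delta y) \to (0, 0).$$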
If we just look at this part alone, delta u is linear, up to this correction term-- an infinitesimal. And the reason I keep harping on this point is that no matter how
complex the theory gets in the rest of this particular block, the key step is always
going to be that when you have a continuously differentiable function, then-- as long as you stay local-- you can essentially throw away the nasty part.
You can essentially throw away this error term, because it goes to 0 so rapidly that if you stay close enough to the point (x0, y0), no harm comes from neglecting this term. What you must be careful about is that as soon as you pick a large enough neighborhood so that this term is no longer negligible, then even though this part here is still delta u sub tan, delta u sub tan is no longer a good approximation for delta u. And the same result holds for any number of variables: we can let w be a continuously differentiable function of the n variables x1 up to xn.
As is mentioned in the text-- I don't remember whether we've mentioned this in previous lectures or not-- it's rather interesting that when you deal with more than three independent variables, we somehow don't like to use the name delta w sub tan.
Instead we replace the word tangent by lin as an abbreviation for linear. The key
point being what? That this thing that we call delta w sub lin, or if you like to call it
sub tan, what's in the name? Call it whatever you want. The point is that this thing
that we call delta w sub lin, or delta w sub tan, is the partial of f with respect to x1, evaluated at a-bar, times delta x1, plus terms like that all the way up to the partial of f with respect to xn, evaluated at a-bar, times delta xn.
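Written compactly-- my notation, with a-bar the chosen point (a1, ..., an):

$$\Delta w_{\text{lin}} = \frac{\partial f}{\partial x_1}(\bar{a})\,\Delta x_1 + \cdots + \frac{\partial f}{\partial x_n}(\bar{a})\,\Delta x_n.$$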
And the key point is that once you have chosen a specific point a-bar, notice that the coefficients of delta x1 up to delta xn are numbers. They're not variables; they are numbers once a-bar is chosen. So what is delta w lin-- why do we call it linear? Notice that this expression here is a linear combination of delta x1 up to delta xn.
In other words, they're what? Sums of terms, each involving a delta x times a constant. What we're saying is that nice functions-- and what's a nice function? A nice function is one which is continuously differentiable-- a nice function near a particular point can be approximated by a linear function, where the error will be
very small as long as you stay near the point in question.
You remember at the beginning of my lecture I said something old, something new.
This finishes the old part of the course. In other words, what I've tried to motivate for
you here is why, if we were remodeling the pre-calculus curriculum, much more
emphasis should be paid to linear equations. Granted that most functions in real life
are non-linear, the point remains that locally, functions are linear. OK?
That's the key point. Locally we deal with linear functions. Therefore, since all non-
linear functions may be viewed as being linear locally, this motivates why we should
really study systems of linear equations. In other words, this motivates the subject
called linear systems. Now what is a linear system? Essentially a linear system is m
equations in n unknowns.
In many cases m and n are taken to be equal, but what kind of equations are they?
They are equations where all the variables appear separately, to the first power, multiplied only by constants. And by the way, let me introduce this double-subscript notation, rather than introducing umpteen different symbols for the constants.
Notice that a very nice device here is to pick one symbol like an a, and then use two
subscripts. The first subscript tells you what row the coefficient is referring to, and the second one which column. In terms of the equations, the first subscript tells you which equation you're dealing with, and the second subscript tells you which variable the coefficient multiplies.
For example, this is what? This is the coefficient of x sub 1 in the first equation. This is the coefficient of x sub n in the first equation. This is the coefficient of x sub n in the m-th equation. Think of this as the row and the column, if you will. And what we're
saying then is that the solutions of this type of system of equations are really
controlled by the coefficients of the x's.
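For reference, here is the general shape of the system being described-- my reconstruction of the board:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\ &\ \,\vdots\\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m. \end{aligned}$$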
In other words, by the numbers a sub ij, where i and j take on-- well i takes on all
values from what? The number of rows. i goes from 1 to m, and j goes from 1 to n.
But the a's become very important, and this is what ultimately is going to motivate
what we mean by a matrix, but before I come to that let me give you just one
example of what I mean by saying that the equations are governed by the
coefficients of the x's, not by the constants on the right hand side.
By the way, notice the convention that when you have two equations with two
unknowns rather than call the unknowns x1 and x2, it's conventional to call the
unknowns x and y. Let's take a particularly simple system here-- x plus y equals b1,
x minus y equals b2. If we add these two equations, we get 2x is b1 plus b2, so x is one-half of b1 plus b2; if we subtract them, we get 2y is b1 minus b2, so y is one-half of b1 minus b2.
Notice that this tells us how to solve for x and y in terms of b1 and b2. Namely, to
find x you take half the sum of the two b's. To find y, you take half the difference.
Now certainly the solution depends on the values of b1 and b2. I'm not saying you
don't change the answers by changing the constants on this side. What I am saying
is that the structure by which you find the answers does not depend on b1 and b2;
What we're saying is no matter what b1 and b2 are in this particular problem, to find
x and y we take half the sum of the b's, and we take half the difference. In other
words, the solution depends on b1 and b2 numerically, but not structurally.
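As a tiny illustration-- my sketch, not the professor's-- the structure can be packaged as a recipe that works for every choice of b1 and b2:

```python
# The recipe for solving x + y = b1, x - y = b2 does not depend on
# the particular values of b1 and b2 -- only on the coefficients.

def solve(b1, b2):
    x = (b1 + b2) / 2      # half the sum of the b's
    y = (b1 - b2) / 2      # half the difference
    return x, y

print(solve(5, 1))     # (3.0, 2.0), since 3 + 2 = 5 and 3 - 2 = 1
print(solve(10, 4))    # (7.0, 3.0): different numbers, same structure
```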
Well the whole idea is this-- and this is what we so often do in mathematics.
Because the solution to our equations depends on the coefficients of the x's, we
somehow want to focus our attention on the coefficients. And we don't need the x's
in there, because we can sort of think of the x's as being a place value type of
situation. In other words, x1 can be thought of as labeling the first column, x2 the second column. The first equation can be thought of as the first row; the second
equation, the second row.
And what this motivates is a concept called an m by n matrix. Now this sounds like a very ominous term, an m by n matrix, but the point is that it's not ominous at all. In fact, the word matrix essentially indicates an array, and that's all this thing is. By an m by n matrix, we simply mean a rectangular array of numbers arranged in m rows and n columns.
In other words, the first number tells you the number of rows, and the second
number tells you the number of columns. Now there's certainly nothing logical about
that in terms of our game idea. Just memorize this, it's a rule of the game or a
definition. Somebody could've said, why didn't you give the columns first and then
the rows? Well we could've, but one of them had to come first.
And the convention is that one refers to the rows first, and then the columns. An m by n matrix is usually enclosed in parentheses or brackets-- it happens to be brackets right now. But if I write down this array-- what is it
now? [1, 1, 1; 1, -1, 2]. This is a rectangular array of numbers consisting of two rows and three columns-- a 2 by 3 matrix. Now again, we don't want to invent this thing vacuously; let's keep track of what this means in terms of a system of equations. Suppose, as in our first example, we have the system z1 = y1 + y2 + y3, z2 = y1 - y2 + 2y3. What is the matrix of coefficients here? Well, the matrix would be what? The coefficient of the first variable in the first equation is 1, of the second variable in the first equation is 1, of the third variable in the first equation is 1. You see? Second equation, first variable: coefficient is 1. Second equation, second variable: coefficient is -1.
Second equation, third variable: coefficient is 2. So using our matrix coding system, the matrix of coefficients would be what? [1, 1, 1; 1, -1, 2]. Which is exactly the
matrix that we wrote down over here. And to put this into a different perspective, to see what we're driving at, let's take a second example where we start out with three equations and four unknowns-- three linear equations in four unknowns-- and then we'll write the matrix for this afterwards. Suppose y1 = x1 + 2x2 + x3 + x4, y2 = 2x1 - x2 - x3 + 3x4, and y3 = 3x1 + x2 + 2x3 - x4. Then my first row would be [1, 2, 1, 1], my second row would be [2, -1, -1, 3], and my third row would be [3, 1, 2, -1].
Again, notice this: in this coding system, the number of rows corresponds to the number of equations, and the number of columns corresponds to the number of variables being combined linearly. To summarize this again, the
matrix of coefficients in our second example is the 3 by 4 matrix [1, 2, 1, 1; 2, -1, -1,
3; 3, 1, 2, -1].
Well again, let's recall that when we do mathematics, we don't like to introduce
notation for the sake of notation. Simply to be able to have a way of conveniently writing the coefficients, but not being able to use it efficiently, would be a rather stupid thing to do.
Why invent new notation if it's not going to help us effectively solve new problems?
This is why in mathematics we've been emphasizing the game idea whereby what
we really care about is structure. We care about structure, not about the terms
themselves. And to motivate what I'm driving at, let me return to examples one and
two. And bring up a question that has great impact-- and even if we don't appreciate
it right now in terms of a practical application, let's at least see what's happening.
You'll notice that if I look at these systems of equations over here, the first system tells me how to express z1 and z2 in terms of y1, y2 and y3. On the other hand, the second system of equations tells me how to express y1, y2, and y3 in terms of x1, x2, x3 and x4. Now, without belaboring the point, because the arithmetic is quite trivial here, a very natural question comes up next. Let's look at our old friend the chain rule again.
Since the z's are expressed in terms of the y's, and the y's are expressed in terms of the x's, it seems that by direct substitution I should be able to express the z's in terms of the x's: namely, I replace y1 by this linear combination of the x's, I replace y2 by this linear combination of the x's, and I replace y3 by this linear combination of the x's.
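For instance, carrying out the substitution for z1-- my arithmetic, using the two systems written down earlier:

$$z_1 = y_1 + y_2 + y_3 = (x_1 + 2x_2 + x_3 + x_4) + (2x_1 - x_2 - x_3 + 3x_4) + (3x_1 + x_2 + 2x_3 - x_4) = 6x_1 + 2x_2 + 2x_3 + 3x_4.$$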
I then combine the y's in terms of the x's as indicated here, and that should give me the z's in terms of the x's. Leaving that hopefully as a trivial exercise, we come to
the next example that I'd like to mention here, and that is suppose you were told to
express z1 and z2 in terms of x1, x2, x3 and x4. The point is that with the kind of arithmetic mentioned before, we could easily show that z1 = 6x1 + 2x2 + 2x3 + 3x4, and similarly for z2. The question is, could we have gotten this without going by way of the y's? In other words, is there a way of replacing the y's by the x's, and then finding the z's in terms of the x's, in a convenient, mechanical way that will save us many steps?
Not so much in these easy examples where you have 2 by 3, and 3 by 4 systems,
but cases where you might have 10 equations, and 10 unknowns. Or 10 equations
and 12 unknowns. And the answer is, there is a way. Of course you knew there was
going to be a way. Otherwise we wouldn't be leading up to it in this particular way,
and as so often happens, there is a real-life situation that motivates why we invent something called matrix algebra.
In terms of our present illustration, the chain rule that we were just talking about-- expressing the z's in terms of the y's, and then the y's in terms of the x's-- motivates what we mean by matrix "multiplication". And you may notice that I put "multiplication" here in quotation marks. The reason I put it in quotation marks is that
unfortunately the word "multiplication" has a connotation of multiplying numbers
together.
Don't think of it that way. Think of multiplication as meaning what? A way of combining two matrices to form another matrix. There's going to be no logic behind this other than one very famous piece of logic: namely, knowing what the answer is supposed to be in advance.
I remember when I was an undergraduate in college. The big type of humor that
was going around at that time was the idea of, somebody would give you the
answer, and you have to make up the question. Oh they were silly little things like, if
the answer to the question was 9w what was the question? And the question would
be, do you spell your last name with a V, Herr Wagner? And the answer would be 9w. And these were funny jokes at that time; I don't know whether they're funny now or not. But the funny point is this: this joke, which might not be that funny, is
exactly how we motivate definitions and rules in mathematics. We start with the answer, and then go back and make up the question. We know in advance that
somehow or other, the matrix that expresses the z's in terms of the y's is given by
this. And the matrix that expresses the y's in terms of the x's, is given by this matrix.
Somehow or other, what we would like to do is invent a way of combining these two matrices to give me the matrix that expresses this answer. In other words, I start out knowing what the answer is supposed to be: the matrix that expresses the z's in terms of the x's is the matrix whose first row is [6, 2, 2, 3], and whose second row is [5, 5, 6, -4]. In other words, the matrix would be what? [6, 2, 2, 3; 5, 5, 6, -4].
And without even looking at any mechanical rule, the question that comes up is: how can I invent a rule that will tell me how to multiply this 2 by 3 matrix by this 3 by 4 matrix to obtain this 2 by 4 matrix?
Now look in the notes, I'm going to do this in great detail. There will be many
exercises on this for you to sharpen your teeth on. But for now I just want to hit this
main point because the lecture is quite long. Your attention span probably is starting
to be taxed. And so I just want to show you what the recipe is because my feeling is
that this is something you have to hear before you can really read it without
becoming panicked by the notation.
The idea is this. First of all, to multiply two matrices, all we ever require is that the number of columns in the first matrix equals the number of rows in the second matrix. And if that sounds complicated to you, simply think in terms of the chain rule again. The number of columns in the first matrix tells you how many unknowns appear in the first system, and that number of unknowns gives you the number of equations in the second
system. In other words, the number of columns in the first matrix must match the
number of rows in the second matrix. Notice we don't care about the number of
rows in the first one matching the number of columns in the second, all we care is
that the number of columns in the first matrix-- namely three here-- match the
number of rows of the second, which is three.
Then the rule works in a very interesting mechanical way that makes use of the dot
product. Namely what you do is, suppose I want to find the term in the product of
these two matrices that occupies the second row, third column. What I do is I take
the second row-- in other words, I take the row that comes from the first matrix-- and I take the column from the second matrix. In other words, I have what? Second row,
third column. And I form the usual dot product that we've talked about. I dot the
second row with the third column.
In this case, that dot product is 1 times 1, plus minus 1 times minus 1, plus 2 times 2, which is 6. So in this product matrix, the term in the second row, third column will be 6.
Now leaving it as an exercise for the time being, and reading it in the supplementary
notes, I'm sure you'll be able to put this all together. It's not nearly as difficult as it sounds. I think the most difficult part is rationalizing why one would invent such a definition in
the first place. The answer is very simple: we invent the definition to solve a
particular problem. Coming back here again, all I'm saying is that if I invent-- for
example let me just give you one more checking out point here.
Let me see what the term would be in the first row, second column. To find the term
in the first row, second column, I take the first row of the first matrix. Dot it with the
second column of the second matrix. See first row dotted with second column, the
answer will give me what? The term in the product that's in the first row, second
column. Let's check that: 1 times 2, plus 1 times minus 1, plus 1 times 1 is 2, and the term in the first row, second column should be 2. It is. You see, there's no more motivation for how we
multiply these two matrices than the fact that it solves the problem that we want
solved.
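To make the recipe concrete, here is a small sketch in code-- my own illustration, using the two coefficient matrices from the examples:

```python
# A expresses the z's in terms of the y's; B expresses the y's in
# terms of the x's. Their product expresses the z's in terms of the x's.

A = [[1, 1, 1],            # 2 by 3
     [1, -1, 2]]
B = [[1, 2, 1, 1],         # 3 by 4
     [2, -1, -1, 3],
     [3, 1, 2, -1]]

def matmul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    assert len(A[0]) == n  # columns of A must match rows of B
    # entry (i, j) is the i-th row of A dotted with the j-th column of B
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(p)]
            for i in range(m)]

print(matmul(A, B))        # [[6, 2, 2, 3], [5, 5, 6, -4]]
```

The result is exactly the 2 by 4 matrix of the z's in terms of the x's that we wrote down before.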
To find the term that's in the i-th row, j-th column of the product, dot the i-th row of
the first matrix with the j-th column of the second matrix. More generally, you can
always multiply an m by n matrix by an n by p matrix. What's the key factor? You
don't care about the number of rows in the first, you don't care about the number of
columns in the second. What you do care about is what?
That the number of columns in the first matrix be equal to the number of rows in the
second, and if you do that when you multiply an m by n matrix by an n by p matrix,
notice that the result will be what? An m by p matrix. In other words, the number of rows is governed by the number of rows in the first matrix, and the number of columns is governed by the number of columns in the second matrix.
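In modern subscript notation-- mine, not used in the lecture-- the whole rule reads:

$$(AB)_{ij} = \sum_{k=1}^{n} a_{ik}\,b_{kj}, \qquad i = 1, \dots, m, \quad j = 1, \dots, p,$$

where A is m by n and B is n by p, so that AB is m by p.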
Notice by the way, that this tells us right away that when we want to multiply two
matrices it makes a difference in which order that they're written. If we were to take
that 2 by 3 matrix, and the 3 by 4 matrix, and interchange them, we don't have the
appropriate match up of rows and columns. You can't dot a 2-tuple with a 4-tuple.
The very fact that we say dot the row with the column-- the dot product is only defined for two n-tuples where the n is the same for both.
Let me summarize today's lecture by saying that, in overview, hopefully we have re-established the need for linear systems of equations; and secondly, once we have understood what the need for linear systems is, we are now introducing a mechanism whereby we can solve linear systems more efficiently than the way we were taught to solve them in the past.
You see what I'm going to do for the next few lectures now is concentrate on a new
game called the game of matrix algebra. But that will unfold gradually as we develop
the next two lectures. And so until our next lecture, so long.
Funding for the publication of this video was provided by the Gabriella and Paul Rosenbaum Foundation. Help OCW continue to provide free and open access to MIT courses by making a donation at ocw.mit.edu/donate.