Maths108 ExamReview 2021
Hello! I’ve had a number of students write in and ask for more resources for the exam,
and in particular for more examples to study before taking the exam. This document is
meant to serve as such a resource! This document is a list of what we’ve covered in this
class, along with examples to look at.
This document is not meant to serve as a replacement for your coursebook; instead, it’s
meant to give you a set of examples and summarize what we’ve been doing in the course
thus far! A good way to approach this document is to read through it, and each time you
come across something confusing use your resources (office hours, email, talking to friends,
Piazza, the coursebook, the textbooks) to review the related concepts and figure out what’s
going on.
{x | x = 2k + 1, k ∈ Z}
means “The set of all values of x such that x = 2k + 1, where k is an integer;” in other
words, odd numbers!
We have many famous kinds of sets: the empty set ∅ = {}, the positive whole numbers
N, the integers Z, the rational numbers Q, and real numbers R. We also had intervals of
various kinds:
[Figure: a number line running from −∞ to +∞, showing the intervals (−∞, −1], (0, 1), [1.5, 2.5) and (3, +∞).]
Given two sets A, B, we can combine them in several ways. The intersection of A and B,
written A ∩ B, is the set of all elements that are in both A and B; for example, {1, 2, 3} ∩
[0, 2) = {1}. The union of A and B, written A ∪ B, is the set of all elements in either A or
B or both; for example, (−∞, 2) ∪ [1, 3] = (−∞, 3]. Finally, A set-minus B, written A \ B
or A − B, is the set of all elements in A that are not in B; for example, [3, 4] \ Z = (3, 4).
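The set operations above can be tried out on small finite sets. The following is only a rough sketch: Python's built-in sets are finite, so the interval [0, 2) is modelled by a membership test, and the helper name `in_half_open_interval` is made up for this illustration.

```python
def in_half_open_interval(x, low, high):
    """True when low <= x < high, i.e. x lies in [low, high)."""
    return low <= x < high

A = {1, 2, 3}

# {1, 2, 3} ∩ [0, 2): keep the elements of A that also lie in [0, 2)
intersection = {x for x in A if in_half_open_interval(x, 0, 2)}

# Union and set-minus on two small finite sets
finite_union = {1, 2} | {2, 3}
finite_difference = {1, 2, 3} - {2}
```

Note that the intersection is the set {1}, not the number 1.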
1.2 Relations and Functions
A function f : A → B consists of three things:
• A set A, called the domain. We think of A as the set of all possible “inputs” to the
function f .
• A set B, called the codomain. We think of B as containing all of the possible
“outputs” that the function f could have.
• A rule f , that assigns elements in A to elements in B. To be a function, for every a
in A there must be exactly one b ∈ B such that f (a) = b; you can’t have a value of a
where f (a) is undefined, or a value of a that we try to send to two different values in
B!
For example, consider the following three objects:
1. f : R → R, f(x) = √x
2. g : [0, ∞) → R, g(x) = √x
3. h : [−1, 1] → R, h(x) = y if and only if x² + y² = 1.
The first object is not a function, because it’s not defined on every element of its domain:
f(−1) = √(−1) is not an element of R, for example!
The second object is a function: for any nonnegative number x, there is exactly one
value of y in R equal to √x; so every element in the domain has exactly one corresponding
element in the codomain.
The third object is not a function. Perhaps the easiest way to see this is to graph it:
that is, let’s draw all of the points (x, y) where x² + y² = 1! As you’ve seen before, this
forms a circle:
[Figure: the graph of h, which is the circle x² + y² = 1, together with a vertical line illustrating the vertical line test.]
For every x ∈ [−1, 1], there is a corresponding value of y such that x² + y² = 1; namely,
if we solve for y, we get that y² = 1 − x² ⇒ y = ±√(1 − x²), which exists for any x ∈ [−1, 1].
The issue, however, is that there are *two* different values of y for our value of x (unless x = ±1)! A
function is supposed to have exactly one output, but we have two here; as a result, this
object is not a function.
A nice way to visualize whether or not something is a function, in the case that you’re
dealing with things you can graph, is the vertical line test. If you draw a graph of some rule,
look at vertical lines of the form x = a for any value a in your domain A. If that vertical
line hits your graph more than once, you’re not a function: that value of x has multiple
values of y that correspond to it! Conversely, if that vertical line does not hit your graph
at all, you’re also not a function: there is no output that corresponds to that input a from
the domain A!
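The vertical line test can be mimicked numerically by counting outputs at sampled inputs. This is a hedged sketch rather than a proof: it only checks the inputs you give it, and the function names below are invented for this illustration.

```python
import math

def outputs_on_circle(x):
    """All real y with x^2 + y^2 = 1 (the rule h from the example)."""
    if abs(x) > 1:
        return []
    y = math.sqrt(1 - x * x)
    return [y] if y == 0 else [y, -y]

def outputs_of_sqrt(x):
    """All real y equal to sqrt(x) (the rule used by f and g)."""
    return [math.sqrt(x)] if x >= 0 else []

def passes_vertical_line_test(rule, sample_inputs):
    """True when every sampled input has exactly one output under `rule`."""
    return all(len(rule(a)) == 1 for a in sample_inputs)

g_ok = passes_vertical_line_test(outputs_of_sqrt, [0, 0.5, 1, 4])      # domain [0, ∞)
f_ok = passes_vertical_line_test(outputs_of_sqrt, [-1, 0, 1])          # domain R
h_ok = passes_vertical_line_test(outputs_on_circle, [-0.5, 0, 0.5])    # domain [−1, 1]
```

Here an input with no output corresponds to a vertical line that misses the graph, and an input with two outputs corresponds to a vertical line that hits it twice.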
2.1 Functions: Fundamentals
When people think about functions, they often just think about the rule that defines a
function: that is, we’ll want to regard something like f(x) = (x² + 1)/(x² − 1) as a function, even
though we haven’t specified its domain or codomain yet!
In cases like this, it is useful to be able to find the “largest” domain that we could
assign to this function and still have it make sense. For instance, the expression f above is defined
as long as x ≠ ±1; so we could say that it is a function with domain R \ {±1}, and
codomain R.
A common question that we’ll ask you in Maths 108 is the following: given an expression
f , can you find a set of values on which f is a function? As an example, the hardest problem
on your mid-semester test (74% of students got this wrong) was the following:
3. Suppose that the rule f is given by the formula f(x) = 1/tan(x).
Consider the three following sets: (π, 2π), (π/2, 3π/2), (0, π).
On how many of these sets is f a function?
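The answer here is that f is not a function on any of these sets. This is because of the
following logic: 1/tan(x) is undefined wherever tan(x) is undefined or equal to zero, and each
of the three sets contains such a point. The sets (π, 2π) and (0, π) contain 3π/2 and π/2
respectively, where tan(x) is undefined; the set (π/2, 3π/2) contains π, where tan(π) = 0.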
Note that when you’re working with expressions in this course you need to be very
careful with simplifying! Many people turned 1/tan(x) into cot(x) = cos(x)/sin(x), which is true
whenever x ≠ (2k+1)π/2; however, when x = (2k+1)π/2, cot(x) is defined and 1/tan(x) is not!
This is like how 1/(1/x) and x are equal if x ≠ 0, but when x = 0 the expression 1/(1/x) is
undefined (and so these expressions are regarded as being distinct).
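For the mid-semester question about 1/tan(x), one way to check each interval without floating-point trouble is to work in units of π/2. This is a sketch under that assumption: the integer k stands for the point kπ/2, where tan is undefined for odd k and equal to zero for even k, and the helper names are made up for this illustration.

```python
def multiples_of_half_pi_inside(low_k, high_k):
    """Integers k with low_k < k < high_k, standing for the points k*pi/2
    strictly inside the open interval (low_k*pi/2, high_k*pi/2)."""
    return list(range(low_k + 1, high_k))

def why_undefined(k):
    """Reason that 1/tan(x) is undefined at x = k*pi/2."""
    return "tan(x) is undefined" if k % 2 else "tan(x) = 0"

# Each interval is written as (low_k, high_k) in units of pi/2
intervals = {"(pi, 2pi)": (2, 4), "(pi/2, 3pi/2)": (1, 3), "(0, pi)": (0, 2)}

problems = {
    name: [(k, why_undefined(k)) for k in multiples_of_half_pi_inside(lo, hi)]
    for name, (lo, hi) in intervals.items()
}
```

Every interval ends up with at least one problem point, so 1/tan(x) is not a function on any of them.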
The functions sin(x), cos(x), tan(x), e^x, ln(x), |x|, √x, 1/x and polynomials are all functions
that we use frequently in Maths 108. We draw them here:
Graph        Nat. dom    Key values
y = e^x      R           e^0 = 1
y = ln(x)    (0, ∞)      ln(1) = 0
y = √x       [0, ∞)      √0 = 0
• sin(x + π) = − sin(x)
• cos(x + π) = − cos(x)
• sin(−x) = − sin(x)
• cos(−x) = cos(x)
• sin(x + 2π) = sin(x)
• cos(x + 2π) = cos(x)
• tan(x) = sin(x)/cos(x)
• sin²(x) + cos²(x) = 1
• ln(e^x) = x
Understanding how sin and cos can be derived from the unit circle can be a much more
efficient way to remember all of the trigonometric identities and key values above.
3.1 Kinds of Limits
Given a function f (x), we’ve studied several different kinds of limits in Maths 108:
• $\lim_{x \to a} f(x) = L$. Roughly speaking, this means the following: as we plug in any values
of x that get closer and closer to a, we get outputs f(x) that get closer and closer to
L.
• $\lim_{x \to a^-} f(x) = L$ and $\lim_{x \to a^+} f(x) = L$. This is the same idea as the above (plug in any
value of x close to a, get output f(x) close to L), except that for the a− limit we add
the restriction that x < a, and for the a+ limit we add the restriction that x > a. We
call these the limits as we approach a from the left or the right, respectively.
• $\lim_{x \to -\infty} f(x) = L$ and $\lim_{x \to +\infty} f(x) = L$. Same idea again, except in the −∞ case we’re
plugging in arbitrarily large and negative values of x, while in the +∞ case we’re
plugging in arbitrarily large and positive values of x.
Notice that limits don’t care about what happens when x gets to a — in the definitions
above, we never plug a into f (x)! Instead, we always plug in values of x that get closer and
closer to a. This lets us say things like
\[ \lim_{x \to 0} \frac{x}{x} = 1, \]
even though the function x/x is undefined at zero.
we plugged in other large numbers we’d see that actually $\lim_{x \to \infty} \tan(x)$ DNE, as looking at
the graph of tan(x) from earlier demonstrates.
The limit $\lim_{x \to a} f(x) = L$ exists precisely when the one-sided limits $\lim_{x \to a^-} f(x)$ and
$\lim_{x \to a^+} f(x)$ both exist and are equal to each other (in which case their common value is L).
So, for instance, because the limit $\lim_{x \to 0^+} \frac{1}{x} = +\infty$ and the limit $\lim_{x \to 0^-} \frac{1}{x} = -\infty$, the
overall limit $\lim_{x \to 0} \frac{1}{x}$ DNE.
This is not the only way to show that a limit does not exist; we saw earlier that
$\lim_{x \to \infty} \tan(x)$ DNE by looking visually at its graph! But it is often a useful technique.
For a second example, if we had the limit $\lim_{x \to 0} \frac{x^2 - x}{x^2}$, we could again factor an x out of
the top and bottom to get the simplified limit $\lim_{x \to 0} \frac{x - 1}{x}$.
If we plug in values of x close to 0, we can see that for tiny x < 0 it looks like
our function is going to +∞, and for tiny x > 0 it looks like our function is going to −∞:
x         f(x)       x        f(x)
-0.1      11         0.1      -9
-0.01     101        0.01     -99
-0.001    1001       0.001    -999
So it seems likely that this limit does not exist, as the limits from the left and the right are
not equal.
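A table like the one above can be generated exactly with Python's `fractions` module, which avoids rounding noise. This is just an illustrative sketch; the names `f`, `left_values` and `right_values` are chosen here and are not from the coursebook.

```python
from fractions import Fraction

def f(x):
    """The simplified function (x - 1) / x from the example."""
    return (x - 1) / x

steps = [Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)]

left_values = [f(-h) for h in steps]   # inputs just below 0
right_values = [f(h) for h in steps]   # inputs just above 0
```

The left-hand values grow without bound while the right-hand values become very negative, which is the numerical evidence that the two one-sided limits disagree.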
Understanding this function all at once is hard! But, notice that for very small values
of x we know that
• |x| is a very small positive number, therefore
• ln(|x|) is defined and a very large negative number, therefore
• 1 + ln(|x|) is also a very large negative number.
So we know what the denominator is doing.
Conversely, we have $\lim_{x \to 0} e^x = e^0 = 1$, because e^x is continuous. So, for very small
values of x near zero, we have
\[ \frac{e^x}{1 + \ln(|x|)} \approx \frac{\text{something very close to } 1}{\text{something very large and negative}} \approx 0. \]
In other words, our limit is 0.
\[ \lim_{x \to +\infty} \frac{2x^2 - 4}{x^2 - 9} = 2, \qquad \lim_{x \to -\infty} \frac{2x^2 - 4}{x^2 - 9} = 2, \]
and therefore that y = 2 is our only horizontal asymptote.
For vertical asymptotes, the only places where our function isn’t continuous are where
the denominator is zero, so these are the only places where our function might go to ±∞.
This happens when x2 − 9 = 0, i.e. at x = ±3; so we want to understand the limits at these
two values.
If we look at −3 in particular, we can see that the numerator 2x² − 4 is about 2·9 − 4 = 14.
At the same time, the denominator x² − 9 is going to zero: in particular, if we plug in values of
x close to but less than −3, we get that x² is something slightly larger than 9, and therefore
that x² − 9 is something tiny and positive. Therefore, the ratio 14/(tiny, positive) is going to +∞,
and so the limit $\lim_{x \to -3^-} \frac{2x^2 - 4}{x^2 - 9} = +\infty$.
Similarly, if we plug in values of x close to but greater than −3, x² is slightly smaller
than 9, so x² − 9 is a tiny negative number. Therefore, the ratio 14/(tiny, negative) is going to
−∞, and so the limit $\lim_{x \to -3^+} \frac{2x^2 - 4}{x^2 - 9} = -\infty$.
In particular, x = −3 is a vertical asymptote! If you use the same reasoning at x = 3,
you’ll see that $\lim_{x \to 3^+} \frac{2x^2 - 4}{x^2 - 9} = +\infty$, $\lim_{x \to 3^-} \frac{2x^2 - 4}{x^2 - 9} = -\infty$.
Now, simply draw in these lines y = 2, x = −3, x = 3 and sketch a curve that has the
limits we just found. You’ll get something like what we have below, which is the graph of
our function!
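As a sanity check on the asymptotes found above, you can evaluate the function far from the origin and just to either side of x = ±3. This is a rough numerical sketch (floating-point, with sample points chosen arbitrarily for this illustration), not a replacement for the limit argument.

```python
def f(x):
    """The rational function (2x^2 - 4) / (x^2 - 9) from the example."""
    return (2 * x**2 - 4) / (x**2 - 9)

far_right = f(1e6)             # behaviour as x -> +infinity
far_left = f(-1e6)             # behaviour as x -> -infinity

left_of_minus_3 = f(-3.001)    # x slightly less than -3
right_of_minus_3 = f(-2.999)   # x slightly greater than -3
left_of_3 = f(2.999)           # x slightly less than 3
right_of_3 = f(3.001)          # x slightly greater than 3
```

The far-away values sit close to 2, and the values near ±3 are large with the signs predicted by the one-sided limits.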
A scalar is another word for a real number. Given any scalar a ∈ R and any vector v ∈ R^n,
we can find the scalar multiple av by taking v and multiplying each of its coordinates by
a: so, for example,
We say that two vectors are parallel if one is a scalar multiple of the other.
Geometrically, these operations are nice to visualize:
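[Figure: two plots in the xy-plane. The first shows the vector u = (1, 3) with its scalar multiples 2u = (2, 6) and −1u = (−1, −3). The second shows the vectors u = (2, 3) and v = (3, −2) together with their sum u + v = (5, 1).]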
Conversely, (1, 0, 0) is not a linear combination of the two vectors (0, 2, 3), (0, 3, 2), because
for any scalars a, b the linear combination
a(0, 2, 3) + b(0, 3, 2) = (0, 2a + 3b, 3a + 2b)
cannot be equal to (1, 0, 0), as they will never have the same first coordinate.
4.2 Length
As well, given any vector in R^n, we can find its length by using Pythagoras’s theorem. To
be precise, given any vector v = (v_1, v_2, …, v_n), the length of v is given by the expression
\[ \sqrt{v_1^2 + v_2^2 + \ldots + v_n^2}. \]
We say that a vector is a unit vector if it has length 1. So, for instance, (3/5, 4/5) is a
unit vector, because
\[ \sqrt{\left(\tfrac{3}{5}\right)^2 + \left(\tfrac{4}{5}\right)^2} = \sqrt{\tfrac{9}{25} + \tfrac{16}{25}} = \sqrt{1} = 1. \]
\[ v \cdot w = v_1 w_1 + v_2 w_2 + \ldots + v_n w_n. \]
So, for example, we have
Given any two vectors u, v, let θ denote the angle between u, v. We can measure θ
using the dot product:
\[ \theta = \cos^{-1}\left( \frac{u \cdot v}{||u|| \, ||v||} \right) \]
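For example, if u = (1, 1, 1, 1) and v = (3, 1, 1, 1), the angle between these vectors is
\[
\theta = \cos^{-1}\left( \frac{(1, 1, 1, 1) \cdot (3, 1, 1, 1)}{||(1, 1, 1, 1)|| \cdot ||(3, 1, 1, 1)||} \right)
= \cos^{-1}\left( \frac{3 + 1 + 1 + 1}{\sqrt{1 + 1 + 1 + 1} \cdot \sqrt{9 + 1 + 1 + 1}} \right)
= \cos^{-1}\left( \frac{6}{2\sqrt{12}} \right)
= \cos^{-1}\left( \frac{6}{4\sqrt{3}} \right)
= \cos^{-1}\left( \frac{\sqrt{3}}{2} \right) = \frac{\pi}{6}.
\]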
We say that two vectors are orthogonal if their dot product is zero. Geometrically,
this means that the angle between these two vectors is π/2. For example, the two vectors
(2, 3) and (3, −2) drawn earlier are orthogonal, because (2, 3) · (3, −2) = 6 − 6 = 0. In the
picture, you can see that these two vectors are meeting at a right angle.
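The length, dot product and angle formulas above translate almost directly into code. The following is a minimal sketch using only the standard library; the function names `dot`, `length` and `angle` are choices made for this illustration.

```python
import math

def dot(u, v):
    """Dot product u . v = u1*v1 + u2*v2 + ... + un*vn."""
    return sum(a * b for a, b in zip(u, v))

def length(v):
    """Length of v, i.e. the square root of v . v."""
    return math.sqrt(dot(v, v))

def angle(u, v):
    """Angle between u and v in radians, via cos(theta) = u.v / (|u| |v|)."""
    return math.acos(dot(u, v) / (length(u) * length(v)))

theta = angle((1, 1, 1, 1), (3, 1, 1, 1))   # the worked example
right_angle = angle((2, 3), (3, -2))        # the orthogonal pair
```

Because these are floating-point computations, compare results with a tolerance (for example `math.isclose`) rather than with exact equality.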
5.1 Lines
5.1.1 A Line Through Two Points
Given two points P ≠ Q in R^n, the line through P and Q has equation given by
x = P + t(Q − P), t ∈ R
Example. The line through the two points (1, 0), (0, 1) in R² has equation
x = (1, 0) + t((0, 1) − (1, 0)) = (1, 0) + t(−1, 1), t ∈ R.
We can draw this line by plugging in various values of t into our equation:
t     (x, y)
-1    (2, -1)
0     (1, 0)
1     (0, 1)
2     (-1, 2)
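The table of points can be reproduced by evaluating P + t(Q − P) coordinate by coordinate. This is a small sketch; `point_on_line` is a name invented for this illustration.

```python
def point_on_line(P, Q, t):
    """Return P + t(Q - P), computed coordinate by coordinate."""
    return tuple(p + t * (q - p) for p, q in zip(P, Q))

P, Q = (1, 0), (0, 1)
table = {t: point_on_line(P, Q, t) for t in (-1, 0, 1, 2)}
```

Setting t = 0 gives P and t = 1 gives Q, which is a quick way to check that an equation of this form really passes through both points.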
5.1.2 A Line Through a Point, Parallel to a Vector
Given a point P and vector v ≠ 0 in R^n, the line through P parallel to v has equation
given by
x = P + tv, t ∈ R
Example. The line through (1, 1, 1) parallel to the vector (0, 1, −1) in R³ has equation
x = (1, 1, 1) + t(0, 1, −1), t ∈ R.
We draw this line here in purple, along with the vector (0, 1, −1) that it is parallel to in gold:
t     (x, y, z)
-1    (1, 0, 2)
0     (1, 1, 1)
1     (1, 2, 0)
2     (1, 3, -1)
Example. We saw earlier that the line through the point (1, 1, 1) parallel to the vector (0, 1, −1)
in R³ has equation
x = (1, 1, 1) + t(0, 1, −1), t ∈ R.
We can express this as a set of parametric equations by writing out what the equation above
tells us x, y, z are equal to:
x = 1,
y = 1 + t,
z = 1 − t, t ∈ R.
ax + by = c.
In R³, things get weirder. A general equation for a line in R³ is any pair of equations of
the form
ax + by + cz = d,
ex + fy + gz = h,
as long as the two planes described by these two equations are not parallel.
Example. We can graph the line with general equations
x + y + z = 1,
x − y + 2z = −1
by plugging in values of x into both equations and using this to solve for corresponding
values of y and z. For instance, if x = 0 we have y + z = 1, −y + 2z = −1, which when
added tells us that 3z = 0, i.e. z = 0, which forces y = 1, and gives us the point (0, 1, 0).
Doing this several times generates the graph below:
x     y     z
0     1     0
3     0     -2
-3    2     2
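The same "pick x, then solve for y and z" procedure can be written out for this particular pair of equations. This sketch hard-codes the elimination step from the example (adding the two equations), so it is specific to x + y + z = 1 and x − y + 2z = −1; the function name is made up here.

```python
from fractions import Fraction

def point_with_x(x):
    """Solve x + y + z = 1 and x - y + 2z = -1 for y and z at a given x."""
    x = Fraction(x)
    z = -2 * x / 3    # adding the two equations gives 2x + 3z = 0
    y = 1 - x - z     # back-substitute into x + y + z = 1
    return (x, y, z)

points = [point_with_x(x) for x in (0, 3, -3)]
```

Each returned point can be checked by substituting it back into both general equations.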
5.2 Planes
5.2.1 A Plane Through Three Points
Given three points P, Q, R ∈ R^n not all contained on the same line, the plane containing
the three points P, Q, R has equation given by
x = P + s(Q − P) + t(R − P), s, t ∈ R
Example. The plane containing the three points (1, 1, 0), (1, 0, 1), (0, 1, 1) in R³ has equation
x = (1, 1, 0) + s(0, −1, 1) + t(−1, 0, 1), s, t ∈ R.
We can draw this plane by plugging in various values of s, t into our equation.
s     t     (x, y, z)
1     1     (0, 0, 2)
1     -1    (2, 0, 0)
-1    1     (0, 2, 0)
x = P + sv + tw, s, t ∈ R
Example. The plane containing the point (0, 0, 3) parallel to the vectors (2, 0, 0), (0, −3, 0) in
R³ has equation
x = (0, 0, 3) + s(2, 0, 0) + t(0, −3, 0), s, t ∈ R.
5.2.3 Parametric Equations for a Plane
A set of parametric equations for a plane in Rn is a set of expressions of the form
a+bs+ct where a, b, c are constants, where we have one such expression for each coordinate.
Example. We saw earlier that the plane containing the three points (1, 1, 0), (1, 0, 1), (0, 1, 1)
in R³ has equation
x = (1, 1, 0) + s(0, −1, 1) + t(−1, 0, 1), s, t ∈ R.
We can express this as a set of parametric equations by writing out what the equation above
tells us x, y, z are equal to:
x = 1 − t,
y = 1 − s,
z = s + t, s, t ∈ R.
n · (x − P) = 0.
Example. The plane x + y + z = 2 from earlier has (1, 1, 1) as a normal vector. If we combine
this information with our knowledge that this plane goes through the point (1, 1, 0), we can
create a point-normal equation for this plane:
(1, 1, 1) · ((x, y, z) − (1, 1, 0)) = 0.
Why is this useful? Well: given two planes P, Q, it turns out that the angle of intersection
of these two planes is precisely the angle of intersection of their normal vectors! In other
words, if you have two planes P, Q intersecting at some angle θ, then the angle between the
normal vectors to these planes is also equal to θ.
Example. The angle of intersection θ of the plane x + y + z = 2 and the plane x + 2y − 3z = 4
is the same thing as the angle of intersection θ of the normal vectors (1, 1, 1) and (1, 2, −3),
which is
\[ \theta = \cos^{-1}\left( \frac{(1, 1, 1) \cdot (1, 2, -3)}{||(1, 1, 1)|| \, ||(1, 2, -3)||} \right) = \cos^{-1}(0) = \frac{\pi}{2}. \]
6 Switching Between Different Equations
6.1 Lines
Parametric equations, equations through two points, and equations through a point parallel
to a vector for lines are almost identical: there’s really no difference between saying
Converting between general and parametric equations, then, is really the only interesting
thing. You can do this as follows:
1. Given general equations for a line, choose a variable in your equations that’s not
constant. Set it equal to t. Now, plug this into your other equations, and solve for
the other variables in terms of t.
2. Given parametric equations for a line, choose one equation and solve for t in terms of
one of your coordinates. Use this to get rid of t in your other equations.
x=t
y = 2 − 2t
z = 1, t ∈ R.
x − 2y = −7, z + y = 5.
6.2 Planes
Similarly, it’s very easy to switch between parametric equations, equations through three
points, and equations through a point parallel to two vectors for a plane:
So the only interesting thing, again, is switching between parametric and general
equations. You can do this with the exact same methods as before:
1. Given a general equation for a plane, choose two variables in your equation that are not
constant. Set one equal to t, and the other equal to s. Plug these into your general
equation to express the other variable in terms of s and t.
2. Given parametric equations for a plane, choose one equation and solve for t in terms of
the other variables. Plug this into your other equations to get rid of t. Now, solve for
s, and plug this into your third equation. You should now have a general equation!
x = 4 − 3t − 2s
y=s
z = t, s, t ∈ R.
y = (x − t) − t = x − 2t,
z = 1 + 2(x − t) + 3t = 1 + 2x + t
The third equation tells us that z −1−2x = t, which when plugged into the second equation
gives us y = x − 2(z − 1 − 2x) = 5x − 2z + 2, which when rearranged is the general equation
−5x + y + 2z = 2.
We illustrate this process with examples:
Example. To find the intersection of the line with general equation x + 2y = 4 with the
line with parametric equations x = 3t − 1, y = t, we plug our parametric equations into our
general equation, and get
(3t − 1) + 2(t) = 4
⇒5t = 5
⇒t = 1.
This is a single value of t; therefore these two lines intersect at a point, in particular the
point we get when we plug t = 1 into the equations x = 3t − 1, y = t: the point (2, 1)!
Example. To find the intersection of the line with parametric equations x = 2 + t, y =
3 − t, z = t with the plane 2x + y − z = 0, we plug our parametric equations into our general
equation and get that
2(2 + t) + (3 − t) − (t) = 0
⇒7 = 0.
This is a contradiction: therefore, no such value of t exists, and the intersection of these
two objects is the empty set!
Example. To find the intersection of the plane with general equation 2x − y + z = 4 and
the plane with parametric equations x = 2 + s, y = 2 + t, z = 2 − 2s + t, we plug our
parametric equations into our general equation, and get
2(2 + s) − (2 + t) + (2 − 2s + t) = 4
⇒4 + 2s − 2 − t + 2 − 2s + t = 4
⇒0 = 0.
In other words, we can’t solve for either s or t! Therefore the intersection of these two
planes is a plane, and in particular this means these two planes are the same!
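The "plug the parametric equations into the general equation" step for a line can be automated, because along a line the general equation turns into something of the form a·t + b = 0. The following is a sketch under that assumption; `intersect_line`, `param` and `residual` are names invented for this illustration.

```python
from fractions import Fraction

def intersect_line(param, residual):
    """Classify where a parametric line meets a linear equation.

    param(t) returns a point on the line; residual(point) returns
    (left-hand side) - (right-hand side) of the general equation.
    Along the line the residual equals a*t + b, so two samples fix a and b.
    """
    b = Fraction(residual(param(0)))
    a = Fraction(residual(param(1))) - b
    if a != 0:
        return ("point", param(-b / a))          # exactly one value of t works
    return ("empty", None) if b != 0 else ("whole line", None)

# x = 3t - 1, y = t  against  x + 2y = 4
first = intersect_line(lambda t: (3 * t - 1, t),
                       lambda p: p[0] + 2 * p[1] - 4)

# x = 2 + t, y = 3 - t, z = t  against  2x + y - z = 0
second = intersect_line(lambda t: (2 + t, 3 - t, t),
                        lambda p: 2 * p[0] + p[1] - p[2])
```

The three possible outcomes mirror the cases in the examples: a single value of t, a contradiction such as 7 = 0, or an identity such as 0 = 0.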
8.1 Fundamentals
A linear equation in n variables x_1, …, x_n is an equation of the form a_1 x_1 + a_2 x_2 + … +
a_n x_n = c, for constants a_1, …, a_n, c ∈ R. For example, 2x + 3y = 4 and 3x + 2y − z = 11 are
both linear equations. A system of linear equations is just a collection of multiple linear
equations, like
2x + y = 1
x + y = 2.
Given a system of linear equations, a solution is any way
to replace the variables of that system with real numbers, so that all of the equations are
satisfied. For example, (x, y) = (−1, 3) is a solution to the system 2x + y = 1, x + y = 2, because plugging
in x = −1, y = 3 into our equations satisfies both of them!
Given a system of linear equations, we often want to find all of the possible solutions
to that system of equations. In these lectures, we came up with a multi-step process for
finding these solutions! The key ingredients in this process are the following:
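into a set of linear equations:
\[ \left[\begin{array}{cc|c} 1 & 0 & -1 \\ 0 & 1 & 3 \end{array}\right] \longrightarrow \begin{array}{l} x + 0y = -1 \\ 0x + y = 3 \end{array} \]
In this form, it’s really easy to read off the solutions to our original system: we just want x = −1, y = 3!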
[Flowchart: finding the RREF of a matrix. Is there a row without a leading 1 that has nonzero entries in it? If no: we’re done! If yes: let R_i be the topmost row without a leading 1, and let C_j be the leftmost column containing nonzero entries other than leading 1’s. Using row operations, make the entry in (i, j) a 1; call this 1 a leading 1. Make all other entries in column C_j zero by adding multiples of R_i to them.]
[Flowchart: interpreting the RREF. Is there a row that looks like 0 … 0 | c for some c ≠ 0? If no: how many free variables are there? With 2 free variables, you have a plane of solutions.]
Figure 2: A flowchart for how to use RREFs to solve systems of linear equations.
These charts summarize the processes we’ve described in class for finding the RREF of
a matrix and interpreting it! To illustrate how these processes work in action, we consider
an example problem here:
Example. Consider the system of linear equations
4x+ 3y+ 2z = 1
x+ y+ z = 1
y+ 2z = 3
What are the solutions to this system of linear equations? How many solutions does this
system of linear equations have?
Answer. We start by converting this system of linear equations to an augmented matrix:
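\[
\begin{array}{rcl} 4x + 3y + 2z &=& 1 \\ x + y + z &=& 1 \\ y + 2z &=& 3 \end{array}
\quad\longrightarrow\quad
\left[\begin{array}{ccc|c} 4 & 3 & 2 & 1 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{array}\right]
\]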
We then use our first flowchart to tell us how to find the reduced row-echelon form of this
matrix:
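Find R_1 and C_1. Place a leading 1 in (1, 1) by swapping R_1 and R_2, then make the other entries in C_1 zero by adding −4R_1 to R_2:
\[
\left[\begin{array}{ccc|c} 4 & 3 & 2 & 1 \\ 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{array}\right]
\xrightarrow{\text{swap } R_1, R_2}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 4 & 3 & 2 & 1 \\ 0 & 1 & 2 & 3 \end{array}\right]
\xrightarrow{\text{add } -4R_1 \text{ to } R_2}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & -1 & -2 & -3 \\ 0 & 1 & 2 & 3 \end{array}\right]
\]
Find R_2 and C_2. Place a leading 1 in (2, 2) by scaling R_2 by −1, then make the other entries in C_2 zero by adding −R_2 to R_1 and R_3:
\[
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & -1 & -2 & -3 \\ 0 & 1 & 2 & 3 \end{array}\right]
\xrightarrow{\text{scale } R_2 \text{ by } -1}
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 2 & 3 \end{array}\right]
\xrightarrow{\text{add } -R_2 \text{ to } R_1, R_3}
\left[\begin{array}{ccc|c} 1 & 0 & -1 & -2 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]
Try to find more nonzero rows: none exist, so we are done!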
We then consult our second flowchart to figure out what this means:
• We do not have any rows where all of the entries to the left of the vertical break are
0, but where the entry to the right is nonzero. So we are not in the “no solutions”
case.
• We now count our free variables. The variables x and y are leading, because there are
leading 1’s in their columns; this leaves z, which is our one free variable.
• According to our flowchart, this means we have a line of solutions!
Finally, if we want the equations for this line, we can just translate our augmented matrix
back into a set of equations:
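\[
\left[\begin{array}{ccc|c} 1 & 0 & -1 & -2 \\ 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 \end{array}\right]
\longrightarrow
\begin{array}{rcl} x - z &=& -2 \\ y + 2z &=& 3 \end{array}
\]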
This gives us a set of general equations for our line. If we want a set of parametric equations,
we can set each free variable equal to its own parameter (in this case, set z = t) and rewrite
all of our other equations in terms of that parameter:
z = t,
x − t = −2 ⇒ x = t − 2,
y + 2t = 3 ⇒ y = 3 − 2t.
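If you want to check a hand computation like the one above, the reduction can be sketched in code. This is a standard Gauss-Jordan routine using exact fractions; it may choose its row operations in a different order from the flowchart used in class, but the reduced row-echelon form of a matrix is unique, so the final matrix should agree. The function name `rref` is a choice made for this illustration.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row-echelon form of a matrix given as a list of rows."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        src = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if src is None:
            continue
        m[pivot_row], m[src] = m[src], m[pivot_row]          # swap rows
        lead = m[pivot_row][col]
        m[pivot_row] = [v / lead for v in m[pivot_row]]      # scale to get a leading 1
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:            # clear the rest of the column
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m

reduced = rref([[4, 3, 2, 1],
                [1, 1, 1, 1],
                [0, 1, 2, 3]])
```

Reading the result row by row gives x − z = −2 and y + 2z = 3, with z left free.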
• Add matrices together, multiply matrices by scalars, multiply matrices together, and
take the transpose of a matrix.
• Find the inverse of a matrix, and how to use it to find solutions to a system of linear
equations.
• Know what the identity matrix is.
• Know how to take the determinant of 2 × 2, 3 × 3 and n × n matrices in general.
• Know several useful properties of the determinant.
A matrix of size m × n is a grid of numbers with m rows and n columns. For example,
$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$
is a 2 × 3 matrix. We defined several operations on matrices in class! Most of
these operations were very intuitive. Addition, for example, was pretty straightforward:
given two matrices A, B of the same size, we could create a new matrix A + B by just
adding each cell in A to the corresponding cell in B. For example,
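\[
\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} + \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 1+0 & 2+1 & 3+0 \\ 2+1 & 3+0 & 1+1 \end{pmatrix}
= \begin{pmatrix} 1 & 3 & 3 \\ 3 & 3 & 2 \end{pmatrix}.
\]
You can’t add matrices of different sizes; that is, $\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} + \begin{pmatrix} 1 & 2 & 4 \end{pmatrix}$ DNE.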
Scalar multiplication was similarly nice; given any m × n matrix A and any real number
x, we could form the matrix xA by multiplying each coordinate of A by x; for example,
\[ 2 \begin{pmatrix} 1 & 3 & 3 \\ 3 & 3 & 2 \end{pmatrix} = \begin{pmatrix} 2 & 6 & 6 \\ 6 & 6 & 4 \end{pmatrix}. \]
As well, given a matrix A we can form its transpose AT by “switching” its rows and
its columns; that is, given a matrix A, we can form the matrix AT by making a matrix
whose first row is A’s first column, whose second row is A’s second column, and so on/so
forth until we run out of columns. This is perhaps best illustrated by an example:
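\[ \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}^T = \begin{pmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{pmatrix} \]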
In other words, to find the entry that goes in (i, j), take the dot product of the i-th row of
A and the j-th column of B.
This probably looks scary, but in practice it’s not too bad. Here’s an example of this in
action:
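Example. If $A = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 3 & -2 \\ 0 & 1 & 1 \end{pmatrix}$ and $D = \begin{pmatrix} -1 & 1 \\ 1 & -1 \\ 3 & 4 \end{pmatrix}$, find AD.
\[
AD = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 3 & -2 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 1 \\ 1 & -1 \\ 3 & 4 \end{pmatrix}
= \begin{pmatrix} r_{A,1} \cdot c_{D,1} & r_{A,1} \cdot c_{D,2} \\ r_{A,2} \cdot c_{D,1} & r_{A,2} \cdot c_{D,2} \\ r_{A,3} \cdot c_{D,1} & r_{A,3} \cdot c_{D,2} \end{pmatrix}
= \begin{pmatrix} (1, 2, 1) \cdot (-1, 1, 3) & (1, 2, 1) \cdot (1, -1, 4) \\ (2, 3, -2) \cdot (-1, 1, 3) & (2, 3, -2) \cdot (1, -1, 4) \\ (0, 1, 1) \cdot (-1, 1, 3) & (0, 1, 1) \cdot (1, -1, 4) \end{pmatrix}
\]
\[
= \begin{pmatrix} -1 + 2 + 3 & 1 - 2 + 4 \\ -2 + 3 - 6 & 2 - 3 - 8 \\ 0 + 1 + 3 & 0 - 1 + 4 \end{pmatrix}
= \begin{pmatrix} 4 & 3 \\ -5 & -9 \\ 4 & 3 \end{pmatrix}
\]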
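The "row of the first matrix dotted with column of the second" rule can be written as a short function. This is a plain-Python sketch with matrices stored as lists of rows; `mat_mul` is a name chosen for this illustration.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows.

    Entry (i, j) of the result is (row i of A) . (column j of B).
    """
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    columns_of_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]

A = [[1, 2, 1], [2, 3, -2], [0, 1, 1]]
D = [[-1, 1], [1, -1], [3, 4]]
AD = mat_mul(A, D)
```

The size check at the top corresponds to the requirement that the rows of the first matrix and the columns of the second have the same number of entries.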
Notice that to use our definition, we need the number of columns in A to be equal to
the number of rows in B. If A and B do not have the right sizes to use the definition above,
we say that their product is undefined; so, for instance, $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \end{pmatrix}$ DNE.
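Matrix multiplication has a number of interesting properties. One is that most of
the time, the order of multiplication matters: that is, AB and BA are often very
different! For example, $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, but if we switch the order we can see that
$\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ is quite different! Another property is that much like how R has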
a “multiplicative identity” in 1 (that is, a number we can multiply by anything without
changing that thing), we have an identity for matrices as well: in general, for any n we
define the n × n identity matrix In as
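\[ I_n = \begin{pmatrix} 1 & 0 & 0 & \ldots & 0 \\ 0 & 1 & 0 & \ldots & 0 \\ 0 & 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & 1 \end{pmatrix} \]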
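This is an n × n matrix with ones on the main diagonal (i.e. 1’s in every cell (i, i))
and zeroes everywhere else. This matrix has the property that for any m × n matrix A,
I_m · A = A · I_n = A. For example, if $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$, then
\[
I_3 A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}
= \begin{pmatrix} (1, 0, 0) \cdot (1, 4, 7) & (1, 0, 0) \cdot (2, 5, 8) & (1, 0, 0) \cdot (3, 6, 9) \\ (0, 1, 0) \cdot (1, 4, 7) & (0, 1, 0) \cdot (2, 5, 8) & (0, 1, 0) \cdot (3, 6, 9) \\ (0, 0, 1) \cdot (1, 4, 7) & (0, 0, 1) \cdot (2, 5, 8) & (0, 0, 1) \cdot (3, 6, 9) \end{pmatrix}
= \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}
\]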
there is some matrix A⁻¹ such that AA⁻¹ = I_n = A⁻¹A; we call A⁻¹ the inverse of A.
Not all matrices are invertible, but many are! In class, we described a process for finding
the inverse of a matrix A:
Answer. We run our process here. First, we form [C|I]; then, we perform row operations
until the left-hand side is in RREF:
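\[
[C|I] = \left[\begin{array}{ccc|ccc} 5 & 1 & 0 & 1 & 0 & 0 \\ 4 & 5 & 2 & 0 & 1 & 0 \\ 5 & 3 & 1 & 0 & 0 & 1 \end{array}\right]
\]
\[
\xrightarrow{\text{add } -R_1 \text{ to } R_3, \text{ then add } -R_2 \text{ to } R_1}
\left[\begin{array}{ccc|ccc} 5-4 & 1-5 & 0-2 & 1-0 & 0-1 & 0-0 \\ 4 & 5 & 2 & 0 & 1 & 0 \\ 5-5 & 3-1 & 1-0 & 0-1 & 0-0 & 1-0 \end{array}\right]
= \left[\begin{array}{ccc|ccc} 1 & -4 & -2 & 1 & -1 & 0 \\ 4 & 5 & 2 & 0 & 1 & 0 \\ 0 & 2 & 1 & -1 & 0 & 1 \end{array}\right]
\]
\[
\xrightarrow{\text{add } -4R_1 \text{ to } R_2}
\left[\begin{array}{ccc|ccc} 1 & -4 & -2 & 1 & -1 & 0 \\ 4-4 & 5+16 & 2+8 & 0-4 & 1+4 & 0+0 \\ 0 & 2 & 1 & -1 & 0 & 1 \end{array}\right]
= \left[\begin{array}{ccc|ccc} 1 & -4 & -2 & 1 & -1 & 0 \\ 0 & 21 & 10 & -4 & 5 & 0 \\ 0 & 2 & 1 & -1 & 0 & 1 \end{array}\right]
\]
\[
\xrightarrow{\text{add } -10R_3 \text{ to } R_2}
\left[\begin{array}{ccc|ccc} 1 & -4 & -2 & 1 & -1 & 0 \\ 0-0 & 21-20 & 10-10 & -4+10 & 5-0 & 0-10 \\ 0 & 2 & 1 & -1 & 0 & 1 \end{array}\right]
= \left[\begin{array}{ccc|ccc} 1 & -4 & -2 & 1 & -1 & 0 \\ 0 & 1 & 0 & 6 & 5 & -10 \\ 0 & 2 & 1 & -1 & 0 & 1 \end{array}\right]
\]
\[
\xrightarrow{\text{add } -2R_2 \text{ to } R_3, \text{ add } 4R_2 \text{ to } R_1}
\left[\begin{array}{ccc|ccc} 1+0 & -4+4 & -2+0 & 1+24 & -1+20 & 0-40 \\ 0 & 1 & 0 & 6 & 5 & -10 \\ 0-0 & 2-2 & 1-0 & -1-12 & 0-10 & 1+20 \end{array}\right]
= \left[\begin{array}{ccc|ccc} 1 & 0 & -2 & 25 & 19 & -40 \\ 0 & 1 & 0 & 6 & 5 & -10 \\ 0 & 0 & 1 & -13 & -10 & 21 \end{array}\right]
\]
\[
\xrightarrow{\text{add } 2R_3 \text{ to } R_1}
\left[\begin{array}{ccc|ccc} 1 & 0 & -2+2 & 25-26 & 19-20 & -40+42 \\ 0 & 1 & 0 & 6 & 5 & -10 \\ 0 & 0 & 1 & -13 & -10 & 21 \end{array}\right]
= \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -1 & -1 & 2 \\ 0 & 1 & 0 & 6 & 5 & -10 \\ 0 & 0 & 1 & -13 & -10 & 21 \end{array}\right]
\]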
The left-hand side is in reduced row-echelon form, and in fact is the identity; therefore
the right-hand side $\begin{pmatrix} -1 & -1 & 2 \\ 6 & 5 & -10 \\ -13 & -10 & 21 \end{pmatrix}$ is C⁻¹!
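To make sure we haven’t made any errors in our calculations, we check that CC⁻¹ is in
fact equal to I here:
\[
CC^{-1} = \begin{pmatrix} 5 & 1 & 0 \\ 4 & 5 & 2 \\ 5 & 3 & 1 \end{pmatrix} \begin{pmatrix} -1 & -1 & 2 \\ 6 & 5 & -10 \\ -13 & -10 & 21 \end{pmatrix}
= \begin{pmatrix} (5, 1, 0) \cdot (-1, 6, -13) & (5, 1, 0) \cdot (-1, 5, -10) & (5, 1, 0) \cdot (2, -10, 21) \\ (4, 5, 2) \cdot (-1, 6, -13) & (4, 5, 2) \cdot (-1, 5, -10) & (4, 5, 2) \cdot (2, -10, 21) \\ (5, 3, 1) \cdot (-1, 6, -13) & (5, 3, 1) \cdot (-1, 5, -10) & (5, 3, 1) \cdot (2, -10, 21) \end{pmatrix}
\]
\[
= \begin{pmatrix} -5 + 6 + 0 & -5 + 5 + 0 & 10 - 10 + 0 \\ -4 + 30 - 26 & -4 + 25 - 20 & 8 - 50 + 42 \\ -5 + 18 - 13 & -5 + 15 - 10 & 10 - 30 + 21 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]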
Success!
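The [A|I] process can also be sketched in code as a way to double-check a hand-computed inverse. This is a generic Gauss-Jordan routine with exact fractions, so its row operations will not match the ones chosen by hand above, but an inverse is unique when it exists. The function name `inverse` and the convention of returning `None` for a non-invertible matrix are choices made for this illustration.

```python
from fractions import Fraction

def inverse(matrix):
    """Invert a square matrix by row-reducing [A | I]; return None if not invertible."""
    n = len(matrix)
    # Build the augmented matrix [A | I] with exact fractions.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        src = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if src is None:
            return None                                   # no leading 1 possible here
        aug[col], aug[src] = aug[src], aug[col]           # swap rows
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]           # scale to get a leading 1
        for r in range(n):
            if r != col:                                  # clear the rest of the column
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]                       # right-hand block is the inverse

C_inverse = inverse([[5, 1, 0], [4, 5, 2], [5, 3, 1]])
B_inverse = inverse([[2, 1], [8, 4]])
```

A result of `None` corresponds to the situation where the left-hand side cannot be reduced to the identity.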
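If this matrix was not invertible, we would not have gotten the identity on the left at
the end. For example, $B = \begin{pmatrix} 2 & 1 \\ 8 & 4 \end{pmatrix}$ is not invertible, because
\[
[B|I] = \left[\begin{array}{cc|cc} 2 & 1 & 1 & 0 \\ 8 & 4 & 0 & 1 \end{array}\right]
\xrightarrow{\text{multiply } R_1 \text{ by } \frac{1}{2}}
\left[\begin{array}{cc|cc} 1 & \frac{1}{2} & \frac{1}{2} & 0 \\ 8 & 4 & 0 & 1 \end{array}\right]
\]
\[
\xrightarrow{\text{add } -8R_1 \text{ to } R_2}
\left[\begin{array}{cc|cc} 1 & \frac{1}{2} & \frac{1}{2} & 0 \\ 8 - 8 \cdot 1 & 4 - 8 \cdot \frac{1}{2} & 0 - 8 \cdot \frac{1}{2} & 1 - 8 \cdot 0 \end{array}\right]
= \left[\begin{array}{cc|cc} 1 & \frac{1}{2} & \frac{1}{2} & 0 \\ 0 & 0 & -4 & 1 \end{array}\right]
\]
gives us a matrix whose left-hand side is in RREF but is not the identity I_2.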
Inverses of matrices can be used to solve systems of linear equations! Notice that if we
have n linear equations in n unknowns, like for example
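5x + y = 1,
4x + 5y + 2z = 1,
5x + 3y + z = 1,
we can rewrite this as
\[ \begin{pmatrix} 5 & 1 & 0 \\ 4 & 5 & 2 \\ 5 & 3 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]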
In general, if you have a system of linear equations in n variables, if you let A be the
matrix of coefficients of those variables, x be the vector consisting of all of those variables,
and b be the vector of the constants each equation is equal to, you can always express that
system of linear equations as Ax = b, just like we’ve done here.
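Returning to this example: earlier in these notes, we said that if $C = \begin{pmatrix} 5 & 1 & 0 \\ 4 & 5 & 2 \\ 5 & 3 & 1 \end{pmatrix}$ then
C is invertible, and in particular $C^{-1} = \begin{pmatrix} -1 & -1 & 2 \\ 6 & 5 & -10 \\ -13 & -10 & 21 \end{pmatrix}$.
Therefore, if we want to solve the equation $C \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} a \\ b \\ c \end{pmatrix}$, we can just multiply both
sides by C⁻¹ to get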
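\[
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = C^{-1} \begin{pmatrix} a \\ b \\ c \end{pmatrix},
\]
which for our system (where a = b = c = 1) is
\[
\begin{pmatrix} -1 & -1 & 2 \\ 6 & 5 & -10 \\ -13 & -10 & 21 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
= \begin{pmatrix} (-1, -1, 2) \cdot (1, 1, 1) \\ (6, 5, -10) \cdot (1, 1, 1) \\ (-13, -10, 21) \cdot (1, 1, 1) \end{pmatrix}
= \begin{pmatrix} -1 - 1 + 2 \\ 6 + 5 - 10 \\ -13 - 10 + 21 \end{pmatrix}
= \begin{pmatrix} 0 \\ 1 \\ -2 \end{pmatrix}
\]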
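In other words, we’ve solved our system of linear equations $C \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, and found that
$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = C^{-1} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ -2 \end{pmatrix}$!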
This process works in general: if you have a system of linear equations of the form
Ax = b, then if A⁻¹ exists, you get exactly one solution to this system, and it’s A⁻¹b! In
general, this is not the fastest way to solve a system of linear equations, and it only applies
when you have A⁻¹; if A⁻¹ does not exist, then you cannot use this method, and should go
back to our earlier methods using the RREF to find a solution. But if someone has given
you A⁻¹ for free, then this is a faster way to solve systems of linear equations!
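Once an inverse is known, solving Ax = b is a single matrix-vector product. The sketch below reuses the matrix C and the inverse C⁻¹ computed in these notes; the helper name `mat_vec` is invented for this illustration.

```python
C = [[5, 1, 0], [4, 5, 2], [5, 3, 1]]
C_inverse = [[-1, -1, 2], [6, 5, -10], [-13, -10, 21]]   # as computed in the notes
b = [1, 1, 1]

def mat_vec(M, v):
    """Multiply the matrix M (a list of rows) by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

solution = mat_vec(C_inverse, b)   # x = C^{-1} b
check = mat_vec(C, solution)       # plugging back in should reproduce b
```

Plugging the solution back into the original system is a cheap way to catch arithmetic slips.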
In particular, this means that the inverse is connected to finding solutions to systems of
linear equations in certain ways:
• Let A be a square matrix. If A⁻¹ exists, then Ax = b has exactly one solution for
every b.
• This also applies in the other direction: if A is a square matrix and Ax = b has
exactly one solution for some b, then A⁻¹ exists.
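[Figure: the matrix with rows (a, b, c), (d, e, f), (g, h, i), written twice side by side, with three diagonals marked in blue and three diagonals marked in red.]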
The three blue diagonals correspond to the three terms you add, and the three red
diagonals are the three terms you subtract in the formula above.
For larger matrices, like 4 × 4 and on up, most of the formulas you could memorize get
very messy very quickly. So instead we came up with some properties that can help you
calculate the determinant of a large matrix quickly:
• Given a square matrix A, we know how our row operations from earlier affect the
determinant of A:
– If we multiply a row of A by a constant c, this multiplies the determinant by c.
– If we switch two rows in A, this multiplies the determinant by −1.
– If we add a multiple of one row to another row in A, this does nothing to the
determinant.
• We say that a matrix A is upper-triangular if the only cells in A that contain
nonzero values are those on or above the main diagonal; that is, upper-triangular
matrices are ones that look like $\begin{pmatrix} 1 & 0 & 2 \\ 0 & 2 & 3 \\ 0 & 0 & 0 \end{pmatrix}$.
• The determinant of a matrix that is upper-triangular is the product of the entries on
its diagonal.
Accordingly, this gives us a nice blueprint for how to find the determinant of any square
matrix A:
• Take A and perform row operations on it to transform it into an upper-triangular
matrix B.
• Calculate the determinant of B by multiplying the entries on its diagonal!
• Use this to find the determinant of A by correcting for the row operations you performed:
that is, for each swap you did to A, make sure to multiply det(B) by −1 to
cancel out the earlier −1, and for each time you multiplied a row in A by a constant
c make sure to multiply det(B) by 1/c.
We calculate a few examples here:
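Example. Find the determinants of the following matrices:
\[ A = \begin{pmatrix} 7 & 2 \\ 2 & 1 \end{pmatrix}, \quad
B = \begin{pmatrix} 9 & 8 & 7 \\ 6 & 5 & 4 \\ 3 & 2 & 1 \end{pmatrix}, \quad
D = \begin{pmatrix} 1 & 2 & 0 & 0 & 0 \\ 2 & 1 & 2 & 0 & 0 \\ 0 & 2 & 1 & 2 & 0 \\ 0 & 0 & 2 & 1 & 2 \\ 0 & 0 & 0 & 2 & 1 \end{pmatrix} \]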
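Answer. For A and B, we just use the formulas for the determinants of 2 × 2 and 3 × 3 matrices:
\[ \det\begin{pmatrix} 7 & 2 \\ 2 & 1 \end{pmatrix} = 7 \cdot 1 - 2 \cdot 2 = 3, \]
\begin{align*}
\det\begin{pmatrix} 9 & 8 & 7 \\ 6 & 5 & 4 \\ 3 & 2 & 1 \end{pmatrix}
&= 9 \cdot 5 \cdot 1 + 8 \cdot 4 \cdot 3 + 7 \cdot 6 \cdot 2 - 9 \cdot 4 \cdot 2 - 8 \cdot 6 \cdot 1 - 7 \cdot 5 \cdot 3 \\
&= 45 + 96 + 84 - 72 - 48 - 105 = 0.
\end{align*}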
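For D, we use row operations to transform this matrix into a triangular matrix:
\begin{align*}
\begin{pmatrix} 1 & 2 & 0 & 0 & 0 \\ 2 & 1 & 2 & 0 & 0 \\ 0 & 2 & 1 & 2 & 0 \\ 0 & 0 & 2 & 1 & 2 \\ 0 & 0 & 0 & 2 & 1 \end{pmatrix}
&\xrightarrow{\text{add } -2R_1 \text{ to } R_2}
\begin{pmatrix} 1 & 2 & 0 & 0 & 0 \\ 0 & -3 & 2 & 0 & 0 \\ 0 & 2 & 1 & 2 & 0 \\ 0 & 0 & 2 & 1 & 2 \\ 0 & 0 & 0 & 2 & 1 \end{pmatrix}
\xrightarrow{\text{add } \frac{2}{3}R_2 \text{ to } R_3}
\begin{pmatrix} 1 & 2 & 0 & 0 & 0 \\ 0 & -3 & 2 & 0 & 0 \\ 0 & 0 & \frac{7}{3} & 2 & 0 \\ 0 & 0 & 2 & 1 & 2 \\ 0 & 0 & 0 & 2 & 1 \end{pmatrix} \\
&\xrightarrow{\text{add } -\frac{6}{7}R_3 \text{ to } R_4}
\begin{pmatrix} 1 & 2 & 0 & 0 & 0 \\ 0 & -3 & 2 & 0 & 0 \\ 0 & 0 & \frac{7}{3} & 2 & 0 \\ 0 & 0 & 0 & -\frac{5}{7} & 2 \\ 0 & 0 & 0 & 2 & 1 \end{pmatrix}
\xrightarrow{\text{add } \frac{14}{5}R_4 \text{ to } R_5}
\begin{pmatrix} 1 & 2 & 0 & 0 & 0 \\ 0 & -3 & 2 & 0 & 0 \\ 0 & 0 & \frac{7}{3} & 2 & 0 \\ 0 & 0 & 0 & -\frac{5}{7} & 2 \\ 0 & 0 & 0 & 0 & \frac{33}{5} \end{pmatrix}
\end{align*}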
The determinant of the last matrix above is just the product of the entries on its diagonal,
i.e. $1 \cdot (-3) \cdot \frac{7}{3} \cdot \left(-\frac{5}{7}\right) \cdot \frac{33}{5} = 33$, because it is upper-triangular. Therefore, because adding
multiples of rows to other rows does not change the determinant, the determinant of the original matrix
is also 33.
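The row-reduction blueprint can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `det_by_row_reduction` is made up for this sketch, and it uses exact fractions so that no rounding sneaks in. It only ever swaps rows (tracking a factor of −1) and adds multiples of one row to another (no correction needed).

```python
from fractions import Fraction

def det_by_row_reduction(rows):
    """Determinant via reduction to upper-triangular form (hypothetical helper)."""
    M = [[Fraction(v) for v in row] for row in rows]
    n = len(M)
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a zero ends up on the diagonal, so the product is 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign        # each swap multiplies the determinant by -1
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            # Adding a multiple of one row to another leaves the determinant alone.
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    product = Fraction(sign)
    for i in range(n):
        product *= M[i][i]
    return product

A = [[7, 2], [2, 1]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
D = [[1, 2, 0, 0, 0],
     [2, 1, 2, 0, 0],
     [0, 2, 1, 2, 0],
     [0, 0, 2, 1, 2],
     [0, 0, 0, 2, 1]]

det_A = det_by_row_reduction(A)
det_B = det_by_row_reduction(B)
det_D = det_by_row_reduction(D)
```

Running this on the three matrices from the example is a quick way to double-check hand computations: the results should agree with 3, 0 and 33.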
For example,
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
So, for example, the derivative of $f(x) = x^2$ at x = 2 is just
\[ \lim_{h \to 0} \frac{(2 + h)^2 - 2^2}{h} = \lim_{h \to 0} \frac{4 + 4h + h^2 - 4}{h} = \lim_{h \to 0} \frac{4h + h^2}{h} = \lim_{h \to 0} (4 + h) = 4. \]
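If you want to see this limit happen numerically, here is a short Python sketch. The helper name `difference_quotient` and the particular step sizes are choices made for this illustration only.

```python
def difference_quotient(f, x, h):
    """The fraction inside the limit definition of the derivative."""
    return (f(x + h) - f(x)) / h

def square(x):
    return x * x

# Shrink h and watch the quotient approach the derivative at x = 2.
quotients = [difference_quotient(square, 2.0, h) for h in (0.1, 0.01, 0.001)]
```

For this function the quotient simplifies to 4 + h, so the three values sit very close to 4.1, 4.01 and 4.001 (up to floating-point rounding), creeping toward 4 as h shrinks.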
So, if we return to our example above where $f(x) = x^2$, we can see that a tangent line to
f(x) at x = 2 would have the equation
\[ y - 4 = 4(x - 2). \]
Drawing this line next to the graph of y = f(x) shows that we are indeed capturing the
idea of the “slope” of our function at x = 2:
[Figure: the graph of $y = x^2$ together with its tangent line at x = 2.]
While elegant, this limit definition of the derivative can take a while to use. Accordingly,
we’ve calculated the derivatives of several simple functions:
• $\frac{d}{dx} e^x = e^x$
• $\frac{d}{dx} x^n = n x^{n-1}$, $n \neq 0$
• $\frac{d}{dx} \cos(x) = -\sin(x)$
• $\frac{d}{dx} \ln(x) = \frac{1}{x}$
• $\frac{d}{dx} c = 0$
• $\frac{d}{dx} \sin(x) = \cos(x)$
We also have a set of rules that let us take the derivative of more complicated functions:
• Differentiation is linear; given any two functions f(x), g(x) and constants a, b, we
have $\frac{d}{dx}(a f(x) + b g(x)) = a f'(x) + b g'(x)$.
• Product rule: given any two functions f(x), g(x), we have
$\frac{d}{dx}(f(x) \cdot g(x)) = f'(x) \cdot g(x) + f(x) \cdot g'(x)$.
• Chain rule: given any two functions f(x), g(x), we have
$\frac{d}{dx}(f(g(x))) = f'(g(x)) \cdot g'(x)$.
Answer. For p(x), we want to use the chain rule; this is because p(x) consists of functions
composed with each other, and the chain rule is the only rule that deals with this! So: let’s
write p(x) = f(g(x)), where $f(x) = e^x$ and $g(x) = \sqrt{x}$. Then, the chain rule tells us that
$p'(x) = \frac{d}{dx}(f(g(x))) = f'(g(x)) \cdot g'(x)$. We know from above that $f'(x) = \frac{d}{dx} e^x = e^x$, and
that $g'(x) = \frac{d}{dx} \sqrt{x} = \frac{d}{dx} x^{1/2} = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}}$; therefore, we have
\[ p'(x) = f'(g(x)) \cdot g'(x) = e^{\sqrt{x}} \cdot \frac{1}{2\sqrt{x}} = \frac{e^{\sqrt{x}}}{2\sqrt{x}}. \]
For q(x), we want to use the product rule, because q(x) consists of the product of two
functions. If we let f(x) = sin(x), g(x) = cos(x) then q(x) = f(x)g(x); so the product rule
says that $\frac{d}{dx} q(x) = \frac{d}{dx}(f(x) \cdot g(x)) = f'(x)g(x) + f(x)g'(x)$. Because $f'(x) = \frac{d}{dx} \sin(x) = \cos(x)$
and $g'(x) = \frac{d}{dx} \cos(x) = -\sin(x)$, this tells us that
\[ q'(x) = \cos(x)\cos(x) + \sin(x)(-\sin(x)) = \cos^2(x) - \sin^2(x). \]
For r(x), it might seem harder to determine which rule to use, but it’s actually not that bad: if
we look at r(x), we just need to decide whether it looks more like f(g(x)) or f(x)g(x)!
In this case, it’s not clear how we would write this as f(g(x)), as there’s not an obvious
“outside” function that we’re applying to some inside function. However, it’s very easy to
see how we’d write this as a product: we can write r(x) = f(x) · g(x), where $f(x) = x^2$ and
$g(x) = \ln(x^2 + 1)$. This is how differentiation usually works: you look at the overall shape
of the function, figure out which rule matches it, and then apply that rule!
If we do that here, then $f'(x) = \frac{d}{dx} x^2 = 2x$, while $g'(x) = \frac{d}{dx} \ln(x^2 + 1)$ is trickier; here
we have to use the chain rule, because we have one function (ln(x)) being applied to another
($x^2 + 1$)! In particular, if we let h(x) = ln(x), $j(x) = x^2 + 1$, then $h'(x) = \frac{1}{x}$, $j'(x) = 2x$, and
therefore the chain rule tells us that $g'(x) = h'(j(x)) \cdot j'(x) = \frac{2x}{x^2 + 1}$.
Plugging this into our product rule work earlier tells us that
\[ r'(x) = f'(x)g(x) + f(x)g'(x) = 2x \ln(x^2 + 1) + \frac{2x^3}{x^2 + 1}. \]
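A handy habit is to sanity-check derivatives numerically. The Python sketch below compares each derivative against a central difference quotient at one sample point. The functions p, q, r are read off from the answers above ($p(x) = e^{\sqrt{x}}$, $q(x) = \sin(x)\cos(x)$, $r(x) = x^2 \ln(x^2 + 1)$); the helper `central_difference`, the sample point 1.5, and the step size are assumptions made for this sketch.

```python
import math

def central_difference(f, x, h=1e-6):
    """Numerical estimate of f'(x); h is an arbitrary small step."""
    return (f(x + h) - f(x - h)) / (2 * h)

def p(x):
    return math.exp(math.sqrt(x))

def p_prime(x):
    return math.exp(math.sqrt(x)) / (2 * math.sqrt(x))

def q(x):
    return math.sin(x) * math.cos(x)

def q_prime(x):
    return math.cos(x) ** 2 - math.sin(x) ** 2

def r(x):
    return x ** 2 * math.log(x ** 2 + 1)

def r_prime(x):
    return 2 * x * math.log(x ** 2 + 1) + 2 * x ** 3 / (x ** 2 + 1)

sample = 1.5  # any point where all three functions are defined would do
errors = {
    "p": abs(central_difference(p, sample) - p_prime(sample)),
    "q": abs(central_difference(q, sample) - q_prime(sample)),
    "r": abs(central_difference(r, sample) - r_prime(sample)),
}
```

If a formula were wrong, its entry in `errors` would be large; tiny values (roughly the size of rounding error) suggest the formula is right, though a numerical check at one point is evidence, not a proof.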
\[ 2x(x^2 + y^2) = 3x^2 - y^2 \]
even though we cannot easily solve for y and make this into a function of x! In this situation,
we use implicit differentiation to try to find $\frac{dy}{dx}$. The idea here is the following: take any
expression involving the variables x and y, like $\sin(x)$ or $e^{xy}$ or $y^2 - 2x + 1$ or $\ln(y)$.
• If this expression has the form f(x) (in other words, it only involves x), define
$\frac{d}{dx} f(x) = f'(x)$. In other words, take the derivative like normal.
• If this expression has the form f(y) (in other words, it only involves y), define
$\frac{d}{dx} f(y) = f'(y) \cdot \frac{dy}{dx}$. In other words, take the derivative like normal, but stick this $\frac{dy}{dx}$ on the
outside.
• If it has both x’s and y’s, use the chain and product rules to break it into smaller
pieces.
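To give an example: let’s look at the curve $2x(x^2 + y^2) = 3x^2 - y^2$ from earlier.
If we apply $\frac{d}{dx}$ to both sides, we get
\begin{align*}
& \frac{d}{dx}\left(2x \cdot (x^2 + y^2)\right) = \frac{d}{dx}\left(3x^2 - y^2\right) \\
\Rightarrow\ & \left(\frac{d}{dx}(2x)\right) \cdot (x^2 + y^2) + (2x) \cdot \frac{d}{dx}(x^2 + y^2) = \frac{d}{dx}(3x^2) - \frac{d}{dx}(y^2) \\
\Rightarrow\ & 2(x^2 + y^2) + (2x)\left(\frac{d}{dx}(x^2) + \frac{d}{dx}(y^2)\right) = 6x - 2y\frac{dy}{dx} \\
\Rightarrow\ & 2(x^2 + y^2) + (2x)\left(2x + 2y\frac{dy}{dx}\right) = 6x - 2y\frac{dy}{dx} \\
\Rightarrow\ & 6x^2 + 2y^2 + 4xy\frac{dy}{dx} = 6x - 2y\frac{dy}{dx}.
\end{align*}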
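Now, we solve for $\frac{dy}{dx}$:
\begin{align*}
& 6x^2 + 2y^2 + 4xy\frac{dy}{dx} = 6x - 2y\frac{dy}{dx} \\
\Rightarrow\ & 4xy\frac{dy}{dx} + 2y\frac{dy}{dx} = 6x - 6x^2 - 2y^2 \\
\Rightarrow\ & (4xy + 2y)\frac{dy}{dx} = 6x - 6x^2 - 2y^2 \\
\Rightarrow\ & \frac{dy}{dx} = \frac{6x - 6x^2 - 2y^2}{4xy + 2y}
\end{align*}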
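To check that our answer makes sense, let’s try graphing a tangent line to this curve at
the point $\left(1, \frac{1}{\sqrt{3}}\right)$. Plugging this point into our equation for $\frac{dy}{dx}$ yields
\[ \frac{dy}{dx} = \frac{6 \cdot 1 - 6 \cdot 1^2 - 2 \cdot \left(\frac{1}{\sqrt{3}}\right)^2}{4 \cdot 1 \cdot \frac{1}{\sqrt{3}} + 2 \cdot \frac{1}{\sqrt{3}}} = \frac{-\frac{2}{3}}{\frac{6}{\sqrt{3}}} = -\frac{1}{3\sqrt{3}}, \]
which tells us that a tangent line to our curve at $\left(1, \frac{1}{\sqrt{3}}\right)$ has equation
\[ y - y_0 = \frac{dy}{dx}(x - x_0) \quad \Rightarrow \quad y - \frac{1}{\sqrt{3}} = -\frac{1}{3\sqrt{3}}(x - 1). \]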
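Graphing this line verifies that it is indeed a tangent line:
[Figure: the curve $2x(x^2 + y^2) = 3x^2 - y^2$ with its tangent line at $\left(1, \frac{1}{\sqrt{3}}\right)$.]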
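Without a picture, we can still check the slope numerically. The Python sketch below confirms that the point lies on the curve, evaluates the formula for dy/dx there, and compares it with a slope estimated directly from the curve. The direct estimate relies on an observation made for this sketch only: rearranging the curve's equation gives $y^2(2x + 1) = 3x^2 - 2x^3$, so near our point the upper branch can be written as a function of x. All function names here are made up for the illustration.

```python
import math

def curve_residual(x, y):
    """Left side minus right side of 2x(x^2 + y^2) = 3x^2 - y^2; zero on the curve."""
    return 2 * x * (x ** 2 + y ** 2) - (3 * x ** 2 - y ** 2)

def dy_dx(x, y):
    """The implicit-differentiation formula for the slope."""
    return (6 * x - 6 * x ** 2 - 2 * y ** 2) / (4 * x * y + 2 * y)

def y_upper_branch(x):
    """Upper branch of the curve, from y^2 (2x + 1) = 3x^2 - 2x^3 (valid near x = 1)."""
    return math.sqrt((3 * x ** 2 - 2 * x ** 3) / (2 * x + 1))

x0, y0 = 1.0, 1.0 / math.sqrt(3)
slope = dy_dx(x0, y0)

# Independent estimate of the slope: central difference along the upper branch.
h = 1e-6
numeric_slope = (y_upper_branch(x0 + h) - y_upper_branch(x0 - h)) / (2 * h)
```

Both `slope` and `numeric_slope` should land near $-\frac{1}{3\sqrt{3}} \approx -0.19245$, which is a reassuring sign that the algebra went through correctly.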
12 Differentiation Applications [Lectures 24-26]
After reading and watching these lectures, we’re hoping that you can do the following tasks:
• Know what it means for a function to be increasing, decreasing, concave up, concave
down, to have an inflection point, a critical point, or a relative maximum or minimum.
• Visually identify all of the above properties.
• Use the derivative to find where a function has any of the above properties.
12.1 Definitions
The derivative can help us visualize and draw functions! It does this in many ways:
• Given a function f , we say that f is increasing on the interval (a, b) if for any
x < y ∈ (a, b), we have f (x) < f (y). Similarly, we say that f is decreasing on (a, b)
if for any x < y ∈ (a, b) we have f (x) > f (y).
[Figure: the graph of an increasing function (left) and of a decreasing function (right).]
• The derivative can tell us when this happens! It turns out that f is increasing on
(a, b) if $f'(x) > 0$ for every x ∈ (a, b), and f is decreasing on (a, b) if $f'(x) < 0$ for
every x ∈ (a, b).
• We say that f is concave up on (a, b) if $f''(x) > 0$ on (a, b); similarly, f is concave
down on (a, b) if $f''(x) < 0$ on (a, b). Visually, concave up graphs look like they’re
curving upwards (think cups, rockets taking off, the parabola $y = x^2$) and concave
down graphs look like they’re curving downwards (think waterfalls, the path made by
throwing a ball in the air, $y = -x^2$).
• The derivative can help us find these objects! If x is a relative maximum or minimum of f,
then x is a critical point of f. Not all critical points are relative maxima or minima,
but all relative maxima and minima are critical points.
• We say that a point a is a point of inflection if $f''(x)$ changes from positive to negative,
or vice-versa, at x = a.
[Figure: a graph with its inflection point marked.]
We can use these properties to draw remarkably accurate graphs of functions! We look
at an example here:
Example. Draw the graph of f(x) = x(x − 9)(x − 24), labeling all critical points, relative
maxima and minima, inflection points, and identifying where the function is concave up
and where it is concave down.
Answer. Our process for drawing a graph is as follows:
• We start by finding all of the places where our function crosses the x-axis and y-axis:
in other words, we find all of the values of x for which f (x) = 0, and also what f (0)
is.
• Then, we find $f'(x)$, and find out where it is positive and negative. We use this to
identify all of the critical points of f and identify which are minima and which are
maxima; we also use this to determine where f is increasing and where f is decreasing.
• We finish by finding $f''(x)$, and determine where this is positive and where this is
negative; we use this to find the inflection points of f, and determine where f is
concave up and concave down.
The first of these tasks is pretty straightforward. We know that f(x) = x(x − 9)(x − 24);
so we’ve already factored f into linear factors, and can see that we have f(x) = 0 whenever x
is 0, 9 or 24. Similarly we know that f(0) = 0(0 − 9)(0 − 24) = 0, so we know where our
function crosses the y-axis.
Now, to get some more information we look at $f'(x)$. Because
\begin{align*}
f(x) &= x(x - 9)(x - 24) = x^3 - 33x^2 + 216x \\
\Rightarrow f'(x) &= 3x^2 - 66x + 216 = 3(x^2 - 22x + 72) = 3(x - 4)(x - 18),
\end{align*}
we can see that x = 4, 18 are the two critical points of our function f. Moreover, because

                 3(x − 4)    (x − 18)    3(x − 4)(x − 18)
x ∈ (−∞, 4)        (−)         (−)       (−) · (−) = (+)
x ∈ (4, 18)        (+)         (−)       (+) · (−) = (−)
x ∈ (18, ∞)        (+)         (+)       (+) · (+) = (+)

we can see that our function is increasing on (−∞, 4), decreasing on (4, 18) and then
increasing again on (18, ∞). Finally, because at 4 we switch from increasing to decreasing we
know that our function has a relative maximum there, and at 18 because we switch from
decreasing to increasing we have a relative minimum.
This gives us some more information about our function! In particular, we can plot the
points
f (4) = 4(4 − 9)(4 − 24) = 400, f (18) = 18(18 − 9)(18 − 24) = −972
and get that our function looks like something that goes through the following points,
increasing until x = 4, decreasing until x = 18, and then increasing again:
[Figure: two plots of y = f(x), drawn with x running from −5 to 30 and y running from −1000 to 400.]
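The three-step graphing process can also be sketched in Python as a way to check your work. Everything named here is an assumption of this sketch: the helper `classify`, the choice to look only at integer x between −5 and 30 (which happens to be enough for this particular f), and the second derivative $f''(x) = 6x - 66$, which is obtained here by differentiating $f'(x) = 3x^2 - 66x + 216$ once more.

```python
def f(x):
    return x * (x - 9) * (x - 24)

def f_prime(x):
    return 3 * x ** 2 - 66 * x + 216

def f_double_prime(x):
    # Derivative of f_prime, worked out for this sketch.
    return 6 * x - 66

def classify(c):
    """Classify a critical point c by the sign of f' one unit to each side."""
    left, right = f_prime(c - 1), f_prime(c + 1)
    if left > 0 and right < 0:
        return "relative maximum"
    if left < 0 and right > 0:
        return "relative minimum"
    return "neither"

# Only integer candidates are scanned; that suffices here because the
# roots of f' and f'' for this f happen to be integers.
candidates = range(-5, 31)
critical_points = [x for x in candidates if f_prime(x) == 0]
classification = {c: classify(c) for c in critical_points}

# Inflection points: f'' is zero and changes sign across the point.
inflection_points = [
    x for x in candidates
    if f_double_prime(x) == 0
    and f_double_prime(x - 1) * f_double_prime(x + 1) < 0
]
```

This reproduces the critical points 4 and 18 with f(4) = 400 and f(18) = −972, and it finds a single sign change of $f''$ at x = 11, so under this sketch's computation f is concave down to the left of 11 and concave up to the right.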
• $\int e^x \, dx = e^x + C$
• $\int x^n \, dx = \frac{x^{n+1}}{n+1} + C$, $n \neq -1$
• $\int \ln(x) \, dx = x\ln(x) - x + C$
• $\int \cos(x) \, dx = \sin(x) + C$
• $\int \frac{1}{x} \, dx = \ln(|x|) + C$
• $\int \sin(x) \, dx = -\cos(x) + C$
We also have some techniques for integrating more complicated functions! One technique
is integration by substitution, which you can think of as “reverse chain rule.” It’s the
following formula:
\[ \int f'(g(x)) \cdot g'(x) \, dx = \int \frac{d}{dx}\left(f(g(x))\right) dx = f(g(x)) + C, \quad C \in \mathbb{R}. \]
Basically, this technique says that if we can recognize the function we’re integrating as
having the form $f'(g(x)) \cdot g'(x)$ for some f, g, then we’re automatically done! We just get
that the integral is f(g(x)) + C, and that’s quite nice.
Sometimes people use “u-substitution” notation, where to evaluate the integral
\[ \int f'(g(x)) \cdot g'(x) \, dx \]
you set u = g(x) and $du = g'(x)\,dx$, which turns the integral into $\int f'(u) \, du = f(u) + C = f(g(x)) + C$;
in other words it does the exact same thing as the notation above. Pick your favorite!
Not all functions can be written in the form $f'(g(x)) \cdot g'(x)$, though. For those other
kinds of functions, we have integration by parts (which we can think of as “reverse product
rule”). It’s the following formula:
\[ \int f'(x) \cdot g(x) \, dx = f(x)g(x) - \int f(x)g'(x) \, dx. \]
Example. Using the techniques of integration by parts and integration by substitution (i.e.
the reverse chain and product rules), find each of the following indefinite integrals:
1. $\int (x + 1)\sin(x + 1) \, dx$
2. $\int (x + 1)\sin((x + 1)^2) \, dx$
3. $\int \frac{\sin(x)}{\cos(x)} \, dx$
4. $\int (\ln(x))^2 \, dx$
Answer. 1. The first thing we need to do to evaluate $\int (x + 1)\sin(x + 1) \, dx$ is figure out
which technique we want to try. At first glance, reverse chain rule (i.e. integration by
substitution) looks good, in that we have some composition going on here; we’d be
tempted to make $f'(x) = \sin(x)$ and g(x) = x + 1. However, the thing on the outside
is not $g'(x) = 1$; it’s x + 1! So our integral actually doesn’t look like it’s of the form
$\int g'(x) f'(g(x)) \, dx$, and as a result we’re better off trying something else.
Let’s try parts, then! If we were to apply integration by parts to $\int (x + 1)\sin(x + 1) \, dx$,
we’d want to write this in the form $\int f'(x)g(x) \, dx$, where $f'(x)$ is something whose
integral we know and isn’t too bad, while g(x) is something that hopefully gets simpler
when we differentiate. This motivates us to choose g(x) = x + 1, because $g'(x) = 1$ is
indeed a lot simpler; this leaves us with $f'(x) = \sin(x + 1)$, which has the reasonable
integral $f(x) = -\cos(x + 1)$.
Integration by parts, then, tells us that
\begin{align*}
\int (x + 1)\sin(x + 1) \, dx = \int f'(x) \cdot g(x) \, dx &= f(x)g(x) - \int f(x)g'(x) \, dx \\
&= (-\cos(x + 1))(x + 1) - \int (-\cos(x + 1))(1) \, dx \\
&= -(x + 1)\cos(x + 1) + \int \cos(x + 1) \, dx \\
&= -(x + 1)\cos(x + 1) + \sin(x + 1) + C, \quad C \in \mathbb{R}.
\end{align*}
We can check this antiderivative by taking its derivative, using the product rule:
\begin{align*}
\frac{d}{dx}\left(-(x + 1)\cos(x + 1) + \sin(x + 1) + C\right) &= -\cos(x + 1) + (-(x + 1))(-\sin(x + 1)) + \cos(x + 1) \\
&= (x + 1)\sin(x + 1).
\end{align*}
2. The integral $\int (x + 1)\sin((x + 1)^2) \, dx$ looks like a much better integration by
substitution candidate! As before, we think of $\sin((x + 1)^2)$ as the “$f'(g(x))$” part,
with $f'(x) = \sin(x)$ and $g(x) = (x + 1)^2$; this now means that $f(x) = -\cos(x)$,
$g'(x) = 2(x + 1)$, and therefore that
\begin{align*}
\int (x + 1)\sin((x + 1)^2) \, dx &= \frac{1}{2}\int 2(x + 1)\sin((x + 1)^2) \, dx \\
&= \frac{1}{2}\int f'(g(x)) \cdot g'(x) \, dx \\
&= \frac{1}{2} f(g(x)) + C \\
&= -\frac{1}{2}\cos((x + 1)^2) + C, \quad C \in \mathbb{R}.
\end{align*}
As always, we check this antiderivative by taking a derivative, using the chain rule:
\begin{align*}
\frac{d}{dx}\left(-\frac{1}{2}\cos((x + 1)^2) + C\right) &= -\frac{1}{2}\left(-\sin((x + 1)^2)\right) \cdot \frac{d}{dx}(x + 1)^2 \\
&= -\frac{1}{2}\left(-\sin((x + 1)^2)\right)(2(x + 1)) \\
&= (x + 1)\sin((x + 1)^2).
\end{align*}
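3. If we look at $\int \frac{\sin(x)}{\cos(x)} \, dx$, it looks like substitution is not a bad guess: we certainly
have some composition going on with the $\frac{1}{\cos(x)}$ part, and if we indeed make $f'(x) = \frac{1}{x}$,
g(x) = cos(x) then $g'(x) = -\sin(x)$ does indeed give us the remaining parts of
the function we’re integrating, up to the sign! Therefore, because $f'(x) = \frac{1}{x}$ forces
$f(x) = \ln(|x|)$, we have
\begin{align*}
\int \frac{\sin(x)}{\cos(x)} \, dx &= -\int \frac{1}{\cos(x)} \cdot (-\sin(x)) \, dx \\
&= -\int f'(g(x)) \cdot g'(x) \, dx \\
&= -f(g(x)) + C \\
&= -\ln(|\cos(x)|) + C, \quad C \in \mathbb{R}.
\end{align*}
We can check that this is indeed the antiderivative of $\tan(x) = \frac{\sin(x)}{\cos(x)}$ by using the
chain rule:
\begin{align*}
\frac{d}{dx}\left(-\ln(|\cos(x)|) + C\right) &= -\frac{1}{|\cos(x)|} \cdot \frac{d}{dx}(|\cos(x)|) \\
&= \begin{cases} -\frac{1}{\cos(x)} \cdot \frac{d}{dx}(\cos(x)), & \text{if } \cos(x) > 0 \\ -\frac{1}{-\cos(x)} \cdot \frac{d}{dx}(-\cos(x)), & \text{if } \cos(x) < 0 \end{cases} \\
&= \begin{cases} -\frac{1}{\cos(x)} \cdot (-\sin(x)), & \text{if } \cos(x) > 0 \\ -\frac{1}{-\cos(x)} \cdot (-(-\sin(x))), & \text{if } \cos(x) < 0 \end{cases} \\
&= \begin{cases} \frac{\sin(x)}{\cos(x)}, & \text{if } \cos(x) > 0 \\ \frac{\sin(x)}{\cos(x)}, & \text{if } \cos(x) < 0 \end{cases} \\
&= \frac{\sin(x)}{\cos(x)}.
\end{align*}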
4. If we were to try integration by substitution on $\int (\ln(x))^2 \, dx$, we’d have to make
$f'(g(x)) = (\ln(x))^2$, and this doesn’t really leave anything left for the $g'(x)$ part! So,
let’s not try this, and try parts instead! In particular, this means we probably think
of this as $\int \ln(x)\ln(x) \, dx$.
This makes our choice for $f'(x)$ and g(x) pretty simple: we make $f'(x) = \ln(x)$
and g(x) = ln(x), because we don’t have any other choices really! This means that
$f(x) = x\ln(x) - x$, as we saw in class earlier, while $g'(x) = \frac{1}{x}$. As a result, we have
\begin{align*}
\int \ln(x)\ln(x) \, dx = \int f'(x) \cdot g(x) \, dx &= f(x)g(x) - \int f(x)g'(x) \, dx \\
&= (x\ln(x) - x)\ln(x) - \int (x\ln(x) - x)\frac{1}{x} \, dx \\
&= x(\ln(x))^2 - x\ln(x) - \int (\ln(x) - 1) \, dx \\
&= x(\ln(x))^2 - x\ln(x) - (x\ln(x) - x - x) + C \\
&= x(\ln(x))^2 - 2x\ln(x) + 2x + C, \quad C \in \mathbb{R}.
\end{align*}
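Since "take the derivative and see if you get the integrand back" is the standard check, here is a Python sketch that runs that check numerically for all four integrals at once. The helper `central_difference` and the sample point 0.7 are choices made for this illustration. For item 4 the sketch uses $x(\ln(x))^2 - 2x\ln(x) + 2x$, which is what the integration by parts works out to once the remaining integral of ln(x) − 1 is carried through.

```python
import math

def central_difference(F, x, h=1e-6):
    """Numerical estimate of F'(x); h is an arbitrary small step."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Each pair is (candidate antiderivative, integrand).
checks = [
    (lambda x: -(x + 1) * math.cos(x + 1) + math.sin(x + 1),
     lambda x: (x + 1) * math.sin(x + 1)),
    (lambda x: -0.5 * math.cos((x + 1) ** 2),
     lambda x: (x + 1) * math.sin((x + 1) ** 2)),
    (lambda x: -math.log(abs(math.cos(x))),
     lambda x: math.sin(x) / math.cos(x)),
    (lambda x: x * math.log(x) ** 2 - 2 * x * math.log(x) + 2 * x,
     lambda x: math.log(x) ** 2),
]

sample = 0.7  # a point where every function above is defined
max_error = max(
    abs(central_difference(F, sample) - integrand(sample))
    for F, integrand in checks
)
```

A tiny `max_error` means every candidate antiderivative differentiates back to its integrand at the sample point, up to rounding. It is not a proof, but it catches sign errors and dropped factors of 2 very reliably.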
is the signed area between the curve y = f (x) and the x-axis, where we think of area
above the x-axis as being positive and area below the x-axis as being negative. So, for
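instance, the definite integral
\[ \int_0^{2\pi} \sin(x) \, dx = 0, \]
because the area above the x-axis from 0 to π is “canceled out” by the area below the x-axis
from π to 2π:
[Figure: the graph of y = sin(x) from 0 to 2π, with the region above the x-axis marked + and the region below it marked −.]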
which verifies algebraically the fact we geometrically saw a moment ago.
One useful application of the definite integral is to finding the unsigned area between
a curve and the x-axis: i.e. the area where we count area below and above the x-axis as
positive, and don’t have any of this canceling-out stuff! To find this, you just want to
integrate |f (x)|, as the absolute-value signs transform all of the parts of our curve where
they were below the x-axis into parts that are above the x-axis! In other words, if we let
A denote the unsigned area between the x-axis and the curve y = f (x) from a to b, then
$A = \int_a^b |f(x)| \, dx$.
To integrate something like |f(x)|, it helps to break up the region you’re integrating
f(x) over into places where f(x) ≥ 0 and where f(x) ≤ 0, so that you can replace
|f(x)| with f(x) on the first kind of region and with −f(x) on the second.
We illustrate this idea with an example:
Example. Find $\int_{-1}^{1} f(x) \, dx$ for $f(x) = \frac{x}{x^2 + 3}$. Then, find the area between the x-axis
and the curve y = f(x) from x = −1 to x = 1.
Answer. We start by finding the indefinite integral $\int \frac{x}{x^2 + 3} \, dx$. This looks like an
integration by substitution problem, as it has some composition going on; indeed, if we let
$f'(x) = \frac{1}{x}$, $g(x) = x^2 + 3$, $g'(x) = 2x$ we have $f(x) = \ln(|x|)$ and therefore that
\begin{align*}
\int \frac{x}{x^2 + 3} \, dx = \frac{1}{2}\int \frac{2x}{x^2 + 3} \, dx &= \frac{1}{2}\int g'(x) f'(g(x)) \, dx \\
&= \frac{1}{2} f(g(x)) + C \\
&= \frac{1}{2}\ln(|x^2 + 3|) + C.
\end{align*}
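Therefore, by the fundamental theorem of calculus, we have
\begin{align*}
\int_{-1}^{1} \frac{x}{x^2 + 3} \, dx &= \left.\left(\frac{1}{2}\ln(|x^2 + 3|) + C\right)\right|_{x=1} - \left.\left(\frac{1}{2}\ln(|x^2 + 3|) + C\right)\right|_{x=-1} \\
&= \left(\frac{\ln(4)}{2} + C\right) - \left(\frac{\ln(4)}{2} + C\right) \\
&= 0.
\end{align*}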
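\begin{align*}
\text{Area} &= \int_{-1}^{1} \left|\frac{x}{x^2 + 3}\right| dx = \int_{0}^{1} \left|\frac{x}{x^2 + 3}\right| dx + \int_{-1}^{0} \left|\frac{x}{x^2 + 3}\right| dx \\
&= \int_{0}^{1} \frac{x}{x^2 + 3} \, dx + \int_{-1}^{0} -\frac{x}{x^2 + 3} \, dx \\
&= \left( \left.\left(\frac{1}{2}\ln(|x^2 + 3|) + C\right)\right|_{x=1} - \left.\left(\frac{1}{2}\ln(|x^2 + 3|) + C\right)\right|_{x=0} \right) \\
&\quad - \left( \left.\left(\frac{1}{2}\ln(|x^2 + 3|) + C\right)\right|_{x=0} - \left.\left(\frac{1}{2}\ln(|x^2 + 3|) + C\right)\right|_{x=-1} \right) \\
&= \left(\frac{\ln(4)}{2} + C\right) - \left(\frac{\ln(3)}{2} + C\right) - \left(\left(\frac{\ln(3)}{2} + C\right) - \left(\frac{\ln(4)}{2} + C\right)\right) \\
&= \ln(4) - \ln(3).
\end{align*}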
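Both answers can be double-checked with a quick numerical integration. The Python sketch below uses a simple midpoint rule; the helper name `midpoint_integral` and the number of slices are choices made for this illustration, not anything prescribed by the course.

```python
import math

def midpoint_integral(g, a, b, n=10_000):
    """Approximate the integral of g from a to b with n midpoint rectangles."""
    width = (b - a) / n
    return width * sum(g(a + (i + 0.5) * width) for i in range(n))

def f(x):
    return x / (x ** 2 + 3)

# Signed area: the pieces above and below the x-axis cancel.
signed = midpoint_integral(f, -1.0, 1.0)

# Unsigned area: integrate |f(x)| instead.
unsigned = midpoint_integral(lambda x: abs(f(x)), -1.0, 1.0)

exact_area = math.log(4) - math.log(3)
```

Here `signed` comes out essentially zero while `unsigned` agrees with ln(4) − ln(3) ≈ 0.2877 to several decimal places, matching the two hand computations.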
Figure 3: Left to right: the hemisphere $z = \sqrt{1 - x^2 - y^2}$, paraboloid $z = x^2 + y^2$, and
monkey saddle $z = x^3 - 3xy^2$.
Sometimes, we will want to visualize a surface even when we don’t have access to
computer programs! To do this, we use level curves, which are defined as follows: given a
function f (x, y), a level curve at height h is the set of all points (x, y) such that f (x, y) = h.
We think of this as what happens when we “slice” through the graph z = f (x, y) at height
z = h. If we take enough of these cross-sections, I claim that we get a nice visual image of
what our surface will look like!
For example, let $f(x, y) = x^2 - y^2$. I’ve drawn the level curves $h = x^2 - y^2$ of this
function below, for values of h ranging from 4 to −4:
[Figure: the level curves of $f(x, y) = x^2 - y^2$ for heights h from 4 to −4, drawn on axes running from −8 to 8.]
With some imagination, you can think about what it would look like if these level curves
were drawn in 3D space, each one at its corresponding height h, like this:
Indeed, if you fill in the gaps you can see the surface we’ve drawn here (a hyperbolic
paraboloid!)
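If you would like to generate points on these level curves yourself, here is a Python sketch. It relies on solving $x^2 - y^2 = h$ for x, giving the two branches $x = \pm\sqrt{h + y^2}$; the helper name `level_curve_points` and the grid of y-values are assumptions made for this illustration.

```python
import math

def f(x, y):
    return x ** 2 - y ** 2

def level_curve_points(h, ys):
    """Points (x, y) with f(x, y) = h, using x = +/- sqrt(h + y^2)."""
    points = []
    for y in ys:
        radicand = h + y ** 2
        if radicand >= 0:          # otherwise no real x exists at this y
            x = math.sqrt(radicand)
            points.append((x, y))
            if x != 0:
                points.append((-x, y))
    return points

ys = [k / 2 for k in range(-16, 17)]            # y from -8 to 8 in steps of 0.5
curves = {h: level_curve_points(h, ys) for h in range(-4, 5)}

# Largest amount by which any generated point misses its intended height.
worst_miss = max(
    abs(f(x, y) - h) for h, points in curves.items() for x, y in points
)
```

Feeding each list in `curves` to any plotting tool, one height at a time, gives hyperbolas opening left and right for h > 0, hyperbolas opening up and down for h < 0, and the pair of lines y = ±x for h = 0.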
14.2 Partial Derivatives
Given a function f(x, y), we define the partial derivative with respect to x of f(x, y),
denoted $\frac{d}{dx} f(x, y)$, as the following: take f(x, y), think of x as a variable and y as a constant,
and take the derivative as normal with respect to x. So, for example,
\begin{align*}
\frac{d}{dx}(x + y) &= 1 + 0 = 1, \\
\frac{d}{dx}\sin(xy) &= \cos(xy) \cdot \frac{d}{dx}(xy) = \cos(xy) \cdot y, \\
\frac{d}{dx} y^2 &= 0.
\end{align*}
Similarly, we define the partial derivative with respect to y of f(x, y), denoted $\frac{d}{dy} f(x, y)$,
as the following: take f(x, y), think of y as the variable and x as a constant, and take the
derivative with respect to y! So, for example,
\begin{align*}
\frac{d}{dy}(x + y) &= 0 + 1 = 1, \\
\frac{d}{dy}\sin(xy) &= \cos(xy) \cdot \frac{d}{dy}(xy) = \cos(xy) \cdot x, \\
\frac{d}{dy} y^2 &= 2y.
\end{align*}
Earlier, we used the derivative to make a tangent line to the graph y = f(x). We can
use the partial derivatives here to make a tangent plane to the graph z = f(x, y) at the
point (a, b, c) in a very similar way, using the equation below:
\[ z - c = \left.\frac{d}{dx}(f(x, y))\right|_{(x,y,z)=(a,b,c)} \cdot (x - a) + \left.\frac{d}{dy}(f(x, y))\right|_{(x,y,z)=(a,b,c)} \cdot (y - b) \]
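Example. Find the tangent plane to the graph of $f(x, y) = x^2 + y^2$ at the point (1, 1, 2).
Answer. We calculate:
\[ \frac{d}{dx}(x^2 + y^2) = 2x, \qquad \frac{d}{dy}(x^2 + y^2) = 2y, \]
so at the point (1, 1, 2) both partial derivatives equal 2, and the tangent plane is
\[ z - 2 = 2(x - 1) + 2(y - 1). \]
Success!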
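The tangent plane recipe translates directly into code. In the Python sketch below the partial derivatives are estimated numerically, by nudging one variable while holding the other constant; the helper names `partial_x`, `partial_y` and `tangent_plane`, and the step size, are all assumptions made for this illustration.

```python
def partial_x(f, a, b, h=1e-6):
    """Estimate the partial derivative in x at (a, b): vary x, hold y fixed."""
    return (f(a + h, b) - f(a - h, b)) / (2 * h)

def partial_y(f, a, b, h=1e-6):
    """Estimate the partial derivative in y at (a, b): vary y, hold x fixed."""
    return (f(a, b + h) - f(a, b - h)) / (2 * h)

def tangent_plane(f, a, b):
    """Return the function (x, y) -> z describing the tangent plane at (a, b, f(a, b))."""
    c = f(a, b)
    slope_x = partial_x(f, a, b)
    slope_y = partial_y(f, a, b)

    def plane(x, y):
        return c + slope_x * (x - a) + slope_y * (y - b)

    return plane

def f(x, y):
    return x ** 2 + y ** 2

plane = tangent_plane(f, 1.0, 1.0)
height_at_point = plane(1.0, 1.0)   # the plane touches the surface here
height_elsewhere = plane(2.0, 3.0)  # a point on the plane away from the surface
```

For $f(x, y) = x^2 + y^2$ at (1, 1, 2) both estimated slopes come out very close to 2, so `plane(2.0, 3.0)` is approximately 2 + 2·1 + 2·2 = 8, while the surface itself is at height 13 there: the plane only hugs the surface near the point of tangency.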