Engineering Mathematics Final Ok
OBJECTIVES
PART : ONE
CHAPTER 1: ELEMENTS OF MATHEMATICS
1.1.Set theory
Sets
Writing A = {1, 2, 3, 4} means that the elements of the set A are the
numbers 1, 2, 3 and 4. Sets of elements of A, for example {1, 2}, are
subsets of A.
Sets can themselves be elements. For example, consider the set
B = {1, 2, {3, 4}}. The elements of B are not 1, 2, 3, and 4. Rather, there
are only three elements of B, namely the numbers 1 and 2, and the set
{3, 4}.
Another possible notation for the same relation is A ∋ x, meaning "A contains x".
The symbol ϵ was first used by Giuseppe Peano in 1889 in his work
Arithmetices principia nova methodo exposita. Here he wrote on page
X:
which means
The symbol itself is a stylized lowercase Greek letter epsilon ("ε"), the
first letter of the word ἐστί, which means "is".
The Unicode characters for these symbols are U+2208 ('element of'),
U+220B ('contains as member') and U+2209 ('not an element of'). The
equivalent LaTeX commands are "\in", "\ni" and "\notin". Mathematica
has commands "\[Element]" and "\[NotElement]".
Cardinality of sets
Examples
• 2∈A
• {3,4} ∈ B
• {3,4} is a member of B
• Yellow ∉ C
• The cardinality of D = { 2, 4, 8, 10, 12 } is finite and equal to 5.
• The cardinality of P = { 2, 3, 5, 7, 11, 13, ...} (the prime numbers)
is infinite (this was proven by Euclid).
Set theory
A Venn diagram illustrating the intersection of two sets.
Set theory is the branch of mathematical logic that studies sets, which
informally are collections of objects. Although any type of object can be
collected into a set, set theory is applied most often to objects that are
relevant to mathematics. The language of set theory can be used in the
definitions of nearly all mathematical objects.
The modern study of set theory was initiated by Georg Cantor and
Richard Dedekind in the 1870s. After the discovery of paradoxes in
naive set theory, numerous axiom systems were proposed in the early
twentieth century, of which the Zermelo–Fraenkel axioms, with the
axiom of choice, are the best-known.
History
Georg Cantor.
Since the 5th century BC, beginning with Greek mathematician Zeno of
Elea in the West and early Indian mathematicians in the East,
mathematicians had struggled with the concept of infinity. Especially
notable is the work of Bernard Bolzano in the first half of the 19th
century.[3] Modern understanding of infinity began in 1867–71, with
Cantor's work on number theory. An 1872 meeting between Cantor and
Richard Dedekind influenced Cantor's thinking and culminated in
Cantor's 1874 paper.
The next wave of excitement in set theory came around 1900, when it
was discovered that Cantorian set theory gave rise to several
contradictions, called antinomies or paradoxes. Bertrand Russell and
Ernst Zermelo independently found the simplest and best known
paradox, now called Russell's paradox: consider "the set of all sets that
are not members of themselves", which leads to a contradiction since it
must be a member of itself, and not a member of itself. In 1899 Cantor
had himself posed the question "What is the cardinal number of the set
of all sets?", and obtained a related paradox. Russell used his paradox as
a theme in his 1903 review of continental mathematics in his The
Principles of Mathematics.
The momentum of set theory was such that debate on the paradoxes did
not lead to its abandonment. The work of Zermelo in 1908 and Abraham
Fraenkel in 1922 resulted in the set of axioms ZFC, which became the
most commonly used set of axioms for set theory. The work of analysts
such as Henri Lebesgue demonstrated the great mathematical utility of
set theory, which has since become woven into the fabric of modern
mathematics. Set theory is commonly used as a foundational system,
although in some areas category theory is thought to be a preferred
foundation.
Set theory begins with a fundamental binary relation between an object o
and a set A. If o is a member (or element) of A, write o ∈ A. Since
sets are objects, the membership relation can relate sets as well.
A derived binary relation between two sets is the subset relation, also
called set inclusion. If all the members of set A are also members of set
B, then A is a subset of B, denoted A ⊆ B. For example, {1, 2} is a
subset of {1, 2, 3} , and so is {2} but {1, 4} is not. From this definition,
it is clear that a set is a subset of itself; for cases where one wishes to
rule this out, the term proper subset is defined. A is called a proper
subset of B if and only if A is a subset of B, but B is not a subset of A.
Note also that 1 and 2 and 3 are members (elements) of set {1, 2, 3} , but
are not subsets, and the subsets in turn are not as such members of the
set.
• Set difference of U and A, denoted U \ A, is the set of all members
of U that are not members of A. The set difference {1, 2, 3} \ {2, 3,
4} is {1} , while, conversely, the set difference {2, 3, 4} \ {1, 2, 3}
is {4} . When A is a subset of U, the set difference U \ A is also
called the complement of A in U. In this case, if the choice of U is
clear from the context, the notation A^c is sometimes used instead
of U \ A, particularly if U is a universal set as in the study of Venn
diagrams.
• Symmetric difference of sets A and B, denoted A △ B or A ⊖ B,
is the set of all objects that are a member of exactly one of A and B
(elements which are in one of the sets, but not in both). For
instance, for the sets {1, 2, 3} and {2, 3, 4} , the symmetric
difference set is {1, 4} . It is the set difference of the union and the
intersection, (A ∪ B) \ (A ∩ B) or (A \ B) ∪ (B \ A).
• Cartesian product of A and B, denoted A × B, is the set whose
members are all possible ordered pairs (a, b) where a is a member
of A and b is a member of B. The cartesian product of {1, 2} and
{red, white} is {(1, red), (1, white), (2, red), (2, white)}.
• Power set of a set A is the set whose members are all possible
subsets of A. For example, the power set of {1, 2} is { {}, {1},
{2}, {1, 2} } .
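The operations listed above can be tried out directly with Python's built-in set type. The following is a minimal sketch; the helper name power_set and the variable names are illustrative choices, not part of the original text.

```python
from itertools import chain, combinations, product


def power_set(s):
    """Return the set of all subsets of s, each represented as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}


A = {1, 2, 3}
B = {2, 3, 4}

difference_ab = A - B      # members of A that are not in B
difference_ba = B - A      # members of B that are not in A
symmetric = A ^ B          # members of exactly one of A and B

# Cartesian product as a set of ordered pairs (tuples).
cartesian = set(product({1, 2}, {"red", "white"}))

subsets_of_12 = power_set({1, 2})
```

Subsets are stored as frozensets because ordinary Python sets are mutable and cannot themselves be members of a set.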
Some basic sets of central importance are the empty set (the unique set
containing no elements), the set of natural numbers, and the set of real
numbers.
Some ontology
A set is pure if all of its members are sets, all members of its members
are sets, and so on. For example, the set {{}} containing only the empty
set is a nonempty pure set. In modern set theory, it is common to restrict
attention to the von Neumann universe of pure sets, and many systems
of axiomatic set theory are designed to axiomatize the pure sets only.
There are many technical advantages to this restriction, and little
generality is lost, because essentially all mathematical concepts can be
modeled by pure sets. Sets in the von Neumann universe are organized
into a cumulative hierarchy, based on how deeply their members,
members of members, etc. are nested. Each set in this hierarchy is
assigned (by transfinite recursion) an ordinal number α, known as its
rank. The rank of a pure set X is defined to be the least upper bound of
all successors of ranks of members of X. For example, the empty set is
assigned rank 0, while the set {{}} containing only the empty set is
assigned rank 1. For each ordinal α, the set V_α is defined to consist of all
pure sets with rank less than α. The entire von Neumann universe is
denoted V.
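For finite pure sets the rank can be computed by plain recursion, since the least upper bound of finitely many ordinals is just their maximum. The sketch below assumes pure sets are encoded as nested frozensets; the function name rank and the sample sets are illustrative only.

```python
def rank(s):
    """Rank of a hereditarily finite pure set encoded as nested frozensets.

    The rank is the maximum of (rank of member + 1) over all members,
    and 0 for the empty set.
    """
    return max((rank(member) + 1 for member in s), default=0)


empty = frozenset()                # {}
one = frozenset({empty})           # {{}}
two = frozenset({empty, one})      # {{}, {{}}}
nested = frozenset({one})          # {{{}}}
```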
Axiomatic set theory
The most widely studied systems of axiomatic set theory imply that all
sets form a cumulative hierarchy. Such systems come in two flavors,
those whose ontology consists of:
• Sets alone. This includes the most common axiomatic set theory,
Zermelo–Fraenkel set theory (ZFC), which includes the axiom
of choice. Fragments of ZFC include:
o Zermelo set theory, which replaces the axiom schema of
replacement with that of separation;
o General set theory, a small fragment of Zermelo set theory
sufficient for the Peano axioms and finite sets;
o Kripke–Platek set theory, which omits the axioms of infinity,
powerset, and choice, and weakens the axiom schemata of
separation and replacement.
• Sets and proper classes. These include Von Neumann–Bernays–
Gödel set theory, which has the same strength as ZFC for theorems
about sets alone, and Morse-Kelley set theory and Tarski–
Grothendieck set theory, both of which are stronger than ZFC.
Systems of constructive set theory, such as CST, CZF, and IZF, embed
their set axioms in intuitionistic instead of classical logic. Yet other
systems accept classical logic but feature a nonstandard membership
relation. These include rough set theory and fuzzy set theory, in which
the value of an atomic formula embodying the membership relation is
not simply True or False. The Boolean-valued models of ZFC are a
related subject.
Applications
often much longer than the natural language proofs mathematicians
commonly present. One verification project, Metamath, includes human-
written, computer-verified derivations of more than 12,000 theorems
starting from ZFC set theory, first-order logic and propositional logic.
Areas of study
Descriptive set theory is the study of subsets of the real line and, more
generally, subsets of Polish spaces. It begins with the study of
pointclasses in the Borel hierarchy and extends to the study of more
complex hierarchies such as the projective hierarchy and the Wadge
hierarchy. Many properties of Borel sets can be established in ZFC, but
proving these properties hold for more complicated sets requires
additional axioms related to determinacy and large cardinals.
The field of effective descriptive set theory is between set theory and
recursion theory. It includes the study of lightface pointclasses, and is
closely related to hyperarithmetical theory. In many cases, results of
classical descriptive set theory have effective versions; in some cases,
new results are obtained by proving the effective version first and then
extending ("relativizing") it to make it more broadly applicable.
Large cardinals
Determinacy
Forcing
Paul Cohen invented the method of forcing while searching for a model
of ZFC in which the continuum hypothesis fails, or a model of ZF in
which the axiom of choice fails. Forcing adjoins to some given model of
set theory additional sets in order to create a larger model with
properties determined (i.e. "forced") by the construction and the original
model. For example, Cohen's construction adjoins additional subsets of
the natural numbers without changing any of the cardinal numbers of the
original model. Forcing is also one of two methods for proving relative
consistency by finitistic methods, the other method being Boolean-
valued models.
Cardinal invariants
Set-theoretic topology
From set theory's inception, some mathematicians have objected to it as
a foundation for mathematics. The most common objection to set theory,
one Kronecker voiced in set theory's earliest years, starts from the
constructivist view that mathematics is loosely related to computation. If
this view is granted, then the treatment of infinite sets, both in naive and
in axiomatic set theory, introduces into mathematics methods and
objects that are not computable even in principle. The feasibility of
constructivism as a substitute foundation for mathematics was greatly
increased by Errett Bishop's influential book Foundations of
Constructive Analysis.
numbers".[9] Wittgenstein's views about the foundations of mathematics
were later criticised by Georg Kreisel and Paul Bernays, and
investigated by Crispin Wright, among others.
Set Theory/Relations
Ordered pairs
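To define relations on sets we must have a concept of an ordered pair, as
opposed to the unordered pairs the axiom of pair gives. To have a
rigorous definition of ordered pair, we aim to satisfy one important
property, namely, for sets a, b, c and d, (a, b) = (c, d) if and only if
a = c and b = d.
As it stands, there are many ways to define an ordered pair to satisfy this
property. A simple definition, then, is (a, b) = {{a}, {a, b}}. (This is true
simply by definition. It is a convention that we can usefully build upon,
and has no deeper significance.)
Theorem
(a, b) = (c, d) if and only if a = c and b = d.
Proof
If a = c and b = d, then (a, b) = {{a}, {a, b}} = {{c}, {c, d}} = (c, d).
Now, if (a, b) = (c, d), then {{a}, {a, b}} = {{c}, {c, d}}. Then
∩{{a}, {a, b}} = ∩{{c}, {c, d}}, so {a} = {c} and a = c.
So we have (a, b) = (a, d). Thus ∪(a, b) = ∪(a, d), meaning
{a, b} = {a, d}.
If a = b, note {a, d} = {a}, so d = a = b. Otherwise b ≠ a, and since b
belongs to {a, d}, b = d.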
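The characteristic property of ordered pairs can be checked mechanically on small examples. This is a minimal sketch assuming the set-of-sets definition (a, b) = {{a}, {a, b}}, with frozensets standing in for sets; the name kpair is a hypothetical helper.

```python
from itertools import product


def kpair(a, b):
    """Encode the ordered pair (a, b) as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


# Exhaustively compare pair equality with componentwise equality
# for all a, b, c, d drawn from a small sample {0, 1, 2}.
property_holds = all(
    (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
    for a, b, c, d in product(range(3), repeat=4)
)
```

An unordered pair such as frozenset({1, 2}) equals frozenset({2, 1}), whereas kpair(1, 2) and kpair(2, 1) differ, which is the whole point of the encoding.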
Relations
Using the definition of ordered pairs, we now introduce the notion of a
binary relation.
• The preimage of a set B under a relation R is the image of B over
R⁻¹, or {x : xRy for some y in B}.
• R is antisymmetric if xRy and yRx together imply that x = y for all
x and y in X.
• R is transitive if xRy and yRz together imply that xRz holds for all
x, y, and z in X.
• R is total if xRy, yRx, or both hold for all x and y in X.
Binary relation
Binary relations are used in many branches of mathematics to model
concepts like "is greater than", "is equal to", and "divides" in arithmetic,
"is congruent to" in geometry, "is adjacent to" in graph theory, "is
orthogonal to" in linear algebra and many more. The concept of function
is defined as a special kind of binary relation. Binary relations are also
heavily used in computer science.
Formal definition
authors define it as an ordered triple (X, Y, G), which is otherwise
referred to as a correspondence.
The domain of R is the set of all x such that xRy for at least one y. The
range of R is the set of all y such that xRy for at least one x. The field of
R is the union of its domain and its range.
Either approach is adequate for most uses, provided that one attends to
the necessary changes in language, notation, and the definitions of
concepts like restrictions, composition, inverse relation, and so on. The
choice between the two definitions usually matters only in very formal
contexts, like category theory.
Example
2nd example relation
         ball   car   doll   gun
John      +      −     −      −
Mary      −      −     +      −
Venus     −      +     −      −
1st example relation
         ball   car   doll   gun
John      +      −     −      −
Mary      −      −     +      −
Ian       −      −     −      −
Venus     −      +     −      −
Example: Suppose there are four objects {ball, car, doll, gun} and four
persons {John, Mary, Ian, Venus}. Suppose that John owns the ball,
Mary owns the doll, and Venus owns the car. Nobody owns the gun and
Ian owns nothing. Then the binary relation "is owned by" is given as
R = ({ball, car, doll, gun}, {John, Mary, Ian, Venus}, {(ball, John),
(doll, Mary), (car, Venus)}).
Thus the first element of R is the set of objects, the second is the set of
persons, and the last element is a set of ordered pairs of the form (object,
owner).
The pair (ball, John), denoted by ball R John, means that the ball is owned by
John.
Two different relations could have the same graph. For example: the
relation
({ball, car, doll, gun}, {John, Mary, Venus}, {(ball, John), (doll,
Mary), (car, Venus)})
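The ownership example can be written down as data to see why the triple matters. In this sketch a relation is a tuple (X, Y, G) of frozensets; the names R1, R2, domain, range_ and field are illustrative, and the domain and range follow the definitions given earlier (elements that actually occur in some pair).

```python
objects = frozenset({"ball", "car", "doll", "gun"})
graph = frozenset({("ball", "John"), ("doll", "Mary"), ("car", "Venus")})

# Same graph, different second set: Ian is present in R1 but not in R2.
R1 = (objects, frozenset({"John", "Mary", "Ian", "Venus"}), graph)
R2 = (objects, frozenset({"John", "Mary", "Venus"}), graph)


def domain(rel):
    """All x such that x R y for at least one y."""
    return {x for (x, _) in rel[2]}


def range_(rel):
    """All y such that x R y for at least one x."""
    return {y for (_, y) in rel[2]}


def field(rel):
    """Union of the domain and the range."""
    return domain(rel) | range_(rel)
```

Comparing R1 and R2 as triples distinguishes them, while comparing only their graphs does not.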
Uniqueness properties:
• injective (also called left-unique[7]): for all x and z in X and y in Y
it holds that if xRy and zRy then x = z. For example, the green
relation in the diagram is injective, but the red relation is not, as it
relates e.g. both x = −5 and z = +5 to y = 25.
• functional (also called univalent[8] or right-unique[7] or right-
definite[9]): for all x in X, and y and z in Y it holds that if xRy and
xRz then y = z; such a binary relation is called a partial function.
Both relations in the picture are functional. An example of a non-
functional relation can be obtained by rotating the red graph
clockwise by 90 degrees, i.e. by considering the relation x = y²,
which relates e.g. x = 25 to both y = −5 and z = +5.
• one-to-one (also written 1-to-1): injective and functional. The
green relation is one-to-one, but the red is not.
• surjective (also called right-total[7] or onto): for all y in Y there
exists an x in X such that xRy. The green relation is surjective, but
the red relation is not, as it doesn't relate any real number x to e.g.
y = −14.
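For finite relations these properties can be tested directly. The sketch below represents a relation by its set of pairs and, as an assumption, samples the squaring relation y = x² only on the integers from −5 to 5; the function names are illustrative.

```python
def is_injective(pairs):
    """No two different x are related to the same y."""
    first_seen = {}
    for x, y in pairs:
        if first_seen.setdefault(y, x) != x:
            return False
    return True


def is_functional(pairs):
    """No x is related to two different y."""
    first_seen = {}
    for x, y in pairs:
        if first_seen.setdefault(x, y) != y:
            return False
    return True


def is_surjective(pairs, Y):
    """Every element of Y is related to at least one x."""
    return {y for _, y in pairs} == set(Y)


X = range(-5, 6)
square = {(x, x * x) for x in X}            # the relation y = x^2
rotated = {(y, x) for (x, y) in square}     # the relation x = y^2
```

On this sample, square is functional but not injective (both −5 and 5 map to 25), and swapping the components exchanges the two properties.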
Difunctional
In automata theory, the term rectangular relation has also been used to
denote a difunctional relation. This terminology is justified by the fact
that when represented as a boolean matrix, the columns and rows of a
difunctional relation can be arranged in such a way as to present
rectangular blocks of true on the (asymmetric) main diagonal.[15] Other
authors however use the term "rectangular" to denote any heterogeneous
relation whatsoever.[6]
The set of all binary relations Rel(X) on a set X is the set 2^(X × X), which is
a Boolean algebra augmented with the involution of mapping of a
relation to its inverse relation. For the theoretical explanation see
Relation algebra.
The previous 3 alternatives are far from being exhaustive; e.g. the
red relation y = x² from the above picture is neither irreflexive, nor
coreflexive, nor reflexive, since it contains the pair (0,0), and (2,4),
but not (2,2), respectively.
• symmetric: for all x and y in X it holds that if xRy then yRx. "Is a
blood relative of" is a symmetric relation, because x is a blood
relative of y if and only if y is a blood relative of x.
• antisymmetric: for all x and y in X, if xRy and yRx then x = y.
For example, ≥ is anti-symmetric (so is >, but only because the
condition in the definition is always false).[18]
• asymmetric: for all x and y in X, if xRy then not yRx. A relation
is asymmetric if and only if it is both anti-symmetric and
irreflexive.[19] For example, > is asymmetric, but ≥ is not.
• transitive: for all x, y and z in X it holds that if xRy and yRz then
xRz. For example, "is ancestor of" is transitive, while "is parent of"
is not. A transitive relation is irreflexive if and only if it is
asymmetric.[20]
• total: for all x and y in X it holds that xRy or yRx (or both). This
definition for total is different from left total in the previous
section. For example, ≥ is a total relation.
• trichotomous: for all x and y in X exactly one of xRy, yRx or x =
y holds. For example, > is a trichotomous relation, while the
relation "divides" on natural numbers is not.[21]
• Right Euclidean: for all x, y and z in X it holds that if xRy and
xRz, then yRz.
• Left Euclidean: for all x, y and z in X it holds that if yRx and zRx,
then yRz.
• Euclidean: A Euclidean relation is both left and right Euclidean.
Equality is a Euclidean relation because if x=y and x=z, then y=z.
• serial: for all x in X, there exists y in X such that xRy. "Is greater
than" is a serial relation on the integers. But it is not a serial
relation on the positive integers, because there is no y in the
positive integers such that 1>y.[22] However, "is less than" is a
serial relation on the positive integers, the rational numbers and the
real numbers. Every reflexive relation is serial: for a given x,
choose y=x. A serial relation can be equivalently characterized as
every element having a non-empty successor neighborhood (see
the previous section for the definition of this notion). Similarly an
inverse serial relation is a relation in which every element has
non-empty predecessor neighborhood.[12]
• set-like (or local): for every x in X, the class of all y such that yRx
is a set. (This makes sense only if relations on proper classes are
allowed.) The usual ordering < on the class of ordinal numbers is
set-like, while its inverse > is not.
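Several of the properties above can be checked by brute force when X is finite. The following sketch assumes a relation is given as a set of pairs over X = {1, ..., 6}; the predicate names are illustrative and simply transcribe the definitions.

```python
from itertools import product


def is_reflexive(R, X):
    return all((x, x) in R for x in X)


def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)


def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)


def is_asymmetric(R):
    return all((y, x) not in R for (x, y) in R)


def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)


def is_total(R, X):
    return all((x, y) in R or (y, x) in R for x, y in product(X, repeat=2))


def is_trichotomous(R, X):
    # Exactly one of xRy, yRx, x = y must hold for every pair.
    return all(
        [(x, y) in R, (y, x) in R, x == y].count(True) == 1
        for x, y in product(X, repeat=2)
    )


X = range(1, 7)
geq = {(x, y) for x, y in product(X, repeat=2) if x >= y}
gt = {(x, y) for x, y in product(X, repeat=2) if x > y}
divides = {(x, y) for x, y in product(X, repeat=2) if y % x == 0}
```

These checks only confirm the claims on the chosen finite sample; they are not proofs for the infinite relations discussed in the text.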
Relation type          Reflexivity   Symmetry        Transitive   Symbol        Example
directed graph                                                    →
undirected graph       irreflexive   symmetric
tournament             irreflexive   antisymmetric                              pecking order
dependency             reflexive     symmetric
total preorder         reflexive                     Yes          ≤
preorder               reflexive                     Yes          ≤             preference
partial order          reflexive     antisymmetric   Yes          ≤             subset
partial equivalence                  symmetric       Yes
equivalence relation   reflexive     symmetric       Yes          ∼, ≅, ≈, ≡    equality
A relation R on sets X and Y is said to be contained in a relation S on X
and Y if R is a subset of S, that is, if x R y always implies x S y. In this
case, if R and S disagree, R is also said to be smaller than S. For
example, > is contained in ≥.
• Transitive reduction: R^−, defined as a minimal
relation having the same transitive closure as R.
• Reflexive transitive closure: R^*, defined as R^* = (R^+)^=, the
smallest preorder containing R.
• Reflexive transitive symmetric closure: R^≡, defined as the
smallest equivalence relation over X containing R.
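For a finite relation the closures can be computed by repeatedly adding the pairs that transitivity demands until nothing new appears. This is a simple fixed-point sketch rather than an efficient algorithm; the function names and the sample relation are illustrative.

```python
def transitive_closure(R):
    """Smallest transitive relation containing R (finite R only)."""
    closure = set(R)
    while True:
        new_pairs = {
            (x, w) for (x, y) in closure for (z, w) in closure if y == z
        }
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


def reflexive_transitive_closure(R, X):
    """Transitive closure of R together with all pairs (x, x) for x in X."""
    return transitive_closure(R) | {(x, x) for x in X}


X = {1, 2, 3, 4}
successor = {(1, 2), (2, 3), (3, 4)}

strictly_less = transitive_closure(successor)
less_or_equal = reflexive_transitive_closure(successor, X)
```

Starting from the successor relation on {1, 2, 3, 4}, the transitive closure is the strict order < and the reflexive transitive closure is ≤.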
Complement
Restriction
The left-restriction (right-restriction, respectively) of a binary relation
between X and Y to a subset S of its domain (codomain) is the set of all
pairs (x, y) in the relation for which x (y) is an element of S.
In most mathematical contexts, references to the relations of equality,
membership and subset are harmless because they can be understood
implicitly to be restricted to some set in the context. The usual work-
around to this problem is to select a "large enough" set A, that contains
all the objects of interest, and work with the restriction =_A instead of =.
Similarly, the "subset of" relation ⊆ needs to be restricted to have
domain and codomain P(A) (the power set of a specific set A): the
resulting set relation can be denoted ⊆_A. Also, the "member of" relation
needs to be restricted to have domain A and codomain P(A) to obtain a
binary relation ∈_A that is a set. Bertrand Russell has shown that
assuming ∈ to be defined on all sets leads to a contradiction in naive set
theory.
• order relations, including strict orders:
o greater than
o greater than or equal to
o less than
o less than or equal to
o divides (evenly)
o is a subset of
• equivalence relations:
o equality
o is parallel to (for affine spaces)
o is in bijection with
o isomorphy
• dependency relation, a finite, symmetric, reflexive relation.
• independency relation, a symmetric, irreflexive relation which is
the complement of some dependency relation.
Functions
Definitions
If, to each x in X, f assigns exactly one y in Y, then f is called a total
function or just a function. The following definitions are commonly used
when discussing functions.
Properties of functions
Composition of functions
Inverses of functions
Theorem: If a function f has both a left inverse g and a right inverse h,
then g = h.
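The argument behind this theorem fits in one line. The derivation below is a sketch under assumed notation: f maps X to Y, g is a left inverse (g ∘ f = id_X) and h is a right inverse (f ∘ h = id_Y); these symbol names are choices made here for illustration.

```latex
\[
  g \;=\; g \circ \mathrm{id}_Y
    \;=\; g \circ (f \circ h)
    \;=\; (g \circ f) \circ h
    \;=\; \mathrm{id}_X \circ h
    \;=\; h .
\]
```

The only facts used are the two inverse identities and the associativity of composition.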
The input and output of a function can be expressed as an ordered pair,
ordered so that the first element is the input (or tuple of inputs, if the
function takes more than one input), and the second is the output. In the
example above, f(x) = x², we have the ordered pair (−3, 9). If both input
and output are real numbers, this ordered pair can be viewed as the
Cartesian coordinates of a point on the graph of the function.
is a number. Another important operation defined on functions is
function composition, where the output from one function becomes the
input to another function.
A function that associates to any of the four colored shapes its color.
The input to a function is called the argument and the output is called the
value. The set of all permitted inputs to a given function is called the
domain of the function, while the set of permissible outputs is called the
codomain. Thus, the domain of the "color-of-the-shape function" is the
set of the four shapes, and the codomain consists of the five colors. The
concept of a function does not require that every possible output is the
value of some argument, e.g. the color blue is not the color of any of the
four shapes in X.
A third example of a function has the set of polygons as domain and the
set of natural numbers as codomain. The function associates a polygon
with its number of vertices. For example, a triangle is associated with
the number 3, a square with the number 4, and so on.
The term range is sometimes used either for the codomain or for the set
of all the actual values a function has.
Definition
The above diagram represents a function with domain {1, 2, 3},
codomain {A, B, C, D} and set of ordered pairs {(1,D), (2,C), (3,C)}.
The image is {C,D}.
However, this second diagram does not represent a function. One reason
is that 2 is the first element in more than one ordered pair. In particular,
(2, B) and (2, C) are both elements of the set of ordered pairs. Another
reason, sufficient by itself, is that 3 is not the first element (input) for
any ordered pair. A third reason, likewise, is that 4 is not the first
element of any ordered pair.
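The two conditions discussed above (every input appears, and appears only once, as a first element) can be checked for a finite set of ordered pairs. In this sketch the first example uses the pairs stated for the first diagram; the pairs for the second example are hypothetical, chosen only to reproduce the failures described, since the diagram itself is not available.

```python
def is_function(pairs, domain, codomain):
    """Check that pairs is the graph of a function from domain to codomain."""
    firsts = [x for (x, _) in pairs]
    return (
        all(x in domain and y in codomain for (x, y) in pairs)
        and set(firsts) == set(domain)          # every input is used
        and len(firsts) == len(set(firsts))     # no input is used twice
    )


codomain = {"A", "B", "C", "D"}

first_pairs = {(1, "D"), (2, "C"), (3, "C")}
first_image = {y for (_, y) in first_pairs}

# Hypothetical pairs: 2 occurs twice, 3 and 4 never occur.
second_pairs = {(1, "D"), (2, "B"), (2, "C")}
```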
("yellow rectangle", "red").
The "color-of-the-shape" function described above consists of the set of
those ordered pairs,
(shape, color)
where the color is the actual color of the given shape. Thus, the pair
("red triangle", "red") is in the function, but the pair ("yellow rectangle",
"red") is not.
Functional notation
or
A general function is often denoted by f. Special functions have names,
for example, the signum function is denoted by sgn. Given a real
number x, its image under the signum function is then written as sgn(x).
Here, the argument is denoted by the symbol x, but different symbols
may be used in other contexts. For example, in physics, the velocity of
some body, depending on the time, is denoted v(t). The parentheses
around the argument may be omitted when there is little chance of
confusion, thus: sin x; this is known as prefix notation.
In other words, this function has the natural numbers as domain, the
integers as codomain. Strictly speaking, a function is properly defined
only when the domain and codomain are specified. For example, the
formula f(x) = 4 − x alone (without specifying the codomain and
domain) is not a properly defined function. Moreover, a function g
given by the same formula g(x) = 4 − x
(with different domain) is not considered the same function, even though
the formulas defining f and g agree, and similarly with a different
codomain. Despite that, many authors drop the specification of the
domain and codomain, especially if these are clear from the context. So
in this example many just write f(x) = 4 − x. Sometimes, the maximal
possible domain is also understood implicitly: a formula such as
f(x) = √(x² − 5x + 6) may mean that the domain of f is the set of real numbers x
where the square root is defined (in this case x ≤ 2 or x ≥ 3).
Specifying a function
arguments x and their corresponding function values f(x). More
commonly, a function is defined by a formula, or (more generally) an
algorithm — a recipe that tells how to compute the value of f(x) given
any x in the domain.
computable functions. For example, the Euclidean algorithm gives a
precise process to compute the greatest common divisor of two positive
integers. Many of the functions studied in the context of number theory
are computable.
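As an illustration of a function defined by an algorithm, the Euclidean algorithm can be written in a few lines. This is a standard textbook formulation using repeated remainders, given here as a sketch; the function name gcd is an illustrative choice.

```python
def gcd(a, b):
    """Greatest common divisor of two positive integers (Euclidean algorithm)."""
    while b:
        # Replace (a, b) by (b, a mod b); the gcd is unchanged by this step.
        a, b = b, a % b
    return a


example = gcd(48, 18)
```

Each step strictly decreases the second argument, so the loop terminates for any pair of positive integers.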
Basic properties
The graph of the function f(x) = x³ − 9x² + 23x − 15. The interval A =
[3.5, 4.25] is a subset of the domain, thus it is shown as part of the x-axis
(green). The image of A is (approximately) the interval [−3.08, −1.88]. It
is obtained by projecting to the y-axis (along the blue arrows) the
intersection of the graph with the light green area consisting of all points
whose x-coordinate is between 3.5 and 4.25. The image is the part of the
(vertical) y-axis shown in blue. The preimage of B = [1, 2.5] consists of three
intervals. They are obtained by projecting the intersection of the light
red area with the graph to the x-axis.
So, for example, the preimage of {4, 9} under the squaring function is
the set {−3,−2,2,3}. The term range usually refers to the image,[7] but
sometimes it refers to the codomain.
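Images and preimages of finite sets can be computed directly from their definitions. The sketch below restricts the squaring function to the integers from −10 to 10, which is an assumption made so the domain can be enumerated; the helper names image and preimage are illustrative.

```python
def image(f, A):
    """Set of values f(x) for x in A."""
    return {f(x) for x in A}


def preimage(f, B, domain):
    """Set of x in the domain whose value f(x) lies in B."""
    return {x for x in domain if f(x) in B}


def square(x):
    return x * x


domain = range(-10, 11)
pre = preimage(square, {4, 9}, domain)
img = image(square, {-3, -2, 2, 3})
```

Note that the preimage needs the domain as an explicit argument, while the image only needs the set being mapped.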
Injective and surjective functions
Function composition
second, reversing English reading order. The notation can be memorized
by reading the notation as "g of f" or "g after f". The composition is
only defined when the codomain of f is the domain of g. Assuming that,
the composition in the opposite order need not be defined. Even if it
is, i.e., if the codomain of g is the domain of f, it is not in general true
that g ∘ f = f ∘ g.
That is, the order of the composition is important. For example, suppose
f(x) = x² and g(x) = x + 1. Then g(f(x)) = x² + 1, while f(g(x)) = (x + 1)²,
which is x² + 2x + 1, a different function.
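The example with f(x) = x² and g(x) = x + 1 can be reproduced in code. This is a minimal sketch; compose is a hypothetical helper that applies its second argument first, matching the reading "g after f".

```python
def compose(g, f):
    """Return the function 'g after f', which maps x to g(f(x))."""
    return lambda x: g(f(x))


def f(x):
    return x ** 2


def g(x):
    return x + 1


g_after_f = compose(g, f)   # x -> x^2 + 1
f_after_g = compose(f, g)   # x -> (x + 1)^2
```

Evaluating both at x = 3 gives 10 and 16 respectively, so the two compositions are different functions.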
Another composition. For example, we have here (g ∘ f)(c) = #.
Identity function
The unique function over a set X that maps each element to itself is
called the identity function for X, and typically denoted by id_X. Each set
has its own identity function, so the subscript cannot be omitted unless
the set can be inferred from context. Under composition, an identity
function is "neutral": if f is any function from X to Y, then
f ∘ id_X = id_Y ∘ f = f.
Restrictions and extensions
Inverse function
That is, the two possible compositions of f and f⁻¹ need to be the
respective identity maps of X and Y.
Types of functions
Real-valued functions
A real-valued function f is one whose codomain is the set of real
numbers or a subset thereof. If, in addition, the domain is also a subset
of the reals, f is a real-valued function of a real variable. The study of
such functions is called real analysis.
Real-valued functions enjoy so-called pointwise operations. That is,
given two functions
f, g: X → Y
A linear function A quadratic function.
Further types of functions
There are many other special classes of functions that are important to
particular branches of mathematics, or particular applications.
Function spaces
Currying
surjective on a (given) set if its image equals that set. For example, we
might say a function f is surjective on the set of real numbers.
Many operations in set theory, such as the power set, have the class of
all sets as their domain, and therefore, although they are informally
described as functions, they do not fit the set-theoretical definition
outlined above, because a class is not necessarily a set. However some
definitions of relations and functions define them as classes of pairs
rather than sets of pairs and therefore do include the power set as a
function.
In some parts of mathematics, including recursion theory and functional
analysis, it is convenient to study partial functions in which some values
of the domain have no association in the graph; i.e., single-valued
relations. For example, the function f such that f(x) = 1/x does not define
a value for x = 0, since division by zero is not defined. Hence f is only a
partial function from the real line to the real line. The term total function
can be used to stress the fact that every element of the domain does
appear as the first element of an ordered pair in the graph.
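One common way to model a partial function in code is to return an explicit "no value" marker where the function is undefined. The sketch below does this for f(x) = 1/x using exact rational arithmetic; the name reciprocal and the use of None as the marker are assumptions made for illustration.

```python
from fractions import Fraction
from typing import Optional


def reciprocal(x: Fraction) -> Optional[Fraction]:
    """Return 1/x, or None where the partial function has no value (x = 0)."""
    if x == 0:
        return None
    return 1 / x


defined_value = reciprocal(Fraction(4))
undefined_value = reciprocal(Fraction(0))
```

Restricting the domain to the nonzero numbers would turn the same rule into a total function.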
For example, consider the function that associates two integers to their
product: f(x, y) = x·y. This function can be defined formally as having
domain ℤ×ℤ, the set of all integer pairs; codomain ℤ; and, for graph, the
set of all pairs ((x, y), x·y). Note that the first component of any such
pair is itself a pair (of integers), while the second component is a single
integer.
The function value of the pair (x, y) is f((x, y)). However, it is customary
to drop one set of parentheses and consider f(x, y) a function of two
variables, x and y. Functions of two variables may be plotted in
three-dimensional Cartesian coordinates as ordered triples of the form (x, y, f(x, y)).
Binary operations
Functors
denoted ∨, and the negation not, denoted ¬. It is thus a formalism for
describing logical relations in the same way that ordinary algebra
describes numeric relations.
Boolean algebra was introduced by George Boole in his first book The
Mathematical Analysis of Logic (1847), and set forth more fully in his
An Investigation of the Laws of Thought (1854). According to
Huntington, the term "Boolean algebra" was first suggested by Sheffer
in 1913.
History
Stone proved in 1936 that every Boolean algebra is isomorphic to a field
of sets.
between his algebra and logic was later put on firm ground in the setting
of algebraic logic, which also studies the algebraic systems of many
other logics.[4] The problem of determining whether the variables of a
given Boolean (propositional) formula can be assigned in such a way as
to make the formula evaluate to true is called the Boolean satisfiability
problem (SAT), and is of importance to theoretical computer science,
being the first problem shown to be NP-complete. The closely related
model of computation known as a Boolean circuit relates time
complexity (of an algorithm) to circuit complexity.
Values
Boolean algebra also deals with functions which have their values in the
set {0, 1}. A sequence of bits is a commonly used such function.
Another common example is the subsets of a set E: to a subset F of E is
associated the indicator function that takes the value 1 on F and 0
outside F. The most general example is the elements of a Boolean
algebra, with all of the foregoing being instances thereof.
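The correspondence between a subset and its indicator function can be sketched in Python. The ground set E and the subset F below are hypothetical choices made only for illustration.

```python
def indicator(F):
    """Return the indicator function of the subset F: 1 on F, 0 outside F."""
    members = frozenset(F)
    return lambda e: 1 if e in members else 0

E = range(8)        # hypothetical ground set E = {0, ..., 7}
F = {1, 3, 4}       # hypothetical subset F of E
chi = indicator(F)

# Listing the values of chi over E gives the subset as a sequence of bits.
bits = [chi(e) for e in E]
print(bits)
```

Read this way, a sequence of bits and a subset of E are two descriptions of the same {0, 1}-valued function.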
As with elementary algebra, the purely equational part of the theory may
be developed without considering explicit values for the variables.
Operations
Basic operations
• AND (conjunction), denoted x∧y (sometimes x AND y or Kxy),
satisfies x∧y = 1 if x = y = 1 and x∧y = 0 otherwise.
• OR (disjunction), denoted x∨y (sometimes x OR y or Axy),
satisfies x∨y = 0 if x = y = 0 and x∨y = 1 otherwise.
• NOT (negation), denoted ¬x (sometimes NOT x, Nx or !x),
satisfies ¬x = 0 if x = 1 and ¬x = 1 if x = 0.
One may consider that only negation and one of the two other
operations are basic, because the following identities allow
conjunction to be defined in terms of negation and disjunction, and
vice versa:
x∧y = ¬(¬x∨¬y)
x∨y = ¬(¬x∧¬y)
Derived operations
viewing an implication with a false premise as something other than
either true or false.)
Given two operands, each with two possible values, there are 2^2 = 4
possible combinations of inputs. Because each output can have two
possible values, there are a total of 2^4 = 16 possible binary Boolean
operations.
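This count can be checked by brute force. The Python sketch below (the variable names are illustrative, not from the text) enumerates the input combinations and every possible output column, then confirms that AND and OR are among them.

```python
from itertools import product

inputs = list(product((0, 1), repeat=2))             # 2^2 = 4 input combinations
tables = list(product((0, 1), repeat=len(inputs)))   # one output bit per input row

print(len(inputs), len(tables))

# AND and OR, written as output columns over the same four input rows.
and_table = tuple(x & y for x, y in inputs)
or_table = tuple(x | y for x, y in inputs)
print(and_table in tables, or_table in tables)
```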
Laws
Monotone laws
A consequence of the first of these laws is 1∨1 = 1, which is false in
ordinary algebra, where 1+1 = 2. Taking x = 2 in the second law shows
that it is not an ordinary algebra law either, since 2×2 = 4. The
remaining four laws can be falsified in ordinary algebra by taking all
variables to be 1, for example in Absorption Law 1 the left hand side is
1(1+1) = 2 while the right hand side is 1, and so on.
All of the laws treated so far have been for conjunction and disjunction.
These operations have the property that changing either argument either
leaves the output unchanged or the output changes in the same way as
the input. Equivalently, changing any variable from 0 to 1 never results
in the output changing from 1 to 0. Operations with this property are
said to be monotone. Thus the axioms so far have all been for
monotonic Boolean logic. Nonmonotonicity enters via complement ¬ as
follows.[3]
Nonmonotone laws
All properties of negation including the laws below follow from the
above two laws alone.[3]
In both ordinary and Boolean algebra, negation works by exchanging
pairs of elements, whence in both algebras it satisfies the double
negation law (also called involution law): ¬¬x = x.
Completeness
The laws listed above define Boolean algebra, in the sense that they
entail the rest of the subject. The laws Complementation 1 and 2,
together with the monotone laws, suffice for this purpose and can
therefore be taken as one possible complete set of laws or axiomatization
of Boolean algebra. Every law of Boolean algebra follows logically from
these axioms. Furthermore, Boolean algebras can then be defined as the
models of these axioms as treated in the section thereon.
model of them. In contrast, in a list of some but not all of the same laws,
there could have been Boolean laws that did not follow from those on
the list, and moreover there would have been models of the listed laws
that were not Boolean algebras.
Duality principle
There is nothing magical about the choice of symbols for the values of
Boolean algebra. We could rename 0 and 1 to say α and β, and as long as
we did so consistently throughout it would still be Boolean algebra,
albeit with some obvious cosmetic differences.
One change we did not need to make as part of this interchange was to
complement. We say that complement is a self-dual operation. The
identity or do-nothing operation x (copy the input to the output) is also
self-dual. A more complicated example of a self-dual operation is (x∧y)
∨ (y∧z) ∨ (z∧x). There is no self-dual binary operation that depends on
both its arguments. A composition of self-dual operations is a self-dual
operation. For example, if f(x,y,z) = (x∧y) ∨ (y∧z) ∨ (z∧x), then
f(f(x,y,z),x,t) is a self-dual operation of four arguments x,y,z,t.
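Self-duality can be tested mechanically: an operation is self-dual when complementing every input complements the output. The sketch below is one way to check the claims above in Python; the helper names (neg, maj, is_self_dual, g) are illustrative.

```python
from itertools import product

def neg(x):
    return 1 - x

def maj(x, y, z):
    """The operation (x∧y) ∨ (y∧z) ∨ (z∧x)."""
    return (x & y) | (y & z) | (z & x)

def is_self_dual(f, arity):
    """True when complementing all inputs always complements the output."""
    return all(f(*map(neg, args)) == neg(f(*args))
               for args in product((0, 1), repeat=arity))

def g(x, y, z, t):
    """The composition f(f(x,y,z),x,t) with f = maj."""
    return maj(maj(x, y, z), x, t)

# Truth tables (rows 00, 01, 10, 11) of all self-dual binary operations.
self_dual_binary = [tbl for tbl in product((0, 1), repeat=4)
                    if is_self_dual(lambda x, y, tbl=tbl: tbl[2 * x + y], 2)]

print(is_self_dual(neg, 1), is_self_dual(maj, 3), is_self_dual(g, 4))
print(self_dual_binary)
```

The four tables found for the binary case are those of x, y, ¬x and ¬y, none of which depends on both arguments.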
Diagrammatic representations
Venn diagrams
For conjunction, the region inside both circles is shaded to indicate that
x∧y is 1 when both variables are 1. The other regions are left unshaded
to indicate that x∧y is 0 for the other three combinations.
The second diagram represents disjunction x∨y by shading those regions
that lie inside either or both circles. The third diagram represents
complement ¬x by shading the region not inside the circle.
While we have not shown the Venn diagrams for the constants 0 and 1,
they are trivial, being respectively a white box and a dark box, neither
one containing a circle. However we could put a circle for x in those
boxes, in which case each would denote a function of one argument, x,
which returns the same value independently of x, called a constant
function. As far as their outputs are concerned, constants and constant
functions are indistinguishable; the difference is that a constant takes no
arguments, called a zeroary or nullary operation, while a constant
function takes one argument, which it ignores, and is a unary operation.
To see the first absorption law, x∧(x∨y) = x, start with the diagram in
the middle for x∨y and note that the portion of the shaded area in
common with the x circle is the whole of the x circle. For the second
absorption law, x∨(x∧y) = x, start with the left diagram for x∧y and note
that shading the whole of the x circle results in just the x circle being
shaded, since the previous shading was inside the x circle.
The second De Morgan's law, (¬x)∨(¬y) = ¬(x∧y), works the same way
with the two diagrams interchanged.
The first complement law, x∧¬x = 0, says that the interior and exterior
of the x circle have no overlap. The second complement law, x∨¬x = 1,
says that everything is either inside or outside the x circle.
Digital logic gates
The lines on the left of each gate represent input wires or ports. The
value of the input is represented by a voltage on the lead. For so-called
"active-high" logic, 0 is represented by a voltage close to zero or
"ground", while 1 is represented by a voltage close to the supply voltage;
active-low reverses this. The line on the right of each gate represents the
output port, which normally follows the same voltage conventions as the
input ports.
passing through this port is complemented on the way through, whether
it is an input or output port.
More generally one may complement any of the eight subsets of the
three ports of either an AND or OR gate. The resulting sixteen
possibilities give rise to only eight Boolean operations, namely those
with an odd number of 1's in their truth table. There are eight such
because the "odd-bit-out" can be either 0 or 1 and can go in any of four
positions in the truth table. There being sixteen binary Boolean
operations, this must leave eight operations with an even number of 1's
in their truth tables. Two of these are the constants 0 and 1 (as binary
operations that ignore both their inputs); four are the operations that
depend nontrivially on exactly one of their two inputs, namely x, y, ¬x,
and ¬y; and the remaining two are x⊕y (XOR) and its complement x≡y.
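The counting argument can be reproduced in a few lines of Python. In this sketch a complemented port is modelled by XOR-ing the signal with 1; the function and variable names are illustrative.

```python
from itertools import product

ROWS = list(product((0, 1), repeat=2))   # the four input rows (x, y)

def gate_table(op, cx, cy, cout):
    """Truth table of a gate whose ports are complemented where the flag is 1."""
    return tuple(op(x ^ cx, y ^ cy) ^ cout for x, y in ROWS)

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

# Sixteen gate configurations: 2 gate types x 8 subsets of the three ports.
configs = [(op, flags) for op in (AND, OR)
           for flags in product((0, 1), repeat=3)]
tables = {gate_table(op, *flags) for op, flags in configs}

print(len(configs), len(tables))
print(all(sum(t) % 2 == 1 for t in tables))
```

The sixteen configurations collapse to eight distinct truth tables, each containing an odd number of 1s.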
Boolean algebras
Boolean algebra (structure)
equation. Hence modern authors allow the degenerate Boolean algebra
and let X be empty.)
Example 2. The empty set and X. This two-element algebra shows that
a concrete Boolean algebra can be finite even when it consists of subsets
of an infinite set. It can be seen that every field of subsets of X must
contain the empty set and X. Hence no smaller example is possible,
other than the degenerate algebra obtained by taking X to be empty so as
to make the empty set and X coincide.
exactly one region. Then the set of all 2^(2^n) possible unions of regions
(including the empty set obtained as the union of the empty set of
regions and X obtained as the union of all 2^n regions) is closed under
union, intersection, and complement relative to X and therefore forms a
concrete Boolean algebra. Again we have finitely many subsets of an
infinite set forming a concrete Boolean algebra, with Example 2 arising
as the case n = 0 of no curves.
(1)
(2)
(3)
(4)
(5)
2. The operations satisfy the absorption law
(6)
(7)
(8)
4. contains universal bounds (the empty set) and (the universal set)
which satisfy
(9)
(10)
(11)
(12)
(13)
(14)
In the slightly archaic terminology of (Bell 1986, p. 444), a Boolean
algebra can be defined as a set of elements , , ... with binary operators
(or ; logical OR) and (or ; logical AND) such that
3a. .
3b. .
4a. .
4b. .
1. Commutativity: .
2. Associativity: .
3. Huntington axiom: .
(15)
• A variable can take only two values: binary 1 for HIGH and
binary 0 for LOW.
• The complement of a variable is represented by an overbar. Thus the
complement of variable B is written B̄: if B = 0 then B̄ = 1, and if
B = 1 then B̄ = 0.
• ORing of variables is represented by a plus (+) sign between
them. For example, the ORing of A, B and C is written A + B + C.
• Logical ANDing of two or more variables is represented by
writing a dot between them, as in A.B.C. Sometimes the dot is
omitted, as in ABC.
Boolean Laws
1.Commutative law
Commutative law states that changing the sequence of the variables does
not have any effect on the output of a logic circuit.
2.Associative law
This law states that the order in which the logic operations are
performed is irrelevant as their effect is the same.
3.Distributive law
4.AND law
These laws use the AND operation and are therefore called AND laws.
5.OR law
These laws use the OR operation and are therefore called OR laws.
6.INVERSION law
This law uses the NOT operation. The inversion law states that double
inversion of a variable results in the original variable itself.
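Because each variable takes only the values 0 and 1, every law can be verified by trying all combinations. The Python sketch below assumes the usual textbook forms of the six laws (written out in the comments); the dictionary layout and names are illustrative.

```python
from itertools import product

def NOT(a):
    return 1 - a

# Each entry checks one law for given A, B, C (& is AND, | is OR).
laws = {
    # A.B = B.A and A + B = B + A
    "commutative": lambda A, B, C: (A & B) == (B & A) and (A | B) == (B | A),
    # (A.B).C = A.(B.C) and (A + B) + C = A + (B + C)
    "associative": lambda A, B, C: ((A & B) & C) == (A & (B & C))
                                   and ((A | B) | C) == (A | (B | C)),
    # A.(B + C) = A.B + A.C
    "distributive": lambda A, B, C: (A & (B | C)) == ((A & B) | (A & C)),
    # A.0 = 0, A.1 = A, A.A = A, A.NOT(A) = 0
    "AND": lambda A, B, C: (A & 0) == 0 and (A & 1) == A
                           and (A & A) == A and (A & NOT(A)) == 0,
    # A + 0 = A, A + 1 = 1, A + A = A, A + NOT(A) = 1
    "OR": lambda A, B, C: (A | 0) == A and (A | 1) == 1
                          and (A | A) == A and (A | NOT(A)) == 1,
    # NOT(NOT(A)) = A
    "inversion": lambda A, B, C: NOT(NOT(A)) == A,
}

holds = {name: all(law(A, B, C) for A, B, C in product((0, 1), repeat=3))
         for name, law in laws.items()}
print(holds)
```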
2.1.Logic statements
Statement (logic)
Overview
In either case a statement is viewed as a truth bearer.
• "Socrates is a man."
• "A triangle has three sides."
• "Madrid is the capital of Spain."
The first two examples are not declarative sentences and therefore are
not (or do not make) statements. The third and fourth are declarative
sentences but, lacking meaning, are neither true nor false and therefore
are not (or do not make) statements. The fifth and sixth examples are
meaningful declarative sentences, but are not statements but rather
matters of opinion or taste. Whether or not the sentence "Pegasus
exists." is a statement is a subject of debate among philosophers.
Bertrand Russell held that it is a (false) statement. Strawson held it is not
a statement at all.
If you finish your homework then you can watch T.V.
This is a question if and only if this is an answer.
I have read this and I understand the concept.
Note that the connective "or" in logic is used in the inclusive sense (not
the exclusive sense, as often in English). Thus, the logical statement "It is
raining or the sun is shining" means it is raining, or the sun is shining, or
it is raining and the sun is shining.
If p is the statement "The wall is red" and q is the statement "The lamp
is on", then p ∨ q is the statement "The wall is red or the lamp is on (or
both)", whereas q → p is the statement "If the lamp is on then the wall is
red". The statement ¬p ∧ q translates to "The wall isn't red and the lamp is
on".
The statement p → q directly translates as "If the wall is red then the
lamp is on". It can also be stated as "The wall is red only if the lamp is
on" or "The lamp is on if the wall is red". Similarly, p ∧ ¬q directly
translates as "The wall is red and the lamp is not on", but it would be
preferable to say "The wall is red but the lamp is off". The truth value of
a compound statement is determined from the truth values of its simple
components under certain rules. For example, if p is a true statement then
the truth value of ¬p is F. Similarly, if p has truth value F, then the
statement ¬p has truth value T.
These rules are summarized in the following truth table.
From these elementary truth tables, we can determine the truth value of
more complicated statements. For example, what is the truth value of
given that p and q are true? In this case, has truth value F and
from the second line of the tables above, we see the truth value of the
compound statement is F. Had it been the case that p was false and q
true, then again would be false and from the fourth row of the above
table we see that is a false statement. To consider all the possible
truth values, we construct a truth table.
The lower case t and f were used to record truth values in intermediate
steps. Note that while a truth table involving statements p and q has 4
rows to cover the possibility of each statement being true or false, if we
have additional information about either statement this will reduce the
number of rows in the truth table. If, for example, the statement p is
known to be true, then in constructing the truth table of we will
only have 2 rows. Truth tables involving n statements will have 2^n rows
unless additional information about the truth values of some of these
statements is known.
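The row count can be illustrated with a small truth-table generator. This is a minimal Python sketch; the function truth_table, its known argument, and the choice of p → q as the example formula are assumptions made for illustration.

```python
from itertools import product

def truth_table(names, formula, known=None):
    """List the rows of a truth table; `known` fixes the value of some statements."""
    known = known or {}
    free = [n for n in names if n not in known]
    rows = []
    for values in product((True, False), repeat=len(free)):
        env = dict(known, **dict(zip(free, values)))
        rows.append((tuple(env[n] for n in names), formula(**env)))
    return rows

def implies(p, q):
    return (not p) or q

full = truth_table(["p", "q"], implies)                        # 2^2 rows
reduced = truth_table(["p", "q"], implies, known={"p": True})  # p known: 2 rows
three = truth_table(["p", "q", "r"], lambda p, q, r: (p and q) or r)

print(len(full), len(reduced), len(three))
```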
Worked Examples
Negation
Consider the statement "You are either rich or happy." For this statement
to be false, you can't be rich and you can't be happy. In other words,
the opposite is to be not rich and not happy. Or if we rewrite it in terms
of the original statement we get "You are not rich and not happy."
If we let A be the statement "You are rich" and B be the statement "You
are happy", then the negation of "A or B" becomes "Not A and Not B."
In general, we have the same statement: The negation of "A or B" is the
statement "Not A and Not B."
Negation of "A and B".
Again, let's analyze an example first.
Consider the statement "I am both rich and happy." For this statement to
be false I could be either not rich or not happy. If we let A be the
statement "I am rich" and B be the statement "I am happy", then the
negation of "A and B" becomes "I am not rich or I am not happy" or
"Not A or Not B".
So the negation of "if A, then B" becomes "A and not B".
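The three negation rules can be confirmed by checking every combination of truth values for A and B. A minimal Python sketch follows; the helper implies, which encodes "if a, then b" as (not a) or b, is an illustrative name.

```python
from itertools import product

def implies(a, b):
    """Truth value of "if a, then b"."""
    return (not a) or b

cases = list(product((True, False), repeat=2))

# Each rule compares a negated statement with its claimed rewording.
neg_or = all((not (A or B)) == ((not A) and (not B)) for A, B in cases)
neg_and = all((not (A and B)) == ((not A) or (not B)) for A, B in cases)
neg_if = all((not implies(A, B)) == (A and (not B)) for A, B in cases)

print(neg_or, neg_and, neg_if)
```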
Example.
Now let's consider a statement involving some mathematics. Take the
statement "If n is even, then n/2 is an integer." For this statement to be
false, we would need to find an even integer n for which n/2 was not
an integer. So the opposite of this statement is the statement that "n is
even and n/2 is not an integer."
Negation of "For every ...", "For all ...", "There exists ..."
Sometimes we encounter phrases such as "for every," "for any," "for all"
and "there exists" in mathematical statements.
Example.
Consider the statement "For all integers n, either n is even or n is
odd". Although the phrasing is a bit different, this is a statement of the
form "If A, then B." We can reword this sentence as follows: "If n is
any integer, then either n is even or n is odd."
How would we negate this statement? For this statement to be false, all
we would need is to find a single integer which is not even and not odd.
In other words, the negation is the statement "There exists an integer n,
so that n is not even and n is not odd."
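In Python, "for all" corresponds to all() and "there exists" to any(), so the swap between the two quantifiers under negation can be seen directly. The sketch below checks the example on a finite sample of integers; the range chosen is arbitrary and only illustrates the pattern, since a finite check is not a proof for all integers.

```python
integers = range(-50, 51)   # hypothetical finite sample of integers

def is_even(n):
    return n % 2 == 0

def is_odd(n):
    return n % 2 == 1

# "For all integers n, either n is even or n is odd"
statement = all(is_even(n) or is_odd(n) for n in integers)

# "There exists an integer n, so that n is not even and n is not odd"
negation = any((not is_even(n)) and (not is_odd(n)) for n in integers)

print(statement, negation)
```

Exactly one of the statement and its negation is true, as expected.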
negating a statement involving "there exists", the phrase "there exists"
gets replaced with "for every" or "for all."
Example. Negate the statement "If all rich people are happy, then
all poor people are sad."
First, this statement has the form "If A, then B", where A is the
statement "All rich people are happy" and B is the statement "All poor
people are sad." So the negation has the form "A and not B." So we will
need to negate B. The negation of the statement B is "There exists a poor
person who is not sad."
Putting this together gives: "All rich people are happy, but there exists a
poor person who is not sad" as the negation of "If all rich people are
happy, then all poor people are sad."
Summary.
Statement Negation
"There exists x such that A(x)" "For every x, not A(x)"
A B A AND B
Every time you use a computer you are relying on Boolean logic: a
system of logic established long before computers were around, named
after the English mathematician George Boole (1815 - 1864). In Boolean
logic statements can either be true or false (e.g. at the moment "I want a
cup of tea" is false, but "I want a piece of cake" is always true), and you
can string these together using the words AND, OR and NOT. To
establish if these compound statements are true or false, you might
create what's called a truth table, listing all the possible values the basic
statements can take, and then all the corresponding values the compound
statement can take. (You can read more in George Boole and the
wonderful world of 0s and 1s.)
Truth tables are useful for simple logic statements, but quickly become
tiresome and error prone for more complicated statements. Boole came
to the rescue by ingeniously recognising that binary logical operations
behaved in a way that's strikingly similar to our normal arithmetic
operations, with a few twists.
In this new kind of arithmetic (called Boolean algebra) the variables are
logical statements (loosely speaking, sentences that are either true or
false). As these can only take two values we can write 0 for a statement
we know is false and 1 for a statement we know is true. Then we can
rewrite OR as a kind of addition using only 0s and 1s:
We can rewrite AND as a kind of multiplication:
0 x 1 = 1 x 0 = 0 (since "false AND true" and "true AND false" are both
false)
0 x 0 = 0 (since "false AND false" is false)
1 x 1 = 1 (since "true AND true" is true).
As the variables can only have the values of 0 and 1, we can define the
NOT operation as the complement, taking a number to the opposite of its
value:
If A = 1, then NOT A = 0
If A = 0, then NOT A = 1
A + NOT A = 1 (since "true OR false" is true)
A x NOT A = 0 (since "true AND false" is false).
Our new version of these operations is similar in many ways to our more
familiar notions of addition and multiplication but there are a few key
differences. Parts of equations can conveniently disappear in Boolean
algebra, which can be very handy. For example, the variable B in
A+AxB
And if A is false (that is, A=0) then (A AND B) is false no matter the
value of B, and so A OR (A AND B) is false. So Boolean algebra
provides us with a disappearing act: the expression A + A x B is equal to
a simple little A:
A + A x B = A.
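The disappearing act can be checked with ordinary arithmetic on 0s and 1s. The Python sketch below assumes the usual convention that Boolean addition gives 1 + 1 = 1; the function names OR, AND and NOT are illustrative.

```python
def OR(a, b):
    return min(a + b, 1)   # addition, except that 1 + 1 stays at 1

def AND(a, b):
    return a * b           # ordinary multiplication of 0s and 1s

def NOT(a):
    return 1 - a           # swaps 0 and 1

values = (0, 1)

# A + A x B = A for every choice of A and B
absorption = all(OR(A, AND(A, B)) == A for A in values for B in values)

# A + NOT A = 1 and A x NOT A = 0
complements = all(OR(A, NOT(A)) == 1 and AND(A, NOT(A)) == 0 for A in values)

print(absorption, complements)
```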
These two equalities are known as De Morgan's Laws, after the British
mathematician Augustus de Morgan (1806 - 1871). (You can convince
yourself that they are true using the equivalent truth tables.)
These are just two of the tricks Boolean algebra has up its sleeves for
simplifying complicated
Logical Operators
We will now define the logical operators which we mentioned earlier,
using truth tables. But let us proceed with caution: most of the operators
have names which we may be accustomed to using in ways that are
fuzzy or even contradictory to their proper definitions. In all cases, use
the truth table for an operator as its exact and only definition; try not to
bring to logic the baggage of your colloquial use of the English
language.
p AND q
but we will represent it using the ampersand ("&") since that is the
symbol most commonly used on computers to represent a logical AND.
It has the following truth table:
p q p&q
T T T
T F F
F T F
F F F
Notice that p & q is only T if both p and q are T. Thus the rigorous
definition of AND is consistent with its colloquial definition. This will
be very useful for us when we get to Boolean Algebra: there, we will use
1 in place of T and 0 in place of F, and the AND operator will be used to
"mask" bits.
n bits 32 - n bits
Suppose that on your network, the three most significant bits in the first
byte of an IP address denote the network address, while the remaining
29 bits of the address are used for the host. To find the network address,
we can AND the first byte with
11100000₂
since
  xxxyyyyy
& 11100000
= xxx00000
(x & 1 = x, but x & 0 = 0). Thus masking allows the system to separate
the network address from the host address in order to identify which
network information is to be sent to. Note that most network numbers
have more than 3 bits. You will spend a lot of time working with
network masks in your courses on networking.
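The masking step looks like this in Python, where & is the bitwise AND. The sample byte is a hypothetical value chosen for illustration, and the 3-bit network part follows the example above rather than any real addressing scheme.

```python
NETWORK_MASK = 0b11100000   # the three most significant bits of the first byte

def split_first_byte(first_byte):
    """Separate the network bits (xxx00000) from the host bits (000yyyyy)."""
    network_bits = first_byte & NETWORK_MASK
    host_bits = first_byte & ~NETWORK_MASK & 0xFF
    return network_bits, host_bits

first_byte = 0b10110101     # hypothetical first byte of an address
network_bits, host_bits = split_first_byte(first_byte)
print(f"{network_bits:08b} {host_bits:08b}")
```

Every bit that meets a 1 in the mask passes through unchanged, and every bit that meets a 0 is cleared.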
We will represent OR using the stroke ("|"), again due to common usage
on computers. It has the following truth table:
p q p|q
T T T
T F T
F T T
F F F
p ~p
T F
F T
notation, and is not often used in programming (where our usual logical
operator symbols originate), so we will simply adopt the "X" as the
symbol for the XOR:
p q pXq
T T F
T F T
F T T
F F F
p q p→q
T T T
T F F
F T T
F F T
p q p↔q
T T T
T F F
F T F
F F T
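The truth tables above can be regenerated from short definitions of each operator. In this Python sketch the dictionary OPERATORS and the function column are illustrative names; each column lists the results in the row order TT, TF, FT, FF used in the tables.

```python
from itertools import product

OPERATORS = {
    "p&q": lambda p, q: p and q,
    "p|q": lambda p, q: p or q,
    "pXq": lambda p, q: p != q,
    "p→q": lambda p, q: (not p) or q,
    "p↔q": lambda p, q: p == q,
}

def column(name):
    """Truth values of one operator in the row order TT, TF, FT, FF."""
    op = OPERATORS[name]
    return "".join("T" if op(p, q) else "F"
                   for p, q in product((True, False), repeat=2))

for name in OPERATORS:
    print(name, column(name))
```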
Gate (inputs A, B; output Q)    Boolean Equation
NOT      Q = Ā
AND      Q = A.B
OR       Q = A + B
NAND     Q = NOT (A.B)
NOR      Q = NOT (A + B)
EXOR     Q = A ⊕ B or Q = A.B̄ + Ā.B
EXNOR    Q = NOT (A ⊕ B) or Q = A.B + Ā.B̄
Now let us put this into practice. There are two ways in which Boolean
expressions for a logic system can be formed, either from a truth table or
from a logic circuit diagram. We will now consider each of these in turn
starting with the easiest, which is to complete a Boolean expression from
a truth table.
CHAPTER 3: POLYNOMIALS
example of a polynomial
this one has 3 terms
Polynomial comes from poly- (meaning "many") and -nomial (in this
case meaning "term") ... so it says "many terms"
variables (like x and y)
exponents (like the 2 in y^2), but only 0, 1, 2, 3, ... etc are allowed
... not division by a variable (so something like 2/x is right out)
So:
Polynomial or Not?
• 3x
• x-2
• -6y^2 - (7/9)x
• 3xyz + 3xy^2z - 0.1xz - 200y + 0.5
• 512v^5 + 99w^5
• 5
(Yes, even "5" is a polynomial, one term is allowed, and it can even be
just a constant!)
Variables
Polynomials can have no variable at all
Or one variable
Example: x^4 - 2x^2 + x has three terms, but only one variable (x)
Example: xy^4 - 5x^2z has two terms, and three variables (x, y and z)
Example: x^4 - 2x^2 + x
You can also divide polynomials (but the result may not be a
polynomial).
Degree
Example:
The Degree is 3 (the largest exponent of
x)
Standard Form
The Standard Form for writing a polynomial is to put the terms with the
highest degree first.
Example: Put this in Standard Form: 3x^2 - 7 + 4x^3 + x^6
The highest degree is 6, so that goes first, then 3, 2 and then the constant
last:
x^6 + 4x^3 + 3x^2 - 7
Probably the most common thing you will be doing with polynomials is
"combining like terms". This is the process of adding together whatever
terms you can, but not overdoing it by trying to add together terms that
can't actually be combined. Terms can be combined ONLY IF they have
the exact same variable part. Here is a rundown of what's what:
4x and 3x^2: NOT like terms. The second term now has the same variable, but
the degree is different.
Once you have determined that two terms are indeed "like" terms and
can indeed therefore be combined, you can then deal with them in a
manner similar to what you did in grammar school. When you were first
learning to add, you would do "five apples and six apples is eleven
apples". You have since learned that, as they say, "you can't add apples
and oranges". That is, "five apples and six oranges" is just a big pile of
fruit; it isn't something like "eleven applanges". Combining like terms
works much the same way.
• Simplify 3x + 4x
These are like terms since they have the same variable part, so I
can combine the terms: three x's and four x's makes seven x's:
Copyright © Elizabeth Stapel 2000-2011 All Rights Reserved
3x + 4x = 7x
• Simplify 2x^2 + 3x – 4 – x^2 + x + 9
It is often best to group like terms together first, and then simplify:
2x^2 + 3x – 4 – x^2 + x + 9
= (2x^2 – x^2) + (3x + x) + (–4 + 9)
= x^2 + 4x + 5
• Simplify 25 – (x + 3 – x^2)
25 – (x + 3 – x^2)
= 25 – x – 3 + x^2
= x^2 – x + 25 – 3
= x^2 – x + 22
If it helps you to keep track of the negative sign, put the understood 1 in
front of the parentheses:
25 – (x + 3 – x^2)
= 25 – 1(x + 3 – x^2)
= 25 – 1x – 3 + 1x^2
= 1x^2 – 1x + 25 – 3
= 1x^2 – 1x + 22
= x^2 – 1x + 22
While the first format (without the 1's being written in) is the more
"standard" format, either format should be acceptable (but check with
your instructor). You should use the format that works most successfully
for you.
Warning: This is the kind of problem that us math teachers love to put
on tests (yes, we're cruel people), so you should expect to need to be
able to do this.
x + 2(x – [3x – 8] + 3)
= x + 2(x – 1[3x – 8] + 3)
= x + 2(x – 3x + 8 + 3)
= x + 2(–2x + 11)
= x – 4x + 22
= –3x + 22
[(6x – 8) – 2x] – [(12x – 7) – (4x – 5)]
= [6x – 8 – 2x] – [12x – 7 – 4x + 5]
= [4x – 8] – [8x – 2]
= 4x – 8 – 8x + 2
= –4x – 6
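A quick way to catch sign errors in simplifications like these is to evaluate the original and the simplified expression at several values of x and compare. The Python sketch below does this for the worked examples above; the list name examples and the test range are illustrative choices.

```python
# Each pair is (expression as given, simplified result).
examples = [
    (lambda x: 3*x + 4*x,
     lambda x: 7*x),
    (lambda x: 2*x**2 + 3*x - 4 - x**2 + x + 9,
     lambda x: x**2 + 4*x + 5),
    (lambda x: 25 - (x + 3 - x**2),
     lambda x: x**2 - x + 22),
    (lambda x: x + 2*(x - (3*x - 8) + 3),
     lambda x: -3*x + 22),
    (lambda x: ((6*x - 8) - 2*x) - ((12*x - 7) - (4*x - 5)),
     lambda x: -4*x - 6),
]

# Two polynomials of degree at most 2 that agree at 21 points are identical.
agree = [all(before(x) == after(x) for x in range(-10, 11))
         for before, after in examples]
print(agree)
```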
If you think you need more practice with this last type of problem (with
all the brackets and the negatives and the parentheses), then review the
"Simplifying with Parentheses" lesson.
(x)(x) = x^2 (multiplication)
x + x = 2x (addition)
So if you have something like x^3 + x^2, DO NOT try to say that this
somehow equals something like x^5 or 5x. If you have something like 2x
+ x, DO NOT say that this somehow equals something like 2x^2.
(1)
works, the definitions of monomial and term are reversed. Care is
therefore needed in attempting to distinguish these conflicting usages.
(2)
where the product runs over the roots of and it is understood that
multiple roots are counted with multiplicity.
(3)
(4)
and has order less than (in the case of cancellation of leading terms) or
equal to the maximum order of the original two polynomials. Similarly,
the product of two polynomials is obtained by multiplying term by term
and combining the results, for example
(5)
(6)
and has order equal to the sum of the orders of the two original
polynomials.
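The statements about the order of a sum and of a product can be illustrated with polynomials stored as coefficient lists. In this Python sketch the representation [a0, a1, a2, ...] and the two example polynomials are assumptions chosen for illustration.

```python
def degree(p):
    """Order of a polynomial given as coefficients [a0, a1, a2, ...]."""
    d = len(p) - 1
    while d > 0 and p[d] == 0:
        d -= 1
    return d

def add(p, q):
    """Add term by term, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def multiply(p, q):
    """Multiply term by term and combine terms of equal power."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

p = [1, 0, 2]     # hypothetical example: 1 + 2x^2
q = [3, 1, -2]    # hypothetical example: 3 + x - 2x^2

total = add(p, q)           # leading terms cancel, so the order drops
prod = multiply(p, q)       # order is the sum of the two orders
print(total, degree(total))
print(prod, degree(prod))
```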
A polynomial quotient
(7)
2 quadratic polynomial
3 cubic polynomial
4 Quartic
5 Quintic
6 Sextic
(9)
where
(10)
(11)
(12)
(13)
(14)
However, solutions of the general quintic equation may be given in
terms of Jacobi theta functions or hypergeometric functions in one
variable. Hermite and Kronecker proved that higher order polynomials
are not soluble in the same manner. Klein showed that the work of
Hermite was implicit in the group properties of the icosahedron. Klein's
method of solving the quintic in terms of hypergeometric functions in
one variable can be extended to the sextic, but for higher order
polynomials, either hypergeometric functions in several variables or
"Siegel functions" must be used (Belardinelli 1960, King 1996, Chow
1999). In the 1880s, Poincaré created functions which give the solution
to the nth order polynomial equation in finite form. These functions
turned out to be "natural" generalizations of the elliptic functions.
3.1.LINEAR FUNCTIONS
Example
y = 2x + 4
y = 3x + 2
Checking the point (2, 8) in each equation:
y = 2x + 4: 8 =? 2⋅2 + 4, so 8 = 8
y = 3x + 2: 8 =? 3⋅2 + 2, so 8 = 8
Linear systems composed of parallel lines that have the same slope but
different y-intercepts do not have a solution since the lines won't intersect.
Linear systems without a solution are called inconsistent systems.
Linear systems composed of lines that have the same slope and the same
y-intercept are said to be consistent dependent systems. Consistent
dependent systems have infinitely many solutions since the lines
coincide.
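The three cases can be told apart by comparing slopes and intercepts. The Python sketch below handles two lines in slope-intercept form with integer coefficients; the function name solve_lines and the label "one solution" are illustrative.

```python
from fractions import Fraction

def solve_lines(m1, b1, m2, b2):
    """Classify the system y = m1*x + b1, y = m2*x + b2 and solve it if possible."""
    if m1 == m2:
        if b1 == b2:
            return "consistent dependent", None   # same line: infinitely many solutions
        return "inconsistent", None               # parallel lines: no solution
    x = Fraction(b2 - b1, m1 - m2)                # from m1*x + b1 = m2*x + b2
    return "one solution", (x, m1 * x + b1)

print(solve_lines(2, 4, 3, 2))   # the system y = 2x + 4, y = 3x + 2
print(solve_lines(2, 4, 2, 1))   # same slope, different intercepts
print(solve_lines(2, 4, 2, 4))   # same slope and same intercept
```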
Linear Equations
Example: y = 2x+1 is a linear equation:
x     y = 2x + 1
-1    y = 2 × (-1) + 1 = -1
0     y = 2 × 0 + 1 = 1
1     y = 2 × 1 + 1 = 3
2     y = 2 × 2 + 1 = 5
Check for yourself that those points are part of the line above!
Different Forms
There are many ways of writing linear equations, but they usually have
constants (like "2" or "c") and must have simple variables (like "x" or
"y").
y - 2 = 3(x + 1)
y + 2x - 2 = 0
5x = 6
y/2 = 3
But the variables (like "x" or "y") in Linear Equations do NOT have:
Examples: These are NOT linear equations:
y^2 - 2 = 0
3√x - y = 6
x^3/2 = 16
Slope-Intercept Form
Example: y = 2x + 1
• Slope: m = 2
• Intercept: b = 1
Point-Slope Form
y - y1 = m(x - x1)
Example: y - 3 = ¼(x - 2)
• x1 = 2
• y1 = 3
• m=¼
General Form
And there is also the General Form of the equation of a straight line:
Ax + By + C = 0
(A and B cannot both be 0)
Example: 3x + 2y - 4 = 0
• A=3
• B=2
• C = -4
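The General Form can be converted to Slope-Intercept Form by solving for y, which gives m = -A/B and b = -C/B when B is not 0. A minimal Python sketch, with slope_intercept as an illustrative function name:

```python
from fractions import Fraction

def slope_intercept(A, B, C):
    """Rewrite Ax + By + C = 0 as y = m*x + b; needs B != 0 (non-vertical line)."""
    if B == 0:
        raise ValueError("vertical line: no slope-intercept form")
    return Fraction(-A, B), Fraction(-C, B)

m, b = slope_intercept(3, 2, -4)   # the example 3x + 2y - 4 = 0
print(m, b)
```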
As a Function
y = 2x - 3 f(x) = 2x - 3
y = 2x - 3 w(u) = 2u – 3 h(z) = 2z - 3
The Identity Function
f(x) = x
In Out
0 0
5 5
-2 -2
...etc ...etc
Constant Functions
f(x) = C
The position point of a vector is defined using Cartesian co-ordinates: it
uses the coordinates of the OX, OY and OZ axes where O is the origin.
We will be looking at vectors in 3 dimensional space in Cartesian
coordinates. Similar ideas hold for vectors in n dimensional space (n-
vectors).
In terms of the following components
For addition:
For subtraction:
The length of a vector :
Scalar multiplication will change the length of the vector and if the
factor λ is negative the vector will point in the opposite direction. Scalar
multiplication satisfies the following properties where a and b are
vectors, and λ and μ are scalars.
In the diagram above, the vector (1,m) is parallel to the line AB and
point A with vector coordinates (0,c) lies on the line AB. Let B be a
typical point on the line with position vector r. As d=(0,c) is a point on
the line and n=(1,m) is a vector parallel to the line, the vector equation
of the line AB is given by
r = d + λn = (0, c) + λ(1, m).
Continuing from above we will now look at a case where a given line
intersects the X-Y plane. The vector equation of the line is
If we take:
we get that
Substituting this into equations will give that λ = -2, which when
substituted into the second and first equation give that y = -2 and x = -2.
The line intersects the X-Y plane at the point (-2, -2, 0).
Scalar product
Then the scalar product of a and b, denoted by a.b is
Using the two above formulae and setting them equal to each other as
shown below we are able to calculate the angle between two vectors.
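The angle calculation can be sketched in Python. The sketch assumes the two standard expressions for the scalar product, a.b = a1b1 + a2b2 + a3b3 and a.b = |a||b|cos θ, and the names dot and angle_between are hypothetical.

```python
import math

def dot(a, b):
    """Scalar product as the sum of componentwise products."""
    return sum(x * y for x, y in zip(a, b))

def angle_between(a, b):
    """Angle in radians from cos(theta) = a.b / (|a||b|)."""
    na = math.sqrt(dot(a, a))
    nb = math.sqrt(dot(b, b))
    if na == 0 or nb == 0:
        raise ValueError("angle is undefined for the zero vector")
    cos_theta = dot(a, b) / (na * nb)
    # Clamp to [-1, 1] to guard against rounding just outside the range.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

print(math.degrees(angle_between((1, 0, 0), (0, 1, 0))))
print(math.degrees(angle_between((1, 0, 0), (1, 1, 0))))
```

The same function can be applied to the normal vectors of two planes to obtain the angle between the planes.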
Looking at the above figure, r − a is perpendicular to n. So the
vector equation of the plane is given by:
(r − a).n = 0
DEFINITION: The angle between two planes is the angle between their
two normals. The planes must first be written in vector form.
where
If we had two planes then we would have two normal vectors, say n1 and
n2. We find the angle between these two vectors using the same formula
that we used for the angle between vectors (above).
4.8.Vector Product
where
|a| is the length of a
θ is the angle between vectors
n is the unit vector perpendicular to a and b whose direction is
determined by the right-hand screw rule.
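In components, the vector product can be computed as in the following sketch. It assumes the usual right-handed convention for the component formula, and it uses the standard result that a triangle with two sides a and b has area |a × b| / 2. The function names are hypothetical.

```python
import math

def cross(a, b):
    """Vector product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r: half of |(q-p) x (r-p)|."""
    u = tuple(x - y for x, y in zip(q, p))
    v = tuple(x - y for x, y in zip(r, p))
    n = cross(u, v)
    return 0.5 * math.sqrt(sum(x * x for x in n))

print(cross((1, 0, 0), (0, 1, 0)))
print(triangle_area((0, 0, 0), (1, 0, 0), (0, 1, 0)))
```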
4.8.The Area of a Vector Triangle
which are in Cartesian form. We firstly need to find the vector parallel to
the plane.
This gives a vector denoted by n. So the equation of the plane is found
using the same method as above.
By substituting in we get
DEFINITION: Skew lines are lines or vectors which are not parallel
and do not meet. We now seek the minimum distance between these
lines. The shortest line segment joining the two lines, called a
transversal, is perpendicular to both lines.
The transversal connects A and B, n3 is the unit vector in the direction
AB and p is the required distance. As mentioned above n3 is
perpendicular to both n1 and n2.
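A minimal sketch of this calculation is given below. It assumes the two lines are written as r = a + λ n1 and r = b + μ n2, takes the direction of n3 from the vector product n1 × n2, and obtains p by projecting b − a onto that direction. The names a, b and skew_distance are illustrative, not from the text.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def skew_distance(a, n1, b, n2):
    """Shortest distance p between r = a + lam*n1 and r = b + mu*n2."""
    n = cross(n1, n2)                    # perpendicular to both lines
    norm = math.sqrt(dot(n, n))
    if norm == 0:
        raise ValueError("lines are parallel: n1 x n2 is zero")
    ab = tuple(y - x for x, y in zip(a, b))
    return abs(dot(ab, n)) / norm        # |(b - a) . n3|

# The x-axis, and a line parallel to the y-axis through (0, 0, 1).
print(skew_distance((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0)))
```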
basic of these. Analytic geometry is a great invention of Descartes and
Fermat.
ax1 + by1 + c = 0
ax2 + by2 + c = 0,
Geometry of the three-dimensional space is modeled with triples of
numbers (x, y, z) and a 3D linear equation ax + by + cz + d = 0 defines a
plane. In general, analytic geometry provides a convenient tool for
working in higher dimensions.
Within the framework of analytic geometry one may (and does) model
non-Euclidean geometries as well. For example, in plane projective
geometry a point is a triple of homogeneous coordinates (x, y, z), not all
0, and a line is the set of such points satisfying a homogeneous equation
ax + by + cz = 0.
That part of analytic geometry that deals mostly with linear equations is
called Linear Algebra.
5.1.PLANES
Plane
Examples
because it has thickness! And it should extend
forever, too.
So the very top of a perfect piece of paper
that goes on forever is the right idea!
Also, the top of a table, the floor and a whiteboard are all like a plane.
Plane vs Plain
But a "plain" is a treeless mostly flat expanse of land ... it is also flat,
but not in the pure sense we use in geometry.
Both words have other meanings too: Plane can mean an airplane, a
level, or a tool for cutting things flat; Plain can mean without special
things, or well understood.
Imagine
What is a Plane?
edges to them, when they are drawn, they have an outline. Usually, they
are represented by a parallelogram that is shaded in, like this:
5.2. SPHERE
A sphere (from Greek σφαίρα, sphaira, "globe, ball,"[1]) is a perfectly
round geometrical object in three-dimensional space, such as the shape
of a round ball. Like a circle in two dimensions, a perfect sphere is
completely symmetrical around its center, with all points on the surface
lying the same distance r from the center point. This distance r is known
as the radius of the sphere.
where r is the radius of the sphere. This formula was first derived by
Archimedes, who showed that the volume of a sphere is 2/3 that of a
circumscribed cylinder. (This assertion follows from Cavalieri's
principle.) In modern mathematics, this formula is most easily derived
using integral calculus, e.g. using disk integration.
Equations in R3
In analytic geometry, a sphere with center (x0, y0, z0) and radius r is the
locus of all points (x, y, z) such that
(x − x0)² + (y − y0)² + (z − z0)² = r².
The sphere has the smallest surface area among all surfaces enclosing a
given volume and it encloses the largest volume among all closed
surfaces with a given surface area. For this reason, the sphere appears in
nature: for instance bubbles and small water drops are roughly spherical,
because the surface tension locally minimizes surface area. The surface
area in relation to the mass of a sphere is called the specific surface area.
From the above stated equations it can be expressed as follows:
Terminology
Pairs of points on a sphere that lie on a straight line through its center are
called antipodal points. A great circle is a circle on the sphere that has
the same center and radius as the sphere, and consequently divides it into
two equal parts. The shortest distance between two distinct non-antipodal
points on the surface, measured along the surface, is on the
unique great circle passing through the two points. Equipped with the
great-circle distance, a great circle becomes the Riemannian circle.
Hemisphere
A sphere is divided into two equal hemispheres by any plane that passes
through its center. If two intersecting planes pass through its center, then
they will subdivide the sphere into four lunes or biangles, the vertices of
which all coincide with the antipodal points lying on the line of
intersection of the planes.
The antipodal quotient of the sphere is the surface called the real
projective plane, which can also be thought of as the northern
hemisphere with antipodal points of the equator identified.
The volume of a hemisphere is
where Γ(z) is Euler's Gamma function.
More generally, in a metric space (E,d), the sphere of center x and radius
r > 0 is the set of points y such that d(x,y) = r.
Topology
Spherical geometry
Great circle on a sphere
Spherical geometry
In their book Geometry and the imagination[3] David Hilbert and
Stephan Cohn-Vossen describe 11 properties of the sphere and discuss
whether these properties uniquely determine the sphere. Several
properties hold for the plane which can be thought of as a sphere with
infinite radius. These properties are:
1. The points on the sphere are all the same distance from a fixed
point. Also, the ratio of the distance of its points from two fixed
points is constant.
The first part is the usual definition of the sphere and determines it
uniquely. The second part can be easily deduced and follows a
similar result of Apollonius of Perga for the circle. This second
part also holds for the plane.
orthogonal projection on to a plane. It can be proved that each of
these properties implies the other.
For a given normal section there is a circle whose curvature is the
same as the sectional curvature, is tangent to the surface and whose
center lies on the normal line. Take the two centers
corresponding to the maximum and minimum sectional curvatures:
these are called the focal points, and the set of all such centers
forms the focal surface.
For most surfaces the focal surface forms two sheets each of which
is a surface and which come together at umbilical points. There are
a number of special cases. For channel surfaces one sheet forms a
curve and the other sheet is a surface; for cones, cylinders, toruses
and cyclides both sheets form curves. For the sphere the center of
every osculating circle is at the center of the sphere and the focal
surface forms a single point. This is a unique property of the
sphere.
7. Of all the solids having a given volume, the sphere is the one with
the smallest surface area; of all solids having a given surface area,
the sphere is the one having the greatest volume.
These properties define the sphere uniquely. These properties can
be seen by observing soap bubbles. A soap bubble will enclose a
fixed volume and due to surface tension it will try to minimize its
surface area. This is why a free floating soap bubble approximates
a sphere (though external forces such as gravity will distort the
bubble's shape slightly).
8. The sphere has the smallest total mean curvature among all convex
solids with a given surface area.
embedded in space. Hence, bending a surface will not alter the
Gaussian curvature and other surfaces with constant positive
Gaussian curvature can be obtained by cutting a small slit in the
sphere and bending it. All these other surfaces would have
boundaries and the sphere is the only surface without boundary
with constant positive Gaussian curvature. The pseudosphere is an
example of a surface with constant negative Gaussian curvature.
PART: TWO
CHAPTER 6: LINEAR ALGEBRA
Linear Algebra
The matrix and determinant are extremely useful tools of linear algebra.
One central problem of linear algebra is the solution of the matrix
equation
algebra. In particular, a linear algebra over a field F has the structure of
a ring with all the usual axioms for an inner addition and an inner
multiplication together with distributive laws, therefore giving it more
structure than a ring. A linear algebra also admits an outer operation of
multiplication by scalars (that are elements of the underlying field F).
For example, the set of all linear transformations from a vector space to
itself over a field F forms a linear algebra over F. Another example of a
linear algebra is the set of all real square matrices over the field of the
real numbers.
The set of points with coordinates that satisfy a linear equation forms a
hyperplane in an n-dimensional space. The conditions under which a set
of n hyperplanes intersect in a single point is an important focus of study
in linear algebra. Such an investigation is initially motivated by a system
of linear equations containing several unknowns. Such equations are
naturally represented using the formalism of matrices and vectors.[1][2][3]
infinite-dimensional version of the theory of vector spaces. Combined
with calculus, linear algebra facilitates the solution of linear systems of
differential equations.
History
The study of linear algebra first emerged from the study of determinants,
which were used to solve systems of linear equations. Determinants
were used by Leibniz in 1693, and subsequently, Gabriel Cramer
devised Cramer's Rule for solving linear systems in 1750. Later, Gauss
further developed the theory of solving linear systems by using Gaussian
elimination, which was initially listed as an advancement in geodesy.[4]
and inverses. Crucially, Cayley used a single letter to denote a matrix,
thus treating a matrix as an aggregate object. He also realized the
connection between matrices and determinants, and wrote "There would
be many things to say about this theory of matrices which should, it
seems to me, precede the theory of determinants".[4]
Educational history
Scope of study
Vector spaces
The main structures of linear algebra are vector spaces. A vector space
over a field F is a set V together with two binary operations. Elements of
V are called vectors and elements of F are called scalars. The first
operation, vector addition, takes any two vectors v and w and outputs a
third vector v + w. The second operation, scalar multiplication, takes any
scalar a and any vector v and outputs a new vector av. The operations of
addition and multiplication in a vector space must satisfy the following
axioms.[11] In the list below, let u, v and w be arbitrary vectors in V, and
a and b scalars in F.
Axiom                                                                     Signification
Associativity of addition                                                 u + (v + w) = (u + v) + w
Distributivity of scalar multiplication with respect to vector addition   a(u + v) = au + av
Distributivity of scalar multiplication with respect to field addition    (a + b)v = av + bv
Compatibility of scalar multiplication with field multiplication          a(bv) = (ab)v [nb 1]
The first four axioms are those of V being an abelian group under vector
addition. Vector spaces may be diverse in nature, for example,
containing functions, polynomials or matrices. Linear algebra is
concerned with properties common to all vector spaces.
Linear transformations
When a bijective linear mapping exists between two vector spaces (that
is, every vector from the second space is associated with exactly one in
the first), we say that the two spaces are isomorphic. Because an
isomorphism preserves linear structure, two isomorphic vector spaces
are "essentially the same" from the linear algebra point of view. One
essential question in linear algebra is whether a mapping is an
isomorphism or not, and, for a mapping between spaces of the same finite
dimension, this question can be answered by checking if the determinant
of its matrix is nonzero. If a mapping is not an isomorphism, linear
algebra is interested in finding its range (or image) and the set of
elements that get mapped to zero, called the kernel of the mapping.
Linear transformations have geometric significance. For example, 2 × 2
real matrices denote standard planar mappings that preserve the origin.
where a1, a2, ..., ak are scalars. The set of all linear combinations of
vectors v1, v2, ..., vk is called their span, which forms a subspace.
defined by the dimension theorem for vector spaces. If a basis of V has a
finite number of elements, V is called a finite-dimensional vector space.
If V is finite-dimensional and U is a subspace of V, then dim U ≤ dim V.
If U1 and U2 are subspaces of V, then
dim(U1 + U2) = dim U1 + dim U2 − dim(U1 ∩ U2).
Matrix theory
The condition that v1, v2, ..., vn span V guarantees that each vector v can
be assigned coordinates, whereas the linear independence of v1, v2, ..., vn
assures that these coordinates are unique (i.e. there is only one linear
combination of the basis vectors that is equal to v). In this way, once a
basis of a vector space V over F has been chosen, V may be identified
with the coordinate n-space F^n. Under this identification, addition and
scalar multiplication of vectors in V correspond to addition and scalar
multiplication of their coordinate vectors in F^n. Furthermore, if V and W
are an n-dimensional and m-dimensional vector space over F, and a
basis of V and a basis of W have been fixed, then any linear
transformation T: V → W may be encoded by an m × n matrix A with
entries in the field F, called the matrix of T with respect to these bases.
Two matrices that encode the same linear transformation in different
bases are called similar. Matrix theory replaces the study of linear
transformations, which were defined axiomatically, by the study of
matrices, which are concrete objects. This major technique distinguishes
linear algebra from theories of other algebraic structures, which usually
cannot be parameterized so concretely.
inverse if and only if the determinant has an inverse (every non-zero real
or complex number has an inverse[16]). If the determinant is zero, then
the nullspace is nontrivial. Determinants have other applications,
including a systematic way of seeing if a set of vectors is linearly
independent (we write the vectors as the columns of a matrix, and if the
determinant of that matrix is zero, the vectors are linearly dependent).
Determinants could also be used to solve systems of linear equations
(see Cramer's rule), but in real applications, Gaussian elimination is a
faster method.
where I is the identity matrix. For there to be nontrivial solutions to that
equation, det(T − λ I) = 0. The determinant is a polynomial in λ, and so the
eigenvalues are not guaranteed to exist if the field is R. Thus, we often
work with an algebraically closed field such as the complex numbers
when dealing with eigenvectors and eigenvalues so that an eigenvalue
will always exist. It would be particularly nice if given a transformation
T taking a vector space V into itself we can find a basis for V consisting
of eigenvectors. If such a basis exists, we can easily compute the action
of the transformation on any vector: if v1, v2, ..., vn are linearly
independent eigenvectors of a mapping T of n-dimensional spaces with
(not necessarily distinct) eigenvalues λ1, λ2, ..., λn, and if v = a1v1 + ... +
an vn, then,
Tv = a1 λ1 v1 + ... + an λn vn.
Inner-product spaces
• Conjugate symmetry:
• Positive-definiteness:
and so we can call this quantity the cosine of the angle between the two
vectors.
If T satisfies TT* = T*T, we call T normal. It turns out that normal
matrices are precisely the matrices that have an orthonormal system of
eigenvectors that span V.
Applications
Linear algebra provides the formal setting for the linear combination of
equations used in the Gaussian method. Suppose the goal is to find and
describe the solution(s), if any, of the following system of linear
equations:
The Gaussian-elimination algorithm is as follows: eliminate x from all
equations below L1, and then eliminate y from all equations below L2.
This will put the system into triangular form. Then, using
back-substitution, each unknown can be solved for.
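The two phases (elimination to triangular form, then back-substitution) can be sketched as follows. The 3 × 3 system used here is an assumed example chosen for illustration, not necessarily the system of the original text, and gaussian_solve is a hypothetical name. Exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def gaussian_solve(A, b):
    """Solve Ax = b for square A by elimination and back-substitution."""
    n = len(A)
    # Augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this unknown from all equations below.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # Back-substitution, starting from the last equation.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Assumed example:  2x + y - z = 8,  -3x - y + 2z = -11,  -2x + y + 2z = -3
solution = gaussian_solve([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])
print(solution)
```

For this assumed system the unknowns come out as x = 2, y = 3, z = -1, which can be checked by substituting back into the three equations.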
The result is:
Then, z can be substituted into L2, which can then be solved to obtain
Next, z and y can be substituted into L1, which can be solved to obtain
The solution of this system is characterized as follows: first, we find a
particular solution x0 of this equation using Gaussian elimination. Then,
we compute the solutions of Ax = 0; that is, we find the null space N of
A. The solution set of this equation is given by
x0 + N = {x0 + n : n ∈ N}. If the number of variables is equal to the
number of equations, then we can characterize when the system has a
unique solution: since N is trivial if and only if det A ≠ 0, the equation
has a unique solution if and only if det A ≠ 0.[19]
Least-squares best fit line
Fourier series are a representation of a function f: [−π, π] → R as a
trigonometric series:
functions have a Fourier series that converges to the function value at
most points.
Quantum mechanics
Point coordinates in the plane E are ordered pairs of real numbers, (x,y),
and a line is defined as the set of points (x,y) that satisfy the linear
equation[21]
or
The linear equation, λ, has the important property that if x1 and x2 are
homogeneous coordinates of points on the line, then the point αx1 + βx2
is also on the line, for any real α and β.
Now consider the equations of the two lines λ1 and λ2,
It is interesting to consider the case of three lines, λ1, λ2 and λ3, which
yield the matrix equation,
which in homogeneous form yields,
Clearly, this equation has the solution x = (0,0,0), which is not a point on
the z = 1 plane E. For a solution to exist in the plane E, the coefficient
matrix C must have rank 2, which means its determinant must be zero.
Another way to say this is that the columns of the matrix must be
linearly dependent.
or
This transformation has the important property that if Ay=d, then
This shows that the sum of vectors in E maps to the sum of their images
in R. This is the defining characteristic of a linear map, or linear
transformation.[21] For this case, where the image space is a real number,
the map is called a linear functional.[23]
Consider the linear functional a little more carefully. Let i=(1,0) and j
=(0,1) be the natural basis vectors on E, so that x=xi+yj. It is now
possible to see that
Thus, the columns of the matrix A are the image of the basis vectors of
E in R.
where Av=d and Aw=e are the images of the basis vectors v and w. This
is written in matrix form as
Coordinates relative to a basis
These coordinate functionals have the properties,
Inverse image
The set of points in the plane E that map to the same image in R under
the linear functional λ define a line in E. This line is the image of the
inverse map, λ−1: R→E. This inverse image is the set of the points x=(x,
y) that solve the equation,
In order to solve the equation, we first recognize that only one of the two
unknowns (x,y) can be determined, so we select y to be determined, and
rearrange the equation
Solve for y and obtain the inverse image as the set of points,
The vector p defines the intersection of the line with the y-axis, known
as the y-intercept. The vector h satisfies the homogeneous equation,
The set of points of a linear functional that map to zero define the kernel
of the linear functional. The line can be considered to be the set of points
h in the kernel translated by the vector p.[21][23]
Sources of subspaces: kernels and ranges of linear transformations
ker(T)={A in V | T(A)=0}
The range of T is the set of all vectors in W which are images of some
vectors in V, that is
Examples.
Indeed, the vectors which are perpendicular to L and only these vectors
are annihilated by the projection. This proves the statement about the
kernel. The projection on L of every vector is parallel to L (by the
definition of projection) and conversely, every vector which is parallel to
L is the projection of some vector from R2, for example, it is the
projection of itself. This proves the statement about the range.
the whole P (every polynomial is a derivative of another polynomial)
and the kernel of T is the set of all constants (prove it!).
BASIS
Definition of Basis
Let V be a vector space and S = {v1, v2, ... , vk} be a subset of V. Then
S is a basis for V if the following two statements are true.
1. S spans V.
2. S is linearly independent.
Example
Let V = Rn and let S = {e1, e2, ... ,en} where ei has ith component equal
to 1 and the rest 0. For example
e2 = (0,1,0,0,...,0)
Example
Let V = P3 and let S = {1, t, t^2, t^3}. Show that S is a basis for V.
Solution
Linear Independence:
Let
c1(1) + c2(t) + c3(t^2) + c4(t^3) = 0
Then since a polynomial is zero if and only if its coefficients are all zero
we have
c1 = c2 = c3 = c4 = 0
Span
a + bt + ct^2 + dt^3
We just let
c1 = a, c2 = b, c3 = c, c4 = d
Hence S spans V.
In general the basis {1, t, t^2, ... , t^n} is called the standard basis for Pn.
Example
Solution
Linear Independence
We write
c1 + c2 + c3 + 2c4 = 0
2c1 + c4 = 0
2c2 = 0
2c3 = 0
Ac = 0
with
We have
det A = -12
Since the determinant is nonzero, we can conclude that only the trivial
solution exists. That is
c1 = c2 = c3 = c4 = 0
Span
We set
which gives the equations
c1 + c2 + c3 + 2c4 = x1
2c1 + c4 = x2
2c2 = x3
2c3 = x4
Ac = x
c = A-1x
Hence S spans V.
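The determinant quoted in this example can be checked with a short sketch. The matrix A below is read off from the coefficients of the four equations above (an assumption, since the printed matrix is not reproduced here), and det is a hypothetical helper that uses cofactor expansion along the first row.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Coefficients of c1..c4 in the four equations of the example.
A = [[1, 1, 1, 2],
     [2, 0, 0, 1],
     [0, 2, 0, 0],
     [0, 0, 2, 0]]
print(det(A))
```

A non-zero result confirms both halves of the argument at once: Ac = 0 has only the trivial solution, and Ac = x can be solved for every x.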
Theorem
Let S = {v1, v2, ... , vk} be a basis for V. Then every vector in V can be
written uniquely as a linear combination of vectors in S.
Proof
Suppose that
then
a1 - b1 = ... = an - bn = 0
a1 = b1, ... , an = bn
vectors is a linear combination of the rest. Without loss of generality,
we can assume that this is the vector vk. We have that
If S' is not a basis, as above we can get rid of another vector. We can
continue this process until the vectors are finally linearly independent.
Theorem
Let S span a vector space V, then there is a subset of S that is a basis for
V.
Dimension
We have seen that any vector space that contains at least two vectors
contains infinitely many. It is uninteresting to ask how many vectors
there are in a vector space. However there is still a way to measure the
size of a vector space. For example, R3 should be larger than R2. We
call this size the dimension of the vector space and define it as the
number of vectors that are needed to form a basis. To show that the
dimension is well defined, we need the following theorem.
Theorem
If S = {v1, v2, ... , vn} is a basis for a vector space V and T = {w1, w2,
... , wk} is a linearly independent set of vectors in V, then k ≤ n.
Remark: If S and T are both bases for V then k = n. This says that
every basis has the same number of vectors. Hence the dimension is
well defined.
Proof
Example
Since
Example
dim Pn = n + 1
since
Example
dim Mmxn = mn
with k < n, then S is not a basis. From the definition of basis, S does not
span V, hence there is a vk+1 such that vk+1 is not in the span of S. Let
Theorem
Let
vk+1, ... , vn
such that
is a basis for V.
We finish this discussion with some very good news. We have seen that
to find out if a set is a basis for a vector space, we need to check for both
linear independence and span. We know that if there are not the right
number of vectors in a set, then the set cannot form a basis. If the
number is the right number we have the following theorem.
Theorem
1. S is a basis for V.
2. S is linearly independent.
3. S spans V.
6.3.LINEAR DEPENDENCE
There are many situations when we might wish to know whether a set of
vectors is linearly dependent, that is if one of the vectors is some
combination of the others.
Two vectors u and v are linearly independent if the only numbers x and
y satisfying xu+yv=0 are x=y=0. If we let
If u and v are linearly independent, then the only solution to this system
of equations is the trivial solution, x=y=0. For homogeneous systems
this happens precisely when the determinant is non-zero. We have now
found a test for determining whether a given set of vectors is linearly
independent: a set of n vectors, each with n components, is linearly
independent if the matrix with these vectors as columns has a non-zero
determinant. The set is of course dependent if the determinant is zero.
Example
The vectors <1,2> and <-5,3> are linearly independent since the matrix
has a non-zero determinant.
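The determinant test can be sketched in Python. The function names are hypothetical, and the sketch assumes exactly n vectors with n components each, as the test requires.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] *
               det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_linearly_independent(vectors):
    """Determinant test for n vectors with n components each."""
    n = len(vectors)
    if any(len(v) != n for v in vectors):
        raise ValueError("need n vectors with n components each")
    # Place the vectors as the columns of a square matrix.
    matrix = [[v[i] for v in vectors] for i in range(n)]
    return det(matrix) != 0

print(is_linearly_independent([(1, 2), (-5, 3)]))   # the example above
print(is_linearly_independent([(1, 2), (2, 4)]))    # second is twice the first
```

For the vectors <1,2> and <-5,3> the determinant is 1·3 − (−5)·2 = 13, so the test reports independence.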
Example
in row-reduced form is
Thus, y=-3z and 2x=-3y-5z=-3(-3z)-5z=4z which implies
0=xu+yv+zw=2zu-3zv+zw or equivalently w=-2u+3v. A quick
arithmetic check verifies that the vector w is indeed equal to -2u+3v.
CHAPTER 7: MATRICES
Matrices
A Matrix
(This one has 2 Rows and 3 Columns)
Adding
These are the calculations:
3+4=7 8+0=8
4+1=5 6-9=-3
The two matrices must be the same size, i.e. the rows must match in
size, and the columns must match in size.
But it could not be added to a matrix with 3 rows and 4 columns (the
columns don't match in size)
Negative
These are the calculations:
-(2)=-2 -(-4)=+4
-(7)=-7 -(10)=-10
Subtracting
3-4=-1 8-0=8
4-1=3 6-(-9)=15
Multiply by a Constant
These are the calculations:
2×4=8 2×0=0
2×1=2 2×-9=-18
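The entry-by-entry calculations above can be reproduced with a short sketch. The matrices A and B below are reconstructed from the listed calculations (an assumption about the layout of the original figures), matrices are stored as lists of rows, and the function names are hypothetical.

```python
def mat_add(A, B):
    """Add two matrices of the same size, entry by entry."""
    if len(A) != len(B) or any(len(r) != len(s) for r, s in zip(A, B)):
        raise ValueError("matrices must be the same size")
    return [[x + y for x, y in zip(r, s)] for r, s in zip(A, B)]

def mat_scale(k, A):
    """Multiply every entry of A by the constant k."""
    return [[k * x for x in row] for row in A]

def mat_neg(A):
    """Negative of a matrix: the negative of every entry."""
    return mat_scale(-1, A)

def mat_sub(A, B):
    """Subtract by adding the negative."""
    return mat_add(A, mat_neg(B))

A = [[3, 8], [4, 6]]
B = [[4, 0], [1, -9]]
print(mat_add(A, B))     # 3+4, 8+0, 4+1, 6-9
print(mat_sub(A, B))     # 3-4, 8-0, 4-1, 6-(-9)
print(mat_scale(2, B))   # 2×4, 2×0, 2×1, 2×-9
```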
Dividing
And there are special ways to find the Inverse ...
Transposing
Notation
• Rows go left-right
• Columns go up-down
To remember that rows come before columns use the word "arc":
ar,c
Example:
B=
For the general case, where A is an n×n matrix the determinant is given
by:
Where the coefficients αij are given by the relation:
where βij is the determinant of the (n-1) × (n-1) matrix that is obtained
by deleting row i and column j. This coefficient αij is also called the
cofactor of aij.
Take for example an arbitrary 2×2 matrix A whose determinant (ad − bc)
is not equal to zero.
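For the 2×2 case the inverse can be written down directly. The sketch below assumes A has rows (a, b) and (c, d) and uses the standard formula in which the adjugate, with rows (d, −b) and (−c, a), is divided by the determinant ad − bc. The function name inverse_2x2 is hypothetical.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of [[a, b], [c, d]] as adj(A) / det(A), with exact fractions."""
    (a, b), (c, d) = A
    determinant = a * d - b * c
    if determinant == 0:
        raise ValueError("matrix is not invertible: ad - bc = 0")
    return [[Fraction(d, determinant), Fraction(-b, determinant)],
            [Fraction(-c, determinant), Fraction(a, determinant)]]

inv = inverse_2x2([[4, 7], [2, 6]])   # determinant 4*6 - 7*2 = 10
print(inv)
```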
Where the adj(A) denotes the adjoint (or adjugate) of a matrix. It can be
calculated by the following method:
The unknowns are denoted by x1, x2, ..., xn and the coefficients (a and b
above) are assumed to be given. In matrix form the system of equations
above can be written as:
or alternatively
From the above it is clear that the existence of a solution depends on the
value of the determinant of A. There are three cases:
1. If det(A) does not equal zero then a unique solution exists, given by x = A^-1 b.
2. If det(A) is zero and b≠0 then the solution either is not
unique or does not exist.
3. If det(A) is zero and b=0 then x = 0 is a solution, but it is
not unique.
and by rearranging we would get that the solution would look like
Augmented matrices
2x + 3y – z = 6
–x – y – z = 9
x + y + 6z = 0
Write down the coefficients and the answer values, including all
"minus" signs. If there is "no" coefficient, then the coefficient is
"1".
That is, given a system of (linear) equations, you can relate to it the
matrix (the grid of numbers inside the brackets) which contains the
coefficients of the linear system together with the values from the
right-hand sides. This is called "an augmented matrix":
the grid containing the coefficients from the left-hand side of each
equation has been "augmented" with the answers from the right-hand
side of each equation.
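As a sketch, the augmented matrix for the system above can be built by appending each right-hand-side value to its row of coefficients. The function name augmented is hypothetical, and the matrix is stored as a list of rows.

```python
def augmented(coefficients, answers):
    """Append each right-hand-side value to its row of coefficients."""
    if len(coefficients) != len(answers):
        raise ValueError("one answer value is needed per equation")
    return [list(row) + [value] for row, value in zip(coefficients, answers)]

# 2x + 3y - z = 6,  -x - y - z = 9,  x + y + 6z = 0
coefficients = [[2, 3, -1],
                [-1, -1, -1],
                [1, 1, 6]]
answers = [6, 9, 0]
for row in augmented(coefficients, answers):
    print(row)
```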
The entries of (that is, the values in) the matrix correspond to the x-, y-
and z-values in the original system, as long as the original system is
arranged properly in the first place. Sometimes, you'll need to rearrange
terms or insert zeroes as place-holders in your matrix.
x+y=0
y+z=3
z–x=2
x+y =0
y+z=3
–x +z=2
When forming the augmented matrix, use a zero for any entry where the
corresponding spot in the system of linear equations is blank.
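The zero place-holders can be produced automatically if each equation is stored as a mapping from variable name to coefficient. This is only a sketch; the dictionary representation and the name augmented_row are assumptions made for illustration.

```python
def augmented_row(terms, variables, answer):
    """One row of the augmented matrix; missing variables get a 0."""
    return [terms.get(name, 0) for name in variables] + [answer]

variables = ["x", "y", "z"]
system = [
    ({"x": 1, "y": 1}, 0),    # x + y = 0
    ({"y": 1, "z": 1}, 3),    # y + z = 3
    ({"z": 1, "x": -1}, 2),   # z - x = 2
]
matrix = [augmented_row(terms, variables, answer) for terms, answer in system]
for row in matrix:
    print(row)
```

Note that the third equation is rearranged automatically, because each row is filled in the fixed variable order x, y, z.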
Coefficient matrices
If you form the matrix only from the coefficient values, the matrix
would look like this:
x + 3y = 4
2y – z = 5
3x + z = –2
The Size of a matrix
Matrices are often referred to by their sizes. The size of a matrix is given
in the form of a dimension, much as a room might be referred to as "a
ten-by-twelve room". The dimensions for a matrix are the rows and
columns, rather than the width and length. For instance, consider the
following matrix A:
The rows go side to side; the columns go up and down. "Row" and
"column" are technical terms, and are not interchangeable. Matrix
dimensions are always given with the number of rows first, followed by
the number of columns. Following this convention, the following matrix
B:
...is 2 × 3. If the matrix has the same number of rows as columns, the
matrix is said to be a "square" matrix. For instance, the coefficient
matrix from above:
Multiplication of Matrices
0.6 I + 0.3 S
0.4 I + 0.7 S
So after one year the table which gives the two populations is
then the populations after one year are given by the formula
Combining this formula with the above result, we get
In fact, we do not need to have two matrices of the same size to multiply
them. Above, we did multiply a (2x2) matrix with a (2x1) matrix (which
gave a (2x1) matrix). In fact, the general rule says that in order to
perform the multiplication AB, where A is a (mxn) matrix and B a (kxl)
matrix, then we must have n=k. The result will be a (mxl) matrix. For
example, we have
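The size rule can be sketched in Python. The matrices below are assumed examples, not those of the original text, and mat_mul is a hypothetical name. Matrices are stored as lists of rows.

```python
def mat_mul(A, B):
    """Product AB of an (m x n) matrix A and a (k x l) matrix B; needs n == k."""
    n, k = len(A[0]), len(B)
    if n != k:
        raise ValueError(f"cannot multiply: A has {n} columns, B has {k} rows")
    # Entry (i, j) is the sum over t of A[i][t] * B[t][j].
    return [[sum(A[i][t] * B[t][j] for t in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]      # (2x2)
B = [[5], [6]]            # (2x1)
print(mat_mul(A, B))      # a (2x1) result

try:
    mat_mul(B, A)         # (2x1) times (2x2): sizes do not match
except ValueError as error:
    print(error)
```

The second call fails because B has one column while A has two rows, which illustrates that AB being defined does not make BA defined.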
Remember that though we were able to perform the above
multiplication, it is not possible to perform the multiplication
We have
and
Algebraic Properties of Matrix Operations
In this page, we give some general results about the three operations:
addition, multiplication, and multiplication with numbers, called scalar
multiplication.
1.
A+B = B+A
2.
(A+B)+C = A + (B+C)
3.
where is the mxn zero-matrix (all its entries are equal to 0);
4.
1.
(AB)C = A (BC)
Note, for example, that if A is 2x3, B is 3x3, and C is 3x1, then the
above products are possible (in this case, (AB)C is 2x1 matrix).
2.
3.
4.
Note that is the nxk zero-matrix. So if n is different from m,
the two zero-matrices are different.
1.
(A+B)C = AC + BC
and
A(B+C) = AB + AC
2.
and
Example. Consider the matrices
Evaluate (AB)C and A(BC). Check that you get the same matrix.
Answer. We have
so
so
It is easy to check that
and
In particular, we have
The matrix In has similar behavior as the number 1. Indeed, for any nxn
matrix A, we have
A In = In A = A
The identity matrix behaves like the number 1 not only among the
matrices of the form nxn. Indeed, for any nxm matrix A, we have
In particular, we have
Invertible Matrices
where In is the identity matrix. The matrix B is called the inverse matrix
of A.
Example. Let
Example. Find the inverse of
Write
Since
we get
Easy algebraic manipulations give
or
If A and B are invertible matrices, then AB is also invertible and
(AB)^{-1} = B^{-1} A^{-1}.
its diagonal consists of a, e, and k. In general, if A is a square matrix of
order n and if a_{ij} is the number in the ith row and jth column, then the
diagonal is given by the numbers a_{ii}, for i=1,...,n.
are lower-triangular. Now consider the two matrices
we have
Properties of the Transpose operation. If X and Y are mxn matrices
and Z is an nxk matrix, then
1.
(X+Y)^T = X^T + Y^T
2.
(XZ)^T = Z^T X^T
3.
(X^T)^T = X
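The second property reverses the order of the factors. A small Python check on hypothetical matrices `X` (2x3) and `Z` (3x2) illustrates it; the helper names are assumptions, not part of the text.

```python
def transpose(M):
    """Interchange the rows and columns of M."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Product of A and B, given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


X = [[1, 2, 3], [4, 5, 6]]      # hypothetical 2x3 matrix
Z = [[1, 0], [2, 1], [0, 3]]    # hypothetical 3x2 matrix

lhs = transpose(matmul(X, Z))               # (XZ)^T
rhs = matmul(transpose(Z), transpose(X))    # Z^T X^T
print(lhs == rhs)
```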
are diagonal matrices. Identity matrices are examples of diagonal
matrices. Diagonal matrices play a crucial role in matrix theory. We will
see this later on.
for n=1,2,....
Answer. We have
and
By induction, one may easily show that
for n=1,2,..
In particular, we have
Elementary Operations for Matrices
Its rows are
As we can see, the transpose of the columns of A are the rows of A^T. So
the transpose operation interchanges the rows and the columns of a
matrix. Therefore many techniques which are developed for rows may
be easily translated to columns via the transpose operation. Thus, we
will only discuss elementary row operations, but the reader may easily
adapt these to columns.
Definition. Two matrices are row equivalent if and only if one may be
obtained from the other one via elementary row operations.
Answer. We start with A. If we keep the second row and add the first to
the second, we get
We keep the first row. Then we subtract the first row from the second
one multiplied by 3. We get
We keep the first row and subtract the first row from the second one. We
get
One powerful use of elementary operations consists in finding solutions
to linear systems and the inverse of a matrix. This happens via Echelon
Form and Gauss-Jordan Elimination. In order to appreciate these two
techniques, we need to discuss when a matrix is row elementary
equivalent to a triangular matrix. Let us illustrate this with an example.
First we will transform the first column via elementary row operations
into one with the top number equal to 1 and the bottom ones equal 0.
Indeed, if we interchange the first row with the last one, we get
Next, we keep the first and last rows. And we subtract the first one
multiplied by 2 from the second one. We get
We are almost there. Looking at this matrix, we see that we can still take
care of the 1 (from the last row) under the -2. Indeed, if we keep the first
two rows and add the second one to the last one multiplied by 2, we get
1.
any row consisting of zeros is below any row that contains at least
one nonzero number;
2.
the first (from left to right) nonzero entry of any row is to the left
of the first nonzero entry of any lower row.
Now if we make sure that the first nonzero entry of every row is 1, we
get a matrix in row echelon form. For example, the matrix above is not
in echelon form. But if we divide the second row by -2, we get
Matrix Exponential
When n gets large, this sequence of matrices gets closer and closer to a
certain matrix. This is not easy to show; it relies on the conclusion on e^x
above. We write this limit matrix as e^A. This notation is natural due to
the properties of this matrix. Thus we have the formula
At this point, the reader may feel a little lost about the definition above.
To make this stuff clearer, let us discuss an easy case: diagonal matrices.
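The diagonal case can also be explored numerically. The sketch below assumes that the sequence in question is the partial sums I + A + A^2/2! + ... + A^n/n! (the original formula is not reproduced here), and uses a hypothetical diagonal matrix `D`.

```python
import math


def matmul(A, B):
    """Product of A and B, given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def exp_partial_sum(A, terms):
    """Return I + A + A^2/2! + ... + A^terms/terms! for a square matrix A."""
    n = len(A)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        # After this step, term holds A^k / k!.
        term = [[v / k for v in row] for row in matmul(term, A)]
        total = [[s + t for s, t in zip(rs, rt)] for rs, rt in zip(total, term)]
    return total


D = [[1.0, 0.0], [0.0, 2.0]]     # hypothetical diagonal matrix
approx = exp_partial_sum(D, 30)
print(approx, math.exp(1), math.exp(2))
```

For a diagonal matrix the partial sums stay diagonal, and each diagonal entry approaches the ordinary exponential of the corresponding entry of D.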
Using the above properties of the exponential function, we deduce that
B = P^{-1} A P.
Moreover, we have
B^n = P^{-1} A^n P
This clearly implies that
e^B = P^{-1} e^A P.
This matrix is upper-triangular. Note that all the entries on the diagonal
are 0. These types of matrices have a nice property. Let us discuss this
for this example. First, note that
In this case, we have
As we said before, the reasons for using the exponential notation for
matrices reside in the following properties:
1.
2.
e^{A+B} = e^A e^B (valid when A and B commute, i.e. AB = BA);
3.
There are many ways to encrypt a message. And the use of coding has
become particularly significant in recent years (due to the explosion of
the internet for example). One way to encrypt or code a message uses
matrices and their inverse. Indeed, consider a fixed invertible matrix A.
Convert the message into a matrix B such that AB is possible to
perform. Send the message generated by AB. At the other end, they will
need to know A^{-1} in order to decrypt or decode the message sent. Indeed,
we have A^{-1}(AB) = (A^{-1}A)B = B.
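The round trip can be sketched as follows. Everything specific here is an assumption made for illustration: the 2x2 key `A`, the letter coding A=1, ..., Z=26, and the helper names. The key was chosen so that both A and its inverse have integer entries.

```python
A = [[1, 2], [1, 3]]          # hypothetical key, det(A) = 1
A_INV = [[3, -2], [-1, 1]]    # its inverse, also with integer entries


def matmul(P, Q):
    """Product of P and Q, given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


def text_to_matrix(text):
    """Code letters as 1..26 and place consecutive pairs in columns."""
    nums = [ord(ch) - ord("A") + 1 for ch in text]
    if len(nums) % 2:
        nums.append(0)        # pad with 0 so the last column is full
    return [nums[0::2], nums[1::2]]


def matrix_to_text(B):
    """Read the columns back into letters, skipping the padding 0."""
    nums = [v for pair in zip(*B) for v in pair]
    return "".join(chr(v + ord("A") - 1) for v in nums if v != 0)


B = text_to_matrix("HELP")
coded = matmul(A, B)                 # what is sent
decoded = matmul(A_INV, coded)       # A^{-1}(AB) = B
print(coded, matrix_to_text(decoded))
```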
matrix involves fractions which are not easy to send in an electronic
form. The best is to have both A and its inverse with integers as their
entries. In fact, we can use our previous knowledge to generate such
First we keep the first row and add it to the second as well as to the third
rows. We obtain
Next we keep the first row again, we add the second to the third, and
finally add the last one to the first multiplied by -2. We obtain
This is our matrix A. Easy calculations will give det(A) = -1, which we
knew since the above elementary operations did not change the
determinant from the original triangular matrix which obviously has -1
as its determinant. We leave the details of the calculations to the reader.
The inverse of A is
Now we rearrange these numbers into a matrix B. For example, we have
Then we perform the product AB, where A is the matrix found above.
We get
We will write
In particular, we have
Algebraic Properties of
1.
-M_{a,b} = M_{-a,-b}.
2.
Multiplication by a number: We have
2.
The above properties give a very nice structure. The next natural
question to ask, in this case, is whether a nonzero element of is
invertible. Indeed, for any real numbers a and b, we have
which implies
M_{a,b} ÷ M_{c,d} = M_{(ac+bd)/(c^2+d^2), -(ad-bc)/(c^2+d^2)}
The matrix M_{a,-b} is called the conjugate of M_{a,b}. Note that the conjugate
of the conjugate of M_{a,b} is M_{a,b} itself.
Note that
a + bi
A lot can be said about , but we will advise you to visit the page on
complex numbers.
Addition
If A and B above are matrices of the same type then the sum is found by
adding the corresponding elements a_{ij} + b_{ij}.
Here is an example of adding A and B together.
Subtraction
If A and B are matrices of the same type then the difference is found by
subtracting the corresponding elements a_{ij} − b_{ij}.
Matrix Multiplication
Now let's look at the general case, where A has dimensions m×n and B
has dimensions n×p. Then the product of A and B is the matrix C, which
has dimensions m×p. The ijth element of matrix C is found by
multiplying the entries of the ith row of A with the corresponding entries
in the jth column of B and summing the n terms. The elements of C are:
c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + ... + a_{in} b_{nj}
Transpose of Matrices
In the case of a square matrix (m = n), the transpose can be used to
check if a matrix is symmetric. For a symmetric matrix, A = A^T.
The unknowns are denoted by x1, x2, ..., xn and the coefficients (a and b
above) are assumed to be given. In matrix form the system of equations
above can be written as:
A simplified way of writing above is like this: Ax = b
or alternatively
From the above it is clear that the existence of a solution depends on the
value of the determinant of A. There are three cases:
1. If det(A) does not equal zero, then a unique solution exists and is
given by x = A^{-1} b.
2. If det(A) is zero and b ≠ 0, then the solution is either not
unique or does not exist.
3. If det(A) is zero and b = 0, then x = 0 is a solution, but it is
not unique.
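The three cases can be illustrated for a 2x2 system. This is a sketch only: the function name `solve_2x2`, the status labels, and the sample system are hypothetical, and the unique solution is computed from the explicit 2x2 formula for A^{-1} b.

```python
from fractions import Fraction


def solve_2x2(A, b):
    """Classify the system Ax = b for a 2x2 matrix A using det(A)."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    if det != 0:
        # x = A^{-1} b, written out entry by entry.
        x1 = Fraction(b[0] * a22 - a12 * b[1], det)
        x2 = Fraction(a11 * b[1] - b[0] * a21, det)
        return "unique", (x1, x2)
    if b[0] == 0 and b[1] == 0:
        return "not unique", None          # x = 0 is one of many solutions
    return "none or not unique", None


print(solve_2x2([[2, 1], [1, 3]], [3, 5]))
print(solve_2x2([[1, 2], [2, 4]], [0, 0]))
print(solve_2x2([[1, 2], [2, 4]], [1, 0]))
```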
and by rearranging we would get that the solution would look like
Now try solving your own two equations with two unknowns.
Written in matrix form would look like
and by rearranging we would get that the solution would look like
Now try solving your own three equations with three unknowns.
Cramer's Rule
The first term x1 above can be found by replacing the first column of A
Similarly for the general case for solving xr we replace the rth column of
or simplified:
ax + by + cz = p
dx + ey + fz = q
gx + hy + iz = r
Let us do some elementary row operations. First subtract the first row
from the second row, and subtract 3 times the first row from the third row, that is
to get
Next take to get
If , then we have
so the system in this case is not consistent.
Next take to get
Write . By definition we have . The first
condition translates into
or
Consider the augmented matrix of this system
So we have
Problem. Use linear systems to find out if there exists a parabola that
The parabola will have the form . The three points will
be on the parabola if and only if
The existence of such a parabola translates into the consistency of the
above system. Consider the augmented matrix
From here, we can conclude that the parabola does exist. To find it, let
to get
Finally take
Hence
or .
Answer.
Set
Then we have
and
So
INTRODUCTION TO DETERMINANTS
1.
This is interesting since it implies that whenever we use rows, a
similar behavior will result if we use columns. In particular we will
see how row elementary operations are helpful in finding the
determinant. Therefore, we have similar conclusions for
elementary column operations.
2.
3.
4.
In particular, if all the entries in one row are zero, then the
determinant is zero.
5.
6.
We have
), then
Example. Evaluate
Let us transform this matrix into a triangular one through elementary
operations. We will keep the first row and add to the second one the first
multiplied by . We get
Therefore, we have
1.
2.
3.
4.
5.
6. We have
), then
So let us see how this works in case of a matrix of order 4.
Example. Evaluate
We have
We do not touch the first row and work with the other rows. We
interchange the second with the third to get
If we subtract every row multiplied by the appropriate number from the
second row, we get
These calculations seem to be rather lengthy. We will see later on that a
general formula for the determinant does exist.
Example. Evaluate
Example. Evaluate
We have
the column number j, for and . For any i and j,
set A_{ij} (called the cofactors) to be the determinant of the square matrix
of order (n-1) obtained from A by removing the row number i and the
column number j multiplied by (-1)^{i+j}. We have
for any fixed j. In other words, we have two types of formulas: along a
row (number i) or along a column (number j). Any row or any column
will do. The trick is to use a row or a column which has a lot of zeros.
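A recursive expansion along the first row can be sketched in Python as follows. The function name `det` and the sample matrix are hypothetical; the code skips zero entries, which is exactly why a row with many zeros is cheap to expand along.

```python
def det(M):
    """Determinant of a square matrix by cofactor expansion along row 1."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        if M[0][j] == 0:
            continue                      # a zero entry contributes nothing
        # Remove row 1 and column j+1 to get the matrix of order n-1.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        # With 0-based j, the sign (-1)^(i+j) for the first row is (-1)^j.
        total += (-1) ** j * M[0][j] * det(minor)
    return total


print(det([[1, 2, 3], [0, 4, 5], [1, 0, 6]]))
```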
In particular, we have along the rows
or
or
Example. Evaluate
We will use the general formula along the third row. We have
DETERMINANT AND INVERSE OF MATRICES
Example. Let
We have
Is this formula only true for this matrix, or does a similar formula exist
for any square matrix? In fact, we do have a similar formula.
In particular, if , then
which gives
On the next page, we will discuss the application of the above formulas
to linear systems.
APPLICATION OF DETERMINANT TO SYSTEMS: CRAMER'S
RULE
AX=B
where xi are the unknowns of the system or the entries of X, and the
matrix Ai is obtained from A by replacing the ith column by the column
B. In other words, we have
Answer. First note that
which implies that the matrix coefficient is invertible. So we may use the
Cramer's formulas. We have
models, and calculating powers of matrices (in order to define the
exponential matrix). Other areas such as physics, sociology, biology,
economics and statistics have focused considerable attention on
"eigenvalues" and "eigenvectors"-their applications and their
computations. Before we give the formal definition, let us introduce
these concepts on an example.
We have
In other words, we have
Next consider the matrix P for which the columns are C1, C2, and C3,
i.e.,
Next we evaluate the matrix P-1AP. We leave the details to the reader to
check that we have
In other words, we have
for . Note that it is almost impossible to find A^{75} directly
from the original form of A.
From now on, we will call column matrices vectors. So the above
column matrices C1, C2, and C3 are now vectors. We have the following
definition.
for any number .
where
Computation of Eigenvalues
The equation translates into
n roots or solutions. So a square matrix A of order n will not have more
than n eigenvalues.
This result is valid for any diagonal matrix of any size. So depending on
the values you have on the diagonal, you may have one eigenvalue, two
eigenvalues, or more. Anything is possible.
Remark. It is quite amazing to see that any square matrix A has the
same eigenvalues as its transpose AT because
the characteristic polynomial is given by the equation
The number (a+d) is called the trace of A (denoted tr(A)), and clearly
the number (ad-bc) is the determinant of A. So the characteristic
polynomial of A can be rewritten as
We have
This equation is known as the Cayley-Hamilton theorem. It is true for
any square matrix A of any order, i.e.
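For a 2x2 matrix, the Cayley-Hamilton identity takes the standard form A^2 - tr(A) A + det(A) I_2 = 0. The sketch below checks this on a hypothetical matrix; the function name `cayley_hamilton_residual` is an assumption made for illustration.

```python
def cayley_hamilton_residual(A):
    """Return A^2 - tr(A) A + det(A) I_2 for a 2x2 matrix A."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    A2 = [[a * a + b * c, a * b + b * d],
          [c * a + d * c, c * b + d * d]]
    identity = [[1, 0], [0, 1]]
    return [[A2[i][j] - trace * A[i][j] + det * identity[i][j]
             for j in range(2)]
            for i in range(2)]


print(cayley_hamilton_residual([[1, 2], [3, 4]]))   # expected to be the zero matrix
```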
1.
2.
3.
4.
If is any number, then is an eigenvalue of .
5.
Computation of Eigenvectors
Let us start with an example.
which implies that the eigenvalues of A are 0, -4, and 3.
Next we look for the eigenvectors.
1.
Many ways may be used to solve this system. The third equation is
identical to the first. Since, from the second equation, we have y =
6x, the first equation reduces to 13x + z = 0. So this system is
equivalent to
Therefore, any eigenvector X of A associated to the eigenvalue 0 is
given by
2.
In this case, we will use elementary operations to solve it. First we
Next, we use the first row to eliminate the 5 and 6 on the first
column. We obtain
If we cancel the 8 and 9 from the second and third row, we obtain
Next, we set z = c. From the second row, we get y = 2z = 2c. The
first row will imply x = -2y+3z = -c. Hence
2.
Case : The details for this case will be left to the reader.
Using similar ideas as the one described above, one may easily
show that any eigenvector X of A associated to the eigenvalue 3 is
given by
where c is an arbitrary number.
Remark. In general, the eigenvalues of a matrix are not all distinct from
each other (see the page on the eigenvalues for more details). In the next
two examples, we discuss this problem.
which may be rewritten by
Clearly, the third equation is identical to the first one which is also a
multiple of the second equation. In other words, this system is equivalent
to the system reduced to one equation
2x+y + 2z= 0.
To solve it, we need to fix two of the unknowns and deduce the third
Example. Consider the matrix
Hence the matrix A has one eigenvalue, i.e. -3. Let us find the associated
eigenvectors. These are given by the linear system
x - y = 0.
Let us summarize what we did in the above examples.
1.
2.
3.
Its characteristic equation is given by
This is a quadratic equation. The nature of its roots (which are the
eigenvalues of A) depends on the sign of the discriminant
Remark. Note that the matrix A will have one eigenvalue, i.e. one
double root, if and only if the discriminant is zero. But this is possible only if a=c and
b=0. In other words, we have
A = a I_2.
First let us convince ourselves that there exist matrices with complex
eigenvalues.
The characteristic equation is given by
A X = (1+2i) X
In fact the two equations are identical since (2+2i)(2-2i) = 8. So the
system reduces to one equation
(1-i)x - y = 0.
Remark. It is clear that one should expect to have complex entries in the
eigenvectors.
We have seen that (1-2i) is also an eigenvalue of the above matrix. Since
the entries of the matrix A are real, then one may easily show that if is
a complex eigenvalue, then its conjugate is also an eigenvalue.
Moreover, if X is an eigenvector of A associated to , then the vector
, obtained from X by taking the complex-conjugate of the entries of
X, is an eigenvector associated to . So the eigenvectors of the above
matrix A associated to the eigenvalue (1-2i) are given by
where c is an arbitrary number.
1.
2.
3.
4.
In general, it is normal to expect that a square matrix with real entries
may still have complex eigenvalues. One may wonder if there exists a
class of matrices with only real eigenvalues. This is the case for
symmetric matrices. The proof is very technical and will be discussed in
another page. But for square matrices of order 2, the proof is quite easy.
Let us give it here for the sake of completeness.
This is a quadratic equation. The nature of its roots (which are the
eigenvalues of A) depends on the sign of the discriminant
\Delta = (a+c)^2 - 4(ac - b^2) = (a-c)^2 + 4b^2,
where A is the symmetric matrix with rows (a, b) and (b, c).
Therefore, \Delta is a nonnegative number, which implies that the eigenvalues of
A are real numbers.
Remark. Note that the matrix A will have one eigenvalue, i.e. one
double root, if and only if \Delta = 0. But this is possible only if a=c and
b=0. In other words, we have
A = a I_2.
Diagonalization
i.e is similar to . So they have the same characteristic
equation. Hence A and D have the same eigenvalues. Since the
eigenvalues of D are the numbers on the diagonal, and the only
eigenvalue of A is 2, we must have D = 2 I_2.
In this case, we must have A = P^{-1} D P = 2 I_2, which is not the case.
Therefore, A is not similar to a diagonal matrix.
1.
2.
3.
Then solve it. We should find the unknown vector X as a linear
combination of vectors, i.e.
4.
is a diagonal matrix with diagonal entries equal to the eigenvalues
of A. The position of the vectors Cj in P is identical to the position
of the associated eigenvalue on the diagonal of D. This identity
implies that A is similar to D. Therefore, A is diagonalizable.
5.
1.
So -1 is an eigenvalue with multiplicity 2 and -2 with multiplicity
1.
2.
which reduces to the system
Set
Then
But if we set
then
Set
then
Hence A = P D P^{-1}. Set
Then we have
B^3 = A.
and
are systems of two equations with two unknowns (x and y), while
answer that, let x be the number of eggs, y the amount of milk (in cups),
and z the amount of orange juice (in cups). Then we need to have
the solutions of the linear systems are close enough to the solutions of
the nonlinear systems. We will not discuss this here. Instead, we will
focus our attention on linear systems.
ax+by+cz+dw=h
For example,
and
are linear systems, while
Set the matrices
As you can see this is far nicer than the equations. But sometimes it is
worthwhile to solve the system directly without going through the matrix
form. The matrix A is called the matrix coefficient of the linear system.
The matrix C is called the nonhomogeneous term. When C = 0, the
linear system is homogeneous. The matrix X is the unknown matrix. Its
entries are the unknowns of the linear system. The augmented matrix
associated with the system is the matrix [A|C], where
In general if the linear system has n equations with m unknowns, then
the matrix coefficient will be a nxm matrix and the augmented matrix an
nx(m+1) matrix. Now we turn our attention to the solutions of a system.
The idea is to keep the first equation and work on the last two. In doing
that, we will try to kill one of the unknowns and solve for the other two.
For example, if we keep the first and second equation, and subtract the
first one from the last one, we get the equivalent system
Next we keep the first and the last equation, and we subtract the first
from the second. We get the equivalent system
Now we focus on the second and the third equation. We repeat the same
procedure. Try to kill one of the two unknowns (y or z). Indeed, we keep
the first and second equation, and we add the second to the third after
multiplying it by 3. We get
This obviously implies z = -2. From the second equation, we get y = -2,
and finally from the first equation we get x = 4. Therefore the linear
system has one solution: (x, y, z) = (4, -2, -2).
Going from the last equation to the first while solving for the unknowns
is called backsolving.
Keep in mind that linear systems for which the matrix coefficient is
upper-triangular are easy to solve. This is particularly true, if the matrix
is in echelon form. So the trick is to perform elementary operations to
transform the initial linear system into another one for which the
coefficient matrix is in echelon form.
Using our knowledge about matrices, is there any way we can rewrite
what we did above in matrix form which will make our notation (or
representation) easier? Indeed, consider the augmented matrix
Next we keep the first and the last rows, and we subtract the first from
the second. We get
Then we keep the first and second row, and we add the second to the
third after multiplying it by 3 to get
system. This shows that instead of writing the systems over and over
again, it is easy to play around with the elementary row operations and
once we obtain a triangular matrix, write the associated linear system
and then solve it. This is known as Gaussian Elimination. Let us
summarize the procedure:
1.
2.
3.
Write down the new linear system for which the triangular matrix
is the associated augmented matrix;
4.
Solve the new system. You may need to assign some parametric
values to some unknowns, and then apply the method of back
substitution to solve the new system.
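The procedure can be sketched in Python for the simplest situation, a square system with exactly one solution. The function name `gaussian_solve` and the sample augmented matrix are hypothetical (the system was chosen so that its solution is x = 4, y = -2, z = -2); systems that need parametric values are not handled by this sketch.

```python
from fractions import Fraction


def gaussian_solve(aug):
    """Solve a square system from its augmented matrix [A|C].

    Assumes the system has exactly one solution.
    """
    n = len(aug)
    M = [[Fraction(v) for v in row] for row in aug]
    for col in range(n):
        # Interchange rows, if needed, to get a nonzero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the system does not have a unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Subtract multiples of the pivot row from the rows below it.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    # Back substitution, from the last equation to the first.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


aug = [[1, 1, 1, 0],
       [2, -1, 1, 8],
       [1, 2, -1, 2]]
print(gaussian_solve(aug))
```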
The augmented matrix is
Next we keep the first and second row and try to have zeros in the
second column. We get
Next we keep the first three rows. We add the last one to the third to get
x=2+ y+ z-w- v.
x=- - s - t.
Example. Use Gaussian elimination to solve the linear system
We keep the first row and subtract the first row multiplied by 2 from the
second row. We get
Clearly the second equation cannot be satisfied. Therefore this linear
system has no solution.
Definition. A linear system is called inconsistent or overdetermined if
it does not have a solution. In other words, the set of solutions is empty.
Otherwise the linear system is called consistent.
or simplified:
ax + by + cz = p
dx + ey + fz = q
gx + hy + iz = r
Before you work this problem, you must know the definition of simple
interest. Simple interest for one year can be calculated by multiplying
the amount invested by the interest rate.
Solution:
phrase (The amount of money invested at 11%) by the symbol .
Let's rewrite sentences (1) and (2) in shortcut form.
Step 2: Substitute this value for y in equation (2). This will change
equation (2) to an equation with just one variable, x.
Step 3: Solve for x in the translated equation (2).
The Method of Elimination:
The process of elimination involves five steps:
In a two-variable problem rewrite the equations so that when the
equations are added, one of the variables is eliminated, and then solve
for the remaining variable.
Step 2: Add new equation (1) to equation (2) to obtain equation (3).
Step 3: Substitute in equation (1) and solve for x.
Rewrite equations (1) and (2) without the variables and operators. The
left column contains the coefficients of the x's, the middle column
contains the coefficients of the y's, and the right column contains the
constants.
The objective is to reorganize the original matrix into one that looks like
Step 3: Manipulate the matrix so that the cell 22 is 1. Do this by
multiplying row 2 by 50.
You can read the answers off the matrix as x = $7,000 and y = $5,000.
If a system of linear equations has at least one solution, it is consistent.
If the system has no solutions, it is inconsistent. If the system has an
infinite number of solutions, it is dependent. Otherwise it is
independent.
where , , , and are real numbers and , , , and are not all
.
Example 1:
John inherited $25,000 and invested part of it in a money market
account, part in municipal bonds, and part in a mutual fund. After one
year, he received a total of $1,620 in simple interest from the three
investments. The money market paid 6% annually, the bonds paid 7%
annually, and the mutual fund paid 8% annually. There was $6,000
more invested in the bonds than in the mutual fund. Find the amount John
invested in each category.
There are three unknowns:
1 : The amount of money invested in the money market account.
2 : The amount of money invested in municipal bonds.
3 : The amount of money invested in a mutual fund.
Let's rewrite the paragraph that asks the question we are to answer.
in a mutual fund ]
z = The amount of money invested in a mutual fund.
We have converted the problem from one described by words to one that
is described by three equations.
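As a check on the word problem, the three conditions can be combined by substitution in a few lines of Python. The variable assignment is partly an assumption: z is the mutual fund amount as stated above, while x (money market) and y (bonds) are assumed names. The equations used are x + y + z = 25000, 0.06x + 0.07y + 0.08z = 1620, and y = z + 6000.

```python
from fractions import Fraction

total = Fraction(25000)          # x + y + z
interest = Fraction(1620)        # 0.06x + 0.07y + 0.08z
r_x, r_y, r_z = Fraction(6, 100), Fraction(7, 100), Fraction(8, 100)
extra = Fraction(6000)           # y = z + extra

# Substitute y = z + extra and x = total - y - z = (total - extra) - 2z
# into the interest equation, which leaves one equation in z alone:
#   r_x*(total - extra) + r_y*extra + (-2*r_x + r_y + r_z)*z = interest
constant = r_x * (total - extra) + r_y * extra
coefficient = -2 * r_x + r_y + r_z

z = (interest - constant) / coefficient
y = z + extra
x = total - y - z
print(x, y, z)
```

Substituting the three values back into the original conditions is a quick way to confirm them.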
We are going to show you how to solve this system of equations three
different ways:
1) Substitution,
2) Elimination,
3) Matrices.
SUBSTITUTION:
The process of substitution involves several steps:
Step 1: Solve for one of the variables in one of the equations. It
makes no difference which equation and which variable you choose.
Let's solve for in equation (3) because the equation only has two
variables.
Step 2: Substitute this value for in equations (1) and (2). This will
change equations (1) and (2) to equations in the two variables and .
Call the changed equations (4) and (5).
or
Step 4: Substitute this value of in equation (5). This will give you
an equation in one variable.
solve for .
Yes
Yes
Yes
ELIMINATION:
The process of elimination involves several steps: First you reduce three
equations to two equations with two variables, and then to one equation
with one variable.
Step 1: Decide which variable you will eliminate. It makes no
difference which one you choose. Let us eliminate first because is
missing from equation (3).
Step 2: Multiply both sides of equation (1) by and then add the
transformed equation (1) to equation (2) to form equation (4).
(1) :
(2) :
(4) :
Step 3: We now have two equations with two variables.
(3) :
(4) :
Step 4: Multiply both sides of equation (3) by and add to
equation (4) to create equation (5) with just one variable.
(3) :
(4) :
(5) :
Step 7: Substitute for and for in equation (1) and
solve for .
which implies that the matrix coefficient is invertible. So we may use the
Cramer's formulas. We have
Note that it is easy to see that z=0. Indeed, the determinant which gives z
has two identical rows (the first and the last). We do encourage you to
check that the values found for x, y, and z are indeed the solution to the
given system.
Remark. Remember that Cramer's formulas are only valid for linear
systems with an invertible matrix coefficient.
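Cramer's formulas for a system of three equations can be sketched as follows. The helper names `det3` and `cramer` are hypothetical. The sample system is the investment example written with integer coefficients (x + y + z = 25000, 6x + 7y + 8z = 162000, y - z = 6000), where the roles of x and y are assumed.

```python
from fractions import Fraction


def det3(M):
    """Determinant of a 3x3 matrix, expanded along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer(A, B):
    """Solve AX = B for a 3x3 matrix A with nonzero determinant."""
    d = det3(A)
    if d == 0:
        raise ValueError("Cramer's formulas need an invertible matrix")
    solution = []
    for col in range(3):
        # A_i is A with column i replaced by the column B.
        A_i = [row[:col] + [B[r]] + row[col + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det3(A_i), d))
    return solution


A = [[1, 1, 1], [6, 7, 8], [0, 1, -1]]
B = [25000, 162000, 6000]
print(cramer(A, B))
```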
Three simultaneous equations in x, y and z
ax + by + cz = p
dx + ey + fz = q
gx + hy + iz = r
9.1.DEFINITION
Up until now, you've been told that you can't take the square root of a
negative number. That's because you had no numbers which were
negative after you'd squared them (so you couldn't "go backwards" by
taking the square root). Every number was positive after you squared it.
So you couldn't very well square-root a negative and expect to come up
with anything sensible.
Now, however, you can take the square root of a negative number, but it
involves using a new number to do it. This new number was invented
(discovered?) around the time of the Reformation. At that time, nobody
believed that any "real world" use would be found for this new number,
other than easing the computations involved in solving certain equations,
so the new number was viewed as being a pretend number invented for
convenience sake.
(But then, when you think about it, aren't all numbers inventions? It's not
like numbers grow on trees! They live in our heads. We made them all
up! Why not invent a new one, as long as it works okay with what we
already have?)
Anyway, this new number was called "i", standing for "imaginary",
because "everybody knew" that i wasn't "real". (That's why you couldn't
take the square root of a negative number before: you only had "real"
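numbers; that is, numbers without the "i" in them.) The imaginary i is
defined to be:
i = sqrt(–1)
Then:
i^2 = (sqrt(–1))^2 = –1
Now, you may think you can do this:
i^2 = (sqrt(–1))^2 = sqrt((–1)^2) = sqrt(1) = 1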
But this doesn't make any sense! You already have two numbers that
square to 1; namely –1 and +1. And i already squares to –1. So it's not
reasonable that i would also square to 1. This points out an important
detail: When dealing with imaginaries, you gain something (the ability
to deal with negatives inside square roots), but you also lose something
(some of the flexibility and convenient rules you used to have when
dealing with square roots). In particular, YOU MUST ALWAYS DO
THE i-PART FIRST!
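• Simplify sqrt(–9).
sqrt(–9) = sqrt(9 · (–1)) = sqrt(9) · sqrt(–1) = 3i
• Simplify sqrt(–25).
sqrt(–25) = sqrt(25 · (–1)) = sqrt(25) · sqrt(–1) = 5i
• Simplify sqrt(–18).
sqrt(–18) = sqrt(9 · 2 · (–1)) = sqrt(9) · sqrt(2) · sqrt(–1) = 3 · sqrt(2) · i = 3i sqrt(2)
(Warning: The step that goes through the third "equals" sign is "sqrt(2) · i",
not "sqrt(2i)". The i is outside the radical.)
• Simplify –sqrt(–6).
–sqrt(–6) = –sqrt(6 · (–1)) = –sqrt(6) · sqrt(–1) = –i sqrt(6)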
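These simplifications can be checked numerically with Python's built-in complex numbers, where i is written as `1j`. The helper name `sqrt_of_negative` is hypothetical; it does the i-part first, as the text insists, and the result is compared with the standard library's `cmath.sqrt`.

```python
import cmath
import math


def sqrt_of_negative(n):
    """Return sqrt(-n) for n > 0 as sqrt(n) times i."""
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(n) * 1j


for n in (9, 25, 18, 6):
    print(n, sqrt_of_negative(n), cmath.sqrt(-n))
```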
In your computations, you will deal with i just as you would with x,
except for the fact that x^2 is just x^2, but i^2 is –1:
• Simplify 2i + 3i.
2i + 3i = (2 + 3)i = 5i
=(–6)(–1 · i) = (–6)(–i) = 6i
Note this last problem. Within it, you can see that i^3 = –i, because i^2 = –1.
Continuing, we get:
In other words, to calculate any high power of i, you can convert it to a
lower power by taking the closest multiple of 4 that's no bigger than the
exponent and subtracting this multiple from the exponent. For example,
a common trick question on tests is something along the lines of
"Simplify i^99", the idea being that you'll try to multiply i ninety-nine
times and you'll run out of time, and the teachers will get a good giggle
at your expense in the faculty lounge. Here's how the shortcut works:
i^99 = i^(96 + 3) = i^(4 · 24 + 3) = i^3 = –i
That is, i^99 = i^3, because you can just lop off the i^96. (Ninety-six is a
multiple of four, so i^96 is just 1, which you can ignore.) In other words,
you can divide the exponent by 4 (using long division), discard the
answer, and use only the remainder. This will give you the part of the
exponent that you care about. Here are a few more examples:
• Simplify i^17.
i^17 = i^(16 + 1) = i^(4 · 4 + 1) = i^1 = i
• Simplify i^120.
i^120 = i^(4 · 30) = i^0 = 1
• Simplify i^64,002.
i^64,002 = i^(64,000 + 2) = i^(4 · 16,000 + 2) = i^2 = –1
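The remainder shortcut translates directly into code. In this sketch the function name `power_of_i` is hypothetical, and Python's `1j` stands for i.

```python
def power_of_i(n):
    """Return i^n for a nonnegative integer n, using only n mod 4."""
    # i^0 = 1, i^1 = i, i^2 = -1, i^3 = -i, and then the cycle repeats.
    return (1, 1j, -1, -1j)[n % 4]


for exponent in (17, 99, 120, 64002):
    print(exponent, exponent % 4, power_of_i(exponent))
```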
Now you've seen how imaginaries work; it's time to move on to complex
numbers. "Complex" numbers have two parts, a "real" part (being any
"real" number that you're used to dealing with) and an "imaginary" part
(being any number with an "i" in it). The "standard" format for complex
numbers is "a + bi"; that is, real-part first and i-part last.
Complex numbers are useful abstract quantities that can be used in
calculations and result in physically meaningful solutions. However,
mathematicians took a long time to accept this fact.
For example, John Wallis wrote, "These Imaginary Quantities
(as they are commonly called) arising from the Supposed Root of a
Negative Square (when they happen) are reputed to imply that the Case
proposed is Impossible" (Wells 1986, p. 22).
A complex number
z = x + iy (1)
can be written in polar form as
z = |z| (cos θ + i sin θ) = |z| e^{iθ}. (2)
Here, |z| is known as the complex modulus (or sometimes the complex
norm) and θ is known as the complex argument or phase. The plot above
shows what is known as an Argand diagram of the point z, where the
dashed circle represents the complex modulus |z| of z and the angle θ
represents its complex argument. Historically, the geometric
representation of a complex number as simply a point in the plane was
important because it made the whole idea of a complex number more
acceptable. In particular, "imaginary" numbers became accepted partly
through their visualization.
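Complex addition
(a + bi) + (c + di) = (a + c) + (b + d)i, (14)
complex subtraction
(a + bi) – (c + di) = (a – c) + (b – d)i, (15)
complex multiplication
(a + bi)(c + di) = (ac – bd) + (ad + bc)i, (16)
and complex division
(a + bi)/(c + di) = [(ac + bd) + (bc – ad)i]/(c^2 + d^2) (17)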
can also be defined for complex numbers. Complex numbers may also
be taken to complex powers. For example, complex exponentiation
obeys
(18)
• Solve 3 – 4i = x + yi
(5 – 2i) – (–4 – i)
= 5 – 2i + 4 + i= (5 + 4) + (–2i + i)
= (9) + (–1i) = 9 – i
You may find it helpful to insert the "1" in front of the second set of
parentheses so you can better keep track of
the "minus" being multiplied through the parentheses.
(2 – i)(3 + 4i)
= 6 + 8i – 3i – 4i^2 = 6 + 5i – 4(–1)
= 6 + 5i + 4 = 10 + 5i
For the last example above, FOILing works for this kind of
multiplication, if you learned that method. But whatever method you
use, remember that multiplying and adding with complexes works just
like multiplying and adding polynomials, except that, while x^2 is just x^2,
i^2 is –1. You can use the exact same techniques for simplifying complex-
number expressions as you do for polynomial expressions, but you can
simplify even further with complexes because i^2 reduces to the number –1.
Adding and multiplying complexes isn't too bad. It's when you work
with fractions (that is, with division) that things turn ugly. Most of the
reason for this ugliness is actually arbitrary. Remember back in
elementary school, when you first learned fractions? Your teacher would
get her panties in a wad if you used "improper" fractions. For instance,
you couldn't say " 3/2 "; you had to convert it to "1 1/2". But now that
you're in algebra, nobody cares, and you've probably noticed that
"improper" fractions are often more useful than "mixed" numbers. The
issue with complex numbers is that your professor will get his boxers in
a bunch if you leave imaginaries in the denominator. So how do you
handle this?
• Simplify
So the answer is
This was simple enough, but what if they give you something more
complicated?
• Simplify
Since I still have an i underneath, this didn't help much. So how do
I handle this simplification? I use something called "conjugates".
The conjugate of a complex number a + bi is the same number, but
with the opposite sign in the middle: a – bi. When you multiply
conjugates, you are, in effect, multiplying to create something in
the pattern of a difference of squares:
(a + bi)(a – bi) = a² – (bi)² = a² – b²i² = a² + b²
Note that the i's disappeared, and the final result was a sum of
squares. This is what the conjugate is for, and here's how it is used:
So the answer is
In the last step, note how the fraction was split into two pieces. This is
because, technically speaking, a complex number is in two parts, the real
part and the i part. They aren't supposed to "share" the denominator. To
be sure your answer is completely correct, split the complex-valued
fraction into its two separate terms.
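The conjugate technique can be checked numerically. The following Python sketch is only an illustration: the function name divide_complex and the sample quotient 3/(2 + i) are assumptions made for this example, and integer inputs are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def divide_complex(a, b, c, d):
    """Return (real, imag) of (a + bi) / (c + di) for integers a, b, c, d.

    The fraction is multiplied, top and bottom, by the conjugate c - di.
    """
    denom = c * c + d * d              # (c + di)(c - di) = c^2 + d^2, a real number
    if denom == 0:
        raise ZeroDivisionError("cannot divide by 0 + 0i")
    real = Fraction(a * c + b * d, denom)   # real part of (a + bi)(c - di), over denom
    imag = Fraction(b * c - a * d, denom)   # imaginary part, over the same denom
    return real, imag

# Hypothetical example (not taken from the text): 3 / (2 + i)
real, imag = divide_complex(3, 0, 2, 1)
print(real, imag)   # 6/5 -3/5, i.e. 6/5 - (3/5)i
```

Returning the real and imaginary parts separately mirrors the advice above to split the answer into its two terms.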
You'll probably only use complexes in the context of solving quadratics
for their zeroes. (There are many other practical uses for complexes, but
you'll have to wait for more interesting classes like "Engineering 201" to
get to the "good stuff".)
Remember that the Quadratic Formula solves "ax² + bx + c = 0" for the
values of x. Also remember that this means that you are trying to find
the x-intercepts of the graph. When the Formula gives you a negative
inside the square root, you can now simplify that zero by using complex
numbers. The answer you come up with is a valid "zero" or "root" or
"solution" for "ax² + bx + c = 0", because, if you plug it back into the
quadratic, you'll get zero after you simplify. But you cannot graph a
complex number on the x,y-plane. So this "solution to the equation" is
not an x-intercept. You can make this connection between the Quadratic
Formula, complex numbers, and graphing:
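x² – 2x – 3: a positive number inside the square root; two real solutions; two x-intercepts
x² – 6x + 9: zero inside the square root; one (repeated) real solution; one x-intercept
x² + 3x + 3: a negative number inside the square root; two complex solutions; no x-intercepts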
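The link between the sign of the number inside the square root and the kind of solutions can be tried out directly. This Python sketch is an illustration, not part of the original lesson; the function name solve_quadratic is an assumption, and cmath.sqrt is used because it accepts negative inputs.

```python
import cmath

def solve_quadratic(a, b, c):
    """Return the discriminant and both roots of a*x^2 + b*x + c = 0 (a != 0)."""
    disc = b * b - 4 * a * c            # the number inside the square root
    root = cmath.sqrt(disc)             # complex square root, valid for disc < 0
    return disc, (-b + root) / (2 * a), (-b - root) / (2 * a)

# The three quadratics compared in the text
for coeffs in [(1, -2, -3), (1, -6, 9), (1, 3, 3)]:
    disc, r1, r2 = solve_quadratic(*coeffs)
    print(coeffs, "discriminant:", disc, "roots:", r1, r2)
```

Only the third quadratic produces roots with a nonzero imaginary part, which is why its graph has no x-intercepts.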
As an aside, you can graph complexes, but not in the x,y-plane. You
need the "complex" plane. For the complex plane, the x-axis is where
you plot the real part, and the y-axis is where you graph the imaginary
part. For instance, you would plot the complex number 3 – 2i like this:
This leads to an interesting fact: When you learned about regular ("real")
numbers, you also learned about their order (this is what you show on
the number line). But x,y-points don't come in any particular order. You
can't say that one point "comes after" another point in the same way that
you can say that one number comes after another number. For instance,
you can't say that (4, 5) "comes after" (4, 3) in the way that you can say
that 5 comes after 3. Pretty much all you can do is compare "size", and,
for complex numbers, "size" means "how far from the origin". To do
this, you use the Distance Formula, and compare which complexes are
closer to or further from the origin. This "size" concept is called "the
modulus". For instance, looking at our complex number plotted above,
its modulus is computed by using the Distance Formula:
|3 – 2i| = sqrt((3 – 0)² + (–2 – 0)²) = sqrt(9 + 4) = sqrt(13)
Copyright © Elizabeth Stapel 2000-2011 All Rights Reserved
Note that all points at this distance from the origin have the same
modulus. All the points on the circle with radius sqrt(13) are viewed as
being complex numbers having the same "size" as 3 – 2i.
9.2.Algebra of complex Numbers
for
iff .
A meaningful number system requires a method for combining
ordered pairs. The definition of algebraic operations must be consistent
so that the sum, difference, product, and quotient of any two ordered
pairs will again be an ordered pair. The key to defining how these
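numbers should be manipulated is to follow Gauss's lead and equate
(x, y) with x + iy. Then, if z₁ = (x₁, y₁) and z₂ = (x₂, y₂) are arbitrary
complex numbers, we have
Formula (1-8) z₁ + z₂ = (x₁ + iy₁) + (x₂ + iy₂) = (x₁ + x₂) + i(y₁ + y₂) = (x₁ + x₂, y₁ + y₂).
Formula (1-9) z₁ – z₂ = (x₁ + iy₁) – (x₂ + iy₂) = (x₁ – x₂) + i(y₁ – y₂) = (x₁ – x₂, y₁ – y₂).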
Derivation of Formula (1-9).
and
and
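exercises for this section to explain why. How, then, should products be
defined? Again, if we equate (x, y) with x + iy and assume, for the
moment, that i = √(–1) makes sense (so that i² = –1), we have, for
z₁ = (x₁, y₁) and z₂ = (x₂, y₂),
Formula (1-10) z₁z₂ = (x₁ + iy₁)(x₂ + iy₂) = x₁x₂ + ix₁y₂ + ix₂y₁ + i²y₁y₂
= (x₁x₂ – y₁y₂) + i(x₁y₂ + x₂y₁) = (x₁x₂ – y₁y₂, x₁y₂ + x₂y₁).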
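The ordered-pair rules for sums and products can be sketched in a few lines of Python. The function names add_pairs and multiply_pairs and the sample numbers are assumptions chosen for illustration; they are not taken from the text.

```python
def add_pairs(z1, z2):
    """Add two complex numbers written as ordered pairs (x, y)."""
    (x1, y1), (x2, y2) = z1, z2
    return (x1 + x2, y1 + y2)

def multiply_pairs(z1, z2):
    """Multiply two ordered pairs using the rule that follows from i^2 = -1."""
    (x1, y1), (x2, y2) = z1, z2
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)

i = (0, 1)
print(multiply_pairs(i, i))               # (-1, 0), i.e. i^2 = -1
print(multiply_pairs((3, 7), (5, -6)))    # (57, 17), i.e. (3 + 7i)(5 - 6i) = 57 + 17i
print(add_pairs((3, 7), (5, -6)))         # (8, 1)
```

Comparing the result with Python's built-in complex type, for example complex(3, 7) * complex(5, -6), gives the same real and imaginary parts.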
We get the same answer by using the notation and :
Definition 1.4 (Complex Division)
z₁/z₂ = (x₁, y₁)/(x₂, y₂) = ((x₁x₂ + y₁y₂)/(x₂² + y₂²), (x₂y₁ – x₁y₂)/(x₂² + y₂²)), for z₂ ≠ 0.
As with the example for multiplication, we also get this answer if we use
the notation :
It turns out that our algebraic definitions give complex numbers all the
properties we normally ascribe to the real number system. Taken
together, they describe what algebraists call a field. In formal terms, a
field is a set (in this case, the complex numbers) together with two
binary operations (in this case, addition and multiplication) having the
following properties.
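Based on our definition for division, it seems reasonable that the number
z⁻¹, the multiplicative inverse of z = (x, y) ≠ (0, 0), would be
z⁻¹ = 1/z = (1, 0)/(x, y) = (x/(x² + y²), –y/(x² + y²)).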
We ask you to confirm this result in the exercises for this section.
Actually, you can think of the real number system as a subset of the
complex number system. To see why, let's agree that, as any complex
number of the form (x, 0) is on the x axis, we can identify it with the real
number x. With this correspondence, we can easily verify that our
definitions for addition, subtraction, multiplication, and division of
complex numbers are consistent with the corresponding operations on
real numbers. For example, if x₁ and x₂ are real numbers, then
x₁x₂ = (x₁, 0)(x₂, 0) = (x₁x₂ – 0, 0 + 0) = (x₁x₂, 0).
The same multiplication rule gives (0, 1)(0, 1) = (0 – 1, 0 + 0) = (–1, 0).
If we use the symbol i for the point (0, 1), the preceding identity gives
i² = (0, 1)(0, 1) = (–1, 0) = –1.
We can also see more clearly now how the notation x + iy equates to
(x, y). Using the preceding conventions (i.e., x = (x, 0), etc.), we have
x + iy = (x, 0) + (0, 1)(y, 0) = (x, 0) + (0, y) = (x, y).
Thus we may move freely between the notations x + iy and (x, y),
depending on which is more convenient for the context in which we are
working. Students sometimes wonder whether it matters where the "i" is
located in writing a complex number. It does not. Generally, most texts
place terms containing an "i" at the end of an expression, and place the "i"
before a variable, but after a constant. Thus, we write 3 + 4i, 2 – 5i,
etc., but x + iy, u + iv, and so forth. Because letters lower in the
alphabet generally denote constants, you will usually (but not always)
see the expression a + bi, instead of a + ib. Many authors write
quantities like i√3 instead of √3i to make sure the "i" is not
mistakenly thought to be inside the square root symbol. Additionally, if
there is concern that the "i" might be missed, it is sometimes placed
Example 1.4. Given .
Because of what it erroneously connotes, it is a shame that the term
imaginary is used in Definition (1.6). It was coined by the brilliant
mathematician and philosopher René Descartes (1596–1650) during an
era when quantities such as √(–1) were thought to be just that. Gauss, who
was successful in getting mathematicians to adopt the phrase complex
number rather than imaginary number, also suggested that they use
lateral part of z in place of imaginary part of z. Unfortunately, that
suggestion never caught on, and it appears we are stuck with what
history has handed down.
onto the axis. It makes sense, then, to call the x axis the real axis and
the y axis the imaginary axis, as Figure 1.3 illustrates.
The difference z₂ – z₁ can be represented by the displacement vector
from the point z₁ to the point z₂, as Figure 1.5 shows.
(1-20) .
number . The number has modulus
, and is depicted in Figure 1.6.
The inequality |z₁| < |z₂| means that the point z₁ is closer to the origin
than the point z₂. Although obvious from Figure 1.7, it is still profitable
to work out algebraically the standard results that
(1-21) |x| = |Re(z)| ≤ |z| and |y| = |Im(z)| ≤ |z|.
Figure 1.8 The geometry of negation and conjugation.
(1-22) .
Figure 1.9 The triangle inequality.
(1-23) |z₁ + z₂| ≤ |z₁| + |z₂|.
,
thus
.
Compute .
We can also establish other important identities by means of the
triangle inequality. Note that
|z₁| = |(z₁ + z₂) + (–z₂)| ≤ |z₁ + z₂| + |–z₂| = |z₁ + z₂| + |z₂|.
Subtracting |z₂| from the left and right sides of this string of inequalities
gives an important relationship that will be used in determining lower
bounds of sums of complex numbers:
(1-24) |z₁ + z₂| ≥ |z₁| – |z₂|.
Taking square roots of the terms on the left and right establishes another
important identity
(1-25) |z₁z₂| = |z₁||z₂|.
(1-26) |z₁/z₂| = |z₁|/|z₂|, provided z₂ ≠ 0.
,
thus
.
Compute .
Figure 1.10 illustrates the multiplication shown in Example 1.6. The
length of the vector z₁z₂ apparently equals the product of the lengths of
z₁ and z₂, confirming that |z₁z₂| = |z₁||z₂|, but why is it located in the
second quadrant when both z₁ and z₂ are in the first quadrant? The answer
to this question will become apparent to you in Section 1.4.
Figure 1.10 The geometry of multiplication.
Argand Diagram
An Argand diagram is a plot of complex numbers as points z = x + iy
in the complex plane using the x-axis as the real axis and y-axis as the
imaginary axis. In the plot above, the dashed circle represents the
complex modulus |z| of z and the angle θ represents its complex argument.
and so forth. The reasons were that (1) the absolute value |i| of i was one,
so all its powers also have absolute value 1 and, therefore, lie on the unit
circle, and (2) the argument arg(i) of i was 90°, so its nth power will
have argument n·90°, and those angles will repeat in a period of
length 4 since 4·90° = 360°, a full circle.
In the figure you see a complex number z whose absolute value is about
the sixth root of 1/2, that is, |z| = 0.89, and whose argument is 30°. Here,
the unit circle is shaded black while outside the unit circle is gray, so z is
in the black region. Since |z| is less than one, its square is at 60° and
closer to 0. Each higher power is 30° further along and even closer to 0.
The first six powers are displayed, as you can see, as points on a spiral.
This spiral is called a geometric or exponential spiral.
Roots.
Note that in the last example, z⁶ is on the negative real axis at about –1/2.
That means that z is just about equal to one of the sixth roots of –1/2.
There are, in fact, six sixth roots of any nonzero complex number. Let w be a
complex number, and z any of its sixth roots. Since z⁶ = w, it follows
that
|z|⁶ = |w| and 6 arg(z) = arg(w).
Actually, the second statement isn’t quite right since 6 arg(z) could be
any multiple of 360° more than arg(w), so you can add multiples of 60°
to arg(w)/6 to get the arguments of the other five roots.
If we take 1/6 of each of these angles, then we’ll have the possible
arguments for z:
Since the possible angles for w differ by multiples of 360°, the
possible angles for z differ by multiples of 60°. These six sixth roots of
–1/2 are displayed in the figure as blue dots.
Now some of these sixth roots are lower roots of unity as well. The
number –1 is a square root of unity, (–1 ± i√3)/2 are cube roots of unity,
and 1 itself counts as a cube root, a square root, and a “first” root
(anything is a first root of itself). But the remaining two sixth roots,
namely, (1 ± i√3)/2, are sixth roots, but not any lower roots of unity.
Such roots are called primitive, so (1 ± i√3)/2 are the two primitive sixth
roots of unity.
It’s fun to find roots of unity, but we’ve found most of the easy ones
already.
9.5.DEMOIVRE’S THEOREM
De Moivre's Theorem
In the last section, we looked at the polar form of complex numbers and
proved a beautiful theorem regarding them. In this section, we prove
another beautiful result, known as De Moivre's Theorem, which allows
us to easily compute powers and roots of complex numbers given in
polar form. We will also apply this theorem to many examples.
Theorem 6.6.1 (De Moivre's Theorem): For every real number θ and
every positive integer n, we have
(6.6.2) (cos θ + i sin θ)^n = cos nθ + i sin nθ.
Proof: We prove this theorem by induction, i.e. first we prove it for n=1
and then we prove that if (6.6.2) holds for a particular value of n, then it
holds for n+1 as well. This suffices to prove the theorem for every
positive integer n. (Induction is a commonly-used method for proving
mathematical results.) The case n=1 is trivial. Assume (6.6.2) holds for
n. Then we have
(cos θ + i sin θ)^(n+1) = (cos θ + i sin θ)^n (cos θ + i sin θ) = (cos nθ + i sin nθ) (cos θ + i sin θ),
where in the last step we used the induction hypothesis, i.e. the
assumption that (6.6.2) holds for n. Computing the product on the right
yields
(cos θ + i sin θ)^(n+1) = (cos nθ cos θ - sin nθ sin θ) + i (sin nθ cos θ + cos nθ sin θ) = cos (n+1)θ + i sin (n+1)θ,
where we used the addition formulas for sine and cosine in the last step.
(Alternatively, we could have applied Theorem 6.5.4 to the two factors
on the right side of the previous equation.) Thus, (6.6.2) holds for n+1 as
well, whence it holds for every positive integer n.
QED
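A numerical spot check is no substitute for the induction proof, but it is a quick way to see the theorem at work. In this Python sketch the function name de_moivre_sides and the sample angle are assumptions made for illustration.

```python
import math

def de_moivre_sides(theta, n):
    """Return both sides of (cos t + i sin t)^n = cos(nt) + i sin(nt)."""
    lhs = complex(math.cos(theta), math.sin(theta)) ** n
    rhs = complex(math.cos(n * theta), math.sin(n * theta))
    return lhs, rhs

theta = 0.7   # arbitrary sample angle in radians
for n in range(1, 6):
    lhs, rhs = de_moivre_sides(theta, n)
    print(n, abs(lhs - rhs))   # differences at the level of rounding error
```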
Solution: As we have seen in the previous section, the polar form of 1 +
i is √2 (cos π/4 + i sin π/4). Thus, by De Moivre's Theorem, we have
= 32(-√3/2 + 1/2 i)
= -16√3 + 16 i.
may be written in polar form as 1 = cos 2πm + i sin 2πm for every
integer m, since cosine and sine are periodic with period 2π. Now
consider the complex number z = cos (2πm/n) + i sin (2πm/n), where n
is an arbitrary positive integer. By De Moivre's Theorem we have
z^n = [cos (2πm/n) + i sin (2πm/n)]^n = cos 2πm + i sin 2πm = 1.
Thus we have shown that cos (2πm/n) + i sin (2πm/n) is an nth root of
unity. In fact, all the nth roots of unity are obtained this way by plugging
in all integer values of m from 0 to n-1. (Every other integer m yields a
root of unity identical to one of these.) Thus we now know how to find
all n nth roots of unity for every positive integer n.
Solution: From our above discussion, we see that the three cube roots of
unity have the form cos (2πm/3) + i sin (2πm/3) for m=0, 1, or 2.
Plugging in m=0 yields the root cos 0 + i sin 0 = 1. (It is easy to see that
1 is an nth root of unity for every integer n, since 1^n = 1. Plugging in
m=0 will always yield this root.) Plugging in m=1 yields the root cos
(2π/3) + i sin (2π/3) = -1/2 + √3/2 i and plugging in m=2 yields the root
cos (4π/3) + i sin (4π/3) = -1/2 - √3/2 i.
Solution: The four fourth roots of unity have the form cos (2πm/4) + i
sin (2πm/4) for m=0, 1, 2, or 3. As usual, plugging in m=0 yields the
root 1. Plugging in m=1 yields the root cos π/2 + i sin π/2 = i. Plugging
in m=2 yields cos π + i sin π = -1. Finally, plugging in m=3 yields cos
3π/2 + i sin 3π/2 = -i.
Figure 6.6.3: The Six Sixth Roots of Unity
So much for nth roots of unity - how about nth roots of general complex
numbers? The following lemma, which is an immediate consequence of
De Moivre's Theorem, tells us how to compute the n nth roots of an
arbitrary complex number, given in polar form.
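Lemma 6.6.4: Let z = r (cos θ + i sin θ) be a nonzero complex number in
polar form and let n be a positive integer. Then the n nth roots of z are
• (6.6.5) w = ⁿ√r {cos [(θ + 2πm)/n] + i sin [(θ + 2πm)/n]}, for m = 0, 1, ..., n-1.
Proof: By De Moivre's Theorem,
w^n = r {cos (θ + 2πm) + i sin (θ + 2πm)}
= r (cos θ + i sin θ) = z.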
Example 5: Compute the two square roots of i.
Solution: It is easy to see that i has the polar form cos π/2 + i sin π/2.
Thus, by Lemma 6.6.4, its square roots are cos π/4 + i sin π/4 = √2/2 +
√2/2 i and cos 5π/4 + i sin 5π/4 = -√2/2 - √2/2 i.
Example 6: Compute the three cube roots of -8.
Solution: Since -8 has the polar form 8 (cos π + i sin π), its three cube
roots have the form ³√8 {cos[(π + 2πm)/3] + i sin[(π + 2πm)/3]} for
m=0, 1, and 2. Thus the roots are 2 (cos π/3 + i sin π/3) = 1 + √3 i, 2 (cos
π + i sin π) = -2, and 2 (cos 5π/3 + i sin 5π/3) = 1 - √3 i.
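The root formula lends itself to a short computation. The Python sketch below is an illustration under the assumption that the input is nonzero; the function name nth_roots is hypothetical, while cmath.polar and cmath.rect are the standard-library conversions between rectangular and polar form.

```python
import cmath
import math

def nth_roots(z, n):
    """Return the n nth roots of a nonzero complex number z."""
    r, theta = cmath.polar(z)            # z = r (cos theta + i sin theta)
    modulus = r ** (1.0 / n)             # nth root of the modulus
    return [cmath.rect(modulus, (theta + 2 * math.pi * m) / n) for m in range(n)]

for w in nth_roots(-8, 3):
    print(w, w ** 3)     # each cube is -8 up to rounding error

for w in nth_roots(1j, 2):
    print(w, w ** 2)     # each square is i up to rounding error
```

Calling nth_roots(1, n) gives the nth roots of unity as a special case.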
The Logarithm and the Problem of the Multivalued Nature of
Angles
In polar form z = |z|e^(iθ), so that ln z = ln |z| + iθ.
This formula has a problem, and that problem is that the angle θ is not a
well defined function in the complex plane, and so neither is ln z.
Similar problems exist for fractional powers such as x^(1/2) and x^(1/3) as well.
Thus for the logarithm you can say that its imaginary part, θ, has values
that run from 0 to 2π. If so it is discontinuous on the positive real axis,
being 0 on one side of it and 2π on the other.
Alternatively you can have its values lie from -π to +π, so that its line
of discontinuity is the negative real axis, and you can choose any other
half line of discontinuity starting at the origin.
In the case of the logarithm this surface winds around and around the
origin. For the square root if you go around the origin twice you come
back to where you started from.
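The jump in the angle can be observed numerically. Python's cmath module places the line of discontinuity on the negative real axis (angles from -π to +π); the helper arg_0_to_2pi below is a hypothetical function written for this sketch to show the other choice described above.

```python
import cmath
import math

def arg_0_to_2pi(z):
    """Angle of z chosen in [0, 2*pi): the cut lies on the positive real axis."""
    return cmath.phase(z) % (2 * math.pi)

# Two points just above and just below the negative real axis
just_above = complex(-1.0, 1e-12)
just_below = complex(-1.0, -1e-12)
print(cmath.phase(just_above), cmath.phase(just_below))          # near +pi and -pi
print(cmath.log(just_above).imag, cmath.log(just_below).imag)    # same jump in ln z

# With the other convention the jump moves to the positive real axis
print(arg_0_to_2pi(complex(1.0, 1e-12)), arg_0_to_2pi(complex(1.0, -1e-12)))
```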
3.2.PARTIAL FRACTION
Partial Fractions
In this section we are going to take a look at integrals of rational
expressions of polynomials. Once again, let’s start this section out
with an integral that we can already do so we can contrast it with the
integrals that we’ll be doing in this section.
In this case the numerator is definitely not the derivative of the
denominator nor is it a constant multiple of the derivative of the
denominator. Therefore, the simple substitution that we used above
won’t work. However, if we notice that the integrand can be broken up
as follows,
So, let’s do a quick review of partial fractions. We’ll start with a
rational expression in the form,
f(x) = P(x)/Q(x)
where both P(x) and Q(x) are polynomials and the degree of P(x) is
smaller than the degree of Q(x). Recall that the degree of a polynomial
is the largest exponent in the polynomial. Partial fractions can only be
done if the degree of the numerator is strictly less than the degree of the
denominator. That is important to remember.
So, once we’ve determined that partial fractions can be done we factor
the denominator as completely as possible. Then for each factor in the
denominator we can use the following table to determine the term(s) we
pick up in the partial fraction decomposition.
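Factor in denominator: Term in partial fraction decomposition
ax + b: A/(ax + b)
(ax + b)^k: A₁/(ax + b) + A₂/(ax + b)² + ... + A_k/(ax + b)^k, k = 1, 2, 3, ...
ax² + bx + c: (Ax + B)/(ax² + bx + c)
(ax² + bx + c)^k: (A₁x + B₁)/(ax² + bx + c) + (A₂x + B₂)/(ax² + bx + c)² + ... + (A_kx + B_k)/(ax² + bx + c)^k, k = 1, 2, 3, ...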
Notice that the first and third cases are really special cases of the second
and fourth cases respectively.
There are several methods for determining the coefficients for each term
and we will go over each of those in the following examples.
Solution
The first step is to factor the denominator as much as possible and get
the form of the partial fraction decomposition. Doing this gives,
The next step is to actually add the right side back up.
Now, we need to choose A and B so that the numerators of these two are
equal for every x. To do this we’ll need to set the numerators equal.
Note that in most problems we will go straight from the general form of
the decomposition to this step and not bother with actually adding the
terms back up. The only point to adding the terms is to get the
numerator and we can get that without actually writing down the results
of the addition.
At this point we have one of two ways to proceed. One way will always
work, but is often more work. The other, while it won’t always work, is
often quicker when it does work. In this case both will work and so
we’ll use the quicker way for this example. We’ll take a look at the
other method in a later example.
So, by carefully picking the x’s we got the unknown constants to quickly
drop out. Note that these are the values we claimed they would be
above.
At this point there really isn’t a whole lot to do other than the integral.
Recall that to do this integral we first split it up into two integrals and
then used the substitutions,
on the integrals to get the final answer.
Before moving onto the next example a couple of quick notes are in
order here. First, many of the integrals in partial fractions problems
come down to the type of integral seen above. Make sure that you can
do those integrals.
It will be an example or two before we use this so don’t forget about it.
Example 2 Evaluate the following integral.
Solution
The next step is to set numerators equal. If you need to actually add the
right side together to get the numerator for that side then you should do
so, however, it will definitely make the problem quicker if you can do
the addition in your head to get,
As with the previous example it looks like we can just pick a few values
of x and find the constants so let’s do that.
Note that unlike the first example most of the coefficients here are
fractions. That is not unusual so don’t get excited about it when it
happens.
Solution
This time the denominator is already factored so let’s just jump right to
the partial fraction decomposition.
In this case we aren’t going to be able to just pick values of x that will
give us all the constants. Therefore, we will need to work this the
second (and often longer) way. The first step is to multiply out the right
side and collect all the like terms together. Doing this gives,
Note that we used x⁰ to represent the constants. Also note that these
systems can often be quite large and have a fair amount of work
involved in solving them. The best way to deal with these is to use some
form of computer aided solving techniques.
In order to take care of the third term we needed to split it up into two
separate terms. Once we’ve done this we can do all the integrals in the
problem. The first two use the substitution
, the third uses the substitution
and the fourth term uses the formula given above for inverse tangents.
Solution
Let’s first get the general form of the partial fraction decomposition.
Now, set numerators equal, expand the right side and collect like terms.
Setting coefficients equal gives the following system.
Don’t get excited if some of the coefficients end up being zero. It
happens on occasion.
To this point we’ve only looked at rational expressions where the degree
of the numerator was strictly less than the degree of the denominator. Of
course not all rational expressions will fit into this form and so we need
to take a look at a couple of examples where this isn’t the case.
Solution
So, in this case the degree of the numerator is 4 and the degree of the
denominator is 3. Therefore, partial fractions can’t be done on this
rational expression.
To fix this up we’ll need to do long division on this to get it into a form
that we can deal with. Here is the work for that.
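The long-division step can be sketched in code. The Python function below is an illustration only: the name poly_divmod and the sample polynomials (a degree 4 numerator over a degree 3 denominator, chosen to match the degrees mentioned above) are assumptions, and polynomials are given as coefficient lists with the highest power first.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Polynomial long division; returns (quotient, remainder) coefficient lists."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    if not den or den[0] == 0:
        raise ValueError("leading coefficient of the divisor must be nonzero")
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]          # next term of the quotient
        quotient.append(factor)
        for k in range(len(den)):         # subtract factor * divisor, aligned left
            num[k] -= factor * den[k]
        num.pop(0)                        # the leading term is now zero
    return quotient, num

# Hypothetical example: (x^4 - 5x^3 + 6x^2 - 18) / (x^3 - 3x^2)
q, r = poly_divmod([1, -5, 6, 0, -18], [1, -3, 0, 0])
print(q, r)   # quotient x - 2, remainder -18
```

After the division, the remainder over the original denominator has a numerator of lower degree than the denominator, so partial fractions can be applied to it.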
and the integral becomes,
The first integral we can do easily enough and the second integral is now
in a form that allows us to do partial fractions. So, let’s get the general
form of the partial fractions for the second integrand.
Setting numerators equal gives us,
What we’ll do in this example is pick x’s to get the two constants that
we can easily get and then we’ll just pick another value of x that will be
easy to work with (i.e. it won’t give large/messy numbers anywhere) and
then we’ll use the fact that we also know the other two constants to find
the third.
The integral is then,
In the previous example there were actually two different ways of
dealing with the x² in the denominator. One is to treat it as a quadratic
which would give the following term in the decomposition
Let’s take a look at one more example.
Solution
In this case the numerator and denominator have the same degree. As
with the last example we’ll need to do long division to get this into the
correct form. I’ll leave the details of that to you to check.
So, we’ll need to partial fraction the second integral. Here’s the
decomposition.
The integral is then,
Partial-Fraction Decomposition:
Previously, you have added and simplified rational expressions, such as:
Partial-fraction decomposition is the process of starting with the
simplified answer and taking it back apart, of "decomposing" the final
expression into its initial polynomial fractions.
Then you write the fractions with one of the factors for each of the
denominators. Of course, you don't know what the numerators are yet,
so you assign variables (usually capital letters) for these unknown
values:
Multiply things out, and group the x-terms and the constant terms:
3x + 2 = Ax + A(1) + Bx
3x + 2 = (A + B)x + (A)(1)
(3)x + (2)(1) = (A + B)x + (A)(1)
For the two sides to be equal, the coefficients of the two polynomials
must be equal. So you "equate the coefficients" to get:
3 = A + B
2 = A
A = 2
B = 1
Then the original fractions were (as we already know) the following:
(3x + 2)/(x² + x) = 2/x + 1/(x + 1)
There is another method for solving for the values of A and B. Since the
equation "3x + 2 = A(x + 1) + B(x)" is supposed to be true for any value
of x, we can pick useful values of x, plug-n-chug, and find the values for
A and B. Looking at the equation "3x + 2 = A(x + 1) + B(x)", you can
see that, if x = 0, then we quickly find that 2 = A:
3x + 2 = A(x + 1) + B(x)
3(0) + 2 = A(0 + 1) + B(0)
0 + 2 = A(1) + 0
2=A
I've never seen this second method in textbooks, but it can often save
you a whole lot of time over the "equate the coefficients and solve the
system of equations" method that they usually teach.
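The "pick useful values of x" method can be automated when the denominator is a product of distinct factors of the form (x - r). The Python sketch below is an illustration under that assumption; the function names eval_poly and cover_up are hypothetical, and the numerator is given as a coefficient list with the highest power first.

```python
from fractions import Fraction

def eval_poly(coeffs, x):
    """Evaluate a polynomial (highest power first) at x by Horner's rule."""
    result = Fraction(0)
    for c in coeffs:
        result = result * x + c
    return result

def cover_up(numerator, roots):
    """Constants A_i in P(x) / prod(x - r_i) = sum of A_i / (x - r_i).

    Assumes distinct roots and deg P < len(roots). Plugging x = r_i into
    P(x) = sum of A_k * prod_{j != k} (x - r_j) makes every term but one vanish.
    """
    constants = []
    for i, r in enumerate(roots):
        others = Fraction(1)
        for j, s in enumerate(roots):
            if j != i:
                others *= (r - s)
        constants.append(eval_poly(numerator, Fraction(r)) / others)
    return constants

# (3x + 2) / (x (x + 1)): the roots of the denominator are 0 and -1
print(cover_up([3, 2], [0, -1]))   # [Fraction(2, 1), Fraction(1, 1)], so A = 2 and B = 1
```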
• Find the partial-fraction decomposition of the following
expression:
x = 1: 1 + 1 = 0 + 0 + C + 0, so C = 2
x = 0: 1 = 0 + 0 + 0 – D, so D = –1
equations that I can solve for A and B. The particular x-values I
choose aren't important, so I'll pick smallish ones:
x = –1:
I'm still stuck solving a system of equations, but by using the easier
method to solve for C and D, I now have a simpler system to solve.
Adding the two equations, I get 3A = 3, so A = 1. Then B = 0 (so
that term in the expansion "vanishes"), and the complete
decomposition is:
In the above example, one of the coefficients turned out to be zero. This
doesn't happen often (in algebra classes, anyway), but don't be surprised
if you get zero, or even fractions, for some of your coefficients. The
textbooks usually stick pretty closely to nice neat whole numbers, but
not always. Don't just assume that a fraction or a zero is a wrong answer.
For instance:
...decomposes as:
Note that the numerator for the "x² + 3" fraction is a linear
polynomial, not just a constant term.
x – 3 = (–1 + B)x² + (C)x – 3
–1 + B = 0 and C = 1
B = 1 and C = 1
• Set up, but do not solve, the decomposition equality for the
following:
Since x² + 1 is not factorable, I'll have to use numerators with
linear factors. Then the decomposition set-up looks like this:
The long division rearranges the rational expression to give me:
The preferred placement of the "minus" signs, either "inside" the
fraction or "in front", may vary from text to text. Just don't leave a
"minus" sign hanging loose underneath.
Find
which implies
This gives x + 2 = A(x-1) + B(x+1). If we substitute x=1, we get B = 3/2,
and if we substitute x=-1, we get A = -1/2. Therefore, we have
Since we have ,
we get
Evaluate
Solution: First let us find the antiderivative
Hence
Since we have (see the table of Basic Formulas for Integrating Rational
Functions)
and
we get
Therefore, we have
a rectangular hyperbola (or, more specifically, its right branch) can be
analogously represented by
The hyperbolic functions enjoy properties similar to the trigonometric
functions; their definitions, though, are much more straightforward:
cosh x = (e^x + e^(-x))/2, sinh x = (e^x - e^(-x))/2
Here are their graphs: the cosh function (pronounce: "kosh") is pictured
in red, the sinh function (rhymes with the "Grinch") is depicted in blue.
In the picture below, the standard hyperbola is depicted in red, while the
The other hyperbolic functions are defined the same way the rest of the
trigonometric functions are defined:
tanh x = sinh x / cosh x     coth x = cosh x / sinh x
sech x = 1 / cosh x          csch x = 1 / sinh x
For every formula for the trigonometric functions, there is a similar (not
necessarily identical) formula for the hyperbolic functions:
Let's consider for example the addition formula for the hyperbolic cosine
function:
cosh(x + y) = cosh x cosh y + sinh x sinh y
Show that .
The hyperbolic sine function is a one-to-one function, and thus has an
inverse. As usual, we obtain the graph of the inverse hyperbolic sine
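Since sinh x is defined in terms of the exponential function, you should
not be surprised that its inverse function can be expressed in terms of the
logarithmic function: setting y = sinh x = (e^x - e^(-x))/2 and multiplying
by 2e^x gives e^(2x) - 2y e^x - 1 = 0, so e^x = y + √(y² + 1) (the positive root),
and consequently
arcsinh y = ln(y + √(y² + 1))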
You know what's coming up, don't you? Here's the graph. Note that the
hyperbolic cosine function is not one-to-one, so let's restrict the domain
to [0, ∞).
Hyperbolic and exponential discounting
Well, I don't know that it's "despicable" jargon (this is the kind of
statement that discourages people from being comfortable with
mathematics), and it worried me that a mathematical term was being
maligned, so I read on.
Immediate rewards
Question: If I offered you $100 today, or $105 in one month from now,
which would you choose?
Most people choose the immediate reward, because waiting for a month
for a bit more doesn't seem to make sense.
reward some time in the future. It's what marketers use all the time to
encourage you to buy now.
[Fig 2: Image source]
That's when it started to get interesting for me. The subject matter of the
graph suggested a physics application, rather than some behavioural
finance model. I began to suspect the author didn't really know what was
going on in this topic (or certainly didn't understand the example graph).
[Figure 3: Velocity of a skydiver V = 225(1 − e^(−t))]
The curve rises fairly quickly then flattens out to some limiting value
(which in this case is 2 coulombs).
The Vmax term in the hyperbolic curve above (Figure 2) was what made
me think of these exponential curves.
You can see the hyperbolic curve is similar to our "flipped" exponential
decay ones, but not identical. The hyperbolic curve appears to have
infinite slope at S = 0, whereas the exponential ones do not.
Hyperbolic curves
I didn't accept that Figure 2 was a hyperbolic curve at first because such
curves usually have 2 arms, something like this:
Enzyme Studies
The curve given in Figure 2 is not about behavioral economics at all (nor
is it about physics, as I suspected). It actually represents the Michaelis-
Menten equation, which arises in the study of enzymes. Here's what the
original page (where the Figure 2 graph came from) said:
The "velocity" refers to the speed of the reaction. For your convenience,
here is the curve for Vmax (Figure 2) we saw earlier:
They go on to say:
The Michaelis-Menten equation has the same form as the equation for a
rectangular hyperbola; graphical analysis of reaction rate (V) versus
substrate concentration [S] produces a hyperbolic rate plot.
Let's try it out with some simple constant values. I chose Vmax = 6, and
Km = 1.7.
Here is the graph of the hyperbola, showing the two arms I talked about
before.
But if we restrict our values of S such that S > 0, we obtain a curve that
looks a lot like Figure 2. The slope at S = 0 is not infinite, but it is quite
steep.
So the graph referred to in the marketing article is indeed hyperbolic.
The actual choice for their example was unfortunate, because the
variables seemed to have nothing to do with economics (they don't), and
I was no wiser about what was going on.
Let's now see how close the hyperbolic curve is to a similar exponential
one. I graphed V = 6(1 − e^(−0.2S)), (chosen so it has the same maximum
value) and here's the result:
Note the upper limit is the same as my hyperbolic curve, (Vmax = 6). The
shape of the two curves is similar, but not exactly the same. Here are the
2 graphs on the same set of axes:
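The two curves can also be compared by tabulating a few values. This Python sketch assumes the standard Michaelis-Menten form V = Vmax·S/(Km + S) with the constants chosen above (Vmax = 6, Km = 1.7) and the exponential V = 6(1 − e^(−0.2S)); the function names are hypothetical.

```python
import math

V_MAX = 6.0   # common upper limit of both curves
K_M = 1.7     # Michaelis constant chosen in the text
RATE = 0.2    # exponent rate used for the exponential curve

def hyperbolic(s):
    """Michaelis-Menten (rectangular hyperbola) curve, assumed standard form."""
    return V_MAX * s / (K_M + s)

def exponential(s):
    """'Flipped' exponential decay curve with the same upper limit."""
    return V_MAX * (1.0 - math.exp(-RATE * s))

for s in (0.0, 1.7, 5.0, 20.0, 100.0):
    print(f"S = {s:6.1f}  hyperbolic = {hyperbolic(s):.3f}  exponential = {exponential(s):.3f}")
```

Both curves start at 0 and approach 6, but the hyperbolic one rises faster near S = 0: its initial slope is Vmax/Km, about 3.5, compared with 6 × 0.2 = 1.2 for the exponential.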
Explanation
At the beginning of this post, I asked if you wanted $100 now or $105 in
a month. In most countries currently, the interest and inflation rates
mean it would be much better for you to wait for the month, as you
would make 5% on your money. At that monthly rate, you could make
around 80% per year!
Most people intuitively "discount" future money. That is, they know that
if they wait, they should get some reward for waiting. Now
mathematically, that reward grows exponentially (not as steep at the
beginning), because interest on money grows exponentially. But in our
minds, we want it now. That is, it's more like the hyperbolic case, where
we are impatient and want a higher reward for waiting even a short
while. Our impatience can be represented by the hyperbolic curve.
Applications
Conclusion
discounting. It will give you some breathing time to work out whether it
really is better to buy now, or wait.
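The last set of functions that we’re going to be looking at in this chapter
are the hyperbolic functions. In many physical situations combinations
of e^x and e^(−x) arise fairly often. Because of this these
combinations are given names. There are six hyperbolic functions
and they are defined as follows.
sinh x = (e^x − e^(−x)) / 2        cosh x = (e^x + e^(−x)) / 2
tanh x = sinh x / cosh x           coth x = cosh x / sinh x
sech x = 1 / cosh x                csch x = 1 / sinh x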
Here are the graphs of the three main hyperbolic functions.
You’ll note that these are similar, but not quite the same, to some of the
more common trig identities so be careful to not confuse the identities
here with those of the regular trig functions.
With this formula we’ll do the derivative for hyperbolic sine and leave
the rest to you as an exercise.
For the rest we can either use the definition of the hyperbolic function
and/or the quotient rule. Here are all six derivatives.
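As a sanity check, three of these derivatives can be confirmed numerically. This is a minimal sketch that builds sinh, cosh and tanh from their exponential definitions and compares a central-difference slope with the standard results (cosh x, sinh x and sech^2 x). The helper name central_difference and the sample point are arbitrary choices, not part of the text.

```python
import math


def sinh(x):
    return (math.exp(x) - math.exp(-x)) / 2


def cosh(x):
    return (math.exp(x) + math.exp(-x)) / 2


def tanh(x):
    return sinh(x) / cosh(x)


def central_difference(f, x, h=1e-6):
    """Numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 0.7  # arbitrary sample point
checks = {
    "d/dx sinh x = cosh x": (central_difference(sinh, x), cosh(x)),
    "d/dx cosh x = sinh x": (central_difference(cosh, x), sinh(x)),
    "d/dx tanh x = sech^2 x": (central_difference(tanh, x), 1 / cosh(x) ** 2),
}
for name, (numeric, exact) in checks.items():
    print(f"{name}: numeric={numeric:.8f}, formula={exact:.8f}")
```

Note that the derivative of cosh x is +sinh x, with no minus sign, which is one of the places where the hyperbolic functions differ from their trig counterparts.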
Here are a couple of quick derivatives using hyperbolic functions.
(a)
(b)
Solution
(a)
(b)
10.2.PHASORS IN ELECTRICAL CIRCUITS
A phasor is a complex number in polar form that you can apply to circuit
analysis. When you plot the amplitude and phase shift of a sinusoid in a
complex plane, you form a phase vector, or phasor.
The following sections explain how to find the different forms of
phasors and introduce you to the properties of phasors.
e^(jθ) = cos θ + j sin θ
where
j = √(−1)
The left side of Euler’s formula is the polar phasor form, and the right
side is the rectangular phasor form. You can write the cosine and sine as
follows:
cos θ = Re[e^(jθ)]
sin θ = Im[e^(jθ)]
In the equations shown here, Re[ ] denotes the real part of a complex
number, and Im[ ] denotes the imaginary part of a complex number.
Here is a cosine function and a shifted cosine function with a phase shift
of π/2.
In general, for the sinusoids shown here, you have an amplitude VA, a
radian frequency ω, and a phase shift of ϕ given by the following
expression:
V = VA e^(jϕ)
To describe a phasor, you need only the amplitude and phase shift (not
the radian frequency). Using Euler’s formula, the rectangular form of the
phasor is
V = VA cos ϕ + j VA sin ϕ
One key phasor property is the additive property. If you add sinusoids
that have the same frequency, then the resulting phasor is simply the
vector sum of the phasors — just like adding vectors:
V = V1 + V2 + … + VN
For this equation to work, phasors V1, V2, …,VN must have the same
frequency. You find this property useful when using Kirchhoff’s laws.
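The additive property is easy to try out with Python's built-in complex numbers. In this sketch the 50 Hz frequency, the two amplitudes and the helper names phasor and sinusoid are all assumptions made for illustration. It adds two phasors and checks that the result describes the sum of the two waveforms.

```python
import cmath
import math

OMEGA = 2 * math.pi * 50  # shared radian frequency (hypothetical 50 Hz source)


def phasor(amplitude, phase):
    """Polar form V = VA * e^(j*phi) as a Python complex number."""
    return cmath.rect(amplitude, phase)


def sinusoid(v, t, omega=OMEGA):
    """Time-domain signal Re[V * e^(j*omega*t)] represented by phasor v."""
    return (v * cmath.exp(1j * omega * t)).real


v1 = phasor(10.0, 0.0)          # 10 cos(wt)
v2 = phasor(5.0, math.pi / 2)   # 5 cos(wt + pi/2)
v_sum = v1 + v2                 # additive property: plain complex addition

amplitude, phase = cmath.polar(v_sum)
print(f"sum: amplitude={amplitude:.4f}, phase={math.degrees(phase):.2f} deg")

# The phasor sum reproduces the sum of the two waveforms at any instant.
for t in (0.0, 0.001, 0.004):
    direct = sinusoid(v1, t) + sinusoid(v2, t)
    assert abs(direct - sinusoid(v_sum, t)) < 1e-9
```

The check only works because both sinusoids share OMEGA. With different frequencies there is no single phasor for the sum.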
Another vital phasor property is the time derivative. The time derivative
of a sine wave is another scaled sine wave with the same frequency.
Taking the derivative of a phasor amounts to algebraic multiplication by jω in
the phasor domain. First, you relate the phasor of the original sine wave
to the phasor of the derivative:
Based on the phasor definition, the quantity (jωV) is the phasor of the
time derivative of a sine wave phasor V. Rewrite the phasor jωV as
sine wave by jω. See how the imaginary number j rotates a phasor by
90°?
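The time-derivative property can be checked the same way. This is a sketch under assumed values (ω = 100 rad/s, a phasor of amplitude 3 and phase π/6). It multiplies the phasor by jω and compares the result with a numerical derivative of the waveform.

```python
import cmath
import math

omega = 100.0                      # radian frequency (hypothetical value)
v = cmath.rect(3.0, math.pi / 6)   # phasor of v(t) = 3 cos(omega*t + pi/6)
dv = 1j * omega * v                # phasor of dv/dt

# Multiplying by j*omega scales the amplitude by omega and adds 90 degrees.
print(f"amplitude ratio: {abs(dv) / abs(v):.1f}")
print(f"phase shift: {math.degrees(cmath.phase(dv) - cmath.phase(v)):.1f} deg")


def waveform(phasor, t):
    """Time-domain signal Re[phasor * e^(j*omega*t)]."""
    return (phasor * cmath.exp(1j * omega * t)).real


# Compare with a numerical time derivative of the original waveform.
t, h = 0.0123, 1e-7
numeric = (waveform(v, t + h) - waveform(v, t - h)) / (2 * h)
print(f"numeric dv/dt = {numeric:.4f}, from phasor = {waveform(dv, t):.4f}")
```

The factor ω stretches the amplitude, and the factor j supplies the 90° rotation.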
Differentiation Formulas
In the first section of this chapter we saw the definition of the derivative
and we computed a couple of derivatives using the definition. As we
saw in those examples there was a fair amount of work involved in
computing the limits and the functions that we worked with were not
terribly complicated.
For more complex functions using the definition of the derivative would
be an almost impossible task. Luckily for us we won’t have to use the
definition terribly often. We will have to use it on occasion, however we
have a large collection of formulas and properties that we can use to
simplify our life considerably and will allow us to avoid using the
definition whenever possible.
We will introduce most of these formulas over the course of the next
several sections. We will start in this section with some of the basic
properties and formulas. We will give the properties and formulas in
this section in both “prime” notation and “fraction” notation.
Properties
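1) (f(x) ± g(x))' = f'(x) ± g'(x)
OR
d/dx (f(x) ± g(x)) = df/dx ± dg/dx
using the definition of the derivative.
2) (c f(x))' = c f'(x)
OR
d/dx (c f(x)) = c df/dx, c is any number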
Note that we have not included formulas for the derivative of products
or quotients of two functions here. The derivative of a product or
quotient of two functions is not the product or quotient of the derivatives
of the individual pieces. We will take a look at these in the next
section.
Formulas
1) If f(x) = c then f'(x) = 0
OR
d/dx (c) = 0
2) If f(x) = x^n then f'(x) = n x^(n−1)
OR
d/dx (x^n) = n x^(n−1), n is any number.
See the Proof of Various Derivative Formulas section of the Extras
chapter to see the proof of this formula. There are actually three
different proofs in this section. The first two restrict the formula to
n being an integer because at this point that is all that we can do.
The third proof is for the general rule, but does suppose
that you’ve read most of this chapter.
These are the only properties and formulas that we’ll give in this
section. Let’s compute some derivatives using these properties.
(a)
(b)
(c)
(d)
(e)
Solution
(a)
In this case we have the sum and difference of four terms and so we will
differentiate each of the terms using the first property from above and
then put them back together with the proper sign. Also, for each term
with a multiplicative constant remember that all we need to do is
“factor” the constant out (using the second property) and then do the
derivative.
Notice that in the third term the exponent was a one and so upon
subtracting 1 from the original exponent we get a new exponent of zero.
Now recall that x^0 = 1. Don’t forget to do any basic
arithmetic that needs to be done such as any multiplication and/or
division in the coefficients.
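The recipe described here (split the sum, factor out each constant, apply the power rule) is mechanical enough to sketch in a few lines. The function below and the sample polynomial are illustrative assumptions, not the example from the text. Terms are stored as (coefficient, exponent) pairs.

```python
from fractions import Fraction


def differentiate(terms):
    """Differentiate a sum of c*x^n terms given as (c, n) pairs.

    Applies the sum/difference property term by term, factors out each
    constant, and uses the power rule d/dx x^n = n*x^(n-1).
    """
    result = []
    for coefficient, exponent in terms:
        if exponent == 0:
            continue  # the derivative of a constant is zero
        result.append((coefficient * exponent, exponent - 1))
    return result


# Hypothetical example: f(x) = 4x^3 - 7x + 2 + 5x^(-2) + 6x^(1/2)
f = [(4, 3), (-7, 1), (2, 0), (5, -2), (6, Fraction(1, 2))]
print(differentiate(f))
```

The −7x term comes out with exponent 0 (so it is just −7), and the exponent −2 becomes −3, not −1. That is the "go the other way" mistake to avoid when subtracting one from a negative exponent.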
(b)
The point of this problem is to make sure that you deal with negative
exponents correctly. Here is the derivative.
Make sure that you correctly deal with the exponents in these cases,
especially the negative exponents. It is an easy mistake to “go the other
way” when subtracting one off from a negative exponent and get
instead of the correct .
(c)
Now in this function the second term is not correctly set up for us to use
the power rule. The power rule requires that the term be a variable to a
power only and the term must be in the numerator. So, prior to
differentiating we first need to rewrite the second term into a form that
we can deal with.
Note that we left the 3 in the denominator and only moved the variable
up to the numerator. Remember that the only thing that gets an
exponent is the term that is immediately to the left of the exponent. If
we’d wanted the three to come up as well we’d have written,
so be careful with this! It’s a very common mistake to bring the 3 up
into the numerator as well at this stage.
Now that we’ve gotten the function rewritten into a proper form that
allows us to use the Power Rule we can differentiate the function. Here
is the derivative for this part.
(d)
All of the terms in this function have roots in them. In order to use the
power rule we need to first convert all the roots to fractional exponents.
Again, remember that the Power Rule requires us to have a variable to a
number and that it must be in the numerator of the term. Here is the
function written in “proper” form.
In the last two terms we combined the exponents. You should always do
this with this kind of term. In a later section we will learn of a technique
that would allow us to differentiate this term without combining
exponents, however it will take significantly more work to do. Also
don’t forget to move the term in the denominator of the third term up to
the numerator. We can now differentiate the function.
Make sure that you can deal with fractional exponents. You will see a
lot of them in this class.
(e)
In all of the previous examples the exponents have been nice integers or
fractions. That is usually what we’ll see in this class. However, the
exponent only needs to be a number so don’t get excited about problems
like this one. They work exactly the same.
The answer is a little messy and we won’t reduce the exponents down to
decimals. However, this problem is not terribly difficult it just looks
that way initially.
There is a general rule about derivatives in this class that you will need
to get into the habit of using. When you see radicals you should always
first convert the radical to a fractional exponent and then simplify
exponents as much as possible. Following this rule will save you a lot of
grief in the future.
Back when we first put down the properties we noted that we hadn’t
included a property for products and quotients. That doesn’t mean that
we can’t differentiate any product or quotient at this point. There are
some that we can do.
(a)
(b)
Solution
(a)
In this function we can’t just differentiate the first term, differentiate the
second term and then multiply the two back together. That just won’t
work. We will discuss this in detail in the next section so if you’re not
sure you believe that hold on for a bit and we’ll be looking at that soon
as well as showing you an example of why it won’t work.
(b)
As with the first part we can’t just differentiate the numerator and the
denominator and then put it back together as a fraction. Again, if you’re
not sure you believe this, hold on until the next section and we’ll take a
more detailed look at this.
So, as we saw in this example there are a few products and quotients that
we can differentiate. If we can first do some simplification the functions
will sometimes simplify into a form that can be differentiated using the
properties and formulas in this section.
Example 3 Is increasing,
decreasing or not changing at ?
Solution
Note that we rewrote the last term in the derivative back as a fraction.
This is not something we’ve done to this point and is only being done
here to help with the evaluation in the next step. It’s often easier to do
the evaluation with positive exponents.
at .
Solution
So, we will need the derivative of the function (don’t forget to get rid of
the radical).
Example 5 The position of an object at any time t (in hours) is given
by,
Determine when the object is moving to the right and when the object is
moving to the left.
Solution
The only way that we’ll know for sure which direction the object is
moving is to have the velocity in hand. Recall that if the velocity is
positive the object is moving off to the right and if the velocity is
negative then the object is moving to the left.
So, we need the derivative since the derivative is the velocity of the
object. The derivative is,
We can see from the factored form of the derivative that the derivative
will be zero at and . Let’s graph these
points on a number line.
Now, we can see that these two points divide the number line into three
distinct regions. In each of these regions we know that the derivative
will be the same sign. Recall the derivative can only change sign at the
two points that are used to divide the number line up into the regions.
Here are the intervals in which the derivative is positive and negative.
We included negative t’s here because we could even though they may
not make much sense for this problem. Once we know this we also can
answer the question. The object is moving to the right and left in the
following intervals.
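The sign-chart procedure can be sketched in code. Because the position function of this example is not reproduced above, the cubic below is a hypothetical stand-in whose velocity factors as 6(t − 2)(t − 5). One test point per region is enough to read off the direction of motion.

```python
def position(t):
    """Hypothetical position function, chosen only for illustration."""
    return 2 * t**3 - 21 * t**2 + 60 * t - 10


def velocity(t):
    """Derivative of position: 6t^2 - 42t + 60 = 6(t - 2)(t - 5)."""
    return 6 * t**2 - 42 * t + 60


critical_times = [2, 5]  # where the velocity is zero

# One test point in each region is enough, because the sign of the
# velocity can only change at the critical times.
test_points = [critical_times[0] - 1,
               sum(critical_times) / 2,
               critical_times[1] + 1]
for t in test_points:
    direction = "right" if velocity(t) > 0 else "left"
    print(f"t = {t}: velocity = {velocity(t)}, moving {direction}")
```

For this stand-in function the object moves right before t = 2, left between t = 2 and t = 5, and right again afterwards.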
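ƒ'(x) = lim (h→0) [ƒ(x + h) − ƒ(x)] / h
and if this limit exists
ƒ'(c) = lim (x→c) [ƒ(x) − ƒ(c)] / (x − c)
Differentiation Rules
3. d/dx [uv] = uv' + vu' (product rule)
4. d/dx [u/v] = (vu' − uv') / v^2 (quotient rule)
5. d/dx [c] = 0
6. d/dx [u^n] = n u^(n−1) u' (power rule)
7. d/dx [x] = 1
8. d/dx [ln u] = u'/u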
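Derivatives of the Inverse Trigonometric Functions
1. d/dx [arcsin u] = u' / √(1 − u^2)        2. d/dx [arccsc u] = −u' / (|u| √(u^2 − 1))
3. d/dx [arccos u] = −u' / √(1 − u^2)       4. d/dx [arcsec u] = u' / (|u| √(u^2 − 1))
5. d/dx [arctan u] = u' / (1 + u^2)         6. d/dx [arccot u] = −u' / (1 + u^2)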
Implicit Differentiation
3y^2 (dy/dx) + (x (dy/dx) + y) − 2 (dy/dx) − 2x = 0
(dy/dx)(3y^2 + x − 2) = 2x − y
dy/dx = (2x − y) / (3y^2 + x − 2)
ƒ^(n)(x) = y^(n)
y' = 5x^4
y'' = 20x^3
y''' = 60x^2
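The three derivatives listed are what repeated differentiation of y = x^5 produces (the starting function is inferred here from y' and is not stated above). A small sketch, storing a polynomial as a coefficient list [a0, a1, a2, ...]:

```python
def poly_derivative(coefficients):
    """Differentiate a polynomial given as [a0, a1, a2, ...] (a_k multiplies x^k)."""
    return [k * a for k, a in enumerate(coefficients)][1:]


y = [0, 0, 0, 0, 0, 1]  # y = x^5 (assumed starting function)
for order in range(1, 4):
    y = poly_derivative(y)
    print(f"derivative {order}: {y}")
```

The lists printed are [0, 0, 0, 0, 5], [0, 0, 0, 20] and [0, 0, 60], that is 5x^4, 20x^3 and 60x^2.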
Logarithmic Differentiation
It is often advantageous to use logarithms to differentiate certain
functions.
1. Take the natural logarithm of both sides
2. Differentiate
3. Solve for y'
4. Substitute for y
5. Simplify
Exercise:
Find for y =
y' =
y' =
ƒ'(c) =
L'Hôpital's Rule
lim ƒ(x)/g(x) = lim ƒ'(x)/g'(x)
, , , and
Exercise: What is ?
(A) 2
(B) 1
(C) 0
(D)
(E) The limit does not exist.
The answer is B. The limit equals 1.
The answer is B.
y' = 4x
y'(1) = 4(1) = 4
slope of normal = −1/4
If a function ƒ(x) is continuous on a closed interval, then ƒ(x) has both a
maximum and minimum value in the interval.
Curve Sketching
Situation                                   Indicates
ƒ'(c) > 0                                   ƒ increasing at c
ƒ'(c) < 0                                   ƒ decreasing at c
ƒ'(c) = 0                                   horizontal tangent at c
ƒ'(c) = 0, ƒ'(c-) < 0, ƒ'(c+) > 0           relative minimum at c
ƒ'(c) = 0, ƒ'(c-) > 0, ƒ'(c+) < 0           relative maximum at c
ƒ'(c) = 0, ƒ''(c) > 0                       relative minimum at c
ƒ'(c) = 0, ƒ''(c) < 0                       relative maximum at c
ƒ'(c) = 0, ƒ''(c) = 0                       further investigation required
ƒ''(c) > 0                                  concave upward
ƒ''(c) < 0                                  concave downward
ƒ''(c) = 0                                  further investigation required
ƒ''(c) = 0, ƒ''(c-) < 0, ƒ''(c+) > 0        point of inflection
ƒ''(c) = 0, ƒ''(c-) > 0, ƒ''(c+) < 0        point of inflection
ƒ(c) exists, ƒ'(c) does not exist           possibly a vertical tangent; possibly an absolute max. or min.
x_(n+1) = x_n − ƒ(x_n) / ƒ'(x_n)
To use Newton's Method, let x_1 be a guess for one of the roots. Repeat
the iteration, feeding each result back in, until the required accuracy is obtained.
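A minimal sketch of the iteration follows. The function name newton, the stopping rule and the test function ƒ(x) = x^2 − 2 are assumptions made here for illustration.

```python
def newton(f, f_prime, x, tolerance=1e-12, max_iterations=50):
    """Iterate x_(n+1) = x_n - f(x_n)/f'(x_n) until successive guesses agree."""
    for _ in range(max_iterations):
        slope = f_prime(x)
        if slope == 0:
            raise ZeroDivisionError("f'(x) is zero; choose a different starting guess")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tolerance:
            return x_next
        x = x_next
    raise RuntimeError("Newton's method did not converge")


# Hypothetical example: the positive root of f(x) = x^2 - 2, starting from x_1 = 1.
root = newton(lambda x: x**2 - 2, lambda x: 2 * x, 1.0)
print(root)
```

The guard against a zero slope matters: where ƒ'(x_n) = 0 the tangent line never crosses the x-axis and the formula breaks down.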
Optimization Problems
V = x^2 h = 500
S = x^2 + 4xh = x^2 + 4x(500/x^2) = x^2 + (2000/x)
S' = 2x − (2000/x^2) = 0
2x^3 = 2000
x = 10, h = 5
Dimensions: 10 x 10 x 5 inches
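The result can be double-checked numerically. The sketch below takes S(x) = x^2 + 2000/x from the working above and locates the zero of S'(x) by bisection. The bracketing interval [1, 50] is an arbitrary choice.

```python
def surface(x):
    """S(x) = x^2 + 2000/x, after substituting h = 500/x^2."""
    return x**2 + 2000 / x


def surface_prime(x):
    """S'(x) = 2x - 2000/x^2."""
    return 2 * x - 2000 / x**2


# Bisection on S'(x) = 0; S' is negative at x = 1 and positive at x = 50.
low, high = 1.0, 50.0
for _ in range(100):
    mid = (low + high) / 2
    if surface_prime(mid) < 0:
        low = mid
    else:
        high = mid

x = (low + high) / 2
h = 500 / x**2
print(f"x = {x:.4f}, h = {h:.4f}, S = {surface(x):.4f}")
```

Since S' goes from negative to positive at x = 10, the critical point is a minimum, with S = 300 there.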
Rates-of-Change Problems
Calculus can be used to find the rates of change of two or more variables
that are functions of time t by differentiating with respect to t.
b) = ft/sec a) = ft/sec
Note: the answers are independent of the distance from the light.
V= r2h h2
5= h2
V= h3 ft/hr
11.1. Limits
Let’s first start off with the following “definition” of a limit.
Definition
This is not the exact, precise definition of a limit. If you would like to
see the more precise and mathematical definition of a limit you should
check out the Definition of a Limit section at the end of this
chapter. The definition given above is more of a “working” definition.
This definition helps us to get an idea of just what limits are and what
they can tell us about functions.
So just what does this definition mean? Well let’s suppose that we know
that the limit does in fact exist. According to our “working” definition
we can then decide how close to L that we’d like to make f(x). For sake
of argument let’s suppose that we want to make f(x) no more than 0.001
away from L. This means that we want one of the following
This is actually a fairly important idea. There are many functions out
there in the world that we can make as close to L as we want for specific
values of x that are close to a, but there will be other values of x closer to a that give
function values that are nowhere near close to L. In order for a limit to
exist once we get f(x) as close to L as we want for some x then it will
need to stay that close to L (or get closer) for all values of x that are
closer to a. We’ll see an example of this later in this section.
In somewhat simpler terms the definition says that as x gets closer and
closer to x=a (from both sides of course…) then f(x) must be getting
closer and closer to L. Or, as we move in towards x=a then f(x) must be
moving in towards L.
How do we use this definition to help us estimate limits? We do exactly
what we did in the previous section. We take x’s on both sides of x=a
that move in closer and closer to a and we plug these into our function.
We then look to see if we can determine what number the function
values are moving in towards and use this as our estimate.
Solution
Notice that I did say estimate the value of the limit. Again, we are not
going to directly compute limits in this section. The point of this section
is to give us a better idea of how limits work and what they can tell us
about the function.
So, with that in mind we are going to work this in pretty much the same
way that we did in the last section. We will choose values of x that get
closer and closer to x=2 and plug these values into the function. Doing
this gives the following table of values.
x f(x) x f(x)
2.5 3.4 1.5 5.0
2.1 3.857142857 1.9 4.157894737
2.01 3.985074627 1.99 4.015075377
2.001 3.998500750 1.999 4.001500750
2.0001 3.999850007 1.9999 4.000150008
2.00001 3.999985000 1.99999 4.000015000
Note that we made sure and picked values of x that were on both sides of
x = 2 and that we moved in very close to x = 2 to
make sure that any trends that we might be seeing are in fact correct.
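A table like this takes only a few lines to generate. The function is not shown above, so the sketch assumes f(x) = (x^2 + 4x − 12) / (x^2 − 2x), which agrees with the tabulated values (for instance f(2.5) = 3.4 and f(1.5) = 5.0).

```python
def f(x):
    """Assumed function; undefined at x = 2, where it gives 0/0."""
    return (x**2 + 4 * x - 12) / (x**2 - 2 * x)


print(f"{'x':>9} {'f(x)':>13} {'x':>9} {'f(x)':>13}")
for k in range(6):
    step = 0.5 if k == 0 else 10.0 ** (-k)
    right, left = 2 + step, 2 - step  # one point on each side of x = 2
    print(f"{right:>9.5f} {f(right):>13.9f} {left:>9.5f} {f(left):>13.9f}")
```

Calling f(2) raises a ZeroDivisionError, which is the code's way of saying the function does not exist at x = 2. The table never needs that value.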
function as this would give us a division by zero error. This is not a
problem since the limit doesn’t care what is happening at the point in
question.
Let’s think a little bit more about what’s going on here. Let’s graph the
function from the last example. The graph of the function in the range
of x’s that we’re interested in is shown below.
As we were plugging in values of x into the function we are in effect
moving along the graph in towards the point as x → 2. This
is shown in the graph by the two arrows on the graph that are moving in
towards the point.
When we are computing limits the question that we are really asking is
what y value is our graph approaching as we move in towards x = 2
on our graph. We are NOT asking what y value the graph takes
at the point in question. In other words, we are asking what the graph is
doing around the point x = 2. In our case we can see that as
x moves in towards 2 (from both sides) the function is approaching
y = 4 even though the function itself doesn’t even exist at
x = 2. Therefore we can say that the limit is in fact 4.
So what have we learned about limits? Limits are asking what the
function is doing around x = a and are not concerned with
what the function is actually doing at x = a. This is a good
thing as many of the functions that we’ll be looking at won’t even exist
at x = a as we saw in our last example.
Solution
The first thing to note here is that this is exactly the same function as the
first example with the exception that we’ve now given it a value for
x = 2. So, let’s first note that the function’s value at x = 2 is now 6.
As far as estimating the value of this limit goes, nothing has changed in
comparison to the first example. We could build up a table of values as
we did in the first example or we could take a quick look at the graph of
the function. Either method will give us the value of the limit.
Let’s first take a look at a table of values and see what that tells us.
Notice that the presence of the value for the function at x = 2
will not change our choices for x. We only choose values of x that are
getting closer to x = 2 but we never take x = 2.
In other words the table of values that we used in the first example will
be exactly the same table that we’ll use here. So, since we’ve already
got it down once there is no reason to redo it here.
From this table it is again clear that the limit is 4.
The limit is NOT 6! Remember from the discussion after the first
example that limits do not care what the function is actually doing at the
point in question. Limits are only concerned with what is going on
around the point. Since the only thing about the function that we
actually changed was its behavior at x = 2 this will not
change the limit.
Let’s also take a quick look at this function's graph to see if this says the
same thing.
Again, we can see that as we move in towards x = 2 on our
graph the function is still approaching a y value of 4. Remember that we
are only asking what the function is doing around x = 2 and
we don’t care what the function is actually doing at x = 2.
The graph then also supports the conclusion that the limit is 4.
Let’s make the point one more time just to make sure we’ve got it.
Limits are not concerned with what is going on at x = a.
Limits are only concerned with what is going on around
x = a. We keep saying this, but it is a very important concept about limits
that we must always keep in mind. So, we will take every opportunity to
remind ourselves of this idea.
Let’s take a look at another example to try and beat this idea into the
ground.
Solution
First don’t get excited about the θ in the function. It’s just a letter, just like
x is a letter! It’s a Greek letter, but it’s a letter and you will be asked to
deal with Greek letters on occasion so it’s a good idea to start getting
used to them at this point.
Now, also notice that if we plug in θ=0 that we will get division by
zero and so the function doesn’t exist at this point. Actually, we get 0/0
at this point, but because of the division by zero this function does not
exist at θ=0.
So, as we did in the first example let’s get a table of values and see
if we can guess what value the function is heading in towards.
θ f(θ) θ f(θ)
1 0.45969769 -1 -0.45969769
0.1 0.04995835 -0.1 -0.04995835
0.01 0.00499996 -0.01 -0.00499996
0.001 0.00049999 -0.001 -0.00049999
θ moves in towards 0, from both sides of course.
Therefore, we will guess that the limit has the value 0.
So, once again, the limit had a value even though the function didn’t
exist at the point we were interested in.
It’s now time to work a couple of more examples that will lead us into
the next idea about limits that we’re going to want to discuss.
Solution
Let’s build up a table of values and see what’s going on with our
function in this case.
t f(t) t f(t)
1 -1 -1 -1
0.1 1 -0.1 1
0.01 1 -0.01 1
0.001 1 -0.001 1
Now, if we were to guess the limit from this table we would guess that
the limit is 1. However, if we did make this guess we would be wrong.
Consider any of the following function evaluations.
In all three of these function evaluations we evaluated the function at a
number that is less than 0.001 and got three totally different numbers.
Recall that the definition of the limit that we’re working with requires
that the function be approaching a single value (our guess) as t gets
closer and closer to the point in question. It doesn’t say that only some
of the function values must be getting closer to the guess. It says that all
the function values must be getting closer and closer to our guess.
point in question.
This function clearly does not settle in towards a single number and so
this limit does not exist!
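It is worth seeing how a table can mislead. The function is not reproduced above, so this sketch assumes f(t) = cos(π/t), which matches the table (f(1) = −1 and f(0.1) = f(0.01) = f(0.001) = 1). The extra sample points are chosen here for illustration.

```python
import math


def f(t):
    """Assumed function cos(pi/t); it reproduces the table above."""
    return math.cos(math.pi / t)


# The "nice" inputs from the table all land on the same value...
for t in (0.1, 0.01, 0.001):
    print(f"f({t}) = {f(t):.6f}")

# ...but other inputs just as close to 0 give very different results.
for numerator, denominator in ((1, 2001), (2, 2001), (4, 4001)):
    t = numerator / denominator
    print(f"f({numerator}/{denominator}) = {f(t):.6f}")
```

The inputs 1/2001, 2/2001 and 4/4001 are all smaller than 0.001, yet the function values are close to −1, 0 and √2/2. No single number is being approached.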
This last example points out the drawback of just picking values of x
using a table of function values to estimate the value of a limit. The
values of x that we chose in the previous example were valid and in fact
were probably values that many would have picked. In fact they were
exactly the same values we used in the problem before this one and they
worked in that problem!
When using a table of values there will always be the possibility that we
aren’t choosing the correct values and that we will guess incorrectly for
our limit. This is something that we should always keep in mind when
doing this to guess the value of limits. In fact, this is such a problem
that after this section we will never use a table of values to guess the
value of a limit again.
This last example also has shown us that limits do not have to exist. To
this point we’ve only seen limits that have existed, but that just doesn’t
always have to be the case.
Solution
We can see from the graph that if we approach from the
right side the function is moving in towards a y value of 1. Well
actually it’s just staying at 1, but in the terminology that we’ve been
using in this section it’s moving in towards 1…
Note that the limit in this example is a little different from the previous
example. In the previous example the function did not settle down to a
single number as we moved in towards . In this example
however, the function does settle down to a single number as
on either side. The problem is that the number is different on
each side of . This is an idea that we’ll look at in a little
more detail in the next section.
true according to the natural order on the real line in term of sizes,
is big, very big!
, , etc.
Example. Consider the function
When , then . So
Note that when x gets closer to 3, then the points on the graph get closer
to the (dashed) vertical line x=3. Such a line is called a vertical
asymptote. For a given function f(x), there are four cases, in which
vertical asymptotes can present themselves:
(i)
; ;
(ii)
; ;
(iii)
; ;
(iv)
; ;
Next we investigate the behavior of functions when . We have
We have
which implies
Note that when x gets closer to (x gets large), then the points on the
graph get closer to the horizontal line y=2. Such a line is called a
horizontal asymptote.
In particular, we have
For , we have to be careful about the definition of the power of
negative numbers. In particular, we have
We have
So we have
We have
and then
When x goes to −∞, then x < 0, which implies that |x| = -x. Hence
Remark. Be careful! A common mistake is to assume that .
Exercise 1. Find
Answer.
Hence
Exercise 2. Find
Answer.
We have
So
Exercise 3. Find
Answer to Exercise 3
We have
Since
we get
Exercise 4. Find
Answer to Exercise 4
We have
Since
which implies
On the other hand, we have
which implies
Remark. You may think: Why make things so complicated? Isn't there
an easier way?
But the difference of two very large numbers may not be small. In fact,
Exercise 5. Find the vertical and horizontal asymptotes for the graph of
Answer to Exercise 5
The vertical asymptotes are found at the "bad" points, i.e. the points
that are not in the domain. In this case, we have one "bad" point, where
the denominator equals zero: x=2. We have
and
So x=2 is a vertical asymptote. On the other hand, we have
and
11.2.THE DERIVATIVE
In the first section of the last chapter we saw that the computation of the
slope of a tangent line, the instantaneous rate of change of a function,
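and the instantaneous velocity of an object at x = a all
required us to compute the following limit.
lim (x→a) [f(x) − f(a)] / (x − a)
We also saw that with a small change of notation this limit could also be
written as,
lim (h→0) [f(a + h) − f(a)] / h        (1)
The derivative of f(x) with respect to x is the function f'(x), defined as
f'(x) = lim (h→0) [f(x + h) − f(x)] / h        (2)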
Note that we replaced all the a’s in (1) with x’s to acknowledge the fact
as “f prime of x”.
Solution
So, all we really need to do is to plug this function into the definition of
the derivative, (1), and do some algebra. Admittedly, the algebra
will get somewhat unpleasant at times, but it’s just algebra so don’t get
excited about the fact that we’re now computing derivatives.
Be careful and make sure that you properly deal with parentheses when
doing the subtracting.
Now, we know from the previous chapter that we can’t just plug in
h = 0 since this will give us a division by zero error. So we
are going to have to do some work. In this case that means multiplying
everything out and distributing the minus sign through on the second
term. Doing this gives,
So, the derivative is,
Solution
previous examples. First, we plug the function into the definition of the
derivative,
Note that we changed all the letters in the definition to match up with the
given function. Also note that we wrote the fraction in a much more
compact manner to help us with the work.
Before finishing this let’s note a couple of things. First, we didn’t
multiply out the denominator. Multiplying out the denominator will just
overly complicate things so let’s keep it simple. Next, as with the first
example, after the simplification we only have terms with h’s in them
left in the numerator and so we can now cancel an h out.
So, upon canceling the h we can evaluate the limit and get the
derivative.
Example 3 Find the derivative of the following function using the
definition of the derivative.
Solution
First plug into the definition of the derivative as we’ve done with the
previous two examples.
(in this case) we multiply both the numerator and denominator by the
numerator except we change the sign between the two terms. Here’s the
rationalizing work for this problem,
Again, after the simplification we have only h’s left in the numerator.
So, cancel the h and evaluate the limit.
And so we get a derivative of,
Let’s work one more example. This one will be a little different, but it’s
got a point that needs to be made.
Solution
Since this problem is asking for the derivative at a specific point we’ll
go ahead and use that in our work. It will make our life easier and that’s
always a good thing.
We saw a situation like this back when we were looking at limits at
infinity. As in that section we can’t just cancel the h’s. We will have to
look at the two one sided limits and recall that
The two one-sided limits are different and so
doesn’t exist. However, this is the limit that gives us the derivative that
we’re after.
If the limit doesn’t exist then the derivative doesn’t exist either.
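The two one-sided limits can be seen directly by evaluating the difference quotient of the absolute value function at 0 with positive and negative h. The helper name difference_quotient is an assumption made for this sketch.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


for h in (0.1, 0.01, 0.001):
    right = difference_quotient(abs, 0.0, h)
    left = difference_quotient(abs, 0.0, -h)
    print(f"h = {h}: from the right {right:+.1f}, from the left {left:+.1f}")
```

However small h becomes, the slope from the right stays at +1 and the slope from the left stays at −1, so there is no single limiting value.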
In this example we have finally seen a function for which the derivative
doesn’t exist at a point. This is a fact of life that we’ve got to be aware
of. Derivatives will not always exist. Note as well that this doesn’t say
anything about whether or not the derivative exists anywhere else. In
fact, the derivative of the absolute value function exists at every point
except the one we just looked at, x = 0.
Definition
Theorem
If f(x) is differentiable at x = a then
f(x) is continuous at x = a.
Note that this theorem does not work in reverse. Consider
f(x) = |x| at x = 0. So, f(x) = |x| is continuous at x = 0 but
is not differentiable at x = 0.
Alternate Notation
Next we need to discuss some alternate notation for the derivative. The
typical derivative notation is the “prime” notation. However, there is
another notation that is used on occasion so let’s cover that.
Because we also need to evaluate derivatives on occasion we also need a
notation for evaluating derivatives when using the fractional notation.
So if we want to evaluate the derivative at x=a all of the following are
equivalent.
Note as well that on occasion we will drop the (x) part on the function to
simplify the notation somewhat. In these cases the following are
equivalent.
sometimes painful) process filled with opportunities to make mistakes.
In a couple of sections we’ll start developing formulas and/or properties
that will help us to take the derivative of many of the common functions
so we won’t need to resort to the definition of the derivative too often.
f '(x) = 0
Example
f '(x) = r x^(r − 1)
Example
f '(x) = c g '(x)
Example
f(x) = 3x^3,
f(x) = x^2 + 4
f(x) = x^3 − x^(−2)
f(x) = (x^2 − 2x)(x − 2)
let g(x) = (x^2 − 2x) and h(x) = (x − 2), then
f '(x) = g(x) h '(x) + h(x) g '(x) = (x^2 − 2x)(1) + (x − 2)(2x − 2)
= x^2 − 2x + 2x^2 − 6x + 4 = 3x^2 − 8x + 4
f(x) = (x − 2) / (x + 1)
= ( (x + 1)(1) − (x − 2)(1) ) / (x + 1)^2
= 3 / (x + 1)^2
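The product and quotient results above can be spot-checked numerically. This sketch compares each closed-form derivative with a central-difference estimate. The helper central_difference and the sample point x = 1.5 are arbitrary choices made here.

```python
def central_difference(f, x, h=1e-6):
    """Numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


# (function, derivative obtained with the product or quotient rule)
examples = {
    "product": (lambda x: (x**2 - 2 * x) * (x - 2),
                lambda x: 3 * x**2 - 8 * x + 4),
    "quotient": (lambda x: (x - 2) / (x + 1),
                 lambda x: 3 / (x + 1) ** 2),
}

x = 1.5  # arbitrary sample point away from x = -1
for name, (f, f_prime) in examples.items():
    print(f"{name}: numeric={central_difference(f, x):.6f}, rule={f_prime(x):.6f}")
```

Agreement at one point is not a proof, but a mismatch would immediately flag an algebra slip in the expansion.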
exists.
Similarly, f is differentiable on an open interval (a, b) if
To find the limit of the function's slope when the change in x is 0, we
can either use the true definition of the derivative and do
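# Sage code: x is Sage's predefined symbolic variable
def f(x):
    return 1/x^2
var('h')  # symbolic step size
# simplify the difference quotient first, then set h = 0
((f(x+h)-f(x))/h).rational_simplify().subs(h=0)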
plot(abs(x), x, -5, 5)
What about at x = 0? The "logical" response would be to see that g(0) =
0 and say that g'(0) must therefore equal 0. Careful, though...looking
back at the limit definition of the derivative, the derivative of f at a point
c is the limit of the slope of f as the change in its independent variable
approaches 0. Really, the only relevant piece of information is the
behavior of function's slope close to c. Referring back to the example,
since the limit of g'(x) as x approaches 0 from the left ≠ the limit of g'(x)
as x approaches 0 from the right, g'(0) does not exist. We can use the
limit definition of the derivative to prove this:
so , which is undefined.
In this form, it makes far more sense why g'(0) is undefined. By simply
looking at the graph of g, too, one can see that the sudden "twist" at x =
0 is responsible for our inability to evaluate g' there. We can now justly
pronounce that g is differentiable on (-∞, 0) ∪ (0, ∞), and g' is continuous
on that same set.
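The one-sided behavior described above can also be seen numerically. The Python sketch below computes the difference quotient of the absolute value function at 0 from the left and from the right; the function name one_sided_slopes and the step size h are illustrative assumptions.

```python
def one_sided_slopes(g, c, h=1e-6):
    """Difference quotients of g at c using a step to the left and to the right."""
    left = (g(c - h) - g(c)) / (-h)
    right = (g(c + h) - g(c)) / h
    return left, right

left, right = one_sided_slopes(abs, 0.0)
print(left, right)
```

The two slopes disagree no matter how small h is made, which is the numerical counterpart of the limit failing to exist.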
p = plot(sqrt(x-2), x, 2, 5)
pt1 = point((3, 1), rgbcolor="white", pointsize=30, faceted=True)
pt2 = point((3, 2), rgbcolor="black", pointsize=30)
l = line([(3, 1), (3, 2)], linestyle="--")
(p+pt1+pt2+l).show(xmin=0)
Not only is v(t) defined solely on [2, ∞), it has a jump discontinuity at t
= 3. The jump discontinuity causes v'(t) to be undefined at t = 3; do you
see why? Using a slightly modified limit definition of the derivative,
think of what
(v(x) - v(c)) / (x - c)
would be for c = 3 and some x very close to 3. The resulting slope would
be astronomically large either negatively or positively, right? In fact, the
dashed line connecting v(t) for t ≠ 3 and v(3) is what the tangent line
will look like at that point. Since a function's derivative cannot be
infinitely large and still be considered to "exist" at that point, v is not
differentiable at t = 3.
Local maximum and minimum points are quite distinctive on the graph
of a function, and are therefore useful in understanding the shape of the
graph. In many applied problems we want to find the largest or smallest
value that a function achieves (for example, we might want to find the
minimum cost at which some task can be performed) and so identifying
maximum and minimum points will be useful for applied problems as
well. Some examples of local maximum and minimum points are shown
in figure 5.1.1.
Figure 5.1.1. Some local maximum points (A) and minimum points (B).
Thus, the only points at which a function can have a local maximum or
minimum are points at which the derivative is zero, as in the left hand
graph in figure 5.1.1, or the derivative is undefined, as in the right hand
graph. Any value of x for which f′(x) is zero or undefined is
called a critical value for f. When looking for local maximum and
minimum points, you are likely to make two sorts of mistakes: You may
forget that a maximum or minimum can occur where the derivative does
not exist, and so forget to check whether the derivative exists
everywhere. You might also assume that any place that the derivative is
zero is a local maximum or minimum point, but this is not true. A
portion of the graph of f(x)=x^3 is shown in figure 5.1.2. The
derivative of f is f′(x)=3x^2, and f′(0)=0, but
there is neither a maximum nor minimum at (0,0).
Suppose, for example, that we have identified three points at which
f′ is zero or nonexistent: (x1,y1), (x2,y2), (x3,y3), and x1<x2<x3
(see figure 5.1.3). Suppose that we compute the value of f(a) for
x1<a<x2, and that f(a)<f(x2). What can we say about the graph between
a and x2? Could there be a point (b,f(b)), a<b<x2, with
f(b)>f(x2)? No: if there were, the graph would go up from
(a,f(a)) to (b,f(b)) then down to (x2,f(x2))
and somewhere in between would have a local maximum point. (This is
not obvious; it is a result of the Extreme Value Theorem, theorem 6.1.2.)
But at that local maximum point the derivative of f would be zero or
nonexistent, yet we already know that the derivative is zero or
nonexistent only at x1, x2, and x3. The upshot is that one
computation tells us that (x2,f(x2)) has the largest y
coordinate of any point on the graph near x2 and to the left of
x2. We can perform the same test on the right. If we find that on both
sides of x2 the values are smaller, then there must be a local
maximum at (x2,f(x2)); if we find that on both sides of
x2 the values are larger, then there must be a local minimum at
(x2,f(x2)); if we find one of each, then there is neither a local
maximum nor a local minimum at x2.
Example 5.1.2 Find all local maximum and minimum points for the
function f(x)=x^3−x. The derivative is
f′(x)=3x^2−1. This is defined everywhere and is zero at
x=±√3/3. Looking first at x=√3/3, we see that
f(√3/3)=−2√3/9. Now we test two points on either side of
x=√3/3, making sure that neither is farther away than the nearest critical
value; since √3<3, √3/3<1 and we can use x=0 and
x=1. Since f(0)=0>−2√3/9 and
f(1)=0>−2√3/9, there must be a local minimum at x=√3/3. For
x=−√3/3, we see that f(−√3/3)=2√3/9.
This time we can use x=0 and x=−1, and we find that
f(−1)=f(0)=0<2√3/9, so there must be a local maximum
at x=−√3/3.
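The test used in Example 5.1.2 (compare the value at a critical value with the values at one test point on each side) can be sketched in a few lines of Python. The function name classify and its return strings are illustrative assumptions; the test points 0, 1 and −1 are the ones chosen in the example.

```python
import math

def f(x):
    return x**3 - x

def classify(func, c, left, right):
    """Compare func(c) with func at one test point on each side of c.

    The test points must lie between c and the nearest critical values.
    """
    fc, fl, fr = func(c), func(left), func(right)
    if fl < fc and fr < fc:
        return "local maximum"
    if fl > fc and fr > fc:
        return "local minimum"
    return "neither"

c = math.sqrt(3) / 3                      # positive critical value of f
result_pos = classify(f, c, 0.0, 1.0)     # test points x = 0 and x = 1
result_neg = classify(f, -c, -1.0, 0.0)   # test points x = -1 and x = 0
print(result_pos, result_neg)
```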
Example 5.1.3 Find all local maximum and minimum points for
f(x)=sin x+cos x. The derivative is
f′(x)=cos x−sin x. This is always defined and is zero whenever
cos x=sin x. Recalling that cos x and sin x are
the x and y coordinates of points on a unit circle, we see that
cos x=sin x when x is π/4, π/4±π, π/4±2π,
π/4±3π, etc. Since both sine and cosine have a period of 2π,
we need only determine the status of x=π/4 and x=5π/4.
We can use 0 and π/2 to test the critical value x=π/4. We
find that f(π/4)=√2, f(0)=1<√2 and
f(π/2)=1, so there is a local maximum when x=π/4 and also
when x=π/4±2π, π/4±4π, etc. We can summarize
this more neatly by saying that there are local maxima at
π/4±2kπ for every integer k.
Such a problem differs in two ways from the local maximum and
minimum problems we encountered when graphing functions: We are
interested only in the function between a and b, and we want to know
the largest or smallest value that f(x) takes on, not merely values
that are the largest or smallest in a small interval. That is, we seek not a
local maximum or minimum but a global maximum or minimum,
sometimes also called an absolute maximum or minimum.
Example 6.1.1 Find the maximum and minimum values of f(x)=x^2
on the interval [−2,1], shown in figure 6.1.1. We compute
f′(x)=2x, which is zero at x=0 and is always defined.
local maximum or minimum points (or neither); then the largest local
maximum must be the global maximum and the smallest local minimum
must be the global minimum. It is usually easier, however, to compute
the value of f at every point at which the global maximum or minimum
might occur; the largest of these is the global maximum, the smallest is
the global minimum.
So we compute f(−2)=4, f(0)=0, f(1)=1. The
global maximum is 4 at x=−2 and the global minimum is 0 at x=0.
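The procedure just described (evaluate f at the endpoints and at every critical value inside the interval, then take the largest and smallest results) can be sketched in Python as follows. The function name global_extrema and its argument layout are illustrative assumptions; the critical values must be supplied by hand.

```python
def global_extrema(f, critical_values, a, b):
    """Return ((max value, x), (min value, x)) for f on the closed interval [a, b].

    critical_values: the x where f' is zero or undefined (found beforehand).
    """
    candidates = [a, b] + [c for c in critical_values if a < c < b]
    values = [(f(x), x) for x in candidates]
    return max(values), min(values)

# Example 6.1.1: f(x) = x^2 on [-2, 1], with the single critical value x = 0.
(max_val, max_x), (min_val, min_x) = global_extrema(lambda x: x**2, [0.0], -2.0, 1.0)
print(max_val, max_x, min_val, min_x)
```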
There are some particularly nice cases that are easy. A continuous
function on a closed interval [a,b] always has both a global
maximum and a global minimum, so examining the critical values and
the endpoints is enough:
That is, there are real numbers c and d in [a,b] so that for every x
in [a,b], f(x)≤f(c) and f(x)≥f(d).
Example 6.1.5 Find the maximum and minimum values of the function
f(x)=7+|x−2| for x between 1 and 4 inclusive. The
derivative f′(x) is never zero, but f′(x) is undefined at x=2,
so we compute f(2)=7. Checking the end points we get
f(1)=8 and f(4)=9. The smallest of these numbers is
f(2)=7, which is, therefore, the minimum value of f(x) on
the interval 1≤x≤4, and the maximum is f(4)=9.
Example 6.1.6 Find all local maxima and minima for f(x)=x^3−x,
and determine whether there is a global maximum or minimum on
the open interval (−2,2). In example 5.1.2 we found a local
maximum at (−√3/3,2√3/9) and a local minimum at
(√3/3,−2√3/9). Since the endpoints are not in the
interval (−2,2) they cannot be considered. Is the lone local
maximum a global maximum? Here we must look more closely at the
graph. We know that on the closed interval [−√3/3,√3/3]
there is a global maximum at x=−√3/3 and a global
minimum at x=√3/3. So the question becomes: what happens
between −2 and −√3/3, and between √3/3 and 2?
Since there is a local minimum at x=√3/3, the graph must
continue up to the right, since there are no more critical values. This
means no value of f will be less than −2√3/9 between √3/3
and 2, but it says nothing about whether we might find a value
larger than the local maximum 2√3/9. How can we tell? Since
the function increases to the right of √3/3, we need to know what
the function values do "close to" 2. Here the easiest test is to pick a
number and do a computation to get some idea of what's going on. Since
f(1.9)=4.959>2√3/9, there is no global
maximum at −√3/3, and hence no global maximum at all. (How
can we tell that 4.959>2√3/9? We can use a calculator to
approximate the right hand side; if it is not even close to 4.959 we
can take this as decisive. Since 2√3/9≈0.3849, there's
really no question. Funny things can happen in the rounding done by
computers and calculators, however, so we might be a little more
careful, especially if the values come out quite close. In this case we can
convert the relation 4.959>2√3/9 into (9/2)4.959>√3
and ask whether this is true. Since the left side is clearly
larger than 4⋅4 which is clearly larger than √3, this settles the
question.)
Example 6.1.7 Of all rectangles of area 100, which has the smallest
perimeter?
since the perimeter is twice the length plus twice the width of the
rectangle. Not all values of x make sense in this problem: lengths of
sides of rectangles must be positive, so x>0. If x>0 then so is
100/x, so we need no second condition on x.
also the global minimum, so the rectangle with smallest perimeter is the
10×10 square.
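A rough numerical check of this conclusion is sketched below in Python. It assumes, as the example describes, that one side has length x, the other has length 100/x, and the perimeter is twice the length plus twice the width; the function name perimeter and the sampling grid are illustrative choices.

```python
def perimeter(x):
    # One side is x, the other is 100/x so that the area is 100.
    return 2 * x + 2 * (100 / x)

# Sample side lengths 1.0, 1.1, ..., 100.0 and pick the smallest perimeter.
samples = [k / 10 for k in range(10, 1001)]
best_x = min(samples, key=perimeter)
print(best_x, perimeter(best_x))
```

A grid search like this only suggests where the minimum lies; the derivative argument in the example is what proves it.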
We want to know the maximum value of this function when x is
between 0 and 1.5. The derivative is P′(x)=−20000x+25000,
which is zero when x=1.25. Since
P″(x)=−20000<0, there must be a local maximum at
x=1.25, and since this is the only critical value it must be a global
maximum as well. (Alternately, we could compute P(0)=−12000,
P(1.25)=3625, and P(1.5)=3000 and
note that P(1.25) is the maximum of these.)
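The alternate check mentioned in parentheses can be sketched in Python. The formula for P used here is an assumption: it is the quadratic whose derivative is −20000x+25000 and whose value at 0 is −12000, which reproduces the three values quoted above.

```python
def profit(x):
    # Assumed form of P, inferred from P'(x) = -20000x + 25000 and P(0) = -12000.
    return -10000 * x**2 + 25000 * x - 12000

critical = 25000 / 20000            # solves -20000x + 25000 = 0
candidates = [0.0, critical, 1.5]   # endpoints plus the critical value
best = max(candidates, key=profit)
print(best, profit(best))
```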
11.7.PARTIAL DIFFERENTIATION
Partial Derivatives
Now that we have the brief discussion on limits out of the way we can
proceed into taking derivatives of functions of more than one variable.
Before we actually start taking derivatives of functions of more than one
variable let’s recall an important interpretation of derivatives of
functions of one variable.
function as x changes. This is an important interpretation of derivatives
and we are not going to want to lose it with functions of more than one
variable. The problem with functions of more than one variable is that
there is more than one variable. In other words, what do we do if we
only want one of the variables to change, or if we want more than one of
them to change? In fact, if we’re going to allow more than one of the
variables to change there are then going to be an infinite number of ways
for them to change. For instance, one variable could be changing faster
than the other variable(s) in the function. Notice as well that it will be
completely possible for the function to be changing differently
depending on how we allow one or more of the variables to change.
We will need to develop ways, and notations, for dealing with all of
these cases. In this section we are going to concentrate exclusively on
only changing one of the variables at a time, while the remaining
variable(s) are held fixed. We will deal with allowing multiple variables
to change in a later section.
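The idea of changing one variable while the others are held fixed can be illustrated numerically. In the Python sketch below, the function example(x, y) = x^2 y^3 is a hypothetical choice made only for illustration, and the helper names partial_x and partial_y and the step size h are assumptions as well.

```python
def partial_x(f, x, y, h=1e-6):
    """Estimate the rate of change of f when only x varies (y held fixed)."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    """Estimate the rate of change of f when only y varies (x held fixed)."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def example(x, y):
    # Hypothetical function used only for illustration.
    return x**2 * y**3

# By hand: treating y as a constant gives 2*x*y**3,
# and treating x as a constant gives 3*x**2*y**2.
print(partial_x(example, 1.0, 2.0), 2 * 1.0 * 2.0**3)
print(partial_y(example, 1.0, 2.0), 3 * 1.0**2 * 2.0**2)
```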
Because we are going to only allow one of the variables to change taking
the derivative will now become a fairly simple process. Let’s start off
this discussion with a fairly simple function.
Let’s start with the function
Now, this is a function of a single variable and at this point all that we
. In other words, we want to compute
Now, let’s do it the other way. We will now hold x fixed and allow y to
vary. We can do this in a similar way. Since we are holding x fixed it
must be fixed at and so we can define a new function
of y and then differentiate this as we’ve always done with functions of
one variable.
In this case we call the partial derivative of
Note that these two partial derivatives are sometimes called the first
order partial derivatives. Just as with functions of one variable we can
have derivatives of all orders. We will be looking at higher order
derivatives in a later section.
Note that the notation for partial derivatives is different than that for
derivatives of functions of a single variable. With functions of a single
variable we could denote the derivative with a single prime. However,
with partial derivatives we will always need to remember the variable
that we are differentiating with respect to and so we will subscript the
variable that we differentiated with respect to. We will shortly be seeing
some alternate notation for partial derivatives as well.
Before we work any examples let’s get the formal definition of the
partial derivative out of the way as well as some alternate notation.
Now let’s take a quick look at some of the possible alternate notations
For the fractional notation for the partial derivative notice the difference
between the partial derivative and the ordinary derivative from single
variable calculus.
Okay, now let’s work some examples. When working these examples
always keep in mind that we need to pay very close attention to which
variable we are differentiating with respect to. This is important because
we are going to treat all other variables as constants and then proceed
with the derivative as if it was a function of a single variable. If you can
remember this you’ll find that doing partial derivatives is not much
more difficult than doing derivatives of functions of a single variable as
we did in Calculus I.
Example 1 Find all of the first order partial derivatives for the
following functions.
(a)
(b)
(c)
(d)
Solution
(a)
Let’s first take the derivative with respect to x and remember that as we
do so all the y’s will be treated as constants. The partial derivative with
respect to x is,
Notice that the second and the third term differentiate to zero in this
case. It should be clear why the third term differentiated to zero. It’s a
constant and we know that constants always differentiate to zero. This is
also the reason that the second term differentiated to zero. Remember
that since we are differentiating with respect to x here we are going to
treat all y’s as constants. That means that terms that only involve y’s
will be treated as constants and hence will differentiate to zero.
Now, let’s take the derivative with respect to y. In this case we treat all
x’s as constants and so the first term involves only x’s and so will
differentiate to zero, just as the third term will. Here is the partial
derivative with respect to y.
(b)
With this function we’ve got three first order derivatives to compute.
Let’s do the partial derivative with respect to x first. Since we are
differentiating with respect to x we will treat all y’s and all z’s as
constants. This means that the second and fourth terms will differentiate
to zero since they only involve y’s and z’s.
This first term contains both x’s and y’s and so when we differentiate
with respect to x the y will be thought of as a multiplicative constant and
so the first term will be differentiated just as the third term will be
differentiated.
Let’s now differentiate with respect to y. In this case all x’s and z’s will
be treated as constants. This means the third term will differentiate to
zero since it contains only x’s while the x’s in the first term and the z’s
in the second term will be treated as multiplicative constants. Here is
the derivative with respect to y.
Finally, let’s get the derivative with respect to z. Since only one of the
terms involves z’s this will be the only non-zero term in the derivative.
Also, the y’s in that term will be treated as multiplicative constants.
Here is the derivative with respect to z.
(c)
With this one we’ll not put in the detail of the first two. Before taking
the derivative let’s rewrite the function a little to help us with the
differentiation process.
Now, the fact that we’re using s and t here instead of the “standard” x
and y shouldn’t be a problem. It will work the same way. Here are the
two derivatives for this function.
(d)
Now, we can’t forget the product rule with derivatives. The product rule
will work the same way here as it does with functions of one variable.
We will just need to be careful to remember which variable we are
differentiating with respect to.
Let’s start out by differentiating with respect to x. In this case both the
cosine and the exponential contain x’s and so we’ve really got a product
of two functions involving x’s and so we’ll need to product rule this up.
Here is the derivative with respect to x.
Do not forget the chain rule for functions of one variable. We will be
looking at the chain rule for some more complicated expressions for
multivariable functions in a later section. However, at this point we’re
treating all the y’s as constants and so the chain rule will continue to
work as it did back in Calculus I.
Now, let’s differentiate with respect to y. In this case we don’t have a
product rule to worry about since the only place that the y shows up is in
the exponential. Therefore, since x’s are considered to be constants for
this derivative, the cosine in the front will also be thought of as a
multiplicative constant. Here is the derivative with respect to y.
Example 2 Find all of the first order partial derivatives for the
following functions.
(a)
(b)
(c)
Solution
(a)
We also can’t forget about the quotient rule. Since there isn’t too much
to this one, we will simply give the derivatives.
In the case of the derivative with respect to v recall that u’s are constant
and so when we differentiate the numerator we will get zero.
(b)
Let’s do the derivatives with respect to x and y first. In both these cases
the z’s are constants and so the denominator in this is a constant and so
we don’t really need to worry too much about it. Here are the
derivatives for these two cases.
We went ahead and put the derivative back into the “original” form just
so we could say that we did. In practice you probably don’t really need
to do that.
(c)
In this last part we are just going to do a somewhat messy chain rule
problem. However, if you had a good background in Calculus I chain
rule this shouldn’t be all that difficult of a problem. Here are the two
derivatives,
So, there are some examples of partial derivatives. Hopefully you will
agree that as long as we can remember to treat the other variables as
constants these work in exactly the same manner that derivatives of
functions of one variable do. So, if you can do Calculus I derivatives
you shouldn’t have too much difficulty in doing basic partial derivatives.
There is one final topic that we need to take a quick look at in this
section, implicit differentiation. Before getting into implicit
differentiation for multiple variable functions let’s first remember how
implicit differentiation works for functions of one variable.
Example 3 Find for
.
Solution
Now, we did this problem because implicit differentiation works in
exactly the same manner with functions of multiple variables. If we
have a function in terms of three variables x, y, and z we will assume
a .
(a)
(b)
Solution
(a)
then any product of x’s and z’s will be a product and so will
need the product rule!
Now we’ll do the same thing for except this time we’ll need
(b)
We’ll do the same thing for this function as we did in the previous part.
Don’t forget to do the chain rule on each of the trig functions and when
we are differentiating the inside function on the cosine we will need to
Now let’s take care of . This one will be slightly easier than
the first one.
11.8.Higher order derivatives
Now, this is a function and so it can be differentiated. Here is the
notation that we’ll use for that, as well as the derivative.
point. We can keep adding on primes, but that will get cumbersome
after a while.
This process can continue but notice that we will get zero for all
derivatives after this point. This set of derivatives leads us to the
following fact about the differentiation of polynomials.
Fact
The presence of parentheses in the exponent denotes differentiation
while the absence of parentheses denotes exponentiation.
Collectively the second, third, fourth, etc. derivatives are called higher
order derivatives.
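The observation that repeated differentiation of a polynomial eventually gives zero can be tried out with a short Python sketch. A polynomial is stored here as a list of coefficients, lowest power first; this representation, the helper name differentiate, and the sample polynomial 3 − 2x + 5x^3 are all illustrative assumptions.

```python
def differentiate(coeffs):
    """coeffs[i] is the coefficient of x**i; return the derivative's coefficients."""
    derived = [i * c for i, c in enumerate(coeffs)][1:]
    return derived or [0]   # the derivative of a constant is the zero polynomial

poly = [3, -2, 0, 5]        # hypothetical example: 3 - 2x + 5x^3 (degree 3)
derivatives = [poly]
for _ in range(4):          # first through fourth derivatives
    derivatives.append(differentiate(derivatives[-1]))

for order, d in enumerate(derivatives):
    print(order, d)
```

For this degree 3 example the fourth derivative is the zero polynomial, and every later derivative stays zero.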
Example 1 Find the first four derivatives for each of the following.
(a)
(b)
(c)
Solution
(a)
(b)
Note that cosine (and sine) will repeat every four derivatives. The other
four trig functions will not exhibit this behavior. You might want to
take a few derivatives to convince yourself of this.
(c)
In the previous two examples we saw some patterns in the differentiation
of exponential functions, cosines and sines. We need to be careful
however since they only work if there is just a t or an x in the argument.
This is the point of this example. In this example we will need to use the
chain rule on each derivative.
So, we can see with slightly more complicated arguments the patterns
that we saw for exponential functions, sines and cosines no longer
completely hold.
(a)
(b)
(c)
Solution
(a)
Notice that the second derivative will now require the product rule.
(b)
As with the first example we will need the product rule for the second
derivative.
(c)
The second derivative this time will require the quotient rule.
As we saw in this last set of examples we will often need to use the
product or quotient rule for the higher order derivatives, even when the
first derivative didn’t require these rules.
Let’s work one more example that will illustrate how to use implicit
differentiation to find higher order derivatives.
Example 3 Find for
Solution
Okay, we know that in order to get the second derivative we need the
first derivative and in order to get that we’ll need to do implicit
differentiation. Here is the work for that.
This is fine as far as it goes. However, we would like there to be no
derivatives in the answer. We don’t, generally, mind having x’s and/or
y’s in the answer when doing implicit differentiation, but we really don’t
like derivatives in the answer. We can get rid of the derivative however
by acknowledging that we know what the first derivative is and
substituting this into the second derivative equation. Doing this gives,
Now that we’ve found some higher order derivatives we should
probably talk about an interpretation of the second derivative.
The acceleration of the object is the first derivative of the velocity, but
since the velocity is the first derivative of the position function we can
also think of the acceleration as the second derivative of the position
function.
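To make this interpretation concrete, the Python sketch below estimates velocity and acceleration from a position function by finite differences. The position function s(t) = t^3 − 6t^2 + 9t is a hypothetical example, and the helper names and step size h are assumptions chosen for illustration.

```python
def position(t):
    # Hypothetical position function, used only for illustration.
    return t**3 - 6 * t**2 + 9 * t

def velocity(s, t, h=1e-3):
    """First derivative of position, estimated by a central difference."""
    return (s(t + h) - s(t - h)) / (2 * h)

def acceleration(s, t, h=1e-3):
    """Second derivative of position, estimated by a second central difference."""
    return (s(t + h) - 2 * s(t) + s(t - h)) / h**2

# By hand: s'(t) = 3t^2 - 12t + 9 and s''(t) = 6t - 12.
v3 = velocity(position, 3.0)
a3 = acceleration(position, 3.0)
print(v3, a3)
```

At t = 3 the hand computation gives velocity 0 and acceleration 6, and the estimates should be close to those values.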
Alternate Notation
How do we find the slope of a straight line and its derivative? What is
the relation between the slope of a curve or a parabola and its
derivative? How do we find the derivative of the composite of two
functions f(g(x)), an exponential or trigonometric function, a
logarithmic function, ...?
and knowing that the equation of the straight line shown is
c) draw a conclusion from the results obtained in a) and b).
The slope of the line, computed in a), is equal to the derivative of the
straight line equation, computed in b). Indeed, the derivative of a
linear function (general equation of a linear function: y = ax + b)
equals the slope of the line described by that function. In other
words, the derivative of a linear function is the angular coefficient a
of the function.
at x = -2,
b) find out graphically the slope of the tangent line to the parabola
at x = -2, with the help of the formula slope = ΔY / ΔX,
a) Calculation of the derivative y ' of the quadratic equation (parabola
equation)
Slope of the tangent line =
c) Find the equation of that tangent line
we know that (x ; y) = (-2 ; 5) satisfies this equation.
Thus 5 = -4(-2) + b,
and 5 + 4(-2) = b,
Calculating a derivative is the same as calculating a slope, and
vice versa.
f '(1) = 2(1) = 2.
3. Compute the derivative of the following functions
(use the derivative rules)
Solution 3.3 y = (-x + 7)^4, find the
derivative y' with the help of the derivative
rules. Exercises with solutions.
Solution 3.4
Solution 3.5
Solution 3.6
Solution 3.7
Solution 3.8
4. Differentiate the following functions (derivative formulas):
composite of two functions f(g(x)), exponential, logarithmic
function, trigonometric function, ...
Solution 4.1
Solution 4.2
Solution 4.3
Solution 4.4
Solution 4.5
Solution 4.6
Solution 4.7
Solution 4.8
Solution 4.9
Solution 4.10
Solution 4.11
Solution 4.12
Solution 4.13
Solution 4.14
Solution 4.15
Solution 4.16
Solution 4.17
Solution 4.18
Solution 4.19
Solution 4.20
Solution 4.21
Solution 4.22 Interesting !
MATHEMATICS I: ASSIGNMENT /40Marks
Q2 20Marks
c) Evaluate the following integral.
i.
ii.