Chapter 6 Lecture Notes
• The function f: ℝⁿ → ℝ that we wish to minimize is a real-valued function called the objective function or cost function.
• The vector x is an n-vector of independent variables: x = [x₁, x₂, …, xₙ]⊤ ∈ ℝⁿ. The variables x₁, x₂, …, xₙ are often referred to as decision variables.
• The set Ω is a subset of ℝn called the constraint set or feasible set.
○ Often, the constraint set Ω takes the form Ω = {x: h(x) = 0, g(x) ≤ 0}, where h and g are given functions.
• We refer to such constraints as functional constraints.
• The optimization problem above can be viewed as a decision problem that involves finding the “best” vector x of the decision variables over all
possible vectors in Ω.
○ By the “best” vector we mean the one that results in the smallest value of the objective function.
• This vector is called the minimizer of f over Ω.
• It is possible that there may be many minimizers. In this case, finding any of the minimizers will suffice.
• There are also optimization problems that require maximization of the objective function, in which case we seek maximizers.
○ Minimizers and maximizers are also called extremizers.
○ Maximization problems, however, can be represented equivalently in the minimization form above because maximizing f is equivalent to
minimizing –f.
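○ For instance, maximizing f(x) = −(x − 2)² over ℝ and minimizing −f(x) = (x − 2)² pick out the same point x* = 2, with max f = −min(−f) = 0.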
• Therefore, we can confine our attention to minimization problems without loss of generality.
• The problem above is a general form of a constrained optimization problem, because the decision variables are constrained to be in the
constraint set Ω.
○ If Ω = ℝⁿ, then we refer to the problem as an unconstrained optimization problem.
• In this session, we discuss basic properties of the general optimization problem above, which includes the unconstrained case.
• During the next couple of sessions, we deal with iterative algorithms for solving unconstrained optimization problems.
• Recall that a point x* ∈ Ω is a local minimizer of f over Ω if there exists ε > 0 such that f(x) ≥ f(x*) for all x ∈ Ω \ {x*} with ∥x − x*∥ < ε, and a global minimizer if f(x) ≥ f(x*) for all x ∈ Ω \ {x*}.
• If in these definitions we replace “≥” with “>,” then we have a strict local minimizer and a strict global minimizer, respectively.
• If x* is a global minimizer of f over Ω, we write f(x*) = min_{x ∈ Ω} f(x) and x* = arg min_{x ∈ Ω} f(x).
• If the minimization is unconstrained, we simply write f(x*) = min_x f(x) or x* = arg min f(x).
○ In other words, given a real-valued function f, the notation arg min f(x) denotes the argument that minimizes the function f (a point in the
domain of f), assuming that such a point is unique (if there is more than one such point, we pick one arbitrarily).
• For example, if f: ℝ → ℝ is given by f(x) = (x + 1)² + 3, then arg min f(x) = –1.
○ If we write arg min_{x ∈ Ω}, then we treat “x ∈ Ω” as a constraint for the minimization.
• For example, for the function f above, arg min_{x ≥ 0} f(x) = 0.
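○ As a quick check for this example: f′(x) = 2(x + 1) vanishes only at x = −1, so the unconstrained minimizer is x* = −1 with f(−1) = 3; under the constraint x ≥ 0, f is increasing on [0, ∞), so the constrained minimizer is x* = 0.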
• Strictly speaking, an optimization problem is solved only when a global minimizer is found.
○ However, global minimizers are, in general, difficult to find.
• Therefore, in practice, we often have to be satisfied with finding local minimizers :( or maybe NOT!
• Note that the gradient ∇f is just the transpose of Df; that is, ∇f= (Df)⊤.
• The second derivative of f: ℝⁿ → ℝ (also called the Hessian of f) is the n × n matrix F(x) = D²f(x) whose (i, j)th entry is the second partial derivative ∂²f(x)/∂xⱼ∂xᵢ.
Example 1) Let f(x₁, x₂) = 5x₁ + 8x₂ + x₁x₂ − x₁² − 2x₂². Then, find its gradient and Hessian.
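A worked sketch of Example 1: ∂f/∂x₁ = 5 + x₂ − 2x₁ and ∂f/∂x₂ = 8 + x₁ − 4x₂, so ∇f(x) = [5 + x₂ − 2x₁, 8 + x₁ − 4x₂]⊤, and the Hessian is the constant matrix with rows [−2, 1] and [1, −4]. If a symbolic check is wanted, the snippet below is one way to do it, assuming SymPy is available:

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    f = 5*x1 + 8*x2 + x1*x2 - x1**2 - 2*x2**2

    grad = sp.Matrix([sp.diff(f, v) for v in (x1, x2)])  # gradient of f
    hess = sp.hessian(f, (x1, x2))                       # Hessian of f
    print(grad.T)  # expect [-2*x1 + x2 + 5, x1 - 4*x2 + 8]
    print(hess)    # expect Matrix([[-2, 1], [1, -4]])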
• Given an optimization problem with constraint set Ω, a minimizer may lie either
a. in the interior of Ω or
b. on the boundary of Ω.
• To study the case where it lies on the boundary, we need the notion of feasible directions.
Definition: A vector d ∈ ℝn, d ≠ 0, is a feasible direction at x ∈ Ω if there exists α0 > 0 such that x + αd ∈ Ω for all α ∈ [0, α0].
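For example, if Ω = {x ∈ ℝ²: x₁ ≥ 0, x₂ ≥ 0} and x = [0, 1]⊤, then d = [1, 0]⊤ is a feasible direction at x (small moves along d stay in Ω), whereas d = [−1, 0]⊤ is not, since x + αd lies outside Ω for every α > 0.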
Let f: ℝⁿ → ℝ be a real-valued function and let d be a feasible direction at x ∈ Ω. The directional derivative of f in the direction d, denoted ∂f/∂d, is the real-valued function defined by
∂f/∂d (x) = lim_{α→0} [f(x + αd) − f(x)]/α.
If f is differentiable at x, then ∂f/∂d (x) = d⊤∇f(x) = ⟨∇f(x), d⟩.
In summary, if d is a unit vector (∥d∥ = 1), then ⟨∇f(x),d⟩ is the rate of increase of f at the point x in the direction d.
Example 2) Define f: ℝ³ → ℝ by f(x) = x₁x₂x₃, and let d = [0.5, 0.5, 0.707]⊤. Find the directional derivative of f in the direction d.
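A worked sketch of Example 2: ∇f(x) = [x₂x₃, x₁x₃, x₁x₂]⊤, so ∂f/∂d = ⟨∇f(x), d⟩ = 0.5 x₂x₃ + 0.5 x₁x₃ + 0.707 x₁x₂. Since 0.5² + 0.5² + 0.707² ≈ 1, d is (approximately) a unit vector, so this is also the rate of increase of f at x in the direction d.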
• First-order necessary condition (FONC): if x* is a local minimizer of f over Ω and f is continuously differentiable on an open set containing Ω, then d⊤∇f(x*) ≥ 0 for every feasible direction d at x*.
• Illustration of the FONC for a constrained case; x₁ does not satisfy the FONC, whereas x₂ satisfies the FONC.
• A special case of interest is when x* is an interior point of Ω. In this case, any direction is feasible, and we have the following result (interior case of the FONC): if x* is a local minimizer of f over Ω and x* is an interior point of Ω, then ∇f(x*) = 0.
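Applied to the function from Example 1, the interior-point condition ∇f(x) = 0 gives 5 + x₂ − 2x₁ = 0 and 8 + x₁ − 4x₂ = 0, whose only solution is x = [4, 3]⊤; this is the only interior candidate (here it is in fact a maximizer, since the Hessian found above is negative definite). A minimal symbolic check, again assuming SymPy:

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    f = 5*x1 + 8*x2 + x1*x2 - x1**2 - 2*x2**2
    # Interior-point FONC: solve grad f = 0 for the stationary point(s)
    print(sp.solve([sp.diff(f, x1), sp.diff(f, x2)], [x1, x2]))  # expect {x1: 4, x2: 3}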
a. Is the first-order necessary condition (FONC) for a local minimizer satisfied at x = [1, 3]⊤?
where Ω = {[x₁, x₂]⊤ : x₁² + x₂² = 1}.
c. Based on part b, is the FONC for this set-constrained problem useful for eliminating local-minimizer candidates?
• Second-order necessary condition (SONC), interior case: if x* is an interior-point local minimizer of f over Ω and f is twice continuously differentiable, then ∇f(x*) = 0 and F(x*) is positive semidefinite (F(x*) ≥ 0); that is, for all d ∈ ℝⁿ, d⊤F(x*)d ≥ 0.
Example 5) Consider a function of one variable f(x) = x³, f: ℝ → ℝ. Because f′(0) = 0 and f″(0) = 0, the point x = 0 satisfies both the FONC and the SONC. However, x = 0 is not a minimizer: f(x) < 0 = f(0) for every x < 0, so the necessary conditions are not sufficient.
The Hessian matrix is indefinite; that is, for some d₁ ∈ ℝ² we have d₁⊤Fd₁ > 0 (e.g., d₁ = [1, 0]⊤) and for some d₂ we have d₂⊤Fd₂ < 0 (e.g., d₂ = [0, 1]⊤). Thus, x = [0, 0]⊤ does not satisfy the SONC, and hence it is not a minimizer.
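In practice, (semi)definiteness is often checked through eigenvalues: a symmetric matrix is positive semidefinite iff all its eigenvalues are ≥ 0, and indefinite if it has eigenvalues of both signs. The sketch below uses a hypothetical Hessian F, chosen only so that d₁ = [1, 0]⊤ and d₂ = [0, 1]⊤ give the signs above; it is an illustration, not the matrix from the example:

    import numpy as np

    # Hypothetical, illustrative Hessian; indefinite by construction.
    F = np.array([[2.0, 0.0],
                  [0.0, -2.0]])

    print(np.linalg.eigvalsh(F))  # [-2.  2.] -> eigenvalues of both signs, so F is indefinite

    d1 = np.array([1.0, 0.0])
    d2 = np.array([0.0, 1.0])
    print(d1 @ F @ d1)  # 2.0  (> 0)
    print(d2 @ F @ d2)  # -2.0 (< 0)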