L4 Optimization

The document discusses search algorithms for optimization problems. It introduces local search techniques like hill climbing and simulated annealing that iteratively improve solutions. Hill climbing always chooses the best neighbor but can get stuck in local optima. Simulated annealing also allows worse moves to escape local optima, with the probability of accepting worse moves decreasing over time.


COMP 424 - Artificial Intelligence

Local search for optimization

Instructors: Jackie CK Cheung and Bogdan Mazoure


Readings: R&N Ch. 4.1 (3rd or 4th ed)
Scheduling McGill’s Final Exams
• How would you design the final exam schedule for McGill?

2
Scheduling McGill’s Final Exams
• It’s not so hard to find a solution (though it might be
terrible).

• We want a good solution, or even an optimal solution

• Today’s lecture: solve this by search!

3
Overview
Uninformed search
• Assumes no knowledge about the problem.
• BFS, DFS, Iterative deepening
Informed search
• Use knowledge about the problem, in the form of a heuristic.
• Best-first search, heuristic search, A* (and extensions)
Search for optimization problems:
• Search over large-dimensional spaces
• Iterative improvement algorithms (today!):
1. Hill climbing
2. Simulated annealing
3. Parallelism and beam search
4. Genetic algorithms

4
Search and Optimization
• Search so far:
• Finding the solution with the minimal cost, where the cost is the
sum of edge weights (e.g., A* search)
• Today:
• Case where solution cost is some arbitrary function (Eval(X))
• Want to find best solution (X*) – optimization problem

5
Optimization problems are everywhere
Scheduling
• Given: a set of tasks to be completed, with durations and mutual
constraints (e.g. task ordering, joint resources).
• Goal: generate the shortest schedule (an assignment of start times to
tasks).
Digital circuit layout
• Given: a board, components and connections.
• Goal: place each component on the board such as to maximize
energy efficiency, minimize production cost, …
User customization
• Given: customers described by characteristics (age, gender, location,
etc.) and previous purchases.
• Goal: find a function from characteristics to products that maximizes
expected gain.

6
Types of search for optimization
problems
1. Constructive methods: Start from scratch and build up a
solution.
• This is the type of method we have seen so far.

2. Iterative improvement/repair methods: Start with a
solution (which may be broken / suboptimal) and improve it.
• Focus of today’s lecture

7
Characteristics
• An optimization problem is described by:
• A set of states (= configurations)
• An evaluation function
• For interesting optimization problems, the state space is
too big to enumerate all states, or the evaluation function
is too expensive to compute for all states.
• In iterative improvement, a state is a candidate solution,
not a description of the world!
• It might be a partial or incorrect solution; if so, this should be reflected
in the evaluation function.

8
Travelling Salesman Problem (TSP)
• Given: a set of vertices and the distances between pairs of vertices.
• Goal: construct the shortest path that visits every vertex
exactly once and returns to the start (a tour)
• e.g., consider these seven cities on a map

• A state in the iterative improvement paradigm is a candidate
tour; the evaluation function would be the length of the tour.
• X1 (above) is a tour, but not an optimal one.
• Provably very hard (NP-hard) to find the best solution

9
Visualizing iterative improvement
• Intuition: Consider all possible solutions laid out on a
landscape. We want to find the highest (or lowest) point.

• This landscape is often very high-dimensional.

10
A generic local search algorithm
• Start from an initial configuration X0.
• Repeat until satisfied:
• Generate the set of neighbours of Xi and evaluate them.
• Select one of the neighbours, Xi+1.
• The selected neighbor becomes the current configuration.

11
A generic local search algorithm
• Start from an initial configuration X0.
• Repeat until satisfied:
• Generate the set of neighbours of Xi and evaluate them.
• Select one of the neighbours, Xi+1.
• The selected neighbor becomes the current configuration.

Important questions:
1. How do we choose the set of neighbours to consider?
2. How do we select one of the neighbours?
• Defining the set of neighbours is a design choice (like
choosing the heuristic for A*) and has crucial impact on
performance.

12
What moves should we consider?
• Case 1: Search for high ground
• Start with initial state = random position.
• Move to an adjacent position.
• Terminate when goal is reached.

• Case 2: Traveling Salesman Problem


• Start with initial state = a random (possibly incomplete/illegal) tour.
• Swap cities to obtain a new partial tour.
• Terminate when constraints are met.

13
Hill climbing
• Also called greedy local search
• In continuous state space, related to gradient ascent
Start from an initial configuration X0 with value E(X0)
X ← X0, and E ← E(X0)
Repeat until satisfied:
Generate the set of neighbours {Xi} of X and their values E(Xi).
Let Emax = maxi E(Xi) be the value of the best successor,
i* = argmaxi E(Xi) be the index of the best successor.
if Emax ≤ E:
return X (we are at a local optimum)
else:
let X ← Xi* , and E ← Emax (take a greedy step)
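The greedy loop above can be sketched in Python (illustrative names; the caller supplies the neighbour-generation and evaluation functions, and the neighbour set is assumed non-empty):

```python
def hill_climbing(x0, neighbours, evaluate):
    """Greedy local search: repeatedly move to the best neighbour.

    x0         -- initial configuration
    neighbours -- function returning the neighbours of a configuration
    evaluate   -- evaluation function E(x) to be maximized
    """
    x, e = x0, evaluate(x0)
    while True:
        best = max(neighbours(x), key=evaluate)   # best successor Xi*
        e_max = evaluate(best)                    # Emax
        if e_max <= e:
            return x, e       # no neighbour improves: we are at a local optimum
        x, e = best, e_max    # take the greedy step
```

For example, maximizing E(x) = -(x - 3)^2 over the integers with neighbours x - 1 and x + 1 climbs from any start to x = 3.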

14
Properties of hill climbing
• Very popular in AI:
• Trivial to program!
• Requires no memory of where we’ve been (no backtracking).
• Can handle very large problems.

• The neighbourhood function is important:
• Small neighbourhood: fewer neighbours to evaluate, but possibly a
worse solution.
• Large neighbourhood: more computation, but maybe fewer local
optima, so a better final result.

15
Local vs. Global Optimum
• Global optimum: The optimal
point over the full space of
possible configurations.

• Local optimum: The optimal
point over the set of neighbours.
One of the (possibly many) local optima.

16
Example: TSP

• What neighbours should we consider?


• How many neighbours is that?
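One simple neighbourhood (chosen here for illustration; 2-opt edge reversal is another common option) swaps the positions of two cities in the tour:

```python
from itertools import combinations

def two_swap_neighbours(tour):
    """All tours obtained by swapping the positions of two cities."""
    result = []
    for i, j in combinations(range(len(tour)), 2):
        t = list(tour)
        t[i], t[j] = t[j], t[i]       # exchange the cities at positions i and j
        result.append(t)
    return result
```

An n-city tour has n(n-1)/2 such neighbours, so a 7-city tour has 21.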

17
Example: TSP swapping 2 nodes

18
Example: TSP swapping 3 nodes

19
Problems with hill climbing
• Can get stuck in a local maximum or in a plateau

[Figure: a one-dimensional state-space landscape plotting the objective
function over the state space; labelled features include the global maximum,
a local maximum, a “flat” local maximum (plateau), a shoulder, and the
current state]

• Relies heavily on having a good evaluation function

20
Improvements to hill climbing
• Quick fix:
• When stuck in a plateau or local maximum, use random re-starts.

• Slightly better fix:


• Instead of always picking the best move, pick any move that
produces an improvement. (Called randomized hill climbing.)

21
Improvements to hill climbing
• Quick fix:
• When stuck in a plateau or local maximum, use random re-starts.

• Slightly better fix:


• Instead of always picking the best move, pick any move that
produces an improvement. (Called randomized hill climbing.)

• But sometimes we need to pick apparently worse moves to
eventually reach a better state.
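The random re-start fix is easy to layer on top of any hill climber; a minimal self-contained sketch (illustrative names; `random_state` is any function that produces a fresh starting configuration):

```python
def climb(x, neighbours, evaluate):
    """Plain hill climbing from a single starting point."""
    e = evaluate(x)
    while True:
        best = max(neighbours(x), key=evaluate)
        if evaluate(best) <= e:
            return x, e                       # local optimum reached
        x, e = best, evaluate(best)

def random_restart_climb(random_state, neighbours, evaluate, restarts=20):
    """Restart hill climbing from several initial states; keep the best."""
    best_x, best_e = None, float("-inf")
    for _ in range(restarts):
        x, e = climb(random_state(), neighbours, evaluate)
        if e > best_e:
            best_x, best_e = x, e
    return best_x, best_e
```

With enough restarts, at least one start is likely to fall in the basin of the global optimum (e.g. `random_state` could be `lambda: random.randrange(n)` over a discrete space).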

22
Simulated annealing

23
Simulated annealing
Similar to hill climbing, but:
• allows some “bad moves” to try to escape local maxima.
• decreases the size and frequency of “bad moves” over time.

Algorithm:
• Start from an initial configuration X0 with value E(X0).
Let X ← X0 and E ← E(X0); also track the best so far: Xi* ← X0, Emax ← E(X0).
• Repeat until satisfied:
• Let Xi be a random neighbour of X, with value Ei = E(Xi).
• If Ei > Emax, let Xi* ← Xi and Emax ← Ei (we found a new best solution).
• If Ei > E, then X ← Xi and E ← Ei.
• Else, with some probability p, still accept the move: X ← Xi and E ← Ei.
• Return Xi*.

24
What value should we use for p?
• Many possible choices:
• A given fixed value.
• A value that decays to 0 over time.
• A value that decays to 0, and gives similar chance to “similarly
bad” moves.
• A value that depends on how much worse the bad move is.

25
What value should we use for p?
• If the new value Ei is better than the old value E, move to
Xi .

• If the new value is worse (Ei < E), then move to the
neighbouring solution with probability p = e^(-(E - Ei)/T)
(the Boltzmann distribution).
• T > 0 is a parameter called the temperature, which typically starts
high, then decreases over time towards 0.
• If T is very close to 0, the probability of moving to a worse
solution is almost 0.
• We can gradually decrease T by multiplying it by a constant 0 < α < 1
at every iteration.
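Putting the acceptance rule and geometric cooling together, a sketch of simulated annealing for maximizing E (the default parameter values here are illustrative assumptions):

```python
import math
import random

def simulated_annealing(x0, neighbour, evaluate,
                        t0=1.0, alpha=0.95, steps=1000):
    """Maximize evaluate(x) with Boltzmann acceptance and geometric cooling."""
    x, e = x0, evaluate(x0)
    best_x, best_e = x, e
    t = t0
    for _ in range(steps):
        xi = neighbour(x)                 # pick a random neighbour
        ei = evaluate(xi)
        # accept good moves always; bad moves with probability e^(-(E - Ei)/T)
        if ei > e or random.random() < math.exp(-(e - ei) / t):
            x, e = xi, ei
        if e > best_e:
            best_x, best_e = x, e         # track the best solution ever seen
        t *= alpha                        # cool: T decays geometrically toward 0
    return best_x, best_e
```

At high T the exponent is near 0, so bad moves are accepted often (exploration); as T shrinks, the acceptance probability for bad moves goes to 0 (exploitation).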

26
Properties of simulated annealing
• What happens when T is high?
• Algorithm is in an exploratory phase (even bad moves have a high
chance of being picked).
• What happens when T is low?
• Algorithm is in an exploitation phase (the “bad” moves have very
low probability).

27
Properties of simulated annealing
• If T decreases slowly enough, simulated annealing is
guaranteed to reach the optimal solution (i.e., find the
global maximum).
• But it may take an infinite number of moves! This result is
not practically useful.

28
Example

29
TSP example: Searching
configurations

30
TSP Example: Energy

31
Simulated annealing in practice
• Very useful algorithm, used to solve hard optimization
problems.
• E.g. Protein design, scheduling large transportation fleets.
• The temperature annealing schedule is crucial (design
choice!)
• Cool too fast: converge to a sub-optimal solution.
• Cool too slowly: convergence takes very long.

• Simulated annealing is an example of a randomized search
or Monte Carlo search.
• Basic idea: run around through the environment and explore it,
instead of systematically sweeping. Very powerful idea!

32
Mitigating the local optimum problem
• Even simulated annealing can get stuck in local maxima!
• More strategies to find a good solution:
• Parallel search
• Beam search
• Genetic algorithms

33
Parallel search
• Run many separate searches (hill climbing or simulated
annealing) in parallel.

• Keep the best solution found.

• Search speed can be greatly improved by using many
processors (including, most recently, GPUs).

• Especially useful when actions have non-deterministic
outcomes (many possible successor states).

34
Local beam search
• Similar to parallel search; run many instances of local
search or simulated annealing at the same time

• Key difference: information sharing across searches
• Algorithm:
• Start k searches in parallel
• At each step, keep the top k solutions among the neighbourhoods of
all current states, and discard the rest
• k is called the beam width
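A minimal sketch of local beam search (maximizing; names illustrative, states assumed hashable). Note the information sharing: the k survivors are chosen from the pooled neighbours of all beam members, so a promising region can attract the whole beam:

```python
def local_beam_search(starts, neighbours, evaluate, k=3, iterations=50):
    """Keep the k best states among all neighbours of the current beam."""
    beam = sorted(starts, key=evaluate, reverse=True)[:k]
    for _ in range(iterations):
        pool = set(beam)
        for x in beam:
            pool.update(neighbours(x))    # pool successors of every beam member
        new_beam = sorted(pool, key=evaluate, reverse=True)[:k]
        if new_beam == beam:              # no state improved anywhere: stop
            break
        beam = new_beam
    return beam[0]                        # best state found
```

With k = 1 this degenerates to hill climbing; with no pooling it would just be k independent parallel searches.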

35
Local beam search schematic

Image source: Sarah Strickland, http://slideplayer.com/slide/4897669/

36
Evolutionary computing
• Refers generally to computational procedures patterned
after biological evolution
• Many solutions (individuals) exist in parallel
• Nature looks for the best individual (i.e., the fittest)

• Evolutionary search procedures are also parallel,
probabilistically perturbing several potential solutions

37
Genetic algorithms
• A candidate solution is called an individual.
• In a traveling salesman problem, an individual is a tour
• Each individual has a fitness
• fitness = numerical value proportional to quality of that solution
• A set of individuals is called a population.
• Populations change over generations, by applying operations
to individuals.
• operations = {mutation, crossover, selection}
• Individuals with higher fitness are more likely to survive &
reproduce.
• Individual typically represented by a binary string:
• allows operations to be carried out easily.

38
Mutation
• A way to generate desirable features that are not present in
the original population by injecting random change.
• Typically, mutation just means changing a 0 to a 1 (or vice versa).
• The mutation rate controls the probability of a mutation occurring
• We can allow mutation in all individuals, or just in the
offspring.

39
Crossover
• Combine parts of individuals to create new individuals
• Single-point crossover:
• Choose a crossover point, cut the individuals there, and swap the pieces.
E.g. parents 101|0101 and 011|1110 produce offspring 101|1110 and 011|0101.
• Implementation:
• Use a crossover mask, which is a binary string.
E.g. mask = 1110000 (take bits from one parent where the mask is 1, and
from the other parent where it is 0)

• Multi-point crossover can be implemented with an arbitrary mask.
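Mask-based crossover on bit strings takes only a few lines (a sketch; individuals are assumed to be strings of '0'/'1' characters):

```python
def crossover(parent1, parent2, mask):
    """Where the mask bit is 1, child1 takes parent1's bit (child2 takes
    parent2's); where it is 0, the roles are reversed."""
    child1 = "".join(a if m == "1" else b
                     for a, b, m in zip(parent1, parent2, mask))
    child2 = "".join(b if m == "1" else a
                     for a, b, m in zip(parent1, parent2, mask))
    return child1, child2

# Single-point crossover at position 3 is the mask "1110000":
# crossover("1010101", "0111110", "1110000") -> ("1011110", "0110101")
```

Any multi-point crossover is just a different mask, e.g. "1100011" cuts twice.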

40
Encoding operators as binary masks

41
Typical genetic algorithm
GA(Fitness, threshold, p, r, m)

• Initialize: P ← p random individuals

• Evaluate: for each h ∈ P, compute Fitness(h)

• While maxh Fitness(h) < threshold:
• Select: Probabilistically select (1-r)p members of P to include in Ps
• Crossover: Probabilistically select rp/2 pairs of individuals from P.
For each pair (h1, h2), produce two offspring by applying a crossover operator. Include
all offspring in Ps.
• Mutate: Invert a randomly selected bit in m*p randomly selected members of Ps
• Update: P ← Ps
• Evaluate: for each h ∈ P, compute Fitness(h)

• Return the individual from P that has the highest fitness.
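The loop above might be sketched in Python for bit-string individuals. This is a simplified, illustrative version, not the definitive algorithm: selection is fitness-proportionate (so Fitness is assumed positive), crossover is single-point, and the 8-bit genome length and default parameters are assumptions:

```python
import random

def ga(fitness, threshold, p=20, r=0.6, m=0.1, max_gens=200):
    """GA(Fitness, threshold, p, r, m): p individuals, crossover fraction r,
    mutation rate m. fitness(h) must be positive."""
    n_bits = 8                                   # illustrative genome length
    pop = ["".join(random.choice("01") for _ in range(n_bits))
           for _ in range(p)]

    def select(pool, k):
        # fitness-proportionate selection of k individuals
        return random.choices(pool, weights=[fitness(h) for h in pool], k=k)

    def crossover(h1, h2):
        c = random.randrange(1, n_bits)          # single-point crossover
        return h1[:c] + h2[c:], h2[:c] + h1[c:]

    def mutate(h):
        i = random.randrange(n_bits)             # flip one random bit
        return h[:i] + ("1" if h[i] == "0" else "0") + h[i + 1:]

    for _ in range(max_gens):
        if max(fitness(h) for h in pop) >= threshold:
            break
        survivors = select(pop, int((1 - r) * p))     # (1-r)p members survive
        offspring = []
        for _ in range(int(r * p / 2)):               # rp/2 crossover pairs
            h1, h2 = select(pop, 2)
            offspring.extend(crossover(h1, h2))
        new_pop = survivors + offspring
        for _ in range(int(m * p)):                   # mutate m*p random members
            i = random.randrange(len(new_pop))
            new_pop[i] = mutate(new_pop[i])
        pop = new_pop
    return max(pop, key=fitness)
```

For instance, with fitness = (number of 1-bits + 1), the population quickly concentrates on the all-ones string.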

42
Selection: Survival of the fittest
• As in natural evolution, fittest individuals are more likely to
survive.
• Several ways to implement this idea:
1. Fitness-proportionate selection:
Probability of selection is proportional to fitness.
Can lead to crowding (multiple copies of the same individual being propagated).
2. Tournament selection:
Pick i, j at random with uniform probability. With probability p, select the fitter
one. Only requires comparing two individuals.
3. Rank selection:
Sort all hypotheses by fitness. Probability of selection is proportional to
rank.
4. Softmax (Boltzmann) selection:
Probability of selection is proportional to e^(Fitness(h)/T), as in simulated annealing.

• Elitism: separately, store the best solution ever encountered
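Tournament selection, for instance, is particularly simple to implement; a sketch (assuming a population list and a fitness function; names illustrative):

```python
import random

def tournament_select(population, fitness, p_fitter=0.8):
    """Pick two individuals at random; return the fitter one with
    probability p_fitter, otherwise the weaker one."""
    i, j = random.sample(population, 2)
    fitter, weaker = (i, j) if fitness(i) >= fitness(j) else (j, i)
    return fitter if random.random() < p_fitter else weaker
```

Unlike fitness-proportionate selection, this only ever compares two individuals, so fitness values need not be positive or even numeric beyond being comparable.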

43
Genetic algorithms as search
• States: possible solutions
• Search operators: mutation, crossover, selection
• Relation to previous search algorithms:
• Parallel search, since several solutions are maintained in parallel
• Hill-climbing on the fitness function, but without following the
gradient
• Mutation and crossover should allow us to escape local optima
• Very closely related to simulated annealing.

44
Example: Solving TSP with a GA
• Each individual is a tour.
• Mutation swaps a pair of edges (many other operations
are possible and have been tried in the literature).
• Crossover cuts the parents in two and swaps them. Reject
any invalid offsprings.
• Fitness is based on the length of the tour (shorter tours are fitter).
• Note that the GA operations (crossover and mutation)
described here are fancier than the simple binary
examples given before.

45
Example: Solving TSP with a GA

46
TSP example: Initial generation

47
TSP example: Generation 15

48
TSP example: Generation 30

49
The good and bad of GAs
• Good:
• Intuitively appealing, due to evolution analogy.
• If tuned right, can be very effective (good solution with few
steps.)
• Bad:
• Performance depends crucially on the problem encoding. Good
encodings are difficult to find!
• Many parameters to tweak! Bad parameter settings can result in
very slow progress, or in the algorithm getting stuck in local optima.
• If the mutation rate is too low, we can get overcrowding (many copies
of identical individuals in the population).

50
Summary
• Optimization problems are widespread and important.
• It is infeasible to enumerate all possible solutions.
• Goal is to get a reasonable (not necessarily optimal) solution.

• Apply a local search and move in a promising direction.
• Hill climbing always moves in the (locally) best direction
• Simulated annealing allows some moves towards worse solutions
• Parallelism and beam search can be exploited to improve results and
find a better local optimum
• Genetic algorithms are an alternative, related to simulated annealing,
with an analogy to biological evolution

51
