CSE 421
Algorithms
Richard Anderson
Lecture 6
Greedy Algorithms
Greedy Algorithms
• Solve problems with the simplest possible
algorithm
• The hard part: showing that something
simple actually works
• Pseudo-definition
– An algorithm is Greedy if it builds its solution
by adding elements one at a time using a
simple rule
Scheduling Theory
• Tasks
– Processing requirements, release times,
deadlines
• Processors
• Precedence constraints
• Objective function
– Jobs scheduled, lateness, total execution time
Interval Scheduling
• Tasks occur at fixed times
• Single processor
• Maximize number of tasks completed
• Tasks {1, 2, ..., N}
• Start and finish times, s(i), f(i)
What is the largest solution?
Greedy Algorithm for Scheduling
Let T be the set of tasks. Construct a set I of independent (pairwise compatible) tasks.
A is the rule determining the greedy algorithm.
I = { }
While (T is not empty)
    Select a task t from T by rule A
    Add t to I
    Remove t and all tasks incompatible with t from T
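The loop above can be sketched in Python with the rule A passed in as a function. This is a minimal illustration, not the lecture's own code: the names `greedy_schedule`, `overlaps`, `earliest_start` and `earliest_finish` are made up here, and tasks are assumed to be (start, finish) pairs treated as half-open intervals.

```python
def overlaps(a, b):
    # Assumption: intervals are half-open [start, finish), so tasks that
    # merely touch at an endpoint are compatible.
    return a[0] < b[1] and b[0] < a[1]


def greedy_schedule(tasks, rule):
    """Generic greedy loop; `rule` plays the role of A and picks one task."""
    remaining = set(tasks)  # T
    chosen = []             # I
    while remaining:
        t = rule(remaining)
        chosen.append(t)
        # Remove t and every task incompatible with t (t overlaps itself).
        remaining = {u for u in remaining if not overlaps(t, u)}
    return chosen


def earliest_start(remaining):
    return min(remaining, key=lambda t: t[0])


def earliest_finish(remaining):
    return min(remaining, key=lambda t: t[1])
```

Swapping the `rule` argument is enough to compare heuristics on the same input, for example `greedy_schedule(tasks, earliest_start)` against `greedy_schedule(tasks, earliest_finish)`.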
Simulate the greedy algorithm for
each of these heuristics
Schedule earliest starting task
Schedule shortest available task
Schedule task with fewest conflicting tasks
Greedy solution based on earliest
finishing time
Example 1
Example 2
Example 3
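The earliest finishing time rule is commonly implemented with a single sort instead of repeatedly removing tasks. The sketch below assumes tasks are (start, finish) pairs and that a task may start exactly when the previous one finishes; the function name is illustrative.

```python
def earliest_finish_schedule(tasks):
    """Select a maximum set of compatible tasks by earliest finish time."""
    selected = []
    last_finish = float("-inf")
    # Scan tasks in increasing order of finish time.
    for start, finish in sorted(tasks, key=lambda t: t[1]):
        # Keep the task only if it starts after the last selected one ends.
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected
```

Sorting dominates the cost, so this runs in O(n log n) time for n tasks.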
Theorem: Earliest Finish Algorithm
is Optimal
• Key idea: Earliest Finish Algorithm stays
ahead
• Let A = {i_1, ..., i_k} be the set of tasks found
by EFA in increasing order of finish times
• Let B = {j_1, ..., j_m} be the set of tasks
found by a different algorithm in increasing
order of finish times
• Show that for r <= min(k, m), f(i_r) <= f(j_r)
Stay ahead lemma
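• A always stays ahead of B, f(i_r) <= f(j_r)
• Induction argument
– f(i_1) <= f(j_1), since EFA first picks the task with the earliest finish time overall
– If f(i_{r-1}) <= f(j_{r-1}) then f(i_r) <= f(j_r): task j_r starts no earlier than f(j_{r-1}) >= f(i_{r-1}), so it was still available when EFA chose i_r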
Completing the proof
• Let A = {i_1, ..., i_k} be the set of tasks found by
EFA in increasing order of finish times
• Let O = {j_1, ..., j_m} be the set of tasks found by
an optimal algorithm in increasing order of finish
times
• If k < m, then the Earliest Finish Algorithm
stopped before it ran out of tasks: since f(i_k) <= f(j_k) <= s(j_{k+1}), task j_{k+1} is compatible with every task in A and was still in T. This is a contradiction, so k = m.
Scheduling all intervals
• Minimize number of processors to
schedule all intervals
How many processors are needed
for this example?
Prove that you cannot schedule this set
of intervals with two processors
Depth: maximum number of
intervals active at the same time
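One way to compute the depth is a sweep over the sorted start and finish events. This is a sketch under the assumption that intervals are half-open (start, finish) pairs, so an interval ending at time t does not overlap one starting at t; the function name `depth` is chosen here for illustration.

```python
def depth(intervals):
    """Maximum number of intervals active at the same time."""
    events = []
    for start, finish in intervals:
        events.append((start, 1))    # interval becomes active
        events.append((finish, -1))  # interval stops being active
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so a finish is processed before a start at the same instant.
    events.sort()
    active = 0
    best = 0
    for _, delta in events:
        active += delta
        best = max(best, active)
    return best
```

The depth is a lower bound on the number of processors, because all intervals active at one instant need distinct processors.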
Algorithm
• Sort by start times
• Suppose maximum depth is d, create d
slots
• Schedule items in increasing order of start time, assign
each item to an open slot
• Correctness proof: When we reach an
item, we always have an open slot, because at most d - 1 other intervals are active at its start time
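The slot assignment can be sketched with two heaps. Note one deviation from the slide: instead of creating d slots up front, this version opens a new slot only when none is free, which ends up using exactly d slots. The function name and the (start, finish) half-open interval representation are assumptions made for the example.

```python
import heapq


def partition_intervals(intervals):
    """Assign every interval to a slot; returns (assignment, slot_count)."""
    busy = []   # min-heap of (finish_time, slot) for slots in use
    free = []   # min-heap of slot numbers that have been released
    slot_count = 0
    assignment = []
    for start, finish in sorted(intervals):  # increasing start time
        # Release every slot whose interval has finished by `start`.
        while busy and busy[0][0] <= start:
            _, slot = heapq.heappop(busy)
            heapq.heappush(free, slot)
        if free:
            slot = heapq.heappop(free)
        else:
            slot = slot_count  # open a new slot
            slot_count += 1
        heapq.heappush(busy, (finish, slot))
        assignment.append(((start, finish), slot))
    return assignment, slot_count
```

A new slot is opened only when every existing slot holds an interval that is still active, so the number of slots never exceeds the depth.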
Homework Scheduling
• Tasks to perform
• Deadlines on the tasks
• Freedom to schedule tasks in any order
Scheduling tasks
• Each task has a length t_i and a deadline d_i
• All tasks are available at the start
• One task may be worked on at a time
• All tasks must be completed
• Goal: minimize maximum lateness
– Lateness = f_i - d_i if f_i >= d_i, and 0 otherwise (f_i is the finish time of task i)
Example
• Job 1: time 2, deadline 2
• Job 2: time 3, deadline 4
• Schedule job 1, then job 2: lateness 1
• Schedule job 2, then job 1: lateness 3
Determine the minimum lateness
• Times: 2, 3, 4, 5
• Deadlines: 6, 4, 5, 12
Greedy Algorithm
• Earliest deadline first
• Order jobs by deadline
• This algorithm is optimal
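Earliest deadline first amounts to one sort followed by a running total of the time used. A minimal sketch, assuming jobs are (length, deadline) pairs, all available at time 0, and run back to back with no idle time; the function name is illustrative.

```python
def edf_schedule(jobs):
    """Order jobs by deadline; returns (order, maximum lateness)."""
    order = sorted(jobs, key=lambda job: job[1])
    time = 0
    max_lateness = 0  # lateness is never counted below 0
    for length, deadline in order:
        time += length  # finish time of this job
        max_lateness = max(max_lateness, time - deadline)
    return order, max_lateness
```

Calling `edf_schedule(jobs)` on a list of (length, deadline) pairs returns the schedule together with its maximum lateness, so different instances can be compared quickly.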
Analysis
• Suppose the jobs are ordered by deadlines, d_1
<= d_2 <= . . . <= d_n
• A schedule has an inversion if job j is scheduled
before i where j > i
• The schedule A computed by the greedy
algorithm has no inversions.
• Let O be an optimal schedule. We want to show
that A has the same maximum lateness as O
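Inversions can be listed mechanically once jobs are numbered in deadline order. The sketch below assumes a schedule is given as a list of those job numbers in execution order; the function name is hypothetical.

```python
def list_inversions(schedule):
    """Return pairs (j, i) where job j runs before job i although j > i.

    Assumes jobs are numbered 1..n in order of non-decreasing deadline.
    """
    pairs = []
    for a in range(len(schedule)):
        for b in range(a + 1, len(schedule)):
            j, i = schedule[a], schedule[b]
            if j > i:
                pairs.append((j, i))
    return pairs
```

For instance, `list_inversions([2, 1, 3])` returns `[(2, 1)]`, and a schedule sorted by job number returns an empty list.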
List the inversions
• Jobs a1, a2, a3, a4 with times 2, 3, 4, 5 and deadlines 4, 5, 6, 12
• Schedule: a4 a2 a3 a1
Lemma: There is an optimal
schedule with no idle time
• It doesn’t hurt to start your homework early!
• Note on proof techniques
– This type of lemma can be important for keeping proofs clean
– It allows us to make a simplifying assumption for the
remainder of the proof
[Figure: schedule a4 a2 a3 a1]
Lemma
• If there is an inversion i, j, there is a pair of
adjacent jobs i’, j’ which form an inversion
Interchange argument
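• Suppose there is a pair of jobs i and j, with
d_i <= d_j, and j scheduled immediately
before i. Interchanging i and j does not
increase the maximum lateness.
[Figure: the two schedules on a time axis with deadlines d_i and d_j marked; j runs before i, then i runs before j after the interchange]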
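The interchange claim can be checked directly. The following is a sketch of the calculation, not the slide's own derivation: S (the time at which j starts before the swap), the primed finish times, and the lateness symbol \ell are notation introduced here, and lateness is taken to be max(0, f - d).

```latex
\begin{align*}
f_j &= S + t_j, \qquad f_i = S + t_j + t_i && \text{(before the swap)} \\
f'_i &= S + t_i \le f_i, \qquad f'_j = S + t_i + t_j = f_i && \text{(after the swap)} \\
\ell'_i &= \max(0,\, f'_i - d_i) \le \max(0,\, f_i - d_i) = \ell_i \\
\ell'_j &= \max(0,\, f_i - d_j) \le \max(0,\, f_i - d_i) = \ell_i && \text{(since } d_i \le d_j\text{)}
\end{align*}
```

Every other job keeps its finish time, and both swapped jobs end up no later than i was before the swap, so the maximum lateness does not increase.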
Proof by Bubble Sort
[Figure: adjacent swaps applied step by step to the schedule a4 a2 a3 a1, with deadlines d1 d2 d3 d4 marked on the time axis]
Determine maximum lateness
Real Proof
• There is an optimal schedule with no
inversions and no idle time.
• Let O be an optimal schedule with k inversions;
by swapping an adjacent inverted pair we construct
a new optimal schedule with k-1 inversions
• Repeat until we have an optimal schedule
with 0 inversions
• This is the solution found by the earliest
deadline first algorithm
Result
• Earliest Deadline First algorithm
constructs a schedule that minimizes the
maximum lateness
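On small instances the result can be sanity-checked against exhaustive search. This is only an empirical check, not a proof; the function names are made up for this sketch, and jobs are assumed to be (length, deadline) pairs run back to back from time 0.

```python
from itertools import permutations


def max_lateness(order):
    """Maximum lateness of jobs run back to back in the given order."""
    time = 0
    worst = 0
    for length, deadline in order:
        time += length
        worst = max(worst, time - deadline)
    return worst


def edf_order(jobs):
    return sorted(jobs, key=lambda job: job[1])


def brute_force_optimum(jobs):
    # Tries all n! orders, so this is only practical for a handful of jobs.
    return min(max_lateness(order) for order in permutations(jobs))


def edf_matches_optimum(jobs):
    return max_lateness(edf_order(jobs)) == brute_force_optimum(jobs)
```

Calling `edf_matches_optimum` on a few hand-made job lists is a quick way to gain confidence in the claim before working through the proof.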
Extensions
• What if the objective is to minimize the
sum of the lateness?
– EDF does not seem to work
• If the tasks have release times and
deadlines, and are non-preemptable, the
problem is NP-complete
• What about the case with release times
and deadlines where tasks are
preemptable?
Optimal Caching
• Caching problem:
– Maintain collection of items in local memory
– Minimize number of items fetched
Caching example
A, B, C, D, A, E, B, A, D, A, C, B, D, A
Optimal Caching
• If you know the sequence of requests,
what is the optimal replacement pattern?
• Note – it is rare to know what the requests
are in advance – but we still might want to
do this:
– In some specific applications, the sequence is
known
– Competitive analysis: compare the performance
of an online algorithm with an optimal offline
algorithm
Farthest in the future algorithm
• When the cache is full, discard the element whose next use is farthest in the future
A, B, C, A, C, D, C, B, C, A, D
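The eviction rule can be simulated on a request sequence as follows. The slide does not state a cache size, so the capacity `k` is a parameter assumed here (k >= 1), and the function name is illustrative. The inner scan for the next use keeps the code short at the cost of quadratic time.

```python
def farthest_in_future(requests, k):
    """Count cache misses when evicting the item needed farthest ahead."""

    def next_use(item, pos):
        # Position of the next request for `item` after `pos`,
        # or infinity if it is never requested again.
        for later in range(pos + 1, len(requests)):
            if requests[later] == item:
                return later
        return float("inf")

    cache = set()
    misses = 0
    for pos, item in enumerate(requests):
        if item in cache:
            continue  # hit: nothing to fetch
        misses += 1
        if len(cache) == k:
            victim = max(cache, key=lambda x: next_use(x, pos))
            cache.remove(victim)
        cache.add(item)
    return misses
```

For the sequence above, `farthest_in_future(list("ABCACDCBCAD"), k)` can be run for several values of `k` to see how the miss count changes with cache size.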
Correctness Proof
• Sketch
• Start with Optimal Solution O
• Convert to Farthest in the Future Solution
F-F
• Look at the first place where they differ
• Convert O to evict F-F element
– There are some technicalities here to ensure
the caches have the same configuration . . .
