Comp 122, Spring 2004
Divide and Conquer
(Merge Sort)

Divide and Conquer
• Recursive in structure.
• Divide the problem into sub-problems that are
  similar to the original but smaller in size.
• Conquer the sub-problems by solving them
  recursively. If they are small enough, just solve
  them in a straightforward manner.
• Combine the solutions to create a solution to
  the original problem.

An Example: Merge Sort
Sorting Problem: Sort a sequence of n elements into
non-decreasing order.
• Divide: Divide the n-element sequence to be
  sorted into two subsequences of n/2 elements each.
• Conquer: Sort the two subsequences recursively
  using merge sort.
• Combine: Merge the two sorted subsequences to
  produce the sorted answer.

Merge Sort – Example
Original sequence: 18 26 32 6 43 15 9 1

Divide (top down):
  18 26 32 6 43 15 9 1
  18 26 32 6 | 43 15 9 1
  18 26 | 32 6 | 43 15 | 9 1
  18 | 26 | 32 | 6 | 43 | 15 | 9 | 1

Merge (bottom up):
  18 26 | 6 32 | 15 43 | 1 9
  6 18 26 32 | 1 9 15 43
  1 6 9 15 18 26 32 43

Sorted sequence: 1 6 9 15 18 26 32 43

Merge-Sort (A, p, r)
INPUT: a sequence of n numbers stored in array A
OUTPUT: an ordered sequence of n numbers

MergeSort (A, p, r)   // sort A[p..r] by divide & conquer
  if p < r
    then q ← ⌊(p+r)/2⌋
         MergeSort (A, p, q)
         MergeSort (A, q+1, r)
         Merge (A, p, q, r)   // merges A[p..q] with A[q+1..r]

Initial Call: MergeSort(A, 1, n)
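The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not the course's reference code: it assumes 0-based inclusive indices (so the initial call is `merge_sort(a, 0, len(a) - 1)`), and the helper `merge` here uses explicit bounds checks rather than sentinels.

```python
def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive) in place."""
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from left when right is used up, or when left's head is no larger.
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p, r):
    """Sort a[p..r] (inclusive, 0-based) in place by divide and conquer."""
    if p < r:
        q = (p + r) // 2          # floor of the midpoint
        merge_sort(a, p, q)       # conquer the left half
        merge_sort(a, q + 1, r)   # conquer the right half
        merge(a, p, q, r)         # combine


data = [18, 26, 32, 6, 43, 15, 9, 1]
merge_sort(data, 0, len(data) - 1)
print(data)  # [1, 6, 9, 15, 18, 26, 32, 43]
```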

Procedure Merge
Input: Array containing sorted subarrays A[p..q] and A[q+1..r].
Output: Merged sorted subarray in A[p..r].

Merge(A, p, q, r)
 1  n1 ← q – p + 1
 2  n2 ← r – q
 3  for i ← 1 to n1
 4      do L[i] ← A[p + i – 1]
 5  for j ← 1 to n2
 6      do R[j] ← A[q + j]
 7  L[n1+1] ← ∞
 8  R[n2+1] ← ∞
 9  i ← 1
10  j ← 1
11  for k ← p to r
12      do if L[i] ≤ R[j]
13            then A[k] ← L[i]
14                 i ← i + 1
15            else A[k] ← R[j]
16                 j ← j + 1

Lines 7–8 store sentinels, which avoid having to check at each step
whether either subarray has been fully copied.
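A sentinel-based merge along the lines of the procedure above might look like this in Python. It is a sketch under two assumptions that are not in the slides: indices are 0-based and inclusive, and the keys are finite numbers, so that `math.inf` can serve as the sentinel ∞.

```python
import math


def merge_with_sentinels(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (inclusive) in place using sentinels."""
    left = a[p:q + 1] + [math.inf]        # L with sentinel appended
    right = a[q + 1:r + 1] + [math.inf]   # R with sentinel appended
    i = j = 0
    for k in range(p, r + 1):
        # No bounds checks needed: a sentinel never wins against a real key.
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


a = [6, 8, 26, 32, 1, 9, 42, 43]
merge_with_sentinels(a, 0, 3, 7)
print(a)  # [1, 6, 8, 9, 26, 32, 42, 43]
```

The loop runs exactly r – p + 1 times, which equals the number of non-sentinel elements, so both sentinels are still uncopied when it ends.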

Merge – Example
A (before): … 6 8 26 32 1 9 42 43 …
L: 6 8 26 32 ∞
R: 1 9 42 43 ∞
A (after):  … 1 6 8 9 26 32 42 43 …
Index k sweeps A[p..r] from left to right, while i and j advance
through L and R as their elements are copied.

Correctness of Merge
Loop invariant (LI) for the for loop of lines 11–16 of Merge.
At the start of each iteration of the for loop:
• Subarray A[p..k – 1] contains the k – p smallest elements
  of L and R in sorted order.
• L[i] and R[j] are the smallest elements of L and R that
  have not been copied back into A.

Initialization:
Before the first iteration (k = p):
• A[p..k – 1] is empty.
• i = j = 1.
• L[1] and R[1] are the smallest
  elements of L and R not copied to A.

Correctness of Merge (continued)
Maintenance:
Case 1: L[i] ≤ R[j]
• By the loop invariant (LI), A[p..k – 1] contains the k – p smallest
  elements of L and R in sorted order.
• By LI, L[i] and R[j] are the smallest
  elements of L and R not yet copied into A.
• Line 13 results in A[p..k] containing the k – p + 1
  smallest elements (again in sorted order).
  Incrementing i and k reestablishes the LI
  for the next iteration.
Case 2: L[i] > R[j] is handled similarly (lines 15–16).

Termination:
• On termination, k = r + 1.
• By LI, A[p..r] contains the r – p + 1 smallest
  elements of L and R in sorted order.
• L and R together contain r – p + 3 elements.
  All but the two sentinels have been copied
  back into A.
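One way to make the loop invariant concrete is to assert it at the start of every iteration. The sketch below does that for a sentinel-based merge; the function name `merge_checked`, the 0-based inclusive indexing, and the use of `math.inf` as ∞ are assumptions made for illustration, and the assertions are deliberately slow (they re-sort the inputs) since they exist only to check the argument.

```python
import math


def merge_checked(a, p, q, r):
    """Sentinel merge of a[p..q] and a[q+1..r] that asserts the loop invariant."""
    left = a[p:q + 1] + [math.inf]
    right = a[q + 1:r + 1] + [math.inf]
    everything = sorted(left[:-1] + right[:-1])   # reference answer, for checks only
    i = j = 0
    for k in range(p, r + 1):
        # Invariant, part 1: a[p..k-1] holds the k - p smallest elements, sorted.
        assert a[p:k] == everything[:k - p]
        # Invariant, part 2: left[i], right[j] are the smallest uncopied elements.
        assert left[i] == min(left[i:]) and right[j] == min(right[j:])
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
    # Termination: k would now be r + 1, so a[p..r] holds all r - p + 1 elements.
    assert a[p:r + 1] == everything
    assert i + j == r - p + 1   # only the two sentinels remain uncopied


a = [6, 8, 26, 32, 1, 9, 42, 43]
merge_checked(a, 0, 3, 7)
print(a)
```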

Analysis of Merge Sort
• Running time T(n) of Merge Sort:
  • Divide: computing the middle takes Θ(1)
  • Conquer: solving 2 subproblems takes 2T(n/2)
  • Combine: merging n elements takes Θ(n)
• Total:
  T(n) = Θ(1)             if n = 1
  T(n) = 2T(n/2) + Θ(n)   if n > 1
  ⇒ T(n) = Θ(n lg n) (CLRS, Chapter 4)
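As a rough empirical check of the Θ(n lg n) bound, the sketch below counts how many element writes the merge steps perform. The function `merge_sort_writes` is hypothetical and returns a new list rather than sorting in place. For n a power of 2 the count satisfies W(n) = 2W(n/2) + n with W(1) = 0, so it should come out to exactly n lg n.

```python
def merge_sort_writes(a):
    """Return (sorted copy of a, number of element writes done while merging)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, left_writes = merge_sort_writes(a[:mid])
    right, right_writes = merge_sort_writes(a[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two is non-empty
    merged.extend(right[j:])
    # Merging n elements writes each of them once: the Theta(n) combine cost.
    return merged, left_writes + right_writes + len(merged)


for m in (3, 10):
    n = 2 ** m
    _, writes = merge_sort_writes(list(range(n, 0, -1)))
    print(n, writes, n * m)   # the last two columns agree
```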

Recurrences – I

Recurrence Relations
• An equation or inequality that characterizes a
  function by its values on smaller inputs.
• Solution Methods (Chapter 4)
  • Substitution Method.
  • Recursion-tree Method.
  • Master Method.
• Recurrence relations arise when we analyze the
  running time of iterative or recursive algorithms.
  • Ex: Divide and Conquer, where D(n) is the cost of dividing
    and C(n) is the cost of combining.
    T(n) = Θ(1)                     if n ≤ c
    T(n) = a T(n/b) + D(n) + C(n)   otherwise

Substitution Method
• Guess the form of the solution, then
  use mathematical induction to show that it is correct.
  • Substitute the guessed answer for the function when the
    inductive hypothesis is applied to smaller values;
    hence the name.
• Works well when the solution is easy to guess.
• There is no general way to guess the correct solution.

Example – Exact Function
Recurrence: T(n) = 1             if n = 1
            T(n) = 2T(n/2) + n   if n > 1
Guess: T(n) = n lg n + n (for n a power of 2).
Induction:
• Basis: n = 1 ⇒ n lg n + n = 1 = T(n).
• Hypothesis: T(k) = k lg k + k for all k < n.
• Inductive Step: T(n) = 2 T(n/2) + n
                       = 2 ((n/2) lg(n/2) + (n/2)) + n
                       = n lg(n/2) + 2n
                       = n lg n – n + 2n
                       = n lg n + n
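The induction above can be spot-checked numerically. The sketch below evaluates the recurrence directly for powers of 2 and compares it with the guessed closed form. The function name `T` mirrors the slide, while the memoization via `functools.lru_cache` is just a convenience and not part of the argument.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """T(1) = 1, T(n) = 2 T(n/2) + n, evaluated for n a power of 2."""
    return 1 if n == 1 else 2 * T(n // 2) + n


for m in range(11):
    n = 2 ** m
    closed_form = n * m + n      # n lg n + n, using lg n = m
    assert T(n) == closed_form

print(T(8))  # 32, which is 8 * 3 + 8
```

A check like this is not a proof, but it catches a wrong guess quickly before the induction is attempted.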

Recursion-tree Method
• Making a good guess is sometimes difficult with
  the substitution method.
• Use recursion trees to devise good guesses.
• Recursion Trees
  • Show successive expansions of recurrences using
    trees.
  • Keep track of the time spent on the subproblems of a
    divide and conquer algorithm.
  • Help organize the algebraic bookkeeping necessary
    to solve a recurrence.

Recursion Tree – Example
• Running time of Merge Sort:
  T(n) = Θ(1)             if n = 1
  T(n) = 2T(n/2) + Θ(n)   if n > 1
• Rewrite the recurrence as
  T(n) = c              if n = 1
  T(n) = 2T(n/2) + cn   if n > 1
  where c > 0 is the running time for the base case and the
  time per array element for the divide and combine steps.

Recursion Tree for Merge Sort
For the original problem, we have a cost of cn (the cost of divide and
merge), plus two subproblems, each of size n/2 and running time T(n/2)
(the cost of sorting the subproblems).

            cn
          /    \
      T(n/2)  T(n/2)

Each of the size n/2 problems has a cost of cn/2 plus two
subproblems, each costing T(n/4).

               cn
            /      \
        cn/2        cn/2
       /    \      /    \
   T(n/4) T(n/4) T(n/4) T(n/4)

Recursion Tree for Merge Sort
Continue expanding until the problem size reduces to 1.

Level    Nodes                      Level cost
0        cn                         cn
1        cn/2  cn/2                 cn
2        cn/4  cn/4  cn/4  cn/4     cn
…        …                          …
lg n     c  c  c  …  c  c  c        cn

Height: lg n
Total: cn lg n + cn

Recursion Tree for Merge Sort (continued)
• Each level has total cost cn.
• Each time we go down one level,
  the number of subproblems doubles,
  but the cost per subproblem halves
  ⇒ cost per level remains the same.
• There are lg n + 1 levels; the height is
  lg n. (Assuming n is a power of 2.)
  • Can be proved by induction.
• Total cost = sum of costs at each
  level = (lg n + 1)cn = cn lg n + cn =
  Θ(n lg n).
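The level-by-level bookkeeping can be reproduced with a few lines of Python. This is a sketch for n a power of 2; the function name `level_costs` and the default c = 1 are illustrative choices, not something from the slides.

```python
def level_costs(n, c=1):
    """Per-level costs of the recursion tree for T(n) = 2T(n/2) + cn, T(1) = c."""
    costs = []
    nodes, size = 1, n
    while True:
        costs.append(nodes * c * size)   # every node at this level costs c * size
        if size == 1:
            return costs                 # the leaves are the last level
        nodes, size = nodes * 2, size // 2


costs = level_costs(8)
print(costs)       # [8, 8, 8, 8]: lg 8 + 1 = 4 levels, each costing cn
print(sum(costs))  # 32 = cn lg n + cn with c = 1, n = 8
```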

Other Examples
• Use the recursion-tree method to determine a
  guess for the recurrences
  • T(n) = 3T(n/4) + Θ(n²).
  • T(n) = T(n/3) + T(2n/3) + O(n).
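For the first recurrence, a quick way to form a guess is to tabulate the cost of the top few levels of the tree. The sketch below takes the Θ(n²) term to be exactly n² (c = 1, an assumption) and ignores the leaves, which contribute only Θ(n^(log_4 3)). Level i has 3^i nodes of size n/4^i, so it costs (3/16)^i · n², a decreasing geometric series, which suggests the guess T(n) = O(n²).

```python
from fractions import Fraction


def level_sums(n, levels):
    """Costs of the first `levels` levels of the tree for T(n) = 3T(n/4) + n^2."""
    # Level i: 3**i subproblems, each of size n / 4**i and cost (n / 4**i)**2.
    return [3 ** i * Fraction(n, 4 ** i) ** 2 for i in range(levels)]


sums = level_sums(256, 4)
ratios = [sums[i + 1] / sums[i] for i in range(len(sums) - 1)]
print(ratios)                                  # every ratio is 3/16
print(sum(sums) < Fraction(16, 13) * 256 ** 2)  # True: bounded by the geometric sum
```

The second recurrence does not shrink level by level: each level costs at most cn and the longest root-to-leaf path has length log_(3/2) n, which suggests the guess O(n lg n). Either guess would still need to be verified by substitution.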

Recursion Trees – Caution Note
• Recursion trees only generate guesses.
  • Verify guesses using the substitution method.
• A small amount of “sloppiness” can be
  tolerated. Why?
• If you are careful when drawing out a recursion tree and
  summing the costs, the tree can be used as a direct proof.

The Master Method
• Based on the Master theorem.
• “Cookbook” approach for solving recurrences
  of the form
  T(n) = aT(n/b) + f(n)
  • a ≥ 1, b > 1 are constants.
  • f(n) is asymptotically positive.
  • n/b may not be an integer, but we ignore floors and
    ceilings. Why?
• Requires memorization of three cases.

The Master Theorem
Theorem 4.1
Let a ≥ 1 and b > 1 be constants, let f(n) be a function, and
let T(n) be defined on nonnegative integers by the recurrence
T(n) = aT(n/b) + f(n), where we can replace n/b by ⌊n/b⌋ or ⌈n/b⌉.
T(n) can be bounded asymptotically in three cases:
1. If f(n) = O(n^(log_b a – ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) lg n).
3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0,
   and if, for some constant c < 1 and all sufficiently large n,
   we have a·f(n/b) ≤ c f(n), then T(n) = Θ(f(n)).
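For the common special case f(n) = n^k, the three cases reduce to comparing k with log_b a. The sketch below covers only that special case, not the full theorem; the function name `master_polynomial` is hypothetical, and the floating-point comparison via `math.isclose` is an approximation. In case 3 the regularity condition holds automatically for f(n) = n^k, since a·(n/b)^k = (a/b^k)·n^k and a/b^k < 1 whenever k > log_b a.

```python
import math


def master_polynomial(a, b, k):
    """Classify T(n) = a T(n/b) + n^k for constants a >= 1, b > 1, k >= 0.

    Returns (case number, asymptotic bound as a string).
    """
    crit = math.log(a, b)                     # critical exponent log_b a
    if math.isclose(k, crit):
        return 2, f"Theta(n^{k:g} lg n)"      # f matches n^(log_b a)
    if k < crit:
        return 1, f"Theta(n^{crit:.3g})"      # f is polynomially smaller
    return 3, f"Theta(n^{k:g})"               # f is polynomially larger


print(master_polynomial(2, 2, 1))   # merge sort: case 2
print(master_polynomial(3, 4, 2))   # T(n) = 3T(n/4) + n^2: case 3
print(master_polynomial(8, 2, 2))   # case 1
```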
We’ll return to recurrences as we need them…


Editor's Notes

  • #15 Talk about how mathematical induction works.