Fundamentals of the Analysis of Algorithm Efficiency
Introduction Analysis is the separation of an intellectual or substantial whole into its constituent parts for individual study. Here it means the investigation of an algorithm's efficiency with respect to running time and memory space. An algorithm's time efficiency is principally measured as a function of its input size by counting the number of times its basic operation is executed. A basic operation is the operation that contributes most toward the running time.
Order of Growth It is the rate of growth of the running time that really interests us. Consider only the leading term of a formula, since the lower-order terms are relatively insignificant for large n, and ignore the leading term's constant coefficient, since constant factors are less significant than the rate of growth in determining computational efficiency for large inputs. One algorithm is more efficient than another if its worst-case running time has a lower order of growth. Because constant factors and lower-order terms are discarded, this comparison can be in error for small inputs.
Worst-Case, Best-Case & Average-Case Efficiencies Running time depends not only on the input size but also on the specifics of a particular input. (For example: sequential search.) The worst-case efficiency of an algorithm is its efficiency for the worst-case input of size n, an input for which the algorithm runs the longest among all possible inputs of that size. The best-case efficiency of an algorithm is its efficiency for the best-case input of size n, an input for which the algorithm runs the fastest among all possible inputs of that size. The average-case efficiency seeks to provide information about an algorithm's behavior on a typical or random input.
Sequential Search
Algorithm: SequentialSearch(A[0..n-1], k)
// Searches for a given value in a given array by sequential search
// Input: An array A[0..n-1] and a search key k
// Output: The index of the first element of A that matches k, or -1 if there are no matching elements
i = 0
while i < n and A[i] ≠ k do
    i = i + 1
if i < n return i
else return -1
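The pseudocode above translates directly into a runnable form. Below is a minimal Python sketch of the same algorithm; the function name sequential_search is our own choice, not part of the original pseudocode.

    def sequential_search(a, k):
        """Return the index of the first element of a equal to k, or -1."""
        i = 0
        n = len(a)
        # The comparison a[i] != k is the basic operation being counted.
        while i < n and a[i] != k:
            i += 1
        return i if i < n else -1

    # Example: the worst case is a missing key (n comparisons).
    print(sequential_search([3, 1, 4, 1, 5], 4))   # 2
    print(sequential_search([3, 1, 4, 1, 5], 9))   # -1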
Sequential Search (contd.) Worst-Case: no matching elements, or the first matching element happens to be the last one on the list; C_worst(n) = n. Best-Case: C_best(n) = 1. Average-Case: to analyze the algorithm's average-case efficiency, we must make some assumptions about possible inputs of size n. The standard assumptions are that (a) the probability of a successful search is p (0 ≤ p ≤ 1), and (b) the probability of the first match occurring in the i-th position of the list is the same for every i. For a successful search, the probability of the first match occurring in the i-th position of the list is p/n for every i. For an unsuccessful search, the probability is (1-p).
Sequential Search (contd.)
C_avg(n) = [1·p/n + 2·p/n + ... + i·p/n + ... + n·p/n] + n·(1-p)
         = (p/n)·[1 + 2 + ... + i + ... + n] + n(1-p)
         = (p/n)·n(n+1)/2 + n(1-p)
         = p(n+1)/2 + n(1-p)
If p = 1 (successful search), the average number of key comparisons made by sequential search is (n+1)/2, i.e. the algorithm will inspect half of the list's elements. If p = 0 (unsuccessful search), the average number of key comparisons made by sequential search will be n.
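As a sanity check, the closed form can be compared against a direct enumeration over all match positions. A brief Python sketch, written for this note rather than taken from the slides:

    def avg_closed_form(n, p):
        """Closed form: p(n+1)/2 + n(1-p)."""
        return p * (n + 1) / 2 + n * (1 - p)

    def avg_by_enumeration(n, p):
        # A match at position i (1-based) costs i comparisons, each with
        # probability p/n; an unsuccessful search costs n comparisons
        # with probability (1 - p).
        return sum(i * p / n for i in range(1, n + 1)) + n * (1 - p)

    for p in (0.0, 0.5, 1.0):
        assert abs(avg_closed_form(10, p) - avg_by_enumeration(10, p)) < 1e-9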
Asymptotic Notations Used to formalize that an algorithm has running time or storage requirements that are "never more than," "always greater than," or "exactly" some amount, and to compare and rank the order of growth of an algorithm's basic operation count. Three asymptotic notations: O (big oh), Ω (big omega), and Θ (big theta).
O-notation (Big Oh): Asymptotic Upper Bound
For a given function g(n), O(g(n)) denotes the set of functions
O(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀ }.
For example: 100n + 5 Є O(n²), since
100n + 5 ≤ 100n + n = 101n ≤ 101n² (for all n ≥ 5).
By the definition, c = 101 and n₀ = 5.
Ω-notation: Asymptotic Lower Bound
Ω(g(n)) represents the set of functions
Ω(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀ }.
For example: n³ Є Ω(n²), since n³ ≥ n² for all n ≥ 0.
By the definition, c = 1 and n₀ = 0.
Θ-notation: Asymptotic Tight Bound
Θ(g(n)) represents the set of functions
Θ(g(n)) = { f(n) : there exist positive constants c₁, c₂, and n₀ such that 0 ≤ c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀ }.
For example: ½n(n-1) Є Θ(n²).
Upper bound: ½n(n-1) = ½n² - ½n ≤ ½n² for all n ≥ 0.
Lower bound: ½n² - ½n ≥ ½n² - ½n·½n = ¼n² for all n ≥ 2.
Hence, we have c₁ = ¼, c₂ = ½, and n₀ = 2.
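These definitions can be checked numerically for the example above. A minimal sketch (our own illustration, not from the slides) verifying c₁·n² ≤ ½n(n-1) ≤ c₂·n² with c₁ = ¼, c₂ = ½, n₀ = 2:

    def f(n):
        return n * (n - 1) / 2

    c1, c2, n0 = 0.25, 0.5, 2
    # Spot-check both inequalities for a range of inputs at or above n0.
    for n in range(n0, 1000):
        assert c1 * n * n <= f(n) <= c2 * n * n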
Figure: Mappings for n², showing the sets O(n²), Ω(n²), and Θ(n²).
Figure: Bounds of a Function.
Examples of sorting algorithms and their complexities: Insertion sort: O(n²). Selection sort: O(n²). Quick sort: O(n log n) on average (O(n²) in the worst case). Merge sort: O(n log n).
Time Efficiency Analysis
Example 1: C(n) = the number of times the comparison is executed
= Σ_{i=1}^{n-1} 1 = n - 1 Є Θ(n).
Example 2: C_worst(n) = Σ_{i=0}^{n-2} Σ_{j=i+1}^{n-1} 1 = Σ_{i=0}^{n-2} (n - 1 - i)
= (n-1)² - (n-2)(n-1)/2 = (n-1)n/2 ≈ ½n² Є Θ(n²).
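The algorithms behind these counts were on image-only slides that did not survive; the counts match the standard textbook pair of finding the largest element (Example 1) and checking element uniqueness (Example 2). Under that assumption, a Python sketch that counts the basic operation:

    def max_element_comparisons(a):
        """Find the max of a; return (max, number of comparisons)."""
        count, best = 0, a[0]
        for x in a[1:]:
            count += 1            # basic operation: one comparison
            if x > best:
                best = x
        return best, count        # count == n - 1

    def unique_elements_comparisons(a):
        """Check all-distinct; return (result, number of comparisons)."""
        n, count = len(a), 0
        for i in range(n - 1):
            for j in range(i + 1, n):
                count += 1        # basic operation: one comparison
                if a[i] == a[j]:
                    return False, count
        return True, count        # worst case: n(n-1)/2 comparisons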
Recurrences When an algorithm contains a recursive call to itself, its running time can often be described by a recurrence equation or inequality, which describes the overall running time on a problem of size n in terms of the running time on smaller inputs. Special techniques are required to analyze the time and space such algorithms require, for example the iteration method, the substitution method, and the master theorem.
Mathematical Analysis of Recursive Algorithms Decide on a parameter indicating the input's size. Identify the algorithm's basic operation. Check whether the number of times the basic operation is executed can vary on different inputs of the same size. Set up a recurrence relation with an appropriate initial condition. Solve the recurrence, or at least ascertain the order of growth of its solution.
Example 1
Compute the factorial function F(n) = n!
Algorithm: F(n)
// Computes n! recursively
// Input: A nonnegative integer n
// Output: The value of n!
if n = 0 return 1
else return F(n-1) * n
Time efficiency: the number of multiplications (the basic operation) M(n) needed to compute F(n) must satisfy the equality
M(n) = M(n-1) + 1 for n > 0
(M(n-1) multiplications to compute F(n-1), plus 1 to multiply F(n-1) by n).
The initial condition, which makes the algorithm stop its recursive calls, is the case n = 0 (return 1); since no multiplications are performed there, M(0) = 0.
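A runnable Python version of this pseudocode, instrumented (our own addition) to count multiplications and confirm M(n) = n:

    def factorial(n):
        """Return (n!, number of multiplications performed)."""
        if n == 0:
            return 1, 0                 # initial condition: M(0) = 0
        value, mults = factorial(n - 1)
        return value * n, mults + 1     # M(n) = M(n-1) + 1

    for n in range(6):
        value, mults = factorial(n)
        assert mults == n               # M(n) = n
    print(factorial(5))                 # (120, 5)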
Example 1 (contd.)
By the method of backward substitutions, we have
M(n) = M(n-1) + 1                      substitute M(n-1) = M(n-2) + 1
     = [M(n-2) + 1] + 1 = M(n-2) + 2   substitute M(n-2) = M(n-3) + 1
     = [M(n-3) + 1] + 2 = M(n-3) + 3
     ...
     = M(n-i) + i = ... = M(n-n) + n = M(0) + n = 0 + n = n.
Example 2
Example 2 (contd.) Compute Time efficiency:
Example 3 Analysis of merge sort: we set up a recurrence for T(n), the worst-case running time of merge sort on n numbers. Merge sort on just one element takes constant time. When we have n > 1 elements, we break down the running time as follows. Divide: the divide step just computes the middle of the subarray, which takes constant time; thus D(n) = Θ(1). Conquer: we recursively solve two subproblems, each of size n/2, which contributes 2T(n/2) to the running time. Combine: the merge procedure on an n-element subarray takes time Θ(n), so C(n) = Θ(n).
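For concreteness, here is a minimal Python sketch of merge sort, written as an illustration for this note rather than taken from the slides:

    def merge_sort(a):
        if len(a) <= 1:                 # one element: constant time
            return a
        mid = len(a) // 2               # divide: Θ(1)
        left = merge_sort(a[:mid])      # conquer: T(n/2)
        right = merge_sort(a[mid:])     # conquer: T(n/2)
        return merge(left, right)       # combine: Θ(n)

    def merge(left, right):
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        out.extend(left[i:])            # append whichever side remains
        out.extend(right[j:])
        return out

    print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]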
Example 3 (contd.)
The worst-case running time T(n) can be described by the recurrence
T(n) = Θ(1)             if n = 1,
T(n) = 2T(n/2) + Θ(n)   if n > 1,        (1.0)
whose solution is claimed to be T(n) = Θ(n lg n). Equation (1.0) can be rewritten as
T(n) = c                if n = 1,
T(n) = 2T(n/2) + cn     if n > 1,        (1.1)
where the constant c represents the time required to solve problems of size 1 as well as the time per array element of the divide and combine steps.
Figure: Construction of the recursion tree for the recurrence T(n) = 2T(n/2) + cn. (a) T(n); (b) a root of cost cn with two children T(n/2); (c) the next expansion: root cn, two nodes of cost cn/2, and four leaves T(n/4).
Example 3 (contd.) We continue expanding each node in the tree by breaking it into its constituent parts as determined by the recurrence, until the problem sizes get down to 1, each with a cost of c. We then add the costs across each level of the tree. The top level has total cost cn, the next level down has total cost c(n/2) + c(n/2) = cn, the level after that has total cost c(n/4) + c(n/4) + c(n/4) + c(n/4) = cn, and so on. In general, level i below the top has 2^i nodes, each contributing a cost of c(n/2^i), so the i-th level below the top has total cost 2^i · c(n/2^i) = cn.
Example 3 (contd.) At the bottom level there are n nodes, each contributing a cost of c, for a total cost of cn (2^i = 2^(log₂ n) = n^(log₂ 2) = n nodes). The longest path from the root to a leaf is n → (1/2)n → (1/2)²n → ... → 1. Since (1/2)^k · n = 1 when k = log₂ n, the height of the tree is log₂ n, and the total number of levels of the recursion tree is lg n + 1. To compute the total cost represented by recurrence (1.1), we simply add the costs of all levels. There are lg n + 1 levels, each costing cn, for a total of cn(lg n + 1) = cn lg n + cn. Ignoring the low-order term and the constant c gives the desired result of Θ(n lg n).
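The total cost cn lg n + cn can also be checked by unrolling recurrence (1.1) directly. A short sketch (our own check, assuming n is a power of 2 so the halving is exact):

    import math

    def T(n, c=1):
        """Unroll T(n) = 2T(n/2) + cn with T(1) = c (n a power of 2)."""
        if n == 1:
            return c
        return 2 * T(n // 2, c) + c * n

    for k in range(1, 11):
        n = 2 ** k
        assert T(n) == n * int(math.log2(n)) + n   # cn·lg n + cn, with c = 1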
Randomized Algorithms A randomized algorithm does not require that the intermediate results of each step of execution be uniquely defined and depend only on the inputs and results of the preceding steps. It makes random choices, and these choices are made by a random number generator. When a random number generator is called, it computes a number and returns its value; when a sequence of calls is made, the sequence of numbers returned is random. In practice, a pseudorandom number generator is used: an algorithm that produces numbers that appear random.
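A classic illustration is choosing quicksort's pivot at random, so the intermediate partitions depend on the generator's choices rather than on the input alone. A minimal sketch using Python's pseudorandom generator (our own example, not from the slides):

    import random

    def randomized_quicksort(a):
        if len(a) <= 1:
            return a
        pivot = random.choice(a)        # random choice made by the PRNG
        less    = [x for x in a if x < pivot]
        equal   = [x for x in a if x == pivot]
        greater = [x for x in a if x > pivot]
        return randomized_quicksort(less) + equal + randomized_quicksort(greater)

    random.seed(42)                     # a PRNG is deterministic given its seed
    print(randomized_quicksort([3, 6, 1, 8, 2, 9, 4]))  # [1, 2, 3, 4, 6, 8, 9]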
Algorithm Visualization The use of images to convey useful information about algorithms. Two principal variations: static algorithm visualization, and dynamic algorithm visualization (animation). A good visualization should be consistent, interactive, clear and concise, adaptable, and user-friendly.
Assignment
1.0 Use the most appropriate notation among O, Ω and Θ to indicate the time efficiency class of sequential search a. in the worst case b. in the best case c. in the average case
2.0 Use the definitions of O, Ω and Θ to determine whether the following assertions are true or false. a. n(n+1)/2 Є O(n³) b. n(n+1) Є O(n²) c. n(n+1)/2 Є Θ(n³) d. n(n+1) Є Ω(n)
3.0 Argue that the solution to the recurrence T(n) = T(n/3) + T(2n/3) + cn, where c is a constant, is O(n lg n) by appealing to a recursion tree.
Thank You!