Approximation Algorithms
Chapter 1- Introduction
My T. Thai
mythai@cise.ufl.edu
2
General Information
 Time: MWF 1:55 – 2:45
 Place: TUR 2303
My T. Thai
mythai@cise.ufl.edu
3
Instructor Information
 Name: Dr. My T. Thai
 Office: E566 CSE Building
 Phone: (352)392-6842
 Email: mythai@cise.ufl.edu
 Office Hours: W 3:00pm – 5:00pm or by
appointments
My T. Thai
mythai@cise.ufl.edu
4
Why Approximation Algorithms
 Problems that we cannot find an optimal
solution in a polynomial time
 Eg: Set Cover, Bin Packing
 Need to find a near-optimal solution:
 Heuristic
 Approximation algorithms:
 This gives us a guarantee approximation ratio
My T. Thai
mythai@cise.ufl.edu
5
Why take this course
 Your advisers/bosses give you a
computationally hard problem. Here are two
scenarios:
 No knowledge about approximation:
 Spend a few months looking for an optimal solution
 Come to their office and confess that you cannot do it
 Get fired
 Knowledge about approximation:
My T. Thai
mythai@cise.ufl.edu
6
Why take this course (cont)
 Knowledge about approximation
 Show your boss that this is a NP-complete (NP-
hard) problem
 There does not exist any polynomial time algorithm
to find an exact solution
 Propose a good algorithm (either heuristic or
approximation) to find a near-optimal solution
 Better yet, prove the approximation ratio
My T. Thai
mythai@cise.ufl.edu
7
Course Description
 Covers a variety of techniques to design and
analyze many approximation algorithms for
computationally hard problems:
 Combinatorial algorithms:
 Greedy Techniques, Independent System, Submodular
Function
 Cover various problems
 Linear Programming based algorithms
 Semidefinite Programming based algorithms
My T. Thai
mythai@cise.ufl.edu
8
Course Objectives
 Grasp the essential techniques to design and
analyze approximation algorithms:
 Combinatorial methods
 Linear programming
 Semidefinite programming
 Primal-dual and relaxation methods
 Hardness of approximation
 Grasp the key ideas of graph theory
 Able to model and solve many practical
problems raising in our real life
My T. Thai
mythai@cise.ufl.edu
9
Textbooks
 Required:
 Vijay Vazirani, Approximation Algorithms,
Springer-Verlag, 2001
 Recommended:
 Vasek Chvatal, Linear Programming, W. H.
Freeman Company
 Michael R. Garey and David S. Johnson,
Computers and Intractability: A Guide to the
Theory of NP-Completeness
 Shall provide some appropriate lecture notes
My T. Thai
mythai@cise.ufl.edu
10
Grading Policies
 Homework Assignments:
 5 homework assignments, each weighs 10%
 Due at the beginning of the lecture on the due date
 No late assignment will be accepted
 Midterm Exam:
 One midterm exam, weighs 20%
 In class, open book, open notes
 One Final Exam:
 Weighs 30%
My T. Thai
mythai@cise.ufl.edu
11
Grading Policies (cont)
Cut-off points:
 A >= 85%
 85% > B >= 75%
 75% > C >= 65%
My T. Thai
mythai@cise.ufl.edu
12
Plagiarism Policy
 Zero tolerance
 Collaboration:
 May discuss with other students on homework, but
must write up solution on your own independently
 More info, see the class website
 Other policy: Students are encouraged to attend
the class. Many proofs will not be shown on the
lecture slides
My T. Thai
mythai@cise.ufl.edu
13
Tentative Schedule
 See the class website for detail information
 Roughly divided into two parts:
 Week 1 to Week 8: Combinatorial Algorithms
 Week 9 to Week 16: Linear Programming: Dual-
Primal, Relaxation, Rounding, and Semidefinite
programming
My T. Thai
mythai@cise.ufl.edu
14
Introduction to Combinatorial
Optimization
My T. Thai
mythai@cise.ufl.edu
15
Combinatorial Optimization
 The study of finding the “best” object from
within some finite space of objects, eg:
 Shortest path: Given a graph with edge costs and a
pair of nodes, find the shortest path (least costs)
between them
 Traveling salesman: Given a complete graph with
nonnegative edge costs, find a minimum cost cycle
visiting every vertex exactly once
 Maximum Network Lifetime: Given a wireless
sensor networks and a set of targets, find a schedule
of these sensors to maximize network lifetime
My T. Thai
mythai@cise.ufl.edu
16
In P or not in P?
Informal Definitions:
 The class P consists of those problems that are
solvable in polynomial time, i.e. O(nk) for some
constant k where n is the size of the input.
 The class NP consists of those problems that
are “verifiable” in polynomial time:
 Given a certificate of a solution, then we can verify
that the certificate is correct in polynomial time
My T. Thai
mythai@cise.ufl.edu
17
In P or not in P: Examples
 In P:
 Shortest path
 Minimum Spanning Tree
 Not in P (NP):
 Vertex Cover
 Traveling salesman
 Minimum Connected Dominating Set
My T. Thai
mythai@cise.ufl.edu
18
NP-completeness (NPC)
 A problem is in the class NPC if it is in NP and
is as “hard” as any problem in NP
My T. Thai
mythai@cise.ufl.edu
19
What is “hard”
 Decision Problems: Only consider YES-NO
 Decision version of TSP: Given n cities, is there a
TSP tour of length at most L?
 Why Decision Problem? What relation between
the optimization problem and its decision?
 Decision is not “harder” than that of its
optimization problem
 If the decision is “hard”, then the optimization
problem should be “hard”
My T. Thai
mythai@cise.ufl.edu
20
NP-complete and NP-hard
A language L is NP-complete if:
1. L is in NP, and
2. For every L’ in NP, L’ can be polynomial-time
reducible to L
My T. Thai
mythai@cise.ufl.edu
21
Approximation Algorithms
 An algorithm that returns near-optimal
solutions in polynomial time
 Approximation Ratio ρ(n):
 Define: C* as a optimal solution and C is the
solution produced by the approximation algorithm
 max (C/C*, C*/C) <= ρ(n)
 Maximization problem: 0 < C <= C*, thus C*/C
shows that C* is larger than C by ρ(n)
 Minimization problem: 0 < C* <= C, thus C/C*
shows that C is larger than C* by ρ(n)
My T. Thai
mythai@cise.ufl.edu
22
Approximation Algorithms (cont)
 PTAS (Polynomial Time Approximation
Scheme): A (1 + ε)-approximation algorithm
for a NP-hard optimization П where its running
time is bounded by a polynomial in the size of
instance I
 FPTAS (Fully PTAS): The same as above +
time is bounded by a polynomial in both the
size of instance I and 1/ε
My T. Thai
mythai@cise.ufl.edu
23
A Dilemma!
 We cannot find C*, how can we compare C to
C*?
 How can we design an algorithm so that we can
compare C to C*
It is the objective of this course!!!
My T. Thai
mythai@cise.ufl.edu
24
Vertex Cover
 Definition:
 An Example
'
at
incident
endpoint
one
least
at
has
,
edge
every
for
iff
of
cover
vertex
a
called
is
'
subset
a
,
)
,
(
graph
undirected
an
Given
V
e
E
e
G
V
V
E
V
G



My T. Thai
mythai@cise.ufl.edu
25
Vertex Cover Problem
 Definition:
 Given an undirected graph G=(V,E), find a vertex
cover with minimum size (has the least number of
vertices)
 This is sometimes called cardinality vertex cover
 More generalization:
 Given an undirected graph G=(V,E), and a cost
function on vertices c: V → Q+
 Find a minimum cost vertex cover
My T. Thai
mythai@cise.ufl.edu
26
How to solve it
 Matching:
 A set M of edges in a graph G is called a matching
of G if no two edges in set M have an endpoint in
common
 Example:
My T. Thai
mythai@cise.ufl.edu
27
How to solve it (cont)
 Maximum Matching:
 A matching of G with the greatest number of edges
 Maximal Matching:
 A matching which is not contained in any larger
matching
 Note: Any maximum matching is certainly
maximal, but not the reverse
My T. Thai
mythai@cise.ufl.edu
28
Main Observation
 No vertex can cover two edges of a matching
 The size of every vertex cover is at least the
size of every matching: |M| ≤ |C|
 |M| = |C| indicates the optimality
 Possible Solution: Using Maximal matching to
find Minimum vertex cover
My T. Thai
mythai@cise.ufl.edu
29
An Algorithm
 Alg 1: Find a maximal matching in G and output the
set of matched vertices
 Theorem: The algorithm 1 is a factor 2-approximation
algorithm.
 Proof:
opt
C
M
C
opt
M
C
opt
2
|
|
Therefore,
|
|
2
|
|
and
|
|
:
have
We
1.
algorithm
from
obtained
cover
vertex
of
set
a
be
Let
solution.
optimal
an
of
size
a
be
Let



My T. Thai
mythai@cise.ufl.edu
30
Can Alg1 be improved?
 Q1: Can the approximation ratio of Alg 1 be
improved by a better analysis?
 Q2: Can we design a better approximation
algorithm using the similar technique (maximal
matching)?
 Q3: Is there any other lower bounding method
that can lead to a better approximation
algorithm?
My T. Thai
mythai@cise.ufl.edu
31
Answers
 A1: No by considering the complete bipartite
graphs Kn,n
 A2: No by considering the complete graph Kn
where n is odd.
 |M| = (n-1)/2 whereas opt = n -1
My T. Thai
mythai@cise.ufl.edu
32
Answers (cont)
 A3:
 Currently a central open problem
 Yes, we can obtain a better one by using the
semidefinite programming
 Generalization vertex Cover
 Can we still able to design a 2-approximation
algorithm?
 Homework assignment!
My T. Thai
mythai@cise.ufl.edu
33
Set Cover
 Definition: Given a universe U of n elements, a
collection of subsets of U, S = {S1, …, Sm}, and a cost
function c: S → Q+, find a minimum cost
subcollection C of S that covers all elements of U.
 Example:
 U = {1, 2, 3, 4, 5}
 S1 = {1, 2, 3}, S2 = {2,3}, S3 = {4, 5}, S4 = {1, 2, 4}
 c1 = c2 = c3 = c4 = 1
 Solution C = {S1, S3}
 If the cost is uniform, then the set cover problem asks
us to find a subcollection covering all elements of U
with the minimum size.
My T. Thai
mythai@cise.ufl.edu
34
An Example
My T. Thai
mythai@cise.ufl.edu
35
INSTANCE: Given a universe U of n elements, a collection
of subsets of U, S = {S1, …, Sm}, and a positive integer b
QUESTION: Is there a , |C| ≤ b,
such that
(Note: The subcollection {Si | } satisfying the above
condition is called a set cover of U
NP-completeness
 Theorem: Set Cover (SC) is NP-complete
 Proof:
My T. Thai
mythai@cise.ufl.edu
36
Proof (cont)
 First we need to show that SC is in NP. Given a
collection of sets C, it is easy to verify that if
|C| ≤ b and the union of all sets listed in C does
include all elements in U.
 To complete the proof, we need to show that
Vertex Cover (VC) ≤p Set Cover (SC)
Given an instance C of VC (an undirected graph
G=(V,E) and a positive integer j), we need to
construct an instance C’ of SC in polynomial
time such that C is satisfiable iff C’ is
satisfiable.
My T. Thai
mythai@cise.ufl.edu
37
Proof (cont)
Construction: Let U = E. We will define n
elements of U and a collection S as follows:
Note: Each edge corresponds to each element in U and
each vertex corresponds to each set in S.
Label all vertices in V from 1 to n. Let Si be the set of all
edges that incident to vertex i. Finally, let b = j. This
construction is in poly-time with respect to the size of
VC instance.
My T. Thai
mythai@cise.ufl.edu
38
VERTEX-COVER p SET-COVER
one set for every vertex,
containing the edges it covers
VC
one element
for every edge
VC SC
My T. Thai
mythai@cise.ufl.edu
39
Proof (cont)
Now, we need to prove that C is satisfiable iff C’ is.
That is, we need to show that if the original instance
of VC is a yes instance iff the constructed instance of
SC is a yes instance.
 (→) Suppose G has a vertex cover of size at most j,
called C. By our construction, C corresponds to a
collection C’ of subsets of U. Since b = j, |C’| ≤ b.
Plus, C’ covers all elements in U since C “covers” all
edges in G. To see this, consider any element of U.
Such an element is an edge in G. Since C is a set cover,
at least endpoint of this edge is in C.
My T. Thai
mythai@cise.ufl.edu
40
 (←) Suppose there is a set cover C’ of size at
most b in our constructed instance. Since each
set in C’ is associated with a vertex in G, let C
be the set of these vertices. Then |C| = |C’| ≤ b =
j. Plus, C is a vertex cover of G since C’ is a set
cover. To see this, consider any edge e. Since e
is in U, C’ must contain at least one set that
includes e. By our construction, the only set
that include e correspond to nodes that are
endpoints of e. Thus C must contain at least one
endpoints of e.
My T. Thai
mythai@cise.ufl.edu
41
Solutions
Algorithm 1: (in the case of uniform cost)
1: C = empty
2: while U is not empty
3: pick a set Si such that Si covers the most
elements in U
4: remove the new covered elements from U
5: C = C union Si
6: return C
My T. Thai
mythai@cise.ufl.edu
42
Solutions (cont)
 In the case of non-uniform cost
 Similar method. At each iteration, instead of picking a
set Si such that Si covers the most uncovered elements,
we will pick a set Si whose cost-effectiveness α is
smallest, where α is defined as:
 Questions: Why choose smallest α? Why define α as
above
|
|
/
)
( U
S
S
c i
i 


My T. Thai
mythai@cise.ufl.edu
43
Solutions (cont)
Algorithm 2: (in the case of non-uniform cost)
1: C = empty
2: while U is not empty
3: pick a set Si such that Si has the smallest α
4: for each new covered elements e in U
5: set price(e) = α
6: remove the new covered elements from U
7: C = C union Si
8: return C
My T. Thai
mythai@cise.ufl.edu
44
Approximation Ratio Analysis
Let ek, k = 1…n, be the elements of U in the order in which
they were covered by Alg 2. We have:
 Lemma 1:
 Proof: Let Uj be the set the remaining set U at iteration j.
That is, Uj contains all the uncovered elements at
iteration j. At any iteration j, the leftover sets of the
optimal solution can cover the remaining elements at a
cost of at most opt. (Why?)
solution
optimal
the
of
cost
the
is
where
)
1
/(
)
(
price
},
,...,
1
{
each
For
opt
k
n
opt
e
n
k k 



My T. Thai
mythai@cise.ufl.edu
45
Proof of Lemma 1 (cont)
Thus, among these leftover sets, there must exist
one set where its α value is at most opt/|Uj|. Let
ek be the element covered at this iteration, |Uj| ≥
n – k + 1. Since ek was covered by the most
cost-effective set in this iteration, we have:
price(ek) ≤ opt/|Uj| ≤ opt/(n-k+1)
My T. Thai
mythai@cise.ufl.edu
46
Approximation Ratio
 Theorem 1: The set cover obtained form Alg 2
(also Alg 1) has a factor of Hn where Hn is a
harmonic series Hn = 1 + 1/2 + … + 1/n
 Proof: It follows directly from Lemma 1

Introduction to algorithms and it's different types

  • 1.
  • 2.
    My T. Thai mythai@cise.ufl.edu 2 GeneralInformation  Time: MWF 1:55 – 2:45  Place: TUR 2303
  • 3.
    My T. Thai mythai@cise.ufl.edu 3 InstructorInformation  Name: Dr. My T. Thai  Office: E566 CSE Building  Phone: (352)392-6842  Email: mythai@cise.ufl.edu  Office Hours: W 3:00pm – 5:00pm or by appointments
  • 4.
    My T. Thai mythai@cise.ufl.edu 4 WhyApproximation Algorithms  Problems that we cannot find an optimal solution in a polynomial time  Eg: Set Cover, Bin Packing  Need to find a near-optimal solution:  Heuristic  Approximation algorithms:  This gives us a guarantee approximation ratio
  • 5.
    My T. Thai mythai@cise.ufl.edu 5 Whytake this course  Your advisers/bosses give you a computationally hard problem. Here are two scenarios:  No knowledge about approximation:  Spend a few months looking for an optimal solution  Come to their office and confess that you cannot do it  Get fired  Knowledge about approximation:
  • 6.
    My T. Thai mythai@cise.ufl.edu 6 Whytake this course (cont)  Knowledge about approximation  Show your boss that this is a NP-complete (NP- hard) problem  There does not exist any polynomial time algorithm to find an exact solution  Propose a good algorithm (either heuristic or approximation) to find a near-optimal solution  Better yet, prove the approximation ratio
  • 7.
    My T. Thai mythai@cise.ufl.edu 7 CourseDescription  Covers a variety of techniques to design and analyze many approximation algorithms for computationally hard problems:  Combinatorial algorithms:  Greedy Techniques, Independent System, Submodular Function  Cover various problems  Linear Programming based algorithms  Semidefinite Programming based algorithms
  • 8.
    My T. Thai mythai@cise.ufl.edu 8 CourseObjectives  Grasp the essential techniques to design and analyze approximation algorithms:  Combinatorial methods  Linear programming  Semidefinite programming  Primal-dual and relaxation methods  Hardness of approximation  Grasp the key ideas of graph theory  Able to model and solve many practical problems raising in our real life
  • 9.
    My T. Thai mythai@cise.ufl.edu 9 Textbooks Required:  Vijay Vazirani, Approximation Algorithms, Springer-Verlag, 2001  Recommended:  Vasek Chvatal, Linear Programming, W. H. Freeman Company  Michael R. Garey and David S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness  Shall provide some appropriate lecture notes
  • 10.
    My T. Thai mythai@cise.ufl.edu 10 GradingPolicies  Homework Assignments:  5 homework assignments, each weighs 10%  Due at the beginning of the lecture on the due date  No late assignment will be accepted  Midterm Exam:  One midterm exam, weighs 20%  In class, open book, open notes  One Final Exam:  Weighs 30%
  • 11.
    My T. Thai mythai@cise.ufl.edu 11 GradingPolicies (cont) Cut-off points:  A >= 85%  85% > B >= 75%  75% > C >= 65%
  • 12.
    My T. Thai mythai@cise.ufl.edu 12 PlagiarismPolicy  Zero tolerance  Collaboration:  May discuss with other students on homework, but must write up solution on your own independently  More info, see the class website  Other policy: Students are encouraged to attend the class. Many proofs will not be shown on the lecture slides
  • 13.
    My T. Thai mythai@cise.ufl.edu 13 TentativeSchedule  See the class website for detail information  Roughly divided into two parts:  Week 1 to Week 8: Combinatorial Algorithms  Week 9 to Week 16: Linear Programming: Dual- Primal, Relaxation, Rounding, and Semidefinite programming
  • 14.
  • 15.
    My T. Thai mythai@cise.ufl.edu 15 CombinatorialOptimization  The study of finding the “best” object from within some finite space of objects, eg:  Shortest path: Given a graph with edge costs and a pair of nodes, find the shortest path (least costs) between them  Traveling salesman: Given a complete graph with nonnegative edge costs, find a minimum cost cycle visiting every vertex exactly once  Maximum Network Lifetime: Given a wireless sensor networks and a set of targets, find a schedule of these sensors to maximize network lifetime
  • 16.
    My T. Thai mythai@cise.ufl.edu 16 InP or not in P? Informal Definitions:  The class P consists of those problems that are solvable in polynomial time, i.e. O(nk) for some constant k where n is the size of the input.  The class NP consists of those problems that are “verifiable” in polynomial time:  Given a certificate of a solution, then we can verify that the certificate is correct in polynomial time
  • 17.
    My T. Thai mythai@cise.ufl.edu 17 InP or not in P: Examples  In P:  Shortest path  Minimum Spanning Tree  Not in P (NP):  Vertex Cover  Traveling salesman  Minimum Connected Dominating Set
  • 18.
    My T. Thai mythai@cise.ufl.edu 18 NP-completeness(NPC)  A problem is in the class NPC if it is in NP and is as “hard” as any problem in NP
  • 19.
    My T. Thai mythai@cise.ufl.edu 19 Whatis “hard”  Decision Problems: Only consider YES-NO  Decision version of TSP: Given n cities, is there a TSP tour of length at most L?  Why Decision Problem? What relation between the optimization problem and its decision?  Decision is not “harder” than that of its optimization problem  If the decision is “hard”, then the optimization problem should be “hard”
  • 20.
    My T. Thai mythai@cise.ufl.edu 20 NP-completeand NP-hard A language L is NP-complete if: 1. L is in NP, and 2. For every L’ in NP, L’ can be polynomial-time reducible to L
  • 21.
    My T. Thai mythai@cise.ufl.edu 21 ApproximationAlgorithms  An algorithm that returns near-optimal solutions in polynomial time  Approximation Ratio ρ(n):  Define: C* as a optimal solution and C is the solution produced by the approximation algorithm  max (C/C*, C*/C) <= ρ(n)  Maximization problem: 0 < C <= C*, thus C*/C shows that C* is larger than C by ρ(n)  Minimization problem: 0 < C* <= C, thus C/C* shows that C is larger than C* by ρ(n)
  • 22.
    My T. Thai mythai@cise.ufl.edu 22 ApproximationAlgorithms (cont)  PTAS (Polynomial Time Approximation Scheme): A (1 + ε)-approximation algorithm for a NP-hard optimization П where its running time is bounded by a polynomial in the size of instance I  FPTAS (Fully PTAS): The same as above + time is bounded by a polynomial in both the size of instance I and 1/ε
  • 23.
    My T. Thai mythai@cise.ufl.edu 23 ADilemma!  We cannot find C*, how can we compare C to C*?  How can we design an algorithm so that we can compare C to C* It is the objective of this course!!!
  • 24.
    My T. Thai mythai@cise.ufl.edu 24 VertexCover  Definition:  An Example ' at incident endpoint one least at has , edge every for iff of cover vertex a called is ' subset a , ) , ( graph undirected an Given V e E e G V V E V G   
  • 25.
    My T. Thai mythai@cise.ufl.edu 25 VertexCover Problem  Definition:  Given an undirected graph G=(V,E), find a vertex cover with minimum size (has the least number of vertices)  This is sometimes called cardinality vertex cover  More generalization:  Given an undirected graph G=(V,E), and a cost function on vertices c: V → Q+  Find a minimum cost vertex cover
  • 26.
    My T. Thai mythai@cise.ufl.edu 26 Howto solve it  Matching:  A set M of edges in a graph G is called a matching of G if no two edges in set M have an endpoint in common  Example:
  • 27.
    My T. Thai mythai@cise.ufl.edu 27 Howto solve it (cont)  Maximum Matching:  A matching of G with the greatest number of edges  Maximal Matching:  A matching which is not contained in any larger matching  Note: Any maximum matching is certainly maximal, but not the reverse
  • 28.
    My T. Thai mythai@cise.ufl.edu 28 MainObservation  No vertex can cover two edges of a matching  The size of every vertex cover is at least the size of every matching: |M| ≤ |C|  |M| = |C| indicates the optimality  Possible Solution: Using Maximal matching to find Minimum vertex cover
  • 29.
    My T. Thai mythai@cise.ufl.edu 29 AnAlgorithm  Alg 1: Find a maximal matching in G and output the set of matched vertices  Theorem: The algorithm 1 is a factor 2-approximation algorithm.  Proof: opt C M C opt M C opt 2 | | Therefore, | | 2 | | and | | : have We 1. algorithm from obtained cover vertex of set a be Let solution. optimal an of size a be Let   
  • 30.
    My T. Thai mythai@cise.ufl.edu 30 CanAlg1 be improved?  Q1: Can the approximation ratio of Alg 1 be improved by a better analysis?  Q2: Can we design a better approximation algorithm using the similar technique (maximal matching)?  Q3: Is there any other lower bounding method that can lead to a better approximation algorithm?
  • 31.
    My T. Thai mythai@cise.ufl.edu 31 Answers A1: No by considering the complete bipartite graphs Kn,n  A2: No by considering the complete graph Kn where n is odd.  |M| = (n-1)/2 whereas opt = n -1
  • 32.
    My T. Thai mythai@cise.ufl.edu 32 Answers(cont)  A3:  Currently a central open problem  Yes, we can obtain a better one by using the semidefinite programming  Generalization vertex Cover  Can we still able to design a 2-approximation algorithm?  Homework assignment!
  • 33.
    My T. Thai mythai@cise.ufl.edu 33 SetCover  Definition: Given a universe U of n elements, a collection of subsets of U, S = {S1, …, Sm}, and a cost function c: S → Q+, find a minimum cost subcollection C of S that covers all elements of U.  Example:  U = {1, 2, 3, 4, 5}  S1 = {1, 2, 3}, S2 = {2,3}, S3 = {4, 5}, S4 = {1, 2, 4}  c1 = c2 = c3 = c4 = 1  Solution C = {S1, S3}  If the cost is uniform, then the set cover problem asks us to find a subcollection covering all elements of U with the minimum size.
  • 34.
  • 35.
    My T. Thai mythai@cise.ufl.edu 35 INSTANCE:Given a universe U of n elements, a collection of subsets of U, S = {S1, …, Sm}, and a positive integer b QUESTION: Is there a , |C| ≤ b, such that (Note: The subcollection {Si | } satisfying the above condition is called a set cover of U NP-completeness  Theorem: Set Cover (SC) is NP-complete  Proof:
  • 36.
    My T. Thai mythai@cise.ufl.edu 36 Proof(cont)  First we need to show that SC is in NP. Given a collection of sets C, it is easy to verify that if |C| ≤ b and the union of all sets listed in C does include all elements in U.  To complete the proof, we need to show that Vertex Cover (VC) ≤p Set Cover (SC) Given an instance C of VC (an undirected graph G=(V,E) and a positive integer j), we need to construct an instance C’ of SC in polynomial time such that C is satisfiable iff C’ is satisfiable.
  • 37.
    My T. Thai mythai@cise.ufl.edu 37 Proof(cont) Construction: Let U = E. We will define n elements of U and a collection S as follows: Note: Each edge corresponds to each element in U and each vertex corresponds to each set in S. Label all vertices in V from 1 to n. Let Si be the set of all edges that incident to vertex i. Finally, let b = j. This construction is in poly-time with respect to the size of VC instance.
  • 38.
    My T. Thai mythai@cise.ufl.edu 38 VERTEX-COVERp SET-COVER one set for every vertex, containing the edges it covers VC one element for every edge VC SC
  • 39.
    My T. Thai mythai@cise.ufl.edu 39 Proof(cont) Now, we need to prove that C is satisfiable iff C’ is. That is, we need to show that if the original instance of VC is a yes instance iff the constructed instance of SC is a yes instance.  (→) Suppose G has a vertex cover of size at most j, called C. By our construction, C corresponds to a collection C’ of subsets of U. Since b = j, |C’| ≤ b. Plus, C’ covers all elements in U since C “covers” all edges in G. To see this, consider any element of U. Such an element is an edge in G. Since C is a set cover, at least endpoint of this edge is in C.
  • 40.
    My T. Thai mythai@cise.ufl.edu 40 (←) Suppose there is a set cover C’ of size at most b in our constructed instance. Since each set in C’ is associated with a vertex in G, let C be the set of these vertices. Then |C| = |C’| ≤ b = j. Plus, C is a vertex cover of G since C’ is a set cover. To see this, consider any edge e. Since e is in U, C’ must contain at least one set that includes e. By our construction, the only set that include e correspond to nodes that are endpoints of e. Thus C must contain at least one endpoints of e.
  • 41.
    My T. Thai mythai@cise.ufl.edu 41 Solutions Algorithm1: (in the case of uniform cost) 1: C = empty 2: while U is not empty 3: pick a set Si such that Si covers the most elements in U 4: remove the new covered elements from U 5: C = C union Si 6: return C
  • 42.
    My T. Thai mythai@cise.ufl.edu 42 Solutions(cont)  In the case of non-uniform cost  Similar method. At each iteration, instead of picking a set Si such that Si covers the most uncovered elements, we will pick a set Si whose cost-effectiveness α is smallest, where α is defined as:  Questions: Why choose smallest α? Why define α as above | | / ) ( U S S c i i   
  • 43.
    My T. Thai mythai@cise.ufl.edu 43 Solutions(cont) Algorithm 2: (in the case of non-uniform cost) 1: C = empty 2: while U is not empty 3: pick a set Si such that Si has the smallest α 4: for each new covered elements e in U 5: set price(e) = α 6: remove the new covered elements from U 7: C = C union Si 8: return C
  • 44.
    My T. Thai mythai@cise.ufl.edu 44 ApproximationRatio Analysis Let ek, k = 1…n, be the elements of U in the order in which they were covered by Alg 2. We have:  Lemma 1:  Proof: Let Uj be the set the remaining set U at iteration j. That is, Uj contains all the uncovered elements at iteration j. At any iteration j, the leftover sets of the optimal solution can cover the remaining elements at a cost of at most opt. (Why?) solution optimal the of cost the is where ) 1 /( ) ( price }, ,..., 1 { each For opt k n opt e n k k    
  • 45.
    My T. Thai mythai@cise.ufl.edu 45 Proofof Lemma 1 (cont) Thus, among these leftover sets, there must exist one set where its α value is at most opt/|Uj|. Let ek be the element covered at this iteration, |Uj| ≥ n – k + 1. Since ek was covered by the most cost-effective set in this iteration, we have: price(ek) ≤ opt/|Uj| ≤ opt/(n-k+1)
  • 46.
    My T. Thai mythai@cise.ufl.edu 46 ApproximationRatio  Theorem 1: The set cover obtained form Alg 2 (also Alg 1) has a factor of Hn where Hn is a harmonic series Hn = 1 + 1/2 + … + 1/n  Proof: It follows directly from Lemma 1