Lecture outline
• graph concepts
  • vertices, edges, paths
  • directed/undirected
  • weighting of edges
  • cycles and loops
• searching for paths within a graph
  • depth-first search
  • breadth-first search
  • Dijkstra's algorithm
• implementing graphs
  • using adjacency lists
  • using an adjacency matrix
Graphs
• graph: a data structure containing
  • a set of vertices V
  • a set of edges E, where an edge represents a connection between 2 vertices
• the graph at right:
  • V = {a, b, c}
  • E = {(a, b), (b, c), (c, a)}
• Assuming that a graph can have only one edge between a pair of vertices, what is the maximum number of edges a graph can contain, relative to the size of the vertex set V?
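One way to work out the answer to the question above, assuming loops are not allowed: every unordered pair of distinct vertices can contribute at most one edge in an undirected graph, and every ordered pair can contribute one in a directed graph.

```latex
\[
|E|_{\max} = \binom{|V|}{2} = \frac{|V|\,(|V| - 1)}{2}
\quad \text{(undirected)},
\qquad
|E|_{\max} = |V|\,(|V| - 1)
\quad \text{(directed)}.
\]
```

For the three-vertex example, the undirected bound is 3, which the edge set {(a, b), (b, c), (c, a)} already reaches.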
More terminology
• degree: number of edges touching a vertex
  • example: W has degree 4
  • what is the degree of X? of Z?
• adjacent vertices: connected directly by an edge
[Figure: graph with vertices U, V, W, X, Y, Z and edges a through j]
Paths
• path: a path from vertex A to B is a sequence of edges that can be followed starting from A to reach B
  • can be represented as vertices visited or edges taken
  • example: path from V to Z: {b, h} or {V, X, Z}
• reachability: V2 is reachable from V1 if a path exists from V1 to V2
• connected graph: one in which it's possible to reach any node from any other
  • is this graph connected?
[Figure: graph with vertices U, V, W, X, Y, Z and edges a through h, with two paths labeled P1 and P2]
Cycles
• cycle: path from one node back to itself
  • example: {b, g, f, c, a} or {V, X, Y, W, U, V}
• loop: edge directly from a node to itself
  • many graphs don't allow loops
[Figure: graph with vertices U, V, W, X, Y, Z and edges a through h, with two cycles labeled C1 and C2]
Weighted graphs
• weight: (optional) cost associated with a given edge
• example: graph of airline flights
  • vertices: cities (airports) to which the airline flies
  • edges: flights between airports, weighted by the distance in miles
  • if we were programming this graph, what information would we have to store for each vertex / edge?
[Figure: flight graph with airports ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL]
Directed graphs
• directed graph (digraph): edges are one-way connections between vertices
  • if the graph is directed, each vertex has a separate in-degree (edges arriving) and out-degree (edges leaving)
Graph questions
• Are the following graphs directed or undirected?
  • buddy graphs of instant messaging programs (vertices = users, edges = a user being on another's buddy list)
  • bus line graph depicting all of Seattle's bus stations and routes
  • graph of the main backbone servers on the internet
  • graph of movies in which actors have appeared together
• Are these graphs potentially cyclic? Why or why not?
[Figure: example graphs; vertex labels include John, David, Paul and brown.edu, cox.net, cs.brown.edu, att.net, qwest.net, math.brown.edu, cslab1a, cslab1b]
Graph exercise
• Consider a graph of instant messenger buddies.
  • What do the vertices represent? What does an edge represent?
  • Is this graph directed or undirected? Weighted or unweighted?
  • What does a vertex's degree mean? In-degree? Out-degree?
  • Can the graph contain loops? Cycles?
• Consider this graph data:
  • Marty's buddy list: Mike, Sarah, Amanda.
  • Mike's buddy list: Sarah, Emily.
  • David's buddy list: Emily, Mike.
  • Amanda's buddy list: Emily, Mike.
  • Sarah's buddy list: Amanda, Marty.
  • Emily's buddy list: Mike.
• Compute the in/out degree of each vertex. Is the graph connected?
  • Who is the most popular? Least? Who is the most antisocial?
  • If we're having a party and want to distribute the message the most quickly, who should we tell first?
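The degree computation in this exercise can be sketched in code. The sketch below assumes an edge A -> B means "B is on A's buddy list"; the class and method names (BuddyDegrees, inDegree, outDegree) are made up for illustration and are not from the slides.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BuddyDegrees {
    // Buddy lists from the exercise; an edge A -> B means B is on A's buddy list (assumption).
    static Map<String, List<String>> buddyGraph() {
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("Marty", List.of("Mike", "Sarah", "Amanda"));
        g.put("Mike", List.of("Sarah", "Emily"));
        g.put("David", List.of("Emily", "Mike"));
        g.put("Amanda", List.of("Emily", "Mike"));
        g.put("Sarah", List.of("Amanda", "Marty"));
        g.put("Emily", List.of("Mike"));
        return g;
    }

    // Out-degree: the length of v's own buddy list.
    static int outDegree(Map<String, List<String>> g, String v) {
        return g.getOrDefault(v, List.of()).size();
    }

    // In-degree: the number of buddy lists that contain v.
    static int inDegree(Map<String, List<String>> g, String v) {
        int count = 0;
        for (List<String> buddies : g.values()) {
            if (buddies.contains(v)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = buddyGraph();
        for (String v : g.keySet()) {
            System.out.println(v + ": in=" + inDegree(g, v) + ", out=" + outDegree(g, v));
        }
    }
}
```

Under that reading of the edges, Mike has the highest in-degree (4) and David the lowest (0), which is one way to interpret "most popular" and "least popular".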
Depth-first search
• depth-first search (DFS): finds a path between two vertices by exploring each possible path as many steps as possible before backtracking
  • often implemented recursively
DFS pseudocode
• Pseudo-code for depth-first search:
    dfs(v1, v2):
        dfs(v1, v2, {})

    dfs(v1, v2, path):
        path += v1.
        mark v1 as visited.
        if v1 is v2:
            path is found.
        for each unvisited neighbor vi of v1
                where there is an edge from v1 to vi:
            if dfs(vi, v2, path) finds a path, path is found.
        path -= v1. path is not found.
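A minimal Java sketch of the recursive pseudocode above. It assumes vertices are strings and the graph is a map from each vertex to its neighbor list; the class name DfsSketch and the example graph (edges inferred from the paths listed in the DFS example, treated as undirected) are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DfsSketch {
    // Example graph; edges are inferred from the lecture's DFS example (assumption).
    static Map<String, List<String>> exampleGraph() {
        return Map.of(
            "A", List.of("B", "C", "E"),
            "B", List.of("A", "D", "F"),
            "C", List.of("A", "G"),
            "D", List.of("B"),
            "E", List.of("A", "F"),
            "F", List.of("B", "E"),
            "G", List.of("C"));
    }

    // Returns the vertices on some path from v1 to v2, or an empty list if none exists.
    static List<String> dfs(Map<String, List<String>> graph, String v1, String v2) {
        List<String> path = new ArrayList<>();
        boolean found = dfs(graph, v1, v2, path, new HashSet<>());
        return found ? path : new ArrayList<>();
    }

    private static boolean dfs(Map<String, List<String>> graph, String v1, String v2,
                               List<String> path, Set<String> visited) {
        path.add(v1);                      // path += v1
        visited.add(v1);                   // mark v1 as visited
        if (v1.equals(v2)) {
            return true;                   // path is found
        }
        for (String vi : graph.getOrDefault(v1, List.of())) {
            if (!visited.contains(vi) && dfs(graph, vi, v2, path, visited)) {
                return true;               // a deeper call found the path
            }
        }
        path.remove(path.size() - 1);      // path -= v1 (backtrack)
        return false;                      // path is not found
    }

    public static void main(String[] args) {
        System.out.println(dfs(exampleGraph(), "A", "E"));
    }
}
```

With neighbors visited in alphabetical order, a search from A to E in this example graph yields A, B, F, E rather than the direct edge A, E.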
DFS example
• Paths tried from A to others (assumes ABC edge order)
  • A
  • A -> B
  • A -> B -> D
  • A -> B -> F
  • A -> B -> F -> E
  • A -> C
  • A -> C -> G
  • A -> E
  • A -> E -> F
  • A -> E -> F -> B
  • A -> E -> F -> B -> D
• What paths would DFS return from D to each vertex?
DFS observations
• guaranteed to find a path if one exists
• easy to retrieve exactly what the path is (the sequence of edges taken) once we find it
• optimality: not optimal. DFS is guaranteed to find a path, but not necessarily the best/shortest path
  • example: DFS(A, E) may return A -> B -> F -> E, even though the shorter path A -> E exists
DFS example
• Using DFS, find a path from BOS to SFO.
[Figure: flight graph with airports JFK, BOS, MIA, ORD, LAX, DFW, SFO, labeled v1 through v7]
Breadth-first search
• breadth-first search (BFS): finds a path between two nodes by taking one step down all paths and then immediately backtracking
  • often implemented by maintaining a list or queue of vertices to visit
  • BFS always returns a path with the fewest edges between the start and the goal vertices
BFS pseudocode
• Pseudo-code for breadth-first search:
    bfs(v1, v2):
        List := {v1}.
        mark v1 as visited.
        while List not empty:
            v := List.removeFirst().
            if v is v2:
                path is found.
            for each unvisited neighbor vi of v
                    where there is an edge from v to vi:
                mark vi as visited.
                List.addLast(vi).
        path is not found.
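A minimal Java sketch of the queue-based search, assuming string vertices and a map from each vertex to its neighbor list. It marks each vertex as visited at the moment it is enqueued, so no vertex enters the queue twice. The class name BfsSketch is made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

class BfsSketch {
    // Returns true if v2 can be reached from v1 by following edges.
    static boolean bfs(Map<String, List<String>> graph, String v1, String v2) {
        Queue<String> queue = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        queue.add(v1);
        visited.add(v1);                       // mark v1 as visited
        while (!queue.isEmpty()) {
            String v = queue.remove();         // List.removeFirst()
            if (v.equals(v2)) {
                return true;                   // path is found
            }
            for (String vi : graph.getOrDefault(v, List.of())) {
                if (visited.add(vi)) {         // add() returns false if vi was already visited
                    queue.add(vi);             // List.addLast(vi)
                }
            }
        }
        return false;                          // path is not found
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = Map.of(
            "A", List.of("B", "C"),
            "B", List.of("A"),
            "C", List.of("A"));
        System.out.println(bfs(g, "A", "C"));
    }
}
```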
BFS example
• Paths tried from A to others (assumes ABC edge order)
  • A
  • A -> B
  • A -> C
  • A -> E
  • A -> B -> D
  • A -> B -> F
  • A -> C -> G
  • A -> E -> F
  • A -> B -> F -> E
  • A -> E -> F -> B
  • A -> E -> F -> B -> D
• What paths would BFS return from D to each vertex?
BFS observations
• optimality:
  • in unweighted graphs, optimal. (fewest edges = best)
  • in weighted graphs, not optimal. (the path with the fewest edges might not have the lowest weight)
• disadvantage: harder to reconstruct what the actual path is once you find it
  • conceptually, BFS is exploring many possible paths in parallel, so it's not easy to store a Path array/list in progress
  • observation: any particular vertex is only part of one partial path at a time
  • we can keep track of the path by storing predecessors for each vertex (references to the previous vertex in that path)
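The predecessor idea can be sketched as follows. This is one possible arrangement, assuming string vertices and a neighbor-list map; the names BfsPathSketch and shortestPath, and the example graph (edges inferred from the lecture's BFS example), are illustrative assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

class BfsPathSketch {
    // Example graph; edges are inferred from the lecture's BFS example (assumption).
    static Map<String, List<String>> exampleGraph() {
        return Map.of(
            "A", List.of("B", "C", "E"),
            "B", List.of("A", "D", "F"),
            "C", List.of("A", "G"),
            "D", List.of("B"),
            "E", List.of("A", "F"),
            "F", List.of("B", "E"),
            "G", List.of("C"));
    }

    // Returns a fewest-edges path from start to goal, or an empty list if goal is unreachable.
    static List<String> shortestPath(Map<String, List<String>> graph, String start, String goal) {
        Map<String, String> predecessor = new HashMap<>(); // vertex -> previous vertex on its path
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.addLast(start);
        visited.add(start);
        while (!queue.isEmpty()) {
            String v = queue.removeFirst();
            if (v.equals(goal)) {
                // Walk the predecessor links backward from the goal to the start.
                LinkedList<String> path = new LinkedList<>();
                for (String at = goal; at != null; at = predecessor.get(at)) {
                    path.addFirst(at);
                }
                return path;
            }
            for (String vi : graph.getOrDefault(v, List.of())) {
                if (visited.add(vi)) {           // first time vi is seen
                    predecessor.put(vi, v);      // remember how we reached vi
                    queue.addLast(vi);
                }
            }
        }
        return List.of();
    }

    public static void main(String[] args) {
        System.out.println(shortestPath(exampleGraph(), "A", "D"));
    }
}
```

Because each vertex is assigned a predecessor only the first time it is discovered, the stored links always describe a fewest-edges route back to the start.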
BFS example
• Using BFS, find a path from BOS to SFO.
[Figure: flight graph with airports JFK, BOS, MIA, ORD, LAX, DFW, SFO, labeled v1 through v7]
DFS, BFS runtime
• What is the expected runtime of DFS, in terms of the number of vertices V and the number of edges E?
• What is the expected runtime of BFS, in terms of the number of vertices V and the number of edges E?
• Answer: O(|V| + |E|)
  • each algorithm must potentially visit every node and/or examine every edge once.
  • why not O(|V| * |E|)?
• What is the space complexity of each algorithm?
Implementing graphs
Implementing a graph
• If we wanted to program an actual data structure to represent a graph, what information would we need to store?
  • for each vertex?
  • for each edge?
• What kinds of questions would we want to be able to answer quickly:
  • about a vertex?
  • about its edges / neighbors?
  • about paths?
  • about what edges exist in the graph?
• We'll explore three common graph implementation strategies:
  • edge list, adjacency list, adjacency matrix
[Figure: example graph with vertices 1 through 7]
Edge list
• edge list: an unordered list of all edges in the graph
  (1, 2) (5, 1) (1, 6) (2, 7) (2, 3) (3, 4) (7, 4) (5, 6) (5, 7) (5, 4)
• advantages
  • easy to loop/iterate over all edges
• disadvantages
  • hard to tell if an edge exists from A to B
  • hard to tell how many edges a vertex touches (its degree)
[Figure: example graph with vertices 1 through 7]
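A small Java sketch of an edge list for the seven-vertex example. The edge pairs are read off the slide's example, assuming each consecutive pair of numbers forms one undirected edge; the names EdgeListSketch, Edge, hasEdge, and degree are illustrative.

```java
import java.util.List;

class EdgeListSketch {
    // One undirected edge between vertices u and v.
    static final class Edge {
        final int u;
        final int v;

        Edge(int u, int v) {
            this.u = u;
            this.v = v;
        }
    }

    // The ten edges of the example graph (pairing of numbers is an assumption).
    static final List<Edge> EDGES = List.of(
        new Edge(1, 2), new Edge(5, 1), new Edge(1, 6), new Edge(2, 7), new Edge(2, 3),
        new Edge(3, 4), new Edge(7, 4), new Edge(5, 6), new Edge(5, 7), new Edge(5, 4));

    // Must scan the whole list: O(m) for m edges.
    static boolean hasEdge(List<Edge> edges, int a, int b) {
        for (Edge e : edges) {
            if ((e.u == a && e.v == b) || (e.u == b && e.v == a)) {
                return true;
            }
        }
        return false;
    }

    // Also O(m): counts every edge that touches x.
    static int degree(List<Edge> edges, int x) {
        int count = 0;
        for (Edge e : edges) {
            if (e.u == x || e.v == x) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("degree(5) = " + degree(EDGES, 5));
    }
}
```

Both queries walk the entire list, which is the cost the "disadvantages" bullets describe.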
Adjacency lists
• adjacency list: stores edges as individual linked lists of references to each vertex's neighbors
  • generally, no information needs to be stored in the edges, only in the nodes, so each list can simply hold pointers to other nodes and thus represent edges with little memory
Pros/cons of adjacency list
• advantage: new nodes can be added to the graph easily, and they can be connected with existing nodes simply by adding elements to the appropriate lists
• disadvantage: determining whether an edge exists between two nodes requires scanning a neighbor list, which takes O(d) time, where d is the number of edges incident to the node being scanned
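The trade-off can be seen in a minimal Java sketch. It assumes an undirected graph whose vertices are numbered from 0; the class name AdjacencyListSketch and its methods are illustrative, not taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;

class AdjacencyListSketch {
    // neighbors.get(v) holds the vertices adjacent to v.
    private final List<List<Integer>> neighbors = new ArrayList<>();

    // Adds a new vertex and returns its index; cheap because it only appends an empty list.
    int addVertex() {
        neighbors.add(new ArrayList<>());
        return neighbors.size() - 1;
    }

    // Adds an undirected edge by appending to both neighbor lists.
    void addEdge(int a, int b) {
        neighbors.get(a).add(b);
        neighbors.get(b).add(a);
    }

    // The degree is just the length of the neighbor list.
    int degree(int v) {
        return neighbors.get(v).size();
    }

    // Scans a's neighbor list, so the cost grows with deg(a).
    boolean hasEdge(int a, int b) {
        return neighbors.get(a).contains(b);
    }

    public static void main(String[] args) {
        AdjacencyListSketch g = new AdjacencyListSketch();
        int a = g.addVertex();
        int b = g.addVertex();
        g.addEdge(a, b);
        System.out.println(g.hasEdge(a, b));
    }
}
```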
Adjacency list example
• The graph at right has the following adjacency list:
    1: 2 5 6
    2: 3 1 7
    3: 2 4
    4: 3 7 5
    5: 6 1 7 4
    6: 1 5
    7: 4 5 2
• How do we figure out the degree of a given vertex?
• How do we find out whether an edge exists from A to B?
• How could we look for loops in the graph?
[Figure: example graph with vertices 1 through 7]
Adjacency matrix
• adjacency matrix: an n × n matrix where:
  • the nondiagonal entry a_ij is the number of edges joining vertex i and vertex j (or the weight of the edge joining vertex i and vertex j)
  • the diagonal entry a_ii corresponds to the number of loops (self-connecting edges) at vertex i
Pros/cons of adjacency matrix
• advantage: fast to tell whether an edge exists between any two vertices i and j (and to get its weight)
• disadvantage: consumes a lot of memory on sparse graphs (ones with few edges)
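A minimal Java sketch of the matrix representation. It assumes an unweighted, undirected graph with vertices numbered from 0, so a boolean entry stands in for the edge count or weight; the class name AdjacencyMatrixSketch is illustrative.

```java
class AdjacencyMatrixSketch {
    // adjacent[i][j] is true when an edge joins vertex i and vertex j.
    private final boolean[][] adjacent;

    // Allocates n * n entries up front, regardless of how many edges exist.
    AdjacencyMatrixSketch(int n) {
        adjacent = new boolean[n][n];
    }

    // Undirected edge: set both symmetric entries.
    void addEdge(int i, int j) {
        adjacent[i][j] = true;
        adjacent[j][i] = true;
    }

    // Constant-time lookup.
    boolean hasEdge(int i, int j) {
        return adjacent[i][j];
    }

    // Scans row i: O(n) even when vertex i has few neighbors.
    int degree(int i) {
        int count = 0;
        for (boolean edge : adjacent[i]) {
            if (edge) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        AdjacencyMatrixSketch g = new AdjacencyMatrixSketch(3);
        g.addEdge(0, 1);
        System.out.println(g.hasEdge(1, 0));
    }
}
```

The constructor shows where the memory cost comes from: a graph with n vertices always pays for n * n entries.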
Adjacency matrix example
• The graph at right has the following adjacency matrix:
        1 2 3 4 5 6 7
    1   0 1 0 0 1 1 0
    2   1 0 1 0 0 0 1
    3   0 1 0 1 0 0 0
    4   0 0 1 0 1 0 1
    5   1 0 0 1 0 1 1
    6   1 0 0 0 1 0 0
    7   0 1 0 1 1 0 0
• How do we figure out the degree of a given vertex?
• How do we find out whether an edge exists from A to B?
• How could we look for loops in the graph?
[Figure: example graph with vertices 1 through 7]
Runtime table
• n vertices, m edges
• no parallel edges
• no self-loops

                                      Edge list   Adjacency list         Adjacency matrix
Space                                 n + m       n + m                  n^2
Finding all adjacent vertices to v    m           deg(v)                 n
Determining if v is adjacent to w     m           min(deg(v), deg(w))    1
Inserting a vertex                    1           1                      n^2
Inserting an edge                     1           1                      1
Removing vertex v                     m           deg(v)                 n^2
Removing an edge                      1           deg(v)                 1
Practical implementation
• Not all graphs have vertices/edges that are easily "numbered"
  • how do we actually represent "lists" or "matrices" of vertex/edge relationships? How do we quickly look up the edges and/or vertices adjacent to a given vertex?
  • Adjacency list: Map<V, List<V>>
  • Adjacency matrix: Map<V, Map<V, E>>
  • Adjacency matrix: Map<V*V, E> (keyed by a pair of vertices)
[Figure: flight graph with airports ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL, shown with fragments of a numbered adjacency list and adjacency matrix]
Maps and sets within graphs
Since not all vertices can be numbered, we can use:
1. adjacency map
  • each Vertex maps to a List of edges or adjacent Vertices
  • Vertex --> List of Edges
  • to get all edges adjacent to V1, look up List<Edge> v1neighbors = map.get(V1)
2. adjacency matrix map
  • each Vertex maps to a hash map of its adjacent Vertices
  • Vertex --> (Vertex --> Edge)
  • to find out whether there's an edge from V1 to V2, call map.get(V1).containsKey(V2)
  • to get the edge from V1 to V2, call map.get(V1).get(V2)
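The nested-map idea (Vertex --> (Vertex --> Edge)) can be sketched in Java as below. The class AdjacencyMapSketch and its method names are hypothetical, the edges are treated as directed, and the weight used in main is a placeholder, not a real distance.

```java
import java.util.HashMap;
import java.util.Map;

class AdjacencyMapSketch<V, E> {
    // Vertex --> (Vertex --> Edge)
    private final Map<V, Map<V, E>> map = new HashMap<>();

    // Registers a vertex with an empty inner map if it is not present yet.
    void addVertex(V v) {
        map.putIfAbsent(v, new HashMap<>());
    }

    // Adds a directed edge from v1 to v2 carrying the edge value e.
    void addEdge(V v1, V v2, E e) {
        addVertex(v1);
        addVertex(v2);
        map.get(v1).put(v2, e);
    }

    // Equivalent to map.get(V1).containsKey(V2), guarded against unknown vertices.
    boolean hasEdge(V v1, V v2) {
        return map.containsKey(v1) && map.get(v1).containsKey(v2);
    }

    // Equivalent to map.get(V1).get(V2); returns null when there is no such edge.
    E getEdge(V v1, V v2) {
        Map<V, E> row = map.get(v1);
        return row == null ? null : row.get(v2);
    }

    public static void main(String[] args) {
        AdjacencyMapSketch<String, Integer> flights = new AdjacencyMapSketch<>();
        flights.addEdge("ORD", "PVD", 100); // 100 is a placeholder weight
        System.out.println(flights.hasEdge("ORD", "PVD"));
    }
}
```

With hash maps at both levels, the edge test and the edge lookup each take expected constant time without requiring vertices to be numbered.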

Talk on Graph Theory - I
