Chapter 4: Graphs

Graphs
19:14

Definition (Graph)

An undirected graph is a set where

Remarks

  • The elements of V are called vertices
  • The elements of E are called edges
  • If instead , the graph is called directed

Adjacency matrix
19:14

Definition (Adjacency matrix)

Assume

Random browsing
19:14

  • Each page has the same probability of being the start page
  • There is at most one link from one page to any other
  • On each page, each link has the same probability of being clicked
  • If there are no links, the next page can be any page with equal probability

Ranking webpages
19:14

  • : k-th page visited
  • : probability that the k-th page is page

Idea

If is small, depends too much on . A good way to measure the popularity of page would be

Rank and PageRank
19:14

Definition (PageRank vector)

Transition Matrix
19:14

Properties of the transition matrix
19:14

Proposition

Corollary

The PageRank vector satisfies

Weakness of naive PageRank
19:14

Question

What's the issue with the above graphs? How would you fix it?

PageRank algorithm
19:14

Idea

Approximate for some large via

Exercise

Implement it on this dataset

Show the ten highest ranked pages.

PageRank: solution
19:14

Graph Exploration
19:14

Question

From a given set of nodes, which vertices can I reach?

  • Web crawling
  • Social networking
  • Network Broadcast
  • Garbage collection

Adjacency list
19:14

Definition (Adjacency list)

Question

How much space does that representation required?

Adjacency list/matrix
19:14

Breadth-first search
19:14

  • Visit all the nodes reachable from a given node
  • We want to achieve time
  • Look at nodes reachable in moves
  • Careful to avoid duplicates (otherwise running time could be infinite)

Breadth-First Search
19:14

Exercise

Write an algorithm that perform breadth-first search from a given source node s.

Shortest paths
19:14

Exercise (Shortest paths)

Change the BFS algorithm so that you can keep track of a shortest path from s to any node.

Running Time
19:14

Proposition

Depth-first search
19:14

  • Recursively explore graph, backtracking as necessary
  • Careful not to repeat

Exercise

Implement DFS, when you're given the set of vertices and an adjacency list.

DFS: Running time
19:14

Proposition

The Running time of DFS is

Exercise: Sudoku solving
19:14

Exercise (Sudoku Solving)

Sudoku is a puzzle where you're given a 9 by 9 grid partially filled with digits. The objective is to fill the grid subject to the constraint that every row, column, and box (3 by 3 subgrid) must contain all of the digits from 1 to 9.

Exercise: N queens problem
19:14

Recall: DFS
19:14

  • Recursively explore graph, backtracking as necessary
  • Careful not to repeat

Sudoku: solution
19:14

Loading

Edge classification
19:14

  • Tree edge: visit new vertex via edge
  • Forward edge: node to descendant in a tree but not a tree edge
  • Backward edge: node to ancestor in tree, indicate a cycle
  • Cross edge: between two non-ancestor related vertices

Question

How can we know what type of edge is?

  • Tree edges: is white when we explore edge
  • Forward edge: is black when we explore edge
  • Back edge: is gray when we explore edge
  • Cross edge: is black when we explore edge

Edge classification in undirected graphs
19:14

Proposition

In an undirected graph, every edge is either a tree or a backedge.

Corollary

An undirected graph is acyclic if and only if there are no back edges

Acyclic directed graphs
19:14

Proposition

A directed graph is acyclic if and only if there are no back edges.

Topological Sort
19:14

Problem

Given a directed acyclic graph, "sort" the vertices in such a way that the edges are always pointing to the right.

Applications include:

  • Building software
  • Job scheduling
  • Select courses with prerequisites

Topological sort: example
19:14

Topological sort: algorithm and proof of correctness
19:14

  • Run DFS
  • Output vertices in decreasing order of finish time

Proposition

If , then

Topological sort: implementation
19:14

Exercise

Modify DFS to implement topological sort

Exercices
19:14

Strongly Connected Component
19:14

Definition (Strongly connected component)

Let be a directed graph. A subset is a strongly connected component if for every pair , there is a path from to and from to .

Definition (Component graph)

Let be a directed graph with connected connected . The component graph is the graph

Strongly connected components
19:14

Problem

Given a directed graph G, find its connected components

Remark

  • DFS-Visit will always visit at least the connected component of the starting point, but might visit other connected components too.
  • Between two strongly connected components, there's a one-way bridge between them.

Transpose graph
19:14

Definition (Transpose graph)

Let be a graph. Its transpose graph is the graph

Question

Why is it called the transpose graph? What happens to the adjacency matrix when we transpose the graph?

Exercise

Write a Python code that transforms a graph (given via an adjacency list) to its transpose.

Strongly connected components: algorithm
19:14

  1. Topological sort the edges of the graphs
  2. Transpose the graph
  3. Run DFS visit on unvisited nodes in the order given in Step I.

Exercise

Implement it

Strongly connected components: proof of correctness
19:14

Proposition

Assume that are the strongly connected components of a graph in the order given by the SCC algorithm.

If , then DFS-Visit will only reach the vertices in in the transpose of .

Corollary

The SSC algorithm is correct.

Weighted graph
19:14

Definition

A graph is weighted if it's given with a function .

Minimum spanning tree
19:14

Problem

Given an undirected weighted and connected graph , find a spanning subtree with smallest total weight.

Edge contraction
19:14

Definition (Edge contraction)

: Remove edge and merge the vertices it previously joined

Question

What did we do about common neighbours?

Dynamic programming solution
19:14

  1. Guess edge in a MST
  2. Contract
  3. Recurse
  4. Decontract and add to MST

Remark (Greedy)

We will remove the guessing part by proving that the greedy choice is optimal

Question

What's the time complexity?

Correctness of the DP algorithm
19:14

Proposition

Assume belongs to some MST of . If is a MST of , then is a MST of G

Sketch proof

Definitions
19:14

  • Cut: partition of the vertices in two sets
  • Edge crosses cut if or vice-versa.
  • Cut respects if no edge in crosses the cut.
  • light edge crossing a cut: minimal weight of any edge entering the cut.

Criterion for safe edge
19:14

The greedy choice works.

Proposition

Assume is a subset of some MST.

If is a cut respecting and is a light edge crossing , then is also part of some MST.

We've reduced guessing an edge to using a particular cut respecting .

Greedy Generic MST algorithm
19:14

A = {}
while A is not spanning:
    find safe edge e
    A.add(e)
return A

We'll see: Prims, Kruskal

Prim's Algorithm: Idea
19:14

  • Start at vertex
  • Cut will be the visited vertices
  • Maintain a priority queue containing crossing edges. Prioritize lighter edges.

Example

Find the MST of the following graph

Prim's Algorithm: Python implementation
19:14

Loading

Question

How would we get the tree itself?

Prim: Time complexity
19:14

Proposition

The time complexity of this algorithm is

Kruskal's algorithm
19:14

Union-Find structure
19:14

Idea

Keep track of connected components

We'll need two operations to apply Kruskal

  • Union: unites two components
  • Find: finds an element's component

Union-Find structure: Python implementation
19:14

Loading

Kruskal's implementation
19:14

Loading

Kruskal: correctness
19:14

Proposition (Kruskal: correctness)

Kruskal's algorithm returns an MST.

Question (Time complexity)

What is the time complexity?