Lab 11: Building a Maze with Disjoint Sets

Background

Disjoint Sets

Disjoint Sets are a way of solving the "Dynamic Equivalence Problem" (p. 316).
Given two elements, we must determine if they are related.
Relations:
1. a is related to a (reflexive).
2. If a is related to b, b is related to a (symmetric).
3. If a is related to b and b is related to c, a is related to c (transitive).
All elements that are related are part of the same set.
The problem is dynamic, because elements may become related if two sets are joined through a union.
Similar to family relations. If two people in the family are joined through a union, every member in both families is now in the same family (or set).

Disjoint Set Data Structure

For a group of elements, we need to be able to determine whether they are in the same set (find) and join their two sets (union).
One way to represent 8 disjoint sets is to have an array of size 8. Each array location is one of the nodes (i.e. nodes 0 through 7).
The array location holds the name of the set this node is in.
The "name" of the set will just be a designated root index. If the item is the root, then it will hold the value "-1" to specify it is the root. Thus, in the picture, 0, 1, 2, 3, and 4 are roots of their sets, and 5, 6, and 7 are all part of set 4.
Even though 7 holds "6", 6 is not the root and holds "4". Therefore a find on 4, 5, 6, or 7 would return "4".
If "union" is called on 7 and 1, then the roots of both of their sets must be combined into the same set. Either 1 will be the new root or 4 will be. If 1 is the new root, index 4 will now hold 1.

Building a Maze

Disjoint sets can be used to build a maze.
If the maze is going to be 50 by 50, we make any array of 2500 elements.
Each element represents a room in the maze.
Each room begins as root of its own set. Therefore, we have 2500 sets. Each room initially has no doorways. Simply 4 walls.
Our mazed will always begin at 0, 0 and end as size-1, size-1.
In order to assure a path from beginning to end with many possible wrong turns, we need to put doors in the walls so that every room in the maze has a path to every other room.
In order to make the maze different each time, we will randomly choose which walls to put doors in.
If we put a wall between two rooms, will we union the two rooms' sets.
We will continue until every room is in one set.

Assignment

Copy over the 7 starter files at /users/smatzko/212/public/.
Write the constructor for disj_sets to create the appropriate number of elements and initialize their values to -1 (using the vector constructor). Additionally, I suggest using the num_sets variable to keep track of the number of sets (initially "num_elem").
Write disj_sets::find to return the root of the given array index. This method is best written recursively to allow for an extra credit modification. The base case is when the index is the root. Otherwise, call find on the element at this index.
Write disj_sets::union_sets to assign the root of one set to be a child of the root of the other set. Make sure to maintain your set counter. This method should be very simple.
Write disj_sets::num_diff_sets() to return the number of sets.
Write maze::build_maze () to create the maze.
1. Create the disjoints sets.
2. Seed the random number generator with time (NULL).
3. As long as there is more than one set,
  1. Randomly choose a wall to put a door in (either an east or a south wall). e.g. 0 is an east wall, 1 is a south wall.
  2. If the wall is an east wall,
    1. room2 is room1 + 1.
  3. If the wall is a south wall,
    1. Randomly generate a room to put the south door in (make sure to pick rooms that are not in the last row). This is room1.
    2. room2 is room1 + width (go down a row).
  4. Find the name of each of the rooms' sets (using a find).
  5. If the names are not the same, 1) Union the two sets and 2) clear the wall between room1 and room2 using clear_wall. Otherwise, move on.

Union by Height (Optional)

If our set is large, we may end up stepping a long way through array indices in order to reach the root.
In order to keep from making the path to the root become longer than necessary, we will instead be careful in our unions to pick the new root in a way that minimizes the path length.
When we union two sets, it is best to make the root with the longer path to its furthest node be the new root. Otherwise, the path will get even longer.
In order to track the height of the root, we will store the height at the root's element.
If we store the height as it is, we may be confused as to whether the node is a root. (e.g. if 5 holds "3", is 5 a child of 3 or is 5 the root with a path of 3?)
Instead, we will use negative numbers to specify the height: -1 is height 0, -2 is height 1, -3 is height 2, etc.

Path Compression (Optional)

If our set is large, we may end up stepping a long way through array indices in order to reach the root.
An option to speed up the process is Path Compression.
Any nodes accessed during a find process are given the value of the root, rather than simply the next node up.
Since find is written recursively, all we need to do is assign the result of a find to the element of the index it was called on.
NOTE: if you implement Union by Height and Path Compression, technically the heights aren't always right. Therefore, when the two are together, we call it "Union by Rank."
If you do Union by Rank, you can get 5 points extra credit. It's not too hard, but you can check out pages 321-325 for help.

Turn in

Turn in all of the code, and the Makefile.
Use sendlab.212.302 11 *.o *.cpp *.h Makefile