Advanced Recursion Example

Counting Cells in a Blob
- blob_check (grid[][], x, y)

Introduction

This problem is a good illustration of the power of recursion. Its solution is relatively easy to write recursively, but an iterative solution would be much more difficult to write.

Specification

We have a two-dimensional grid of cells, each of which may be filled or empty. Filled cells that are connected horizontally, vertically or diagonally form a blob. There may be several blobs on a grid. The problem is to write a recursive routine that returns the size of the blob containing a specified cell.

For example

	(* denotes a filled cell)

                4 * *
                3   * *
             y  2     *   *
                1 *       *
                0   *   * *
                  0 1 2 3 4
                      x


	For the above figure

	blob_check(grid, 1, 3) should return a value of 5
	blob_check(grid, 0, 1) should return a value of 2
	blob_check(grid, 4, 4) should return a value of 0
	blob_check(grid, 4, 0) should return a value of 4

Design

What instance(s) of the problem can serve as the base case(s)?
For any particular cell (x, y) there are three possibilities.
- The cell (x, y) may not be on the grid.
- The cell (x, y) may be empty.
- The cell (x, y) may be filled.
In the first two cases the size of the blob containing the cell (x, y) is zero because there is no blob, so these two cases are our base cases. In the last case further computation is needed.
If the cell (x, y) is on the grid and filled, how do we define the problem in terms of one or more smaller problems of the same type?
We must define the size of the blob containing the filled cell (x, y) in terms of the size of one or more smaller blobs. If we mark the filled cell (x, y) empty the moment we visit it then we can redefine the problem as follows:
- The size of the blob containing the filled cell (x, y) is 1 + the sizes of any blobs touching the now-empty cell (x, y).
A blob touches the cell (x, y) if and only if it contains a neighbour cell of cell (x, y). Thus we can further clarify our redefinition as:
- The size of the blob containing the filled cell (x, y) is 1 + the sizes of any blobs containing a neighbour cell of the empty cell (x, y).
We have redefined the problem in terms of smaller problems of the same type. There are eight neighbour cells of the cell (x, y) and each one must be visited.
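As a small illustration of the eight neighbours, here is a minimal Python sketch. The helper name neighbours is hypothetical and is not part of the assignment; it simply lists the coordinates surrounding a cell, some of which may lie off the grid.

```python
def neighbours(x, y):
    """Return the eight cells surrounding (x, y), in no particular order.

    Some of the returned coordinates may lie off the grid; the base
    cases of the recursion are expected to deal with those.
    """
    return [
        (x + dx, y + dy)
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)  # skip the cell itself
    ]


print(neighbours(1, 3))
```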
Algorithm
```
    1 If (the cell (x, y) is off the grid) then the count is 0

    2 else if (the cell (x, y) is EMPTY) then the count is 0

    3 else if (the cell (x, y) is FILLED) then

       mark the cell (x, y) as EMPTY
       the count is 1 + the counts of the cell (x, y)'s
       eight neighbours.
```
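One possible rendering of this algorithm in Python is sketched below. It is an illustration only, not the required solution. The constants EMPTY and FILLED, the grid[x][y] list-of-lists layout and the helper example_grid are assumptions made for the sketch; example_grid rebuilds the figure from the Specification section.

```python
EMPTY, FILLED = 0, 1


def blob_check(grid, x, y):
    """Return the size of the blob containing (x, y), erasing it as we go.

    grid is indexed as grid[x][y], following the handout's notation.
    """
    # Base case 1: off the grid. Tested first so that grid[x][y] is
    # never evaluated for an out-of-range cell.
    if x < 0 or x >= len(grid) or y < 0 or y >= len(grid[x]):
        return 0
    # Base case 2: the cell is empty.
    if grid[x][y] == EMPTY:
        return 0
    # Recursive step: erase the cell BEFORE visiting its neighbours.
    grid[x][y] = EMPTY
    count = 1
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) != (0, 0):
                count += blob_check(grid, x + dx, y + dy)
    return count


def example_grid():
    """Build the 5x5 grid from the figure, indexed as grid[x][y]."""
    filled = [(0, 4), (1, 4), (1, 3), (2, 3), (2, 2), (4, 2),
              (0, 1), (4, 1), (1, 0), (3, 0), (4, 0)]
    grid = [[EMPTY] * 5 for _ in range(5)]
    for x, y in filled:
        grid[x][y] = FILLED
    return grid


# A fresh grid per call, because each call erases the blob it counts.
for x, y in [(1, 3), (0, 1), (4, 4), (4, 0)]:
    print((x, y), blob_check(example_grid(), x, y))
```

Python limits recursion depth (1000 frames by default), so this sketch is suitable for small grids like the example rather than very large blobs.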
As the problem size diminishes will you reach the base cases?
Every time the routine visits a filled cell it marks it EMPTY before it visits its neighbours, reducing the size of the blob by one. Eventually all of the filled cells in the blob will be marked EMPTY and the routine will encounter nothing but base cases.
How is the solution from the smaller problem used to build a correct solution to the current larger problem?
One plus the sum of the values returned from the eight recursive calls to the neighbour cells is the size of the blob containing the current cell. The 1 accounts for the current cell itself.

Note

A side effect of the blob_check routine is that all the elements of the blob containing the cell (x, y) are set to EMPTY. In other words the blob is erased.
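If the caller needs the grid afterwards, one option is to run the count on a copy. The sketch below assumes a Python blob_check operating on a grid[x][y] list of lists; the wrapper name blob_size is hypothetical and not part of the assignment.

```python
import copy

EMPTY, FILLED = 0, 1


def blob_check(grid, x, y):
    """Destructive count: erases the blob containing (x, y)."""
    if not (0 <= x < len(grid) and 0 <= y < len(grid[x])):
        return 0
    if grid[x][y] == EMPTY:
        return 0
    grid[x][y] = EMPTY
    return 1 + sum(
        blob_check(grid, x + dx, y + dy)
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )


def blob_size(grid, x, y):
    """Hypothetical wrapper: count on a deep copy so grid is unchanged."""
    return blob_check(copy.deepcopy(grid), x, y)


grid = [[FILLED, FILLED], [EMPTY, FILLED]]
print(blob_size(grid, 0, 0), grid)   # the grid is left intact
print(blob_check(grid, 0, 0), grid)  # the blob has been erased
```

Copying costs time and memory proportional to the grid size on every call, which is acceptable for small grids like the example.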

If the cell visited is off the grid or EMPTY, blob_check returns 0 immediately. Otherwise the recursive step executes.

blob_check calls itself eight times, each time visiting a different neighbour of the current cell. The cells are visited in a clockwise manner starting with the neighbour above and to the left.

Common Errors

The sequence of statements executed in the routine blob_check is very important.

The routine blob_check() tests whether the cell (x, y) is on the grid before testing if it is empty. If the order were reversed then the condition ( grid[x][y] = EMPTY ) would reference an out-of-range element whenever the cell (x, y) was off the grid.
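The ordering of the two tests can be sketched in Python as follows. The helper names on_grid and is_filled are hypothetical. Note that in Python specifically, an index that is too large raises an IndexError, but a negative index silently wraps around to the other end of the list, so skipping the bounds test there can return a wrong answer rather than fail.

```python
EMPTY, FILLED = 0, 1


def on_grid(grid, x, y):
    """True if (x, y) is a valid cell of grid (indexed grid[x][y])."""
    return 0 <= x < len(grid) and 0 <= y < len(grid[x])


def is_filled(grid, x, y):
    """Bounds test first: `and` short-circuits, so grid[x][y] is only
    evaluated when the cell is known to be on the grid."""
    return on_grid(grid, x, y) and grid[x][y] == FILLED


grid = [[FILLED, EMPTY], [EMPTY, FILLED]]
print(is_filled(grid, 2, 0))    # x is out of range
print(is_filled(grid, -1, -1))  # off the grid, although grid[-1][-1] wraps to a FILLED cell
```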

The statement grid[x][y] <- EMPTY; sets up the conditions for solving a smaller version of the same problem, and consequently precedes the recursive calls.

If this statement were not executed first, the cell (x, y) would be counted more than once, since it is a neighbour of each of its eight neighbours. In fact a much worse problem would occur. When each filled neighbour of the cell (x, y) is visited, blob_check() is called again with the coordinates of the cell (x, y) as arguments. Thus if the cell (x, y) were still FILLED, the recursive step would be executed erroneously and an infinite sequence of calls would be generated.
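The failure can be reproduced with a deliberately broken Python variant that never marks the cell EMPTY. The function name blob_check_unmarked is hypothetical, and the grid[x][y] layout is an assumption of the sketch. In Python the runaway calls end with a RecursionError once the interpreter's recursion limit is reached.

```python
EMPTY, FILLED = 0, 1


def blob_check_unmarked(grid, x, y):
    """Deliberately broken: the cell is never marked EMPTY."""
    if not (0 <= x < len(grid) and 0 <= y < len(grid[x])):
        return 0
    if grid[x][y] == EMPTY:
        return 0
    count = 1
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) != (0, 0):
                count += blob_check_unmarked(grid, x + dx, y + dy)
    return count


grid = [[FILLED, FILLED]]  # two adjacent filled cells: (0, 0) and (0, 1)
try:
    blob_check_unmarked(grid, 0, 0)
except RecursionError:
    print("the two cells kept calling each other until the stack ran out")
```

A single isolated filled cell does not trigger the problem, because all of its neighbours are base cases; it takes at least two adjacent filled cells.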

What to Do

Write a program to solve this problem recursively. In addition, your program should read the grid data from a file. The file will be structured as follows: the first line will contain two integers, the number of rows followed by the number of columns. Subsequent lines will contain the actual data, coded as 0's (empty) and 1's (full). The above data would be coded in a file as follows:

Note that the rows are presented from 0 to 4 in the file, rather than from 4 to 0 in the pictorial representation above.
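A minimal Python sketch of reading this format follows. The SAMPLE text is reconstructed from the figure in the Specification section, and its space-separated layout is an assumption; the parser therefore accepts cell values either separated by spaces or run together. The function names parse_grid and read_grid are hypothetical, and the result is stored as grid[x][y].

```python
EMPTY, FILLED = 0, 1

# Reconstructed from the figure; the separator between cells is a guess.
SAMPLE = """\
5 5
0 1 0 1 1
1 0 0 0 1
0 0 1 0 1
0 1 1 0 0
1 1 0 0 0
"""


def parse_grid(text):
    """Parse the file format described above into grid[x][y]."""
    tokens = text.split()
    rows, cols = int(tokens[0]), int(tokens[1])
    # Joining the remaining tokens accepts both "0 1 0" and "010".
    digits = "".join(tokens[2:])
    if len(digits) != rows * cols:
        raise ValueError(
            "expected %d cells, found %d" % (rows * cols, len(digits)))
    grid = [[EMPTY] * rows for _ in range(cols)]
    for y in range(rows):      # the i-th data line of the file is y = i
        for x in range(cols):
            grid[x][y] = int(digits[y * cols + x])
    return grid


def read_grid(path):
    """Read and parse a grid file from disk."""
    with open(path) as f:
        return parse_grid(f.read())


grid = parse_grid(SAMPLE)
print(grid[1][3], grid[4][4])
```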

Your program should simply output the values for each square in the grid. I will put some sample files in the course materials folder.

Put your project in the drop box with the usual naming restrictions.