# Wrapping C with Python: 3D image segmentation with region growing

Image segmentation with region growing is simple and can be used as an initialization step for more sophisticated segmentation methods. In this note, I’ll describe how to implement a region growing method for 3D image volume segmentation (note: the code here can be applied, without modification, to 2D images by adding an extra axis to the image) that uses a single seed point and uses a neighborhood average inclusion criteria. First, we’ll implement the algorithm in Python, and next, I’ll show how to implement the algorithm in C, wrapping the C code using `f2py`

in order to make it callable from Python. Implementing the code in C will give us big performance boosts, while wrapping the code in Python will gives all the convenience of scripting in Python.

Disclaimer: I make no claim that the presented code is robust or error-free :grinning:

#### Table of Contents

## Pseudocode

First, let’s describe the algorithm in some pseudocode:

We’ve glossed over a few details that will appear in the actual implementation, but in a nutshell, the pseudocode says that whenever the image value at a point is above (or equal to) the average image value in the neighborhood of that point, we set the corresponding point in segmentation to “True” and add all the neighbors of that point to a stack to be checked later. When there’s no points left to check, the boolean segmentation volume will (hopefully, roughly) indicate where the object is in the image volume.

It’s worth mentioning at this point that using the neighborhood average as the inclusion criteria will have a rather large effect in the final segmentation outcome. Indeed, we are assuming the object is “bright” relative to its surroundings. Generally speaking, this inclusion criteria should be guided by the application domain at hand and by the interpretation of actual image values. For example, in CT scans the image values correspond to Hounsfield units (which correspond to tissue density), and thus our region growing routine with the inclusion criteria as is would find connected regions of high density for CT data.

## Generating an image volume for testing

We’ll generate a synthetic volume for testing, rather than depending on some external data source. Specifically, we’ll use implicit surfaces of the form,

and we’ll use ranges of $k$ to define regions of higher and lower image values. Wolfram MathWorld calls this surface a tangle cube for $k=-11.8$.

The code below creates an image volume that has value 1 in regions where $5 \leq k \leq 20$ and where $k \leq -11$, and has image volume -1 otherwise. We also view the two regions using mayavi.

The figure on the right illustrates the two regions in the image volume. The light blue region corresponds to the outer shell $(5 \leq k \leq 20)$ that encloses the inner region $(k \leq -11)$. So, we can see that by planting the seed for the region growing in the inner structure, we should (hopefully) obtain only the inner structure since the inner region is not connected to the outer shell. Note that in the code above, we’ve also used different numbers of points to sample along the x, y, and z axes to reveal any potential flaws in this aspect of the implementation.

As another means of looking at this test volume, we could look at cuts through the volume, using the following code, for example:

## Python implementation

Now then, let’s translate the pseudocode to Python (always a relatively easy task!):

This python code works just like the pseudocode, but we have taken care of a couple of extra details:

- A point might be placed in the stack multiple times before its checked. So, if the point has already been checked, we add a check to just continue to the next iteration of the
`while`

loop. - We handle the special cases at the borders with an ugly six lines of code (
`imin`

,`imax`

, etc.).

The code uses a `get_nbhd`

function which is not yet written. This function adds neighboring voxels when necessary. It accepts the current voxel (`pt`

), the indicator array of checked points (`checked`

), and the dimensions of the volume (`img.shape`

). The `checked`

indicator is of course necessary so we don’t add points that have already been checked, and the volume dimensions argument is necessary so we don’t add points beyond the image borders.

Next, there is a connectivity choice for adding neighbors of a voxel, i.e., the choice of which voxels are considered the neighbors of a given voxel. This is determined by the `get_nbhd`

function and should not be confused with the neighborhood that is used for computing the average image value about a voxel (the latter usually being a comparatively larger region). Rather, this neighborhood defines the coordinates (i.e., voxels) that are adjacent to a given coordinate. There are different ways to define the neighborhood on a 3D lattice, and I use a 6-voxel neighborhood in the code below:

### Testing the Python implementation

So, we have the test data, and now let’s finally test the python implementation above. Supposing the `grow`

function is defined in file called `region_growing_python`

:

Upon running the above code, you should see something like:

and a corresponding image of the inner structure that we hoped to pick out:

Alternatively, we could set `seed = (45,38,35)`

and pick out the outer shell:

So, it appears to be working!

## C Implementation

We’ll start by writing the implementation in C, and then we’ll wrap the C implementation in Python by using `f2py`

.

### A simple stack structure

First, we need to write a minimal stack data structure using a singly linked list.

So, we’ll create a file `stack.h`

containing the following:

… and a quick test program to make sure the stack is working correctly (call it `stack_check.c`

) …

which gives (after `gcc stack_check.c stack.h && ./a.out`

)

Or maybe more importantly, running `valgrind ./a.out`

gives:

:+1: moving on …

### The region growing function in C

Let’s now create a file called `region_growing.c`

. At the top, we’ll include the necessary libraries

Next, we’ll write a function that maps a 3-tuple index to a “flat” index into the 3D array stored in row-major given the dimensions of the 3D volume:

This function is a perfect candidate for inlining, so we’ve done so.

Next, we write the analog of `get_nbhd`

from the Python implementation. This function was already pretty verbose in the Python version and becomes even more so (unsurprisingly) in the C version:

Moving along, in the C version, we’ll separate out the inclusion criteria check (which we did not do for Python version). This function could be overwritten to use all kinds of inclusion checks, but the main reason for separating out *here* is because we have to compute the neighborhood average of image values ourselves, rather than relying on built-in NumPy functions.

Almost there. Lastly, we write the `grow`

method, which actually implements the region growing procedure:

### Writing the `f2py`

signature file

Next, we should write a “signature” file so `f2py`

knows the names of things and what the types of the arguments are. Create a file called `region_growing.pyf`

:

We made the dimensions of the `img`

array implicit by making the variables `m,n,p`

hidden, and by telling `f2py`

that they should be extracted from the provided `img`

argument. Why do we use type integer for `seg`

instead of boolean? In my experience, `f2py`

can be a little funny when passing boolean arrays to Fortran or C, so it’s easier to just let `seg`

be integer type. This is why in the `grow`

C routine, we let `checked`

be a boolean-valued array, but `seg`

was integer type. The reason here was that `checked`

was local to the C program and not passed to Python.

### Makefile

Finally, let’s write a `make`

file that calls `f2py`

:

Running `make`

from the terminal creates a `region_growing.so`

module that can by imported from Python.

### Testing the Python-wrapped C implementation

Assuming `vol`

and other variables defined as before, let’s test it out:

If all goes correctly, you should observe something like:

$1.189 / 0.074 \approx 16$. That’s a speed-up of nearly a factor of 16! Of course, this factor will change if we use a image neighbor value larger than 5 as was tested here. Something to note is that in the above, the array returned by the C-wrapped function will actually be of type `int8`

. For reasons that are unknown to me, `f2py`

doesn’t care for accepting boolean types from NumPy and requires `int8`

instead. Generally, you have to be pretty careful when passing different types using `f2py`

because you can observe strange results that might not appear as blatant errors.

In the figure below, I’ve plotted the run times for the Python and C codes (and their ratio) averaged over 10 runs for image neighborhood radius size of $1,5,10,15,$ and $25$ (I’m using the term radius, but we really mean cube-half-length). This means at each coordinate where we check for inclusion, we must compute the average over a volume of size $(2 \cdot 1+1)^3, (2\cdot 5 + 1)^2$, etc. Notable speed increases happen within the 1 to 15 pixel radius range. For larger neighborhood sizes, the averaging operation dominates the code, and since the NumPy average is implemented in C, it is not too surprising that the relative speed increase obtained by the pure C code begins to vanish.

Also, note that for very large image averaging neighborhood sizes, region growing is essentially equivalent to first performing a global threshold (because every “neighborhood” includes the entire image), followed by a connected component analysis from the chosen seed point.