This is an example of how color segmentation can be used to extract useful
information from a map. In this case we will delimit county boundaries in the
following map. Once again we will use the ideas described in this wonderful
tutorial to find the appropriate color ranges and segment the features we need.
Click here to access the original notebook.
The map shown above is composed of 12 counties delimited by boundaries that can
have different colors. In addition, the counties contain regions, such as lakes,
that make the segmentation a bit trickier.
The following code converts the map to HSV space and rescales the pixel values
(e.g. by normalizing them) to make them easier to plot. It’s usually better to
plot the color data using a script run from the terminal, because the resulting
3D plot can then be rotated, which makes it easier to find value ranges for the
target colors. A color_plot script is available in the twoisprime package.
As you can see in the plot above, the map is ideal for color segmentation.
Colors are isolated and they seem to be easily separable. The next step is to
select the relevant colors and find the value ranges. For this we need to
inspect the plot, pick an approximate range, and refine it by checking the
result. By examining the map we can determine that county boundaries can contain
the colors pink, red/brown, yellow, blue/lake, gray, and green. We segment these
colors in the next few cells.
Combining all color segmentations and dilating the result produces a mask of the
county boundaries. Note that the red/brown color mix is treated as a secondary
layer, because it counts as a county border only when it appears together with
another color layer. Dilation kernel sizes are chosen relative to the image
width so that they remain valid for similar maps of a different size.
Thresholding the blended color layers at an intermediate value removes the
red/brown areas that are not part of the county boundaries.
The final step is to find the contours, filter them by area, and display the
results. The area limits are again chosen relative to the image width so that
similar maps of different sizes can be processed.