The Watershed algorithm is an image segmentation technique that separates an image into regions based on gradient information. It is especially useful when you want to separate close or touching objects in an image. The approach treats the image as a topographic landscape, in which valleys (local minima) correspond to the regions to be segmented and ridges (local maxima) correspond to the separations between them. The algorithm fills these “basins” with water, starting from the local minima, and the lines where water from different basins would meet become the boundaries between regions.
The Watershed algorithm
The Watershed algorithm is based on conceptualizing the image as a topographic surface, where pixel intensity represents height. The goal is to separate regions of the image as if they were catchment basins: water rises from the lowest points towards the highest points, and the lines where neighboring basins meet are the watersheds.
Formally, we can divide the Watershed algorithm into different steps, each of which has a mathematical basis.
1. Image Gradient:
Before applying the Watershed algorithm, we calculate the gradient of the image to identify the transition regions. The image gradient, denoted by \( \nabla I \), can be calculated using derivative operators, such as the Sobel or the Scharr operator:
\[ \nabla I = \left( \frac{\partial I}{\partial x}, \frac{\partial I}{\partial y} \right) \]
where \( \partial I / \partial x \) and \( \partial I / \partial y \) represent the partial derivatives with respect to the coordinates \( x \) and \( y \).
2. Image of Labels:
Next, we calculate the local minima of the gradient image to obtain the starting points for the Watershed algorithm. These local minima indicate the initial “basins” from which we will begin to fill the image with water.
3. Distance Transform and Markers:
The distance transform of the label image creates an image in which each pixel indicates the distance between that pixel and the closest starting point.
These turn out to be the markers for the Watershed algorithm.
4. Application of Watershed:
The Watershed algorithm floods the image starting from these markers, simulating the filling of the “basins”. The water from neighboring basins meets at the boundary regions, and the algorithm returns the edges of these regions.
The result of applying Watershed with markers \( M \) to the original intensity image \( I \) is a label image: every pixel receives the label of the marker whose basin floods it, and the pixels where two basins meet form the watershed lines.
This process simulates the filling of the image “basins”, allowing robust segmentation of image regions based on intensity gradients.
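To make the flooding idea concrete, here is a toy priority-flood sketch in plain Python and NumPy. It is not OpenCV's implementation: the function name flood_from_markers is hypothetical, it uses 4-connectivity, and it does not draw explicit watershed lines. It only shows how labels grow from the markers, lowest pixels first.

```python
import heapq
import numpy as np

def flood_from_markers(height, markers):
    """Toy priority-flood: grow each marker outwards, lowest pixels first."""
    labels = markers.copy()
    rows, cols = height.shape
    heap = []
    counter = 0  # tie-breaker: equal heights are processed in insertion order
    for r, c in zip(*np.nonzero(markers)):
        heapq.heappush(heap, (int(height[r, c]), counter, int(r), int(c)))
        counter += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and labels[nr, nc] == 0:
                # The unlabeled neighbor joins the basin that reached it first.
                labels[nr, nc] = labels[r, c]
                heapq.heappush(heap, (int(height[nr, nc]), counter, nr, nc))
                counter += 1
    return labels

# One-row "landscape": two valleys separated by a ridge of height 9.
height = np.array([[1, 2, 5, 9, 4, 1, 0]])
markers = np.array([[1, 0, 0, 0, 0, 0, 2]])  # one marker in each valley

labels = flood_from_markers(height, markers)
print(labels.tolist())  # [[1, 1, 1, 2, 2, 2, 2]]
```

The two labels meet at the ridge, which is where a full implementation would place the watershed line.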
The OpenCV approach with Markers
Mathematical formalism is one thing and practice is another. If we apply this algorithm directly to an arbitrary image, we risk not obtaining the desired results. In fact, we will usually obtain far more segments than there really are (the over-segmentation phenomenon). This effect is often due to irregularities in the image or to background noise, each of which creates its own small basin.
A solution to this problem is to implement the Watershed algorithm based on markers, which have the function of indicating which valleys will be “unified” and which will not. Therefore this approach is not totally automatic, but requires the interaction of the person carrying out the analysis.
From a practical point of view, this operation consists of assigning labels to different parts of the image. The subjects to be segmented are labeled as foreground, while the background areas or the subjects that are not of interest to us are labeled as background. A convenient way to do this is to give each subject its own positive integer label (1, 2, 3…). Note that OpenCV reserves the label 0 for areas whose membership is still unknown, so the background also needs its own positive label, as the code further below does.
Python implementation of Watershed algorithm with OpenCV
To use the Watershed algorithm with OpenCV in Python, you need to follow some basic steps. Let’s start with a simple image like the following. Download it to your PC and save it as simple.jpg.
First, let’s load the image and convert it to grayscale.
import cv2
import numpy as np
from matplotlib import pyplot as plt
image = cv2.imread('simple.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# gray has a single channel, so no BGR-to-RGB conversion is needed
plt.imshow(gray, cmap='gray')
plt.show()
When displaying an image produced by OpenCV, keep in mind that OpenCV reads color images in BGR (Blue, Green, Red) order, while Matplotlib expects RGB (Red, Green, Blue). A color image therefore has to be converted with cv2.cvtColor(image, cv2.COLOR_BGR2RGB) before display so that the colors appear correctly. A grayscale image has only one channel, so it is passed to plt.imshow directly with cmap='gray'. Executing the code displays the gray image.
Now that we have loaded the color image and converted it to grayscale, we can binarize it and remove the noise in order to isolate the objects from the background.
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
# noise removal
kernel = np.ones((3,3),np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 2)
# opening is a single-channel binary image, so display it with a gray colormap
plt.imshow(opening, cmap='gray')
plt.show()
Let’s start from the first line: ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU). Here, we convert the grayscale image to a binary image using Otsu's binarization technique. In practice, this technique automatically determines the optimal threshold value by separating the image pixels into two classes; the value chosen is returned in ret. Because of the cv2.THRESH_BINARY_INV flag, pixels above the threshold turn black and those at or below it turn white, thus creating an inverted binary image.
Next, a 3×3 kernel is defined which will be used for morphological operations. In our case, the kernel is simply an array of 1 values. After that, a morphological opening operation is applied to the binarized image. Opening is a combination of erosion (which reduces the size of objects in the image) followed by dilation (which widens the edges of objects). This process helps remove noise in the image and separate objects that may be connected to each other.
Finally, the result is shown in the viewing window.
Now we will process the image to divide it into clearly identifiable regions that are part of the objects of interest (foreground) and unknown regions that require further processing with the Watershed algorithm for more precise segmentation.
# Finding sure foreground area
dist_transform = cv2.distanceTransform(opening,cv2.DIST_L2,5)
ret, sure_fg = cv2.threshold(dist_transform,0.3*dist_transform.max(),255,0)
# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(opening,sure_fg)
# unknown is a single-channel binary image, so display it with a gray colormap
plt.imshow(unknown, cmap='gray')
plt.show()
With the line dist_transform = cv2.distanceTransform(opening,cv2.DIST_L2,5) you calculate the distance transform of the image obtained after the opening operation. The distance transform gives each foreground (non-zero) pixel a value equal to its distance from the closest background (zero) pixel. Pixels deep inside an object therefore get high values, while pixels near its edge get low values.
Then, ret, sure_fg = cv2.threshold(dist_transform,0.3*dist_transform.max(),255,0) applies a threshold on the distance transform image to identify areas that are definitely part of the object of interest (foreground). The threshold value is calculated as 30% of the maximum value of the distance transform, so only pixels far enough from the background are kept. This gives us regions that belong to the objects without ambiguity.
Next, sure_fg = np.uint8(sure_fg) converts the sure foreground image to the uint8 data type, the same type as opening (the distance transform produces floating-point values). The following line, unknown = cv2.subtract(opening,sure_fg), identifies the unknown region of the image by subtracting the sure foreground areas from the image after the opening operation. This step gives us the areas of the image that could be part of an object but have not yet been labeled as such.
Now we have all the elements to apply the actual Watershed algorithm, with the marker labeling procedure.
# Marker labelling
ret, markers = cv2.connectedComponents(sure_fg)
# Add one to all labels so that sure background is not 0, but 1
markers = markers+1
# Now, mark the region of unknown with zero
markers[unknown==255] = 0
markers = cv2.watershed(image,markers)
image[markers == -1] = [0,0,255]
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.title('Watershed Segmentation')
plt.show()
Marker Labeling is done in the first part of the code:
- ret, markers = cv2.connectedComponents(sure_fg): The different objects in the image are labeled here. cv2.connectedComponents assigns a unique label to each object in the sure_fg binary image, which represents the safe foreground area identified previously. This step splits the image into several connected regions, where each region has a different label.
- markers = markers+1: Adds one to all labels so that the sure background gets the label 1 instead of 0. This matters because cv2.watershed treats 0 as “unknown”: pixels labeled 0 are the ones the algorithm is asked to assign to a region.
- Next, unknown regions are marked, i.e. those parts of the image where we are not sure whether they belong to the background or to the objects of interest. These are the areas obtained as the difference between the result of the morphological opening (opening) and the sure foreground regions (sure_fg), and they are stored in the unknown array.
- markers[unknown==255] = 0: Here the unknown area is marked by setting the corresponding labels in the markers array to 0.
The Watershed transformation is then carried out:
- markers = cv2.watershed(image,markers): This is the key step of the Watershed algorithm. The Watershed algorithm is applied to the original image using the markers obtained from the previous step. The Watershed algorithm divides the image into regions, using markers to determine the boundaries between the regions. The result is an image in which objects are delimited by outlines.
- image[markers == -1] = [0,0,255]: Here we color the pixels that the algorithm marked as watershed lines (label -1) in the original image. Since the image is in BGR order, [0, 0, 255] is red. This helps to clearly visualize the boundaries between the segmented regions.
Finally, the resulting image is displayed using Matplotlib with the title “Watershed Segmentation” to highlight the segmentation process.
As we can see in the image above, the regions covered by the blue circles have been correctly identified and are outlined in red.
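Beyond drawing the outlines, the label array returned by cv2.watershed can be used to count the objects and measure them. The sketch below uses a small hand-written array as a simplified stand-in for that output, following the labeling convention used above (-1 for watershed lines, 1 for background, larger values for objects).

```python
import numpy as np

# Simplified, hand-written stand-in for the array returned by cv2.watershed.
markers = np.array([
    [ 1,  1,  1,  1,  1,  1],
    [ 1,  2,  2, -1,  3,  1],
    [ 1,  2,  2, -1,  3,  1],
    [ 1,  1,  1,  1,  1,  1],
], dtype=np.int32)

# Every label greater than 1 is one segmented object.
object_labels = [int(l) for l in np.unique(markers) if l > 1]

# Area (in pixels) of each object, from a boolean mask per label.
areas = {l: int((markers == l).sum()) for l in object_labels}

print("number of objects:", len(object_labels))
print("areas:", areas)
```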
Conclusion
This is just a simple example of how the Watershed algorithm can be applied. More complex images clearly require much more elaborate preliminary processing, which varies from case to case. The goal of this article, however, is to illustrate with a short example how the Watershed algorithm can be a valid tool for image segmentation, especially when it comes to separating touching or overlapping objects. By using Python together with OpenCV, you can implement this algorithm with only a few lines of code.