Rolling ball background subtraction algorithm for OpenCV

Is there an OpenCV (android) implementation of "rolling ball" background subtraction algorithm found in ImageJ: Process->Subtract Background?
OpenCV has a BackgroundSubtractorMOG class, but it is used for video streams not single, independent images.
This is an example of what this method does:
http://imgur.com/8SN2CFz
Here is the documentation of the process: http://imagejdocu.tudor.lu/doku.php?id=gui:process:subtract_background

There's no implementation in the OpenCV C libraries that I know of, and the Android JNI wrappers are just that: wrappers around the main libraries.
Having said that, the source code for the ImageJ implementation is available online here, so you should be able to incorporate it directly into your Android image processing pipeline.
There is some discussion about the relative merits of rolling ball vs. e.g. using a disk structuring element (which is available in OpenCV) here.
If you absolutely require Rolling Ball and OpenCV then unfortunately it's not available 'out of the box'.
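That said, here is a minimal sketch of the disk-structuring-element alternative mentioned above, using plain OpenCV morphology. The file name and radius are assumptions, and a flat disk only approximates the (non-flat) rolling ball:
import cv2

img = cv2.imread('cells.tif', cv2.IMREAD_GRAYSCALE)  # hypothetical input path
radius = 30  # comparable to a rolling-ball radius
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * radius + 1, 2 * radius + 1))
# Grayscale opening with a flat disk estimates the slowly varying background
background = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
corrected = cv2.subtract(img, background)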

There is a recent rolling-ball implementation built on OpenCV that you can find here:
https://pypi.org/project/opencv-rolling-ball/
In short, install it with:
pip install opencv-rolling-ball
Example:
import cv2
from cv2_rolling_ball import subtract_background_rolling_ball
img = cv2.imread('path/to/img.tif', 0)
img, background = subtract_background_rolling_ball(img, 30, light_background=True, use_paraboloid=False, do_presmooth=True)

Building on @Xenthor's answer, this is what I came up with:
import numpy as np
import scipy.ndimage as ndi
from scipy.ndimage._ni_support import _normalize_sequence

def rolling_ball_filter(data, ball_radius, spacing=None, top=False, **kwargs):
    """Rolling ball filter implemented with morphology operations

    This implementation is very similar to that in ImageJ and uses a top hat
    transform with a ball-shaped structuring element.
    https://en.wikipedia.org/wiki/Top-hat_transform

    Parameters
    ----------
    data : ndarray
        image data (assumed to be on a regular grid)
    ball_radius : float
        the radius of the ball to roll
    spacing : int or sequence
        the spacing of the image data
    top : bool
        whether to roll the ball on the top or bottom of the data
    kwargs : key word arguments
        these are passed to the ndimage morphological operations

    Returns
    -------
    data_nb : ndarray
        data with background subtracted
    bg : ndarray
        background that was subtracted from the data
    """
    ndim = data.ndim
    if spacing is None:
        spacing = 1
    spacing = _normalize_sequence(spacing, ndim)

    radius = np.asarray(_normalize_sequence(ball_radius, ndim))
    # Build a ball-shaped (non-flat) structuring element on the pixel grid
    mesh = np.array(np.meshgrid(*[np.arange(-r, r + s, s) for r, s in zip(radius, spacing)], indexing="ij"))
    structure = 2 * np.sqrt(1 - ((mesh / radius.reshape(-1, *((1,) * ndim)))**2).sum(0))
    structure[~np.isfinite(structure)] = 0

    if not top:
        # ndi.white_tophat(data, structure=structure, output=background)
        background = ndi.grey_erosion(data, structure=structure, **kwargs)
        background = ndi.grey_dilation(background, structure=structure, **kwargs)
    else:
        # ndi.black_tophat(data, structure=structure, output=background)
        background = ndi.grey_dilation(data, structure=structure, **kwargs)
        background = ndi.grey_erosion(background, structure=structure, **kwargs)

    return data - background, background
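A hypothetical usage sketch (the file name, radius, and spacing here are assumptions):
import cv2

img = cv2.imread('cells.tif', cv2.IMREAD_GRAYSCALE).astype(float)
corrected, bg = rolling_ball_filter(img, ball_radius=50, spacing=1)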

Edit: Before using the method in this post read the comments below and also consider the answers of @renat and @David Hoffman.
In case someone is still looking for rolling-ball background correction in Python: for me, the following worked out very well.
1. Load the image and process each channel separately.
2. Create a weighted ball structuring element.
3. Use a white top-hat transform.
Here is some code for a monochrome image:
import scipy.ndimage as scim
from scipy.misc import imsave
from skimage.morphology import ball
# Read image
im = scim.imread("path")[:, :, 0].astype(int)
# Create 3D ball with radius of 50 and a diameter of 2*50+1
s = ball(50)
# Take only the upper half of the ball
h = s.shape[1] // 2 + 1 # 50 + 1
# Flatten the 3D ball to a weighted 2D disc
s = s[:h, :, :].sum(axis=0)
# Rescale weights into 0-255
s = (255 * (s - s.min())) / (s.max() - s.min())
# Use im-opening(im,ball) (i.e. white tophat transform) (see original publication)
im_corr = scim.white_tophat(im, structure=s)
# Save corrected image
imsave('outfile', im_corr)
This does not give exactly the same result as the ImageJ implementation, but the results are quite similar. In my case some regions were corrected better and others worse, and the overall color intensity was higher.

The original algorithm that ImageJ implements comes from a 1983 paper https://www.computer.org/csdl/magazine/co/1983/01/01654163/13rRUwwJWBB. I took a look at it and it is actually a grayscale morphological white top-hat with a ball-shaped grayscale structuring element (see https://en.wikipedia.org/wiki/Top-hat_transform). In the ImageJ implementation (available here https://imagej.nih.gov/ij/developer/source/ij/plugin/filter/BackgroundSubtracter.java.html), the image is downsampled depending on the structuring element's radius, then upsampled to the original resolution and, by default, a smoothing operation using a 3x3 mean filter is applied before computing the background to subtract. This likely explains the differences observed with the method proposed by Xenthor.
If you are working on Android, you have several options: 1) using the ImageJ library, since it is in Java; you will however need to implement an OpenCV-ImageJ image bridge; 2) if you work in C++ using the Android NDK, since OpenCV does not implement grayscale morphology for non-flat structuring elements, you can use ITK (https://itk.org/) instead to perform the grayscale white top-hat; 3) still using the NDK, there is an OpenCV-based C++ port of the algorithm available here: https://github.com/folterj/BioImageOperation/tree/master/BioImageOperation, however it is still a work in progress.
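For illustration, a hedged sketch of the shrink/smooth/enlarge idea described above using OpenCV primitives. The scale factor, disk size, and file name are assumptions, and note this uses a flat disk rather than a ball, since OpenCV lacks non-flat grayscale morphology:
import cv2

img = cv2.imread('input.tif', cv2.IMREAD_GRAYSCALE)
small = cv2.resize(img, None, fx=0.25, fy=0.25, interpolation=cv2.INTER_AREA)
smoothed = cv2.blur(small, (3, 3))  # 3x3 mean pre-smoothing, as ImageJ applies by default
disk = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
bg_small = cv2.morphologyEx(smoothed, cv2.MORPH_OPEN, disk)
bg = cv2.resize(bg_small, (img.shape[1], img.shape[0]), interpolation=cv2.INTER_LINEAR)
corrected = cv2.subtract(img, bg)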

I realize it's not opencv, but there is an implementation in scikit-image (version ≥ 0.18).
from skimage import data, restoration
image = data.coins()
background = restoration.rolling_ball(image, radius=100)
result = image - background
A more detailed walkthrough is provided in the documentation.
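If the background is light rather than dark (the light_background case in ImageJ), the usual trick is to invert first; a short sketch, reusing the image variable from above:
from skimage import util

image_inv = util.invert(image)
background_inv = restoration.rolling_ball(image_inv, radius=100)
result = util.invert(image_inv - background_inv)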

Related

Detect handwritten characters in boxes from a filled form using Fourier transforms

I am trying to extract handwritten characters from boxes. The scanning of the forms is not consistent, so the width and height of the boxes are also not constants.
Here is a part of the form.
My current approach:
1. Extract horizontal lines
2. Extract vertical lines
3. Combine both the above images
4. Find contours (using OpenCV)
This approach gives me most of the boxes. But when a box is filled with characters like "L" or "I", the vertical stroke of the character is also extracted during the vertical-line extraction, so the contours get messed up.
Since the boxes are arranged periodically, is there a way to extract the boxes using Fast Fourier transforms?
I recently came up with a Python package that deals with this exact problem.
I called it BoxDetect; install it with:
pip install boxdetect
Usage may look somewhat like this (you need to adjust the parameters for different forms):
from boxdetect import config
config.min_w, config.max_w = (20,50)
config.min_h, config.max_h = (20,50)
config.scaling_factors = [0.4]
config.dilation_iterations = 0
config.wh_ratio_range = (0.5, 2.0)
config.group_size_range = (1, 100)
config.horizontal_max_distance_multiplier = 2
from boxdetect.pipelines import get_boxes
image_path = "dumpster/m1nda.jpg"
rects, grouped_rects, org_image, output_image = get_boxes(image_path, config, plot=False)
You might want to check below thread for more info:
How to detect all boxes for inputting letters in forms for a particular field?
The Fourier transform is the last thing I would think of.
I'd rather try a Hough line detector to get the long lines, or, as you did, edge detection, but then reconstruct the grid explicitly: find its pitch and the exact locations of the rows/columns, and hence every individual cell.
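A hedged sketch of that long-line Hough idea (the thresholds, lengths, and file name are assumptions to tune per form):
import cv2
import numpy as np

img = cv2.imread('form.jpg', cv2.IMREAD_GRAYSCALE)  # hypothetical path
edges = cv2.Canny(img, 50, 150)
# A large minLineLength rejects character strokes and keeps only grid lines
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                        minLineLength=200, maxLineGap=5)
vis = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(vis, (x1, y1), (x2, y2), (0, 0, 255), 2)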
You can try selecting the handwritten characters by color.
Example:
import cv2
import numpy as np
img = cv2.imread('YdUqv .jpg')
#convert to hsv
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
#color definition
color_lower = np.array([105,80,60])
color_upper = np.array([140,255,255])
# select color objects
mask = cv2.inRange(hsv, color_lower, color_upper)
cv2.imwrite('hand.png', mask)
Result:
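If you want the strokes in color rather than as a binary mask, a small follow-up sketch using the same img and mask:
# Apply the mask to the original image to keep only the handwriting in color
chars = cv2.bitwise_and(img, img, mask=mask)
cv2.imwrite('hand_color.png', chars)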

Edge detection in pixelated images

I am trying to find different approaches for how to find edges in a pixelated image such as this one:
By edges I mean the clear lines showing between the pixels (blocks), not the edges from skin to background etc.
Does anyone got a tips for how to find these edges?
Would a Sobel filter be able to detect these lines as edges?
I have not tested anything yet, I am more looking into options on what type of filters exist.
I will be implementing the stuff in C++ and DirectX12.
There is a large selection of filters.
Result of MATLAB edge function applying different types of filters:
It looks like 'Canny' and 'approxcanny' give the best results.
According to MATLAB documentation:
The 'Canny' and 'approxcanny' methods are not supported on a GPU.
This probably means that the 'Canny' filter is less suited to GPU implementation.
Here is the MATLAB code:
I = imread('images.jpg'); %Read image.
I = rgb2gray(I); %Convert RGB to Grayscale.

%Names of filters.
filt_name = {'sobel', 'Prewitt', 'Roberts', 'log', 'zerocross', 'Canny', 'approxcanny'};

%Display filtered images
figure('Position', [100, 100, size(I,2)*4, size(I,1)*4]);

for i = 1:length(filt_name)
    %Filter I using edge detection filters of type 'sobel', 'Prewitt', 'Roberts'...
    %Use default MATLAB parameters for each filter.
    J = edge(I, filt_name{i});
    subplot(3, 3, i);
    image(im2uint8(J));
    colormap('gray');
    title(filt_name{i});
    axis image; axis off
end
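For an OpenCV counterpart, a minimal Canny sketch (the hysteresis thresholds are assumptions; tune them per image):
import cv2

img = cv2.imread('images.jpg', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 100, 200)  # lower/upper hysteresis thresholds
cv2.imwrite('edges.png', edges)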

Automatic approach for removing colored object shadow on white background?

I am working on some leaf images using OpenCV (Java). The leaves are captured on white paper and some have shadows like this one:
Of course, it's somehow the extreme case (there are milder shadows).
Now, I want to threshold the leaf and also remove the shadow (while preserving the leaf's details).
My current flow is this:
1) Converting to HSV and extracting the Saturation channel:
Imgproc.cvtColor(colorMat, colorMat, Imgproc.COLOR_RGB2HSV);
ArrayList<Mat> channels = new ArrayList<Mat>();
Core.split(colorMat, channels);
satImg = channels.get(1);
2) De-noising (median) and applying adaptiveThreshold:
Imgproc.medianBlur(satImg , satImg , 11);
Imgproc.adaptiveThreshold(satImg , satImg , 255, Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY, 401, -10);
And the result is this:
It looks OK, but the shadow is causing some anomalies along the left boundary. Also, I have this feeling that I am not using the white background to my benefit.
Now, I have 2 questions:
1) How can I improve the result and get rid of the shadow?
2) Can I get good results without working on the saturation channel? The reason I ask is that on most of my images, working on the L channel (from HLS) gives far better results (apart from the shadow, of course).
Update: Using the Hue channel makes thresholding better, but makes the shadow situation worse:
Update2: In some cases, the assumption that the shadow is darker than the leaf doesn't always hold. So, working on intensities won't help. I'm looking more toward a color channels approach.
I don't use OpenCV; instead I tried the MATLAB Image Processing Toolbox to extract the leaf. Hopefully OpenCV has all the corresponding functions. Please see my result below. I did all the operations on channel 3 and channel 1 of your original image.
First I used channel 3 and thresholded it at 100 (top left). Then I removed the regions touching the border and the regions smaller than 100 pixels, and filled the hole in the leaf; the result is shown at top right.
Next I used channel 1 and did the same as for channel 3; the result is shown at bottom left. Then I found the connected regions (there are only two, as you can see in the bottom-left figure) and removed the one with the smaller area (shown at bottom right).
Suppose the top-right image is I1 and the bottom-right image is I; the leaf is extracted by computing ~I && I1. The leaf is:
Hope it helps. Thanks
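For readers who want to stay in Python, a rough sketch of the steps described above. The threshold of 100 and the minimum region size come from the answer; the channel mapping and library functions are my assumptions (note OpenCV loads images as BGR, so adjust the indices if the answer's channels were RGB):
import cv2
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import clear_border
from skimage.morphology import remove_small_objects

img = cv2.imread('leaf.jpg')
# Channel 3 -> I1: threshold, clear border regions, drop small regions, fill holes
I1 = img[:, :, 2] > 100
I1 = clear_border(I1)
I1 = remove_small_objects(I1, min_size=100)
I1 = ndi.binary_fill_holes(I1)
# Channel 1 -> I: same processing, then keep only the largest connected region
I = img[:, :, 0] > 100
I = clear_border(I)
I = remove_small_objects(I, min_size=100)
I = ndi.binary_fill_holes(I)
lbl, n = ndi.label(I)
if n > 1:
    sizes = ndi.sum(I, lbl, range(1, n + 1))
    I = lbl == (np.argmax(sizes) + 1)
# Combine as described: leaf = ~I && I1
leaf = ~I & I1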
I tried two different things:
1. other thresholding on the saturation channel
2. trying to find two contours: shadow and leaf
I use C++, so your code snippets will look a little different.
Trying Otsu thresholding instead of adaptive thresholding:
cv::threshold(hsv_imgs, mask, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
leading to the following images (just Otsu thresholding on the saturation channel):
The other thing is computing gradient information (I used Sobel, see the OpenCV documentation), thresholding that, and after an opening operator I used findContours, giving something like this, which is not usable yet (gradient contour approach):
I'm trying to do the same thing with photos of butterflies, but with more uneven and unpredictable backgrounds such as this. Once you've identified a good portion of the background (e.g. via thresholding, or as we do, flood filling from random points), what works well is to use the GrabCut algorithm to get all those bits you might miss on the initial pass. In python, assuming you still want to identify an initial area of background by thresholding on the saturation channel, try something like
import cv2
import numpy as np
img = cv2.imread("leaf.jpg")
sat = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)[:,:,1]
sat = cv2.medianBlur(sat, 11)
thresh = cv2.adaptiveThreshold(sat , 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 401, 10);
cv2.imwrite("thresh.jpg", thresh)
h, w = img.shape[:2]
bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)
grabcut_mask = (thresh // 255 * 3).astype(np.uint8) #background should be 0, probable foreground = 3
cv2.grabCut(img, grabcut_mask,(0,0,w,h),bgdModel,fgdModel,5,cv2.GC_INIT_WITH_MASK)
grabcut_mask = np.where((grabcut_mask ==2)|(grabcut_mask ==0),0,1).astype('uint8')
cv2.imwrite("GrabCut1.jpg", img*grabcut_mask[...,None])
This actually gets rid of the shadows for you in this case, because the edge of the shadow actually has high saturation levels, so is included in the grab cut deletion. (I would post images, but don't have enough reputation)
Usually, however, you can't trust shadows to be included in the background detection. In this case you probably want to compare areas in the image with the colour of the now-known background using the chromacity distortion measure proposed by Horprasert et al. (1999) in "A Statistical Approach for Real-time Robust Background Subtraction and Shadow Detection". This measure takes account of the fact that for desaturated colours, hue is not a relevant measure.
Note that the pdf of the preprint you find online has a mistake (no + signs) in equation 6. You can use the version re-quoted in Rodriguez-Gomez et al (2012), equations 1 & 2. Or you can use my python code below:
def brightness_distortion(I, mu, sigma):
    return np.sum(I*mu/sigma**2, axis=-1) / np.sum((mu/sigma)**2, axis=-1)

def chromacity_distortion(I, mu, sigma):
    alpha = brightness_distortion(I, mu, sigma)[...,None]
    return np.sqrt(np.sum(((I - alpha * mu)/sigma)**2, axis=-1))
You can feed the known background mean & stdev as the last two parameters of the chromacity_distortion function, and the RGB pixel image as the first parameter, which should show you that the shadow is basically the same chromacity as the background, and very different from the leaf. In the code below, I've then thresholded on chromacity, and done another grabcut pass. This works to remove the shadow even if the first grabcut pass doesn't (e.g. if you originally thresholded on hue)
mean, stdev = cv2.meanStdDev(img, mask = 255-thresh)
mean = mean.ravel() #bizarrely, meanStdDev returns an array of size [3,1], not [3], so flatten it
stdev = stdev.ravel()
chrom = chromacity_distortion(img, mean, stdev)
chrom255 = cv2.normalize(chrom, None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX).astype(np.uint8)[:,:,None]
cv2.imwrite("ChromacityDistortionFromBackground.jpg", chrom255)
thresh2 = cv2.adaptiveThreshold(chrom255 , 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 401, 10);
cv2.imwrite("thresh2.jpg", thresh2)
grabcut_mask[...] = 3
grabcut_mask[thresh==0] = 0 #where thresh == 0, definitely background, set to 0
grabcut_mask[np.logical_and(thresh == 255, thresh2 == 0)] = 2 #could try setting this to 2 or 0
cv2.grabCut(img, grabcut_mask,(0,0,w,h),bgdModel,fgdModel,5,cv2.GC_INIT_WITH_MASK)
grabcut_mask = np.where((grabcut_mask ==2)|(grabcut_mask ==0),0,1).astype('uint8')
cv2.imwrite("final_leaf.jpg", grabcut_mask[...,None]*img)
I'm afraid with the parameters I tried, this still removes the stalk, though. I think that's because GrabCut thinks that it looks a similar colour to the shadows. Let me know if you find a way to keep it.

Why is my bicubic interpolation of discrete data looking ugly?

I have a 128x128 array of elevation data (elevations from -400m to 8000m are displayed using 9 colors) and I need to resize it to 512x512. I did it with bicubic interpolation, but the result looks weird. In the picture you can see original, nearest and bicubic. Note: only the elevation data are interpolated, not the colors themselves (the gamut is preserved). Are the artifacts seen in the bicubic image the result of my bad interpolation code, or are they caused by interpolating discrete (9-step) data?
http://i.stack.imgur.com/Qx2cl.png
There must be something wrong with the bicubic code you're using. Here's my result with Python:
The black border around the outside is where the result was outside of the palette due to ringing.
Here's the program that produced the above:
from PIL import Image

im = Image.open(r'c:\temp\temp.png')

# convert the image to a grayscale with 8 values from 10 to 17
levels = ((0,0,255),(1,255,0),(255,255,0),(255,0,0),(255,175,175),(255,0,255),(1,255,255),(255,255,255))
img = Image.new('L', im.size)
iml = im.load()
imgl = img.load()
colormap = {}
for i, color in enumerate(levels):
    colormap[color] = 10 + i
width, height = im.size
for y in range(height):
    for x in range(width):
        imgl[x,y] = colormap[iml[x,y]]

# resize using Bicubic and restore the original palette
im4x = img.resize((4*width, 4*height), Image.BICUBIC)
palette = []
for i in range(256):
    if 10 <= i < 10+len(levels):
        palette.extend(levels[i-10])
    else:
        palette.extend((i, i, i))
im4x.putpalette(palette)
im4x.save(r'c:\temp\temp3.png')
Edit: Evidently Python's Bicubic isn't the best either. Here's what I was able to do by hand in Paint Shop Pro, using roughly the same procedure as above.
While bicubic interpolation can sometimes generate interpolated values outside the original range (can you verify whether this is happening to you?), it really seems like you may have a bug, though it is hard to say without looking at the code. As a general rule the bicubic result should be smoother than the nearest-neighbor result.
Edit: I take that back; I see no interpolated values outside the original range in your images. Still, I think the strange part is the "jaggedness" you get with bicubic; you may want to double-check that.
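To illustrate the interpolate-the-data-not-the-colors point with OpenCV, a small hedged sketch (the elevation array here is a random placeholder, and the band edges are assumptions):
import cv2
import numpy as np

elev = np.random.uniform(-400, 8000, (128, 128)).astype(np.float32)  # placeholder data
big = cv2.resize(elev, (512, 512), interpolation=cv2.INTER_CUBIC)
# Re-quantize the interpolated elevations into the original 9 bands
edges = np.linspace(-400, 8000, 10)    # 9 bands -> 10 edges
bands = np.digitize(big, edges[1:-1])  # band index 0..8 per pixel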

How to define the markers for Watershed in OpenCV?

I'm writing for Android with OpenCV. I'm segmenting an image similar to below using marker-controlled watershed, without the user manually marking the image. I'm planning to use the regional maxima as markers.
minMaxLoc() would give me the value, but how can I restrict it to the blobs which is what I'm interested in? Can I utilize the results from findContours() or cvBlob blobs to restrict the ROI and apply maxima to each blob?
First of all: the function minMaxLoc finds only the global minimum and global maximum for a given input, so it is mostly useless for determining regional minima and/or regional maxima. But your idea is right, extracting markers based on regional minima/maxima for performing a Watershed Transform based on markers is totally fine. Let me try to clarify what is the Watershed Transform and how you should correctly use the implementation present in OpenCV.
A decent number of papers that deal with watershed describe it similarly to what follows (I might miss some detail; if you are unsure, ask). Consider the surface of some region you know: it contains valleys and peaks (among other details that are irrelevant for us here). Suppose below this surface all you have is water, colored water. Now, make holes in each valley of your surface and the water starts to fill the whole area. At some point, differently colored waters will meet, and when this happens, you construct a dam so that they don't touch each other. In the end you have a collection of dams, which is the watershed separating all the different colored waters.
Now, if you make too many holes in that surface, you end up with too many regions: over-segmentation. If you make too few you get an under-segmentation. So, virtually any paper that suggests using watershed actually presents techniques to avoid these problems for the application the paper is dealing with.
I wrote all this (which is possibly too naïve for anyone that knows what the Watershed Transform is) because it reflects directly on how you should use watershed implementations (which the current accepted answer is doing in a completely wrong manner). Let us start on the OpenCV example now, using the Python bindings.
The image presented in the question is composed of many objects that are mostly too close and in some instances overlapping. The usefulness of watershed here is to separate correctly these objects, not to group them into a single component. So you need at least one marker for each object and good markers for the background. As an example, first binarize the input image by Otsu and perform a morphological opening for removing small objects. The result of this step is shown below in the left image. Now with the binary image consider applying the distance transform to it, result at right.
With the distance transform result, we can consider some threshold such that we consider only the regions most distant to the background (left image below). Doing this, we can obtain a marker for each object by labeling the different regions after the earlier threshold. Now, we can also consider the border of a dilated version of the left image above to compose our marker. The complete marker is shown below at right (some markers are too dark to be seen, but each white region in the left image is represented at the right image).
This marker we have here makes a lot of sense. Each colored water == one marker will start to fill the region, and the watershed transformation will construct dams to impede that the different "colors" merge. If we do the transform, we get the image at left. Considering only the dams by composing them with the original image, we get the result at right.
import sys
import cv2
import numpy
from scipy.ndimage import label

def segment_on_dt(a, img):
    border = cv2.dilate(img, None, iterations=5)
    border = border - cv2.erode(border, None)

    dt = cv2.distanceTransform(img, 2, 3)
    dt = ((dt - dt.min()) / (dt.max() - dt.min()) * 255).astype(numpy.uint8)
    _, dt = cv2.threshold(dt, 180, 255, cv2.THRESH_BINARY)
    lbl, ncc = label(dt)
    lbl = lbl * (255 / (ncc + 1))
    # Completing the markers now.
    lbl[border == 255] = 255

    lbl = lbl.astype(numpy.int32)
    cv2.watershed(a, lbl)

    lbl[lbl == -1] = 0
    lbl = lbl.astype(numpy.uint8)
    return 255 - lbl

img = cv2.imread(sys.argv[1])

# Pre-processing.
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, img_bin = cv2.threshold(img_gray, 0, 255, cv2.THRESH_OTSU)
img_bin = cv2.morphologyEx(img_bin, cv2.MORPH_OPEN, numpy.ones((3, 3), dtype=int))

result = segment_on_dt(img, img_bin)
cv2.imwrite(sys.argv[2], result)

result[result != 255] = 0
result = cv2.dilate(result, None)
img[result == 255] = (0, 0, 255)
cv2.imwrite(sys.argv[3], img)
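The script reads the input image path and two output paths from the command line, so a run might look like this (file names are placeholders):
python watershed.py input.png segmentation.png overlay.png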
I would like to explain some simple code on how to use watershed here. I am using OpenCV-Python, but I hope you won't have any difficulty understanding it.
In this code, I will be using watershed as a tool for foreground-background extraction. (This example is the python counterpart of the C++ code in OpenCV cookbook). This is a simple case to understand watershed. Apart from that, you can use watershed to count the number of objects in this image. That will be a slightly advanced version of this code.
1 - First we load our image, convert it to grayscale, and threshold it with a suitable value. I took Otsu's binarization, so it would find the best threshold value.
import cv2
import numpy as np
img = cv2.imread('sofwatershed.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
Below is the result I got:
(even this result is good, because there is great contrast between the foreground and background)
2 - Now we have to create the marker. The marker is an image of the same size as the original, of type 32SC1 (32-bit signed, single channel).
There will be some regions in the original image where you are simply sure they belong to the foreground; mark such regions with 255 in the marker image. Regions you are sure are background are marked with 128. Regions you are not sure about are marked with 0. That is what we are going to do next.
A - Foreground region:- We have already got a thresholded image where the pills are white. We erode it a little, so that we are sure the remaining region belongs to the foreground.
fg = cv2.erode(thresh,None,iterations = 2)
fg :
B - Background region :- Here we dilate the thresholded image so that the background region is reduced. But we are sure the remaining black region is 100% background. We set it to 128.
bgt = cv2.dilate(thresh,None,iterations = 3)
ret,bg = cv2.threshold(bgt,1,128,1)
Now we get bg as follows :
C - Now we add both fg and bg :
marker = cv2.add(fg,bg)
Below is what we get :
Now we can clearly see from the image above that the white region is 100% foreground, the gray region is 100% background, and the black region is what we are not sure about.
Then we convert it into 32SC1 :
marker32 = np.int32(marker)
3 - Finally we apply watershed and convert the result back into a uint8 image:
cv2.watershed(img,marker32)
m = cv2.convertScaleAbs(marker32)
m :
4 - We threshold it properly to get the mask and perform bitwise_and with the input image:
ret,thresh = cv2.threshold(m,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
res = cv2.bitwise_and(img,img,mask = thresh)
res :
Hope it helps!!!
ARK
Foreword
I'm chiming in mostly because I found both the watershed tutorial in the OpenCV documentation (and C++ example) as well as mmgp's answer above to be quite confusing. I revisited a watershed approach multiple times to ultimately give up out of frustration. I finally realized I needed to at least give this approach a try and see it in action. This is what I've come up with after sorting out all of the tutorials I've come across.
Aside from being a computer vision novice, most of my trouble probably had to do with my requirement to use the OpenCVSharp library rather than Python. C# doesn't have baked-in high-power array operators like those found in NumPy (though I realize this has been ported via IronPython), so I struggled quite a bit in both understanding and implementing these operations in C#. Also, for the record, I really despise the nuances of, and inconsistencies in most of these function calls. OpenCVSharp is one of the most fragile libraries I've ever worked with. But hey, it's a port, so what was I expecting? Best of all, though -- it's free.
Without further ado, let's talk about my OpenCVSharp implementation of the watershed, and hopefully clarify some of the stickier points of watershed implementation in general.
Application
First of all, make sure watershed is what you want and understand its use. I am using stained cell plates, like this one:
It took me a good while to figure out I couldn't just make one watershed call to differentiate every cell in the field. On the contrary, I first had to isolate a portion of the field, then call watershed on that small portion. I isolated my region of interest (ROI) via a number of filters, which I will explain briefly here:
Start with source image (left, cropped for demonstration purposes)
Isolate the red channel (left middle)
Apply adaptive threshold (right middle)
Find contours then eliminate those with small areas (right)
Once we have cleaned the contours resulting from the above thresholding operations, it is time to find candidates for watershed. In my case, I simply iterated through all contours greater than a certain area.
Code
Say we've isolated this contour from the above field as our ROI:
Let's take a look at how we'll code up a watershed.
We'll start with a blank mat and draw only the contour defining our ROI:
var isolatedContour = new Mat(source.Size(), MatType.CV_8UC1, new Scalar(0, 0, 0));
Cv2.DrawContours(isolatedContour, new List<List<Point>> { contour }, -1, new Scalar(255, 255, 255), -1);
In order for the watershed call to work, it will need a couple of "hints" about the ROI. If you're a complete beginner like me, I recommend checking out the CMM watershed page for a quick primer. Suffice to say we're going to create hints about the ROI on the left by creating the shape on the right:
To create the white part (or "background") of this "hint" shape, we'll just Dilate the isolated shape like so:
var kernel = Cv2.GetStructuringElement(MorphShapes.Ellipse, new Size(2, 2));
var background = new Mat();
Cv2.Dilate(isolatedContour, background, kernel, iterations: 8);
To create the black part in the middle (or "foreground"), we'll use a distance transform followed by threshold, which takes us from the shape on the left to the shape on the right:
This takes a few steps, and you may need to play around with the lower bound of your threshold to get results that work for you:
var foreground = new Mat(source.Size(), MatType.CV_8UC1);
Cv2.DistanceTransform(isolatedContour, foreground, DistanceTypes.L2, DistanceMaskSize.Mask5);
Cv2.Normalize(foreground, foreground, 0, 1, NormTypes.MinMax); //Remember to normalize!
foreground.ConvertTo(foreground, MatType.CV_8UC1, 255, 0);
Cv2.Threshold(foreground, foreground, 150, 255, ThresholdTypes.Binary);
Then we'll subtract these two mats to get the final result of our "hint" shape:
var unknown = new Mat(); //this variable is also named "border" in some examples
Cv2.Subtract(background, foreground, unknown);
Again, if we Cv2.ImShow unknown, it would look like this:
Nice! This was easy for me to wrap my head around. The next part, however, got me quite puzzled. Let's look at turning our "hint" into something the Watershed function can use. For this we need to use ConnectedComponents, which is basically a big matrix of pixels grouped by the virtue of their index. For example, if we had a mat with the letters "HI", ConnectedComponents might return this matrix:
0 0 0 0 0 0 0 0 0
0 1 0 1 0 2 2 2 0
0 1 0 1 0 0 2 0 0
0 1 1 1 0 0 2 0 0
0 1 0 1 0 0 2 0 0
0 1 0 1 0 2 2 2 0
0 0 0 0 0 0 0 0 0
So, 0 is the background, 1 is the letter "H", and 2 is the letter "I". (If you get to this point and want to visualize your matrix, I recommend checking out this instructive answer.) Now, here's how we'll utilize ConnectedComponents to create the markers (or labels) for watershed:
var labels = new Mat(); //also called "markers" in some examples
Cv2.ConnectedComponents(foreground, labels);
labels = labels + 1;

//this is a much more verbose port of numpy's: labels[unknown==255] = 0
for (int x = 0; x < labels.Width; x++)
{
    for (int y = 0; y < labels.Height; y++)
    {
        //You may be able to just send "int" in rather than "char" here:
        var labelPixel = (int)labels.At<char>(y, x);   //note: x and y are inexplicably
        var borderPixel = (int)unknown.At<char>(y, x); //and infuriatingly reversed
        if (borderPixel == 255)
            labels.Set(y, x, 0);
    }
}
Note that the Watershed function requires the border area to be marked by 0. So, we've set any border pixels to 0 in the label/marker array.
At this point, we should be all set to call Watershed. However, in my particular application, it is useful just to visualize a small portion of the entire source image during this call. This may be optional for you, but I first just mask off a small bit of the source by dilating it:
var mask = new Mat();
Cv2.Dilate(isolatedContour, mask, new Mat(), iterations: 20);
var sourceCrop = new Mat(source.Size(), source.Type(), new Scalar(0, 0, 0));
source.CopyTo(sourceCrop, mask);
And then make the magic call:
Cv2.Watershed(sourceCrop, labels);
Results
The above Watershed call will modify labels in place. You'll have to think back to the matrix resulting from ConnectedComponents. The difference here is, if watershed found any dams between watersheds, they will be marked as "-1" in that matrix. Like the ConnectedComponents result, different watersheds will be marked with similarly incrementing numbers. For my purposes, I wanted to store these in separate contours, so I created this loop to split them up:
var watershedContours = new List<Tuple<int, List<Point>>>();
for (int x = 0; x < labels.Width; x++)
{
    for (int y = 0; y < labels.Height; y++)
    {
        var labelPixel = labels.At<Int32>(y, x); //note: x, y switched
        var connected = watershedContours.Where(t => t.Item1 == labelPixel).FirstOrDefault();
        if (connected == null)
        {
            connected = new Tuple<int, List<Point>>(labelPixel, new List<Point>());
            watershedContours.Add(connected);
        }
        connected.Item2.Add(new Point(x, y));
        if (labelPixel == -1)
            sourceCrop.Set(y, x, new Vec3b(0, 255, 255));
    }
}
Then, I wanted to print these contours with random colors, so I created the following mat:
var watershed = new Mat(source.Size(), MatType.CV_8UC3, new Scalar(0, 0, 0));
foreach (var component in watershedContours)
{
    if (component.Item2.Count < (labels.Width * labels.Height) / 4 && component.Item1 >= 0)
    {
        var color = GetRandomColor();
        foreach (var point in component.Item2)
            watershed.Set(point.Y, point.X, color);
    }
}
Which yields the following when shown:
If we draw on the source image the dams that were marked by a -1 earlier, we get this:
Edits:
I forgot to note: make sure you're cleaning up your mats after you're done with them. They WILL stay in memory, and OpenCVSharp may present some unintelligible error messages. I should really be using using above, but mat.Release() is an option as well.
Also, mmgp's answer above includes this line: dt = ((dt - dt.min()) / (dt.max() - dt.min()) * 255).astype(numpy.uint8), which is a histogram stretching step applied to the results of the distance transform. I omitted this step for a number of reasons (mostly because I didn't think the histograms I saw were too narrow to begin with), but your mileage may vary.
