How to implement an "Excel like" "add cell" into a canvas - image-processing

I'm sorry for the title, which may not properly describe what I would like to achieve. I'm starting to develop a new piece of software that should present a "grid" to the user, who can manipulate it by adding "rows" or "columns" at any point of this "grid". The problem is that I'm not sure a real grid is the suitable solution, because there are some "graphical" requirements like changing individual cell sizes, nesting cells, zooming/stretching, etc. So I started to analyze a solution in WPF that uses DrawingVisual elements (for performance reasons).
I'm able to draw the "grid" in the desired way. I'm also able to add rows or columns at the edges of the drawing. But I can't figure out any solution to modify it in the "middle" (except redrawing the whole thing). Let me explain with an image. On the left there's the "grid" after it has been drawn for the first time. On the right there's the new grid that should be drawn after the user performs an operation.
A more complex example is the following, where the "row" is added inside an existing cell, causing all the cells to "grow".
As I said, I know I could redraw the whole thing, but I'm concerned about performance. Keep in mind that in a real scenario there could be thousands of blocks and many nesting levels.
Any suggestion is appreciated. The use of WPF is not mandatory, but it will be a desktop app targeting .NET 5.0. The use of a DrawingVisual isn't mandatory either; I can evaluate any solution. Thank you.

A simple technique is to keep the positions of the columns relative to the left of the canvas in a variable when you first draw the tables. When you want to add a new column, you can crop the image at that point, copy the left and right pieces into a larger canvas, and just draw the new middle column from scratch.
Of course, the coordinates of each column could also be recovered with image-processing techniques, but that reduces performance.
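If you ever do need to recover the column coordinates from the pixels themselves, a minimal sketch of a projection-profile approach could look like this (the file name and the 0.5 factor are placeholders used only to illustrate the idea, not part of the main solution below):
import cv2
import numpy as np

# Rough sketch: locate vertical grid lines by counting "ink" pixels per x column.
img = cv2.imread("grid.png", cv2.IMREAD_GRAYSCALE)           # placeholder file name
binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
column_profile = binary.sum(axis=0)                           # total ink per x column
xs = np.where(column_profile > 0.5 * column_profile.max())[0]
# Adjacent x values belong to the same line; group them to get one coordinate per column.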
I wrote this code in Python, but I don't think it would be difficult to convert it to C#.
import cv2
import numpy as np

# copy image over another
def imdraw(im, over, x, y):
    y1, y2 = y, y + over.shape[0]
    x1, x2 = x, x + over.shape[1]
    for c in range(0, 3):
        im[y1:y2, x1:x2, c] = over[:, :, c]
    return im

pt = 220
col = 300
off = 15

im = cv2.imread("grid.png", 1)
h, w = im.shape[:2]

crop_left = im[0:0 + h, 0:pt]
crop_right = im[0:0 + h, pt:w]
cv2.imwrite("left.jpg", crop_left)
cv2.imwrite("right.jpg", crop_right)

# Create an empty image with white background
out = 255 * np.ones(shape=[h, w + col, 3], dtype=np.uint8)
out = imdraw(out, crop_left, 0, 0)
out = imdraw(out, crop_right, pt + col, 0)
out = cv2.rectangle(
    out,
    pt1=(pt + off, off),
    pt2=(pt + col - off, h - off),
    color=(128, 0, 200),
    thickness=5,
    lineType=cv2.LINE_AA,
)
cv2.imwrite("out.jpg", out)
Output:

Related

extracting the pieces and positions from a boardgame

So I am using OpenCV (in Go, with GoCV) to attempt to extract the pieces from a boardgame. Originally I approached this problem with some success by manually finding the HSV values for each player piece colour and the board positions. I managed to get this working and obtained a programmatic representation of every piece and its position on the board. The downside is that this requires quite serious human interaction when using a different board - "finding" all the correct HSV values. I asked here and got a suggestion to start by ignoring the colour, find all the pieces, and then use a clustering algorithm on colour to work out which player each piece belongs to. I might have to do something similar for the positions as well, but that's stage two.
So now I am attempting to just extract all pieces regardless of colour.
I started out trying to use NewSimpleBlobDetectorWithParams - however I made little progress; it seems to struggle a lot with false negatives/positives.
I tried HoughCirclesWithParams, but again this seems very dependent on the parameters and I wasn't making much progress on the actual pieces being detected. Currently I am using FindContours and that seems to be giving me some reasonable accuracy. Let's look at the pictures.
The original image looks like this:
I have built a "dashboard" of controls, and the things that seem to be most "useful" are erosion, dilation and threshold.
My current setup is a load of trackbars/sliders to adjust the values, and then:
gocv.CvtColor(clone, &clone, gocv.ColorRGBToGray)
erodeKernel := gocv.GetStructuringElement(gocv.MorphRect, image.Pt(trackers.erosionValue, trackers.erosionValue))
gocv.Erode(clone, &clone, erodeKernel)
dilateKernel := gocv.GetStructuringElement(gocv.MorphRect, image.Pt(trackers.dilateValue, trackers.dilateValue))
gocv.Dilate(clone, &clone, dilateKernel)
gocv.Threshold(clone, &clone, float32(trackers.thresTruncValue), 255, gocv.ThresholdTrunc)
gocv.Threshold(clone, &clone, float32(trackers.threshBinaryValue), 255, gocv.ThresholdBinary)
cannies := gocv.NewMat()
gocv.Canny(clone, &cannies, float32(trackers.cannyMin), float32(trackers.cannyMax))
cnts := gocv.FindContours(cannies, gocv.RetrievalTree, gocv.ChainApproxSimple)
followed by
for i := 0; i < cnts.Size(); i++ {
    cnt := cnts.At(i)
    if len(cnt.ToPoints()) < 5 {
        continue
    }
    rect := gocv.FitEllipse(cnt)
    gocv.Circle(&colorImage, image.Pt(rect.Center.X, rect.Center.Y), (rect.Height+rect.Width)/4, cntColor, 3)
    if gocv.ContourArea(cnt) < gocv.ArcLength(cnt, false) {
        continue
    }
    gocv.Rectangle(&colorImage, rect.BoundingRect, rectColor, 2)
    psVector := gocv.NewPointsVector()
    psVector.Append(cnt)
    gocv.DrawContours(&clone, psVector, 0, rectColor, 3)
    if rect.Center.X == (rect.BoundingRect.Max.X+rect.BoundingRect.Min.X)/2 && rect.Center.Y == (rect.BoundingRect.Min.Y+rect.BoundingRect.Max.Y)/2 {
        // Does the circle fit inside the square?
        if float64(rect.Width*rect.Height) > math.Pi*math.Pow(float64((rect.Height+rect.Width)/4), 2) {
            gocv.Circle(&colorImage, image.Pt(rect.Center.X, rect.Center.Y), 2, matchColor, 3)
            pieces = append(pieces, image.Pt(rect.Center.X, rect.Center.Y))
        }
    }
}
The idea being: if the contour has at least 5 points, you can fit an ellipse and find the bounding rectangle; then, if the contour is closed, draw a circle at the center of the contour, and if it fits inside the bounding rectangle and they share the same center, it's probably a playing piece. Note - I came up with this principle from seeing where the circles and bounding rectangles were lying; when they matched up, it more often than not seemed to be a playing piece.
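For reference, since I can also work from Python, a rough OpenCV-Python sketch of that same check might look like this (binary and the 1-pixel centre tolerance are placeholders/guesses, not my actual pipeline):
import math
import cv2

contours, _ = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
pieces = []
for cnt in contours:
    if len(cnt) < 5:                       # fitEllipse needs at least 5 points
        continue
    if cv2.contourArea(cnt) < cv2.arcLength(cnt, False):
        continue                           # probably not a closed, filled blob
    (ex, ey), (ew, eh), angle = cv2.fitEllipse(cnt)
    x, y, bw, bh = cv2.boundingRect(cnt)
    r = (ew + eh) / 4                      # rough radius from the fitted ellipse
    same_center = abs(ex - (x + bw / 2)) <= 1 and abs(ey - (y + bh / 2)) <= 1
    if same_center and ew * eh > math.pi * r ** 2:
        pieces.append((int(ex), int(ey)))  # likely a playing piece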
So I am making some nice progress. However, my question is about approaches to dig out the other colour pieces, and perhaps to more "robustly" dig out the white pieces. I feel that I don't quite have enough tools at my disposal, as if I increase one thing I have to decrease another, and for some reason I feel that finding 30 round chequers on a board should be reasonably robust.
When I adjust the values looking for the maroon pieces I can get a few of them, but as you can see, playing with the threshold/erosion/dilation is not doing a wonderful job of finding them.
EDIT:
I have added the Hough circle algorithm back in to show that it misses a lot (false negatives) - in this case it found only 1.
gocv.HoughCirclesWithParams(
    clone,
    &circles,
    gocv.HoughGradient,
    1,  // dp
    15, // minDist
    75, // param1
    20, // param2
    20, // minRadius
    45, // maxRadius
)
blue := color.RGBA{0, 0, 255, 0}
for i := 0; i < circles.Cols(); i++ {
    v := circles.GetVecfAt(0, i)
    // if circles are found
    if len(v) > 2 {
        x := int(v[0])
        y := int(v[1])
        r := int(v[2])
        gocv.Circle(&colorImage, image.Pt(x, y), r, blue, 2)
    }
}
Here is the threshold I was using.
So I realise I have said a lot here. I am looking for some help to detect all the playing pieces on the board.
I am doing this in Go with GoCV, but I can use Python/convert Python code if anyone has a good reference or something.
The original image without any amendments is here. As I say, my goal is to automatically detect the 30 pieces on the board, and then I can use a clustering algorithm to work out which group they are in (I think...). I want to do it with the least amount of human interaction dragging sliders, as that is not a fun/nice user experience.
Thoughts I had
the user could drag bounding boxes around groups, which would make the computer's job easier since it would know it only has to find pieces in there.
the user could select a colour on the page, which would tell the computer roughly which HSV values it should be looking in (see the sketch after this list).
the user could calibrate against a known start position of the pieces so the computer knew where to look.
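For the colour-selection idea above, a minimal sketch of what I had in mind, in Python since that is easier to show (the file name, click coordinates and tolerances are all placeholders):
import cv2
import numpy as np

img = cv2.imread("board.jpg")                      # placeholder file name
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
x, y = 410, 230                                    # placeholder "clicked" pixel
h, s, v = hsv[y, x].astype(int)
tol = np.array([10, 60, 60])                       # tolerances to tune per board
lower = np.clip(np.array([h, s, v]) - tol, 0, 255).astype(np.uint8)
upper = np.clip(np.array([h, s, v]) + tol, 0, 255).astype(np.uint8)
mask = cv2.inRange(hsv, lower, upper)              # note: hue wraps at 180, not handled here
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)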
Not exactly an answer to your questions, but this would be so much easier if you used object detection instead. This is the same way I find different objects in my tutorials. In this case, I would have 2 or possibly 3 classes: light pieces, dark pieces, and possibly another class for the empty spaces.
I usually use OpenCV and Darknet/YOLO to solve these kinds of things. I have many tutorials on my youtube channel. Here is a simple one to detect a few shapes: https://www.youtube.com/watch?v=yOJIRArZeig Here is another that shows OpenCV and Darknet/YOLO used to solve Sudoku: https://www.youtube.com/watch?v=BUG7HlhuArw
Your case would be similar to that last one. You'd get back a vector of objects detected, with the bounding box coordinates of each one within the image or video frame. If interested, this is the tutorial video I recommend to start: https://www.youtube.com/watch?v=pJ2iyf_E9PM

Extract text from background grids/lines [2]

I'm trying to remove the grid lines in a handwriting picture. I tried to use FFT to extract the grid pattern and remove it (this approach comes from an answer to the original question, which was closed somehow; it has more background as well). This image shows what I am able to get currently (illustration result):
The first line is a real image with handwritten characters. Since it's taken by phone under various conditions (light, direction, etc.), the grid lines might not be perfectly horizontal/vertical, and the color of the grid lines also varies and might be close to the color of the characters. I convert it to grayscale, apply the FFT, and try to use thresholding to extract the patterns (in the red rectangle; the illustration uses OTSU). Then I mask the spectrum with the thresholded pattern and use the inverse FFT to get the result. It obviously fails on the real image.
The second line is a real image of a blank grid without handwritten characters. From this, I think the 3 lines (vertical and horizontal) in the center are the patterns I care about.
The third line is a synthetic image with perfect grid lines. It's just for reference. After applying the same algorithm, the grid lines could be removed successfully.
The fourth line is a synthetic image with perfect dashed grid lines, which is closer to the grid lines on real handwriting practice paper. It's also for reference. It shows that the pattern of dashed lines is actually more complicated than the 3 lines in the center. With the same algorithm, the grid lines could be removed almost completely as well.
The code I use is:
def FFTCV(img):
    util.Plot(img, 'Input')
    print(img.shape)
    if len(img.shape) == 3 and img.shape[2] == 3:
        img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    util.Plot(img, 'Gray')

    dft = cv.dft(np.float32(img), flags=cv.DFT_COMPLEX_OUTPUT)
    dft_shift = np.fft.fftshift(dft)
    util.Plot(cv.magnitude(dft_shift[:, :, 0], dft_shift[:, :, 1]), 'fft shift')
    magnitude_spectrum = np.uint8(20 * np.log(cv.magnitude(dft_shift[:, :, 0], dft_shift[:, :, 1])))
    util.Plot(magnitude_spectrum, 'Magnitude')

    _, threshold = cv.threshold(magnitude_spectrum, 0, 1, cv.THRESH_BINARY_INV + cv.THRESH_OTSU)
    # threshold = cv.adaptiveThreshold(
    #     magnitude_spectrum, 1, cv.ADAPTIVE_THRESH_MEAN_C, cv.THRESH_BINARY_INV, 11, 10)
    #     magnitude_spectrum, 1, cv.ADAPTIVE_THRESH_GAUSSIAN_C, cv.THRESH_BINARY_INV, 11, 10)
    util.Plot(threshold, 'Threshold Mask')

    fshift = dft_shift * threshold[:, :, None]
    util.Plot(cv.magnitude(fshift[:, :, 0], fshift[:, :, 1]), 'fft shift Masked')
    magnitude_spectrum = np.uint8(20 * np.log(cv.magnitude(fshift[:, :, 0], fshift[:, :, 1])))
    util.Plot(magnitude_spectrum, 'Magnitude Masked')

    f_ishift = np.fft.ifftshift(fshift)
    img_back = cv.idft(f_ishift)
    img_back = cv.magnitude(img_back[:, :, 0], img_back[:, :, 1])
    util.Plot(img_back, 'Back')
So I'd like to get suggestions on how to extract the patterns for real images. Thanks very much.

Placing a shape inside another shape using opencv

I have two images and I need to place the second image inside the first image. The second image can be resized, rotated or skewed such that it covers as large an area of the first image as possible. As an example, in the figure shown below, the green circle needs to be placed inside the blue shape:
Here the green circle is transformed such that it covers a larger area. Another example is shown below:
Note that there may be multiple results. However, any similar result is acceptable, as shown in the above example.
How do I solve this problem?
Thanks in advance!
I tested the idea I mentioned earlier in the comments and the output is almost good. It could be better, but that takes time. The final code was too long and it depends on one of my old personal projects, so I will not share it. But I will explain step by step how I wrote this algorithm. Note that I have tested the algorithm many times; it is not yet 100% accurate.
For N times do this:
1. Copy the shape.
2. Transform it randomly.
3. Put the shape on the background.
4.1. If the shape exceeds the background, it is not acceptable; go back to the first step.
4.2. Otherwise we continue to step 5.
5. We calculate the width, height and number of shape pixels.
6. We keep a list of the best candidates and compare these three parameters (W, H, Pixels) with the members of the list. If we find a better item, we save it.
I set the value of N to 5,000. The larger the number, the slower the algorithm runs, but the better the result.
You can use anything for Transform. Mirror, Rotate, Shear, Scale, Resize, etc. But I used warpPerspective for this one.
import sys
import random
import cv2
import numpy as np

im1 = cv2.imread(sys.path[0]+'/Back.png')
im2 = cv2.imread(sys.path[0]+'/Shape.png')
bH, bW = im1.shape[:2]
sH, sW = im2.shape[:2]

# TopLeft, TopRight, BottomRight, BottomLeft of the shape
_inp = np.float32([[0, 0], [sW, 0], [sW, sH], [0, sH]])
cx = random.randint(5, sW-5)
ch = random.randint(5, sH-5)
o = 0

# Random transformed output
_out = np.float32([
    [random.randint(-o, cx-1), random.randint(1-o, ch-1)],
    [random.randint(cx+1, sW+o), random.randint(1-o, ch-1)],
    [random.randint(cx+1, sW+o), random.randint(ch+1, sH+o)],
    [random.randint(-o, cx-1), random.randint(ch+1, sH+o)]
])

# Transformed output ('shape' in the original snippet; note that dsize is (width, height))
M = cv2.getPerspectiveTransform(_inp, _out)
t = cv2.warpPerspective(im2, M, (bW, bH))
You can use countNonZero to find the number of pixels and findContours and boundingRect to find the shape size.
def getSize(msk):
    cnts, _ = cv2.findContours(msk, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    cnts.sort(key=lambda p: max(cv2.boundingRect(p)[2], cv2.boundingRect(p)[3]), reverse=True)
    w, h = 0, 0
    if len(cnts) > 0:
        _, _, w, h = cv2.boundingRect(cnts[0])
    pix = cv2.countNonZero(msk)
    return pix, w, h
To find the overlapping of the back and the shape, you can do something like this:
Make a mask from the back and the shape and use bitwise methods; change this section according to the software you wrote. This is just an example :)
mskMix = cv2.bitwise_and(mskBack, mskShape)
mskMix = cv2.bitwise_xor(mskMix, mskShape)
isCandidate = not np.any(mskMix == 255)
For example, this is not a candidate answer; if you look closely at the image on the right, you will notice that the shape has exceeded the background.
I just tested the circle with 4 different backgrounds, and these are the results:
After 4879 Iterations:
After 1587 Iterations:
After 4621 Iterations:
After 4574 Iterations:
A few additional points: if you use a method like medianBlur to cover up the noise in the background mask and the shape mask, you may find a better solution.
I suggest you read about Evolutionary Computation, Metaheuristic and Soft Computing algorithms for better understanding of this algorithm :)
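To tie the snippets above together, here is a rough sketch of how the whole N-iteration search can be assembled. This is not my original code (as I said, I won't share that); the mask thresholds, the ranking by pixel count alone, and the variable names are all simplifications and assumptions:
import random
import sys
import cv2
import numpy as np

im1 = cv2.imread(sys.path[0] + '/Back.png')    # background
im2 = cv2.imread(sys.path[0] + '/Shape.png')   # shape to place
bH, bW = im1.shape[:2]
sH, sW = im2.shape[:2]

# Binary masks; assuming dark shapes on a light background, adjust threshold/polarity if needed.
mskBack = cv2.threshold(cv2.cvtColor(im1, cv2.COLOR_BGR2GRAY), 127, 255, cv2.THRESH_BINARY_INV)[1]
mskShapeSrc = cv2.threshold(cv2.cvtColor(im2, cv2.COLOR_BGR2GRAY), 127, 255, cv2.THRESH_BINARY_INV)[1]

best = None  # (pixel count, warped mask)
N = 5000
for _ in range(N):
    # Steps 1-2: copy the shape and transform it randomly.
    _inp = np.float32([[0, 0], [sW, 0], [sW, sH], [0, sH]])
    cx, ch = random.randint(5, sW - 5), random.randint(5, sH - 5)
    _out = np.float32([
        [random.randint(0, cx - 1), random.randint(1, ch - 1)],
        [random.randint(cx + 1, sW), random.randint(1, ch - 1)],
        [random.randint(cx + 1, sW), random.randint(ch + 1, sH)],
        [random.randint(0, cx - 1), random.randint(ch + 1, sH)],
    ])
    M = cv2.getPerspectiveTransform(_inp, _out)
    # Step 3: put the warped shape on a canvas the size of the background.
    mskShape = cv2.warpPerspective(mskShapeSrc, M, (bW, bH))

    # Step 4: reject candidates whose shape sticks out of the background.
    mskMix = cv2.bitwise_and(mskBack, mskShape)
    mskMix = cv2.bitwise_xor(mskMix, mskShape)
    if np.any(mskMix == 255):
        continue

    # Steps 5-6: keep the candidate that covers the most pixels.
    pix = cv2.countNonZero(mskShape)
    if best is None or pix > best[0]:
        best = (pix, mskShape)

# best[1] now holds the largest placement found (if any).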

Highcharts Vector Plot with connected vectors of absolute length

Scenario: I need to draw a plot that has a background image. Based on the information in that image there have to be multiple origins (let's call them 'targets') that can move over time. The movements of these targets have to be indicated by arrows/vectors, where the first vector originates at the location of the target, the second vector originates where the previous vector ended, and so on.
The result should look similar to this:
Plot with targets and movement vectors
While trying to implement this, I stumbled upon several questions:
I would use a chart with combined series: a Scatter plot to add the targets at exact x/y locations and a vector plot to insert the vectors. Would this be a correct way?
Since I want to set each vector's starting point to exact x/y coordinates, I use rotationOrigin: 'start'. When I now change vectorLength to something other than 20, the vector is still shifted by 10 pixels (http://jsfiddle.net/Chop_Suey/cx35ptrh/); this looks like a bug to me. Can it be fixed or is there a workaround?
When I define a vector it looks like [x, y, length, direction]. But length is a relative unit that is calculated with some magic relative to the longest vector, which is 20 (pixels) by default or whatever I set vectorLength to. Thus, the vectors are not connected, and the space between them changes depending on plot size and axes min/max. I actually want to correlate the length with the plot axes (which might be tricky since the x-axis and y-axis might have different scales). A workaround could be to add a redraw event, recalculate the vectors on every resize, and set vectorLength to the currently longest vector (which again can be calculated to correlate with the axes). This is very cumbersome, and I would prefer to be able to set the vectors somehow like [x1, y1, x2, y2], where (x1/y1) denotes the starting point and (x2/y2) the ending point of the vector. Is this possible somehow? Any recommendations?
Since the background image is not just a decoration but relevant for the displayed data to make sense, it should change when I zoom in. Is it possible to 'lock' the background image to the original plot min/max so that when I zoom in, the background image is also zoomed (image quality does not matter)?
Combining these two series shouldn't be problematic at all, and that would be the correct way, but it is necessary to change the prototype functions a bit so that the vectors are drawn in a different way. Here is the example: https://jsfiddle.net/6vkjspoc/
There is probably a bug in this module and we will report it as a new issue as soon as possible. However, we made a workaround (or fix) for it and now it's working well, as you can see in the example above.
The vector length is currently calculated using a scale. Namely, if the vectorLength value is equal to 100 (for example), and the vector series has two points that look like this:
{
    type: 'vector',
    vectorLength: 100,
    rotationOrigin: 'start',
    data: [
        [1, 50000, 1, 120],
        [1, 50000, 2, -120]
    ]
}
Then the highest length of all points is taken, and based on it the scale is calculated for each point, so the first point's drawn length is equal to 50, because the algorithm uses point.length / lengthMax, as you can deduce from the code below:
H.seriesTypes.vector.prototype.arrow = function(point) {
    var path,
        fraction = point.length / this.lengthMax,
        u = fraction * this.options.vectorLength / 20,
        o = {
            start: 10 * u,
            center: 0,
            end: -10 * u
        }[this.options.rotationOrigin] || 0;

    // The stem and the arrow head. Draw the arrow first with rotation 0,
    // which is the arrow pointing down (vector from north to south).
    path = [
        'M', 0, 7 * u + o, // base of arrow
        'L', -1.5 * u, 7 * u + o,
        0, 10 * u + o,
        1.5 * u, 7 * u + o,
        0, 7 * u + o,
        0, -10 * u + o // top
    ];

    return path;
};
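To make that scaling concrete with the series above: lengthMax is 2, so for the first point fraction = 1 / 2 = 0.5 and u = 0.5 * 100 / 20 = 2.5; since the arrow path spans from -10u to 10u, the drawn arrow is 20u = 50 pixels long, i.e. half of the configured vectorLength.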
Regarding your question about defining the start and end of a vector by two x/y pairs: you would need to refactor the entire series code so that it doesn't use vectorLength or the scale at all, because you would define the points' lengths yourself. I suspect that would be a very complex solution, so you can try to do it yourself, and let me know about the results.
In order to make it work, you need to recalculate and update the vectorLength of your vector series inside the chart.events.selection handler. Here is the example: https://jsfiddle.net/nh7b6qx9/

How to define the markers for Watershed in OpenCV?

I'm writing for Android with OpenCV. I'm segmenting an image similar to the one below using marker-controlled watershed, without the user manually marking the image. I'm planning to use the regional maxima as markers.
minMaxLoc() would give me the value, but how can I restrict it to the blobs, which are what I'm interested in? Can I utilize the results from findContours() or cvBlob blobs to restrict the ROI and apply the maxima to each blob?
First of all: the function minMaxLoc finds only the global minimum and global maximum for a given input, so it is mostly useless for determining regional minima and/or regional maxima. But your idea is right: extracting markers based on regional minima/maxima in order to perform a Watershed Transform based on markers is totally fine. Let me try to clarify what the Watershed Transform is and how you should correctly use the implementation present in OpenCV.
A decent number of papers that deal with watershed describe it similarly to what follows (I might miss some detail; if you are unsure, ask). Consider the surface of some region you know; it contains valleys and peaks (among other details that are irrelevant for us here). Suppose below this surface all you have is water, colored water. Now, make holes in each valley of your surface, and the water starts to fill the whole area. At some point, differently colored waters will meet, and when this happens, you construct a dam such that they don't touch each other. In the end you have a collection of dams, which is the watershed separating all the different colored waters.
Now, if you make too many holes in that surface, you end up with too many regions: over-segmentation. If you make too few you get an under-segmentation. So, virtually any paper that suggests using watershed actually presents techniques to avoid these problems for the application the paper is dealing with.
I wrote all this (which is possibly too naïve for anyone that knows what the Watershed Transform is) because it reflects directly on how you should use watershed implementations (which the current accepted answer is doing in a completely wrong manner). Let us start on the OpenCV example now, using the Python bindings.
The image presented in the question is composed of many objects that are mostly too close and in some instances overlapping. The usefulness of watershed here is to separate correctly these objects, not to group them into a single component. So you need at least one marker for each object and good markers for the background. As an example, first binarize the input image by Otsu and perform a morphological opening for removing small objects. The result of this step is shown below in the left image. Now with the binary image consider applying the distance transform to it, result at right.
With the distance transform result, we can consider some threshold such that we consider only the regions most distant to the background (left image below). Doing this, we can obtain a marker for each object by labeling the different regions after the earlier threshold. Now, we can also consider the border of a dilated version of the left image above to compose our marker. The complete marker is shown below at right (some markers are too dark to be seen, but each white region in the left image is represented at the right image).
This marker we have here makes a lot of sense. Each colored water == one marker will start to fill the region, and the watershed transformation will construct dams to impede that the different "colors" merge. If we do the transform, we get the image at left. Considering only the dams by composing them with the original image, we get the result at right.
import sys
import cv2
import numpy
from scipy.ndimage import label

def segment_on_dt(a, img):
    border = cv2.dilate(img, None, iterations=5)
    border = border - cv2.erode(border, None)

    dt = cv2.distanceTransform(img, 2, 3)
    dt = ((dt - dt.min()) / (dt.max() - dt.min()) * 255).astype(numpy.uint8)
    _, dt = cv2.threshold(dt, 180, 255, cv2.THRESH_BINARY)
    lbl, ncc = label(dt)

    lbl = lbl * (255 / (ncc + 1))
    # Completing the markers now.
    lbl[border == 255] = 255

    lbl = lbl.astype(numpy.int32)
    cv2.watershed(a, lbl)

    lbl[lbl == -1] = 0
    lbl = lbl.astype(numpy.uint8)
    return 255 - lbl

img = cv2.imread(sys.argv[1])

# Pre-processing.
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, img_bin = cv2.threshold(img_gray, 0, 255, cv2.THRESH_OTSU)
img_bin = cv2.morphologyEx(img_bin, cv2.MORPH_OPEN, numpy.ones((3, 3), dtype=int))

result = segment_on_dt(img, img_bin)
cv2.imwrite(sys.argv[2], result)

result[result != 255] = 0
result = cv2.dilate(result, None)
img[result == 255] = (0, 0, 255)
cv2.imwrite(sys.argv[3], img)
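As a usage note, the script above takes the input image and two output paths from the command line, e.g. python watershed.py input.png result.png overlay.png (file names are just placeholders): the first output is the segmentation produced by segment_on_dt, and the second is the input image with the watershed dams painted in red.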
I would like to explain a simple piece of code on how to use watershed here. I am using OpenCV-Python, but I hope you won't have any difficulty understanding it.
In this code, I will be using watershed as a tool for foreground-background extraction. (This example is the Python counterpart of the C++ code in the OpenCV cookbook.) This is a simple case to understand watershed. Apart from that, you can use watershed to count the number of objects in this image. That would be a slightly advanced version of this code.
1 - First we load our image, convert it to grayscale, and threshold it with a suitable value. I used Otsu's binarization, so it finds the best threshold value.
import cv2
import numpy as np
img = cv2.imread('sofwatershed.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
Below is the result I got:
(Even that result is good, because of the great contrast between the foreground and background images.)
2 - Now we have to create the marker. The marker is an image with the same size as the original image, of type 32SC1 (32-bit signed, single channel).
Now there will be some regions in the original image where you are simply sure that they belong to the foreground. Mark such regions with 255 in the marker image. Regions where you are sure to be the background are marked with 128. Regions you are not sure about are marked with 0. That is what we are going to do next.
A - Foreground region: We have already got a thresholded image where the pills are white. We erode them a little, so that we are sure the remaining region belongs to the foreground.
fg = cv2.erode(thresh,None,iterations = 2)
fg :
B - Background region: Here we dilate the thresholded image so that the background region is reduced. But we are sure the remaining black region is 100% background. We set it to 128.
bgt = cv2.dilate(thresh,None,iterations = 3)
ret,bg = cv2.threshold(bgt,1,128,1)
Now we get bg as follows :
C - Now we add both fg and bg :
marker = cv2.add(fg,bg)
Below is what we get :
Now we can clearly understand from the above image that the white region is 100% foreground, the gray region is 100% background, and we are not sure about the black region.
Then we convert it into 32SC1 :
marker32 = np.int32(marker)
3 - Finally we apply watershed and convert result back into uint8 image:
cv2.watershed(img,marker32)
m = cv2.convertScaleAbs(marker32)
m :
4 - We threshold it properly to get the mask and perform bitwise_and with the input image:
ret,thresh = cv2.threshold(m,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
res = cv2.bitwise_and(img,img,mask = thresh)
res :
Hope it helps!!!
ARK
Foreword
I'm chiming in mostly because I found both the watershed tutorial in the OpenCV documentation (and C++ example) as well as mmgp's answer above to be quite confusing. I revisited a watershed approach multiple times to ultimately give up out of frustration. I finally realized I needed to at least give this approach a try and see it in action. This is what I've come up with after sorting out all of the tutorials I've come across.
Aside from being a computer vision novice, most of my trouble probably had to do with my requirement to use the OpenCVSharp library rather than Python. C# doesn't have baked-in high-power array operators like those found in NumPy (though I realize this has been ported via IronPython), so I struggled quite a bit in both understanding and implementing these operations in C#. Also, for the record, I really despise the nuances of, and inconsistencies in most of these function calls. OpenCVSharp is one of the most fragile libraries I've ever worked with. But hey, it's a port, so what was I expecting? Best of all, though -- it's free.
Without further ado, let's talk about my OpenCVSharp implementation of the watershed, and hopefully clarify some of the stickier points of watershed implementation in general.
Application
First of all, make sure watershed is what you want and understand its use. I am using stained cell plates, like this one:
It took me a good while to figure out I couldn't just make one watershed call to differentiate every cell in the field. On the contrary, I first had to isolate a portion of the field, then call watershed on that small portion. I isolated my region of interest (ROI) via a number of filters, which I will explain briefly here:
Start with source image (left, cropped for demonstration purposes)
Isolate the red channel (left middle)
Apply adaptive threshold (right middle)
Find contours then eliminate those with small areas (right)
Once we have cleaned the contours resulting from the above thresholding operations, it is time to find candidates for watershed. In my case, I simply iterated through all contours greater than a certain area.
Code
Say we've isolated this contour from the above field as our ROI:
Let's take a look at how we'll code up a watershed.
We'll start with a blank mat and draw only the contour defining our ROI:
var isolatedContour = new Mat(source.Size(), MatType.CV_8UC1, new Scalar(0, 0, 0));
Cv2.DrawContours(isolatedContour, new List<List<Point>> { contour }, -1, new Scalar(255, 255, 255), -1);
In order for the watershed call to work, it will need a couple of "hints" about the ROI. If you're a complete beginner like me, I recommend checking out the CMM watershed page for a quick primer. Suffice to say we're going to create hints about the ROI on the left by creating the shape on the right:
To create the white part (or "background") of this "hint" shape, we'll just Dilate the isolated shape like so:
var kernel = Cv2.GetStructuringElement(MorphShapes.Ellipse, new Size(2, 2));
var background = new Mat();
Cv2.Dilate(isolatedContour, background, kernel, iterations: 8);
To create the black part in the middle (or "foreground"), we'll use a distance transform followed by threshold, which takes us from the shape on the left to the shape on the right:
This takes a few steps, and you may need to play around with the lower bound of your threshold to get results that work for you:
var foreground = new Mat(source.Size(), MatType.CV_8UC1);
Cv2.DistanceTransform(isolatedContour, foreground, DistanceTypes.L2, DistanceMaskSize.Mask5);
Cv2.Normalize(foreground, foreground, 0, 1, NormTypes.MinMax); //Remember to normalize!
foreground.ConvertTo(foreground, MatType.CV_8UC1, 255, 0);
Cv2.Threshold(foreground, foreground, 150, 255, ThresholdTypes.Binary);
Then we'll subtract these two mats to get the final result of our "hint" shape:
var unknown = new Mat(); //this variable is also named "border" in some examples
Cv2.Subtract(background, foreground, unknown);
Again, if we Cv2.ImShow unknown, it would look like this:
Nice! This was easy for me to wrap my head around. The next part, however, got me quite puzzled. Let's look at turning our "hint" into something the Watershed function can use. For this we need to use ConnectedComponents, which is basically a big matrix of pixels grouped by the virtue of their index. For example, if we had a mat with the letters "HI", ConnectedComponents might return this matrix:
0 0 0 0 0 0 0 0 0
0 1 0 1 0 2 2 2 0
0 1 0 1 0 0 2 0 0
0 1 1 1 0 0 2 0 0
0 1 0 1 0 0 2 0 0
0 1 0 1 0 2 2 2 0
0 0 0 0 0 0 0 0 0
So, 0 is the background, 1 is the letter "H", and 2 is the letter "I". (If you get to this point and want to visualize your matrix, I recommend checking out this instructive answer.) Now, here's how we'll utilize ConnectedComponents to create the markers (or labels) for watershed:
var labels = new Mat(); //also called "markers" in some examples
Cv2.ConnectedComponents(foreground, labels);
labels = labels + 1;
//this is a much more verbose port of numpy's: labels[unknown==255] = 0
for (int x = 0; x < labels.Width; x++)
{
    for (int y = 0; y < labels.Height; y++)
    {
        //You may be able to just send "int" in rather than "char" here:
        var labelPixel = (int)labels.At<char>(y, x);   //note: x and y are inexplicably
        var borderPixel = (int)unknown.At<char>(y, x); //and infuriatingly reversed
        if (borderPixel == 255)
            labels.Set(y, x, 0);
    }
}
Note that the Watershed function requires the border area to be marked by 0. So, we've set any border pixels to 0 in the label/marker array.
At this point, we should be all set to call Watershed. However, in my particular application, it is useful just to visualize a small portion of the entire source image during this call. This may be optional for you, but I first just mask off a small bit of the source by dilating it:
var mask = new Mat();
Cv2.Dilate(isolatedContour, mask, new Mat(), iterations: 20);
var sourceCrop = new Mat(source.Size(), source.Type(), new Scalar(0, 0, 0));
source.CopyTo(sourceCrop, mask);
And then make the magic call:
Cv2.Watershed(sourceCrop, labels);
Results
The above Watershed call will modify labels in place. You'll have to go back to remembering about the matrix resulting from ConnectedComponents. The difference here is, if watershed found any dams between watersheds, they will be marked as "-1" in that matrix. Like the ConnectedComponents result, different watersheds will be marked in a similar fashion of incrementing numbers. For my purposes, I wanted to store these into separate contours, so I created this loop to split them up:
var watershedContours = new List<Tuple<int, List<Point>>>();
for (int x = 0; x < labels.Width; x++)
{
    for (int y = 0; y < labels.Height; y++)
    {
        var labelPixel = labels.At<Int32>(y, x); //note: x, y switched
        var connected = watershedContours.Where(t => t.Item1 == labelPixel).FirstOrDefault();
        if (connected == null)
        {
            connected = new Tuple<int, List<Point>>(labelPixel, new List<Point>());
            watershedContours.Add(connected);
        }
        connected.Item2.Add(new Point(x, y));
        if (labelPixel == -1)
            sourceCrop.Set(y, x, new Vec3b(0, 255, 255));
    }
}
Then, I wanted to print these contours with random colors, so I created the following mat:
var watershed = new Mat(source.Size(), MatType.CV_8UC3, new Scalar(0, 0, 0));
foreach (var component in watershedContours)
{
    if (component.Item2.Count < (labels.Width * labels.Height) / 4 && component.Item1 >= 0)
    {
        var color = GetRandomColor();
        foreach (var point in component.Item2)
            watershed.Set(point.Y, point.X, color);
    }
}
Which yields the following when shown:
If we draw on the source image the dams that were marked by a -1 earlier, we get this:
Edits:
I forgot to note: make sure you're cleaning up your mats after you're done with them. They WILL stay in memory, and OpenCVSharp may present you with some unintelligible error message. I should really be using using statements above, but mat.Release() is an option as well.
Also, mmgp's answer above includes this line: dt = ((dt - dt.min()) / (dt.max() - dt.min()) * 255).astype(numpy.uint8), which is a histogram stretching step applied to the results of the distance transform. I omitted this step for a number of reasons (mostly because I didn't think the histograms I saw were too narrow to begin with), but your mileage may vary.
