OpenCV - how to pass a pattern-matching kernel over a binary image?

I am implementing a contour-finding algorithm for pixel-wide contours in binary images. It needs to be robust against deletion of individual pixels (i.e. pixel-wide gaps).
Various attempts at dilation & erosion kernels have not yielded a reliable solution.
Instead, the reliable solution I want to implement is to pass a pattern-matching kernel over the image, which can directly fill in the gaps based on surrounding pixels. For example, when the exact pattern on the left is observed at a location, it is replaced with the one on the right (where * means wildcard):
[1 * *]       [1 * *]
[* 0 *]  ==>  [* 1 *]
[* * 1]       [* * 1]

[1 0 *]       [1 0 *]
[* 0 1]  ==>  [* 1 1]
[* * *]       [* * *]

[* 1 *]       [* 1 *]
[* 0 *]  ==>  [* 1 *]
[* 1 *]       [* 1 *]
I would then define the ~14 or so replacements necessary to fill in the possible gaps in each 3x3 window.
This could be implemented in raw Python, but it would likely be extremely slow without low-level vectorized operations.
Can this be done through OpenCV or some other fast operation?

Thanks to beaker's comment above, I implemented a solution. Design a kernel with the neighbouring pixels of interest set to 0.5 and the centre set to 1: the filter then fills in the centre with a 1 if it is missing, although some other pixels end up as 2. Clipping the values to 1 gives the desired result.
It needs to be applied independently for each direction of gap, which isn't ideal, but it still works.
import cv2
import numpy as np

img_with_gap = np.array(
    [[1, 0, 0, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 0, 0, 0, 0],   # the diagonal is broken by a one-pixel gap
     [0, 0, 0, 1, 0],
     [0, 0, 0, 0, 1]], dtype=np.uint8)

# 0.5 on the two diagonal neighbours of interest, 1 on the centre
kernel = np.array(
    [[0.5, 0, 0],
     [0,   1, 0],
     [0,   0, 0.5]])

# Fill the gap, then clip everything back down to {0, 1}
connected_img = np.minimum(cv2.filter2D(img_with_gap, -1, kernel), 1)
connected_img
An even tighter implementation is to do an exact pattern match by heavily penalizing the neighbours that must be zero, clipping the result to {0, 1}, and taking the maximum with the original image so that nothing is deleted from it:
kernel = np.array([[ 0.5, -10.0, -10.0],
                   [-10.0,   1.0, -10.0],
                   [-10.0, -10.0,   0.5]])

connected_img = np.maximum(img_with_gap, np.clip(cv2.filter2D(img_with_gap, -1, kernel), 0, 1))
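For reference, here is a sketch of one way to cover several gap directions with this exact-match idea. The gap_kernel helper and the choice of neighbour pairs are my own, not from the original answer, and it only handles the straight-through gaps; the remaining patterns from the question would need extra neighbour pairs.

import cv2
import numpy as np

def gap_kernel(dy, dx):
    """Hypothetical helper: exact-match kernel for a one-pixel gap whose two
    set neighbours sit at offsets (dy, dx) and (-dy, -dx) from the centre."""
    k = np.full((3, 3), -10.0)   # penalize every other neighbour
    k[1, 1] = 1.0                # keep pixels that are already set
    k[1 + dy, 1 + dx] = 0.5      # the two neighbours that must be set
    k[1 - dy, 1 - dx] = 0.5
    return k

def fill_gaps(img):
    out = img.copy()
    # vertical, horizontal, diagonal and anti-diagonal one-pixel gaps
    for dy, dx in [(1, 0), (0, 1), (1, 1), (1, -1)]:
        filled = np.clip(cv2.filter2D(img, -1, gap_kernel(dy, dx)), 0, 1)
        out = np.maximum(out, filled)
    return out

connected_img = fill_gaps(img_with_gap)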

Related

How to efficiently find correspondences between two point sets without nested for loop in Pytorch?

I have two point sets (tensors), A and B, shaped like:
A.size() >> (50, 3), e.g. [[0, 0, 0], [0, 1, 2], ..., [1, 1, 1]]
B.size() >> (10, 3)
where the first dimension is the number of points and the second dimension holds the coordinates (x, y, z).
To some extent, the question could also be simplified to "finding common elements between two tensors". Is there a quick way to do this without a nested loop?
You can quickly compute all the 50x10 distances using:
d2 = ((A[:, None, :] - B[None, ...])**2).sum(dim=2)
Once you have all the pair-wise distances, you can select "similar" ones if the distance does not exceed a threshold thr:
(d2 < thr).nonzero()
This returns (A-index, B-index) pairs of "similar" points.
If you want to match the points exactly, you can do instead:
((A[:, None, :] == B[None, ...]).all(dim=2)).nonzero()
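A small runnable sketch of the same broadcasting trick, with example values chosen purely for illustration:

import torch

A = torch.tensor([[0, 0, 0], [0, 1, 2], [1, 1, 1]])   # (3, 3)
B = torch.tensor([[0, 1, 2], [5, 5, 5]])              # (2, 3)

# All pair-wise squared distances, shape (3, 2)
d2 = ((A[:, None, :] - B[None, ...]) ** 2).sum(dim=2)
print((d2 < 1).nonzero())   # pairs closer than the threshold

# Exact matches: rows where all three coordinates agree
print((A[:, None, :] == B[None, ...]).all(dim=2).nonzero())   # tensor([[1, 0]])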

What is the first parameter (gradients) of the backward method, in pytorch?

We have the following code from the pytorch documentation:
x = torch.randn(3)
x = Variable(x, requires_grad=True)
y = x * 2
while y.data.norm() < 1000:
    y = y * 2
gradients = torch.FloatTensor([0.1, 1.0, 0.0001])
y.backward(gradients)
What exactly is the gradients parameter that we pass into the backward method and based on what do we initialize it?
To fully answer your question would require a somewhat longer explanation that revolves around the details of how backpropagation or, more fundamentally, the chain rule works.
The short programmatic answer is that the backward function of a Variable computes the gradient of all variables in the computation graph attached to that Variable. (To clarify: if you have a = b + c, then the computation graph (recursively) points first to b, then to c, then to how those were computed, and so on.) These gradients are accumulated (summed) in the .grad attribute of those Variables. When you then call opt.step(), i.e. a step of your optimizer, it adds a fraction of that gradient to the value of these Variables.
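As a small illustration of that accumulation behaviour, here is a sketch using the current tensor API (where Variable has since been merged into Tensor):

import torch

# Calling backward() twice without zeroing the gradients sums them in .grad
w = torch.tensor([1.0, 2.0], requires_grad=True)
(w * 3).sum().backward()
(w * 3).sum().backward()
print(w.grad)    # tensor([6., 6.])  -> 3 + 3 accumulated

# A plain SGD step then moves w against that accumulated gradient
opt = torch.optim.SGD([w], lr=0.1)
opt.step()
print(w)         # tensor([0.4000, 1.4000], requires_grad=True)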
That said, there are two answers when you look at it conceptually. If you want to train a machine learning model, you normally want the gradient with respect to some loss function. In this case, the gradients computed will be such that the overall loss (a scalar value) decreases when applying the step function. In this special case, we want to compute the gradient with respect to a specific value, namely the unit-length step (so that the learning rate determines what fraction of the gradient gets applied). This means that if you have a loss function and you call loss.backward(), this computes the same as loss.backward(torch.FloatTensor([1.])).
While this is the common use case for backprop in DNNs, it is only a special case of general differentiation of functions. More generally, the symbolic differentiation package (autograd in this case, as part of pytorch) can be used to compute gradients of earlier parts of the computation graph with respect to any gradient at a root of whatever subgraph you choose. This is where the keyword argument gradient comes in useful, since you can provide this "root-level" gradient there, even for non-scalar functions!
To illustrate, here's a small example:
import torch
from torch import nn

a = nn.Parameter(torch.FloatTensor([[1, 1], [2, 2]]))
b = nn.Parameter(torch.FloatTensor([[1, 2], [1, 2]]))
c = torch.sum(a - b)
c.backward(None) # could be c.backward(torch.FloatTensor([1.])) for the same result
print(a.grad, b.grad)
prints:
Variable containing:
1 1
1 1
[torch.FloatTensor of size 2x2]
Variable containing:
-1 -1
-1 -1
[torch.FloatTensor of size 2x2]
While
a = nn.Parameter(torch.FloatTensor([[1, 1], [2, 2]]))
b = nn.Parameter(torch.FloatTensor([[1, 2], [1, 2]]))
c = torch.sum(a - b)
c.backward(torch.FloatTensor([[1, 2], [3, 4]]))
print(a.grad, b.grad)
prints:
Variable containing:
1 2
3 4
[torch.FloatTensor of size 2x2]
Variable containing:
-1 -2
-3 -4
[torch.FloatTensor of size 2x2]
and
a = nn.Parameter(torch.FloatTensor([[0, 0], [2, 2]]))
b = nn.Parameter(torch.FloatTensor([[1, 2], [1, 2]]))
c = torch.matmul(a, b)
c.backward(torch.FloatTensor([[1, 1], [1, 1]])) # we compute w.r.t. a non-scalar variable, so the gradient supplied cannot be scalar, either!
print(a.grad, b.grad)
prints:
Variable containing:
3 3
3 3
[torch.FloatTensor of size 2x2]
Variable containing:
2 2
2 2
[torch.FloatTensor of size 2x2]
and
a = nn.Parameter(torch.FloatTensor([[0, 0], [2, 2]]))
b = nn.Parameter(torch.FloatTensor([[1, 2], [1, 2]]))
c = torch.matmul(a, b)
c.backward(torch.FloatTensor([[1, 2], [3, 4]])) # we compute w.r.t. a non-scalar variable, so the gradient supplied cannot be scalar, either!
print(a.grad, b.grad)
prints:
Variable containing:
5 5
11 11
[torch.FloatTensor of size 2x2]
Variable containing:
6 8
6 8
[torch.FloatTensor of size 2x2]
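For reference, here is a minimal sketch of the last example with the current PyTorch API (Variable has since been merged into Tensor); backward(v) computes the vector-Jacobian product, and the numbers match the output above:

import torch

a = torch.tensor([[0., 0.], [2., 2.]], requires_grad=True)
b = torch.tensor([[1., 2.], [1., 2.]], requires_grad=True)
c = a @ b
c.backward(torch.tensor([[1., 2.], [3., 4.]]))   # gradient w.r.t. the non-scalar c
print(a.grad)   # tensor([[ 5.,  5.], [11., 11.]])
print(b.grad)   # tensor([[6., 8.], [6., 8.]])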

Co-occurrence for set prediction

I'm having trouble with set prediction. I thought I wanted to use co-occurrence to solve this, but now that I've attempted it I'm not sure it's the right tool to use.
I have a database with some data (each column corresponding to a specific item, each row corresponding to one set), e.g.:
data:
[[1 0 1 1]
[1 1 1 1]
[1 0 1 1]]
I calculate the co-occurrence matrix:
cooccur_matrix:
[[0 1 3 3]
[1 0 1 1]
[3 1 0 3]
[3 1 3 0]]
And now I have an incomplete set:
target:
[1 0 1 1]
The dot product of my co-occurrence matrix and this is:
prediction:
[6 3 6 6]
But that's not at all what I want. What I'm trying to get back is something like this:
prediction:
[1 0.33 1 1]
Or:
prediction:
[0 0.33 0 0]
Any thoughts on what I'm doing wrong? I'm fairly new to ML concepts, and this seems like a pretty simple problem.
It seems like what I really needed was cosine similarity applied to a co-occurrence matrix. This seems to work well (though it was hard to figure out).
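The answer doesn't show code, so here is a hedged sketch of one possible reading: build the co-occurrence matrix from the data and compute item-item cosine similarities from it. How those similarities are then turned into the final prediction is left open in the answer.

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

data = np.array([[1, 0, 1, 1],
                 [1, 1, 1, 1],
                 [1, 0, 1, 1]])

# Co-occurrence: how often each pair of items appears in the same set
cooccur = data.T @ data
np.fill_diagonal(cooccur, 0)      # matches the matrix in the question

# Cosine similarity between the co-occurrence profiles of the items
item_sim = cosine_similarity(cooccur)
print(item_sim.round(2))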

Scikit-learn Multiclass Naive Bayes with probabilities for y

I'm doing a tweet classification, where each tweet can belong to one of few classes.
The training set output is given as the probability of each sample belonging to each class.
E.g. tweet #1: C1-0.6, C2-0.4, C3-0.0 (C1, C2, C3 being classes)
I'm planning to use a Naive Bayes classifier with scikit-learn. I couldn't find a fit method in naive_bayes.py that takes per-class probabilities for training.
I need a classifier which accepts output probabilities for each class for the training set
(i.e. y.shape = [n_samples, n_classes]).
How can I process my data set to apply a Naive Bayes classifier?
This is not so easy, as "class probabilities" can have many interpretations.
In the case of an NB classifier and sklearn, the easiest procedure I see is:
Split (duplicate) your training samples according to the following rule:
given a sample (x, [p1, p2, ..., pk]) (where pi is the probability of the ith class), create artificial training samples:
(x, 1, p1), (x, 2, p2), ..., (x, k, pk). So you get k new observations, each "attached" to one class, and pi is treated as a sample weight, which NB (in sklearn) accepts.
Train your NB with fit(X, Y, sample_weight) (where X is a matrix of your x observations, Y is a vector of the classes from the previous step, and sample_weight is a vector of the pi values from the previous step).
For example, if your training set consists of two points:
( [0 1], [0.6 0.4] )
( [1 3], [0.1 0.9] )
You transform them to:
( [0 1], 1, 0.6)
( [0 1], 2, 0.4)
( [1 3], 1, 0.1)
( [1 3], 2, 0.9)
and train NB with
X = [[0, 1], [0, 1], [1, 3], [1, 3]]
Y = [1, 2, 1, 2]
sample_weight = [0.6, 0.4, 0.1, 0.9]
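A minimal runnable sketch of this procedure, assuming MultinomialNB (any sklearn naive Bayes variant whose fit() accepts sample_weight works the same way):

import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.array([[0, 1],
              [0, 1],
              [1, 3],
              [1, 3]])
y = np.array([1, 2, 1, 2])                      # duplicated samples, one per class
sample_weight = np.array([0.6, 0.4, 0.1, 0.9])  # the class probabilities as weights

clf = MultinomialNB()
clf.fit(X, y, sample_weight=sample_weight)
print(clf.predict_proba([[0, 1]]))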

mean image filter

I am starting to learn image filtering and am stumped on a question found on a website: applying a 3×3 mean filter twice does not produce quite the same result as applying a 5×5 mean filter once. However, a 5×5 convolution kernel can be constructed which is equivalent. What does this kernel look like?
Would appreciate help so that I can understand the subject better. Thanks.
Marcelo's answer is right. Another way of seeing it (it is easier to think about it in one dimension first): we know that the mean filter is equivalent to convolution with a rectangular window, and we know that convolution is a linear operation, which is also associative.
Now, applying a mean filter M to a signal X can be written as
Y = M * X
where * denotes convolution. Applying the filter twice then gives
Y = M * (M * X) = (M * M) * X = M2 * X
This says that filtering a signal twice with a mean filter is the same as filtering it once with an equivalent filter given by M2 = M * M. This amounts to applying the mean filter to itself, which gives a "smoother" filter (a triangular filter in this case).
The process can be repeated (see the first graph here), and it can be shown that the equivalent filter for many repetitions of a mean filter (N convolutions of the rectangular filter with itself) tends to a Gaussian filter. Furthermore, it can be shown that the Gaussian filter has the property you didn't find in the rectangular (mean) filter: two passes of a Gaussian filter are equivalent to another Gaussian filter.
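A quick way to see this numerically, as a one-dimensional numpy sketch:

import numpy as np

m = np.ones(3) / 3.0          # 1-D mean (rectangular) filter
m2 = np.convolve(m, m)        # filter equivalent to applying m twice
print(np.round(m2 * 9))       # [1. 2. 3. 2. 1.] -> triangular window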
3x3 mean:
[1 1 1]
[1 1 1] * 1/9
[1 1 1]
3x3 mean twice:
[1 2 3 2 1]
[2 4 6 4 2]
[3 6 9 6 3] * 1/81
[2 4 6 4 2]
[1 2 3 2 1]
How? Each cell contributes indirectly via one or more intermediate 3x3 windows. Consider the set of stage-1 windows that contribute to a given stage-2 computation. The number of such 3x3 windows that contain a given source cell determines that cell's contribution. The middle cell, for instance, is contained in all nine windows, so its contribution is 9 * 1/9 * 1/9. I don't know if I've explained it that well, so I hope it makes sense to you.
Actually I believe that 3x3 twice should give:
[1 2 3 2 1]
[2 4 6 4 2]
[3 6 9 6 3] * 1/81
[2 4 6 4 2]
[1 2 3 2 1]
The reason is that the sum of all the values must equal 1.
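The 5×5 kernel above can be checked numerically by convolving the 3×3 mean kernel with itself, for example with scipy:

import numpy as np
from scipy.signal import convolve2d

m = np.ones((3, 3)) / 9.0      # 3x3 mean kernel
m2 = convolve2d(m, m)          # full 2-D convolution -> 5x5 kernel
print(np.round(m2 * 81))       # the [1 2 3 2 1] pattern shown above
print(m2.sum())                # ~1.0, so the normalisation 1/81 is right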
