OpenCV: Making a Difference-of-Gaussians filter

Is there a simple way of making a DoG kernel to filter an image? I know it is possible to generate the two kernels manually and then subtract them from each other, but isn't there a smarter way to do this?

Another way of doing it is to write out the analytic formula for your kernel, with the needed parameters, and evaluate it at every pixel position inside the kernel.
getDoG(i, j, sigma1Big, sigma2Big, sigma1Small, sigma2Small,
       rotationBig, rotationSmall, kernelSize, ...);
Do not ask me for the formula :)
But the easiest way is to make two kernels with the correct parameters and subtract them.
Do not forget to normalize the kernel (scale the values so that the sum of all kernel values is 1).
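As a minimal Python/OpenCV sketch of the two-kernel route (the file name, kernel size, and sigmas are placeholders; note that two unit-sum Gaussians give a difference that sums to roughly zero, so rescale afterwards according to whichever normalization you need):

import cv2
import numpy as np

# Build the DoG kernel by subtracting two Gaussian kernels.
# Each 1-D kernel from getGaussianKernel sums to 1; the outer product
# turns it into a 2-D kernel, and the difference of the two sums to ~0.
size = 9
g_big = cv2.getGaussianKernel(size, 2.0)    # coarser scale
g_small = cv2.getGaussianKernel(size, 1.0)  # finer scale
dog = g_small @ g_small.T - g_big @ g_big.T

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
filtered = cv2.filter2D(img.astype(np.float32), -1, dog)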

Related

How to make the labels of superpixels locally consistent in a gray-level map?

I have a bunch of gray-scale images decomposed into superpixels. Each superpixel in these images has a label in the range [0, 1]. You can see one sample of these images below.
Here is the challenge: I want the spatially (locally) neighboring superpixels to have consistent labels (close in value).
I'm kind of interested in smoothing local labels but do not want to apply Gaussian smoothing functions or whatever, as some colleagues suggested. I have also heard about Conditional Random Field (CRF). Is it helpful?
Any suggestion would be welcome.
I'm kind of interested in smoothing local labels but do not want to apply Gaussian smoothing functions or whatever, as some colleagues suggested.
And why is that? Why do you not consider the helpful advice of your colleagues, who are actually right? Applying a smoothing function is the most reasonable way to go.
I have also heard about Conditional Random Field (CRF). Is it helpful?
This also suggests that you should rather go with your colleagues' advice, as a CRF does not fit your problem: it is a classifier (a sequence classifier, to be exact), requiring labeled examples to learn from, and has nothing to do with the setting presented.
What are typical approaches?
Exactly what your colleagues proposed: you should define a smoothing function and apply it to your function values and their neighbourhoods (I will not use the term "labels", as it is misleading; you have continuous values in [0, 1], while "label" denotes a categorical variable in machine learning).
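As a minimal sketch of such a smoothing step (in Python, with made-up values and neighbour lists), each value is pulled towards the mean of its neighbourhood:

import numpy as np

# Hypothetical inputs: one value per superpixel and a neighbour list.
values = np.array([0.2, 0.8, 0.75, 0.3])
neighbours = [[1], [0, 2], [1, 3], [2]]
alpha = 0.5  # how strongly to pull towards the neighbourhood mean

smoothed = np.array([
    (1 - alpha) * values[i] + alpha * values[neighbours[i]].mean()
    for i in range(len(values))
])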
Another approach would be to define some optimization problem, where your current assignment of values is one goal, and the second one is "closeness", for example:
Let us assume that you have points with values {(x_i, y_i)}_{i=1}^N and that n(x) returns indices of neighbouring points of x.
Consequently you are trying to find {a_i}_{i=1}^N such that they minimize
SUM_{i=1}^N (y_i - a_i)^2 + C * SUM_{i=1}^N SUM_{j \in n(x_i)} (a_i - a_j)^2

where the first sum measures closeness to the current values, the second sum measures closeness to the neighbouring values, and the constant C weights the two parts against each other.
You can solve the above optimization problem using many techniques, for example through the scipy.optimize.minimize function.
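A small sketch of that route, with toy values y_i and neighbour lists (scipy's minimize is a generic solver; since the objective is quadratic, a dedicated linear solver would also work):

import numpy as np
from scipy.optimize import minimize

y = np.array([0.2, 0.8, 0.75, 0.3])      # current values y_i (toy data)
neighbours = [[1], [0, 2], [1, 3], [2]]  # n(x_i) as index lists
C = 0.5                                  # weight of the smoothness term

def objective(a):
    data_term = np.sum((y - a) ** 2)
    smooth_term = sum((a[i] - a[j]) ** 2
                      for i in range(len(a)) for j in neighbours[i])
    return data_term + C * smooth_term

result = minimize(objective, x0=y)  # start from the current values
print(result.x)                     # the smoothed assignment a_i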
I am not sure that your request makes any sense.
Having close label values for nearby superpixels is trivial: take some smooth function of (X, Y), such as a constant or an affine function taking values in the range [0, 1], and assign the function value to the superpixel centered at (X, Y).
You could also take the distance function from any point in the plane.
But this is of no use as it is unrelated to the image content.

Implementation of image dilation and erosion

I am trying to figure out an efficient way of implementing image dilation and erosion for binary images. As far as I understand it, the naive way would be:
loop through the image
    if pixel is 1
        loop through the neighborhood based on the structuring
            element's height and width
        (dilate) substitute each pixel of the image with the value
            in the corresponding location of the SE
        (erode) check if the whole neighborhood matches the SE; if so,
            keep all the pixels, else delete the centre
so this means that for each pixel I have to loop through the SE as well, making this O(N*M*W*H) for an N×M image and a W×H structuring element.
Is there a more elegant way of doing this?
Yes there are!!!
First you want to decompose (if possible) your structuring element into segments (a square, for example, is composed of a vertical and a horizontal segment). Then you perform erosion/dilation only on the segments, which already decreases the complexity.
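For example (a Python/OpenCV sketch with an illustrative 7x7 square; the same idea applies to dilation):

import cv2
import numpy as np

# Eroding with a 7x1 horizontal segment and then a 1x7 vertical segment
# is equivalent to eroding with the full 7x7 square, but costs O(W + H)
# comparisons per pixel instead of O(W * H).
img = (np.random.rand(256, 256) > 0.5).astype(np.uint8)  # toy binary image

horizontal = cv2.getStructuringElement(cv2.MORPH_RECT, (7, 1))
vertical = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 7))
separable = cv2.erode(cv2.erode(img, horizontal), vertical)

square = cv2.getStructuringElement(cv2.MORPH_RECT, (7, 7))
assert (separable == cv2.erode(img, square)).all()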
Now for the erosion/dilation parts, you have different solutions:
If you work only on 8-bit images and do not use C/C++, use an implementation based on histograms in order to keep track of the minimum/maximum value. See this remarkable work here. He even adds "landmarks" in order to reduce the number of operations.
If you use C/C++ and work on different types of image encodings, then you can use fast comparisons (SSE2, SSE4 and auto-vectorization), as is the case in the SMIL library. In this case, you compare row with row instead of working pixel by pixel, using hardware acceleration. It seems to be the fastest library ever.
A last way, slower but working for all types of encoding, is to use the Lemmonier algorithm. It is implemented in the fulguro library.
For disk-shaped structuring elements, there is nothing "fast"; you have to use the basic algorithm. For hexagonal structuring elements, you can work row by row, but it cannot be parallelized.

GPUImage: taking the sum of the columns of an image

I'm using GPUImage in my project and I need an efficient way of taking the column sums. The naive way would obviously be retrieving the raw data and adding the values of every column. Can anybody suggest a faster way?
One way to do this would be to use the approach I take with the GPUImageAverageColor class (as described in this answer), except instead of reducing the total size of each frame at each step, only do this for one dimension of the image.
The average color filter determines the average color of the overall image by stepping down in a factor of four in both X and Y, averaging 16 pixels into one at each step. If operating in a single direction, you should be able to use hardware interpolation to get an 18X reduction in a single direction per step with good performance. Your final step might either require a quick CPU-based iteration on the much smaller image or a tweaked version of this shader that pulls the last few pixels in a column together into the final result pixel for that column.
You'll notice that I've been talking about averaging here, because the output values for any OpenGL ES operation need to be in terms of colors, which only have a 0-255 range per channel. A sum would easily overflow this, but you could use an average as an approximation of your sum, with a more limited dynamic range.
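To make the reduction idea concrete, here is a CPU stand-in in Python/numpy (the real thing would be shader passes; a 4x step per pass and a power-of-four height are assumed for simplicity):

import numpy as np

img = np.random.rand(256, 320)  # toy grayscale image, height 256 = 4^4

# Each pass averages groups of four rows, shrinking the height 4x while
# keeping values in the displayable range.
col_avg = img
while col_avg.shape[0] > 1:
    col_avg = col_avg.reshape(col_avg.shape[0] // 4, 4, -1).mean(axis=1)

# The surviving row times the original height recovers the column sums.
col_sums_approx = col_avg[0] * img.shape[0]
assert np.allclose(col_sums_approx, img.sum(axis=0))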
If you only care about one color channel, you could possibly encode a larger value into the RGBA channels and maintain a 32-bit sum that way.
Beyond what I describe above, you could look at performing this sum with the help of the Accelerate framework. While probably not quite as fast as doing a shader-based reduction, it might be good enough for your needs.

Detect the two highest peaks from a histogram

I was trying to understand how to detect the two peaks from a histogram. There can be multiple peaks, but I need to pick the two highest. Basically, although these peaks may be shifted left or right, I need to get hold of them. Their spread can vary and their peak values might change, so I have to find a way to get hold of these two peaks in Matlab.
What I have done so far is to create a 5-value window. This window is populated with values from the histogram and a scan is performed. Each time I move 5 steps ahead to the next window and compare the previous window's value with the current one; whichever is greater is kept.
Is there a better way of doing this?
The simplest way to do this would be to first smooth the data using a Gaussian kernel to remove the high-frequency variations.
Then use the function localmax to find the local maximums.
Return the data from the hist (or histc) function to a variable (y = hist(x, bin);) and use the PEAKFINDER File Exchange submission to find the local maximums.
I have also used the PEAKDET function from Eli Billauer. It works great. You can check my answer here with a code example.
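For comparison, the same smooth-then-pick-peaks idea sketched in Python with scipy (a stand-in for the Matlab routines above; the histogram here is synthetic):

import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import find_peaks

# Synthetic bimodal histogram: two Gaussian bumps plus sampling noise.
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(60, 8, 5000), rng.normal(160, 12, 3000)])
counts, _ = np.histogram(samples, bins=256, range=(0, 255))

smoothed = gaussian_filter1d(counts.astype(float), sigma=3)  # remove noise
peaks, props = find_peaks(smoothed, height=0)

# Keep only the two highest peaks.
top_two = peaks[np.argsort(props["peak_heights"])[-2:]]
print(sorted(top_two))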

How can I transpose an image in Assembly?

I'm working on a project and I need to compute something based on the rows and columns of an image. It's easy to take the bits of the rows of the image. However, to take the bits of each column I need to transpose the image so the columns become rows.
I'm using a BMP picture as the input. How many rows and columns are in a BMP picture? I'd also like to see some pseudocode, if possible.
It sounds like you want to perform a matrix transpose, which is a little different from rotation. In rotation, the rows may become columns, but either the rows or the columns will be in reverse order, depending on the rotation direction. Transposition maintains the original ordering of the rows and columns.
I think using the right algorithm is much more important than whether you use assembly or just C. Rotation by 90 degrees or transposition really boils down to just moving memory. The biggest thing to consider is the effect of cache misses if you use a naive algorithm like this:
for (int x = 0; x < width; x++)
{
    for (int y = 0; y < height; y++)
        out[x][y] = in[y][x];
}
This will cause a lot of cache misses because you are jumping around in memory a lot. It is more efficient to use a block-based approach. Google for "cache efficient matrix transpose".
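As a sketch of the block-based idea (in Python/numpy for brevity; in C the same tile loops are what keep the working set in cache, and the block size of 32 is illustrative):

import numpy as np

# Process the matrix in BxB tiles so both the reads and the writes stay
# within a small working set instead of striding across the whole image.
def blocked_transpose(src, block=32):
    h, w = src.shape
    dst = np.empty((w, h), dtype=src.dtype)
    for by in range(0, h, block):
        for bx in range(0, w, block):
            tile = src[by:by + block, bx:bx + block]
            dst[bx:bx + block, by:by + block] = tile.T
    return dst

img = np.arange(12).reshape(3, 4)
assert (blocked_transpose(img, block=2) == img.T).all()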
One place you may be able to make some gains is using SSE instructions to move more than one piece of data at a time. These are available in assembly and in C. Also check out this link. About half way down they have a section on computing a fast matrix transpose.
edit:
I just saw your comment that you are doing this for a class in assembly so you can probably disregard most of what I said. I assumed you were looking to squeeze out the best performance since you were using assembly.
It varies. BMPs can have any size (up to a limit), and they can be in different formats too (32-bit RGB, 24-bit RGB, 16-bit RGB, 8-bit paletted, 1-bit monochrome, and so on).
As with most other problems, it's best to write a solution first in the high-level language of your choice and then convert parts or all of it to ASM as needed.
But yes, in its simplest form for this task, which would be the 32-bit RGB format, rotating by some multiple of 90 degrees will be like rotating a 2-D array.
