I'm hoping to work out how to improve image quality by doing a Matrix transformation on an image which essentially 'undoes' the softening/motion blur of an image.
If I was to apply transform A to a sharp image, what transform B would get me back to the original image if A is the following:
0 1 0
0 1 0
0 1 0
Whether this is possible at all depends on the transform.
Imagine a very basic (and admittedly contrived) blurring function: it maps every pixel value in the range [0-255] onto one of the values 127, 128 or 129. To invert this filter we would have to map the values back, but the information is already lost. For example, pixel values 1 and 5 are both converted to 127; when we later read a pixel value of 127, should we convert it back to 1 or to 5? We cannot know.
So keep in mind that some transforms are a one-way ticket. There are better scenarios, though. For example, a linear transform such as a rotation of an image can be completely reversed by transforming the image with the inverse of the rotation matrix:
A^(-1)=A^(T)
where A is the rotation matrix.
So the rotation is undone by applying the inverse transform: if the image I was warped with A, warping the result with A^(T) (which equals A^(-1)) recovers the original image.
Therefore, two things are needed to reverse a transform applied to an image: the transform must be mathematically invertible, and you must then apply the transform that is its mathematical inverse.
There are, of course, ways to sharpen an image without exactly inverting the original transform. If that is acceptable, here are some techniques for dealing with blurred images:
High pass filtering, simple but a classic: http://northstar-www.dartmouth.edu/doc/idl/html_6.2/Sharpening_an_Image.html
Deconvolution: https://en.wikipedia.org/wiki/Deconvolution
Variational methods (methods based on the calculus of variations): http://www.math.ucla.edu/~bertozzi/papers/moellerpaper.pdf
More can be found in literature.
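As an illustration of the first option, here is a minimal unsharp-masking sketch (a simple high-pass based sharpening) using OpenCV's C++ API; the file names, the Gaussian sigma and the sharpening weights are placeholders you would tune for your images:

#include <opencv2/opencv.hpp>

int main()
{
    // Load the blurred input (placeholder path).
    cv::Mat src = cv::imread("blurred.png", cv::IMREAD_GRAYSCALE);

    // Low-pass version of the image; sigma = 3 is an arbitrary starting point.
    cv::Mat blurred;
    cv::GaussianBlur(src, blurred, cv::Size(0, 0), 3);

    // sharpened = src + 0.5 * (src - blurred) = 1.5 * src - 0.5 * blurred
    cv::Mat sharpened;
    cv::addWeighted(src, 1.5, blurred, -0.5, 0, sharpened);

    cv::imwrite("sharpened.png", sharpened);
    return 0;
}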
When given an image such as this:
And not knowing the color of the object in the image, I would like to be able to automatically find the best H, S and V ranges to threshold the object itself, in order to get a result such as this:
In this example, I manually found the values and thresholded the image using cv::inRange. The output I'm looking for is the best H, S and V ranges (a min and max value each, six integer values in total) to threshold the given object in the image, without knowing in advance what color the object is. I need to use these values later on in my code.
Key points to remember:
- All given images will be of the same size.
- All given images will have the same dark background.
- All the objects I'll put in the images will be of full color.
I could brute force over all possible permutations of the 6 HSV range values, threshold with each one and find a clever way to figure out when the best blob was found (blob size, maybe?). That seems like a very cumbersome, slow and highly inefficient solution, though.
What would be good way to approach this? I did some research, and found that OpenCV has some machine learning capabilities, but I need to have the actual 6 values at the end of the process, and not just a thresholded image.
You could create a small 2-layer neural network for the task of dynamic HSV masking.
Steps:
- Create/generate ground-truth annotations: each training image paired with the HSV range that isolates the required object.
- Design a small neural network with at least one convolutional layer and one fully connected layer.
- Input: the m x n mask of the image after applying the HSV range from the ground truth.
- Output: an m x n binary mask of the image.
- Post-processing: multiply the mask with the original image to get the required object highlighted (a minimal sketch follows below).
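For that post-processing step, here is a minimal sketch (assuming OpenCV's C++ API) of applying a set of predicted HSV range values with cv::inRange and masking the original image; the six range values and the file names are placeholders:

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat bgr = cv::imread("object.png");                 // input image (placeholder path)
    cv::Mat hsv;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);

    // The six values the question asks for: Hmin, Smin, Vmin / Hmax, Smax, Vmax (placeholder values).
    cv::Scalar lower(30, 80, 80), upper(90, 255, 255);

    cv::Mat mask;
    cv::inRange(hsv, lower, upper, mask);                    // binary mask of the object

    cv::Mat highlighted = cv::Mat::zeros(bgr.size(), bgr.type());
    cv::bitwise_and(bgr, bgr, highlighted, mask);            // keep only the masked pixels

    cv::imwrite("highlighted.png", highlighted);
    return 0;
}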
I read on Wikipedia that to perform spatial filtering on an image we need a filter, for example a 3x3 one. What I don't understand is how we choose the values for the filter. Let's say the original image is greyscale, so its intensity goes from 0 to 255 (8 bits).
Another question is that if the image is 9x9, how can we apply the filter to boundary pixels of that image? If we choose to pad the image so the filter can work with all boundary pixels, what would be the value for new padded pixels?
Thank you very much
The values of the filter depend on what you want to achieve by filtering; there are many filter designs for specific tasks. For example, the simple filter f = [-1 1] performs first-order differencing on each pixel in the horizontal direction (an x-derivative), while its transpose f' does the same in the vertical direction (a y-derivative). The values are chosen precisely for that purpose, and the same reasoning applies to 3x3 filters. In general, the choice of values comes from designing two-dimensional finite impulse response (FIR) or infinite impulse response (IIR) filters.
You should keep in mind that the result of the filter operation at the borders is not that accurate. Filtering at border pixels is done by extrapolating out-of-range pixels, a process called border interpolation. OpenCV and similar image processing/computer vision libraries provide several ways to do this, for example:
Various border types, image boundaries are denoted with '|'
BORDER_REPLICATE: aaaaaa|abcdefgh|hhhhhhh
BORDER_REFLECT: fedcba|abcdefgh|hgfedcb
BORDER_REFLECT_101: gfedcb|abcdefgh|gfedcba
BORDER_WRAP: cdefgh|abcdefgh|abcdefg
BORDER_CONSTANT: iiiiii|abcdefgh|iiiiiii with some specified 'i'
The border pixels are then padded according to the mode you choose.
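A minimal sketch of applying a hand-written 3x3 kernel with an explicit border mode, using OpenCV's C++ API (the kernel values and the chosen border mode here are only illustrative):

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat src = cv::imread("input.png", cv::IMREAD_GRAYSCALE);   // placeholder path

    // A simple 3x3 sharpening kernel; the values depend on what the filter should achieve.
    cv::Mat kernel = (cv::Mat_<float>(3, 3) <<
                       0, -1,  0,
                      -1,  5, -1,
                       0, -1,  0);

    cv::Mat dst;
    // BORDER_REPLICATE repeats the outermost pixels to fill the padded area.
    cv::filter2D(src, dst, -1, kernel, cv::Point(-1, -1), 0, cv::BORDER_REPLICATE);

    cv::imwrite("filtered.png", dst);
    return 0;
}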
I'm new to image processing and I'm working on detecting lines in a document image. I read the theory of the Hough line transform, but I can't see why I must use Canny before calling that function in OpenCV, as many tutorials say. What's the point of finding edges in this case? The fact is that if I don't use Canny or a threshold before HoughLines(), the results are very messy. I hope someone can explain the reason to me.
2 of the tutorials I've read:
Imgproc Feature Detection
Hough Line Transform
Short Answer
cvCanny is used to detect edges; it also suppresses image noise (through its built-in Gaussian smoothing) before looking for strong intensity changes.
HoughLines, which uses the Hough Transform, is used to determine whether those edges form lines or not. The Hough Transform requires edges to be detected well in order to be efficient and to provide meaningful results.
Long Answer
The Limitations of the Hough Transform are described in more detail on Wikipedia.
The efficiency of the Hough Transform relies on the bins of accumulated votes being distinct, e.g. a direct contrast between a pixel and its surrounding neighbours or, if using a mask region, between a pixel region and its surrounding regions. If all bins had similar accumulated values, nothing would stand out as a line or circle. This is why the colour information is reduced (colour to grayscale, grayscale to black and white): to increase contrast.
The number of parameters in the Hough Transform also increases the spread of votes across the bins and the complexity of the transform, which is why normally only lines and circles are reliably detected with it: they can be described with few parameters (two for a line, three for a circle).
The edges need to be detected well before running the Hough Transform, otherwise its efficiency suffers further. Noisy images also don't work well with the Hough Transform unless the noise is removed beforehand.
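A minimal sketch of that pipeline with OpenCV's C++ API; the Canny and Hough thresholds below are placeholders you would tune for your document images:

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    cv::Mat gray = cv::imread("document.png", cv::IMREAD_GRAYSCALE);   // placeholder path

    // Edge detection first: produces the binary edge map the Hough transform expects.
    cv::Mat edges;
    cv::Canny(gray, edges, 50, 150);

    // Standard Hough transform: each returned Vec2f holds (rho, theta) of a detected line.
    std::vector<cv::Vec2f> lines;
    cv::HoughLines(edges, lines, 1, CV_PI / 180, 150);

    std::cout << "detected " << lines.size() << " lines\n";
    return 0;
}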
First of all, to detect lines you need to work on a boolean (binary) image: every pixel is either black or white, there is no grayscale.
For HoughLines() to work properly it needs this kind of image as input. That's the reason you have to use Canny or a threshold first: to convert the colour image matrix into a binary one.
Hough transformation
A line in a picture is actually an edge. The Hough transform scans the whole image and converts the Cartesian coordinates of every white pixel into the polar (rho, theta) parameter space; black pixels are left out. So you won't be able to get lines if you don't detect edges first, because HoughLines() doesn't know what to do with grayscale values.
Theoretically, you are correct. Finding edges is not absolutely required for the Hough Line algorithm to work.
The way the Hough transform works is basically that it takes every point and connects it to every other point, and whichever candidate lines have the most points voting for them are kept. For this, we need points. Canny creates those points. Theoretically you could use any sort of filter - isolate all blue or purple points and connect them, whatever - but edges work well.
The Hough also does not weight its lines or points. To the Hough, an image is binary - made up of either 1s or 0s, points or non-points. There is no need for grayscale, and Canny conveniently returns binary images.
That is why Canny is usually run before the Hough transform.
It all comes down to processing binary data:
complex data -> (a binary data, b binary data, c binary data, ...) (using canny(), sobel(), etc.)
a binary data -> function1() (using houghlines())
b binary data -> function2()
c binary data -> function3() ..
a binary data -X-> function2() ..
complex data -X-> function1() ..
HTH
I'd like to know the best strategy to compare groups of contours (in fact, edges resulting from Canny edge detection) from two pictures, in order to know which pair is most alike.
I have this image:
http://i55.tinypic.com/10fe1y8.jpg
And I would like to know how can I calculate which one of these fits best to it:
http://i56.tinypic.com/zmxd13.jpg
(it should be the one on the right)
Is there anyway to compare the contours as a whole?
I can easily rotate the images but I don't know what functions to use in order to calculate that the reference image on the right is the best fit.
Here is what I've already tried using OpenCV:
matchShapes function - I tried this function using two grayscale images and I always get the same result for every comparison image, and the value seems wrong as it is 0.0002.
So what I realized about matchShapes, though I'm not sure it's the correct assumption, is that the function works with pairs of contours and not full images. Now this is a problem because although I have the contours of the images I want to compare, there are hundreds of them and I don't know which ones should be "paired up".
So I also tried to compare all the contours of the first image against the other two with a for loop, but I might be comparing, for example, the contour of the '5' against the circle contour of the two reference images rather than against the '2' contour.
I also tried the simple cv::compare function and matchTemplate, neither with success.
Well, for this you have a couple of options depending on how robust you need your approach to be.
Simple Solutions (with assumptions):
For these methods, I'm assuming the images you supplied are what you are working with (i.e., the objects are already segmented and at approximately the same scale). Also, you will need to correct the rotation, at least in a coarse manner. You might do something like iteratively rotating the comparison image every 10, 30, 60, or 90 degrees, or whatever coarseness you feel you can get away with.
For example,
bestMetric = infinity            // SAD is a distance, so smaller is better
for(degrees = 10; degrees < 360; degrees += 10)
    coinRot = rotate(compareCoin, degrees)
    // you could also try Cosine Similarity, or even matchTemplate here.
    metric = SAD(coinRot, targetCoin)
    if(metric < bestMetric)
        bestMetric = metric
        coinRotation = degrees
Sum of Absolute Differences (SAD): This will allow you to quickly compare the images once you have determined an approximate rotation angle.
Cosine Similarity: This operates a bit differently, by treating each image as a 1D vector and then computing the high-dimensional angle between the two vectors. The better the match, the smaller the angle will be.
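A rough OpenCV C++ sketch of the rotate-and-compare loop above, using warpAffine for the rotation and absdiff/sum for the SAD metric; the image paths and the 10-degree step are placeholders, and both images are assumed to be the same size:

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // Both images assumed pre-segmented, grayscale and of equal size (placeholder paths).
    cv::Mat target  = cv::imread("target_coin.png",  cv::IMREAD_GRAYSCALE);
    cv::Mat compare = cv::imread("compare_coin.png", cv::IMREAD_GRAYSCALE);

    double bestMetric = 1e18;    // SAD is a distance: smaller means a better match
    int bestRotation = 0;
    cv::Point2f center(compare.cols / 2.0f, compare.rows / 2.0f);

    for (int degrees = 0; degrees < 360; degrees += 10)
    {
        cv::Mat rot = cv::getRotationMatrix2D(center, degrees, 1.0);
        cv::Mat rotated;
        cv::warpAffine(compare, rotated, rot, compare.size());

        cv::Mat diff;
        cv::absdiff(rotated, target, diff);
        double metric = cv::sum(diff)[0];     // sum of absolute differences

        if (metric < bestMetric)
        {
            bestMetric = metric;
            bestRotation = degrees;
        }
    }

    std::cout << "best rotation: " << bestRotation << " degrees\n";
    return 0;
}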
Complex Solutions (possibly more robust):
These solutions will be more complex to implement, but will probably yield more robust classifications.
Hausdorff Distance: This answer will give you an introduction to using this method. This solution will probably also need the rotation correction to work properly.
Fourier-Mellin Transform: This method is an extension of Phase Correlation, which can extract the rotation, scale, and translation (RST) transform between two images.
Feature Detection and Extraction: This method involves detecting "robust" (i.e., scale and/or rotation invariant) features in the image and comparing them against a set of target features with RANSAC, LMedS, or simple least squares. OpenCV has a couple of samples using this technique in matcher_simple.cpp and matching_to_many_images.cpp. NOTE: With this method you will probably not want to binarize the image, so there are more detectable features available.
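As one illustration of that last option (swapping in ORB, a rotation-invariant binary feature, rather than the features used in the cited samples), here is a minimal matching sketch with OpenCV's C++ API; detector settings and file names are placeholders, and a full pipeline would additionally filter the matches with RANSAC via findHomography:

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main()
{
    cv::Mat img1 = cv::imread("query.png",  cv::IMREAD_GRAYSCALE);   // placeholder paths
    cv::Mat img2 = cv::imread("target.png", cv::IMREAD_GRAYSCALE);

    // Detect keypoints and compute binary descriptors.
    cv::Ptr<cv::ORB> orb = cv::ORB::create();
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    orb->detectAndCompute(img1, cv::noArray(), kp1, desc1);
    orb->detectAndCompute(img2, cv::noArray(), kp2, desc2);

    // Brute-force matching with cross-check; Hamming distance suits ORB's binary descriptors.
    cv::BFMatcher matcher(cv::NORM_HAMMING, true);
    std::vector<cv::DMatch> matches;
    matcher.match(desc1, desc2, matches);

    std::cout << "found " << matches.size() << " matches\n";
    return 0;
}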
I want to smooth the contours of binarized images and think that erode is the best way to do it. I know that the normal way of working is to use cvDilate(src, dst, 0, iter); where 0 means the default 3x3 matrix.
The problem is that the 3x3 matrix produces a deep erosion in my images. How can I erode with a 2x2 matrix, or anything smaller than the default 3x3 matrix?
Here you have for your reference the results of using different kernels:
Regards!
If your goal is to have a binarized image with smooth edges then, if you have the original, it is better to apply something like a Gaussian blur with cvSmooth() to it before performing the binarization.
That said, you are not restricted to 3x3 kernels. cvDilate() takes an IplConvKernel produced by cvCreateStructuringElementEx(), and you can make a structuring element with any (rectangular) shape with that function.
However, a structuring element works relative to an anchor point that must have integer coordinates, so if you use a 2x2 matrix it cannot be centred on the pixel. In most cases it is therefore best to use structuring elements with an odd number of rows and columns.
What you could do is create a 3x3 structuring element where only the centre value and the values directly above, below, left and right of it are 1, like so:
0 1 0
1 1 1
0 1 0
rather than the default:
1 1 1
1 1 1
1 1 1
The first kernel will make for some slightly smoother edges.
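With the C++ API, a sketch of building that cross-shaped element and eroding with it could look as follows (the file names are placeholders):

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat binary = cv::imread("binary.png", cv::IMREAD_GRAYSCALE);   // placeholder path

    // 3x3 cross: only the centre and its 4-neighbours are set, matching the kernel above.
    cv::Mat cross = cv::getStructuringElement(cv::MORPH_CROSS, cv::Size(3, 3));

    cv::Mat eroded;
    cv::erode(binary, eroded, cross);   // milder than the default all-ones 3x3 element

    cv::imwrite("eroded.png", eroded);
    return 0;
}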
Here's a quick and dirty approach to tell you whether dilation/erosion will work for you:
Upsample your image.
Erode (dilate, open, close, whatever) with the smallest filter you can use (typically 3x3)
Downsample back to the original image size
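A minimal sketch of that upsample/erode/downsample trick; the 2x factor and nearest-neighbour interpolation are assumptions made to keep the image binary:

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat binary = cv::imread("binary.png", cv::IMREAD_GRAYSCALE);   // placeholder path

    // Upsample 2x so the 3x3 element acts like a smaller one relative to the content.
    cv::Mat big;
    cv::resize(binary, big, cv::Size(), 2.0, 2.0, cv::INTER_NEAREST);

    // Erode with the default 3x3 rectangular element.
    cv::Mat eroded;
    cv::erode(big, eroded, cv::Mat());

    // Downsample back to the original size.
    cv::Mat result;
    cv::resize(eroded, result, binary.size(), 0, 0, cv::INTER_NEAREST);

    cv::imwrite("smoothed.png", result);
    return 0;
}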
With the C API, you can create a dedicated IplConvKernel object of any kind and size with the function cvCreateStructuringElementEx(). If you use the C++ API (the dilate() function), the structuring element used for dilation can be any matrix (Mat) you want.
A kernel of all 1's used for convolution is a low-pass filter; dilation and erosion use the kernel differently. A dilation filter replaces each pixel with the darkest pixel in its 3x3 neighbourhood and an erosion filter replaces it with the lightest one - that is, assuming your background is light and your foreground object is dark. If you flip your background and foreground, dilation and erosion also swap roles (OpenCV's dilate()/erode() always take the neighbourhood maximum/minimum respectively, i.e. they treat the bright pixels as the foreground).
Also if you want to perform an 'open' operation, you perform an erosion followed by a dilation. Conversely a 'close' operation is a dilation followed by an erosion. Open tends to remove isolated clumps of pixels and close tends to fill in holes.
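A short sketch of opening and closing with the C++ API, which is one way to smooth the contours of a binarized image (the kernel size and shape here are illustrative):

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat binary = cv::imread("binary.png", cv::IMREAD_GRAYSCALE);   // placeholder path

    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(3, 3));

    cv::Mat opened, closed;
    cv::morphologyEx(binary, opened, cv::MORPH_OPEN,  kernel);   // erode then dilate: removes specks
    cv::morphologyEx(binary, closed, cv::MORPH_CLOSE, kernel);   // dilate then erode: fills holes

    cv::imwrite("opened.png", opened);
    cv::imwrite("closed.png", closed);
    return 0;
}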
Erosion and dilation matrices should be of odd order
- a 2*2 matrix has no centre pixel to serve as the anchor
Kernel matrices should therefore be of order 1*1, 3*3, 5*5, 7*7 ... in other words, ODD.
Try applying a close operation - dilate and then erode the image - using the cvMorphologyEx() function.