In the context of image processing for edge detection or in my case a basic SIFT implementation:
When taking the 'difference' of 2 Gaussian blurred images, you are bound to get pixels whose difference is negative (the pixels are originally between 0 and 255; after subtraction they can be anywhere between -255 and 255). What is the normal approach to 'fixing' this? I don't think taking the absolute value is correct in this situation.
There are two different approaches depending on what you want to do with the output.
The first is to offset the output by 128, so that values in the range -128 to 127 map to 0 to 255 (differences outside that range still have to be scaled or clipped).
The second is to clamp negative values so that they all equal zero.
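As a minimal sketch of the two options, assuming the blurred images are NumPy uint8 arrays (the function names are made up for illustration), the subtraction is done in a wider signed type first and only then offset or clamped:

```python
import numpy as np

def signed_difference(blur_a, blur_b):
    """Subtract two uint8 images without wrap-around by widening to int16."""
    return blur_a.astype(np.int16) - blur_b.astype(np.int16)

def offset_by_128(diff):
    """Option 1: shift zero to mid-grey (128), then clip to the 8-bit range."""
    return np.clip(diff + 128, 0, 255).astype(np.uint8)

def clamp_negatives(diff):
    """Option 2: set negative differences to zero and keep positive ones."""
    return np.clip(diff, 0, 255).astype(np.uint8)

a = np.array([[10, 200]], dtype=np.uint8)
b = np.array([[50, 100]], dtype=np.uint8)
diff = signed_difference(a, b)
print(diff, offset_by_128(diff), clamp_negatives(diff))
```

Keeping the int16 result around is probably the safer choice if later steps still need the sign of the difference.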
I have a 8-bit image and I want to filter it with a matrix for edge detection. My kernel matrix is
0 1 0
1 -4 1
0 1 0
For some indices it gives me a negative value. What am I supposed to do with them?
Your kernel is a Laplace filter. Applying it to an image yields a finite difference approximation to the Laplacian operator. The Laplace operator is not an edge detector by itself.
But you can use it as a building block for an edge detector: you need to detect the zero crossings to find edges (this is the Marr-Hildreth edge detector). To find zero crossings, you need to have negative values.
You can also use the Laplace filtered image to sharpen your image. If you subtract it from the original image, the result will be an image with sharper edges and a much crisper feel. For this, negative values are important too.
For both these applications, clamping the result of the operation, as suggested in the other answer, is wrong. That clamping sets all negative values to 0. This means there are no more zero crossings to find, so you can't find edges, and for the sharpening it means that one side of each edge will not be sharpened.
So, the best thing to do with the result of the Laplace filter is preserve the values as they are. Use a signed 16-bit integer type to store your results (I actually prefer using floating-point types, it simplifies a lot of things).
On the other hand, if you want to display the result of the Laplace filter on a screen, you will have to do something sensible with the pixel values. A common choice is to add 128 to each pixel. This shifts the zero to a mid-grey value, shows negative values as darker, and positive values as lighter. After adding 128, values above 255 and below 0 can be clipped. You can also scale the values first if you want to reduce clipping, for example laplace / 2 + 128.
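A rough sketch of this in NumPy, assuming a 2D greyscale array as input. The function names and the choice of replicating edge pixels at the border are assumptions made for the example, not part of the answer:

```python
import numpy as np

def laplace_filter(image):
    """Apply the kernel [[0,1,0],[1,-4,1],[0,1,0]]; result is signed float."""
    img = np.pad(image.astype(np.float64), 1, mode="edge")
    return (img[:-2, 1:-1] + img[2:, 1:-1]      # neighbours above and below
            + img[1:-1, :-2] + img[1:-1, 2:]    # neighbours left and right
            - 4.0 * img[1:-1, 1:-1])            # centre pixel

def laplace_for_display(lap):
    """Map the signed result to 8 bits: halve, add 128, clip what remains."""
    return np.clip(lap / 2 + 128, 0, 255).astype(np.uint8)

def sharpen(image, lap):
    """Subtract the Laplacian from the original, clipping only at the end."""
    return np.clip(image - lap, 0, 255).astype(np.uint8)

image = np.zeros((3, 3), dtype=np.uint8)
image[1, 1] = 100
lap = laplace_filter(image)
print(lap)
print(laplace_for_display(lap))
```

The filtered array stays in floating point until the very last step, so negative values survive for zero-crossing detection or sharpening.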
Out of range values are extremely common in JPEG. One handles them by clamping.
If X < 0 then X := 0 ;
If X > 255 then X := 255 ;
I am trying to solve a high-resolution binary segmentation problem. Positive pixels are all marked with the same value, while negative pixels are all zero. The input image is scaled to 1/4 size by bicubic interpolation.
After scaling, the pixel values of the positive labels are no longer all the same. So how should I process these label images so that it is still a binary segmentation problem? Should I set every pixel larger than 0 to positive, or only the pixels larger than some threshold?
If the answer is the latter, how should I set the threshold?
I suggest you do not use built-in resize functions such as zoom or imresize. Suppose you have a binary mask of size 225 * 225; then the central point is (113, 113). Starting from this central point, sub-sample the points in all four directions with equal steps (for example, 4). You will find that there are 4 different ways to sample; average them.
I would like to know the difference between contrast stretching and histogram equalization.
I have tried both using OpenCV and observed the results, but I still have not understood the main differences between the two techniques. Insights would be of much needed help.
Let's define contrast first.
Contrast is a measure of the "range" of an image, i.e. how spread out its intensities are. It has many formal definitions; a well-known one is Michelson's:
contrast = (Imax - Imin) / (Imax + Imin)
Contrast is strongly tied to an image's overall visual quality. Ideally, we'd like images to use the entire range of values available to them.
Contrast stretching and histogram equalisation have the same goal: making the image use the entire range of values available to it.
But they use different techniques.
Contrast stretching works as a mapping:
It maps the minimum intensity in the image to the minimum value of the range (for example, 84 ==> 0).
In the same way, it maps the maximum intensity in the image to the maximum value of the range (for example, 153 ==> 255).
This is why contrast stretching is unreliable: if the image already contains even a couple of pixels with intensities 0 and 255, the stretch changes nothing and is totally useless.
However a better approach is Histogram Equalisation which uses probability distribution. You can learn the steps here
I came across the following points after some reading.
Contrast stretching is all about increasing the difference between the maximum intensity value in an image and the minimum one. All the rest of the intensity values are spread out between this range.
Histogram equalization is about modifying the intensity values of all the pixels in the image such that the histogram is "flattened" (in reality, the histogram can't be exactly flattened, there would be some peaks and some valleys, but that's a practical problem).
In contrast stretching, there exists a one-to-one relationship of the intensity values between the source image and the target image i.e., the original image can be restored from the contrast-stretched image.
However, once histogram equalization is performed, there is no way of getting back the original image.
In Histogram equalization, you want to flatten the histogram into a uniform distribution.
In contrast stretching, you manipulate the entire range of intensity values, much as you do in normalization.
Contrast stretching is a linear normalization that stretches an arbitrary interval of the intensities of an image and fits it to another arbitrary interval (usually the target interval is the possible minimum and maximum of the image, like 0 and 255).
Histogram equalization is a nonlinear normalization that stretches the area of histogram with high abundance intensities and compresses the area with low abundance intensities.
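To make the linear versus nonlinear distinction concrete, here is a small NumPy sketch, assuming an 8-bit greyscale image. The function names are invented for the example, and the equalization uses the common cumulative-histogram lookup table, which is one of several possible formulations:

```python
import numpy as np

def contrast_stretch(image):
    """Linear map: image minimum -> 0, image maximum -> 255."""
    lo, hi = int(image.min()), int(image.max())
    if hi == lo:                       # flat image, nothing to stretch
        return np.zeros_like(image)
    scaled = (image.astype(np.float64) - lo) * 255.0 / (hi - lo)
    return np.round(scaled).astype(np.uint8)

def equalize_histogram(image):
    """Nonlinear map: each grey level is replaced by its scaled CDF value."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = np.cumsum(hist)
    cdf_min = cdf[cdf > 0][0]          # CDF at the darkest level present
    denom = image.size - cdf_min
    if denom == 0:                     # flat image, nothing to equalize
        return np.zeros_like(image)
    lut = np.round((cdf - cdf_min) * 255.0 / denom)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]

img = np.array([[84, 100], [120, 153]], dtype=np.uint8)
print(contrast_stretch(img))    # spacing between levels is preserved
print(equalize_histogram(img))  # levels are spread by how often they occur
```

On this tiny input the stretch keeps the relative spacing of the four grey levels, while the equalization spaces them evenly because each level occurs once.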
I think that contrast stretching broadens the histogram of the image intensity levels, so the intensity around the range of input may be mapped to the full intensity range.
Histogram equalization, on the other hand, maps all of the pixels to the full range according to the cumulative distribution function or probability.
Contrast is the difference between maximum and minimum pixel intensity.
Both methods are used to enhance contrast, more precisely, adjusting image intensities to enhance contrast.
During histogram equalization the overall shape of the histogram changes, whereas in contrast stretching the overall shape of the histogram remains the same.
I want to use the FFT to accelerate 2D convolution. The filter is 15 x 15 and the image is 300 x 300. The filter's size is different from the image's, so I cannot multiply them element by element after the FFT. How should I transform the filter before the FFT so that its size matches the image?
I use the convention that N is the kernel size.
Since the convolution is not defined (mathematically) on the edges (N//2 pixels at each end of each dimension), you would lose N - 1 pixels in total on each axis for an odd N.
You need to make room for the convolution: pad the image with enough "neutral values" so that the edge cases (junk values inserted there) disappear.
For a 15 x 15 kernel this means padding 7 pixels on every side, making your image a 314x314px image (with suitable padding values, see next paragraph), which after convolution gives back a 300x300 image.
Popular image processing libraries have this already embedded : when you ask for a convolution, you have extra arguments specifying the "mode".
Which values can we pad with?
Stolen with no shame from NumPy's pad documentation:
'constant' : Pads with a constant value.
'edge' : Pads with the edge values of array.
'linear_ramp' : Pads with the linear ramp between end_value and the array edge value.
'maximum' : Pads with the maximum value of all or part of the vector along each axis.
'mean' : Pads with the mean value of all or part of the vector along each axis.
'median' : Pads with the median value of all or part of the vector along each axis.
'minimum' : Pads with the minimum value of all or part of the vector along each axis.
'reflect' : Pads with the reflection of the vector mirrored on the first and last values of the vector along each axis.
'symmetric' : Pads with the reflection of the vector mirrored along the edge of the array.
'wrap' : Pads with the wrap of the vector along the axis. The first values are used to pad the end and the end values are used to pad the beginning.
It's up to you, really, but the rule of thumb is "choose neutral values for the task at hand".
(For instance, padding with 0 when doing averaging makes little sense, because 0 is not neutral in an average of positive values)
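For a feel of what a few of these modes do, here is a short sketch using numpy.pad. The helper pad_for_kernel is a hypothetical name introduced only for this example; it pads N//2 pixels on every side, following the convention above:

```python
import numpy as np

def pad_for_kernel(image, kernel_size, mode="reflect"):
    """Pad kernel_size // 2 pixels on every side using the given np.pad mode."""
    return np.pad(image, kernel_size // 2, mode=mode)

row = np.array([1, 2, 3, 4])
for mode in ("constant", "edge", "reflect", "symmetric", "wrap"):
    print(f"{mode:10s}", np.pad(row, 2, mode=mode))
# constant   [0 0 1 2 3 4 0 0]
# edge       [1 1 1 2 3 4 4 4]
# reflect    [3 2 1 2 3 4 3 2]
# symmetric  [2 1 1 2 3 4 4 3]
# wrap       [3 4 1 2 3 4 1 2]

padded = pad_for_kernel(np.zeros((10, 10)), kernel_size=5)
print(padded.shape)
```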
It depends on the algorithm you use for the FFT, because most of them need to work with images of dyadic dimensions (powers of 2).
Here is what you have to do:
1. Padding image: center your image in a bigger one with dyadic dimensions.
2. Padding kernel: center your convolution kernel in an image with the same dimensions as in step 1.
3. FFT on the image from step 1.
4. FFT on the kernel from step 2.
5. Complex multiplication (in Fourier space) of the results from steps 3 and 4.
6. Inverse FFT on the resulting image from step 5.
7. Unpadding of the resulting image from step 6.
8. Put the 4 blocks (quadrants) of the result back into the right order.
If the algorithm you use does not need dyadic dimensions, then step 1 is unnecessary and step 2 becomes a simple padding of the kernel to the image dimensions.
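The non-dyadic case can be sketched with NumPy's FFT, which accepts arbitrary sizes. Note that this is a variant of the recipe above rather than a literal transcription: instead of centering the kernel and reordering quadrants afterwards, it zero-pads both arrays at the top-left to the full linear-convolution size and crops the centre at the end. The function name is made up for the example:

```python
import numpy as np

def fft_convolve2d(image, kernel):
    """Zero-padded linear 2D convolution via FFT, cropped to image.shape."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    # Size of the full linear convolution; padding to it avoids wrap-around.
    fh, fw = ih + kh - 1, iw + kw - 1
    # rfft2 zero-pads each input to shape s before transforming.
    spectrum = np.fft.rfft2(image, s=(fh, fw)) * np.fft.rfft2(kernel, s=(fh, fw))
    full = np.fft.irfft2(spectrum, s=(fh, fw))
    # Crop the central part so the output lines up with the input image.
    top, left = (kh - 1) // 2, (kw - 1) // 2
    return full[top:top + ih, left:left + iw]

rng = np.random.default_rng(0)
image = rng.random((300, 300))
kernel = np.ones((15, 15)) / 225.0    # simple box blur as a stand-in filter
result = fft_convolve2d(image, kernel)
print(result.shape)
```

Padding to the full size implicitly treats everything outside the image as zero; for other border behaviour the image would have to be padded explicitly before the transform.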
I am trying to blend two images using the Poisson blending technique. I have written the program and solved the system of linear equations separately for each of the r, g, b channels. After solving the equations, the rgb values go out of bounds, with each value greater than 255. If I clamp each value to 255, the resulting image becomes white, as all three channels are now 255. My question is: can the rgb values be greater than 255 after solving the Poisson equation? How can I get a properly blended image in this case?
I think you need to change your scale for color values. According to the formula given in most of the online sites (set of equations), they consider the color value to be in the 0 to 1 range. Convert your 0 - 255 scale to floating point values between 0 - 1 and see.
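A minimal sketch of that conversion, assuming 8-bit NumPy arrays per channel. The helper names are hypothetical, and the Poisson solve itself is left out:

```python
import numpy as np

def to_unit_float(channel_u8):
    """Convert an 8-bit channel to floating point in the range 0 to 1."""
    return channel_u8.astype(np.float64) / 255.0

def to_uint8(channel_f):
    """Clip a solved channel to 0..1 and convert it back to 8 bits."""
    return np.round(np.clip(channel_f, 0.0, 1.0) * 255.0).astype(np.uint8)

source = np.array([0, 128, 255], dtype=np.uint8)
print(to_unit_float(source))

# Stand-in for a solver output that slightly overshoots the valid range.
solved = np.array([-0.2, 0.4, 1.3])
print(to_uint8(solved))
```

If nearly every value still ends up far above 1 after this change, that would suggest a problem in how the linear system is assembled rather than in the final conversion.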