I am unclear about how the input data is preprocessed in YOLO-V4.
If the input is a grayscale image with 16 bits per pixel, i.e. pixel values in [0, 2^16) instead of [0, 2^8), it is mentioned that the values are scaled, so I want to ask:
1- Are the values scaled to the range -1 to 1 or 0 to 1?
2- Which method is used for the scaling (where can I find the code for it)?
3- Is the scaling done using the maximum value in the image or the maximum of the bit depth, i.e. in my case will the maximum used for scaling be 2^16, or the largest value that occurs in the image, for example 2000?
Thanks in advance.
(1) - from 0 to 1.
Now scaling works as v = pix / 255.
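To make question (3) concrete, here is a small sketch of the three candidate scalings. This is not the YOLO code; the function names are made up for illustration, and it only shows the arithmetic: with a fixed divisor of 255, 16-bit values land well above 1 unless they are reduced to 8 bits first.

```python
def scale_fixed(pixels, divisor=255.0):
    """Divide every pixel by a fixed constant, as in v = pix / 255."""
    return [p / divisor for p in pixels]


def scale_by_bit_depth(pixels, bits=16):
    """Divide by the largest value the bit depth can hold (65535 for 16 bits)."""
    full_scale = float((1 << bits) - 1)
    return [p / full_scale for p in pixels]


def scale_by_image_max(pixels):
    """Divide by the largest value that actually occurs in this image."""
    max_value = max(pixels)
    if max_value == 0:
        return [0.0 for _ in pixels]
    return [p / max_value for p in pixels]


sample = [0, 255, 2000, 65535]      # hypothetical 16-bit pixel values
fixed = scale_fixed(sample)         # last value is far above 1
by_depth = scale_by_bit_depth(sample)
by_max = scale_by_image_max(sample)
```

Only the last two keep 16-bit data inside 0 to 1; they differ in whether the result depends on the content of each image.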
I am trying to work out the difference between Erosion and Dilation for binary and grayscale images.
As far as I know, this is erosion/dilation for binary images...
Erosion: If every pixel corresponding to an SE index that has 1 is a 1, output a 1. Otherwise 0.
Dilation: If at least one pixel corresponding to an SE index that has 1 is a 1, output a 1. Otherwise 0.
My question is, how does this work for 16-bit (0, 65535) grayscale images?
So what we have to do is create a structuring element, which could be for example:
The formula for dilation is:
image http://utam.gg.utah.edu/tomo03/03_mid/HTML/img642.png
and for erosion:
image http://utam.gg.utah.edu/tomo03/03_mid/HTML/img643.png
That means we have to take the maximum (for dilation) or minimum (for erosion) of the image values under the kernel and add 10 to it. If we have for example:
using dilation it becomes:
As you can see, you look at pixel position x,y, take the center and add 10 to it. Then you check the neighbors to see whether the computed value is the maximum. If it is a new maximum, the pixel value gets replaced; if not, the pixel value stays. I hope that is clear. For erosion you just take the minimum.
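As a rough sketch of the idea, the following uses the common textbook convention that dilation takes the maximum of image value plus structuring-element value, and erosion takes the minimum of image value minus structuring-element value. The function name, the use of None for positions outside the element, the clamping to the 16-bit range and the border handling (neighbors outside the image are skipped) are all assumptions for illustration. It also assumes a symmetric structuring element whose center is set, so the reflection in the dilation formula does not matter.

```python
def grey_morph(image, se, mode="dilate", max_value=65535):
    """Grayscale dilation or erosion of a 2D list of ints.

    se is a 2D list of offsets; None marks positions outside the element.
    """
    rows, cols = len(image), len(image[0])
    se_rows, se_cols = len(se), len(se[0])
    cy, cx = se_rows // 2, se_cols // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            candidates = []
            for j in range(se_rows):
                for i in range(se_cols):
                    if se[j][i] is None:
                        continue
                    yy, xx = y + j - cy, x + i - cx
                    if 0 <= yy < rows and 0 <= xx < cols:
                        if mode == "dilate":
                            candidates.append(image[yy][xx] + se[j][i])
                        else:
                            candidates.append(image[yy][xx] - se[j][i])
            value = max(candidates) if mode == "dilate" else min(candidates)
            # Keep the result inside the valid 16-bit range.
            out[y][x] = min(max(value, 0), max_value)
    return out


image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
se = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]   # 3x3 element with value 10
dilated = grey_morph(image, se, "dilate")
eroded = grey_morph(image, se, "erode")
```

With an all-zero (flat) structuring element this reduces to a plain local maximum or minimum filter, which is the direct generalisation of the binary rules in the question.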
Given an image (like the one given below) I need to convert it into a binary image (black and white pixels only). This sounds easy enough, and I have tried two thresholding functions. The problem is I can't get perfect edges using either of them. Any help would be greatly appreciated.
The filters I have tried threshold on the Euclidean distance in the RGB and HSV spaces.
Sample image:
Here it is after running an RGB threshold filter. (At 40%; there are more artifacts beyond this.)
Here it is after running an HSV threshold filter. (At 30% the paths become barely visible, but the result is clearly unusable because of the noise.)
The code I am using is pretty straightforward. Convert the input image to the appropriate color space and check the Euclidean distance to the color black.
sqrt(R*R + G*G + B*B)
since I am comparing with black (0, 0, 0)
Your problem appears to be the variation in lighting over the scanned image, which suggests that a locally adaptive thresholding method would give you better results.
The Sauvola method calculates the value of a binarized pixel based on the mean and standard deviation of pixels in a window of the original image. This means that if an area of the image is generally darker (or lighter) the threshold will be adjusted for that area and (likely) give you fewer dark splotches or washed-out lines in the binarized image.
http://www.mediateam.oulu.fi/publications/pdf/24.p
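For reference, a direct and slow sketch of the Sauvola rule, using the commonly quoted threshold T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation. The defaults k = 0.5 and R = 128 are the values usually cited for 8-bit images. The window size, the function name and the border handling (windows are clipped at the image edge) are assumptions on my part.

```python
import math


def sauvola_binarize(gray, window=15, k=0.5, r=128.0):
    """Binarize a 2D list of 8-bit values: 255 = background, 0 = ink."""
    rows, cols = len(gray), len(gray[0])
    half = window // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Collect the window around (y, x), clipped at the borders.
            values = [gray[yy][xx]
                      for yy in range(max(0, y - half), min(rows, y + half + 1))
                      for xx in range(max(0, x - half), min(cols, x + half + 1))]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            std = math.sqrt(variance)
            threshold = mean * (1.0 + k * (std / r - 1.0))
            out[y][x] = 255 if gray[y][x] > threshold else 0
    return out


# Light page (200) with a single dark pixel (20) in the middle.
page = [[200] * 5 for _ in range(5)]
page[2][2] = 20
binary = sauvola_binarize(page, window=5)
```

This recomputes the statistics for every window, so the cost grows with the window area; avoiding that cost is the point of the integral-image variant mentioned next.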
I also found a method by Shafait et al. that implements the Sauvola method with greater time efficiency. The drawback is that you have to compute two integral images of the original, one at 8 bits per pixel and the other potentially at 64 bits per pixel, which might present a problem with memory constraints.
http://www.dfki.uni-kl.de/~shafait/papers/Shafait-efficient-binarization-SPIE08.pdf
I haven't tried either of these methods, but they do look promising. I found Java implementations of both with a cursory Google search.
Running an adaptive threshold over the V channel in the HSV color space should produce very good results. The best results should come with a window larger than 11x11, and don't forget to choose a negative value for the threshold constant.
Adaptive thresholding basically is:
if (Pixel value + constant > Average pixel value in the window around the pixel)
    Pixel_Binary = 1;
else
    Pixel_Binary = 0;
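A runnable version of that rule might look like the sketch below. It works on a plain 2D list of V-channel values; the function name, the default parameters and the border handling (the window is clipped at the image edge) are assumptions for illustration.

```python
def adaptive_threshold(gray, window=11, constant=-5):
    """Return 1 where pixel + constant exceeds the local window mean, else 0."""
    rows, cols = len(gray), len(gray[0])
    half = window // 2
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            total = 0
            count = 0
            for yy in range(max(0, y - half), min(rows, y + half + 1)):
                for xx in range(max(0, x - half), min(cols, x + half + 1)):
                    total += gray[yy][xx]
                    count += 1
            mean = total / count
            out[y][x] = 1 if gray[y][x] + constant > mean else 0
    return out


# Light background (200) with one dark pixel (20) in the middle.
image = [[200] * 5 for _ in range(5)]
image[2][2] = 20
result = adaptive_threshold(image, window=3, constant=10)
```

The sign of the constant decides how flat regions are classified: with a positive constant a uniform area comes out as 1, with a negative one it comes out as 0.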
Due to the noise and the illumination variation you may need an adaptive local thresholding, thanks to Beaker for his answer too.
Therefore, I tried the following steps:
Convert it to grayscale.
Apply mean or median local thresholding. I used 10 for the window size and 10 for the intercept constant and got this image (smaller values might also work):
Please refer to http://homepages.inf.ed.ac.uk/rbf/HIPR2/adpthrsh.htm if you need more information on this technique.
To make sure the thresholding was working fine, I skeletonized it to see if there is a line break. This skeleton may be the one needed for further processing.
To get rid of the remaining noise you can just find the largest connected component in the skeletonized image.
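The last step could be sketched like this, assuming the skeleton is a 2D list of 0/1 values and that diagonal neighbors count as connected (8-connectivity). The function name is made up for illustration.

```python
from collections import deque


def largest_component(binary):
    """Keep only the connected component with the most pixels."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for y in range(rows):
        for x in range(cols):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            # Breadth-first search over one component.
            component = []
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    out = [[0] * cols for _ in range(rows)]
    for y, x in best:
        out[y][x] = 1
    return out


skeleton = [[1, 1, 1, 0, 0],
            [0, 0, 0, 0, 1],
            [0, 0, 0, 0, 0]]
cleaned = largest_component(skeleton)
```

The isolated pixel on the right is dropped and the three-pixel line is kept.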
Thank you.
You probably want to do this as a three-step operation.
1- Use leveling, not just thresholding: take the input and scale the intensities (gamma correct) with parameters that simply dull the mid tones, without removing the darks or the lights (your RGB threshold is too strong, for instance; you lost some of your lines).
2- Edge-detect the resulting image using a small kernel convolution (5x5 for binary images should be more than enough). Use a simple [1 2 3 2 1 ; 2 3 4 3 2 ; 3 4 5 4 3 ; 2 3 4 3 2 ; 1 2 3 2 1] kernel (normalised).
3- Threshold the resulting image. You should now have a much better binary image.
You could try a black top-hat transform. This involves subtracting the image from the closing of the image. I used a structuring element window size of 11 and a constant threshold of 0.1 (25.5 on a 255 scale).
You should get something like:
Which you can then easily threshold:
Best of luck.
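For anyone who wants to try the black top-hat without a library, here is a minimal sketch. It assumes a flat square structuring element and windows clipped at the image border; the function names are made up for illustration.

```python
def _window_filter(image, size, pick):
    """Apply pick (max or min) over a size x size window around each pixel."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    return [[pick(image[yy][xx]
                  for yy in range(max(0, y - half), min(rows, y + half + 1))
                  for xx in range(max(0, x - half), min(cols, x + half + 1)))
             for x in range(cols)]
            for y in range(rows)]


def black_top_hat(image, size=11):
    """Closing (dilation followed by erosion) minus the original image."""
    dilated = _window_filter(image, size, max)
    closed = _window_filter(dilated, size, min)
    return [[c - p for c, p in zip(closed_row, row)]
            for closed_row, row in zip(closed, image)]


def threshold(image, level):
    """Return 1 where the value exceeds level, else 0."""
    return [[1 if p > level else 0 for p in row] for row in image]


# Light background (200) with one dark pixel (20) in the middle.
image = [[200] * 5 for _ in range(5)]
image[2][2] = 20
lines = threshold(black_top_hat(image, size=3), 25.5)
```

Dark details narrower than the window show up as bright values in the top-hat result, while slow changes in background brightness are largely cancelled out, which is why a single constant threshold works afterwards.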
I have a 48-bit (16 bits per channel) image I've loaded with FreeImage. I'm trying to generate a histogram from this image without having to convert it to a 24-bit image.
This is how I understand histograms are calculated:
for (pixel in pixels)
{
    red_histo[pixel.red]++;
}
Here pixel.red can be between 0 and 255, so there is a range from 0 to 255 on my histogram. But with 16 bits per channel, it could be between 0 and 65535, which is too large to be displayed on a histogram.
Is there a standard way to calculate histograms with 48-bit (or higher) images?
You have to decide how many bins you need in the histogram. For example, the MATLAB imhist function takes these forms:
imhist(I)
imhist(I, n)
imhist(X, map)
In the first case, the number of bins defaults to 256. So, if you have 16-bit input, the values are scaled down to 8 bits and split into a 256-bin histogram.
In the second one, you specify the number of bins n. Let's say you specify n=2 for your 16-bit data. Then this essentially splits the histogram into the ranges [0, 2^15 - 1] and [2^15, 2^16 - 1].
The third form is for indexed images: X holds indices into the colormap map, and the histogram gets one bin per colormap entry.
http://www.mathworks.com/help/images/ref/imhist.html
How you want to choose the number of bins depends on your requirement.
This Stack Overflow question may have the answer you are looking for.
I do not know if there is a "standard" way.
If this is for display purposes, you can scale the pixels back down to the range 0-255, for instance:
double scalingFactor = 255.0 / 65535.0;  // floating-point division; 255/65535 is 0 in integer arithmetic
for (pixel in pixels)
{
    red_histo[(int)(scalingFactor * pixel.red)]++;
}
This will allow the upper range of the 16 bit pixel to come in at 255 and lower range of the 16 bit pixel to come in at 0.
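Here is a small sketch of the binning idea with a configurable number of bins. Note that it uses equal-width bins (value * bins // 65536) rather than the 255/65535 scale factor above, so every bin covers the same number of input values. The function name and parameters are made up for illustration.

```python
def binned_histogram(values, bins=256, bits=16):
    """Count values into equal-width bins covering 0 .. 2**bits - 1."""
    levels = 1 << bits
    counts = [0] * bins
    for v in values:
        # Integer arithmetic maps 0..levels-1 onto 0..bins-1.
        counts[v * bins // levels] += 1
    return counts


red_channel = [0, 255, 256, 65535]          # hypothetical 16-bit samples
histo = binned_histogram(red_channel)        # 256 bins of width 256
coarse = binned_histogram([0, 32767, 32768, 65535], bins=2)
```

With 256 bins and 16-bit data this is the same as using the high byte of each value as the bin index.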
I am using the .NET AForge libraries to sharpen an image. The "Sharpen" filter uses the following matrix.
0 -1 0
-1 5 -1
0 -1 0
This does in fact sharpen the image, but I need to sharpen the image more aggressively and based on a numeric range, let's say 1-100.
Using AForge, how do I transform this matrix with numbers 1 through 100, where 1 is almost not noticeable and 100 is very noticeable?
Thanks in advance!
The one property of a filter like this that must be maintained is that all the values sum to 1. You can subtract 1 from the middle value, multiply by some constant, then add 1 back to the middle and it will be scaled properly. Play around with the range (100 is almost certainly too large) until you find something that works.
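The arithmetic can be sketched as follows. This is plain Python rather than AForge code, and the function name and the strength parameter are made up for illustration; how to map 1-100 onto a strength value is left to experimentation.

```python
def scale_sharpen_kernel(kernel, strength):
    """Scale a sharpening kernel while keeping its sum equal to 1.

    Subtracts 1 from the center, multiplies everything by strength,
    then adds 1 back to the center.
    """
    center = len(kernel) // 2
    scaled = [[value * strength for value in row] for row in kernel]
    scaled[center][center] = (kernel[center][center] - 1) * strength + 1
    return scaled


base = [[0, -1, 0],
        [-1, 5, -1],
        [0, -1, 0]]

stronger = scale_sharpen_kernel(base, 2)   # center becomes 9, edges -2
unchanged = scale_sharpen_kernel(base, 1)  # same as the base kernel
identity = scale_sharpen_kernel(base, 0)   # no sharpening at all
```

A strength of 0 gives the identity kernel and 1 gives the original matrix, so fractional values between 0 and 1 produce a milder effect than the built-in filter.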
You might also try using a larger filter matrix, or one that has values in the corners as well.
I would also suggest looking at the GaussianSharpen class and adjusting the sigma value.
In the context of image processing for edge detection or in my case a basic SIFT implementation:
When taking the 'difference' of 2 Gaussian blurred images, you are bound to get pixels whose difference is negative (they are originally between 0 - 255, when subtracting they are possibly between -255 - 255). What is the normal approach to 'fixing' this? I don't see taking the absolute value to be very correct in this situation.
There are two different approaches depending on what you want to do with the output.
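The first is to offset the output by 128, so that a calculation range of -128 to 127 maps to 0 to 255. Since the raw difference can span -255 to 255, values outside that range must be clamped, or the difference halved before the offset is added.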
The second is to clamp negative values so that they all equal zero.
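Both approaches can be sketched in a few lines, assuming the difference image is a 2D list of signed integers. The function names are made up for illustration.

```python
def offset_and_clamp(diff, offset=128):
    """Shift the signed difference by offset and clamp to 0..255."""
    return [[min(255, max(0, d + offset)) for d in row] for row in diff]


def clamp_negative(diff):
    """Set every negative difference to zero, leave the rest unchanged."""
    return [[max(0, d) for d in row] for row in diff]


difference = [[-255, -128, 0, 127, 255]]   # hypothetical blurred_a - blurred_b
shifted = offset_and_clamp(difference)     # zero difference lands on mid-gray
positive_only = clamp_negative(difference)
```

The offset version keeps the sign information (values below 128 were negative) at the cost of clipping large differences, while the clamped version discards the negative half entirely.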