I am confused about whether image mean subtraction is helpful in my use case.
I am training a SegNet network on road images, and I am subtracting the mean during training.
When I compare the images before and after mean subtraction, the image without mean subtraction appears to have more features and more detailed pixel information.
I understand the importance of mean subtraction: it reduces the effect of varied illumination and also helps the gradient calculation during training. But doesn't that mean I lose some important information? I am attaching images for reference.
Original
With mean subtraction
Looking at the pictures above, my assumption is that the network can learn more features from the image without mean subtraction, especially around the cars (which are very important here). The image with mean subtraction is mostly dark around the cars.
An explanation, or a link to a source that might explain this, would be greatly appreciated. Thank you.
There are two things to consider here:
The pixel values themselves
How those pixel values are shown on your screen
If your image is of an unsigned integer type (say uint8) and you subtract the mean without casting the image to a different type, you will likely destroy image information. For example, if your image contains the pixel values
204 208 100 75 86
and the mean is 100.3, the uint8 result of subtracting that mean is either
104 108 0 0 0 -- saturated subtraction
or
104 108 0 231 242 -- C-style subtraction
depending on whether you use saturated subtraction or C-style arithmetic. In both cases the image no longer contains the same information it did before.
The correct approach, of course, is to use floating-point values:
103.7 107.7 -0.3 -25.3 -14.3 -- floating-point subtraction
In this case, the data still contains exactly the same information, only now it is zero-mean.
Now, how to display this zero-mean image on the screen? You can either map every value <0 to 0 and every value >255 to 255, so that values outside of the valid range [0,255] are saturated; or you can find the minimum and maximum values in your data and linearly map the pixel values to the valid range. In the first case it will look like you corrupted the image (as in your example); in the second case it will look like not much has changed in the image. That is to say, there is no way of showing the image on screen that still makes the effect of mean subtraction visible.
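To make the above concrete, here is a minimal NumPy sketch (array and variable names are mine) of subtracting the mean in floating point and then mapping the result back to a displayable range in the two ways just described:

import numpy as np

# Toy uint8 "image" using the pixel values from the example above.
img_u8 = np.array([204, 208, 100, 75, 86], dtype=np.uint8)
mean = 100.3

# Correct: cast to float first, so negative values are preserved.
zero_mean = img_u8.astype(np.float32) - mean   # 103.7, 107.7, -0.3, -25.3, -14.3

# Display option 1: saturate to [0, 255]; values below zero show as black.
clipped = np.clip(zero_mean, 0, 255).astype(np.uint8)

# Display option 2: linearly rescale min..max to 0..255; the image then
# looks almost unchanged, even though the data is now zero-mean.
rescaled = ((zero_mean - zero_mean.min()) /
            (zero_mean.max() - zero_mean.min()) * 255).astype(np.uint8)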
In the docs for Kernel Routine Rules, it says 'A kernel routine computes an output pixel by using an inverse mapping back to the corresponding pixels of the input images. Although you can express most pixel computations this way—some more naturally than others—there are some image processing operations for which this is difficult, if not impossible. For example, computing a histogram is difficult to describe as an inverse mapping to the source image.'
However, Apple is obviously doing it somehow, because they do have a CIAreaHistogram Core Image filter that does just that.
I can see one theoretical way to do it with the given limitations:
Let's say you wanted a 256-element red-channel histogram...
You have a 256x1 pixel output image. The kernel function gets called for each of those 256 pixels. The kernel function would have to read EVERY PIXEL IN THE ENTIRE IMAGE each time it's called, checking whether that pixel's red value falls into that bucket and incrementing a counter. When it has processed every pixel in the entire image for that output pixel, it divides by the total number of pixels and sets that output pixel to the calculated value. The problem is that, assuming it actually works, this is horribly inefficient, since every input pixel is accessed 256 times, although every output pixel is written only once.
What would be optimal would be a way for the kernel to iterate over every INPUT pixel, and let us update any of the output pixels based on that value. Then the input pixels would each be read only once, and the output pixels would be read and written a total of (input width)x(input height) times altogether.
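To illustrate the difference in cost, here is a conceptual Python/NumPy sketch (not actual Core Image kernel code; the function names are mine) of the per-output-pixel "gather" approach forced by the kernel model versus the single-pass "scatter" approach described above:

import numpy as np

def histogram_gather(red, bins=256):
    # Kernel-style "gather": every output bin re-reads the whole image.
    hist = np.zeros(bins)
    for b in range(bins):                      # one pass per output pixel
        hist[b] = np.count_nonzero(red == b)   # reads every input pixel
    return hist / red.size

def histogram_scatter(red, bins=256):
    # Single-pass "scatter": each input pixel updates one output bin.
    hist = np.zeros(bins)
    for value in red.ravel():                  # each input pixel read once
        hist[int(value)] += 1
    return hist / red.size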
Does anyone know of any way to get this kind of filter working? Obviously there's a filter available from apple for doing a histogram, but I need it for doing a more limited form of histogram. (For example, a blue histogram limited to samples that have a red value in a given range.)
The issue is that custom kernel code in Core Image works like a function that remaps the image pixel by pixel. You don't actually have much information to go on except the pixel that you are currently computing. A custom Core Image filter works roughly like this:
for i in 1 ... image.width
    for j in 1 ... image.height
        New_Image[i][j] = CustomKernel(Current_Image[i][j])
    end
end
So it's not really feasible to build your own histogram via custom kernels, because you literally have no control over the new image other than through that CustomKernel function. This is actually one of the reasons that CIImageProcessor was introduced in iOS 10; you would probably have an easier time making a histogram via that API (and also producing other interesting effects via image processing), and I suggest checking out the WWDC 2016 video on it (the raw images and live images session).
IIRC, if you really want to make a histogram, it is still possible, but you will have to work with the UIImage version, convert the resulting image to an RGB image, and then do the counting and store the values in bins. I would recommend Simon Gladman's book on this, as he has a chapter devoted to histograms, but there is a lot more that goes into the default Core Image version, because Apple has much more control over the image than we do using the framework.
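For the more limited histogram mentioned in the question (a blue histogram restricted to pixels whose red value lies in a given range), the CPU-side counting step could look something like this NumPy sketch once you have the raw RGB pixel data; the function name and parameters are illustrative:

import numpy as np

def conditional_blue_histogram(rgb, red_min=100, red_max=200, bins=256):
    # `rgb` is an (H, W, 3) uint8 array of raw pixel data.
    r = rgb[..., 0]
    b = rgb[..., 2]
    mask = (r >= red_min) & (r <= red_max)     # keep only qualifying pixels
    hist, _ = np.histogram(b[mask], bins=bins, range=(0, 256))
    return hist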
I'm looking for a way to get a complete list of all the RGB values for each pixel in a given image using OpenCV; I call this "color quantization".
The problem is that, according to what I have found online so far, this "color quantization" thing is about histograms or "color reduction" or similar discrete computation solutions.
Since I know what I want, and the "internet" seems to have a different opinion about what these words mean, I was wondering: maybe there is no real solution for this? Is there a workable way or a working algorithm in the OpenCV library?
Generally speaking, quantization is an operation that maps an input signal with real (mathematical) values to a set of discrete values. A possible algorithm to implement this process is to compute the histogram of the data and then retain the n values that correspond to the n bins of the histogram with the highest population.
What you are trying to do would perhaps be called color listing.
If you are working with 8-bit quantized images (type CV_8UC3), my guess is that you can do what you want by taking the histogram of the input image (with a bin width of 1) and then searching the result for non-empty bins.
Color quantization is the conversion of the infinite range of natural colors into a finite digital color space. Anyway, to create a full-color 'histogram' you can use OpenCV's sparse matrix implementation and write your own function to compute it. Of course, you have to access the pixels one by one if you have no other structural or continuity information about the image.
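As a rough illustration of "color listing" in Python with OpenCV and NumPy (the file name is a placeholder), you can flatten the image and count each distinct color, which also doubles as a sparse full-color histogram with only the non-empty bins:

import cv2
import numpy as np

img = cv2.imread("input.png")                  # BGR image, shape (H, W, 3)

# Flatten to one row per pixel and count each distinct color triplet.
pixels = img.reshape(-1, 3)
colors, counts = np.unique(pixels, axis=0, return_counts=True)

# `colors` is the complete list of color values present in the image;
# `counts` acts as a sparse full-color histogram.
print(len(colors), "distinct colors")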
I'm using GPUImage in my project and I need an efficient way of taking column sums. The naive way would obviously be to retrieve the raw data and add up the values of every column. Can anybody suggest a faster way to do that?
One way to do this would be to use the approach I take with the GPUImageAverageColor class (as described in this answer), but instead of reducing the total size of each frame at each step, only do this for one dimension of the image.
The average color filter determines the average color of the overall image by stepping down by a factor of four in both X and Y, averaging 16 pixels into one at each step. If operating in a single direction, you should be able to use hardware interpolation to get an 18X reduction in that direction per step with good performance. Your final step might either require a quick CPU-based iteration on the much smaller image or a tweaked version of this shader that pulls the last few pixels in a column together into the final result pixel for that column.
You'll notice that I've been talking about averaging here, because the output values for any OpenGL ES operation need to be expressed as colors, which only have a 0-255 range per channel. A sum would easily overflow this, but you could use an average as an approximation of your sum, with a more limited dynamic range.
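As a rough CPU-side analogue of that reduction (the real thing would be done in a shader), here is a NumPy sketch that repeatedly averages groups of rows until a single row of per-column averages remains; the factor and function name are my own choices, and the padding introduces a small bias that is acceptable for an approximation:

import numpy as np

def column_average_by_reduction(img, factor=4):
    # Repeatedly average `factor` rows into one until one row remains.
    data = img.astype(np.float32)
    while data.shape[0] > 1:
        pad = (-data.shape[0]) % factor
        if pad:                                # pad by repeating the last row
            data = np.vstack([data, np.repeat(data[-1:], pad, axis=0)])
        data = data.reshape(-1, factor, data.shape[1]).mean(axis=1)
    return data[0]   # one value per column, still within the 0-255 range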
If you only care about one color channel, you could possibly encode a larger value into the RGBA channels and maintain a 32-bit sum that way.
Beyond what I describe above, you could look at performing this sum with the help of the Accelerate framework. While probably not quite as fast as doing a shader-based reduction, it might be good enough for your needs.
My lecturer has slides on edge histograms for image retrieval, in which he states that one must first divide the image into 4x4 blocks and then check for edges at the horizontal, vertical, +45°, and -45° orientations. He then states that this is represented in a 14x1 histogram. I have no idea how he decided that a 14x1 histogram must be created. Does anyone know how he came up with this value, or how to create an edge histogram?
Thanks.
The thing you are referring to is called the Histogram of Oriented Gradients (HoG). However, the math doesn't work out for your example. Normally you choose spatial binning parameters (the 4x4 blocks). For each block, you compute the gradient magnitude at some number of different directions (in your case, the four stated orientations). So in each block you have N_{directions} measurements. Multiply this by the number of blocks (16 for you), and you see that you have 16*N_{directions} total measurements.
To form the histogram, you simply concatenate these measurements into one long vector. Any way to do the concatenation is fine as long as you keep track of the way you map the bin/direction combo into a slot in the 1-D histogram. This long histogram of concatenations is then most often used for machine learning tasks, like training a classifier to recognize some aspect of images based upon the way their gradients are oriented.
But in your case, the professor must be doing something special, because if you have 16 different image blocks (a 4x4 grid of image blocks), then you'd need to compute fewer than one measurement per block to end up with a total of 14 measurements in the overall histogram.
Alternatively, the professor might mean that you take the range of angles between [-45°, +45°] and divide it into 14 different values: -45, -45 + 90/14, -45 + 2*90/14, ... and so on.
If that is what the professor means, then in that case you get 14 orientation bins within a single block. Once everything is concatenated, you'd have one very long 14*16 = 224-component vector describing the whole image overall.
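Under that second interpretation, a plain NumPy sketch of the per-block orientation histogram and the concatenation could look like this (the names and binning choices are mine; a weighted, normalized HoG is available elsewhere, e.g. in scikit-image):

import numpy as np

def block_orientation_histogram(gray, grid=(4, 4), n_bins=14):
    # Gradient orientation at every pixel, in degrees.
    gy, gx = np.gradient(gray.astype(np.float32))
    angle = np.degrees(np.arctan2(gy, gx))              # range -180 .. 180
    h, w = gray.shape
    bh, bw = h // grid[0], w // grid[1]
    features = []
    for by in range(grid[0]):
        for bx in range(grid[1]):
            block = angle[by*bh:(by+1)*bh, bx*bw:(bx+1)*bw]
            hist, _ = np.histogram(block, bins=n_bins, range=(-180, 180))
            features.append(hist)
    # With a 4x4 grid and 14 bins this is a 224-component vector.
    return np.concatenate(features)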
Incidentally, I have done a lot of testing with Python implementations of the Histogram of Oriented Gradients, so you can see some of that work linked here or here. There is also some example code at that site, though a better-supported version of HoG appears in scikits.image.
OpenCV has a handy cvEqualizeHist() function that works great on faded/low-contrast images.
However, when an already high-contrast image is given, the result is a low-contrast one. I understand the reason: the histogram gets spread out evenly, and so on.
The question is: how do I tell the difference between a low-contrast and a high-contrast image?
I'm operating on grayscale images and setting their contrast properly so that thresholding them won't delete the text I'm supposed to extract (that's a different story).
Suggestions welcome, especially on how to find out whether the majority of the pixels in the image are light gray (which means that histogram equalization should be performed).
Please help!
EDIT: Thanks everyone for the many informative answers. The standard deviation calculation was sufficient for my requirements, so I'm taking that to be the answer to my query.
You can probably just use a simple statistical measure of the image to determine whether an image has sufficient contrast. The variance of the image would probably be a good starting point. If the variance is below a certain threshold (to be empirically determined) then you can consider it to be "low contrast".
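A minimal sketch of that idea in Python with OpenCV (the file name and threshold are placeholders to be tuned empirically):

import cv2
import numpy as np

gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)

threshold = 40.0                    # empirical; tune on your own images
if np.std(gray) < threshold:        # equivalently, compare the variance
    gray = cv2.equalizeHist(gray)   # only equalize low-contrast images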
If you're adjusting contrast just so you can threshold later on, you may be able to avoid the contrast adjustment step if you set your threshold adaptively using Otsu's method.
If you're still interested in finding out the image contrast, then read on.
There are a number of different ways to calculate "contrast", and those metrics are often applied locally, as opposed to over the entire image, to make the result more sensitive to image content:
Divide the image into adjacent, non-overlapping neighborhoods.
Pick neighborhood sizes that approximate the size of the features in your image (e.g. if your main feature is horizontal text, make the neighborhoods tall enough to capture two lines of text, and just as wide).
Apply the metric to each neighborhood individually.
Threshold the metric result to separate low- and high-variance blocks. This will prevent things like large blank areas of the page from skewing your contrast estimates.
From there, you can use a number of features to determine contrast:
The proportion of high metric blocks to low metric blocks
High metric block mean
Intensity distance between the high and low metric blocks (using means, modes, etc)
This may serve as a better indication of image contrast than global image variance alone. Here's why:
(stddev: 50.6)
(stddev: 7.9)
Both images have perfect contrast (the grey background is just there to make it obvious it's an image), but their standard deviations (and thus variances) are completely different.
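A rough NumPy sketch of the block-based measurement described above (the block size, threshold, and the returned features are all illustrative choices):

import numpy as np

def block_contrast_features(gray, block=(32, 32), var_thresh=100.0):
    # Per-block variance and mean over non-overlapping neighborhoods.
    h, w = gray.shape
    bh, bw = block
    variances, means = [], []
    for y in range(0, h - bh + 1, bh):
        for x in range(0, w - bw + 1, bw):
            patch = gray[y:y+bh, x:x+bw].astype(np.float32)
            variances.append(patch.var())
            means.append(patch.mean())
    variances, means = np.array(variances), np.array(means)
    high = variances > var_thresh                        # "busy" blocks
    return {
        "high_block_ratio": float(high.mean()),
        "high_block_mean": float(means[high].mean()) if high.any() else 0.0,
        "mean_gap": float(abs(means[high].mean() - means[~high].mean()))
                    if high.any() and (~high).any() else 0.0,
    }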
Calculate the cumulative histogram of the image.
Fit a linear regression to the cumulative histogram, of the form y(x) = A*x + B.
Calculate the RMSE of real_cumulative_frequency(x) - y(x).
If that RMSE is close to zero, the image is already equalized. (This follows from the fact that for an equalized image the cumulative histogram must be linear.)
The idea is taken from here.
EDIT:
I've illustrated this approach in my blog (C example code included).
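Here is one way that check could be sketched in Python with NumPy (the RMSE threshold is illustrative, not from the original answer):

import numpy as np

def is_equalized(gray, rmse_thresh=0.02):
    # Normalized cumulative histogram of an 8-bit grayscale image.
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    cdf = np.cumsum(hist).astype(np.float64)
    cdf /= cdf[-1]
    # Linear regression y(x) = A*x + B, then RMSE against the actual CDF.
    x = np.arange(256)
    A, B = np.polyfit(x, cdf, 1)
    rmse = np.sqrt(np.mean((cdf - (A * x + B)) ** 2))
    return rmse < rmse_thresh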
There is support for this in skimage: skimage.exposure.is_low_contrast (reference).
Example:
>>> import numpy as np
>>> from skimage.exposure import is_low_contrast
>>> image = np.linspace(0, 0.04, 100)
>>> is_low_contrast(image)
True
>>> image[-1] = 1
>>> is_low_contrast(image)
True
>>> is_low_contrast(image, upper_percentile=100)
False