Applications of matrix addition to image processing? - image-processing

What are the applications of matrix addition in image processing? Also, is there any application in image processing that modifies the current pixel value based on the values of its neighbouring pixels?

Strict addition is rare; subtraction is more common, for example when you subtract a filtered version of an image from the original. Subtracting an image's low-pass-filtered version from the image itself yields its details (the high-frequency content).
Edit 1: Examples: image addition (averaging) for noise suppression; image subtraction for change enhancement.
Lots of image processing algorithms modify a pixel based on its neighbours, like many digital image filters.
Take for example the Prewitt filter, which emphasizes horizontal edges. Its kernel is:
[ 1  1  1
  0  0  0
 -1 -1 -1]
Here the current pixel value is replaced by the sum of the north-west, north and north-east pixel values, minus the sum of the south-west, south and south-east pixel values.
Similar kernels, for other applications, include the average, Gaussian, Laplacian, Sobel, etc. It is also possible to compute fast, rough distance maps.
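As a concrete illustration (not part of the original answer), here is a minimal Python/OpenCV sketch that applies the Prewitt kernel above to a greyscale image; the file names are placeholders.
import cv2
import numpy as np

# Prewitt kernel from above: +1 weights on the northern row, -1 on the
# southern row, so it responds to horizontal edges.
kernel = np.array([[ 1,  1,  1],
                   [ 0,  0,  0],
                   [-1, -1, -1]], dtype=np.float32)

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name

# filter2D correlates the kernel with the image: each output pixel is the
# weighted sum of its 3x3 neighbourhood, exactly as described above.
edges = cv2.filter2D(img.astype(np.float32), -1, kernel)

# Keep the magnitude of the response and save it for inspection.
cv2.imwrite("prewitt_horizontal.png", np.clip(np.abs(edges), 0, 255).astype(np.uint8))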
Note 1: By the way, your acceptance rate is very low. People here may get the impression that you are ungrateful for the answers and therefore won't bother to answer at all. Since you're new, I'm sure it's just because you need to understand how the system works. For any answer that is useful, click the up arrow. At some point, choose one of the answers as "the best one" and accept it (the green tick sign).
Note 2: To be honest, the question is actually quite vague and could fill a whole book on digital image processing. In the future, try to be more specific (and, on Stack Overflow, keep it within programming).
All the best.

Related

What's the theory behind computing variance of an image?

I am trying to compute the blurriness of an image by using LaplacianFilter.
According to this article: https://www.pyimagesearch.com/2015/09/07/blur-detection-with-opencv/ I have to compute the variance of the output image. The problem is that I don't understand conceptually how to compute the variance of an image.
Every pixel has 4 values, one for each color channel, so I can compute the variance of every channel, but then I get 4 values, or even 16 by computing the variance-covariance matrix. According to the OpenCV example, however, they have only 1 number.
After computing that number, they just play with the threshold in order to make a binary decision, whether the image is blurry or not.
PS: I am by no means an expert on this topic, so my statements may make no sense. If so, please feel free to edit the question.
One-sentence description:
The blurred image's edges are smoothed, so the variance is small.
1. How variance is calculated.
The core function of the post is:
def variance_of_laplacian(image):
    # compute the Laplacian of the image and then return the focus
    # measure, which is simply the variance of the Laplacian
    return cv2.Laplacian(image, cv2.CV_64F).var()
As OpenCV-Python uses numpy.ndarray to represent images, let's have a look at numpy.var:
Help on function var in module numpy.core.fromnumeric:
var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<class 'numpy._globals._NoValue'>)
Compute the variance along the specified axis.
Returns the variance of the array elements, a measure of the spread of a distribution.
The variance is computed for the flattened array by default, otherwise over the specified axis.
2. Applying it to an image
That is to say, the variance is calculated on the flattened Laplacian image, i.e. on a flattened 1-D array.
To calculate the variance of an array x:
var = mean(abs(x - x.mean())**2)
For example:
>>> x = np.array([[1, 2], [3, 4]])
>>> x.var()
1.25
>>> np.mean(np.abs(x - x.mean())**2)
1.25
The Laplacian image is an edge image. Blur an image with GaussianBlur using different radii r, run the Laplacian filter on each result, and calculate the variances:
The blurred image's edges are smoothed, so the variance is small.
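A minimal sketch of that experiment, reusing the variance_of_laplacian function from above; the file name and the kernel sizes are placeholder assumptions.
import cv2

def variance_of_laplacian(image):
    # focus measure from the post: variance of the Laplacian response
    return cv2.Laplacian(image, cv2.CV_64F).var()

img = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file name

for ksize in (1, 3, 7, 15, 31):  # larger kernel -> stronger blur
    blurred = cv2.GaussianBlur(img, (ksize, ksize), 0)
    print(ksize, variance_of_laplacian(blurred))
# The printed variance shrinks as the blur grows, which is exactly why it
# works as a blurriness score.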
First things first: if you look at the tutorial you linked, they convert the image to greyscale, so it has only 1 channel and 1 variance. You could do it for each channel and try to compute a more complicated formula with it, or just use the variance over all the numbers. However, I think the author also converts it to greyscale because it is a nice way of fusing the information, and one of the papers the author cites actually says that
A well focused image is expected to have a high variation in grey
levels.
The author of the tutorial actually explains it in a simple way. First, think about what the Laplacian filter does: it brings out the well-defined edges. Here is an example using the grid of pictures he had.
As you can see, the blurry images barely have any edges, while the focused ones have a lot of responses. Now, what would happen if you calculated the variance? Let's imagine the case where white is 255 and black is 0. If everything is black, then the variance is low (the case of the blurry images), but if it is roughly half and half, then the variance is high.
However, as the author already said, this threshold is domain dependent: if you take a picture of the sky, even if it is in focus it may have low variance, since it is quite uniform and does not have very well-defined edges.
I hope this answers your doubts :)

Algorithm suitable for a region-growing image segmentation based on the minimization of a metric

Is there a pixel-based region-growing algorithm that can be employed for the extraction of features (segmentation) from an image by adding pixels to the seed based on the minimization of a certain metric? Potentially, a pixel can be removed if the metric is not optimized when this pixel is added (i.e. there is the possibility to backtrack and go back to the seed obtained in the previous iterations).
I'll try to explain further my objectives:
This algorithm starts from a central pixel selected as an initial seed on the image.
Afterwards, each of the 4 neighbors is explored (right, left, bottom and top neighbors) separately, to see if the metric is optimized by growing the seed in the selected direction.
A neighboring pixel might not optimize the metric immediately, even if the seed created by adding this pixel will be optimal in future iterations.
There is a possibility that a neighboring pixel is added to the seed but is removed later, if the obtained seed is not optimal.
Can anyone suggest an artificial-intelligence technique (or a greedy approach) that is adequate for solving this kind of problem? Also, what would be a good criterion for judging that the addition of a pixel will optimize the metric, even though this will probably only happen in future iterations?
P.S.: I started implementing what's explained above in Python but got stuck on the issue of determining whether a path (neighboring pixel) is worth exploring or not. Right now, I add a neighboring pixel only if the resulting seed improves (i.e. minimizes) the error with respect to the metric. However, even if adding the right or left neighbor does not optimize the metric, one of these two paths might lead to the optimal solution in the future (as explained in the third objective).
You've basically outlined the most successful algorithm you could get with this approach. Its success will depend heavily on the metric you use to add/remove pixels, but there are a few things you can do to emulate the behavior you want.
Definitions
We'll call the metric we're optimizing M, where M(R) is the metric's value for a region R, and a region R is some collection of pixels. I will assume that optimizing the metric means obtaining the largest possible value of M, but this approach also works if the goal is to minimize M.
Methodology
This approach is going to be slightly backwards to your original outline, but it should satisfy both requirements of adding pixels that lie in non-optimal paths from the seed and removing pixels that do not contribute significantly to the optimization.
We will begin at a seed s, but instead of evaluating paths as we go we will iteratively add all pixels in the image (or up to the maximum feature size) to our region. At each step we will determine a value for the pixel based on how much it improves the metric for the current region, M(p). This is not the same as the value of the region containing the pixel (M(R) where p is in R). Rather, it is the difference between the value of the region containing the pixel and the value of the region before the pixel was added (M(p) = M(R) - M(R') where R = R' + p). If you have the capacity to evaluate a single pixel, you could simply use that instead.
The next change is to include a regularization parameter in M(R) that penalizes the score based on the number of pixels included: N(R) = M(R) - a * |R|, where a is some arbitrary positive constant and |R| denotes the cardinality (number of pixels) of our region. Note: if the goal is to minimize M, then a should be negative. This has the effect of penalizing the region's score if it includes too many pixels.
Finally, after all pixels have been added to the region and N(p) has been evaluated for each pixel, we iterate over the region again. This time we begin at the last pixel added and iterate backwards over our set of pixels, ending at the seed s. At each iteration we determine the score of the region, N(R). If N(R) has decreased since the last iteration, we remove the pixel p with the lowest score N(p). This should have the effect of keeping the smallest set of pixels in the region that contribute the most to the score.
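A rough, simplified Python sketch of the grow-then-prune idea above (not a faithful transcription of the steps): the stand-in metric, the constant a and the size limit are all illustrative assumptions.
import numpy as np

def region_score(img, region, a=0.5):
    # N(R) = M(R) - a*|R|.  As a stand-in metric M we use the negative
    # intensity variance of the region, so a more uniform region scores
    # higher; both the metric and 'a' are placeholder assumptions.
    vals = np.array([img[p] for p in region], dtype=float)
    return -vals.var() - a * len(vals)

def grow_then_prune(img, seed, max_size=200, a=0.5):
    region = [seed]
    order = [seed]
    h, w = img.shape
    # Growth phase: greedily add the 4-neighbour that helps the score most,
    # even if the score temporarily worsens (pruning can undo it later).
    while len(region) < max_size:
        members = set(region)
        frontier = [(y + dy, x + dx)
                    for (y, x) in region
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        frontier = [p for p in set(frontier)
                    if 0 <= p[0] < h and 0 <= p[1] < w and p not in members]
        if not frontier:
            break
        best = max(frontier, key=lambda p: region_score(img, region + [p], a))
        region.append(best)
        order.append(best)
    # Pruning phase: walk backwards over the addition order and drop any
    # pixel whose removal improves the regularised score.
    for p in reversed(order[1:]):          # never remove the seed itself
        without = [q for q in region if q != p]
        if region_score(img, without, a) > region_score(img, region, a):
            region = without
    return region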
Additional Considerations
If the remaining pixels lie on non-contiguous paths after pruning, you could run a secondary algorithm to add in adjoining pixels. You'll need to do some testing to determine an optimal value of a, such that enough pixels are kept to reconstruct the building without including every pixel from the image.
My Opinion (that you didn't ask for)
In general I think you would have more luck with more robust algorithms such as Convolutional Neural Networks for feature classification. They'll likely be faster and definitely more accurate than the algorithm described above.

Threshold values for binary filtering

How to determine good values for the two threshold values for binary filtering?
The images I want to filter are MRI or CT images like these: http://pubimage.hcuge.ch:8080/ ; they are also most likely grayscale images.
I'm trying to extract a surface model from a stack of 2D images using the marching cubes algorithm and binary filtering on the iPad. For the binary filtering I use a lower and an upper threshold value; a pixel is set to the inside value if lowerThreshold <= pixelValue <= upperThreshold.
Thanks for your help, Manu
Update: I have now asked one of my image processing professors about this question. He said that if the histogram of the image is bimodal (meaning there are two peaks in the histogram), which is the case for my images, the solution is relatively easy.
If your image background is black and your object of interest is of any other shade, then you can try to guess a threshold from the histogram of your image (note, though, that you may have to try hard to find a suitable percentage threshold that suits all your images).
This may not be sufficient, however. A tool that would be interesting for this task is clearly active contours (aka snakes), but it's hard to guess whether you can afford the time and effort needed to use them (there is an implementation of geodesic active contours in ITK, but I don't know how much effort it requires before use). If snakes are an option, then you can make the contour evolve from the boundary of your image until it meets your object and fits its contour.
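Neither the question nor this answer names a specific method for the bimodal case, but Otsu's method is a common automatic choice; here is a minimal OpenCV sketch (the slice file name is a placeholder, and using the Otsu value as the lower threshold is just one possible convention).
import cv2

# Placeholder file name for one slice of the CT/MRI stack.
slice_img = cv2.imread("slice_042.png", cv2.IMREAD_GRAYSCALE)

# Otsu's method picks the threshold that best separates the two modes
# of a bimodal histogram, matching the professor's observation above.
otsu_thresh, binary = cv2.threshold(slice_img, 0, 255,
                                    cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# For the lower/upper scheme in the question, one simple convention is the
# Otsu value as the lower threshold and the maximum intensity as the upper.
lower, upper = otsu_thresh, 255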

Image retrieval - edge histogram

My lecturer has slides on edge histograms for image retrieval, in which he states that one must first divide the image into 4x4 blocks, and then check for edges at the horizontal, vertical, +45°, and -45° orientations. He then states that this is represented in a 14x1 histogram. I have no idea how he came about deciding that a 14x1 histogram must be created. Does anyone know how he came up with this value, or how to create an edge histogram?
Thanks.
The thing you are referring to is called the Histogram of Oriented Gradients (HoG). However, the math doesn't work out for your example. Normally you choose spatial binning parameters (the 4x4 blocks). For each block, you compute the gradient magnitude at some number of different directions (in your case, the four directions listed). So in each block you have N_{directions} measurements. Multiply this by the number of blocks (16 for you), and you see that you have 16*N_{directions} total measurements.
To form the histogram, you simply concatenate these measurements into one long vector. Any way to do the concatenation is fine as long as you keep track of the way you map the bin/direction combo into a slot in the 1-D histogram. This long histogram of concatenations is then most often used for machine learning tasks, like training a classifier to recognize some aspect of images based upon the way their gradients are oriented.
But in your case, the professor must be doing something special, because if you have 16 different image blocks (a 4x4 grid of image blocks), then you'd need fewer than one measurement per block to end up with a total of 14 measurements in the overall histogram.
Alternatively, the professor might mean that you take the range of angles between [-45, +45] and divide it into 14 different values: -45, -45 + 90/14, -45 + 2*90/14, and so on.
If that is what the professor means, then in that case you get 14 orientation bins within a single block. Once everything is concatenated, you'd have one very long 14*16 = 224-component vector describing the whole image overall.
Incidentally, I have done a lot of testing with Python implementations of the Histogram of Oriented Gradients, so you can see some of the work linked here or here. There is also some example code at that site, though a more well-supported version of HoG appears in scikits.image.
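If you want to experiment yourself, scikit-image (the successor of scikits.image) ships a HoG implementation; a minimal sketch follows, where the sample image and the parameter values are assumptions rather than the lecturer's exact setup.
from skimage import color, data
from skimage.feature import hog

# Any greyscale image will do; astronaut() is just a built-in sample.
image = color.rgb2gray(data.astronaut())

# 4 orientation bins over a coarse 4x4 spatial grid, loosely mimicking the
# "four directions per block" flavour of the lecturer's descriptor.
descriptor = hog(image,
                 orientations=4,
                 pixels_per_cell=(image.shape[0] // 4, image.shape[1] // 4),
                 cells_per_block=(1, 1))

print(descriptor.shape)  # one long concatenated histogram vector (64 values here)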

Determine if an image needs contrasting automatically in OpenCV

OpenCV has a handy cvEqualizeHist() function that works great on faded/low-contrast images.
However, when an already high-contrast image is given, the result is a low-contrast one. I understand the reason - the histogram gets distributed evenly across the range.
Question is - how do I get to know the difference between a low-contrast and a high-contrast image?
I'm operating on grayscale images and setting their contrast properly so that thresholding them won't delete the text I'm supposed to extract (that's a different story).
Suggestions welcome - especially on how to find out if the majority of the pixels in the image are light gray (which means that histogram equalization should be performed).
Please help!
EDIT: Thanks everyone for the many informative answers. But the standard deviation calculation was sufficient for my requirements, so I'm taking that as the answer to my query.
You can probably just use a simple statistical measure of the image to determine whether an image has sufficient contrast. The variance of the image would probably be a good starting point. If the variance is below a certain threshold (to be empirically determined) then you can consider it to be "low contrast".
If you're adjusting contrast just so you can threshold later on, you may be able to avoid the contrast adjustment step if you set your threshold adaptively using Otsu's method.
If you're still interested in finding out the image contrast, then read on.
There are a number of different ways to calculate "contrast". Often those metrics are applied locally, as opposed to over the entire image, to make the result more sensitive to image content:
Divide the image into adjacent, non-overlapping neighborhoods.
Pick neighborhood sizes that are approximately the size of the features in your image (e.g. if your main feature is horizontal text, make the neighborhoods tall enough to capture 2 lines of text, and just as wide).
Apply the metric to each neighborhood individually
Threshold the metric result to separate low- and high-variance blocks. This will prevent things such as large, blank areas of the page from skewing your contrast estimates.
From there, you can use a number of features to determine contrast:
The proportion of high metric blocks to low metric blocks
High metric block mean
Intensity distance between the high and low metric blocks (using means, modes, etc)
This may serve as a better indication of image contrast than global image variance alone. Here's why:
[Two example images: the first with stddev 50.6, the second with stddev 7.9]
The two images are perfectly in contrast (the grey background is just there to make it obvious it's an image), but their standard deviations (and thus variance) are completely different.
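As a rough sketch of the block-wise idea described above (the block size and the variance threshold are arbitrary placeholder values):
import numpy as np

def blockwise_contrast(gray, block=32, std_thresh=20.0):
    # Split the image into non-overlapping block x block tiles and compute
    # the standard deviation of each tile.
    h, w = gray.shape
    stds = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            stds.append(gray[y:y + block, x:x + block].std())
    stds = np.array(stds)

    high = stds[stds >= std_thresh]   # "busy" tiles (edges, text, ...)

    # Features suggested above: proportion of high-metric blocks and the
    # mean of the high-metric blocks, ignoring blank areas entirely.
    proportion_high = len(high) / max(len(stds), 1)
    mean_high = high.mean() if len(high) else 0.0
    return proportion_high, mean_high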
Calculate the cumulative histogram of the image.
Make a linear regression of the cumulative histogram in the form y(x) = A*x + B.
Calculate the RMSE of real_cumulative_frequency(x) - y(x).
If that RMSE is close to zero, the image is already equalized. (This means that for equalized images the cumulative histogram must be linear.)
Idea is taken from here.
EDIT:
I've illustrated this approach in my blog (C example code included).
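A minimal Python sketch of that check, assuming an 8-bit greyscale image; the RMSE cut-off is an arbitrary placeholder.
import numpy as np

def equalization_rmse(gray):
    # Cumulative histogram of an 8-bit greyscale image, normalised to [0, 1].
    hist, _ = np.histogram(gray.ravel(), bins=256, range=(0, 256))
    cumulative = np.cumsum(hist) / hist.sum()

    # Linear regression of the cumulative histogram: y(x) = A*x + B ...
    x = np.arange(256)
    A, B = np.polyfit(x, cumulative, 1)

    # ... and the RMSE between the real cumulative frequency and the line.
    return np.sqrt(np.mean((cumulative - (A * x + B)) ** 2))

# If the RMSE is close to zero, the image is already (roughly) equalized:
# already_equalized = equalization_rmse(img) < 0.05   # placeholder cut-off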
There is support for this in skimage: skimage.exposure.is_low_contrast (see the reference).
Example:
>>> image = np.linspace(0, 0.04, 100)
>>> is_low_contrast(image)
True
>>> image[-1] = 1
>>> is_low_contrast(image)
True
>>> is_low_contrast(image, upper_percentile=100)
False
