Scale gradient op in Tensorflow - machine-learning

Is there an op that:
When executed in a graph, outputs its input tensor as-is.
When building ops to compute gradients, scale the incoming gradient by the given constant
Something similar to tf.stop_gradient, but instead of setting the gradient to zero, scale it by the specified constant.
If there is no such an op, what would be the easiest way to achieve this behaviour?

The easiest way I could think of is to preprocess the gradients before applying then. You can see how to do that in the documentation here.
Or you could do a (dirty) trick like:
res = ...
res = (1 - alpha) * tf.stop_gradients(res) + alpha * res

Related

why weiner filter reduces only noise in my case, it is not reducing blur amount

I am implementing weiner filtering in python which is applied on an image blurred using disk shape point spread function, i am including code of making disk shape psf and weiner filter
def weinerFiltering(kernel,K_const,image):
#F(u,v)
copy_img= np.copy(image)
image_fft =np.fft.fft2(copy_img)
#H(u,v)
kernel_fft = np.fft.fft2(kernel,s=copy_img.shape)
#H_mag(u,v)
kernel_fft_mag = np.abs(kernel_fft)
#H*(u,v)
kernel_conj = np.conj(kernel_fft)
f = (kernel_conj)/(kernel_fft_mag**2 + K_const)
return np.abs(np.fft.ifft2(image_fft*f))
def makeDiskShape(arr,radius,centrX,centrY):
for i in range(centrX-radius,centrX+radius):
for j in range(centrY-radius,centrY+radius):
if(l2dist(centrX,centrY,i,j)<=radius):
arr[i][j]=1
return arr/np.sum(arr)
this is blurred and gaussian noised image
this is what i am getting result after weiner filtering for K value of 50
result does not seem very good, can someone help
seems noise is reduced but amount of blurred is not, shape of disk shaped psf matrix is 20,20 and radius is 9 which seems like this
Update
using power spectrum of ground truth image and noise to calculate K constant value, still i am getting strong artifacts
this is noised and blurred image
this is result after using power specturm in place of a constant K value
Reduce your value of K. You need to play around with it until you get good results. If it's too large it doesn't filter, if it's too small you get strong artifacts.
If you have knowledge of the noise variance, you can use that to estimate the regularization parameter. In the Wiener filter, the constant K is a simplification of N/S, where N is the noise power and S is the signal power. Both these values are frequency-dependent. The signal power S can be estimated by the Fourier transform of the autocorrelation function of the image to be filtered. The noise power is hard to estimate, but if you have such an estimate (or know it because you created the noisy image synthetically), then you can plug that value into the equation. Note that this is the noise power, not the variance of the noise.
The following code uses DIPlib (the Python interface we call PyDIP) to demonstrate Wiener deconvolution (disclaimer: I'm an author). I don't think it is hard to convert this code to use other libraries.
import PyDIP as dip
image = dip.ImageRead('trui.ics');
kernel = dip.CreateGauss([3,3]).Pad(image.Sizes())
smooth = dip.ConvolveFT(image, kernel)
smooth = dip.GaussianNoise(smooth, 5.0) # variance = 5.0
H = dip.FourierTransform(kernel)
F = dip.FourierTransform(smooth)
S = dip.SquareModulus(F) # signal power estimate
N = dip.Image(5.0 * smooth.NumberOfPixels()) # noise power (has same value at all frequencies)
Hinv = dip.Conjugate(H) / ( dip.SquareModulus(H) + N / S )
out = dip.FourierTransform(F * Hinv, {"inverse", "real"})
The smooth image looks like this:
The out image that comes from deconvolving the image above looks like this:
Don't expect a perfect result. The regularization term impedes a perfect inverse filtering because such filtering would enhance the noise so strongly that it would swamp the signal and produce a totally useless output. The Wiener filter finds a middle ground between undoing the convolution and suppressing the noise.
The DIPlib documentation for WienerDeconvolution explains some of the equations involved.

Where is OpenCV's getGaussianKernel method's formula for sigma coming from? [duplicate]

I find on the OpenCV documentation for cvSmooth that sigma can be calculated from the kernel size as follows:
sigma = 0.3(n/2 - 1) + 0.8
I would like to know the theoretical background of this equation.
Thank you.
Using such a value for sigma, the ratio between the value at the centre of the kernel and on the edge of the kernel, found for y=0 and x=n/2-1, is:
g_edge / g_center = exp(-(x²+y²)/(2σ²))
= exp(-(n/2-1)²/(2*(0.3(n/2-1)+0.8)²))
The limit of this value as n increases is:
exp(-1/(2*0.3²)) = 0.00386592
Note that 1/256 is 0.00390625. Images are often encoded in 256-value ranges. The choice of 0.3 ensures that the kernel considers all pixels that may significantly influence the resulting value.
I am afraid I do not have an explanation for the 0.8 part, but I imagine it is here to ensure reasonable values when n is small.

Do I need to include my scaled outputs in my back-propagation equation (SGD)?

Quick question, when I am backpropagating the loss function to my parameters and I used a scaled output (ex. tanh(x) * 2), do I need to include the derivative of the scaled output w.r.t the original output? Thank you!
Before we can backprop the errors, we've to compute the gradient of the loss function with respect to each of the parameters. This computation involves computing the gradients of the outputs first and then use chain rule repeatedly. So, when you do this, the scaling constant remains as is. So, yes, you've to scale the errors accordingly.
As an example, you might have observed the following L2 regularized loss - a.k.a Ridge regression:
Loss = 1/2 * |T - Y|^2 + \lambda * ||w||^2
Here, we are scaling down the squared error. So, when we compute the gradient 1/2 & 2 would cancel out. If we would not have multiplied this by 0.5 in the first place, then we would have to scale up our gradient by 2. Else the gradient vector would point in some other direction instead of the direction which minimizes the loss.

Threshold of blurry image - part 2

How can I threshold this blurry image to make the digits as clear as possible?
In a previous post, I tried adaptively thresholding a blurry image (left), which resulted in distorted and disconnected digits (right):
Since then, I've tried using a morphological closing operation as described in this post to make the brightness of the image uniform:
If I adaptively threshold this image, I don't get significantly better results. However, because the brightness is approximately uniform, I can now use an ordinary threshold:
This is a lot better than before, but I have two problems:
I had to manually choose the threshold value. Although the closing operation results in uniform brightness, the level of brightness might be different for other images.
Different parts of the image would do better with slight variations in the threshold level. For instance, the 9 and 7 in the top left come out partially faded and should have a lower threshold, while some of the 6s have fused into 8s and should have a higher threshold.
I thought that going back to an adaptive threshold, but with a very large block size (1/9th of the image) would solve both problems. Instead, I end up with a weird "halo effect" where the centre of the image is a lot brighter, but the edges are about the same as the normally-thresholded image:
Edit: remi suggested morphologically opening the thresholded image at the top right of this post. This doesn't work too well. Using elliptical kernels, only a 3x3 is small enough to avoid obliterating the image entirely, and even then there are significant breakages in the digits:
Edit2: mmgp suggested using a Wiener filter to remove blur. I adapted this code for Wiener filtering in OpenCV to OpenCV4Android, but it makes the image even blurrier! Here's the image before (left) and after filtering with my code and a 5x5 kernel:
Here is my adapted code, which filters in-place:
private void wiener(Mat input, int nRows, int nCols) { // I tried nRows=5 and nCols=5
Mat localMean = new Mat(input.rows(), input.cols(), input.type());
Mat temp = new Mat(input.rows(), input.cols(), input.type());
Mat temp2 = new Mat(input.rows(), input.cols(), input.type());
// Create the kernel for convolution: a constant matrix with nRows rows
// and nCols cols, normalized so that the sum of the pixels is 1.
Mat kernel = new Mat(nRows, nCols, CvType.CV_32F, new Scalar(1.0 / (double) (nRows * nCols)));
// Get the local mean of the input. localMean = convolution(input, kernel)
Imgproc.filter2D(input, localMean, -1, kernel, new Point(nCols/2, nRows/2), 0);
// Get the local variance of the input. localVariance = convolution(input^2, kernel) - localMean^2
Core.multiply(input, input, temp); // temp = input^2
Imgproc.filter2D(temp, temp, -1, kernel, new Point(nCols/2, nRows/2), 0); // temp = convolution(input^2, kernel)
Core.multiply(localMean, localMean, temp2); //temp2 = localMean^2
Core.subtract(temp, temp2, temp); // temp = localVariance = convolution(input^2, kernel) - localMean^2
// Estimate the noise as mean(localVariance)
Scalar noise = Core.mean(temp);
// Compute the result. result = localMean + max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
Core.max(temp, noise, temp2); // temp2 = max(localVariance, noise)
Core.subtract(temp, noise, temp); // temp = localVariance - noise
Core.max(temp, new Scalar(0), temp); // temp = max(0, localVariance - noise)
Core.divide(temp, temp2, temp); // temp = max(0, localVar-noise) / max(localVariance, noise)
Core.subtract(input, localMean, input); // input = input - localMean
Core.multiply(temp, input, input); // input = max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
Core.add(input, localMean, input); // input = localMean + max(0, localVariance - noise) / max(localVariance, noise) * (input - localMean)
}
Some hints that you might try out:
Apply the morphological opening in your original thresholded image (the one which is noisy at the right of the first picture). You should get rid of most of the background noise and be able to reconnect the digits.
Use a different preprocessing of your original image instead of morpho closing, such as median filter (tends to blur the edges) or bilateral filtering which will preserve better the edges but is slower to compute.
As far as threshold is concerned, you can use CV_OTSU flag in the cv::threshold to determine an optimal value for a global threshold. Local thresholding might still be better, but should work better with the bilateral or median filter
I've tried thresholding each 3x3 box separately, using Otsu's algorithm (CV_OTSU - thanks remi!) to determine an optimal threshold value for each box. This works a bit better than thresholding the entire image, and is probably a bit more robust.
Better solutions are welcome, though.
If you're willing to spend some cycles on it there are de-blurring techniques that could be used to sharpen up the picture prior to processing. Nothing in OpenCV yet but if this is a make-or-break kind of thing you could add it.
There's a bunch of literature on the subject:
http://www.cse.cuhk.edu.hk/~leojia/projects/motion_deblurring/index.html
http://www.google.com/search?q=motion+deblurring
And some chatter on the OpenCV mailing list:
http://tech.groups.yahoo.com/group/OpenCV/message/20938
The weird "halo effect" that you're seeing is likely due to OpenCV assuming black for the color when the adaptive threshold is at/near the edge of the image and the window that it's using "hangs over" the edge into non-image territory. There are ways to correct for this, most likely you would make an temporary image that's at least two full block-sizes taller and wider than the image from the camera. Then copy the camera image into the middle of it. Then set the surrounding "blank" portion of the temp image to be the average color of the image from the camera. Now when you perform the adaptive threshold the data at/near the edges will be much closer to accurate. It won't be perfect since its not a real picture but it will yield better results than the black that OpenCV is assuming is there.
My proposal assumes you can identify the sudoku cells, which I think, is not asking too much. Trying to apply morphological operators (although I really like them) and/or binarization methods as a first step is the wrong way here, in my opinion of course. Your image is at least partially blurry, for whatever reason (original camera angle and/or movement, among other reasons). So what you need is to revert that, by performing a deconvolution. Of course asking for a perfect deconvolution is too much, but we can try some things.
One of these "things" is the Wiener filter, and in Matlab, for instance, the function is named deconvwnr. I noticed the blurry to be in the vertical direction, so we can perform a deconvolution with a vertical kernel of certain length (10 in the following example) and also assume the input is not noise free (assumption of 5%) -- I'm just trying to give a very superficial view here, take it easy. In Matlab, your problem is at least partially solved by doing:
f = imread('some_sudoku_cell.png');
g = deconvwnr(f, fspecial('motion', 10, 90), 0.05));
h = im2bw(g, graythresh(g)); % graythresh is the Otsu method
Here are the results from some of your cells (original, otsu, otsu of region growing, morphological enhanced image, otsu from morphological enhanced image with region growing, otsu of deconvolution):
The enhanced image was produced by performing original + tophat(original) - bottomhat(original) with a flat disk of radius 3. I manually picked the seed point for region growing and manually picked the best threshold.
For empty cells you get weird results (original and otsu of deconvlution):
But I don't think you would have trouble to detect whether a cell is empty or not (the global threshold already solves it).
EDIT:
Added the best results I could get with a different approach: region growing. I also attempted some other approaches, but this was the second best one.

Optimal sigma for Gaussian filtering of an image?

When applying a Gaussian blur to an image, typically the sigma is a parameter (examples include Matlab and ImageJ).
How does one know what sigma should be? Is there a mathematical way to figure out an optimal sigma? In my case, i have some objects in images that are bright compared to the background, and I need to find them computationally. I am going to apply a Gaussian filter to make the center of these objects even brighter, which hopefully facilitates finding them. How can I determine the optimal sigma for this?
There's no formula to determine it for you; the optimal sigma will depend on image factors - primarily the resolution of the image and the size of your objects in it (in pixels).
Also, note that Gaussian filters aren't actually meant to brighten anything; you might want to look into contrast maximization techniques - sounds like something as simple as histogram stretching could work well for you.
edit: More explanation - sigma basically controls how "fat" your kernel function is going to be; higher sigma values blur over a wider radius. Since you're working with images, bigger sigma also forces you to use a larger kernel matrix to capture enough of the function's energy. For your specific case, you want your kernel to be big enough to cover most of the object (so that it's blurred enough), but not so large that it starts overlapping multiple neighboring objects at a time - so actually, object separation is also a factor along with size.
Since you mentioned MATLAB - you can take a look at various gaussian kernels with different parameters using the fspecial('gaussian', hsize, sigma) function, where hsize is the size of the kernel and sigma is, well, sigma. Try varying the parameters to see how it changes.
I use this convention as a rule of thumb. If k is the size of kernel than sigma=(k-1)/6 . This is because the length for 99 percentile of gaussian pdf is 6sigma.
You have to find a min/max of a function G such that G(X,sigma) where X is a set of your observations (in your case, your image grayscale values) , This function can be anything that maintain the "order" of the intensities of the iamge, for example, this can be done with the 1st derivative of the image (as G),
fil = fspecial('sobel');
im = imfilter(I,fil);
imagesc(im);
colormap = gray;
this gives you the result of first derivative of an image, now you want to find max sigma by
maximzing G(X,sigma), that means that you are trying a few sigmas (let say, in increasing order) until you reach a sigma that makes G maximal. This can also be done with second derivative.
Given the central value of the kernel equals 1 the dimension that guarantees to have the outermost value less than a limit (e.g 1/100) is as follows:
double limit = 1.0 / 100.0;
size = static_cast<int>(2 * std::ceil(sqrt(-2.0 * sigma * sigma * log(limit))));
if (size % 2 == 0)
{
size++;
}

Resources