Suppose I have an image A, I applied Gaussian Blur on it with Sigam=3 So I got another Image B. Is there a way to know the applied sigma if A,B is given?
Further clarification:
Image A:
Image B:
I want to write a function that take A,B and return Sigma:
double get_sigma(cv::Mat const& A,cv::Mat const& B);
Any suggestions?
EDIT1: The suggested approach doesn't work in practice in its original form(i.e. using only 9 equations for a 3 x 3 kernel), and I realized this later. See EDIT1 below for an explanation and EDIT2 for a method that works.
EDIT2: As suggested by Humam, I used the Least Squares Estimate (LSE) to find the coefficients.
I think you can estimate the filter kernel by solving a linear system of equations in this case. A linear filter weighs the pixels in a window by its coefficients, then take their sum and assign this value to the center pixel of the window in the result image. So, for a 3 x 3 filter like
the resulting pixel value in the filtered image
result_pix_value = h11 * a(y, x) + h12 * a(y, x+1) + h13 * a(y, x+2) +
h21 * a(y+1, x) + h22 * a(y+1, x+1) + h23 * a(y+1, x+2) +
h31 * a(y+2, x) + h32 * a(y+2, x+1) + h33 * a(y+2, x+2)
where a's are the pixel values within the window in the original image. Here, for the 3 x 3 filter you have 9 unknowns, so you need 9 equations. You can obtain those 9 equations using 9 pixels in the resulting image. Then you can form an Ax = b system and solve for x to obtain the filter coefficients. With the coefficients available, I think you can find the sigma.
In the following example I'm using non-overlapping windows as shown to obtain the equations.
You don't have to know the size of the filter. If you use a larger size, the coefficients that are not relevant will be close to zero.
Your result image size is different than the input image, so i didn't use that image for following calculation. I use your input image and apply my own filter.
I tested this in Octave. You can quickly run it if you have Octave/Matlab. For Octave, you need to load the image package.
I'm using the following kernel to blur the image:
h =
0.10963 0.11184 0.10963
0.11184 0.11410 0.11184
0.10963 0.11184 0.10963
When I estimate it using a window size 5, I get the following. As I said, the coefficients that are not relevant are close to zero.
g =
9.5787e-015 -3.1508e-014 1.2974e-015 -3.4897e-015 1.2739e-014
-3.7248e-014 1.0963e-001 1.1184e-001 1.0963e-001 1.8418e-015
4.1825e-014 1.1184e-001 1.1410e-001 1.1184e-001 -7.3554e-014
-2.4861e-014 1.0963e-001 1.1184e-001 1.0963e-001 9.7664e-014
1.3692e-014 4.6182e-016 -2.9215e-014 3.1305e-014 -4.4875e-014
EDIT1:
First of all, my apologies.
This approach doesn't really work in the practice. I've used the filt = conv2(a, h, 'same'); in the code. The resulting image data type in this case is double, whereas in the actual image the data type is usually uint8, so there's loss of information, which we can think of as noise. I simulated this with the minor modification filt = floor(conv2(a, h, 'same'));, and then I don't get the expected results.
The sampling approach is not ideal, because it's possible that it results in a degenerated system. Better approach is to use random sampling, avoiding the borders and making sure the entries in the b vector are unique. In the ideal case, as in my code, we are making sure the system Ax = b has a unique solution this way.
One approach would be to reformulate this as Mv = 0 system and try to minimize the squared norm of Mv under the constraint squared-norm v = 1, which we can solve using SVD. I could be wrong here, and I haven't tried this.
Another approach is to use the symmetry of the Gaussian kernel. Then a 3x3 kernel will have only 3 unknowns instead of 9. I think, this way we impose additional constraints on v of the above paragraph.
I'll try these out and post the results, even if I don't get the expected results.
EDIT2:
Using the LSE, we can find the filter coefficients as pinv(A'A)A'b. For completion, I'm adding a simple (and slow) LSE code.
Initial Octave Code:
clear all
im = double(imread('I2vxD.png'));
k = 5;
r = floor(k/2);
a = im(:, :, 1); % take the red channel
h = fspecial('gaussian', [3 3], 5); % filter with a 3x3 gaussian
filt = conv2(a, h, 'same');
% use non-overlapping windows to for the Ax = b syatem
% NOTE: boundry error checking isn't performed in the code below
s = floor(size(a)/2);
y = s(1);
x = s(2);
w = k*k;
y1 = s(1)-floor(w/2) + r;
y2 = s(1)+floor(w/2);
x1 = s(2)-floor(w/2) + r;
x2 = s(2)+floor(w/2);
b = [];
A = [];
for y = y1:k:y2
for x = x1:k:x2
b = [b; filt(y, x)];
f = a(y-r:y+r, x-r:x+r);
A = [A; f(:)'];
end
end
% estimated filter kernel
g = reshape(A\b, k, k)
LSE method:
clear all
im = double(imread('I2vxD.png'));
k = 5;
r = floor(k/2);
a = im(:, :, 1); % take the red channel
h = fspecial('gaussian', [3 3], 5); % filter with a 3x3 gaussian
filt = floor(conv2(a, h, 'same'));
s = size(a);
y1 = r+2; y2 = s(1)-r-2;
x1 = r+2; x2 = s(2)-r-2;
b = [];
A = [];
for y = y1:2:y2
for x = x1:2:x2
b = [b; filt(y, x)];
f = a(y-r:y+r, x-r:x+r);
f = f(:)';
A = [A; f];
end
end
g = reshape(A\b, k, k) % A\b returns the least squares solution
%g = reshape(pinv(A'*A)*A'*b, k, k)
Related
I am wondering whether I could broadcast the operation below without using a for loop.
X1 = torch.rand(B,d)
X2 = torch.rand(B,d)
X3 = torch.rand(B,d)
X = [X1,X2,X3]
w = torch.rand(3)
res = 0
for i in range(3):
res += X[i] * w[i]:
An important thing to note here is that w is a parameter that I need to optimize. So I need to track its gradient.
Any help would be appreciated.
I just don't know how to speed up this operation without ruining the gradient map.
I am currently trying to simulate an optical flow using the following equation:
Below is a basic example where I have a 7x7 image where the central pixel is illuminated. The velocity I am applying is a uniform x-velocity of 2.
using Interpolations
using PrettyTables
# Generate grid
nx = 7 # Image will be 7x7 pixels
x = zeros(nx, nx)
yy = repeat(1:nx, 1, nx) # grid of y-values
xx = yy' # grid of x-values
# In this example x is the image I in the above equation
x[(nx-1)÷2 + 1, (nx-1)÷2 + 1] = 1.0 # set central pixel equal to 1
# Initialize velocity
velocity = 2;
vx = velocity .* ones(nx, nx); # vx=2
vy = 0.0 .* ones(nx, nx); # vy=0
for t in 1:1
# create 2d grid interpolator of the image
itp = interpolate((collect(1:nx), collect(1:nx)), x, Gridded(Linear()));
# create 2d grid interpolator of vx and vy
itpvx = interpolate((collect(1:nx), collect(1:nx)), vx, Gridded(Linear()));
itpvy = interpolate((collect(1:nx), collect(1:nx)), vy, Gridded(Linear()));
∇I_x = Array{Float64}(undef, nx, nx); # Initialize array for ∇I_x
∇I_y = Array{Float64}(undef, nx, nx); # Initialize array for ∇I_y
∇vx_x = Array{Float64}(undef, nx, nx); # Initialize array for ∇vx_x
∇vy_y = Array{Float64}(undef, nx, nx); # Initialize array for ∇vy_y
for i=1:nx
for j=1:nx
# gradient of image in x and y directions
Gx = Interpolations.gradient(itp, i, j);
∇I_x[i, j] = Gx[2];
∇I_y[i, j] = Gx[1];
Gvx = Interpolations.gradient(itpvx, i, j) # gradient of vx in both directions
Gvy = Interpolations.gradient(itpvy, i, j) # gradient of vy in both directions
∇vx_x[i, j] = Gvx[2];
∇vy_y[i, j] = Gvy[1];
end
end
v∇I = (vx .* ∇I_x) .+ (vy .* ∇I_y) # v dot ∇I
I∇v = x .* (∇vx_x .+ ∇vy_y) # I dot ∇v
x = x .- (v∇I .+ I∇v) # I(x, y, t+dt)
pretty_table(x)
end
What I expect is that the illuminated pixel in x will shift two pixels to the right in x_predicted. What I am seeing is the following:
where the original illuminated pixel's value is moved to the neighboring pixel twice rather than being shifted two pixels to the right. I.e. the neighboring pixel goes from being 0 to 2 and the original pixel goes from a value of 1 to -1. I'm not sure if I'm messing up the equation or if I'm thinking of velocity in the wrong way here. Any ideas?
Without looking into it too deeply, I think there are a couple of potential issues here:
Violation of the Courant Condition
The code you originally posted (I've edited it now) simulates a single timestep. I would not expect a cell 2 units away from your source cell to be activated in a single timestep. Doing so would voilate the Courant condition. From wikipedia:
The principle behind the condition is that, for example, if a wave is moving across a discrete spatial grid and we want to compute its amplitude at discrete time steps of equal duration, then this duration must be less than the time for the wave to travel to adjacent grid points.
The Courant condition requires that uΔt/Δx <= 1 (for an explicit time-marching solver such as the one you've implemented). Plugging in u=2, Δt=1, Δx=1 gives 2, which is greater than 1, so you have a mathematical problem. The general way of fixing this problem is to make Δt smaller. You probably want something like:
x = x .- Δt*(v∇I .+ I∇v) # I(x, y, t+dt)
Missing gradients?
I'm a little concerned about what's going on here:
Gvx = Interpolations.gradient(itpvx, i, j) # gradient of vx in both directions
Gvy = Interpolations.gradient(itpvy, i, j) # gradient of vy in both directions
∇vx_x[i, j] = Gvx[2];
∇vy_y[i, j] = Gvy[1];
You're able to pull two gradients out of both Gvx and Gvy, but you're only using one from each of them. Does that mean you're throwing information away?
https://scicomp.stackexchange.com/ is likely to provide better help with this.
A time series (x, y, t) in 3D space (X, Y, T) satisfies:
x(t) = f1(t), y(t) = f2(t),
where t = 1, 2, 3,....
In other words, coordinates (x, y) vary with timestamp t. It is easy to compute the FFT of x(t) or y(t), but how do you calculate the FFT of (x, y)? I assume it should NOT be computed as a 2D-FFT, because that is for an image, whereas (x, y) is just a series. Any suggestion? Thank you.
use
fftn
for example: Y = fftn(X) returns the multidimensional Fourier transform of an N-D array using a fast Fourier transform algorithm. The N-D transform is equivalent to computing the 1-D transform along each dimension of X. The output Y is the same size as X.
for 3-D transform:
Create a 3-D signal X. The size of X is 20-by-20-by-20
x = (1:20)';
y = 1:20;
z = reshape(1:20,[1 1 20]);
X = cos(2*pi*0.01*x) + sin(2*pi*0.02*y) + cos(2*pi*0.03*z);
Compute the 3-D Fourier transform of the signal, which is also a 20-by-20-by-20 array.
Y = fftn(X)
Pad X with zeros to compute a 32-by-32-by-32 transform.
m = nextpow2(20);
Y = fftn(X,[2^m 2^m 2^m]);
size(Y)
also you can use this code:
first You might use SINGLE intead of DOUBLE
psi = single(psi);
fftpsi = fft(psi,[],3);
Next might be working slide by slide
psi=rand(10,10,10);
% costly way
fftpsi=fftn(psi);
% This might save you some RAM, to be tested
[m,n,p] = size(psi);
for k=1:p
psi(:,:,k) = fftn(psi(:,:,k));
end
psi = reshape(psi,[m*n p]);
for i=1:m*n % you might work on bigger row-block to increase speed
psi(i,:) = fft(psi(i,:));
end
psi = reshape(psi,[m n p]);
% Check
norm(psi(:)-fftpsi(:))
I hope it will be useful for you
Say you have two column vectors vv and ww, each with 7 elements (i.e., they have dimensions 7x1). Consider the following code:
z = 0;
for i = 1:7
z = z + v(i) * w(i)
end
A) z = sum (v .* w);
B) z = w' * v;
C) z = v * w;
D) z = w * v;
According to the solutions, answers (A) AND (B) are the right answers, can someone please help me understand why?
Why is z = v * w' which is similar to answer (B) but only the order of the operation changes, is false? Since we want a vector that by definition only has one column, wouldn't we need a matrix of this size: 1x7 * 7x1 = 1x1 ? So why is z = v' * w false ? It gives the same dimension as answer (B)?
z = v'*w is true and is equal to w'*v.
They both makes 1*1 matrix, which is a number value in octave.
See this:
octave:5> v = rand(7, 1);
octave:6> w = rand(7, 1);
octave:7> v'*w
ans = 1.3110
octave:8> w'*v
ans = 1.3110
octave:9> sum(v.*w)
ans = 1.3110
Answers A and B both perform a dot product of the two vectors, which yields the same result as the code provided. Answer A first performs the element-wise product (.*) of the two column vectors, then sums those intermediate values. Answer B performs the same mathematical operation but does so via a dot product (i.e., matrix multiplication).
Answer C is incorrect because it would be performing a matrix multiplication on misaligned matrices (7x1 and 7x1). The same is true for D.
z = v * w', which was not one of the options, is incorrect because it would yield a 7x7 matrix (instead of the 1x1 scalar value desired). The point is that order matters when performing matrix multiplication. (1xN)X(Nx1) -> (1x1), whereas (Nx1)X(1xN) -> (NxN).
z = v' * w is actually a correct solution but was simply not provided as one of the options.
I have a question related to projective transform. Suppose now we know a line function ax+by+c=0in an image, and the image will go through a projective distortion, and the distortion can be represented as a projective transformation matrix :
Then after the porjective transformation, how could I know the line function in the new distorted image? Thanks!
** EDIT **
Based on the suggestion, I have found the answer. Here, I posted the MATLAB codes to illustrate it:
close all;
% Step 1: show the images as well as lines on it
imshow(img);
line = hor_vt{1}.line(1).line; a = line(1); b=line(2); c=line(3);
[row,col] = size(img);
x_range = 1:col;
y_range = -(a*x_range+c)/b;
hold on; plot(x_range,y_range,'r*');
line = hor_vt{1}.line(2).line; a = line(1); b=line(2); c=line(3);
y_range = -(a*x_range+c)/b;
hold on; plot(x_range,y_range,'y*');
% Step 2: show the output distorted image that goes through projective
% distortion.
ma_imshow(output);
[row,col] = size(output);
x_range = 1:col;
line = hor_vt{1}.line(1).line;
line = reverse_tform.tdata.Tinv*line(:); % VERY IMPORT
a = line(1); b=line(2); c=line(3);
y_range = -(a*x_range+c)/b;
hold on; plot(x_range,y_range,'r*');
disp('angle');
disp( atan(-a/b)/pi*180);
line = hor_vt{1}.line(2).line;
line = reverse_tform.tdata.Tinv*line(:); % VERY IMPORT
a = line(1); b=line(2); c=line(3);
y_range = -(a*x_range+c)/b;
hold on; plot(x_range,y_range,'y*');
disp('angle');
disp( atan(-a/b)/pi*180);
The original images with two lines on it:
After projective distoration, the output image with lines on it becomes:
Here is an easier way to understand it than in your code above.
Given a non-singular homography H (i.e. a homography represented by a 3x3 matrix H with non-zero determinant):
Homogeneous 2D points (represented as 3D column vectors) transform from the right:
p' = H * p
2D lines (represented as 3D row vectors of their 3 coefficients) transform from the left by the inverse homography:
l' = l * H^-1
Proof: for every point p belonging to line l it is l * p = 0. But then l * (H^-1 * H) * p = 0, since (H^-1 * H) = I is the identity matrix. The last equation, by the associative property, can be rewritten as (l * H^-1) * (H * p) = 0. But, for every p belonging to the line, p' = H * p is the same point transformed by the homography. Therefore the last equation says that these same points, in transformed coordinates, belong to a line with coefficients l' = l * H^-1, QED.
I'm no mathematician so maybe there's a better solution, but you could use the equation as it is and then transform the output by multiplying by the transformation matrix.