How to perform a linear homography of an image in tensorflow - opencv

I would like to be able to replicate the behaviour of the OpenCV function warpPerspective, which takes as input an image and a homography matrix and projects the image according to that matrix (more details here: https://docs.opencv.org/2.4/modules/imgproc/doc/geometric_transformations.html).
It seems like tf.contrib.image.sparse_image_warp should do the job, but I am unable to replicate the behaviour of warpPerspective. The output I get is distorted in a non-linear fashion despite the use of the parameter interpolation_order=1.
With some further research, I suspect this is due to the fact that tf.contrib.image.interpolate_spline does not perform linear interpolation even when its order is 1 but rather uses some RBF kernels.
I can't see a way around this except encoding it with a dense_image_warp, but that seems a bit overkill and maybe costly. Does anyone have another solution?
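For reference, a minimal sketch of the OpenCV behaviour I am trying to replicate (the image path and the homography values below are just placeholders):

import cv2
import numpy as np

img = cv2.imread("input.jpg")  # placeholder image
H = np.array([[1.0, 0.2, 30.0],
              [0.0, 1.1, 10.0],
              [0.0, 0.0, 1.0]])  # placeholder 3x3 homography

# warpPerspective resamples the source image through the homography
# (bilinear interpolation by default), which is the behaviour I want in TensorFlow.
h, w = img.shape[:2]
warped = cv2.warpPerspective(img, H, (w, h), flags=cv2.INTER_LINEAR)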

After some research, here is a solution. It uses the tf.contrib.image.dense_image_warp function and is not really pretty, but it works:
This first function computes the optical flow needed to perform the homography:
import numpy as np
import tensorflow as tf

def homography_matrix_to_flow(tf_homography_matrix, im_shape1, im_shape2):
    # Build a grid of homogeneous pixel coordinates (x, y, 1).
    Y, X = np.meshgrid(range(im_shape1), range(im_shape2))
    Z = np.ones_like(X)
    XYZ = np.stack((X, Y, Z), axis=-1)
    tf_XYZ = tf.constant(XYZ.astype("float64"))
    tf_XYZ = tf_XYZ[tf.newaxis, :, :, :, tf.newaxis]
    # Apply the homography to every pixel and renormalize the homogeneous coordinate.
    tf_homography_matrix = tf.tile(tf_homography_matrix[tf.newaxis, tf.newaxis], (1, im_shape2, im_shape1, 1, 1))
    tf_unnormalized_transformed_XYZ = tf.matmul(tf_homography_matrix, tf_XYZ, transpose_b=False)
    tf_transformed_XYZ = tf_unnormalized_transformed_XYZ / tf_unnormalized_transformed_XYZ[:, :, :, -1][:, :, :, tf.newaxis]
    # The flow is the (negated) per-pixel displacement, keeping only the x/y components.
    flow = -tf.squeeze(tf_transformed_XYZ - tf_XYZ)[..., :2]
    return flow
Then, it is used to warp the original image into the distorted image.
There is one trick: due to how the tf.contrib.image.dense_image_warp function works, you need to pass the inverse of the homography matrix to obtain the correct optical flow.
# Homography estimated with OpenCV; dense_image_warp needs its inverse (see the trick above).
homography_matrix = np.array([[-4.86219067e-01, -2.20871298e+00, 4.08214879e+02],
                              [-1.02940133e-01, -5.60378659e+00, 3.87573763e+02],
                              [-1.35051362e-04, -6.59600583e-03, 2.91244998e-01]])
inv_homography_matrix = np.linalg.inv(homography_matrix)
tf_inv_homography_matrix = tf.constant(inv_homography_matrix)[tf.newaxis]

# Compute the flow once, then tile it over the batch (self.bs is the batch size).
flow = homography_matrix_to_flow(tf_inv_homography_matrix, img.shape[1], img.shape[2])[tf.newaxis]
flow = tf.tile(flow, (self.bs, 1, 1, 1))

# The image is transposed so its spatial dimensions match the flow's orientation, then transposed back.
image_warped = tf.contrib.image.dense_image_warp(tf.transpose(img, (0, 2, 1, 3)), flow)
image_warped = tf.transpose(image_warped, (0, 2, 1, 3))
I still hope to find a better answer (one which does not have to compute a whole tensor of flow), therefore, I leave the question unanswered for now.

Related

TensorFlow custom model optimizer returning NaN. Why?

I want to learn optimal weights and exponents for a custom model I've created:
weights = tf.Variable(tf.zeros([t.num_features, 1], dtype=tf.float64))
exponents = tf.Variable(tf.ones([t.num_features, 1], dtype=tf.float64))
# works fine:
pred = tf.matmul(x, weights)
# doesn't work:
x_to_exponent = tf.mul(tf.sign(x), tf.pow(tf.abs(x), tf.transpose(exponents)))
pred = tf.matmul(x_to_exponent, weights)
cost_function = tf.reduce_mean(tf.abs(pred-y_))
optimizer = tf.train.GradientDescentOptimizer(t.LEARNING_RATE).minimize(cost_function)
The problem is that whenever there is a negative value or zero in x, the optimizer returns the weight as NaN. If I simply add 0.0001 when x = 0, then everything works as expected. But should I really have to do this? Shouldn't the TensorFlow optimizer have a way to handle this?
I've noticed that Wikipedia shows no activation functions where x is raised to an exponent. Why isn't there an activation function that looks like the image below?
For the above image I'd like my program to learn that the correct exponent is 0.5.
This is correct behavior on TensorFlow's part, since the gradient is infinity there (and many computations that should mathematically be infinity end up NaN due to indeterminate limits).
If you want to work around the problem, a slightly generalized version of gradient clipping may work. You can get the gradients via Optimizer.compute_gradients, manually clip them via something like
safe_grad = tf.clip_by_value(tf.select(tf.is_nan(grad), 0, grad), -lim, lim)
and then pass the clipped gradients to Optimizer.apply_gradients. The clipping will be necessary to not explode for values near the singularity, where the gradient may be arbitrarily large.
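For concreteness, a rough sketch of that pattern with the TF 1.x optimizer API (the clipping bound lim and the learning rate are arbitrary placeholders, cost_function is the loss defined in the question, and tf.where plays the role of the older tf.select):

import tensorflow as tf

lim = 10.0  # arbitrary clipping bound for this sketch

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
grads_and_vars = optimizer.compute_gradients(cost_function)

clipped = []
for grad, var in grads_and_vars:
    if grad is None:
        clipped.append((grad, var))
        continue
    # Replace NaNs with zeros, then clip the remaining values into [-lim, lim].
    safe_grad = tf.where(tf.is_nan(grad), tf.zeros_like(grad), grad)
    safe_grad = tf.clip_by_value(safe_grad, -lim, lim)
    clipped.append((safe_grad, var))

train_step = optimizer.apply_gradients(clipped)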
Warning: There is no guarantee that this will work, especially for deeper networks where the nans may pollute large swaths of the network.

Using Levenberg-Marquardt optimization algorithm via opencv projectPoints() to estimate Calibration Errors

In camera calibration, I have used calibrateCamera() to find the camera parameters from several views of a calibration pattern. It does precisely two things:
1) Estimate the Initial Camera Parameters in closed form, assuming lens distortion as zero.
2) Run the global Levenberg-Marquardt optimization algorithm to minimize the reprojection error, which is done using projectPoints()
Now, I don't just want to compute the minimized reprojection error, but also the fit parameters that caused it. There is currently no function which would return the error-free parameters. So, what I thought was that I would use projectPoints() to get the reprojected image points, and then use those reprojected image points and the world points to calibrate again and obtain the error-free parameters. The problem is that I am not sure this would give the output I want. Can anybody tell me whether it would? Any help would be appreciated.
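For concreteness, a rough sketch of the procedure I have in mind, in Python (objpoints, imgpoints and image_size are placeholders for the data used in the first calibration):

import cv2
import numpy as np

# First calibration on the measured image points (objpoints: list of Nx3 world
# points per view, imgpoints: list of Nx2 image points per view).
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, image_size, None, None)

# Re-project the world points with the fitted parameters...
reproj_points = []
for objp, rvec, tvec in zip(objpoints, rvecs, tvecs):
    pts, _ = cv2.projectPoints(objp, rvec, tvec, K, dist)
    reproj_points.append(pts.reshape(-1, 2).astype(np.float32))

# ...and calibrate again on the reprojected points, as described above.
ret2, K2, dist2, rvecs2, tvecs2 = cv2.calibrateCamera(objpoints, reproj_points, image_size, None, None)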
Levenberg-Marquardt will give you the best estimate that your model and data are capable of. You can't get error-free parameters unless your input data is noise-free and your model complexity matches the complexity of the real model.
For example, your model is:
x * 2 + y = z, with x > 0 and x is an integer number
Input data z = { 3 }
Depending on your initial value, Levenberg-Marquardt will give you:
(x = 1, y = 1) or (x = 2, y = -1) or ..., which are error-free.
However, with the same input z, if your model is:
x * 2 = z, with x > 0 and x is an integer number
There is no way you will get error-free parameters.

Example of factorization with the Montgomery curve

I have programmed the elliptic curve method for integer factorization using Montgomery curves (the same idea as Lenstra's elliptic curve method, just changed a bit so it works with Montgomery curves). However, I haven't really been able to find any examples of numbers being factorized with the method, and I would really like to test it on numbers I know should give a result, in order to check whether it works as it should. So my question is: does anyone have an example of the method used on actual numbers, so that I can see whether my code gives the same output?
You might like to factor the Mersenne number M(677) = 2^677-1 = 1943118631 * 531132717139346021081 * 978146583988637765536217 * P53 * P98. The P53 can be found by elliptic curve factorization with B1 = 9000000, B2 = 16000000, and lucky curve sigma = 8689346476060549. You might enjoy my blog, which gives a solution to that factorization and also has a bunch of other prime-number stuff if you want to poke around.

Better estimation of Homography using Kalman filter?

I am creating an AR application that tracks features, calculates a homography, then obtains the object's pose from 3D-2D point correspondences and uses that to render any 3D object.
I am selecting a specific area for detecting features on my source image (by masking), and then matching them with features detected on subsequent frames. Then I filter those matches and estimate the homography of the unmasked region.
The problem lies in the homography estimation. It differs every time (very slightly, but nonetheless, it differs). The effect is: even with my camera kept still, I get a vibrating rectangle around my tracked region, which I draw using the estimated homography.
I have already posted a question titled Unstable homography estimation using ORB and got reassurance about an approach I was considering (not recalculating my homography if the position of the region is similar to its last position).
However, I recently came to know of the Kalman filter, which gives a better estimate of the position by combining our prior knowledge with our measurement observation.
So, after looking at various examples (one in particular, http://www.youtube.com/watch?v=GBYW1j9lC1I), I modelled a Kalman filter (rather, four: one for every corner of the rectangular region) for my scenario:
m_KF1.init(4, 2, 1);
setIdentity(m_KF1.transitionMatrix);
m_measurement1 = Mat::zeros(2,1,cv::DataType<float>::type);
m_KF1.statePre.setTo(0);
m_KF1.controlMatrix.setTo(0);
//initialzing filter
m_KF1.statePre.at<float>(0) = m_scene_corners[1].x; //the first reading
m_KF1.statePre.at<float>(1) = m_scene_corners[1].y;
m_KF1.statePre.at<float>(2) = 0;
m_KF1.statePre.at<float>(3) = 0;
setIdentity(m_KF1.measurementMatrix);
setIdentity(m_KF1.processNoiseCov, Scalar::all(.1)); //updated at every step
setIdentity(m_KF1.measurementNoiseCov, Scalar::all(4)); //assuming measurement error of
//not more than 2 pixels
setIdentity(m_KF1.errorCovPost, Scalar::all(.1));
4 state variables (position in x, y and velocity in x,y).
2 measurement variables (position in x,y)
1 control variable (acceleration)
The following steps are taken at every iteration:
//---First, the prediction phase, to update the internal variables-------//
// 'dt' is the time taken between the measurements
//Updating the transitionMatrix
m_KF1.transitionMatrix.at<float>(0,2) = dt;
m_KF1.transitionMatrix.at<float>(1,3) = dt;
//Updating the Control matrix
m_KF1.controlMatrix.at<float>(0,1) = (dt*dt)/2;
m_KF1.controlMatrix.at<float>(1,1) = (dt*dt)/2;
m_KF1.controlMatrix.at<float>(2,1) = dt;
m_KF1.controlMatrix.at<float>(3,1) = dt;
//Updating the processNoiseCovmatrix
m_KF1.processNoiseCov.at<float>(0,0) = (dt*dt*dt*dt)/4;
m_KF1.processNoiseCov.at<float>(0,2) = (dt*dt*dt)/2;
m_KF1.processNoiseCov.at<float>(1,1) = (dt*dt*dt*dt)/4;
m_KF1.processNoiseCov.at<float>(1,3) = (dt*dt*dt)/2;
m_KF1.processNoiseCov.at<float>(2,0) = (dt*dt*dt)/2;
m_KF1.processNoiseCov.at<float>(2,2) = dt*dt;
m_KF1.processNoiseCov.at<float>(3,1) = (dt*dt*dt)/2;
m_KF1.processNoiseCov.at<float>(3,3) = dt*dt;
Mat prediction1 = m_KF1.predict();
Point2f predictPt1(prediction1.at<float>(0),prediction1.at<float>(1));
// Get the measured corner
m_measurement1.at<float>(0,0) = scene_corners[0].x;
m_measurement1.at<float>(0,1) = scene_corners[0].y;
//----Then, the correction phase which uses the predicted value and our measured value
Mat estimated = m_KF1.correct(m_measurement1);
Point2f statePt1(estimated.at<float>(0),estimated.at<float>(1));
This model hardly corrects my measured value.
Now my questions are:
Is a Kalman filter suited to my scenario? Will it give me any better results?
If it is, then what is missing? Am I modelling it right? Instead of creating 4 filters for the four corners of the rectangle, should I model it in some other manner (for instance, take the 10 strongest matches based on distance and use those as input to the filter)?
If a Kalman filter isn't suited, what else can I do to give more stability to the estimated homography?
Any help would be highly appreciated.
Thanks.
This question is badly titled; after reading your explanation, what you are really asking is: "Why does my OpenCV Kalman filter still leave lots of noise?"
Anyway, your answers are:
Yes, a Kalman filter works for your scenario.
You are using it wrong.
Modify KF.processNoiseCov. You can get the code from the question "OpenCV Kalman filter prediction without new observation"; there is a nice line there explaining it.
See:
setIdentity(KF.processNoiseCov, Scalar::all(.005)); //adjust this for faster convergence - but higher noise
From what I see, you have a very basic understanding of it. You can go with a naive approach and use four 2D Kalman filters; for this you can use the code here: . It will work, and from there you can grow and adapt it until you get a better understanding.
After that, you could model it more closely to your problem, or you can keep using four filters; there is no "perfect" implementation, so if that works for you, just go for it.
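As a concrete starting point, here is a rough sketch of one such constant-velocity 2D point filter, written with the Python bindings for brevity; the noise values are just starting guesses to be tuned as described above, and corner_x/corner_y stand for one measured corner of the tracked rectangle:

import cv2
import numpy as np

# State = [x, y, vx, vy], measurement = [x, y]; one filter per tracked corner.
kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], dtype=np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], dtype=np.float32)
# The key tuning knob from the answer: smaller process noise -> smoother but laggier estimate.
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 0.005
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 4.0
kf.errorCovPost = np.eye(4, dtype=np.float32) * 0.1

# Per frame: predict, then correct with the corner position measured from the homography.
prediction = kf.predict()
measurement = np.array([[corner_x], [corner_y]], dtype=np.float32)
estimated = kf.correct(measurement)
smooth_corner = (estimated[0, 0], estimated[1, 0])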

RANSAC Algorithm

Can anybody please show me how to use the RANSAC algorithm to select common feature points in two images which have a certain portion of overlap? The problem came out of feature-based image stitching.
I implemented an image stitcher a couple of years back. The article on RANSAC on Wikipedia describes the general algorithm well.
When using RANSAC for feature-based image matching, what you want is to find the transform that best transforms the first image to the second image. This would be the model described in the Wikipedia article.
If you have already got your features for both images and have found which features in the first image best match which features in the second image, RANSAC would be used something like this.
The input to the algorithm is:
n - the number of random points to pick every iteration in order to create the transform. I chose n = 3 in my implementation.
k - the number of iterations to run
t - the threshold for the square distance for a point to be considered as a match
d - the number of points that need to be matched for the transform to be valid
image1_points and image2_points - two arrays of points of the same size. Assumes that image1_points[x] is best mapped to image2_points[x] according to the computed features.
best_model = null
best_error = Inf
for i = 0:k
    rand_indices = n random integers from 0:num_points
    base_points = image1_points[rand_indices]
    input_points = image2_points[rand_indices]
    maybe_model = find best transform from input_points -> base_points
    consensus_set = 0
    total_error = 0
    for j = 0:num_points
        error = square distance of the difference between image2_points[j] transformed by maybe_model and image1_points[j]
        if error < t
            consensus_set += 1
            total_error += error
    if consensus_set > d && total_error < best_error
        best_model = maybe_model
        best_error = total_error
The end result is the transform that best transforms the points in image2 to image1, which is exactly what you want when stitching.
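In practice, if you are already using OpenCV for the feature matching, the whole loop above can be delegated to cv2.findHomography with the RANSAC flag. A rough sketch, assuming image1_points and image2_points are the matched point arrays from the pseudocode:

import cv2
import numpy as np

# Matched feature coordinates, as in the pseudocode above (image2 -> image1).
src = np.asarray(image2_points, dtype=np.float32)
dst = np.asarray(image1_points, dtype=np.float32)

# ransacReprojThreshold (in pixels) plays the role of the threshold t above;
# OpenCV runs the sampling/consensus loop internally.
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=3.0)

# inlier_mask marks which matches ended up in the consensus set.
inliers1 = dst[inlier_mask.ravel() == 1]
inliers2 = src[inlier_mask.ravel() == 1]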
