I am implementing shape-descriptor-based classification. I have already implemented the convex hull, chain code, and Fourier descriptors, and I am getting successful results. Now I am trying to find the polar shape matrix. The image below shows an example. If more than half of the pixels in a sector belong to the shape, I store that sector as 1, else 0. My problem is: how do I scan the sectors?
Image shows an example of polar shape coordinates.
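One way to scan the sectors, sketched below, is to convert every pixel to polar coordinates about the shape centroid, bin by radius and angle, and apply the half threshold per sector; the file name and bin counts here are assumptions:

import cv2
import numpy as np

# placeholder file name; mask is a binary image, nonzero = shape pixel
mask = cv2.imread('shape.png', cv2.IMREAD_GRAYSCALE) > 0
n_rings, n_wedges = 5, 12                     # sector grid resolution (assumed)

ys, xs = np.nonzero(mask)
cy, cx = ys.mean(), xs.mean()                 # shape centroid = polar origin

H, W = mask.shape
yy, xx = np.mgrid[0:H, 0:W]
r = np.hypot(yy - cy, xx - cx)
a = np.arctan2(yy - cy, xx - cx) % (2 * np.pi)

rmax = r[mask].max()                          # outer radius of the polar grid
ring = np.minimum((r / rmax * n_rings).astype(int), n_rings - 1)
wedge = np.minimum((a / (2 * np.pi) * n_wedges).astype(int), n_wedges - 1)

# count shape pixels and all pixels per sector, then apply the half threshold
inside = r <= rmax
shape_cnt = np.zeros((n_rings, n_wedges))
total_cnt = np.zeros((n_rings, n_wedges))
np.add.at(shape_cnt, (ring[mask], wedge[mask]), 1)
np.add.at(total_cnt, (ring[inside], wedge[inside]), 1)
polar_matrix = (shape_cnt > 0.5 * total_cnt).astype(np.uint8)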
Try to find approximations of the shapes that contain invariant measures. Then you compare shapes by these measures, which keep the same value under geometric deformations.

For example, for a triangle you can use a ratio of lengths as the invariant if you don't have a complex deformation (Euclidean), or barycentric coordinates if you have an affine deformation (see this paper, it may be useful: ), and the cross ratio for the most complex deformation (projectivity); see these pages also for the cross ratio.
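To illustrate the last case: the cross ratio of four collinear points, given as scalar positions along their line, is preserved by any projective map. A minimal sketch (the example map is an arbitrary choice):

import numpy as np

def cross_ratio(a, b, c, d):
    """Cross ratio (A,B;C,D) of four collinear points given as scalar positions."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))

# invariance check under the projective map of the line x -> (2x + 1) / (x + 3)
pts = np.array([0.0, 1.0, 2.0, 5.0])
mapped = (2 * pts + 1) / (pts + 3)
print(cross_ratio(*pts), cross_ratio(*mapped))   # both 1.6, up to rounding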
Given a set of 4x4 pose matrices, one can derive the camera's Euclidean coordinate system location as the following:

position = -R^T * t

where R is the 3x3 rotation matrix and t is the translation vector of the pose, as per this question.
When the set of poses is treated in a sequential manner, such as when each refers to a camera's pose at some time step, the rotation and translation components can be accumulated by composing the transforms:

R_acc(k) = R_acc(k-1) * R(k)

and

t_acc(k) = t_acc(k-1) + R_acc(k-1) * t(k)

Both can be plugged into the first equation to yield the camera's relative position at a given time step.
My question is how to plot such points using OpenCV or a similar tool. For a camera moving around an object in a circular motion, the output plot should be circular, with the origin at the starting point of the trajectory.
An example is shown below (image: a circular camera trajectory with coordinate axes drawn along it):
Though my question is not explicitly about plotting the axes as shown above, it would be a bonus.
TL;DR: Given a set of poses, how can we generate a plot like the one above with common tools such as OpenCV, VTK, Matplotlib, MATLAB, etc.?
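For concreteness, the accumulation above can be written as a short sketch (assuming the poses are relative world-to-camera 4x4 numpy arrays; the names are illustrative):

import numpy as np

def camera_positions(rel_poses):
    """Accumulate relative 4x4 poses; return the camera position at each step."""
    T = np.eye(4)
    positions = []
    for P in rel_poses:              # P: relative pose for one time step
        T = T @ P                    # accumulates R_acc and t_acc inside T
        R, t = T[:3, :3], T[:3, 3]
        positions.append(-R.T @ t)   # the first equation above
    return np.array(positions)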
Obtain the axis vectors X, Y, Z and the position O for each plot point
Simply extract them from the matrix; see Understanding 4x4 homogenous transform matrices. I do not know whether your matrices are already inverted or not. So if your matrices represent the camera coordinate system (not inverted), extract the needed info directly; if not, invert the matrix first and then extract.

If you have a homogeneous transform matrix, you can compute a cheap pseudo-inverse by exploiting the transpose operation (the rotation part is orthonormal, so its inverse is its transpose). For more info see full pseudo inverse matrix.
Render each plot point
So first plot the axes as lines:
red_line(O,O+a*X);
green_line(O,O+a*Y);
blue_line(O,O+a*Z);
where a is the axis line size. After this, plot a dot for the position:
black_circle(O,r);
where r is some radius. You can use any gfx lib/engine for the plot. I would go for GDI or OpenGL, but that depends solely on what you are familiar with.
BTW, to improve awareness of the timeline you can modulate the color intensity (start with dark and end with bright colors so you can see where the motion starts and ends ...).
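For instance, a minimal sketch with matplotlib (an assumption; any gfx lib works), following the recipe above and assuming the 4x4 matrices are world-to-camera extrinsics (invert first, as noted, if they are not):

import numpy as np
import matplotlib.pyplot as plt

def plot_poses(poses, a=0.1):
    """poses: list of 4x4 world-to-camera matrices; a: axis line size."""
    ax = plt.figure().add_subplot(111, projection='3d')
    n = len(poses)
    for i, T in enumerate(poses):
        R, t = T[:3, :3], T[:3, 3]
        O = -R.T @ t                                # position (inverse extraction)
        X, Y, Z = R.T[:, 0], R.T[:, 1], R.T[:, 2]   # camera axes in world coords
        s = 0.3 + 0.7 * i / max(n - 1, 1)           # dark -> bright encodes time
        ax.plot(*zip(O, O + a * X), color=(s, 0, 0))   # red X axis
        ax.plot(*zip(O, O + a * Y), color=(0, s, 0))   # green Y axis
        ax.plot(*zip(O, O + a * Z), color=(0, 0, s))   # blue Z axis
        ax.scatter(*O, color='k', s=4)              # dot at the position
    plt.show()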
I am not able to understand this formula:

E(u,v) = sum over (x,y) of w(x,y) * [ I(x+u, y+v) - I(x,y) ]^2

What do w (the window) and intensity in the formula mean? I found this formula in the OpenCV docs:
http://docs.opencv.org/trunk/doc/py_tutorials/py_feature2d/py_features_harris/py_features_harris.html
For a grayscale image, the intensity levels (0-255) tell you how bright each pixel is; I hope you already know that.

So, here is the explanation of your formula:

Aim: We want to find the points which have maximum variation in intensity in all directions, i.e. the points which are most distinctive in a given image.
I(x,y): This is the intensity value of the current pixel which you are processing at the moment.
I(x+u,y+v): This is the intensity of another pixel which lies at a distance of (u,v) from the current pixel (mentioned above) which is located at (x,y) with intensity I(x,y).
I(x+u,y+v) - I(x,y): This equation gives you the difference between the intensity levels of two pixels.
w(u,v): You don't compare the current pixel with pixels at arbitrary positions; you prefer to compare the current pixel with its neighbors, so you choose some values for "u" and "v", as you would when applying a Gaussian mask, a mean filter, etc. So, basically, w(u,v) represents the window within which you compare the intensity of the current pixel with that of its neighbors.
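To make this concrete, here is how the window shows up in OpenCV's own Harris implementation (a minimal sketch; the file name is a placeholder, and blockSize is the size of the window w discussed above):

import cv2
import numpy as np

gray = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
# blockSize: neighborhood window size, ksize: Sobel aperture, k: Harris parameter
response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)
corners = np.argwhere(response > 0.01 * response.max())   # (row, col) of corners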
This link should clear up all of these doubts.
For visualizing the algorithm, consider the window function as a box filter, Ix as the Sobel derivative along the x-axis, and Iy as the Sobel derivative along the y-axis.
http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/sobel_derivatives/sobel_derivatives.html will be useful to understand the final equations in the above pdf.
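Tying both hints together, a rough sketch of that visualization in code (assumptions: float32 input, a 5x5 box window, k = 0.04):

import cv2
import numpy as np

gray = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
Ix = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)   # derivative along x
Iy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)   # derivative along y

# the window function: a box filter averaging the products over each neighborhood
Ixx = cv2.boxFilter(Ix * Ix, cv2.CV_32F, (5, 5))
Ixy = cv2.boxFilter(Ix * Iy, cv2.CV_32F, (5, 5))
Iyy = cv2.boxFilter(Iy * Iy, cv2.CV_32F, (5, 5))

# Harris response from the 2x2 structure matrix M = [[Ixx, Ixy], [Ixy, Iyy]]:
# R = det(M) - k * trace(M)^2
k = 0.04
response = (Ixx * Iyy - Ixy ** 2) - k * (Ixx + Iyy) ** 2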
I have read the SIFT paper, and I understand why it is rotation invariant.

But I do not understand why it is also invariant to a planar homography transform, as my test code suggests.

In a homography transform between two images, the change does not only include rotation and scale.

For example, a rectangle may be transformed into another quadrangle with every corner smaller or larger than 90 degrees. You can imagine that the shape of the object changes, so why do the keypoint features still match?

Regarding the details of the algorithm: when the pixels surrounding a keypoint change without all rotating by the same angle, the values of the keypoint's 128-dimensional descriptor will differ after the keypoint's dominant gradient orientation is subtracted.
Can someone explain why?
As far as I know, the SIFT descriptor is not invariant to a projective transformation (homography). However, it works well enough when the actual homography is sufficiently close to a similarity transformation.
This paper by Mikolajczyk and Schmid proposes an interest point detector, which is affine-invariant. They also make the descriptor affine-invariant by transforming the image patch from which it is computed.
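A quick way to see this empirically (a sketch; assumes OpenCV with SIFT available, and the image path and homography values are placeholders) is to warp an image with a mild projective distortion and count the matches; the count drops as the homography moves away from a similarity:

import cv2
import numpy as np

img = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE)
h, w = img.shape

# a mild projective distortion, close to a similarity transform
H = np.array([[1.0, 0.05, 0.0],
              [0.02, 1.0, 0.0],
              [1e-4, 1e-4, 1.0]])
warped = cv2.warpPerspective(img, H, (w, h))

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img, None)
kp2, des2 = sift.detectAndCompute(warped, None)
matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)
print(len(matches))   # shrinks as the projective terms grow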
I have an image of a chessboard taken at an angle. Now I want to warp the perspective so the chessboard image looks again as if it was taken directly from above.

I know that I can try to use findHomography between matched points, but I wanted to avoid it and instead use, e.g., rotation data from mobile sensors to build the homography matrix on my own. I calibrated my camera to get the intrinsic parameters. Let's say the following image was taken at a ~60 degree angle around the x-axis. I thought that all I had to do was multiply the camera matrix by the rotation matrix to obtain the homography matrix. I tried the following code, but it looks like I'm not understanding something correctly because it doesn't work as expected (the result image is completely black or white).
import cv2
import numpy as np
import math

camera_matrix = np.array([[5.7415988502105745e+02, 0., 2.3986181527877352e+02],
                          [0., 5.7473682183375217e+02, 3.1723734404756237e+02],
                          [0., 0., 1.]])
distortion_coefficients = np.array([1.8662919398453856e-01, -7.9649812697463640e-01,
                                    1.8178068172317731e-03, -2.4296638847737923e-03,
                                    7.0519002388825025e-01])

theta = math.radians(60)
rotx = np.array([[1, 0, 0],
                 [0, math.cos(theta), -math.sin(theta)],
                 [0, math.sin(theta), math.cos(theta)]])

homography = np.dot(camera_matrix, rotx)

im = cv2.imread('data/chess1.jpg')
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
im_warped = cv2.warpPerspective(gray, homography, (480, 640), flags=cv2.WARP_INVERSE_MAP)
cv2.imshow('image', im_warped)
cv2.waitKey()
I also have distortion_coefficients after calibration. How can those be incorporated into the code to improve results?
This answer is awfully late by several years, but here it is ...
(Disclaimer: my use of terminology in this answer may be imprecise or incorrect. Please do look up on this topic from other more credible sources.)
Remember:
Because you only have one image (view), you can only compute a 2D homography (a perspective correspondence between one 2D view and another 2D view), not the full 3D homography.

Because of that, the nice intuitive understanding of the 3D homography (rotation matrix, translation matrix, focal distance, etc.) is not available to you.

What we are saying is that with a 2D homography you cannot factorize the 3x3 matrix into those nice intuitive components the way you can with a 3D homography.

You have one matrix (which is the product of several matrices unknown to you), and that is it.
However,
OpenCV provides a getPerspectiveTransform function, which solves for the 3x3 perspective matrix (in homogeneous coordinates) of a 2D homography between two planar quadrilaterals.
Link to documentation
To use this function,
Find the four corners of the chessboard on the image. These will be your source coordinates.
Supply four rectangle corners of your choice. These will be your destination coordinates.
Pass the source and destination coordinates into getPerspectiveTransform to generate a 3x3 matrix that can dewarp your chessboard into an upright rectangle.
Notes to remember:
Mind the ordering of the four corners.
If the source coordinates are picked in clockwise order, the destination also needs to be picked in clockwise order.
Likewise, if counter-clockwise order is used, do it consistently.
Likewise, if z-order (top left, top right, bottom left, bottom right) is used, do it consistently.
Failure to order the corners consistently will generate a matrix that executes the point-to-point correspondence exactly (mathematically speaking), but will not generate a usable output image.
The aspect ratio of the destination rectangle can be chosen arbitrarily. In fact, it is not possible to deduce the "original aspect ratio" of the object in world coordinates, because "this is 2D homography, not 3D".
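A minimal sketch of the recipe above (the corner coordinates are placeholders; pick them by hand or with a corner detector):

import cv2
import numpy as np

im = cv2.imread('data/chess1.jpg')

# four chessboard corners in the image, in a consistent (here clockwise) order
src = np.float32([[110, 60], [420, 90], [450, 380], [80, 350]])   # placeholders
side = 400                      # output size; the aspect ratio is a free choice
dst = np.float32([[0, 0], [side, 0], [side, side], [0, side]])

M = cv2.getPerspectiveTransform(src, dst)
dewarped = cv2.warpPerspective(im, M, (side, side))
cv2.imshow('dewarped', dewarped)
cv2.waitKey()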
One problem is that to multiply by a camera matrix you need some notion of a z coordinate. You should start by getting basic image warping from Euler angles to work before you think about distortion coefficients. Have a look at this answer for a slightly more detailed explanation, and try to duplicate my result. The idea of moving your image down the z-axis and then projecting it with your camera matrix can be confusing; let me know if any part of it does not make sense.
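A sketch of that idea (assumptions: the focal length comes from the calibration above, rotation is about the x-axis only, and the plane is pushed out to z = f before projecting): translate the image center to the origin, lift to 3D, rotate, translate along z, then project with the camera matrix.

import cv2
import numpy as np
import math

im = cv2.imread('data/chess1.jpg')
h, w = im.shape[:2]
f = 574.0                                  # focal length from the calibration above
theta = math.radians(60)

A1 = np.array([[1, 0, -w / 2],             # 2D -> 3D: center the image, z = 0
               [0, 1, -h / 2],
               [0, 0, 0],
               [0, 0, 1]])
Rx = np.array([[1, 0, 0, 0],               # rotation about the x-axis
               [0, math.cos(theta), -math.sin(theta), 0],
               [0, math.sin(theta), math.cos(theta), 0],
               [0, 0, 0, 1]])
T = np.array([[1, 0, 0, 0],                # push the plane out to z = f
              [0, 1, 0, 0],
              [0, 0, 1, f],
              [0, 0, 0, 1]])
K = np.array([[f, 0, w / 2, 0],            # 3D -> 2D projection
              [0, f, h / 2, 0],
              [0, 0, 1, 0]])

H = K @ T @ Rx @ A1                        # full 3x3 homography
warped = cv2.warpPerspective(im, H, (w, h))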
You do not need to calibrate the camera nor estimate the camera orientation (the latter, however, would be very easy in this case: just find the vanishing points of those orthogonal bundles of lines and take their cross product to find the normal of the plane; see Hartley & Zisserman's bible for details).
The only thing you need to do is estimate the homography that maps the checkers to squares, then apply it to the image.
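A sketch of that approach (assumptions: a board with 9x6 inner corners and an arbitrary square size of 40 px in the rectified image; the corner ordering returned by findChessboardCorners may need flipping depending on the board's orientation):

import cv2
import numpy as np

gray = cv2.cvtColor(cv2.imread('data/chess1.jpg'), cv2.COLOR_BGR2GRAY)
found, corners = cv2.findChessboardCorners(gray, (9, 6))
if found:
    square = 40   # pixels per checker square in the output, arbitrary
    # ideal fronto-parallel grid positions for the 9x6 inner corners
    grid = np.array([[x * square, y * square]
                     for y in range(6) for x in range(9)], np.float32)
    H, _ = cv2.findHomography(corners.reshape(-1, 2), grid)
    rectified = cv2.warpPerspective(gray, H, (9 * square, 6 * square))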
I have written an algorithm to extract the points shown in the image. They form a convex shape and I know their order. How do I extract the corners (top 3 and bottom 3) from such points?

I'm using OpenCV.
If you already have the convex hull of the object, and that hull includes the corner points, then all you need to do is simplify the hull until it only has 6 points.

There are many ways to simplify polygons; for example, you could just use the simple algorithm from this answer: How to find corner coordinates of a rectangle in an image
do
    for each point P on the convex hull:
        measure its distance to the line AB
        between the point A before P and the point B after P
    remove the point with the smallest distance
repeat until 6 points are left
If you do not know the exact number of points, you could instead remove points until the minimum distance rises above a certain threshold.
You could also use Ramer-Douglas-Peucker to simplify the polygon; OpenCV already has that implemented in cv::approxPolyDP.

Just modify the OpenCV squares sample to use 6 points instead of 4; a sketch follows below.
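In Python, that could look like this (the mask file name is a placeholder; assumes the OpenCV 4.x findContours signature):

import cv2
import numpy as np

mask = cv2.imread('mask.png', cv2.IMREAD_GRAYSCALE)   # binary image of the shape
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
hull = cv2.convexHull(max(contours, key=cv2.contourArea))

# grow epsilon until the simplified polygon has 6 vertices
# (a coarse step may overshoot below 6; shrink the factor if that happens)
eps = 1.0
approx = cv2.approxPolyDP(hull, eps, True)
while len(approx) > 6:
    eps *= 1.2
    approx = cv2.approxPolyDP(hull, eps, True)
corners = approx.reshape(-1, 2)   # the 6 corner points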
Instead of trying to directly determine which of your feature points correspond to corners, how about applying a corner detection algorithm to the entire image and then looking for which of your feature points appear close to peaks in the corner detector's response?
I'd suggest starting with a Harris corner detector. The OpenCV implementation is cv::cornerHarris.
Essentially, the Harris algorithm applies both a horizontal and a vertical Sobel filter to the image (or some other approximation of the partial derivatives of the image in the x and y directions).
It then constructs a 2 by 2 structure matrix at each image pixel, looks at the eigenvalues of that matrix, and calls points corners if both eigenvalues are above some threshold.
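A rough sketch of that combination (the threshold, neighborhood radius, file name, and the points list are placeholders for your own data):

import cv2
import numpy as np

gray = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)
thresh = 0.01 * response.max()

def near_corner(pt, r=3):
    """True if the Harris response peaks within r pixels of the point."""
    x, y = int(pt[0]), int(pt[1])
    patch = response[max(y - r, 0):y + r + 1, max(x - r, 0):x + r + 1]
    return patch.size > 0 and patch.max() > thresh

points = [(120, 85), (310, 200)]            # your extracted feature points (x, y)
corners = [p for p in points if near_corner(p)]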