I am trying to find chessboard corners using the OpenCV API. Below is the code snippet.
leftImage = cv2.imread ("left.jpg")
retVal, detectedCorners = cv2.findChessboardCorners (leftImage, (7, 6))
Now, detectedCorners[0] gives the values below.
array([[ 475.44540405, 264.75949097]], dtype=float32)
My question is:
Why are these pixel coordinates represented as float values? Shouldn't they be integer (x, y) positions in the image?
I haven't delved into the code yet, but I bet OpenCV is using Harris corners here and calculating the sub-pixel locations as described here
The result type is correct. The coordinates are most likely returned as floats to give more accurate results. As the documentation says:
The image points: This is a vector of Point2f vector which for
each input image contains coordinates of the important points (corners
for chessboard and centers of the circles for the circle pattern).
We have already collected this from findChessboardCorners or findCirclesGrid function. We just need to pass it on.
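If integer pixel positions are needed (for indexing or drawing, say), the float coordinates can be rounded afterwards. A minimal sketch, assuming the corner value shown in the question; the variable names are only illustrative:

```python
import numpy as np

# Sample value copied from the question: one corner, shape (1, 2), dtype float32.
corner = np.array([[475.44540405, 264.75949097]], dtype=np.float32)

# Round to the nearest integer pixel (x, y).
# Keep the float version for calibration; rounding throws away sub-pixel accuracy.
x, y = np.rint(corner[0]).astype(int)
print(x, y)
```

For calibration itself it is usually better to pass the float values on unchanged.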
As a result of the Faster R-CNN method of object detection, I have obtained a set of boxes of intensity values (each bounding box can be thought of as a 3D matrix with a depth of 3 for the RGB intensities, a width, and a height, which can then be converted into a 2D matrix by converting to grayscale) corresponding to the regions containing the object. What I want to do is obtain the corresponding coordinates in the original image for each intensity cell inside the bounding box. Any ideas how to do so?
From what I understand, you got an R-CNN model that outputs cropped pieces of the input image and you now want to trace those output crops back to their coordinates in the original image.
What you can do is simply use a patch-similarity-measure to find the original position.
Since the output crop should look exactly like itself in the original image, just use Pixel-based distance:
Find the place in the image with the smallest distance (should be zero) and from that you can find your desired coordinates.
In python:
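import numpy as np

d_min = float("inf")
h, w = crop.shape[:2]
# Slide the crop over every valid top-left position (row x, column y).
for x in range(org_image.shape[0] - h + 1):
    for y in range(org_image.shape[1] - w + 1):
        # Cast to a signed type so uint8 subtraction cannot wrap around.
        patch = org_image[x:x + h, y:y + w].astype(np.int32)
        # Sum of absolute differences; zero means an exact match.
        d = np.abs(patch - crop).sum()
        if d < d_min:
            d_min = d
            coord = [x, y]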
However, your model should have that information available (after all, it crops the output based on some coordinates). Adding some information about your implementation would help.
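If the box coordinates are available, no search is needed: each crop cell maps back by adding the box offset. A minimal sketch, assuming the detector reports each box as (x_min, y_min, x_max, y_max) in pixels of the original image; the helper name and the sample box are hypothetical:

```python
import numpy as np

def crop_cell_to_image_coords(box, crop_shape):
    """Return (rows, cols) arrays with the original-image position of every crop cell.

    box is assumed to be (x_min, y_min, x_max, y_max) in original-image pixels.
    """
    x_min, y_min = box[0], box[1]
    h, w = crop_shape[:2]
    # Row/column index of every cell inside the crop.
    rows, cols = np.mgrid[0:h, 0:w]
    # Shift by the top-left corner of the box.
    return rows + y_min, cols + x_min

# Toy data: a 10x10 "image" and a crop taken at a known box.
image = np.arange(100).reshape(10, 10)
box = (3, 2, 7, 5)  # hypothetical detector output
crop = image[2:5, 3:7]

rows, cols = crop_cell_to_image_coords(box, crop.shape)
print(rows[0, 0], cols[0, 0])
```

Indexing the original image with `image[rows, cols]` then reproduces the crop, which is a quick way to check that the box convention matches.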
Using OpenCV's findContours() I have a list of contours in an image. I'm interested only in the straight lines, so if they are too 'squiggly' they should be rejected. The question is how to evaluate how straight each contour is?
I looked at fitLine(), but there doesn't appear to be a goodness-of-fit measure returned. I could evaluate this myself using the returned line.
I looked at arcLength() with the aim to compare this to the bounding rectangle dimensions, but even for somewhat straight lines, the arc length can be relatively long if the contour points are dense.
I could find the convex hull and compare to the bounding rectangle dimensions, but I'd have to analyze the convexity defects.
Is there a moment that would be useful here?
Find the contours as you are doing now
Find the straight lines in the image using HoughLines()
Compute the overlap between the contours and the straight lines
Take two points on your contour (for instance with cv::approxPolyDP) and compute the straight-line distance between them. Then walk through the contour points between the two points and add up the distances between consecutive points. If the difference between the distance along the contour and the straight-line distance is bigger than a certain threshold, you can reject the contour.
findContours() already approximates contours with line segments: each contour is represented by a list of points around it. For your purpose, simply computing the distance between each pair of consecutive points in the contour gives you all the line segment lengths.
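A minimal sketch of that comparison, using a ratio instead of a difference so the threshold does not depend on the contour's size. It assumes the points form an ordered, open path (not a closed outline), and the function name and threshold are only illustrative:

```python
import numpy as np

def straightness(points):
    """Ratio of end-to-end distance to path length; 1.0 means perfectly straight."""
    pts = np.asarray(points, dtype=float).reshape(-1, 2)
    # Lengths of the segments between consecutive points.
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    path = seg.sum()
    # Straight-line distance between the first and last point.
    chord = np.linalg.norm(pts[-1] - pts[0])
    return chord / path if path > 0 else 0.0

line = [(0, 0), (1, 0), (2, 0), (3, 0)]
zigzag = [(0, 0), (1, 1), (2, 0), (3, 1)]

for name, pts in (("line", line), ("zigzag", zigzag)):
    print(name, round(straightness(pts), 3))
```

A contour could then be rejected when the ratio falls below some threshold such as 0.9, which would need tuning on real data.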
Here is an example:
c = cnts[0]
#d is the points in contour c shifted by one with wraparound (numpy.roll)
d = np.roll(c, 1, axis=0)
np.linalg.norm(c - d, axis = -1)
I am not able to understand the formula.
What do W (window) and intensity mean in the formula?
I found this formula in the OpenCV docs.
For a grayscale image, the intensity level (0-255) tells you how bright the pixel is. I hope you already know about that.
So, now the explanation of your formula is below:
Aim: We want to find those points which have maximum variation in terms of intensity level in all direction i.e. the points which are very unique in a given image.
I(x,y): This is the intensity value of the current pixel which you are processing at the moment.
I(x+u,y+v): This is the intensity of another pixel which lies at a distance of (u,v) from the current pixel (mentioned above) which is located at (x,y) with intensity I(x,y).
I(x+u,y+v) - I(x,y): This expression gives you the difference between the intensity levels of the two pixels.
W(u,v): You don't compare the current pixel with any other pixel located at any random position. You prefer to compare the current pixel with its neighbors, so you choose some values for "u" and "v" as you do when applying a Gaussian mask, mean filter, etc. So, basically, W(u,v) represents the window in which you would like to compare the intensity of the current pixel with its neighbors.
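As far as I know, the OpenCV Harris tutorial writes this quantity as E(u,v) = sum over (x,y) of w(x,y) [I(x+u,y+v) - I(x,y)]^2, where the sum runs over the pixels (x,y) inside the window and (u,v) is the shift. A minimal sketch of that sum, assuming a box window (all weights 1); the function name and the toy images are made up for illustration:

```python
import numpy as np

def ssd_for_shift(img, x, y, u, v, half=1):
    """Sum of squared intensity differences over a (2*half+1)^2 box window
    centred at (x, y), comparing each pixel with the one shifted by (u, v)."""
    img = img.astype(float)
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            original = img[y + dy, x + dx]
            shifted = img[y + dy + v, x + dx + u]
            total += (shifted - original) ** 2
    return total

flat = np.zeros((8, 8))
edge = np.zeros((8, 8)); edge[:, 4:] = 255      # vertical edge
corner = np.zeros((8, 8)); corner[4:, 4:] = 255  # corner at (4, 4)

for name, img in (("flat", flat), ("edge", edge), ("corner", corner)):
    print(name, ssd_for_shift(img, 4, 4, 1, 0), ssd_for_shift(img, 4, 4, 0, 1))
```

The flat patch gives zero for both shifts, the edge gives zero only when shifting along the edge, and the corner gives a large value for both shifts, which is the "maximum variation in all directions" described above.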
This link explains all your doubts.
For visualizing the algorithm, consider the window function as a BoxFilter, Ix as a Sobel derivative along x-axis and Iy as a Sobel derivative along y-axis.
http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/sobel_derivatives/sobel_derivatives.html will be useful to understand the final equations in the above pdf.
I have an image of a chessboard taken at some angle. Now I want to warp the perspective so the chessboard image looks as if it was taken directly from above.
I know that I can try to use 'findHomography' between matched points, but I wanted to avoid it and use e.g. rotation data from mobile sensors to build the homography matrix on my own. I calibrated my camera to get the intrinsic parameters. Then let's say the following image has been taken at a ~60 degree angle around the x-axis. I thought that all I have to do is multiply the camera matrix with the rotation matrix to obtain the homography matrix. I tried to use the following code, but it looks like I'm not understanding something correctly because it doesn't work as expected (the result image is completely black or white).
import cv2
import numpy as np
import math
camera_matrix = np.array([[ 5.7415988502105745e+02, 0., 2.3986181527877352e+02],
[0., 5.7473682183375217e+02, 3.1723734404756237e+02],
[0., 0., 1.]])
distortion_coefficients = np.array([ 1.8662919398453856e-01, -7.9649812697463640e-01,
1.8178068172317731e-03, -2.4296638847737923e-03,
7.0519002388825025e-01 ])
theta = math.radians(60)
rotx = np.array([[1, 0, 0],
[0, math.cos(theta), -math.sin(theta)],
[0, math.sin(theta), math.cos(theta)]])
homography = np.dot(camera_matrix, rotx)
im = cv2.imread('data/chess1.jpg')
gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
im_warped = cv2.warpPerspective(gray, homography, (480, 640), flags=cv2.WARP_INVERSE_MAP)
cv2.imshow('image', im_warped)
I also have distortion_coefficients after calibration. How can those be incorporated into the code to improve results?
This answer is awfully late by several years, but here it is ...
(Disclaimer: my use of terminology in this answer may be imprecise or incorrect. Please do look up on this topic from other more credible sources.)
Because you only have one image (view), you can only compute 2D homography (perspective correspondence between one 2D view and another 2D view), not the full 3D homography.
Because of that, the nice intuitive understanding of the 3D homography (rotation matrix, translation matrix, focal distance, etc.) is not available to you.
What we say is that with 2D homography you cannot factorize the 3x3 matrix into those nice intuitive components like 3D homography does.
You have one matrix - (which is the product of several matrices unknown to you) - and that is it.
OpenCV provides a getPerspectiveTransform function which solves the 3x3 perspective matrix (using a homogeneous coordinate system) for a 2D homography between two planar quadrilaterals.
Link to documentation
To use this function,
Find the four corners of the chessboard on the image. These will be your source coordinates.
Supply four rectangle corners of your choice. These will be your destination coordinates.
Pass the source coordinates and destination coordinates into the getPerspectiveTransform to generate a 3x3 matrix that is able to dewarp your chessboard to an upright rectangle.
Notes to remember:
Mind the ordering of the four corners.
If the source coordinates are picked in clockwise order, the destination also needs to be picked in clockwise order.
Likewise, if counter-clockwise order is used, do it consistently.
Likewise, if z-order (top left, top right, bottom left, bottom right) is used, do it consistently.
Failure to order the corners consistently will generate a matrix that executes the point-to-point correspondence exactly (mathematically speaking), but will not generate a usable output image.
The aspect ratio of the destination rectangle can be chosen arbitrarily. In fact, it is not possible to deduce the "original aspect ratio" of the object in world coordinates, because "this is 2D homography, not 3D".
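To make the "solve the 3x3 matrix from four correspondences" step concrete, here is a NumPy-only sketch of the linear system involved. It is a stand-in for illustration, not OpenCV's implementation; in practice you would call cv2.getPerspectiveTransform and then cv2.warpPerspective. The corner coordinates are made-up sample values:

```python
import numpy as np

def homography_from_4_points(src, dst):
    """Solve H (3x3, with H[2, 2] fixed to 1) so that dst ~ H @ src for four point pairs."""
    A, b = [], []
    for (x, y), (X, Y) in zip(src, dst):
        # Two linear equations per correspondence, eight unknowns in total.
        A.append([x, y, 1, 0, 0, 0, -X * x, -X * y]); b.append(X)
        A.append([0, 0, 0, x, y, 1, -Y * x, -Y * y]); b.append(Y)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, pt):
    """Map one (x, y) point through H using homogeneous coordinates."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w

# Hypothetical chessboard corners in the photo, clockwise from top-left ...
src = [(100, 50), (400, 80), (420, 300), (60, 330)]
# ... and the destination rectangle in the same clockwise order.
dst = [(0, 0), (300, 0), (300, 300), (0, 300)]

H = homography_from_4_points(src, dst)
for s in src:
    print(s, "->", np.round(apply_h(H, s), 3))
```

Swapping two entries in only one of the lists still yields a matrix that maps the four points exactly, but the warped image comes out folded or mirrored, which is the ordering issue noted above.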
One problem is that to multiply by a camera matrix you need some concept of a z coordinate. You should start by getting basic image warping given Euler angles to work before you think about distortion coefficients. Have a look at this answer for a slightly more detailed explanation and try to duplicate my result. The idea of moving your image down the z axis and then projecting it with your camera matrix can be confusing, let me know if any part of it does not make sense.
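One standard result that may help here (it is not necessarily the method of the linked answer): for a camera that only rotates about its centre, the image-to-image homography is K R K^-1 rather than K R. A minimal sketch, assuming the intrinsics from the question rounded to a few digits and a rotation about the x-axis; the sign and direction of theta depend on your sensor convention and may need inverting:

```python
import math
import numpy as np

# Intrinsics from the question, rounded.
K = np.array([[574.16, 0.0, 239.86],
              [0.0, 574.74, 317.24],
              [0.0, 0.0, 1.0]])

def rotation_x(theta):
    """Rotation matrix about the x-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

def rotation_homography(K, R):
    """Homography between two views of a camera that rotates by R about its centre."""
    return K @ R @ np.linalg.inv(K)

theta = math.radians(30)
H = rotation_homography(K, rotation_x(theta))

# Where does the principal point end up?
p = H @ np.array([K[0, 2], K[1, 2], 1.0])
print(p[:2] / p[2])
```

With large angles such as 60 degrees the content can land far outside the output frame, so an extra translation or scale is usually composed with H before calling warpPerspective.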
You do not need to calibrate the camera nor estimate the camera orientation (the latter, however, in this case would be very easy: just find the vanishing points of those orthogonal bundles of lines, and take their cross product to find the normal to the plane, see Hartley & Zisserman's bible for details).
The only thing you need to do is estimate the homography that maps the checkers to squares, then apply it to the image.
I am looking for the right set of algorithms to solve this image processing problem:
I have a distorted binary image containing a distorted rectangle
I need to find a good approximation of the 4 corner points of this rectangle
I can calculate the contour using OpenCV, but as the image is distorted it will often contain more than 4 corner points.
Is there a good approximation algorithm (preferably using OpenCV operations) to find the rectangle corner points using the binary image or the contour description?
The image looks like this:
Use the cvApproxPoly function to reduce the number of nodes in your contour, then filter out those contours that have too many nodes or have angles that differ greatly from 90 degrees. See also similar answer
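The angle filter can be sketched without OpenCV. This assumes the approximated polygon is given as an ordered list of vertices (the (N, 1, 2) layout that OpenCV's Python bindings return is flattened first); the function names and the 15 degree tolerance are illustrative choices:

```python
import numpy as np

def interior_angles(poly):
    """Angle in degrees at each vertex of an ordered polygon."""
    pts = np.asarray(poly, dtype=float).reshape(-1, 2)
    to_prev = np.roll(pts, 1, axis=0) - pts   # vector to the previous vertex
    to_next = np.roll(pts, -1, axis=0) - pts  # vector to the next vertex
    cos = (to_prev * to_next).sum(axis=1) / (
        np.linalg.norm(to_prev, axis=1) * np.linalg.norm(to_next, axis=1))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def looks_like_rectangle(poly, tol_deg=15.0):
    """True if the polygon has four vertices with angles within tol_deg of 90."""
    pts = np.asarray(poly).reshape(-1, 2)
    if len(pts) != 4:
        return False
    return bool(np.all(np.abs(interior_angles(pts) - 90.0) <= tol_deg))

slightly_skewed = [(0, 0), (10, 1), (11, 11), (1, 10)]
sheared = [(0, 0), (10, 0), (20, 5), (10, 5)]
pentagon = [(0, 0), (10, 0), (12, 6), (5, 10), (-2, 6)]

for name, poly in (("skewed", slightly_skewed), ("sheared", sheared), ("pentagon", pentagon)):
    print(name, looks_like_rectangle(poly))
```

For strongly perspective-distorted rectangles the tolerance has to be loosened, since the projected angles can drift well away from 90 degrees.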
little different answer, see
Look at the opencv function ApproxPoly. It approximates a polygon from a contour.
Try the Harris corner detector. There is an example in the OpenCV package. You will need to play with the parameters for your image.
And see other OpenCV algorithms: http://www.comp.leeds.ac.uk/vision/opencv/opencvref_cv.html#cv_imgproc_features
I would try a generalised Hough transform. It is a bit slow, but it deals well with distorted or incomplete shapes.
This will work even if you start with some defects, i.e. your approxPolyDP call returns pentagons or hexagons. It will reduce any contour (transContours in the example) to a quad, or whatever polygon you wish.
vector<Point> cardPoly;   // Quad storage
int PolyLines = 0;        // Vertex count of the current approximation
double simplicity = 0.5;  // Increment of adjustment: lower numbers may be more precise, higher numbers cycle faster
while (PolyLines != 4)    // Adjust this for other polygons
{
    // Approximate the contour with the current tolerance, then loosen it for the next pass.
    approxPolyDP(transContours, cardPoly, simplicity, true);
    PolyLines = (int)cardPoly.size();
    simplicity += 0.5;
}