Pixel-Millimeter Proportion - image-processing

I have a digital image, and I want to make some calculations based on distances in it, so I need the millimeter/pixel proportion. What I'm doing right now is marking two points whose real-world distance I know, calculating the Euclidean distance between them in pixels, and then obtaining the proportion.
The question is: can I get the correct millimeter/pixel proportion with only two points, or do I need four points, two for the X-axis and two for the Y-axis?

If your image is of a flat surface and the camera direction is perpendicular to that surface, then your scale factor should be the same in both directions.
If your image is of a flat surface, but it is tilted relative to the camera, then marking out a rectangle of known proportions on that surface would allow you to compute a perspective transform. (See for example this question)
If your image is of a 3D scene, then of course there is no way in general to convert pixels to distances.
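For the first case above (flat surface, camera perpendicular to it), the two-point approach from the question can be sketched as follows. This is a minimal sketch, not a definitive implementation: the function names and the sample coordinates are made up for illustration, and it assumes square pixels and negligible lens distortion.

```python
import math

def mm_per_pixel(p1, p2, real_distance_mm):
    """Scale factor from two marked pixel points a known distance apart."""
    pixel_distance = math.dist(p1, p2)  # Euclidean distance in pixels
    if pixel_distance == 0:
        raise ValueError("the two reference points must be distinct")
    return real_distance_mm / pixel_distance

def distance_mm(p1, p2, scale):
    """Convert a pixel distance to millimeters using one isotropic scale."""
    return math.dist(p1, p2) * scale

# Hypothetical reference marks: 50 mm apart in the real world,
# clicked at these pixel positions (500 px apart).
scale = mm_per_pixel((100, 200), (400, 600), 50.0)
print(distance_mm((0, 0), (30, 40), scale))
```

Because the scale is the same in both directions in this case, the reference segment may lie at any angle in the image.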

If you know the distance between points A and B measured on the picture (say in inches) and you also know the number of pixels between the points, you can easily calculate the pixels/inch ratio by dividing <pixels>/<inches>.
I suggest choosing the points so that the line through them is either horizontal or vertical, so that the calculation is not thrown off if the pixels are rectangular rather than square.

Related

Image Processing(oblique image)

I hope you can give me some suggestions. I want to semantically segment an image of cyanobacteria on a lake and calculate the cyanobacteria area in the image. The image was taken at an oblique angle, not vertically. How should I preprocess it so that the actual area can be calculated more accurately from pixels? The image is as follows.

You can't make calibrated measurements (in true units of area) without knowing the scaling factors. So you should let a calibration target float on the water, wholly in the field of view*.
If the viewing distance is sufficiently large that the perspective effect can be neglected, the transformation is affine and it suffices to take the ratio of the apparent area of the cyanobacteria (in pixels) over the apparent area of the target (in pixels), times the true area of the target**.
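As a small illustration of this ratio, here is a sketch that counts pixels in two binary masks. The mask representation (nested lists of 0/1) and the function name are assumptions made for the example; in practice the masks would come from the segmentation step.

```python
def area_from_target(region_mask, target_mask, target_area_m2):
    """True area of a region under an affine view, via the pixel-count ratio.

    Both masks are nested lists of 0/1 with the same image size.
    """
    region_pixels = sum(map(sum, region_mask))
    target_pixels = sum(map(sum, target_mask))
    if target_pixels == 0:
        raise ValueError("calibration target not found in the mask")
    return region_pixels / target_pixels * target_area_m2

# Hypothetical 4x6 masks: the target covers 4 px, the bloom covers 10 px.
target = [[1, 1, 0, 0, 0, 0],
          [1, 1, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0]]
bloom = [[0, 0, 1, 1, 1, 1],
         [0, 0, 1, 1, 1, 1],
         [0, 0, 0, 0, 1, 1],
         [0, 0, 0, 0, 0, 0]]
print(area_from_target(bloom, target, 1.0))  # area in m^2 for a 1 m^2 target
```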
If the perspective is strong, the transformation is a homography, and things get a little more complicated. From four points of the target (say the corners), you can obtain the coefficients of the homography that maps the viewed points to undistorted space. Then you need to undistort the outline of the cyanobacteria area (as a polygon), and you can compute its area with the shoelace formula.
You can also completely straighten the image before segmentation, though this is not really necessary.
*You could think of obtaining the scaling factors from the viewing angles and distance, but that method would be impractical in the field.
**Take a picture of a large square. If it appears like a parallelogram, you are good. If like a general quadrilateral, perspective must be corrected.
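The homography route described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names are invented here, the homography is normalized so its last entry is 1, and the 8x8 system is solved with a small hand-written elimination routine. With OpenCV, `cv2.getPerspectiveTransform` would typically be used for the four-point fit instead.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography_from_4_points(src, dst):
    """3x3 homography mapping four src points (image) to dst points (world)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]

def apply_homography(h, pt):
    """Map one point through the homography (with perspective division)."""
    x, y = pt
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

def shoelace_area(polygon):
    """Area of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    for i, (x0, y0) in enumerate(polygon):
        x1, y1 = polygon[(i + 1) % len(polygon)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def true_area(outline_px, target_corners_px, target_corners_world):
    """Undistort a pixel outline using the target, then measure its area."""
    h = homography_from_4_points(target_corners_px, target_corners_world)
    return shoelace_area([apply_homography(h, p) for p in outline_px])
```

The resulting area is in the square of whatever unit the target's world corners are given in, for example (0, 0), (1, 0), (1, 1), (0, 1) for a 1 m square.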

Extend a square in world space to a cube when only screen space coordinates are available

I have a photo of a Go-board, which is basically a grid with n*n squares, each of size a.
Depending on how the image was taken, the grid can have either one vanishing point like this (n = 15, board size b = 15*a):
or two vanishing points like this (n = 9, board size b = 9*a):
So what is available to me are the four screen space coordinates of the four corners of the flat board: p1, p2, p3, p4.
What I would like to do is to calculate the corresponding four screen space coordinates q1, q2, q3, q4 of the corners of the board, if the board was moved 'upward' (perpendicular to the plane of the board) in world space by a, or in other words the coordinates on top of the board, if the board had a thickness of a.
Is the information about the four points even sufficient to calculate this?
If this is not enough information, maybe it would help to make the assumption that the distance of the camera to the center of the board is typically of the order of 1.5 or 2 times the board size b?
From my understanding, the four lines p1-q1, p2-q2, p3-q3, p4-q4 would all go through the same (yet unknown) vanishing point, located somewhere below the board.
Maybe a sufficient approximation (because typically for a Go board n=18 and therefore square size a is small in comparison to the board size) for the direction of each of the lines p1-q1, p2-q2, ... in screen space would be to simply choose a line perpendicular to the horizon (given by the two vanishing points vp1-vp2 or by p1-p2 in the case of only one vanishing point)?
Having made this approximation, still the length of the four lines p1-q1, p2-q2, p3-q3, p4-q4 would need to be calculated ...
Any hints are highly appreciated!
PS: I am using Objective-C & OpenCV

Not yet a full answer, but this might help to move forward. As MvG pointed out, 4 points alone are not enough. Luckily we know the board is a square, so even with perspective distortion the diagonals in 2D will intersect at the board center (unless serious fish-eye or other distortions are present in the image). Here is a test image (created with OpenGL) that I used as test input:
The grayish surface is a 2D QUAD drawn using the 2D perspective-distorted corner points (your input). The aqua/bluish grid is the 3D OpenGL grid I created the 2D corner points from (to see if they match). The green lines are the 2D diagonals, and the orange points are the 2D corner points and the diagonal intersection. As you can see, the 2D diagonal intersection corresponds exactly to the center of the middle cell of the 3D board.
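Finding that intersection from the four screen-space corners is a plain line-line intersection. A minimal sketch, assuming the corners p1..p4 are given in order around the quad so that p1-p3 and p2-p4 are the diagonals (the function name is made up for this example):

```python
def diagonal_intersection(p1, p2, p3, p4):
    """Intersection of diagonals p1-p3 and p2-p4 of a quad in screen space."""
    ax, ay = p3[0] - p1[0], p3[1] - p1[1]   # direction of diagonal p1 -> p3
    bx, by = p4[0] - p2[0], p4[1] - p2[1]   # direction of diagonal p2 -> p4
    denom = ax * by - ay * bx               # 2D cross product of directions
    if denom == 0:
        raise ValueError("diagonals are parallel; corners are degenerate")
    # Solve p1 + t * a = p2 + s * b for t.
    t = ((p2[0] - p1[0]) * by - (p2[1] - p1[1]) * bx) / denom
    return (p1[0] + t * ax, p1[1] + t * ay)

# Hypothetical board corners seen with one vanishing point (a trapezoid).
print(diagonal_intersection((0, 0), (4, 0), (3, 2), (1, 2)))
```

Note that the intersection is generally not the midpoint of either diagonal in the image, which is what makes the ratio of the half-diagonal lengths informative about the perspective.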
Now we can use the ratio between the half-diagonal lengths to fit the perspective. If we handle cell coordinates in the range <0,9>, we want to achieve a further division of the half diagonals like this:
I am still not sure how exactly (the linear ratio l0/(l0+l1) is not working), so I need to inspect the perspective mapping equations to find the relative ratio dependence and compute its inverse (when I have the time and mood for this).
If that succeeds, then we can compute any points along the diagonals (we want the cell edges). Once that is done, we can easily compute the visual size of any cell of size a and use the vanishing point without any 3D transform matrices at all.
In case this is not doable there is still the option to use DIP/CV techniques to detect the cell crossings like this:
OpenCV Birdseye view without loss of data
using just bullet #2, but for that you need to take into account the type of images you will have and adjust the detector or add preprocessing for it ...
Now back to your offsetting: you can simply offset your cells up by the visual size of the cell like this:
And handle the left-side points (either interpolate the size or use the same as the neighboring cell). That should work unless very unusual board angles are used.

How to measure ratio between lines in a photo

I'm working with OpenCV for a task on measuring the solar angle in a photo (without any camera parameter). In the photo there is a straight stick with the height of 3 meters standing in the middle of the field. The shadow it casts, however, lies obliquely on the ground (not in the same projection plane as the stick). I can obtain the pixel length of the stick and shadow, but I don't know if the ratio should be directly calculated with the two numbers, since only lines within the same projection plane have the same scale.
This is more of a graphics issue than an algorithmic one. Can anyone shed some light on how to determine the height-shadow ratio?

OpenCV: measuring distance between two balls in millimeters - how to improve accuracy

I also posted this topic in the Q&A forum at opencv.org but I don't know how many experts from here are reading this forum - so forgive me that I'm also trying it here.
I'm currently learning OpenCV and my current task is to measure the distance between two balls which are lying on a plate. My next step is to compare several cameras and resolutions to get a feeling for how important resolution, noise, distortion etc. are and how heavily these parameters affect the accuracy. If the community is interested in the results, I'm happy to share them when they are ready! The camera is placed above the plate and uses a wide-angle lens. The width and height of the plate (1500 x 700 mm) and the radius of the balls (40 mm) are known.
My steps so far:
1. camera calibration
2. undistorting the image (the distortion is high due to the wide-angle lens)
3. findHomography: I use the corner points of the plate as input (4 points in pixels in the undistorted image) and the corner points in millimeters (starting with 0,0 in the lower left corner, up to 1500,700 in the upper right corner)
4. using HoughCircles to find the balls in the undistorted image
5. applying perspectiveTransform on the circle center points => circle center points now exist in millimeters
6. calculating the distance between the two center points: d = sqrt((x1-x2)^2+(y1-y2)^2)
The results: an error of around 4 mm at a distance of 300 mm, and an error of around 25 mm at a distance of 1000 mm. But if I measure a rectangle which is printed on the plate, the error is smaller than 0.2 mm, so I guess the calibration and undistortion are working well.
I thought about this and figured out three possible reasons:
1. findHomography was applied to points lying directly on the plate, whereas the center points of the balls should be measured at equatorial height => how can I change the result of findHomography to account for this, i.e. to "move" the plane? The radius in mm is known.
2. The error increases with increasing distance of the ball from the optical center because the camera does not see the ball from the top, so the center point in the 2D projection of the image is not the same as in the 3D world; it will be projected further toward the borders of the image. => Are there any geometrical operations which I can apply to the found center to correct the value?
3. During undistortion there's probably a loss of information, because I produce a new undistorted image and go back to pixel accuracy although I have many floating-point values in the distortion matrix. Shall I search for the balls in the distorted image and transform only the center points with the distortion matrix? But I don't know what the code for this task is.
I hope someone can help me to improve this and I hope this topic is interesting for other OpenCV-starters.
Thanks and best regards!

Here are some thoughts to help you along... By no means "the answer", though.
First a simple one. If you have calibrated your image in mm at a particular plane that is distance D away, then points that are r closer will appear larger than they are. To get from measured coordinates to actual coordinates, you use
Actual = measured * (D-r)/D
So since the centers of the spheres are radius r above the plane, the above formula should answer part 1 of your question.
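A small sketch of that rescaling, under one assumption that the formula needs but the text does not spell out: the measured coordinates are expressed relative to the point on the plate directly below the camera, so that the correction is a pure scaling about that point. The function name and the sample numbers are illustrative only.

```python
def plane_to_height(measured_xy, camera_height, object_height):
    """Rescale plate-plane coordinates to a plane object_height above it.

    measured_xy   -- (x, y) in mm from the homography, origin directly
                     below the camera (assumption of this sketch)
    camera_height -- D, perpendicular distance from lens to plate, in mm
    object_height -- r, height of the ball centers above the plate, in mm
    """
    if not 0 <= object_height < camera_height:
        raise ValueError("object must lie between the plate and the camera")
    k = (camera_height - object_height) / camera_height
    return (measured_xy[0] * k, measured_xy[1] * k)

# Hypothetical numbers: camera 1000 mm above the plate, ball radius 40 mm.
print(plane_to_height((520.0, -130.0), 1000.0, 40.0))
```

With these numbers every coordinate shrinks by 4 %, so a measured distance of 1000 mm between two ball centers would correspond to 960 mm at center height.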
Regarding the second question: if you think about it, the center of the sphere that you see should be in the right place "in the plane of the center of the sphere", even though you look at it from an angle. Draw yourself a picture to convince yourself this is so.
Third question: if you find the coordinates of the spheres in the distorted image, you should be able to transform them to the corrected image using perspectiveTransform. This may improve accuracy a little bit - but I am surprised at the size of errors you see. How large is a single pixel at the largest distance (1000mm)?
EDIT
You asked about elliptical projections etc. Basically, if you think of the optical center of the camera as a light source, and look at the shadow of the ball onto the plane as your "2D image", you can draw a picture of the rays that just hit the sides of the ball, and determine the different angles:
It is easy to see that P (the mid point of A and B) is not the same as C (the projection of the center of the sphere). A bit more trig will show you that the error C - (A+B)/2 increases with x and decreases with D. If you know A and B you can calculate the correct position of C (given D) from:
C = D * tan( (atan(B/D) + atan(A/D)) / 2 )
The error becomes larger as D is smaller and/or x is larger. Note D is the perpendicular (shortest) distance from the lens to the object plane.
This only works if the camera is acting like a "true lens" - in other words, there is no pincushion distortion, and a rectangle in the image plane maps into a rectangle on the sensor. The above combined with your own idea to fit in the uncorrected ('pixel') space, then transform the centers found with perspectiveTransform, ought to get you all the way there.
See what you can do with that!
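The center correction C = D * tan((atan(B/D) + atan(A/D)) / 2) can be checked numerically with a short sketch. It works in the 1D cross-section through the camera axis and the ball center, with A, B and C measured in the plate plane from the point directly below the lens. The helper `silhouette_edges` is hypothetical and exists only to generate test values for a ball of known position.

```python
import math

def corrected_center(a, b, d):
    """Projection C of the sphere center, from silhouette edges A and B."""
    return d * math.tan((math.atan(a / d) + math.atan(b / d)) / 2)

def silhouette_edges(x, r, d):
    """Where the two tangent rays to a ball hit the plate plane.

    The ball rests on the plate at horizontal offset x, has radius r,
    and the lens is at perpendicular height d above the plate.
    """
    center_angle = math.atan2(x, d - r)            # angle of ray to center
    half_angle = math.asin(r / math.hypot(x, d - r))
    return (d * math.tan(center_angle - half_angle),
            d * math.tan(center_angle + half_angle))

a, b = silhouette_edges(x=500.0, r=40.0, d=1000.0)
midpoint = (a + b) / 2               # what a naive circle center would give
center = corrected_center(a, b, 1000.0)
print(midpoint - center)             # positive: the midpoint lies further out
```

Scaling `center` by (D-r)/D, as in the first part of the answer, then recovers the true horizontal offset of the ball.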

How to determine the distance of a (skewed) rectangular target from a camera

I have a photograph containing multiple rectangles of various sizes and orientations. I am currently trying to find the distance from the camera to any rectangles present in the image. What is the best way to accomplish this?
For example, a photograph might look similar to this (although this is probably very out of proportion):
I can find the pixel coordinates of the corners of any of the rectangles in the image, along with the camera FOV and resolution. I also know beforehand the length and width of any rectangle that could be in the image (but not what angle they face the camera). The ratio of length to width of each rectangular target that could be in the image is guaranteed to be unique. The rectangles and the camera will always be parallel to the ground.
What I've tried:
I hacked out a solution based on some example code I found on the internet. I'm basically iterating through each rectangle and finding the average pixel length and height.
I then use this to find the ratio of length vs. height, and compare it against a list of the ratios of all known rectangular targets so I can find the actual height of the target in inches. I then use this information to find the distance:
...where actual_height is the real height of the target in inches, the IMAGE_HEIGHT is how tall the image is (in pixels), the pixel_height is the average height of the rectangle on the image (in pixels), and the VERTICAL_FOV is the angle the camera sees along the vertical axis in degrees (about 39.75 degrees on my camera).
I found this formula on the internet, and while it seems to work somewhat ok, I don't really understand how it works, and it always seems to undershoot the actual distance by a bit.
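The formula itself is not reproduced above, so the following is not necessarily the one the question refers to. It is a sketch of the standard pinhole-camera relation using the same four quantities, which may help explain where such a formula comes from: the vertical FOV and image height give the focal length in pixels, and similar triangles give the distance. The function name is made up, and the sketch assumes no lens distortion and a target whose height is not foreshortened.

```python
import math

def distance_to_target(actual_height, pixel_height,
                       image_height_px, vertical_fov_deg):
    """Pinhole-model distance, in the same unit as actual_height."""
    # Focal length in pixels: half the image height subtends half the FOV.
    half_fov = math.radians(vertical_fov_deg) / 2
    focal_px = (image_height_px / 2) / math.tan(half_fov)
    # Similar triangles: pixel_height / focal_px = actual_height / distance.
    return actual_height * focal_px / pixel_height

# Hypothetical target: 12 in tall, 60 px tall in a 480 px high image.
print(distance_to_target(12.0, 60.0, 480, 39.75))
```

Since the rectangles and the camera stay parallel to the ground, the apparent height of a vertical edge depends only on that edge's depth, not on how the rectangle is turned, so working from the height of each vertical edge rather than from the length/height ratio may be less sensitive to skew.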
In addition, I'm not sure how to go about modifying the formula so that it can deal with rectangles that are very skewed from viewing them along an angle. Since my algorithm works by finding the proportion of the length and height, it works ok for rectangles 1 and 2 (which aren't too skewed), but doesn't work for rectangle 3, since it's very skewed, throwing the ratios completely off.
I considered finding the ratio using the method outlined in this StackOverflow question regarding the proportions of a perspective-deformed rectangle, but I wasn't sure how well that would work with what I have, and was wondering if it's overkill or if there's a simpler solution I could try.

FWIW I once did something similar with triangles (full 6DoF pose, not just distance).
