I need to find the orientation of corn pictures (examples below); they are tilted at different angles to the right or left. I need to rotate them upright (a 90 degree angle with their normal), so that they look like a water drop.
Is there an easy way to do this?
As a starting point, find the image moments (and Hu moments for complex shapes like a pear). From the link:
Information about image orientation can be derived by first using the
second order central moments to construct a covariance matrix.
I suspect that using an image processing library such as OpenCV would give more reliable results in the general case.
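As a rough illustration of the quoted idea, here is a minimal sketch that derives the major-axis angle from the second-order central moments of a binary mask. The `Mask` type and the function name `orientationFromMoments` are assumptions made for this example, not part of any library. The result only gives the axis, so it cannot tell which end of the shape is the thick one.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Binary image stored row by row: mask[y][x] != 0 marks an object pixel.
// (Hypothetical type for this sketch.)
using Mask = std::vector<std::vector<unsigned char>>;

// Angle in radians, in (-pi/2, pi/2], between the x axis and the major axis
// of the object. Image convention: y grows downwards.
double orientationFromMoments(const Mask& mask) {
    double m00 = 0.0, m10 = 0.0, m01 = 0.0;
    for (std::size_t y = 0; y < mask.size(); ++y)
        for (std::size_t x = 0; x < mask[y].size(); ++x)
            if (mask[y][x]) { m00 += 1.0; m10 += x; m01 += y; }
    assert(m00 > 0.0);  // the mask must contain at least one object pixel

    const double cx = m10 / m00, cy = m01 / m00;  // centroid

    // Second-order central moments (entries of the covariance matrix,
    // up to the common factor 1/m00, which does not affect the angle).
    double mu20 = 0.0, mu02 = 0.0, mu11 = 0.0;
    for (std::size_t y = 0; y < mask.size(); ++y)
        for (std::size_t x = 0; x < mask[y].size(); ++x)
            if (mask[y][x]) {
                const double dx = x - cx, dy = y - cy;
                mu20 += dx * dx;
                mu02 += dy * dy;
                mu11 += dx * dy;
            }
    return 0.5 * std::atan2(2.0 * mu11, mu20 - mu02);
}
```

Rotating the image by 90 degrees minus this angle would bring the major axis to vertical; whether the result is upright or upside down still has to be decided separately.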
From the question I got the impression that you are new to this, so I will stick to something simple:
compute bounding box of image
Simple enough: go through all pixels and remember the min and max x, y coordinates of the non-background pixels.
compute critical dimensions
Just cast a few lines through the bounding box and compute the positions of the red points. For the start points I chose 25%, 50% and 75% of the height. First start from the left and stop at the first non-background pixel. Then start from the right and stop at the first non-background pixel.
axis aligned position
Start rotating the image with some step and remember/stop at the position where the red dots are symmetric, so they are almost the same distance from the left and from the right. Also, the bounding box has maximal height and minimal width in the axis aligned position, so you can exploit that instead ...
determine the position
You have 4 options. If I call the distances l0,l1,l2,r0,r1,r2:
l means from left, r means from right
0 is the upper (bluish) line, 1 the middle, 2 the bottom
then your wanted position is where (l0==r0)>=(l1==r1)>=(l2==r2) and the bounding box is bigger in the y axis than in the x axis. So rotate by 90 degrees until a match is found, or determine the orientation directly from the distances and rotate just once ...
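The first two steps above could be sketched roughly as follows, in plain C++ on a binary mask rather than on a VCL bitmap. The types `Mask`, `Box`, `ScanDistances` and the helper names are assumptions made for this example. The `asymmetry` score is one possible way to express "the red dots are symmetric"; it is not taken from the answer itself.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdlib>
#include <vector>

// mask[y][x] != 0 marks a non-background pixel; all rows have equal length.
using Mask = std::vector<std::vector<unsigned char>>;

struct Box { int x0, y0, x1, y1; };  // inclusive bounds

// Step 1: min/max x and y over all non-background pixels.
Box boundingBox(const Mask& m) {
    assert(!m.empty() && !m[0].empty());
    const int h = static_cast<int>(m.size());
    const int w = static_cast<int>(m[0].size());
    Box b{w, h, -1, -1};
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            if (m[y][x]) {
                b.x0 = std::min(b.x0, x); b.x1 = std::max(b.x1, x);
                b.y0 = std::min(b.y0, y); b.y1 = std::max(b.y1, y);
            }
    return b;
}

// l0,l1,l2 and r0,r1,r2 from the answer, measured from the box edges.
struct ScanDistances { std::array<int, 3> left, right; };

// Step 2: cast lines at 25%, 50% and 75% of the box height, from the left
// and from the right, stopping at the first non-background pixel.
ScanDistances scanLines(const Mask& m, const Box& b) {
    assert(b.x1 >= b.x0 && b.y1 >= b.y0);  // box must not be empty
    ScanDistances d{};
    const int h = b.y1 - b.y0;
    for (int i = 0; i < 3; ++i) {
        const int y = b.y0 + h * (i + 1) / 4;
        int l = b.x0;
        while (l <= b.x1 && !m[y][l]) ++l;
        int r = b.x1;
        while (r >= b.x0 && !m[y][r]) --r;
        d.left[i] = l - b.x0;
        d.right[i] = b.x1 - r;
    }
    return d;
}

// Zero when every line hits the shape at the same distance from both sides.
int asymmetry(const ScanDistances& d) {
    int sum = 0;
    for (int i = 0; i < 3; ++i) sum += std::abs(d.left[i] - d.right[i]);
    return sum;
}
```

While rotating the image in small steps, the angle with the smallest `asymmetry` (or the tallest, narrowest bounding box) would be the axis aligned candidate.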
[Notes]
You will need to access the pixels of the image, so I strongly recommend using Graphics::TBitmap from VCL. Look here: gfx in C, especially the section GDI Bitmap, and also at this: finding horizon on high altitude photo, which might help a bit.
I use C++ and VCL, so you will have to translate to Pascal, but the VCL stuff is the same...
Related
I'm trying to blindly detect signals in a spectrum.
One way that came to mind is to detect rectangles in the waterfall (a 2D matrix that can be interpreted as an image).
Is there any fast way (on the order of 0.1 second) to find the center and width of all of the horizontal rectangles in an image? (The heights of the rectangles do not matter to me.)
An example image will be uploaded. (Note: I know that all the rectangles are horizontal.)
I would appreciate any other suggestions for this purpose.
E.g. I want the algorithm to give me 9 centers and 9 coordinates for the above image.
Since the rectangles are aligned, you can do that quite easily and efficiently (this is not the case with unaligned rectangles, since they are not clearly separated). The idea is to first compute the average color of each row and of each column. You should get something like this:
Then, you can subtract the background color (blue), compute the luminance and then apply a threshold. You can remove some artefacts with a median/blur filter beforehand.
Then, you can just scan the resulting 1D array of binary values to locate where each rectangle starts and stops. The center of each rectangle is ((x_start+x_end)/2, (y_start+y_end)/2).
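A minimal sketch of the column part of this idea, assuming the image has already been converted to a luminance array with the background subtracted. The `Image` and `Run` types, the function names and the threshold value are illustrative assumptions. Rectangles that overlap in x but sit on different rows would merge in a single column profile, so the row profile has to be used as well in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// img[y][x] is the background-subtracted luminance; rows have equal length.
using Image = std::vector<std::vector<double>>;

// Average of every column, i.e. the projection of the image onto the x axis.
std::vector<double> columnMeans(const Image& img) {
    assert(!img.empty());
    std::vector<double> mean(img[0].size(), 0.0);
    for (const auto& row : img)
        for (std::size_t x = 0; x < mean.size(); ++x) mean[x] += row[x];
    for (double& v : mean) v /= static_cast<double>(img.size());
    return mean;
}

struct Run { int start, end; };  // inclusive indices of one rectangle

// Threshold the 1D profile and report each stretch above the threshold.
std::vector<Run> findRuns(const std::vector<double>& profile, double threshold) {
    std::vector<Run> runs;
    const int n = static_cast<int>(profile.size());
    int start = -1;  // -1 means "currently outside a rectangle"
    for (int i = 0; i < n; ++i) {
        const bool on = profile[i] > threshold;
        if (on && start < 0) start = i;
        if (!on && start >= 0) { runs.push_back({start, i - 1}); start = -1; }
    }
    if (start >= 0) runs.push_back({start, n - 1});
    return runs;
}
```

For each run, the center is `(start + end) / 2.0` and the width is `end - start + 1`. Both functions are linear in the number of pixels, so they should fit comfortably in a 0.1 second budget for typical image sizes.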
Suppose that I want to find the 3D position of a cup with its rotation, with image input like this (this cup can be rotated to point in any direction):
Given that I have a bunch of 2D points specifying the top circle and the bottom circle, as in the following image. (Let's assume that these points are given by a person drawing lines around the cup, so they won't be very accurate. Ellipse fitting or SolvePnP might be needed to recover a good approximation. Also, the bottom circle is not a complete circle, just part of one. Sometimes the top part will be occluded as well, so we cannot rely on there being a complete circle.)
I also know the physical radius of the top and bottom circle, and the distance between them by using a ruler to measure them beforehand.
I want to find the 2 complete circles, as in the following image (I think I need to find the position of the cup and its up direction before I can project the complete circles):
Let's say that my ultimate goal is to be able to find the closest 2D top point and closest 2D bottom point, given a 2D point on the side of the cup, like the following image:
A point can also be inside of the cup, like so:
Let's define distance(a, b) as a function that finds the Euclidean distance between point a and point b in pixel units.
From that I would be able to calculate distance(side point, bottom point) / distance(top point, bottom point), which is a scale factor from 0 to 1. If I multiply this number by the physical height of the cup measured with the ruler, I will know how high the point is above the bottom of the cup in metric units.
What is the method I can use to find the corresponding top and bottom point given point on the side, so that I can finally find out the height of the point from the bottom of the cup?
I'm thinking of using PnP to solve this but my points do not have correct IDs associated with them. And I don't want to know the exact rotation of the cup, I only want to know the up direction of the cup.
I also think that fitting the ellipse might help somewhat, but maybe it's not the best because the circle is not complete.
If you have any suggestions, please tell me how to obtain the point height from the bottom of the cup.
Given the accuracy issues, I don't think it is worth performing a 3D reconstruction of the cone.
I would perform a "standard" ellipse fit on the top outline, which is the most accurate, then a constrained one on the bottom, knowing the position of the vertical axis. After reduction of the coordinates, the bottom ellipse can be written as
x²/a² + (y - h)²/b² = 1
which can be solved by least-squares.
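One possible way to set up that least-squares problem, as a sketch only: rewriting the equation as x² = c0 + c1·y + c2·y² turns it into an ordinary quadratic polynomial fit of x² against y, from which h = -c1/(2·c2), a² = c0 - c2·h² and b² = -a²/c2. This minimizes an algebraic error in x², not the geometric distance to the ellipse, and that choice is an assumption of this example. The names `Point`, `AxisEllipse` and `fitAxisEllipse` are hypothetical, and the points are assumed to be in the reduced coordinates already (vertical axis at x = 0).

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

struct Point { double x, y; };
struct AxisEllipse { double a, b, h; };  // x^2/a^2 + (y - h)^2/b^2 = 1

using Mat3 = std::array<std::array<double, 3>, 3>;

static double det3(const Mat3& m) {
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

AxisEllipse fitAxisEllipse(const std::vector<Point>& pts) {
    assert(pts.size() >= 3);
    // Work with u = y - mean(y) to keep the normal equations well conditioned.
    double ybar = 0.0;
    for (const Point& p : pts) ybar += p.y;
    ybar /= pts.size();

    // Sums for the normal equations of x^2 = c0 + c1*u + c2*u^2.
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0, s4 = 0, t0 = 0, t1 = 0, t2 = 0;
    for (const Point& p : pts) {
        const double u = p.y - ybar, u2 = u * u, x2 = p.x * p.x;
        s0 += 1.0; s1 += u; s2 += u2; s3 += u2 * u; s4 += u2 * u2;
        t0 += x2;  t1 += x2 * u; t2 += x2 * u2;
    }
    Mat3 A{};
    A[0] = {s0, s1, s2};
    A[1] = {s1, s2, s3};
    A[2] = {s2, s3, s4};
    const double rhs[3] = {t0, t1, t2};
    const double d = det3(A);
    assert(d != 0.0);  // needs at least three distinct y values

    // Cramer's rule: replace column k by the right-hand side.
    double c[3];
    for (int k = 0; k < 3; ++k) {
        Mat3 Ak = A;
        for (int r = 0; r < 3; ++r) Ak[r][k] = rhs[r];
        c[k] = det3(Ak) / d;
    }
    assert(c[2] < 0.0);  // otherwise the fitted curve is not an ellipse
    const double hu = -c[1] / (2.0 * c[2]);  // center in shifted coordinates
    const double a2 = c[0] - c[2] * hu * hu;
    assert(a2 > 0.0);
    return {std::sqrt(a2), std::sqrt(-a2 / c[2]), hu + ybar};
}
```

With noisy hand-drawn points the two asserts on `c[2]` and `a2` can fail; a real implementation would report that case to the caller instead of aborting.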
Note that it could be advantageous to ask the user to point at the endpoints of the straight edges at the bottom, plus the lowest point, instead of the whole curve.
Solving for the closest top and bottom points is a pure 2D problem (draw the line through the given point and the intersection of the sides, and find the intersection points with the ellipse).
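The intersection step can be sketched as below, for an ellipse in the same reduced form x²/a² + (y - h)²/b² = 1. The names `Point` and `lineEllipse` are assumptions for this example. A line generally crosses the ellipse twice, so the function returns both points and leaves the choice (for instance the one nearest to the given side point) to the caller.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point { double x, y; };

// Intersections of the line through p and q with the ellipse
// x^2/a^2 + (y - h)^2/b^2 = 1. Returns 0 or 2 points (a tangent line
// yields the same point twice), ordered by the parameter t in p + t*(q - p).
std::vector<Point> lineEllipse(Point p, Point q, double a, double b, double h) {
    assert(a > 0.0 && b > 0.0);
    const double dx = q.x - p.x, dy = q.y - p.y;
    const double px = p.x, py = p.y - h;  // shift so the ellipse is centered

    // Substituting the line into the ellipse gives A*t^2 + B*t + C = 0.
    const double A = dx * dx / (a * a) + dy * dy / (b * b);
    const double B = 2.0 * (px * dx / (a * a) + py * dy / (b * b));
    const double C = px * px / (a * a) + py * py / (b * b) - 1.0;
    const double disc = B * B - 4.0 * A * C;
    if (A == 0.0 || disc < 0.0) return {};  // p == q, or the line misses

    const double s = std::sqrt(disc);
    const double t1 = (-B - s) / (2.0 * A), t2 = (-B + s) / (2.0 * A);
    return {{p.x + t1 * dx, p.y + t1 * dy}, {p.x + t2 * dx, p.y + t2 * dy}};
}
```

Calling it once with the top ellipse and once with the bottom ellipse, with `p` the side point and `q` the intersection of the two side lines, would give the candidate top and bottom points for the height ratio.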
I'm writing an application in C++ which gets the camera pose using fiducial markers, takes a lat/lon coordinate in the real world as input, and streams as output a video with an X marker that shows the location of the coordinate on the screen.
When I move my head, the X stays in the same place spatially (because I know how to move it on the screen based on the camera pose, or even hide it when I look away).
My only problem is to convert the coordinate from real life to coordinate on the screen.
I know my own gps coordinate and the target gps coordinate.
I also have the screen size (height / width) .
How can I translate all of this to an x,y pixel on the screen in OpenCV?
In my opinion, your question isn't very clear.
OpenCV is an image processing library.
You can't solve this with OpenCV alone. You need a solution with your own algorithms. So I have some advice and an example to explain things.
You can simulate showing your real life position on screen with any programming language. Imagine you want to develop measurement software that measures a house plan image on screen by drawing lines along the edges of all walls (you know the lengths of some walls from an image like the one below).
If you want to measure the wall of the WC at the bottom, you must know how many pixels correspond to how many feet, so first you should draw a line from the start to the end of a known length and see how many pixels wide it is. For example, if 12'4" equals 9 pixels, you can calculate the length of the WC wall at the bottom with a basic proportion. Of course this is just a basic ratio.
I know this is not exactly what you need, but I hope this answer is helpful and gives you some ideas.
What is the distance transform? What is the theory behind it? If I have 2 similar images but in different positions, how does the distance transform help in overlapping them? The results that the distance transform function produces look like they are divided in the middle; is it to find the center of one image so that the other is overlapped just halfway? I have looked into the OpenCV documentation but it's still not clear.
Look at the picture below (you may want to increase your monitor brightness to see it better). The picture shows the distance from the red contour depicted with pixel intensities, so in the middle of the image, where the distance is maximum, the intensities are highest. This is a manifestation of the distance transform. Here is an immediate application: the green shape is a so-called active contour or snake that moves according to the gradient of distances from the contour (and also follows some other constraints) and curls around the red outline. Thus one application of the distance transform is shape processing.
Another application is text recognition - one of the powerful cues for text is a stable width of a stroke. The distance transform run on segmented text can confirm this. A corresponding method is called stroke width transform (SWT)
As for aligning two rotated shapes, I am not sure how you can use DT. You can find a center of a shape to rotate the shape but you can also rotate it about any point as well. The difference will be just in translation which is irrelevant if you run matchTemplate to match them in correct orientation.
Perhaps if you upload your images it will be clearer what to do. In general you can match them as a whole, or by features (which is more robust to various deformations or perspective distortions), or even using outlines/silhouettes if there are only a few features. Finally, you can figure out the orientation of your object (if it has a dominant orientation) by running PCA or fitting an ellipse (as a rotated rectangle).
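// points2D: e.g. std::vector<cv::Point2f> with at least 5 points (needs <opencv2/imgproc.hpp>)
cv::RotatedRect rect = cv::fitEllipse(points2D);
float angle_to_rotate = rect.angle;  // rotation angle in degrees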
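The distance transform is an operation on a single binary image that measures, for every empty point (zero pixel), the distance to the nearest boundary point (non-zero pixel). Note that OpenCV's cv::distanceTransform uses the opposite convention: it gives each non-zero pixel its distance to the nearest zero pixel.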
An example is provided here and here.
The measurement can be based on various definitions, calculated discretely or precisely: e.g. Euclidean, Manhattan, or Chessboard. Indeed, the parameters in the OpenCV implementation allow some of these, and control their accuracy via the mask size.
The function can return the output measurement image (floating point) - as well as a labelled connected components image (a Voronoi diagram). There is an example of it in operation here.
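To make the definition concrete, here is a small sketch of a distance transform with the Manhattan metric, computed with the classic two-pass scan. It follows the convention described above (distance from every pixel to the nearest non-zero pixel) and is not OpenCV's implementation; the `Mask` type and the function name are assumptions for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// mask[y][x] != 0 marks a boundary (feature) pixel; rows have equal length.
using Mask = std::vector<std::vector<unsigned char>>;

// Manhattan (city block) distance from every pixel to the nearest
// non-zero pixel. Pixels keep the value height + width if the mask
// contains no feature pixel at all.
std::vector<std::vector<int>> manhattanDistance(const Mask& mask) {
    assert(!mask.empty() && !mask[0].empty());
    const int h = static_cast<int>(mask.size());
    const int w = static_cast<int>(mask[0].size());
    const int inf = h + w;  // larger than any distance inside the image

    std::vector<std::vector<int>> d(h, std::vector<int>(w, inf));
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            if (mask[y][x]) d[y][x] = 0;

    // Forward pass: propagate distances from the top and from the left.
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            if (y > 0) d[y][x] = std::min(d[y][x], d[y - 1][x] + 1);
            if (x > 0) d[y][x] = std::min(d[y][x], d[y][x - 1] + 1);
        }
    // Backward pass: propagate distances from the bottom and from the right.
    for (int y = h - 1; y >= 0; --y)
        for (int x = w - 1; x >= 0; --x) {
            if (y < h - 1) d[y][x] = std::min(d[y][x], d[y + 1][x] + 1);
            if (x < w - 1) d[y][x] = std::min(d[y][x], d[y][x + 1] + 1);
        }
    return d;
}
```

Two passes are enough for the Manhattan metric; an exact Euclidean transform needs a different algorithm.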
I see from another question you asked recently that you are looking to register two images together. I don't think the distance transform is really what you are looking for here. If you are looking to align a set of points, I would instead suggest you look at techniques like Procrustes, Iterative Closest Point, or RANSAC.
I am saving my driven X/Y coordinates, then using a function that converts the coordinates to meters and adds 1280 to each point (so it fits nicely into a 2560x2560 image), and then drawing a polygon between the 'points', resulting in some sort of racing line. But once I have generated the polygon and saved it as an image, it is vertically flipped somehow. Flipping the image vertically makes it match the track bitmaps perfectly. I was told this is due to DirectX internally having the Y axis flipped. Why does DirectX use a flipped Y axis?
Well, the question is, does DirectX have a flipped Y-axis or does the image?
DirectX uses a 3D/4D coordinate system where the X-axis points to the right and the Y-axis points upwards when no transformation is applied. This is because the screen (where the Y-axis points downwards) is the last stage that has to process the image. Every step before that uses the coordinate system with the upward Y-axis. Since Direct3D is designed for 3D worlds, a coordinate system that is aligned like the world and like most coordinate systems in maths is much more convenient for the programmer and designer. Imagine you were creating a 3D model. Wouldn't it be kind of weird to design it so that the Y-axis points downwards?
When you have no transformation at all (nothing that would add perspective and so on), you have the same coordinate system. Ignoring the Z-axis, the top left corner is (-1 | 1) and the bottom right corner is (1 | -1). This matches the coordinate systems used in e.g. maths. In the end, this coordinate system is transformed by the viewport, which results in the top left corner being (0 | 0) and the bottom right corner being (ResolutionX | ResolutionY).
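The mapping between those corners can be written out as a small sketch. The function name `ndcToScreen` and the `Pixel` type are assumptions for this example, and the formula covers only a full-screen viewport with no offset.

```cpp
#include <cassert>

struct Pixel { double x, y; };

// Maps (-1 | 1) to (0 | 0) and (1 | -1) to (width | height).
// The y term is where the vertical flip happens: y up becomes y down.
Pixel ndcToScreen(double ndcX, double ndcY, double width, double height) {
    assert(width > 0.0 && height > 0.0);
    return {(ndcX + 1.0) * 0.5 * width, (1.0 - ndcY) * 0.5 * height};
}
```

With a 2560x2560 target, the center (0 | 0) lands on (1280 | 1280). Points written to an image with the world's upward Y and no such flip end up mirrored vertically, which matches the effect described in the question.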
So all in all, the reason why the Y-axis points upwards is that Direct3D's main purpose is to describe worlds in a convenient way independently of the screen's physical attributes.