XNA and Matrices

So, thanks to many of the replies from this board, I have a much better understanding of some of what's under the hood, but I need just this little bit more to get a firm understanding. I am reading through part of Riemer's tutorials here:
http://www.riemers.net/eng/Tutorials/XNA/Csharp/Series2D/Coll_Detection_Matrices.php
I'll use the carriage example as it's fairly simple.
So the series of transformations for the carriage is this:
1: Matrix.Identity;
2: Matrix.CreateTranslation(xPos, yPos, 0)
3: Matrix.CreateScale(playerScaling)
4: Matrix.CreateTranslation(0, -carriage.Height, 0)
So my questions are these:
Riemer states:
1: If we would render the image just like it is (or: with the Identity transformation), it would be rendered in its original size in the top-left corner of the screen.
I don't get in what context this is applicable. Is he just referring to the fact that before the draw, all images start at the world origin of 0,0? Or is he saying that when these transformations take effect, the origin of the object will now be placed at the upper-left and translated back to the world origin?
2: First, the carriage image is moved so its top-left point is at the position specified as second argument in the SpriteBatch.Draw method.
Ok, so is he saying this is what happens under the hood during the draw method, or is he just passing the same parameter that you send to the draw method?
4: Finally, and this is the most challenging step: the image is moved over the Y axis, since in our SpriteBatch.Draw method we’ve specified (0, carriageTexture.Height) as origin. Very important: it is moved over its own Y axis, which has been scaled down. So instead of being moved over 39 screen pixels, the carriage will be moved vertically over 39*0.4 ≈ 16 pixels (since carriageTexture.Height = 39 and playerScaling = 0.4).
Ok, so we've been positioning things using these matrices in our world/screen space. When you scale something down, I don't get why it now starts moving in accordance with the object's local space. For example, our first translation matrix moved according to world/screen space, but now he states "it is moved over its own Y axis".
Why is the identity matrix optional?
Thus, I am having trouble connecting the draw method back to these matrix multiplications, understanding what the deal is with the object going to the world origin, and seeing why CreateTranslation shifted its behavior.

1.1: Images are drawn at the upper-left corner by default on this system. Not all systems are like that, so it's worth pointing that out in the documentation. So if none of the transformations are applied (or only the identity one, which doesn't do anything), the image will be drawn at its original size in the upper-left corner.
2.1: Once you've applied the transformation in step 2, it will translate (move) the image by (xPos, yPos) across the screen. When exactly this happens isn't that important, but it'll probably be applied by the graphics card at draw time.
4.1: No, you've always been positioning in local space. It's just that until step 3, local space has been the same size as world/screen space. Now local space is scaled, so translations are scaled as well (a short numeric sketch of this follows after point 4.2).
4.2: Most systems initialize matrices to the identity matrix. The identity matrix by definition doesn't do anything (if you multiply it with a matrix A you get A back). So if you omit it, it doesn't matter. It is useful for being explicit, or for resetting a matrix later.
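To make point 4.1 concrete, here is a tiny sketch of the arithmetic. It is plain C++ rather than XNA code, and xPos/yPos are assumed values rather than anything from the tutorial; the point is simply that the local origin offset of 39 pixels shrinks to roughly 16 once the scale is in the chain.

// Minimal sketch (plain C++, not XNA): why a local offset gets scaled.
#include <cstdio>

struct Vec2 { float x, y; };

int main() {
    const float playerScaling = 0.4f;          // value from the question
    const float xPos = 200.0f, yPos = 300.0f;  // assumed screen position

    // Local-space offset of the carriage: the origin (0, carriageTexture.Height)
    // shifts the image up by its own height before anything else happens.
    Vec2 p{0.0f, -39.0f};

    // Apply the scale (step 3)...
    p.x *= playerScaling;
    p.y *= playerScaling;          // -39 * 0.4 = -15.6 screen pixels, no longer -39

    // ...then the world-space translation (step 2).
    p.x += xPos;
    p.y += yPos;

    std::printf("top-left ends up at (%.1f, %.1f)\n", p.x, p.y);  // (200.0, 284.4)
    return 0;
}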

Related

Why did Apple flip their unit circle for UIBezierPath?

The image on the left is what a typical unit circle looks like. The one on the right is from the documentation. I haven't seen a more in-depth explanation anywhere online for why it was flipped. Why is this so?
A drawing of a circle is often represented by x = center.x + r * cos(φ) and y = center.y + r * sin(φ) as φ progresses from 0 to 2π. With a standard Cartesian coordinate system, with the origin in the lower-left corner, this results in a circle drawn starting at 3 o'clock and proceeding counterclockwise. See the diagram on the left, below.
But the iOS coordinate system has the y-axis flipped from the standard Cartesian coordinate system, with the origin in the upper-left corner and y increasing as you move down the screen. See the right diagram below:
(This is adapted from Coordinate Systems in the Quartz 2D Programming Guide. The original diagram in the Apple documentation is merely illustrating how the coordinate systems are flipped, but I've changed the arrow to more accurately represent how this affects the drawing of an arc from 0 to π/2.)
The result is that, when using the iOS coordinate system, it will start at 3 o'clock but then proceed clockwise.
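A quick way to convince yourself of this is to sample the parametric circle and think about where the points land under each convention. This is a small standalone C++ sketch (not UIKit code), using the same formulas quoted above:

// Standalone sketch (plain C++, not UIKit): where the parametric circle points land.
#include <cmath>
#include <cstdio>

int main() {
    const double PI = 3.14159265358979323846;
    const double r = 1.0, cx = 0.0, cy = 0.0;

    const double phis[] = {0.0, PI / 2.0};
    for (double phi : phis) {
        double x = cx + r * std::cos(phi);
        double y = cy + r * std::sin(phi);
        std::printf("phi = %4.2f -> (%.1f, %.1f)\n", phi, x, y);
    }
    // Both conventions produce the same numbers: (1, 0) and (0, 1).
    // With y increasing upward, (0, 1) is at 12 o'clock, so the arc runs counterclockwise.
    // With UIKit's y increasing downward, (0, 1) sits BELOW the center (6 o'clock),
    // so the very same arc appears to run clockwise on screen.
    return 0;
}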
Because iOS draws everything upside-down.
In OS X (and just about every math book, and also by default in Core Graphics which came from OS X), the origin is in the lower-left corner and the Y-axis increases as you move up. In that coordinate system, the angles lay out the way you think they should. In UIKit's upside-down coordinate system, everything is flipped.
What's interesting is that the direction of the angles bothered you, but the bizarre Y-axis did not. (*) This inverted intuition among programmers is likely the reason that Apple flipped the coordinate system when they wrote iOS, but you can still see the artifacts here and there. (In fairness to Apple, layout code that mimics a page, like text views and scroll views, is much easier to compute when Y increases downward. Since those are very common design elements, it's not so crazy that UIKit flips the axes. It also goes to show that these things are very arbitrary in math and computers.)
(*) Yes, you noted where the origin was located, and that this was a likely part of the answer, but allow me some hyperbole to make the next point :D

How to find orientation of a picture with Delphi

I need to find the orientation of corn pictures (examples below); they are tilted at different angles to the right or left. I need to turn them upright (at a 90 degree angle with their normal), so that they look like a water drop.
Is there any way I can do it easily?
As a starting point, find the image moments (and Hu moments for complex forms like a pear). From the link:
Information about image orientation can be derived by first using the second order central moments to construct a covariance matrix.
I suspect that using an image processing library like OpenCV would give more reliable results in the common case.
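For reference, here is a hedged OpenCV (C++) sketch of that idea: the orientation angle follows from the second-order central moments as θ = ½ · atan2(2·μ11, μ20 − μ02). The file name and the Otsu thresholding step are assumptions, not part of the original answer.

// Sketch: estimate the dominant orientation of a blob from its central moments.
#include <opencv2/opencv.hpp>
#include <cmath>
#include <cstdio>

int main() {
    cv::Mat gray = cv::imread("corn.png", cv::IMREAD_GRAYSCALE);   // hypothetical file name
    if (gray.empty()) return 1;

    cv::Mat bin;
    cv::threshold(gray, bin, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);  // assumed segmentation

    cv::Moments m = cv::moments(bin, /*binaryImage=*/true);

    // Orientation of the major axis, derived from the covariance built out of
    // the second-order central moments.
    double theta = 0.5 * std::atan2(2.0 * m.mu11, m.mu20 - m.mu02);   // radians vs. the x axis
    std::printf("orientation: %.1f degrees\n", theta * 180.0 / CV_PI);
    return 0;
}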
From the OP I got the impression you are a rookie in this, so I will stick to something simple:
compute bounding box of image
Simple enough: go through all pixels and remember the min and max x,y coordinates of non-background pixels (see the sketch after these steps).
compute critical dimensions
Just cast a few scanlines through the bounding box and compute the red points' positions. For the start points I chose 25%, 50%, and 75% of the height. First start from the left and stop at the first non-background pixel; then start from the right and stop at the first non-background pixel.
axis aligned position
Start rotating the image by some step and remember/stop at the position where the red dots are symmetric, i.e. almost the same distance from the left as from the right. The bounding box also has maximal height and minimal width in the axis-aligned position, so you can exploit that instead ...
determine the position
You get 4 options. If I call the distances l0,l1,l2,r0,r1,r2
(l means from the left, r means from the right;
0 is the upper (bluish) line, 1 the middle, 2 the bottom)
then the position you want is the one where (l0==r0)>=(l1==r1)>=(l2==r2) and the bounding box is bigger along the y axis than along the x axis. So either rotate by 90 degrees until a match is found, or determine the orientation directly from the distances and rotate just once ...
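Here is a rough C++ sketch of the first two steps. It is written with OpenCV for pixel access rather than VCL's Graphics::TBitmap, and the near-white background test, the file name, and the 25/50/75% scan heights are placeholders to tune.

// Sketch: bounding box of non-background pixels plus left/right gaps on three scanlines.
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cstdio>

static bool isBackground(const cv::Vec3b& px) {
    return px[0] > 240 && px[1] > 240 && px[2] > 240;    // placeholder: treat near-white as background
}

int main() {
    cv::Mat img = cv::imread("corn.png");                // hypothetical file name
    if (img.empty()) return 1;

    int minX = img.cols, minY = img.rows, maxX = -1, maxY = -1;
    for (int y = 0; y < img.rows; ++y)
        for (int x = 0; x < img.cols; ++x)
            if (!isBackground(img.at<cv::Vec3b>(y, x))) {
                minX = std::min(minX, x); maxX = std::max(maxX, x);
                minY = std::min(minY, y); maxY = std::max(maxY, y);
            }
    if (maxX < 0) return 1;                              // nothing but background

    const double fractions[] = {0.25, 0.50, 0.75};       // scanline heights inside the box
    for (double f : fractions) {
        int y = minY + static_cast<int>(f * (maxY - minY));
        int l = 0, r = 0;
        while (minX + l <= maxX && isBackground(img.at<cv::Vec3b>(y, minX + l))) ++l;
        while (maxX - r >= minX && isBackground(img.at<cv::Vec3b>(y, maxX - r))) ++r;
        std::printf("row %d: left gap %d px, right gap %d px\n", y, l, r);
    }
    return 0;
}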
[Notes]
You will need access to the pixels of the image, so I strongly recommend using Graphics::TBitmap from VCL. Look here: gfx in C, especially the section GDI Bitmap; and also at this: finding horizon on high altitude photo, which might help a bit.
I use C++ and VCL, so you will have to translate to Pascal, but the VCL stuff is the same...

Understanding Distance Transform in OpenCV

What is the distance transform? What is the theory behind it? If I have 2 similar images but in different positions, how does the distance transform help in overlapping them? The results that the distance transform function produces look like they are divided in the middle - is it to find the center of one image so that the other is overlapped just half way? I have looked into the documentation of OpenCV but it's still not clear.
Look at the picture below (you may want to increase your monitor brightness to see it better). The picture shows the distance from the red contour depicted with pixel intensities, so in the middle of the image, where the distance is maximum, the intensities are highest. This is a manifestation of the distance transform. Here is an immediate application - the green shape is a so-called active contour, or snake, which moves according to the gradient of distances from the contour (while also following some other constraints) and curls around the red outline. Thus one application of the distance transform is shape processing.
Another application is text recognition - one of the powerful cues for text is the stable width of a stroke. The distance transform run on segmented text can confirm this. A corresponding method is called the stroke width transform (SWT).
As for aligning two rotated shapes, I am not sure how you can use the DT. You can find the center of a shape to rotate it about, but you can rotate it about any other point as well; the difference will just be a translation, which is irrelevant if you run matchTemplate to match them in the correct orientation.
Perhaps if you upload your images it will be clearer what to do. In general you can match them as a whole, or by features (which is more robust to various deformations or perspective distortions), or even using outlines/silhouettes if there are only a few features. Finally, you can figure out the orientation of your object (if it has a dominant orientation) by running PCA or fitting an ellipse (as a rotated rectangle).
cv::RotatedRect rect = cv::fitEllipse(points2D);
float angle_to_rotate = rect.angle;
The distance transform is an operation that works on a single binary image and fundamentally measures the distance from every empty point (zero pixel) to the nearest boundary point (non-zero pixel).
An example is provided here and here.
The measurement can be based on various definitions, calculated discretely or precisely: e.g. Euclidean, Manhattan, or Chessboard. Indeed, the parameters in the OpenCV implementation allow some of these, and control their accuracy via the mask size.
The function can return the output measurement image (floating point) - as well as a labelled connected components image (a Voronoi diagram). There is an example of it in operation here.
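For concreteness, a minimal OpenCV (C++) usage sketch; the input file name and the threshold value are assumptions:

// Sketch: distance transform of a binary image, plus the labelled (Voronoi) output.
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat src = cv::imread("shape.png", cv::IMREAD_GRAYSCALE);   // hypothetical input
    if (src.empty()) return 1;

    cv::Mat bin;
    cv::threshold(src, bin, 127, 255, cv::THRESH_BINARY);

    // For each pixel, the distance to the nearest zero pixel (OpenCV's convention),
    // using the exact Euclidean metric.
    cv::Mat dist;
    cv::distanceTransform(bin, dist, cv::DIST_L2, cv::DIST_MASK_PRECISE);

    // Variant that also returns the Voronoi labelling (nearest connected component per pixel).
    cv::Mat dist2, labels;
    cv::distanceTransform(bin, dist2, labels, cv::DIST_L2, cv::DIST_MASK_5,
                          cv::DIST_LABEL_CCOMP);

    cv::normalize(dist, dist, 0, 1.0, cv::NORM_MINMAX);             // just for display
    cv::imshow("distance", dist);
    cv::waitKey(0);
    return 0;
}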
I see from another question you have asked recently that you are looking to register two images together. I don't think the distance transform is really what you are looking for here. If you are looking to align a set of points, I would instead suggest you look at techniques like Procrustes, Iterative Closest Point, or RANSAC.

Marker Tracking + perspective warp of marker

I'm tracking a marker with ARToolKit+. I receive a model view matrix that looks about right. Now I'd like to warp the image in a way that the marker looks just like it would look if I looked straight at it. But whatever I do, the result looks just extremely distorted. I know that ARToolKit stores the 4x4 matrix in column major order, so I fixed that for OpenCV.
What I tried so far was:
1) fix the order to row major order
2) calculate the inverse with cvInverse (although transposing the 3x3 rotation part + inverting the translation should suffice)
3) use that matrix with cvPerspectiveWarp
Am I doing something wrong?
tl;dr:
I want this: https://www.youtube.com/watch?v=qZ-LU-C2p2Q
I get some distorted lines and lots of black instead.
Your problem is in converting from 4x4 to 3x3. The short answer is that you want to drop the 3rd column and bottom row to make the 3x3 and then premultiply with your camera matrix. For a longer explanation see here
Clarification
The pose you get from ARTK represents a transform from one place to another. When I say "the initial image appears without rotation" I meant that your transform goes from an initial state which has no rotation about the x or y axis to the current state. That is a fine assumption for most augmented reality applications, I mentioned it just to be thorough.
As for why you can drop the 3rd column. Since you are transforming a plane, your z coordinate can be completely expressed by your x and y coordinates given the equation of your plane. If we assume that initially there is no rotation then your initial z coordinate is a constant value. If there is rotation then z is not constant but it varies deterministically in x and y according to its plane equation which can still be expressed in one matrix (though you don't need that). Since in your case your 4x4 transform is probably expressing the transform from the marker lying flat at z = 0 to its current position, the 3rd column of your 4x4 matrix does nothing (it all gets multiplied by 0) so it can be dropped without affecting the result.
In short: Forget about the rotation stuff, its more complicated than you need, just realize that the transform is from initial coordinates to final coordinates and your initial coordinates are always
[x,y,0,1]
which makes your third column irrelevant.
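Here is a hedged sketch of that reduction in C++ with OpenCV types. The intrinsics (camera_K) and the identity pose are placeholders for your real values, and the column-major fix mentioned in the question is assumed to have been applied already.

// Sketch: collapse a 4x4 marker pose into a 3x3 homography for a planar marker at z = 0.
#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    // Placeholder intrinsics and pose (row-major); substitute your real values.
    cv::Matx33d camera_K(800,   0, 320,
                           0, 800, 240,
                           0,   0,   1);
    cv::Matx44d pose = cv::Matx44d::eye();     // the ARToolKit model-view matrix goes here

    // Drop the 3rd column (it only ever multiplies z, which is 0 on the marker plane)
    // and the bottom row, keeping [r1 r2 t]:
    cv::Matx33d Rt(pose(0,0), pose(0,1), pose(0,3),
                   pose(1,0), pose(1,1), pose(1,3),
                   pose(2,0), pose(2,1), pose(2,3));

    // Premultiply with the camera matrix to get the marker-plane -> image homography.
    cv::Matx33d H = camera_K * Rt;

    std::cout << cv::Mat(H) << std::endl;
    return 0;
}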
Update
I'm sorry! I just re-read your question and realized you just want to warp the marker so it looks like a straight-on view; I got caught up in describing a general transform from 4x4 to 3x3. The 4x4 transform you get from ARTK is not the transform that will de-warp the marker, it is the transform that moves the marker from the origin to its final position. To de-warp the marker like you asked, the process is similar but would be slightly different. I haven't done that before, but here is my guess.
First, you need to get the 4x4 transform between where the marker is in world space and where you would like it to appear to be after warping it. Right now the transform goes from the origin to the marker location. To change the transform so that it goes from some point farther down the z axis (say 100) to the marker location, define the transform:
initial_marker_pose = [ 1, 0, 0,   0
                        0, 1, 0,   0
                        0, 0, 1, 100
                        0, 0, 0,   1 ];
Now you have the transform from the origin to what you want as your "inital" position, and the transform from the origin to your "final" position. To get the transform from initial to final simply
initial_to_final = origin_to_marker*initial_marker_pose.inv();
Now you would follow the process outlined in the link I gave you; in this case your initial z position is no longer 0, it is 100. Then, when you are finished, you will need to invert your 3x3 matrix. That is because this process takes you from a straight-on view to the one defined by the pose from ARTK, and you want the opposite of that. You will need to experiment with the initial z position: the smaller it is, the larger your marker will appear after de-warping.
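Putting the pieces together, the de-warp might look roughly like the sketch below. This is only a guess at the recipe described above, in C++ with OpenCV types: origin_to_marker, the intrinsics and the initial z of 100 are stand-ins, and because the initial z is 100 rather than 0, the z column is folded into the translation instead of being dropped outright.

// Sketch: de-warp the observed marker towards a (roughly) straight-on view.
#include <opencv2/opencv.hpp>

cv::Mat dewarpMarker(const cv::Mat& frame,
                     const cv::Matx44d& origin_to_marker,   // pose from ARToolKit, row-major
                     const cv::Matx33d& camera_K)            // camera intrinsics
{
    // "Initial" pose: the marker sitting flat, 100 units down the z axis.
    cv::Matx44d initial_marker_pose = cv::Matx44d::eye();
    initial_marker_pose(2, 3) = 100.0;

    // Transform from the initial (straight-on) pose to the observed pose.
    cv::Matx44d M = origin_to_marker * initial_marker_pose.inv();

    // Points on the marker plane have z = 100 in the initial frame, so instead of
    // dropping the z column we fold 100 * (z column) into the translation column.
    const double z0 = 100.0;
    cv::Matx33d Rt(M(0,0), M(0,1), z0 * M(0,2) + M(0,3),
                   M(1,0), M(1,1), z0 * M(1,2) + M(1,3),
                   M(2,0), M(2,1), z0 * M(2,2) + M(2,3));
    cv::Matx33d H = camera_K * Rt;   // initial-plane coordinates -> observed image

    // We want the opposite direction (observed image -> straight-on view), so invert.
    cv::Mat dewarped;
    cv::warpPerspective(frame, dewarped, cv::Mat(H.inv()), frame.size());
    return dewarped;
}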
Hopefully that works, sorry for the confusion about your question.

Given a set of points to define a shape, how can I contract this shape like Photoshop's Selection>Contract

I have a set of points to define a shape. These points are in order and essentially are my "selection".
I want to be able to contract this selection by an arbitrary amount to get a smaller version of my original shape.
In a basic example with a triangle, the points are simply moved along their normals, which are defined by the points to the left and the right of the point in question.
Eventually all 3 points will meet and form one point but until that point they will make a smaller and smaller triangle.
For more complex shapes, when moving the individual points inward, they may pass through the outer edge of the shape resulting in weird artifacts. Obviously I'll need to cull these points and remove them from the array.
Any help in exactly how I can do that would be greatly appreciated.
Thanks!
This is just an idea but couldn't you find the center of mass of the object, create a vector from the center to each point, and move each point along this vector?
To find the center of mass would of course involve averaging the x and y coordinates. Getting a vector is as simple as subtracting the center point from the point in question. Normalizing and scaling are common vector operations that can be found with the Google.
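A small C++ sketch of that idea, assuming a simple hypothetical Point2 struct. To contract rather than expand, each point is pulled toward the centre, i.e. moved along the reverse of the centre-to-point vector by a fixed amount.

// Sketch: pull every point a fixed distance toward the centroid of the shape.
#include <cmath>
#include <vector>

struct Point2 { double x, y; };

std::vector<Point2> contract(const std::vector<Point2>& pts, double amount) {
    // Centre of mass: average of the coordinates.
    Point2 c{0.0, 0.0};
    for (const auto& p : pts) { c.x += p.x; c.y += p.y; }
    c.x /= pts.size();
    c.y /= pts.size();

    std::vector<Point2> out;
    out.reserve(pts.size());
    for (const auto& p : pts) {
        double dx = c.x - p.x, dy = c.y - p.y;               // vector from the point toward the centre
        double len = std::hypot(dx, dy);
        if (len <= amount) { out.push_back(c); continue; }   // would overshoot the centre
        out.push_back({p.x + dx / len * amount,              // normalize, scale, move
                       p.y + dy / len * amount});
    }
    return out;
}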
EDIT
Another way to interpret what you're asking is that you want to erode your collection of points, as in morphological erosion. This is typically applied to binary images, but you can slightly modify the concept to work with a collection of points. Essentially, you need to write a function that, given a point, will return true (black) or false (white) depending on whether that point is inside or outside the shape defined by your points. You'd have to look up how to do that for shapes that aren't convex (it's harder but not impossible).
Now, obviously, every single one of your actual points will return false because they're all on the border (by definition). However, you now have a matrix of points around your point of interest that defines where is "inside" and where is "outside". Average all of the "inside" points and move your actual point along the vector from itself towards this average. You could play with different erosion kernels to see what works best.
You could even work with a kernel with floating-point weights instead of either/or values, which will affect your average calculation in proportion to the weights. With this, you could approximate a circular kernel with a low number of points. Try the simpler method first.
Find the selection center (as suggested by colithium)
Map the selection points to the coordinate system with the selection center at (0,0). For example, if the selection center is at (150,150), and a given selection point is at (125,75), the mapped position of the point becomes (-25,-75).
Scale the mapped points (multiply X and Y by something in the range of 0.0..1.0)
Remap the points back to the original coordinate system
Only simple maths required; no need to muck about with normalizing vectors.
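Those four steps collapse into a few lines of code. A C++ sketch, with a hypothetical Point2 struct and a contraction factor in the 0.0..1.0 range:

// Sketch: contract a selection by scaling each point toward the selection centre.
#include <vector>

struct Point2 { double x, y; };

std::vector<Point2> contractByScale(const std::vector<Point2>& pts, double factor /* 0.0 .. 1.0 */) {
    // 1) Selection centre: average of the point coordinates.
    Point2 c{0.0, 0.0};
    for (const auto& p : pts) { c.x += p.x; c.y += p.y; }
    c.x /= pts.size();
    c.y /= pts.size();

    // 2-4) Map each point to centre-relative coordinates, scale, and map back.
    //      e.g. centre (150,150), point (125,75) -> mapped (-25,-75) -> scaled -> remapped.
    std::vector<Point2> out;
    out.reserve(pts.size());
    for (const auto& p : pts)
        out.push_back({c.x + (p.x - c.x) * factor,
                       c.y + (p.y - c.y) * factor});
    return out;
}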
