O'Reilly book clarification on 2D linear system - opencv

The O'Reilly book "Learning OpenCV" states on page 356:
Quote
Before we get totally lost, let’s consider a particular realistic situation of taking measurements
on a car driving in a parking lot. We might imagine that the state of the car could
be summarized by two position variables, x and y, and two velocities, vx and vy. These
four variables would be the elements of the state vector x_k. This suggests that the correct form for F is:
x_k = [ x;
        y;
        vx;
        vy ]

F = [ 1, 0, dt,  0;
      0, 1,  0, dt;
      0, 0,  1,  0;
      0, 0,  0,  1 ]
It seems natural to put dt right there in the F matrix, but I just don't get why. And if I have an n-state system, how would I know where to place the dt entries in F?

The dt terms are the coefficients that tie each velocity to its corresponding position. If you write out the state update after a time dt has elapsed:
x(t+dt) = x(t) + dt * vx(t)
y(t+dt) = y(t) + dt * vy(t)
vx(t+dt) = vx(t)
vy(t+dt) = vy(t)
You can read F off of these equations pretty easily.
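The same pattern generalizes to any constant-velocity model: every position row gets a dt in the column of its own velocity, and the velocity rows stay identity. A minimal numpy sketch of that construction (my own illustration, not from the book):

import numpy as np

def constant_velocity_F(n_pos, dt):
    # Transition matrix for a state ordered as [p_1..p_n, v_1..v_n]^T.
    # Each position is updated as p_i + dt * v_i; the velocities stay constant.
    F = np.eye(2 * n_pos)
    F[:n_pos, n_pos:] = dt * np.eye(n_pos)  # dt couples p_i to its velocity v_i
    return F

# The 2D example from the book: state [x, y, vx, vy]
print(constant_velocity_F(2, dt=0.1))

If your state is ordered differently (e.g. [x, vx, y, vy]), the dt entries simply move to whichever column holds the velocity that drives each position.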


Bilinear Interpolation from Wikipedia

I've tried reading about bilinear interpolation on the Wikipedia page https://en.wikipedia.org/wiki/Bilinear_interpolation, I have implemented one of the algorithms, and I wanted to know if I'm doing it right or not.
This is what I tried to implement in code, based on the unit-square formula from that page:
for (int i = 0; i < w; i++) {
    for (int j = 0; j < h; j++) {
        new_img(i,j,0) = old_img(i,j,0)*(1-i)*(1-j) + old_img(i+1,j,0)*i*(1-j)
                       + old_img(i,j+1,0)*(1-i)*j   + old_img(i+1,j+1,0)*i*j;
    }
}
Is that how to implement it?
Bilinear interpolation deals with resampling an image. It tries to estimate the value of an array (e.g. an image) between the known values. For example, if I have an image and I want to estimate its value at the location (10.5, 24.5), that value will be a weighted average of the values at the four known neighbors (10, 24), (10, 25), (11, 24) and (11, 25). The formula is the equation you posted in your question.
This is the python code:
scale_factor = 0.7
shape = np.array(image.shape[:2])
new_shape = np.ceil(shape * scale_factor).astype(int)
grid = np.meshgrid(np.linspace(0, 1, new_shape[1]), np.linspace(0, 1, new_shape[0]))
grid_mapping = [i * s for i, s in zip(grid, shape[::-1])]
resized_image = np.zeros(tuple(new_shape))
for x_new_im in range(new_shape[1]):
for y_new_im in range(new_shape[0]):
mapping = [grid_mapping[i][y_new_im, x_new_im] for i in range(2)]
x_1, y_1 = np.floor(mapping).astype(int)
try:
resized_image[y_new_im, x_new_im] = do_interpolation(f=image[y_1: y_1 + 2, x_1: x_1 + 2], x=mapping[0] - x_1, y=mapping[1] - y_1)
except ValueError:
pass
def do_interpolation(f, x, y):
return np.array([1 - x, x]).dot(f).dot([[1 - y], [y]])
Explanation:
new_shape = np.ceil(shape * scale_factor).astype(int) - calculates the new image shape after rescaling.
grid = np.meshgrid(np.linspace(0, 1, new_shape[1]), np.linspace(0, 1, new_shape[0]))
grid_mapping = [i * s for i, s in zip(grid, shape[::-1])] - generate an x,y grid with one sample per pixel of the resized image, where x runs from 0 to the width of the original image and y runs from 0 to its height. This gives us the inverse mapping from the new image back to the original one.
In the for loop we look at every pixel of the resized image and find where its source lies in the original image. That source location will not be an integer but a float (it has a fractional part).
Now we take the four pixels of the original image that surround the mapped x and y. For example, if the mapped x,y is (17.3, 25.7) we take the four values of the original image at the pixels (17, 25), (17, 26), (18, 25) and (18, 26), and then apply the equation you posted.
Note:
I added a try/except because I did not want to deal with the boundaries, but you can edit the code to handle them.
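For example, one way to handle the boundaries instead of the try/except is to clamp the corner indices so the 2x2 neighbourhood always stays inside the image. A small self-contained sketch of a single bilinear sample done that way (my own addition, not part of the original answer):

import numpy as np

def sample_bilinear(image, x, y):
    # Bilinear sample of a 2D array at a fractional (x, y) position.
    # The position and corner indices are clamped so the 2x2 patch never
    # leaves the image, which removes the need for the try/except above.
    h, w = image.shape[:2]
    x = float(np.clip(x, 0, w - 1))
    y = float(np.clip(y, 0, h - 1))
    x_1 = min(int(np.floor(x)), w - 2)
    y_1 = min(int(np.floor(y)), h - 2)
    fx, fy = x - x_1, y - y_1
    patch = image[y_1: y_1 + 2, x_1: x_1 + 2]
    return np.array([1 - fy, fy]).dot(patch).dot(np.array([1 - fx, fx]))

# Tiny check: halfway between two pixels of a horizontal gradient gives the mean.
img = np.array([[0.0, 10.0], [0.0, 10.0]])
print(sample_bilinear(img, 0.5, 0.0))  # 5.0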

How to compute new camera extrinsics after coordinates system transformation?

I am trying to compute new camera extrinsics after a coordinate transformation. Say the old extrinsics (rotation and translation) are R0, T0. I first rotate the coordinate system by α about the X-axis, then by β about the Y-axis, and then by γ about the Z-axis; α, β, γ are in radians and the rotations follow the right-hand rule. I also translate the coordinate system by Tt.
Let's denote:
Rx = { 1,     0,      0;
       0,  cosα, -sinα;
       0,  sinα,  cosα }
Ry = {  cosβ, 0, sinβ;
            0, 1,    0;
        -sinβ, 0, cosβ }
Rz = { cosγ, -sinγ, 0;
       sinγ,  cosγ, 0;
          0,     0, 1 }
So after this transformation, will the new extrinsics become R = R0 * Rx * Ry * Rz and T = T0 + Tt?
Also, I wonder what the new coordinates of a given 3D point will look like. Say the old 3D coordinates are P0 (3x1); will its new coordinates become Rx * Ry * Rz * P0 - Tt?
Thank you very much!
I got the answer after a lot of tests.
After the transformation, the new extrinsics will be:
R = R0 * Rx^(-1) * Ry^(-1) * Rz^(-1);  // ^(-1) is the inverse, which for a rotation matrix equals the transpose
T = Rz * Ry * Rx * T0 + Tt;
// New coordinates:
P = Rz * Ry * Rx * P0 + Tt;
NB: it is easy to make a mistake with T; it is not simply T0 + Tt.
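These formulas are self-consistent if the extrinsics follow the convention x_cam = R * (X - T), i.e. T is the camera position expressed in world coordinates; under the other common convention x_cam = R * X + T the translation update would instead be T = T0 - R_new * Tt. A quick numpy check of the formulas above under the first convention (my own sketch, not from the original post):

import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(b):
    c, s = np.cos(b), np.sin(b)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(g):
    c, s = np.cos(g), np.sin(g)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

rng = np.random.default_rng(0)
R0 = rot_z(0.3) @ rot_y(-0.2) @ rot_x(0.1)   # old extrinsic rotation
T0 = rng.normal(size=3)                      # old translation (camera position)
Rw = rot_z(0.7) @ rot_y(0.5) @ rot_x(0.4)    # Rz * Ry * Rx of the frame change
Tt = rng.normal(size=3)                      # translation of the frame change

P0 = rng.normal(size=3)                      # a point in the old world coordinates
P = Rw @ P0 + Tt                             # the same point in the new coordinates

R_new = R0 @ np.linalg.inv(Rw)               # R0 * Rx^(-1) * Ry^(-1) * Rz^(-1)
T_new = Rw @ T0 + Tt

# The point seen from the camera must not change under a pure re-parametrization:
print(np.allclose(R0 @ (P0 - T0), R_new @ (P - T_new)))   # True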

How to create a 2D perspective transform matrix from individual components?

I am trying to create a 2D perspective transform matrix from individual components like translation, rotation, scale and shear, but in the end the matrix does not produce a true perspective effect like the image below. I think I am missing some component in the code I wrote to build the matrix. Could someone help me add the missing components and their formulation in the function below? I have used the OpenCV library for my code.
cv::Mat getPerspMatrix2D(double rz, double s, double tx, double ty, double shx, double shy)
{
    cv::Mat R = (cv::Mat_<double>(3,3) <<
        cos(rz), -sin(rz), 0,
        sin(rz),  cos(rz), 0,
        0,        0,       1);

    cv::Mat S = (cv::Mat_<double>(3,3) <<
        s, 0, 0,
        0, s, 0,
        0, 0, 1);

    cv::Mat Sh = (cv::Mat_<double>(3,3) <<
        1,   shx, 0,
        shy, 1,   0,
        0,   0,   1);

    cv::Mat T = (cv::Mat_<double>(3,3) <<
        1, 0, tx,
        0, 1, ty,
        0, 0, 1);

    return T * Sh * S * R;
}
Keywords are homography and 8 DOF. As can be taken from 1 and 2, there exist two additional coefficients for the perspective part of the transformation, but a second step is needed to apply them. I'm not familiar with OpenCV, but I hope to answer your question, a bit late, in a basic way ;-)
Step 1
You can think of lx as describing a vanishing point on the x-axis. The image shows a31 = lx = 1. lx = 100 gives less distortion; for lx = 0 the vanishing point is infinitely far away, which means no perspective transform at all, i.e. the identity matrix.
     [ 1   0   0 ]
PL = [ 0   1   0 ]
     [ lx  ly  1 ]
lx and ly are the perspective foreshortening parameters.
Step 2
When you multiply P with a homogeneous point, P * [u; v; 1], you will notice that the last component of the result is in general no longer 1. For an affine transformation it is always 1; for a perspective transformation it is not. So in the second step the result is divided by its last component to make that component 1 again. This division is part of the perspective effect.
Your example image:
Image' = P4 * P3 * P2 * P1 * Image
1. Translate the center of the blue rectangle to the origin: tx = -w/2 and ty = -h/2 (this is P1).
2. Apply the perspective matrix with ly = h, so that both sides get tilted (P2).
3. Translate back so that all points end up in one quadrant again (P3).
4. Scale to the desired size (P4).
The normalization from Step 2 of the perspective projection can be done right after item 2 or at the end; a sketch of the whole idea follows below.
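Putting both steps together, here is a small numpy sketch of the idea (my own illustration; the composition order and the parameter names rz, s, tx, ty, shx, shy, lx, ly follow the discussion above and are not any OpenCV API):

import numpy as np

def persp_matrix_2d(rz, s, tx, ty, shx, shy, lx, ly):
    # Rotation, uniform scale, shear and translation, plus the perspective row (lx, ly).
    R = np.array([[np.cos(rz), -np.sin(rz), 0],
                  [np.sin(rz),  np.cos(rz), 0],
                  [0, 0, 1]])
    S = np.diag([s, s, 1.0])
    Sh = np.array([[1, shx, 0], [shy, 1, 0], [0, 0, 1]])
    T = np.array([[1, 0, tx], [0, 1, ty], [0, 0, 1]])
    PL = np.array([[1, 0, 0], [0, 1, 0], [lx, ly, 1]])  # step 1: perspective part
    return T @ Sh @ S @ R @ PL

def apply_homography(H, pts):
    # Step 2: apply H to Nx2 points and divide by the last homogeneous coordinate.
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

H = persp_matrix_2d(rz=0.0, s=1.0, tx=0.0, ty=0.0, shx=0.0, shy=0.0, lx=0.0, ly=0.002)
square = np.array([[0, 0], [200, 0], [200, 100], [0, 100]], dtype=float)
print(apply_homography(H, square))  # the left and right edges now converge

Without the division in apply_homography the lx/ly row has no visible effect on the first two components, which is exactly why a matrix built from rotation, scale, shear and translation alone can never look truly perspective.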

OpenCV for Unity : 4-point calibration/reprojection

It is my first post on Stack Overflow, so I'm sorry in advance for my clumsiness. Please let me know if I can improve my question in any way.
► What I want to achieve (in a long term):
I am trying to manipulate my Unity3d presentation with a laser pointer using OpenCV for Unity.
I believe one picture is worth more than a thousand words, so this should tell the most:
► What is the problem:
I try to make a simple 4-point calibration (projection) from camera view (some kind of trapezium) into plane space.
I thought it would be something very basic and easy, but I have no experience with OpenCV and I can't make it work.
► Sample:
I made a much less complicated example, without any laser detection and all the other stuff: only a 4-point trapezium that I try to reproject into plane space.
Link to the whole sample project: https://1drv.ms/u/s!AiDsGecSyzmuujXGQUapcYrIvP7b
The core script from my example:
using OpenCVForUnity;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.UI;
using System;
public class TestCalib : MonoBehaviour
{
    public RawImage displayDummy;
    public RectTransform[] handlers;
    public RectTransform dummyCross;
    public RectTransform dummyResult;

    public Vector2 webcamSize = new Vector2(640, 480);
    public Vector2 objectSize = new Vector2(1024, 768);

    private Texture2D texture;
    Mat cameraMatrix;
    MatOfDouble distCoeffs;
    MatOfPoint3f objectPoints;
    MatOfPoint2f imagePoints;
    Mat rvec;
    Mat tvec;
    Mat rotationMatrix;
    Mat imgMat;

    void Start()
    {
        texture = new Texture2D((int)webcamSize.x, (int)webcamSize.y, TextureFormat.RGB24, false);
        if (displayDummy) displayDummy.texture = texture;
        imgMat = new Mat(texture.height, texture.width, CvType.CV_8UC3);
    }

    void Update()
    {
        imgMat = new Mat(texture.height, texture.width, CvType.CV_8UC3);
        Test();
        DrawImagePoints();
        Utils.matToTexture2D(imgMat, texture);
    }

    void DrawImagePoints()
    {
        Point[] pointsArray = imagePoints.toArray();
        for (int i = 0; i < pointsArray.Length; i++)
        {
            Point p0 = pointsArray[i];
            int j = (i < pointsArray.Length - 1) ? i + 1 : 0;
            Point p1 = pointsArray[j];
            Imgproc.circle(imgMat, p0, 5, new Scalar(0, 255, 0, 150), 1);
            Imgproc.line(imgMat, p0, p1, new Scalar(255, 255, 0, 150), 1);
        }
    }

    private void DrawResults(MatOfPoint2f resultPoints)
    {
        Point[] pointsArray = resultPoints.toArray();
        for (int i = 0; i < pointsArray.Length; i++)
        {
            Point p = pointsArray[i];
            Imgproc.circle(imgMat, p, 5, new Scalar(255, 155, 0, 150), 1);
        }
    }

    public void Test()
    {
        float w2 = objectSize.x / 2F;
        float h2 = objectSize.y / 2F;

        /*
        objectPoints = new MatOfPoint3f(
            new Point3(-w2, -h2, 0),
            new Point3(w2, -h2, 0),
            new Point3(-w2, h2, 0),
            new Point3(w2, h2, 0)
        );
        */

        objectPoints = new MatOfPoint3f(
            new Point3(0, 0, 0),
            new Point3(objectSize.x, 0, 0),
            new Point3(objectSize.x, objectSize.y, 0),
            new Point3(0, objectSize.y, 0)
        );
        imagePoints = GetImagePointsFromHandlers();

        rvec = new Mat(1, 3, CvType.CV_64FC1);
        tvec = new Mat(1, 3, CvType.CV_64FC1);
        rotationMatrix = new Mat(3, 3, CvType.CV_64FC1);

        double fx = webcamSize.x / objectSize.x;
        double fy = webcamSize.y / objectSize.y;
        double cx = 0; // webcamSize.x / 2.0f;
        double cy = 0; // webcamSize.y / 2.0f;

        cameraMatrix = new Mat(3, 3, CvType.CV_64FC1);
        cameraMatrix.put(0, 0, fx);
        cameraMatrix.put(0, 1, 0);
        cameraMatrix.put(0, 2, cx);
        cameraMatrix.put(1, 0, 0);
        cameraMatrix.put(1, 1, fy);
        cameraMatrix.put(1, 2, cy);
        cameraMatrix.put(2, 0, 0);
        cameraMatrix.put(2, 1, 0);
        cameraMatrix.put(2, 2, 1.0f);

        distCoeffs = new MatOfDouble(0, 0, 0, 0);

        Calib3d.solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);

        Mat uv = new Mat(3, 1, CvType.CV_64FC1);
        uv.put(0, 0, dummyCross.anchoredPosition.x);
        uv.put(1, 0, dummyCross.anchoredPosition.y);
        uv.put(2, 0, 0);

        Calib3d.Rodrigues(rvec, rotationMatrix);

        Mat P = rotationMatrix.inv() * (cameraMatrix.inv() * uv - tvec);

        Vector2 v = new Vector2((float)P.get(0, 0)[0], (float)P.get(1, 0)[0]);
        dummyResult.anchoredPosition = v;
    }

    private MatOfPoint2f GetImagePointsFromHandlers()
    {
        MatOfPoint2f m = new MatOfPoint2f();
        List<Point> points = new List<Point>();
        foreach (RectTransform handler in handlers)
        {
            Point p = new Point(handler.anchoredPosition.x, handler.anchoredPosition.y);
            points.Add(p);
        }
        m.fromList(points);
        return m;
    }
}
Thanks in advance for any help.
This question is not opencv specific but heavily math-based and more often seen in the realm of computer graphics. What you are looking for is called a Projective Transformation.
A Projective Transformation takes a set of coordinates and projects them onto something. In your case you want to project a 2D point in the camera view to a 2D point on a flat plane.
So we want a projection transform for 2D-Space. To perform a projection transform we need to find the projection matrix for the transformation we want to apply. In this case we need a matrix that expresses the projective deformation of the camera in relation to a flat plane.
To work with projections we first need to convert our points into homogeneous coordinates. To do so we simply add a new component with value 1 to our vectors, so (x,y) becomes (x,y,1). We do that with all five of our available points.
Now we start with the actual math. First some definitions: The camera's point of view and respective coordinates shall be the camera space, coordinates in relation to a flat plane are in flat space. Let c₁ to c₄ be the corner points of the plane in relation to camera space as homogeneous vectors. Let p be the point that we have found in camera space and p' the point we want to find in flat space, both as homogeneous vectors again.
Mathematically speaking, we are looking for a Matrix C that will allow us to calculate p' by giving it p.
p' = C * p
Now we obviously need to find C. To find a projection matrix for two-dimensional space we need four points (how convenient). I will assume that c₁ will go to (0,0), c₂ to (0,1), c₃ to (1,0) and c₄ to (1,1). You need to solve two matrix equations, using e.g. Gaussian elimination or an LR decomposition algorithm. OpenCV should contain functions to do those tasks for you, but be aware of matrix conditioning and its impact on a usable solution.
Now back to the matrices. You need to calculate two basis change matrices as they are called. They are used to change the frame of reference of your coordinates (exactly what we want to do). The first matrix will transform our coordinates to three dimensional basis vectors and the second one will transform our 2D plane into three dimensional basis vectors.
For the camera-space matrix you'll need to calculate λ, μ and r in the following equation:

⌈ c₁.x  c₂.x  c₃.x ⌉   ⌈ λ ⌉   ⌈ c₄.x ⌉
| c₁.y  c₂.y  c₃.y | * | μ | = | c₄.y |
⌊ 1     1     1    ⌋   ⌊ r ⌋   ⌊ 1    ⌋

This will lead you to your first matrix, A:

    ⌈ λ*c₁.x  μ*c₂.x  r*c₃.x ⌉
A = | λ*c₁.y  μ*c₂.y  r*c₃.y |
    ⌊ λ       μ       r      ⌋
A now maps the basis coordinates (1,0,0), (0,1,0), (0,0,1) and (1,1,1) to the points c₁ to c₄ (up to their homogeneous scale). We do the same thing for our plane now. First solve

⌈ 0  0  1 ⌉   ⌈ λ ⌉   ⌈ 1 ⌉
| 0  1  0 | * | μ | = | 1 |
⌊ 1  1  1 ⌋   ⌊ r ⌋   ⌊ 1 ⌋

and get B:

    ⌈ 0  0  r ⌉
B = | 0  μ  0 |
    ⌊ λ  μ  r ⌋
A and B map from those three-dimensional basis vectors into the respective spaces. But that is not quite what we want. We want camera space -> basis -> flat space, so only matrix B maps in the direction we need. That is easily fixed by inverting A, which gives us the matrix C = B * A⁻¹ (watch the order of B and A⁻¹; it is not interchangeable). This leaves us with a formula to calculate p' from p:
p' = C * p
p' = B * A⁻¹ * p
Read it from right to left: take p, transform it from camera space into the basis vectors, and then transform those into flat space.
Remember that p' still has three components, so we need to dehomogenize it before we can use it. This yields
x' = p'.x / p'.z
y' = p'.y / p'.z
and voilà, we have successfully transformed a laser point from a camera view onto a flat piece of paper. Totally not overly complicated or anything...
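A small numpy sketch of exactly this construction, with c₁..c₄ mapped to (0,0), (0,1), (1,0), (1,1) as described above (the camera-space corner values and the laser point are made-up example numbers):

import numpy as np

def basis_change(c1, c2, c3, c4):
    # Matrix mapping the basis (1,0,0), (0,1,0), (0,0,1), (1,1,1) to c1..c4,
    # with the 2D points lifted to homogeneous coordinates here.
    M = np.array([[c1[0], c2[0], c3[0]],
                  [c1[1], c2[1], c3[1]],
                  [1.0,   1.0,   1.0 ]])
    lam_mu_r = np.linalg.solve(M, np.array([c4[0], c4[1], 1.0]))
    return M * lam_mu_r                  # scale the columns by lambda, mu, r

cam_corners = [(102, 98), (511, 83), (120, 380), (525, 395)]   # c1..c4 in camera space
flat_corners = [(0, 0), (0, 1), (1, 0), (1, 1)]                # where they should land

A = basis_change(*cam_corners)
B = basis_change(*flat_corners)
C = B @ np.linalg.inv(A)

p = np.array([300.0, 200.0, 1.0])        # detected laser point, homogeneous
p_prime = C @ p
x, y = p_prime[:2] / p_prime[2]          # dehomogenize
print(x, y)                              # position on the unit square of the flat plane

Multiplying the unit-square result by the real width and height of the plane (or of your presentation) then gives pixel coordinates.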
I developed the code below. MouseUp calls this function, and it is adjusted for the resolutions:
void Cal()
{
    // Webcam resolution: 1280x720
    MatOfPoint2f pts_src = new MatOfPoint2f(
        new Point(Double.Parse(imagePoints.get(0,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(0, 0).GetValue(1).ToString())),
        new Point(Double.Parse(imagePoints.get(1,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(1, 0).GetValue(1).ToString())),
        new Point(Double.Parse(imagePoints.get(2,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(2, 0).GetValue(1).ToString())),
        new Point(Double.Parse(imagePoints.get(3,0).GetValue(0).ToString()), Double.Parse(imagePoints.get(3, 0).GetValue(1).ToString()))
    );

    // Target resolution: 1920x1080
    MatOfPoint2f pts_dst = new MatOfPoint2f(
        new Point(0, 0),
        new Point(1920, 0),
        new Point(1920, 1080),
        new Point(0, 1080)
    );

    // 1. Calculate the homography
    Mat h = Calib3d.findHomography(pts_src, pts_dst);

    // Picked point (the WebcamDummy canvas shows the webcam at half scale: 1280*0.5f by 720*0.5f)
    MatOfPoint2f srcPointMat = new MatOfPoint2f(
        new Point(dummyCross.anchoredPosition.x * 2.0f, dummyCross.anchoredPosition.y * 2.0f)
    );
    MatOfPoint2f dstPointMat = new MatOfPoint2f();
    {
        // 2. Map srcPoint to dstPoint through the homography h
        Core.perspectiveTransform(srcPointMat, dstPointMat, h);
        Vector2 v = new Vector2((float)dstPointMat.get(0, 0)[0], (float)dstPointMat.get(0, 0)[1]);
        // (the ResultDummy canvas shows the result at half scale: 1920*0.5f by 1080*0.5f)
        dummyResult.anchoredPosition = v * 0.5f;
        Debug.Log(dummyCross.anchoredPosition.ToString() + "\n" + dummyResult.anchoredPosition.ToString());
    }
}
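For reference, the same two steps in plain OpenCV look like this (Python shown only for illustration; the source points are placeholder values, not taken from the project):

import numpy as np
import cv2

# Four corners of the projection as seen by the webcam (placeholder values)...
pts_src = np.array([[112, 77], [590, 64], [602, 421], [95, 433]], dtype=np.float32)
# ...and the corresponding corners in presentation space (1920x1080).
pts_dst = np.array([[0, 0], [1920, 0], [1920, 1080], [0, 1080]], dtype=np.float32)

# 1. Calculate the homography from camera space to presentation space.
h, _ = cv2.findHomography(pts_src, pts_dst)

# 2. Map a detected laser point through the homography.
laser = np.array([[[300.0, 200.0]]], dtype=np.float32)   # shape (1, 1, 2)
mapped = cv2.perspectiveTransform(laser, h)
print(mapped[0, 0])   # position on the 1920x1080 presentation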

Lua 2d rotation gives weird output for y

I need to be able to perform 2D rotations with Lua script. I think, or at least thought, I knew how to do 2D rotations, and for x it works, but for y I get a weird value.
Code:
rad = math.rad;
cos = math.cos;
sin = math.sin;
w = 90;
vec = {0, 1};
new_vec = {0, 0};
new_vec[1] = vec[1] * cos(rad(w)) - vec[2] * sin(rad(w));
new_vec[2] = vec[1] * sin(rad(w)) + vec[2] * cos(rad(w));
print("original vector_xy: ", "x= ", vec[1], "y= ", vec[2]);
print("new vector_xy: ", "x= ", new_vec[1], "y= ", new_vec[2]);
output:
original vector_xy: x= 0 y= 1
new vector_xy: x= -1 y= 6.1232339957368e-017
When I test the calculations on a calculator I get the correct answer. Must be something codewise I'm doing wrong.
Welcome to the world of floating point.
When you convert 90 degrees to radians, you don't get π/2 exactly, but something quite close to it. The cosine of this angle is not exactly 0, but the very small number you see.
This problem is not specific to Lua.
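The same thing happens in any language. If exact zeros matter to you, a common fix is to snap values below a small tolerance to zero; a quick illustration (Python here, purely for demonstration, but the idea carries straight over to Lua):

import math

angle = math.radians(90)      # math.pi is only the closest double to pi, so this is not exactly pi/2
print(math.cos(angle))        # ~6.12e-17, the same tiny value as in the Lua output

def snap(v, eps=1e-12):
    # Treat anything closer to zero than eps as exactly zero.
    return 0.0 if abs(v) < eps else v

print(snap(math.cos(angle)))  # 0.0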
