Image pixel coordinates to world coordinate transformations

I'm asking this question from the perspective of a non-mathematician, so please dumb down answers as much as possible.
I'm using a microscope which has a camera and also a confocal scanning mode. The camera is rotated slightly counterclockwise (by 0.53 degrees) relative to the physical stage orientation.
Furthermore, the camera has a slight lateral translation relative to the center of the stage. In other words, the center of the camera's field of view (FOV) is offset from the center of the stage.
Specifically, my camera image has pixel dimensions of 2560 × 2160, so the center of the camera FOV is at pixel (1280, 1080).
However, the center of the stage actually falls at image pixel coordinates (1355, 980).
My goal is to map objects detected in the image to their physical stage coordinates. We can assume the stage starts at physical coordinates (0, 0) µm.
The camera image has a pixel size of 65 nm.
I'm not sure how to apply the transformations (I know how to apply a simple rotation matrix).
Could someone show me how to do this with a few example pixel coordinates in the camera image?
Schematic representation of the shifts (WF means widefield camera).
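
A minimal sketch of the mapping being asked about: translate so the stage center is the origin, rotate to undo the camera's tilt, then scale pixels to micrometers. It assumes the stage center projects to pixel (1355, 980), a pixel size of 0.065 µm, and that undoing the 0.53-degree counterclockwise camera rotation means rotating by -0.53 degrees; the sign may need flipping depending on your axis conventions, and the output is relative to the stage center:
#include <cmath>
#include <cstdio>

const double kPi = 3.14159265358979323846;

// pixel (px, py) -> stage coordinates in micrometers, relative to the
// stage center; hypothetical helper, not part of any microscope API
void pixelToStage(double px, double py, double& x_um, double& y_um) {
    const double cx = 1355.0, cy = 980.0;      // stage center, in pixels
    const double scale = 0.065;                // um per pixel (65 nm)
    const double theta = -0.53 * kPi / 180.0;  // undo the camera rotation

    // 1) translate so the stage center becomes the origin
    const double dx = px - cx;
    const double dy = py - cy;
    // 2) rotate into the stage's axes
    const double rx = std::cos(theta) * dx - std::sin(theta) * dy;
    const double ry = std::sin(theta) * dx + std::cos(theta) * dy;
    // 3) scale pixels to micrometers
    x_um = rx * scale;
    y_um = ry * scale;
}

int main() {
    // a few example pixel coordinates: stage center, FOV center, image corner
    const double examples[3][2] = {{1355, 980}, {1280, 1080}, {0, 0}};
    for (const auto& p : examples) {
        double x, y;
        pixelToStage(p[0], p[1], x, y);
        std::printf("pixel (%.0f, %.0f) -> stage (%.3f, %.3f) um\n",
                    p[0], p[1], x, y);
    }
}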

Related

How to estimate intrinsic properties of a camera from data?

I am attempting camera calibration from a single RGB image (a panorama), given a 3D point cloud.
The methods that I have considered all require an intrinsic properties matrix (which I have no access to).
The intrinsic properties matrix can be estimated using Bouguet's camera calibration Toolbox, but as I have said, I have only a single image and a single point cloud for that image.
So, knowing 2D image coordinates, extrinsic properties, and 3D world coordinates, how can the intrinsic properties be estimated?
It would seem that the initCameraMatrix2D function from OpenCV (https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html) works in the same way as Bouguet's camera calibration Toolbox and requires multiple images of the same object.
I am looking into the Direct Linear Transformation (DLT) and the Levenberg–Marquardt algorithm, with implementations at https://drive.google.com/file/d/1gDW9zRmd0jF_7tHPqM0RgChBWz-dwPe1,
but it would seem that both use the pinhole camera model and therefore find a linear transformation between 3D and 2D points.
I can't find my half-year-old source code, but from the top of my head:
cx, cy is the optical centre, which is width/2, height/2 in pixels.
fx = fy is the focal length in pixels (the distance from the camera to the image plane, or to the axis of rotation).
If you know that the distance from the camera to the imaged plane is, for example, 30 cm and it captures an image spanning 16 × 10 cm at 1920 × 1200 pixels, the pixel size is 100 mm / 1200 = 1/12 mm, the camera distance (fx, fy) would be 300 mm × 12 px/mm = 3600 px, and the image centre is cx = 1920/2 = 960, cy = 1200/2 = 600. I assume that the pixels are square and that the camera sensor is centered on the optical axis.
You can get the focal length from the image size in pixels and a measured angle of view.
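
The arithmetic above, together with the standard pinhole angle-of-view relation f = (width/2) / tan(FOV/2) (my addition, not stated in the answer), as a small sketch:
#include <cmath>
#include <cstdio>

const double kPi = 3.14159265358979323846;

int main() {
    // the worked example from the answer: camera 300 mm from the plane,
    // imaged area 160 x 100 mm, image 1920 x 1200 px
    const double distance_mm = 300.0;
    const double scene_height_mm = 100.0;
    const int width_px = 1920, height_px = 1200;

    const double px_per_mm = height_px / scene_height_mm;  // 12 px/mm
    const double f_px = distance_mm * px_per_mm;            // 3600 px
    std::printf("fx = fy = %.0f px, cx = %.0f, cy = %.0f\n",
                f_px, width_px / 2.0, height_px / 2.0);

    // focal length from a measured angle of view; the 60-degree value
    // is just a made-up example
    const double fov_deg = 60.0;
    const double f_from_fov = (width_px / 2.0) / std::tan(fov_deg * kPi / 360.0);
    std::printf("f from a %.0f-degree horizontal FOV = %.0f px\n",
                fov_deg, f_from_fov);
}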

OpenCV: Camera motion detection

I have a whiteboard with a black line. I want to find the angle made by the line. I could do this part using background subtraction and Hough line transforms. Even when I rotate the board, the detected angle is correct. The problem I face is that if the camera is rotated, the value obtained varies. I want to obtain the original angle even when the camera is rotated. What should my approach be, when the camera is rotated, to obtain the original angle?

OpenCV, dlib landmarks rotation

I am new to OpenCV and dlib, and I am not sure if my design is correct. I want to write a C++ face detector for an Android phone which should detect faces with different phone orientations and rotation angles. Let's say the phone orientation can be portrait or landscape. I am using OpenCV to rotate/edit the image and dlib to detect faces. The dlib shape predictor is initialized with shape_predictor_68_face_landmarks.dat, and it can detect a face only in the correct phone orientation (meaning if I rotate the phone by 90 degrees, it cannot detect the face).
To make face detection possible, I read the axes from the accelerometer and rotate the source image to the correct orientation before sending it to the dlib face detector, and then it detects fine. But the output coordinates in the dlib::full_object_detection shape of course match the rotated picture, not the original. So it means I have to convert (rotate) the landmarks back to the original image.
Is there any existing API in dlib or OpenCV that makes it possible to rotate landmarks (dlib::full_object_detection) by a specified angle? It would be good if you could provide an example.
For iPhone apps, the EXIF data in images captured by iPhone cameras can be used to rotate images first. But I can't guarantee this for Android phones.
In most practical situations, it is easier to rotate the image and re-run face detection when face detection on the original image does not return any results (or returns strange results like very small faces). I have seen this done in several Android apps, and have used it myself on a couple of projects.
As I understand it, you want to rotate the detected landmarks back into the coordinate system of the original image. If so, you can use getRotationMatrix2D and transform to rotate the list of points.
For example:
Your image was rotated 90 degrees to the right around the center point (the middle of the image), so now you need to rotate the landmark points back by -90 degrees around the center point. The code is:
// includes needed for this snippet
#include <opencv2/imgproc.hpp>
#include <vector>

// the center point (the middle of the image)
cv::Point2f center(width / 2.0f, height / 2.0f);
// the angle to rotate back: getRotationMatrix2D expects degrees, not radians
// (positive = counterclockwise); in your case it is -90, but flip the sign
// if the result comes out rotated the wrong way
double theta_deg = -90.0;
// get the 2x3 affine matrix for the rotation
cv::Mat rotateMatrix = cv::getRotationMatrix2D(center, theta_deg, 1.0);
// the vectors holding the landmark points before and after rotation
std::vector<cv::Point2f> inputLandmark;
std::vector<cv::Point2f> outputLandmark;
// apply the same rotation matrix to every point with cv::transform
cv::transform(inputLandmark, outputLandmark, rotateMatrix);

Calculating position of object so it matches screen pixels

I would like to move a 3D plane in a 3D space, and have the movement match
the screen's pixels so I can snap the plane to the edges of the screen.
I have played around with the focal length, camera position and camera scale,
and I have managed to get a plane to match the screen pixels in terms of size;
however, when moving the plane, things are not correct anymore.
So basically my current status is that I feed the plane size with values
assuming that I am working with standard 2D graphics.
So if I set the plane size to 128x128, it is more or less viewed as a 2D square with that
exact size.
I am not using, and will not use, an orthographic projection; I am using a perspective projection because my application needs some perspective to it.
How can this be calculated?
Does anyone have any links to resources that I can read?
You need to grab the transformation matrices you use in the vertex shader and apply them to a point (or some points) that represents the plane.
That will result in a set of points in the range (-1, -1) to (1, 1) (after dividing by w), which you will need to map to the viewport.
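
A rough sketch of that pipeline, with a hand-rolled column-major 4x4 multiply standing in for whatever math library your renderer actually uses (the matrix values would come from your own model-view-projection setup):
#include <cstdio>

struct Vec4 { float x, y, z, w; };

// multiply a point by a column-major 4x4 matrix (OpenGL-style layout)
Vec4 mulMvp(const float mvp[16], Vec4 p) {
    return {
        mvp[0]*p.x + mvp[4]*p.y + mvp[8]*p.z  + mvp[12]*p.w,
        mvp[1]*p.x + mvp[5]*p.y + mvp[9]*p.z  + mvp[13]*p.w,
        mvp[2]*p.x + mvp[6]*p.y + mvp[10]*p.z + mvp[14]*p.w,
        mvp[3]*p.x + mvp[7]*p.y + mvp[11]*p.z + mvp[15]*p.w,
    };
}

// world-space point -> pixel coordinates on a viewport of the given size
void worldToPixel(const float mvp[16], Vec4 world, int viewportW, int viewportH) {
    Vec4 clip = mulMvp(mvp, world);
    // perspective divide: clip space -> normalized device coords in [-1, 1]
    float ndcX = clip.x / clip.w;
    float ndcY = clip.y / clip.w;
    // viewport transform: NDC -> pixels (y flipped, since screen y points down)
    float px = (ndcX + 1.0f) * 0.5f * viewportW;
    float py = (1.0f - ndcY) * 0.5f * viewportH;
    std::printf("pixel: (%.1f, %.1f)\n", px, py);
}

int main() {
    // identity MVP just to make the sketch runnable
    const float identity[16] = {1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1};
    worldToPixel(identity, {0.5f, 0.5f, 0.0f, 1.0f}, 1920, 1080);
}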

How do you counter a rotated camera?

We are currently using OpenCV to track a planar rectangular target. While facing it directly straight on (no pitch), this works perfectly using findContours with solvePnP and returns a very accurate location of the target.
The problem is that we obviously get different results once we increase the pitch. We know the pitch of the camera at all times.
How would I "cancel out" the pitch of the camera and obtain coordinates as if the camera were facing straight ahead?
In the general case you need a perspective (projective) transform to map the quadrilateral seen by the camera back to the original rectangle; an affine transform suffices only when the quadrilateral is a parallelogram. In your case the quadrilateral seen by the camera may be a good approximation of a parallelogram, since only one angle is changing, but in real-world applications you should generally assume that the camera can have non-zero values for each of the three rotations (pitch, yaw, and roll).
http://opencv.itseez.com/doc/tutorials/imgproc/imgtrans/warp_affine/warp_affine.html
The transform allows you to calculate the matching coordinates (x,y) within the rectangle's plane given coordinates (x', y') in the image of the rectangle.
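
A sketch of that mapping with OpenCV's getPerspectiveTransform and perspectiveTransform; with four corner correspondences this gives the full projective transform, which reduces to the affine case when the quadrilateral is a parallelogram. The corner coordinates below are made up for illustration:
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

int main() {
    // corners of the target as seen in the pitched image (x', y');
    // these values are invented for the example
    std::vector<cv::Point2f> seen = {
        {102, 88}, {530, 95}, {515, 402}, {110, 390}
    };
    // corners of the rectangle as it would appear head-on (x, y)
    std::vector<cv::Point2f> headOn = {
        {0, 0}, {400, 0}, {400, 300}, {0, 300}
    };
    // 3x3 homography from image coordinates to the rectangle's plane
    cv::Mat H = cv::getPerspectiveTransform(seen, headOn);

    // map any other detected image point into the rectangle's plane
    std::vector<cv::Point2f> imagePts = { {300, 250} };
    std::vector<cv::Point2f> planePts;
    cv::perspectiveTransform(imagePts, planePts, H);
    return 0;
}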