How to find the 3D projection matrix in SURF detection - OpenCV

I am new to OpenCV. I have got the SURF detection sample working. Now I want to place a 3D model on the detected image.
How can I find the 3D projection matrix?

I guess you are talking about Augmented Reality, since you say you want to place a 3D model on the detected image (in the camera frame?). The key to the problem is always to detect at least 4 points that match 4 "keypoints" in your marker. Then, by solving a few equations, you get the homography, which allows you to project any point.
In OpenCV there is a function that performs this task: cvFindHomography (cv::findHomography in the C++ API).
You just need the pairs of matches and a method (e.g. RANSAC), and you will get the homography.
Then you can project the points as explained here:
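A minimal sketch of that pipeline with the C++ API, assuming you already have matched point pairs from your SURF/descriptor matching step; all coordinate values below are placeholders:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>   // cv::findHomography
#include <vector>
#include <cstdio>

int main()
{
    // Matched keypoint locations: markerPts[i] in the marker image
    // corresponds to scenePts[i] in the camera frame (placeholder values,
    // normally filled from your SURF matches).
    std::vector<cv::Point2f> markerPts = { {10,10}, {200,12}, {198,150}, {12,148} };
    std::vector<cv::Point2f> scenePts  = { {320,240}, {480,250}, {470,360}, {315,355} };

    // Robustly estimate the homography with RANSAC (needs at least 4 pairs).
    cv::Mat H = cv::findHomography(markerPts, scenePts, cv::RANSAC, 3.0);

    // Project any marker point into the camera image, e.g. the marker corners.
    std::vector<cv::Point2f> markerCorners = { {0,0}, {210,0}, {210,160}, {0,160} };
    std::vector<cv::Point2f> projected;
    cv::perspectiveTransform(markerCorners, projected, H);

    for (const auto& p : projected)
        printf("corner -> (%.1f, %.1f)\n", p.x, p.y);
    return 0;
}
```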

Related

Detect nose coordinates on a face profile view

For tasks like this, I tend to use Mediapipe or Dlib to detect the facial landmarks and get the specific coordinates I'm interested in working with.
But for a human face taken from a profile view, Dlib can't detect anything, and Mediapipe shows me a standard 3D face mesh superimposed on top of the 2D image, which gives false coordinates.
I was wondering if anyone with computer vision (image processing) knowledge can guide me on how to detect the coordinates of points A & B from this image.
PS: The color of the background changes, and the face location is not standard.
Thanks in advance.
Your question seems a little unclear. If you just want (x, y) screen coordinates, you can use this answer to convert the (x, y, z) that Mediapipe gives you to just (x, y). If that doesn't work for you, I would recommend this repo or this one, both of which only work with 68 facial landmarks, but that should be sufficient for your use case.
If all of this fails, I would recommend retraining HRNet on a dataset with profile views. I believe either the 300-W or the 300-VW dataset provides some data with heads at extreme angles.
If you wish to get the 3D coordinates in camera coordinates (X, Y, Z), you're going to have to use solvePnP. This requires calibration info and model mesh points for whatever facial landmark detector you use. You can find some examples of this here.
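A rough sketch of the solvePnP route: the 3D model points below are hypothetical landmark positions in a face model frame, the 2D points are placeholder detector output, and the camera matrix is a crude guess from the image size, so treat it only as an outline:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>   // cv::solvePnP
#include <vector>
#include <cstdio>

int main()
{
    // Hypothetical 3D landmark positions (mm) in a face model coordinate frame.
    std::vector<cv::Point3f> modelPts = {
        {0.f, 0.f, 0.f},        // nose tip
        {0.f, -63.f, -12.f},    // chin
        {-43.f, 32.f, -26.f},   // left eye outer corner
        {43.f, 32.f, -26.f},    // right eye outer corner
        {-28.f, -29.f, -24.f},  // left mouth corner
        {28.f, -29.f, -24.f}    // right mouth corner
    };

    // Corresponding 2D detections from your landmark detector (placeholders).
    std::vector<cv::Point2f> imagePts = {
        {359, 391}, {399, 561}, {337, 297}, {513, 301}, {345, 465}, {453, 469}
    };

    // Rough intrinsics if you have no calibration: focal length ~ image width,
    // principal point at the image centre, no lens distortion.
    double f = 640.0, cx = 320.0, cy = 240.0;
    cv::Mat K = (cv::Mat_<double>(3, 3) << f, 0, cx, 0, f, cy, 0, 0, 1);
    cv::Mat dist = cv::Mat::zeros(4, 1, CV_64F);

    // Solve for the pose of the face model in camera coordinates; transforming
    // modelPts by (rvec, tvec) then gives each landmark's (X, Y, Z) in the camera frame.
    cv::Mat rvec, tvec;
    if (cv::solvePnP(modelPts, imagePts, K, dist, rvec, tvec))
        printf("tvec (camera coords): %.1f %.1f %.1f\n",
               tvec.at<double>(0), tvec.at<double>(1), tvec.at<double>(2));
    return 0;
}
```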

Project a 2D point from one camera view onto the corresponding 2D point in another camera view of the same scene

I'm using OpenCV in C++ in a multi-view scene with two cameras. I have the intrinsic and extrinsic parameters for both cameras.
I would like to map an (X, Y) point in view 1 to the same point in the second view. I'm slightly unsure how I should use the intrinsic and extrinsic matrices to convert the point to the 3D world and finally end up with the new 2D point in view 2.
It is (normally) not possible to take a 2D coordinate in one image and map it into another 2D coordinate without some additional information.
The main problem is that a single point in the left image will map to a line in the right image (an epipolar line). There are an infinite number of possible corresponding locations because depth is a free parameter. Secondly it's entirely possible that the point doesn't exist in the right image i.e. it's occluded. Finally it may be difficult to determine exactly which point is the right correspondence, e.g. if there is no texture in the scene or if it contains lots of repeating features.
The fundamental matrix (which you get out of cv::stereoCalibrate anyway) gives you a constraint between points in the two cameras, x'ᵀFx = 0, but for a given x' there will be a whole family of x's that satisfy the equation.
Some possible solutions are as follows:
1. You know the 3D location of a 2D point in one image. Provided that 3D point is in a common coordinate system, you just use cv::projectPoints with the calibration parameters of the other camera you want to project into (a minimal sketch follows below).
2. You do some sparse feature detection and matching using something like SIFT or ORB, then calculate a homography to map the points from one image to the other. This assumes things in the scene are roughly planar; if you Google "panorama homography", there are plenty of lecture slides detailing this.
3. You calibrate your cameras, perform an epipolar rectification (cv::stereoRectify, cv::initUndistortRectifyMap, cv::remap) and then run the images through a stereo matcher. The output is a disparity map, which gives you exactly what you want: a per-pixel mapping from one camera to the other, i.e. left[y, x] corresponds to right[y, x - disparity_map[y, x]].
(1) is by far the easiest, but it's unlikely you have that information already. (2) is often doable and might be suitable, but as another commenter pointed out it will be poor where the planarity assumption fails. (3) is the general (ideal) solution, but has its own drawbacks and relies on the images being amenable to dense matching.
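As an illustration of option (1), a minimal sketch assuming you already know the point's 3D location in a common world frame and have camera 2's intrinsics and extrinsics; all numeric values are placeholders:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>   // cv::projectPoints, cv::Rodrigues
#include <vector>
#include <cstdio>

int main()
{
    // Known 3D point in the common world coordinate frame (placeholder).
    std::vector<cv::Point3f> worldPts = { {0.10f, -0.05f, 1.50f} };

    // Extrinsics of camera 2: rotation R2 and translation t2 map world
    // coordinates into camera-2 coordinates (placeholders).
    cv::Mat R2 = cv::Mat::eye(3, 3, CV_64F);
    cv::Mat t2 = (cv::Mat_<double>(3, 1) << -0.12, 0.0, 0.0);
    cv::Mat rvec2;
    cv::Rodrigues(R2, rvec2);   // projectPoints expects a rotation vector

    // Intrinsics and distortion coefficients of camera 2 (placeholders).
    cv::Mat K2 = (cv::Mat_<double>(3, 3) << 800, 0, 320, 0, 800, 240, 0, 0, 1);
    cv::Mat dist2 = cv::Mat::zeros(5, 1, CV_64F);

    // Project the world point into view 2.
    std::vector<cv::Point2f> imagePts2;
    cv::projectPoints(worldPts, rvec2, t2, K2, dist2, imagePts2);

    printf("point in view 2: (%.1f, %.1f)\n", imagePts2[0].x, imagePts2[0].y);
    return 0;
}
```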

Selecting 3D world points to perform a camera calibration

I have 2 images of the same object from different views. I want to perform a camera calibration, but from what I have read so far I need 3D world points to get the camera matrix.
I am stuck at this step; can anyone explain it to me?
Popular camera calibration methods use 2D-3D point correspondences to determine the projective properties (intrinsic parameters) and the pose of a camera (extrinsic parameters). The simplest approach is the Direct Linear Transformation (DLT).
You might have seen that planar chessboards are often used for camera calibration. The 3D coordinates of the chessboard corners can be chosen by the user; many people place the chessboard in the x-y plane, i.e. [x, y, 0]'. However, the 3D coordinates need to be consistent.
Coming back to your object: span your own 3D coordinate system over the object and find at least six spots whose 3D positions you can determine easily. Once you have those, you have to find their corresponding 2D positions (pixels) in your two images.
There are complete examples in OpenCV; you may get a better picture by reading the code.
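For the common chessboard case described above, a condensed sketch: the board size, square size and image file names are placeholders you would replace with your own views, and cv::calibrateCamera returns the intrinsic matrix plus one extrinsic pose per view:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/calib3d.hpp>
#include <vector>
#include <cstdio>

int main()
{
    cv::Size boardSize(9, 6);    // inner corner count of the chessboard (placeholder)
    float squareSize = 0.025f;   // square edge length in metres (placeholder)

    // 3D corner coordinates chosen by us: the board lies in the z = 0 plane.
    std::vector<cv::Point3f> boardPts;
    for (int y = 0; y < boardSize.height; ++y)
        for (int x = 0; x < boardSize.width; ++x)
            boardPts.emplace_back(x * squareSize, y * squareSize, 0.f);

    std::vector<std::vector<cv::Point3f>> objectPoints;
    std::vector<std::vector<cv::Point2f>> imagePoints;
    cv::Size imageSize;

    // Hypothetical image file names; in practice you want many views.
    for (const auto& name : {"view1.png", "view2.png"}) {
        cv::Mat img = cv::imread(name, cv::IMREAD_GRAYSCALE);
        if (img.empty()) continue;
        imageSize = img.size();

        std::vector<cv::Point2f> corners;
        if (cv::findChessboardCorners(img, boardSize, corners)) {
            cv::cornerSubPix(img, corners, cv::Size(11, 11), cv::Size(-1, -1),
                cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 30, 0.01));
            imagePoints.push_back(corners);
            objectPoints.push_back(boardPts);
        }
    }

    cv::Mat K, dist;
    std::vector<cv::Mat> rvecs, tvecs;   // one extrinsic pose per view
    double rms = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                     K, dist, rvecs, tvecs);
    printf("reprojection RMS error: %.3f px\n", rms);
    return 0;
}
```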

Correspondence between a set of 3D model points and their image projections

I have a set of 3D points and some images with the projections of these points. I also have the focal length of the camera and the principal point of the images with the projections (from a previously done camera calibration).
Is there any way to, given these parameters, find the automatic correspondence between the 3D points and the image projections? I've looked through some OpenCV documentation but I haven't found anything suitable so far. I'm looking for a method that does the automatic labelling of the projections and thus the correspondence between them and the 3D points.
The question is not very clear, but I think you mean to say that you have the intrinsic calibration of the camera, but not its location and attitude with respect to the scene (the "extrinsic" part of the calibration).
This problem does not have a unique solution for a general 3d point cloud if all you have is one image: just notice that the image does not change if you move the 3d points anywhere along the rays projecting them into the camera.
If you have one or more images, you know everything about the 3D cloud of points (e.g. the points belong to an object of known shape and size, and are at known locations upon it), and you have matched them to their images, then it is a standard "camera resectioning" problem: you just solve for the camera extrinsic parameters that make the 3D points project onto their images.
If you have multiple images and you know that the scene is static while the camera is moving, and you can match "enough" 3d points to their images in each camera position, you can solve for the camera poses up to scale. You may want to start from David Nister's and/or Henrik Stewenius's papers on solvers for calibrated cameras, and then look into "bundle adjustment".
If you really want to learn about this (vast) subject, Zisserman and Hartley's book is as good as any. For code, look into libmv, vxl, and the ceres bundle adjuster.

Find the position of a pattern/marker inside a photograph

I need to find a marker like the ones used in Augmented Reality.
Like this:
I have a solid background in algebra and calculus, but no experience whatsoever in image processing. My thing is PHP, SQL and the like.
I just want this to work; I've read the theory behind it, but it's extremely hard for me to turn into code.
The main idea is to do this as a batch process, so no interactivity is needed. What do you suggest?
Input : The sample image.
Output: Coordinates and normal vector in 3D of the marker.
The use for this will be linking images that have the same marker in order to spatialize them, a primitive version of photosync we could say. Just a carousel of pinned images, with the marker acting as the pin.
You can always look at open source libraries such as ARToolkit and see how they work, but generally, in order to get the 3D coordinates of the marker, you would need to:
1. Do the camera calibration.
2. Find the marker in the image, using local features for example.
3. Using the calibrated camera parameters and the 2D coordinates of the marker, approximate its 3D coordinates.
I've never implemented something similar myself, but I think this is the general approach you should apply to your method.
Your problem can be solved by perspective-n-point (PnP) camera pose estimation. When you can reasonably assume that all correspondences are correct, a linear algorithm should do.
Since the marker is planar, you can also recover the displacement from the homography between the model plane and the image plane (link). As usual, the best results are obtained with iterative algorithms (link).
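A rough sketch of the planar-homography route: estimate H between the marker's model plane and the image, then decompose it using the camera intrinsics. The point values and the camera matrix below are placeholders, and cv::decomposeHomographyMat returns up to four candidate poses that you still have to disambiguate (e.g. by requiring the marker to lie in front of the camera):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/calib3d.hpp>   // cv::findHomography, cv::decomposeHomographyMat
#include <vector>
#include <cstdio>

int main()
{
    // Marker corner coordinates in the marker's own plane (metres, z = 0 implied)
    // and their detected pixel positions in the photo (placeholders).
    std::vector<cv::Point2f> modelPts = { {0,0}, {0.1f,0}, {0.1f,0.1f}, {0,0.1f} };
    std::vector<cv::Point2f> imagePts = { {412,305}, {520,310}, {515,418}, {405,410} };

    cv::Mat H = cv::findHomography(modelPts, imagePts);

    // Camera intrinsics from a previous calibration (placeholder values).
    cv::Mat K = (cv::Mat_<double>(3, 3) << 900, 0, 320, 0, 900, 240, 0, 0, 1);

    // Decompose the homography into candidate rotations, translations and
    // plane normals (the normal gives the marker's orientation in camera coords).
    std::vector<cv::Mat> Rs, ts, normals;
    int n = cv::decomposeHomographyMat(H, K, Rs, ts, normals);
    printf("%d candidate poses\n", n);
    for (int i = 0; i < n; ++i)
        printf("normal %d: %.2f %.2f %.2f\n", i,
               normals[i].at<double>(0), normals[i].at<double>(1), normals[i].at<double>(2));
    return 0;
}
```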
