For an augmented reality app, we use a library to compute a homography from two sets of points via RANSAC and DLT.
Cool.
But we know the rotation (from the mobile sensors), so we think the estimation could be made both quicker and more accurate.
What would be your advice?
Thanks for your insights,
Michaël
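One way to picture the idea in the question, as a minimal Python sketch (assuming a calibrated camera with intrinsics K and a frame-to-frame rotation R_sensor taken from the IMU; all numeric values below are placeholders): for a rotation-only view change the induced homography is H = K·R·K⁻¹, so the sensor rotation can be applied up front and RANSAC/DLT left to estimate only the residual.

    import numpy as np
    import cv2

    # Placeholder intrinsics and sensor rotation (both are assumptions for this sketch).
    K = np.array([[1000.0, 0.0, 640.0],
                  [0.0, 1000.0, 360.0],
                  [0.0, 0.0, 1.0]])
    R_sensor = np.eye(3)  # rotation from frame 1 to frame 2, e.g. from the IMU

    # For a rotation-only view change the induced homography is H = K * R * K^-1.
    H_rotation = K @ R_sensor @ np.linalg.inv(K)

    # Apply the sensor rotation first, then let RANSAC + DLT estimate only the
    # residual homography between the pre-rotated points and the second image.
    pts1 = np.random.rand(30, 1, 2).astype(np.float32) * 500  # stand-ins for matches
    pts2 = np.random.rand(30, 1, 2).astype(np.float32) * 500
    pts1_rot = cv2.perspectiveTransform(pts1, H_rotation)
    H_residual, inliers = cv2.findHomography(pts1_rot, pts2, cv2.RANSAC, 3.0)
    if H_residual is not None:
        H_full = H_residual @ H_rotation  # full image-1 -> image-2 homography

Because the residual is close to identity, RANSAC typically needs fewer iterations and can use a tighter inlier threshold than when estimating the full homography blindly.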
Right now I'm exploring the features of the iOS depth camera, and I want to obtain the distance in real-world units between two points (for example, between two eyes).
I have successfully connected the iOS depth camera functionality and I have AVDepthData in my hands, but I'm not quite sure how to get a real-world distance between two specific points.
I believe I could calculate it if I had the depth and the viewing angle, but I don't see the latter exposed as a parameter. I also know this task could be handled with ARKit, but I'm really curious how to implement it myself. ARKit uses the depth camera as well, so there must be an algorithm where a depth map is all I need to calculate the real distance.
Could you please give me some advice on how to tackle this task? Thanks in advance!
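The calculation hinted at in the question boils down to back-projecting each pixel with its depth through the camera intrinsics and taking the Euclidean distance. Below is a minimal, language-agnostic NumPy sketch of that math; the intrinsics, pixel coordinates and depths are placeholders (on iOS the intrinsics would come from the calibration data delivered with the depth frame, e.g. AVCameraCalibrationData, scaled to the depth-map resolution).

    import numpy as np

    def deproject(u, v, depth, fx, fy, cx, cy):
        """Back-project pixel (u, v) with depth (metres) to a 3D camera-space point."""
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        return np.array([x, y, depth])

    # Placeholder intrinsics (focal lengths and principal point, in pixels).
    fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0

    # Placeholder pixel coordinates and depths of the two points (e.g. the two eyes).
    p1 = deproject(300, 220, 0.45, fx, fy, cx, cy)
    p2 = deproject(360, 222, 0.46, fx, fy, cx, cy)

    distance_m = np.linalg.norm(p1 - p2)  # real-world distance in metres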
I am trying to write a program that stitches images using the SURF detector, and I would like to know the difference between the two homography estimators.
I understand that findHomography uses RANSAC; does HomographyBasedEstimator use RANSAC too?
If it doesn't, could someone point me to the paper that HomographyBasedEstimator is based on?
Thanks in advance
The main difference between the two is that findHomography, as the name says, is used to find a homography, while HomographyBasedEstimator uses already existing homographies to calculate the rotation of the cameras.
In other words, HomographyBasedEstimator doesn't find the homographies; it uses them to compute the camera motion and the other camera parameters, such as focal lengths and optical centers.
I hope this can help you.
Actually, findHomography is called inside BestOf2NearestMatcher.
The documentation doesn't seem to say, but it suggests that HomographyBasedEstimator estimates a rotation matrix, which is a special case of the homography matrix and requires the focal length. If you're doing stitching, HomographyBasedEstimator is probably the way to go. (My guess is that it runs RANSAC internally.)
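To make the findHomography side concrete, here is a minimal Python sketch of the RANSAC + DLT call; the point arrays are synthetic stand-ins for the coordinates of matched SURF keypoints, which is an assumption about how the inputs are produced.

    import numpy as np
    import cv2

    # "src_pts" / "dst_pts" stand in for matched keypoint coordinates
    # (shape Nx1x2, float32); synthetic values keep the snippet self-contained.
    src_pts = np.random.rand(50, 1, 2).astype(np.float32) * 640
    dst_pts = src_pts + 10.0  # pretend the second image is a shifted view

    # RANSAC draws minimal 4-point samples, solves each by DLT, and keeps the
    # homography with the most inliers under the 5-pixel reprojection threshold.
    H, inlier_mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    print(H)                                  # 3x3 homography
    print(int(inlier_mask.sum()), "inliers")

HomographyBasedEstimator then consumes such pairwise homographies (as produced inside BestOf2NearestMatcher) and recovers per-camera rotations and focal lengths from them, consistent with the answers above.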
I want to get a 3D model of some real-world object.
I have two webcams; using OpenCV and StereoBM (SBM) for stereo correspondence I get a point cloud of the scene, and by filtering on z I can get a point cloud of the object only.
I know that ICP is good for this purpose, but it needs the point clouds to be roughly aligned initially, so it is combined with SAC (sample consensus) alignment to achieve better results.
But my SAC fitness score is too big, something like 70 or 40, and ICP doesn't give good results either.
My questions are:
Is it OK for ICP if I just rotate the object in front of the cameras to obtain the point clouds? What angle of rotation between views is needed to achieve good results? Or is there a better way of taking pictures of the object to get a 3D model? Is it OK if my point clouds have some holes? What is the maximal acceptable SAC fitness score for a good ICP, and what is the maximal fitness score of a good ICP?
Example of my point cloud files:
https://drive.google.com/file/d/0B1VdSoFbwNShcmo4ZUhPWjZHWG8/view?usp=sharing
My advice, from experience: you already have RGB (or greyscale) images, so use them. ICP is good for refining point-cloud alignment, but it has trouble aligning clouds from scratch.
First start with RGB odometry: use feature points to align the point clouds (which are rotated relative to each other), and learn how ICP works in the already mentioned Point Cloud Library. Let the RGB features give you a prediction, and then use ICP to refine it where possible; a sketch of that pre-alignment step follows below.
Once this works, think about a good fitness-score calculation. If that all works, use the trunk version of ICP and tune its parameters. After all of this is done you will have code that is not only fast, but also has a low chance of going wrong.
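As a rough sketch of that "RGB features first, ICP second" idea (my own illustration, not the poster's code): assume you have looked up, for each RGB feature match, the corresponding 3D points in the two clouds. cv2.estimateAffine3D can then RANSAC-fit an initial rigid-ish transform, which you pass to ICP (e.g. PCL's IterativeClosestPoint align() with an initial guess) so ICP only has to refine the alignment rather than discover it. The data below is synthetic.

    import numpy as np
    import cv2

    # "xyz1"/"xyz2" stand in for the 3D coordinates, in each cloud, of feature
    # matches found in the RGB images (e.g. ORB/SURF matches looked up in the
    # depth map). Synthetic data keeps the sketch runnable.
    rng = np.random.default_rng(0)
    xyz1 = rng.random((100, 3)).astype(np.float32)
    R_true, _ = cv2.Rodrigues(np.array([[0.0], [0.3], [0.0]]))
    t_true = np.array([0.1, 0.0, 0.05])
    xyz2 = (xyz1 @ R_true.T + t_true).astype(np.float32)

    # RANSAC-fit a 3D transform from the matches; this is the "prediction"
    # that the RGB features give you.
    retval, affine_3x4, inliers = cv2.estimateAffine3D(xyz1, xyz2, ransacThreshold=0.01)

    # Turn it into a 4x4 matrix and hand it to ICP as the initial alignment.
    T_init = np.vstack([affine_3x4, [0.0, 0.0, 0.0, 1.0]])
    print(T_init)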
The following post explains what can go wrong:
Using ICP, we refine this transformation using only geometric information. However, here ICP decreases the precision. What happens is that ICP tries to match as many corresponding points as it can. Here the background behind the screen has more points than the screen itself on the two scans. ICP will then align the clouds to maximize the correspondences on the background, and the screen ends up misaligned.
https://github.com/introlab/rtabmap/wiki/ICP
I am a beginner when it comes to computer vision, so I apologize in advance. Basically, the idea I am trying to code is this: given two cameras that simulate a multiple-baseline stereo system, I am trying to estimate the pose of one camera given the other.
Looking at the same scene, I would add some noise to the pose of the second camera, and given the clean image from camera 1 and the slightly distorted/skewed image from camera 2, I would like to estimate the pose of camera 2 from this data together with the known baseline between the cameras. I have been reading up on homography matrices and the related implementations in OpenCV, but I am just trying to get some suggestions about possible approaches. Most applications of the homography matrix that I have seen deal with stitching or overlaying images, but here I am looking for the six-degrees-of-freedom attitude of the camera.
It'd be great if someone could shed some light on these questions too: can such an approach be extended to more than two cameras? And is it also possible for both cameras to have some 'noise' in their pose and still recover the 6-DOF attitude at every instant?
Let's clear up your question first. I guess you are looking for the pose of one camera relative to the other camera's location. This is described by a homography only for pure camera rotations; for general motion that includes translation, it is described by rotation and translation matrices. If the fields of view of the cameras overlap, the task can be solved with structure from motion, which still estimates only 5 DOF, meaning the translation is recovered only up to scale. If there is a chessboard with known dimensions in the cameras' field of view, you can easily solve for all 6 DOF by running a PnP algorithm; of course, the cameras should be calibrated first. Finally, in 2008 Marc Pollefeys came up with an idea for estimating 6 DOF from two moving cameras with non-overlapping fields of view without using any chessboards. To give you more detail, please tell us a bit about the intended application.
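The chessboard + PnP route mentioned above can be sketched in Python with OpenCV as follows; the board size, square size, image path and camera parameters are placeholders, with the intrinsics assumed to come from a prior calibration.

    import numpy as np
    import cv2

    # Assumed inputs: a calibrated camera (camera_matrix, dist_coeffs) and an
    # image containing a chessboard of known geometry. Values are placeholders.
    camera_matrix = np.array([[800.0, 0.0, 320.0],
                              [0.0, 800.0, 240.0],
                              [0.0, 0.0, 1.0]])
    dist_coeffs = np.zeros(5)
    pattern_size = (9, 6)   # inner corners of the board
    square_size = 0.025     # 25 mm squares, in metres

    # 3D chessboard corner coordinates in the board's own frame (Z = 0 plane).
    object_points = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
    object_points[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
    object_points *= square_size

    image = cv2.imread("view_from_camera2.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
    if image is not None:
        found, corners = cv2.findChessboardCorners(image, pattern_size)
        if found:
            # solvePnP returns the full 6-DOF pose of the board in this camera's frame.
            ok, rvec, tvec = cv2.solvePnP(object_points, corners, camera_matrix, dist_coeffs)
            R, _ = cv2.Rodrigues(rvec)  # 3x3 rotation; tvec is the translation
            # Composing the board poses seen from both cameras gives the 6-DOF
            # camera-to-camera transform.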
I'm working on a project to recognize people by their skeleton using the Microsoft Kinect SDK.
The problem is that the skeleton size grows as the person moves towards the Kinect sensor. I need to build my system so that it is independent of the person's position, and I don't know how to solve this.
Some related works say "the skeleton should be normalized in the time domain", but I don't know what that means!
Any advice would be appreciated.
Thank you
I believe the normalization method depends on the reason you need it. Maybe you can provide more information on why you need a normalized skeleton.
Here are some methods that I can think of:
Compute a 3D axis-aligned bounding box (AABB) for the skeleton joints. Find the axis along which the AABB is largest and calculate a scale factor so that this size becomes 1.0. You can then apply the same scale factor to the other axes as well.
Maybe you want the skeleton joints to be independent of their position. In this case, you can pick a joint (e.g. the spine), treat it as the origin of your skeleton, and compute every other joint position as:
newjointPosition = oldjointPosition - spinePosition;
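A small sketch of both ideas combined, assuming the joints arrive as an N x 3 array per frame and that you know which row is the spine joint (the index and the array shape are my assumptions, not part of the Kinect SDK):

    import numpy as np

    def normalize_skeleton(joints, spine_index=0):
        """joints: (N, 3) array of joint positions from one frame.
        Returns positions relative to the spine joint, uniformly scaled so the
        largest extent of the skeleton's bounding box is 1.0."""
        # Method 2 above: make the spine the origin, so the result is
        # independent of where the person stands.
        centered = joints - joints[spine_index]

        # Method 1 above: axis-aligned bounding box of the joints; divide by
        # its largest extent so the skeleton size no longer depends on the
        # distance to the sensor.
        extents = centered.max(axis=0) - centered.min(axis=0)
        scale = extents.max()
        return centered / scale if scale > 0 else centered

    # Placeholder data: 20 joints in 3D, as delivered per frame by the Kinect SDK.
    frame_joints = np.random.rand(20, 3).astype(np.float32)
    normalized = normalize_skeleton(frame_joints, spine_index=0)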
If you need a more specific answer, please tell us more about your app and requirements.