I have a Time-of-Flight sensor that uses IR illumination.
I use PCL (Point Cloud Library) to get point cloud data from that sensor.
The result really confused me.
The object is nearly flat, like a plane, but the point cloud doesn't look like what I expected (or like any other point cloud image you can find by googling).
If you look at it from the side, there is a big curve.
The same curve also shows up when looking from the top.
I've searched this issue for a long time.
My best guess for the cause is lens distortion.
However, after I implemented the algorithm in OpenCV that claims to correct lens distortion, the result still looks the same.
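For reference, here is roughly what I tried for the distortion correction (a Python sketch; the intrinsics, distortion coefficients, and file names are placeholders, not the real values from my sensor):

    import cv2
    import numpy as np

    # Placeholder intrinsics of my ToF sensor -- example values only,
    # not the real calibration of the device.
    K = np.array([[360.0,   0.0, 160.0],
                  [  0.0, 360.0, 120.0],
                  [  0.0,   0.0,   1.0]])
    # Radial/tangential distortion coefficients (k1, k2, p1, p2, k3).
    dist = np.array([-0.25, 0.08, 0.0, 0.0, 0.0])

    # Placeholder depth frame from the sensor.
    depth = cv2.imread("tof_depth.png", cv2.IMREAD_UNCHANGED)

    # Undistort the depth image the same way one would undistort an RGB image.
    undistorted = cv2.undistort(depth, K, dist)
    cv2.imwrite("tof_depth_undistorted.png", undistorted)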
There are still other possible causes I can think of, like pixel offset, gradient offset, etc., but I don't know how to deal with them.
I really need help, since I have never studied optics or 3D reconstruction; I have only been reading up on the pinhole camera model for about a week (I understand intrinsic and extrinsic parameters).
I would really appreciate any clue about what to do next.
Related
I recently managed to get my augmented reality application up and running close to what is expected. However, I'm having an issue where, even though the values are correct, the augmentation is still off by some translation! It would be wonderful to get this solved as I'm so close to having this done.
The system utilizes an external tracking system (Polaris Spectra stereo optical tracker) with IR-reflective markers to establish global and reference frames. I have a LEGO structure with a marker attached which is the target of the augmentation, a 3D model of the LEGO structure created using CAD with the exact specs of its real-world counterpart, a tracked pointer tool, and a camera with a world reference marker attached to it. The virtual space was registered to the real world using a toolset in 3D Slicer, a medical imaging software which is the environment I'm developing in. Below are a couple of photos just to clarify exactly the system I'm dealing with (May or may not be relevant to the issue).
So a brief overview of exactly what each marker/component does (Markers are the black crosses with four silver balls):
The world marker (1st image on right) is the reference frame for all other markers' transformations. It is fixed to the LEGO model so that a single registration can be done for the LEGO's virtual equivalent.
The camera marker (1st image, attached to camera) tracks the camera. The camera is registered to this marker by an extrinsic calibration performed using cv::solvePnP() (roughly sketched after this list).
The checkerboard is used to acquire data for extrinsic calibration using a tracked pointer (unshown) and cv::findChessboardCorners().
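For context, the extrinsic calibration step looks roughly like the sketch below (written in Python/OpenCV for brevity rather than my actual code; the board size, square size, intrinsics, and image name are placeholder values, and in my setup the 3D points actually come from the tracked pointer rather than the ideal board model used here):

    import cv2
    import numpy as np

    # Placeholder intrinsics from a previous intrinsic calibration.
    K = np.array([[800.0,   0.0, 320.0],
                  [  0.0, 800.0, 240.0],
                  [  0.0,   0.0,   1.0]])
    dist = np.zeros(5)       # assume distortion is handled separately

    board_size = (9, 6)      # inner corners (placeholder)
    square_size = 0.025      # meters (placeholder)

    # 3D corner positions in the checkerboard frame (z = 0 plane).
    obj_pts = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    obj_pts[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2)
    obj_pts *= square_size

    img = cv2.imread("checkerboard_view.png")     # placeholder image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, board_size)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        # Pose of the checkerboard in the camera frame.
        ok, rvec, tvec = cv2.solvePnP(obj_pts, corners, K, dist)
        R, _ = cv2.Rodrigues(rvec)
        print("R =\n", R, "\nt =", tvec.ravel())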
Up until now I've been smashing my face against the mathematics behind the system until everything finally lined up. When I move my estimate of the camera origin to the reference origin, the translation vector between the two is about [0; 0; 0], so all of the registration appears to work correctly. However, when I run my application, I get the following results:
As you can see, there's a strange offset in the augmentation. I've tried removing distortion correction on the image (currently done with cv::undistort()), but it just makes the issue worse. The rotations are all correct and, as I said before, the translations all seem fine. I'm at a loss for what could be causing this. Of course, there's so much that can go wrong during implementation of the rendering pipeline, so I'm mostly posting this here under the hope that someone has experienced a similar issue. I already performed this project using a webcam-based tracking method and experienced no issues like this even though I used the same rendering process.
I've been purposefully a little ambiguous in this post to avoid bogging down readers with the minutiae of the situation, as there are so many different details I could include. If any more information is needed, I can provide it. Any advice or insight would be massively appreciated. Thanks!
Here are a few tests that you could do to validate that each module works well.
First verify your extrinsic and intrinsic calibrations:
Check that the position of the virtual scene-marker with respect to the virtual lego scene accurately corresponds to the position of the real scene-marker with respect to the real lego scene (e.g. the real scene-marker may have moved since you last measured its position).
Same for the camera-marker, which may have moved since you last calibrated its position with respect to the camera optical center.
Check that the calibration of the camera is still accurate. For such a camera, prefer a camera matrix of the form [fx,0,cx;0,fy,cy;0,0,1] (i.e. with a skew fixed to zero) and estimate the camera distortion coefficients (NB: OpenCV's undistort functions do not support camera matrices with non-zero skews; using such matrices may not raise any exception but will result in erroneous undistortions).
Check that the marker tracker does not need to be recalibrated.
Then verify the rendering pipeline, e.g. by checking that the scene-marker reprojects correctly into the camera image when moving the camera around.
If it does not reproject correctly, there is probably an error in the way you map the OpenCV camera matrix into the OpenGL projection matrix, or in the way you map the OpenCV camera pose into the OpenGL modelview matrix. Try to determine which one is wrong using toy examples with simple 3D points and simple projection and modelview matrices (a sketch of the projection-matrix mapping follows these two points).
If it reprojects correctly, then there probably is a calibration problem (see above).
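For the first case, a commonly used mapping from an OpenCV-style camera matrix to an OpenGL projection matrix looks roughly like the sketch below (Python; the intrinsics and image size are placeholders, and the sign conventions depend on how you build your modelview matrix, so treat it as a starting point rather than a reference implementation; the modelview side is handled analogously from the pose returned by your tracker):

    import numpy as np

    def opengl_projection_from_intrinsics(fx, fy, cx, cy, width, height,
                                          near=0.01, far=100.0):
        """Build a 4x4 OpenGL-style projection matrix from pinhole intrinsics.

        Assumes the OpenCV convention (x right, y down, z forward) and flips
        the y axis so the image is not rendered upside down.
        """
        proj = np.zeros((4, 4))
        proj[0, 0] = 2.0 * fx / width
        proj[1, 1] = -2.0 * fy / height          # flip y for OpenGL
        proj[0, 2] = 1.0 - 2.0 * cx / width
        proj[1, 2] = 2.0 * cy / height - 1.0
        proj[2, 2] = -(far + near) / (far - near)
        proj[2, 3] = -2.0 * far * near / (far - near)
        proj[3, 2] = -1.0
        return proj

    # Placeholder intrinsics and image size.
    P = opengl_projection_from_intrinsics(fx=800.0, fy=800.0,
                                          cx=320.0, cy=240.0,
                                          width=640, height=480)
    print(P)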
Beyond that, it is hard to guess what could be wrong without directly interacting with the system. If I were you and I still had no idea where the problem could be after doing the tests above, I would try to start back from scratch and validate each intermediate step using toy examples.
I'm trying to calibrate the integrated camera of my notebook.
I'm using a 9x6 chessboard with a length of 300 mm. It's printed on a Konica bizhub 452c and fixed to a drawing board.
Using the tutorial code, I'm getting strange undistorted pictures, which shows that the calibration is bad (example below).
http://answers.opencv.org/question/64905/bad-camera-calibration/
I have fed about 70 pictures into the algorithm (different positions, etc.), trying to get calibration points as close as possible to the picture edges.
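The part I'm running is roughly equivalent to this (rewritten in Python for brevity; the file pattern and square size are placeholders):

    import glob
    import cv2
    import numpy as np

    board_size = (9, 6)     # inner corners of the 9x6 board
    square_size = 33.0      # mm per square -- placeholder, not measured exactly

    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2)
    objp *= square_size

    obj_points, img_points = [], []
    image_size = None
    for path in glob.glob("calib_*.png"):        # placeholder file pattern
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        image_size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if not found:
            continue
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, image_size, None, None)
    print("RMS reprojection error:", rms)   # should be well below 1 pixel

    img = cv2.imread("calib_01.png")        # placeholder
    undistorted = cv2.undistort(img, K, dist)
    cv2.imwrite("undistorted.png", undistorted)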
I have tried for days to get an acceptable calibration, but I'm only able to minimize the hole effects at the sides.
Any Help would be appreciated.
If they are needed, I will provide the calibration pictures.
regards
Moglei
I was having the same problem. I calibrated over and over again, but couldn't get any results better than the image you linked to, and sometimes worse. I read this question and answer from the OpenCV Q&A site, and it helped me solve my problem. In the link, you will see that the person answering the question writes that the problem is related to deficiencies in a couple of OpenCV functions that only become apparent when dealing with cameras with strong radial distortion. For me, I was able to "solve" the problem simply by zooming in. The "fish bowl" effect of radial distortion is most pronounced near the edges of the field-of-view, so by zooming in, you are effectively "cropping" your image and thereby reducing the extreme radial distortion. This may not be practical for your application, if you require the widest angle possible, or if your camera doesn't have zoom, but it worked for me!
This question is quite old and the issue has probably already been solved, but I experienced the same problem with a wide-angle camera; my solution was to use a fisheye model, which was able to correctly estimate the camera intrinsics and lens distortion.
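In case it helps anyone, a minimal sketch of what I mean, using OpenCV's fisheye module (Python; the board size, square size, and file pattern are placeholders):

    import glob
    import cv2
    import numpy as np

    board_size = (9, 6)
    square_size = 0.03  # meters, placeholder

    objp = np.zeros((1, board_size[0] * board_size[1], 3), np.float32)
    objp[0, :, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2)
    objp *= square_size

    obj_points, img_points = [], []
    image_size = None
    for path in glob.glob("wide_*.png"):     # placeholder file pattern
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        image_size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners.reshape(1, -1, 2))

    K = np.zeros((3, 3))
    D = np.zeros((4, 1))
    flags = (cv2.fisheye.CALIB_RECOMPUTE_EXTRINSIC
             + cv2.fisheye.CALIB_FIX_SKEW)
    rms, K, D, rvecs, tvecs = cv2.fisheye.calibrate(
        obj_points, img_points, image_size, K, D, flags=flags)

    img = cv2.imread("wide_01.png")          # placeholder
    undist = cv2.fisheye.undistortImage(img, K, D, Knew=K)
    cv2.imwrite("wide_01_undist.png", undist)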
I was comparing the Kinect V2 with my own ToF sensor and found a difference.
Below is a point cloud with RGB information produced by the Kinect V2, which is placed in front of an object, and the object sits between two walls. This means the two walls are parallel to the Kinect V2's viewing direction.
If I rotate the point cloud, you can see that the two walls in the .ply file are parallel to each other and to the Kinect's viewing direction.
Looking at the point cloud from the top:
However, if I use my own ToF sensor to capture the point cloud in the same environment and from the same view (please ignore the different object in the middle; it was changed), the point cloud looks like this (a color camera hasn't been implemented):
The left wall (red circle area) is somehow distorted like a "/" (the right wall cannot be visualized due to my sensor's FOV).
I am confused by this phenomenon. I'm pretty sure the Kinect V2 does some processing to fix this issue, but I cannot figure out what.
Can someone give me some clues about what I'm seeing?
If any further information needs to be provided, feel free to ask.
ToF cameras suffer from many systematic errors coming from both the sensor and the illumination unit, which result, for example, in inhomogeneous lighting.
I did a detailed study on this topic that can be found at this address:
https://arxiv.org/pdf/1505.05459.pdf
The error gets a lot worse at the border of the frame, because the active illumination never has:
the same FOV as the camera
the same power at the center and at the border of the frame
In your case, I think the problem is happening at the border, where the measured distance becomes extremely non-linear for the border pixels. Check figures 5 and 15 in the paper.
Possible solutions:
Try to remain at a distance between 1 and 3 meters.
Concentrate more on the central 2/3 of the frame.
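As a crude workaround along the lines of the two points above, you can simply discard the unreliable pixels before building the point cloud, e.g. something like this (Python sketch; the frame size, crop ratio, and distance limits are assumptions you would need to tune for your sensor):

    import numpy as np

    def filter_tof_depth(depth_m, keep_ratio=2.0 / 3.0,
                         min_range=1.0, max_range=3.0):
        """Keep only the central part of the frame and a trusted distance band.

        depth_m: 2D array of distances in meters (0 = invalid).
        Returns a copy where discarded pixels are set to 0.
        """
        h, w = depth_m.shape
        out = np.zeros_like(depth_m)

        # Central crop: keep the middle `keep_ratio` of rows and columns.
        mh, mw = int(h * keep_ratio), int(w * keep_ratio)
        y0, x0 = (h - mh) // 2, (w - mw) // 2
        crop = depth_m[y0:y0 + mh, x0:x0 + mw]

        # Distance band: keep only measurements between min_range and max_range.
        valid = (crop >= min_range) & (crop <= max_range)
        out[y0:y0 + mh, x0:x0 + mw] = np.where(valid, crop, 0.0)
        return out

    # Example with a synthetic frame.
    depth = np.random.uniform(0.5, 4.0, size=(240, 320))
    print(np.count_nonzero(filter_tof_depth(depth)), "pixels kept")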
First of all I'm a total newbie in image processing, so please don't be too harsh on me.
That being said, I'm developing an application to analyse changes in blood flow in extremities using thermal images obtained by a camera. The user is able to define a region of interest by placing a shape (circle, rectangle, etc.) on the current image. The user should then be able to see how the average temperature changes from frame to frame inside the specified ROI.
The problem is that some of the images are not steady, due to (small) movement by the test subject. My question is how can I determine the movement between the frames, so that I can relocate the ROI accordingly?
I'm using the Emgu OpenCV .Net wrapper for image processing.
What I've tried so far is calculating the center of gravity using GetMoments() on the biggest contour found and computing the direction vector between this and the previous center of gravity. The ROI is then translated using this vector, but the results are not that promising yet.
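For clarity, here is roughly what my current approach looks like, sketched in Python/OpenCV rather than Emgu (the threshold value, frame names, and ROI are placeholders; the Emgu calls are the equivalent FindContours/Moments methods):

    import cv2
    import numpy as np

    def biggest_contour_centroid(gray):
        """Return the center of gravity of the largest contour in a grayscale frame."""
        _, mask = cv2.threshold(gray, 80, 255, cv2.THRESH_BINARY)  # placeholder threshold
        # OpenCV 4.x return convention: (contours, hierarchy).
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        c = max(contours, key=cv2.contourArea)
        m = cv2.moments(c)
        if m["m00"] == 0:
            return None
        return np.array([m["m10"] / m["m00"], m["m01"] / m["m00"]])

    prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)  # placeholder frames
    curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

    c_prev = biggest_contour_centroid(prev)
    c_curr = biggest_contour_centroid(curr)
    if c_prev is not None and c_curr is not None:
        shift = c_curr - c_prev           # direction vector between the two centroids
        roi = np.array([100.0, 100.0, 50.0, 50.0])  # x, y, w, h (placeholder ROI)
        roi[:2] += shift                  # translate the ROI by the estimated movement
        print("ROI moved by", shift, "->", roi)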
Is this the right way to do it or am I totally barking up the wrong tree?
------Edit------
Here are two sample images showing slight movement downwards to the right:
http://postimg.org/image/wznf2r27n/
Comparison between the contours:
http://postimg.org/image/4ldez2di1/
As you can see the shape of the contour is pretty much the same, although there are some small differences near the toes.
It seems I was finally able to find a solution to my problem using optical flow based on the Lucas-Kanade method.
Just in case anyone else is wondering how to implement it in Emgu/C#, here's the link to an Emgu examples project, where they use the Lucas-Kanade and Farneback algorithms:
http://sourceforge.net/projects/emguexample/files/Image/BuildBackgroundImage.zip/download
You may need to adapt a few things, e.g. the parameters for the corner detection (the frame.GoodFeaturesToTrack(..) method), but it's definitely something to start with.
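If you prefer plain OpenCV over the Emgu project above, the core idea translates to something like this (Python sketch; the frame file names and the ROI are placeholders, and the corner-detection parameters are the ones you will most likely need to tune):

    import cv2
    import numpy as np

    prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)  # placeholder frames
    curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

    # Corners to track (equivalent of frame.GoodFeaturesToTrack in Emgu).
    p0 = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01,
                                 minDistance=7, blockSize=7)

    # Sparse Lucas-Kanade optical flow between the two frames.
    p1, status, _ = cv2.calcOpticalFlowPyrLK(
        prev, curr, p0, None,
        winSize=(21, 21), maxLevel=3,
        criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

    good_old = p0[status.ravel() == 1].reshape(-1, 2)
    good_new = p1[status.ravel() == 1].reshape(-1, 2)

    # Use the median displacement as a robust estimate of the frame shift.
    shift = np.median(good_new - good_old, axis=0)

    roi = np.array([100.0, 100.0, 60.0, 60.0])  # x, y, w, h (placeholder ROI)
    roi[:2] += shift
    print("estimated shift:", shift, "-> new ROI:", roi)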
Thanks for all the ideas!
I have a very specific application in which I would like to try structure from motion to get a 3D representation. For now, all the software/code samples I have found for structure from motion work like this: "a fixed object that is photographed from all angles to create the 3D model". This is not my case.
In my case, the camera is moving in the middle of a corridor and looking forward. Sometimes the camera can look in other directions (left, right, up, down). The camera will never go back or look back; it always moves forward. Since the corridor is small, almost everything is visible (no hidden spots). The corridor can be very long sometimes.
I have tried this software and it doesn't work in my particular case (but it's fantastic for normal use). Can anybody suggest a library/software/tool/paper that could address my specific needs? Or have you ever needed to implement something like that? Any help is welcome!
Thanks!
What kind of corridors are you talking about and what kind of precision are you aiming for?
A priori, I don't see why your corridor would not be a fixed object photographed from different angles. The quality of your reconstruction might suffer if you only look forward and you can't get many different views of the scene, but standard methods should still work. Are you sure that the programs you used aren't failing because of your picture quality, arrangement or other reasons?
If you have to do the reconstruction yourself, I would start by
1) Calibrating your camera
2) Undistorting your images
3) Matching feature points in subsequent image pairs
4) Extracting a 3D point cloud for each image pair
You can then orient the point clouds with respect to one another, for example via ICP between two subsequent clouds. More sophisticated methods might not yield much difference if you don't have any closed loops in your dataset (as your camera is only moving forward).
OpenCV and the Point Cloud Library should be everything you need for these steps. Visualization might be more of a hassle, but the pretty pictures are what you pay for in commercial software after all.
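To make steps 3) and 4) a bit more concrete, a two-view sketch with OpenCV could look like this (Python; the intrinsics and image names are placeholders, and it assumes the images were already undistorted in step 2):

    import cv2
    import numpy as np

    # Placeholder intrinsics from step 1 (camera calibration).
    K = np.array([[700.0,   0.0, 320.0],
                  [  0.0, 700.0, 240.0],
                  [  0.0,   0.0,   1.0]])

    img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)  # placeholder images
    img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

    # Step 3: match feature points between the two images.
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)
    pts1 = np.float32([k1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([k2[m.trainIdx].pt for m in matches])

    # Step 4: relative pose from the essential matrix, then triangulation.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    cloud = (pts4d[:3] / pts4d[3]).T   # Nx3 point cloud, up to scale
    print(cloud.shape[0], "points triangulated (scale is arbitrary)")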
Edit (2017/8): I haven't worked on this in the meantime, but I feel like this answer is missing some pieces. If I had to answer it today, I would definitely suggest looking into the keyword monocular SLAM, which has recently seen a lot of activity, not least because of drones with cameras. Notably, LSD-SLAM is open source and may not be as vulnerable to feature-deprived views, as it operates directly on the intensity. There even seem to be approaches combining inertial/odometry sensors with the image matching algorithms.
Good luck!
FvD is right in the sense that your corridor is a static object. Your scenario is the same as moving around an object and taking images from multiple views; your views are just not arranged to provide a 360-degree view of the object.
I see you mentioned in a previous comment that the data comes from a video? In that case, the problem could very well be the camera calibration. A camera calibration tells the SfM algorithm about the internal parameters of the camera (focal length, principal point, lens distortion, etc.). In the absence of knowledge about these, the bundler in VSfM uses information from the EXIF data of the image. However, I don't think video stores any EXIF information (not 100% sure). As a result, I think the entire algorithm is running with bad focal length information and cannot solve for the orientations.
Can you extract a few frames from the video and see if there is any EXIF information?
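If it turns out there is no EXIF data, one workaround is to extract the frames yourself and supply the focal length to the SfM tool explicitly. A small sketch of the frame extraction (Python/OpenCV; the video file name and frame step are placeholders):

    import cv2

    cap = cv2.VideoCapture("corridor.mp4")   # placeholder video file
    step = 10                                # keep every 10th frame (placeholder)

    i = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % step == 0:
            # Frames written this way carry no EXIF block, so the focal
            # length has to be given to the SfM tool separately.
            cv2.imwrite(f"frame_{saved:04d}.jpg", frame)
            saved += 1
        i += 1
    cap.release()
    print(saved, "frames written")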