Basic camera calibration without checkerboard

The problem
I don't have access to the camera that took the pictures below. The source video is here: https://youtu.be/C7hS3enWh94?t=343
I would like to perform a coarse camera calibration using only the information in the video frames: a road line that is supposed to be straight but looks curved in the images and, luckily, covers most of the sensor area over time.
What I need
I'm looking for a quick and dirty way to find coarse camera distortion parameters, because I don't think the calibration parameters can be estimated accurately from the available information alone.
I'm out of ideas on how to progress. The approaches I have in mind are complicated and would take a lot of effort to implement, with little guarantee that they would actually work. So this question is really a brainstorm on hypothetical approaches to the problem.
P.S.
I thought it should be possible to use a Hough transform to look for circular curvature (the circle radius that accommodates the most pixels), but the curvature here would correspond to a very large radius. The curve might also not be a perfect circle but rather an ellipse, because the camera does not look down at the road at exactly 90 degrees. This complicates the Hough transform implementation significantly.
Another idea was to use a random search algorithm, such as a genetic algorithm, to tune distortion + scale + rotation + translation parameters until one image fits perfectly on top of another. But again, it would take me a lot of time to complete anything like this.
Any better ideas from OpenCV gurus out there?
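To make the parameter-search idea concrete, here is a minimal sketch of a much smaller search: it only looks for the single radial coefficient k that makes the traced road line straightest, instead of fitting distortion + scale + rotation + translation at once. Everything in it is an assumption for illustration: the one-parameter division model, the distortion centre at the image centre, coordinates normalised so that the image half-diagonal is about 1, and the helper names undistort, straightness_error and estimate_k. It is plain Python, not an OpenCV API.

```python
import math

def undistort(points, k, centre=(0.0, 0.0)):
    """Assumed division model: p_u = c + (p_d - c) / (1 + k * r_d^2)."""
    cx, cy = centre
    out = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        s = 1.0 + k * (dx * dx + dy * dy)
        out.append((cx + dx / s, cy + dy / s))
    return out

def straightness_error(points):
    """Share of the point spread lying off the best-fit line (0 = straight)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    # Smallest eigenvalue of the 2x2 scatter matrix, in closed form
    lam_min = 0.5 * (sxx + syy) - math.hypot(0.5 * (sxx - syy), sxy)
    return max(lam_min, 0.0) / (sxx + syy)

def estimate_k(curves, k_min=-0.5, k_max=0.5, steps=1001):
    """Grid search for the k that makes all traced curves straightest."""
    best_k, best_err = 0.0, float("inf")
    for i in range(steps):
        k = k_min + (k_max - k_min) * i / (steps - 1)
        err = sum(straightness_error(undistort(c, k)) for c in curves)
        if err < best_err:
            best_k, best_err = k, err
    return best_k
```

Here curves would be lists of (x, y) points traced along the road line in several frames; the more of the sensor they cover, the better k is constrained. A line passing through the image centre carries no information, since radial distortion leaves it straight.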

Related

How can I detect whether the object is 3D?

I am trying to build a solution that can differentiate between a 3D textured surface with a height of around 200 microns and a regular text print.
The following image is a textured surface. The black color here is the base surface.
Regular text print will be the 2D print of the same 3D textured surface.
[EDIT]
My initial thought on solving this problem:
The general idea is that images of a 3D object shot at different angles would be less correlated with each other than images of a 2D object shot under similar conditions.
One possible way to verify this:
1. Take 2 images with enough light around (camera flash). They should be shot at as wide an angle from the object plane as possible, say one with the camera at 45 degrees on the left side and the other at the same angle on the right side.
2. Extract the ROI and perspective-correct both images.
3. Find the GLCM of the composite of these 2 images. If the contrast of the GLCM is low, it would be a 3D image, else a 2D one.
Please pardon the language; I'm open to edit suggestions.
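For reference, the GLCM contrast used in the last step can be computed as below. This is only a sketch of the statistic itself, in plain Python: the function name glcm_contrast, the single pixel offset and the assumption that the image is already quantised to integer grey levels 0..levels-1 are illustrative choices. How the two views are combined into the composite is not specified above, so that part is left out.

```python
def glcm_contrast(image, levels, dx=1, dy=0):
    """Contrast of the normalised grey-level co-occurrence matrix.

    image  : list of rows of integer grey levels in [0, levels)
    dx, dy : pixel offset between the two pixels of each counted pair
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    # Contrast = sum over (i, j) of (i - j)^2 * p(i, j)
    return sum((i - j) ** 2 * counts[i][j]
               for i in range(levels) for j in range(levels)) / total
```

A flat patch gives a contrast of 0, while neighbouring pixels that differ strongly in grey level push the value up.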

If you can get another image taken from a different or sharper angle, or under different lighting conditions, you may get a result. However, two images taken at different angles with a calibrated camera give you stereo vision, which would solve your problem easily.

This is a pretty complex problem and there is no plug-and-play solution for it. Using light (structured or laser) or shadow to detect a height of 0.2 mm will almost surely not work with an acceptable degree of confidence, no matter how many photos you take. (This is just my personal intuition; in computer vision we verify whether something works by actually testing it.)
GLCM is a nice feature for describing texture, but as far as I know it is used to verify whether there is a pattern in the texture, so I believe it would also give a positive response for 2D printed text if there is some kind of repeating pattern.
I would let the computer learn what is text and what is texture. Just collect a large amount of 3D and 2D data, and use a machine learning engine to learn which is which. If the feature space is rich enough, it may be able to find a way to differentiate one from the other in a way the human mind wouldn't be able to. The feature space should consist of edge and colour features.
If the system environment is stable and controlled, this approach will work especially well, since the training data will be so similar to the testing data.
For this problem, I'd start by computing colour and edge features (local image pixel sums over different edge and colour channels) and try a boosted classifier. Boosted classifiers aren't the state of the art when it comes to machine learning, but they are good at not overfitting (meaning you can just insert as much data as you want), and will most likely work in a stable environment.
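To give an idea of what "a boosted classifier" means in practice, here is a minimal sketch of discrete AdaBoost over decision stumps in plain Python. It is an illustration, not a tuned implementation: the function names are made up, the labels are assumed to be +1 (3D texture) and -1 (2D print), and each row of X is assumed to be a feature vector you have already computed (for example the local edge and colour sums mentioned above).

```python
import math

def train_stump(X, y, w):
    """Best single-feature threshold rule under sample weights w."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted(set(row[f] for row in X)):
            for sign in (1, -1):
                pred = [sign if row[f] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best

def adaboost(X, y, rounds=10):
    """Return a list of weighted stumps (alpha, feature, threshold, sign)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, f, thr, sign = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) on perfect stumps
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, sign))
        # Raise the weight of misclassified samples, lower the rest
        w = [wi * math.exp(-alpha * yi * (sign if row[f] >= thr else -sign))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    """Sign of the weighted vote of all stumps: +1 or -1."""
    score = sum(alpha * (sign if row[f] >= thr else -sign)
                for alpha, f, thr, sign in model)
    return 1 if score >= 0 else -1
```

In a real system you would more likely use an existing library implementation; the point here is only that each round picks the single feature threshold that best separates the currently hardest samples.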
Hope this helps,
Good luck.

Determine movement/motion (in pixels) between two frames

First of all I'm a total newbie in image processing, so please don't be too harsh on me.
That being said, I'm developing an application to analyse changes in blood flow in extremities using thermal images obtained by a camera. The user can define a region of interest by placing a shape (circle, rectangle, etc.) on the current image. The user should then be able to see how the average temperature inside the specified ROI changes from frame to frame.
The problem is that some of the images are not steady, due to (small) movement by the test subject. My question is how can I determine the movement between the frames, so that I can relocate the ROI accordingly?
I'm using the Emgu OpenCV .Net wrapper for image processing.
What I've tried so far is calculating the center of gravity using GetMoments() on the biggest contour found and calculating the direction vector between this and the previous center of gravity. The ROI is then translated using this vector but the results are not that promising yet.
Is this the right way to do it or am I totally barking up the wrong tree?
------Edit------
Here are two sample images showing slight movement downwards to the right:
http://postimg.org/image/wznf2r27n/
Comparison between the contours:
http://postimg.org/image/4ldez2di1/
As you can see the shape of the contour is pretty much the same, although there are some small differences near the toes.

Seems like I was finally able to find a solution for my problem using optical flow based on the Lucas-Kanade method.
Just in case anyone else is wondering how to implement it in Emgu/C#, here's the link to an Emgu examples project, where they use the Lucas-Kanade and Farneback algorithms:
http://sourceforge.net/projects/emguexample/files/Image/BuildBackgroundImage.zip/download
You may need to adapt a few things, e.g. the parameters for the corner detection (the frame.GoodFeaturesToTrack(..) method), but it's definitely something to start with.
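For anyone who wants to see the core of the method without the library, below is a minimal sketch of a single Lucas-Kanade step that estimates one global translation between two frames and moves the ROI by it. It is a simplification and not what the Emgu example does: that project tracks individual corner features, whereas this uses the whole image as one window, has no pyramid, and only holds for shifts of about a pixel or less on smooth images. The names lk_translation and shift_roi are made up for illustration; frames are plain lists of rows of floats.

```python
def lk_translation(prev, curr):
    """One Lucas-Kanade step for a single global shift (dx, dy) in pixels.

    Solves the least-squares system Ix*dx + Iy*dy = -It over all interior
    pixels, where curr(x, y) is assumed to equal prev(x - dx, y - dy).
    """
    rows, cols = len(prev), len(prev[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central-difference gradients, averaged over both frames
            ix = 0.25 * (prev[r][c + 1] - prev[r][c - 1]
                         + curr[r][c + 1] - curr[r][c - 1])
            iy = 0.25 * (prev[r + 1][c] - prev[r - 1][c]
                         + curr[r + 1][c] - curr[r - 1][c])
            it = curr[r][c] - prev[r][c]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("not enough texture to estimate motion")
    dx = (-sxt * syy + syt * sxy) / det
    dy = (-syt * sxx + sxt * sxy) / det
    return dx, dy

def shift_roi(roi, dx, dy):
    """Move an (x, y, width, height) ROI by the estimated shift."""
    x, y, w, h = roi
    return (x + dx, y + dy, w, h)
```

For larger movements the usual remedy is the pyramidal, feature-based variant, which is what the library implementations provide.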
Thanks for all the ideas!

Image Rectification for Shake Correction on OpenCV

I have 2 pictures of the same scene from an uncalibrated camera. The pictures are taken from a slightly different angle and scale (zoom), and I'd like to superimpose them, rejecting any kind of shake. In other words, I need to transform them so that the shake becomes imperceptible, i.e. do motion compensation.
I've already tried using a simple SURF (feature) detector along with a homography, but sometimes the result isn't satisfactory. So I am thinking about trying image rectification to compensate for the motion.
- Would it work with slight changes, such as user shake?
- Would it really work to reject shake for these 2 frames? And for a bigger buffer of pictures (10 maybe)?
- Does anyone know if it would fix scale disparity (different zoom in the images)?
- What does the algorithm really do? Will it transform both pictures into a third orientation?
If there is a better solution, I would be glad to know =)
EDIT
I don't aim to compensate for motion blur but for the displacement itself. For example, in this file the author compensates for the angle difference between two cameras by image rectification. How does it actually work? Does it always create an intermediate picture orientation, or can I specify that one of the pictures shall remain still?
Also, would I be able to apply this to many frames, or would it always find an intermediate orientation for each pair of frames I put in?
Cheers,

I'm not sure how well superimposing the images would work. Another way to remove blur (including motion blur, which should dominate in handheld camera devices) from an image is blind deconvolution. It is basically a method of finding the inverse of the blur filter that was physically applied (by the camera shake) to the real image. There are plenty of techniques out on the web. I've specifically had good results using a modified version of the algorithm in this paper: http://www.cse.cuhk.edu.hk/~leojia/all_final_papers/motion_deblur_cvpr07.pdf
It also comes with an executable file somewhere around the web so you can see if it's fit for your purpose.
Good luck out there!

Fiducial marker detection in the presence of camera shake

I'm trying to make my OpenCV-based fiducial marker detection more robust when the user moves the camera (phone) violently. Markers are ArTag-style with a Hamming code embedded within a black border. Borders are detected by thresholding the image, then looking for quads based on the found contours, then checking the internals of the quads.
In general, decoding of the marker is fairly robust if the black border is recognized. I've tried the most obvious thing, which is downsampling the image twice and also performing quad detection on those levels. This helps with camera defocus on extreme near-ground markers, and also with very small levels of image blur, but doesn't hugely help in the general case of camera motion blur.
Is there available research on ways to make detection more robust? Ideas I'm wondering about include:
1. Can you do some sort of optical flow tracking to "guess" the positions of the marker in the next frame, then some sort of corner detection in the region of those guesses, rather than treating the rectangle search as a full-frame thresholding?
2. On PCs, is it possible to derive blur coefficients (perhaps by registration with recent video frames where the marker was detected) and deblur the image prior to processing?
3. On smartphones, is it possible to use the gyroscope and/or accelerometers to get deblurring coefficients and pre-process the image? (I'm assuming not, simply because if it were, the market would be flooded with shake-correcting camera apps.)
Links to failed ideas would also be appreciated if it saves me trying them.

Yes, you can use optical flow to estimate where the marker might be and localise your search, but it's just relocalisation; your tracking will have broken for the blurred frames.
I don't know enough about deblurring except to say it's very computationally intensive, so real-time might be difficult.
You can use the sensors to guess the sort of blur you're faced with, but I would guess deblurring is too computationally expensive for mobile devices in real time.
Then some other approaches:
There is some really smart stuff in here: http://www.robots.ox.ac.uk/~gk/publications/KleinDrummond2004IVC.pdf where they're doing edge detection (which could be used to find your marker borders, even though you're looking for quads right now), modelling the camera movements from the sensors, and using those values to estimate how an edge in the direction of blur should appear given the frame-rate, and searching for that. Very elegant.
Similarly here http://www.eecis.udel.edu/~jye/lab_research/11/BLUT_iccv_11.pdf they just pre-blur the tracking targets and try to match the blurred targets that are appropriate given the direction of blur. They use Gaussian filters to model blur, which are symmetrical, so you need half as many pre-blurred targets as you might initially expect.
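The "half as many targets" point is easy to see in code. The sketch below is not the paper's implementation; it just builds anisotropic Gaussian kernels elongated along a given direction (the function names, the odd kernel size and the two sigmas are all illustrative assumptions) and shows why a bank covering [0, pi) is enough: the kernel for angle + pi is identical to the one for angle.

```python
import math

def oriented_gaussian_kernel(size, sigma_along, sigma_across, angle):
    """Anisotropic Gaussian elongated along `angle` (radians); sums to 1.

    `size` is assumed to be odd so the kernel has a centre pixel.
    """
    half = size // 2
    ca, sa = math.cos(angle), math.sin(angle)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = x * ca + y * sa      # coordinate along the blur direction
            v = -x * sa + y * ca     # coordinate across it
            row.append(math.exp(-0.5 * ((u / sigma_along) ** 2
                                        + (v / sigma_across) ** 2)))
        kernel.append(row)
    total = sum(map(sum, kernel))
    return [[w / total for w in row] for row in kernel]

def blur_bank(size, sigma_along, sigma_across, n_directions):
    """Kernels for n_directions evenly spaced over the half circle [0, pi)."""
    return [oriented_gaussian_kernel(size, sigma_along, sigma_across,
                                     math.pi * i / n_directions)
            for i in range(n_directions)]
```

Each kernel in the bank would then be convolved with the marker template once, offline, and the blurred templates matched against the frame.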
If you do try implementing any of these, I'd be really interested to hear how you get on!

From some related work (attempting to use sensors/gyroscope to predict the likely location of features from one frame to another in video), I'd say that idea 3 (using the gyroscope and/or accelerometers to get deblurring coefficients) is likely to be difficult if not impossible. I think at best you could get an indication of the approximate direction and angle of motion, which may help you model blur using the approaches referenced by dabhaid, but I think it unlikely that you'd get sufficient precision for it to be much more help.

Find the position of a pattern/marker inside a photograph

I need to find a marker like the ones used in Augmented Reality.
Like this:
I have a solid background in algebra and calculus, but no experience whatsoever in image processing. My thing is PHP, SQL and stuff.
I just want this to work. I've read the theory behind this, and it's extremely hard for me to see how it translates into code.
The main idea is to do this as a batch process, so no interactivity is needed. What do you suggest?
Input : The sample image.
Output: Coordinates and normal vector in 3D of the marker.
The use for this will be linking images that have the same marker to spatialize them, a primitive version of photosync we could say. Just a carousel of pinned images, the marker acting like the pin.
The reps given allowed me to post images, thanks.

You can always look at open source libraries such as ARToolkit and see how they work, but generally, to get the 3D coordinates of the marker you would need to:
1. Calibrate the camera.
2. Find the marker in the image, for example using local features.
3. Use the calibrated camera parameters and the 2D coordinates of the marker to estimate its 3D coordinates.
I've never implemented something similar myself, but I think this is the general concept you should apply.

Your problem can be solved by perspective-n-point (PnP) camera pose estimation. When you can reasonably assume that all correspondences are correct, a linear algorithm should do.
Since the marker is planar, you can also recover the displacement from the homography between the model plane and the image plane (link). As usual, best results are obtained by iterative algorithms (link).
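As a rough sketch of the homography route: assuming you already have the 3x3 homography H that maps marker-plane coordinates (X, Y, 1), with the marker lying in the plane Z = 0, to pixel coordinates, and a calibrated pinhole camera with focal lengths fx, fy and principal point cx, cy (no skew, no lens distortion), the marker position and normal in the camera frame follow from H ~ K [r1 r2 t]. The function name and argument layout are illustrative, and this is the plain closed-form step without the iterative refinement mentioned above.

```python
import math

def pose_from_homography(H, fx, fy, cx, cy):
    """Return (t, normal): marker origin and plane normal in camera coordinates.

    H is a 3x3 nested list, defined up to scale, mapping marker-plane
    points (X, Y, 1) to pixels.
    """
    def kinv_col(j):
        # Column j of K^-1 * H for K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]
        h0, h1, h2 = H[0][j], H[1][j], H[2][j]
        return [(h0 - cx * h2) / fx, (h1 - cy * h2) / fy, h2]

    a1, a2, a3 = kinv_col(0), kinv_col(1), kinv_col(2)
    # Fix the unknown scale so that r1 has unit length
    scale = 1.0 / math.sqrt(sum(v * v for v in a1))
    if a3[2] * scale < 0:
        scale = -scale            # keep the marker in front of the camera
    r1 = [scale * v for v in a1]
    r2 = [scale * v for v in a2]
    t = [scale * v for v in a3]
    # The marker's Z axis (its normal) is r1 x r2
    normal = [r1[1] * r2[2] - r1[2] * r2[1],
              r1[2] * r2[0] - r1[0] * r2[2],
              r1[0] * r2[1] - r1[1] * r2[0]]
    return t, normal
```

With noisy correspondences r1 and r2 will not come out exactly orthonormal, which is one reason an iterative refinement gives better results.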
