The idea is to rectify single photograph taken from perspective (its not an aerial image, so I cannot use ground control points).
Instead of GCPs, street furniture and distance between them will be measured in real life, therefore I'll know some true measurements and distances.
The photograph will be taken once a month to see if any changes occurred, therefore, ideally compared with other images.
Once photo is (somehow)corrected from distortions and from perspective, I then need to extract basic dimensions. - a survey of some kind.
The attached is a sample image (Sea Wall) of what I can be given, - sea wall (fixed, so I be able to collect real measurements), and gravel, where gravel needs to be measured (increase or decrease of it)
Ideally I'm looking for a free and simple software but, I'm open to any ideas.
Sea Wall
Photogrammetric image rectification from cad-kas seems to do it.( i have not tried myself and it is not free!):
Rectify photos with the help of photogrammetry. So you get a completely level view of one side of a building for example.
So you can take the result for measurements of this building. You can export the result as picture or as DXF file. If you take a photo of a rectangle what you get is a trapezoid.
https://www.cadkas.com/photogrammetric-image-rectification.php
Related
I have a problem and a solution but i wonder whether my answer is true or not or where did i do wrong thanks for your help-->Question is that-->
A professor of archeology doing research on currency exchange practices became aware that
four Roman coins crucial to his research are listed in the holdings of a museum. Unfortunately,
he learned that the coins had been recently stolen. Good thing on his part is that the museum
keeps photographs of every item in their holdings. Unfortunately, the photos of the coins he is
interested in are blurred to the point where the date and other markings are not readable. The
original camera used to take the photographs is available, as well as other representative coins
from the same era.
You are hired to as an image processing consultant to help the professor to determine whether
computer processing can be utilized to process the images to the point where the professor can
read markings.
Propose a step by step solution to this problem.
Than my solution-->
Firstly, from the coins and cameras that i have
setting that causes blurred photo to occur
and select a start method accordingly.After using the deconvolution method to get rid of the blur, I use a high pass filter to reveal the perimeter of the coin or use tresholding method so that I can reveal the shape of the money.In 2D photography, it is easy to see where the coin is and what pixelles I have to trade.So for example, instead of applying a whole process to the image,
I have been working on the pixeller that has been determined.Then the place where you have the necessary information from the coin I take the gradient in the x-y direction to determine what parts might be from the image.Then i apply dialation to necessary part so that this part can be clear.Once the place has become clearer I get a clear image of the required area with OCR (optical character recognition) or template matching.If you make a comment at least I am very pleased thanks a lot.
Take a picture of a single, sharp black dot with the old camera to estimate its PSF.
Now use a supervised deblurring algorithm using that PSF.
If the settings of the camera were lost, you can try an reproduce similarly blurred images of the available coins, by varying the focus and optimizing the correlation to the available pictures.
This is the setup: A fairly large room with 4 fish-eye cameras mounted on the ceiling. There are no blind spots. Each camera coverage overlaps a little with the other.
The idea is to track people across these cameras. As of now a blob extracting algorithm is in place, which detects people as blobs. It's a fairly decently working algorithm which detects individual people pretty good. Am using the OpenCV API for all of this.
What I mean by track people is that - Say, camera 1 identifies two people, say Person A and Person B. Now, as these two people move from the coverage of camera 1 into the overlapping area of coverage of cam1 and cam2 and into the area where only cam2 covers, cam2 should be able to identify them as the same people A and B cam1 identified them as.
This is what I thought I'd do -
1) The camera renders the image at 15fps and I think the dimensions of the frames are of 1920x1920.
2) Identify blobs individually in each camera and give each blob an unique label.
3) Now as for the overlaps - Compute an affine transformation matrix which maps pixels on one camera's frame onto another camera's frame - this needn't be done for every frame - this can be done before the whole process starts, as a pre-processing step. So in real time, whenever I detect a blob which is in the overlapping area, all I have to do is apply the transformation matrix to the pixels in cam1 and see if there is a corresponding blob in cam2 and give them the same label.
So, Questions :
1) Would this system give me a badly-working system which tracks people decently ?
2) So, for the affine transform, do I have to convert the fish-eye to rectilinear image ? (My answer is yes, but am not too sure)
Please feel free to point out possible errors and why certain things might not work in the process I've described. Also alternate suggestions are welcome! TIA
1- blob extraction is not enough to track a specific object, for people case I suggest HoG - or at least background subtraction before blob extraction, since all of the cameras have still scenes.
2- opencv <=2.4.9 uses pinhole model for stereo vision. so, before any calibration with opencv methods your fisheye images must be converted to rectilinear images first. You might try calibrating yourself using other approaches too
release 3.0.0 will have support for fisheye model. It is on alpha stage, you can still download and give it a try.
I have a photocamera mounted vertically under water in a tank, looking downwards.
There is a flat grid on the bottom of the tank (approx 2m away from the camera).
I want to be able to place markers on the bottom, and use computer vision to know their real life exact position.
So, I need to map from pixels to mm.
If I am not mistaken, cv::calibrateCamera(...) does just this, but is dependent on moving a pattern in front of the camera.
I have just static pictures of the scene, and the camera never moves in relation to the grid. Thus, I have only a "single" image to find the parameters.
How can I do this using the grid?
Thank you.
Interesting problem! The "cute" part is the effect on the intrinsic parameters of the refraction at the water-glass interface, namely to increase the focal length (or, conversely, to reduce the field of view) compared to the same lens in air. In theory, you could calibrate in air and then correct for the difference in refraction index, but calibrating directly in water is likely to give you more accurate results.
Do know your accuracy requirements? And have you verified that your lens/sensor combination is adequate to meet them (with an adequate margin)? To answer the question you need to estimate (either by calculation from the lens and sensor specifications, or experimentally using a resolution chart) whether you can resolve in an image the minimal distances required by your application.
From the wording of your question I think that you are interested only in measurements on a single plane. So you only need to (a) remove the nonlinear (barrel or pincushion) lens distortion and (b) estimate the homography between the plane of interest and the image. Once you have the latter, you can directly convert from undistorted image coordinates to world ones by matrix multiplication. Additionally if (as I imagine) the plane of interest is roughly parallel to the image plane, you should not have any problem keeping the entire field-of-view in focus.
Of course, for all of this to work as expected, you should make sure that the tank bottom is really flat, within the measurement tolerances of your application. Otherwise you are really dealing with a 3D problem, and need to modify your procedures accordingly.
The actual procedure depends a lot on the size of the tank, which you don't indicate clearly. If it's small enough that it is practical to manufacture a chessboard-like movable calibration target, by all means go for it. You may want to take a look at this other answer for suggestions. In the following I'll discuss the more interesting case in which your tank is large, e.g. the size of a swimming pool.
I'd proceed by sticking calibration markers in a regular grid at the pool bottom. I'd probably choose checker-like markers like these, maybe printing them myself with a good laser printer on plastic with an adhesive backing (assuming you can leave them in place forever). You should plan on having quite a few of them, say, an 8x8 or 10x10 grid, covering as much as possible of the field of view of the camera in its operating position and pose. To help with lining up the grid nicely you might use a laser line projector of suitable fan angle, or a laser pointer attached to a rotating support. Note carefully that it is not necessary that they be affixed in a precise X-Y grid (which may be complicated, depending on the size of your pool), only that their positions with respect any arbitrarily chosen (but fixed) three of them be known. In other words, you can attach them to the bottom approximately in a grid, then measure the distances of three extreme corners from each other as accurately as you can, thus building a base triangle, then measure the distances of all the other corners from the vertices of the triangle, and finally reconstruct their true positions with a bit of trigonometry. It's basically a surveying problem and, depending on your accuracy requirements and budget, you may want to enroll a local friendly professional surveyor (and their tools) to get it done as precisely as necessary.
Once you have your grid, you can fill the pool, get your camera, focus and f-stop the lens as needed for the application. From now on you may not touch the focus and f-stop ever again, under penalty of miscalibrating - exposure can only be controlled by the exposure time, so make sure to have enough light. Disable any and all auto-focus and auto-iris functions, if any. If the camera has a non-rigid lens mount (e.g. a DLSR), you'll need some kind of mechanical rig to ensure that the lens-body pair stay rigid. F-stop as close as you can, given the available lighting and sensor, so to have a fair bit of depth of field available. Then take several photos (~ 10) of the grid, moving and rotating the camera, and going a bit closer and farther away than your expected operating distance from the plane. You'll want to "see" in some images some significant perspective foreshortening of the grid - this is needed to accurately calibrate the focal length. Avoid JPG and any other lossy compression format when storing the images - use lossless PNG or TIFF.
Once you have the images, you can manually mark and identify the checker markers in the images. For a once-off project like this I would not bother with automatic identification, just do it manually (e.g. in Matlab, or even in Photoshop or Gimp). To help identify the markers, you could, e.g. print a number next to them. Once you have the manual marks, you can refine them automatically to subpixel accuracy, e.g. using cv::findCornerSubpix.
You're almost done. Feed the "reference" measured position of the real corners, and the observed ones in all images, to your favorite camera calibration routine, e.g. cv::calibrateCamera. You use the nominal focal length of the camera (converted to pixels) for an initial estimate, along with null distortion. If all goes well, you will obtain the camera intrinsic parameters, which you will keep, and the camera poses at all images, which you'll throw away.
Now you can mount the camera in your final setup, as needed by your application, and take one further image of the grid. Mark and refine the corner positions as before. Undistort their image positions using the distortion parameters returned by the calibration. Finally compute the homography between the reference positions of the real markers (in meters) and their undistorted positions, and you're done.
HTH
To calibrate the camera you do need multiple images of the checkerboard (or one of the other patterns found here). What you can do, is calibrate the camera outside of the water or do a calibration sequence once.
Once you have that information (focal length, center of lens, distortion, etc). You can use the solvePNP function to estimate the orientation of a single board. This estimation provides you with a distance from the camera to the board.
A completely different alternative could be to find what kind of lens the camera uses and manually fill in the data. I've not tried this, so I'm uncertain how well this would work.
In an iPad app, we would like to help users visualize the width of an object with their camera in the style of an augmented reality app.
For instance, if we want to help someone visualize the width of a bookshelf against a wall, how could we do that? Are there algorithms to estimate width (i.e., if you're standing 5 feet away and pointing your camera at the wall, 200 pixels in the camera will represent X inches)?
Any good resources to start looking?
To do this you may wish to do a bit of research yourself as this will vary depending upon the camera being used, resolution and it's depth of field.
I would recommend taking a large strip of paper (wallpaper would work fine) and writing measurements at specified intervals, e.g. you could write distance markers for each foot on the paper. Then all you need to do is stand at varying distances from the wall with the paper mounted to it, and take photographs, you should then be able to establish the correlation of how distance, resolution and measurements correlate to each other and use these findings to form your own algorithm. You've essentially answered your own question.
How do Google Maps do their panoramas in Street View?
Yeah, I know its Flash, but how do they skew bitmaps with Correct Texture Mapping?
Are they doing it on the pixel-level like most Flash 3D engines?, or just applying some tricky transformation to the Bitmaps in the Movieclips?
Flash Panorama Player can help achieve a similar result!
It uses 6 equirectangular images (cube faces) stitched together seamlessly with some 'magic' ActionScript.
Also see these parts of flashpanos.com for plugins, and tutorials with (possibly) documentation.
A quick guide to shooting panoramas so you can view them with FPP (Flash Panorama Player).
Cubic projection cube faces are actually 90x90 degrees rectilinear
images like the ones you get from a normal camera lens. ~ What is VR Photography?
Check out http://www.panoguide.com/. They have howtos, links to software etc.
Basically there are 2 components in the process: the stitching software which creates a single panoramic photo from many separate image sources, then there is the panoramic viewer, which distorts the image as you change your POV to simulate what your eyes would see if you were actually there.
My company uses the Papervision3D flash render engine, and maps a panoramic image (still image or video) onto a 3D sphere. We found that using a spherical object with about 25 divisions along both the axes gives a much better visual result than mapping the same image on the six faces of a cube. Check it for yourself at http://www.panocast.com.
Actually, you could of course distort your image in advance, so that when it is mapped on the faces of a cube, its perspective is just right, but this requires the complete rerendering of your imagery.
With some additional "magic", we can also load still images incrementally, as needed, depending on where the user is looking and at what zoom level (not unlike Google Street View does).
In terms of what Google actually does, Bork had this right. I'm not sure of the exact details (and not sure I could release the details even if I did), but Google stores individual 360 degree streetview scenes in an equirectangular representation for serving. The flash player then uses a series of affine transformations to display the image in perspective. The affine transformations are approximate, but good enough to aggregate to a decent image overall.
The calculation of the served images is very involved, since there are many stages of image processing that have to be done, to remove faces, account for bloom, etc. etc. In terms of actually stitching the panoramas, there are many algorithms for this (wikipedia article). Just one interesting thing I'd like to point out though, as food for thought, in the 360 degree panoramas on street view, you can see the road at the bottom of the image, where there was no camera on the cars. Now that's stitching.
An expensive camera. makes
A 360 degree video
It is pretty impressive to watch a video that allows panning in every direction... which is what street view is without the bandwidth to support the full video.
For those wondering how the Google VR Photographers and editers add the ground to their Equirectangular panoramas, check out the feature called Viewpoint Correction, as seen in software like PTGui:
ptgui.com/excamples/vptutorial.html
(Note that this is NOT the software used by Google)
If you take a closer look at the ground in street view, you see that the stitching seems streched, and sometimes it even overlaps with information from the viewpoint next to the current one. (With that I mean that you can see something in one place, and suddenly that same feature is shown as the ground in the next place, revealing the technique used for the ground stitching).