I want to create a site that takes a set of source images and builds a spherical panorama from them. I plan to use an existing library, but I do not know which one offers this feature. I looked at OpenCV, but I could not tell whether it can create spherical panoramas from a set of photos. Perhaps someone has experience with this.
Related
I want to create a panoramic photo app or something that can stitch multiple photos together (much like Google Photo Sphere), but before I start I want to get a bit more information on how it is done.
Is it done using the UIImagePickerController framework?
Are there any other useful APIs or anything else out there I can use?
Can somebody give me a brief overview of how this works?
There is no native API with a stitching algorithm available. You should dig into the third-party OpenCV library and check its Stitcher documentation.
Basic key steps of the stitching algorithm (a minimal OpenCV sketch follows the list):
Detect keypoints in each input image (e.g. Harris corners) and extract invariant descriptors (e.g. SIFT)
Match the descriptors between images
Using RANSAC, estimate the homography matrix and apply the transformation
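For a quick experiment, OpenCV's high-level Stitcher class wraps exactly these steps (feature detection, matching, robust estimation, warping and blending). A minimal sketch, assuming OpenCV 3.x or 4.x and a set of overlapping input photos passed on the command line:

```cpp
#include <opencv2/opencv.hpp>
#include <opencv2/stitching.hpp>
#include <iostream>
#include <vector>

int main(int argc, char** argv) {
    // Load the overlapping input images passed as command-line arguments.
    std::vector<cv::Mat> images;
    for (int i = 1; i < argc; ++i)
        images.push_back(cv::imread(argv[i]));

    // PANORAMA mode assumes the camera rotated roughly about its optical centre,
    // which is what a photo-sphere style capture approximates.
    cv::Ptr<cv::Stitcher> stitcher = cv::Stitcher::create(cv::Stitcher::PANORAMA);

    cv::Mat pano;
    cv::Stitcher::Status status = stitcher->stitch(images, pano);
    if (status != cv::Stitcher::OK) {
        std::cerr << "Stitching failed, error code " << int(status) << std::endl;
        return 1;
    }
    cv::imwrite("panorama.jpg", pano);
    return 0;
}
```

Under the hood this runs the keypoint/descriptor/RANSAC pipeline described above and then warps the images onto a common surface (a spherical one by default in panorama mode) before blending.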
My project is Face Authentication.
System description: My input is only one image (taken when the user logs in for the first time), and using that image the system should authenticate the user whenever they log in to the application. The authentication images may differ from the first input image in, for example, illumination conditions, distance from the camera, and -10 to 10 degrees of variation in pose. The same camera (e.g. an iPad) is used in all cases.
1) Authentication images are stored each time the user logs in. How can I make use of these images to enhance the accuracy of the system?
2) When a new image comes in, I need to select the closest image(s) (and not all stored images) from the image repository and use them for authentication, to reduce the time. How can I label an image by illumination/distance from the camera automatically?
3) How should I make my system perform decently under changes in illumination and distance from the camera?
Please, can anyone suggest good algorithms/papers/open-source code for the questions above?
Though it sounds like a research project, I would be extremely grateful for any response.
For this task I think you should take a look at OpenCV's Face Recognition API. The API is basically able to identify the structure of a face (within certain limitations, of course) and provide you with the coordinates of the region of the image in which the face appears.
Having to deal with just the face, in my opinion, removes the need to deal with different backgrounds, which is something you do not really need.
Once you have the image of the face, you could scale it up or down to a uniform size and also convert the image to greyscale. Lastly, I would consider feeding all this information to an artificial neural network, since these are able to deal with inconsistencies in the input. This would allow you to grow your knowledge base each time a user logs in.
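A rough sketch of that crop/normalise step, assuming OpenCV and the haarcascade_frontalface_default.xml file that ships with it (the file paths and the 100x100 target size are just example choices):

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>

// Detect the largest face, crop it, and normalise size and colour.
// Returns an empty Mat if no face is found.
cv::Mat preprocessFace(const cv::Mat& input, cv::CascadeClassifier& detector) {
    cv::Mat grey;
    cv::cvtColor(input, grey, cv::COLOR_BGR2GRAY);

    std::vector<cv::Rect> faces;
    detector.detectMultiScale(grey, faces, 1.1, 3);
    if (faces.empty())
        return cv::Mat();

    // Keep the largest detection, assuming it is the user in front of the camera.
    cv::Rect best = *std::max_element(faces.begin(), faces.end(),
        [](const cv::Rect& a, const cv::Rect& b) { return a.area() < b.area(); });

    cv::Mat face = grey(best).clone();
    cv::resize(face, face, cv::Size(100, 100));  // uniform size for the ANN/classifier
    cv::equalizeHist(face, face);                // crude illumination normalisation
    return face;
}

int main() {
    cv::CascadeClassifier detector("haarcascade_frontalface_default.xml");
    cv::Mat img = cv::imread("login_photo.jpg");
    cv::Mat face = preprocessFace(img, detector);
    if (!face.empty())
        cv::imwrite("face_100x100.png", face);
    return 0;
}
```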
I'm pretty sure there are other ways to go about this. I would recommend taking a look at Google Scholar to try to find papers which deal with this matter, for more information and quite possibly other ways to achieve what you are after. Also, keep in mind that with some luck you might find an open-source project which already does most of what you are after.
If you really have a database of photographs of faces, you could probably use it to enhance the features of OpenCV face detection. The way faces are recognized is by comparing the principal components of the picture with those of the face examples in the OpenCV database.
Check out:
How to create Haar Cascade (xml) for using with OpenCV?
Building on that, you could also try to do your own Principal Component Analysis on every picture of a recognized face (use OpenCV face detection for that: black out everything except the face; OpenCV gives you the position and size of the face). Compare the principal components to the ones in your database and match the image to the closest one. Of course, this would work best with a fairly big database, so at the beginning there could be wrong matches.
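A rough sketch of that PCA-and-compare idea, assuming the faces have already been detected, cropped to the same size and converted to greyscale (plain cv::PCA is used here rather than OpenCV's ready-made Eigenfaces recogniser; the file names, the 20 retained components and any acceptance threshold are placeholder choices):

```cpp
#include <opencv2/opencv.hpp>
#include <cfloat>
#include <iostream>
#include <string>
#include <vector>

// Flatten equally sized greyscale face images into rows of a single float matrix.
static cv::Mat asRows(const std::vector<cv::Mat>& faces) {
    cv::Mat data(static_cast<int>(faces.size()),
                 static_cast<int>(faces[0].total()), CV_32F);
    for (size_t i = 0; i < faces.size(); ++i)
        faces[i].reshape(1, 1).convertTo(data.row(static_cast<int>(i)), CV_32F);
    return data;
}

int main() {
    // Hypothetical file names: cropped, greyscale, equally sized face images
    // saved at earlier logins, plus the face captured at the current login.
    std::vector<cv::Mat> stored;
    for (int i = 0; i < 5; ++i)
        stored.push_back(cv::imread("stored_face_" + std::to_string(i) + ".png",
                                    cv::IMREAD_GRAYSCALE));
    cv::Mat current = cv::imread("current_face.png", cv::IMREAD_GRAYSCALE);

    // Build the eigenspace from the stored faces (20 components is an arbitrary choice).
    cv::Mat data = asRows(stored);
    cv::PCA pca(data, cv::Mat(), cv::PCA::DATA_AS_ROW, 20);

    // Project everything into the eigenspace and keep the closest stored face.
    cv::Mat currentProj = pca.project(asRows({current}));
    double bestDist = DBL_MAX;
    int bestIdx = -1;
    for (int i = 0; i < data.rows; ++i) {
        double d = cv::norm(pca.project(data.row(i)), currentProj, cv::NORM_L2);
        if (d < bestDist) { bestDist = d; bestIdx = i; }
    }
    // Accept the login if bestDist is below a threshold tuned on your own data.
    std::cout << "closest stored image: " << bestIdx
              << ", distance: " << bestDist << std::endl;
    return 0;
}
```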
I think creating your own OpenCV haarcascade would be the best way to go.
Good Luck!
I would like some hints, maybe more, on detecting a custom image marker in a real-time video feed. I'm using OpenCV, iPhone and the camera feed.
By custom image marker I'm referring to a predefined image, but it can be any kind of image (not a specifically designed marker). For example, it could be a picture of some skyscrapers.
I've already worked with ARTags and understand how they are detected, but how would I detect this custom image and especially find out its position & orientation?
What makes a good custom image to be detected successfully?
Thanks
The most popular markers used in AR are:
AR markers (a simple form of QR code) - those detected by ARToolKit and others.
QR codes. There are plenty of examples of how to create/detect/read QR codes.
Dot grids. Similar to the chessboard grids used in calibration. It seems their detection can be more robust than that of the classical chessboard grid. OpenCV has code for dot-grid detection in its calibration module. The OpenCV codebase also offers a good starting point for extracting 3D position and orientation (see the sketch after these notes).
Chessboard grids. Similar to dot grids. They were the standard calibration pattern, and some people used them for marker detection for a long time. But they recently lost ground to dot grids, when some people discovered that dots can be detected with better accuracy.
Note:
Grids are symmetrical. I bet you already know that, but it means you will not be able to recover full orientation data from them. You will get the plane where the grid lies, but nothing more.
Final note:
Code and examples for the first two are easily found on the Internet, and they are considered the best by many people. If you decide to use the grid patterns, you have to enjoy some math and image processing work :) and it will take more time.
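If you go with one of the grid patterns, a rough sketch of the detect-then-pose step using OpenCV's calibration functions could look like the following (the grid size, square size and camera intrinsics below are placeholders; in practice the intrinsics come from a prior calibration):

```cpp
#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>

int main() {
    cv::Mat image = cv::imread("frame.png", cv::IMREAD_GRAYSCALE);

    // Inner-corner layout of the printed chessboard (placeholder size).
    cv::Size patternSize(9, 6);
    std::vector<cv::Point2f> corners;
    bool found = cv::findChessboardCorners(image, patternSize, corners);
    // For a dot grid you would call cv::findCirclesGrid instead.
    if (!found) return 1;

    // 3D coordinates of the grid corners in the grid's own plane (z = 0),
    // using an arbitrary 2.5 cm square size.
    std::vector<cv::Point3f> objectPoints;
    for (int y = 0; y < patternSize.height; ++y)
        for (int x = 0; x < patternSize.width; ++x)
            objectPoints.push_back(cv::Point3f(x * 0.025f, y * 0.025f, 0.0f));

    // Intrinsics from a prior calibration (placeholder values).
    cv::Mat K = (cv::Mat_<double>(3, 3) << 800, 0, 320,
                                           0, 800, 240,
                                           0,   0,   1);
    cv::Mat distCoeffs = cv::Mat::zeros(5, 1, CV_64F);

    // Rotation and translation of the grid relative to the camera.
    cv::Mat rvec, tvec;
    cv::solvePnP(objectPoints, corners, K, distCoeffs, rvec, tvec);

    std::cout << "rvec: " << rvec << "\ntvec: " << tvec << std::endl;
    return 0;
}
```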
This answer is no longer valid, since Vuforia is now a paid engine.
I think you should give Vuforia a try. It's an AR engine that can use any image you want as a marker. What makes a good marker for Vuforia is an image with plenty of high-frequency detail.
http://www.qualcomm.com/solutions/augmented-reality
Vuforia is a free-to-use engine.
I'm working on a photo gallery *projected on a wall*, with which users should interact through gestures. The users will be standing in front of the wall projection. A user should be able to select one photo, go back to the main gallery, and perform other (unspecified) gestures.
I have programming skills in C and C++ and some knowledge of OpenGL. I have no experience with OpenCV, but I think I can use it to recognize the users' gestures.
The rough idea is to place a webcam in front of the user (above or below the wall projection) and process the video stream with OpenCV.
This may not be the best solution at all... so a lot of questions arise:
Any references to helpful documentation?
Should I use a controlled lighting environment?
In your experience, where is the best camera position?
Might it be better to back-project onto the wall (I mean that the wall would not be a real wall ;-) )?
Any different (better) solutions? Are there any devices that can visually capture the user's gestures (like the Xbox 360, for example)?
Thanks a lot!
Massimo
I don't have much experience with human detection in OpenCV, but with any tool this is a difficult task. You didn't even specify which parts of the human body you plan to use... Do the gestures use the full body, only arms and hands, etc.?
OpenCV ships with some predefined classifier files to detect the full human body, face, mouth, etc. (look for the dedicated .xml files in the OpenCV source code); you may want to try them.
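As a starting point for trying those cascades on a live feed, here is a minimal sketch, assuming a webcam at index 0 and the haarcascade_fullbody.xml file shipped with OpenCV (the detection parameters are just initial guesses, and a Haar cascade alone will only give you rough body positions, not gestures):

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    // Cascade file shipped with OpenCV; the exact path depends on your installation.
    cv::CascadeClassifier body("haarcascade_fullbody.xml");
    cv::VideoCapture cam(0);                    // webcam facing the user
    if (body.empty() || !cam.isOpened()) return 1;

    cv::Mat frame, grey;
    while (cam.read(frame)) {
        cv::cvtColor(frame, grey, cv::COLOR_BGR2GRAY);
        cv::equalizeHist(grey, grey);           // helps under uneven projection light

        // 1.1 scale step and 3 minimum neighbours are just starting values.
        std::vector<cv::Rect> people;
        body.detectMultiScale(grey, people, 1.1, 3);

        for (const cv::Rect& r : people)
            cv::rectangle(frame, r, cv::Scalar(0, 255, 0), 2);

        cv::imshow("detections", frame);
        if (cv::waitKey(1) == 27) break;        // Esc to quit
    }
    return 0;
}
```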
For documentation, the official OpenCV documentation is a must-see: http://opencv.willowgarage.com/documentation/cpp/index.html but of course, it is very general.
Controlling the ambient light may be useful, but it depends on the methods you'll use. First find suitable methods, then make your choice depending on how well you can control the light. Again, the best camera position will depend on the methods and certainly on which parts of the human body you plan to use. Finally, keep in mind that OpenCV is not particularly fast, so you may need to use some OpenGL routines to make things faster.
If you're prepared to use more than just webcams, you may want to have a look at the Kinect SDKs. The official one is only supposed to be released next spring, but you can find stuff for Linux boxes already.
have fun!
How does Google Maps do its panoramas in Street View?
Yeah, I know it's Flash, but how do they skew bitmaps with correct texture mapping?
Are they doing it at the pixel level like most Flash 3D engines, or just applying some tricky transformation to the bitmaps in the MovieClips?
Flash Panorama Player can help achieve a similar result!
It uses six cube-face images stitched together seamlessly with some 'magic' ActionScript.
Also see flashpanos.com for plugins, and tutorials with (possibly) documentation.
A quick guide to shooting panoramas so you can view them with FPP (Flash Panorama Player).
Cubic projection cube faces are actually 90x90 degree rectilinear images, like the ones you get from a normal camera lens. ~ What is VR Photography?
Check out http://www.panoguide.com/. They have how-tos, links to software, etc.
Basically there are two components in the process: the stitching software, which creates a single panoramic photo from many separate image sources, and the panoramic viewer, which distorts the image as you change your point of view to simulate what your eyes would see if you were actually there.
My company uses the Papervision3D Flash render engine and maps a panoramic image (still image or video) onto a 3D sphere. We found that using a spherical object with about 25 divisions along both axes gives a much better visual result than mapping the same image onto the six faces of a cube. Check it for yourself at http://www.panocast.com.
Actually, you could of course distort your image in advance, so that when it is mapped onto the faces of a cube its perspective is just right, but this requires completely re-rendering your imagery.
With some additional "magic", we can also load still images incrementally, as needed, depending on where the user is looking and at what zoom level (not unlike Google Street View does).
In terms of what Google actually does, Bork had this right. I'm not sure of the exact details (and not sure I could release the details even if I did), but Google stores individual 360 degree streetview scenes in an equirectangular representation for serving. The flash player then uses a series of affine transformations to display the image in perspective. The affine transformations are approximate, but good enough to aggregate to a decent image overall.
The calculation of the served images is very involved, since there are many stages of image processing that have to be done to remove faces, account for bloom, etc. In terms of actually stitching the panoramas, there are many algorithms for this (see the Wikipedia article). Just one interesting thing I'd like to point out, though, as food for thought: in the 360 degree panoramas on Street View, you can see the road at the bottom of the image, where there was no camera on the cars. Now that's stitching.
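To make the viewer side of this concrete, here is a rough sketch (not Google's actual code) of how an equirectangular panorama can be rendered into a perspective view: for each output pixel, build a ray for the current yaw and pitch, convert it to longitude/latitude, and sample the panorama there. OpenCV's remap is used only for convenience; the field of view, viewing angles and output size are arbitrary:

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>

int main() {
    cv::Mat pano = cv::imread("equirectangular.jpg");   // width:height roughly 2:1

    const int outW = 800, outH = 600;
    const double fov = 90.0 * CV_PI / 180.0;   // horizontal field of view
    const double yaw = 0.3, pitch = -0.1;      // current viewing direction (radians)
    const double f = (outW / 2.0) / std::tan(fov / 2.0);  // pinhole focal length

    cv::Mat mapX(outH, outW, CV_32F), mapY(outH, outW, CV_32F);
    for (int y = 0; y < outH; ++y) {
        for (int x = 0; x < outW; ++x) {
            // Ray through this output pixel in camera coordinates (z forward, y down).
            double vx = x - outW / 2.0;
            double vy = y - outH / 2.0;
            double vz = f;
            // Rotate the ray by pitch (around x) and then yaw (around y).
            double ry = vy * std::cos(pitch) - vz * std::sin(pitch);
            double rz = vy * std::sin(pitch) + vz * std::cos(pitch);
            double rx = vx * std::cos(yaw) + rz * std::sin(yaw);
            rz = -vx * std::sin(yaw) + rz * std::cos(yaw);
            // Convert the ray to longitude/latitude and then to panorama pixel coords.
            double lon = std::atan2(rx, rz);                           // -pi..pi
            double lat = std::atan2(ry, std::sqrt(rx * rx + rz * rz)); // -pi/2..pi/2
            mapX.at<float>(y, x) = float((lon / CV_PI + 1.0) * 0.5 * pano.cols);
            mapY.at<float>(y, x) = float((lat / CV_PI + 0.5) * pano.rows);
        }
    }

    cv::Mat view;
    cv::remap(pano, view, mapX, mapY, cv::INTER_LINEAR);
    cv::imwrite("perspective_view.jpg", view);
    return 0;
}
```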
An expensive camera that makes a 360 degree video.
It is pretty impressive to watch a video that allows panning in every direction... which is what Street View is, without the bandwidth to support the full video.
For those wondering how the Google VR photographers and editors add the ground to their equirectangular panoramas, check out the feature called Viewpoint Correction, as seen in software like PTGui:
ptgui.com/excamples/vptutorial.html
(Note that this is NOT the software used by Google)
If you take a closer look at the ground in Street View, you see that the stitching seems stretched, and sometimes it even overlaps with information from the viewpoint next to the current one. (By that I mean that you can see something in one place, and suddenly that same feature is shown as the ground in the next place, revealing the technique used for the ground stitching.)