Slideshow with gesture interaction using OpenCV

I'm working on a photo gallery *projected on a wall*, which users should interact with through gestures. The users will be standing in front of the wall projection. A user should be able to select one photo, go back to the main gallery, and perform other (unspecified) gestures.
I have programming skills in C and C++ and some knowledge of OpenGL. I have no experience with OpenCV, but I think I can use it to recognize the users' gestures.
The rough idea is to place a webcam facing the user (above or below the projected rectangle) and process the video stream with OpenCV.
This may not be the best solution at all... so a lot of questions arise:
Any reference to helpful documentation?
Should I use a controlled lighting environment?
In your experience where is the best camera position?
Might it be better to back-project onto the wall (I mean that the wall will not be a real wall ;-) )?
Any different (better) solutions? Are there any devices that can visually capture the user's gestures (like the Xbox 360, for example)?
Thanks a lot!
Massimo

I don't have much experience with human detection in OpenCV, but with any tool this is a difficult task. You didn't even specify which parts of the human body you plan to use... Do the gestures involve the full body, only arms and hands, etc.?
OpenCV has some predefined classifiers to detect the full human body, face, mouth, etc. (look for the dedicated .xml files in the OpenCV source code); you may want to try them.
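For illustration, here is a minimal sketch (using the current OpenCV C++ API, and assuming the bundled upper-body cascade file and camera index 0) of loading one of those .xml cascades and running it on webcam frames:

```cpp
#include <opencv2/objdetect.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/highgui.hpp>
#include <vector>

int main() {
    // One of the cascades shipped in OpenCV's data/haarcascades directory
    // (the path and the choice of cascade are assumptions for this sketch).
    cv::CascadeClassifier body;
    if (!body.load("haarcascade_upperbody.xml")) return 1;

    cv::VideoCapture cam(0);            // webcam facing the user
    cv::Mat frame, gray;
    while (cam.read(frame)) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::equalizeHist(gray, gray);   // helps under uneven lighting

        std::vector<cv::Rect> hits;
        body.detectMultiScale(gray, hits, 1.1, 3);

        for (const auto& r : hits)
            cv::rectangle(frame, r, cv::Scalar(0, 255, 0), 2);

        cv::imshow("detections", frame);
        if (cv::waitKey(1) == 27) break;  // Esc to quit
    }
    return 0;
}
```

The same loop works with the face or full-body cascades; which one is usable depends on how much of the user your camera actually sees.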
For documentation, the official OpenCV documentation is a must see: http://opencv.willowgarage.com/documentation/cpp/index.html but of course, it is very general.
Controlling the ambient light may be useful, but it depends on the methods you'll use. First, find the suitable methods, and make your choice depending on your ability to control the light. Again, the best position of the camera will depend on the methods and, of course, on which parts of the human body you plan to use. Finally, keep in mind that OpenCV is not particularly fast, so you may need to use some OpenGL routines to make things faster.
If you're prepared not to use only webcams, you may want to have a look at the Kinect SDKs. The official one is only supposed to be released next spring, but you can already find stuff for Linux boxes.
have fun!

Related

I want to detect motion/movement in the live camera. How can I do it?

I'm creating a motion detection app for iOS. When the camera is live and any object passes in front of it, like a person or an animal, I want to detect that motion. How is this possible?
I suggest you get familiar with the AVFoundation framework to understand how to get live video frames using the camera of an iOS device. A good starting point is Apple's famous sample AVCam, which should get you familiar with all the camera concepts.
As the next step, figure out how to do the movement detection. The simplest algorithm for that would be background subtraction. The idea is to subtract two consecutive frames from one another. The areas without movement cancel each other out and become black, while the areas with movement show some nonzero values.
Here's an example of background subtraction in the OpenCV framework.
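For a rough idea of the frame-differencing variant described above (this is not the linked example, just a sketch in OpenCV C++ with an arbitrary threshold and trigger level you would have to tune):

```cpp
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <opencv2/highgui.hpp>

int main() {
    cv::VideoCapture cam(0);
    cv::Mat prev, curr, gray, diff, mask;

    cam.read(curr);
    cv::cvtColor(curr, prev, cv::COLOR_BGR2GRAY);

    while (cam.read(curr)) {
        cv::cvtColor(curr, gray, cv::COLOR_BGR2GRAY);

        cv::absdiff(gray, prev, diff);                           // per-pixel |frame_t - frame_(t-1)|
        cv::threshold(diff, mask, 25, 255, cv::THRESH_BINARY);   // 25 is a tunable guess

        // Fraction of pixels that changed between the two frames.
        double movingFraction = cv::countNonZero(mask) / double(mask.total());
        if (movingFraction > 0.01)                               // assumed trigger level
            cv::putText(curr, "motion", {20, 40}, cv::FONT_HERSHEY_SIMPLEX, 1.0, {0, 0, 255}, 2);

        cv::imshow("motion mask", mask);
        cv::imshow("camera", curr);
        if (cv::waitKey(1) == 27) break;

        gray.copyTo(prev);                                       // current frame becomes the reference
    }
    return 0;
}
```

OpenCV also ships proper background-subtraction models (e.g. cv::createBackgroundSubtractorMOG2) that adapt to slow lighting changes, which tend to work better than a raw two-frame difference.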
If, in the end, you decide to use OpenCV (which is a classic computer vision framework that I definitely recommend), then you'll need to integrate OpenCV into your iOS app. You can see a short tutorial here.
I tried to show you some pointers which could get you going. The problem (how you presented it) is definitely not an easy one, so good luck!

ARKit and Unity - How can I detect the act of hitting the AR object by a real world object from the camera?

Think if someone in real life waved their hand and hit the 3D object in AR, how would I detect that? I basically want to know when something crosses over the AR object so I can know that something "hit" it and react.
Another example would be to place a virtual bottle on the table and then wave your hand in the air where the bottle is and then it gets knocked over.
Can this be done? If so how? I would prefer unity help but if this can only be done via Xcode and ARKit natively, I would be open to that as well.
ARKit does solve a ton of issues with AR and makes them a breeze to work with. Your issue just isn't one of them.
As @Draco18s notes (and emphasizes well with the xkcd link 👍), you've perhaps unwittingly stepped into the domain of hairy computer vision problems. You have some building blocks to work with, though: ARKit provides pixel buffers for each video frame, and the projection matrix needed for you to work out what portion of the 2D image is overlaid by your virtual water bottle.
Deciding when to knock over the water bottle is then a problem of analyzing frame-to-frame differences over time in that region of the image. (And tracking that region's movement relative to the whole camera image, since the user probably isn't holding the device perfectly still.) The amount of analysis required varies depending on the sophistication of the effect you want... a simple pixel diff might work (for some value of "work"), or there might be existing machine learning models that you could put together with Vision and Core ML...
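As a rough illustration of the "simple pixel diff" idea, in OpenCV C++ rather than Unity or Swift, and assuming you have already projected the bottle's bounds into a 2D rectangle in the camera image (the helper name and threshold are made up for this sketch):

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// Returns true when the region covered by the virtual object changed enough
// between two consecutive camera frames to count as a "hit".
// `objectRegion` is assumed to come from projecting the bottle's bounding box
// with the frame's projection matrix; the threshold is a tunable guess.
bool objectWasHit(const cv::Mat& prevGray, const cv::Mat& currGray,
                  const cv::Rect& objectRegion, double threshold = 20.0) {
    cv::Rect roi = objectRegion & cv::Rect(0, 0, currGray.cols, currGray.rows); // clip to image
    if (roi.area() == 0) return false;

    cv::Mat diff;
    cv::absdiff(currGray(roi), prevGray(roi), diff);   // per-pixel change inside the region
    double meanChange = cv::mean(diff)[0];             // average intensity difference
    return meanChange > threshold;
}
```

You would call this once per frame with grayscale copies of the previous and current camera images, and debounce the result so a single noisy frame doesn't knock the bottle over.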
You should take a look at ManoMotion: https://www.manomotion.com/
They're working on this problem and are supposed to release a solution in the form of a library soon.

Detecting whether someone is speaking in a video

I'm trying to figure out how to detect whether a human that I've identified in a video is speaking. I'm using some of the multi-person multi-camera tracking code posted here to detect individuals and I want to determine whether someone identified is speaking at any given time. Is anyone aware of good CV projects that might be able to do this? I've trawled around the action recognition literature a bit but haven't found anything that seems to directly address this. Detection of speaking needs to be done only with video.
There is an implementation of face pose estimation in an open source library.
As you can see from this figure, there are lines drawn around the lips. By digging into the source code of the example, you can track the movement of the lips; if you run the example in your own environment, you will see that the lines covering the lips move along with the lips.
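Assuming the library referred to here is dlib with its 68-point facial landmark model (the points drawn around the lips are indices 48-67), a minimal sketch of measuring the mouth opening for one frame could look like this; the model file and input image are placeholders:

```cpp
#include <dlib/image_processing/frontal_face_detector.h>
#include <dlib/image_processing.h>
#include <dlib/image_io.h>
#include <vector>

// Returns the vertical gap between the inner lips for the first face found,
// or -1 if no face is detected.
double mouthOpening(const dlib::array2d<unsigned char>& frame,
                    dlib::frontal_face_detector& detector,
                    dlib::shape_predictor& landmarks) {
    std::vector<dlib::rectangle> faces = detector(frame);
    if (faces.empty()) return -1.0;

    dlib::full_object_detection shape = landmarks(frame, faces[0]);
    // In the 68-point model, 62 is the top inner lip and 66 the bottom inner lip.
    return dlib::length(shape.part(62) - shape.part(66));
}

int main() {
    dlib::frontal_face_detector detector = dlib::get_frontal_face_detector();
    dlib::shape_predictor landmarks;
    // Pre-trained model distributed with dlib's examples.
    dlib::deserialize("shape_predictor_68_face_landmarks.dat") >> landmarks;

    dlib::array2d<unsigned char> frame;
    dlib::load_image(frame, "frame.png");   // one frame of the tracked person (assumed input)
    double gap = mouthOpening(frame, detector, landmarks);
    return gap < 0 ? 1 : 0;
}
```

Tracking that gap over consecutive frames (large, rapid fluctuations versus a nearly constant value) gives a crude visual indicator of whether the tracked person is speaking.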

How to detect a side-view left ear, nose, and mouth in an iOS application using OpenCV?

I need help with face profiling through image in an iOS application.
I am trying to detect the left ear, nose and mouth in a given image. So far I have tried OpenCV; I found Viola's Haar classifiers, but they do not detect the left ear.
I need to perform this detection without going to server/online.
Is OpenCV a good choice for this? Any sample code you can share to achieve this functionality would be great.
What can be other choices to achieve this functionality?
I think using only part templates (e.g., Viola's Haar classifiers) will not work in your case. The parts you want to detect are very small and will be fully or partially occluded most of the time. My suggestion would be to use methods based on graphical models, i.e., active appearance models, pictorial structures, etc. This will not only allow you to exploit spatial constraints (e.g., the mouth should always be below the nose), but also works when one or a few of the parts are occluded. You could probably start with the following publicly available codes:
http://cmp.felk.cvut.cz/~uricamic/flandmark/index.php#structclass
http://www.iai.uni-bonn.de/~gall/projects/facialfeatures/facialfeatures.html
Both codes are in C++ and will allow you to detect facial parts, but I think ears are not included in either. Maybe you can try adding additional parts by slightly modifying the source code, and also training your own part templates for the missing parts.
PS. I am not an iOS developer, so I am not sure whether iOS can afford such models, but on normal computers they run sufficiently close to real time for normal-size images.

Augmented Reality with large and complex markers

Does anyone have any experience with using large and complex images as markers (e.g. a magazine layout, photo, or text layout) for AR?
I am not sure which way to go:
Flash, Papervision and FLAR would be nice for distribution, but I suspect their performance is too poor for a more complex marker than the usual 9x9 or 12x12 blocks. I had difficulties achieving both good 3D performance and smooth, solid detection.
I can also do Java or Objective-C with OpenGL/OpenCV, and this is definitely also an option for this project.
I would just like to know beforehand if anyone has experience in this field and could give me a few hints or warnings. I know it has been done already, so there is a way to do it smoothly.
thanks,
anton
It sounds like you might want to start investigating natural feature tracking libraries. In general the tracking is smoother and more robust than with markers, and any feature-rich natural image can be used as the marker. The downside is, I'm not aware of any non-proprietary solutions.
Metaio Unifeye works in a web browser via Flash, if I recall correctly; something like that might be what you're looking for.
You should look at MOPED.
MOPED is a real-time Object Recognition and Pose Estimation system. It recognizes objects from point-based features (e.g. SIFT, SURF) and their geometric relationships extracted from rigid 3D models of objects.
See this video for a demonstration.
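If you want to experiment with point-based features yourself, here is a minimal OpenCV C++ sketch that matches a complex marker image (e.g. a magazine page) against a camera frame using ORB, a free alternative to the patented SIFT/SURF, and estimates a homography; the filenames and the minimum-match cutoff are assumptions:

```cpp
#include <opencv2/features2d.hpp>
#include <opencv2/calib3d.hpp>
#include <opencv2/imgcodecs.hpp>
#include <vector>

int main() {
    cv::Mat marker = cv::imread("magazine_page.png", cv::IMREAD_GRAYSCALE);  // assumed marker image
    cv::Mat frame  = cv::imread("camera_frame.png", cv::IMREAD_GRAYSCALE);   // assumed camera frame

    cv::Ptr<cv::ORB> orb = cv::ORB::create(1000);        // up to 1000 keypoints per image
    std::vector<cv::KeyPoint> kpMarker, kpFrame;
    cv::Mat descMarker, descFrame;
    orb->detectAndCompute(marker, cv::noArray(), kpMarker, descMarker);
    orb->detectAndCompute(frame,  cv::noArray(), kpFrame,  descFrame);

    // Brute-force Hamming matching with cross-check to discard weak matches.
    cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(descMarker, descFrame, matches);
    if (matches.size() < 15) return 1;                    // assumed minimum for a reliable estimate

    std::vector<cv::Point2f> src, dst;
    for (const auto& m : matches) {
        src.push_back(kpMarker[m.queryIdx].pt);
        dst.push_back(kpFrame[m.trainIdx].pt);
    }
    // Homography mapping the marker into the frame; RANSAC rejects outlier matches.
    cv::Mat H = cv::findHomography(src, dst, cv::RANSAC, 3.0);
    return H.empty() ? 1 : 0;
}
```

If the homography is found, you can use its inliers (or decompose it with a calibrated camera) to place your 3D content over the marker.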
