I am looking for ways to crop the head and upper-body contour from a live camera feed and place it in front of a virtual background. For example, how does Zoom achieve exactly this with its Virtual Background feature?
I know OpenCV exists, but I don't know whether it only offers face detection or whether it can also help with cropping the whole head and body, including hair, shoulders, arms, etc.
I am not sure how apps like Instagram do it, but I know they have the functionality to replace the complete background of the camera feed with virtual content. I'm not sure whether they use ARKit or ARCore, but even these platforms only support detecting positions on the face, nothing for detecting the boundary of the body itself.
Appreciate any help.
Thanks,
Amit.
Apps like Instagram and Snapchat use their own custom tools to achieve that: Spark AR in Instagram's case and Lens Studio for Snapchat. I really believe they don't use ARKit or ARCore, for stability reasons.
Now, if you are building your own program to detect the face or the background, you would ideally start with OpenCV. On top of that you could use MATLAB for calculating the boundary, the head, or whatever else you want to extract.
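For reference, here is a minimal sketch of one possible OpenCV-only approach: a rough foreground extraction with GrabCut, then compositing onto a virtual background. This is not how Zoom or Instagram actually do it, and the file names and the initial rectangle below are placeholder assumptions:

```python
import cv2
import numpy as np

# Minimal sketch: rough person extraction with GrabCut, then compositing
# onto a virtual background. File names and the rectangle are assumptions.
frame = cv2.imread("frame.jpg")  # one frame grabbed from the camera feed
background = cv2.resize(cv2.imread("virtual_bg.jpg"),
                        (frame.shape[1], frame.shape[0]))

mask = np.zeros(frame.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)
fgd_model = np.zeros((1, 65), np.float64)

h, w = frame.shape[:2]
rect = (w // 4, h // 8, w // 2, (3 * h) // 4)  # assumed region containing the person

cv2.grabCut(frame, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Pixels marked as definite or probable foreground become the person mask.
person = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype("uint8")
composite = frame * person[:, :, None] + background * (1 - person[:, :, None])

cv2.imwrite("composited.jpg", composite)
```

For a live feed you would run something like this per frame (or use a proper segmentation model); GrabCut alone will be too slow and too rough for production, but it shows the crop-and-composite idea.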
I need to detect where objects (mostly people) are in relation to a wall. I can have a fixed camera in the ceiling, so I thought I would capture an image of the space with nothing in it, then take the difference between that and the current camera image to get an image containing just the objects. Then I think I can do blob detection to get the positions (I only need x).
Does this seem sound? I'm not very accomplished with OpenCV, so I am looking for some advice.
That would be one way of going about it, but it is not very robust: the video feed won't produce consistently precise images, so the background will never be subtracted out cleanly, and people walking through the scene will occlude light and could also happen to match parts of your background.
This process of removing the background from a video is simply dubbed "background subtraction", and there are built-in OpenCV methods for it.
OpenCV has tutorials on its site showing the basics, for both Python and C++.
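A minimal sketch of that idea using the built-in MOG2 subtractor, plus crude contour-based blob detection to get the x positions, is below. The camera index, blur size and minimum blob area are placeholder assumptions, and the findContours call assumes OpenCV 4.x:

```python
import cv2

# Minimal sketch of OpenCV's built-in background subtraction for a fixed
# ceiling camera. "0" is the default webcam; swap in your own source.
cap = cv2.VideoCapture(0)
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

while True:
    ok, frame = cap.read()
    if not ok:
        break

    fg_mask = subtractor.apply(frame)     # 255 where something differs from the model
    fg_mask = cv2.medianBlur(fg_mask, 5)  # knock out small noise

    # Crude blob detection via contours: report the x of each blob's centre.
    contours, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:      # hypothetical minimum blob size
            x, y, w, h = cv2.boundingRect(c)
            print("blob at x =", x + w // 2)

    cv2.imshow("foreground", fg_mask)
    if cv2.waitKey(1) == 27:              # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```

The subtractor adapts its background model over time, which already handles some of the lighting drift that a single reference image would not.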
I am still developing my sci-fi video game using my own custom game engine. Now I want to implement the combat system in my game and in the engine. While nearly everything is clear to me, I wonder how to do proper laser beams like the ones known from Star Wars, Star Trek, Babylon 5, etc.
I did some online research; however, I did not find any suitable article. I am pretty sure I searched with the wrong keywords/tags. Can you give me some hints on how to implement effects such as laser beams? I think it'd be enough to know the proper techniques or terms I need for online research...
A common way is to draw three (or more) intersecting transparent planes like this, if you excuse my crude drawing:
Each of them then bears the same laser texture that fades to black near the top and bottom edges:
If you add any subtle detail, remember to scale the texture coordinates appropriately based on the length of the beam and enable wrapping.
Finally, and most importantly, use a shader that shows only the planes facing the camera, while fading away the ones at a glancing angle to hide the fact that we're using intersecting planes and make the beam look smooth and plausible. The blending should be additive. You should also add some extra effects to the ends of the beam, again to hide the planes.
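To make the "fade at glancing angles" part concrete, here is a tiny illustrative sketch of the fade factor you would compute per plane. In a real engine this lives in the shader; the function name and the sharpness value are just assumptions:

```python
import numpy as np

# Illustrative CPU-side sketch of the per-plane fade used for the
# intersecting laser-beam planes. All names here are assumptions.
def beam_plane_alpha(plane_normal, view_dir, sharpness=2.0):
    """How strongly one of the intersecting beam planes should show.

    plane_normal: unit normal of the textured plane
    view_dir:     unit vector from the plane towards the camera
    sharpness:    how quickly glancing planes fade out
    """
    facing = abs(np.dot(plane_normal, view_dir))  # 1 = facing camera, 0 = edge-on
    return facing ** sharpness                    # use as the additive blend weight

# A plane facing the camera contributes fully, an edge-on plane almost nothing.
print(beam_plane_alpha(np.array([0.0, 0.0, 1.0]), np.array([0.0, 0.0, 1.0])))  # ~1.0
print(beam_plane_alpha(np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])))  # ~0.0
```

With additive blending, the glancing planes simply contribute nothing, so the intersections stay invisible from any viewing angle.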
I would like some advice on how to approach this problem. I am making an app where users will be retrieving photos of faces from the camera roll or a camera capture (assuming they are always portrait), and I want to make it appear as though the face images are talking (e.g. moving the pixels around the mouth up and down) using any known image-manipulation techniques. The resulting animation of the photo will appear on a separate view. I have started learning OpenGL and researched OpenCV, Core Image, GPUImage and other frameworks. I have been given a small timeframe, and my experience with graphics processing is generally limited. I would appreciate it if anybody were to instruct me on what to do using any of the frameworks or libraries I have mentioned above.
Since all you need is some animation of the image, I don't think it is a good idea to move the pixels around as you said. It's very complicated, and the result of moving the pixels around might look bad.
A much simpler approach is to use a GIF image. All you need to do is create the talking animation as a GIF and then use it in your app.
Please refer to the following question.
I would like some hints, maybe more, on detecting a custom image marker in a real-time video feed. I'm using OpenCV, iPhone and the camera feed.
By custom image marker I'm referring to a predefined image, but it can be any kind of image (not a specifically designed marker). For example, it could be a picture of some skyscrapers.
I've already worked with ARTags and understand how they are detected, but how would I detect this custom image and especially find out its position & orientation?
What makes a good custom image to be detected successfully?
Thanks
The most popular markers used in AR are:
AR markers (a simple form of QR codes) - those detected by ARToolKit & others.
QR codes. There are plenty of examples of how to create/detect/read QR codes (a minimal sketch follows this list).
Dot grids. Similar to the chessboard grids used in calibration. It seems their detection can be more robust than that of the classical chessboard grid. OpenCV has code related to dot-grid detection in its calibration module. The OpenCV codebase also offers a good starting point for extracting 3D position and orientation.
Chess grids. Similar to dot grids. They were the standard calibration pattern, and some people used them for marker detection for a long time. But they recently lost their position to dot grids, when people discovered that dots can be detected with better accuracy.
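For the QR option above, a minimal sketch with OpenCV's built-in detector (available in OpenCV 3.4.4+; the image path is a placeholder assumption) could look like this:

```python
import cv2

# Minimal sketch: detect and decode a QR marker in a single camera frame
# using OpenCV's built-in QRCodeDetector. The image path is a placeholder.
frame = cv2.imread("frame.jpg")
detector = cv2.QRCodeDetector()

data, points, _ = detector.detectAndDecode(frame)
if points is not None:
    print("decoded payload:", data)
    print("marker corners in the image:", points.reshape(-1, 2))
else:
    print("no QR marker found in this frame")
```

The returned corner points give you the marker's position in the image; combined with the known physical size of the printed code and your camera calibration, they can also feed a pose estimate.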
Note:
Grids are symmetrical. I bet you already know that, but it means you will not be able to recover full orientation data from them. You will get the plane where the grid lies, but nothing more.
Final note:
Code and examples for the first two are easily found on the Internet. They are considered the best by many people. If you decide to use the grid patterns, you have to enjoy some math and image-processing work :) and it will take more effort.
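If you do go the grid route, a minimal sketch of detecting a chessboard pattern and extracting its 3D position and orientation with solvePnP could look like the following. The pattern size, square size, camera intrinsics and image path are all placeholder assumptions; you would plug in your own calibration:

```python
import cv2
import numpy as np

# Minimal sketch: find a chessboard pattern and recover its pose with solvePnP.
# Pattern size, square size and camera matrix below are assumptions.
pattern_size = (9, 6)   # inner corners per row/column of the printed board
square_size = 0.025     # metres, assumed

# 3D coordinates of the corners in the board's own frame (z = 0 plane).
obj_points = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
obj_points[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
obj_points *= square_size

camera_matrix = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # assumed intrinsics
dist_coeffs = np.zeros(5)

gray = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2GRAY)
found, corners = cv2.findChessboardCorners(gray, pattern_size)
if found:
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001))
    ok, rvec, tvec = cv2.solvePnP(obj_points, corners, camera_matrix, dist_coeffs)
    print("rotation (Rodrigues vector):", rvec.ravel())
    print("translation:", tvec.ravel())
```

For dot grids, cv2.findCirclesGrid plays the same role as findChessboardCorners and the rest of the pipeline stays identical.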
This answer is no longer valid, since Vuforia is now a paid engine.
I think you should give Vuforia a try. It's an AR engine that can use any image you want as a marker. What makes a good marker for Vuforia is a high-frequency image.
http://www.qualcomm.com/solutions/augmented-reality
Vuforia is a free-to-use engine.
I'm working on a photo gallery *projected on a wall*, with which users should interact through gestures. The users will be standing in front of the wall projection. The user should be able to select a photo, go back to the main gallery, and perform other (unspecified) gestures.
I have programming skills in C and C++ and some knowledge of OpenGL. I have no experience with OpenCV, but I think I can use it to recognize the user's gestures.
The rough idea is to place a webcam in front of the user (above or below the projected wall rectangle) and process the video stream with OpenCV.
This may not be the best solution at all... so a lot of questions arise:
Any reference to helpful documentation?
Should I use controlled ambient lighting?
In your experience, where is the best camera position?
Might it be better to back-project the wall (I mean that the wall would not be a real wall ;-) )?
Any different (better) solution? Are there any devices to visually capture the user's gestures (like the Xbox 360, for example)?
Thanks a lot!
Massimo
I don't have much experience with human detection in OpenCV, but with any tool this is a difficult task. You didn't even specify which parts of the human body you plan to use... Do the gestures use the full body, only the arms and hands, etc.?
OpenCV ships with some predefined classifier files to detect the full human body, face, mouth, etc. (look for the dedicated .xml files in the OpenCV source code); you may want to try them.
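For instance, a minimal sketch of trying those predefined Haar cascades on a webcam feed could look like this. The cascade file names ship with OpenCV, but the camera index and detection parameters are just assumptions:

```python
import cv2

# Minimal sketch: run OpenCV's bundled Haar cascades on a webcam feed.
# cv2.data.haarcascades points at the .xml files in recent OpenCV packages.
body_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_fullbody.xml")
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    for (x, y, w, h) in body_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)   # body in green
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)   # face in blue

    cv2.imshow("detections", frame)
    if cv2.waitKey(1) == 27:   # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```

This only gives bounding boxes, not gestures, but it is a quick way to see how well the stock detectors behave with your camera placement and lighting.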
For documentation, the official OpenCV documentation is a must-see: http://opencv.willowgarage.com/documentation/cpp/index.html but of course it is very general.
Controlling the ambient light may be useful, but it depends on the methods you'll use. First find the suitable methods, and then make your choice depending on your ability to control the light. Again, the best position of the camera will depend on the methods and certainly on which parts of the human body you plan to use. Finally, keep in mind that OpenCV is not particularly fast, so you may need to use some OpenGL routines to make things faster.
If you're prepared not to use only webcams, you may want to have a look at the Kinect SDKs. The official one is only supposed to be released next spring, but you can already find stuff for Linux boxes.
have fun!