I downloaded a Kinect sensor dataset (depth as a text file, plus color images) because a Kinect is expensive, and I don't know how to proceed with the dataset. I have to extract the hand from the image. I can't use the Kinect SDK because it only works when a Kinect sensor is connected, so I decided to extract the hand from the image using image processing. Can anyone please suggest an algorithm for that, or can I extract the hand by other means?
Thanks in advance.
Color image and depth information can be used together for hand detection. I think you can treat the skin region nearest to the camera as the hand, because in the dataset the hand is placed in front of the body; a sketch of this follows.
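A minimal OpenCV sketch of that idea, assuming the depth text file parses into a 2D array that is pixel-aligned with the color image; the file names, depth units, and skin thresholds are placeholders to tune for your dataset:

```python
import cv2
import numpy as np

# Placeholder file names; the depth text file is assumed to parse into a
# 2D array that is pixel-aligned with the color image.
color = cv2.imread("frame_color.png")
depth = np.loadtxt("frame_depth.txt")  # assumed depth in mm, 0 = no reading

# 1) Skin mask in YCrCb space (common heuristic thresholds).
ycrcb = cv2.cvtColor(color, cv2.COLOR_BGR2YCrCb)
skin = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))

# 2) Restrict depth to skin pixels that have a valid reading.
skin_depth = np.where((skin > 0) & (depth > 0), depth, np.inf)

# 3) Find the nearest skin pixel and keep everything within a small
#    depth band around it -- since the hand is in front of the body,
#    that band should contain the hand.
nearest = skin_depth.min()
band = 100  # mm of tolerance; tune for your data
hand_mask = ((skin_depth - nearest) < band).astype(np.uint8) * 255

cv2.imwrite("hand_mask.png", hand_mask)
```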
In such tasks, I tend to use Mediapipe or Dlib to detect the landmarks of the face and get the specific coordinates I'm interested in working with.
But when the human face is seen in profile view, Dlib can't detect anything, and Mediapipe shows me a standard 3D face mesh superimposed on top of the 2D image, which gives false coordinates.
I was wondering if anyone with computer vision (image processing) knowledge can guide me on how to detect the coordinates of points A and B in this image.
PS: The color of the background changes, and the face location is not standard.
Thanks in advance.
Your question seems a little unclear. If you just want (x, y) screen coordinates, you can use this answer to convert the (x, y, z) that Mediapipe gives you into just (x, y). If that doesn't work for you, I would recommend this repo or this one, both of which only work with 68 facial landmarks, but that should be sufficient for your use case.
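For reference, Mediapipe's face-mesh landmarks come out normalized to [0, 1], so getting pixel (x, y) is just a scale by the image size. A minimal sketch (the landmark index below is only an example, since I don't know which points your A and B are):

```python
import cv2
import mediapipe as mp

image = cv2.imread("face.jpg")
h, w = image.shape[:2]

with mp.solutions.face_mesh.FaceMesh(static_image_mode=True) as face_mesh:
    results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_face_landmarks:
    landmarks = results.multi_face_landmarks[0].landmark
    # Landmark coordinates are normalized; scale by image size for pixels.
    lm = landmarks[1]  # index 1 (nose tip) used only as an example
    x_px, y_px = int(lm.x * w), int(lm.y * h)
    print(x_px, y_px)
```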
If all of this fails, I would recommend retraining HRNet on a dataset with profile views. I believe either the 300-W or the 300-VW dataset provides some data with heads at extreme angles.
If you wish to get the 3D coordinates in camera coordinates (X, Y, Z), you're going to have to use solvePnP. This will require calibration info and model mesh points for whatever facial landmark detector you use. You can find some examples of this here.
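A minimal solvePnP sketch along those lines. The 6-point 3D face model (a commonly used head-pose approximation) and the 2D pixels below are illustrative placeholders, and the camera matrix is the usual no-calibration guess (focal length ≈ image width, principal point at the image center):

```python
import cv2
import numpy as np

# A commonly used 6-point 3D face model (nose tip, chin, eye corners,
# mouth corners), in arbitrary model units.
model_points = np.array([
    (0.0, 0.0, 0.0),           # nose tip
    (0.0, -330.0, -65.0),      # chin
    (-225.0, 170.0, -135.0),   # left eye, left corner
    (225.0, 170.0, -135.0),    # right eye, right corner
    (-150.0, -150.0, -125.0),  # left mouth corner
    (150.0, -150.0, -125.0),   # right mouth corner
])

# Matching 2D pixels from your landmark detector (placeholder values).
image_points = np.array([
    (359, 391), (399, 561), (337, 297),
    (513, 301), (345, 465), (453, 469),
], dtype=np.float64)

# Rough pinhole intrinsics when no calibration is available.
w, h = 800, 600
camera_matrix = np.array([[w, 0, w / 2],
                          [0, w, h / 2],
                          [0, 0, 1]], dtype=np.float64)
dist_coeffs = np.zeros((4, 1))  # assume no lens distortion

ok, rvec, tvec = cv2.solvePnP(model_points, image_points,
                              camera_matrix, dist_coeffs)
print("rotation:", rvec.ravel(), "translation:", tvec.ravel())
```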
I'm trying to get a point cloud from a 32-bit color depth image from a HoloLens, but I'm having a hard time because I don't have much information about it. Do I have to have the camera parameters to get a point cloud from the depth image? Is there a way to convert it with PCL or OpenCV?
I've added some comments and a picture. I can finally get the point cloud from the HoloLens depth image, but I converted the 32-bit depth image to grayscale, and I get the impression that the lens sensors alone have a lot of distortion. To compensate for this, I think I need to find a way to undistort and filter the depth image.
Do you have any other information about this?
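On the camera-parameter question: yes, you need the intrinsics (fx, fy, cx, cy) to back-project depth pixels into 3D. A minimal pinhole back-projection sketch, with made-up intrinsics and a random array standing in for the real HoloLens calibration and depth image:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) into an N x 3 point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.dstack((x, y, z)).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no depth reading

# Made-up intrinsics and depth; substitute the HoloLens calibration values
# and your decoded depth image.
depth = np.random.uniform(0.5, 3.0, (480, 640))
cloud = depth_to_point_cloud(depth, fx=525.0, fy=525.0, cx=320.0, cy=240.0)
print(cloud.shape)
```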
I want to send my own images to a Kinect SDK (either OpenNI or the Kinect for Windows SDK) and have it tell me the position of a user's hand, head, and so on.
I don't want to use the Kinect's camera feed. The images come from a paper; I want to do some image processing on them, and I need to work on exactly those images, so I can't use my own body as input to the Kinect camera.
I don't care whether it's Microsoft's Kinect SDK or OpenNI; it just needs to be able to take my RGB and depth images as input instead of the Kinect camera's.
Is it possible? If so, how can I do it?
I'd like to second this question. I want a Kinect face-detection app that reads images from the hard drive and returns the Animation Units of the recognized face. I want to train a classifier for facial emotion recognition using the Animation Units as input features.
Thanks,
Daniel.
I need to reconstruct a depth map from an image sequence taken by a single static camera of a moving object.
As far as I understand, I can calculate the depth of a point found in two images from a stereo camera using the intercept theorem. Is there any way to calculate depth information using only a single camera and matching points across multiple images instead?
Any comments and alternative solutions are welcome. Thanks in advance for your help!
There are some algorithms which help you get depth from a single image. A list of them is available here: http://make3d.cs.cornell.edu/results_stateoftheart.html
These techniques use MRFs and assume that the scene is made up of a collection of planes.
A moving object does not by itself provide any information about depth (unless you already know the depth of some other moving object); however, a single moving camera, for example one rotating around the scene, can help in extracting depth, as sketched below.
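A minimal two-view structure-from-motion sketch with OpenCV, assuming a static scene, a moving camera, and a guessed camera matrix (real calibration is needed for accurate results); the file names are placeholders:

```python
import cv2
import numpy as np

# Two frames from the sequence and a rough camera matrix.
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)
K = np.array([[700, 0, 320], [0, 700, 240], [0, 0, 1]], dtype=np.float64)

# 1) Match features between the two views.
orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
pts1 = np.float64([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float64([kp2[m.trainIdx].pt for m in matches])

# 2) Recover the relative camera pose from the essential matrix.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

# 3) Triangulate matched points into 3D; depth is known only up to scale.
P1 = K @ np.hstack((np.eye(3), np.zeros((3, 1))))
P2 = K @ np.hstack((R, t))
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
depth = (pts4d[:3] / pts4d[3])[2]  # Z coordinates, arbitrary scale
print(depth[:10])
```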
I'm trying to use OpenCV to detect an IR point using my laptop's built-in camera. My camera can see infrared light; however, I don't have a clue how to distinguish between visible light and IR light.
After the transformation to RGB we can't distinguish them, but maybe OpenCV has some methods for this.
Does anybody know of such OpenCV functions? Or how to do it some other way?
--edit
Is it possible to recognise, for example, the wavelength of light using a laptop's built-in camera? Or is it just impossible to distinguish between visible and infrared light without a special camera?
You wouldn't be able to do anything in OpenCV, because by the time OpenCV gets to work on the image, it will just be another RGB image like the visible light (you sort of mention this).
You say your camera can see infrared... Does this mean it has a filter which separates IR light from visible light? In that case, by the time you have your image inside OpenCV you would already be looking only at IR, and you could then look at intensities, etc.
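If the camera really is delivering an IR-only image, detecting the IR point comes down to finding the brightest blob. A minimal sketch (the threshold value is a guess to tune for your camera):

```python
import cv2

frame = cv2.imread("ir_frame.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# An IR emitter typically saturates the sensor, so a high fixed threshold
# isolates it from ambient light.
_, mask = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
if contours:
    c = max(contours, key=cv2.contourArea)
    M = cv2.moments(c)
    if M["m00"]:  # guard against degenerate one-pixel contours
        cx, cy = int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"])
        print("IR point at", cx, cy)
```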
In your setting, assuming you have an RGB + IR camera, your camera will probably deliver these three channels:
R + IR
G + IR
B + IR
So it would be difficult to identify IR pixels directly from the image. But nothing is impossible: R, G, B and IR are broad bands, so information about all wavelengths is present in the channels.
One thing you can do is train a classification model to classify pixels in an image as IR or non-IR, using lots of image data with pre-determined classes. With that model trained, you could identify the IR pixels of a new image; a sketch follows.
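A minimal sketch of that idea with scikit-learn, where the hypothetical .npy files stand in for labeled per-pixel training data you would collect yourself:

```python
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical training data: per-pixel (R+IR, G+IR, B+IR) features with
# manual 0 = non-IR / 1 = IR labels gathered from annotated images.
X_train = np.load("pixel_features.npy")  # shape (N, 3), placeholder file
y_train = np.load("pixel_labels.npy")    # shape (N,),  placeholder file

clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)

# Classify every pixel of a new image and reshape back into a mask.
image = cv2.imread("new_frame.png")
ir_mask = clf.predict(image.reshape(-1, 3)).reshape(image.shape[:2])
```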
There is no way to separate IR from visible light in software, because your camera in effect "transforms" IR light into light that is visible to your eyes.
I assume the only way to solve this would be to use two cameras: one IR camera with an IR-transmitting filter and one normal camera with an IR-blocking filter. Then you can register the two images and pull out the information you need, as sketched below.
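A rough sketch of that two-camera idea using ECC image registration; the file names are hypothetical, and ECC assumes the IR and visible views share enough intensity structure to align (a feature-based homography is an alternative if they don't):

```python
import cv2
import numpy as np

# Hypothetical file names: the same scene, one camera behind an IR-pass
# filter, one behind an IR-blocking filter.
ir = cv2.imread("ir_pass.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
vis = cv2.imread("ir_blocked.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Estimate an affine warp registering the IR frame to the visible frame
# (reasonable if the two cameras sit close together).
warp = np.eye(2, 3, dtype=np.float32)
_, warp = cv2.findTransformECC(vis, ir, warp, cv2.MOTION_AFFINE)
aligned_ir = cv2.warpAffine(ir, warp, (vis.shape[1], vis.shape[0]),
                            flags=cv2.WARP_INVERSE_MAP)

# After registration, bright regions in the IR-pass frame are IR-only by
# construction: the blocking filter removed them from the visible frame.
_, ir_only = cv2.threshold(aligned_ir, 200, 255, cv2.THRESH_BINARY)
```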