How to track an open hand in any environment with an RGB camera? - opencv

I want to make a movable camera that tracks an open hand (toward the floor). It only needs to track the open hand, but it also has to know the rotation (2D rotation in the image plane).
This is what I have looked into so far:
Contours - As the camera is movable, the background is unknown and even the lighting is not fixed, so it is hard for me to get a clean hand segmentation in real time.
Haar cascades - It seems this only returns an axis-aligned rect and can't deal with rotation.
Feature detection - A hand doesn't have enough detail for this.
I am using the OpenCV plugin for Unity to do this.
EDIT
I found another library that can do something like this: https://www.codeproject.com/Articles/826377/Rapid-Object-Detection-in-Csharp
Can OpenCV also do this?
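For what it's worth, a minimal sketch of one way to get the 2D rotation once the hand blob is segmented: OpenCV's minAreaRect returns a rotated bounding box together with its angle. The Python below assumes a crude HSV skin-colour threshold (the range and the area cutoff are guesses that need tuning, and this segmentation step is exactly what is hard with a moving camera and changing light); the question uses the OpenCV plugin for Unity, but the same calls exist there.

```python
import cv2
import numpy as np

# Hypothetical HSV skin-colour range; varies with lighting and skin tone.
SKIN_LO = np.array([0, 40, 60], dtype=np.uint8)
SKIN_HI = np.array([25, 180, 255], dtype=np.uint8)

def hand_rotation(frame_bgr):
    """Return ((cx, cy), angle_deg) of the largest skin-coloured blob, or None."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, SKIN_LO, SKIN_HI)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

    # OpenCV 4.x return signature (contours, hierarchy).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)
    if cv2.contourArea(hand) < 2000:          # ignore small blobs; threshold is a guess
        return None

    # minAreaRect gives ((cx, cy), (w, h), angle): a rotated box whose
    # angle is the in-plane rotation of the blob, in degrees.
    (cx, cy), (_w, _h), angle = cv2.minAreaRect(hand)
    return (cx, cy), angle

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = hand_rotation(frame)
    if result is not None:
        (cx, cy), angle = result
        cv2.putText(frame, f"angle: {angle:.1f}", (int(cx), int(cy)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    cv2.imshow("hand", frame)
    if cv2.waitKey(1) & 0xFF == 27:           # Esc quits
        break
cap.release()
cv2.destroyAllWindows()
```

Note that the angle from minAreaRect is only determined up to the rectangle's symmetry, so a full 2D orientation would need an extra cue (for example the direction from the palm centre towards the fingers).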

Related

Methods to track marked points in a stationary video?

Not sure where to ask this. Please redirect me if SO is not the place.
I want to make a web app that accurately tracks pose in a stationary video of someone pedaling a stationary bike. The joints can be marked with stickers to make the process easier and more accurate. Basically, I want to do what this app does.
First I tried markerless tracking using pose-estimation models such as MediaPipe's BlazePose and Google's MoveNet. However, these are not accurate enough. I would also like to track some additional landmarks (ball of the foot, ...).
Then I tried OpenCV.js's Lucas-Kanade optical flow method, but the algorithm lost the tracked point quickly, even when I placed colored tape on the part of the body that I wanted to track.
I also tried template matching a single marked point in OpenCV, but it was not very robust, and it would probably not work well when using more markers.
What other methods can I try? Since the app I send the video to requires stickers to be placed, I thought it was using something like Lucas-Kanade, but as I said, when I tried it, it wasn't able to track the marked point. Because the app is only available on iOS, I thought it may be using this API. However, this is only my speculation.
Edit: added an example video: https://www.youtube.com/watch?v=eCNyyABfWSE
I tried shooting in slow motion to get more FPS, but the quality suffered because of this. Also, I didn't have blue or green tape, so I had to use yellow, which is not very visible on the sweater or on my wrist. But the markers on the pants should be trackable, right?
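One more option worth trying, not something confirmed about the app in question: instead of tracking points frame-to-frame with optical flow, detect the coloured markers independently in every frame by HSV thresholding and take blob centroids, so a marker that is lost for a few frames is simply re-detected in the next one. A minimal Python/OpenCV sketch of that idea (the post uses OpenCV.js, where the same calls exist); the yellow HSV range, the minimum blob area, and the input file name are assumptions to tune.

```python
import cv2
import numpy as np

# Hypothetical HSV range for yellow tape; tune it against the actual footage.
YELLOW_LO = np.array([20, 80, 80], dtype=np.uint8)
YELLOW_HI = np.array([35, 255, 255], dtype=np.uint8)

def find_markers(frame_bgr, min_area=30):
    """Detect yellow-marker centroids in one frame; no frame-to-frame state."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, YELLOW_LO, YELLOW_HI)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))

    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    centroids = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue
        m = cv2.moments(c)
        centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centroids

cap = cv2.VideoCapture("pedaling.mp4")        # placeholder file name
while True:
    ok, frame = cap.read()
    if not ok:
        break
    for x, y in find_markers(frame):
        cv2.circle(frame, (int(x), int(y)), 6, (0, 0, 255), 2)
    cv2.imshow("markers", frame)
    if cv2.waitKey(1) & 0xFF == 27:
        break
cap.release()
cv2.destroyAllWindows()
```

The detections can then be associated with joints across frames by nearest neighbour to each marker's previous position, which is what makes per-frame detection more forgiving than Lucas-Kanade when a marker briefly blurs or leaves view.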

Detecting a real-world object using ARKit on iOS

I am currently playing a bit with ARKit. My goal is to detect a shelf and draw stuff onto it.
I already found ARReferenceImage, and that basically works for a very, very simple prototype, but it seems the image needs to be quite complex? Xcode always complains if I try to use something a lot simpler (like a QR-code-like image). With that marker I would know the position of an edge, and then I'd know the physical size of my shelf and how to place stuff into it. So that would be OK, but I think small and simple markers will not work, right?
But ideally I would not need a marker at all.
I know that I can detect e.g. planes, but I want to detect the shelf itself. But as my shelf is open, it's not really a plane. Are there other possibilities to find an object using ARKit?
I know that my question is very vague, but maybe somebody could point me in the right direction, or tell me whether that's even possible with ARKit or whether I need other tools, like Unity?
There are several different possibilities for positioning content in augmented reality. They are called content anchors, and they are all subclasses of the ARAnchor class.
Image anchor
Using an image anchor, you would stick your reference image on a pre-determined spot on the shelf and position your 3D content relative to it.
"it seems the image needs to be quite complex? Xcode always complains if I try to use something a lot simpler (like a QR-code-like image)"
That's correct. The image needs to have enough visual detail for ARKit to track it. Something like a simple black and white checkerboard pattern doesn't work very well. A complex image does.
Object anchor
Using object anchors, you scan the shape of a 3D object ahead of time and bundle this data file with your app. When a user uses the app, ARKit will try to recognise this object and if it does, you can position your 3D content relative to it. Apple has some sample code for this if you want to try it out quickly.
Manually creating an anchor
Another option would be to enable ARKit plane detection, and have the user tap a point on the horizontal shelf. Then you perform a raycast to get the 3D coordinate of this point.
You can create an ARAnchor object using this coordinate, and add it to the ARSession.
Then you can again position your content relative to the anchor.
You could also implement a drag gesture to let the user fine-tune the position along the shelf's plane.
Conclusion
Which one of these placement options is best for you depends on the use case of your app. I hope this answer was useful :)
References
There are a lot of informative WWDC videos about ARKit. You could start off by watching this one: https://developer.apple.com/videos/play/wwdc2018/610
It is absolutely possible. Whether you do this in Swift or Unity depends entirely on what you are comfortable working in.
ARKit calls these object anchors (https://developer.apple.com/documentation/arkit/arobjectanchor). In other implementations they are often called mesh or model targets.
This YouTube video shows what you want to do in Swift.
But objects like a shelf might be hard to recognize since their content often changes.

OpenCV Colour Detection Error

I am writing a script on the Raspberry Pi to detect the majority colour featured in a frame of a webcam, and I seem to be having an issue. The following image is me holding up my phone with a blank red image on it. I seem to be getting an orange colour instead.
Now, when I angle the phone, I do in fact get the expected red colour.
I am not sure why this is the case.
I am using a Logitech C920 webcam that emits a blue light when activated, and I also have the monitor running. I am wondering whether the light from these two is causing this issue, and whether, when I angle the phone, these lights are no longer hitting it head-on and thus no longer distorting the image.
I am still not heavily experienced in this area, so I would enjoy hearing explanations and possible workarounds for my problem.
Thanks
There are a few things that can mess this up:
As you already mention, the light from the monitor and the camera.
The iPhone screen is a display, so flicker and sync might also come into play.
Reflection from the iPhone screen.
If your camera has automatic control for exposure and color balance etc., the picture quality can change as you move around.
I suggest using a colored piece of non-glossy paper so that you can remove the iPhone display's effects.
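As a possible software-side workaround on top of that, and only a sketch rather than a fix for the asker's actual script: classifying the dominant colour by hue in HSV space, while ignoring dark and unsaturated pixels, tends to be less sensitive to the brightness shifts that the monitor, the camera LED, and auto exposure introduce than averaging raw BGR values. The thresholds and hue bins below are guesses to tune.

```python
import cv2
import numpy as np

def dominant_colour(frame_bgr):
    """Name the dominant hue in a frame, ignoring near-grey and dark pixels."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    _h, s, v = cv2.split(hsv)

    # Only pixels that are saturated and bright enough carry a meaningful hue.
    valid = ((s > 60) & (v > 60)).astype(np.uint8) * 255
    if cv2.countNonZero(valid) == 0:
        return "unknown"

    # Histogram of the hue channel (OpenCV hue range is 0-179).
    hist = cv2.calcHist([hsv], [0], valid, [180], [0, 180])
    hue = int(np.argmax(hist))

    # Coarse, hand-picked hue bins; red wraps around the ends of the range.
    if hue < 10 or hue >= 170:
        return "red"
    if hue < 25:
        return "orange"
    if hue < 35:
        return "yellow"
    if hue < 85:
        return "green"
    if hue < 130:
        return "blue"
    return "purple"

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    print(dominant_colour(frame))
cap.release()
```

Locking the webcam's auto exposure and auto white balance (the C920 exposes both as UVC/V4L2 controls) also usually helps keep the reading stable while you move things around.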

Xtion Vertical position usage (calibration) with Skeleton

I am currently working with the Xtion Pro Live using the OpenNI library.
The Xtion must be placed vertically (along a wall). The problem is that in this position the user calibration always fails, so it is impossible to get the skeleton info.
So, I would like to know how to fix this issue; I suppose there is something I didn't understand about GetSkeletonCap().RequestCalibration() or about the SampleConfig.xml file. After a lot of research, however, I am still stuck.
Try moving the user, and then the camera, in a 360-degree circle around the subject, keeping the vertical positioning of the camera the same all the way through. It may detect the optimal angle for the depth sensor. We did this twice with the Kinect and it worked.
Also make sure the room is well lit.

Image tracking - tracking a screen with a camera

I want to track the relative position of a camera aimed at a computer screen.
I can’t control what is displayed on the computer screen but I can receive screen dumps whenever something changes on the screen. Those screen dumps can hopefully be used to find the screen when analyzing the video from the camera.
I see many videos on YouTube for tracking faces, logos, or single-colored objects using OpenCV, but I'm unsure whether those methods would work for finding and tracking a more detailed image like a screen dump.
Maybe Template Matching is the way to go? But I need to find the screen even at an angle.
Basically I don’t know where to begin and need help from people with experience in this field to find the best way for achieving what I want.
Thanks
Using feature matching should do the trick (SIFT/SURF/ORB/...).
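To make that a bit more concrete, here is a minimal Python/OpenCV sketch of the idea: extract ORB features from the most recent screen dump, match them against the camera frame, and estimate a homography with RANSAC; the homography is what lets you locate the screen even when it is viewed at an angle. File names and thresholds are placeholders.

```python
import cv2
import numpy as np

def locate_screen(screen_dump_gray, camera_frame_gray, min_matches=15):
    """Return the screen's four corners in the camera frame, or None."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(screen_dump_gray, None)
    kp2, des2 = orb.detectAndCompute(camera_frame_gray, None)
    if des1 is None or des2 is None:
        return None

    # Brute-force Hamming matcher with Lowe's ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good.append(pair[0])
    if len(good) < min_matches:
        return None

    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    # RANSAC homography tolerates outlier matches and perspective (angled view).
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        return None

    h, w = screen_dump_gray.shape
    corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(corners, H)

# Placeholder usage: "dump.png" is the latest screen dump, "frame.png" a camera frame.
dump = cv2.imread("dump.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
quad = locate_screen(dump, frame)
if quad is not None:
    print("screen corners in camera frame:", quad.reshape(4, 2))
```

Plain template matching, by contrast, is not perspective-invariant, which is why it struggles once the screen is at an angle.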
