I want to track the relative position of a camera aimed at a computer screen.
I can’t control what is displayed on the computer screen but I can receive screen dumps whenever something changes on the screen. Those screen dumps can hopefully be used to find the screen when analyzing the video from the camera.
I see many videos on YouTube about tracking faces, logos or single-colored objects with OpenCV, but I’m unsure whether those methods would work for finding and tracking a more detailed image like a screen dump.
Maybe Template Matching is the way to go? But I need to find the screen even at an angle.
Basically I don’t know where to begin and need help from people with experience in this field to find the best way for achieving what I want.
Thanks
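Feature matching (SIFT/SURF/ORB/...) should do the trick. Match keypoints between the screen dump and the camera frame, then estimate a homography from the matches; that locates the screen even when it is viewed at an angle.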
Related
Not sure where to ask this. Please redirect me if SO is not the place.
I want to make a web app that accurately tracks pose in a stationary video of someone pedaling a stationary bike. The joints can be marked with stickers to make the process easier and more accurate. Basically, I want to do what this app does.
First I tried markerless tracking using pose estimation models such as MediaPipe's BlazePose and Google's MoveNet. However, these are not accurate enough. I would also like to track some additional landmarks (ball of the foot,...).
Then I tried OpenCV.js's Lucas-Kanade optical flow method, but the algorithm lost the tracked point quickly, even when I placed colored tape on the part of the body that I wanted to track.
I also tried template matching a single marked point in OpenCV, but it was not very robust, and it would probably not work well with more markers.
What other methods can I try? Since the app I linked the video of requires stickers to be placed, I thought it was using something like Lucas-Kanade. But as I said, when I tried it, it wasn't able to track the marked point. Because the app is only on iOS, I thought it might be using this API. However, this is only my speculation.
Edit: added example video: https://www.youtube.com/watch?v=eCNyyABfWSE
I tried shooting in slow motion to get more fps, but the quality suffered because of this. Also, I didn't have blue or green tape, so I had to use yellow, which is not very visible on the sweater or on my wrist. But the markers on the pants should be trackable, right?
I need to detect whether the user is looking down or up at the iPhone screen. For example, the user may have the iPhone on a desk and look down at the screen, or may hold it over their head to take a photo. How can I tell these cases apart?
Are there any sensors we need to use?
No sensor in the iPhone directly detects where your eyes are looking, but you can use the front camera and machine learning to achieve this. For more, refer to recognize gaze direction.
I want to make a movable camera that tracks an open hand (facing the floor). It only needs to track the open hand, but it also has to know the hand's 2D rotation.
This is what I have looked into so far:
Contour- As the camera is movable, the background is unknown, and even the lighting is not fixed. It's hard for me to get a clear hand segment in real time.
Haar- It seems this just returns a rect and can't deal with rotation.
Feature detect- A hand doesn't have enough detail for this.
I am using the OpenCV Unity plugin to do this.
EDIT
https://www.codeproject.com/Articles/826377/Rapid-Object-Detection-in-Csharp
I see another library can do something like this. Can OpenCV also do this?
I am currently working on an iOS app that can take a picture programmatically using AVFoundation libraries like AVCaptureDevice through a custom button.
The new requirement is that the camera should automatically take a picture when the camera session detects something specific. For example, if the camera is open, and I line up an apple to fill a certain circle part of the capture screen, it should take the picture automatically. We can see this auto capture feature in some banking apps when you submit a mobile check deposit.
Does anyone know of existing libraries (open-source or proprietary) that can analyze images in real time while a user is taking a picture?
The first thing you are going to need to do is decide how you want to detect the apple. You can do this using shape detection, image recognition, or various other methods. This is important because you need to know the approach you want to take before you can identify the best way to implement it.
Once you know how you are going to identify the apple, the easiest way to do real-time image processing like this would be to use an existing augmented reality SDK. For example:
http://www.wikitude.com/products/wikitude-sdk/
http://artoolkit.org/
https://developer.vuforia.com/
If you are feeling really adventurous you could roll your own using AForge or a similar library. I have taken this approach in the past for basic shape detection projects.
Edit
The reason I suggest using an existing AR SDK is because generally they provide a lot of the glue between the camera feed and their API for you and it takes a lot of leg work out of the equation. Even though you won't be using any of the actual "augmentation" part of their SDKs, you can still take advantage of the detection part.
No matter what approach you take, you can think about it in the simplest terms: looking at a picture and figuring out whether the item you want is in that picture. How do you decide? In most cases you look for a specific shape or pattern.
I want to build an iPad app that detects a physical letter shape placed on the iPad screen and, after processing the object detection, prints that letter to the screen. Is this doable?
I am trying to find a way to implement this, but could not find any article or online resource that guides me.
Thanks,
I would imagine you could start by looking at the various pens and styluses that are available for iPads. Look at how they work. Then you would need to see if you can make an object that activates the touch mechanism over a defined area in the same way (for example, a line) and see if you can detect the touch points along the line. Sorting all that out will effectively get you started.
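Once touch points can be read, checking whether they lie along a line is straightforward. The sketch below is plain Python rather than iOS code; the function name `touches_form_line` and the tolerance are hypothetical, and the points are assumed to be (x, y) screen coordinates reported by the touch system.

```python
import math


def touches_form_line(points, tolerance=2.0):
    """True if the (x, y) touch points lie within `tolerance` of one line (RMS)."""
    n = len(points)
    if n < 2:
        return False
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # The smaller eigenvalue of the 2x2 covariance matrix is the mean squared
    # perpendicular distance from the points to their best-fit line.
    smallest = (sxx + syy) / 2 - math.hypot((sxx - syy) / 2, sxy)
    return math.sqrt(max(smallest, 0.0)) <= tolerance
```

A letter-shaped object would presumably produce a more complex pattern of contact points than a single line, so this only covers the first experiment suggested above.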