I'm working on computer screen detection using emgucv (a c# opencv wrapper ).
I want to detect my computer screnn and draw a rectangle on it.
To help in this process, I used 3 Infrared Leds on the screen of the computer which I detect firtsly and after the detection, I could find the screen areas below those 3 leds.
Here is the results after the detection of the 3 leds.
The 3 red boxes are the detected leds.
.
And in general I have something like this
Does anyone have an idea about how I can proceed to detect the whole screan area ?
This is just a suggestion but, if you know for a fact that your computer screen is below your LEDs, you could try using OpenCV GrabCut algorithm. Draw a rectangle below the LEDs, large enough to contain the screen (maybe you could guess the size from the space between the LEDs) and use it to initialize the GrabCut.
Let me know what kind of results you get.
You can try to use a camera with no IR filter(This is mostly all night vision cameras) so that you can get a more intense light from the LEDs hence making it stand out than what your display would have then its a simple blob detection to get there position.
Another solution would be using ARUCO markers on the display if the view angle you are tending to use are not very large then its should be a compelling option and even the relative position of the camera with the display can be predicted also if that is what you want. With the detection of ARUCO you can get the angles that the plane of the display is placed at hence making the estimation of the display area with them.
Related
I need to track cars on the road from top-view video.
My application contain two main parts:
Detecting cars on the frame (Tensorflow trained network)
Tracking detected cars (opencv trackers)
I have troubles with opencv trackers. Initially i tried to different trackers, but only MOSSE is fast enough. This tracker works almost perfect for case with straight road, but i faced problems with rotating cars. This situation appears on crossroads.
As i understood, bounding box of rotated object is bigger that bbox of horizontal or vertical object. As result bbox contains big part of static background and the tracker lose target object.
Are there any alternative trackers which can track contours (not bounding boxes)?
Can i adjust quality of existing opencv trackers results by any settings or by adjusting picture?
Schema:
Real image:
If your camera is stationary the following scenario is feasible:
use ‌background subtraction methods to separate background image from foreground blobs.
Improve the foreground results using morphological operations.
Detect car blobs and remove other blobs.
Track foreground blobs in video i.e. binary track (simply use this or even apply KF).
A very basic but effective approach in this scenario might be to track the center coordinates of the bounding box, if the center coordinates only change along one axis (with a small tolerance for either axis), its a linear motion (not a rotation). If both x and y change, the car is moving in the roundabout.
This only has the weakness that it will detect diagonal motion, but since you are looking at a centered roundabout, that shouldn't be an issue.
It will also be very efficient memory-wise.
You should use PCA method, which can calculate the orientation of an detected object and which way it is facing. You can change the threshold of detection to select objects more like the cars (based upon shape and colour - a HSV conversion which in your case is red) in your picture.
Link to an introduction to Principal Component Analysis (PCA)
Method 1 :
- Detect bounding boxes and subtract the background to get blobs rotated rectangles.
Method 2 :
- implement your own version of detector with rotated boxes.
Method 3 :
- Use segmentation instead ... Unet for example.
There are no other trackers than the ones found in the library.
Your best bet is to filter the image and use findcontours.
Optical flow and background subtraction will help with this. You can combine optical flow with your car detector to rule out false positives.
https://docs.opencv.org/3.4/d4/dee/tutorial_optical_flow.html
https://docs.opencv.org/3.4/d1/dc5/tutorial_background_subtraction.html
First of all I'm a total newbie in image processing, so please don't be too harsh on me.
That being said, I'm developing an application to analyse changes in blood flow in extremities using thermal images obtained by a camera. The user is able to define a region of interest by placing a shape (circle,rectangle,etc.) on the current image. The user should then be able to see how the average temperature changes from frame to frame inside the specified ROI.
The problem is that some of the images are not steady, due to (small) movement by the test subject. My question is how can I determine the movement between the frames, so that I can relocate the ROI accordingly?
I'm using the Emgu OpenCV .Net wrapper for image processing.
What I've tried so far is calculating the center of gravity using GetMoments() on the biggest contour found and calculating the direction vector between this and the previous center of gravity. The ROI is then translated using this vector but the results are not that promising yet.
Is this the right way to do it or am I totally barking up the wrong tree?
------Edit------
Here are two sample images showing slight movement downwards to the right:
http://postimg.org/image/wznf2r27n/
Comparison between the contours:
http://postimg.org/image/4ldez2di1/
As you can see the shape of the contour is pretty much the same, although there are some small differences near the toes.
Seems like I was finally able to find a solution for my problem using optical flow based on the Lukas-Kanade method.
Just in case anyone else is wondering how to implement it in Emgu/C#, here's the link to a Emgu examples project, where they use Lukas-Kanade and Farneback's algorithms:
http://sourceforge.net/projects/emguexample/files/Image/BuildBackgroundImage.zip/download
You may need to adapt a few things, e.g. the parameters for the corner detection (the frame.GoodFeaturesToTrack(..) method) , but it's definetly something to start with.
Thanks for all the ideas!
I have been looking for a bit and know that people are able to track faces with core image and openGL. However I am not that sure where to start the process of tracking a colored ball with the iOS camera.
Once I have a lead to being able to track the ball. I hope to create something to detect. when the ball changes directions.
Sorry I don't have source code, but I am unsure where to even start.
The key point is image preprocessing and filtering. You can use the Camera API-s to get the video stream from the camera. Take a snapshot picture from it, then you should use a Gaussian-blur on it (spatial enhance), then a Luminance Average Threshold Filter (to make black and white image). After that a morphological preprocessing should be wise (opening, closing operators), to hide the small noises. Then an Edge detection algorithm (with for example a Prewitt-operator). After these processes only the edges remain, your ball should be a circle (when the recording environment was ideal) After that you can use a Hough-transform to find the center of the ball. You should record the ball position and in the next frame, the small part of the picture can be processed (around the ball only).
Other keyword could be: blob detection
A fast library for image processing (on GPU with openGL) is Brad Larsons: GPUImage library https://github.com/BradLarson/GPUImage
It implements all the needed filter (except Hough-transformation)
The tracking process can be defined as following:
Having the initial coordinate and dimensions of an object with a given visual characteristics (image features)
In the next video frame, find the same visual characteristics near the coordinate of the last frame.
Near means considering basic transformations related to the last frame:
translation in each direction;
scale;
rotation;
The variation of these tranformations are strictly related with the frame rate. Higher the frame rate, nearest the position will be in the next frame.
Marvin Framework provides plug-ins and examples to perform this task. It's not compatible with iOs yet. However, it is open source and I think you can port the source code easily.
This video demonstrates some tracking features, starting at 1:10.
I am working on a multi-view telepresence project using an array of kinect cameras.
To improve the visual quality we want to extract the foreground, e.g. the person standing in the middle using the color image, the color image and not the depth image because we want to use the more reliable color image to repair some artefacts in the depth image.
The problem now is that the foreground objects (usually 1-2 persons) are standing in front of a huge screen showing another party, which is also moving all the time, of the telepresence system and this screen is visible for some of the kinects. Is it still possible to extract the foreground for these kinects and if so, could you point me in the right direction?
Some more information regarding the existing system:
we already have a system running that merges the depth maps of all the kinects, but that only gets us so far. There are a lot of issues with the kinect depth sensor, e.g. interference and distance to the sensor.
Also the color and depth sensor are slightly shifted, so when you map the color (like a texture) on a mesh reconstructed using the depth data you sometimes map the floor on the person.
All these issues decrease the overall quality of the depth data, but not the color data, so one could view the color image silhouette as the "real" one and the depth one as a "broken" one. But nevertheless the mesh is constructed using the depth data. So improving the depth data equals improving the quality of the system.
Now if you have the silhouette you could try to remove/modify incorrect depth values outside of the silhouette and/or add missing depth values inside
Thanks for every hint you can provide.
In my experience with this kind of problems the strategy you propose is not the best way to go.
As you have a non-constant background, the problem you want to solve is actually 2D segmentation. This is a hard problem, and people are typically using depth to make segmentation easier and not the other way round. I would try to combine / merge the multiple depth maps from your Kinects in order to improve your depth images, maybe in a Kinect fusion kind of way, or using classic sensor fusion techniques.
If you are absolutely determined to follow your strategy, you could try to use your imperfect depth maps to combine the RGB camera images of the Kinects in order to reconstruct a complete view of the background (without occlusion by the people in front of it). However, due to the changing background image on the screen, this would require your Kinects' RGB cameras to by synchronized, which I think is not possible.
Edit in the light of comments / updates
I think exploiting your knowledge of the image on the screen is your only chance of doing background subtraction for silhouette enhancement. I see that this is a tough problem as the sceen is a stereoscopic display, if I understand you correctly.
You could try to compute a model that describes what your Kinect RGB cameras see (given the stereoscopic display and their placement, type of sensor etc) when you display a certain image on your screen, essentially a function telling you: Kinect K sees (r, g, b) at pixel (x, y) when I show (r',g',b') at pixel (x',y') on my display. To do this you will have do create a sequence of calibration images which you show on the display, without a person standing in front of it, and film with the Kinect. This would allow you to predict the appearance of your screen in the Kinect cameras, and thus compute background subtraction. This is a pretty challenging task (but it would give a good research paper if it works).
A side note: You can easily compute the geometric relation of a Kinect's depth camera to its color camera, in order to avoid mapping the floor on the person. Some Kinect APIs allow you to retrieve the raw image of the depth camera. If you cover the IR projector you can film a calibration pattern with both, depth and RGB camera, and compute an extrinsic calibration.
I want to make an apps detect an square/rectangle in my webcam using EMGU CV (an OPENCV wrapper). The square/rectangle will have a solid color.
if it's posible I would like to obtain the width and heigth of the square/rectangle
In this video you can see what I would like to do.
http://www.youtube.com/watch?v=ytvO2dijZ7A&NR=1
I'm working with C#
If you already know the color of the desired object then you can segment the image based on that color. (Which may be why the rectangle disapears when the guy movies the direction to and away from the camera [differences in lighting]. Once you have the object segmented out of the image you can do region calculations on the image. [In matlab think regionprops]
Once you have the blob you can attempt to do model fitting to get a good approximation of the object being represented.
In the video link provided what is probably being done is Surf feature detection. Take a look at the SURFFeture example that ships with EMGU. Rather than drawing lines in this case however the four corner points are detected and a shape drawn on top. Similar examples which will help you are ShapeDetection and TrafficSignRecognition both in the EMGU.CV.Examples folder. ShapeDetection will teach you how to classify the square and the StopSignDetector.cs class will show you another example of how to apply a surf feature detection algorithm.
It will require a little reconfiguration but if you get stuck feel free to ask another question.
Cheers
Chris