Real-Time Multi-Object Tracking with Learning [closed] - opencv

My goal is to have a real-time multi-object tracker with learning. I used a Kalman filter to track an object, but I found errors in the estimation while tracking: the object was not tracked continuously. I want to implement some learning mechanism along with the tracking.
One way I thought of doing this (see the sketch below):
1) Calculate the average HSV of a particular ROI, then store that HSV value in a vector (Scalar or Vec3b).
2) Compare the new HSV value (averaged over some ROI) with all previous HSV values stored in the vector.
3) If the new HSV value does not match any HSV value in the vector, track it as a new, separate object.
4) Otherwise, if the new ROI matches an HSV value in the vector, treat it as the same object and continue tracking the old object.
5) Periodically check the vector and remove old HSV values.
I tried KCF, MIL, etc., but they are not real-time. Can you recommend any real-time learning mechanism, or ways to improve the one proposed above?
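Here is a minimal Python/OpenCV sketch of the HSV-matching steps 1)-4) above; the Euclidean distance, the tolerance tol, and the exponential update of the stored signature are assumptions, and hue wrap-around is ignored for simplicity:

    import cv2
    import numpy as np

    def mean_hsv(frame_bgr, roi):
        # roi = (x, y, w, h); returns the average HSV colour inside the ROI
        x, y, w, h = roi
        patch = cv2.cvtColor(frame_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
        return patch.reshape(-1, 3).mean(axis=0)

    def match_or_register(hsv, known, tol=20.0):
        # Compare the new signature with the stored ones; return the index of the
        # matched object, or register the signature as a new object.
        for i, ref in enumerate(known):
            if np.linalg.norm(hsv - ref) < tol:
                known[i] = 0.9 * ref + 0.1 * hsv  # slowly adapt the stored appearance
                return i
        known.append(hsv)
        return len(known) - 1

A colour average alone is a weak appearance model, so in practice you would combine it with the Kalman prediction and gate matches by both position and colour.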

Related

Is it possible to gather the image matrix of a pygame gameboard without rendering the image to the screen? [closed]

I wrote a 2D simulation (very similar to the Atari OpenAI games) in pygame, which I need for a reinforcement learning project. I'd like to train a neural network using mainly image data, i.e. screenshots of the pygame gameboard.
I am able to make those screenshots, but:
- Is it possible to gather this image data, or more precisely the corresponding RGB image matrix, without rendering the whole playing ground to the screen?
As far as I can tell this is possible in pyglet, but I would like to avoid rewriting the whole simulation.
Basically, yes. You don't have to actually draw anything to the screen surface.
Once you have a Surface, you can use methods like get_at, the PixelArray class, or the surfarray module to access the RGB(A) values of each pixel.
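A minimal sketch of the off-screen approach, assuming an 84x84 board (the size and the drawing are placeholders):

    import numpy as np
    import pygame

    pygame.init()
    # A plain off-screen Surface: nothing here is ever shown in a window
    board = pygame.Surface((84, 84))
    board.fill((0, 0, 0))
    pygame.draw.circle(board, (255, 0, 0), (42, 42), 10)

    # Copy the pixels into a numpy array; surfarray returns shape (width, height, 3)
    frame = pygame.surfarray.array3d(board)
    frame = np.transpose(frame, (1, 0, 2))  # reorder to (height, width, 3) for most ML libraries
    print(frame.shape)                      # (84, 84, 3)

Note that Surface.convert() and a few other calls do require a display mode to be set, so keep the simulation's drawing code to plain Surface operations if you want to stay fully off-screen.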

why do we need object detection algorithms like yolo while we have image segmentation algorithms like deeplab_V3+ which do the same? [closed]

I am working on image segmentation and object detection, and I think they do the same thing (they both localize and recognize objects in the raw picture). Is there any benefit in using object detection at all, since deeplab_V3+ does the job with better performance than any other object detection algorithm?
You can look at the deeplab_V3+ demo here.
In object detection, the method localizes and classifies objects in the image via bounding box coordinates. In image segmentation, the model additionally recovers the exact boundaries of each object, which usually makes it somewhat slower. They both have their uses. In many applications (e.g. face detection) you only want to detect certain objects and don't necessarily care about their exact boundaries, while in others (e.g. medical imaging) you want the exact boundary of, say, a tumor. We can also compare the effort required to prepare the data for each task:
classification: we only provide a label for each image
localization: we provide a bounding box (4 values) for each image
detection: we provide a bounding box and a label for each object
segmentation: we define the exact boundary (a per-pixel mask) of each object (semantic segmentation)
So segmentation requires more work, both in annotating the data and in training an (encoder-decoder) model; which task to use depends on your purpose (a toy illustration of these annotation formats follows).
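To make the difference in labelling effort concrete, here is a toy illustration (the file name, label and box values are made up):

    # Hypothetical annotation records for the same image
    classification_label = {"image": "img_001.jpg", "label": "cat"}

    detection_label = {
        "image": "img_001.jpg",
        "objects": [{"label": "cat", "bbox": [34, 50, 120, 180]}],  # x, y, w, h
    }

    # Semantic segmentation: one class id per pixel (a 4x4 toy mask; 0 = background, 1 = cat)
    segmentation_mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]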

The drawbacks of the LBP image feature extraction method [closed]

The LBP feature has the drawback that it is not very robust on flat image areas.
Questions:
What is a flat image area?
What do we mean by "not being robust on flat image areas"?
An image region is said to be "flat" if it has a nearly uniform intensity. In other words, the variance of the intensity values within the region is very low.
The LBP feature is not robust on "flat" image areas because it is based on intensity differences. Within flat regions, those differences have small magnitude and are strongly affected by image noise; moreover, they carry no information about the actual intensity level at the location where they are computed.
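A minimal sketch of the basic 3x3 LBP (using the common "neighbour >= centre" convention) that shows why the code on a flat patch is driven almost entirely by noise:

    import numpy as np

    def lbp_code(patch):
        # 8-neighbour LBP code of the centre pixel of a 3x3 patch:
        # each neighbour sets one bit if its value is >= the centre value
        center = patch[1, 1]
        neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                      patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
        return sum(1 << i for i, n in enumerate(neighbours) if n >= center)

    rng = np.random.default_rng(0)
    flat = np.full((3, 3), 128, dtype=np.int64)
    noisy = flat + rng.integers(-2, 3, size=(3, 3))  # the same flat patch plus tiny sensor noise

    print(lbp_code(flat))   # 255: every neighbour equals the centre
    print(lbp_code(noisy))  # the bit pattern is now dictated by the noise, not the image content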

How to subtract color pixels [closed]

A lot of research papers that I am reading these days just abstractly write image1 - image2.
I imagine they mean grayscale images, but how does this extend to color images?
Do I take the intensities and subtract? How would I compute these intensities: by taking the average, or by taking the weighted average as illustrated here?
I would also prefer if you could quote a source for this, preferably a research paper or a textbook.
Edit: I am working on motion detection, where there are plenty of algorithms which build a background model of the video (an image) and then subtract the current frame (again an image) from this model. If the difference exceeds a given threshold, the pixel is classified as foreground. So far I have been subtracting the intensities directly, but I don't know whether another approach is possible.
Subtracting directly in RGB space, or after converting to grayscale, can miss useful information and at the same time introduce many unwanted outliers. It is also possible that you don't need the subtraction operation at all: by investigating the intensity difference between background and object in all three channels, you can determine the range of the background in each channel and simply set those pixels to zero. This study demonstrated that such a method is robust against non-salient motion (such as moving leaves) in the presence of shadows in various environments.
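For reference, a minimal per-channel frame-differencing sketch (the file names and the threshold of 30 are placeholders, and the background model is assumed to be a single BGR image):

    import cv2
    import numpy as np

    background = cv2.imread("background.png")  # hypothetical background model, BGR
    frame = cv2.imread("frame.png")            # hypothetical current frame, BGR

    diff = cv2.absdiff(frame, background)      # per-channel |frame - background|
    # Mark a pixel as foreground if it differs enough in ANY of the three channels
    foreground = (diff > 30).any(axis=2).astype(np.uint8) * 255

    cv2.imwrite("foreground_mask.png", foreground)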

How to position an object in 3D space using cameras [closed]

Is it possible to use a couple of webcams (or any cameras, for that matter) to get the x, y and z coordinates of an object and then track it, perhaps using OpenCV, as it moves around a room?
I'm thinking of this in relation to localising and then controlling an RC helicopter.
Yes. You need to detect points in both images simultaneously and then match the pairs that correspond to the same point in the scene. This way you will have the same point represented in two different coordinate spaces (camera 1 and camera 2), and you can triangulate its 3D position.
You can start here.
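For the two-webcam case, a minimal triangulation sketch with OpenCV; every number below is a placeholder, and in practice the projection matrices come from stereo calibration (P = K [R | t] with the intrinsics K):

    import cv2
    import numpy as np

    # 3x4 projection matrices of the two cameras (intrinsics omitted, i.e. K = identity)
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    R, _ = cv2.Rodrigues(np.array([[0.0], [0.1], [0.0]]))  # small rotation between the cameras
    t = np.array([[-0.2], [0.0], [0.0]])                   # roughly a 20 cm baseline
    P2 = np.hstack([R, t])

    # Matched (normalized) image coordinates of the same scene point, one column per point
    pts1 = np.array([[0.10], [0.05]])
    pts2 = np.array([[0.08], [0.05]])

    point_h = cv2.triangulatePoints(P1, P2, pts1, pts2)    # homogeneous 4x1 result
    point_3d = (point_h[:3] / point_h[3]).ravel()
    print(point_3d)                                        # x, y, z in camera-1 coordinates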
If using a depth sensor is acceptable, you can take a look at how ReconstructMe does it. Otherwise, take a look at this Google search.
