Tracking of rotating objects using OpenCV

I need to track cars on the road from top-view video.
My application contains two main parts:
Detecting cars in the frame (TensorFlow-trained network)
Tracking the detected cars (OpenCV trackers)
I am having trouble with the OpenCV trackers. Initially I tried several different trackers, but only MOSSE is fast enough. This tracker works almost perfectly for the straight-road case, but I ran into problems with rotating cars. This situation arises at crossroads.
As I understand it, the bounding box of a rotated object is bigger than the bbox of a horizontal or vertical one. As a result, the bbox contains a large part of the static background and the tracker loses the target object.
Are there any alternative trackers which can track contours (not bounding boxes)?
Can I improve the quality of the existing OpenCV trackers' results through any settings or by adjusting the picture?
Schema:
Real image:

If your camera is stationary, the following approach is feasible (a rough sketch follows below):
Use background subtraction to separate the background image from foreground blobs.
Improve the foreground mask using morphological operations.
Detect car blobs and remove the other blobs.
Track the foreground blobs in the video, i.e. binary tracking (or even apply a Kalman filter).
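A rough Python/OpenCV sketch of this pipeline; the video path, the MOG2 parameters, and the 500-pixel area filter are placeholder assumptions to tune:

```python
import cv2

cap = cv2.VideoCapture("video.mp4")                      # placeholder path
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, detectShadows=True)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    # Clean up the foreground mask with morphological opening/closing.
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Keep only blobs large enough to be cars (threshold is arbitrary).
    cars = [c for c in contours if cv2.contourArea(c) > 500]
    for c in cars:
        x, y, w, h = cv2.boundingRect(c)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("cars", frame)
    if cv2.waitKey(30) & 0xFF == 27:
        break
cap.release()
```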

A very basic but effective approach in this scenario might be to track the center coordinates of the bounding box: if the center only changes along one axis (with a small tolerance for either axis), it's linear motion (not a rotation). If both x and y change, the car is moving through the roundabout.
The only weakness is that this also flags diagonal motion, but since you are looking at a centered roundabout, that shouldn't be an issue.
It will also be very memory-efficient.
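A minimal sketch of that heuristic; the function name and the 3-pixel tolerance are illustrative assumptions:

```python
TOL = 3  # pixels of allowed jitter on the "static" axis

def classify_motion(prev_center, curr_center, tol=TOL):
    dx = abs(curr_center[0] - prev_center[0])
    dy = abs(curr_center[1] - prev_center[1])
    if dx <= tol or dy <= tol:
        return "straight"   # movement essentially along one axis
    return "turning"        # both x and y change: the car is in the roundabout
```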

You could use the PCA method, which can calculate the orientation of a detected object and which way it is facing. You can change the detection threshold to select objects that look more like the cars in your picture (based on shape and colour, e.g. an HSV conversion, which in your case would target red).
Link to an introduction to Principal Component Analysis (PCA)
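A minimal Python/OpenCV sketch of the idea, assuming `contour` comes from cv2.findContours on a colour-thresholded image:

```python
import cv2
import numpy as np

def contour_orientation(contour):
    data_pts = contour.reshape(-1, 2).astype(np.float64)
    mean, eigenvectors, eigenvalues = cv2.PCACompute2(data_pts, np.empty((0)))
    vx, vy = eigenvectors[0]        # first eigenvector = major axis of the blob
    return np.arctan2(vy, vx)       # orientation angle in radians
```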

Method 1:
- Detect bounding boxes and subtract the background to get blobs as rotated rectangles (see the sketch after this list).
Method 2:
- Implement your own version of the detector with rotated boxes.
Method 3:
- Use segmentation instead, e.g. U-Net.
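A possible sketch for Method 1, assuming a binary foreground `mask` is already available from background subtraction and `frame` is the current image; cv2.minAreaRect gives a rectangle that follows the car's rotation, and the 500-pixel area filter is arbitrary:

```python
import cv2
import numpy as np

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    if cv2.contourArea(c) < 500:
        continue
    rect = cv2.minAreaRect(c)                 # ((cx, cy), (w, h), angle)
    box = cv2.boxPoints(rect).astype(np.int32)
    cv2.drawContours(frame, [box], 0, (0, 0, 255), 2)
```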

There are no other trackers than the ones found in the library.
Your best bet is to filter the image and use findContours.
Optical flow and background subtraction will help with this. You can combine optical flow with your car detector to rule out false positives.
https://docs.opencv.org/3.4/d4/dee/tutorial_optical_flow.html
https://docs.opencv.org/3.4/d1/dc5/tutorial_background_subtraction.html
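For illustration, a rough sketch of dense (Farnebäck) optical flow along the lines of the first tutorial; the frame variables and the 2 px/frame threshold are placeholder assumptions:

```python
import cv2
import numpy as np

prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
magnitude, _ = cv2.cartToPolar(flow[..., 0], flow[..., 1])
moving = magnitude > 2.0   # pixels moving faster than ~2 px/frame: candidate cars
```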

Related

How to get the coordinates of the moving object with GPUImage motionDetection - Swift

How would I go about getting the screen coordinates of something that enters the frame with the motionDetection filter? I'm fairly new to programming, and would prefer a Swift answer if possible.
Example: I have the iPhone pointing at a wall, monitoring it with the motionDetector. If I bounce a tennis ball against the wall, I want the app to place an image of a tennis ball on the iPhone display at the same spot where it hit the wall.
To do this, I would need the coordinates of where the motion occurred.
I thought maybe the "centroid" argument did this.... but I'm not sure.
I should point out that the motion detector is pretty crude. It works by taking a low-pass filtered version of the video stream (a composite image generated by a weighted average of incoming video frames) and then subtracting that from the current video frame. Pixels that differ above a certain threshold are marked. The number of these pixels, along with the centroid of the marked pixels are provided as a result.
The centroid is a normalized (0.0-1.0) coordinate representing the centroid of all of these differing pixels. A normalized strength gives you the percentage of the pixels that were marked as differing.
Movement in a scene will generally cause a bunch of pixels to differ, and for a single moving object the centroid will generally be the center of that object. However, it's not a reliable measure, as lighting changes, shadows, other moving objects, etc. can also cause pixels to differ.
For true object tracking, you'll want to use feature detection and tracking algorithms. Unfortunately, the framework as published does not have a fully implemented version of any of these at present.
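For reference, here is a rough OpenCV/Python sketch (not GPUImage or Swift) of the scheme described above: a running weighted average as the low-pass filtered background, differenced against the current frame, thresholded, and summarised by a normalised centroid and strength. The 0.05 and 25 constants are illustrative assumptions:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)
background = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
    if background is None:
        background = gray.copy()
    cv2.accumulateWeighted(gray, background, 0.05)       # weighted average of past frames
    diff = cv2.absdiff(gray, background)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    ys, xs = np.nonzero(mask)
    if len(xs) > 0:
        cx = xs.mean() / mask.shape[1]                   # normalised (0.0-1.0) centroid
        cy = ys.mean() / mask.shape[0]
        strength = len(xs) / mask.size                   # fraction of differing pixels
```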

OpenCV Image Comparison for Surface Damage detection

We are planning to create a surface damage detection prototype for ceramic tiles, with surface discoloration as the specific damage, using OpenCV. We would like to know which method we should consider using. We are new to developing these types of object recognition/object tracking programs. We've read about methods such as the histogram method and the one where the hue/saturation value is tracked, but we are still confused.
Also, we would like to know whether it is possible to detect the hue/saturation value of an object without the use of track bars.
Any relevant and helpful response will be greatly appreciated.
I think you can do it in sequence:
1) Find the tile region. Use a corner detector, Hough lines, etc.
2) Find SIFT (or other descriptors) and recognise which image should be on this tile (find it in your database of tile images).
3) Align the images carefully. For example, find the homography between the image found in the DB and the tile image from the camera (using SIFT features).
4) Find the colour distance between every pixel in the tile image from the camera and the tile image from the database.
5) Threshold the differences by some value -> get the problematic regions.
Also think about lighting. You have to provide equal lighting conditions for your measurements.
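A possible sketch of steps 2-5 (SIFT matching, homography alignment, per-pixel colour distance, thresholding); the image variable names, the ratio-test value, and the distance threshold of 40 are assumptions to tune:

```python
import cv2
import numpy as np

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(cv2.cvtColor(db_img, cv2.COLOR_BGR2GRAY), None)
kp2, des2 = sift.detectAndCompute(cv2.cvtColor(camera_img, cv2.COLOR_BGR2GRAY), None)

matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]   # Lowe's ratio test

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Warp the reference tile onto the camera view and compare colours per pixel.
h, w = camera_img.shape[:2]
warped_db = cv2.warpPerspective(db_img, H, (w, h))
dist = np.linalg.norm(camera_img.astype(np.float32) - warped_db.astype(np.float32), axis=2)
damage_mask = (dist > 40).astype(np.uint8) * 255                   # problematic regions
```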

Filtering out shadows when diffing frames in opencv

I am using OpenCV to process some videos where a user is placing their hands on different parts of a wall. I've selected some regions of interest and I'm currently just using cv2.absdiff on the original image of the wall with no user and the current frame to detect whether the user has their hand in a region of interest by looking at the average pixel difference. If it's above some threshold, I consider that region "activated".
The problem I'm having is that some of the video clips contain lighting and positions that result in the user casting a shadow over certain ROIs, such that they are above the threshold. Is there a good way to filter out shadows when diffing images?
OpenCV has a Mixture of Gaussians based background subtractor which also has an option to account for shadows. You can use this instead of absdiff. MOG can be a bit slow, though, compared to absdiff.
Alternatively, you can convert to HSV, and check that the Hue doesn't change.
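A minimal sketch of the MOG2 route: it marks shadow pixels with the value 127 in its output mask, so they can be dropped before you score each ROI (the video path and the 200 threshold are placeholders):

```python
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

cap = cv2.VideoCapture("wall.mp4")                # placeholder path
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    # Keep only "hard" foreground (255); 127 = shadow, 0 = background.
    fg = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)[1]
```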
You could first detect shadow regions in the original images, and exclude them from the difference imaging part. This paper provides a simple but effective method to detect shadows in images. They explore a colour space that is invariant to shadows.

How to position a car in image processing (computer vision)?

I would like to locate a car (front center point x, y) using a high-resolution single camera. The camera is fixed at 1-2 m high and tilted about 25 degrees. The camera provides images in which the front side of the car is visible. The intrinsic and extrinsic parameters are known.
So far, I have tried to detect the headlights and the number plate. Issues: the headlights are not detected as blobs all the time, the shape of the headlights changes depending on the distance, and the number plate is not visible in the dark.
Is there a robust algorithm to detect a car, its headlights, or its number plate? How should I proceed?
Thanks in advance,
Are you detecting the same car every time? If yes, then presumably the appearance remains consistent. Rather than detecting and recognising blobs and shapes, you may be better off using scale- and rotation-invariant features combined with a machine learning algorithm. Look into the SIFT and SURF feature descriptors. For easy experimentation, use OpenCV's implementation of feature description and matching. Take a look at this example.
This is not an easy problem because of the change in the scale and point of view. Ideally, you would need a collection of training images with the car seen from different points of view to match later some of them to your input image. Then, you need local features (SIFT, SURF) or some classifier to decide on the match.
On the other hand, if you are tracking the same car all the time, check out the MeanShift algorithm. The problem is you need an initial position to carry on with the tracking.
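For reference, a minimal MeanShift tracking sketch along the lines of the OpenCV tutorial; the initial box, the first frame, and the `frames` iterable are placeholder assumptions:

```python
import cv2

x, y, w, h = init_box                             # initial car position (placeholder)
roi = first_frame[y:y + h, x:x + w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

track_window = (x, y, w, h)
for frame in frames:                              # subsequent video frames
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    ret, track_window = cv2.meanShift(back_proj, track_window, term_crit)
```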

Vehicle segmentation and tracking

I've been working on a project for some time to detect and track (moving) vehicles in video captured from UAVs. Currently I am using an SVM trained on bag-of-features representations of local features extracted from vehicle and background images. I am then using a sliding-window detection approach to try to localise vehicles in the images, which I would then like to track. The problem is that this approach is far too slow, and my detector isn't as reliable as I would like, so I'm getting quite a few false positives.
So I have been considering attempting to segment the cars from the background to find their approximate positions, so as to reduce the search space before applying my classifier, but I am not sure how to go about this and was hoping someone could help.
Additionally, I have been reading about motion segmentation with layers, using optical flow to segment the frame by flow model. Does anyone have any experience with this method? If so, could you offer some input as to whether it would be applicable to my problem?
Below are two frames from a sample video:
frame 0:
frame 5:
Assuming your cars are moving, you could try to estimate the ground plane (road).
You may get a decent ground-plane estimate by extracting features (SURF rather than SIFT, for speed), matching them over frame pairs, and solving for a homography using RANSAC, since a plane in 3D maps between two camera frames according to a homography.
Once you have your ground plane you can identify the cars by looking at clusters of pixels that don't move according to the estimated homography.
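A possible sketch of that frame-pair homography estimate; ORB is used here as a free stand-in for SURF, and the frame names and RANSAC threshold are placeholder assumptions:

```python
import cv2
import numpy as np

g1 = cv2.cvtColor(frame_i, cv2.COLOR_BGR2GRAY)
g2 = cv2.cvtColor(frame_k, cv2.COLOR_BGR2GRAY)

orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(g1, None)
kp2, des2 = orb.detectAndCompute(g2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
# RANSAC discards matches on the moving cars, keeping the dominant (ground) plane.
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
```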
A more sophisticated approach would be to do Structure from Motion on the terrain. This only presupposes that it is rigid, not that it is planar.
Update
I was wondering if you could expand on how you would go about looking for clusters of pixels that don't move according to the estimated homography?
Sure. Say I and K are two video frames and H is the homography mapping features in I to features in K. First you warp I onto K according to H, i.e. you compute the warped image Iw as Iw( [x y]' )=I( inv(H)[x y]' ) (roughly Matlab notation). Then you look at the squared or absolute difference image Diff=(Iw-K)*(Iw-K). Image content that moves according to the homography H should give small differences (assuming constant illumination and exposure between the images). Image content that violates H such as moving cars should stand out.
For clustering high-error pixel groups in Diff I would start with simple thresholding ("every pixel difference in Diff larger than X is relevant", maybe using an adaptive threshold). The thresholded image can be cleaned up with morphological operations (dilation, erosion) and clustered with connected components. This may be too simplistic, but it's easy to implement for a first try, and it should be fast. For something more fancy look at Clustering in Wikipedia. A 2D Gaussian Mixture Model may be interesting; when you initialize it with the detection result from the previous frame it should be pretty fast.
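A rough sketch of the warp-and-difference step, assuming H, I and K are as in the explanation above (H from the homography estimate); the threshold of 30 is an assumption to tune:

```python
import cv2

h, w = K.shape[:2]
Iw = cv2.warpPerspective(I, H, (w, h))                   # warp I into K's coordinates
diff = cv2.absdiff(cv2.cvtColor(Iw, cv2.COLOR_BGR2GRAY),
                   cv2.cvtColor(K, cv2.COLOR_BGR2GRAY))
_, mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)    # remove speckle noise
# Connected components: one labelled blob per candidate moving car.
num, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
```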
I did a little experiment with the two frames you provided, and I have to say I am somewhat surprised myself how well it works. :-) Left image: Difference (color coded) between the two frames you posted. Right image: Difference between the frames after matching them with a homography. The remaining differences clearly are the moving cars, and they are sufficiently strong for simple thresholding.
Thinking of the approach you currently use, it may be interesting to combine it with my proposal:
You could try to learn and classify the cars in the difference image D instead of the original image. This would amount to learning what a car motion pattern looks like rather than what a car looks like, which could be more reliable.
You could get rid of the expensive window search and run the classifier only on regions of D with sufficiently high value.
Some additional remarks:
In theory, the cars should even stand out if they are not moving since they are not flat, but given your distance to the scene and camera resolution this effect may be too subtle.
You can replace the feature extraction / matching part of my proposal with Optical Flow, if you like. This amounts to identifying flow vectors that "stick out" from a consistent frame-to-frame motion of the ground. It may be prone to outliers in the optical flow, however. You can also try to get the homography from the flow vectors.
This is important: regardless of which method you use, once you have found cars in one frame you should use this information to robustify your search for these cars in consecutive frames, giving a higher likelihood to detections close to the old ones (Kalman filter, etc.). That's what tracking is all about!
If the number of cars in your field of view always remains the same but they move around, you can use optical flow; it will give you good results against a still background. If the number of cars changes, you need to call OpenCV's goodFeaturesToTrack function every certain number of frames and track the cars again using optical flow.
You can also use background modelling, so that the cars are always your foreground. The simplest example is frame differencing: subtract the previous frame from the current frame, diff(x,y,k) = I(x,y,k) - I(x,y,k-1). As your cars are moving, you will get their positions in each frame.
Both approaches will work fine since you have a still background, I presume. Check this link to see what optical flow can do.
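A minimal sketch of the goodFeaturesToTrack + sparse optical flow idea; the frame variables, the corner parameters, and the 1-pixel motion threshold are placeholder assumptions:

```python
import cv2
import numpy as np

prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=300, qualityLevel=0.01, minDistance=7)
new_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)

# Keep points that were tracked successfully and actually moved (likely on cars).
good_old = pts[status.flatten() == 1].reshape(-1, 2)
good_new = new_pts[status.flatten() == 1].reshape(-1, 2)
moved = np.linalg.norm(good_new - good_old, axis=1) > 1.0
car_points = good_new[moved]
```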
