OpenCV: method of measuring an object's angular displacement?

I would like to track an object using a webcam and compare every two consecutive frames to see the change in the object's angular displacement. I was thinking of first extracting the object from the image using background subtraction, then creating a bounding box around the object, and then measuring the angular displacement of the box. Is there any way to implement this in OpenCV?
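A minimal sketch of the approach described above, assuming OpenCV's MOG2 background subtractor and the angle of the largest contour's rotated bounding box (cv2.minAreaRect); the camera index and thresholds are placeholders, and note that minAreaRect's angle is only defined up to 90°, so it is a crude measure of rotation:

```python
import cv2
import numpy as np

subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)

def box_angle(frame):
    """Angle (degrees) of the rotated bounding box of the largest foreground blob."""
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    # OpenCV 4.x return signature; OpenCV 3.x returns three values here.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    return cv2.minAreaRect(largest)[2]

cap = cv2.VideoCapture(0)   # placeholder: default webcam
prev_angle = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    angle = box_angle(frame)
    if angle is not None and prev_angle is not None:
        print("angular displacement between frames:", angle - prev_angle)
    if angle is not None:
        prev_angle = angle
```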

I have some experimental experience with this.
Try calculating motion vectors and look at their orientation and length correlations between the two frames. That will give you a good idea of the type of movement (a rough sketch follows the list below), e.g.:
Motion (length correlation)
Direction (collective direction of a group of motion vectors)
Depth (direction and relative length of motion; e.g. if there is depth, the direction will be the same but the motion vectors will be shorter)
Rotation and scaling show up in the group behaviour of the motion vectors.
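A rough sketch of computing dense motion vectors with Farneback optical flow and summarising their length and orientation; the frame file names are placeholders:

```python
import cv2

# Two consecutive grayscale frames (placeholder file names).
prev = cv2.imread("frame_prev.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_curr.png", cv2.IMREAD_GRAYSCALE)

# Dense motion vectors: one (dx, dy) per pixel.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)

# Length and orientation of every motion vector.
mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])

# Group behaviour hints at the type of movement: similar angles suggest
# translation, angles arranged around a point suggest rotation, and vectors
# pointing outward/inward suggest scaling or a depth change.
print("mean length:", mag.mean(), "mean direction (rad):", ang.mean())
```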
Try it and if you want more clarification, write back.
Good Luck and Happy Coding.

Related

OpenCV - align stack of images - different cameras

We have this camera array arranged in an arc around a person (red dot). Think The Matrix - each camera fires at the same time and then we create an animated GIF from the output. The problem is that it is nearly impossible to align the cameras exactly, so I am looking for a way in OpenCV to align the images better and make the animation smoother.
Looking for general steps. I'm unsure of the order I would do it in. If I start with image 1 and match 2 to it, then 2 is further from 3 than it was at the start, and so matching 3 to 2 would be a bigger change... and the error would propagate. I have seen similar alignments done, though. Any help much appreciated.
Here's a thought. How about performing a quick and very simple "calibration" of the imaging system by using a single reference point?
The best thing about this is you can try it out pretty quickly and even if results are too bad for you, they can give you some more insight into the problem. But the bad thing is it may just not be good enough because it's hard to think of anything "less advanced" than this. Here's the description:
Remove the object from the scene
Place a small object (let's call it a "dot") at a position that roughly corresponds to the center of mass of the object you are about to record (the center of the area denoted by the red circle).
Record a single image with each camera
Use some simple algorithm to find the position of the dot on every image
Compute the distance from the dot position to the image center for every image
Shift each image by (-x, -y), where (x, y) is the distance computed for that image; after that, the dot should be located at the center of every image.
When recording an actual object, use these precomputed distances to shift all images. After you translate the images, they will be roughly aligned. But since you are shooting an object that is three-dimensional and has considerable size, I am not sure whether the alignment will be very convincing ... I wonder what results you'd get, actually.
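A minimal sketch of the shift step, assuming the per-camera offsets were computed from the calibration shots; the dot detector here is just a placeholder (brightest blob in a blurred grayscale image), and the file names are made up:

```python
import cv2
import numpy as np

def find_dot(image):
    """Placeholder dot detector: brightest pixel of a blurred grayscale image."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (9, 9), 0)
    _, _, _, max_loc = cv2.minMaxLoc(gray)
    return max_loc  # (x, y)

def shift_to_center(image, dot_xy):
    """Translate the image so the calibration dot lands at the image center."""
    h, w = image.shape[:2]
    dx, dy = w / 2.0 - dot_xy[0], h / 2.0 - dot_xy[1]
    M = np.float32([[1, 0, dx], [0, 1, dy]])
    return cv2.warpAffine(image, M, (w, h))

# One offset per camera, computed once from the calibration image...
calib = cv2.imread("camera_03_calibration.png")     # placeholder file name
dot = find_dot(calib)

# ...then reused to shift every actual shot from that camera.
shot = cv2.imread("camera_03_shot.png")             # placeholder file name
aligned = shift_to_center(shot, dot)
```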
If I understand the application correctly, you should be able to obtain the relative pose of each camera in your array using homographies:
https://docs.opencv.org/3.4.0/d9/dab/tutorial_homography.html
From here, the next step would be to correct for alignment issues by estimating the transform between each camera's actual position and their 'ideal' position in the array. These ideal positions could be computed relative to a single camera, or relative to the focus point of the array (which may help simplify calculation). For each image, applying this corrective transform will result in an image that 'looks like' it was taken from the 'ideal' position.
Note that you may need to estimate relative camera pose in 3-4 array 'sections', as it looks like you have a full 180deg array (e.g. estimate homographies for 4-5 cameras at a time). As long as you have some overlap between sections it should work out.
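A minimal sketch of estimating the transform between two adjacent cameras from matched features and warping one view toward the other; ORB stands in here only to keep the example self-contained, and the file names are placeholders:

```python
import cv2
import numpy as np

img_a = cv2.imread("cam_05.png", cv2.IMREAD_GRAYSCALE)   # placeholder
img_b = cv2.imread("cam_06.png", cv2.IMREAD_GRAYSCALE)   # placeholder

# Detect and match features between the two overlapping views.
orb = cv2.ORB_create(2000)
kp_a, des_a = orb.detectAndCompute(img_a, None)
kp_b, des_b = orb.detectAndCompute(img_b, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_a, des_b)
matches = sorted(matches, key=lambda m: m.distance)[:300]

src = np.float32([kp_a[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_b[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC rejects mismatches; H maps points in cam_05's image into cam_06's view.
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Warping one image with H makes it 'look like' it was taken from the other position.
h, w = img_b.shape
warped = cv2.warpPerspective(img_a, H, (w, h))
```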
Most of my experience with this sort of thing comes from using MATLAB's stereo camera calibrator app and related functions. Their help page gives a good overview of how to get started estimating camera pose. OpenCV has similar functionality.
https://www.mathworks.com/help/vision/ug/stereo-camera-calibrator-app.html
The cited paper by Zhang gives a great description of the mathematics of pose estimation from correspondence, if you're interested.

How to get the coordinates of a moving object with GPUImage motionDetection - Swift

How would I go about getting the screen coordinates of something that enters the frame with the motionDetection filter? I'm fairly new to programming, and would prefer a Swift answer if possible.
Example - I have the iPhone pointing at a wall, monitoring it with the motionDetector. If I bounce a tennis ball against the wall, I want the app to place an image of a tennis ball on the iPhone display at the same spot it hit the wall.
To do this, I would need the coordinates of where the motion occurred.
I thought maybe the "centroid" argument did this.... but I'm not sure.
I should point out that the motion detector is pretty crude. It works by taking a low-pass filtered version of the video stream (a composite image generated by a weighted average of incoming video frames) and then subtracting that from the current video frame. Pixels that differ above a certain threshold are marked. The number of these pixels, along with the centroid of the marked pixels, is provided as the result.
The centroid is a normalized (0.0-1.0) coordinate representing the centroid of all of these differing pixels. A normalized strength gives you the percentage of the pixels that were marked as differing.
Movement in a scene will generally cause a bunch of pixels to differ, and for a single moving object the centroid will generally be the center of that object. However, it's not a reliable measure, as lighting changes, shadows, other moving objects, etc. can also cause pixels to differ.
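For reference, here is a rough OpenCV/Python sketch of the same mechanism (a running average as the low-pass image, an absolute difference against the current frame, then the centroid and strength of the thresholded pixels); it only illustrates the idea, not GPUImage's actual implementation:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)   # placeholder: default camera
avg = None                  # low-pass "background": weighted average of frames

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
    if avg is None:
        avg = gray.copy()
    cv2.accumulateWeighted(gray, avg, 0.05)
    diff = cv2.absdiff(gray, avg)
    _, mask = cv2.threshold(diff.astype(np.uint8), 25, 255, cv2.THRESH_BINARY)

    ys, xs = np.nonzero(mask)
    if len(xs) > 0:
        h, w = mask.shape
        centroid = (xs.mean() / w, ys.mean() / h)   # normalized 0.0-1.0
        strength = len(xs) / float(mask.size)       # fraction of differing pixels
        print("centroid:", centroid, "strength:", strength)
```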
For true object tracking, you'll want to use feature detection and tracking algorithms. Unfortunately, the framework as published does not have a fully implemented version of any of these at present.

How to position a car in image processing (computer vision)?

I would like to locate a car (front center point x, y) using a high-resolution single camera. The camera setup is fixed at 1-2 m high and tilted around 25 degrees. The camera provides images in which the front side of the car is visible. The intrinsic and extrinsic parameters are known.
So far, I have tried to detect the headlights and the number plate. Issues: the headlights are not always detected as blobs, the shape of the headlights changes with distance, and the number plate is not visible in the dark.
Is there a robust algorithm to detect a car, or to detect headlights, or to detect the number plate? How could I proceed?
Thanks in advance,
Are you detecting the same car every time? If yes, then presumably the appearance remains consistent. Rather than detecting and recognising blobs and shapes, you may be better off using scale- and rotation-invariant features combined with a machine learning algorithm. Look into the SIFT and SURF feature descriptors. For easy experimentation, use OpenCV's implementation of feature description and matching. Take a look at this example.
This is not an easy problem because of the changes in scale and point of view. Ideally, you would need a collection of training images of the car seen from different points of view, to later match some of them against your input image. Then you need local features (SIFT, SURF) or some classifier to decide on the match.
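A minimal sketch of matching local features from a training image of the car against a new frame, using SIFT with Lowe's ratio test (SIFT lives in the main cv2 module in recent OpenCV releases; the file names and match threshold are placeholders):

```python
import cv2

car_template = cv2.imread("car_train.png", cv2.IMREAD_GRAYSCALE)  # placeholder
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)             # placeholder

sift = cv2.SIFT_create()
kp_t, des_t = sift.detectAndCompute(car_template, None)
kp_s, des_s = sift.detectAndCompute(scene, None)

# For each template descriptor, find its two nearest neighbours in the scene.
knn = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_t, des_s, k=2)

# Lowe's ratio test keeps only distinctive matches.
good = []
for pair in knn:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

# A reasonable number of good matches suggests the car is present; the matched
# keypoint locations in the scene give its rough position.
print("good matches:", len(good))
```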
On the other hand, if you are tracking the same car all the time, check out the MeanShift algorithm. The problem is you need an initial position to carry on with the tracking.

Vehicle segmentation and tracking

I've been working for some time on a project to detect and track (moving) vehicles in video captured from UAVs. Currently I am using an SVM trained on bag-of-features representations of local features extracted from vehicle and background images. I am then using a sliding-window detection approach to try to localise vehicles in the images, which I would then like to track. The problem is that this approach is far too slow, and my detector isn't as reliable as I would like, so I'm getting quite a few false positives.
So I have been considering trying to segment the cars from the background to find their approximate positions and reduce the search space before applying my classifier, but I am not sure how to go about this and was hoping someone could help.
Additionally, I have been reading about motion segmentation with layers, using optical flow to segment the frame by flow model. Does anyone have experience with this method? If so, could you offer some input as to whether you think it would be applicable to my problem?
Below are two frames from a sample video
frame 0:
frame 5:
Assuming your cars are moving, you could try to estimate the ground plane (road).
You may get a decent ground plane estimate by extracting features (SURF rather than SIFT, for speed), matching them over frame pairs, and solving for a homography using RANSAC, since a plane in 3D maps between two camera frames according to a homography.
Once you have your ground plane you can identify the cars by looking at clusters of pixels that don't move according to the estimated homography.
A more sophisticated approach would be to do Structure from Motion on the terrain. This only presupposes that it is rigid, not that it is planar.
Update
I was wondering if you could expand on how you would go about looking for clusters of pixels that don't move according to the estimated homography?
Sure. Say I and K are two video frames and H is the homography mapping features in I to features in K. First you warp I onto K according to H, i.e. you compute the warped image Iw as Iw( [x y]' )=I( inv(H)[x y]' ) (roughly Matlab notation). Then you look at the squared or absolute difference image Diff=(Iw-K)*(Iw-K). Image content that moves according to the homography H should give small differences (assuming constant illumination and exposure between the images). Image content that violates H such as moving cars should stand out.
For clustering high-error pixel groups in Diff I would start with simple thresholding ("every pixel difference in Diff larger than X is relevant", maybe using an adaptive threshold). The thresholded image can be cleaned up with morphological operations (dilation, erosion) and clustered with connected components. This may be too simplistic, but it's easy to implement for a first try, and it should be fast. For something more fancy, look at Clustering in Wikipedia. A 2D Gaussian mixture model may be interesting; if you initialize it with the detection result from the previous frame, it should be pretty fast.
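A rough sketch of the pipeline above (match features between the frame pair, fit a homography with RANSAC, warp one frame onto the other, then threshold and cluster the difference image); ORB stands in for SURF only because it ships with the default OpenCV build, and the frame file names and thresholds are placeholders:

```python
import cv2
import numpy as np

I = cv2.imread("frame_0.png", cv2.IMREAD_GRAYSCALE)   # placeholder
K = cv2.imread("frame_5.png", cv2.IMREAD_GRAYSCALE)   # placeholder

# Features and matches between the frame pair.
orb = cv2.ORB_create(3000)
kp_i, des_i = orb.detectAndCompute(I, None)
kp_k, des_k = orb.detectAndCompute(K, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_i, des_k)

pts_i = np.float32([kp_i[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
pts_k = np.float32([kp_k[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# Homography of the dominant (ground) plane; RANSAC ignores the moving cars.
H, _ = cv2.findHomography(pts_i, pts_k, cv2.RANSAC, 3.0)

# Warp I onto K and look at the difference image.
Iw = cv2.warpPerspective(I, H, (K.shape[1], K.shape[0]))
diff = cv2.absdiff(Iw, K)

# Threshold, clean up with morphology, and cluster with connected components.
_, mask = cv2.threshold(diff, 30, 255, cv2.THRESH_BINARY)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
mask = cv2.dilate(mask, np.ones((7, 7), np.uint8))
n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
for i in range(1, n):                       # label 0 is the background
    if stats[i, cv2.CC_STAT_AREA] > 50:     # crude size filter for car-sized blobs
        x, y, w, h = stats[i, :4]
        print("candidate region:", x, y, w, h)
```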
I did a little experiment with the two frames you provided, and I have to say I am somewhat surprised myself how well it works. :-) Left image: Difference (color coded) between the two frames you posted. Right image: Difference between the frames after matching them with a homography. The remaining differences clearly are the moving cars, and they are sufficiently strong for simple thresholding.
Thinking of the approach you currently use, it may be interesting to combine it with my proposal:
You could try to learn and classify the cars in the difference image Diff instead of the original image. This would amount to learning what a car's motion pattern looks like rather than what a car looks like, which could be more reliable.
You could get rid of the expensive window search and run the classifier only on regions of Diff with sufficiently high values.
Some additional remarks:
In theory, the cars should even stand out if they are not moving since they are not flat, but given your distance to the scene and camera resolution this effect may be too subtle.
You can replace the feature extraction / matching part of my proposal with Optical Flow, if you like. This amounts to identifying flow vectors that "stick out" from a consistent frame-to-frame motion of the ground. It may be prone to outliers in the optical flow, however. You can also try to get the homography from the flow vectors.
This is important: regardless of which method you use, once you have found cars in one frame you should use this information to robustify your search for these cars in consecutive frames, giving a higher likelihood to detections close to the old ones (Kalman filter, etc.). That's what tracking is all about!
If the number of cars in your field of view always remains the same but they move around, then you can use optical flow; it will give you good results against a still background. If the number of cars changes, then you need to call OpenCV's goodFeaturesToTrack function after a certain number of frames and again track the cars using optical flow (a sparse-flow sketch follows below).
You can use background modelling to model the background, so that the cars are always your foreground. The simplest example is frame differencing: subtract the previous frame from the current frame, diff(x,y,k) = I(x,y,k) - I(x,y,k-1). As your cars are moving, you will get their positions in each frame.
Both approaches will work fine since you have a still background, I presume. Check this link to see what optical flow can do.
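A minimal sketch of the goodFeaturesToTrack plus sparse Lucas-Kanade optical flow suggestion, with periodic re-seeding as cars enter and leave the view; the video file name and re-seed interval are placeholders:

```python
import cv2

cap = cv2.VideoCapture("uav_clip.mp4")      # placeholder file name
ok, frame = cap.read()
prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
pts = cv2.goodFeaturesToTrack(prev, maxCorners=500, qualityLevel=0.01, minDistance=7)

frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    if pts is not None and len(pts) > 0:
        # Track the seeded corners into the new frame with pyramidal Lucas-Kanade.
        new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev, gray, pts, None)
        pts = new_pts[status.ravel() == 1].reshape(-1, 1, 2)

    prev = gray
    frame_idx += 1
    # Re-seed periodically (or when too few points survive) as cars enter/leave.
    if frame_idx % 30 == 0 or pts is None or len(pts) < 50:
        pts = cv2.goodFeaturesToTrack(prev, maxCorners=500, qualityLevel=0.01, minDistance=7)
```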

Object tracking in OpenCV

I have been using the LK algorithm to detect corners and interest points for tracking.
However, I am stuck at the point where I need something like a rectangular box to follow the tracked object. All I have now is just a lot of points showing my moving objects.
Are there any methods or suggestions for that? Also, any idea on adding a counter to the window so that objects moving in and out of the screen can be counted as well?
Thank you
There are lots of options! Within OpenCV, I'd suggest using CamShift as a starting point, since it is relatively easy to use. CamShift uses mean shift to iteratively search for an object in consecutive frames.
Note that you need to seed the tracker with some kind of input. You could have the user draw a rectangle around the object, or use a detector to get the initial input. If you want to track faces, for example, OpenCV has a cascade classifier and training data for a face detector included.
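A minimal sketch of seeding CamShift with an initial rectangle and tracking it frame to frame using hue-histogram back-projection; the camera index and seed rectangle are placeholders (in practice they would come from a user selection or a detector):

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)                 # placeholder: default camera
ok, frame = cap.read()

# Seed: an initial rectangle around the object (placeholder values).
x, y, w, h = 200, 150, 100, 100
track_window = (x, y, w, h)

# Hue histogram of the seeded region, used for back-projection.
hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)

    # CamShift adapts the window's position, size and orientation each frame.
    rot_rect, track_window = cv2.CamShift(back_proj, track_window, criteria)
    box = np.int32(cv2.boxPoints(rot_rect))
    cv2.polylines(frame, [box], True, (0, 255, 0), 2)

    cv2.imshow("tracking", frame)
    if cv2.waitKey(30) & 0xFF == 27:      # Esc to quit
        break
```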
