What is the difference between sparse and dense optical flow? - image-processing

Lots of resources say that there are two types of optical flow algorithms, and that Lucas-Kanade is a sparse technique, but I can't find the meaning of sparse and dense. Can someone tell me what the difference between dense and sparse optical flow is?

The short explanation is that sparse techniques only need to process some of the pixels in the image, while dense techniques process all of them. Dense techniques are slower but can be more accurate; in my experience, though, Lucas-Kanade accuracy might be enough for real-time applications. An example of a dense optical flow algorithm (the most popular one) is Gunnar Farnebäck's optical flow.
To get an overview of flow quality, look at the benchmark pages, e.g. the KITTI or the Middlebury datasets.
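To make the distinction concrete, here is a minimal dense-flow sketch with OpenCV in Python; the frame file names are placeholders and the parameters are just the commonly used values from the standard Farnebäck example, not tuned ones.

    import cv2

    # Two consecutive grayscale frames (placeholder file names).
    prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
    curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

    # Dense flow: one 2D motion vector per pixel.
    # Positional args: pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags
    flow = cv2.calcOpticalFlowFarneback(prev, curr, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    # flow has shape (H, W, 2): a (dx, dy) displacement for every pixel.
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])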

Sparse optical flow gives you the flow vectors of some "interesting features" within the image.
Dense optical flow attempts to give you the flow all over the image - up to a flow vector per pixel.

First of all, Lucas-Kanade is NOT a sparse optical flow technique. The reason so many believe it is, is a widespread misunderstanding. The misconception became accepted truth because the very first implementation of Lucas-Kanade in OpenCV was labelled as SPARSE, and still is to this day. The arguments for why Lucas-Kanade should be called sparse apply to any dense flow algorithm. If you insist that Lucas-Kanade is sparse, then all flow algorithms are sparse and there is no point in distinguishing them.
Sparse flow is the same as point tracking; dense flow consists of vectors over the video, indicating estimates of the motion at fixed positions.
You can read more about all of this in this tutorial that I wrote, where I also show that Lucas-Kanade is just as dense as any other algorithm out there (although not as accurate).

Sparse optical flow - the Lucas-Kanade method computes optical flow for a sparse feature set (e.g. corners detected with the Shi-Tomasi algorithm).
Dense optical flow - Gunnar Farnebäck's algorithm computes the optical flow for all the points in the frame. It is explained in "Two-Frame Motion Estimation Based on Polynomial Expansion" by Gunnar Farnebäck (2003).
An example implementation can be found in the OpenCV documentation here.
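For comparison, a minimal sparse Lucas-Kanade sketch in OpenCV (Python) could look like the following; the file names and parameter values are placeholders and would need tuning for a real application.

    import cv2

    prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)  # placeholder frames
    curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

    # Shi-Tomasi corners form the sparse feature set.
    p0 = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01, minDistance=7)

    # Pyramidal Lucas-Kanade: flow is computed only at those points.
    p1, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None,
                                               winSize=(21, 21), maxLevel=3)

    good_new = p1[status.ravel() == 1]  # tracked positions in the second frame
    good_old = p0[status.ravel() == 1]  # corresponding positions in the first frame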

Sparse optical flow works on features (edges, corners, etc.), while dense optical flow is designed to work on all the pixels. The advantage of the first is that it is generally faster; the second can give motion estimates for far more pixels.

Sparse optical flow gives you the velocity vectors for some interesting (corner) points; these points are extracted beforehand using algorithms like Shi-Tomasi, Harris, etc. The extracted points are passed into your optical flow function along with the present image and the next image. Any good optical flow pipeline should compute the flow in the forward direction using those corner points and also back-track to cross-check that it is following the same points (see the sketch below).
On the other hand, dense optical flow is described here: http://www.cs.toronto.edu/~fleet/courses/cifarSchool09/flowChapter05.pdf
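The forward-backward cross-check mentioned above is not built into calcOpticalFlowPyrLK, but it is easy to do by running the tracker twice; a rough sketch (the file names and the 1-pixel threshold are arbitrary placeholder choices):

    import cv2
    import numpy as np

    prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)  # placeholder frames
    curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
    p0 = cv2.goodFeaturesToTrack(prev, 200, 0.01, 7)

    p1, st1, _ = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None)   # forward
    p0r, st2, _ = cv2.calcOpticalFlowPyrLK(curr, prev, p1, None)  # backward

    # Keep a track only if backtracking lands close to where it started.
    fb_err = np.linalg.norm((p0 - p0r).reshape(-1, 2), axis=1)
    good = (st1.ravel() == 1) & (st2.ravel() == 1) & (fb_err < 1.0)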

Related

What is the difference between Optical flow estimation and Disparity estimation?

I am trying to understand some computer vision topics. One main difference I observed between these two is that in optical flow the second image is usually taken at time (t+1), whereas in disparity estimation both images are usually from the same time step, unless one has a static scene and uses a single non-stereo camera.
Are there any other differences, and what are their respective implications?
As you pointed out, Optical flow represents the displacement of pixels between an image at time t and at time t+1 whereas disparity estimation is the displacement of a pixel between one camera and another.
Strictly speaking these two tasks could be considered identical.
However, in practice, disparity is computed using a "right" and a "left" camera which are horizontally aligned. Therefore, the disparity is only horizontal (and in a single direction due to the laws of optics) and can be represented by a heatmap.
On the contrary, Optical flow is a 2D vector field in which vectors can take any value.
In machine learning, this distinction mainly changes the dimension of the output to predict (1D for disparity, 2D for optical flow) as well as its range (positive values for disparity vs. all real numbers for optical flow).
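As a rough illustration of that output difference in OpenCV (Python) terms — the file names are placeholders, and a simple block-matching stereo method stands in for whatever disparity estimator you actually use:

    import cv2

    left  = cv2.imread("left.png",  cv2.IMREAD_GRAYSCALE)   # placeholder stereo pair
    right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

    # Disparity: one horizontal displacement per pixel (single channel).
    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disp = stereo.compute(left, right)   # shape (H, W), fixed-point (divide by 16)

    # Optical flow: a 2D displacement per pixel (two channels).
    flow = cv2.calcOpticalFlowFarneback(left, right, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    print(disp.shape, flow.shape)        # (H, W) vs (H, W, 2)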
I hope my answer was clear :)

OpenCV - Feature Matching vs Optical Flow

I am interested in making a motion tracking app using OpenCV, and there has been a wealth of information available online. However, I am a tad confused between feature matching and tracking features using a sparse optical flow algorithm such as Lucas-Kanade. With that in mind, I have the following questions:
What is the main difference between the two (feature matching and optical flow) if I have specified a region of pixels to track? I'm not interested in tracking in real time, if that helps clear up any assumptions.
In addition, since I'm not doing real time tracking, is it a better idea to use dense optical flow (Farneback) to keep track of the pixels in my specified region of interest?
Thank you.
I would like to add a few thoughts on this topic, since I found it a very interesting question too.
As said before, feature matching is a technique that is based on:
A feature detection step which returns a set of so-called feature points. These feature points are located at positions with salient image structures, e.g. corner-like structures when you are using FAST or blob-like structures if you are using SIFT or SURF.
The second step is the matching, i.e. the association of feature points extracted from two different images. The matching is based on local visual descriptors, e.g. histograms of gradients or binary patterns, that are extracted locally around the feature positions. The descriptor is a feature vector, and associated feature point pairs are pairs with minimal feature-vector distance.
Most feature matching methods are scale and rotation invariant and are robust against changes in illumination (e.g. caused by shadows or different contrast). Thus these methods can be applied to image sequences, but they are more often used to align image pairs captured from different views or with different devices. The disadvantage of feature matching methods is that it is hard to control where the feature matches will appear, and the feature pairs (which in an image sequence are motion vectors) are in general very sparse. In addition, the sub-pixel accuracy of matching approaches is very limited, since most detectors localize features only at integer positions.
From my experience, the main advantage of feature matching approaches is that they can handle very large motions / displacements.
OpenCV offers some feature matching methods, but there are many more recent, faster and more accurate approaches available online, e.g.:
DeepMatching, which relies on deep learning and is often used to initialize optical flow methods to help them deal with long-range motions.
StereoScan, which is a very fast approach originally proposed for visual odometry.
Optical flow methods, in contrast, rely on minimizing the violation of brightness constancy together with additional constraints, e.g. smoothness. Thus they derive motion vectors based on the spatial and temporal image gradients of a sequence of consecutive frames, and they are more suited to image sequences than to image pairs captured from very different viewpoints. The main challenges in estimating motion with optical flow are large motions, occlusions, strong illumination changes, changes in the appearance of the objects and, above all, achieving a low runtime. However, optical flow methods can be highly accurate and compute dense motion fields which respect the shared motion boundaries of the objects in a scene.
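(For reference, the brightness constancy assumption states that a pixel keeps its intensity while it moves, i.e. I(x + u, y + v, t + 1) ≈ I(x, y, t); variational flow methods estimate the vector (u, v) at each position by minimizing the violation of this constraint plus a smoothness penalty over the whole image.)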
However, the accuracy of different optical flow methods varies a lot. Local methods such as PLK (pyramidal Lucas-Kanade) are in general less accurate, but they allow computing motion vectors only at pre-selected positions and can thus be very fast. (In recent years we have done some research to improve the accuracy of the local approach; see here for further information.)
The main OpenCV trunk offers global approaches such as Farnebäck's, but this is a quite outdated approach. Try the OpenCV contrib trunk, which contains more recent methods. To get a good overview of the most recent methods, take a look at the public optical flow benchmarks. There you will find code and implementations as well, e.g.:
MPI-Sintel optical flow benchmark
KITTI 2012 optical flow benchmark. Both offer links, e.g. to Git repositories or source code, for some of the newer methods, such as FlowFields.
But from my point of view, I would not reject a specific approach (matching or optical flow) at an early stage. Try as many of the available online implementations as possible and see which is best for your application.
Feature matching uses the feature descriptors to match features with one another, (usually) using a nearest-neighbour search in the feature descriptor space. The basic idea is that you have descriptor vectors, and the descriptors of the same feature in two images should be near each other in descriptor space, so you just match that way.
Optical flow algorithms do not look at a descriptor space; instead, they look at pixel patches around features and try to match those patches. If you're familiar with dense optical flow, sparse optical flow just does dense optical flow on small patches of the image around feature points. Thus optical flow assumes brightness constancy, that is, that pixel brightness doesn't change between frames. Also, since you're looking at neighbouring pixels, you need to assume that points neighbouring your features move similarly to your features. Finally, since it's using a dense flow algorithm on small patches, the points cannot move very far in the image from the original feature location. If they do, then the pyramid-resolution approach is recommended, where you scale the image down first so that what was once a 16-pixel translation becomes a 2-pixel translation, and then you scale back up, using the found transformation as your prior.
So feature matching algorithms are all-in-all far better when it comes to templates where the scale is not exactly the same, when there's a perspective difference between the image and the template, or when the transformations are large. However, your matches are only as good as your feature detector is exact. Optical flow algorithms, on the other hand, can be really, really precise as long as they're looking in the right spot. Both are somewhat computationally expensive: optical flow algorithms are iterative, which makes them expensive (and although you'd think the pyramid approach adds cost by running on more images, it can actually make reaching the desired accuracy faster in some cases), and nearest-neighbour searches are also expensive. Optical flow algorithms can work really well when the transformations are small, but anything in your scene that messes with your lighting, or a few incorrect pixels (say, from even minor occlusion), can really throw them off.
Which one to use definitely depends on the project. For a project I worked on with satellite imagery, I used dense optical flow because the images of desert terrain I was working with did not have precise enough features (in location) and different feature descriptors happen to look relatively similar so searching that feature space wasn't giving tons of great matches. In this case, optical flow was the better method. However, if you were doing image alignment on satellite imagery of a city where buildings can occlude parts of the scene, there are a lot of features that will stay matched and give a better result.
The OpenCV Lucas-Kanade tutorial doesn't give a whole lot of insight but should get your code moving in the right direction with the above in mind.
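To make the contrast concrete, here is a rough feature-matching sketch with OpenCV (Python), using ORB and a brute-force nearest-neighbour search with Lowe's ratio test; the file names and the 0.75 ratio are placeholder choices:

    import cv2

    img1 = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)  # placeholder images
    img2 = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

    # Detect keypoints and compute binary descriptors.
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # Nearest-neighbour search in descriptor space, filtered by the ratio test.
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = bf.knnMatch(des1, des2, k=2)
    good = []
    for pair in matches:
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good.append(pair[0])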
key-point matching = sparse optical flow
KLT tracking is a good example of sparse flow; see the demo LKDemo.cpp (there was a Python wrapper example too, can't remember it now).
For a dense example, see samples/python/opt_flow.py, which uses Farnebäck's method.
You are right in being confused... The entire world is confused about this terribly simple topic. A lot of the reason is that people believe Lucas-Kanade to be sparse flow (due to a terribly badly named and commented example in OpenCV: LKdemo, which should be called KLTDemo).

How to do grid-based (dense) optical flow on a masked image?

I am trying to track multiple people using a video camera. I do not want to use blob segmentation techniques.
What I want to do:
Perform background subtraction to obtain a mask isolating the people's motion.
Perform grid-based optical flow on those areas.
What would be my best bet?
I am struggling to implement this. I have tried blob detection and also some (sparse) optical flow based examples, but sparse didn't really do it for me as I wasn't getting enough feature points from goodFeaturesToTrack(). I would like to end up with at least 20 trackable points per person, which is why I think a grid-based method would be better for me. I will use the motion vectors obtained to classify different people (clustering on magnitude and direction, possibly?).
I am using opencv3 with Python 3.5 - but am still quite noobish in this field.
Would appreciate some guidance immensely!
For sparse optical flow (in OpenCV, the pyramidal Lucas-Kanade method) you don't necessarily need good-features-to-track to get the positions.
The calcOpticalFlowPyrLK function allows you to estimate the motion at predefined positions, and these can be provided by you.
So just initialize a grid of cv::Point2f yourself, e.g. create a list of points, set the positions to the grid points located on your blobs, and run calcOpticalFlowPyrLK().
The idea of the good-features-to-track method is that it gives you the points where the calcOpticalFlowPyrLK() result is more likely to be accurate, which is at image locations with corner-like structures. But in my experience this does not always give the optimal feature point set; I prefer to use regular grids as feature point sets.
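A rough sketch of what that can look like in Python; the file names, the grid spacing and the mask handling are placeholder choices and would need adapting to your background subtraction output:

    import cv2
    import numpy as np

    prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)   # placeholder frames
    curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
    mask = cv2.imread("fg_mask.png", cv2.IMREAD_GRAYSCALE)  # foreground mask from background subtraction

    # Build a regular grid of candidate points.
    step = 8
    ys, xs = np.mgrid[step // 2:prev.shape[0]:step, step // 2:prev.shape[1]:step]
    pts = np.stack([xs.ravel(), ys.ravel()], axis=-1).astype(np.float32)

    # Keep only the grid points that fall on the foreground mask.
    pts = pts[mask[pts[:, 1].astype(int), pts[:, 0].astype(int)] > 0]
    p0 = pts.reshape(-1, 1, 2)

    # Track the grid points with pyramidal Lucas-Kanade.
    p1, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None)
    vectors = (p1 - p0).reshape(-1, 2)[status.ravel() == 1]  # motion vectors, e.g. for clustering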

How does KLT work in OpenCV?

I am curious about the logic behind KLT in openCV.
From what I have learned so far, the images sent to the optical flow functions in OpenCV are first converted to grayscale.
What I am curious about is that, when running the algorithm, we need a set of features for the computation. What features are used by the optical flow method in OpenCV?
Thank you :)
There are 2 types of optical flow: dense and sparse.
Dense finds the flow for all the pixels, while sparse finds the flow only for selected points.
The selected points may be user-specified, or calculated automatically using any of the feature detectors available in OpenCV. The most common is goodFeaturesToTrack, which finds corners using cornerHarris or cornerMinEigenVal.
The feature list is then passed to the KLT tracker, calcOpticalFlowPyrLK.
A feature can be any point in the image; the most common features are corners and edges.
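A small sketch of how that choice looks in the Python API (file names and parameters are placeholders): goodFeaturesToTrack uses the min-eigenvalue (Shi-Tomasi) score by default and switches to the Harris score with useHarrisDetector=True.

    import cv2

    prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)  # placeholder frames
    curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

    # Corners scored with cornerMinEigenVal (default); set useHarrisDetector=True
    # to score them with cornerHarris instead.
    p0 = cv2.goodFeaturesToTrack(prev, maxCorners=100, qualityLevel=0.01,
                                 minDistance=10, useHarrisDetector=False)

    # The corner list is then handed to the pyramidal KLT tracker.
    p1, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None)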

