Can OpenCV be used to calculate dense optical flow using the Lucas-Kanade method? I am aware of a function in the gpu/ocl module that can do that (gpu::PyrLKOpticalFlow::dense), but is there a non-GPU equivalent of that function?
I'm also aware of Farneback and TV-L1, but I need LK / pyramidal LK for my research.
No. Actually, there is no good dense optical flow extraction method. I'm facing the same problem (particle advection on optical flow, right?).
There is a function that evaluates optical flow with the Farneback method [1], but it gives me bad results. It uses neither ocl nor gpu.
You may try phaseCorrelate to extract it with a shift-based algorithm. I've used this method. When I upload it to GitHub I'll give you the link.
[EDIT]
Here is the code. I've decided to separate the phase correlation algorithm from the whole project, to make it simpler to understand:
https://github.com/MatteoRagni/OpticalFlow
Please star it, if you intend to use it.
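Independent of that repository, here is a minimal sketch of the shift-based idea (my own illustration, not the repository's code): estimate one displacement per block with cv::phaseCorrelate, which gives a coarse block-wise flow field. The block size of 64 and the function name are arbitrary choices.

#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// Coarse block-wise flow: one 2-D shift per block, estimated by phase correlation.
cv::Mat blockwiseFlow(const cv::Mat& prevGray, const cv::Mat& nextGray, int block = 64)
{
    cv::Mat prevF, nextF;
    prevGray.convertTo(prevF, CV_32F);   // phaseCorrelate expects single-channel float input
    nextGray.convertTo(nextF, CV_32F);

    cv::Mat flow(prevGray.rows / block, prevGray.cols / block, CV_32FC2, cv::Scalar::all(0));
    for (int y = 0; y + block <= prevF.rows; y += block)
        for (int x = 0; x + block <= prevF.cols; x += block)
        {
            cv::Rect roi(x, y, block, block);
            cv::Point2d shift = cv::phaseCorrelate(prevF(roi), nextF(roi));
            flow.at<cv::Point2f>(y / block, x / block) = cv::Point2f((float)shift.x, (float)shift.y);
        }
    return flow;  // interpolate/upsample this grid if a per-pixel field is needed
}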
You can find the documentation for OpenCV's non-GPU video analysis functionality here.
There is an implementation of the sparse iterative Lucas-Kanade method with pyramids (specifically, from this paper). The function is called calcOpticalFlowPyrLK, and you build the associated pyramid(s) via buildOpticalFlowPyramid. Note, however, that it does specify that it's for sparse feature sets, so I don't know how much of a difference that'll make for you if you need dense optical flow.
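For completeness, a minimal sketch of that call (window size and pyramid depth are just the documented defaults): track corners found by goodFeaturesToTrack from one grayscale frame to the next. To approximate a dense field you could seed prevPts with a regular grid of points instead, at a considerable cost in speed.

#include <opencv2/imgproc.hpp>
#include <opencv2/video/tracking.hpp>
#include <vector>

void sparsePyrLK(const cv::Mat& prevGray, const cv::Mat& nextGray)
{
    std::vector<cv::Point2f> prevPts, nextPts;
    std::vector<uchar> status;
    std::vector<float> err;

    cv::goodFeaturesToTrack(prevGray, prevPts, 500, 0.01, 10);   // points worth following
    if (prevPts.empty()) return;

    cv::calcOpticalFlowPyrLK(prevGray, nextGray, prevPts, nextPts,
                             status, err, cv::Size(21, 21), 3);  // 21x21 window, 3 pyramid levels
    // status[i] == 1 means prevPts[i] was tracked successfully to nextPts[i]
}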
I am currently using a CNN-based object detection module which gives me objects that I then use as input for tracking with OpenCV. The object detection module has produced rectangles until now, but I want to shift to a segmentation module like Mask R-CNN, which outputs masks along with rectangles for each object. Masks are a more accurate representation of an object. All the trackers in OpenCV take rectangles as input. Is there any way to use the masks for tracking an object rather than the boxes? I can convert the masks to contours if that will help me track the object.
Sorry, there is no built-in, out-of-the-box solution in OpenCV for active contour models.
This segmentation model is widely used in computer vision problems (it was proposed by Kass in 1988 and is the starting point for other energy-based segmentation models like level-set models, geodesic active contours, or the fuzzy-snake model).
So, when trying to perform active contour segmentation with OpenCV, there are several solutions, but I think you must understand the mathematical model in order to be able to set the parameters properly according to the context of the application.
There is a nice implementation (a bit obfuscated) by Eric Yuan.
And other implementations from SO that could help you link theory and implementation:
Solution 1
Solution 2
My advice:
Read the original paper to understand the parameters.
Test some examples on Matlab to play a bit with parameters and results.
Test some of the implementations using OpenCV that are linked here.
Determine the best parameters for your problem context and test them.
Think about contributing to OpenCV with your results.
Active contours can track using contours as input: https://www.ee.iitb.ac.in/uma/~krishnan/research.html
So you initialize the first frame using the contour from the CNN model, and in subsequent frames you don't need to call the expensive forward pass; you can instead update the contour to a new one based on this model.
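As a hedged sketch of that initialization step (my own illustration, not part of the linked model): convert the binary mask from the segmentation module into a contour with OpenCV, and use the largest external contour as the initial object boundary. The active-contour update itself would come from the linked model, since OpenCV has no built-in snake.

#include <opencv2/imgproc.hpp>
#include <vector>

// mask8u: CV_8U mask from the segmentation module, non-zero pixels = object
std::vector<cv::Point> maskToInitContour(const cv::Mat& mask8u)
{
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(mask8u, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    if (contours.empty()) return {};

    size_t best = 0;                                  // keep the largest external contour
    for (size_t i = 1; i < contours.size(); ++i)
        if (cv::contourArea(contours[i]) > cv::contourArea(contours[best]))
            best = i;
    return contours[best];                            // initial boundary for the contour tracker
}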
I've been playing with the excellent GPUImage library, which implements several feature detectors: Harris, FAST, Shi-Tomasi, Noble. However, none of those implementations help with the feature extraction and matching part. They simply output a set of detected corner points.
My understanding (which is shaky) is that the next step would be to examine each of those detected corner points and extract the feature from them, which would result in a descriptor - i.e., a 32- or 64-bit number that could be used to index the point near other, similar points.
From reading Chapter 4.1 of [Computer Vision Algorithms and Applications, Szeliski], I understand that using a best-bin-first approach would help to efficiently find neighbouring features to match, etc. However, I don't actually know how to do this and I'm looking for some example code that does this.
I've found this project [https://github.com/Moodstocks/sift-gpu-iphone] which claims to implement as much as possible of the feature extraction in the GPU. I've also seen some discussion that indicates it might generate buggy descriptors.
And in any case, that code doesn't go on to show how the extracted features would be best matched against another image.
My use case is trying to find objects in an image.
Does anyone have any code that does this, or at least a good implementation that shows how the extracted features are matched? I'm hoping not to have to rewrite the whole set of algorithms.
thanks,
Rob.
First, you need to be careful with SIFT implementations, because the SIFT algorithm is patented and the owners of those patents require license fees for its use. I've intentionally avoided using that algorithm for anything as a result.
Finding good feature detection and extraction methods that also work well on a GPU is a little tricky. The Harris, Shi-Tomasi, and Noble corner detectors in GPUImage are all derivatives of the same base operation, and probably aren't the fastest way to identify features.
As you can tell, my FAST corner detector isn't operational yet. The idea there is to use a lookup texture based on a local binary pattern (why I built that filter first to test the concept), and to have that return whether it's a corner point or not. That should be much faster than the Harris, etc. corner detectors. I also need to finish my histogram pyramid point extractor so that feature extraction isn't done in an extremely slow loop on the GPU.
The use of a lookup texture for a FAST corner detector is inspired by this paper by Jaco Cronje on a technique they refer to as BFROST. In addition to using the quick, texture-based lookup for feature detection, the paper proposes using the binary pattern as a quick descriptor for the feature. There's a little more to it than that, but in general that's what they propose.
Feature matching is done by Hamming distance, but while there are quick CPU-side and CUDA instructions for calculating that, OpenGL ES doesn't have one. A different approach might be required there. Similarly, I don't have a good solution for finding a best match between groups of features beyond something CPU-side, but I haven't thought that far yet.
It is a primary goal of mine to have this in the framework (it's one of the reasons I built it), but I haven't had the time to work on this lately. The above are at least my thoughts on how I would approach this, but I warn you that this will not be easy to implement.
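As a CPU-side illustration of the Hamming-distance matching mentioned above (not part of GPUImage, and assuming a GCC/Clang compiler for __builtin_popcountll), brute-force matching of binary descriptors looks roughly like this:

#include <array>
#include <cstdint>
#include <vector>

using Desc = std::array<uint64_t, 4>;  // a 256-bit binary descriptor packed into four 64-bit words

inline int hammingDistance(const Desc& a, const Desc& b)
{
    int d = 0;
    for (size_t i = 0; i < a.size(); ++i)
        d += __builtin_popcountll(a[i] ^ b[i]);  // count differing bits per word
    return d;
}

// index of the closest descriptor in 'train' for a given 'query' (assumes 'train' is non-empty)
size_t bestMatch(const Desc& query, const std::vector<Desc>& train)
{
    size_t best = 0;
    int bestDist = hammingDistance(query, train[0]);
    for (size_t i = 1; i < train.size(); ++i)
    {
        int d = hammingDistance(query, train[i]);
        if (d < bestDist) { bestDist = d; best = i; }
    }
    return best;
}

A brute-force pass like this is quadratic in the number of features, which is why some approximate nearest-neighbour structure is usually wanted for larger feature sets.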
For object recognition these days (as of a couple of weeks ago), it's best to use TensorFlow / convolutional neural networks for this.
Apple recently added some Metal sample code: https://developer.apple.com/library/content/samplecode/MetalImageRecognition/Introduction/Intro.html#//apple_ref/doc/uid/TP40017385
To do feature detection within an image, I draw your attention to the out-of-the-box KAZE/AKAZE algorithm in OpenCV.
http://www.robesafe.com/personal/pablo.alcantarilla/kaze.html
For iOS, I glued the AKAZE class together with another stitching sample to illustrate.
cv::Ptr<cv::AKAZE> detector = cv::AKAZE::create();
std::vector<cv::KeyPoint> keypoints;
detector->detect(mat, keypoints);       // this will find the keypoints
cv::drawKeypoints(mat, keypoints, mat); // draw them onto the input image
// this is the pseudo-SIFT descriptor; one keypoint, printed in the debugger, looks like this:
[255] = {
    pt = (x = 645.707153, y = 56.4605064)
    size = 4.80000019
    angle = 0
    response = 0.00223364262
    octave = 0
    class_id = 0
}
https://github.com/johndpope/OpenCVSwiftStitch
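Since the original question also asks how the extracted features get matched, here is a hedged sketch extending the snippet above: detectAndCompute produces binary AKAZE descriptors for two images, which are matched with a brute-force Hamming matcher and filtered with a Lowe-style ratio test (the 0.8 threshold is an arbitrary choice).

#include <opencv2/features2d.hpp>
#include <vector>

std::vector<cv::DMatch> matchAkaze(const cv::Mat& img1, const cv::Mat& img2)
{
    cv::Ptr<cv::AKAZE> akaze = cv::AKAZE::create();
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    akaze->detectAndCompute(img1, cv::noArray(), kp1, desc1);
    akaze->detectAndCompute(img2, cv::noArray(), kp2, desc2);

    cv::BFMatcher matcher(cv::NORM_HAMMING);          // AKAZE's default descriptor is binary
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(desc1, desc2, knn, 2);

    std::vector<cv::DMatch> good;                     // keep matches that clearly beat the runner-up
    for (const auto& m : knn)
        if (m.size() == 2 && m[0].distance < 0.8f * m[1].distance)
            good.push_back(m[0]);
    return good;
}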
Here is a GPU accelerated SIFT feature extractor:
https://github.com/lukevanin/SIFTMetal
The code is written in Swift 5 and uses Metal compute shaders for most operations (scaling, Gaussian blur, keypoint detection and interpolation, feature extraction). The implementation is largely based on the paper and code from the "Anatomy of the SIFT Method" article published in the Image Processing On Line (IPOL) journal in 2014 (http://www.ipol.im/pub/art/2014/82/). Some parts are based on code by Rob Hess (https://github.com/robwhess/opensift), which I believe is now used in OpenCV.
For feature matching I tried using a k-d tree with the best-bin-first (BBF) method proposed by David Lowe. While BBF does provide some benefit up to about 10 dimensions, with a higher number of dimensions such as that used by SIFT it is no better than quadratic search, due to the "curse of dimensionality". That is to say, if you compare 1,000 descriptors against 1,000 other descriptors, it still ends up making 1,000 x 1,000 = 1,000,000 comparisons - the same as a brute-force pairwise comparison.
In the linked code I use a different approach optimised for performance over accuracy. I use a trie to locate the general vicinity for potential neighbours, then search a fixed number of sibling leaf nodes for the nearest neighbours. In practice this matches about 50% of the descriptors, but only makes 1000 * 20 = 20,000 comparisons - about 50x faster and scales linearly instead of quadratically.
I am still testing and refining the code. Hopefully it helps someone.
When detecting objects using SURF, how can I plot a graph of false positives and hits using the good matches and the many keypoints?
(A) How do I get the statistics of good matches, i.e. an ROC plot or the true positives vs. false positives of detection, from so many of the line descriptors? Can somebody provide code for plotting true-positive vs. false-positive statistics?
(B) Secondly, there are many resources (vdo1, vdo2), implementations, and papers ("Object tracking using improved Camshift with SURF method"; "A Study on Moving Object Tracking Algorithm Using SURF Algorithm and Depth Information") which say that SURF and SIFT can be used for tracking in combination with Camshift or meanshift.
But what I fail to understand is that we need a prediction algorithm like a Kalman filter, or a tracking algorithm like Camshift, mean shift, or template differencing (not sure), for tracking. So how come some video implementations and tutorials say that Lucas-Kanade optical flow, SIFT, or SURF is tracking objects, whereas the papers mention combining them with either Camshift or meanshift? Am I missing out on some conceptual matter?
I would be obliged for pointers and a detailed explanation of how SURF, SIFT, or feature-based methods can be used for tracking alone.
Lucas-Kanade with pyramids (pyrLK) is a method that looks for a small shift in a single feature's location. It can do this for many features at once. Camshift and meanshift track a statistic for a group of features. You can also just try to use a matcher, to find where the features went in the next frame. GoodFeaturesToTrack, SIFT and SURF are algorithms that find points that should be easy to find and tell apart from one another. SURF and SIFT also include descriptors, which characterise those features in a way that can ignore size change, orientation change, or both.
A Kalman filter is used to refine your results. It is able to shrink the area where the answer should lie, because the algorithms above are not perfect.
As for the code, I haven't done much tracking beyond Shi-Tomasi + pyrLK, so I don't think I can help.
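That said, a minimal sketch of the Kalman-filter refinement mentioned above might look like the following; it assumes a constant-velocity 2-D state for a single tracked point (e.g. one pyrLK feature), and the noise values are placeholders rather than tuned settings.

#include <opencv2/video/tracking.hpp>

// state = [x, y, vx, vy], measurement = [x, y]
cv::KalmanFilter makePointFilter(float x0, float y0)
{
    cv::KalmanFilter kf(4, 2, 0);
    kf.transitionMatrix = (cv::Mat_<float>(4, 4) << 1, 0, 1, 0,
                                                    0, 1, 0, 1,
                                                    0, 0, 1, 0,
                                                    0, 0, 0, 1);
    cv::setIdentity(kf.measurementMatrix);
    cv::setIdentity(kf.processNoiseCov, cv::Scalar::all(1e-4));     // placeholder noise levels
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-1));
    cv::setIdentity(kf.errorCovPost, cv::Scalar::all(1));
    kf.statePost = (cv::Mat_<float>(4, 1) << x0, y0, 0, 0);
    return kf;
}

// each frame: predict, then correct with the measured feature position
// cv::Mat predicted = kf.predict();
// cv::Mat refined   = kf.correct((cv::Mat_<float>(2, 1) << measuredX, measuredY));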
I have to train a Support Vector Machine model and I'd like to use a custom kernel matrix, instead of the preset ones (like RBF, Poly, etc.).
How can I do that (if is it possible) with opencv's machine learning library?
Thank you!
AFAICT, custom kernels for SVM aren't supported directly in OpenCV. It looks like LIBSVM, which is the underlying library that OpenCV uses for this, doesn't provide a particularly easy means of defining custom kernels. So, many of the wrappers that use LIBSVM don't provide this either. There seem to be a few, e.g. scikit for python: scikit example of SVM with custom kernel
You could also take a look at a completely different library, like SVMlight. It supports custom kernels directly. Also take a look at this SO question. The answers there include a handful of SVM libraries, along with brief reviews.
If you have compelling reasons to stay within OpenCV, you might be able to accomplish it by using kernel type CvSVM::LINEAR and applying your custom kernel to the data before training the SVM. I'm a little fuzzy on whether this direction would be fruitful, so I hope someone with more experience with SVM can chime in and comment. If it is possible to use a "precomputed kernel" by choosing "linear" as your kernel, then take a look at this answer for more ideas on how to proceed.
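To make that direction concrete, here is a very tentative sketch, written against the newer cv::ml API rather than the old CvSVM class: each training sample is replaced by its row of the Gram matrix computed with your custom kernel, and a linear SVM is trained on those rows. Whether this truly behaves like a proper custom-kernel SVM is exactly the open question above, and at prediction time you would also have to compute kernel values of the test sample against all the training samples.

#include <opencv2/ml.hpp>
#include <functional>

// samples: one training sample per row (CV_32F); labels: CV_32S column of class ids;
// k: your custom kernel evaluated on two sample rows
cv::Ptr<cv::ml::SVM> trainWithPrecomputedKernel(
    const cv::Mat& samples, const cv::Mat& labels,
    const std::function<float(const cv::Mat&, const cv::Mat&)>& k)
{
    const int n = samples.rows;
    cv::Mat gram(n, n, CV_32F);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            gram.at<float>(i, j) = k(samples.row(i), samples.row(j));  // K(i, j)

    cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
    svm->setType(cv::ml::SVM::C_SVC);
    svm->setKernel(cv::ml::SVM::LINEAR);          // linear SVM on top of the kernel rows
    svm->train(gram, cv::ml::ROW_SAMPLE, labels);
    return svm;
}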
You might also consider including LIBSVM and calling it directly, without using OpenCV. See FAQ #418 for LIBSVM, which briefly touches on how to do custom kernels:
Q: I would like to use my own kernel. Any example? In svm.cpp, there are two subroutines for kernel evaluations: k_function() and kernel_function(). Which one should I modify ?
An example is "LIBSVM for string data" in LIBSVM Tools.
The reason why we have two functions is as follows. For the RBF kernel exp(-g |xi - xj|^2), if we calculate xi - xj first and then the norm square, there are 3n operations. Thus we consider exp(-g (|xi|^2 - 2dot(xi,xj) +|xj|^2)) and by calculating all |xi|^2 in the beginning, the number of operations is reduced to 2n. This is for the training. For prediction we cannot do this so a regular subroutine using that 3n operations is needed. The easiest way to have your own kernel is to put the same code in these two subroutines by replacing any kernel.
That last option sounds like a bit of a pain, though. I'd recommend scikit or SVMlight. Best of luck to you!
If you're not married to OpenCV for the SVM stuff, have a look at the shogun toolbox ... lots of SVM voodoo in there.
I would like to know if there is any code or any good documentation available for implementing HOG features? I tried to read the documentation here but it's quite difficult to understand and it needs an SVM.
What I need is just to implement a HOG detector for objects, like what SIFT or SURF does.
By the way, I'm not interested in this work.
Thank you.
You can take a look at
http://szproxy.blogspot.com/2010/12/testtest.html
He also published a "tutorial" for HOG on SourceForge here:
http://sourceforge.net/projects/hogtrainingtuto/?_test=beta
I know this since I'm having the same problem as you. The tutorial, though, isn't what I would call a tutorial; it's a bunch of source code with no documentation, but I assume that it works and can at least get you somewhere.
In the end, and simplifying a bit, all that you need to detect specific objects in an image is:
Localize "points of interest" to extract the patches:
In order to get points of interest, you can use some algorithms like the Harris corner detector, random sampling, or something simple like sliding windows.
From these points get patches:
You will have to decide on the patch size.
From these patches, compute the feature descriptor (like HOG).
Instead of HOG you can use another feature descriptor like SIFT, SURF...
HOG's implementation is not too hard. You have to calculate the gradients of the extracted patch by applying Sobel X and Y kernels; after that you have to divide the patch into NxM cells, 8x8 for instance, and compute a histogram of gradients (angle and magnitude). In the following link you can see a more detailed explanation (there is also a sketch using OpenCV's built-in pieces after this list):
HOG Person Detector Tutorial
Check your feature vector in the previously trained classifier
Once you have this vector, check whether it is the desired object or not with a previously trained classifier like an SVM. Instead of an SVM you could use neural networks, for instance.
Implementing an SVM is more difficult, but there are some libraries like OpenCV that you can use.
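As referenced in the steps above, here is a brief sketch using OpenCV's built-in pieces: cv::HOGDescriptor computes the descriptor of a fixed-size patch, and a previously trained cv::ml::SVM classifies it. The 64x128 patch size simply matches HOGDescriptor's default window, and the SVM is assumed to have been trained elsewhere on descriptors of the same length.

#include <opencv2/objdetect.hpp>
#include <opencv2/ml.hpp>
#include <vector>

// patchGray64x128: a grayscale patch matching the default 64x128 HOG window
float classifyPatch(const cv::Mat& patchGray64x128, const cv::Ptr<cv::ml::SVM>& svm)
{
    cv::HOGDescriptor hog;                       // defaults: 64x128 window, 16x16 blocks, 8x8 cells, 9 bins
    std::vector<float> descriptor;
    hog.compute(patchGray64x128, descriptor);    // gradient histograms for the whole window

    cv::Mat feature(1, (int)descriptor.size(), CV_32F, descriptor.data());
    return svm->predict(feature);                // label from the previously trained classifier
}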
There is a function extractHOGFeatures in the Computer Vision System Toolbox for MATLAB.