I need to develop an image processing program for my project, in which I have to count the number of cars on the road. I am using GPU programming. Should I go for an OpenCV program with its GPU processing features, or should I develop my entire program in CUDA without any OpenCV library?
The algorithms I am using for counting the cars are background subtraction, segmentation and edge detection.
You can use GPU functions in OpenCV.
First, read the introduction here: http://docs.opencv.org/modules/gpu/doc/introduction.html
Secondly, I think the above-mentioned operations are already implemented in OpenCV with GPU-optimized versions, so it will be much easier to develop with OpenCV.
Canny edge detection: http://docs.opencv.org/modules/gpu/doc/image_processing.html#gpu-canny
Per-element operations (including subtraction): http://docs.opencv.org/modules/gpu/doc/per_element_operations.html#per-element-operations
For other functions, visit OpenCV docs.
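As a rough illustration, here is a minimal sketch of that pipeline with the 2.4-era cv::gpu module (frame differencing as the background subtraction step, thresholding for segmentation, then Canny). The file names and threshold values are assumptions, not a complete car-counting solution:

#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>

int main()
{
    // Hypothetical inputs: an empty-road background and the current frame, both grayscale
    cv::Mat background = cv::imread("background.jpg", cv::IMREAD_GRAYSCALE);
    cv::Mat frame      = cv::imread("frame.jpg",      cv::IMREAD_GRAYSCALE);
    if (background.empty() || frame.empty())
        return -1;

    cv::gpu::GpuMat d_bg(background), d_frame(frame);   // upload to the GPU
    cv::gpu::GpuMat d_diff, d_mask, d_edges;

    cv::gpu::absdiff(d_frame, d_bg, d_diff);                         // background subtraction
    cv::gpu::threshold(d_diff, d_mask, 30, 255, cv::THRESH_BINARY);  // segment moving pixels
    cv::gpu::Canny(d_mask, d_edges, 50, 150);                        // edge detection on the GPU

    cv::Mat edges;
    d_edges.download(edges);                                         // copy the result back
    cv::imwrite("edges.png", edges);
    return 0;
}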
OpenCV, no doubt, has the biggest collection of image processing functionality, and recently they've started porting functions to CUDA as well. There's a new GPU module in the latest OpenCV with a few functions ported to CUDA.
That being said, OpenCV is not the best option for building a CUDA-based application, as there are dedicated CUDA libraries, such as CUVI, that beat OpenCV in performance. If you're looking for an optimized solution, you should also give them a try.
I recently finished an algorithm for foreground extraction from video, but it processes each frame too slowly. There is an algorithm based on a Gaussian mixture model named BackgroundSubtractorMOG2 in OpenCV 3.0, and I find it processes each frame nearly 15 times faster than mine. I just wonder whether it is accelerated by OpenCL on the GPU, or whether it just runs on the CPU. P.S. I've looked at some of its source code and noticed there are OpenCL blocks, but I'm not sure, since I'm new to this. I would really appreciate it if anyone could help me figure it out!
If you look at the API page here, you will find the line:
The function implements a sparse iterative version of the Lucas-Kanade optical flow in pyramids. See [Bouguet00]. The function is parallelized with the TBB library.
The TBB library is a parallelization library and is used to "write parallel C++ programs that take full advantage of multicore performance" - this means that it uses more than just one CPU core at a time, which is a much quicker way of processing. This can be seen on lines like this (Line 566):
parallel_for_(Range(0, image.rows),
              MOG2Invoker(image, fgmask,
                          (GMM*)bgmodel.data,
                          (float*)(bgmodel.data + sizeof(GMM)*nmixtures*image.rows*image.cols),
                          bgmodelUsedModes.data, nmixtures, (float)learningRate,
                          (float)varThreshold,
                          backgroundRatio, varThresholdGen,
                          fVarInit, fVarMin, fVarMax, float(-learningRate*fCT), fTau,
                          bShadowDetection, nShadowDetection));
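To make clearer what that call does, here is a minimal, self-contained sketch of the same cv::parallel_for_ pattern with a hypothetical per-row invoker (not the actual MOG2 code). When OpenCV is built with TBB, the row ranges are distributed across CPU cores automatically:

#include <opencv2/core/core.hpp>

// Hypothetical example: invert every pixel, one chunk of rows per worker thread.
struct InvertInvoker : cv::ParallelLoopBody
{
    cv::Mat& img;
    InvertInvoker(cv::Mat& m) : img(m) {}

    void operator()(const cv::Range& range) const
    {
        for (int y = range.start; y < range.end; ++y)
        {
            uchar* row = img.ptr<uchar>(y);
            for (int x = 0; x < img.cols; ++x)
                row[x] = 255 - row[x];
        }
    }
};

int main()
{
    cv::Mat img(480, 640, CV_8UC1, cv::Scalar(100));
    // parallel_for_ splits [0, rows) into ranges and runs the invoker on each,
    // using TBB (or another backend) if OpenCV was built with it.
    cv::parallel_for_(cv::Range(0, img.rows), InvertInvoker(img));
    return 0;
}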
I am going to train my Haar classifier for flowers (which I am highly skeptical about). I have been following the Coding Robin tutorial for everything.
http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html
Now, it has been emphasized that I should use GPU support, multithreading, etc., otherwise the training is going to take days. I am going to use pre-built libraries, and therefore the pre-built opencv_traincascade utility.
I want to ask beforehand: will I be able to leverage GPU support if I use the pre-built libs, given that I install CUDA?
Where does TBB fit in the whole picture?
Do you recommend building the whole library from scratch with TBB and CUDA support enabled, or would that be a waste?
Note: I am using OpenCV 2.4.11. And I am a complete beginner to OpenCV.
As is known, OpenCV 2.4.9 has these feature detectors: SIFT, SURF, BRISK, FREAK, STAR, FAST, ORB.
http://docs.opencv.org/modules/features2d/doc/feature_detection_and_description.html
http://docs.opencv.org/modules/features2d/doc/common_interfaces_of_feature_detectors.html
All of these have CPU implementations, but only FAST and ORB have GPU implementations. http://docs.opencv.org/genindex.html
And as is known, some are scale/rotation-invariant, but some aren't: Are there any fast alternatives to SURF and SIFT for scale-invariant feature extraction?
These are scale-invariant and rotation-invariant:
SIFT
SURF
BRISK
FREAK
STAR
But these are neither scale-invariant nor rotation-invariant:
FAST
ORB
Are there any detectors which are implemented on the GPU and are scale/rotation-invariant?
Or will any be added in OpenCV 3.0 with GPU or OpenCL support?
Actually, SURF is the only scale/rotate-invariant feature detector with GPU support in OpenCV.
In OpenCV 3.0, FAST and ORB have got OpenCL support, and moreover these two (FAST and ORB) have already got CUDA support.
The OpenCL/CUDA support of SURF has already been mentioned in the comments of your question, but it is only a contribution to OpenCV, and this is what OpenCV's developers say about opencv_contrib:
New modules quite often do not have stable API, and they are not well-tested. Thus, they shouldn't be released as a part of official OpenCV distribution, since the library maintains binary compatibility, and tries to provide decent performance and stability.
Based on my previous experience, OpenCV's implementation of SURF features was much weaker than OpenSURF's. It would be reasonable to try that, or to find some other open-source implementations.
P.S.: to my knowledge, there is still no GPU-accelerated version of KAZE/AKAZE.
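For reference, here is a minimal sketch of detecting and describing keypoints with the 2.4-style gpu::SURF_GPU class (it requires OpenCV built with CUDA and the nonfree module; the input file and hessian threshold are assumptions):

#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>
#include <opencv2/nonfree/gpu.hpp>

int main()
{
    cv::Mat img = cv::imread("scene.jpg", cv::IMREAD_GRAYSCALE); // hypothetical input
    cv::gpu::GpuMat d_img(img);                                  // upload to the GPU

    cv::gpu::SURF_GPU surf(400.0);                               // hessian threshold (assumed value)
    cv::gpu::GpuMat d_keypoints, d_descriptors;
    surf(d_img, cv::gpu::GpuMat(), d_keypoints, d_descriptors);  // detect + extract on the GPU

    std::vector<cv::KeyPoint> keypoints;
    surf.downloadKeypoints(d_keypoints, keypoints);              // bring keypoints back to the host
    std::cout << "Found " << keypoints.size() << " keypoints" << std::endl;
    return 0;
}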
I recently implemented AKAZE using CUDA with a couple of colleagues. If you are familiar with the original library, you should have no problem using it, since we respected the API. You can find the current version here:
https://github.com/nbergst/akaze
I have my code written in C++, and I used OpenCV functions for image processing tasks.
I want to run my code on the GPU (using CUDA) to read a camera/stream input and do the image processing tasks on each frame in parallel.
I've read somewhere that I can't include OpenCV functions in .cu code, since NVCC can't compile OpenCV functions (please correct me if this is not true).
I found the OpenCV GPU module in the OpenCV documentation, but I don't want to run just a single function in parallel; I want the whole algorithm to be processed in parallel (in other words, include OpenCV in CUDA, not vice versa). So I've thought about rewriting all of my OpenCV functions in CUDA, but I'm a newbie to CUDA.
My questions:
1- Are there CUDA functions that can be used instead of the following OpenCV functions:
split, inRange
fillHoles
Morphology (erosion, dilation, closing)
Contours (findContours, moments, boundingRect, approxPolyDP)
Drawing functions (drawContours, rectangle, circle)
kmeans (or any other function for clustering)
I found some of them on GitHub, but I still haven't tested any; any documentation would be highly appreciated.
2- Does CUDA read only the .pgm image format, and should I convert the .jpg frames before copying them to the device? Is it possible to read the camera input directly into GPU global memory?
3- Do you suggest keeping my code in OpenCV and using another library for parallel processing, like OpenCL? Or using the CPU (instead of the GPU) for parallel processing with OpenMP? What might be the best option to go with?
Before you go down this route, I would recommend that you go through the first few lessons in this course:
https://www.udacity.com/course/cs344
Then you will have a better idea of whether a GPU is suitable for what your application requires.
In any case, OpenCV 1.0 is mostly in C, and CUDA kernels are written in C, so maybe you could try wrapping some of those functions into CUDA kernels.
Cheers
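To address the "NVCC can't compile OpenCV" worry directly: the usual approach is to keep all OpenCV calls in ordinary .cpp files and put only the kernels (plus a thin launcher) in .cu files, passing raw pixel pointers between the two. A minimal sketch, with hypothetical file names and a trivial invert kernel standing in for real processing:

// kernel.cu -- compiled by nvcc; no OpenCV headers here
#include <cuda_runtime.h>

__global__ void invert_kernel(unsigned char* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = 255 - data[i];
}

extern "C" void invert_on_gpu(unsigned char* host_data, int n)
{
    unsigned char* d_data;
    cudaMalloc(&d_data, n);
    cudaMemcpy(d_data, host_data, n, cudaMemcpyHostToDevice);
    invert_kernel<<<(n + 255) / 256, 256>>>(d_data, n);
    cudaMemcpy(host_data, d_data, n, cudaMemcpyDeviceToHost);
    cudaFree(d_data);
}

// main.cpp -- compiled by the host C++ compiler; uses OpenCV freely
#include <opencv2/opencv.hpp>
extern "C" void invert_on_gpu(unsigned char* host_data, int n);

int main()
{
    cv::Mat img = cv::imread("frame.jpg", cv::IMREAD_GRAYSCALE); // any format OpenCV can decode, not just .pgm
    invert_on_gpu(img.data, int(img.total()));                   // hand the raw pixel buffer to the CUDA side
    cv::imwrite("inverted.jpg", img);
    return 0;
}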
I'm using OpenCV 2.4.6 in an object detection prototype, and I was wondering how to improve the feature detection/extraction performance. Does anyone know if it is possible to run feature detection/extraction/matching, like SIFT/SIFT/BF, or even findHomography, on the GPU?
Thanks
The OpenCV GPU module contains implementations of the FAST, ORB and SURF feature detectors/extractors, and of BruteForceMatcher.
You can read more in documentation:
http://docs.opencv.org/2.4.6/modules/gpu/doc/feature_detection_and_description.html
http://docs.opencv.org/2.4.6/modules/nonfree/doc/feature_detection.html#gpu-surf-gpu
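Putting those together, here is a minimal sketch of a GPU detection/matching pipeline in 2.4 (SURF instead of SIFT, since SIFT has no GPU implementation; file names and the hessian threshold are assumptions, and findHomography itself still runs on the CPU):

#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/gpu/gpu.hpp>
#include <opencv2/nonfree/gpu.hpp>

int main()
{
    cv::Mat img1 = cv::imread("object.jpg", cv::IMREAD_GRAYSCALE); // hypothetical inputs
    cv::Mat img2 = cv::imread("scene.jpg",  cv::IMREAD_GRAYSCALE);
    cv::gpu::GpuMat d_img1(img1), d_img2(img2);

    // Detection and description on the GPU (hessian threshold is an assumed value)
    cv::gpu::SURF_GPU surf(400.0);
    cv::gpu::GpuMat d_kp1, d_desc1, d_kp2, d_desc2;
    surf(d_img1, cv::gpu::GpuMat(), d_kp1, d_desc1);
    surf(d_img2, cv::gpu::GpuMat(), d_kp2, d_desc2);

    // Brute-force matching on the GPU
    cv::gpu::BFMatcher_GPU matcher(cv::NORM_L2);
    std::vector<cv::DMatch> matches;
    matcher.match(d_desc1, d_desc2, matches);

    // Keypoints back on the host; the homography is estimated on the CPU
    std::vector<cv::KeyPoint> kp1, kp2;
    surf.downloadKeypoints(d_kp1, kp1);
    surf.downloadKeypoints(d_kp2, kp2);

    std::vector<cv::Point2f> pts1, pts2;
    for (size_t i = 0; i < matches.size(); ++i)
    {
        pts1.push_back(kp1[matches[i].queryIdx].pt);
        pts2.push_back(kp2[matches[i].trainIdx].pt);
    }
    cv::Mat H = cv::findHomography(pts1, pts2, CV_RANSAC);
    std::cout << "Homography:\n" << H << std::endl;
    return 0;
}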