CUDA Performance comparison for Computer Vision applications - opencv

I am currently working on a performance comparison of various computer vision applications. The research evaluates how different algorithms perform on CUDA and OpenMP.
Do you have any source code in CUDA, as well as a serial implementation in C, for these kinds of applications?
Where can I find them?

The CUDA SDK is full of examples, compiled for both GPU and CPU.
Sources are included.
Here is a list of the samples you get by installing it.
You could start from here :)

Related

What platform to use for YOLO output when using AMD GPU?

I have been struggling with this question for a long time and would like advice on which direction to take. The objective is to develop a universal YOLO application on Windows that can use the computing power of an AMD/Nvidia/Intel GPU or an AMD/Intel CPU (one of these devices will be used). As far as I know, the OpenCV DNN module leads in CPU computation; I plan to use DNN + CUDA for Nvidia graphics cards and DNN + OpenCL for Intel GPUs. But when testing an AMD RX 580 GPU with DNN + OpenCL, I ran into the following problem: https://github.com/opencv/opencv/issues/17656. Does this module not support AMD GPU computing at all? If so, could you please tell me which platform makes this possible, preferably as efficiently as possible? A possible solution might be Tencent's ncnn, but I'm not sure about its performance on the desktop. By output I mean the coordinates of detected objects and their names (in the OpenCV DNN module I got them with cv::dnn::Net::forward()). Also, correct me if I'm wrong anywhere. Any feedback would be appreciated.
I tried the OpenCV DNN + OpenCL module and expected high performance, but this combination does not work.
I believe OpenCV doesn't support AMD for GPU optimization. If you're interested in running DL models on non-Nvidia GPUs, I suggest looking into PlaidML, YOLO-OpenCL, and DeepCL.
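For the output format asked about in the question, here is a minimal sketch of how the raw rows returned by a forward pass could be turned into object names and coordinates. It assumes the Darknet YOLOv3 row layout [cx, cy, w, h, objectness, class scores...] with box values normalized to the image size; other YOLO versions lay out their output differently. The function decode_detections and its parameters are hypothetical names chosen for illustration. It is plain Python and does not depend on which backend produced the numbers.

```python
# Sketch: turn raw YOLO-style output rows into named boxes.
# Assumption: each row is [cx, cy, w, h, objectness, score_0, score_1, ...]
# with cx, cy, w, h normalized to [0, 1].

def decode_detections(rows, class_names, img_w, img_h, conf_threshold=0.5):
    """Return a list of (name, confidence, (left, top, width, height))."""
    detections = []
    for row in rows:
        cx, cy, w, h = row[:4]
        scores = row[5:]  # index 4 (objectness) is not used in this sketch
        # Pick the class with the highest score.
        class_id = max(range(len(scores)), key=lambda i: scores[i])
        confidence = scores[class_id]
        if confidence < conf_threshold:
            continue
        # Convert normalized center/size to a pixel top-left corner and size.
        width = int(w * img_w)
        height = int(h * img_h)
        left = int(cx * img_w - width / 2)
        top = int(cy * img_h - height / 2)
        detections.append(
            (class_names[class_id], confidence, (left, top, width, height))
        )
    return detections
```

In practice overlapping boxes would still need non-maximum suppression, which is left out here to keep the sketch short.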

Hardware optimizations using Qualcomm Snapdragon 800 and Adreno 330

I am developing a real-time computer vision project that runs on an Ubuntu (Linaro) board with an ARM CPU (Snapdragon 800).
Some parts of the software operate on HD images, which is a huge amount of data. This slows execution and acts as a bottleneck.
These operations include:
Finding all local minimum and maximum values in a 2D array (image). Currently, it is implemented in the naive, trivial way.
Building a KD-Tree and performing a K-Nearest-Neighbors search. This is currently done using the FLANN library included in OpenCV.
I am looking for ways to utilize the available Adreno 330 GPU, and accelerate these computations.
I was looking at OpenCL, but I found out that it is supported on the Adreno 330 only as an "embedded profile". I do not know what that is or how it affects things.
I have also heard about NEON in ARM processors, but I do not know whether it would be of any use to me.
Any help, tips and links will be appreciated.
Thanks,
Avi
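The naive local-extrema scan mentioned in the question can be sketched as follows. This is an assumption about what "naive" means here (a strict comparison of each interior pixel against its 8 neighbors), not the asker's actual code, and local_extrema is a hypothetical name. It is written in plain Python for readability; the point is that each pixel's test reads only its own 3x3 neighborhood, so the pixels can be processed independently, which is the property that data-parallel hardware exploits.

```python
def local_extrema(image):
    """Return (minima, maxima) as lists of (row, col) for interior pixels.

    A pixel is a strict local minimum (maximum) if it is smaller (larger)
    than all 8 of its neighbors. Border pixels are skipped.
    """
    minima, maxima = [], []
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            neighbors = [
                image[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            ]
            if all(center < n for n in neighbors):
                minima.append((r, c))
            elif all(center > n for n in neighbors):
                maxima.append((r, c))
    return minima, maxima
```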

SIFT hardware accelerator for smartphones

I'm a recently graduated electronics engineer with some experience in computer vision. I want to ask whether it's feasible to build a hardware accelerator for the SIFT algorithm, or any other OpenCV algorithm, to be used on smartphones instead of the current software implementation.
What are the advantages (much less computation, lower power, more complex applications becoming possible, ...) and the disadvantages (no better than the current software implementation, ...)?
Do you have any insight on this?
Thanks
You might be interested in NEON optimizations. NEON is ARM's SIMD instruction set extension, supported by chips such as Nvidia's Tegra 3. Some OpenCV functions are NEON optimized.
Start by reading this nice article, Realtime Computer Vision with OpenCV; it has performance comparisons about using NEON, etc.
I also recommend starting here and here; you will find great insights.
OpenCV supports both CUDA and (experimentally) OpenCL.
There are specific optimizations for Nvidia's Tegra chipset, which is used in a lot of phones and tablets. I don't know if any phones use OpenCL.

Image Processing on CUDA or OpenCV?

I need to develop an image processing program for my project in which I have to count the number of cars on the road. I am using GPU programming. Should I use OpenCV with its GPU processing features, or should I develop my entire program in CUDA without any OpenCV library?
The algorithms I am using for counting the cars are background subtraction, segmentation and edge detection.
You can use GPU functions in OpenCV.
First, see the introduction: http://docs.opencv.org/modules/gpu/doc/introduction.html
Secondly, I think the processes mentioned above are already implemented in OpenCV and optimized for GPU, so it will be much easier to develop with OpenCV.
Canny Edge Detection : http://docs.opencv.org/modules/gpu/doc/image_processing.html#gpu-canny
PerElement Operations (including subtraction): http://docs.opencv.org/modules/gpu/doc/per_element_operations.html#per-element-operations
For other functions, visit OpenCV docs.
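To make the pipeline from the question concrete, here is a minimal CPU-only sketch of two of its stages: background subtraction (an absolute difference against a reference frame, then a threshold) and segmentation (counting connected foreground regions). It is plain Python on nested lists and does not use the OpenCV GPU module. The function names, the threshold of 30 and the min_area filter are illustrative assumptions, and edge detection is left out.

```python
from collections import deque

def foreground_mask(frame, background, threshold=30):
    """Mark pixels whose absolute difference from the background exceeds threshold."""
    return [
        [abs(f - b) > threshold for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]

def count_blobs(mask, min_area=1):
    """Count 4-connected foreground regions with at least min_area pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected region.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Small regions are treated as noise and ignored.
            if area >= min_area:
                count += 1
    return count
```

Both stages are per-pixel or per-region operations on a grayscale frame, so a count of blobs only approximates a count of cars when vehicles do not overlap in the image.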
OpenCV, no doubt, has the biggest collection of image processing functionality, and recently they've started porting functions to CUDA as well. There's a new GPU module in the latest OpenCV with a few functions ported to CUDA.
That said, OpenCV is not the best option for building a CUDA-based application, as there are many dedicated CUDA libraries like CUVI that beat OpenCV in performance. If you're looking for an optimized solution, you should give them a try as well.

Is OpenCV 2.0 optimized for AMD processors?

I know that in the past OpenCV was based on IPP and was optimized only for Intel CPUs. Is this still the case with OpenCV 2.0?
History says that OpenCV was originally developed by Intel.
If you check the OpenCV FAQ, it says:
OpenCV itself is open source and written in quite portable C/C++, it runs on other processors already and should be fairly easy to port (for example, there are already some CUDA optimizations on NVidia). On the other hand, OpenCV can sometimes run much faster on Intel processors (and sometimes AMD) because it can take advantage of SSE optimizations. OpenCV can be compiled statically with IPP libraries from Intel also which can speed up some function.
I have used it on other processors and different OS and I've always been very happy, including for video processing applications.

Resources