Is it possible to update an existing device driver to a new model of the same device? - device-driver

I compiled the radeonhd device driver for Radeon from source:
GitHub Source
Upstream Source
but the driver did not detect my device.
Here is a portion of /var/log/Xorg.0.log:
[ 3766.775] (II) LoadModule: "radeonhd"
[ 3766.775] (II) Loading /usr/lib/xorg/modules/drivers/radeonhd_drv.so
[ 3766.775] (II) Module radeonhd: vendor="AMD GPG"
[ 3766.775] compiled for 1.12.4, module version = 1.3.0
[ 3766.775] Module class: X.Org Video Driver
[ 3766.775] ABI class: X.Org Video Driver, version 12.1
[ 3766.775] (II) RADEONHD: X driver for the following AMD GPG (ATI) graphics devices:
[ 3766.775] RV505 : Radeon X1550, X1550 64bit.
RV515 : Radeon X1300, X1550, X1600; FireGL V3300, V3350.
RV516 : Radeon X1300, X1550, X1550 64-bit, X1600; FireMV 2250.
R520 : Radeon X1800; FireGL V5300, V7200, V7300, V7350.
RV530 : Radeon X1300 XT, X1600, X1600 Pro, X1650; FireGL V3400, V5200.
RV535 : Radeon X1300, X1650.
RV550 : Radeon X2300 HD.
RV560 : Radeon X1650.
RV570 : Radeon X1950, X1950 GT; FireGL V7400.
R580 : Radeon X1900, X1950; AMD Stream Processor.
[ 3766.775] R600 : Radeon HD 2900 GT/Pro/XT; FireGL V7600/V8600/V8650.
RV610 : Radeon HD 2350, HD 2400 Pro/XT, HD 2400 Pro AGP; FireGL V4000.
RV620 : Radeon HD 3450, HD 3470.
RV630 : Radeon HD 2600 LE/Pro/XT, HD 2600 Pro/XT AGP; Gemini RV630;
FireGL V3600/V5600.
RV635 : Radeon HD 3650, HD 3670.
RV670 : Radeon HD 3690, 3850, HD 3870, FireGL V7700, FireStream 9170.
R680 : Radeon HD 3870 X2.
[ 3766.775] M52 : Mobility Radeon X1300.
M54 : Mobility Radeon X1400; M54-GL.
M56 : Mobility Radeon X1600; Mobility FireGL V5200.
M58 : Mobility Radeon X1800, X1800 XT; Mobility FireGL V7100, V7200.
M62 : Mobility Radeon X1350.
M64 : Mobility Radeon X1450, X2300.
M66 : Mobility Radeon X1700, X1700 XT; FireGL V5250.
M68 : Mobility Radeon X1900.
[ 3766.775] M71 : Mobility Radeon HD 2300.
M72 : Mobility Radeon HD 2400; Radeon E2400.
M74 : Mobility Radeon HD 2400 XT.
M76 : Mobility Radeon HD 2600;
(Gemini ATI) Mobility Radeon HD 2600 XT.
[ 3766.775] M82 : Mobility Radeon HD 3400.
M86 : Mobility Radeon HD 3650, HD 3670, Mobility FireGL V5700.
M88 : Mobility Radeon HD 3850, HD 3850 X2, HD 3870, HD3870 X2.
[ 3766.775] RS600 : Radeon Xpress 1200, Xpress 1250.
RS690 : Radeon X1200, X1250, X1270.
RS740 : RS740, RS740M.
RS780 : Radeon HD 3100/3200/3300 Series.
[ 3766.775] R700 : Radeon R700.
RV710 : Radeon HD4570, HD4350.
RV730 : Radeon HD4670, HD4650.
RV740 : Radeon HD4770. EXPERIMENTAL AND UNTESTED.
RV770 : Radeon HD 4800 Series; Everest, K2, Denali ATI FirePro.
RV790 : Radeon HD 4890.
[ 3766.776] M92 : Mobility Radeon HD4330, HD4530, HD4570. EXPERIMENTAL.
M93 : Mobility Radeon M93. EXPERIMENTAL AND UNTESTED.
M96 : Mobility Radeon HD4600.
M97 : Mobility Radeon HD4860. EXPERIMENTAL AND UNTESTED.
M98 : Mobility Radeon HD4850, HD4870.
[ 3766.776]
[ 3766.776] (II) RADEONHD: version 1.3.0, built from git branch master, commit 76cdcba6
[ 3766.776] (--) using VT number 7
[ 3766.779] (EE) No devices detected.
[ 3766.780]
Fatal server error:
[ 3766.780] no screens found
[ 3766.780]
Please consult the The X.Org Foundation support
at http://wiki.x.org
for help.
[ 3766.780] Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[ 3766.780]
I think my model is Radeon 7480-related.
Is there any way to extend the driver to cover it?
I am a newbie to device driver programming.
Here is part of the lspci output:
00:01.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI Device 9993
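For context on what extending such a driver involves: an Xorg video driver claims hardware by matching the PCI vendor/device ID (the "Device 9993" in the lspci line above) against a built-in chipset table, so at minimum the new ID must be added to that table. The following is a purely illustrative sketch; the type and entry names are hypothetical and are not taken from the radeonhd source.
/* Hypothetical chipset table of the kind an Xorg video driver matches against. */
#define PCI_VENDOR_ATI 0x1002

typedef struct {
    unsigned short vendor;  /* PCI vendor ID */
    unsigned short device;  /* PCI device ID, as reported by lspci */
    const char    *name;
} ChipsetEntry;

static const ChipsetEntry chipsets[] = {
    { PCI_VENDOR_ATI, 0x94C1, "RV610 : Radeon HD 2400 XT" },
    /* ... existing entries ... */
    { PCI_VENDOR_ATI, 0x9993, "device reported by lspci above" }, /* new ID */
    { 0, 0, NULL }
};
Note that a table entry only makes the driver claim the device. A chip from a newer family than anything in the driver's list (the log above stops at the HD 4000 series, several generations before a Radeon 7480) also needs new mode-setting and memory-management code, so appending the ID alone is unlikely to be enough.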

Related

Nvidia codec SDK samples: can't decode an encoded file correctly

I'm trying out the sample applications in the Nvidia video codec sdk, and am having trouble getting a useable decoded result.
My input file is YUV 4:2:0, taken from here, which is 352x288px.
I'm encoding using the AppEncD3D12.exe sample, with the following command:
.\AppEncD3D12.exe -i D:\akiyo_cif.y4m -s 352x288 -o D:\akiyo_out.mp4
This gives the output
GPU in use: NVIDIA GeForce RTX 2080 Super with Max-Q Design
[INFO ][17:46:39] Encoding Parameters:
codec : h264
preset : p3
tuningInfo : hq
profile : (default)
chroma : yuv420
bitdepth : 8
rc : vbr
fps : 30/1
gop : 250
bf : 1
multipass : 0
size : 352x288
bitrate : 0
maxbitrate : 0
vbvbufsize : 0
vbvinit : 0
aq : disabled
temporalaq : disabled
lookahead : disabled
cq : 0
qmin : P,B,I=0,0,0
qmax : P,B,I=0,0,0
initqp : P,B,I=0,0,0
Total frames encoded: 112
Saved in file D:\akiyo_out.mp4
Which looks promising. However, using the decode sample, a single frame of the output contains what look like 12 smaller frames of the input, in monochrome.
I'm running the decode sample like this:
PS D:\Nvidia\Video_Codec_SDK_11.1.5\Samples\build\Debug> .\AppDecD3D.exe -i D:\akiyo_out.mp4
GPU in use: NVIDIA GeForce RTX 2080 Super with Max-Q Design
Display with D3D9.
[INFO ][17:58:58] Media format: raw H.264 video (h264)
Session Initialization Time: 23 ms
[INFO ][17:58:58] Video Input Information
Codec : AVC/H.264
Frame rate : 30000/1000 = 30 fps
Sequence : Progressive
Coded size : [352, 288]
Display area : [0, 0, 352, 288]
Chroma : YUV 420
Bit depth : 8
Video Decoding Params:
Num Surfaces : 7
Crop : [0, 0, 0, 0]
Resize : 352x288
Deinterlace : Weave
Total frame decoded: 112
Session Deinitialization Time: 8 ms
I'm quite new to this, so I could be doing something stupid. Right now I don't know whether to look at encode or decode! Any ideas or tips are most appreciated.
I've tried other YUV files with the same result. I read that 4:2:2 is not supported; the file above is 4:2:0.
Using the AppEncCuda sample, the decoded video (played with AppDecD3D.exe) is the correct size and in colour, but it appears to scroll to the right as it plays, with the colour information not scrolling at the same rate as the image.
You have two problems:
1. According to the code and comments in the AppEncD3D12 sample, it expects its input frames in ARGB format, but your input file is YUV, so the sample reads data from the YUV file and treats it as ARGB. If you want AppEncD3D12 to work with this file, you need to either convert each YUV frame to ARGB or change the code to accept YUV input. The AppEncCuda sample expects YUV input, which is why it gives you better results. You can also see this in the frame counts: AppEncD3D12 encoded a total of 112 frames while AppEncCuda encoded 300, because a YUV 4:2:0 frame is smaller than an ARGB frame, so the same file holds more of them.
2. Both samples save the output as raw H.264. The file is not really an MP4, despite the name you gave it. A few players can play raw H.264 data, and you can try one of them on the output file. Another option is to use FFmpeg to create a valid MP4 file from the raw H.264 stream: the NVIDIA encoder compresses the video, but it does not handle the creation of container files (there are too many container types: AVI, MPG, MP4, MKV, TS, etc.), so you should use FFmpeg or another muxer for that. The SDK samples contain a file FFmpegStreamer.h under the Utils folder that shows how to use FFmpeg to write H.264 video in MPEG-2 transport stream format to a file (*.ts) or to the network. See the example commands below.
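As a concrete sketch of both fixes (file names follow the question, and the commands assume headerless raw YUV input; adjust the flags for your FFmpeg build). First, convert the raw 4:2:0 frames to ARGB so AppEncD3D12 can consume them:
ffmpeg -f rawvideo -pix_fmt yuv420p -s 352x288 -i akiyo_cif.yuv -f rawvideo -pix_fmt argb akiyo_cif.argb
Then mux the encoder's raw H.264 output (the mislabeled .mp4) into a real MP4 container, without re-encoding:
ffmpeg -f h264 -framerate 30 -i akiyo_out.mp4 -c copy akiyo_out_real.mp4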

OpenCL can not detect my AMD GPU using OpenCV

I am using an AMD Radeon R9 M375. I tried following this answer https://stackoverflow.com/a/34250412/8731839 but it didn't work for me.
I followed this: http://answers.opencv.org/question/108646/opencl-can-not-detect-my-nvidia-gpu-via-opencv/?answer=108784#post-id-108784
Here is my output from clinfo.exe
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 2
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: AMD Radeon (TM) R9 M375
Device Topology: PCI[ B#4, D#0, F#0 ]
Max compute units: 10
Max work items dimensions: 3
Max work items[0]: 256
Max work items[1]: 256
Max work items[2]: 256
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 1015Mhz
Address bits: 32
Max memory allocation: 3019898880
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 1024
Alignment (bits) of base address: 2048
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: No
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 3221225472
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Max pipe arguments: 0
Max pipe active reservations: 0
Max pipe packet size: 0
Max global variable size: 0
Max global variable preferred total size: 0
Max read/write image args: 0
Max on device events: 0
Queue on device max size: 0
Max on device queues: 0
Queue on device preferred size: 0
SVM capabilities:
Coarse grain buffer: No
Fine grain buffer: No
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: No
Profiling : No
Platform ID: 00007FFF209D0188
Name: Capeverde
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 1.2
Driver version: 2348.3
Profile: FULL_PROFILE
Version: OpenCL 1.2 AMD-APP (2348.3)
Extensions: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing
cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing
cl_khr_image2d_from_buffer cl_khr_spir cl_khr_gl_event cl_amd_liquid_flash
Device Type: CL_DEVICE_TYPE_CPU
Vendor ID: 1002h
Board name:
Max compute units: 4
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 1024
Preferred vector width char: 16
Preferred vector width short: 8
Preferred vector width int: 4
Preferred vector width long: 2
Preferred vector width float: 8
Preferred vector width double: 4
Native vector width char: 16
Native vector width short: 8
Native vector width int: 4
Native vector width long: 2
Native vector width float: 8
Native vector width double: 4
Max clock frequency: 2200Mhz
Address bits: 64
Max memory allocation: 2147483648
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 64
Max image 2D width: 8192
Max image 2D height: 8192
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 16
Max size of kernel argument: 4096
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 32768
Global memory size: 8499593216
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Global
Local memory size: 32768
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 2147483648
Max global variable size: 1879048192
Max global variable preferred total size: 1879048192
Max read/write image args: 64
Max on device events: 0
Queue on device max size: 0
Max on device queues: 0
Queue on device preferred size: 0
SVM capabilities:
Coarse grain buffer: No
Fine grain buffer: No
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 1
Error correction support: 0
Unified memory for Host and Device: 1
Profiling timer resolution: 465
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: Yes
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: No
Profiling : No
Platform ID: 00007FFF209D0188
Name: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Vendor: GenuineIntel
Device OpenCL C version: OpenCL C 1.2
Driver version: 2348.3 (sse2,avx)
Profile: FULL_PROFILE
Version: OpenCL 1.2 AMD-APP (2348.3)
What works:
#include <opencv2/core/ocl.hpp>
#include <iostream>
#include <vector>

int main() {
    std::vector<cv::ocl::PlatformInfo> platforms;
    cv::ocl::getPlatfomsInfo(platforms); // note: OpenCV's API really spells it "getPlatfomsInfo"
    // OpenCL platforms
    for (size_t i = 0; i < platforms.size(); i++) {
        // Access to platform
        const cv::ocl::PlatformInfo* platform = &platforms[i];
        // Platform name
        std::cout << "Platform Name: " << platform->name().c_str() << "\n";
        // Access devices within the platform
        cv::ocl::Device current_device;
        for (int j = 0; j < platform->deviceNumber(); j++) {
            // Access device
            platform->getDevice(current_device, j);
            // Device type
            int deviceType = current_device.type();
            std::cout << "Device Number: " << platform->deviceNumber() << std::endl;
            std::cout << "Device Type: " << deviceType << std::endl;
        }
    }
    return 0;
}
The above code displays
Platform Name: Intel(R) OpenCL
Device Number: 2
Device Type: 2
Device Number: 2
Device Type: 4
Platform Name: AMD Accelerated Parallel Processing
Device Number: 2
Device Type: 4
Device Number: 2
Device Type: 2
How do I go about creating a Context from here, using AMD as my GPU? The linked post says to use the method initializeContextFromHandler, but the documentation on OpenCV is not sufficient. Documentation Link
Issue is resolved. I don't know what I did, but AMD is working now.
Current settings (on Windows):
Environment variable:
Name: OPENCV_OPENCL_DEVICE
Value: AMD:GPU:Capeverde
Use setUseOpenCL(bool foo), declared in ocl.hpp, to select whether the GPU or the CPU is used.
Most likely problem: in my actual code I wasn't doing any computation, but when I wrote a simple program that subtracts two matrices, AMD started working.
Code:
#include <opencv2/core/ocl.hpp>
#include <opencv2/opencv.hpp>
#include <cstdio>
#include <iostream>

int main() {
    cv::UMat mat1 = cv::UMat::ones(10, 10, CV_32F);
    cv::UMat mat2 = cv::UMat::zeros(10, 10, CV_32F);
    cv::UMat output = cv::UMat(10, 10, CV_32F);
    cv::subtract(mat1, mat2, output);
    // Download to a Mat for printing; operator<< is defined for Mat, not UMat.
    std::cout << output.getMat(cv::ACCESS_READ) << "\n";
    std::getchar();
}
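A quick way to confirm which device OpenCV actually picked (a small sketch using the cv::ocl API; the reported name should match the Capeverde value set in OPENCV_OPENCL_DEVICE above):
#include <opencv2/core/ocl.hpp>
#include <iostream>

int main() {
    cv::ocl::setUseOpenCL(true);
    if (!cv::ocl::haveOpenCL()) {
        std::cout << "OpenCL is not available\n";
        return 1;
    }
    cv::ocl::Device dev = cv::ocl::Device::getDefault();
    std::cout << "Using device: " << dev.name()
              << " (vendor: " << dev.vendorName() << ")\n";
    return 0;
}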

CUDA not running in OpenCV even after successful build

I am trying to build OpenCV 2.4.10 on a Win 8.1 machine with CUDA 6.5. I have other third-party libraries as well, and they installed successfully. I ran a simple GPU-based program and got this error: "No GPU found or the library was compiled without GPU support". I also ran the sample exe files like performance_gpu.exe that were built during the installation, and I got the same error. I also had the WITH_CUDA flag checked. The following are the flags (related to CUDA) that were set during the CMake build.
WITH_CUDA : Checked
WITH_CUBLAS : Checked
WITH_CUFFT : Checked
CUDA_ARCH_BIN : 1.1 1.2 1.3 2.0 2.1(2.0) 3.0 3.5
CUDA_ARCH_PTX : 3.0
CUDA_FAST_MATH : Checked
CUDA_GENERATION : Auto
CUDA_HOST_COMPILER : $(VCInstallDir)bin
CUDA_SEPARABLE_COMPILATION : Unchecked
CUDA_TOOLKIT_ROOT_DIR : C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v6.5
Another thing: in some posts I have read that with CUDA the build takes a lot of time. My build takes ~3 hrs, with most of the time spent compiling the .cu files. As far as I know, I got no errors during the compilation of those files.
In some posts I have seen people talk about a directory named gpu inside the build directory, but I don't see any in mine!
I am using Visual Studio 2013.
What could be the issue? Please help!
UPDATE:
I tried building OpenCV again, and this time, before starting the build, I added CUDA's bin, lib and include directories. After the build, in E:\opencv\build\bin\Release I ran gpu_perf4au.exe and got this output:
[----------]
[ INFO ] Implementation variant: cuda.
[----------]
[----------]
[ GPU INFO ] Run test suite on GeForce GTX 860M GPU.
[----------]
Time compensation is 0
OpenCV version: 2.4.10
OpenCV VCS version: unknown
Build type: release
Parallel framework: tbb
CPU features: sse sse2 sse3 ssse3 sse4.1 sse4.2 avx avx2
[----------]
[ GPU INFO ] Run on OS Windows x64.
[----------]
*** CUDA Device Query (Runtime API) version (CUDART static linking) ***
Device count: 1
Device 0: "GeForce GTX 860M"
CUDA Driver Version / Runtime Version 6.50 / 6.50
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 2048 MBytes (2147483648 bytes)
GPU Clock Speed: 1.02 GHz
Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support enabled: No
Device is using TCC driver mode: No
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 1 / 0
Compute Mode:
Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.50, CUDA Runtime Version = 6.50, NumDevs = 1
I thought that everything was fine, but then I ran this program, with all the OpenCV and CUDA directories included in its property files:
#include <cv.h>
#include <highgui.h>
#include <iostream>
#include <opencv2\opencv.hpp>
#include <opencv2\gpu\gpu.hpp>

using namespace std;
using namespace cv;

char key;

Mat thresholder(Mat input) {
    gpu::GpuMat dst, src;
    src.upload(input);
    gpu::threshold(src, dst, 128.0, 255.0, CV_THRESH_BINARY);
    Mat result_host(dst);
    return result_host;
}

int main(int argc, char* argv[]) {
    cvNamedWindow("Camera_Output", 1);
    CvCapture* capture = cvCaptureFromCAM(CV_CAP_ANY);
    while (1) {
        IplImage* frame = cvQueryFrame(capture);
        IplImage* gray_frame = cvCreateImage(cvGetSize(frame), IPL_DEPTH_8U, 1);
        cvCvtColor(frame, gray_frame, CV_RGB2GRAY);
        Mat temp(gray_frame);
        Mat thres_temp;
        thres_temp = thresholder(temp);
        //cvShowImage("Camera_Output", frame); // Show image frames on created window
        imshow("Camera_Output", thres_temp);
        key = cvWaitKey(10);
        if (char(key) == 27) {
            break; // If you hit the ESC key the loop will break.
        }
    }
    cvReleaseCapture(&capture);
    cvDestroyWindow("Camera_Output");
    return 0;
}
I got the error:
OpenCV Error: No GPU support (The library is compiled without CUDA support) in EmptyFuncTable::mallocPitch, file C:\builds\2_4_PackSlave-win64-vc12-shared\opencv\modules\dynamicuda\include\opencv2/dynamicuda/dynamicuda.hpp, line 126
Thanks to @BeRecursive for giving me a lead to solving my issue. The CMake build log lists three unavailable OpenCV modules, namely androidcamera, dynamicuda and viz. I could not find any information on dynamicuda, the module whose unavailability probably caused the error I mentioned in the question, so instead I searched for the viz module and checked how it is installed.
After going through some blogs and forums I found that the viz module is not included in the pre-built versions of OpenCV, and it was recommended to build version 2.4.9 from source. I decided to give it a try and built it with VS 2013 and CMake 3.0.1, but there were many build failures and warnings. Upon further search I found that CMake 3.0.x versions aren't recommended for building OpenCV, as they produce many warnings.
At last I decided to switch to VS 2010 and CMake 2.8.12.2. After building the source I got no errors, and luckily, after adding all the executables, libraries and DLLs to the PATH, the program I mentioned above ran with no errors, but very slowly! So I ran this program:
#include <cv.h>
#include <highgui.h>
#include <iostream>
#include <opencv2\opencv.hpp>
#include <opencv2\core\core.hpp>
#include <opencv2\gpu\gpu.hpp>
#include <opencv2\highgui\highgui.hpp>

using namespace std;
using namespace cv;

Mat thresholder(Mat input) {
    cout << "Beginning thresholding using GPU" << endl;
    gpu::GpuMat dst, src;
    src.upload(input);
    cout << "upload done ..." << endl;
    gpu::threshold(src, dst, 128.0, 255.0, CV_THRESH_BINARY);
    Mat result_host(dst);
    cout << "Thresholding complete!" << endl;
    return result_host;
}

int main(int argc, char** argv) {
    Mat image, gray_image;
    image = imread("desert.jpg", CV_LOAD_IMAGE_COLOR); // Read the file
    if (!image.data) {
        cout << "Could not open or find the image" << endl;
        return -1;
    }
    cout << "Original image loaded ..." << endl;
    cvtColor(image, gray_image, CV_BGR2GRAY);
    cout << "Original image converted to grayscale" << endl;
    Mat thres_image;
    thres_image = thresholder(gray_image);
    namedWindow("Original Image", WINDOW_AUTOSIZE); // Create a window for display.
    namedWindow("Gray Image", WINDOW_AUTOSIZE);
    namedWindow("GPU Threshed Image", WINDOW_AUTOSIZE);
    imshow("Original Image", image);
    imshow("Gray Image", gray_image);
    imshow("GPU Threshed Image", thres_image);
    waitKey(0);
    return 0;
}
Later I even tested the build on VS 2013 and it also worked.
The GPU-based programs are slow due to the reasons mentioned here.
So there are three important things I want to point out:
BUILD from source only
Use a slightly older version of CMake
Prefer VS 2010 for building the binaries (a sample configure line is sketched below)
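For reference, a CMake configure line along these lines might look as follows; the generator and source path are illustrative and should be adapted to your setup, while the -D flags are the ones listed in the question:
cmake -G "Visual Studio 10" ^
  -D WITH_CUDA=ON -D WITH_CUBLAS=ON -D WITH_CUFFT=ON ^
  -D CUDA_FAST_MATH=ON -D CUDA_GENERATION=Auto ^
  -D CUDA_TOOLKIT_ROOT_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v6.5" ^
  E:\opencv\sources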
NOTE:
This might sound weird, but all my first builds failed with some linker error. So, I don't know whether this is a workaround or not, but try building opencv_gpu before anything else, then all the other modules one by one, and only then build the ALL_BUILD and INSTALL projects.
When you build this way in DEBUG mode, you might get an error if you are building OpenCV with Python support (i.e. "python27_d.lib"); otherwise all projects will build successfully.
WEB SOURCES:
Following are web sources that helped me in solving my problem:
http://answers.opencv.org/question/32502/opencv-249-viz-module-not-there/
http://home.eps.hw.ac.uk/~cgb7/opencv/opencv_tutorial.pdf
http://perso.uclouvain.be/allan.barrea/opencv/opencv.html
http://eavise.wikispaces.com/Building+OpenCV+yourself+on+Windows+7+x64+with+OpenCV+2.4.5+and+CUDA+5.0
https://devtalk.nvidia.com/default/topic/767647/how-do-i-enable-cuda-when-installing-opencv-/
So that is a runtime error being thrown by OpenCV. If you take a look at the CMake log from your previous question, you can see that one of the unavailable packages was dynamicuda, which appears to be what that error is complaining about.
However, I don't have a lot of experience with Windows OpenCV, so that could be a red herring. My gut feeling says that you don't have all the libraries correctly on the path. Have you made sure that the CUDA lib/include/bin directories are on the PATH? Have you made sure that your OpenCV build lib/include directories are on the path? Windows has a very simple linking order that essentially just includes the current directory, anything on the PATH, and the main Windows directories. So I would try making sure everything is correctly on the PATH and that you have copied all the correct libraries into the folder.
A note: this is different from a compile/link error because it happens at RUNTIME. So setting the compiler paths will not help with runtime linking errors.
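Independent of the PATH question, a cheap runtime check tells you whether the OpenCV build you are actually loading has CUDA support. A minimal sketch using the OpenCV 2.4 gpu module:
#include <opencv2/gpu/gpu.hpp>
#include <iostream>

int main() {
    // Returns 0 if the loaded OpenCV binaries were built without CUDA
    // support or if no CUDA-capable device is present.
    int n = cv::gpu::getCudaEnabledDeviceCount();
    std::cout << "CUDA-enabled devices: " << n << std::endl;
    if (n > 0)
        cv::gpu::printShortCudaDeviceInfo(0); // name, compute capability, driver/runtime versions
    return 0;
}
If this prints 0 while deviceQuery sees your GPU, the binaries being loaded at runtime are not the CUDA-enabled ones you built, which points back at the PATH.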

Why doesn't my MPEG-TS play on iOS?

My MPEG-TS video isn't playing on iOS via HTTP Live Streaming, and I am not sure why. I know my iOS code/m3u8 format is correct, because if I replace my .ts file with a sample one from Apple (bipbop), it works. I've provided information below on my video (which doesn't work) and the one that works.
Mine (not working)
General
ID : 1 (0x1)
Format : MPEG-TS
File size : 9.57 MiB
Duration : 3s 265ms
Overall bit rate mode : Variable
Overall bit rate : 24.3 Mbps
Video
ID : 769 (0x301)
Menu ID : 1 (0x1)
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High#L4.2
Format settings, CABAC : No
Format settings, ReFrames : 1 frame
Codec ID : 27
Duration : 3s 279ms
Bit rate : 23.1 Mbps
Width : 1 920 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Stream size : 9.01 MiB (94%)
Apples (working)
General
ID : 1 (0x1)
Format : MPEG-TS
File size : 281 KiB
Duration : 9s 943ms
Overall bit rate mode : Variable
Overall bit rate : 231 Kbps
Video
ID : 257 (0x101)
Menu ID : 1 (0x1)
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Main#L2.1
Format settings, CABAC : No
Format settings, ReFrames : 2 frames
Format settings, GOP : M=2, N=24
Codec ID : 27
Duration : 9s 542ms
Width : 400 pixels
Height : 300 pixels
Display aspect ratio : 4:3
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Color primaries : BT.601 NTSC
Transfer characteristics : BT.709
Matrix coefficients : BT.601
Audio
ID : 258 (0x102)
Menu ID : 1 (0x1)
Format : AAC
Format/Info : Advanced Audio Codec
Format version : Version 4
Format profile : LC
Muxing mode : ADTS
Codec ID : 15
Duration : 9s 380ms
Bit rate mode : Variable
Channel(s) : 2 channels
Channel positions : Front: L R
Sampling rate : 22.05 KHz
Compression mode : Lossy
Delay relative to video : -121ms
My video doesn't have an audio stream, but that shouldn't matter.
What is it about my video that makes it not work via HTTP Live Streaming?
Your video is High profile, level 4.2. The iPhone 5 only supports up to level 4.1, and the iPhone 4 only supports up to Main profile level 3.1. Also, 23.1 Mbps is really high; 3 or 4 Mbps is probably the max.
Edit:
Here is a compiled list I have made for iOS devices.
The problem is not the operating system. iOS just passes the encoded H.264 stream to the SoC's video decode block. The hardware decoding blocks are limited, and each SoC iteration has different limitations.
Generally the limits are on the profile and the macroblock rate. You will need to severely cut back the bitrate of your video if you want it to play on any iOS device.
Szatmary's table looks like a great resource for choosing your target encoding parameters.
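As an illustration of the kind of re-encode the answers describe (a sketch; take the exact profile, level, and bitrate targets from the device table mentioned above rather than from this command):
ffmpeg -i input.ts -c:v libx264 -profile:v main -level 3.1 -b:v 3M -maxrate 3M -bufsize 6M -f mpegts output.ts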

"NVENC Feature not available for current license key type" error from nvEncoder sample

When I try to run the nvEncoder sample application included in NV Encode SDK 2.0, it fails to open an encode session. Here is the output:
C:\Users\Timothy\Downloads\nvenc_2.0_pkg\Samples\nvEncodeApp>1080p_heavyhand_3sec.bat
C:\Users\Timothy\Downloads\nvenc_2.0_pkg\Samples\nvEncodeApp>nvEncoder -infile=..\yuv\1080p\HeavyHandIdiot.3sec.yuv -outfile=HeavyHandIdiot.3sec.264 -width=1920 -height=1080 -bitrate=6000000
> NVEncode configuration parameters for Encoder[0]
> GPU Device ID = 0
> Input File = ..\yuv\1080p\HeavyHandIdiot.3sec.yuv
> Output File = HeavyHandIdiot.3sec.264
> Frames [000--01] = 0 frames
> Multi-View Codec = No
> Width,Height = [1920,1080]
> Video Output Codec = 4 - H.264 Codec
> Average Bitrate = 6000000 (bps/sec)
> Peak Bitrate = 24000000 (bps/sec)
> BufferSize = 3000000
> Rate Control Mode = 2 - CBR (Constant Bitrate)
> Frame Rate (Num/Denom) = (30000/1001) 29.9700 fps
> GOP Length = 30
> Set Initial RC QP = 0
> Initial RC QP (I,P,B) = I(0), P(0), B(0)
> Number of B Frames = 0
> Display Aspect Ratio X = 1920
> Display Aspect Ratio Y = 1080
> Number of B-Frames = 0
> QP (All Frames) = 26
> QP (I-Frames) = 25
> QP (P-Frames) = 28
> QP (B-Frames) = 31
> Hiearchical P-Frames = 0
> Hiearchical B-Frames = 0
> SVC Temporal Scalability = 0
> Number of Temporal Layers = 0
> Outband SPSPPS = 0
> Video codec profile = 100
> Stereo 3D Mode = 0
> Stereo 3D Enable = No
> Number slices per Frame = 1
> Encoder Preset = 3 - High Performance (HP) Preset
> Asynchronous Mode = Yes
> YUV Input Format = NV12 (Semi-Planar UV Interleaved) Pitch Linear
> NVENC API Interface = 2 - CUDA
> Map Resource API Demo = No
> Dynamic Resolution Change = 0
> Dynamic Bitrate Change = 0
Input Filesize: 236390400 bytes
Input Filename: ..\yuv\1080p\HeavyHandIdiot.3sec.yuv
Auto-Detected (nvAppEncoderParams.endFrame = 76 frames)
>> GetNumberEncoders() has detected 1 CUDA capable GPU device(s) <<
[ GPU #0 - < GeForce GTX 670 > has Compute SM 3.0, NVENC Available ]
>> InitCUDA() has detected 1 CUDA capable GPU device(s)<<
[ GPU #0 - < GeForce GTX 670 > has Compute SM 3.0, Available NVENC ]
>> Select GPU #0 - < GeForce GTX 670 > supports SM 3.0 and NVENC
File: src\CNVEncoder.cpp, Line: 1380, nvEncOpenEncodeSessionEx() returned with error 21
Note: GUID key may be invalid or incorrect. Recommend to upgrade your drivers and obtain a new key
NVENC error at src\CNVEncoder.cpp:1382 code=21(NVENC Feature not available for current license key type) "nvStatus"
The API says error code 21 is NV_ENC_ERR_INCOMPATIBLE_CLIENT_KEY, with the comment:
/**
* This indicates that the client is attempting to use a feature
* that is not available for the license type for the current system.
*/
The programming guide says:
2. SETTING UP THE HARDWARE FOR ENCODING
2.1 Opening an Encode Session
After loading the NVENC Interface, the client should first call NvEncOpenEncodeSession to open an encoding session. The NVENC Interface will provide a encode session handle to the client, which must be used for all further API calls in the current session.
2.1.1 Using the License client Key GUID:
The client should pass a pointer to the key GUID that has been delivered with this SDK or has been purchased as part of a license separately, as NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS::clientKeyPtr
According to the guide, the sample code is invalid, as it doesn't set NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS::clientKeyPtr. But the SDK wasn't delivered with a key GUID like the guide said.
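For concreteness, here is roughly how a client key would be passed when opening the session. This is a hypothetical sketch based only on the field and function names quoted above: the GUID value is a placeholder, not a working key, the apiVersion field is an assumption from the SDK headers, and cuContext/hEncoder are assumed to be set up elsewhere.
// Hypothetical sketch -- the GUID below is a placeholder, not a working key.
GUID clientKey = { 0x00000000, 0x0000, 0x0000,
                   { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 } };

NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS params = { 0 };
params.version      = NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS_VER;
params.apiVersion   = NVENCAPI_VERSION;          // assumed field, per SDK headers
params.deviceType   = NV_ENC_DEVICE_TYPE_CUDA;
params.device       = (void*)cuContext;          // existing CUDA context
params.clientKeyPtr = &clientKey;                // the field the guide says must be set

NVENCSTATUS status = pNvEncodeAPI->nvEncOpenEncodeSessionEx(&params, &hEncoder);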
Someone had the same problem here and resolved it by using a free trial key. It seems to have been included with the 2.0 beta version of the SDK, which is no longer available.
I've also tried installing drivers 311.06, 312.07, and 314.22 with no success. I have a GeForce GTX 670.
Is there a solution?
Starting with the GeForce 334.67 driver, NVENC no longer requires a license key to use on GeForce cards.
Unfortunately, I have not been able to find the beta version of the SDK anywhere, only the final version. The only option would probably be to find someone who downloaded the beta.
The other way would be to try to reverse engineer NVIDIA's drivers (especially with "Shadowplay" and SHIELD coming, both of which use NVENC) or existing encoding tools that are licensed to use NVENC on GeForce cards, to find a compatible key.
Another potential solution I've been watching is to simply hard-mod the card into a Quadro/Tesla/GRID, which you should be able to do on your 670 (though unfortunately for me, nobody has tried it on a Titan): http://www.eevblog.com/forum/projects/hacking-nvidia-cards-into-their-professional-counterparts/
Annoyingly, NVIDIA advertised NVENC as a feature of consumer-level Kepler cards at the launch of the GTX 680, and they have since backed away from this to make it a pro-only feature. It doesn't even work with my "prosumer" $1k GTX Titans. Ironically, I don't even want to use the Titans long-term; even with NVENC, the GRID K1 or K2 would be far more suitable for my project. It would be great to get something working on my workstation/gaming rig before scaling it up (and buying a ton of NVIDIA GPUs) instead of dropping more of my own money on GPUs... I guess it might be better to go the AMD/OpenCL route with their Open Video Encode engine instead, except Catalyst on GNU/Linux doesn't support it. Ugh.
You need a license key, which can be obtained by asking NVIDIA (good luck!), found by disassembling the shared library, or recovered by using gdb's rwatch with the bundled example code. Sorry I can't be more helpful than this.
