An OpenCL program I originally wrote for an AMD GPU (RX 570) fails runtime compilation on an Nvidia card (RTX 3060, latest drivers) with the message: <kernel>:28:5: error: use of unknown builtin '__builtin_mul_overflow'.
Through the preprocessor macros I found that the OpenCL code is now compiled with Clang 3.4. According to the documentation, this version did not yet support __builtin_mul_overflow(), which was introduced in Clang 3.8.
Is it possible to tell OpenCL to use a newer Clang version?
Or are the latest Nvidia drivers indeed limited to such an old compiler?
Unfortunately you can't control which compiler the OpenCL driver uses. You could try conditionally compiling the code to account for different compilers, but that's about it.
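A minimal sketch of the conditional-compilation approach, written as host-side C++ so that it can be compiled and checked outside a driver. It assumes the compiler exposes Clang's __has_builtin feature-test macro (a stub is defined where it is missing). The helper name mul_overflow_u32 is hypothetical and not taken from the original kernel.

```cpp
#include <cstdint>

// __has_builtin is a Clang feature-test macro (newer GCC versions have it too).
// Assumption: where it is missing, treat every builtin as unavailable.
#ifndef __has_builtin
#  define __has_builtin(x) 0
#endif

// Hypothetical helper: stores the wrapped 32-bit product in *out and
// returns true if the mathematical product did not fit in 32 bits.
static inline bool mul_overflow_u32(std::uint32_t a, std::uint32_t b,
                                    std::uint32_t* out) {
#if __has_builtin(__builtin_mul_overflow)
    // Compilers that provide the builtin use it directly.
    return __builtin_mul_overflow(a, b, out);
#else
    // Portable fallback: widen to 64 bits, then check the upper half.
    const std::uint64_t wide = static_cast<std::uint64_t>(a) * b;
    *out = static_cast<std::uint32_t>(wide);
    return wide > 0xFFFFFFFFull;
#endif
}
```

The same #if guard should carry over to an OpenCL C kernel, where the fallback branch could instead test the built-in mul_hi(a, b) != 0 for unsigned operands. That variant is untested here and depends on the driver's preprocessor accepting __has_builtin.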
Is there a way to get clang/clang++ to use a gcc/g++ installation in a non-standard (i.e. not /usr) place?
I'm trying to get AMD's AOCC 4.0 compiler to work. They provide a pre-compiled version that you just unpack. The problem is that it seems to assume gcc is in /usr/lib/gcc/... In my case I'm on CentOS 7, so that's gcc 4.8.5. I want to use newer gcc versions installed in /sw/opt (and managed with environment modules), but even if that gcc is in my path, clang only finds the 4.8.5 version in /usr. This is also a problem because I have a cluster that has no default gcc installed (but many gcc versions installed in /cluster/sw) and I can't get clang to see them.
When I want LLVM I usually just build from scratch and specify GCC_INSTALL_PREFIX, but that only seems to be useful at build time, and since AMD only provides executables I'm out of luck.
Ideally I'd like to get clang/clang++ to point to another gcc (en masse: include, libs, etc...) or not be dependent on gcc at all.
AOCC seems to be based on 14.0.6 if that matters:
AMD clang version 14.0.6 (CLANG: AOCC_4.0.0-Build#434 2022_10_28) (based on LLVM Mirror.Version.14.0.6)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /sw/opt/aocc-compiler-4.0.0/bin
After more poking around I've discovered that there is a clang option "--gcc-toolchain" that seems to address this. Some clang documentation also lists an option "--gcc-install-dir", but neither the 14.0.6-based version of AOCC nor the 16.0.0-based version of OneAPI (2023.0) seems to recognize it. I don't see it in the output of "clang --help" either, so who knows.
I am trying to compile OpenCV with CUDA, but I get an error.
Running with:
nvidia-driver-525.60.13
CUDA 12.0
OpenCV 3.4.16
I don't know where it comes from...
[ 7%] Building CXX object 3rdparty/protobuf/CMakeFiles/libprotobuf.dir/src/google/protobuf/stubs/atomicops_internals_x86_gcc.cc.o
/home/totar/cv2/opencv-3.4.16/modules/cudev/include/opencv2/cudev/ptr2d/texture.hpp(61): error: texture is not a template
/home/totar/cv2/opencv-3.4.16/modules/cudev/include/opencv2/cudev/ptr2d/texture.hpp(83): error: identifier "cudaUnbindTexture" is undefined
/home/totar/cv2/opencv-3.4.16/modules/core/include/opencv2/core/cuda/common.hpp(99): error: identifier "textureReference" is undefined
3 errors detected in the compilation of "/home/totar/cv2/opencv-3.4.16/modules/core/src/cuda/gpu_mat.cu".
CMake Error at cuda_compile_1_generated_gpu_mat.cu.o.Release.cmake:279 (message):
Error generating file
/home/totar/cv2/opencv-3.4.16/build/modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/./cuda_compile_1_generated_gpu_mat.cu.o
modules/core/CMakeFiles/opencv_core.dir/build.make:63: recipe for target 'modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/cuda_compile_1_generated_gpu_mat.cu.o' failed
make[2]: *** [modules/core/CMakeFiles/cuda_compile_1.dir/src/cuda/cuda_compile_1_generated_gpu_mat.cu.o] Error 1
CMakeFiles/Makefile2:1889: recipe for target 'modules/core/CMakeFiles/opencv_core.dir/all' failed
make[1]: *** [modules/core/CMakeFiles/opencv_core.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
Output of nvcc --version
totar#totar:~/cv2/opencv-3.4.16/build$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Mon_Oct_24_19:12:58_PDT_2022
Cuda compilation tools, release 12.0, V12.0.76
Build cuda_12.0.r12.0/compiler.31968024_0
Has anyone ever had this problem?
CUDA 12.0 dropped support for legacy texture references. Therefore, any code that uses legacy texture references can no longer be properly compiled with CUDA 12.0 or beyond.
Legacy texture reference usage has been deprecated for some time now.
As indicated in the comments, by reverting to CUDA 11.x where legacy texture references are still supported (albeit deprecated) you won't run into this issue.
The other option may become available some day if OpenCV converts its usage of legacy texture references to texture object methods. In that case, it may be possible to use CUDA 12.0 or a newer CUDA toolkit to compile OpenCV/CUDA functionality.
There is no workaround that allows texture reference usage to be compiled properly with CUDA 12.0 and beyond.
Likewise, this limitation is not unique or specific to OpenCV. Any CUDA code that uses texture references can no longer be compiled properly with CUDA 12.0 and beyond. The options are to refactor that code with texture object usage instead, or revert to a previous CUDA toolkit that still has the deprecated support for texture reference usage.
Following from Robert Crovella's answer, another solution (for many people) would be to build OpenCV 4.x instead, where the backend uses OpenCL for GPU access and the CUDA modules are optional (found in the separate https://github.com/opencv/opencv_contrib repository).
Obviously this depends on whether you have a lot of code using the CUDA GpuMat class, which you would need to migrate to the UMat class for the OpenCV 4 "Transparent API" (aka TAPI). This is probably worth doing for code that you intend to keep using for the long term.
I am trying to use OpenCV with the OpenCL target on an Ubuntu 16.04 system with Intel UHD 620 graphics. I have installed ocl-icd-opencl-dev for OpenCL, but cv::ocl::haveOpenCL() tells me that I do not have OpenCL.
clinfo gives me
Number of platforms 0
Then I tried installing beignet, as this answer proposes. cv::ocl::haveOpenCL() still tells me that I do not have OpenCL, and now clinfo says
Number of platforms 1
Platform Name Intel Gen OCL Driver
Platform Vendor Intel
Platform Version OpenCL 1.2 beignet 1.1.1
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_spir cl_khr_icd
Platform Extensions function suffix Intel
beignet-opencl-icd: no supported GPU found, this is probably the wrong opencl-icd package for this hardware
Can anybody help?
ocl-icd-opencl-dev contains the development files for the OCL-ICD loader. You need it if you want to develop (compile) against libOpenCL. If you don't want to develop and only want to run OpenCL programs, then you just need ocl-icd-libopencl1.
cv::ocl::haveOpenCL() tells me that I do not have OpenCL
ocl-icd is just a loader; you need an actual implementation. As explained on Khronos:
The OpenCL Installable Client Driver (ICD) is a mechanism to allow OpenCL implementations from multiple vendors to coexist on a system
Then I tried installing beignet
beignet is an implementation, but it's too old for your GPU. You need either Intel's proprietary implementation or Intel NEO.
Suppose I have a C++ project, and I compile it with gcc and with clang. You can assume that the gcc compiled version runs in another linux machine. Will this imply (in normal circumstances) that the clang version will also run on the other linux machine?
Clang binaries are as portable as gcc binaries are, as long as you are linking to the same libraries and you aren't passing flags like -march=native to the compiler.
Clang has one huge advantage over gcc: it can deal with almost all libstdc++ versions,
while gcc is bound to its bundled version and often can't parse any older versions.
So the following often happens in production environments:
Install an LTS distro (Ubuntu 12.04 for example)
Keep gcc, glibc and libstdc++ untouched
Install a recent clang version for C++11, etc
Build the release binaries with clang
So (in my specific example) those binaries will work on all
distros with libstdc++ >= 4.6 and glibc >= 2.15.
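To check the glibc side of a requirement like this, a binary can report both the glibc headers it was built with and the glibc it finds at run time. A rough sketch, assuming a glibc-based Linux system: the two helper names are made up for illustration, and gnu_get_libc_version() is a glibc extension. The minimum glibc a binary really needs is determined by the versioned symbols it references, so treat the build-time number as a hint only.

```cpp
#include <climits>  // any libc header; on glibc this defines __GLIBC__
#include <string>

#if defined(__GLIBC__)
#  include <gnu/libc-version.h>  // gnu_get_libc_version(), glibc only
#endif

// Hypothetical helper: version of the glibc headers used at build time.
std::string glibc_build_version() {
#if defined(__GLIBC__)
    return std::to_string(__GLIBC__) + "." + std::to_string(__GLIBC_MINOR__);
#else
    return "not glibc";
#endif
}

// Hypothetical helper: version of the glibc loaded on the running machine.
std::string glibc_runtime_version() {
#if defined(__GLIBC__)
    return gnu_get_libc_version();
#else
    return "not glibc";
#endif
}
```

Printing both values on the build machine and on the target machine gives a quick first indication of whether the two environments differ.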
This may be an interesting read for you.
If the program is a simple Hello world, it should work on the other machine when compiled through Clang.
But when the program is a real program with a lot of lines and compilation units, and calls into many external libraries, anything is possible, depending on the program itself and the compilation options:
hardware requirements (memory) being different (mainly depends on compilation options)
use of different (versions of) libraries between gcc and clang
UB giving expected results in one and not in the other
different usages for implementation defined rules
use of gcc extensions not accepted by clang
For all of the above except the first two, it should run on other machines if it runs on one.
Linux programs depend on their build environment. If your glibc version or kernel is different, there is a good chance that the executable will not be able to run. You could use LLVM's interpreter though; it compiles into bytecode which can be interpreted on various operating systems.
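Where compiler-specific code cannot be avoided, the predefined macros can select between variants at compile time. A small sketch (compiler_name is a hypothetical helper): Clang also defines __GNUC__ for compatibility, so __clang__ has to be tested first.

```cpp
#include <string>

// Hypothetical helper: names the compiler family that built this translation unit.
std::string compiler_name() {
#if defined(__clang__)
    // Must come first: Clang defines __GNUC__ as well.
    return "clang";
#elif defined(__GNUC__)
    return "gcc";
#else
    return "other";
#endif
}
```

The same #if chain can wrap a gcc-only extension in one branch and a portable equivalent in the other.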
The answer is, well, depends.
The first hard requirement is the same CPU architecture. 64-bit is not enough of a qualifier. If you compile for x64, you won't have much success running the result on 64-bit ARM.
The next big one is libraries. If you use any libraries in the program, the target system needs to have those libraries. This includes the kernel headers. So if you compile for e.g. a current kernel version, using the most cutting-edge features, then you will have no joy running that program on a very old version of Linux.
The last one is hardware dependencies. If you create a program that e.g. requires 4 GB of RAM and then try to run it on a small embedded device with 256 MB RAM, that won't work either.
To better fit your changed question: from my experience there shouldn't be much of a difference in portability between Clang and gcc. Googling didn't turn up anything either, so it should basically work. But always test things like that before you publish a binary in production.
I am trying to compile NEON assembly code with the LLVM clang integrated macro assembler (the LLVM compiler shipped with Xcode 4.3) and get the following error:
vld1.8 {D0}, [R0] - invalid operand for instruction
What can be the reason? Why is this instruction successfully assembled by GAS for Android but rejected by 'clang -integrated-as ...' for iOS? Thanks.
After a day of experimenting I've found a solution. I've just compiled LLVM from the SVN source base (version 3.2). The integrated macro assembler in LLVM 3.2svn supports the ARM NEON ISA much better than LLVM 3.0svn shipped with Xcode 4.3.1. The problem with the VLD NEON instruction has been resolved automatically.
Those who use the gas-preprocessor.pl Perl script may try to switch from GAS 1.38 (the external GNU assembler used by LLVM on Mac OS X 10.7.X) to the LLVM integrated macro assembler and stop using unnecessary preprocessing.
I've not used clang for assembly but the following site might help: ARM Assembly
In addition, this may help as it solved someone else's issue with ARM assembly (selecting the correct device, lower case instructions etc...): Useful Stackoverflow answer