I am interested in how the OpenCL memory transfer functions work underneath (migration, reading/writing a buffer, mapping/unmapping). I could not find any open-source OpenCL implementation (Intel's would be fine for me), and the explanations in the documentation give me no idea of what actually happens when I call, for example, clEnqueueMigrateMemObjects: which calls are made during the migration, which modules are active, what mechanisms it uses underneath, and whether it uses any caching.
Is there a good source to read about it?
I am now exploring how OpenCL passes data to FPGAs. Xilinx currently uses the native OpenCL implementation present on the machine, plus some extensions.
If you're looking for low-level information (how a particular implementation implements those calls), probably the only source is the implementation.
There are a few open-source OpenCL implementations for GPUs:
Raspberry Pi 3 (beta): https://github.com/doe300/VC4CL
OpenCL on Vulkan (beta): https://github.com/kpet/clvk
Mesa Clover (supports only 1.1): https://cgit.freedesktop.org/mesa/mesa/log/?qt=grep&q=clover
AMD ROCm: https://github.com/RadeonOpenCompute/ROCm-OpenCL-Runtime
Intel NEO (their new OpenCL implementation): https://github.com/intel/compute-runtime
I'm not aware of Xilinx providing sources for their implementation, so if you want to know what exactly happens on Xilinx, your best chance is probably to ask on Xilinx forums or via some official support.
I have been through the documentation and didn't find a clear, detailed description of UMat; however, I think it has something to do with the GPU and CPU. Please help me out.
Thank you.
Perhaps section 3 of this document will help: [link now broken]
https://software.intel.com/sites/default/files/managed/2f/19/inde_opencv_3.0_arch_guide.pdf
Specifically, section 3.1:
A unified abstraction cv::UMat that enables the same APIs to be implemented using CPU or OpenCL code, without a requirement to call OpenCL accelerated version explicitly. These functions use an OpenCL-enabled GPU if exists in the system, and automatically switch to CPU operation otherwise.
and section 3.3:
Generally, the cv::UMat is the C++ class, which is very similar to cv::Mat. But the actual UMat data can be located in a regular system memory, dedicated video memory, or shared memory.
Link to usage suggested in the comments by @BourbonCreams:
https://docs.opencv.org/3.0-rc1/db/dfa/tutorial_transition_guide.html#tutorial_transition_hints_opencl
I'm working on a project that will use an AMD GPU for processing data. I noticed AMD has two different SDKs available on their website for using the GPU: ATI Stream Technology, and OpenCL™ and the AMD APP SDK. It looks like both support OpenCL, but I haven't found anything on the site explicitly pointing out why one would use one over the other. What's the difference between these two?
The AMD APP SDK is here: http://developer.amd.com/sdks/AMDAPPSDK/Pages/default.aspx
The website should also answer your question about the difference between Stream and APP:
AMD Accelerated Parallel Processing (APP) SDK (formerly ATI Stream)
It used to be called the AMD Stream SDK; they probably renamed it after adding support for non-Firestream hardware (namely OpenCL).
Stream is the higher-level AMD-specific project (hardware and software) that includes OpenCL as the current software implementation. Stream originally used the "Brook" language, but switched to OpenCL in 2011. Since then OpenCL has become more popular (because it is a cross-platform standard that has been particularly well supported by Apple), and these days AMD doesn't seem to mention Stream much. You can see this in a link like http://www.amd.com/us/products/technologies/stream-technology/opencl/pages/opencl.aspx where OpenCL is a "child" of Stream (or in the menu on the left of that page, where the higher-level group is Stream; the other children are related to hardware).
In short, you want OpenCL. And despite the confusing mess that is AMD's site, their OpenCL implementation is pretty solid.
Hmmm. Re-reading your question, you seem to say there are two separate SDKs. Do you actually drill down to two different packages? My understanding is that OpenCL is the Stream SDK. If you have found two different SDKs (that are both current), can you link to them?
I'm completely new to OpenCL and GPU programming in general. Right now I am working on a project where I'm trying to see what performance gains come from making use of the GPU in a game. However, I have run into a snag: how do I set up my DirectX project to talk to the OpenCL code base?
I've been googling this for about a week and haven't been able to find anything. If someone could point me in the right direction, I would be grateful.
OpenCL does not have anything to do with DirectX, it's simply another library.
For OpenCL you'll need an implementation ('SDK'), as Khronos don't provide those (they only provide the specifications).
Intel, AMD and Nvidia all provide one, but they have different requirements and limitations. See here for some of the existing implementations
After installing one of these, you'll have the necessary headers and libraries to code against the OpenCL API and link with OpenCL.dll
There are lots of samples in the SDKs and online. You have to write the kernel; the rest is mostly boilerplate code for initialization and kernel compilation.
The specific OpenCL extension that allows sharing of OpenCL buffers as textures and vice versa is cl_khr_d3d10_sharing: http://www.khronos.org/registry/cl/extensions/khr/cl_khr_d3d10_sharing.txt
OpenCL has extensions for sharing memory between DirectX and OpenCL (and also between OpenGL and OpenCL). This allows you to read or write DirectX buffers, including textures, from within OpenCL. Ani's answer mentioned the extension for DirectX 10, but since the question is about DirectX 9, the extension you'll actually be using is cl_khr_dx9_media_sharing.
This extension has just 4 functions:
clGetDeviceIDsFromDX9MediaAdapterKHR
This function allows you to get the OpenCL device IDs of the OpenCL device(s) that can share memory with a given Direct3D 9 device.
clCreateFromDX9MediaSurfaceKHR
This function gets an OpenCL cl_mem memory object for a given Direct3D 9 memory object.
clEnqueueAcquireDX9MediaSurfacesKHR
This function locks the specified shared memory object so that you can read and/or write to it from OpenCL.
clEnqueueReleaseDX9MediaSurfacesKHR
This function unlocks the specified memory object from OpenCL, so that Direct3D can read/write it again.
Once you've used the above functions to share and synchronize access to the memory buffers, everything else on both the Direct3D 9 side and the OpenCL side works as it would otherwise with those particular APIs.
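The acquire/release handshake described above can be pictured with a small toy model. This is only a sketch in plain Python with no real OpenCL or Direct3D calls: the class SharedSurface and the helper acquired are hypothetical names invented for illustration, and the two methods merely stand in for clEnqueueAcquireDX9MediaSurfacesKHR and clEnqueueReleaseDX9MediaSurfacesKHR.

```python
from contextlib import contextmanager


class SharedSurface:
    """Toy stand-in for a cl_mem created from a Direct3D 9 surface (hypothetical)."""

    def __init__(self):
        # Direct3D owns the surface until OpenCL acquires it.
        self.owner = "d3d9"

    def acquire(self):
        # Stands in for clEnqueueAcquireDX9MediaSurfacesKHR.
        if self.owner != "d3d9":
            raise RuntimeError("surface is already held by OpenCL")
        self.owner = "opencl"

    def release(self):
        # Stands in for clEnqueueReleaseDX9MediaSurfacesKHR.
        if self.owner != "opencl":
            raise RuntimeError("surface is not held by OpenCL")
        self.owner = "d3d9"


@contextmanager
def acquired(surface):
    """Hold the surface for OpenCL work and always hand it back to Direct3D."""
    surface.acquire()
    try:
        yield surface
    finally:
        surface.release()


surface = SharedSurface()
with acquired(surface) as s:
    owner_during = s.owner  # OpenCL kernels would run at this point
owner_after = surface.owner
```

Wrapping the pair in a context manager is just one way to make sure the release always happens, even if the OpenCL work in between fails; in a real program the same discipline applies to the two enqueue calls.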
Note that your GPU will need to support the cl_khr_dx9_media_sharing extension in order for this to work. You can check the extensions property of the OpenCL platform and device in order to confirm that this extension is supported.
Some NVidia GPUs support a different extension instead, called cl_nv_d3d9_sharing. The basic idea of how it works is the same as with the cl_khr_dx9_media_sharing extension, but the exact details are a bit different. The biggest difference is just that it has different functions for getting cl_mem objects for different types of Direct3D 9 buffers, rather than just one function to cover all of them.
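As a rough sketch of the extension check, the snippet below picks whichever of the two Direct3D 9 sharing extensions a device advertises. It is plain Python working on a string, under the assumption that you have already obtained the space-separated extension list (in the C API this is what clGetDeviceInfo returns for CL_DEVICE_EXTENSIONS). The function names and the sample string are made up for illustration.

```python
# Extension names taken from the answer above.
KHR_DX9 = "cl_khr_dx9_media_sharing"
NV_D3D9 = "cl_nv_d3d9_sharing"


def has_extension(extensions, name):
    """Return True if `name` appears as a whole token in the extension string."""
    # Split on whitespace: a plain substring search could match a longer name.
    return name in extensions.split()


def pick_d3d9_sharing(extensions):
    """Return the D3D9 sharing extension to use (KHR preferred), or None."""
    for name in (KHR_DX9, NV_D3D9):
        if has_extension(extensions, name):
            return name
    return None


# Hypothetical extension string, in the format a device reports it.
sample = "cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing"
print(pick_d3d9_sharing(sample))
```

If the result is None, the device offers neither extension and Direct3D 9 surfaces cannot be shared with OpenCL on it.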
I know Nvidia has CUDA, but what does ATI have? I don't want to use OpenCL because I want to stay as close to the hardware as possible.
Is it brook, or stream?
The documentation available is pretty pathetic! CUDA seems easy to get started with, but I want to use ATI specifically because of their hardware.
OpenCL is AMD's currently preferred GPU/compute language.
Brook is deprecated.
However, you can write code at a very low level, using AMD's shader and kernel analyzers:
http://developer.amd.com/tools/shader/Pages/default.aspx
http://developer.amd.com/tools/AMDAPPKernelAnalyzer/Pages/default.aspx
E.g. http://developer.amd.com/tools/shader/PublishingImages/GSA.png shows OpenCL code and the Radeon 5870 assembly produced.
You can actually code directly in several forms of "assembly".
Or at least you could - the webpages no longer mention this.
(I used to have this installed for tuning and testing, but do not at the moment.)
More usually, you can code in any of several forms of AMD IL (Intermediate Language), which is closer to the machine than OpenCL. The kernel analyzer web page says:
"If your kernel is an IL kernel Stream, KernelAnalyzer will automatically compile the IL..."
I would recommend that you use OpenCL, and then look at the disassembly and tweak the OpenCL code to be better tuned. But you can work in IL, and probably still can work at an even lower level.
GPGPU is the principle of using the parallel processors on video cards for massive increases in performance.
Does anyone have any ideas about using GPGPU in Delphi, using either OpenCL or CUDA? CUDA was/is NVidia only, but they have also adopted the OpenCL "standard".
I found a few Delphi samples from Google searches but they either crash or don't compile/run.
The ultimate instruction sample would be:
Download and install the OpenCL DLLs from here.
Download the OpenCL SDK from here.
Download this sample Delphi project from here.
Open and compile the Delphi project. If all goes to plan it will do "whatever it is supposed to do"
At that stage I can then start researching the OpenCL SDK and writing/compiling DLLs to call from any Delphi app.
This sort of stuff is really starting to take off. Embarcadero do not have to do anything themselves at this stage (unless they want to), but if there were a tutorial and samples for Delphi available it would be great. Many samples are available for other languages, but we do also need a good and simple Delphi example to show how easy it is to use Delphi for GPGPU apps.
CUDA is still nVidia only, and that won't change. OpenCL is a true standard in this case, not only limited to GPGPU.
As for using it in Delphi, all I know is how to use it in Free Pascal. However, there's a good chance that the code will be portable. Here's a link to updated headers:
FreePascal Mantis RFE OpenCL
As for the DLLs, if you use nVidia, they can be found here.
Here however we have a sample project in Delphi.
You could be interested in GPGPUonDelphi2007.
GPGPU example plus needed OpenGL and CG libraries for Delphi 2007 now available!
I created the necessary OpenGL and CG (delphi) packages yesterday and finished converting/translating/porting a C GPGPU OpenGL/CG example to Delphi today, and I would like to share it with you so that maybe some more (Delphi) people will look into GPGPU programming, especially with OpenGL 3.0 for (older) DX9 graphics cards.
You should use CUDA DELPHI
In native pascal code you can run CUDA kernels
I made a floating point test, using OpenCL and Delphi, some time ago:
https://plus.google.com/110131086673878874356/posts/eWcipK16MV7
(contains link to demo and sources)