I am having some difficulty finding a library with which to explore machine learning/AI. I have a pair of R9 290Xs and can't seem to find a library which works well with them.
First I tried ArrayFire, which has excellent CPU performance but poor GPU performance for machine learning, as demonstrated by the benchmarks in its machine_learning sample folder.
I looked into ROCm and MIOpen, and I tried the HIP-enabled TensorFlow, but found it is not supported on the 290X generation. I found someone working on llvm-amdgpu support for TensorFlow as well, but it doesn't look ready yet.
I looked into Accelerate for Haskell and found an issue regarding the AMDGPU backend, but that also doesn't look ready.
Maybe I haven't been searching broadly enough? But from what I can tell, almost everything runs on CUDA, and I can't afford a new GPU for this right now.
At the time you asked the question, AMD did not support Hawaii GPUs with their ROCm driver and compute stack.
Since then, support has been added for these older GPUs.
AMD has made a TensorFlow port which installs and functions the same way as CUDA TensorFlow. However, it doesn't support anything older than gfx803 (Fiji, such as the R9 Fury).
I have an R9 290 and it works with the latest ROCm drivers from AMD's repo, but not with the AMD TensorFlow port. This is the error I get:
2018-08-16 12:10:58.529311: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Ignoring visible gpu device (device: 0, name: Hawaii PRO [Radeon R9 290], pci bus id: 0000:01:00.0) with AMDGPU ISA gfx701. The minimum required AMDGPU ISA is gfx803.
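For completeness, here is a minimal sketch of how you can check what the HIP runtime itself sees, independently of TensorFlow (this assumes the ROCm/HIP runtime is installed and you compile with hipcc; the exact ISA string such as gfx701 or gfx803 can also be read from the rocminfo tool that ships with ROCm):

    #include <hip/hip_runtime.h>
    #include <cstdio>

    int main()
    {
        int count = 0;
        // Ask the HIP runtime how many AMD GPUs it can see at all.
        if (hipGetDeviceCount(&count) != hipSuccess || count == 0) {
            std::printf("no HIP devices visible\n");
            return 1;
        }
        for (int i = 0; i < count; ++i) {
            hipDeviceProp_t prop;
            if (hipGetDeviceProperties(&prop, i) == hipSuccess)
                std::printf("device %d: %s\n", i, prop.name);
        }
        return 0;
    }

If this lists your Hawaii card but TensorFlow still prints the "minimum required AMDGPU ISA is gfx803" message, the limitation is in the TensorFlow port rather than in the driver stack.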
So I have a GPU memory leak in certain scenarios in my application. However, I am not aware of any detailed memory profiler for the GPU like those for the CPU. Is there anything out there that can achieve this? I am using D3D (since it's WPF, there are D3D9, D3D10, and D3D11 components...)
Thanks!
Are you using the debug setting in the DirectX control panel? This helps you dump the ID of the leaking allocation. You can then set an HKLM registry value and break on the leaking allocation, as explained here:
http://legalizeadulthood.wordpress.com/2009/06/28/direct3d-programming-tip-5-use-the-debug-runtime/
http://www.gamedev.net/topic/313718-tracking-down-a-directx-leak/
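If the leak turns out to be on the D3D11 side, the debug layer can also enumerate everything still alive at shutdown. Here is a minimal sketch using only the standard D3D11 debug-layer API (the WPF/D3DImage wiring is left out, and the resource creation in the middle is a placeholder):

    #include <d3d11.h>
    #include <d3d11sdklayers.h>
    #include <wrl/client.h>
    #pragma comment(lib, "d3d11.lib")

    using Microsoft::WRL::ComPtr;

    int main()
    {
        ComPtr<ID3D11Device> device;
        ComPtr<ID3D11DeviceContext> context;

        // D3D11_CREATE_DEVICE_DEBUG turns on the debug runtime for this device.
        HRESULT hr = D3D11CreateDevice(
            nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr,
            D3D11_CREATE_DEVICE_DEBUG,
            nullptr, 0, D3D11_SDK_VERSION,
            &device, nullptr, &context);
        if (FAILED(hr))
            return 1;

        // ... create and (accidentally) leak resources here ...

        // Dump every D3D11 object that is still alive to the debugger output window.
        ComPtr<ID3D11Debug> debug;
        if (SUCCEEDED(device.As(&debug)))
            debug->ReportLiveDeviceObjects(D3D11_RLDO_DETAIL);

        return 0;
    }

Leaked objects then show up in the debugger output with their type and outstanding reference counts, which you can correlate with the allocation IDs reported by the debug runtime.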
You can also try Nsight, which you can download for free from NVIDIA. For Maximus cards there is also a dedicated GPU Debugger; otherwise you can use the Graphics Debugger and try to isolate the memory bump there. The Performance Debugger can detect both OpenGL and DirectX events, though it is more performance oriented.
Depending on your GPU's vendor (as you have not provided us with that information), here are the possible solutions:
Intel: Use the Intel Media SDK's GPU Utilization Utility. This comes packaged in the Intel INDE (Integrated Native Developer Experience).
AMD: CodeXL provides an on-the-fly debugger and an extensive memory profiling tool, and is now provided as part of their GPUOpen initiative.
NVIDIA: Use the NVIDIA Visual Profiler (NVVP) combined with traces from NVIDIA Nsight; both utilities are provided with the standard NVIDIA CUDA installer.
Notes:
With NVIDIA, you must also install the GPU driver bundled with the CUDA SDK to enable any form of GPU-based driver profiling and debugging. Take note of this limitation if you use your development rig for other purposes such as gaming, as the bundled driver is often much, much older than the stock Game Ready drivers.
Thanks and regards,
Brainiarc7.
I want to do a project which uses eye tracking. Is it possible to port OpenCV code to a microcontroller?
I am new to OpenCV as well as microcontrollers, so can anyone tell me if it is possible to make code which works like this video:
http://www.youtube.com/watch?feature=endscreen&v=eBtpKAja-m0&NR=1
Q: Can I use eye-detecting OpenCV code on a microcontroller?
A: Yes, you can.
Q: Is it possible to port OpenCV code to a microcontroller?
A: OpenCV already runs on Linux and Android. The easiest approach, therefore, is to get hold of an embedded ARM device; there is a lot of help available for the 'OpenCV-ARM' combination.
BeagleBoard and Raspberry Pi are among the cheapest embedded ARM devices, available for less than $150. They sometimes come preloaded with a Linux boot image and OpenCV 2.0, so it is easy to run the executable that you created on your computer system.
Be aware of the processor speed, though: if your algorithm is computationally intensive, you won't be entirely satisfied with the output obtained on low-end embedded devices.
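For reference, here is a minimal sketch of the kind of eye detection shown in the video, using the stock Haar cascade that ships with OpenCV and the 2.4-style C++ API (the cascade path and the camera index are assumptions; adjust them for your board):

    #include <opencv2/objdetect/objdetect.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <cstdio>
    #include <vector>

    int main()
    {
        cv::CascadeClassifier eyes;
        if (!eyes.load("haarcascade_eye.xml")) {   // ships in OpenCV's data/haarcascades
            std::fprintf(stderr, "could not load cascade\n");
            return 1;
        }

        cv::VideoCapture cap(0);                   // first camera on the board
        if (!cap.isOpened())
            return 1;

        cv::Mat frame, gray;
        while (cap.read(frame)) {
            cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
            cv::equalizeHist(gray, gray);

            std::vector<cv::Rect> found;
            eyes.detectMultiScale(gray, found, 1.1, 3, 0, cv::Size(30, 30));

            // Draw a rectangle around every detected eye.
            for (size_t i = 0; i < found.size(); ++i)
                cv::rectangle(frame, found[i], cv::Scalar(0, 255, 0), 2);

            cv::imshow("eyes", frame);
            if (cv::waitKey(10) == 27)             // Esc to quit
                break;
        }
        return 0;
    }

On a headless board you would drop the imshow/waitKey calls and act on the detected rectangles directly; detectMultiScale on a full frame is also the part you will most likely need to shrink or tune for speed.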
If some ARM embedded Linux board can fit into your definition of microcontroller, then there is nothing to port.
http://www.google.com/search?q=opencv+arm
I'm working on a project that will use an AMD GPU for processing data. I noticed AMD has two different SDKs available on their website for using the GPU: ATI Stream Technology with OpenCL™, and the AMD APP SDK. It looks like both support OpenCL, but I haven't found anything on the site explicitly pointing out why one would use one over the other. What's the difference between these two?
The AMD APP SDK is here: http://developer.amd.com/sdks/AMDAPPSDK/Pages/default.aspx
The website should also answer your question about the difference between Stream and APP:
AMD Accelerated Parallel Processing (APP) SDK (formerly ATI Stream)
It used to be called the AMD Stream SDK; they probably renamed it after adding support for non-FireStream hardware (namely, OpenCL).
Stream is the higher-level, AMD-specific project (hardware and software) that includes OpenCL as its current software implementation. Stream originally used the "Brook" language but switched to OpenCL in 2011. Since then OpenCL has become more popular (because it is a cross-platform standard that has been particularly well supported by Apple), and these days AMD doesn't seem to mention Stream much. You can see this in a link like http://www.amd.com/us/products/technologies/stream-technology/opencl/pages/opencl.aspx where OpenCL is a "child" of Stream (or in the menu on the left of that page, where the higher-level group is Stream; the other children are related to hardware).
In short, you want OpenCL. And despite the confusing mess that is AMD's site, their OpenCL implementation is pretty solid.
Hmmm, re-reading your question, you seem to say there are two separate SDKs. Do you actually drill down to two different packages? My understanding is that OpenCL is the Stream SDK. If you have found two different SDKs (that are both current), can you link to them?
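Whichever package you end up installing, a quick way to confirm what the AMD stack actually exposes is to enumerate the OpenCL platforms and devices and print their names. This is a minimal sketch using only standard OpenCL 1.x host calls (no AMD-specific extensions); link against the OpenCL library that ships with the SDK:

    #include <CL/cl.h>
    #include <stdio.h>

    int main(void)
    {
        cl_platform_id platforms[8];
        cl_uint num_platforms = 0;

        /* Ask the installed runtime which platforms are present. */
        if (clGetPlatformIDs(8, platforms, &num_platforms) != CL_SUCCESS || num_platforms == 0) {
            printf("no OpenCL platforms found\n");
            return 1;
        }

        for (cl_uint p = 0; p < num_platforms; ++p) {
            char name[256];
            clGetPlatformInfo(platforms[p], CL_PLATFORM_NAME, sizeof(name), name, NULL);
            printf("platform %u: %s\n", p, name);

            cl_device_id devices[8];
            cl_uint num_devices = 0;
            if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_ALL, 8, devices, &num_devices) != CL_SUCCESS)
                continue;

            for (cl_uint d = 0; d < num_devices; ++d) {
                clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof(name), name, NULL);
                printf("  device %u: %s\n", d, name);
            }
        }
        return 0;
    }

On a machine with the AMD SDK installed you should see the AMD platform listed along with your GPU (and usually the CPU exposed as an OpenCL device as well).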
I know NVIDIA has CUDA, but what does ATI have? I don't want to use OpenCL because I want to stay as close to the hardware as possible.
Is it Brook, or Stream?
The available documentation is pretty pathetic! CUDA seems easy to start programming with, but I want to use ATI specifically because of their hardware.
OpenCL is AMD's currently preferred GPU/compute language.
Brook is deprecated.
However, you can write code at a very low level using AMD's Shader Analyzer and Kernel Analyzer:
http://developer.amd.com/tools/shader/Pages/default.aspx
http://developer.amd.com/tools/AMDAPPKernelAnalyzer/Pages/default.aspx
For example, http://developer.amd.com/tools/shader/PublishingImages/GSA.png shows OpenCL code and the Radeon 5870 assembly produced from it.
You can actually code directly in several forms of "assembly".
Or at least you could - the webpages no longer mention this.
(I used to have this installed for tuning and testing, but do not at the moment.)
More usually, you can code in any of several forms of AMD IL (Intermediate Language), which is closer to the machine than OpenCL. The Kernel Analyzer web page says:
"If your kernel is an IL kernel Stream, KernelAnalyzer will automatically compile the IL..."
I would recommend that you use OpenCL, and then look at the disassembly and tweak the OpenCL code to be better tuned. But you can work in IL, and probably still can work at an even lower level.
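To illustrate that workflow, here is the sort of trivial OpenCL C kernel you might paste into the Kernel Analyzer and then inspect the IL/ISA it emits for your chosen Radeon target (the kernel itself is just an example, not taken from AMD's samples):

    /* Simple SAXPY kernel: y[i] = a * x[i] + y[i].
       Small enough that the generated IL and ISA are easy to read. */
    __kernel void saxpy(__global float *y,
                        __global const float *x,
                        const float a)
    {
        size_t i = get_global_id(0);
        y[i] = a * x[i] + y[i];
    }

Comparing the assembly the analyzer produces for small variations of a kernel like this is usually a faster way to learn the hardware than writing IL by hand.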
I recently started to learn how to use OpenCL to speed up parts of my code. So far the speed gain is impressive: in one case the code ran up to 50X faster than on the CPU. However, I wonder if I can start using this code in a production environment. The reason is that the first time I tried to run the example code, nothing worked. I was able to make it run by downloading the driver from the NVIDIA OpenCL SDK download page (I have a GeForce GTX 260). It gave me a blue screen during installation, but after that I was able to run the example program and create my own code.
Does the fact that it didn't work "out of the box" for me mean that the mainstream drivers do not yet support it, despite the driver download page specifically saying that they do? What about ATI support? Will everyone have to download the special driver that gave me a blue screen on install?
In short, is OpenCL ready for production code?
If someone can give me some details, I'd like to know. Has anyone been able to run a simple program on a number of different devices without installing anything SDK-related?
You may find an accurate answer on the OpenCL forums on the Khronos Group message boards. The OpenCL work group hangs out there regularly.
Has anyone been able to run a simple program on a number of different devices without installing anything SDK-related?
Nope. For instance, on ATI GPUs end users need to install the ATI Stream SDK in order to run OpenCL code (just having an up-to-date graphics driver is not sufficient).
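Because of that, a shipped application should probe for an OpenCL runtime at startup and fall back gracefully instead of assuming it is there. Here is a minimal sketch of such a check using only standard OpenCL host calls (have_opencl_gpu() is a hypothetical helper name; how you load the OpenCL library, statically or via dlopen/LoadLibrary, is up to your deployment):

    #include <CL/cl.h>

    /* Returns 1 if at least one OpenCL platform exposes a GPU device,
       0 otherwise, so the caller can choose the CPU code path instead. */
    int have_opencl_gpu(void)
    {
        cl_platform_id platforms[8];
        cl_uint num_platforms = 0;

        if (clGetPlatformIDs(8, platforms, &num_platforms) != CL_SUCCESS)
            return 0;

        for (cl_uint p = 0; p < num_platforms; ++p) {
            cl_device_id device;
            cl_uint num_devices = 0;
            if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_GPU, 1, &device, &num_devices) == CL_SUCCESS
                && num_devices > 0)
                return 1;
        }
        return 0;
    }

If you link against the OpenCL library dynamically, a missing runtime shows up as a failed library load rather than a CL error code, so wrap the load itself as well.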
You may want to consider trying DirectCompute (Microsoft's version of GPU programming) or doing your OpenCL work on a Snow Leopard Mac. Those are the two ways (that I know of) that you can deliver a GPU programming solution to another user without any driver or other installation hassle.