According to the documentation (https://developer.apple.com/documentation/metal/gpu_features/understanding_gpu_family_4), "On A7- through A10-based devices, Metal doesn't explicitly describe this tile-based architecture". In the same article I saw "Metal 2 on the A11 GPU" and got confused, because I could not find any more information about Metal 2 support in the Metal Shading Language specification. For example, I found the table "Attributes for fragment function tile input arguments" and the note "iOS: attributes in Table 5.5 are supported since Metal 2.0."
Is Metal 2 support specific to a GPU family?
Not all features are supported by all devices. Newer devices generally support more features, older devices might not support newer features.
Several factors determine this support.
First, each MTLDevice has a set of MTLGPUFamily values it supports, which you can query with the supportsFamily method. Some documentation articles mention which family a device needs to support to use a given feature, but generally you can find that information in the Metal Feature Set Tables. Support for a family may vary depending on the chip itself and on how much memory or other hardware units are available to it; chips are grouped into families based on those.
Second, there are other supports* queries on an MTLDevice that don't depend on the family of the device but rather on the device itself, for example the supportsRaytracing query. These are also based on the GPU itself, but are separate, probably because they don't fall neatly into any of the "families".
The third kind of support is based on the OS version. Newer OS versions might ship new APIs or extensions to existing APIs. Those are marked with API_AVAILABLE macros in the headers and may only be used on OS versions that are the same or higher. To query support for these, use either the availability macros, the if (@available(...)) check in Objective-C, or the if #available syntax in Swift. Here, API availability is affected less by the GPU itself and more by having a newer OS and drivers to go with it.
The last kind of "support" that limits some features is the Metal Shading Language version. It's tied to the OS version, and it is what the notes in the Metal Shading Language specification that you mention in your question refer to. Here, the availability of a feature is a mix of compiler version limitations and device limitations. Not everyone uses the latest spec; I think most production game engines use something like Metal 2.1, at least in games that aren't on the latest engine versions. For example, tile shaders are limited to a compiler version, but they are also limited to Apple Silicon GPUs.
So there are different types of support at play when you are using Metal in your application and it's easy to confuse them, but it's important to know each one.
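As a rough illustration of how these kinds of support combine, here is a minimal C++ sketch. It is a plain model and not Metal API code: MslVersion, MetalSupport and canUseTileShaders are hypothetical names, and a real app would fill the fields from a supportsFamily query, an OS availability check, and the language version its shaders are compiled for. The 2.0 threshold is taken from the specification note quoted in the question.

```cpp
// Hypothetical model of the checks described above; none of these names are Metal API.
struct MslVersion {
    int majorVer;
    int minorVer;
};

// True when `have` is the same version as `need` or newer.
constexpr bool atLeast(MslVersion have, MslVersion need) {
    return have.majorVer > need.majorVer ||
           (have.majorVer == need.majorVer && have.minorVer >= need.minorVer);
}

struct MetalSupport {
    bool gpuFamilyOk;       // result of the device's supportsFamily query
    bool osApiAvailable;    // result of an OS availability check
    MslVersion shaderLang;  // MSL version the shaders are compiled for
};

// A feature is usable only when every kind of support is present at once.
constexpr bool canUseTileShaders(MetalSupport s) {
    return s.gpuFamilyOk && s.osApiAvailable &&
           atLeast(s.shaderLang, MslVersion{2, 0});
}
```

The point of the sketch is only that the checks are independent: a new enough OS with an older GPU family fails, and so does a capable GPU whose shaders are compiled for an older language version.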
Related
I'm having a hard time every time I look at SharpDX code and try to follow the DirectX documentation. Is there a place that clearly lays out what each of the numbered classes maps to and why they exist?
I'm talking about things like:
DXGI.Device
DXGI.Device1
DXGI.Device2
DXGI.Device3
DXGI.Device4
SharpDX.Direct3D11.Device
SharpDX.Direct3D11.Device1
SharpDX.Direct3D11.Device11On12
SharpDX.Direct3D11.Device2
SharpDX.Direct3D11.Device3
SharpDX.Direct3D11.Device4
SharpDX.Direct3D11.Device5
SharpDX.Direct3D11.DeviceContext
SharpDX.Direct3D11.DeviceContext1
SharpDX.Direct3D11.DeviceContext2
SharpDX.Direct3D11.DeviceContext3
SharpDX.Direct3D11.DeviceContext4
Every time I start from code I find, the choices seem to be picked by black magic and I have no idea where to go from there. For example, I'm using this (from code I found) and I have no idea why it's Device3 and Factory3 going with SwapChain1, on which we then QueryInterface for SwapChain2:
using (DXGI.Device3 dxgiDevice3 = this.device.QueryInterface<DXGI.Device3>())
using (DXGI.Factory3 dxgiFactory3 = dxgiDevice3.Adapter.GetParent<DXGI.Factory3>())
{
    DXGI.SwapChain1 swapChain1 = new DXGI.SwapChain1(dxgiFactory3, this.device, ref swapChainDescription);
    this.swapChain = swapChain1.QueryInterface<DXGI.SwapChain2>();
}
If a full explanation is too large for the scope of an answer here, any link to get me started on figuring out which C++ DX interface maps to which numbered object, and why, is most welcome.
In case this matters, I'm only interested in DX >= 11, and I'm using SharpDX within a UWP project.
SharpDx is a pretty thin wrapper around DirectX, and pretty much everything in DirectX is expressed in SharpDx as a pass-through with some naming and calling conventions to accommodate the .net world.
Real documentation on SharpDx is essentially nonexistent, so you will have to do what everybody else does. If you are starting with something in SharpDx then look directly at the SharpDx API listings and the header files to understand what underlying DirectX functions are being expressed. Once you have the name of the DirectX function, you can read the MSDN documentation to understand how that function works. If you are starting with something in DirectX, then look first at MSDN to understand how it works and how it's named, and then go to the SharpDx API and header files to find out how that function is wrapped (named and exposed) in SharpDx.
For the specific question you ask, SharpDx device numbering identifies the Direct3D version that is being wrapped.
Direct3D 11.1 device ==> ID3D11Device1 ==> SharpDX.Direct3D11.Device1
Direct3D 11.2 device ==> ID3D11Device2 ==> SharpDX.Direct3D11.Device2
Direct3D 11.3 device ==> ID3D11Device3 ==> SharpDX.Direct3D11.Device3
and so on.
Naturally each version has a slightly different ("improved") interface. Lower version numbers will work pretty much anywhere, and higher version numbers include additional functionality that may require something specific from your video card and/or your operating system. You can read about the API for each version in sections found here.
For example, the description of the new methods added to the ID3D11Device5 interface (i.e, what's new since ID3D11Device4) is here. In this case, Device5 adds the ability to create a fence object and to open a handle for a shared fence.
When example code uses a specific device number, it's usually because the code requires some functionality that wasn't there in a previous version of Direct3D. In general you want to use the lowest numbered device (and factory, etc.) possible, because that will permit your code to run on the widest variety of machines and video cards.
If you find example code that creates a SharpDX.Direct3D11.Device1 but doesn't appear to use any methods beyond those in SharpDX.Direct3D11.Device, it's probably for one of two reasons. First, the author may know that a later example will require a method or field that doesn't exist before Direct3D 11.1. Second, the author may know that every video card and operating system capable of running the example at all will be capable of running Direct3D 11.1.
For a person just starting out, I would suggest you just stick with Direct3D (and Direct2D) version 11.1, thus DXGI.Device1, SharpDX.Direct3D11.Device1 and SharpDX.Direct3D11.DeviceContext1. These are likely to run on any machine you'll encounter. Only increase the version number if you actually need some functionality that doesn't appear in that version.
One additional hint: if you read a thread about some Direct3D or Direct2D functionality and you can't seem to find it anywhere in SharpDx, look at the Direct3D API to see what version number first contains that functionality. Then go through the SharpDx API (or better yet the header files) for that version until you see a similarly named element. It may be wrapped in an unexpected way, but AFAIK it's all exposed, even when you have a hard time finding it.
Here you can find all the SharpDx objects, and the ones for DXGI specifically are listed here. There you can see that Device is mapped to IDXGIDevice.
Note that the name IDXGIDevice is a hyperlink to the documentation for the C++ object. The same goes for Device1, Device2 and so on.
You can see that the logic is very simple: SharpDx splits the name of the C++ object into a namespace and a class name.
For example, instead of IDXGIDevice you get namespace DXGI and class name Device.
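The naming rule above can be sketched as a small C++ function. This is only a rule of thumb under stated assumptions: SharpDxName and toSharpDxName are hypothetical helpers, only the IDXGI and ID3D11 prefixes are handled, and some wrapped names do not follow the pattern (Device11On12 in the question's list is one that a prefix split would not produce).

```cpp
#include <string_view>

// Hypothetical helper type: the two halves of a SharpDX class name.
struct SharpDxName {
    std::string_view ns;    // SharpDX namespace
    std::string_view type;  // class name inside that namespace
};

constexpr bool startsWith(std::string_view text, std::string_view prefix) {
    return text.substr(0, prefix.size()) == prefix;
}

// Splits a COM interface name into namespace and class name by stripping
// the "I<api>" prefix. Unknown prefixes come back with an empty namespace.
constexpr SharpDxName toSharpDxName(std::string_view comInterface) {
    if (startsWith(comInterface, "IDXGI")) {
        return {"SharpDX.DXGI", comInterface.substr(5)};
    }
    if (startsWith(comInterface, "ID3D11")) {
        return {"SharpDX.Direct3D11", comInterface.substr(6)};
    }
    return {"", comInterface};
}
```

Going the other way (from a SharpDx class back to the C++ interface) is the same split in reverse, which is how you find the right MSDN page for a numbered class.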
In the documentation for each C++ object you can find a Requirements section.
It details which operating systems the object can be used on.
The higher the number, the newer the operating system the object requires.
For example, IDXGIDevice1 works under Windows 7, however IDXGIDevice3 works under Windows 8.1 or higher.
With the push towards multimedia-enabled mobile devices, this seems like a logical way to boost performance on these platforms while keeping general-purpose software power efficient. I've been interested in the iPad hardware as a development platform for UI and data display/entry usage, but I am curious how much processing capability the device itself has. OpenCL would make it a JUICY hardware platform to develop on, even though the licensing seems like it kinda stinks.
OpenCL is not yet part of iOS.
However, the newer iPhones, iPod touches, and the iPad all have GPUs that support OpenGL ES 2.0. 2.0 lets you create your own programmable shaders to run on the GPU, which would let you do high-performance parallel calculations. While not as elegant as OpenCL, you might be able to solve many of the same problems.
Additionally, iOS 4.0 brought with it the Accelerate framework which gives you access to many common vector-based operations for high-performance computing on the CPU. See Session 202 - The Accelerate framework for iPhone OS in the WWDC 2010 videos for more on this.
Caution! This question is ranked as the 2nd result by Google. However, most answers here (including mine) are out of date. People interested in OpenCL on iOS should visit more up-to-date entries like this one: https://stackoverflow.com/a/18847804/443016.
http://www.macrumors.com/2011/01/14/ios-4-3-beta-hints-at-opencl-capable-sgx543-gpu-in-future-devices/
The iPad 2's GPU, the PowerVR SGX543, is capable of OpenCL.
Let's wait and see which iOS release will bring OpenCL APIs to us.:)
Following from nacho4d:
There is indeed an OpenCL.framework in iOS 5's private frameworks directory, so I would suppose iOS 6 is the one to watch for OpenCL.
Actually, I've seen it in OpenGL-related crash logs for my iPad 1, although that could just be CPU (implementing parts of the graphics stack perhaps, like on OSX).
You can compile and run OpenCL code on iOS using the private OpenCL framework, but you probably won't get such a project into the App Store (Apple doesn't want you to use private frameworks).
Here is how to do it:
https://github.com/linusyang/opencl-test-ios
OpenCL? Not yet.
A good way of guessing the next public frameworks in iOS is to look at the Private Frameworks directory.
If you see there what you are looking for, then there are chances.
If not, then wait for the next release and look again in the Private stuff.
I guess CoreImage is coming first because OpenCL is too low level ;)
Anyway, this is just a guess
I am writing a small utility that reports system capabilities. One is the highest shader model supported by the installed graphics card, and I am currently detecting this using Direct3D 9.0c's device capabilities and checking the VertexShaderVersion and PixelShaderVersion fields of the D3DCAPS9 structure.
HRESULT hrDCaps = poD3D9->GetDeviceCaps(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, &oCaps);
if (SUCCEEDED(hrDCaps)) {
    // Pixel and vertex shader model versions. Use the minimum of the two as "the" shader model version.
    const int iVertexShaderModel = D3DSHADER_VERSION_MAJOR(oCaps.VertexShaderVersion);
    const int iPixelShaderModel = D3DSHADER_VERSION_MAJOR(oCaps.PixelShaderVersion);
}
However, both these values return shader model 3 even for cards that support higher models. Here is what GPU-Z returns for the same card, for example:
This question indicates that DX9 will never report more than SM3 even on cards that support a higher model, but doesn't actually mention how to solve it.
How do I accurately get the shader model supported by the installed card? That is, the card capabilities, not the installed DirectX driver capabilities.
The utility has to run on Windows 2000 and above, and work on systems where a graphics card and even DirectX are not installed. I am currently dynamically loading DX9, so on those systems the check gracefully fails (which is ok.) But I am seeking a similar solution: something that will still run on all systems, and work correctly (detect the SM version) on most systems.
Edit - purpose: I am not using this code to dynamically change features of a program, i.e. to select shaders. I am using it to report hardware capabilities as a 'ping' to a server, so that we have a good idea of the typical hardware our customers use, which can inform future product decisions. (For example: how many customers have SM4 or above? How many are using a 64-bit OS? Etc.) This is why either (a) gracefully failing, so we know it failed, or (b) getting an accurate shader model number are the two preferred modes.
Edit - answers so far: The answer below by SigTerm suggests instantiating DirectX 11, 10.1, 10, and 9.0c in order, and basing the reported shader model on which version instantiated without failures (shader model 5, 4.1, 4, and DXCAPS in that order.) If possible, I'd appreciate a code example of the DX11 and 10 ways to do this.
This may not be a reliable solution. For example, I am running Windows on a VMWare Fusion virtual machine on OSX. The Fusion drivers report DX11 in DxDiag, yet I know from the Fusion tech specs that it only supports DX9.0c and shader model 3. Still, with this exception, this method seems the best way so far.
Shader model 4 is only supported from Direct3D 10 onward. Therefore, the D3D9 API won't report it. Use the D3D10/D3D11 API to detect higher versions.
something that will still run on all systems, and work correctly (detect the SM version) on most systems.
Attempt to initialize D3D10/D3D11 to check functionality; if that fails, initialize D3D9. Use LoadLibrary + GetProcAddress to load the D3D10 functions, because if you link with D3D10 using a .lib file, your application will fail to start when D3D10 is missing.
Or use OpenGL and try to map capabilities reported by OpenGL to D3D capabilities (probably a very bad idea).
Or build GPU database and use that.
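For the first option (try D3D11, then fall back to D3D9), the part that can be shown without Windows headers is the mapping from the feature level a device was created with to a shader model. The following is a minimal sketch, not a tested detection routine. The enum values mirror D3D_FEATURE_LEVEL from d3dcommon.h; FeatureLevel, ShaderModel, shaderModelFor and reportShaderModel are hypothetical names. In a real program the level would come from the pFeatureLevel output of D3D11CreateDevice, resolved at run time with LoadLibrary("d3d11.dll") and GetProcAddress as described above.

```cpp
#include <cstdint>

// Values mirror the D3D_FEATURE_LEVEL enum in d3dcommon.h.
enum class FeatureLevel : std::uint32_t {
    Level_9_1  = 0x9100,
    Level_9_2  = 0x9200,
    Level_9_3  = 0x9300,
    Level_10_0 = 0xa000,
    Level_10_1 = 0xa100,
    Level_11_0 = 0xb000,
    Level_11_1 = 0xb100,
};

struct ShaderModel {
    int majorVer;
    int minorVer;
};

// Shader model implied by a feature level. Returns {0, 0} for the 9_x
// levels, where the D3D9 caps are the better source of information.
constexpr ShaderModel shaderModelFor(FeatureLevel level) {
    switch (level) {
        case FeatureLevel::Level_11_1:
        case FeatureLevel::Level_11_0: return {5, 0};
        case FeatureLevel::Level_10_1: return {4, 1};
        case FeatureLevel::Level_10_0: return {4, 0};
        default:                       return {0, 0};
    }
}

// Combines both probes: prefer the D3D11 result, otherwise report the
// major version read from D3DCAPS9 (as in the question's snippet).
constexpr ShaderModel reportShaderModel(bool d3d11DeviceCreated,
                                        FeatureLevel level,
                                        int d3d9CapsMajor) {
    if (d3d11DeviceCreated) {
        const ShaderModel sm = shaderModelFor(level);
        if (sm.majorVer != 0) {
            return sm;
        }
    }
    return {d3d9CapsMajor, 0};
}
```

Note that the feature level reflects what the driver exposes, so a virtual machine that advertises more than it actually supports would still be misreported.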
where a graphics card and even DirectX are not installed.
I think you're asking for the impossible, because shaders are provided by DirectX, and the driver/GPU might not even have a concept of a "shader model" under the hood. In this case the only way to detect capabilities is to build a GPU database of some sort, detect the installed devices, and return the answer from the database. This won't be reliable, of course.
Here is a link about DirectX versions and supported shader models.
I'm working on a project that will use an AMD GPU for processing data. I noticed AMD has two different SDKs available on their website for using the GPU: ATI Stream Technology and OpenCL™ and the AMD APP SDK. It looks like both support OpenCL but I haven't found anything on the site explicitly pointing out why one would use one over the other. What's the difference between these two?
The AMD APP SDK is here: http://developer.amd.com/sdks/AMDAPPSDK/Pages/default.aspx
The website should also answer your question about the difference between Stream and APP:
AMD Accelerated Parallel Processing (APP) SDK (formerly ATI Stream)
It used to be called the AMD Stream SDK. They probably renamed it after adding support for non-FireStream hardware (namely OpenCL).
stream is the higher level amd-specific project (hardware and software) that includes opencl as the current software implementation. stream originally used the "brook" language, but switched to opencl in 2011. since then opencl became more popular (because it is a cross-platform standard that has been particularly well supported by apple) and these days amd doesn't seem to mention stream much. you can see this in a link like http://www.amd.com/us/products/technologies/stream-technology/opencl/pages/opencl.aspx where opencl is a "child" of stream (or the menu on the left of that page, where the higher level group is stream; other children are related to hardware).
in short, you want opencl. and despite the confusing mess that is amd's site, their opencl implementation is pretty solid.
hmmm. re-reading your question you seem to say there are two separate sdks. do you actually drill down to two different packages? my understanding is that opencl is the stream sdk. if you have found two different sdks (that are both current) can you link to them?
Are there any limitations with using DirectCompute on DX10.1 GPUs? I will do most of my development on a DX11 desktop, but I'd like to demo code on a DX10.1 laptop. It'll be a Macbook Pro running Win7 in Bootcamp. The GPU is an Nvidia 330M. What limitations can I expect?
Edit: I found a page about using Compute Shaders on DX10, but it's not entirely clear to me if these are serious limitations or not.
Edit 2: My goal is to learn a bit about quantitative finance and solving PDEs.
Frankly, I think CS 4.x is rather limiting because of the lack of atomics and double precision, the restrictions on accessing groupshared memory, and the 16KB limit. Also, only one UAV can be bound.
I believe most DirectCompute developers will use CS 4.x for post-processing in games or similar (probably with both a CS 4.x and a CS 5.0 code path). People who want to do heavy GPGPU work will learn with CS 4.x and later move on to CS 5.0.
Now, you're saying you don't have a clear picture of the CS 4.x limitations. I suggest going with CS 4.x and sticking to it for now.
But really it all depends what you are developing, how and your target audience (professional developer vs hobby coder, shipping your application now vs in two years, mainstream audience vs pro market etc).
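To make the limits above concrete, here is a minimal C++ sketch that checks whether a kernel's requirements fit a compute shader profile. ComputeProfileLimits, KernelNeeds and fitsProfile are hypothetical names. The 16 KB and single-UAV figures come from the answer above; the 768 and 1024 thread limits, the 32 KB figure and the 8 UAV slots are my recollection of the Direct3D 11 compute shader documentation, so treat them as assumptions to verify. Double precision is left out because it is optional hardware support even with CS 5.0.

```cpp
// Hypothetical description of what a compute shader profile allows.
struct ComputeProfileLimits {
    int maxThreadsPerGroup;
    int maxGroupSharedBytes;
    int maxBoundUavs;
    bool hasAtomics;
};

// Assumed values; check them against the Direct3D documentation.
constexpr ComputeProfileLimits kCs4x{768, 16 * 1024, 1, false};
constexpr ComputeProfileLimits kCs50{1024, 32 * 1024, 8, true};

// Hypothetical description of what a particular kernel needs.
struct KernelNeeds {
    int threadsPerGroup;
    int groupSharedBytes;
    int uavCount;
    bool usesAtomics;
};

// True when every requirement of the kernel is within the profile's limits.
constexpr bool fitsProfile(KernelNeeds k, ComputeProfileLimits p) {
    return k.threadsPerGroup <= p.maxThreadsPerGroup &&
           k.groupSharedBytes <= p.maxGroupSharedBytes &&
           k.uavCount <= p.maxBoundUavs &&
           (!k.usesAtomics || p.hasAtomics);
}
```

A kernel that stays inside the kCs4x limits can be developed on the DX11 desktop and still be demoed on the DX10.1 laptop; one that needs a second UAV or atomics cannot.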
I can't tell you if the limitations are serious or not, as 1) it depends on what you're trying to achieve, and 2) I simply don't know enough about the compute shader.
However, you can run the DirectX Caps Viewer to see what features your device will support (or what limitations you can expect). Also, AFAIK other than the limitations highlighted in the link you posted, you will only be able to use CS 4.0, not the new features in CS 5.0.