Sampling BC5_SNORM texture yields incorrect value range - directx

I'm working with Direct3D 11, and I've come across something strange. I have taken a normal map and encoded it to a DDS file twice: once with R8G8B8A8_SNORM encoding, and once with BC5_SNORM.
Next I load each texture using D3DX11CreateShaderResourceViewFromFile in conjunction with D3DX11GetImageInfoFromFile. When I sample these textures in my pixel shader I find that the R8G8B8A8_SNORM texture is returning values in the range [-1,1], which is what I would expect for a SNORM texture. However, the BC5_SNORM texture is returning values in the range [0,1], which doesn't make any sense to me.
I double- and triple-checked with my debugger and PIX. The format of the texture is correct (BC5_*S*NORM), so I am at a loss as to why it's not returning signed values.

I managed to reproduce the same issue, and I also got the same behaviour when converting an R8G8B8A8_SNORM texture (with -1 to +1 values) to BC5_SNORM (producing only 0 to 1 values) through D3DX11LoadTextureFromTexture. There does appear to be a fault in D3DX11, at least regarding BC5_SNORM: regardless of the input format, the BC5_SNORM output is always in the 0 to 1 range.
As suggested by @chuckwalbourn, I can confirm that the DirectXTex utilities, which supersede the now deprecated D3DX11, do respect and correctly handle signed values for BC5_SNORM outputs.
You can either have your program write out a temporary .dds (using D3DX11SaveTextureToFile with a R8G8B8A8_SNORM texture) and then invoke the standalone DirectXTex 'texconv.exe' utility to convert to BC5_SNORM, or wrangle the DirectXTex library into your program and use the 'Convert(...)' function appropriately.
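For reference, the ranges discussed above follow from how normalized integer data is decoded. The sketch below uses hypothetical helper names and shows the standard 8-bit SNORM and UNORM decodes on the CPU. It only illustrates why the same bits land in [-1, 1] or [0, 1] depending on interpretation; it does not claim to reproduce what D3DX11 does internally.

```cpp
#include <algorithm>
#include <cstdint>

// Decode an 8-bit SNORM value: -127..127 maps linearly to [-1, 1],
// and -128 is clamped to -1 as well.
float decode_snorm8(std::int8_t c) {
    return std::max(static_cast<float>(c) / 127.0f, -1.0f);
}

// Decode an 8-bit UNORM value: 0..255 maps linearly to [0, 1].
float decode_unorm8(std::uint8_t c) {
    return static_cast<float>(c) / 255.0f;
}
```

For example, the byte 0x81 decodes to -1.0 as SNORM but to roughly 0.506 as UNORM, so data that is encoded or interpreted as unsigned can never produce a negative sample.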

Related

DirectCompute: How to read from a RWTexture2D<float4>?

I have the following buffer:
RWTexture2D<float4> Output : register(u0);
This buffer is used by a compute shader for rendering a computed image.
To write a pixel in that texture, I just use code similar to this:
Output[XY] = SomeFunctionReturningFloat4(SomeArgument);
This works very well and my computed image is correctly rendered on screen.
Now at some stage in the compute shader, I would like to read back an
already computed pixel and process it again.
Output[XY] = SomeOtherFunctionReturningFloat4(Output[XY]);
The compiler returns an error:
error X3676: typed UAV loads are only allowed for single-component 32-bit element types
Any help appreciated.
In Compute Shaders, data access is limited on some data types, and not at all intuitive and straightforward. In your case, you use a
RWTexture2D<float4>
That is a typed UAV with the DXGI_FORMAT_R32G32B32A32_FLOAT format.
This format supports typed UAV stores, but not typed UAV loads.
Basically, you can write to it, but you cannot read from it. Typed UAV loads only support single-component 32-bit formats, in your case DXGI_FORMAT_R32_FLOAT (32 bits and that's all).
Your code should run if you use a RWTexture2D<float> but I suppose this is not enough for you.
Possible workarounds that spring to mind are:
1. Using 4 different RWTexture2D<float>, one for each component.
2. Using 2 different textures: a RWTexture2D<float4> to write your values and a Texture2D<float4> to read from.
3. Using a RWStructuredBuffer instead of the texture.
I don’t know your code, so I don’t know if solutions 1. and 2. are viable. However, I strongly suggest going for 3. and using a RWStructuredBuffer. A RWStructuredBuffer can hold any type of struct and can easily cover all your needs. To be honest, in compute shaders I almost only use them to pass data. If you need the final output to be a texture, you can do all your calculations on the buffer, then copy the results to the texture when you’re done. I would add that drivers often use CompletePath to access RWTexture2D data and FastPath to access RWStructuredBuffer data, making the former much slower than the latter.
Reference for data type access is here. Scroll down to UAV typed load.

Equivalent of glColorMask in Metal for a kernel program?

I am trying to move from OpenGL to Metal for my iOS apps. In my OpenGL code I use glColorMask (if I want to write only to selected channels, for example only to alpha channel of a texture) in many places.
In Metal, for the render pipeline (through vertex and fragment shaders), MTLColorWriteMask seems to be the equivalent of glColorMask. I can set it up while creating a MTLRenderPipelineState through the MTLRenderPipelineDescriptor.
But I could not find a similar option for compute pipeline (through kernel function). I always need to write all the channels (red, green, blue and alpha) every time I write to an output texture. What if I want to preserve the alpha (or any other channel) and only want to modify the color channels? I can create a copy of the output texture and use it as one of the inputs and read alpha from it to preserve the values but that is expensive.
Computer memory architectures don't like writing only some bytes of data. A write to 1 out of 4 bytes usually involves reading those four bytes into the cache, modifying one of them in the cache, and then writing those four bytes back out into memory. Well, most computers read/write a lot more than 4 bytes at a time, but you get the idea.
This happens with framebuffers too. If you use a partial write mask, the hardware is still going to do the equivalent of a read/modify/write on that texture. It's just not changing all of the bytes it reads.
So you can do the same thing from your compute shader. Read the 4-vector value, modify the channels you want, and then write it back out. As long as the read and write are from the same shader invocation, there should be no synchronization problems (assuming that no other invocations are trying to read/write to that same location, but if that were the case, you'd have problems anyway).
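The read/modify/write idea can be sketched on the CPU as follows. This is only an analogy of what one kernel invocation would do for its own pixel; the `Pixel` alias, the `ChannelMask` enum and `masked_write` are hypothetical names introduced for illustration, not Metal API.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Pixel = std::array<float, 4>;  // r, g, b, a

// Hypothetical channel bits, one per component.
enum ChannelMask : std::uint8_t { kRed = 1, kGreen = 2, kBlue = 4, kAlpha = 8 };

// Start from the value read from the texture, overwrite only the masked
// channels with the newly computed value, and return what to write back.
Pixel masked_write(const Pixel& stored, const Pixel& computed, std::uint8_t mask) {
    assert((mask & ~0xFu) == 0);  // only four channels exist
    Pixel out = stored;
    for (int c = 0; c < 4; ++c) {
        if (mask & (1u << c)) {
            out[c] = computed[c];
        }
    }
    return out;
}
```

Calling `masked_write(stored, computed, kRed | kGreen | kBlue)` keeps the stored alpha and replaces only the color channels, which is the effect a color write mask would have.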

What factors determine DXGI_FORMAT?

I am not familiar with DirectX, but I ran into a problem in a small project, part of which involves capturing DirectX data. I hope the following makes sense.
General question:
I would like to know what factors determine the DXGI_FORMAT of a texture in the backbuffer (hardware?, OS?, application?, directx version?). And more importantly, when capturing a texture from the backbuffer, is it possible to receive a texture in the desired format by supplying the format as a parameter, having the format automatically converted if necessary.
Specifics about my problem :
I capture screens from games using Open Broadcaster Software(OBS) and process them using a specific library(OpenCV) prior to streaming. I noticed that, following updates to both Windows and OBS, I get 'DXGI_FORMAT_R10G10B10A2_UNORM' as the DXGI_FORMAT. This is a problem for me, because as far as I know OpenCV does not provide a convenient way for building an OpenCV object when colors are 10bits. Below are a few relevant lines from the modified OBS source file.
d3d11_copy_texture(data.texture, backbuffer);
...
hlog(toStr(data.format)); // prints 24 = DXGI_FORMAT_R10G10B10A2_UNORM
...
ID3D11Texture2D* tex;
bool success = create_d3d11_stage_surface(&tex);
if (success) {
...
HRESULT hr = data.context->Map(tex, subresource, D3D11_MAP_READ, 0, &mappedTex);
...
Mat frame(data.cy, data.cx, CV_8UC4, mappedTex.pData, (int)mappedTex.RowPitch); //This creates an OpenCV Mat object.
//No support for 10-bit colors. Expects 8-bit colors (CV_8UC4 argument).
//When the resulting Mat is viewed, colours are jumbled (Probably because 10-bits did not fit into 8-bits).
Before the updates (when I was working on this a year ago), I was probably receiving DXGI_FORMAT = DXGI_FORMAT_B8G8R8A8_UNORM, because the code above used to work.
Now I wonder what changed, and whether I can modify the source code of OBS to receive data with the desired DXGI_FORMAT.
'create_d3d11_stage_surface' method called above sets the DXGI_FORMAT, but I am not sure if it means 'give me data with this DXGI_FORMAT' or 'I know you work with this format, give me what you have'.
static bool create_d3d11_stage_surface(ID3D11Texture2D **tex)
{
HRESULT hr;
D3D11_TEXTURE2D_DESC desc = {};
desc.Width = data.cx;
desc.Height = data.cy;
desc.Format = data.format;
...
I hoped that, overriding the desc.Format with DXGI_FORMAT_B8G8R8A8_UNORM would result in that format being passed as argument in the ID3D11DeviceContext::Map call above, and I would get data with specified format. But that did not work.
The choice of render target is up to the application, but they need to pick one based on the Direct3D hardware feature level. Formats for render targets in swapchains are usually display scanout formats:
DXGI_FORMAT_R8G8B8A8_UNORM
DXGI_FORMAT_R8G8B8A8_UNORM_SRGB
DXGI_FORMAT_B8G8R8A8_UNORM
DXGI_FORMAT_B8G8R8A8_UNORM_SRGB
DXGI_FORMAT_R10G10B10A2_UNORM
DXGI_FORMAT_R16G16B16A16_FLOAT
DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM (rare)
See the DXGI documentation for the full list of supported formats and usages by feature level.
Direct3D 11 does not do format conversions when you copy resources such as copying to staging render textures, so if you want to do a format conversion you'll need to handle that yourself. Note that CPU-side conversion code for all the DXGI formats can be found in DirectXTex.
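As a minimal sketch of handling the conversion yourself on the CPU, the function below repacks a mapped DXGI_FORMAT_R10G10B10A2_UNORM image into tightly packed 8-bit B, G, R, A bytes. The function names are hypothetical. It assumes a little-endian host and the documented bit layout (red in bits 0-9, green in 10-19, blue in 20-29, alpha in 30-31), and it treats the values as plain UNORM, so HDR or wide-gamut content may need additional color processing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Rescale a 10-bit UNORM channel (0..1023) to 8 bits (0..255) with rounding.
static std::uint8_t unorm10_to_8(std::uint32_t v) {
    return static_cast<std::uint8_t>((v * 255u + 511u) / 1023u);
}

// Convert a mapped R10G10B10A2_UNORM image to tightly packed B8G8R8A8 bytes.
// rowPitch is the source row stride in bytes (e.g. mappedTex.RowPitch).
std::vector<std::uint8_t> r10g10b10a2_to_bgra8(const void* src, std::size_t rowPitch,
                                               std::size_t width, std::size_t height) {
    assert(rowPitch >= width * 4);
    std::vector<std::uint8_t> dst(width * height * 4);
    const auto* bytes = static_cast<const std::uint8_t*>(src);
    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* row = bytes + y * rowPitch;
        for (std::size_t x = 0; x < width; ++x) {
            std::uint32_t p;
            std::memcpy(&p, row + x * 4, sizeof p);  // assumes little-endian host
            std::uint8_t* out = &dst[(y * width + x) * 4];
            out[0] = unorm10_to_8((p >> 20) & 0x3FFu);             // B
            out[1] = unorm10_to_8((p >> 10) & 0x3FFu);             // G
            out[2] = unorm10_to_8(p & 0x3FFu);                     // R
            out[3] = static_cast<std::uint8_t>((p >> 30) * 85u);   // A: 0, 85, 170, 255
        }
    }
    return dst;
}
```

The returned buffer has a row stride of width * 4 bytes, so it could back a CV_8UC4 cv::Mat in place of the mapped pointer used in the question.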
It is the application that decides the format. The simplest one would be R8G8B8A8, which simply represents RGB and alpha values. But if the developer decides to use HDR, the back buffer would probably be R11G11B10, because it stores more precise color data at the cost of the alpha channel. If the game is, for example, black and white, there's no need to keep all RGB channels in the back buffer, and a simpler format could be used. I hope this helps.

iOS Import .obj file to Model I/O without duplicating vertices

I'm trying to import a .obj file to use in Scene Kit using the Model I/O framework. I initially used the simple MDLAsset initWithURL: function, but after transferring the mesh to a SCNGeometry, I realized this function was triangulating the mesh, such that each face had 3 unique vertices, and there were separate vertices at the same location for border faces. This was causing some major problems with my other functions, so I tried to fix it by instead using the MDLAsset initWithURL:vertexDescriptor:bufferAllocator:preserveTopology function with preserveTopology set to YES and the descriptor/allocator left at the default by passing nil. Preserving topology fixed my problem of duplicated vertices, so the faces/edges were all good, but in the process I lost the normals data.
By lost the normals, I don't mean multiple indexing, I mean after setting preserveTopology to YES, the buffer did not contain any normals values at all. Whereas before it was v1/n1/v2/n2... and the stride was 24 bytes (3 dimensions *4 bytes/float * 2 attributes), now the first half of the buffer is v1/v2/... with a stride of 12 and the entire 2nd half of the buffer is just 0.0 floats.
There is also something weird here: when you look at the SCNGeometrySources of the geometry, there are 2 sources, 1 with semantic kGeometrySourceSemanticVertex, and 1 with semantic kGeometrySourceSemanticNormal. You would think that the semantic vertex source would contain the position data, and the semantic normal source would contain the normal data. However, that is not the case. No matter what you set preserveTopology to, they are buffers sized to contain both position and normal data, with identical values. So when I said before there was no normal data, I mean both of these buffers, semantic vertex AND semantic normal, went from being v1/n1/v2/n2... to v1/v2/.../(0.0, 0.0, 0.0)/(0.0, 0.0, 0.0)/... I went into the MDLMesh's buffer (before the transfer to Scene Kit) and found the same problem, so the problem must be with initWithURL, not with the Model I/O to SceneKit bridge.
So I figured there must be something wrong with the default vertex descriptor and buffer allocator (since I was using nil) and went about trying to create my own that matched these 2 possible data formats. Alas after much trying I was unable to get something that worked.
Any ideas on how I should do this? How to give MDLAsset the proper vertexDescriptor and bufferAllocator (I feel like nil should be ok here) for importing a .obj file? Thanks
An obj file with vertices and normals has vertices, indicated by v lines, normals, indicated by vn lines, and faces, indicated by f lines.
The v and vn lines will just be the floating point values you expect, and the f line will be of the form -
f v0//n0 v1//n1 etc
Since OpenGL and Metal don't allow multiple indexing, you'll see the first effect of vertices being duplicated. For example,
f 0//0 1//2 2//0
can't work as a vertex buffer because it would require different indices per vertex. So typical OBJ parsers have to create new vertices that allow the face to become
f 0//0 1//1 2//2
The preserve topology option doesn't help you. It preserves the connectivity and shape of the mesh (no triangulation occurs, shared edges remain shared) but it still enforces a single index per vertex component.
One solution would be to make sure that your tool that is outputting the OBJ files uses single indexing during export, if that is an option.
Another option, and this won't solve the problem immediately, would be to file a request that multiple indexing be supported at the Model I/O level. SceneKit would still have to uniquely index because it has to be able to render.
Another option would be to use a format like PLY that doesn't have multiple indexing.
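The vertex duplication described above can be sketched as follows. This is a minimal illustration of what a typical OBJ loader might do, not Model I/O's actual implementation; all type and function names are hypothetical, and indices are zero-based as in the examples above. Each distinct (position index, normal index) pair becomes one output vertex, so a position that is referenced with two different normals appears twice.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };
struct Vertex { Vec3 position; Vec3 normal; };
struct Corner { std::uint32_t position; std::uint32_t normal; };  // one "v//vn" pair

struct Mesh {
    std::vector<Vertex> vertices;        // single-indexed vertex buffer
    std::vector<std::uint32_t> indices;  // one index per face corner
};

// Collapse multi-indexed face corners into a single-indexed mesh.
Mesh unify_indices(const std::vector<Vec3>& positions,
                   const std::vector<Vec3>& normals,
                   const std::vector<Corner>& corners) {
    Mesh mesh;
    std::map<std::pair<std::uint32_t, std::uint32_t>, std::uint32_t> seen;
    for (const Corner& c : corners) {
        assert(c.position < positions.size() && c.normal < normals.size());
        const auto key = std::make_pair(c.position, c.normal);
        auto it = seen.find(key);
        if (it == seen.end()) {
            // First time this pair appears: emit a new combined vertex.
            const auto newIndex = static_cast<std::uint32_t>(mesh.vertices.size());
            mesh.vertices.push_back({positions[c.position], normals[c.normal]});
            it = seen.emplace(key, newIndex).first;
        }
        mesh.indices.push_back(it->second);
    }
    return mesh;
}
```

Corners that share both indices reuse the same output vertex, so an exporter that already writes matching position and normal indices produces no duplicates.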

Unsupported format or combination of formats when using cv::reduce method in OpenCV

I am using OpenCV 2.4.2 and I am trying to take projections of two matrices (tmpl(32x44), subj(32x44)) along rows and columns. I have initialised a result matrix as rowProjectionSubj(subj.rows,1,CV_8UC1). Then I call cv::reduce(subj,rowProjectionSubj,1,CV_REDUCE_SUM,-1);
Why is this complaining about a type mismatch? I have kept the types the same (by keeping dtype=-1 in cv::reduce). I get the tmpl and subj objects by doing cv::imread("image_path",0), i.e. reading grayscale images in.
I might not be right, but after I saw this:
http://answers.opencv.org/question/3698/cvreduce-gives-unsupported-format-exception/?answer=3701#post-id-3701
and with a little experiment and an old friend called "register math", I realised that when you add two 8-bit numbers, the sum can need more than 8 bits because of the carry. So the result of a sum reduction needs more space than the source: if the source is 8-bit unsigned, the destination should be at least 16-bit unsigned or signed, and might as well be 32-bit if it is going to be used in further calculations such as products.
NOTE: The destination type must be EXPLICITLY stated in the cv::reduce call. Please follow my OpenCV link for further information.
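The overflow argument can be checked without OpenCV. The sketch below (a hypothetical `row_projection` helper operating on a plain row-major buffer, not the OpenCV API) sums each row of an 8-bit image into 32-bit accumulators. A row of 44 pixels at value 255 sums to 11220, which would wrap to 212 if stored back into 8 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum every row of a row-major, 8-bit, single-channel image.
// Accumulating in 32 bits leaves room for the carries of many additions.
std::vector<std::int32_t> row_projection(const std::vector<std::uint8_t>& image,
                                         std::size_t rows, std::size_t cols) {
    assert(image.size() == rows * cols);
    std::vector<std::int32_t> sums(rows, 0);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            sums[r] += image[r * cols + c];
        }
    }
    return sums;
}
```

In OpenCV terms this would correspond, as far as I know, to passing a wider type such as CV_32S as the last argument of cv::reduce instead of -1, and not relying on a CV_8UC1 destination.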
