ID3D11VideoDevice::CreateVideoDecoderOutputView fails - directx

I'm developing an application that decodes an H.264 stream through DirectX 11's ID3D11VideoDecoder interface ( https://msdn.microsoft.com/en-us/library/windows/desktop/hh447766(v=vs.85).aspx ) and I got stuck at the ID3D11VideoDevice::CreateVideoDecoderOutputView method: it just fails, returning E_INVALIDARG. Yes, I know there can be millions of reasons,
but are there some exceptionally common ones? Are there any samples available illustrating decoding through ID3D11VideoDecoder (I haven't found any)?
The part of my code that I think is most likely to fail looks as follows:
// texture
D3D11_TEXTURE2D_DESC descT = { 0 };
descT.Width = 1024;
descT.Height = 768;
descT.MipLevels = 1;
descT.ArraySize = 1;
descT.Format = DXGI_FORMAT_NV12;
descT.SampleDesc.Count = 1;
descT.Usage = D3D11_USAGE_DEFAULT;
descT.BindFlags = D3D11_BIND_DECODER;
ID3D11Texture2D *pTex = nullptr;
pD3D11Device->CreateTexture2D(&descT, nullptr, &pTex); // CreateTexture2D lives on ID3D11Device (here pD3D11Device), not on the video device
// decoder
D3D11_VIDEO_DECODER_OUTPUT_VIEW_DESC desc;
desc.DecodeProfile = D3D11_DECODER_PROFILE_H264_VLD_NOFGT; // interestingly, it fails whichever decoder profile I choose
desc.Texture2D.ArraySlice = 1;
desc.ViewDimension = D3D11_VDOV_DIMENSION_TEXTURE2D;
HRESULT hr = pDX11VideoDevice->CreateVideoDecoderOutputView(pTex, &desc, &pVideoDecoderOutputView); // and here the fail occurs
Thank you

OK, solved the problem: there should be
desc.Texture2D.ArraySlice = 0;
in the snippet in the post above. ArraySlice is a zero-based index into the texture array, and with ArraySize = 1 the only valid slice is 0. Still lots of work ahead
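For reference, the corrected view setup (a sketch reusing the names from the snippet; zero-initializing desc also keeps garbage out of any unused fields):
D3D11_VIDEO_DECODER_OUTPUT_VIEW_DESC desc = {};
desc.DecodeProfile = D3D11_DECODER_PROFILE_H264_VLD_NOFGT;
desc.ViewDimension = D3D11_VDOV_DIMENSION_TEXTURE2D;
desc.Texture2D.ArraySlice = 0; // zero-based index; with ArraySize = 1, slice 0 is the only valid value
ID3D11VideoDecoderOutputView *pVideoDecoderOutputView = nullptr;
HRESULT hr = pDX11VideoDevice->CreateVideoDecoderOutputView(pTex, &desc, &pVideoDecoderOutputView);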

Related

Creating Texture with initial data DirectX 11

I am trying to implement Variable Rate Shading in an app based on DirectX 11.
I am doing it this way:
UINT dwRtWidth = 2560;
UINT dwRtHeight = 1440;
D3D11_TEXTURE2D_DESC srcDesc;
ZeroMemory(&srcDesc, sizeof(srcDesc));
int sri_w = dwRtWidth / NV_VARIABLE_PIXEL_SHADING_TILE_WIDTH;
int sri_h = dwRtHeight / NV_VARIABLE_PIXEL_SHADING_TILE_HEIGHT;
srcDesc.Width = sri_w;
srcDesc.Height = sri_h;
srcDesc.ArraySize = 1;
srcDesc.Format = DXGI_FORMAT_R8_UINT;
srcDesc.SampleDesc.Count = 1;
srcDesc.SampleDesc.Quality = 0;
srcDesc.Usage = D3D11_USAGE_DEFAULT; //Optional
srcDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE; //Optional
srcDesc.CPUAccessFlags = 0;
srcDesc.MiscFlags = 0;
D3D11_SUBRESOURCE_DATA initialData;
UINT* data = (UINT*)malloc(sri_w * sri_h * sizeof(UINT));
for (int i = 0; i < sri_w * sri_h; i++)
data[i] = (UINT)0;
initialData.pSysMem = data;
initialData.SysMemPitch = sri_w;
//initialData.SysMemSlicePitch = 0;
HRESULT hr = s_device->CreateTexture2D(&srcDesc, &initialData, &pShadingRateSurface);
if (FAILED(hr))
{
LOG("Texture not created");
LOG(std::system_category().message(hr));
}
else
LOG("Texture created");
When I try to create the texture with initial data, it is not created, and the HRESULT carries the message 'The parameter is incorrect' (E_INVALIDARG). It doesn't say which one.
When I create texture without initial data it's created successfully.
What's wrong with the initial data? I also tried using unsigned char instead of UINT, since the format has 8 bits per texel, but the result was the same: the texture was not created.
Please help.
After some time I found a solution to the problem. I needed to add the line:
srcDesc.MipLevels = 1;
The ZeroMemory call had left MipLevels at 0, which asks D3D11 to allocate a full mip chain, and initial data must then be provided for every mip level; with MipLevels = 1 the single D3D11_SUBRESOURCE_DATA entry matches the single subresource. With this change the texture was finally created with initial data.
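For completeness, a sketch of initial data that matches DXGI_FORMAT_R8_UINT, where each texel is one byte, so the array element is unsigned char and SysMemPitch is the row width in bytes:
srcDesc.MipLevels = 1; // one mip level, so exactly one subresource of initial data is expected
unsigned char* data = (unsigned char*)calloc(sri_w * sri_h, 1); // zero-filled, one byte per texel
D3D11_SUBRESOURCE_DATA initialData = {}; // also zeroes SysMemSlicePitch
initialData.pSysMem = data;
initialData.SysMemPitch = sri_w; // bytes per row: sri_w texels * 1 byte
HRESULT hr = s_device->CreateTexture2D(&srcDesc, &initialData, &pShadingRateSurface);
free(data); // safe afterwards: D3D copies the initial data during creation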

Trying to understand why mDataByteSize property of AudioBuffer is changing

I have the following code which I'm using to read the contents of a WAV file into an SInt16 array:
AudioBufferList *buffers;
UInt32 ablSize = offsetof(AudioBufferList, mBuffers) +
(sizeof(AudioBuffer) * 1);
buffers = malloc(ablSize);
UInt32 dataByteSize = (UInt32)fileLengthFrames * sizeof(SInt16);
buffers->mNumberBuffers = 1;
buffers->mBuffers[0].mNumberChannels = 1;
buffers->mBuffers[0].mDataByteSize = dataByteSize;
self.extractedSamples = malloc(dataByteSize);
self.extractedByteCount = dataByteSize;
UInt32 totalFramesRead = 0;
do {
UInt32 framesRead = (UInt32)fileLengthFrames - totalFramesRead;
buffers->mBuffers[0].mData = self.extractedSamples +
(totalFramesRead * sizeof(SInt16));
ExtAudioFileRead(eaf, &framesRead, buffers);
totalFramesRead += framesRead;
} while (totalFramesRead < fileLengthFrames);
free(buffers);
This works fine for files under 0.5 seconds in duration. But for a longer file I'm testing, the app crashes with a bad-access error inside the do loop. For this file, dataByteSize is 60472, and at the start of the loop buffers->mBuffers[0].mDataByteSize is also 60472. But when the crash occurs, I see that buffers->mBuffers[0].mDataByteSize has changed to 57300, which is presumably why the crash happens.
Anybody know how/why this value is changing in the middle of the loop? One guess I have is that I'm not properly retaining the AudioBufferList and the memory space for mDataByteSize is somehow getting overwritten.
Edit: When this code is run on the simulator with the same file, it works fine.
mDataByteSize should be reset to framesRead * sizeof(SInt16) * channelCount before each call to ExtAudioFileRead; on return, ExtAudioFileRead overwrites it with the number of bytes it actually filled, so a smaller value from one read silently shrinks the buffer for the next.
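Applied to the loop above (a sketch; the client format is assumed to be mono 16-bit, matching mNumberChannels = 1):
UInt32 totalFramesRead = 0;
do {
    UInt32 framesRead = (UInt32)fileLengthFrames - totalFramesRead;
    buffers->mBuffers[0].mData = self.extractedSamples +
        (totalFramesRead * sizeof(SInt16));
    // Restore the capacity before each read: ExtAudioFileRead overwrites
    // mDataByteSize on return with the number of bytes it actually filled.
    buffers->mBuffers[0].mDataByteSize = framesRead * sizeof(SInt16) * 1; // * channelCount
    ExtAudioFileRead(eaf, &framesRead, buffers);
    totalFramesRead += framesRead;
} while (totalFramesRead < fileLengthFrames);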

Core Audio: Float32 to SInt16 conversion artefacts

I am converting from the following format:
const int four_bytes_per_float = 4;
const int eight_bits_per_byte = 8;
_stereoGraphStreamFormat.mFormatID = kAudioFormatLinearPCM;
_stereoGraphStreamFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked | kAudioFormatFlagIsNonInterleaved;
_stereoGraphStreamFormat.mBytesPerPacket = four_bytes_per_float;
_stereoGraphStreamFormat.mFramesPerPacket = 1;
_stereoGraphStreamFormat.mBytesPerFrame = four_bytes_per_float;
_stereoGraphStreamFormat.mChannelsPerFrame = 2;
_stereoGraphStreamFormat.mBitsPerChannel = eight_bits_per_byte * four_bytes_per_float;
_stereoGraphStreamFormat.mSampleRate = 44100;
to the following format:
interleavedAudioDescription.mFormatID = kAudioFormatLinearPCM;
interleavedAudioDescription.mFormatFlags = kAudioFormatFlagIsSignedInteger;
interleavedAudioDescription.mChannelsPerFrame = 2;
interleavedAudioDescription.mBytesPerPacket = sizeof(SInt16)*interleavedAudioDescription.mChannelsPerFrame;
interleavedAudioDescription.mFramesPerPacket = 1;
interleavedAudioDescription.mBytesPerFrame = sizeof(SInt16)*interleavedAudioDescription.mChannelsPerFrame;
interleavedAudioDescription.mBitsPerChannel = 8 * sizeof(SInt16);
interleavedAudioDescription.mSampleRate = 44100;
Using the following code:
int32_t availableBytes = 0;
void* tailL = TPCircularBufferTail(inputBufferL(), &availableBytes);
void* tailR = TPCircularBufferTail(inputBufferR(), &availableBytes);
// If we have no data in the buffer, we simply return
if (availableBytes <= 0)
{
return;
}
// ========== Non-Interleaved to Interleaved (Plus Samplerate Conversion) =========
// Get the number of frames available
UInt32 frames = availableBytes / this->mInputFormat.mBytesPerFrame;
pcmOutputBuffer->mBuffers[0].mDataByteSize = frames * interleavedAudioDescription.mBytesPerFrame;
struct complexInputDataProc_t data = (struct complexInputDataProc_t) { .self = this, .sourceL = tailL, .sourceR = tailR, .byteLength = availableBytes };
// Do the conversion
OSStatus result = AudioConverterFillComplexBuffer(interleavedAudioConverter,
complexInputDataProc,
&data,
&frames,
pcmOutputBuffer,
NULL);
// Tell the buffers how much data we consumed during the conversion so that it can be removed
TPCircularBufferConsume(inputBufferL(), availableBytes);
TPCircularBufferConsume(inputBufferR(), availableBytes);
// ========== Buffering Of Interleaved Samples =========
// If we got converted frames back from the converter, we want to add it to a separate buffer
if (frames > 0)
{
// Make sure we have enough space in the buffer to store the new data
TPCircularBufferHead(&pcmCircularBuffer, &availableBytes);
if (availableBytes > pcmOutputBuffer->mBuffers[0].mDataByteSize)
{
// Add the newly converted data to the buffer
TPCircularBufferProduceBytes(&pcmCircularBuffer, pcmOutputBuffer->mBuffers[0].mData, frames * interleavedAudioDescription.mBytesPerFrame);
}
else
{
printf("No Space in Buffer\n");
}
}
However, I am getting the following output (waveform screenshot): it should be a perfect sine wave, but as you can see it is not.
I have been working on this for days now and just can’t seem to find where it is going wrong.
Can anyone see something that I might be missing?
Edit:
Buffer initialisation:
TPCircularBuffer pcmCircularBuffer;
static SInt16 pcmOutputBuf[BUFFER_SIZE];
pcmOutputBuffer = (AudioBufferList*)malloc(sizeof(AudioBufferList));
pcmOutputBuffer->mNumberBuffers = 1;
pcmOutputBuffer->mBuffers[0].mNumberChannels = 2;
pcmOutputBuffer->mBuffers[0].mData = pcmOutputBuf;
TPCircularBufferInit(&pcmCircularBuffer, BUFFER_SIZE);
Complex input data proc:
static OSStatus complexInputDataProc(AudioConverterRef inAudioConverter,
UInt32 *ioNumberDataPackets,
AudioBufferList *ioData,
AudioStreamPacketDescription **outDataPacketDescription,
void *inUserData) {
struct complexInputDataProc_t *arg = (struct complexInputDataProc_t*)inUserData;
BroadcastingServices::MP3Encoder *self = (BroadcastingServices::MP3Encoder*)arg->self;
if ( arg->byteLength <= 0 )
{
*ioNumberDataPackets = 0;
return 100; //kNoMoreDataErr;
}
UInt32 framesAvailable = arg->byteLength / self->interleavedAudioDescription.mBytesPerFrame;
if (*ioNumberDataPackets > framesAvailable)
{
*ioNumberDataPackets = framesAvailable;
}
ioData->mBuffers[0].mData = arg->sourceL;
ioData->mBuffers[0].mDataByteSize = arg->byteLength;
ioData->mBuffers[1].mData = arg->sourceR;
ioData->mBuffers[1].mDataByteSize = arg->byteLength;
arg->byteLength = 0;
return noErr;
}
I see a few things that raise a red flag.
1) As mentioned in a comment above, you are overwriting availableBytes for the left input with the value from the right:
void* tailL = TPCircularBufferTail(inputBufferL(), &availableBytes);
void* tailR = TPCircularBufferTail(inputBufferR(), &availableBytes);
If the two input streams are changing asynchronously to this code then most certainly you have a race condition.
2) Truncation errors: availableBytes is not necessarily a multiple of the bytes per frame. If it isn't, the following bit of code could cause you to consume more bytes than you actually converted.
void* tailL = TPCircularBufferTail(inputBufferL(), &availableBytes);
void* tailR = TPCircularBufferTail(inputBufferR(), &availableBytes);
...
UInt32 frames = availableBytes / this->mInputFormat.mBytesPerFrame;
...
TPCircularBufferConsume(inputBufferL(), availableBytes);
TPCircularBufferConsume(inputBufferR(), availableBytes);
3) If the output buffer is not ready to consume all of the input, you just throw the input buffer away. That happens in this code:
if (availableBytes > pcmOutputBuffer->mBuffers[0].mDataByteSize)
{
...
}
else
{
printf("No Space in Buffer\n");
}
I'd be really curious whether you're seeing that print output.
Here is how I would suggest doing it. It's going to be pseudo-code-ish, since I don't have what I'd need to compile and test it.
int32_t availableBytesInL = 0;
int32_t availableBytesInR = 0;
int32_t availableBytesOut = 0;
// figure out how many bytes are available in each buffer.
void* tailL = TPCircularBufferTail(inputBufferL(), &availableBytesInL);
void* tailR = TPCircularBufferTail(inputBufferR(), &availableBytesInR);
TPCircularBufferHead(&pcmCircularBuffer, &availableBytesOut);
// figure out how many full frames are available
UInt32 framesInL = availableBytesInL / mInputFormat.mBytesPerFrame;
UInt32 framesInR = availableBytesInR / mInputFormat.mBytesPerFrame;
UInt32 framesOut = availableBytesOut / interleavedAudioDescription.mBytesPerFrame;
// figure out how many frames to process this time.
UInt32 frames = min(min(framesInL, framesInR), framesOut);
if (frames == 0)
return;
int32_t bytesConsumed = frames * mInputFormat.mBytesPerFrame;
struct complexInputDataProc_t data = (struct complexInputDataProc_t) {
.self = this, .sourceL = tailL, .sourceR = tailR, .byteLength = bytesConsumed };
// Do the conversion
OSStatus result = AudioConverterFillComplexBuffer(interleavedAudioConverter,
complexInputDataProc,
&data,
&frames,
pcmOutputBuffer,
NULL);
int32_t bytesProduced = frames * interleavedAudioDescription.mBytesPerFrame;
// Tell the buffers how much data we consumed during the conversion so that it can be removed
TPCircularBufferConsume(inputBufferL(), bytesConsumed);
TPCircularBufferConsume(inputBufferR(), bytesConsumed);
TPCircularBufferProduceBytes(&pcmCircularBuffer, pcmOutputBuffer->mBuffers[0].mData, bytesProduced);
Basically what I've done here is figure out up front how many frames should be processed, making sure I'm only processing as many frames as the output buffer can handle. If it were me, I'd also add some checking for buffer underruns on the output and buffer overruns on the input. Finally, I'm not exactly sure of the semantics of AudioConverterFillComplexBuffer with respect to the frame parameter that is passed in and out; it's conceivable that the number of frames out would be more or less than the number of frames in, although since you're not doing sample-rate conversion that probably won't happen. I've attempted to account for that condition by assigning bytesProduced after the conversion.
Hope this helps. If not, you have two other clues: the dropouts are periodic, and they all look to be about the same size. If you can figure out how many samples each one is, you can look for those numbers in your code.
I don't think your output buffer, pcmCircularBuffer, is big enough: TPCircularBufferInit takes a length in bytes, and pcmOutputBuf holds BUFFER_SIZE SInt16 samples, which is twice that many bytes.
Try replacing
TPCircularBufferInit(&pcmCircularBuffer, BUFFER_SIZE);
with
TPCircularBufferInit(&pcmCircularBuffer, sizeof(pcmOutputBuf));
Even if that is the solution, I think there are some problems with your code. I don't know exactly what you're doing; I guess encoding MP3 (which is by itself an uphill battle on iOS; why not use hardware AAC?), but unless you have realtime demands on both input and output, why use ring buffers at all? Also, I recommend putting units in names to visually catch frame/byte size mismatches, e.g. BUFFER_SIZE_IN_FRAMES. One way to make the units explicit is sketched below (hypothetical names; only the naming discipline matters here):
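#define BUFFER_SIZE_IN_FRAMES 4096
#define CHANNELS 2
#define BYTES_PER_FRAME (CHANNELS * sizeof(SInt16))
static SInt16 pcmOutputBuf[BUFFER_SIZE_IN_FRAMES * CHANNELS];
// the ring buffer length is in bytes, and now the conversion is visible at the call site
TPCircularBufferInit(&pcmCircularBuffer, BUFFER_SIZE_IN_FRAMES * BYTES_PER_FRAME);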
If it's not the solution, then I want to see the sine generator.

EZAudio framework -"Error: Couldn't initialize output unit ('fmt?')"

I am using EZAudio framework (https://github.com/syedhali/EZAudio) and when trying to initialize my output with a custom AudioStreamBasicDescription like so...
- (void)openMediaPlayer {
// Initialize the EZOutput instance and assign it a delegate to provide the output audio data
AudioStreamBasicDescription audioDesc;
audioDesc.mFormatID = kAudioFormatLinearPCM;
audioDesc.mSampleRate = 44100;
audioDesc.mChannelsPerFrame = 2;
audioDesc.mBytesPerFrame = 4;
audioDesc.mFramesPerPacket = 1;
audioDesc.mBytesPerPacket = 4;
audioDesc.mBitsPerChannel = 16;
audioDesc.mReserved = 0;
self.output = [EZOutput outputWithDataSource:self withAudioStreamBasicDescription:audioDesc];
self.currentAudioPieceIndex = 0;
}
I get the error "Error: Couldn't initialize output unit ('fmt?')".
What does this mean? audioDesc is set with sane defaults for 16-bit stereo PCM audio.
Update: Using the debugger, I found I was getting OSStatus 1718449215.
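1718449215 is just the big-endian four-character code 'fmt?', i.e. kAudioFormatUnsupportedDataFormatError. A quick way to decode any OSStatus (a sketch, with a hypothetical helper name):
#include <CoreFoundation/CoreFoundation.h>
#include <stdio.h>
#include <string.h>
static void printFourCC(OSStatus status) {
    char code[5] = {0};
    UInt32 big = CFSwapInt32HostToBig((UInt32)status); // the status bytes are the ASCII chars
    memcpy(code, &big, sizeof(big));
    printf("OSStatus %d = '%s'\n", (int)status, code);
}
// printFourCC(1718449215) prints: OSStatus 1718449215 = 'fmt?'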
It turns out you have to set
audioDesc.mFormatFlags = kAudioFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger;
when your audio format is linear PCM. In the snippet above, mFormatFlags was never assigned, and since audioDesc is a stack variable the field held garbage, so Core Audio couldn't tell how to interpret the samples.
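For reference, a complete descriptor for interleaved 16-bit stereo PCM (a sketch to drop into openMediaPlayer; the byte counts follow from 2 channels * 2 bytes per sample):
AudioStreamBasicDescription audioDesc = {0};
audioDesc.mFormatID = kAudioFormatLinearPCM;
audioDesc.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
audioDesc.mSampleRate = 44100;
audioDesc.mChannelsPerFrame = 2;
audioDesc.mBitsPerChannel = 16;
audioDesc.mBytesPerFrame = 4;  // 2 channels * 2 bytes per sample
audioDesc.mFramesPerPacket = 1;
audioDesc.mBytesPerPacket = 4; // mBytesPerFrame * mFramesPerPacket
audioDesc.mReserved = 0;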

What could cause "fft Window" values to be NaN in a normalized Hanning window?

I am trying to build an iOS 7 application that detects the pitch (frequency) of a sound or song, for example 349.23 Hz, 392.00 Hz, 440.00 Hz...
So I downloaded the "Auto Correllation" project (it's a Musician's Kit: http://musicianskit.com/developer.php) and ran it on the iOS 7 Simulator. It works fine there: the "hanning fft window" has values (not NaN), and the frequency is eventually obtained.
But it doesn't work on an iPhone device; the "hanning fft window" never gets any values.
Can anybody have a look at these classes by Kevin Murphy and tell me how I could modify them to work on an iPhone device (not the iOS Simulator)?
Many, many thanks!
I've pasted my code below:
// PitchDetector.m
-(id) initWithSampleRate: (float) rate lowBoundFreq: (int) low hiBoundFreq: (int) hi andDelegate: (id<PitchDetectorDelegate>) initDelegate {
if ((self = [super init])) {
self.lowBoundFrequency = low;
self.hiBoundFrequency = hi;
self.sampleRate = rate;
self.delegate = initDelegate;
bufferLength = self.sampleRate/self.lowBoundFrequency;
hann = (float*) malloc(sizeof(float)*bufferLength);
// fill 'hann' with a normalized Hanning window ('hann' is the "hanning fft window")
vDSP_hann_window(hann, bufferLength, vDSP_HANN_NORM);
sampleBuffer = (SInt16*) malloc(512); // 512 bytes = 256 SInt16 samples
samplesInSampleBuffer = 0;
result = (float*) malloc(sizeof(float)*bufferLength);
}
return self;
}
-(void) performWithNumFrames: (NSNumber*) numFrames
{
int n = numFrames.intValue;
float freq = 0;
SInt16 *samples = sampleBuffer;
int returnIndex = 0;
float sum;
bool goingUp = false;
float normalize = 0;
for(int i = 0; i<n; i++) {
sum = 0;
for(int j = 0; j<n; j++) {
//here I found that hann[j] is NaN; the window buffer seems never to have been filled.
//If hann[j] is NaN, the value of sum also becomes NaN.
sum += (samples[j]*samples[j+i])*hann[j];
}
if (i == 0) normalize = sum;
result[i] = sum/normalize;
}
......
......
}
I am using this same program from:
https://github.com/fotock/PitchDetectorExample/tree/1c68491f9c9bff2e851f5711c47e1efe4092f4de
Although I have not put this on an iPhone yet, only the simulator, I was having problems from time to time with the program crashing. I found that I needed to manually update it from a "fork" of the code on GitHub, found here:
https://github.com/fotock/PitchDetectorExample/network
I added Jordan Liggitt's bug fixes manually, and now the app does not crash. I hope this helps, because if it does not, I will be facing the same issues when I load this app on an iPhone.
Hope it works!
Update
I have now installed this on an iPhone (rather than the simulator) and it works as it should, without errors or crashing.
