DirectXTK 3D audio uses only left channel

I've decided to try DirectXTK12 audio and it's working fine except for the 3D sound. I'm following the guide from the wiki, but the sound always comes out of the left speaker no matter how I position the listener/emitter. What's wrong? My code looks like this:
HRESULT hr = CoInitializeEx(nullptr, COINIT_MULTITHREADED);
if (FAILED(hr)) {...}
std::unique_ptr<DirectX::AudioEngine> audEngine;
DirectX::AUDIO_ENGINE_FLAGS eflags = DirectX::AudioEngine_Default;
#ifdef _DEBUG
eflags |= DirectX::AudioEngine_Debug;
#endif
audEngine = std::make_unique<DirectX::AudioEngine>(eflags);
std::unique_ptr<DirectX::SoundEffect> soundEffect;
soundEffect = std::make_unique<DirectX::SoundEffect>(audEngine.get(), L"Sound.wav");
auto effect = soundEffect->CreateInstance(DirectX::SoundEffectInstance_Use3D);
effect->Play(false);
DirectX::AudioListener listener;
listener.SetPosition(DirectX::XMFLOAT3(0.0f, 0.0f, 0.0f));
DirectX::AudioEmitter emitter;
emitter.SetPosition(DirectX::XMFLOAT3(0.0f, 0.0f, 0.0f));
effect->Apply3D(listener, emitter, false);
The sound should be centered, but it only plays through the left channel. There are no errors in the output; the only thing it reports is:
INFO: XAudio 2.9 debugging enabled
INFO: mastering voice has 2 channels, 96000 sample rate, 00000003 channel mask
Playing the sound without 3D uses both speakers as expected.

I fixed the issue by converting Sound.wav to a mono (1-channel) sound file.
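For reference, here is a minimal sketch of the working flow with the engine actually constructed and the now-mono Sound.wav. The emitter offset on the X axis is an arbitrary value chosen just to make the panning audible; it is not from the original question.

// Minimal DirectXTK for Audio 3D positioning sketch; assumes Sound.wav is a mono file.
#include <Windows.h>
#include <Audio.h>
#include <memory>

int main()
{
    HRESULT hr = CoInitializeEx(nullptr, COINIT_MULTITHREADED);
    if (FAILED(hr)) return -1;

    DirectX::AUDIO_ENGINE_FLAGS eflags = DirectX::AudioEngine_Default;
#ifdef _DEBUG
    eflags |= DirectX::AudioEngine_Debug;
#endif
    auto audEngine = std::make_unique<DirectX::AudioEngine>(eflags);

    auto soundEffect = std::make_unique<DirectX::SoundEffect>(audEngine.get(), L"Sound.wav");
    auto effect = soundEffect->CreateInstance(DirectX::SoundEffectInstance_Use3D);
    effect->Play(true); // loop so the pan change is easy to hear

    DirectX::AudioListener listener;                           // stays at the origin
    DirectX::AudioEmitter emitter;
    emitter.SetPosition(DirectX::XMFLOAT3(10.0f, 0.0f, 0.0f)); // offset so the pan is audible

    effect->Apply3D(listener, emitter);
    audEngine->Update(); // call once per frame in the game loop
    return 0;
}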

Related

How to feed FFMPEG AV_CODEC_ID_PCM_S16BE audio data to AudioQueue

I am using FFMPEG in combination with FFmpegAudioPlayer to do live streaming. The issue I am having is that, while the audio can be decoded and played, there is a constant clicking/screeching noise in the audio that isn't present when the same source is streamed by other applications. So I am guessing the issue arises from how I process the FFMPEG AV_CODEC_ID_PCM_S16BE audio data before handing it to AudioQueue:
audioFormat.mFormatID = kAudioFormatLinearPCM;
audioFormat.mFormatFlags = kAudioFormatFlagsCanonical;//kAudioFormatFlagIsBigEndian|kAudioFormatFlagIsAlignedHigh;
audioFormat.mSampleRate = pAudioCodecCtx->sample_rate;
audioFormat.mBitsPerChannel = 8*av_get_bytes_per_sample(AV_SAMPLE_FMT_S16);
audioFormat.mChannelsPerFrame = pAudioCodecCtx->channels;
audioFormat.mBytesPerFrame = pAudioCodecCtx->channels * av_get_bytes_per_sample(AV_SAMPLE_FMT_S16);
audioFormat.mBytesPerPacket= pAudioCodecCtx->channels * av_get_bytes_per_sample(AV_SAMPLE_FMT_S16);
audioFormat.mFramesPerPacket = 1;
audioFormat.mReserved = 0;
pSwrCtx = swr_alloc_set_opts(pSwrCtx,
1,//pAudioCodecCtx->channel_layout,
AV_SAMPLE_FMT_S16,
pAudioCodecCtx->sample_rate,
1,//pAudioCodecCtx->channel_layout,
AV_SAMPLE_FMT_S16,
pAudioCodecCtx->sample_rate,
0,
0);
outCount = swr_convert(pSwrCtx,
(uint8_t **)(&pOut),
in_samples,
(const uint8_t **)pAVFrame1->extended_data,
in_samples);
Please also note that I've tried many different parameters for swr_alloc_set_opts, but either the audio became unrecognizable or the noise persisted.
Here's a sample of the audio with clicking sound, if it helps.
I don't know exactly, but s16be is integer (16bit) whereas kAudioFormatLinearPCM is float (32bit).
If I were in your shoes, I'd keep s16be on the ffmpeg side and kAudioFormatLinearPCM on the iOS side, which means fixing pAudioCodecCtx->channels * av_get_bytes_per_sample(AV_SAMPLE_FMT_S16) and the other derived fields.
Then insert a PCM format conversion step into the ffmpeg -> iOS data flow.
This post looks very helpful: iOS Core Audio : Converting between kAudioFormatFlagsCanonical and kAudioFormatFlagsAudioUnitCanonical
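If it helps, here is a rough sketch of that conversion step using libswresample. It keeps the older swr_alloc_set_opts()/channels API and the variable names from the question (pAudioCodecCtx, pAVFrame1), but feeds the converter the decoder's real channel layout and sample format instead of the hard-coded 1:

#include <libswresample/swresample.h>
#include <libavutil/channel_layout.h>
#include <libavutil/samplefmt.h>

// Use the stream's real layout; fall back to a default layout for its channel count.
int64_t layout = pAudioCodecCtx->channel_layout
               ? pAudioCodecCtx->channel_layout
               : av_get_default_channel_layout(pAudioCodecCtx->channels);

SwrContext *swr = swr_alloc_set_opts(NULL,
    layout, AV_SAMPLE_FMT_S16, pAudioCodecCtx->sample_rate,          // output: packed native-endian S16
    layout, pAudioCodecCtx->sample_fmt, pAudioCodecCtx->sample_rate, // input: whatever the decoder produced
    0, NULL);
swr_init(swr);

uint8_t *out = NULL;
int out_linesize = 0;
av_samples_alloc(&out, &out_linesize, pAudioCodecCtx->channels,
                 pAVFrame1->nb_samples, AV_SAMPLE_FMT_S16, 0);

int converted = swr_convert(swr,
                            &out, pAVFrame1->nb_samples,
                            (const uint8_t **)pAVFrame1->extended_data,
                            pAVFrame1->nb_samples);
// 'converted' frames of interleaved S16 are now in 'out' and match the ASBD above.
// Free with av_freep(&out) and swr_free(&swr) when done.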
It turned out the noise wasn't a problem in decoding the audio stream, but a problem in the camera device that feeds the audio stream to our app.
The noise is barely audible when the device is connected to the Android app, which caused us to miss it when testing there and to assume the problem was in how our iOS app handles audio rather than in the device itself.

Capturing through a single multi-head (stereo) camera using OpenCV

I have a single multi-head (stereo) USB camera that can be detected and can stream stereo video using the "Video Capture Sources" filter in GraphEdit.
I'm trying to access both channels using OpenCV 2.4.8 (on a PC with VS2010, Windows 7 x64) for further stereo image processing. However, I can only detect/stream a single head (channel) of the camera, not both. My code follows the documentation notes for VideoCapture::grab/VideoCapture::retrieve and looks like the following:
#include "opencv2/opencv.hpp"
using namespace cv;
int main(int, char**)
{
VideoCapture cap(0); // open the default camera
if(!cap.isOpened()) // check if we succeeded
return -1;
Mat Lframe,Rframe;
namedWindow("Lframe",CV_WINDOW_AUTOSIZE);namedWindow("Rframe",CV_WINDOW_AUTOSIZE);
while(char(waitKey(1)) != 'q') {
if(cap.grab())
cap.retrieve(Lframe,0); cap.retrieve(Rframe,1);// get a new frame
imshow("Lframe",Lframe);imshow("Rframe",Rframe);
if(waitKey(30) >= 0) break;
}
return 0;
}
The problem is that the rendered channels (Lframe, Rframe) are identical no matter which channel index is passed, so only one head is ever accessed and I can't get stereo streaming.
Is there a way to use the "Video Capture Sources" filter directly with OpenCV?
Waiting for your assistance, and thank you in advance.

OpenCV's camera in Windows 8

I use OpenCV in my Augmented Reality project. The original platform is Windows 7, where everything works perfectly: full screen at 1080p. However, when I launched my program on Windows 8 it showed live video at 640x480. The same program on the same hardware, but on a different version of Windows, gives different results. I wrote a simple test program which shows the same problem:
include "highgui.h"
int main()
{
cvNamedWindow("VideoTest", CV_WINDOW_AUTOSIZE);
CvCapture *capture = cvCreateCameraCapture(0);
CvSize size = cvSize(1920, 1080);
cvSetCaptureProperty(capture, CV_CAP_PROP_FRAME_WIDTH , size.width);
cvSetCaptureProperty(capture, CV_CAP_PROP_FRAME_HEIGHT , size.height);
IplImage* frame;
while(1)
{
frame = cvQueryFrame(capture);
if(!frame) break;
cvShowImage("VideoTest", frame);
char c = cvWaitKey(33);
if(c == 27) break;
}
cvReleaseCapture(&capture);
cvDestroyWindow("VideoTest");
return 0;
}
I suspect the problem is with cvSetCaptureProperty(capture, CV_CAP_PROP_FRAME_WIDTH, size.width); but I have no idea how to resolve it.
I would be glad of any help.
P.S.
I have some new info:
I wrote a test program that uses DirectShow.
It captures the "USB Web-camera Microsoft LifeCam Studio" as full-screen live video at 1080p quality. However, when I launched this program on Windows 8 it showed live video at only 640x480.
A simple test showed that the SetFormat() method of IAMStreamConfig returns S_OK on Windows 7 and E_FAIL on Windows 8.
This is shown in the following listing:
hr = streamConfTest->SetFormat(&mtGroup);
if (SUCCEEDED(hr))
{
    printf("Success SetFormat( &mtGroup )");
}
else
{
    printf("Error SetFormat( &mtGroup )");
}
The first branch is taken on Windows 7 and the second on Windows 8.
I have no idea how to resolve it. I would be glad of any help.
After some time I found a suitable solution to this problem: I included Media Foundation in my project and wrote a simple C++ class around it. A short article about it is available as "Capturing of video from web-camera on Windows 7 and 8 by Media Foundation".
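The class itself is described in that article; as a rough sketch of the underlying idea (my own simplification, not the article's code), Media Foundation's source reader lets you enumerate a capture device's native formats and explicitly select the 1920x1080 one:

// Pick a native 1920x1080 format from the first video capture device via a source reader.
// Assumes COM has already been initialized (CoInitializeEx); error handling and cleanup trimmed.
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mf.lib")
#pragma comment(lib, "mfreadwrite.lib")
#pragma comment(lib, "mfuuid.lib")

HRESULT CreateFullHdReader(IMFSourceReader** ppReader)
{
    MFStartup(MF_VERSION);

    // Enumerate video capture devices.
    IMFAttributes* pConfig = nullptr;
    MFCreateAttributes(&pConfig, 1);
    pConfig->SetGUID(MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE,
                     MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID);

    IMFActivate** ppDevices = nullptr;
    UINT32 count = 0;
    MFEnumDeviceSources(pConfig, &ppDevices, &count);
    if (count == 0) return E_FAIL;

    // Activate the first camera and wrap it in a source reader.
    IMFMediaSource* pSource = nullptr;
    ppDevices[0]->ActivateObject(IID_PPV_ARGS(&pSource));
    MFCreateSourceReaderFromMediaSource(pSource, nullptr, ppReader);

    // Walk the native media types and select the 1920x1080 one.
    for (DWORD i = 0; ; ++i)
    {
        IMFMediaType* pType = nullptr;
        HRESULT hr = (*ppReader)->GetNativeMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, i, &pType);
        if (FAILED(hr)) break; // no more native types

        UINT32 width = 0, height = 0;
        MFGetAttributeSize(pType, MF_MT_FRAME_SIZE, &width, &height);
        if (width == 1920 && height == 1080)
        {
            (*ppReader)->SetCurrentMediaType(MF_SOURCE_READER_FIRST_VIDEO_STREAM, nullptr, pType);
            pType->Release();
            return S_OK;
        }
        pType->Release();
    }
    return E_FAIL;
}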

Apple's Voice Processing Audio Unit ( kAudioUnitSubType_VoiceProcessingIO ) broken on iOS 5.1

I'm writing a VoIP app for iPad (currently targeting the iPad 2 and 3).
I originally wrote the audio code using the Audio Unit API with a kAudioUnitSubType_RemoteIO unit. This worked well, but unsurprisingly echo was a problem. I tried to use the built-in echo suppression by switching to a kAudioUnitSubType_VoiceProcessingIO unit. This works really well on iOS 6 (iPad 3), but the same code on iOS 5.1 (iPad 2) produces white noise on the microphone input.
The documentation only mentions that it should be available in iOS 3.0 and later.
The iOS version seems to be the important difference here. I tried running the app on two iPhone 4Ss: the one with iOS 6 sounded fine, and the one with iOS 5.1 sounded like white noise.
My ASBD looks like this:
typedef int16_t sample_t;
#define AUDIO_BUFFER_SAMPLE_RATE 48000
#define FORMAT_FLAGS (kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsNonInterleaved)
#define CHANNELS_PER_FRAME 1
...
const size_t bytes_per_sample = sizeof(sample_t);
const int eight_bits_per_byte = 8;
AudioStreamBasicDescription streamFormat;
streamFormat.mFormatID = kAudioFormatLinearPCM;
streamFormat.mSampleRate = AUDIO_BUFFER_SAMPLE_RATE;
streamFormat.mFormatFlags = FORMAT_FLAGS;
streamFormat.mChannelsPerFrame = CHANNELS_PER_FRAME;
streamFormat.mBytesPerFrame = bytes_per_sample * CHANNELS_PER_FRAME;
streamFormat.mBitsPerChannel = bytes_per_sample * eight_bits_per_byte;
streamFormat.mFramesPerPacket = 1;
streamFormat.mBytesPerPacket = streamFormat.mBytesPerFrame * streamFormat.mFramesPerPacket;
streamFormat.mReserved = 0;
Has anyone ever got kAudioUnitSubType_VoiceProcessingIO to work on iOS 5.1?
Does anyone know of any serious documentation for this IO?
TL;DR: add kAudioFormatFlagIsPacked to FORMAT_FLAGS.
I discovered this via a bit of a convoluted route. None of this seems to be well documented anywhere, but I came across an SO post about using this IO unit on a Mac. One of the things mentioned was using "FlagsCanonical". I tried setting:
#define FORMAT_FLAGS kAudioFormatFlagsAudioUnitCanonical
which didn't work; the call to AudioUnitInitialize failed with a return code of 29759. I couldn't find any documentation about what that meant, but when I tried:
#define FORMAT_FLAGS kAudioFormatFlagsCanonical
everything worked! Success!
The definition of kAudioFormatFlagsCanonical in CoreAudioTypes.h if you are building for an iPad (and therefore have CA_PREFER_FIXED_POINT defined as 1) is:
kAudioFormatFlagsCanonical = kAudioFormatFlagIsSignedInteger
| kAudioFormatFlagsNativeEndian
| kAudioFormatFlagIsPacked;
After adding kAudioFormatFlagIsPacked to my original code it worked. I added kAudioFormatFlagsNativeEndian for good measure and removed kAudioFormatFlagIsNonInterleaved, which was unnecessary for single-channel audio anyway. What I was left with is identical to kAudioFormatFlagsCanonical.
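Concretely, the flag definition I ended up with looks like this (equivalent to kAudioFormatFlagsCanonical on iOS, where CA_PREFER_FIXED_POINT is 1):

#define FORMAT_FLAGS (kAudioFormatFlagIsSignedInteger \
                    | kAudioFormatFlagsNativeEndian   \
                    | kAudioFormatFlagIsPacked)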
So my set-up, which worked on an iPad 2 (iOS 5.1) and an iPad 3 (iOS 6.0), was the following:
Sample rate of 48000
1 channel
kAudioFormatFlagsCanonical
int16_t samples
Linear PCM
I'm still keen on documentation for this IO if anyone has any, and of course if this helped you don't forget to upvote :)

my iOS app using audio units with an 8000 hertz sample rate returns a distorted voice

I really need help with this issue. I'm developing an iOS application with Audio Units; the recorded audio needs to be 8-bit, at an 8000 Hz sample rate, in aLaw format. However, I'm getting a distorted voice coming out of the speaker.
I came across this sample online:
http://www.stefanpopp.de/2011/capture-iphone-microphone/comment-page-1/
While trying to debug my app I used my audioFormat in his application, and I get the same distorted sound. I'm guessing I either have incorrect settings or need to do something else to make this work. Given the application in the link and the audioFormat below, can anyone tell me if I'm doing something wrong or missing something? I don't know a lot about this stuff; thanks.
Audio Format:
AudioStreamBasicDescription audioFormat;
audioFormat.mSampleRate = 8000;
audioFormat.mFormatID = kAudioFormatALaw;
audioFormat.mFormatFlags = kAudioFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger;
audioFormat.mFramesPerPacket = 1;
audioFormat.mChannelsPerFrame = 1;
audioFormat.mBitsPerChannel = 8;
audioFormat.mBytesPerPacket = 1;
audioFormat.mBytesPerFrame = 1;
I eventually got it to play correctly. I'm posting here to help out anyone else facing similar issues.
The main issue I was facing is that there is a huge difference between the simulator and an actual device. Running the app on the device, the sound quality was better but it kept skipping every second or two. I found a setting that fixed this, and another setting to change the buffer size/duration. (The duration setting does not work on the simulator. Some of my issues came from needing it to run at a certain rate to sync with something else, and that was causing the distorted sound.)
status = AudioSessionInitialize(NULL, kCFRunLoopDefaultMode, NULL, audioUnit);
UInt32 audioCategory = kAudioSessionCategory_PlayAndRecord;
status = AudioSessionSetProperty(kAudioSessionProperty_AudioCategory, sizeof(audioCategory), &audioCategory);
[self hasError:status:__FILE__:__LINE__];
Float32 preferredBufferSize = 0.005805; // in seconds
status = AudioSessionSetProperty(kAudioSessionProperty_PreferredHardwareIOBufferDuration, sizeof(preferredBufferSize), &preferredBufferSize);
[self hasError:status:__FILE__:__LINE__];
status = AudioSessionSetActive(true);
The first audio session property is what stopped the skipping and made playback much smoother. The second adjusts the buffer duration: it specifies, in seconds, how often the callbacks are fired, and therefore gives you a different buffer size. It is best-effort, meaning the system gets as close as it can to the value you provide, but it seems to have a list of available sizes and picks the closest one.
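Since the preferred duration is only a request, it can also help to read back what the system actually granted. A small follow-on sketch using the same AudioSession API as the code above:

Float32 grantedDuration = 0;
UInt32 propSize = sizeof(grantedDuration);
status = AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareIOBufferDuration,
                                 &propSize, &grantedDuration);
// grantedDuration now holds the buffer duration, in seconds, that the system actually picked.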
See the post I link to in my question for a very good tutorial / sample program to get started with this stuff.
