Getting a raw PCM audio buffer from XAudio2 when playing a compressed file

Is it possible to access the raw PCM audio data that is being played when using XAudio2 to play a file?
I've been searching for ways to access a decoded version of audio files being played in SL4/Windows Phone, without success.

According to this post, someone had success writing a custom XAPO that just grabs samples and is enabled on a submix voice: http://social.msdn.microsoft.com/Forums/windowsapps/en-US/05593fad-dfd8-4c77-983b-8c84cd4a324b/xaudio2-saving-output-custom-xapos-slow-down-audio-play-backwards
Please note that if you only want the samples for offline processing, this approach is not optimal, because you are limited to the speed of real-time playback.
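For reference, a minimal sample-grabbing XAPO along the lines of that post might look roughly like the sketch below. This is a hedged sketch, not a drop-in implementation: the class name, GUID, and capture buffer are my own illustrative choices, the std::vector append is not real-time safe, and error handling is omitted.

#include <xapobase.h>   // CXAPOBase; link against XAPOBase.lib
#include <vector>

class __declspec(uuid("A90BC001-12F1-4C7E-9B5E-4C1A8E6B2D10"))
SampleGrabberXAPO : public CXAPOBase
{
public:
    SampleGrabberXAPO() : CXAPOBase(&s_regProps), m_channels(0) {}

    // Remember the channel count so Process() knows how many floats per frame.
    STDMETHOD(LockForProcess)(UINT32 inCount,
        const XAPO_LOCKFORPROCESS_BUFFER_PARAMETERS* pIn,
        UINT32 outCount,
        const XAPO_LOCKFORPROCESS_BUFFER_PARAMETERS* pOut) override
    {
        m_channels = pIn[0].pFormat->nChannels;
        return CXAPOBase::LockForProcess(inCount, pIn, outCount, pOut);
    }

    // In-place pass-through: copy the float samples aside, leave the audio untouched.
    STDMETHOD_(void, Process)(UINT32,
        const XAPO_PROCESS_BUFFER_PARAMETERS* pIn,
        UINT32,
        XAPO_PROCESS_BUFFER_PARAMETERS* pOut,
        BOOL) override
    {
        if (pIn[0].BufferFlags == XAPO_BUFFER_VALID)
        {
            const float* samples = static_cast<const float*>(pIn[0].pBuffer);
            const size_t count = pIn[0].ValidFrameCount * m_channels;
            m_captured.insert(m_captured.end(), samples, samples + count); // grab the PCM
        }
        pOut[0].ValidFrameCount = pIn[0].ValidFrameCount;   // same buffer, just forward
        pOut[0].BufferFlags     = pIn[0].BufferFlags;
    }

private:
    static XAPO_REGISTRATION_PROPERTIES s_regProps;
    UINT32 m_channels;
    std::vector<float> m_captured;   // illustration only; not safe on the audio thread
};

XAPO_REGISTRATION_PROPERTIES SampleGrabberXAPO::s_regProps = {
    __uuidof(SampleGrabberXAPO), L"SampleGrabberXAPO", L"", 1, 0,
    XAPO_FLAG_CHANNELS_MUST_MATCH | XAPO_FLAG_FRAMERATE_MUST_MATCH |
    XAPO_FLAG_BITSPERSAMPLE_MUST_MATCH | XAPO_FLAG_BUFFERCOUNT_MUST_MATCH |
    XAPO_FLAG_INPLACE_SUPPORTED,
    1, 1, 1, 1 };

You would wrap an instance of this in an XAUDIO2_EFFECT_DESCRIPTOR, attach it to the submix voice's effect chain with SetEffectChain, and drain the captured samples from a non-audio thread.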

Related

Real time audio recording in Swift

I am building an application that needs to do real-time audio recording. I am using Swift for the project, so I am unable to use the Novocaine library (it contains some Obj-C++ code).
What I need is to get small chunks of the audio recording in real time, which I can process or send to my websocket. Is there a Swift library I can use to achieve this?
In addition to getting the live audio from the microphone, I also need to show a real-time waveform. In short, I need to:
Start recording.
Get an event for every few bytes of recorded data, so I can send those bytes to my websocket.
Show a waveform for the audio.
Let me know.
You do not need any third-party tools to get audio from the mic; it can be set up easily using AVAudioEngine. However, to minimise network traffic I suggest using LAME to compress the raw PCM audio stream into MP3.
Here you can find a project with minimal functionality for getting mic input and compressing it into MP3. In that example project the MP3 is stored in the Documents folder, so you can try it and listen to make sure it works.
From this point you can take the MP3 buffer and send it via the socket. You can also play with the LAME settings to change the quality, etc.
There is another branch called no-lame where the same functionality is implemented without LAME encoding. Look here.
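To make the LAME half of that concrete, here is a rough C++ sketch of compressing interleaved 16-bit PCM chunks into MP3; the mic capture itself would still be the AVAudioEngine tap described above, and the chunk size, bit rate, and file output here are illustrative assumptions.

#include <lame/lame.h>
#include <cstdio>
#include <vector>

int main() {
    // Configure the encoder to match the PCM you tap from the microphone.
    lame_t gf = lame_init();
    lame_set_in_samplerate(gf, 44100);
    lame_set_num_channels(gf, 2);
    lame_set_brate(gf, 128);                 // 128 kbps output
    lame_init_params(gf);

    const int frames = 4096;                                // frames per chunk
    std::vector<short> pcm(frames * 2);                     // interleaved stereo samples
    std::vector<unsigned char> mp3(frames * 5 / 4 + 7200);  // output size suggested by lame.h
    std::FILE* out = std::fopen("chunk.mp3", "wb");

    // For each PCM chunk delivered by the capture callback:
    int bytes = lame_encode_buffer_interleaved(gf, pcm.data(), frames,
                                               mp3.data(), (int)mp3.size());
    if (bytes > 0) std::fwrite(mp3.data(), 1, bytes, out);  // or push to the websocket

    // When recording stops, flush the encoder's internal buffers.
    bytes = lame_encode_flush(gf, mp3.data(), (int)mp3.size());
    if (bytes > 0) std::fwrite(mp3.data(), 1, bytes, out);

    std::fclose(out);
    lame_close(gf);
}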

FFmpeg save stream to mp3

I have an iOS project that plays online radio streams; it uses FFmpeg for playback. I also added the ability to record streams: I decode them via the avcodec_decode_audio4 function and write the output to a .wav file. But these files are too big, because WAV is an uncompressed format, so I want to encode the audio to .mp3 instead.
I have found a couple of ways to convert audio, but only when the audio is already a finished file; I want to encode to a compressed format as soon as I get a chunk of data from the stream, not from a finished file.
Is this possible?
Can you give me some advice on how to achieve this?
You can use FFmpeg (the libav* libraries) to encode the audio you're reading with avcodec_decode_audio4 into a file as MP3, as long as FFmpeg was configured with LAME support (--enable-libmp3lame).
Basically, you configure an mp3 codec, then call avcodec_encode_audio2 (who names these things?) on the progressive output of avcodec_decode_audio4.
The canonical example can be confusing because it also deals with video, but you should be able to tease the details you want out of it.
This post on transcoding audio by arashafiei is broadly helpful.
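As a rough sketch of that flow with the old avcodec_encode_audio2() API (the era that matches avcodec_decode_audio4); error handling, muxing, and the resample/FIFO step you usually need when the decoder's sample format or frame size doesn't match the encoder's are all omitted, and the helper names are mine:

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/channel_layout.h>
}
#include <cstdio>

// Open an MP3 encoder; stereo 44.1 kHz is assumed here for illustration.
static AVCodecContext* open_mp3_encoder() {
    AVCodec* codec = avcodec_find_encoder(AV_CODEC_ID_MP3); // requires --enable-libmp3lame
    AVCodecContext* enc = avcodec_alloc_context3(codec);
    enc->sample_rate    = 44100;
    enc->channels       = 2;
    enc->channel_layout = AV_CH_LAYOUT_STEREO;
    enc->sample_fmt     = AV_SAMPLE_FMT_S16P;  // libmp3lame takes planar samples
    enc->bit_rate       = 128000;
    avcodec_open2(enc, codec, NULL);
    return enc;
}

// Call this with each AVFrame that comes out of avcodec_decode_audio4().
static void encode_chunk(AVCodecContext* enc, AVFrame* decoded, std::FILE* out) {
    AVPacket pkt;
    av_init_packet(&pkt);
    pkt.data = NULL;
    pkt.size = 0;
    int got_packet = 0;
    if (avcodec_encode_audio2(enc, &pkt, decoded, &got_packet) == 0 && got_packet) {
        std::fwrite(pkt.data, 1, pkt.size, out);  // raw MP3 frames form a valid .mp3 stream
        av_packet_unref(&pkt);
    }
}

In practice you will usually also need libswresample (or an AVAudioFifo) so that the decoder's output matches the sample format and frame size libmp3lame expects.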

Audio format to choose for Big audio files

Which audio file format is best to use for large audio files? I have many large audio files to be used in my app, but their current MP3 size adds up to hundreds of MBs.
If you want to save storage on audio files, changing the file format may not affect the size much; reducing the bit rate (for example from 320 kbps to 128 kbps) can reduce the file size significantly.
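To put rough numbers on that: 320 kbps is 320,000 / 8 = 40,000 bytes per second, so an hour of audio is about 144 MB; at 128 kbps the same hour is about 57.6 MB, a reduction of 60%.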
How can this be done using Microsoft's Audio Compression Manager? (In practice it is not well documented on MSDN.)
Windows provides codecs specifically for compressing audio. Audio files are typically in PCM format (WAVE_FORMAT_PCM) and can be played using the simplest DirectSound method (check MSDN; it is at hand and it works).
To play a PCM file with DirectSound, you first create a DirectSound object, create a DirectSound buffer, and then pump the PCM data into the buffer with a keep-the-buffer-filled algorithm.
If you wish to use codecs, you write a procedure that opens the file as a stream and passes it through an ACM driver object, which (de)compresses it.
The ACM (Audio Compression Manager) driver finds a codec that suits the input format and decompresses it back to WAVE_FORMAT_PCM so that your app can play it.
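A rough sketch of that ACM path, decompressing one chunk of a compressed stream back to WAVE_FORMAT_PCM; error checking is omitted, the helper name is mine, and you need to link msacm32.lib:

#include <windows.h>
#include <mmreg.h>
#include <msacm.h>
#include <vector>

std::vector<BYTE> AcmDecodeChunk(WAVEFORMATEX* srcFmt, BYTE* srcData, DWORD srcBytes)
{
    // Ask the ACM for a PCM format that matches the compressed source.
    WAVEFORMATEX dstFmt = {};
    dstFmt.wFormatTag = WAVE_FORMAT_PCM;
    acmFormatSuggest(NULL, srcFmt, &dstFmt, sizeof(dstFmt),
                     ACM_FORMATSUGGESTF_WFORMATTAG);

    HACMSTREAM stream = NULL;
    acmStreamOpen(&stream, NULL, srcFmt, &dstFmt, NULL, 0, 0,
                  ACM_STREAMOPENF_NONREALTIME);

    // How big will the decompressed block be?
    DWORD dstBytes = 0;
    acmStreamSize(stream, srcBytes, &dstBytes, ACM_STREAMSIZEF_SOURCE);
    std::vector<BYTE> dst(dstBytes);

    ACMSTREAMHEADER hdr = {};
    hdr.cbStruct    = sizeof(hdr);
    hdr.pbSrc       = srcData;
    hdr.cbSrcLength = srcBytes;
    hdr.pbDst       = dst.data();
    hdr.cbDstLength = dstBytes;

    acmStreamPrepareHeader(stream, &hdr, 0);
    acmStreamConvert(stream, &hdr, 0);          // PCM ends up in dst
    acmStreamUnprepareHeader(stream, &hdr, 0);
    acmStreamClose(stream, 0);

    dst.resize(hdr.cbDstLengthUsed);            // actual converted byte count
    return dst;
}

The PCM that comes back is what you would then feed into your DirectSound buffer with the keep-the-buffer-filled loop described above.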

AUGraph setup on iOS

I am designing an AUGraph for an iOS application and would appreciate help on the following things.
If I want to play a number of audio files at once, does each file need an audio unit?
From the Core Audio docs:
Linear PCM and IMA/ADPCM (IMA4) audio: You can play multiple linear PCM or IMA4 format sounds simultaneously in iOS without incurring CPU resource problems.
AAC, MP3, and Apple Lossless (ALAC) audio: Playback for AAC, MP3, and Apple Lossless (ALAC) sounds uses efficient hardware-based decoding on iPhone and iPod touch. You can play only one such sound at a time.
So multiple AAC or MP3 files cannot be played at the same time. What is the optimal LPCM format for playing multiple sounds at once?
Does this apply to audio units too, given that this is under the Audio Queue documentation?
Can an audio unit in an AUGraph be inactive? If an AUGraph looks like this:
Speaker/output < recorder unit < mixer unit < number of audio file playing units
what happens if the recorder is not active? Would it still pull, but just not write the buffers to a file?
No; you need to use the mixer audio unit. Check this:
http://developer.apple.com/library/ios/DOCUMENTATION/MusicAudio/Conceptual/AudioUnitHostingGuide_iOS/ConstructingAudioUnitApps/ConstructingAudioUnitApps.html#//apple_ref/doc/uid/TP40009492-CH16-SW1
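For concreteness, here is a minimal sketch of the kind of graph that guide describes: a multichannel mixer feeding RemoteIO, with one render callback per sound. It uses the plain Core Audio C API; the bus count and the callback/function names are my own illustrative choices, and error checking plus per-bus stream-format setup are omitted.

#include <AudioToolbox/AudioToolbox.h>

static const UInt32 kNumPlayers = 4;   // one mixer input bus per sound

// Each mixer input bus pulls LPCM from one of these callbacks.
static OSStatus PlayerRenderCallback(void* inRefCon,
                                     AudioUnitRenderActionFlags* ioActionFlags,
                                     const AudioTimeStamp* inTimeStamp,
                                     UInt32 inBusNumber,
                                     UInt32 inNumberFrames,
                                     AudioBufferList* ioData)
{
    // Copy inNumberFrames of this player's PCM into ioData (or silence).
    return noErr;
}

void BuildGraph(AUGraph* outGraph)
{
    AUGraph graph;
    NewAUGraph(&graph);

    AudioComponentDescription ioDesc  = { kAudioUnitType_Output,
        kAudioUnitSubType_RemoteIO, kAudioUnitManufacturer_Apple, 0, 0 };
    AudioComponentDescription mixDesc = { kAudioUnitType_Mixer,
        kAudioUnitSubType_MultiChannelMixer, kAudioUnitManufacturer_Apple, 0, 0 };

    AUNode ioNode, mixNode;
    AUGraphAddNode(graph, &ioDesc,  &ioNode);
    AUGraphAddNode(graph, &mixDesc, &mixNode);
    AUGraphOpen(graph);

    // Give the mixer one input bus per sound and hook a callback to each bus.
    AudioUnit mixerUnit;
    AUGraphNodeInfo(graph, mixNode, NULL, &mixerUnit);
    UInt32 busCount = kNumPlayers;
    AudioUnitSetProperty(mixerUnit, kAudioUnitProperty_ElementCount,
                         kAudioUnitScope_Input, 0, &busCount, sizeof(busCount));
    for (UInt32 bus = 0; bus < kNumPlayers; ++bus) {
        AURenderCallbackStruct cb = { PlayerRenderCallback, NULL };
        AUGraphSetNodeInputCallback(graph, mixNode, bus, &cb);
        // Real code also sets kAudioUnitProperty_StreamFormat on each input bus here.
    }

    // Mixer output 0 -> RemoteIO input element 0 (the speaker path).
    AUGraphConnectNodeInput(graph, mixNode, 0, ioNode, 0);

    AUGraphInitialize(graph);
    AUGraphStart(graph);
    *outGraph = graph;
}

Each sound feeds one mixer input bus; the mixer does the simultaneous mixing before the single output unit.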
Mostly by reading the Apple guide linked above, wrapping the sample code in a class, and creating a pair of utility structures, I coded this 'Simple Sound Engine' from scratch:
http://nicolasmiari.com/blog/a-simple-sound-engine-for-ios-using-the-audio-unit-framework/
(Link to the article on my blog containing the source code.) Sorry, I moved the blog to Jekyll/GitHub and this article didn't make the cut.
...I was going to start a repo on github, but it's too much trouble. I am a visual guy, still pretty much git-phobic. Okay, that was a long time ago... Now I use git from the command line :-)
You can use it as-is, or extract the Audio Unit-related code and adapt it to your project.
I believe the Cocos Denshion 'Simple Audio Engine' does pretty much the same thing, but I haven't checked the source code.
Known issues
If you have an exception breakpoint set for C++ exceptions, when debugging, the code will stop 2 or 3 times on AUGraphInitialize(). This is a 'non-crashing' exception, so you can click on continue and the code works OK.
To convert your wav files to the uncompressed .caf format, use this command on the Terminal:
afconvert -f caff -d LEI16 mySoundFile.wav mySoundFile.caf
EDIT: So I created a GitHub repo after all:
https://github.com/nicolas-miari/Sound-Engine
Both ordinary .wav and .caf files can contain raw PCM audio samples, which can be played without hardware assist or DSP processing if they are already at the destination sample rate.
When there's no audio file or other synthesized data to feed an audio unit that's pulling buffers, the usual practice is to feed it buffers of silence (or perhaps a taper to zero if the previous buffer ended with non-zero amplitude).
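As a small sketch of that practice, a render callback can zero its buffers and flag the output as silent when there is nothing to play (the function name is mine):

#include <AudioToolbox/AudioToolbox.h>
#include <cstring>

// Fills the output with silence and tells downstream units it is silent.
static OSStatus SilenceWhenIdle(void* /*inRefCon*/,
                                AudioUnitRenderActionFlags* ioActionFlags,
                                const AudioTimeStamp* /*inTimeStamp*/,
                                UInt32 /*inBusNumber*/,
                                UInt32 /*inNumberFrames*/,
                                AudioBufferList* ioData)
{
    for (UInt32 i = 0; i < ioData->mNumberBuffers; ++i)
        std::memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
    *ioActionFlags |= kAudioUnitRenderAction_OutputIsSilence;
    return noErr;
}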

How do you write audio to the first frame with AVAssetWriter while capturing video/audio on iOS?

Long story short, I am trying to implement a naive solution for streaming video from the iOS camera/microphone to a server.
I am using AVCaptureSession with audio and video AVCaptureOutputs, and then using AVAssetWriter/AVAssetWriterInput to capture video and audio in the captureOutput:didOutputSampleBuffer:fromConnection method and write the resulting video to a file.
To make this a stream, I am using an NSTimer to break the video files into 1 second chunks (by hot-swapping in a different AVAssetWriter that has a different outputURL) and upload these to a server over HTTP.
This is working, but the issue I'm running into is this: the beginning of each .mp4 file always appears to be missing audio in the first frame, so when the video files are concatenated on the server (running ffmpeg) there is a noticeable audio skip at the joins between files. The video is just fine - no skipping.
I tried many ways of making sure no CMSampleBuffers were dropped, and checked their timestamps to make sure they were going to the right AVAssetWriter, but to no avail.
I checked the AVCam example (which uses AVCaptureMovieFileOutput) and the AVCaptureLocation example (which uses AVAssetWriter), and it appears the files they generate do the same thing.
Maybe there is something fundamental I am misunderstanding here about the nature of audio/video files, as I'm new to video/audio capture - but I thought I'd check before I try to work around this by learning to use ffmpeg to fragment the stream, as some seem to do (if you have any tips on this, too, let me know!). Thanks in advance!
I had the same problem and solved it by recording the audio with a different API, Audio Queue. This seems to fix it; you just need to take care of the timing in order to avoid an audio delay.
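For reference, the recording side of Audio Queue Services is a C API; a minimal input-queue setup looks roughly like this (the sample rate, buffer sizes, and function names are my own illustrative assumptions; the callback is where you would hand the timestamped PCM to whatever writes or uploads the current chunk):

#include <AudioToolbox/AudioToolbox.h>

// Called by the queue whenever a buffer of captured LPCM is ready.
static void InputCallback(void* /*inUserData*/, AudioQueueRef inQueue,
                          AudioQueueBufferRef inBuffer,
                          const AudioTimeStamp* /*inStartTime*/,
                          UInt32 /*inNumPackets*/,
                          const AudioStreamPacketDescription* /*inPacketDescs*/)
{
    // inBuffer->mAudioData holds inBuffer->mAudioDataByteSize bytes of PCM,
    // with a timestamp you can align against the video track. Hand it off here,
    // then return the buffer to the queue.
    AudioQueueEnqueueBuffer(inQueue, inBuffer, 0, NULL);
}

void StartAudioQueueCapture(AudioQueueRef* outQueue)
{
    AudioStreamBasicDescription fmt = {};
    fmt.mSampleRate       = 44100.0;
    fmt.mFormatID         = kAudioFormatLinearPCM;
    fmt.mFormatFlags      = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    fmt.mChannelsPerFrame = 1;
    fmt.mBitsPerChannel   = 16;
    fmt.mBytesPerFrame    = 2;
    fmt.mBytesPerPacket   = 2;
    fmt.mFramesPerPacket  = 1;

    AudioQueueRef queue = NULL;
    AudioQueueNewInput(&fmt, InputCallback, NULL, NULL, NULL, 0, &queue);

    for (int i = 0; i < 3; ++i) {                       // a few ~100 ms buffers
        AudioQueueBufferRef buf = NULL;
        AudioQueueAllocateBuffer(queue, 4410 * 2, &buf);
        AudioQueueEnqueueBuffer(queue, buf, 0, NULL);
    }
    AudioQueueStart(queue, NULL);
    *outQueue = queue;
}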
