Audio Synthesis with (AVFoundation?) Using Sine Wave - ios

Let's say I have an array of Y values for a sine wave. (Assume X is time)
In Python you can just write it to a Wav file:
wav.write("file.wav", <sample rate>, <waveform>)
Is it possible to do this in Swift using AVFoundation? If so how? If not, what library should I be using? (I'm trying to avoid AudioKit for now.)
Thanks,
Charles

In AVFoundation there is AVAudioFile, but you'll have to provide the data as AVAudioPCMBuffers, which keep the data in an AudioBufferList, which in turn consists of AudioBuffers, all of which are imho rather complicated, since their design goal apparently was to handle every conceivable audio format (compressed, VBR, etc.). So AVAudioFile is probably overkill for just writing some synthetic samples to a WAV file.
Alternatively, there is the Audio File Services C-API. It provides AudioFileCreateWithURL, AudioFileWriteBytes and AudioFileClose, which will probably do the trick for your task.
The most complicated part may be the AudioStreamBasicDescription required by AudioFileCreateWithURL. To help with this a utility function exists: FillOutASBDForLPCM.
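For reference, the Python idiom the question alludes to (`scipy.io.wavfile.write`) can be reproduced with nothing but the standard library's `wave` module. Whichever Swift route you take, this is the same information (sample rate, sample width, channel count, raw PCM frames) the Core Audio APIs will ask for; the file name, frequency, and amplitude below are arbitrary:

```python
import math
import struct
import wave

SAMPLE_RATE = 44100
DURATION_S = 1.0
FREQ_HZ = 440.0
AMPLITUDE = 0.5  # fraction of full scale, to leave headroom

n_samples = int(SAMPLE_RATE * DURATION_S)
samples = [
    int(AMPLITUDE * 32767 * math.sin(2 * math.pi * FREQ_HZ * i / SAMPLE_RATE))
    for i in range(n_samples)
]

with wave.open("sine.wav", "wb") as f:
    f.setnchannels(1)              # mono
    f.setsampwidth(2)              # 16-bit PCM
    f.setframerate(SAMPLE_RATE)
    f.writeframes(struct.pack("<%dh" % n_samples, *samples))
```

The three `set*` calls are exactly the fields FillOutASBDForLPCM fills in for you on the Core Audio side.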

Related

Which method in lua could I use to most effectively get the hertz frequency of a certain position in a .wav file?

So I want to convert a .wav file to a JSON table using Lua, producing something like {time="0:39.34", hz=440}. I already have all my JSON libraries; I just need a way to turn a .wav file into data I can then convert to JSON. If a library already does this, I need its source code so I can fold it into my single-file program.
At each point in the WAV you'll have a full spectrum, not just "the hertz frequency". You'll have to perform a Fourier transform on the data and, from the many peaks in the spectrum, select the one you're interested in, be it fundamental, dominant, etc.
There are libs for the Fast Fourier Transform out there, like LuaFFT, but you'd better get a clearer picture of what you really need from the WAV. If you're just trying to read a DTMF signal, you don't really need full-scale spectrum analysis.
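The shape of that transform-and-pick-a-peak step is the same in any language (LuaFFT would play the FFT role in Lua). Here is a minimal, dependency-free Python sketch using a naive O(N²) DFT; the sample rate, window length, and test frequency are arbitrary choices for illustration:

```python
import cmath
import math

SR = 8192      # sample rate, Hz
N = 512        # analysis window length
FREQ = 512.0   # test tone, chosen to land exactly on DFT bin 32

signal = [math.sin(2 * math.pi * FREQ * n / SR) for n in range(N)]

def dominant_frequency(x, sr):
    """Return the frequency of the strongest DFT bin (naive O(N^2) DFT)."""
    n = len(x)
    mags = []
    for k in range(n // 2):   # only bins up to the Nyquist frequency
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(s))
    k_peak = max(range(len(mags)), key=mags.__getitem__)
    return k_peak * sr / n    # convert bin index back to Hz

print(dominant_frequency(signal, SR))  # → 512.0
```

In practice you would slide this window along the file and emit one (time, hz) pair per window, which maps directly onto the JSON table the question describes.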

Split audio track into segments by BPM and analyse each segment using Superpowered iOS

I have been using the Superpowered iOS library to analyse audio and extract BPM, loudness, and pitch data. I'm working on an iOS Swift 3.0 project and have been able to get the C classes to work with Swift using bridging headers for Obj-C.
The problem I am running into is that, whilst I can create a decoder object, extract audio from the Music Library, and store it as a .WAV, I am unable to create a decoder object for just snippets of the extracted audio and get the analyser class to return data.
My approach has been to create a decoder object as follows:
var decodeAttempt = decoder!.open(self.originalFilePath, metaOnly: false, offset: offsetBytes, length: lengthBytes, stemsIndex: 0)
'offsetBytes' and 'lengthBytes' are, I think, positions within the audio file. Since I have already decompressed the audio, stored it as WAV, and am providing it to the decoder here, I am calculating the offset and length using the PCM WAV formula 44100 × 2 channels × 16 bits / 8 = 176400 bytes per second, then using this to specify a start point and length in bytes. I'm not sure this is the correct way to do it, as the decoder returns 'Unknown file format'.
Any ideas or even alternative suggestions of how to achieve the title of this question? Thanks in advance!
The offset and length parameters of the SuperpoweredDecoder are there because of the Android APK file format, where bundled audio files are simply concatenated to the package.
Although a WAV file is as "uncompressed" as it gets, there is a header at the beginning, so offset and length are not a good fit for this purpose. The header is present only at the start of the file, and without it decoding is not possible.
You mention that you can extract audio to PCM (and save to WAV). Then you have the answer in your hand: just submit different extracted portions to different instances of the SuperpoweredOfflineAnalyzer.
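To make the header point concrete, here is the byte arithmetic from the question as a Python sketch, with the header offset added. The 44-byte figure is the minimal canonical WAV header and is an assumption; files with extra chunks have larger headers, which is one more reason raw byte offsets are fragile:

```python
SAMPLE_RATE = 44100
CHANNELS = 2
BITS_PER_SAMPLE = 16
HEADER_BYTES = 44  # minimal canonical RIFF/fmt/data header; not guaranteed

# the 176,400 bytes/second figure from the question
bytes_per_second = SAMPLE_RATE * CHANNELS * BITS_PER_SAMPLE // 8

def segment_bytes(start_s, length_s):
    """Byte offset/length of a PCM segment inside a canonical WAV file."""
    offset = HEADER_BYTES + start_s * bytes_per_second
    length = length_s * bytes_per_second
    return offset, length

print(bytes_per_second)      # → 176400
print(segment_bytes(10, 5))  # → (1764044, 882000)
```

Even with the header accounted for, a segment carved out this way has no header of its own, which is why the answer's suggestion (feed separately extracted portions to separate analyzer instances) is the robust route.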

convert audio to lower sampling rate

I am using AVAudioRecorder to save recordings and AVAssetExportSession to append multiple files, but the output of the export session is too large.
I would like to shrink it before uploading it to the server. How can I convert it to a lower sampling rate?
Use AVAssetWriter (Apple docs: https://developer.apple.com/library/mac/documentation/AVFoundation/Reference/AVAssetWriter_Class/index.html), which lets you choose bitrate/channel/etc. options for the file.
This related question (AVAssetWriter How to write down-sampled/compressed m4a/mp3 files) has a full code sample using AVAssetWriter if you need one; be sure, of course, to take note of the answer there regarding where the exported file ends up.
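To see why a lower sample rate shrinks the file, here is a toy decimation sketch, not what AVAssetWriter does internally: halving the rate halves the PCM data. The pair-averaging below is an illustration only; a real converter applies a proper anti-aliasing low-pass filter first, which is exactly the work AVAssetWriter's audio settings buy you:

```python
def downsample_by_2(samples):
    """Halve the sample rate by averaging adjacent pairs.
    Toy illustration only: without a low-pass filter this aliases.
    A real sample-rate converter (e.g. AVAssetWriter) filters first."""
    return [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples) - 1, 2)]

pcm = list(range(8))           # pretend: 8 samples at 44.1 kHz
half = downsample_by_2(pcm)    # 4 samples at 22.05 kHz
print(half)                    # → [0, 2, 4, 6]
```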

Get PTS from raw H264 mdat generated by iOS AVAssetWriter

I'm trying to simultaneously read and write an H.264 .mov file produced by AVAssetWriter. I managed to extract individual NAL units, pack them into FFmpeg AVPackets, and write them into another container format using FFmpeg. It works, and the resulting file plays well, except that the playback speed is wrong. How do I calculate the correct PTS/DTS values from raw H.264 data? Or is there some other way to get them?
Here's what I've tried:
Limiting the capture min/max frame rate to 30 and assuming the output file will be 30 fps. In fact its fps is always lower than the values I set, and I don't think it is constant from packet to packet.
Remembering each written sample's presentation timestamp, assuming samples map one-to-one to NALUs, and applying the saved timestamps to the output packets. This doesn't work.
Setting PTS to 0 or AV_NOPTS_VALUE. Doesn't work.
From googling I understand that raw H.264 data usually doesn't contain any timing info. It can sometimes carry timing inside SEI, but the files I use don't have it. On the other hand, some applications do exactly what I'm trying to do, so I suppose it must be possible somehow.
You will either have to generate them yourself, or access the atoms containing timing information in the MP4/MOV container to generate PTS/DTS values. FFmpeg's mov.c in libavformat might help.
Each sample/frame you write with AVAssetWriter will map one-to-one with the VCL NALs. If all you are doing is converting, then let FFmpeg do the heavy lifting; it will properly maintain the timing information when going from one container format to another.
The bitstream generated by AVAssetWriter does not contain SEI data. It only contains SPS/PPS/I/P frames. The SPS also does not contain VUI or HRD parameters.
-- Edit --
Also, keep in mind that if you are saving PTS information from CMSampleBufferRefs, the time base may differ from that of the target container. For instance, the AVFoundation time base is nanoseconds, while an FLV file uses milliseconds.
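That time-base conversion is a pure rescaling, the same job FFmpeg's av_rescale_q does. A Python sketch of the arithmetic (the function name and the seconds-per-tick representation are my own, for illustration):

```python
from fractions import Fraction

def rescale_pts(pts, src_timebase, dst_timebase):
    """Convert a timestamp between time bases (cf. FFmpeg's av_rescale_q).
    Time bases are expressed here as seconds-per-tick."""
    return int(pts * Fraction(src_timebase) / Fraction(dst_timebase))

NANOSECONDS = Fraction(1, 1_000_000_000)
MILLISECONDS = Fraction(1, 1_000)

# 2.5 s captured in a nanosecond time base, retimed to FLV's milliseconds:
print(rescale_pts(2_500_000_000, NANOSECONDS, MILLISECONDS))  # → 2500
```

Using exact rationals instead of floats matters here: rounding drift in rescaled timestamps is one classic source of the "playback speed is slightly off" symptom.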

iOS: Sound generation on iPad given Hz parameter?

Is there an API in one of the iOS layers that I can use to generate a tone by just specifying its frequency in hertz? What I'm looking to do is generate a DTMF tone. This link explains how DTMF tones consist of 2 tones:
http://en.wikipedia.org/wiki/Telephone_keypad
Which basically means that I should need playback of 2 tones at the same time...
So, does something like this exist:
SomeCleverPlayerAPI(697, 1336);
I've spent the whole morning searching for this, and have found a number of ways to play back a sound file, but nothing on how to generate a specific tone. Does anyone know, please...
Check out the AU (AudioUnit) API. It's pretty low-level, but it can do what you want. A good intro (that probably already gives you what you need) can be found here:
http://cocoawithlove.com/2010/10/ios-tone-generator-introduction-to.html
There is no iOS API to do this audio synthesis for you.
But you can use the Audio Queue or Audio Unit RemoteIO APIs to play raw audio samples: generate an array of samples of 2 sine waves summed (say, 44100 samples for 1 second's worth), then copy the results into the audio callback (1024 samples at a time, or whatever the callback requests).
See Apple's aurioTouch and SpeakHere sample apps for how to use these audio APIs.
The samples can be generated by something as simple as:
sample[i] = (short int)(v1*sinf(2.0*pi*i*f1/sr) + v2*sinf(2.0*pi*i*f2/sr));
where sr is the sample rate, f1 and f2 are the 2 frequencies, and v1 + v2 sum to less than 32767.0. You can add rounding or noise dithering to this for cleaner results.
Beware of clicking if your generated waveforms don't taper to zero at the ends.
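Putting the one-line formula and the taper advice together, here is a sketch in Python (the function name and ramp length are my own; the amplitudes are chosen so v1 + v2 stays under 32767, and 697/1336 Hz is the DTMF pair from the question):

```python
import math

SR = 44100  # sample rate

def dtmf_samples(f1, f2, seconds, sr=SR, v1=16000.0, v2=16000.0, ramp=0.005):
    """Sum two sine waves (the DTMF pair) into 16-bit samples, with a
    short linear fade in/out so the tone doesn't click at the edges.
    v1 + v2 must stay below 32767 to avoid clipping."""
    n = int(sr * seconds)
    n_ramp = int(sr * ramp)
    out = []
    for i in range(n):
        s = v1 * math.sin(2.0 * math.pi * i * f1 / sr) + \
            v2 * math.sin(2.0 * math.pi * i * f2 / sr)
        if i < n_ramp:              # taper the first few milliseconds
            s *= i / n_ramp
        elif i >= n - n_ramp:       # ...and the last few, down to zero
            s *= (n - 1 - i) / n_ramp
        out.append(int(s))
    return out

tone = dtmf_samples(697, 1336, 0.2)  # DTMF "2": row 697 Hz + col 1336 Hz
```

The resulting array is exactly what you would hand to the Audio Queue / RemoteIO callback in chunks, or write straight to a WAV file.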
