I want to play very basic sounds using the AudioKit oscillator at a given frequency.
For example: play a simple sine wave at 400Hz for 50 miliseconds, then 100 ms of silence, then play 600Hz for 50ms …
I have a metal view where I render some visual stimuli. My intention was to use the basic AKOscillator and the render function of the CADisplayLink to play/stop the sound at some frames.
I tried using oscillator.play() and oscillator.stop() or changing the amplitude with oscillator.amplitude = 0 and oscillator.amplitude = 1 but the result in both cases is a jittering of about 10 ms.
If I first create .wav files and then played them with AKPlayer.play() the timing is correct.
I want the flexibility to use any frequency at any time. How can I do something similar to the first approach? Wrapping the oscillator in an midi instrument is the way to go?
CADisplayLink runs in the UI thread, so will have some sub-frame-time jitter, as the UI thread has lower priority than audio (and perhaps other OS) threads.
The only way I've succeeded in reliable sub-millisecond accurate real-time synthesis of arbitrary frequency audio is to put a sinewave oscillator (or waveform generator or table look-up) inside an Audio Unit callback (RemoteIO for iOS), and count samples, before dynamically inserting the stop/start of the desired sinewave (or other waveform) into the callback buffer(s), starting at the correct sample buffer offset index.
At first, I pre-generated sinewave look-up tables; but later did some profiling; and found that just calling the sin() function with a rotating phase, at audio sample rates, took a barely measurable fraction of a percent of CPU time on any contemporary iOS device. If you do use an incrementing phase, make sure to keep the phase in a reasonable numeric range by occasionally adding/subtracting multiples of 2*pi.
Related
I have a problem with the iOS SDK. I can't find the API to slowdown a video with continuous values.
I have made an app with a slider and an AVPlayer, and I would like to change the speed of the video, from 50% to 150%, according to the slider value.
As for now, I just succeeded to change the speed of the video, but only with discrete values, and by recompiling the video. (In order to do that, I used AVMutableComposition APIs.
Do you know if it is possible to change continuously the speed, and without recompiling?
Thank you very much!
Jery
The AVPlayer's rate property allows playback speed changes if the associated AVPlayerItem is capable of it (responds YES to canPlaySlowForward or canPlayFastForward). The rate is 1.0 for normal playback, 0 for stopped, and can be set to other values but will probably round to the nearest discrete value it is capable of, such as 2:1, 3:2, 5:4 for faster speeds, and 1:2, 2:3 and 4:5 for slower speeds.
With the older MPMoviePlayerController, and its similar currentPlaybackRate property, I found that it would take any setting and report it back, but would still round it to one of the discrete values above. For example, set it to 1.05 and you would get normal speed (1:1) even though currentPlaybackRate would say 1.05 if you read it. Set it to 1.2 and it would play at 1.25X (5:4). And it was limited to 2:1 (double speed), beyond which it would hang or jump.
For some reason, the iOS API Reference doesn't mention these discrete speeds. They were found by experimentation. They make some sense. Since the hardware displays video frames at a fixed rate (e.g.- 30 or 60 frames per second), some multiples are easier than others. Half speed can be achieved by showing each frame twice, and double speed by dropping every other frame. Dropping 1 out of every 3 frames gives you 150% (3:2) speed. But to do 105% is harder, dropping 1 out of every 21 frames. Especially if this is done in hardware, you can see why they might have limited it to only certain multiples.
I am trying do slow motion for my video file along with audio. In my case, I have to do Ramped Slow motion(Gradually slowing down and speeding up
like parabola not a "Linear Slow Motion".
Ref:Linear slow motion :
Ref : Ramped Slow Motion :
What have i done so far:
Used AVFoundation for first three bullets
From video files, separated audio and video.
Did slow motion for video using AVFoundation api (scaletimeRange).Its really working fine.
The same is not working for audio. Seems there's a bug in apple api itself (Bug ID : 14616144). The relevant question is scaleTimeRange has no effect on audio type AVMutableCompositionTrack
So i switched to Dirac. later found there is a limitation with Dirac's open source edition that it doesn't support Dynamic Time Stretching.
Finally trying to do with OpenAL.
I've taken a sample OpenAL program from Apple developer forum and executed it.
Here are my questions:
Can i store/save the processed audio in OpenAl?if its directly not possible with "OpenAl", can it be done with AVFoundation + OpenAL?
Very importantly, how to do slow motion or stretch the time scale with OpenAL? If i know time stretching, i can apply logic for Ramp Slow Motion.
Is there any other way?
I can't really speak to 1 or 2, but time scaling audio can be as easy as resampling. If you have RAW/PCM audio sampled at 48 kHz and want to playback at half speed, resample to 96 kHz and play the audio at 48 kHz. Since you have twice the number of samples it will take twice as long to play. Generally:
scaledSampleRate = (orignalSampleRate / playRate);
or
playRate = (originalSampleRate / scaledSampleRate);
This will effect the pitch of the track, however that may be the desired effect since that behavior is somewhat is expected in "slow motion" audio. There are more advanced techniques that preserve pitch while scaling time. The open source software Audacity implements these algorithms. You could find inspiration there. There are many resources on the web that explain the tradeoffs of pitch shifting vs time stretching.
http://en.wikipedia.org/wiki/Audio_time-scale/pitch_modification
http://www.dspdimension.com/admin/time-pitch-overview/
Another option you may not have considered is muting the audio during slow motion. That seems to be the technique employed by most AV playback utilities. However, depending on your use case, distorted audio does indicate time is being manipulated.
I have applied slow motion on complete video including audio this might help You check this link : How to do Slow Motion video in IOS
In my AudioInputRenderCallback I'm looking to capture an accurate time stamp of certain audio events. To test my code, I'm inputting a click track #120BPM or every 500 milliseconds (The click is accurate, I checked, and double checked). I first get the decibel of every sample, and check if it's over a threshold, this works as expected. I then take the hostTime from the AudioTimeStamp, and convert it to milliseconds. The first click gets assigned to that static timestamp and the second time through does a calculation of the interval and then reassigns to the static one. I expected to see a 500 interval. To be able to calculate the click correctly I have to be with in 5 milliseconds. The numbers seem to bounce back and forth between 510 & 489. I understand it's not an RTOS, but can iOS be this accurate? Is there any issues with using the mach_absolute_time member of the AudioUnitTimeStamp?
Audio Units are buffer based. The minimum length of an iOS Audio Unit buffer seems to be around 6 mS. So if you use the time-stamps of the buffer callbacks, your time resolution or time sampling jitter will be about +- 6 mS.
If you look at the actual raw PCM samples inside the Audio Unit buffer and pattern match the "attack" transient (by threshold or autocorrelation, etc.) you might be able get sub-millisecond resolution.
It seems to me that CoreAudio adds sound waves together when mixing into a single channel. My program will make synthesised sounds. I know the amplitudes of each of the sounds. When I play them together should I add them together and multiply the resultant wave to keep within the range? I can do it like this:
MaxAmplitude = max1 + max2 + max3 //Maximum amplitude of each sound wave
if MaxAmplitude > 1 then //Over range
Output = (wave1 + wave2 + wave3)/MaxAmplitude //Meet range
else
Output = (wave1 + wave2 + wave3) //Normal addition
end if
Can I do it this way? Should I pre-analyse the sound waves to find the actual maximum amplitude (Because the maximum points may not match on the timeline) and use that?
What I want is a method to play several synthesised sounds together without reducing the volume throughout extremely and sounding seamless. If I play a chord with several synthesised instruments, I don't want to require single notes to be practically silent.
Thank you.
Changing the scale suddenly on a single sample basis, which is what your "if" statement does, can sound very bad, similar to clipping.
You can look into adaptive AGC (automatic gain control) which will change the scale factor more slowly, but could still clip or get sudden volume changes during fast transients.
If you use lookahead with the AGC algorithm to prevent sudden transients from clipping, then your latency will get worse.
If you do use AGC, then isolated notes may sound like they were played much more loudly than when played in a chord, which may not correctly represent a musical composition's intent (although this type of compression is common in annoying TV and radio commercials).
Scaling down the mixer output volume so that the notes will never clip or have their volume reduced other than when the composition indicates will result in a mix with greatly reduced volume for a large number of channels (which is why properly reproduced classical music on the radio is often too quiet to draw enough viewers to make enough money).
It's all a trade-off.
I don't see this is a problem. If you know the max amplitude of all your waves (for all time) it should work. Be sure not to change the amplitude on per sample basis but decide for every "note-on". It is a very simple algorithm but could suit your needs.
I am attempting to record an animation (computer graphics, not video) to a WMV file using DirectShow. The setup is:
A Push Source that uses an in-memory bitmap holding the animation frame. Each time FillBuffer() is called, the bitmap's data is copied over into the sample, and the sample is timestamped with a start time (frame number * frame length) and duration (frame length). The frame rate is set to 10 frames per second in the filter.
An ASF Writer filter. I have a custom profile file that sets the video to 10 frames per second. Its a video-only filter, so there's no audio.
The pins connect, and when the graph is run, a wmv file is created. But...
The problem is it appears DirectShow is pushing data from the Push Source at a rate greater than 10 FPS. So the resultant wmv, while playable and containing the correct animation (as well as reporting the correct FPS), plays the animation back several times too slowly because too many frames were added to the video during recording. That is, a 10 second video at 10 FPS should only have 100 frames, but about 500 are being stuffed into the video, resulting in the video being 50 seconds long.
My initial attempt at a solution was just to slow down the FillBuffer() call by adding a sleep() for 1/10th second. And that indeed does more or less work. But it seems hackish, and I question whether that would work well at higher FPS.
So I'm wondering if there's a better way to do this. Actually, I'm assuming there's a better way and I'm just missing it. Or do I just need to smarten up the manner in which FillBuffer() in the Push Source is delayed and use a better timing mechanism?
Any suggestions would be greatly appreciated!
I do this with threads. The main thread is adding bitmaps to a list and the recorder thread takes bitmaps from that list.
Main thread
Animate your graphics at time T and render bitmap
Add bitmap to renderlist. If list is full (say more than 8 frames) wait. This is so you won't use too much memory.
Advance T with deltatime corresponding to desired framerate
Render thread
When a frame is requested, pick and remove a bitmap from the renderlist. If list is empty wait.
You need a threadsafe structure such as TThreadList to hold the bitmaps. It's a bit tricky to get right but your current approach is guaranteed to give to timing problems.
I am doing just the right thing for my recorder application (www.videophill.com) for purposes of testing the whole thing.
I am using Sleep() method to delay the frames, but am taking great care to ensure that timestamps of the frames are correct. Also, when Sleep()ing from frame to frame, please try to use 'absolute' time differences, because Sleep(100) will sleep about 100ms, not exactly 100ms.
If it won't work for you, you can always go for IReferenceClock, but I think that's overkill here.
So:
DateTime start=DateTime.Now;
int frameCounter=0;
while (wePush)
{
FillDataBuffer(...);
frameCounter++;
DateTime nextFrameTime=start.AddMilliseconds(frameCounter*100);
int delay=(nextFrameTime-DateTime.Now).TotalMilliseconds;
Sleep(delay);
}
EDIT:
Keep in mind: IWMWritter is time insensitive as long as you feed it with SAMPLES that are properly time-stamped.