Cropping a MIDI file using AudioKit - iOS

I am trying to crop and loop a certain part of a MIDI file using AudioKit.
I am using a sequencer and found a couple of things that are close to what I need, but not exactly.
I found a method in AKSequencer called clearRange. With this method I am able to silence the parts of the MIDI I don't want, but I haven't found a way to trim the sequencer and only keep the part I am interested in. Right now only the part I want has sound but I still get the silent parts.
Is there a way to trim a sequencer or to create a new sequencer with only the portion I want to keep from the original one?
Thanks!

One frustrating limitation of Apple's MusicSequence (which AKSequencer is based on) is that although you can easily set the 'right side' of the looping section, the left side will always loop back to zero and cannot be changed. So to crop from the left side, you need to isolate the section that you want to loop and shift it over so that the start of your loop is at zero.
As of AudioKit 4.2.4, this is possible. Use AKMusicTrack's .getMIDINoteData() to get an array of AKMIDINoteData structs whose content can be edited and then used to replace the original data. If you had a 16 beat track and you wanted to loop the last four beats, you could do something like this:
let loopStart = 12.0
let loopLength = 4.0
// keep track of the original track contents
let originalLength = 16.0
let originalContents = track.getMIDINoteData()
// isolate the segment for looping and shift it to the start of the track
let loopSegment = originalContents.filter { loopStart ..< (loopStart + loopLength) ~= $0.position.beats }
let shiftedSegment = loopSegment.map { AKMIDINoteData(noteNumber: $0.noteNumber,
                                                      velocity: $0.velocity,
                                                      channel: $0.channel,
                                                      duration: $0.duration,
                                                      position: AKDuration(beats: $0.position.beats - loopStart))
}
// replace the track contents with the loop, and assert the looping behaviour
track.replaceMIDINoteData(with: shiftedSegment)
seq.setLength(AKDuration(beats: loopLength))
seq.enableLooping()
// and to get back to the original:
track.replaceMIDINoteData(with: originalContents)
seq.setLength(AKDuration(beats: originalLength))
seq.enableLooping()
If you wanted the looping section to repeat for the length of the original sequence, you could use the shiftedSegment as a template and build a 16 beat sequence from it, along the lines sketched below.
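For example, a rough, untested sketch of that, reusing the names from the code above:
var tiledContents = [AKMIDINoteData]()
for repetition in 0 ..< Int(originalLength / loopLength) {
    let offset = Double(repetition) * loopLength
    // shift each copy of the loop segment to its own 4 beat slot
    tiledContents += shiftedSegment.map {
        AKMIDINoteData(noteNumber: $0.noteNumber,
                       velocity: $0.velocity,
                       channel: $0.channel,
                       duration: $0.duration,
                       position: AKDuration(beats: $0.position.beats + offset))
    }
}
track.replaceMIDINoteData(with: tiledContents)
seq.setLength(AKDuration(beats: originalLength))
seq.enableLooping()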

Related

AVAudioPlayerNode - Mix between schedule buffers and segments

I'm writing an application where I need to play parts of audio files. Each audio file contains audio data for a separate track.
These parts are sections with a begin time and an end time, and I'm trying to play those parts in the order I choose.
So for example, imagine I have 4 sections:
A - B - C - D
and I activate B and D, I want to play B, then D, then B again, then D, etc.
To make smooth 'jumps' in playback, I think it's important to fade in/out the section start/end buffers.
So, I have a basic AVAudioEngine setup, with AVAudioPlayerNode, and a mixer.
For each audio section, I cache some information:
a buffer for the first samples in the section (which I fade in manually)
a tuple with the AVAudioFramePosition and AVAudioFrameCount of a middle segment
a buffer for the last samples in the audio section (which I fade out manually)
Now, when I schedule a section for playing, I tell the AVAudioPlayerNode to:
schedule the start buffer (scheduleBuffer(_:completionHandler:), no options)
schedule the middle segment (scheduleSegment(_:startingFrame:frameCount:at:completionHandler:))
finally schedule the end buffer (scheduleBuffer(_:completionHandler:), no options)
all with a "time" of nil, roughly as sketched below.
The problem here is that I can hear clicks and crackling sounds at audio section boundaries, and I can't see what I'm doing wrong.
My first suspect was the fades I apply manually (basically multiplying sample values by a volume factor; a simplified version is sketched below), but I get the same result without doing them.
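The fade itself is nothing fancy, roughly this (first float channel only):
// linear fade-in over the cached start buffer
if let samples = startBuffer.floatChannelData?[0] {
    let frameCount = Int(startBuffer.frameLength)
    for frame in 0 ..< frameCount {
        samples[frame] *= Float(frame) / Float(frameCount)
    }
}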
I thought I might not be scheduling in time, but scheduling sections in advance (A - B - C beforehand, for example) gives the same result.
I then tried different frame position computations using the audio format settings, with the same result.
So I'm out of ideas here; perhaps I didn't get the scheduling mechanism right.
Can anyone confirm that I can mix scheduling buffers and segments in AVAudioPlayerNode? Or should I schedule only buffers, or only segments?
I can confirm that scheduling only segments works, playback is perfectly fine.
A little context on how I cache information for audio sections...
In the code below, file is an AVAudioFile loaded from a URL on disk, and begin and end are TimeInterval values representing the start/end of my audio section.
let format = file.processingFormat
let startBufferFrameCount: AVAudioFrameCount = 4096
let endBufferFrameCount: AVAudioFrameCount = 4096
// frame positions of the section boundaries in the file
let audioSectionStartFrame = framePosition(at: begin, format: format)
let audioSectionEndFrame = framePosition(at: end, format: format)
// the middle segment sits between the start and end buffers
let segmentStartFrame = audioSectionStartFrame + AVAudioFramePosition(startBufferFrameCount)
let segmentEndFrame = audioSectionEndFrame - AVAudioFramePosition(endBufferFrameCount)
startBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: startBufferFrameCount)
endBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: endBufferFrameCount)
// read the first and last samples of the section into the cached buffers
file.framePosition = audioSectionStartFrame
try file.read(into: startBuffer)
file.framePosition = segmentEndFrame
try file.read(into: endBuffer)
middleSegment = (segmentStartFrame, AVAudioFrameCount(segmentEndFrame - segmentStartFrame))
frameCount = AVAudioFrameCount(audioSectionEndFrame - audioSectionStartFrame)
Also, the framePosition(at:format:) multiplies the TimeInterval value by the sample rate of the AVAudioFormat passed in.
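In other words, it's essentially:
// convert a time in seconds to a frame position for the given format
func framePosition(at time: TimeInterval, format: AVAudioFormat) -> AVAudioFramePosition {
    return AVAudioFramePosition(time * format.sampleRate)
}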
I cache this information for every audio section, but I hear clicks at section boundaries, no matter whether I schedule them in advance or not.
I also tried not mixing buffers and segments when scheduling, but it doesn't change anything, so I'm starting to think my frame computations are wrong.

Simple Babymonitor with Bass.DLL

I am trying to program a simple baby monitor for Windows (personal use).
The baby monitor should just detect the dB level of the microphone and trigger at a certain volume.
After some research, I found the Bass.dll library and came across its function BASS_ChannelGetLevel, which is great but seems to have limitations and doesn't fit my needs (the peak is returned as a DWORD value).
In the examples I found a livespec example which is "almost" what I need. The example uses BASS_ChannelGetData, but I don't quite know how to handle the returned array...
I want to keep it as simple as possible: Detect the volume from the microphone as dB or any other value (e.g. value 0-MAXINT).
How can this be done with the Bass.dll library?
BASS_ChannelGetLevel returns a value that is capped at 0 dB (a return value of 32768 in this case). If you adjust your source level (lower the microphone level in the sound card settings), then it will work just fine.
Alternatively, if you want an uncapped value, use the BASS_ChannelGetLevelEx function instead: it returns floating-point levels, where 1 is the maximum (0 dB) value corresponding to BASS_ChannelGetLevel's 32768, but it can exceed 1 to detect sound levels above 0 dB, which may be what you need.
I also suggest monitoring the sound level for a while: trigger only if a certain level persists for at least 2-3 seconds (this way you will exclude false alarms).
Here is how you obtain the dB level given an input stream handle (streamHandle):
var peak = (double)Bass.BASS_ChannelGetLevel(streamHandle);
var decibels = 20 * Math.Log10(peak / Int32.MaxValue);
Alternatively, you can use the following to get the RMS (average) level. To get the RMS value, you have to pass a sample length into BASS_ChannelGetLevel. I'm using 20 milliseconds here, but you can play with the value to see what works best for your needs.
var decibels = 0.0;
var channelCount = 2; // assuming two channels
var sampleLengthMS = 20f;
var rmsLevels = new float[channelCount];
var rmsObtained = Bass.BASS_ChannelGetLevel(streamHandle, rmsLevels, sampleLengthMS / 1000f, BASSLevel.BASS_LEVEL_RMS);
if (rmsObtained)
    decibels = 20 * Math.Log10(rmsLevels[0]); // using the first channel (index 0), but you can get both if needed
else
    Console.WriteLine(Bass.BASS_ErrorGetCode());
Hope this helps.

Xcode iOS: how to find position of one audio file (snippet) inside another?

I have audio files with different durations. They have common content and unique content. E.g. two files, 70 seconds each; the last 10 seconds of the first file are the same as the first 10 seconds of the second file. How can I find the exact position where the common content starts (e.g. 60.0 in the first file)?
It sounds a little bit messy; hopefully the following image can help: https://drive.google.com/file/d/0BzBE2Kfw8uQoUWNTN1RXOEtLVEk/view?usp=sharing
So, I'm looking for the red mark - the common content starts at 60.0 sec of the first file.
The problem is that I have files with different durations. Sometimes both are 70 seconds long; sometimes one file is 70 seconds and the other is 80 seconds, etc. Most likely they have 60.0 seconds of unique content, but I'm not sure (it could be 59.9 seconds of unique content, etc.).
Thus, I assume I need to take a short snippet from the first 10 seconds of the second file and find it in the first file.
For example, an output like "2.5 sec of the second file = 62.5 of the first file" works for me as well.
THE MAIN GOAL IS TO PLAY FILE AFTER FILE GAPLESSLY. If I get the values, I'll be able to do this. Sometimes the values can be 2.5 = 63.7; that's why I need an exact match.
Can anybody help with the code or at least some information of how to compare two snippets of audio content? Thanks in advance!
Wow, that is quite a problem to solve. And I must confess that I've not done anything exactly like this, nor do I have any code-based suggestions.
All I will say is that if I were looking to solve this problem, I would try to save the audio files in some kind of uncompressed, fixed-size format (as in a known number of bytes per second).
Then you could take a section of one file and byte-match it against the other; that tells you how many bytes in that snippet occurs. Then, knowing the bytes per millisecond (a sort of frame size), you can work out the exact time position.
It's a bit harebrained, but I've used that technique with images before, and at least audio is linear!
Here is an approximate example of how I would go about doing the comparison of a sample within a sound file.
- (int)positionOf:(NSData*)sample inData:(NSData*)soundfile {
    // the block size has to be big enough to find something genuinely unique but small enough to ensure it is still fast.
    int blockSize = 128;
    int position = 0;
    int returnPosition = INT32_MAX;
    // check to see if the block size exceeds the sample or data file size
    if (soundfile.length < blockSize || sample.length < blockSize) {
        return returnPosition;
    }
    // create a byte array of the sample, ready to use to compare with the shifting buffer
    char* sampleByteArray = malloc(sample.length);
    memcpy(sampleByteArray, sample.bytes, sample.length);
    // now loop through the sound file, shifting the window along.
    while (position < (soundfile.length - blockSize)) {
        char* window = malloc(blockSize);
        memcpy(window, soundfile.bytes + position, blockSize);
        // check to see if this is a match
        if (!memcmp(sampleByteArray, window, blockSize)) {
            // these are the same, now to check if the whole sample is the same
            if ((position + sample.length) > soundfile.length) {
                // the sample won't fit in the remaining soundfile, so it can't be this!
                free(window);
                break;
            }
            if (!memcmp(sampleByteArray, soundfile.bytes + position, sample.length)) {
                // this is an entire match, position marks the start in bytes of the sample.
                free(window);
                returnPosition = position;
                break;
            }
        }
        free(window);
        position++;
    }
    free(sampleByteArray);
    return returnPosition;
}
It compiles; I didn't have time to set up the scenario to check your exact case, but I'm quite confident this may help.
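Once you have the byte position, turning it into a time offset only needs the bytes-per-second of whatever uncompressed format you saved to. A rough sketch (Swift, with made-up format values; use your actual sample rate, channel count and sample size):
// hypothetical format values for illustration: 16-bit stereo PCM at 44.1 kHz
let sampleRate = 44_100.0
let channelCount = 2.0
let bytesPerSample = 2.0
let bytesPerSecond = sampleRate * channelCount * bytesPerSample
// byte position returned by the matching method above (placeholder value)
let matchPositionBytes = 1_234_567.0
let matchPositionSeconds = matchPositionBytes / bytesPerSecond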

naughty notifications with long messages

Is there a way to bypass the character limit of naughty's notification messages? I don't know what that limit is, but they seem to be getting truncated after a certain length. I'm using orglendar and it summarizes my Google Calendar events in one message/pane, which is very convenient.
The problem is that if I have more than 4 or so events, the message tends to get too long and gets cut off. The resulting message ends up being somewhat unreadable due to incorrect parsing caused by truncated </span> tags.
Alternatively, since notification messages probably weren't designed for long lists like this to begin with, is there an alternative widget that I could modify orglendar to use instead?
Looking at the orglendar code, it seems you could simply tweak the calendar_width variable at the top of the script to a larger value.
Alternatively, the naughty documentation says that you can use a nil width to get automatic width. If this is what you want, you can simply change orglendar to reflect this. At the bottom of the orglendar.lua file (line 268), simply comment out or remove the width parameter:
calendar = naughty.notify({ title = header,
                            text = cal_text,
                            timeout = 0, hover_timeout = 0.5,
                            -- width = calendar_width * char_width,
                            screen = mouse.screen,
                          })

Determine consecutive video clips

I have a long video stream, but unfortunately it's in the form of 1000 randomly-named 15-second clips. I'd like to reconstruct the original video based on some measure of "similarity" of two such 15 s clips, something answering the question "does the activity in clip 2 seem like an extension of clip 1?". There are small gaps between clips --- a few hundred milliseconds or so each. I can also manually fix up the results if they're sufficiently good, so the results needn't be perfect.
A very simplistic approach can be:
(a) Create an automated process to extract the first and last frame of each video-clip in a known image format (e.g. JPG) and name them according to video-clip names, e.g. if you have the video clips:
clipA.avi, clipB.avi, clipC.avi
you may create the following frame-images:
clipA_first.jpg, clipA_last.jpg, clipB_first.jpg, clipB_last.jpg, clipC_first.jpg, clipC_last.jpg
(b) The sorting "algorithm":
1. Create a 'Clips' list of Clip-Records containing each:
(a) clip-name (string)
(b) prev-clip-name (string)
(c) prev-clip-diff (float)
(d) next-clip-name (string)
(e) next-clip-diff (float)
2. Apply the following processing:
for Each ClipX having ClipX.next-clip-name == "" do:
{
ClipX.next-clip-diff = <a big enough number>;
for Each ClipY having ClipY.prev-clip-name == "" do:
{
float ImageDif = ImageDif(ClipX.last-frame.jpg, ClipY.first_frame.jpg);
if (ImageDif < ClipX.next-clip-diff)
{
ClipX.next-clip-name = ClipY.clip-name;
ClipX.next-clip-diff = ImageDif;
}
}
Clips[ClipX.next-clip-name].prev-clip-name = ClipX.clip-name;
Clips[ClipX.next-clip-name].prev-clip-diff = ClipX.next-clip-diff;
}
3. Scan the Clips list to find the record(s) with no <prev-clip-name> or
(if all records have a <prev-clip-name>) find the record with the max <prev-clip-diff>.
This is a good candidate (or candidates) to be the first clip in the sequence.
4. Begin from the clip(s) found in step (3) and rename the clip files by adding
a 5-digit number (00001, 00002, etc.) at the beginning of each filename, going
from aClip to aClip.next-clip-name and removing each clip from the list.
5. Repeat steps 3,4 until there are no clips in the list.
6. Voila! You have your sorted clips list in the form of sorted video filenames!
...or you may end up with more than one sorted list (if the
'time gaps' between your video clips are large enough).
Very simplistic... but I think it can be effective...
PS1: Regarding the ImageDif() function: You can create a new DifImage, which is the difference of the images ClipX.last-frame.jpg and ClipY.first_frame.jpg, and then sum all the pixels of DifImage into a single floating-point ImageDif value. You can also optimize the process to abort the difference (or the summing) if the sum gets bigger than some limit: you are actually interested in small differences. An ImageDif value larger than an (experimental) limit means that the two images differ so much that the two clips cannot be next to each other.
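A rough sketch of such an ImageDif(), assuming both frames have been decoded to equal-sized raw pixel byte buffers (Swift, just for illustration):
// sum of absolute pixel differences, aborting early once the limit is exceeded
func imageDif(_ frameA: [UInt8], _ frameB: [UInt8], abortAbove limit: Double) -> Double {
    var total = 0.0
    for i in 0 ..< min(frameA.count, frameB.count) {
        total += abs(Double(frameA[i]) - Double(frameB[i]))
        if total > limit {
            // already too different: these clips cannot be adjacent
            return Double.greatestFiniteMagnitude
        }
    }
    return total
}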
PS2: As described, the matching step compares each clip's last frame against the first frames of every not-yet-matched clip, so its complexity is roughly O(n²): for 1000 video clips that is up to about 500,000 image comparisons in the worst case, although the early-abort optimization in ImageDif() keeps most comparisons cheap (and you can also allow the algorithm to not find a match for some clips).
