I'm trying to make a simple frequency analyzer VST plugin using Tobybear's VST Template for Delphi.
The problem I'm having is that I can't seem to find any documentation or information about how to get something like an array of values representing the different frequencies from a chunk of audio data received from the host.
Does anybody have a clue on how to do this?
Also, my VST host keeps crashing whenever I try to use the DelphiASIOVst library, which is another library for making custom VSTs.
Thanks!
The Tobybear VST Template is obsolete (VST 2.3). Rather use the DAV project on SourceForge, as suggested by Shannon (it produces VST 2.4 plugins).
About the analysis, it's quite easy: you basically have to run an FFT on the signal (buffer the input and, once 2^n samples have accumulated, perform the FFT), then compute the hypotenuse of each real/imaginary pair to get the approximate amplitude of each band, and plot that on a graph. In combination with an envelope follower and some GUI programming skills you'll get something like Voxengo SPAN.
VST plugins receive audio signals as time domain signals. The audio signal data doesn't contain frequency information (which is why you can't find any documentation).
To implement a frequency analyzer you'll need to transform the received time domain signal into a frequency domain signal. Performing a Fast Fourier Transform (FFT) is the standard way to do this.
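As a rough illustration of what that looks like (outside of any VST framework, using Python/NumPy; the buffer size and windowing here are arbitrary choices, not something the VST SDK dictates):

    import numpy as np

    def band_magnitudes(samples, sample_rate, fft_size=1024):
        """Turn one buffered block of time-domain samples into per-bin magnitudes."""
        block = samples[:fft_size] * np.hanning(fft_size)   # window to reduce spectral leakage
        spectrum = np.fft.rfft(block)                        # complex bins: real + imaginary parts
        magnitudes = np.abs(spectrum)                        # hypotenuse of (re, im) for each bin
        freqs = np.fft.rfftfreq(fft_size, d=1.0 / sample_rate)
        return freqs, magnitudes

    # Example: a 440 Hz sine should produce a peak near 440 Hz.
    sr = 44100
    t = np.arange(4096) / sr
    freqs, mags = band_magnitudes(np.sin(2 * np.pi * 440 * t), sr)
    print(freqs[np.argmax(mags)])

In a plugin you would accumulate incoming process-block samples into such a buffer and run this once per filled buffer, then feed the magnitudes to your display.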
I wonder if ClickHouse is a possible solution for the following task.
I'm collecting time-series data (for example, pulse measurements of people).
I have different types of thresholds (for example, min and max pulse values based on age).
Once the pulse for an individual person reaches the appropriate threshold, I want to trigger an external service.
In other words, what I'm looking for beyond a regular time-series store is:
the ability to set multiple thresholds
automatic detection of values that cross a threshold
emitting some kind of event to a 3rd-party service
Any other tools suggestions are appreciated. Thanks in advance.
ClickHouse has partial support for this task.
You can try to write your own code (Python, Go, or anything else) as an external process
which can use LIVE VIEW tables and WATCH queries for trigger/event detection; see the article which describes these features:
https://www.altinity.com/blog/2019/11/13/making-data-come-to-life-with-clickhouse-live-view-tables
This code should then emit an event to the 3rd-party system.
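As a very rough sketch of that external process, here is a Python loop that simply polls ClickHouse over its HTTP interface (port 8123 by default) instead of using LIVE VIEW/WATCH; the pulse table, its columns, the fixed threshold and the webhook URL are made-up names for illustration only:

    import time
    import requests  # ClickHouse's HTTP interface listens on port 8123 by default

    CLICKHOUSE = "http://localhost:8123"
    WEBHOOK = "https://example.com/alert"        # hypothetical 3rd-party endpoint
    QUERY = """
    SELECT person_id, max(value) AS max_pulse
    FROM pulse
    WHERE ts > now() - INTERVAL 10 SECOND
    GROUP BY person_id
    HAVING max_pulse > 180
    FORMAT JSONEachRow
    """

    while True:
        resp = requests.get(CLICKHOUSE, params={"query": QUERY})
        resp.raise_for_status()
        for line in resp.text.splitlines():      # one JSON object per person over threshold
            requests.post(WEBHOOK, data=line)    # emit the event to the 3rd-party service
        time.sleep(10)

In a real setup the threshold would come from a thresholds table joined on age rather than a hard-coded constant, and the WATCH-based approach from the article avoids the polling interval entirely.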
Question 1: Is it feasible? (As far as I know [from what I found on Google], it is feasible. However, I need a more affirmative answer.)
Question 2: Say I have a device that generates a square wave, how can I get the message?
As a beginner, I want to know which class I should pay attention to.
Thanks, any info will be appreciated.
This is a great question. I've put together a C library which does this, so you might be able to adapt it for iOS.
Library: https://github.com/quiet/quiet
Live Demo: https://quiet.github.io/quiet-js/lab.html
With a sound port, you want to avoid square waves. Those make inefficient use of the range of amplitudes you have available, and they're not very spectrally efficient. The most basic modulation people typically use here is frequency shift keying. My library offers that (as gaussian minimum shift keying) but also more advanced modes like phase shift keying and quadrature amplitude shift keying. I've managed to reach transfer speeds of 64kbps using this library.
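To give a feel for the simplest case, here is a toy binary FSK modulator/demodulator in Python/NumPy. The tone frequencies and baud rate are arbitrary, and this is nothing like the GMSK implementation in the library above, just the basic idea of mapping bits to tones:

    import numpy as np

    SAMPLE_RATE = 44100
    BAUD = 100                     # symbols per second (deliberately slow for clarity)
    F0, F1 = 1000.0, 2000.0        # tone for bit 0 and tone for bit 1

    def fsk_modulate(bits):
        """Map each bit to a burst of one of two tones (binary FSK)."""
        samples_per_bit = SAMPLE_RATE // BAUD
        t = np.arange(samples_per_bit) / SAMPLE_RATE
        return np.concatenate([np.sin(2 * np.pi * (F1 if b else F0) * t) for b in bits])

    def fsk_demodulate(signal):
        """Decide each bit by comparing spectral energy near the two tone frequencies."""
        samples_per_bit = SAMPLE_RATE // BAUD
        bits = []
        for i in range(0, len(signal), samples_per_bit):
            chunk = signal[i:i + samples_per_bit]
            spectrum = np.abs(np.fft.rfft(chunk))
            freqs = np.fft.rfftfreq(len(chunk), d=1.0 / SAMPLE_RATE)
            e0 = spectrum[np.argmin(np.abs(freqs - F0))]
            e1 = spectrum[np.argmin(np.abs(freqs - F1))]
            bits.append(1 if e1 > e0 else 0)
        return bits

    print(fsk_demodulate(fsk_modulate([1, 0, 1, 1, 0])))   # -> [1, 0, 1, 1, 0]

A real acoustic modem also needs synchronisation, error correction and framing, which is exactly what the library above provides on top of the modulation.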
I'm looking to build a really simple EQ that plays a filtered version of a song in the user's library. It would essentially be a parametric EQ: I'd specify the bandwidth, cut/boost (in dB), and centre frequency, and then be returned some object that I could play just like my original MPMediaItem.
For MPMediaItems, I've generally used AVAudioPlayer in the past with great success. For audio generation, I've used AudioUnits. In MATLAB, I'd probably just create custom filters to do this. I'm at a bit of a loss for how to approach this in iOS! Any pointers would be terrific. Thanks for reading
iOS ships with a fairly sizeable number of audio units. One of kAudioUnitSubType_ParametricEQ, kAudioUnitSubType_NBandEQ or kAudioUnitSubType_BandPassFilter is probably what you want depending on whether you want to control Q as well as Fc and Gain.
I suspect you will have to forego using higher-level components such as AVAudioPlayer to make use of it.
The relevant iOS audio unit reference can be found here
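If it helps to see what one such EQ band does with exactly the parameters you listed (centre frequency, gain in dB, bandwidth/Q), here is a small offline sketch in Python using the widely documented RBJ Audio EQ Cookbook peaking-filter formulas. This is not iOS code and is not how kAudioUnitSubType_ParametricEQ is implemented internally, just the same kind of filter:

    import numpy as np
    from scipy.signal import lfilter

    def peaking_eq(x, fs, f0, gain_db, q):
        """Apply one parametric EQ band (RBJ cookbook peaking biquad) to signal x."""
        a_lin = 10 ** (gain_db / 40.0)
        w0 = 2 * np.pi * f0 / fs
        alpha = np.sin(w0) / (2 * q)
        b = np.array([1 + alpha * a_lin, -2 * np.cos(w0), 1 - alpha * a_lin])
        a = np.array([1 + alpha / a_lin, -2 * np.cos(w0), 1 - alpha / a_lin])
        return lfilter(b / a[0], a / a[0], x)   # normalise by a0 and filter

    # Example: boost 6 dB around 1 kHz on one second of white noise.
    fs = 44100
    noise = np.random.randn(fs)
    boosted = peaking_eq(noise, fs, f0=1000.0, gain_db=6.0, q=1.0)

On iOS the equivalent coefficients/parameters are set on the audio unit via its parameter API rather than computed by hand.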
I am interested in making a simple digital synthesizer to be implemented on an 8-bit MCU. I would like to use wavetables for an accurate representation of the sound. Standard wavetable schemes seem to either keep a table for each of several frequencies, or keep a single-cycle table that is read with fractional increments, with the missing data interpolated by the program to create different frequencies.
Would it be possible to create a single table for a given waveform, likely at a low frequency, and change the rate at which the program reads the table to generate different frequencies, which would then be processed further? My MCU (a free one, no budget) is rather slow, so I don't have space for lots of wavetables nor for large amounts of processing, and I am trying to skimp where I can. Has anyone seen this implementation?
You should consider using a single table with a phase accumulator and linear interpolation. See this question on DSP.SE for many useful suggestions.
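Roughly, the idea looks like this (a Python sketch of a phase accumulator with linear interpolation; on the MCU you would use fixed-point integers instead of floats, but the structure is the same, and the table size and frequencies here are arbitrary):

    import numpy as np

    TABLE_SIZE = 256
    # One cycle of the waveform stored once; any pitch comes from how fast we step through it.
    wavetable = np.sin(2 * np.pi * np.arange(TABLE_SIZE) / TABLE_SIZE)

    def render(freq, sample_rate, n_samples):
        """Read the single table at a variable rate, interpolating between adjacent entries."""
        phase = 0.0
        step = freq * TABLE_SIZE / sample_rate   # table entries to advance per output sample
        out = np.empty(n_samples)
        for n in range(n_samples):
            i = int(phase)
            frac = phase - i
            out[n] = (1.0 - frac) * wavetable[i] + frac * wavetable[(i + 1) % TABLE_SIZE]
            phase = (phase + step) % TABLE_SIZE
        return out

    a4 = render(440.0, 44100, 44100)   # one second of a 440 Hz tone from the same table

The per-sample cost is one multiply-accumulate pair and a wrap, which is why this approach is popular on small MCUs.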
I plan to write a conversation analysis software, which will recognize the individual speakers, their pitch and intensity. Pitch and intensity are somewhat straightforward (pitch via autocorrelation).
How would I go about recognizing individual speakers, so I can record his/her features? Will storing some heuristics for each speaker's frequencies be enough? I can assume that only one person speaks at a time (strictly non-overlapping). I can also assume that for training, each speaker can record a minute's worth of data before actual analysis.
Pitch and intensity on their own tell you nothing. You really need to analyse how pitch varies. In order to identify different speakers you need to transform the speech audio into some kind of feature space, and then make comparisons against your database of speakers in this feature space. The general term that you might want to Google for is prosody - see e.g. http://en.wikipedia.org/wiki/Prosody_(linguistics). While you're Googling you might also want to read up on speaker identification aka speaker recognition, see e.g. http://en.wikipedia.org/wiki/Speaker_identification
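To make the idea concrete, here is a deliberately crude Python/NumPy sketch of the "feature space plus nearest enrolled speaker" scheme: it averages log power spectra as a fixed-size fingerprint per speaker and picks the closest one by Euclidean distance. A real system would use better features (MFCCs, prosodic statistics) and a proper statistical model, but the overall shape is the same:

    import numpy as np

    FRAME = 1024

    def features(signal):
        """Average log power spectrum over all frames: a crude fixed-size voice fingerprint."""
        frames = signal[: len(signal) // FRAME * FRAME].reshape(-1, FRAME)
        spectra = np.abs(np.fft.rfft(frames * np.hanning(FRAME), axis=1)) ** 2
        return np.log(spectra + 1e-12).mean(axis=0)

    def enroll(training_clips):
        """Build the speaker database from ~1 minute of speech per speaker."""
        return {name: features(clip) for name, clip in training_clips.items()}

    def identify(database, segment):
        """Pick the enrolled speaker whose fingerprint is closest to this segment's."""
        f = features(segment)
        return min(database, key=lambda name: np.linalg.norm(database[name] - f))

Since you can assume non-overlapping speech, you would run identify() on each speech segment and then attach your pitch and intensity measurements to whichever speaker it returns.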
If you are still working on this... are you using speech-recognition on the sound input? Because Microsoft SAPI for example provides the application with a rich API for digging into the speech sound wave, which could make the speaker-recognition problem more tractable. I think you can get phoneme positions within the waveform. That would let you do power-spectrum analysis of vowels, for example, which could be used to generate features to distinguish speakers. (Before anybody starts muttering about pitch and volume, keep in mind that the formant curves come from vocal-tract shape and are fairly independent of pitch, which is vocal-cord frequency, and the relative position and relative amplitude of formants are (relatively!) independent of overall volume.) Phoneme duration in-context might also be a useful feature. Energy distribution during 'n' sounds could provide a 'nasality' feature. And so on. Just a thought. I expect to be working in this area myself.