Creating a phase vocoder with the Web Audio API using AnalyserNode and PeriodicWave

I was wondering if it would be possible to poll the AnalyserNode from the Web Audio API and use it to construct a PeriodicWave that is synthesized via an OscillatorNode?
My intuition is that something about the difference in amplitudes between analyser frames could help calculate the right phase for a PeriodicWave, but I'm not sure how to go about implementing it. Any help on the right algorithm to use would be appreciated!

As luck would have it, I was working on a similar project just a few weeks ago. I put together a JSFiddle to explore the idea of reconstructing a phase-randomized version of a waveform using frequency data from an AnalyserNode. You can find that experiment here:
https://jsfiddle.net/mattdiamond/w4u7x8zk/
Here's the code that takes in the frequency data output from an AnalyserNode and generates a PeriodicWave:
function generatePeriodicWave(freqData) {
  const real = [];
  const imag = [];
  freqData.forEach((x, i) => {
    const amp = fromDecibels(x);
    const phase = getRandomPhase();
    real.push(amp * Math.cos(phase));
    imag.push(amp * Math.sin(phase));
  });
  return context.createPeriodicWave(real, imag);
}

function fromDecibels(x) {
  return 10 ** (x / 20);
}

function getRandomPhase() {
  return Math.random() * 2 * Math.PI - Math.PI;
}
Since the AnalyserNode converts the FFT amplitude values to decibels, we need to recover those original values first (which we do by simply using the inverse of the formula that was used to convert them to decibels). We also need to provide a phase for each frequency, which we select at random from the range -π to π.
Now that we have an amplitude and phase, we construct a complex number by multiplying the amplitude by the cosine and sine of the phase. This is because the amplitude and phase correspond to a polar coordinate, and createPeriodicWave expects a list of real and imaginary numbers corresponding to Cartesian coordinates in the complex plane. (See here for more information on the mathematics behind this conversion.)
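To spell the conversion out as math (this just restates what the code above already does), each analyser bin's decibel value x is turned back into an amplitude A, and that amplitude plus a random phase φ becomes one complex Fourier coefficient:

A = 10^{x/20}, \qquad \mathrm{real}[k] = A\cos\varphi_k, \qquad \mathrm{imag}[k] = A\sin\varphi_k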
Once we've generated the PeriodicWave, all that's left to do is load it into an OscillatorNode, set the desired frequency, and start the oscillator. You'll notice that the default frequency is set to context.sampleRate / FFT_SIZE (you can ignore the toFixed, that was just for the sake of the UI). This causes the oscillator to play the wave at the same rate as the original samples. Increasing or decreasing the frequency from this value will pitch-shift the audio up or down, respectively.
You'll also notice that I chose 2^15 as the FFT size, which is the maximum size that the AnalyserNode allows. For my purposes -- creating interesting looped drones -- a larger FFT results in a more interesting and less "loopy" drone. (A while back I created a webpage that allowed users to generate drones from much larger FFTs... that experiment utilized a third-party FFT library instead of the AnalyserNode.) I'm not sure if this is the right FFT size for your purposes, but it's something to consider.
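For a concrete sense of scale (my own example; it assumes a 44.1 kHz context, which the fiddle doesn't guarantee), the default oscillator frequency works out to

f_{\text{osc}} = \frac{\text{sampleRate}}{\text{FFT\_SIZE}} = \frac{44100}{2^{15}} \approx 1.35\ \text{Hz}

so one cycle of the PeriodicWave lasts roughly as long as the 32768-sample analysis window did.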
Anyway, I think that covers the core of the algorithm. Hope this helps! (And feel free to ask more questions in the comments if anything's unclear.)

Related

Algorithm for detecting the lowest level of EMG activity?

Having only data from an EMG sensor, how can I determine whether a person is in the REM phase? In other words, I need to detect the lowest level of activity from the EMG sensor, or at least register the change of phase...
In more detail: I'm going to make a REM phase detector using an EMG (electromyography) sensor. There is already a sketch of the Android application on GitHub; if you are interested, I can post a link, although there is still work to be done.
The device should work based on the fact that different brain states (wakefulness, slow-wave sleep, REM sleep) produce different levels of activity at the sensor. In REM sleep, this activity is minimal.
The Bluetooth sensor is attached to the body before going to bed and the Android program is launched; it communicates with the sensor and sends the data it reads to a connected TCP client over WiFi. The TCP client is a Python script running on a nettop. It receives the data and, by design, should determine in real time whether the current activity level is the minimum for the entire observation period. If so, the script tells the server (the Android program) to trigger a cue: a vibration on the phone or a fitness bracelet, an audio sample played through a headphone, light flashes, a slight electric shock =), etc.
Because only one EMG sensor is used, I accept that it will not be possible to catch REM sleep phases with 100% accuracy, but that is not necessary; 80% accuracy would already be good. For starters, even an algorithm for detecting a change in the current activity level would do: it would give me something to experiment with and build on.
The problem is the algorithm. I would not like to use fixed thresholds, because thresholds will differ between people, and even for the same person at different times and in different states. So I will be glad to hear ideas and tips from your side.
I found an interesting article, "A Self-adaptive Threshold Method for Automatic Sleep Stage Classification Using EOG and EMG" (https://www.researchgate.net/publication/281722966_A_Self-adaptive_Threshold_Method_for_Automatic_Sleep_Stage_Classification_Using_EOG_and_EMG), but a few points remain unclear. First, how is the energy calculated (Step 1-4: Energy)? Here is my attempt:
from struct import unpack
import numpy as np
import pandas as pd

# f is the open binary file of raw 16-bit samples; butter_bandpass_filter,
# lowcut, highcut, SAMPLE_RATE and epoche_seconds are defined elsewhere.
i = 0
while True:
    d = f.read(epoche_seconds * SAMPLE_RATE * 2)
    if not d:
        break
    print('epoche#{}'.format(i))
    fmt = '<{}H'.format(len(d) // 2)
    t = unpack(fmt, d)
    d = np.array(list(t))
    d = d / (1 << 14)  # 14 bits per sample
    df = pd.DataFrame({'signal': d, 'id': [x * 2 for x in range(len(d))]})
    df = df.set_index('id')
    d = df.signal
    # signal prepare:
    d = butter_bandpass_filter(d, lowcut, highcut, SAMPLE_RATE, order=2)
    # extract signal features:
    future_iv = np.mean(np.absolute(d))
    print('iv={}'.format(future_iv))
    future_var = np.var(d)
    print('var={}'.format(future_var))
    ws = 6
    df = pd.DataFrame({'signal': np.absolute(d)})
    future_e = df.rolling(ws).sum()[ws - 1:].max().signal
    print('E={}'.format(future_e))
    i += 1
Is this correct?
Secondly, can someone elaborate on this passage (Step 2-1 of the paper)?
"Step 2-1: normalization process. EMG and EOG feature vectors were processed with a normalization function before being involved in the classification steps. Feature vectors were normalized as in function (4) [not reproduced here], where Xmax and Xmin were obtained by the following steps: first, sort the x(i) vector, and set the window length N to 50; then compare 2Nk (with k taking values from 10 down to 1) with the length of x(i); if 2Nk is bigger than the latter, reduce k until 2Nk is lower than the length of x(i). If the length of x(i) is greater than 2Nk, compute the mean of the 50 largest values as Xmax, and the average of the 50 smallest values as Xmin."
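For what it's worth, here is my reading of that passage as a Python sketch. Equation (4) itself is not reproduced above, so I am assuming it is an ordinary min-max normalization; treat this as a guess at the procedure, not the paper's actual code:
import numpy as np

def self_adaptive_minmax(x, n=50):
    # My interpretation of Step 2-1: choose Xmax/Xmin from the sorted vector, then min-max scale.
    x = np.asarray(x, dtype=float)
    x_sorted = np.sort(x)
    k = 10
    while k > 1 and 2 * n * k > len(x):        # shrink k until 2*N*k fits inside x
        k -= 1
    if len(x) > 2 * n * k:
        x_max = x_sorted[-n:].mean()           # mean of the n largest values
        x_min = x_sorted[:n].mean()            # mean of the n smallest values
    else:
        x_max, x_min = x_sorted[-1], x_sorted[0]   # fallback; the paper does not cover this case
    return (x - x_min) / (x_max - x_min)       # assuming equation (4) is min-max scaling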
If you are looking for REM sleep, a spectrogram will show you intensity and frequency information in a single graph/chart. People of a certain age refer to spectrograms as TFFT; the wiki link is https://en.wikipedia.org/wiki/Spectrogram
Can you input the data into a spectrogram display/plot? Use a large FFT window (you probably have hours of data) with a small overlap (15%). I would recommend starting with an FFT window around 1 second.
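A minimal sketch of that suggestion using SciPy/Matplotlib (my own; the 1000 Hz sample rate and the file name are placeholders for whatever your sensor actually delivers):
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

SAMPLE_RATE = 1000                                # hypothetical EMG sample rate in Hz
emg = np.loadtxt("emg_night.txt")                 # hypothetical recording, one sample per line

nperseg = SAMPLE_RATE                             # ~1 second FFT window, as suggested
noverlap = int(nperseg * 0.15)                    # ~15% overlap
f, t, Sxx = signal.spectrogram(emg, fs=SAMPLE_RATE, nperseg=nperseg, noverlap=noverlap)

plt.pcolormesh(t / 3600.0, f, 10 * np.log10(Sxx + 1e-12), shading="auto")
plt.xlabel("time [hours]"); plt.ylabel("frequency [Hz]")
plt.show()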

Trying to do PCA analysis on interest rate swaps data (multivariate time series)

I have a data set with 20 non-overlapping different swap rates (spot1y, 1y1y, 2y1y, 3y1y, 4y1y, 5y2y, 7y3y, 10y2y, 12y3y...) over the past year.
I want to use PCA / multiregression and look at residuals in order to determine which sectors on the curve are cheap/rich. Has anyone had experience with this? I've done PCA but not for time series. I'd ideally like to model something similar to the first figure here but in USD.
https://plus.credit-suisse.com/rpc4/ravDocView?docid=kv66a7
Thanks!
Here are some broad strokes that can help answer your question. Also, that's a neat analysis from CS :)
Let's be pythonistas and use NumPy. You can imagine your dataset as a 20x261 array of floats. The first place to start is creating the array. Suppose you have a CSV file storing the raw data persistently. Then a reasonable first step to load the data would be something as simple as:
import numpy
x = numpy.loadtxt("path/to/my/file", delimiter=",")
The object x is our raw time series matrix, and we verify that x.shape == (20, 261). The next step is to transform this array into its covariance matrix. Whether that has been done on the raw data already or still has to be done, the first step is centering each time series on its mean, like this:
x_centered = x - x.mean(axis=1, keepdims=True)
The purpose of this step is to help simplify any necessary rescaling, and it is a good habit that usually shouldn't be skipped. The call to x.mean uses the parameters axis and keepdims to make sure each row (e.g. the time series for spot1y, ...) is centered on its mean value.
The next steps are to square and scale x to produce a swap rate covariance array. With 2-dimensional arrays like x, there are two ways to square it-- one that leads to a 261x261 array and another that leads to a 20x20 array. It's the second array we are interested in, and the squaring procedure that will work for our purposes is:
x_centered_squared = numpy.matmul(x_centered, x_centered.transpose())
Then, to scale, one can choose between 1/261 or 1/(261-1) depending on the statistical context, which looks like this:
x_covariance = x_centered_squared * (1/261)
The array x_covariance has an entry for how each swap rate changes with itself, and changes with any one of the other swap rates. In linear-algebraic terms, it is a symmetric operator that characterizes the spread of each swap rate.
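As a cross-check (my addition, not part of the original walkthrough, continuing with the arrays defined above), NumPy can produce the same covariance array in one call:
# bias=True divides by N (here 261), matching the 1/261 scaling above
x_covariance_check = numpy.cov(x, bias=True)
print(numpy.allclose(x_covariance_check, x_covariance))   # should print True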
Linear algebra also tells us that this array can be decomposed into its associated eigen-spectrum, with elements in this spectrum being scalar-vector pairs, or eigenvalue-eigenvector pairs. In the analysis you shared, x_covariance's eigenvalues are plotted in exhibit two as percent variance explained. To produce the data for a plot like exhibit two (which you will always want to furnish to the readers of your PCA), you simply divide each eigenvalue by the sum of all of them, then multiply each by 100.0. Due to the convenient properties of x_covariance, a suitable way to compute its spectrum is like this:
vals, vects = numpy.linalg.eig(x_covariance)
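For instance, the "percent variance explained" numbers mentioned above can be produced directly from vals (a small sketch of my own, continuing from the line above; note that numpy.linalg.eig does not sort its output):
explained_pct = 100.0 * vals / vals.sum()          # each eigenvalue as a percent of total variance
explained_pct = numpy.sort(explained_pct)[::-1]    # sort descending so the largest component comes first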
We are now in a position to talk about residuals! Here is their definition (with our namespace): residuals_ij = x_ij − reconstructed_ij; i = 1:20; j = 1:261. Thus for every datum in x, there is a corresponding residual, and to find them, we need to recover the reconstructed_ij array. We can do this column-by-column, operating on each x_i with a change of basis operator to produce each reconstructed_i, each of which can be viewed as coordinates in a proper subspace of the original or raw basis. The analysis describes a modified Gram-Schmidt approach to compute the change of basis operator we need, which ensures this proper subspace's basis is an orthogonal set.
What we are going to do in this approach is take the eigenvectors corresponding to the three largest eigenvalues and transform them into three mutually orthogonal vectors, called u, v and w below (so we don't overwrite the data matrix x). You can research the web for active discussions of the Gram-Schmidt process for all sorts of practical applications, but for simplicity let's follow the analysis by hand:
# numpy.linalg.eig returns eigenvectors as *columns* of vects and does not sort them,
# so first pick the columns belonging to the three largest eigenvalues.
order = numpy.argsort(vals)[::-1]
v1, v2, v3 = vects[:, order[0]], vects[:, order[1]], vects[:, order[2]]

u = v1
uu = numpy.dot(u, u)
v = v2 - (numpy.dot(u, v2) / uu) * u
vv = numpy.dot(v, v)
w = v3 - (numpy.dot(u, v3) / uu) * u - (numpy.dot(v, v3) / vv) * v
It's reasonable to implement normalization before or after this step, which should be informed by the data of course.
Now, since with the raw data we implicitly assumed the standard basis, we need a map between {e1, e2, ..., e20} and {u, v, w}. After normalizing u, v and w to unit length, it is given by
u = u / numpy.sqrt(numpy.dot(u, u))
v = v / numpy.sqrt(numpy.dot(v, v))
w = w / numpy.sqrt(numpy.dot(w, w))
ch_of_basis = numpy.array([u, v, w]).transpose()   # shape (20, 3) with orthonormal columns
This can be used to compute each reconstructed_i by projecting each day's centered measurement onto the subspace, mapping it back, and adding the mean back in:
mean = x.mean(axis=1, keepdims=True)
reconstructed = []
for measurement in x_centered.transpose().tolist():
    coords = numpy.dot(ch_of_basis.transpose(), measurement)   # coordinates in the 3-D subspace
    reconstructed.append(numpy.dot(ch_of_basis, coords))       # back to the 20-D rate space
reconstructed = mean + numpy.array(reconstructed).transpose()
And then you get the residuals by subtraction:
residuals = x - reconstructed
This flow might need further tuning, but it's the gist of how to compute all the residuals. To get that periodic bar plot, take the average of each row in residuals.
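To be concrete about that last sentence (my own sketch, assuming matplotlib is available):
import matplotlib.pyplot as plt
mean_residuals = residuals.mean(axis=1)    # one average residual per swap rate (per row)
plt.bar(range(len(mean_residuals)), mean_residuals)
plt.xlabel("swap rate index"); plt.ylabel("average residual")
plt.show()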

How to use apple's Accelerate framework in swift in order to compute the FFT of a real signal?

How am I supposed to use the Accelerate framework to compute the FFT of a real signal in Swift on iOS?
Available example on the web
Apple’s Accelerate framework seems to provide functions to compute the FFT of a signal efficiently.
Unfortunately, most of the examples available on the Internet, like Swift-FFT-Example and TempiFFT, crash if tested extensively and call the Objective C API.
The Apple documentation answers many questions, but also raises some others (Is this piece mandatory? Why do I need this call to convert?).
Threads on Stack Overflow
There are few threads addressing various aspects of the FFT with concrete examples. Notably FFT Using Accelerate In Swift, DFT result in Swift is different than that of MATLAB and FFT Calculating incorrectly - Swift.
None of them directly addresses the question: "What is the proper way to do it, starting from zero?"
It took me one day to figure out how to properly do it, so I hope this thread can give a clear explanation of how you are supposed to use Apple's FFT, show what are the pitfalls to avoid, and help developers save precious hours of their time.
TL;DR: If you need a working implementation to copy-paste, here is a gist.
What is FFT?
The Fast Fourier Transform is an algorithm that takes a signal in the time domain -- a collection of measurements taken at a regular, usually small, interval of time -- and turns it into a signal expressed in the frequency domain (a collection of frequencies).
The ability to express the signal along time is lost by the transformation (though the transformation is invertible, which means no information is lost by computing the FFT, and you can apply an IFFT to get the original signal back), but we gain the ability to distinguish between the frequencies that the signal contains. This is typically used to display the spectrograms of the music you are listening to on various hardware and YouTube videos.
The FFT works with complex numbers. If you don't know what they are, let's just pretend each one is a combination of a radius and an angle. There is one complex number per point on a 2D plane. Real numbers (your usual floats) can be seen as a position on a line (negative on the left, positive on the right).
Nb: FFT(FFT(FFT(FFT(X)))) = X (up to a constant depending on your FFT implementation).
How to compute the FFT of a real signal
Usually you want to compute the FFT of a small window of an audio signal. For the sake of example, we will take a small window of 1024 samples. You will also prefer a power of two; otherwise things get a little more difficult.
var signal: [Float] // Array of length 1024
First, you need to initialize some constants for the computation.
// The length of the input
let length = vDSP_Length(signal.count)
// The power-of-two exponent for two times the length of the input.
// Do not forget this factor 2.
let log2n = vDSP_Length(ceil(log2(Float(length * 2))))
// Create the instance of the FFT class, which allows computing the FFT of complex vectors
// with length up to `length`.
let fftSetup = vDSP.FFT(log2n: log2n, radix: .radix2, ofType: DSPSplitComplex.self)!
Following Apple's documentation, we first need to create a complex array that will be our input.
Don't get misled by the tutorial: what you usually want is to copy your signal as the real part of the input, and keep the imaginary part zero.
// Input / Output arrays
var forwardInputReal = [Float](signal) // Copy the signal here
var forwardInputImag = [Float](repeating: 0, count: Int(length))
var forwardOutputReal = [Float](repeating: 0, count: Int(length))
var forwardOutputImag = [Float](repeating: 0, count: Int(length))
Be careful: the FFT function does not allow using the same DSPSplitComplex as both input and output. If you experience crashes, this may be the cause. This is why we define both an input and an output.
Now, we have to be careful and "lock" the pointers to these four arrays, as shown in the documentation example. If you simply use &forwardInputReal as an argument of your DSPSplitComplex, the pointer may become invalidated at the following line and you will likely experience sporadic crashes of your app.
forwardInputReal.withUnsafeMutableBufferPointer { forwardInputRealPtr in
    forwardInputImag.withUnsafeMutableBufferPointer { forwardInputImagPtr in
        forwardOutputReal.withUnsafeMutableBufferPointer { forwardOutputRealPtr in
            forwardOutputImag.withUnsafeMutableBufferPointer { forwardOutputImagPtr in
                // Input
                let forwardInput = DSPSplitComplex(realp: forwardInputRealPtr.baseAddress!, imagp: forwardInputImagPtr.baseAddress!)
                // Output
                var forwardOutput = DSPSplitComplex(realp: forwardOutputRealPtr.baseAddress!, imagp: forwardOutputImagPtr.baseAddress!)
                // FFT call goes here
            }
        }
    }
}
Now, the final line: the call to your FFT, which goes where the "FFT call goes here" comment sits in the block above (the DSPSplitComplex values are only valid inside those closures):
fftSetup.forward(input: forwardInput, output: &forwardOutput)
The result of your FFT is now available in forwardOutputReal and forwardOutputImag.
If you only want the amplitude of each frequency, and you don't care about the real and imaginary parts, you can declare an additional array alongside the input and output:
var magnitudes = [Float](repeating: 0, count: Int(length))
and, right after your FFT, compute the amplitude of each "bin" with:
vDSP.absolute(forwardOutput, result: &magnitudes)

Is an FFT necessary to find peaks and pits in audio files?

I'm able to read a WAV file and its values. I need to find the positions and values of peaks and pits. At first, I tried to smooth the signal with the formula (array[i-1] + array[i] + array[i+1]) / 3, then search the array with a solution in the style of array[i-1] > array[i] && direction == 'up' --> pit, but because of noise and other requirements of the project's future calculations, I'm trying to find a better approach. For a couple of days I've been researching the FFT. As I understand it, the FFT translates the audio file into a series of sines and cosines. After the FFT operation, the given values are the coefficients of a0 + sum over k of (ak * cos(k*x) + bk * sin(k*x)), as in this picture:
http://zone.ni.com/images/reference/en-XX/help/371361E-01/loc_eps_sigadd3freqcomp.gif
My question is: does the FFT help me find peaks and pits in audio? Does anybody have experience with this kind of problem?
It depends on exactly what you are trying to do, which you haven't really made clear. "Finding the peaks and pits" is one thing, but since there might be various reasons for doing this, there might be various methods. It sounds like you already tried the straightforward thing of looking for local maxima and minima. Here are some tips:
you do not need the FFT.
audio data usually swings above and below zero (there are exceptions, including 8-bit wavs, which are unsigned, but these are exceptions), so you must be aware of positive and negative values. Generally, large positive and large negative values carry large amounts of energy, though, so you want to count those as the same.
due to #2, if you want to average, you might want to take the average of the absolute value, or more commonly, the average of the square. Once you find the average of the squares, take the square root of that value; this gives the RMS, which is related to the power of the signal, so you might do something like this if you are trying to indicate signal loudness or intensity, or approximate an analog meter. The average of absolute values may be more robust against extreme values, but is less commonly used.
another approach is to simply look for the peak of the absolute value over some number of samples; this is commonly done when drawing waveforms and for digital "peak" meters. It makes less sense to look at the minimum of the absolute value.
Once you've done something like the above, yes, you may want to compute the log of the value you've found in order to display the signal in dB, but make sure you use the right formula: 10 * log_10( amplitude ) is not it. Rule of thumb: when computing logs from amplitude you will usually see a 20, not a 10. If you want to compute dBFS (the amount of "headroom" before clipping, which is the standard measurement for digital meters), the formula is -20 * log_10( |amplitude| ), where amplitude is normalized to +/- 1 (see the short sketch after these tips). Watch out for amplitude = 0, which gives infinite headroom in dB.
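Putting tips 3-5 together in a short sketch (Python/NumPy purely for illustration; the 440 Hz tone is a stand-in for your own samples, already scaled to +/- 1):
import numpy as np

# stand-in input: one second of a 440 Hz tone at half scale, 44.1 kHz
t = np.arange(44100) / 44100.0
samples = 0.5 * np.sin(2 * np.pi * 440 * t)

rms = np.sqrt(np.mean(samples ** 2))          # average of the squares, then square root
peak = np.max(np.abs(samples))                # peak of the absolute value
headroom_db = -20 * np.log10(peak)            # dBFS headroom; undefined (infinite) if peak == 0
print(rms, peak, headroom_db)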
If I understand you correctly, you just want to estimate the relative loudness/quietness of an audio digital sample at a given point.
For this estimation, you don't need the FFT. However, your method of averaging the signal does not produce the appropriate picture either.
The digital signal is the value of the audio wave at a given moment. You need to find the overall amplitude of the signal around that moment. You can think of it as the local maximum over a given interval around the moment of interest; keeping a moving max of the signal gives you an amplitude estimate.
With 16-bit samples, the signal value ranges from -32768 up to 32767, so work with the absolute value. At a 44.1 kHz sample rate, you can find peaks and pits of around 0.01 s by finding the max absolute value of the 441 samples around a given moment t:
max = 1;
for (i = 0; i < 441; i++)
    if (abs(array[t*44100 + i]) > max) max = abs(array[t*44100 + i]);
then, to represent it on a 0 to 1 scale (not really 0, because we used a minimum of 1):
amplitude = max / 32767;
or you might represent it in relative dB logarithmic scale (here you see why we used 1 for the minimum value)
dB = 20 * log10(amplitude);
All you need to do is take dy/dx, which you can get approximately by just scanning through the wave, subtracting the previous value from the current one, and looking at where it goes to zero or changes from positive to negative.
In this code I made it really brief and unintelligent for the sake of brevity; of course you could handle cases of dy being zero better, find the 'centre' of a long flat peak, that kind of thing. But if all you need is basic peaks and troughs, this will find them.
lastY = 0;
bool goingup = true;
for (i = 0; i < wave.length; i++) {
    y = wave[i];
    dy = y - lastY;
    bool stillgoingup = (dy > 0);
    if (goingup != stillgoingup) {
        // changed direction - note value of i (place) and y (height)
        goingup = stillgoingup;
    }
    lastY = y;
}

Why do the convolution results have different lengths when performed in time domain vs in frequency domain?

I'm not a DSP expert, but I understand that there are two ways that I can apply a discrete time-domain filter to a discrete time-domain waveform. The first is to convolve them in the time domain, and the second is to take the FFT of both, multiply both complex spectrums, and take IFFT of the result. One key difference in these methods is the second approach is subject to circular convolution.
As an example, if the filter and waveforms are both N points long, the first approach (i.e. convolution) produces a result that is N+N-1 points long, where the first half of this response is the filter filling up and the 2nd half is the filter emptying. To get a steady-state response, the filter needs to have fewer points than the waveform to be filtered.
Continuing this example with the second approach, and assuming the discrete time-domain waveform data is all real (not complex), the FFTs of the filter and the waveform are both N points long. Multiplying the two spectrums and IFFT'ing the result produces a time-domain result that is also N points long. Here the responses where the filter fills up and empties overlap each other in the time domain, and there's no steady-state response. This is the effect of circular convolution. To avoid this, typically the filter size would be smaller than the waveform size and both would be zero-padded to allow space for the frequency-domain convolution to expand in time after the IFFT of the product of the two spectrums.
My question is, I often see work in the literature from well-established experts/companies where they have a discrete (real) time-domain waveform (N points), they FFT it, multiply it by some filter (also N points), and IFFT the result for subsequent processing. My naive thinking is this result should contain no steady-state response and thus should contain artifacts from the filter filling/emptying that would lead to errors in interpreting the resulting data, but I must be missing something. Under what circumstances can this be a valid approach?
Any insight would be greatly appreciated
The basic problem is not about zero padding vs the assumed periodicity, but that Fourier analysis decomposes the signal into sine waves which, at the most basic level, are assumed to be infinite in extent. Both approaches are correct in that the IFFT using the full FFT will return the exact input waveform, and both approaches are incorrect in that using less than the full spectrum can lead to effects at the edges (that usually extend a few wavelengths). The only difference is in the details of what you assume fills in the rest of infinity, not in whether you are making an assumption.
Back to your first paragraph: Usually, in DSP, the biggest problem I run into with FFTs is that they are non-causal, and for this reason I often prefer to stay in the time domain, using, for example, FIR and IIR filters.
Update:
In the question statement, the OP correctly points out some of the problems that can arise when using FFTs to filter signals, for example, edge effects, which can be particularly problematic when doing a convolution that is comparable in length (in the time domain) to the sampled waveform. It's important to note, though, that not all filtering is done using FFTs; in the paper cited by the OP, they are not using FFT filters, and the problems that would arise with an FFT filter implementation do not arise using their approach.
Consider, for example, a filter that implements a simple average over 128 sample points, using two different implementations.
FFT: In the FFT/convolution approach one would have a sample of, say, 256, points and convolve this with a wfm that is constant for the first half and goes to zero in the second half. The question here is (even after this system has run a few cycles), what determines the value of the first point of the result? The FFT assumes that the wfm is circular (i.e. infinitely periodic) so either: the first point of the result is determined by the last 127 (i.e. future) samples of the wfm (skipping over the middle of the wfm), or by 127 zeros if you zero-pad. Neither is correct.
FIR: Another approach is to implement the average with an FIR filter. For example, here one could use the average of the values in a 128 register FIFO queue. That is, as each sample point comes in, 1) put it in the queue, 2) dequeue the oldest item, 3) average all of the 128 items remaining in the queue; and this is your result for this sample point. This approach runs continuously, handling one point at a time, and returning the filtered result after each sample, and has none of the problems that occur from the FFT as it's applied to finite sample chunks. Each result is just the average of the current sample and the 127 samples that came before it.
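A minimal sketch of that FIFO-style moving average (my own illustration in Python, not from the cited paper):
from collections import deque

def moving_average_stream(samples, n=128):
    # Streaming moving average: enqueue the newest sample, the deque drops the oldest
    # automatically once it holds n items, and each output is the mean of what remains.
    fifo = deque(maxlen=n)
    for s in samples:
        fifo.append(s)
        yield sum(fifo) / len(fifo)    # during start-up the window is still filling

# Example: a step input settles to the average of the last 128 samples
filtered = list(moving_average_stream([1.0] * 200 + [0.0] * 200))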
The paper that the OP cites takes an approach much more similar to the FIR filter than to the FFT filter (note, though, that the filter in the paper is more complicated, and the whole paper is basically an analysis of this filter). See, for example, this free book, which describes how to analyze and apply different filters, and note also that the Laplace approach to analyzing FIR and IIR filters is quite similar to what's found in the cited paper.
Here's an example of convolution without zero padding for the DFT (circular convolution) vs linear convolution. This is the convolution of a length M=32 sequence with a length L=128 sequence (using Numpy/Matplotlib):
from pylab import *   # pulls in numpy (rand, convolve, fft, ifft, real) plus matplotlib's plot/grid

f = rand(32); g = rand(128)
h1 = convolve(f, g)
h2 = real(ifft(fft(f, 128) * fft(g)))
plot(h1); plot(h2, 'r')
grid()
The first M-1 points are different, and it's short by M-1 points since it wasn't zero padded. These differences are a problem if you're doing block convolution, but techniques such as overlap and save or overlap and add are used to overcome this problem. Otherwise if you're just computing a one-off filtering operation, the valid result will start at index M-1 and end at index L-1, with a length of L-M+1.
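To make the zero-padding point concrete (a quick check of my own, not part of the original answer), padding both FFTs to L+M-1 points recovers the full linear convolution exactly:
import numpy as np

f = np.random.rand(32); g = np.random.rand(128)
n = len(f) + len(g) - 1                                    # 159, the full linear-convolution length
h_lin = np.convolve(f, g)
h_fft = np.real(np.fft.ifft(np.fft.fft(f, n) * np.fft.fft(g, n)))
print(np.allclose(h_lin, h_fft))                           # True: the wrap-around is gone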
As to the paper cited, I looked at their MATLAB code in appendix A. I think they made a mistake in applying the Hfinal transfer function to the negative frequencies without first conjugating it. Otherwise, you can see in their graphs that the clock jitter is a periodic signal, so using circular convolution is fine for a steady-state analysis.
Edit: Regarding conjugating the transfer function, the PLLs have a real-valued impulse response, and every real-valued signal has a conjugate symmetric spectrum. In the code you can see that they're just using Hfinal[N-i] to get the negative frequencies without taking the conjugate. I've plotted their transfer function from -50 MHz to 50 MHz:
from pylab import *   # numpy + numpy.fft + matplotlib.pyplot in one namespace

N = 150000 # number of samples. Need >50k to get a good spectrum.
res = 100e6/N # resolution of single freq point
f = res * arange(-N/2, N/2) # set the frequency sweep [-50MHz,50MHz), N points
s = 2j*pi*f # set the xfer function to complex radians
f1 = 22e6 # define 3dB corner frequency for H1
zeta1 = 0.54 # define peaking for H1
f2 = 7e6 # define 3dB corner frequency for H2
zeta2 = 0.54 # define peaking for H2
f3 = 1.0e6 # define 3dB corner frequency for H3
# w1 = natural frequency
w1 = 2*pi*f1/((1 + 2*zeta1**2 + ((1 + 2*zeta1**2)**2 + 1)**0.5)**0.5)
# H1 transfer function
H1 = ((2*zeta1*w1*s + w1**2)/(s**2 + 2*zeta1*w1*s + w1**2))
# w2 = natural frequency
w2 = 2*pi*f2/((1 + 2*zeta2**2 + ((1 + 2*zeta2**2)**2 + 1)**0.5)**0.5)
# H2 transfer function
H2 = ((2*zeta2*w2*s + w2**2)/(s**2 + 2*zeta2*w2*s + w2**2))
w3 = 2*pi*f3 # w3 = 3dB point for a single pole high pass function.
H3 = s/(s+w3) # the H3 xfer function is a high pass
Ht = 2*(H1-H2)*H3 # Final transfer based on the difference functions
subplot(311); plot(f, abs(Ht)); ylabel("abs")
subplot(312); plot(f, real(Ht)); ylabel("real")
subplot(313); plot(f, imag(Ht)); ylabel("imag")
As you can see, the real component has even symmetry and the imaginary component has odd symmetry. In their code they only calculated the positive frequencies for a loglog plot (reasonable enough). However, for calculating the inverse transform they used the values for the positive frequencies for the negative frequencies by indexing Hfinal[N-i] but forgot to conjugate it.
I can shed some light on the reason why "windowing" is applied before the FFT is applied.
As already pointed out, the FFT assumes that we have an infinite signal. When we take a sample over a finite time T, this is mathematically equivalent to multiplying the signal with a rectangular function.
Multiplying in the time domain becomes convolution in the frequency domain. The frequency response of a rectangle is the sinc function, i.e. sin(x)/x. The x in the denominator is the kicker, because it dies down only as O(1/N).
If you have frequency components which are exactly multiples of 1/T this does not matter, as the sinc function is zero at all points except that frequency, where it is 1.
However, if you have a sine which falls between two bins, you will see the sinc function sampled at the frequency points. It looks like a magnified version of the sinc function, and the 'ghost' signals caused by the convolution die down with 1/N, or 6 dB/octave. If you have a signal 60 dB above the noise floor, you will not see the noise for 1000 frequencies left and right of your main signal; it will be swamped by the "skirts" of the sinc function.
If you use a different time window you get a different frequency response; a cosine window, for example, dies down with 1/x^2, and there are specialized windows for different measurements. The Hanning window is often used as a general-purpose window.
The point is that the rectangular window used when not applying any "windowing function" creates far worse artefacts than a well-chosen window, i.e. by "distorting" the time samples we get a much better picture in the frequency domain, which more closely resembles "reality", or rather the "reality" we expect and want to see.
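A small experiment (my own, not from the answer above) that shows the effect: an off-bin sine analyzed with a rectangular window versus a Hann window; the Hann skirts fall off much faster:
import numpy as np
import matplotlib.pyplot as plt

N = 1024
t = np.arange(N)
tone = np.sin(2 * np.pi * 10.37 * t / N)     # frequency deliberately falls between two bins

for name, w in [("rectangular", np.ones(N)), ("hann", np.hanning(N))]:
    spec = np.abs(np.fft.rfft(tone * w))
    plt.semilogy(spec / spec.max(), label=name)
plt.legend(); plt.xlabel("FFT bin"); plt.ylabel("normalized magnitude")
plt.show()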
Although there will be artifacts from assuming that a rectangular window of data is periodic at the FFT aperture width, which is one interpretation of what circular convolution does without sufficient zero padding, the differences may or may not be large enough to swamp the data analysis in question.
