Cannot explain result of minimum-phase reconstruction of a windowed sinc function - signal-processing

I'm currently trying to calculate 'minimum-phase band-limited step functions' (BLEP, see (1) and (2)) for synthesizing alias-free saw/square/triangle waves.
I think I understand most of the theory, and am exploring/prototyping in Jupyter notebooks and in MATLAB. I'm getting some strange (or at least, strange to me) results in the minimum-phase reconstruction of the windowed sinc function I want to use to calculate the BLEP.
The idea is as follows:
1. Calculate a sinc function with a specified number of zero-crossings and oversampling rate
2. Apply a Blackman window to the sinc function to remove periodic discontinuities
3. Calculate the minimum-phase reconstruction of the windowed sinc function to move the energy of the impulse response to the left (to remove the need for delay lines)
4. Integrate the minimum-phase reconstruction to get the BLEP
Step 3 is what confuses me. The minimum-phase reconstruction is defined as the ifft of the weighted cepstrum of the signal, which I can calculate step-by-step in a Jupyter notebook, but which I can also calculate directly in MATLAB using the rceps function. Both give me the same confusing results, which I would like to understand better before moving forward.
This is the MATLAB code I am using for experimentation:
clear;
L = 16;
OMEGA = 64;
N = 2*L*OMEGA+1;
x = linspace(-L, L, N);
h = blackman(N);
%h = [zeros(floor(N/4), 1); blackman(ceil(N/2)); zeros(floor(N/4), 1)];
s = sinc(x);
sw = s .* h';
sw_f = fft(sw);
[y,ym] = rceps(sw);
figure;
plot(y);
figure;
plot(ym);
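For comparison with the MATLAB script above, here is a minimal NumPy sketch of the same four steps. It assumes the usual folded-cepstrum definition of the minimum-phase reconstruction (the one MATLAB documents for rceps); the function name `min_phase_reconstruction`, the `1e-12` log floor and the `1/OMEGA` scaling of the integral are illustrative assumptions, not part of the original code.

```python
import numpy as np

def min_phase_reconstruction(h):
    """Return (real cepstrum, minimum-phase signal) using the folded cepstrum."""
    n = len(h)
    # Real cepstrum: inverse FFT of the log-magnitude spectrum.
    # The small floor (an assumption) avoids log(0) at exact spectral nulls.
    log_mag = np.log(np.maximum(np.abs(np.fft.fft(h)), 1e-12))
    cep = np.fft.ifft(log_mag).real
    # Folding weights: keep index 0, double the first half, zero the second half.
    w = np.zeros(n)
    w[0] = 1.0
    w[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        w[n // 2] = 1.0
    # Back to the time domain: exponentiate the spectrum of the folded cepstrum.
    y = np.fft.ifft(np.exp(np.fft.fft(w * cep))).real
    return cep, y

L, OMEGA = 16, 64                  # zero crossings per side, oversampling factor
N = 2 * L * OMEGA + 1
x = np.linspace(-L, L, N)
sw = np.sinc(x) * np.blackman(N)   # steps 1 and 2: windowed sinc
cep, ym = min_phase_reconstruction(sw)  # step 3
blep = np.cumsum(ym) / OMEGA       # step 4; the 1/OMEGA scaling is an assumption
```

Because every step goes through a length-N FFT, the cepstrum and the reconstruction are circular (periodic with period N), which is worth keeping in mind when looking at the right half of the plots.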
What is happening is that the 'right half' of the minimum-phase reconstruction of the windowed sinc (the result from step 3) contains signal energy that I don't understand.
For example, the windowed sinc looks like this:
windowed-sinc
Which gives me the following minimum-phase reconstruction and cepstrum:
min-phase reconstruction
cepstrum
As you can see, stuff is going on in the right-half of the min-phase reconstruction, where I would have expected the graph to converge to 0 and stay there.
The effect is even more pronounced when I narrow the window function by 2x, filtering out all but the first few zero crossings:
min-phase reconstruction, narrow window
It looks like a lot of energy 'leaks' into bins that affect the right half of the reconstructed signal, but I have no idea why. The input windowed sinc function is periodic, the sample rate is high enough, and the left half of the reconstructed signal looks correct.
The cepstrum for the narrow-window sinc function also shows a spike right in the middle, which I don't understand:
min-phase reconstruction, narrow window
Now, I'm aware that to use the min-phase reconstructed signal to generate a BLEP that will be used for at most L samples, I don't really care what the signal looks like after the first L zero-crossings. I can just ignore it and calculate the BLEP from the left part of the graph. But I would still like to understand what is going on, if only to be sure I'm not missing something.
One thing I already found out is that if I reduce the ratio between zero-crossings and oversampling factor too much, I can introduce FFT leakage that even affects the left half of the min-phase reconstructed signal, making it unusable. I think I understand this phenomenon and just need to make sure the sample rate is high enough. But bumping the oversampling to some very high number does not remove any energy in the right-hand part of the reconstructed signal, so it does not appear to be a result of FFT leakage. What explains it though?
(1) https://www.experimentalscene.com/articles/minbleps.php
(2) https://www.kvraudio.com/forum/viewtopic.php?t=364256

Related

Why does the Wiener filter reduce only the noise in my case, and not the amount of blur?

I am implementing Wiener filtering in Python, applied to an image blurred with a disk-shaped point spread function. I am including the code that builds the disk-shaped PSF and the Wiener filter:
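import numpy as np

def l2dist(x0, y0, x1, y1):
    # Not shown in the question; assumed here to be the Euclidean distance.
    return np.hypot(x1 - x0, y1 - y0)

def weinerFiltering(kernel, K_const, image):
    # F(u,v)
    copy_img = np.copy(image)
    image_fft = np.fft.fft2(copy_img)
    # H(u,v), zero-padded to the image size
    kernel_fft = np.fft.fft2(kernel, s=copy_img.shape)
    # |H(u,v)|
    kernel_fft_mag = np.abs(kernel_fft)
    # H*(u,v)
    kernel_conj = np.conj(kernel_fft)
    f = kernel_conj / (kernel_fft_mag**2 + K_const)
    return np.abs(np.fft.ifft2(image_fft * f))

def makeDiskShape(arr, radius, centrX, centrY):
    # Set every pixel within `radius` of the centre to 1, then normalise to unit sum.
    for i in range(centrX - radius, centrX + radius):
        for j in range(centrY - radius, centrY + radius):
            if l2dist(centrX, centrY, i, j) <= radius:
                arr[i][j] = 1
    return arr / np.sum(arr)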
This is the blurred image with Gaussian noise added:
This is the result I get after Wiener filtering with a K value of 50:
The result does not seem very good; can someone help?
It seems the noise is reduced but the amount of blur is not. The disk-shaped PSF matrix has shape (20, 20) and radius 9, and looks like this:
Update
Using the power spectrum of the ground-truth image and of the noise to calculate the K value, I am still getting strong artifacts.
This is the noisy, blurred image:
This is the result after using the power spectrum in place of a constant K value:
Reduce your value of K. You need to play around with it until you get good results. If it's too large it doesn't filter, if it's too small you get strong artifacts.
If you have knowledge of the noise variance, you can use that to estimate the regularization parameter. In the Wiener filter, the constant K is a simplification of N/S, where N is the noise power and S is the signal power. Both these values are frequency-dependent. The signal power S can be estimated by the Fourier transform of the autocorrelation function of the image to be filtered. The noise power is hard to estimate, but if you have such an estimate (or know it because you created the noisy image synthetically), then you can plug that value into the equation. Note that this is the noise power, not the variance of the noise.
The following code uses DIPlib (the Python interface we call PyDIP) to demonstrate Wiener deconvolution (disclaimer: I'm an author). I don't think it is hard to convert this code to use other libraries.
import PyDIP as dip
image = dip.ImageRead('trui.ics');
kernel = dip.CreateGauss([3,3]).Pad(image.Sizes())
smooth = dip.ConvolveFT(image, kernel)
smooth = dip.GaussianNoise(smooth, 5.0) # variance = 5.0
H = dip.FourierTransform(kernel)
F = dip.FourierTransform(smooth)
S = dip.SquareModulus(F) # signal power estimate
N = dip.Image(5.0 * smooth.NumberOfPixels()) # noise power (has same value at all frequencies)
Hinv = dip.Conjugate(H) / ( dip.SquareModulus(H) + N / S )
out = dip.FourierTransform(F * Hinv, {"inverse", "real"})
The smooth image looks like this:
The out image that comes from deconvolving the image above looks like this:
Don't expect a perfect result. The regularization term impedes a perfect inverse filtering because such filtering would enhance the noise so strongly that it would swamp the signal and produce a totally useless output. The Wiener filter finds a middle ground between undoing the convolution and suppressing the noise.
The DIPlib documentation for WienerDeconvolution explains some of the equations involved.
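For readers without DIPlib, the same regularized inverse filter can be sketched in plain NumPy. This is a minimal illustration, not a translation of the DIPlib code: the function name `wiener_deconvolve`, the parameter name `nsr` (the N/S ratio, scalar or per-frequency array) and the assumption that the kernel is already padded to the image size and centred at `shape // 2` are all choices made for this example.

```python
import numpy as np

def wiener_deconvolve(blurred, kernel, nsr):
    """Wiener deconvolution sketch.

    kernel: PSF padded to the image size, centred at index shape // 2.
    nsr:    noise-to-signal power ratio N/S (scalar or array of image shape).
    """
    # ifftshift moves the kernel centre to index (0, 0) so that the
    # restored image is not circularly shifted.
    H = np.fft.fft2(np.fft.ifftshift(kernel))
    G = np.fft.fft2(blurred)
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.fft.ifft2(G * W).real

# Noise-free demo: blur a random image with a Gaussian PSF, then restore it.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
yy, xx = np.mgrid[-16:16, -16:16]
kernel = np.exp(-(xx**2 + yy**2) / 2.0)
kernel /= kernel.sum()
blurred = np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(np.fft.ifftshift(kernel))).real
restored = wiener_deconvolve(blurred, kernel, nsr=1e-12)
```

With noise present, `nsr` has to be raised (or made frequency-dependent, as described above); the tiny value here only works because the demo is noise-free.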

Strange FFT spectrum from a near perfect sinusoid

I have retrieved a signal from my Abaqus simulation for verification purposes. The true signal should be a perfect sinusoid at 300kHz, and I performed an FFT on the sampled signal using scipy.fftpack.fft.
But I got a strange spectrum as shown below (sorry that I am too lazy to scale the x-axis of the spectrum to the correct frequency). In the same figure, I sliced the signal into pieces and plotted them in the time domain. I also repeated the same process for a pure sine wave.
This totally surprises me. As indicated below in the code, the sampling frequency is 16.66x the frequency of the signal. At the moment, I think it is due to a very small error in the sampling period. In theory, Abaqus should sample at regular time intervals. As you can see, there is some small error, so the dots in my signal appear thicker than in the perfect signal. But does such a small error give such a striking difference in the frequency spectrum? Otherwise, why is the frequency spectrum like that?
FYI1: This is the magnified fft spectrum of my signal:
FYI2: This is the python code that was used to produce the above figures
import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft

# `mysignal` is the sampled Abaqus signal (loaded from the files mentioned in FYI3).

def myfft(x, k, label):
    # Plot the magnitude of the first k FFT bins.
    plt.plot(np.abs(fft(x))[0:k], label=label)
    plt.legend()

plt.subplot(4, 1, 1)
for i in range(149800 // 200):
    plt.plot(mysignal[200 * i:200 * (i + 1)], 'bo')
plt.subplot(4, 1, 2)
myfft(mysignal, 150000 // 2, 'fft of my signal')
plt.subplot(4, 1, 3)
[Fs, f, sample] = [5e6, 300000, 150000]
x = np.arange(sample)
y = np.sin(2 * np.pi * f * x / Fs)
for i in range(149800 // 200):
    plt.plot(y[200 * i:200 * (i + 1)], 'bo')
plt.subplot(4, 1, 4)
myfft(y, 150000 // 2, 'fft of a perfect signal')
plt.subplots_adjust(top=2, right=2)
FYI3: Here is my signal in .npy and .txt format. The signal is pretty long. It has 150001 points. The .txt one is the raw file from Abaqus. The .npy format is what I used to produce the above plot - (1) the time vector is removed and (2) the data is in half precision and normalized.
Any standard FFT algorithm operates on the assumption that the signal you provide is uniformly sampled. Uniform in this context means equally spaced in time. Your signal is clearly not uniformly sampled, so the FFT does not "see" a perfect sine but a distorted version. As a consequence, you see all these additional spectral components, which the FFT computes to map your distorted signal to the frequency domain. You have two options now: resample your signal so that it is uniformly sampled and use your off-the-shelf FFT, or use a non-uniform FFT to get your spectrum. Here is one library you could use to calculate your non-uniform FFT.
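The first option (resampling onto a uniform grid before the FFT) could look roughly like the sketch below. It assumes the time vector is available and strictly increasing, and it uses simple linear interpolation via `np.interp`; the helper name `resample_uniform`, the jitter model and the demo parameters are made up for illustration.

```python
import numpy as np

def resample_uniform(t, x, fs):
    """Linearly interpolate samples x taken at times t onto a uniform grid at rate fs."""
    t_uniform = np.arange(t[0], t[-1], 1.0 / fs)
    return t_uniform, np.interp(t_uniform, t, x)

# Demo: a 300 kHz sine sampled at roughly 5 MHz with timing jitter (assumed model).
fs, f0, n = 5e6, 300e3, 5000
rng = np.random.default_rng(0)
t = np.arange(n) / fs + rng.uniform(-0.2, 0.2, n) / fs  # jittered but still increasing
x = np.sin(2 * np.pi * f0 * t)

t_u, x_u = resample_uniform(t, x, fs)
spectrum = np.abs(np.fft.rfft(x_u))
freqs = np.fft.rfftfreq(len(x_u), d=1.0 / fs)
peak_freq = freqs[np.argmax(spectrum)]  # frequency of the strongest bin
```

Linear interpolation is only adequate when the signal is well oversampled, as it is here; a higher-order interpolator may be preferable otherwise.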

What is the correct way to average several rotation matrices?

I get many rotation vectors from pose estimation over many frames (while the camera is stationary) and I want the most accurate measure. In theory, can I average rotation vectors, rotation matrices, or some other representation? Or is that wrong?
In addition, how can I tell when a rotation vector or matrix is an outlier (i.e. very different from all the others, and possibly a miscalculation)? For example, for a translation I can look at the difference in centimeters of every entry and set an intuitive threshold. Is there a similar way for rotations?
One way, if you want to average rotations that are 'close', is to seek, in analogy with the mean of ordinary numbers, the value that minimises the 'dispersion'. For numbers x[], the mean is what minimises
disp = Sum{ i | sqr( x[i]-mean)}
So for rotations R[] we can seek a rotation Q to minimise
disp = Sum{ i | Tr( (R[i]-Q)'*(R[i]-Q))}
where Tr is the trace and ' denotes transpose. Note that writing things this way does not change what we are trying to minimise, it just makes the algebra easier.
That particular measure of dispersion leads to a practical way of computing Q:
a/ compute the 'mean matrix' of the rotations
M = Sum{ i | R[i] } /N
b/ take the SVD of that
M = U*D*V'
c/ compute the rotation closest to M
Q = U*V'
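Steps a/ to c/ can be sketched in NumPy as follows. The sign correction on the last singular direction is an addition not mentioned above: it is the standard way to make sure the result has determinant +1 (a proper rotation rather than a reflection). The names `average_rotations` and `rot_z` are illustrative only.

```python
import numpy as np

def average_rotations(rotations):
    """Project the element-wise mean of 3x3 rotation matrices back onto SO(3)."""
    M = np.mean(rotations, axis=0)           # a/ the 'mean matrix'
    U, _, Vt = np.linalg.svd(M)              # b/ M = U * D * V'
    # c/ closest rotation; flip the last axis if U*V' would be a reflection
    # (this determinant fix is an addition to the steps above).
    d = np.sign(np.linalg.det(U @ Vt))
    return U @ np.diag([1.0, 1.0, d]) @ Vt

def rot_z(angle):
    """Rotation about the z axis by `angle` radians (helper for the demo)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Demo: three nearby rotations about z, symmetric around zero.
Q = average_rotations(np.array([rot_z(-0.1), rot_z(0.0), rot_z(0.1)]))
```

In the demo the three inputs are symmetric about the identity, so the averaged rotation comes out as the identity.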
You may not average rotation matrices, and specifically not when you use the term "most accurate". But let's go back to the start: matrix multiplications, i.e. rotations, do not commute. ABC != BAC != CBA ... the outcomes can be as dramatically far apart as imaginable.
As far as the outliers go: use quaternions instead of rotation matrices. Firstly, the number of calculation steps can be minimised, leading to higher performance; there are tons of implementations of that online. Secondly, by building Euclidean norms on the quaternions, you get a good measure for outliers.
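As a hedged alternative to the quaternion norm suggested above, an outlier measure can also be computed directly from rotation matrices: the angle of the relative rotation between each estimate and a reference rotation gives a single number in radians that can be thresholded much like centimeters for translations. This swaps in a different (but closely related) distance; the function name `rotation_angle` and the threshold are purely illustrative.

```python
import numpy as np

def rotation_angle(Ra, Rb):
    """Angle (radians) of the relative rotation Ra' * Rb between two 3x3 rotations."""
    cos_theta = (np.trace(Ra.T @ Rb) - 1.0) / 2.0
    # Clip guards against values slightly outside [-1, 1] due to rounding.
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

def rot_z(angle):
    """Rotation about the z axis by `angle` radians (helper for the demo)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Demo: distances of a few estimates from a reference; the last one is far off.
reference = np.eye(3)
estimates = [rot_z(0.01), rot_z(-0.02), rot_z(0.5)]
angles = np.array([rotation_angle(reference, R) for R in estimates])
threshold = 0.1                      # hypothetical threshold in radians
is_outlier = angles > threshold
```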

1D discrete denoising of image by variational method (the length of smoothing term)

Regarding 1D discrete denoising via variational calculus: I would like to know how to handle the length of the smoothing term, which should be N-1, while the length of the data term is N. Here is the equation:
E=0;
for i=1:n
E+=(u(i)-f(i))^2 + lambda*(u[i+1]-n[i])
E is the cost of actual u in optimization process
f is given image (noised)
u is output image (denoised)
n is the length of 1D vector.
lambda>=0 is weight of smoothness in optimization process (described around 13 minute in video)
Here the lengths of the second term and the first term do not match. How can this be resolved?
More importantly, I would like to use a linear system of equations to solve this problem.
This is nowhere near my cup of tea, but I think you are referring to the fact that
u[i+1]-n[i] accesses the next pixel, so the term only works on a resolution 1 pixel smaller than the original image f.
In graphics and filtering this is usually resolved in 2 ways:
1. Use a default value for pixels outside the image resolution:
   - set a default or neutral (for the process) color for those pixels (like black)
   - use the color of the closest neighbor inside the image resolution
   - interpolate the missing pixels (bilinear, bicubic...)
   I think the first choice is not suitable for your denoising technique.
2. Change the resolution of the output image.
   Usually after some filtering techniques (via FIR, etc.) the result is 1 pixel smaller than the input, to resolve the missing data problem. In your case it looks like your resulting image u should be 1 pixel bigger than the input image f while computing the cost function.
   So either enlarge it via bullet #1 and crop back to the original size when the optimization is done,
   or virtually crop f one pixel down (just say n'=n-1) before computing the cost function, so you avoid access violations (and you can also restore it after the optimization...)
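On the linear-system part of the question, one common way to handle the N versus N-1 mismatch is to keep all N data terms and simply sum the smoothing term over the N-1 neighbouring pairs. The sketch below assumes the smoothing term is meant to be the squared forward difference (u[i+1]-u[i])^2, which is an interpretation of the formula in the question, not something it states. Under that assumption the minimiser solves (I + lambda * D'D) u = f, where D is the (N-1) x N difference matrix. The name `denoise_1d` is illustrative.

```python
import numpy as np

def denoise_1d(f, lam):
    """Minimise sum_i (u[i]-f[i])^2 + lam * sum_i (u[i+1]-u[i])^2 via a linear solve."""
    f = np.asarray(f, dtype=float)
    n = len(f)
    # Forward-difference operator with shape (n-1, n): (D u)[i] = u[i+1] - u[i].
    D = np.diff(np.eye(n), axis=0)
    # Setting the gradient to zero gives (I + lam * D^T D) u = f.
    A = np.eye(n) + lam * (D.T @ D)
    return np.linalg.solve(A, f)

# Demo: a noisy step signal.
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(50), np.ones(50)])
noisy = clean + 0.1 * rng.standard_normal(clean.size)
smoothed = denoise_1d(noisy, lam=5.0)
```

The dense matrix is fine for short signals; for long ones the system is tridiagonal, so a banded or sparse solver would be the natural choice.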

Frequency Shifting with FFT

I've been experimenting with a few different techniques that I can find for a freq shifting (specifically I want to shift high freq signals to a lower freq). At the moment I'm trying to use this technique -
take the original signal, x(t), multiply it by: cos(2 PI dF t), sin(2 PI dF t)
R(t) = x(t) cos(2 PI dF t)
I(t) = x(t) sin(2 PI dF t)
where dF is the delta frequency to be shifted.
Now you have two time series signals: R(t) and I(t).
Conduct complex Fourier transform using R(t) as real and I(t) as imaginary parts. The results will be frequency shifted spectrum.
I have interpreted this into the following code -
for(j=0;j<(BUFFERSIZE/2);j++)
{
    Partfunc = (((double)j)/2048);
    /* interleave real and imaginary parts: index j+x equals 2*j when x starts at 0 */
    PreFFTShift[j+x] = PingData[j]*(cos(2*M_PI*Shift*(Partfunc)));
    PreFFTShift[j+1+x] = PingData[j]*(sin(2*M_PI*Shift*(Partfunc)));
    x++;
}
//INITIALIZE FFT
status = arm_cfft_radix4_init_f32(&S, fftSize, ifftFlag, doBitReverse);
//FFT on FFTData
arm_cfft_radix4_f32(&S, PreFFTShift);
This builds an array with interleaved real and imaginary data and then performs the FFT. I then take the inverse FFT, but the output I'm getting is pretty garbled. The results seem huge in comparison to what I think they should be, and although there are a few traces of a frequency-shifted signal, it's hard to tell, as the result seems mostly noise.
I've also attempted simply rotating the array values of a standard FFT of my original signal to get a frequency shift, but to no avail. Is there a better method for doing this?
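For reference, the mixing idea quoted above can be prototyped offline in a few lines of NumPy before porting it to the embedded target. This sketch differs from the quoted recipe in one respect, which is an assumption of this example: it first builds the analytic signal (negative frequencies removed) so that only one sideband is shifted, and it performs the shift in the time domain rather than interpreting an FFT output. The function name `frequency_shift` and the demo parameters are illustrative.

```python
import numpy as np

def frequency_shift(x, shift_hz, fs):
    """Shift all frequencies of a real signal x by shift_hz (negative = downwards)."""
    n = len(x)
    # Analytic signal: zero the negative-frequency bins, double the positive ones.
    gain = np.zeros(n)
    gain[0] = 1.0
    gain[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
    analytic = np.fft.ifft(np.fft.fft(x) * gain)
    # Multiply by a complex exponential, then keep the real part.
    t = np.arange(n) / fs
    return (analytic * np.exp(2j * np.pi * shift_hz * t)).real

# Demo: move a 1000 Hz tone down by 300 Hz.
fs, n = 8000, 8000
tone = np.sin(2 * np.pi * 1000 * np.arange(n) / fs)
shifted = frequency_shift(tone, -300.0, fs)
peak_hz = np.argmax(np.abs(np.fft.rfft(shifted))) * fs / n
```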
Have you tried something like:
1. Use a Hanning window for each frame of data
2. Once you have your windowed frame of audio data, do an FFT on it
3. Do some kind of transformation in the frequency domain (you can use Flanagan - phase vocoder)
4. Go back to the time domain with an IFFT
5. Apply a Hanning window to the IFFT data
6. Overlap-add each new frame of time-domain data into the output stream
My results:
I created two concatenated sinusoids (250Hz and 400Hz) and moved them one octave up.
The blue waveform is the original and the red one is the shifted result; you can see a fade-in/fade-out caused by the overlap-add and the Hann window.
If you want the frequency shift to sound more "natural", you will have to maintain the ratios between all the initial frequency bins, where the amount of shift will depend on the FFT bin, thus requiring lots of interpolation. The Phase Vocoder algorithm will use multiple FFTs to reduce phase distortion in the result.
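The framing part of that recipe (window, FFT, process, IFFT, window, overlap-add) can be sketched as below. The phase-vocoder transformation itself is not implemented; `process` is a placeholder hook that defaults to the identity, so the sketch only demonstrates that the windowing and overlap-add reconstruct the input. The function name, frame size, hop size and the normalisation by the summed squared window are assumptions of this example.

```python
import numpy as np

def stft_overlap_add(x, frame=1024, hop=256, process=lambda spec: spec):
    """Hann-windowed FFT / IFFT with overlap-add; `process` edits each frame spectrum."""
    win = np.hanning(frame + 1)[:-1]          # periodic Hann window
    padded = np.concatenate([x, np.zeros(frame)])
    out = np.zeros(len(padded))
    norm = np.zeros(len(padded))
    for start in range(0, len(x), hop):
        seg = padded[start:start + frame] * win            # analysis window
        spec = process(np.fft.rfft(seg))                   # frequency-domain step
        out[start:start + frame] += np.fft.irfft(spec, frame) * win  # synthesis window
        norm[start:start + frame] += win ** 2
    # Divide by the accumulated squared window to undo the double windowing.
    safe = np.maximum(norm, 1e-8)
    return (out / safe)[:len(x)]

# Demo: with the identity transform the interior of the signal is reconstructed.
rng = np.random.default_rng(0)
signal = rng.standard_normal(8192)
rebuilt = stft_overlap_add(signal)
```

The first few samples are not recovered exactly because the window is zero at the very start of the first frame; a real implementation would typically pad the input at the front as well.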
