Noise Detection in audio signal - signal-processing

I have a one-minute clip of audio sampled at 44.1 kHz.
Its spectrogram is shown below.
There is too much background noise in the clip.
The original clip is linked here.
What would be the best way to either a) reduce the background noise, or b) enhance only the speech?
Also, how can the noise be identified in the spectrogram?
To generate the spectrogram I used the following commands:
[x, Fs] = wavread('cpp_part.wav');   % use audioread in newer MATLAB releases
spectrogram(x, 512, 400, 512, Fs, 'yaxis');
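A common approach to (a) is spectral subtraction: estimate the noise magnitude spectrum from a speech-free stretch of the clip and subtract it from every STFT frame. Below is a minimal Python/SciPy sketch (illustrative, not the questioner's MATLAB toolchain). It assumes the first half-second of the clip is noise-only, which you would need to verify against the recording:

```python
import numpy as np
from scipy.signal import stft, istft

def spectral_subtract(x, fs, noise_dur=0.5, nperseg=512):
    """Subtract an estimated noise floor from the magnitude spectrogram.

    Assumes the first `noise_dur` seconds contain only background noise
    (a common but not universal assumption)."""
    f, t, Z = stft(x, fs, nperseg=nperseg)
    hop = nperseg // 2                               # default STFT hop
    n_noise = max(1, int(noise_dur * fs / hop))      # frames in the noise-only lead-in
    noise_mag = np.abs(Z[:, :n_noise]).mean(axis=1, keepdims=True)
    mag = np.maximum(np.abs(Z) - noise_mag, 0.0)     # clamp at zero magnitude
    Z_clean = mag * np.exp(1j * np.angle(Z))         # keep the noisy phase
    _, y = istft(Z_clean, fs, nperseg=nperseg)
    return y[:len(x)]
```

As for identifying the noise in the spectrogram: stationary background noise shows up as a roughly constant horizontal "floor" present in every frame, whereas speech appears as time-varying harmonics and formants.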

Related

Normalizing lighting conditions for image recognition

I am using OpenCV to process a video to identify frames containing objects using a rudimentary algorithm (background subtraction → brightness threshold → dilation → contour detection).
The video is shot over the course of a day, so lighting conditions gradually change; I expect my results would improve if I did some sort of brightness and/or contrast normalization step first.
Answers to this question suggest using convertScaleAbs, or contrast optimization with histogram clipping, or histogram equalization.
Of these techniques or others, what would be the best preprocessing step to normalize the frames?
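Of the suggested techniques, plain histogram equalization is the simplest baseline (cv2.equalizeHist globally, or cv2.createCLAHE for local contrast). The mechanics can be sketched in a few lines of NumPy; this is illustrative only, and the OpenCV functions are the practical choice:

```python
import numpy as np

def equalize_hist(gray):
    """Global histogram equalization for an 8-bit grayscale frame.

    cv2.equalizeHist does the same thing; this shows the mechanics:
    remap intensities so the output CDF is approximately uniform."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]           # first nonzero CDF value
    lut = np.round((cdf - cdf_min) / (gray.size - cdf_min) * 255)
    return lut.astype(np.uint8)[gray]   # apply the lookup table
```

Because equalization is computed per frame, it also makes successive frames more comparable for the background-subtraction step, which is the property you actually need here.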

How to upsample audio with digital interpolation

I want to take an array of N audio data points and upsample it so that there are L*N points. I understand an accurate way to do this is to pad L-1 zero points between each original point and then low-pass the signal. According to this 4-minute video https://www.youtube.com/watch?v=sJslC6TuCoc I should low-pass at a frequency of pi/L and then apply a gain of L to the result to properly upsample my signal. I am having trouble with this low-passing step, and my resulting audio signal is not audible at all. Can anyone help me here? Is this "low pass" really more like a band-reject filter or something?
My low pass algorithm is noted here (biquad transfer function with coefficients marked under "LPF"): http://music.columbia.edu/pipermail/music-dsp/1998-October/054185.html
You can interpolate the inserted points using a high-quality interpolation algorithm, such as a polyphase windowed-sinc FIR filter.
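The zero-stuff-then-low-pass recipe from the video can be sketched in Python with SciPy (scipy.signal.resample_poly wraps the same idea in one call). Two details commonly cause the "inaudible output" symptom: the cutoff must be normalized to the *new* Nyquist rate, and the gain of L compensates for the amplitude lost to the inserted zeros:

```python
import numpy as np
from scipy.signal import firwin, lfilter

def upsample(x, L, ntaps=101):
    """Upsample by integer factor L: zero-stuff, low-pass at pi/L, gain L."""
    y = np.zeros(len(x) * L)
    y[::L] = x                       # insert L-1 zeros between original samples
    h = firwin(ntaps, 1.0 / L)       # cutoff pi/L, normalized to the new Nyquist
    return L * lfilter(h, 1.0, y)    # gain L restores the original amplitude
```

If the output is silent, the usual culprit is a cutoff expressed in the wrong units (e.g. passing pi/L literally where the API expects a fraction of Nyquist), which can push the entire passband into the stopband.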

Detecting "noise" in a video stream ("snow", green blocks, partial frame distortion etc.)

I am on OpenCV 2.4.1 and need to detect if a video stream has any kind of noise. Noise such as the sample frames shown below:
What might be a simple, quick way to detect these kinds of noise? The difficulty is that the noise can be intermittent and unpredictable, and it still needs to be detected.
You can use simple image subtraction. Subtract two successive frames and take the mean of the result. If it is not close to zero, then you have your noise.
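The frame-subtraction idea above can be sketched with NumPy (the threshold value is scene-dependent and would need tuning on known-clean footage):

```python
import numpy as np

def noise_score(frame_a, frame_b):
    """Mean absolute difference between successive grayscale frames.

    Near zero for a static or slowly-changing scene; large when a frame
    is corrupted by snow, block artifacts, or partial distortion."""
    diff = np.abs(frame_a.astype(np.int16) - frame_b.astype(np.int16))
    return diff.mean()

def is_noisy(frame_a, frame_b, threshold=20.0):
    # threshold is an assumed, scene-dependent value -- tune on clean footage
    return noise_score(frame_a, frame_b) > threshold
```

One caveat: legitimate fast motion also raises the score, so for moving scenes you may want to compare against a running baseline of recent scores rather than a fixed threshold.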

Why, in the FFT output of a 1 Hz sine wave, does the 1 Hz magnitude behave like a sine wave?

I have been developing a small piece of software in .NET that takes a signal from a sensor in real time and displays its FFT, also in real time.
I have used the alglib library for the FFT function. My purpose is to observe the intensity of a particular frequency over time.
To check the software, I fed it a 1 Hz sine wave. The following image shows a screenshot: the upper graph shows the frequency spectrum, with a peak at 1 Hz. However, when this peak is observed over time, as shown in the lower graph, its intensity behaves like a sine wave.
My sampling frequency is 30 kHz. What I do not understand is how I am getting this sine signal, and why the magnitude at that frequency behaves like this.
This is an example of the effects of windowing. It derives from the fact that the FFT is only exact for perfectly periodic signals. When you window your signal, you take a smaller chunk that may not repeat perfectly; the FFT computes the spectrum of that chunk repeated infinitely. Since the chunk is not a whole number of sine periods, you don't get an exact value for the result. Furthermore, if your window length doesn't line up with a multiple of your signal's period, each successive window captures a slightly different chunk of the signal, phase-shifted with respect to the last, so the FFT is computing the spectrum of a different infinitely repeated signal each time. This phase difference is itself periodic, as the window catches up with the next period of your signal.
However, this would only explain small variations in the intensity. Assuming the axis labels on the bottom graph are correct (something you should double-check), something else is wrong. Your window might be too small (though I suspect not, because then you would see more spectral leakage). Another possibility is that you are plotting only the real part of the FFT rather than its magnitude. As the phase changes, the real and imaginary parts vary, but you would expect the magnitude to stay roughly the same.

Rapid motion and object detection in opencv

How can we detect rapid motion and objects simultaneously? Let me give an example:
suppose there is a soccer match video, and I want to detect the position of each player with maximum accuracy. I was thinking about human detection, but for a soccer match video plain human detection may not be enough, since we can also treat the players simply as objects. Maybe we could do this with blob detection, but blobs have several problems:
1) I want to separate each player, so if players collide, blob detection will not keep them apart; it will be a problem to identify each player separately.
2) The second problem is the stadium lighting.
So is there any particular algorithm, method, or library to do this?
I've seen some research papers but wasn't satisfied, so please suggest anything related: articles, algorithms, libraries, methods, research papers, etc. I'd like to hear everyone's views on this.
For fast and reliable human detection, Dalal and Triggs' Histogram of Oriented Gradients (HOG) is generally accepted as very good. Have you tried playing with that?
Since you mentioned rapid motion changes, are you worried about fast camera motion or fast player/ball motion?
You can do 2D or 3D video stabilization to fix camera motion (try the excellent Deshaker plugin for VirtualDub).
For fast player motion, background subtraction or other blob detection will definitely help. You can use that to get a rough kinematic estimate and use that as an estimate of your blur kernel. This can then be used to deblur the image chip containing the player.
You can do additional processing to establish identity, for example by OCRing jersey numbers.
You mentioned concern about lights on the stadium. Is the main issue that it will cast shadows? That can be dealt with by the HOG detector. Blob detection to get blur kernel should still work fine with the shadow.
If you have control over the camera, you may want to reduce exposure times to reduce blur. Denoising techniques can reduce the CCD noise that occurs in extreme low light, and dense optical flow approaches can align the frames so that adding the denoised frames boosts the signal back up to something reasonable.
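The "blob detection to rough kinematic estimate" step can be sketched with NumPy: background-subtract two frames, threshold, and use the centroid displacement as a per-frame velocity, which in turn gives a linear motion-blur kernel estimate. This is a toy single-blob version; a real pipeline would label connected components per player:

```python
import numpy as np

def blob_velocity(prev, curr, background, thresh=30):
    """Estimate blob velocity (pixels/frame) from two background-subtracted frames.

    Returns (dy, dx), usable as a rough linear blur-kernel estimate,
    or None if no blob is found in either frame."""
    def centroid(frame):
        mask = np.abs(frame.astype(np.int16) - background.astype(np.int16)) > thresh
        ys, xs = np.nonzero(mask)
        if len(xs) == 0:
            return None
        return np.array([ys.mean(), xs.mean()])

    c0, c1 = centroid(prev), centroid(curr)
    if c0 is None or c1 is None:
        return None
    return c1 - c0
```

In practice you would smooth the velocity over several frames (e.g. with a Kalman filter) before using it to build a deblurring kernel, since a single frame pair is noisy.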
