I am trying to detect a curve of a certain shape and its position from a signal as shown below:
(link to picture: http://tinypic.com/view.php?pic=ab5j45&s=6)
I would be getting the signal as an array of floats.
Due to noise and other variations, the curve may not be exact, so I cannot use simple number matching. I was wondering if there is something in OpenCV which I can use for this.
Note that I will need to detect curves of different shapes and their positions in the signal, but if I know how to detect one type, I can use the same method to detect other types.
Regards,
Peter
I would try to define a parametric mathematical function representing the shape you want to match.
Then all you need to do is apply a technique (for instance, least squares) to get the values of the parameters that best match the curve over your signal.
You may want to match your function against a sliding window, especially if you want to match multiple events in your signal.
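A minimal sketch of the sliding-window idea, assuming the simplest possible parametric model: each window is a scaled and shifted copy of a fixed template, `window ~ a * template + b`. The names `fit_window`, `best_match` and `min_scale` are made up for this illustration, and the sample values are invented. A real shape would likely need more parameters than a scale and an offset.

```python
def fit_window(window, template):
    """Least-squares fit of window ~ a * template + b; returns (a, b, sse)."""
    n = len(template)
    mean_t = sum(template) / n
    mean_w = sum(window) / n
    var_t = sum((t - mean_t) ** 2 for t in template)
    cov = sum((t - mean_t) * (w - mean_w) for t, w in zip(template, window))
    a = cov / var_t  # assumes the template is not constant
    b = mean_w - a * mean_t
    sse = sum((w - (a * t + b)) ** 2 for t, w in zip(template, window))
    return a, b, sse


def best_match(signal, template, min_scale=0.5):
    """Slide the template over the signal and keep the window with the
    smallest squared error among fits whose scale is at least min_scale
    (flat windows fit trivially with a scale near zero, so they are skipped)."""
    n = len(template)
    best = None
    for offset in range(len(signal) - n + 1):
        a, b, sse = fit_window(signal[offset:offset + n], template)
        if a >= min_scale and (best is None or sse < best[3]):
            best = (offset, a, b, sse)
    return best


# Invented data: a triangular bump, scaled by about 1.5 and lifted by
# about 0.2, hidden at offset 4 of a slightly noisy signal.
template = [0.0, 1.0, 2.0, 1.0, 0.0]
signal = [0.21, 0.19, 0.2, 0.22, 0.2, 1.72, 3.18, 1.7, 0.19, 0.2, 0.21, 0.2]
offset, scale, shift, sse = best_match(signal, template)
print(offset, round(scale, 2), round(shift, 2))
```

To match multiple events, you would keep every window whose error is below some threshold instead of only the best one.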
Noise and "other variations" are high frequencies, so you need to filter the signal with a low-pass filter (filtering is done with the convolution operation). It seems that your signal has very low frequencies (below 5 kHz, maybe?). After filtering, look at your signal, and when you get the desired curve shape, apply numerical matching.
A matched filter has its highest peak at the integer position in the signal that best matches the given shape (or the energy of the pattern to be matched). In addition to that, the neighboring values of the matched filter output can often be used to fine-tune the position by calculating tau = a-b/(a+b) (IIRC), where a is the peak value and b is the second-best value.
This works especially well if the signal to be matched has good autocorrelation characteristics: one high peak and close to zero at +-1 from the peak (which basically means detecting pilot signals).
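A rough sketch of the integer-position part of a matched filter, written as a plain cross-correlation. The pattern here is a length-7 Barker code, chosen for the example only because its autocorrelation has one high peak and small sidelobes; the function names are invented, and the sub-sample refinement with tau is left out.

```python
def matched_filter(signal, template):
    """Cross-correlation: output[k] = sum_i signal[k + i] * template[i]."""
    n = len(template)
    return [
        sum(signal[k + i] * template[i] for i in range(n))
        for k in range(len(signal) - n + 1)
    ]


def peak_position(output):
    """Index of the largest matched-filter output value."""
    return max(range(len(output)), key=lambda k: output[k])


# Assumed pilot pattern: Barker code of length 7, placed at offset 5.
barker7 = [1, 1, 1, -1, -1, 1, -1]
signal = [0] * 5 + barker7 + [0] * 4
output = matched_filter(signal, barker7)
position = peak_position(output)
print(position, output[position])
```

For shapes that are not zero-mean, subtracting the template mean first usually helps, since otherwise any large stretch of the signal produces a large output.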
Related
I have a bunch of gray-scale images decomposed into superpixels. Each superpixel in these images has a label in the range [0, 1]. You can see one sample image below.
Here is the challenge: I want the spatially (locally) neighboring superpixels to have consistent labels (close in value).
I'm kind of interested in smoothing local labels but do not want to apply Gaussian smoothing functions or whatever, as some colleagues suggested. I have also heard about Conditional Random Field (CRF). Is it helpful?
Any suggestion would be welcome.
I'm kind of interested in smoothing local labels but do not want to apply Gaussian smoothing functions or whatever, as some colleagues suggested.
And why is that? Why not consider the helpful advice of your colleagues, who are actually right? Applying a smoothing function is the most reasonable way to go.
I have also heard about Conditional Random Field (CRF). Is it helpful?
This also suggests that you should go with your colleagues' advice, as CRF has nothing to do with your problem. CRF is a classifier (a sequence classifier, to be exact) that requires labeled examples to learn from, and it has nothing to do with the setting presented.
What are typical approaches?
Exactly what your colleagues proposed: define a smoothing function and apply it to your function values and their neighbourhood. (I will not use the term "labels" as it is misleading: you have continuous values in [0,1], whereas "label" denotes a categorical variable in machine learning.)
Another approach would be to define some optimization problem, where your current assignment of values is one goal, and the second one is "closeness", for example:
Let us assume that you have points with values {(x_i, y_i)}_{i=1}^N and that n(x) returns indices of neighbouring points of x.
Consequently you are trying to find {a_i}_{i=1}^N such that they minimize
\sum_{i=1}^N (y_i - a_i)^2 + C \sum_{i=1}^N \sum_{j \in n(x_i)} (a_i - a_j)^2
where the first sum measures closeness to the current values, C is a constant that weights the two parts, and the second sum measures closeness to neighbouring values.
You can solve the above optimization problem using many techniques, for example with the scipy.optimize.minimize function.
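Here is a small sketch of that objective on a toy neighbourhood graph. To keep it dependency-free it does not call scipy; instead it swaps in a simple coordinate-wise update (each a_i is set to the value that minimises the objective with the other values held fixed), which works because the objective is quadratic. The chain-shaped neighbourhood, the values and the function names are all made up, and the update assumes the neighbour lists are symmetric.

```python
def objective(a, y, neighbours, C):
    """sum_i (y_i - a_i)^2 + C * sum_i sum_{j in n(i)} (a_i - a_j)^2"""
    data = sum((yi - ai) ** 2 for yi, ai in zip(y, a))
    smooth = sum(
        (a[i] - a[j]) ** 2 for i in range(len(a)) for j in neighbours[i]
    )
    return data + C * smooth


def smooth_values(y, neighbours, C, sweeps=200):
    """Minimise the objective one coordinate at a time.

    Assumes symmetric neighbour lists (j in neighbours[i] iff i in
    neighbours[j]), so each pair appears twice in the double sum. Setting
    the derivative with respect to a_i to zero then gives
    a_i = (y_i + 2C * sum_j a_j) / (1 + 2C * |n(i)|).
    """
    a = list(y)
    for _ in range(sweeps):
        for i, nbrs in enumerate(neighbours):
            a[i] = (y[i] + 2 * C * sum(a[j] for j in nbrs)) / (
                1 + 2 * C * len(nbrs)
            )
    return a


# Toy example: five superpixels in a row, each adjacent to the next one.
y = [0.1, 0.9, 0.2, 0.8, 0.15]
neighbours = [[1], [0, 2], [1, 3], [2, 4], [3]]
smoothed = smooth_values(y, neighbours, C=1.0)
print([round(v, 3) for v in smoothed])
```

A larger C pulls neighbouring values closer together; C = 0 leaves the values unchanged. The same `objective` function could be handed to a general-purpose minimiser instead.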
I am not sure that your request makes any sense.
Having close label values for nearby superpixels is trivial: take some smooth function of (X, Y), such as constant or affine, taking values in the range [0,1], and assign the function value to the superpixel centered at (X, Y).
You could also take the distance function from any point in the plane.
But this is of no use as it is unrelated to the image content.
I need to know a way to make FFT (DFT) work with just n points, where n is not a power of 2.
I want to analyze and modify the sound spectrum, in particular of WAV files, which commonly have 44100 sampling points. But my FFT does not work; it only works when the number of points is of the form 2^n.
So what can I do, besides padding the vector with zeros up to the next power of 2?
Any way to modify the FFT algorithm?
Thanks!
You can use the FFTW library or the code generators of the Spiral project. They implement the FFT for sizes with small prime factors, and break down large prime factors p by reducing them to an FFT of size (p-1), which is even, and so on.
However, just for signal analysis it is questionable why you want to analyze exactly one second of sound and not smaller units. Also, you may want to use a windowing procedure to avoid the jumps at the ends of the segment.
Aside from padding the array as you suggest, or using some other library function, you can construct a Fourier transform with arbitrary length and spacing in the frequency domain (also for non-integer sample spacings).
This is a well-known result and is based on the Chirp-z transform (or Bluestein's FFT). Another good reference is given by Rabiner and can be found at the above link.
In summary, with this approach you don't have to write the FFT yourself; you can simply use an existing high-performance FFT and then apply the convolution theorem to a suitably scaled and conditioned version of your signal.
The performance will still be O(n*log n), multiplied by some implementation-dependent scaling factor.
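To make the idea concrete, here is a sketch of Bluestein's approach in plain Python. The small recursive `fft_pow2` stands in for "an existing FFT that only handles powers of two"; in practice you would plug in your own. The function names are invented for this example, and pure Python like this is far too slow for real audio work.

```python
import cmath


def fft_pow2(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_pow2(x[0::2], inverse)
    odd = fft_pow2(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def bluestein_dft(x):
    """DFT of any length n, using only power-of-two FFTs.

    Uses j*k = (j^2 + k^2 - (k - j)^2) / 2 to rewrite the DFT as a
    convolution with a chirp, then evaluates that convolution with
    zero-padded FFTs of length m >= 2n - 1.
    """
    n = len(x)
    m = 1
    while m < 2 * n - 1:
        m *= 2
    # chirp[k] = exp(-i*pi*k^2/n); k^2 is reduced mod 2n to keep angles small
    chirp = [cmath.exp(-1j * cmath.pi * ((k * k) % (2 * n)) / n) for k in range(n)]
    a = [x[k] * chirp[k] for k in range(n)] + [0j] * (m - n)
    b = [0j] * m
    b[0] = chirp[0].conjugate()
    for k in range(1, n):
        b[k] = b[m - k] = chirp[k].conjugate()  # wrap negative lags around
    fa, fb = fft_pow2(a), fft_pow2(b)
    conv = fft_pow2([p * q for p, q in zip(fa, fb)], inverse=True)
    return [chirp[k] * conv[k] / m for k in range(n)]


# A length-7 input, which a plain radix-2 FFT cannot handle directly.
spectrum = bluestein_dft([1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0])
print([round(abs(v), 3) for v in spectrum])
```

Unlike zero-padding the signal itself, this gives the exact n-point DFT, so the frequency bins stay where you expect them.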
The FFT is just a faster method of computing the DFT for certain length vectors; and a DFT can be computed for any length of input vector. You can also zero-pad your input vector to a length supported by your FFT library, which may be faster.
If you want to modify your sound file, you may need to use the overlap-add or overlap-save fast convolution filtering after determining the length of the impulse response of your frequency domain modification.
I'm trying to detect a pattern like this in some images
The actual image looks something like this
It could be scaled and/or rotated. Is there a way to do that efficiently without resorting to neural nets or some learning algorithm? Can some detection be done based on the value gradient for example (dark-bright-dark-bright-dark)?
The input image is MxN (in your example M<N):
take the mean of the RGB channels
take the mean along Y to get a 1xN vector
differentiate
take the absolute value
threshold
calculate the distances between peaks
search for a location where the ratio between the distances is as expected (from what I see in your example, about 1:7:1)
if such a place is found, validate the colors in the middle of each distance (from your example they should be white-black-white)
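The steps above can be sketched roughly as follows. The image is represented as a list of rows of (r, g, b) tuples, and the code assumes sharp edges so that every transition shows up as a single above-threshold column; with blurred edges you would need proper peak picking. All names, the threshold and the tolerance are invented for this example, and rotation is not handled.

```python
def column_profile(image):
    """Mean of the RGB channels, then mean over rows: one value per column."""
    rows, cols = len(image), len(image[0])
    return [
        sum(sum(image[r][c]) / 3.0 for r in range(rows)) / rows
        for c in range(cols)
    ]


def edge_positions(profile, threshold):
    """Columns where the absolute first difference exceeds the threshold."""
    return [
        c for c in range(1, len(profile))
        if abs(profile[c] - profile[c - 1]) > threshold
    ]


def find_ratio(edges, expected, tolerance=0.25):
    """First edge whose following gaps match the expected ratio, else None."""
    gaps = [b - a for a, b in zip(edges, edges[1:])]
    k = len(expected)
    for i in range(len(gaps) - k + 1):
        unit = gaps[i] / expected[0]  # scale factor implied by the first gap
        if all(
            abs(gaps[i + j] / unit - expected[j]) <= tolerance * expected[j]
            for j in range(k)
        ):
            return edges[i]
    return None


def middle_values(profile, edges):
    """Profile value halfway between consecutive edges, for the colour check."""
    return [profile[(a + b) // 2] for a, b in zip(edges, edges[1:])]


# Synthetic test row: dark, 2 bright, 14 dark, 2 bright, dark (gaps 1:7:1).
dark, bright = (0, 0, 0), (255, 255, 255)
row = [dark] * 10 + [bright] * 2 + [dark] * 14 + [bright] * 2 + [dark] * 10
image = [row, row, row]

profile = column_profile(image)
edges = edge_positions(profile, threshold=100)
start = find_ratio(edges, expected=[1, 7, 1])
print(edges, start, middle_values(profile, edges))
```

Because only the ratio between gaps is checked, the same code tolerates a scaled version of the pattern.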
You might be able to use Gabor Filters at varying orientations, and do standard threshold to identify objects.
If you know the frequency of the pattern you could try using a bandpass filter to isolate objects at that frequency. If it is a very strong frequency, you might be able to identify it in the image's Fourier transform.
Without much other knowledge about what you are looking for in your image, it will be very difficult to identify a specific repeating pattern.
I'm working on signal processing issues. I'm extracting some features for feeding a classifier. Among these features, there is the sum of first 5 FFT coefficients. As you know primary FFT coefficients actually indicate how dominant low frequency components of a signal are. This is very close to what a low-pass filter gives.
Here I'm suspicious about whether computing the FFT just to take those first 5 coefficients is an unnecessary task. I think applying a low-pass filter will just eliminate high-frequency components and won't have a significant effect on the primary FFT coefficients. However, there may be some other way, in combination with a low-pass filter, to extract the same information (that is contained in the first five FFT coefficients) without using the FFT.
Do you have any ideas or suggestions regarding this issue?
Thanks in advance.
If you just need an indicator for the low-frequency part of a signal, I suggest doing something really simple. Just take an ordinary low-pass filter, for instance a 2nd-order Butterworth, with the cutoff frequency set appropriately (5 Hz in your case, if I understood right). Then compute the energy (sum over squared values) or RMS value over your window (length 100). Or perhaps take the ratio of the low-frequency energy and the overall energy of the window, to get a relative measure. That should give you a pretty good indicator of the low-frequency contributions to your signal.
People tend to overuse the FFT for all kinds of really simple tasks. In 90% of the use cases an FFT can be replaced by a simpler algorithm.
It seems you should take a look at the Goertzel algorithm: for the seemingly limited number of frequencies you need, it should take less computation. After updating the feedback parts on each sample, you can select how often to generate your "feature metric"; a little additional weighting of the results can yield a respectable low-pass filter.
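As an illustration of the low-pass-plus-energy idea, here is a sketch with a hand-rolled 2nd-order Butterworth low-pass written as a biquad. The coefficients follow the widely used audio-EQ "cookbook" low-pass formulas with Q = 1/sqrt(2). The 100 Hz sample rate is purely an assumption, since the question does not state it, and the function names are invented.

```python
import math


def butterworth_lowpass_coeffs(cutoff_hz, sample_rate_hz):
    """Normalised (b, a) coefficients of a 2nd-order Butterworth low-pass."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / math.sqrt(2.0)  # sin(w0) / (2*Q), Q = 1/sqrt(2)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 - cos_w0) / 2.0 / a0, (1.0 - cos_w0) / a0, (1.0 - cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def lowpass(signal, b, a):
    """Apply the biquad in direct form I, starting from a zero state."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def low_freq_ratio(window, cutoff_hz, sample_rate_hz):
    """Energy of the low-passed window divided by the total window energy."""
    b, a = butterworth_lowpass_coeffs(cutoff_hz, sample_rate_hz)
    total = sum(v * v for v in window)
    if total == 0.0:
        return 0.0
    return sum(v * v for v in lowpass(window, b, a)) / total


# Assumed setup: 100-sample windows at 100 Hz, cutoff at 5 Hz.
fs = 100.0
slow = [math.sin(2.0 * math.pi * 1.0 * n / fs) for n in range(100)]
fast = [math.sin(2.0 * math.pi * 30.0 * n / fs) for n in range(100)]
print(low_freq_ratio(slow, 5.0, fs), low_freq_ratio(fast, 5.0, fs))
```

The ratio comes out close to 1 for a window dominated by low frequencies and close to 0 for one dominated by high frequencies, which makes it usable as a single feature.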
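For reference, a minimal sketch of the Goertzel recurrence for a single DFT bin, plus a feature built from the first five bins. The question does not say whether "sum of the first 5 FFT coefficients" means complex values or magnitudes; this sketch assumes magnitudes, and the function names are made up.

```python
import math


def goertzel_power(samples, k):
    """Squared magnitude of DFT bin k, without computing the other bins."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2  # feedback part, updated once per sample
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def low_bins_feature(samples, bins=5):
    """Sum of the magnitudes of DFT bins 0 .. bins-1 (assumed feature)."""
    return sum(
        math.sqrt(max(goertzel_power(samples, k), 0.0)) for k in range(bins)
    )


# Example window: a constant offset plus two cycles of a sine.
window = [0.5 + math.sin(2.0 * math.pi * 2.0 * i / 32) for i in range(32)]
print(low_bins_feature(window))
```

Each bin costs one multiply and two additions per sample, so for five bins this is cheaper than a full FFT on long windows, though for short windows the difference may not matter.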
I would like to create an application which can learn to classify a sequence of points drawn by a user, e.g. something like handwriting recognition. If the data point consists of a number of (x,y) pairs (like the pixels corresponding to a gesture instance), what are the best features to compute about the instance which would make for a good multi-class classifier (e.g. SVM, NN, etc)? Particularly if there are limited training examples provided.
If I were you, I would find the data points that correspond with corners, end points and intersections, use those as features and discard the intermediate points. You could include the angle or some other descriptor of these interest points as well.
For detecting interest points you could use a Harris detector; you could then use the gradient value at that point as a simple descriptor. Alternatively, you could go with a fancier method like SIFT.
You could use the descriptor of every pixel in your downsampled image and then classify with SVM. The disadvantage of that is that there would be a large amount of uninteresting data points in the feature vector.
An alternative would be to approach it not as a classification problem but as a template matching problem (fairly common in computer vision). In this case a gesture can be specified as an arbitrary number of interest points, completely leaving out the non-interesting data. A certain threshold percentage of an instance's points has to match a template for a positive identification. For example, when matching the corner points of an instance of 'R' against the template for 'X', the bottom right points should match, being end points in the same position and orientation, but the others are too dissimilar, giving a fairly low score, and the identification R=X will be rejected.
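A very rough sketch of such a matching score, assuming the interest points have already been extracted and normalised to the unit square. It only compares positions (no orientation or other descriptor), and it scores in both directions, which is an addition of this sketch so that a small template cannot match simply because it is contained in a larger one. The templates, thresholds and names are invented.

```python
import math


def fraction_matched(points, reference, max_distance):
    """Fraction of points that have a reference point within max_distance."""
    hits = sum(
        1 for (px, py) in points
        if any(math.hypot(px - rx, py - ry) <= max_distance for rx, ry in reference)
    )
    return hits / len(points)


def match_score(instance, template, max_distance):
    """Smaller of the two directional fractions (instance->template and back)."""
    return min(
        fraction_matched(instance, template, max_distance),
        fraction_matched(template, instance, max_distance),
    )


def classify(instance, templates, max_distance=0.15, min_score=0.75):
    """Label of the best-scoring template, or None if nothing scores enough."""
    best_label, best_score = None, 0.0
    for label, template in templates.items():
        score = match_score(instance, template, max_distance)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= min_score else None


# Hypothetical templates: corner/end points of an 'L' and an 'X'.
templates = {
    "L": [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "X": [(0.0, 0.0), (1.0, 0.0), (0.5, 0.5), (0.0, 1.0), (1.0, 1.0)],
}
drawn = [(0.05, 0.02), (0.02, 0.97), (0.95, 1.0)]
print(classify(drawn, templates))
```

With few training examples, the templates could simply be the interest points of one or two cleaned-up samples per gesture.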