I've been working on human activity recognition using a UCI dataset. 561 features are computed from the raw accelerometer and gyroscope signals, and based on these 561 features a high accuracy is achieved. Now I would like to use fewer features while achieving comparable results. Is there an algorithm I can implement to select the most significant features?
Any suggestions will be appreciated.
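One standard option, as a minimal sketch (my suggestion, not something from the thread; the data here is synthetic): recursive feature elimination with cross-validation in scikit-learn, which repeatedly refits a linear model and drops the weakest features, letting you keep a small subset that preserves accuracy.

    import numpy as np
    from sklearn.feature_selection import RFECV
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 561))    # stand-in for the 561 HAR features
    y = rng.integers(0, 6, size=200)   # 6 activity classes, synthetic labels

    # Drop 50 features per step; cross-validation picks the best subset size.
    selector = RFECV(LinearSVC(dual=False), step=50, cv=3)
    selector.fit(X, y)
    print(selector.n_features_)        # number of features retained
    X_reduced = selector.transform(X)  # reduced matrix for a leaner model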
I am working on an anomaly detection problem and I need your help and expertise. I have a sensor that records episodic time series data: once in a while, the sensor activates for 10 seconds and records values at millisecond intervals. My task is to determine whether a recorded pattern is abnormal, i.e., to detect anomalies in that pattern compared to the other recorded patterns.
What would be the state-of-the-art approaches to that?
After doing my own research, I found that the following methods have proven to work very well in practice:
Variational Inference for On-line Anomaly Detection in High-Dimensional Time Series
Multivariate Industrial Time Series with Cyber-Attack Simulation: Fault Detection Using an LSTM-based Predictive Data Model
I am planning to do SVM classification on multidimensional sensor data. There are two classes and 13 sensors. Suppose that I want to extract features, e.g. average, standard deviation, etc. I read somewhere that we need to do feature scaling before applying an SVM. I am wondering when I should do the scaling: before or after extracting the features?
As you have read, and as already pointed out, you would:
do feature derivation
do feature normalization (scaling, deskewing if necessary, etc)
hand data to training/evaluating model(s).
For the example you mentioned, just to be clear: I assume you want to derive the same features for each sample, so that you have e.g. a mean feature, a standard deviation feature, etc. per sample - which is how it should be done. Normalization, in turn, has to be done per feature over all samples.
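To make the order of operations concrete, here is a minimal sketch in scikit-learn (the library choice and array shapes are my assumptions, not from the question):

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    windows = rng.normal(size=(100, 50, 13))  # 100 samples, 50 timesteps, 13 sensors
    labels = rng.integers(0, 2, size=100)     # two classes

    # 1. Feature derivation: mean and std per sensor -> 26 features per sample.
    features = np.hstack([windows.mean(axis=1), windows.std(axis=1)])

    # 2. Feature normalization: fitted per feature, over all samples.
    X = StandardScaler().fit_transform(features)

    # 3. Hand the scaled features to the model.
    clf = SVC(kernel="rbf").fit(X, labels)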
I'm working on replicating results from a paper, and when the authors describe their SVM setup they say this:
To increase the dimensionality of our feature vectors to be better suited to SVMs, we expanded the feature space by taking the polynomial combinations of degree less than or equal to 2, of all features. This increased the number of features from 12 to 91.
How would you do this in the GUI version of Weka?
I really can't figure out what setting they changed to increase the number of attributes by 79. I've searched the internet and the Weka documentation, and even just clicked around in the GUI, but I can't find any functionality that would do this.
Thank you for your help!
It seems the authors of the paper do not really understand how SVMs work. Simply train an SVM with a polynomial kernel of degree 2 and you will get the same expressive power.
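If you do want the explicit expansion rather than the kernel trick, a short sketch with scikit-learn's PolynomialFeatures (my tooling choice; I can't speak to the Weka GUI) reproduces the 12 -> 91 count:

    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.svm import SVC

    X = np.random.rand(5, 12)            # 5 samples, 12 original features
    poly = PolynomialFeatures(degree=2)  # bias + linear + all degree-2 terms
    X_poly = poly.fit_transform(X)
    print(X_poly.shape)                  # (5, 91): 1 + 12 + 78 = 91

    # The kernel-based equivalent suggested above, staying in 12 dimensions;
    # coef0=1 makes the implicit feature space include the lower-order terms.
    clf = SVC(kernel="poly", degree=2, coef0=1)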
How can we use machine learning to extract human voice from an audio clip that may contain a lot of noise across the whole frequency domain?
As in any ML application, the process is simple: collect samples, design features, train the classifier. For the samples you can use your own noisy recordings, or you can find many noises in web sound collections like freesound.org. For the features you can use mean-normalized mel-frequency cepstral coefficients; an implementation is available in the CMUSphinx speech recognition toolkit. For the classifier you can pick a GMM or an SVM. If you have enough data it will work fairly well.
To improve accuracy you can add the assumption that noise and voice are continuous: analyze the detection history with a hangover scheme (essentially an HMM) to detect voice chunks, instead of analyzing every frame individually.
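A hedged sketch of that pipeline, substituting librosa and scikit-learn for the CMUSphinx implementation mentioned above (the component counts and hangover length are placeholder choices):

    import numpy as np
    import librosa
    from sklearn.mixture import GaussianMixture

    def mfcc_frames(y, sr):
        # 13 mel-frequency cepstral coefficients per frame, mean-normalized.
        m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T
        return m - m.mean(axis=0)

    def train(voice_clips, noise_clips):
        # voice_clips / noise_clips: lists of (signal, sample_rate) pairs you
        # collected yourself or from freesound.org (placeholder names).
        voice = GaussianMixture(n_components=8).fit(
            np.vstack([mfcc_frames(y, sr) for y, sr in voice_clips]))
        noise = GaussianMixture(n_components=8).fit(
            np.vstack([mfcc_frames(y, sr) for y, sr in noise_clips]))
        return voice, noise

    def detect(y, sr, voice, noise, hang=5):
        X = mfcc_frames(y, sr)
        frame_is_voice = voice.score_samples(X) > noise.score_samples(X)
        # Hangover: hold the voice decision for `hang` frames after the last
        # voiced frame, smoothing per-frame flicker into contiguous chunks.
        out = np.zeros(len(frame_is_voice), dtype=bool)
        count = 0
        for i, v in enumerate(frame_is_voice):
            count = hang if v else max(count - 1, 0)
            out[i] = count > 0
        return out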
I've built an algorithm for pedestrian detection using OpenCV tools. To perform classification I use a boosted classifier trained with the CvBoost class.
The problem with this implementation is that I need to feed the classifier the whole set of features I used for training. This makes the algorithm extremely slow: each image takes around 20 seconds to be fully analysed.
I need a different detection structure, and OpenCV has a Soft Cascade class that seems like exactly what I need. Its basic principle is that there is no need to examine all the features of a test sample, since the detector can reject most negative samples using only a small number of features. The problem is that I have no idea how to train one given a fully labeled set of positive and negative examples.
I can find no information about this online, so I am looking for any tips you can give me on how to use this soft cascade for classification.
Best regards
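For what it's worth, here is a sketch of the soft-cascade evaluation principle the question describes, i.e. early rejection against per-stage thresholds. It is not OpenCV's actual API, and it does not cover training, which remains the open question:

    def soft_cascade_score(sample, weak_classifiers, thresholds):
        # weak_classifiers: boosted stages, each a function sample -> float;
        # thresholds: per-stage rejection thresholds learned during training.
        score = 0.0
        for weak, theta in zip(weak_classifiers, thresholds):
            score += weak(sample)
            if score < theta:  # early rejection: most negatives exit here,
                return None    # so the remaining features are never computed
        return score           # survived all stages: candidate pedestrian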