State of the art for sensor anomaly detection - machine learning

I am working on an anomaly detection problem and I need your help and expertise. I have a sensor that records episodic time series data. For example, once in a while, the sensor activates for 10 seconds and records values at millisecond intervals. My task is to identify whether a recorded pattern is abnormal; in other words, I need to detect anomalies in that pattern compared to the other recorded patterns.
What would be the state-of-the-art approaches to that?

After doing my own research, the following methods have proven to work very well in practice:
Variational Inference for On-line Anomaly Detection in High-Dimensional Time Series
Multivariate Industrial Time Series with Cyber-Attack Simulation: Fault Detection Using an LSTM-based Predictive Data Model
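The second reference is built on a common pattern: train an LSTM to forecast the next sample of the series and flag stretches where the prediction error is unusually large. Below is a minimal sketch of that forecast-residual idea in PyTorch on synthetic data; the window length, network size, number of epochs and the 3-sigma threshold are illustrative assumptions of mine, not the settings from the paper.

    # Minimal sketch of LSTM forecast-residual anomaly detection
    # (illustrative hyperparameters, not the configuration from the paper).
    import numpy as np
    import torch
    import torch.nn as nn

    class LSTMForecaster(nn.Module):
        def __init__(self, n_features=1, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_features)

        def forward(self, x):                 # x: (batch, window, n_features)
            out, _ = self.lstm(x)
            return self.head(out[:, -1, :])   # predict the next sample

    def make_windows(series, window=50):
        X = np.stack([series[i:i + window] for i in range(len(series) - window)])
        y = series[window:]
        return (torch.tensor(X, dtype=torch.float32).unsqueeze(-1),
                torch.tensor(y, dtype=torch.float32).unsqueeze(-1))

    # Train on recordings assumed to be normal (synthetic data here).
    normal = np.sin(np.linspace(0, 100, 5000)) + 0.05 * np.random.randn(5000)
    X, y = make_windows(normal)
    model = LSTMForecaster()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(5):                        # a few full-batch epochs, for brevity
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(X), y)
        loss.backward()
        opt.step()

    # Score windows by prediction error (here the training windows, for brevity;
    # in practice you would score each new recording). Large errors = anomalies.
    with torch.no_grad():
        errors = (model(X) - y).abs().squeeze().numpy()
    threshold = errors.mean() + 3 * errors.std()   # simple 3-sigma rule (assumption)
    print("anomalous samples:", int((errors > threshold).sum()))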

Related

Condition Based Monitoring | CBM

Machine Learning (ML) can do two things with a vibration/acoustic signal for Condition Based Monitoring (CBM):
1 . Feature Extraction and
2 . Classification
But if we look through the research, signal processing techniques are used for the pre-processing and ML for the rest, i.e. the classification. We could use ML alone for all of this, yet I keep seeing the two approaches merged: conventional signal processing plus ML.
I want to know the specific reason for that. Why do researchers use both when they could do it with ML only?
Yes, you can do so. However, the task becomes more complicated.
An FFT, for example, transforms the input space into a more meaningful representation. If you have rotating equipment, you would expect the spectrum to be concentrated mainly at the frequency of rotation. If there is a problem, the spectrum changes, and that change can often be detected by, for example, SVMs.
If you don't do the FFT but only give the raw signal, SVMs have a hard time.
Nevertheless, I've seen recent practical examples using deep convolutional networks which have learned to predict problems from raw vibration data. The disadvantage, however, is that you need more data. More data is not a problem in general, but if you take a wind turbine, for example, more failure data is obviously -- or hopefully ;-) -- hard to come by.
The other thing is that the ConvNet effectively learned the FFT all by itself. But why not use prior knowledge if you have it?
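To make the point concrete, here is a minimal sketch of the FFT-plus-SVM pipeline on synthetic vibration data: the magnitude spectrum of each window becomes the feature vector for an SVM classifier trained on labelled healthy/faulty examples. The signal lengths, fault frequency and SVM parameters are illustrative assumptions.

    # Sketch of the "FFT features + SVM" idea on synthetic vibration windows.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split

    def spectral_features(windows):
        # For rotating equipment the energy concentrates around the rotation
        # frequency; a fault shows up as extra components in the spectrum.
        return np.abs(np.fft.rfft(windows, axis=1))

    rng = np.random.default_rng(0)
    t = np.linspace(0, 1, 1000, endpoint=False)
    healthy = np.stack([np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(1000)
                        for _ in range(200)])                 # clean 50 Hz rotation
    faulty = np.stack([np.sin(2 * np.pi * 50 * t)
                       + 0.5 * np.sin(2 * np.pi * 120 * t)    # extra fault component
                       + 0.1 * rng.standard_normal(1000) for _ in range(200)])

    X = spectral_features(np.vstack([healthy, faulty]))
    y = np.array([0] * len(healthy) + [1] * len(faulty))
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

    clf = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)
    print("test accuracy on spectra:", clf.score(X_te, y_te))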

Training data and testing data from the same sensor

When using learning methods, we have training and testing data.
I'd like to confirm:
1) whether the training data and testing data must be captured from the same sensor;
2) what happens if they are from different sensors;
3) if they must be captured from the same sensor, whether there are any methods to harmonize the data even when they are not from the same sensor.
Thank you.
Yes, you would generally need both training and test data from the same sensor because of the measurement error and detection bias specific to that sensor. If the test data always came from a different sensor than the one the training data came from, you could have total system failure. Each sensor has its own precision, bias, detection limits, etc., so those characteristics have to be represented in both the training and the test sets.
The idea behind splitting into training and test sets is not so much what you are thinking of, but rather that the objects used in testing were never used in training; otherwise you get selection bias. You can, however, have objects from the same sensor in both training and testing.
If, however, the measurement wavelength or angle (pitch) of each sensor is different, then you are dealing more with a problem requiring MUltiple SIgnal Classification (MUSIC) or Pisarenko harmonic decomposition.
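Regarding question 3), one simple (and admittedly limited) way to pool data from different sensors is to standardize each sensor against its own statistics before combining; this removes offset and gain differences but not the wavelength/pitch mismatch mentioned above. A small sketch with pandas, using made-up column names and values:

    # Per-sensor z-scoring before pooling data from several sensors
    # (illustrative data; only corrects offset and scale differences).
    import pandas as pd

    df = pd.DataFrame({
        "sensor_id": ["A", "A", "A", "B", "B", "B"],
        "value":     [10.1, 10.4, 9.8, 101.0, 104.2, 98.7],   # B has a different gain
    })

    stats = df.groupby("sensor_id")["value"].agg(["mean", "std"])
    df = df.join(stats, on="sensor_id")
    df["value_z"] = (df["value"] - df["mean"]) / df["std"]
    print(df[["sensor_id", "value", "value_z"]])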

Anomaly detection algorithm for a univariate time series dataset

I have univariate time series data and I need to run an anomaly detection algorithm on it. Can anyone suggest a standard anomaly detection algorithm that works in most cases?
There is no such algorithm that "works in most cases". The task depends heavily on the specifics of your case, e.g. whether you need local anomalies, where a point differs from the points near it, or global ones, where a point does not look similar to any other point in the dataset.
A very good review of anomaly detection algorithms can be found here.
Perhaps you can easily try a one-class SVM, which is available in many libraries and programming languages. For instance, in Python you can use scikit-learn.
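As a concrete starting point for the one-class SVM suggestion, here is a minimal scikit-learn sketch: the univariate series is turned into overlapping lag windows, the model is fit on data assumed to be mostly normal, and rejected windows are reported as anomalies. The window length and the nu parameter are illustrative assumptions.

    # One-class SVM on lag windows of a univariate series (illustrative settings).
    import numpy as np
    from sklearn.svm import OneClassSVM

    def lag_windows(series, window=20):
        return np.stack([series[i:i + window]
                         for i in range(len(series) - window + 1)])

    rng = np.random.default_rng(42)
    series = np.sin(np.linspace(0, 60, 2000)) + 0.1 * rng.standard_normal(2000)
    series[1200:1210] += 3.0           # inject a local anomaly

    X = lag_windows(series)
    clf = OneClassSVM(nu=0.01, kernel="rbf", gamma="scale").fit(X)
    flags = clf.predict(X) == -1       # -1 marks windows the model rejects

    print("anomalous windows start near index:", np.where(flags)[0][:5])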

Geo-location clustering

I have streaming customer location data. For each event I need to check whether the location is one of the customer's usually visited locations, and generate an alert in real time if it is not.
I was looking at various clustering algorithms but couldn't find a good one that does it in real time.
K-means is too rigid about the number of centroids, and DBSCAN is heavyweight and I'm not sure it is fast enough to respond in real time.
Can you suggest one that suits real-time stream processing?
I believe DBSCAN is suitable enough. Its worst-case complexity is O(n²), which is decent compared to other traditional algorithms such as hierarchical clustering. As for k-means, I believe it is applicable if you use an ST_Centroid function from a spatial database such as SpatiaLite or PostGIS (assuming you are working with geographic data).
Between k-means and DBSCAN, I would choose DBSCAN, because I think the answer to your problem is a density-based approach given the real-time data.
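A rough sketch of the DBSCAN suggestion with scikit-learn: cluster the customer's historical visit locations using the haversine metric, then treat a new event that is not close to any clustered point as unusual. The 0.5 km radius and min_samples value are illustrative assumptions, and a production system would maintain or refit the clusters incrementally as events stream in.

    # DBSCAN over a customer's past (lat, lon) visits, haversine metric
    # (radius and min_samples are illustrative assumptions).
    import numpy as np
    from sklearn.cluster import DBSCAN
    from sklearn.metrics.pairwise import haversine_distances

    EARTH_RADIUS_KM = 6371.0
    eps_km = 0.5

    history = np.array([            # made-up example coordinates
        [52.5200, 13.4050], [52.5205, 13.4049], [52.5198, 13.4061],   # "home" area
        [52.5310, 13.3847], [52.5308, 13.3851],                       # "office" area
    ])

    db = DBSCAN(eps=eps_km / EARTH_RADIUS_KM, min_samples=2,
                metric="haversine").fit(np.radians(history))
    print("cluster labels of past visits:", db.labels_)

    def is_usual(event_latlon, history, labels, eps_km=0.5):
        # A new event is "usual" if it lies within eps_km of any clustered point.
        d = haversine_distances(np.radians([event_latlon]),
                                np.radians(history))[0] * EARTH_RADIUS_KM
        return bool(np.any((d <= eps_km) & (labels != -1)))

    print(is_usual([52.5201, 13.4052], history, db.labels_))  # near home -> True
    print(is_usual([48.8566, 2.3522], history, db.labels_))   # far away  -> False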

Time series Clustering

I have a number of sensors measuring temperature (or some other physical attribute). Does anyone know of a clustering method that can tell which sensors are showing similar patterns and behaviors? My series show some trends with cycles.
I am very new to time series analysis.
Thank you,
Basic K-means clustering works fine for most kinds of sensor data. You will need to take time slices to avoid autoregressive issues. Check out the corresponding procedure in R.
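A sketch of that suggestion in Python (scikit-learn) rather than R, for consistency with the other examples: each sensor's series is standardized, cut into fixed-length time slices, and K-means is run on the pooled slices; sensors behave alike if their slices land in the same clusters. The slice length, cluster count and synthetic data are illustrative assumptions, and strongly trending series may need detrending before slicing.

    # K-means on standardized time slices from several sensors (illustrative).
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)
    n, slice_len, k = 300, 50, 2
    t = np.linspace(0, 12 * np.pi, n)
    sensors = {                                  # synthetic temperature-like cycles
        "s1": 20 + 2 * np.sin(t) + 0.2 * rng.standard_normal(n),
        "s2": 20 + 2 * np.sin(t) + 0.2 * rng.standard_normal(n),
        "s3": 20 - 2 * np.sin(t) + 0.2 * rng.standard_normal(n),   # out of phase
    }

    def slices(series):
        z = (series - series.mean()) / series.std()     # per-sensor standardization
        return z[: n // slice_len * slice_len].reshape(-1, slice_len)

    all_slices = np.vstack([slices(s) for s in sensors.values()])
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(all_slices)

    # Profile of each sensor = how its slices are distributed over the clusters.
    for name, lab in zip(sensors, labels.reshape(len(sensors), -1)):
        print(name, np.bincount(lab, minlength=k))      # s1 and s2 should match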

Resources