using LSTM on time series with different intervals - machine-learning

I want to build a classifier for time series. Each point in a series has multiple features and a timestamp. Sometimes there is 1 second between two points, but sometimes there can be 1 minute between timestamps.
I thought of giving the time elapsed since the previous point as a feature.
Can an LSTM handle that?

Ultimately I think you are going to have to play with the data and see what works for your particular problem, but here are some thoughts.
I have done something similar. My data contained regular gaps during part of the day and providing the time of day as a feature proved to be beneficial, however in this case it was likely useful in more ways than adjusting for the gaps.
If the size of the gap to the previous timestamp contains information that is useful to the network, then definitely include it. If the gap is there because data is missing, then that might not be very useful, but it's worth a try.
If the data at each point is statistically similar regardless of the size of the gap then you may be able to simply feed them in as if there are no gaps.
If the gaps are causing the data to be non-stationary, then that could make it harder for the network to learn. That comes back to your question: can providing the gap size let the network correct for the non-stationary nature of the time series? It is possible, but probably not ideal.
You might also want to try interpolation to fill in the missing gaps, and re-sampling the data to the level of granularity that is actually important for your prediction.
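A minimal sketch of the two preprocessing options discussed above: adding the gap to the previous point as an extra feature, and resampling onto a regular grid by linear interpolation. The function names `add_gap_feature` and `resample_linear` are hypothetical, timestamps are assumed to be numeric (seconds) and strictly increasing, and the first point's gap is arbitrarily set to 0.

```python
from bisect import bisect_right


def add_gap_feature(timestamps, features):
    """Append the time since the previous point to each feature vector.

    The first point has no predecessor, so its gap is set to 0.0 (an assumption).
    """
    rows = []
    prev = None
    for t, feats in zip(timestamps, features):
        gap = 0.0 if prev is None else t - prev
        rows.append(list(feats) + [gap])
        prev = t
    return rows


def resample_linear(timestamps, values, step):
    """Linearly interpolate one feature onto a regular grid with spacing `step`.

    Assumes at least two samples and strictly increasing timestamps.
    """
    start, end = timestamps[0], timestamps[-1]
    n_points = int((end - start) // step) + 1
    grid, out = [], []
    for k in range(n_points):
        t = start + k * step
        # Index of the last original sample at or before t (clamped to a valid segment).
        i = min(bisect_right(timestamps, t) - 1, len(timestamps) - 2)
        t0, t1 = timestamps[i], timestamps[i + 1]
        w = (t - t0) / (t1 - t0)
        grid.append(t)
        out.append(values[i] + w * (values[i + 1] - values[i]))
    return grid, out
```

Whether the gap feature or the resampled series works better is something to test on the actual data. Both outputs can be fed to an LSTM as ordinary per-step inputs.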

Related

Evaluating the confidence of an image registration process

Background:
Assume there are two shots of the same scene from two different perspectives. Applying a registration algorithm to them yields a homography matrix that represents the relation between them. Warping one of them with this homography matrix will (theoretically) result in two identical images (if the non-shared area is ignored).
Since nothing is perfect, the two images may not be absolutely identical. There may be some differences between them, and these differences show up clearly when subtracting one image from the other.
Example:
Furthermore, differing lighting conditions may result in a huge difference when subtracting.
Problem:
I am looking for a metric with which I can evaluate the accuracy of the registration process. This metric should be:
Normalized: a measurement from 0 to 1 that does not depend on the image type (natural scene, text, human...). For example, if two totally different registration processes on totally different pairs of photos have the same confidence, say 0.5, this means that the registration was equally good (or bad). This should apply even if one pair consists of very detail-rich photos and the other of a white background with "Hello" written in black.
Able to distinguish between misregistration and different lighting conditions: Although there are many ways to eliminate this difference and make the two images look approximately the same, I am looking for a measurement that does not count lighting differences, rather than fixing them (for performance reasons).
One of the first things that came to mind is to sum the absolute differences of the two images. However, this results in a number that represents the error. This number has no meaning when you want to compare it to another registration process, because another pair of images with better registration but more detail may give a bigger error rather than a smaller one.
Sorry for the long post. I am glad to provide any further information and to collaborate in finding the solution.
P.S. Using OpenCV is acceptable and preferable.
You can always use invariant (lighting/scale/rotation) features in both images, for example SIFT features.
When you match these using the typical ratio test (between the nearest and next-nearest match), you'll have a large set of matches. You can calculate the homography using your method, or using RANSAC on these matches.
In any case, for any homography candidate, you can count the feature matches (out of all of them) that agree with the model.
That count divided by the total number of matches gives you a metric from 0 to 1 for the quality of the model.
If you use RANSAC using the matches to calculate the homography, the quality metric is already built in.
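The inlier-ratio metric described above can be sketched in a few lines. This is an illustration only: `project` and `inlier_ratio` are hypothetical helper names, the homography is a plain 3x3 nested list, matches are `(source_point, destination_point)` pairs, and the 3-pixel tolerance is an arbitrary assumption.

```python
import math


def project(H, point):
    """Apply a 3x3 homography (nested lists) to a 2D point."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def inlier_ratio(H, matches, tol=3.0):
    """Fraction of matches (src, dst) whose reprojection error is below `tol` pixels."""
    if not matches:
        return 0.0
    inliers = 0
    for src, dst in matches:
        u, v = project(H, src)
        if math.hypot(u - dst[0], v - dst[1]) < tol:
            inliers += 1
    return inliers / len(matches)
```

With OpenCV, `cv2.findHomography` called with the `cv2.RANSAC` method returns an inlier mask alongside the matrix, so the same ratio should be obtainable as the mean of that mask.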
This problem is: given two images, decide how misaligned they are.
That's why we did the registration. The registration approach cannot itself tell how bad a job it did, because if it knew, it would have done better.
Only in the absolutely correct case do we know the result: 0.
You want a deterministic answer? Then add deterministic input,
for example a red square in a given fixed position, for which you can measure how rotated, translated, and scaled it is. Under lab conditions this can be achieved.

How to defend thresholding technique

On a job for a customer, I am locating items within a grayscale scene with nonuniform background illumination. Once the items are located, I need to do another search within each one for details. The items are easy enough to locate by masking with the output of a variance filter; and within the items, if the threshold is correct, the details are easy to locate as well. But the mean and contrast of these items vary substantially.
I played around with threshold calculation for a while, and none of the techniques I implemented is perfect; but the one that turns out simplest, as accurate as any other, and quite low cost, is to take the mean pixel value and add one standard deviation.
My question is: is there some analytical way to defend this calculation other than "it works well"? I mean, I did sort of fall on this technique accidentally (only later did I find this answer), and using it seems arbitrary.
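For reference, the threshold calculation described above can be written down as follows. This is only a sketch: the helper names are hypothetical, the pixels of one item are assumed to be given as a flat list, and the population standard deviation is assumed because the question does not say which variant was used.

```python
from statistics import mean, pstdev


def mean_plus_std_threshold(pixels, k=1.0):
    """Threshold at mean + k standard deviations of the pixel values (k=1 as in the question)."""
    return mean(pixels) + k * pstdev(pixels)


def binarize(pixels, threshold):
    """Mark pixels strictly above the threshold as 1, the rest as 0."""
    return [1 if p > threshold else 0 for p in pixels]
```

Because the threshold is expressed relative to each item's own mean and spread, it shifts and scales with the item's brightness and contrast.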

Is there a way to summarize the features of many time series?

I'm actually trying to detect characteristics of the time series for a very big region composed of many smaller subregions (in my case pixels). I don't know much about this, so the only way I can come up with is an averaged time series for the entire region, although I know this would definitely conceal many features by averaging.
I'm just wondering if there are any widely used techniques that can detect the common features of a suite of time series, such as pattern recognition or time series classification?
Any ideas/suggestions are much appreciated!
Thanks!
Some extra explanations: I'm dealing with remote sensing images covering several years with a time step of 7 days. So for each pixel there is an associated time series, with values extracted from this pixel on different dates. So if I define a region consisting of many pixels, is there a way to detect or extract some common features characterizing all or most of the time series of pixels within this region? Such as the shape of the time series, or a date around which there's an obvious increase in the values?
You could compute the correlation matrix for the pixels. This would simply be:
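import numpy as np
# data has shape (npix, ntime): one row per pixel, one column per date
corr = np.zeros((npix, npix))
for i in range(npix):
    for j in range(npix):
        # normalized (uncentered) correlation; subtract each row's mean first for Pearson
        corr[i, j] = np.sum(data[i] * data[j]) / np.sqrt(np.sum(data[i]**2) * np.sum(data[j]**2))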
If you want more information, you can compute this as a function of time, i.e. divide your time series into blocks (say minutes) and compute the correlation for each of them. Then you can see how the correlation changes over time.
If the correlation changes a lot, you may be more interested in the cross-power spectrum of the pixels. This is defined as
cpow[i, j, :] = np.fft.fft(data[i]) * np.conj(np.fft.fft(data[j]))
This will tell you how much pixel i and j tend to change together on various time-scales. For example, they could be moving in unison in time-scales of a second (1 Hz), but also have changes on a time-scale of, say, 10 seconds which are not correlated with each other.
It all depends on what you need, really.

Machine learning algorithm for this task?

Trying to write some code that deals with this task:
As a starting point, I have around 20 "profiles" (imagine a landscape profile), i.e. one-dimensional arrays of around 1000 real values.
Each profile has a real-valued desired outcome, the "effective height".
The effective height is some sort of average but height, width and position of peaks play a particular role.
My aim is to generalize from the input data so as to calculate the effective height for further profiles.
Is there a machine learning algorithm or principle that could help?
Principle 1: Extract the most important features, instead of feeding it everything
As you said, "The effective height is some sort of average but height, width and position of peaks play a particular role." So you have a strong prior assumption that these measures are the most important for learning. If I were you, I would calculate these measures first and use them as the input for learning, instead of the raw data.
Principle 2: When choosing a learning algorithm, the first thing to care about is linear separability
Suppose the height is a function of those measures; then you have to think about the extent to which the function is linear. For example, if the function is almost linear, then a very simple perceptron would be perfect. Otherwise, if it's far from linear, you might want to pick a multi-layer neural network. If it's far, far, far from linear... please turn back to principle 1 and check whether you are extracting the right features.
Principle 3: More data helps
As you said, you have around 20 "profiles" for training. Generally speaking, that's not enough. Almost all machine learning algorithms were designed for fairly large data sets. Even when an algorithm is claimed to be good at learning from small samples, that usually does not mean as small as 20. Get more data!
Maybe multivariate linear regression suffices?
I would probably use a combination of what you said about which features play the most important role, and then train a regression on that. Basically, you need at least one coefficient corresponding to each feature, and you need substantially more data points than coefficients. So, I would pick something like the heights and widths of the two biggest peaks. You've now reduced every profile to just 4 numbers. Now do this trick: divide the data into 5 groups of 4. Pick the first 4 groups. Reduce all those profiles to 4 numbers, and then use the desired outcomes to fit a regression. Once you have trained the regression, try it on the 4 profiles in the remaining group and see how well it works. Repeat this procedure 5 times, each time leaving out a different group. This is called cross-validation, and it's very handy.
Obviously getting more data would help.
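A rough sketch of the cross-validation procedure described above, assuming each profile has already been reduced to a short feature vector (for example the heights and widths of the two biggest peaks). All function names are hypothetical, and the regression is a plain least-squares fit solved through the normal equations to keep the example dependency-free. In practice a library routine would be the more usual choice.

```python
def fit_linear(X, y):
    """Least-squares fit of y ~ b0 + b1*x1 + ... via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend an intercept column
    n = len(rows[0])
    # Augmented system (A^T A | A^T y).
    M = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting (fails on singular systems).
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def predict(coef, x):
    """Evaluate the fitted linear model on one feature vector."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def cross_validate(X, y, k=5):
    """Return the mean absolute error on each of the k held-out groups."""
    folds = [list(range(i, len(X), k)) for i in range(k)]
    errors = []
    for held_out in folds:
        train = [i for i in range(len(X)) if i not in held_out]
        coef = fit_linear([X[i] for i in train], [y[i] for i in train])
        err = sum(abs(predict(coef, X[i]) - y[i]) for i in held_out) / len(held_out)
        errors.append(err)
    return errors
```

With 20 profiles and `k=5`, each fold trains on 16 profiles and tests on 4. Large errors on the held-out groups would suggest that the chosen features or the linear model are not adequate.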

Algorithm for detecting peaks from recorded, noisy data. Graphs inside

So I've recorded some data from an Android GPS, and I'm trying to find the peaks of these graphs, but I haven't been able to find anything specific, perhaps because I'm not too sure what I'm looking for. I have found some MATLAB functions, but I can't find the actual algorithms that do it. I need to do this in Java, but I should be able to translate code from other languages.
As you can see, there are lots of 'mini-peaks', but I just want the main ones.
Your solution depends on what you want to do with the data. If you want to do very serious things then you should most likely use (Fast) Fourier Transforms, and extract both the phase and frequency output from it. But that's very computationally intensive and takes a long while to program. If you just want to do something simple that doesn't require a lot of computational resources, then here's a suggestion:
For that exact problem I implemented the algorithm below a few hours ago. I invented the algorithm myself, so I do not know if it already has a name, but it works great on very noisy data.
You need to determine the average peak-to-peak distance and call that PtP. Do that measurement any way you like. Judging from the graph in your case, it appears to be about 35. In my code I have another algorithm I invented to do that automatically.
Then choose a random starting index on the graph. Poll every new datapoint from then on and wait until the graph has either risen or fallen from the starting index level by about 70% of PtP. If it was a fall then that's a tock. If it was a rise then that's a tick. Store that level as the last tick or tock height. Produce a 'tick' or 'tock' event at this index.
Continue forward in the data. After ticks, if the data continues to rise after that point then store that level as the new 'height-of-tick' but do not produce a new tick event. After tocks, if the data continues to fall after that point then store that level as the new 'depth-of-tock' but do not produce a new tock event.
If last event was a tock then wait for a tick, if last event was a tick then wait for a tock.
Each time you detect a tick, then that should be a peak! Good luck.
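One possible reading of the tick/tock procedure above, written in Python for brevity (it should translate to Java directly). The function name is hypothetical, PtP is interpreted as the typical peak-to-trough height, and the sketch reports the index of the highest point reached after each tick rather than the index where the tick first fired. These are assumptions about the author's intent.

```python
def tick_tock_peaks(data, ptp, fraction=0.7, start=0):
    """Return indices of detected peaks ('ticks') in a noisy 1D signal."""
    step = fraction * ptp
    ref = data[start]   # level the next swing is measured from
    state = None        # None until the first event, then "tick" or "tock"
    peaks = []
    for i in range(start + 1, len(data)):
        v = data[i]
        if state == "tick" and v > ref:
            ref = v                 # still rising: update height-of-tick
            peaks[-1] = i           # keep the index of the highest point so far
        elif state == "tock" and v < ref:
            ref = v                 # still falling: update depth-of-tock
        elif state != "tick" and v - ref >= step:
            state, ref = "tick", v  # rose far enough: tick event
            peaks.append(i)
        elif state != "tock" and ref - v >= step:
            state, ref = "tock", v  # fell far enough: tock event
    return peaks
```

Small wiggles never move the signal by 70% of PtP away from the last stored extreme, so they produce no events, which is what suppresses the mini-peaks.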
I think what you want to do is run this through some sort of low-pass filter. Depending on exactly what you want to get out of this dataset, a simple "box car" filter might be sufficient: at each point, take the average of the N samples centered on that point, and take the average as the filtered value. The larger N is, the more aggressively smoothed the filtered data will be.
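A box car filter as described above might look like this. The function name is hypothetical, N is assumed to be odd, and the window is simply shortened at the two ends of the data, which is one of several possible edge treatments.

```python
def boxcar(data, n):
    """Centered moving average of width n (n odd); windows shrink at the edges."""
    half = n // 2
    out = []
    for i in range(len(data)):
        window = data[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out
```

For example, `boxcar([0, 0, 9, 0, 0], 3)` spreads the single spike over its neighbours and returns `[0.0, 3.0, 3.0, 3.0, 0.0]`.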
I guess you have lots of points... Calculate their mean value, subtract it from every point's value, and then, within each range where the points keep the same sign, take the point with the largest magnitude (negative or positive). I hope I am clear...
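A small sketch of that idea, under the assumption that a value exactly equal to the mean counts as non-negative. The function name is hypothetical.

```python
def sign_run_extrema(data):
    """Index of the largest-magnitude point in each run of same-sign deviations from the mean."""
    m = sum(data) / len(data)
    centered = [v - m for v in data]
    extrema = []
    best = 0
    for i in range(1, len(centered)):
        if (centered[i] >= 0) != (centered[best] >= 0):
            extrema.append(best)  # sign changed: close the current run
            best = i
        elif abs(centered[i]) > abs(centered[best]):
            best = i
    extrema.append(best)          # close the final run
    return extrema
```

The indices returned for runs above the mean are the peaks, and those for runs below the mean are the troughs. Noise that crosses the mean repeatedly will create extra short runs, so some smoothing beforehand may still be needed.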
With particularly nasty and noisy data I usually use smoothing. The easiest example of smoothing is a moving average. Then you can find peaks on that moving average, and then simply go back to your original data and take the closest peak to the one you found on the moving average.
I've done some looking into peak detection, and I can tell you that if your data doesn't behave, it could mess up your algorithm. Off the top of my head, you could try: pick a threshold, e.g. threshold = 250. If the data is above the threshold, find the max in that period. This assumes that your data has a mean of about 230. Not sure how fancy you want to get. Hope that helps.
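The threshold approach can be sketched as follows. The function name is hypothetical, and the threshold value has to be chosen for the data at hand.

```python
def peaks_above_threshold(data, threshold):
    """Index of the maximum within each contiguous stretch of samples above `threshold`."""
    peaks = []
    best = None
    for i, v in enumerate(data):
        if v > threshold:
            if best is None or v > data[best]:
                best = i          # new maximum inside the current stretch
        elif best is not None:
            peaks.append(best)    # stretch ended: record its maximum
            best = None
    if best is not None:
        peaks.append(best)        # data ended while still above the threshold
    return peaks
```

Noise that dips briefly below the threshold splits one peak into two stretches, so this tends to work best on data that has been smoothed first.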

Resources