Forecasting with LSTM

I have the following question: I have already created a forecasting model, but it doesn't really predict the future. I just have a bunch of data, which I split into a "training" sample and a "testing" sample, so I can check how good my prediction is. But now I want to forecast the next 10 days, which are not in the data I have. How on earth can I do it?
Example: Let's say I have the data for these days:
04-07-2017: 213
05-07-2017: 321
06-07-2017: 111
07-07-2017: 90
08-07-2017: 78
Now I want to forecast the data for the next 3 days. How can I do it?

I would assume you could use an encoder/decoder, something similar to seq2seq as in NLU.
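For illustration, here is a minimal sketch of that encoder/decoder idea in Keras. Every size and name here (look-back window, horizon, layer widths, the synthetic series) is an assumption for demonstration, not a known detail of the question:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

N, H = 14, 3  # look-back window and forecast horizon (arbitrary choices)
series = np.sin(np.linspace(0, 20, 200))  # stand-in for your daily values

# Sliding (window -> next H values) training pairs built from the history
X = np.stack([series[i:i + N] for i in range(len(series) - N - H + 1)])
y = np.stack([series[i + N:i + N + H] for i in range(len(series) - N - H + 1)])

model = Sequential([
    LSTM(32, input_shape=(N, 1)),     # encoder: summarize the input window
    RepeatVector(H),                  # repeat the summary once per output step
    LSTM(32, return_sequences=True),  # decoder: unroll H steps
    TimeDistributed(Dense(1)),        # one value per forecast step
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[..., None], y[..., None], epochs=20, verbose=0)

# To forecast days that are NOT in the data, feed the last observed window:
forecast = model.predict(series[-N:].reshape(1, N, 1), verbose=0).ravel()
print(forecast)  # H values beyond the final date in the dataset

The last two lines are the part the question asks about: the H outputs are read as the steps immediately after the end of the observed series.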

Related

Ideas for model selection for predicting sales at locations based on time component and class column

I am trying to build a model for sales prediction out of three different storages based on previous sales. However, there is an extra (and very important) component to this, which is a column with the values A and B. These letters indicate a price category, where A signifies a comparatively cheaper price compared to similar products. Here is a mock example of the table:
week  Letter  Storage1 sales  Storage2 sales  Storage3 sales
1     A       50              28              34
2     A       47              29              19
3     B       13              11              19
4     B       14              19              8
5     B       21              13              3
6     A       39              25              23
I have previously worked with both types of prediction problems separately, namely time series analysis and regression problems, using both classical methods and machine learning, but I have not built a model which takes both prediction types into account.
I am writing this to hear any suggestions on how to tackle such a prediction problem. I am thinking of melting the three storage sales columns into one feature column and adding three one-hot encoded columns to indicate the storage, as sketched below. However, I am not sure how to tackle this with a machine learning approach and would like to hear if anyone knows where to start with such a prediction problem.
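A sketch of that reshaping with pandas, using the mock table from the question (the frame and column names are just one plausible encoding of it):

import pandas as pd

df = pd.DataFrame({
    "week": [1, 2, 3, 4, 5, 6],
    "Letter": ["A", "A", "B", "B", "B", "A"],
    "Storage1 sales": [50, 47, 13, 14, 21, 39],
    "Storage2 sales": [28, 29, 11, 19, 13, 25],
    "Storage3 sales": [34, 19, 19, 8, 3, 23],
})

# Melt the three storage columns into a single sales column plus a storage label
long = df.melt(id_vars=["week", "Letter"],
               value_vars=["Storage1 sales", "Storage2 sales", "Storage3 sales"],
               var_name="storage", value_name="sales")

# One-hot encode the storage label and the A/B price category
long = pd.get_dummies(long, columns=["storage", "Letter"])
print(long.head())

This yields one target column (sales) plus indicator features for the storage and the price category, which any standard regressor can consume alongside the week as a time feature.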

Time series prediction using GP - training data

I am trying to implement time series forecasting using genetic programming. I am creating random trees (ramped half-and-half) of s-expressions and evaluating each expression with RMSE to compute its fitness. My problem is the training process. Say I want to predict gold prices and the training data looks like this:
date open high low close
28/01/2008 90.959999 91.889999 90.75 91.75
29/01/2008 91.360001 91.720001 90.809998 91.150002
30/01/2008 90.709999 92.580002 90.449997 92.059998
31/01/2008 90.919998 91.660004 90.739998 91.400002
01/02/2008 91.75 91.870003 89.220001 89.349998
04/02/2008 88.510002 89.519997 88.050003 89.099998
05/02/2008 87.900002 88.690002 87.300003 87.68
06/02/2008 89 89.650002 88.75 88.949997
07/02/2008 88.949997 89.940002 88.809998 89.849998
08/02/2008 90 91 89.989998 91
As I understand it, this data is nonlinear, so my questions are:
1- Do I need to make any changes to this data, like exponential smoothing? And why?
2- When looping over the current population and evaluating the fitness of each expression on the training data, should I calculate the RMSE on just part of the data or on all of it?
3- When the algorithm finishes and I get the expression with the best (lowest) fitness, does this mean that when I apply it to any row of the training data, the output should be the next day's price?
I've read some research papers about this and noticed that some of them mention dividing the training data when calculating the fitness, and some of them apply exponential smoothing. However, I found them a bit difficult to read and understand, and most implementations I've found are in either Python or R, which I am not familiar with.
I appreciate any help on this.
Thank you.
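For reference, a minimal sketch of the RMSE fitness described in the question, assuming a candidate expression has been compiled into a callable expr(row) -> prediction and that the target is the next day's close (both are assumptions, not details given above):

import math

def rmse_fitness(expr, rows, closes):
    # rows[i] holds one day's features; closes[i + 1] is the next day's close,
    # so each expression is scored on how well it predicts one step ahead
    errors = [(expr(rows[i]) - closes[i + 1]) ** 2 for i in range(len(rows) - 1)]
    return math.sqrt(sum(errors) / len(errors))

On question 2, one common reading of "dividing the training data" in the papers is to score fitness on a held-out slice of the rows rather than all of them, so that expressions which merely memorize the fitted period are penalized.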

Error in decompose(wmurders) : time series has no or less than 2 periods

I am new to R and am trying to learn time series analysis on the wmurders dataset of the fpp2 package. To start with, when I try a classical decomposition I keep getting this error. There are 55 observations, one for each year. Shouldn't the frequency be 1? Would someone please tell me how to go about this?
Thanks a million
Annual data has a frequency of 1, i.e. no seasonality, so you can't do a seasonal decomposition; as the error says, decompose() needs a series with at least 2 periods of a seasonal pattern to estimate it.

Sequence Models Word2vec

I am working on a dataset with more than 100,000 records.
This is how the data looks:
email_id  cust_id  campaign_name
123       4567     World of Zoro
123       4567     Boho XYZ
123       4567     Guess ABC
234       5678     Anniversary X
234       5678     World of Zoro
234       5678     Fathers day
234       5678     Mothers day
345       7890     Clearance event
345       7890     Fathers day
345       7890     Mothers day
345       7890     Boho XYZ
345       7890     Guess ABC
345       7890     Sale
I am trying to understand the campaign sequence and predict the next possible campaign for the customers.
Assume I have processed my data and stored it in 'camp'.
With Word2Vec-
from gensim.models import Word2Vec
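# note: gensim >= 4.0 renamed the `size` argument to `vector_size`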
model = Word2Vec(sentences=camp, size=100, window=4, min_count=5, workers=4, sg=0)
The problem with this model is that it accepts individual tokens and, when looking for similarities, returns text tokens with probabilities.
Word2Vec accepts this form of input-
['World','of','Zoro','Boho','XYZ','Guess','ABC','Anniversary','X'...]
And gives this form of output -
model.wv.most_similar('Zoro')
[('Guess', 0.98), ('XYZ', 0.97)]
Since I want to predict the campaign sequence, I was wondering whether there is any way I can give the model the input below and get a campaign name in the output.
My input would be:
[['World of Zoro', 'Boho XYZ', 'Guess ABC'],
 ['Anniversary X', 'World of Zoro', 'Fathers day', 'Mothers day'],
 ['Clearance event', 'Fathers day', 'Mothers day', 'Boho XYZ', 'Guess ABC', 'Sale']]
Output -
model.wv.most_similar('World of Zoro')
[('Sale', 0.98), ('Mothers day', 0.97)]
I am also not sure whether there is any functionality within Word2Vec, or any similar algorithm, that can help predict campaigns for individual users.
Thank you for your help.
I don't believe that word2vec is the right approach to model your problem.
Word2vec uses one of two approaches: skip-gram (given a target word, predict its surrounding words) or CBOW (given the surrounding words, predict the target word). Your case is similar to the CBOW setting, but there is no reason why the phenomenon you want to model would respect the linguistic "rules" for which word2vec was developed.
word2vec tends to predict the word that occurs most frequently in combination with the target one within the moving window (in your code: window=4). So it won't predict the best possible next choice, but the one that occurred most often within the window span of the given word.
In your call to word2vec (Word2Vec(sentences=camp, size=100, window=4, min_count=5, workers=4, sg=0)) you are also using min_count=5, so the model ignores words that occur fewer than 5 times. Depending on your dataset size, that could mean a loss of relevant information.
I suggest taking a look at forecasting techniques and time series analysis methods; I have the feeling that you will obtain better predictions with those than with word2vec. (https://otexts.org/fpp2/index.html)
I hope it helps.
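As an aside on the mechanics (not an endorsement over the forecasting suggestion above): gensim treats each element of a sentence list as one token, so the multi-word campaign names can be fed in exactly as the question proposes. A minimal sketch, using the sample sequences from the question:

from gensim.models import Word2Vec

camp = [
    ["World of Zoro", "Boho XYZ", "Guess ABC"],
    ["Anniversary X", "World of Zoro", "Fathers day", "Mothers day"],
    ["Clearance event", "Fathers day", "Mothers day", "Boho XYZ", "Guess ABC", "Sale"],
]

# min_count=1 so nothing is dropped from this tiny example;
# in gensim >= 4.0 the embedding size argument is vector_size
model = Word2Vec(sentences=camp, vector_size=100, window=4, min_count=1, sg=0)
print(model.wv.most_similar("World of Zoro"))

The caveats above still apply: most_similar ranks co-occurrence within the window, not "the next campaign", so this resolves the input-format question but not the sequence-prediction one.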

LibSVM - Multi class classification with unbalanced data

I tried to play with libsvm and 3D descriptors in order to perform object recognition. So far I have 7 categories of objects, and for each category I have its number of objects (and its percentage):
Category 1: 492 (14%)
Category 2: 574 (16%)
Category 3: 738 (21%)
Category 4: 164 (5%)
Category 5: 369 (10%)
Category 6: 123 (3%)
Category 7: 1025 (30%)
So I have in total 3585 objects.
I have followed the practical guide of libsvm.
As a reminder, the steps are:
A. Scaling the training and the testing data
B. Cross validation
C. Training
D. Testing
I separated my data into training and testing sets.
By doing 5-fold cross validation, I was able to determine good values for C and gamma.
However, I obtained poor results (CV accuracy is about 30-40% and my test accuracy is about 50%).
Then, thinking about my data, I saw that some of it is unbalanced (categories 4 and 6, for example). I discovered that libSVM has an option for class weights, so I would now like to set those weights properly.
So far I'm doing this :
svm-train -c cValue -g gValue -w1 1 -w2 1 -w3 1 -w4 2 -w5 1 -w6 2 -w7 1 training_set_file
However, the results are the same. I'm sure this is not the right way to do it, which is why I'm asking for help.
I saw some topics on the subject, but they were related to binary classification, not multiclass classification.
I know that libSVM does "one against one" (i.e. binary classifiers), but I don't know how to handle that when I have multiple classes.
Could you please help me?
Thank you in advance for your help.
I've run into the same problem before. I also tried giving the classes different weights, which didn't work.
I recommend training with a subset of the dataset.
Try to use approximately equal numbers of samples from each class: you can use all the category 4 and 6 samples, and then pick about 150 samples from each of the other categories, as in the sketch below.
I used this method and the accuracy did improve. Hope this helps!
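A minimal sketch of that undersampling step, assuming the features and labels are numpy arrays (the names and the cap of 150 are illustrative):

import numpy as np

def undersample(X, y, cap=150, seed=0):
    rng = np.random.default_rng(seed)
    keep = []
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        if len(idx) > cap:                    # large classes get capped...
            idx = rng.choice(idx, size=cap, replace=False)
        keep.extend(idx)                      # ...small ones (e.g. 4 and 6) stay whole
    keep = np.asarray(keep)
    return X[keep], y[keep]

# usage: X_bal, y_bal = undersample(X_train, y_train), then write the result
# back out in libsvm format and rerun the grid search and training as before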
