Predict n steps for multivariate time series forecasting - machine-learning

In multivariate time series forecasting, do we need to pass the input variable values at prediction time? How does it work if we have to predict n steps ahead?
Let's say we have variables like Date, Age, Gender, Country, and Customers count. Is it possible to predict the customers count for the next n periods from the historical data?

Related

Dynamic Multi-step Time Series Forecasting

I am using an LSTM to forecast the future temperature of a specific device, so my data only has Date and Temperature.
I want to create a method to forecast N time steps into the future. N is variable (I will take it from the user). The main constraint is that I don't want to use any loops in this method because of time restrictions.

Time Series Model - Device/Customer wise

I want to build a time series model (ARIMA or LSTM, for example) to forecast a single time series column, but I want to do it per customer. For instance, my data is: customer_name, date, metric. I want to forecast what the metric will be for customer1 tomorrow.
The actual question is: how do I include the categorical column customer_name in the model?
I am fine building a model with only the metric column, but what I need is the metric for a particular customer.

Can I build a ML model with independent variables containing (time series+ categorical +numeric) and a classifier dependent variable (0,1)

Let's say I have with me data containing
salary,
job profile,
work experience,
number of people in household,
other demographic etc ..
of multiple people who visited my car dealership, and I also know whether each of them bought a car from me or not.
I can leverage this dataset to predict whether a new customer coming in is likely to buy a car or not. Let's say I am currently doing this using xgboost.
Now I have additional data, but it is time series data of the monthly expenditure each person makes. Say I get this data for my training set too. I want to build a model that uses this time series data together with the old demographics data (plus salary, age, etc.) to determine whether a customer is likely to buy or not.
Note: In the second part I have time series data for the monthly expenditure only. The other variables are point-in-time values. For example, I do not have a time series for Salary or Age.
Note 2: I also have categorical variables like job profile that I would like to use in the model. For these, however, I do not know whether the person has stayed in the same job profile or has switched from some other job profile.
Since most of the data is specific to the person, except the expenditure time series, it is better to bring the time series data to the person level. This can be done with feature engineering like:
1. As @cmxu suggested, take various statistical measures. It will be even more beneficial to take these statistical measures over different time intervals, say the mean over the last 2 days, 5 days, 7 days, 15 days, 30 days, 90 days, 180 days, etc.
2. Create mixed features like:
a) the ratio of salary to the expenditure statistical summary created in point 1 (choose an appropriate interval)
b) salary per person in the household, or average monthly expenditure per household, etc.
With similar ideas you can easily create hundreds or thousands of features from your data and then feed all of them to XGBoost (which is easy to train and debug) or a NN (more complicated to train).
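The feature engineering described above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library; the toy data, the field names (`salary`, `household_size`), the window lengths and the helper `build_features` are all assumptions made for the example, not part of the original question.

```python
from statistics import mean

# Hypothetical toy data: monthly expenditure per person, oldest month first.
expenditure = {
    "p1": [100.0, 120.0, 90.0, 110.0],
    "p2": [300.0, 280.0, 310.0, 330.0],
}
# Hypothetical point-in-time demographics (field names are assumptions).
demographics = {
    "p1": {"salary": 1000.0, "household_size": 2},
    "p2": {"salary": 3000.0, "household_size": 4},
}


def build_features(person_id, windows=(2, 4)):
    """Flatten one person's expenditure series into a single feature row."""
    series = expenditure[person_id]
    row = dict(demographics[person_id])
    for n in windows:
        recent = series[-n:]  # the last n months
        row[f"mean_last_{n}m"] = mean(recent)
        row[f"max_last_{n}m"] = max(recent)
    # Mixed features combining static and time series information.
    row["salary_to_mean_exp"] = row["salary"] / mean(series)
    row["salary_per_household_member"] = row["salary"] / row["household_size"]
    return row


# One row per person, ready to be turned into a matrix for a tabular model.
features = {pid: build_features(pid) for pid in expenditure}
```

Each person ends up as one fixed-length row, so the time series no longer needs a sequence model and the rows can be passed to any tabular learner.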

Customer behaviour prediction when there are no negative examples

Imagine you own a postal service and you want to optimize your business processes. You have a history of orders in the following form (sorted by date):
# date user_id from to weight-in-grams
Jan-2014 "Alice" "London" "New York" 50
Jan-2014 "Bob" "Madrid" "Beijing" 100
...
Oct-2017 "Zoya" "Moscow" "St.Petersburg" 30
Most of the records (about 95%) contain positive numbers in the "weight-in-grams" field, but there are a few that have zero weight (perhaps, these messages were cancelled or lost).
Is it possible to predict whether the users from the history file (Alice, Bob etc.) will use the service in Nov., 2017? What machine learning methods should I use?
I tried simple logistic regression and decision trees, but they predict a positive outcome for every user, as there are very few negative examples in the training set. I also tried the Pareto/NBD model (BTYD library in R), but it seems to be extremely slow for large datasets, and my dataset contains more than 500 000 records.
I have another problem: if I impute negative examples (treating a user who didn't send a letter in a certain month as a negative example for that month), the dataset grows from 30 MB to 10 GB.
The answer is yes, you can try to predict this.
You can approach it as a time series problem and run an RNN:
Train your RNN on your dataset pivoted so that each user is one sample.
You can also pivot your dataset so that each user is a row (observation) by aggregating each user's data, then run multivariate logistic regression. You will lose information this way, but it might be simpler. You can add time-related columns such as 'average delay between orders', 'average orders per year', etc.
You can use Bayesian methods to estimate the probability that the user will return.
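As a rough sketch of the "one row per user" aggregation mentioned above, the order log can be collapsed into a small table of time-related features. The month indexing (month 0 = Jan-2014, so Nov-2017 = 46), the sample orders and the helper `user_features` are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical order log as (month index, user_id); month 0 is Jan-2014.
orders = [
    (0, "Alice"), (3, "Alice"), (6, "Alice"),
    (0, "Bob"),
    (40, "Zoya"), (44, "Zoya"),
]
current_month = 46  # assumed prediction month (Nov-2017)

# Group order months by user.
months_by_user = defaultdict(list)
for month, user in orders:
    months_by_user[user].append(month)


def user_features(months, now):
    """Aggregate one user's order history into a single observation."""
    months = sorted(months)
    gaps = [b - a for a, b in zip(months, months[1:])]
    return {
        "n_orders": len(months),
        # Undefined for users with a single order, so left as None.
        "avg_gap_months": sum(gaps) / len(gaps) if gaps else None,
        "months_since_last": now - months[-1],
    }


user_table = {u: user_features(m, current_month) for u, m in months_by_user.items()}
```

Because the table has one row per user instead of one row per user per month, it stays far smaller than a dataset with imputed negative examples.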

Identifying machine learning data to make predictions

As a learning exercise I plan to implement a machine learning algorithm (probably a neural network) to predict what users earn trading stocks, based on shares bought, shares sold and transaction times. The datasets below are test data I've made up.
Acronyms:
tab=millisecond time apple bought
asb=apple shares bought
tas=millisecond apple sold
ass=apple shares sold
tgb=millisecond time google bought
gsb=google shares bought
tgs=millisecond google sold
gss=google shares sold
training data :
username,tab,asb,tas,ass,tgb,gsb,tgs,gss
a,234234,212,456789,412,234894,42,459289,0
b,234634,24,426789,2,234274,3,458189,22
c,239234,12,156489,67,271274,782,459120,3
d,234334,32,346789,90,234254,2,454919,2
classifications :
a earned $45
b earned $60
c earned ?
d earned ?
Aim : predict earnings of users c & d based on training data
Are there any data points I should add to this dataset? Should I perhaps use alternative data? As this is just a learning exercise of my own creation, I can add any feature that may be useful.
This data will need to be normalised. Is there any other concept I should be aware of?
Perhaps I should not use time as a feature, as share prices can bounce up and down over time.
You might want to solve your problem in the following order:
1) Prediction of an individual stock's future value based on all stocks' historical data.
2) Prediction of the total future value of a combination of stocks based on a portfolio and all stocks' historical data.
3) A short-term buy-sell strategy for managing a portfolio (when to buy or sell, which stocks, and in what amounts).
If you can do 1) well for a particular stock, that is probably a good starting point for 2). 3) might be your goal, but I put it last because it is even more complicated.
I will make some assumptions below and focus on how to solve 1), hopefully. :)
I assume at each timestamp, you have a vector of all possible features, e.g.:
stock price of company A (this is the target value)
stock price of other companies B, C, ..., Z (other companies might affect company A directly or indirectly)
52 week lowest price of A, B, C, ..., Z (long-term features begin)
52 week highest price of A, B, C, ..., Z
monthly highest/lowest price of A, B, C, ..., Z
weekly highest/lowest price of A, B, C, ..., Z (short-term features begin)
daily highest/lowest price of A, B, C, ..., Z
is revenue report day of A, B, C, ..., Z (really important features begin)
change of revenue of A, B, C, ..., Z
change of profit of A, B, C, ..., Z
semantic score of company profile from social networks of A, ..., Z
... (imagination helps here)
And I assume you have almost all of the above features at every fixed time interval.
I think an LSTM-like neural network is very relevant here.
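To make the per-timestamp feature vector a little more concrete, here is a small sketch that derives trailing highest/lowest price features for a single stock. The price list, the window length of 3 steps and the helper `rolling_high_low` are assumptions for illustration; real features such as 52-week highs would use a correspondingly longer window.

```python
def rolling_high_low(prices, window):
    """Highest and lowest price over the trailing window ending at each step."""
    highs, lows = [], []
    for t in range(len(prices)):
        # Use fewer points at the start, where a full window is not available.
        recent = prices[max(0, t - window + 1): t + 1]
        highs.append(max(recent))
        lows.append(min(recent))
    return highs, lows


# Hypothetical price series of company A at fixed time intervals.
prices_a = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0]
short_high, short_low = rolling_high_low(prices_a, window=3)

# One feature vector per timestamp: [price, trailing high, trailing low].
feature_vectors = [list(v) for v in zip(prices_a, short_high, short_low)]
```

A sequence of such vectors, extended with the other companies and the remaining features from the list, is the kind of input a recurrent model would consume one timestamp at a time.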
Don't use the username along with the training data - the network might make associations between the username and the $ earned. Including it would factor in the user to the output decision, while excluding it ensures the network will be able to predict the $ earned for an arbitrary user.
With the parameters you are suggesting, it seems impossible to me to predict earnings.
The main reason is that the input parameters don't correlate with the output value.
Your input values contradict themselves: consider whether it is possible that for the same input you would expect different output values. If so, you won't be able to predict any output for such input.
Going further, a trader's earnings depend not only on the number of shares bought and sold, but also on the price of each of them. This brings us to the problem where we give the neural network two equal inputs but want different outputs.
How do you define 'good' parameters to predict the desired output in such a case?
I suggest first looking for people who do such estimations, then trying to define the list of parameters they take into account.
If you succeed, you will get a huge list of variables.
Then you can try to build a model, for example using a neural network.
Besides normalisation you'll also need scaling. Another question I have for you is about the classification of stocks. In your example you provide Google and Apple, which are considered blue-chip stocks. To clarify: do you want to predict earnings only for Google and Apple, or for any combination of two stocks?
If you want to make predictions only for Google and Apple with the data you have, then you can apply just normalisation and scaling with some kind of recurrent neural network. Recurrent NNs are better at prediction tasks than a simple feedforward model trained with backpropagation.
But if you want to apply your training algorithm to more than just Google and Apple, I recommend splitting your training data into groups by some criterion. One example is dividing according to the capitalization of the stocks, say into five groups. If you decide on five groups of stocks, you can also apply equilateral encoding in order to decrease the number of dimensions for NN learning.
Another kind of grouping you can think of is the stock's area of operation, for example agricultural, technological, medical, hi-end and tourist groups.
Let's say you decided on the grouping just mentioned (agricultural, technological, medical, hi-end, tourist). Then the five groups will give you five entries in the NN input layer (so-called one-of-n, or one-hot, encoding).
And let's say you want to feed in an agricultural stock.
Then the input will look like this:
1,0,0,0,0, x1, x2, ...., xn
Where x1, x2, ...., xn - are other entries.
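The group indicator in the input row above, a single 1 in the position of the stock's group followed by the other entries, can be sketched as below. This layout is usually called one-of-n (one-hot) encoding; the ordering of `SECTORS` and the example values standing in for x1, x2 are assumptions for illustration.

```python
# Assumed fixed ordering of the five groups named in the text.
SECTORS = ["agricultural", "technological", "medical", "hi-end", "tourist"]


def one_hot(sector):
    """Return a list with 1 at the sector's position and 0 elsewhere."""
    if sector not in SECTORS:
        raise ValueError(f"unknown sector: {sector}")
    return [1 if s == sector else 0 for s in SECTORS]


def encode_row(sector, other_features):
    """Prepend the group indicator to the remaining inputs x1, ..., xn."""
    return one_hot(sector) + list(other_features)


# Hypothetical values for x1 and x2.
row = encode_row("agricultural", [0.3, 0.7])
```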
Or, if you apply equilateral encoding, you'll have one dimension less (I'm too lazy to describe what it will look like).
Yet one more idea for converting entries for the neural network is thermometer encoding.
One more idea to keep in mind: people usually lose money trading stocks, so your dataset will be biased. I mean that if you randomly choose only 10 traders, they can all be losers, and your dataset will not be fully representative. To avoid data bias, you should have a big enough dataset of traders.
And one more detail: you don't need to pass the user id into the NN, because the NN would then learn the trading style of a particular user and use it for prediction.
It seems to me there are more dimensions than data points. However, it might be the case that your observations lie in a linear subspace; you just need to compute the kernel of the matrix shown above.
If the kernel has a larger dimension than the number of data points, then you do not need to add more data points.
Now there is another thing to look at. You should check your classifier's VC dimension; you don't want to add too many points to the dataset. But anyway, that is mostly theoretical in this example, and I'm just joking.
