Ideas for model selection for predicting sales at locations based on time component and class column - machine-learning

I am trying to build a model for sales prediction out of three different storages based on previous sales. However there is an extra (and very important) component to this which is a column with the values of A and B. These letters indicate a price category, where A siginifies a comparetively cheaper price compared to similar products. Here is a mock example of the table
week
Letter
Storage1 sales
Storage2 sales
Storage3 sales
1
A
50
28
34
2
A
47
29
19
3
B
13
11
19
4
B
14
19
8
5
B
21
13
3
6
A
39
25
23
I have previously worked with both types of prediction problems seperately, namely time series analysis and regression problems, using classical methods and using machine learning but I have not built a model which can take both predicition types into account.
I am writing this to hear any suggestions as how to tackle such a prediction problem. I am thinking of converting the three storage sale columns into one, in order to have one feature column, and having three one-hot encoder columns to indicate the storage. However I am not sure how to tackle this problem with a machine learning approach and would like to hear if anyone knows where to start with such a prediction problem.

Related

LSTM and labels

Lets start off with "I know ML cannot predict stock markets better than monkeys."
But I just want to go through with it.
My question is a theretical one.
Say I have date, open, high, low, close as columns. So I guess I have 4 features, open, high, low, close.
'my_close' is going to be my label(answer) and I will use the 'close' 7 days from current row. Basically i shift the 'close' column up 7 rows and make it a new column called 'my_close'.
LSTMs work on sequences. So say the sequence I set is 20 days.
hence my shape will be (1000days of data, 20 day as a sequence, 3 features).
The problem that is bothering me is should these 20 days or rows of data, have the exact same label? or can they have individual labels ?
Or have i misunderstood the whole theory?
Thanks guys.
In your case, You want to predict the current day's stock price using previous 7 days stock values. The way your building your inputs and outputs require some modification before feeding into the model.
Your making mistake in understanding timesteps(in your sequences).
Timesteps(sequences) in layman terms is the total number of inputs we will consider while predicting the output. In your case, it will be 7(not 20) as we will be using previous 7 days data to predict the current day's output.
Your Input should be previous 7 days of info
[F11,F12,F13],[F21,F22,F23],........,[F71,F72,F73]
Fij in this, F represents the feature, i represents timestep and j represents feature number.
and the output will be the stock price of the 8th day.
Here your model will analyze previous 7 days inputs and predict the output.
So to answer your question You will have a common label for previous 7 days input.
I strongly recommend you to study a bit more on LSTM's.

Forecasting with LSTM

I have the following question, I have already created a forecasting model but it doesn't really predict the future, I just have a bunch of data, I split it into a "training" sample and into a "testing" sample and then I can check how good is my prediction. But now I want to forecast for the next 10 days that are not in the data I have. How on earth can I do it?
Example : Let's say I have the data for these days:
04-07-2017: 213
05-07-2017: 321
06-07-2017: 111
07-07-2017: 90
08-07-2017: 78
Now I want to forecast the data for the next 3 days. How can I do it?
I would assume you could use encoder/decoder. Something similar to seq2seq as in NLU.

Predicting next destination. What type of classification it is?

I have a very different machine learning problem. Or atleast I am facing such problem for 1st time.
It would be really great if u can guide me to solve it.
I have 3 data sets as follow:
hotel star_rating
1 3
2 2
user home_continent gender
1 2 female
2 3 female
3 1 male
user hotel
1 39
1 44
2 63
I need to find which hotel a user will visit next.
To me it does not look like normal classification or regression problem. Total 66 hotel and 4400 users
Can you please guide me.
Thanks
It is a ranking problem (which, in many cases can be solved applying classification techniques).
Have a look to the documentation from this Kaggle competition. It essentially poses the same kind of problem you are trying to solve.

naive bayes for Forecast grade

I have data set of grade in four lessons (for example lesson a,lesson b,lesson c,lesson d) for 100 students and let's imagine this grades are In association with
grade of lesson f.
I want to implement naive Bayes for Forecast grade lesson f by that four grade but I don't know how use input for this.
I read naive Bayes for spam mail detection and in that, Possibility of each word Calculated.
But for grade I do not know what Possibility I must calculate.
I have tried like spam but for this example I have just four names (for each lesson)
In order to do a good classification, you need to have some information in plus about student than class they are taking. Following your exemple, spam detection is based on words, stop words which are generally spam (buy, promotion, money) or origin in http headers.
For the case to predict student grade, you could imagine having information about student like : social class, is he doing sport, male or female and so on.
Getting back to your question, it is not the name of the lessons which are interesting but the grades each students got at this lessons. You need to take grades of each four lessons and lesson f to train the naive Bayes classifier.
Your entry might look like that:
StudentID gradeA gradeB gradeC gradeD gradeF
1 10 9 8 5 8
2 3 5 3 8 8
3 5 3 1 1 2
4 10 10 10 5 4
After training your classifier you will pass new entry for a new student like that:
StudentID gradeA gradeB gradeC gradeD
1058 1 5 8 4
The classifier will be able to predict the grade for lesson F taking into consideration the precedant grades.
You might have notice that I intentionnally did a training dataset where gradeF is highly correlated with gradeD. It is what the Bayes classifier will try to learn, just in a more complexe way.

Machine learning, classification type

I am studying for my Machine Learning (ML) class and I have question that I couldn't find an answer with my current knowledge. Assume that I have the following data-set,
att1 att2 att3 class
5 6 10 a
2 1 5 b
47 8 4 c
4 9 8 a
4 5 6 b
The above data-set is clear and I think I can apply classification algorithms for new incoming data after I train my data-set. Since each instance has a label, it is easy to understand that each instance has a class that is labeled with. Now, my question is what if we had a class consisting of different instances such as gesture recognition data. Any class will have multiple instances that specifies its class. For example,
xcor ycord depth
45 100 10
50 20 45
10 51 12
the above three instances belong to class A and the below three instances belong to class B as a group, I mean those three data instances constitute that class together. For gesture data, the coordinates of movement of your hand.
xcor ycord depth
45 100 10
50 20 45
10 51 12
Now, I want every incoming three instances to be grouped either as A or B? Is it possible to label all of them either A or B without labeling each instance independently? As an example, assume that following group belongs to B, so I want all of the instances to be labelled together as B not individually because of their independent similarity to class A or B? If it is possible, how do we call it?
xcor ycord depth
45 10 10
5 20 87
10 51 44
I don't see an scenario where you might want to group an indeterminate number of rows in your dataset as features of a given class. They are either independently associated with a class or they are all features and therefore an unique row. Something like:
Instead of
xcor ycord depth
45 10 10
5 20 87
10 51 44
Would be something like:
xcor1 ycord1 depth1 xcor2 ycord2 depth2 xcor3 ycord3 depth3
45 10 10 5 20 87 10 51 44
This is pretty much the same approach that is used to model time series
It seems you may be confused between different types of machine learning.
The dataset given in your class is an example of a supervised classification algorithm. That is, given some data and some classes, learn a classifier that can predict classes on new, unseen data. Classifiers that you can apply to this problem include
decision trees,
support vector machines
artificial neural networks, etc.
The second problem you are describing is an example of an unsupervised classification problem. That is, given some data without labels, we want to find an automatic way to separate the different types of data (your A and B) algorithmically. Algorithms that solve this problem include
K-means clustering
Mixture models
Principal components analysis followed by some sort of clustering
I would look into running a factor analysis or normalizing your data, then running a K-means or gaussian mixture model. This should discover the A and B types of your data if they are distinguishable.
Take a peek at the use of neural networks for recognizing hand-written text. You can think of a gesture as a hand-written figure with an additional time component (so, give each pixel an "age".) If your training data also includes similar time data, then I think the technique should carry over well.

Resources