Forecasting sales per item in Azure ML algorithm choice - machine-learning

I have almost 10 years of sales data, let's assume in below format:
DateKey Product Count Price Type
20140701 Shoe 10 $100 X
20140701 Shoe 5 $40 Y
20140702 Shirt 50 $80 Z
20140703 Shoe 10 $105 A
.
.
20180630 . ......
Now I want to predict this financial Year (2018-19) sales breakdown.
I have sales data for July 2018 which I can use to score my model, But I am not sure which algorithm to use. I am using Azure ML Studio

There are plenty of sample experiments in Cortana Gallery. This one for example may provide you a brief overview of time series forecasting using Azure ML Studio

Related

How to forecast macro trend by multiple index by LSTM model?

I just start exploring machine learning world. I want to try predicting the macro economic trend by grouping different index futures by LSTM model. After reading many article, I have came up 2 approaches below. May I ask what is the best approach?
1. In the pre-processing stage, group the Index futures (e.g. S&P 500, Dow Jones, Nasdaq 100, FTSE 100 etc) and get the average price. Adding a extra column holding the average price of 2 days after.
data structure:
date
avg price
T+2 avg price
2. Simply random pick one index futures and adding a extra column holding its average price of 2 days after.
date
S&P
RTY
DJ
FESX
NK
S&P +2

Ideas for model selection for predicting sales at locations based on time component and class column

I am trying to build a model for sales prediction out of three different storages based on previous sales. However there is an extra (and very important) component to this which is a column with the values of A and B. These letters indicate a price category, where A siginifies a comparetively cheaper price compared to similar products. Here is a mock example of the table
week
Letter
Storage1 sales
Storage2 sales
Storage3 sales
1
A
50
28
34
2
A
47
29
19
3
B
13
11
19
4
B
14
19
8
5
B
21
13
3
6
A
39
25
23
I have previously worked with both types of prediction problems seperately, namely time series analysis and regression problems, using classical methods and using machine learning but I have not built a model which can take both predicition types into account.
I am writing this to hear any suggestions as how to tackle such a prediction problem. I am thinking of converting the three storage sale columns into one, in order to have one feature column, and having three one-hot encoder columns to indicate the storage. However I am not sure how to tackle this problem with a machine learning approach and would like to hear if anyone knows where to start with such a prediction problem.

Can I use machine learning on below dataset sample

Dataset Sample
Can I use any algorithm to train above dataset ?
Because Each Row (Id) has Dependent Variable(Status) . But Each "Id" again as Mulitple Rows as per Features
You Can Assume it as "Each Id has multiple transaction and All transactions have common Status"
Will Machine learning find some Patterns from these transaction
Is there any other approach to solve these type of problems
Just fill your ID row with the value from the above row , same for the status row, this will lead to:
df
ID Feature1 Feature2 Feature3 Status
8079 100 Asia High Approved
8079 200 Africa Low Approved
When you run a classification algorithm, you can use: ID, Feature1, Feature2, Feature3as features and Status as target. A classifier will learn with this and everything is completly the same as before.
The features are still independet. Dependet features you will only have if the variables are somehow dependet to each other, in your case the ID 8079 does not lead to Feature1: Africa. They are independet.
You can fill your cells with:
import numpy as np
df[df[0]==""] = np.NaN
df.fillna(method='ffill')
Based on your comments, the approach can be slightly different, you need to convert your entries to new features (Python pandas convert rows to columns where multiple columns exist):
The dataframe then should look like:
ID Feature1 Feature2 Feature3 Feature1a .... Feature3z Status
8079 100 Asia High 200 Approved
you can either assume that each row is independent and ignore the id column or if every ID has 3 rows, you could extend the dataset with more features

LSTM and labels

Lets start off with "I know ML cannot predict stock markets better than monkeys."
But I just want to go through with it.
My question is a theretical one.
Say I have date, open, high, low, close as columns. So I guess I have 4 features, open, high, low, close.
'my_close' is going to be my label(answer) and I will use the 'close' 7 days from current row. Basically i shift the 'close' column up 7 rows and make it a new column called 'my_close'.
LSTMs work on sequences. So say the sequence I set is 20 days.
hence my shape will be (1000days of data, 20 day as a sequence, 3 features).
The problem that is bothering me is should these 20 days or rows of data, have the exact same label? or can they have individual labels ?
Or have i misunderstood the whole theory?
Thanks guys.
In your case, You want to predict the current day's stock price using previous 7 days stock values. The way your building your inputs and outputs require some modification before feeding into the model.
Your making mistake in understanding timesteps(in your sequences).
Timesteps(sequences) in layman terms is the total number of inputs we will consider while predicting the output. In your case, it will be 7(not 20) as we will be using previous 7 days data to predict the current day's output.
Your Input should be previous 7 days of info
[F11,F12,F13],[F21,F22,F23],........,[F71,F72,F73]
Fij in this, F represents the feature, i represents timestep and j represents feature number.
and the output will be the stock price of the 8th day.
Here your model will analyze previous 7 days inputs and predict the output.
So to answer your question You will have a common label for previous 7 days input.
I strongly recommend you to study a bit more on LSTM's.

naive bayes for Forecast grade

I have data set of grade in four lessons (for example lesson a,lesson b,lesson c,lesson d) for 100 students and let's imagine this grades are In association with
grade of lesson f.
I want to implement naive Bayes for Forecast grade lesson f by that four grade but I don't know how use input for this.
I read naive Bayes for spam mail detection and in that, Possibility of each word Calculated.
But for grade I do not know what Possibility I must calculate.
I have tried like spam but for this example I have just four names (for each lesson)
In order to do a good classification, you need to have some information in plus about student than class they are taking. Following your exemple, spam detection is based on words, stop words which are generally spam (buy, promotion, money) or origin in http headers.
For the case to predict student grade, you could imagine having information about student like : social class, is he doing sport, male or female and so on.
Getting back to your question, it is not the name of the lessons which are interesting but the grades each students got at this lessons. You need to take grades of each four lessons and lesson f to train the naive Bayes classifier.
Your entry might look like that:
StudentID gradeA gradeB gradeC gradeD gradeF
1 10 9 8 5 8
2 3 5 3 8 8
3 5 3 1 1 2
4 10 10 10 5 4
After training your classifier you will pass new entry for a new student like that:
StudentID gradeA gradeB gradeC gradeD
1058 1 5 8 4
The classifier will be able to predict the grade for lesson F taking into consideration the precedant grades.
You might have notice that I intentionnally did a training dataset where gradeF is highly correlated with gradeD. It is what the Bayes classifier will try to learn, just in a more complexe way.

Resources