I have a monthly forecast in one table and daily sales in another. Is there a way in Tableau to build a table calc that aggregates the daily sales to monthly totals, which I can then subtract from my monthly forecast?
In multivariate time series forecasting, do we need to pass the input variable values for prediction? How does it work if we have to predict n steps ahead?
Let's say we have variables like Date, Age, Gender, Country, and Customers count. Is it possible to predict the customers count for the next n periods from the historical data?
Let's say I have with me data containing
salary,
job profile,
work experience,
number of people in household,
other demographics, etc.
for multiple persons who visited my car dealership, and I also know whether each of them bought a car from me or not.
I can leverage this dataset to predict if a new customer coming in is likely to buy a car or not. And let's say currently I am doing it using xgboost.
NOW, I have got additional data, but it is time series data of the monthly expenditure each person makes. Say I get this data for my training set too. Now I want to build a model which uses this time series data together with the old demographics data (plus salary, age, etc.) to tell whether a customer is likely to buy or not.
Note: In the second part I have time series data of the monthly expenditure only. The other variables are at a point in time. For example I do not have the time series for Salary or Age.
Note2: I also have categorical variables like job profile which I would like to use in the model. But for this I do not know if the person has been in the same job profile or he has changed over from some other job profile.
Since most of the data is specific to the person, with the expenditure time series as the only exception, it is better to bring the time series data down to the person level. This can be done with feature engineering such as:
1. As @cmxu suggested, take various statistical measures. It will be even more beneficial to take these statistical measures over different time intervals, say the mean over the last 2 days, 5 days, 7 days, 15 days, 30 days, 90 days, 180 days, etc.
2. Create mixed features like:
a) ratio of salary to the expenditure statistical summary created in point 1 (choose an appropriate interval)
b) salary per person in the household, or average monthly expenditure per household, etc.
With similar ideas you can easily create hundreds or thousands of features from your data and then feed them all to XGBoost (which is easy to train and debug) or a NN (more complicated to train).
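A minimal sketch of the kind of feature engineering described above, assuming the expenditure series is a plain list of monthly values (oldest first) and that salary is also monthly. The function name, the window lengths in months, and the feature names are all hypothetical and only meant as an illustration:

```python
from statistics import mean, pstdev


def expenditure_features(monthly_spend, salary, household_size, windows=(3, 6, 12)):
    """Collapse one person's expenditure series into flat, person-level features.

    monthly_spend: list of monthly expenditures, oldest first (assumed format).
    windows: look-back lengths in months (hypothetical choice).
    """
    features = {}
    for w in windows:
        recent = monthly_spend[-w:]  # last w months, or fewer if history is short
        features[f"spend_mean_{w}m"] = mean(recent)
        features[f"spend_std_{w}m"] = pstdev(recent)
        features[f"spend_max_{w}m"] = max(recent)

    # Mixed features: combine point-in-time variables with the series summary.
    longest = max(windows)
    baseline = features[f"spend_mean_{longest}m"]
    features["salary_to_spend_ratio"] = salary / baseline if baseline else None
    features["salary_per_household_member"] = salary / household_size
    return features


row = expenditure_features([100, 200, 300], salary=1000, household_size=4, windows=(2, 3))
```

Each person then becomes a single row of numbers that can be appended to the existing demographic columns before training the classifier.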
I have some daily time series data. I am trying to predict the next 3 days from the historical daily set of data.
The historical data shows a definite pattern based upon the day of the week: Mondays and Tuesdays are high, Wednesday is typically the highest, and then volume decreases over the remaining part of the week.
If I group the data monthly or weekly, I can definitely see an upward trend over time that appears to be additive.
My goal is to predict the next 3 days only. My intuition is telling me to take one approach and I am hoping for some feedback on pros/cons versus other approaches.
My intuition tells me it might be better to group the data by week or month and then predict the next week or month. Suppose I predict the next week's total by loading historical weekly data into ARIMA, then train, test and predict the next week. Within a week, each day of the week typically contributes x percent of that weekly total. So, if Wednesday has historically contributed 50% of the weekly volume on average, and for the next week I have predicted 1000, then I would predict Wednesday to be 500. Is this a common approach?
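The proportional split described above can be sketched roughly as follows. This is only an illustration: the function names and the (weekday, value) input format are hypothetical, the weekly forecast is taken as given (from ARIMA or anything else), and the history is assumed to cover complete weeks.

```python
from collections import defaultdict


def day_of_week_shares(daily_history):
    """Return each weekday's historical share of total volume.

    daily_history: list of (weekday_name, value) pairs (assumed format).
    """
    totals = defaultdict(float)
    for weekday, value in daily_history:
        totals[weekday] += value
    grand_total = sum(totals.values())
    return {day: total / grand_total for day, total in totals.items()}


def allocate_weekly_forecast(weekly_forecast, shares):
    """Split a forecast weekly total across weekdays in proportion to the shares."""
    return {day: weekly_forecast * share for day, share in shares.items()}


# Toy history in which Wednesday carries half of each week's volume.
history = [("Mon", 25), ("Wed", 50), ("Fri", 25),
           ("Mon", 25), ("Wed", 50), ("Fri", 25)]
shares = day_of_week_shares(history)
daily_forecast = allocate_weekly_forecast(1000, shares)
```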
Alternatively, I could load the historical daily values into ARIMA, train, test and let ARIMA predict the next 3 days. The big difference here is the whole "predict weekly" versus "predict daily".
In the time series forecasting space, is this a common debate, and if so, perhaps someone can suggest some keywords I can google to educate myself on the pros and cons?
Also, perhaps there is a suggested algorithm to use when day of week is a factor?
Thanks in advance for any responses.
Dan
This is a standard daily time series problem where there is a day-of-week seasonality. If you are using R, you could make the time series a ts object with frequency = 7 and then use auto.arima() from the forecast package to forecast it. Any other seasonal forecasting method is also potentially applicable.
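To illustrate what a period of 7 means in practice, here is a minimal sketch of a seasonal naive forecast in Python. It is not auto.arima() and not the R code suggested above, just one of the simplest seasonal methods and a common baseline to compare against. The function name is hypothetical, and the series is assumed to be a plain list of daily values with no gaps.

```python
def seasonal_naive_forecast(series, horizon, period=7):
    """Forecast each future day with the value observed one season earlier.

    series: daily values, oldest first, without missing days (assumption).
    horizon: number of days to forecast.
    period: season length; 7 for a day-of-week pattern.
    """
    if len(series) < period:
        raise ValueError("need at least one full season of history")
    last_cycle = series[-period:]
    # Step h (0-based) repeats the matching position in the last full cycle.
    return [last_cycle[h % period] for h in range(horizon)]


# Two weeks of toy data; the next 3 days repeat the start of the last week.
daily = list(range(1, 15))
next_three = seasonal_naive_forecast(daily, horizon=3)
```

A baseline like this ignores the upward trend mentioned in the question, which is one reason a seasonal ARIMA model may do better.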
I'm trying to predict the price of tomatoes. I've collected a data set that contains the previous tomato prices, and I've also added features that might affect the change in tomato price, for example monthly wages in agriculture, monthly inflation rate, and monthly rainfall. Does this qualify as a multivariate time series? What machine learning technique can be used to solve this problem? The constraint is that there are only 48 data points (4 years * 12 months). Also, can the train and test sets be drawn using cross-validation?
Columns in my dataset:
Year
Month
Tomato price
Wage
Inflation
Rainfall
Number of festivals in the month
Thanks in advance !!
I am new to machine learning, so I apologize in advance if the question is not smart enough.
I have just completed learning linear regression. Now I want to apply my skill on a sample e-commerce data. For example, I have a purchase history of a customer on a specific site which is as follows:
Date product amount
2016-12-01 A 300
2016-16-01 B 500
2016-01-02 C 400
..............................
..............................
Now I can predict what his purchases will be in the month of December by fitting a time series regression model.
But now I have been given the purchase history of multiple customers, with an additional customerId column. How can I model it to predict the purchase amount of each customer for the month of December? It does not sound smart to build N models for N individual customers.
Any clue or learning material will be appreciated.
You have to train N models for N customers if you want to predict a weekly/monthly purchase per customer.
However, if you generally want to know how much your customers buy in total, then add up the shopping values of all customers and create a model to predict the total purchase of all customers.
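As a rough sketch of the data preparation behind both options, the purchase log can be rolled up into one monthly series per customer and one overall monthly series. The function name and the input layout are hypothetical: each record is assumed to be a (customer_id, date, amount) tuple with the date as an ISO "YYYY-MM-DD" string, which may differ from the format in the question.

```python
from collections import defaultdict


def monthly_totals(purchases):
    """Aggregate purchase records into monthly totals.

    purchases: iterable of (customer_id, "YYYY-MM-DD", amount) tuples (assumed layout).
    Returns (per_customer, overall): per_customer maps (customer_id, "YYYY-MM")
    to a total, overall maps "YYYY-MM" to the total across all customers.
    """
    per_customer = defaultdict(float)
    overall = defaultdict(float)
    for customer_id, date, amount in purchases:
        month = date[:7]  # "YYYY-MM" prefix of the ISO date
        per_customer[(customer_id, month)] += amount
        overall[month] += amount
    return dict(per_customer), dict(overall)


log = [
    ("c1", "2016-01-12", 300),
    ("c1", "2016-01-16", 500),
    ("c2", "2016-01-20", 100),
    ("c1", "2016-02-01", 400),
]
per_customer, overall = monthly_totals(log)
```

The `overall` series is what a single total-purchase model would be fitted on, while each customer's slice of `per_customer` is the input for a per-customer model.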