Differencing a time series in Prophet - time-series

I have a series that has a linear trend but no seasonality, and I have been trying different time series algorithms. With ARIMA via pmdarima I get good results using first-order differencing of the series.
Next, I am using Prophet. With the series as-is I get a high MAE, so I differenced the series and used Prophet to make predictions. But now the predicted values (yhat) are on the differenced scale. How do I convert the values in the yhat column back to the original scale so that I can calculate MAE and evaluate the model?
Is it even possible? I have tried several approaches, but unlike a min-max scaler there is no ready-made inverse transform; most solutions I have found need the first value of the original series to invert the differencing.
Any help will be appreciated.
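For reference, a first difference is usually inverted with a cumulative sum anchored at the last observed value of the original series. A minimal sketch of that idea (the names train, forecast and horizon are assumptions for illustration; forecast is the DataFrame returned by Prophet's predict on the differenced series):

import pandas as pd

# train: original, undifferenced training series (pandas Series) - assumed name
# forecast: Prophet predict() output whose yhat column is on the differenced scale
# horizon: number of future periods that were forecast
last_level = train.iloc[-1]                      # last observed value of the original series
yhat_diff = forecast["yhat"].tail(horizon)       # predicted first differences
yhat_original = yhat_diff.cumsum() + last_level  # back on the original scale

MAE can then be computed between yhat_original and the held-out actuals on the original scale.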

Related

Determine if one time series forecast another (in terms of trend only)

I have 2 time series, X_t and Y_t, which are on different scales.
Y_t can range from 0 to infinity, while X_t is limited to 0 to 100.
How can I determine whether the trend of X_t forecasts the trend of Y_t? In other words, if there is a peak in X_t, will a peak in Y_t follow after some lag?
If this is indeed the case, what is the lag?
I am not interested in forecasting the actual value of Yt.
Using the following chart as an illustration, the red line is X_t (whose values in my data are between 27 and 34), and the black line is Y_t (which is around 40000).
I tried time-lagged Pearson correlation, but I am aware that the Pearson correlation of the two series has no concept of time; it simply treats the series as lists of values.
I have read some guides on Granger causality, but it seems to check whether the value of X_t is useful in forecasting the value of Y_t, which is essentially a regression framework, whereas I am mostly interested in forecasting the trend of Y_t.
I am a newbie in time series analysis. Thanks for your time!
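One way to estimate the lag being asked about is a lagged cross-correlation: correlate Y_t with X_t shifted by k steps for a range of lags k and pick the lag with the highest correlation. A minimal sketch, assuming x and y are aligned pandas Series (differencing both first focuses the comparison on trend changes rather than raw levels):

import pandas as pd

# x, y: aligned pandas Series (assumed names)
x_d = x.diff().dropna()
y_d = y.diff().dropna()

max_lag = 30  # assumed search range
corrs = {lag: y_d.corr(x_d.shift(lag)) for lag in range(max_lag + 1)}
best_lag = max(corrs, key=lambda k: corrs[k])
print(f"highest correlation {corrs[best_lag]:.3f} at lag {best_lag}")

statsmodels.tsa.stattools.grangercausalitytests can then be used to check whether the lagged X_t values add predictive power beyond Y_t's own history.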

Time Series Data: Trend and Multi-Seasonality, SARIMA and TBATS predictions not working

I have ~2.6K hours of sales data with a positive linear trend as well as daily and weekly seasonality (see the plotted data). I have tried to model the data using SARIMA and TBATS in Python, and in both cases I cannot get the predictions to work as I intend.
For SARIMA, the in-sample predictions look great, but when I try to forecast into the future the result looks completely wrong (see the linked in-sample and out-of-sample SARIMA predictions).
For TBATS, the predicted values match the daily and weekly patterns but miss the positive trend, even though I set use_trend = True (see the linked TBATS prediction).
I have no idea what I'm doing wrong and have been stuck on this for days! Any advice is greatly appreciated.
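For reference, hourly data with daily and weekly cycles is usually passed to TBATS with both seasonal periods stated explicitly. A minimal sketch with the tbats package (sales_series and the 48-hour horizon are assumptions):

from tbats import TBATS

# sales_series: hourly sales values as a 1-D array or Series - assumed name
estimator = TBATS(
    seasonal_periods=[24, 24 * 7],  # daily and weekly cycles in hours
    use_trend=True,                 # keep the positive linear trend in the model
    use_box_cox=False,
)
model = estimator.fit(sales_series)
y_forecast = model.forecast(steps=48)  # next 48 hours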

Forecasting Value in time series data with multiple independent variables

I have a data set with the attributes (Date, Value, Variable-1, Variable-2, Variable-3, Variable-4, Variable-5) and 100k+ rows. I want to predict "Value" in the future based on the 5 variables, trained in a time series manner; "Value" has seasonal trends with low and high scores. Can someone suggest a statistical or machine learning/deep learning approach for this?
Here is a screenshot of the dataset; I want to forecast the Value variable.
This is a very interesting problem, and you can use the vector autoregression (VAR) method to solve it. Packages are available in both R and Python.
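A minimal VAR sketch with statsmodels, assuming df is a DataFrame indexed by Date containing Value and the five variables (the lag order and 12-step horizon are assumptions, and the columns are assumed stationary; difference them first if they are not):

import pandas as pd
from statsmodels.tsa.api import VAR

# df: DataFrame indexed by Date with Value and Variable-1 ... Variable-5 - assumed name
model = VAR(df)
results = model.fit(maxlags=24, ic="aic")  # let AIC pick the lag order

# forecast the next 12 steps for all columns, then keep Value
last_obs = df.values[-results.k_ar:]
forecast = results.forecast(last_obs, steps=12)
forecast_df = pd.DataFrame(forecast, columns=df.columns)
print(forecast_df["Value"])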

Time Series Forecasting

I've been following a lot of tutorials that use LSTMs to forecast time series data. My question is: how do we predict on new data that is not part of the dataset, since almost all the tutorials only show Keras's predict function being used on the test split?
How do we actually forecast into the future?
Usually, you create your training data such that the model receives n points and predicts the following m points. Once your model is trained, you take the last n available points of your dataset (or new points from the present), and the model will output a prediction of the next m points.
If you want to predict more than m points into the future, you can predict m points, use them as input to predict another m points, and so on. However, be aware that with this technique you will probably get worse results, because you are accumulating errors.
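A minimal sketch of that recursive loop, assuming model is a trained Keras model mapping the last n points to the next m points and series is a 1-D NumPy array of the history (names and shapes are assumptions):

import numpy as np

def forecast_future(model, series, n, m, steps_ahead):
    # roll the model forward, feeding its own predictions back in,
    # until steps_ahead future points have been produced
    history = list(series[-n:])
    predictions = []
    while len(predictions) < steps_ahead:
        window = np.array(history[-n:]).reshape(1, n, 1)       # (batch, timesteps, features)
        next_m = model.predict(window, verbose=0).ravel()[:m]  # next m points
        predictions.extend(next_m)
        history.extend(next_m)
    return np.array(predictions[:steps_ahead])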

Understanding multiple Linear regression

I am working on a multiple regression problem and have the data set below.
rank--discipline--yrs.since.phd--yrs.service--sex--salary
[ 1 1 19 18 1 139750],......
I am taking salary as the dependent variable and the other variables as independent variables. After preprocessing the data, I ran a gradient descent regression model and estimated the bias (intercept) and a coefficient for every independent feature.
I want to make a scatter plot of the actual values and draw the regression line for the hypothesis I fitted. Since we have more than one feature here, I have the questions below.
While plotting the actual values (scatter plot), how do I decide the x-axis values? For example, the first row is [1, 1, 19, 18, 1] => 139750. How do I transform or map [1, 1, 19, 18, 1] onto the x-axis? I need to somehow reduce [1, 1, 19, 18, 1] to a single value so I can mark a point (x, y) on the plot.
While plotting the regression line, what feature values should I use to calculate the hypothesis value? I now have the intercept and the weight of every feature, but I don't have a set of feature values. How do I decide on the feature values?
I want to calculate the points myself and use matplotlib to do the plotting. I am aware that there are plenty of tools, including matplotlib, that can do this job, but I want to get a basic understanding.
Thanks.
I am still not sure I completely understand your question, so if something is not what you expected, comment below and we will work it out.
Now,
Query 1: In any dataset with multiple inputs there is no way to view the target variable (salary, in your case) against all of the inputs in a single graph. What is usually done is either to apply dimensionality reduction, for example t-SNE (link) or principal component analysis (PCA), so the output becomes a function of two or three variables that you can plot, or, the technique I prefer, to plot the target against each variable separately as subplots (see the sketch below). The reason is that we simply have no way to comprehend data in more than three dimensions.
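A minimal sketch of those per-feature subplots (df is an assumed name for the questioner's DataFrame; the column names are taken from the question):

import matplotlib.pyplot as plt

# one subplot of salary against each feature
features = ["rank", "discipline", "yrs.since.phd", "yrs.service", "sex"]
fig, axes = plt.subplots(1, len(features), figsize=(18, 3), sharey=True)
for ax, col in zip(axes, features):
    ax.scatter(df[col], df["salary"], s=10)
    ax.set_xlabel(col)
axes[0].set_ylabel("salary")
plt.tight_layout()
plt.show()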
Query 2: If you are not set on matplotlib, I would suggest seaborn.regplot(), but let's also do it with scikit-learn and matplotlib. Suppose the pair you want to start with is 'discipline' vs 'salary'.
from sklearn.linear_model import LinearRegression

# fit salary on a single feature so the relationship can be drawn in 2-D
X = df[['discipline']]   # double brackets keep X two-dimensional
Y = df['salary']
lm = LinearRegression()
lm.fit(X, Y)
After running this, lm.coef_ gives you the slope and lm.intercept_ gives you the intercept of the fitted line; you can then easily plot the data for the two variables together with that line using matplotlib.
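A minimal plotting sketch for that line (reusing the fitted lm and df from above):

import numpy as np
import matplotlib.pyplot as plt

# scatter of the raw points plus the fitted line y = intercept + coef * x
plt.scatter(df['discipline'], df['salary'], s=10, label='actual')
xs = np.linspace(df['discipline'].min(), df['discipline'].max(), 100)
plt.plot(xs, lm.intercept_ + lm.coef_[0] * xs, color='red', label='fitted line')
plt.xlabel('discipline')
plt.ylabel('salary')
plt.legend()
plt.show()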
What you can do is:

from pandas import plotting as pdplt

# dataframe: your full DataFrame; one scatter plot for every pair of columns
pdplt.scatter_matrix(dataframe, figsize=(10, 10), diagonal='kde')
This gives you a matrix of plots (in your case 6x6) that shows exactly how each column in your dataframe relates to every other column, so you can clearly visualise which features dominate the result and how the features are correlated with each other.
If you ask me, this is the first thing to do with this type of problem: then remove the correlated features and select the features that best approximate the output.
But since you need a 2D plot, and with the approach above you might end up with more than a single feature dominating the output, what you can do is use a miracle named PCA.
If you ask me, PCA is one of the most beautiful things in machine learning. It merges all of your features in some ratio and generates the principal components of your data, the components that govern most of the variation. You apply PCA by simply importing it from sklearn and then selecting the first principal component (since you need a 2D plot), or you might select two principal components and plot a 3D graph. But always remember that these principal components are not the real features of your model; they are combinations of them, and how PCA finds them (using eigenvalues and eigenvectors) is very interesting, so you can also build it yourself. A sketch follows below.
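A minimal PCA sketch of that idea with scikit-learn (df and the column names are taken from the question; standardizing first is an added assumption so no single feature dominates the components):

import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

features = ["rank", "discipline", "yrs.since.phd", "yrs.service", "sex"]
X_scaled = StandardScaler().fit_transform(df[features])

# keep one principal component so salary can be plotted against a single axis
pca = PCA(n_components=1)
x_pc1 = pca.fit_transform(X_scaled)[:, 0]
print("explained variance ratio:", pca.explained_variance_ratio_)

plt.scatter(x_pc1, df["salary"], s=10)
plt.xlabel("first principal component")
plt.ylabel("salary")
plt.show()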
Apart from all of this, you can apply singular value decomposition (SVD), which is the essence of linear algebra: a matrix decomposition that exists for every matrix. It decomposes your matrix into three matrices, one of which is a diagonal matrix of singular values (scaling factors) in descending order. What you do is select the top singular values (in your case only the first, with the highest magnitude) and project the feature matrix from 5 columns down to 1 column, then plot that. You can compute the SVD with numpy.linalg.
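A rough sketch of that SVD route with numpy (same assumed df and feature list; centering or scaling the columns first is usually advisable):

import numpy as np

features = ["rank", "discipline", "yrs.since.phd", "yrs.service", "sex"]
A = df[features].to_numpy(dtype=float)

# economy SVD: A = U @ diag(s) @ Vt, singular values in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# keep only the leading singular direction -> one column summarizing the features
x_1d = U[:, 0] * s[0]

x_1d can then be plotted against salary just like the first principal component.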
Once you have applied any one of these methods, you can learn your hypothesis with only the single most important selected feature and finally plot the graph. One tip, though: do not drop other important features just to get a 2D plot, because you may have three principal components with almost the same contribution, or the top three singular values may be very close to each other. So take all important features into account, and if you need a visualisation of those important features, use the scatter matrix.
Summary:
All I want to mention is that you can follow the same process with any of these techniques, and you can also invent your own statistical or mathematical model for compressing your feature space.
For me, I prefer to go with PCA, and for this type of problem I even plot the scatter matrix first to get a visual intuition for the data. PCA and SVD also help to remove redundancy and hence reduce overfitting.
For the remaining details, refer to the docs.
Happy machine learning...

Resources