How to use an M-estimator to estimate time series model parameters in Python? - time-series

How can I use the code from https://www.statsmodels.org/dev/generated/statsmodels.robust.robust_linear_model.RLM.html for time series data?
I want to estimate the parameter of an AR(1) model with an M-estimator.

Related

How to compare forecast based on the data of raw series itself and that of the raw series integrated of order 1?

I have a dataset of close prices, let's say Close. After running the unit root tests I decided to go ahead with two different models. In EViews the equations are as follows:
Close c ar(1)
and
d(Close) ma(5)
The first model takes a constant and an AR(1) process, while the second one takes an MA(5) process on the first-differenced series.
After getting the forecasts for both, my question is: how can I compare them and decide which forecast is better, and why? RMSE and some other statistics are supposed to be scale-dependent, and I wonder whether differencing affects this scale. The output for both forecasts, along with the statistics, is shown in the picture: the two forecasts
Thank you for your help.
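One way to make the two forecasts comparable, sketched under the assumption that the second model outputs forecasts of d(Close): cumulate the differenced forecasts starting from the last observed level, then compute RMSE for both models on the level of Close. The helper names integrate_forecast and rmse are illustrative.

```python
import math
from itertools import accumulate


def integrate_forecast(last_level, diff_forecast):
    """Turn forecasts of d(Close) into forecasts of Close.

    last_level is the last observed Close before the forecast period.
    """
    levels = accumulate(diff_forecast, initial=last_level)
    return list(levels)[1:]  # drop the starting level itself


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up numbers, for illustration only.
actual_close = [101.0, 99.5, 100.0]
ar1_forecast = [100.5, 100.2, 100.1]             # already on the level scale
ma5_diff_forecast = [1.0, -2.0, 0.5]             # forecasts of d(Close)
ma5_forecast = integrate_forecast(100.0, ma5_diff_forecast)

print(rmse(actual_close, ar1_forecast), rmse(actual_close, ma5_forecast))
```

With both error measures computed against the same actual Close values, the RMSE figures refer to the same scale and can be compared directly.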

Differencing a time series in Prophet

I have a series that has a linear trend but no seasonality. I have been trying different time series algorithms. I have tried ARIMA using pmdarima and I get good results with 1st order differencing of the series.
Next, I am using Prophet. With the series as is I get a high MAE. So I differenced the series and used Prophet to make predictions. But now the predicted values (yhat) are the differenced values. How do I convert the predicted values in the yhat column to the original scale so that I can calculate MAE and evaluate the model?
Is it even possible? I have tried all the solutions I could find, but since this is unlike a min-max scaler, I have not found a way to do it. Most of the solutions require the first value of the original series to invert the differencing.
Any help will be appreciated.
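A minimal sketch of inverting first-order differencing, assuming the forecast frame holds only rows after the end of the training period and that the last value of the original (undifferenced) training series is available. The column name yhat_level and the helper names are hypothetical; only yhat comes from the question.

```python
import pandas as pd


def add_level_forecast(forecast, last_train_value):
    """Return a copy of `forecast` with predictions on the original scale.

    `forecast["yhat"]` is assumed to hold predicted first differences,
    ordered in time and starting right after the training period.
    """
    out = forecast.copy()
    # Each level is the last known level plus all predicted changes so far.
    out["yhat_level"] = last_train_value + out["yhat"].cumsum()
    return out


def mae(actual, predicted):
    """Mean absolute error, ignoring any index labels."""
    actual = pd.Series(actual).to_numpy()
    predicted = pd.Series(predicted).to_numpy()
    return float(abs(actual - predicted).mean())


# Made-up values, for illustration only.
forecast = pd.DataFrame({"yhat": [2.0, 1.5, 2.5]})
levels = add_level_forecast(forecast, last_train_value=10.0)
print(mae([12.0, 14.0, 15.0], levels["yhat_level"]))
```

The anchor value is unavoidable: differencing discards the level, so some known value of the original series is needed to recover it.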

How to apply numerical operations to forecast object?

I'm constructing an ARIMA model. My data is monthly, hence I adjusted each data point for the calendar effect. After fitting the ARIMA model and forecasting, I'd like to back-transform the result. How can I access the forecast object's mean and prediction intervals to apply numerical operations (so that it still remains a forecast object)? Any help would be highly appreciated.
I must have missed something. I updated R and the forecast package, and now we can just apply the raw transformation and the forecast object won't change its class. I have no clue why the object was changing its class from forecast to list before the update.
For example:
forecast[["mean"]] <- (forecast[["mean"]]/30)*monthdays(forecast[["mean"]])

Normalize time-series data before or after split of training and testing data?

I use a classification model on time-series data where I normalize the data before splitting the data into train and test. Now, I know that train and test data should be treated separately to prevent data leaking. What could be the proper order of normalization steps here? Should I apply steps 1,2,3 separately to train and test after I split data with the help of a sliding window? I use a sliding window here to compare each hour (test) with its previous 24 hrs data (train). Here is the order that I am currently using in the pipeline.
1. Moving averages (mean)
2. Resampling every hour
3. Standardization
4. Split data into train and test using a sliding window (of length 24 hrs (train), sliding every 1 hr (test))
5. Fit the model using train data
6. Predict using the test data
Steps 1 and 2 can be done safely; you should just take into account that the moving average must use only past values: X'_i = mean(X_i, X_{i-1}, X_{i-2}, ..., X_{i-n}).
However, in step 3, the normalization/standardization parameters (max and min if you are using a min-max scaler, or mean and standard deviation if you are using standardization) should be computed from the training data only and then applied to the whole dataset, so your pipeline would be something like this:
1. Moving average (using only past values)
2. Resampling every hour
3. Split data into train and test.
4. Get standardization parameters from the train data (mean and std).
5. Standardize the whole dataset (train and test) using the parameters computed in 4.
6. Fit the model using train data
7. Predict using the test data
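The pipeline above can be sketched as follows. This is only an illustration under assumptions: the data is a 1-D hourly array, the standardization parameters are recomputed for each sliding window from that window's 24 training points, and the function names are made up.

```python
import numpy as np


def trailing_mean(x, window):
    """Moving average that uses only the current and earlier values."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for i in range(len(x)):
        start = max(0, i - window + 1)
        out[i] = x[start:i + 1].mean()
    return out


def sliding_standardized_splits(x, train_len=24, test_len=1):
    """Yield (train, test) pairs standardized with train statistics only."""
    x = np.asarray(x, dtype=float)
    last_start = len(x) - train_len - test_len
    for start in range(0, last_start + 1, test_len):
        train = x[start:start + train_len]
        test = x[start + train_len:start + train_len + test_len]
        mu, sigma = train.mean(), train.std()
        if sigma == 0:
            sigma = 1.0  # constant window: avoid division by zero
        # The same train-derived parameters are applied to both parts.
        yield (train - mu) / sigma, (test - mu) / sigma


hourly = trailing_mean(np.arange(30.0), window=3)
for train, test in sliding_standardized_splits(hourly):
    pass  # fit the model on `train`, predict on `test`
```

Because the test hour never contributes to mu or sigma, no information from the point being predicted leaks into the scaling.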

ARIMA Nodes in KNIME how to use?

I'm new to KNIME and trying to use ARIMA to extrapolate my time series data, but I've failed to get the ARIMA Predictor to do its work.
The input data has the following format:
year,cv_diff
2011,-4799.099999999977
2012,60653.5
2013,64547.5
2014,60420.79999999993
I would like to predict values for, for example, the years 2015 and 2016.
I'm using the String to Date/Time node to convert year to a date. In ARIMA Learner I can choose only the cv_diff field. And this is the first question: for the option 'Column containing univariate time series', should I set the year column or the variable that I'm going to predict? In my case I have only one option, the cv_diff variable. After that I connect the Learner's output to the ARIMA Predictor's input and execute. Execution fails with ' ERROR ARIMA Predictor 2:3 Execute failed: The column with the defined time series was not found. Please configure the node anew.'
Please help me understand which variable I should set for the Learner and the Predictor. Should it be a non-time-series variable? And how will the ARIMA nodes then know which column to use as the time series?
You should set cv_diff as the time series variable and connect the input to the predictor too. (And do not try to set too large values for the parameters, because with so few data points learning will not work.)
Finally, I've figured it out. The option 'Column containing univariate time series' for the ARIMA Learner node seems a little bit confusing, especially for those unfamiliar with time series analysis. I shouldn't have provided any time field explicitly, because ARIMA treats the variable it is going to predict as having been collected at equal time intervals, and it doesn't matter what kind of intervals they are.
I've found a good explanation of what 'univariate time series' means:
The term "univariate time series" refers to a time series that consists of single (scalar) observations recorded sequentially over equal time increments. Some examples are monthly CO2 concentrations and southern oscillations to predict El Niño effects.
Although a univariate time series data set is usually given as a single column of numbers, time is in fact an implicit variable in the time series. If the data are equi-spaced, the time variable, or index, does not need to be explicitly given. The time variable may sometimes be explicitly used for plotting the series. However, it is not used in the time series model itself.
So, I should choose the cv_diff variable for both the Learner and the Predictor and not provide any timestamps or any other time-related columns.
One more thing that I didn't understand at first: I have to train on some series of data and then provide another SERIES for which I want predictions. That is a little bit different from other machine learning workflows, where you only need to provide new data and there is no notion of a series at all.
