Cannot remove time series seasonality

I am currently working on fitting a SARIMA model to my dataset. When I examine the ACF of the dataset, there is seasonality, which shows up as a slow linear decay at lags 24, 48, 72, ... (hourly data). Following the standard procedure, I differenced at lag 24 to remove the seasonality. However, no matter how many times I difference (I have tried 3 or 4 rounds of differencing at lag 24), the ACF still shows a spike at lag 24 that falls outside the confidence bounds. Can anyone help me with that?
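The procedure described above can be sketched as follows. This is a minimal illustration on a made-up series (a fixed daily pattern plus white noise), not a diagnosis of the asker's data; `seasonal_diff` and `acf_at` are hypothetical helper names. It shows one possible reason a lag-24 spike can persist: once the seasonal pattern is gone, each extra seasonal difference of white noise pushes the lag-24 autocorrelation further negative (theoretically -1/2, -2/3, -3/4 for one, two and three differences).

```python
import math
import random


def seasonal_diff(x, s):
    """Return x[t] - x[t - s] for every t >= s."""
    return [x[t] - x[t - s] for t in range(s, len(x))]


def acf_at(x, lag):
    """Sample autocorrelation of x at a single lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return ck / c0


random.seed(0)
S = 24  # hourly data with a daily season

# Hypothetical series: fixed daily pattern plus white noise.
series = [10 * math.sin(2 * math.pi * t / S) + random.gauss(0, 1)
          for t in range(S * 200)]

lag24 = []
current = series
for d in range(4):  # D = 0, 1, 2, 3 seasonal differences
    lag24.append(acf_at(current, S))
    bound = 1.96 / math.sqrt(len(current))
    print(f"D={d}: ACF at lag {S} = {lag24[-1]:+.2f}, bound = +/-{bound:.3f}")
    current = seasonal_diff(current, S)
```

If the spike left after the first difference is negative, that pattern is usually handled with a seasonal MA term rather than more differencing; whether that applies here depends on the sign and shape of the actual ACF.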

Related

Hard time finding SARIMA parameters from ACF and PACF

I'm a beginner in time series analysis.
I need help finding the SARIMA(p,d,q,P,D,Q,S) parameters.
This is my dataset. Sample time: 1 hour. Season: 24 hours.
S=24
Using the adfuller test I get p = 6.202463523469663e-16. Therefore the series is stationary.
d=0 and D=0
Plotting ACF and PACF:
Using this post:
https://arauto.readthedocs.io/en/latest/how_to_choose_terms.html
I learned to "start counting how many “lollipop” are above or below the confidence interval before the next one enter the blue area."
So looking at the PACF I can see maybe 5 before one falls inside the confidence band. Therefore the non-seasonal AR order is p=5.
But I am having a hard time finding q, the MA parameter, from the ACF.
"To estimate the amount of MA terms, this time you will look at ACF plot. The same logic is applied here: how much lollipops are above or below the confidence interval before the next lollipop enters the blue area?"
But in the ACF plot not a single lollipop is inside the blue area.
Any tips?
There are many different rules of thumb, and everyone has their own views. I would say that in your case you probably do not need the MA component at all. The lollipop rule applies to ACF/PACF plots that have a sharp cut-off after a certain lag, as your PACF does after the second or third lag. Your ACF is trailing off, which can indicate that an MA component is not needed. You do not necessarily have to use one, and sometimes the data is simply not suited to an MA model. A good tip is to always check what pmdarima’s auto_arima() function returns for your data:
https://alkaline-ml.com/pmdarima/tips_and_tricks.html
https://alkaline-ml.com/pmdarima/modules/generated/pmdarima.arima.auto_arima.html
Looking at your autocorrelation plot, you can clearly see the seasonality. Just because the ADF test tells you the series is stationary does not mean it necessarily is. You should at least check whether your model works better with seasonal differencing (D).
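To make the "sharp cut-off" versus "trailing off" distinction concrete, here is a small sketch on simulated data. The two processes, their coefficients, and the helper names are assumptions chosen for illustration, and the band is the simple white-noise band 1.96/sqrt(n), which may differ from what a plotting library draws. It applies the lollipop count to the ACF of an MA(1) series and of an AR(1) series.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelations of x at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
            for k in range(1, max_lag + 1)]


def leading_significant_lags(acf_values, n):
    """Count lags from lag 1 onward until the first one inside the band."""
    bound = 1.96 / math.sqrt(n)
    count = 0
    for r in acf_values:
        if abs(r) <= bound:
            break
        count += 1
    return count


random.seed(1)
n = 5000
noise = [random.gauss(0, 1) for _ in range(n + 1)]

# Hypothetical MA(1): x[t] = e[t] + 0.8 * e[t-1]
ma1 = [noise[t] + 0.8 * noise[t - 1] for t in range(1, n + 1)]

# Hypothetical AR(1): x[t] = 0.8 * x[t-1] + e[t]
ar1 = [0.0]
for t in range(1, n + 1):
    ar1.append(0.8 * ar1[-1] + noise[t])
ar1 = ar1[1:]

ma_run = leading_significant_lags(sample_acf(ma1, 30), n)
ar_run = leading_significant_lags(sample_acf(ar1, 30), n)
print("MA(1) leading significant ACF lags:", ma_run)
print("AR(1) leading significant ACF lags:", ar_run)
```

The MA(1) count stays short because its ACF cuts off, while the AR(1) count runs long because its ACF decays gradually, which is the situation where counting ACF lollipops does not give a usable q.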

Quantify stationary seasonality

I want to quantify seasonal variation so that I can determine whether one dataset has more seasonal variation than another.
I am analyzing weekday variation in sales for a store (Store A). I have data for 1995 to 1999 and for 2005 to 2009.
My aim is to identify and compare the daily seasonality in 1995-1999 and 2004-2009.
I have worked with seasonality before, but I have never used any method to quantify seasonality.
I have identified the seasonal components using the decompose() function in R.
I run two separate models, one for 1995-1999 and one for 2004-2009.
I use additive models because the seasonality does not vary within these periods.
I report the results as a seasonal index.
It is easy to see (Figure below) that there was less seasonality in 2005-2009 (dotted line) compared to 1995-1999 (solid).
However, I would like to be able to quantify the difference in seasonality.
Is it correct to use a simple Coefficient of variation (CV)? CV in 1995-1999 = 0.15. CV in 2005-2009 = 0.5.
Figures: strength of seasonality; small vs. large seasonality (https://i.stack.imgur.com/ibD44.png)
I have read about the strength of seasonality and wonder what it really indicates. What is meant by strength of seasonality? The feat_stl() function in R produces seasonal_strength. But is this really an indicator of how much seasonality a seasonal pattern holds? Does strength mean "how much"?
Is the total area under/above the seasonality line not a better measure of increasing/declining seasonality? The blue line obviously represents much more seasonal variation than the red line. If you measure the area below/above the lines, these areas also show this clearly.
Is measuring the total area above/below the line a workable way to quantify seasonal variation?
I understand that it can be more complex if the seasonal pattern is very fluctuating because that is also part of seasonal variation.
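The two candidate measures can be compared side by side. The sketch below is an illustration under stated assumptions: the series are simulated, trend-free, and decomposed by plain weekday means rather than by STL, and all function names are hypothetical. It computes the commonly cited strength-of-seasonality measure, F_S = max(0, 1 - Var(R) / Var(S + R)), and the mean absolute seasonal index, which is proportional to the area between the seasonal curve and zero.

```python
import random
import statistics


def weekday_decompose(x, period=7):
    """Additive split of a trend-free series into seasonal and remainder."""
    overall = statistics.fmean(x)
    seasonal_index = [statistics.fmean(x[i::period]) - overall
                      for i in range(period)]
    seasonal = [seasonal_index[t % period] for t in range(len(x))]
    remainder = [x[t] - overall - seasonal[t] for t in range(len(x))]
    return seasonal_index, seasonal, remainder


def seasonal_strength(seasonal, remainder):
    """F_S = max(0, 1 - Var(R) / Var(S + R))."""
    s_plus_r = [s + r for s, r in zip(seasonal, remainder)]
    ratio = statistics.pvariance(remainder) / statistics.pvariance(s_plus_r)
    return max(0.0, 1.0 - ratio)


def mean_abs_index(seasonal_index):
    """Average distance of the seasonal curve from zero (an 'area' proxy)."""
    return statistics.fmean([abs(v) for v in seasonal_index])


random.seed(2)
pattern = [-3, -2, -1, 0, 1, 2, 3]  # hypothetical weekday effects


def simulate(scale, weeks=260):
    """Made-up daily sales: level 100, scaled weekday pattern, unit noise."""
    return [100 + scale * pattern[t % 7] + random.gauss(0, 1)
            for t in range(7 * weeks)]


results = {}
for label, scale in [("large", 1.0), ("small", 0.25)]:
    index, seasonal, remainder = weekday_decompose(simulate(scale))
    results[label] = (seasonal_strength(seasonal, remainder),
                      mean_abs_index(index))
    print(label, "strength=%.2f  mean|index|=%.2f" % results[label])
```

The two numbers answer different questions: the strength measure is relative to the noise level (it shrinks if the remainder gets noisier even when the seasonal curve is unchanged), while the area-style measure depends only on the size of the seasonal curve itself.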

Deep Q Learning Network converging behavior is strange

I have implemented a deep Q-learning network (from the original paper, without any subsequent modifications or improvements) to train an agent to play tic-tac-toe. My hyperparameters are as follows:
Network structure: 3-layer MLP with 1 hidden layer (150 nodes)
Input: The input is of shape (9,) with possible values [1,0,-1] representing a board state
Output: The output is of shape (9,) with possible values [0,1] representing the possible actions
Reward: 10 for a win, -10 for a loss, 0 for a draw
Gamma (discount for future reward): 0.99
Epsilon (for exploration): initially 0.3, then decreases linearly with respect to the number of episodes
Replay memory: 2000000, so stored samples will never be replaced by new samples (because the memory will never be full)
Gradient descent method: Momentum SGD
Loss function: square(y-x)
I use random moves as the exploration strategy in the epsilon step, and I use the same strategy to simulate the opponent's moves.
I then use the win rate over the 200 most recent games during training to evaluate the model's strength.
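The pieces of this setup can be written down as a minimal sketch. It is not the poster's code: the function names are hypothetical, the epsilon schedule is assumed to decay to 0 over a made-up `total_episodes` horizon (the post gives neither the end value nor the horizon), and the network itself is left out.

```python
from collections import deque

GAMMA = 0.99          # discount factor from the post
EPSILON_START = 0.3   # initial exploration rate from the post


def td_target(reward, next_q_values, terminal):
    """Q-learning target: r if the game ended, else r + gamma * max Q(s', a')."""
    if terminal:
        return reward
    return reward + GAMMA * max(next_q_values)


def squared_loss(target, predicted_q):
    """square(y - x) for the action that was actually taken."""
    return (target - predicted_q) ** 2


def epsilon_at(episode, total_episodes):
    """Linear decay from EPSILON_START to 0 (the end value is an assumption)."""
    fraction = min(episode / total_episodes, 1.0)
    return EPSILON_START * (1.0 - fraction)


class RollingWinRate:
    """Win rate over the most recent `window` games."""

    def __init__(self, window=200):
        self.results = deque(maxlen=window)

    def add(self, won):
        self.results.append(1 if won else 0)

    def value(self):
        return sum(self.results) / len(self.results) if self.results else 0.0


# Usage with made-up numbers
y = td_target(reward=0, next_q_values=[1.0, 2.0, 3.0], terminal=False)
print(y, squared_loss(y, predicted_q=2.5))
print(epsilon_at(15000, total_episodes=60000))
```

Note that a 200-game rolling win rate measured during training also includes the agent's own exploratory moves, so it moves with epsilon as well as with the quality of the learned Q values.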
The training process is above average at the beginning: the win rate rises from around 0.4 to around 0.70 and stays there for quite a lot of episodes (slowly rising to around 0.78).
However, at some point (about 30000 episodes), the win rate drops sharply to 0.6 and stays there.
Could anyone possibly give me some guidance about why this decay happens?
I made some modifications:
Decreased the replay memory size to 100000. According to my experiments, the replay memory size does not seem to be the reason the collapse happens.
Increased the minibatch size sampled from the replay memory.
I am running another experiment. So far the win rate has not collapsed.
Here are the curves (the left figure is the win rate and the right figure is the loss for the Q value):
The behavior mentioned above finally happened again:
Smooth Figure
Is there any reason why this happens?
Final results:

The convolutional neural network I'm trying to train is settling at a particular range of loss values; how should I avoid it?

Description: I am trying to train an AlexNet-like CNN (actually the same, but without groups) from scratch (50000 images, 1000 classes and x10 augmentation). Each epoch has 50,000 iterations and the image size is 227x227x3.
The cost declined smoothly and the accuracy improved for the first few epochs, but now I'm facing a problem where the cost has settled at ~6 (it started from 13) for a long time. It's been a day, and the cost is continuously oscillating in the range 6.02-6.7. The accuracy has also stagnated.
Now I'm not sure what to do, and I don't have any proper guidance. Is this a problem of vanishing gradients in a local minimum? To avoid it, should I decrease my learning rate? Currently the learning rate is 0.08 with ReLU activation (which helps in avoiding vanishing gradients), Glorot initialization and a batch size of 96. Before making another change and training again for days, I want to make sure that I'm moving in the right direction. What could be the possible reasons?
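One quick sanity check for a plateau like this is to compare it with the loss of a model that guesses uniformly. The sketch below assumes the cost is a softmax cross-entropy using the natural logarithm, which the post does not state, and the function name is hypothetical. Under that assumption the chance-level loss for 1000 classes is ln(1000), roughly 6.91, so a plateau of 6.02-6.7 sits only slightly below chance.

```python
import math


def chance_level_cross_entropy(num_classes):
    """Cross-entropy (natural log) of a uniform prediction over num_classes."""
    return -math.log(1.0 / num_classes)


baseline = chance_level_cross_entropy(1000)
print(f"chance-level loss for 1000 classes: {baseline:.3f}")
for observed in (6.02, 6.7):
    print(f"observed {observed}: {baseline - observed:.2f} below chance level")
```

If the reported cost also includes a regularization term, the comparison is only approximate.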

Contradictory results from Box test and ACF/PACF plots of a seasonal ARIMA model

I have fitted an ARIMA(0,1,0)(0,1,1)12 model, where the seasonal lag is 12. The ACF and PACF plots of the residuals suggest a pattern, whereas the Box test for 12 lags, 24 lags, etc. gives p-values in the range of 20% to 40%, indicating randomness.
Is it really possible to get contradictory results, or might there be a problem in the way I modeled it?
I am using the Arima function in R.
The original series has strong seasonality and upward trend.
Regards
Lakshmi
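A small worked example shows how the two diagnostics can disagree. It assumes the Ljung-Box form of the test (the post only says "box test") and uses made-up residual autocorrelations; for simplicity it compares against the chi-square critical value with 12 degrees of freedom and ignores the usual reduction for fitted parameters. A single lag can fall outside the per-lag band while the joint statistic over 12 lags stays well below its critical value.

```python
import math


def ljung_box_q(acf_values, n):
    """Q = n (n + 2) * sum over k of r_k^2 / (n - k), for k = 1..h."""
    return n * (n + 2) * sum(r * r / (n - k)
                             for k, r in enumerate(acf_values, start=1))


n = 120                      # hypothetical: 10 years of monthly residuals
residual_acf = [0.03] * 12   # hypothetical small autocorrelations at lags 1..12
residual_acf[4] = 0.22       # one hypothetical spike at lag 5

bound = 1.96 / math.sqrt(n)  # approximate 95% band drawn on ACF plots
q_stat = ljung_box_q(residual_acf, n)
CHI2_95_DF12 = 21.026        # 95% quantile of chi-square with 12 df

print(f"lag-5 autocorrelation {residual_acf[4]} vs band {bound:.3f}")
print(f"Ljung-Box Q over 12 lags = {q_stat:.2f} (critical value {CHI2_95_DF12})")
```

The per-lag band tests each lag on its own, so about 1 in 20 lags is expected to cross it by chance, while the Box test pools all lags into one statistic and can dilute a single moderate spike.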
