Stationary and nonstationary time series data - machine-learning

This question might sound trivial to some but I just got interested in time series analysis and have been reading up about it in the last couple of days. However, I am yet to understand the topic of identifying stationary/non-stationary time-series data. I generated some time-series data of two dimensions using some tool I found. Plotting it out, I get something like in this image:
Looking at the plot, I think it shows some seasonalities (with the spike in the middle) and I would say its not stationary. However, doing the stationarity test as described in Machine Learning Mastery, it passed the stationarity test (the tests says its stationary) . Now, I'm confused maybe I didn't understand what seasons and trends means in time-series data. Am I wrong in thinking that the spikes hints at seasons?

Judging from the plot, your data looks like white noise, which is a type of stationary random data. A stationary time series has constant mean (in your case zero), variance, autocorrelation, etc. across time.
Seasonality is regular patterns that occur at specific calendar intervals within a year, for example quarterly, monthly, or daily patterns. Accordingly, large spikes in a plot do not usually indicate seasonality.
In contrast, the following time series (using R) exhibits an upward trend, monthly seasonality, and increasing variance:
plot(AirPassengers)
In sum, the AirPassengers time series is not stationary.

Related

How can I split EEG evoked potentials in different frequency bands using MNE-python?

So far, I have calculated the evoked potentials. However, I would like to see if there is relatively more activity in the theta band wrt the other bands. When I use mne.Evoked.filter, I get a plot which lookes a lot like a sine wave, containing no useful information. Furthermore, the edge regions (time goes from -0.2s to 1s) are highly distorted.
Filtering will always result in edge artifacts, especially for low frequencies like theta (longer filter). To perform analyses on low frequency signal you should epoch your data into longer segments (epochs) than the time period you are interested in.
Also, if you are interested in theta oscillations it is better to perform time-frequency analysis than filter the ERP. ERP contains only time-locked activity, while with time-frequency representation you will be able to see theta even in time periods where it was not phase-aligned across trials. You may want to follow this tutorial for example.
Also make sure to see the many rich tutorials and examples in mne docs.
If you have any further problems we use Discourse now: https://mne.discourse.group/

Which machine learning algorithm should i use to predict if particular parking space will be occupied?

I'm working on my idea for Master thesis topic.
I get a dataset with milions of records which describe on-street parking sensors.
Data i have :
-vehicle present on particular sensor ( true or false)
It's normal that there are few parking event where there are False values with different duration time in a row.
-arrival time and departure time(month,day,hour,minute and even second)
-duration in minutes
And few more columns, but i don't have any idea how to show in my analysis that "continuity of time" and
reflect this in the calculations for a certain future time based on the time when the parking space was usually free or occupied.
Any ideas?
You can take two approaches:
If you want to predict whether a particular space will be occupied or not and if you take in count order of the events (TIME), this seems like a time series problem. You should start by trying simple time-series algorithms like Moving average or ARIMA Models. There are more sophisticated methods that take in count long and short term relationships, like recurrent neural networks, especially LSTM (Long short-term memory) which have shown good performance in time series problems.
You can take in the count all variables and use them to train a clustering algorithm like K-means or SVM.
As you pointed out:
And few more columns, but I don't have any idea how to show in my analysis that "continuity of time" and reflect this in the calculations for a certain future time based on the time when the parking space was usually free or occupied.
I recommend you to work this problem as a time series problem.
Timeseries modeling will be better option for this kind of modelling. As you said you want to predict binary output at different time intervals i.e whether the the parking slot will be occupied at the particular time interval or not. You can use LSTM for this purpose.
Time series is definitely an option here... if you are really going with LSTMs why not look into Transformers and take advantage of attention mechanism while doing time series forecasting !! I don't know them thoroughly, yet, just have a vague idea and performance benefits over RNNs and LSTM.

How to level the trend line when using Facebook Prophet for forecasting

I am using an additive Facebook Prophet model for predicting forecasts. On my test data it seems to be understanding the spikes well, but the overall trend is completely wrong, based on the last few data points going down:
Is there a way to level the trend line? I have adjusted changepoint_prior_scale many times, the current settings are:
changepoint_prior_scale = 0.25, changepoint_range=0.85
I am surprised that it continues to project a negative trend rather than a level trend, since the forecast window is quite long compared to the regularity of the major changepoints. Does Prophet always forecast trend so linearly? Is there a way to adjust this?

Which Model to use when trying to forecast a Time Series

At the beginning I was already confident about using ARIMA (because it was the plebiscited and recommended one when dealing with univariate time series which is not stationary. (I thought I will have to deal with a completely non stationary time series data) So my concern is about the fact that my time series is being stationary just after the first two months (see picture)
Should I still use ARIMA (or ARMA without differencing) or another method? Which one will be the "best" model for me to use regarding the Plotted data.
Thanks
After looking your data, you can try moving average ,Simple Exponential Smoothing and prophet.

Similarity of trends in time series analysis

I am new in time series analysis. I am trying to find the trend of a short (1 day) temperature time series and tried to different approximations. Moreover, sampling frequency is 2 minute. The data were collocated for different stations. And I will compare different trends to see whether they are similar or not.
I am facing three challenges in doing this:
Q1 - How I can extract the pattern?
Q2 - How I can quantify the trend since I will compare trends belong to two different places?
Q3 - When can I say two trends are similar or not similar?
Q1 -How I can extract the pattern?
You would start by performing time series analysis on both your data sets. You will need a statistical library to do the tests and comparisons.
If you can use Python, pandas is a good option.
In R, the forecast package is great. Start by running ets on both data sets.
Q2 - How I can quantify the trend since I will compare trends belong to two different places?
The idea behind quantifying trend is to start by looking for a (linear) trend line. All stats packages can assist with this. For example, if you are assuming a linear trend, then the line that minimizes the squared deviation from your data points.
The Wikipedia article on trend estimation is quite accessible.
Also, keep in mind that trend can be linear, exponential or damped. Different trending parameters can be tried to take care of these.
Q3 - When can I say two trends are similar or not similar?
Run ARIMA on both data sets. (The basic idea here is to see if the same set of parameters (which make up the ARIMA model) can describe both your temp time series. If you run auto.arima() in forecast (R), then it will select the parameters p,d,q for your data, a great convenience.
Another thought is to perform a 2-sample t-test of both your series and check the p-value for significance. (Caveat: I am not a statistician, so I am not sure if there is any theory against doing this for time series.)
While researching I came across the Granger Test – where the basic idea is to see if one time series can help in forecasting another. Seems very applicable to your case.
So these are just a few things to get you started. Hope that helps.

Resources