auto.arima with xreg in R, restriction on forecast periods - time-series

I am using the forecast package and fitting a model with auto.arima and xreg. I want to forecast only 1 year ahead, but I am unable to use the 'h' parameter of the forecast function, for the reason below.
Definition given in the manual (F1 help):
h = "Number of period of forecast but if xreg is used h is ignored and the forecast period will be number of rows"
Please suggest an alternative way to restrict the forecast to a specific number of periods.
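Since the manual says the horizon follows the number of rows in xreg, one possible workaround is to pass only as many rows of future regressors as the periods you want. The trimming step is sketched below in Python with plain lists rather than R. `trim_future_regressors` and the sample data are hypothetical names made up for illustration. The trimmed rows are what would then be supplied as `xreg` to `forecast()`.

```python
def trim_future_regressors(future_xreg, h):
    """Return the first h rows of the future regressor table."""
    if h < 1:
        raise ValueError("h must be at least 1")
    if h > len(future_xreg):
        raise ValueError("need at least h rows of future regressors")
    return future_xreg[:h]


# Hypothetical monthly data: 24 future rows of [month_index, is_year_end].
future_xreg = [[month, month % 12 == 0] for month in range(1, 25)]

# Keep only 12 rows so the forecast covers one year of monthly periods.
one_year = trim_future_regressors(future_xreg, 12)
print(len(one_year))  # 12
```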

Related

Query data based on time interval/frequency

I'm trying to have a calendar display a series of activities based on their type, time and frequency for an easier visualization of data.
So far, I have managed to create a formula that correctly fetches the data that I have on a repository and displays it on the calendar. However, I'm not sure how I can have it account for entries that have a frequency (happening every x days).
For an easier understanding here are screenshots of both the table and the schedule
And here's the current formula I'm using to display the event/activity title in each day/hour at C12 for example:
=IFERROR(
INDEX(Repository!$K:$K,
MATCH(
C$10,
IF(
(Repository!$G:$G=$G$8)*
(Repository!$H:$H=$K$8)*
(Repository!$N:$N>=$B12)*
(Repository!$N:$N<$B12+TIME(2,0,0)),
Repository!$D:$D),
0)
),
"")
What the formula is currently missing is a way to account for the start/end date and the frequency, so that it can tell whether each day meets the specified criteria. If the frequency is 0, I'd like it to ignore the end date entirely (in case I forget to set the end date).
I have tried to adapt the formula above to account for the frequency, but nothing I tried worked.
Minimal example requested by #GabrielCarballo
Entry on the table with a 2 days frequency:
Expected result on the schedule:
So basically, the formula on each cell should check for the start date, end date and frequency of the activity and identify if the specific date on the schedule falls under the specified timeframe.
In this minimal example, the activity starts on the 7th December and repeats every 2 days until the 14th of December.
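The rule described above can be sketched outside the spreadsheet. This is only an illustration in Python, not a Sheets formula. The function name `occurs_on`, the year 2022, and the reading of "frequency 0" as a one-off activity on the start date are all assumptions.

```python
from datetime import date


def occurs_on(day, start, end, every_days):
    """Return True if a repeating activity falls on `day`."""
    if every_days == 0:
        # Assumption: frequency 0 means a one-off activity; the end date is ignored.
        return day == start
    if day < start or (end is not None and day > end):
        return False
    # The activity repeats every `every_days` days counted from the start date.
    return (day - start).days % every_days == 0


# The minimal example: starts 7 December, repeats every 2 days until 14 December.
start, end = date(2022, 12, 7), date(2022, 12, 14)
hits = [d for d in range(1, 32) if occurs_on(date(2022, 12, d), start, end, 2)]
print(hits)  # [7, 9, 11, 13]
```

In a sheet, the same test is a date-range check combined with a remainder check on the number of days since the start date.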
Use this formula in C6:
=INDEX(IFNA(VLOOKUP(TEXT(C4:P4+B6:B14, "e-m-d-h-m")&G$2&K$2, SPLIT(FLATTEN(MAP(
Repository!D$4:D, Repository!O$4:O, Repository!P$4:P, Repository!N$4:N,
Repository!G$4:G, Repository!H$4:H, Repository!K$4:K, LAMBDA(d, o, p, n, g, h, k,
IF(DAYS(o, d)>=SEQUENCE(1, MAX(DAYS(o, d)), 0, p), TEXT(d+SEQUENCE(1,
MAX(DAYS(o, d)), 0, p)+IF(ISODD(HOUR(n)), (HOUR(n)*"1:00")-"1:00", HOUR(n)*"1:00"),
"e-m-d-h-m")&g&h&"×"&k, )))), "×"), 2, )))

Google Sheets Select X where (max(y)<=z)

My title is potentially not that enticing, but I am trying to create a semi-dynamic formula to find the "stock on hand" up to a particular date. There is a set number of location ids, 1-10, and two product types, 3 and 4.
It is not guaranteed that each location will have a stock count on the date in question. I want to use QUERY to find THE MOST RECENT stock count where the location and product type match and the date is <= the given date.
here is the basic formula
=QUERY(Sheet1!A160:E3530,"SELECT D WHERE ((B = "&$H$1&")) AND (E <= date '"&TEXT(MAX($M$3),"yyyy-mm-dd")&"') AND ((A = "&G2&"))", true)
But I need to figure out how to use MAX to find the most recent date within the date range specified.
Any help appreciated!
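To make the intended lookup concrete, here is a minimal Python sketch of "most recent count on or before a date, for one location and product type". It is not a Sheets formula. The field names and the sample rows are invented for illustration and do not come from the real sheet.

```python
from datetime import date


def latest_count(rows, location, product, as_of):
    """Return the count from the most recent matching row dated on or before as_of."""
    matches = [
        r for r in rows
        if r["location"] == location and r["product"] == product and r["date"] <= as_of
    ]
    if not matches:
        return None  # no stock count recorded up to that date
    return max(matches, key=lambda r: r["date"])["count"]


# Hypothetical sample rows.
rows = [
    {"location": 1, "product": 3, "count": 40, "date": date(2021, 5, 1)},
    {"location": 1, "product": 3, "count": 55, "date": date(2021, 6, 10)},
    {"location": 1, "product": 3, "count": 70, "date": date(2021, 7, 2)},
    {"location": 2, "product": 3, "count": 12, "date": date(2021, 6, 1)},
]
result = latest_count(rows, 1, 3, date(2021, 6, 23))
print(result)  # 55
```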
EDIT 23/06/2021
You will note this is a fraction of the data I have in my set (in the example sheet), so most numbers show as zero, but the formula
=MAXIFS($C$3:$C$6040,$A$3:$A$6040,I3,$B$3:$B$6040,$J$2,$E$3:$E$6040,(MAX(QUERY($A$3:$E$6040,"SELECT E WHERE (E <= date '"&TEXT($R$2,"yyyy-mm-dd")&"') AND ((A = "&I3&")) AND ((B = "&$J$2&"))", true))))
works on my full data. It finds the most recent record of equipment type 3 or 4, up to the specified date and from a specified yard. Further filtering is done based on a change type of "converted, removed, dead, added, etc". What I want to do now is build a monthly or fortnightly line chart over time, e.g. the 14th and 29th of each month, or the 20th of each month, and plot the sum of each column J:Q. To start, I hoped to use the date in U:U and populate V:AC accordingly.
I have played with the onEvent script but I am struggling to make progress here.

Google Sheets - Creating a line chart with average closing prices over last 90 days?

I've been trying to create a dataset for a chart all day and it's way beyond me, so hopefully someone can put me out of my misery!
I'm trying to create a line chart of average closing prices for a list of NYSE stocks over the previous 90 days. To build the chart, I believe I need a column of average prices and another column of dates. I should be able to create the chart but I've completely failed at creating the dataset. All I've managed to do is cook my processor.
I have a Column (A2) of NYSE Tickers, and I've tried to build a matrix of prices per Ticker, per date which has taken 2 separate ARRAYFORMULA() functions.
I'm hoping there's a way to process all the data within 1 ARRAYFORMULA() and output the average price per date into each cell, but anything is better than what I've been trying.
Here's some sample data:
NYSE Tickers
HZON
VGAC
BSN
DMYD
SNPR
THCB
(They won't all have data going as far back as 90 days)
Ideally, the output would be:
Avg. Price    Date
$10.11        27/02/2021
$10.08        26/02/2021
$10.02        25/02/2021
(Average price of all NYSE Tickers in my A2 column for that date)
I hope this is possible and someone can help!
Thanks
UPDATE
The dataset is just input manually by me. Pretty much everything about my trial and error attempt is useless. But, I was using this to return historical close prices:
=IFERROR(INDEX(GOOGLEFINANCE(CONCATENATE("NYSE:",$A2), "close", DATEVALUE(B$2)), 2, 2), "")
I'll try to demo what I was trying to accomplish: (best read from inside-out).
Output each average price+date to new rows {
-- For each day (X) - going back 90 days from today {
---- Get average price of array {
------ For each Ticker (build array?) {
-------- Get close price of Ticker on (X) date {
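The nested steps above can be sketched as follows, in Python rather than as an ARRAYFORMULA. The prices are made up, the function name `average_close_by_date` is hypothetical, and fetching the prices (for example with GOOGLEFINANCE) is left out. Tickers with no price on a given date are simply left out of that date's average.

```python
from collections import defaultdict


def average_close_by_date(closes):
    """closes maps ticker -> {ISO date string: close price}; returns {date: mean close}."""
    buckets = defaultdict(list)
    for prices in closes.values():
        for day, price in prices.items():
            buckets[day].append(price)
    # ISO date strings sort chronologically, so reverse order lists the newest date first.
    return {day: sum(v) / len(v) for day, v in sorted(buckets.items(), reverse=True)}


# Hypothetical close prices; VGAC has no data for the older date.
closes = {
    "HZON": {"2021-02-26": 10.0, "2021-02-25": 10.2},
    "VGAC": {"2021-02-26": 11.0},
}
averages = average_close_by_date(closes)
print(averages)  # {'2021-02-26': 10.5, '2021-02-25': 10.2}
```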

Which functions do I need to use for this type of Google Sheets search?

I have a list of dates in column A and a list of prices in column B.
I then fill in the search range manually in D and E.
I need to search for a number that is higher than G or lower than H.
The result is the found date, shown in J. If no matching number is found, it should return E. L shows the price (G or H) that triggered the match, and M shows the found price, i.e. the value in B for the date in J.
Which functions can help me implement this type of search? I tried INDEX and FILTER, but I can't properly apply a "HIGHER THAN" or "LOWER THAN" condition to every cell.
The main goal is to check each cell vertically, one by one, for a price that is higher or lower than the sought values, and, if no such number is found, to return the end date of the search.
Added the Google Sheets link, so you can test your solution and compare it.
https://docs.google.com/spreadsheets/d/1gmw7I778MGfCZENsOos4HhKB07X0wLp7fKs9hyn0a-Q/edit?usp=sharing
You can use the following formulas:
for Expected result:
=IFERROR(INDEX($A$14:$A$23;MATCH(1;((D14<=$A$14:$A$23)*(E14>=$A$14:$A$23)*(((G14<=$B$14:$B$23)+(H14>=$B$14:$B$23))>0));0));E14)
for Triggered price:
=CHOOSE(1+(M14>=G14)+(M14<=H14)*2;"None";G14;H14)
for Founded price:
=IFERROR(INDEX($B$14:$B$23;MATCH(1;((D14<=$A$14:$A$23)*(E14>=$A$14:$A$23)*(((G14<=$B$14:$B$23)+(H14>=$B$14:$B$23))>0));0));ARRAYFORMULA(MAX($B$14:$B$23*(E14=$A$14:$A$23))))
See sheet "Search formula test" in your file.
To return the date, the following formula is used:
=IFERROR(IFERROR(INDEX(FILTER($A$14:$B;$A$14:$A>=D14;$A$14:$A<=E14;$B$14:$B>=G14);1;1);
INDEX(FILTER($A$14:$B;$A$14:$A>=D14;$A$14:$A<=E14;$B$14:$B<=H14);1;1));
MAX(FILTER($A$14:$B;$A$14:$A>=D14;$A$14:$A<=E14)))
It filters the dates by the "Search from" / "Search until" values and by the price check, first for "Higher than" and then, if no values are found, for "Lower than".
The date is returned with INDEX(filterFormula;1;1).
The last part, MAX(FILTER()), returns the last date in the checked range in case no values were found.
For the price the same formula is used, but INDEX(filterFormula;1;2) returns the price and the last part VLOOKUPs the price for the last date in the checked range.
However, there is a problem with this formula approach: it checks the whole selected range for one condition first and only then for the other. A better solution would be a script that checks each cell in the selection against both conditions.
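A row-by-row check of both conditions could look like the sketch below. It is written in Python purely to show the logic, not as a ready-made Sheets script. The function name `find_trigger` and the sample data are hypothetical, and the comparisons are inclusive (>= and <=) to match the formulas above.

```python
from datetime import date


def find_trigger(rows, search_from, search_until, higher_than, lower_than):
    """rows: (date, price) pairs in sheet order. Returns (date, trigger, price)."""
    for day, price in rows:
        if not (search_from <= day <= search_until):
            continue
        # Both conditions are tested on the same row before moving to the next one.
        if price >= higher_than:
            return day, higher_than, price
        if price <= lower_than:
            return day, lower_than, price
    # Nothing matched: fall back to the end date of the search.
    return search_until, None, None


# Hypothetical sample data.
rows = [
    (date(2021, 3, 1), 100),
    (date(2021, 3, 2), 104),
    (date(2021, 3, 3), 96),
    (date(2021, 3, 4), 111),
]
found = find_trigger(rows, date(2021, 3, 1), date(2021, 3, 4), 110, 97)
print(found)  # (datetime.date(2021, 3, 3), 97, 96)
```

With this data, a search that tests "Higher than" over the whole range first would return 4 March, while the row-by-row check returns the earlier 3 March match on "Lower than".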

kusto series_decompose_forecast() returning all nulls

I am trying to explore the forecasting function provided by kusto. I tried the sample which obviously generated the forecasting trend shown by the docs. However, I then tried the forecasting function with similar parameters on our production data. For some reason the forecasted values are all null.
Our kusto raw data looks like the following:
I would like to forecast the values of a0. Here is my query:
...
| distinct ['name']))
| summarize a0=avg(todouble(['temp'])) by d0=bin(['timestamp'], 1s), d1=['name']
| summarize timeAsList = make_list(d0), dataAsList0 = make_list(a0)
| extend forecast = series_decompose_forecast(dataAsList0, 60*60*24*3) // 3 day forecast
| render timechart
This is what the query renders:
This line is just our production data, not a forecast. The actual forecast array is just an array of nulls, as you can see.
What is wrong with the query?
The second parameter of series_decompose_forecast defines the number of points at the end of the original time series to leave out of training. In your case the original time series is about 1:39 hours long (judging by the screenshot), so leaving out 3 days leaves no data for training. You need to extend the time series by the forecast period before calling series_decompose_forecast.
I also recommend using make-series to create the time series, which fills empty gaps, instead of summarize by bin and make_list. The final query should look like the one below. I cannot test it as I have no access to the data. If you need it, please share a sample datatable and I can craft the full working query.
Thanks,
Adi
let start_time=datetime(...);
let end_time=datetime(...);
let dt=1s;
let forecast_points=60*60*24*3;
tbl
| make-series a0=avg(todouble(temp)) on timestamp from start_time to (end_time+forecast_points*dt) step dt
| extend forecast = series_decompose_forecast(a0, forecast_points) // 3 day forecast
| render timechart
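The arithmetic behind the all-null result can be checked with a small sketch. It is in Python only to show the numbers. The helper `training_points` is hypothetical, and the 1:39 hours figure is the estimate read off the screenshot.

```python
def training_points(series_length, forecast_points):
    """Points left for training after the last forecast_points are set aside."""
    return max(series_length - forecast_points, 0)


step_seconds = 1
observed = (1 * 60 + 39) * 60 // step_seconds  # about 1:39 hours of data at a 1s step
forecast_points = 60 * 60 * 24 * 3             # 3 days at a 1s step

# Original query: more points are set aside than the series contains.
print(training_points(observed, forecast_points))  # 0

# After extending the series by the forecast period, all observed points remain for training.
print(training_points(observed + forecast_points, forecast_points))  # 5940
```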
