Array of value using array formula - google-sheets

I want to return an array of values based on multiple conditions, using an array formula.
I managed to build a working formula, but without an array formula, which is annoying because of fill conflicts in the worksheet.
Here is the sheet
Note: the current solution was developed from this initial thread

TL;DR
Use this formula:
=ARRAYFORMULA(TRANSPOSE(SPLIT(TRANSPOSE(IF((F$1:$1&F$2:$2&F$3:$3&F$4:$4)<>"";IFERROR(REPT(";";IFERROR(VLOOKUP(F$4:$4;{$B$5:$B\ARRAYFORMULA(ROW(A5:A)-5)};2;FALSE);-1)+IFERROR(VLOOKUP(F$4:$4;{$C$5:$C\ARRAYFORMULA(ROW(A5:A)-5)};2;FALSE);-1)+IFERROR(VLOOKUP(F$4:$4;{$D$5:$D\ARRAYFORMULA(ROW(A5:A)-5)};2;FALSE);-1)+2);"#Batch? ")&IFERROR(VLOOKUP(F$1:$1&"-----"&F$2:$2;{ARRAYFORMULA(IF(LEN(Fprod!$H$2:$H);Fprod!$F$2:$F&"-----"&Fprod!$G$2:$G;))\Fprod!$H$2:$H};2;FALSE);"#Species/supplier?");";"));";";TRUE;FALSE)))
I'll try to explain my thought process so you are able to make changes or make similar formulas in the future and for reference for other people.
Basic explanation
I've noticed a few things that will help us make a formula for this problem:
In the Fprod sheet you have the list of suppliers and species with the weeks.
There is only a single list for each batch.
Your locale doesn't allow commas in formulas: ; separates function arguments and \ separates columns inside array literals.
The basic idea of the formula is to get a string defining the columns. It's basically the values on Fprod with a padding of semicolons to vertically move it. After that we want to convert it into columns (similar to how it was used in the other question). For example, if the CNC + shiitake (batch 2101) started at week 4, we want to achieve ;;;1;2;3;4;5;6 which then becomes:
2101
<empty>
<empty>
<empty>
1
2
3
4
5
6
Step 1: Getting the spec for a supplier's species
Having the supplier and the species, we want to get the weeks on Fprod (i.e. having CNC + shiitake should give us 1;2;3;4;5;6).
To do so, we'll use a VLOOKUP to get the correct list for the week. The problem is that we need to check for 2 columns. So the trick is to join both columns with some characters in the middle (eg -----) to prevent unexpected collisions. So working on the F column:
=VLOOKUP(
F$1&"-----"&F$2;
{ARRAYFORMULA(IF(LEN(Fprod!$H$2:$H); Fprod!$F$2:$F&"-----"&Fprod!$G$2:$G;)) \ Fprod!$H$2:$H};
2;
FALSE
)
Let's unpack this, as there are already a lot of things.
Let's start with the second argument, as it's the most complex one. It builds a table (2D array) where the first column is <supplier>-----<species> for each entry on Fprod, and the second column is the value we want. To make the first column we can use an ARRAYFORMULA to concatenate both columns row by row:
=ARRAYFORMULA(Fprod!$F$2:$F & "-----" & Fprod!$G$2:$G)
The second column is simply H2:H of Fprod.
They are joined as columns. To see how this works try ={1\2} (usually it's , but because of your locale \ is required). This will generate a result similar to:
1                     2
CNC-----shiitake      1;2;3;4;5;6
euro-----shiitake     1;2;3;4;5;6;7;8;9;10;11;12;13;14
euro-----pleurote     1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22
Lentin-----shiitake   1;2;3;4;5;6;7;8;9;10;11;12;13;14
Moser-----agaricus    1;2;3;4;5;6
-----                 <empty>
-----                 <empty>
As you can see we are also generating empty entries. To ignore them, we'll check that the H column of that row is not empty:
=ARRAYFORMULA(IF(LEN(Fprod!$H$2:$H); Fprod!$F$2:$F&"-----"&Fprod!$G$2:$G;))
Now we can simply use VLOOKUP to get the value in the second column. The value we search for is in the same format as the first column. In this case we'll use the F column and generalize later. If you add this to an F column you should see 1;2;3;4;5;6 appear.
It's important to note the last argument, as the data is not sorted and setting it to true or not setting it (it's true by default) would cause problems.
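The composite-key lookup from this step can also be pictured outside of Sheets. The following is a minimal JavaScript sketch, not part of the original formula: the function name lookupWeeks and the fprod array are hypothetical stand-ins for columns F, G and H of Fprod, filled with values from the table above.

```javascript
// Hypothetical rows standing in for Fprod columns F (supplier), G (species), H (weeks).
const fprod = [
  ["CNC", "shiitake", "1;2;3;4;5;6"],
  ["euro", "shiitake", "1;2;3;4;5;6;7;8;9;10;11;12;13;14"],
  ["Moser", "agaricus", "1;2;3;4;5;6"],
  ["", "", ""], // empty trailing row, like the unused rows of Fprod
];

const SEP = "-----"; // same separator as in the formula

// Build "<supplier>-----<species>" keys and return the weeks of the first match.
function lookupWeeks(rows, supplier, species) {
  const table = new Map();
  for (const [sup, spec, weeks] of rows) {
    if (weeks === "") continue; // mirrors IF(LEN(H); ...; ): skip rows without weeks
    const key = sup + SEP + spec;
    if (!table.has(key)) table.set(key, weeks); // an exact-match VLOOKUP returns the first hit
  }
  const wanted = supplier + SEP + species;
  return table.has(wanted) ? table.get(wanted) : null; // null stands in for #N/A
}

console.log(lookupWeeks(fprod, "CNC", "shiitake"));
```

Skipping rows without weeks is what keeps an empty supplier and species from matching the bare "-----" key.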
Step 2: Generating the padding
To generate the padding we'll use VLOOKUP to detect which row holds the value. We need three lookups, one for each column. The idea is that the first column of the lookup table holds the value to look up (the batch) and the second holds the amount of padding required. It looks like:
={$B$5:$B\ARRAYFORMULA(ROW(A5:A)-5)}
$B$5:$B is simply the column B starting at row 5. ARRAYFORMULA(ROW(A5:A)-5) is a trick:
ARRAYFORMULA(ROW(Ai:Aj)) will return a count list from i to j inclusive. For example ARRAYFORMULA(ROW(A5:A6)) will return
result
5
6
In this case we need the result to have the same size as the other table so we are forced to use ARRAYFORMULA(ROW(A5:A)) to do so. Because of that the values are too big, so we need to subtract 5, giving us ARRAYFORMULA(ROW(A5:A)-5).
Join them into a single table and you have where to lookup:
=VLOOKUP(F$4; {$B$5:$B\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE)
This is very similar to what was done before. If you add this to F5 you'll see that it returns the offset (the week number minus one).
As pointed out before, we need this for all three columns, and we need to make sure it doesn't fail when the batch is not in a column. To do that, we set a default value of -1 for columns that don't contain the batch. This is simple with IFERROR:
=IFERROR(VLOOKUP(F$4; {$B$5:$B\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1)
We cannot use MAX to combine them, because MAX accepts ranges and so, inside ARRAYFORMULA, it would collapse everything into a single value. Because of that we need to get creative, so I added them up. The sum then consists of two -1s plus the value we want, so adding 2 compensates for them:
=IFERROR(VLOOKUP(F$4; {$B$5:$B\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4; {$C$5:$C\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4; {$D$5:$D\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
2
Now we only need to use REPT to repeat ; as many times as we need:
=REPT(
";";
IFERROR(VLOOKUP(F$4; {$B$5:$B\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4; {$C$5:$C\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4; {$D$5:$D\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
2
)
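To see why adding the three lookups plus 2 yields the right offset, here is a small JavaScript sketch of the same arithmetic. It is only an illustration: the names offsetIn and padding and the sample columns are hypothetical, with index 0 standing for row 5 of the sheet.

```javascript
// Position of the batch inside one column, or -1 when it is absent.
// This plays the role of IFERROR(VLOOKUP(...); -1).
function offsetIn(column, batch) {
  return column.indexOf(batch);
}

// Sum the per-column results and compensate for the columns that returned -1.
// With three columns and exactly one hit this is: hit + (-1) + (-1) + 2 = hit.
function padding(columns, batch) {
  const sum = columns.reduce((acc, col) => acc + offsetIn(col, batch), 0);
  const total = sum + (columns.length - 1);
  if (total < 0) return null; // batch not found anywhere: REPT would raise an error
  return ";".repeat(total);
}

// Hypothetical columns B, C and D starting at row 5.
const colB = ["", "", "", "2101"];
const colC = ["2102", "", "", ""];
const colD = ["", "", "", ""];

console.log(padding([colB, colC, colD], "2101"));
```

Like the formula, this only gives the right answer when the batch appears in exactly one of the columns.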
Step 3: Joining them
You can simply concatenate them like regular strings. Note that the padding goes first:
=REPT(
";";
IFERROR(VLOOKUP(F$4; {$B$5:$B\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4; {$C$5:$C\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4; {$D$5:$D\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
2
)&
VLOOKUP(
F$1&"-----"&F$2;
{ARRAYFORMULA(IF(LEN(Fprod!$H$2:$H); Fprod!$F$2:$F&"-----"&Fprod!$G$2:$G;)) \ Fprod!$H$2:$H};
2;
FALSE
)
Step 4: All the columns in one formula
Now is a good moment to apply the formula to all columns. Basically, wherever you had a cell in column F, we now use a range on that row starting at F. For example, F4 becomes F4:4.
=ARRAYFORMULA(
REPT(
";";
IFERROR(VLOOKUP(F$4:$4; {$B$5:$B\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4:$4; {$C$5:$C\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4:$4; {$D$5:$D\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
2
)&
VLOOKUP(
F$1:$1&"-----"&F$2:$2;
{ARRAYFORMULA(Fprod!$F$2:$F&"-----"&Fprod!$G$2:$G) \ Fprod!$H$2:$H};
2;
FALSE
)
)
Notice that if you have data later in the spreadsheet, it will give you an error. You can remove it or set the maximum column.
Also notice that if the column doesn't have the proper data it gives us an error. We'll fix that later.
Step 5: Split into columns
To split the representation into columns we need to use SPLIT. Split will split the values into a row. That means that we need to transpose (see Wikipedia article). So we should transpose, split, and transpose again. So let's add TRANSPOSE and SPLIT:
=ARRAYFORMULA(
TRANSPOSE(
SPLIT(
TRANSPOSE(
REPT(
";";
IFERROR(VLOOKUP(F$4:$4; {$B$5:$B\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4:$4; {$C$5:$C\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4:$4; {$D$5:$D\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
2
)&
VLOOKUP(
F$1:$1&"-----"&F$2:$2;
{ARRAYFORMULA(Fprod!$F$2:$F&"-----"&Fprod!$G$2:$G) \ Fprod!$H$2:$H};
2;
FALSE
)
);
";";
TRUE;
FALSE
)
)
)
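The transpose, split, transpose sequence can be mimicked in JavaScript to see what shape comes out. This is a sketch under assumptions: toColumns is a hypothetical helper, and each input string is the padded spec of one sheet column.

```javascript
// Turn one ";"-separated spec per sheet column into a grid of rows,
// where grid[r][c] is the cell in row r of column c.
function toColumns(specs) {
  // Splitting keeps empty pieces, like SPLIT with remove_empty_text = FALSE.
  const pieces = specs.map((spec) => spec.split(";"));
  const height = Math.max(...pieces.map((p) => p.length));
  const grid = [];
  for (let r = 0; r < height; r++) {
    // Shorter columns are filled with empty cells at the bottom.
    grid.push(pieces.map((p) => (r < p.length ? p[r] : "")));
  }
  return grid;
}

// First column starts at week 4, second column starts at week 1.
console.log(toColumns([";;;1;2;3", "1;2"]));
```

Each spec is split into a row first, and reading the rows position by position is what plays the role of the outer TRANSPOSE.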
Step 6: Handle errors
There can be 3 cases we can handle:
The species and supplier was not found on Fprod
The batch does not exist
The column is empty
For the first one, we can simply wrap the lookup in an IFERROR with a default message. For the second, we can wrap REPT in an IFERROR, because REPT throws an error when the number is negative (when the batch is not found). And for the last one, we use IF to return ";" when the column is empty (SPLIT requires a non-empty string).
Adding all of that, we get our final result:
=ARRAYFORMULA(
TRANSPOSE(
SPLIT(
TRANSPOSE(
IF(
(F$1:$1&F$2:$2&F$3:$3&F$4:$4)<>"";
IFERROR(
REPT(
";";
IFERROR(VLOOKUP(F$4:$4; {$B$5:$B\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4:$4; {$C$5:$C\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
IFERROR(VLOOKUP(F$4:$4; {$D$5:$D\ARRAYFORMULA(ROW(A5:A)-5)}; 2; FALSE); -1) +
2
);
"#Batch? "
)&
IFERROR(
VLOOKUP(
F$1:$1&"-----"&F$2:$2;
{ARRAYFORMULA(IF(LEN(Fprod!$H$2:$H); Fprod!$F$2:$F&"-----"&Fprod!$G$2:$G;)) \ Fprod!$H$2:$H};
2;
FALSE
);
"#Species/supplier?"
);
";"
)
);
";";
TRUE;
FALSE
)
)
)
If you remove any unnecessary white-space, you get the TL;DR formula.
References
VLOOKUP (Google Docs Editors Help)
ARRAYFORMULA (Google Docs Editors Help)
SPLIT (Google Docs Editors Help)
REPT (Google Docs Editors Help)
TRANSPOSE (Google Docs Editors Help)
IF (Google Docs Editors Help)
IFERROR (Google Docs Editors Help)
ROW (Google Docs Editors Help)

Related

How do I set value depending on other columns value, according to some kind of completion key?

I'm using Google Sheets to get an overview on my banking transactions.
I would like to put every transaction in a category, for example, grocery, transport,...
I solved this by using this short script:
=IFS(REGEXMATCH(A1;$K$2); $L$2;REGEXMATCH(A1;$K$3);$L$3;REGEXMATCH(A1;$K$4);$L$4;true;"other")
In which column A is the one I want to check, with all the transactions, K has the shop names I'm checking on, for example "shopname1", and L is the category I would like to put transactions for this shop in.
Now, this works just fine, but the list of shops and categories is only 3 rows long while I'm testing it out. Once I really use it, the list will be quite long, which means my IFS statement will also be very long. This isn't very modular either when it comes to future changes, so I would like to improve it.
I would like to know a way to, for example, let it check against a couple of lists, one for every category.
I hope this makes sense, and that someone has an idea!
You can try this (in B2):
=ARRAYFORMULA(
IF(
A:A = "";;
VLOOKUP(
ROW(A:A);
{
FILTER(ROW(A:A); A:A <> "")\
REGEXREPLACE(
TRIM(
TRANSPOSE(QUERY(
IF(
NOT(REGEXMATCH(
TRANSPOSE(FILTER(A:A; A:A <> ""));
"(?i)" & FILTER(K2:K; K2:K <> "")
));;
FILTER(L2:L; K2:K <> "") & ", "
);;
COUNTA(K2:K)
)) & "other"
);
", other$";
)
};
2;
)
)
)
It will set other if there were no matches, otherwise it will set all matching categories separated by , .
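The matching rule itself is easy to state in code. Here is a minimal JavaScript sketch of the same behaviour, assuming a hypothetical categorize function and a rules list standing in for the K (pattern) and L (category) columns; the shop names are made up.

```javascript
// rules: array of [pattern, category] pairs, like the K and L columns.
// Returns every matching category joined by ", ", or "other" when nothing matches.
function categorize(description, rules) {
  const matches = rules
    .filter(([pattern]) => pattern !== "" && new RegExp(pattern, "i").test(description))
    .map(([, category]) => category);
  return matches.length > 0 ? matches.join(", ") : "other";
}

// Hypothetical rules.
const rules = [
  ["shopname1", "grocery"],
  ["railco", "transport"],
];

console.log(categorize("Card payment SHOPNAME1 12/01", rules));
```

The "i" flag corresponds to the "(?i)" prefix in the formula, so matching is case-insensitive.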

Google Sheet | ArrayFormula - Get next smaller value of COL

Google Sheets. I want to get the next smaller value of the current value in COL A.
I've tried this...
=MAX( FILTER(INDIRECT("A" & ROW()-1 );MAX(A:A)) )
This looks fine for the moment, as long as the values in column A are sorted.
However, the formula above has to be pasted into every cell.
I want to use ARRAYFORMULA() instead. I have been trying on and off for months.
This is one of my latest attempts:
=ArrayFormula(IF(ROW(G:G)=1;"Trip"; IF(ROW(G:G)<3;"0"; MAX( FILTER(INDIRECT("A" & ROW() );MAX(A:A)) ) ))
I have also tried VLOOKUP, but maybe I'm on the wrong track.
Unfortunately I haven't found a solution that matches my case.
Can anybody solve this, or give me a hint so I can solve it myself?
UPDATE Jan 26, 2021
Here we go... I've created a dummy sheet based on the original values, with the unneeded columns removed.
https://docs.google.com/spreadsheets/d/1yI0UEdZ3aKU03ElPchUuAPcmnAoMmCyXWhZBcu2Hv3g/edit#gid=0
Col A - is the "current value"
Col G+H - are some tries with ArrF
Col J - is working but not with ArrF - Shows the diff to the last value. Yes this can also be done without INDIRECT() and Co. But I want to try the basic logic.
Col K - show the last value.
Currently Col A is sorted. If it is not, the diff doesn't work. I have added some values from above (marked grey) to simulate this.
The goal should be to get the next smaller value of "current" COL A using ArrayFormula.
Use OFFSET
See the docs: OFFSET
=
{
"Last";
0;
ArrayFormula( OFFSET(A3:A, -1, 0) )
}
=
{
"Diff";
0;
ArrayFormula( A3:A100 - OFFSET(A3:A100, -1, 0) )
}
I changed the structure slightly so that it would be contained within arrays. That way you don't need the IF statements for the headers, for example.
I also did not use the whole column notation A:A partly because the array already has A1 and A2 covered. I tried A3:A but that didn't work for the diff column, because it always says it needs more rows. Probably because it needs to reference a row that is not on the same row, if that makes sense.
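For reference, the two columns computed by the OFFSET formulas correspond to the following JavaScript sketch. The function name lastAndDiff is hypothetical, and values stands for A2:A. Like the formulas, it reads the previous row, so it only gives the "next smaller value" while column A is sorted in ascending order.

```javascript
// values: the numbers of column A from row 2 downwards.
// last[i] is the value one row above (0 for the first row),
// diff[i] is the current value minus the one above (0 for the first row).
function lastAndDiff(values) {
  const last = values.map((_, i) => (i === 0 ? 0 : values[i - 1]));
  const diff = values.map((v, i) => (i === 0 ? 0 : v - values[i - 1]));
  return { last, diff };
}

console.log(lastAndDiff([100, 150, 210]));
```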
Refs
Arrays
OFFSET
UPDATE
Due to international settings you may need to have your functions written in this way:
=
{
"Last";
0;
ArrayFormula( OFFSET(A3:A; -1; 0) )
}
=
{
"Diff";
0;
ArrayFormula( A3:A100 - OFFSET(A3:A100; -1; 0) )
}

ARRAY formula to find last row to contain value in Google Sheets

I have a Google Sheet that is populated automatically via Zapier integration. For each new row added, I need to evaluate a given cell (Shipper Name) to find last instance of Shipper Name in prior rows, and if so, return Row# for the last entry.
Example Data Sheet
I am trying to create a formula that simply looks at name in new row and returns the number of the most recent row with that name.
Formula needs to run as an Array formula so that the data auto populates with each new row added to the Sheet.
I have tried to use this formula, but when refactored as Array formula, it doesn't populate new values for new rows, it just repeats the first value for all rows.
From Row J:
=sumproduct(max(row(A$1:A3)*(F4=F$1:F3)))
I need this formula refactored to be an Array formula that auto populates all the cells below it.
I have tried this version, but it doesn't work:
=ArrayFormula(IF(ISBLANK($A2:$A),"",sumproduct(max(row(A$1:A3)*($F4:$F=F$1:F3))))
A script (custom function maybe?) would be better.
Solution 1
Below is a formula you can place into the header (put it in J1 and remove everything below).
It works much faster than the second solution and has no N² size restriction. It also works with empty shippers (the & "♥" is there for those): as long as column A has some value, the row will not be ignored.
={
"Row of Last Entry";
ARRAYFORMULA(
IF(
A2:A = "",
"",
VLOOKUP(
ROW(F2:F)
+ VLOOKUP(
F2:F & "♥",
{
UNIQUE(F2:F & "♥"),
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1)
},
2,
0
),
SORT(
{
ROW(F2:F) + 1
+ VLOOKUP(
F2:F & "♥",
{
UNIQUE(F2:F & "♥"),
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1)
},
2,
0
),
ROW(F2:F);
{
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1),
SEQUENCE(ROWS(UNIQUE(F2:F)), 1, 0, 0)
}
},
1,
1
),
2,
1
)
)
)
}
Details on how it works
For every row we use VLOOKUP to search for a special number in a sorted virtual range to get the row number of the previous entry matching current.
A special number for a row is constructed like this: we get a sequential number for the current entry among unique entries and append to it current row number.
The right part (row number) of the resulting special numbers must be aligned between them. If the entry has sequential number 13 and the row number is 1234 and there are 100500 rows, then the number must be 13001234. 001234 is the aligned right part.
Alignment is done by multiplying the sequential number by 10 to the power of (INT(log10(total number of rows)) + 1), which gives us 13000000 in the example above. This approach avoids LEN and TEXT, because working with numbers is faster than working with strings.
Virtual range has almost the same special numbers in the first column and original row numbers in the second.
Almost the same special numbers: they just increased by 1, so VLOOKUP will stop at most one step before the number corresponding to the current string.
Also virtual range has some special rows (added at the bottom before sorting) which have all 0's as the right part of their special numbers (1st column) and 0 for the row number (2nd column). That is done so VLOOKUP will find it for the first occurrence of the entry.
Virtual range is sorted, so we could use is_sorted parameter of the outer VLOOKUP set to 1: that will result in the last match that is less or equal to the number being looked for.
& "♥" are appended to the entries, so that empty entries also will be found by VLOOKUP.
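The special-number construction can be checked with a few lines of JavaScript. This is a sketch for illustration only: specialNumber and rowOfLastEntry are hypothetical names, and the second function is a plain map-based reference for the expected result, not a port of the VLOOKUP technique.

```javascript
// Sequential number of the entry, shifted left so the row number fits in the right part.
function specialNumber(seq, row, totalRows) {
  const scale = Math.pow(10, Math.floor(Math.log10(totalRows)) + 1);
  return seq * scale + row;
}

// Reference result: for each name in F2:F, the sheet row of the previous
// occurrence of the same name, or 0 for a first occurrence.
function rowOfLastEntry(names) {
  const lastSeen = new Map();
  return names.map((name, i) => {
    const row = i + 2; // data starts at row 2
    const previous = lastSeen.has(name) ? lastSeen.get(name) : 0;
    lastSeen.set(name, row);
    return previous;
  });
}

console.log(specialNumber(13, 1234, 100500));          // the example from the text
console.log(rowOfLastEntry(["a", "b", "a", "", "a"])); // empty names are tracked too
```

With 100500 rows the scale is 10^6, which is why the row number 1234 appears as the aligned right part 001234.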
Solution 2 - slow and has restrictions
For a small enough number of rows, this formula works (put it in J1 and remove everything below):
={
"Row of Last Entry";
ARRAYFORMULA(
REGEXEXTRACT(
TRANSPOSE(QUERY(TRANSPOSE(
IF(
(FILTER(ROW(F2:F), F2:F <> "") > TRANSPOSE(FILTER(ROW(F2:F), F2:F <> "")))
* (FILTER(F2:F, F2:F <> "") = TRANSPOSE(FILTER(F2:F, F2:F <> ""))),
TRANSPOSE(FILTER(ROW(F2:F), F2:F <> "")),
""
)
), "", ROWS(FILTER(F2:F, F2:F <> "")))),
"(\d*)\s*$"
)
)
}
But there is a problem. The virtual range inside of the formula is of size N², where N is the number of rows. For current 1253 rows it works. But there is a limit after which it will throw an error of a range being too large.
That is the reason to use FILTER(...) and not just F2:F.
Here is a significantly simpler way to get at the information you're interested in. (I think.) I'm mostly guessing about what you want because your question wasn't really about what you want, but rather about how to get something that you think would help you get what you want. This is an example of an XY problem. I attempted to guess based on experience at what you're really after.
This editable sheet contains just 3 formulas. 2 on the raw data sheet and one in a new tab called "analysis."
The first formula on the Raw data tab extracts a properly formatted timestamp using a combination of MMULT and SPLIT functions and looks like this:
=ARRAYFORMULA({"Good Timestamp";IF(A2:A="",,MMULT(N(IFERROR(SPLIT(A2:A,"T"))),{1;1}))})
The second formula finds the previous timestamp for that shipper and subtracts it from the current timestamp, thereby giving you the time between timestamps. However, it only does this if the time is less than 200 minutes. If it is more than 200 minutes, it assumes that was a different shift for that shipper. It looks like this and uses a combination of LOOKUP() and SUBSTITUTE() to make sure it's pulling the correct timestamps. Obviously, you can find and change the 200 value to something more appropriate if it makes sense.
=ARRAYFORMULA({"Minutes/Order";IF(A2:A="",,IF(IFERROR((G2:G-1*SUBSTITUTE(LOOKUP(F2:F&G2:G-0.00001,SORT(F2:F&G2:G)),F2:F,""))*24*60)>200,,IFERROR((G2:G-1*SUBSTITUTE(LOOKUP(F2:F&G2:G-0.00001,SORT(F2:F&G2:G)),F2:F,""))*(24*60))))})
The third formula, on the tab called analysis uses query to show the average minutes per order and the number of orders per hour that each shipper is processing. It looks like this:
=QUERY({'Sample Data'!F:I},"Select Col1,AVG(Col3),COUNT(Col3)/(SUM(Col3)/60) where Col3 is not null group by Col1 label COUNT(Col3)/(SUM(Col3)/60)'Orders/ hour',AVG(Col3)'Minutes/ Order'")
Hopefully I've guessed correctly at your real goals. Always do your best to explain what they are rather than asking for only a small portion that you think will help you get to the answer. You can end up overcomplicating your process without realizing it.

Easiest way to query multiple sheets that are named by year, for all time?

I use a big nasty formula to query multiple sheets in the same Google Sheet document named according to year (2020, 2019, 2018, etc...) to sum up a total value. Because I need to query a filtered range in a complex way, I've figured out the best way to do this without running into other troubleshooting issues is to SUM multiple queries like so:
=SUM(
IFERROR(QUERY(FILTER({EOMONTH(INDIRECT("'"&TO_TEXT(YEAR(TODAY()))&"'!A1:A"&ROWS(INDIRECT("'"&TO_TEXT(YEAR(TODAY()))&"'!K2:K"))), 0),INDIRECT("'"&TO_TEXT(YEAR(TODAY()))&"'!K2:K")}, [filter conditions]), "select Col2 label Col2' ' ")),
IFERROR(QUERY(FILTER({EOMONTH(INDIRECT("'"&TO_TEXT(YEAR(TODAY()-365))&"'!A1:A"&ROWS(INDIRECT("'"&TO_TEXT(YEAR(TODAY()-365))&"'!K2:K"))),0),INDIRECT("'"&TO_TEXT(YEAR(TODAY()-365))&"'!K2:K")}, [filter conditions]),
"select Col2 label Col2' ' "))
)
For some context, you can see the much larger IF formula that this SUM is meant to be nested into, in the "Example Matrix" tab of the sheet. My focus for this question is on the INDIRECT references, which I have been using to dynamically reference the most current year's sheet and the previous year's sheet.
The problem is, if I want to keep doing this for every sheet as the years go on, I have to manually add a whole other query into my SUM using INDIRECT("'"&TO_TEXT(YEAR(TODAY()-730))&"'!K2:K") and INDIRECT("'"&TO_TEXT(YEAR(TODAY()-1095))&"'!K2:K") and so on, and that is just not an option considering how many of them I would need to add to multiple formulas in multiple sheets.
Is there any way I can adapt this for simplicity or perhaps make it into a script to accomplish summing queries for all sheets that are named by year for all time?
Here's a copy of my Example Sheet: https://docs.google.com/spreadsheets/d/1b29gyEgCDwor_KJ6ACP2rxdvauOzacDI9FL2K-jgg5E/edit#gid=1652431688
Thank you, any help is appreciated.
Usually an array formula would be a way to go in such case, but INDIRECT does not work inside array formulas.
There are a few approaches using scripting like this.
Here I will describe another approach: formula generation. We'll get a string with the formula and manually place it in a cell. It would be nice to put it in an inverted FORMULATEXT function, but unfortunately there is no such function at the moment, so we'll just paste it manually.
Step 1
Set the year limits (sheet names) in some cells. The first year of the period will be in K22, and the last will be in M22.
I set the period to from 2005 to 2040.
All the year numbers will be easily generated with SEQUENCE. If the sheets had arbitrary names, a manually maintained range of those names would be needed instead.
Step 2
Write a formula generator for what you need. We just generate a string here; that string contains the formula you would normally type manually. It is not hard, but there is a lot of repetition and it would be tedious to write by hand.
Here is the generator:
=ARRAYFORMULA(
"=SUM(
FILTER(
{
" & JOIN(
";" & CHAR(10) & " ",
"IFERROR('" & SEQUENCE(M22 - K22 + 1, 1, K22, 1) & "'!D2:D, 0)"
) & "
},
ISNUMBER(
{
" & JOIN(
";" & CHAR(10) & " ",
"IFERROR('" & SEQUENCE(M22 - K22 + 1, 1, K22, 1) & "'!D2:D, 0)"
) & "
}
),
REGEXMATCH(
{
" & JOIN(
";" & CHAR(10) & " ",
"IFERROR('" & SEQUENCE(M22 - K22 + 1, 1, K22, 1) & "'!A2:A, 0)"
) & "
},
""(?i)^TOTAL$""
),
REGEXMATCH(
{
" & JOIN(
";" & CHAR(10) & " ",
"IFERROR('" & SEQUENCE(M22 - K22 + 1, 1, K22, 1) & "'!C2:C, 0)"
) & "
},
""(?i)^"" & IF(F19 = ""Condition 1 Count"", ""Condition 1"", ""Condition 2"") & ""$""
)
)
)"
)
Compared to the original formula the resulting formula is heavily changed, simplified. For example there is no actual need for INDIRECT with this approach, EOMONTH wasn't used anywhere and so on.
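The same generation idea can be sketched as a script. The JavaScript below is an abbreviated illustration rather than a replacement for the generator: stackedRange and generateFormula are hypothetical names, the column letters follow the generator above, and the dropdown condition on F19 is left out to keep it short.

```javascript
// Build "{IFERROR('2005'!D2:D, 0); IFERROR('2006'!D2:D, 0); ...}" for a span of years.
function stackedRange(firstYear, lastYear, range) {
  const parts = [];
  for (let year = firstYear; year <= lastYear; year++) {
    parts.push("IFERROR('" + year + "'!" + range + ", 0)");
  }
  return "{" + parts.join("; ") + "}";
}

// Abbreviated formula text: sum the numeric D values of rows whose A value is TOTAL.
function generateFormula(firstYear, lastYear) {
  const values = stackedRange(firstYear, lastYear, "D2:D");
  const labels = stackedRange(firstYear, lastYear, "A2:A");
  return "=SUM(FILTER(" + values + ", ISNUMBER(" + values + "), " +
    "REGEXMATCH(" + labels + ", \"(?i)^TOTAL$\")))";
}

console.log(generateFormula(2005, 2007));
```

As with the generator, the output is only text: it still has to be pasted into a cell by hand.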
Step 3
Copy that result as text, remove enclosing quotes, replace double double quotes with single double quotes: "" -> ".
Now we've got our formula to paste somewhere as we could've typed manually. Here is a part of it:
=SUM(
FILTER(
{
IFERROR('2005'!F2:F, 0);
IFERROR('2006'!F2:F, 0);
...
IFERROR('2039'!F2:F, 0);
IFERROR('2040'!F2:F, 0)
},
ISNUMBER(
{
IFERROR('2005'!F2:F, 0);
IFERROR('2006'!F2:F, 0);
...
IFERROR('2039'!F2:F, 0);
IFERROR('2040'!F2:F, 0)
}
),
REGEXMATCH(
{
IFERROR('2005'!C2:C, 0);
IFERROR('2006'!C2:C, 0);
...
IFERROR('2039'!C2:C, 0);
IFERROR('2040'!C2:C, 0)
},
"(?i)^TOTAL$"
),
REGEXMATCH(
{
IFERROR('2005'!E2:E, 0);
IFERROR('2006'!E2:E, 0);
...
IFERROR('2039'!E2:E, 0);
IFERROR('2040'!E2:E, 0)
},
"(?i)^" & IF(F19 = "Condition 1 Count", "Condition 1", "Condition 2") & "$"
)
)
)
Step 4
Manually place this resulting formula into some cell.
It does what it is supposed to do: the dropdown reference works and non-existing sheets are tolerated.
There is no 2021 sheet, for example, but when it is created there will be no need to change the formula; data from that new sheet will be picked up.
You'll need to repeat the process in two cases: the formula needs a change in logic, or it is almost 2040 and you want to add another 50 years to the period. Even then, generating the formula is faster than manually editing the resulting monster.
A few notes on the original formula:
YEAR(TODAY() - 365) ➡ YEAR(TODAY()) - 1. With your approach the result is off near year boundaries because of leap years. For example, on 31 December 2020, TODAY() - 365 is 1 January 2020, which is still in 2020 rather than 2019. How often this happens depends on how many years you go back.
"select Col2 label Col2' ' " ➡ "select Col2 label Col2 ''". Do you really need a column with a header name ' ' (just a space)? I'm guessing it was meant to be blank.
No need for TO_TEXT.
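The leap-year issue from the first note can be reproduced with plain date arithmetic. This JavaScript sketch uses UTC dates to avoid daylight-saving effects; yearDaysAgo is a hypothetical helper that plays the role of YEAR(TODAY() - n).

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Year of the date that lies `days` days before `date`.
function yearDaysAgo(date, days) {
  return new Date(date.getTime() - days * DAY_MS).getUTCFullYear();
}

// 31 December 2020; 2020 is a leap year with 366 days.
const today = new Date(Date.UTC(2020, 11, 31));

const bySubtractingDays = yearDaysAgo(today, 365);     // lands on 1 January 2020
const bySubtractingYears = today.getUTCFullYear() - 1; // YEAR(TODAY()) - 1

console.log(bySubtractingDays, bySubtractingYears); // prints: 2020 2019
```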

Generate a list of all unique values of a multi-column range and give the values a rating according to how many times they appear in the last X cols

As the title says.
I have a range like this:
A B C
------ ------ ------
duck fish dog
rat duck cat
dog bear bear
What I want is to get a single-column list of all the unique values in the range, and assign them a rating (or tier) according to the number of times they have appeared in the last X columns (more columns are constantly added to the right side).
For example, let's say:
Tier 0: hasn't appeared in the last 2 columns.
Tier 1: has appeared once in the last 2 columns.
Tier 2: has appeared twice in the last 2 columns.
So the results should be:
Name Tier
------ ------
duck 1
rat 0
dog 1
fish 1
bear 2
cat 1
I was able to generate a list of unique values by using:
=ArrayFormula(UNIQUE(TRANSPOSE(SPLIT(CONCATENATE(B2:ZZ9&CHAR(9)),CHAR(9)))))
But it's the second part that I am not sure exactly how to achieve. Can this be done through Google Sheets commands or will I have to resort to scripting?
Sorry, my knowledge is not enough to build an array-formula but I can explain how I get it per cell and then expanded a range from it.
Part 1: count the number of nonempty columns (assuming that if a column has something in its second row, then it's filled):
COUNTA( FILTER( Sheet1!$B$2:$Z$2 , NOT( ISBLANK( Sheet1!$B$2:$Z$2 ) ) ) )
Part 2: build a range for the last two filled columns:
OFFSET(Sheet1!$A$2, 0, COUNTA( ... )-1, 99, 2)
Part 3: use COUNTIF to count how many values of "bear" we meet there (here we can pass a cell-reference instead) :
COUNTIF(OFFSET( ... ), "bear")
I built a sample spreadsheet that gets the results, here's the link (I know external links are bad, but there's no other choice to show the reproducible example).
Sheet1 contains the data, Sheet2 contains the counts.
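The rating rule from the question can also be written down directly, which is handy for checking the spreadsheet results. Below is a minimal JavaScript sketch, assuming a hypothetical tiers function that takes the range as an array of rows and the number of trailing columns to count in.

```javascript
// grid: array of rows; lastCols: how many of the right-most columns count.
// Returns [value, tier] pairs in order of first appearance, reading column by column.
function tiers(grid, lastCols) {
  const width = grid.length > 0 ? grid[0].length : 0;
  const start = Math.max(0, width - lastCols); // first column that counts
  const counts = new Map();
  for (let c = 0; c < width; c++) {
    for (let r = 0; r < grid.length; r++) {
      const value = grid[r][c];
      if (value === "" || value == null) continue; // skip blank cells
      if (!counts.has(value)) counts.set(value, 0); // register every unique value
      if (c >= start) counts.set(value, counts.get(value) + 1);
    }
  }
  return [...counts.entries()];
}

const animals = [
  ["duck", "fish", "dog"],
  ["rat", "duck", "cat"],
  ["dog", "bear", "bear"],
];

console.log(tiers(animals, 2));
```

For the sample range and the last 2 columns this gives duck 1, rat 0, dog 1, fish 1, bear 2, cat 1, matching the expected table in the question.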
I suggest using both script and the formula.
Normalize the data
Script is the easiest way to normalize data. It will convert your columns into single column data:
/**
 * Converts a range of columns into one normalized table.
 *
 * @param {Array} data The input range.
 * @return Row number, Column number, Value.
 * @customfunction
 */
function normalizeData(data) {
  var normalData = [];
  // headers
  normalData.push(['Row', 'Column', 'Data']);
  // write data: one output row per input cell
  for (var i = 0; i < data.length; i++) {
    var line = data[i];
    for (var j = 0; j < line.length; j++) {
      normalData.push([i + 1, j + 1, line[j]]);
    }
  }
  return normalData;
}
Test it:
Go to the script editor: Tools → Editor (or in Chrome browser: [Alt → T → E])
After pasting this code into the script editor, use it as simple formula: =normalizeData(data!A2:C4)
You will get the resulting table:
Row Column Data
1 1 duck
1 2 fish
1 3 dog
2 1 rat
2 2 duck
2 3 cat
3 1 dog
3 2 bear
3 3 bear
Then use it to make further calculations. There are a couple of ways to do it. One way is to use extra column with criteria, in column D paste this formula:
=ARRAYFORMULA((B2:B>1)*1)
It checks whether the column number is bigger than 1 and returns ones and zeros.
Then make simple query formula:
=QUERY({A:D},"select Col3, sum(Col4) where Col1 > 0 group by Col3")
and get the desired output.
