Get average of last 4 cells Google Sheets - google-sheets

I am trying to figure out how to find the average of the last 4 columns. Every yellow column (Fame) is the score, and Attacks is how many attacks were used to get that score.
So essentially, it would need to add up the last 4 Fame columns, and divide it by the sum of the last 4 Attacks columns.
Example
For the first row (row 3), the final output would do this calculation:
(3500+2700+3250+3300) / (16+12+16+16) = 212.5
Example 2
For the second row (row 4), the final output would do this calculation:
(2850+3500) / (16+16) ≈ 198.4
Any help is greatly appreciated! Thanks!
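The calculation described above can be cross-checked outside the sheet. This is a minimal Python sketch, assuming a row is stored as alternating Fame, Attacks values with None for empty cells. The function name fame_per_attack and the exact layout of the sample rows are hypothetical; only the numbers come from the two examples.

```python
def fame_per_attack(row, n=4):
    """Sum of the last n Fame values divided by the sum of the matching Attacks values."""
    # Pair up (Fame, Attacks) and drop pairs where either cell is empty.
    pairs = [(f, a) for f, a in zip(row[0::2], row[1::2])
             if f is not None and a is not None]
    last = pairs[-n:]  # if fewer than n pairs exist, use what is there
    total_attacks = sum(a for _, a in last)
    if total_attacks == 0:
        return None  # nothing to divide by
    return sum(f for f, _ in last) / total_attacks


row3 = [3500, 16, 2700, 12, 3250, 16, 3300, 16]
row4 = [None, None, 2850, 16, None, None, 3500, 16]  # assumed blank positions
print(fame_per_attack(row3))  # 212.5
print(fame_per_attack(row4))  # 198.4375
```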

from right to left try:
=SUM(SPLIT(REGEXEXTRACT(TEXTJOIN(" ", 1, FILTER(D3:3,
MOD(COLUMN(D3:3), 2)=0)), "\d+ \d+ \d+ \d+$"), " "))/
SUM(SPLIT(REGEXEXTRACT(TEXTJOIN(" ", 1, FILTER(D3:3,
MOD(COLUMN(D3:3)-1, 2)=0)), "\d+ \d+ \d+ \d+$"), " "))
update:
from left to right use:
=INDEX(QUERY(QUERY(IFERROR(SPLIT(TRIM(FLATTEN(QUERY(TRANSPOSE(D3:6),,9^9))), " ")*1),
"select (Col1+Col3+Col5+Col7)/(Col2+Col4+Col6+Col8)"), "offset 1", 0))

Related

Vlookup using a comma separated search key in Google Sheets

I am trying to use a vlookup (or some other function) to populate a list of numbers with usernames based on comma-separated data appearing in a single cell. I have attempted wild cards, and regmatch functions but can't seem to get the syntax correct (see the development tab of the linked spreadsheet for my efforts).
Essentially, I'd like to populate column B of the "columns" tab with usernames from column E of the All tab that corresponds to the numbers in column A derived from the comma-separated input in column D of the "all" Tab.
https://docs.google.com/spreadsheets/d/1Zebrp15784rtKb8obSr1ohWbjwm28owVTLSf1cBixzs/edit#gid=556664780
Thanks in advance for any support.
Solution without VLOOKUP:
=IFERROR(
INDEX(E:E,
MAX(
IFERROR(MATCH((ROW ()-1) & ",*",D:D,0), -1),
IFERROR(MATCH("* " & (ROW ()-1) & ",*",D:D,0), -1),
IFERROR(MATCH(", " & (ROW ()-1),D:D,0), -1)
),
1),
"")
Explain:
MATCH((ROW ()-1) & ",*",D:D,0) - the first number (9, 15, 21)
MATCH("* " & (ROW ()-1) & ",*",D:D,0) - a middle number (9, 15, 21)
MATCH(", " & (ROW ()-1),D:D,0) - the last number (9, 15, 21)
if ROW()-1 matches none of the data, every MATCH falls back to -1, and INDEX then raises an ERROR
(we are trying to get a name from row -1)
if we get an ERROR, we show an empty string
use:
=INDEX(IFNA(VLOOKUP(A2:A*1, SPLIT(FLATTEN(IF(IFERROR(
SPLIT(ALL!D2:D, ","))="",,SPLIT(ALL!D2:D, ",")&"​"&ALL!E2:E)), "​"), 2, )))
This is the formula that worked the best. It treated each of the comma separated values separately and allowed me to pull in the names from the appropriate column. Thanks for the support!
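The idea behind the SPLIT/FLATTEN/VLOOKUP formula can be sketched in Python: split every comma-separated cell, pair each number with the username on the same row, then look numbers up in that table. This is only an illustration. The function name build_lookup and the sample data are hypothetical, and it assumes every non-empty piece is an integer.

```python
def build_lookup(id_cells, names):
    """Map every number in a comma-separated cell to the name on the same row."""
    lookup = {}
    for cell, name in zip(id_cells, names):
        for part in str(cell).split(","):
            part = part.strip()
            if part:
                # Keep the first row that mentions a number, like an exact-match VLOOKUP.
                lookup.setdefault(int(part), name)
    return lookup


ids = ["9, 15, 21", "3", "", "4,7"]        # hypothetical column D
users = ["alice", "bob", "carol", "dave"]  # hypothetical column E
lookup = build_lookup(ids, users)
print([lookup.get(n, "") for n in [3, 4, 9, 10, 21]])
# ['bob', 'dave', 'alice', '', 'alice']
```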

Using ARRAYFORMULA to Calculate Running Total of Payables (Alternative to INDIRECT)

I use a Google Spreadsheet to keep track of the accounts payable per vendor. There is a sheet per vendor in the Spreadsheet. A simplified sheet looks like this:
When I receive a new invoice, an entry for the amount is made in the Credit column and when I release a payment, an entry for the amount is made in the Debit column. I keep track of the running total in the AC Payable column. I achieve this by using a formula in each cell of the AC Payable column (the example below is from cell E4):
=IF(
ISNUMBER(INDIRECT(ADDRESS(ROW()-1,COLUMN()))),
INDIRECT(ADDRESS(ROW()-1,COLUMN()))+C4-D4,
C4-D4
)
The logic is simple. The running total for row n is calculated by:
AC Payable(n - 1) + Credit(n) - Debit(n)
This setup works fine, except I have to drag the formula into newly added rows. Is there a way to achieve this by using ARRAYFORMULA?
PS: I have found a solution using:
= ARRAYFORMULA(
SUMIF(
ROW(C3:C),
"<="&ROW(C3:C),
C3:C)
-
SUMIF(
ROW(D3:D),
"<="&ROW(D3:D),
D3:D
)
)
I feel this solution is suboptimal (the original sheet dates back to 2018 and has a lot of rows): in every row, it calculates the totals of the Credit and Debit columns up to the current row and then subtracts the Debit total from the Credit total.
I am expecting a solution that would take advantage of the running total available in the previous row and not redo the whole calculation per row.
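For reference, the incremental behaviour described here (each row reuses the previous row's total) looks like this outside a spreadsheet. It is a minimal Python sketch with made-up amounts, and running_payable is a hypothetical name.

```python
from itertools import accumulate


def running_payable(credits, debits):
    """AC Payable(n) = AC Payable(n - 1) + Credit(n) - Debit(n), in one pass."""
    return list(accumulate(c - d for c, d in zip(credits, debits)))


credits = [1000, 0, 500, 0]  # invoices received (sample data)
debits = [0, 400, 0, 300]    # payments released (sample data)
print(running_payable(credits, debits))  # [1000, 600, 1100, 800]
```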
solution for up to 1581 rows:
=ARRAYFORMULA(QUERY(QUERY(MMULT(TRANSPOSE((SEQUENCE(COUNTA(A3:A)*2)<=
SEQUENCE(1, COUNTA(A3:A)*2))*FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1})),
SEQUENCE(COUNTA(A3:A)*2, 1, 1, 0)), "offset 1", ), "skipping 2", ))
skills:
it's fast
it's smart
gets slower the more rows you add
dies after 1581 rows
it's based on standard MMULT Running/Cumulative Total/Sum formula:
=ARRAYFORMULA(MMULT(TRANSPOSE((ROW(B1:B6)
<=TRANSPOSE(ROW(B1:B6)))*B1:B6), SIGN(B1:B6)))
but with a modification twist, because you got 2 columns to total
instead of ROW(B1:B6) we use a sequence of count of real data multiplied by two (because you got 2 columns):
SEQUENCE(COUNTA(A3:A)*2)
instead of TRANSPOSE(ROW(B1:B6)) we use again:
SEQUENCE(1, COUNTA(A3:A)*2)
combination of these pieces:
=ARRAYFORMULA(TRANSPOSE((SEQUENCE(COUNTA(A3:A)*2)<=SEQUENCE(1, COUNTA(A3:A)*2))))
will produce a matrix like:
and that's why it dies with lots of rows: you may think that with only 1500 rows in two columns the formula works on only 1500*2=3000 virtual cells, but in fact the MMULT formula processes (1500*2)*(1500*2)=9000000 virtual cells. still, it's worth noting that this MMULT fx is great when deployed on a small scale.
next, instead of *B1:B6 we use:
*FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1}))
i.e. with INDIRECT we take only the "valid" part of the C3:D range, which in your example sheet is just C3:D5, and we multiply the C column by 1 and the D column by -1 to simulate subtraction; then we FLATTEN both columns into one single column. the part +ROW(A3)-1 is just an offset because you start from row 3
and the last part of standard RT fx - SIGN(B1:B6) is replaced with one column full of ones:
SEQUENCE(COUNTA(A3:A)*2, 1, 1, 0)
then we offset the output with the inner QUERY by 1 because we are interested in the totals after subtraction, and finally we use skipping 2, which means that we keep only every second value - again, we are interested in the totals after subtraction of the D column.
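To make the pieces easier to follow, here is a rough Python re-creation of the same steps using plain lists: flatten credit and negated debit into one column, build the triangular matrix, multiply by a column of ones, then apply "offset 1" and "skipping 2". It is a sketch for understanding, not a translation of how Sheets evaluates the formula. The function name and sample amounts are hypothetical.

```python
def mmult_running_total(credits, debits):
    # FLATTEN(C:D * {1, -1}): credit, -debit, credit, -debit, ...
    flat = []
    for c, d in zip(credits, debits):
        flat += [c, -d]
    n = len(flat)
    # Transposed "<=" mask times the values: row i keeps flat[0..i], the rest is 0.
    matrix = [[flat[j] if j <= i else 0 for j in range(n)] for i in range(n)]
    ones = [1] * n  # SEQUENCE(n, 1, 1, 0)
    totals = [sum(m * o for m, o in zip(row, ones)) for row in matrix]  # MMULT
    # "offset 1" drops the first value, "skipping 2" keeps every second one.
    return totals[1::2], n * n


result, cells = mmult_running_total([1000, 0, 500, 0], [0, 400, 0, 300])
print(result)  # [1000, 600, 1100, 800]
print(cells)   # 64 virtual cells for only 4 rows
```

The second return value shows the quadratic growth: 4 rows in 2 columns already need an 8 x 8 matrix.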
solution for more than 1581 rows:
=ARRAYFORMULA(
SUMIF(SEQUENCE(COUNTA(A3:A)), "<="&SEQUENCE(COUNTA(A3:A)), INDIRECT("C3:C"&COUNTA(A3:A)))-
SUMIF(SEQUENCE(COUNTA(A3:A)), "<="&SEQUENCE(COUNTA(A3:A)), INDIRECT("D3:D"&COUNTA(A3:A))))
skills:
supports more rows
looks less smart
sadly the third argument of SUMIF always needs to be a range
gets slower with more rows
it will get sick if you feed it with 10000 rows
it may kill off your sheet with 11000+ rows
Here's a modification of Ben Collins' running total formula:
=ARRAYFORMULA(
IF(ISBLANK(A2:A),,
MMULT(TRANSPOSE((ROW(C2:C)<=TRANSPOSE(ROW(C2:C)))*C2:C),SIGN(C2:C))-
MMULT(TRANSPOSE((ROW(D2:D)<=TRANSPOSE(ROW(D2:D)))*D2:D),SIGN(D2:D))))
yet another alternative to MMULT:
=INDEX(QUERY(FLATTEN(QUERY(QUERY(TRANSPOSE(QUERY(QUERY(TRANSPOSE(
(SEQUENCE(COUNTA(A3:A)*2)<=SEQUENCE(1, COUNTA(A3:A)*2))*
FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1})),
"offset 1", ), "skipping 2", )), "select "&QUERY(
"sum(Col"&SEQUENCE(COUNTA(A3:A))&"),",, 9^9)&"' '"),
"offset 1", )), "where Col1 is not null", ))
but again, the LTE (<=) limitation of 10M cells won't let you use more than 1581 rows in your case, or 3162 rows in the standard cumulative sum case
(1581 rows * 2 columns) raised to the 2nd power < 10 million cells
(1581*2)^2 = 9998244
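The row limits quoted here follow directly from the 10 million cell figure. A small Python check of the arithmetic, taking that limit from the answer as given (max_rows is a hypothetical helper):

```python
import math

CELL_LIMIT = 10_000_000  # limit quoted in the answer


def max_rows(columns, limit=CELL_LIMIT):
    """Largest row count whose (rows * columns)^2 matrix stays under the limit."""
    side = math.isqrt(limit - 1)  # largest side with side * side < limit
    return side // columns


print(max_rows(2))      # 1581
print(max_rows(1))      # 3162
print((1581 * 2) ** 2)  # 9998244
```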

How to pull data from every third column using QUERY function

I am using this formula, but the same formula needs to be applied to every third column, i.e. starting from D3:D, then G3:G, J3:J, and so on. What is the best way to pull the data from every third column? (The data is on the second sheet, called Sitemaps.)
Please advise and help, many many thanks much appreciated!
=query({
'Sitemaps'!D3:D1000},
"Select * where Col1 is not null ")
Adding the sheet link in case it helps to understand the situation: "AllURLs" needs to pull all links from Sitemaps into one list
https://docs.google.com/spreadsheets/d/1AWGfA7cHmF3Q2kiX1xkQcoec6H5EPiHUXaiWENMzZkA/edit?usp=sharing
use:
=QUERY({INDIRECT("Sitemaps!"&
ADDRESS(3, (COLUMN($D1)-1)*COLUMN(A1)+1)&":"&
ADDRESS(1000, (COLUMN($D1)-1)*COLUMN(A1)+1))},
"where Col1 is not null")
and drag to the right
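The column arithmetic in that formula is (COLUMN($D1)-1)*COLUMN(A1)+1, which evaluates to 3*k+1 in the k-th column you drag it into. A short Python sketch of which columns that lands on (both helper names are hypothetical):

```python
def target_column(k):
    """Column index produced in the k-th dragged copy (k = 1 for column A)."""
    return (4 - 1) * k + 1  # COLUMN($D1) is 4, COLUMN(A1) is k


def column_letter(index):
    """Convert a 1-based column index to its letter name (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


print([column_letter(target_column(k)) for k in range(1, 5)])  # ['D', 'G', 'J', 'M']
```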
update:
use in B3:
=INDEX(IFERROR(REGEXEXTRACT(C3:C,"^(?:https?:\/\/)?(?:www\.)?([^\/]+)")))
use in C3:
=QUERY(FLATTEN(FILTER(IFERROR(Sitemaps!D3:1000), MOD(COLUMN(Sitemaps!D1:1)-1, 3)=0)),
"where Col1 is not null")
Try this:
=FILTER(FILTER(Sitemaps!D3:J,MOD(COLUMN(Sitemaps!D3:J)-4,3)=0),Sitemaps!D3:D<>"")
Just replace :J with whichever column is further to the right in your data set.
This one formula should produce all results, assuming that any rows that have data in Column D also have data in that row of every other included column, and that rows that are null in Column D are also null in that row of every other included column.
MOD is the modulus function. It returns whatever is left after dividing a number by another number. For instance, MOD(7,3) would return 1, because 3 goes into 7 twice (3 × 2 = 6) with 1 left over. The leftover portion is the modulus.
We can apply this to your column numbers, since the ones you want to retrieve are evenly spaced three apart. We just need to start at a baseline of zero. Since Column D has a column number of 4, we can "zero out" that baseline by subtracting 4 from every column number. Only those columns that then are evenly divisible by 3 (i.e., those that, after subtracting 4, have a modulus of 0) are returned.
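The same column test can be written in a few lines of Python, which may make the "subtract 4, then check divisibility by 3" step easier to see. The function name is hypothetical; the column numbers are the ones from the explanation (D = 4 through J = 10).

```python
def every_third_column(column_numbers, first=4, step=3):
    """Keep the columns whose distance from the first column is a multiple of step."""
    return [c for c in column_numbers if (c - first) % step == 0]


print(7 % 3)                              # 1
print(every_third_column(range(4, 11)))   # [4, 7, 10], i.e. D, G, J
```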

ARRAY formula to find last row to contain value in Google Sheets

I have a Google Sheet that is populated automatically via Zapier integration. For each new row added, I need to evaluate a given cell (Shipper Name) to find the last instance of that Shipper Name in prior rows and, if there is one, return the Row# of that last entry.
Example Data Sheet
I am trying to create a formula that simply looks at name in new row and returns the number of the most recent row with that name.
Formula needs to run as an Array formula so that the data auto populates with each new row added to the Sheet.
I have tried to use this formula, but when refactored as Array formula, it doesn't populate new values for new rows, it just repeats the first value for all rows.
From column J:
=sumproduct(max(row(A$1:A3)*(F4=F$1:F3)))
I need this formula refactored to be an Array formula that auto populates all the cells below it.
I have tried this version, but it doesn't work:
=ArrayFormula(IF(ISBLANK($A2:$A),"",sumproduct(max(row(A$1:A3)*($F4:$F=F$1:F3))))
A script (custom function maybe?) would be better.
Solution 1
Below is a formula you can place into the header (put it in J1, remove everything below).
It works much faster than the second solution and has no N² size restriction. Also it works with empty shippers (& "♥" is for those empty ones): as long as A:A column has some value it will not be ignored.
={
"Row of Last Entry";
ARRAYFORMULA(
IF(
A2:A = "",
"",
VLOOKUP(
ROW(F2:F)
+ VLOOKUP(
F2:F & "♥",
{
UNIQUE(F2:F & "♥"),
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1)
},
2,
0
),
SORT(
{
ROW(F2:F) + 1
+ VLOOKUP(
F2:F & "♥",
{
UNIQUE(F2:F & "♥"),
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1)
},
2,
0
),
ROW(F2:F);
{
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1),
SEQUENCE(ROWS(UNIQUE(F2:F)), 1, 0, 0)
}
},
1,
1
),
2,
1
)
)
)
}
Details on how it works
For every row we use VLOOKUP to search for a special number in a sorted virtual range to get the row number of the previous entry matching current.
A special number for a row is constructed like this: we get a sequential number for the current entry among unique entries and append to it current row number.
The right part (row number) of the resulting special numbers must be aligned between them. If the entry has sequential number 13 and the row number is 1234 and there are 100500 rows, then the number must be 13001234. 001234 is the aligned right part.
Alignment is done by multiplying a sequential number by 10 to the power of (INT(LOG10(total number of rows)) + 1), which gives us 13000000 (from the example above). This approach is used to avoid using LEN and TEXT - working with numbers is faster than working with strings.
The virtual range has almost the same special numbers in the first column and the original row numbers in the second.
Almost the same special numbers: they are just increased by 1, so VLOOKUP will stop at most one step before the number corresponding to the current string.
The virtual range also has some special rows (added at the bottom before sorting) which have all 0's as the right part of their special numbers (1st column) and 0 for the row number (2nd column). That is done so VLOOKUP will find one of them for the first occurrence of an entry.
The virtual range is sorted, so we can set the is_sorted parameter of the outer VLOOKUP to 1: that will result in the last match that is less than or equal to the number being looked for.
& "♥" are appended to the entries, so that empty entries also will be found by VLOOKUP.
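The steps above can be mimicked in Python with a sorted list and a binary search, which is roughly what VLOOKUP with is_sorted = 1 does. This is a sketch of the idea, not the formula itself. All names are hypothetical, 0 stands for "no previous entry", and the scale is derived from a digit count instead of LOG10 and is deliberately kept larger than the highest row number + 1.

```python
import bisect


def scale_for(total_rows):
    """Power of ten that leaves enough digits for any row number (plus 1)."""
    return 10 ** len(str(total_rows + 1))


def special_number(seq, row, scale):
    """Sequential number of the entry, followed by the aligned row number."""
    return seq * scale + row


def previous_rows(names, first_row=2):
    scale = scale_for(first_row + len(names) - 1)
    seq = {}
    for name in names:
        seq.setdefault(name, len(seq) + 1)  # position among unique entries
    # Sentinel rows: right part all zeros, row number 0.
    table = [(special_number(s, 0, scale), 0) for s in seq.values()]
    # Data rows: special number increased by 1, paired with the real row.
    for row, name in enumerate(names, start=first_row):
        table.append((special_number(seq[name], row, scale) + 1, row))
    table.sort()
    keys = [key for key, _ in table]
    result = []
    for row, name in enumerate(names, start=first_row):
        wanted = special_number(seq[name], row, scale)
        index = bisect.bisect_right(keys, wanted) - 1  # last key <= wanted
        result.append(table[index][1])
    return result


print(special_number(13, 1234, scale_for(100500)))         # 13001234
print(previous_rows(["A", "B", "A", "C", "B", "A"]))       # [0, 0, 2, 0, 3, 4]
```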
Solution 2 - slow and has restrictions
But for some small enough number of rows this formula works (put it in J1, remove everything below):
={
"Row of Last Entry";
ARRAYFORMULA(
REGEXEXTRACT(
TRANSPOSE(QUERY(TRANSPOSE(
IF(
(FILTER(ROW(F2:F), F2:F <> "") > TRANSPOSE(FILTER(ROW(F2:F), F2:F <> "")))
* (FILTER(F2:F, F2:F <> "") = TRANSPOSE(FILTER(F2:F, F2:F <> ""))),
TRANSPOSE(FILTER(ROW(F2:F), F2:F <> "")),
""
)
), "", ROWS(FILTER(F2:F, F2:F <> "")))),
"(\d*)\s*$"
)
)
}
But there is a problem. The virtual range inside of the formula is of size N², where N is the number of rows. For current 1253 rows it works. But there is a limit after which it will throw an error of a range being too large.
That is the reason to use FILTER(...) and not just F2:F.
Here is a significantly simpler way to get at the information you're interested in. (I think.) I'm mostly guessing about what you want because your question wasn't really about what you want, but rather about how to get something that you think would help you get what you want. This is an example of an XY problem. I attempted to guess based on experience at what you're really after.
This editable sheet contains just 3 formulas. 2 on the raw data sheet and one in a new tab called "analysis."
The first formula on the Raw data tab extracts a properly formatted timestamp using a combination of MMULT and SPLIT functions and looks like this:
=ARRAYFORMULA({"Good Timestamp";IF(A2:A="",,MMULT(N(IFERROR(SPLIT(A2:A,"T"))),{1;1}))})
The second formula finds the previous timestamp for that Shipper and subtracts it from the current timestamp, thereby giving you the time between timestamps. However, it only does this if the time is less than 200 minutes. If it is more than 200 minutes, it assumes that was a different shift for that shipper. It looks like this and uses a combination of LOOKUP() and SUBSTITUTE() to make sure it's pulling the correct timestamps. Obviously, you can find and change the 200 value to something more appropriate if it makes sense.
=ARRAYFORMULA({"Minutes/Order";IF(A2:A="",,IF(IFERROR((G2:G-1*SUBSTITUTE(LOOKUP(F2:F&G2:G-0.00001,SORT(F2:F&G2:G)),F2:F,""))*24*60)>200,,IFERROR((G2:G-1*SUBSTITUTE(LOOKUP(F2:F&G2:G-0.00001,SORT(F2:F&G2:G)),F2:F,""))*(24*60))))})
The third formula, on the tab called analysis uses query to show the average minutes per order and the number of orders per hour that each shipper is processing. It looks like this:
=QUERY({'Sample Data'!F:I},"Select Col1,AVG(Col3),COUNT(Col3)/(SUM(Col3)/60) where Col3 is not null group by Col1 label COUNT(Col3)/(SUM(Col3)/60)'Orders/ hour',AVG(Col3)'Minutes/ Order'")
Hopefully I've guessed correctly at your real goals. Always do your best to explain what they are rather than asking for only a small portion that you think will help you get to the answer. You can end up overcomplicating your process without realizing it.

Easiest way to query multiple sheets that are named by year, for all time?

I use a big nasty formula to query multiple sheets in the same Google Sheet document named according to year (2020, 2019, 2018, etc...) to sum up a total value. Because I need to query a filtered range in a complex way, I've figured out the best way to do this without running into other troubleshooting issues is to SUM multiple queries like so:
=SUM(
IFERROR(QUERY(FILTER({EOMONTH(INDIRECT("'"&TO_TEXT(YEAR(TODAY()))&"'!A1:A"&ROWS(INDIRECT("'"&TO_TEXT(YEAR(TODAY()))&"'!K2:K"))), 0),INDIRECT("'"&TO_TEXT(YEAR(TODAY()))&"'!K2:K")}, [filter conditions]), "select Col2 label Col2' ' ")),
IFERROR(QUERY(FILTER({EOMONTH(INDIRECT("'"&TO_TEXT(YEAR(TODAY()-365))&"'!A1:A"&ROWS(INDIRECT("'"&TO_TEXT(YEAR(TODAY()-365))&"'!K2:K"))),0),INDIRECT("'"&TO_TEXT(YEAR(TODAY()-365))&"'!K2:K")}, [filter conditions]),
"select Col2 label Col2' ' "))
)
For some context, you can see the much larger IF formula that this SUM is meant to be nested into, in the "Example Matrix" tab of the sheet. My focus for this question is on the INDIRECT references, which I have been using to dynamically reference the most current year's sheet and the previous year's sheet.
The problem is, if I want to keep doing this for every sheet as the years go on, I have to manually add a whole other query into my SUM using INDIRECT("'"&TO_TEXT(YEAR(TODAY()-730))&"'!K2:K") and INDIRECT("'"&TO_TEXT(YEAR(TODAY()-1095))&"'!K2:K") and so on, and that is just not an option considering how many of them I would need to add to multiple formulas in multiple sheets.
Is there any way I can adapt this for simplicity or perhaps make it into a script to accomplish summing queries for all sheets that are named by year for all time?
Here's a copy of my Example Sheet: https://docs.google.com/spreadsheets/d/1b29gyEgCDwor_KJ6ACP2rxdvauOzacDI9FL2K-jgg5E/edit#gid=1652431688
Thank you, any help is appreciated.
Usually an array formula would be a way to go in such case, but INDIRECT does not work inside array formulas.
There are a few approaches using scripting like this.
Here I will describe another approach: formula generation. We'll get a string with the formula and manually place it in a cell. It would be nice to put it in an inverted FORMULATEXT function, but unfortunately there is no such function at the moment, so we'll just paste it manually.
Step 1
Set the year limits (sheet names) in some cells. The first year of the period will be in K22, and the last will be in M22.
I set the period from 2005 to 2040.
All the year numbers will be easily generated with SEQUENCE. If there were arbitrary names, a range of those names set manually would've been needed.
Step 2
Write a formula generator for what you need. We just generate a string here; that string will contain the formula you would normally type manually. It is not hard, but there is a lot of repetition and it would be tedious to write it by hand.
Here is the generator:
=ARRAYFORMULA(
"=SUM(
FILTER(
{
" & JOIN(
";" & CHAR(10) & " ",
"IFERROR('" & SEQUENCE(M22 - K22 + 1, 1, K22, 1) & "'!D2:D, 0)"
) & "
},
ISNUMBER(
{
" & JOIN(
";" & CHAR(10) & " ",
"IFERROR('" & SEQUENCE(M22 - K22 + 1, 1, K22, 1) & "'!D2:D, 0)"
) & "
}
),
REGEXMATCH(
{
" & JOIN(
";" & CHAR(10) & " ",
"IFERROR('" & SEQUENCE(M22 - K22 + 1, 1, K22, 1) & "'!A2:A, 0)"
) & "
},
""(?i)^TOTAL$""
),
REGEXMATCH(
{
" & JOIN(
";" & CHAR(10) & " ",
"IFERROR('" & SEQUENCE(M22 - K22 + 1, 1, K22, 1) & "'!C2:C, 0)"
) & "
},
""(?i)^"" & IF(F19 = ""Condition 1 Count"", ""Condition 1"", ""Condition 2"") & ""$""
)
)
)"
)
Compared to the original formula, the resulting formula is heavily changed and simplified. For example, there is no actual need for INDIRECT with this approach, EOMONTH isn't used anywhere, and so on.
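The generation idea itself is not specific to Sheets: any tool that can build a string will do. As an illustration only, here is a minimal Python sketch that produces the stacked IFERROR block for a range of year-named sheets and wraps it in SUM/FILTER/ISNUMBER. It deliberately leaves out the two REGEXMATCH conditions, the function names are hypothetical, and column D is taken from the generator above.

```python
def stacked_range(first_year, last_year, column):
    """Build {IFERROR('2005'!D2:D, 0); IFERROR('2006'!D2:D, 0); ...} as text."""
    lines = [
        f"IFERROR('{year}'!{column}2:{column}, 0)"
        for year in range(first_year, last_year + 1)
    ]
    return "{\n" + ";\n".join(lines) + "\n}"


def build_formula(first_year, last_year, column="D"):
    """Simplified generator: sums numeric values across all year sheets."""
    values = stacked_range(first_year, last_year, column)
    return f"=SUM(FILTER({values}, ISNUMBER({values})))"


print(build_formula(2005, 2007))
```

The printed text is what you would paste into a cell, exactly as with the output of the in-sheet generator.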
Step 3
Copy that result as text, remove enclosing quotes, replace double double quotes with single double quotes: "" -> ".
Now we've got our formula to paste somewhere as we could've typed manually. Here is a part of it:
=SUM(
FILTER(
{
IFERROR('2005'!F2:F, 0);
IFERROR('2006'!F2:F, 0);
...
IFERROR('2039'!F2:F, 0);
IFERROR('2040'!F2:F, 0)
},
ISNUMBER(
{
IFERROR('2005'!F2:F, 0);
IFERROR('2006'!F2:F, 0);
...
IFERROR('2039'!F2:F, 0);
IFERROR('2040'!F2:F, 0)
}
),
REGEXMATCH(
{
IFERROR('2005'!C2:C, 0);
IFERROR('2006'!C2:C, 0);
...
IFERROR('2039'!C2:C, 0);
IFERROR('2040'!C2:C, 0)
},
"(?i)^TOTAL$"
),
REGEXMATCH(
{
IFERROR('2005'!E2:E, 0);
IFERROR('2006'!E2:E, 0);
...
IFERROR('2039'!E2:E, 0);
IFERROR('2040'!E2:E, 0)
},
"(?i)^" & IF(F19 = "Condition 1 Count", "Condition 1", "Condition 2") & "$"
)
)
)
Step 4
Manually place this resulting formula into some cell.
It does what it is supposed to do: the dropdown reference works, and non-existing sheets are tolerated.
There is no 2021 sheet, for example, but when it is created there will be no need to change the formula; data from that new sheet will be used.
You'll need to repeat the process in two cases: the formula needs some change in logic or it is almost 2040 and you want to add another 50 years to the period. Still that process of generation is faster than making changes manually to the resulting monster.
A few notes on the original formula:
YEAR(TODAY() - 365) ➡ YEAR(TODAY()) - 1. With your approach there will be an error because of leap years. How often depends on how many years back you go, but near a year boundary it will show up for sure.
"select Col2 label Col2' ' " ➡ "select Col2 label Col2 ''". Do you really need a column with a header name ' ' (just a space)? I'm guessing it was meant to be blank.
No need for TO_TEXT.
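The leap-year note is easy to verify with a date library. A small Python check, using the last day of a leap year as the test date (the function names are hypothetical):

```python
from datetime import date, timedelta


def previous_year_by_days(today):
    """Mirrors YEAR(TODAY() - 365)."""
    return (today - timedelta(days=365)).year


def previous_year(today):
    """Mirrors YEAR(TODAY()) - 1."""
    return today.year - 1


d = date(2020, 12, 31)  # 2020 has 366 days
print(previous_year_by_days(d))  # 2020, because 365 days back is still 2020-01-01
print(previous_year(d))          # 2019
```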
