Custom average function on a pivot table in Google Sheets - google-sheets

I have a spreadsheet that I'm starting to use for personal money analysis.
My main sheet is called "transactions" and has headers of Category, Description, Date and Amount (it's basically a check register).
I've created a pivot report off of that sheet that contains sum, min and max of Amount by Category.
I would like to add a custom average function to that pivot report, but I'm not sure how to go about it.
What I would like to see is the average amount of negative transactions between positive ones.
My positive transactions are my paychecks and the negative transactions are any spending I do.
An example might help in what I'm trying to do here...
Let's say for category "Food" I have the following transactions (in this order)...
-20
-25
-30
100
-30
-35
-40
I'd like my average to be calculated like this...
( ( (-20 + -25 + -30) / 3 ) + ( (-30 + -35 + -40) / 3 ) ) / 2
Anyone have the slightest idea on how I can enhance my pivot report to do this?
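To pin down the calculation being asked for, here is a minimal Python sketch (not a Sheets formula). It assumes that any non-negative amount closes the current run of negative amounts; the function name `average_of_run_averages` is made up for illustration.

```python
def average_of_run_averages(amounts):
    """Average the per-run averages of negative amounts.

    Assumption: any non-negative amount (e.g. a paycheck) closes the
    current run of negative amounts.
    """
    run_averages = []
    current_run = []
    for amount in amounts:
        if amount < 0:
            current_run.append(amount)
        elif current_run:
            # A positive amount ends the run: record its average.
            run_averages.append(sum(current_run) / len(current_run))
            current_run = []
    if current_run:
        # Trailing run that was not closed by a positive amount.
        run_averages.append(sum(current_run) / len(current_run))
    if not run_averages:
        return None
    return sum(run_averages) / len(run_averages)


food = [-20, -25, -30, 100, -30, -35, -40]
result = average_of_run_averages(food)
print(result)
```

For the "Food" example the two runs average to -25 and -35, so the overall result is their mean.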

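You can do it with something like:
=ARRAYFORMULA(AVERAGE(IF(Sheet1!D2:D8<0,Sheet1!D2:D8, 0)))
where column D holds the amounts from your example and Sheet1 holds the "transactions" data. Note that positive amounts are replaced by 0 rather than skipped, so they still count toward the average.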
If you want to fill it for the pivot table (having the category as another criterion) you can check the answer at: https://stackoverflow.com/a/9165254/179529
=SUM(ARRAYFORMULA(((Transactions!$A2:$A)=$A2) * ((Transactions!$D2:$D)>0) * (Transactions!$D2:$D) ))
/
SUM(ARRAYFORMULA(((Transactions!$A2:$A)=$A2) * ((Transactions!$D2:$D)>0) * (1) ))
where $A2 is the cell that holds the category name in the pivot table. (The $ lets you copy the formula to other columns if you want it per month or by some other second criterion.)
If you want to SUM the elements in column D only when they are greater than 0, you need ((Transactions!$D2:$D)>0) as the second factor and (Transactions!$D2:$D) as the third factor (otherwise you will count the cells instead of summing them).
Since AVERAGE will take blank cells as well, I've used SUM/COUNT instead. Note that the COUNT here is actually the same SUM with the third factor set to 1.
Also note that if you want to ignore a header row you need to define your columns as Transactions!$D2:$D, so that they start from the 2nd row.
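The SUM/COUNT trick can be read as multiplying 0/1 masks together. The following Python sketch mirrors that logic under the same conditions as the formula (category matches, amount greater than 0); the function name and the sample data are hypothetical.

```python
def conditional_average(categories, amounts, category):
    """Average of amounts > 0 in one category, computed as SUM / COUNT."""
    # Each mask entry is 1 only when both conditions hold, like the
    # product of the two boolean arrays in the Sheets formula.
    mask = [int(c == category) * int(a > 0) for c, a in zip(categories, amounts)]
    total = sum(m * a for m, a in zip(mask, amounts))  # third factor: the amount
    count = sum(m * 1 for m in mask)                   # third factor: 1
    return total / count if count else None


categories = ["Food", "Food", "Pay", "Food"]
amounts = [-20, 100, 500, 50]
print(conditional_average(categories, amounts, "Food"))
```

Dividing the masked sum by the masked count skips rows that fail either condition, which is why blanks and other categories do not drag the average down.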

Related

How to apply array formula taking data from another table?

We have two tables in Google Sheets.
First:
Date        Amount  Currency  Worth
01.01.2021  100     USD       373
02.01.2021  100     EUR       451
03.01.2021  100     PLN       100
04.01.2021  100     USD       373
05.01.2021  100     USD       372
Second:
Date        PLN  EUR   USD
01.01.2021  1    4,50  3,73
02.01.2021  1    4,51  3,75
03.01.2021  1    4,50  3,74
04.01.2021  1    4,48  3,73
05.01.2021  1    4,49  3,72
I am trying to find an array formula for the Worth column of the first table. The formula should take the matching value from the second table (based on two columns of the first table, Date and Currency) and multiply it by the value in column Amount. I really want to use an array formula. Is it possible?
Use VLOOKUP to find the correct date row and MATCH to find which column the value is in:
=ARRAYFORMULA(IFERROR(VLOOKUP(A2:A,I2:L,MATCH(C2:C,I1:L1,0))*B2:B))
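As a sanity check of the lookup-then-multiply idea, here is a small Python sketch using the data from the question (decimal commas written as points). The dictionary layout and the names `rates`, `rows` and `worth` are illustrative assumptions, not part of the sheet.

```python
# Second table: exchange rate per date and currency.
rates = {
    "01.01.2021": {"PLN": 1.0, "EUR": 4.50, "USD": 3.73},
    "02.01.2021": {"PLN": 1.0, "EUR": 4.51, "USD": 3.75},
    "03.01.2021": {"PLN": 1.0, "EUR": 4.50, "USD": 3.74},
    "04.01.2021": {"PLN": 1.0, "EUR": 4.48, "USD": 3.73},
    "05.01.2021": {"PLN": 1.0, "EUR": 4.49, "USD": 3.72},
}

# First table: (Date, Amount, Currency).
rows = [
    ("01.01.2021", 100, "USD"),
    ("02.01.2021", 100, "EUR"),
    ("03.01.2021", 100, "PLN"),
    ("04.01.2021", 100, "USD"),
    ("05.01.2021", 100, "USD"),
]


def worth(date, amount, currency):
    """Find the row by date and the column by currency, then multiply."""
    rate = rates.get(date, {}).get(currency)
    # A missing date or currency yields None, similar to IFERROR giving a blank.
    return None if rate is None else round(rate * amount, 2)


worths = [worth(*row) for row in rows]
print(worths)
```

The results match the Worth column shown in the first table.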
Option 01: Getting the result with one formula in one cell.
Paste this in B3, in the "Amount" column of the first table; take a look at this Sheet.
=ArrayFormula(IF(ArrayFormula(IF(A3:A="",,VLOOKUP(A3:A,G3:J,ArrayFormula(IF(D3:D="",,MATCH(D3:D,$H$2:$J$2,0)+1)),0)))="",,ArrayFormula(IF(A3:A="",,VLOOKUP(A3:A,G3:J,ArrayFormula(IF(D3:D="",,MATCH(D3:D,$H$2:$J$2,0)+1)),0)))*E3:E))
Explanation:
1 - MATCH(D3:D,$H$2:$J$2,0) finds the position of the "Currency" value in the header row of the second table. The next step uses it as the column index for VLOOKUP.
2 - VLOOKUP looks up the date from the first table (A3:A) in the second table's range G3:J, with the index set to MATCH(D3:D,$H$2:$J$2,0)+1 and [is_sorted] set to 0. The +1 is needed because G3:J starts at the date column, one column to the left of the header range $H$2:$J$2.
3 - At this point we have the exchange rate for the currency chosen in each row of the first table; we need to multiply it by Worth to get Amount.
ArrayFormula(IF(A3:A="",,VLOOKUP(A3:A,G3:J,ArrayFormula(IF(D3:D="",,MATCH(D3:D,$H$2:$J$2,0)+1)),0)))*E3:E is structured as Exchange rate * Amount. Note that E3:E is the Amount, and the IF(A3:A="",, part makes it calculate only where the A3:A range is not blank.
4 - An ArrayFormula with an IF needs to be wrapped around it, like this: ArrayFormula(IF(Range=Empty,Do nothing,Formula))
Range:
ArrayFormula(IF(A3:A="",,VLOOKUP(A3:A,G3:J,ArrayFormula(IF(D3:D="",,MATCH(D3:D,$H$2:$J$2,0)+1)),0)))
Empty:
""
Do nothing:
,,
Formula:
ArrayFormula(IF(A3:A="",,VLOOKUP(A3:A,G3:J,ArrayFormula(IF(D3:D="",,MATCH(D3:D,$H$2:$J$2,0)+1)),0)))*E3:E
Option 02: Getting the result with intermediate steps.
Same as option 01 but in separate columns; take a look at this Sheet.

Using ARRAYFORMULA to Calculate Running Total of Payables (Alternative to INDIRECT)

I use a Google Spreadsheet to keep track of the accounts payable per vendor. There is a sheet per vendor in the Spreadsheet. A simplified sheet looks like this:
When I receive a new invoice, an entry for the amount is made in the Credit column and when I release a payment, an entry for the amount is made in the Debit column. I keep track of the running total in the AC Payable column. I achieve this by using a formula in each cell of the AC Payable column (the example below is from cell E4):
=IF(
ISNUMBER(INDIRECT(ADDRESS(ROW()-1,COLUMN()))),
INDIRECT(ADDRESS(ROW()-1,COLUMN()))+C4-D4,
C4-D4
)
The logic is simple. The running total for row n is calculated by:
AC Payable(n - 1) + Credit(n) - Debit(n)
This setup works fine, except I have to drag the formula into newly added rows. Is there a way to achieve this by using ARRAYFORMULA?
PS: I have found a solution using:
= ARRAYFORMULA(
SUMIF(
ROW(C3:C),
"<="&ROW(C3:C),
C3:C)
-
SUMIF(
ROW(D3:D),
"<="&ROW(D3:D),
D3:D
)
)
I feel this is a suboptimal solution (the original sheet dates back to 2018 and has a lot of rows) since, for every row, it recalculates the totals of the Credit and Debit columns up to the current row and then subtracts the Debit total from the Credit total.
I am expecting a solution that would take advantage of the running total available in the previous row and not redo the whole calculation per row.
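The recurrence AC Payable(n) = AC Payable(n - 1) + Credit(n) - Debit(n) is a single pass over the rows. A minimal Python sketch of that idea, with hypothetical sample amounts, looks like this:

```python
from itertools import accumulate


def running_payable(credits, debits):
    """Running total that reuses the previous row's balance at each step."""
    # accumulate() carries the previous total forward, so each row costs
    # one addition instead of re-summing everything above it.
    return list(accumulate(c - d for c, d in zip(credits, debits)))


credits = [100, 0, 250]   # hypothetical invoices
debits = [0, 40, 0]       # hypothetical payments
print(running_payable(credits, debits))
```

This is the behaviour the question is after: linear work in the number of rows, in contrast to the SUMIF approach, which sums a prefix of both columns for every row.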
solution for up to 1581 rows:
=ARRAYFORMULA(QUERY(QUERY(MMULT(TRANSPOSE((SEQUENCE(COUNTA(A3:A)*2)<=
SEQUENCE(1, COUNTA(A3:A)*2))*FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1})),
SEQUENCE(COUNTA(A3:A)*2, 1, 1, 0)), "offset 1", ), "skipping 2", ))
skills:
it's fast
it's smart
gets slower the more rows you add
dies after 1581 rows
it's based on standard MMULT Running/Cumulative Total/Sum formula:
=ARRAYFORMULA(MMULT(TRANSPOSE((ROW(B1:B6)
<=TRANSPOSE(ROW(B1:B6)))*B1:B6), SIGN(B1:B6)))
but with a modification twist, because you got 2 columns to total
instead of ROW(B1:B6) we use a sequence of count of real data multiplied by two (because you got 2 columns):
SEQUENCE(COUNTA(A3:A)*2)
instead of TRANSPOSE(ROW(B1:B6)) we use again:
SEQUENCE(1, COUNTA(A3:A)*2)
combination of these pieces:
=ARRAYFORMULA(TRANSPOSE((SEQUENCE(COUNTA(A3:A)*2)<=SEQUENCE(1, COUNTA(A3:A)*2))))
will produce a matrix like:
and that's the reason why it dies with lots of rows: you may think that with only 1500 rows in two columns the formula works on just 1500*2=3000 virtual cells, but in fact the MMULT formula processes (1500*2)*(1500*2)=9000000 virtual cells. still, it's worth noting that this MMULT fx is great if deployed on a small scale.
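To make the shape of that matrix concrete, here is a small Python sketch that builds the same kind of 0/1 mask and uses it for a running total. It is an illustration of the idea only; the function names are made up.

```python
def cumulative_mask(n):
    """n x n mask where entry [i][j] is 1 when value j counts toward total i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def running_total(values):
    """Running total as 'mask times values', the idea behind the MMULT formula."""
    mask = cumulative_mask(len(values))
    return [sum(m * v for m, v in zip(row, values)) for row in mask]


def mask_cells(n):
    """Number of virtual cells in the mask for n flattened values."""
    return n * n


for row in cumulative_mask(3):
    print(row)
print(running_total([5, -2, 4]))
print(mask_cells(1500 * 2))
```

The mask is a lower triangle of ones, and its size grows with the square of the number of flattened values, which is where the 9000000 figure comes from.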
next, instead of *B1:B6 we use:
*FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1}))
e.g. with INDIRECT we take only the "valid" part of C3:D, which in your example sheet is just C3:D5; we multiply the C column by 1 and the D column by -1 to simulate subtraction, and then we FLATTEN both columns into one single column. the part +ROW(A3)-1 is just an offset because you start from row 3
and the last part of standard RT fx - SIGN(B1:B6) is replaced with one column full of ones:
SEQUENCE(COUNTA(A3:A)*2, 1, 1, 0)
then we offset the output by 1 with the inner QUERY because we are interested in the totals after subtraction, and finally we use skipping 2, which keeps only every second value - again, we are interested in the totals after subtraction of the D column.
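The flatten, accumulate, offset and skip steps can be traced in a few lines of Python. This is only a sketch of the data flow with hypothetical amounts, not a translation of the QUERY syntax.

```python
from itertools import accumulate


def payable_via_flatten(credits, debits):
    """Two-column running total via one flattened, signed column."""
    flat = []
    for c, d in zip(credits, debits):
        flat.extend([c, -d])          # like FLATTEN(C:D * {1, -1})
    totals = list(accumulate(flat))   # running total over the flat column
    # Drop the first value ("offset 1"), then keep every second one
    # ("skipping 2"): these are the totals after each debit is subtracted.
    return totals[1::2]


credits = [100, 0, 250]   # hypothetical Credit column
debits = [0, 40, 0]       # hypothetical Debit column
print(payable_via_flatten(credits, debits))
```

Each kept value is the balance after both the credit and the debit of a row have been applied.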
solution for more than 1581 rows:
=ARRAYFORMULA(
SUMIF(SEQUENCE(COUNTA(A3:A)), "<="&SEQUENCE(COUNTA(A3:A)), INDIRECT("C3:C"&COUNTA(A3:A)))-
SUMIF(SEQUENCE(COUNTA(A3:A)), "<="&SEQUENCE(COUNTA(A3:A)), INDIRECT("D3:D"&COUNTA(A3:A))))
skills:
supports more rows
looks less smart
sadly the third argument of SUMIF always needs to be a range
gets slower with more rows
it will get sick if you feed it with 10000 rows
it may kill off your sheet with 11000+ rows
Here's a modification of Ben Collins' running total formula
=ARRAYFORMULA(
IF(ISBLANK(A2:A),,
MMULT(TRANSPOSE((ROW(C2:C)<=TRANSPOSE(ROW(C2:C)))*C2:C),SIGN(C2:C))-
MMULT(TRANSPOSE((ROW(D2:D)<=TRANSPOSE(ROW(D2:D)))*D2:D),SIGN(D2:D))))
yet another alternative to MMULT:
=INDEX(QUERY(FLATTEN(QUERY(QUERY(TRANSPOSE(QUERY(QUERY(TRANSPOSE(
(SEQUENCE(COUNTA(A3:A)*2)<=SEQUENCE(1, COUNTA(A3:A)*2))*
FLATTEN(INDIRECT("C3:D"&COUNTA(A3:A)+ROW(A3)-1)*{1, -1})),
"offset 1", ), "skipping 2", )), "select "&QUERY(
"sum(Col"&SEQUENCE(COUNTA(A3:A))&"),",, 9^9)&"' '"),
"offset 1", )), "where Col1 is not null", ))
but again, the 10M-cell limitation of the LTE (<=) comparison won't let you use more than 1581 rows in your case, or 3162 rows in the standard cumulative sum case
(1581 rows * 2 columns) raised to the 2nd power < 10 million cells
(1581*2)^2 = 9998244
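The row limits quoted above follow from the square-matrix size. A short Python check, taking the 10 million cell figure from the answer as given (the constant and function names are made up):

```python
import math

CELL_LIMIT = 10_000_000  # limit stated in the answer above


def max_rows(columns, cell_limit=CELL_LIMIT):
    """Largest row count whose (rows * columns)^2 mask stays under the limit."""
    side = math.isqrt(cell_limit - 1)  # largest side with side * side < cell_limit
    return side // columns


print(max_rows(2))  # two columns (Credit and Debit)
print(max_rows(1))  # standard single-column cumulative sum
```

One more row in the two-column case gives (1582*2)^2 = 10010896 cells, which is over the limit.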

Converting formula to ARRAYFORMULA issues with SUM and INDEX

I have a scoring spreadsheet for a competition I'm working on. Competitors' place/rank are converted into points towards the overall series based on a chart of corresponding values. For ties, the sum of the points covered by all of the tied places are split evenly among the tied competitors (i.e. 2-way tie for 3rd; if 3rd usually gets 10 points and 4th usually gets 8, these competitors will receive (10+8)/2 (2 being the # of tied competitors), so they each receive 9 points).
I have a formula which does this exact calculation:
=IFERROR(IF(ISBLANK($A4:$A),,SUM(INDEX(SeriesPoints, E4:E):INDEX(SeriesPoints, MIN(E4:E + COUNTIF(E$4:E, E4:E) - 1, ROWS(SeriesPoints)))) / COUNTIF(E$4:E, E4:E), 0))
Where 'SeriesPoints' is a 2 column array; column 1 is the places/ranks (1:125) and column 2 is their corresponding point values. Column 'E' is the competitors' rank from the competition.
I have been unable to convert this formula to an ARRAYFORMULA() so I can avoid dragging it down the entire sheet (possibly up to 1000+ competitors over the series).
I'm mildly proficient with MMULT(), so I understood that would be a good approach for switching out SUM(), however, I haven't been able to create a matrix of the values to be summed.
INDEX():INDEX() doesn't work with ARRAYFORMULA() so I've tried switching to VLOOKUP(). With VLOOKUP() I've been able to produce the start and end values of the range of values for a tie, but not the full list. For example, if there is a 3-way tie for 4th, I can produce the respective points for 4th and 6th (the bounds of the tie).
In an attempt to list out even just the numbers from 4:6, I've hit a wall converting what would be a simple ROW() or SEQUENCE() formula to a matrix/array.
The following formula produces an array of the upper and lower bounds of ties or the single place should there be no tie, although the single place gets repeated.
=ARRAYFORMULA(IF(COUNTIF(E$4:E,E4:E)=1,E4:E,{E4:E,E4:E+COUNTIF(E$4:E,E4:E)-1}))
I'm assuming if I can get VLOOKUP({#:#}) to fill properly, I'll be where I need to be.
From here, I feel confident in my abilities to wrap a VLOOKUP() for the actual point values, an MMULT() to sum across these rows for the total, then a simple division to produce the correct point value.
Spreadsheet: https://docs.google.com/spreadsheets/d/1lpNewR3p4i7ZHmlFGLlG1tLuxgO-6onSeH8mWTeclBw/edit?usp=sharing
Currently, my workspace is off to the right. The original formula is in F4 and my test codes are working on column G instead of E.
So for sample placements of 1,1,3,3,3,6,7,8 and sample points values of 1000, 850,738,663,633,603,573,550 I expect the output to be 925 for the two 1st place tied competitors, 678 for the tied 3rd places, 603 for 6th, 573 for 7th, and 550 for 8th.
I'd appreciate any and all help!
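For reference, the tie rule described above can be written as a short Python sketch. It only restates the expected calculation with the sample data from the question; the function name is hypothetical.

```python
from collections import Counter


def tie_adjusted_points(places, points_by_place):
    """Split the points of all places covered by a tie evenly among the tied."""
    counts = Counter(places)
    result = []
    for place in places:
        tied = counts[place]
        # A tie at `place` covers places place .. place + tied - 1;
        # slicing stops at the end of the points table automatically.
        covered = points_by_place[place - 1 : place - 1 + tied]
        result.append(sum(covered) / tied)
    return result


places = [1, 1, 3, 3, 3, 6, 7, 8]
points = [1000, 850, 738, 663, 633, 603, 573, 550]
print(tie_adjusted_points(places, points))
```

With the sample data this gives 925 for the two tied 1st places, 678 for the three tied 3rd places, and the plain table values for 6th, 7th and 8th.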
=ARRAYFORMULA(IFERROR(IFERROR(VLOOKUP(G4:G, QUERY({INDIRECT("G4:G"&counta(A4:A)+3),
VLOOKUP(ROW(INDIRECT("A1:A"&COUNTA(A4:A))), SeriesPoints, 2, 0)},
"select Col1,sum(Col2) group by Col1 label sum(Col2)''", 0), 2, 0))/
IFERROR(VLOOKUP(G4:G, QUERY(G4:G,
"select G,count(G) where G is not NULL group by G label count(G)''", 0), 2, 0))))

How to get the sum of a column up to a certain value?

I have a google sheet that I am using to try and calculate leveling and experience points. Column A has the level and Column B has the exp needed to reach the next level. i.e. To get to Level 3 you need 600 exp.
A B
1 200
2 400
3 600
...
99 19800
In column I2 I have an integer for an amount of exp (e.g. 2000), in column J2 I want to figure out what level someone would be at if they started from 0.
Put this in column J and drag down as required. ROUNDDOWN(I2,-2) rounds I2 down to the nearest 100. INDEX/MATCH finds a match in column B and returns the value in column A of the matched row.
=index(A2:A100,match(ROUNDDOWN(I2,-2),B2:B100,0))
Using a helper column (for example Z): put =sum(B$1:B1) in cell Z1 and drag down. This will compute the cumulative exp required for each level. In J2, use the formula
=vlookup(I2, {Z:Z, A:A}, 2) + 1
which looks up I2 in the helper column Z, finds the nearest match that is less than or equal to the search key, and returns the level from column A. It adds 1 to find the level that would be reached, because your table is offset this way: the entry next to level N is about reaching level N+1.
You may want to put 0 0 on top of the table, to correctly handle the amounts under 200. Or treat them with a separate if condition.
Using algebra
In your specific scenario, the point amount required for level N can be computed as
200*(1+2+3+...+N-1) = 200*(N-1)*N/2 = 100*(N-1/2)^2 - 25
So, given x amount of points, we can find N directly with algebra:
N = floor(sqrt((x+25)/100)+1/2)
which means that the formula
=floor(sqrt((I2 + 25) / 100) + 1/2)
will have the desired effect in cell J2, without the need for an extra column and vlookup.
However, the second approach only works for these specific point values.
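Both ideas (a lookup against cumulative sums and the closed-form formula) can be cross-checked in Python. This sketch assumes the table from the question, where level N needs 200*N exp to reach level N+1, and that everyone starts at level 1 with 0 exp.

```python
import math
from bisect import bisect_right
from itertools import accumulate

EXP_TO_NEXT = [200 * level for level in range(1, 100)]  # column B: 200, 400, ...
CUMULATIVE = list(accumulate(EXP_TO_NEXT))               # helper column: 200, 600, 1200, ...


def level_by_lookup(exp):
    """Count the cumulative thresholds that are <= exp, then add 1."""
    return bisect_right(CUMULATIVE, exp) + 1


def level_by_formula(exp):
    """Closed form from the algebra: floor(sqrt((x + 25) / 100) + 1/2)."""
    return math.floor(math.sqrt((exp + 25) / 100) + 0.5)


print(level_by_lookup(2000), level_by_formula(2000))
```

The lookup version works for any table of point values, while the closed form depends on the 200*N pattern.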

Find the sum of each row in a spreadsheet

I'm new to Sheets and I don't know any terminology yet so I wasn't sure how to look this up.
If I have:
A1[=SUM(B1:1)]
How do I automatically copy that to A2 so that:
A2[=SUM(B2:2)]
And the same thing continues either indefinitely or until I declare a stopping point?
First of all, if you simply copy-paste the formula from A1 to A2 (or several cells below), it will automatically change as you want. This is how relative references work.
But it's also possible to get all the sums with one formula.
The following formula, entered in A1, will put the sums of the first seven rows in column A. To change the number of rows summed, replace 7 in B1:7 with another number.
=arrayformula(mmult(B1:7 + 0, transpose(B1:1 * 0 + 1)))
Explanation:
B1:7 + 0 coerces the entries to numbers (so that blank cells become 0).
transpose(B1:1 * 0 + 1) creates a column vector of 1s of suitable size.
matrix multiplication mmult by a column of 1s amounts to summing each row.
the wrapper arrayformula indicates that the operations are to be done on arrays.
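The same steps can be mimicked in plain Python to see why multiplying by a column of 1s sums each row. This is a sketch with made-up data, where None stands in for a blank cell.

```python
def row_sums(grid):
    """Sum each row by multiplying the grid with a column of ones."""
    if not grid:
        return []
    # Like "+ 0": treat blanks (None here) as 0 so every entry is a number.
    cleaned = [[cell if isinstance(cell, (int, float)) else 0 for cell in row]
               for row in grid]
    ones = [1] * len(cleaned[0])  # column vector of 1s, one per grid column
    # Matrix times vector: each output entry is the dot product of a row with the ones.
    return [sum(value * one for value, one in zip(row, ones)) for row in cleaned]


grid = [
    [1, 2, None],
    [4, None, 6],
]
print(row_sums(grid))
```

Each dot product with the ones vector adds up one row, which is the role MMULT plays in the formula.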
