Google Spreadsheet sum which always ends on the cell above - google-sheets

How to create a Google Spreadsheet sum() which always ends on the cell above, even when new cells are added? I have several such calculations to make on each single column so solutions like this won't help.
Example:
On column B, I have several dynamic ranges which has to be summed. B1..B9 should be summed on B10, and B11..B19 should be summed on B20. I have tens such calculations to make. Every now and then, I add rows below the last summed row , and I want them to be added to the sum. I add a new row (call it 9.1) before row 10, and a new raw (let's call it 19.1) before row 20. I want B10 to contain the sum of B1 through B9.1 and B20 to contain the sum of B11:B19.1.
On excel, I have the offset function which does it like charm. But how to do it with google spreadsheet? I tried to use formulas like this:
=SUM(B1:INDIRECT(address(row()-1,column(),false))) # Formula on B10
=SUM(B11:INDIRECT(address(row()-1,column(),false))) # Formula on B20
But on Google Spreadsheet, all it gives is a #name error.
I wasted hours trying to find a solution, maybe someone can calp?
Please advise
Amnon

You are probably looking for formula like:
=SUM(INDIRECT("B1:"&ADDRESS(ROW()-1,COLUMN(),4)))
Google Spreadsheet INDIRECT returns reference to a cell or area, while - from what I recall - Excel INDIRECT returns always reference to a cell.
Given Google's INDIRECT indeed has some hard time when you try to use it inside SUM as cell reference, what you want is to feed SUM with whole range to be summed up in e.g. a1 notation: "B1:BX".
You get the address you want in the same way as in EXCEL (note "4" here for row/column relative, by default Google INDIRECT returns absolute):
ADDRESS(ROW()-1,COLUMN(),4)
and than use it to prepare range string for SUM function by concatenating with starting cell.
"B1:"&
and wrap it up with INDIRECT, which will return area to be sum up.
REFERRING TO BELOW ANSWER from Druvision (I cant comment yet, I didn't want to multiply answers)
Instead of time consuming formulas corrections each time row is inserted/deleted to make all look like:
=SUM(INDIRECT(ADDRESS(ROW()-9,COLUMN(),4)&":"&ADDRESS(ROW()-1,COLUMN(),4)))
You can spare one column in separate sheet for holding variables (let's name it "def"), let's say Z, to define starting points e.g.
in Z1 write "B1"
in Z2 write "B11"
etc.
and than use it as variable in your sum by using INDEX:
SUM(INDIRECT(INDEX(def!Z:Z,1,1)&":"&ADDRESS(ROW()-1,COLUMN(),4))) - sums from B1 to calculated row, since in Z1 we have "B1" ( the 1,1 in INDEX(...,1,1) )
SUM(INDIRECT(INDEX(def!Z:Z,2,1)&":"&ADDRESS(ROW()-1,COLUMN(),4))) - sums from B11 to calculated row, since in Z2 we have "B11" ( the 2,1 in INDEX(...,2,1) )
please note:
Separate sheet named 'def' - you don't want row insert/delete influence that data, thus keep it on side. Useful for adding some validation lists, other stuff you need in your formulas.
"Z:Z" notation - whole column. You said you had a lot of such formulas ;)
Thus you preserve flexibility of defining starting cell for each of your formulas, which is not influenced by calculation sheet changes.
By the way, wouldn't it be easier to write custom function/script summing up all rows above cell? If you feel like javascripting, from what I recall, google spreadsheet has now nice script editor. You can make a function called e.g. sumRowsAboveMe() and than just use it in your sheet like =sumRowsAboveMe() in sheet cell.
Note: you might have to replace commas by semicolons

NOTE
After testing this answer, it will only work if the sum is in a different column due to a circular dependency error. Otherwise, the solution is valid.
It's a bit of algebra, but we can take advantage of Spreadsheets' lower right corner drag.
=SUM(X:X) - SUM(X2:X)
Where X is the column you are working with and X2 is your ending point. Drag the formula down and Sheets will increment the X2, thus changing the ending point.
*You mentioned that you had tens of such calculations to make. So in order to fit your exact need, we would subtract your last summation to get that "middle" range that we wanted.
e.g.
B1..B9 should be summed on B10, and B11..B19 should be summed on B20
Because of the circular dependency error mentioned earlier, I can't solve it exactly and put the sum on the same line, but this could work in other cases where the sum needs to be stored in a different column.
=SUM(B:B) - SUM(B9:B) //Formula on C10 (Sum of B1..B9)
=SUM(B:B) - SUM(B19:B) - B10 // Formula on C20 (Sum of B11..B19)

This is based on #PsychoFish, here is the solution:
=SUM(INDIRECT(SUBSTITUTE(ADDRESS(1,COLUMN(),4),"1","")&"3:"&ADDRESS(ROW()-1,COLUMN(),4)))
Simply replace the "3:" for the row to start sum.
#PsychoFish is correct but cannot be dragged and copied since the column is literal and hard coded, and #Druvision was in the right direction but was wrong... basically ended up with the same issue of having to re-enter the ranges and then sliding the formulas over and over.

You guys are making this harder than you have to. I just leave a couple of empty rows above by "sum" row (you can format them to be filled with color or something to keep them from being inadvertently used), then just add your new rows just above those special rows.

Agree with what user7255446 said that everyone is overcomplicating. Keep one row blank before your sum row. And then whenever you want to insert a new row, click on your blank row and use "Insert row ABOVE" instead of "insert row below". Your sum formula will automatically adjust.
Example: I want to sum from B1 to B19. I leave row 20 blank. In cell B21, put =SUM(B1:B20). Then if you ever need to insert a new row, click on row 20 and choose "Insert row above". The sum formula automatically changes to =SUM(B1:B21) for you. And of course your sum cell is now B22.

General syntax:
=SUM(INDIRECT(cell_reference_as_string1 &":"& cell_reference_as_string2)
with for example:
cell_reference_as_string1 = ADDRESS(ROW(),COLUMN(),4)
cell_reference_as_string2 = ADDRESS(ROW()-1,COLUMN(),4)

I like how #abernier describes the general solution. So far only alphabet-based A1 notation (A being first column, 1 being first row) are being used. It keeps confusing me, especially when thinking of number of columns left of another column. I like the number-based R1C1 notation much better. To use R1C1 notation for INDIRECT, you need to pass FALSE like so:
=SUM(INDIRECT("R1C"&COLUMN()&":R"&(ROW()-1)&"C"&COLUMN(), FALSE))
I hope you find that helpful, too.

OFFSET() can be used/abused for this purpose. Give it the absolute address of the top left of the range, 0 and 0 for the row/column offsets, and the height/width of the range. Let OFFSET() be the argument to SUM(), SUMIF(), etc.
ROW() and COLUMN() are handy when computing the desired height/width. Be sure to remember to subtract one to exclude the current row/column, or else you're liable to end up with a circular reference. If you have header rows/columns, subtract for them too.
For example, to sum everything from A2 down, excluding the current row, try:
=SUM(OFFSET($A$2,0,0,ROW()-2,1))
To sum everything to the left of the current cell, wherever it may be, try:
=SUM(OFFSET(INDIRECT("RC1",FALSE),0,0,1,COLUMN()-1))
Now let's flip things upside down, to show that this works in the other direction. Suppose you want to sum the B column, starting below the current row, until (and including) row #10. Try this:
=SUM(OFFSET($B$10,ROW()-9,0,10-ROW(),1))
You can avoid negative offsets, while still summing column B:
=SUM(OFFSET(INDIRECT("RC2",FALSE),1,0,10-ROW(),1))
Remove the "2" to instead sum the current column:
=SUM(OFFSET(INDIRECT("RC",FALSE),1,0,10-ROW(),1))
(Credit to Tom Sharpe, who commented above.) INDEX() can be used in a range expression. You might prefer this over OFFSET(), so I'm putting it here. The following sums everything from G1 down to the row above the current:
=SUM(G1:INDEX(G:G,ROW()-1))

Here's how I do it.
This formula does not require you to edit or enter anything about the particular column you would like to sum
=SUM(INDIRECT(CONCATENATE(address(1,column(),4),":",LEFT(address(1,column(),4),1))&ROW()-1))

The answer by #PsychoFish led me in the correct way.
The only issue that I had to rewrite the formula again from each column and each sum. So here is the improved formula, which sums the previous 9 cells on the same column, without hardcoding the column or row numbers:
=SUM(INDIRECT(ADDRESS(ROW()-9,COLUMN(),4)&":"&ADDRESS(ROW()-1,COLUMN(),4)))
The only issue is that I had to rewrite the formulas if someone adds or deletes a row. In this case I should change 9 to 10 or 8 corrspondingly.

Related

ArrayFormula to calculate previous rows

I have 3 sheets:
Sheet1 - list of transactions (Account, Credit, Debit, Date)
Sheet2 - list of transactions (Account, Credit, Debit, Date)
Sheet3 (I plan to lock it) - combined list of transactions, sorted by Date
Sheet3 looks like:
I need to add 1 more column to Sheet3 to count current balance for certain row to be like:
I'm able to do this with formula:
=SUM(FILTER($B$2:$B$8, ROW($A$2:$A$8) <= ROW($A2), A$2:A$8=$A2)) - SUM(FILTER($C$2:$C$8, ROW($A$2:$A$8) <= ROW($A2), A$2:A$8=$A2))
But this one I need continuously drag down.
Question: Is there way convert this formula to ArrayFormula, to avoid dragging
In G2 on sheet 3 I entered
=ArrayFormula(if(A2:A="",,mmult((A2:A=transpose(A2:A))*(row(A2:A)>= TRANSPOSE(row(A2:A)))*(transpose(B2:B)-transpose(C2:C)),row(A2:A)^0)))
See if that works for you?
In Sheet3 row 1, put your headers.
In Sheet3!A2, put
=sort({filter(Sheet1!A2:D,not(isblank(Sheet1!A2:A)));filter(Sheet2!A2:D,not(isblank(Sheet2!A2:A))),4,true)
In Sheet3!E2, put
=mmult(transpose(arrayformula(arrayformula(array_constrain(A2:A,counta(A2:A),1)=transpose(array_constrain(A2:A,counta(A2:A),1)))
*arrayformula(array_constrain(row(A2:A),counta(A2:A),1)<=transpose(array_constrain(row(A2:A),counta(A2:A),1))))),
arrayformula(array_constrain(B2:B,counta(A2:A),1)-array_constrain(C2:C,counta(A2:A),1))
To see why, let's temporarily remove the array_constrain(...,counta(...),1) wrappings, which is meant to auto detect the last data row:
=mmult(transpose(arrayformula(arrayformula(A2:A9=transpose(A2:A9))
*arrayformula(row(A2:A9)<=transpose(row(A2:A9))))),
arrayformula(B2:B9-C2:C9))
arrayformula(B2:B9-C2:C9) are the running sums of column B - column C (ie. credit - debit). It is a column vector with the length of your data size.
We want to, for each row, 1) filter this vector by comparison to column A (ie. account name) & 2) filter this vector by whether the running sums are below or above the row in question.
arrayformula(A2:A9=transpose(A2:A9)) does 1). arrayformula(row(A2:A9)<=transpose(row(A2:A9))) does 2).
We want elementwise product between the 2 matrices in order to compose the filter. Hence, arrayformula(...*...).
The columns of our filters are meant to be applied to the running sums. To use matrix multiplication, we can keep the column vector of running sums as the post-multiplier; and transpose the filter matrix as pre-multiplier so that the rows of the transposed matrix are multiplied (ie. applied) to the running sums. Hence, mmult(transpose(...),...).
Add back the array_constrain trick. And we are done.
Feel free to experiment with alternate placings of arrayformula. But remember to keep the () brackets wherever you omit arrayformula. Example:
=arrayformula(mmult(transpose(((array_constrain(A2:A,counta(A2:A),1)=transpose(array_constrain(A2:A,counta(A2:A),1)))
*(array_constrain(row(A2:A),counta(A2:A),1)<=transpose(array_constrain(row(A2:A),counta(A2:A),1))))),
(array_constrain(B2:B,counta(A2:A),1)-array_constrain(C2:C,counta(A2:A),1))))
Nonetheless, the 1 formula solution is computationally inefficient compared to individually spread formula per cell. That is because, without mutating the formula per row, we are forced to compute the filters as full n-by-n matrices where n is your data size.
Whereas, if in E2 we put =sum(filter(B$2:B2-C$2:C2,A$2:A2=A2)) and spread to the end by double right-clicking the square on bottom right when you select E2, the formula mutates per row, saving the row index comparison entirely, and also cutting the comparison to column A logarithmically.
Granted, we probably shouldn't rely on Google Sheet for a large database (e.g. >100k entries). But even for thousands of entries, if you square the amount of computations required, getting the results in browser becomes impractically slow well before one may expect.

Can change shape of range with ARRAYFORMULA() in Google Sheets?

My intention is to convert a single line of data into rows consist of a specific number of columns in Google Sheets.
For example, starting with the raw data:
A
B
C
D
E
F
1
id1
attr1-1
attr2-1
id2
attr2-1
attr2-2
And the expected result is:
(by dividing columns by three)
A
B
C
1
id1
attr1-1
attr1-2
2
id2
attr2-1
attr2-2
I already know that it's possible a bit manually, like:
=ARRAYFORMULA({A1:C1;D1:F1})
But I have to start over with it every time the target range is moved OR the subset size needs to be changed (in the case above it was three)!
So I guess there will be a much more graceful way (i.e. formula does not require manual update) to do the same thing and suspect ARRAYFORMULA() is the key.
Any help will be appreciated!
I added a new sheet ("Erik Help") where I reduced your manually entered parameters from two to one (leaving only # of columns to be entered in A2).
The formula that reshapes the grid:
=ArrayFormula(IFERROR(VLOOKUP(SEQUENCE(ROUNDUP(COUNTA(7:7)/A2),A2),{SEQUENCE(COUNTA(7:7),1),FLATTEN(FILTER(7:7,7:7<>""))},2,FALSE)))
SEQUENCE is used to shape the grid according to whatever is entered in A2. Rows would be the count of items in Row 7 divided by the number in A2 (rounded to the nearest whole number); and the columns would just be whatever number is entered in A2.
Example: If there are 11 items in Row 7 and you want 4 columns, ROUNDUP(11/4)=3 rows to the SEQUENCE and your requested 4 columns.
Then, each of those numbers in the grid is VLOOKUP'ed in a virtual array consisting of a vertical SEQUENCE of ordered numbers matching the number of data pieces in Row 7 (in Column 1) and a FLATTENed (vertical) version of the Row-7 data pieces themselves (in Column 2). Matches are filled into the original SEQUENCE grid, while non-matches are left blank by IFERROR
Though it's a bit messy, managed to get it done thanks to SEQUENCE() function anyway.
It constructs a grid by accepting number of rows/columns input, and that was exactly I was looking for.
For reference set up a sheet with the sample data here:
https://docs.google.com/spreadsheets/d/1p972tYlsPvC6nM39qLNjYRZZWGZYsUnGaA7kXyfJ8F4/edit#gid=0
Use a custom formula
Although you already solved this. If you are doing this kind of thing a lot, it could be beneficial to look into Apps Script and custom formulas.
In this case you could use something like:
function transposeSingleRow(range, size) {
// initialize new range
let newRange = []
// initialize counter to keep track
let count = 0;
// start while loop to go through row (range[0])
while (count < range[0].length){
// add a slice of the original range to the new range
newRange.push(
range[0].slice(count, count + size)
);
// increment counter
count += size;
}
return newRange;
}
Which works like this:
The nice thing about the formula here is that you select the range, and then you put in a number to represent its throw, or how many elements make up a complete row. So if instead of 3 attributes you had 4, instead of calling:
=transposeSingleRow(A7:L7, 3)
you could do:
=transposeSingleRow(A7:L7, 4)
Additionally, if you want this conversion to be permanent and not dependent on formula recalculation. Making it in run fully in Apps Script without using formulas would be neccesary.
Reference
Apps Script
Custom Functions

Google Sheet - Transform two columns into one column using arrayformula (more than 50,000 characters)

I'm using Google Sheets and looking for an arrayformula that able to take a list in two columns and arrange it alternately in one column. The sheet contains about 5,000 rows, each row has more than 35 characters.
I tried this:
=transpose(split(join(" ", query(transpose(B5:C),,50000)), " "))
But then I got this message:
Please take a look at the sheet here:
https://docs.google.com/spreadsheets/d/11T1Roj1trviOSiiTZS292-4l3oODid7KLi9oGz3Z66o/edit#gid=0
Assuming your 2 columns are A and B, this formula will "interlace" them:
=query(
sort(
{arrayformula({row(A1:A3)*2, A1:A3});
arrayformula({row(B1:B3)*2+1, B1:B3})}
),
"select Col2")
Explanation, unwrapping the formula from the inside:
Each value gets a unique number, based on its row number times 2 (+1 for the 2nd column)
Everything is sorted based on this number
Only the 2nd column is extracted for the result.
There is a function for this called FLATTEN().
This works perfectly as a general solution since it takes an array of any size and outputs the items in the order they appear left-right-top-down (See here).
It can be combined with TRANSPOSE() to accomplish the same thing but in the horizontal case, and if needed blank cells can be omitted with FILTER().
EDIT:
My sincere apologies, I did not read the question carefully enough. My response is incorrect.
This should work:
={B5:B12;C5:C12}
just be careful to NOT change it to
={B5:B;C5:C}
This will start an infinite loop where the spreadsheet will increase the amount of rows in the spreadsheet to allow this output column to expand, but in doing so increases the length of the 2 input columns, meaning the length of the output column increases even more, so the spreadsheet tries adding more rows, etc, etc. It'll make your sheet crash your browser or something each time you try to open it.
In Row5:
=ArrayFormula(offset(B$5,INT((row()-5)/2),iseven(row())))
Would need to be copied down however.

Add title row with ARRAYFORMULA in Google Sheets

I watched a tutorial where the author uses an IF statement along with the ARRAYFORMULA function to add a title row to a column of data. Links are given to the docs; however, for an example of how to use ARRAYFORMULA see this answer.
An example can be seen below:
I was able to populate the C column by placing the following formula in C1:
=ARRAYFORMULA(if(row(A:A) = 1, "spent", B:B - A:A))
I'm confused about the syntax. I understand that X:X references the entire X column but I don't understand how it's being used to check if we're at cell A1 in one context and then being used to apply mass formulas in another context.
How does the above line work?
Can you illustrate with some examples?
It sounds to me that the information you learned led you to expect that row(A:A)=1 translates to row A1?
It works a little different than that, the syntax as your using it now, is basically saying if any row in A:A has a value of 1, then write "spent" else subtract B-A
My suggestion:
use a literal array to make your header, then use the if(arrayformula) to only populate rows with values, for aesthetics:
Example:
={"Spent";arrayformula(if(isnumber(A2:A),B2:B-A2:A,))}
Explanation:
The {} allow you to build a literal array, and using a semicolon instead of a comma allows you to stack your cells vertically, following that we check if there is a value in column A, if so, subtract A from B, else leave it blank.
why not just put the column title directly on the first row cell, and start the array formula from the 2nd row, using the A2:A, B2:B syntax?
If something does not have to be in a formula, better put it directly on the cell - simpler for others to understand what's going on, and the formula will be simpler.
If you put the array formula in line 2, and someone sorts the data, then the arrayformula will move. If it is in the header line, this is less likely to happen.
You can also use the IFS function to achieve a similar effect to the array,
=arrayformula(ifs(row(A1:A)=1,"Spent",A1:A="",,True,B1:B-A1:A)
Here the first condition checks the row number, and if it is row ONE, then inserts a Column Header.
The Second condition - A1:A="",, - ensures that blank lines are ignored.
The Third condition True (ELSE) performs the calculation.
This method also allows for different calculations to performed on different rows depending on requirements.

Google Sheets ARRAYFORUMLA that can show 0 if the value is less than or equal to zero. Otherwise show the value of that set of data

I have set of columns that I am attempting to calculate the combined total of those columns then subtract from that total 8, if the difference after 8 is equal to or less than 0 I want to only show zero in the column I am doing this formula in. For those who might ask, I am using the ARRAYFORUMLA cause I want this calculation to repeat as I add new rows, keeping the totals I am seeking on the same row as the calculation is being done onto. So far I have most of this working. Well up to the IF ELSE THEN type of portion. My attempt is/was
=if(LTE((B3:B)+(C3:C)-8,0),ARRAYFORMULA((B3:B)+(C3:C)), 0)
As long as I'm understanding you correctly, I believe this will work:
=if(lte(sum(B2:C)-8,0),0,sum(B2:C))
It's at the top of the row and sums columns B and C. You can add more columns easily this way by either changing B/C to something else or passing more columns in.
Sum-8 is greater than 0
Sum-8 is less than 0
If what you want to do is to add each row to the total but only if its sum is 8 or greater, then your original formula was nearly OK but the two parts of the IF statement should have been reversed
=sum(ArrayFormula(if(LTE((B3:B)+(C3:C)-8,0),0,(B3:B)+(C3:C))))
You could also take most of the brackets out
=sum(ArrayFormula(if(LTE(B3:B+C3:C-8,0),0,B3:B+C3:C)))
and this would also work
=sum(ArrayFormula(if(LTE(B3:B+C3:C,8),0,B3:B+C3:C)))
Short answer
ARRAYFORMULA that can show 0 if the value is less than or equal to zero. Otherwise show the value of that set of data:
=ArrayFormula(IF(LTE(A1:A3,0),0,A1:A3))
Explanation
The basic IF syntax requires the use of scalar values, to use it with arrays, ARRAYFORMULA is required as an outer function:
ARRAYFORMULA(IF(array_logical_expression,array_if_true,array_if_false))
For the specific case described on the body of the question, the corresponding formula is:
=ARRAYFORMULA(IF(LTE(B3:B+C3:C-8,0),0,B3:B+C3:C))
Reference
Google Docs editors Help
IF
ARRAYFORMULA
Using arrays in Google Sheets

Resources