Add title row with ARRAYFORMULA in Google Sheets - google-sheets

I watched a tutorial where the author uses an IF statement along with the ARRAYFORMULA function to add a title row to a column of data. Links are given to the docs; however, for an example of how to use ARRAYFORMULA see this answer.
An example can be seen below:
I was able to populate the C column by placing the following formula in C1:
=ARRAYFORMULA(if(row(A:A) = 1, "spent", B:B - A:A))
I'm confused about the syntax. I understand that X:X references the entire X column but I don't understand how it's being used to check if we're at cell A1 in one context and then being used to apply mass formulas in another context.
How does the above line work?
Can you illustrate with some examples?

It sounds to me that the information you learned led you to expect that row(A:A)=1 translates to row A1?
It works a little differently than that. ROW(A:A) returns the row number of every cell in A:A, so the syntax as you're using it now is basically saying: for each row, if the row number is 1, write "spent"; otherwise subtract A from B.
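To make the row-by-row evaluation concrete, here is a rough Python model of the formula from the question. It is only a sketch: the helper name spent_column and the sample numbers are made up for illustration, and ROW(A:A) is modeled as a 1-based row counter.

```python
def spent_column(col_a, col_b):
    """Model =ARRAYFORMULA(IF(ROW(A:A)=1, "spent", B:B-A:A)) one row at a time."""
    result = []
    for row_number, (a, b) in enumerate(zip(col_a, col_b), start=1):
        # ROW(A:A) yields the row number of each cell, not the cell's value.
        if row_number == 1:
            result.append("spent")
        else:
            result.append(b - a)
    return result


# Row 1 holds the titles of columns A and B; the other rows are hypothetical data.
col_a = ["start", 10, 20, 5]
col_b = ["end", 15, 50, 5]
print(spent_column(col_a, col_b))
```

The same range A:A is used twice, but ROW() turns it into row numbers while the subtraction uses its values, which is why one formula can both place the title and do the math.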
My suggestion:
Use a literal array to make your header, then use arrayformula(if(...)) to populate only the rows that have values, for aesthetics:
Example:
={"Spent";arrayformula(if(isnumber(A2:A),B2:B-A2:A,))}
Explanation:
The {} let you build a literal array, and using a semicolon instead of a comma stacks the cells vertically. After the header, we check whether column A holds a number; if so, subtract A from B, otherwise leave the cell blank.
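A small Python sketch of what the literal-array version produces, assuming made-up sample data and a hypothetical helper name spent_with_header. Blank cells are represented as empty strings.

```python
def spent_with_header(col_a, col_b):
    """Model ={"Spent"; ARRAYFORMULA(IF(ISNUMBER(A2:A), B2:B-A2:A, ))}."""
    body = []
    # A2:A and B2:B start at row 2, so row 1 is skipped.
    for a, b in zip(col_a[1:], col_b[1:]):
        is_number = isinstance(a, (int, float)) and not isinstance(a, bool)
        if is_number:
            # A blank cell counts as 0 in arithmetic, hence (b or 0).
            body.append((b or 0) - a)
        else:
            body.append("")  # leave the row blank
    # The semicolon in {...; ...} stacks the header on top of the results.
    return ["Spent"] + body


print(spent_with_header(["start", 10, "", 5], ["end", 15, "", 9]))
```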

why not just put the column title directly on the first row cell, and start the array formula from the 2nd row, using the A2:A, B2:B syntax?
If something does not have to be in a formula, better put it directly on the cell - simpler for others to understand what's going on, and the formula will be simpler.

If you put the array formula in line 2, and someone sorts the data, then the arrayformula will move. If it is in the header line, this is less likely to happen.
You can also use the IFS function to achieve a similar effect to the array,
=arrayformula(ifs(row(A1:A)=1,"Spent",A1:A="",,True,B1:B-A1:A))
Here the first condition checks the row number, and if it is row ONE, then inserts a Column Header.
The Second condition - A1:A="",, - ensures that blank lines are ignored.
The Third condition True (ELSE) performs the calculation.
This method also allows for different calculations to be performed on different rows depending on requirements.

Related

Google Sheet: How to use arrayformula to copy data from one sheet to another?

In a Google spreadsheet, I want to sync A2:G500 in sheet1 to sheet2, I've been aware of the following two methods:
use IMPORTRANGE: put the following formula in A1 of sheet2:
=IMPORTRANGE("spreadsheet_url","sheet1!A2:G500")
It works, but it feels like I am overdoing it; besides, there seems to be a performance issue.
In A2 of sheet2, put the formula =sheet1!A2, then drag the formula to G500 in sheet2. This one is intuitive and simple to do. However, it doesn't work if sheet1 is a form response sheet: when a new response is added, sheet2 won't automatically get it.
For learning purposes, I'm wondering if there is a way to do this using Arrayformula. Besides, I want to find a way to make this sync more care-free, meaning that if there is an indefinite number of rows of data I won't have to go back to this sheet every now and then and change the formula or manually drag the formula. Is this possible? And is Arrayformula the right way to go for this purpose?
I would recommend an { array expression }, like this:
={ Sheet1!A2:G }
This is more or less the same as
=arrayformula(Sheet1!A2:G)
...but I prefer the {} syntax because it allows you to specify non-adjacent columns. For example, you can skip columns D and F like this:
={ Sheet1!A2:C, Sheet1!E2:E, Sheet1!G2:G }
In spreadsheets where the locale uses the comma as decimal mark instead of the period, use a backslash \ instead of comma as horizontal separator.
To skip rows, use the semicolon ; as vertical separator. For example, you can skip rows 2:9 like this:
={ Sheet1!A1:G1; Sheet1!A10:G }
The open-ended range reference A10:G means "columns A to G starting in row 10 and extending all the way to the bottom of the sheet."
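As a rough illustration of the two separators, the following Python sketch models the comma as side-by-side concatenation and the semicolon as stacking. Everything here is an assumption made for the example: the 12-row grid, the cell labels, and the helper names open_range, hstack and vstack.

```python
COLUMNS = "ABCDEFG"


def make_sheet(rows):
    """Build a grid whose cells are labelled with their own address."""
    return [[f"{c}{r}" for c in COLUMNS] for r in range(1, rows + 1)]


def open_range(grid, first_col, last_col, start_row):
    """Slice like Sheet1!<first_col><start_row>:<last_col> (open-ended downward)."""
    i = COLUMNS.index(first_col)
    j = COLUMNS.index(last_col) + 1
    return [row[i:j] for row in grid[start_row - 1:]]


def hstack(*blocks):
    """Comma separator: put blocks next to each other, row by row."""
    return [sum(rows, []) for rows in zip(*blocks)]


def vstack(*blocks):
    """Semicolon separator: put blocks on top of each other."""
    return [row for block in blocks for row in block]


sheet1 = make_sheet(12)

# ={ Sheet1!A2:C, Sheet1!E2:E, Sheet1!G2:G }
skip_d_and_f = hstack(
    open_range(sheet1, "A", "C", 2),
    open_range(sheet1, "E", "E", 2),
    open_range(sheet1, "G", "G", 2),
)

# ={ Sheet1!A1:G1; Sheet1!A10:G }
header_row = open_range(sheet1, "A", "G", 1)[:1]
skip_rows_2_to_9 = vstack(header_row, open_range(sheet1, "A", "G", 10))

print(skip_d_and_f[0])
print([row[0] for row in skip_rows_2_to_9])
```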
You can also leave out the row number to get an open-ended range reference like A:G which means "columns A to G from the very top to the bottom of the sheet." This reference will behave the same as A1:G in almost all situations. I have made it a habit to always include the start row in the reference because that way the formula will automatically adjust in the event a row is inserted above row 1.
When the source sheet is a form responses sheet, another tactic is needed. Form responses are always inserted in newly created rows that cannot be referenced directly in advance.
To prevent the range reference from adjusting when you dynamically copy form responses to another sheet, start the copy from row 1, like this:
={ 'Form Responses 1'!A1:A }
Alternatively, use an array formula, like this:
=arrayformula( 
  if( 
    row('Form Responses 1'!A1:A) = 1,
    "Enter column header here",
    'Form Responses 1'!A1:A
  ) 
)
An even better way to deal with form responses is to aggregate the data directly to whatever reports you need with the query() function.
It's either:
ArrayFormula(Sheet1!A2:G500) for the 499 lines, or
ArrayFormula(Sheet1!A2:G) if you want to sync everything from line 2 down
=ARRAYFORMULA(Sheet1!A:G)
Does this not work?
try in row 1:
={""; INDEX(sheet1!A2:A)}
this will solve your form issues when you use it in 1st row. if you already have something in your row 1 you can add it into double quotes like this:
={"header"; INDEX(sheet1!A2:A)}
in case of multiple columns its like this:
={"","","","","","",""; INDEX(sheet1!A2:G)}

JOIN header row values across a row based on non-blank values in cells

So I have two rows:
ID | TagDog | TagCat | TagChair | TagArm | Grouped Tags (need help with this)
1  | TRUE   |        |          | TRUE   | TagDog,TagArm
Row 1 consists mainly of Tags, while rows 2+ are entries. This data ties ENTRIES to TAGS.
What I'm needing to do is concatenate/join the tag names per entry. For example, look at the last column above.
I suspect we could write a formula that would:
Create an array of non-empty cells in the row. (IE: [2,4])
Return it with the header row A (IE: [A2,A4])
Then join them together by a comma
But I am unsure how to write the formula, or if this is even the best approach.
Here's the formula:
={
"Grouped Tags (need help with this)";
ARRAYFORMULA(
REGEXREPLACE(TRIM(
TRANSPOSE(QUERY(TRANSPOSE(
IF(NOT(B2:E11),, B1:E1)
),, COLUMNS(B1:E1)))
), "\s+", ",")
)
}
The trick used is called vertical query smash. That's the part:
TRANSPOSE(QUERY(TRANSPOSE(...),, number_of_columns))
You can find a brief description of this one and his friends here.
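For readers who want to see what the formula computes per row, here is a loose Python model of it. It does not reproduce how QUERY works internally; it only mirrors the visible steps (keep the header where the cell is TRUE, join with spaces, trim, turn whitespace into commas). The function name grouped_tags and the sample rows are assumptions.

```python
import re


def grouped_tags(headers, rows):
    """Model the per-row result of the query-smash formula."""
    out = []
    for row in rows:
        # IF(NOT(cell),, header): keep the header only where the cell is TRUE.
        kept = [h if cell is True else "" for h, cell in zip(headers, row)]
        # The smash joins the cells of a row with spaces; TRIM removes the extras.
        smashed = " ".join(" ".join(kept).split())
        # REGEXREPLACE(..., "\s+", ","): remaining whitespace becomes commas.
        out.append(re.sub(r"\s+", ",", smashed))
    return out


headers = ["TagDog", "TagCat", "TagChair", "TagArm"]
rows = [
    [True, "", "", True],
    ["", "", "", ""],
]
print(grouped_tags(headers, rows))
```

One consequence visible in the model: a tag name that itself contains a space would be split by the final replacement, so this approach suits single-word tags.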
I wasn't able to create a single formula that would do this for me, so instead, I utilized a formula inside of Sheets' Find/Replace tool, and it worked like a charm!
I did a find/replace, replacing all instances of TRUE with the following formula:
=INDIRECT(SUBSTITUTE(LEFT(ADDRESS(ROW(),COLUMN()),3),"$","")&"$1")
What this formula does is find the cell's column letter, then fetch the value in row 1 of that column using INDIRECT.
Breaking down the formula:
ADDRESS(ROW(),COLUMN()) returns the direct reference: $H$1
LEFT("$H$1",3) returns $H$
SUBSTITUTE("$H$","$","") removes the dollar signs ($) and returns H
INDIRECT("H"&"$1") references the exact cell H$1
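The string steps in the breakdown can be traced with a short Python sketch. The helpers column_letters, address and header_reference are hypothetical names for this illustration; address only mimics the default absolute output of ADDRESS.

```python
def column_letters(col):
    """Convert a 1-based column number to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def address(row, col):
    """Mimic ADDRESS(row, col) with its default absolute style, e.g. $H$5."""
    return f"${column_letters(col)}${row}"


def header_reference(row, col):
    """Trace LEFT(...,3), SUBSTITUTE(...,"$","") and the &"$1" suffix."""
    absolute = address(row, col)       # e.g. "$H$5"
    prefix = absolute[:3]              # LEFT(..., 3) -> "$H$"
    letters = prefix.replace("$", "")  # SUBSTITUTE -> "H"
    return letters + "$1"              # string handed to INDIRECT


print(header_reference(5, 8))    # column H
print(header_reference(2, 27))   # column AA
print(header_reference(2, 703))  # column AAA
```

In this model the three-character LEFT keeps at most two column letters, so a three-letter column such as AAA comes back as AA$1. For sheets that narrow this does not matter.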
Now, I can replace all instances of TRUE with that formula and the magic happens!
Here is a video explanation: https://youtu.be/SXXlv4JHDA8
Hopefully, that helps someone -- however, I would still be interested in seeing what the formula is for this solution.

Google Sheets Fill Down with Formula

I have a very hard problem to solve, which must be completed with a formula (not a script).
Basically, the Raw input column needs to be dynamically filled down until it hits the next piece of text.
Here's an example file which includes the expected output.
https://docs.google.com/spreadsheets/d/1ibqCvY39NlhCRWsbBdxKITUUpVpp9wXdEz44T-pHDY0/
Is it even possible to achieve?
Thanks
This will work based on your ask, assuming that A2 is never blank, place this in the first row of data (not header):
=ArrayFormula(IF(A2:A<>"", A2:A, B1:B))
It checks to see if there is a value in column A, if there is, it fills that column, if not, it copies the cell above.
Delete everything in Column B (including the header) and place the following formula in B1:
=ArrayFormula({"Header";VLOOKUP(FILTER(ROW(A2:A),ROW(A2:A)<=MAX(FILTER(ROW(A2:A),A2:A<>""))),FILTER({ROW(A2:A),A2:A},A2:A<>""),2,TRUE)})
Here is a basic explanation of how this formula works:
A virtual array is created between the curly brackets { }; this virtual array contains a header and all results. You can change the header name to whatever you like.
VLOOKUP looks up every row number that is less than or equal to the highest row number that contains text in A2:A. Each of these qualifying rows is looked up in a second array that contains only the row numbers and Column-A data from non-blank rows, returning the data itself. Since rows are in perfect ascending order and the last parameter of VLOOKUP is set to TRUE, all blank rows in the first array will "fall backward" to find the most recent row that did have something in Column A.
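The "fall backward" behaviour described above is the same idea as a sorted lookup that returns the last key not greater than the search value. A minimal Python sketch, assuming blanks are empty strings and using a made-up function name fill_down:

```python
from bisect import bisect_right


def fill_down(col_a):
    """col_a models A2:A; returns the filled values for the same rows."""
    # FILTER({ROW(A2:A), A2:A}, A2:A<>""): row numbers and values of non-blank rows.
    filled = [(row, value) for row, value in enumerate(col_a, start=2) if value != ""]
    if not filled:
        return []
    keys = [row for row, _ in filled]
    last_row = keys[-1]  # MAX(FILTER(ROW(A2:A), A2:A<>""))

    result = []
    for row in range(2, last_row + 1):
        # VLOOKUP(row, table, 2, TRUE): pick the last key that is <= row.
        idx = bisect_right(keys, row) - 1
        result.append(filled[idx][1] if idx >= 0 else "#N/A")
    return result


print(fill_down(["x", "", "", "y", "", "z", "", ""]))
```

Rows after the last non-blank entry are left out, which matches the MAX(FILTER(...)) cutoff in the formula.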

Google Spreadsheet sum which always ends on the cell above

How to create a Google Spreadsheet sum() which always ends on the cell above, even when new cells are added? I have several such calculations to make on each single column so solutions like this won't help.
Example:
On column B, I have several dynamic ranges which have to be summed. B1..B9 should be summed on B10, and B11..B19 should be summed on B20. I have tens of such calculations to make. Every now and then, I add rows below the last summed row, and I want them to be added to the sum. I add a new row (call it 9.1) before row 10, and a new row (let's call it 19.1) before row 20. I want B10 to contain the sum of B1 through B9.1 and B20 to contain the sum of B11:B19.1.
On Excel, I have the OFFSET function, which does it like a charm. But how to do it with Google Spreadsheet? I tried to use formulas like this:
=SUM(B1:INDIRECT(address(row()-1,column(),false))) # Formula on B10
=SUM(B11:INDIRECT(address(row()-1,column(),false))) # Formula on B20
But on Google Spreadsheet, all it gives is a #name error.
I wasted hours trying to find a solution, maybe someone can help?
Please advise
Amnon
You are probably looking for formula like:
=SUM(INDIRECT("B1:"&ADDRESS(ROW()-1,COLUMN(),4)))
Google Spreadsheet INDIRECT returns reference to a cell or area, while - from what I recall - Excel INDIRECT returns always reference to a cell.
Given Google's INDIRECT indeed has some hard time when you try to use it inside SUM as cell reference, what you want is to feed SUM with whole range to be summed up in e.g. a1 notation: "B1:BX".
You get the address you want in the same way as in Excel (note the "4" here for a row/column relative reference; by default ADDRESS returns an absolute one):
ADDRESS(ROW()-1,COLUMN(),4)
and then use it to prepare the range string for the SUM function by concatenating it with the starting cell.
"B1:"&
and wrap it up with INDIRECT, which will return the area to be summed up.
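To see why this keeps working after a row is inserted, here is a small Python sketch of the string being built. The helper names are hypothetical, and relative_address only mimics ADDRESS(row, col, 4).

```python
def column_letters(col):
    """Convert a 1-based column number to letters (2 -> B)."""
    letters = ""
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def relative_address(row, col):
    """Mimic ADDRESS(row, col, 4): a relative reference such as B9."""
    return f"{column_letters(col)}{row}"


def sum_range_above(formula_row, formula_col, start_cell="B1"):
    """Build the text of "B1:" & ADDRESS(ROW()-1, COLUMN(), 4)."""
    return f"{start_cell}:{relative_address(formula_row - 1, formula_col)}"


print(sum_range_above(10, 2))  # formula sits in B10
print(sum_range_above(11, 2))  # a row was inserted above, formula is now in B11
```

Because the end of the range is recomputed from the formula's own position, the range grows when the formula is pushed down.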
REFERRING TO THE ANSWER BELOW from Druvision (I can't comment yet, and I didn't want to multiply answers)
Instead of time-consuming formula corrections each time a row is inserted or deleted, just to keep everything looking like:
=SUM(INDIRECT(ADDRESS(ROW()-9,COLUMN(),4)&":"&ADDRESS(ROW()-1,COLUMN(),4)))
You can spare one column in separate sheet for holding variables (let's name it "def"), let's say Z, to define starting points e.g.
in Z1 write "B1"
in Z2 write "B11"
etc.
and then use it as a variable in your sum by using INDEX:
SUM(INDIRECT(INDEX(def!Z:Z,1,1)&":"&ADDRESS(ROW()-1,COLUMN(),4))) - sums from B1 to calculated row, since in Z1 we have "B1" ( the 1,1 in INDEX(...,1,1) )
SUM(INDIRECT(INDEX(def!Z:Z,2,1)&":"&ADDRESS(ROW()-1,COLUMN(),4))) - sums from B11 to calculated row, since in Z2 we have "B11" ( the 2,1 in INDEX(...,2,1) )
please note:
Separate sheet named 'def' - you don't want row insert/delete influence that data, thus keep it on side. Useful for adding some validation lists, other stuff you need in your formulas.
"Z:Z" notation - whole column. You said you had a lot of such formulas ;)
Thus you preserve flexibility of defining starting cell for each of your formulas, which is not influenced by calculation sheet changes.
By the way, wouldn't it be easier to write a custom function/script summing up all rows above the cell? If you feel like writing JavaScript, from what I recall, Google Spreadsheet now has a nice script editor. You can make a function called e.g. sumRowsAboveMe() and then just use it in your sheet like =sumRowsAboveMe() in a sheet cell.
Note: you might have to replace commas by semicolons
NOTE
After testing this answer, it will only work if the sum is in a different column due to a circular dependency error. Otherwise, the solution is valid.
It's a bit of algebra, but we can take advantage of Spreadsheets' lower right corner drag.
=SUM(X:X) - SUM(X2:X)
Where X is the column you are working with and X2 is the first row left out of the sum (the sum ends on the row above it). Drag the formula down and Sheets will increment the X2, thus moving the ending point.
*You mentioned that you had tens of such calculations to make. So in order to fit your exact need, we would subtract your last summation to get that "middle" range that we wanted.
e.g.
B1..B9 should be summed on B10, and B11..B19 should be summed on B20
Because of the circular dependency error mentioned earlier, I can't solve it exactly and put the sum on the same line, but this could work in other cases where the sum needs to be stored in a different column.
=SUM(B:B) - SUM(B10:B) //Formula on C10 (Sum of B1..B9)
=SUM(B:B) - SUM(B20:B) - C10 - B10 // Formula on C20 (Sum of B11..B19)
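The subtraction trick is easy to get off by one, so here is a quick Python check of which rows SUM(B:B) - SUM(Bn:B) actually covers. The column contents (1 to 30) and the helper names are invented for the check.

```python
def column_total(col):
    """SUM(B:B): the whole column."""
    return sum(col)


def tail_sum(col, n):
    """SUM(Bn:B): from 1-based row n to the bottom."""
    return sum(col[n - 1:])


def head_by_subtraction(col, n):
    """SUM(B:B) - SUM(Bn:B)."""
    return column_total(col) - tail_sum(col, n)


col_b = list(range(1, 31))  # hypothetical values in B1..B30

# Subtracting the tail that starts at row n leaves rows 1..n-1.
print(head_by_subtraction(col_b, 10) == sum(col_b[:9]))  # rows 1..9
print(head_by_subtraction(col_b, 9) == sum(col_b[:8]))   # rows 1..8
```

So to end the sum on row 9, the subtracted range has to start at row 10.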
This is based on #PsychoFish, here is the solution:
=SUM(INDIRECT(SUBSTITUTE(ADDRESS(1,COLUMN(),4),"1","")&"3:"&ADDRESS(ROW()-1,COLUMN(),4)))
Simply replace the "3:" for the row to start sum.
#PsychoFish is correct but cannot be dragged and copied since the column is literal and hard coded, and #Druvision was in the right direction but was wrong... basically ended up with the same issue of having to re-enter the ranges and then sliding the formulas over and over.
You guys are making this harder than you have to. I just leave a couple of empty rows above my "sum" row (you can format them to be filled with color or something to keep them from being inadvertently used), then just add new rows just above those special rows.
Agree with what user7255446 said that everyone is overcomplicating. Keep one row blank before your sum row. And then whenever you want to insert a new row, click on your blank row and use "Insert row ABOVE" instead of "insert row below". Your sum formula will automatically adjust.
Example: I want to sum from B1 to B19. I leave row 20 blank. In cell B21, put =SUM(B1:B20). Then if you ever need to insert a new row, click on row 20 and choose "Insert row above". The sum formula automatically changes to =SUM(B1:B21) for you. And of course your sum cell is now B22.
General syntax:
=SUM(INDIRECT(cell_reference_as_string1 &":"& cell_reference_as_string2))
with for example:
cell_reference_as_string1 = ADDRESS(ROW(),COLUMN(),4)
cell_reference_as_string2 = ADDRESS(ROW()-1,COLUMN(),4)
I like how #abernier describes the general solution. So far only alphabet-based A1 notation (A being the first column, 1 being the first row) has been used. It keeps confusing me, especially when thinking of the number of columns left of another column. I like the number-based R1C1 notation much better. To use R1C1 notation for INDIRECT, you need to pass FALSE like so:
=SUM(INDIRECT("R1C"&COLUMN()&":R"&(ROW()-1)&"C"&COLUMN(), FALSE))
I hope you find that helpful, too.
OFFSET() can be used/abused for this purpose. Give it the absolute address of the top left of the range, 0 and 0 for the row/column offsets, and the height/width of the range. Let OFFSET() be the argument to SUM(), SUMIF(), etc.
ROW() and COLUMN() are handy when computing the desired height/width. Be sure to remember to subtract one to exclude the current row/column, or else you're liable to end up with a circular reference. If you have header rows/columns, subtract for them too.
For example, to sum everything from A2 down, excluding the current row, try:
=SUM(OFFSET($A$2,0,0,ROW()-2,1))
To sum everything to the left of the current cell, wherever it may be, try:
=SUM(OFFSET(INDIRECT("RC1",FALSE),0,0,1,COLUMN()-1))
Now let's flip things upside down, to show that this works in the other direction. Suppose you want to sum the B column, starting below the current row, until (and including) row #10. Try this:
=SUM(OFFSET($B$10,ROW()-9,0,10-ROW(),1))
You can avoid negative offsets, while still summing column B:
=SUM(OFFSET(INDIRECT("RC2",FALSE),1,0,10-ROW(),1))
Remove the "2" to instead sum the current column:
=SUM(OFFSET(INDIRECT("RC",FALSE),1,0,10-ROW(),1))
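To double-check the height arithmetic in the OFFSET formulas above, here is a tiny Python model that reports which rows each one covers. It only tracks row numbers; the function names and the chosen current rows are assumptions for the example.

```python
def offset_rows(anchor_row, row_offset, height):
    """First and last row (1-based, inclusive) of OFFSET(anchor, row_offset, 0, height, 1)."""
    first = anchor_row + row_offset
    return first, first + height - 1


def sum_above_from_a2(current_row):
    # =SUM(OFFSET($A$2,0,0,ROW()-2,1))
    return offset_rows(2, 0, current_row - 2)


def sum_below_to_row_10(current_row):
    # =SUM(OFFSET($B$10,ROW()-9,0,10-ROW(),1))
    return offset_rows(10, current_row - 9, 10 - current_row)


def sum_below_to_row_10_no_negative(current_row):
    # =SUM(OFFSET(INDIRECT("RC2",FALSE),1,0,10-ROW(),1))
    return offset_rows(current_row, 1, 10 - current_row)


print(sum_above_from_a2(10))               # formula in row 10
print(sum_below_to_row_10(3))              # formula in row 3
print(sum_below_to_row_10_no_negative(3))  # same rows, positive offset
```

In each case the current row is excluded, which is what avoids the circular reference.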
(Credit to Tom Sharpe, who commented above.) INDEX() can be used in a range expression. You might prefer this over OFFSET(), so I'm putting it here. The following sums everything from G1 down to the row above the current:
=SUM(G1:INDEX(G:G,ROW()-1))
Here's how I do it.
This formula does not require you to edit or enter anything about the particular column you would like to sum
=SUM(INDIRECT(CONCATENATE(address(1,column(),4),":",LEFT(address(1,column(),4),1))&ROW()-1))
The answer by #PsychoFish led me in the correct way.
The only issue was that I had to rewrite the formula again for each column and each sum. So here is the improved formula, which sums the previous 9 cells in the same column, without hardcoding the column or row numbers:
=SUM(INDIRECT(ADDRESS(ROW()-9,COLUMN(),4)&":"&ADDRESS(ROW()-1,COLUMN(),4)))
The only remaining issue is that I have to rewrite the formulas if someone adds or deletes a row. In this case I should change 9 to 10 or 8 correspondingly.

How to use INDEX() inside ARRAYFORMULA()?

I am trying to use the INDEX() formula inside an ARRAYFORMULA(). As a simple (non-sense) example, with 4 elements in column A, I expected that the following array formula entered in B1 would display all four elements from A in column B:
=ARRAYFORMULA(INDEX($A$1:$A$4,ROW($A$1:$A$4)))
However, this only fills field B1 with the value found in A1.
When I enter
=ARRAYFORMULA(ROW($A$1:$A$4))
in B1, then I do see all numbers 1 to 4 appear in column B. Why does my first array formula not expand like the second one does?
The INDEX function is one that does not support "iteration" over an array if an array is used as one of its arguments. There is no documentation of this that I know of; it simply is what it is. So the second argument will always default to the first element of the array, which is ROW(A1).
One clumsy workaround to achieve what you require relies on a second adjacent column existing next to the source data* (although it is unimportant what values are actually in that second column):
=ArrayFormula(HLOOKUP(IF(ROW($A$1:$A$4);$A$1);$A$1:$B$4;ROW($A$1:$A$4);0))
or indeed something like:
=ArrayFormula(HLOOKUP(IF({3;2;4;1};$A$1);$A$1:$B$4;{3;2;4;1};0))
edit 2015-06-09
* This is no longer a requirement in the newest version of Sheets; the second argument in the HLOOKUP can just be $A$1:$A$4.
Here is a tip for using vlookup with an array, so that even if the columns are moved later on the formula will still work correctly....
In general, configure the vlookup so that it's reading only 2 columns and returning the second. This can be done by inputting only the 2 columns required, rather than a range and column index.
Example:
Replace the following formula which would fail if columns are moved
=arrayformula( vlookup(C:C, booking!$A:$E ,5 ,false) )
with this formula which will continue to work even if columns are moved
=arrayformula( vlookup(C:C, {booking!$A:$A,booking!$E:$E} ,2 ,false) )
Note, you can also simulate the index function using vlookup.
Example:
Column R:R contains the row index numbers for looking up data in column booking!$A:$A
=arrayformula(vlookup(R:R ,arrayformula({row(booking!$A:$A), booking!$A:$A}),2 , false))
It's a nested array, so it can be helpful to test in stages, eg just the inner part for one example, eg return entry in row 10:
=vlookup(10 ,arrayformula({row(booking!$A:$A), booking!$A:$A}),2 , false)
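The INDEX simulation boils down to pairing every value with its row number and then doing an exact-match lookup per requested index. A minimal Python sketch, with a made-up booking column and the hypothetical name index_via_vlookup:

```python
def index_via_vlookup(row_indexes, column):
    """Model =ARRAYFORMULA(VLOOKUP(R:R, {ROW(col), col}, 2, FALSE))."""
    # {ROW(booking!A:A), booking!A:A}: each value keyed by its 1-based row number.
    table = {row: value for row, value in enumerate(column, start=1)}
    # Exact match (FALSE): a missing row number gives #N/A.
    return [table.get(index, "#N/A") for index in row_indexes]


booking_col_a = ["id", "alice", "bob", "carol"]  # hypothetical booking!A1:A4
print(index_via_vlookup([3, 1, 4, 9], booking_col_a))
```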
