I have a Google Sheet that is populated automatically via a Zapier integration. For each new row added, I need to look at a given cell (Shipper Name), find the last instance of that Shipper Name in prior rows and, if there is one, return the row number of that last entry.
Example Data Sheet
I am trying to create a formula that simply looks at the name in the new row and returns the number of the most recent row with that name.
The formula needs to run as an array formula so that the data auto-populates with each new row added to the sheet.
I have tried to use this formula, but when refactored as an array formula it doesn't populate new values for new rows; it just repeats the first value for all rows.
From column J:
=sumproduct(max(row(A$1:A3)*(F4=F$1:F3)))
I need this formula refactored to be an Array formula that auto populates all the cells below it.
I have tried this version, but it doesn't work:
=ArrayFormula(IF(ISBLANK($A2:$A),"",sumproduct(max(row(A$1:A3)*($F4:$F=F$1:F3))))
A script (custom function maybe?) would be better.
Solution 1
Below is a formula you can place into the header (put it in J1 and remove everything below it).
It works much faster than the second solution and has no N² size restriction. It also works with empty shippers (that is what & "♥" is for): as long as column A has some value, the row will not be ignored.
={
"Row of Last Entry";
ARRAYFORMULA(
IF(
A2:A = "",
"",
VLOOKUP(
ROW(F2:F)
+ VLOOKUP(
F2:F & "♥",
{
UNIQUE(F2:F & "♥"),
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1)
},
2,
0
),
SORT(
{
ROW(F2:F) + 1
+ VLOOKUP(
F2:F & "♥",
{
UNIQUE(F2:F & "♥"),
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1)
},
2,
0
),
ROW(F2:F);
{
SEQUENCE(ROWS(UNIQUE(F2:F)))
* POWER(10, INT(LOG10(ROWS(F:F))) + 1),
SEQUENCE(ROWS(UNIQUE(F2:F)), 1, 0, 0)
}
},
1,
1
),
2,
1
)
)
)
}
Details on how it works
For every row we use VLOOKUP to search for a special number in a sorted virtual range to get the row number of the previous entry matching the current one.
A special number for a row is constructed like this: we take the sequential number of the current entry among the unique entries and append the current row number to it.
The right part (row number) of the resulting special numbers must be aligned across all of them. If the entry has sequential number 13, the row number is 1234 and there are 100500 rows, then the number must be 13001234, where 001234 is the aligned right part.
Alignment is done by multiplying the sequential number by 10 to the power of (INT(LOG10(total number of rows)) + 1), which gives us 13000000 in the example above; adding the row number then gives 13001234. This approach is used to avoid LEN and TEXT: working with numbers is faster than working with strings.
The virtual range has almost the same special numbers in the first column and the original row numbers in the second.
"Almost the same" means they are just increased by 1, so VLOOKUP stops before the number corresponding to the current row, that is, at the closest earlier row with the same entry.
Also, the virtual range has some special rows (added at the bottom before sorting) which have all 0's as the right part of their special numbers (1st column) and 0 as the row number (2nd column). That is done so that VLOOKUP finds them for the first occurrence of an entry.
The virtual range is sorted, so we can set the is_sorted parameter of the outer VLOOKUP to 1: that returns the last match that is less than or equal to the number being looked for.
& "♥" is appended to the entries so that empty entries are also found by VLOOKUP.
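To make the special-number idea easier to follow outside of Sheets, here is a minimal Python sketch of the same technique. It is an illustration only, not part of the formula: the function name previous_row_of_same_entry and the first_row parameter are hypothetical, bisect stands in for the outer VLOOKUP with is_sorted set to 1, and the multiplier is derived from a digit count instead of LOG10 (sized for the last row plus 1, so a shifted number cannot spill into the next entry's block).

```python
import bisect


def previous_row_of_same_entry(entries, first_row=2):
    """For each entry, return the row of its previous occurrence (0 if none).

    entries   -- values of the lookup column, top to bottom (may be empty strings)
    first_row -- sheet row number of entries[0] (assumed to be 2, below a header)
    """
    last_row = first_row + len(entries) - 1
    # Power of ten large enough to hold any (row number + 1) in the right part.
    multiplier = 10 ** len(str(last_row + 1))

    # Sequential number of every unique entry, in order of first appearance.
    sequence = {}
    for entry in entries:
        sequence.setdefault(entry, len(sequence) + 1)

    # Virtual range: (special number + 1, row), plus one sentinel per unique
    # entry whose right part is all zeros and whose row is 0.
    virtual = []
    for offset, entry in enumerate(entries):
        row = first_row + offset
        virtual.append((sequence[entry] * multiplier + row + 1, row))
    for number in sequence.values():
        virtual.append((number * multiplier, 0))
    virtual.sort()
    keys = [key for key, _ in virtual]

    result = []
    for offset, entry in enumerate(entries):
        special = sequence[entry] * multiplier + first_row + offset
        # Last key that is less than or equal to the special number.
        position = bisect.bisect_right(keys, special) - 1
        result.append(virtual[position][1])
    return result


print(previous_row_of_same_entry(["Ann", "Bob", "Ann", "Ann", "Bob"]))
# [0, 0, 2, 4, 3]
```

The rows here are 2 to 6, so the first "Ann" and "Bob" hit their sentinel rows and get 0, while every later occurrence gets the row of the closest earlier match.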
Solution 2 - slow and has restrictions
But for some small enough number of rows this formula works (put it in J1, remove everything below):
={
"Row of Last Entry";
ARRAYFORMULA(
REGEXEXTRACT(
TRANSPOSE(QUERY(TRANSPOSE(
IF(
(FILTER(ROW(F2:F), F2:F <> "") > TRANSPOSE(FILTER(ROW(F2:F), F2:F <> "")))
* (FILTER(F2:F, F2:F <> "") = TRANSPOSE(FILTER(F2:F, F2:F <> ""))),
TRANSPOSE(FILTER(ROW(F2:F), F2:F <> "")),
""
)
), "", ROWS(FILTER(F2:F, F2:F <> "")))),
"(\d*)\s*$"
)
)
}
But there is a problem. The virtual range inside the formula is of size N², where N is the number of rows. For the current 1253 rows it works, but there is a limit after which it will throw an error about the range being too large.
That is the reason to use FILTER(...) and not just F2:F.
Here is a significantly simpler way to get at the information you're interested in. (I think.) I'm mostly guessing about what you want because your question wasn't really about what you want, but rather about how to get something that you think would help you get what you want. This is an example of an XY problem. I attempted to guess based on experience at what you're really after.
This editable sheet contains just 3 formulas. 2 on the raw data sheet and one in a new tab called "analysis."
The first formula on the Raw data tab extracts a properly formatted timestamp using a combination of MMULT and SPLIT functions and looks like this:
=ARRAYFORMULA({"Good Timestamp";IF(A2:A="",,MMULT(N(IFERROR(SPLIT(A2:A,"T"))),{1;1}))})
The second formula finds the previous timestamp for that shipper and subtracts it from the current timestamp, thereby giving you the time between timestamps. However, it only does this if the time is 200 minutes or less. If it is more than 200 minutes, it assumes that was a different shift for that shipper. It looks like this and uses a combination of LOOKUP() and SUBSTITUTE() to make sure it's pulling the correct timestamps. Obviously, you can find and change the 200 value to something more appropriate if it makes sense.
=ARRAYFORMULA({"Minutes/Order";IF(A2:A="",,IF(IFERROR((G2:G-1*SUBSTITUTE(LOOKUP(F2:F&G2:G-0.00001,SORT(F2:F&G2:G)),F2:F,""))*24*60)>200,,IFERROR((G2:G-1*SUBSTITUTE(LOOKUP(F2:F&G2:G-0.00001,SORT(F2:F&G2:G)),F2:F,""))*(24*60))))})
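If the nested LOOKUP/SUBSTITUTE is hard to read, the logic it implements can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name minutes_since_previous is hypothetical, each order is taken to be a (shipper, datetime) pair, and the 200-minute cut-off is passed as max_gap_minutes.

```python
from datetime import datetime


def minutes_since_previous(orders, max_gap_minutes=200):
    """Minutes between each order and the same shipper's previous order.

    Returns None for a shipper's first order and for gaps longer than
    max_gap_minutes (treated as a different shift).
    """
    gaps = [None] * len(orders)
    last_seen = {}  # shipper -> timestamp of the most recent earlier order
    # Walk the orders in chronological order, remembering original positions.
    for index in sorted(range(len(orders)), key=lambda i: orders[i][1]):
        shipper, stamp = orders[index]
        previous = last_seen.get(shipper)
        if previous is not None:
            gap = (stamp - previous).total_seconds() / 60
            if gap <= max_gap_minutes:
                gaps[index] = gap
        last_seen[shipper] = stamp
    return gaps


orders = [
    ("Ann", datetime(2021, 1, 1, 9, 0)),
    ("Bob", datetime(2021, 1, 1, 9, 5)),
    ("Ann", datetime(2021, 1, 1, 9, 30)),
    ("Ann", datetime(2021, 1, 1, 14, 0)),
    ("Bob", datetime(2021, 1, 1, 9, 20)),
]
print(minutes_since_previous(orders))
# [None, None, 30.0, None, 15.0]
```

The fourth order is 270 minutes after Ann's previous one, so it is left empty, just as the formula leaves the cell blank.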
The third formula, on the tab called analysis, uses QUERY to show the average minutes per order and the number of orders per hour that each shipper is processing. It looks like this:
=QUERY({'Sample Data'!F:I},"Select Col1,AVG(Col3),COUNT(Col3)/(SUM(Col3)/60) where Col3 is not null group by Col1 label COUNT(Col3)/(SUM(Col3)/60)'Orders/ hour',AVG(Col3)'Minutes/ Order'")
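For readers less familiar with QUERY, the aggregation above amounts to something like the following Python sketch. The function name shipper_stats and the (shipper, minutes) input shape are assumptions made for the example; rows with no minutes value are skipped, as "where Col3 is not null" does.

```python
def shipper_stats(records):
    """Return {shipper: (average minutes per order, orders per hour)}."""
    totals = {}  # shipper -> (order count, total minutes)
    for shipper, minutes in records:
        if minutes is None:
            continue  # same effect as "where Col3 is not null"
        count, total = totals.get(shipper, (0, 0.0))
        totals[shipper] = (count + 1, total + minutes)

    stats = {}
    for shipper, (count, total) in totals.items():
        average = total / count
        # COUNT / (SUM / 60), written as COUNT * 60 / SUM; guard against SUM = 0.
        per_hour = count * 60 / total if total else None
        stats[shipper] = (average, per_hour)
    return stats


print(shipper_stats([("Ann", 30), ("Ann", 10), ("Bob", 15), ("Bob", None)]))
# {'Ann': (20.0, 3.0), 'Bob': (15.0, 4.0)}
```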
Hopefully I've guessed correctly at your real goals. Always do your best to explain what they are rather than asking for only a small portion that you think will help you get to the answer. You can end up overcomplicating your process without realizing it.
I am trying to add conditional formatting using a custom formula. I want to check whether, in a single row (in this case row 5, starting from column L and taking every 4th column: L, P, T, ...), there is at least one non-empty cell.
The formula below works perfectly. It just isn't dynamic: if I add more columns to check, I have to add more and more "+ not(isblank(...))" terms and it becomes very long. Is there a shorter formula that performs the check no matter what the last column is?
=(not(isblank(L5))+ not(isblank(P5))+ not(isblank(T5)) +not(isblank(X5)) + not(isblank(AB5)) +not(isblank(AF5)) + not(isblank(AJ5))+ not(isblank(AN5))+not(isblank(AR5))+not(isblank(AV5))+ not(isblank(AZ5)) ) > 1
here's the link:
https://docs.google.com/spreadsheets/d/1kedGsLIUw2UsA8LbiWe0w_s9HYoBLqOl6zGQxZlLq5s/edit#gid=0
you can do it like this which is shorter but still not dynamic:
=((E2<>"")+(I2<>"")+(M2<>"")+(Q2<>"")+(U2<>"")+(Y2<>"")+(AC2<>"")+(AG2<>"")+(AK2<>"")+(AO2<>"")+(AS2<>"")+(AW2<>"")+(BA2<>"")+(BE2<>"")+(BI2<>"")+(BM2<>"")+(BQ2<>"")+(BU2<>"")+(BY2<>"")+(CC2<>"")+(CG2<>"")+(CK2<>"")+(CO2<>"")+(CS2<>"")+(CW2<>"")+(DA2<>"")+(DE2<>"")+(DI2<>"")+(DM2<>"")+(DQ2<>""))>1
this will cover 30 columns (the equivalent of range A1:DQ which is 121 columns)
but if you need more, I created a generator in your sheet where you just input the number of columns and it creates the formula for you, which you then copy-paste into conditional formatting.
it's still not a truly dynamic solution, but hey, better than a nail in the eye
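For reference, the check that these long formulas spell out term by term is just "count the non-empty cells in every 4th column and compare with a threshold". A minimal Python sketch of that logic, assuming a row is a plain list of cell values (the function names, the zero-based start index and the minimum parameter are all hypothetical; minimum defaults to 2 because the formulas above test > 1):

```python
def count_filled(row, start=11, step=4):
    """Count non-empty cells in every step-th column, starting at index start.

    start=11 is column L when columns are numbered from 0 (A=0, B=1, ...).
    """
    return sum(1 for cell in row[start::step] if cell not in ("", None))


def should_highlight(row, start=11, step=4, minimum=2):
    """True when at least `minimum` of the checked cells are filled."""
    return count_filled(row, start, step) >= minimum


row = [""] * 52
row[11] = "x"   # column L
row[12] = "z"   # column M, not one of the checked columns
print(should_highlight(row))   # False
row[15] = "y"   # column P
print(should_highlight(row))   # True
```

Because the slice runs to the end of the row, adding more columns does not require touching the check.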
I have a column called "Notes (Atomic weights)" with an arbitrary (0 to n) number of search keys in it.
and a corresponding Named Range called "NOTES"
How do I do a vlookup/Query or Filter such that I get the combined column called "Note Texts" (see image below)?
If there is only one search key in the Notes column, I can use
IF(LEN(W3)>0, VLOOKUP(W3, NOTES, 2, false) , )
But now I have an arbitrary number of search keys in one column. How do I approach this without splitting the keys into even more cells and then stitching them all back together? (Adding more columns is very messy, since many other columns in my table also require the same fix.)
Try this formula:
=TRANSPOSE(SPLIT(JOIN(char(10),ArrayFormula(IFERROR(VLOOKUP("["&SPLIT(JOIN("! ",A1:A4),"![",1),D1:E3,2,0),"!"))),char(10)&"!"&char(10),0))
Sample file:
https://docs.google.com/spreadsheets/d/13QFnYri6d8xvL9kXw-xAP87n1kT4wh4HwxYtHftMU9g/edit#gid=2094642927
Max's solution works great! It took me over an hour to analyse and finally understand the formula.
For my needs, I did not combine the rows and perform a single evaluation. Instead, I repeated the formula for every row using the following simplified formula (this fixes the alignment bug when there are empty rows):
= JOIN(char(10), ArrayFormula(
IFERROR(
VLOOKUP("["&SPLIT( A10 ,"[]", TRUE) &"]",NOTES,2,0),
"Error")
)
)
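The per-row formula above boils down to "split the cell into its [key] parts, look each one up, and join the results with new lines". As a rough illustration in Python (not a replacement for the formula): the function name note_texts is hypothetical, the NOTES named range is modelled as a dict from key to text, and an empty cell yields an empty string here, which is a choice of this sketch.

```python
import re


def note_texts(cell, notes, missing="Error"):
    """Look up every [key] found in `cell` and join the texts with new lines."""
    # Each search key is a bracketed token such as "[1]" or "[a]".
    keys = re.findall(r"\[[^\[\]]*\]", cell)
    return "\n".join(notes.get(key, missing) for key in keys)


notes = {"[1]": "First note", "[2]": "Second note"}
print(note_texts("[1][2]", notes))
# First note
# Second note
print(note_texts("[3]", notes))
# Error
```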
The following is a breakdown of what each part of Max's formula means.
Start reading from the inside (1) to the outside (7).
//(7) Finally, we TRANSPOSE the columns into rows
TRANSPOSE(
//(6) Now, we SPLIT the text up with the delimiter "\n!\n"
// that was added during Step (1)
SPLIT(
//(5) We now JOIN all the columns back together, putting a new line char(10) between them
JOIN(
char(10) //new line used as the delimiter
//(4) The magic! ARRAYFORMULA enables the display of values returned from an array formula into multiple rows and/or columns
// The result is now displayed across multiple columns
,ArrayFormula(
IFERROR(
//(3) We can now do a VLOOKUP for each of the split search keys
// (but only the first result is displayed)
VLOOKUP(
"[" //reinsert the [ after the split
//(2) Now, SPLIT up everything using the delimiters "!" (new row) and "[" (new item)
& SPLIT(
//******** START FROM HERE *********
//(1) Take all the rows of interest, and then
// JOIN them together with a "!<SPACE>"
JOIN(
"! " //delimiter !<SPACE>
,A1:A4) //text to join (all the rows of interest)
,"!["
,TRUE) //split by each character
,NOTES //named range of interest
,2 //take the second column
,FALSE)
,"!") //insert ! if error
) //ArrayFormula
) //JOIN
,char(10)&"!"&char(10) //delimiter "\n!\n" for split
,FALSE //do not split by each character
) //SPLIT
) //TRANSPOSE
I've got the following Google spreadsheet:
item have ready need1 need2 need3
A 1 2 1
B 1 2 1 1
C 2 2
etc
I want to fill the ready column as follows:
find the first column in the need1, ..., needN range which has a non-empty value
if the value found is less than or equal to the value in the have column, set the ready column to something cheerful (e.g. yes)
if the value found is larger than the value in the have column, don't do anything
So the input above, when processed, should look like this:
item have ready need1 need2 need3
A 1 2 1
B 1 2 1 1
C 2 yes 2
For the first step I found a suggested solution, which did not work for me:
=INDEX( SORT( FILTER( D10:H10 , LEN( D10:H10 ) ) ,
FILTER( COLUMN( D10:H10 ) , LEN( D10:H10 ) ) , 0 ) , 1 )
(it returns #REF!) Not sure what's wrong with it or how to proceed to the next step.
Thanks in advance!
If you know how many need columns you have, or even just how many columns are on the sheet, this is quite straightforward. If not and you need to look at the entire row, you might have to redesign a bit to avoid a circular reference from the cell with the formula being part of that row.
Your second two steps are fairly simple either way - you want one of two results based on a condition, so you're going to want to use =IF. Your condition is that the 'need' number is less than or equal to the 'have' number, and you want it to say 'yes' if that's true, and nothing if it isn't. So, that gives us:
=IF(need<=have, "Yes", "")
The examples below assume your table above starts from cell A1 in the top left, and that the last column in your sheet is Z
Next we need to find 'need' and 'have'. Finding 'have' is pretty easy - it's just the number in column B.
Finding 'need' is slightly more complicated. You've got the right idea using INDEX and FILTER, but your formula seems a little overcomplicated. Basically we can use FILTER to filter out the blank values, and INDEX to find the first one that is left. First, FILTER:
The range you want to filter from is everything in the same row from column D to column Z (or whatever the final column is), and the condition you want to filter for is that those same cells are not blank. For the formula you're typing into cell C2, that gives us:
=FILTER(D2:Z2, D2:Z2<>"")
Next, INDEX: If you give INDEX an array, a row number, and a column number, it will tell you what is in the cell where that row and column meet. As we've filtered out the blanks, we just want whatever is left in the first column of our filtered array, which gives us:
=INDEX(FILTER(D2:Z2, D2:Z2<>""), 1, 1)
Or, as we only have one row in our array, and INDEX is pretty smart, simply:
=INDEX(FILTER(D2:Z2, D2:Z2<>""), 1)
So to bring it all together, our final formula for cell C2 is:
=IF(INDEX(FILTER(D2:Z2, D2:Z2<>""), 1)<=B2, "Yes", "")
Then just drag the formula down for as many rows as you need. If your sheet is or becomes wider, just change Z to whatever your last column is.
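If it helps to see the same logic outside of Sheets, here is a small Python sketch of what the final formula does for one row. The function name ready and the list-of-cells input are assumptions for the example, and the positions of the blank need cells in the sample rows are guessed, since only the first non-empty value matters.

```python
def ready(have, needs):
    """Return "yes" if the first non-empty need is <= have, else ""."""
    for need in needs:
        if need not in ("", None):
            # Only the first non-empty need column is compared.
            return "yes" if need <= have else ""
    return ""  # no need values at all: leave the cell empty


print(repr(ready(1, [2, 1, ""])))    # ''     (row A: first need 2 > have 1)
print(repr(ready(1, [2, 1, 1])))     # ''     (row B)
print(repr(ready(2, ["", 2, ""])))   # 'yes'  (row C: first need 2 <= have 2)
```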
When you don't know the size of a range, use functions row, column, rows, columns.
Simple formula
Here's an example of what you are looking for:
=if(INDEX(FILTER(OFFSET(D2,,,1,COLUMNS(1:1)-column(D2)+1),OFFSET(D2,,,1,COLUMNS(1:1)-column(D2)+1)<>""),1)<=B2,"yes","")
this part of formula:
OFFSET(D2,,,1,COLUMNS(1:1)-column(D2)+1)
returns the range starting from given cell (D2) to the end of Sheet (COLUMNS(1:1)-column(D2)+1)
ArrayFormula
I suggest using ArrayFormula, it'll expand automatically:
=ARRAYFORMULA(if(REGEXEXTRACT(SUBSTITUTE(trim(transpose(query(transpose(OFFSET(D2,,,COUNTA(A2:A),COLUMNS(1:1)-column(D2)+1)),,COLUMNS(OFFSET(D2,,,COUNTA(A2:A),COLUMNS(1:1)-column(D2)+1)))))," ",", "),"\d+")*1<=OFFSET(B2,,,COUNTA(A2:A)),"yes",""))
It assumes that 'Item' column has no blank values.
The solution from #Max Makhrov works, and has the advantage of using a single formula for the whole column.
However, it assumes that all of the columns to the right of your ready column (from D onward) are need_ columns.
The solution from #dmusgrave also works, provided you remove the extra "=" before INDEX:
=IF(INDEX(FILTER(D2:Z2,D2:Z2<>""),1)<=B2,"Yes","")
However, it makes the same assumption, and also stops at column Z.
Such assumptions seem reasonable, but if they are limiting you, here's how you can have any number of need_ columns starting right of your ready column:
=IF(INDEX(FILTER(INDIRECT( "D"&ROW()&":"&CHAR(67+COLUMNS(FILTER($1:$1,LEFT($1:$1, 4)="need")))&row() ), INDIRECT( "D"&ROW()&":"&CHAR(67+COLUMNS(FILTER($1:$1,LEFT($1:$1,4)="need")))&row() )<>""),1)<=B2,"Yes","")
The idea is simply to replace D2:Z2 (in #dmusgrave's solution) by :
INDIRECT( "D"&ROW()&":"&CHAR(67+COLUMNS(FILTER($1:$1,LEFT($1:$1, 4)="need")))&row() )
Explanation: You start from D at current row, and you go until the last need_ column on the same current row.
CHAR(68) is D, to which you add the number of columns titled need.*, minus one (hence the 67).
Using the same logic, you can easily make your formula more robust/generic, for example to handle need_ columns that do not start right after the ready column.
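One caveat with the CHAR(67+n) trick: it only yields a valid column letter up to Z (n = 23 need_ columns), because the character after "Z" is "[" and not "AA". The general conversion from a column number to its letters can be sketched in Python like this (the function name column_letter is hypothetical; it only illustrates the arithmetic and is not a Sheets function):

```python
def column_letter(index):
    """Convert a 1-based column number to its letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        # Shift to 0-based before dividing, because there is no "zero" letter.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


print(column_letter(4))     # D
print(column_letter(26))    # Z
print(column_letter(27))    # AA
print(column_letter(121))   # DQ
```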
Google Forms - I have set up a Google Form and I want to assign a unique ID to each of the completed incoming form inputs. My intention is to use the unique ID as an input for another Google Form I have created, which I will use to link the two completed forms. Is there an easier way to do this?
I'm not a programmer but I have programming resources available to me if needed.
I was also banging my head at this and finally found a solution.
I compose a 6-digit number that gets generated automatically for every row and is composed of:
3 digits of the row number - that gives the uniqueness (you can use more if you expect more than 998 responses), concatenated with
3 digits of the timestamp converted to a number - that prevents guessing the number
Follow these instructions:
Create an additional column in the spreadsheet linked to your form, let's call it: "unique ID"
Row number 1 should be populated with column titles automatically
In row number 2, under column "Unique ID", add the following formula:
=arrayformula( if( len(A2:A), "" & text(row(A2:A) - row(A2) + 2, "000") & RIGHT(VALUE(A2:A), 3), iferror(1/0) ) )
Note: An array formula applies automatically to the entire column.
Make sure you never delete that row, even if you clear up all the results from the form
Once a new submission is populated, its "Unique ID" will appear automatically
Formula explanation:
Column A should normally hold the timestamp. If the timestamp is not empty, then this gives the row number: row(A2:A) - row(A2) + 2
Using TEXT I format it as a 3-digit number (zero-padded).
Then I concatenate it with the timestamp converted to a number using VALUE and trim it to the three right-most digits using RIGHT
Voila! A number that is both unique and hard-to-guess (as the submitter has no access to the timestamp).
If you would like more confidence, obviously you could use more digits for each of the parts.
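To see how the two parts combine, here is a minimal Python sketch of the same composition. It is an illustration under assumptions: the function name unique_id is hypothetical, and the timestamp is passed in as the text of its numeric value, since the exact digits Sheets shows for a date+time are not reproduced here.

```python
def unique_id(row, timestamp_value, first_row=2):
    """Build the ID: zero-padded row number + last 3 characters of the timestamp.

    row             -- sheet row number of the response
    timestamp_value -- the timestamp's numeric value as text, e.g. "44197.52345"
    first_row       -- row of the first response (2, below the header)
    """
    # Same as TEXT(ROW(A2:A) - ROW(A2) + 2, "000"): pads to at least 3 digits.
    row_part = f"{row - first_row + 2:03d}"
    # Same as RIGHT(VALUE(...), 3): the three right-most characters.
    stamp_part = str(timestamp_value)[-3:]
    return row_part + stamp_part


print(unique_id(2, "44197.52345"))     # 002345
print(unique_id(1000, "44197.52345"))  # 1000345
```

The second call shows what happens past row 999: the row part grows to four digits, so the ID stays unique but is no longer 6 characters long.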
You can apply unique ID numbers using an arrayformula next to the form data. In row 1 of the first empty column to the right you can use something like
=arrayformula(if(row(A1:A)=1,"UNIQUE ID",if(len(A1:A)>0,98+row(A1:A),iferror(1/0))))
A few comments regarding the explanation provided by #Ying, which I will try to expand, as it is very good.
> Column A should normally hold the timestamp.
In my case, it is date+time stamp.
> 4. Make sure you never delete that row, even if you clear up all the results from the form
That issue can easily be avoided by placing the formula in the header like this
={"calculated_id";arrayformula( if( len(C2:C); "" & text(row(C2:C) - row(C2) + 2; "000") & RIGHT(VALUE(C2:C); 3); iferror(1/0) ) )}
This formula provides a string for one cell and a formula for the next one, which happens to be an array formula that covers all the cells below.
Note: Depending on your language settings you may need to use ";" or "," as separator among parameters.
> 5. Once a new submission is populated, its "Unique ID" will appear automatically
Issue
And here is the issue I see with this solution.
If the Google Form allows responders to edit their responses, the date+time stamp will change, and so will the calculated_id.
A workaround is to have 2 columns: one is the calculated_id and the other is the static_id.
static_id takes whatever is in calculated_id only if it has no data itself; otherwise it stays as it is.
That way we have an ID that will not change no matter how many updates the response experiences.
The short formula for static_id is
=IF(AND(IFERROR(K2)<>0;K2<>"");K2;L2)
The large one is
={"static_id";ArrayFormula(IF(AND(IFERROR(M2:M)<>0;M2:M<>"");M2:M;L2:L))}
M or K -> static_id
L -> calculated_id
Remember to put this last one on the header of the column. I tend to change the color to purple when it has a formula behind, so I don't mess with it by mistake.
Extra info.
The numeric value of the date/time stamp differs depending on whether it contains both a date and a time or just one of them. Here are some examples.
Note that the number of digits in the fractional part differs quite a lot depending on the case.