Google Spreadsheet: Split String and return last N elements - google-sheets

I have string ( "-" ) delimited data (alphanumeric, variable length) in a single column which has the format...
Column A
"aaa-bbb-ccc"
"ddd-bbb-eee"
"aaa-fff-ggg"
I have been able to use array_constrain() to return partial data elsewhere but this is N elements from the beginning of the array to the end of the array so I could have...
aaa (num_cols = 1)
aaa-bbb (num_cols = 2)
aaa-bbb-ccc (num_cols = 3)
I am looking to get the last N elements from the split data so...
bbb-ccc OR
ccc
num_cols only goes from the beginning of the array to the back of the array so that's no good for my scenario.
This answer to a similar question suggests using regexextract() to retrieve the last value which if my RegEx Foo was stronger then perhaps I could make that work with a little nudge in the right direction.
I know that index() can be used but that only returns one element at a time from the split data so that gets messy / inefficient.
So the question is does anybody know how to return the last N elements from an array? Ordinary sorting wouldn't work but reversing the array could work.

There's no reverse function. I see two ways:
write small function is sctipt: reverse array
use ugly huge formula
Huge formula sample
=JOIN("-",QUERY({ArrayFormula(ROW(INDIRECT("A1:A"&COUNTA(SPLIT(A1,"-"))))),TRANSPOSE(SPLIT(A1,"-"))},"select Col2 order by Col1 desc limit 2 "))
change limit 2 in the end of the formula to get N last elements.
Script sample
Try this:
function reverseLine(line) {
return line[0].reverse();
}
Use as custom formula: =reverseLine(B1:D1)
for line: aaa bbb ccc returns:
ccc
bbb
aaa

Provided the variable length is exclusively from the number of sets of the format -aaa appended to the first three characters then those three characters and as many further sets as desired may be stripped off the front with:
=replace(A1,1,find("-",A1)*B1,"")
where A1 contains the likes of aaa-bbb-ccc and B1 the number of sets additional to the first three characters to be removed, plus one.

Related

ArrayFormula, RegexExtract, and Join in Google Sheets

I have a data set wherein emails are populated. I would like to list all the surnames extracted in the emails per cell and will be all joined to a one single cell but I want to put a separator or delimeter to the emails obtaine per cell.
Here is the data set:
A
B
john.smith#gmail.com, jane.doe#gmail.com
UPDATE
john.smith#gmail.com
CLOSE
And here is the formula to extract
=ARRAYFORMULA(
PROPER(
REGEXEXTRACT(
A:A,
REGEXREPLACE(
A:A,
"(\w+)#","($1)#"
)
)
)
)
This initially yields the ff:
C
D
Smith
Doe
Smith
I would like to use JOIN() inside the ARRAYFORMULA() but it is not working as I seem to think it would since it outputs an error that it only accepts one row or one column of data. My initial understanding of ARRAYFORMULA() is that it iterates through the course of the data, so I thought it will JOIN() first, and then move on to the next element/row but I guess it doesn't work that way. I can use FLATTEN() but I want to have delimiters or separators in between the row elements. I need help in obtaining my intended final result which will look like this:
UPDATE:
Smith
Doe
CLOSE:
Smith
All are located in one cell, C1. UPDATE and CLOSE are from column B.
EDIT: I would like to clarify that the email entries in column A are dynamic and maybe more than two.
I think this will work:
=arrayformula(flatten(if(A2:A<>"",regexreplace(trim(split(B2:B&":"&char(9999)&regexreplace(Proper(A2:A),"#[\w\.]+,\ ?|#.*",char(9999)&" "),char(9999))),".*\.",),)))
NOTES:
Proper(A2:A) changes the capitalisation.
The regexreplace "#[\w\.]+,\ ?|#.*" finds:
# symbol...
then any number of A-Z, a-z, 0-9, _ [using \w] or . [using \.]
then a comma
then 'optionally' a space \ [the optional bit is ?]
or [using |], the # symbol then an number of characters [using .*]
The result is replaced with a character that you won't expect to have in your text - char(9999) which is a pencil icon, and a trailing space (used later on when the flatten keeps a gap between lines). The purpose is to get all of the 'name.surname' and 'nameonly' values in front of any # symbol and separate them with char(9999).
Then infront of the regexreplace is B2:B&":"&char(9999)& which gets the value from column B, the : chanracter and char(9999).
The split() function then separates then into columns. Trim() is used to get rid of spaces in front of names that don't contain ..
The next regexreplace() function deletes anything before, and including . to keep surname or name without ..
The if(A2:A<>"" only process rows where there is a value in col A. The arrayformula() function is required to cascade the formula down the sheet.
I didn't output the results in a single cell, but it looks like you've sorted that with textjoin.
Here's my version of getting the results into a single cell.
=arrayformula(textjoin(char(10),1,if(A2:A<>"",REGEXREPLACE(B2:B&":"&char(10)&regexreplace(Proper(A2:A),"#[\w\.]+,\ ?|#.*",char(10)),".*\.",),)))
Assuming that your A:A cells will always contain only contiguous email addresses separated by commas, you could place this in, say, C1 (being sure that both Columns C and D are otherwise empty beforehand):
=TRANSPOSE(FILTER({B:B,IFERROR(REGEXEXTRACT(SPLIT(PROPER(A:A),","),"([^\.]+)#"))},A:A<>""))
If this produces the desired result and you'd like to know how it works, report back. The first step, however, is to make sure it works as expected.
use:
=INDEX(REGEXREPLACE(TRIM(QUERY(FLATTEN(QUERY(TRANSPOSE({{B1; IF(B2:B="",,"×"&B2:B)},
PROPER(REGEXEXTRACT(A:A, REGEXREPLACE(A:A, "(\w+)#", "($1)#")))})
,,9^9)),,9^9)), " |×", CHAR(10)))

How can I get the last numerical value value in a column in Google Sheets?

I need to find the last numerical value in a column. I was using this formula to get the last value in column G, but I made some changes and it no longer works: =INDEX(G:G, COUNTA(G:G), 1). My column now looks like this:
645
2345
4674.2345
123.1
"-"
"-"
"-"
...and the formula returns "-". I want it to return 123.1. How can I do this?
There are many ways to go about this. Here is one of them:
=QUERY(FILTER({G:G,ROW(G:G)},ISNUMBER(G:G)),"Select Col1 ORDER BY Col2 Desc LIMIT 1")
FILTER creates a virtual array of only numeric values in G in the first column and the row of those numeric values in the second column.
QUERY returns flips the order by row number and returns only the new top value from the first column (which winds up being your last numeric value in the original range).
However, if your numeric values start at G1, and if there are only numeric values up to where you start adding hyphens in cells, you could just alter your original formula like this:
=INDEX(G:G,COUNT(G:G))
This would work because COUNT only counts numeric values while COUNTA counts all non-null values (including errors BTW).
Not to take anything away from the accepted answer, but I've been working on this a bit lately in relation to this for the never-ending last row discussion and thought I'd share some potential similar solutions. These ideas are inspired by a pattern of google sheet array questions that seem to be coming up more often. I am also intentionally using different ways to do the same thing just to give people some ideas (i.e. left and Regex).
Last Row that is...
Number: =max(filter(row(G:G),isnumber(G:G)))
Text: =max(filter(row(G:G),isText(G:G)))
An error: =max(filter(row(G:G),iserror(G:G)))
Under 0 : =max(filter(row(G:G),G:G<0))
Also exists in column D: =max(filter(row(G:G),ISNUMBER(match(G:G,D:D,0))))
Not Blank: =max(filter(row(A:A),NOT(ISBLANK(A:A))))
Starts with ab: =max(filter(row(G:G),left(G:G,2)="ab"))
Contains the character !: =max(filter(row(G:G),isnumber(Find("!",G:G))))
Starts with a number: =max(filter(row(G:G),REGEXMATCH(G:G,"^\d")))
Only contains letters: =max(filter(row(G:G),REGEXMATCH(G:G,"^[a-zA-Z]+$")
Last four digits are upper case: =Max(filter(row(G:G),REGEXMATCH(G:G,"[A-Z]{4}$")))
To get the actual value (which I realize was the actual question), just wrap an index function around the Max function. So for this question, a solution could be :
=Index(G:G,max(filter(row(G:G),isnumber(G:G))))

How to check if query parameter is empty?

The idea is to generate a list of board games based on parameters like player count, time and/or difficulty level. I am using the QUERY function in Google Sheets.
I have a sheet that has a list of board games. They have separate columns for Title, the minimum number of players, the maximum number of players, difficulty, playtime.
In a second sheet, I have 3 cells which a user can use to write player count (B1), Difficulty (B2) and/or Playtime (B3).
The idea is that with all 3 cells empty, it doesn't show anything, but then you can fill out any or even all of the three cells to filter the complete list of games.
However, if I use one query with all three parameters, if any are empty I get an error.
I've worked around the problem by having multiple nested IFs which check if any of the cells are empty. Based on this, it runs a slightly different QUERY function (i.e., excluding the empty cell). However, this is difficult to troubleshoot and will be a pain to modify if I want to add any additional parameters.
This is the full query:
=QUERY('Lista gier'!A1:H424;"select A,C,D,E,F where C<="&B1&" AND D>="&B1&" AND E='"&B2&"' AND F<="&B12&" AND F>=(0.5*"&B12&") AND B='Gra'");
Expected results are the proper list of games, but I always receive an error:
Unable to parse query string for Function QUERY parameter 2: PARSE_ERROR: Encountered " "C "" at line 1, column 24. Was expecting one of: "(" ... "(" ...
the correct syntax would be:
=QUERY('Lista gier'!A1:H424;
"select A,C,D,E,F
where C <= "&B1&"
and D >= "&B1&"
and E = '"&B2&"'
and F <= "&B12&"
and F >= "&0,5*B12&"
and B = 'Gra'"; 0)
assuming that B1 and B12 are numeric and B2 is a text string

VLOOKUP remove spaces when cell is empty

This a simple customer sheet:
A B C D
ID First Middle Last
1 John Doe
2 Jane Maia Doe
And in F1 I put this vlookup code:
=VLOOKUP($G$1;$A$1:$D$3;2;FALSE)&" "&VLOOKUP($G$1;$A$1:$D$3;3;FALSE)&" "&VLOOKUP($G$1;$A$1:$D$3;4;FALSE)
When I lookup ID 2, it's perfect nicely spaced between the vlookups
But when I lookup ID 1 you see 2 spaces between the first and last name, because there is no middle name here.
How can I manage that I always see 1 space between the vlookups?
One way you could achieve the result you're looking for is to simply replace multiple spaces with a single space.
=REGEXREPLACE(JOIN(" ",ARRAYFORMULA(VLOOKUP(G1,A:D,{2,3,4},FALSE))),"\s{2,}"," ")
This formula looks up G1 in your table (A:D). VLOOKUP can be used in an ARRAYFORMULA to efficiently retrieve all of the columns you want in one shot. Your JOIN joins all of the retrieved columns, inserting a space between each value. Finally, your REGEXREPLACE function looks for multiple consecutive spaces and replaces them with a single space.
Alternatively, you could filter the resulting array (i.e. the result of what your VLOOKUP returns). The following formula looks up the array of first, middle, and last name, and then filters out any empty cells before joining the remaining elements with a space.
=JOIN(" ",FILTER(VLOOKUP(I1,A:D,{2,3,4},FALSE),INDIRECT("B"&MATCH(I1,A:A,0)&":D"&MATCH(I1,A:A,0))<>""))
all you need is TRIM fx and:
=ARRAYFORMULA(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IFERROR(
VLOOKUP(G1:G2, A1:D3, {2,3,4}, 0))),,999^99))))

Get data between number two and three delimiter

I have a large list of people where each person has a line like this.
Bill Gates, IT Manager, Microsoft, <https://www.linkedin.com/in/williamhgates>
I want to extract the company name in a specific cell. In this example, it would be Microsoft, which is between the second and third delimiters (in this case, the delimiter is ", "). How can I do this?
Right now I'm using the split method (=SPLIT(A2, ", ",false)). But it gives me four different cells with information. I would like a command only to output the company in one cell. Can anyone help? I have tried different things, but I can't seem to find anything that works.
Maybe some regex can do it, but I'm not into regex.
Short answer
Use INDEX and SPLIT to get the value between two separators. Example
=INDEX(SPLIT(A1,", ",FALSE),2)
Explation
SPLIT returns an 1 x n array.
The first argument of INDEX could be a range or an array.
The second and third arguments of INDEX are optional. If the first parameter is an array that has only one row or one column, it will assume that the second argument corresponds to the larger side of the array, so there is no need to use the third argument.
A bit nasty, but this formula works, assuming data in cell D3.
=MID(D3,FIND(",",D3,FIND(",",D3)+1)+2,FIND(",",D3,FIND(",",D3,FIND(",",D3)+1)+1)-FIND(",",D3,FIND(",",D3)+1)-2)
Broken down, this is what it does:
Take the Mid point of D3 =MID(D3
starting two characters after the 2nd comma FIND(",",D3,FIND(",",D3)+1)+2
and the number of characters between the 2nd and 3rd comma, excluding spaces FIND(",",D3,FIND(",",D3,FIND(",",D3)+1)+1)-FIND(",",D3,FIND(",",D3)+1)-2)
I'll add my favourite ArratFormula, which you could use to expand list automatically without draggind formula down. Assumptions:
you have list with data in range "A1:A20"
all data have same sintax "...,Company Name, <..."
In this case you could use Arrayformula, pasted in cell B1:
=ArrayFormula(REGEXEXTRACT(A1:A20,", ([^,]+), <"))
If your data doest's always look like "...,Company Name, <..." or you wish to get different ounput, use this formula in cell B1:
=QUERY(QUERY(TRANSPOSE(SPLIT(JOIN(", ",A1:A20),", ",0)),"offset 2"),"skipping 4")
in this formula:
change 2 in offset 2 to 0, 1, 2, 3 to get name, position, company, link
in skipping 4 4 is a number of items.
Number of items can be counted by formula:
=len(A1)-len(SUBSTITUTE(A1,",",""))+1
and final formula is:
=QUERY(QUERY(TRANSPOSE(SPLIT(JOIN(", ",A1:A20),", ",0)),"offset 2"),
"skipping "&len(A1)-len(SUBSTITUTE(A1,",",""))+1)

Resources