ArrayFormula, RegexExtract, and Join in Google Sheets - google-sheets

I have a data set wherein emails are populated. I would like to list all the surnames extracted in the emails per cell and will be all joined to a one single cell but I want to put a separator or delimeter to the emails obtaine per cell.
Here is the data set:
A
B
john.smith#gmail.com, jane.doe#gmail.com
UPDATE
john.smith#gmail.com
CLOSE
And here is the formula to extract
=ARRAYFORMULA(
PROPER(
REGEXEXTRACT(
A:A,
REGEXREPLACE(
A:A,
"(\w+)#","($1)#"
)
)
)
)
This initially yields the ff:
C
D
Smith
Doe
Smith
I would like to use JOIN() inside the ARRAYFORMULA() but it is not working as I seem to think it would since it outputs an error that it only accepts one row or one column of data. My initial understanding of ARRAYFORMULA() is that it iterates through the course of the data, so I thought it will JOIN() first, and then move on to the next element/row but I guess it doesn't work that way. I can use FLATTEN() but I want to have delimiters or separators in between the row elements. I need help in obtaining my intended final result which will look like this:
UPDATE:
Smith
Doe
CLOSE:
Smith
All are located in one cell, C1. UPDATE and CLOSE are from column B.
EDIT: I would like to clarify that the email entries in column A are dynamic and maybe more than two.

I think this will work:
=arrayformula(flatten(if(A2:A<>"",regexreplace(trim(split(B2:B&":"&char(9999)&regexreplace(Proper(A2:A),"#[\w\.]+,\ ?|#.*",char(9999)&" "),char(9999))),".*\.",),)))
NOTES:
Proper(A2:A) changes the capitalisation.
The regexreplace "#[\w\.]+,\ ?|#.*" finds:
# symbol...
then any number of A-Z, a-z, 0-9, _ [using \w] or . [using \.]
then a comma
then 'optionally' a space \ [the optional bit is ?]
or [using |], the # symbol then an number of characters [using .*]
The result is replaced with a character that you won't expect to have in your text - char(9999) which is a pencil icon, and a trailing space (used later on when the flatten keeps a gap between lines). The purpose is to get all of the 'name.surname' and 'nameonly' values in front of any # symbol and separate them with char(9999).
Then infront of the regexreplace is B2:B&":"&char(9999)& which gets the value from column B, the : chanracter and char(9999).
The split() function then separates then into columns. Trim() is used to get rid of spaces in front of names that don't contain ..
The next regexreplace() function deletes anything before, and including . to keep surname or name without ..
The if(A2:A<>"" only process rows where there is a value in col A. The arrayformula() function is required to cascade the formula down the sheet.
I didn't output the results in a single cell, but it looks like you've sorted that with textjoin.
Here's my version of getting the results into a single cell.
=arrayformula(textjoin(char(10),1,if(A2:A<>"",REGEXREPLACE(B2:B&":"&char(10)&regexreplace(Proper(A2:A),"#[\w\.]+,\ ?|#.*",char(10)),".*\.",),)))

Assuming that your A:A cells will always contain only contiguous email addresses separated by commas, you could place this in, say, C1 (being sure that both Columns C and D are otherwise empty beforehand):
=TRANSPOSE(FILTER({B:B,IFERROR(REGEXEXTRACT(SPLIT(PROPER(A:A),","),"([^\.]+)#"))},A:A<>""))
If this produces the desired result and you'd like to know how it works, report back. The first step, however, is to make sure it works as expected.

use:
=INDEX(REGEXREPLACE(TRIM(QUERY(FLATTEN(QUERY(TRANSPOSE({{B1; IF(B2:B="",,"×"&B2:B)},
PROPER(REGEXEXTRACT(A:A, REGEXREPLACE(A:A, "(\w+)#", "($1)#")))})
,,9^9)),,9^9)), " |×", CHAR(10)))

Related

Lookup and fetch multiple delimited partner names

I have list of partner name codes, delimited by space. Like the one shown in below,
I have another table(E:F), from where I have to map them to show the partner names like the column C, perhaps i am not able to understand how to make it happen,
I have tried using this formula which brings only one partner name but when there are multiple it does not shows up, do i need to add another function like TEXTJOIN or what I am doing wrong here.
=IFERROR(VLOOKUP(IFERROR(REGEXEXTRACT(A2,JOIN("|",FILTER($E$2:$E,$E$2:$E<>""))),""),$E$2:$F,2,0),"")
Link To GS
See my sheet ("Erik Help"). The following formula is in cell B1:
=ArrayFormula({"PARTNER NAMES";IF(A2:A="",,REGEXREPLACE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IFERROR(VLOOKUP(SPLIT(A2:A," ",0,1),D:E,2,FALSE)&",")),,COLUMNS(SPLIT(A2:A," ",0,1))))),",$",""))})
This one formula produces the header (which you can change within the formula itself as you like) and all results for all rows.
IF(A2:A="",,...) means if a cell in Col A is blank, then the result in the same row of Col B will also be blank (i.e., null).
SPLIT (the first time in the formula) will split the Col-A values at the spaces.
VLOOKUP will try to find each split value in the D:E list. If found, the full name will replace the initials. If not found, IFERROR will return null.
You will see &",". That is appending a comma to any full names that are returned.
TRANSPOSE(QUERY(TRANSPOSE...),,COLUMNS())) is what many call "QUERY Smash." It basically, flips the remaining results of the VLOOKUP into columns instead of rows, turns everything into headers (to get them in one cell per column) and then flips them back to row orientation.
TRIM gets rid of spaces where no names were found in the full list.
REGEXREPLACE(... ,",$","") replaces any final comma that has no name after it with null.

JOIN header row values across a row based on non-blank values in cells

So I have two rows:
ID
TagDog
TagCat
TagChair
TagArm
Grouped Tags (need help with this)
1
TRUE
TRUE
TagDog,TagArm
Row 1 consists mainly of Tags, while rows 2+ are entries. This data ties ENTRIES to TAGS.
What I'm needing to do is concatenate/join the tag names per entry. For example, look at the last column above.
I suspect we could write a formula that would:
Create an array of non-empty cells in the row. (IE: [2,4])
Return it with the header row A (IE: [A2,A4])
Then join them together by a comma
But I am unsure how to write the formula, or if this is even the best approach.
Here's the formula:
={
"Grouped Tags (need help with this)";
ARRAYFORMULA(
REGEXREPLACE(TRIM(
TRANSPOSE(QUERY(TRANSPOSE(
IF(NOT(B2:E11),, B1:E1)
),, COLUMNS(B1:E1)))
), "\s+", ",")
)
}
The trick used is called vertical query smash. That's the part:
TRANSPOSE(QUERY(TRANSPOSE(...),, Nnumber_of_columns))
You can find a brief description of this one and his friends here.
I wasn't able to create a single formula that would do this for me, so instead, I utilized a formula inside of Sheets' Find/Replace tool, and it worked like a charm!
I did a find/replace, replacing all instances of TRUE with the following formula:
=INDIRECT(SUBSTITUTE(LEFT(ADDRESS(ROW(),COLUMN()),3),"$","")&"$1")
What this formula does is it finds the cell's letter, then gets the first row of the cell using INDIRECT.
Breaking down the formula:
ADDRESS(ROW(),COLUMN()) returns the direct reference: $H$1
LEFT("$H$1",3) returns $H$
SUBSTITUBE("$H$","$","") replaces the dollar signs ($) and returns H
INDIRECT(H&"$1") references the exact cell H$1
Now, I can replace all instances of TRUE with that formula and the magic happens!
Here is a video explanation: https://youtu.be/SXXlv4JHDA8
Hopefully, that helps someone -- however, I would still be interested in seeing what the formula is for this solution.

VLOOKUP remove spaces when cell is empty

This a simple customer sheet:
A B C D
ID First Middle Last
1 John Doe
2 Jane Maia Doe
And in F1 I put this vlookup code:
=VLOOKUP($G$1;$A$1:$D$3;2;FALSE)&" "&VLOOKUP($G$1;$A$1:$D$3;3;FALSE)&" "&VLOOKUP($G$1;$A$1:$D$3;4;FALSE)
When I lookup ID 2, it's perfect nicely spaced between the vlookups
But when I lookup ID 1 you see 2 spaces between the first and last name, because there is no middle name here.
How can I manage that I always see 1 space between the vlookups?
One way you could achieve the result you're looking for is to simply replace multiple spaces with a single space.
=REGEXREPLACE(JOIN(" ",ARRAYFORMULA(VLOOKUP(G1,A:D,{2,3,4},FALSE))),"\s{2,}"," ")
This formula looks up G1 in your table (A:D). VLOOKUP can be used in an ARRAYFORMULA to efficiently retrieve all of the columns you want in one shot. Your JOIN joins all of the retrieved columns, inserting a space between each value. Finally, your REGEXREPLACE function looks for multiple consecutive spaces and replaces them with a single space.
Alternatively, you could filter the resulting array (i.e. the result of what your VLOOKUP returns). The following formula looks up the array of first, middle, and last name, and then filters out any empty cells before joining the remaining elements with a space.
=JOIN(" ",FILTER(VLOOKUP(I1,A:D,{2,3,4},FALSE),INDIRECT("B"&MATCH(I1,A:A,0)&":D"&MATCH(I1,A:A,0))<>""))
all you need is TRIM fx and:
=ARRAYFORMULA(TRIM(TRANSPOSE(QUERY(TRANSPOSE(IFERROR(
VLOOKUP(G1:G2, A1:D3, {2,3,4}, 0))),,999^99))))

Google sheets - How to concat multiple values into multiple links

I am working with a google sheet that in one column can hold 1+ strings of numbers, and in another column, needs to read those numbers in and concat them with a link.
What I mean by this is and what works thus far: Column A - 324243324 || Column B - =concat("google.com/", Column A) = google.com/324243324
What I hope to get working: Column A - 324243324 5004938 || Column B - =concat("google.com/", Column A) = google.com/324243324 google.com/5004938
Is this possible? Thank you for your help!
You want to (a) split the input by spaces; (b) prepend each part by "google.com/"; and (c) join again, separated by spaces. This is achieved by
=join(" ", arrayformula("google.com/" & split(A1, " ")))
The & operator is equivalent to CONCAT but easier to type. arrayformula indicates that the operation is done on an array (the output of split).
However, these links will not be hyperlinked; joining makes them into plain text. To keep them functional, remove the last step, "join", so that each link appears in its own cell.

Get data between number two and three delimiter

I have a large list of people where each person has a line like this.
Bill Gates, IT Manager, Microsoft, <https://www.linkedin.com/in/williamhgates>
I want to extract the company name in a specific cell. In this example, it would be Microsoft, which is between the second and third delimiters (in this case, the delimiter is ", "). How can I do this?
Right now I'm using the split method (=SPLIT(A2, ", ",false)). But it gives me four different cells with information. I would like a command only to output the company in one cell. Can anyone help? I have tried different things, but I can't seem to find anything that works.
Maybe some regex can do it, but I'm not into regex.
Short answer
Use INDEX and SPLIT to get the value between two separators. Example
=INDEX(SPLIT(A1,", ",FALSE),2)
Explation
SPLIT returns an 1 x n array.
The first argument of INDEX could be a range or an array.
The second and third arguments of INDEX are optional. If the first parameter is an array that has only one row or one column, it will assume that the second argument corresponds to the larger side of the array, so there is no need to use the third argument.
A bit nasty, but this formula works, assuming data in cell D3.
=MID(D3,FIND(",",D3,FIND(",",D3)+1)+2,FIND(",",D3,FIND(",",D3,FIND(",",D3)+1)+1)-FIND(",",D3,FIND(",",D3)+1)-2)
Broken down, this is what it does:
Take the Mid point of D3 =MID(D3
starting two characters after the 2nd comma FIND(",",D3,FIND(",",D3)+1)+2
and the number of characters between the 2nd and 3rd comma, excluding spaces FIND(",",D3,FIND(",",D3,FIND(",",D3)+1)+1)-FIND(",",D3,FIND(",",D3)+1)-2)
I'll add my favourite ArratFormula, which you could use to expand list automatically without draggind formula down. Assumptions:
you have list with data in range "A1:A20"
all data have same sintax "...,Company Name, <..."
In this case you could use Arrayformula, pasted in cell B1:
=ArrayFormula(REGEXEXTRACT(A1:A20,", ([^,]+), <"))
If your data doest's always look like "...,Company Name, <..." or you wish to get different ounput, use this formula in cell B1:
=QUERY(QUERY(TRANSPOSE(SPLIT(JOIN(", ",A1:A20),", ",0)),"offset 2"),"skipping 4")
in this formula:
change 2 in offset 2 to 0, 1, 2, 3 to get name, position, company, link
in skipping 4 4 is a number of items.
Number of items can be counted by formula:
=len(A1)-len(SUBSTITUTE(A1,",",""))+1
and final formula is:
=QUERY(QUERY(TRANSPOSE(SPLIT(JOIN(", ",A1:A20),", ",0)),"offset 2"),
"skipping "&len(A1)-len(SUBSTITUTE(A1,",",""))+1)

Resources