Extract all matches in a string using REGEXEXTRACT in Google Sheets - google-sheets

I have a cell with values like so: aasdf123asdf34asdf3
I want to extract all groups of consecutive numbers: 123, 34, and 3.
I think this is the regular expression I need: (\d+).
But it is only extracting the first match.
This works outside of Google Sheets. Not sure why I can't get it to work in Google Sheets.
https://regexr.com/572et

You could try actually generating the CSV string you want directly, using REGEXREPLACE:
=REGEXREPLACE(REGEXREPLACE("aasdf123asdf34asdf3", "\D+", ","), "^,|,$", "")
The inner call to REGEXREPLACE replaces all clusters of non digit characters with comma. The outer call then removed any leading/trailing commas which the first replacement might have left behind.
Moreover you can use SPLIT to separate the values into each individual cell:
=TRANSPOSE( SPLIT(REGEXREPLACE(REGEXREPLACE("aasdf123asdf34asdf3", "\D+", ","), "^,|,$", ""), ","))
In here the TRANSPOSE function is just to stack the matches vertically instead of horizontally as SPLIT would lay them as default.

Related

Horizontally Concatenate Array of Columns with delimiter and ignore blank columns in google sheets [duplicate]

This question already has an answer here:
Concatenate non empty cells in each row with arrayformula in google sheets
(1 answer)
Closed 6 months ago.
The shared sheet shows multiple column rows which can be individually concatenated horizontally with a comma & space between using TEXTJOIN(", ", TRUE, A2:D2) and blank spaces are ignored. But textjoin cannot be used in Arrayformula as far as I know and I would like ot find a suitable replacement that can also be combined as a string along with other strings of information.
I want to be able to use this as an independent formula string that might be added to other strings of information. For example, "Favorite colors: "& textjoin(", ",1,A2:D2)&"Favorite foods:"&textjoin(", ",1,E2:G2)&"...
Possible solutions
May be a variant of one of the following:
Modifying this so it could be used w/ an array formula JOIN("~", SPLIT(JOIN(CHAR(60000), B3:E3), CHAR(60000)))
Modifying this formula works with join also JOIN(", ",FILTER(H2:H,H2:H<>""))
Using a combination of IF(a2:A<>"" along with a regex replacement at the end (see my answer below) but this could be very long formula compared to textjoin if there are many columns)
An ideal solution would be concise and look closest to something this:
arrayformula(TEXTJOIN(", ", TRUE, A2:A,B2:B,C2:C)
Shared sheet is here
use:
=INDEX(REGEXREPLACE(TRIM(FLATTEN(QUERY(TRANSPOSE(IF(A2:D="",,A2:D&",")),,9^9))), ",$", ))
Using a series of IF statements, adding a delimiter and then removing any trailing delimiters can be accomplished using: Arrayformula(regexreplace(if(A2:A100<>"",A2:A100&", ","")&if(B2:B100<>"",B2:B100&", ","")&if(C2:C100<>"",C2:C100&", ","")&if(D2:D100<>"",D2:D100&", ",""),", $",""))
Use a query smush, like this:
=transpose(query(transpose(A2:D), "", 9^9))
The formula will separate values with spaces. To separate with commas and remove unwanted white space, use trim() and substitute() or regexreplace(), like this:
=arrayformula( substitute( trim( transpose( query( transpose(A2:D), "", 9^9 ) ) ), " ", ", " ) )

How can I get this formula to grab values between the commas?

Here is a screenshot of and a link to my test spreadsheet. It makes the requirements very clear:
https://docs.google.com/spreadsheets/d/1rZr2zHaSkff9SFpwpx83_4TawruotA1jhOKqW6uYDz0/edit?usp=sharing
The formula I have come up with is very close to what I need, but "linkText" is a placeholder for the value of the array item. Here is my formula:
=if(A2="","","<a href='https://samplewebsite.com/search?q=" & trim(lower(substitute(A2,",","'>linkText</a>, <a href='https://samplewebsite.com/search?q="))) & "'>linkText</a>")
try:
=index(join(","&char(10), SUBSTITUTE($B$1, "linkText", split(A3, ","))))
Drag down to column.
Result:
First using SPLIT to split the strings between comma from the column A. Then using SUBSTITUTE to find the string "linkText" from the text in B1 and replace it with the strings from the returned strings from the split function. Then joining them all together.
NOTE: Just keep the reference string in a fixed cell in your sheet. <a href='https://samplewebsite.com/search?q=linkText'>linkText</a> to be used in the formula. As seen in above screenshot it is fixed in cell B1.
Alternate Solution using ArrayFormula:
You can also use it with arrayformula so you only have to put it in the first row and no need to drag down the formula to the column, it will automatically be expanded down just make sure to clear the cells below or it will throw an error.
=arrayformula(regexreplace(substitute(transpose(query(transpose(IF(IFERROR(SPLIT(A2:A, ","))<>"", "♦<a href='https://samplewebsite.com/search?q="&SPLIT(A2:A, ",")&"'>"&SPLIT(A2:A, ",")&"</a>", )),,9^9)), "♦", char(10)), "^\s", ""))
Result:
You may also have a look in below references for more information.
References:
SUBSTITUTE
SPLIT
JOIN
Comma separated list into matched columns pairings

Combine Data From Separate Columns with a Delimiter with Arrayformula

This is the example sheet.
Alright, in cell V1!A1 is the formula ={"Languages";ARRAYFORMULA(TRANSPOSE(QUERY(TRANSPOSE(B2:F&","),,COLUMNS(B2:F))))}. I need to combine data from B2:F with the delimiter ,. But now I need to delete the unnecessary delimiters.
In sheet V2, I tried ={"Languages";ARRAYFORMULA(REGEXREPLACE(REGEXREPLACE(TRANSPOSE(QUERY(TRANSPOSE(B2:F&","),,COLUMNS(B2:F))),"(^(,(\s,){4})$)|(^(,\s)+)|(,(\s,)?\s?$)",""),"(,\s,)+\s?",", "))} but it's not consistant and still leaves delimiters in the output.
Is there a better way to do this?
I added a sheet called "Erik Help" which replaces your formula with the following:
=ARRAYFORMULA({"Languages";SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(B2:F&" "),,COLUMNS(B2:F))))," ",", ")})
Essentially, instead of appending a comma to elements of the range B2:F, I appended a space. Then I applied TRIM to the results, which will leave no spaces before or after each concatenation and only one between each element. To that, I applied a SUBSTITUTE of the single spaces with a comma-space.

How does one replace a character at N positions in a string in a cell in Google Sheets

I have a long string of characters in a cell A1:
sdfhgt9|ft8yy|1gftre|78hedd
In cell A2 I have a set of comma-separated numbers to indicate the positions for replacing the character within the cell:
4,10,19,26
The characters at these positions have to be replaced by a "#", so the output should look like:
sdf#gt9|f#8yy|1gft#e|78he#d
I tried using the replace function with an arrayformula.
=ARRAYFORMULA(replace(A1,split(A2,","),1,"#"))
creates these 4 different strings in A3,A4,A5,A6:
sdf#gt9|ft8yy|1gftre|78hedd
sdfhgt9|f#8yy|1gftre|78hedd
sdfhgt9|ft8yy|1gft#e|78hedd
sdfhgt9|ft8yy|1gftre|78he#d
I am now not able to join and build one string with all 4 "#" replacements.
I am looking at solving this with the regular functions in Sheets, no custom coding.
See if this works
=regexreplace(A1, "^(.{3}).(.{5}).(.{8}).(.{6}).", "$1#$2#$3#$4#")
Or, if you want to use the contents of cell A2
=join("",ArrayFormula(if(regexmatch(row(indirect("A1:A"&len(A1)))&"", "\b"&SUBSTITUTE(A2, ",", "|")&"\b"), "#", transpose(split(regexreplace(A1, "(.)", "$1-"), "-")))))
you can build on what you have like:
=ARRAYFORMULA(JOIN("|", HLOOKUP(COLUMN(A:D), {COLUMN(A:D);
SPLIT(TRANSPOSE(REPLACE(A1, SPLIT(A2, ","), 1, "#")), "|")}, 1+COLUMN(A:D))))
or fully dynamic:
=ARRAYFORMULA(JOIN("|", HLOOKUP(
TRANSPOSE(ROW(INDIRECT("A1:A"&COLUMNS(SPLIT(A1, "|"))))), {
TRANSPOSE(ROW(INDIRECT("A1:A"&COLUMNS(SPLIT(A1, "|")))));
SPLIT(TRANSPOSE(REPLACE(A1, SPLIT(A2, ","), 1, "#")), "|")}, 1+
TRANSPOSE(ROW(INDIRECT("A1:A"&COLUMNS(SPLIT(A1, "|"))))))))

Combine Text in ArrayFormula

I have a table using Google Sheets. It has three columns that will always have a null value or a specific value for that column. Each line will have one, two, or three values; it will never have three null values on one line. In the fourth column, I want an ArrayFormula that will combine those values and separate the values with a comma if there is more than one.
Here is a photo of what I am trying to accomplish.
I've tried several ideas so far and this formula is the closest I've gotten so far but it's still not quite working correctly; I think it is treating each column as an array before joining rather than doing the function line by line. I'm using the LEN function rather than A2="" or ISBLANK(A2) because columns A-C are ArrayFormulas as well. I realize this probably isn't the most efficient formula to use but I think it covers every possibility. I'm definitely open to other ideas as well.
={"Focus";
ArayFormula(
IFS(
$A$2:$A="", "",
(LEN(A2:A)>0 & LEN(B2:B)>0 & LEN(C2:C)>0), TEXTJOIN(", ", TRUE, A2:A, B2:B, C2:C),
(LEN(A2:A)>0 & LEN(B2:B)>0 & LEN(C2:C)=0), TEXTJOIN(", ", TRUE, A2:A, B2:B),
(LEN(A2:A)>0 & LEN(B2:B)=0 & LEN(C2:C)>0), TEXTJOIN(", ", TRUE, A2:A, C2:C),
(LEN(A2:A)=0 & LEN(B2:B)>0 & LEN(C2:C)>0), TEXTJOIN(", ", TRUE, B2:B, C2:C),
(LEN(A2:A)>0 & LEN(B2:B)=0 & LEN(C2:C)=0), A2:A,
(LEN(A2:A)=0 & LEN(B2:B)>0 & LEN(C2:C)=0), B2:B,
(LEN(A2:A)=0 & LEN(B2:B)=0 & LEN(C2:C)>0), C2:C
)
)
}
Is it possible to achieve this with Google Sheets?
Sample File
Please try:
=ARRAYFORMULA(SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(FILTER(A2:C,ROW(A2:C)<=MAX(IF(LEN(A2:C),ROW(A2:C)*COLUMN(A2:C)^0,0)))),,2^99)))," ",", "))
Notes:
The formula will work incorrectly if some names have space inside: like "Aston Martin"
So if you have spaces, please try this:
=ARRAYFORMULA(SUBSTITUTE(
SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(FILTER(SUBSTITUTE(A2:C," ",char(9)),ROW(A2:C)<=MAX(IF(LEN(A2:C),ROW(A2:C)*COLUMN(A2:C)^0,0)))),,2^99)))," ",", "),
CHAR(9)," "))
EDIT
Noticed the shorter variant (without *COLUMN(A2:C)^0) will work:
=ARRAYFORMULA(SUBSTITUTE(
SUBSTITUTE(TRIM(TRANSPOSE(QUERY(TRANSPOSE(FILTER(SUBSTITUTE(A2:C," ",char(9)),ROW(A2:C)<=MAX(IF(LEN(A2:C),ROW(A2:C),0)))),,2^99)))," ",", "),
CHAR(9)," "))
Notes:
I used an old trick to join strings with an array-formula. See sample file
Explanations
If you like to understand any tiered formula, the best way is to split it by parts:
Part 1. Filter the data
FILTER(any_columns,ROW(A2:C)<=MAX(IF(LEN(A2:C),ROW(A2:C)*COLUMN(A2:C)^0,0))). this is my way to limit the data range.
The range is open, means it starts from the second row (A2) and
ends in any row.
I want to get the limited array in this step to reduce work that the formula should do. This is done with a condition, if.
ROW(A2:C) must be less or equal to the max row of data.
MAX(IF(LEN(A2:C), some_rows) gives the max row.
If(len.. part checks if a cell has some text inside it.
Note some_rows part:
MAX(IF(LEN(A2:C),ROW(A2:C)*COLUMN(A2:C)^0,0)))),,2^99))).
ROW(A2:C) must be multiplied by columns, because filter formula
takes only one row into its condition. That is why I multiply by
COLUMN(A2:C)^0 which is columns with 1s. Edit. Now noticed,
that the formula works fine without *COLUMN(A2:C)^0, so it's an
overkill.
Part 2. Join the text
query formula has 3 arguments: data, query_text, and a number_of_header_rows.
data is made with a filter.
query_text is empty, which gives us equivalent to select all
("select *").
And the number of rows of a header is some big number (2^99).
This is a trick: when a query has more headers then one row,
it will join them with space.
After a union is made, transpose function will convert the result back to the column.
Part 3. Substitute and trim
The function trim deletes extra spaces.
Then we replace spaces with the delimiter: ", ". That is why the
formula needs to be modified if spaces are in strings. Correct
result: "Ford, Aston Martin". Incorrect: "Ford, Aston, Martin". But
if we previously replace spaces with some char (char(9) is Tab),
then we do not replace it in this step.

Resources