sheets REGEXEXTRACT - extract text between square brackets - google-sheets

I have a cell, E13, which contains a number followed by another number in square brackets.
What I want to achieve is to match the bracketed number, copy it to another cell, and delete the match from E13.
E13
0:08.63 [6]
I want E13 to be
0:08.63
And in M13 I want
6
Based on this example https://support.google.com/docs/answer/3098244?hl=en
=REGEXEXTRACT(A4, "\(([A-Za-z]+)\)")
I tried this in M13
=REGEXEXTRACT(E13,\([[0-9]+]\))
Then based on this SO answer https://stackoverflow.com/a/2403159/461887
=REGEXEXTRACT(E13,\[(.*?)\])
But in both cases I just get an error.

SPLIT by the space:
=SPLIT(E13," ")
REGEX:
=REGEXEXTRACT(E13,"(\S+)\s+\[(\d+)\]")

You are just getting a basic syntax error. The built-in help for REGEXEXTRACT shows that the regular expression must be enclosed in double quotes. With the quotes added, your second expression works correctly:
=REGEXEXTRACT(E13,"\[(.*?)\]")
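To see what the two patterns above capture, here is a minimal sketch in Python's re module rather than in Sheets. Sheets uses the RE2 engine, but for these simple patterns the behaviour should be the same. The function names are made up for illustration.

```python
import re

def between_brackets(cell: str):
    """Non-greedy capture of the text inside the first [...] pair, or None."""
    match = re.search(r"\[(.*?)\]", cell)
    return match.group(1) if match else None

def split_time_and_bracket(cell: str):
    """Return (text before the brackets, digits inside them), or None."""
    match = re.search(r"(\S+)\s+\[(\d+)\]", cell)
    return match.groups() if match else None

print(between_brackets("0:08.63 [6]"))        # 6
print(split_time_and_bracket("0:08.63 [6]"))  # ('0:08.63', '6')
```

The second pattern returns both pieces at once, which is why the REGEXEXTRACT version with two capture groups spills into two adjacent cells.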

Related

Remove characters in Googlesheets

I want to remove the first and last characters of a text in a cell. I know how to remove just the first or the last with formulas such as =LEFT(A27, LEN(A27)-1), but I want to combine two formulas to remove the first and last characters at the same time, or maybe there is a formula I'm not aware of that removes both at once.
I know about Power Tool, but I want to avoid using this tool and do this simply with formulas.
You could use the REGEXREPLACE() function:
=REGEXREPLACE(A27, "^.|.$", "")
The regular expression used here matches:
^. the first character after the start of the string
| OR
.$ the last character of the string
Better not to use these, but they work too:
=RIGHT(LEFT(A27, LEN(A27)-1), LEN(LEFT(A27, LEN(A27)-1))-1)
=LAMBDA(x, RIGHT(LEFT(A27, x), LEN(LEFT(A27, x))-1))(LEN(A27)-1)
=LAMBDA(x, LAMBDA(y, RIGHT(y, LEN(y)-1))(LEFT(A27, x)))(LEN(A27)-1)
=LAMBDA(x, RIGHT(x, LEN(x)-1))(LEFT(A27, LEN(A27)-1))
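As a rough illustration outside Sheets, the same two ideas (the ^.|.$ replacement and the LEFT/RIGHT trimming) can be sketched in Python. The function names are hypothetical, and Python's re stands in for the Sheets regex engine.

```python
import re

def strip_ends_regex(text: str) -> str:
    # Same alternation as the REGEXREPLACE formula:
    # the first character OR the last character is replaced with nothing.
    return re.sub(r"^.|.$", "", text)

def strip_ends_slice(text: str) -> str:
    # Equivalent of trimming one character from each side with LEFT/RIGHT.
    return text[1:-1]

print(strip_ends_regex("(hello)"))  # hello
print(strip_ends_slice("(hello)"))  # hello
```

Both versions return an empty string for inputs of one or two characters.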

REGEXMATCH on double quote character?

I'm trying to do a conditional formatting that matches on the double quote character followed by a zero. i.e.
"0 / 10" : this should match as true
"10 / 10": this should match as false
This regex is incorrect, as it matches on both:
=REGEXMATCH(B:B;"0 /")
I expect to be able to use the formula standard of escaping the " with an extra quote. It accepts this formula syntactically, but does not match:
=REGEXMATCH(B:B;"""0 /")
I tried matching with punctuation characters, no match:
=REGEXMATCH(B:B;"[[:punct:]]0 /")
I can use [[:digit:]] to match the 10/10 case, but [^[:digit:]] doesn't match the zero with a quote in front of it:
=REGEXMATCH(B:B;"[^[:digit:]]0 /")
I even tried concatenating the specific character, no match:
=REGEXMATCH(B:B;CONCATENATE(CHAR(34), "0 /"))
I'm very confused at this point. If I insert any other special character before the zero, I have no trouble matching it. But it seems like double-quote just isn't treated like a regular character somehow. Does anyone know what I'm doing wrong?
Try
=REGEXMATCH(B2;char(34)&"0 /")
Thanks to Tim for giving me a hint that led to a solution. My problem was that the text in the cell was part of a formula. So I needed to extract the formula as text (FORMULATEXT), and then it works:
=REGEXMATCH(FORMULATEXT($B1);"""0 /")
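The underlying point is that a double quote is an ordinary character to the regex engine; the match fails only when the quote is not actually present in the text being tested. A small Python sketch of that distinction follows (Python's re, not Sheets; the function name is made up).

```python
import re

def has_quoted_zero(text: str) -> bool:
    # The double quote needs no escaping inside the pattern itself.
    return re.search(r'"0 /', text) is not None

print(has_quoted_zero('"0 / 10"'))   # True: quote followed by 0
print(has_quoted_zero('"10 / 10"'))  # False: quote followed by 1
print(has_quoted_zero('0 / 10'))     # False: no quote in the text at all
```

The third case mirrors the situation described above: if the value being tested does not contain the quote, no pattern for it can match.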
Try this in row 1 of a different column:
=arrayformula(if(B:B<>"",iferror(regexmatch(B:B,"""0"),),))
if(B:B<>"" will only process the formula provided Col B has values.
iferror( will ignore numbers in Col B that produce #VALUE!.
"""0" is the regex. The double quote to the left of 0 is doubled up (or char(34)&"0" as per @Mike Steelson).

Extract dollar amount in sheets text cell

Ideally what I'm looking for is to get the dollar amount extracted no matter the format.
Sheet link:
https://docs.google.com/spreadsheets/d/1drTPlnQmVTsbUXwJDfQr7DnHjSbnGx-fLthad6KxfM8/edit?usp=sharing
Delete everything from Column B, including the header. Then place the following formula in cell B1:
=ArrayFormula({"Header"; IF(A2:A="",,VALUE(IFERROR(REGEXEXTRACT(A2:A,"\$(\d+\.?\d*)"))))})
You may change the header text within the formula as you like.
If a cell in A2:A is blank, the corresponding cell in B2:B will be left blank as well.
Otherwise REGEXEXTRACT will look for a pattern that begins with a literal dollar sign. The parentheses within the quotes denote the beginning and end of a capture group (i.e., what will be returned if found) following that literal dollar sign. The pattern \d+\.?\d* means "a group of one or more digits, followed by zero or one literal period symbols, followed by zero or more digits."
IFERROR will cause null to be rendered instead of an error if such a pattern is not able to be extracted.
VALUE will convert the extracted string (or null) to a real number.
If you would prefer that null be returned instead of 0 where no pattern match is found, you can use the following variation of the formula instead:
=ArrayFormula({"Header"; IFERROR(VALUE(IFERROR(REGEXEXTRACT(A2:A,"\$(\d+\.?\d*)"),"x")))})
If your strings may include numbers with comma separators, use the following versions of the above two formulas, respectively:
=ArrayFormula({"Header V1"; IF(A2:A="",,VALUE(IFERROR(REGEXEXTRACT(SUBSTITUTE(A2:A,",",""),"\$(\d+\.?\d*)"))))})
=ArrayFormula({"Header V2"; IFERROR(VALUE(IFERROR(REGEXEXTRACT(SUBSTITUTE(A2:A,",",""),"\$(\d+\.?\d*)"),"x")))})
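For readers who want to try the pattern outside Sheets, here is a minimal Python sketch of the same steps: strip commas, capture the digits after a literal dollar sign, convert to a number, and return nothing when there is no match. The function name is hypothetical, and Python's re is assumed to behave like the Sheets engine for this pattern.

```python
import re
from typing import Optional

def extract_dollars(text: str) -> Optional[float]:
    # Mirror SUBSTITUTE(A2:A, ",", ""): drop thousands separators first.
    cleaned = text.replace(",", "")
    # Literal $, then capture digits with an optional decimal part.
    match = re.search(r"\$(\d+\.?\d*)", cleaned)
    # Mirror IFERROR: no match gives None instead of an error.
    return float(match.group(1)) if match else None

print(extract_dollars("Paid $1,234.50 today"))  # 1234.5
print(extract_dollars("Fee: $15"))              # 15.0
print(extract_dollars("no amount here"))        # None
```

Note that removing every comma also affects commas used as punctuation, which is harmless for extraction but worth knowing.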
try:
=INDEX(IFNA(REGEXEXTRACT(A2:A, "\$(\d+.\d+|\d+)")*1))

Extract substring after '-' character in Google Sheets

I am using the following formula to extract the substring venue01 from column C. The problem is that when the string in column C is shorter, it only extracts the value 1. I need it to extract everything straight after the - (dash), no matter how long the text in column C is.
={"VenueID";ARRAYFORMULA(IF(ISBLANK(A2:A),"",RIGHT(C2:C,SEARCH("-",C2:C)-21)))}
There is a much simpler solution using regular expressions.
=REGEXEXTRACT(A1,".*-(.*)")
In case you are not familiar with regular expressions, this means: capture every character ((.*)) after a dash (-). Because the leading .* is greedy, the capture starts after the last dash in the string.
To answer bomberjacket's question in the comment on Raserhin's answer:
To select the part of the string before the "-"
=REGEXEXTRACT(A1,"(.*)-.*")
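Both patterns rely on .* being greedy, so with several dashes the split happens at the last one. A short Python sketch (hypothetical function names, Python's re standing in for Sheets) makes that visible:

```python
import re

def after_last_dash(text: str):
    # Greedy .* consumes up to the LAST dash; the group is what follows.
    match = re.search(r".*-(.*)", text)
    return match.group(1) if match else None

def before_last_dash(text: str):
    # Greedy group takes everything up to the LAST dash.
    match = re.search(r"(.*)-.*", text)
    return match.group(1) if match else None

print(after_last_dash("abc-venue01"))  # venue01
print(after_last_dash("a-b-c"))        # c
print(before_last_dash("a-b-c"))       # a-b
```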
Adding to your original formula: I think that if you use RIGHT and, inside it, reverse the order of the string with ARRAYFORMULA, that may work.
=Right(A1,FIND("-",JOIN("",ARRAYFORMULA(MID(A1,LEN(A1)-ROW(INDIRECT("1:"&LEN(A1)))+1,1))))-1)
It takes X characters from the right side of the string.
X is found by reversing the text and then finding the position of the first dash "-" in the reversed text.
The trailing -1 excludes the dash itself; without it, the dash would appear in the extracted string.
The REGEX in the other answer works great too; however, with this formula you can control the number of characters to over- or under-trim, e.g. if there is a space after the dash and you always want to account for one more character.
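The reverse-then-find arithmetic can be sketched in Python as follows. The function name and the extra parameter are assumptions made for illustration; extra plays the role of adjusting the trailing -1 in the formula.

```python
def right_of_last_dash(text: str, extra: int = 0) -> str:
    # Reverse the text, as the MID/ROW/INDIRECT construction does.
    reversed_text = text[::-1]
    # 1-based position of the first dash in the reversed text (like FIND);
    # this is 0 when there is no dash.
    pos = reversed_text.find("-") + 1
    # Characters to take from the right: everything after the dash,
    # plus or minus an optional adjustment.
    count = pos - 1 + extra
    return text[len(text) - count:] if count > 0 else ""

print(right_of_last_dash("site-venue01"))          # venue01
print(right_of_last_dash("a-b-c", extra=1))        # -c
print(right_of_last_dash("abc- venue", extra=-1))  # venue
```

Unlike the Sheets formula, which would return an error when no dash is found, this sketch returns an empty string in that case.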

extract number from cell in openoffice calc

I have a column in open office like this:
abc-23
abc-32
abc-1
Now, I need to get only the sum of the numbers 23, 32 and 1 using a formula and regular expressions in calc.
How do I do that?
I tried
=SUMIF(F7:F16,"([:digit:].)$")
But somehow this does not work.
Starting with LibreOffice 6.4, you can use the newly added REGEX function to generically extract all numbers from a cell / text using a regular expression:
=REGEX(A1;"[^[:digit:]]";"";"g")
Replace A1 with the cell-reference you want to extract numbers from.
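The formula above only strips the non-digits; summing is a separate step. A minimal Python sketch of both steps on the sample column follows. Python's re does not support the POSIX class [:digit:], so [^0-9] is used as the equivalent here, and the function names are made up.

```python
import re

def digits_only(text: str) -> str:
    # Remove every character that is not a decimal digit (the "g" flag
    # in the Calc formula corresponds to replacing all matches).
    return re.sub(r"[^0-9]", "", text)

def sum_embedded_numbers(cells) -> int:
    total = 0
    for cell in cells:
        digits = digits_only(cell)
        if digits:  # skip cells that contain no digits at all
            total += int(digits)
    return total

print(digits_only("abc-23"))                                # 23
print(sum_embedded_numbers(["abc-23", "abc-32", "abc-1"]))  # 56
```

Because all non-digits are removed, a value such as "a1b2" collapses to 12, so this approach suits cells that contain a single number.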
Explanation of REGEX function arguments:
Arguments are separated by a semicolon ;
A1: Value to extract numbers from. Can be a cell-reference (like A1) or a quoted text value (like "123abc"). The following regular expression will be applied to this cell / text.
"[^[:digit:]]": Match every character which is not a decimal digit. See also list of regular expressions in LibreOffice
The outer square brackets [] encapsulate the list of characters to search for
^ adds a NOT, meaning that every character not included in the search list is matched
[:digit:] represents any decimal digit
"": replace matching characters (every non-digit) with nothing = remove them
"g": replace all matches (don't stop after the first non-digit character)
Unfortunately Libre-Office only supports regex in find/replace and in search.
If this is a one-off, I would copy column A to column B, then use Data > Text to Columns on B with - as the separator, leaving you with the text in column B and the numbers in column C.
Alternatively, you could use =VALUE(RIGHT(A1;LEN(A1)-FIND("-";A1))) in column B, then sum column B.
I think that this is not exactly what you want, but maybe it can help you or others.
It is all about substrings (the MID function in Calc):
First: Choose your cell (for example, one containing "abc-23").
Second: Enter the start position (for "british", start position 4 gives "tish").
After that: To return all the remaining text, pass the LEN function (the length) of your cell ("abc-23") as the length parameter.
Code now looks like this:
D15="abc-23"
=MID(D15; 5; LEN(D15))
And the output is: 23
When you edit the number (23 in this example), there is no problem. However, if you change the text before it ("abc-"), the formula breaks because the start position is hard-coded to 5.
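The fixed start position is the weak point. The Python sketch below shows the same MID logic with a 1-based start, then a variant that looks up the dash instead of assuming position 5. Both function names are hypothetical. In Calc the equivalent idea would presumably be to replace the literal 5 with FIND("-"; D15)+1, which is an assumption and not part of the original answer.

```python
def mid(text: str, start: int, length: int) -> str:
    # 1-based start, like Calc's MID; a length past the end is harmless.
    return text[start - 1:start - 1 + length]

def after_dash(text: str) -> str:
    # Locate the dash rather than hard-coding the start position.
    pos = text.find("-")
    return text[pos + 1:] if pos >= 0 else ""

print(mid("abc-23", 5, len("abc-23")))  # 23
print(mid("british", 4, 7))             # tish
print(mid("ab-23", 5, len("ab-23")))    # 3  (prefix changed, result is wrong)
print(after_dash("ab-23"))              # 23
```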
Paste the string into a cell and open the search-and-replace dialog (Ctrl+F). Under the extended search options, tick "regular expression", then search for ([\s,0-9])([^0-9\s])+ and replace it with $1.
Adjust the regex to your needs.
I didn't figure out how to do this in OpenOffice/LibreOffice directly. After frustrations in searching online and trying various formulas, I realised my sheet was a simple CSV format, so I opened it up in vim and used vim's built-in sed-like feature to find/replace the text in vim command mode:
:%s/abc-//g
This only worked for me because there were no other columns with this matching text. If there are other columns with the same text, then the solution would be a bit more complex.
If your sheet is not a CSV, you could copy the column out to a text file and use vim to find/replace, and then paste the data back into the spreadsheet. For me, this was a lot less frustrating than trying to figure this out in LibreOffice...
I won't bother with a solution without knowing if there really is interest, but, you could write a macro to do this. Extract all the numbers and then implement the sum by checking for contained numbers in the text.
