Would appreciate some help with a Google Sheets formula to remove all punctuation excluding period (.)
I'm currently using the formula:
=REGEXREPLACE(D2,"[[:punct:]]", " ")
However this removes all punctuation and I need the periods to remain.
Input: cpe:2.3:a:caldera:openlinux_server:3.1:::::::*
Expected output: cpe 2.3 a caldera openlinux server 3.1
Input: cpe:2.3:o:trustix:secure_linux:1.01:::::::*
Expected output: cpe 2.3 o trustix secure linux 1.01
it should be:
=TRIM(REGEXREPLACE(A2, "[:,_\*]", " "))
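The formula above lists the unwanted characters one by one. As a rough way to check the idea outside Sheets, here is a minimal Python sketch of the inverse approach: replace anything that is not a letter, digit, whitespace or period. The function name is hypothetical, and Python's re module only stands in for the RE2 engine that Sheets uses, so treat it as an illustration, not a drop-in formula.

```python
import re

def strip_punct_keep_period(text: str) -> str:
    # [^\w\s.] matches anything that is not a word character, whitespace or a period.
    # The underscore counts as a word character, so it is listed separately.
    spaced = re.sub(r"[^\w\s.]|_", " ", text)
    # Mimic TRIM: drop leading/trailing spaces and collapse runs of spaces.
    return " ".join(spaced.split())

print(strip_punct_keep_period("cpe:2.3:a:caldera:openlinux_server:3.1:::::::*"))
```

If the same character class is acceptable to Sheets (an assumption, not tested here), the formula would look like =TRIM(REGEXREPLACE(A2, "[^\w\s.]|_", " ")).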
Related
I have texts in which all of the apostrophes (‘) and quotation marks (“) are coming out incorrectly as question marks. e.g. I?m writing to send my wishes and congratulate you on your engagement. I’m using regexreplace in Google sheets to replace the question marks with apostrophes: =REGEXREPLACE(A1, "?", "'")
The problem is that I don’t want to replace actual (i.e. correct) question marks with apostrophes. So, I need to be able to make regexreplace ignore true question marks. We could define a true question mark as 1. one that comes before two spaces (in the case of one at the end of a paragraph) and 2. before a space and a capital letter (in the case of one at the end of a sentence within a paragraph). In practice rule 2 will capture some apostrophes (those coming before proper nouns not at the start of a new sentence), but it’ll be rare enough not to matter much.
Any ideas on how I can put these rules for ignoring real question marks into regexreplace?
Try
=REGEXREPLACE(REGEXREPLACE(A2,"\?([\w\.])","'$1"),"\?( [a-z])","'$1")
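To see what the two nested passes do, here is a small Python sketch that uses the same two patterns. The function name and sample sentences are made up for illustration, and Python writes the backreference as \1 where Sheets uses $1.

```python
import re

def restore_apostrophes(text: str) -> str:
    # Pass 1: a "?" directly followed by a word character or a period
    # is assumed to be a broken apostrophe (e.g. "I?m").
    text = re.sub(r"\?([\w.])", r"'\1", text)
    # Pass 2: a "?" followed by a space and a lowercase letter
    # is also assumed to be a broken apostrophe (e.g. "dogs? bowls").
    return re.sub(r"\?( [a-z])", r"'\1", text)

print(restore_apostrophes("I?m writing. Are you well? Yes."))
```

A question mark followed by a space and a capital letter, or one at the very end of the text, is left alone, which roughly matches the rules in the question.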
try:
=INDEX(REGEXREPLACE(SUBSTITUTE(A1:A5, "?", "'"), " '|' ", " ? "))
In Google Sheets I am trying to remove some capital letters that appear between brackets using regexreplace
=arrayformula(regexreplace(regexreplace(A2:A,"\(.+?\)\ Ltd$| $| LTD$","")," $| LTD$",))
The only part remaining is a case like "random company (PTY)", where I need to remove both the (PTY) and the space before it.
Any ideas?
Try this formula:
=arrayformula(regexreplace(A2:A,"\s?\(.+\) (Ltd|LTD)$",""))
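For a quick check of which names this pattern touches, here is a Python sketch. The company names are invented, and the second, more lenient pattern is my own assumption (it makes the Ltd/LTD suffix optional). It is not part of the answer above.

```python
import re

# Pattern from the formula above: bracketed text followed by " Ltd" or " LTD" at the end.
STRICT = re.compile(r"\s?\(.+\) (Ltd|LTD)$")
# Hypothetical variant: the suffix after the brackets is optional.
LENIENT = re.compile(r"\s?\(.+\)( (Ltd|LTD))?$")

def strip_strict(name: str) -> str:
    return STRICT.sub("", name)

def strip_lenient(name: str) -> str:
    return LENIENT.sub("", name)

for name in ["Random Company (PTY) Ltd", "Random Company (PTY)", "Random Company Ltd"]:
    print(repr(strip_strict(name)), repr(strip_lenient(name)))
```

With the strict pattern, a name that ends in a bare (PTY) is left unchanged because the pattern requires the Ltd or LTD suffix. Also note that .+ is greedy, so a name with two bracketed parts would be cut from the first opening bracket.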
Reference:
Regex capturing group
I have an issue with the IMPORTXML function and then changing the currency in my portfolio tracker.
IMPORTXML (C3=IMPORTXML(B3,"//div[@class='priceValue___11gHJ']")) takes the price of a cryptocurrency from coinmarketcap (B3=https://coinmarketcap.com/currencies/ethereum/). This all works fine (I would rather have the prices from coingecko, but cannot figure out the IMPORTXML function for that website; if anyone has some valuable input on this too, that would be great).
However, the imported price in C3 has a dollar sign before the actual numbers, which messes up the GOOGLEFINANCE formulas in columns D (D3=C3*GoogleFinance("CURRENCY:USDEUR")) and E (E3=C3*GoogleFinance("CURRENCY:USDGBP")). Screenshot of the error attached.
Does anyone know how to fix this?
Much appreciated!
Rob
On cell C3 you could use:
=VALUE(SUBSTITUTE(IMPORTXML(B3,"//div[@class='priceValue___11gHJ']"),"$",""))
References:
SUBSTITUTE
VALUE
If I understand correctly, the problem is that the imported price in C3 has a dollar sign in front of the actual numbers, which breaks further processing by formulas.
To solve the problem, you can wrap the formula in C3 in a MID() formula, which returns a segment of a string. If we specify 2 as the starting position of the segment, the dollar sign at position 1 is skipped. A pound or euro sign would be skipped as well, as would any currency sign that is one character long.
10 is the segment length with some margin. Please note if the end of string is reached before segment length characters are encountered, MID returns the characters from starting position to the end of string.
C3:
=MID(IMPORTXML(B3,"//div[@class='priceValue___11gHJ']"),2,10)
MID()
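The MID behaviour described above (1-based start, result cut short at the end of the string) can be sketched in Python as follows. The helper names and the sample price strings are made up for illustration. The float() call just makes the text-to-number step explicit.

```python
def mid(text: str, start: int, length: int) -> str:
    """Mimic Sheets MID: 1-based start position, at most `length` characters."""
    begin = start - 1
    # Slicing past the end of the string simply returns what is left.
    return text[begin:begin + length]

def price_to_number(raw: str) -> float:
    # Skip the one-character currency sign, then convert the rest to a number.
    return float(mid(raw, 2, 10))

print(mid("$2345.67", 2, 10))
print(price_to_number("$2345.67"))
```

A price with a thousands separator, such as "$2,345.67", would make float() fail in this sketch, so that case would need extra handling.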
Here's a little workaround to import the prices from coingecko, and remove the $ symbol, of course:
=SUBSTITUTE(IMPORTXML("https://www.coingecko.com/en/coins/ethereum"; "//div[contains(@data-controller,'coins-information')]//span[contains(@data-coin-symbol,'eth')]");"$";"")
But, as you can imagine, if they change the html structure of the site, this will stop working.
Have fun!
How do I remove non-numerical characters from a filter in Google Sheets?
I have a function that spits out matching phone numbers into up to 3 subsequent columns. I would like to eliminate non-number characters and a leading 1 if there is one, possibly using Regex.
=array_constrain(transpose(filter(People!H:H,People!B:B=A10)),1,3)
=ARRAYFORMULA(IFERROR(REGEXREPLACE(TO_TEXT(A1:A), "\D+|^1", "")))
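Here is a Python sketch of what the pattern \D+|^1 does to a few made-up phone numbers. The second function is a hypothetical two-pass variant of my own, added only to show one case the single pattern misses. It assumes that an 11-digit result starting with 1 carries a country prefix.

```python
import re

def clean_single_pass(raw: str) -> str:
    # Same pattern as the formula: drop runs of non-digits, or a "1" at the very start.
    return re.sub(r"\D+|^1", "", raw)

def clean_two_pass(raw: str) -> str:
    # Hypothetical variant: strip non-digits first, then drop a leading 1
    # only when 11 digits remain (assumed to be a country prefix).
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits

for number in ["1-555-123-4567", "(555) 123-4567", "+1 555 123 4567"]:
    print(clean_single_pass(number), clean_two_pass(number))
```

The single pattern only removes a 1 that is the first character of the text, so in "+1 555 123 4567" the 1 survives because the + comes first.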
I did some searching, and in OpenOffice and Excel it looks like you can simply add an * at the beginning or end of a character to delete everything before or after it, but in Google Sheets this isn't working. Does it support this feature? So if I have:
keyword USD 0078945jg .12 N N 5748 8
And I want to remove USD and everything after it what do I use? I have tried:
USD* and (USD*) with regular expressions checked
But it doesn't work. Any ideas?
The * quantifier just needs to be applied to a dot (.) which will match any character.
To clarify: the * wildcard used in certain spreadsheet functions (eg COUNTIF) has a different usage to the * quantifier used in regular expressions.
In addition to options that would be available in Excel (LEFT + FIND) pointed out by pnuts, you can use a variety of regex tools available in Google Sheets for text searching / manipulation
For example, RegexReplace:
=REGEXREPLACE(A1,"(.*)USD.*","$1")
(.*) <- capture group () with zero or more * of any character .
USD.* <- exact match on USD followed by zero or more * of any character .
$1 <- replace with match in first capture group
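The difference between the wildcard-style attempt and the regex version can be checked with a short Python sketch. It uses the sample row from the question. Python writes the backreference as \1 where Sheets uses $1.

```python
import re

row = "keyword USD 0078945jg .12 N N 5748 8"

# In a regex, "USD*" means "US" followed by zero or more "D",
# so only the literal "USD" is removed and the rest of the row stays.
wildcard_style = re.sub(r"USD*", "", row)

# "(.*)USD.*" captures everything before USD and discards USD and what follows.
regex_style = re.sub(r"(.*)USD.*", r"\1", row)

print(repr(wildcard_style))
print(repr(regex_style))
```

The regex version keeps the space that sat in front of USD, so wrapping the result in TRIM may be useful.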
For spaces within keyword I suggest a helper column with a formula such as:
=left(A1,find("USD",A1)-1)
copied down to suit. The formula could be converted to values and the raw data (assumed to be in ColumnA) then deleted, if desired.
To add to the answers here, you can get into trouble when there are special characters in the text (I have been struggling with this for years).
You can put a backslash \ in front of special characters such as ?, + or . to escape them. But I still got stuck when there were further special characters in the text. I finally figured it out after reading find and replace in google sheets with regex.
Example: I want to remove the number, period and space from the beginning of a question like this: 1. What is your name?
Go to Edit → Find and replace
In the Find field, enter the following: .+\. (note: this includes a space at the end).
Note: In the Find and replace dialogue box, be sure to check "Search using regular expressions" and "match case". Leave the Replace field blank.
The result will be this text only: What is your name?
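The effect of the Find pattern can be reproduced with a short Python sketch. The second, anchored pattern is my own assumption, added to show how a greedy .+ behaves when the question itself contains a period followed by a space. The sample questions beyond the first are made up.

```python
import re

def strip_greedy(question: str) -> str:
    # Pattern from the steps above: any characters, a literal period, then a space.
    return re.sub(r".+\. ", "", question)

def strip_anchored(question: str) -> str:
    # Hypothetical stricter variant: only digits and a period at the start of the text.
    return re.sub(r"^\d+\. ", "", question)

print(strip_greedy("1. What is your name?"))
print(strip_greedy("2. Is Dr. Smith in?"))
print(strip_anchored("2. Is Dr. Smith in?"))
```

Because .+ is greedy, the original pattern removes everything up to the last period-plus-space in the text, which is fine for simple questions but cuts too much in the second example.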