Google Spreadsheet - Not calculating numbers with space - google-sheets

I am trying to do a calculation of two cells, where one of them contains a number like this: 1 250.
If the number is written like that, and not 1250, then I cannot get the spreadsheet to do any calculations with it. Google suddenly do not treat it as a legit number anymore.
Why not just type 1250 instead of 1 250?
Well, I am getting the cell values from a html import function.
Any good advice on how to get around this?

Try something like this:
=Substitute(A2," ","")
In this formula, A2 is a cell. You are finding any spaces in that cell and then replacing it with a "non-space".

Use the substitute function to transform your number before using it in a formula. For instance, let's say you wanted to multiple F8 by 2, but F8 may contain spaces. You would then do:
=substitute(F8, " ","") * 2

Substitute didn't work form me. But these steps did:
Select one or several columns of data
Press Ctrl + H to get the "Find and Replace" dialog
Make sure "Search using regular expressions" is checked ✅
Enter \s to the "Find" field, and leave "Replace with" empty
Click on the "Replace all" button
Explanation:
\s is a regular expression matching any kind of whitespace character. There may have been some other kind of whitespace in my spreadsheet, not a regular " " (space) character, and that's why regex worked for me, while SUBSTITUTE() didn't.
I've also tried the REGEXREPLACE(A2, "\s", "") function, but it didn't seem to to anything in my case.

Related

Remove characters in Googlesheets

I want to remove the first and last characters of a text in a cell. I know how to remove just the first or the last with formulas such as =LEFT(A27, LEN(A27)-1) but i want to combine two formulas at the same time for first and last character or maybe there is a formula that I'm not aware of which removes both (first and last characters) at the same time.
I know about Power Tool but i want to avoid using this tool and I'm trying to realize this simply by formulas.
You could use the REGEXREPLACE() function:
=REGEXREPLACE(A27, "^.|.$", "")
The regular expression used here matches:
^. the first character after the start of the string
| OR
.$ the last character of the string
better not to use this but it works too:
=RIGHT(LEFT(A27, LEN(A27)-1), LEN(LEFT(A27, LEN(A27)-1))-1)
=LAMBDA(x, RIGHT(LEFT(A27, x), LEN(LEFT(A27, x))-1))(LEN(A27)-1)
=LAMBDA(x, LAMBDA(y, RIGHT(y, LEN(y)-1))(LEFT(A27, x)))(LEN(A27)-1)
=LAMBDA(x, RIGHT(x, LEN(x)-1))(LEFT(A27, LEN(A27)-1))

ArrayFormula, RegexExtract, and Join in Google Sheets

I have a data set wherein emails are populated. I would like to list all the surnames extracted in the emails per cell and will be all joined to a one single cell but I want to put a separator or delimeter to the emails obtaine per cell.
Here is the data set:
A
B
john.smith#gmail.com, jane.doe#gmail.com
UPDATE
john.smith#gmail.com
CLOSE
And here is the formula to extract
=ARRAYFORMULA(
PROPER(
REGEXEXTRACT(
A:A,
REGEXREPLACE(
A:A,
"(\w+)#","($1)#"
)
)
)
)
This initially yields the ff:
C
D
Smith
Doe
Smith
I would like to use JOIN() inside the ARRAYFORMULA() but it is not working as I seem to think it would since it outputs an error that it only accepts one row or one column of data. My initial understanding of ARRAYFORMULA() is that it iterates through the course of the data, so I thought it will JOIN() first, and then move on to the next element/row but I guess it doesn't work that way. I can use FLATTEN() but I want to have delimiters or separators in between the row elements. I need help in obtaining my intended final result which will look like this:
UPDATE:
Smith
Doe
CLOSE:
Smith
All are located in one cell, C1. UPDATE and CLOSE are from column B.
EDIT: I would like to clarify that the email entries in column A are dynamic and maybe more than two.
I think this will work:
=arrayformula(flatten(if(A2:A<>"",regexreplace(trim(split(B2:B&":"&char(9999)&regexreplace(Proper(A2:A),"#[\w\.]+,\ ?|#.*",char(9999)&" "),char(9999))),".*\.",),)))
NOTES:
Proper(A2:A) changes the capitalisation.
The regexreplace "#[\w\.]+,\ ?|#.*" finds:
# symbol...
then any number of A-Z, a-z, 0-9, _ [using \w] or . [using \.]
then a comma
then 'optionally' a space \ [the optional bit is ?]
or [using |], the # symbol then an number of characters [using .*]
The result is replaced with a character that you won't expect to have in your text - char(9999) which is a pencil icon, and a trailing space (used later on when the flatten keeps a gap between lines). The purpose is to get all of the 'name.surname' and 'nameonly' values in front of any # symbol and separate them with char(9999).
Then infront of the regexreplace is B2:B&":"&char(9999)& which gets the value from column B, the : chanracter and char(9999).
The split() function then separates then into columns. Trim() is used to get rid of spaces in front of names that don't contain ..
The next regexreplace() function deletes anything before, and including . to keep surname or name without ..
The if(A2:A<>"" only process rows where there is a value in col A. The arrayformula() function is required to cascade the formula down the sheet.
I didn't output the results in a single cell, but it looks like you've sorted that with textjoin.
Here's my version of getting the results into a single cell.
=arrayformula(textjoin(char(10),1,if(A2:A<>"",REGEXREPLACE(B2:B&":"&char(10)&regexreplace(Proper(A2:A),"#[\w\.]+,\ ?|#.*",char(10)),".*\.",),)))
Assuming that your A:A cells will always contain only contiguous email addresses separated by commas, you could place this in, say, C1 (being sure that both Columns C and D are otherwise empty beforehand):
=TRANSPOSE(FILTER({B:B,IFERROR(REGEXEXTRACT(SPLIT(PROPER(A:A),","),"([^\.]+)#"))},A:A<>""))
If this produces the desired result and you'd like to know how it works, report back. The first step, however, is to make sure it works as expected.
use:
=INDEX(REGEXREPLACE(TRIM(QUERY(FLATTEN(QUERY(TRANSPOSE({{B1; IF(B2:B="",,"×"&B2:B)},
PROPER(REGEXEXTRACT(A:A, REGEXREPLACE(A:A, "(\w+)#", "($1)#")))})
,,9^9)),,9^9)), " |×", CHAR(10)))

replace everything after a character in Google spreadsheet

I did some searching and in openoffice and excel it looks like you can simply add an * at the beginning or end of a character to delete everything before and after it, but in Google spreadsheet this isn't working. Does it support this feature? So if I have:
keyword USD 0078945jg .12 N N 5748 8
And I want to remove USD and everything after it what do I use? I have tried:
USD* and (USD*) with regular expressions checked
But it doesn't work. Any ideas?
The * quantifier just needs to be applied to a dot (.) which will match any character.
To clarify: the * wildcard used in certain spreadsheet functions (eg COUNTIF) has a different usage to the * quantifier used in regular expressions.
In addition to options that would be available in Excel (LEFT + FIND) pointed out by pnuts, you can use a variety of regex tools available in Google Sheets for text searching / manipulation
For example, RegexReplace:
=REGEXREPLACE(A1,"(.*)USD.*","$1")
(.*) <- capture group () with zero or more * of any character .
USD.* <- exact match on USD followed by zero or more * of any character .
$1 <- replace with match in first capture group
Please try:
and also have a look at.
For spaces within keyword I suggest a helper column with a formula such as:
=left(A1,find("USD",A1)-1)
copied down to suit. The formula could be converted to values and the raw data (assumed to be in ColumnA) then deleted, if desired.
To add to the answers here, you can get into trouble when there are special characters in the text (I have been struggling with this for years).
You can put a frontslash \ in front of special characters such as ?, + or . to escape them. But I still got stuck when there were further special characters in the text. I finally figured it out after reading find and replace in google sheets with regex.
Example: I want to remove the number, period and space from the beginning of a question like this: 1. What is your name?
Go to Edit → Find and replace
In the Find field, enter the following: .+\. (note: this includes a space at the end).
Note: In the Find and replace dialogue box, be sure to check "Search using regular expressions" and "match case". Leave the Replace field blank.
The result will be this text only: What is your name?

Google new spreadsheet Split() function bug?

try this.
in cell A1 10/8/2013 Mike1, 2013, 0
now Split(A1,",")
now change A1 and add a colon 10/8/2013 Mike:1, 2013, 0
Split(A1,",") ... notice how Mike:1 is gone. Bug?
I need to split something with a colon in it and it is chopping off part of the text
The problem is that "10/8/2013 Mike:1" is interpreted as a date, it seems to be something like: "dd/mm/yyyy hhhh:mm". (if you change the display of the first cell to raw text you'll see he is giving you a number: 41496).
try this formula:
=arrayformula(SUBSTITUTE(arrayformula(split(REGEXREPLACE(A1;":";"###");","));"###";":"))
in a easier way: As I suppose you can't change the way it's displayed, the possible workaround that I see here is to first split around the ":", then split with the "," and then join the two first elements.
I can't reproduce this issue, either with function SPLIT or with Split text into columns... and with or without formatting as dd/mm/yyyy hhhh:mm.
Select A1, Data > Split text into columns... is all that is required.

extract number from cell in openoffice calc

I have a column in open office like this:
abc-23
abc-32
abc-1
Now, I need to get only the sum of the numbers 23, 32 and 1 using a formula and regular expressions in calc.
How do I do that?
I tried
=SUMIF(F7:F16,"([:digit:].)$")
But somehow this does not work.
Starting with LibreOffice 6.4, you can use the newly added REGEX function to generically extract all numbers from a cell / text using a regular expression:
=REGEX(A1;"[^[:digit:]]";"";"g")
Replace A1 with the cell-reference you want to extract numbers from.
Explanation of REGEX function arguments:
Arguments are separated by a semicolon ;
A1: Value to extract numbers from. Can be a cell-reference (like A1) or a quoted text value (like "123abc"). The following regular expression will be applied to this cell / text.
"[^[:digit:]]": Match every character which is not a decimal digit. See also list of regular expressions in LibreOffice
The outer square brackets [] encapsulate the list of characters to search for
^ adds a NOT, meaning that every character not included in the search list is matched
[:digit:] represents any decimal digit
"": replace matching characters (every non-digit) with nothing = remove them
"g": replace all matches (don't stop after the first non-digit character)
Unfortunately Libre-Office only supports regex in find/replace and in search.
If this is a once-only deal, I would copy column A to column to B, then use [data] [text to columns] in B and use the - as a separator, leaving you with all the text in column B and the numbers in column C.
Alternatively, you could use =Right(A1,find("-",A1,1)+1) in column B, then sum Column C.
I think that this is not exactly what do you want, but maybe it can help you or others.
It is all about substring (in Calc called [MID][1] function):
First: Choose your cell (for example with "abc-23" content).
Secondly: Enter the start length ("british" --> start length 4 = tish).
After that: To print all remaining text, you can use the [LEN][2] function (known as length) with your cell ("abc-23") in parameter.
Code now looks like this:
D15="abc-23"
=MID(D15; 5; LEN(D15))
And the output is: 23
When you edit numbers (in this example 23), no problem. However, if you change anything before (text "abc-"), the algorithm collapses because the start length is defined to "5".
Paste the string in a cell, open search and replace dialog (ctrl + f) extended search option mark regular expression search for ([\s,0-9])([^0-9\s])+ and replace it with $1
adjust regex to your needs
I didn't figure out how to do this in OpenOffice/LibreOffice directly. After frustrations in searching online and trying various formulas, I realised my sheet was a simple CSV format, so I opened it up in vim and used vim's built-in sed-like feature to find/replace the text in vim command mode:
:%s/abc-//g
This only worked for me because there were no other columns with this matching text. If there are other columns with the same text, then the solution would be a bit more complex.
If your sheet is not a CSV, you could copy the column out to a text file and use vim to find/replace, and then paste the data back into the spreadsheet. For me, this was a lot less frustrating than trying to figure this out in LibreOffice...
I won't bother with a solution without knowing if there really is interest, but, you could write a macro to do this. Extract all the numbers and then implement the sum by checking for contained numbers in the text.

Resources