Google Sheets character sort order - google-sheets

Is there anywhere I can find the character order Google uses when you sort a list A-Z with a filter?
More specifically, I need a character that is discreet (ideally a blank character) and that is sorted after the plus sign '+'.

I ran a test; you can see the result in the sample file.
I tried the first 3000 characters:
Column A: numbers from 1 to 3000
Column B: char formula: =CHAR(A2)
I sorted the range with a filter and got an unexpected result: Google does not use ASCII order to sort text.
Also, the SORT function works the same way as sorting with a filter, but sorting with QUERY gives a different result.
When I tried the same experiment in Excel, I was even more confused:
=CHAR(A2) gives a different result in Excel; it is not an ASCII character
sorting the range of characters gives a different result from Sheets. Please try it yourself to see.
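For comparison with what Sheets returns, here is a minimal Python sketch of the same table sorted by plain code point. It assumes Python's chr() is a fair stand-in for =CHAR() in Sheets, and the two candidate characters are only illustrative picks. It shows the baseline order, not the order Sheets actually uses, so any candidate still has to be checked in the sheet itself.

```python
# Rebuild the experiment's table: codes 1..3000 and the matching character.
# Assumption: chr() plays the role of =CHAR() in Sheets.
rows = [(code, chr(code)) for code in range(1, 3001)]

# Python compares strings by code point, so this is the baseline order,
# not the order Sheets produces.
by_code_point = sorted(rows, key=lambda row: row[1])


def sorts_after_plus(ch: str) -> bool:
    """True if ch comes after '+' in plain code-point order."""
    return ch > "+"


# Two blank-looking characters within the first 3000 codes (illustrative picks).
candidates = {"SPACE": "\u0020", "NO-BREAK SPACE": "\u00a0"}
for name, ch in candidates.items():
    print(name, hex(ord(ch)), sorts_after_plus(ch))
```

Any row where the Sheets order disagrees with by_code_point is a place where Sheets is applying its own collation rather than code-point order.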

Related

Google Sheets - Iterate through text string, removing characters from end until a vlookup match

A user pastes in a value to see if there is a full or partial match. I need to do a vlookup and keep removing characters until there is a match. A full match of something like test1.test2.test3 is no problem because it's a full match to my list. But if someone pastes in something like test1.test2.test3.test4, I need to remove a character one at a time from the end until there is a match. So in this example, it would match test1.test2.test3 and return that result.
Conceptually I see this as a for loop that counts the characters using len, using left to remove the number of characters from the end based on the current iteration, and doing vlookups until returning the value when true. But I'm not sure how to do this in Google Sheets.
This formula will give you the matching value that was found in the data (e.g. test1.test2.test3)
=FILTER([column_with_data], REGEXMATCH([cell_with_pasted_value_to_look], [column_with_data]))
This formula will give you the matching data and the cell reference where it was found (e.g. test1.test2.test3 # $A$4)
=FILTER([column_with_data], REGEXMATCH([cell_with_pasted_value_to_look], [column_with_data]))&" # "&CELL("address",INDEX([column_with_data],MATCH(FILTER([column_with_data], REGEXMATCH([cell_with_pasted_value_to_look], [column_with_data])),[column_with_data],0),1))
Simply copy and paste either of the above formulas next to the cell where users paste the value to look up. Then replace the two references in square brackets [ ] with the proper coordinates in your sheet:
replace [column_with_data] with the coordinates of the column containing all the stored data (e.g. A1:A)
replace [cell_with_pasted_value_to_look] with the absolute ($col$row) coordinates of the cell where users paste the value to look up (e.g. $B$1)
Would it be a problem to download the data from Google sheets, transform the file type to use the for loop in another software, and re-upload? I think your idea for a for loop would work.
It might be quicker if this is a long term project, but not so great if the client is continually monitoring/uploading.
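If the loop is run outside Sheets, it could look roughly like the Python sketch below. The function name longest_prefix_match and the sample values are made up for illustration; the logic is just the loop described in the question: trim one character at a time from the end until the remaining text is found in the list.

```python
from typing import Iterable, Optional


def longest_prefix_match(pasted: str, known: Iterable[str]) -> Optional[str]:
    """Return the longest leading part of `pasted` that appears in `known`."""
    known_set = set(known)
    # Start with the full string, then drop one character from the end each time.
    for end in range(len(pasted), 0, -1):
        candidate = pasted[:end]
        if candidate in known_set:
            return candidate
    return None  # nothing matched, not even a single character


stored = ["test1", "test1.test2.test3"]  # hypothetical lookup column
print(longest_prefix_match("test1.test2.test3.test4", stored))
```

Because the loop starts from the full string, a full match is returned immediately and a partial match is always the longest one available.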

Sheets: use FILTER for multiple strings instead of exact match only?

I'm trying to SUM column C based on the contents of columns A and B. Like this:
=sum(filter(C:C, (A:A="Safari")*(B:B="10.0.1")))
The above formula works. The FILTER function works as an exact match for "Safari" and "10.0.1" for columns A and B respectively.
The problem is... this only captures an exact match: "10.0.1". I need to capture multiple strings e.g. "10.0.1", "10.0.2", "10.0.3", etc.
If helpful, here's an example sheet.
I'm not sure if regex can be used in combination with a filter function. In any case, I've tried hard and failed spectacularly. So... how best to filter for multiple strings instead of exact match only?
=SUMIFS(C:C,A:A,"Safari",B:B,"10.0.*")
Please try:
=filter(C:C, (A:A="Safari")*(REGEXMATCH(B:B, "10\.0\..*")))
Notes:
filter is an array formula and it has a great property: it converts all the formulas inside it into array formulas
"10\.0\..*" is a regex for your match. "\." will match a literal dot, ".*" will match any sequence of chars. Please see more syntax here.
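To see why the escaped dots matter, here is a small Python sketch of the same filter-and-sum idea. The rows are invented sample data and sum_matching is a hypothetical helper; re.search is used on the assumption that REGEXMATCH accepts a match anywhere in the text.

```python
import re

# Hypothetical rows: (browser, version, count), standing in for columns A, B, C.
rows = [
    ("Safari", "10.0.1", 5),
    ("Safari", "10.0.2", 7),
    ("Safari", "10.1.0", 11),
    ("Chrome", "10.0.1", 13),
    ("Safari", "1000.1", 17),
]


def sum_matching(rows, browser, pattern):
    """Sum the counts where the browser is equal and the version matches the regex."""
    return sum(
        count
        for name, version, count in rows
        if name == browser and pattern.search(version)
    )


escaped = re.compile(r"10\.0\..*")   # dots are literal
unescaped = re.compile(r"10.0..*")   # dots match any character

print(sum_matching(rows, "Safari", escaped))
print(sum_matching(rows, "Safari", unescaped))
```

With the unescaped pattern, the version "1000.1" is picked up as well, because each dot is free to match a digit.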

How to Query by String Length in Google Sheet

I need to use query in Google Sheet Spreadsheet to return text in Column B of a given length, and various other conditions. To make it easy, below is a simplified version concentrate solely on Len() function. Seems simple enough but unfortunately does not work.
=QUERY(Valuations!B1:B,"select B where LEN(B)>3 ")
I'm aware that some SQL dialects use LEN(), whereas MySQL uses LENGTH().
Is my syntax incorrect, or is it not possible to check string length in a Google Sheets query?
You can do it using a filter
=filter(B:B,len(B:B)>=3)
And then if you want to combine that with other conditions, you can put it in a query e.g.
=query(filter(A:B,len(B:B)>=3),"select Col1,Col2 where Col1>1")
See this question
A regular expression can be used:
=QUERY(Valuations!B1:B, "select B where B matches '.{3,}'")
The regular expression explained:
. match any character
{3,} match the preceding symbol (the .) 3 or more times
You could also search for a specific length by modifying the expression to ^.{3}$
OR a range ^.{3,10}$
OR a maximum ^.{0,10}$
^ the start of the string
$ the end of the string
regex101.com is a valuable resource for regular expressions.
I am not associated with the site in any way but I use it all the time.
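The length patterns above can be tried out quickly in Python. This is only a sketch: has_length is a made-up helper, and re.fullmatch is used on the assumption that the query's matches operator has to match the whole cell value.

```python
import re


def has_length(text: str, pattern: str) -> bool:
    """True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


AT_LEAST_3 = r".{3,}"
EXACTLY_3 = r".{3}"
BETWEEN_3_AND_10 = r".{3,10}"
AT_MOST_10 = r".{0,10}"

for word in ["ab", "abc", "abcdefghijk"]:
    print(
        word,
        has_length(word, AT_LEAST_3),
        has_length(word, EXACTLY_3),
        has_length(word, BETWEEN_3_AND_10),
        has_length(word, AT_MOST_10),
    )
```

Since the whole string has to match, the ^ and $ anchors are not needed in this sketch.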

Countif with len in Google Spreadsheet

I have a column XXX like this :
XXX
A
Aruin
Avolyn
B
Batracia
Buna
...
I would like to count a cell only if the string in the cell has a length > 1.
How to do that?
I'm trying :
COUNTIF(XXX1:XXX30, LEN(...) > 1)
But what should I write instead of ... ?
Thank you in advance.
For ranges that contain strings, I have used a formula like below, which counts any value that starts with one character (the ?) followed by 0 or more characters (the *). I haven't tested on ranges that contain numbers.
=COUNTIF(range,"=?*")
To do this in one cell, without needing to create a separate column or use arrayformula{}, you can use sumproduct.
=SUMPRODUCT(LEN(XXX1:XXX30)>1)
If you have an array of True/False values then you can use -- to force them to be converted to numeric values like this:
=SUMPRODUCT(--(LEN(XXX1:XXX30)>1))
Credit to @greg who posted this in the comments - I think it is arguably the best answer and should be displayed as such. Sumproduct is a powerful function that can often be used to get around shortcomings in countif type formulae.
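The idea behind the -- trick is easier to see spelled out. A rough Python equivalent, using the sample names from the question: build the True/False list first, convert each flag to 1 or 0, then add them up.

```python
# Sample values from the question's column.
names = ["A", "Aruin", "Avolyn", "B", "Batracia", "Buna"]

# Step 1: one True/False flag per cell, like LEN(range)>1.
flags = [len(name) > 1 for name in names]

# Step 2: force each flag to a number (what -- does), then sum.
count = sum(int(flag) for flag in flags)

print(flags)
print(count)
```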
Create another list using an =ARRAYFORMULA(len(XXX1:XXX30)>1) and then do a COUNTIF based on that new list: =countif(XXY1:XXY30,true()).
A simple formula that works for my needs is =ROWS(FILTER(range,LEN(range)>X))
The Google Sheets criteria syntax seems inconsistent, because the expression that works fine with FILTER() gives an erroneous zero result with COUNTIF().
Here's a demo worksheet
Another approach is to use the QUERY function.
This way you can write a simple SQL like statement to achieve this.
For example:
=QUERY(XXX1:XXX30,"SELECT COUNT(X) WHERE X MATCHES '.{1,}'")
To explain the MATCHES criteria:
It is a regex that matches every cell that contains 1 or more characters.
The . operator matches any character.
The {1,} qualifies that you only want to match cells that have 1 or more characters in them.
Here is a link to another SO question that describes this method.

extract number from cell in openoffice calc

I have a column in open office like this:
abc-23
abc-32
abc-1
Now, I need to get only the sum of the numbers 23, 32 and 1 using a formula and regular expressions in calc.
How do I do that?
I tried
=SUMIF(F7:F16,"([:digit:].)$")
But somehow this does not work.
Starting with LibreOffice 6.4, you can use the newly added REGEX function to generically extract all numbers from a cell / text using a regular expression:
=REGEX(A1;"[^[:digit:]]";"";"g")
Replace A1 with the cell-reference you want to extract numbers from.
Explanation of REGEX function arguments:
Arguments are separated by a semicolon ;
A1: Value to extract numbers from. Can be a cell-reference (like A1) or a quoted text value (like "123abc"). The following regular expression will be applied to this cell / text.
"[^[:digit:]]": Match every character which is not a decimal digit. See also list of regular expressions in LibreOffice
The outer square brackets [] encapsulate the list of characters to search for
^ adds a NOT, meaning that every character not included in the search list is matched
[:digit:] represents any decimal digit
"": replace matching characters (every non-digit) with nothing = remove them
"g": replace all matches (don't stop after the first non-digit character)
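For anyone who wants to check the result outside Calc, the same "remove every non-digit, then sum" idea looks like this in Python. It is only a sketch with the three sample cells from the question; extract_digits is a made-up helper name.

```python
import re


def extract_digits(text: str) -> str:
    """Remove every character that is not a decimal digit 0-9."""
    return re.sub(r"[^0-9]", "", text)


cells = ["abc-23", "abc-32", "abc-1"]  # sample column from the question
total = sum(int(extract_digits(cell)) for cell in cells)

print([extract_digits(cell) for cell in cells])
print(total)
```

Note that digits from different places in the same cell get joined together, so "a1b2" becomes "12". That is fine for values like abc-23 but may not be what you want for messier text.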
Unfortunately Libre-Office only supports regex in find/replace and in search.
If this is a once-only deal, I would copy column A to column B, then use [data] [text to columns] in B and use the - as a separator, leaving you with all the text in column B and the numbers in column C.
Alternatively, you could use =VALUE(RIGHT(A1;LEN(A1)-FIND("-";A1))) in column B, then sum column B.
I think that this is not exactly what do you want, but maybe it can help you or others.
It is all about substrings (the MID function in Calc):
First: choose your cell (for example, the one containing "abc-23").
Second: enter the start position (for "british", start position 4 gives "tish").
After that: to return all the remaining text, pass the LEN function (length) of your cell ("abc-23") as the length parameter.
Code now looks like this:
D15="abc-23"
=MID(D15; 5; LEN(D15))
And the output is: 23
When you edit the numbers (23 in this example), there is no problem. However, if you change anything before them (the text "abc-"), the formula breaks because the start position is fixed at 5.
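The weakness of the fixed start position is easy to show in a short Python sketch. Both helper names are made up; the second one splits on the last dash instead of counting characters, which is one possible way around the problem, not the only one.

```python
def after_fixed_offset(text: str, start: int = 5) -> str:
    """Like MID(cell; 5; LEN(cell)): everything from a fixed 1-based position."""
    return text[start - 1:]


def after_last_dash(text: str) -> str:
    """Everything after the last '-', however long the prefix is."""
    return text.rpartition("-")[2]


for value in ["abc-23", "abcd-23"]:
    print(value, after_fixed_offset(value), after_last_dash(value))
```

With the longer prefix, the fixed offset returns the dash along with the number, while splitting on the dash still returns just the number.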
Paste the string in a cell, open the search and replace dialog (ctrl + f), enable regular expressions under the extended search options, then search for ([\s,0-9])([^0-9\s])+ and replace it with $1
Adjust the regex to your needs.
I didn't figure out how to do this in OpenOffice/LibreOffice directly. After frustrations in searching online and trying various formulas, I realised my sheet was a simple CSV format, so I opened it up in vim and used vim's built-in sed-like feature to find/replace the text in vim command mode:
:%s/abc-//g
This only worked for me because there were no other columns with this matching text. If there are other columns with the same text, then the solution would be a bit more complex.
If your sheet is not a CSV, you could copy the column out to a text file and use vim to find/replace, and then paste the data back into the spreadsheet. For me, this was a lot less frustrating than trying to figure this out in LibreOffice...
I won't bother with a solution without knowing if there really is interest, but, you could write a macro to do this. Extract all the numbers and then implement the sum by checking for contained numbers in the text.
