Keep result of SPLIT() as string instead of number for VLOOKUP() - google-sheets

In Google Sheets, I have this formula:
=ARRAYFORMULA( VLOOKUP( SPLIT(F2,", "), A2:B8, 2, FALSE) )
The intent is to take the comma-delimited string in cell F2 and find all values associated with the pieces. This works correctly, except when one of the pieces of the string looks like a number. For example, if F2 has the text a, 1.2, 1.2.3 then VLOOKUP will look for a as a string, and for 1.2.3 as a string, but will look for 1.2 as a number.
How can I coerce the result of SPLIT so that each piece remains a string?
I have a public copy of a test spreadsheet viewable here:
https://docs.google.com/spreadsheets/d/115WmV0vfXfaRgT0fVifJo86od3irZQy5gB3l-g6c7Ts/edit?usp=sharing
As background information, VLOOKUP treats strings and numbers differently. For example, given this table (where the formulae are shown for the first column):
A B
1 ="1.2" STR
2 =1.2 NUM
4 =VLOOKUP("1.2",A1:B2,2,0)
5 =VLOOKUP(1.2,A1:B2,2,0)
...the value shown in A4 will be "STR", and the value shown in A5 will be "NUM".

You can CONCAT a number with an empty string to convert it into its equivalent string representation.
CONCAT(1.2,"") yields "1.2"
To do this for every value, you must wrap the CONCAT() call in ARRAYFORMULA():
=ARRAYFORMULA(CONCAT(SPLIT(F2,", "),""))
The final formula thus becomes:
=TRANSPOSE(ARRAYFORMULA(VLOOKUP(ARRAYFORMULA(CONCAT(SPLIT(F2,", "),"")),A2:B8,2,FALSE)))
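The type mismatch can be illustrated outside Sheets. The Python sketch below is only an analogy: a dict stands in for the lookup table, the table contents are made up, and `auto_convert` is a hypothetical stand-in for the way SPLIT turns number-like pieces into numbers. It shows the lookup missing on the converted piece and succeeding once every piece is coerced back to text.

```python
# Hypothetical lookup table: keys are text, as in column A of the sheet.
table = {"a": "alpha", "1.2": "STR", "1.2.3": "dotted"}


def auto_convert(piece):
    """Mimic a split that turns number-like text into numbers (assumption)."""
    try:
        return float(piece)
    except ValueError:
        return piece


pieces = [auto_convert(p) for p in "a, 1.2, 1.2.3".split(", ")]
# The middle piece is now the number 1.2, so a text-keyed lookup misses it.
misses = [table.get(p) for p in pieces]

# Coerce every piece back to text before the lookup, like CONCAT(x, "").
as_text = [p if isinstance(p, str) else format(p, "g") for p in pieces]
hits = [table.get(p) for p in as_text]
```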

Related

How do I get the column letter of a single row where a particular value equals a test value?

I need to get the letter of the column that has a value in a given row that matches a given value in Google Sheets, assuming that no values in the row are duplicates.
For example, in the above screenshot, if the row is the first row, and the test value is Jun, the formula will return H.
Kind of meta. Appreciate any help.
Answer
The following formula should produce the behaviour you desire:
=REGEXREPLACE(ADDRESS(1,MATCH("Jun",A1:1,0),4),"[0-9]+",)
Explanation
The =MATCH formula returns the position of the item in a range which has a specified value. In this case, the specified value is "Jun" and the range is A1:1. The third argument, 0, requests an exact match, so the row does not need to be sorted.
=ADDRESS returns the A1 notation of a row and column specified by its number. In this case, the row is 1 and the column is whichever number is returned by the =MATCH. The 4 is there so that =ADDRESS returns H1 instead of $H$1 (absolute reference is its default).
=REGEXREPLACE looks through a string for a specified pattern and replaces that portion of the string with another string. In this case, the pattern to search for is any number. The last argument of =REGEXREPLACE is blank so it simply removes all numbers from the string.
What is left is the letter of the column where the value is found.
Functions Used:
=MATCH
=ADDRESS
=REGEXREPLACE
Now that Google Sheets has added Named Functions, there is an easier way to do this.
To use named functions, go to Data -> Σ Named Functions. A sidebar will pop up. At the bottom use "Add new function" to create a new named function.
I created two functions to do this:
First, COL_CHAR which will take a column reference and return its letter
Second, ALPHA_CHAR which takes a numeric input and converts it to letters. I made this one recursive, so if it's an n-letter column name, it will keep calling itself until it gets the full name.
COL_CHAR just converts the referenced column to a column number and passes that to ALPHA_CHAR. Its formula is:
=ALPHA_CHAR( column(cell) )
where cell is an Argument placeholder. Make sure to add that to the argument placeholder list in the sidebar.
Here is the (recursive) formula for ALPHA_CHAR. It works on num - 1 so that column 26 maps to Z instead of wrapping around:
=IF( num > 26, ALPHA_CHAR( INT( (num - 1) / 26 ) ), "") & CHAR( CODE("A") + MOD( num - 1, 26 ) )
where num is an Argument placeholder.
By making this recursive, even if Google Sheets expands to allow 4-letter (or more) columns in the future, it will keep iterating through every letter regardless of how many there are.
Then, to get the letter of a column in the spreadsheet, you just call COL_CHAR and pass the cell in the column you want, for example:
= COL_CHAR(BK1)
Will return the string "BK"
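The recursion is easy to check outside Sheets. Below is a minimal Python sketch of the same idea, not the named function itself; the name `alpha_char` simply mirrors ALPHA_CHAR. It works on `num - 1` so that multiples of 26 map to Z rather than wrapping to zero.

```python
def alpha_char(num: int) -> str:
    """Convert a 1-based column number to its letters (1 -> A, 27 -> AA)."""
    if num < 1:
        raise ValueError("column numbers start at 1")
    # Shift to 0-based so that 26 maps to Z rather than wrapping to 0.
    prefix = alpha_char((num - 1) // 26) if num > 26 else ""
    return prefix + chr(ord("A") + (num - 1) % 26)
```

For example, `alpha_char(63)` gives "BK", matching the COL_CHAR(BK1) call above, since BK is the 63rd column.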

How do I write a formula to code numbers into letters?

We work in gemstones and would like to figure out a formula for codifying the prices. That way, depending on the client, we could adjust the price without them seeing our baseline.
$1710 would show on the tag: 1GA0.
Essentially, the alphabet would be assigned to the numbers, we'd like to keep the first digit but anything after that would be a letter.
Is there a way to do this in google sheets to generate this?
Suppose the following are true:
1.) Your raw price data runs A2:A.
2.) Your raw price data are actual numbers.
Use the following array formula in the second cell of an otherwise empty column:
=ArrayFormula(IF(A2:A="",,REGEXREPLACE(LEFT(A2:A)&TRANSPOSE(QUERY(TRANSPOSE(IF(SPLIT(REGEXREPLACE(MID(A2:A&REPT("~",10),2,10),"(.)","$1~"),"~")&""="0",0,CHAR(64+SPLIT(REGEXREPLACE(MID(A2:A&REPT("~",10),2,10),"(.)","$1~"),"~"))))," ",10)),"\s|#","")))
This will produce all converted results for A2:A.
As this is a unique custom formula that will not be often requested by future site visitors, and since it is rather complex, I'm providing it as-is and without explanation at this time.
Can you try this:
Formula for B1:
=JOIN("", LEFT(A1, 1), ARRAYFORMULA(IFERROR(SUBSTITUTE(ADDRESS(1, MID(A1, ROW($A$2:INDIRECT("A" & LEN(A1))), 1) * 1, 4), "1", ""), "0")))
Step by step formula behavior:
Step 1: splits the 2nd digit to the last digit into separate cells
Step 2: convert valid numbers (1-9) into letters
Step 3: convert #VALUE to "0"
Step 4: combine the 1st number and the converted value.
Note:
1st number and zeroes are as is, everything else is converted.
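The four steps can be sketched in Python to make the mapping explicit. This is an illustration under stated assumptions, not a translation of the formula: `encode_price` is a hypothetical name, and the input is assumed to be a positive whole number with no $ or thousands separator.

```python
def encode_price(price: int) -> str:
    """Keep the first digit; map later digits 1-9 to A-I and leave 0 as 0."""
    digits = str(price)
    coded = [
        d if d == "0" else chr(ord("A") + int(d) - 1)
        for d in digits[1:]
    ]
    return digits[0] + "".join(coded)
```

With the price from the question, `encode_price(1710)` gives "1GA0".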
If the $ is required and the price also contains a thousands separator (,), adjust the formula as follows:
Adjusted formula:
=JOIN("", MID(TO_TEXT(A1), 2, 1), ARRAYFORMULA(IFERROR(SUBSTITUTE(ADDRESS(1, MID(REGEXREPLACE(TO_TEXT(A1), ",", ""), ROW($A$3:INDIRECT("A" & LEN(REGEXREPLACE(TO_TEXT(A1), ",", "")))), 1) * 1, 4), "1", ""), "0")))
Adjustments:
LEFT(A1, 1) -> MID(TO_TEXT(A1), 2, 1)
Because the text now starts with $, we need the 2nd character instead of the 1st.
A1 -> REGEXREPLACE(TO_TEXT(A1), ",", "")
Reading the cell directly yields the underlying number, while the length has to be measured on the displayed text. To keep the two consistent, convert the number to text with TO_TEXT and then remove the , so that the digits are processed as a string.
$A$2 -> $A$3
Conversion now starts at the 3rd character instead of the 2nd.
References:
Convert number to letter
Split number into digits

Using multiple SUBSTITUTE functions dynamically

Let's say I have a list of strings and I want to remove specific words from them. I can easily use multiple SUBSTITUTE functions, for example, this will remove the strings in B2, B3 and B4 from the string in A2:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,$B$2,""),$B$3,""),$B$4,"")
How can I make this dynamic, so that when I add more terms to remove in the B column they'll be removed automatically from A2? I tried the following methods but they didn't work:
1 - add the B cells as an array
=SUBSTITUTE(A2,{$B$2:$B$4},"") or =SUBSTITUTE(A2,{$B$2,$B$3,$B$4},"")
2 - Make a single condition
cat|donkey|mouse
3 - Using INDIRECT and CONCATENATE - I built the correct function as a string (using REPT and CONCATENATE) and tried to activate it with INDIRECT, but this also failed.
Here's the spreadsheet (column A holds the strings to clean, B the words to remove, D the manual method that works; F, H and K are the 3 failed attempts).
https://docs.google.com/spreadsheets/d/15u8qZ0xQkjvTRrJca6AInoQ4aPkijccouAETE4Gyr9I/edit#gid=0
In the 'Copy' of the tab I entered
=ArrayFormula(IF(LEN(A2:A), REGEXREPLACE(A2:A, TEXTJOIN("|", 1, B2:B),),))
See if that works for you?
EXPLANATION
LEN(A2:A) limits the output to the rows that have a value in column A.
REGEXREPLACE uses a regular expression to replace parts of the string. That regular expression is constructed by the TEXTJOIN function.
TEXTJOIN combines the text from the range B2:B, with a specifiable delimiter separating the different texts. Here the pipe character (which means 'or' in regex) is used. The second parameter of this function is set to TRUE (or 1) so that empty cells selected in the text arguments won't be included in the result.
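The same join-with-pipe idea can be sketched in Python. Treat this as an illustration only: Sheets uses the RE2 engine while Python uses its own `re` module, `remove_terms` is a hypothetical name, and the `re.escape` step is an addition that the formula above does not perform (it matters only if a term contains regex metacharacters).

```python
import re


def remove_terms(text, terms):
    """Delete every occurrence of any non-empty term from text."""
    # Skip blanks (like TEXTJOIN's ignore-empty flag) and escape regex
    # metacharacters; the escaping is an addition, not part of the formula.
    pattern = "|".join(re.escape(t) for t in terms if t)
    return re.sub(pattern, "", text) if pattern else text
```

Note that the surrounding spaces are left in place, so removing a word from the middle of a string leaves a double space behind.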
REFERENCES
TEXTJOIN
REGEXREPLACE
You can also try-
=TEXTJOIN(" ",TRUE,FILTER(SPLIT(A2," "),ISERROR(MATCH(SPLIT(A2," "),$B$2:$B$7,0))))

How can I extract the exact part of the text on the cell of google sheet when the text can change?

In a Google Sheets spreadsheet, I have the cell A1 with value "people 12-14 ABC". I want to extract the exact match "ABC" into another cell. The contents of cell A1 can change, e.g. to "woman 60+ ABCD". For this input, I would want to extract "ABCD". If A1 was instead "woman 12-20 CAE", I would want "CAE".
There are 5 possible strings that the last part may be: (ABC, ABCD, AB, CAE, C), while the first portions are very numerous (~400 possibilities).
How can I determine which of the 5 strings is in A1?
If the first part contains only lowercase letters or numbers and the last part contains only uppercase letters:
=REGEXREPLACE(D3;"[^A-E]";)
Alternatively, anchor on the space before the last part:
=REGEXEXTRACT(A31;"\s([A-E]+)$")
If you can guarantee well-formatted input, this is simply a matter of splitting the contents of A1 into its component parts (e.g. "gender_filter", "age range", and "my 5 categories"), and selecting the appropriate index of the resultant array of strings.
To convert a cell's contents into an array of that content, the SPLIT() function can be used.
B1 = SPLIT(A1, " ")
would put entries into B1, C1, and D1, where D1 has the value you want, provided your gender filter and age ranges contain no spaces.
Since you probably don't want to have those excess junk values, you want to contain the result of split entirely in B1. To do this, we need to pass the array generated by SPLIT to a function that can take a range or array input. As a bonus, we want to sub-select a part of this range (specifically, the last one). For this, we can use the INDEX() function
B1 = INDEX(SPLIT(A1, " "), 1, COUNTA(SPLIT(A1, " ")))
This tells the INDEX function to access the first row and the last column of the range produced by SPLIT, which for the inputs you have provided, is "ABC", "ABCD", and "CAE".
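Both approaches from this thread (take the last piece after splitting, or anchor a regex on the final space) can be sketched in Python. The function names are hypothetical, and the input is assumed to be a non-empty, space-separated string like the examples in the question.

```python
import re


def last_token(cell: str) -> str:
    """Return the last whitespace-separated piece of the cell text."""
    return cell.split()[-1]


def trailing_code(cell: str):
    """Regex variant: capture a run of A-E after the final whitespace."""
    match = re.search(r"\s([A-E]+)$", cell)
    return match.group(1) if match else None
```

The split version returns whatever comes last, while the regex version returns None when the text does not end in letters from A to E.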

Understanding "select X where Y = ..." in a google sheets QUERY function

I'm trying to figure out how to parse this google sheets function:
=IFERROR(QUERY($A$2:$F$1000, "select F where A="&A4&" "),"")
I'm having trouble understanding the "select F where A="&A4&" part. The function is applied to an entire column. For some of the rows, this function returns a number, for others it returns a blank. The A column which it is referencing is entirely composed of 6-digit numbers.
What is going on such that sometimes the function returns a number and sometimes a blank?
Also, why are the ampersands important? If I take away the ampersands, the function returns an error.
You need to fix the quotes around A4.
=IFERROR(QUERY($A$2:$F$1000, "select F where A='"&A4&"'"),"")
'"&A4&"'
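means the contents of cell A4, wrapped in single quotes.
The & means to concatenate: here it splices the literal contents of A4 into the query text.
Notice that this part of the formula contains four double quotes (") in two pairs: the first pair encloses select F where A=' and the second pair encloses the closing '.
The single quotes make the contents of A4 a string literal, so this form is meant for text values in column A; numbers are written without quotes in the query language.
where A=
keeps the rows of A2 to A1000 whose column A matches the contents of A4.
It would definitely match on row 4 (and on any other row whose column A cell has the same contents), in which case it would return F4.
"select F"
means show/return column F in the results.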
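It can help to look at the text that the concatenation actually produces. The Python sketch below only builds that string; `build_query` is a hypothetical helper, and the unquoted variant corresponds to the formula in the question (minus its trailing space).

```python
def build_query(a4, quote=True):
    """Return the query text produced by joining the pieces around A4."""
    value = str(a4)
    # With quote=True the value becomes a string literal in the query.
    literal = f"'{value}'" if quote else value
    return f"select F where A={literal}"
```

Printing the result for a sample value makes it clear that QUERY never sees the cell reference, only the final text.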
You should try the following:
=arrayformula(if(eq(F2:F,A2:A),F2:F,))
It is hard to suggest the right formula without seeing what you are working with or what the expected result looks like, so if this doesn't work, please share your sample spreadsheet.
