Why are hyphens screwing up my duplicate finding conditional formatting?


I have a sheet I use as a database of scientific papers. I copy journal article titles from different sources (some could be from an email, others are links on a web page, or just the title from the article page). I have conditional formatting set to let me know if I'm adding a title that is already in the list. I've noticed that there are some titles that are "ignoring" the conditional formatting, and it looks like there are hyphens in all of the offenders. If I remove the hyphens, the conditional formatting works. So there is some 'difference' in the hyphens originating from the same title that is preventing the conditional formatting from viewing them as identical.
Shared sheet
Examples of offending titles:
End-to-end continuous bioprocessing: impact on facility design, cost of goods and cost of development for monoclonal antibodies
End‐to‐end continuous bioprocessing: impact on facility design, cost of goods and cost of development for monoclonal antibodies
End‐to‐end continuous bioprocessing: Impact on facility design, cost of goods, and cost of development for monoclonal antibodies
What is this difference, and is there a way to fix it? Do I need to write a script to find/replace the hyphens to get this to work?
TIA

Just because characters appear identical does not mean that they are identical. You have fallen foul of the similarity between the hyphen and dashes. Visually, they are almost identical: dashes are slightly wider than the hyphen.
Dashes are regarded as "special characters" (i.e. they aren't keys on the keyboard) but they are used widely in HTML. So if, for instance, you copied an item from a website then you might unwittingly have copied dashes rather than hyphens.
You can identify the exact nature of a character by using the CODE function.
You ask "What is this difference, and is there a way to fix it? Do I need to write a script to find/replace the hyphens to get this to work?"
WHAT IS THIS DIFFERENCE?
It's important to recognise that though these examples appear identical, there are other differences that are more than just hyphens vs dashes.
Example#1 - Hyphen - CODE returns "45"
Example#2 - Dash - CODE returns "8208"
Example#3 - Dash - CODE returns "8208".
But other factors also stop Example#3 from triggering the conditional formatting rule:
Length = 128 (vs 127 for the other examples). There is an additional comma (after "cost of goods")
the word "Impact" is spelled with an upper case "I" (lower case for the other examples)
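The same check that CODE performs can be sketched outside Sheets. The following Python snippet is only an illustration (the helper name `describe_non_ascii` and the shortened sample strings are made up for this example); it lists every non-ASCII character in a string with its code point and Unicode name. Code 45 is the keyboard character, which Unicode names HYPHEN-MINUS, while code 8208 is U+2010, which Unicode names HYPHEN.

```python
import unicodedata


def describe_non_ascii(text):
    """List (position, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, ord(ch), unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


# Shortened versions of the titles; the pasted one uses U+2010 (decimal 8208).
typed = "End-to-end continuous bioprocessing"
pasted = "End\u2010to\u2010end continuous bioprocessing"

print(describe_non_ascii(typed))   # nothing to report for the typed title
print(describe_non_ascii(pasted))  # two entries, both code point 8208
print(typed == pasted)             # the strings look alike but are not equal
```

Running a check like this on a suspect title shows which characters differ even when the titles look identical on screen.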
MOVING FORWARD
Do you need a script? No (IMHO)
Is there a way to fix it? As outlined above, there are more differences than just hyphens and dashes. And, as time goes by, the number & type of difference might increase. However, there is a solution to the "Hyphen Vs Dash" problem which is the focus of this question.
FORMULA AND FORMATTING
Your data is currently in Column A and Column A is also subject to conditional formatting.
Remove the conditional formatting rules from Column A
Insert this formula in cell B2
=arrayformula(if(LEN($A2:A)-LEN(SUBSTITUTE($A2:A, char(8208), ""))=0,A2:A,arrayformula(substitute(A2:A,char(8208),char(45)))))
Conditional Formatting for Column B
select the range in Column B
select, Format, Conditional Formatting.
Select "Custom Formula is" and enter this formula: =countif($B$2:$B2,B2)>1
Select a preferred Formatting Style and then click Done.
FORMULA LOGIC
arrayformula enables the formula to automatically populate all the relevant cells in the column.
LEN($A2:A)-LEN(SUBSTITUTE($A2:A, char(8208), ""))=0
a test for dashes in a string. It substitutes an empty string for any/all instances of a dash (char(8208)), then compares the original length to the adjusted length. If the difference is zero, then there are no dashes in the string.
IF: Test for any dashes,
if the string doesn't contain any dashes then use that value
else, the string must contain dashes, so replace any dashes with hyphens and use the substituted value
arrayformula(substitute(A2:A,char(8208),char(45)))
The conditional formatting rule then looks for duplicate values in the column, and formats any/all duplicate values.
You'll note that Example#3 is not flagged as a duplicate despite containing dashes. This is because of the spelling of "Impact" and the extra comma after "cost of goods".
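The overall logic (normalise the dashes, then flag any value that has already appeared higher up) can be sketched in Python. This is an illustration under stated assumptions, not a replacement for the Sheets formula: the function names are hypothetical, only char(8208) is normalised, and the comparison is an exact, case-sensitive string match, which may not mirror every detail of how COUNTIF compares text.

```python
DASH = "\u2010"  # the character that CODE reports as 8208


def normalise(title):
    """Mirror SUBSTITUTE(title, char(8208), char(45))."""
    return title.replace(DASH, "-")


def flag_duplicates(titles):
    """Return True for each title whose normalised form appeared earlier in the list."""
    seen = set()
    flags = []
    for title in titles:
        key = normalise(title)
        flags.append(key in seen)
        seen.add(key)
    return flags


titles = [
    "End-to-end continuous bioprocessing: impact on facility design, cost of goods and cost of development for monoclonal antibodies",
    "End\u2010to\u2010end continuous bioprocessing: impact on facility design, cost of goods and cost of development for monoclonal antibodies",
    "End\u2010to\u2010end continuous bioprocessing: Impact on facility design, cost of goods, and cost of development for monoclonal antibodies",
]
print(flag_duplicates(titles))
```

Only the second title is flagged; the third still differs after normalisation because of the capital letter and the extra comma.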

Related

How to make a word mentioned in column A bold in column B where that word is mentioned in a sentence (Google Sheets)?

Basically, I am trying to create conditional formatting that does the following:
Simple example
I really would like to just make bold a word (not a whole sentence) in column B that is mentioned in column A.
I tried many different formulas in the "Value or formula" field:
=REGEXMATCH($B1,A1)
=REGEXMATCH($B1,"<"&A1&">")
=REGEXMATCH(B1,"\b"&A1&"\b")
=ARRAYFORMULA(REGEXMATCH(SPLIT(B1," "),"\b"&A1&"\b"))
...but none of these work.

This is actually sort of possible, if you are willing to stretch your definition of 'bold'. It's absolutely right that you can't make a particular part of a text string within a cell bold (or italic/underlined/coloured/etc.) in a way that replicates the effect if you manually select an area of a text string and format it using the menu bar options - effectively you are setting some metadata which sits 'outside' of the cell so isn't accessible to a formula.
However, Unicode fonts generally contain a region within their character map which corresponds to a bold version of the underlying font, and so with a little bit of trickery it's possible to substitute characters with their bold version. Here's a formula which achieves something like your original request:
=let(wordstobold,A1:A2,
sentencestosplit,B1:B2,
boldwords,map(wordstobold,lambda(eachword,concatenate(map(split(regexreplace(eachword,"(.)","$1_"),"_"),lambda(eachchar,filter(BOLDCHARS(),exact(STDCHARS(),eachchar))))))),
splitsentences,map(sentencestosplit,wordstobold,lambda(sentence,word,split(sentence,word,false))),
presplits,choosecols(splitsentences,1),
postsplits,choosecols(splitsentences,2),
map(presplits,boldwords,postsplits,lambda(x,y,z,x&y&z)))
The formula exploits the fact that SPLIT treats a multicharacter delimiter as a single entity when its third argument is FALSE, so if you pass the word to be bolded to SPLIT as the delimiter for the sentence it will split the sentence into two halves, presplit and postsplit. The substitution of the normal characters for bold characters could be done in a number of ways, but what I'm doing here is to explode the word to be bolded into individual characters and then MAP the equivalent bold character onto each one using FILTER/EXACT.
A couple of 'helper' Named Functions are required: STDCHARS() & BOLDCHARS(); these don't accept any parameters and are a means of storing the character sets for the part of the formula which swaps characters for their bold equivalents. It would be possible to integrate these into the formula, albeit at the expense of readability.
STDCHARS():
={"a";"b";"c";"d";"e";"f";"g";"h";"i";"j";"k";"l";"m";"n";"o";"p";"q";"r";"s";"t";"u";"v";"w";"x";"y";"z";"A";"B";"C";"D";"E";"F";"G";"H";"I";"J";"K";"L";"M";"N";"O";"P";"Q";"R";"S";"T";"U";"V";"W";"X";"Y";"Z";"0";"1";"2";"3";"4";"5";"6";"7";"8";"9"}
BOLDCHARS():
={"𝗮";"𝗯";"𝗰";"𝗱";"𝗲";"𝗳";"𝗴";"𝗵";"𝗶";"𝗷";"𝗸";"𝗹";"𝗺";"𝗻";"𝗼";"𝗽";"𝗾";"𝗿";"𝘀";"𝘁";"𝘂";"𝘃";"𝘄";"𝘅";"𝘆";"𝘇";"𝗔";"𝗕";"𝗖";"𝗗";"𝗘";"𝗙";"𝗚";"𝗛";"𝗜";"𝗝";"𝗞";"𝗟";"𝗠";"𝗡";"𝗢";"𝗣";"𝗤";"𝗥";"𝗦";"𝗧";"𝗨";"𝗩";"𝗪";"𝗫";"𝗬";"𝗭";"𝟬";"𝟭";"𝟮";"𝟯";"𝟰";"𝟱";"𝟲";"𝟳";"𝟴";"𝟵"}
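The character-swapping idea can also be sketched in Python, where the two lookup lists collapse into a translation table. This is a rough equivalent rather than a port of the formula: the function name is hypothetical, the table is built from the code points where the bold sans-serif runs begin (U+1D5D4 for A-Z, U+1D5EE for a-z, U+1D7EC for 0-9), and unlike the SPLIT-based formula it replaces every occurrence of the word instead of splitting the sentence into two halves.

```python
import string

# (plain characters, code point where the matching bold sans-serif run starts)
_RUNS = [
    (string.ascii_uppercase, 0x1D5D4),
    (string.ascii_lowercase, 0x1D5EE),
    (string.digits, 0x1D7EC),
]

# Maps the code point of each plain character to its bold look-alike.
BOLD_TABLE = {
    ord(ch): chr(start + offset)
    for chars, start in _RUNS
    for offset, ch in enumerate(chars)
}


def bold_word_in_sentence(sentence, word):
    """Swap each occurrence of `word` in `sentence` for bold look-alike characters."""
    if not word:
        return sentence
    return sentence.replace(word, word.translate(BOLD_TABLE))


print(bold_word_in_sentence("the cat sat on the mat", "cat"))
```

As with the formula, the result is different text rather than formatted text, so searches and comparisons against the original word will no longer match the bolded copy.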

Google Sheets - Split Data

I have these data in Google Sheets
$71,675_x000d_
$80,356_x000d_
$107,361_x000d_
$123,393_x000d_
$116,878
I want them to be split into different columns.
However, when I do so using Data > Split Data into Different Columns, it separates $71 and 675_x000d_, but I need $71,675 with the _x000d_ removed.
Please note that the last number doesn't have those extra characters.
Please help.

Your post says you want to "remove the x000d" (that is, extract only the dollar amounts). That said, let's say your raw data starts in A2 (i.e., the data is in A2:A). Place the following formula into the first cell of another otherwise empty column (e.g., B1):
=ArrayFormula({"Extracted";IF(A2:A="",,IFERROR(REGEXEXTRACT(SUBSTITUTE(A2:A&"_",",",""),"\d+")))})
How It Works:
ArrayFormula(...) signifies that we'll be processing an entire range and not just one cell.
The outer curly brackets {...} signify that a virtual array will be formed from non-like or non-contiguous pieces.
The first piece of the virtual array is the header. Here, that is "Extracted"; but you can change it as you like.
The semicolon means "place the next information below the previous part."
IF(A2:A="",, ...) is a standard check that basically says "Don't try to process any blank cells in Column A"; or alternatively worded, "If any cell in A2:A is blank/null, do nothing."
Skipping the REGEXEXTRACT for now, A2:A&"_" appends an underscore to every entry in A2:A. This allows entries in A2:A that are just a dollar amount (e.g., from the post, $116,878) to have a consistent symbol following them if not already there. (And adding the underscore to anything that already has an underscore won't matter, because we won't be extracting that far out.)
Now that we've got the new strings, we SUBSTITUTE every comma for a null (i.e., delete all commas).
Finally, REGEXEXTRACT will take all of the virtually modified strings and extract \d+, which means only digits (\d) in an unbroken sequence of any length greater than 0 (+). Note that REGEXEXTRACT will only return the first such match it encounters as written, so 000 will not be extracted.
An IFERROR wrap is placed around the REGEXEXTRACT, just in case you have any situations in real life that don't have any sequence of numbers at all. In these cases, nothing will be returned (whereas, without the IFERROR, an error would have been returned).
Once the extraction is done, you can apply Format > Number > Currency (rounded) to the entire column.
Addendum:
After an additional comment (below), it appears that the raw data is in Column T, that all five entries are in one cell and that the OP would like all five amounts extracted across each row. That being the case, assuming that Columns U:Y are empty to start, place the following in cell U1 (not U2):
=ArrayFormula({"Val1","Val2","Val3","Val4","Val5";IF(T2:T="",,IFERROR(REGEXEXTRACT(SUBSTITUTE(T2:T&"_",",",""),REPT("\$(\d+)[^\$]*",5))))})
This works much the same way as the previous formula. The differences:
There are five headers now.
You'll see REPT(...,5) here. This is an easy way to repeat the same extraction five times.
That repeated extraction is now the following:
\$(\d+)[^\$]*
The backslash in front of the dollar signs means to treat those symbols as literals instead of as their usual meaning (i.e., end-of-string). So the extraction reads as follows:
\$ anything that starts with a dollar sign
(\d+) extract what is between the ( ), which is any group of digits
[^\$]* followed by any number (including 0) of characters that are not dollar signs
As I said, the REPT will repeat this five times; so five groups matching this pattern will be extracted.
Understand that if any cell does not yield five matching extractions (because one of its groups doesn't follow the pattern exactly), nothing will be returned for that cell.
Be sure to format U:Y as currency rounded, or you will wind up with some of those numbers translating as raw dates and therefore being completely off.
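For comparison, the extraction can be sketched in Python. This is an illustration only, with a hypothetical function name and a sample cell assembled from the values in the question (assuming the entries are separated by line breaks). It takes a slightly different route from the REPT formula: it keeps the commas inside the match and strips them afterwards, so it works for any number of amounts in a cell rather than exactly five.

```python
import re

# A dollar sign followed by digits, possibly with thousands separators.
AMOUNT = re.compile(r"\$(\d[\d,]*)")


def extract_amounts(cell):
    """Return every dollar amount in `cell` as an integer, ignoring _x000d_ debris."""
    return [int(match.replace(",", "")) for match in AMOUNT.findall(cell)]


cell = "$71,675_x000d_\n$80,356_x000d_\n$107,361_x000d_\n$123,393_x000d_\n$116,878"
print(extract_amounts(cell))
```

The digits inside `_x000d_` are never picked up because they are not preceded by a dollar sign.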

Please use the following formula and format cells to your needs.
=ArrayFormula(IFERROR(SPLIT(REGEXREPLACE(A2:A,"\n|_x000d_","√"),"√")))
The big advantage of the above formula compared to others is that it works for any number of lines included within a single cell.
Functions used:
ArrayFormula
IFERROR
SPLIT
REGEXREPLACE

You can use SPLIT function:
=ArrayFormula(IF(LEN(A:A),SPLIT(A:A,"_x000d_",FALSE),""))

This regex matches in BBEdit and regex.com, but not on iOS - why?

I am trying to "highlight" references to law statutes in some text I'm displaying. These references are of the form <number>-<number>-<number>(char)(char), where:
"number" may be whole numbers 18 or decimal numbers 12.5;
the parenthetical terms are entirely optional: zero or one or more;
if a parenthetical term does exist, there may or may not be a space between the last number and the first parenthesis, as in 18-1.3-401(8)(g) or 18-3-402 (2).
I am using the regex
((\d+(\.\d+)*-){2}(\d+(\.\d+)*))( ?(\([0-9a-zA-Z]+\))*)
to find the ranges of these strings and then highlight them in my text. This expression works perfectly, 100% of the time, in all of the cases I've tried (dozens), in BBEdit, and on regex101.com and regexr.com.
However, when I use that exact same expression in my code, on iOS 12.2, it is extremely hit-or-miss as to whether a string matching the regex is actually found. So hit-or-miss, in fact, that a string of the exact same form of two other matches in a specific bit of text is NOT found. E.g., in this one paragraph I have, there are five instances of xxx-x-xxx; the first and the last are matched, but the middle three are not matched. This makes no sense to me.
I'm using the String method func range(of:options:range:locale:) with options of .regularExpression (and nil locale) to do the matching. I see that iOS uses ICU-compatible regexes, whereas these other tools use PCRE (I think). But, from what I can tell, my expression should be compatible and valid for my case with the ICU parsing. But, something is definitely different, and I cannot figure out what it is.
Anyone? (I'm going to give NSRegularExpression a go and see if it behaves differently, but I'd still like to figure out what's going on here.)
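For reference, the pattern can be exercised outside iOS. The sketch below uses Python's `re` engine rather than ICU, so it only confirms what the pattern itself matches and says nothing about the iOS behaviour; the function name and the sample sentences are made up for this example.

```python
import re

# The expression from the question, unchanged.
STATUTE = re.compile(r"((\d+(\.\d+)*-){2}(\d+(\.\d+)*))( ?(\([0-9a-zA-Z]+\))*)")


def find_statutes(text):
    """Return every statute reference found in `text`."""
    # The optional space in the pattern can be captured even when no
    # parenthetical follows, so trailing whitespace is stripped here.
    return [match.group(0).rstrip() for match in STATUTE.finditer(text)]


print(find_statutes("See 18-1.3-401(8)(g) and 18-3-402 (2), but not 12.5-7."))
```

One detail worth knowing when highlighting ranges: because the space before the parentheticals is optional, a match with no parenthetical can still include the space that follows the last number.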

Find a time in some text, allowing for multiple formats

I have the following formula.
=INDEX(Lookups!$L$1:$L$726,MAX(IF(ISERROR(FIND(Lookups!$L$1:$L$726,$A1)),-1,1)*(ROW(Lookups!$L$1:$L$726)-ROW(Lookups!$L$1)+1)))
The idea is to pick up the time for a certain item from an email (already parsed into Google Sheets). The emails come in various formats so I'm unable to specify the location in the text string to look at specifically.
The times are not always written in a conventional time format either so as you can see from the formula there are 726 possibilities that I work with. For example, sometimes the time could be written as 13:15 and others as 1:15 or even 1.15 or 1-15 etc etc.
The issue I have is that the above formula seems to start with the smallest string possible and work 'upwards', therefore picking up 3:15 from the email string rather than the full time string which is 13:15. Is there a way I can amend the formula to search for the longest string first, in that example looking for 13:15 and then only searching for 3:15 if the prior is not found.
Hope that makes sense. Thanks in advance for any assistance.

One way is to reorder those 726 possibilities so that you have the longer ones first. You can do it by creating another column with =len(L1), copying that formula down, and sorting the range by this new column in descending order.
But it would be easier to use regexextract instead, because regular expressions are designed to solve the problem you are facing. For example,
=regexextract(L1, "\b\d{1,2}[:.-]\d{1,2}\b")
picks up all of the variants 1:15, 13:15, 1-15 or 13.15. (It looks for the following sequence: word boundary, 1-2 digits, one of characters :, ., -, then 1-2 digits, and another word boundary.) The match is greedy, so it will find 13:15 when it's there, not just 3:15.
A more complex form
=regexextract(L1, "(?i)\b\d{1,2}[:.-]\d{1,2} ?(?:am|pm)?\b")
also supports "am" or "pm" after the time, case-insensitive and possibly separated by a space from the digits.
This can be refined further, for example the hours part would be more precisely stated as [0-2]?\d instead of \d{1,2}, and the minutes part as [0-5]?\d.
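The two patterns can be tried out in Python as a quick check. This is a sketch with a hypothetical helper name and invented sample sentences; Python's `re` syntax is close to, but not identical to, the engine behind REGEXEXTRACT, so treat it as an approximation of what Sheets would return.

```python
import re

SIMPLE = re.compile(r"\b\d{1,2}[:.-]\d{1,2}\b")
WITH_AM_PM = re.compile(r"(?i)\b\d{1,2}[:.-]\d{1,2} ?(?:am|pm)?\b")


def first_time(text, pattern=SIMPLE):
    """Return the first time-like string in `text`, or None if there is none."""
    match = pattern.search(text)
    # The optional space in WITH_AM_PM can be captured even when no am/pm
    # follows, so surrounding whitespace is stripped from the result.
    return match.group(0).strip() if match else None


print(first_time("Pickup at 13:15 from the depot"))
print(first_time("Pickup at 1-15 from the depot"))
print(first_time("Pickup at 1.15 PM sharp", WITH_AM_PM))
```

The leading word boundary is what stops the search from starting in the middle of 13:15 and returning 3:15.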

Font weight BOLD using formula in cell

I'm using Google Spreadsheet to make some calculations.
At the bottom of my table I need SUM, AVG and some others and everything is fine.
But I made a long cell with all the text, like this:
="Hi at all, this is my report: "&SUM(B:B)&" are the sum of my fingers, "&AVG(C:C)&" is the avg of my sons."
and so on.
Everything works. But I need to bold the SUM(B:B).
I won't use single cells for math results.
I tried with ....<b>"&SUM(B:B)&"</b>.... but obviously I take <b></b> in my cell and not the bold font weight.
How to style from formula in cell?

Recently I had the same problem, but I found a solution. This solution works well in google sheets.
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(E3,"A","𝐀"),"B","𝐁"),"C","𝐂"),"D","𝐃"),"E","𝐄"),"F","𝐅"),"G","𝐆"),"H","𝐇"),"I","𝐈"),"J","𝐉"),"K","𝐊"),"L","𝐋"),"M","𝐌"),"N","𝐍"),"O","𝐎"),"P","𝐏"),"Q","𝐐"),"R","𝐑"),"S","𝐒"),"T","𝐓"),"U","𝐔"),"V","𝐕"),"W","𝐖"),"X","𝐗"),"Y","𝐘"),"Z","𝐙"),"0","𝟎"),"1","𝟏"),"2","𝟐"),"3","𝟑"),"4","𝟒"),"5","𝟓"),"6","𝟔"),"7","𝟕"),"8","𝟖"),"9","𝟗")
If you want to improve your formula, use the characters below:
ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789abcdefghijklmnopqrstuvwxyz
𝐀𝐁𝐂𝐃𝐄𝐅𝐆𝐇𝐈𝐉𝐊𝐋𝐌𝐍𝐎𝐏𝐐𝐑𝐒𝐓𝐔𝐕𝐖𝐗𝐘𝐙𝟏𝟐𝟑𝟒𝟓𝟔𝟕𝟖𝟗𝐚𝐛𝐜𝐝𝐞𝐟𝐠𝐡𝐢𝐣𝐤𝐥𝐦𝐧𝐨𝐩𝐪𝐫𝐬𝐭𝐮𝐯𝐰𝐱𝐲𝐳

At the current time, in Google Sheets you cannot bold partial components of any formula. The solution mentioned using the 'B' or bold icon only works if the desired bold selection is an entire cell containing a formula, or partial of a cell that does not contain a formula.

Yes, this is possible.
My own solution is at the bottom, but I find the one added by Adrian Mihai Nemes here to be cleaner. So, I have expanded that solution here to work for lowercase as well as uppercase and numbers. This should work for text containing anything, including newlines, single quotes, emojis, etc.
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(B25,"A","𝐀"),"B","𝐁"),"C","𝐂"),"D","𝐃"),"E","𝐄"),"F","𝐅"),"G","𝐆"),"H","𝐇"),"I","𝐈"),"J","𝐉"),"K","𝐊"),"L","𝐋"),"M","𝐌"),"N","𝐍"),"O","𝐎"),"P","𝐏"),"Q","𝐐"),"R","𝐑"),"S","𝐒"),"T","𝐓"),"U","𝐔"),"V","𝐕"),"W","𝐖"),"X","𝐗"),"Y","𝐘"),"Z","𝐙"),"a","𝐚"),"b","𝐛"),"c","𝐜"),"d","𝐝"),"e","𝐞"),"f","𝐟"),"g","𝐠"),"h","𝐡"),"i","𝐢"),"j","𝐣"),"k","𝐤"),"l","𝐥"),"m","𝐦"),"n","𝐧"),"o","𝐨"),"p","𝐩"),"q","𝐪"),"r","𝐫"),"s","𝐬"),"t","𝐭"),"u","𝐮"),"v","𝐯"),"w","𝐰"),"x","𝐱"),"y","𝐲"),"z","𝐳"),"0","𝟎"),"1","𝟏"),"2","𝟐"),"3","𝟑"),"4","𝟒"),"5","𝟓"),"6","𝟔"),"7","𝟕"),"8","𝟖"),"9","𝟗")
For my own solution, this will bold all lowercase, uppercase, and numbers. It should work for text containing just about anything, including newlines, single quotes, etc. (perhaps not emojis?).
Just replace the single A2 reference with whatever it is you want bolded:
=ARRAYFORMULA(JOIN("", UNICHAR(QUERY(UNICODE(SPLIT(TRANSPOSE(SPLIT(
REGEXREPLACE(
REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(A2&""
,"([^a-zA-Z0-9])","$1"&UNICHAR(160)&UNICHAR(1)&CHAR(127))
,"'","''")
,"([a-z])","$1"&UNICHAR(160)&UNICHAR(119738)&CHAR(127))
,"([A-Z])","$1"&UNICHAR(160)&UNICHAR(119744)&CHAR(127))
,"([0-9])","$1"&UNICHAR(160)&UNICHAR(120735)&CHAR(127))
,"'","''")
,CHAR(127))), UNICHAR(160))), "select Col1+Col2-1 label Col1+Col2-1 ''",0))))
Overview:
For each character group (lowercase, uppercase, numbers) we're using chained REGEXREPLACE calls to append special separator characters along with the base unicode character for that group's bold font right after.
So for example, Hi-1 becomes "H"&UNICHAR(160)&UNICHAR(119744)&CHAR(127)&"i"&UNICHAR(160)&UNICHAR(119738)&CHAR(127)&"-"&UNICHAR(160)&UNICHAR(1)&CHAR(127)&"1"&UNICHAR(160)&UNICHAR(120735)&CHAR(127).
Once we have this new string, we split on the CHAR(127) character and transpose, so each of these characters is on its own row. So the example now becomes:
"H"&UNICHAR(160)&UNICHAR(119744)
"i"&UNICHAR(160)&UNICHAR(119738)
"-"&UNICHAR(160)&UNICHAR(1)
"1"&UNICHAR(160)&UNICHAR(120735)
Next, we split on the UNICHAR(160) character:
"H", UNICHAR(119744)
"i", UNICHAR(119738)
"-", UNICHAR(1)
"1", UNICHAR(120735)
We use the UNICODE() function to convert the actual characters along with their corresponding UNICHAR into their unicode numbers:
72, 119744
105, 119738
45, 1
49, 120735
Now, we use the QUERY() function as a way of summing each of these rows individually. That's where the "select Col1+Col2-1 label Col1+Col2-1 ''" comes in. It is adding column 1 to column 2 and taking away the 1 extra from the base unicode value, and then preventing a heading label from being added to the function output.
So, now we get:
119815
119842
45
120783
We use the next UNICHAR() function to convert these to their unicode characters, which at this point is the corresponding bolded character:
𝐇
𝐢
-
𝟏
Lastly, we use a JOIN() with an empty string "" delimiter to combine it all back into a single string.
𝐇𝐢-𝟏
p.s. If you're curious why the different character groups need to be split up, it's because each group lines up with their corresponding bold characters in order, but not all 3 character groups in a row. There are some extra characters between the normal type groups that you don't see in the bold section of unicode. Thus, each character group has to be given its own base unicode value to be added to that normal character's unicode value.
p.p.s. If you wanted to add more characters, you would just need to add another wrapping REGEXREPLACE() with the correct character group, and the UNICHAR() with the correct base unicode value for that group, and then add that new group to the exclusions from the first REGEXREPLACE(). Happy to explain this further if required.
p.p.p.s. The REGEXREPLACE() with the single quote ' being replaced with two single quotes '' is needed because when we split the characters to their own cells, Google Sheets actually considers a leading single quote as a special character and removes it. So effectively, two single quotes get converted to one single quote after splitting.
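The arithmetic at the heart of this formula (add the group's base value to the character's code point, then subtract 1) can be checked with a short Python sketch. The base values are the ones used in the formula above; the function name and structure are only illustrative.

```python
# (code point range for the group, base value from the formula)
BASES = [
    (range(ord("a"), ord("z") + 1), 119738),
    (range(ord("A"), ord("Z") + 1), 119744),
    (range(ord("0"), ord("9") + 1), 120735),
]


def bold(text):
    """Shift letters and digits into the Unicode bold range; leave the rest alone."""
    out = []
    for ch in text:
        code_point = ord(ch)
        for group, base in BASES:
            if code_point in group:
                # Same sum as the QUERY step: Col1 + Col2 - 1
                code_point = code_point + base - 1
                break
        out.append(chr(code_point))
    return "".join(out)


print([ord(ch) for ch in bold("Hi-1")])
print(bold("Hi-1"))
```

Characters outside the three groups keep their original code point, which is what the base value of 1 achieves in the formula.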

I updated Adrian's formula to style to sans-serif characters instead of serif ones if anyone needs it. It includes both lower and upper case letters, as well as numbers.
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(D2,"A","𝗔"),"B","𝗕"),"C","𝗖"),"D","𝗗"),"E","𝗘"),"F","𝗙"),"G","𝗚"),"H","𝗛"),"I","𝗜"),"J","𝗝"),"K","𝗞"),"L","𝗟"),"M","𝗠"),"N","𝗡"),"O","𝗢"),"P","𝗣"),"Q","𝗤"),"R","𝗥"),"S","𝗦"),"T","𝗧"),"U","𝗨"),"V","𝗩"),"W","𝗪"),"X","𝗫"),"Y","𝗬"),"Z","𝗭"),"a","𝗮"),"b","𝗯"),"c","𝗰"),"d","𝗱"),"e","𝗲"),"f","𝗳"),"g","𝗴"),"h","𝗵"),"i","𝗶"),"j","𝗷"),"k","𝗸"),"l","𝗹"),"m","𝗺"),"n","𝗻"),"o","𝗼"),"p","𝗽"),"q","𝗾"),"r","𝗿"),"s","𝘀"),"t","𝘁"),"u","𝘂"),"v","𝘃"),"w","𝘄"),"x","𝘅"),"y","𝘆"),"z","𝘇"),"0","𝟬"),"1","𝟭"),"2","𝟮"),"3","𝟯"),"4","𝟰"),"5","𝟱"),"6","𝟲"),"7","𝟳"),"8","𝟴"),"9","𝟵")

EDIT
I understood how to use the "Substitute" formula but I have a question: is it possible to use another font? (Montserrat). The substitute formula seems to force a specific font regardless of the font I am selecting for that cell.
Here is what I also tried:
I noticed that when I CCed the substitute formula in an empty cell in a google sheet, every letter/number is followed by its bold version.
I thought it was just a matter of the font being used within the formula so I CCed the formula in a google doc and manually changed the font as well as every bold letter and number to Montserrat. They appear "correctly" within google doc, however when I CC that modified substitute formula in an empty cell in google sheet, it doesn't keep the bold letters and numbers. How come CCing from here keeps the bold letters/numbers in google sheet, and CCing from a google doc have the bold letters/numbers reset to regular version?
Please help me figure this out
Yours,
--Jay

Having the same issue at the moment. Spent a decent amount of time figuring it out before I realized it cannot be done. The reason for this is that Google spreadsheets does not allow bits of a query to have bold/italics/underline to it.
The only workaround I have found is to do it manually. Let the query run, and highlight the section of the string you want with a different formatting option.

To make any of the cells bold just make a conditional format for the cell/range/array and choose bold.
