Rails not displaying unicode character

Rails not displaying unicode character - ruby-on-rails

If I print the string "hello\u1D175 \u1D176and goodbye", I expect a musical tie symbol to show. Instead I get "helloᴗ5 ᴗ6and goodbye".
I'm assuming I need to change how the string is read? Or the font?

Using the unicode decimal value for a musical tie symbol, the following should work
9835.chr(Encoding::UTF_8)
For example,
"hello #{9835.chr(Encoding::UTF_8)} and..."
Reference: https://unicodelookup.com/

Related

How to convert a formatted string into plain text

User copy paste and send data in following format: "𝕛𝕠𝕧𝕪 𝕕𝕖𝕓𝕓𝕚𝕖"
I need to convert it into plain txt (we can say ascii chars) like 'jovy debbie'
It comes in different font and format:
ex:
'𝑱𝒆𝒏𝒊𝒄𝒂 𝑫𝒖𝒈𝒐𝒔'
'𝙶𝚎𝚟𝚒𝚎𝚕𝚢𝚗 𝙽𝚒𝚌𝚘𝚕𝚎 𝙻𝚞𝚖𝚋𝚊𝚐'
Any Help will be Appreciated, I already refer other stack overflow question but no luck :(

Those letters are from the Mathematical Alphanumeric Symbols block.
Since they have a fixed offset to their ASCII counterparts, you could use tr to map them, e.g.:
"𝕛𝕠𝕧𝕪 𝕕𝕖𝕓𝕓𝕚𝕖".tr("𝕒-𝕫", "a-z")
#=> "jovy debbie"
The same approach can be used for the other styles, e.g.
"𝑱𝒆𝒏𝒊𝒄𝒂 𝑫𝒖𝒈𝒐𝒔".tr("𝒂-𝒛𝑨-𝒁", "a-zA-Z")
#=> "Jenica Dugos"
This gives you full control over the character mapping.
Alternatively, you could try Unicode normalization. The NFKC / NFKD forms should remove most formatting and seem to work for your examples:
"𝕛𝕠𝕧𝕪 𝕕𝕖𝕓𝕓𝕚𝕖".unicode_normalize(:nfkc)
#=> "jovy debbie"
"𝑱𝒆𝒏𝒊𝒄𝒂 𝑫𝒖𝒈𝒐𝒔".unicode_normalize(:nfkc)
#=> "Jenica Dugos"

How to remove ANSI codes from a string?

I am working on string manipulation using LUA and having trouble with the following problem.
Using this as an example of the original data I am given -
"[0;1;36m(Web): You say, "Text here."[0;37m"
I want to keep the string intact except for removing the ANSI codes.
I have been pointed toward using gsub with the LUA pattern matching but I cannot seem to get the pattern correct. I am also unsure how to reference exactly the escape character sent.
text:gsub("[\27\[([\d\;]+)m]", "")
or
text:gsub("%x%[[%d+;+]m", "")
If successful, all I want to be left with, using the above example, would be:
(Web): You say, "Text here."

Your string example is missing the escape character, ASCII 27.
Here's one way:
s = '\x1b[0;1;36m(Web): You say, "Text here."\x1b[0;37m'
s = s:gsub('\x1b%[%d+;%d+;%d+;%d+;%d+m','')
:gsub('\x1b%[%d+;%d+;%d+;%d+m','')
:gsub('\x1b%[%d+;%d+;%d+m','')
:gsub('\x1b%[%d+;%d+m','')
:gsub('\x1b%[%d+m','')
print(s)

Convert Unicode escape sequence into its corresponding character

I'm receiving a string from the server and it has the special characters in code. Here's the example:
"El usuario o las contrase\UOOOOfffda no son v\UOOOOfffdlidos"
The first one should be an "ñ" and the second one "á"
I know it's not complicated but I can't find the answer. How can I get the string with the special characters correctly formatted?

Unicode U+FFFD (in your string, displayed as UTF-32 \U0000fffd) is "�", the replacement character. It is often substituted in strings when a system encounters unrecognized characters.
This character really shouldn't appear in string data since its purpose is to indicate an error in displaying or interpreting the string. Since your server is sending you that character for both ñ and á, there is no way to retrieve the correct character.
How are you "receiving" this string? It could be that you are accessing the server incorrectly so it isn't sending you an unmodified string.

Unicode for those characters should look like this:
#"accented-a is \u00f1, and tilda-n is \u00e1"
But it's not clear what you're getting from the server makes any sense. The objective-c literal must have a lowercase leading "u" followed only by valid hex digits (0-9 and a-f). I don't see a transformation that changes the literals you have to the ones you expect.
Once the characters are formatted properly, the built-in classes will just work, for example, assigning the string to a label's text property will show the user a nice glyph.

How to escape strings with numeric character references in Java

Hello and thank you for reading my post.
The Apache Commons StringEscapeUtils.escapeHtml3() and StringEscapeUtils.escapeHtml4() functions allow, in particular, to convert characters with an acute (like é, à...) in a string into
character entity references which have the format &name; where name is a case-sensitive alphanumeric string.
How can I get the escaped string of a given string with numeric character references instead (&#nnnn; or &#xhhhh; where nnnn is the code point in decimal form, and hhhh is the code point in hexadecimal form)?
I actually need to escape strings for a XML document which doesn't know about such entities as & eacute;, & agrave; etc.
Best regards.

To solve this problem, I wrote a method which takes a string as an argument and replaces, in this string, character entity references (like é) with their corresponding numeric character references (é in this case).
I used this W3C list of references: http://www.sagehill.net/livedtd/xhtml1-transitional/xhtml-lat1.ent.html
Nota: It would be great to be able to pass another argument to the StringEscapeUtils.escapeHtml4() method to tell it whether we would like character entity references or numeric character references in the output string...

Create your CharacterTranslator:
CharacterTranslator XML_ESCAPE = StringEscapeUtils.ESCAPE_XML11.with(
NumericEntityEscaper.between(0x7f, Integer.MAX_VALUE) );
and use it:
XML_ESCAPE.translate(…)

Escape Unicode Characters for iOS

There are some Unicode arrangements that I want to use in my app. I am having trouble properly escaping them for use.
For instance this Unicode sequence: 🅰
If I escape it using an online tool i get: \ud83c\udd70
But of course this is an invalid sequence per the compiler:
var str = NSString.stringWithUTF8String("\ud83c\udd70")
Also if I do this:
var str = NSString.stringWithUTF8String("\ud83c")
I get an error "Invalid Unicode Scalar"
I'm trying to use these Unicode "fonts":
http://www.panix.com/~eli/unicode/convert.cgi?text=abcdefghijklmnopqrstuvwxyz
If I view the source of this website I see sequences like this:
&#x1D552
Struggling to wrap my head around what is the "proper" way to work with/escape unicode.
And simply need a to figure out a way to get them working on iOS.
Any thoughts?

\ud83c\udd70 is a UTF-16 surrogate pair which encodes the unicode character 🅰 (U+1F170). Swift string literals do not use UTF-16, so that escape sequence doesn't make sense. However, since 1F170 has five digits you can't use a \uXXXX escape sequence (which only accepts four hexadecimal digits). Instead, use a \UXXXXXXXX sequence (note the capital U), which accepts eight:
var str = "\U0001F170" // returns "🅰"
You can also just paste the character itself into your string:
var str = "🅰" // returns "🅰"

Swift is an early Beta, is is broken in many ways. This issue is a Swift bug.
let ringAboveA: String = "\u0041\u030A" is Å and is accepted
let negativeSquaredA: String = "\uD83D\uDD70" is 🅰 and produces an error
Both are decomposed UTF16 characters that are accepted by Objective-C. The difference is that the composed character 🅰 is in plane 1.
Note: to get the UTF32 code point either use the OSX Character Viewer or a code snippet:
NSLog(#"utf32: %#", [#"🅰" dataUsingEncoding:NSUTF32BigEndianStringEncoding]);
utf32: <0001f170>
To get the Character Viewer in the Apple Menu go to the "System Preferences", "Keyboard", "Keyboard" tab and select the checkbox: "Show Keyboard & Character Viewers in menu bar". The "Character View" item will be in the menu bar just to the left of the Date.
After entering the character right (control) click on the character in favorites to copy the search results.
Copied information:
🅰
NEGATIVE SQUARED LATIN CAPITAL LETTER A
Unicode: U+1F170 (U+D83C U+DD70), UTF-8: F0 9F 85 B0
Better yet: Add unicode in the list on the left and select it.

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

Rails not displaying unicode character - ruby-on-rails

If I print the string "hello\u1D175 \u1D176and goodbye", I expect a musical tie symbol to show. Instead I get "helloᴗ5 ᴗ6and goodbye". I'm assuming I need to change how the string is read? Or the font?

Using the unicode decimal value for a musical tie symbol, the following should work 9835.chr(Encoding::UTF_8) For example, "hello #{9835.chr(Encoding::UTF_8)} and..." Reference: https://unicodelookup.com/

Related

How to convert a formatted string into plain text

How to remove ANSI codes from a string?

Convert Unicode escape sequence into its corresponding character

How to escape strings with numeric character references in Java

Escape Unicode Characters for iOS

Categories

Resources