Need to convert string "&#x0398" to "\u0398" - ruby-on-rails

My Rails application stores strings containing HTML entity codes, e.g. "&#x0398", which display Greek letters and other characters on browser pages. To display these same characters in Prawn documents, I need to convert "&#x0398" to "\u0398". Using a regexp I can extract the bare codepoint, "0398", from the original string, but I'm unable to use it to create a new string variable containing "\u0398".
I've tried many variations of string concatenation, interpolation and even array operations, but no joy. Anything that looks like
new_string_var = "\u" + my_codepoint
generates an "invalid Unicode escape" error at "\u".
Anything that looks like
new_string_var = "\\u" + my_codepoint
runs without error but inserts the literal string "\u0398" in the Prawn document.
Is it possible in Ruby to construct a string like this? Is there a better approach?

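Actually, you don't need the \uXXXX notation: it is only escape syntax for string literals in Ruby source, not something you build at runtime. Try CGI.unescapeHTML(string_with_entities) from the built-in CGI module. Note that it only decodes complete entities, i.e. with the trailing semicolon, as in "&#x0398;".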
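A minimal sketch of both routes, assuming UTF-8 strings. The helper name decode_hex_entities and the sample text are made up for illustration; the optional-semicolon regexp is an assumption added because the entities in the question are written without a trailing semicolon.

```ruby
require "cgi"

# Route 1: the stored entity is complete (note the trailing semicolon).
theta = CGI.unescapeHTML("&#x0398;")

# Route 2: only the bare hex codepoint is at hand, as in the question.
codepoint = "0398"
theta_from_hex = [codepoint.hex].pack("U") # Integer codepoint -> UTF-8 character

# Hypothetical helper: decode hex entities in place, tolerating a
# missing semicolon.
def decode_hex_entities(text)
  text.gsub(/&#x([0-9a-fA-F]+);?/) { [Regexp.last_match(1).hex].pack("U") }
end

puts theta
puts theta_from_hex
puts decode_hex_entities("angle &#x0398 and &#x03B1;")
```

Either result is an ordinary Ruby string holding the character itself, which is what should be handed to Prawn. One caveat of the tolerant regexp: without a semicolon, any hex digit that directly follows the entity would be read as part of the codepoint.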

Related

IBM Cast Iron studio unable to convert '&' to '&amp;'

Hello I am constructing a URI from two different strings coming from a source.
String1 = 12345&67890
String2 = 78326832
URI = /api?invoice=String1&supplier=String2
After using concat function available in studio, this is the final URI.
/api?invoice=12345&67890&supplier=78326832
(Get request fails because 67890 is taken as query)
Expected output is
/api?invoice=12345&amp;67890&supplier=78326832
How do I achieve this? Can I use XSLT to convert symbols to their HTML entity characters?
Your expected output /api?invoice=12345&amp;67890&supplier=78326832 is rather bizarre: there's no context where it makes sense to escape some ampersands (at the XML/HTML level) and leave others unescaped.
I think that what you really want is to use URI escaping (not XML escaping) for the first ampersand, that is you want /api?invoice=12345%2667890&supplier=78326832. If you're building the URI using XSLT 2.0 you can achieve this by passing the strings through encode-for-uri() before you concatenate them into the URI.
But you've given so little information about the context of your processing that it's hard to be sure exactly what you want.
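The same idea, escaping each value before it is joined into the URI, can be sketched outside XSLT as well. This is only an illustration in Ruby using the standard uri library, not Cast Iron code; the variable names are made up and the values are the ones from the question.

```ruby
require "uri"

invoice  = "12345&67890"
supplier = "78326832"

# encode_www_form percent-encodes each key and value, then joins the
# pairs with "&", so only the separator ampersand stays literal.
query = URI.encode_www_form(invoice: invoice, supplier: supplier)
uri = "/api?#{query}"

puts uri
```

The ampersand inside the invoice number comes out as %26, so the server no longer treats 67890 as a separate query parameter.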

How to disable string conversion in Swift?

I have an encrypted string that is passed from the server side, and I want to test converting it into readable text with a conventional decoding method.
But I found I cannot use the string as a literal at all:
The error shows: invalid escape sequence in literal.
Swift applies some conversions inside string literals, like "\(variable)" or "\b".
Is there a way for me to use a pure string?
For example, in Python I can declare a = """content""" to represent a pure string.
It's the backslash (\), just before the character the up-arrow is pointing to in the error message. In a literal, this needs to be represented by a double backslash (\\).
This issue won't arise once you're no longer testing and you're doing this all with actual values; it's a feature only of literal strings.
I recently found a quick solution:
Use ' content ' instead of " content ", then Xcode would give you a warning.
Press Fix-it, Xcode would automatically add \ at right places to avoid literal convention.
I would suggest you save the string in a file in your app and read it from there, to avoid having to modify the string (it's a rather ugly string and you will have to escape a bunch of stuff).
You can use NSBundle.pathForResource and NSString.initWithContentsOfFile to get the string into memory from the file.

Grails: User inputs formatted string, but formatting not preserved

I am just starting a very basic program in Grails (never used it before, but it seems to be very useful).
What I have so far is:
in X.groovy,
a String named parameters, with constraint of maximum length 50000 and a couple other strings and dates, etc.
in XController.groovy,
static scaffold = X;
It displays the scaffold UI (very handy!), and I can add parameter strings and the other objects associated with it.
My problem is that the parameters string is a long string with formatting that is pasted in by the user. When it is displayed on the browser, however, it does not retain any carriage returns.
What is the best way to go about this? I'm very much a beginner at Grails and still have a lot to learn. Thanks.
The problem is that the string is being displayed using HTML which doesn't parse \n into a new line by default. You need to wrap the text in <pre> (see: http://www.w3schools.com/tags/tag_pre.asp) or replace the \n with <br/> tags to display it correctly to the user.
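The replace-newlines option is language-neutral, so here is a rough sketch of it in Ruby rather than Groovy. The helper name newlines_to_br and the sample input are made up for illustration. It assumes the text should be HTML-escaped first, so that user-pasted markup is not interpreted by the browser.

```ruby
require "cgi"

# Hypothetical helper: escape the user's text, then turn each line
# break (Windows, old Mac, or Unix style) into a <br/> tag.
def newlines_to_br(text)
  escaped = CGI.escapeHTML(text)
  escaped.gsub(/\r\n|\r|\n/, "<br/>")
end

puts newlines_to_br("a=1\r\nb=<2>\nc=3")
```

Wrapping the text in <pre> needs no replacement at all, but it also switches to a monospaced font and preserves runs of spaces, which may or may not be wanted.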

lua reading chinese character

I have the following xml that I would like to read:
chinese xml - https://news.google.com/news/popular?ned=cn&topic=po&output=rss
korean xml - http://www.voanews.com/templates/Articles.rss?sectionPath=/korean/news
Currently I use LuaXML to parse the XML, which contains Chinese characters. However, when I print the result to the console, the Chinese characters are not printed correctly and show up as garbage characters.
Is there any way to parse Chinese or Korean characters into a Lua table?
I don't think Lua is the issue here. The raw data the remote site sends is encoded using UTF-8, and Lua does no special interpretation of that—which means it should be preserved perfectly if you just (1) read from the remote site, and (2) save the read data to a file. The data in the file will contain CJK characters encoded in UTF-8, just like the remote site sent back.
If you're getting funny results like you mention, the fault probably lies either with the library you're using to read from the remote site, or perhaps simply with the way your console displays the results when you output to it.
I managed to convert the numeric entities for "中美" into Chinese characters.
It takes one additional step: converting every such entity in the strings with the method from this link, http://forum.luahub.com/index.php?topic=3617.msg8595#msg8595, before saving in XML format.
string.gsub(l, "&#([0-9]+);", function(c) return utf8.char(tonumber(c)) end) -- utf8.char needs Lua 5.3+; string.char only accepts values 0-255
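For comparison, the same decimal-entity substitution can be sketched in Ruby. The helper name decode_decimal_entities is made up for illustration, and the sample input uses 20013 and 32654, the decimal codepoints of "中" and "美". The point it shows is that each number must be encoded as a UTF-8 character rather than emitted as a single byte.

```ruby
# Hypothetical helper: replace every decimal numeric entity with the
# UTF-8 encoding of its codepoint.
def decode_decimal_entities(text)
  text.gsub(/&#(\d+);/) { [Regexp.last_match(1).to_i].pack("U") }
end

puts decode_decimal_entities("&#20013;&#32654;")
```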
I also have a question about LuaXML: I came across the method xml.registerCode(decoded,encoded).
Its description says that it
registers a custom code for the conversion between non-standard characters and XML character entities
What do they mean by non-standard characters and how do I use it?

Clean up & style characters from text

I am getting text from a feed that has a lot of characters like:
Insignia&#153; 2.0 Stereo Computer Speaker System (2-Piece) - Black
4th-Generation Apple® iPod® touch
Is there an easy way to get rid of these, or do I have to anticipate which characters I want to delete and use the delete method to remove them? Also, when I try to remove
&amp;
with
str.delete("&")
It leaves behind "amp;" Is there a better way to delete this type of character? Do I need to re-encode the text?
String#delete is certainly not what you want, as it works on characters, not the string as a whole.
Try
str.gsub(/&amp;/, "")
You may also want to try replacing the &amp; with a literal ampersand, such as:
str.gsub(/&amp;/, "&")
If this is closer to what you really want, you may get the best results unescaping the HTML string. If so try this:
CGI::unescapeHTML(str)
Details of the unescapeHTML method are in the Ruby CGI documentation.
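A short sketch of what unescaping does to feed text like the samples in the question. The sample string is an assumption: it reuses the iPod title with the registered-trademark sign written as the decimal entity &#174;, and the "&amp; case" tail is made up for illustration.

```ruby
require "cgi"

# Assumed sample input, modeled on the titles in the question.
raw = "4th-Generation Apple&#174; iPod&#174; touch &amp; case"
clean = CGI.unescapeHTML(raw)

# CGI.unescapeHTML only knows a handful of named entities (such as
# &amp;, &lt;, &gt;, &quot;); other named entities are left untouched.
named = CGI.unescapeHTML("Apple&reg;")

puts clean
puts named
```

Numeric entities and &amp; are decoded into the characters themselves, while a named entity such as &reg; passes through unchanged, so feeds that use many named entities may need a fuller HTML or XML parser.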
If you are getting data from a 'feed', aka RSS XML, then you should be using an XML parser like Nokogiri to process the XML. This will automatically unescape HTML entities and allow you to get the proper string representation directly.
For removing try to use gsub method, something like this:
text = "foo&bar"
text.gsub /\b&\b/, "" #=> foobar

Resources