Prevent Ruby from changing & to &?

Prevent Ruby from changing & to &? - ruby-on-rails

I need to display some superscript and subscript characters in my webpage title. I have a helper method that recognizes the pattern for a subscript or superscript, and converts it to &sub2; or ²
However, when it shows up in the rendered page's file, it shows up in the source code as:
&sub2;
Which is not right. I have it set up to be:
<% provide(:title, raw(format_title(#hash[:page_title]))) %>
But the raw is not working. Any help is appreciated.
Method:
def format_title(name)
label = name
if label.match /(_[\d]+_)+|(~[\d]+~)+/
label = label.gsub(/(_([\d]+)_)+/, '&sub\2;')
label = label.gsub(/(~([\d]+)~)+/, '&sup\2;')
label.html_safe
else
name
end
end
I have even tried:
str.gsub(/&/, '&')
but it gives back:
&amp;sub2;

You can also achieve this with Rails I18n.
<%= t(:page_title_html, scope: [:title]) %>
And in your respective locale file. (title.en.yml most probably):
title:
page_title: "Title with ²"
Here is a chart for HTML symbols regarding subscript and superscripts.
For more information check Preventing HTML character entities in locale files from getting munged by Rails3 xss protection
Update:
In case you need to load the page titles dynamically, first, you'll have to install a gem like Page Title Helper.
You can follow the guide in the gem documentation.

There are two of issues with your example, one is of matter and the other is just a coincidence.
The first issue is you are trying to use character entities that do not actually exist. Specifically, there are only ¹, ² and ³ which provide 1, 2 and 3 superscript symbols respectively. There is no such character entity as &sup4; nor any other superscript digits. There are though bare codepoints for other digits which you can use but this would require a more involved code.
More importantly, there are no subscript character entities at all in HTML5 character entities list. All subscript digits are bare codepoints. Therefore it makes no sense to replace anything with &sub2; or any other "subscript" digit.
The reason you didn't see your example working is due to the test string you chose. Supplying anything with underscores, like _2_mystring will be properly replaced with &sub2;. As &sub2; character entity is non-existent, the string will appear as is, creating an impression that raw method somehow doesn't work.
Try to use ~2~mystring and it will be replaced with the superscript character entity ² and will be rendered correctly. This illustrates that your code correct, but the basic assumption about character entities is not.

Related

In Reflected XSS, why do we need to sanitize single quote, double quote, ampersand, and backslash

Based on this article
https://resources.infosecinstitute.com/topic/how-to-prevent-cross-site-scripting-attacks/
Reflected XXS happens when data injected is reflected in the response. I get the idea that if I, for example, have a search box in my page and the search term inputted by a user is displayed in the page, someone could write as a search term:
<script>alert('x');</script>
and that would be read as regular HTML element in the page that displays the response.
But lets say greater than and less than are already blocked in input (meaning they wouldn't be able to put in script tags or any tag), what's the issue if I allow single quote, double quote, ampersand, and backslash reflected in the response. I'm trying to make sense of it but I am not sure if I am understanding correctly.

Today the web stack is big and complex with many languages. We have HTML, CSS, JavaScript, VB-Script, SVG, URLs…
Each with its own rules for:
Encoding
Quoting
Commenting
Escaping
Also, each one can be nested inside each other:
And just replacing <> fixes some issues, but not all of them as you don't know where you data will end up, is it in HTML? as a HTML Attribute? inside a JavaScript string? Each one needs different encoding to become safe.
So, the world is a bit more complicated.....

Grails: User inputs formatted string, but formatting not preserved

I am just starting a very basic program in Grails (never used it before, but it seems to be very useful).
What I have so far is:
in X.groovy,
a String named parameters, with constraint of maximum length 50000 and a couple other strings and dates, etc.
in XController.groovy,
static scaffold = X;
It displays the scaffold UI (very handy!), and I can add parameter strings and the other objects associated with it.
My problem is that the parameters string is a long string with formatting that is pasted in by the user. When it is displayed on the browser, however, it does not retain any carriage returns.
What is the best way to go about this? I'm a very beginner at Grails and still have lots and lots of learning to do on this account. Thanks.

The problem is that the string is being displayed using HTML which doesn't parse \n into a new line by default. You need to wrap the text in <pre> (see: http://www.w3schools.com/tags/tag_pre.asp) or replace the \n with <br/> tags to display it correctly to the user.

Rails i18n strings auto-lowercased?

I've noticed that in my new Rails 3.0 application all German i18n strings are converted to lowercase (except for the first letter).
When having a string like this:
de:
email: "E-Mail"
the output is always like "E-mail". Same story with all the other strings - uppercase letters within a sentence are auto-converted to lowercase.
Is this default behaviour that I have to disable, or is there any other problem? I have successfully set the locale correctly, as these strings to actually work.
Thanks for your help
Arne

There should be no modifications to the content you specify as part of the internationalization process. It sounds like something is calling humanize on the string before it is output. Some of the standard Rails form helper methods do this I believe. If you just output the translation using t('email') you should see 'E-Mail' correctly.
Update:
From your comments it seems like it is a label that is causing the problem. If you explicitly specify the text for the label rather than relying on the default behaviour you will get the translation exactly as you specify. So,
<%= f.label(:email, t('email')) %>
should generate the correct label from the translations.
However, it isn't ideal. I think you may also run into problems with the generated validation error messages.

Got the same issue. solved it by adding the _html suffix to the I18n translation key. it seems that using this suffix suppresses the humanize usage.

what if html_escape would stop escaping '&'?

is there any danger if the rails html_escape function would stop escaping '&'? I tested a few cases and it doesn't seem to create any problems. Can you give me a contrary an example? Thanks.

If you put an unescaped "&" into an HTML attribute, it would make your page invalid. For example:
Link
The page is now invalid as the & indicates an entity. This is true for any usage of an & on a page (for example, view source and hopefully you'll notice that Stack Overflow escapes the & signs in this post!)
The following would make the above example valid:
Link
Additional Note
& characters do need to be escaped in URLs if you want to validate your markup against the W3C validator. Example:
Line 9, Column 38: & did not start a character reference.
(& probably should have been escaped as &.)
Example

change an url with adding some argument

Why is this query string invalid?

In my asp.net mvc page I create a link that renders as followed:
http://localhost:3035/Formula/OverView?colorId=349405&paintCode=744&name=BRILLANT%20SILVER&formulaId=570230
According to the W3C validator, this is not correct and it errors after the first ampersand. It complains about the & not being encoded and the entity &p not recognised etc.
AFAIK the & shouldn't be encoded because it is a separator for the key value pair.
For those who care: I send these pars as querystring and not as "/" seperated values because there is no decent way of passing on optional parameters that I know of.
To put all the bits together:
an anchor (<a>) tag's href attribute needs an encoded value
& encodes to &
to encode an '&' when it is part of your parameter's value, use %26
Wouldn't encoding the ampersand into & make it part of my parameter's value?
I need it to seperate the second variable from the first
Indeed, by encoding my href value, I do get rid of the errors. What I'm wondering now however is what to do if for example my colorId would be "123&456", where the ampersand is part of the value.
Since the separator has to be encoded, what to do with encoded ampersands. Do they need to be encoded twice so to speak?
So to get the url:
www.mySite.com/search?query=123&456&page=1
What should my href value be?
Also, I think I'm about the first person in the world to care about this.. go check the www and count the pages that get their query string validated in the W3C validator..

Entities which are part of the attributes should be encoded, generally. Thus you need & instead of just &
It works even if it doesn't validate because most browsers are very, very, very lenient in what to accept.
In addition, if you are outputting XHTML you have to encode every entity everywhere, not just inside the attributes.

All HTML attributes need to use character entities. You only don't need to change & into & within script blocks.
Whatever
Anywhere in an HTML document that you want an & to display directly next to something other than whitespace, you need to use the character entity &. If it is part of an attribute, the & will work as though it was an &. If the document is XHTML, you need to use character entities everywhere, even if you don't have something immediately next to the &. You can also use other character entities as part of attributes to treat them as though they were the actual characters.
If you want to use an ampersand as part of a URL in a way other than as a separator for parameters, you should use %26.
As an example...
Hello
Would send the user to http://localhost/Hello, with name=Bob and text=you & me "forever".

This is a slightly confusing concept to some people, I've found. When you put & in a HTML page, such as in <a href="abc?def=5&ghi=10">, the URL is actually abc?def=5&ghi=10. The HTML parser converts the entity to an ampersand.
Think of exactly the same as how you need to escape quotes in a string:
// though you define your string like this:
myString = "this is \"something\" you know?"
// the string is ACTUALLY: this is "something" you know?
// when you look at the HTML, you see:
<a href="foo?bar=1&baz=2">
// but the url is ACTUALLY: foo?bar=1&bar=2

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart

Prevent Ruby from changing & to &? - ruby-on-rails

Related

In Reflected XSS, why do we need to sanitize single quote, double quote, ampersand, and backslash

Grails: User inputs formatted string, but formatting not preserved

Rails i18n strings auto-lowercased?

what if html_escape would stop escaping '&'?

Why is this query string invalid?

Categories

Resources