URL escaping in Ruby - ruby-on-rails

There are many discussions about URL escaping in Ruby, but unfortunately I haven't found an appropriate solution.
In general, URI.escape should do the job, but it looks like it doesn't support all characters; for example, it doesn't escape "[".
URI.parse(URI.escape("1111{3333"))
works well.
URI.parse(URI.escape("1111[3333"))
raises an exception.
I understand that "[" is not a valid character in a URL according to the RFC, but when I enter it into the browser, the browser accepts it and renders the page, so I need exactly the same behavior.
Do you know of any ready-made solution for escaping in Ruby?

I typically use
CGI.escape
to escape URI parameters.
require 'cgi'
CGI.escape('1111[3333')
=> "1111%5B3333"
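A minimal sketch of how this applies to the original problem, assuming the bracketed string is meant to be a query value. The example.com URL and the q parameter are made up for illustration. Once the value is escaped, URI.parse accepts the result:

```ruby
require 'cgi'
require 'uri'

raw = '1111[3333'

# CGI.escape percent-encodes the bracket.
escaped = CGI.escape(raw)

# Hypothetical URL, used only to show that the escaped value parses cleanly.
uri = URI.parse("http://example.com/search?q=#{escaped}")

puts escaped    # 1111%5B3333
puts uri.query  # q=1111%5B3333
```

CGI.unescape(escaped) returns the original string, so the receiving side can recover the bracket.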

The character [ is a URI delimiter character and does not require escaping.
http://www.ietf.org/rfc/rfc2396.txt
section 2.4.3. Excluded US-ASCII Characters

Related

How can I use a colon instead of a question mark in a URL query?

For example, take this image:
https://pbs.twimg.com/media/BFmDUA5CcAAmcBl.jpg
Then I add a colon to send a query string:
https://pbs.twimg.com/media/BFmDUA5CcAAmcBl.jpg:large
https://pbs.twimg.com/media/BFmDUA5CcAAmcBl.jpg:small
I googled it and found that this is a Twitter image.
Which programming language can achieve this?
PHP? Ruby on Rails?
Or an .htaccess rewrite rule?
Any.
It has nothing to do with programming languages, but with CGI: http://en.wikipedia.org/wiki/Common_Gateway_Interface
The colon is however not a valid part of the CGI spec, so the server receiving the request will probably parse it in code.
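As a rough sketch of what "parse it in code" could look like on the server side: the helper name split_size_suffix below is hypothetical, and this is not how Twitter actually implements it. It simply splits the path on the last colon:

```ruby
require 'uri'

# Hypothetical helper: returns [path_without_suffix, suffix_or_nil].
def split_size_suffix(url)
  path = URI.parse(url).path
  base, sep, size = path.rpartition(':')
  # rpartition returns an empty separator when the path contains no colon.
  sep.empty? ? [path, nil] : [base, size]
end

p split_size_suffix('https://pbs.twimg.com/media/BFmDUA5CcAAmcBl.jpg:large')
# ["/media/BFmDUA5CcAAmcBl.jpg", "large"]
p split_size_suffix('https://pbs.twimg.com/media/BFmDUA5CcAAmcBl.jpg')
# ["/media/BFmDUA5CcAAmcBl.jpg", nil]
```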
Note though that the CGI spec defines '&' as the separator between variable/value pairs, which results in invalid (X)HTML when used unescaped in <a> tags. This is because a bare & starts an entity reference, and the text that follows it is not a valid entity. To remedy this, at least in PHP, you can change this separator: http://www.php.net/manual/en/ini.core.php#ini.arg-separator.output

Why are URLs in the form of "http://www.mongodb.org/display/DOCS/mongo+-+The+Interactive+Shell"

What is the mongo+-+The+Interactive+Shell part for and why is it that way? It seems like it is urlencoded from "mongo - The Interactive Shell"
For the same reason the URL to this question includes why-are-urls-in-the-form-of-http-www-mongodb-org-display-docs-mongo-theinte. Unencoded spaces aren't valid, and encoded ones (%20) are hard to read, so a more readable alternative is used.
The W3C reserved the plus sign as a shorthand for the space character. You'll also find the same document codified as RFC 1630.
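To illustrate the plus-for-space convention in Ruby, here is a small sketch using CGI.escape from the standard library. Whether the MongoDB wiki generated its URLs this way is an assumption; the snippet only shows that form-style encoding produces the same string:

```ruby
require 'cgi'

title = 'mongo - The Interactive Shell'

# CGI.escape turns spaces into '+' and leaves '-' untouched.
slug = CGI.escape(title)

puts slug                # mongo+-+The+Interactive+Shell
puts CGI.unescape(slug)  # mongo - The Interactive Shell
```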

What if html_escape stopped escaping '&'?

Is there any danger if the Rails html_escape function stopped escaping '&'? I tested a few cases and it doesn't seem to create any problems. Can you give me a counterexample? Thanks.
If you put an unescaped "&" into an HTML attribute, it would make your page invalid. For example:
Link
The page is now invalid as the & indicates an entity. This is true for any usage of an & on a page (for example, view source and hopefully you'll notice that Stack Overflow escapes the & signs in this post!)
The following would make the above example valid:
Link
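A small sketch of the difference, using CGI.escapeHTML from the Ruby standard library as a stand-in for the Rails html_escape helper. The href value is a made-up example URL:

```ruby
require 'cgi'

# Hypothetical URL with two query parameters.
href = 'http://example.com/?a=1&b=2'

# Unescaped: the bare & makes the attribute value invalid markup.
invalid = %(<a href="#{href}">Link</a>)

# Escaped: & becomes &amp;, which browsers decode back to & when following the link.
valid = %(<a href="#{CGI.escapeHTML(href)}">Link</a>)

puts invalid  # <a href="http://example.com/?a=1&b=2">Link</a>
puts valid    # <a href="http://example.com/?a=1&amp;b=2">Link</a>
```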
Additional Note
& characters do need to be escaped in URLs if you want to validate your markup against the W3C validator. Example:
Line 9, Column 38: & did not start a character reference.
(& probably should have been escaped as &amp;.)
Example
change an url with adding some argument

Can we use & in url?

Can we use "&" in a URL, or should "and" be used?
Yes, you can use it plain in your URL path like this:
http://example.com/Alice&Bob
You only need to encode it as %26 if you want to use it in the query:
http://example.com/?arg=Alice%26Bob
Otherwise it would be interpreted as argument separator when interpreted as application/x-www-form-urlencoded.
See RFC 3986 for more details.
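The separator behavior can be checked with Ruby's URI module. This is only a sketch: the arg name comes from the example above, and decode_www_form stands in for whatever form parser the server really uses:

```ruby
require 'uri'

value = 'Alice&Bob'

# Percent-encode the value so the & is no longer a separator.
encoded = URI.encode_www_form_component(value)

# Parse the query both ways, as an application/x-www-form-urlencoded parser would.
escaped_pairs = URI.decode_www_form("arg=#{encoded}")
raw_pairs     = URI.decode_www_form("arg=#{value}")

puts encoded     # Alice%26Bob
p escaped_pairs  # [["arg", "Alice&Bob"]]
p raw_pairs      # two pairs; the first is ["arg", "Alice"]
```

With the raw &, the value is cut off at "Alice" and "Bob" is treated as a separate parameter name.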
A URL is generally in the form
scheme://host/some/path/to/file?query1=value&query2=value
So it is not advisable to use it in a URL unless you want to use it to separate parameters. Otherwise you should percent-escape it as %26, e.g.
http://www.example.com/hello%26world
This results in the path being submitted as hello&world. There are other characters which must be escaped when used out of context in a URL. See here for a list.
Unless you're appending variables to the query string, encode it.
Encode '&' as &amp; (this answer is based on your use of tags)
If you are asking what to use "&" or "and" when registering the name of your URL, I would use "and".
EDIT: As mentioned in comments "&amp; is an HTML character entity and not a URI character entity. By putting that into a URI you still have the ampersand character and additional extraneous characters." I started answering before fully understanding your question.

Rails: post data with a '+' is getting set to a blank

I have post data that includes a '+' sign. Once it makes it to the server the raw post data is showing the '+' sign but once the post data makes it into the param hash the '+' sign has been converted to a blank. Any ideas on how to make it NOT do that?
If you replace your '+' signs with '%2B', this should resolve the issue.
However, also note that you probably need to check your ampersands, percent signs, and other characters as well. The server receiving your post data is probably expecting URLEncoded data.
In a nutshell, if you replace % signs with %25, then replace & with %26, replace ? with %3F, replace # with %23, and replace + signs with %2B; you will cover most of the issues you can encounter.
A more in-depth list of replacements can be found at these links.
HTML Url Encoding (w3schools)
Percent-Encoding (wikipedia)
Have a look at the CGI.escape method in the standard library that will do this for you:
irb(main):001:0> require 'cgi'
=> true
irb(main):002:0> CGI.escape 'foo+bar&baz?qux quux/corge'
=> "foo%2Bbar%26baz%3Fqux+quux%2Fcorge"
There's also a CGI.unescape method should you need to convert back.
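A short sketch of why the '+' turns into a blank and how escaping avoids it. The sample value is made up, and CGI.unescape stands in for the decoding the server performs on form data:

```ruby
require 'cgi'

raw = '1+1=2'

# Sent as-is, the '+' is decoded as a space on the receiving side.
misread = CGI.unescape(raw)

# Escaped first, the '+' travels as %2B and survives decoding.
escaped    = CGI.escape(raw)
round_trip = CGI.unescape(escaped)

puts misread     # 1 1=2
puts escaped     # 1%2B1%3D2
puts round_trip  # 1+1=2
```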
Try replacing the + with %2B.
Not sure why that is happening. Normally + signs make it through to the params. Can you post your Rails version? Also try escaping the "+" sign with "+" or its CGI equivalent "%2B" to see if it makes a difference.
There is a Ruby call to handle all this for you, so you don't need to figure out the characters yourself:
require 'uri'
# Escapes every character outside the unreserved set (URI.escape exists only before Ruby 3.0).
url = 'http://www.google.com?a=this is a test'
URI.escape(url, Regexp.new("[^#{URI::PATTERN::UNRESERVED}]"))
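On newer Ruby versions, one possible alternative is to encode only the query parameters with URI.encode_www_form and then assemble the URL. This is a sketch of a different technique, not a drop-in replacement for the call above, and it assumes the parameter a is the only part that needs escaping:

```ruby
require 'uri'

# Encode the parameter hash as application/x-www-form-urlencoded.
query = URI.encode_www_form('a' => 'this is a test')

url = "http://www.google.com/?#{query}"
uri = URI.parse(url)

puts url        # http://www.google.com/?a=this+is+a+test
puts uri.query  # a=this+is+a+test
```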
