Ruby - how to encode URL without re-encoding already encoded characters - ruby-on-rails

I have a simple problem: users can post urls through specific input in a form in my website.
I would like to encode the posted url, because sometimes users send urls with strange and/or non ascii characters (like é à ç...). For instance: https://www.example.com/url-déjà-vu
So I tried to use URI.escape('https://www.example.com/url-déjà-vu') which does work, but then if you have the following url: URI.escape('https://somesite.com/page?stuff=stuff&%20') you get:
=> "https://somesite.com/page?stuff=stuff&%2520"
The % character gets encoded, and it should not be, because %20 is already an encoded character. Then I thought I could do this:
URI.escape(URI.decode('https://somesite.com/page?stuff=stuff&%20'))
=> "https://somesite.com/page?stuff=stuff&%20"
But there is a problem if you have a "/" encoded in your url, for instance:
URI.escape(URI.decode('http://example.com/a%2fb'))
=> "http://example.com/a/b"
The "/" should stay encoded.
So, putting it all together: I want to encode URLs posted by users in Ruby while leaving already-encoded characters unchanged. Any idea how I can do that without getting a headache?
Thanks :)

I can't think of a way to do this that isn't a little bit of a kludge. So I propose a little bit of a kludge.
URI.escape appears to work the way you want in all cases except when characters are already encoded. With that in mind, we can take the result of URI.encode (an alias of URI.escape) and use String#gsub to "un-encode" only those characters.
The below regular expression looks for %25 (an encoded %) followed by two hex digits, turning e.g. %252f back into %2f:
require "uri"
DOUBLE_ESCAPED_EXPR = /%25([0-9a-f]{2})/i
def escape_uri(uri)
  URI.encode(uri).gsub(DOUBLE_ESCAPED_EXPR, '%\1')
end
puts escape_uri("https://www.example.com/url-déjà-vu")
# => https://www.example.com/url-d%C3%A9j%C3%A0-vu
puts escape_uri("https://somesite.com/page?stuff=stuff&%20")
# => https://somesite.com/page?stuff=stuff&%20
puts escape_uri("http://example.com/a%2fb")
# => http://example.com/a%2fb
I don't promise that this is foolproof, but hopefully it helps.
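URI.escape and URI.encode were removed in Ruby 3.0, so the snippet above only runs on older versions. As a minimal sketch of the same idea without those methods, the helper below (the name escape_unencoded and its regular expression are illustrative assumptions, not part of the answer) leaves existing %XX sequences alone and percent-encodes every other character that is neither reserved nor unreserved in RFC 3986. It assumes the input is a UTF-8 string:

```ruby
# Matches either an existing percent-escape (kept as is) or one character
# outside the RFC 3986 reserved and unreserved sets (which gets encoded).
ESCAPE_EXPR = /%\h{2}|[^A-Za-z0-9\-._~:\/?\[\]@!$&'()*+,;=#]/

# Hypothetical helper: encode what needs encoding, keep what is already encoded.
def escape_unencoded(url)
  url.gsub(ESCAPE_EXPR) do |match|
    if match.length == 3
      match # already of the form %XX, leave untouched
    else
      # Encode each UTF-8 byte of the character as %XX
      match.bytes.map { |byte| format("%%%02X", byte) }.join
    end
  end
end

puts escape_unencoded("https://somesite.com/page?stuff=stuff&%20")
# => https://somesite.com/page?stuff=stuff&%20
puts escape_unencoded("http://example.com/a%2fb")
# => http://example.com/a%2fb
```

A lone % that is not followed by two hex digits is treated as unencoded and becomes %25. Like the gsub approach above, this cannot tell whether a user meant a literal "%20" in the text, so it is a heuristic rather than a guarantee.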

Related

How to have url not encoded in rails

I don't understand why all the special characters in my URL are encoded. For example:
new_subscription_url(:session_id => '{CHECKOUT_SESSION_ID}' )
gives me
http://localhost:3000/en/subscriptions/new?session_id=%7BCHECKOUT_SESSION_ID%7D
All special characters are encoded. How can I keep them from being encoded?
It is not encoded but rather escaped.
According to the IETF URI standard (section 2.4), a URI is always in an "escaped" form.
As a side note, if you want to unescape it, you can use
CGI::unescape(new_subscription_url(session_id: '{CHECKOUT_SESSION_ID}' ))
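As a rough illustration of what that call does, the sketch below starts from the escaped URL shown in the question (hard-coded here, since the Rails route helper is not available outside the app) and compares unescaping the whole string with restoring only the two brace characters. The variable names are assumptions made for this example:

```ruby
require "cgi"

# URL copied from the question; the placeholder braces arrive percent-escaped.
escaped = "http://localhost:3000/en/subscriptions/new?session_id=%7BCHECKOUT_SESSION_ID%7D"

# Option 1: unescape the whole string. This also decodes every other
# escape sequence and turns "+" into a space.
unescaped = CGI.unescape(escaped)

# Option 2: restore only the brace characters and leave other escapes alone.
braces_only = escaped.gsub("%7B", "{").gsub("%7D", "}")

puts unescaped
puts braces_only
# Both print:
# http://localhost:3000/en/subscriptions/new?session_id={CHECKOUT_SESSION_ID}
```

The second option is the more conservative choice when the URL may contain other escaped values that should stay escaped.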

url escaping in ruby

There are many discussions about URL escaping in Ruby, but unfortunately I didn't find an appropriate solution.
In general, URI.escape should do the job, but looks like it doesn't support all characters, for example it doesn't escape "[".
URI.parse(URI.escape("1111{3333"))
works well.
URI.parse(URI.escape("1111[3333"))
raises an exception.
I understand that "[" is not an eligible character in a URL according to the RFC, but when I enter it into the browser, the browser accepts it and renders the page, so I need exactly the same behavior.
Do you know of any ready-made solution for escaping in Ruby?
I typically use
CGI.escape
to escape URI parameters.
require 'cgi'
CGI.escape('1111[3333')
=> "1111%5B3333"
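A small sketch of that workflow, using URI.encode_www_form_component from the uri standard library as an alternative to CGI.escape (the two agree on this input). The host example.com and the parameter name q are placeholders chosen for illustration:

```ruby
require "uri"

value = "1111[3333"

# Escape only the parameter value, not the whole URL.
escaped = URI.encode_www_form_component(value)
puts escaped
# => 1111%5B3333

# The escaped value can now be embedded in a URL that URI.parse accepts.
uri = URI.parse("http://example.com/search?q=#{escaped}")
puts uri.query
# => q=1111%5B3333
```

Escaping each value separately keeps the delimiters of the URL itself (":", "/", "?", "&") intact.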
The character [ is a uri delimiter character and does not require escaping.
http://www.ietf.org/rfc/rfc2396.txt
section 2.4.3. Excluded US-ASCII Characters

Passing fullstops (periods) and forward slashes in a GET request?

I have built a form that submits values to Wufoo as a GET request in the URL. I cannot get it to work if any of the values (in a textarea) contain a line-break or a forward slash. Is there a way to encode these in a URL?
This is being done in Rails.
I thought Rails would do that for you. But if you need to do it manually, you can use CGI::escape, e.g.
> require 'cgi'
...
> CGI.escape("hello%there\nworld")
=> "hello%25there%0Aworld"
EDIT:
Actually, CGI does not seem to escape a dot. URI.escape can be used instead; it takes an extra parameter that lets you list the extra characters you want escaped:
URI.escape("hello.there%world", ".")
http://en.wikipedia.org/wiki/Percent-encoding
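For a GET request like the one in the question, one possible approach is to let URI.encode_www_form build the whole query string. The sketch below assumes a placeholder endpoint (https://example.com/form) and a hypothetical field name (comment); neither comes from the question:

```ruby
require "uri"

# A textarea value containing a space, a line break, a slash and a period.
params = { "comment" => "line one\nsee a/b. done" }

uri = URI("https://example.com/form")
uri.query = URI.encode_www_form(params)

puts uri.to_s
# => https://example.com/form?comment=line+one%0Asee+a%2Fb.+done

# Decoding the query string gives back the original value.
decoded = URI.decode_www_form(uri.query)
```

The line break becomes %0A and the slash becomes %2F. The period is left alone because it is an unreserved character and does not need escaping in a query string.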

Rails: base64 and character escaping problem

In my app I need to encode a string via base64, escape its possible special characters and put it into a URL.
I do the following:
string = "random_email#server.com"
enc = OpenSSL::Cipher::Cipher.new('DES-EDE3-CBC')
enc.encrypt('dummy_salt')
encoded = URI.escape(Base64.encode64(enc.update(string) << enc.final))
The problem is that URI.escape somehow does not escape the '/' character. That's completely unacceptable if the encoded string is intended to be used as a URL parameter.
Why does URI.escape not escape '/'? Should I use an .escape method other than the one that comes from URI? Or should I even use another encoding method (I don't think so)?
Any suggestions as to the code are welcome too.
Use CGI.escape instead :-)
require 'cgi'
puts CGI.escape('/') # => "%2F"
If you need to escape html you can also do:
CGI::escapeHTML('Usage: foo "bar" <baz>')
=> "Usage: foo &quot;bar&quot; &lt;baz&gt;"
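Another option, not mentioned in the answers above, is to avoid the problematic characters in the first place by using the URL-safe base64 alphabet, which replaces "+" and "/" with "-" and "_". The sketch below uses String#tr on strict base64 output; the helper names are hypothetical, and the standard library's Base64.urlsafe_encode64 performs the same substitution:

```ruby
# Hypothetical helpers: URL-safe base64 without line breaks.
def urlsafe_encode(bytes)
  # "m0" is strict base64 (no newlines); then swap the two unsafe characters.
  [bytes].pack("m0").tr("+/", "-_")
end

def urlsafe_decode(text)
  # Reverse the substitution, then decode strict base64.
  text.tr("-_", "+/").unpack1("m0")
end

raw = "\xFB\xFF\xFE".b # sample binary data standing in for the ciphertext
token = urlsafe_encode(raw)
puts token
# => -__-
```

The "=" padding is kept here, so a token placed in a query string may still need escaping for that one character.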

Rails: post data with a '+' is getting set to a blank

I have post data that includes a '+' sign. Once it makes it to the server the raw post data is showing the '+' sign but once the post data makes it into the param hash the '+' sign has been converted to a blank. Any ideas on how to make it NOT do that?
If you replace your '+' signs with '%2B', this should resolve the issue.
However, also note that you probably need to check your ampersands, percent signs, and other characters as well. The server receiving your post data is probably expecting URLEncoded data.
In a nutshell, if you first replace % signs with %25, then replace & with %26, ? with %3F, # with %23, and + signs with %2B, you will cover most of the issues you can encounter.
A more in-depth list of replacements can be found at these links.
HTML Url Encoding (w3schools)
Percent-Encoding (wikipedia)
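The replacements listed above can be sketched as a single gsub pass. Doing them in one pass sidesteps the ordering problem, because a % produced by an earlier replacement is never seen again. The constant and method names are illustrative assumptions:

```ruby
# The five characters named above: % & ? # +
SPECIAL_CHARS = /[%&?#+]/

# Hypothetical helper: replace each special character with its %XX form.
def escape_special(text)
  text.gsub(SPECIAL_CHARS) { |char| format("%%%02X", char.ord) }
end

puts escape_special("50% off & more? #1+1")
# => 50%25 off %26 more%3F %231%2B1
```

This only covers the characters from the list, so spaces and non-ASCII characters pass through unchanged.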
Have a look at the CGI.escape method in the standard library that will do this for you:
irb(main):001:0> require 'cgi'
=> true
irb(main):002:0> CGI.escape 'foo+bar&baz?qux quux/corge'
=> "foo%2Bbar%26baz%3Fqux+quux%2Fcorge"
There's also a CGI.unescape method should you need to convert back.
Try replacing the + with %2B.
Not sure why that is happening. Normally + signs make it through to the params. Can you post your Rails version? Also try escaping the "+" sign with "+" or its CGI equivalent "%2B" to see if it makes a difference.
There is a Ruby call that handles all this for you, so you don't need to figure out the characters yourself:
require 'uri'
url = "http://www.google.com?a=this is a test"
URI.escape(url, Regexp.new("[^#{URI::PATTERN::UNRESERVED}]"))
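To see why the "+" turns into a blank, it may help to look at how form-encoded data is decoded. In that encoding a literal "+" stands for a space, so a real plus sign has to be sent as %2B. The sketch below uses the form-component helpers from the uri standard library; the sample value is made up for illustration:

```ruby
require "uri"

raw = "1+1"

# Sent as is, the "+" is read as an encoded space on the receiving side.
misread = URI.decode_www_form_component(raw)
puts misread
# => 1 1

# Encoding first turns "+" into %2B, which decodes back to a plus sign.
encoded = URI.encode_www_form_component(raw)
puts encoded
# => 1%2B1
restored = URI.decode_www_form_component(encoded)
puts restored
# => 1+1
```

The same rule applies whether the client is a browser form or hand-built post data: the value must be encoded before it is sent.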
