In our Rails REST API we need to get an object by its (unique) name. The name can consist of arbitrary characters, Cyrillic ones in particular. My first attempt was the following:
GET /api/objects/name/:name
An alternative would be:
GET /api/objects?name=:name
In both cases, :name can contain arbitrary characters and must be encoded correctly. I have green RSpec tests running with Тест Веселье 123, for example, but I worry that there may be edge cases that don't encode correctly.
What is the correct way to avoid URL parameter encoding issues for GET requests with a Rails backend? Can the issue be avoided altogether? Is there a solution to sufficiently test encoding edge cases?
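One way to pin down such edge cases is a request spec that round-trips percent-encoded names through the route. Below is a minimal sketch, assuming the first URL variant, a hypothetical objects#show_by_name action that renders the object as JSON, and a hypothetical Thing model with a unique name column:

# spec/requests/objects_by_name_spec.rb -- a sketch, not a drop-in test.
# Assumes a route such as:
#   get "/api/objects/name/:name", to: "objects#show_by_name",
#       constraints: { name: /[^\/]+/ }, format: false
require "rails_helper"
require "erb"

RSpec.describe "GET /api/objects/name/:name", type: :request do
  ["Тест Веселье 123", "a+b&c", "100%", "semi;colon", "q?mark"].each do |name|
    it "round-trips #{name.inspect}" do
      thing = Thing.create!(name: name)
      # ERB::Util.url_encode percent-encodes everything, including spaces as %20;
      # CGI.escape would emit "+" for spaces, which a path segment treats literally.
      get "/api/objects/name/#{ERB::Util.url_encode(name)}"
      expect(response).to have_http_status(:ok)
      expect(JSON.parse(response.body).fetch("name")).to eq(thing.name)
    end
  end
end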
I'm reading the book HTTP: The Definitive Guide, which gives the general URL format:
<scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<frag>
About the <params> part, it says:
The path component for HTTP URLs can be broken into path segments. Each segment can have its own params. For example:
http://www.joes-hardware.com/hammers;sale=false/index.html;graphics=true
It seems to me that path params could be used to query resources much like query strings, so why are they so rarely seen?
I'm a Rails developer, and I haven't seen them used or specified anywhere in Rails. Does Rails not support them?
You ask several questions.
Why do we not see ;params=value much?
Because query parameters using ?, =, and & are widely supported, for example in PHP, .NET, and Ruby, with convenient accessors such as PHP's $_GET[].
Params delimited by ; or , do not have such convenient helpers. You do encounter them in REST APIs, where they are picked apart in the .htaccess file or in the controller to get the relevant parameters.
Does Ruby support params delimited with ;?
Once you obtain the current URL, you can get all such parameters with a simple regex call. This is also why they are used in .htaccess files: they are easily regexed (if that's a word).
Both parameter-passing structures are valid and can be used; the main reason one is used more often than the other comes down to preference and the level of support in different languages.
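As an illustration of that "simple regex call", here is one way to pull ;-delimited path params out of a URL path in Ruby. This is a sketch using plain string splitting, and the helper name is made up:

# Rails does not parse ";"-delimited path segments for you, but the request
# path can be taken apart by hand.
def matrix_params(path)
  path.split("/").each_with_object({}) do |segment, result|
    name, *pairs = segment.split(";")
    next if pairs.empty?
    result[name] = pairs.to_h { |pair| pair.split("=", 2) }
  end
end

matrix_params("/hammers;sale=false/index.html;graphics=true")
# => { "hammers" => { "sale" => "false" }, "index.html" => { "graphics" => "true" } }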
I've created a REST API for an existing system using Rails and am attempting to consume it from an external system via ActiveResource. Unfortunately, the primary key of one of the core tables is an arbitrary user-defined string, so many non-URL-friendly characters have been used over the years. We've ended up with keys such as "CR 1400/2400 A-C", which ActiveResource does not encode correctly into a RESTful URL. It deals with the spaces correctly, but does not encode the forward slashes, among other characters.
I would like to be able to call the find method with the primary key containing these forbidden characters such as:
p = Project.find('CR 1400/2400 A-C')
which would result in a url such as:
http://localhost:3000/projects/CR%201400%2F2400%20A-C.json
instead of:
http://localhost:3000/projects/CR%201400/2400%20A-C.json
I cannot change the database schema even though currently very little would bring me greater joy.
Is there a way to tell ActiveResource to encode additional characters, or intercept the call to encode them prior to constructing the URL?
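One direction worth trying (an untested sketch that leans on ActiveResource internals, not an official hook) is to override element_path on the resource class and percent-encode the id yourself, since the default escaping leaves "/" alone:

require "active_resource"
require "erb"

class Project < ActiveResource::Base
  self.site = "http://localhost:3000"

  # Sketch: build the element path ourselves so "/" inside the id becomes %2F
  # instead of being treated as a path separator. ERB::Util.url_encode encodes
  # spaces as %20 and slashes as %2F. prefix, collection_name and query_string
  # are ActiveResource internals.
  def self.element_path(id, prefix_options = {}, query_options = nil)
    encoded = ERB::Util.url_encode(id.to_s)
    "#{prefix(prefix_options)}#{collection_name}/#{encoded}.#{format.extension}#{query_string(query_options)}"
  end
end

Project.find("CR 1400/2400 A-C")
# would request http://localhost:3000/projects/CR%201400%2F2400%20A-C.json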
I'm building a little token-based authentication library for my (Rails-based) API server, which uses Redis to store generated auth tokens. The line I'm worried about is user_id = $redis.get("auth:#{token}"), where token is what's passed in to authenticate_or_request_with_http_token.
If this were SQL, that'd be a huge red flag: string-interpolated SQL queries are pretty insecure. As far as I can tell, however, doing string interpolation in a Redis key lookup isn't insecure.
My source for the above claim is the Redis documentation here: http://redis.io/topics/security (under the "String escaping and NoSQL injection" header), but I wanted to make sure this is the case before I open myself up to a Bobby Tables attack.
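For context, the setup in question looks roughly like this (a sketch; the controller and model names are assumptions):

class ApiController < ApplicationController
  before_action :authenticate

  private

  def authenticate
    authenticate_or_request_with_http_token do |token, _options|
      user_id = $redis.get("auth:#{token}") # the interpolation in question
      @current_user = user_id && User.find_by(id: user_id)
    end
  end
end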
The documentation you are pointing to is quite explicit:
The Redis protocol has no concept of string escaping, so injection is impossible under normal circumstances using a normal client library. The protocol uses prefixed-length strings and is completely binary safe.
There is a small attack vector for these kinds of string injections. While the Redis documentation is clear about the difficulty of executing multiple commands on the database, it does not mention that the key separator (':' in your example) usually needs to be escaped when used as part of a key.
I have seen a redis database using these keys:
oauth_token:123456 (which contained a hash of OAuth token parameters) and
oauth_token:123456:is_temp (which contained a boolean property to indicate whether the OAuth token is a temporary token)
Trusting the user input without escaping might result in GET oauth_token:#{token} accidentally ending up as GET oauth_token:123456:is_temp (when token has been set to 123456:is_temp by the user).
So I highly recommend properly escaping colons in potential user input, to make sure your key paths cannot be tricked like this.
NOTE: Someone recommended fixing the example above by using oauth_token:123456 and oauth_token:is_temp:123456, but that is still flawed (consider the user-provided token is_temp:123456). The correct solution to that problem (without escaping) would be to use the keys oauth_token:info:123456 and oauth_token:is_temp:123456, so that these keys cannot overlap whatever the user-provided input is (or simply escape the colons).
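A minimal way to close that gap is to validate (or escape) the token before it ever reaches a key name. A sketch, assuming tokens are generated as hex strings (e.g. via SecureRandom.hex):

# Reject anything that could smuggle a key separator into the key name.
def redis_auth_key(token)
  unless token.is_a?(String) && token.match?(/\A\h+\z/)
    raise ArgumentError, "malformed auth token"
  end
  "auth:#{token}"
end

user_id = $redis.get(redis_auth_key(token))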
Basically, Redis is immune to escaping issues when the input string is used verbatim. For example:
SET mykey <some-attacker-chosen-data>
However, Redis is not immune to issues arising from unvalidated input used in string interpolation, as shown by Sven Herzberg. To turn Sven's example into a safe one, it is possible to just use a hash and avoid resorting to interpolation. Otherwise, either use uncommon prefixes in conjunction with key interpolation, or apply some basic sanity check to the input: filter away the separator used, or better, validate that the input is actually a number (in this specific example).
So while Redis does not suffer from the typical injection attacks of SQL, some care should be taken when untrusted input is used in string interpolation to create key names, or even worse, in Lua scripts.
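A sketch of the hash-based variant mentioned above, using the redis-rb client (the key name auth_tokens is made up): every token is a field inside one hash, so user-supplied input never becomes part of a key name.

require "redis"
require "securerandom"

redis = Redis.new
token = SecureRandom.hex(16)

# Write side: the token is a hash field, never part of a key name.
redis.hset("auth_tokens", token, 42)        # 42 standing in for a user id

# Read side: no interpolation into a key, so a token like "123456:is_temp"
# cannot collide with a key such as "auth:123456:is_temp".
user_id = redis.hget("auth_tokens", token)  # => "42"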
I have some easy-to-read URLs for finding data that belongs to a collection of record IDs, using a comma as the delimiter.
Example:
http://www.example.com/find:1%2C2%2C3%2C4%2C5
I want to know what happens if I change the delimiter from a comma to a period. Since a period is not a special character in a URL, it won't have to be encoded.
Example:
http://www.example.com/find:1.2.3.4.5
Are there any browsers (Firefox, Chrome, IE, etc) that will have a problem with that URL?
There are some related questions here on SO, but none that specifically say whether it's good or bad practice.
To me, that looks like a resource with an odd query string format.
If I understand correctly this would be equal to something like:
http://www.example.com/find?id=1&id=2&id=3&id=4&id=5
Since your filter is acting like a multi-select (IDs instead of search fields), that would be my guess at a standard equivalent.
Browsers should not have any issues with it, as long as the application's route mechanism handles it properly. And as long as you are not building that query-like thing with an HTML form (in which case you would need JS or some rewrites, ew!).
May I ask why not use a more standard URL and query string? Perhaps something that includes the element class (/reports/search?name=...), just to make it clear what is being queried by find. Just curious; I know sometimes standards don't apply.
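If the period-delimited form does stay, handling it in Rails might look roughly like this (a sketch; route, controller, and model names are assumptions, and /find/:ids is used to sidestep the literal colon from the example URLs):

# config/routes.rb -- the custom constraint allows dots in the segment, and
# format: false keeps Rails from stripping ".5" as a format extension.
get "/find/:ids", to: "records#find", constraints: { ids: /[0-9.]+/ }, format: false

# app/controllers/records_controller.rb
class RecordsController < ApplicationController
  def find
    ids = params[:ids].split(".").map(&:to_i)   # "1.2.3.4.5" => [1, 2, 3, 4, 5]
    render json: Record.where(id: ids)
  end
end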
In Ruby on Rails 3 (currently using Beta 4), I see that when using the form_tag or form_for helpers, there is a hidden field named _snowman with the value of ☃ (Unicode U+2603, the snowman character) showing up.
So, what is this for?
This parameter was added to forms in order to force Internet Explorer (5, 6, 7 and 8) to encode its parameters as Unicode.
Specifically, this bug can be triggered if the user switches the browser's encoding to Latin-1. To understand why a user would decide to do something seemingly so crazy, check out this Google search. Once the user has put the website into Latin-1 mode, if they use characters that can be understood as both Latin-1 and Unicode (for instance, é or ç, common in names), Internet Explorer will encode them in Latin-1.
This means that if a user searches for "Ché Guevara", it will come through incorrectly on the server-side. In Ruby 1.9, this will result in an encoding error when the text inevitably makes its way into the regular expression engine. In Ruby 1.8, it will result in broken results for the user.
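A quick illustration of that failure mode in plain Ruby (1.9+), just to show the mismatch being described:

latin1_bytes = "Ché Guevara".encode("ISO-8859-1")   # "é" is the single byte 0xE9 here
mislabeled   = latin1_bytes.force_encoding("UTF-8") # same bytes, now claimed to be UTF-8

mislabeled.valid_encoding?   # => false
mislabeled =~ /Ché/          # raises ArgumentError: invalid byte sequence in UTF-8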
By creating a parameter that can only be understood by IE as a unicode character, we are forcing IE to look at the accept-charset attribute, which then tells it to encode all of the characters as UTF-8, even ones that can be encoded in Latin-1.
Keep in mind that in Ruby 1.8, it is extremely trivial to get Latin-1 data into your UTF-8 database (since nothing in the entire stack checks that the bytes that the user sent at any point are valid UTF-8 characters). As a result, it's extremely common for Ruby applications (and PHP applications, etc. etc.) to exhibit this user-facing bug, and therefore extremely common for users to try to change the encoding as a palliative measure.
All that said, when I wrote this patch, I didn't realize that the name of the parameter would ever appear in a user-facing place (it does with forms that use the GET action, such as search forms). Since it does, we will rename this parameter to _e, and use a more innocuous-looking unicode character.
This is here to support Internet Explorer 5 and encourage it to use UTF-8 for its forms.
The commit message seen here details it as follows:
Fix several known web encoding issues:
* Specify accept-charset on all forms. All recent browsers, as well as IE5+, will use the encoding specified for form parameters.
* Unfortunately, IE5+ will not look at accept-charset unless at least one character in the form's values is not in the page's charset. Since the user can override the default charset (which Rails sets to UTF-8), we provide a hidden input containing a unicode character, forcing IE to look at the accept-charset.
* Now that the vast majority of web input is UTF-8, we set the inbound parameters to UTF-8. This will eliminate many cases of incompatible encodings between ASCII-8BIT and UTF-8.
* You can safely ignore params[:_snowman]
In short, you can safely ignore this parameter.
Still, I am not sure why we're supporting old technologies like Internet Explorer 5. It seems like a very non-Ruby on Rails decision if you ask me.