passing regular expressions as params into ruby - ruby-on-rails

so i would like to expose regular expression queries on a field in my model, such that user could ask for
http://localhost:3000/myview.json?field=^hello, (there|world).*
so i know i'll have to change my routes to recognise the wildcard characters etc, and i can easily do a Regexp.new() inside my controller to convert this to a real regular expression (i'm using mongomapper in the back).
the issue is the potentially huge security hole with XSS.
should i be worried about this? how could i safely enable users to query with regular expression strings.
(i'm not too bothered about the user hammering the database... yet)

Regular expressions won't be able to perform arbitrary code execution unless there is something really wrong with Regexp.new. So if we assume that Regexp.new will either make a valid regular expression or fail or do something else sane you are safe already without having to sanitize the incoming string.

Related

Is data.to_json.html_safe susceptible to XSS attack?

I'm trying to figure out if this code is safe.
Is it at all possible to attack this code?
<script>
data = <%= data.to_json.html_safe %>;
</script>
In other words, what value of data would result in a successful attack?
It kind of depends on what you are doing with the data and the version of Rails you are using. If you are using anything past Rails 3 then no, calling html_safe could make your code vulnerable to XSS.
Basically, what you are doing is telling the app that data.to_json is html safe. However, the application doesn't actually know that for sure.
What html_safe does is it marks a string as safe to be inserted directly into HTML without escaping anything within the string. As described in the method api, it should never be used on user input. Constructed input may be safe, but it is up to you to ensure that it is.
to_json converts a given string into JSON. By default, it does not escape HTML characters like <, / >
Thus, if data is user input, it is entirely possible for someone to insert their own script into it and have it marked as safe (and thus rendered as html) the way it is currently written.
The way this is written, if someone does the following:
data = "</script><script>insert_xss_attack_here</script>"
Your code will not escape the script, resulting in the script being executed by the code.
Many people have described the issues with html_safe and to_json:
This deals specifically with to_json.html_safe
http://jfire.io/blog/2012/04/30/how-to-securely-bootstrap-json-in-a-rails-view/
https://bibwild.wordpress.com/2013/12/19/you-never-want-to-call-html_safe-in-a-rails-template/
http://makandracards.com/makandra/2579-everything-you-know-about-html_safe-is-wrong

is it good practice to encode user input to database?

I am wondering if it is consider good practice to encode user input to database.
Or is it ok to not encode to user input instead.
Currently my way of doing it is to encode it when entering database and use Html.DisplayFor to display it.
No. You want to keep the input in its original form until you need it and know what the output type is. It might be HTML for now, but later if you want to change it to json, text file, xml, etc the encoding might make it look different then you want.
So, first you want to make sure you are securely validating your input. It is a good idea to know what are the requirements for each of your inputs and validate that they are withing the correct length, range, character set, etc. It will be to your interest to limit the type of characters that are allowed as valid characters of an input type. (If using Regular Expressions to validate input ensure you do not use a regular expressions that is susceptible to a Regular Expression Denial of Service.
When moving the data around in your code ensure that you are properly handling the data in a manner that it will not turn into an Injection Attack.
Since you are talking about a database, the best practice is to use paramaterized statements. Check out the prevention methods in the above link.
Then when it comes outputting using MVC, if you are not using RAW or MvcHtmlString functions/calls, then the output is automatically encoded. With the automatic encoding, you want to make sure you are using the AntiXss encoder and not the default (whitelist approach vs. blacklist). Link
If you are using Raw or MvcHtmlString, you want to make sure you COMPLETE TRUST the values (you hard coded them in) or you manually encode them using the AntiXss Encoder class.
No it is not necessary to encode all the user inputs, rather if you want to avoid the script injection either you my try to validate the fields for special characters like '<', '>', '/', etc. else your Html helper method itself will do the needful.

Is it possible to run a string injection attack on a redis query?

I'm building a little token-based authentication library for my (rails based) api server which uses redis to store generated auth tokens. The line I'm worried about is: user_id = $redis.get("auth:#{token}"), where token is what's passed in to authenticate_or_request_with_http_token.
If this were SQL, that'd be a huge red flag - string interpolated SQL queries are pretty insecure. As far as I can tell, however, doing string interpolation on a redis key query isn't insecure.
My source for the above claim is the redis documentation here: http://redis.io/topics/security (under the string escaping and nosql injection header), but I wanted to make sure that this is the case before I get a Bobby Tables attack.
The documentation you are pointing to is quite explicit:
The Redis protocol has no concept of string escaping, so injection is impossible under normal circumstances using a normal client library. The protocol uses prefixed-length strings and is completely binary safe.
There is a small attack vector for these kinds of string injections. While the redis documentation is clear about the difficulty of executing multiple commands on the database, it does not mention that the key separator (':' in your example) usually needs to be escaped when used as the part of a key.
I have seen a redis database using these keys:
oauth_token:123456 (which contained a hash of OAuth token parameters) and
oauth_token:123456:is_temp (which contained a boolean property to indicate whether the OAuth token is a temporary token)
Trusting the user input without escaping might result in GET oauth_token:#{token} accidentally ending up as GET oauth_token:123456:is_temp (when token has been set to 123456:is_temp by the user).
So I highly recommend to properly escape colons from potential user input to make sure your key paths cannot be tricked like this.
NOTE: Someone recommended to fix the example above by using oauth_token:123456 and oauth_token:is_temp:123456, but that is flawed (for the user-provided token is_temp:123456). The correct solution to that problem (without escaping) would be to use keys oauth_token:info:123456 and oauth_token:is_temp:123456 to make sure these keys cannot overlap whatever the user-provided input was (or simply escape colons).
Basically Redis is immune from escaping issues when the input string is used verbatim. For example:
SET mykey <some-attacker-chosen-data>
However Redis is not immune from issues arising by using non validate input in the context of string interpolation, as showed by Sven Herzberg. In order to turn the Sven example into a safe one, it is possible to just use an Hash, and avoid reverting to interpolation. Otherwise either use not common prefixes to use in conjunction with keys interpolation, or use some basic form of sanity check on the input, which is, filtering away the separator used, or better, validate that the input is actually a number (in the specific example).
So while Redis does not suffer from the typical injection attacks of SQL, when used untrusted input in the context of a string interpolation used to create key names, or even worse, Lua scripts, some care should be taken.

Any problems with using a period in URLs to delimiter data?

I have some easy to read URLs for finding data that belongs to a collection of record IDs that are using a comma as a delimiter.
Example:
http://www.example.com/find:1%2C2%2C3%2C4%2C5
I want to know if I change the delimiter from a comma to a period. Since periods are not a special character in a URL. That means it won't have to be encoded.
Example:
http://www.example.com/find:1.2.3.4.5
Are there any browsers (Firefox, Chrome, IE, etc) that will have a problem with that URL?
There are some related questions here on SO, but none that specific say it's a good or bad practice.
To me, that looks like a resource with an odd query string format.
If I understand correctly this would be equal to something like:
http://www.example.com/find?id=1&id=2&id=3&id=4&id=5
Since your filter is acting like a multi-select (IDs instead of search fields), that would be my guess at a standard equivalent.
Browsers should not have any issues with it, as long as the application's route mechanism handles it properly. And as long as you are not building that query-like thing with an HTML form (in which case you would need JS or some rewrites, ew!).
May I ask why not use a more standard URL and querystring? Perhaps something that includes element class (/reports/search?name=...), just to know what is being queried by find. Just curious, I knows sometimes standards don't apply.

is it bad form to have the colon in db and in the URL?

I am designing my namespace such that the id i am storing in the DB is
id -> "e:t:222"
where "e" represents the Event class, "t" represents the type
i am also expecting to use this id in my urls
url -> /events/t:222
Is there anything wrong with doing this?
Is there anything wrong with doing this?
Yes: The colon is a reserved character in URLs that has a special meaning, namely specifying the server port, in a URL.
Using it in other places in the URL is a bad idea.
You would need to URLEncode the colon in order to use it.
There is nothing wrong with doing this, you'll simply need to encode the URL properly. Most libraries with do this automatically for you.
In general though, if you care about your data you shouldn't let the application drive the data or database design. Exceptions to this are application centric databases that have no life outside of a single application nor do you expect to use the data anywhere else. In this case, you may want to stick with schemas and idioms that work best with your application.

Resources