Storing generated HTML and preventing XSS attacks in Rails 3 - ruby-on-rails

I am writing an app where a lot of heavy duty calculations are performed in the browser to generate and format some content. The resulting content is HTML that I would like to save on the server (structure, links, results of calculations, etc). It does not have any javascript or CSS (in style tags).
Is there a Rails method or plugin I can use in a model's before_save or after_initialize method to strip javascript/css from the content and safely send it back to the browser (via JSON, FYI) while preventing XSS attacks, given the simple structure of the content?

I personally use the Sanitize gem. Using this gem you can configure what HTML elements and attributes you want to allow and the gem will strip anything else.
I'd say this is a good choice for your app. Since you want to allow some HTML but disallow JavaScript and CSS, a sanitizer built on a full-blown HTML parser is most likely the best option, because there are numerous crafty ways of injecting JavaScript or CSS into HTML.
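As a rough illustration of the whitelist approach, here is a minimal sketch. It assumes a recent version of the Sanitize gem, where `Sanitize.fragment` and the built-in `Sanitize::Config::BASIC` whitelist are available. The helper name `clean_html` is hypothetical, and the fallback branch (escape everything when the gem is not installed) is only there so the sketch runs anywhere; it is not part of the gem.

```ruby
require "erb"

begin
  require "sanitize" # third-party gem, assumed to be in the app's Gemfile
  SANITIZE_AVAILABLE = true
rescue LoadError
  SANITIZE_AVAILABLE = false
end

# Hypothetical helper: returns HTML that should be safe to send back to
# the browser.
def clean_html(html)
  if SANITIZE_AVAILABLE
    # Keep only the elements and attributes whitelisted by the BASIC config.
    Sanitize.fragment(html, Sanitize::Config::BASIC)
  else
    # Assumption for this sketch only: without the gem, escape everything.
    # This is safe but loses all formatting.
    ERB::Util.html_escape(html)
  end
end

puts clean_html('<b>result</b><script>alert(1)</script>')
```

In a model, a helper like this could be called from a before_save callback so that only cleaned content reaches the database.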

Related

Performance difference between plain HTML and Rails/ERB helper templates

I am new to Rails and trying to understand this concept. Many things we write using helpers (ERB/Rails tags) can also be written in plain HTML. Is there any advantage to using Rails/ERB helpers other than making the code simpler and more readable?
Since the end result of an ERB/Rails template is always plain HTML, does writing plain HTML in the first place reduce the load on the server by sparing it the work of converting the templates into HTML?
Note: I am asking specifically about mostly static templates, e.g. web forms, links, form contents, etc.
Here are some of the merits.
Ability to collaborate with models and helpers
It automatically generates the POST URL, form labels, and so on. Say you change the name of a model or a URL: if you had written plain HTML for all templates, you would have to replace every occurrence by hand, whereas the "Rails way" can handle them all with a one-line change or a single command.
Can take advantage of template libraries
There are lots of great template libraries that generate HTML from Ruby code.
https://github.com/plataformatec/simple_form
https://github.com/justinfrench/formtastic
Gives you better abstraction
It gives you good abstraction in that it lets you write what you want instead of how to do it. For example, in a previous project I was using Bootstrap 2 and decided to move to Bootstrap 3. Had I been writing plain HTML, I would have had to go through every HTML file, inspect sometimes intricately structured tags and classes, find all the Bootstrap 2 specific elements, and change them all. Thanks to the template generation gem I was using, all I had to do was basically upgrade the gem and add a few lines to some config files.
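The "change it in one place" point can be sketched with Ruby's standard ERB library. The `post_path` and `link_to` methods below are simplified, hypothetical stand-ins for the Rails helpers of the same names, not the real implementations.

```ruby
require "erb"

# Hypothetical route helper: the URL scheme lives in exactly one place.
def post_path(id)
  "/posts/#{id}"
end

# Simplified stand-in for Rails' link_to; escapes both text and URL.
def link_to(text, url)
  %(<a href="#{ERB::Util.html_escape(url)}">#{ERB::Util.html_escape(text)}</a>)
end

# The template says what it wants (a link to post 42), not how the URL is built.
TEMPLATE = ERB.new('<%= link_to("Read more", post_path(42)) %>')

def render_link
  TEMPLATE.result(binding)
end

puts render_link # => <a href="/posts/42">Read more</a>
```

If the URL scheme changes, only `post_path` needs editing; every template that calls it picks up the change.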

How can I use a WYSIWYG editor with rails, but also have the data sanitized server side?

There's no scarcity of WYSIWYG editors, but it seems like there's no simple path to having one and keeping some semblance of protection from bypassing client side validation and including script and object tags.
My initial thought was to find a WYSIWYG editor which would output markdown, then store markdown formatted text in the db and parse on display. This would protect me from storing potentially dangerous code in the db, but also keep me from needing to whitelist every possible tag that the editor would put out as I would need to if it were HTML.
Am I missing some really easy path here? How does everyone else balance having a usable editor but not opening themselves wide open to attacks?
Ryan Grove's sanitize gem is very customizable, and I think the basic or relaxed modes would work for sanitizing raw html from the WYSIWYG editor (and you wouldn't have to whitelist a bunch of tags).

Security implications of using CKEditor

I have just implemented CKEditor for rich text entry in my app and I am thinking that the ability of a user to enter anything could pose a security threat.
At the moment, I have the simplest implementation - CKEditor sits in a form, input is saved to the database as part of update_attributes, and other people can view the output as html_safe.
Somehow, the above doesn't sound good to me, even though it works. Am I correct in thinking there are risks to the above approach? Is there a safer way to do this to block an attack through the editor?
You should always take care to sanitize a user's input. In your case, that means stripping all unwanted HTML tags, regardless of where the input came from.
html_safe is not meant to strip HTML or sanitize anything for you. See Yehuda Katz's article on ActiveSupport::SafeBuffer. It only marks a String as safe: a string marked this way is output as-is, while anything not marked safe is encoded to HTML entities on output.
There are sanitization helpers in ActionView::Helpers::SanitizeHelper that you can use to sanitize what is displayed, but you might want to sanitize the input before it enters the database.
If you strip away the possibility of inserting CSS, Javascript or an iframe, you should be fine. If you're paranoid about what your users do, also take away <img> tags. And if you're really paranoid, you should consider using Markdown, Textile or others.

Security in a Rails app - User submitted data

I'm currently in the process of writing my first Rails app. I'm writing a simple blog app that will allow users to comment on posts. I'm pretty new to Rails, so I'm looking for a bit of guidance on how to address security concerns with user input.
On the front end, I am using TinyMCE to accept user input. It is my understanding that TinyMCE will strip out any suspicious tags (e.g. <script>) from user input before posting to the server. It seems that this could be bypassed by disabling JavaScript on the page, giving a user free rein in the text area. TinyMCE recommends using JavaScript to create the text area, so if the user disables JavaScript, there will be no text area. Is this the standard solution? It seems like a bit of a hack.
On the back end, what is the best way to strip out malicious code? Would I want to put some sort of validation in the create and update methods inside my comments controller? Is there some functionality built into Rails that can assist with this?
When displaying the information back out to the user, I'm assuming that I don't want to escape the HTML markup (with <%= h *text*%>), because that's how it's stored in the back end. Is this bad practice?
I'm generally a big fan of cleaning the data before putting it into the database. This is a debatable practice, but I usually lean toward it.
I use a modified version of the old white_list plugin that does not strip out the HTML, but converts anything I don't want into a safer format.
<tag>
becomes
&lt;tag&gt;
This way I'm not really altering the content of the submission.
There are some plugins that specifically handle sanitization using a white/black list model.
http://github.com/rgrove/sanitize/ # Have not used, but looks very interesting
http://github.com/imanel/white_list_model # Used, not bad
There is also act_as_sanitized, but I have no real info on that.
And of course, there is always h().
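For reference, the escaping that h() performs can be tried in plain Ruby, where `h` is an alias of `ERB::Util.html_escape` in the standard library (Rails ships its own variant of the same helper). The `escape_comment` wrapper below is a hypothetical name used only for this sketch.

```ruby
require "erb"

# Hypothetical wrapper: escape user-submitted text so the browser
# displays the markup as text instead of interpreting it.
def escape_comment(text)
  ERB::Util.html_escape(text)
end

puts escape_comment("<tag>")           # => &lt;tag&gt;
puts escape_comment("fish & <chips>")  # => fish &amp; &lt;chips&gt;
```

Nothing is removed from the submission; the markup is only rendered inert.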
Your suspicions are justified, but the creation of a text area in javascript won't make you any less vulnerable. A user could always use something like curl to force a form submission without ever visiting your site through a web browser.
You should assume that a user can post malicious scripts in the comments, and escape the content on output. Using <%= h(...) %> is one way to do it, or you can use the sanitize method in the same way. It will strip any scripts and escape all other HTML except for a few common tags that aren't harmful. See the documentation for sanitize.
In addition to nowk's suggestions there is also the xss_terminate plugin. I have been using it in some of my applications. I found it to be easy to use, it needs almost no configuration, and has been working like a charm.

Is it the best practice to store content in the database as html when content is written in markdown?

Or is just saving markdown and rendering it on requests usually okay?
I'm writing a site that uses Markdown for content. Stack Overflow similarly uses Markdown for comments and questions.
I'm storing the content as markdown in the database and then rendering it to html when the user visits the site.
I've got a feeling I ought to be storing the markdown and the html output in the database to cut down the load on the server. However, the performance doesn't seem like an issue now (famous last words.)
It's a Rails site using the rdiscount gem to convert the Markdown.
That depends on whether you intend for the Markdown content to be editable. If it's write-once-edit-never, there's no need to keep the source. Otherwise, obviously you need to keep the Markdown.
In most cases, rendering Markdown (at least with a decent library) won't stress a server at all. If server-side processing starts to become an issue, take a look at caching (memcached or similar).
I think it's quite appropriate to store a cached HTML version but keep the Markdown as well, just in case you need to either:
Display it somewhere else
Regenerate the HTML cache, due to some security issue
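A minimal sketch of keeping both the source and a cached rendering follows. The class name `CachedContent` and the placeholder renderer are hypothetical; in the app described above, the renderer would presumably wrap rdiscount instead.

```ruby
require "erb"

# Hypothetical stand-in for a real Markdown renderer, so the sketch runs
# without extra gems. With rdiscount installed, something like
#   ->(md) { RDiscount.new(md).to_html }
# could be passed instead.
PLACEHOLDER_RENDERER = ->(md) { "<p>#{ERB::Util.html_escape(md)}</p>" }

class CachedContent
  attr_reader :markdown, :html

  def initialize(markdown, renderer: PLACEHOLDER_RENDERER)
    @renderer = renderer
    self.markdown = markdown
  end

  # Editing the source refreshes the cached HTML right away.
  def markdown=(text)
    @markdown = text
    regenerate!
  end

  # Rebuild the cache from the stored source, e.g. after fixing a
  # security issue in the renderer.
  def regenerate!
    @html = @renderer.call(@markdown)
  end
end

post = CachedContent.new("Hello & welcome")
puts post.html # => <p>Hello &amp; welcome</p>
```

Because the Markdown source is never discarded, the cached HTML can be thrown away and rebuilt at any time.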
