Are there any better alternatives to Sanitize for a Ruby app? - ruby-on-rails

I love Sanitize. It's an amazing utility. The only issue I have with it is that it takes forever to prepare a development environment, because it depends on Nokogiri, which is slow to compile. Are there any libraries that do what Sanitize does (or even roughly what it does) without using Nokogiri? This would help enormously!

Rails has its own SanitizeHelper.
According to http://api.rubyonrails.org/classes/ActionView/Helpers/SanitizeHelper.html:
This sanitize helper will html encode all tags and strip all attributes that aren’t specifically allowed.
It also strips href/src tags with invalid protocols, like javascript: especially. It does its best to counter any tricks that hackers may use, like throwing in unicode/ascii/hex values to get past the javascript: filters. Check out the extensive test suite.
You can use it in a view like so
<%= sanitize @article.body %>
You can visit the link to see more customizing options like:
Custom Use (only the mentioned tags and attributes are allowed, nothing else)
<%= sanitize @article.body, tags: %w(table tr td), attributes: %w(id class style) %>

Related

rails 5.x: add nofollow to all links in 'sanitize'

I am working on a Rails application whose HAML templates frequently make use of a routine called sanitize. I have deduced from context that this routine sanitizes user-controlled HTML. Example:
# views/feed_items/_about.html.haml
%h3 Summary:
.description
  = sanitize @feed_item.description
I want to make this routine add 'rel=nofollow' to all outbound links, in addition to what it's already doing. What is the most straightforward way to do that?
N.B. I am not having any luck finding the definition of this method, or the official configuration knobs for it. The vendor directory has two different HTML sanitizer gems in it and I can't even figure out which one is being used. This is a large, complicated web application that I did not write, and I barely understand Ruby, let alone all of Rails' extensions to it. Please assume I do not know any of the things that you think are obvious.
The sanitizer will strip out rel attributes if they exist.
I ran into a similar issue and added an additional helper method - clean_links to the ApplicationHelper module, and called it after sanitizing the content.
# application_helper.rb
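def clean_links(html)
  # Assumed pattern (the original regex is not recoverable here): capture the
  # attributes of every opening <a> tag and append rel="nofollow".
  html = html.gsub(/<a\s([^>]*)>/i, '<a \1 rel="nofollow">')
  html.html_safe
end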
This method looks for all <a> tags, and adds rel="nofollow". The html_safe method is necessary or else the HTML will be displayed as a string (it's already been sanitized).
This solution treats all links equally, so if you only want this for links pointing outside your domain, you'll have to update the regex accordingly.
In your view: <%= clean_links sanitize(@something) %>
So, first the content is sanitized, then the rel="nofollow" attribute is added before the link is displayed.
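For the "only links pointing outside the domain" case mentioned above, one possible variant in plain Ruby is sketched below. Treat it as an illustration only: nofollow_external_links and OWN_HOST are hypothetical names, example.com is a placeholder, and a regex will not cope with every way an <a> tag can be written (it also ignores any rel attribute already present), so an HTML parser is the more robust choice.

```ruby
# Hypothetical helper, not part of Rails: adds rel="nofollow" to absolute
# http(s) links whose host is not our own. OWN_HOST is a placeholder.
OWN_HOST = "example.com"

def nofollow_external_links(html, own_host: OWN_HOST)
  # Matches URLs on our own host or any of its subdomains.
  own = %r{\Ahttps?://([^/]*\.)?#{Regexp.escape(own_host)}(?=[/:?]|\z)}i

  html.gsub(/<a\s+([^>]*href=["']([^"']*)["'][^>]*)>/i) do
    attrs = Regexp.last_match(1) # everything between "<a " and ">"
    href  = Regexp.last_match(2) # the link target
    external = href.match?(%r{\Ahttps?://}i) && !href.match?(own)
    external ? %(<a #{attrs} rel="nofollow">) : %(<a #{attrs}>)
  end
end

puts nofollow_external_links('<a href="http://other.org/x">other</a> <a href="/about">about</a>')
```

Relative links and links to the assumed host are left untouched. In a Rails view you would still call this after sanitize and mark the result html_safe, as in the answer above.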
Actually there's a built-in way:
sanitize "your input", scrubber: Loofah::Scrubbers::NoFollow.new

Rails 4: how to insert line breaks in text_area?

I have created a blog in rails. I'm a beginner and got quite far, but now I'm stuck with a seemingly minor detail: I can't seem to format the posts (articles).
Here's the relevant part of my show.html.erb:
<p>
<strong>Content:</strong>
<%= simple_format(@article.content) %>
</p>
When I write something and insert html-tags, they are not recognized as such. What am I doing wrong?
Rails will automatically remove HTML tags to prevent someone from injecting code into your webpage (e.g. malicious JavaScript).
If your users cannot enter data into @article.content and it's always safe, then you can flag it as safe using the html_safe method.
<%= simple_format(@article.content).html_safe %>
Can you post the article content for reference? If I had to guess, I'd imagine Rails is escaping the html tags and inserting them as plain text (so the output looks like: Article content !).
Take a look at Rails' helper methods like content_tag (http://apidock.com/rails/ActionView/Helpers/TagHelper/content_tag) and concat (http://apidock.com/rails/ActionView/Helpers/TextHelper/concat) and consider using those to help with generating the appropriate html tags.
An issue to be concerned with is who's going to be supplying the content. For example, if you're writing an application that other people will use, you want to make sure any HTML they give you is escaped to avoid XSS attacks. In that case, you'll want to spend some time reading about how to properly sanitize user input.
You can now specify the tag it gets wrapped in (defaults to p) like so:
<%= simple_format(@article.content, {}, wrapper_tag: "div") %>
Or add the CSS style white-space: pre-line to the element. It will display \r or \n (Enter) in user input as a new line.
For more info:
http://apidock.com/rails/v4.0.2/ActionView/Helpers/TextHelper/simple_format
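To show where the line breaks come from, here is a rough plain-Ruby approximation of this kind of formatting. It is a sketch under assumptions, not Rails' actual implementation: format_plain_text is a hypothetical name, and unlike simple_format it escapes all HTML instead of sanitizing it.

```ruby
require "erb"

# Hypothetical stand-in for simple_format: escape everything, wrap blocks
# separated by blank lines in <p>, and turn single newlines into <br />.
def format_plain_text(text)
  escaped = ERB::Util.html_escape(text.to_s)
  paragraphs = escaped.gsub(/\r\n?/, "\n").split(/\n{2,}/)
  paragraphs.map { |para| "<p>#{para.strip.gsub("\n", "<br />\n")}</p>" }.join("\n\n")
end

puts format_plain_text("First line\nsecond line\n\n<b>New paragraph</b>")
```

Escaping before adding the <p> and <br /> tags means any markup the user typed ends up as visible text, while the tags added by the helper stay real HTML.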

Rails translation performance impact <%= raw t( vs. <%= t(

I am building a multilingual application using rails-i18n with Ruby on Rails.
Most of the content (and DB entries) I have to translate is pure text, though part of it has some embedded HTML.
I was thinking of using <%= raw t('translation_key') %> instead of the straight <%= t('translation_key') %> to account for future changes that might include HTML.
If I adopt <%= raw t('translation_key') %> throughout the whole website, am I going to get any (negative) impact when it comes to
website performance
website security
You can just append _html to your translation keys to handle HTML in translations:
en:
  key_one: test text
  key_one_html: <p>test text</p>
Then the standard code will work:
<%= t('key_one_html') %>
Performance aspect:
The performance impact should be negligible. Calling raw copies the parameter into a new string (an ActiveSupport::SafeBuffer to be precise) with its html_safe flag set to true. On the other hand there is no longer HTML escaping performed on that string.
Security aspect:
There are more substantial drawbacks to using raw everywhere.
You say your translations are read from the database. Can the user edit these translations? If so...
You risk HTML injections: A malicious user could just enter a <script> tag.
All your translations must be HTML safe from now on. That means you have to manually escape all your translations, i.e. you have to replace <, >, and &.
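To make the escaping point concrete, here is a small plain-Ruby sketch of the idea: keep the HTML in the trusted translation string and escape only the untrusted values interpolated into it. render_html_translation is a hypothetical helper and the template string is made up for illustration; the only real API used is ERB::Util.html_escape from the standard library.

```ruby
require "erb"

# Hypothetical helper: the template is written by developers (trusted),
# the interpolated values come from users (untrusted) and get escaped.
def render_html_translation(template, values)
  escaped = values.transform_values { |value| ERB::Util.html_escape(value.to_s) }
  format(template, escaped)
end

puts render_html_translation("<p>Welcome, %{name}!</p>", name: "<script>alert(1)</script>")
# prints: <p>Welcome, &lt;script&gt;alert(1)&lt;/script&gt;!</p>
```

If the template itself can be edited by users, this is not enough, because the injected markup would then live in the trusted part.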
Alternatives:
There are other options if you need to incorporate HTML into your translations:
Use the _html suffix judiciously to prevent automatic escaping
Use localized views and partials, i.e. index.en.html, _footer.de-DE.html to translate larger parts of your views.
To streamline translation of database entries, you can try
Globalize
hstore_translate (PostgreSQL only)
Conclusion:
Using raw everywhere will lead to a lot of problems along the road. Save yourself a lot of trouble and just use it "as needed".
You could also use the globalize gem for bigger text / html sections of content.
https://github.com/globalize/globalize/blob/master/README.md
It also supports eager loading if you're worried about performance.

Rails - Outputting content, sanitize or <%=h?

I recently made a small rails3 app to convert an old cms written in another language. After migrating the content I am having problems outputting content from the database.
The @content.desc field sometimes has html. Currently the only way I could get it to work was:
<%= sanitize content.desc %>
But is this the best way? When I use <%=h @content.desc %> I can see the html tags still. When I use <%= simple_format @content.desc %> I get wicked spacing.
Is there a definitive guide somewhere where I can see all of the options while outputting content? I've tried to search but can't turn anything up (rails newb, i know).
Any string not marked as "safe" will be HTML-escaped by default in Rails 3. Some methods, such as sanitize, h, link_to and many other helpers return safe strings, thus allowing them to be written literally. See this blog post for more info.
If you know for sure that the HTML contained in @content.desc is safe, you can mark it as such yourself like so: <%= @content.desc.html_safe %>.
Rails 3 has changed HTML escaping to be enabled by default. If you're sure that the string you're rendering is safe, you can use
<%= @content.desc.html_safe %>
Unless I'm mistaken, you shouldn't have to sanitize the content before displaying it, as Rails 3 does that by default. More info here: http://yehudakatz.com/2010/02/01/safebuffers-and-rails-3-0/

Escaping HTML in Rails

What is the recommended way to escape HTML to prevent XSS vulnerabilities in Rails apps?
Should you allow the user to put any text into the database but escape it when displaying it? Should you add before_save filters to escape the input?
There are three basic approaches to this problem.
1. Use h() in your views. The downside here is that if you forget, you get pwnd.
2. Use a plugin that escapes content when it is saved. My plugin xss_terminate does this. Then you don't have to use h() in your views (mostly). There are others that work on the controller level. The downsides here are (a) if there's a bug in the escaping code, you could get XSS in your database; and (b) there are corner cases where you'll still want to use h().
3. Use a plugin that escapes content when it is displayed. CrossSiteSniper is probably the best known of these. This aliases your attributes so that when you call foo.name it escapes the content. There's a way around it if you need the content unescaped. I like this plugin but I'm not wild about letting XSS into my database in the first place...
Then there are some hybrid approaches.
There's no reason why you can't use xss_terminate and CrossSiteSniper at the same time.
There's also an ERB implementation called Erubis that can be configured so that any call like <%= foo.name %> is escaped -- the equivalent of <%= h(foo.name) %>. Unfortunately, Erubis always seems to lag behind Rails and so using it can slow you down.
If you want to read more, I wrote a blog post (which Xavor kindly linked to) about using xss_terminate.
The h is an alias for html_escape, which is a utility method for escaping all HTML tag characters:
html_escape('<script src=http://ha.ckers.org/xss.js></script>')
# => &lt;script src=http://ha.ckers.org/xss.js&gt;&lt;/script&gt;
If you need more control, go with the sanitize method, which can be used as a white-list of tags and attributes to allow:
sanitize(@article.body, :tags => %w(table tr td), :attributes => %w(id class style))
I would allow the user to input anything, store it as-is in the database, and escape when displaying it. That way you don't lose any information entered. You can always tweak the escaping logic later...
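A minimal sketch of that store-as-is, escape-on-display approach, using only Ruby's standard ERB library rather than Rails. render_comment and COMMENT_TEMPLATE are hypothetical names; the stored value is assumed to be exactly what the user typed.

```ruby
require "erb"

# Hypothetical template: the stored comment is escaped only at render time,
# so the original input stays untouched in storage.
COMMENT_TEMPLATE = ERB.new('<div class="comment"><%= ERB::Util.html_escape(comment) %></div>')

def render_comment(comment)
  COMMENT_TEMPLATE.result_with_hash(comment: comment)
end

stored = "<b>hi</b> & bye" # saved exactly as the user entered it
puts render_comment(stored)
# prints: <div class="comment">&lt;b&gt;hi&lt;/b&gt; &amp; bye</div>
```

Because the raw text is kept, the display logic can later be switched to something more permissive without having lost anything the user entered.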
Use the h method in your view template. Say you have a post object with a comment property:
<div class="comment">
<%= h post.comment %>
</div>
Or with this plugin - no need for h 8)
http://railspikes.com/2008/1/28/auto-escaping-html-with-rails
I've just released a plugin called ActsAsSanitiled using the Sanitize gem, which can guarantee well-formedness as well as being very configurable as to what kind of HTML is allowed, all without munging user input or requiring anything to be remembered at the template level.
