Why are "tel:" links removed in sanitization, and how to allow them - ruby-on-rails

I am using the Rails sanitize helper to clean up user input that may be formatted as Markdown.
I noticed that the method strips tel: links. Why does it do that, and how can I allow them?
>> sanitize("<a href='http://123'>click</a>")
=> "<a href=\"http://123\">click</a>"
>> sanitize("<a href='tel:123'>click</a>")
=> "<a>click</a>"
Of course, I have tried figuring it out from the page linked above, but was unable to. I would prefer to avoid writing a "scrubber" class, or any other class for that simple task.
I have also tried what I think means "allow all hrefs" but it did not have any effect (even after restarting the server).
# In config/application.rb
config.action_view.sanitized_allowed_attributes = ['href']

In Rails 4, Loofah is used for sanitizing HTML.
The Rails team extracted this feature into a separate gem (rails-html-sanitizer).
Loofah::HTML5::WhiteList::ALLOWED_PROTOCOLS doesn't have tel in its list, so the href is stripped from anchor tags that use it.
Solution:
Create an initializer that would add tel to above set of protocols.
Loofah::HTML5::WhiteList::ALLOWED_PROTOCOLS.add('tel')
Restart the app and this should work. (Newer Loofah releases name this module Loofah::HTML5::SafeList.)
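To illustrate why adding a protocol to the set changes the result, here is a minimal sketch of a protocol allowlist check. It is a simplified model using only the standard library, not Loofah's actual implementation; the constant and the allowed_href? helper are hypothetical names chosen for this example.

```ruby
require 'set'

# Simplified stand-in for a sanitizer's protocol allowlist (NOT Loofah's code).
ALLOWED_PROTOCOLS = Set.new(%w[http https mailto])

# Hypothetical helper: true when the href's scheme is on the allowlist.
def allowed_href?(href)
  scheme = href[/\A([a-z][a-z0-9+.\-]*):/i, 1]
  return true if scheme.nil? # relative URL, no scheme to check
  ALLOWED_PROTOCOLS.include?(scheme.downcase)
end

puts allowed_href?('http://123') # on the allowlist
puts allowed_href?('tel:123')    # not on the allowlist yet

# Equivalent in spirit to the initializer above: extend the set once at boot.
ALLOWED_PROTOCOLS.add('tel')
puts allowed_href?('tel:123')    # accepted after the addition
```

Because the set is mutated in place, the change applies to every later check in the process, which is why an initializer is a natural place for it.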

Related

Rails comments system with bb-code

In my Rails 4 app I want to add comments to my articles, with the kind of functionality most forum engines (like SMF) offer, so I need BBCode support.
Is there a good gem for this that supports Rails 4? And how can I translate [quote] into a div with some styling in the controller?
Also, is it a good idea to store HTML in the database?
For example, if I use Haml and somebody posts a comment such as
- current_user.id
or something similar, how do I secure my app against "bad boys"? I could change the comment system to use something like quote_parent_id, but that becomes hard to implement when one comment contains multiple quotes, so it seems better to store HTML and secure it somehow.
Can I do this, and how? Please suggest good ideas, tutorials, or gem links.
Look into https://github.com/veger/ruby-bbcode
Since it converts to HTML and does not execute user input as Ruby code, you'll be fairly safe. However, I haven't tried the gem, and it's possible it introduces some XSS vulnerabilities.
Have you considered Markdown as an option?
You should also look into https://github.com/asceth/bbcoder ( I should note I am the original author ).
In the controller, changing a string such as "[quote=user]My post of epic importance[/quote]" into a div etc is just doing:
# assume params[:comment] is the text you are converting
params[:comment].bbcode_to_html
As for storing html in a database, there is no right or wrong answer. If you want to allow users to edit their posts later then I would lean towards not storing the html version but storing their original bbcode version. This way when you allow them to edit you aren't having to convert html back to bbcode.
To make sure you aren't open to XSS and other attacks I recommend combining other gems like sanitize.
Sanitize.clean(text.to_s).bbcode_to_html
Some more notes:
Multiple tags and nested tags are parsed as they are seen without any additional steps required. So a comment or post with lots of bbcode tags, multiple quotes, b tags or anything else is dealt with by just calling bbcode_to_html on the variable/string.
If a user tries to use Haml in their post, it should appear as-is. Haml shouldn't try to eval the string unless you specifically tell it to, and I'm not even sure how you would do that unless Haml has a special filter or operator for it.
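The "store the raw BBCode, neutralize HTML first, then convert" idea can be sketched with the standard library alone. This is an illustrative toy, not how bbcoder or ruby-bbcode work: render_comment and SIMPLE_TAGS are hypothetical names, only two tags are handled, and nesting of the same tag is not supported.

```ruby
require 'cgi'

# Hypothetical tag map for this sketch; a real app would rely on a BBCode gem.
SIMPLE_TAGS = { 'b' => 'strong', 'i' => 'em' }.freeze

# Render stored BBCode at display time.
def render_comment(raw_bbcode)
  # Escape first, so any HTML the user typed is shown as text, not executed.
  html = CGI.escapeHTML(raw_bbcode.to_s)
  SIMPLE_TAGS.each do |bb, tag|
    html = html.gsub(/\[#{bb}\](.*?)\[\/#{bb}\]/m) { "<#{tag}>#{$1}</#{tag}>" }
  end
  html
end

puts render_comment('[b]hi[/b] <script>alert(1)</script>')
puts render_comment('- current_user.id') # Haml-looking text stays plain text
```

Keeping the original BBCode in the database and rendering on output means editing a comment later needs no HTML-to-BBCode conversion.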

What characters are allowed in a dynamic segment (param) in Rails?

I am using Rails and have a user entered field that can become a param in the URL. I'd like to add a validation that stops the users from entering any fields that will cause routing errors, as currently if the user enters a value like that we get an error "No route matches [GET]..." So far I know periods and slashes are not allowed...
What regex should I use for my validation? Or what regex does Rails use by default for dynamic segments?
No one has actually answered the question; the other answers only suggest workarounds (which are probably better, if you are in the right circumstances to use them). So I experimented to find the characters that caused issues. I tested all punctuation available on a standard US keyboard. I also tested space and (horizontal) tab. I did not test any extended Unicode punctuation, nor control characters.
The characters I found to cause problems in Rails 3.2.9, using webrick and the composite_primary_keys gem are:
,/.%
To validate that a field contains none of these characters:
validates :field_name, :format => { :with => /\A[^,\/\.%]*\z/,
:message => "commas, slashes, periods, and percent signs (,/.%) are not allowed"}
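As a quick check outside Rails, the same pattern can be exercised in plain Ruby. The SAFE_SEGMENT constant and the sample values are made up for this sketch.

```ruby
# Same pattern as in the validation above.
SAFE_SEGMENT = /\A[^,\/\.%]*\z/

%w[hello-world report_2013 a.b a/b 50% x,y].each do |value|
  verdict = SAFE_SEGMENT.match?(value) ? 'ok' : 'rejected'
  puts "#{value.inspect}: #{verdict}"
end
```

Note that the pattern uses *, so an empty string also passes; a separate presence validation would be needed if blank values should be refused.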
Many of the other characters I tried are not valid directly in URLs, but Rails automatically URL encodes them so they do not cause an issue.
As mentioned in the comments on the original question, some of these characters can be enabled by configuring Rails other than the defaults, but in doing so you will disable other features of Rails. To enable them you need to add :constraints or :id settings in your route definition.
I have not completely tested enabling all these characters, but I believe the consequences are:
Ch  Consequence of enabling use
--  ---------------------------------------
,   Must not use gem composite_primary_keys
/   Limits ability to route to child items
.   Disables automatic format handling
%   Not sure this can be enabled
Maybe you can let the user enter whatever they like, then use to_param with parameterize to build the URL; if you want a regex, take a look at the parameterize source code.
For an example of to_param, the documentation, and the source code, see:
http://apidock.com/rails/ActiveSupport/Inflector/parameterize
Hope it helps!
From the Rails code in Action Pack, action_dispatch/journey/path/pattern.rb:
@separator_re = "([^#{separator}]+)" # where separator comes from @separators = "/.?"
So the default regular expression used to match a dynamic segment seems to be:
/([^\/\.\?]+)/
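A small sketch of what that default implies, built the same way from the separator string "/.?". The variable names and the \A and \z anchors are assumptions added here so the check covers a whole value; this is not the Rails router code itself.

```ruby
# Separators quoted above; everything else is illustrative.
SEPARATORS = '/.?'
SEGMENT_RE = Regexp.new("\\A([^#{Regexp.escape(SEPARATORS)}]+)\\z")

['john-doe', 'john.doe', 'a/b', 'what?'].each do |value|
  verdict = SEGMENT_RE.match?(value) ? 'fits in one segment' : 'contains a separator'
  puts "#{value.inspect}: #{verdict}"
end
```

A value containing any of the three separator characters cannot be captured as a single dynamic segment under these defaults.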

Ruby on Rails - Converting Twitter @mentions, #hashtags and URLs within a string

Let's say I have a string which contains text grabbed from Twitter, as follows:
myString = "I like using @twitter, because I learn so many new things! [line break]
Read my blog: http://www.myblog.com #procrastination"
The tweet is then presented in a view. However, prior to this, I'd like to convert the string so that, in my view:
@twitter links to http://www.twitter.com/twitter
The URL is turned into a link (in which the URL remains the link text)
#procrastination is turned into https://twitter.com/i/#!/search/?q=%23procrastination, in which #procrastination is the link text
I'm sure there must be a gem out there that would allow me to do this, but I can't find one. I have come across twitter-text-rb but I can't quite work out how to apply it to the above. I've done it in PHP using regex and a few other methods, but it got a bit messy!
Thanks in advance for any solutions!
The twitter-text gem has pretty much all the work covered for you. Install it manually (gem install twitter-text, use sudo if needed) or add it to your Gemfile (gem 'twitter-text') if you are using bundler and do bundle install.
Then load the Twitter auto-link library (require 'twitter-text' and include Twitter::Autolink) at the top of your class and call auto_link(inputString) with the input string as the parameter; it returns the auto-linked version.
Full code:
require 'twitter-text'
include Twitter::Autolink
myString = "I like using @twitter, because I learn so many new things! [line break]
Read my blog: http://www.myblog.com #procrastination"
linkedString = auto_link(myString)
If you output the contents of linkedString, you get the following output:
I like using @<a class="tweet-url username" href="https://twitter.com/twitter" rel="nofollow">twitter</a>, because I learn so many new things! [line break]
Read my blog: http://www.myblog.com <a class="tweet-url hashtag" href="https://twitter.com/#!/search?q=%23procrastination" rel="nofollow" title="#procrastination">#procrastination</a>
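For readers curious what auto-linking involves, here is a rough standard-library sketch of the idea. It is a deliberately naive stand-in, not the twitter-text implementation: simple_auto_link and LINKABLE are hypothetical names, the URL and name patterns are simplistic, and the gem handles many edge cases this does not.

```ruby
require 'cgi'

# Hypothetical pattern: a URL, an @mention, or a #hashtag.
# The lookbehinds skip HTML entities such as &#39; and mid-word symbols.
LINKABLE = %r{(https?://[^\s<]+)|(?<![&\w])@(\w+)|(?<![&\w])#(\w+)}

def simple_auto_link(text)
  # Escape first so user-supplied markup cannot end up in the output.
  CGI.escapeHTML(text).gsub(LINKABLE) do
    url, user, tag = $1, $2, $3
    if url
      %(<a href="#{url}" rel="nofollow">#{url}</a>)
    elsif user
      %(@<a href="https://twitter.com/#{user}" rel="nofollow">#{user}</a>)
    else
      %(<a href="https://twitter.com/#!/search?q=%23#{tag}" rel="nofollow">##{tag}</a>)
    end
  end
end

puts simple_auto_link("I like using @twitter. Read my blog: http://www.myblog.com #procrastination")
```

In practice the gem is the safer choice; the sketch only shows why the result must be treated as HTML in the view.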
Use jQuery Tweet Linkify
A small jQuery plugin that transforms @mention texts into hyperlinks pointing to the actual Twitter profile, #hashtag texts into real hashtag searches, as well as hyperlink texts into actual hyperlinks.

Rails, Radiant, and Regex

I'm working on a Rails site that uses the Radiant CMS and am building stateful navigation as per the first method in this link.
I'm matching the URL on regular expressions to determine whether or not to show the active state of each navigation link. As an example, here are two sample navigation elements, one for the Radiant URL /communications/ and one for /communications/press_releases/:
<r:if_url matches="/communications\/$/"><li class="bottom-border selected">Communications</li></r:if_url>
<r:unless_url matches="/communications\/$/"><li class="bottom-border">Communications</li></r:unless_url>
<r:if_url matches="/communications\/press_releases/"><li class="bottom-border selected">Press Releases</li></r:if_url>
<r:unless_url matches="/communications\/press_releases/"><li class="bottom-border">Press Releases</li></r:unless_url>
Everything's working fine for the Press Releases page--that is, when the URL is /communications/press_releases the Press Releases nav item gets the 'selected' class appropriately, and the Communications nav item is unselected. However, the Communications regular expression doesn't seem to be functioning correctly, as when the URL is /communications/ neither element has the 'selected' class (so the regex must be failing to match). However, I've tested
>> "/communications/".match(/communications\/$/)
=> #<MatchData:0x333a4>
in IRB, and as you can see, the regular expression seems to be working fine. What might be causing this?
TL;DR: "/communications/" matches /communications\/$/ in the Ruby shell but not in the context of the Radiant navigation. What's going on here?
From Radiant's wiki, it looks like you don't need to add /s around your regexes or escape /s. Try:
<r:if_url matches="/communications/$"><li class="bottom-border selected">Communications</li></r:if_url>
<r:unless_url matches="/communications/$"><li class="bottom-border">Communications</li></r:unless_url>
<r:if_url matches="/communications/press_releases/"><li class="bottom-border selected">Press Releases</li></r:if_url>
<r:unless_url matches="/communications/press_releases/"><li class="bottom-border">Press Releases</li></r:unless_url>
What is happening behind the scenes is that Radiant calls Regexp.new on the string in matches, so the regex you were trying to match before was this one:
Regexp.new '/communications\/$/'
# => /\/communications\/$\//
which translates to 'slash communications slash end-of-line slash', which I really doubt is what you want.
Ruby regexes are interesting in that there are anchors for both start (^) and end ($) of line as well as start (\A) and end (\Z) of string. That's why you will sometimes see people using \A and \Z in their regexes.
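Both points can be checked in plain Ruby. The sketch below only exercises Regexp.new and the anchors; it does not involve Radiant, and the variable names are made up. It uses \z, which matches only at the very end of the string (\Z also allows a trailing newline).

```ruby
url = '/communications/'

# Building a regex from a string: the surrounding slashes become literal characters.
with_slashes    = Regexp.new('/communications\/$/')
without_slashes = Regexp.new('/communications/$')

puts with_slashes.match?(url)    # requires a "/" after end-of-line, so it cannot match
puts without_slashes.match?(url)

# Line anchors versus string anchors on a multi-line value.
multi_line    = "something else\n/communications/"
line_anchored = Regexp.new('^/communications/$')
str_anchored  = Regexp.new('\A/communications/\z')

puts line_anchored.match?(multi_line) # matches the second line
puts str_anchored.match?(multi_line)  # the whole string is not just the URL
```

For validating user input, the string anchors are usually what is wanted, since line anchors accept any input that merely contains one matching line.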

How good is the Rails sanitize() method?

Can I use ActionView::Helpers::SanitizeHelper#sanitize on user-entered text that I plan on showing to other users? E.g., will it properly handle all cases described on this site?
Also, the documentation mentions:
Please note that sanitizing
user-provided text does not guarantee
that the resulting markup is valid
(conforming to a document type) or
even well-formed. The output may still
contain e.g. unescaped ’<’, ’>’, ’&’
characters and confuse browsers.
What's the best way to handle this? Pass the sanitized text through Hpricot before displaying?
Ryan Grove's Sanitize goes a lot farther than Rails 3 sanitize. It ensures the output HTML is well-formed and has three built-in whitelists:
Sanitize::Config::RESTRICTED
Allows only very simple inline formatting markup. No links, images, or block elements.
Sanitize::Config::BASIC
Allows a variety of markup including formatting tags, links, and lists. Images and tables are not allowed, links are limited to FTP, HTTP, HTTPS, and mailto protocols, and a rel="nofollow" attribute is added to all links to mitigate SEO spam.
Sanitize::Config::RELAXED
Allows an even wider variety of markup than BASIC, including images and tables. Links are still limited to FTP, HTTP, HTTPS, and mailto protocols, while images are limited to HTTP and HTTPS. In this mode, rel="nofollow" is not added to links.
Sanitize is certainly better than the "h" helper. Instead of escaping everything, it actually allows the html tags that you specify. And yes, it does prevent cross-site scripting because it removes javascript from the mix entirely.
In short, both will get the job done. Use "h" when you don't expect anything other than plain text, and use sanitize when you want to allow some HTML or you believe people may try to enter it. Even if you disallow all tags with sanitize, it will "pretty up" the output by removing them instead of escaping them as "h" does.
As for incomplete tags: You could run a validation on the model that passes html-containing fields through hpricot, but I think this is overkill in most applications.
The best course of action depends on two things:
Your rails version (2.x or 3.x)
Whether your users are supposed to enter any html at all on the input or not.
As a general rule, I don't allow my users to input html - instead I let them input textile.
On rails 3.x:
Output in views is HTML-escaped by default. You don't have to do anything, unless you want your users to be able to send some html. In that case, keep reading.
This railscast deals with XSS attacks on rails 3.
On rails 2.x:
If you don't allow any html from your users, just protect your output with the h method, like this:
<%= h post.text %>
If you want your users to send some html: you can use rails' sanitize method or HTML::StathamSanitizer
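To see what escaping with h amounts to, the same operation is available outside a view as ERB::Util.html_escape in Ruby's standard library. The sample input below is made up for illustration.

```ruby
require 'erb'

user_text = '<script>alert("xss")</script> <b>bold</b>'

# Every tag is turned into visible text; nothing is removed or allowed through.
escaped = ERB::Util.html_escape(user_text)
puts escaped
```

This is the key difference from sanitize, which removes disallowed markup and keeps the allowed tags working as HTML.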

Resources