Rails way to offer modified attributes - ruby-on-rails

The case is simple: I have markdown in my database, and want it parsed on output(*).
@post.body is mapped to the posts.body column in the database: simple, default ActiveRecord ORM. That column stores the markdown text a user enters.
Now, I see four ways to offer the markdown rendered version to my views:
First, in app/models/post.rb:
# ...
def body
  markdown = RDiscount.new(body)
  markdown.to_html
end
Allowing me to simply call @post.body and get an already rendered version. I do see lots of potential problems with that, e.g. on edit the text field being pre-filled with the rendered HTML instead of the markdown code.
The second option would be a new attribute in the form of a method.
In app/models/post.rb:
# ...
def body_markdownified
  markdown = RDiscount.new(body)
  markdown.to_html
end
Seems cleanest to me.
Or, third, a helper in app/helpers/application_helper.rb:
def markdownify(string)
  markdown = RDiscount.new(string)
  markdown.to_html
end
This is used in the view: instead of <%= body %>, I would write <%= markdownify(body) %>.
The fourth way would be to parse this in the PostsController.
def index
  @posts = Post.find(:all)
  @rendered_posts = []
  @posts.each do |p|
    p.body = RDiscount.new(p.body).to_html
    @rendered_posts << p
  end
end
I am not too familiar with the proper method and attribute architecture in Rails 3. Which of these should I go with? Is there a fifth option? Are there gotchas, pitfalls or performance issues I should be aware of with any of these options?
(*) In the future this may gain a database caching layer, or even special columns for rendered versions. That is beside the point here; I only mention it to avoid a discussion of filter-on-output versus filter-on-input :).

The first option you've described won't work as-is. It will cause an infinite loop because when you call RDiscount.new(body) it will use the body method you've just defined to pass into RDiscount (which in turn will call itself again, and again, and so on). If you want to do it this way, you'd need to use RDiscount.new(read_attribute('body')) instead.
Apart from that, I think the first option would be confusing for someone new to your app: when they see @post.body in a view, it would not be obvious that this is in fact a modified version of the body.
Personally, I'd go for the second or third options. If you're going to provide it from the model, having a method which describes what it's doing to the body will make it very obvious to anyone else what is going on. If the html version of body will only ever be used in views or mailers (which would be logical), I'd argue that it makes more sense to have the logic in a helper as it seems like the more logical place to have a method that outputs html.
Do not put it in the controller as in your fourth idea, it's really not the right place for it.
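To make the read_attribute point concrete, here is a minimal sketch that runs without Rails. FakePost, its read_attribute and the toy render_markdown are all hypothetical stand-ins (for an ActiveRecord model and for RDiscount.new(text).to_html); only the shape of the override is the point.

```ruby
# Hypothetical stand-in for an ActiveRecord model, so the sketch runs without Rails.
class FakePost
  def initialize(body)
    @attributes = { "body" => body }
  end

  # Mimics ActiveRecord's read_attribute: returns the raw stored value.
  def read_attribute(name)
    @attributes[name.to_s]
  end

  # Overridden reader. Reading through read_attribute instead of calling
  # body again is what avoids the infinite recursion.
  def body
    render_markdown(read_attribute(:body))
  end

  private

  # Toy renderer that only handles **bold**; it stands in for
  # RDiscount.new(text).to_html and is an assumption of this sketch.
  def render_markdown(text)
    "<p>" + text.gsub(/\*\*(.+?)\*\*/, '<strong>\1</strong>') + "</p>"
  end
end

post = FakePost.new("normal **bold**")
puts post.body                  # rendered version
puts post.read_attribute(:body) # raw markdown, still needed for edit forms
```

Note that once body is overridden like this, the raw markdown is only reachable through read_attribute, which is part of why a separately named method or a helper tends to be less surprising.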

Yet another way would be extending the String class with a to_markdown method. This has the benefit of working on any string anywhere in your application:
class String
  def to_markdown
    RDiscount.new(self).to_html
  end
end
@post.body.to_markdown
normal bold italic

If you were using HAML, for example in app/views/posts/show.html.haml:
:markdown
  #{@post.body}
http://haml-lang.com/docs/yardoc/file.HAML_REFERENCE.html#markdown-filter

How about a reader for body that accepts a parse_with parameter?
def body(parse_with=nil)
  b = read_attribute('body')
  case parse_with
  when :markdown then RDiscount.new(b).to_html
  when :escape then CGI.escape(b)
  else b
  end
end
This way, a regular call to body will function as it used to, and you can pass a parameter to specify what to render with:
@post.body
normal **bold** *italic*
@post.body(:markdown)
normal bold italic

Related

Rails sanitizing user input

For user-entered data I am taking the approach of sanitizing it before saving, to strip out any HTML or anything malicious (i.e. tags).
I have a before_validation callback:
before_validation :sanitize_fields
def sanitize_fields
  full_sanitizer = Rails::Html::FullSanitizer.new
  white_list = Rails::Html::WhiteListSanitizer.new
  # Only text allowed
  self.fname = full_sanitizer.sanitize(self.fname)
  self.lname = full_sanitizer.sanitize(self.lname)
  self.company = full_sanitizer.sanitize(self.company)
  # Some HTML Allowed
  self.description = white_list.sanitize(self.description)
end
The problem I am encountering is that when saving something like "Smith & Company" as the name, it is stored in the DB as Smith &amp; Company. Not an issue per se, but then it also displays as Smith &amp; Company in the edit view of the form, which seems odd and confusing to the end user.
Is there a better way than the approach I am taking? This "smells" wrong to me.
Thanks!
If you are confident the data is sanitized, you can declare it html_safe in the views to avoid the ampersand showing up as &amp;; it will render exactly as provided.
This of course begs the question: rather than jump through hoops to pre-sanitize and then tell the view that it has been sanitized, why not just allow the view to sanitize strings like it does by default? If you render the string "<tag>some_stuff</tag>" in a view, it will escape it for you. Are you concerned about the unsanitized string appearing elsewhere other than in a view that you control?
The reason it smells wrong is because it is.
With the possible exception of pre-rendering large text blocks (markdown, etc.) into HTML, I would avoid sanitizing your model data this way. Following Rails best practices will protect you from SQL injection, and text output in views will be rendered in a safe way by default.
If you need to allow users to input HTML, sanitize it on output (in your view), not on input.
Separation of concerns is one reason, but the biggest is that what you are trying to do is simply not idiomatic Rails. If you choose to continue down that path you will be fighting the framework constantly.
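The double-encoding behind the "Smith & Company" symptom can be reproduced in plain Ruby. This is only a sketch: CGI.escapeHTML is used here as a stand-in for both the entity encoding applied before saving and the escaping a Rails view applies by default, and render_in_view is a hypothetical name.

```ruby
require "cgi"

# Stand-in for what a view does by default: HTML-escape on output.
def render_in_view(text)
  CGI.escapeHTML(text)
end

raw_name = "Smith & Company"

# Encode before saving, then let the view escape again:
# the ampersand ends up encoded twice.
stored_encoded = CGI.escapeHTML(raw_name)
puts render_in_view(stored_encoded)

# Store the raw text and let the view escape once.
puts render_in_view(raw_name)
```

Storing the text as entered and escaping exactly once at render time keeps the edit form and the database showing what the user typed.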

How to structure Util classes in RoR

I have a template that users can upload that generates a report. They can put special tags into the html template and it will replace with data from the db. Quick example:
<div class="customer-info">
<h1>{customer_name}</h1>
<h2>{customer_address_line1}</h2>
<h2>{customer_address_line2}</h2>
<h2>{customer_address_city}, {customer_address_state} {customer_address_zip}</h2>
</div>
I have a controller that looks up the customer and then parses the template and replaces the tokens.
Right now I have the parse code in the controller creating a fat controller. Not good.
But where should I move the code? Model folder? Create a Util folder and put it there?
Just not sure what the Rails Way would be.
I was curious about this too, and found a very similar discussion here. Honestly, I think it depends on how much parse code there is. If there are only a very few lines, then the model is a safe place. If it's going to be a large package, especially a re-usable one, the /lib/ folder may be a better bet for the parsing itself. However, you definitely should remove it from the controller, as you suggested.
I agree that the logic shouldn't be in the controller, but let's get a
little more specific about how you'd go about implementing this.
First, where are you storing your templates in the database? They
should be stored in their own model, let's call it
CustomerTemplate and give an attribute :template of type Text.
So now we have two types of objects, Customers and
CustomerTemplates. How to render a customer given a template? It
honestly wouldn't be terrible to just have a render function in
the CustomerTemplate model that takes a customer and renders it, but
it is putting some logic inside your app that doesn't strictly belong
there. You should separate out the "customer specific rendering logic"
from the "rendering my simple custom template language".
So, let's create a simple template handler for your custom language,
which I'm going to nickname Curly. This handler should know nothing about
customers. All it does is take a string and interpolate values inside
{}'s. This way if you want to add new template types in the future —
say, to render another model like an invoice — you can use the same
template type.
Templates in Rails are classes which respond to call and are
registered with ActionView::Template. The simplest example is Builder.
Here's a quickly written template handler which renders Curly. The
call function returns a string which is eval'd, so the string has to
be valid Ruby code. The string eval is scoped by the render call, so
it has access to any variables passed in via the { locals: {} }
option to render.
# In lib/curly_template_handler.rb
class CurlyTemplateHandler
  def self.call(template)
    src = template.source
    """
    r = '#{src}'.gsub(/{([^}]*)}/) { |s|
      local_assigns[$1.to_sym] || s
    }
    raw r
    """
  end
end
Make sure the handler is initialized, and let's set it to listen for
the :curly type.
# In config/initializers/register_curly_template.rb
ActionView::Template.register_template_handler(:curly, CurlyTemplateHandler)
We need to add lib/ to autoload_paths so the class is loaded:
# config/application.rb
config.autoload_paths += %W(#{config.root}/lib)
Finally, we can render our template in our view! I'm embedding the string here, but you'd really get it from a CustomerTemplate object:
<%= render(inline: "<h2>{customer_name}</h2><p>{customer_address}</p>",
           type: :curly,
           locals: { customer_name: @customer.name,
                     customer_address: @customer.address }) %>
DO NOT USE MY EXAMPLE CODE IN PRODUCTION! I left out a bunch of corner
cases which you'll need to handle, like sanitizing user input.
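For the token replacement on its own, outside of any template handler, a plain Ruby sketch could look like this. render_curly is a hypothetical helper, not part of Rails or of the handler above, and escaping the values with CGI.escapeHTML is one assumed way to cover the "sanitizing user input" corner case.

```ruby
require "cgi"

# Hypothetical helper: replaces {token} placeholders with HTML-escaped
# values and leaves tokens it does not know about untouched.
def render_curly(template, values)
  template.gsub(/\{([^}]*)\}/) do |token|
    key = Regexp.last_match(1).to_sym
    values.key?(key) ? CGI.escapeHTML(values[key].to_s) : token
  end
end

html = render_curly("<h1>{customer_name}</h1><p>{unknown}</p>",
                    { customer_name: "Smith & Sons" })
puts html
```

Because it only takes a string and a hash, a method like this knows nothing about customers and could live in lib/ and be reused for other models.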

Is there any way to define a model's attribute as always html_safe?

I have a model called Feature with a variable called body_string, which contains HTML markup I'd like to render, rather than escape.
Every time I reference body_string in my views, I need to use <%=raw or .html_safe. This seems redundant and not-so-DRY.
Is there any way that I can establish once-and-for-all the body_string variable as html_safe?
I'm assuming this would happen in the app/models/feature.rb file, but I can't figure out what the right syntax would be, exactly. I've thought of this:
def body_string
  return self.body_string.html_safe
end
But Rails doesn't like it; it raises a stack level too deep exception.
Naturally I could define a variable/method with a different name:
def safe_body_string
  return self.body_string.html_safe
end
And then just change all references in the views from body_string to safe_body_string. But somehow this seems almost as un-DRY as simply using raw or .html_safe in the first place.
Any insights to how best to handle this? I feel like there must be something really elegant that I'm just not seeing.
Just use read_attribute to avoid the recursive call to body_string:
def body_string
  read_attribute(:body_string).html_safe
end
read_attribute is complemented by write_attribute for setting attributes from within your model.
A note on style: Don't use explicit returns unless you actually need them. The result of the last statement in a method is implicitly the value returned from the method.
While @meager's answer will definitely work, I don't think this logic belongs in a model, simply because it adds view-level concerns (HTML safeness) to the model layer, which should only contain business logic. Instead, I would recommend using a Presenter for this (see http://nithinbekal.com/posts/rails-presenters/ or find a gem for this; I personally love Display Case). Your presenter can easily override the body_string method and provide the .html_safe designation when displaying in the view. This way you separate your concerns and can continue to get body_string from other models without mixing in the view concern.
Maybe this gem is useful for you. I also wanted to stop repeating html_safe all the time when the content is completely trusted.
http://rubygems.org/gems/html_safe_attribute
Or you can also use this approach,
def body_string
  super && super.html_safe
end

Rails 3: Where should a helper that uses h (i.e. html_escape) live?

I'm writing a webapp in Ruby on Rails 3. Rails 3 automatically escapes any potentially-bad strings, which is generally a good thing, but means if you assemble HTML yourself, you have to call html_safe on it.
I have a Card model, which has several text fields, the contents of which are not trusted (may contain evil HTML or script). I have a function which performs a few transforms on one of these text fields, using other knowledge about the specific Card, to produce HTML output. I want to embed the HTML produced by this function in several places throughout several parts of my app.
Conceptually, this helper is to do with the View. However, I can't find any way to write functions in my View files; it seems they have to go in Helpers or the Controller/Model.
Since this function is very much specific to a Card object, the next best option would be to have a function inside my Card model card.rb:
class Card < ActiveRecord::Base
  [...]
  def format(unsafe_text)
    initial_text = h unsafe_text # aka html_escape unsafe_text
    # assembles HTML output based on initial_text and fields of self
    output_text.html_safe!
  end
end
Then I'd like to call this in assorted views by doing things like:
Rules text: <%= format(@card.rulestext) %>
However, there's a big problem here as well. In the Card model card.rb, I am able to use the html_safe! function, but I'm not able to use h or html_escape. It seems that the h and html_escape functions are only available in ERB views, not in the helpers or controllers!
There are a few workarounds. I can make format not sanitize its input, and go
Rules text: <%= format(h(@card.rulestext)) %>
But that's both prone to dangerous slipups (one missing h() and we've got problems) and is very non-DRY. At the moment I'm using a partial to gain access to the h() function:
(in a normal view)
Rules text: <%= render 'formattext', :text=> @card.rulestext %>
(app/views/shared/_formattext.html.erb)
<%= @card.format(html_escape(text)) %>
But this still feels dangerous. All I have to do is make a single forgetful call to format(sometext) in a view, rather than calling render 'formattext', :text=> sometext, and I've got unescaped text running around.
Is there any better way to do this? Is there a way to write helper functions to live in the View rather than the Model or the Controller?
Place the logic that does your view assembly into a CardHelper (helpers are modules, not classes):
# app/helpers/card_helper.rb
module CardHelper
  def rules(card)
    initial_text = h card.rules_text
    # assembles HTML output based on initial_text and fields of card
    output_text.html_safe
  end
end
It's not clear from your example whether you want to format several fields via the format method. If that's the case, then you might be able to do:
module CardHelper
  def format(card, attribute)
    initial_text = h card[attribute]
    # assembles HTML output based on initial_text and fields of card
    output_text.html_safe
  end
end
You can use this helper like any other:
class CardsController < ApplicationController
  helper CardHelper
end
and in your views:
<%= rules(@card) %>
or
<%= format(@card, :rules) %>
Escaping content for the view is a view responsibility; this is why the h helper is not available in controllers or models.
Still, I don't understand why you can't simply sanitize the content in the view.
Also note that, in Rails 3, you don't need to call the h helper.
Content is escaped automatically by default unless you flag it as html_safe!.
The main reason it doesn't make sense to use the h helper in the model is that the model should work independently of the view. In other words, the model should not care whether the content is going to be embedded in an HTML document or a JSON file (which requires a different escaping approach than HTML).
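As a rough illustration of "escape the untrusted part, then wrap it in trusted markup", here is a sketch that runs outside Rails using ERB::Util.html_escape from the standard library (the same function that h aliases). CardFormatting and rules_html are hypothetical names; in a real app this would be a helper method, and the result would additionally be marked html_safe.

```ruby
require "erb"

# Hypothetical module standing in for a Rails helper.
module CardFormatting
  # Escapes both untrusted strings first, then assembles the markup around them.
  def self.rules_html(card_name, rules_text)
    name = ERB::Util.html_escape(card_name)
    rules = ERB::Util.html_escape(rules_text)
    "<p class=\"rules\"><strong>#{name}:</strong> #{rules}</p>"
  end
end

puts CardFormatting.rules_html("Goblin", "<script>x</script> deals 2")
```

Keeping the escaping inside the one method that builds the HTML means a caller cannot forget it, which is the slip-up the question was worried about.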

Remove all html tags from attributes in rails

I have a Project model and it has some text attributes, one is summary. I have some projects that have html tags in the summary and I want to convert that to plain text. I have this method that has a regex that will remove all html tags.
def strip_html_comments_on_data
self.attributes.each{|key,value| value.to_s.gsub!(/(<[^>]+>| |\r|\n)/,"")}
end
I also have a before_save filter
before_save :strip_html_comments_on_data
The problem is that the html tags are still there after saving the project. What am I missing?
And, is there a really easy way to have that method called in all the models?
Thanks,
Nicolás Hock Isaza
untested
include ActionView::Helpers::SanitizeHelper
def foo
  sanitized_output = sanitize(html_input)
end
where html_input is a string containing HTML tags.
EDIT
You can strip all tags by passing :tags=>[] as an option:
plain_text = sanitize(html_input, :tags=>[])
Although reading the docs I see there is a better method:
plain_text = strip_tags(html_input)
Then make it into a before filter per smotchkiss and you're good to go.
It would be better not to include view helpers in your model. Just use:
HTML::FullSanitizer.new.sanitize(text)
Just use the strip_tags() text helper as mentioned by zetetic
First, the issue here is that Array#each returns the input array regardless of the block contents. A couple people just went over Array#each with me in a question I asked: "Return hash with modified values in Ruby".
Second, aside from Array#each not really doing what you want it to here, I don't think you should be doing this anyway. Why would you need to run this method over ALL the model's attributes?
Finally, why not keep the HTML input from the users and just use the standard h() helper when outputting it?
# this will output as plain text
<%=h string_with_html %>
This is useful because you can view the database and see the unmodified data exactly as it was entered by the user (if needed). If you really must convert to plain text before saving the value, @zetetic's solution gets you started.
class Comment < ActiveRecord::Base
  include ActionView::Helpers::SanitizeHelper
  before_save :sanitize_html
  protected
  def sanitize_html
    self.text = sanitize(text)
  end
end
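Why the original callback appears to do nothing can be sketched in plain Ruby, on the assumption that the block only modifies temporary values and never assigns anything back to the record. Project here is a hypothetical Struct standing in for the model, and the regex is a deliberately naive tag stripper (in Rails, strip_tags is the better tool).

```ruby
# Hypothetical stand-in for a model with text attributes; a Struct gives
# readers and writers without needing ActiveRecord.
Project = Struct.new(:summary, :notes)

# Naive pattern for illustration only.
TAG_PATTERN = /<[^>]+>/

# Builds a cleaned copy of each string value and writes it back through
# the setter, rather than relying on the return value of each.
def strip_tags_from!(record)
  record.each_pair do |name, value|
    next unless value.is_a?(String)
    record[name] = value.gsub(TAG_PATTERN, "")
  end
  record
end

project = Project.new("<p>Hello <b>world</b></p>", nil)
strip_tags_from!(project)
puts project.summary
```

The explicit assignment inside the loop is the part the original before_save method was missing.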
Reference Rails' sanitizer directly without using includes.
def text
  ActionView::Base.full_sanitizer.sanitize(html).html_safe
end
NOTE: I appended .html_safe to make HTML entities render correctly. Don't use this if there is a potential for malicious JavaScript injection.
If you want to remove HTML entities along with the HTML tags, Nokogiri can be used:
include ActionView::Helpers::SanitizeHelper
def foo
  sanitized_output = strip_tags(html_input)
  Nokogiri::HTML.fragment(sanitized_output)
end
