Line breaks are not shown in heredoc - ruby-on-rails

I have a heredoc string
html =<<EOF
<span>
Hello hello 123
</span>
<a>Link1</a>
<a>Link2Link2</a>
EOF
If I call puts html, it prints the HTML as it is, with newlines, which is fine. If I call p html, I get the HTML on a single line, without line breaks.
However, what I really need to do is convert this HTML into a picture, and it should have line breaks. Here is how I do that:
kit = IMGKit.new html, quality: 30
# using Magick::Image ......
# some code which is not important....
img.write("my_gif.gif")
It's almost fine, except that the resulting HTML, as I've already said, doesn't have line breaks; it is rendered as a single line:
<span>Hello hello 123</span><a>Link1</a><a>Link2Link2</a>
Of course, if I add <br /> tags, it all works out. But for certain reasons I can't do that: I want to avoid <br /> and still have line breaks.
I'm pretty sure this is not a problem with IMGKit or RMagick.
So how do I achieve that?

I agree it is not a problem with IMGKit: it is doing what it is supposed to do, which is render the HTML. There is also nothing wrong with the heredoc, and there is nothing magical you can do with Ruby's representation of the HTML to make literal whitespace (spaces, tabs, newlines) in the HTML source visible when rendered.
The most common rendering of source whitespace by HTML viewers is that any length of pure whitespace (whether spaces, tabs, newlines or any combination) is rendered as a single space -> <- in the view. Additionally, whitespace between one element end and another starting is often completely ignored (although the rendering of the elements themselves may cause layout/spacing effects in the view).
You could, however, do something like this:
kit = IMGKit.new html.gsub(/\n/,"<br/>"), quality: 30
and have line breaks rendered without adding <br/> to your heredoc.
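A minimal sketch of that idea in plain Ruby, leaving IMGKit out: the heredoc does keep its newlines, and gsub turns each one into a <br/> before the string is handed to the renderer. The names newline_count and with_breaks are only for illustration.

```ruby
html = <<EOF
<span>
Hello hello 123
</span>
<a>Link1</a>
<a>Link2Link2</a>
EOF

# The heredoc itself still contains one newline per source line.
newline_count = html.count("\n")

# Replace each newline with an explicit <br/> for the HTML renderer.
with_breaks = html.gsub(/\n/, "<br/>")

puts newline_count
puts with_breaks
```

Note that this also places a <br/> after the opening <span> and after the last link, so the rendered picture may show slightly more vertical space than the source suggests.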

Related

Displaying user input html with newlines

I have a comments section in my application where users enter input in a text area. I want to preserve the line breaks they enter but also display HTML as a string. For example, if comment.body is
Hello, this is the code: <a href='foo'>foo</a>
Bye
I want it to be displayed just as above. The same with anything else, including iframe tags.
The closest I got is:
= simple_format(comment.body)
but it sanitizes html code and it's not displayed. Example: foo <iframe>biz</iframe> bar is displayed as:
foo biz bar
What should I do to achieve what I want?
Just output it without any helper method; it will be rendered as plain text:
= comment.body
Using your second example, the output will be:
foo <iframe>biz</iframe> bar
To make \n behave as <br>, you can use CSS:
.add-line {
  white-space: pre-wrap;
}
And use it in your view:
.add-line = comment.body
Using your first example:
comment.body = "Hello, this is the code: <a href='foo'>foo</a>\n\nBye"
The output will be:
Hello, this is the code: <a href='foo'>foo</a>
Bye
Having done something similar in the past, I think you must first understand why HTML is sanitized from user input.
Imagine I wrote the following into a field that accepted HTML and displays this to the front page.
<script>alert('Hello')</script>
The code would execute for anyone visiting the front-page and annoyingly trigger a JS alert for every visitor.
Maybe not much of an issue yet, but imagine I wrote some AJAX request that sent user session IDs to my own server. Now this is an issue... because people's sessions are being hijacked.
Furthermore, there is a full JavaScript based exploitation framework called BeEF that relies on this type of website exploit called Cross-site Scripting (XSS).
BeEF does extremely scary stuff and is worth taking a look at when considering user generated HTML.
http://guides.rubyonrails.org/security.html#cross-site-scripting-xss
So what to do? Well if you checked in your DB you'd see that the tags are actually being stored, but like you pointed out aren't displayed.
You could .html_safe the content, but again I strongly advise against this.
Maybe instead you should write an alternative .html_safe method yourself, something like html_safe_whitelisted_tags.
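As for the newlines, you say you want to display the text as is. So replacing \n with <br />, as pointed out by Michael, would be the solution for you. Note the double quotes: in Ruby, '\n' in single quotes is a literal backslash followed by n, not a newline.
comment.body.gsub("\n", '<br />').html_safe_whitelisted_tags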
html_safe allows the HTML in the comment to be used as HTML, but it does nothing about the newlines, so a quick replace of \n with <br /> covers those:
comment.body.gsub("\n", "<br />").html_safe
If you want the HTML to be displayed instead of rendered, then check out CGI::escapeHTML(), and do the gsub afterwards so that the <br /> does not get escaped.
CGI::escapeHTML(comment.body).gsub("\n", "<br />")
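A small sketch of why the order matters, using only Ruby's standard library. The sample body is taken from the question; the variable names are illustrative.

```ruby
require "cgi"

body = "Hello, this is the code: <a href='foo'>foo</a>\nBye"

# Escape first, so the user's markup is shown as text instead of rendered.
escaped = CGI.escapeHTML(body)

# Only then insert the <br /> tags, so they are not escaped themselves.
display = escaped.gsub("\n", "<br />")

# Doing it the other way round would escape the <br /> as well.
wrong_order = CGI.escapeHTML(body.gsub("\n", "<br />"))

puts display
puts wrong_order
```

In a Rails view the first result would presumably still have to be marked html_safe, because a plain String is escaped again on output. That is reasonable here only because everything the user typed has already been escaped.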

rails truncate method adds special characters

I have this html text:
<p> I'm a html text</p>
To show it on my web page, I first sanitize it and remove the tags:
sanitize(best_practice.milestone.description, :tags=>[])
It then shows OK; the special character is removed.
But if I decide to truncate the text like this:
sanitize(best_practice.milestone.description, :tags=>[]).truncate(30)
The special character is visible again on my web page. In fact, all the special characters become visible.
How can I stop truncate from making these special characters visible?
Dealing with sanitize helpers and truncation can be tricky. There are a lot of different sanitize helpers: h, CGI::escapeHTML, sanitize, strip_tags, html_safe, etc. Sanitization and truncation do not work well together if a string is truncated between an opening and a closing tag or right in the middle of a special HTML character.
The following statement seems to work
sanitize(text, :tags=>[]).truncate(30, :separator => " ").html_safe
The trick is to pass a :separator option so that the text is truncated at a natural break.
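To see why a separator helps, here is a rough plain-Ruby sketch. The helper truncate_at_separator is hypothetical and only loosely modelled on what truncate does; it is not the Rails implementation. Because an HTML entity never contains a space, cutting at a space cannot split one.

```ruby
# Hypothetical helper: cut text to at most `limit` characters (omission
# included), preferring the last separator that fits within the limit.
def truncate_at_separator(text, limit, separator: " ", omission: "...")
  return text.dup if text.length <= limit

  room = limit - omission.length
  # Fall back to a hard cut when no separator is given or none is found.
  stop = (separator && text.rindex(separator, room)) || room
  text[0, stop] + omission
end

text = "Fish &amp; chips are served on Fridays"

hard_cut = truncate_at_separator(text, 10, separator: nil)
soft_cut = truncate_at_separator(text, 10)

puts hard_cut # the entity is split in the middle
puts soft_cut # the cut lands on the space before the entity
```

If the text has no separator within the limit, this sketch still falls back to a hard cut, so an entity could be split in that case.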

Why does the simple_format helper seem to ignore double new lines in ruby on rails?

I have a micropost feature and was testing the way it formats text that has been posted when displaying back to the user.
I pasted the following text like this:
and this was displayed back to me:
I'm using "simple_format h(content)". When I remove the helper, the text is displayed without a new line from the word "In". It displays as one big paragraph, so I assume the helper is working, but for some reason my double new lines are being ignored.
Any idea what is going on? Am I missing something?
By seeing it back, do you mean inside a textarea, or on the page? If it's on the page, all whitespace is compressed to one space each. If it's the latter, simply use the css rule:
white-space:pre;
On the proper selector.
However, if it is in a textarea (which preserves whitespace by default), there must be something stripping the extra space when you save it into the database. You might want to debug down your stack in the model and controller to see where this might be happening. I have to admit I haven't used the simple_format method.
Thanks to Chrome developer tools, as per usual, I realised that each block of text separated by two new lines was wrapped in p tags, so I just added a bottom margin of 5px to p using CSS. Works perfectly.
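A rough plain-Ruby approximation of what the helper does with newlines may make that finding clearer. The function rough_simple_format is hypothetical and deliberately simplified (no escaping, no sanitizing); it is a sketch, not the Rails source.

```ruby
# Hypothetical, simplified stand-in for simple_format's newline handling:
# blank lines separate paragraphs, single newlines become <br />.
def rough_simple_format(text)
  paragraphs = text.gsub(/\r\n?/, "\n").split(/\n\n+/)
  paragraphs.map { |para| "<p>" + para.gsub("\n", "<br />\n") + "</p>" }.join("\n\n")
end

formatted = rough_simple_format("First paragraph\n\nSecond paragraph\nsecond line")
puts formatted
```

The double newline is not ignored: it becomes a paragraph boundary. Whether a visible gap appears between the paragraphs then depends on the margin that the stylesheet gives to p, which matches the fix described above.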

Rails 3.1 HAML escaping too much on an `:escaped` chunk, how to control it so that it only escapes ampersands?

I have a chunk of code provided by Wistia to embed videos into a page. This source is embeddable raw HTML and they include some ampersands in it directly. Of course my W3C validator yells at me all day long, and with these in it I'm getting hundreds of errors like:
& did not start a character reference. (& probably should have been escaped as &amp;.)
My view is in HAML so I'm assuming that I needed to escape the sequence, which I happily did with:
:escaped
  <object width="...
Upon doing this the video no longer loads, as it has escaped the entire string into &lt;object width=&quot; ... etc.
How would one properly escape such sequences programmatically vs manually altering the inserted string each time a new update is made in Rails 3.1 with HAML?
You'll probably want to put your HTML into its own partial, then render it into a string and do a String#gsub on it.
Put your Wistia HTML into a partial called something like app/views/shared/_wistia.html
Then create a helper that looks like:
def embed_video(partial)
  html = render_to_string(:partial => "shared/#{partial}")
  # Replace each ampersand with its HTML entity
  html.gsub '&', '&amp;'
end
And in your HAML, just put = embed_video 'wistia' wherever you want the video to be inserted.
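One caveat with a blanket replacement is that an ampersand that is already part of an entity gets escaped a second time. A possible refinement, sketched in plain Ruby: the constant BARE_AMPERSAND, the helper escape_bare_ampersands and the sample embed string are all made up for illustration and are not part of Wistia's code or of Rails.

```ruby
# Matches "&" unless it already starts an entity such as &amp; &#38; or &#x26;
BARE_AMPERSAND = /&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/

# Hypothetical helper: escape only the ampersands that are not entities yet.
def escape_bare_ampersands(html)
  html.gsub(BARE_AMPERSAND, "&amp;")
end

# Made-up embed snippet with two bare ampersands and two existing entities.
embed = '<param name="src" value="player.swf?id=1&autoplay=0&amp;x=&#38;"/>'
fixed = escape_bare_ampersands(embed)

puts fixed
```

Running the helper a second time on its own output leaves the string unchanged, so it should be safe to apply each time the embed code is updated.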

newline characters screwing up <pre> tags (Ruby on Rails)

I'm developing a blog and some really annoying stuff is happening with newline characters (\n). Everything works fine, except that if I make a post that contains pre tags, my newline characters screw up the indentation.
So if I have code that looks like this
<pre>
<code>
some code some code
more code more code
</code>
</pre>
For some reason the newline characters that are saved in the db field with the post are causing whatever is inside the pre tag to be indented by a tab or two.
I have no idea why it's doing it, but if I do something like
string.gsub!(/\n/, "<br />")
The indentation is removed, so I know it has to do with the \n. But then my problem is that there are way too many line breaks and the format is then way off.
So then I tried to capture everything inside the pre tags with a method that looks like this
def remove_newlines(string)
  regexp = /<pre>\s?(.*?)\s?<\/pre>/
  code = regexp.match(string)
  code[1].gsub!(/\n/, "<br />")
end
But I can't get that to work properly.
Does anyone know how I can get rid of this weird indentation problem, or have any pointers on this?
Thanks!
It sounds like your template engine is auto-indenting the contents of the <pre> tags. Browsers render the whitespace inside <pre> tags as it is (and so they should, according to specs). This means that the whitespace at the beginning of each line inside the <pre> added by the template engine in order to make the HTML source more readable is rendered in the actual page as well, unlike whitespace most other places in HTML source.
The solution therefore depends on your templating language.
If you are using HAML:
HAML FAQ: How do I stop Haml from indenting the contents of my pre and textarea tags?
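The usual idea behind that kind of fix is to make sure the content of the pre tag contains no literal newlines by the time the template engine indents the output, for example by encoding each newline as a character reference that the browser turns back into a line break. A plain-Ruby sketch of the idea follows; the helper preserve_pre_newlines is hypothetical, and the choice of &#x000A; as the replacement is an assumption about what Haml's own helpers emit.

```ruby
# Hypothetical helper: encode newlines inside <pre>...</pre> as character
# references, so that re-indenting the HTML source cannot change what the
# browser renders. Only matches a bare <pre> tag without attributes.
def preserve_pre_newlines(html)
  html.gsub(%r{<pre>(.*?)</pre>}m) do
    "<pre>" + Regexp.last_match(1).gsub("\n", "&#x000A;") + "</pre>"
  end
end

post = "<div>\n<pre>some code\nmore code</pre>\n</div>"
preserved = preserve_pre_newlines(post)

puts preserved
```

Newlines outside the pre block are left alone, so the rest of the markup can still be indented for readability.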
Hope this helps.
