Given an HTML email message, is there a way to convert it to a text version? I'm doing email ingestion and have noticed that sometimes an email doesn't include a text version, especially when it is sent from a BlackBerry device.
thanks
HTML to Text is one of the features provided by the Premailer gem.
premailer = Premailer.new('http://example.com/html_email.html')
premailer.to_plain_text
If you don't want to use the gem because it does a lot more than this, you can look at its source code to see how it does the conversion.
Perhaps I'm missing something, but couldn't you just take the HTML message and run ActionView::Helpers::SanitizeHelper#strip_tags over it?
http://api.rubyonrails.org/classes/ActionView/Helpers/SanitizeHelper.html#method-i-strip_tags
I know this post is old, but it comes up high in Google for "convert html to text". The following may meet your needs:
The author says:
Ruby convert HTML to formatted text — Chip’s Tips for Developers. When
you want to have your whitespace and feed it, too.
http://www.chipstips.com/?p=610
Refer to the following link:
http://edgeguides.rubyonrails.org/action_mailer_basics.html
Action Mailer provides options for sending HTML or text emails.
In my Rails 4 app I want to add comments to my articles, with the functionality most forum engines have (like SMF), so I need BBCode support.
Are there any good gems for this that support Rails 4? And how can I then translate [quote] into a div with some styling in the controller?
Also, is it a good idea to store HTML data in the database?
For example, if I use Haml and somebody posts a comment such as
- current_user.id
or something similar, how do I secure my app against "bad boys"? Sure, I could change the comment system to use something like quote_parent_id, but what if there are multiple quotes in one comment? That is hard to implement, so it seems better to store HTML but secure it somehow.
Can I do this, and how? Please suggest good ideas, tutorials, or links to gems.
Look into https://github.com/veger/ruby-bbcode
Since it converts to HTML and does not execute user input as Ruby code, you'll be fairly safe. However, I haven't tried the gem, and it's possible that it introduces some XSS vulnerabilities.
Have you considered Markdown as an option?
You should also look into https://github.com/asceth/bbcoder (I should note that I am the original author).
In the controller, changing a string such as "[quote=user]My post of epic importance[/quote]" into a div etc. is just a matter of:
# assume params[:comment] is the text you are converting
params[:comment].bbcode_to_html
As for storing html in a database, there is no right or wrong answer. If you want to allow users to edit their posts later then I would lean towards not storing the html version but storing their original bbcode version. This way when you allow them to edit you aren't having to convert html back to bbcode.
To make sure you aren't open to XSS and other attacks, I recommend combining it with other gems such as sanitize:
Sanitize.clean(text.to_s).bbcode_to_html
Some more notes:
Multiple tags and nested tags are parsed as they are seen without any additional steps required. So a comment or post with lots of bbcode tags, multiple quotes, b tags or anything else is dealt with by just calling bbcode_to_html on the variable/string.
If a user tries to use Haml in their post, it should appear as-is. Haml shouldn't try to eval the string unless you specifically tell it to, and I'm not even sure how you would do that unless Haml has a special filter or operator for it.
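To illustrate the "escape first, then convert" idea from the notes above, here is a minimal sketch that uses only the standard library. The helper name simple_bbcode_to_html and the output markup are made up for this example; this is not how bbcoder or ruby-bbcode work internally, and it only handles [b] and [quote].

```ruby
require 'cgi'

# Hypothetical helper, not part of bbcoder or ruby-bbcode.
# Escapes all HTML first, then converts a small set of BBCode tags.
def simple_bbcode_to_html(text)
  html = CGI.escapeHTML(text.to_s)

  # Repeat until nothing changes, so nested tags are converted from the inside out.
  loop do
    before = html.dup
    html = html.gsub(/\[b\]((?:(?!\[b\]).)*?)\[\/b\]/m, '<strong>\1</strong>')
    html = html.gsub(/\[quote(?:=([^\]]+))?\]((?:(?!\[quote).)*?)\[\/quote\]/m) do
      author = Regexp.last_match(1)
      body = Regexp.last_match(2)
      cite = author ? "<cite>#{author}</cite>" : ''
      %(<div class="quote">#{cite}#{body}</div>)
    end
    break if html == before
  end
  html
end

puts simple_bbcode_to_html('[quote=user]My post of epic importance[/quote]')
```

Because the input is escaped before any tag is converted, something like <script> or a Haml line such as "- current_user.id" ends up as plain text in the output.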
My Rails app processes incoming emails by splitting them into multiple lines. This is what I currently use on the plain text version of the body: lines = email.body.split("\n")
This works well unless the sentences are longer than ~74 characters as most email clients will automatically add a line break per RFC 2822.
Example email: https://gist.github.com/marckohlbrugge/39c17b928eb17d330d63
Looking at the plain text part there seems to be no way to discern between a line break added by the user versus the email client. You could ignore any line break happening at the 75th position, but I think there might be a chance of false positives. (I could be wrong.)
The HTML part has all the information we need, but I'm not sure about a universal way to process this. Is replacing every div and br with a newline and then stripping all other HTML elements enough? What about all the other block-element tags? What about inline elements styled as block elements? What if an email doesn't have an HTML part?
I did find some interesting code examples in "Convert HTML to plain text (with inclusion of <br>s)", but replacing a list of HTML tags with newlines doesn't seem like a complete (exhaustive) solution.
Is it worth looking at something like this mail library as they've probably already thought about the edge cases? ;)
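For reference, the "replace br and block tags with newlines, then strip the rest" approach described above could look roughly like this. It is only a sketch with a hypothetical helper name and a hand-picked tag list, it uses nothing beyond the standard library, and it does not address inline elements styled as blocks or entities beyond the basic ones.

```ruby
require 'cgi'

# Hand-picked list of block-level tags; an assumption, not an exhaustive set.
BLOCK_TAGS = %w[p div li tr h1 h2 h3 h4 h5 h6 blockquote].freeze

# Hypothetical helper: a rough HTML-to-text conversion.
def rough_html_to_text(html)
  text = html.to_s.dup
  # Drop script and style elements together with their contents.
  text.gsub!(/<(script|style)\b.*?<\/\1>/mi, '')
  # Whitespace in the HTML source is not significant, so collapse it first.
  text.gsub!(/\s+/, ' ')
  # A <br> or the end of a block-level element becomes a line break.
  text.gsub!(/<br\s*\/?>/i, "\n")
  text.gsub!(/<\/(?:#{BLOCK_TAGS.join('|')})\s*>/i, "\n")
  # Remove every remaining tag, then decode basic entities such as &amp;.
  text.gsub!(/<[^>]+>/, '')
  text = CGI.unescapeHTML(text)
  # Trim each line and squeeze runs of blank lines.
  text.lines.map(&:strip).join("\n").gsub(/\n{3,}/, "\n\n").strip
end

lines = rough_html_to_text('<div>Hello <b>world</b></div><div>Second &amp; last<br>line</div>').split("\n")
```

With that input, lines should contain "Hello world", "Second & last" and "line", so only breaks the sender actually put in the markup survive.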
Is there any way to replace Mandrill's *| |* symbols?
The CMS I'm using (MODX) has its own symbols to enclose tags, e.g. [[+ ]]
The catch is that I also have a "read on web" link, so the page on the web needs to generate the dynamic content as well.
I have googled and searched http://help.mandrill.com, but still no luck.
Any hint will be appreciated.
You wouldn't be able to use different symbols in your emails: those are how Mandrill's system recognizes merge tags and knows to replace them in the HTML and/or text of your email. You'd need to convert any placeholders you have or want for the email to that format, so you can pass the data to Mandrill as expected. If the email is going to mirror what you're putting on the web, then you probably just want something that transforms strings, for example converting your CMS tags to Mandrill tags specifically for the emails.
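As a rough idea of what such a string transform could look like, here is a sketch in Ruby. The helper name and the assumption that placeholder names consist only of letters, digits and underscores are mine; in a MODX site the same regular expression would be applied in PHP instead.

```ruby
# Hypothetical helper: rewrites MODX-style placeholders such as [[+first_name]]
# into Mandrill merge tags such as *|first_name|*, leaving other text untouched.
def modx_to_mandrill(template)
  template.gsub(/\[\[\+\s*([A-Za-z0-9_]+)\s*\]\]/) { "*|#{Regexp.last_match(1)}|*" }
end

puts modx_to_mandrill('Hi [[+first_name]], read this on the web.')
```

The web page keeps using the CMS tags, and only the copy that is handed to Mandrill goes through the transform.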
@kaitlin-mandrill,
Exactly, I just figured it out.
I need to replace the tags right before the email is sent.
More or less, this is the code.
Hopefully it's useful for anyone else.
I'm using Slim as the templating language for my HTML email. When pretty mode is turned off in production, it puts all the HTML on one line. When the emails go through Sendgrid, a line break is introduced at the 998th character, breaking the HTML. Sendgrid does this to comply with the email RFC.
How can I turn pretty mode on while rendering the email, tell Slim to respect the maximum line length, or introduce a hard line break?
Adding a few of these
= "\r\n"
throughout the email template solved the problem.
Just add the data-force-encoding="✓" attribute to the body tag. That makes Rails send the email as quoted-printable (the trick is simply that the attribute contains a UTF-8 character). See: https://github.com/slim-template/slim/issues/123
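If you would rather not sprinkle line breaks through the template by hand, another option is to post-process the rendered HTML. The following is only a sketch: the helper name and the default limit of 500 are arbitrary choices, it is not something Slim or Rails provides, and it assumes that a line break right after a ">" is harmless in your markup (which may not hold inside pre elements).

```ruby
# Hypothetical helper: inserts CRLF after a ">" once the current line has
# grown to `limit` characters, keeping lines well below the 998 cap.
# It does not split long runs that contain no ">" at all.
def break_long_html_lines(html, limit = 500)
  out = +''
  line_length = 0
  html.each_char do |char|
    out << char
    if char == "\n"
      line_length = 0
    else
      line_length += 1
      if char == '>' && line_length >= limit
        out << "\r\n"
        line_length = 0
      end
    end
  end
  out
end

one_line = '<table>' + '<tr><td>cell</td></tr>' * 200 + '</table>'
wrapped = break_long_html_lines(one_line)
```

The limit is counted in characters while the 998 cap applies to bytes, so leaving generous headroom matters if the body contains multi-byte characters.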
I have a string named MESSAGE and it varies depending on what people say.
Is there a way I can make MESSAGE replace the contents of a string inside of a string?
For example, if MESSAGE is equal to "Hey guys [rainbow]look at my awesome rainbow text[/rainbow] isn't it cool?" then how do I only replace the "[rainbow]look at my awesome rainbow text[/rainbow]" part by getting rid of "[rainbow]" and "[/rainbow]" and replacing "look at my awesome rainbow text" with a string called RAINBOWTEXT?
The reason I need this is because I would like users in my Flash-based chat to be able to make rainbow text, but the method I use needs to do this client-side. I did have a PHP version that would submit the text into the database, but it would cut the message off due to the large amount of data. Each character got a <font color="#123456"> and a </font> on either side of it, so messages were super long.
If I can replace the inside part of what is specified by [rainbow] and [/rainbow], I can have each person's client replace the message once it gets it. In the database, it will have "Hey guys [rainbow]look at my awesome rainbow text[/rainbow] isn't it cool?" but in chat it will have "Hey guys look at my awesome rainbow text isn't it cool?" with the actual text rainbowfied.
This sounds like a job for a regular expression and the replace() method. Have a dig around for expressions which will find and replace the contents of HTML tags and you should be able to adapt what you find to suit your requirements. Examples for JavaScript should also be valid as ActionScript and JavaScript both adhere to the ECMAScript standard for regular expressions.
Once you've found something, you can play around with it with this handy tool for testing ActionScript 3.0 regular expressions.
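To make the regular-expression idea concrete, here is a small sketch. It is written in Ruby rather than ActionScript, and rainbowify is a made-up stand-in for whatever produces your coloured text; the pattern itself should carry over to ActionScript's replace() with the global and dot-all flags, though I haven't tested that.

```ruby
# Hypothetical stand-in for the client-side code that colours each character.
def rainbowify(text)
  "<rainbow>#{text}</rainbow>"
end

# Replaces every [rainbow]...[/rainbow] span with the rainbowified inner text.
# The lazy (.*?) stops at the first closing tag, so several spans in one
# message are handled separately.
def replace_rainbow(message)
  message.gsub(/\[rainbow\](.*?)\[\/rainbow\]/m) { rainbowify(Regexp.last_match(1)) }
end

puts replace_rainbow("Hey guys [rainbow]look at my awesome rainbow text[/rainbow] isn't it cool?")
```

The database keeps the short [rainbow] form, and each client expands it only when displaying the message.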