Replacing html text based on database - ruby-on-rails

I want to make an app that switches the vocabulary on a given web page from Japanese to English.
But first of all, I want to start by simply displaying the page at the given URL inline, just like Google Translate. (See here)
I fetched the HTML from the URL using the code below,
and now I want to replace text in that HTML all at once, based on data in a database.
require 'open-uri'

def submit
  charset = nil
  # Fetch the page and keep its body in @html for the view
  @html = open(params[:url]) do |f|
    charset = f.charset
    f.read
  end
end
The database isn't finished yet, but it will contain the Japanese words to be replaced and the English words that should replace them.
Any ideas or ways to do this?
Also, I only started learning Ruby on Rails recently, so it would be nice if you could explain with some examples or a detailed explanation :)
I just want to replace particular words in the text based on entries in the database; I don't want full multilingual support.
EDIT:
For example, I got the following HTML from the desired web page.
<html>
<head>
</head>
<body>
<p>I want to switch "aaa" this and "ccc"</p>
</body>
</html>
Let's say I want to switch (replace) "aaa" with "bbb" and "ccc" with "ddd".
The words to be replaced and their replacements are in the database. (Target: "aaa", "ccc"; Switch: "bbb", "ddd")
Since I fetched this HTML with open-uri, I can't use code like #{target} in it.

Working based on the code in this answer and this answer, you could do something like this:
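require 'nokogiri'

replacements = {'aaa' => 'ccc', 'bbb' => 'ddd' }
# One alternation that matches any of the keys literally
regex = Regexp.new(replacements.keys.map { |x| Regexp.escape(x) }.join('|'))
doc = Nokogiri::HTML::DocumentFragment.parse(html)
# Only touch text nodes, so tags and attributes stay intact
doc.traverse do |x|
  if x.text?
    x.content = x.content.gsub(regex, replacements)
  end
end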
I've also tested that:
replacements = {'こんにちは' => 'Good day', 'bbb' => 'ddd' }
regex = Regexp.new(replacements.keys.map { |x| Regexp.escape(x) }.join('|'))
"こんにちは Mr bbb".gsub(regex, replacements)
Gives the expected:
Good day Mr ddd
You might also want to use:
regex = Regexp.new(replacements.keys.map { |x| '\\b'+Regexp.escape(x)+'\\b' }.join('|'))
to prevent "aaardvark" being changed into "cccrdvark".
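Since the question keeps the word pairs in a database, here is a minimal sketch of building the `replacements` hash from rows. The `rows` array is a hypothetical stand-in for what a query might return, so the snippet runs without Rails; with a real ActiveRecord model (say a hypothetical `Word` with `target` and `switch` columns), something like `Word.pluck(:target, :switch)` would presumably give the same shape. Sorting the keys longest-first is my own addition, so that a longer word wins over a shorter one it contains.

```ruby
# Hypothetical rows: [target, switch] pairs as they might come from the database.
rows = [
  ['aaa', 'bbb'],
  ['ccc', 'ddd'],
  ['こんにちは', 'Good day']
]

# Target word => replacement word
replacements = rows.to_h

# Regexp.union escapes each string; longer targets are tried first.
pattern = Regexp.union(replacements.keys.sort_by { |key| -key.length })

# Replace every matched target with its entry in the hash.
def switch_words(text, pattern, replacements)
  text.gsub(pattern, replacements)
end

result = switch_words('I want to switch "aaa" this and "ccc"', pattern, replacements)
puts result
```

The same `pattern` and `replacements` can be passed to the `gsub` call inside the `traverse` loop above.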

Related

Markdown Render Newlines

I'm working on a project that lets a user create a post.
When loading the post, I call the markdown method to extract links and format the text.
Now I have a problem.
When I write "1. Example", the output in the post is a list.
When I write "1.Example", without the whitespace between the dot and the text, it works fine.
My markdown method:
@preview = nil
options = {
  autolink: true,
  hard_wrap: true
}
begin
  URI.extract(text, ['http', 'https', 'www']).each do |uri|
    unless text.include?("<a")
      text = text.gsub( uri, "#{uri}" )
      @preview = LinkThumbnailer.generate(uri)
    end
  end
rescue OpenSSL::SSL::SSLError => e
end
renderer = Redcarpet::Render::HTML.new(options)
markdown = Redcarpet::Markdown.new(renderer)
markdown.render(text).html_safe
Do you know how to fix this? I don't want the list; I just want the output to be the same as the input!
Thank you for your time!
EDIT: Added a photo to show the output.
You want to use a backslash escape in your Markdown source. As the rules explain:
Markdown allows you to use backslash escapes to generate literal characters which would otherwise have special meaning in Markdown’s formatting syntax.
Among the characters which backslash escaping supports is the dot (.). Therefore your source text should look like this:
1\. Example
Which results in this HTML:
<p>1. Example</p>
And renders as:
1. Example
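If the users should not have to type the backslash themselves, the escape could be applied to the text before rendering. This is only a sketch: `escape_ordered_list_markers` is a hypothetical helper name, and the pattern assumes an ordered-list marker is a number at the start of a line (indented by at most three spaces) followed by a dot and whitespace.

```ruby
# Put a backslash before the dot of a leading "N. " so Markdown treats it as literal text.
# Input without whitespace after the dot, such as "1.Example", does not match and is left unchanged.
def escape_ordered_list_markers(text)
  text.gsub(/^( {0,3}\d+)\.(?=[ \t])/) { "#{$1}\\." }
end

puts escape_ordered_list_markers("1. Example")
puts escape_ordered_list_markers("1.Example")
```

The escaped string would then be handed to `markdown.render` in place of the raw text.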
By default you're going to get the list. Markdown is after all looking for syntax that it recognises in order to generate mark up.
In order to skip particular markdown features I think you're going to need to provide your own custom renderer.
If you define a new renderer:
class NoListRenderer < Redcarpet::Render::HTML
  def list(contents, list_type)
    contents
  end

  def list_item(text, list_type)
    text
  end
end
and use an instance of that instead of the default renderer class when you create your markdown instance it should skip the default list processing. (NB. I haven't tested this code):
renderer = NoListRenderer.new(options)
markdown = Redcarpet::Markdown.new(renderer)

How to show some HTML entities on title tag using Rails

I'm running Rails 4.2.x and I have the following problem.
The <title> of some pages is generated from user content, so I have to use the Rails sanitize helpers to clean it up properly.
But if the user writes something like "A & B", the title shown in the browser is A &amp; B, which is wrong.
What's the correct way of escaping user content in the <title> tag using Rails? At least some special characters should be included...
We can also use CGI:
title = "A & B"
=> "A & B"
string = CGI.escapeHTML(title)
=> "A &amp; B"
string = CGI.unescapeHTML(string)
=> "A & B"
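One possible cause of a literal `&amp;` showing up in the browser title, which the question does not confirm, is that the text gets escaped twice: once by the sanitizer and once more when the view outputs it. A small standard-library sketch of what single and double escaping produce:

```ruby
require 'cgi'

title = 'A & B'

# Escaped once: the browser decodes this back to "A & B" when it shows the title.
escaped_once = CGI.escapeHTML(title)

# Escaped twice: the browser decodes only one level, so "&amp;" stays visible.
escaped_twice = CGI.escapeHTML(escaped_once)

puts escaped_once
puts escaped_twice
puts CGI.unescapeHTML(escaped_once)
```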
Rails provides many options for escaping.
Refer to these links:
raw vs. html_safe vs. h to unescape html
How to HTML encode/escape a string? Is there a built-in?
If you want to remove tags, you can use SanitizeHelper.
One more option: WhiteListSanitizer
white_list_sanitizer = Rails::Html::WhiteListSanitizer.new
white_list_sanitizer.sanitize(s, tags: %w())
white_list_sanitizer.sanitize(s, tags: %w(table tr td), attributes: %w(id class style))
You can also use Rails::Html::TargetScrubber.
You can both sanitize and convert html entities to proper characters with a combination of sanitize and the htmlentities gem. This works for me in the console:
gem install htmlentities
then...
c = ApplicationController.new
dirty_content = "<script>content & tags</script>"
clean_but_with_entities = c.helpers.sanitize(dirty_content)
nice_looking = HTMLEntities.new.decode(clean_but_with_entities.to_str )
You end up with "content & tags". To make this easier to use I put this in application_controller:
helper_method :sanitize_and_decode

def sanitize_and_decode(str)
  # Sanitize first, then decode the entities in the sanitized result
  clean = helpers.sanitize(str)
  HTMLEntities.new.decode(clean.to_str)
end
(to_str is to work around the SafeBuffer issue mentioned here)

Redcloth extend with emoticons filter

What would be a good way to implement emoticons/smileys in a simple messaging system?
I came across RedCloth as a promising solution.
The messages will be saved in the DB like ;), :) ;(
Something like what is described here, although it is old: http://flip.netzbeben.de/2008/07/smilies-in-rails-using-redcloth/ I am trying that; any comments on that solution regarding safety, etc.?
UPDATE:
I created a helper method; this one works:
def emoticons(text)
  emoticons = { ":)" => "<img src='/assets/emoticons/smile.gif' class='emoticon'>",
                ":(" => "<img src='/assets/emoticons/cry.gif' class='emoticon'>"
  }
  [emoticons.keys, emoticons.values].transpose.each do |search, replace|
    text.gsub!(search, replace)
  end
  return raw text
end
Is there any way to improve this further? The replacement works, although the
This
emoticons = {":)" => "[happy/]", ":(" => "[sad/]"}
text = "test :) :("
[emoticons.keys, emoticons.values].transpose.each do |search, replace|
text.gsub!(search, replace)
end
p text
will output
test [happy/] [sad/]
You can play with gsub to get HTML output instead of pseudo-BBCode.
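On the safety question: if the message text is output unescaped after the replacement, any HTML a user typed goes to the browser as-is. A minimal sketch of one way around that, using only the standard library, is to handle emoticons and HTML special characters in a single `gsub` pass. `emoticons_html` is a hypothetical name, and the image paths are the ones from the question.

```ruby
require 'cgi'

def emoticons_html(text)
  emoticons = {
    ':)' => "<img src='/assets/emoticons/smile.gif' class='emoticon'>",
    ':(' => "<img src='/assets/emoticons/cry.gif' class='emoticon'>"
  }
  # Match either an emoticon or a character that is special in HTML.
  pattern = Regexp.union(emoticons.keys + [/[&<>"']/])
  # Emoticons become images; every other match is escaped.
  # gsub (without !) returns a new string and leaves the argument untouched.
  text.gsub(pattern) { |match| emoticons[match] || CGI.escapeHTML(match) }
end

puts emoticons_html('<b>hi</b> :) & bye :(')
```

In a Rails helper the returned string would still need `raw` or `html_safe`, as in the question, since the image tags are meant to be rendered.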

Rails 3: How to display properly text from "textarea"?

In my Rails 3 application I use textarea to let users to write a new message in a forum.
However, when the message is displayed, all newlines look like spaces (there is no <br />). There may be other mismatches that I haven't noticed yet.
I wonder what the most appropriate way to deal with this is.
I guess that the text stored in the database is OK (I see, for example, that < is converted to &lt;), so the main problem is the presentation.
Are there built-in helper methods in Rails for this?
(simple_format does something that looks similar to what I need, but it adds <p> tags which I don't want to appear.)
Rails has a helper method for this out of the box, so you don't have to write your own.
From the documentation:
simple_format(text, html_options={}, options={})
my_text = "Here is some basic text...\n...with a line break."
simple_format(my_text)
# => "<p>Here is some basic text...\n<br />...with a line break.</p>"
more_text = "We want to put a paragraph...\n\n...right there."
simple_format(more_text)
# => "<p>We want to put a paragraph...</p>\n\n<p>...right there.</p>"
simple_format("Look ma! A class!", :class => 'description')
# => "<p class='description'>Look ma! A class!</p>"
You can use style="white-space: pre-wrap;" in the html tag surrounding the text. This respects any line breaks in the text.
Since simple_format does not do what you want, I'd make a simple helper method to convert newlines to <br>s:
def nl2br(s)
  s.gsub(/\n/, '<br>')
end
Then in your view you can use it like this:
<%= nl2br(h(@forum_post.message)) %>
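Text submitted from a textarea typically arrives with "\r\n" line endings, so a pattern that only matches "\n" leaves stray carriage returns behind. The following standalone sketch (the name `nl2br_escaped` is hypothetical, and `CGI.escapeHTML` stands in for the `h` helper so it runs outside Rails) escapes first and then converts every kind of line break:

```ruby
require 'cgi'

# Escape HTML special characters, then turn CRLF, CR, or LF into <br>.
def nl2br_escaped(text)
  CGI.escapeHTML(text).gsub(/\r\n|\r|\n/, '<br>')
end

puts nl2br_escaped("1 < 2\r\nsecond line\nthird line")
```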
If someone still gets redirected here and uses Rails 4:
http://apidock.com/rails/v4.0.2/ActionView/Helpers/TextHelper/simple_format
You can now specify the tag it gets wrapped in (defaults to p) like so:
simple_format(my_text, {}, wrapper_tag: "div")
# => "<div>Here is some basic text...\n<br />...with a line break.</div>"
CSS-only option
I believe one of the easiest options is to use css white-space: pre-line;
Other answers also mentioned using white-space, but I think it needs a little more information:
In most cases you should probably choose pre-line over pre-wrap. View the difference here.
It's very important to keep in mind about white-space that you should not do something like this:
<p style="white-space: pre-line;">
<%= your.text %>
</p>
It will produce extra spaces and line-breaks in the output. Instead, go with this:
<p style="white-space: pre-line;"><%= your.text %></p>
HTML alternative
Another way is to wrap your text in <pre> tags. And last note on my CSS option is true here as well:
<p>
<pre><%= your.text %></pre>
</p>
Don't separate your text from <pre> tags with spaces or line-breaks.
Final thoughts
After googling this matter a little I have a feeling that html-approach is considered less clean than the css one and we should go css-way. However, html-way seems to be more browser-compatible (supports archaic browsers, but who cares):
pre tag
white-space
The following helper preserves new lines as line breaks, and renders any HTML or script (e.g. JavaScript) as plain text.
def with_new_lines(string)
  (h(string).gsub(/\n/, '<br/>')).html_safe
end
Use it like this in views:
<%= with_new_lines @object.some_text %>
I just used white-space: pre-line, so each newline (\n) is rendered as a line break.
You'll need to convert the plain text of the textarea to HTML.
At the most basic level you could run a string replacement:
message_content.gsub! /\n/, '<br />'
You could also use a special format like Markdown (Ruby library: BlueCloth) or Textile (Ruby library: RedCloth).
I was using the Ace code editor in my Rails app and had a problem: whenever I updated or created the code, an extra tab was added to every line (except the first). I couldn't solve it with gsub or JavaScript replace, but it accidentally solved itself when I disabled the layout for that template.
So I solved it with
render :layout => false

Evaluating string templates

I have a string template as shown below
template = '<p class="foo">#{content}</p>'
I want to evaluate the template based on current value of the variable called content.
html = my_eval(template, "Hello World")
This is my current approach for this problem:
def my_eval template, content
"\"#{template.gsub('"', '\"')}\"" # gsub to escape the quotes
end
Is there a better approach to solving this problem?
EDIT
I used HTML fragment in the sample code above to demonstrate my scenario. My real scenario has set of XPATH templates in a configuration file. The bind variables in the template are substituted to get a valid XPATH string.
I have thought about using ERB, but decided against it as it might be overkill.
You can do what you want with String's native method '%':
> template = "<p class='foo'>%s</p>"
> content = 'value of content'
> output = template % content
> puts output
=> "<p class='foo'>value of content</p>"
See http://ruby-doc.org/core/classes/String.html#M000770
You can render a string as if it were an erb template. Seeing that you're using this in a rake task, you're better off using ERB.new.
require 'erb'
template = '<p class="foo"><%=content%></p>'
html = ERB.new(template).result(binding)
Using the ActionController methods originally suggested, involves instantiating an ActionController::Base object and sending render or render_to_string.
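If passing the whole `binding` feels too broad, ERB can also take the template variables as a hash. This sketch assumes Ruby 2.5 or newer, where `ERB#result_with_hash` is available:

```ruby
require 'erb'

template = '<p class="foo"><%= content %></p>'

# Only the keys of the hash are visible to the template as local variables.
html = ERB.new(template).result_with_hash(content: 'Hello World')
puts html
```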
I can't say I really recommend either of these approaches. This is what libraries like erb are for, and they've been thoroughly tested for all the edge cases you haven't thought of yet. And everyone else who has to touch your code will thank you. However, if you really don't want to use an external library, I've included some recommendations.
The my_eval method you included didn't work for me. Try something like this instead:
template = '<p class="foo">#{content}</p>'
def my_eval( template, content )
eval %Q{"#{template.gsub(/"/, '\"')}"}
end
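If you want to generalize this so you can use templates that have variables other than content, you could expand it to something like this:
def my_eval( template, locals )
  # Variables created inside a separate eval are not visible to a later eval,
  # so set them on one binding and evaluate the template in that same binding
  b = binding
  locals.each_pair{ |var, value| b.local_variable_set(var, value) }
  b.eval %Q{"#{template.gsub(/"/, '\"')}"}
end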
That method would be called like this
my_eval( '<p class="foo">#{content}</p>', :content => 'value of content' )
But again, I'd advise against rolling your own in this instance.
This is also a nice one:
template = "Price of the %s is Rs. %f."
# %s - string, %f - float and %d - integer
p template % ["apple", 70.00]
# prints Price of the apple is Rs. 70.000000.
more here
Too late, but I think a better way is the one from the ruby-style-guide:
template = '<p class="foo">%<content>s</p>'
content_text = 'Text inside p'
output = format( template , content: content_text )
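Applied to the XPath templates mentioned in the question, named references might look like the sketch below. The template string and the names `css_class` and `index` are made up for illustration. One property that may be useful for configuration-driven templates is that `format` raises a `KeyError` when a named variable is missing, instead of silently producing a broken path.

```ruby
# Hypothetical XPath template with two named bind variables.
xpath_template = '//div[@class="%<css_class>s"]/a[%<index>d]'

xpath = format(xpath_template, css_class: 'foo', index: 2)
puts xpath

# A missing variable raises KeyError rather than leaving a gap in the result.
missing = begin
  format(xpath_template, css_class: 'foo')
  nil
rescue KeyError => e
  e.message
end
puts "missing variable: #{missing}"
```

A literal percent sign in such a template has to be written as `%%`.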
