As the title says, I have some DOM manipulation tasks. For example, I want to:
- find all H1 elements that are blue.
- find all text that has a size of 12px.
- etc.
How can I do this with Rails?
Thank you. :)
Update
I have been doing some research on extracting web page content, based on this paper: http://www.springerlink.com/index/A65708XMUR9KN9EA.pdf
The steps, in summary:
- get the URL of the web page I want to extract from (a single web page)
- grab some elements from the web page based on some visual rules (e.g. grab all H1 elements that are blue)
- process the elements with my algorithm
- save the result into my database
(Sorry for my bad English.)
If what you're trying to do is manipulate HTML documents inside a Rails application, you should take a look at Nokogiri.
It can search a document with XPath. The following finds every link with the "blue" CSS class inside an h1 in a document.
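require 'nokogiri'
require 'open-uri'
# URI.open comes from open-uri; a bare open() no longer fetches URLs on Ruby 3+
doc = Nokogiri::HTML(URI.open('http://www.stackoverflow.com'))
# @class selects the class attribute; this matches <a class="blue"> inside an <h1>
doc.xpath('//h1/a[@class="blue"]').each do |link|
  puts link.content
end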
If, on the other hand, what you are trying to do is parse the DOM of the current page in the browser, you should take a look at JavaScript and jQuery. Rails can't do that.
http://railscasts.com/episodes/190-screen-scraping-with-nokogiri
To reliably determine what color an arbitrary element on a web page is, you would need to reverse engineer a browser (to accurately take into account stylesheets, markup hacks, broken tags, images, etc.).
A far easier approach would be to embed an existing browser engine such as Gecko in a custom application of your own.
As your spider browses pages, it would pass them to your embedded Gecko instance, where you could use getComputedStyle to read what color an individual element happens to be.
You originally mentioned wanting to use Ruby on Rails for this project. Rails is a framework for writing presentational applications and is really a bad fit for a project like this.
As a starting point, I'd recommend you check out RubyGnome, and in particular RubyGnome's Gtk::MozEmbed functionality.
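To illustrate why a plain parser is not enough for the color question, here is a minimal sketch that only looks at inline style attributes. It uses REXML from the standard Ruby distribution (swapped in for Nokogiri purely so the example has no gem dependencies), and the sample markup and the inline_blue_headings name are made up for the illustration. The second heading, whose color would come from a stylesheet, is invisible to this approach, which is exactly the case where getComputedStyle in a real browser engine is needed.

```ruby
require 'rexml/document'

# Made-up, well-formed sample markup for the illustration.
html = <<~XHTML
  <html><body>
    <h1 style="color: blue">Inline blue</h1>
    <h1 class="title">Blue only via a stylesheet</h1>
    <h1 style="color: red">Inline red</h1>
  </body></html>
XHTML

doc = REXML::Document.new(html)

# Returns the h1 elements whose inline style declares "color: blue".
# The (?:\A|;) prefix keeps "background-color: blue" from matching.
def inline_blue_headings(doc)
  REXML::XPath.match(doc, '//h1').select do |h1|
    style = h1.attributes['style'].to_s
    style.match?(/(?:\A|;)\s*color\s*:\s*blue\b/i)
  end
end

blue = inline_blue_headings(doc).map(&:text)
puts blue.inspect
```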
Related
Essentially, I'm trying to let users edit Slim that is stored in the database.
For example, they would use a form to create a new page and enter the markup for that page in a text field, which would be saved in the database. I want to allow them to edit that page in Slim. Note that the stored markup is Slim, not plain HTML.
If I store Slim in the database, how do I get Rails to render it as proper HTML on the client side in production? In other words, would Rails do this automatically, since the view is being rendered like so:
views/page/view.html.slim
page.header
page.content
page.footer
or would I have to figure out a way to convert it on the fly? I might be making this more complicated than it should be, but I'm new to this.
If I understand you correctly, you want to convert the Slim to HTML and output that in your views.
This is taken from Slim's docs. It shows how Slim processes templates and renders the output.
Tilt.new('template.slim').render(scope)
Slim::Template.new('template.slim', optional_option_hash).render(scope)
Slim::Template.new(optional_option_hash) { source }.render(scope)
So, in short:
Slim::Template.new('page/view.html.slim').render
Put that in a module to make it prettier and I think you're good. You may want to use a Rails path helper to build the path to the view. You may also want to find a way to catch indentation errors so that your output doesn't break in production. Some kind of validation that prevents a page from being saved if it is not properly formatted should help.
I want to add print functionality so that a user can print the web form and use it as a paper form for the same purpose. Of course, I do not need the page header and footer to be printed, just the content of a div that takes up most of the page. I played around with print media CSS and managed to get the printed result to look almost like the original page. But then I tried to print it in another browser (Chrome; before that I had used Mozilla) and it was all messed up.
For the web form I use the Twitter Bootstrap CSS framework, and I had to override its CSS (in print media) for almost every element individually to get a reasonable look in the print result.
My question is: is there some way (a framework or plugin) to print just what you see on the page, maybe as an image or something?
Any other suggestions are welcome.
Thanks.
If you are familiar with PHP you can try the PHP class files of TCPDF or those of FPDF.
Or there is also dompdf which renders HTML to PDF, but this will include more than just the information of one div.
And for further info here is a post on Stack where users are discussing which they think is best.
With a Grails app and a local database, I'm returning some text in XML format.
I can display it well formed in a <textarea></textarea> tag with the correct indentation (tabs, line breaks, etc.).
I want to go a bit further. In the text I'm returning, there are some <img/> tags, and I'd like to replace those tags with the actual images.
I searched around and have found no solution so far. I understand that you can't add an image to a textarea (other than as a background), and if I choose a div tag, I lose the indentation (which makes the text harder to read).
I was wondering whether a <g:textField/> or another tag from the Grails library would do the trick. And if so, how can I append it to a page using jQuery?
For example, how do I append a <g:textField/> with jQuery? It doesn't interpret it and I get this error:
SyntaxError: missing ) after argument list [Break On This Error]...+doc).append("<input type="text" id="FTMAP_"+nb_sec+"" ...
And in my javascript file, I have
$("#FTM_"+doc).append("<g:textField id='FTMAP_"+nb_sec+"' ... />
Any possible solutions ?
EDIT
I forgot to mention that my final goal is to be able to edit the text (tags included) with clean, neat indentation, so that it is as easy as possible for the end user.
You are asking a few different questions:
1. Can I use a single HTML tag to include images inside pre-formatted text?
No. You will have to parse the text and translate it into styled text yourself.
2. Is there a tag in the Grails standard tags to accomplish this for me?
No.
3. How can I add Grails tags from my JavaScript code?
Grails tags are processed on the server side, and JavaScript is processed on the client. This means you cannot directly add Grails tags via JavaScript.
There are a couple of methods that can accomplish the same result, however:
You can set a javascript variable to the rendered content of a grails tag. This solution is good for data that is known at the time of the initial request.
var tagOutput = "${g.textField(/* etc */)}";
You can make an ajax request for the content to be added. Then your server-side grails code can render the tags you need. This is better for realtime data, or data that will be updated more than once on a single rendered page.
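One caveat with the first method: the rendered tag contains double quotes (which is what triggers the SyntaxError in the question), so it needs escaping before it goes inside a JavaScript string literal. The sketch below shows the idea in Ruby rather than Grails, purely as an illustration of the quoting problem. The rendered_tag value and the variable names are made up, and it relies on a JSON string also being usable as a JavaScript string literal.

```ruby
require 'json'

# Hypothetical output of a server-side tag helper; note the embedded double quotes.
rendered_tag = '<input type="text" id="FTMAP_1" name="FTMAP_1" />'

# Naive interpolation yields broken JavaScript: the first inner quote ends the string.
broken = "var tagOutput = \"#{rendered_tag}\";"

# to_json emits a quoted string with the inner double quotes backslash-escaped.
js_literal = rendered_tag.to_json
safe = "var tagOutput = #{js_literal};"

puts broken
puts safe
```

If the statement is placed inside an inline script block, the content would also need to be checked for a literal closing script tag, which this sketch does not handle.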
I am having trouble using the TinyMCE editor with Rails 3. I want to show text in bold, and tags are not working: for example, when I wrap something in p tags it should start a new paragraph, but in my case it stays on the same line and the p tags themselves are displayed on the page.
The usual suspect when Rails 3 prints raw HTML markup on the page is that someone forgot to call html_safe on the text being printed.
So if you have a @my_model_instance.description that you edit with TinyMCE, you might want the view to output @my_model_instance.description.html_safe or, as suggested in a comment on the documentation, raw(@my_model_instance.description).
If the text is coming from user input, however, you might want to be a bit cautious, since it might be possible for users to input all sorts of nasty injection hacks this way.
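In a Rails view, the built-in sanitize helper is the usual way to handle this, since it strips tags and attributes that are not on its allowlist. To show the allowlist idea on its own, here is a rough sketch using only Ruby's standard library. The ALLOWED_TAGS list and the allow_basic_tags name are made up for this example, and a regex filter like this is an illustration, not a replacement for a real sanitizer.

```ruby
require 'erb'

# Hypothetical allowlist: simple formatting tags an editor might produce.
ALLOWED_TAGS = %w[p b strong i em br].freeze

# Matches an escaped opening or closing tag from the allowlist, without attributes.
TAG_PATTERN = %r{&lt;(/?)(#{ALLOWED_TAGS.join('|')})\s*/?&gt;}i

# Escapes everything first, then re-enables only the bare allowlisted tags.
# Tags carrying attributes (such as onclick) stay escaped.
def allow_basic_tags(html)
  escaped = ERB::Util.html_escape(html)
  escaped.gsub(TAG_PATTERN) do
    "<#{Regexp.last_match(1)}#{Regexp.last_match(2).downcase}>"
  end
end

puts allow_basic_tags('<p>Hi <b>there</b></p><script>alert(1)</script>')
```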
How do I add FCKEditor to an MVC application?
How do I show a database value that comes from the model in FCKEditor?
That CodeProject website isn't ideal. It asks you to write a lot of unnecessary code. All you really need to do is include the correct JavaScript file:
Then, in the page, render the FCKEditor in any of several ways. I prefer to replace a text area:
window.onload = function () {
  // 'content' must match the textarea that should be replaced
  var oFCKeditor = new FCKeditor('content');
  oFCKeditor.ReplaceTextarea();
};
At that point, the editor should load just fine. You will probably need to edit the FCKEditor configuration files to get the behavior you want, but otherwise everything should just work. Your FCKEditor instance will behave just like any other form field, and you can treat it as such when you read its value on the server side.
It's very easy to create the server-side APIs for it to use as well. I created an FCKEditor control, and you just need to implement GetFolders, GetFoldersAndFiles, and GetFiles. Those only take a few lines and give you nearly all the functionality you need.
I think it's easier to integrate and customize FCKEditor using MVC than it is on classic ASP.NET.