I have footer text that may be more than one line, the length of the text will vary depending on the user's name and the company they work for. Like all footers, it needs to be displayed below the document's bottom bound so that it doesn't get intermingled with the main content of the PDF.
The problem is that the only way I have found in Prawn to get text printed below the document's bottom bound is by using #draw_text. This is the same method that number_pages uses to get its text to appear below the document's bottom bound. However, the one caveat of using #draw_text appears to be its inability to wrap text to a second line.
I have found many methods that allow me to wrap text to a second line such as #text_box, #bounding_box, etc. but the caveat of these methods is that they don't allow you to print anything below the document's bottom bound.
For example, the following will not print anything on the document because it would be below the document's bottom bound:
text_box "Generated by Tom Cruise for Universal Studios", :at => [bounds.left, 0], :width => 200
The following does print on the document because it is within the document's bottom bound but will also be printed on top of any content that already exists there:
text_box "Generated by Tom Cruise for Universal Studios", :at => [bounds.left, bounds.bottom + 20], :width => 200
And finally the following will print below the document's bottom bound ensuring that it is not being printed on top of any existing content in the PDF, but there is no available :width option or the ability to have the text wrap to a second line if needed:
draw_text "Generated by Tom Cruise for Universal Studios", :at => [bounds.left, 0]
Is there a way to get the best of both worlds? A way to print below the document's bottom bound AND enforce a maximum width with line wrapping?
I suspect you'll need to do the line wrapping manually (e.g. calculate when to break).
But I was able to get a multi-line footer using the standard number_pages method and the following:
pdf.number_pages "Copyright #{Time.now.year} Company.", [pdf.bounds.left, 0]
pdf.number_pages "Profile generated on #{Time.now.strftime('%B %d, %Y')}.", [pdf.bounds.left, 10]
Is that what you are looking for?
I ended up writing my own little routine to handle multiple lines in the footer. It'd be nice if Prawn supported something like this out of the box. I'm still a bit mystified why some things can't be displayed below the bottom bound while other things can be. It would also be nice if all the different types of text methods supported the :width attribute with line wrapping... but I digress. Here is the code I ended up using:
line_wrapper = Prawn::Core::Text::LineWrap.new
repeat :all do
  str = "Generated on " + Time.zone.now.strftime("%m/%d/%y at %I:%M:%S %p %Z") + " by #{user.full_name} at #{user.company.name}"
  starting_position = 0
  while !str.blank?
    single_line = line_wrapper.wrap_line(str, :width => 470, :document => pdf)
    draw_text(single_line, :at => [bounds.left, starting_position])
    starting_position -= 10
    str.slice!(single_line)
  end
end
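For readers who want to see the "calculate when to break" idea in isolation, here is a minimal sketch of greedy word wrapping in plain Ruby. The names `wrap_words` and `measure` are hypothetical and not part of Prawn; `measure` is any callable that returns the rendered width of a string. In a Prawn document it could plausibly be something like `->(s) { pdf.width_of(s) }`, but the example below uses a stand-in so it runs without Prawn.

```ruby
# Greedy word wrap: keep adding words to the current line until the
# measured width would exceed max_width, then start a new line.
# `measure` is a hypothetical callable returning the width of a string.
def wrap_words(text, max_width, measure)
  lines = []
  current = ""
  text.split.each do |word|
    candidate = current.empty? ? word : "#{current} #{word}"
    # A single word that is too wide still gets its own line.
    if current.empty? || measure.call(candidate) <= max_width
      current = candidate
    else
      lines << current
      current = word
    end
  end
  lines << current unless current.empty?
  lines
end

# Stand-in measure (an assumption): every character is 5 points wide.
char_width = ->(s) { s.length * 5 }
lines = wrap_words("Generated by Tom Cruise for Universal Studios", 100, char_width)
```

Each element of `lines` could then be passed to draw_text with a decreasing y position, as the routine above does.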
I realize this question has been asked a lot, but I could not find any information about how to do this in RoR. I am filling a PDF with form text fields using pdf-forms but this does not support adding images, and I need to be able to add an image of a customer's signature into the PDF. I have used prawn to render the image on the existing PDF, but I need to know the exact location to add the image on the signature line. So my question is how can I look at an arbitrary PDF and find the exact position of the "Signature" form field?
I ended up using pdf2json to find the x,y position of the form field. I generate a JSON file of the original pdf using this command:
%x{ pdf2json -f "#{form_path}" }
The JSON file is generated in the same directory as form_path. I find the field I want using these commands:
jsonObj = JSON.parse(File.read(json_path))
signature_fields = jsonObj["formImage"]["Pages"].first["Fields"].find_all do |f|
f["id"]["Id"] == 'signature'
end
I can use prawn to first create a new PDF with the image. Then using pdf-forms, I multistamp the image pdf onto the original PDF that I want to add the image to. But multistamp applies each page of the stamp PDF to the corresponding page of the input PDF so make sure your image PDF has the correct number of pages or else your image will get stamped on every page. I only want the image stamped onto the first page, so I do the following:
num_pages = %x{ #{Rails.configuration.pdftk_path} #{form_path} dump_data | grep "NumberOfPages" | cut -d":" -f2 }.to_i
signaturePDF = "/tmp/signature.pdf"
Prawn::Document.generate(signaturePDF) do
signature_fields.each do |field|
image Rails.root.join("signature.png"), at: [field["x"], field["y"]],
width: 50
end
(num_pages - 1).times { start_new_page }
end
outputPDF = "/tmp/output.pdf"
pdftk.multistamp originalPDF, signaturePDF, outputPDF
You can use the 'wicked_pdf' gem. You just write HTML, and the gem automatically converts it to PDF.
Read more https://github.com/mileszs/wicked_pdf
Here's a pure ruby implementation that will return the field's name, page, x, y, height, and width using Origami https://github.com/gdelugre/origami
require "origami"
def pdf_field_metadata(file_path)
pdf = Origami::PDF.read file_path
field_to_page = {}
pdf.pages.each_with_index do |page, page_index|
(page.Annots || []).each do |annot|
field_to_page[annot.refno] = page_index
end
end
field_metas = []
pdf.fields.each do |field|
field_metas << {
name: field.T,
page_index: field_to_page[field.no],
x: field.Rect[0].to_f,
y: field.Rect[1].to_f,
h: field.Rect[3].to_f - field.Rect[1],
w: field.Rect[2].to_f - field.Rect[0]
}
end
field_metas
end
pdf_field_metadata "<path to pdf>"
I haven't tested it particularly thoroughly but the snippet can hopefully get you most of the way there.
Also -- keep in mind that the coordinates calculated above are in points measured from the bottom left of the PDF page rather than the top left (and are not in pixels). I believe there are always 72 points per inch, and you can get the total page size in points by calling page.MediaBox in the pdf.pages loop above. If you're looking for pixel coordinates, you need to know the DPI of the resulting rendered document.
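As a rough illustration of that conversion, the sketch below turns a field rectangle given in points (bottom-left origin) into a pixel position measured from the top left of a rendered page. The function names are hypothetical, and the page height is assumed to be the one read from the page's MediaBox.

```ruby
POINTS_PER_INCH = 72.0

# Convert a length in PDF points to pixels at the given DPI.
def points_to_pixels(points, dpi)
  points * dpi / POINTS_PER_INCH
end

# PDF y grows upward from the bottom edge, while image y grows downward
# from the top edge, so the field's top edge is flipped against the page height.
def top_left_pixel_position(x_pt, y_pt, height_pt, page_height_pt, dpi)
  top_pt = page_height_pt - (y_pt + height_pt)
  [points_to_pixels(x_pt, dpi), points_to_pixels(top_pt, dpi)]
end

# A 72pt-tall field whose bottom edge sits 648pt up a US Letter page (792pt tall).
position = top_left_pixel_position(72, 648, 72, 792, 150)
```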
I am using the Roo gem to output a spreadsheet from a Rails app. One of my columns is a hash (Postgres DB). I would like to format the cell contents into something more readable. I am using a method to return a human readable cell.
The column data looks like this:
Inspection.first.results
=> {"soiled"=>"oil on back",
"assigned_to"=>"Warehouse#firedatasolutions.com",
"contaminated"=>"blood on left cuff",
"inspection_date"=>"01/01/2017",
"physical_damage_seam_integrity"=>"",
"physical_damage_thermal_damage"=>"",
"physical_damage_reflective_trim"=>"",
"physical_damage_rips_tears_cuts"=>"small tear on right sleeve",
"correct_assembly_size_compatibility_of_shell_liner_and_drd"=>"",
"physical_damage_damaged_or_missing_hardware_or_closure_systems"=>""}
In my Inspections model I defined the following method:
def print_results
self.results.each do |k,v|
puts "#{k.titleize}:#{v.humanize}\r\n"
end
end
So in the console I get this:
Inspection.first.print_results
Soiled:Oil on back
Assigned To:Warehouse
Contaminated:Blood on left cuff
Inspection Date:01/01/2017
Physical Damage Seam Integrity:
Physical Damage Thermal Damage:
Physical Damage Reflective Trim:
Physical Damage Rips Tears Cuts:Small tear on right sleeve
Correct Assembly Size Compatibility Of Shell Liner And Drd:
Physical Damage Damaged Or Missing Hardware Or Closure Systems:
=> {"soiled"=>"oil on back",
"assigned_to"=>"Warehouse",
"contaminated"=>"blood on left cuff",
"inspection_date"=>"01/01/2017",
"physical_damage_seam_integrity"=>"",
"physical_damage_thermal_damage"=>"",
"physical_damage_reflective_trim"=>"",
"physical_damage_rips_tears_cuts"=>"small tear on right sleeve",
"correct_assembly_size_compatibility_of_shell_liner_and_drd"=>"",
"physical_damage_damaged_or_missing_hardware_or_closure_systems"=>""}
But when I put this in the index.xlsx.axlsx file
wb = xlsx_package.workbook
wb.add_worksheet(name: "Inspections") do |sheet|
sheet.add_row ['Serial Number', 'Category', 'Inspection Type', 'Date',
'Pass/Fail', 'Assigned To', 'Inspected By', 'Inspection Details']
@inspections.each do |inspection|
sheet.add_row [inspection.ppe.serial, inspection.ppe.category,
inspection.advanced? ? 'Advanced' : 'Routine',
inspection.results['inspection_date'],
inspection.passed? ? 'Pass' : 'Fail',
inspection.ppe.user.last_first_name,
inspection.user.last_first_name,
inspection.print_results]
end
end
The output in the spreadsheet is the original hash, not the results of the print statement.
{"soiled"=>"oil on back",
"assigned_to"=>"Warehouse",
"contaminated"=>"blood on left cuff", "inspection_date"=>"01/01/2017",
"physical_damage_seam_integrity"=>"",
"physical_damage_thermal_damage"=>"",
"physical_damage_reflective_trim"=>"",
"physical_damage_rips_tears_cuts"=>"small tear on right sleeve",
"correct_assembly_size_compatibility_of_shell_liner_and_drd"=>"",
"physical_damage_damaged_or_missing_hardware_or_closure_systems"=>""}
Is it possible to get the output of the method into the cell rather than the hash object?
The problem is that your print_results method prints out what you want to stdout (that is, the console), but still returns the original hash. The return value of the method is all that matters to Roo.
What you want to do is rewrite print_results to return the formatted string:
def print_results
  self.results.map do |k,v|
    "#{k.titleize}:#{v.humanize}\r\n"
  end.join
end
This will return a string (note the use of .join to combine the array of strings returned by .map) that you can throw into Roo and get your desired output.
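The difference in return values can be seen in plain Ruby. This small sketch leaves out titleize and humanize (they come from ActiveSupport) and uses a shortened, made-up version of the hash:

```ruby
results = { "soiled" => "oil on back", "inspection_date" => "01/01/2017" }

# Hash#each returns the hash itself, whatever the block evaluates to.
returned_by_each = results.each { |k, v| "#{k}:#{v}" }

# map collects the block's values into an array; join turns it into one string.
returned_by_map = results.map { |k, v| "#{k}:#{v}\r\n" }.join
```

Here `returned_by_each` is the very same hash object, which is why the cell showed the raw hash, while `returned_by_map` is the formatted string.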
Those willing to jump straight to my questions can go to the paragraph "Please help with". You will find there my beginning of implementation, along with short XML samples
The story
The famous problem of inserting repeating content, like table rows, into a word template, using the rails framework.
I decided to implement a 'cleaner' solution for replacing some variables in a Word document with rails, using XML databinding. This solution works very well for non-repetitive content, but for repetitive content, a little extra dirty work must be done and I need help with it.
No C#, No Visual, just plain olde ruby on rails & XML
The databinded document
I have a Word document with some content controls, tagged with "human-readable" text, so my users know what should be inside.
I have used Word 2007 Content Control Toolkit to add some custom XML to a .docx file. Therefore in each .docx I have some customXml/itemsx.xml that contains my custom XML.
I have manually databinded this XML to text content control I have in my word template, using drag & drop with Word 2007 Content Control Toolkit.
The replacing process with nokogiri
Basically I already have some code that replaces every XML node by the corresponding value from a hash. For example if I provide this hash to my function :
variables = {
"some_xml-node" => "some_value"
}
It will properly replace XML in customXml/itemsx.xml of .docx file :
<root> <some> <xml-node>some_value</xml-node></some> </root>
So this is taken care of !
The repetitive content
Now as I said, this works perfectly for non-repetitive content. For repetitive content (in my case I want to repeat some <w:tr> in a document), the solution I'd like to go with, is
1. Manually insert some tags in word/document.xml of the .docx file (this is dirty, but hell, I can't think of anything else) before every <tr> that needs to be duplicated
2. In rails, parse the XML and locate the <tr> that needs duplicating using Nokogiri
3. Copy the <tr> as many times as I need
4. Look at some text inside this <tr>, find the databinding (which looks like <w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/name[1]">)
5. Replace movie[1] by movie[index]
6. Repeat for every table that needs <tr> duplication
With this solution I ensure 100% compatibility with my existing system! It's a kind of preprocessing step...
Please help with
1. Finding an XML comment containing a custom string, and selecting the node just below it (using Nokogiri)
2. Changing attributes in many sub-nodes of the node found in 1.
XML/Hash samples that could be used (my beginning of implementation after that):
Sample of .docx word/document.xml
<w:document>
<!-- My_Custom_Tag_ID -->
<w:tr someparam="something">
<w:td></w:td>
<w:td><w:sthelse></w:sthelse><w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/name[1]"/><w:sth>Value</w:sth></w:td>
<w:td></w:td>
</w:tr>
</w:document>
Sample of input parameter repeat_tag hash
repeat_tags_sample = [
{
"tag" => "My_Custom_Tag_ID",
"repeatable-content" => "movie"
},
{
"tag" => "My_Custom_Tag_ID_2",
"repeatable-content" => "cartoons"
}
]
Sample of input parameter contents hash
contents_sample =
{
"movies" => [{"name" => "X-Men",
"year" => 1998,
"property-xxx" => 42
}, { "name" => "X-Men-4",
"year" => 2007,
"property-xxx" => 42
}],
"cartoons" => [{"name" => "Tom_Jerry",
"year" => 1995,
"property-yyy" => "cat"
}, { "name" => "Random_name",
"year" => 2008,
"property-yyy" => 42
}]
}
My beginning of implementation :
def dynamic_table_content(zip, repeat_tags, contents)
doc = zip.find_entry("word/document.xml")
xml = Nokogiri::XML.parse(doc.get_input_stream)
# repeat_tags_sample = [ {
# "tag" => "My_Custom_Tag_ID",
# "repeatable-content" => "movie"},
# ...]
repeat_tags.each do |rpt|
content = contents[rpt["repeatable-content"]]
# content now looks like [
# {"name" => "X-Men",
# "year" => 1998,
# "property-xxx" => 42, ...},
# ...]
content_name = rpt["repeatable-content"].to_s
# the 'movie' of '/root[1]/movies[1]/movie[1]/name[1]' (see below)
puts "Processing #{rpt["tag"]}, adding #{content_name}s"
# Word document.xml sample code looks like this :
# <!-- My_Custom_Tag_ID_inserted_manually -->
# <w:tr ...>
# ...
# <w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/name[1]>
# ...
# </w:tr>
# TODO (1): find a comment containing a custom string, and select the node just below
# Find starting <w:tr > tag located after <!-- rpt["tag"] -->
base_tr_node = nil # placeholder: the node after the comment
# Duplicate it as many times as we want.
content.each_with_index do |content, index|
puts "Adding #{content_name} : #{content}"
new_tr_node = base_tr_node.add_next_sibling(base_tr_node.dup)
# inside this new node there are many
# <w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/name[1]>
# <w:dataBinding w:xpath="/root[1]/movies[1]/movie[1]/year[1]>
# ..../movie[1]/property-xxx[1]
# GOAL : replace every movie[1] by movie[index]
# TODO (2): change attributes in many sub-nodes of the node found in 1.
# new_tr_node.change_attributes as described (see GOAL in previous comments)
# Maybe, it would be something like
# new_tr_node.gsub("(#{content_name})\[([1-9]+)\]", "\1\[#{index}\]")
# ... But new_tr_node is a nokogiri element so .gsub doesn't exist
end
end
@replace["word/document.xml"] = xml.serialize :save_zip_with => 0
end
I have looked at the DoPE extension for Word documents. It looks great ! But alas I had already done a lot of work, and just now I (almost) finished building my own preprocessor.
What I needed was more complicated than what I originally asked. But nevertheless, the answers would be :
EDIT : fixed bad regex/xpath
# 1. Find a comment containing a custom string, and select the node just below
comment_nodes = doc.xpath("//comment()")
# Loop like comment_nodes.each do |comment|
base_tr_node = comment.next_sibling.next_sibling
# next_sibling has to be applied twice, even though the comment is indeed just above the <w:tr> node; the first sibling is most likely the whitespace text node between them
# 2. Change attributes in many sub-nodes of the node found in 1.
matches = tr_node.search(".//*[name()='w:dataBinding']")
matches.each do |databinding_node|
  # replace '.*phase[1].*' by '.*phase[index].*'
  # gsub returns a new string, so the result is assigned back to the attribute
  databinding_node['w:xpath'] = databinding_node['w:xpath'].gsub("#{comment.text}\[1\]", "#{comment.text}\[#{index}\]")
end
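The string rewrite at the heart of step 2 can be tried without Nokogiri. The sketch below uses a hypothetical helper, `reindex_xpath`, that only touches the named element's `[1]` predicate. One detail worth keeping in mind: XPath positions are 1-based, while each_with_index counts from 0, so the example adds 1 to the index.

```ruby
# Replace the "[1]" predicate of one element name inside an XPath string.
# The word boundary keeps "movie" from matching inside another name, and
# "movies[1]" is left alone because the pattern requires "movie[1]" exactly.
def reindex_xpath(xpath, element_name, position)
  pattern = /\b#{Regexp.escape(element_name)}\[1\]/
  xpath.gsub(pattern) { "#{element_name}[#{position}]" }
end

bindings = %w[X-Men X-Men-4].each_with_index.map do |_movie, index|
  reindex_xpath("/root[1]/movies[1]/movie[1]/name[1]", "movie", index + 1)
end
```

The returned string would still have to be assigned back to the node's attribute, since gsub does not modify anything in place.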
I have an automated report tool (corp intranet) where the admins have a few text area boxes to enter some text for different parts of the email body.
What I'd like to do is parse the contents of the text area and wrap any hyperlinks found with link tags (so when the report goes out there are links instead of text urls).
Is there a simple way to do something like this without figuring out a way of parsing the text to add link tags around whatever is found (from one of ['http:', 'https:', 'ftp:'] to the first space after it)?
Thank You!
Ruby 1.8.7, Rails 2.3.5
Make a helper :
def make_urls(text)
  urls = %r{(?:https?|ftp|mailto)://\S+}i
  html_text = text.gsub urls, '<a href="\0">\0</a>'
  html_text
end
In the view, just call this function and you will get the expected output.
For example:
irb(main):001:0> string = 'here is a link: http://google.com'
=> "here is a link: http://google.com"
irb(main):002:0> urls = %r{(?:https?|ftp|mailto)://\S+}i
=> /(?:https?|ftp|mailto):\/\/\S+/i
irb(main):003:0> html = string.gsub urls, '<a href="\0">\0</a>'
=> "here is a link: <a href=\"http://google.com\">http://google.com</a>"
There are many ways to accomplish your goal. One way would be to use Regex. If you have never heard of regex, this wikipedia entry should bring you up to speed.
For example:
content_string = "Blah ablal blabla lbal blah blaha http://www.google.com/ adsf dasd dadf dfasdf dadf sdfasdf dadf dfaksjdf kjdfasdf http://www.apple.com/ blah blah blah."
content_string.split(/\s+/).find_all { |u| u =~ /^https?:/ }
Which will return: ["http://www.google.com/", "http://www.apple.com/"]
Now, for the second half of the problem, you will use the array returned above to substitute hyperlinks for the text links.
links = ["http://www.google.com/", "http://www.apple.com/"]
links.each do |l|
content_string.gsub!(l, "<a href='#{l}'>#{l}</a>")
end
content_string will now be updated to contain HTML hyperlinks for all http/https URLs.
As I mentioned earlier, there are numerous ways to tackle this problem - to find the URLs you could also do something like:
require 'uri'
URI.extract(content_string, ['http', 'https'])
I hope this helps you.
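Since the text comes from a free-form text area, it may be worth escaping it before adding markup. The following is a minimal sketch of that idea, not a replacement for the answers above; the helper name `auto_link_urls` and the exact pattern are assumptions chosen for illustration.

```ruby
require "cgi"

# Matches http, https and ftp URLs up to the next whitespace or "<".
URL_PATTERN = %r{(?:https?|ftp)://[^\s<]+}i

# Escape the raw text first so user-entered markup is neutralised,
# then wrap every URL in an anchor tag.
def auto_link_urls(text)
  CGI.escapeHTML(text).gsub(URL_PATTERN) { |url| %(<a href="#{url}">#{url}</a>) }
end

linked = auto_link_urls("here is a link: http://google.com")
```

A URL followed directly by punctuation (for example a trailing full stop) would have that character included in the link, so real input may need a stricter pattern.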
I have a Rails site, where the content is written in markdown. I wish to display a snippet of each, with a "Read more.." link.
How do I go about this? Simply truncating the raw text will not work, for example..
>> "This is an [example](http://example.com)"[0..25]
=> "This is an [example](http:"
Ideally I want to allow the author to (optionally) insert a marker to specify what to use as the "snippet", if not it would take 250 words, and append "..." - for example..
This article is an example of something or other.
This segment will be used as the snippet on the index page.
^^^^^^^^^^^^^^^
This text will be visible once clicking the "Read more.." link
The marker could be thought of like an EOF marker (which can be ignored when displaying the full document)
I am using maruku for the Markdown processing (RedCloth is very biased towards Textile, BlueCloth is extremely buggy, and I wanted a native-Ruby parser which ruled out peg-markdown and RDiscount)
Alternatively (since the Markdown is translated to HTML anyway) truncating the HTML correctly would be an option - although it would be preferable to not markdown() the entire document, just to get the first few lines.
So, the options I can think of are (in order of preference)..
1. Add a "truncate" option to the maruku parser, which will only parse the first x words, or till the "excerpt" marker.
2. Write/find a parser-agnostic Markdown truncate'r
3. Write/find an intelligent HTML truncating function
Write/find an intelligent HTML truncating function
The following from http://mikeburnscoder.wordpress.com/2006/11/11/truncating-html-in-ruby/, with some modifications will correctly truncate HTML, and easily allow appending a string before the closing tags.
>> puts "<p><b>Something</p>".truncate_html(5, at_end = "...")
=> <p><b>Someth...</b></p>
The modified code:
require 'rexml/parsers/pullparser'

class String
  def truncate_html(len = 30, at_end = nil)
    p = REXML::Parsers::PullParser.new(self)
    tags = []
    new_len = len
    results = ''
    while p.has_next? && new_len > 0
      p_e = p.pull
      case p_e.event_type
      when :start_element
        # Remember the open tag so it can be closed after truncation
        tags.push p_e[0]
        results << "<#{tags.last}#{attrs_to_s(p_e[1])}>"
      when :end_element
        results << "</#{tags.pop}>"
      when :text
        results << p_e[0][0..new_len]
        new_len -= p_e[0].length
      else
        results << "<!-- #{p_e.inspect} -->"
      end
    end
    # Append the caller-supplied suffix (e.g. "...") before closing open tags
    results << at_end if at_end
    tags.reverse.each do |tag|
      results << "</#{tag}>"
    end
    results
  end

  private

  def attrs_to_s(attrs)
    if attrs.empty?
      ''
    else
      ' ' + attrs.to_a.map { |attr| %{#{attr[0]}="#{attr[1]}"} }.join(' ')
    end
  end
end
Here's a solution that works for me with Textile.
Convert it to HTML
Truncate it.
Remove any HTML tags that got cut in half with
html_string.gsub(/<[^>]*$/, "")
Then use Hpricot to clean it up and close any unclosed tags
html_string = Hpricot( html_string ).to_s
I do this in a helper, and with caching there's no performance issue.
You could use a regular expression to find a line consisting of nothing but "^" characters:
markdown_string = <<-eos
This article is an example of something or other.
This segment will be used as the snippet on the index page.
^^^^^^^^^^^^^^^
This text will be visible once clicking the "Read more.." link
eos
preview = markdown_string[0...(markdown_string =~ /^\^+$/)]
puts preview
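Because the marker is optional, the lookup can come back empty. Here is a hedged sketch that combines the marker search with the 250-word fallback described in the question; the method name `excerpt` and the `max_words` parameter are hypothetical.

```ruby
# A line consisting only of "^" characters (trailing spaces allowed).
EXCERPT_MARKER = /^\^+\s*$/

# Return everything before the marker line if there is one; otherwise
# fall back to the first max_words words followed by "...".
def excerpt(markdown, max_words: 250)
  marker_at = markdown =~ EXCERPT_MARKER
  return markdown[0...marker_at].rstrip if marker_at

  words = markdown.split
  return markdown if words.length <= max_words

  words.first(max_words).join(" ") + "..."
end

sample = "Intro line.\n^^^^\nRest of it.\n"
preview_text = excerpt(sample)
```

The fallback joins words with single spaces, so it flattens line breaks and can still cut through a Markdown construct whose text contains spaces; it is only meant to show the control flow.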
Rather than trying to truncate the text, why not have 2 input boxes, one for the "opening blurb" and one for the main "guts"? That way your authors will know exactly what is being shown when, without having to rely on some sort of funky EOF marker.
I will have to agree with the "two inputs" approach. The content writer need not worry, since you can modify the background logic to combine the two inputs into one when showing the full content.
full_content = input1 + input2 # perhaps with some complementary html, for a better formatting
Not sure if it applies to this case, but adding the solution below for the sake of completeness. You can use strip_tags method if you are truncating Markdown-rendered contents:
truncate(strip_tags(markdown(article.contents)), length: 50)
Sourced from:
http://devblog.boonecommunitynetwork.com/rails-and-markdown/
A simpler option that just works:
truncate(markdown(item.description), length: 100, escape: false)