Find PDF Form Field position - ruby-on-rails

I realize this question has been asked a lot, but I could not find any information about how to do this in RoR. I am filling a PDF with form text fields using pdf-forms but this does not support adding images, and I need to be able to add an image of a customer's signature into the PDF. I have used prawn to render the image on the existing PDF, but I need to know the exact location to add the image on the signature line. So my question is how can I look at an arbitrary PDF and find the exact position of the "Signature" form field?

I ended up using pdf2json to find the x,y position of the form field. I generate a JSON file of the original pdf using this command:
%x{ pdf2json -f "#{form_path}" }
The JSON file is generated in the same directory as form_path. I find the field I want using these commands:
jsonObj = JSON.parse(File.read(json_path))
signature_fields = jsonObj["formImage"]["Pages"].first["Fields"].find_all do |f|
f["id"]["Id"] == 'signature'
end
I can use prawn to first create a new PDF with the image. Then, using pdf-forms, I multistamp the image PDF onto the original PDF that I want to add the image to. Note that multistamp applies each page of the stamp PDF to the corresponding page of the input PDF, so make sure your image PDF has the correct number of pages, or else your image will get stamped on every page. I only want the image stamped onto the first page, so I do the following:
num_pages = %x{ #{Rails.configuration.pdftk_path} #{form_path} dump_data | grep "NumberOfPages" | cut -d":" -f2 }.to_i
signaturePDF = "/tmp/signature.pdf"
Prawn::Document.generate(signaturePDF) do
signature_fields.each do |field|
image Rails.root.join("signature.png"), at: [field["x"], field["y"]],
width: 50
end
# Pad with blank pages so the stamp PDF has as many pages as the original
(num_pages - 1).times { start_new_page }
end
outputPDF = "/tmp/output.pdf"
pdftk.multistamp originalPDF, signaturePDF, outputPDF
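The page count can also be read in Ruby instead of piping through grep and cut. The following is only a sketch: the helper names page_count_from_dump and blank_pages_needed are made up for illustration, and the sample text is hand-written to resemble `pdftk dump_data` output rather than captured from a real run.

```ruby
# Find the "NumberOfPages: N" line in pdftk dump_data output and return N.
# Returns nil when the line is missing.
def page_count_from_dump(dump_text)
  line = dump_text.each_line.find { |l| l.start_with?("NumberOfPages:") }
  line ? line.split(":", 2).last.strip.to_i : nil
end

# Blank pages to add after the first (stamped) page so the stamp PDF
# ends up with the same number of pages as the original.
def blank_pages_needed(num_pages)
  [num_pages.to_i - 1, 0].max
end

# Hand-written sample, assumed to resemble real dump_data output
sample_dump = <<~DUMP
  InfoBegin
  InfoKey: Producer
  InfoValue: Example
  NumberOfPages: 3
  PageMediaBegin
DUMP

pages = page_count_from_dump(sample_dump)
puts pages
puts blank_pages_needed(pages)
```

In the real flow you would pass the output of %x{ pdftk form.pdf dump_data } to the first helper.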

You can use the wicked_pdf gem. You just write HTML, and the gem automatically converts it to PDF.
Read more: https://github.com/mileszs/wicked_pdf

Here's a pure ruby implementation that will return the field's name, page, x, y, height, and width using Origami https://github.com/gdelugre/origami
require "origami"
def pdf_field_metadata(file_path)
pdf = Origami::PDF.read file_path
field_to_page = {}
pdf.pages.each_with_index do |page, page_index|
(page.Annots || []).each do |annot|
field_to_page[annot.refno] = page_index
end
end
field_metas = []
pdf.fields.each do |field|
field_metas << {
name: field.T,
page_index: field_to_page[field.no],
x: field.Rect[0].to_f,
y: field.Rect[1].to_f,
h: field.Rect[3].to_f - field.Rect[1].to_f,
w: field.Rect[2].to_f - field.Rect[0].to_f
}
end
field_metas
end
pdf_field_metadata "<path to pdf>"
I haven't tested it particularly thoroughly but the snippet can hopefully get you most of the way there.
Also, keep in mind that the coordinates calculated above are in points measured from the bottom left of the PDF page rather than the top left (and are not in pixels). In default PDF user space there are 72 points per inch, and you can get the page dimensions in points by calling page.MediaBox in the pdf.pages loop above. If you're looking for pixel coordinates, you need to know the DPI of the resulting rendered document.
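To make the unit conversion concrete, here is a minimal sketch that turns a field rectangle in points (bottom-left origin) into pixel coordinates with a top-left origin. The function name rect_points_to_pixels and the sample numbers are hypothetical, and it assumes the page's MediaBox starts at (0, 0).

```ruby
POINTS_PER_INCH = 72.0

# x, y, w, h: field rectangle in points, measured from the bottom left.
# page_height: page height in points (e.g. 792 for US Letter).
# dpi: resolution of the rendered image.
def rect_points_to_pixels(x:, y:, w:, h:, page_height:, dpi:)
  scale = dpi / POINTS_PER_INCH
  {
    left:   (x * scale).round,
    # The top edge of the field sits at y + h in PDF space;
    # flip it so it is measured from the top of the page.
    top:    ((page_height - (y + h)) * scale).round,
    width:  (w * scale).round,
    height: (h * scale).round
  }
end

px = rect_points_to_pixels(x: 72, y: 100, w: 144, h: 20, page_height: 792, dpi: 144)
p px
```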

Related

Render JSON to image in Prawn PDF

I'm using Prawn PDF in conjunction with the signature-pad gem in my Rails 3.2 app, and I'm having trouble converting the JSON data to an image to render in the PDF.
On completion, the signature pad writes the JSON data into the table, and it looks like this.
JSON
[{"lx":29,"ly":18,"mx":29,"my":17},{"lx":29,"ly":19,"mx":29,"my":18},{"lx":29,"ly":24,"mx":29,"my":19},{"lx":29,"ly":27,"mx":29,"my":24},{"lx":29,"ly":30,"mx":29,"my":27},{"lx":29,"ly":32,"mx":29,"my":30},{"lx":32,"ly":32,"mx":29,"my":32},{"lx":33,"ly":32,"mx":32,"my":32},{"lx":35,"ly":31,"mx":33,"my":32},{"lx":39,"ly":24,"mx":35,"my":31},{"lx":42,"ly":16,"mx":39,"my":24},{"lx":48,"ly":7,"mx":42,"my":16},{"lx":51,"ly":2,"mx":48,"my":7},{"lx":54,"ly":-3,"mx":51,"my":2},{"lx":58,"ly":2,"mx":58,"my":1},{"lx":59,"ly":9,"mx":58,"my":2},{"lx":60,"ly":18,"mx":59,"my":9},{"lx":60,"ly":27,"mx":60,"my":18},{"lx":60,"ly":38,"mx":60,"my":27},{"lx":55,"ly":45,"mx":60,"my":38},{"lx":49,"ly":51,"mx":55,"my":45},{"lx":45,"ly":54,"mx":49,"my":51},{"lx":39,"ly":57,"mx":45,"my":54},{"lx":35,"ly":51,"mx":35,"my":50},{"lx":43,"ly":45,"mx":35,"my":51},{"lx":54,"ly":39,"mx":43,"my":45},{"lx":70,"ly":32,"mx":54,"my":39},{"lx":81,"ly":28,"mx":70,"my":32},{"lx":96,"ly":25,"mx":81,"my":28},{"lx":111,"ly":23,"mx":96,"my":25},{"lx":119,"ly":23,"mx":111,"my":23},{"lx":126,"ly":23,"mx":119,"my":23},{"lx":129,"ly":23,"mx":126,"my":23},{"lx":130,"ly":23,"mx":129,"my":23},{"lx":128,"ly":24,"mx":130,"my":23},{"lx":117,"ly":25,"mx":128,"my":24},{"lx":105,"ly":27,"mx":117,"my":25},{"lx":96,"ly":29,"mx":105,"my":27},{"lx":89,"ly":30,"mx":96,"my":29},{"lx":85,"ly":30,"mx":89,"my":30},{"lx":84,"ly":31,"mx":85,"my":30},{"lx":87,"ly":32,"mx":84,"my":31},{"lx":101,"ly":36,"mx":87,"my":32},{"lx":118,"ly":39,"mx":101,"my":36},{"lx":136,"ly":42,"mx":118,"my":39},{"lx":151,"ly":43,"mx":136,"my":42},{"lx":165,"ly":43,"mx":151,"my":43},{"lx":171,"ly":40,"mx":165,"my":43},{"lx":175,"ly":37,"mx":171,"my":40},{"lx":177,"ly":34,"mx":175,"my":37},{"lx":178,"ly":32,"mx":177,"my":34},{"lx":178,"ly":31,"mx":178,"my":32}]
I have seen this, but I'm not sure how best to implement it.
Controller
def show
@form = Form.find(params[:id])
respond_to do |format|
format.html
format.pdf do
pdf = FormPdf.new(@form)
send_data pdf.render, filename: "form - #{@form.title}", type: "application/pdf", disposition: "inline"
end
end
end
Prawn PDF
# encoding: utf-8
class FormPdf < Prawn::Document
def initialize(form)
super()
@form = form
all
end
def all
text "Form text here"
move_down 20
signature_data = [[@form.signature, "Signature of person"]]
table(signature_data, position: :center) do
cells.style(:border_width => 0)
end
end
end
Please see: https://github.com/nqngo/rails-signature-pad-prawns-demo
The signature in question image:
Luckily I did something similar at my workplace, so I will walk you through the whole thought process. Assume we store the data in @sig and set up the signature box dimensions:
signature = '[{"lx":29,"ly":18,"mx":29,"my":17},{"lx":29,"ly":19,"mx":29,"my":18},{"lx":29,"ly":24,"mx":29,"my":19},{"lx":29,"ly":27,"mx":29,"my":24},{"lx":29,"ly":30,"mx":29,"my":27},{"lx":29,"ly":32,"mx":29,"my":30},{"lx":32,"ly":32,"mx":29,"my":32},{"lx":33,"ly":32,"mx":32,"my":32},{"lx":35,"ly":31,"mx":33,"my":32},{"lx":39,"ly":24,"mx":35,"my":31},{"lx":42,"ly":16,"mx":39,"my":24},{"lx":48,"ly":7,"mx":42,"my":16},{"lx":51,"ly":2,"mx":48,"my":7},{"lx":54,"ly":-3,"mx":51,"my":2},{"lx":58,"ly":2,"mx":58,"my":1},{"lx":59,"ly":9,"mx":58,"my":2},{"lx":60,"ly":18,"mx":59,"my":9},{"lx":60,"ly":27,"mx":60,"my":18},{"lx":60,"ly":38,"mx":60,"my":27},{"lx":55,"ly":45,"mx":60,"my":38},{"lx":49,"ly":51,"mx":55,"my":45},{"lx":45,"ly":54,"mx":49,"my":51},{"lx":39,"ly":57,"mx":45,"my":54},{"lx":35,"ly":51,"mx":35,"my":50},{"lx":43,"ly":45,"mx":35,"my":51},{"lx":54,"ly":39,"mx":43,"my":45},{"lx":70,"ly":32,"mx":54,"my":39},{"lx":81,"ly":28,"mx":70,"my":32},{"lx":96,"ly":25,"mx":81,"my":28},{"lx":111,"ly":23,"mx":96,"my":25},{"lx":119,"ly":23,"mx":111,"my":23},{"lx":126,"ly":23,"mx":119,"my":23},{"lx":129,"ly":23,"mx":126,"my":23},{"lx":130,"ly":23,"mx":129,"my":23},{"lx":128,"ly":24,"mx":130,"my":23},{"lx":117,"ly":25,"mx":128,"my":24},{"lx":105,"ly":27,"mx":117,"my":25},{"lx":96,"ly":29,"mx":105,"my":27},{"lx":89,"ly":30,"mx":96,"my":29},{"lx":85,"ly":30,"mx":89,"my":30},{"lx":84,"ly":31,"mx":85,"my":30},{"lx":87,"ly":32,"mx":84,"my":31},{"lx":101,"ly":36,"mx":87,"my":32},{"lx":118,"ly":39,"mx":101,"my":36},{"lx":136,"ly":42,"mx":118,"my":39},{"lx":151,"ly":43,"mx":136,"my":42},{"lx":165,"ly":43,"mx":151,"my":43},{"lx":171,"ly":40,"mx":165,"my":43},{"lx":175,"ly":37,"mx":171,"my":40},{"lx":177,"ly":34,"mx":175,"my":37},{"lx":178,"ly":32,"mx":177,"my":34},{"lx":178,"ly":31,"mx":178,"my":32}]'
@sig = JSON.parse signature
sigpad_height = 55
sigpad_width = 198
You then create a bounding_box at the cursor point and draw the line from the JSON data. The reason why we have to use a bounding_box is to set the coordinate of the line origin. Otherwise, the line function will use the bottom left of the page as the origin point:
bounding_box([0, cursor], width: sigpad_width, height: sigpad_height) do
stroke_bounds
@sig.each do |e|
stroke { line [e["lx"], e["ly"]],
[e["mx"], e["my"]] }
end
end
The resulting PDF will be:
Notice how the image is upside down. This is because the y-axis points in opposite directions in PDF and canvas: in PDF the origin is at the bottom-left and y increases upward, whereas in canvas the origin is at the top-left and y increases downward. We need to convert the coordinates from canvas style to PDF style. A basic transformation is to flip over the x-axis and translate back by sigpad_height. The line code is now:
stroke { line [e["lx"], sigpad_height - e["ly"]],
[e["mx"], sigpad_height - e["my"]] }
The end result will be:
If you do not want the border around the bounding_box, remove the stroke_bounds call. A couple of gotchas to be careful about:
SignaturePad captures coordinates outside the HTML signature pad's dimensions, which is why the rendered PDF signature has lines drawn outside its bounding_box.
The above transformation assumes the bounding box and the HTML pad have the same height. If they differ, you will need to add an offset to translate the signature back into the correct position after flipping it over the x-axis.
Depending on how you store your JSON in the database, the coordinates may come back as a hash with symbol keys. In that case e["lx"] will yield nil and you must use e[:lx] instead.
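The flip and the string-versus-symbol key gotcha can be handled together before any drawing happens. This is just a sketch: flip_segments is a hypothetical helper, not part of Prawn or the signature-pad gem, and it assumes the pad height equals the bounding box height.

```ruby
require "json"

# Convert canvas-style segments (top-left origin) into PDF-style
# endpoints (bottom-left origin). Accepts string or symbol keys.
def flip_segments(segments, pad_height)
  segments.map do |seg|
    s = seg.transform_keys(&:to_s) # normalize :lx and "lx" to "lx"
    {
      from: [s["lx"], pad_height - s["ly"]],
      to:   [s["mx"], pad_height - s["my"]]
    }
  end
end

sig = JSON.parse('[{"lx":29,"ly":18,"mx":29,"my":17}]')
p flip_segments(sig, 55)
```

Inside the bounding_box, each result can then be drawn with stroke { line seg[:from], seg[:to] }.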

MiniMagick (+Rails): How to display number of scenes in an image

I have a Rails app that uploads images for image processing, and I want to be able to 1) See how many pages/frames/scenes there are in an image, and 2) split multi-page images into single-page jpegs.
I'm having no trouble converting image types for single-scene images, but I can't quite puncture the ImageMagick documentation to understand exactly what I'm to do. The doc page I'm using is here:
http://www.imagemagick.org/www/escape.html
For the most part, I would like the code to be as simple as
def multiPage?( image )
img = MiniMagick::Image.open(image.path)
numPages = img.format("%n") #This returns Nil
numPages.to_i > 1
end
Does anyone have a better idea of what to do than I do? Thanks in advance!
Ok, well this is a bit of a hack, but when I did:
numPages = img[:n]
I would get numPages resulting in a string of the letter 'n' as many times as there are pages in an image, so:
#img -> 4-page image
numPages = img[:n] # => 'nnnn'
Probably not the best answer, but at least it works.
UPDATE:
Found a better way
numPages = Integer(img["%n"])

Grails: Replacing symbols with HTML equivalent

I'm reading a CSV file, and one of the columns has text containing symbols that are not recognized. After I read the file, symbols such as ' become � . I'm also saving this into a DB.
Obviously, when I display this on a webpage, it shows garbage. How can I substitute the HTML code (e.g. &#180;) for this with Grails?
I am reading the CSV using the csv plugin. Code below:
def f = "clientDocs/testfile.csv"
def fReader = new File(f).toCsvMapReader([batchSize:50, charset:'UTF-8'])
fReader.each { batchList ->
batchList.each {
def description = substituteSymbols(it.Description)
def substituteSymbols(inText) {
// HOW TO SUBSTITUTE HERE
}
Thanks for any help or suggestions. I've already tried string.replaceAll(regExp).
Grails comes with a basic set of encoders/decoders for common tasks.
What you want here is it.Description.encodeAsHTML().
And then if you want the original when displaying in the view, just reverse it with .decodeHTML()
You can read more about these here: http://grails.org/doc/latest/guide/single.html#codecs
(Edited decode method name typo as per the comment)

Carrierwave and mini_magick finding widths & height

After a bit of investigation I decided to use Carrierwave and mini_magick on my new Rails 3 app.
I've set it up and it works perfectly. However, I have one question. I'd like to be able to access the width and height so I can form the HTML correctly. However, there is no default data from which to get this information. Because of the way it stores the data, I cannot think of any way to add it to the database.
Can anyone suggest any tips or ideas? Is it even possible?
class Attachment
mount_uploader :file, FileUploader
def image
@image ||= MiniMagick::Image.open(file.path)
end
end
And use it like this:
Attachment.first.image['width'] # => 400
Attachment.first.image['height'] # => 300
Just for the record, I have used a similar solution, but with files stored in Mongo GridFS. Here it goes:
def image
@image ||= MiniMagick::Image.read(Mongo::GridFileSystem.new(Mongoid.database).open(file.path, 'r'))
end
Disadvantages of calculating image height and width with RMagick or MiniMagick at run time:
It's CPU intensive.
It requires a network connection to fetch the image and calculate the dimensions.
It's a slow process.
FYI, you can also calculate the image height and width after the image is fully loaded by using the load event associated with the <img> tag, with the help of jQuery.
For Example:
$(document).ready(function(){
var $image = $('.fixed-frame img');
$image.load(function(){
rePositionLogo($image);
});
if($image.prop('complete')){
rePositionLogo($image);
}
});
function rePositionLogo($image){
var height = $image.height();
var width = $image.width();
if (width > height) {
$image.parents('.header').addClass('landscape');
var marginTop = (105 - $image.height())/2;
$image.css('margin-top', marginTop + 'px')
}else{
$image.parents('.header').addClass('portrait');
}
}
Be careful, because load() will not trigger when an image is already loaded. This can easily happen when an image is in the user's browser cache.
You can check whether an image is already loaded using $('#myImage').prop('complete'), which returns true when the image is loaded.
I think the best way is to store the image dimensions in the model (database).
In my case, the model name is attachment. Then I created a migration:
rails g migration add_dimensions_to_attachments image_width:integer image_height:integer
After that, run the migration:
rake db:migrate
In my Image Uploader file app/uploaders/image_uploader.rb, I have:
class ImageUploader < CarrierWave::Uploader::Base
include CarrierWave::MiniMagick
process :store_dimensions
private
def store_dimensions
if file && model
model.image_width, model.image_height = ::MiniMagick::Image.open(file.file)[:dimensions]
end
end
end
With this, the image dimensions are saved during the upload step.
To get the dimensions, I simply call attachment.image_width or attachment.image_height.
See the reference here.

Truncate Markdown?

I have a Rails site, where the content is written in markdown. I wish to display a snippet of each, with a "Read more.." link.
How do I go about this? Simply truncating the raw text will not work, for example..
>> "This is an [example](http://example.com)"[0..25]
=> "This is an [example](http:"
Ideally I want to allow the author to (optionally) insert a marker to specify what to use as the "snippet", if not it would take 250 words, and append "..." - for example..
This article is an example of something or other.
This segment will be used as the snippet on the index page.
^^^^^^^^^^^^^^^
This text will be visible once clicking the "Read more.." link
The marker could be thought of like an EOF marker (which can be ignored when displaying the full document)
I am using maruku for the Markdown processing (RedCloth is very biased towards Textile, BlueCloth is extremely buggy, and I wanted a native-Ruby parser which ruled out peg-markdown and RDiscount)
Alternatively (since the Markdown is translated to HTML anyway) truncating the HTML correctly would be an option - although it would be preferable to not markdown() the entire document, just to get the first few lines.
So, the options I can think of are (in order of preference)..
Add a "truncate" option to the maruku parser, which will only parse the first x words, or till the "excerpt" marker.
Write/find a parser-agnostic Markdown truncate'r
Write/find an intelligent HTML truncating function
Write/find an intelligent HTML truncating function
The following from http://mikeburnscoder.wordpress.com/2006/11/11/truncating-html-in-ruby/, with some modifications will correctly truncate HTML, and easily allow appending a string before the closing tags.
>> puts "<p><b>Something</p>".truncate_html(5, at_end = "...")
=> <p><b>Someth...</b></p>
The modified code:
require 'rexml/parsers/pullparser'
class String
def truncate_html(len = 30, at_end = nil)
p = REXML::Parsers::PullParser.new(self)
tags = []
new_len = len
results = ''
while p.has_next? && new_len > 0
p_e = p.pull
case p_e.event_type
when :start_element
tags.push p_e[0]
results << "<#{tags.last}#{attrs_to_s(p_e[1])}>"
when :end_element
results << "</#{tags.pop}>"
when :text
results << p_e[0][0..new_len]
new_len -= p_e[0].length
else
results << "<!-- #{p_e.inspect} -->"
end
end
if at_end
results << at_end
end
tags.reverse.each do |tag|
results << "</#{tag}>"
end
results
end
private
def attrs_to_s(attrs)
if attrs.empty?
''
else
' ' + attrs.to_a.map { |attr| %{#{attr[0]}="#{attr[1]}"} }.join(' ')
end
end
end
Here's a solution that works for me with Textile.
Convert it to HTML
Truncate it.
Remove any HTML tags that got cut in half with
html_string.gsub(/<[^>]*$/, "")
Then use Hpricot to clean it up and close unclosed tags
html_string = Hpricot( html_string ).to_s
I do this in a helper, and with caching there's no performance issue.
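The truncate-and-trim steps (without the Hpricot cleanup) can be sketched in plain Ruby. The helper name truncate_and_trim is made up for illustration; note that the result can still contain unclosed tags, which is what the Hpricot step is for.

```ruby
# Cut the HTML to max_chars characters, then drop a trailing tag
# that was cut in half (a "<" with no matching ">" before the end).
def truncate_and_trim(html, max_chars)
  html[0, max_chars].gsub(/<[^>]*$/, "")
end

html = '<p>This is an <a href="http://example.com">example</a></p>'
puts truncate_and_trim(html, 30)
```

Here the cut lands inside the <a ...> tag, so the partial tag is removed while the opening <p> is kept and still needs closing.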
You could use a regular expression to find a line consisting of nothing but "^" characters:
markdown_string = <<-eos
This article is an example of something or other.
This segment will be used as the snippet on the index page.
^^^^^^^^^^^^^^^
This text will be visible once clicking the "Read more.." link
eos
preview = markdown_string[0...(markdown_string =~ /^\^+$/)]
puts preview
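The marker approach can be combined with the 250-word fallback from the question. What follows is a sketch under assumptions: the excerpt helper and its max_words parameter are hypothetical names, and the fallback joins words with single spaces, so it flattens line breaks and may still cut through Markdown syntax such as a link.

```ruby
# A line made up only of "^" characters (optionally followed by spaces)
MARKER = /^\^+\s*$/

# Return the text before the marker line if there is one; otherwise
# fall back to the first max_words words followed by "...".
def excerpt(markdown, max_words: 250)
  idx = markdown =~ MARKER
  return markdown[0...idx].rstrip if idx

  words = markdown.split
  return markdown if words.length <= max_words

  words.first(max_words).join(" ") + "..."
end

text = "This segment is the snippet.\n^^^^^^^^\nThis text is shown after Read more.\n"
puts excerpt(text)
```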
Rather than trying to truncate the text, why not have two input boxes, one for the "opening blurb" and one for the main "guts"? That way your authors will know exactly what is being shown when, without having to rely on some sort of funky EOF marker.
I will have to agree with the "two inputs" approach, and the content writer would need not to worry, since you can modify the background logic to mix the two inputs in one when showing the full content.
full_content = input1 + input2 # perhaps with some complementary HTML, for better formatting
Not sure if it applies to this case, but I'm adding the solution below for the sake of completeness. You can use the strip_tags method if you are truncating Markdown-rendered content:
truncate(strip_tags(markdown(article.contents)), length: 50)
Sourced from:
http://devblog.boonecommunitynetwork.com/rails-and-markdown/
A simpler option that just works:
truncate(markdown(item.description), length: 100, escape: false)
