Fill PDF form with data and images - ruby-on-rails

My goal is to fill an existing interactive PDF form with user data.
The requirements are:
it should be able to insert data into text fields;
it should be able to insert an image at a given XY position.
I found the FillablePDF gem for inserting data into interactive PDF forms, but I can't tell whether it can also insert an image.
For inserting an image, I found the Prawn gem.
Is there a better way to do this, ideally using only the FillablePDF gem?

You can do both tasks with the latest version of HexaPDF which has gained support for AcroForm interactive forms. HexaPDF is a pure Ruby PDF manipulation and creation library, so you wouldn't need to rely on iText and Java like with FillablePDF.
The code for your task would be something like this:
require 'hexapdf'
doc = HexaPDF::Document.open(input_pdf_file)
doc.acro_form.field_by_name('Name of text field').field_value = 'New value'
doc.pages[0].canvas(type: :overlay).image(image_file, at: [x, y])
doc.write('output.pdf')
Note that if this is for a commercial project you will most likely need the commercial license of HexaPDF.
Alternatively, using Prawn with the prawn-blank gem should also work.
(Nota bene: I'm the author of HexaPDF.)

Related

Excel to pdf conversion in rails4 with libreoffice

In a Rails 4 application, how can I convert an Excel spreadsheet file into a PDF?
When I convert Excel to PDF, the contents are split across different pages if the Excel column size is large.
How can I generate the PDF without the data spilling onto additional pages?
Please help,
Thanks
So you basically have two options: implement this yourself or use an out-of-the-box solution. If you implement it yourself and a CSV file is acceptable as "excel", you can use a CSV gem/library (the default CSV, FasterCSV, or SmarterCSV). If a CSV is not acceptable, you can use the axlsx gem instead (note that axlsx generates spreadsheets; to read an uploaded .xlsx file you would need a reader gem such as roo). Then, for the PDF conversion, you can use something like Prawn. If you decide to build this yourself, follow these steps.
Create a controller that will handle reports. I suggest running the rails g controller report upload generate_table show generate_pdf generator to create a controller and a view for the upload process.
Create a file upload form in the upload view.
On submit, the form sends the file to the generate_table action, which processes it with one of the CSV or Excel gems.
Once processed, your end product should be an array or hash (as an instance variable) that you can send to the show action.
In the show view, iterate over that hash/array and encapsulate the contents in an HTML table.
The show view should have a button that sends that same hash/array to the generate_pdf controller action, where you use Prawn to create a PDF. You can use something like send_data to send the completed PDF file back to the user.
This is roughly how you could go about it, minus the low-level details. If you want an out-of-the-box solution instead, you could use something like Ruport. Ruport will handle most of the heavy lifting for you; the only catch is that your models and associations need to be set up the way it is designed to use them, and that may not be an option for you.
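As a rough sketch of the processing and table steps above, assuming the upload is a CSV file with a header row: the method names rows_from_csv and html_table are made up for illustration, and in a real Rails app the table would normally live in the show view template rather than in a helper like this.

```ruby
require 'csv'
require 'erb'

# Parse CSV text into an array of hashes keyed by the header row.
# (Hypothetical helper; it stands in for the "processing" step.)
def rows_from_csv(csv_text)
  CSV.parse(csv_text, headers: true).map(&:to_h)
end

# Render the rows as a plain HTML table, escaping every cell.
# (Hypothetical helper; it stands in for the show view.)
def html_table(rows)
  return '<table></table>' if rows.empty?

  headers = rows.first.keys
  head = headers.map { |h| "<th>#{ERB::Util.html_escape(h)}</th>" }.join
  body = rows.map do |row|
    cells = headers.map { |h| "<td>#{ERB::Util.html_escape(row[h])}</td>" }.join
    "<tr>#{cells}</tr>"
  end.join
  "<table><tr>#{head}</tr>#{body}</table>"
end
```

The same array of hashes could then be handed to whatever PDF step you choose, so the parsing only has to happen once.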

PrawnPDF Flip Entire PDF

I'm using Prawn PDF to create a label that I send to a label printer, but the label prints upside down. This matters because the shipping labels we use come with some print already on them. The setup I'm using (an iPad through a Lantronix xPrintServer to a Zebra printer) won't allow me to flip it using the drivers.
So I'm wanting to know if there is a way using Prawn (or even just Rails) to flip the entire document (which contains 2+ pages) so it prints out correctly on the labels. The order of the pages isn't essential.
I haven't used Prawn lately, but I'm pretty sure using the rotate method at the top of your code will work. You'll just need to either set the origin to the center of the page, or use translate to reposition the content after rotation. Page 29 in the manual (PDF) has some example code.
You could save the pdf to a file and then use the awesome pdftk to rotate the saved pdf, then send the amended version.
https://www.pdflabs.com/docs/pdftk-cli-examples/
EDIT - pdftk is not a library/plugin/gem, or any kind of Ruby for that matter. It's a command line tool which you would use like this, in your controller, replacing your current "generate and send pdf" code.
#instead of sending the pdf straight to the user, save it to a file
#i'm not sure how to do this in prawn but it can't be difficult
#rotate the original to a new file
`pdftk /path/to/original.pdf cat 1-endsouth output /path/to/rotated.pdf`
#you could test whether the rotated file exists here as an error-check
#then use send_file to send the rotated one as the response.
send_file "/path/to/rotated.pdf", :type => "application/pdf"
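A minimal sketch of the same idea with the error check filled in, assuming pdftk is installed and on the PATH. The method names pdftk_rotate_command and rotate_pdf are hypothetical. Passing the arguments to system as a list (instead of interpolating them into a backtick string) avoids shell quoting problems with unusual file names.

```ruby
# Build the pdftk argument list that rotates every page by 180 degrees.
# "1-endsouth" is the same page range and rotation used in the command above.
def pdftk_rotate_command(input_path, output_path)
  ['pdftk', input_path, 'cat', '1-endsouth', 'output', output_path]
end

# Run pdftk and report whether the rotated file was actually produced.
# system returns nil when the command cannot be run and false on a
# non-zero exit status, so only an explicit true counts as success.
def rotate_pdf(input_path, output_path)
  ok = system(*pdftk_rotate_command(input_path, output_path))
  ok == true && File.exist?(output_path)
end
```

In the controller you would then call send_file only when rotate_pdf returns true, and fall back to an error response otherwise.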

Add text to existing pdf using ruby

I have a Rails application that joins an arbitrary number of PDF files. Now I need to add page numbering to the joined PDF using Ruby.
Is there a state-of-the-art way to add text or other content to an existing PDF file using Ruby?
Working with PDFs is really challenging in Ruby/Rails (so I have found out!)
This is the way I was able to add text dynamically to a PDF in Rails.
Add this gem to your Gemfile: gem "combine_pdf"
Then you can use code like this:
# get the record from the database to add dynamically to the pdf
user = User.last
# get the existing pdf
pdf = CombinePDF.load "#{Rails.root}/public/pdf/existing_pdf.pdf"
# create a textbox and add it to the existing pdf on page 2
pdf.pages[1].textbox "#{user.first_name} #{user.last_name}", height: 20, width: 70, y: 596, x: 72
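# output the new pdf which now contains your dynamic data
# (the timestamp is formatted without spaces or colons to keep the file name portable)
pdf.save "#{Rails.root}/public/pdf/output#{Time.now.strftime('%Y%m%d%H%M%S')}.pdf"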
You can find details of the textbox method here:
https://www.rubydoc.info/gems/combine_pdf/0.2.5/CombinePDF/Page_Methods#textbox-instance_method
I spent days on this working through a number of different gems:
prawn
wicked_pdf
pdfkit
fillable_pdf
But this was by far the smoothest solution for me as of 2019.
I hope this saves someone a lot of time so they don't have to go through all the trial and error I had to with PDFs!
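This solution worked well for me (note that the :template option was removed from Prawn core and now requires the prawn-templates gem):
Prawn::Document.generate("output.pdf", :template => "/path/to/template.pdf") do
  text "This is a text in a copied pdf.", :align => :center
end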
You can use CombinePDF for that.
I wrote it because Prawn dropped their template support and I needed a native replacement.
Your code might look something like this:
pdf = CombinePDF.new
pdf << CombinePDF.load("file1.pdf")
pdf << CombinePDF.load("file2.pdf")
pdf.number_pages
pdf.save "output.pdf"
Look at the documentation for the different formatting options. I love to surround the numbering with a rounded box (it's in the features and should be easy to play with).

How to export all Issues and its contents (Full content) to excel in JIRA?

Currently I can download only the fields, or export the contents of a single issue to Word.
JIRA: using the latest version.
Logged in as Administrator.
I searched Google but couldn't find a solution.
Go to Issues and make a filter that returns all the issues you want
In the top right corner, there is a Views menu item. Open it.
Select the Excel (all fields) option to export all issues to Excel
@user1747116, you can use the method described by Whim, but you do not get all of the information out of an issue.
You do have a couple of options:
If you are versed in XML you can go to System->Import / Export Section -> Backup and it does a full backup of your JIRA instance in XML as described in this help post.
You can use the method described by Whim of simply going to the issues list and clicking on the 'export function', but ALSO before doing that using one of the add-ons that allows you to export comments as well. Plug-ins specifically mentioned in this help article are "All Comments", "JIRA Utilities", and "Last Comment".
Write a Crystal Report formatted in a way to export into Excel. We have done this to make the information both accessible to those not versed in SQL. We have in particular done this for
You can write an SQL query, go directly at the database, and save the result to CSV. Note that the schema changed between JIRA 4 and 6, and we had to redo several of our queries, so keep this in mind. But this is one to get you started in JIRA 6. Note that time logs are in [worklog], file attachments are in [fileattachment], and comments are in [jiraaction]. Each of these tends to have multiple entries per issue, so you will need to do further joins to get them all into the same query. This is also useful know-how if you are doing it in a Crystal Report and then exporting to Excel.
SELECT TOP 1000 _JI.ID
,_JI.pkey
,_JI.PROJECT
,_PRJ.pname
,_JI.REPORTER
,_JI.ASSIGNEE
,_JI.issuetype
,_IT.pname
,_JI.SUMMARY
,_JI.DESCRIPTION
,_JI.ENVIRONMENT
,_JI.PRIORITY
,_PRI.pname
,_JI.RESOLUTION
,_RES.pname
,_JI.issuestatus
,_IS.Pname
,_JI.CREATED
,_JI.UPDATED
,_JI.DUEDATE
,_JI.RESOLUTIONDATE
,_JI.VOTES
,_JI.WATCHES
,_JI.TIMEORIGINALESTIMATE
,_JI.TIMEESTIMATE
,_JI.TIMESPENT
,_JI.WORKFLOW_ID
,_JI.SECURITY
,_JI.FIXFOR
,_JI.COMPONENT
,_JI.issuenum
,_JI.CREATOR
FROM jiraissue _JI (NOLOCK)
LEFT JOIN PROJECT _PRJ ON _JI.Project = _PRJ.ID
LEFT JOIN ISSUESTATUS _IS ON _JI.issuestatus = _IS.ID
LEFT JOIN ISSUETYPE _IT ON _JI.issuetype = _IT.ID
LEFT JOIN PRIORITY _PRI ON _JI.Priority = _PRI.ID
LEFT JOIN RESOLUTION _RES ON _JI.Resolution = _RES.ID
Note: You could get rid of the redundant fields, but I left both in so you can see where they came from. You can also add a WHERE clause for a single issue ID or limit the output to a particular project. The TOP 1000 only displays the first 1000 results. Remove that if you are comfortable with it returning everything. (We have tens of thousands in our db, so I put that in there.)
Exporting all details to Excel using the built-in export feature is simply impossible. Excel export will not export you the comments, the attachment, change history, etc. As other answers mention the Excel output produced by JIRA is in fact an HTML file, which works in many situations, but doesn't if you need precise representation of data.
Our company built a commercial add-on called the Better Excel Plugin, which generates native Excel exports (in XLSX format) from JIRA data.
It is a powerful alternative to the built-in feature, with major advantages and awesome customization. It really supports Excel analysis functionality, including formulas, charts, pivot tables, and pivot charts.
This was my solution.
I downloaded the file like this:
"Issues" > "Search for Issues"
"Export" button > "Excel (HTML, All Fields)"
After downloading the file, Excel (Microsoft Office Professional Plus 2013) would not open the downloaded Jira.xls file for me.
I worked around that by doing the following:
Change the ".xls" to ".html"
Open the new "Jira.html" file in Chrome
Highlight/Select the table contents of the exported Jira Issues
Copy and then paste into a new excel file
The Better Excel add-on is great (we use it) but it cannot do attachments (AFAIK). Another add-on, JExcel Pro, can.

Parse a pdf file

I have a PDF like this one:
81 11005589 THING MAXIME 4 PC2I TR1 - MERCREDI DE 07H45 A 09H45 4A7
71 11007079 STUFF QUENTIN 1 PC2I TR1 - LUNDI DE 10H00 A 12H00 1B4
74 10506940 HAHA YEZHOU 2 PC2I TR1 - LUNDI DE 13H30 A 15H30 2D5
http://i.stack.imgur.com/hbXg2.png
And I need to parse it. What I mean by that is: take the 4th column, add the 3rd column, and make an email address out of them. For example, with the first line: maxime.thing@something.com
I tried to copy and paste it into Google Docs, but everything lands in one cell instead of multiple cells.
I really don't know what to do here. I guess a regex would help me, but with what?
If you are using Java, use iText; if you are using C#, use iTextSharp. Both are free for non-commercial use.
I've used Aspose before for parsing PDFs/Word docs/Excel docs/and some other docs before. I'm not sure what their capabilities are when it comes to parsing tables in a PDF but it wouldn't surprise me if they had something.
I'd start by looking at them but be warned: they have an unapologetically piss poor method for updating their libraries. I have had to rewrite code because they flat out DROP functionality when they release new versions. Not deprecated, just GONE. That said their support is alright and the tool-set is quite powerful.
I know they have libraries for .NET and Java. Beyond that I can't say.
If in PHP, you can use
// Run pdftotext (probably installed if you're on Linux; if not, you can install it) to transform the PDF to text.
// The trailing "-" sends the text to stdout so that exec() can capture it.
exec('pdftotext ' . escapeshellarg($filepath) . ' -', $outputAsArray);
then
$text = implode("\n", $outputAsArray); // to have the output as text
then preg_replace is your friend.
You can't just use a regular expression to parse PDF. You need to extract the text. There are many libraries that can do this for different languages.
My company, Atalasoft, has a text extraction add-on for .NET -- http://www.atalasoft.com/products/dotimage/pdf-reader
For Java, take a look at PDFTextStream from Snowtide. http://www.snowtide.com.
You cannot be sure there is any structure in the PDF or that the text is visible. You really need to use an extraction tool. I wrote an article explaining what formatting is actually in a PDF file at http://www.jpedal.org/PDFblog/?p=228
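Once an extraction tool has produced plain text, the email step from the question can be sketched in Ruby. This assumes the extracted lines look exactly like the sample rows (row number, ID, LAST NAME, FIRST NAME, then a digit). The pattern, the method name emails_from_text, and the default domain are all hypothetical, and names containing spaces or hyphens would need a more forgiving pattern.

```ruby
# Assumed row layout: number, ID, last name, first name, then the rest.
# Both names are assumed to be single upper-case words.
LINE_PATTERN = /\A\d+\s+\d+\s+(?<last>\p{Lu}+)\s+(?<first>\p{Lu}+)\s+\d/

# Turn each matching line into "first.last@domain"; lines that do not
# match the assumed layout are skipped.
def emails_from_text(text, domain: 'something.com')
  text.each_line.filter_map do |line|
    match = LINE_PATTERN.match(line.strip)
    next unless match

    "#{match[:first].downcase}.#{match[:last].downcase}@#{domain}"
  end
end
```

For the first sample row this yields maxime.thing@something.com, which matches the address format asked for in the question.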

Resources