Write in an existing Excel .xls file which contains macros - ruby-on-rails

I want to insert data into an existing Excel (.xls) file with Ruby under Linux. The file already contains data, has some formatting, and contains macros.
I tried to insert data into the file with the spreadsheet gem, but when I save the modifications, the formatting and all the macros of the file are lost.
Here's an example of a simple modification where I run into this problem:
book = Spreadsheet.open('myOriginalFile.xls')
sheet = book.worksheet 0
sheet.write('C12','hello')
book.write('myModifiedFile.xls')
I tried lots of things and searched forums and the web, but I didn't find a solution...
Does anyone have an idea?

I found a solution:
I use Apache's POI library, which is written in Java, through the rjb gem (Ruby Java Bridge, which lets you use Java libraries from Ruby). POI can modify an existing .xls file while keeping its macros and formulas.
For those who need it, here's how to set up rjb to use POI:
require 'rjb'
# JVM loading
apache_poi_path = File.dirname(__FILE__)+'/poi-3.8/poi-3.8-20120326.jar'
Rjb::load("#{apache_poi_path}", ['-Xmx512M'])
# Java classes import
@file_class = Rjb::import('java.io.FileOutputStream')
@workbook_class = Rjb::import('org.apache.poi.hssf.usermodel.HSSFWorkbook')
@poifs_class = Rjb::import('org.apache.poi.poifs.filesystem.POIFSFileSystem')
Rjb::import('org.apache.poi.hssf.usermodel.HSSFCreationHelper')
Rjb::import('org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator')
@cell_reference_class = Rjb::import('org.apache.poi.hssf.util.CellReference')
@cell_class = Rjb::import('org.apache.poi.hssf.usermodel.HSSFCell')
# You can import all the Java classes that you need
# Using the Java classes:
@file_input_class = Rjb::import('java.io.FileInputStream')
@file_input = @file_input_class.new(model_file_path)
@fs = @poifs_class.new(@file_input)
@book = @workbook_class.new(@fs)
@worksheet = @book.getSheet('worksheet')
# ...
# You can use your objects like in Java, but with Ruby syntax

You need to write the modified file to a new file name. Also, if you have more than one sheet, you need to rewrite the other sheets. Check out this:
[...] XLS with several sheets but only modify one of the sheets (and don't touch the other data), there is no way, that spreadsheet "remembers" what is in the other sheets. You will have to write the unmodified sheets as well or otherwise unexpected things will happen.
Ergo: Write the modified sheet and also write the complete unmodified sheets again, when modifying an XLS with spreadsheet with several sheets.

You might want to check out Axlsx. I'm not sure whether it can edit a plain .xls, but I did some work with it a few weeks ago and it worked wonders for the .xlsx files I was working with.

You just need to open the existing file, write your changes into it, and save it under another name.
For example, say you have a template.xls file on the server.
Simple working example (template.xls needs to be next to your .rb file):
#edit_xls.rb
require 'rubygems'
require 'spreadsheet'
book = Spreadsheet.open 'template.xls'
sheet = book.worksheet 0
sheet[0,0] = 'qweqeqw'
book.write 'edited.xls'
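Note that the example above addresses cells by zero-based [row, column] indices, while the question uses an A1-style reference ('C12'). A minimal sketch of a conversion between the two, assuming plain references without sheet names or `$` markers; the helper name `a1_to_indices` is hypothetical and not part of the spreadsheet gem:

```ruby
# Hypothetical helper (not part of the spreadsheet gem): converts an
# A1-style reference such as "C12" into zero-based [row, column] indices.
def a1_to_indices(ref)
  match = /\A([A-Za-z]+)(\d+)\z/.match(ref)
  raise ArgumentError, "invalid cell reference: #{ref.inspect}" unless match

  # Column letters form a base-26 number without a zero digit:
  # A = 1 ... Z = 26, AA = 27, and so on.
  column = match[1].upcase.each_char.reduce(0) do |acc, ch|
    acc * 26 + (ch.ord - 'A'.ord + 1)
  end
  row = match[2].to_i
  raise ArgumentError, "row numbers start at 1: #{ref.inspect}" if row.zero?

  [row - 1, column - 1]
end

row, col = a1_to_indices('C12')
# With the spreadsheet gem this could then be used as:
#   sheet[row, col] = 'hello'
```

Here 'C12' maps to row index 11 and column index 2, which is the form `sheet[row, col] = value` expects.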

Related

Fastest way to read the first row of big XLSX file in Ruby

I need to be able to read the first (header) row of a big xlsx file (350k x 12 cells, ~30MB) very fast in a Ruby on Rails app.
I am using the Roo gem at the moment, which is fine for smaller files. But for files this big it takes 3-4 minutes. Is there a way to do this in seconds?
xlsx = Roo::Spreadsheet.open(file_path)
sheet = xlsx.sheet(0)
header = sheet.row(1)
Edit:
I tried other gems:
rubyXL took several minutes
creek was the fastest at 30s, but that is still unusable in a controller
Edit2:
I ended up using creek in a job and polling for the result in the controller. Thanks to Tom Lord for suggesting creek.
The Ruby gem roo does not support file streaming; it reads the whole file into memory. As you say, that works fine for smaller files, but not so well for reading small sections of huge files.
You need to use a different library or approach. For example, you can use the creek gem, which describes itself as:
a Ruby gem that provides a fast, simple and efficient method of parsing large Excel (xlsx and xlsm) files.
And, taking the example from the project's README, it's pretty straightforward to translate the code you wrote for roo into code that uses creek:
require 'creek'
creek = Creek::Book.new(file_path)
sheet = creek.sheets[0]
header = sheet.rows.first # rows is a lazy Enumerator, so take the first row rather than indexing
Note: A quick google of your StackOverflow question title led me to this blog post as the top search result. It's always worth searching on Google first.
Using #gets could work, maybe something like:
first_line_data = File.open(file_path, "rb", &:gets)
first_line_file = File.open("tmp_file.xlsx", "wb") { |f| f << first_line_data }
xlsx = Roo::Spreadsheet.open("tmp_file.xlsx")
# etc...

How to ignore hidden sheets when importing xlsx files with rails and roo

I am using Roo to import xlsx files into my rails app. The imports work fine, however, while trying to make a 'workbook' importer instead of just a 'worksheet' importer, I noticed that there are tons of hidden sheets on some of the files. For example:
In some of the files the SUB_LABOR sheet has important data that should be imported. These are not hidden. In other files, the SUB_LABOR was used as a scratch pad and then hidden so that people using the sheet will not use it.
I would like my importer to read in the workbook and parse the sheets that are not hidden and ignore the ones that are. I see that the 'hidden' value is stored in the excelx object under <Nokogiri::XML::Attr:[a hex value] name="state" value="hidden">
Is there a way to dig this information out of the object and act on it?
The whole object is way too big to post here.
You can pass in
only_visible_sheets: true
to the initializer, e.g.:
Roo::Excelx.new("my.xlsx", only_visible_sheets: true)
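For context, the state attribute mentioned in the question lives on the <sheet> elements of xl/workbook.xml inside the .xlsx archive. Below is a minimal sketch, using only REXML and a made-up sample document, of how visible sheets could be told apart from hidden ones. The helper name `visible_sheet_names` and the sample sheet names are hypothetical; none of this is part of Roo, and extracting workbook.xml from the archive is left out.

```ruby
require 'rexml/document'

# Sample data: a trimmed-down xl/workbook.xml with one hidden sheet.
# The sheet names are made up for illustration.
WORKBOOK_XML = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
    <sheets>
      <sheet name="SUMMARY" sheetId="1"/>
      <sheet name="SUB_LABOR" sheetId="2" state="hidden"/>
      <sheet name="MATERIALS" sheetId="3" state="visible"/>
    </sheets>
  </workbook>
XML

# Hypothetical helper: returns the names of sheets that are not hidden.
# A missing state attribute means the sheet is visible.
def visible_sheet_names(workbook_xml)
  names = []
  doc = REXML::Document.new(workbook_xml)
  doc.root.each_recursive do |element|
    next unless element.name == 'sheet'

    state = element.attributes['state']
    names << element.attributes['name'] if state.nil? || state == 'visible'
  end
  names
end

puts visible_sheet_names(WORKBOOK_XML).inspect
```

In practice the only_visible_sheets option is the simpler route; the sketch only shows where the hidden flag comes from.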

How to create an XLSX file from LibXML::XML::Document in Ruby?

My code prepares a string containing XML data in UTF-8 encoding. I use LibXML to create it and finally I call Rails send_data which creates some.xls file from the prepared string. MS Excel perfectly opens the some.xls file, but it's the only application which can open an XML file in table format.
Does anybody know how to create an XLSX file from a LibXML::XML::Document? I need to create the spreadsheet all at once, not cell by cell.
I checked some gems like XlsxWriter, etc. However, the only examples I found use methods that write into a cell, a row, or a column, but I need to create the file all at once.
Take a look at the axlsx gem; I think it will help you.
https://github.com/randym/axlsx

Modify a csv file with a Ruby script

I have a .xlsx file converted to .csv. I need to write a script to modify this file (change/rename columns, etc.). How can I open this .csv file and save it from within the script?
Thanks!
Open the CSV file just like you would open any other file in Ruby, using the standard File API:
csv_file = File.open('data.csv', 'r')
Parse it manually or use a library like FasterCSV (which became the standard library's CSV in Ruby 1.9). Make your modifications, write back to the file, and close it. There is nothing inherently special about a CSV file; work with it like you would with any file in Ruby.
You should probably work with a CSV library (or, in the Ruby world, a gem). So install the gem,
and your code will look something like this:
FasterCSV.foreach("path/to/file.csv") do |row|
# use row here...
end
http://fastercsv.rubyforge.org/
As far as I know, you cannot make inline modifications to the CSV file. You would have to output via another file.
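As a rough illustration of the "output via another file" approach, here is a minimal sketch that uses Ruby's standard CSV library rather than the FasterCSV gem. The helper name `rename_csv_columns` and the file and column names in the usage note are hypothetical, and the sketch assumes the file is small enough to read into memory:

```ruby
require 'csv'

# Hypothetical helper: copies a CSV file to a new path while renaming columns.
# `renames` maps old header names to new ones; other headers are kept as-is.
def rename_csv_columns(input_path, output_path, renames)
  table = CSV.read(input_path, headers: true)
  new_headers = table.headers.map { |header| renames.fetch(header, header) }

  CSV.open(output_path, 'w') do |csv|
    csv << new_headers                     # write the renamed header row
    table.each { |row| csv << row.fields } # copy the data rows unchanged
  end
end
```

It could be called as `rename_csv_columns('data.csv', 'data_renamed.csv', 'name' => 'full_name')`. Writing to a separate output file keeps the original intact if the script fails halfway.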

How to edit or write on existing PDF with Ruby?

I have a couple of PDF template files with complex content and several blank regions/areas in them. I need to be able to write text in those blank regions and save the resulting PDFs in a folder.
I googled for answers on this question quite intensively, but I didn't find definite answers. One of the better solutions is PDF::Toolkit, but it would require the purchase of Adobe Acrobat to add replaceable attributes to existing PDF documents.
The PHP world is blessed with FPDI that can be used to simply open a PDF file and write/draw on it over the existing content. There is a Ruby port of this library, but the last commit for it happened at the beginning of 2009. Also that project doesn't look like it is widely used and supported.
The question is: What is the better Ruby way of editing, writing or drawing on existing PDFs?
This question also doesn't seem to be answered on here. These questions are related, but not really the same:
Prawn gem: How to create the .pdf from an *existing* file (.xls)
watermark existing pdf with ruby
Ruby library for manipulating existing PDF
How to replace a word in an existing PDF using Ruby Prawn?
You should definitely check out the Prawn gem, with which you can generate custom PDF files. You can use Prawn to write text into existing PDFs by treating the existing PDF as a template for your new Prawn document.
For example:
filename = "#{Prawn::DATADIR}/pdfs/multipage_template.pdf"
Prawn::Document.generate("full_template.pdf", :template => filename) do
  text "This content is written on the first page of the template", :align => :center
end
This will write text onto the first page of the old pdf.
See more here:
http://prawn.majesticseacreature.com/manual.pdf
Since Prawn has removed the template feature (it was full of bugs) the easiest way I've found is the following:
Use Prawn to generate a PDF with ONLY the dynamic parts you want to add.
Use PDF::Toolkit (which wraps PDFtk) to combine the Prawn PDF with the original.
Rough Example:
require 'prawn'
require 'pdf/toolkit'
template_filename = 'some/dir/Awesome-Graphics.pdf'
prawn_filename = 'temp.pdf'
output_filename = 'output.pdf'
Prawn::Document.generate(prawn_filename) do
# Generate whatever you want here.
text_box "This is some new text!", :at => [100, 300]
end
PDF::Toolkit.pdftk(prawn_filename, "background", template_filename, "output", output_filename)
I recommend prawn for generating PDFs and then using combine_pdf to combine two generated PDFs together into one. I use it like this and it works just fine.
Short example (from the README) of how to combine two PDFs:
company_logo = CombinePDF.load("company_logo.pdf").pages[0]
pdf = CombinePDF.load "content_file.pdf"
pdf.pages.each { |page| page << company_logo } # notice the << operator is on a page and not a PDF object.
pdf.save "content_with_logo.pdf"
You don't need a combination of gems; you can use just one gem!
Working with PDFs is really challenging in Ruby/Rails (as I have found out!).
This is how I was able to add text dynamically to a PDF in Rails.
Add this gem to your Gemfile: gem 'combine_pdf'
Then you can use code like this:
# get the record from the database to add dynamically to the pdf
user = User.last
# get the existing pdf
pdf = CombinePDF.load "#{Rails.root}/public/pdf/existing_pdf.pdf"
# create a textbox and add it to the existing pdf on page 2
pdf.pages[1].textbox "#{user.first_name} #{user.last_name}", height: 20, width: 70, y: 596, x: 72
# output the new pdf which now contains your dynamic data
pdf.save "#{Rails.root}/public/pdf/output#{Time.now.to_s}.pdf"
You can find details of the textbox method here: https://www.rubydoc.info/gems/combine_pdf/0.2.5/CombinePDF/Page_Methods#textbox-instance_method
I spent days on this, working through a number of different gems: prawn, wicked_pdf, pdfkit, fillable_pdf.
But this was by far the smoothest solution for me as of 2019.
I hope this saves someone a lot of time so they don't have to go through all the trial and error I had to with PDF's!!
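One detail in the snippet above: interpolating Time.now.to_s into the file name produces spaces and colons, which are awkward in paths and URLs. A minimal sketch of a tidier name, assuming a second-resolution timestamp is unique enough; the helper name `timestamped_pdf_name` is hypothetical:

```ruby
# Hypothetical helper: builds a name such as "output-20190501-120000.pdf"
# from a prefix and a time (defaults to the current time).
def timestamped_pdf_name(prefix, time = Time.now)
  "#{prefix}-#{time.strftime('%Y%m%d-%H%M%S')}.pdf"
end

puts timestamped_pdf_name('output')
```

The result could then be passed to pdf.save in place of the interpolated string.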
The best I can think of is Rails-latex. It doesn't allow you to edit existing PDF files, but it lets you set up *.tex.erb templates that you can modify dynamically and compile into PDF (along with DVI and a number of other formats).
PDFLib seems to do what you want and has Ruby bindings.
According to my research, Prawn is one of the best free gems I found. However, the template functionality doesn't work in later versions. The latest version I could find that works with templates is 1.0.0.rc2 (March 1, 2013); I couldn't find any later version that does. So be mindful if you are using a later version than this. Check the thread below for more info.
https://groups.google.com/forum/#!searchin/prawn-ruby/prawn$20templates/prawn-ruby/RYGPImNcR0I/7mxtnrEDHeQJ
PDFtk is another capable tool for PDF manipulation and for working with templates. But note the following points:
This library is free for personal use, but requires a license if used in production
This is a non-Ruby command-line tool
For more information, please refer to the link below:
http://adamalbrecht.com/2014/01/31/pre-filling-pdf-form-templates-in-ruby-on-rails-with-pdftk/
You can use the Origami gem to add a password to an existing PDF or to edit it.
require 'origami'
require 'tempfile'
pdf = WickedPdf.new.pdf_from_url(pdf_params[:url])
temp_file = Tempfile.new('temp', encoding: 'ascii-8bit')
temp_file.write(pdf)
temp_file.flush # make sure the bytes are on disk before Origami reads the file
# Creates an encrypted document with AES256 and passwords.
pdf = Origami::PDF.read(temp_file.path).encrypt(cipher: 'aes', key_size: 256, user_passwd: pdf_params[:user_password], owner_passwd: pdf_params[:owner_password])
save_path = "#{File.basename(__FILE__, ".rb")}.pdf"
pdf.save(save_path)
temp_file.close
