Modify a csv file with a Ruby script - ruby-on-rails

I have a .xlsx file converted to .csv. I need to write a script to modify this file (change/rename columns, etc.). How can I open this .csv file and save it from within the script?
Thanks!

Open the CSV file just as you would open any other file in Ruby, using the standard File API:
csv_file = File.open('data.csv', 'r')
Parse it manually or use a library like FasterCSV. Make your modifications, write them back to the file, and close it. There is nothing inherently special about a CSV file; work with it as you would with any other file in Ruby.

You should probably work with a CSV library (or, in the Ruby world, a gem). Install the gem,
and your code will look something like this:
FasterCSV.foreach("path/to/file.csv") do |row|
# use row here...
end
http://fastercsv.rubyforge.org/

As far as I know, you cannot modify the CSV file in place. You would have to write the output to another file.
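Putting the answers above together, here is a minimal sketch using Ruby's standard csv library (FasterCSV became the standard CSV library in Ruby 1.9). The header names in `RENAMES` and the `rename_columns` helper are made up for illustration, so adjust them to your data. It reads the input, renames the headers, and writes the result to a separate file rather than touching the original.

```ruby
require 'csv'

# Hypothetical mapping from old header names to new ones.
RENAMES = { 'Name' => 'full_name', 'E-mail' => 'email' }.freeze

# Reads the CSV at in_path, renames headers according to `renames`,
# and writes the result to out_path (a separate file, so the input
# is left untouched). Headers not listed in `renames` are kept as-is.
def rename_columns(in_path, out_path, renames)
  table = CSV.read(in_path, headers: true)
  new_headers = table.headers.map { |h| renames.fetch(h, h) }

  CSV.open(out_path, 'w') do |out|
    out << new_headers
    table.each { |row| out << row.fields }
  end
end

# Example call (file names are placeholders):
#   rename_columns('data.csv', 'data_renamed.csv', RENAMES)
```

If you really need the result under the original name, one option is to write to the new file first and rename it over the old one afterwards.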

Related

Issue with line breaks in CSV text files generated by rails

I generate a CSV text file in Rails like this:
CSV.generate(col_sep: ';') do |csv|
csv << ['1st line']
csv << ['2nd line']
end
When I open the text file, the two lines are there as expected. Unfortunately, the file is meant to be read by another program, and that program reports an error saying the second line is missing. I have a sample file that looks exactly like the one I generated and works fine, but my file can't be read properly. It also has the same encoding. Any suggestions where to look? Anything concerning line breaks?
I'm not sure this is a question that can be answered as asked. You said that a 3rd party program is having trouble reading a text file generated by Ruby, but provided no information on that error and how you think Ruby is related to this error.
Could you please update your original post with the plaintext version of your CSV file and what program you're trying to open it in?
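One thing that may be worth checking, although nothing in the question confirms it is the cause: Ruby's CSV writes "\n" as the row separator by default, while some consumers expect Windows-style "\r\n". A minimal sketch, assuming the reading program wants CRLF (the `generate_crlf_csv` helper name is made up here):

```ruby
require 'csv'

# Builds CSV text with an explicit CRLF row separator instead of the
# default "\n". Whether the consuming program needs this is an assumption.
def generate_crlf_csv(rows)
  CSV.generate(col_sep: ';', row_sep: "\r\n") do |csv|
    rows.each { |row| csv << row }
  end
end

data = generate_crlf_csv([['1st line'], ['2nd line']])
p data  # inspect output makes the line-ending bytes visible
```

Comparing the raw bytes of the working sample file with the generated one (for example with a hex dump) would show whether the line endings actually differ.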

How to create an XLSX file from LibXML::XML::Document in Ruby?

My code prepares a string containing XML data in UTF-8 encoding. I use LibXML to create it and finally I call Rails send_data which creates some.xls file from the prepared string. MS Excel perfectly opens the some.xls file, but it's the only application which can open an XML file in table format.
Does anybody know how to create an XLSX file from LibXML::XML::Document? I need to create the spreadsheet all at once, not cell by cell.
I checked some gems like XlsxWriter, etc. However, the only examples I found use methods that write into a cell, a row, or a column, but I need to create the file all at once.
Take a look at the axlsx gem; I think it will help you.
https://github.com/randym/axlsx

Write in an existing Excel .xls file which contains macros

I want to insert data into an existing Excel (.xls) file with Ruby under Linux. This file already contains data, has some formatting properties, and contains macros.
I tried to insert data into the file with the spreadsheet gem, but when I save the modifications, the formatting and all the macros of the file are lost.
Here's an example of a simple modification where I run into this problem:
book = Spreadsheet.open('myOriginalFile.xls')
sheet = book.worksheet 0
sheet.write('C12','hello')
book.write('myModifiedFile.xls')
I tried lots of things and searched forums and the web, but I didn't find a solution...
Does anyone have an idea?
I found a solution:
I use Apache's POI library, which is written in Java, through the rjb gem (Ruby Java Bridge, which lets Ruby use Java libraries). POI can modify an existing xls file while keeping its macros and formulas.
For those who need it, here's how to set up rjb to use POI:
require 'rjb'
# JVM loading
apache_poi_path = File.dirname(__FILE__)+'/poi-3.8/poi-3.8-20120326.jar'
Rjb::load("#{apache_poi_path}", ['-Xmx512M'])
# Java classes import
@file_class = Rjb::import('java.io.FileOutputStream')
@workbook_class = Rjb::import('org.apache.poi.hssf.usermodel.HSSFWorkbook')
@poifs_class = Rjb::import('org.apache.poi.poifs.filesystem.POIFSFileSystem')
Rjb::import('org.apache.poi.hssf.usermodel.HSSFCreationHelper')
Rjb::import('org.apache.poi.hssf.usermodel.HSSFFormulaEvaluator')
@cell_reference_class = Rjb::import('org.apache.poi.hssf.util.CellReference')
@cell_class = Rjb::import('org.apache.poi.hssf.usermodel.HSSFCell')
# You can import all java classes that you need
# Java classes utilisation :
# (model_file_path is assumed to hold the path of the existing .xls file)
@file_input_class = Rjb::import('java.io.FileInputStream')
@file_input = @file_input_class.new(model_file_path)
@fs = @poifs_class.new(@file_input)
@book = @workbook_class.new(@fs)
@worksheet = @book.getSheet('worksheet')
# ...
# You can use your objects like in Java but with a ruby syntax
You need to write the modified file to a new file name. Check out this:
If you have more than one sheet, you need to rewrite the other sheets as well. If you open an XLS with several sheets but only modify one of the sheets (and don't touch the other data), there is no way that spreadsheet "remembers" what is in the other sheets. You will have to write the unmodified sheets as well, or otherwise unexpected things will happen.
Ergo: when modifying an XLS with several sheets using spreadsheet, write the modified sheet and also write the complete unmodified sheets again.
You might want to check out Axlsx. I'm not sure whether it can edit a plain .xls, but I did some work with it a few weeks ago and it worked wonders for the xlsx files I was working with.
You just need to open the existing file, write your changes, and save it under another name.
For example, suppose you have a template.xls file on the server.
Simple working example (template.xls needs to be next to your .rb file):
#edit_xls.rb
require 'rubygems'
require 'spreadsheet'
book = Spreadsheet.open 'template.xls'
sheet = book.worksheet 0
sheet[0,0] = 'qweqeqw'
book.write 'edited.xls'

Parsing XLS Spreadsheet in Rails using Roo Gem

I am trying to parse an XLS file with the roo gem without using a file upload plugin. Unfortunately, I cannot access the data of the file.
I get the error:
#<File:0x007ffac2282250> is not an Excel file
So roo is not recognizing the file as an Excel file. Do I need to save the file locally to use roo, or is there a way around that? I would like to parse the data of the Excel file directly into the database.
The params that are coming through:
Parameters: {"utf8"=>"✓", "authenticity_token"=>"yLqOpSK981tDNYjKSoWBh0VnFEKSk0XA/wOt3r+yWJc=", "uploadform"=>{"name"=>"xls", "file"=>#<ActionDispatch::Http::UploadedFile:0x007ffac22b6550 @original_filename="cities2.xls", @content_type="application/octet-stream", @headers="Content-Disposition: form-data; name=\"uploadform[file]\"; filename=\"cities2.xls\"\r\nContent-Type: application/octet-stream\r\n", @tempfile=#<File:/var/folders/qn/70msrkt90pd390sdr14_0g2m0000gn/T/RackMultipart20120306-3729-1m2xcsp>>}, "commit"=>"Save Uploadform"}
I am trying to access the file with
if params[:uploadform][:file].original_filename =~ /.*\.xls$/i
oo = Excel.new(params[:uploadform][:file].open)
rooparse(oo)
end
I also tried params[:uploadform][:file].read and params[:uploadform][:file] already but I think the .open would be the correct method here!?
And would you recommend using paperclip or carrierwave here?
Thank you for your help!
Yes, I can not parse the full file yet but that's another problem. At least I am getting the first row from the table into my database with the following lines:
require 'fileutils'
require 'iconv'
tmp = params[:uploadform][:file].tempfile
file = File.join("public", params[:uploadform][:file].original_filename)
FileUtils.cp tmp.path, file
oo = Excel.new(file)
rooparse(oo)
FileUtils.rm file
Thanks for your input!
Looking at the source for Excel.new, it seems that it wants a file name, not a File object or handle. In other words, it needs a string with the full path, including the filename, of the file you want to parse. It also checks the extension of the file, so if the tempfile's name doesn't end with ".xls" you'll need to rename or copy the file first.
This is the path:
params[:file].tempfile.path.
You can try this:
Excel.new(params[:uploadform][:file].tempfile.path)
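Since the uploaded tempfile's path in the question has no ".xls" extension, one way to satisfy the extension check without writing into public/ is to copy the upload into a throwaway directory under its original name. This is only a sketch using the standard library; the `with_original_name` helper is a made-up name, and the roo call in the trailing comment simply reuses `Excel.new` and `rooparse` from the question.

```ruby
require 'fileutils'
require 'tmpdir'

# Copies the file at tmp_path into a fresh temporary directory, named
# after original_filename (basename only, to ignore any directory parts),
# yields the new path, and removes the directory when the block returns.
def with_original_name(tmp_path, original_filename)
  Dir.mktmpdir do |dir|
    target = File.join(dir, File.basename(original_filename))
    FileUtils.cp(tmp_path, target)
    yield target
  end
end

# Possible use in the controller (names taken from the question):
#   upload = params[:uploadform][:file]
#   with_original_name(upload.tempfile.path, upload.original_filename) do |path|
#     rooparse(Excel.new(path))
#   end
```

The copy disappears as soon as the block finishes, so any parsing has to happen inside the block.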

Ruby/Rails CSV parsing, invalid byte sequence in UTF-8

I am trying to parse a CSV file generated from an Excel spreadsheet.
Here is my code
require 'csv'
file = File.open("input_file")
csv = CSV.parse(file)
But I get this error
ArgumentError: invalid byte sequence in UTF-8
I think the error occurs because Excel encodes the file in ISO 8859-1 (Latin-1) and not in UTF-8.
Can someone help me with a workaround for this issue, please?
Thanks in advance.
You need to tell Ruby that the file is in ISO-8859-1. Change your file open line to this:
file=File.open("input_file", "r:ISO-8859-1")
The second argument tells Ruby to open the file read-only with the encoding ISO-8859-1.
Specify the encoding with the encoding option:
CSV.foreach(file.path, headers: true, encoding:'iso-8859-1:utf-8') do |row|
...
end
You can supply the source encoding straight in the file mode parameter:
CSV.foreach( "file.csv", "r:windows-1250" ) do |row|
<your code>
end
If you have only one file (or a few), so there is no need to detect the encoding automatically for arbitrary input, and the file's contents are readable as plain text (txt, csv, etc.) with a separator such as a semicolon, you can manually create a new file with a .csv extension, paste the contents into it, and then parse it as usual.
Keep in mind that this is a workaround, but when you only need to parse one big Excel file on Linux, converted to some flavour of CSV, it saves the time you would otherwise spend experimenting with encodings.
Save the file as UTF-8, unless for some reason you need to save it differently, in which case you can specify the encoding when reading the file:
add "r:ISO-8859-1" as the second argument, as in File.open("input_file", "r:ISO-8859-1").
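To see what the encoding answers above amount to, here is a small self-contained sketch. It assumes, as the question does, that the input really is ISO-8859-1; the sample bytes and the semicolon separator are made up for the demonstration. The bytes are labelled with their real encoding and transcoded to UTF-8 before CSV parses them.

```ruby
require 'csv'

# Sample bytes as a Latin-1 export might contain them: the accented
# letters are the single bytes 0xE9 and 0xE8, which are invalid in UTF-8.
latin1_bytes = "name;city\nJos\xE9;Li\xE8ge\n".b

# Label the bytes as ISO-8859-1, then transcode them to UTF-8.
utf8_text = latin1_bytes.force_encoding('ISO-8859-1').encode('UTF-8')

# Parsing the transcoded text no longer hits invalid UTF-8 bytes.
rows = CSV.parse(utf8_text, col_sep: ';', headers: true)
rows.each { |row| puts "#{row['name']} lives in #{row['city']}" }
```

Opening the file with "r:ISO-8859-1:UTF-8" does the same labelling and transcoding while reading. If the file is actually in a different encoding (Windows-1252 is common for Excel exports), the characters will come out wrong rather than raise an error, so it is worth checking a few accented values by eye.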
I had this same problem and was just using google spreadsheets and then downloading as a CSV. That was the easiest solution.
Then I came across this gem
https://github.com/singlebrook/utf8-cleaner
Now I don't need to worry about this issue at all. Hope this helps!

Resources