Can we use column headers to specify the column from which we are parsing the Excel sheet with the Roo gem? My current code looks like this:
oo = Openoffice.new("simple_spreadsheet.ods")
oo.default_sheet = oo.sheets.first
(2..oo.last_row).each do |line|
  date       = oo.cell(line, 'A')
  start_time = oo.cell(line, 'B')
  end_time   = oo.cell(line, 'C')
  pause      = oo.cell(line, 'D')
  ...
end
I would like to parse by column header instead of specifying columns as 'A', 'B', 'C', ... Can I achieve this with Roo?
You can grab the entire header row as an array and hash each row keyed on the header row.
oo = Openoffice.new("simple_spreadsheet.ods")
oo.default_sheet = oo.sheets.first
header = oo.row(1)
2.upto(oo.last_row) do |line|
  row_data = Hash[header.zip(oo.row(line))]
  ...
end
You could also use row_data[line] to nest the hashes for later use.
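For example, a minimal sketch of that nesting, building on the code above (the 'date' header name is hypothetical):

all_rows = {}
2.upto(oo.last_row) do |line|
  # key each header => cell hash by its line number
  all_rows[line] = Hash[header.zip(oo.row(line))]
end
all_rows[2]['date']  # look up row 2 by an assumed header name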
A cleaner/clearer version of the above is
oo = Openoffice.new("simple_spreadsheet.ods")
oo.default_sheet = oo.sheets.first
header = oo.row(1)
2.upto(oo.last_row) do |line|
  row_data = Hash[*header.zip(oo.row(line)).flatten]
  ...
end
The original took me a bit to understand, especially because I thought Hash was a local variable named hash instead of the Hash class.
This will use the header row as the keys. The helpful parts are transpose and strip.
def self.excel_to_hash(folder_name, file_name, tab_name)
  # Takes a folder name, an Excel file name and a tab name, and returns an array of stripped, transposed rows
  # Sample call: my_data = excel_to_hash File.join(Rails.root, 'db/data'), 'data_to_import.xlsx', 'models'
  rows = []
  file = File.open(File.join(folder_name, file_name), 'r')
  excel = Excelx.new(file.path, nil, :ignore)
  excel.default_sheet = excel.sheets.index(tab_name) + 1
  header = excel.row(1)
  (2..excel.last_row).each do |i|
    next unless excel.row(i)[0]
    row = Hash[[header, excel.row(i)].transpose]
    row.each_key { |x| row[x] = row[x].to_s.strip if row[x] }
    rows << row
  end
  rows
end
valid through Roo gem 1.10.2
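A hypothetical call, assuming the method lives in an importer class (the Importer name and the 'Name' header are illustrative):

rows = Importer.excel_to_hash File.join(Rails.root, 'db/data'), 'data_to_import.xlsx', 'models'
rows.each do |row|
  puts row['Name']  # each row is a header => value hash
end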
This works for me
require 'roo'
# open the Excel file
excel_file = Roo::Spreadsheet.open(file_path)
# iterate over each sheet
excel_file.each_with_pagename do |name, sheet|
  # iterate over each row
  sheet.parse(headers: true, pad_cells: true) do |row|
    # data can be accessed by column header; if we have a column header Name, we can access it like this
    row['Name']
  end
end
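If you prefer to collect the rows rather than process them in the block, parse also returns an array of hashes. A sketch, assuming a single sheet; note that in some Roo versions parse(headers: true) returns the header row itself as the first element:

rows = excel_file.sheet(0).parse(headers: true)
rows.shift  # drop the header-row hash if your Roo version includes it
rows.each { |row| puts row['Name'] }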
I have a CSV file with two columns:
PPS_Id  Amount
123     100
1234    150
I read data from this file and insert it into an array using the code below:
CSV.foreach("filename.CSV", headers: true) do |row|
file_details << row.inspect # hash
end
I am then trying to push the data in file_details into a hash with PPS_Id as the key and Amount as the value. I am using the code below:
file_details_hash = Hash.new
file_details.each { |x|
  file_details_hash[x['PPS_Id']] = x['Amount']
}
But when I print the result I get nothing but {"PPS_Id"=>"Amount"}. Can you please help?
Your code, modified to work
You need to specify the column separator for your CSV, and remove inspect.
require 'csv'
file_details = []
CSV.foreach("filename.CSV", headers: true, col_sep: "\s") do |row|
  file_details << row
end
file_details_hash = Hash.new
file_details.each { |x|
  file_details_hash[x['PPS_Id']] = x['Amount']
}
p file_details_hash
#=> {"123"=>"100", "1234"=>"150"}
It now returns what you expected to get.
Shorter solution
Read the CSV, drop the first line (the header) and convert to a Hash:
p CSV.read("filename.CSV", col_sep: "\s").drop(1).to_h
#=> {"123"=>"100", "1234"=>"150"}
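If you'd rather keep using the headers, CSV can do the lookup for you. A sketch using the same file:

rows = CSV.read("filename.CSV", headers: true, col_sep: "\s")
p rows.map { |row| [row['PPS_Id'], row['Amount']] }.to_h
#=> {"123"=>"100", "1234"=>"150"}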
First of all, you are collecting strings into an array (see String#inspect):
file_details << row.inspect
After that you call (sic!) String#[] on those strings:
x['PPS_Id'] #⇒ "PPS_Id", because the string contains this substring
That said, your code has nothing but errors. You might achieve what you want with:
csv = CSV.parse(File.read("filename.CSV"), col_sep: "\s")
csv[1..-1].to_h
#⇒ {
# "123" => "100",
# "1234" => "150"
# }
Using inspect will save your CSV rows as strings, so obviously you won't be able to get what you need. Instead try this:
file_details = CSV.read("filename.csv")
Reading the CSV directly will create a 2D array that you can then iterate over, which will look like this: [["PPS_Id", "Amount"], ["123", "100"], ["1234", "150"]]
From there you can slightly modify your approach:
file_details.each do |key, value|
  file_details_hash[key] = value
end
To receive a hash like this: {"PPS_Id"=>"Amount", "123"=>"100", "1234"=>"150"}
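If you don't want the header pair in the result, drop the first row before building the hash:

file_details_hash = {}
file_details.drop(1).each do |key, value|
  file_details_hash[key] = value
end
#=> {"123"=>"100", "1234"=>"150"}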
I am writing my hash into a CSV file with headers using the piece of code below. However, what I notice is that the first data row is not in the file. I can see the headers and all other rows accurately.
def self.execute_auto_pg_debtors pg_debtors_dt_list
  partition_key = Date.today.prev_day.prev_day.strftime "%Y%m%d"
  csvfilename = "PG_Debtors_" + partition_key + ".CSV"
  pg_debtors_dt_batch = Array.new
  rowid = 0
  pg_debtors_dt_list.each { |x|
    pg_debtors_details = Hash.new
    pg_debtors_details["Store_Order_Id"] = x['Store_Order_Id']
    pg_debtors_details["Transaction_Id"] = x['Transaction_Id']
    pg_debtors_details["Gateway_Payment_Id"] = x['Gateway_Payment_Id']
    pg_debtors_details["PPS_Id"] = x['PPS_Id']
    pg_debtors_details["Event_Type"] = x['Event_Type']
    pg_debtors_details["Event_Date"] = x['Event_Date']
    pg_debtors_details["Gateway_Name"] = x['Gateway_Name']
    pg_debtors_details["Open_Amount"] = "%f" % x['Open_Amount']
    pg_debtors_details["Invoice_No"] = x['Invoice_No']
    pg_debtors_dt_batch << pg_debtors_details
    rowid += 1
    if rowid == 1
      CSV.open(csvfilename, "w") do |csv|
        csv << pg_debtors_details.keys # adding header row (column labels)
      end
    else
      CSV.open(csvfilename, "a") do |csv|
        csv << pg_debtors_details.values # data rows
      end
    end # of if/else
  } # of each
  return pg_debtors_dt_batch
end
Please help.
You are writing the headers instead of the first row!
I recommend that you open the file once and iterate through your hashes inside the CSV.open do ... end block, and do not use an else after your if rowid == 1. Write the values for every row, so you do not skip data row 1.
Even though you check rowid, the .each loop does not revisit that item: for rowid == 1 it writes only the headers, and in the next iteration x already points to the second item in pg_debtors_dt_list, so the first item's values are never written.
To solve it, write your code in the following order:
Open the file, and write the headers first.
Loop through pg_debtors_dt_list, and write subsequent data to the file.
Hope it helps.
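A minimal sketch of that order, reusing the names from the question and assuming pg_debtors_dt_batch has already been filled as in your loop:

CSV.open(csvfilename, "w") do |csv|
  csv << pg_debtors_dt_batch.first.keys    # header row, written once (batch assumed non-empty)
  pg_debtors_dt_batch.each do |pg_debtors_details|
    csv << pg_debtors_details.values       # one data row per hash, none skipped
  end
end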
We just found out that CSV export cut off text in one field. Here is the original text for CSV export (from a text field of Postgres 9.3):
@payable_approved_unpaid, @payable_paid, @payable_po_unpaid = {}, {}, {}
models.each do |m|
  @payable_po_unpaid[m.id.to_s] = PurchaseOrderx::Order.where(project_id: m.id).sum('po_total') - PaymentRequestx::PaymentRequest.where(project_id: m.id).where(resource_string: 'purchase_orderx/orders').sum('amount')
  @payable_paid[m.id.to_s] = PaymentRequestx::PaymentRequest.where(project_id: m.id).where(wf_state: :paid).sum('amount')
  @payable_approved_unpaid[m.id.to_s] = PaymentRequestx::PaymentRequest.where(project_id: m.id).where('approved = ? AND wf_state != ?', true, :paid).sum('amount')
end
Here is what we get from CSV:
@payable_approved_unpaid, @payable_paid, @payable_po_unpaid = {}, {}, {}
models.each do |m|
  @payable_po_unpaid[m.id.to_s] = PurchaseOrderx::Order.where(project_id: m.id).sum('po_total') - PaymentRequestx::PaymentRequest.where(project_id: m.id).wher
In the same file, there are fields which are much longer than this and there is no problem. We have been using the export for a long time and this is the first time we are losing text. What could cause the cut-off of text in CSV export?
Here is the method for CSV export; argument_value is the field with the cut-off text. In debugging, we verified that the full content of the column had been assigned to the CSV before exporting:
def self.to_csv
  CSV.generate do |csv|
    header = ['id', 'engine_name', 'engine_version', 'argument_name', 'argument_value', 'last_updated_by_id', 'created_at', 'updated_at', 'brief_note', 'global']
    csv << header
    i = 1
    all.each do |config|
      base = OnboardDatax.engine_config_class.find_by_id(config.engine_config_id)
      row = Array.new
      row << i
      row << (base.global ? nil : base.engine.name)
      row << base.engine_version
      row << base.argument_name
      row << (config.custom_argument_value.present? ? config.custom_argument_value : base.argument_value)
      row << config.last_updated_by_id
      row << config.created_at
      row << config.updated_at
      row << base.brief_note
      row << (base.global ? 't' : 'f')
      csv << row
      i += 1
    end
  end
end
If we wrap the whole text in quotation marks, the column can be exported to CSV in its entirety.
What could cause the cut-off of text in CSV export?
puts File.read('your_cutoff_text.txt').size
--output:--
256
Rails column types:
String:
Limited to 255 characters (depending on DBMS)
My OS automatically adds a newline to the end of a file, so your cutoff text contains exactly 255 characters.
In the same file, there are fields which are much longer than this and there is no problem.
Rails column types:
Text:
Unlimited length (depending on DBMS)
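If that is the cause, a migration along these lines would lift the 255-character limit; the migration and table names here are assumptions based on the code above:

class ChangeArgumentValueToText < ActiveRecord::Migration
  def up
    change_column :engine_configs, :argument_value, :text    # table name is an assumption
  end

  def down
    change_column :engine_configs, :argument_value, :string
  end
end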
I'm using the Ruby gem axlsx and want to know if there's a way to set merged columns inside a style. Today I'm doing it like this:
sheet.merge_cells "A1:E1"
sheet.add_row [I18n.t('foo.some_label').upcase], style: [title]
sheet.merge_cells "B2:E2"
...
I want to avoid incrementing the cells manually (B2:E2 ... B5:E5); is there a way to do this?
Yes, give this a try. (*Disclaimer: I did not actually test these methods, but I have used similar functionality in the past.)
def merge_last_row(sheet, options = {})
  last_row = sheet.rows.last.index + 1
  first_col, last_col = options[:columns]
  if first_col && last_col
    sheet.merge_cells "#{first_col}#{last_row}:#{last_col}#{last_row}"
  else
    sheet.merge_cells sheet.rows.last
  end
  sheet.rows.last.style = options[:style] if options[:style]
end
So, to do what you want, it would be:
merge_last_row sheet, columns: ["A", "E"]
sheet.add_row [I18n.t('foo.some_label').upcase]
merge_last_row sheet, columns: ["B", "E"], style: title
If the last row contains data in A-E, the columns option can be left out and it will merge the whole row. If it does not, you could add an option for filling the columns, like so:
def fill_columns(sheet, column_count, options = {})
  row = options[:row_data] || []
  (column_count - row.count).times do
    row << nil
  end
  sheet.add_row row
end
Called as:
my_row = ["Hello", "World"]
fill_columns sheet, 5, row_data: my_row
# this will add a row like ["Hello", "World", nil, nil, nil]
# so that it will merge properly across columns A-E
merge_last_row sheet
If you are going to use these consistently, then patching these functions into Worksheet might make more sense, so you don't have to pass the sheet object.
module Axlsx
  class Worksheet
    def merge_last_row(options = {})
      last_row = rows.last.index + 1
      first_col, last_col = options[:columns]
      if first_col && last_col
        merge_cells "#{first_col}#{last_row}:#{last_col}#{last_row}"
      else
        merge_cells rows.last
      end
      rows.last.style = options[:style] if options[:style]
    end

    def fill_columns(column_count, options = {})
      row_data = options[:row_data] || []
      (column_count - row_data.count).times do
        row_data << nil
      end
      add_row row_data
    end
  end
end
Call:
sheet.merge_last_row columns: ["A", "E"]
sheet.add_row [I18n.t('foo.some_label').upcase]
sheet.merge_last_row columns: ["B", "E"], style: title
I'm trying to take a populated array and empty its contents into specified table fields.
I have a rake file that imports new rows via a CSV file; it needs to extract the values from my already-populated array and add them to the incident_id field.
For example:
@id_array = [97, 98, 99]
So, if I'm importing three new rows, the first row needs to get an incident_id of 97, the second row needs to get an incident_id of 98, and so on until the array is empty.
Here is the code for my rake file:
require 'csv'
namespace :import_timesheets_csv do
  task :create_timesheets => :environment do
    puts "Import Timesheets"
    csv_text = File.read('c:/rails/thumb/costrecovery_csv/lib/csv_import/timesheets.csv')
    csv = CSV.parse(csv_text, :headers => true)
    csv.each do |row|
      row = row.to_hash.with_indifferent_access
      Timesheet.create!(row.to_hash.symbolize_keys)
      timesheet = Timesheet.last
      timesheet.incident_id << @id_array
      timesheet.save
    end
  end
end
Compare the sizes first, then assign each row's incident_id by index:
if csv.size == @id_array.size
  csv.each_with_index do |row, index|
    row = row.to_hash.with_indifferent_access
    Timesheet.create!(row.to_hash.symbolize_keys)
    timesheet = Timesheet.last
    timesheet.incident_id = @id_array[index]
    timesheet.save
  end
else
  # Handle error: arrays are not equal in size
end