I am working on a requirement where our data is an array of hashes, each with 100+ keys. We need to generate CSV files with user-defined headers and some transformations, and we may end up with 100+ templates.
The main changes will be:
1) Renaming columns, such as Fname -> First name
2) Transforming data, such as Full name -> First name + Last name (merging 2 columns)
3) Fixing the position of a column, e.g. Fname should be at position 35
Is it possible to define this declaratively, or is there a gem available for it? Is there a design pattern we can apply here?
Some sample scenarios
I have input like this with many columns (100+)
[ {:employee_id=>"001", :first_name=>"John", :last_name=>"Dee", :date_of_birth=>"10/10/1983", :salary=>"100000", :bonus=>"50000", ......},
{:employee_id=>"002", :first_name=>"Alex", :last_name=>"Peck", :date_of_birth=>"11/01/1988", :salary=>"120000", :bonus=>"70000", .........},
]
Some customers need the CSV as:
Employee ID, First Name, Last Name, Date of birth, Salary, Bonus
001,John,Dee,10/10/1983,100000,50000,...
002,Alex,Peck,11/01/1988,120000,70000,...
Others need only the header changed:
ID, FName, LName, Dob, Salary, Bonus
001,John,Dee,10/10/1983,100000,50000,...
002,Alex,Peck,11/01/1988,120000,70000,...
Another needs columns merged (FName, LName -> Fullname):
ID, Fullname, Dob, Salary, Bonus
001,John Dee,10/10/1983,100000,50000,...
002,Alex Peck,11/01/1988,120000,70000,...
Another needs Salary and Bonus summed into a single Salary column:
ID, FName, LName, Dob, Salary
001,John,Dee,10/10/1983,150000,...
002,Alex,Peck,11/01/1988,190000,...
Another needs the column order changed and an age instead of Dob:
FName, LName, ID, age, Salary
John,Dee,001,36,150000,...
Alex,Peck,002,32,190000,...
There are many such variations on the same input.
Thanks for the help.
What you need is the presenter design pattern.
Your controller requests the data and stores it in a local variable, then loads the presenter for the current client and passes it the data.
In response you get the final CSV to return to the client.
Let's say your clients have unique codes, so that a Client model instance has a code attribute, which is a string.
Your controller would then look like this:
app/controllers/exports_controller.rb
class ExportsController < ApplicationController
  def export
    data = MyService.fetchData # <== data contains the data you gave as an example
    # Gets the right presenter, initialise it, and build the CSV
    csv = PresenterFactory.for(current_client).new(data).present
    respond_to do |format|
      format.html
      format.csv { send_data csv, filename: "export-name-for-#{current_client.code}.csv" }
    end
  end
end
The PresenterFactory class would be something like this:
app/models/presenter_factory.rb
class PresenterFactory
  def self.for(client)
    # For client with code "ABCD" it will return Presenters::Abcd class
    "Presenters::#{client.code.capitalize}".constantize
  end
end
The factory returns the client's presenter class.
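Note that constantize raises a NameError when a client code has no matching class. If that matters, an explicit lookup table with a fallback is one alternative. This is only a sketch: PresenterRegistry, Presenters::Default and the empty placeholder classes are hypothetical names, not part of the code above.

```ruby
# Placeholder presenter classes, standing in for the real ones.
module Presenters
  class Abcd; end
  class Default; end
end

# Hypothetical registry: maps client codes to presenter classes explicitly,
# so an unknown code falls back to a default instead of raising.
class PresenterRegistry
  PRESENTERS = {
    'ABCD' => Presenters::Abcd
  }.freeze

  def self.for(code)
    PRESENTERS.fetch(code.to_s.upcase, Presenters::Default)
  end
end

PresenterRegistry.for('abcd')    # Presenters::Abcd
PresenterRegistry.for('unknown') # Presenters::Default
```

The trade-off is that every new client needs a line in the table, whereas the constantize version picks up new classes by naming convention.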
And here is an example for a client's presenter class, for a client having the code ABCD:
app/models/presenters/abcd.rb
require 'csv'

module Presenters
  class Abcd
    def initialize(data)
      @data = data
    end

    def present
      CSV.generate(headers: true) do |csv|
        # Here is the client's specific CSV header
        csv << [
          'Employee ID',
          'First Name',
          # ...
        ]
        @data.each do |row|
          # Here is the client's specific CSV row
          csv << [
            row[:employee_id],
            row[:first_name],
            # ...
          ]
        end
      end
    end
  end
end
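With 100+ clients, writing the header and the row arrays by hand in every presenter gets repetitive. One way to make each presenter more declarative is a small base class with a column macro. This is a minimal sketch, not an existing gem: BasePresenter, column and FullNamePresenter are names I made up for illustration.

```ruby
require 'csv'

# Hypothetical base class: subclasses declare their columns in order.
class BasePresenter
  # Each subclass keeps its own ordered list of [header, extractor] pairs.
  def self.columns
    @columns ||= []
  end

  # Declares a column that is either copied from a key or computed by a block.
  def self.column(header, key = nil, &block)
    columns << [header, block || ->(row) { row[key] }]
  end

  def initialize(data)
    @data = data
  end

  # Builds the CSV string: one header line, then one line per input hash.
  def present
    CSV.generate do |csv|
      csv << self.class.columns.map(&:first)
      @data.each do |row|
        csv << self.class.columns.map { |_, extractor| extractor.call(row) }
      end
    end
  end
end

# Example client template: renamed columns plus a merged Fullname column.
class FullNamePresenter < BasePresenter
  column 'ID', :employee_id
  column 'Fullname' do |row|
    "#{row[:first_name]} #{row[:last_name]}"
  end
  column 'Dob', :date_of_birth
end

data = [{ employee_id: '001', first_name: 'John', last_name: 'Dee',
          date_of_birth: '10/10/1983' }]
puts FullNamePresenter.new(data).present
# ID,Fullname,Dob
# 001,John Dee,10/10/1983
```

The column position is simply the order of the column declarations, so moving a column means moving one line.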
You can achieve your objective by constructing a transformation hash. Its keys are the column names of the desired CSV file, in order, and its values are procs. Each proc, when called with an element of the given array of hashes, returns the value to be written in that key's column of the corresponding row.
Code
require 'csv'

def construct_csv(fname, arr, transform)
  CSV.open(fname, "wb") do |csv|
    keys = transform.keys
    csv << keys
    arr.each { |h| csv << keys.map { |k| transform[k].call(h) } }
  end
end
Examples
I will now illustrate how this method is used with various transformations.
Common data
arr = [{:employee_id=>"001", :first_name=>"John", :last_name=>"Dee",
:date_of_birth=>"10/10/1983", :salary=>"100000", :bonus=>"50000" },
{:employee_id=>"002", :first_name=>"Alex", :last_name=>"Peck",
:date_of_birth=>"11/01/1988", :salary=>"120000", :bonus=>"70000" }]
FName = 'temp.csv'
Write a CSV file with the same keys, in the same order, and the same values
keys = arr.first.keys
#=> [:employee_id, :first_name, :last_name, :date_of_birth, :salary, :bonus]
transform = keys.each_with_object({}) { |k,g| g[k] = ->(h) { h[k] } }
#=> {:employee_id=>#<Proc:0x00005bd270a0e710@(irb):451 (lambda)>,
# :first_name=>#<Proc:0x00005bd270a13260@(irb):451 (lambda)>,
# ...
# :bonus=>#<Proc:0x00005bd270a19cc8@(irb):451 (lambda)>}
construct_csv(FName, arr, transform)
Let's see what was written.
puts File.read(FName)
employee_id,first_name,last_name,date_of_birth,salary,bonus
001,John,Dee,10/10/1983,100000,50000
002,Alex,Peck,11/01/1988,120000,70000
Write a CSV file with the columns reordered (see note 1)
col_order = [:last_name, :first_name, :employee_id, :salary, :bonus,
:date_of_birth]
keys = arr.first.keys
order_map = col_order.each_with_object({}) { |k,h| h[k] = keys.index(k) }
#=> {:last_name=>2, :first_name=>1, :employee_id=>0, :salary=>4,
# :bonus=>5, :date_of_birth=>3}
transform = col_order.each_with_object({}) { |k,g|
  g[k] = ->(h) { h[keys[order_map[k]]] } }
#=> {:last_name=>#<Proc:0x00005bd270f8e5a0@(irb):511 (lambda)>,
# :first_name=>#<Proc:0x00005bd270f8e550@(irb):511 (lambda)>,
# ...
# :date_of_birth=>#<Proc:0x00005bd270f8e3c0@(irb):511 (lambda)>}
construct_csv(FName, arr, transform)
puts File.read(FName)
last_name,first_name,employee_id,salary,bonus,date_of_birth
Dee,John,001,100000,50000,10/10/1983
Peck,Alex,002,120000,70000,11/01/1988
Write a CSV file with a subset of keys, renamed and reordered
keymap = { :FirstName=>:first_name, :LastName=>:last_name, :ID=>:employee_id,
:Salary=>:salary, :Bonus=>:bonus }
transform = keymap.each_with_object({}) { |(new,old),g| g[new] = ->(h) { h[old] } }
#=> {:FirstName=>#<Proc:0x00005bd270d50298@(irb):391 (lambda)>,
# :LastName=>#<Proc:0x00005bd270d50220@(irb):391 (lambda)>,
# ...
# :Bonus=>#<Proc:0x00005bd270d830f8@(irb):391 (lambda)>}
construct_csv(FName, arr, transform)
puts File.read(FName)
FirstName,LastName,ID,Salary,Bonus
John,Dee,001,100000,50000
Alex,Peck,002,120000,70000
Write a CSV file after removing keys and adding keys whose values are computed
keys_to_remove = [:first_name, :last_name]
keys_to_add = [:full_name, :compensation]
keys = arr.first.keys + keys_to_add - keys_to_remove
#=> [:employee_id, :date_of_birth, :salary, :bonus, :full_name,
# :compensation]
# The accumulator is named g so the lambda parameter h does not shadow it.
transform = keys.each_with_object({}) do |k,g|
  g[k] =
    case k
    when :full_name
      ->(h) { h[:first_name] + " " + h[:last_name] }
    when :compensation
      ->(h) { h[:salary].to_i + h[:bonus].to_i }
    else
      ->(h) { h[k] }
    end
end
#=> {:employee_id=>#<Proc:0x00005bd271001000@(irb):501 (lambda)>,
# :date_of_birth=>#<Proc:0x00005bd271000f88@(irb):501 (lambda)>,
# :salary=>#<Proc:0x00005bd271000f10@(irb):501 (lambda)>,
# :bonus=>#<Proc:0x00005bd271000ec0@(irb):501 (lambda)>,
# :full_name=>#<Proc:0x00005bd271000e20@(irb):497 (lambda)>,
# :compensation=>#<Proc:0x00005bd271000dd0@(irb):499 (lambda)>}
construct_csv(FName, arr, transform)
puts File.read(FName)
employee_id,date_of_birth,salary,bonus,full_name,compensation
001,10/10/1983,100000,50000,John Dee,150000
002,11/01/1988,120000,70000,Alex Peck,190000
1. I don't understand the reason for doing this but it was mentioned as a possible requirement.
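The last scenario in the question (columns reordered, age instead of date of birth, salary and bonus summed) fits the same transformation-hash idea. The sketch below rests on assumptions: dates are taken to be day/month/year, age is measured on a fixed reference date (1 June 2020, chosen only so the example is reproducible), and the helper names age_on and csv_string are mine. It builds a string with CSV.generate rather than writing a file.

```ruby
require 'csv'
require 'date'

# Age in whole years on the date as_of.
# Assumption: date_of_birth is a "dd/mm/yyyy" string.
def age_on(date_of_birth, as_of)
  dob = Date.strptime(date_of_birth, '%d/%m/%Y')
  age = as_of.year - dob.year
  # Subtract one if the birthday has not yet occurred in as_of's year.
  age -= 1 if ([as_of.month, as_of.day] <=> [dob.month, dob.day]) < 0
  age
end

# Same idea as construct_csv, but returns the CSV as a string.
def csv_string(rows, transform)
  CSV.generate do |csv|
    csv << transform.keys
    rows.each { |row| csv << transform.values.map { |fn| fn.call(row) } }
  end
end

rows = [
  { employee_id: '001', first_name: 'John', last_name: 'Dee',
    date_of_birth: '10/10/1983', salary: '100000', bonus: '50000' },
  { employee_id: '002', first_name: 'Alex', last_name: 'Peck',
    date_of_birth: '11/01/1988', salary: '120000', bonus: '70000' }
]

as_of = Date.new(2020, 6, 1) # fixed reference date, an assumption

transform = {
  'FName'  => ->(h) { h[:first_name] },
  'LName'  => ->(h) { h[:last_name] },
  'ID'     => ->(h) { h[:employee_id] },
  'age'    => ->(h) { age_on(h[:date_of_birth], as_of) },
  'Salary' => ->(h) { h[:salary].to_i + h[:bonus].to_i }
}

puts csv_string(rows, transform)
# FName,LName,ID,age,Salary
# John,Dee,001,36,150000
# Alex,Peck,002,32,190000
```

In real use you would pass Date.today (or the report date) instead of a fixed as_of.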
I've got a class that looks like this that turns a collection into a nested array of hashes:
# variable_stack.rb
class VariableStack
  def initialize(document)
    @document = document
  end

  def to_a
    @document.template.stacks.map { |stack| stack_hash(stack) }
  end

  private

  def stack_hash(stack)
    {}.tap do |hash|
      hash['stack_name'] = stack.name.downcase.parameterize.underscore
      hash['direction'] = stack.direction
      hash['boxes'] = stack.boxes.indexed.map do |box|
        box_hash(box)
      end.reverse_if(stack.direction == 'up') # array extensions
    end.delete_if_key_blank(:boxes) # hash extensions
  end

  def box_hash(box)
    {}.tap do |hash|
      hash['box'] = box.name.downcase.parameterize.underscore
      hash['content'] = box.template_variables.indexed.map do |var|
        content_array(var)
      end.join_if_any?
    end.delete_if_key_blank(:content)
  end

  def content_array(var)
    v = @document.template_variables.where(master_id: var.id).first
    return unless v
    if v.text.present?
      v.text
    elsif v.photo_id.present?
      v.image.uploaded_image.url
    else
      ''
    end
  end
end
# array_extensions.rb
class Array
  def join_if_any?
    join("\n") if size.positive?
  end

  def reverse_if(boolean)
    reverse! if boolean
  end
end
# hash_extensions.rb
class Hash
  def delete_if_key_blank(key)
    delete_if { |_, _| key.to_s.blank? }
  end
end
This method is supposed to return a hash that looks like this:
"stacks": [
  {
    "stack_name": "stack1",
    "direction": "down",
    "boxes": [
      {
        "box": "user_information",
        "content": "This is my name.\n\nThis is my phone."
      }
    ]
  },
  {
    "stack_name": "stack2",
    "direction": "up",
    "boxes": [
      {
        "box": "fine_print",
        "content": "This is a test.\n\nYeah yeah."
      }
    ]
  }
]
Instead, often the boxes key is null:
"stacks": [
  {
    "stack_name": "stack1",
    "direction": "down",
    "boxes": null
  },
  {
    "stack_name": "stack2",
    "direction": "up",
    "boxes": [
      {
        "box": "fine_print",
        "content": "This is a test.\n\nYeah yeah."
      }
    ]
  }
]
I suspect it's because I can't "single-line" adding to arrays in Rails 5 (i.e., they're frozen). @document.template.stacks is an ActiveRecord collection.
Why can't I map records in those collections into hashes and add them to arrays like hash['boxes']?
The failing test
APIDocumentV3 Instance methods #stacks has the correct content joined and indexed
Failure/Error:
expect(subject.stacks.first['boxes'].first['content'])
.to include(document.template_variables.first.text)
expected "\n" to include "#1"
Diff:
@@ -1,2 +1 @@
-#1
The presence of \n means the join method works, but it shouldn't join if the array is empty. What am I missing?
reverse_if returns nil if the condition is false. Consider this:
[] if false #=> nil
You could change it like this:
def reverse_if(condition)
  condition ? reverse : self
end
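delete_if_key_blank doesn't look right to me. It never deletes anything, because key.to_s.blank? tests the key name that was passed in (:boxes or :content), which is never blank, rather than the value stored under that key.
Disclaimer: I don't think it's a good idea to extend the standard library.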
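If you would rather not reopen Array and Hash, the same helpers can live in a plain module. This is a rough sketch under one assumption: that delete_if_key_blank was meant to drop the named key when its value is blank. CollectionHelpers and without_blank_key are names I invented, and the blank check is spelled out by hand so the example runs without ActiveSupport.

```ruby
# Hypothetical helper module, used instead of monkey-patching core classes.
module CollectionHelpers
  module_function

  # Returns a reversed copy when condition is truthy, otherwise the array
  # itself. Unlike `reverse! if condition`, it never returns nil.
  def reverse_if(array, condition)
    condition ? array.reverse : array
  end

  # Returns a copy of the hash without `key` when that key's value is nil or
  # empty; otherwise returns the hash unchanged.
  def without_blank_key(hash, key)
    value = hash[key]
    blank = value.nil? || (value.respond_to?(:empty?) && value.empty?)
    blank ? hash.reject { |k, _| k == key } : hash
  end
end

CollectionHelpers.reverse_if([1, 2, 3], false)
# => [1, 2, 3]
CollectionHelpers.without_blank_key({ 'box' => 'a', 'content' => nil }, 'content')
# => {"box"=>"a"}
```

Note that the hashes in the question use string keys ('boxes', 'content') while the helper is called with symbols (:boxes, :content), so whatever version you use, the key types need to match.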
So thanks to Danil Speransky I solved this issue, although what he wrote doesn't quite cover it.
There were a couple of things going on here and I solved the nil arrays with this code:
hash['boxes'] = stack.boxes.indexed.map do |box|
box_hash(box) unless box_hash(box)['content'].blank?
end.reverse_if(stack.direction == 'up').delete_if_blank?
end
That said, I'm almost certain my .delete_if_blank? extension to the Array class isn't helping at all. It looks like this, FYI:
class Array
  def delete_if_blank?
    delete_if(&:blank?)
  end
end
I solved it by throwing the unless box_hash(box)['content'].blank? condition onto the method call. It ain't pretty but it works.
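For what it's worth, the version above calls box_hash twice per box and leaves nil entries in the array for the extension to clean up. A possible tidier shape is to build each hash once and then reject the ones with blank content. This is only a sketch with stand-in data: the Box struct, the simplified box_hash and boxes_for are hypothetical, and the blank check is written without ActiveSupport.

```ruby
# Stand-in for the real box records, just enough for the example.
Box = Struct.new(:name, :content)

# Simplified stand-in for the box_hash method in the question.
def box_hash(box)
  { 'box' => box.name, 'content' => box.content }
end

# Builds each hash once, drops those whose content is nil or whitespace,
# then reverses a copy if the stack direction is 'up'.
def boxes_for(boxes, direction)
  hashes = boxes.map { |box| box_hash(box) }
                .reject { |hash| hash['content'].to_s.strip.empty? }
  direction == 'up' ? hashes.reverse : hashes
end

boxes = [Box.new('a', 'first'), Box.new('b', nil), Box.new('c', 'last')]
boxes_for(boxes, 'up')
# => [{"box"=>"c", "content"=>"last"}, {"box"=>"a", "content"=>"first"}]
```

Because the result is always an array (possibly empty), the boxes key never ends up null.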
I have no idea where to start with testing this rake task. Do I need stubs? If so, how do I use them? Any help would be appreciated. Thanks!
desc "Import CSV file"
task :import => [:environment] do
  data = "db/data.csv"
  headers = CSV.open(data, 'r') { |csv| csv.first }
  cs = headers[2..-1].map { |c| Model1.where(name: c).first_or_create }
  ls = Model2.find_ls
  csv_contents = CSV.read(photos)
  csv_contents.shift
  csv_contents.each do |row|
    p = Model2.where(id: row[0], f_name: row[1]).first_or_create
    p_d = FastImage.size(p.file.url(:small))
    p.update_attributes(dimensions: p_d)
    row[2..-1].each_with_index do |ls, i|
      unless ls.nil?
        ls.split(',').each { |l|
          cl = Model3.where(name: l.strip, model_1_id: cs[i].id).first_or_create
          Model4.where(p_id: p.id, model_3_id: cl.id).first_or_create
        }
      end
    end
  end
end
Here's how I'd do it:
1) Happy-path test(s)
Rake tasks as such are a pain to test. Extract the body of the rake task into a class:
whatever.rake
desc "Import CSV file"
task :import => [:environment] do
  CsvImporter.new.import "db/data.csv"
end
lib/csv_importer.rb
require 'csv'

class CsvImporter
  def import(data)
    headers = CSV.open(data, 'r') { |csv| csv.first }
    cs = headers[2..-1].map { |c| Model1.where(name: c).first_or_create }
    ls = Model2.find_ls
    # Read from the file passed in (the original referenced an undefined `photos`)
    csv_contents = CSV.read(data)
    csv_contents.shift
    csv_contents.each do |row|
      p = Model2.where(id: row[0], f_name: row[1]).first_or_create
      p_d = FastImage.size(p.file.url(:small))
      p.update_attributes(dimensions: p_d)
      row[2..-1].each_with_index do |ls, i|
        unless ls.nil?
          ls.split(',').each { |l|
            cl = Model3.where(name: l.strip, model_1_id: cs[i].id).first_or_create
            Model4.where(p_id: p.id, model_3_id: cl.id).first_or_create
          }
        end
      end
    end
  end
end
Now it's easy to write a test that calls CsvImporter.new.import on a test file (that's why import takes the file as a parameter instead of hardcoding it) and checks the results. If it's reasonable to import db/data.csv in the test environment, you can do that in a test if you want. You probably only need one test like this. No stubs are required.
2) Edge and error cases
There is a lot of logic here which, for simplicity and speed, you'll want to test without creating actual model objects. That is, yes, you'll want to stub. Model2.find_ls and FastImage.size are already easy to stub. Let's extract a method to make the other model calls easy to stub:
class CsvImporter
  def import(data)
    headers = CSV.open(data, 'r') { |csv| csv.first }
    cs = headers[2..-1].map { |c| Model1.first_or_create_with(name: c) }
    ls = Model2.find_ls
    # Read from the file passed in (the original referenced an undefined `photos`)
    csv_contents = CSV.read(data)
    csv_contents.shift
    csv_contents.each do |row|
      p = Model2.first_or_create_with(id: row[0], f_name: row[1])
      p_d = FastImage.size(p.file.url(:small))
      p.update_attributes(dimensions: p_d)
      row[2..-1].each_with_index do |ls, i|
        unless ls.nil?
          ls.split(',').each { |l|
            cl = Model3.first_or_create_with(name: l.strip, model_1_id: cs[i].id)
            Model4.first_or_create_with(p_id: p.id, model_3_id: cl.id)
          }
        end
      end
    end
  end
end
app/models/concerns/active_record_extensions.rb
module ActiveRecordExtensions
  def first_or_create_with(attributes)
    where(attributes).first_or_create
  end
end
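and extend each model that needs it with the module (extend ActiveRecordExtensions), since first_or_create_with is called on the model class rather than on an instance.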
Now it's easy to stub all of the model methods so you can write tests that simulate any database situation you like.
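To give a feel for what such a test can look like without a database, here is a self-contained sketch. It swaps in a different technique from a stubbing library: hand-written fake objects are passed into a cut-down importer. FakeModel and LabelImporter are hypothetical, only the header and label logic is reproduced, and the CSV is passed as text rather than a file path to keep the example short.

```ruby
require 'csv'

# Hypothetical stand-in for a model class: records every lookup it receives
# and returns a simple record with a sequential id.
class FakeModel
  Record = Struct.new(:id, :attributes)
  attr_reader :calls

  def initialize
    @calls = []
  end

  def first_or_create_with(attributes)
    @calls << attributes
    Record.new(@calls.size, attributes)
  end
end

# Cut-down importer with its collaborators injected (an assumption made
# for the sketch; the importer above refers to the model classes directly).
class LabelImporter
  def initialize(category_model:, label_model:)
    @category_model = category_model
    @label_model = label_model
  end

  def import(csv_text)
    rows = CSV.parse(csv_text)
    headers = rows.shift
    categories = headers[2..-1].map { |name| @category_model.first_or_create_with(name: name) }
    rows.each do |row|
      row[2..-1].each_with_index do |cell, i|
        next if cell.nil? # empty cell: nothing to create for this column
        cell.split(',').each do |label|
          @label_model.first_or_create_with(name: label.strip, model_1_id: categories[i].id)
        end
      end
    end
  end
end

csv_text = <<~TEXT
  id,f_name,colour,shape
  1,a.jpg,"red, blue",
TEXT

categories = FakeModel.new
labels = FakeModel.new
LabelImporter.new(category_model: categories, label_model: labels).import(csv_text)

categories.calls # => [{:name=>"colour"}, {:name=>"shape"}]
labels.calls     # => [{:name=>"red", :model_1_id=>1}, {:name=>"blue", :model_1_id=>1}]
```

With a mocking library you would get the same effect by stubbing first_or_create_with on the real model classes and asserting on the calls received.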