How can I upload and parse an Excel file in Rails? - ruby-on-rails

I want to be able to upload an Excel file that contains contact information. I then want to be able to parse it and create records for my Contact model.
My application is a Rails application.
I am using the paperclip gem on Heroku. I've been able to parse vCards into the Contact model, and am looking for something similar that will go through all lines of the Excel file.
Gems that simplify the task and sample code to parse would be helpful!

Spreadsheet is the best Excel parser that I have found so far. It offers quite a lot of functionality.
You say you use Paperclip for attachments which is good. However, if you store the attachments in S3 (which I assume since you use Heroku) the syntax for passing the file to spreadsheet is a little different but not difficult.
Here is an example of the bare syntax. It is not placed in any class or module, since I don't know how you intend to start the parsing of contacts.
# load the gem
require 'spreadsheet'
# In this example the model MyFile has_attached_file :attachment
@workbook = Spreadsheet.open(MyFile.first.attachment.to_file)
# Get the first worksheet in the Excel file
@worksheet = @workbook.worksheet(0)
# It can be a little tricky looping through the rows since the variable
# @worksheet.rows often seems to be empty, but this will work:
0.upto(@worksheet.last_row_index) do |index|
  # .row(index) will return the row which is a subclass of Array
  row = @worksheet.row(index)
  @contact = Contact.new
  # row[0] is the first cell in the current row, row[1] is the second cell, etc...
  @contact.first_name = row[0]
  @contact.last_name = row[1]
  @contact.save
end
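The row-to-record step above can be pulled out into plain Ruby so it is easy to test without a spreadsheet file. This is only a sketch: the helper name rows_to_contact_attributes, the column order, and the header-skipping behaviour are assumptions, not part of the spreadsheet gem.

```ruby
# Hypothetical helper: turns raw rows (arrays of cell values) into
# attribute hashes. The column order below is an assumption.
CONTACT_COLUMNS = %i[first_name last_name].freeze

def rows_to_contact_attributes(rows, skip_header: true)
  data_rows = skip_header ? rows.drop(1) : rows
  data_rows.filter_map do |row|
    # Keep only the mapped columns and normalise each cell to a stripped string
    cells = row.first(CONTACT_COLUMNS.size).map { |cell| cell.to_s.strip }
    next if cells.all?(&:empty?) # ignore completely blank rows
    CONTACT_COLUMNS.zip(cells).to_h
  end
end

rows = [
  ["First name", "Last name"],
  ["Ada", "Lovelace"],
  [nil, nil],
  [" Grace ", "Hopper"]
]
rows_to_contact_attributes(rows).each { |attrs| p attrs }
```

In a Rails app each hash could then be passed to Contact.create, with the rows collected from the worksheet loop shown above.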

I had a similar requirement in one of my Rails 2.1.0 applications. I solved it in the following manner:
In the 'lib' folder I wrote a module like this:
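require 'spreadsheet'
module DataReader
  def read_data(path_to_file)
    begin
      # Open the workbook and take its first worksheet
      book = Spreadsheet.open(path_to_file)
      sheet = book.worksheet 0
      # Start iterating after the first two rows
      sheet.each 2 do |row|
        unless row[0].blank?
          # Create model and save it to DB
          ...
        end
      end
    rescue Exception => e
      puts e
    end
  end
end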
I had a model Upload:
class Upload < ActiveRecord::Base
  has_attached_file :doc,
    :url => "datafiles/:id",
    :path => ":rails_root/uploads/:id/:style/:basename.:extension"
  # validations, if any
end
Generated an UploadsController which would handle the file upload and save it to appropriate location. I used Paperclip for file upload.
class UploadsController < ApplicationController
  include DataReader
  def new
    @upload = Upload.new
  end
  def create
    @upload = Upload.new(params[:upload])
    @upload.save
    file_path = "uploads/#{@upload.id}/original/#{@upload.doc_file_name}"
    # read_data is available as an instance method through `include DataReader`
    @upload.read = read_data(file_path)
    # respond_to block
  end
end
Read about 'spreadsheet' library here and here. You can make appropriate improvements and make the technique work in Rails 3. Hope this helps.
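One detail worth illustrating: a method defined in a module is only callable as DataReader.some_method if it is also a module-level method, for example via module_function. The sketch below shows that pattern on plain arrays. The method name data_rows and the Importer class are hypothetical, and rows stands in for whatever enumerable of rows the worksheet provides.

```ruby
module DataReader
  # module_function makes the methods below callable both as
  # DataReader.data_rows(...) and, after `include`, as private instance methods.
  module_function

  # Drops the first `skip` rows, then rejects rows whose first cell is blank.
  def data_rows(rows, skip: 2)
    rows.drop(skip).reject { |row| row[0].to_s.strip.empty? }
  end
end

# Hypothetical consumer that mixes the module in
class Importer
  include DataReader

  def first_cells(rows)
    data_rows(rows).map { |row| row[0] }
  end
end

sample = [["Title"], ["Header"], ["alice"], [nil], ["bob"]]
p DataReader.data_rows(sample)
p Importer.new.first_cells(sample)
```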

I made a gem to achieve this easily. I called it Parxer and...
It's built on top of the roo gem.
It allows you to parse xls, xlsx and csv files.
It has a DSL to handle:
Column mapping.
File, row, and column/cell validation.
Column/cell formatting.

Related

Struggling to get a CSV Report to load when requested

I'm trying to make it possible to view a report from a webpage and I am struggling. This task uses Sidekiq.
When I click on the link that should take me there I get the error: "The action 'churn_risk_report' could not be found for Admin::ReportsController."
In a show.html.erb file, in the Reports view, I've added <li><%= link_to 'Churn Risk Report', churn_risk_report_admin_reports_path %></li> beneath a number of similar lines with links to other reports.
I've added get 'churn_risk_report' in the correct place in my Routes file.
In my Workers directory, which I believe is my system's version of a Scripts directory used for Sidekiq jobs, I have a file called churn_risk_report.rb and in this file I have the following code:
class ChurnRiskReport
  include Sidekiq::Worker
  def perform
    csv_temp = Tempfile.new
    puts csv_temp.path
    CSV.open(csv_temp.path, 'wb') do |csv|
      csv << ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF"]
      AccountChurnRisk.all.each do |acr|
        report_data = [acr.aaa,
                       acr.bbb,
                       "#{acr.ccc} #{acr.ccc}",
                       acr.ddd,
                       acr.eee,
                       acr.fff]
        csv << report_data
      end
    end
    report_name = "churn_risk_report"
    file_name = "gggggggggg-#{report_name}-#{DateTime.now.to_date.strftime("%b-%Y")}.csv".downcase
    bucket_name = 'hhh-reports'
    s3 = AWS::S3.new
    key = "reports/churn_reports/" + File.basename(file_name)
    x = s3.buckets[bucket_name].objects[key].write(:file => csv_temp.path)
  end
end
Runs of repeated letters stand in for code that I think should be kept private. This code worked when I ran it in a rails console, so I know the code works.
I'm struggling with the final part of the task. In the Reports controller I've defined churn_risk_report and just copied in the same code as is in the Worker file. I know this is incorrect but I'm not sure what should go in there. I think there should be some code in here that temporarily creates a churn_risk_report file in the Workers directory, however this might not be the case. I also think that 'async' should be involved somewhere.
Thanks in advance for any help!
At the bottom of this answer is the gist I keep for basic CSV export logic; it works for most cases that don't require high performance. Typically I use it to let the user download a CSV copy of the data they are currently looking at (so caching in S3 isn't very practical): https://gist.github.com/MyklClason/f6ac68ca4ce1faa5d655abfb0abe788b
Depending on how large the report is, saving it in S3 may be excessive, but it could work as a way to cache the report (or store it long term). In that case, before rendering the download link (to the S3 file itself), check whether the report already exists. If it does, render a link directly to AWS; if it doesn't, render a "Generate report" link, and once the report is generated, render a "Download report" link.
Even better is to generate the reports automatically, which seems to be what you are doing in the original code. A "Generate report" button could queue up the job that runs the code you have. However, the reports are likely monthly or similar, so you know exactly when they need to be created; in that case the "Generate report" button probably isn't needed and you can just provide an S3 download link once the report has been generated.
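The "does the report already exist?" decision described above could look roughly like the sketch below. It is deliberately independent of S3: existing_keys stands in for whatever existence check your storage client offers, and the helper names and the simplified key format (without the private prefix from the question) are assumptions.

```ruby
require "date"

# Hypothetical: the key the worker would write for the month of `date`.
def churn_report_key(date)
  "reports/churn_reports/churn_risk_report-#{date.strftime('%b-%Y')}.csv".downcase
end

# Returns [:download, key] when the report is already stored,
# otherwise [:generate, key] so the view can offer a "Generate report" link.
def report_action(existing_keys, date)
  key = churn_report_key(date)
  existing_keys.include?(key) ? [:download, key] : [:generate, key]
end

stored = ["reports/churn_reports/churn_risk_report-mar-2024.csv"]
p report_action(stored, Date.new(2024, 3, 15))
p report_action(stored, Date.new(2024, 4, 1))
```

In the controller, the :generate branch is where the Sidekiq job would be queued, typically with ChurnRiskReport.perform_async, rather than copying the worker code into the action.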
Gist content:
# Good solution for exporting to CSV in rails.
# Source: https://www.codementor.io/victorhazbun/export-records-to-csv-files-ruby-on-rails-vda8323q0
class UsersController < ApplicationController
  def index
    @users = User.all
    respond_to do |format|
      format.html
      format.csv { send_data @users.to_csv, filename: "users-#{Date.today}.csv" }
    end
  end
end
class User < ActiveRecord::Base
  def self.to_csv
    attributes = %w{id email name}
    CSV.generate(headers: true) do |csv|
      csv << attributes
      all.each do |user|
        csv << attributes.map{ |attr| user.send(attr) }
      end
    end
  end
end
<%= link_to 'Export CSV', users_path(format: :csv) %>
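To see what the to_csv pattern above produces without a database, here is a standalone sketch using only Ruby's csv library. DemoUser and users_to_csv are hypothetical stand-ins for the ActiveRecord model and its class method.

```ruby
require "csv"

# Hypothetical stand-in for an ActiveRecord model
DemoUser = Struct.new(:id, :email, :name)

# Builds a CSV string: one header row, then one row per user.
def users_to_csv(users, attributes = %w[id email name])
  CSV.generate do |csv|
    csv << attributes
    users.each do |user|
      csv << attributes.map { |attr| user.public_send(attr) }
    end
  end
end

users = [
  DemoUser.new(1, "ann@example.com", "Ann"),
  DemoUser.new(2, "bo@example.com", "Lee, Bo")
]
puts users_to_csv(users)
```

Fields that contain a comma, such as the second name here, are quoted automatically by the CSV library.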

Validation of uploaded Excel file before save using Rails with Roo-xls gem

In a model, I want to validate uploaded .xls files before they are saved by the application. I am trying to open the to-be-saved Excel file from the :file_url attribute (the column in the comits table where the .xls files will be saved) and then validate it, but I am getting a "no implicit conversion of Symbol into String" error.
The validation works when I place the actual file path of an excel file that has been uploaded and saved by carrierwave into Roo::Excel.new("") but that defeats the purpose of my validation.
How can I grab the excel file without it being stored in the application?
I appreciate the help! I hope it is not too confusing.
This is my comit.rb
class Comit < ActiveRecord::Base
  belongs_to :user
  mount_uploader :file_url, ComitUploader, mount_on: :file_url
  validates :filename, :file_url, presence: true
  validates_format_of :file_url, with: /.xls/, message: "Wrong file format"
  before_save :validate_excel
  def validate_excel
    sheet = Roo::Excel.new(:file_url)
    errors = []
    header = sheet.row(1)
    num_of_columns = sheet.last_column
    errors << 'Need headers' unless
    errors << 'Need more columns' if num_of_columns < 21
    errors
  end
end
You're passing the symbol :file_url to Roo::Excel.new, it wants a path to the file. Try:
sheet = Roo::Excel.new(file_url)
You can send the tempfile to Roo.
Let's say you are sending the file to params as :file:
Roo::Excel.open(params[:file].tempfile.to_path.to_s)
Okay, I figured it out. Instead of Roo::Excel, it needs to be
sheet = Roo::Spreadsheet.open(self.file_url)
Also, I needed to install the 'roo-xls' gem in order to read the spreadsheet.
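Once the sheet opens, the checks themselves can be kept as plain Ruby that takes the file name, the header row, and the column count. The following is a sketch under assumptions: spreadsheet_errors is a hypothetical helper, and the header condition is a guess because the original line is cut off. It also uses an anchored pattern, since /.xls/ matches "xls" anywhere in the name.

```ruby
# Matches names that end in ".xls" (so "report.xlsx" or "xls_notes.txt" fail)
XLS_EXTENSION = /\.xls\z/i

# Returns a list of human-readable problems; an empty list means valid.
def spreadsheet_errors(filename, header, column_count, min_columns: 21)
  errors = []
  errors << "Wrong file format" unless filename.match?(XLS_EXTENSION)
  # Assumed rule: the header row must contain at least one non-nil cell
  errors << "Need headers" if header.nil? || header.compact.empty?
  errors << "Need more columns" if column_count.to_i < min_columns
  errors
end

p spreadsheet_errors("contacts.xls", ["Name"] * 21, 21)
p spreadsheet_errors("contacts.xlsx", [nil, nil], 3)
```

In the model, these messages would normally be added with errors.add inside a method registered through validate, so that an invalid file blocks the save; a before_save callback that only returns an array does not.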
In my case the following code works fine (assuming that you have installed the required gems). I have the .xls file in my root directory.
require 'roo'
require 'roo-xls'
class ExcelReader
  def read_file
    sheet = Roo::Spreadsheet.open('Test.xls')
    sheet.each do |row|
      puts row
    end
  end
end
obj = ExcelReader.new
obj.read_file

Fill up and update an uploaded PDF form online and save it back to the server - Ruby on Rails

Here is the requirement:
In my web app developed in Ruby on Rails, we need an option to upload a PDF form template to the system and open it in the browser, where the user should be able to fill in the PDF form online and finally save it back to the server.
Then, user will come and download the updated PDF form from the application. I've searched a lot but couldn't find a proper solution to it. Please suggest.
As I stated, for prebuilt PDFs with form fields already embedded I use pdftk and the active_pdftk gem. This is the standard process I use, but yours may differ:
class Form
  def populate(obj)
    #Stream the PDF form into a TempFile in the tmp directory
    template = stream
    #turn the streamed file into a pdftk Form
    #pdftk_path should be the path to the executable for pdftk
    populated_form = ActivePdftk::Form.new(template,path: pdftk_path)
    #This will generate the form_data Hash based on the fields in the form
    #each form field is specified as a method with or without arguments
    #fields with arguments are specified as method_name*args for splitting purposes
    #the accumulator is named data so that it does not shadow obj
    form_data = populated_form.fields.each_with_object({}) do |field,data|
      meth,args = field.name.split("*")
      #set the Hash key to the value of the method with or without args
      data[field.name] = args ? obj.send(meth,args) : obj.send(meth)
    end
    fill(template,form_data)
  end
  private
  def fdf(waiver_data,path)
    @fdf ||= ActivePdftk::Fdf.new(waiver_data)
    @fdf.save_to path
  end
  def fill(template,waiver_data)
    rand_path = generate_tmp_file('.fdf')
    initialize_pdftk.fill_form(template,
      fdf(waiver_data,rand_path),
      output:"#{rand_path.gsub(/fdf/,'pdf')}",
      options:{flatten:true})
  end
  def initialize_pdftk
    @pdftk ||= ActivePdftk::Wrapper.new(:path =>pdftk_path)
  end
end
Basically what this does is stream the form to a tempfile. Then it converts it to an ActivePdftk::Form. Then it reads all the fields and builds a Hash with a field_name => value structure. From this it generates an fdf file, uses that to populate the actual PDF file, and then outputs the result to another tempfile, flattened to remove the fields from the final result.
Your use case might differ but hopefully this example will be useful in helping you achieve your goal. I have not included every method used as I am assuming you know how to do things like read a file. Also my forms require a bit more dynamics like methods with arguments. Obviously if you are just filling in raw fixed data this portion could be changed a bit too.
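The field-name convention described above (method_name, or method_name*args) can be tried out without pdftk. The sketch below is only an illustration: form_data_for and the Waiver class are hypothetical, and the field names would really come from the PDF form.

```ruby
# Builds { field_name => value } by calling the method encoded in each
# field name on `source`. "signed_on*short" calls source.signed_on("short").
def form_data_for(field_names, source)
  field_names.each_with_object({}) do |name, data|
    meth, arg = name.split("*", 2)
    data[name] = arg ? source.public_send(meth, arg) : source.public_send(meth)
  end
end

# Hypothetical object used to fill the form
class Waiver
  def full_name
    "Ada Lovelace"
  end

  def signed_on(format)
    format == "short" ? "01/02/24" : "January 2, 2024"
  end
end

p form_data_for(["full_name", "signed_on*short"], Waiver.new)
```

Naming the accumulator data rather than reusing the name of the source object keeps the two from shadowing each other inside the block.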
An example of usage given your class is called Form and you have some other object to fill the form with.
class SomeController < ApplicationController
  def download_form
    @form = Form.find(params[:form_id])
    @object = MyObject.find(params[:my_object_id])
    send_file(@form.populate(@object), type: :pdf, layout:false, disposition: 'attachment')
  end
end
This example will take @form and populate it from @object then present it to the end user as a filled and flattened PDF. If you just needed to save it back into the database I am sure you can figure this out using an uploader of some kind.

Ruby parse CSV file to print out the rows

I have a file upload in my Rails application and I want to parse the CSV file assuming the upload went okay. You can see the comment below that indicates where I would like to read the rows of the CSV file. How can I do this? I used carrierwave for the file upload.
I mounted it as such
mount_uploader :file, LCFileUploader
Here is the code I currently have
require 'csv'
class LCFilesController < ApplicationController
  def new
    authorize! :create, :lc_file
    @lc_file = LCFile.new
  end
  def create
    authorize! :create, :lc_file
    puts params
    @lc_file = LCFile.new(params[:lc_file])
    @lc_file.user_id = current_user.id
    if @lc_file.save
      #PARSE CSV HERE TO PRINT OUT THE ROWS OF THE CSV FILE
      CSV.foreach(@lc_file.file.path) do |row|
        puts row
      end
      redirect_to lc_path, :notice => 'New lc created!'
    else
      render :new
    end
  end
end
and I get this error:
undefined method `find_all_by_team_id' for #<Class:0x007fe14c40d848>
You can use the CSV class:
puts CSV.read(@lc_file.file.path)
or one row at a time:
CSV.foreach(@lc_file.file.path) do |row|
  puts row
end
Besides CSV parsing there are a few more issues:
the redirect will not work after you send some output. But even if it did, the output would not be seen, since you're redirecting.
the path you are redirecting to is incorrect (I believe that's why you get that error). I suppose you want something like lcfiles_path or lcfile_path(@lc_file). Run rake routes (the same way you ran rails console) to see a list of all available routes.
Now if you still have issues, I suggest posting another question, as this one was mainly about CSV parsing and that should be solved using the code I posted at the start of this answer.
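If the uploaded file has a header row, the same CSV.foreach call can return each row keyed by column name. This is a small standalone sketch using only the standard library; csv_rows is a hypothetical helper, and the temporary file stands in for the path carrierwave gives you.

```ruby
require "csv"
require "tempfile"

# Reads a CSV file with a header row and returns an array of hashes,
# one per data row, keyed by the header names.
def csv_rows(path)
  CSV.foreach(path, headers: true).map(&:to_h)
end

# Demo with a temporary file in place of an uploaded one
Tempfile.create(["contacts", ".csv"]) do |file|
  file.write("name,email\nAda,ada@example.com\n")
  file.flush
  csv_rows(file.path).each { |row| puts row.inspect }
end
```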

How to save a raw_data photo using paperclip

I'm using jpegcam to allow a user to take a webcam photo to set as their profile photo. This library ends up posting the raw data to the server, which I get in my rails controller like so:
def ajax_photo_upload
  # Rails.logger.info request.raw_post
  @user = User.find(current_user.id)
  @user.picture = File.new(request.raw_post)
This does not work and paperclip/rails fails when you try to save request.raw_post.
Errno::ENOENT (No such file or directory - ????JFIF???
I've seen solutions that make a temporary file but I'd be curious to know if there is a way to get Paperclip to automatically save the request.raw_post w/o having to make a tempfile. Any elegant ideas or solutions out there?
UGLY SOLUTION (Requires a temp file)
class ApiV1::UsersController < ApiV1::APIController
  def create
    File.open(upload_path, 'w:ASCII-8BIT') do |f|
      f.write request.raw_post
    end
    current_user.photo = File.open(upload_path)
  end
  private
  def upload_path # is used in upload and create
    file_name = 'temp.jpg'
    File.join(::Rails.root.to_s, 'public', 'temp', file_name)
  end
end
This is ugly as it requires a temporary file to be saved on the server. Tips on how to make this happen w/o the temporary file needing to be saved? Can StringIO be used?
The problem with my previous solution was that the temp file was already closed and therefore could not be used by Paperclip anymore. The solution below works for me. It's IMO the cleanest way and (as per documentation) ensures your tempfiles are deleted after use.
Add the following method to your User model:
def set_picture(data)
  temp_file = Tempfile.new(['temp', '.jpg'], :encoding => 'ascii-8bit')
  begin
    temp_file.write(data)
    self.picture = temp_file # assumes has_attached_file :picture
  ensure
    temp_file.close
    temp_file.unlink
  end
end
Controller:
current_user.set_picture(request.raw_post)
current_user.save
Don't forget to add require 'tempfile' at the top of your User model file.
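The write-then-clean-up pattern above can also be wrapped in a small block helper. The sketch below uses only the standard library; with_binary_tempfile is a hypothetical name, and whether the attachment library has finished reading the file before the block ends is an assumption you would need to check for your Paperclip version.

```ruby
require "tempfile"

# Writes raw bytes to a temporary file, yields the rewound file to the block,
# and always closes and deletes the file afterwards.
def with_binary_tempfile(data, extension: ".jpg")
  file = Tempfile.new(["upload", extension])
  file.binmode       # avoid any encoding or newline translation
  file.write(data)
  file.rewind        # flush the write and move back to the start for readers
  yield file
ensure
  if file
    file.close
    file.unlink
  end
end

jpeg_like = "\xFF\xD8\xFFJFIF".b
size = with_binary_tempfile(jpeg_like) { |file| file.read.bytesize }
puts size
```

Inside the model this might be used as with_binary_tempfile(data) { |file| self.picture = file }.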

Resources