Upload, process and export CSV without Database - ruby-on-rails

I am building a very simple web tool where a user can upload a CSV file, which is then processed, and the resulting CSV can be downloaded immediately.
The upload form:
<%= form_tag '/upload', multipart: true do %>
<%= file_field_tag :csv %>
<%= submit_tag 'Import CSV' %>
<% end %>
The upload and download actions:
def upload
original_csv = params[:csv]
p original_csv.path # /var/folders/71/chp2vrc92_19b3jt2fcwhvp80000gn/T/RackMultipart20181025-11469-25guh5.csv
redirect_to result_path(file_path: original_csv.path)
end
def result
p params[:file_path] # /var/folders/71/chp2vrc92_19b3jt2fcwhvp80000gn/T/RackMultipart20181025-11469-25guh5.csv
output_csv = CSV.generate do |csv|
CSV.foreach(params[:file_path], headers: true) do |row|
#############################################
# "No such file or directory # rb_sysopen" #
# exception is thrown #
#############################################
# each row data is being processed here
csv << row
end
end
# Download the file into user's computer
send_data output_csv
end
As you can see from the comments, this method doesn't work because the temp file path no longer exists in the result action. How can I go about this without touching the db at all?

Uploaded files are stored as temp files by the application. That means once the request has ended the temp file is automatically deleted. Therefore it doesn't exist anymore when the next page is requested.
One option would be to copy the file yourself to another location, making it a "real" file in the file system that is no longer deleted automatically. But that has downsides too: you are now responsible for managing and deleting these files. That means you need to generate unique file names and pass them to the next request, and you need to ensure that each file is deleted after it has been downloaded; otherwise these files will slowly consume all the space on your server's disk. Furthermore, this doesn't scale to multiple servers and will only work for small applications running on one server.
A better option might be to just do the upload, the processing and the download in one request, without any redirect. As long as the processing can be done in a reasonable time and in memory this might be a good option to avoid complexity.
def upload
original_csv = params[:csv]
output_csv = CSV.generate do |csv|
CSV.foreach(original_csv.path, headers: true) do |row|
# process data
csv << row
end
end
send_data output_csv
end
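A minimal sketch of the single-request approach, with the CSV work pulled into a plain method so it can be tried outside Rails. The helper name process_csv and the upcasing step are placeholders, not part of the original answer, and a Tempfile stands in for the uploaded file. Note that with headers: true, appending a row writes only its fields, so the header line is written explicitly here.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: reads the CSV at input_path and returns the
# processed CSV as a String.
def process_csv(input_path)
  CSV.generate do |out|
    headers_written = false
    CSV.foreach(input_path, headers: true) do |row|
      # Write the header line once; it is not emitted automatically.
      unless headers_written
        out << row.headers
        headers_written = true
      end
      # Placeholder processing step: upcase every value.
      out << row.fields.map { |value| value.to_s.upcase }
    end
  end
end

# Simulate the uploaded temp file.
upload = Tempfile.new(['upload', '.csv'])
upload.write("name,city\nalice,paris\n")
upload.close

result = process_csv(upload.path)
upload.unlink
```

In the controller action this would be called as send_data process_csv(params[:csv].path), filename: 'result.csv', type: 'text/csv', all within the same request that received the upload.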

Try this:
def upload
# pass the temp file's path, since CSV.foreach expects a path
result(params[:csv].path)
end
def result(fpath=params[:file_path])
output_csv = CSV.generate do |csv|
CSV.foreach(fpath, headers: true) do |row|
csv << row
end
end
# Download the file into user's computer
send_data output_csv
end

Related

Save Paperclip image with Sidekiq

I'm trying to save Paperclip-uploaded images through a Sidekiq worker.
The user selects the images into image[], which is passed to the controller (as params[:image]).
View:
<%= file_field_tag "image[]", type: :file, multiple: true %>
When I assign it to another variable (file), it works in the controller, but when I pass it to the Sidekiq worker, it turns into a hash of strings.
Controller:
file = params[:image]
SidekiqWorker.perform_async("import_images", file)
SidekiqWorker
def perform(type, file)
case type
when "import_images"
file.each do |picture|
puts picture.class
puts picture.original_filename
end
Product.import_images(file)
puts "SIDEKIQ_WORKER: IMPORTING IMAGES"
end
end
How can I pass a hash of image-hashes? Or how can I achieve what I want to do?
After that, the images are processed in a model, but the hash has already turned into a string, so it does not work.
def self.import_images(file)
file.each do |picture|
@product = Product.where(code: File.basename(picture.original_filename, ".*"))
if(!@product.nil?)
@product.update(:image=> picture)
end
end
end
Thank you for your help :)
So, here is what I did to make it work.
After the user uploads the files, the controller saves them in a folder and collects the name of each image in an array.
when "Import images"
file = Array.new
params[:image].each do |picture|
File.open(Rails.root.join('public/system/products', 'uploaded', picture.original_filename), 'wb') do |f|
f.write(picture.read)
end
file.push picture.original_filename
end
SidekiqWorker.perform_async("import_images", file,0,0)
redirect_to products_url, notice: "#{t 'controllers.products.images'}"
After that, the list of names goes to my Sidekiq worker and on to my model, where I open each image and look up the product whose code equals the image's file name. After processing, the uploaded file is deleted :)
def self.import_images(file)
file.each do |image|
uploaded = open("public/system/products/uploaded/"+image)
prd = Product.where(code: File.basename(uploaded, '.*'))
if !prd.nil?
prd.update(image: uploaded)
end
File.delete("public/system/products/uploaded/"+image)
end
end
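The reason this workaround is needed is that Sidekiq serializes job arguments to JSON, so only simple values such as strings, numbers, arrays and hashes survive the trip to the worker. The sketch below imitates that round trip in plain Ruby. FakeUpload is a hypothetical stand-in for an uploaded file object, not the Rails class, and the file names are invented for illustration.

```ruby
require 'json'

# Hypothetical stand-in for an uploaded file object.
FakeUpload = Struct.new(:original_filename, :tempfile_path)

uploads = [FakeUpload.new('A100.png', '/tmp/RackMultipart-example')]

# A rich object does not survive a JSON round trip: what comes back
# is a plain String, with no original_filename or read methods.
round_tripped = JSON.parse(JSON.generate(uploads))

# File names are plain strings, so they arrive in the worker unchanged.
names = ['A100.png', 'B200.png']
names_back = JSON.parse(JSON.generate(names))
```

Passing file names (or paths) and reopening the files inside the worker, as the answer above does, keeps the job arguments JSON-safe.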

How can I parse a local CSV file with Rails?

I'm trying to use smarter_csv to parse csv files with my Rails app. But the documentation only explains how to parse a file that already belongs to the app.
I want to parse a file that's stored locally on my computer. So I think I have to upload the file, parse it, and then delete it.
This is how far I got:
<%= form_tag({action: :upload}, multipart: true) do %>
<%= file_field_tag :csv %>
<%= submit_tag 'Submit' %>
<% end %>
So then how can I reference and use the uploaded file in my controller action?
def upload
#save file temporarily to app
filename = #filename
#parse file with smarter_csv
#File.delete(filename)
end
To get the file path as a string you need to do the following:
filename = params[:csv].path
because params[:csv] is an UploadedFile object. You don't need to handle the temp file yourself, i.e. store and delete it; Rails does that for you. As the documentation puts it:
Uploaded files are temporary files whose lifespan is one request. When the object is finalized Ruby unlinks the file, so there is no need to clean them with a separate maintenance task.
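A small sketch of what that lifespan means in practice, using a plain Tempfile in place of the upload and the standard CSV library in place of smarter_csv (both substitutions are assumptions made so the snippet runs on its own). The path is an ordinary String that any parser can open, but only while the temp file still exists.

```ruby
require 'csv'
require 'tempfile'

# Stand-in for the temp file behind params[:csv].
tmp = Tempfile.new(['upload', '.csv'])
tmp.write("code,price\nA100,9.5\n")
tmp.flush

filename = tmp.path # a plain String path, as params[:csv].path would be

# Parse while the file still exists, i.e. during the request.
rows = CSV.read(filename, headers: true).map(&:to_h)

# Closing and unlinking imitates what happens once the request is over.
tmp.close!
exists_after = File.exist?(filename)
```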

Rails: Permission denied for File.delete

I am creating a very simple web app that allows the user to upload a .zip file that I temporarily save in the tmp folder inside my application, parse the contents using zipfile and then delete the file after I'm done.
I managed to upload the file and copy it to the tmp folder, I can successfully parse it and get the results I want, but when I try to delete the file I get a permission denied error.
here's my view:
<%= form_tag({action: :upload}, multipart: true) do %>
<%= file_field_tag :software %>
<br/><br/>
<%= submit_tag("UPLOAD") %>
<% end %>
And here's my controller:
def upload
@file = params[:software]
@name = @file.original_filename
File.open(Rails.root.join('tmp', @name), 'wb') do |file|
file.write(@file.read)
end
parse
File.delete("tmp/#{@name}")
render action: "show"
end
I have tried using FileUtils.rm("tmp/#{@name}") as well, and I also tried setting File.chmod(0777, "tmp/#{@name}") before deletion, but to no avail. Changing the deletion path to Rails.root.join('tmp', @name), as in the File.open block, also doesn't fix it. I can delete the file via the console without problems, so I don't know what the matter can be.
EDIT: The parse method:
def parse
require 'zip'
Zip::File.open("tmp/#{@name}") do |zip_file|
srcmbffiles = File.join("**", "src", "**", "*.mbf")
entry = zip_file.glob(srcmbffiles).first
@stream = entry.get_input_stream.read
puts @stream
end
end
The issue was that for some reason my file was not being closed for deletion in either the File.open block or the Zip::File.open block. My solution was to close it manually and avoid using open blocks, changing this snippet:
File.open(Rails.root.join('tmp', @name), 'wb') do |file|
file.write(@file.read)
end
into this:
f = File.open(Rails.root.join('tmp', @name), 'wb+')
f.write(@file.read)
f.close
and changing my parse method from this:
def parse
require 'zip'
Zip::File.open("tmp/#{@name}") do |zip_file|
srcmbffiles = File.join("**", "src", "**", "*.mbf")
entry = zip_file.glob(srcmbffiles).first
@stream = entry.get_input_stream.read
puts @stream
end
end
to this:
def parse
require 'zip'
zf = Zip::File.open("tmp/#{@name}")
srcmbffiles = File.join("**", "src", "**", "*.mbf")
entry = zf.glob(srcmbffiles).first
@stream = zf.read(entry)
puts @stream
zf.close()
end
Notice that I changed the way I populate @stream because apparently entry.get_input_stream also locks the file you're accessing.
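The same write, use, delete sequence can also be sketched with an ensure clause, so the copy is removed even if parsing raises. This is only an illustration under assumptions: with_saved_copy is a hypothetical helper name, a temporary directory stands in for the app's tmp folder, and File.size stands in for the zip parsing. File.binwrite opens, writes and closes the file in one call, so no handle is left open when the delete runs.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: saves data under dir, yields the saved path,
# and always deletes the copy afterwards.
def with_saved_copy(dir, name, data)
  # File.basename drops any directory parts from a client-supplied name.
  path = File.join(dir, File.basename(name))
  File.binwrite(path, data) # the handle is closed before this returns
  yield path
ensure
  File.delete(path) if path && File.exist?(path)
end

dir = Dir.mktmpdir
size = with_saved_copy(dir, 'software.zip', 'fake zip bytes') do |path|
  File.size(path) # stand-in for the real parsing step
end
leftover = Dir.children(dir)
FileUtils.remove_entry(dir)
```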
The writing process may still be locking the file. You may have to wait until that process is complete.
"tmp/#{@name}" is not the right path. Just use Rails.root.join('tmp', @name).

Rails 4 - Delayed_Job for CSV import

I'm building a marketplace app in Rails 4 where sellers can list items to sell. I have a csv import feature so sellers can bulk load products. The import code worked fine on small files but I ran into timeout issues with larger files. So I want to use delayed_job to process these files in the background.
I set up delayed_job up to the point where the job is queued (I see the job in the delayed_job table). But when I run the job, I get an error saying that the file to be imported is not found. It is looking for the file in a temp folder which doesn't exist when the job is run.
How do I save the file (or avoid saving it) in a location where delayed_job can access it? And how do I tell delayed_job where the file is located?
my listings controller:
def import
Listing.import(params[:file], params[:user_id])
redirect_to seller_url, notice: "Products are being imported."
end
my listing model:
class Listing < ActiveRecord::Base
require 'csv'
require 'open-uri'
class << self
def importcsv(file_path)
CSV.foreach(file_path, headers: true, skip_blanks: true) do |row|
#some model processing
end
end
handle_asynchronously :importcsv
end
# My importer as a class method
def self.import(file, user_id)
Listing.importcsv file.path
end
end
Here is form view:
<%= form_tag import_listings_path, multipart: true do %>
<%= file_field_tag :file %>
<%= hidden_field_tag :user_id, current_user.id %>
<%= submit_tag "Import CSV" %>
<% end %>
Presumably the file is a form upload. I think those files only persist while the web request is running. My recommendation is to use FileUtils.copy to copy the file into some location that will exist when your job runs.
So, probably you don't want to handle_asynchronously importcsv, but instead copy the files then call a private method on your model (which will be handled asynchronously) with the new file paths.
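A rough sketch of the copy step, assuming a storage directory that outlives the request. The helper name persist_upload and the directory are hypothetical, a Tempfile stands in for the upload, and FileUtils.cp is used here (FileUtils.copy is an alias for it). The returned path is a plain String, so it can be handed to the background job, which should delete the copy once the import is done.

```ruby
require 'fileutils'
require 'securerandom'
require 'tempfile'
require 'tmpdir'

# Hypothetical helper: copies the upload into storage_dir under a
# unique name and returns the new path.
def persist_upload(temp_path, storage_dir)
  FileUtils.mkdir_p(storage_dir)
  target = File.join(storage_dir, "#{SecureRandom.uuid}.csv")
  FileUtils.cp(temp_path, target)
  target
end

storage_dir = Dir.mktmpdir # stand-in for a directory the job can read

upload = Tempfile.new(['listing', '.csv'])
upload.write("title,price\nLamp,20\n")
upload.close

saved_path = persist_upload(upload.path, storage_dir)
upload.unlink # the request is over and the temp file is gone

# Later, in the job: the copy is still readable.
saved_contents = File.read(saved_path)
FileUtils.remove_entry(storage_dir) # cleanup once processing is finished
```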

is there a rails method to loop through each line of uploaded file? "each_line" is an IO method but it's not working

I'm attempting to upload a csv file, parse it, and spit out a file for S3 or just pass to view. I use a file_field_tag to upload the csv. I thought file_field_tag passes an object that is a subclass of IO and would have all ruby IO methods such as "each_line". I can call "read" on the object (method of IO class) but not "each_line"... so how can I iterate over each line of a file_field_tag upload?
create method of my controller as:
@csv_file = params[:csv_file]
My show view, which throws a no "each_line" method error:
<% @csv_file.each_line do |line| %>
<%= line %>
<% end %>
Yet I can use
<%= @csv_file.read(100) %>
I'm really confused what methods a file_field_tag upload params[] has... each_line, gets don't work... I can't seem to find a list of what I can use.
EDIT
I worked around this by doing:
@csv_file = params[:csv_file].read.to_s
then iterated through with:
<% @csv_file.each_line do |line| %>
<%= line %>
<% end %>
EDIT 2
The file being uploaded repeats the header after lines which don't contain a comma (don't ask)... So I find lines without a comma and call .gets (in my rb script, independent of Rails). Unfortunately I get an error about gets being a private method I can't call. Which goes back to my initial question: aren't files a subclass of IO, with IO methods like readlines and gets?
@file_out = []
@file_in.each_line do |line|
case line
when /^[^,]+$/
@comp = line.to_s.strip
comp_header = @file_in.gets.strip.split('')
@file_out.push(@comp)
end
end
When you post a 'file_field', the param returned to the controller has some special magic hooked in.
I.e. in your case you could do this:
<%= "The following file was uploaded #{params[:csv_file].original_filename}" %>
<%= "Its content type was #{params[:csv_file].content_type}" %>
<%= "The content of the file is as follows: #{params[:csv_file].read}" %>
So those are the three special methods you can call on params[:csv_file], or any param posted as the result of a successful 'file_field_tag' or 'f.file_field' in a view.
Just remember that those are the three extra special things you can do to a param posted as a result of a file_field:
original_filename
content_type
read
You've clearly figured out how to do the read, the original_filename and content_type may help you out in the future.
EDIT
OK, so all you have is the read method, which will read the contents of the file uploaded.
contents = params[:csv_file].read
So now contents is a string containing the contents of the file, but nothing else is known about that file EXCEPT that it's a CSV file. CSV rows are separated by line endings, usually "\n" or "\r\n" (a bare "\r" is possible but less common), so it is safer to iterate with String#each_line than to split on "\r",
so you could do this:
contents = params[:csv_file].read
contents.each_line do |csvline|
# process each line here, e.g. csvline.chomp.split(',')
end
EDIT 2
So here is the takeaway from this post:
When you post a file_field to a controller, the most common thing to do with the contents of the uploaded file is 'read' it into a ruby String. Any additional processing of the uploaded contents must be done on that ruby String returned from the 'read'.
In this particular case, if the uploaded file is ALWAYS a CSV, you can just assume the CSV and start parsing it accordingly. If you expect multiple upload formats, you have to deal with that, for example:
contents = params[:csv_file].read
case params[:csv_file].content_type
when 'text/csv'
contents.each_line do |csvline|
# process each CSV line here
end
when 'application/pdf'
# handle the PDF contents here
end
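Once the contents are in a String, Ruby's standard CSV library can do the line and field splitting, which avoids guessing the line ending and copes with quoted fields that contain commas. A minimal sketch, with an invented sample string standing in for the result of params[:csv_file].read:

```ruby
require 'csv'

# Stand-in for params[:csv_file].read; note the "\r\n" line endings
# and the quoted field containing a comma.
contents = "name,qty\r\nwidget,2\r\n\"bolt, large\",5\r\n"

# CSV.parse detects the row separator and handles the quoting.
rows = CSV.parse(contents, headers: true).map(&:to_h)
```

Each element of rows is a Hash keyed by the header names, so a naive split on "," is not needed.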
