Saving and later importing xls, csv files in rails application - ruby-on-rails

In my Ruby on Rails application, I implemented an import facility following the RailsCast http://railscasts.com/episodes/396-importing-csv-and-excel. It imports the file and updates the database within the same request, which causes problems when the file is too large.
So I want to let users upload a CSV or Excel file and save it in a directory. Then I want some kind of observer that watches that directory and, when a file is created or updated there, loads the contents of that file into the database. I have no idea how to approach this.
Thanks in advance.

I think that the best approach is to use Resque to import and convert in a worker separately from the request.
Suppose you have a controller that accepts the Excel file and stores it in a model, which I'll call Information:
class InformationController < ApplicationController
  def create
    @information = Information.new(params[:information])
    if @information.save
      # Queue the import so it runs in a worker, outside the request cycle
      Resque.enqueue(ImportDataJob, @information.id)
      redirect_to @information, :notice => "Successfully created information for further processing."
    else
      render :new
    end
  end
end
You'll need to make a job, in this case ImportDataJob:
class ImportDataJob
  # Resque needs a queue name on the job class; :import is an arbitrary choice
  @queue = :import

  def self.perform(information_id)
    information = Information.find(information_id)
    # parse information.raw_csv (or whichever attribute holds the uploaded Excel or CSV file)
    # and save the rows into the database where you need them
  end
end
You'll find a full tutorial in Resque RailsCast, where it shows how to add Resque into your existing Rails app.
Note: There's a conflict between README and the actual implementation for Resque. Apparently they want to change the way Resque is called (which is in the readme), but is not implemented yet. See this Issue in Github for more details.
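For the parsing step inside ImportDataJob.perform, a minimal sketch using only Ruby's standard csv library might look like the following. The helper name each_csv_batch and the batch size are assumptions for illustration, not part of Resque or the RailsCast; the point is only that the file is streamed row by row instead of being loaded whole.

```ruby
require "csv"

# Hypothetical helper: stream a CSV file and yield its rows in fixed-size
# batches, so a large upload is never held in memory all at once.
def each_csv_batch(path, batch_size: 500)
  batch = []
  CSV.foreach(path, headers: true) do |row|
    batch << row.to_h
    if batch.size >= batch_size
      yield batch
      batch = []
    end
  end
  # Hand over whatever is left after the last full batch
  yield batch unless batch.empty?
end
```

Inside the job, each batch could then be written to the database in one go; how that is done depends on your schema and is left out here.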

As an alternative to Resque (and I do think doing it in a background job is the best approach), see also Spawnling, previously known as Spawn:
https://github.com/tra/spawnling
Super-low maintenance. In your controller action do something like
@file = <uploaded file here>
spawn do
  # perform some long-running process using @file, in a new process
end
# current thread carries on straight away.
If you need to test whether it has finished (or crashed, for that matter), you can save the id of the new process like so:
@file = <uploaded file here>
spawner = spawn do
  # perform some long-running process using @file, in a new process
end
# current thread carries on straight away.
spawner = spawn do
  # write to a temp file first so serve_if_present doesn't serve a half-made file
  FileUtils.rm @filename if File.exist?(@filename)
  temp_filename = "#{@filename}.temp"
  ldb "temp_filename = #{temp_filename}"
  current_user.music_service.build_all_schools_and_users_xls(:filename => temp_filename)
  FileUtils.mv temp_filename, @filename
end
@spawn_id = spawner.handle
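The temp-file-then-move idea in that last snippet works independently of Spawnling. A minimal sketch with only the standard library, assuming a hypothetical helper name write_then_publish and the same ".temp" suffix convention:

```ruby
require "fileutils"

# Hypothetical helper: build a file under a temporary name and move it into
# place only once it is complete, so readers never see a half-written file.
def write_then_publish(final_path)
  temp_path = "#{final_path}.temp"
  File.open(temp_path, "wb") { |f| yield f }
  FileUtils.mv(temp_path, final_path)
ensure
  # Remove the temp file if the block raised before the move happened
  FileUtils.rm_f(temp_path) if temp_path
end
```

Anything that watches or serves the final path then only ever sees complete files, because the move happens after the last byte is written.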

Related

Active Storage - Bug workaround in initializer not working

I'm trying to work around a known issue in Active Storage where the MIME type of a stored file is incorrectly set, without the ability to override it.
https://github.com/rails/rails/issues/32632
This has been addressed in the master branch of Rails, but it doesn't appear to have been released yet (the project is currently using 5.2.0). Therefore I'm trying to work around the issue using one of the comments provided in the issue:
Within a new initializer (\config\initializers\active_record_fix.rb):
Rails.application.config.after_initialize do
  # Defeat the ActiveStorage MIME type detection.
  ActiveStorage::Blob.class_eval do
    def extract_content_type(io)
      return content_type if content_type
      Marcel::MimeType.for io, name: filename.to_s, declared_type: content_type
    end
  end
end
I'm processing and storing a zip file within a background job using delayed_jobs. The initializer doesn't appear to be getting called. I have restarted the server. I'm running the project locally using heroku local to process background jobs.
Here is the code storing the file:
file.attach(io: File.open(temp_zip_path), filename: 'Download.zip', content_type: 'application/zip')
Any ideas why the code above is not working? Active Storage somewhat randomly decides this ZIP file is a PDF and saves the content type as application/pdf. Unrelated, attempting to manually override the content_type after attaching doesn't work either:
file.content_type = 'application/zip'
file.save # No errors, but record doesn't update the content_type
Try Rails.application.config.to_prepare in place of the after_initialize initialization event.
More info:
https://guides.rubyonrails.org/configuring.html#initialization-events
https://guides.rubyonrails.org/v5.2.0/initialization.html

How can I create a ruby script to parse files uploaded by users?

I want users to be able to upload files, and then I want to parse them, extracting pieces of information and declaring them as global variables to be used by other parts of my web application. I know you can easily put in a file upload form, but where would I store the script for parsing the file? Would it be under models, views, controllers, or somewhere else? Also, how can I tell my application to run this script immediately upon file upload? Would I put it in the view before the form's <% end %> tag? When it does parse the file, how can I make sure the variables (probably arrays) are declared globally so that I can use them in all other parts of my application?
With EventMachine you can watch a folder for file operations and then process them.
The rb-inotify library fits as well:
# Create the notifier
notifier = INotify::Notifier.new
# Run this callback whenever the file path/to/foo.txt is read
notifier.watch("path/to/foo.txt", :access) do
  puts "Foo.txt was accessed!"
end
# Watch for any file in the directory being deleted
# or moved out of the directory.
notifier.watch("path/to/directory", :delete, :moved_from) do |event|
  # The #name field of the event object contains the name of the affected file
  puts "#{event.name} is no longer in the directory!"
end
# Nothing happens until you run the notifier!
notifier.run
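Where inotify is not available (it is Linux-specific), a plain polling loop is a possible fallback. This is a different technique from the event-based watching above, sketched with only the standard library; the helper names scan and changed_files and the "uploads" directory are assumptions for illustration.

```ruby
# Hypothetical polling fallback (not inotify): compare modification times
# between two scans of a directory to find new or changed files.
def scan(dir)
  Dir.glob(File.join(dir, "*"))
     .select { |path| File.file?(path) }
     .to_h { |path| [path, File.mtime(path)] }
end

# Paths that are new in `current` or whose mtime differs from `previous`
def changed_files(previous, current)
  current.select { |path, mtime| previous[path] != mtime }.keys
end

# Usage sketch: remember the last scan and hand changed paths to an importer.
# seen = scan("uploads")
# loop do
#   sleep 5
#   now = scan("uploads")
#   changed_files(seen, now).each { |path| puts "import #{path}" }
#   seen = now
# end
```

Polling is simpler but coarser than inotify: changes are only noticed at the next scan, and a file that is still being written may be picked up before it is complete.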

What's the proper way to copy a carrierwave file from one record to another?

I need to copy a file from one CarrierWave object to another. They are in different tables and use different types of uploaders.
I started with:
user.avatar = image.content
(where user and image are model instances, and avatar and content are the CarrierWave mounted uploaders), which worked sometimes. It seems to work all the time locally with file storage, but only intermittently when using fog and S3.
In a mailing list post I found this code:
user.avatar = image.content.file
that again worked sometimes.
My working solution so far is:
require "open-uri"
begin
  # URI.open is needed on Ruby 3.0+, where Kernel#open no longer handles URLs
  user.avatar = URI.open(image.url)
rescue Errno::ENOENT => e
  begin
    user.avatar = File.open(image.path)
  rescue Errno::ENOENT => e
    # Ok, whatever.
  end
end
which is not only ugly, but fails to pass the extension validation because the opening of a remote file doesn't maintain the extension (jpg, png, etc.).
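On the lost extension specifically: one possible approach is to buffer the opened stream into a Tempfile whose name carries the extension taken from the source URL. This is only a sketch with the standard library; the helper name tempfile_with_extension is hypothetical, and it assumes the uploader derives the extension from the name of the file it is given.

```ruby
require "uri"
require "tempfile"

# Hypothetical helper: copy an already opened IO into a Tempfile whose name
# ends with the extension found in the source URL or path.
def tempfile_with_extension(io, source)
  ext = File.extname(URI.parse(source).path) # e.g. ".jpg"; query string ignored
  file = Tempfile.new(["upload", ext])
  file.binmode
  file.write(io.read)
  file.rewind
  file
end
```

The returned Tempfile could then be assigned in place of the raw open-uri result, for example user.avatar = tempfile_with_extension(URI.open(image.url), image.url).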
Perhaps one way you can do it is to set a remote image URL as per the Carrierwave gem documentation?
user.remote_avatar_url = image.url
Based on the solutions discussed here, I created a simple gem, CopyCarrierwaveFile, to do this.
Usage is something like this:
original_resource = User.last
new_resource = User.new
CopyCarrierwaveFile::CopyFileService.new(original_resource, new_resource, :avatar).set_file
new_resource.save
new_resource.avatar.url # https://...image.jpg
Here's an (albeit hacky) solution that doesn't require an HTTP request to fetch the image:
module UploadCopier
  def self.copy(old, new)
    # clear the cached mounters on the new record
    new.instance_variable_set('@_mounters', nil)
    old.class.uploaders.each do |column, uploader|
      new.send("#{column}=", old.send(column))
    end
  end
end
old_user = User.last
new_user = User.new
UploadCopier.copy(old_user, new_user)
new_user.save
I needed to copy a reference from one model to another model and I was successfully able to do so by doing the following:
my_new_model.update_column('attachment', my_other_model.attributes["attachment"]);
In this scenario, I did not care to actually make a copy of the file, nor did I care that 2 records were now linked to the same file (my system never deletes or modifies files after uploaded).
This may be useful to anyone who wants to just copy the reference to a file from one model to another model using the same uploader.
You can do this by copying the files themselves.
store_path is a CarrierWave method on the Uploader class. It returns the relative path of the uploaded file's folder.
This clone method should be called after the model record is saved.
If the record is not saved, store_path may return the wrong path if your uploader's store_dir includes the model id.
def clone_carrierwave_file(column_name)
  origin_files = Dir[File.join(Rails.root, 'public', original_record.send(column_name).store_path, '*')]
  return if origin_files.blank?
  new_file_folder = File.join(Rails.root, 'public', send(column_name).store_path)
  # mkdir_p also creates missing parent directories and does nothing if the folder exists
  FileUtils.mkdir_p new_file_folder
  FileUtils.cp(origin_files, new_file_folder)
end
Hope it works.
I just wanted to copy an avatar reference from one object to another, and what worked for me was:
objectB.avatar.retrieve_from_store!(objectA.avatar.identifier)
objectB.save

Parsing rails uploaded temporary file more than once

I'm new to Rails and I'm currently trying to parse a file uploaded to my Rails app. However, after I read the file once, I cannot read it again. From what I've read online, it appears that Rails immediately deletes the uploaded file. Is there a way to make the file persistent? My code is as follows:
file_param = params[:sequence]
file_param.read.each do |l|
  # do stuff
end
file_param.read.each do |l|
  # do stuff again. this is not being called.
end
I've thought of using paperclip or some other storage gem, but I don't need to store the files, simply read their contents. Thanks!
Read it into an array, if you really need to go over it multiple times, or just save it.
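The second loop most likely never runs because the first read consumed the stream up to end-of-file, not because the file was deleted. A small sketch of both options, using StringIO as a stand-in for the upload and assuming the upload object supports rewind, as IO-like objects generally do:

```ruby
require "stringio"

# StringIO stands in for the uploaded file here; both are IO-like objects.
upload = StringIO.new("ACGT\nTTGA\n")

first_pass  = upload.read # consumes the stream up to end-of-file
second_pass = upload.read # nothing is left, so this is an empty string

# Option 1: rewind before reading again
upload.rewind

# Option 2: read the lines into an array once and iterate it as often as needed
lines = upload.each_line.map(&:chomp)
lines.each { |line| puts "first loop:  #{line}" }
lines.each { |line| puts "second loop: #{line}" }
```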

How to delete the temporary files automatically in ruby-rails?

My Rails app has to process and generate PDF XFA files and send them to the user's browser.
It's working fine, but before sending the file to the user, it creates 2 files in the Rails tmp directory.
If 10 requests come to the pdf_controller, the tmp directory ends up with twice that many temp files, which eats up disk space.
After searching around I thought that a Sweeper might come to the rescue, but I don't know much about Sweepers.
So, can anyone please suggest which way to go?
Tempfile will delete files when the object is finalized.
Tempfile on Rdoc
Example:
def get_pdf
  model = Model.find(params[:id])
  file = Tempfile.new
  model.to_pdf(file)
  send_file file.path, ...
end
I can provide a better example if you paste your code into your question.
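If you would rather not wait for finalization, the block form of Tempfile.create removes the file as soon as the block ends. A minimal sketch, assuming a hypothetical helper render_to_bytes and a document small enough to hold in memory:

```ruby
require "tempfile"

# Hypothetical helper: let the caller render into a temp file, return the
# file's bytes, and have Tempfile.create delete the file when the block ends.
def render_to_bytes(extension = ".pdf")
  Tempfile.create(["render", extension]) do |file|
    file.binmode
    yield file # caller writes the document into the file
    file.rewind
    file.read
  end
end
```

In a controller, the returned bytes could be passed to send_data instead of send_file, so that no file has to outlive the request.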
You could use a cron task, that deletes the files every n minutes, or, you could order the deletion from the controller itself.
