I'm looking for advice more than pure coding answers on how to uncompress a RAR/ZIP file after upload while preserving data integrity as far as possible.
Here is my problem: my application's users upload files generated by Adobe Edge (we use it for animated ads), which are in RAR format. Uploading the file was trivial. Here is my uploader:
class MediaUploader < CarrierWave::Uploader::Base
storage :file
def store_dir
"uploads/#{ model.class.to_s.underscore }/#{ mounted_as }/#{ ScatterSwap.hash(model.id) }"
end
def extension_white_list
%w(jpg jpeg gif png rar zip)
end
def filename
"#{ secure_token }.#{ file.extension }" if original_filename.present?
end
protected
def secure_token
var = :"@#{ mounted_as }_secure_token"
model.instance_variable_get(var) or model.instance_variable_set(var, SecureRandom.uuid)
end
end
Now, in my case, the RAR file is not actually what I'll be using. What I need are the files contained inside the archive. Those files generally look like this:
- edge_includes
|
- images
|
- js
|
| ADS_1234_988x160_edge.js
| ADS_1234_988x160_edgeActions.js
| ADS_1234_988x160.an
| ADS_1234_988x160.html
From the above example, I need to store a reference to the ADS_1234_988x160.html file in the database.
For this purpose, I was going to use CarrierWave callbacks in order to:
after :store, :uncompress_and_update_reference
def uncompress_and_update_reference(file)
# uncompress and update reference
end
Uncompress the archive (probably using rubyzip)
Get the path to ADS_1234_988x160.html
Update the reference inside the database
Is there any better way to handle it? How to handle failure or network errors? Any ideas are welcome.
I had a similar problem: a zip is uploaded, but the zip itself should not be stored, only its unzipped content. I solved it by implementing a dedicated storage. Something like:
class MyStorage
class App < CarrierWave::Storage::Abstract
def store!(file)
# invoked when file is moved from cache into store
# unzip and update db here
end
def retrieve!(identifier)
MyZipFile.new(identifier)
end
end
end
class MyZipFile
def initialize(identifier)
@identifier = identifier
end
def path
file.path
end
def delete
# delete unzipped files
end
def file
@file ||= zip_file
end
def zip_file
# create zip file somewhere in tmp dir
end
end
# in the uploader file, either pass the class:
storage MyStorage
# or a symbol:
storage :my_storage
If a symbol is used, then it needs to be defined in the CarrierWave config:
CarrierWave.configure do |config|
config.storage_engines.merge!(my_storage: 'MyStorage')
end
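On the question of handling failures: one common approach is to unpack into a staging directory and only move the result into place once extraction has finished. Below is a minimal sketch using only the standard library. The helper names `extract_atomically` and `entry_point` are hypothetical, and the actual unpacking (for example with rubyzip) is left to the block passed by the caller.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: yields a fresh staging directory and moves it to
# target_dir only if the block finished without raising.
def extract_atomically(target_dir)
  staging = Dir.mktmpdir('archive')
  yield staging # the caller unpacks the archive into this directory
  FileUtils.rm_rf(target_dir)
  FileUtils.mkdir_p(File.dirname(target_dir))
  FileUtils.mv(staging, target_dir)
  target_dir
rescue StandardError
  # On any failure, discard the partial extraction and re-raise.
  FileUtils.rm_rf(staging) if staging
  raise
end

# Hypothetical helper: returns the single top-level .html file of an
# extracted archive, or raises if there is not exactly one.
def entry_point(dir)
  html = Dir.glob(File.join(dir, '*.html'))
  raise ArgumentError, "expected one .html file, found #{html.size}" unless html.size == 1
  html.first
end

# Usage: the block simulates a successful extraction.
root = Dir.mktmpdir('demo')
target = File.join(root, 'ads', '42')
extract_atomically(target) do |dir|
  FileUtils.mkdir_p(File.join(dir, 'js'))
  File.write(File.join(dir, 'ADS_1234_988x160.html'), '<html></html>')
end
puts File.basename(entry_point(target))
```

Because the target directory only appears after a successful extraction, the database reference can be updated as the last step, and a failed upload leaves nothing half-written behind.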
In our Rails 6 project I want to export Active Record data to an Excel file, save the xls file to an S3 bucket without storing the xls data in the local database, send a link to the file by email, and let the recipient download the xls from that email. Please help me.
1. Create a model which holds an exported file
# app/models/csv_export.rb
class CsvExport < ApplicationRecord
has_one_attached :file
# ...
end
Configure ActiveStorage so that it uses S3 as a provider. See https://medium.com/alturasoluciones/setting-up-rails-5-active-storage-with-amazon-s3-3d158cf021ff
2. Create an export job which creates a new CsvExport with data
# app/jobs/csv_export_job.rb
require 'csv'
class CsvExportJob < ApplicationJob
queue_as :default
def perform(csv_export_id)
csv_export = CsvExport.find_by(id: csv_export_id)
csv_export.file.attach \
io: StringIO.new(csv_string), # add csv_string call here
filename: filename
# ...
end
private
# ...
CSV_COLUMNS = %w[id name email].freeze
def csv_string
CSV.generate(headers: true) do |csv|
csv << CSV_COLUMNS
# or whatever data you want to export
User.all.each do |contact|
csv << CSV_COLUMNS.map { |col| contact.send(col) }
end
end
end
end
3. Trigger job
e.g. from a controller action
csv_export = CsvExport.create(status: :started)
CsvExportJob.perform_later csv_export.id
I hope these examples point you in the right direction. If you need more detailed help, you can look at this article: https://railsbyexample.com/export-records-to-csv-files-using-activestorage/
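The in-memory part of the job above can be tried without Rails. This is a minimal sketch, assuming records that respond to `id`, `name` and `email`; the `contact` Struct and the `columns` default are stand-ins for the `User` model and `CSV_COLUMNS`, not part of the original answer.

```ruby
require 'csv'
require 'stringio'

# Builds the CSV as a string; nothing is written to disk.
def csv_string(records, columns = %w[id name email])
  CSV.generate do |csv|
    csv << columns
    records.each do |record|
      csv << columns.map { |col| record.public_send(col) }
    end
  end
end

# Hypothetical stand-in for User records.
contact = Struct.new(:id, :name, :email)
records = [
  contact.new(1, 'Ada', 'ada@example.com'),
  contact.new(2, 'Bob, Jr.', 'bob@example.com')
]

# An IO-like object of this kind is what the io: option of attach expects.
io = StringIO.new(csv_string(records))
puts io.read
```

Wrapping the string in a `StringIO` is what lets the file go straight to the storage service without a temporary file on the application server.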
After the file is uploaded, I want to analyze and process it immediately.
I'm currently attaching the files and then processing each one:
current_account.archives.attach(archive_params)
current_account.archives.each do |archive|
Job.enqueue(AccountArchiveImportJob.new(current_account.id, archive.id))
end
In the job I'm opening the CSV and parsing it:
attachment = Account.find(account_id).archives.where(id: archive_id).first
CSV.parse(attachment.download) do |row|
do_stuff_with_the_row(row)
end
I would like to do something like:
CSV.foreach(attachment.open) do |row|
do_stuff_with_the_row(row)
end
I cannot find documentation on how to convert the attachment back into a File.
At least from Rails 6.0 rc1:
model.attachment_changes['attachment_name'].attachable
will give you the IO of the original Tempfile before it is uploaded.
In Rails 6 we will get a download method that yields a file, but you can get the same behavior very easily today.
Add this downloader.rb file as an initializer.
Then given this model
class Business < ApplicationRecord
has_one_attached :csvfile
end
you can do
ActiveStorage::Downloader.new(csvfile).download_blob_to_tempfile do |file|
CSV.foreach(file.path, headers: true) do |row|
do_something_with_each_row(row.to_h)
end
end
EDIT: not sure why it took me so long to find service_url. It is much simpler, but it has been noted that service_url should not be shown to users.
# needs require 'open-uri'; on Ruby 3.0+ plain Kernel#open no longer opens URLs
URI.open(csvfile.service_url)
From Rails 5.2 official guide
class VirusScanner
include ActiveStorage::Downloading
attr_reader :blob
def initialize(blob)
@blob = blob
end
def scan
download_blob_to_tempfile do |file|
system 'scan_virus', file.path
end
end
end
So you can do
include ActiveStorage::Downloading
attr_reader :blob
def initialize(blob)
@blob = blob
end
def perform
download_blob_to_tempfile do |file|
CSV.foreach(file.path, headers: true) do |row|
do_something_with_each_row(row.to_h)
end
end
end
You can get the file path from the attachment, and then open the file.
path = ActiveStorage::Blob.service.send(:path_for, attachment.key)
File.open(path) do |file|
#...
end
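The answers above all come down to the same pattern: get the attachment's bytes into a temporary file, then stream it row by row with `CSV.foreach`. Here is a minimal sketch of that pattern using only the standard library. The `each_csv_row` helper is hypothetical, and the string passed in stands in for what `attachment.download` would return. On Rails 6 and later, `attachment.blob.open` yields a tempfile directly, which makes the manual write unnecessary.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: writes the given bytes to a temporary file and
# yields each CSV row as a Hash. The file is removed when the block ends.
def each_csv_row(bytes)
  Tempfile.create(['attachment', '.csv']) do |file|
    file.binmode
    file.write(bytes)
    file.flush
    CSV.foreach(file.path, headers: true) { |row| yield row.to_h }
  end
end

rows = []
each_csv_row("name,email\nAda,ada@example.com\n") { |row| rows << row }
p rows
```

Compared with `CSV.parse(attachment.download)`, parsing from a file path avoids building the whole parsed table in memory, although the download itself is still held as one string in this sketch.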
In my Rails 4.2 application I am using RubyZIP to create a controller action similar to the following:
class SomeController < ApplicationController
def some_action
file_stream = Zip::ZipOutputStream.write_buffer do |zip|
zip.put_next_entry "dir1/hello.txt"
zip.print "Hello"
zip.put_next_entry "dir2/hello.txt"
zip.print "World"
end
file_stream.rewind
respond_to do |format|
format.zip do
send_data file_stream.read, filename: "zip_file.zip"
end
end
end
end
In the example two files are dynamically created and written to, then saved into a ZIP file.
But how can I add a file that already exists (!) to the ZIP file as well, e.g. a PDF file from my /app/assets/documents folder?
This should be much easier to achieve but I can't find any documentation on it.
Thanks for any help.
zip_file = File.new(zip_file_path, 'w')
Zip::File.open(zip_file.path, Zip::File::CREATE) do |zip|
zip.add(file_name, file_path)
end
zip_file
Here, file_name and file_path are the name and path of the file you want to add to your zip file, and zip_file_path is the path of the zip file itself. Hope that helps!
I'm using Carrierwave and Fog gems to store a file to my Amazon S3 bucket (to /files/file_id.txt). I need to store a slightly different version of the file to a different location in the bucket (/files/file_id_processed.txt) at the same time (right after the original is stored). I don't want to create a separate uploader attribute for it on the model - is there any other way?
This is my current method that stores the file:
def store_file(document)
file_name = "tmp/#{document.id}.txt"
File.open(file_name, 'w') do |f|
document_content = document.content
f.puts document_content
document.raw_export.store!(f)
document.save
end
# I need to store the document.processed_content
File.delete(file_name) if File.exist?(file_name)
end
This is the Document model:
class Document < ActiveRecord::Base
mount_uploader :raw_export, DocumentUploader
# here I want to avoid adding something like:
# mount_uploader :processed_export, DocumentUploader
end
This is my Uploader class:
class DocumentUploader < CarrierWave::Uploader::Base
storage :fog
def store_dir
"files/"
end
def extension_white_list
%w(txt)
end
end
This is what my final solution looks like (roughly), based on Nitin Verma's answer:
I had to add a custom processor method for the version to the Uploader class:
# in document_uploader.rb
...
version :processed do
process :do_the_replacements
end
def do_the_replacements
original_content = @file.read
File.open(current_path, 'w') do |f|
f.puts original_content.gsub('Apples','Pears')
end
end
Considering that you need a similar file but with a different name,
you need to create a version for the file in the uploader.
version :processed do
process
end
The second file's name will then be processed_{original_file}.extension. If you want to change the name of the second file, see https://github.com/carrierwaveuploader/carrierwave/wiki/How-to:-Customize-your-version-file-names
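The body of the `do_the_replacements` processor shown earlier can be exercised outside CarrierWave. This is a minimal sketch with a hypothetical `replace_in_file` helper. It uses `File.write`, so unlike the `f.puts` in the original it does not append an extra trailing newline.

```ruby
require 'tempfile'

# Hypothetical helper: rewrites the file at path in place, replacing
# every occurrence of pattern with replacement.
def replace_in_file(path, pattern, replacement)
  content = File.read(path)
  File.write(path, content.gsub(pattern, replacement))
end

Tempfile.create(['document', '.txt']) do |file|
  file.write("Apples and more Apples\n")
  file.flush
  replace_in_file(file.path, 'Apples', 'Pears')
  puts File.read(file.path)
end
```

Inside a version block the path to rewrite would be the uploader's `current_path`, as in the processor above.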
I'm using Carrierwave to upload files, and I have it working.
My issue is attempting to change the name of the uploaded file.
In the generated uploader.rb there is a method I think I should be using
def filename
"something.jpg" if original_filename
basename = "what" + original_filename if original_filename # works
basename = (0...8).map{65.+(rand(25)).chr}.join if original_filename # will create a random name for each version, e.g. the orginal, the thumb, and the filename in the db, useless
end
I can't seem to access items like 'extension' or 'content_type' in sanitized_file.rb, so this is a bit beyond my current skill level right now.
Any suggestions or exercises for doing this, i.e. generate filename for an uploaded file that works as well as the carrierwave default (do nothing, but does carry on to each version)? Seems like it should be simple enough but I've stumbled over this.
Well, another problem with your random filename generator is that it's possible to have collisions isn't it? You could possibly generate a filename that was already generated.
One way to go about it would be to somehow generate a hash based on unique properties of the image, like file path. An example, from the carrierwave group:
def filename
if original_filename
@name ||= Digest::MD5.hexdigest(File.dirname(current_path))
"#{@name}.#{file.extension}"
end
end
This will create an MD5 hash based on the current path and then append the original file's extension to it.
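The same idea can be checked in isolation. This is a minimal sketch with a hypothetical `hashed_filename` helper and made-up paths. Note that the digest is taken from the directory, not the file name, so the approach presumably relies on every upload being cached in its own directory.

```ruby
require 'digest'

# Hypothetical helper: names a file after the MD5 digest of its
# directory, keeping the original extension.
def hashed_filename(current_path)
  name = Digest::MD5.hexdigest(File.dirname(current_path))
  "#{name}#{File.extname(current_path)}"
end

puts hashed_filename('/tmp/cache/1369894322-345-1234-2255/photo.jpg')
puts hashed_filename('/tmp/cache/1369894399-345-1234-9999/photo.jpg')
```

Two files in the same directory would receive the same name, which is why this only works when the directory itself is unique per upload.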
Edit: The carrierwave wiki added an entry with a few methods on how to create random and unique filenames for all versioned files.
To have a truly unique filename (not just almost unique), I recommend using the uuid gem.
in Gemfile add:
gem 'uuid'
in file_uploader.rb:
def filename
if original_filename
if model && model.read_attribute(mounted_as).present?
model.read_attribute(mounted_as)
else
@name ||= "#{mounted_as}-#{uuid}.#{file.extension}"
end
end
end
protected
def uuid
UUID.state_file = false
uuid = UUID.new
uuid.generate
end
From the Google Group:
def filename
@name ||= "#{secure_token}.#{file.extension}" if original_filename
end
private
def secure_token
ivar = "@#{mounted_as}_secure_token"
token = model.instance_variable_get(ivar)
# ActiveSupport::SecureRandom was deprecated in favor of Ruby's own SecureRandom
token ||= model.instance_variable_set(ivar, SecureRandom.hex(4))
end
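The reason the token is stored on the model rather than generated inside `filename` is that `filename` is called once per version, so a fresh random value each time gives every version a different name. Here is a minimal sketch of the memoization alone, with a plain `Object` standing in for the model and `secure_token` rewritten as a free-standing method that takes the model and mount name as arguments (both are assumptions for illustration).

```ruby
require 'securerandom'

# Returns the token stored on the model for this mount point,
# generating and storing one on first use.
def secure_token(model, mounted_as)
  ivar = :"@#{mounted_as}_secure_token"
  model.instance_variable_get(ivar) ||
    model.instance_variable_set(ivar, SecureRandom.uuid)
end

model = Object.new # hypothetical stand-in for an ActiveRecord model
first = secure_token(model, :image)
second = secure_token(model, :image)
puts first == second
```

Every call for the same model and mount point returns the same token, so the original file, its versions, and the value saved in the database all agree.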
To just make the record.id prefix the filename you can do the following:
class MyUploader < CarrierWave::Uploader::Base
storage :file
def store_dir
model.class.to_s.underscore.pluralize
end
def filename
model.id ? "#{model.id}-#{original_filename}" : original_filename
end
def url
"/#{store_dir}/#{model.id}-#{model.file_before_type_cast}"
end
end
The other solution looks good, but how I did it then was to have a hook that created a random string for a new name on instance creation, then:
def filename
"#{model.randomstring}.#{model.image.file.extension}"
end
in the uploader.
That worked, putting the random name generation as part of the model, then having carrierwave use that.
I am curious which is faster, more effective, reasonable, sound, etc.
Here is a solution for changing the name of the file when store_dir already contains a file with exactly the same name:
if File.exist?(Rails.root.join("documents/" + "#{file.filename}")) && !path.to_s.eql?(Rails.root.join("documents/" + original_filename).to_s)
@name ||= File.basename(original_filename, '.*') + Digest::MD5.hexdigest(File.dirname(current_path)).from(25)
"#{@name}.#{file.extension}"
else
"#{original_filename}"
end
Note: Rails.root.join("documents/") is defined as my store_dir.
Hope it helps someone.
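The "rename only on collision" idea can also be sketched without Rails. The `available_filename` helper below is hypothetical, and it uses a random suffix rather than the MD5 slice from the snippet above. The check and the later write are not atomic, so two simultaneous uploads could still race.

```ruby
require 'securerandom'
require 'tmpdir'

# Hypothetical helper: returns original_filename if it is free in dir,
# otherwise the same name with a random suffix before the extension.
def available_filename(dir, original_filename)
  return original_filename unless File.exist?(File.join(dir, original_filename))

  ext = File.extname(original_filename)
  base = File.basename(original_filename, ext)
  loop do
    candidate = "#{base}-#{SecureRandom.hex(4)}#{ext}"
    return candidate unless File.exist?(File.join(dir, candidate))
  end
end

Dir.mktmpdir('documents') do |dir|
  puts available_filename(dir, 'report.pdf') # name is free, returned as-is
  File.write(File.join(dir, 'report.pdf'), 'existing file')
  puts available_filename(dir, 'report.pdf') # name is taken, suffix added
end
```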