Export large mount of data in a zip - ruby-on-rails

I'm exporting some data from my server to the client.
It's an zip archive but when the amount of data is to big : TimeOut !
#On my controller
def export
filename = 'my_archive.zip'
temp_file = Tempfile.new(filename)
begin
Zip::OutputStream.open(temp_file) { |zos| }
Zip::File.open(temp_file.path, Zip::File::CREATE) do |zip|
#videos.each do |v|
video_file_name = v.title + '.mp4'
zip.add(video_file_name, v.source.file.path(:original))
end
end
zip_data = File.read(temp_file.path)
send_data(zip_data, :type => 'application/zip', :filename => filename)
ensure
temp_file.close
temp_file.unlink
end
end
I'm using PaperClip to attach my video on my app.
Is there any way to create and upload the zip (with a stream?) without a too long wait?

You could try the zipline gem. It claims to be "Hacks on Hacks on Hacks" so heads up! Looks very easy to use though, worth a shot.

Related

Issue with Zip file exports in Ruby / Rails

I am having an issue with my rails action that takes binary (blob) files from my database and packages them up into a nice zip file and then finally sends it out for download. The problem occurs when I try to unzip the file, it says something like "Unable to expand; Error 1 - Operation not permitted." I believe that means that the file is corrupted but I don't know what I am doing wrong. I've included my code below, any help would be greatly appreciated. Thanks!
require 'zip/zip'
require 'zip/zipfilesystem'
def export
#layers = Layer.where('group_id > 1')
temp = Tempfile.new("layers-zip-export")
Zip::ZipOutputStream.open(temp.path) do |zipfile|
#layers.each do |layer|
zipfile.put_next_entry(layer.name)
file = Tempfile.new("temp-" + layer.id.to_s)
file.binmode
file << layer.file
file.rewind
zipfile.write IO.binread(file.path)
file.close
file.unlink
end
end
send_file temp.path, :type => 'application/zip', :filename => "layer-export.zip"
temp.close
end

Invalid encoding with rqrcode

I'm having an invalid encoding error that doesn't let me save the image to a carrierwave uploader.
require 'rqrcode_png'
img = RQRCode::QRCode.new( 'test', :size => 4, :level => :h ).to_img.to_s
img.valid_encoding?
=> false
I'm not sure if this is what you're looking for, in my case I needed to associate the generated QR code with a Rails model using carrierwave, what I ended up doing was saving the image to a temp file, associating that file with the model and afterwards deleting the temp file, here's my code:
def generate_qr_code!
tmp_path = Rails.root.join('tmp', "some-filename.png")
tmp_file = RQRCode::QRCode.new(self.hash_value).to_img.resize(200,200).save(tmp_path)
# Stream is handed closed, we need to reopen it
File.open(tmp_file.path) do |file|
self.qr_code = file
end
File.delete(tmp_file.path)
self.save!
end

Downloading and zipping files that were uploaded to S3 with CarrierWave

I have a small Rails 3.2.1 app that uses CarrierWave 0.5.8 for file uploads to S3 (using Fog)
I want users to be able to select some images that they'd like to download, then zip them up and send them a zip. Here is what I've come up with:
def generate_zip
#A collection of Photo objects. The Photo object has a PhotoUploader mounted.
photos = Photo.all
tmp_filename = "#{Rails.root}/tmp/" << Time.now.strftime('%Y-%m-%d-%H%M%S-%N').to_s << ".zip"
zip = Zip::ZipFile.open(tmp_filename, Zip::ZipFile::CREATE)
zip.close
photos.each do |photo|
file_to_add = photo.photo.file
zip = Zip::ZipFile.open(tmp_filename)
zip.add("tmp/", file_to_add.path)
zip.close
end
#do the rest.. like send zip or upload file and e-mail link
end
This doesn't work because photo.photo.file returns an instance of CarrierWave::Storage::Fog::File instead of a regular file.
EDIT: The error this leads to:
Errno::ENOENT: No such file or directory - uploads/photos/name.jpg
I also tried the following:
tmp_filename = "#{Rails.root}/tmp/" << Time.now.strftime('%Y-%m-%d-%H%M%S-%N').to_s << ".zip"
zip = Zip::ZipFile.open(tmp_filename, Zip::ZipFile::CREATE)
zip.close
photos.each do |photo|
processed_uri = URI.parse(URI.escape(URI.unescape(photo.photo.file.authenticated_url)).gsub("[", "%5B").gsub("]", "%5D"))
file_to_add = CarrierWave::Uploader::Download::RemoteFile.new(processed_uri)
zip = Zip::ZipFile.open(tmp_filename)
zip.add("tmp/", file_to_add.path)
zip.close
end
But this gives me a 403. Some help would be greatly appreciated.. It probably is not that hard I'm just Doing it Wrong™
I've managed to solve the problem with help from #ffoeg
The solution offered by #ffoeg didn't work quite so well for me since I was dealing with zip files > 500 MB which caused me problems on Heroku. I've therefor moved the zipping to a background process using resque:
app/workers/photo_zipper.rb:
require 'zip/zip'
require 'zip/zipfilesystem'
require 'open-uri'
class PhotoZipper
#queue = :photozip_queue
#I pass
def self.perform(id_of_object_with_images, id_of_user_to_be_notified)
user_mail = User.where(:id => id_of_user_to_be_notified).pluck(:email)
export = PhotoZipper.generate_zip(id_of_object_with_images, id_of_user_to_be_notified)
Notifications.zip_ready(export.archive_url, user_mail).deliver
end
# Zipfile generator
def self.generate_zip(id_of_object_with_images, id_of_user_to_be_notified)
object = ObjectWithImages.find(id_of_object_with_images)
photos = object.images
# base temp dir
temp_dir = Dir.mktmpdir
# path for zip we are about to create, I find that ruby zip needs to write to a real file
# This assumes the ObjectWithImages object has an attribute title which is a string.
zip_path = File.join(temp_dir, "#{object.title}_#{Date.today.to_s}.zip")
Zip::ZipOutputStream.open(zip_path) do |zos|
photos.each do |photo|
path = photo.photo.path
zos.put_next_entry(path)
zos.write photo.photo.file.read
end
end
#Find the user that made the request
user = User.find(id_of_user_to_be_notified)
#Create an export object associated to the user
export = user.exports.build
#Associate the created zip to the export
export.archive = File.open(zip_path)
#Upload the archive
export.save!
#return the export object
export
ensure
# clean up the tempdir now!
FileUtils.rm_rf temp_dir if temp_dir
end
end
app/controllers/photos_controller.rb:
format.zip do
#pick the last ObjectWithImages.. ofcourse you should include your own logic here
id_of_object_with_images = ObjectWithImages.last.id
#enqueue the Photozipper task
Resque.enqueue(PhotoZipper, id_of_object_with_images, current_user.id)
#don't keep the user waiting and flash a message with information about what's happening behind the scenes
redirect_to some_path, :notice => "Your zip is being created, you will receive an e-mail once this process is complete"
end
Many thanks to #ffoeg for helping me out. If your zips are smaller you could try #ffoeg's solution.
Here is my take. There could be typos but I think this is the gist of it :)
# action method, stream the zip
def download_photos_as_zip # silly name but you get the idea
generate_zip do |zipname, zip_path|
File.open(zip_path, 'rb') do |zf|
# you may need to set these to get the file to stream (if you care about that)
# self.last_modified
# self.etag
# self.response.headers['Content-Length']
self.response.headers['Content-Type'] = "application/zip"
self.response.headers['Content-Disposition'] = "attachment; filename=#{zipname}"
self.response.body = Enumerator.new do |out| # Enumerator is ruby 1.9
while !zf.eof? do
out << zf.read(4096)
end
end
end
end
end
# Zipfile generator
def generate_zip(&block)
photos = Photo.all
# base temp dir
temp_dir = Dir.mktempdir
# path for zip we are about to create, I find that ruby zip needs to write to a real file
zip_path = File.join(temp_dir, 'export.zip')
Zip::ZipFile::open(zip_path, true) do |zipfile|
photos.each do |photo|
zipfile.get_output_stream(photo.photo.identifier) do |io|
io.write photo.photo.file.read
end
end
end
# yield the zipfile to the action
block.call 'export.zip', zip_path
ensure
# clean up the tempdir now!
FileUtils.rm_rf temp_dir if temp_dir
end

Migrating paperclip S3 images to new url/path format

Is there a recommended technique for migrating a large set of paperclip S3 images to a new :url and :path format?
The reason for this is because after upgrading to rails 3.1, new versions of thumbs are not being shown after cropping (previously cached version is shown). This is because the filename no longer changes (since asset_timestamp was removed in rails 3.1). I'm using :fingerprint in the url/path format, but this is generated from the original, which doesn't change when cropping.
I was intending to insert :updated_at in the url/path format, and update attachment.updated_at during cropping, but after implementing that change all existing images would need to be moved to their new location. That's around half a million images to rename over S3.
At this point I'm considering copying them to their new location first, then deploying the code change, then moving any images which were missed (ie uploaded after the copy), but I'm hoping there's an easier way... any suggestions?
I had to change my paperclip path in order to support image cropping, I ended up creating a rake task to help out.
namespace :paperclip_migration do
desc 'Migrate data'
task :migrate_s3 => :environment do
# Make sure that all of the models have been loaded so any attachments are registered
puts 'Loading models...'
Dir[Rails.root.join('app', 'models', '**/*')].each { |file| File.basename(file, '.rb').camelize.constantize }
# Iterate through all of the registered attachments
puts 'Migrating attachments...'
attachment_registry.each_definition do |klass, name, options|
puts "Migrating #{klass}: #{name}"
klass.find_each(batch_size: 100) do |instance|
attachment = instance.send(name)
unless attachment.blank?
attachment.styles.each do |style_name, style|
old_path = interpolator.interpolate(old_path_option, attachment, style_name)
new_path = interpolator.interpolate(new_path_option, attachment, style_name)
# puts "#{style_name}:\n\told: #{old_path}\n\tnew: #{new_path}"
s3_copy(s3_bucket, old_path, new_path)
end
end
end
end
puts 'Completed migration.'
end
#############################################################################
private
# Paperclip Configuration
def attachment_registry
Paperclip::AttachmentRegistry
end
def s3_bucket
ENV['S3_BUCKET']
end
def old_path_option
':class/:id_partition/:attachment/:hash.:extension'
end
def new_path_option
':class/:attachment/:id_partition/:style/:filename'
end
def interpolator
Paperclip::Interpolations
end
# S3
def s3
AWS::S3.new(access_key_id: ENV['S3_KEY'], secret_access_key: ENV['S3_SECRET'])
end
def s3_copy(bucket, source, destination)
source_object = s3.buckets[bucket].objects[source]
destination_object = source_object.copy_to(destination, {metadata: source_object.metadata.to_h})
destination_object.acl = source_object.acl
puts "Copied #{source}"
rescue Exception => e
puts "*Unable to copy #{source} - #{e.message}"
end
end
Didn't find a feasible method for migrating to a new url format. I ended up overriding Paperclip::Attachment#generate_fingerprint so it appends :updated_at.

Zip up all Paperclip attachments stored on S3

Paperclip is a great upload plugin for Rails. Storing uploads on the local filesystem or Amazon S3 seems to work well. I'd just assume store files on the localhost, but the use of S3 is required for this app as it will be hosted on Heroku.
How would I go about getting all of my uploads/attachments from S3 in a single zipped download?
Getting a zip of files from the local filesystem seems straight forward. It's getting the files from S3 that has me puzzled. I think it may have something to do with the way that rubyzip handles files referenced by URL. I've tried various approaches but can't seem to avoid errors.
format.zip {
registrations_with_attachments = Registration.find_by_sql('SELECT * FROM registrations WHERE abstract_file_name NOT LIKE ""')
headers['Cache-Control'] = 'no-cache'
tmp_filename = "#{RAILS_ROOT}/tmp/tmp_zip_" <<
Time.now.to_f.to_s <<
".zip"
# rubyzip gem version 0.9.1
# rdoc http://rubyzip.sourceforge.net/
Zip::ZipFile.open(tmp_filename, Zip::ZipFile::CREATE) do |zip|
#get all of the attachments
# attempt to get files stored on S3
# FAIL
registrations_with_attachments.each { |e| zip.add("abstracts/#{e.abstract.original_filename}", e.abstract.url(:original, false)) }
# => No such file or directory - http://s3.amazonaws.com/bucket/original/abstract.txt
# Should note that these files in S3 bucket are publicly accessible. No ACL.
# works with local storage. Thanks to Henrik Nyh
# registrations_with_attachments.each { |e| zip.add("abstracts/#{e.abstract.original_filename}", e.abstract.path(:original)) }
end
send_data(File.open(tmp_filename, "rb+").read, :type => 'application/zip', :disposition => 'attachment', :filename => tmp_filename.to_s)
File.delete tmp_filename
}
You almost certainly want to use e.abstract.to_file.path instead of e.abstract.url(...).
See:
Paperclip::Storage::S3::to_file (should return a TempFile)
TempFile::path
UPDATE
From the changelog:
New in 3.0.1:
API CHANGE: #to_file has been removed. Use the #copy_to_local_file method instead.
#vlard's solution is ok. However I've run into some issues with the to_file. It creates a tempfile and the garbage collector deletes (sometimes) the file before it was added to the zip file. Therefor, I'm getting random Errno::ENOENT: No such file or directory errors.
So I'm using the following code now (I've kept the initial code variables names for consistency with the initial question)
format.zip {
registrations_with_attachments = Registration.find_by_sql('SELECT * FROM registrations WHERE abstract_file_name NOT LIKE ""')
headers['Cache-Control'] = 'no-cache'
#please note that using nanoseconds option in strftime reduces the risks concerning the situation where 2 or more users initiate the download in the same time
tmp_filename = "#{RAILS_ROOT}/tmp/tmp_zip_" <<
Time.now.strftime('%Y-%m-%d-%H%M%S-%N').to_s <<
".zip"
# rubyzip gem version 0.9.4
zip = Zip::ZipFile.open(tmp_filename, Zip::ZipFile::CREATE)
zip.close
registrations_with_attachments.each { |e|
file_to_add = e.file.to_file
zip = Zip::ZipFile.open(tmp_filename)
zip.add("abstracts/#{e.abstract.original_filename}", file_to_add.path)
zip.close
puts "added #{file_to_add.path} to #{tmp_filename}" #force garbage collector to keep the file_to_add until after the file has been added to zip
}
send_data(File.open(tmp_filename, "rb+").read, :type => 'application/zip', :disposition => 'attachment', :filename => tmp_filename.to_s)
File.delete tmp_filename
}

Resources