Paperclip Processor Operate on S3 - ruby-on-rails

I am trying to create a custom Paperclip::Processor that integrates with an external web service (the processor will call the web service whenever a new file is uploaded). The external service needs the file to be present in S3 and will handle uploading the processed versions to S3 automatically.
Can this be done using a custom Paperclip::Processor or should it be done with an ActiveRecord callback? If a Paperclip::Processor will work, what is the best way to trigger the upload? Ideally I'd like to do a processor, but the requirement is that the original file MUST be uploaded to S3 first. I have taken a look at using after_create calls, but it sometimes seems to conflict with the after_create used in paperclip. Thanks.

You can do this to create a local copy of the file. If it's on S3 it will be downloaded.
tmp_file = @model.attached_file.to_file # => Tempfile
You can then do your operations on this Tempfile. When you're done:
@model.attached_file = tmp_file
@model.save
Edit: I misread your question. You can use the before_post_process and after_post_process hooks to perform tasks before or after the file is processed.
class Model < AR::Base
  has_attached_file :avatar

  after_post_process :ping_webservice

  private

  def ping_webservice
    # Do your magic here.
  end
end

I dealt with a similar issue recently, and it was with the after_save callback. I managed to fix my problem by defining paperclip (has_attached_file ...) after I defined my after_save. This way, paperclip's callback will fire after mine.
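For example, a minimal sketch of that ordering (the model and callback names here are just illustrative):

class Document < ActiveRecord::Base
  # Register your own after_save first...
  after_save :notify_external_service

  # ...then let Paperclip register its own after_save, so it fires after yours.
  has_attached_file :upload

  private

  def notify_external_service
    # your post-save work here
  end
end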

Related

Seeding multiple image attachments from Amazon S3 in Rails

I am trying to seed multiple image attachments to a model. I have been using this link, but I am still somewhat stuck, since what I aim to do differs a little:
I am trying to attach multiple images to each object (which I seed) in the model
I want to retrieve these images from my S3 bucket and attach them to the objects (is this possible?)
Here's my seed.rb:
shirt = Item.create(name:"Basic Shirt",price:19.99)
skirt = Item.create(name:"Basic Skirt",price:29.99)
sweater = Item.create(name:"Basic Sweater",price:39.99)
kid_hood = Item.create(name:"Basic Kid Hoodie",price:19.99)
# somehow attach images here?
I am using the aws-sdk-s3 gem in order to connect Active Storage to my S3 bucket. Please tell me if any additional files are needed for viewing. I will happily edit this post to include it.
Active Storage works on plain byte streams, so you can download the file (using open-uri, for instance) and assign the stream as the content of the attachment.
Assuming you have the following (adapt if different):
class Item < ApplicationRecord
has_one_attached :photo
end
you can have your seeds as:
require 'open-uri'
shirt = Item.create(name:"Basic Shirt",price:19.99)
shirt.photo.attach(io: open('your-s3-nonexpiring-url'), filename: 'foo.bar')
# ...
Just a note: as of Ruby 3.0, you will need to call URI.open instead of open. See the reference to open-uri here.
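If each Item actually needs several images, as the question describes, a variation of the same idea with has_many_attached could look like the sketch below (the bucket URL and filenames are placeholders):

# app/models/item.rb
class Item < ApplicationRecord
  has_many_attached :photos
end

# db/seeds.rb
require 'open-uri'

shirt = Item.create(name: "Basic Shirt", price: 19.99)

%w[shirt-front.jpg shirt-back.jpg].each do |filename|
  shirt.photos.attach(
    io: URI.open("https://your-bucket.s3.amazonaws.com/seeds/#{filename}"),
    filename: filename
  )
end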

How to upload while resizing the original image itself in Shrine

I use Shrine in a Ruby on Rails application to create the process of resizing and uploading images to storage.
My current code is:
image_uploader.rb
require "image_processing/mini_magick"
class ImageUploader < Shrine
  plugin :derivatives

  Attacher.derivatives_processor do |original|
    magick = ImageProcessing::MiniMagick.source(original)

    {
      resized: magick.resize_to_limit!(120, 120)
    }
  end
end
user.rb
class User < ApplicationRecord
  include ImageUploader::Attachment(:image)

  before_save :image_resize

  def image_resize
    self.image_derivatives!
  end
end
I implemented it while reading the official documentation, but this is not desirable in two ways.
1. It requires a trigger in the model code. Can this be done with only image_uploader.rb?
2. Accessing images generated with this code requires a "resized" prefix (e.g. #user.image(:resized).url), and the original image also remains in storage. I want to process the original image itself.
Is there a way to upload while solving these two issues?
You can add the following patch, which will trigger derivatives creation as part of promoting cached file to permanent storage:
# put this in your initializer
class Shrine::Attacher
  def promote(*)
    create_derivatives
    super
  end
end
You can just override the model method that retrieves the attached file to return the resized version. You can use the included plugin to do this for all models using this uploader:
class ImageUploader < Shrine
  # ...
  plugin :included do |name|
    define_method(name) { super(:resized) }
  end
end
As for the second question: It will still keep the original file in the storage, but just return the resized version instead by default. It's generally better to do this in a view decorator instead.
You always want to keep the original file in the storage, because you never know when you'll need to reprocess it. It can happen that you find your current resizing logic not to be ideal for certain filetypes and sizes, in which case you'll need to regenerate the resized version for previous attachments. And you wouldn't be able to do that if you didn't have the original file anymore.
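If you ever do need to reprocess, a one-off backfill along these lines can regenerate the derivatives from the stored originals (a sketch that assumes the User/ImageUploader setup from the question):

# Re-run the derivatives processor on each stored original and persist the result.
User.find_each do |user|
  next unless user.image_attacher.file # skip records with no attachment

  user.image_derivatives!              # reprocess from the original file
  user.save
end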

Stubbing Paperclip downloads from S3 in RSpec

I am using Paperclip/RSpec and StackOverflow has helped me successfully stub file uploads to S3 using this code:
spec/rails_helper.rb
config.before(:each) do
  allow_any_instance_of(Paperclip::Attachment).to receive(:save).and_return(true)
end
This is working great.
On my model I have two Paperclip fields:
class MyModel < ActiveRecord::Base
  has_attached_file :pdf
  has_attached_file :resource
end
My code uses the #copy_to_local_file method (Docs) to retrieve a file from S3.
#copy_to_local_file takes two params: the style (:original, :thumbnail, etc) and the local file path to copy to.
Example:
MyModel.resource.copy_to_local_file(:original, local_file.path)
When the system under test tries to access MyModel#pdf#copy_to_local_file or MyModel#resource#copy_to_local_file, I originally got errors like the following:
No Such Key - cannot copy /email_receipts/pdfs/000/000/001/original/email_receipt.eml.pdf to local file /var/folders/4p/1mm86g0n58x7d9rvpy88_s9h0000gn/T/receipt20150917-4906-13evk95.pdf
No Such Key - cannot copy /email_receipts/resources/000/000/001/original/email_receipt.eml to local file /var/folders/4p/1mm86g0n58x7d9rvpy88_s9h0000gn/T/resource20150917-4906-1ysbwr3.eml
I realize these errors were happening because uploads to S3 are stubbed, so when it encounters MyModel#pdf#copy_to_local_file or MyModel#resource#copy_to_local_file it tries to grab a file in S3 that isn't there.
Current Solution:
I've managed to quash the errors above, but I feel it's not a complete solution and gives my tests a false sense of security. My half-solution is to stub this method in the following way:
spec/rails_helper.rb
before(:each) do
  allow_any_instance_of(Paperclip::Storage::S3).to receive(:copy_to_local_file)
end
While this does stub out the #copy_to_local_file method and removes the errors, it doesn't actually write any content to the local file that is provided as the second argument to #copy_to_local_file, so it doesn't quite simulate the file being downloaded from S3.
Question:
Is there a way to stub #copy_to_local_file AND have it write the contents of a canned file in my spec/factories/files directory to the local file (its second argument)?
Or am I overthinking this? Is this something I shouldn't be worrying about?
You don't need to worry about whether the 'downloaded' files actually exist in your tests. You've decided to stub out Paperclip, so do it completely, by stubbing out both #save and #copy_to_local_file. You may also need to stub out reads of downloaded files from the filesystem.
All this stubbing raises the possibility of integration errors, so you should probably write a feature spec (using a headless browser driver like Poltergeist) that actually uploads and downloads something and reads it from the filesystem.
That said, you can do anything you want in an RSpec stub by passing it a block:
allow_any_instance_of(Paperclip::Storage::S3).to receive(:copy_to_local_file) do |attachment, style, local_dest_path|
  # with allow_any_instance_of, the receiving instance is yielded as the
  # first block argument, followed by the method arguments
  # write a file here, or do anything you like
end
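For example, to also write canned content as the question asks, the block can copy a fixture into the requested path (the fixture filename under spec/factories/files is an assumption):

require "fileutils"

allow_any_instance_of(Paperclip::Storage::S3).to receive(:copy_to_local_file) do |attachment, style, local_dest_path|
  # Simulate a successful S3 download by copying a canned file from the
  # fixtures directory to the path the code under test will read.
  FileUtils.cp(
    Rails.root.join("spec/factories/files/email_receipt.eml.pdf"),
    local_dest_path
  )
end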

Paperclip - Running a method after the file is saved?

I'm working on a project that needs to accept file uploads. After the file is uploaded, I'm doing some processing - extracting information from the file. I eventually plan to run this in a background worker, but it's currently running inline.
I've tried making use of both after_create and after_save to process the file, but it seems my method is run before the save method from Paperclip, so my tests fail with "No such file or directory".
Is there any way to trigger the save method early, or to somehow run my method after the file has been saved to the file system?
You can't read the Paperclip file in a callback because it hasn't been saved to the filesystem (or to S3) yet. Why, I'm not exactly sure.
EDIT: The reason is that Paperclip writes out the file in an after_save callback, and that callback happens after after_create.
However you can get the file payload for your processing. For example:
require "csv"

class Foo < ActiveRecord::Base
  has_attached_file :csv

  after_create :process_csv

  def process_csv
    CSV.parse(self.csv.queued_for_write[:original].read)
    # .. do stuff
  end
end
I had to do this 2 minutes ago. Hope this helps.
Adding this answer for visibility. A previous comment by @Jonmichael Chambers in this thread solved the problem for me.
Change the callback from after_save/after_create to after_commit.
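For example, a minimal sketch of that change (the model and method names are illustrative):

class Upload < ActiveRecord::Base
  has_attached_file :document

  # after_commit fires once the transaction has committed, by which point
  # Paperclip's after_save has already written the file out.
  after_commit :process_document, on: :create

  private

  def process_document
    # read the stored file and extract information from it here
  end
end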
I think the problem might be related to the order of callbacks.
As discussed in other answers, the attachment file is indeed physically saved to disk in an after_save callback that Paperclip adds to the model class at the point of the has_attached_file call.
So you must ensure that your own after_save callbacks (the ones that want to deal with the uploaded file) are defined after the has_attached_file line.
Note: the after_create callback indeed cannot be used at all as it is called before after_save.
Take a look at Paperclip's post-processing callbacks. You should be able to use after_post_process to do your extra file-info extraction.
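For example, a sketch along those lines (model and method names are illustrative; queued_for_write is the same mechanism shown in the earlier answer):

class Upload < ActiveRecord::Base
  has_attached_file :document

  # Runs after Paperclip's processors have finished; the file contents are
  # available via queued_for_write even though nothing is in storage yet.
  after_post_process :extract_info

  private

  def extract_info
    contents = document.queued_for_write[:original].read
    # ... extract information from `contents` here
  end
end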

Rails - Run an external program on a Paperclip attachment for processing and save the output attachment back to the model

In my Rails project, I need the user to upload a file (input_file) which I will process using an external application. Once it is completed, I want to attach the processed file to the same model as a different attachment (output_file).
I have been able to create a form and use Paperclip to allow the user to upload the input_file to my model FileProcessor. I'm not sure about the next step: how do I call an executable on the input_file and save the result as output_file?
Based on Paperclip, once the file is uploaded, I can access its path via input_file.path:
output_file = %w{external_app input_file.path out_file_name}
class FileProcessor < ActiveRecord::Base
  has_attached_file :input_file
  has_attached_file :output_file
end
I'm confused as to where this call to run the external app should be placed: in the model, or in the controller (def create)? Also, how do I work with Paperclip to associate the output_file with the model without actually uploading it?
The location for such code depends on what kind of business your external process does. With the requirements as depicted in the question, it would be as simple as this:
class FileProcessor < ActiveRecord::Base
  ...

  after_validation do |fp|
    tmp_file = "/tmp/#{rand}"
    system "/usr/bin/awesome.sh #{fp.input_file.path} > #{tmp_file}"
    fp.output_file = File.open(tmp_file)
  end

  ...
end
I hope this is what you are looking for.
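If the external tool can fail, a slightly more defensive variant might look like the sketch below (the script path is the same placeholder as above; the temp-file naming, the queued_for_write fallback and the error handling are my own assumptions):

require "tmpdir"
require "securerandom"

class FileProcessor < ActiveRecord::Base
  has_attached_file :input_file
  has_attached_file :output_file

  after_validation :generate_output_file

  private

  def generate_output_file
    return unless input_file.file?

    # On a brand-new record the attachment may not exist at input_file.path
    # yet (Paperclip writes it out in after_save), so prefer the queued file.
    source = input_file.queued_for_write[:original]&.path || input_file.path
    tmp_file = File.join(Dir.tmpdir, "file_processor_#{SecureRandom.hex(8)}")

    # Passing the command and its argument separately avoids shell
    # interpolation of the path; `out:` redirects stdout to the temp file.
    if system("/usr/bin/awesome.sh", source, out: tmp_file)
      self.output_file = File.open(tmp_file)
    else
      # Handle the failure however suits your app (log, raise, add an error, ...).
      Rails.logger.error("external processing failed for FileProcessor")
    end
  end
end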
