write stream to paperclip - ruby-on-rails

I want to store received email attachments using Paperclip. From the email I get part.body, and I have no idea how to pass it to a Paperclip model. For now I create a temporary file, write part.body to it, store this file with Paperclip, and delete the file. Here is how I do it with a temporary file:
l_file = File.open(l_path, "w+b", 0644)
l_file.write(part.body)
oAsset = Asset.new(
:email_id => email.id,
:asset => l_file,
:header => h,
:original_file_name => o,
:hash => h)
oAsset.save
l_file.close
File.delete(l_path)
:asset is my 'has_attached_file' field. Is there a way to skip creating the file and do something like :asset => part.body in Asset.new?

This is how I would do it, assuming you're using the mail gem to read the email. You'll need the whole email 'part', not just part.body.
file = StringIO.new(part.body) #mimic a real upload file
file.class.class_eval { attr_accessor :original_filename, :content_type } #add attr's that paperclip needs
file.original_filename = part.filename #assign filename in way that paperclip likes
file.content_type = part.mime_type # you could also set this manually if needed, e.g. 'application/pdf'
Now just use the file object to save to the Paperclip association.
a = Asset.new
a.asset = file
a.save!
Hope this helps.

Barlow's answer is good, but it is effectively monkey-patching the StringIO class. In my case I was working with Mechanize::Download#body_io and I didn't want to pollute the class, which could lead to unexpected bugs popping up far away in the app. So I define the methods on the instance's metaclass like so:
original_filename = "whatever.pdf" # Set local variables for the closure below
content_type = "application/pdf"
file = StringIO.new(part.body)
metaclass = class << file; self; end
metaclass.class_eval do
define_method(:original_filename) { original_filename }
define_method(:content_type) { content_type }
end
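To make the difference from the class-level patch concrete, here is a minimal sketch using only the standard library. The sample bytes and the second StringIO (other) are illustrative assumptions, not part of the original answer; singleton_class is the built-in shorthand for class << file; self; end.

```ruby
require 'stringio'

# Local variables captured by the define_method closures below.
original_filename = "whatever.pdf"
content_type = "application/pdf"

file  = StringIO.new("%PDF-1.4 example bytes") # stand-in for part.body
other = StringIO.new("untouched")              # any unrelated StringIO

# Define the methods on this one object's singleton class only.
file.singleton_class.class_eval do
  define_method(:original_filename) { original_filename }
  define_method(:content_type) { content_type }
end

puts file.original_filename
puts other.respond_to?(:original_filename)
```

Only file gains the two methods; other, like every other StringIO in the app, is left alone.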

I like gtd's answer a lot, but it can be simpler.
file = StringIO.new(part.body)
class << file
define_method(:original_filename) { "whatever.pdf" }
define_method(:content_type) { "application/pdf" }
end
There's no real need to extract the "metaclass" into a local variable; just open the object's singleton class directly with class << file.

From Ruby 1.9, you can use StringIO and define_singleton_method:
def attachment_from_string(string, original_filename, content_type)
StringIO.new(string).tap do |file|
file.define_singleton_method(:original_filename) { original_filename }
file.define_singleton_method(:content_type) { content_type }
end
end

This would have been better as a comment on David-Barlow's answer but I don't have enough reputation points yet...
But, as others mentioned I didn't love the monkey-patching. Instead, I just created a new class that inherited from StringIO, like so:
class TempFile < StringIO
attr_accessor :original_filename, :content_type
end

For posterity, here is the best answer. Put the top part in vendor/paperclip/data_uri_adapter.rb and the bottom part in config/initializers/paperclip.rb.
https://github.com/thoughtbot/paperclip/blob/43eb9a36deb09ce5655028a1061578dbf0268a5d/lib/paperclip/io_adapters/data_uri_adapter.rb
This requires a data URI scheme stream, but these days that seems pretty common. Simply set your paperclip'd variable to a string with the stream data, and the code takes care of the rest.
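As a rough sketch of what "a string with the stream data" can look like, the following builds a Base64 data URI from raw bytes and decodes it back, using only core Ruby. The helper names to_data_uri and from_data_uri are hypothetical, and the exact string format the linked adapter accepts is an assumption that should be checked against that file.

```ruby
# Array#pack("m0") produces strict Base64 without line breaks (core Ruby).
def to_data_uri(bytes, content_type)
  "data:#{content_type};base64,#{[bytes].pack('m0')}"
end

# Split the header from the payload and decode the Base64 part.
def from_data_uri(data_uri)
  header, payload = data_uri.split(",", 2)
  content_type = header[/\Adata:([^;]+)/, 1]
  [content_type, payload.unpack1("m")]
end

bytes = "%PDF-1.4\x00\xFF binary".b # stand-in for attachment bytes
uri = to_data_uri(bytes, "application/pdf")
type, decoded = from_data_uri(uri)

puts uri
puts type
```

With the adapter registered, the idea is that a string shaped like uri can be assigned to the attachment attribute directly.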

I used a similar technique to pull down images into Paperclip.
This should work, but is obviously untested:
io = part.body
filename = part.original_file_name || 'unknown-file-name' # captured in a local, because a def body cannot see outer local variables
io.define_singleton_method(:original_filename) { filename }
asset = Asset.new(:email => email)
asset.asset = io
When we assign the IO directly to the Paperclip attachment, it needs to respond to original_filename, so that's what the define_singleton_method line sets up.

Related

ActiveStorage how to prevent duplicate file uploads ; find by filename

I am parsing email attachments and uploading them to ActiveStorage in S3.
We would like it to ignore duplicates, but I cannot see how to query by these attributes.
class Task < ApplicationRecord
has_many_attached :documents
end
then in my email webhook job
attachments.each do |attachment|
tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
# i'd like to do something like this
next if task.documents.where(filename: tempfile.filename, bytesize: tempfile.bytesize).exists?
# this is what i'm currently doing
task.documents.attach(
io: tempfile,
filename: attachment[:name],
content_type: attachment[:content_type]
)
end
Unfortunately, if someone forwards the same files, we get duplicates, and often more than one.
Edit with current solution:
tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
md5_digest = Digest::MD5.file(tempfile).base64digest
# if this digest already exists as attached to the file then we're all good.
next if ActiveStorage::Blob.joins(:attachments).where({
checksum: md5_digest,
active_storage_attachments: {name: 'documents', record_type: 'Task', record_id: task.id}
}).exists?
Rails utilizes 2 tables for storing attachment data; active_storage_attachments and active_storage_blobs
The active_storage_blobs table houses a checksum of the uploaded file.
You can easily join this table to verify the existence of a file.
Going from @gustavo's answer I came up with the following:
attachments.each do |attachment|
tempfile = Tempfile.new
tempfile.binmode
tempfile.write open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")]).read
tempfile.flush # make sure the bytes are on disk before hashing by path
checksum = Digest::MD5.file(tempfile.path).base64digest
if task.documents.joins(:blob).exists?(active_storage_blobs: {checksum: checksum})
tempfile.unlink
next
end
#... Your attachment saving code here
end
Note: Remember to require 'tempfile' in the class where you are using this.
What happens if they change the filename anyway (which happens many times with things like filename(2).xlsx) but the content is the same?
Maybe a better approach would be to compare the checksum? I believe that the ActiveStorage object will already store that, for saved files. You could do something like:
attachments.each do |attachment|
tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
checksum = Digest::MD5.file(tempfile.path).base64digest
# skip files whose checksum is already attached to this task
next if task.documents.joins(:blob).where(active_storage_blobs: { checksum: checksum }).exists?
#...
end
That way you know it is the same physical file regardless of the incoming filename.
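A small standalone sketch of that idea, without Rails: it computes a Base64-encoded MD5 digest (the format the answers above compare against the blob's checksum column) and uses it to skip content that has already been seen. The attachment hashes, the seen hash, and the helper name blob_style_checksum are hypothetical stand-ins for the webhook payload and the database lookup.

```ruby
require 'digest'

# Base64-encoded MD5 of the raw bytes.
def blob_style_checksum(bytes)
  Digest::MD5.base64digest(bytes)
end

attachments = [
  { name: "report.xlsx",    data: "same bytes" },
  { name: "report(2).xlsx", data: "same bytes" },      # renamed duplicate
  { name: "report.xlsx",    data: "different bytes" }  # same name, new content
]

seen = {} # checksum => first filename; stands in for the database query
attachments.each do |attachment|
  checksum = blob_style_checksum(attachment[:data])
  if seen.key?(checksum)
    puts "skip #{attachment[:name]} (same content as #{seen[checksum]})"
    next
  end
  seen[checksum] = attachment[:name]
end
```

The renamed copy is skipped while the file that reuses an old name with new content is kept, which is the behaviour a filename comparison cannot give.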

How to handle a file_as_string (generated by Prawn) so that it is accepted by Carrierwave?

I'm using Prawn to generate a PDF from the controller of a Rails app,
...
respond_to do |format|
format.pdf do
pdf = GenerateReportPdf.new(@object, view_context)
send_data pdf.render, filename: "Report", type: "application/pdf", disposition: "inline"
end
end
This works fine, but I now want to move GenerateReportPdf into a background task, and pass the resulting object to Carrierwave to upload directly to S3.
The worker looks like this
def perform
pdf = GenerateReportPdf.new(@object)
fileString = ???????
document = Document.new(
object_id: @object.id,
file: fileString )
# file is field used by Carrierwave
end
How do I handle the object returned by Prawn (?????) to ensure it is a format that can be read by Carrierwave.
fileString = pdf.render_file 'filename' writes the object to the root directory of the app. As I'm on Heroku this is not possible.
file = pdf.render returns ArgumentError: string contains null byte
fileString = StringIO.new( pdf.render_file 'filename' ) returns TypeError: no implicit conversion of nil into String
fileString = StringIO.new( pdf.render ) returns ActiveRecord::RecordInvalid: Validation failed: File You are not allowed to upload nil files, allowed types: jpg, jpeg, gif, png, pdf, doc, docx, xls, xlsx
fileString = File.open( pdf.render ) returns ArgumentError: string contains null byte
....and so on.
What am I missing? StringIO.new( pdf.render ) seems like it should work, but I'm unclear why it's generating this error.
It turns out StringIO.new( pdf.render ) should indeed work.
The problem I was having was that the filename was being set incorrectly and, despite following the advice below on Carrierwave's wiki, a bug elsewhere in the code meant that the filename was returning as an empty string. I'd overlooked this and assumed that something else was needed.
https://github.com/carrierwaveuploader/carrierwave/wiki/How-to:-Upload-from-a-string-in-Rails-3
my code ended up looking like this
def perform
pdf = GenerateReportPdf.new(@object)
s = StringIO.new(pdf.render)
def s.original_filename; "my file name"; end
document = Document.new(
object_id: @object.id
)
document.file = s
document.save!
end
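For illustration, here is a minimal sketch of why two of the attempts in the question behave so differently. File.open treats its argument as a path, so binary PDF data fails with the null byte error, while StringIO wraps the same bytes as readable content. The sample bytes stand in for pdf.render and the filename is an assumption.

```ruby
require 'stringio'

pdf_bytes = "%PDF-1.4\x00binary payload".b # stand-in for pdf.render

# File.open interprets the string as a file path, and paths cannot contain "\0".
begin
  File.open(pdf_bytes)
rescue ArgumentError => e
  error_message = e.message
end

# StringIO treats the string as content and behaves like an open file.
io = StringIO.new(pdf_bytes)
io.define_singleton_method(:original_filename) { "report.pdf" }

puts error_message
puts io.original_filename
```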
You want to create a tempfile (which is fine on Heroku as long as you don't expect it to persist across requests).
def perform
# Create instance of your Carrierwave Uploader
uploader = MyUploader.new
# Generate your PDF
pdf = GenerateReportPdf.new(@object)
# Create a tempfile
tmpfile = Tempfile.new("my_filename")
# set to binary mode to avoid UTF-8 conversion errors
tmpfile.binmode
# Use render to write the file contents
tmpfile.write pdf.render
# Upload the tempfile with your Carrierwave uploader
uploader.store! tmpfile
# Close the tempfile and delete it
tmpfile.close
tmpfile.unlink
end
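A standalone sketch of the same tempfile pattern, with two small variations that are assumptions rather than part of the answer above: passing a [basename, extension] array to Tempfile.new so the temporary path keeps a .pdf extension, and using ensure so the file is removed even if the upload step raises. The sample bytes stand in for pdf.render.

```ruby
require 'tempfile'

pdf_bytes = "%PDF-1.4\x00binary payload".b # stand-in for pdf.render

# An array argument gives the temp file a fixed prefix and extension.
tmpfile = Tempfile.new(["report", ".pdf"])
begin
  tmpfile.binmode            # avoid encoding conversion of binary data
  tmpfile.write(pdf_bytes)
  tmpfile.rewind             # so a reader starts at the beginning
  extension = File.extname(tmpfile.path)
  round_trip = tmpfile.read  # an uploader would read the file here
ensure
  tmpfile.close
  tmpfile.unlink             # delete the file even when an error occurred
end

puts extension
```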
Here's a way you can use StringIO like Andy Harvey mentioned, but without adding a method to the StringIO instance's eigenclass.
class VirtualFile < StringIO
attr_accessor :original_filename
def initialize(string, original_filename)
@original_filename = original_filename
super(string)
end
end
def perform
pdf_string = GenerateReportPdf.new(@object).render
file = VirtualFile.new(pdf_string, 'filename.pdf')
document = Document.new(object_id: @object.id, file: file)
end
This one took me a couple of days. The key is to call render_file with a file path you control, so you can keep track of the file, something like this:
In one of my models (e.g. Policy) I have a list of documents, and this is the method that updates the model connected to Carrierwave (e.g. PolicyDocument < ApplicationRecord with mount_uploader :pdf_file, PdfDocumentUploader):
def upload_pdf_document_file_to_s3_bucket(document_type, file_path)
policy_document = self.policy_documents.where(policy_document_type: document_type)
.where(status: 'processing')
.where(pdf_file: nil).last
policy_document.pdf_file = File.open(file_path, "r")
policy_document.status = 's3_uploaded'
policy_document.save(validate:false)
policy_document
rescue => e
policy_document.status = 's3_uploaded_failed'
policy_document.save(validate:false)
Rails.logger.error "Error uploading policy documents: #{e.inspect}"
end
end
In one of my Prawn PDF file generators (e.g. PolicyPdfDocumentX), note how I'm rendering the file and returning the file path so I can grab it from the worker:
def generate_prawn_pdf_document
Prawn::Document.new do |pdf|
pdf.draw_text "Hello World PDF File", size: 8, at: [370, 462]
pdf.start_new_page
pdf.image Rails.root.join('app', 'assets', 'images', 'hello-world.png'), width: 550
end
end
def generate_tmp_file(filename)
file_path = File.join(Rails.root, "tmp/pdfs", filename)
self.generate_prawn_pdf_document.render_file(file_path)
return file_path
end
In the "global" worker for creating files and uploading them to the S3 bucket (e.g. PolicyDocumentGeneratorWorker):
def perform(filename, document_type, policy)
#here we create the instance of the prawn pdf generator class
pdf_generator_class = document_type.constantize.new
#here we are creating the file, but also `returning the filepath`
file_path = pdf_generator_class.generate_tmp_file(filename)
#here we are simply updating the model with the new file created
policy.upload_pdf_document_file_to_s3_bucket(document_type, file_path)
end
Finally, to test, run rails c and:
the_policy = Policies.where....
PolicyDocumentGeneratorWorker.new.perform('report_x.pdf', 'PolicyPdfDocumentX',the_policy)
NOTE: I'm using metaprogramming in case we have multiple, different file generators. constantize.new just creates a new Prawn PDF generator instance, so it is similar to PolicyPdfDocument.new. That way a single worker class can handle all of your Prawn PDF documents; if you need a new document, you can simply call PolicyDocumentGeneratorWorker.new.perform('report_y.pdf', 'PolicyPdfDocumentY',the_policy)
:D
hope this helps someone to save some time

Missing extension when save image on paperclip

image = PortfolioFileItem.find(107)
img_source = "http://s3.amazonaws.com/test/portfolio_file_items_final/original/1.jpg"
image.picture_from_url(img_source)
image.save(false)
The image saves, but the file extension is missing. This is a sample of a saved image name:
open-uri20110528-6779-fpiust-0.
Please help me solve this problem. Thanks.
To add an extension to paperclip add this line after has_attached_file as an option
:path => ":rails_root/public/:attachment/:id/:style/:basename.:extension"
You can customize this path to fit your needs; however, you must have .:extension at the end. :extension is one of many values that can be used for interpolation.
See this blog post for more information.
If the actual file does not have an extension originally, you can detect the extension and add it before saving:
def before_save
tempfile = data.queued_for_write[:original]
unless tempfile.nil?
extension = File.extname(tempfile.original_filename)
if !extension || extension == ''
mime = tempfile.content_type
ext = Rack::Mime::MIME_TYPES.invert[mime]
self.data.instance_write :file_name, "#{tempfile.original_filename}#{ext}"
end
end
true
end
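The core of that callback can be sketched without Paperclip or Rack as a plain function. The lookup table EXTENSIONS_BY_MIME and the helper filename_with_extension are hypothetical; an explicit table is used here because inverting a full MIME table keeps only one of several extensions that share a type (for example .jpg and .jpeg), and which one survives may not be the one you want.

```ruby
# Hypothetical, hand-picked mapping from MIME type to preferred extension.
EXTENSIONS_BY_MIME = {
  "image/jpeg"      => ".jpg",
  "image/png"       => ".png",
  "image/gif"       => ".gif",
  "application/pdf" => ".pdf"
}.freeze

# Append an extension derived from the content type, but only when the
# filename has none. Unknown types leave the name unchanged.
def filename_with_extension(filename, content_type)
  return filename unless File.extname(filename).empty?
  "#{filename}#{EXTENSIONS_BY_MIME.fetch(content_type, '')}"
end

puts filename_with_extension("open-uri20110528-6779-fpiust-0", "image/jpeg")
puts filename_with_extension("1.jpg", "image/jpeg")
```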

Uploading a file through Paperclip or Carrierwave from a Mail attachment

If I have a mail object, eg:
mail = Mail.new do
from "jim@gmail.com"
to "jane@yahoo.com"
subject "Example"
text_part do
body "Blarg"
end
add_file "/some/file/or/some_such.jpg"
end
If I were to receive the above mail in my application
received_mail = mail.encoded
Message.parse(received_mail)
How would I pass the attachment on to CarrierWave/Paperclip (not fussed about which, I'll use whichever one handles this best)? I've tried a few different methods, but I keep running in to various stumbling blocks - has anyone got a working solution for it?
My current attempt is:
mail.attachments.each do |attachment|
self.attachments << Attachment.new(:file => Tempfile.new(attachment.filename) {|f| f.write(attachment.decoded)})
end
This doesn't appear to work - any tips?
I know that when I tried to take mail attachments and use them with paperclip, I also ran into some problems. The problem as I remember it was that paperclip expected certain attributes on the File object passed to it.
I solved it like this:
mail.attachments.each do |attachment|
file = StringIO.new(attachment.decoded)
file.class.class_eval { attr_accessor :original_filename, :content_type }
file.original_filename = attachment.filename
file.content_type = attachment.mime_type
# Then you attach it where you want it
self.attachments << Attachment.new(:file => file)
end

What is the best way to upload a file to another Rails application?

I've researched this and noticed that ActiveResource lacks this functionality. So, what is the current state of the art for doing a file upload?
One problem with Guillermo's approach is that the request has to be nested, like this:
body = { :file => {:uploaded_data => File.open("#{RAILS_ROOT}/public/tmp/" + original_filename), :owner_id => current_user.owner_id }, :api_key => '123123123123123123'}
Of course it is not possible to do a request like this with HttpClient. I tried other gems I found in github (sevenwire-http-client and technoweenie-rest-client) but they have problems with the file being nested. Is it possible to upload a file with a nested request?
The Httpclient gem allows you to do multipart posts like this:
clnt = HTTPClient.new
File.open('/tmp/post_data') do |file|
body = { 'upload' => file, 'user' => 'nahi' }
res = clnt.post(uri, body)
end
You could use this to simply post a file on the local file system to a controller in the other application. If you want to forward data that was just uploaded through a form into your app, without storing it first, you could probably use the uploaded data from your params directly in the post body.
You can try something like the following:
# I used the HTTPClient gem as suggested (thanks!)
clnt = HTTPClient.new
# The file to be uploaded is originally in /tmp/ with a filename like 'RackMultipart0123456789'.
# I had to rename this file, or the resulting uploaded file would keep that filename.
# Thus, I copied the file to public/tmp under its original_filename (it will be deleted later on).
uploaded_file = params[:message][:file]
original_filename = uploaded_file.original_filename
directory = "#{RAILS_ROOT}/public/tmp"
path = File.join(directory, original_filename)
File.open(path, "wb") { |f| f.write(uploaded_file.read) }
# I upload the file that is now in public/tmp and then do the post.
body = { :uploaded_data => File.open(path), :owner_id => current_user.owner_id}
res = clnt.post('http://localhost:3000/files.xml', body)
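As an alternative sketch, not the HTTPClient approach used above, the standard library's Net::HTTP can build a multipart request where the nesting is expressed through bracketed field names, which Rails parses into nested params. It also accepts an explicit filename, which may remove the need to copy and rename the file first. The helper name build_nested_upload and the sample values are hypothetical, and the request is only built here, not sent.

```ruby
require 'net/http'
require 'uri'
require 'stringio'

# Build (but do not send) a multipart POST with nested field names.
def build_nested_upload(uri, io, filename, owner_id, api_key)
  request = Net::HTTP::Post.new(uri)
  request.set_form(
    [
      ["file[uploaded_data]", io, { filename: filename }], # nested file field
      ["file[owner_id]", owner_id.to_s],
      ["api_key", api_key]
    ],
    "multipart/form-data"
  )
  request
end

uri = URI("http://localhost:3000/files.xml")
request = build_nested_upload(uri, StringIO.new("resume bytes"), "resume.pdf", 42, "123123123123123123")

# To actually send it:
# Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
```

On the receiving side this should arrive as params[:file][:uploaded_data] and params[:file][:owner_id], matching the nested body shown in the question.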
