I have managed to send an email with pdf attachments that are stored on s3
def welcome_pack1(website_registration)
require 'open-uri'
#website_registration = website_registration
email_attachments = EmailAttachment.find(:all,:conditions=>{:goes_to_us=>true})
email_attachments.each do |a|
tempfile = File.new("#{Rails.root.to_s}/tmp/#{a.pdf_file_name}", "w")
tempfile << open(a.pdf.url)
tempfile.puts
attachments[a.pdf_file_name] = File.read("#{Rails.root.to_s}/tmp/#{a.pdf_file_name}")
end
mail(:to => website_registration.email, :subject => "Welcome")
end
The attachments are attached to the email. But they come through as 0 bytes. I was using the example posted here paperclip + ActionMailer - Adding an attachment?. Am i missing something?
File objects are buffered - until you close the file (which you're not) then the bytes you've written may not be on disk. A great way to not forget to call close is to use the block form:
File.open(path, 'w') do |f|
#do stuff to f
end #file closed for you when the block is exited.
I'm not sure why you're using a file at all though - why not do
attachments[a.pdf_file_name] = open(a.pdf.url)
Related
I am parsing email attachments and uploading them to ActiveStorage in S3.
We would like it ignore duplicates but i cannot see to query by these attributes.
class Task < ApplicationRecord
has_many_attached :documents
end
then in my email webhook job
attachments.each do |attachment|
tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
# i'd like to do something like this
next if task.documents.where(filename: tempfile.filename, bytesize: temfile.bytesize).exist?
# this is what i'm currently doing
task.documents.attach(
io: tempfile,
filename: attachment[:name],
content_type: attachment[:content_type]
)
end
Unfortunately if someone forwards the same files, we've got duplicated and often more.
Edit with current solution:
tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
md5_digest = Digest::MD5.file(tempfile).base64digest
# if this digest already exists as attached to the file then we're all good.
next if ActiveStorage::Blob.joins(:attachments).where({
checksum: md5_digest,
active_storage_attachments: {name: 'documents', record_type: 'Task', record_id: task.id
}).exists?
Rails utilizes 2 tables for storing attachment data; active_storage_attachments and active_storage_blobs
The active_storage_blobs table houses a checksum of the uploaded file.
You can easily join this table to verify the existence of a file.
Going from #gustavo's answer I came up with the following:
attachments.each do |attachment|
tempfile = TempFile.new
tempfile.write open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
checksum = Digest::MD5.file(tempfile.path).base64digest
if task.documents.joins(:documents_blobs).exists?(active_storage_blobs: {checksum: checksum})
tempfile.unlink
next
end
#... Your attachment saving code here
end
Note: Remember to require 'tempfile' in the class where you are using this
What happens if they change the filename anyway (which happens many times with things like filename(2).xlsx) but the content is the same?
Maybe a better approach would be to compare the checksum? I believe that the ActiveStorage object will already store that, for saved files. You could do something like:
attachments.each do |attachment|
tempfile = open(attachment[:url], http_basic_authentication: ["api", ENV.fetch("MAILGUN_API_KEY")])
checksum = Digest::MD5.file(tempfile.path).base64digest
# i'd like to do something like this
next if task.documents.where(checksum: checksum).exist?
#...
end
That way you know it is the same physical file regardless of the incoming filename.
I'm using Prawn to generate a PDF from the controller of a Rails app,
...
respond_to do |format|
format.pdf do
pdf = GenerateReportPdf.new(#object, view_context)
send_data pdf.render, filename: "Report", type: "application/pdf", disposition: "inline"
end
end
This works fine, but I now want to move GenerateReportPdf into a background task, and pass the resulting object to Carrierwave to upload directly to S3.
The worker looks like this
def perform
pdf = GenerateReportPdf.new(#object)
fileString = ???????
document = Document.new(
object_id: #object.id,
file: fileString )
# file is field used by Carrierwave
end
How do I handle the object returned by Prawn (?????) to ensure it is a format that can be read by Carrierwave.
fileString = pdf.render_file 'filename' writes the object to the root directory of the app. As I'm on Heroku this is not possible.
file = pdf.render returns ArgumentError: string contains null byte
fileString = StringIO.new( pdf.render_file 'filename' ) returns TypeError: no implicit conversion of nil into String
fileString = StringIO.new( pdf.render ) returns ActiveRecord::RecordInvalid: Validation failed: File You are not allowed to upload nil files, allowed types: jpg, jpeg, gif, png, pdf, doc, docx, xls, xlsx
fileString = File.open( pdf.render ) returns ArgumentError: string contains null byte
....and so on.
What am I missing? StringIO.new( pdf.render ) seems like it should work, but I'm unclear why its generating this error.
It turns out StringIO.new( pdf.render ) should indeed work.
The problem I was having was that the filename was being set incorrectly and, despite following the advise below on Carrierwave's wiki, a bug elsewhere in the code meant that the filename was returning as an empty string. I'd overlooked this an assumed that something else was needed
https://github.com/carrierwaveuploader/carrierwave/wiki/How-to:-Upload-from-a-string-in-Rails-3
my code ended up looking like this
def perform
s = StringIO.new(pdf.render)
def s.original_filename; "my file name"; end
document = Document.new(
object_id: #object.id
)
document.file = s
document.save!
end
You want to create a tempfile (which is fine on Heroku as long as you don't expect it to persist across requests).
def perform
# Create instance of your Carrierwave Uploader
uploader = MyUploader.new
# Generate your PDF
pdf = GenerateReportPdf.new(#object)
# Create a tempfile
tmpfile = Tempfile.new("my_filename")
# set to binary mode to avoid UTF-8 conversion errors
tmpfile.binmode
# Use render to write the file contents
tmpfile.write pdf.render
# Upload the tempfile with your Carrierwave uploader
uploader.store! tmpfile
# Close the tempfile and delete it
tmpfile.close
tmpfile.unlink
end
Here's a way you can use StringIO like Andy Harvey mentioned, but without adding a method to the StringIO intstance's eigenclass.
class VirtualFile < StringIO
attr_accessor :original_filename
def initialize(string, original_filename)
#original_filename = original_filename
super(string)
end
end
def perform
pdf_string = GenerateReportPdf.new(#object)
file = VirtualFile.new(pdf_string, 'filename.pdf')
document = Document.new(object_id: #object.id, file: file)
end
This one took me couple of days, the key is to call render_file controlling the filepath so you can keep track of the file, something like this:
in one of my Models e.g.: Policy i have a list of documents and this is just the method for updating the model connected with the carrierwave e.g.:PolicyDocument < ApplicationRecord mount_uploader :pdf_file, PdfDocumentUploader
def upload_pdf_document_file_to_s3_bucket(document_type, filepath)
policy_document = self.policy_documents.where(policy_document_type: document_type)
.where(status: 'processing')
.where(pdf_file: nil).last
policy_document.pdf_file = File.open(file_path, "r")
policy_document.status = 's3_uploaded'
policy_document.save(validate:false)
policy_document
rescue => e
policy_document.status = 's3_uploaded_failed'
policy_document.save(validate:false)
Rails.logger.error "Error uploading policy documents: #{e.inspect}"
end
end
in one of my Prawn PDF File Generators e.g.: PolicyPdfDocumentX in here please note how im rendering the file and returning the filepath so i can grab from the worker object itself
def generate_prawn_pdf_document
Prawn::Document.new do |pdf|
pdf.draw_text "Hello World PDF File", size: 8, at: [370, 462]
pdf.start_new_page
pdf.image Rails.root.join('app', 'assets', 'images', 'hello-world.png'), width: 550
end
end
def generate_tmp_file(filename)
file_path = File.join(Rails.root, "tmp/pdfs", filename)
self.generate_prawn_pdf_document.render_file(file_path)
return filepath
end
in the "global" Worker for creating files and uploading them in the s3 bucket e.g.: PolicyDocumentGeneratorWorker
def perform(filename, document_type, policy)
#here we create the instance of the prawn pdf generator class
pdf_generator_class = document_type.constantize.new
#here we are creating the file, but also `returning the filepath`
file_path = pdf_generator_class.generate_tmp_file(filename)
#here we are simply updating the model with the new file created
policy.upload_pdf_document_file_to_s3_bucket(document_type, file_path)
end
finally how to test, run rails c and:
the_policy = Policies.where....
PolicyDocumentGeneratorWorker.new.perform('report_x.pdf', 'PolicyPdfDocumentX',the_policy)
NOTE: im using meta-programming in case we have multiple and different file generators, constantize.new is just creating new prawn pdf doc generator instance so is similar to PolicyPdfDocument.new that way we can only have one pdf doc generator worker class that can handle all of your prawn pdf documents so for instance if you need a new document you can simply PolicyDocumentGeneratorWorker.new.perform('report_y.pdf', 'PolicyPdfDocumentY',the_policy)
:D
hope this helps someone to save some time
I am processing a pdf uploaded by an user by extracting the text from it and saving the output in an text file for processing later.
Locally I store the pdf in my public folder but when I work on Heroku I need to use S3.
I thought that the pdf path was the problem, so I included
if Rails.env.test? || Rails.env.cucumber?
But still I receive
ArgumentError (input must be an IO-like object or a filename):
Is there a way of temporarily storing the pdf in my root/tmp folder on Heroku, get the text from it, and then after that is done, upload the document to S3?
def convert_pdf
if Rails.env.test? || Rails.env.cucumber?
pdf_dest = File.join(Rails.root, "public", #application.document_url)
else
pdf_dest = #application.document_url
end
txt_file_dest = Rails.root + 'tmp/pdf-parser/text'
document_file_name = /\/uploads\/application\/document\/\d{1,}\/(?<file_name>.*).pdf/.match(#application.document_url)[:file_name]
PDF::Reader.open(pdf_dest) do |reader|
File.open(File.join(txt_file_dest, document_file_name + '.txt'), 'w+') do |f|
reader.pages.each do |page|
f.puts page.text
end
end
end
end
You're going to want to set up a custom processor in your uploader. And on top of that, since the output file (.txt) isn't going to have the same extension as the input file (.pdf), you're going to want to change the filename. The following belongs in your Uploader:
process :convert_to_text
def convert_to_text
temp_dir = Rails.root.join('tmp', 'pdf-parser', 'text')
temp_path = temp_dir.join(filename)
FileUtils.mkdir_p(temp_dir)
PDF::Reader.open(current_path) do |pdf|
File.open(temp_path, 'w') do |f|
pdf.pages.each do |page|
f.puts page.text
end
end
end
File.unlink(current_path)
FileUtils.cp(temp_path, current_path)
end
def filename
super + '.txt' if original_filename.present?
end
I haven't run this code, so there are probably some bugs, but that should give you the idea at least.
I'm trying to attach a file to an outgoing email but the attachment size ends up being 1 byte. It doesn't matter what attachment I'm forwarding it always ends up in the email 1 byte in size (corrupt). Everything else looks ok to me.
The email information is pulled from an IMAP account and stored in the database for browsing purposes. Attachments are stored on the file system and it's file name stored as an associated record for the Email.
In the view there's an option to forward the email to another recipient. It worked in Rails 2.3.8 but for Rails 3 I've had to change the attachment part of the method so now it looks like...
def forward_email(email_id, from_address, to_address)
#email = Email.find(email_id)
#recipients = to_address
#from = from_address
#subject = #email.subject
#sent_on = Time.now
#body = #email.body + "\n\n"
#email.attachments.each do |file|
if File.exist?(file.full_path)
attachment :filename => file.file_name, :body => File.read(file.full_path)
else
#body += "ATTACHMENT NOT FOUND: #{file.file_name}\n\n"
end
end
end
I've also tried it with...
attachments[file.file_name] = File.read(file.full_path)
and adding :mime_type and :content_type to no avail.
Any help would be a appreciated.
Thanks!
This is what I tried and worked for me
attachments.each do |file|
attachment :content_type => MIME::Types.type_for(file.path).first.content_type, :body => File.read(file.path)
end
Is the file readable? Can you debug the issue by placing something like this?
logger.debug "File: #{file.full_path.inspect} : #{File.read(file.full_path).inspect[0..100]}"
Is there anything in your development.log?
Well, someone from the rails team answered my question. The problem lies with adding body content (#body) other than the attachment inside the method. If you're going to attach files you have to use a view template.
If I have a mail object, eg:
mail = Mail.new do
from "jim#gmail.com"
to "jane#yahoo.com"
subject "Example"
text_part do
body "Blarg"
end
add_file "/some/file/or/some_such.jpg"
end
If I were to receive the above mail in my application
received_mail = mail.encoded
Message.parse(received_mail)
How would I pass the attachment on to CarrierWave/Paperclip (not fussed about which, I'll use whichever one handles this best)? I've tried a few different methods, but I keep running in to various stumbling blocks - has anyone got a working solution for it?
My current attempt is:
mail.attachments.each do |attachment|
self.attachments << Attachment.new(:file => Tempfile.new(attachment.filename) {|f| f.write(attachment.decoded)})
end
This doesn't appear to work - any tips?
end
I know that when I tried to take mail attachments and use them with paperclip, I also ran into some problems. The problem as I remember it was that paperclip expected certain attributes on the File object passed to it.
I solved it like this:
mail.attachments.each do |attachment|
file = StringIO.new(attachment.decoded)
file.class.class_eval { attr_accessor :original_filename, :content_type }
file.original_filename = attachment.filename
file.content_type = attachment.mime_type
#Then you attach it where you want it
self.attachments << Attachment.new(:file => file)