Zip File from S3 files - ruby-on-rails

Ruby '2.7.4'
Rails '~> 5.2.2'
I have access to an S3 bucket containing several files of several types, which I am trying to:
1. Download into memory
2. Put them all together inside a zip file
3. Upload this zip file into some S3 bucket
I've looked into several issues on the web already, without any success.
Specifically, I'm trying to use the rubyzip gem, but no matter what I do, I always end up with the error message : 'no implicit conversion of StringIO into String'
Here's a summary of my current code
gem 'rubyzip', require: 'zip'
require 'zip'
bucket_name = 'redacted'
zip_filename = "My final complete zip file.zip"
s3_client = Aws::S3::Client.new(region: 'eu-west-3')
s3_resource = Aws::S3::Resource.new(region: 'eu-west-3')
bucket = s3_resource.bucket(bucket_name)
s3_filename = 's3_file_name'
s3_file = s3_client.get_object(bucket: bucket_name, key: s3_filename)
file = s3_file.body
At this point, I have exactly one file, in a StringIO format.
However please bear in mind that I'm trying to reproduce this with several files, which means I want to bundle several files inside a final zip.
I'm failing to put this file into a zip and/or put the zip back into s3.
Attempt N°1
stringio = Zip::OutputStream.write_buffer do |zio|
zio.put_next_entry("test1.zip")
zio.write(file)
end
stringio.rewind
binary_data = stringio.sysread
Error message : no implicit conversion of StringIO into String
Attempt N°2
zip_file_name = 'my_test_file_name.zip'
File.open(zip_file_name, 'w') { |f| f.puts(file.rewind && file.read) }
final_zip = Zip::File.open(zip_filename, create: true) do |zipfile|
zf = Zip::File.new(file, create: true, buffer: true)
zipfile.add(zf.to_s, zip_file_name)
end
really_final_zip = Zip::File.new(final_zip, create: true, buffer: true)
new_object = bucket.object(zip_file_name)
new_object.put(body: final_zip)
Error Message : expected params[:body] to be a String or IO like object that supports read and rewind, got value #<Zip::Entry:0x0000558a06ff42a0
If instead of that last line, I write
new_object.put(body: final_zip.to_s)
A text file is created in S3 (instead of the zip) with the content #<StringIO:0x0000558a06c8c8d8>

You need to read the bytes out of the response body before writing them into the zip, so change
s3_file.body to s3_file.body.read

Related

Confusion about creating and writing to File in Ruby on rails

I am trying to create a new file and then write some content to it just to create a basic backup of a template.
When I log out the values of filename and file_content they are correct, but when I send the data all I get is a file named after the method (download_include) and a fixnum inside the file, the last one made was 15.
# POST /download_include/:id
def download_include
@include = Include.find(params[:id])
version_to_download = @include.latest_version_record
filename = "#{version_to_download.name}"
file_content = "#{version_to_download.liquid_code.to_s}"
file = File.open(filename, "w") { |f| f.write (file_content) }
send_data file
end
I also tried send_file but that produces the error
no implicit conversion of Fixnum into String
I also tried to just write dummy values like below, and it still produced a file named after the method with a fixnum inside it.
file = File.open("DOES THIS CHANGE THE FILENAME?", "w") { |f| f.write ("FILE CONTENT?") }
I feel I am missing something obvious but I cannot figure it out after looking at many examples here and in blogs.
If you don't send along the filename as an option for send_data, it defaults to the method name.
Secondly, send_data wants the data itself. With the block form, File.open returns the value of the block, which here is the byte count returned by f.write, so that number is what gets sent.
Try this...
send_data(File.read(filename), filename: filename)
Or skip the intermediate file and try...
send_data(version_to_download.liquid_code.to_s, filename: filename)
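To see where the number in the download comes from, here is a small sketch in plain Ruby, without Rails. The temp directory and the file name are placeholders chosen for the example.

```ruby
require 'tmpdir'

content = 'FILE CONTENT?'
returned = nil
read_back = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'backup.liquid') # placeholder file name

  # Block form: File.open returns whatever the block returns,
  # and f.write returns the number of bytes written.
  returned = File.open(path, 'w') { |f| f.write(content) }

  # Reading the file back gives the String that send_data expects.
  read_back = File.read(path)
end

puts returned.inspect  # an Integer byte count, not the file contents
puts read_back.inspect # the original content
```

Passing the String (either read back from disk or taken directly from the record) is what avoids sending the byte count.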

How to convert word file to PDF in ROR

I am using the Libreconv gem to convert Word to PDF, but it's not working with S3
bucket = Aws::S3::Bucket.new('bucket-name')
object = bucket.object file.attachment.blob.key
path = object.presigned_url(:get)
Libreconv.convert(path, "public/test.pdf")
If I try to convert this path to PDF using Libreconv, it gives me a "filename too long" error. I have written this code in an ActiveJob, so please suggest a solution that works with ActiveJob.
Can someone please suggest how I can convert a Word file to PDF?
Here path is https://domain.s3.amazonaws.com/Bf5qPUP3znZGCHCcTWHcR5Nn?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIZ6RZ7J425ORVUYQ%2F20181206%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20181206T051240Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=b89c47a324b2aa423bf64dfb343e3b3c90dce9b54fa9fe1bc4efa9c248e912f9
and error I am getting is
Error: source file could not be loaded
*** Errno::ENAMETOOLONG Exception: File name too long @ rb_sysopen - /tmp/Bf5qPUP3znZGCHCcTWHcR5Nn?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIZ6RZ7J425ORVUYQ%2F20181206%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20181206T051240Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=b89c47a324b2aa423bf64dfb343e3b3c90dce9b54fa9fe1bc4efa9c248e912f9.pd
It seems that the name of your PDF is built with all the query params needed to fetch the docx from S3.
I suppose it happens in this line:
target_tmp_file = "#{target_path}/#{File.basename(@source, ".*")}.#{File.basename(@convert_to, ":*")}"
@source is https://domain.s3.amazonaws.com/Bf5qPUP3znZGCHCcTWHcR5Nn?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIZ6RZ7J425ORVUYQ%2F20181206%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20181206T051240Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=b89c47a324b2aa423bf64dfb343e3b3c90dce9b54fa9fe1bc4efa9c248e912f9 and
> File.basename(@source, ".*")
=> "Bf5qPUP3znZGCHCcTWHcR5Nn?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIZ6RZ7J425ORVUYQ%2F20181206%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20181206T051240Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=b89c47a324b2aa423bf64dfb343e3b3c90dce9b54fa9fe1bc4efa9c248e912f9"
As a result Libreconv gem tries to create a tmp file with this long name and it's too long - that's why an error is raised.
Possible solution: split the process into separate steps of fetching file and converting it. Something like:
require "open-uri"
bucket = Aws::S3::Bucket.new('bucket-name')
object = bucket.object file.attachment.blob.key
path = object.presigned_url(:get)
doc_file = URI.open(path) # Kernel#open no longer accepts URLs as of Ruby 3.0
begin
Libreconv.convert(doc_file.path, "public/test.pdf")
ensure
doc_file.delete
end
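The difference between taking the basename of the whole presigned URL and of only its path component can be checked with the standard library alone. The URL in this sketch is a shortened, made-up stand-in for a presigned URL, not a real one.

```ruby
require 'uri'

# Made-up stand-in for a presigned URL (real ones carry a much longer query).
url = 'https://example-bucket.s3.amazonaws.com/Bf5qPUP3?X-Amz-Expires=900&X-Amz-Signature=abc123'

# File.basename on the full URL keeps the query string in the name.
naive_name = File.basename(url, '.*')

# Parsing first and using only the path drops the query string.
key_name = File.basename(URI.parse(url).path, '.*')

puts naive_name
puts key_name
```

Downloading to a local file first, as above, sidesteps the problem because the converter then only ever sees a short local path.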
The following is my answer using the combine_pdf gem:
tape = Tape.new(file)
result = tape.preview
tempfile = Tempfile.new(['foo', '.pdf'])
File.open(tempfile, 'wb') do |f|
f.write result
end
path = tempfile.path
combine_pdf(path)
and to load the file from S3 I used
object = @bucket.object object_key
path = object.presigned_url(:get)
response = Net::HTTP.get_response(URI.parse(path)).body

Convert pdf file to base64 string

I have working Paperclip gem in my app for documents (pdf, doc). I need to pass the document to some other third party application via post request.
I tried to convert the paperclip attachment via Base64 but it throws error:
no implicit conversion of Tempfile into String
Here is how I did it:
# get url from the paperclip file
url = document.doc.url # https://s3-ap-southeast-1.amazonaws.com/xx-eng/documents/xx/000/000/xx/original/doc.pdf
file_data = open(url)
# Encode the bytes to base64 - this line throws the error
base_64_file = Base64.encode64(file_data)
Do you have any suggestion how to avoid the Tempfile error?
You need to read the file first.
base_64_file = Base64.encode64(file_data.read)
Here is working example:
$ bundle exec rails c
=> file = open("tmp/file.pdf")
#> #<File:tmp/receipts.pdf>
=> base_64 = Base64.encode64(file)
#> TypeError: no implicit conversion of File into String
=> base_64 = Base64.encode64(file.read)
#> "JVBERi0xLjQKMSAwIG9iago8PAovVGl0b/BBQEPgQ ......J0ZgozMDM0OQolJUVPRgo=\n"
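The same behaviour can be reproduced without S3 or Paperclip. This sketch uses a temp file with placeholder bytes. Base64.strict_encode64 is used instead of encode64 only to avoid the line breaks that encode64 inserts, which is a choice made for this example rather than something the fix requires.

```ruby
require 'base64'
require 'tempfile'

encoded = nil
error_class = nil

Tempfile.create(['doc', '.pdf']) do |file|
  file.binmode
  file.write('%PDF-1.4 placeholder bytes')
  file.rewind

  begin
    Base64.encode64(file) # passing the IO object itself
  rescue TypeError => e
    error_class = e.class # "no implicit conversion of File into String"
  end

  encoded = Base64.strict_encode64(file.read) # passing the bytes
end

puts error_class
puts Base64.strict_decode64(encoded)
```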
The answer from @3елёный didn't work for me - maybe because it's an S3 file.
However, I managed to find a way with a Paperclip method:
file_data = Paperclip.io_adapters.for(url).read
base_64_file = Base64.encode64(file_data)

How to write to tmp file or stream an image object up to s3 in ruby on rails

The code below resizes my image. But I am not sure how to write it out to a temp file or blob so I can upload it to s3.
origImage = MiniMagick::Image.open(myPhoto.tempfile.path)
origImage.resize "200x200"
thumbKey = "tiny-#{key}"
obj = bucket.objects[thumbKey].write(:file => origImage.write("tiny.jpg"))
I can upload the original file just fine to s3 with the below command:
obj = bucket.objects[key].write('data')
obj.write(:file => myPhoto.tempfile)
I think I want to create a temp file, read the image file into it and upload that:
thumbFile = Tempfile.new('temp')
thumbFile.write(origImage.read)
obj = bucket.objects[thumbKey].write(:file => thumbFile)
but the origImage class doesn't have a read command.
UPDATE: I was reading the source code and found this out about the write command
# Writes the temporary file out to either a file location (by passing in a String) or by
# passing in a Stream that you can #write(chunk) to repeatedly
#
# @param output_to [IOStream, String] Some kind of stream object that needs to be read or a file path as a String
# @return [IOStream, Boolean] If you pass in a file location [String] then you get a success boolean. If its a stream, you get it back.
# Writes the temporary image that we are using for processing to the output path
And the s3 api docs say you can stream the content using a code block like:
obj.write do |buffer, bytes|
# writing fewer than the requested number of bytes to the buffer
# will cause write to stop yielding to the block
end
How do I change my code so
origImage.write(s3stream here)
http://docs.aws.amazon.com/AWSRubySDK/latest/AWS/S3/S3Object.html
UPDATE 2
This code successfully uploads the thumbnail file to s3. But I would still love to know how to stream it up. It would be much more efficient I think.
#resize image and upload a thumbnail
smallImage = MiniMagick::Image.open(myPhoto.tempfile.path)
smallImage.resize "200x200"
thumbKey = "tiny-#{key}"
newFile = Tempfile.new("tempimage")
smallImage.write(newFile.path)
obj = bucket.objects[thumbKey].write('data')
obj.write(:file => newFile)
Have you tried smallImage.to_blob?
The code below is copied from https://github.com/probablycorey/mini_magick/blob/master/lib/mini_magick.rb
# Gives you raw image data back
# @return [String] binary string
def to_blob
f = File.new @path
f.binmode
f.read
ensure
f.close if f
end
Have you looked into the paperclip gem? The gem offers direct compatibility to s3 and works great.

Rubyzip: Export zip file directly to S3 without writing tmpfile to disk?

I have this code, which writes a zip file to disk, reads it back, uploads to s3, then deletes the file:
compressed_file = some_temp_path
Zip::ZipOutputStream.open(compressed_file) do |zos|
some_file_list.each do |file|
zos.put_next_entry(file.some_title)
zos.print IO.read(file.path)
end
end # Write zip file
s3 = Aws::S3.new(S3_KEY, S3_SECRET)
bucket = Aws::S3::Bucket.create(s3, S3_BUCKET)
bucket.put("#{BUCKET_PATH}/archive.zip", IO.read(compressed_file), {}, 'authenticated-read')
File.delete(compressed_file)
This code works already but what I want is to not create the zip file anymore, to save a few steps. I was wondering if there is a way to export the zipfile data directly to s3 without having to first create a tmpfile, read it back, then delete it?
I think I just found the answer to my question.
It's Zip::ZipOutputStream.write_buffer. I'll check this out and update this answer when I get it working.
Update
It does work. My code is like this now:
compressed_filestream = Zip::ZipOutputStream.write_buffer do |zos|
some_file_list.each do |file|
zos.put_next_entry(file.some_title)
zos.print IO.read(file.path)
end
end # Outputs zipfile as StringIO
s3 = Aws::S3.new(S3_KEY, S3_SECRET)
bucket = Aws::S3::Bucket.create(s3, S3_BUCKET)
compressed_filestream.rewind
bucket.put("#{BUCKET_PATH}/archive.zip", compressed_filestream.read, {}, 'authenticated-read')
write_buffer returns a StringIO, which needs to be rewound before it is read. Now I don't need to create and delete the tmpfile.
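Why the rewind matters can be shown with a bare StringIO, independent of rubyzip. After writing, the position sits at the end of the buffer, so read returns an empty string until the stream is rewound. The string written here is just a placeholder.

```ruby
require 'stringio'

io = StringIO.new
io.write('zip bytes would go here')

at_end = io.read         # position is at the end, nothing left to read
io.rewind
after_rewind = io.read   # whole buffer, read from the start
whole_buffer = io.string # returns the buffer regardless of position

puts at_end.inspect
puts after_rewind
```

If only the raw String is needed, StringIO#string gives it without any rewinding.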
I'm just wondering now whether write_buffer is more memory intensive or heavier than open. Or is it the other way around?
