upload an RMagick-generated file from Heroku to Amazon S3 - ruby-on-rails

I am creating a Rails app which is hosted on Heroku and that allows the user to generate animated GIFs on the fly based on an original JPG that's hosted somewhere in the web (think of it as a crop-resize app). I tried Paperclip but, AFAIK, it does not handle dynamically-generated files. I am using the aws-sdk gem and this is a code snippet of my controller:
im = Magick::Image.read(#animation.url).first
fr1 = im.crop(#animation.x1,#animation.y1,#animation.width,#animation.height,true)
str1 = fr1.to_blob
fr2 = im.crop(#animation.x2,#animation.y2,#animation.width,#animation.height,true)
str2 = fr2.to_blob
list = Magick::ImageList.new
list.from_blob(str1)
list.from_blob(str2)
list.delay = #animation.delay
list.iterations = 0
That is for the basic creation of a two-frame animation. RMagick can generate a GIF in my development computer with these lines:
list.write("#{Rails.public_path}/images/" + #animation.filename)
I tried uploading the list structure to S3:
# upload to Amazon S3
s3 = AWS::S3.new
bucket = s3.buckets['mybucket']
obj = bucket.objects[#animation.filename]
obj.write(:single_request => true, :content_type => 'image/gif', :data => list)
But I don't have a size method in RMagick::ImageList that I can use to specify that. I tried "precompiling" the GIF into another RMagick::Image:
anim = Magick::Image.new(#animation.width, #animation.height)
anim.format = "GIF"
list.write(anim)
But Rails crashes with a segmentation fault:
/path/to/my_controller.rb:103: [BUG] Segmentation fault ruby 1.8.7 (2010-01-10 patchlevel 249) [universal-darwin11.0]
Abort trap: 6
Line 103 corresponds to list.write(anim).
So right now I have no idea how to do this and would appreciate any help I receive.

As per #mga's request in his answer to his original question...
a non-filesystem based approach is pretty simple
rm_image = Magick::Image.from_blob(params[:image][:datafile].read)[0]
# [0] because from_blob returns an array
# the blob, presumably, can have multiple images data in it
a_thumbnail = rm_image.resize_to_fit(150, 150)
# just as an example of doing *something* with it before writing
s3_bucket.objects['my_thumbnail.jpg'].write(a_thumbnail.to_blob, {:acl=>:public_read})
Voila! reading an uploaded file, manipulating it with RMagick, and writing it to s3 without ever touching the filesystem.

Since this project is hosted in Heroku I cannot use the filesystem so that is why I was trying to do everything via code. I found that Heroku does have a temporary-writable folder: http://devcenter.heroku.com/articles/read-only-filesystem
This works just fine in my case since I don't need the file after this request.
The resulting code:
im = Magick::Image.read(#animation.url).first
fr1 = im.crop(#animation.x1,#animation.y1,#animation.width,#animation.height,true)
fr2 = im.crop(#animation.x2,#animation.y2,#animation.width,#animation.height,true)
list = Magick::ImageList.new
list << fr1
list << fr2
list.delay = #animation.delay
list.iterations = 0
# gotta packet the file
list.write("#{Rails.root}/tmp/#{#animation.filename}.gif")
# upload to Amazon S3
s3 = AWS::S3.new
bucket = s3.buckets['mybucket']
obj = bucket.objects[#animation.filename]
obj.write(:file => "#{Rails.root}/tmp/#{#animation.filename}.gif")
It would be interesting to know if a non-filesystem-writing solution is possible.

I am updating this answer for AWS SDK Version 2 which should be:
rm_image = Magick::Image.from_blob(params[:image][:datafile].read)[0]
# [0] because from_blob returns an array
# the blob, presumably, can have multiple images data in it
a_thumbnail = rm_image.resize_to_fit(150, 150)
# just as an example of doing *something* with it before writing
s3 = Aws::S3::Resource.new
bucket = s3.bucket('mybucket')
obj = bucket.object('filename')
obj.put(body: background.to_blob)

I think there's a few things going on here. First, the documentation for RMagick is sub-par, and its easy to get side-tracked. The code you're using to generate the gif can be a little simpler. I cooked up a very contrived example here:
#!/usr/bin/env ruby
require 'rubygems'
require 'RMagick'
# read in source file
im = Magick::Image.read('foo.jpg').first
# make two slightly different frames
fr1 = im.crop(0, 100, 300, 300, true)
fr2 = im.crop(0, 200, 300, 300, true)
# create an ImageList
list = Magick::ImageList.new
# add our images to it
list << fr1
list << fr2
# set some basic values
list.delay = 100
list.iterations = 0
# write out an animated gif to the filesystem
list.write("foo.gif")
This code works -- it reads in a jpg I have locally, and writes out a 2-frame animation. Obviously I've hardcoded some values here, but there's no reason this shouldn't work for you, although I am running ruby 1.9.2 and probably a different version of RMagick, but this is basic code.
The second issue is totally unrelated -- is it possible to upload an image generated in IM to S3 without actually hitting the filesystem? Basically, will this ever work:
obj.write(:single_request => true, :content_type => 'image/gif', :data => list)
I'm not sure if it is or not. I experimented with calling list.to_blob, but it only outputs the first frame, and it's as a JPG, although I didn't spend much time on it. You might be able to fool list.write into outputting somewhere, but rather than going down that road, I would personally just output the file unless that is impossible for some reason.

Related

Ruby on Rails - How to convert to images some elements from a word document

Context
In our platform we allow users to upload word documents, those documents are stored in google drive and then dowloaded again to our platform in HTML format to create a section where the users can interact with that content.
Rails 5.0.7
Ruby 2.5.7p206
selenium-webdriver 3.142.7 (latest stable version compatible with our ruby and rails versions)
Problem
Some of the documents have charts or graphics inside that are not processed correctly giving wrong results after all the process.
We have been trying to fix this problem at the moment we get the word document and before to send it to google drive.
I'm looking for a simple way to export the entire chart and/or table as an image, if anyone knows of a way to do this the advice would be much appreciated.
Edit 1: Adding some screenshots:
This screenshot is from the original word doc:
And this is how it looks in our systems:
Here are the approaches I have tried that haven't worked for me so far.
Approach 1
Using nokogiri to read the document and found the nodes that contain the charts (we've found that they are called drawing) and then use Selenium to navigate through the file and take and screenshot of that particular section.
The problem we found with this approach is that the versions our gems are not compatible with the latest versions of selenium and its web drivers (chrome or firefox) and it is not posible to perform this action.
Other problem, and it seems is due to security, is that selenium is not able to browse inside local files and open it.
options = Selenium::WebDriver::Firefox::Options.new(binary: '/usr/bin/firefox', headless: true)
driver = Selenium::WebDriver.for :firefox, options: options
path = "#{Rails.root}/doc_file.docx"
driver.navigate.to("file://#{path}")
# Here occurs the first issue, it is not able to navigate to the file
puts "Title: #{driver.title}"
puts "URL: #{driver.current_url}"
# Below is the code that I am trying to use to replace the images with the modified images
drawing_elements = driver.find_elements(:css, 'w|drawing')
modified_paragraphs = []
drawing_elements.each do |drawing_element|
paragraph_element = drawing_element.find_element(:xpath, '..')
paragraph_element.screenshot.save('paragraph.png')
modified_paragraph = File.read('paragraph.png')
modified_paragraphs << modified_paragraph
end
driver.quit
file = File.open(File.join(Rails.root, 'doc_file.docx'))
doc = Nokogiri::XML(file)
drawing_elements = doc.css('w|drawing')
drawing_elements.each_with_index do |drawing_element, i|
paragraph_element = drawing_element.parent
paragraph_element.replace(modified_paragraphs[i])
end
new_doc_file = File.write('modified_doc.docx', doc.to_xml)
s3_client.put_object(bucket: bucket, key: #document_path, body: new_doc_file)
File.delete('doc_file.docx')
Approach 2
Using nokogiri to get the drawing elements and the try to convert it directly to an image using rmagick or mini_magick.
It is only possible if the drawing element actually contains an image, it can convert that correctly to an image, but the problem is when inside of the drawing element are not images but other elements like graphicData, pic, blipFill, blip. It needs to start looping into the element and rebuilding it, but at that point of time it seems that the element is malformed and it can't rebuild it.
Other issue with this approach is when it founds elements that seem to conform an svg file, it also needs to loop into all the elements and try to rebuild it, but the same as the above issue, it seems that the element is malformed.
response = s3_client.get_object(bucket: bucket, key: #document_path)
docx = response.body.read
Zip::File.open_buffer(docx) do |zip|
doc = zip.find_entry("word/document.xml")
doc_xml = doc.get_input_stream.read
doc = Nokogiri::XML(doc_xml)
drawing_elements = doc.xpath("//w:drawing")
drawing_elements.each do |drawing_element|
node = get_chil_by_name(drawing_element, "graphic")
if node.xpath("//a:graphicData/a:pic/a:blipFill/a:blip").any?
img_data = node.xpath("//a:graphicData/a:pic/a:blipFill/a:blip").first.attributes["r:embed"].value
img = Magick::Image.from_blob(img_data).first
img.write("node.jpeg")
node.replace("<img src='#{img.to_blob}'/>")
elsif node.xpath("//a:graphicData/a:svg").any?
svg_data = node.xpath("//a:graphicData/a:svg").to_s
Prawn::Document.generate("node.pdf") do |pdf|
pdf.svg svg_data, at: [0, pdf.cursor], width: pdf.bounds.width
end
else
puts "unsupported format"
end
end
# update the file in S3
s3.put_object(bucket: bucket, key: #document_path, body: doc)
end
Approach 3
Convert the elements since its parents to a pdf file and then to an image.
Basically the same issue as in the approach 2, it needs to loop inside all the elements and try to rebuild it, we haven't found a way to do that.

Use ActiveStorage Image in wicked_pdf

I can't get ActiveStorage images to work in production. I want to use a resized image (variant) within the body of the PDF I'm generating.
= image_tag(#post.image.variant(resize_to_limit: [150, 100]))
It worked in development but in production generating the PDF hangs indefinitely unless I take that line out.
I've tried things like #post.image.variant(resize_to_limit: [150, 100]).processed.url and setting Rails.application.default_url_options = { host: "example.com" }
Ironically when I restart Passenger it sends the PDF to the browser and it actually looks fine. The image is included.
This is similar:
= wicked_pdf_image_tag(#post.image.variant(resize_to_limit: [150, 100]).processed.url)
Rails 7.0.3, Ruby 3.1.2, wicked_pdf 2.6.3
Thanks to #Unixmonkey I added passenger_min_instances 3; to my server block in Nginx config and it worked initially but would hang Passenger under load. Since I didn't have the RAM to throw at increasing that number I came up with a different solution based on reading images from file.
= image_tag(active_storage_to_base64_image(#post.image.variant(resize_to_limit: [150, 100])))
Then I created a helper in application_helper.rb
def active_storage_to_base64_image(image)
require "base64"
file = File.open(ActiveStorage::Blob.service.path_for(image.processed.key))
base64 = Base64.encode64(file.read).gsub(/\s+/, '')
file.close
"data:image/png;base64,#{Rack::Utils.escape(base64)}"
end
I've hard coded it for PNG files as that's all I needed. Only works for Disk storage. Would welcome improvements

How to create movie screenshot by ffmpeg in an amazon S3 path

I tried to create using ffmpeg a video screenshot from a remote video url in heroku console. Below is how I generated a movie instance and can see also an empty ready to be written file at S3. But the last line movie.screenshot is not working and generates this error:
FFMPEG::Error: Failed encoding.Errors: no output file created
Here is the code
s3 = Aws::S3::Resource.new(region: 'us-west-1')
bucket = s3.bucket("ruby-sample-kb-#{SecureRandom.uuid}")
bucket.create
object = bucket.object('ex-vid-test-kb.jpg')
object.put(acl: "public-read-write")
path = object.public_url
movie = FFMPEG::Movie.new("https://www.googleapis.com/download/storage/v1/b/seppoav/o/3606137_51447286560__56BAF29C-05CB-4223-BAE6-655DF2236321.MOV?generation=1492780072394755&alt=media")
movie.screenshot(path, :seek_time => 2)
I also tried the following line just if it should be written via put. What am I missing here?
object.put(acl: "public-read", body: movie.screenshot(path, :seek_time => 2))
I ended up convincing myself that ffmpeg movie.screenshot won't work for remote url path. So I had to figure out a solution where I can create tempfile in heroku system although the tempfiles stay per dyno.
file = Tempfile.new [prefix, suffix], "#{Rails.root}/app/assets/images/video_screenshots"
movie.screenshot(file.path)
This worked Ok.

Tracking Upload Progress of File to S3 Using Ruby aws-sdk

Firstly, I am aware that there are quite a few questions that are similar to this one in SO. I have read most, if not all of them, over the past week. But I still can't make this work for me.
I am developing a Ruby on Rails app that allows users to upload mp3 files to Amazon S3. The upload itself works perfectly, but a progress bar would greatly improve user experience on the website.
I am using the aws-sdk gem which is the official one from Amazon. I have looked everywhere in its documentation for callbacks during the upload process, but I couldn't find anything.
The files are uploaded one at a time directly to S3 so it doesn't need to load it into memory. No multiple file upload necessary either.
I figured that I may need to use JQuery to make this work and I am fine with that.
I found this that looked very promising: https://github.com/blueimp/jQuery-File-Upload
And I even tried following the example here: https://github.com/ncri/s3_uploader_example
But I just could not make it work for me.
The documentation for aws-sdk also BRIEFLY describes streaming uploads with a block:
obj.write do |buffer, bytes|
# writing fewer than the requested number of bytes to the buffer
# will cause write to stop yielding to the block
end
But this is barely helpful. How does one "write to the buffer"? I tried a few intuitive options that would always result in timeouts. And how would I even update the browser based on the buffering?
Is there a better or simpler solution to this?
Thank you in advance.
I would appreciate any help on this subject.
The "buffer" object yielded when passing a block to #write is an instance of StringIO. You can write to the buffer using #write or #<<. Here is an example that uses the block form to upload a file.
file = File.open('/path/to/file', 'r')
obj = s3.buckets['my-bucket'].objects['object-key']
obj.write(:content_length => file.size) do |buffer, bytes|
buffer.write(file.read(bytes))
# you could do some interesting things here to track progress
end
file.close
After read the source code of the AWS gem, I've adapted (or mostly copy) the multipart upload method to yield the current progress based on how many chunks have been uploaded
s3 = AWS::S3.new.buckets['your_bucket']
file = File.open(filepath, 'r', encoding: 'BINARY')
file_to_upload = "#{s3_dir}/#{filename}"
upload_progress = 0
opts = {
content_type: mime_type,
cache_control: 'max-age=31536000',
estimated_content_length: file.size,
}
part_size = self.compute_part_size(opts)
parts_number = (file.size.to_f / part_size).ceil.to_i
obj = s3.objects[file_to_upload]
begin
obj.multipart_upload(opts) do |upload|
until file.eof? do
break if (abort_upload = upload.aborted?)
upload.add_part(file.read(part_size))
upload_progress += 1.0/parts_number
# Yields the Float progress and the String filepath from the
# current file that's being uploaded
yield(upload_progress, upload) if block_given?
end
end
end
The compute_part_size method is defined here and I've modified it to this:
def compute_part_size options
max_parts = 10000
min_size = 5242880 #5 MB
estimated_size = options[:estimated_content_length]
[(estimated_size.to_f / max_parts).ceil, min_size].max.to_i
end
This code was tested on Ruby 2.0.0p0

Ruby, RSVG and PNG streams

I'm trying to do an image conversion in a rails app from SVG to PNG. ImageMagick didn't work out for me, due to Heroku not able / wanting to upgrade IM at this time. I'm testing out some ideas of using RSVG2 / Cairo in dev but running into a roadblock.
I can easily convert and save the SVG to PNG like this:
#svg_test.rb
require 'debugger'
require 'rubygems'
require 'rsvg2'
SRC = 'test.svg'
DST = 'test.png'
svg = RSVG::Handle.new_from_file(SRC)
surface = Cairo::ImageSurface.new(Cairo::FORMAT_ARGB32, 800, 800)
context = Cairo::Context.new(surface)
context.render_rsvg_handle(svg)
surface.write_to_png(DST)
But this only lets me write PNG files out. In the app, I need to be able to generate these on the fly, then send them down to the client browser as data. And I can't figure out how to do this, or even if its supported. I know I can call surface.data to get the raw data at least, but I don't know enough about image formats to know how to get this as a PNG.
Thanks
Ah ha! I was so close and its pretty obvious in hindsight. Simply call the surface.write_to_png function with a StringIO object. This fills the string object, which you can then get the bytes for. Here's the finished svg_to_png function I wrote, along with a sample controller that calls it. Hope this helps someone else somewhere.
ImageConvert function:
def self.svg_to_png(svg)
svg = RSVG::Handle.new_from_data(svg)
surface = Cairo::ImageSurface.new(Cairo::FORMAT_ARGB32, 800, 800)
context = Cairo::Context.new(surface)
context.render_rsvg_handle(svg)
b = StringIO.new
surface.write_to_png(b)
return b.string
end
Test controller:
def svg_img
path = File.expand_path('../../../public/images/test.svg', __FILE__)
f = File.open(path, 'r')
t = ImageConvert.svg_to_png(f.read)
send_data(t , :filename => 'test.png', :type=>'image/png')
end

Resources