Paperclip Nginx 504 Gateway Time-out - ruby-on-rails

I have a Rails 4 application that lets users upload videos using the jQuery Dropzone plugin and the paperclip gem. Each uploaded video is encoded into multiple formats and uploaded to Amazon S3 in the background using the delayed_paperclip, av-transcoder and sidekiq gems.
Everything works fine with most videos, but with larger ones (around 1.1GB), once the upload reaches what seems to be the end of the Dropzone progress bar, the request returns an Nginx 504 Gateway Time-out.
As for the servers, the Rails app runs on Nginx + Passenger on a couple of servers behind a load balancer (also Nginx). I have no timeouts set in the upstream section of the load balancer, and client_max_body_size is set to 2000M on both the load balancer and the app servers. I've tried setting passenger_pool_idle_time to a large value (600) and send_timeout to 600s; neither made any difference.
Note: I made those changes in the host files of both servers as well as of the load balancer, and always restarted nginx afterwards.
I've also read several answers to similar problems, like this one and this one, but I still can't figure this out, and Google wasn't much more helpful either.
Some extra notes for those unfamiliar with the paperclip/delayed_paperclip process: the file is uploaded to the server, and at that point the operation is done as far as the user is concerned. In the background, the post-processing of the video (encoding and uploading to S3) is pushed to Redis as a job, and Sidekiq processes it whenever it has time and resources.
What could be causing this issue? How can I debug this and solve it?
UPDATE
Thanks to Sergey's answer I was able to solve the issue. Since I was restricted to a specific version of Paperclip, I couldn't update it to the newest version that has the fix, therefore I'll leave here what I ended up doing.
In the engine that I use to handle the uploads I've added the following code in the engine_name.rb file to override the methods from Paperclip that needed fixing:
Paperclip::AbstractAdapter.class_eval do
  def copy_to_tempfile(src)
    link_or_copy_file(src.path, destination.path)
    destination
  end

  def link_or_copy_file(src, dest)
    Paperclip.log("Trying to link #{src} to #{dest}")
    FileUtils.ln(src, dest, force: true) # overwrite existing
    @destination.close
    @destination.open.binmode
  rescue Errno::EXDEV, Errno::EPERM, Errno::ENOENT => e
    Paperclip.log("Link failed with #{e.message}; copying link #{src} to #{dest}")
    FileUtils.cp(src, dest)
  end
end
Paperclip::AttachmentAdapter.class_eval do
  def copy_to_tempfile(source)
    if source.staged?
      link_or_copy_file(source.staged_path(@style), destination.path)
    else
      source.copy_to_local_file(@style, destination.path)
    end
    destination
  end
end
Paperclip::Storage::Filesystem.class_eval do
  def flush_writes #:nodoc:
    @queued_for_write.each do |style_name, file|
      FileUtils.mkdir_p(File.dirname(path(style_name)))
      begin
        move_file(file.path, path(style_name))
      rescue SystemCallError
        # Moving failed; fall back to copying the file in 16 KB chunks
        File.open(path(style_name), "wb") do |new_file|
          while chunk = file.read(16 * 1024)
            new_file.write(chunk)
          end
        end
      end
      unless @options[:override_file_permissions] == false
        resolved_chmod = (@options[:override_file_permissions] &~ 0111) || (0666 &~ File.umask)
        FileUtils.chmod( resolved_chmod, path(style_name) )
      end
      file.rewind
    end
    after_flush_writes # allows attachment to clean up temp files
    @queued_for_write = {}
  end

  private

  def move_file(src, dest)
    # Support hardlinked files
    if File.identical?(src, dest)
      File.unlink(src)
    else
      FileUtils.mv(src, dest)
    end
  end
end

I faced a similar issue a while ago. Maybe my experience will help.
We had an m3.medium instance on Amazon with 4GB of memory.
Users could upload large video files, and we got a 504 error when uploading files larger than 400MB.
While monitoring and logging the upload process, we found that Paperclip creates 4 files per attachment, so the instance spent all of its resources on file system work.
The problem is described here:
https://github.com/thoughtbot/paperclip/issues/1642
The proposed solution is to use links instead of file copies when possible. You can see the corresponding code changes here:
https://github.com/arnonhongklay/paperclip/commit/cd80661df18d7cd112944bfe26d90cb87c928aad
However, 2 days ago Paperclip was updated to version 5.2.0, which implements a similar solution.
It now creates only one file per attachment, so our file system is no longer overloaded, and after updating to 5.2.0 we stopped receiving the 504 error.
Conclusion:
If you're restricted to an older Paperclip version for some reason, use the monkey patch from the link above.
Otherwise, update Paperclip to version 5.2.0. That should help.
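To make the "links instead of files" idea concrete, here is a minimal standalone sketch in plain Ruby, without Paperclip. The helper name link_or_copy and the file names are made up for illustration; the rescue list mirrors the one in the patch quoted above. A hard link only adds a second directory entry for the same data, so a 1GB upload is not rewritten to disk each time it changes hands.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: hard-link src to dest when possible and fall back to
# a real copy when the link cannot be created (for example across devices).
# Returns :linked or :copied so the caller can see which path was taken.
def link_or_copy(src, dest)
  FileUtils.ln(src, dest, force: true) # force: replace an existing dest
  :linked
rescue Errno::EXDEV, Errno::EPERM, Errno::ENOENT
  FileUtils.cp(src, dest)
  :copied
end

Dir.mktmpdir do |dir|
  src  = File.join(dir, "upload.bin")
  dest = File.join(dir, "staged.bin")
  File.binwrite(src, "video bytes")

  how = link_or_copy(src, dest)
  # File.identical? is true only when both paths refer to the same file
  puts "#{how}: same file on disk? #{File.identical?(src, dest)}"
end
```

Whether the link succeeds depends on the file system, which is why the fallback copy is kept.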

Related

Use ActiveStorage Image in wicked_pdf

I can't get ActiveStorage images to work in production. I want to use a resized image (variant) within the body of the PDF I'm generating.
= image_tag(@post.image.variant(resize_to_limit: [150, 100]))
It worked in development but in production generating the PDF hangs indefinitely unless I take that line out.
I've tried things like @post.image.variant(resize_to_limit: [150, 100]).processed.url and setting Rails.application.default_url_options = { host: "example.com" }
Ironically when I restart Passenger it sends the PDF to the browser and it actually looks fine. The image is included.
This is similar:
= wicked_pdf_image_tag(@post.image.variant(resize_to_limit: [150, 100]).processed.url)
Rails 7.0.3, Ruby 3.1.2, wicked_pdf 2.6.3
Thanks to @Unixmonkey I added passenger_min_instances 3; to my server block in the Nginx config. It worked initially, but Passenger would hang under load. Since I didn't have the RAM to increase that number, I came up with a different solution based on reading the images from file.
= image_tag(active_storage_to_base64_image(@post.image.variant(resize_to_limit: [150, 100])))
Then I created a helper in application_helper.rb
def active_storage_to_base64_image(image)
  require "base64"
  # Read the processed variant straight from disk (Disk service only)
  path = ActiveStorage::Blob.service.path_for(image.processed.key)
  # strict_encode64 emits no line breaks, so no whitespace stripping is needed
  base64 = Base64.strict_encode64(File.binread(path))
  "data:image/png;base64,#{Rack::Utils.escape(base64)}"
end
I've hard-coded it for PNG files as that's all I needed. It only works for Disk storage. I would welcome improvements.
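One possible improvement, sketched in plain Ruby: take the content type as a parameter instead of hard-coding PNG. The helper name data_uri is hypothetical, and the sketch deliberately leaves out how the bytes are obtained. In a Rails app they would have to come from Active Storage; I believe variants expose a download method that would lift the Disk-only restriction, but treat that as an assumption to verify against your Rails version.

```ruby
# Hypothetical helper: build a data URI from raw bytes and a MIME type.
# pack("m0") produces Base64 without line breaks, the same output as
# Base64.strict_encode64, and needs no extra require.
def data_uri(bytes, content_type)
  "data:#{content_type};base64,#{[bytes].pack('m0')}"
end

puts data_uri("hi", "text/plain")
```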

Paperclip or Google Cloud Storage issue when renaming paths

I've a Rails app with Paperclip and I use Google Cloud Storage. So far so good.
To avoid having both development and production use the same storage, I decided to change the default Paperclip path to one based on the environment, so that every environment has its own directory. I then moved the old images from the default Paperclip path to the new one.
The problem is that now old images give a 404, whereas any new image I upload works properly. Is there any way to fix that?
Here it's the previous settings:
module MyApp
  class Application < Rails::Application
    config.paperclip_defaults = {
      storage: :fog,
      fog_public: true,
      fog_directory: 'myapp-01',
      fog_credentials: {
        google_storage_access_key_id: ENV['GOOGLE_STORAGE_ID'],
        google_storage_secret_access_key: ENV['GOOGLE_STORAGE_SECRET'],
        provider: 'Google'
      }
    }
  end
end
I override the default using the following settings:
path: ":rails_env/:class/:attachment/:id_partition/:style/:filename",
url: "/:rails_root/:class/:attachment/:id_partition/:style/:filename"
My guess is that it's not sufficient to update the Paperclip config with the new path and move all images to the new directory. You also need to update the old records...
If you wonder, the old records point to root/images/?123456789.
Your guess is right. Changing the config is not enough; you also need to move the files. This is better left to a rake task or background job. I have some code for S3, but it should give you an idea of how to implement it for Google:
def old_key(image, file_name_field)
  # Previous `:path`: '/:class/:attachment/:id/:style/:filename'
  klass = self.class.to_s.pluralize.downcase
  attachment = image.pluralize
  "#{klass}/#{attachment}/#{id}/original/#{send(file_name_field)}"
end

def re_path(image)
  file_name_field = "#{image}_file_name"
  return if send(file_name_field).blank?
  old_object = bucket.object(old_key(image, file_name_field))
  return unless old_object.exists?
  Rails.logger.warn "Re-saving image attachment #{self.class}/#{id}"
  # Assigning a URI makes Paperclip download the file and store it under the new path
  send "#{image}=", URI.parse(old_object.public_url)
  save
end
I'm basically building the old path using my own interpolation, finding the object in S3 (hence the key/object lingo) and re-downloading every image from S3. Be careful with this, since you might incur extra costs for downloading rather than just moving the files, if moving is something Google allows.
Then I just called this method on every image for every object:
# `Object` stands in for your model class
Object.find_each { |o| o.re_path(:logo); o.re_path(:background) }
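Before moving anything, it can help to print the old and new keys side by side. The following is a rough plain-Ruby sketch of that interpolation, not Paperclip's own code: the helper names interpolate and id_partition and the sample values are made up, and the zero-padded three-digit grouping is how I understand Paperclip's :id_partition to behave for integer ids.

```ruby
# Hypothetical helper: split an integer id into zero-padded groups of three
# digits, e.g. 123 becomes "000/000/123".
def id_partition(id)
  format("%09d", id).scan(/\d{3}/).join("/")
end

# Hypothetical helper: replace each :placeholder in the pattern with the
# matching value. Raises KeyError if a placeholder has no value.
def interpolate(pattern, values)
  pattern.gsub(/:([a-z_]+)/) { values.fetch(Regexp.last_match(1).to_sym).to_s }
end

values = {
  rails_env: "production", class: "users", attachment: "avatars",
  id: 42, id_partition: id_partition(42), style: "original", filename: "me.png"
}

old_key = interpolate(":class/:attachment/:id/:style/:filename", values)
new_key = interpolate(":rails_env/:class/:attachment/:id_partition/:style/:filename", values)
puts "#{old_key} -> #{new_key}"
```

A dry run like this makes it easy to spot records whose old key does not match what is actually in the bucket.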

Set System Directory Rails Production Environment

I have an app that works fine on my development machine, but on my production server it uses a broken link to serve an image managed by the Paperclip gem.
The production environment is Linux (Debian), Apache and Passenger, and I am deploying with Capistrano.
The app is served from the following path, which is a symlink to the public folder of the currently deployed release:
/var/www/apps/root/appname
However, when I try and access it on the production server, the Apache error log displays this as the path it is looking in:
/var/www/apps/root/system
The correct path, however, is:
/var/www/apps/appname/shared/system
One option available to me is to create a symlink in root that directs system to the correct path, but I don't want to do this in case I want to deploy another app in the same root dir.
The url for this request is generated by rails, but Apache is what fetches the static resource (image files), so I have tried placing the following in my config/environments/production.rb:
ENV["RAILS_RELATIVE_URL_ROOT"] = '/appname/'
Which has resolved all other pathing issues I've been experiencing, but when rails generates the url (via the Paperclip gem), it doesn't seem to use it.
How can I set it so Paperclip uses the right path, and only uses it in production?
I have a workaround: add this as one of your initializers:
config/initializers/paperclip.rb
Paperclip::Attachment.class_eval do
  def url(style_name = default_style, options = {})
    if options == true || options == false # Backwards compatibility.
      res = @url_generator.for(style_name, default_options.merge(:timestamp => options))
    else
      res = @url_generator.for(style_name, default_options.merge(options))
    end
    # Prepend the relative URL root, minus its final /
    Rails.application.config.site_relative_url[0..-2]+res
  end
end
At the moment Paperclip doesn't work with ENV['RAILS_RELATIVE_URL_ROOT']. You can follow the issue here:
https://github.com/thoughtbot/paperclip/issues/889
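The prefixing step in the workaround can be isolated into a small function, which makes the edge cases easier to see. This is only a sketch: with_relative_root is a hypothetical name, and it assumes the generated path starts with "/" and is not a full URL (for example an S3 URL), which it does not try to handle.

```ruby
# Hypothetical helper: prepend a relative URL root such as "/appname/" to a
# generated path, tolerating a missing root and a trailing slash on the root.
def with_relative_root(root, path)
  prefix = root.to_s.chomp("/")
  # Nothing to add, or the path already carries the prefix
  return path if prefix.empty? || path.start_with?("#{prefix}/")
  "#{prefix}#{path}"
end

puts with_relative_root("/appname/", "/system/users/1/original/me.png")
```

Compared with slicing off the last character, this does not break when the configured root has no trailing slash.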

Tracking Upload Progress of File to S3 Using Ruby aws-sdk

Firstly, I am aware that there are quite a few questions that are similar to this one in SO. I have read most, if not all of them, over the past week. But I still can't make this work for me.
I am developing a Ruby on Rails app that allows users to upload mp3 files to Amazon S3. The upload itself works perfectly, but a progress bar would greatly improve user experience on the website.
I am using the aws-sdk gem which is the official one from Amazon. I have looked everywhere in its documentation for callbacks during the upload process, but I couldn't find anything.
The files are uploaded one at a time directly to S3 so it doesn't need to load it into memory. No multiple file upload necessary either.
I figured that I may need to use JQuery to make this work and I am fine with that.
I found this that looked very promising: https://github.com/blueimp/jQuery-File-Upload
And I even tried following the example here: https://github.com/ncri/s3_uploader_example
But I just could not make it work for me.
The documentation for aws-sdk also BRIEFLY describes streaming uploads with a block:
obj.write do |buffer, bytes|
  # writing fewer than the requested number of bytes to the buffer
  # will cause write to stop yielding to the block
end
But this is barely helpful. How does one "write to the buffer"? I tried a few intuitive options that would always result in timeouts. And how would I even update the browser based on the buffering?
Is there a better or simpler solution to this?
Thank you in advance.
I would appreciate any help on this subject.
The "buffer" object yielded when passing a block to #write is an instance of StringIO. You can write to the buffer using #write or #<<. Here is an example that uses the block form to upload a file.
file = File.open('/path/to/file', 'r')
obj = s3.buckets['my-bucket'].objects['object-key']
obj.write(:content_length => file.size) do |buffer, bytes|
  buffer.write(file.read(bytes))
  # you could do some interesting things here to track progress
end
file.close
After reading the source code of the AWS gem, I adapted (mostly copied) its multipart upload method to yield the current progress based on how many chunks have been uploaded:
s3 = AWS::S3.new.buckets['your_bucket']
file = File.open(filepath, 'r', encoding: 'BINARY')
file_to_upload = "#{s3_dir}/#{filename}"
upload_progress = 0
opts = {
  content_type: mime_type,
  cache_control: 'max-age=31536000',
  estimated_content_length: file.size,
}
part_size = self.compute_part_size(opts)
parts_number = (file.size.to_f / part_size).ceil.to_i
obj = s3.objects[file_to_upload]
begin
  obj.multipart_upload(opts) do |upload|
    until file.eof? do
      break if (abort_upload = upload.aborted?)
      upload.add_part(file.read(part_size))
      upload_progress += 1.0/parts_number
      # Yields the Float progress and the upload object for the
      # file that's currently being uploaded
      yield(upload_progress, upload) if block_given?
    end
  end
end
The compute_part_size method is defined here and I've modified it to this:
def compute_part_size options
  max_parts = 10000
  min_size = 5242880 #5 MB
  estimated_size = options[:estimated_content_length]
  [(estimated_size.to_f / max_parts).ceil, min_size].max.to_i
end
This code was tested on Ruby 2.0.0p0
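The chunking and progress arithmetic can be tried without any AWS calls. The sketch below uses a StringIO in place of the file and a block in place of upload.add_part; the names part_size_for and each_part_with_progress are hypothetical. It reports progress from the bytes actually read rather than by adding 1.0/parts_number each time, which avoids floating-point drift and lands exactly on 1.0 for the last part.

```ruby
require "stringio"

MIN_PART_SIZE = 5 * 1024 * 1024 # 5 MB, same minimum as compute_part_size above
MAX_PARTS     = 10_000

# Smallest part size that keeps the upload within MAX_PARTS parts.
def part_size_for(total_bytes, min_size: MIN_PART_SIZE, max_parts: MAX_PARTS)
  [(total_bytes.to_f / max_parts).ceil, min_size].max
end

# Reads io in part_size chunks and yields each chunk together with the
# fraction of total_bytes read so far (a Float between 0.0 and 1.0).
def each_part_with_progress(io, total_bytes, part_size)
  sent = 0
  while (chunk = io.read(part_size))
    sent += chunk.bytesize
    yield chunk, sent.to_f / total_bytes
  end
end

data = "a" * 10
each_part_with_progress(StringIO.new(data), data.bytesize, 4) do |chunk, progress|
  # A real uploader would send the chunk here before reporting progress
  puts "#{chunk.bytesize} bytes, #{(progress * 100).round}% done"
end
```

To get the value to the browser, the progress would still have to be stored somewhere the page can poll, which is outside the scope of this sketch.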

ruby reading files from S3 with open-URI

I'm having some problems reading a file from S3. I want to be able to load the ID3 tags remotely, but using open-URI doesn't work, it gives me the following error:
ruby-1.8.7-p302 > c=TagLib2::File.new(open(URI.parse("http://recordtemple.com.s3.amazonaws.com/music/745/original/The%20Stranger.mp3?1292096514")))
TypeError: can't convert Tempfile into String
from (irb):8:in `initialize'
from (irb):8:in `new'
from (irb):8
However, if I download the same file and put it on my desktop (i.e. no need for open-URI), it works just fine.
c=TagLib2::File.new("/Users/momofwombie/Desktop/blah.mp3")
Is there something else I should be doing to read a remote file?
UPDATE: I just found this link, which may explain a little bit, but surely there must be some way to do this...
Read header data from files on remote server
Might want to check out AWS::S3, a Ruby Library for Amazon's Simple Storage Service
Do an AWS::S3::S3Object.find for the file and then use about to retrieve the metadata.
This solution assumes you have the AWS credentials and permission to access the S3 bucket that contains the files in question.
TagLib2::File.new doesn't take a file handle, which is what you are passing to it when you use open without a read.
Add on read and you'll get the contents of the URL, but TagLib2::File doesn't know what to do with that either, so you are forced to read the contents of the URL, and save it.
I also noticed you are unnecessarily complicating your use of OpenURI. You don't have to parse the URL using URI before passing it to open. Just pass the URL string.
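require 'open-uri'
# Build a throwaway file name from the script name and process id
fname = File.basename($0) << '.' << $$.to_s
File.open(fname, 'wb') do |fo|
  # On Ruby 2.5 and later, use URI.open here; Kernel#open no longer opens URLs in Ruby 3
  fo.print open("http://recordtemple.com.s3.amazonaws.com/music/745/original/The%20Stranger.mp3?1292096514").read
end
c = TagLib2::File.new(fname)
# do more processing...
File.delete(fname)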
I don't have TagLib2 installed but I ran the rest of the code and the mp3 file downloaded to my disk and is playable. The File.delete would clean up afterwards, which should put you in the state you want to be in.
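The same save-then-parse pattern can be wrapped so the temporary file is always cleaned up, even if the parsing raises. This is a sketch under assumptions: with_local_copy is a hypothetical helper, and it takes any readable IO (such as the object open-uri returns) so it can be tried with a StringIO instead of a network request.

```ruby
require "tempfile"
require "stringio"

# Hypothetical helper: copy a readable IO to a temporary file, yield the
# file's path to the block, and delete the file when the block returns.
def with_local_copy(io, extension)
  Tempfile.create(["remote", extension]) do |tmp|
    tmp.binmode
    # Copy in 16 KB chunks so large files are never held in memory at once
    while (chunk = io.read(16 * 1024))
      tmp.write(chunk)
    end
    tmp.flush
    yield tmp.path
  end
end

size = with_local_copy(StringIO.new("fake mp3 bytes"), ".mp3") do |path|
  # A tag library that only accepts file names would be given `path` here
  File.size(path)
end
puts size
```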
This solution isn't going to work much longer. Paperclip > 3.0.0 has removed to_file. I'm using S3 & Heroku. What I ended up doing was copying the file to a temporary location and parsing it from there. Here is my code:
dest = Tempfile.new(upload.spreadsheet_file_name)
dest.binmode
upload.spreadsheet.copy_to_local_file(:default_style, dest.path)
file_loc = dest.path
...
CSV.foreach(file_loc, :headers => true, :skip_blanks => true) do |row|
This seems to work instead of open-URI:
Mp3Info.open(mp3.to_file.path) do |mp3info|
  puts mp3info.tag.artist
end
Paperclip has a to_file method that downloads the file from S3.

Resources