Accelerate S3 upload with paperclip - ruby-on-rails

I'm using paperclip to upload images to S3.
But I've noticed that the upload is very slow. I think it's because, before the submit completes, the file has to pass through my server, be processed, and then be sent to S3.
Is there a way to speed this up?
Thanks

You did not post any code, so I'm going to make a few assumptions here:
In your project you have an Album and an Image model
An Album has_many :images
You already have paperclip and aws-sdk set up correctly with buckets and all else
You are uploading many images at once
In order to upload many images, your form will look something like this:
<%= form_for @album, html: { multipart: true } do |f| %>
  <%= f.file_field :files, accept: 'image/png,image/jpeg,image/gif', multiple: true %>
  <%= f.submit %>
<% end %>
Your controller will look something like this
class AlbumsController < ApplicationController
  def update
    @album = Album.find params[:id]
    @album.update album_params
    redirect_to @album, notice: 'Images saved'
  end
  def album_params
    params.require(:album).permit files: []
  end
end
In order to manipulate images using an album you'll need
class Album < ApplicationRecord
  has_many :images, dependent: :destroy
  accepts_nested_attributes_for :images, allow_destroy: true
  def files=(array = [])
    array.each do |f|
      images.create file: f
    end
  end
end
Your Image file will look like this
class Image < ApplicationRecord
  belongs_to :album
  has_attached_file :file, styles: { thumbnail: '500x500#' }, default_url: '/default.jpg'
  validates_attachment_content_type :file, content_type: /\Aimage\/.*\Z/
end
This is just the important stuff. With this setup, an upload of 22 images with a total of 12MB takes the :files= method 41.1806895 seconds to execute on average on my local server. To check how long a method takes to run, use:
def files=(array = [])
  start = Time.now
  array.each do |f|
    images.create file: f
  end
  p "ELAPSED TIME: #{Time.now - start}"
end
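As a variation on the Time.now arithmetic above, Ruby's standard library includes a Benchmark module whose realtime method returns the elapsed seconds for a block. A minimal sketch, where timed is a hypothetical helper name and the sleep only stands in for the images.create loop:

```ruby
require 'benchmark'

# Hypothetical helper: runs the block, prints how long it took,
# and returns the block's result together with the elapsed seconds.
def timed(label)
  result = nil
  elapsed = Benchmark.realtime { result = yield }
  puts "ELAPSED TIME (#{label}): #{elapsed.round(4)}s"
  [result, elapsed]
end

# Stand-in for `array.each { |f| images.create file: f }`.
saved, seconds = timed('files=') do
  %w[a.png b.png].map { |name| sleep 0.01; name }
end
```

Inside files= you would wrap the array.each loop in the block instead of the stand-in.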
You asked for a faster way to upload many images. There are a few ways to do this. Using jobs won't work because you can't pass complex data like images to a job.
Use delayed_paperclip instead. It moves the creation of image styles (like thumbnail: '500x500#') into background jobs.
Gemfile
source 'https://rubygems.org'
ruby '2.3.0'
...
gem 'delayed_paperclip'
...
Image file
class Image < ApplicationRecord
  ...
  process_in_background :file
end
It speeds up the :files= method. The same upload as before (22 images, 12MB) with this setup took 23.13998 seconds on my machine. That's 1.77963 times faster than before.
Another way of speeding things up is by using Threads. Remove delayed_paperclip from the Gemfile and the process_in_background :file line. Update your :files= method:
def files=(array = [])
  threads = []
  array.each do |f|
    threads << Thread.new do
      images.create file: f
    end
  end
  threads.each(&:join)
end
You might try this, but you may get odd errors and find that only 4 images were saved. You also need a Mutex. In addition, do not call :join on the threads: if you join, the method waits until all the threads have finished running.
def files=(array = [])
  semaphore = Mutex.new
  array.each do |f|
    Thread.new do
      semaphore.synchronize do
        images.create file: f
      end
    end
  end
end
With this simple change to the method and no added gems, the method returns in 0.017628 seconds for the same upload as before, because the saving now continues in background threads after it returns. That is 1,313 times faster than with delayed_paperclip and 2,336 times faster than the regular setup.
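To make it clear what that timing measures, here is a small plain-Ruby sketch. All names are hypothetical, and slow_save only sleeps to mimic images.create. The method that spawns the threads returns almost at once, while the work itself still takes as long as the slowest save. Unlike the method above, this sketch locks only the shared array rather than the whole save, so the saves can overlap; that is a design choice of the sketch, not something the original setup was tested with.

```ruby
require 'benchmark'

# Hypothetical stand-in for `images.create file: f`: sleeps to mimic a slow upload.
def slow_save(item, saved, lock)
  sleep 0.2
  lock.synchronize { saved << item } # guard only the shared array
end

# Starts one thread per item and returns without waiting for them.
def save_in_background(items, saved, lock)
  items.map { |item| Thread.new { slow_save(item, saved, lock) } }
end

saved = []
lock = Mutex.new
threads = nil

return_time = Benchmark.realtime do
  threads = save_in_background(%w[a b c], saved, lock)
end
# Joining here is only for measurement; files= above deliberately does not join.
finish_time = return_time + Benchmark.realtime { threads.each(&:join) }

puts format('returned after %.4fs, all saves done after %.4fs', return_time, finish_time)
```

The first number corresponds to what the ELAPSED TIME line reports; the second is when the images are actually saved.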
What happens if you use delayed_paperclip AND Threads?
Don't change the :files= method. Just turn delayed_paperclip back on in your Gemfile and add back the process_in_background :file line.
With this setup on my machine, the method runs in 0.001277 seconds on average. That's
13.8 times faster than Threads
18,120.6 times faster than delayed_paperclip
32,248.0 times faster than regular setup
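The ratios above can be reproduced from the four reported timings. A small sketch (TIMINGS and speedup are hypothetical names; the numbers are the ones quoted in this answer and measure how long files= takes to return):

```ruby
# Seconds reported above for the same 22-image, 12MB upload.
TIMINGS = {
  regular: 41.1806895,
  delayed_paperclip: 23.13998,
  threads: 0.017628,
  threads_and_delayed_paperclip: 0.001277
}.freeze

# How many times faster `faster` is than `slower`, rounded to one decimal.
def speedup(slower, faster)
  (TIMINGS.fetch(slower) / TIMINGS.fetch(faster)).round(1)
end

%i[threads delayed_paperclip regular].each do |setup|
  puts "#{speedup(setup, :threads_and_delayed_paperclip)} times faster than #{setup}"
end
```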
Remember, this is on my machine and I have not tested this in production. I am also on wifi, not ethernet. All these things can change the results but I think the numbers speak for themselves.
Upload images faster. Done.
UPDATE: Don't use delayed_paperclip. It can cause a busy database, and some images might not get saved. I've tested it. I think just using threads is fast enough. Remove the process_in_background line from the Image file. Also, here's what my files= method looks like:
def files=(array = [])
  Thread.new do
    begin
      array.each { |f| images.create file: f }
    ensure
      ActiveRecord::Base.connection_pool.release_connection
    end
  end
end
Note: Since we push the image saving to a background task and then redirect, the page that loads will not show the images yet. The user has to refresh to update the page. One way around this is to use polling. Polling is when JavaScript checks for changes every 5 seconds or so and updates the page if there are any.
Another option is to use WebSockets. Now that we have Rails 5, we can use ActionCable. Every time an image is created, we broadcast an update for the album. If the user is on the page for that album, they will see updates as soon as they happen in the database, without refreshing and without the browser making a request every 5 seconds in an infinite loop.
Cool stuff.

Do you want to improve the appearance of the upload being faster or actually make the upload faster?
If it's the former you can put your image handling logic into a background task using something like delayed_job. This way when a user clicks the button they'll immediately go to their next page while you process the image (you can show a "processing in progress" image placeholder until the task is finished).
If it's the latter then it's entirely down to your server and internet connection. Where are you hosting?

How about uploading direct to S3?
Not sure if paperclip does this out of the box, but you could make it.
http://docs.amazonwebservices.com/AmazonS3/2006-03-01/dev/index.html?UsingHTTPPOST.html

Use delayed jobs, this is a good example here
Or you can use flash upload.

If you end up going the route of uploading directly to S3 which offloads the work from your Rails server, please check out my sample projects:
Sample project using Rails 3, Flash and MooTools-based FancyUploader to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-FancyUploader
Sample project using Rails 3, Flash/Silverlight/GoogleGears/BrowserPlus and jQuery-based Plupload to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-Plupload
By the way, you can do post-processing with Paperclip using something like this blog post describes:
http://www.railstoolkit.com/posts/fancyupload-amazon-s3-uploader-with-paperclip

As cwninja recommends, we upload direct to s3 so as to get rid of this extra upload. We use a modified version of the plugin described in this blog post:
http://elctech.wpengine.com/2009/02/updates-on-rails-s3-flash-upload-plugin/
Ours is modified to handle multiple file uploads (we rewrote the Flex object).
Not sure how well this plays with paperclip; we use attachment_fu, but it wasn't so bad to get it to work with that.

Related

ActiveStorage 5.2.1 - uploaded asset is nil since upload has not finished. How to wait for finished upload?

I use ActiveStorage for user generated stylesheets which will be uploaded to s3 in order to include them in a custom user styled web page.
So I have a model CustomeTheme
has_one_attached :style, dependent: :purge_later
and an after_save callback which does the upload after the custom style has been saved
self.style.attach(io: File.open(File.join(asset_path, name)), filename: name, content_type: 'text/css')
Included in a layout
= stylesheet_link_tag url_for(@custom_theme.style)
The problem now is that the user saves the style and sees a preview of the custom web page, but without the custom style (a 404 at this point), since the upload to S3 has not finished yet. At least that's what I suppose.
to_model delegated to attachment, but attachment is nil
/usr/local/bundle/gems/activesupport-5.2.1/lib/active_support/core_ext/module/delegation.rb:278:in `rescue in method_missing'
/usr/local/bundle/gems/activesupport-5.2.1/lib/active_support/core_ext/module/delegation.rb:274:in `method_missing'
/usr/local/bundle/gems/actionpack-5.2.1/lib/action_dispatch/routing/polymorphic_routes.rb:265:in `handle_model'
/usr/local/bundle/gems/actionpack-5.2.1/lib/action_dispatch/routing/polymorphic_routes.rb:280:in `handle_model_call'
/usr/local/bundle/gems/actionview-5.2.1/lib/action_view/routing_url_for.rb:117:in `url_for'
So it remains unclear to me: how can I know that the asset (no matter whether it is a style or an image) is ready to be displayed?
2 possible approaches:
Define a route for upload status checks and then run an interval in Javascript to check for upload status for a given upload id. When it finishes, the endpoint returns the asset URL, which then you can use. (e.g. If the asset is an image, then you just put that on an <img> tag src attribute).
Another approach would be something like what Delayed Paperclip does:
In the default setup, when you upload an image for the first time and try to display it before the job has been completed, Paperclip will be none the wiser and output the url of the image which is yet to be processed, which will result in a broken image link being displayed on the page.
To have the missing image url be outputted by paperclip while the image is being processed, all you need to do is add a #{attachment_name}_processing column to the specific model you want to enable this feature for.
class AddAvatarProcessingToUser < ActiveRecord::Migration
  def self.up
    add_column :users, :avatar_processing, :boolean
  end
  def self.down
    remove_column :users, :avatar_processing
  end
end
@user = User.new(avatar: File.new(...))
@user.save
@user.avatar.url #=> "/images/original/missing.png"
# Process job
@user.reload
@user.avatar.url #=> "/system/images/3/original/IMG_2772.JPG?1267562148"
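The first approach (checking on an interval until the asset is ready) boils down to a loop like the one below. The answer suggests running it in JavaScript against a status route; it is sketched in plain Ruby here only to show the logic. poll_until is a hypothetical helper, and the responses array stands in for a status endpoint that returns nothing until the upload has finished.

```ruby
# Hypothetical helper: calls the block every `interval` seconds until it
# returns something truthy or `timeout` seconds have passed.
# Returns the block's value, or nil on timeout.
def poll_until(timeout: 10, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Stand-in for the status endpoint: not ready twice, then the asset URL.
responses = [nil, nil, '/assets/custom.css']
url = poll_until(timeout: 2, interval: 0.01) { responses.shift }
puts url
```

On the server side, the status check itself could be based on Active Storage's attached? method, which reports whether the record has an attachment yet.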

Using FeedJira to create RSS aggregator/reader

I am trying to create my own rss reader app in ruby on rails. I want to be able to store various news stories in my database that I can pull from later to display each story with its headline, image, summary, etc. in a nice layout. I am working with the feedjira library and am also pretty new to RoR. I know that these two commands in the rails console fetch rss feeds and somehow parse them:
urls = %w[http://feedjira.com/blog/feed.xml https://github.com/feedjira/feedjira/feed.xml]
feeds = Feedjira::Feed.fetch_and_parse urls
While these two commands work on rss feeds, I was wondering how I could configure my database/model and then save the news entries I get from Feedjira into the db. I tried watching the railscast on this issue but it seemed a bit out of date. Any help on this issue would be immensely appreciated! Thanks in advance!
Here's one way:
Create a model such as this:
class Entry < ActiveRecord::Base
  # Must include every attribute that create! mass-assigns below
  attr_accessible :entry_id, :feed_id, :url, :title, :summary, :description, :published_at
  def self.update_from_feed(feed_name)
    feed = Feed.find_by_name(feed_name)
    feed_data = Feedjira::Feed.fetch_and_parse(feed.feed_url)
    add_entries(feed_data.entries, feed)
  end
  def self.add_entries(entries, feed)
    entries.each do |entry|
      # Stop at the first entry that is already stored
      break if exists? :entry_id => entry.id
      create!(
        :entry_id => entry.id,
        :feed_id => feed.id,
        :url => entry.url,
        :title => entry.title.sanitize,
        :summary => entry.summary.sanitize,
        :description => entry.content.sanitize,
        :published_at => entry.published
      )
    end
  end
  # A bare `private` has no effect on `def self.` methods
  private_class_method :add_entries
end
You can then call this from the cli / cron or whatever with, for example:
rails runner -e development 'Entry.update_from_feed("feedname")'
This runs the update_from_feed method in the context of your Rails app using a separate rails instance (a bit like rails console), but doesn't impact the running Rails instance.
In this example, there's a separate Feed model which has name and feed_url attributes, so the URL is looked up from the provided name.
This code doesn't use the ability of Feedjira to check for updates, so dupe checking is baked in.
(This GitHub issue says to avoid using the #update method.)
Note that the use of break assumes that new entries are always added to the top of the feed. If you don't trust the feed, then replace break with next, so that existing entries are skipped instead of ending the loop. The url can be used as an alternative unique id.
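The difference between stopping at the first known entry and skipping known entries can be seen without Rails or Feedjira. In this sketch FeedItem, unseen_items and newest_first are hypothetical names, and a Set of ids stands in for the exists? database check:

```ruby
require 'set'

# Hypothetical stand-in for a parsed feed entry.
FeedItem = Struct.new(:id, :title)

# Returns the items whose ids are not in known_ids.
# newest_first: true  -> stop at the first known id (the `break` behaviour)
# newest_first: false -> skip known ids and keep scanning (the `next` behaviour)
def unseen_items(items, known_ids, newest_first: true)
  fresh = []
  items.each do |item|
    if known_ids.include?(item.id)
      break if newest_first
      next
    end
    fresh << item
  end
  fresh
end

items = [5, 4, 3, 2].map { |id| FeedItem.new(id, "Story #{id}") }
known = Set[4, 2]

p unseen_items(items, known).map(&:id)
p unseen_items(items, known, newest_first: false).map(&:id)
```

With an out-of-order feed like this one, stopping early misses item 3, which is why skipping is the safer choice when the feed's ordering cannot be trusted.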
Edit:
Here's a version of the update_from_feed method that takes advantage of Feedjira's ability to process multiple feeds:
def self.update_all
  feed_urls = Feed.pluck :feed_url
  feeds = Feedjira::Feed.fetch_and_parse(feed_urls)
  feed_urls.each do |feed_url|
    feed = Feed.find_by_feed_url(feed_url)
    add_entries(feeds[feed_url].entries, feed)
  end
end
pluck returns all the rows of the specified column(s) (:feed_url in this case) in an array. Equally you could change it to accept an array of names, from which it looks up an array of URLs to pass to feedjira.
Finally, if you wanted a self-looping method, you could include:
def self.update_all_periodically(frequency = 15.minutes)
  loop do
    update_all
    sleep frequency.to_i
  end
end
Then this:
rails runner -e development 'Entry.update_all_periodically'
won't return until you break the process, and will update all feeds at the default frequency, or that specified as an optional argument.
If you wanted to run the updates asynchronously in your main Rails process, then a background worker such as Sidekiq, Resque or DelayedJob will do the... job. :)
Scheduling the fetching and parsing of all these feeds can be incredibly hard and time consuming, which means you should absolutely not do it from inside the Rails app itself. At best, you should do it using an 'offline' script.
You could also simply rely on existing APIs like Superfeedr and its rack middleware.

Rails 4 Paperclip Images not uploading/showing (only missing.png)

First of all, these are my environment versions:
Rails: 4.1.0
Ruby: 2.1.1p76
Paperclip: 4.1
I created a scaffold (rails g scaffold Entry description:text) and further added Paperclip to the then existing model (rails g paperclip entry image).
Afterwards I migrated and everything worked fine so far.
Now when I upload an image it just doesn't display; instead "/images/original/missing.png" gets shown and there's no record of the image I just uploaded at all.
This is my model (models/entry.rb):
class Entry < ActiveRecord::Base
has_attached_file :image,
:path => ":rails_root/public/images/:class/:attachement/:id/:basename.:extension",
:url => "/images/:class/:attachement/:id/:basename.:extension"
end
My view (show.html.slim):
p#notice = notice
p
  strong Description:
  = @entry.description
= image_tag @entry.image.url
= link_to 'Edit', edit_entry_path(@entry)
'|
= link_to 'Back', entries_path
I have ImageMagick installed and even set the Paperclip.options within my development.rb.
I have no idea what I am missing here, it just doesn't seem to upload any images whatsoever, nor throw out any error messages.
After a little more research and a few cold drinks I found the solution!
It's necessary to either explicitly allow certain formats to be uploaded OR remove the validation check (I recommend the latter for development).
Doing this is as simple as adding the following line to your model (for me, entry.rb)
(SOURCE: https://stackoverflow.com/a/21898204/3686898)
do_not_validate_attachment_file_type :image
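If you would rather allow formats explicitly than switch the check off, Paperclip's validates_attachment_content_type takes a content_type option that can be a regular expression. A sketch under that assumption; IMAGE_CONTENT_TYPE and allowed_image_type? are hypothetical names, and the helper exists only so the pattern can be tried outside Rails:

```ruby
# Hypothetical constant: accepts any MIME type under image/.
IMAGE_CONTENT_TYPE = %r{\Aimage/.*\z}

# In the model this would be handed to Paperclip, for example:
#   validates_attachment_content_type :image, content_type: IMAGE_CONTENT_TYPE

# Hypothetical helper to check a content type against the pattern.
def allowed_image_type?(content_type)
  IMAGE_CONTENT_TYPE.match?(content_type.to_s)
end

p allowed_image_type?('image/png')
p allowed_image_type?('application/pdf')
```

The \A and \z anchors matter: they make the pattern apply to the whole string rather than to any single line of it.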
Also, I added another check in my controller (same as attr_accessible in earlier Rails versions):
private
  # Use callbacks to share common setup or constraints between actions.
  def set_entry
    @entry = Entry.find(params[:id])
  end
  # Never trust parameters from the scary internet, only allow the white list through.
  def entry_params
    params.require(:entry).permit(:description, :image)
  end
Alrighty, I hope this helps someone out :)
(Always remember to have a look at your server logs. They provide golden information!)
Check your
:path => ":rails_root/public/images/:class/:attachement/:id/:basename.:extension"
and make sure it resolves to where the image is actually stored. Note that Paperclip's interpolation key is spelled :attachment, so :attachement will not be substituted.

File uploading with Dragonfly, impossible to access images after upload (Rails 3)

I'm trying to make a multiple drag and drop upload file system with Rails 3 and Dragonfly (or anything that would work actually)
I'm at the point where my file comes in my controller through the params hash and I can retrieve it as an ActionDispatch::Http::UploadedFile so I thought all I would have to do then is push it in my model's attribute image but it doesn't work
This my Picture model :
class Picture < ActiveRecord::Base
  image_accessor :image
  attr_accessible :image_name, :image_uid, :title
end
I thought this would work in my controller :
def createImage
  @new_picture = Picture.new
  @new_picture.image = params[:pic]
  if @new_picture.save
    render :json => { :picture => @new_picture }
  end
end
Ok, so this registers the record with image_name nil oddly, but with the image_uid set
However, when I try to access my image with <%= image_tag @picture.image.url %> I get a not found error
For example :
Request URL:http://localhost:3000/media/BAhbBlsHOgZmSSIhMjAxMi8wOS8yMi8xOV8zMF8yOF83MzBfZmlsZQY6BkVU
Request Method:GET
Status Code:404 Not Found
I'm using ruby 1.9.3 and rails 3.2.8
Any ideas ? :D
Dragonfly needs to load its Rack adapter before it can handle encoded URL requests (like the one above). The best way to load it is to add this before config.encoding = "utf-8" in your /config/application.rb:
config.middleware.insert 0, 'Rack::Cache', {:verbose => true,:metastore => URI.encode("file:#{Rails.root}/tmp/dragonfly/cache/meta"),:entitystore => URI.encode("file:#{Rails.root}/tmp/dragonfly/cache/body")} unless Rails.env.production?
config.middleware.insert_after 'Rack::Cache', 'Dragonfly::Middleware', :images
Please note that you will need the Rack-Cache gem as well.
Hope this helps!

How can I reference images in the asset pipeline from a model?

I have a model with a method to return a url to a person's avatar that looks like this:
def avatar_url
  if self.avatar?
    self.avatar.url # This uses paperclip
  else
    "/images/avatars/none.png"
  end
end
I'm in the midst of upgrading to 3.1, so now the hard-coded none image needs be referenced through the asset pipeline. In a controller or view, I would just wrap it in image_path(), but I don't have that option in the model. How can I generate the correct url to the image?
I struggled with getting this right for a while so I thought I'd post the answer here. Whilst the above works for a standard default image (i.e. same one for each paperclip style), if you need multiple default styles you need a different approach.
If you want to have the default url play nice with the asset pipeline and asset sync and want different default images per style then you need to generate the asset path without fingerprints otherwise you'll get lots of AssetNotPrecompiled errors.
Like so:
:default_url => ActionController::Base.helpers.asset_path("/missing/:style.png", :digest => false)
or in your paperclip options:
:default_url => lambda { |a| "#{a.instance.create_default_url}" }
and then an instance method in the model that has the paperclip attachment:
def create_default_url
  ActionController::Base.helpers.asset_path("/missing/:style.png", :digest => false)
end
In this case you can still use the interpolation (:style) but will have to turn off the asset fingerprinting/digest.
This all seems to work fine as long as you are syncing assets without the digest as well as those with the digest.
Personally, I don't think you should really be putting this default in a model, since it's a view detail. In your (haml) view:
= image_tag(@image.avatar_url || 'none.png')
Or, create your own helper and use it like so:
= avatar_or_default(@image)
When things like this are hard in rails, it's often a sign that it's not exactly right.
We solved this problem using draper: https://github.com/jcasimir/draper. Draper let us add a wrapper around our models (for use in views) that have access to helpers.
Paperclip has an option to specify default url
has_attached_file :avatar, :default_url => '/images/.../missing_:style.png'
You can use this to serve a default image in case the user has not uploaded an avatar.
Using rails active storage I solved this problem by doing this:
# Post.rb
class Post < ApplicationRecord
  has_one_attached :image
  def thumbnail
    image.attached? ? image.variant(resize: "150x150").processed.service_url : 'placeholder.png'
  end
  def medium
    image.attached? ? image.variant(resize: "300x300").processed.service_url : 'placeholder.png'
  end
  def large
    image.attached? ? image.variant(resize: "600x600").processed.service_url : 'placeholder.png'
  end
end
Then in your views simply call:
<%= image_tag @post.thumbnail %>

Resources