I have been looking into rails file upload tools and the ones that seemed the most appealing and interesting to me were carrierwave and dragonfly.
From looking around it seems like carrierwave takes the more traditional style where you can process the file on save whereas dragonfly is middleware so it allows you to process on the fly.
I was wondering if anyone has references to performance tests, or any other tests that compare the two.
Also, just curious on what people's opinions are about both and which they prefer and of course why they prefer it.
It depends on the setup. As Senthil writes, as long as you have a cache-proxy in front, Dragonfly is fine.
But if you are using the built-in rails caching, Carrierwave will perform better, as the files can be loaded without any processing. If you don't do any processing, it doesn't matter.
Here's how I summarized when considering both for Images on a project with Mongomapper:
Carrierwave:
Pros
Generates thumbs on upload (saves CPU time)
Can use files directly from a static/cached document
Doesn't need any cache-front
Supports various storage backends (S3, Cloudfiles, GridFS, normal files) easy to extend to new storage types if needed.
Cons
Generates thumbs on upload (difficult to generate new thumbsizes)
Doesn't natively support mongomapper
Uses storagespace for every file/thumb generated. If you use normal file storage, you might run out of inodes!
Dragonfly:
Pros
Should work with mongomapper, as it only extends ActiveModel
Generates thumbs on the fly (easier to create new layouts/thumbsizes)
Only one file stored! Saves space :)
Cons
Eats CPU on every request if you don't have a cache-proxy, rack::cache or similar.
No way to access thumbs as files if needed.
I ended up using both in the end.
A future wish is for carrierwave to support MongoMapper again. After using both in various situations, I've found that the features in MongoMapper (rails3 branch) always work and are easy to extend using plugins. I cannot say the same for Mongoid as of now, but that might change.
I use dragonfly simply because carrierwave dropped support for mongomapper, and paperclip doesn't work with mongomapper without some hacks.
Dragonfly does processing on the fly, i.e. it is meant to be used behind a cache-proxy such as Varnish, Squid or Rack::Cache, so that while the first request may take some time, subsequent requests should be super-quick!
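To make the "first request is slow, later ones are quick" point concrete, here is a minimal plain-Ruby sketch. The ThumbServer class and its method names are hypothetical and are not Dragonfly's API; the Hash simply plays the role that Varnish, Squid or Rack::Cache would play in front of the app.

```ruby
# Illustrative stand-in for on-the-fly processing behind a cache.
# ThumbServer and its methods are hypothetical, not Dragonfly's API.
class ThumbServer
  attr_reader :processed_count

  def initialize
    @cache = {}            # plays the role of Varnish / Rack::Cache
    @processed_count = 0   # how many times the expensive step actually ran
  end

  # Returns the thumbnail for (uid, geometry), processing only on a cache miss.
  def fetch(uid, geometry)
    @cache.fetch([uid, geometry]) do |key|
      @cache[key] = process(uid, geometry)
    end
  end

  private

  # Placeholder for the real image work (resize, crop, encode).
  def process(uid, geometry)
    @processed_count += 1
    "#{uid}@#{geometry}"
  end
end

server = ThumbServer.new
server.fetch('avatar.png', '150x150#')  # first request: processed
server.fetch('avatar.png', '150x150#')  # repeat request: served from the cache
server.fetch('avatar.png', '300x300')   # new size: processed once more
puts server.processed_count             # prints 2
```

Without the cache, every one of those requests would hit the processing step, which is the "eats CPU" case mentioned above.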
Paperclip
Paperclip is intended as an easy file attachment library for Active Record. The intent behind it was to keep setup as easy as possible and to treat files as much like other attributes as possible. This means they aren't saved to their final locations on disk, nor are they deleted if set to nil, until ActiveRecord::Base#save is called. It manages validations based on size and presence, if required. It can transform its assigned image into thumbnails if needed, and the prerequisites are as simple as installing ImageMagick (which, for most modern Unix-based systems, is as easy as installing the right packages). Attached files are saved to the filesystem and referenced in the browser by an easily understandable specification, which has sensible and useful defaults.
Advantages
validations, Paperclip introduces several validators to validate your attachment:
AttachmentContentTypeValidator
AttachmentPresenceValidator
AttachmentSizeValidator
Deleting an Attachment
Set the attribute to nil and save.
@user.avatar = nil
@user.save
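The "nothing happens until save" behaviour described above can be sketched in plain Ruby. This is only an illustration of the idea, assuming a hypothetical Profile class; it is not how Paperclip is implemented and none of these names come from its API.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a model with one attachment; not Paperclip's API.
class Profile
  def initialize(storage_dir)
    @storage_dir = storage_dir
    @stored_path = nil      # file currently on disk, if any
    @pending = :unchanged   # :unchanged, nil (delete), or [name, data]
  end

  # Assignment only records intent; nothing touches the disk yet.
  def avatar=(value)
    @pending = value
  end

  def avatar_path
    @stored_path
  end

  # Disk changes happen here, mirroring the "on save" behaviour.
  def save
    return true if @pending == :unchanged

    File.delete(@stored_path) if @stored_path && File.exist?(@stored_path)
    @stored_path = nil
    if @pending
      name, data = @pending
      @stored_path = File.join(@storage_dir, name)
      File.binwrite(@stored_path, data)
    end
    @pending = :unchanged
    true
  end
end

Dir.mktmpdir do |dir|
  profile = Profile.new(dir)
  profile.avatar = ['me.png', 'fake image bytes']
  profile.save
  profile.avatar = nil
  puts File.exist?(File.join(dir, 'me.png'))  # prints true: not deleted yet
  profile.save
  puts File.exist?(File.join(dir, 'me.png'))  # prints false: deleted on save
end
```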
Paperclip is better for an organic Rails environment using activerecord and not all the other alternatives. Paperclip is much easier to handle for beginning rails developers and it also has advanced capabilities for the advanced developer.
A huge fan of Paperclip because it doesn't require RMagick, it's very easy to set it up to post through to Amazon S3 and declaring everything in the models (validations, etc) keeps things clean.
With respect to multiple file uploads and progress feedback, both are possible with both Paperclip and Attachment_fu, but both typically require some elbow grease with iframes and Apache to get working.
CarrierWave
This gem provides a simple and extremely flexible way to upload files from Ruby applications. It works well with Rack based web applications, such as Ruby on Rails.
Advantages
Simple Model entity integration. Adding a single string image attribute for referencing the uploaded image.
"Magic" model methods for uploading and remotely fetching images.
HTML file upload integration using a standard file tag and another hidden tag for maintaining the already uploaded "cached" version.
Straight-forward interface for creating derived image versions with different dimensions and formats. Image processing tools are nicely hidden behind the scenes.
Model methods for getting the public URLs of the images and their resized versions for HTML embedding.
If you are using the built-in rails caching, Carrierwave will perform better, as the files can be loaded without any processing. If you don't do any processing, it doesn't matter.
Generates thumbs on upload (saves CPU time)
Can use files directly from a static/cached document
Doesn't need any cache-front
Supports various storage backends (S3, Cloudfiles, GridFS, normal files) easy to extend to new storage types if needed.
One advantage is that it doesn't clutter your models with configuration. You can define uploader classes instead. This lets you easily reuse, extend, etc. your upload configuration.
What we liked most is the fact that CarrierWave is very modular. You can easily switch your storage engine between a local file system, cloud-based AWS S3, and more. You can switch the image processing module between RMagick, MiniMagick and other tools. You can also use the local file system in your dev env and switch to S3 storage in the production system.
Carrierwave has good support for external things such as DataMapper, Mongoid and Sequel, and can even be used with a 3rd party image management service such as cloudinary. The solution seems the most complete, with support for just about anything, but it is also much messier (for me at least) since there is a lot more code that you need to handle.
You have to appreciate the modular approach that CarrierWave takes. It’s agnostic as to which of the popular S3 clients you use, supporting both aws/s3 and right_aws. It’s also ORM agnostic and not tightly coupled to Active Record. The tight coupling of Paperclip has caused us some grief at work.
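The "local files in dev, S3 in production" idea boils down to coding against one small storage interface. Below is a rough plain-Ruby sketch of that pattern. All class and method names here (FileStorage, FakeBucketStorage, Uploader, storage_for) are hypothetical and are not CarrierWave classes, and the "bucket" is an in-memory stand-in that makes no network calls.

```ruby
require 'tmpdir'

# Hypothetical storage adapters sharing one interface (store / retrieve).
# These are illustrative and are not CarrierWave classes.
class FileStorage
  def initialize(root)
    @root = root
  end

  def store(name, data)
    File.binwrite(File.join(@root, name), data)
  end

  def retrieve(name)
    File.binread(File.join(@root, name))
  end
end

# In-memory stand-in for a remote bucket such as S3; no network calls are made.
class FakeBucketStorage
  def initialize
    @objects = {}
  end

  def store(name, data)
    @objects[name] = data.dup
  end

  def retrieve(name)
    @objects.fetch(name)
  end
end

# The uploader depends only on the interface, so the backend can be swapped.
class Uploader
  def initialize(storage)
    @storage = storage
  end

  def upload(name, data)
    @storage.store(name, data)
    name
  end

  def download(name)
    @storage.retrieve(name)
  end
end

# Picks a backend per environment (the environment names are an assumption).
def storage_for(env, root)
  env == 'production' ? FakeBucketStorage.new : FileStorage.new(root)
end

Dir.mktmpdir do |dir|
  %w[development production].each do |env|
    uploader = Uploader.new(storage_for(env, dir))
    uploader.upload('logo.png', 'fake image bytes')
    puts "#{env}: #{uploader.download('logo.png')}"
  end
end
```

The calling code is identical in both environments; only the object passed to Uploader changes.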
Disadvantages
You can't validate file size. There is a wiki article that explains how to do it, but it does not work.
Integrity validations do not work when using MiniMagick (very convenient if you are concerned about RAM usage). You can upload a corrupted image file and CarrierWave will throw an error at first, but the next time will swallow it.
You can't delete the original file. You can instead resize it, compress, etc. There is a wiki article explaining how to do it, but again it does not work.
It depends on external libraries such as RMagick or MiniMagick. Paperclip works directly with the convert command line (ImageMagick). So, if you have problems with MiniMagick (I had), you will lose hours diving into Google searches. Both RMagick and MiniMagick are abandoned at the time of this writing (I contacted the author of MiniMagick, no response).
It needs some configuration files. This is seen as an advantage, but I don't like having single configuration files around my project just for one gem. Configuration in the model seems more natural to me. This is a matter of personal taste anyway.
If you find some bug and report it, the dev team is really absent and busy. They will tell you to fix bugs yourself. It seems like a personal project that is improved in spare time. For me it's not valid for a professional project with deadlines.
Doesn't natively support mongomapper
Uses storagespace for every file/thumb generated. If you use normal file storage, you might run out of inodes!
Dragonfly
The impressive thing about Dragonfly, the thing that separates it from most other image processing plugins, is that it allows on-the-fly resizing from the view.
Not needing to configure thumbnail sizing or other actions in a separate file is a huge time and frustration saver. It makes Rails view code like image_tag @product.image.thumb('150x150#').url possible.
The magic is all made possible by caching. Instead of building the processed version on upload and then linking to individual versions of the image, the plugin generates images as they are requested. While this is a problem for the first load, the newly created image is http cached for all subsequent loads, by default using Rack::Cache, though other more robust solutions are available should scaling become an issue.
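For the HTTP caching part, what matters is that the generated image is served with headers that let a cache in front keep it. Here is a small sketch of a Rack-compatible app (any object whose call(env) returns status, headers and body) that does this. The ThumbApp class, the /thumbs/<geometry>/<name> path layout and the one-year max-age are assumptions for illustration, not what Dragonfly does internally.

```ruby
# A Rack-compatible app is any object whose #call(env) returns
# [status, headers, body]. ThumbApp and its URL layout are hypothetical.
class ThumbApp
  ONE_YEAR = 365 * 24 * 60 * 60

  def call(env)
    match = %r{\A/thumbs/([^/]+)/([^/]+)\z}.match(env['PATH_INFO'].to_s)
    return [404, { 'content-type' => 'text/plain' }, ['not found']] unless match

    geometry, name = match.captures
    headers = {
      'content-type'  => 'application/octet-stream',
      # "public" allows shared caches (proxy, Rack::Cache) to store the response.
      'cache-control' => "public, max-age=#{ONE_YEAR}"
    }
    [200, headers, [render(name, geometry)]]
  end

  private

  # Placeholder for the expensive image processing step.
  def render(name, geometry)
    "bytes of #{name} at #{geometry}"
  end
end

status, headers, = ThumbApp.new.call('PATH_INFO' => '/thumbs/150x150/cat.png')
puts status                     # prints 200
puts headers['cache-control']   # prints public, max-age=31536000
```

With a cache in front, only the first request for a given size should reach render; later requests are answered from the cached copy until it expires.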
Advantages
Will I be changing image size often? Example: if you want to let your users change the size of their pictures (or you need flexibility in size for some other reason), or you want really fast development.
Yes: Dragonfly
No: either Carrierwave or Paperclip
Can be used with mongomapper with no trouble
Performance should be fine as long as you use a caching proxy
Should work with mongomapper (it only extends ActiveModel)
Generates thumbs on the fly (easier to create new layouts/thumbsizes)
Only one file stored! Saves space
Processing done on the fly (is meant to be used behind a cache-proxy such as Varnish, Squid or Rack::Cache, so that while the first request may take some time, subsequent requests should be super-quick)
Disadvantages
Eats CPU on every request if you don't have a cache-proxy, rack::cache or similar.
No way to access thumbs as files if needed.
References
Ruby on Rails Image Uploads with CarrierWave and Cloudinary
Rails 3 paperclip vs carrierwave vs dragonfly vs attachment_fu
Other people wrote pretty good summaries, I just would like to say that from our experience Dragonfly setup needed more maintenance, and because of negligence of some developer(s) along the way we were also stuck with plenty of orphan images which lingered after the original was removed. This wouldn't have happened with a vanilla carrierwave.
P.S. We migrated to cloudinary (and use carrierwave with it) and are happy with it.
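For anyone stuck with the same orphan problem, the clean-up check itself is just a set difference between what is in the datastore and what the records still reference. A minimal sketch follows; the orphaned_uids helper and the sample identifiers are made up, and in a real app the two lists would come from a datastore listing and a database query, which are not shown here.

```ruby
require 'set'

# Returns stored identifiers that no record references any more.
# Both arguments are plain collections in this sketch.
def orphaned_uids(stored_uids, referenced_uids)
  (Set.new(stored_uids) - Set.new(referenced_uids)).to_a.sort
end

stored     = ['2011/a.png', '2011/b.png', '2012/c.png']
referenced = ['2011/a.png', '2012/c.png']
p orphaned_uids(stored, referenced)  # prints ["2011/b.png"]
```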
Related
So I just started learning rails, and am making my own CRUD app on my own just to get familiar with it.
Ideally the app will have some sort of "Post" that will have a form and an image that will be submitted along with it acting as the "Main Profile" image.
This main profile image will be displayed in a "grid" view on the index page. I've heard of paperclip for Rails... but is that still the best option for holding an image in the database? Or is it better to host the image somewhere else? I'm somewhat new to a lot of web-dev concepts.
Thanks!
(Database will be postgres on prod at least and sqlite during dev)
Storing images in a DB isn't a good idea - it is generally slow and will affect the DB performance.
I would recommend the Shrine file attachment framework, which is the most lightweight and, at the same time, most sophisticated of its kind.
You could read a very interesting article by Shrine's author with the comparison of different file attachment frameworks.
I'm building a web app that needs to store some resources, including but not limited to articles, pictures and videos. My question here is how videos (mp4/ogg) are stored on a web server: just as bare files, or as binaries in a relational or NoSQL db?
The question of BLOB data almost always comes down to "don't BLOB data". There are very few times when it makes more sense to write a database connector for your data than to just keep it on disk.
The general trend is to use an established service that employs good design patterns, such as Paperclip for ruby, and tailor it to your needs.
Using an external storage service is also a good idea, for example Amazon S3 will store all of your data for pennies on the dollar per gigabyte, and they'll do an excellent job of it.
If you do decide to cook up your own server that handles data internally, might I recommend digital ocean? I have been very happy with the SSD servers I have set up there (which are super fast).
For video you will almost certainly need a webserver that is capable of streaming the file. I think Nginx has this feature.
I think you need to elaborate a bit on the use case you wish to implement for this app. Only then can you get a precise answer.
And to help out with that, here are some questions you need to ponder:
1- You said you wanted to store videos, what are your requirements beyond storage?
2- do you wish for example to offer access to third party users to these videos and search with keywords?
3- If yes, what kind of information is available about the videos? what is the expected average size of these files?
Many database engines offer the possibility of storing big binary files, but that comes with an impact on performance. That's why most of the storage systems that deal with big files, store the files themselves on the disk and any related metadata (file name, last updated, associated keywords, etc.) are stored in the database. That makes for a scalable system.
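As a rough sketch of "bytes on disk, metadata in the database": the file goes under a path derived from its content, and a separate record keeps the file name, size and keywords. Everything here is an assumption made for illustration. The METADATA hash stands in for a database table, and store_file, read_file and the directory layout are invented names.

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'

# Stand-in for a database table: one metadata row per stored file.
METADATA = {}

# Writes the bytes under a content-derived path and records metadata separately.
# The directory layout and field names are illustrative assumptions.
def store_file(root, filename, data, keywords: [])
  digest = Digest::SHA256.hexdigest(data)
  path = File.join(root, digest[0, 2], digest)
  FileUtils.mkdir_p(File.dirname(path))
  File.binwrite(path, data)
  METADATA[digest] = {
    filename: filename,
    size: data.bytesize,
    keywords: keywords,
    path: path,
    updated_at: Time.now
  }
  digest
end

# Looks up the path in the metadata, then reads the bytes from disk.
def read_file(id)
  File.binread(METADATA.fetch(id)[:path])
end

Dir.mktmpdir do |dir|
  id = store_file(dir, 'clip.mp4', 'fake video bytes', keywords: ['demo'])
  puts METADATA[id][:filename]  # prints clip.mp4
  puts read_file(id)            # prints fake video bytes
end
```

Searching by keyword or file name then only touches the small metadata records, never the large files themselves.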
I'll edit this answer, if you find it useful and have further related-questions.
Unlimited file storage is difficult to set up without AWS S3. S3 is a cheap and scalable solution, but expensive to use without proper caching, so we have an Nginx S3 proxy that works well: https://stackoverflow.com/a/44749584/290338
I'm new to Rails, I wanted to make a website for uploading files: music, videos, pictures, text. What is a better way to store files? I've read about different methods: Database, as a file, Amazon S3?
There will be a lot of files around 1 kb to 20Mb each.
Thanks!
Storing files in a database is not bad, per se. It depends on the kind of database.
Storing files in a relational database is not viewed as good practice, for the reasons explained by bassneck.
But there are other kinds of databases that are specifically designed to store any kind of data in a non-relational way, for example files of any kind. Dhruva's answer highlights that: MongoDB is pretty good, and its support for storing files using GridFS is awesome.
GridFS is very good, for example it can stream only parts of a file, pretty useful for video.
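The reason partial reads are cheap is that the file is stored as a sequence of fixed-size chunks, so a byte range only needs the chunks that overlap it. The sketch below shows that idea in plain Ruby. ChunkedFile and its methods are hypothetical and this is not the GridFS driver API; the tiny chunk size is chosen only to keep the example readable.

```ruby
# Illustrative chunked store; class and method names are hypothetical
# and this is not the GridFS driver API.
class ChunkedFile
  attr_reader :chunks_read

  def initialize(data, chunk_size)
    @chunk_size = chunk_size
    @length = data.bytesize
    # Split the bytes into consecutive chunks of at most chunk_size bytes.
    @chunks = (0...@length).step(chunk_size).map { |i| data.byteslice(i, chunk_size) }
    @chunks_read = 0
  end

  # Reads `length` bytes starting at `offset`, touching only overlapping chunks.
  def read_range(offset, length)
    return '' if length <= 0 || offset >= @length

    last_byte = [offset + length, @length].min - 1
    first_chunk = offset / @chunk_size
    last_chunk = last_byte / @chunk_size
    needed = (first_chunk..last_chunk).map { |n| fetch_chunk(n) }
    needed.join.byteslice(offset - first_chunk * @chunk_size, last_byte - offset + 1)
  end

  private

  # Stand-in for loading one chunk document from the database.
  def fetch_chunk(index)
    @chunks_read += 1
    @chunks[index]
  end
end

file = ChunkedFile.new('abcdefghij', 4)  # chunks: "abcd", "efgh", "ij"
puts file.read_range(5, 3)               # prints fgh
puts file.chunks_read                    # prints 1
```

Serving the middle of a video then means loading a few chunks rather than the whole file.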
In your specific case – many small files of many kinds of data – GridFS is a real option. I use Heroku & mongohq.com and they work like a charm.
I don't think there is a right answer for that. I use heroku, so my only option is Amazon S3 (which has free quotas for the first year) and I'm pretty satisfied with it. I use carrierwave gem for uploading files and it's really easy to use. I really prefer it over Paperclip.
If your hosting provides a lot of space and bandwidth then you could give it a try.
But for the database, I really don't like the idea of storing files in it.
Updated
Reading a filename from the table should be faster than reading the file itself. The bigger the DB, the longer it will take to make a backup. And if you take heroku for example, you only get 20gb for a shared db. That's not much if you're gonna store 10-20mb files in there. Moving project to another hoster might be easier if you store files in the DB. But if you use an external service (such as S3), there would be no difference for you.
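To put a number on "that's not much", here is the arithmetic as a quick Ruby sketch. It assumes binary units (1 GB = 1024 MB) and ignores row overhead, indexes and all other data in the database, so the real figures would be lower. The files_that_fit helper is just a name made up for this example.

```ruby
MB = 1024 * 1024
GB = 1024 * MB

# How many files of a given size fit into a storage quota (integer division).
def files_that_fit(quota_bytes, file_bytes)
  quota_bytes / file_bytes
end

quota = 20 * GB
[10 * MB, 20 * MB].each do |size|
  puts "#{size / MB} MB files: #{files_that_fit(quota, size)}"
end
# prints:
# 10 MB files: 2048
# 20 MB files: 1024
```

So a 20 GB database fills up after roughly one to two thousand uploads of that size.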
Amazon S3 integration on Heroku is relatively easy to setup and get working with the paperclip gem. Heroku has some documentation on how to get this up and running.
Take a look at their documentation and see if this is the kind of thing you're looking for.
http://devcenter.heroku.com/articles/s3
I will suggest you to also check out & consider MongoDB GridFS - http://www.mongodb.org/display/DOCS/GridFS+Specification. My experience with GridFS has been good. I cannot give you a comparative analysis with the other options, however I would like to know the same.
I would recommend you use amazon s3 to store your files. It is the best.
Here's a quick rundown. We have two apps. Both apps are different from each other in every aspect.
Our first app is a company profile (4 page layout: home, products, about, contact). I don't think it will generate the same traffic as social web sites online. It allows staff members to post product photos. Is it better to have an AWS S3 account to store this content on? Or would I be better off storing the files on the local web server?
Our second app is more social oriented. Here, we have decided to use an S3 bucket for this app. Since we already have an AWS account, should we create a bucket for the first app on this AWS account? Or will this just render more expenses in the long run? What are your thoughts?
On another note. Should app related content (logo, icons, background images, buttons, etc) be stored on the S3 account or on the local server? What is the general consensus on this?
I would keep anything that is "hard-coded" into your site, IE the images in your CSS and anything in your HTML/ERB/HAML views that are referenced by their file name as part of the web server itself. You don't have to do this, but one reason to do it is if there's any interruption of Amazon S3 or any kind of configuration issue etc at any point in the future these 'basic' visual resources won't be affected, so your site will still look pretty much intact.
Plus, this kind of content has a specific appropriate place to live (your public folder) and rails helpers that make it really easy to access.
I just can't see a reason to mess with S3 if it's not going to be high volume of data and bandwidth.
On the other hand, if it IS going to be very high bandwidth and you're using a cheap hosting platform that doesn't give you tons of bandwidth, moving those images to S3 would save you some. But are you really in danger of exceeding your bandwidth limits on the server?
Lastly, I think your OTHER app is using S3 as expected, that's a good use of S3. I would not necessarily recommend mixing buckets with other apps, though. The simplest reason for this is, let's say one of your apps grows and at some point merges with, partners with, or is sold to another party. Really, imagine any scenario where another interest becomes involved in either app.
Now you've got no way to maintain separation of access between the resources your two different apps are using. So you have to either migrate to a new platform for one of them, or you have to deal with them being 'joined at the hip' so to speak.
In other words, I just don't see an upside to sharing AWS resources between two apps that aren't directly related to each other. No reason to make them a package deal if they aren't naturally a package deal.
On the other hand if it DOES make sense for them to be totally linked, then it doesn't really matter either way.
Hope that helps you think about it. Good luck!
s3 is ridiculously cheap so I'd put any image uploads on there.
In the past I've put static images, stylesheets and javascripts and so forth on s3 but finally I just leave them on the web server. Reason for this is jammit which is oh sooo cool in handling packaging, updating, compressing, etc, etc of assets.
I would like to use a delimited text file (xml/csv etc) as a replacement for a database within Ruby on Rails. Solutions?
(This is a class project requirement, I would much rather use a database if I had the choice.)
I'm fine serializing the data and sending it to the text file myself.
The best way is probably to take the ActiveModel API and build your methods that parse your files in the appropriate ways.
Here's a good presentation about ActiveModel and ActiveRelation where he builds a custom model, which should have a lot of similar concepts (but different backend.) And also a good blog post by Yehuda about the ActiveModel API
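To give an idea of the shape such a model could take, here is a minimal sketch of a class backed by a CSV file, using only Ruby's csv library. It does not use ActiveModel itself. The CsvPost class, its columns and its method names are hypothetical, and it rewrites the whole file on every create, which is only reasonable for small class-project data sets.

```ruby
require 'csv'
require 'tmpdir'

# Minimal file-backed model; the class name and columns are hypothetical.
class CsvPost
  HEADERS = %w[id title body].freeze

  class << self
    attr_accessor :path
  end

  # Reads every row as a Hash keyed by column name.
  def self.all
    return [] unless File.exist?(path)

    CSV.read(path, headers: true).map(&:to_h)
  end

  # CSV values are strings, so the id is compared as a string.
  def self.find(id)
    all.find { |post| post['id'] == id.to_s }
  end

  # Appends a record with the next free id and rewrites the file.
  def self.create(title:, body:)
    rows = all
    next_id = (rows.map { |row| row['id'].to_i }.max || 0) + 1
    post = { 'id' => next_id.to_s, 'title' => title, 'body' => body }
    write(rows + [post])
    post
  end

  def self.write(rows)
    CSV.open(path, 'w') do |csv|
      csv << HEADERS
      rows.each { |row| csv << HEADERS.map { |header| row[header] } }
    end
  end
  private_class_method :write
end

Dir.mktmpdir do |dir|
  CsvPost.path = File.join(dir, 'posts.csv')
  CsvPost.create(title: 'Hello', body: 'First post, with a comma')
  CsvPost.create(title: 'Again', body: 'Second post')
  puts CsvPost.find(2)['title']  # prints Again
end
```

Letting the CSV library handle quoting avoids bugs with commas and newlines inside fields, which hand-rolled split/join code tends to get wrong.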
Have you thought of using SQLite? It is a much better solution.
It uses a single file.
It is way faster than doing the serialization yourself.
It is zero configuration. Very simple to use.
You get ACID compliance, transactions, subselects, etc.
MySQL has a way to store tables in CSV. It has some pretty serious limitations, but it sounds like your requirements demand something with some pretty serious limitations anyway.
I've never set up a Rails project that way, and I don't know what it would take, but it seems like it might be possible.
HSQLDB seems to work by storing data on disk as a SQL script that creates your database. It records changes in memory and a log file, and when you shut down it recreates a single SQL script again. I've not used this one myself.
HSQLDB doesn't appear to be one of the supported databases in Rails. I don't know what it would take to add support for a new database.