CarrierWave stores images locally, not on S3, on Heroku - ruby-on-rails

I am using CarrierWave to upload images and display them in a photo gallery. CarrierWave stores the files at public/uploads, but the images are not being displayed on Heroku. I found that Heroku's filesystem is read-only and that files should be stored on S3.
Are there any alternatives to S3? If yes, can you please share them here?

Heroku is only read-only if you're on the Bamboo stack (old). Cedar uses an ephemeral, writeable filesystem, which means that while you can upload, the files get wiped with every deploy.
S3 is not your only option; it's just Amazon's storage system. You've got Dropbox, Azure, Rackspace & a bunch of others which provide similar functionality.
Your question should really be: which storage solution is right for my app?
The main issue is the location of your files -- they need to be close to your app to reduce latency. We recently had a problem serving S3 files from a Rackspace-hosted app: because S3 is not in Rackspace's datacenter, the latency was high.
Because Heroku is built on Amazon's AWS cloud, serving assets from S3 is the most efficient & logical way to provide them.
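For example, with CarrierWave the switch from local storage to an external provider is mostly configuration. Here is a minimal sketch, assuming the fog-aws gem and credentials kept in environment variables (the variable and bucket names are illustrative):

    # config/initializers/carrierwave.rb
    CarrierWave.configure do |config|
      config.fog_provider = 'fog/aws'            # swap for fog/rackspace, fog/google, etc.
      config.fog_credentials = {
        provider:              'AWS',
        aws_access_key_id:     ENV['AWS_ACCESS_KEY_ID'],
        aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
        region:                ENV['AWS_REGION']
      }
      config.fog_directory = ENV['S3_BUCKET']    # the bucket (or container) uploads go to
    end

    # app/uploaders/image_uploader.rb
    class ImageUploader < CarrierWave::Uploader::Base
      storage :fog    # instead of storage :file
    end

Swapping the fog_credentials hash is usually all it takes to point the same uploader at Rackspace, Google Cloud Storage and so on.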

Related

Heroku - hosting files and static files for my project

I want to use Heroku to host my Ruby on Rails project. It will involve lots of file uploads, mostly images. Can I host and serve those static files on Heroku, or is it wiser to use a service like Amazon S3? What is your opinion on that approach? What are my options for hosting those static files on Heroku?
To answer your question, Heroku's "ephemeral filesystem" will not serve as storage for static uploads. Heroku is an app server, period. You have to plug into data storage elsewhere.
From Heroku's spec:
Ephemeral filesystem
Each dyno gets its own ephemeral filesystem, with a fresh copy of the most recently deployed code. During the dyno’s lifetime its running processes can use the filesystem as a temporary scratchpad, but no files that are written are visible to processes in any other dyno and any files written will be discarded the moment the dyno is stopped or restarted. For example, this occurs any time a dyno is replaced due to application deployment and approximately once a day as part of normal dyno management.
Heroku is a great option for RoR in my opinion. I have used it personally and ran into the problem that has already been mentioned here (you can't store anything on Heroku's filesystem). I therefore used S3, following this tutorial: https://devcenter.heroku.com/articles/s3
Hope it helps!
PS: Make sure not to store the S3 credentials in any file; instead create config vars as described here: https://devcenter.heroku.com/articles/config-vars
I used to keep them in a file and, long story short, someone gained access to my Amazon account and it was billed several thousand dollars (in just a couple of days). The Amazon staff were kind enough to waive those charges. Just something to keep in mind.
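For example (the variable names are only illustrative), the credentials can be set as Heroku config vars and read from ENV inside the app:

    # From your shell, store the keys on Heroku rather than in the repo:
    #   heroku config:set AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=...
    #
    # Then reference them from Ruby wherever the S3 client is configured:
    aws_key    = ENV['AWS_ACCESS_KEY_ID']
    aws_secret = ENV['AWS_SECRET_ACCESS_KEY']

That way the keys never end up in version control.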
As pointed out, you shouldn't do this on Heroku for the specific reason of ephemeral storage, but to answer your question more broadly, storing user-uploaded content on a local filesystem on any host has a few inherent issues:
You can quickly run out of local storage space on the disk
You can lose all your user-uploaded content if the hardware crashes / the directory gets deleted / etc.
Heroku, EC2, Digital Ocean, etc. all provide servers that don't come with any guarantee of persistence (ephemeral storage especially). This means that your instance may shut down at any point, be swapped out, etc.
You can't scale your application horizontally. The files on one server won't be accessible from another (or dyno, or whatever your provider of choice calls them).
S3, however, is such a widely-used solution because:
It's incredibly cheap (we store 20 TB of data for something like $500 a month)
Your uploaded files aren't at risk of disappearing due to hardware failure
Your uploaded files are decoupled from the application, meaning any server / dyno / whatever could access them.
You can always put your S3 buckets behind CloudFront if you need a CDN, without any extra effort.
And certainly many more reasons. The most important thing to remember, is that by storing uploaded content locally on a server, you put yourself in a position where you can't scale horizontally, regardless of how you're hosting your app.
It is wiser to host files on S3, and actually it is even wiser to use direct uploads to S3.
You can read the arguments, for example, here.
Main point: Heroku is a really, really expensive thing.
So you need to save every bit of resources you have. The only way to store static files on Heroku is to have a separate dyno running an app server for them, and static files don't need an app server. So it's just a waste of CPU time (and you should read that as "a waste of a lot of my money").
Also, uploading a huge number of huge files through your app will quickly put you over your memory quota (read that as "will waste even more of my money because I will need to run more dynos"). So it's best to upload files directly to S3.
Heroku is great for hosting your app. Use the tool that best suits the task.
UPD: Forgot to mention – not only will you need a separate dyno for static assets, those assets will also disappear every time that dyno is restarted.
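As a rough sketch of the direct-upload idea, the app hands the browser a short-lived presigned URL and the browser PUTs the file straight to S3, so the dyno never touches the bytes. This assumes the aws-sdk-s3 gem; the bucket name and key prefix are made up:

    require 'aws-sdk-s3'
    require 'securerandom'

    s3     = Aws::S3::Resource.new(region: ENV['AWS_REGION'])
    object = s3.bucket('my-uploads-bucket').object("uploads/#{SecureRandom.uuid}")

    # Give this URL to the client; it expires after 15 minutes.
    upload_url = object.presigned_url(:put, expires_in: 15 * 60)

After the upload finishes, the app only stores the object key in its database.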
I had the same problem. I solved it by adding all my images to my Rails app. I then reference the images using links that look something like
myapp.herokuapp.com/assets/image1.jpg
I then add the link from the CMS. It might not be the best option, but it works.

How to manage Amazon S3 bucket size?

I am using Amazon S3 to save the uploads in my Rails application.
But the bucket size is growing very rapidly. I have used the Kraken image optimizer to compress images, but I want to know what else I can do to manage the bucket size.
This depends on your use case, e.g. whether you always need access to the files. Optimizing / resizing uploads is probably a good idea, but you can also have a look at S3 lifecycle management. With this feature you can, for example, delete old files or move them to AWS Glacier. See the reference for an example of how to set this up using the AWS console.
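As a sketch of what such a lifecycle rule can look like with the aws-sdk-s3 gem (the bucket name and prefix are illustrative; the same rule can be created from the AWS console):

    require 'aws-sdk-s3'

    s3 = Aws::S3::Client.new(region: ENV['AWS_REGION'])
    s3.put_bucket_lifecycle_configuration(
      bucket: 'my-uploads-bucket',
      lifecycle_configuration: {
        rules: [{
          id:          'archive-old-uploads',
          status:      'Enabled',
          filter:      { prefix: 'uploads/' },
          transitions: [{ days: 90, storage_class: 'GLACIER' }],  # move to Glacier after 90 days
          expiration:  { days: 365 }                              # delete after a year
        }]
      }
    )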

Can I host images on Heroku? Or do I need S3?

I'm deploying my web app (it's for a corporate client), so users will not add images; only the business will.
I've deployed to Heroku, and my images are still showing. When do I need to use S3? I'll have around 100 images in total on the site, and that number will change by maybe 7 or more a week. Can I use only Heroku?
The short answer: if you allow users or admins to upload images, you should not use Heroku's file system for this as the images will suddenly vanish.
As explained in the Heroku documentation:
Each dyno gets its own ephemeral filesystem, with a fresh copy of the most recently deployed code. During the dyno’s lifetime its running processes can use the filesystem as a temporary scratchpad, but no files that are written are visible to processes in any other dyno and any files written will be discarded the moment the dyno is stopped or restarted.
This means that user uploaded images on the Heroku filesystem are not only wiped out with every push, but also with every dyno restart, which occasionally happens (even if you would ping them frequently to prevent them going to sleep).
Once you start using a second web dyno, it will not be able to read the other dyno's filesystem, so images would only be visible from one dyno. This would cause weird issues where users can sometimes see images and sometimes they can't.
That said, you can temporarily store images on the Heroku filesystem if you implement a pass-through file upload to an external file store.
Asset Pipeline
FiveDigit's answer is very good - but there is something more to consider: the role of the asset pipeline in Rails.
If the images are used as assets (i.e. they are used in the layout and are not changeable by the user), then you can store them in the assets/images folder. There is no limit to the number of assets you can keep with your application, but you must be clear on what they are: files which aid your application's operation, not files which can be uploaded or manipulated:
The asset pipeline provides a framework to concatenate and minify or compress JavaScript and CSS assets. It also adds the ability to write these assets in other languages and pre-processors such as CoffeeScript, Sass and ERB.
The asset pipeline will compress & fingerprint the stylesheet, image and JS files it has when you deploy your application to the likes of Heroku, or any other server. This means that if those files don't change, you can keep them in there.
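For instance, an image dropped into app/assets/images can be referenced with the standard helper, and the pipeline serves the fingerprinted copy (the filename here is just an example):

    <%= image_tag "logo.png" %>
    <%# renders a fingerprinted path such as /assets/logo-908e25f4bf641868d8683022a5b62f54.png %>

If logo.png only ever changes with a deploy, that is all you need; no external storage is involved.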
-
S3
The reason you'd want to use the likes of S3 is specifically that your image files are designed to change (the user can upload / edit them). Regardless of Heroku's filesystem, if the images are tied to changes in the DB, you'll have to keep a central store for them - if you change servers, they still need to be reachable.
To decide, you should be clear on how you want the files to work - are they going to be manipulated constantly by the user or not? If so, you'll have to explore integrating S3 into your app.

Do static images get cached by Heroku if they're loaded from Amazon's S3 cloud servers?

The title pretty much describes my entire question.
Right now I'm wondering whether my app is faster if I upload my static images to Heroku or to Amazon's S3.
According to the Heroku Dev Center, Heroku apps on the Aspen and Bamboo stacks use Varnish to cache output from your application. On the Cedar stack, rack-cache and the memcache add-on must be used. Here's some more info if you're on Aspen or Bamboo:
From the Heroku Dev Center:
Anything that is served from the filesystem (a Rack::File) is cached for 12 hours. Whenever you push changes, your cache is cleared (see below), and since Heroku filesystems are read-only, it’s safe to cache these for a long period.
However they mention in an aside:
Large static assets, such as MP3s or PDFs, should generally not be included in your code tree. Use an external asset hosting service such as Amazon S3, instead. See this article for more information.
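For the Cedar-era setup, here is a rough sketch of the relevant production settings (ASSET_HOST is an assumed environment variable pointing at an S3 bucket or CDN):

    # config/environments/production.rb
    config.serve_static_assets  = true                              # let the dyno serve /public
    config.static_cache_control = "public, max-age=31536000"        # far-future headers for static assets
    config.action_controller.asset_host = ENV['ASSET_HOST']         # serve assets from S3/CDN instead

With asset_host set, image, JS and CSS URLs point at the external host, so the dyno and its cache are bypassed entirely for those files.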

Heroku + Paperclip + Amazon S3 - Pricing?

Since Heroku has a read-only filesystem, I can't use Paperclip to store a small quantity of files on the server. Database image storage is an option, but not particularly ideal, since it may crank my client's DB size up from a few hundred KB to over the 5 MB 'free' shared DB limit (depending on the size of the images).
That leaves Amazon S3 as a likely solution. I understand that Heroku is hosted on EC2 (I believe?). Amazon's pricing wording was a little confusing when referring to S3-EC2 file transfers. If I have my client set up an S3 account and do the file transfers to and from there, what is the pricing going to look like?
Is it cheaper, from an S3 point of view, to both upload and download data in the Rails controllers and then feed the data to the browser using send_file? Or would it make more sense to just link straight to the image or PDF from the browser like normal?
Would my client have to pay anything at all, since Heroku is hosted on Amazon? I was looking for other questions related to this, but there weren't any really straight answers about which parts of the file transfer would be charged for.
I guess the storage would cost a little (hardly anything), but what about the bandwidth? Thanks :)
Is it cheaper from an S3 point-of-view to both upload and download data in the rails controllers, and then feed the data to the browser using send_file? Or would it make more sense to just link straight to the image or pdf from the browser like normal?
From an S3 standpoint, yes, this would be free, because Heroku would be covering your transfer costs. HOWEVER: Heroku only lets a request run for 30 seconds, and during that time other clients won't be able to load the site from that dyno, so this is really a terrible idea. Your best bet is to serve the files out of S3 directly, in which case your customer pays for the transfer between S3 and the end user.
Any interaction you have with the file from Heroku (i.e. metadata and whatnot) will be free because it is EC2 -> S3.
For most cases, your pricing would be identical to what it would be if you were not using Heroku. The only case where this would change is if your app constantly accesses the data directly on S3 (to read metadata / load files).
You can use Paperclip on Heroku - just not the local filesystem for storage. Fortunately, Paperclip can use S3 for storage. Heroku has a tech article here that covers it.
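Here is a minimal sketch of what that configuration might look like with S3 as the Paperclip storage backend (the model, env var and bucket names are illustrative):

    class Photo < ActiveRecord::Base
      has_attached_file :image,
        storage: :s3,
        s3_credentials: {
          bucket:            ENV['S3_BUCKET'],
          access_key_id:     ENV['AWS_ACCESS_KEY_ID'],
          secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
        },
        s3_region: ENV['AWS_REGION']
    end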
Also, when an uploaded asset is displayed on a page (look up asset_host), the image is loaded directly from your S3 bucket's URL, so you will pay Amazon for a GET request to the image and for the data transfer involved, as well as for storing the assets on S3. Have you looked at the S3 calculator to get indicative costs?
