Ruby on Rails deployment on a "thin" server with a lot of attachments - ruby-on-rails

A lot of PDFs are stored inside MySQL as a BLOB field for each PDF file. The average file size is 500 KB.
The Rails app streams the :binary data as a file download when a user clicks the download link.
Assuming a maximum of 5 users downloading 5 PDFs concurrently, what deployment setup parameters should I be aware of? e.g. for the case of thin:
thin start --servers 3
Is --servers 3 good enough (or are 5 or more needed) for the above example?
The second question is whether 'thin' is a capable solution at all.
Thanks!
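For reference, here's a minimal sketch of the kind of download action I mean (the Attachment model and its pdf_data BLOB column are just placeholder names):

# app/controllers/attachments_controller.rb
class AttachmentsController < ApplicationController
  def download
    attachment = Attachment.find(params[:id])
    # send_data pushes the BLOB through the Rails/thin process itself,
    # so the worker is tied up for the duration of the download.
    send_data attachment.pdf_data,
              :filename    => "#{attachment.name}.pdf",
              :type        => "application/pdf",
              :disposition => "attachment"
  end
end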

Firstly, I don't think you should be storing files in a database. A better place would be the file system, or alternatively cloud storage like S3. If you use an attachment plugin like Paperclip, this is very easy to set up.
However, let's assume you want to store your files in your database.
The problem with your current setup is that while you're sending a file, your thin instance is blocked until the client has downloaded the data. This means that if you have 3 thin instances and 3 people downloading PDFs, your site will not respond to any other requests.
Thankfully there is a solution to this problem, which involves the X-Sendfile header. The way this works is that your thin instance only tells your web server, for example nginx, where the file is, and the web server then serves the file directly.
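As a rough sketch, assuming Rails 3+ behind nginx (with the data living in the database you would first have to write it out to a path nginx can read):

# config/environments/production.rb -- make send_file emit the header nginx
# understands ("X-Sendfile" would be the Apache equivalent).
config.action_dispatch.x_sendfile_header = "X-Accel-Redirect"

# app/controllers/attachments_controller.rb -- model/column names are placeholders
def download
  attachment = Attachment.find(params[:id])
  path = Rails.root.join("tmp", "downloads", "#{attachment.id}.pdf")
  unless File.exist?(path)
    FileUtils.mkdir_p(File.dirname(path))
    File.open(path, "wb") { |f| f.write(attachment.pdf_data) }
  end
  # Rails only sends the header; nginx streams the file, freeing the thin worker.
  send_file path, :type => "application/pdf", :disposition => "attachment"
end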
Here's a great post on stackoverflow on how to set this up with nginx.
Which web server are you using?

You can dedicate one or two thin instances to downloads. In your web server you can then configure the proxy to route download URLs to those instances and the rest of your Rails application's traffic to the others.

Related

Is there a GEM to make a Rails app work like an FTP server

I have an app that stores products that get loaded up to other sites. Typically I get the product data via REST in the form of CSV files. These get stored in TMP, parsed and imported.
I however have a client who now wants to send me files via FTP, but at the same time doesn't want to use an external FTP.
So. I'm wondering if there's a GEM to make my rails app respond to FTP commands. Something that could present a table the same way that FTP presents files.
Yes I know I should go back and say "ha haa Haaaa!", but mine is not to reason why, mine is to do, or die.
I could likely roll my own, but if there's a GEM someone knows of, that would be most helpful. Thanks
There may be (but I doubt it). However, Heroku will only accept incoming requests on port 80, or on port 443 if you're using an SSL endpoint. So with Heroku, even if you found a gem, it would be impossible to use FTP, since that relies on ports 20/21 as standard.
I've done something similar, but not using FTP: instead, the client places files on Amazon S3 and my Heroku app lists the contents of a folder on S3.
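The listing step is straightforward with the aws-sdk (v1) gem; something like this, where the bucket name and the 'incoming/' prefix are just placeholders:

require 'aws-sdk'  # v1-era gem; credentials come from AWS.config or the AWS_* environment variables

s3 = AWS::S3.new
s3.buckets['my-app-uploads'].objects.with_prefix('incoming/').each do |object|
  puts "#{object.key} (#{object.content_length} bytes)"
  # object.read would pull the file down for parsing and importing.
end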

File access control in Rails

I have a web application which allows users to upload files and share them with other people across the internet. Anyone who has access can download the files, but if the uploader doesn't specifically share the file with someone else, that person can't download the files.
Since the user permissions are controlled by Rails, each time someone tries to download a file it is sent to the user from a Rails process. This is a serious bottleneck - Rails is needed for the file upload and permissions, but it shouldn't be in the way, taking up memory, just so others can download files.
I would like to split the application onto different servers for the frontend, database, and file server. When a user comes to my site, they should be able to download the file directly from something like my-fileserver.domain.com/file/38183 instead of running it through Rails.
What is the best option for this? I would like to control file access at the database level, not the file system - but I don't want Rails taking up all of the memory on my system for such a simple process. Any ideas?
Edit:
One thing I may be able to do is load a list of files/permissions from MySQL into a Node.js app and return access rights to the file server as a true/false response based on what the file server sends in. This still requires the file server to run a web server, however.
Maybe you could generate a random URL for each file and control access to it from a central system.
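For example, a rough sketch of that idea in Ruby (the DownloadGrant model and its columns are hypothetical, and this assumes ActiveSupport for the expiry helper):

require 'securerandom'

# Issued by the Rails app when a permitted user asks for a file.
def issue_download_url(file_record, user)
  token = SecureRandom.urlsafe_base64(32)
  DownloadGrant.create!(:file_id    => file_record.id,
                        :user_id    => user.id,
                        :token      => token,
                        :expires_at => 15.minutes.from_now)
  "https://my-fileserver.domain.com/file/#{file_record.id}?token=#{token}"
end

# The file server only has to check the token against the shared database
# before streaming the bytes from disk.
def token_valid?(file_id, token)
  grant = DownloadGrant.find_by_token(token)
  grant && grant.file_id == file_id.to_i && grant.expires_at > Time.now
end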

How to Upload Large Files on Heroku (Particularly Videos)

I'm using Heroku to host a web application with the primary focus of hosting videos. The videos are hosted through Vimeo Pro, and I'm using the vimeo gem by matthooks to help handle the upload process. Upload works for small files, but not for larger ones (~50 MB, for example).
A look at the Heroku logs shows that I am getting HTTP error 413, which stands for "Request Entity Too Large." I believe this might have to do with a limit that Heroku places on file uploads (greater than 30 MB, according to this webpage). The problem, though, is that any information I can find on the subject seems to be outdated and conflicting (like this page that claims there is no size limit). I also couldn't find anything on Heroku's site about this.
I've searched Google and found a few somewhat relevant pages (one and two), but no solutions that worked for me. Most of the pages I found deal with uploading large files to Amazon S3, which is different from what I'm trying to do.
Here's the relevant output of the logs:
2012-07-18T05:13:31+00:00 heroku[nginx]: 152.3.68.6 - - [18/Jul/2012:05:13:31 +0000]
"POST /videos HTTP/1.1" 413 192 "http://neoteach.com/components/19" "Mozilla/5.0
(Macintosh; Intel Mac OS X 10.7; rv:13.0) Gecko/20100101 Firefox/13.0.1" neoteach.com
There are no other errors in the logs; this is the only output that appears when I try to upload a video that is too large. That means this is not a timeout error or a problem with exceeding the allotted memory per dyno.
Does Heroku really place a limit on upload sizes? If so, is there any way to change this limit? Note that the files themselves are not being stored on Heroku's servers at all; they are merely being passed on to Vimeo's servers.
If the problem is not a limit on upload sizes, does anyone have an idea of what else might be going wrong?
Much thanks!
Update:
OP here. I'm still not exactly sure why I was getting this particular 413 error, but I was able to come up with a solution that works using the s3_swf_upload gem. The implementation involves Flash, which is less than ideal, but it was the only solution (out of 3 or 4 that I tried) that I could get working.
As Neil pointed out (thanks Neil!), the error I should have been getting is "H12 - Request timeout". And I did end up running into this error after repeated trials. The problem occurs when you try to upload large files to the Heroku server from your controller (using a web dyno), because it takes too long for the server to respond to the POST request.
The proper approach is to send the file directly to S3 without passing through Heroku.
Here's a high-level overview of my approach:
Use the s3_swf_upload gem to supply a direct upload form to S3.
Detect when the file is done uploading with the JavaScript callback function provided in the gem.
Using JavaScript, send Rails a POST request to let your server know the file is done uploading.
The controller that responds to the JavaScript POST does two things: (a) assigns an s3_key attribute to the video object (served up as a param in the form); (b) initiates a background task using the delayed_job gem.
The background task retrieves the file from S3. I used the aws-sdk gem to accomplish this, because it was already included in s3_swf_upload. Note that this is distinctly different from the aws-s3 gem (in fact they conflict with one another).
After the file has been retrieved from S3, I use the vimeo gem to upload it to Vimeo (still in the background).
The implementation above works, but it isn't perfect. For files that are close to 500 MB in size, you'll still run into R14 errors in your worker dynos. This occurs because Heroku only allots 512 MB of memory per dyno, so you can't load the entire file into memory at once. The way around this problem is to implement some sort of chunking in the final step, where you retrieve the file from S3 and upload it to Vimeo piece by piece. I'm still working on this part, and I'd love to hear any suggestions you might have.
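For what it's worth, the S3 half of that chunking can use the streaming read that aws-sdk (v1) already provides; a rough sketch (bucket and key are placeholders, and the Vimeo side still has to accept the file in pieces):

require 'aws-sdk'   # v1-era gem, the one pulled in by s3_swf_upload
require 'tempfile'

s3_object = AWS::S3.new.buckets['my-videos'].objects[video.s3_key]

# Stream the object to a tempfile chunk by chunk instead of loading ~500 MB
# into the dyno's memory in one go.
Tempfile.open(['video', '.mp4']) do |tmp|
  tmp.binmode
  s3_object.read { |chunk| tmp.write(chunk) }   # yields the body in pieces
  tmp.flush
  # ...hand tmp.path to the Vimeo upload from here...
end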
Hopefully this might help someone. Feel free to ask me any questions. Like I said, my solution isn't perfect so feel free to add your own answer if you think it could be better.
I think the best option here is indeed to upload directly to S3. It's much cheaper and much more secure than allowing users to upload files to your own server (or Heroku in this case). It's also a well-proven pattern used by lots of video hosting platforms (I know vzaar do this).
Check out the jQuery upload plugin, which allows direct uploads to S3: https://github.com/blueimp/jQuery-File-Upload
Also check out the Railscasts around this topic: #381 and #383.
Your biggest problem is not the size of the files here, but the fact that you are expecting the user to upload large files to Heroku, and then pass them on. The issue here is that all requests on the Heroku platform must return the first byte within 30 seconds - which in your case is very unlikely.
Therefore, you need to look at getting users to upload directly to S3/Vimeo/wherever, and then connect your application data to these uploaded assets.
If you're using Ruby, then the carrierwave_direct gem might be worth a look at for how it's done. Failing that, there are third-party services out there which allow you to do this via some code you drop into the page, but these come with an attached cost.
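Roughly, going by the carrierwave_direct README, the wiring looks something like this (the Video model and its :file column are placeholder names, and the details may differ between versions):

# app/uploaders/video_uploader.rb
class VideoUploader < CarrierWave::Uploader::Base
  include CarrierWaveDirect::Uploader   # adds direct-to-S3 upload support
end

# app/controllers/videos_controller.rb
def new
  @uploader = Video.new.file                        # Video mounts VideoUploader on :file
  @uploader.success_action_redirect = new_video_url # where S3 redirects after the POST
end

# In the view, the gem's direct_upload_form_for helper renders a form that
# POSTs the file straight to the S3 bucket, bypassing the Heroku dyno:
#   direct_upload_form_for @uploader do |f|
#     f.file_field :file
#     f.submit "Upload"
#   end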

How to post images directly to s3 on a heroku app from a json request?

I have a Rails app hosted on Heroku and a mobile app made with Rhodes.
I'd like to send images from the mobile app to my Rails app using an HTTP POST request. Since Heroku doesn't allow you to store files, I'm using Amazon S3.
I can't send the file from Heroku to S3 because it takes more than 30 seconds and causes a timeout. I've seen plenty of examples of uploading a file directly to S3 when the user has a form, but this obviously won't work in this case.
I tried using the suggestion here:
rails 3, heroku, aws-s3, simply trying to upload a file to S3 that is POSTed (http/multipart) to our app
but I still get a 503 request timeout.
I don't want to put my amazon s3 keys on the app.
Right now, I feel like my only option is to host my app on EC2 which I would rather not do as I like the simplicity of Heroku.
Also, it seems strange that these uploads would take so long regardless. I'm only posting images from a mobile phone camera, so they're not huge files.
I was getting the same error in a project at my job. Some people say that the only way to solve this is by uploading files directly to the S3 bucket. This is difficult in our case, because we are using the Paperclip gem for Rails and generate different sized versions of each image.
Other people say that "The Heroku timeout is a set-in-stone thing that you need to work around. Direct upload to S3 is the only option, with some sort of post-upload processing required", so I recommend the following.
Maybe this is not a complete solution, but it could be very useful; it was for me in a Rails app:
Worker Dynos, Background Jobs and Queueing
Perhaps you should move this heavy lifting into a background job which can run asynchronously from your web request.
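For example, a rough sketch using delayed_job and the aws-sdk (v1) gem, where the Photo model, its columns and the bucket name are all placeholders:

# app/jobs/s3_upload_job.rb
class S3UploadJob < Struct.new(:photo_id)
  def perform
    photo = Photo.find(photo_id)            # record holding the raw bytes from the JSON POST
    s3 = AWS::S3.new                        # credentials via AWS.config / AWS_* env vars
    object = s3.buckets['my-bucket'].objects["photos/#{photo.id}.jpg"]
    object.write(photo.raw_data, :content_type => 'image/jpeg')
    photo.update_attribute(:s3_key, object.key)
  end
end

# In the controller action that receives the mobile app's POST, return quickly
# and let the worker dyno do the slow S3 upload:
#   Delayed::Job.enqueue(S3UploadJob.new(photo.id))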
Regards!
So I finally figured out how to do this.
After lots of back and forth with AWS reps and Cloudfiles reps and pulling my hair out, I realized it would be a lot less work to just get another rails server that could write to the filesystem.
So, I started another rails app on openshift. It's just as easy as Heroku to get started (in fact, I might consider moving my rails app there, but it's too new for my taste right now and doesn't have the community around it that Heroku does).
Then, I just had to have communications between my two rails apps.
I know it's not the best/scalable/elegant fix, but it got the job done, and that's what matters in the end!

Rails and Node in the same app on Heroku?

I'm building a Rails application that deals with file uploads through CarrierWave. Currently, larger file uploads block the server for a significant amount of time. I have seen solutions like the s3-swf-upload-plugin gem that skip the local server and send files straight from the browser to S3, but this would require some modifications for pre-generating unique filenames and synchronizing them with the database. I'm sure it wouldn't be too much trouble, but Heroku's new Cedar stack gave me the idea of offloading these long running requests to a node.js instance running in the same app. I'm not very experienced with these kinds of things, so excuse my wording if it's a bit off.
Would something like this be possible? How would you configure things such that certain requests (ones involving file uploads, in this case) would be handled by a node app bundled in the same heroku repository as the main rails app?
I don't think it's possible to mix Rails and Node in the same app. However, you could get roughly the same functionality by using two separate apps that communicate with each other.
You can use ENV['DATABASE_URL'] to determine your database connection string. Use the Heroku console to set it as an ENV variable for your Node app (e.g. heroku config:add OTHER_DB=your_connection_string); the Node app should then be able to use the same connection string to connect to the same database as your other Heroku app. You could even access it outside of Heroku if you have a dedicated database, see: http://devcenter.heroku.com/articles/external-database-access
For seamless integration between the two apps, you could have a form rendered by the Rails app post to a URL of the Node app. In addition to the file upload, include in that form via hidden input fields any other variables you need to communicate to the Node app. When the upload to the Node app is done, it could redirect the client back to the Rails app, passing any status or variables as get parameters.
Run the two apps under two subdomains of the same domain and you could even share cookies between them.
You need two apps. I am doing exactly what's described in this question. I wanted large streaming uploads, and since Rack writes uploads to a temp file before passing them through to the handler, it is not possible to do this with Rails.
Node.js, on the other hand, does this beautifully. So there are two Heroku apps, the Rails web app and the Node.js (Express) web app. The Rails web app uses SWFUpload as the client-side solution. The Rails app and the Node.js app both have a secret key as a Heroku config variable. When it's time for the user to upload, client-side JavaScript requests an upload URL from the Rails server. The Rails server forms an upload URL with an Expires parameter and computes a signature using the secret key. The client-side JavaScript handler passes this URL along to SWFUpload (upload_url property). The user selects the files to upload, and SWFUpload starts posting them to the upload_url. The Node.js app verifies that the URL is not expired and that the signature is valid. It processes the form data with the formidable library.
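A simplified sketch of the Rails side of that signing step (the parameter names and the 15-minute expiry are illustrative, not my exact code):

require 'openssl'

UPLOAD_SECRET    = ENV['UPLOAD_SECRET']            # the shared Heroku config variable
NODE_UPLOAD_HOST = 'https://uploads.example.com'   # placeholder for the Node.js app's host

# Called by the endpoint the client-side JavaScript hits to get an upload_url.
def signed_upload_url(user)
  expires   = 15.minutes.from_now.to_i
  payload   = "user_id=#{user.id}&expires=#{expires}"
  signature = OpenSSL::HMAC.hexdigest(OpenSSL::Digest::SHA256.new, UPLOAD_SECRET, payload)
  "#{NODE_UPLOAD_HOST}/upload?#{payload}&signature=#{signature}"
end

# The Node.js app rejects the POST if the Expires timestamp has passed or if
# recomputing HMAC-SHA256(secret, payload) does not match the signature.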
One other detail. Flash requires the Node.js app to serve a crossdomain.xml that permits the cross-site request.
My Node.js app doesn't touch the database; but if it did I would share DATABASE_URL as previously suggested. Note that you can't share a DATABASE_URL outside of Heroku unless you have a dedicated DB. The DATABASE_URLs for shared databases are not reachable from outside Heroku (unlike some other services like RedisToGo).
