MongoDB's GridFS, Rails 3, X-Sendfile, and ACLs, HOW-TO? - ruby-on-rails

I have a Rails 3 project that does file upload/download, with access rights (User has many Files, and can only read/write his own files).
If I store my files on classic filesystem, I can check the access to the file in my rails app and then use X-Sendfile header to redirect to the file, if user has access. In this way, a user can never access a file without permission, and the download is fast.
Can I make file downloads from GridFS as fast as X-Sendfile, and skip the hassle of piping them through Rails/Rack?
Wouldn't piping them through Rails/Rack be horribly slow?
Can I make file downloads from GridFS as fast as X-Sendfile, skip the hassle of piping them through Rails/Rack, AND ALSO keep the ability to enforce access rights?

Up until now I've found and thought of two possible solutions:
Use something like gridfs-fuse to mount GridFS onto the local filesystem and use X-Sendfile just as always.
Use something like nginx-gridfs, which is written in C, fast, and lives outside Rails (it does not block my app's request/response cycle while downloading). The downside is that it's server-specific.
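For reference, a minimal sketch of the X-Sendfile approach described above (not code from the question): the controller enforces the access check and then hands the actual transfer off to the front-end web server. Model and attribute names (files association, disk_path) are placeholders.
class FilesController < ApplicationController
  def show
    file = current_user.files.find_by_id(params[:id])
    return head :forbidden unless file  # enforce access rights in Rails

    # With config.action_dispatch.x_sendfile_header = "X-Sendfile" set
    # (or "X-Accel-Redirect" for nginx), send_file only writes a header and
    # the web server streams the bytes, so Rails is not tied up piping data.
    send_file file.disk_path, :disposition => "attachment"
  end
end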

Related

Serve audio file from GCS with Rails

I have audio files located on a private GCS bucket. I want to serve these audio files for users to listen to.
I can't use Active Storage for this as these files are created/deleted outside of my Rails application.
I could download files using the google-cloud-storage gem. It would cover authentication and file download. But if I understand correctly, I can only serve files from the public directory, so would I need to download them to Rails.public_path?
Furthermore, I really don't want to manage these files after downloading them - caching, deleting them after some time, etc.
What would be the best way to achieve this?
The best option in my opinion would be to use the google-cloud-storage gem,
since both Google::Cloud::Storage::Bucket and Google::Cloud::Storage::File have the #signed_url method. This way you can find the relevant file(s) that you need, create a temporary URL, and send that URL to the client, which will be in charge of downloading the file directly.
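A minimal sketch of that signed-URL flow, assuming the google-cloud-storage gem; the project, bucket, and object names are placeholders.
require "google/cloud/storage"

storage = Google::Cloud::Storage.new(project_id: "my-project")
bucket  = storage.bucket("my-private-audio-bucket")
file    = bucket.file("episodes/42.mp3")

# Temporary URL the client can fetch directly; valid for 5 minutes.
url = file.signed_url(method: "GET", expires: 300)
# hand `url` to the client, e.g. render json: { url: url }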
If you don't want the client to download the file directly from Google Cloud, you can just download the file from GC yourself and use #send_data or #send_file in the controller.
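And a sketch of that proxy-through-Rails alternative (again with placeholder names); called without a path argument, File#download returns the object in memory as a StringIO.
class AudioController < ApplicationController
  def show
    storage = Google::Cloud::Storage.new(project_id: "my-project")
    file    = storage.bucket("my-private-audio-bucket").file(params[:key])

    send_data file.download.read,
              filename: File.basename(file.name),
              type: file.content_type || "audio/mpeg",
              disposition: "inline"
  end
end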

How to access files on Ceph directly as URL

I need a storage system with the following requirements:
1. It should support data/service clustering
2. It should be open-source so that I can extend functionalities later if needed
3. It should expose a file system, because I want to access some files via a public URL (direct access), so that I can store my scripts in these files and refer to them directly.
4. Supports some kind of authentication
5. I want it to be on premise (Not cloud).
Ceph seems to satisfy all the criteria, but does it support public access to files via a URL (point 3)? It can generate temporary URLs, but I want permanent URLs for a few files.
You could run Nextcloud and have your data volume (and database, if you feel so inclined) stored on the Ceph cluster. That's open-source, you can set up direct links to files (including permanent links), and it's authenticated.

How do I generate files and then zip/compress with Heroku?

I sort of want to do the reverse of this.
Instead of unzipping and adding the collection files to S3 I want to
On user's request:
generate a bunch of xml files
zip the xml files with some images (pre-existing images hosted on s3)
download zip
Does anybody know a good way of doing this? I think I could manage this no problem on a normal machine, but Heroku complicates things somewhat in that it has a read-only filesystem.
From the heroku documentation on the read-only filesystem:
There are two directories that are writeable: ./tmp and ./log (under your application root). If you wish to drop a file temporarily for the duration of the request, you can write to a filename like #{RAILS_ROOT}/tmp/myfile_#{Process.pid}. There is no guarantee that this file will be there on subsequent requests (although it might be), so this should not be used for any kind of permanent storage.
You should be able to pretty easily write your generated XML files to tmp/ and keep track of the names, download and write the S3 files to the same directory, and (maybe?) invoke a zip command as long as the output is in tmp/, then serve the file to the browser with the correct MIME type to prompt a download. My only concerns would be how big the files get and whether Heroku has an undocumented limit on what they'll allow in the tmp directory. Especially since you are only performing this action for a one-time download within the duration of a single request, I think you have a good chance of being able to do it.
Edit: Looking around a bit, you might be able to use something like RubyZip to create your zip file if you want to avoid calling system commands.
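A rough sketch of that approach (placeholder bucket and keys, a hypothetical build_xml helper, and the aws-sdk-s3 gem assumed for the S3 downloads): everything stays under tmp/, and RubyZip builds the archive without shelling out.
require "zip"
require "aws-sdk-s3"

class ExportsController < ApplicationController
  def create
    zip_path = "#{Rails.root}/tmp/export_#{Process.pid}.zip"
    s3 = Aws::S3::Resource.new(region: "us-east-1")

    Zip::File.open(zip_path, Zip::File::CREATE) do |zip|
      # Generated XML is written straight into the archive, no temp file needed.
      zip.get_output_stream("data.xml") { |entry| entry.write(build_xml) }

      # Pre-existing images are downloaded into tmp/ first, then added.
      %w[cover.png thumb.png].each do |key|
        local = "#{Rails.root}/tmp/#{Process.pid}_#{key}"
        s3.bucket("my-bucket").object("images/#{key}").download_file(local)
        zip.add(key, local)
      end
    end

    send_file zip_path, type: "application/zip", filename: "export.zip"
  end
end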

Recommendations for file server to be used with Rails application

I'm working on a Rails app that accepts file uploads and where users can modify these files later. For example, they can change the text file contents or perform basic manipulations on images such as resizing, cropping, rotating etc.
At the moment the files are stored on the same server where Apache is running with Passenger to serve all application requests.
I need to move user files to a dedicated server to distribute the load on my setup. At the moment our users upload around 10GB of files a week, which is not a huge amount but eventually it adds up.
So I'm going through different options for implementing the communication between the application server(s) and a file server. I'd like to start out with a simple and fool-proof solution. If it scales well later across multiple file servers, I'd be more than happy.
Here are some different options I've been investigating:
Amazon S3. I find it a bit difficult to implement for my application. It adds the complexity of "uploading" the uploaded file again (possibly multiple times later); bear in mind that users can modify files and images with my app. Other than that, it would be a nice "set it and forget it" solution.
Some sort of simple RPC server that lives on the file server and transparently manages files when looked at from the application server side. I haven't been able to find any standard and well-tested tools here yet, so this is a bit more theoretical in my mind. However, BERT and Ernie, built and used at GitHub, seem interesting but maybe too complex just to start out with.
MogileFS also seems interesting. Haven't seen it in use (but that's my problem :).
So I'm looking for different (and possibly standards-based) approaches to how file servers for web applications are implemented and how they have worked in the wild.
Use S3. It is inexpensive, a-la-carte, and if people start downloading their files, your server won't have to get stressed because your download pages can point directly to the S3 URL of the uploaded file.
"Pedro" has a nice sample application that works with S3 at github.com.
Clone the application ( git clone git://github.com/pedro/paperclip-on-heroku.git )
Make sure that you have the right_aws gem installed.
Put your Amazon S3 credentials (API & secret) into config/s3.yml
Install the Firefox S3 plugin (http://www.s3fox.net/)
Go into Firefox S3 plugin and put in your api & secret.
Use the S3 plugin to create a bucket with a unique name, perhaps 'your-paperclip-demo'.
Edit app/models/user.rb, and put your bucket name on the second last line (:bucket => 'your-paperclip-demo').
Fire up your server locally and upload some files to your local app. You'll see from the S3 plugin that the file was uploaded to Amazon S3 in your new bucket.
I'm usually terribly incompetent or unlucky at getting these kinds of things working, but with Pedro's little S3 upload application I was successful. Good luck.
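Roughly what that setup boils down to in the model (a sketch, not Pedro's exact code; the credentials file and bucket name come from the steps above):
class User < ActiveRecord::Base
  has_attached_file :avatar,
    :storage => :s3,
    :s3_credentials => "#{Rails.root}/config/s3.yml",
    :bucket => 'your-paperclip-demo',
    :path => ":attachment/:id/:style/:filename"
end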
You could also try to compile a version of Dropbox (they provide the source) and ln -s that to your public/system directory so Paperclip saves to it. This way you can access the files remotely from any desktop as well... I haven't done this yet, so I can't attest to how easy/hard/valuable it is, but it's on my TeuxDeux list... :)
I think S3 is your best bet. With a plugin like Paperclip it's really very easy to add to a Rails application, and not having to worry about scaling it will save on headaches.

Sharing Uploaded Files between multiple Rails Applications

I have multiple applications (an admin application, a "public"/non-admin application and a web service application) that all share a single database.
I've gotten the applications to share models and other code where appropriate, so I don't have multiple copies of the same code in each. However, the one task that I've yet to configure is how to share files that get uploaded between applications. I'm using Paperclip to successfully upload files to my applications, but it stores the files with whichever application handled the upload.
Ideally, I'd like to be able to serve all the files from the web service. My idea was that I'd need some type of task executed every time a new file is uploaded to any of the applications to have the file created in the file structure of the web service.
I know I could easily accomplish serving files from a single application if I loaded the files into the database (which is how I accomplished this in a similar application suite), but I'm not sure if that's the best route to go for managing/serving the files. Another idea I had was storing the files in the database and having the web service manage "serving" them and having it create the file on the disk on the first request. After the first request for the file, the web service would serve the file from the disk rather than from the database.
Does anyone have any ideas on what the best way to accomplish this might be? Or any better ideas?
Thank you in advance for any feedback anyone might have on the subject.
I'd recommend putting them in a shared location that is served directly by your front-end webserver (not Rails), if you have that kind of setup. In this example, a location called files points at a folder on disk. Then change the save location in your Paperclip options:
has_attached_file :image,
  :url => "/files/:basename.:extension",
  :path => "/var/htdocs/public/files/:basename.:extension"
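The answer doesn't name a front-end server; if it happens to be nginx, the matching piece might look roughly like this (the paths mirror the placeholder Paperclip options above):
location /files/ {
  alias /var/htdocs/public/files/;  # serve uploads straight from disk, bypassing Rails
  expires 30d;
}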
Are you running all apps on the same UNIX/Linux system? Have you tried creating symbolic links to share the folder that contains the images? The goal is to save all images to the same location. Eliminating the need to throw in complicated hooks for attachment creation.
Paperclip by default stores things at :rails_root/public/system/:attachment/:id/:style/:filename. If you're sharing a database, you won't have to worry about collisions. And you just need to create a system folder to be used by each app.
You can use one app's public/system folder as the master, or create an entirely new one. From this point on all other system folders that aren't the master one will be referred to slave folders. Once you've chosen your master it's as simple as moving everything in each slave folder to the master folder. Deleting the slave folders and replacing them with a symbolic link to the master folder.
Sample command set to migrate and replace with symlink given the paperclip defaults. It's probably a good idea to stop the server before attempting this.
$ mv /path/to/slave/project/public/system/* /path/to/master/system
$ mv /path/to/slave/project/public/system /path/to/slave/project/public/system.bak
$ ln -s /path/to/master/system /path/to/slave/project/public/system
Once you're sure the migration is successful you can remove the backup:
$ rm /path/to/slave/project/public/system.bak
