File storage backend for Rails - ruby-on-rails

I have a Rails application that I want to add file upload to, so that users have access to a "resources" section where they can upload and share (although not publicly) any type of file. I know I could build a solution using paperclip and S3, for example, but to avoid the admin overhead of all that I'm looking at the APIs for drop.io and box.net. Does anyone have any experience with these? I've got a basic demo working rather well with drop.io, but I was wondering if anyone had any better ideas or experiences.
Many thanks
D

I use attachment_fu with an S3 backend. For user interface goodness, I use YUI's file uploader.
Some of the files are uploaded with world read access, others with no public read access.
I use attachment_fu to create signed URLs that let clients access the private S3 files.
I did write some small helper routines for the S3 library for re-connecting after a timeout, handling various errors that the S3 library can raise, etc.
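A minimal sketch of what such a retry helper might look like, assuming the transient failures surface as timeout or connection-reset exceptions. The name with_retries, its parameters, and the list of retried errors are hypothetical and not taken from attachment_fu or any S3 library.

```ruby
require "timeout"

# Hypothetical helper: retries the block when it raises one of the listed
# transient errors, waiting a little longer before each new attempt.
def with_retries(max_attempts: 3, base_delay: 0.5,
                 retry_on: [Timeout::Error, Errno::ECONNRESET, IOError])
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *retry_on
    # Give up and re-raise the original error once the budget is used.
    raise if attempts >= max_attempts
    sleep(base_delay * (2**(attempts - 1))) # 0.5s, 1s, 2s, ...
    retry
  end
end
```

The S3 call itself would go inside the block, for example with_retries { upload_file(path) }, where upload_file stands in for whatever the S3 library provides.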
Building your own library for drop.io and/or box.net
Your idea of using the API for a commercial service is interesting but I haven't run into any problems with the above config. And the price for direct S3 access is very low.
If you do decide to go this route, you may want to open source your code. You'd benefit by getting testing, ideas, and possible code contributions from the community.
Note that if you have a lot of uploads, you can end up with a performance issue if the uploads are synchronous with the Rails thread: the Rails process is busy uploading and can't do anything else until the upload is done.
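One way to picture the asynchronous alternative is a small in-process queue: the request thread only enqueues the job, and a worker thread does the slow transfer. This is a sketch under assumptions, not a production design; the UploadQueue class and its methods are hypothetical names.

```ruby
# Hypothetical in-process upload queue: the request thread only enqueues a
# job, and a single worker thread performs the slow transfer.
class UploadQueue
  def initialize(&uploader)
    @uploader = uploader # block that performs the real transfer
    @jobs = Queue.new
    @worker = Thread.new { run }
  end

  # Returns immediately; intended to be called from the request thread.
  def enqueue(path)
    @jobs << path
    self
  end

  # Lets the worker finish the jobs already queued, then stops it.
  def shutdown
    @jobs << :stop
    @worker.join
  end

  private

  def run
    while (job = @jobs.pop) != :stop
      begin
        @uploader.call(job)
      rescue StandardError => e
        warn "upload of #{job} failed: #{e.message}"
      end
    end
  end
end
```

Jobs held in memory like this are lost if the process restarts, so a persistent background job system would be the more usual choice for a real application.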
HTH,
Larry

Related

What is the best way for an iOS app to access data from a public website without overloading it?

I would like to use some publicly available data from a government website as a source of data in an iOS app. But I am not sure what the best / most polite / most scalable way is to have a large number of users request data from this website with the least impact on their servers and the best reliability for me.
It is 1-50kb of static XML with a fixed URL scheme
It updates with a new XML once a day
New users would need to download past data
It has a Last-Modified header but no caching headers
It does not use compression or a CDN
It's a government website, so if someone even replies to my email I doubt they are going to change how they host it for me...
I'm thinking I could run a script on a server to download this file once a day and re-host it for my app however my heart desires. But I don't currently run a server that I could use for this, and it seems like a lot just for this. My knowledge of web development is not great, so perhaps I am missing something obvious and just don't know what search terms to use to find the answer.
Can I point a CDN at this static data somehow and use that?
Is there something in CloudKit I could use?
Should I run a script on AWS somehow to do the rehosting without needing a full server?
Should I just not worry about it and access the data directly?
You can use the AWS S3 service (Simple Storage Service).
The flow is somewhat like this:
If the file doesn't exist on S3 yet, or if the creation date of the file on S3 is yesterday or earlier, the iOS app downloads the XML from the gov site and stores it in S3.
If the file exists on S3 and is up to date, download it from S3.
After that, the data can be presented by the app without overloading the site.
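The decision in that flow can be sketched as follows. This is illustrative only: a plain Hash stands in for the S3 bucket, a lambda stands in for the download from the source site, and fetch_via_mirror is a hypothetical name.

```ruby
require "date"

# Hypothetical mirror: `store` is any Hash-like object mapping a key to
# { body:, fetched_on: } and stands in for the S3 bucket. `fetch_origin` is
# a callable that downloads the XML from the source site.
def fetch_via_mirror(store, key, fetch_origin, today: Date.today)
  cached = store[key]
  if cached.nil? || cached[:fetched_on] < today
    # Missing or older than today: contact the origin once and refresh.
    body = fetch_origin.call(key)
    store[key] = { body: body, fetched_on: today }
    body
  else
    # Fresh copy: the origin is not contacted at all.
    cached[:body]
  end
end
```

With this shape the source site sees roughly one request per file per day, however many times the mirrored copy is read.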
I think the best way for you is to create an intermediary database where you can store your data in a secure manner.
Create a pipeline that does some data transformation and stores the result in your newly created database.
Create an API with pagination and your desired filters.
Also make sure you are not violating any data policies in the process.
I hope this helps.
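For the pagination part, a minimal sketch of what the API could return per request, assuming the records are already loaded into an array. The paginate helper, its defaults, and the cap of 200 items per page are hypothetical choices.

```ruby
# Hypothetical pagination helper for an API endpoint: returns one page of
# records plus the metadata a client needs to request further pages.
def paginate(records, page: 1, per_page: 50)
  page = [page.to_i, 1].max
  per_page = per_page.to_i.clamp(1, 200) # cap page size to protect the server
  total_pages = (records.size / per_page.to_f).ceil
  {
    # slice returns nil when the page starts past the end, hence the fallback
    items: records.slice((page - 1) * per_page, per_page) || [],
    page: page,
    per_page: per_page,
    total_pages: total_pages
  }
end
```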

File sharing solution with Ruby on Rails

I am working on a web application that will be used to securely share files between individuals. In terms of functionality, what I deem important is easy file sharing, good UX, and secure storage. I want to integrate this functionality into my web application. I am working in the Ruby on Rails framework and have played around with carrierwave and Amazon S3 integration, but I can't help wondering whether there is a complete solution out there already.
My question is thus: are there open source file sharing solutions or paid products that I can plug in to my web application and should be investigating, rather than building the whole file sharing component from scratch? I do not mind paying a fee for this software.
You could try https://github.com/mischa78/boxroom
Boxroom is a Rails application that aims to be a simple interface for
managing and sharing files in a web browser. It lets users create
folders and upload, download and share files. Admins can manage users,
groups and permissions.
Caplinked is a virtual data room provider that provides an API which will securely store and share your files / documents between individuals and groups. They also have a Ruby on Rails SDK which seems pretty easy to use. Check out their developer portal.

How does Dropbox upload data to its servers?

Just recently I was thinking and wondered: how does Dropbox upload my files to its S3 storage, and how might that storage be organized?
Let's just completely forget about the sync aspect for a second and scale the problem down to one S3 bucket.
Say, in that bucket's root directory you have lots of folders, each belonging to an arbitrary user.
Now if that user wants to upload a file to his folder... how does that happen internally? I mean, Dropbox can't just store the Amazon S3 access credentials/keys hard-coded into the application (be it on iOS or Windows), as it might get reverse-engineered and the keys thus exposed.
Any thoughts on this?
Thanks!
Some guys from EADS reverse-engineered Dropbox; the presentation slides are available for download: A CRITICAL ANALYSIS OF DROPBOX SOFTWARE SECURITY
In the same way websites don't allow users to directly access their databases but rather provide interfaces that can control permissions and handle authentication, I'm sure Dropbox has some kind of application that the client on your computer interacts with. Their server daemon will have permissions to write to the disk, but your computer has to go through it (and its security procedures) before anything your computer sends is written.
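As an illustration of one check such a server-side application might perform (not a description of how Dropbox actually works), the sketch below maps an authenticated user and a client-supplied filename to a path inside that user's folder and rejects anything that would escape it. The destination_for name and the folder layout are hypothetical, and user_id is assumed to come from the server's own authentication rather than from the client.

```ruby
# Hypothetical server-side check: map (user, filename) to a path under that
# user's folder and refuse anything that would escape it.
# `user_id` is assumed to come from the authenticated session, not the client.
def destination_for(storage_root, user_id, filename)
  user_dir = File.expand_path(user_id.to_s, storage_root)
  # expand_path resolves "..", so traversal attempts show up in the result.
  target = File.expand_path(filename, user_dir)
  unless target.start_with?(user_dir + File::SEPARATOR)
    raise ArgumentError, "path escapes the user's folder"
  end
  target
end
```

The storage credentials stay on the server; the client only ever supplies a filename and its own login.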

Rails security on production server

I am putting my first Rails app on the internet. I have read the Rails guide on security and have implemented the points listed there, but I was interested to hear of anything else.
Also, I currently store my uploads in public/documents. Is this OK? I noticed there are no .htaccess files protecting the directory.
Storing your uploads in a predictable location is a bad idea if you want to keep them a secret. If you don't care about people accessing it then it doesn't matter. Using .htaccess to password protect the directory is a good solution.
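A small sketch of one way to make stored names unpredictable, assuming the original filename is kept elsewhere (for example in the database). The storage_name helper is hypothetical. Random names only make guessing harder; they are not a substitute for access control.

```ruby
require "securerandom"

# Hypothetical naming helper: keeps the original extension (lowercased) but
# replaces the rest of the name with a random UUID, so stored paths cannot
# be guessed from the original filename.
def storage_name(original_filename)
  ext = File.extname(original_filename).downcase
  ext = "" unless ext.match?(/\A\.[a-z0-9]{1,10}\z/) # drop unusual extensions
  "#{SecureRandom.uuid}#{ext}"
end
```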
You should test your application for vulnerabilities using Acunetix ($$) or Wapiti (open source).
You should also read: What should a developer know before building a public web site?
If your site allows anyone to upload, it is a bad idea to store your uploads in a place that non-logged-in users can get to them. This is because then your site can be used by unscrupulous people as a place to store things that you might not want stored, such as malware.

Securing S3 via your own application

Imagine the following use case:
You have a basecamp style application hosting files with S3. Accounts all have their own files, but stored on S3.
How, therefore, would a developer go about securing files so users of account 1, couldn't somehow get to files of account 2?
We're talking Rails if that's a help.
S3 supports signed, time-expiring URLs: you can give a user a URL that lets only people who have that link view the file, and only within a certain time period from issue.
http://www.miracletutorials.com/s3-amazon-expiring-urls/
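To show the general idea behind expiring signed links, here is a minimal sketch using an HMAC over the path and the expiry time. This is not S3's signing scheme (S3 defines its own, and the AWS SDK provides a presigner for it); the secret, function names, and URL layout are all hypothetical.

```ruby
require "openssl"
require "uri"

# Hypothetical secret known only to the server that issues and checks links.
SIGNING_SECRET = "replace-with-a-long-random-secret"

def signature_for(path, expires_at)
  digest = OpenSSL::Digest.new("SHA256")
  OpenSSL::HMAC.hexdigest(digest, SIGNING_SECRET, "#{path}\n#{expires_at}")
end

# Builds a link that stays valid for `ttl` seconds after `now`.
def signed_url(base, path, ttl: 300, now: Time.now)
  expires_at = now.to_i + ttl
  query = URI.encode_www_form(expires: expires_at,
                              signature: signature_for(path, expires_at))
  "#{base}#{path}?#{query}"
end

# Accepts the link only if it has not expired and the signature matches.
def valid_link?(path, expires, signature, now: Time.now)
  return false if expires.to_i < now.to_i
  expected = signature_for(path, expires.to_i)
  OpenSSL.secure_compare(expected, signature.to_s)
end
```

Because the path is part of the signed data, a link issued for one account's file cannot be edited to point at another account's file without invalidating the signature.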
If you want to restrict access to those remote resources, you could proxy the files through your app. For something like S3 this may defeat the purpose of what you are trying to do, but it would still allow you to keep the data with Amazon and restrict access.
You should be careful with an approach like this, as it could cause your Ruby thread to block while it is proxying the file, which could become a real problem for the application.
Serve the files using an EC2 Instance
If you set your S3 bucket to private, then start up an EC2 instance, you could serve your files on S3 via EC2, using the EC2 instance to verify permissions based on your application's rules. Because there is no charge for EC2 to transfer to/from S3 (within the same region), you don't have to double up your bandwidth consumption costs at Amazon.
I haven't tackled this exact issue. But that doesn't stop me from having an opinion :)
Check out cancan:
http://github.com/ryanb/cancan
http://railscasts.com/episodes/192-authorization-with-cancan
It allows custom authorization schemes, without too much hassle.
