The recommended way to get a publicly readable reference to a Google Cloud Storage file seems to be to use Signed URLs.
I need to retrieve a storage reference based on the URL, so that when my database record is deleted I can delete its files from Storage as well.
The signed URL for a file stored in path/file.jpeg seems to follow the pattern:
https://storage.googleapis.com/bucket.name/path%2Ffile.jpeg?foobar
So I am currently using a regex to take the text between bucket.name and the ? character, then replacing %2F with / (a rough sketch of this is below the questions). I would like to know:
Is this reliable?
Is there any API in official libraries that does this for me? Could not find any.
Is there any better approach? Like storing the storage path in the database record, along with the signed url (seems overkill to me).
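In Ruby, that extraction looks roughly like this (the bucket name is hard-coded here purely for illustration):

require "uri"
require "cgi"

signed_url = "https://storage.googleapis.com/bucket.name/path%2Ffile.jpeg?foobar"

# URI#path drops the query string; strip the leading "/bucket.name/" and
# unescape %2F back into /
path = CGI.unescape(URI.parse(signed_url).path.sub(%r{\A/bucket\.name/}, ""))
# => "path/file.jpeg"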
The recommended way to get a publicly readable reference to a Cloud Storage object is simply to allow public access to it; doing this gives you a URL of the form storage.googleapis.com/[your-bucket]/[path-to-file]/[file].
-Is this reliable?
Signed URLs are meant to grant access (read, write, or delete) only for a limited time, so using a signed URL for your current needs may not be the best approach: you are running a regex just to recover the URL path and throwing away everything after the "?", even though producing that signature requires extra computation.
-Is there any API in official libraries that does this for me? Could not find any.
If you are referring to extracting the object path from the signed URL, then the answer is no.
-Is there a better approach?
Using the public access permission could be another option. If you are using the signed URL to also get delete permission, but not really using the time-limited functionality, then the better approach is to make the object publicly accessible, create a service account with enough permissions (delete Cloud Storage objects), and use the Cloud Storage client library to delete the object from the bucket when the DB record is deleted.
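For illustration, a minimal sketch of that flow with the google-cloud-storage Ruby gem (the project, bucket, credentials file, and object path are placeholders):

require "google/cloud/storage"

storage = Google::Cloud::Storage.new(
  project_id:  "my-project",
  credentials: "service-account.json"   # service account allowed to delete objects
)

bucket = storage.bucket "my-bucket"
file   = bucket.file "path/file.jpeg"   # path derived from the public URL or stored with the record
file.delete if file                     # remove the object when the DB record is deleted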
So I am going through the security rules documentation of Firestore right now in an effort to make sure the data users put in my app will be okay. As of right now, all I need users to be able to do is read data (really only 'get', but 'read' is fine too) and create data. So, my security rules for the Firestore data right now are:
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /jumpSpotAnnotations/{id} {
      // 'get' instead of 'read' would work too
      allow read, create;
    }
  }
}
I have the exact same 'allow read, create;' for my storage data too. Will this be okay upon release or is this dangerous? In the documentation, they write:
"As you set up Cloud Firestore, you might have set your rules to allow open access during development. You might think you're the only person using your app, but if you've deployed it, it's available on the internet. If you're not authenticating users and configuring security rules, then anyone who guesses your project ID can steal, modify, or delete the data."
This text precedes an example where the rules are 'allow read, write;', as opposed to my 'allow read, create'. Are my rules also subject to deletion/modification of the data? I put create because I assume that it only lets people create data, not delete or modify it.
Final part of this question: how could a user guess my project ID? Would they not have to sign in to my Google account to then be able to manually delete, modify, or steal data? I'm not sure how that works. My app interface only allows the user to create data or read data, nothing else. So could some random person still somehow get into this database online and mess with it?
Thanks for any help.
Your rule allows anyone with an internet connection to read and create documents in the jumpSpotAnnotations collection. We don't know if that's "safe" for your app. You have to determine for yourself if that situation is safe. If you're OK with someone anonymously loading up that collection with documents, and you're OK with paying for that behavior, then it's safe.
Your project ID is baked into your app before you publish it. All someone has to do is download and decompile your app to find it. It's not hard. Your project ID is not private information.
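On the create-versus-write point specifically: in security rules, write is shorthand for create, update, and delete, and anything not explicitly allowed is denied. So a rule set like yours (sketched here with the same collection name) lets anyone read and add documents, but not modify or delete existing ones:

rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /jumpSpotAnnotations/{id} {
      // "write" would grant create + update + delete;
      // granting only "create" leaves update and delete denied by default.
      allow read, create;
    }
  }
}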
No, your rules are not secure. To understand how someone can guess your project ID and steal data, you first have to understand that Firebase provides a simple REST API for accessing stored data. All of the data is stored in JSON format, so a public database can be read just by requesting the database URL with ".json" appended.
As for the main concern of how someone can guess your project ID: there are many tools that let you set up a proxy on a network and analyze every request going through it. Since Firebase simply uses a REST API, the API endpoints (which contain the project ID) can easily be discovered by intercepting HTTP requests, and if your rules are not secure your data can then be compromised.
Now, the solution: how to protect your data. There are many ways, and Firebase's own documentation on database security covers plenty of them. But there is also something you can do on your side so that even if your data is exposed, nobody can actually read it.
You can keep the data from being readable in plaintext: encrypt it with a public-key algorithm and keep the private key only on the systems that have to read the data, so client apps (and anyone scraping the database) only ever see ciphertext. Note that this does not prevent manipulation or deletion of the data.
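For illustration, a minimal Ruby sketch of that idea (the key file names and the payload are placeholders; raw RSA can only encrypt small values, so larger payloads are usually wrapped with a symmetric key instead):

require "openssl"
require "base64"

# App side: encrypt with the public key before writing to the database.
public_key = OpenSSL::PKey::RSA.new(File.read("data_public.pem"))
ciphertext = Base64.strict_encode64(public_key.public_encrypt("sensitive value"))
# store `ciphertext` in the database instead of the plaintext

# Trusted backend only: decrypt with the private key.
private_key = OpenSSL::PKey::RSA.new(File.read("data_private.pem"))
plaintext   = private_key.private_decrypt(Base64.strict_decode64(ciphertext))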
I am working on a project where the user joins a "stream". During stream setup, the person who is creating the stream (the stream creator) can choose to either:
Upload all photos added to the stream by members to our hosting solution (S3)
Upload all photos added to the stream by members to the stream creator's own Dropbox authenticated folder
In the future I would like to add more storage providers (such as Drive, Onesky etc)
There are a couple of different questions I have in regard to how to solve this.
What should the database structure for photos be? I currently only have photo_url, but that won't be easy to manage from a data perspective with pre-signed URLs and when there are different ways a photo can be uploaded (S3, Dropbox, etc.).
How should the access tokens for each storage provider be stored? Remember that only the stream creator's access_token will be stored and everyone who is on the stream will share that token when uploading photos
I will add iOS and web clients in the future that will do a direct upload to the storage provider and bypass the server to avoid a heavy load on the server
As far as database storage goes, your application should dictate the structure based on the interface that you present both to the user and to the stream.
If you have users upload a photo and they don't get to choose the URI, and you don't have any hierarchy within a stream, then I'd recommend storing just an ID and a stream_id in your main photo table.
So at a minimum you might have something looking like
create table photos(id integer primary key, stream_id integer references streams(id) not null);
But you probably also want description and other information that is independent of storage.
The streams table would have all the generic information about a stream, but would have a polymorphic association to a class dependent on the type of stream. So you could use that association to get an instance of S3Stream or DropBoxStream based on what actual stream was used.
That instance (also an ActiveRecord resource) could store the access key, and for things like dropbox, the path to the folder etc. In addition, that instance could provide methods to construct a URI given your Photo object.
If a particular technology needs to cache signed URIs, then say the S3Stream object could reference a S3SignedUrl model where the URIs are signed.
If it turns out that the signed URL code is similar between DropBox and S3, then perhaps you have a single SignedUrl model.
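For illustration, a rough ActiveRecord sketch of that shape (class, column, and bucket names are assumptions, and the presigning call stands in for whatever each backend actually needs):

require "aws-sdk-s3"

class Stream < ActiveRecord::Base
  belongs_to :backend, polymorphic: true   # S3Stream, DropboxStream, ...
  has_many :photos
end

class Photo < ActiveRecord::Base
  belongs_to :stream

  def url
    stream.backend.url_for(self)           # delegate URI construction to the backend
  end
end

class S3Stream < ActiveRecord::Base
  has_one :stream, as: :backend

  # Could instead return a cached S3SignedUrl record, as described above.
  def url_for(photo)
    Aws::S3::Presigner.new.presigned_url(
      :get_object,
      bucket: bucket_name,                 # assumed columns on s3_streams
      key: "#{key_prefix}/#{photo.id}",
      expires_in: 300
    )
  end
end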
When you design the iOS and Android clients, it is critical that they are not given access to the stream owner's access tokens. Instead, you'll need to do all the signing inside your server app. You wouldn't want a compromise of a device to lead to exposing the access token, creating billing problems as well as privacy exposures.
Hope this helps.
We set up a lot of Rails applications with different kinds of file storage behind them.
1. Yes, just a URL is not manageable in the future. To save a lot of time you could use gems like CarrierWave or Paperclip; they handle all the thumbnail generation and file validation. One approach is to upload the file from the client directly to S3 or Dropbox into a tmp folder and just tell your Rails app "Hey, here is the URL of a newly uploaded file", and Paperclip or CarrierWave will take care of the thumbnail generation and storage (there is an example of this for Paperclip).
2. I don't know exactly how your stream works, so I cannot give a good answer to this.
3. With the setup I mentioned in 1., you should upload from your different clients directly to S3, Dropbox, etc., and after uploading, the client tells the Rails backend that it should import the file from that URL. (And before Paperclip or CarrierWave finish their processing, you could use the file's tmp URL to display something directly in your stream.)
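A small Ruby sketch of that "import from a URL" step with CarrierWave (the model, uploader class, and param names are assumptions):

class Photo < ActiveRecord::Base
  belongs_to :stream
  mount_uploader :image, PhotoUploader     # PhotoUploader is an assumed CarrierWave uploader;
end                                        # mounting it adds remote_image_url=

# In the controller action the client calls after its direct upload:
photo = Photo.new(stream_id: params[:stream_id])
photo.remote_image_url = params[:tmp_url]  # CarrierWave fetches the file, validates it,
photo.save!                                # and generates thumbnail versions on save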
I'm building a service where users can upload images. I want to prevent users from uploading files bigger than 5 MB.
Right now I create a signed URL for Cloud Storage so users can make PUT and GET requests. Is there any way I can limit the size of a user's upload?
I don't want users to start uploading extremely large files by mistake or with malicious intent.
There are three ways to control access to Google Cloud Storage buckets and objects:
Access Control Lists (ACLs), which provide a way to specify read or write access for specified Google accounts and groups.
Signed URLs (Query String Authentication), which provide a way to give time-limited read or write access to anyone in possession of the URL, regardless of whether they have a Google account or not.
Signed Policy Documents, which provide a way to specify what can be uploaded to a bucket. Policy documents allow greater control over size, content type, and other upload characteristics than signed URLs, and can be used by website owners to allow visitors to upload files to Google Cloud Storage.
In this case, you'll need to use the Signed Policy Documents access control method. Take a look at this POST Object article and this case for examples.
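For illustration, a minimal policy document built in Ruby (the bucket name and key prefix are placeholders); the encoded policy is then signed and sent along with the POST form fields:

require "json"
require "base64"

policy = {
  "expiration" => (Time.now.utc + 600).strftime("%Y-%m-%dT%H:%M:%SZ"),
  "conditions" => [
    { "bucket" => "example-bucket" },
    ["starts-with", "$key", "uploads/"],
    ["content-length-range", 0, 5 * 1024 * 1024]   # reject uploads over 5 MB
  ]
}

encoded_policy = Base64.strict_encode64(policy.to_json)
# `encoded_policy` is what gets signed with the service account key and
# included in the upload form.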
You can limit the size using the content-length-range condition of a policy document, which takes the form ["content-length-range", <min_bytes>, <max_bytes>]. Search for that condition in the documentation to see how to configure the allowed file size.
I would like to protect my S3 documents behind my Rails app such that if I go to:
www.myapp.com/attachment/5 that should authenticate the user prior to displaying/downloading the document.
I have read similar questions on stackoverflow but I'm not sure I've seen any good conclusions.
From what I have read there are several things you can do to "protect" your S3 documents.
1) Obfuscate the URL. I have done this. I think this is a good thing to do so no one can guess the URL. For example, it would be easy to "walk" the URLs if your S3 URLs are obvious: https://s3.amazonaws.com/myapp.com/attachments/1/document.doc. Having a URL such as:
https://s3.amazonaws.com/myapp.com/7ca/6ab/c9d/db2/727/f14/document.doc seems much better.
This is great to do but doesn't resolve the issue of passing around URLs via email or websites.
2) Use an expiring URL as shown here: Rails 3, paperclip + S3 - Howto Store for an Instance and Protect Access
For me, however, this is not a great solution because the URL is exposed (even for just a short period of time) and another user could perhaps reuse it within that window. You have to adjust the time to allow for the download without providing too much time for copying. It just seems like the wrong solution.
3) Proxy the document download via the app. At first I tried to just use send_file: http://www.therailsway.com/2009/2/22/file-downloads-done-right but the problem is that these files can only be static/local files on your server and not served via another site (S3/AWS). I can however use send_data and load the document into my app and immediately serve the document to the user. The problem with this solution is obvious - twice the bandwidth and twice the time (to load the document to my app and then back to the user).
I'm looking for a solution that provides the full security of #3 but does not require the additional bandwidth and time for loading. It looks like Basecamp is "protecting" documents behind their app (via authentication) and I assume other sites are doing something similar but I don't think they are using my #3 solution.
Suggestions would be greatly appreciated.
UPDATE:
I went with a 4th solution:
4) Use amazon bucket policies to control access to the files based on referrer:
http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?UsingBucketPolicies.html
UPDATE AGAIN:
Well #4 can easily be worked around via a browsers developer's tool. So I'm still in search of a solid solution.
You'd want to do two things:
Make the bucket and all objects inside it private. The naming convention doesn't actually matter; the simpler the better.
Generate signed URLs, and redirect to them from your application. This way, your app can check whether the user is authenticated and authorized, then generate a new signed URL and redirect them to it using a 301 HTTP status code. This means the file never goes through your servers, so there's no load or bandwidth cost on you. Here are the docs for presigning a get_object request:
https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/S3/Presigner.html
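For illustration, a rough sketch of such a controller action (the model, bucket name, and authentication helper are assumptions):

require "aws-sdk-s3"

class AttachmentsController < ApplicationController
  before_action :authenticate_user!        # whatever auth your app already uses

  def show
    attachment = current_user.attachments.find(params[:id])

    url = Aws::S3::Presigner.new.presigned_url(
      :get_object,
      bucket: "myapp-attachments",
      key: attachment.s3_key,
      expires_in: 60                       # short-lived; the browser fetches straight from S3
    )

    # 301 as suggested above (on Rails 7+ you would also pass allow_other_host: true)
    redirect_to url, status: :moved_permanently
  end
end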
I would vote for number 3; it is the only truly secure approach, because once you pass the user to the S3 URL it is valid until its expiration time. A crafty user could exploit that hole; the only question is whether that would actually affect your application.
Perhaps you could set the expiry time lower, which would minimise the risk?
Take a look at an excerpt from this post:
Accessing private objects from a browser
All private objects are accessible via an authenticated GET request to the S3 servers. You can generate an authenticated url for an object like this:
S3Object.url_for('beluga_baby.jpg', 'marcel_molina')
By default authenticated urls expire 5 minutes after they were generated. Expiration options can be specified either with an absolute time since the epoch with the :expires options, or with a number of seconds relative to now with the :expires_in options:
I have been in the process of trying to do something similar for quite some time now. If you don't want to use the bandwidth twice, then the only way this is possible is to let S3 do it. Now, I am totally with you about the exposed URL. Were you able to come up with any alternative?
I found something that might be useful in this regard - http://docs.aws.amazon.com/AmazonS3/latest/dev/AuthUsingTempFederationTokenRuby.html
Once a user logs in, an AWS session can be created with the user's IP as part of the AWS policy, and that session can then be used to generate the signed URLs. So if somebody else grabs the URL, the request will be rejected because it comes from a different IP. Let me know if this makes sense and is secure enough.
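For illustration, a hedged Ruby sketch of that idea (the bucket ARN, user name, and client IP are placeholders; in a Rails app the IP would come from request.remote_ip):

require "aws-sdk-core"   # provides Aws::STS::Client
require "json"

sts       = Aws::STS::Client.new
client_ip = "203.0.113.7"                  # the logged-in user's IP

# Policy for the temporary credentials: reads allowed only from that IP.
policy = {
  "Version"   => "2012-10-17",
  "Statement" => [{
    "Effect"    => "Allow",
    "Action"    => "s3:GetObject",
    "Resource"  => "arn:aws:s3:::myapp-attachments/*",
    "Condition" => { "IpAddress" => { "aws:SourceIp" => client_ip } }
  }]
}

temp = sts.get_federation_token(
  name:             "stream-user",
  policy:           policy.to_json,
  duration_seconds: 900
).credentials

# Sign the GET with `temp` (access_key_id, secret_access_key, session_token);
# a URL replayed from another IP is then rejected by the policy.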
One of my Rails applications is going to depend on a secret key in memory, so all of its functions will only be available once administrator goes to a certain page and uploads the valid key.
The problem is that this key needs to be stored securely, so no other processes on the same machine should be able to access it (so memcached and the filesystem are not suitable). One good idea would be just to store it in some configuration variable in the application, but newly spawned instances won't have access to that variable. Any thoughts on how to implement this on RubyEE/Apache/mod_passenger?
There is really no way to accomplish that goal (this is the same problem all DRM systems have).
You can't keep things secret from the operating system. Your application has to have the key somewhere in memory and the operating system kernel can read any memory location it wants to.
You need to be able to trust the operating system, which means that you can then also trust the operating system to properly enforce file access permissions. This in turn means that you can store the key in a file that only the Rails user process can read.
Think of it this way: even if you had no key at all, what is to stop an attacker on the server from simply changing the application code itself to gain access to the disabled functionality?
I would use the filesystem, with read access only to the file owner (using chmod 400 on the file), and ensure the Ruby process is the only process owned by that user.
You can get more complex than that, but it all boils down to using the unix users and permissions.
Encrypt it heavily in the filesystem?
What about treating it like a regular password, and using a salted hash? Once the user authenticates, he has access to the functions of the website.