I'm giving Amazon Web Services a try for the first time and getting stuck on understanding the credentials process.
From a tutorial on awsblog.com, I gather that I can upload a file to one of my AWS "buckets" as follows:
s3 = Aws::S3::Resource.new
s3.bucket('bucket-name').object('key').upload_file('/source/file/path')
In the above circumstance, I'm assuming the author is using the default credentials (as described here in the documentation), i.e. storing the access key and secret in particular environment variables, or something like that. (If that's not the right idea, feel free to set me straight.)
The thing I'm having a hard time understanding is the meaning of the .object('key'). What is this? I've generated a bucket easily enough, but is it supposed to have a specific key? If so, how do I create it? If not, what is supposed to go into .object()?
I figure this MUST be documented out there somewhere, but I haven't been able to find it (maybe I'm misreading the documentation). Thanks to anyone who gives me some direction here.
Because S3 doesn't have traditional directories, what you would consider the entire 'file path' on your client machine, e.g. \some\directory\test.xls, becomes the 'key'. The object is the data in the file.
Buckets are unique across S3, and the keys must be unique within your bucket.
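For example, here is a minimal sketch using the Ruby SDK (bucket name and paths are made up); the string passed to .object() is the key:

require 'aws-sdk-s3'

s3 = Aws::S3::Resource.new
s3.bucket('my-example-bucket')
  .object('some/directory/test.xls')      # this string is the key
  .upload_file('/local/path/to/test.xls')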
As far as the credentials go, there are multiple ways of providing them. One is to supply the access key ID and secret access key right in your code; another is to store them in a config file somewhere on your machine (the location varies by OS). When your code runs in production, i.e. on an EC2 instance, the best practice is to start the instance with an IAM role assigned; anything that runs on that machine then automatically has all of the permissions of that role. This is the best/safest option for code that runs in EC2.
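For illustration, a sketch of the first two options with the Ruby SDK (region and key values are placeholders); on EC2 with a role attached you would omit credentials entirely and let the SDK pick them up from the instance metadata:

require 'aws-sdk-s3'

# Default chain: the SDK checks ENV['AWS_ACCESS_KEY_ID'] /
# ENV['AWS_SECRET_ACCESS_KEY'], then ~/.aws/credentials, then an
# instance-profile (IAM role) when running on EC2.
s3 = Aws::S3::Resource.new(region: 'us-east-1')

# Or pass the credentials explicitly:
s3 = Aws::S3::Resource.new(
  region:      'us-east-1',
  credentials: Aws::Credentials.new('ACCESS_KEY_ID', 'SECRET_ACCESS_KEY')
)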
We have a custom Docker web app running in an Elastic Beanstalk Docker container environment.
We would like the application logs to be available for viewing outside the environment, without downloading them through the instances or the AWS console.
So far none of the solutions below has been acceptable. Has anyone achieved centralised logging for Elastic Beanstalk Dockerized apps?
Solution 1: AWS Console log download
Not acceptable: it requires downloading and extracting the logs every time, and it is not real-time.
Solution 2: S3 + Elasticsearch + Fluentd
Fluentd does not have a plugin to retrieve logs from S3.
There is an excellent S3 plugin, but it only outputs logs to S3; it does not read logs from S3.
Solution 3: S3 + Elasticsearch + Logstash
Cons: it can only pull all logs from the entire bucket, or nothing.
The problem lies with the Elastic Beanstalk S3 log storage structure. You cannot specify a file name pattern; it's either all logs or nothing.
Elastic Beanstalk saves logs on S3 under a path containing random instance and environment ids:
s3.bucket/resources/environments/logs/publish/e-<random environment id>/i-<random instance id>/my.log
The Logstash s3 plugin can only be pointed at resources/environments/logs/publish/. When you try to point it at environments/logs/publish/*/my.log it does not work.
This means you cannot pull a particular log file and tag/type it so it can be found in Elasticsearch. Since AWS saves logs from all your environments and instances in the same folder structure, you cannot even choose the instance.
Solution 4: AWS CloudWatch Console log viewer
It is possible to forward your custom logs to the CloudWatch console. To achieve that, put configuration files in the .ebextensions path of your app bundle:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/AWSHowTo.cloudwatchlogs.html
There's a file called cwl-webrequest-metrics.config which allows you to specify log files along with alerts, etc.
Great!? Except that the configuration file format is neither YAML, XML, nor JSON, and it is not documented. There are absolutely zero mentions of that file or its format, either on the AWS documentation website or anywhere else on the net.
And getting a log file to appear in CloudWatch is not as simple as adding a configuration line.
The only possible way to get this working seems to be trial and error. Great!? Except that for every attempt you need to re-deploy your environment.
There is only one reference to how to make this work with a custom log: http://qiita.com/kozayupapa/items/2bb7a6b1f17f4e799a22 I have no idea how that person reverse engineered the file format.
Cons:
CloudWatch does not seem to be able to split logs into columns when displaying them, so you can't easily filter by priority, etc.
The AWS Console log viewer does not have auto-refresh to follow logs.
Nightmare of an undocumented configuration file format with no way of testing; trial and error requires re-deploying the whole environment.
Perhaps an AWS Lambda function is applicable?
Write some javascript that dumps all notifications, then see what you can do with those.
After an object is written, you could rename it within the same bucket?
Or notify your own log-management service about the creation of a new object?
Lots of possibilities there...
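As a rough sketch of that notification idea (using the Lambda Ruby runtime rather than JavaScript, and purely illustrative), a handler subscribed to the bucket's "object created" events could look like this, with your own log forwarding hooked in where the comment is:

require 'json'

def handler(event:, context:)
  (event['Records'] || []).each do |record|
    bucket = record.dig('s3', 'bucket', 'name')
    key    = record.dig('s3', 'object', 'key')
    puts "New log object: s3://#{bucket}/#{key}"
    # ...fetch the object and ship it to your log-management service here...
  end
  { status: 'ok' }
end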
I've started using Sumologic for the moment. There's a free trial and then a free tier (500 MB/day, 7-day retention). I'm not out of the trial period yet and my EB app does literally nothing (it's just a few HTML pages served by Nginx in a Docker container). It looks like it could get expensive once you hit any serious volume of logs, though.
It works OK so far. You need to create an IAM user that has access to the S3 bucket you want to read from, and then it sucks the logs over to Sumologic's servers and does all the processing and searching over there. It's a bit fiddly to set up, but I don't really see how it could be simpler, and it's reasonably well documented.
It lets you provide different path expressions with wildcards, then assign a "sourceCategory" to those different paths. You then use those sourceCategories to filter your log searching to a specific type of logging.
My plan long-term is to use something like your solution 3, but this got me going in very short order so I can move on to other things.
You can use a multicontainer environment, sharing the log folder with another Docker container running the tool of your preference to centralize the logs. In our case we connected Apache Flume to move the files to HDFS. Hope this helps.
The easiest method I found to do this was using Papertrail via rsyslog and .ebextensions; however, it is very expensive if you log everything.
The good part is that with rsyslog you can essentially send your logs anywhere, so you are not tied to Papertrail.
example ebextension
I've found loggly to be the most convenient.
It is a hosted service, which might not be what you want. However, if you check out their setup page you can see a number of ways your situation is supported (Docker-specific solutions, as well as around 10 Amazon-specific options). Even if Loggly isn't to your taste, you can look at those solutions and easily see how some of them could be applied to almost any centralized logging solution you might use or write.
I use Heroku to deploy a Rails app. I store sensitive data such as API keys and passwords in Heroku's environment variables, and then use the data in rake tasks that utilize various APIs.
I am just wondering how secure Heroku's environment variables are. Is there a way to hash these variables while retaining the ability to use them in the background somehow?
I came across a previous thread here: Is it secure to store passwords as environment variables (rather than as plain text) in config files?.
But it doesn't quite cover cases where I still need the unhashed password to perform important background tasks.
Several things (mostly my opinion):
--
1. API Key != Password
When you talk about API keys, you're talking about a public token which is generally already quite secure. The nature of APIs nowadays is that they require some sort of prior authentication (either at the app or user level) to provide a more robust level of security.
I would first check what type of data you're storing in the ENV variables. If it's pure passwords (for email etc.), perhaps consider migrating your setup to one of the cloud providers (SendGrid / Mandrill etc.), allowing you to use only API keys.
The beauty of API keys is that they can be changed without affecting the base account, and they limit interactivity to the constraints of the API. Passwords affect the base account.
--
2. ENV Vars are OS-level
They are part of the operating environment in which a process runs. For example, a running process can query the value of the TEMP environment variable to discover a suitable location to store temporary files, or the HOME or USERPROFILE variable to find the directory structure owned by the user running the process.
You must remember that environment variables mean you store the data in the environment you're operating in. This generally means the OS, but it can be a virtual instance of an OS too, if required.
The bottom line is that your ENV vars live on the server itself. In the same way that text files would sit in a directory on the hard drive, environment variables reside in the OS.
Unless the server itself were hacked, it would be very difficult to get at the ENV variable data programmatically, at least in my experience.
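For what it's worth, using the value in a rake task stays this simple; a minimal sketch, where MY_SERVICE_API_KEY is a made-up variable name set with heroku config:set:

namespace :sync do
  desc 'Call an external API using a key from the environment'
  task push: :environment do
    api_key = ENV.fetch('MY_SERVICE_API_KEY') # raises if the variable is missing
    # ...use api_key to authenticate the API call...
  end
end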
What are you looking for? Security against whom or what?
Every piece of information stored in a config file or in the ENV is readable by everyone who has access to the server. Even more important, every gem can read the information and send it somewhere.
You cannot simply encrypt the information, because then you need to store the decryption key somewhere, which is the same problem.
IMO both environment variables and config files are secure as long as you can trust everyone who has access to your servers and you have carefully reviewed the source code of all the libraries and gems bundled with your app.
I am trying to hide 2 secrets that I am using in one of my apps.
As I understand it, the keychain is a good place, but I cannot add them before I submit the app.
I thought about this scenario:
Pre-seed the secrets in my app's Core Data database, spreading them across other entities to obscure them (I already have a seed DB in that app).
When the app launches for the first time, generate and move the keys to the keychain.
Delete the records from Core Data.
Is that safe, or can a hacker see this happening and get those keys?
THIRD EDIT:
Sorry for not explaining this scenario from the beginning. The app has many levels, and each level contains files (audio, video, images). The user can purchase a level (IAP), and after the purchase is completed I need to download the files to their device.
For iOS 6 the files are stored with Apple's new "Hosted Content" feature. For iOS 5 the files are stored in Amazon S3.
So in all this process I have 2 keys:
1. IAP key, for verifying the purchase at Apple IAP.
2. S3 keys, for getting the files from S3 for iOS 5 users:
NSString *secretAccessKey = @"xxxxxxxxx";
NSString *accessKey = @"xxxxxxxxx";
Do I need to protect those keys at all? I am afraid that people will be able to get the files from S3 without purchasing the levels, or that hackers will be able to build a hacked version with all the levels pre-downloaded inside.
Let me try to break down your question into multiple subquestions/assumptions:
Assumptions:
a) The keychain is a safe place
Actually, it's not that safe. If your application is installed on a jailbroken device, a hacker will be able to get your keys from the keychain.
Questions:
a) Is there a way to put a key into an app (a binary delivered from the App Store) and be completely secure?
Short answer: NO. As soon as there is something in your binary, it can be reverse engineered.
b) Will obfuscation help?
Yes. It will increase the time it takes a hacker to figure it out. If the keys you ship in the app "cost" less than the time spent reverse engineering them then, generally speaking, you are good.
However, in most cases security through obscurity is bad practice. It gives you a feeling that you are secure, but you aren't.
So, this could be one of security measures, but you need to have other security measures in place too.
c) What should I do in such a case?
It's hard to give you a good solution without knowing the background of what you are trying to do.
For example, why should everybody have access to the same Amazon S3? Do they need read-only or write access (as pointed out by Kendall Helmstetter Gein)?
I believe one of the most secure scenarios would be something like that:
Your application should be passcode protected
The first time the user enters your application, it asks them to authenticate (enter their username and password) against the server
This authenticates against your server or another authentication provider (e.g. Google)
The server sends an authentication token to the device (quite often it's some type of cookie).
You encrypt this token with a hash of your application passcode and save it in the keychain in this form
And now you can do one of two things:
hand over specific keys from the server to the client (so each client has its own keys) and encrypt them with the hash of your application passcode
handle all operations with S3 on the server (and require the client to send its requests through your server)
This way you protect against multiple possible attacks.
d) Whoa... I don't plan to implement all of the stuff you just wrote, because it will take me months. Is there anything simpler?
I think it would be useful if you had one set of keys per client.
If even that is too much, then download encrypted keys from the server, save them in encrypted form on the device, and have the decryption key hardcoded into your app. I would say it's minimally invasive, and at least your binary doesn't have the keys in it.
P.S. Both Kendall and Rob are right.
Update 1 (based on new info)
First of all, have you seen the In-App Purchase Programming Guide?
There is a very good drawing under "Server Product Model". This model protects against somebody who didn't buy new levels: there will be no Amazon keys embedded in your application, and your server side will hand over the levels when it receives a receipt of purchase.
There is no perfect solution to protect against somebody who purchased the content (and decided to rip it off from your application), because at the end of the day your application will have the content downloaded to the device and will need it in plain (unencrypted) form at some point in time.
If you are really concerned about this case, I would recommend encrypting all your assets and handing them over in encrypted form from the server, together with an encryption key. The encryption key should be generated per client, and the assets should be encrypted using it.
This won't stop an advanced hacker, but at least it will protect against somebody using iExplorer and just copying files (since they will be encrypted).
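On the server side, the per-client encryption could be as simple as this Ruby sketch (file names are illustrative, and AES-256-CBC is just one reasonable choice):

require 'openssl'
require 'securerandom'

key    = SecureRandom.random_bytes(32)             # per-client AES-256 key
cipher = OpenSSL::Cipher.new('aes-256-cbc').encrypt
cipher.key = key
iv = cipher.random_iv

ciphertext = cipher.update(File.binread('level1/audio.mp3')) + cipher.final
File.binwrite('level1/audio.mp3.enc', iv + ciphertext)
# Ship the .enc file to the device; deliver `key` over your authenticated
# channel and decrypt on the device when the level is opened.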
Update 2
One more thing regarding update 1: you shouldn't store the files unencrypted on the device; keep them encrypted and store the encryption key somewhere (e.g. in the keychain).
If your game requires an internet connection, the best idea is to not store the encryption key on the device at all. You can fetch it from the server each time your app is started.
DO NOT store an S3 key used for writing in your app! In short order someone sniffing traffic will see the write call to S3, and in shorter order they will find that key and do whatever they like with it.
The ONLY way an application can write content to S3 with any degree of security is by going through a server you control.
If it's a key used for read-only access, meaning your S3 content cannot be read publicly but the key grants read-only access with no ability to write, then you could embed it in the application, but anyone who wants to can pull it out.
To lightly obscure pre-loaded sensitive data you could encrypt it in a file; the app can read it into memory and decrypt it before storing it in the keychain. Again, someone will be able to get to these keys, so it had better not matter much if they do.
Edit:
Based on the new information, you are probably better off just embedding the secrets in code. Using a tool like iExplorer a casual user can easily get to a Core Data database or anything else in your application bundle, but object files are somewhat encrypted. If they have a jailbroken device they can easily get the unencrypted versions, but it can still be hard to find meaningful strings; perhaps store them in two parts and re-assemble them in code.
Again it will not stop a determined hacker but it's enough to keep most people out.
You might also want to add some code that asks your server whether there are any override secrets it can download. That way, if the secrets are leaked you could quickly react by changing the secrets used by your app, while shutting out anyone using a copied secret. To start with there would be no override to download. You don't want to have to wait for an application update to be able to use new keys.
There is no good way to hide a secret in a piece of code you send to your attacker. As with most things of this type, you need to focus more on how to mitigate the problem when the key does leak rather than spending unbounded time trying to protect it. For instance, generating different keys for each user allows you to disable a key if it is being used abusively. Or working through an intermediary server allows you to control the protocol (i.e. the server has the key and is only willing to do certain things with it).
It is not a waste of time to do a little obfuscating. That's fine. But don't spend a lot of time on it. If it's in the program and it's highly valuable, then it will be hacked out. Focus on how to detect when that happens, and how to recover when it does. And as much as possible, move that kind of sensitive data into some other server that you control.
I would like to protect my S3 documents behind my Rails app, such that if I go to:
www.myapp.com/attachment/5, the app should authenticate the user prior to displaying/downloading the document.
I have read similar questions on Stack Overflow but I'm not sure I've seen any good conclusions.
From what I have read there are several things you can do to "protect" your S3 documents.
1) Obfuscate the URL. I have done this. I think this is a good thing to do so no one can guess the URL. For example, it would be easy to "walk" the URLs if your S3 URLs are obvious: https://s3.amazonaws.com/myapp.com/attachments/1/document.doc. Having a URL such as:
https://s3.amazonaws.com/myapp.com/7ca/6ab/c9d/db2/727/f14/document.doc seems much better.
This is good to do, but it doesn't resolve the issue of URLs being passed around via email or websites.
2) Use an expiring URL as shown here: Rails 3, paperclip + S3 - Howto Store for an Instance and Protect Access
For me, however, this is not a great solution because the URL is exposed (even if just for a short period of time) and another user could perhaps reuse it in that window. You have to adjust the expiry time to allow for the download without allowing too much time for copying. It just seems like the wrong solution.
3) Proxy the document download via the app. At first I tried to just use send_file: http://www.therailsway.com/2009/2/22/file-downloads-done-right, but the problem is that it only works with static/local files on your server, not files served from another site (S3/AWS). I can, however, use send_data to load the document into my app and immediately serve it to the user. The problem with this solution is obvious: twice the bandwidth and twice the time (to load the document into my app and then send it back out to the user).
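For reference, a sketch of that proxying approach with the current aws-sdk-s3 gem (model, bucket name, and key column are placeholders):

require 'aws-sdk-s3'

class AttachmentsController < ApplicationController
  def show
    attachment = current_user.attachments.find(params[:id]) # auth/authz check
    object = Aws::S3::Resource.new
                              .bucket('my-private-bucket')
                              .object(attachment.s3_key)
    send_data object.get.body.read,
              filename: File.basename(attachment.s3_key),
              disposition: 'attachment'
  end
end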
I'm looking for a solution that provides the full security of #3 but does not require the additional bandwidth and time for loading. It looks like Basecamp is "protecting" documents behind their app (via authentication) and I assume other sites are doing something similar but I don't think they are using my #3 solution.
Suggestions would be greatly appreciated.
UPDATE:
I went with a 4th solution:
4) Use amazon bucket policies to control access to the files based on referrer:
http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?UsingBucketPolicies.html
UPDATE AGAIN:
Well, #4 can easily be worked around via a browser's developer tools, so I'm still in search of a solid solution.
You'd want to do two things:
Make the bucket and all objects inside it private. The naming convention doesn't actually matter, the simpler the better.
Generate signed URLs and redirect to them from your application. This way, your app can check whether the user is authenticated and authorized, and then generate a new signed URL and redirect to it with a 301 HTTP status code. This means that the file never goes through your servers, so there is no extra load or bandwidth on your side. Here are the docs for presigning a GET object request:
https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/S3/Presigner.html
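A minimal controller sketch of that approach (bucket name, model, and authorization check are placeholders; on Rails 7+ an external redirect also needs allow_other_host: true):

require 'aws-sdk-s3'

class AttachmentsController < ApplicationController
  before_action :authenticate_user!

  def show
    attachment = current_user.attachments.find(params[:id])
    url = Aws::S3::Presigner.new.presigned_url(
      :get_object,
      bucket:     'my-private-bucket',
      key:        attachment.s3_key,
      expires_in: 60                  # keep the window short
    )
    redirect_to url, status: :moved_permanently
  end
end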
I would vote for number 3; it is the only truly secure approach, because once you hand the user an S3 URL it remains valid until its expiration time. A crafty user could exploit that hole; the only question is whether that would affect your application.
Perhaps you could set the expiry time lower, which would minimise the risk?
Take a look at an excerpt from this post:
Accessing private objects from a browser
All private objects are accessible via an authenticated GET request to the S3 servers. You can generate an authenticated URL for an object like this:
S3Object.url_for('beluga_baby.jpg', 'marcel_molina')
By default, authenticated URLs expire 5 minutes after they were generated. Expiration options can be specified either as an absolute time since the epoch with the :expires option, or as a number of seconds relative to now with the :expires_in option.
I have been trying to do something similar for quite some time now. If you don't want to use the bandwidth twice, then the only way this is possible is to let S3 do it. Now, I am totally with you about the exposed URL. Were you able to come up with any alternative?
I found something that might be useful in this regard - http://docs.aws.amazon.com/AmazonS3/latest/dev/AuthUsingTempFederationTokenRuby.html
Once a user logs in, an AWS session with their IP as part of the AWS policy should be created, and that session can then be used to generate the signed URLs. So if somebody else grabs the URL, the signature will not match, since the source of the request will be a different IP. Let me know if this makes sense and is secure enough.
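A rough Ruby sketch in the spirit of the linked doc (bucket, key, user, and request_ip are placeholders; the temporary credentials carry a policy restricted to the caller's IP):

require 'aws-sdk-core'
require 'aws-sdk-s3'
require 'json'

policy = {
  'Version'   => '2012-10-17',
  'Statement' => [{
    'Effect'    => 'Allow',
    'Action'    => ['s3:GetObject'],
    'Resource'  => ['arn:aws:s3:::my-private-bucket/*'],
    'Condition' => { 'IpAddress' => { 'aws:SourceIp' => request_ip } }
  }]
}.to_json

resp = Aws::STS::Client.new.get_federation_token(
  name: "user-#{user.id}", policy: policy, duration_seconds: 900
)

temp_creds = Aws::Credentials.new(
  resp.credentials.access_key_id,
  resp.credentials.secret_access_key,
  resp.credentials.session_token
)

url = Aws::S3::Presigner.new(client: Aws::S3::Client.new(credentials: temp_creds))
                        .presigned_url(:get_object,
                                       bucket: 'my-private-bucket',
                                       key: 'attachments/5/document.doc',
                                       expires_in: 300)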
One of my Rails applications is going to depend on a secret key held in memory, so all of its functions will only be available once an administrator goes to a certain page and uploads the valid key.
The problem is that this key needs to be stored securely: no other processes on the same machine should be able to access it (so memcached and the filesystem are not suitable). One good idea would be just to store it in a configuration variable in the application, but newly spawned instances won't have access to that variable. Any thoughts on how to implement this on RubyEE/Apache/mod_passenger?
There is really no way to accomplish that goal. (This is the same problem all DRM systems have.)
You can't keep things secret from the operating system. Your application has to have the key somewhere in memory and the operating system kernel can read any memory location it wants to.
You need to be able to trust the operating system, which means that you can then also trust the operating system to properly enforce file access permissions. This in turn means that you can store the key in a file that only the Rails user process can read.
Think of it this way: even if you had no key at all, what is to stop an attacker on the server from simply changing the application code itself to gain access to the disabled functionality?
I would use the filesystem, with read access only to the file owner, and ensure the Ruby process is the only process owned by that user (using chmod 400 on the file).
You can get more complex than that, but it all boils down to using the unix users and permissions.
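A trivial sketch of that setup from the Ruby side (the path is illustrative; the file is owned by the same user the Rails process runs as):

key_path = '/home/appuser/.app_master_key'
File.chmod(0o400, key_path)          # equivalent of `chmod 400` on the file
secret_key = File.read(key_path).strip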
Encrypt it heavily in the filesystem?
What about treating it like a regular password and using a salted hash? Once the user authenticates, they have access to the functions of the website.
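A minimal sketch of that idea with the bcrypt gem (the passphrase is a placeholder; note this only lets you verify the key, not recover it for use in background tasks):

require 'bcrypt'

stored_digest = BCrypt::Password.create('admin passphrase')  # persist only this
BCrypt::Password.new(stored_digest) == 'admin passphrase'    # => true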