Where to store sensitive data in public rails app? - ruby-on-rails

My personal Rails project uses a few APIs, and I store their keys/secrets in config/environments/production.yml and development.yml as global variables. I now want to push this project to GitHub for others to use, but I don't want them to have those bits of sensitive data. I also don't want this file in .gitignore because it's required for the app to run. I've considered putting them in the DB somewhere, but am hoping to find a better solution.

TLDR: Use environment variables!
I think @Bryce's comment offers an answer, which I'll just flesh out. It seems one approach Heroku recommends is to use environment variables to store sensitive information (API key strings, database passwords). So survey your code and see where you have sensitive data. Then create environment variables (in your .bashrc file, for example) that store the sensitive values. For example, for your database:
export MYAPP_DEV_DB_DATABASE=myapp_dev
export MYAPP_DEV_DB_USER=username
export MYAPP_DEV_DB_PW=secret
Now, on your local box, you just refer to the environment variables whenever you need the sensitive data. For example, in database.yml:
development:
  adapter: mysql2
  encoding: utf8
  reconnect: false
  database: <%= ENV["MYAPP_DEV_DB_DATABASE"] %>
  pool: 5
  username: <%= ENV["MYAPP_DEV_DB_USER"] %>
  password: <%= ENV["MYAPP_DEV_DB_PW"] %>
  socket: /var/run/mysqld/mysqld.sock
I think database.yml gets parsed just at the app's initialization or restart so this shouldn't impact performance. So this would solve it for your local development and for making your repository public. Stripped of sensitive data, you can now use the same repository for the public as you do privately. It also solves the problem if you are on a VPS. Just ssh to it and set up the environment variables on your production host as you did in your development box.
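To make the ERB step concrete, here is a minimal sketch of what happens to such a file outside of Rails: the text is rendered through ERB first, and only the result is parsed as YAML. The fallback values assigned to ENV are placeholders for this demo; on a real machine they would come from the shell.

```ruby
require "erb"
require "yaml"

# Placeholder values for the demo; normally exported in the shell (e.g. .bashrc).
ENV["MYAPP_DEV_DB_DATABASE"] ||= "myapp_dev"
ENV["MYAPP_DEV_DB_USER"]     ||= "username"
ENV["MYAPP_DEV_DB_PW"]       ||= "secret"

# A cut-down database.yml held in a string so the sketch is self-contained.
template = <<~YAML
  development:
    adapter: mysql2
    database: <%= ENV["MYAPP_DEV_DB_DATABASE"] %>
    username: <%= ENV["MYAPP_DEV_DB_USER"] %>
    password: <%= ENV["MYAPP_DEV_DB_PW"] %>
YAML

# Step 1: ERB substitutes the environment values into the text.
rendered = ERB.new(template).result

# Step 2: the rendered text is parsed as ordinary YAML.
db_config = YAML.safe_load(rendered)

puts db_config["development"]["database"]
```

Because the substitution happens once, when the file is read, the committed file never contains the secret itself.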
Meanwhile, if your production setup involves a hands off deployment where you can't ssh to the production server, like Heroku's does, you need to look at how to remotely set up environment variables. For Heroku this is done with heroku config:add. So, per the same article, if you had S3 integrated into your app and you had the sensitive data coming in from the environment variables:
AWS::S3::Base.establish_connection!(
  :access_key_id     => ENV['S3_KEY'],
  :secret_access_key => ENV['S3_SECRET']
)
Just have Heroku create environment variables for it:
heroku config:add S3_KEY=8N022N81 S3_SECRET=9s83159d3+583493190
Another pro of this solution is that it's language neutral, not just Rails. Works for any app since they can all acquire the environment variables.
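One caveat: ENV['S3_KEY'] quietly returns nil when the variable is not set, so a missing secret may only surface later as a confusing authentication error. A possible guard, sketched here with a hypothetical helper named fetch_secret, is to use ENV.fetch so the app fails at startup with a clear message instead.

```ruby
# Looks up a required setting and fails loudly when it is missing.
# `env` defaults to the real environment; a plain Hash can be passed for testing.
def fetch_secret(name, env = ENV)
  env.fetch(name) do
    raise KeyError, "#{name} is not set; export it before starting the app"
  end
end

# Demo with a stand-in Hash rather than the real environment.
demo_env = { "S3_KEY" => "8N022N81" }

puts fetch_secret("S3_KEY", demo_env)

begin
  fetch_secret("S3_SECRET", demo_env)
rescue KeyError => e
  puts "Refusing to boot: #{e.message}"
end
```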

How about this...
Create a new project and check it into GitHub with placeholder values in the production.yml and development.yml files.
Update .gitignore to include production.yml and development.yml.
Replace the placeholder values with your secrets.
Now you can check your code into GitHub without compromising your secrets.
And anyone can clone your repo without any extra steps to create missing files (they'll just replace the placeholder values as you did).
Does that meet your goals?

They're probably best put in initializers (config/initializers/api.yaml) though I think what you've got cooked up is fine. Add the file that holds the actual keys to your .gitignore and run git rm config/environments/production.yml to remove that sensitive data from your repo. Fair warning: it will delete the file from your working copy too, so back it up first. The file also stays in the repository's earlier commits, so treat any keys that were already committed as exposed.
Then, just create a config/environments/production.yml.example file next to your actual file with the pertinent details but with the sensitive data left out. When you pull it out to production, just copy the file without the .example and substitute the appropriate data.
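The copy step can be scripted so a fresh clone is one command away from running. The following is only a sketch: install_examples is a made-up helper name, and the demo runs in a temporary directory instead of a real config/ folder.

```ruby
require "fileutils"
require "tmpdir"

# Copies every "*.example" file under config_dir to the same name without
# the suffix, skipping files that already exist so real secrets are never
# overwritten. Returns the list of files it created.
def install_examples(config_dir)
  Dir.glob(File.join(config_dir, "**", "*.example")).sort.each_with_object([]) do |example, created|
    target = example.delete_suffix(".example")
    next if File.exist?(target)

    FileUtils.cp(example, target)
    created << target
  end
end

# Demo in a throwaway directory.
Dir.mktmpdir do |dir|
  env_dir = File.join(dir, "environments")
  FileUtils.mkdir_p(env_dir)
  File.write(File.join(env_dir, "production.yml.example"), "s3_secret: CHANGEME\n")

  first_run  = install_examples(dir)
  second_run = install_examples(dir)
  puts "created #{first_run.size} file(s), then #{second_run.size}"
end
```

Running it twice is harmless, which matters once the copied file holds real values.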

Use environment variables.
In Ruby, they're accessible like so:
ENV['S3_SECRET']
Two reasons:
The values will not make it into source control.
"sensitive data" aka passwords tend to change on a per-environment basis anyways. e.g. you should be using different S3 credentials for development vs production.
Is this a best practice?
Yes: http://12factor.net/config
How do I use them locally?
foreman and dotenv are both easy. Or, edit your shell.
How do I use them in production?
Largely, it depends. But for Rails, dotenv is an easy win.
What about platform-as-a-service?
Any PaaS should give you a way to set them. Heroku for example: https://devcenter.heroku.com/articles/config-vars
Doesn't this make it more complicated to set up a new developer for the project?
Perhaps, but it's worth it. You can always check a .env.sample file into source control with some example data in it. Add a note about it to your project's readme.
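A checked-in .env.sample can also double as a checklist. As a rough sketch (the helper names sample_keys and missing_env are invented here, and the parser only understands plain KEY=value lines), a setup script could compare the sample against the environment and tell a new developer exactly what is still missing:

```ruby
require "tempfile"

# Returns the variable names listed in a .env-style sample file,
# ignoring blank lines and comments.
def sample_keys(path)
  File.readlines(path, chomp: true)
      .map(&:strip)
      .reject { |line| line.empty? || line.start_with?("#") }
      .map { |line| line.split("=", 2).first.strip }
end

# Returns the names from the sample that are not present in `env`.
def missing_env(path, env = ENV)
  sample_keys(path).reject { |key| env.key?(key) }
end

# Demo: a temporary file stands in for .env.sample, a Hash for ENV.
sample = Tempfile.new("env-sample")
sample.write("# copy to .env and fill in real values\nS3_KEY=changeme\nS3_SECRET=changeme\n")
sample.close

missing = missing_env(sample.path, { "S3_KEY" => "abc" })
puts "Missing: #{missing.join(', ')}" unless missing.empty?
```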

Rails 4.1 now has a convention for it: you store this stuff in secrets.yml, so you don't end up with ENV calls scattered across your app.
Like database.yml, this YAML file is parsed with ERB, so you can still use ENV calls in it. In that case you can put it under version control, where it serves as documentation of which ENV vars have to be set. But you can also exclude it from version control and store the actual secrets there. In that case you would put a secrets.yml.default or the like into the public repo for documentation purposes.
development:
  s3_secret: 'foo'
production:
  s3_secret: <%= ENV['S3_SECRET'] %>
Then you can access this stuff under
Rails.application.secrets.s3_secret
It's discussed in detail at the beginning of this episode

Related

Rails secrets.yml VS Dotenv VS Figaro with Capistrano on AWS

There are several posts and Stack Overflow questions about how to manage API tokens on the web, but I see a lot of people repeat what they read somewhere else, often with contradictions...
How do you deal with API Tokens, secrets and the like ?
Here's what I have read so far using 3 different gems
secrets.yml
Introduced with Rails 4.1, then updated to encrypted secrets in Rails 5.1
When initially released in Rails 4.1, they were (or were not?) meant to be pushed to repositories. However, I often saw examples using environment variables for production
# config/secrets.yml
development:
  secret_key_base: super_long_secret_key_for_development
  ...
production:
  secret_key_base: <%= ENV["SECRET_KEY_BASE"] %>
  ...
And at this point someone asked "Why do we use ENV for production ?". A legit question back then, often answered "We don't want production token to be hard coded in the application" (hence how it is not clear anymore if the secrets should have been committed or not).
Later, with Rails 5, secrets became encrypted with a master key, so the encrypted secrets.yml file could be committed to the repository, but then the problem remained the same with the master key used to read the secrets.
PROs:
Rails convention.
Easy deploy with capistrano-secrets gem and cap [stage] setup (and it only deploys stage secrets nice) or similar gems
YML data structure (array/hash ok) + can use Ruby code via ERB
With encrypted secrets since rails 5, easy to collaborate/share the secrets
CONs:
Need to use Rails.application.secrets.xxx
Many services like AWS still read from ENV to automatically setup their gems/services
Is not the 12 factors way (or is it ?)
Quite new, so not really used yet ?
If using encrypted secrets, need to manage the master key
Bkeepers dotenv
Simply defining a .env file at the root that is read by Rails on startup
Plugins like capistrano-env make it easy to inject environment-specific ENV on servers, and secrets must still be managed using .env.staging, .env.production
PROs
ENV is in 12factor rules
3.5k stars... maybe not for nothing ?
the dotenv approach is now available on almost all other languages (JS, Go, etc)
Recent versions allow reusing variables (API_ROOT=X, SOME_ENDPOINT=${API_ROOT}/endpoint)
CONs
No Ruby interpolation (unless extra code is added to parse the .env with the ERB templating engine for example)
limited to string-string key/val
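For readers who have not looked inside a .env loader, the idea is small enough to sketch. This is not dotenv's actual implementation: parse_dotenv is a hypothetical function that handles only unquoted KEY=VALUE lines and ${NAME} references to keys defined earlier in the same file.

```ruby
# Parses KEY=VALUE lines into a Hash of strings, expanding ${NAME}
# references to keys defined earlier in the same text.
def parse_dotenv(text)
  text.each_line.with_object({}) do |line, vars|
    line = line.strip
    next if line.empty? || line.start_with?("#")

    key, value = line.split("=", 2)
    next if value.nil?

    vars[key.strip] = value.strip.gsub(/\$\{(\w+)\}/) { vars.fetch($1, "") }
  end
end

# Hypothetical .env contents; the single-quoted heredoc keeps ${...} literal.
dotenv_text = <<~'DOTENV'
  # example values only
  API_ROOT=https://api.example.com
  SOME_ENDPOINT=${API_ROOT}/endpoint
DOTENV

vars = parse_dotenv(dotenv_text)

# Variables that are already set in the real environment are left alone.
vars.each { |key, value| ENV[key] ||= value }

puts vars["SOME_ENDPOINT"]
```

Everything ends up as a flat string-to-string mapping, which is exactly the limitation listed in the CONs.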
Figaro
Some sort of hybrid secrets/ENV. With 12factors/Rails/Heroku in mind, but in the end doesn't seem better than the rest...
From the above and the rest I didn't write, it would seem like secrets.yml would be a great winner if those secrets were put in ENV instead (and tbh I feel lazy about writing Rails.application.secrets each time).
So, suppose I start a quite new Rails app, and also based on your experience. Which one would you choose ?
(My particular stack uses AWS + Capistrano, no Heroku)
I personally think that the "right" approach depends on your environment.
In my case, I have servers which are managed by IT and I don't have access to the vhost or anything else to easily set environment variables. Because of this, I commit a secrets.yml file which doesn't contain the production stanza, then set up Capistrano to add this file to shared_files. On each server, I add the file.
If I had easy access to the vhost, and I was managing the server vhosts with Puppet, Chef, Ansible, or similar, I would use the environment variable approach like the default secrets.yml file.
Your case with AWS appears to be the latter. Ultimately, either option is fine. There is little downside to committing the secrets.yml file without a production stanza.
First, all three methods can be 12-factor compatible. It is compatible if you pass the config value by ENV variable, instead of copying one file to the server first.
My thoughts on each of these solutions:
Rails secrets
Developers are forced to go 12-factor, either manually set it on production server, or have another file on local machine and then pass it as ENV every time during deploy. (Didn't know about capistrano-secrets, it probably handles this)
(I think what you said in CONs #2 and #3 is the opposite of the secrets.yml solution)
The accessor is also quite long as you mentioned.
dotenv
It does not encourage 12-factor, and was not originally designed for production env anyways. Either you write code to pass its value as ENV to production during deploy (making it 12 factor compatible), or you copy your .env.production file to the production server.
Also it forces you to use the flat key:value configuration. No nesting.
Figaro
Though it uses YAML, it does not allow nested hash.
It is 12 factor compatible, e.g. it includes code to transfer the config to heroku.
My Solution
I wrote a gem in which settings are stored in a gitignored YAML file. Nesting is allowed. To access a value, call Setting.dig(:fb, :api).
It provides a mechanism for 12-factor app deploys by serializing the whole YAML file into a string and passing it to production as ENV.
I no longer have to distinguish between secret config and non-secret config. They are in one place and secret by default. I get benefit of 12-factor app while using easy to namespace YAML files.
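The general idea of shipping a nested YAML file through a single environment variable can be sketched with the standard library alone. This is not the gem's code: the variable name APP_SETTINGS, the sample keys, and the use of a plain Hash for Setting are all assumptions made for illustration.

```ruby
require "yaml"

# Hypothetical contents of the gitignored settings file.
LOCAL_SETTINGS_YAML = <<~YAML
  fb:
    api: fb-api-key
    secret: fb-secret
  s3:
    bucket: my-bucket
YAML

# Deploy side: the whole document travels as one environment variable.
# A Hash stands in for the production environment here.
production_env = { "APP_SETTINGS" => LOCAL_SETTINGS_YAML }

# App side: parse the variable back into a nested Hash with symbol keys.
Setting = YAML.safe_load(production_env.fetch("APP_SETTINGS"), symbolize_names: true)

puts Setting.dig(:fb, :api)
```

Since the parsed result is an ordinary nested Hash, Hash#dig gives the namespaced access described above.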

Why use an extension on config/database.yml?

I'm inheriting a project that uses config/database.yml.sqlite and config/database.yml.psql instead of config/database.yml.
Why is this done and how should I use it?
If I just run rake db:create rails is looking for a config/database.yml. I've tried looking for a way to specify the name of the config file but no luck.
I could just ask the people I'm inheriting the code from but after a bit of googling around I see this pattern in other projects and think that it'd be nice if SO had an answer.
It is often quite normal to add database.yml to your .gitignore file, because it can contain passwords etc, and so ought to be kept out of the Git repo.
In this case, it is useful to keep an example database.yml file in the repo, showing the settings you would want for, say a Postgres database if you are going to use that, or a Sqlite database if you prefer that for your development work. Then you can get set up quickly once you've cloned the repo.
All you have to do is run:
cp config/database.yml.psql config/database.yml
then add your own passwords for your local development database into database.yml, which will then stay out of the repo and not be shared with any other developers working on the same project.
I've actually never seen that pattern. It sounds to me that those other YAML files are pre-encoded for the target database - pre-configured so to speak.
The same result could be achieved by a single YAML file with extensive documented blocks - "here is a sqlite block, here is a Postgres block", etc.

Is it ok to store DB password for the production environment in the config/database.yml file

Is it ok to store DB password for the production environment in the "config/database.yml" file? Or is there any more correct way to do it (maybe environment variables)?
Thanks in advance.
It's not a good idea! One main reason is that the config/database.yml file will probably be included in some kind of source control, like a git repository. Even if the repo is private currently, you can't know for sure it won't be made public in the future and then you would have a problem on your hands!
In addition, if anyone ever gains read-access to your application's files or just a copy of your application's source, they now have your database password.
A typical solution is to set an environment variable like you suggested and then read it in the .yml file:
password: <%= ENV['DATABASE_PASSWORD'] %>
If you're using a PaaS like Heroku, this is the standard way to do things. But even this isn't a perfect solution, so evaluate your options carefully.

How to store credentials for third party services in Rails

I am setting up a redirection through SendGrid for the mails sent by my rails application.
However I am not really satisfied with the way I'm told to store the credentials.
As it is specified there, they suggest overriding ActionMailer's defaults in the config/environment.rb file. I've found out that my predecessor created an initializers/smtp.rb file where he defined the previous settings, but by discovering this file, I discovered the SMTP password...
If I modify any of these files, anyone having access to the git repository will have access to the credentials (including the front-end and back-end freelancers we work with).
I was thinking of creating a file that would stay on the server's shared folder (like the database.yml file) and that would be symlinked to the app each time we deploy thanks to capistrano.
What do you think of it? Would it be okay to just move this initializers/smtp.rb to the server's shared folder and symlink it when deploying?
My suggestion (what I've seen done):
Move API keys and sensitive info into a yml file under config/.
Load this yml file into a variable, for instance
KEYS = YAML.load_file(Rails.root.join("config", "config.yml"))
Voila.
Also, when putting your code up on GitHub for example, this config.yml would be something you add to the .gitignore. Instead, make a config-example.yml and tell your developers to get their own keys and passwords and such, storing them in their local config.yml.
Environment variables are the best way if you're on *nix.
Stick your variables in .bashrc file like so:
# no need for quotation marks
export GMAIL_USER=my_gmail_user_name@gmail.com
export GMAIL_PASSWORD=my_gmail_password
And call them in your smtp initializer like so:
ActionMailer::Base.smtp_settings = {
  :user_name => ENV['GMAIL_USER'],
  :password  => ENV['GMAIL_PASSWORD']
}
Restart bash and your rails app. All should work. Heroku have a good article on how to use env variables on their network.

How Do You Secure database.yml?

Within Ruby on Rails applications database.yml is a plain text file that stores database credentials.
When I deploy my Rails applications I have an after deploy callback in my Capistrano recipe that creates a symbolic link within the application's /config directory to the database.yml file. The file itself is stored in a separate directory that's outside the standard Capistrano /releases directory structure. I chmod 400 the file so it's only readable by the user who created it.
Is this sufficient to lock it down? If not, what else do you do?
Is anyone encrypting their database.yml files?
The way I have tackled this is to put the database password in a file with read permissions only for the user I run my application as. Then, in database.yml I use ERB to read the file:
production:
  adapter: mysql
  database: my_db
  username: db_user
  password: <%= begin IO.read("/home/my_deploy_user/.db") rescue "" end %>
Works a treat.
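A variation on the same idea, sketched under the assumption of a POSIX file system: check the file's permission bits before trusting it. The helper read_secret_file and the error class InsecureSecretFile are invented for this example, and unlike the one-liner above it raises instead of falling back to an empty password.

```ruby
require "tempfile"

# Raised when a secret file can be read by group or other users.
class InsecureSecretFile < StandardError; end

# Reads a secret from `path`, refusing files whose mode grants any
# permission to group or others (for example 0644).
def read_secret_file(path)
  mode = File.stat(path).mode & 0o777
  unless (mode & 0o077).zero?
    raise InsecureSecretFile, format("%s has mode %o; expected 400 or 600", path, mode)
  end

  File.read(path).chomp
end

# Demo: a temporary file stands in for /home/my_deploy_user/.db.
secret_file = Tempfile.new("db-password")
secret_file.write("s3cret\n")
secret_file.close
File.chmod(0o400, secret_file.path)

puts read_secret_file(secret_file.path)
```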
You'll also want to make sure that your SSH system is well secured to prevent people from logging in as your Capistrano bot. I'd suggest restricting access to password-protected key pairs.
Encrypting the .yml file on the server is useless since you have to give the bot the key, which would be stored . . . on the same server. Encrypting it on your machine is probably a good idea. Capistrano can decrypt it before sending.
Take a look at this github solution: https://github.com/NUBIC/bcdatabase. bcdatabase provides an encrypted store where the passwords can be kept separated from the yaml files.
bcdatabase
bcdatabase is a library and utility which provides database configuration parameter management for Ruby on Rails applications. It provides a simple mechanism for separating database configuration attributes from application source code so that there's no temptation to check passwords into the version control system. And it centralizes the parameters for a single server so that they can be easily shared among multiple applications and easily updated by a single administrator.
Better late than never, I am posting my answer as the question still remains relevant. For Rails 5.2+, it is possible to secure any sensitive information using an encrypted file credentials.yml.enc.
Rails stores secrets in config/credentials.yml.enc, which is encrypted and hence cannot be edited directly. We can edit the credentials by running the following command:
$ EDITOR=nano rails credentials:edit
secret_key_base: 3b7cd727ee24e8444053437c36cc66c3
production_dbpwd: my-secret-password
Now, these secrets can be accessed using Rails.application.credentials.
So your database.yml will look like this:
production:
  adapter: mysql
  database: my_db
  username: db_user
  password: <%= Rails.application.credentials.production_dbpwd %>
You can read more about this here
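To see why the encrypted file is safe to commit while the master key is not, here is a toy round trip using Ruby's OpenSSL bindings. It is an illustration of the principle only: Rails' own file format and key handling live in ActiveSupport and differ from this sketch, and the AES-GCM setup, helper names, and key size are choices made for the example.

```ruby
require "openssl"
require "securerandom"

CIPHER_NAME = "aes-128-gcm"

# Encrypts plaintext with the key; returns the pieces needed to decrypt.
def encrypt_secret(plaintext, key)
  cipher = OpenSSL::Cipher.new(CIPHER_NAME).encrypt
  cipher.key = key
  iv = cipher.random_iv
  cipher.auth_data = ""
  data = cipher.update(plaintext) + cipher.final
  [iv, cipher.auth_tag, data]
end

# Decrypts the triple produced by encrypt_secret; raises
# OpenSSL::Cipher::CipherError if the key is wrong or the data was altered.
def decrypt_secret((iv, tag, data), key)
  cipher = OpenSSL::Cipher.new(CIPHER_NAME).decrypt
  cipher.key = key
  cipher.iv = iv
  cipher.auth_tag = tag
  cipher.auth_data = ""
  cipher.update(data) + cipher.final
end

master_key = SecureRandom.random_bytes(16) # stays out of version control
encrypted  = encrypt_secret("production_dbpwd: my-secret-password", master_key)

puts decrypt_secret(encrypted, master_key)
```

Whoever holds only the encrypted triple learns nothing useful, so the problem shrinks to protecting one key rather than every secret.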
Even if you secure the database.yml file, people can still write code that uses the same credentials if they can change the code of your application.
Another way to look at this is: does the web application have too much access to the database? If so, lower the permissions. Give the application just enough permissions. This way an attacker can only do what the web application would be able to do.
If you're very concerned about security of the yml file, I have to ask: Is it stored in your version control? If so, that's another point where an attacker can get at it. If you're doing checkout/checkin over non-SSL, someone could intercept it.
Also, with some version control (svn, for example), even if you remove it, it's still there in the history. So, even if you removed it at some point in the past, it's still a good idea to change the passwords.

Resources