DBT - environment variables and running dbt

I am relatively new to DBT and I have been reading about env_var. I want to use it in a couple of situations and I am having difficulty, so I am looking for some support.
First, I am trying to use it in my profiles.yml file to replace the user and password, so that these can be set when dbt is invoked. When trying to test this locally (before implementing it on our AWS side), I am failing to find the right syntax and not finding anything useful online.
I have tried variations of:
dbt run --vars '{DBT_USER: my_username, DBT_PASSWORD=my_password}'
but it is not recognized, and the error output is not useful. When I run dbt run by itself it does ask for DBT_USER, so it is expecting the variable, but it doesn't explain how to supply it.
I would also like to use it in my dbt_project.yml for the schema, but I assume that this will be similar to the above, just with a third variable at the end. Is that the case?
Thanks

var and env_var are two separate features of dbt.
You can use var to access a variable you define in your dbt_project.yml file. The --vars command-line option lets you override the values of these vars at runtime. See the docs for var.
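For example, a minimal sketch, assuming dbt_project.yml defines a var named my_schema (the name is illustrative): the whole YAML dict is quoted as a single shell argument, with key: value pairs rather than key=value:
$ dbt run --vars '{my_schema: my_dev_schema}'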
You should use env_var to access environment variables that you set outside of dbt for your system, user, or shell session. Typically you would use environment variables to store secrets like your profile's connection credentials.
To access environment variables in your profiles.yml file, you replace the values for username and password with a call to the env_var macro, as they do in the docs for env_var:
profile:
  target: prod
  outputs:
    prod:
      type: postgres
      host: 127.0.0.1
      # IMPORTANT: Make sure to quote the entire Jinja string here
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      ....
Then, BEFORE you issue the dbt run command, you need to set the DBT_USER and DBT_PASSWORD environment variables for your system, user, or shell session. How you do this depends on your OS, but there are lots of good instructions for each. To set a variable for your shell session (on Unix-like OSes), that could look like this:
$ export DBT_USER=my_username
$ export DBT_PASSWORD=abc123
$ dbt run
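If you only want the variables to exist for a single invocation, inline assignment on the same line also works:
$ DBT_USER=my_username DBT_PASSWORD=abc123 dbt run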
Note that storing passwords in environment variables isn't necessarily more secure than keeping them in your profiles.yml file, since they're stored in plaintext and not protected from being dumped into logs, etc. (You shouldn't be checking profiles.yml into source control.) You should consider at least using an environment variable name prefixed with DBT_ENV_SECRET_ so that dbt keeps it out of logs. See the docs for more info.
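For example, a sketch of that renaming (profiles.yml would then reference {{ env_var('DBT_ENV_SECRET_USER') }} and {{ env_var('DBT_ENV_SECRET_PASSWORD') }} instead):
$ export DBT_ENV_SECRET_USER=my_username
$ export DBT_ENV_SECRET_PASSWORD=abc123
$ dbt run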

Related

Get all environment variables of a Docker container including security variables

Is there a way to get all the environment variables that a Docker image accepts, including authentication-related ones and everything else, so I can make the best of that image?
For example, I've run a redis:7.0.8 container and I want to use every possible feature this image offers.
First I used docker inspect and saw this:
"Env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"GOSU_VERSION=1.16",
"REDIS_VERSION=7.0.8",
"REDIS_DOWNLOAD_URL=http://download.redis.io/releases/redis-7.0.8.tar.gz",
"REDIS_DOWNLOAD_SHA=06a339e491306783dcf55b97f15a5dbcbdc01ccbde6dc23027c475cab735e914"
],
I also tried docker exec -it my-container env which just showed me the same thing. I know there are more variables, for example this doesn't include the following:
REDIS_PASSWORD
REDIS_ACLS
REDIS_TLS_CERT_FILE
Absent documentation, this is pretty much impossible.
Let's start by repeating @jonrsharpe's comment:
They accept any env var at all, but they won't respond to all of them.
Consider this Python code, for example:
#!/usr/bin/env python3
import os

def get_environ(d, name):
    # indirect lookup: the variable *name* is itself data
    return d.get(name, 'absent')

foo = os.environ.get('FOO', 'default_foo')
star_foo = get_environ(os.environ, foo)
print(star_foo)
This fragment looks up an environment variable $FOO. You could probably figure that out, if you knew the main process was in Python and recognized os.environ. But then it passes that value and the standard environment to a helper function, which looks up that environment variable by name. You'd need detailed static analysis to understand this is actually also an environment-variable lookup.
$ ./test.py
absent
$ default_foo=bar ./test.py
bar
$ FOO=BAR BAR=quux ./test.py
quux
$ I=3 ./test.py
absent
(A fair bit of the code I work with accesses environment variables rather haphazardly; it's not just "find the main function" but "find every ENV reference in every file in every library". Some frameworks like Spring Boot make it possible to set hundreds of configuration options via environment variables, and even if it were possible to get every possible setting here, the output would be prohibitive.)
"What environment variables are there" isn't standard container metadata. You'd have to identify the language the main container process runs, and do this sort of analysis on it, including compiled languages. That doesn't seem like a solvable problem.

How to safely pass variables that may contain special characters to a docker-compose file in Ansible

I am currently using an ansible script to deploy a docker-compose file (using the docker_service module), which sets a series of environment variables which are read by the .NET Core service running inside the docker container, like this:
(...)
environment:
  - Poller:Username={{ poller_username }}
  - Poller:Password={{ poller_password }}
(...)
The variables for poller_username and poller_password are being loaded from an Ansible Vault (which will be moved to a Hashicorp Vault eventually), and are interpolated into the file with no problem.
However, I have come across a scenario where this logic fails: the user has a '$' in the middle of their password. This means that instead of the environment variable being set to 'abc$123', it ends up set to 'abc', causing my application to fail.
With a debug task, the password is output to the console correctly; but if I do docker exec <container_name> env, I see the wrong password.
Is there a Jinja filter I can use to ensure the password is compliant with docker-compose standards? It doesn't seem viable to me to guarantee the password will never have a $.
EDIT: {{ poller_password | replace("$","$$") }} works, but this isn't a very elegant solution to have in, potentially, every variable I use in the docker-compose module.
For this particular scenario, the {{ poller_password | replace("$","$$") }} solution seems to be inevitable. Thankfully, it appears to be the only case that requires this caution.
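A quick way to see the behavior outside Ansible is to run a compose file through docker-compose config, which performs the same $ interpolation; $$ is compose's escape for a literal $, which is exactly what the replace("$","$$") filter produces:
$ cat > docker-compose.yml <<'EOF'
version: "3"
services:
  app:
    image: busybox
    environment:
      - Poller:Password=abc$$123
EOF
$ docker-compose config    # the rendered config shows a literal abc$123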
I had a similar situation; it was not a $ but some other character. I ended up using
something: !unsafe "{{ variable }}"
I couldn't find a better way.

How to set System Wide Environment Variable in Cloud Config File on Digital Ocean

I am pretty new to setting up remote servers, but I was playing around today and was hoping that I could leverage a Cloud Config file upon setup in order to set a few environment variables as the server spins up.
How can I set my environment variables programmatically when spinning up a machine on Digital Ocean? The key is that I want to automate the setup and avoid interactively defining these variables.
Thanks in advance.
This is what I did for Ubuntu:
write_files:
  - path: /etc/environment
    content: |
      FOO="BAR"
    append: true
There are a couple of ways to do this, although cloud-init doesn't have a built-in resource type for environment variables.
Depending on your OS, use a write_files section to output the env vars you want to the appropriate file. For CoreOS, you'd do something like:
write_files:
  - path: "/etc/profile.env"
    append: true
    content: |
      export MY_VAR="foo"
For Ubuntu, use /etc/environment, or a user's profile, etc.
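A variant of the same idea (a sketch; the file name is illustrative) is to drop a script under /etc/profile.d, which login shells on Ubuntu source automatically; you can emit the file from write_files just like the examples above:
$ echo 'export MY_VAR="foo"' | sudo tee /etc/profile.d/90-custom-env.sh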
Another way to do it would be to leverage Cloud Init's support for Chef, and use that tool to set the variables when the profile is applied.
Do you need the environment variable to be permanent, or just for the execution of a single command/script?
If it's for a single command, you can set the variable inline for just that invocation (no pipe needed):
FOO=${BAR} sh ./your_script.sh
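For contrast, export makes the variable visible to everything started later in the same shell session, not just one process:
$ FOO=bar ./your_script.sh         # visible only to this one process
$ export FOO=bar; ./your_script.sh # visible to this and all later commands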

Using environment variables with Cloudera Manager

What would be the best way to use environment variables declared for different users in a cluster (all nodes), make a call to an Oozie workflow (Cloudera), and have the YARN container recover the environment variable according to the user?
In the YARN configuration in Cloudera Manager there seem to be references of this kind, something like ENVVAR_USER=$ENVVAR_USER.
The goal is a way to get a different properties file depending on the user making the call.
You could define one set of env. variables for every user, then resolve the actual values based on actual user name:
### per-user config
Sex_Mary=female
Sex_Mario=male
### resolving config for current user
User=Mario
eval Sex=\$Sex_$User
echo $Sex
But it's an old Unix trick, nothing to do with Hadoop or Cloudera. And maintaining the whole config would be a chore.
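If the shell is specifically bash, indirect expansion does the same lookup without eval; a minimal sketch, reusing the per-user config above:
User=Mario
varname="Sex_$User"
echo "${!varname}"    # prints "male", same result as the eval version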
Any chance you can store the values in LDAP, and retrieve them dynamically with ldapsearch plus sed or awk?
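If the values did live in LDAP, the per-user lookup could be a one-liner at call time. A hypothetical sketch (the attribute name myAppProfile and the search base are made up):
$ ldapsearch -x -LLL -b "ou=people,dc=example,dc=com" "(uid=$USER)" myAppProfile \
    | awk -F': ' '/^myAppProfile:/ {print $2}'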

Accessing Elastic Beanstalk environment properties in Docker

So I've been looking around for an example of how I can specify environment variables for my Docker container from the AWS EB web interface. Typically in EB you can add environment properties which are available at runtime. I was using these for my previous deployment before I switched to Docker, but it appears as though Docker has some different rules with regards to how the environment properties are handled, is that correct? According to this article [1], ONLY the AWS credentials and PARAM1-PARAM5 will be present in the environment variables, but no custom properties will be present. That's what it sounds like to me, especially considering the containers that do support custom environment properties say it explicitly, like Python shown here [2]. Does anyone have any experience with this software combination? All I need to specify is a single environment variable that tells me whether the application is in "staging" or "production" mode, then all my environment specific configurations are set up by the application itself.
[1] http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/command-options.html#command-options-docker
[2] http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/command-options.html#command-options-python
Custom environment variables are supported with the AWS Elastic Beanstalk Docker container. Looks like a miss in the documentation. You can define custom environment variables for your environment and expect that they will be passed along to the docker container.
I needed to pass an environment variable at the moment of docker run when using Elastic Beanstalk, but it is not allowed to put this information in Dockerrun.aws.json.
Below are the steps to resolve this scenario:
Create a folder .ebextensions
Create a .config file in the folder
Fill the .config file:
option_settings:
  - option_name: VARIABLE_NAME
    value: VARIABLE_VALUE
Zip the .ebextensions folder along with Dockerrun.aws.json and the Dockerfile, and upload it to Beanstalk
To see the result, execute "docker inspect CONTAINER_ID" inside the EC2 instance and you will see the environment variable.
At least for me the environment variables that I set in the EB console were not being populated into the Docker container. I found the following link helpful though: https://aws.amazon.com/premiumsupport/knowledge-center/elastic-beanstalk-env-variables-shell/
I used a slightly different approach where instead of exporting the vars to the shell, I used the ebextension to create a .env file which I then loaded from Python within my container.
The steps would be as follows:
Create a directory called '.ebextensions' in your app root dir
Create a file in this directory called 'load-env-vars.config'
Enter the following contents:
commands:
  setvars:
    command: /opt/elasticbeanstalk/bin/get-config environment | jq -r 'to_entries | .[] | "\(.key)=\"\(.value)\""' > /var/app/current/.env
packages:
  yum:
    jq: []
This will create a .env file in /var/app/current, which is where your code should be within the EB instance.
Use a package like python-dotenv to load the .env file, or something similar if you aren't using Python. Note that this solution is generic to any language/framework that you're using within your container.
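If you'd rather not pull in a language-specific package, the same .env file can be loaded by a small shell wrapper before your process starts; set -a auto-exports every variable assigned while it's in effect:
set -a
. /var/app/current/.env
set +a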
I don't think the docs are a miss, as Rohit Banga's answer suggests. Though I agree that "you can define custom environment variables for your environment and expect that they will be passed along to the docker container".
The Docker container portion of the docs say, "No DOCKER-SPECIFIC configuration options are provided by Elastic Beanstalk" ... which doesn't necessarily mean that no environment variables are passed to the Docker container.
For example, for the Ruby container the Ruby-specific variables that are always passed are ... RAILS_SKIP_MIGRATIONS, RAILS_SKIP_ASSET_COMPILATION, BUNDLE_WITHOUT, RACK_ENV, RAILS_ENV. And so on. For the Ruby container, the assumption is you are running a Ruby app, hence setting some sensible defaults to make sure they are always available.
On the other hand, for the Docker container it seems it's open. You specify whatever variables you want ... they make no assumptions as to what you are running, Rails (Ruby), Django (Python) etc ... because it could be anything. They don't know before hand what you want to run and that makes it difficult to set sensible defaults.
