Using Environment variables with cloudera manager - environment-variables

What would be the best way to use an environment variables declared for different users in a cluster(all nodes) and make a call to a oozie workflow (Cloudera) and the container of yarn recover the environment variable according to the user.
In the configuration of yarn in Cloudera manager seems to have references of this kind, something like ENVVAR_USER=$ENVVAR_USER.
It is a way to get a different properties file depending on the user making the call.

You could define one set of env. variables for every user, then resolve the actual values based on actual user name:
### per-user config
Sex_Mary=female
Sex_Mario=male
### resolving config for current user
User=Mario
eval Sex=\$Sex_$User
echo $Sex
But it's an old Unix trick, nothing to do with Hadoop or Cloudera. And maintaining the whole config would be a chore.
Any chance you can store the values in LDAP, and retrieve them dynamically with ldapsearch plus sed or awk??

Related

DBT - environment variables and running dbt

I am relatively new to DBT and I have been reading about env_var and I want to use this in a couple of situations and I am having difficultly and looking for some support.
firstly I am trying to use it in my profiles.yml file to replace the user and password so that this can be set when it is invoked. When trying to test this locally (before implementing this on our AWS side) I am failing to find the right syntax and not finding anything useful online.
I have tried variations of:
dbt run --vars '{DBT_USER: my_username, DBT_PASSWORD=my_password}'
but it is not recognizing and giving nothing useful error wise. When running dbt run by itself it does ask for DBT_USER so it is expecting it, but doesn't detail how
I would also like to use it in my dbt_project.yml for the schema but I assume that this will be similar to the above, just a third variable at the end. Is that the case?
Thanks
var and env_var are two separate features of dbt.
You can use var to access a variable you define in your dbt_project.yml file. The --vars command-line option lets you override the values of these vars at runtime. See the docs for var.
You should use env_var to access environment variables that you set outside of dbt for your system, user, or shell session. Typically you would use environment variables to store secrets like your profile's connection credentials.
To access environment variables in your profiles.yml file, you replace the values for username and password with a call to the env_var macro, as they do in the docs for env_var:
profile:
target: prod
outputs:
prod:
type: postgres
host: 127.0.0.1
# IMPORTANT: Make sure to quote the entire Jinja string here
user: "{{ env_var('DBT_USER') }}"
password: "{{ env_var('DBT_PASSWORD') }}"
....
Then BEFORE you issue the dbt_run command, you need to set the DBT_USER and DBT_PASSWORD environment variables for your system, user, or shell session. This will depend on your OS, but there are lots of good instructions on this. To set a var for your shell session (for Unix OSes), that could look like this:
$ export DBT_USER=my_username
$ export DBT_PASSWORD=abc123
$ dbt run
Note that storing passwords in environment variables isn't necessarily more secure than keeping them in your profiles.yml file, since they're stored in plaintext and not protected from being dumped into logs, etc. (You shouldn't be checking profiles.yml into source control). You should consider at least using an environment variable name prefixed by DBT_ENV_SECRET_ so that dbt keeps them out of logs. See the docs for more info

Adding a local system variable, or the result of a command, to a dockerfile

I have seen some similar questions, but none of them appear to solve my problem. I want to add a user to a docker container and in my Dockerfile, I define the username with:
ARG USERNAME="some_user"
Instead, I want the username to be the current user's computer username, as obtained by running the command whoami in the local terminal.
So what I would like to have is something like
ARG USERNAME=$(whoami)
.
This $(whoami) should be obtained from the local system environment, and not from the docker container.
Is there a way to do this for dockerfiles? I have thought of .env and docker-compose solutions but these also require each user to set their own username according to my knowledge.
There is no integrated way to execute arbitrary commands on the host directly outside of a container using just docker build / docker-compose build.
So to execute an arbitrary command to get/generate the required information you'll need to provide a custom script / use another build system to call docker/docker-compose with the respective flags or maybe generate the .env file from a template / interactively.
If you only need the current user name you may want to use the $USER / $LOGNAME environment variables that are set by the system in many default configurations. But since these are just normal environment variables their values may be incorrect / empty / manually changed by the user, see this question.

How to set System Wide Environment Variable in Cloud Config File on Digital Ocean

I am pretty new to setting up remote servers, but I was playing around today and was hoping that I could leverage a Cloud Config file upon setup in order to set a few environment variables as the server spins up.
How can I set my environment variables programmatically when spinning up a machine on Digital Ocean? The key is that I want to automate the setup and avoid interactively defining these variables.
Thanks in advance.
This is what I did with for ubuntu
write_files:
- path: /etc/environment
content: |
FOO="BAR"
append: true
There's a couple ways to do this, although Cloud Init doesn't support a built-in resource type for environment variables.
Depending on your OS, use a write-files section to output the env vars you want to the appropriate file. For CoreOS, you'd do something like:
write_files:
- path: "/etc/profile.env"
append: true
content: |
export MY_VAR="foo"
For Ubuntu, use /etc/environment, or a user's profile, etc.
Another way to do it would be to leverage Cloud Init's support for Chef, and use that tool to set the variables when the profile is applied.
Do you need the environment variable to be permanent, or just for the execution of a single command/script?
If it's for a single command, you can do that:
FOO=${BAR} | sh ./your_script.sh

Accessing Elastic Beanstalk environment properties in Docker

So I've been looking around for an example of how I can specify environment variables for my Docker container from the AWS EB web interface. Typically in EB you can add environment properties which are available at runtime. I was using these for my previous deployment before I switched to Docker, but it appears as though Docker has some different rules with regards to how the environment properties are handled, is that correct? According to this article [1], ONLY the AWS credentials and PARAM1-PARAM5 will be present in the environment variables, but no custom properties will be present. That's what it sounds like to me, especially considering the containers that do support custom environment properties say it explicitly, like Python shown here [2]. Does anyone have any experience with this software combination? All I need to specify is a single environment variable that tells me whether the application is in "staging" or "production" mode, then all my environment specific configurations are set up by the application itself.
[1] http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/command-options.html#command-options-docker
[2] http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/command-options.html#command-options-python
Custom environment variables are supported with the AWS Elastic Beanstalk Docker container. Looks like a miss in the documentation. You can define custom environment variables for your environment and expect that they will be passed along to the docker container.
I've needed to pass environment variable in moment docker run using Elastic Beanstalk, but, is not allowed put this information in Dockerrun.aws.json.
Below the steps to resolve this scenario:
Create a folder .ebextensions
Create a .config file in the folder
Fill the .config file:
option_settings:
-option_name: VARIABLE_NAME value: VARIABLE_VALUE
Zip the folder .ebextensions file along with the Dockerrun.aws.json plus Dockerfile and upload it to Beanstalk
To see the result, inside EC2 instance, execute the command "docker inspect CONTAINER_ID" and will see the environment variable.
At least for me the environment variables that I set in the EB console were not being populated into the Docker container. I found the following link helpful though: https://aws.amazon.com/premiumsupport/knowledge-center/elastic-beanstalk-env-variables-shell/
I used a slightly different approach where instead of exporting the vars to the shell, I used the ebextension to create a .env file which I then loaded from Python within my container.
The steps would be as follows:
Create a directory called '.ebextensions' in your app root dir
Create a file in this directory called 'load-env-vars.config'
Enter the following contents:
commands:
setvars:
command: /opt/elasticbeanstalk/bin/get-config environment | jq -r 'to_entries | .[] | "\(.key)=\"\(.value)\""' > /var/app/current/.env
packages:
yum:
jq: []
This will create a .env file in /var/app/current which is where your code should be within the EB instance
Use a package like python-dotenv to load the .env file or something similar if you aren't using Python. Note that this solution should be generic to any language/framework that you're using within your container.
I don't think the docs are a miss as Rohit Banga's answer suggests. Thought I agree that "you can define custom environment variables for your environment and expect that they will be passed along to the docker container".
The Docker container portion of the docs say, "No DOCKER-SPECIFIC configuration options are provided by Elastic Beanstalk" ... which doesn't necessarily mean that no environment variables are passed to the Docker container.
For example, for the Ruby container the Ruby-specific variables that are always passed are ... RAILS_SKIP_MIGRATIONS, RAILS_SKIP_ASSET_COMPILATION, BUNDLE_WITHOUT, RACK_ENV, RAILS_ENV. And so on. For the Ruby container, the assumption is you are running a Ruby app, hence setting some sensible defaults to make sure they are always available.
On the other hand, for the Docker container it seems it's open. You specify whatever variables you want ... they make no assumptions as to what you are running, Rails (Ruby), Django (Python) etc ... because it could be anything. They don't know before hand what you want to run and that makes it difficult to set sensible defaults.

$_SERVER and $_ENV not available if running php from shell

I have set up a cron job to run once an hour a script cron/cron.php
This script simply reads a table to check which scripts should run at a given time.
So far no problem.
I just noticed that $_SERVER['DOCUMENT_ROOT'] and $_SERVER['SERVER_NAME'] is empty. Same to $_ENV['HOSTNAME']
What can be the reason? I would prefer to have my cron.php portable so I am searching for a solution which should work on every server.
Thanks in advance for any tips!
When the cron script is run, it's most likely executed by the php-cli binary and not the webserver.
$_SERVER entries are set by the webserver, here is the quote from $_SERVER page in the PHP manual:
$_SERVER is an array containing information such as headers, paths, and script locations. The entries in this array are created by the web server.
As there is no webserver involved with your cron script, these are not set. You can try this your own by executing php on the command-line:
php -r 'var_dump($_SERVER);'
it will output all settings in $_SERVER in your command-line environment, "DOCUMENT_ROOT" most likely will be an empty string and "SERVER_NAME" is not set at all.
The $_ENV superglobal contains the environment variables of the system specifically, it's just that "HOSTNAME" is not set as environment variable by the cron binary.
Further Considerations
I normally suggest to not only create the PHP cron script (as you did with cron/cron.php) but also to create a shell-script that invokes the php script. Then use the shell-script in the crontab. This allows you to modify the environment easily without re-configuring the crontab or the cron.php too often. You can then set environment variables within that shell script as well as changing the working directory etc.
If you want to make your cron.php script more portable, figure out what the injected environment dependencies are (e.g. the document root your have) and make those variable, e.g. with variables or a parameter object. Then create a section in your script where those variables are populated and the rest of your script can run based on them in an injected manner. This reduces configuration changes only to a very limited part of your script and will allow you to create more re-useable code.

Resources