activating conda env vs calling python interpreter from conda env

What exactly is the difference between these two operations?
source activate python3_env && python my_script.py
and
~/anaconda3/envs/python3_env/bin/python my_script.py ?
It appears that activating the environment adds some directories to $PATH, but the second method seems to access all the modules installed in python3_env anyway. Is there anything else going on under the hood?

You are correct, activating the environment adds some directories to the PATH environment variable. In particular, this will allow any binaries or scripts installed in the environment to be run first, instead of the ones in the base environment. For instance, if you have installed IPython into your environment, activating the environment allows you to write
ipython
to start IPython in the environment, rather than
/path/to/env/bin/ipython
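For instance, a quick way to see this in a shell (a hypothetical session; the output paths are illustrative):
which ipython                  # before: not found, or e.g. /usr/bin/ipython
source activate python3_env
which ipython                  # after: ~/anaconda3/envs/python3_env/bin/ipython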
In addition, environments may have scripts that are executed on activation and that add or edit other environment variables (see the conda docs). These scripts can make arbitrary changes to the shell environment, even changing PYTHONPATH to alter where packages are loaded from.
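For example, here is a minimal sketch of such an activation script, following the etc/conda/activate.d pattern from the conda docs (the variable names and env path are made up for illustration):
mkdir -p ~/anaconda3/envs/python3_env/etc/conda/activate.d
cat > ~/anaconda3/envs/python3_env/etc/conda/activate.d/env_vars.sh <<'EOF'
#!/bin/sh
# Runs automatically on activation; never runs when the interpreter
# is invoked by its absolute path.
export MY_PROJECT_KEY="dev-secret"
export PYTHONPATH="$HOME/extra-packages:$PYTHONPATH"
EOF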
Finally, I wrote a very detailed answer about what exactly happens in the code here: Conda: what happens when you activate an environment? That may or may not still be up-to-date, though. The relevant part of the answer is:
...the build_activate method adds the prefix to the PATH via the _add_prefix_to_path method. Finally, the build_activate method returns a dictionary of commands that need to be run to "activate" the environment.
And another step deeper... The dictionary returned from the build_activate method gets processed into shell commands by the _yield_commands method, which are passed into the _finalize method. The activate method returns the value from running the _finalize method which returns the name of a temp file. The temp file has the commands required to set all of the appropriate environment variables.
Now, stepping back out, in the activate.main function, the return value of the execute method (i.e., the name of the temp file) is printed to stdout. This temp file name gets stored in the Bash variable ask_conda back in the _conda_activate Bash function, and finally, the temp file is executed by the eval Bash function.
So you can see, depending on the environment, running conda activate python3_env && python my_script.py and ~/anaconda3/envs/python3_env/bin/python my_script.py may give very different results.
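As a quick check of that difference (a sketch; the env name and paths are the ones from the question), compare the PATH each invocation sees:
source activate python3_env && python -c 'import os; print(os.environ["PATH"])'
# vs., in a fresh shell without activation:
~/anaconda3/envs/python3_env/bin/python -c 'import os; print(os.environ["PATH"])'
The first prints a PATH with the environment's bin directory prepended (plus any changes made by activate.d scripts); the second prints your shell's PATH unchanged.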

Related

Conda: what happens when you activate an environment?

How does running source activate <env-name> update the $PATH variable? I've been looking at the CONDA-INSTALLATION/bin/activate script and do not understand how conda updates my $PATH variable to include the bin directory for the recently activated environment. Nowhere can I find the code that conda uses to prepend my $PATH variable.
Disclaimer: I am not a conda developer, and I'm not a Bash expert. The following explanation is based on me tracing through the code, and I hope I got it all right. Also, all of the links below are permalinks to the master commit at the time of writing this answer (7cb5f66). Behavior/lines may change in future commits. Beware: Deep rabbit hole ahead!
Note that this explanation is for the command source activate env-name, but in conda>=4.4, the recommended way to activate an environment is conda activate env-name. If you use conda activate env-name, you can pick up the explanation at the point where we get to the cli.main function.
For conda >=4.4,<4.5, looking at CONDA_INST_DIR/bin/activate, we find the following on the last two lines (GitHub link):
. "$_CONDA_ROOT/etc/profile.d/conda.sh" || return $?
_conda_activate "$@"
The first line sources the script conda.sh in the $_CONDA_ROOT/etc/profile.d directory, and that script defines the _conda_activate bash function, to which we pass the arguments $@, i.e., all of the arguments that we passed to the activate script.
Taking the next step down the rabbit hole, we look at $_CONDA_ROOT/etc/profile.d/conda.sh and find (GitHub link):
_conda_activate() {
    # Some code removed...
    local ask_conda
    ask_conda="$(PS1="$PS1" $_CONDA_EXE shell.posix activate "$@")" || return $?
    eval "$ask_conda"
    _conda_hashr
}
The key is the line ask_conda=..., and particularly $_CONDA_EXE shell.posix activate "$@". Here, we are running the conda executable with the arguments shell.posix, activate, and then the rest of the arguments that were passed to this function (i.e., the environment name that we want to activate).
Another step into the rabbit hole... From here, the conda executable calls the cli.main function and since the first argument starts with shell., it imports the main function from conda.activate. This function creates an instance of the Activator class (defined in the same file) and runs the execute method.
The execute method processes the arguments and stores the passed environment name into an instance variable, then decides that the activate command has been passed, so it runs the activate method.
Another step into the rabbit hole... The activate method calls the build_activate method, which calls another function to process the environment name and find the environment prefix (i.e., which folder the environment is in). The build_activate method then adds the prefix to the PATH via the _add_prefix_to_path method. Finally, it returns a dictionary of commands that need to be run to "activate" the environment.
And another step deeper... The dictionary returned from the build_activate method gets processed into shell commands by the _yield_commands method, which are passed into the _finalize method. The activate method returns the value from running the _finalize method which returns the name of a temp file. The temp file has the commands required to set all of the appropriate environment variables.
Now, stepping back out, in the activate.main function, the return value of the execute method (i.e., the name of the temp file) is printed to stdout. This temp file name gets stored in the Bash variable ask_conda back in the _conda_activate Bash function, and finally, the temp file is executed by the eval Bash function.
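Incidentally, you can see what conda would hand to eval without activating anything by running the same command the _conda_activate function runs (a sketch for conda >=4.4; depending on the version, this prints either the shell commands themselves or the name of a temp file containing them):
conda shell.posix activate python3_env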
Phew! I hope I got everything right. As I said, I'm not a conda developer, and far from a Bash expert, so please excuse any explanation shortcuts I took that aren't 100% correct. Just leave a comment, I'll be happy to fix it!
I should also note that the recommended method to activate environments in conda >=4.4 is conda activate env-name, which is one of the reasons this is so convoluted - the activation is mostly handled in Python now, whereas (I think) previously it was more-or-less handled directly in Bash/CMD.

PyCharm not updating with environment variables

When I use vim to update my environmental variables (in ~/.bashrc), PyCharm does not get the updates right away. I have to shut down the program, source ~/.bashrc again, and re-open PyCharm.
Is there any way to have PyCharm source the changes automatically (or without shutting down)?
When a process is created, it inherits the environment variables from its parent process (in your case, the OS session that launched PyCharm). If you change the environment variables at the parent level, the child process is not aware of the change.
PyCharm allows you to change the environment variables from the Run/Debug Configuration window:
Run > Edit Configurations > Environment Variables
In my case, PyCharm does not pick up env variables from .bashrc even after restarting.
PyCharm maintains its own copy of the environment variables, and those aren't sourced from the shell.
It seems that if PyCharm is launched from a virtualenv or a shell containing said variables, it will load with them; however, it is not dynamic.
The answer linked below has a settings.py script for the virtualenv to update and maintain settings. Whether this completely solves your question or not, I'm not sure.
Pycharm: set environment variable for run manage.py Task
I recently discovered a workaround on Windows. Close PyCharm, copy the command to run PyCharm directly from the shortcut, and rerun it in a new terminal window: cmd, cmder, etc.
C:\
λ "C:\Program Files\JetBrains\PyCharm 2017.2.1\bin\pycharm64.exe"
I know this is very late, but I encountered this issue as well and found the accepted answer tedious, as I had a lot of saved configurations already.
The solution a co-worker told me about is to add the environment variables to ~/.profile instead. I then had to restart my Linux machine, and PyCharm picked up the new values. (On OSX, I only needed to source ~/.profile and restart PyCharm completely.)
One thing to be aware of: another coworker said that PyCharm looks at ~/.bash_profile, so if you have that file, the environment variables need to be added there.
In case you are using the "sudo python" technique, be aware that sudo does not pass on the environment variables by default.
To correctly pass on the environment variables defined in the PyCharm launch configuration, use the -E switch:
sudo -E /path/to/python/executable "$@"
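A quick way to see the difference (a hypothetical session; whether -E is honored also depends on your sudoers policy):
export MY_FLAG=1
sudo python3 -c 'import os; print(os.environ.get("MY_FLAG"))'     # prints None
sudo -E python3 -c 'import os; print(os.environ.get("MY_FLAG"))'  # prints 1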
This is simply how environment variables work. If you change them, you have to re-source your .bashrc (or whatever file the environment variables are located in).
Another option is python-dotenv: keep the values in a .env file and load them at startup:
from dotenv import load_dotenv
load_dotenv(override=True)
Python-dotenv can interpolate variables using POSIX variable expansion.
With load_dotenv(override=True) or dotenv_values(), the value of a variable is the first of the values defined in the following list:
1. Value of that variable in the .env file.
2. Value of that variable in the environment.
3. Default value, if provided.
4. Empty string.
With load_dotenv(override=False), the value of a variable is the first of the values defined in the following list:
1. Value of that variable in the environment.
2. Value of that variable in the .env file.
3. Default value, if provided.
4. Empty string.
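A small demo of the override behavior described above (a sketch; assumes python-dotenv is installed, and the variable name is made up):
echo 'GREETING=from_dotenv' > .env
export GREETING=from_shell
python -c "from dotenv import load_dotenv; load_dotenv(override=True); import os; print(os.environ['GREETING'])"
# prints: from_dotenv
python -c "from dotenv import load_dotenv; load_dotenv(override=False); import os; print(os.environ['GREETING'])"
# prints: from_shell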

Ansible: How to globally set PATH for solaris

I am writing Ansible playbooks to setup and install our applications on Solaris servers.
The problem is that the (bash) scripts which I need to execute all assume that a certain directory lies on the PATH, namely /data/bin - which would normally not be a problem were it not for Ansible ignoring all the .profile and .bashrc config.
Now, I know that you can specify the environment for shell tasks via the environment flag, for example like this:
- shell: printenv
  environment:
    PATH: /usr/bin:/usr/sbin:/data/bin
This correctly puts the /data/bin folder on the PATH, and the printenv command displays it (or my bash scripts would run correctly).
There are two problems, however:
First, it is very annoying to have to specify the environment over and over again. I know that you can define the environment in some playbook base file variable and then reference that, but you still have to set environment: ... on every single shell task.
Second, the above example does not allow me to specify the path dynamically, e.g. as PATH: $PATH:/data/bin, because Ansible executes this in a way that does not resolve $PATH, so the command fails catastrophically. Essentially, this hard-coded value overrides any other changes to PATH.
I am looking for a solution where:
- the additional PATH entry is only added once, and
- the additional PATH entry does not override entries added by other tasks.
P.S. I found this nice explanation on how to do this on Linux, but it makes use of /etc/environment which does not exist on Solaris. (And /etc/profile is once again ignored by Ansible.)
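For reference, the "define it once in the playbook" idea mentioned above is usually written by setting environment at the play level and extending the PATH discovered during fact gathering (a sketch; assumes gather_facts is enabled so that ansible_env is populated):
- hosts: solaris_servers
  gather_facts: yes
  environment:
    PATH: "/data/bin:{{ ansible_env.PATH }}"
  tasks:
    - shell: printenv PATH
One caveat: ansible_env.PATH is captured once at fact-gathering time, so PATH entries added by later tasks are still not reflected.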
Try adding -o SendEnv=PATH to ssh_args in ansible.cfg. This requires that:
- the shell in which you run Ansible has /data/bin in PATH (or however Ansible allows you to modify the current/local PATH variable), and
- the remote machine has AcceptEnv set correctly.
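In ansible.cfg that could look like this (a sketch; the ControlMaster options are just Ansible's usual ssh_args defaults, with SendEnv appended):
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o SendEnv=PATH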

$_SERVER and $_ENV not available if running php from shell

I have set up a cron job that runs a script, cron/cron.php, once an hour.
This script simply reads a table to check which scripts should run at a given time.
So far no problem.
I just noticed that $_SERVER['DOCUMENT_ROOT'] and $_SERVER['SERVER_NAME'] are empty. The same goes for $_ENV['HOSTNAME'].
What can be the reason? I would prefer to keep my cron.php portable, so I am searching for a solution that works on every server.
Thanks in advance for any tips!
When the cron script is run, it's most likely executed by the php-cli binary and not the webserver.
$_SERVER entries are set by the webserver, here is the quote from $_SERVER page in the PHP manual:
$_SERVER is an array containing information such as headers, paths, and script locations. The entries in this array are created by the web server.
As there is no webserver involved with your cron script, these are not set. You can try this yourself by executing PHP on the command line:
php -r 'var_dump($_SERVER);'
It will output everything in $_SERVER as seen in your command-line environment; "DOCUMENT_ROOT" will most likely be an empty string, and "SERVER_NAME" will not be set at all.
The $_ENV superglobal does contain the environment variables of the system; it's just that "HOSTNAME" is not set as an environment variable by cron.
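You can inspect $_ENV the same way; note that $_ENV is only populated if the variables_order ini setting contains "E", whereas getenv() reads the real environment regardless:
php -r 'var_dump($_ENV);'
php -r 'var_dump(getenv("HOSTNAME"));'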
Further Considerations
I normally suggest not only creating the PHP cron script (as you did with cron/cron.php) but also creating a shell script that invokes the PHP script, and then using the shell script in the crontab. This allows you to modify the environment easily without reconfiguring the crontab or cron.php too often: you can set environment variables within that shell script, change the working directory, etc.
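A minimal sketch of such a wrapper (all paths and variable names are made up; adjust them to your layout):
#!/bin/sh
# cron-wrapper.sh: referenced from the crontab instead of cron.php directly.
# Set whatever the PHP script expects to find in its environment:
export HOSTNAME="$(hostname)"
export DOCUMENT_ROOT="/var/www/example"
cd "$DOCUMENT_ROOT" || exit 1
exec /usr/bin/php cron/cron.php
The crontab entry then becomes something like 0 * * * * /var/www/example/cron-wrapper.sh.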
If you want to make your cron.php script more portable, figure out what the injected environment dependencies are (e.g. the document root you have) and make those variable, e.g. with variables or a parameter object. Then create a section in your script where those variables are populated; the rest of your script can run based on them in an injected manner. This limits configuration changes to a very small part of your script and will allow you to write more reusable code.

Use Environment Variable in the same process that assigned it

I have an installer that assigns an environment variable using the setx command.
Afterwards, that installer invokes a command line that uses this environment variable, but in that context the variable is still empty.
If I invoke the command line independently the variable is read properly.
Why is that, and how can I overcome it?
I've experimented extensively with SETX. Variables set via SETX cannot be seen in the process or script that sets them, unless you programmatically re-read the pertinent Registry key.
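For example, in a cmd session (a sketch; the variable name is made up):
setx MY_VAR "hello"
rem %MY_VAR% is still undefined in this same session:
echo %MY_VAR%
rem ...but the value is already in the registry and can be re-read from there:
reg query "HKCU\Environment" /v MY_VAR
setx writes to HKCU\Environment (or the HKLM equivalent with the /M switch) and broadcasts a settings-change message to other applications, but it never updates the environment block of the calling process.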
