Conda: what happens when you activate an environment? - path

How does running source activate <env-name> update the $PATH variable? I've been looking in the CONDA-INSTALLATION/bin/activate script and do not understand how conda updates my $PATH variable to include the bin directory for the recently activated environment. No where can I find the code that conda uses to prepend my $PATH variable.

Disclaimer: I am not a conda developer, and I'm not a Bash expert. The following explanation is based on me tracing through the code, and I hope I got it all right. Also, all of the links below are permalinks to the master commit at the time of writing this answer (7cb5f66). Behavior/lines may change in future commits. Beware: Deep rabbit hole ahead!
Note that this explanation is for the command source activate env-name, but in conda>=4.4, the recommended way to activate an environment is conda activate env-name. I think if one uses conda activate env-name, you should pick up the explanation around the part where we get into the cli.main function.
For conda >=4.4,<4.5, looking at CONDA_INST_DIR/bin/activate, we find on the second to last and last lines (GitHub link):
. "$_CONDA_ROOT/etc/profile.d/conda.sh" || return $?
_conda_activate "$#"
The first line sources the script conda.sh in the $_CONDA_ROOT/etc/profile.d directory, and that script defins the _conda_activate bash function, to which we pass the arguments $# which is basically all of the arguments that we passed to the activate script.
Taking the next step down the rabbit hole, we look at $_CONDA_ROOT/etc/profile.d/conda.sh and find (GitHub link):
_conda_activate() {
# Some code removed...
local ask_conda
ask_conda="$(PS1="$PS1" $_CONDA_EXE shell.posix activate "$#")" || return $?
eval "$ask_conda"
_conda_hashr
}
The key is that line ask_conda=..., and particularly $_CONDA_EXE shell.posix activate "$#". Here, we are running the conda executable with the arguments shell.posix, activate, and then the rest of the arguments that got passed to this function (i.e., the environment name that we want to activate).
Another step into the rabbit hole... From here, the conda executable calls the cli.main function and since the first argument starts with shell., it imports the main function from conda.activate. This function creates an instance of the Activator class (defined in the same file) and runs the execute method.
The execute method processes the arguments and stores the passed environment name into an instance variable, then decides that the activate command has been passed, so it runs the activate method.
Another step into the rabbit hole... The activate method calls the build_activate method, which calls another function to process the environment name to find the environment prefix (i.e., which folder the environment is in). Finally, the build_activate method adds the prefix to the PATH via the _add_prefix_to_path method. Finally, the build_activate method returns a dictionary of commands that need to be run to "activate" the environment.
And another step deeper... The dictionary returned from the build_activate method gets processed into shell commands by the _yield_commands method, which are passed into the _finalize method. The activate method returns the value from running the _finalize method which returns the name of a temp file. The temp file has the commands required to set all of the appropriate environment variables.
Now, stepping back out, in the activate.main function, the return value of the execute method (i.e., the name of the temp file) is printed to stdout. This temp file name gets stored in the Bash variable ask_conda back in the _conda_activate Bash function, and finally, the temp file is executed by the eval Bash function.
Phew! I hope I got everything right. As I said, I'm not a conda developer, and far from a Bash expert, so please excuse any explanation shortcuts I took that aren't 100% correct. Just leave a comment, I'll be happy to fix it!
I should also note that the recommended method to activate environments in conda >=4.4 is conda activate env-name, which is one of the reasons this is so convoluted - the activation is mostly handled in Python now, whereas (I think) previously it was more-or-less handled directly in Bash/CMD.

Related

activating conda env vs calling python interpreter from conda env

What exactly is the difference between these two operations?
source activate python3_env && python my_script.py
and
~/anaconda3/envs/python3_env/bin/python my_script.py ?
It appears that activating the environment adds some variables to $PATH, but the second method seems to access all the modules installed in python3_env. Is there anything else going on under the hood?
You are correct, activating the environment adds some directories to the PATH environment variable. In particular, this will allow any binaries or scripts installed in the environment to be run first, instead of the ones in the base environment. For instance, if you have installed IPython into your environment, activating the environment allows you to write
ipython
to start IPython in the environment, rather than
/path/to/env/bin/ipython
In addition, environments may have scripts that add or edit other environment variables that are executed when the environment is activated (see the conda docs). These scripts can make arbitrary changes to the shell environment, including even changing the PYTHONPATH to change where packages are loaded from.
Finally, I wrote a very detailed answer of what exactly is happening in the code over there: Conda: what happens when you activate an environment? That may or may not still be up-to-date though. The relevant part of the answer is:
...the build_activate method adds the prefix to the PATH via the _add_prefix_to_path method. Finally, the build_activate method returns a dictionary of commands that need to be run to "activate" the environment.
And another step deeper... The dictionary returned from the build_activate method gets processed into shell commands by the _yield_commands method, which are passed into the _finalize method. The activate method returns the value from running the _finalize method which returns the name of a temp file. The temp file has the commands required to set all of the appropriate environment variables.
Now, stepping back out, in the activate.main function, the return value of the execute method (i.e., the name of the temp file) is printed to stdout. This temp file name gets stored in the Bash variable ask_conda back in the _conda_activate Bash function, and finally, the temp file is executed by the eval Bash function.
So you can see, depending on the environment, running conda activate python3_env && python my_script.py and ~/anaconda3/envs/python3_env/bin/python my_script.py may give very different results.

PyCharm not updating with environment variables

When I use vim to update my environmental variables (in ~/.bashrc), PyCharm does not get the updates right away. I have to shut down the program, source ~/.bashrc again, and re-open PyCharm.
Is there any way to have PyCharm source the changes automatically (or without shutting down)?
When any process get created it inherit the environment variables from it's parent process (the O.S. itself in your case). if you change the environment variables at the parent level, the child process is not aware of it.
PyCharm allows you to change the environment variables from the Run\Debug Configuration window.
Run > Edit Configurations > Environment Variables ->
In my case pycharm does not take env variables from bashrc even after restarting
Pycharm maintains it's own version of environment variables and those aren't sourced from the shell.
It seems that if pycharm is executed from a virtualenv or the shell containing said variables, it will load with them, however it is not dynamic.
the answer below has a settings.py script for the virtualenv to update and maintain settings. Whether this completely solves your question or not i'm not sure.
Pycharm: set environment variable for run manage.py Task
I recently discovered a workaround in windows. Close Pycharm, copy the command to run Pycharm directly from the shortcut, and rerun it in a new terminal window: cmd, cmder, etc.
C:\
λ "C:\Program Files\JetBrains\PyCharm 2017.2.1\bin\pycharm64.exe"
I know this is very late, but I encountered this issue as well and found the accepted answer tedious as I had a lot of saved configurations already.
The solution that a co-worker told me is to add the environment variables to ~/.profile instead. I then had to restart my linux machine and pycharm picked up the new values. (for OSX, I only needed to source ~/.profile and restart pycharm completely)
One thing to be aware is that another coworker said that pycharm would look at ~/.bash_profile so if you have that file, then you need the environment variables added there
In case you are using the "sudo python" technique, be aware that it does not by default convey the environment variables.
To correctly pass on the environment variables defined in the PyCharm launch configuration, use the -E switch:
sudo -E /path/to/python/executable "$#"
This is simply how environment variables work. If you change them you have to re-source your .bashrc (or whatever file the environment variables are located in).
from dotenv import load_dotenv
load_dotenv(override=True)
Python-dotenv can interpolate variables using POSIX variable expansion.
With load_dotenv(override=True) or dotenv_values(), the value of a variable is the first of the values defined in the following list:
Value of that variable in the .env file.
Value of that variable in the environment.
Default value, if provided.
Empty string.
With load_dotenv(override=False), the value of a variable is the first of the values defined in the following list:
Value of that variable in the environment
Value of that variable in the .env file.
Default value, if provided.
Empty string.

Reload environment variables PATH from chef recipes client

is it possible to reload $PATH from a chef recipe?
Im intrsted in the response about process signals given in the following thread:
How to have Chef reload global PATH
I dont understand very well that example that the omribahumi user gives.
I would like a clearer example with chef-client / recipe to understand,
with that he explains, seems it is possible with that workaround.
Thanks.
Well I see two reasons for this request:
add something to the path for immediate execution => easy, just update the ENV['PATH'] variable within the chef run.
Extend the PATH system wide to include something just installed.
For the 2, you may update /etc/environment file (for ubuntu) or add a file to /etc/profiled.d (better idea to keep control over it),
But obviously the new PATH variable won't be available to actually running processes (including your actual shell), it will work for process launched after the file update.
To explain a little more the link you gave what is done is:
create a file with export commands to set env variables
echo 'export MYVAR="my value"' > ~/my_environment
create a bash function loading env vars from a file
function reload_environment { source ~/my_environment; }
set a trap in bash to do something on a signal, here run the function when bash receive SIGHUP
trap reload_environment SIGHUP
Launch the function for a first sourcing of the env file, there's two way:
easy one: launch the function
reload_environment
complex one: Get the pid of your actual shell and send it a SIGHUP signal
kill -HUP `echo $$`
All of this is only for the current shell until you set this in your .bash_rc
Not exactly what you were asking for indeed, but I hope you'll understand there's no way to update context of an already running process.
The best you can do is: update the PATH with whatever method you wish (something in /etc/profile.d for exemple) and execute a wall (if chef run as root) to tell users to reload their envs
echo 'reload your shell env by executing: source /etc/profile' | wall
Once again, it could work for humans, not for other process already running, those will have to be restarted.

$_SERVER and $_ENV not available if running php from shell

I have set up a cron job to run once an hour a script cron/cron.php
This script simply reads a table to check which scripts should run at a given time.
So far no problem.
I just noticed that $_SERVER['DOCUMENT_ROOT'] and $_SERVER['SERVER_NAME'] is empty. Same to $_ENV['HOSTNAME']
What can be the reason? I would prefer to have my cron.php portable so I am searching for a solution which should work on every server.
Thanks in advance for any tips!
When the cron script is run, it's most likely executed by the php-cli binary and not the webserver.
$_SERVER entries are set by the webserver, here is the quote from $_SERVER page in the PHP manual:
$_SERVER is an array containing information such as headers, paths, and script locations. The entries in this array are created by the web server.
As there is no webserver involved with your cron script, these are not set. You can try this your own by executing php on the command-line:
php -r 'var_dump($_SERVER);'
it will output all settings in $_SERVER in your command-line environment, "DOCUMENT_ROOT" most likely will be an empty string and "SERVER_NAME" is not set at all.
The $_ENV superglobal contains the environment variables of the system specifically, it's just that "HOSTNAME" is not set as environment variable by the cron binary.
Further Considerations
I normally suggest to not only create the PHP cron script (as you did with cron/cron.php) but also to create a shell-script that invokes the php script. Then use the shell-script in the crontab. This allows you to modify the environment easily without re-configuring the crontab or the cron.php too often. You can then set environment variables within that shell script as well as changing the working directory etc.
If you want to make your cron.php script more portable, figure out what the injected environment dependencies are (e.g. the document root your have) and make those variable, e.g. with variables or a parameter object. Then create a section in your script where those variables are populated and the rest of your script can run based on them in an injected manner. This reduces configuration changes only to a very limited part of your script and will allow you to create more re-useable code.

Use Environment Variable in the same process that assigned it

I have an Installer that assigns an Environment variable using setx command
Afterwards, that installer invokes a command line that uses this enviroment variable but in that context the variable is still empty.
If I invoke the command line independently the variable is read properly.
Why is that? and how can I overcome that?
I've experimented extensively with SETX. Variables set via SETX cannot be seen in the process or script that sets them, unless you programmatically re-read the pertinent Registry key.

Resources