Why does a Playwright parsing script fail with "No module named 'playwright'" when run from cron?

The script runs on a VPS. If I run the code without cron, everything is fine and the sites are parsed. As soon as I add it to cron, everything breaks with errors. Here is what my log shows:
Traceback (most recent call last):
  File "/apars/nobr.py", line 39, in <module>
    povtor()
  File "/apars/nobr.py", line 9, in povtor
    x = obr_cnt()
  File "/apars/obrzka_count.py", line 8, in obr_cnt
    id_mobile = parser()
  File "/apars/parser_new.py", line 4, in parser
    from playwright.sync_api import sync_playwright
ModuleNotFoundError: No module named 'playwright'
Although everything is installed on Ubuntu, and playwright is listed in the output of pip3 list (screenshot omitted).
If I remove the entry that launches the program from cron, everything falls back into place. Perhaps someone has come across the same problem and can help me.

cron is probably running the job as a different user, so it resolves installed modules from a different site-packages directory.
You can either install playwright for the root user, or schedule the cron job to run as your current user. The second is probably the better option.
The other possibility is that you are running your tests in a virtual environment that cron doesn't have access to. Either way, the solution is to make sure that cron has access to all of the Python modules needed to run the script.
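One quick way to see which interpreter and environment cron is actually using is to schedule a temporary diagnostic entry; something along these lines (the schedule and log path are only examples, not from the question):
# Log which Python cron runs and whether playwright is importable (illustrative entry)
* * * * * python3 -c "import sys; print(sys.executable); import playwright" >> /tmp/cron_env.log 2>&1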
To schedule a job for a specific user, use crontab -u username -e.
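If the script lives in a virtual environment, a common fix is to point the crontab entry at that environment's interpreter directly; for example (the schedule and paths below are placeholders, not taken from the question):
# Run the parser every 30 minutes with the virtualenv's own Python
*/30 * * * * /home/youruser/venv/bin/python3 /apars/nobr.py >> /home/youruser/nobr.log 2>&1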

Related

running pycharm interpreter using nvidia-docker2

I'm working on Ubuntu 20. I've installed Docker and nvidia-docker2. In PyCharm, I've followed the JetBrains guide, but the advanced steps aren't consistent with what I see in my setup. I use PyCharm Professional 2022.2.
At this step of the guide, in the run options I additionally put --runtime=nvidia and --gpus=all.
Step 4 finishes the same as in the guide (almost, but it doesn't seem to affect anything, more on that later), and in step 5 I manually put the path to the interpreter in the virtual environment I created using the Dockerfile.
That way I am able to run nvidia-smi and see the GPU correctly, but I don't see any of the packages I installed during the Dockerfile build.
There is another option to connect the interpreter a little differently, in which I do see the packages, but I can't run nvidia-smi and torch.cuda.is_available() returns False.
The alternative, instead of doing it as in the guide, is this: I press the little down arrow to the left of the Add Interpreter button and then click Show all.
After that I can press the + button,
and then I can choose Docker,
which results in the difference in functionality mentioned above, and also in the path displayed (the first one is the first remote interpreter, top to bottom, and the second is the second, correspondingly).
Here, of course, is the effect of the first and of the second, correspondingly:
Here are the results of running the following code with the interpreter connected via the first method:
and here with the second:
Here is the Dockerfile if you want to take a look:
Has anyone configured this correctly and can help?
Thank you in advance.
P.S: if I run the Docker container from Services and enter the terminal, the command nvidia-smi works fine, importing torch works, and torch.cuda.is_available() returns True.
P.S.2:
What has worked for me for now is to change the Dockerfile to install torch directly with pip, without creating a conda environment.
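As a rough sketch of what such a Dockerfile change might look like (the base image tag and the torch packages here are assumptions, not from the original post):
# Hypothetical example: CUDA base image, torch installed with pip, no conda environment
FROM nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04
RUN apt-get update && apt-get install -y python3 python3-pip && rm -rf /var/lib/apt/lists/*
RUN pip3 install torch torchvision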
Then I set the path to python2.7 and I can run the code, but not debug it.
For running, the result is as expected (the packages list, as shown before, is still empty, but it works; I guess my IDE somehow cannot access the packages list of the remote interpreter in that case, I don't know why):
But the debugger outputs the following error:
Any suggestions for the debugger issue would also be welcome, although it is a different issue.
Please update to 2022.2.1 as it looks like a known regression that has been fixed.
Let me know if it still does not work well.

Airflow on windows 10 - Module not found errors

I'm new to data science and wanted to do a little tutorial, which requires Airflow, among other things. I installed it on Windows using Git Bash in VS Code. I tried running it, but it had a problem: it could not import sqlite3
(module not found error). I figured out that if I added the directory of sqlite3.py to the path it would run, but now it gives me a similar error: the pwd module is not found, from daemon.py:
File "C:\ProgramData\Anaconda3\lib\site-packages\daemon\daemon.py", line 18, in <module>
import pwd
ModuleNotFoundError: No module named 'pwd'
It seems strange to me that it can't find pwd. Obviously pwd works natively in both Git Bash and PowerShell; it seems like a basic, universal command. I'd love to learn more about what's going on. I don't want to end up having to add 100 things to the path just to get this program to run. I'd love any insights anyone can provide.
PS I'm using Anaconda.
It seems to be a side effect of spawning new Python daemons.
You can likely fix this by downgrading python-daemon:
pip install python-daemon==2.1.2
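As background (not part of the original answer): Python's pwd module is a Unix-only part of the standard library that reads the user/password database, and it is unrelated to the shell command pwd, which is why the import fails on Windows. A small illustration:
import sys

if sys.platform != "win32":
    import pwd  # standard-library module, available only on Unix-like systems
    print(pwd.getpwuid(0).pw_name)  # usually prints 'root'
else:
    print("the pwd module does not exist on Windows")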

how should I persistently save Julia packages in a Docker container

I'm running Julia on a Raspberry Pi 4. For what I'm doing I need Julia 1.5, and thankfully there is a Docker image of it here: https://github.com/Julia-Embedded/jlcross
My challenge is that, because this is a work-in-progress development I find myself adding packages here and there as I work. What is the best way to persistently save the updated environment?
Here are my problems:
I'm having a hard time wrapping my mind around volumes that will save packages from Julia's package manager and keep them around the next time I run the container
It seems kludgy to commit my docker container somehow every time I install a package.
Is there a consensus on the best way or maybe there's another way to do what I'm trying to do?
You can persist the state of downloaded & precompiled packages by mounting a dedicated volume into /home/your_user/.julia inside the container:
$ docker run --mount source=dot-julia,target=/home/your_user/.julia [OTHER_OPTIONS]
Depending on how (and by which user) julia is run inside the container, you might have to adjust the target path above to point to the first entry in Julia's DEPOT_PATH.
You can control this path by setting it yourself via the JULIA_DEPOT_PATH environment variable. Alternatively, you can check whether it is in a nonstandard location by running the following command in a Julia REPL in the container:
julia> println(first(DEPOT_PATH))
/home/francois/.julia
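For instance, you could make the depot location explicit and mount the volume there in one go (the volume name and path are illustrative):
$ docker run -e JULIA_DEPOT_PATH=/home/your_user/.julia \
    --mount source=dot-julia,target=/home/your_user/.julia \
    [OTHER_OPTIONS]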
You can manage the packages and their versions via a Julia Project.toml file.
This file keeps both the list of your dependencies and the version constraints for them.
Here is a sample Julia session:
julia> using Pkg
julia> pkg"generate MyProject"
Generating project MyProject:
MyProject\Project.toml
MyProject\src/MyProject.jl
julia> cd("MyProject")
julia> pkg"activate ."
Activating environment at `C:\Users\pszufe\myp\MyProject\Project.toml`
julia> pkg"add DataFrames"
Now the last step is to provide package version information in your Project.toml file. We start by checking the version number that works well:
julia> pkg"st DataFrames"
Project MyProject v0.1.0
Status `C:\Users\pszufe\myp\MyProject\Project.toml`
[a93c6f00] DataFrames v0.21.7
Now you want to edit the [compat] section of the Project.toml file to pin that version number to always be v0.21.7:
name = "MyProject"
uuid = "5fe874ab-e862-465c-89f9-b6882972cba7"
authors = ["pszufe <pszufe#******.com>"]
version = "0.1.0"
[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
[compat]
DataFrames = "= 0.21.7"
Note that in the last line the equals sign appears twice in order to pin the exact version number; see also https://julialang.github.io/Pkg.jl/v1/compatibility/.
Now, in order to reuse that structure (e.g. in a different Docker container, when moving between systems, etc.), all you do is:
cd("MyProject")
using Pkg
pkg"activate ."
pkg"instantiate"
Additional note
Also have a look at the JULIA_DEPOT_PATH variable (https://docs.julialang.org/en/v1/manual/environment-variables/).
When moving installations between containers, it can also be convenient to control where all your packages are actually installed. For example, you might want to copy the JULIA_DEPOT_PATH folder between two containers that have the same Julia installation to avoid the time spent installing packages, or you could be building the Docker image with no internet connection, etc.
In my Dockerfile I simply install the packages just like you would do with pip:
FROM jupyter/datascience-notebook
RUN julia -e 'using Pkg; Pkg.add.(["CSV", "DataFrames", "DataFramesMeta", "Gadfly"])'
Here I start with a base data science notebook which includes Julia, and then call Julia from the command line, instructing it to execute the code needed to install the packages. The only downside for now is that package precompilation is triggered each time I load the container in VS Code.
If I need new packages, I simply add them to the list.
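If the repeated precompilation becomes annoying, one possible variation (untested, so treat it as an assumption) is to trigger precompilation at build time as well:
RUN julia -e 'using Pkg; Pkg.add.(["CSV", "DataFrames", "DataFramesMeta", "Gadfly"]); Pkg.precompile()'
Whether this actually removes the delay depends on whether the depot baked into the image is the one the container uses at run time.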

Reload the PATH environment variable from a Chef recipe (chef-client)

Is it possible to reload $PATH from a Chef recipe?
I'm interested in the response about process signals given in the following thread:
How to have Chef reload global PATH
I don't understand the example that the user omribahumi gives very well.
I would like a clearer example with chef-client / a recipe in order to understand it;
from what he explains, it seems it is possible with that workaround.
Thanks.
Well I see two reasons for this request:
add something to the path for immediate execution => easy, just update the ENV['PATH'] variable within the chef run.
Extend the PATH system wide to include something just installed.
For option 2, you may update the /etc/environment file (on Ubuntu) or add a file to /etc/profile.d (a better idea, to keep control over it).
But obviously the new PATH variable won't be available to already running processes (including your current shell); it will only apply to processes launched after the file update.
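As a rough sketch of both options in recipe form (the paths and file names below are examples, not from the original answer):
# Option 1: make a new directory visible for the remainder of this chef-client run only
ENV['PATH'] = "/opt/mytool/bin:#{ENV['PATH']}"

# Option 2: extend PATH system-wide for shells started after the change
file '/etc/profile.d/mytool_path.sh' do
  content 'export PATH="/opt/mytool/bin:$PATH"'
  mode '0644'
  owner 'root'
  group 'root'
end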
To explain the link you gave a little more, what is done there is:
create a file with export commands to set env variables
echo 'export MYVAR="my value"' > ~/my_environment
create a bash function loading env vars from a file
function reload_environment { source ~/my_environment; }
set a trap in bash to do something on a signal; here, run the function when bash receives SIGHUP
trap reload_environment SIGHUP
Launch the function for a first sourcing of the env file; there are two ways:
easy one: launch the function
reload_environment
complex one: Get the pid of your actual shell and send it a SIGHUP signal
kill -HUP `echo $$`
All of this applies only to the current shell, unless you set it up in your .bashrc.
Not exactly what you were asking for, indeed, but I hope you'll understand there's no way to update the context of an already running process.
The best you can do is update the PATH with whatever method you wish (something in /etc/profile.d, for example) and execute a wall (if chef runs as root) to tell users to reload their environment:
echo 'reload your shell env by executing: source /etc/profile' | wall
Once again, it could work for humans, not for other process already running, those will have to be restarted.

interactive docker build from dockerfile?

I want to use a Dockerfile to build an image. However, some commands need user input as they run. Currently, the build is not successful because docker exits on user input. I know I can use the -i and -t options on the docker run command, but I want to do that in a Dockerfile. How is that possible?
You can try with expect or a similar tool.
The easiest way to configure it is using the autoexpect tool, which lets you run the commands interactively and creates an expect script for you.
I couldn't get the rvmsudo stuff working (I haven't used it and didn't want to spend too much time with it), so I decided to use vi instead. First, run autoexpect:
$ autoexpect vi test
This will open vi, and you can create or edit the file and save it. After exiting vi you'll see your file test as well as an expect script, script.exp.
You can then remove the test file and execute script.exp. It will recreate the same file using the same steps.
The autoexpect tool is great, but you may have to create a script from scratch if you need more control over what happens, e.g. if you don't want the script to depend on the exact expected input.
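To tie this back to the original question, one possible layout (the base image, package, and file names are illustrative assumptions) is to copy the generated expect script into the image and replay it at build time:
# Illustrative sketch: drive an interactive step non-interactively during the build
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y expect && rm -rf /var/lib/apt/lists/*
COPY script.exp /tmp/script.exp
# expect replays the recorded interactive session, so no TTY is needed at build time
RUN expect -f /tmp/script.exp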
