I'm working on a PoC to run Airflow on Kubernetes.
I'm missing a pip package, and I'm adding it through a shell opened with kubectl.
That works: when I look in the shell and run pip list, I see the new package.
I'm adding it to the Airflow webserver.
But the webserver UI still gives me an error about the missing package.
What should I do to make the webserver pick up the new package?
Thanks in advance
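For reference, the workflow described above looks roughly like this (a sketch; the pod name, namespace and package are placeholders):
# open a shell in the webserver pod
kubectl exec -it airflow-webserver-0 -n airflow -- /bin/bash
# inside the pod: install the package and verify it is visible
pip install <missing-package>
pip list | grep <missing-package>
Note that a package installed this way only exists inside that one container, and the already-running webserver process typically won't pick it up until it is restarted.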
I am running aws-mwaa-local-runner in order to run a local Apache Airflow environment (in Docker for Windows).
However, after creating the container using ./mwaa-local-env start, I repeatedly get a Broken DAG ModuleNotFoundError. When I compare my /docker/config/requirements.txt file (see here, although my file has a few more requirements that I need in it) with the output of the pip freeze command run in the Airflow container, I can see that the requirements I need for my DAGs are missing.
I tried to pip install my other requirements in the Airflow container, but to no avail.
Is there a way to modify the docker-compose-local.yml file so that it installs everything in my requirements.txt when creating the container (i.e. when running Airflow)?
Is there maybe something I might be missing? Any help or suggestion would be greatly appreciated.
Look at this: https://github.com/aws/aws-mwaa-local-runner . You should install the requirements file located in dags locally:
pip install -r requirements.txt
Add your extra requirements to dags/requirements.txt, not docker/config/requirements.txt. The former is installed every time you start the service, but the latter is only installed when you build or rebuild the image.
Additionally, keeping your added requirements separate is important because you will need to upload the list to your MWAA environment.
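For example (a sketch; the package name is only a placeholder for whatever your DAGs import), adding a dependency to dags/requirements.txt and restarting the local runner looks roughly like this:
# append the missing dependency to the file that is installed on every start
echo "apache-airflow-providers-amazon" >> dags/requirements.txt
# restart the local environment so the requirement is picked up
./mwaa-local-env start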
I have set up some Python scripts in Jenkins on an AWS/Ubuntu server.
However, when I run a job, my IP address http://3.82.243.44:8080/ becomes inaccessible (the page just keeps spinning), and I can't do anything within the Jenkins app.
My AWS instance is showing as Running, so I don't think it's an issue there.
This is what I most recently installed on it:
sudo apt-get install python3-pip
And this is what I'm trying to build (a custom Python build) in Jenkins:
pip3 install -r requirements.txt
sbase install chromedriver latest
pytest --headless
If anyone has experience with this and an idea of what I may be doing wrong, please let me know.
The same thing is happening to me. The only solution I have found for the moment is to stop the instance from the AWS console and start it again. I'm still looking for a good solution to this problem, but I'm new to this "world".
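If you don't want to go through the console every time, the same stop/start workaround can be done with the AWS CLI (a sketch; the instance ID is a placeholder and a configured AWS CLI is assumed). Keep in mind that a stop/start cycle usually changes the instance's public IP unless an Elastic IP is attached.
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0
aws ec2 start-instances --instance-ids i-0123456789abcdef0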
Background: I am new to the Python world and am using Plotly for creating basic graphs in Python. I am using AWS SageMaker's JupyterLab for creating the Python scripts.
Issue: I have been trying to run the basic code samples from Plotly's website, however even those are returning blank graphs.
Resolution steps I have already tried myself:
pip installed plotly version 4.6.0
Steps mentioned on https://plotly.com/python/getting-started/ for JupyterLab support have already been executed
Code Example:
import plotly.graph_objects as go
fig = go.Figure(data=go.Bar(y=[2, 3, 1]))
fig.show()
I recently had the same issue. The simple change suggested here helped me. I know this is a temporary workaround until a proper fix is found.
# fig = go.Figure()
fig = go.FigureWidget()  # replace with this
# fig.show()
fig  # remove .show()
SageMaker notebook instances are, for some reason, using jupyterlab==1.2.21 (as of Jan 2022). You can verify that by running pip freeze | grep lab from the terminal, or !pip freeze | grep lab from a notebook.
According to the documentation, you'll need to install the following JupyterLab extensions (which would not be needed if SageMaker were running JupyterLab 3):
jupyterlab-plotly
@jupyter-widgets/jupyterlab-manager
You can install those on an up-and-running instance by running
jupyter labextension install jupyterlab-plotly@5.5.0 @jupyter-widgets/jupyterlab-manager in the terminal or a notebook (prefixed with ! if you are running it in the notebook, of course). Notice that the jupyterlab-plotly extension version (here 5.5.0) should match the plotly version you are installing; mismatches may cause issues. In this case my plotly version is 5.5.0, and thus that's also the jupyterlab-plotly version I've installed.
If you need, like I did, to have it ready upon spinning up a notebook instance, you'll need to:
Create a lifecycle script
To it, add the following (see the sketch after this list):
PATH=$PATH:/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin - to ensure the nodejs path, which is needed for the extension installation
pip install plotly==5.5.0 - to pin a specific version
jupyter labextension install jupyterlab-plotly@5.5.0 @jupyter-widgets/jupyterlab-manager - to ensure the same version
Of course, you can change the versions to the most up-to-date ones.
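Putting those steps together, an on-start lifecycle script might look roughly like the sketch below. It only restates the lines above; the path and versions are the ones quoted in this answer and may differ in your environment.
#!/bin/bash
set -e
# make the nodejs bundled with the JupyterSystemEnv conda env visible; needed to build the extension
export PATH=$PATH:/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin
# pin plotly to a specific version
pip install plotly==5.5.0
# install the matching JupyterLab extensions (versions should match the installed plotly)
jupyter labextension install jupyterlab-plotly@5.5.0 @jupyter-widgets/jupyterlab-manager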
I think that documentation is not up to date. You now need to install the jupyterlab-plotly extension.
jupyter labextension install jupyterlab-plotly
UPDATE
I followed a mix of instructions here and here.
First, enable the Extension Manager from JupyterLab,
then, from the terminal:
conda install -c conda-forge "nbformat" "ipywidgets>=7.5" -y
jupyter labextension install jupyterlab-plotly
jupyter labextension install @jupyter-widgets/jupyterlab-manager plotlywidget
And within your environment
conda install nbformat
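To confirm the extensions were actually registered (a quick sanity check, not part of the original instructions), you can list them from the same terminal:
jupyter labextension list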
I am trying to set up a docker-compose architecture for local development and production, and I can't figure out at which point in the containers' life it is best to install library dependencies. At the same time, I am not sure whether these should be placed in the container or in an external volume.
All my code is mounted in external volumes, so that changes are immediately taken into account without rebuilding the containers, but I am not sure about the libraries that need to be installed by pip (I am running a Python backend) and npm/yarn (for the webpack front-end).
Placing requirements.txt and package.json into the containers and running pip install and yarn install in the container build process means that I have to rebuild the container any time the dependencies change - that is too much overhead.
Putting them in an external volume and running pip install and yarn install as part of the command of each container when it is started seems to solve the issue.
The build process of each container then contains only platform dependencies (e.g. installing Python, webpack or other platform tools), but the libraries are installed after the container has started (with the CMD directive).
Is this the correct approach? I have seen a lot of examples doing exactly the opposite, running npm install in the build process of the container - but I don't see any advantage to that; am I missing something?
Installing dependencies is usually part of the build process. Mounting code is a good trick when developing, in order to get changes reflected directly.
Concerning adding requirements.txt or package.json: installing dependencies takes time, and for that you need to take advantage of Docker layer caching. In particular, you want to avoid cache invalidation.
For pip I suggest the following during the development phase: for dependencies that you are unlikely to change, install them in a separate RUN instruction. Your Dockerfile will look something like:
FROM ..
# stable dependencies in their own layer, so the cache is reused across builds
RUN pip install package1 package2 package3 ...
# only the frequently changing dependencies live in requirements.txt
ADD requirements.txt requirements.txt
RUN pip install -r requirements.txt
...
Keep only the dependencies that might change in requirements.txt. Once you are done developing, add the packages back to requirements.txt and build using the requirements file.
A similar approach would be adding two requirements files and combining them at the end.
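A minimal sketch of that combination step (the file names requirements-base.txt and requirements-dev.txt are hypothetical):
# stable dependencies that rarely change
cat requirements-base.txt > requirements.txt
# frequently changing dependencies added during development
cat requirements-dev.txt >> requirements.txt
pip install -r requirements.txt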
So I figured out the basics of Apache Airflow and I can run DAGs/tasks on my computer (so sleek!). However, I want to be able to have these run while my computer is off - so I bought a $5/month Lightsail instance and tried to install Airflow on it with pip install airflow.
I keep getting the attached output. It seems as though there isn't enough memory on the instance to finish the command or something, but I feel like if that were true, it would output an error message...
Thoughts?
I've found an answer to my own question. I tried out the solution provided for this question, and it worked:
First - I created a virtual environment and entered it by typing these commands into the command line:
virtualenv my-env
source my-env/bin/activate
Second - Once in the virtual environment, instead of entering pip install airflow, I entered pip --no-cache-dir install airflow. This worked to avoid the memory error!