Permission error accessing USB from homeassistant docker

I am running homeassistant in a docker container on a RPi 4 with Raspbian. I am using tribut's scripts to avoid the need to run the docker image as root. This all works dandy. But now I am trying to add the dsmr integration and I am not succeeding. The integration requires connecting to the "Slimme meter" via USB. However, I get a permission error. My knowledge of both docker and linux privileges is too limited to know where to start debugging this. Does anyone have some pointers for me?
This is the error message homeassistant is throwing at me:
Logger: homeassistant.components.dsmr
Source: components/dsmr/config_flow.py:93
Integration: dsmr (documentation, issues)
First occurred: 21:18:17 (1 occurrences)
Last logged: 21:18:17
Error connecting to DSMR
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/serial/serialposix.py", line 322, in open
self.fd = os.open(self.portstr, os.O_RDWR | os.O_NOCTTY | os.O_NONBLOCK)
PermissionError: [Errno 13] Permission denied: '/dev/ttyUSB0'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/src/homeassistant/homeassistant/components/dsmr/config_flow.py", line 93, in validate_connect
transport, protocol = await asyncio.create_task(reader_factory())
File "/usr/local/lib/python3.9/site-packages/serial_asyncio/__init__.py", line 445, in create_serial_connection
serial_instance = serial.serial_for_url(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/serial/__init__.py", line 90, in serial_for_url
instance.open()
File "/usr/local/lib/python3.9/site-packages/serial/serialposix.py", line 325, in open
raise SerialException(msg.errno, "could not open port {}: {}".format(self._port, msg))
serial.serialutil.SerialException: [Errno 13] could not open port /dev/ttyUSB0: [Errno 13] Permission denied: '/dev/ttyUSB0'

After some research I figured out that I needed to add the user to an extra group named dialout, because only members of that group are allowed to access the USB ports (as well as some other devices).
First I figured out the group id of the dialout group on the host machine (the machine running the docker container) by running
cat /etc/group | grep dialout
It returned 20 in my case. Luckily, tribut's script has the option to add the user to an extra group via the environment variable EXTRA_GID. So, when using that script, the relevant lines in the docker-compose file for accessing USB are:
devices:
  - /dev/ttyUSB0:/dev/ttyUSB0
environment:
  - PUID=1000
  - PGID=1000
  - EXTRA_GID=20 # this is the ID of the group 'dialout'
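To double-check that everything lines up, these commands can help (a minimal sketch; the container name homeassistant is an assumption, substitute your own):
ls -l /dev/ttyUSB0            # on the host: the device should belong to group 'dialout'
getent group dialout          # confirms the GID of dialout (20 here)
docker exec homeassistant id  # inside the container: the user's groups should include 20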

Related

Airflow DockerOperator unable to mount tmp directory correctly

I am trying to run a simple python script within a docker run command scheduled with Airflow.
I have followed the instructions here: Airflow init.
My .env file:
AIRFLOW_UID=1000
AIRFLOW_GID=0
And the docker-compose.yaml is based on the default one. I had to add /var/run/docker.sock:/var/run/docker.sock as an additional volume to run Docker inside of Docker.
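The relevant volumes section then looks roughly like this (a sketch of the fragment only; the dags/logs/plugins entries come from the default Airflow compose file):
volumes:
  - ./dags:/opt/airflow/dags
  - ./logs:/opt/airflow/logs
  - ./plugins:/opt/airflow/plugins
  - /var/run/docker.sock:/var/run/docker.sock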
My DAG is configured as follows:
""" this is an example dag """
from datetime import timedelta
from airflow import DAG
from airflow.operators.docker_operator import DockerOperator
from airflow.utils.dates import days_ago
from docker.types import Mount
default_args = {
'owner': 'airflow',
'depends_on_past': False,
'email': ['info#foo.com'],
'email_on_failure': True,
'email_on_retry': False,
'retries': 10,
'retry_delay': timedelta(minutes=5),
}
with DAG(
'msg_europe_etl',
default_args=default_args,
description='Process MSG_EUROPE ETL',
schedule_interval=timedelta(minutes=15),
start_date=days_ago(0),
tags=['satellite_data'],
) as dag:
download_and_store = DockerOperator(
task_id='download_and_store',
image='satellite_image:latest',
auto_remove=True,
api_version='1.41',
mounts=[Mount(source='/home/archive_1/archive/satellite_data',
target='/app/data'),
Mount(source='/home/dlassahn/projects/forecast-system/meteoIntelligence-satellite',
target='/app')],
command="python3 src/scripts.py download_satellite_images "
"{{ (execution_date - macros.timedelta(hours=4)).strftime('%Y-%m-%d %H:%M') }} "
"'msg_europe' ",
)
download_and_store
The Airflow log:
[2021-08-03 17:23:58,691] {docker.py:231} INFO - Starting docker container from image satellite_image:latest
[2021-08-03 17:23:58,702] {taskinstance.py:1501} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/client.py", line 268, in _raise_for_status
response.raise_for_status()
File "/home/airflow/.local/lib/python3.6/site-packages/requests/models.py", line 943, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http+docker://localhost/v1.41/containers/create
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1157, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1331, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 1361, in _execute_task
result = task_copy.execute(context=context)
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/docker/operators/docker.py", line 319, in execute
return self._run_image()
File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/docker/operators/docker.py", line 258, in _run_image
tty=self.tty,
File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/container.py", line 430, in create_container
return self.create_container_from_config(config, name)
File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/container.py", line 441, in create_container_from_config
return self._result(res, True)
File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/client.py", line 274, in _result
self._raise_for_status(response)
File "/home/airflow/.local/lib/python3.6/site-packages/docker/api/client.py", line 270, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/home/airflow/.local/lib/python3.6/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 400 Client Error for http+docker://localhost/v1.41/containers/create: Bad Request ("invalid mount config for type "bind": bind source path does not exist: /tmp/airflowtmp037k87u6")
Trying to set mount_tmp_dir=False yields a DAG ImportError because of the unknown keyword argument mount_tmp_dir (this might be an issue for the documentation).
Nevertheless I do not know how to configure the tmp directory correctly.
My Airflow Version: 2.1.2
There was a bug in Docker Provider 2.0.0 which prevented the Docker Operator from running with a Docker-in-Docker solution.
You need to upgrade to the latest Docker Provider 2.1.0:
https://airflow.apache.org/docs/apache-airflow-providers-docker/stable/index.html#id1
You can do it by extending the image as described in https://airflow.apache.org/docs/docker-stack/build.html#extending-the-image with, for example, this Dockerfile:
FROM apache/airflow
RUN pip install --no-cache-dir apache-airflow-providers-docker==2.1.0
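Building the extended image and pointing the default compose file at it could look like this (a sketch; the tag my-airflow is arbitrary, and AIRFLOW_IMAGE_NAME is the variable the default docker-compose.yaml reads for the image name):
docker build -t my-airflow:2.1.2 .
AIRFLOW_IMAGE_NAME=my-airflow:2.1.2 docker-compose up -d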
The operator will work out of the box in this case in "fallback" mode (with a warning message), but you can also disable the mount that causes the problem. More explanation from https://airflow.apache.org/docs/apache-airflow-providers-docker/stable/_api/airflow/providers/docker/operators/docker/index.html:
By default, a temporary directory is created on the host and mounted
into a container to allow storing files that together exceed the
default disk size of 10GB in a container. In this case The path to the
mounted directory can be accessed via the environment variable
AIRFLOW_TMP_DIR.
If the volume cannot be mounted, warning is printed and an attempt is
made to execute the docker command without the temporary folder
mounted. This is to make it works by default with remote docker engine
or when you run docker-in-docker solution and temporary directory is
not shared with the docker engine. Warning is printed in logs in this
case.
If you know you run DockerOperator with remote engine or via
docker-in-docker you should set mount_tmp_dir parameter to False. In
this case, you can still use mounts parameter to mount already
existing named volumes in your Docker Engine to achieve similar
capability where you can store files exceeding default disk size of
the container,
I had the same issue, and all "recommended" ways of solving it here, including setting up the mount_tmp_dir params as described here, just led to other errors. The one solution that helped me was wrapping the code invoked by Docker with the VPN (this hack was actually taken from another Docker-powered DAG that used a VPN and worked well).
So the final solution looks like:
#!/bin/bash
connect_to_vpn.sh &
sleep 10
python3 my_func.py
sleep 10
stop_vpn.sh
wait -n
exit $?
To connect to the VPN I used openconnect. The tool can be installed with apt install and supports the AnyConnect protocol (which was my crucial requirement).
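For completeness, the wrapper runs as the entrypoint of the task image, along these lines (a sketch; the base image and the entrypoint.sh name are assumptions, while connect_to_vpn.sh, stop_vpn.sh and my_func.py are the files from the script above):
FROM python:3.9-slim
RUN apt-get update \
    && apt-get install -y --no-install-recommends openconnect \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY entrypoint.sh connect_to_vpn.sh stop_vpn.sh my_func.py /app/
ENTRYPOINT ["bash", "entrypoint.sh"]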

Celery with redis: instance state changed (master -> replica?)

I am using celery for scheduled tasks and a redis server for data backup within docker containers. My jobs are running correctly sometimes, but I get the following error randomly and the celery beat task can no longer progress.
[2020-09-16 21:01:07,863: CRITICAL/MainProcess] Unrecoverable error: ResponseError('UNBLOCKED force unblock from blocking operation, instance state changed (master -> replica?)',)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/celery/worker/worker.py", line 205, in start
self.blueprint.start(self)
File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 369, in start
return self.obj.start()
File "/usr/local/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 318, in start
blueprint.start(self)
File "/usr/local/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/usr/local/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 599, in start
c.loop(*c.loop_args())
File "/usr/local/lib/python3.6/site-packages/celery/worker/loops.py", line 83, in asynloop
next(loop)
File "/usr/local/lib/python3.6/site-packages/kombu/asynchronous/hub.py", line 364, in create_loop
cb(*cbargs)
File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", line 1088, in on_readable
self.cycle.on_readable(fileno)
File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", line 359, in on_readable
chan.handlers[type]()
File "/usr/local/lib/python3.6/site-packages/kombu/transport/redis.py", line 739, in _brpop_read
**options)
File "/usr/local/lib/python3.6/site-packages/redis/client.py", line 892, in parse_response
response = connection.read_response()
File "/usr/local/lib/python3.6/site-packages/redis/connection.py", line 752, in read_response
raise response
redis.exceptions.ResponseError: UNBLOCKED force unblock from blocking operation, instance state changed (master -> replica?)
Any help will be appreciated. Let me know in case you need more details.
As I stated above, the issue is happening randomly and perturbs our app in production, so I decided to spend time on a solution. I came across many propositions such as hardware issues (memory or CPU), but this one definitively solved the issue: I was not using authentication on the redis server. Those interested in setting a redis password easily in docker can refer to this Docker Tip. After setting a password on redis the URL looks like REDIS_URL=redis://user:myPass@localhost:6379
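A minimal compose sketch of what that change amounts to (service names and the password are placeholders, not my actual config):
services:
  redis:
    image: redis:6
    command: redis-server --requirepass myPass

  worker:
    build: .
    command: celery -A app worker --beat --loglevel=info
    environment:
      - REDIS_URL=redis://:myPass@redis:6379/0
    depends_on:
      - redis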
You can try this answer: https://stackoverflow.com/a/74141982/1635525
TLDR Adding restart: unless-stopped to your docker-compose helps to recover from celery crashes including the ones caused by redis downtime/maintenance.
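Applied to the sketch above, that is a one-line addition to the celery service (the service name worker is an assumption):
  worker:
    restart: unless-stopped  # restart the worker after crashes or redis hiccups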

Coral Dev Board - Can't get demo to run

I recently purchased a used Dev Board to start working on. I managed to flash it with Mendel 5 (Eagle), enabled SSH, and followed the instructions in the Get Started guide without any issue until I tried the demo.
I can't get it to work at all!
If I run edgetpu_demo --stream
I get:
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
Press 'q' to quit.
Press 'n' to switch between models.
Traceback (most recent call last):
File "/usr/bin/edgetpu_detect_server", line 11, in <module>
load_entry_point('edgetpuvision==1.0', 'console_scripts', 'edgetpu_detect_server')()
File "/usr/lib/python3/dist-packages/edgetpuvision/detect_server.py", line 33, in main
run_server(add_render_gen_args, render_gen)
File "/usr/lib/python3/dist-packages/edgetpuvision/apps.py", line 43, in run_server
camera = make_camera(args.source, next(gen), args.loop)
File "/usr/lib/python3/dist-packages/edgetpuvision/detect.py", line 144, in render_gen
engines, titles = utils.make_engines(args.model, DetectionEngine)
File "/usr/lib/python3/dist-packages/edgetpuvision/utils.py", line 53, in make_engines
engine = engine_class(model_path)
File "/usr/lib/python3/dist-packages/edgetpu/detection/engine.py", line 80, in __init__
super().__init__(model_path)
File "/usr/lib/python3/dist-packages/edgetpu/basic/basic_engine.py", line 92, in __init__
self._engine = BasicEnginePythonWrapper.CreateFromFile(model_path)
RuntimeError: No Edge TPU device detected!
running as root gives me:
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
Press 'q' to quit.
Press 'n' to switch between models.
job-working-directory: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
Unable to init server: Could not connect: Connection refused
Unable to init server: Could not connect: Connection refused
Unable to init server: Could not connect: Connection refused
(edgetpu_detect_server:7283): Gtk-WARNING **: 02:24:31.914: cannot open display:
Which seems closer, but I have no idea how to fix either. I've searched as much as I can (added the user to plugdev, made sure everything was installed correctly).
Any ideas? Pretty sure if this doesn't work something is fairly broken.
Thanks for reaching out, this is definitely not a good sign..
RuntimeError: No Edge TPU device detected!
Sounds like the board somehow lost access to the Edge TPU. Could you send us an email to coral-support@google.com on this issue?
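In the meantime, it may be worth checking whether Mendel still sees the on-board Edge TPU at all (a sketch; /dev/apex_0 and the gasket/apex modules are how Mendel normally exposes the PCIe Edge TPU, so treat the exact names as an assumption):
ls -l /dev/apex_0              # PCIe Edge TPU device node on the Dev Board
lsmod | grep -E 'gasket|apex'  # kernel modules that should be loaded
dmesg | grep -i apex           # driver messages, useful if the node is missing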

404 on all assets with Latest odoo 13 image

I created an instance of the latest official odoo image with postgres behind it and started using it. After recreating the container due to odoo config changes, I got a 404 on all assets, which leads to an empty white page.
So at the moment I can't use odoo anymore.
Any ideas?
This is the output of the odoo container. I get a few similar cache-related exceptions:
2020-08-07 07:58:02,193 1 INFO XXXX odoo.addons.base.models.ir_attachment: _read_file reading /var/lib/odoo/filestore/XXXXXX/02/02141d5e40545807c7f22145c21af626c109ddfb
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/odoo/api.py", line 745, in get
value = self._data[field][record._ids[0]]
KeyError: 286
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/odoo/fields.py", line 996, in
__get__
value = env.cache.get(record, self)
File "/usr/lib/python3/dist-packages/odoo/api.py", line 751, in get
raise CacheMiss(record, field)
odoo.exceptions.CacheMiss: ('ir.attachment(286,).datas', None)

Cannot start dask cluster over SSH

I'm trying to start a dask cluster over SSH, but I am encountering strange errors like these:
Exception in thread Thread-6:
Traceback (most recent call last):
File "/home/localuser/miniconda3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/home/localuser/miniconda3/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/home/localuser/miniconda3/lib/python3.6/site-packages/distributed/deploy/ssh.py", line 57, in async_ssh
banner_timeout=20) # Helps prevent timeouts when many concurrent ssh connections are opened.
File "/home/localuser/miniconda3/lib/python3.6/site-packages/paramiko/client.py", line 329, in connect
to_try = list(self._families_and_addresses(hostname, port))
File "/home/localuser/miniconda3/lib/python3.6/site-packages/paramiko/client.py", line 200, in _families_and_addresses
hostname, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
File "/home/localuser/miniconda3/lib/python3.6/socket.py", line 745, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
I'm starting the cluster like this:
$ dask-ssh --ssh-private-key ~/.ssh/cluster_id_rsa \
--hostfile ~/dask-hosts.txt \
--remote-python "~/miniconda3/bin/python3.6"
My dask-hosts.txt looks like this:
localuser@127.0.0.1
remoteuser@10.10.4.200
...
remoteuser@10.10.4.207
I get the same error with/without the localhost line.
I have checked the SSH setup; I can log in to all the nodes using a public key setup (the key is unencrypted, to avoid decryption prompts). What am I missing?
The error indicates that name resolution is the culprit. Most likely this is happening because of the inclusion of usernames in your dask-hosts.txt. According to its documentation, the host file should contain only hostnames/IP addresses:
--hostfile PATH    Textfile with hostnames/IP addresses
You can use --ssh-username to set a username (although only a single one).
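Concretely, something along these lines should work (a sketch based on the command above; it assumes the same remoteuser account exists on every node, so the localhost entry is dropped):
$ cat ~/dask-hosts.txt
10.10.4.200
10.10.4.207

$ dask-ssh --ssh-username remoteuser \
    --ssh-private-key ~/.ssh/cluster_id_rsa \
    --hostfile ~/dask-hosts.txt \
    --remote-python "~/miniconda3/bin/python3.6"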
