gcsfuse mount exits with status 1 with the right permissions - docker

I'm trying to mount my database to my container using gcsfuse, but when I run gcsfuse storage-name /media I get this error:
...
2022/08/07 20:33:42.908030 Mounting file system "development-videoo-storage1"...
daemonize.Run: readFromProcess: sub-process: mountWithArgs: mountWithConn: Mount: mount: running /usr/bin/fusermount: exit status 1
There is already a question posted about this topic which seems to describe a similar issue; however, I've already tried everything in it to no avail. I've also tried this question.
I ran getent passwd and saw that there is indeed a user called "www-data", so I gave it permissions using chown -R www-data:www-data /, manually created the media directory using mkdir, and then ran chown -R www-data:www-data media, but it still gives me the same error when I run gcsfuse storage-name /media.
I'm using Kubernetes, so I added this to my deployment.yaml file:
securityContext:
  privileged: true
  capabilities:
    add:
      - SYS_ADMIN
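One way to verify that this actually results in a usable fuse device inside the pod (pod name is a placeholder) is:
kubectl exec -it <pod-name> -- ls -l /dev/fuse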
But it still gives the same error. This is my Dockerfile, where I install gcsfuse:
FROM --platform=amd64 ubuntu:22.10
# Use baseimage-docker's init system.
CMD ["/sbin/my_init"]
# Install.
EXPOSE 80
RUN apt-get update -y && apt-get dist-upgrade -y && apt-get -y install lsb-release curl gnupg && apt -y install lsb-core
ENV GCSFUSE_REPO gcsfuse-stretch
USER root
RUN apt-get update -y && apt-get install -y --no-install-recommends apt-transport-https ca-certificates curl gnupg
RUN echo "deb http://packages.cloud.google.com/apt $GCSFUSE_REPO main" | tee /etc/apt/sources.list.d/gcsfuse.list
RUN echo "deb https://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list
RUN curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
# Install gcsfuse and google cloud sdk
RUN apt-get update -y && apt-get install -y gcsfuse google-cloud-sdk \
    && apt-get autoremove -y \
    && apt-get clean -y \
    && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
I've also checked the permissions for accessing the cloud storage (storage-name) and it says "Public access Not public", but my service account seems to have the right permissions (I'm not 100% sure whether it has the right permissions or not).
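One way to sanity-check the service account independently of gcsfuse, using the Cloud SDK installed in the Dockerfile above, is to list the bucket directly:
gcloud auth list
gsutil ls gs://storage-name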
I've also tried running gcsfuse -o rw,noauto,user,implicit_dirs,allow_other storage-name /media to be greeted with the same error...
Edit:
Here is the result of gcsfuse --implicit-dirs --foreground --debug_gcs --debug_fuse storage-name /media:
WARNING: gcsfuse invoked as root. This will cause all files to be owned by
root. If this is not what you intended, invoke gcsfuse as the user that will
be interacting with the file system.
2022/08/07 21:47:00.807929 Creating a new server...
2022/08/07 21:47:00.808134 Set up root directory for bucket development-videoo-storage1
2022/08/07 21:47:00.808381 OpenBucket("development-videoo-storage1", "")
gcs: 2022/08/07 21:47:00.808669 Req 0x0: <- ListObjects("")
gcs: 2022/08/07 21:47:00.900159 Req 0x0: -> ListObjects("") (91.477658ms): OK
gcs: 2022/08/07 21:47:00.900634 Req 0x1: <- ListObjects("")
gcs: 2022/08/07 21:47:00.932224 Req 0x1: -> ListObjects("") (31.587009ms): OK
2022/08/07 21:47:00.932631 Mounting file system "storage-name"...
/usr/bin/fusermount: fuse device not found, try 'modprobe fuse' first
mountWithArgs: mountWithConn: Mount: mount: running /usr/bin/fusermount: exit status 1
command terminated with exit code 1
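The last lines pin down the failure: /dev/fuse is missing inside the container. A minimal sketch of how to verify and supply the device, assuming access to the node and the ability to run the container directly with Docker:
# On the node: make sure the fuse kernel module is loaded
lsmod | grep fuse || sudo modprobe fuse
# When launching the container with Docker directly, pass the device in explicitly
docker run --device /dev/fuse --cap-add SYS_ADMIN <image>
On Kubernetes, privileged: true has to sit in the securityContext of the container (not the pod) for host devices to be exposed.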

Related

error: chmod on config.lock failed: Operation not permitted

I'm trying to deploy Atlantis on a Cloud Run Gen2 service with a GCS bucket mounted to it via gcsfuse.
Most seems to work fine, the atlantis server starts and can handle requests properly. Files are also written to the GCS bucket through gcsfuse.
But when Atlantis tries to clone a git repository (as part of the atlantis plan command) it returns the following error:
running git clone --branch f/gcsfuse-cloudrun --depth=1 --single-branch https://xxxxxxxx:<redacted>@github.com/xxxxxxxx/xxxxxxxx.git /app/atlantis/repos/xxxxxxxx/xxxxxxxx/29/default: Cloning into '/app/atlantis/repos/xxxxxxxx/xxxxxxxx/29/default'...
error: chmod on /app/atlantis/repos/xxxxxxxx/xxxxxxxx/29/default/.git/config.lock failed: Operation not permitted
fatal: could not set 'core.filemode' to 'false'
: exit status 128
I believe that I'm very close but I'm not too knowledgeable on Linux file system permissions.
My Dockerfile is as following:
FROM ghcr.io/runatlantis/atlantis:v0.21.1-pre.20221213-debian
USER root
# Install Python
ENV PYTHONUNBUFFERED=1
RUN apt-get update -y
RUN apt-get install -y python3 python3-pip
# Install system dependencies
RUN set -e; \
    apt-get update -y && apt-get install -y \
      tini \
      lsb-release; \
    gcsFuseRepo=gcsfuse-`lsb_release -c -s`; \
    echo "deb http://packages.cloud.google.com/apt $gcsFuseRepo main" | \
      tee /etc/apt/sources.list.d/gcsfuse.list; \
    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | \
      apt-key add -; \
    apt-get update; \
    apt-get install -y gcsfuse \
    && apt-get clean
# Set fallback mount directory
ENV MNT_DIR /app/atlantis
# Create mount directory for service
RUN mkdir -p ${MNT_DIR}
RUN chown -R atlantis /app/atlantis/
RUN chmod -R 777 /app/atlantis/
WORKDIR $MNT_DIR
# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY gcsfuse_run.sh ./
# Make the script an executable
RUN chmod +x /app/gcsfuse_run.sh
ENTRYPOINT ["/app/gcsfuse_run.sh"]
The entrypoint script referenced above is as follows:
#!/usr/bin/env bash
set -eo pipefail
echo "Mounting GCS Fuse to $MNT_DIR"
gcsfuse -o allow_other -file-mode=777 -dir-mode=777 --implicit-dirs --debug_gcs --debug_fuse $BUCKET $MNT_DIR
echo "Mounting completed."
# This is an Atlantis-provided docker script that comes from the base image
/usr/local/bin/docker-entrypoint.sh server
Help is highly appreciated!
We simulated the exact steps but didn't face the issue.
We also found the same type of issue reported in many places, and for those the solutions below worked:
Run the server with sudo permission.
Restart the system.
git config --global --replace-all core.fileMode false
The chmod operation is not supported by gcsfuse, so the suggestion by @tulsi-shah (git config --global --replace-all core.fileMode false) provides a workaround.
https://github.com/googlecloudplatform/gcsfuse/blob/master/docs/semantics.md#inodes
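A minimal sketch of that workaround applied to the entrypoint script above, assuming git is available in the image:
# In gcsfuse_run.sh, before starting the server: stop git from calling chmod on the gcsfuse mount
git config --global --replace-all core.fileMode false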

"docker:19.03-dind" could not select device driver "nvidia" with capabilities: [[gpu]]

I've got a K8s + DinD issue:
launch a Kubernetes cluster
start a main Docker image and a DinD image inside this cluster
when running a job requesting a GPU, I get the error: could not select device driver "nvidia" with capabilities: [[gpu]]
Full error
http://localhost:2375/v1.40/containers/long-hash-string/start: Internal Server Error ("could not select device driver "nvidia" with capabilities: [[gpu]]")
When I exec into the DinD container inside the K8s pod, nvidia-smi is not available.
After some debugging, it seems the DinD image is missing the NVIDIA Docker toolkit. I had the same error when I ran the same job directly with Docker on my local laptop, and fixed it there by installing nvidia-docker2 (sudo apt-get install -y nvidia-docker2).
I'm thinking maybe I can install nvidia-docker2 into the DinD 19.03 image (docker:19.03-dind), but I'm not sure how to do it. With a multi-stage Docker build?
Thank you very much!
Update:
pod spec:
spec:
  containers:
    - name: dind-daemon
      image: docker:19.03-dind
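As the self-answer below describes, the DinD sidecar also has to run privileged; a hedged sketch of the missing piece in this spec:
spec:
  containers:
    - name: dind-daemon
      image: docker:19.03-dind
      securityContext:
        privileged: true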
I got it working myself.
Referring to
https://github.com/NVIDIA/nvidia-docker/issues/375
https://github.com/Henderake/dind-nvidia-docker
First, I modified the ubuntu-dind image (https://github.com/billyteves/ubuntu-dind) to install nvidia-docker (i.e. added the instructions in the nvidia-docker site to the Dockerfile) and changed it to be based on nvidia/cuda:9.2-runtime-ubuntu16.04.
Then I created a pod with two containers: a frontend Ubuntu container and a privileged Docker daemon container as a sidecar. The sidecar's image is the modified one mentioned above.
Since that post is three years old, I spent quite some time matching up dependency versions, repository migrations over those years, etc.
My modified version of the Dockerfile to build it:
ARG CUDA_IMAGE=nvidia/cuda:11.0.3-runtime-ubuntu20.04
FROM ${CUDA_IMAGE}
ARG DOCKER_CE_VERSION=5:18.09.1~3-0~ubuntu-xenial
RUN apt-get update -q && \
    apt-get install -yq \
      apt-transport-https \
      ca-certificates \
      curl \
      gnupg-agent \
      software-properties-common && \
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - && \
    add-apt-repository \
      "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
      $(lsb_release -cs) \
      stable" && \
    apt-get update -q && apt-get install -yq docker-ce docker-ce-cli containerd.io
# https://github.com/docker/docker/blob/master/project/PACKAGERS.md#runtime-dependencies
RUN set -eux; \
    apt-get update -q && \
    apt-get install -yq \
      btrfs-progs \
      e2fsprogs \
      iptables \
      xfsprogs \
      xz-utils \
      # pigz: https://github.com/moby/moby/pull/35697 (faster gzip implementation)
      pigz \
      # zfs \
      wget
# set up subuid/subgid so that "--userns-remap=default" works out-of-the-box
RUN set -x \
    && addgroup --system dockremap \
    && adduser --system --ingroup dockremap dockremap \
    && echo 'dockremap:165536:65536' >> /etc/subuid \
    && echo 'dockremap:165536:65536' >> /etc/subgid
# https://github.com/docker/docker/tree/master/hack/dind
ENV DIND_COMMIT 37498f009d8bf25fbb6199e8ccd34bed84f2874b
RUN set -eux; \
    wget -O /usr/local/bin/dind "https://raw.githubusercontent.com/docker/docker/${DIND_COMMIT}/hack/dind"; \
    chmod +x /usr/local/bin/dind
##### Install nvidia docker #####
# Add the package repositories
RUN curl -fsSL https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add --no-tty -
RUN distribution=$(. /etc/os-release; echo $ID$VERSION_ID) && \
    echo $distribution && \
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
      tee /etc/apt/sources.list.d/nvidia-docker.list
RUN apt-get update -qq --fix-missing
RUN apt-get install -yq nvidia-docker2
RUN sed -i '2i \ \ \ \ "default-runtime": "nvidia",' /etc/docker/daemon.json
RUN mkdir -p /usr/local/bin/
COPY dockerd-entrypoint.sh /usr/local/bin/
RUN chmod 777 /usr/local/bin/dockerd-entrypoint.sh
RUN ln -s /usr/local/bin/dockerd-entrypoint.sh /
VOLUME /var/lib/docker
EXPOSE 2375
ENTRYPOINT ["dockerd-entrypoint.sh"]
#ENTRYPOINT ["/bin/sh", "/shared/dockerd-entrypoint.sh"]
CMD []
When I exec into the Docker-in-Docker container, I can now successfully run nvidia-smi (which previously returned a not-found error, after which no GPU-related docker run could work).
Feel free to pull my image at brandsight/dind:nvidia-docker.
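A quick smoke test of the result, assuming DOCKER_HOST points at the DinD daemon and the default-runtime edit above is in effect (the image tag is just an example):
docker run --rm nvidia/cuda:11.0.3-runtime-ubuntu20.04 nvidia-smi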

404 error for some packages when building docker image

When building a Docker image for a GitLab Runner base image, I'm getting this error:
ERRO[2021-12-29T09:46:32Z] Application execution failed PID=6622 error="executing the script on the remote host: executing script on container with IP \"3.x.x.x\": connecting to server: connecting to server \"3.x.x.x:x\" as user \"root\": dial tcp 3.x.x.x:x: connect: connection refused"
ERROR: Job failed (system failure): prepare environment: exit status 2. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
Dockerfile:
FROM registry.gitlab.com/tmaczukin-test-projects/fargate-driver-debian:latest
RUN apt-get install -y wget && \
    apt-get install -y python3-pip && \
    wget https://releases.hashicorp.com/terraform/0.12.24/terraform_0.12.24_linux_amd64.zip && \
    unzip terraform_0.12.24_linux_amd64.zip && \
    mv terraform /usr/local/bin && \
    chmod -R 777 /usr/local/bin
I'm assuming the error mentioned in the title comes from the apt-get install commands. You should run apt-get update first to get an updated package list; otherwise apt looks for packages from a stale state (whenever the base image was created). You can also merge the install commands and clean up temporary files in the same step to reduce the layer size.
FROM registry.gitlab.com/tmaczukin-test-projects/fargate-driver-debian:latest
RUN apt-get update && \
    apt-get install -y \
      python3-pip \
      wget && \
    wget https://releases.hashicorp.com/terraform/0.12.24/terraform_0.12.24_linux_amd64.zip && \
    unzip terraform_0.12.24_linux_amd64.zip && \
    mv terraform /usr/local/bin && \
    chmod -R 777 /usr/local/bin && \
    rm terraform_0.12.24_linux_amd64.zip && \
    rm -rf /var/lib/apt/lists/*
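Rebuilding should then show the apt step resolving packages instead of returning 404s (the tag is arbitrary):
docker build -t fargate-driver-debian .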

Docker build failing when using gcsfuse to mount google storage

I have been trying to mount SQL and a storage bucket to my Docker WordPress container. It appears to succeed in mounting SQL, but fails when mounting the bucket. The instance is based off of this post.
I have attached the Docker file and error below, as well as my build command.
Build command:
docker build -t ic/spm .
Dockerfile:
FROM wordpress
MAINTAINER Gareth Williams <gareth@itinerateconsulting.com>
# Move login creds locally
ADD ./creds.json /creds.json
# install sudo, wget and gcsfuse
ENV GCSFUSE_REPO=gcsfuse-jessie
RUN apt-get update && \
    apt-get -y install sudo && \
    apt-get install -y curl ca-certificates && \
    echo "deb http://packages.cloud.google.com/apt $GCSFUSE_REPO main" > /etc/apt/sources.list.d/gcsfuse.list && \
    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
    apt-get update && \
    apt-get install -y gcsfuse wget && \
    apt-get remove -y curl --purge && \
    apt-get autoremove -y && \
    rm -rf /var/lib/apt/lists/*
# Config fuse
RUN chmod a+r /etc/fuse.conf
RUN perl -i -pe 's/#user_allow_other/user_allow_other/g' /etc/fuse.conf
# Setup sql proxy
RUN sudo mkdir /cloudsql
RUN sudo chmod 777 /cloudsql
ADD https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64 cloud_sql_proxy.linux.amd64
RUN mv cloud_sql_proxy.linux.amd64 cloud_sql_proxy && chmod +x ./cloud_sql_proxy
RUN ./cloud_sql_proxy -dir=/cloudsql -fuse -credential_file=/creds.json &
# mysql -u icroot -S /cloudsql/[INSTANCE_CONNECTION_NAME]
# Perform Cloud Storage FUSE mounting for uploads folder
RUN mkdir /mnt/uploads
RUN chmod a+w /mnt/uploads
#RUN chown www-data:www-data -R /mnt && groupadd fuse && gpasswd -a www-data fuse && chmod g+rw /dev/fuse
USER www-data
RUN gcsfuse --key-file /creds.json \
    --debug_gcs --debug_http --debug_fuse --debug_invariants \
    --dir-mode "777" -o allow_other spm-bucket /mnt/uploads
Error:
Step 17 : RUN gcsfuse --key-file /creds.json --foreground --debug_gcs --debug_http --debug_fuse --debug_invariants --dir-mode "777" -o allow_other spm-bucket /mnt/uploads
---> Running in 7e3f31221bee
Using mount point: /mnt/uploads
Opening GCS connection...
Opening bucket...
gcs: Req 0x0: <- ListObjects()
http: ========== REQUEST:
GET http://www.googleapis.com/storage/v1/b/spm-bucket/o?maxResults=1&projection=full HTTP/1.1
Host: www.googleapis.com
User-Agent: gcsfuse/0.0
Authorization: Bearer ya29.ElrQAw8oxClKt8YGvtmxhc7z2Y2LufvL0fBueq1UESjYYjRrdxukNTQqO1qfM8e8h-rqfbOWNSjVK2rCRXVrEDla-CiUVhHwT6X71Y1Djb0jDJg7z3KblgNQPrc
Accept-Encoding: gzip
http: ========== RESPONSE:
HTTP/2.0 200 OK
Content-Length: 31
Alt-Svc: quic=":443"; ma=2592000; v="35,34"
Cache-Control: private, max-age=0, must-revalidate, no-transform
Content-Type: application/json; charset=UTF-8
Date: Wed, 11 Jan 2017 09:19:05 GMT
Expires: Wed, 11 Jan 2017 09:19:05 GMT
Server: UploadServer
Vary: Origin
Vary: X-Origin
X-Guploader-Uploadid: AEnB2UpTqXhtHW906FFDTRsz4FjHjFu_E84wYhvt0zhaVFuMpqSY1fsd1XcrEcpsYBBwX1mqf0ZXRVWJH05ThtDQIfFKHd4PFw
{
"kind": "storage#objects"
}
http: ====================
gcs: Req 0x0: -> ListObjects() (1.793169206s): OK
Mounting file system...
mountWithArgs: mountWithConn: Mount: mount: running fusermount: exit status 1
stderr:
fusermount: failed to open /dev/fuse: Operation not permitted
If you're running your container on GKE and you want to use gcsfuse, permissions should automatically be inherited locally from your account. There is a caveat, though: the cluster you're running needs storage access, so make sure it has its storage permissions set to full access. That way gcsfuse can mount your GCS buckets within the container without you having to worry about passing credential files and all that, making the implementation pretty straightforward.
In your Dockerfile, make sure you run the apt commands to fetch and install the gcsfuse application.
I personally made a shell script that I call once the instance is up, which mounts the directories I need.
Something like this...
Docker Entry
ENTRYPOINT ["/opt/entry.sh"]
entry.sh script example
gcsfuse [gcs bucket name] [local folder to mount as]
When generating your GKE cluster, make sure to add the storage scope
gcloud container clusters create [your cluster name] --scopes storage-full
Hope this helps you.
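If the cluster already exists, a hedged way to inspect which scopes its nodes were created with:
gcloud container clusters describe [your cluster name] --format="value(nodeConfig.oauthScopes)"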
Docker won't allow mounting other storage (like GCS) by default; you can only do it when running the container with the --privileged option. So put the mount command in a script file (gcp.sh), build the Docker image, and perform the mount at container start rather than at build time. Instead of this RUN instruction:
RUN gcsfuse --key-file /creds.json \
    --debug_gcs --debug_http --debug_fuse --debug_invariants \
    --dir-mode "777" -o allow_other spm-bucket /mnt/uploads
gcp.sh:
gcsfuse --key-file /creds.json --debug_gcs --debug_http --debug_fuse --debug_invariants --dir-mode "777" -o allow_other spm-bucket /mnt/uploads
and the Dockerfile:
FROM wordpress
MAINTAINER Gareth Williams <gareth@itinerateconsulting.com>
# Move login creds locally
ADD ./creds.json /creds.json
# install sudo, wget and gcsfuse
ENV GCSFUSE_REPO=gcsfuse-jessie
RUN apt-get update && \
    apt-get -y install sudo && \
    apt-get install -y curl ca-certificates && \
    echo "deb http://packages.cloud.google.com/apt $GCSFUSE_REPO main" > /etc/apt/sources.list.d/gcsfuse.list && \
    curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
    apt-get update && \
    apt-get install -y gcsfuse wget && \
    apt-get remove -y curl --purge && \
    apt-get autoremove -y && \
    rm -rf /var/lib/apt/lists/*
# Config fuse
RUN chmod a+r /etc/fuse.conf
RUN perl -i -pe 's/#user_allow_other/user_allow_other/g' /etc/fuse.conf
# Setup sql proxy
RUN sudo mkdir /cloudsql
RUN sudo chmod 777 /cloudsql
ADD https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64 cloud_sql_proxy.linux.amd64
RUN mv cloud_sql_proxy.linux.amd64 cloud_sql_proxy && chmod +x ./cloud_sql_proxy
RUN ./cloud_sql_proxy -dir=/cloudsql -fuse -credential_file=/creds.json &
# mysql -u icroot -S /cloudsql/[INSTANCE_CONNECTION_NAME]
# Perform Cloud Storage FUSE mounting for uploads folder
RUN mkdir /mnt/uploads
RUN chmod a+w /mnt/uploads
#RUN chown www-data:www-data -R /mnt && groupadd fuse && gpasswd -a www-data fuse && chmod g+rw /dev/fuse
USER www-data
COPY gcp.sh /home
RUN chmod +x /home/gcp.sh
CMD cd /home && ./gcp.sh
Finally, after building the image, run the container with the --privileged option:
docker run --privileged ic/spm
Your www-data user has a permission problem in the Dockerfile:
#RUN chown www-data:www-data -R /mnt && groupadd fuse && gpasswd -a www-data fuse && chmod g+rw /dev/fuse
Uncomment this line.
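That is, the line becomes:
RUN chown www-data:www-data -R /mnt && groupadd fuse && gpasswd -a www-data fuse && chmod g+rw /dev/fuse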

Docker /dev/mapper permission

I'm all new to Docker and trying to build an image for a piece of software. During the RUN apt-get install -y xxx command I'm encountering issues:
Setting up lvm2 (2.02.95-8) ...
Setting up LVM Volume Groups... /dev/mapper/control: open failed:Operation not permitted
Failure to communicate with kernel device-mapper driver.
Check that device-mapper is available in the kernel.
No volume groups found
/dev/mapper/control: open failed: Operation not permitted
Failure to communicate with kernel device-mapper driver.
Check that device-mapper is available in the kernel.
No volume groups found
What could cause this issue?
My distribution is Debian 7; should I try this on a more recent distribution?
Here is the Dockerfile:
# Hynesim installation
FROM debian:wheezy
RUN echo $(whoami)
RUN echo "exit 0" > /usr/sbin/policy-rc.d
RUN apt-get update && apt-get install -y curl
RUN echo 'deb [arch=amd64] http://repository.hynesim.org/debian wheezy 2.2 backports' >> /etc/apt/sources.list && \
    echo 'deb-src [arch=amd64] http://repository.hynesim.org/debian wheezy 2.2 backports' >> /etc/apt/sources.list
RUN curl -o - https://repository.hynesim.org/debian/hynesim.asc | apt-key add - && apt-get update && apt-get install -y \
    hynesim-node \
    hynesim-glacier
