docker: how to retrieve a file (created by scrapy-splash)

I use scrapy-splash with docker.
In my Dockerfile I have this line to export the results to a .jl file:
CMD ["scrapy", "crawl", "quotesjs", "-o", "quote.jl"]
When I run docker-compose build and docker-compose up, the log informs me that:
scrapy1 | 2017-12-18 00:00:00 [scrapy.extensions.feedexport] INFO: Stored jl feed (10 items) in: quote.jl
I don't see any quote.jl in my local folder (where the Dockerfile and the scrapy project are), so I guess it should be in my container.
I tried to cp the contents of the container with this command, but without success:
docker cp containerID:. ./copy_of_container
How can I retrieve the quote.jl file?
I am on Windows 10 and I use Docker for Windows.
My Dockerfile:
FROM python:alpine
RUN apk --update add libxml2-dev libxslt-dev libffi-dev gcc musl-dev libgcc openssl-dev curl bash
RUN pip install scrapy scrapy-splash scrapy-fake-useragent
ADD . /scraper
WORKDIR /scraper
CMD ["scrapy", "crawl", "apkmirror", "-o", "apkmirror.jl"]

Related

How to use env variables set from build phase in run. (Docker)

I want to preface this by saying that I am very new to Docker and have just gotten my feet wet with using it. In the Dockerfile that I run to build the container, I install a program that sets some env variables. Here is my Dockerfile for context.
FROM python:3.8-slim-buster
COPY . /app
RUN apt-get update
RUN apt-get install wget -y
RUN wget http://static.matrix-vision.com/mvIMPACT_Acquire/2.40.0/install_mvGenTL_Acquire.sh
RUN wget http://static.matrix-vision.com/mvIMPACT_Acquire/2.40.0/mvGenTL_Acquire-x86_64_ABI2-2.40.0.tgz
RUN chmod +x ./install_mvGenTL_Acquire.sh
RUN ./install_mvGenTL_Acquire.sh -u
RUN apt-get install -y python3-opencv
RUN pip3 install USSCameraTools
WORKDIR /app
CMD python3 main.py
After executing the docker build command, the program "mvGenTL_Acquire.sh" sets env variables inside the container. I need these variables to be set when executing the docker run command. But when checking the env variables after running the image, they are not set. I know I can pass them in directly, but I would like to use the ones that are set by the install during the build.
Any help would be greatly appreciated, thanks!
To run a bash script when your container starts:
Make a script.sh file:
#!/bin/bash
your commands here
If you are using an alpine image, you must use #!/bin/sh instead of #!/bin/bash on the first line of your bash file.
Now, in your Dockerfile, copy your bash file into the container and use the ENTRYPOINT instruction to run this file when the container starts:
.
.
.
COPY script.sh /
RUN chmod +x /script.sh
.
.
.
ENTRYPOINT ["/script.sh"]
Notice that the ENTRYPOINT instruction must use the path of your bash file inside the image.
Now when you create a container, the script.sh file will be executed.
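The catch in the original question is that variables exported by an installer during docker build do not survive into docker run; the shell of each RUN step exits when that step ends. A minimal sketch of an entrypoint script that sets the variables itself and then hands off to the CMD; the variable names and values below are placeholders, since the ones written by install_mvGenTL_Acquire.sh are not shown in the question:
#!/bin/sh
# Placeholder exports: replace these with the variables the installer actually sets.
export MVIMPACT_ACQUIRE_DIR=/opt/mvIMPACT_Acquire
export GENICAM_ROOT=/opt/mvIMPACT_Acquire
# Hand off to whatever command the container was started with (the CMD), with the variables in scope.
exec "$@"
With ENTRYPOINT ["/script.sh"] and CMD ["python3", "main.py"] in the Dockerfile, main.py then starts with those variables set.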

My custom beat can't find custombeat.yml when I try to run it from a container

So, I have built a beat with mage GenerateCustomBeat and it runs okay, except now I'm trying to containerize it. When I run the image I built, it complains that no customBeat.yml was found.
I have verified that the file exists in the folder by adding a line RUN ls . at the end of my Dockerfile.
The beat name is coletorbeat, so this name appears multiple times inside the Dockerfile.
Upon executing sudo docker run coletorbeat I have the following error message:
Exiting: error loading config file: stat coletorbeat.yml: no such file or directory
If there were a way to specify the coletorbeat.yml file location when I execute the beat, in the CMD, I think it would solve it, but I have not found how to do so yet.
I'll post the Dockerfile below. I know the code inside the beater folder works fine. I'm guessing I'm making some mistake in the containerization.
Dockerfile:
FROM ubuntu
MAINTAINER myNameHere
ARG ${ip:-"333.333.333.333"}
ARG ${porta:-"4343"}
ARG ${dataInicio:-"2020-01-07"}
ARG ${dataFim:-"2020-01-07"}
ARG ${tipoEquipamento:-"type"}
ARG ${versao:-"2"}
ARG ${nivel:-"0"}
ARG ${instituicao:-"RJ"}
ADD . .
RUN mkdir /etc/coletorbeat
COPY /coletorbeat/coletorbeat.yml /etc/coletorbeat/coletorbeat.yml
RUN apt-get update && \
apt-get install -y wget git
RUN wget https://storage.googleapis.com/golang/go1.14.4.linux-amd64.tar.gz
RUN tar -zxvf go1.14.*.linux-amd64.tar.gz -C /usr/local
RUN mkdir /go
ENV GOROOT /usr/local/go
ENV GOPATH $HOME/go
ENV PATH $PATH:$GOROOT/bin:$GOPATH/bin
RUN echo $PATH
RUN go get -u -d github.com/magefile/mage
RUN cd $GOPATH/src/github.com/magefile/mage && \
go run bootstrap.go
RUN apt-get install -y python3-venv
RUN apt-get install -y build-essential
RUN cd /coletorbeat && chmod go-w coletorbeat.yml && ./coletorbeat setup
RUN cd /coletorbeat && ./coletorbeat test config -c /coletorbeat/coletorbeat.yml && ls .
CMD ./coletorbeat/coletorbeat -E 'coletorbeat.ip=${ip}'
You are adding the yml file into the /etc directory:
COPY /coletorbeat/coletorbeat.yml /etc/coletorbeat/coletorbeat.yml
but then running commands against /coletorbeat without using /etc.
In the CMD line of the Dockerfile, I added the command cd /mybeatfolder and it worked. Libbeat searches the current folder for the config file by default, so moving to the right directory before executing my beat solved it.
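A minimal sketch of the same idea expressed with Dockerfile instructions, assuming the binary and config live in /coletorbeat as in the question and keeping the -E override from the original CMD; WORKDIR makes the working directory explicit, and -c names the config file the beat should load:
WORKDIR /coletorbeat
CMD ./coletorbeat -c coletorbeat.yml -E "coletorbeat.ip=${ip}"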

Docker COPY is not copying script

Docker COPY is not copying over the bash script
FROM alpine:latest
#Install Go and Tini - These remain.
RUN apk add --no-cache go build-base gcc go
RUN apk add --no-cache --update ca-certificates redis git && update-ca-certificates
# Set Env Variables for Go and add Go to Path.
ENV GOPATH /go
ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH
RUN go get github.com/rakyll/hey
RUN echo GOLANG VERSION `go version`
COPY ./bench.sh /root/bench.sh
RUN chmod +x /root/bench.sh
ENTRYPOINT /root/bench.sh
Here is the script -
#!/bin/bash
set -e;
echo "entered";
hey;
I try building and running the above Dockerfile with:
$ docker build -t test-bench .
$ docker run -it test-bench
But I get the error
/bin/sh: /root/bench.sh: not found
The file does exist -
$ docker run --rm -it test-bench sh
/ # ls
bin dev etc go home lib media mnt opt proc root run sbin srv sys tmp usr var
/ # cd root
~ # ls
bench.sh
~ #
Is your docker build successful? When I tried to simulate this, I found the following error:
---> Running in 96468658cebd
go: missing Git command. See https://golang.org/s/gogetcmd
package github.com/rakyll/hey: exec: "git": executable file not found in $PATH
The command '/bin/sh -c go get github.com/rakyll/hey' returned a non-zero code: 1
Try installing git in the Dockerfile, RUN apk add --no-cache go build-base gcc go git, and run again.
The COPY operation here seems to be correct. Make sure it is present in the directory from where docker build is executed.
Okay, the script is using /bin/bash, but the bash binary is not available in the Alpine image. Either bash has to be installed, or a /bin/sh shebang should be used.
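A sketch of the two fixes, assuming the Dockerfile and script from the question. Option 1 adds one line to the Dockerfile so the #!/bin/bash shebang resolves:
RUN apk add --no-cache bash
Option 2 changes the first line of bench.sh to use the shell Alpine already ships:
#!/bin/sh
set -e;
echo "entered";
hey;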

docker can't build because of alpine error

Hi, I am trying to build a Docker image and my Dockerfile looks like this.
FROM alpine
LABEL description "Nginx + uWSGI + Flask based on Alpine Linux and managed by Supervisord"
# Copy python requirements file
COPY requirements.txt /tmp/requirements.txt
RUN apk add --no-cache \
python3 \
bash \
nginx \
uwsgi \
uwsgi-python3 \
supervisor && \
python3 -m ensurepip && \
rm -r /usr/lib/python*/ensurepip && \
pip3 install --upgrade pip setuptools && \
pip3 install -r /tmp/requirements.txt && \
rm /etc/nginx/conf.d/default.conf && \
rm -r /root/.cache
# Copy the Nginx global conf
COPY nginx.conf /etc/nginx/
# Copy the Flask Nginx site conf
COPY flask-site-nginx.conf /etc/nginx/conf.d/
# Copy the base uWSGI ini file to enable default dynamic uwsgi process number
COPY uwsgi.ini /etc/uwsgi/
# Custom Supervisord config
COPY supervisord.conf /etc/supervisord.conf
# Add demo app
COPY ./app /app
WORKDIR /app
CMD ["/usr/bin/supervisord"]
The errors look like:
Sending build context to Docker daemon 250.9kB
Step 1/11 : FROM alpine
---> 196d12cf6ab1
Step 2/11 : LABEL description "Nginx + uWSGI + Flask based on Alpine Linux and managed by Supervisord"
---> Using cache
---> d8d38c761b8d
Step 3/11 : COPY requirements.txt /tmp/requirements.txt
---> Using cache
---> cb29eb34ca46
Step 4/11 : RUN apk add --no-cache python3 bash nginx uwsgi uwsgi-python3 supervisor && python3 -m ensurepip && rm -r /usr/lib/python*/ensurepip && pip3 install --upgrade pip setuptools && pip3 install -r /tmp/requirements.txt && rm /etc/nginx/conf.d/default.conf && rm -r /root/.cache
---> Running in 3d568d2620dd
fetch http://dl-cdn.alpinelinux.org/alpine/v3.8/main/x86_64/APKINDEX.tar.gz
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.8/main/x86_64/APKINDEX.tar.gz: could not connect to server (check repositories file)
fetch http://dl-cdn.alpinelinux.org/alpine/v3.8/community/x86_64/APKINDEX.tar.gz
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.8/community/x86_64/APKINDEX.tar.gz: could not connect to server (check repositories file)
ERROR: unsatisfiable constraints:
bash (missing):
required by: world[bash]
nginx (missing):
required by: world[nginx]
python3 (missing):
required by: world[python3]
supervisor (missing):
required by: world[supervisor]
uwsgi (missing):
required by: world[uwsgi]
uwsgi-python3 (missing):
required by: world[uwsgi-python3]
The command '/bin/sh -c apk add --no-cache python3 bash nginx uwsgi uwsgi-python3 supervisor && python3 -m ensurepip && rm -r /usr/lib/python*/ensurepip && pip3 install --upgrade pip setuptools && pip3 install -r /tmp/requirements.txt && rm /etc/nginx/conf.d/default.conf && rm -r /root/.cache' returned a non-zero code: 6
A month ago it was building fine. Because of my limited knowledge of Docker, I couldn't figure out what's causing the error. A quick Google search resulted in these two links: link1 link2. But neither of them worked.
Building with the flag --network host solved the issue. Here is the link.
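For reference, --network is a flag of docker build; a usage sketch, with the image tag being a placeholder:
docker build --network host -t myimage .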
-In Ubuntu
It was a DNS error for me. By setting /etc/docker/daemon.json with,
{
"dns": ["8.8.8.8"]
}
and then restarting docker with,
sudo service docker restart
I was able to build images again.
https://github.com/gliderlabs/docker-alpine/issues/334#issuecomment-450598069
-In Windows
Edit C:/Users/Administrator (or any other username)/.docker/daemon.json
and add:
{
...,
"dns": ["8.8.8.8"]
}
The line:
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.8/main/x86_64/APKINDEX.tar.gz: could not connect to server (check repositories file)
basically says that you are either offline or the alpinelinux repo is down. I cannot find anything about it on the internet, but it has happened several times in the past. Or it can be a network problem somewhere between you and the CDN.
You can always pick a mirror yourself from http://dl-cdn.alpinelinux.org/alpine/MIRRORS.txt and set it up like so:
RUN echo http://repository.fit.cvut.cz/mirrors/alpine/v3.8/main > /etc/apk/repositories; \
echo http://repository.fit.cvut.cz/mirrors/alpine/v3.8/community >> /etc/apk/repositories
(change the v3.8 according to your version)
Also, as @emix pointed out, you should never use the :latest tag for your base image. Use, for example, 3.8, or the one with the package versions you need.
Try restarting the docker service; it worked for me and others:
sudo systemctl restart docker docker.service
Thanks to: https://github.com/gliderlabs/docker-alpine/issues/334#issuecomment-408826204
This kind of error often happens due to a network problem.
Try using https mirrors instead of http:
RUN sed -i -e 's/http:/https:/' /etc/apk/repositories
Another fix -
I added 8.8.8.8 to my /etc/resolv.conf and restarted the docker daemon. It fixed this issue for me.
If you are able to manually download the file, try restarting your docker service. It did the trick for me.
Providing a more generic troubleshooting answer for the title: test your docker commands in another container. This could be another running container that you don't mind breaking, or preferably a base container (in this case alpine) where you can run the Dockerfile commands in a shell. This is probably not a solution where the network is the issue, as in the original question, but it is good in other cases.
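For instance, a throwaway shell in the same base image (tag 3.8 assumed from the build log above) lets you replay the failing RUN line interactively:
$ docker run --rm -it alpine:3.8 sh
/ # apk add --no-cache python3 bash nginx uwsgi uwsgi-python3 supervisor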
The apk error messages aren't always the most useful. Take a look at the example below:
/ # apk add --no-cache influxdb-client
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
ERROR: unsatisfiable constraints:
influxdb-client (missing):
required by: world[influxdb-client]
/ #
/ #
/ #
/ #
/ # apk add --no-cache influxdb
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
(1/1) Installing influxdb (1.8.0-r1)
Executing influxdb-1.8.0-r1.pre-install
Executing busybox-1.31.1-r19.trigger
OK: 613 MiB in 98 packages
By the way, https://pkgs.alpinelinux.org/packages is a good place to find the names of packages for Alpine, which would fix the above example.
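If you are already inside an Alpine container, apk search does roughly the same lookup; a quick sketch for the example above:
/ # apk search influxdb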
I think I've done everything that was proposed here, without any success:
Change http to https
Use the --network host trick
Add 8.8.8.8 in resolv.conf
Use a mirror
See my last build
I can download the index on that machine without any problem.
But when using Docker (with or without gitlab-runner), it just fails.
It works beautifully on another machine on the same network with the same architecture (armv7).
If my first instruction is
RUN wget https://mirrors.ircam.fr/pub/alpine/v3.15/main/armv7/APKINDEX.tar.gz
I get:
---> Running in 19a0630d633a
wget: bad address 'mirrors.ircam.fr'
In my case, I had changed /etc/docker/daemon.json and added registry-mirrors
to bypass filtering in my region and download docker images from that repo.
daemon.json:
{
"registry-mirrors": [
"https://docker.somerepo.com"
],
"insecure-registries": [],
"debug": true,
"experimental": false
}
So that was the problem: I removed the daemon.json file (or you could comment out all of its lines), and then it was fine to go and download new docker images.
I still ran into this problem as of Dec 2022 with Debian Buster; Docker on Buster seems to be incompatible somehow. Option 2 here solved the problem for me.

Docker build and run with Miniconda environments on Ubuntu host

I am in the process of creating a docker container which has a Miniconda environment set up with some packages (pip and conda). Dockerfile:
# Use an official Miniconda runtime as a parent image
FROM continuumio/miniconda3
# Create the conda environment.
# RUN conda create -n dev_env Python=3.6
RUN conda update conda -y \
&& conda create -y -n dev_env Python=3.6 pip
ENV PATH /opt/conda/envs/dev_env/bin:$PATH
RUN /bin/bash -c "source activate dev_env" \
&& pip install azure-cli \
&& conda install -y nb_conda
The behavior I want is that when the container is launched, it should automatically switch to the "dev_env" conda environment, but I haven't been able to get this to work. Logs:
dparkar@mymachine:~/src/dev/setupsdk$ docker build .
Sending build context to Docker daemon 2.56kB
Step 1/4 : FROM continuumio/miniconda3
---> 1284db959d5d
Step 2/4 : RUN conda update conda -y && conda create -y -n dev_env Python=3.6 pip
---> Using cache
---> cb2313f4d8a8
Step 3/4 : ENV PATH /opt/conda/envs/dev_env/bin:$PATH
---> Using cache
---> 320d4fd2b964
Step 4/4 : RUN /bin/bash -c "source activate dev_env" && pip install azure-cli && conda install -y nb_conda
---> Using cache
---> 3c0299dfbe57
Successfully built 3c0299dfbe57
dparkar@mymachine:~/src/dev/setupsdk$ docker run -it 3c0299dfbe57
(base) root@3db861098892:/# source activate dev_env
(dev_env) root@3db861098892:/# exit
exit
dparkar@mymachine:~/src/dev/setupsdk$ docker run -it 3c0299dfbe57 source activate dev_env
[FATAL tini (7)] exec source failed: No such file or directory
dparkar@mymachine:~/src/dev/setupsdk$ docker run -it 3c0299dfbe57 /bin/bash source activate dev_env
/bin/bash: source: No such file or directory
dparkar@mymachine:~/src/dev/setupsdk$ docker run -it 3c0299dfbe57 /bin/bash "source activate dev_env"
/bin/bash: source activate dev_env: No such file or directory
dparkar@mymachine:~/src/dev/setupsdk$ docker run -it 3c0299dfbe57 /bin/bash -c "source activate dev_env"
dparkar@mymachine:~/src/dev/setupsdk$
As you can see above, when I am within the container, I can successfully run "source activate dev_env" and the environment switches over. But I want this to happen automatically when the container is launched.
This also happens in the Dockerfile during build time. Again, I am not sure if that has any effect either.
You should use the command CMD for anything related to runtime.
Anything typed after RUN will only be run at image creation time, not when you actually run the container.
The shell used to run such commands is closed at the end of the image creation process, making the environment activation non-persistent in that case.
As such, your additional line might look like this:
CMD ["conda activate <your-env-name> && <other commands>"]
where <other commands> are other commands you might need at runtime after the environment activation.
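A minimal sketch of how that could look for the Dockerfile in the question, keeping the env name dev_env from it; app.py is just a placeholder for whatever the container should run:
FROM continuumio/miniconda3
RUN conda update conda -y && conda create -y -n dev_env python=3.6 pip
COPY . /app
WORKDIR /app
# Activate the environment at runtime (not in a RUN step) and start the app.
CMD ["/bin/bash", "-c", "source activate dev_env && python app.py"]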
This Dockerfile worked for me.
# start with miniconda image
FROM continuumio/miniconda3
# setting the working directory
WORKDIR /usr/src/app
# Copy the file from your host to your current location in container
COPY . /usr/src/app
# Run the command inside your image filesystem to create an environment and name it in the requirements.yml file, in this case "myenv"
RUN conda env create --file requirements.yml
# Activate the environment named "myenv" with shell command
SHELL ["conda", "run", "-n", "myenv", "/bin/bash", "-c"]
# Make sure the environment is activated by testing if you can import flask or any other package you have in your requirements.yml file
RUN echo "Make sure flask is installed:"
RUN python -c "import flask"
# exposing port 8050 for interaction with local host
EXPOSE 8050
#Run your application in the new "myenv" environment
CMD ["conda", "run", "-n", "myenv", "python", "app.py"]
