Using memory-profiler in a Docker container

RAM usage keeps climbing in my Docker container and I want to investigate the cause. I was recommended memory-profiler to help with this.
I was told the steps were as follows:
Install https://pypi.org/project/memory-profiler/ in the image
manually execute the program in the container using mprof run -o memory-profile.dat python3 -m vdx
mprof plot memory-profile.dat
Now, I'm having trouble carrying out these steps, since I am new to Docker, Python, and the memory profiler.
For the first step, I included RUN pip3 install memory-profiler in the Dockerfile below RUN pip3 install -r requirements.txt
Then, below where it says COPY src /vdx, I've added RUN mprof run -o memory-profile.dat python3 -m vdx
Now, when I build and run the container everything seems to work as usual...
Container is being built:
Step 7/12 : RUN pip3 install -r requirements.txt
---> Using cache
---> 11ad28b8793a
Step 8/12 : RUN pip3 install memory-profiler
---> Using cache
---> 58885abbcd8b
Step 9/12 : COPY src /vdx
---> Using cache
---> 15e884cb94c5
Step 10/12 : WORKDIR /
---> Using cache
---> 93cdfcee7f4a
Step 11/12 : RUN mprof run -o memory-profile.dat python3 -m vdx
---> Using cache
---> de7b15fd74ef
Step 12/12 : ENTRYPOINT ["python3", "-m", "vdx"]
---> Using cache
---> 53675841da99
Successfully built 53675841da99
and the process runs smoothly ...
but no file memory-profile.dat is being created.
What am I doing wrong?

Notice that there are two different stages in the life cycle of a container: building and running. RUN instructions in the Dockerfile apply to the build stage, whereas the ENTRYPOINT applies to the run stage.
This means that RUN mprof run -o memory-profile.dat python3 -m vdx in the Dockerfile is probably not what you want, as you want to investigate the issue when the container is running and not when it's building, don't you?
Basically, you are running your app with the memory profiler at build time, and then setting an entrypoint without the memory profiler for runtime. So when you run the container, you don't get the profiler's output, because the profiler isn't running.
You need to run the profiler at runtime by moving the command into the ENTRYPOINT: ENTRYPOINT ["mprof", "run", "-o", "memory-profile.dat", "python3", "-m", "vdx"].
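A minimal sketch of how the tail of the Dockerfile could look with that change, plus how you could pull the profile out of the container afterwards (the container ID is a placeholder, and mprof plot assumes memory-profiler is also installed wherever you do the plotting):
RUN pip3 install memory-profiler
COPY src /vdx
WORKDIR /
ENTRYPOINT ["mprof", "run", "-o", "memory-profile.dat", "python3", "-m", "vdx"]
Then, after the container has been running for a while:
docker cp <container-id>:/memory-profile.dat .
mprof plot memory-profile.dat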

Related

Docker build not using cache

When I run
docker build -f docker/webpack.docker services/webpack --build-arg env=production
twice in a row, Docker builds my image each time, starting from the first RUN (the COPY uses the cache).
FROM node:lts
ARG env=production
ENV NODE_ENV=$env
WORKDIR /app
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile --production=false --non-interactive
COPY . .
RUN node --max-old-space-size=20000 node_modules/.bin/svg2fonts icons -o assets/markons -b mrkn -f markons -n Markons
RUN node --max-old-space-size=20000 node_modules/.bin/webpack --progress
How can I get it to cache those RUNs?
Output looks like:
Sending build context to Docker daemon 3.37MB
Step 1/9 : FROM node:lts
---> 0c601cba9f11
Step 2/9 : ARG env=production
---> Using cache
---> dd38b2167c75
Step 3/9 : ENV NODE_ENV=$env
---> Using cache
---> 800f5afd416c
Step 4/9 : WORKDIR /app
---> Using cache
---> d15b93dce11d
Step 5/9 : COPY package.json yarn.lock ./
---> Using cache
---> a049dd1609a8
Step 6/9 : RUN yarn install --frozen-lockfile --production=false --non-interactive
---> Using cache
---> d5e51b0d556c
Step 7/9 : COPY . .
---> 92990e326d4b
Step 8/9 : RUN node --max-old-space-size=20000 node_modules/.bin/svg2fonts icons -o assets/markons -b mrkn -f markons -n Markons
---> Running in a23878db7b0e
Wrote assets/markons/markons.css
Wrote assets/markons/markons.js
Wrote assets/markons/markons.html
Wrote assets/markons/markons-chars.json
Wrote assets/markons/markons.svg
Wrote assets/markons/markons.ttf
Wrote assets/markons/markons.woff
Wrote assets/markons/markons.woff2
Wrote assets/markons/markons.eot
Removing intermediate container a23878db7b0e
---> 3bce79d0ecf0
Step 9/9 : RUN node --max-old-space-size=20000 node_modules/.bin/webpack --progress
---> Running in b6d460488950
<s> [webpack.Progress] 0% compiling
...
See the description:
If the contents of all external files on the first COPY command are the same, the layer cache will be used and all subsequent commands until the next ADD or COPY command will use the layer cache. However, if the contents of one or more external files are different, then all subsequent commands will be executed without using the layer cache.
So every time the content changes, the last two RUN commands are executed without the cache. There is no way to control this caching behaviour yet. Maybe specifying volumes is a better option, as sketched below?
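One way to read the volumes suggestion is to run the expensive build steps in a throwaway container with the source bind-mounted, so the cached image layers are never invalidated by COPY . . at all. A rough sketch, not part of the original answer, assuming node_modules already exists in the mounted directory (the image and paths are taken from the question):
docker run --rm \
  -v "$PWD/services/webpack":/app \
  -w /app \
  node:lts \
  node --max-old-space-size=20000 node_modules/.bin/webpack --progress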

command not found in pipenv path

Edit: This is solved, see answer below
I'm trying to deploy some software using Docker / Jenkins for the first time and I'm running into some issues with the paths.
This is the current Dockerfile:
WORKDIR /usr/src/app
COPY ./ .
RUN pip install pipenv
RUN pipenv install --ignore-pipfile
ENV PATH="${PATH}:/usr/src/app"
RUN pipenv run echo $PATH
RUN pwd
RUN ls
RUN pipenv run whereis run
RUN pipenv run run
When I try to build the docker image using Jenkins, I get the following output:
09:19:48 ---> Running in ff1a52b2e299
09:19:48 Removing intermediate container ff1a52b2e299
09:19:48 ---> 67355de18e72
09:19:48 Step 7/11 : RUN pipenv run echo $PATH
09:19:48 ---> Running in 5cb904118910
09:19:49 /usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/src/app
09:19:49 Removing intermediate container 5cb904118910
09:19:49 ---> 0df62985c94d
09:19:49 Step 8/11 : RUN pwd
09:19:49 ---> Running in 4e79f53b581f
09:19:49 /usr/src/app
09:19:50 Removing intermediate container 4e79f53b581f
09:19:50 ---> 563ab4218eba
09:19:50 Step 9/11 : RUN ls
09:19:50 ---> Running in fbc7670633d1
09:19:50 Dockerfile
09:19:50 Jenkinsfile
09:19:50 Pipfile
09:19:50 Pipfile.lock
09:19:50 README.md
09:19:50 alembic.ini
09:19:50 run.bat
09:19:50 run.sh
09:19:50 trendanalyse
09:19:51 Removing intermediate container fbc7670633d1
09:19:51 ---> 02a4b76defd0
09:19:51 Step 10/11 : RUN pipenv run whereis run
09:19:51 ---> Running in 70d5448e29b1
09:19:51 run: /usr/src/app/run.bat /usr/src/app/run.sh /usr/src/app/run.bat /usr/src/app/run.sh
09:19:52 Removing intermediate container 70d5448e29b1
09:19:52 ---> 455fc44688ce
09:19:52 Step 11/11 : RUN pipenv run run
09:19:52 ---> Running in b25bf9d5f818
09:19:52 Error: the command run could not be found within PATH or Pipfile's [scripts].
09:19:53 The command '/bin/sh -c pipenv run run' returned a non-zero code: 1
I don't see what's wrong: the run.sh file is in the current path, so I'm not sure why it won't run. It works locally on my Windows machine, so maybe it's some difference between Windows and Linux that I'm not seeing?
Thanks!
There were two issues with the Dockerfile. First, I needed to add the line
RUN chmod +x run.sh
Additionally, I had to change
RUN pipenv run run
to
RUN pipenv run run.sh
(I think this is due to differences between Windows and Linux.) Now it works :)
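For reference, the tail of the Dockerfile with both fixes applied looks roughly like this (the earlier lines stay unchanged):
WORKDIR /usr/src/app
COPY ./ .
RUN pip install pipenv
RUN pipenv install --ignore-pipfile
ENV PATH="${PATH}:/usr/src/app"
RUN chmod +x run.sh
RUN pipenv run run.sh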

How to run go server on docker from binary

I have been trying to make a Dockerfile that would let me build my Go server as a binary and then run it from either the scratch image or alpine. The server works fine locally on macOS 10.13.5, and I previously got it working in Docker when it wasn't run from a binary.
I keep getting this error:
standard_init_linux.go:190: exec user process caused "exec format error"
I have been googling around and found something about system architecture. I am not sure how to check if that is the error and/or how to fix it.
Any hints for debugging or possible fix are much appreciated.
My Dockerfile:
FROM golang:1.10.3 as builder
WORKDIR /go/src/gitlab.com/main/server
COPY . .
RUN go get -d -v ./...
RUN CGO_ENABLED=0 GOOS=linux go build -a -o main .
FROM scratch
ADD main /
CMD ["/main"]
The output:
Building go
Step 1/9 : FROM golang:1.10.3 as builder
---> 4e611157870f
Step 2/9 : WORKDIR /go/src/gitlab.com/main/server
Removing intermediate container 20cd4d66008b
---> 621d9fc02dde
Step 3/9 : COPY . .
---> cab639571baf
Step 4/9 : RUN go get -d -v ./...
---> Running in 7681f9adc7b2
Removing intermediate container 7681f9adc7b2
---> 767a4c9dfb94
Step 5/9 : RUN go build -a -installsuffix cgo -o main .
---> Running in a6ec73121163
Removing intermediate container a6ec73121163
---> b9d7d1c0d2f9
Step 6/9 : FROM alpine:latest
---> 11cd0b38bc3c
Step 7/9 : WORKDIR /app
---> Using cache
---> 6d321d334b8f
Step 8/9 : COPY . .
---> 048a59fcdd8f
Step 9/9 : CMD ["/app/main"]
---> Running in d50d174644ff
Removing intermediate container d50d174644ff
---> 68f8f3c6cdf7
Successfully built 68f8f3c6cdf7
Successfully tagged main_go:latest
Creating go ... done
Attaching to go
go | standard_init_linux.go:190: exec user process caused "exec format error"
go exited with code 1
As #tgogos pointed out, I needed to use what I built in the first step.
My final Dockerfile ended up like this, with a few further improvements. The important part is the second-to-last line, though:
FROM golang:1.10.3 AS build
WORKDIR /go/src/gitlab.com/main/server
COPY . .
RUN go get github.com/golang/dep/cmd/dep && \
    dep ensure && \
    rm -f schema/bindata.go && \
    go generate ./schema
RUN CGO_ENABLED=0 GOOS=linux go build -a -o main .
FROM alpine
RUN apk add --no-cache ca-certificates
COPY --from=build /go/src/gitlab.com/main/server/main .
CMD ["/main"]

Dockerfile not using cache in RUN composer install command

I thought I understood Docker already, but today I found a problem with the Docker cache.
Here is my Dockerfile:
FROM quay.io/my_company/phpjenkins
WORKDIR /usr/src/my_project
ADD composer.json composer.json
ADD composer.lock composer.lock
RUN composer install -o
ADD . .
RUN mkdir -p temp/unittest/cache log
RUN cp app/config/config.unittest.template.neon app/config/config.unittest.neon
CMD ["tail", "-f", "/dev/null"]
I expect Docker to use the cache up to the ADD . . line.
However, on every build it looks like Docker runs composer install again.
Here is some output
+ docker-compose -f docker-compose.yml run app vendor/bin/phpunit -d memory_limit=2048M
Creating network "xxx_default" with the default driver
Creating xxx_rabbitmq_1
Creating xxx_mysql_1
Building app
Step 1/9 : FROM quay.io/my_company/phpjenkins
---> f10ea65fb7df
Step 2/9 : WORKDIR /usr/src/my_project
---> Using cache
---> 07ad76770cd2
Step 3/9 : ADD composer.json composer.json
---> Using cache
---> 0d22314b81af
Step 4/9 : ADD composer.lock composer.lock
---> Using cache
---> 3d41825efcb3
Step 5/9 : RUN composer install -o
---> Running in 38de5f08eb46
Warning: This development build of composer is over 60 days old. It is recommended to update it by running "/usr/local/bin/composer self-update" to get the latest version.
Do not run Composer as root/super user! See https://getcomposer.org/root for details ....
...
---> aa05dc9ddc5f
Removing intermediate container 581aa7e4b00f
Step 6/9 : ADD . .
---> 8796a9235b9a
Removing intermediate container b7354231fbd7
I've run out of leads. What could be the reason that Docker doesn't use the cache for the RUN composer install command?
I'm using Docker version 17.05.0-ce, build 89658be, on Debian, if this helps with the investigation.
Please advise.
As a workaround, you could create two Dockerfiles: one that creates an image at the point up to which you would like to cache, and a second that uses the first image as its base and makes modifications as required.
FROM quay.io/my_company/phpjenkins
WORKDIR /usr/src/my_project
ADD composer.json composer.json
ADD composer.lock composer.lock
RUN composer install -o
CMD ["tail", "-f", "/dev/null"]
Build this file into the image mycomposerimage using:
docker build -t mycomposerimage .
Then the second Dockerfile picks up from there:
FROM mycomposerimage
WORKDIR /usr/src/my_project
ADD . .
RUN mkdir -p temp/unittest/cache log
RUN cp app/config/config.unittest.template.neon app/config/config.unittest.neon
CMD ["tail", "-f", "/dev/null"]

How to flatten a Docker image?

I made a Docker container which is fairly large. When I commit the container to create an image, the image is about 7.8 GB. But when I export the container (not save the image!) to a tarball and re-import it, the image is only 3 GB. Of course the history is lost, but that's OK for me, since the image is "done" in my opinion and ready for deployment.
How can I flatten an image/container without exporting it to disk and importing it again? And: is it a wise idea to do that, or am I missing some important point?
Now that Docker has released multi-stage builds in 17.05, you can restructure your build to look like this:
FROM buildimage as build
# your existing build steps here
FROM scratch
COPY --from=build / /
CMD ["/your/start/script"]
The result is that your build-environment layers are cached on the build server, but only a flattened copy exists in the resulting image that you tag and push.
Note that you would typically reformulate this to have a complex build environment and only copy over a few directories. Here's an example with Go that makes a single-binary image from source code, with a single build command, without installing Go on the host or compiling outside of Docker:
$ cat Dockerfile
ARG GOLANG_VER=1.8
FROM golang:${GOLANG_VER} as builder
WORKDIR /go/src/app
COPY . .
RUN go-wrapper download
RUN go-wrapper install
FROM scratch
COPY --from=builder /go/bin/app /app
CMD ["/app"]
The go file is a simple hello world:
$ cat hello.go
package main
import "fmt"
func main() {
    fmt.Printf("Hello, world.\n")
}
The build creates both environments, the build environment and the scratch one, and then tags the scratch one:
$ docker build -t test-multi-hello .
Sending build context to Docker daemon 4.096kB
Step 1/9 : ARG GOLANG_VER=1.8
--->
Step 2/9 : FROM golang:${GOLANG_VER} as builder
---> a0c61f0b0796
Step 3/9 : WORKDIR /go/src/app
---> Using cache
---> af5177aae437
Step 4/9 : COPY . .
---> Using cache
---> 976490d44468
Step 5/9 : RUN go-wrapper download
---> Using cache
---> e31ac3ce83c3
Step 6/9 : RUN go-wrapper install
---> Using cache
---> 2630f482fe78
Step 7/9 : FROM scratch
--->
Step 8/9 : COPY --from=builder /go/bin/app /app
---> Using cache
---> 5645db256412
Step 9/9 : CMD /app
---> Using cache
---> 8d428d6f7113
Successfully built 8d428d6f7113
Successfully tagged test-multi-hello:latest
Looking at the images, only the single binary is in the image being shipped, while the build environment is over 700MB:
$ docker images | grep 2630f482fe78
<none> <none> 2630f482fe78 6 days ago 700MB
$ docker images | grep 8d428d6f7113
test-multi-hello latest 8d428d6f7113 6 days ago 1.56MB
And yes, it runs:
$ docker run --rm test-multi-hello
Hello, world.
As of Docker 1.13, you can use the --squash flag.
Before version 1.13:
To my knowledge, you cannot do this using the Docker API. docker export and docker import are designed for this scenario, as you yourself already mention.
If you don't want to save to disk, you could probably pipe the output stream of export into the input stream of import. I have not tested this, but try:
docker export red_panda | docker import - exampleimagelocal:new
Take a look at docker-squash
Install with:
pip install docker-squash
Then, if you have an image, you can squash all layers into one with:
docker-squash -f <nr_layers_to_squash> -t new_image:tag existing_image:tag
A quick one-liner that I find useful for squashing all layers:
docker-squash -f $(($(docker history $IMAGE_NAME | wc -l | xargs)-1)) -t ${IMAGE_NAME}:squashed $IMAGE_NAME
Build the image with the --squash flag:
https://docs.docker.com/engine/reference/commandline/build/#squash-an-images-layers---squash-experimental
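For example (the image name and tag are placeholders, and --squash requires the daemon to run with experimental features enabled):
docker build --squash -t myimage:flattened .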
Also consider mopping up unneeded files, such as the apt cache:
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
