Run test coverage inside a Docker container for PySpark test cases

I have a PySpark project with a few unit test files:
test_testOne.py
test_testcaseTwo.py
These tests are executed inside a Docker container. While running the tests inside the container I also want to get test coverage reports, so I added the following line to my requirements.txt file:
coverage==6.0.2
And inside the Docker container I run the following command:
python -m coverage discover -s path/to/test/files
I get the following output:
/opt/conda/bin/python: No module named coverage
Can anybody help me run my tests successfully with test coverage? Note that all test cases run successfully inside the container with the following command, but it does not generate test coverage:
python -m unittest discover -s path/to/test/files
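Note first that the error /opt/conda/bin/python: No module named coverage means the coverage package is simply not installed in the environment the tests run in, so the image most likely needs to be rebuilt after the change to requirements.txt. A minimal sketch, assuming the Dockerfile installs requirements.txt (the image and container names here are hypothetical):
# rebuild so the updated requirements.txt (with coverage==6.0.2) gets installed
docker build -t my-pyspark-tests .
# or, to verify quickly inside an already-running container:
docker exec -it my-running-container pip install coverage==6.0.2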

If you are using coverage, the command:
python -m unittest discover -s path/to/test/files
Becomes:
coverage run -m unittest discover -s path/to/test/files
As specified in the documentation: Quick Start
Since you are using Docker, a good option is to mount a volume into the container; when the tests are finished, coverage can generate a report and store it on your host machine. That way you can automate the whole process and save the reports.
Create a volume using the -v flag when you start the Docker container (more info: Use Volumes).
After the tests, run coverage html -d /path/to/your/volume/inside/docker (see the documentation for more options: coverage html).
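Put together, a minimal sketch of the whole flow might look like this (the image name, the ./reports host directory, and the /reports mount point are assumptions, not taken from the question):
# mount a host directory so the report survives the container
docker run --rm -v "$PWD/reports":/reports my-pyspark-tests /bin/bash -c \
  "python -m coverage run -m unittest discover -s path/to/test/files \
   && python -m coverage html -d /reports/htmlcov"
# the HTML report is now on the host under ./reports/htmlcov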

Related

Circle CI docker cp file from instance to artifacts

I am running CircleCI with Docker and then testing my code. After it is tested, the test coverage percentage is recorded to a txt file, which I want to copy to the artifacts folder.
- run:
    name: Run test coverage
    command: |
      docker-compose exec api mkdir /tmp_reports
      docker-compose exec api coverage report > /tmp_reports/coverage_output.txt
      docker cp api:/tmp_reports/coverage_output.txt /tmp/coverage_results
- store_artifacts:
    path: /tmp/coverage_results
CircleCI Error
/bin/bash: line 1: /tmp_reports/coverage_output.txt: No such file or directory
Exited with code 1
I have run this locally and copied the file from the Docker container to my local directory, but CircleCI seems to have an issue with this. Can someone point me in the right direction here? Thanks.
On the second line of your script, where you redirect output to the file with > /tmp_reports/coverage_output.txt, that file is written outside the container: the output redirection is handled by bash on the host, not inside the container.
So on line 1 you create the directory inside the container, and on line 2 the command fails because the directory /tmp_reports does not exist outside the container.
You can fix this by replacing all 3 lines with:
mkdir -p /tmp/coverage_results
docker-compose exec api coverage report > /tmp/coverage_results/coverage_output.txt
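The corrected job step would then look something like this (keeping the store_artifacts step from the question):
- run:
    name: Run test coverage
    command: |
      mkdir -p /tmp/coverage_results
      docker-compose exec api coverage report > /tmp/coverage_results/coverage_output.txt
- store_artifacts:
    path: /tmp/coverage_results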

Code coverage in gitlab CI/CD

I used Docker dind to build and test my Python code. I am confused about which of the two following options to use to run coverage in gitlab-ci:
1) GitLab has coverage support by itself (see the GitLab docs).
2) I follow Python's coverage tutorial and create my own coverage job as follows:
coverage:
  stage: test
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker run $CONTAINER_TEST_IMAGE python -m coverage run tests/tests.py
    - docker run $CONTAINER_TEST_IMAGE python -m coverage report -m
Then GitLab throws an error: No data to report.
I guess the coverage report command cannot access/find the .coverage file in the container.
So my question is: what is the elegant way to run coverage in this situation?
Since const's answer has already covered the first part, i.e. getting the coverage details, I have tried to solve how to get the reports.
This is described in the GitLab coverage documentation. So your coverage job should be written like this:
coverage:
  stage: test
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker run $CONTAINER_TEST_IMAGE /bin/bash -c "python -m coverage run tests/tests.py && python -m coverage report -m"
  coverage: '/TOTAL.+ ([0-9]{1,3}%)/'
The regex was mentioned in mondwan's blog.
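For reference, the regex matches the TOTAL line at the end of the coverage report -m output, which looks roughly like this (the numbers are made up):
Name             Stmts   Miss  Cover   Missing
----------------------------------------------
tests/tests.py      50      5    90%   12-16
TOTAL               50      5    90%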
Addon
If you add the line below to your README.md file, you get a nice badge (in the master README.md) that captures your coverage details:
[![coverage report](https://gitlaburl.com/group_name/project_name/badges/master/coverage.svg?job=unittest)](https://gitlaburl.com/group_name/project_name/commits/master)
I guess coverage report command can not access/find .coverage file in the container.
Yes, your assumption is correct. By running:
- docker run $CONTAINER_TEST_IMAGE python -m coverage run tests/tests.py
- docker run $CONTAINER_TEST_IMAGE python -m coverage report -m
you actually start two completely separate containers, one after another.
In order to extract the coverage report, you have to run the coverage report command after coverage run finishes, in the same container, like so (I'm assuming a bash shell here):
- docker run $CONTAINER_TEST_IMAGE /bin/bash -c "python -m coverage run tests/tests.py && python -m coverage report -m"
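An alternative sketch, if you prefer to keep the two commands as separate steps: mount the build directory as a volume so the .coverage data file written by the first container is still there for the second one (the /app mount point and working directory are assumptions, and with dind the volume source must be visible to the Docker daemon actually running the containers):
- docker run -v "$PWD":/app -w /app $CONTAINER_TEST_IMAGE python -m coverage run tests/tests.py
- docker run -v "$PWD":/app -w /app $CONTAINER_TEST_IMAGE python -m coverage report -m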

How can you cache Gradle inside Docker?

I'm trying to cache the dependencies that my Gradle build currently downloads every time. For that I try to mount a volume with the -v option, like -v gradle_cache:/root/.gradle.
The thing is, each time I rerun the build with the exact same command, it still downloads everything again. The full command I use to run the image is:
sudo docker run --rm -v gradle_cache:/root/.gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle:jdk8-alpine gradle jar
I also checked the directory where Docker stores the volume's content, /var/lib/docker/volumes/gradle_cache/_data, but that is also empty. What am I missing to make this work?
Edit: As requested, I re-ran the command with the --scan option, and also with a different Gradle home:
$ sudo docker run --rm -v gradle_cache:/root/.gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle:jdk8-alpine gradle jar --gradle-user-home /root/.gradle
FAILURE: Build failed with an exception.
* What went wrong:
Failed to load native library 'libnative-platform.so' for Linux amd64.
After looking at the Dockerfile of the container I'm using, I found out that the right option to use is -v gradle_cache:/home/gradle/.gradle.
What made me think the files were cached in /root/.gradle is that the Dockerfile also sets /root/.gradle up as a symlink to /home/gradle/.gradle:
ln -s /home/gradle/.gradle /root/.gradle
So inspecting the filesystem after a build made it look like the files were stored there.
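With that change, the full command from the question becomes:
sudo docker run --rm -v gradle_cache:/home/gradle/.gradle -v "$PWD":/home/gradle/project -w /home/gradle/project gradle:jdk8-alpine gradle jar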
Since 6.2.1, Gradle now supports a shared, read-only dependency cache for this scenario:
It’s a common practice to run builds in ephemeral containers. A container is typically spawned to only execute a single build before it is destroyed. This can become a practical problem when a build depends on a lot of dependencies which each container has to re-download. To help with this scenario, Gradle provides a couple of options:
copying the dependency cache into each container
sharing a read-only dependency cache between multiple containers
https://docs.gradle.org/current/userguide/dependency_resolution.html#sub:ephemeral-ci-cache describes the steps to create and use the shared cache.
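A rough sketch of how the read-only cache is used, assuming it has already been populated once on the host (the paths here are illustrative; GRADLE_RO_DEP_CACHE is the documented variable that points Gradle at it):
# mount the pre-populated cache read-only and tell Gradle where it is
docker run --rm \
  -v /opt/gradle-dep-cache:/gradle-ro-cache:ro \
  -e GRADLE_RO_DEP_CACHE=/gradle-ro-cache \
  -v "$PWD":/home/gradle/project -w /home/gradle/project \
  gradle:jdk8-alpine gradle jar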
Alternatively, to have more control over the cache directory, you can use this in your Dockerfile:
ENV GRADLE_USER_HOME /path/to/custom/cache/dir
VOLUME $GRADLE_USER_HOME
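With those two lines in the Dockerfile, mounting a named volume at that same path reuses the cache across runs, for example (the image name is hypothetical):
docker run --rm -v gradle_cache:/path/to/custom/cache/dir my-gradle-image gradle jar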

Run command in Docker Container only on the first start

I have a Docker image which uses a script (/bin/bash /init.sh) as its entrypoint. I would like to execute this script only on the first start of a container. It should be skipped when the container is restarted or started again after a crash of the Docker daemon.
Is there any way to do this with Docker itself, or do I have to implement some kind of check in the script?
I had the same issue; here is a simple procedure (i.e. a workaround) to solve it:
Step 1:
Create a "myStartupScript.sh" script that contains this code:
#!/bin/bash
# marker file whose existence records that this container has started before
CONTAINER_ALREADY_STARTED="CONTAINER_ALREADY_STARTED_PLACEHOLDER"
if [ ! -e "$CONTAINER_ALREADY_STARTED" ]; then
    touch "$CONTAINER_ALREADY_STARTED"
    echo "-- First container startup --"
    # YOUR_JUST_ONCE_LOGIC_HERE
else
    echo "-- Not first container startup --"
fi
Step 2:
Replace the line "# YOUR_JUST_ONCE_LOGIC_HERE" with the code you want to execute only the first time the container is started.
Step 3:
Set the script as the entrypoint in your Dockerfile:
ENTRYPOINT ["/myStartupScript.sh"]
In summary, the logic is quite simple: the script checks whether a specific marker file is present in the filesystem. If not, it creates the file and executes your just-once code; the next time you start the container the file is already there, so the code is not executed.
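One detail worth adding, assuming your container also has a main process to run: the exec-form ENTRYPOINT passes the Dockerfile's CMD (or the command given to docker run) to the script as arguments, so ending the script with exec "$@" hands control over to that main process after the one-time logic:
#!/bin/bash
# ... first-startup check from Step 1 ...
# hand over to the container's main command (the CMD or the docker run arguments)
exec "$@"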
The entrypoint of a Docker container tells the Docker daemon what to run when you "run" that specific container. So let's ask: what should the container run when it's started a second time, or after being rebooted?
Probably what you are doing is following the same approach you would use with "old-school" provisioning mechanisms: your script is "installing" the needed scripts, and you would then run your app as a systemd/upstart service, right? If so, you should change that into a more "dockerized" definition.
The entrypoint for that container should be a script that actually launches your app instead of setting things up. Let's say you need Java installed to be able to run your app. So in the Dockerfile you set up the base container to install everything you need, like:
FROM alpine:edge
RUN apk --update upgrade && apk add openjdk8-jre-base
RUN mkdir -p /opt/your_app/ && adduser -HD userapp
ADD target/your_app.jar /opt/your_app/your-app.jar
ADD scripts/init.sh /opt/your_app/init.sh
USER userapp
EXPOSE 8081
CMD ["/bin/bash", "/opt/your_app/init.sh"]
At the company I work for, our containers fetch their configs from Consul in the init.sh script before running the actual app (instead of providing a mount point and placing the configs on the host, or embedding them into the container). So the script looks something like:
#!/bin/bash
echo "Downloading config from consul..."
confd -onetime -backend consul -node $CONSUL_URL -prefix /cfgs/$CONSUL_APP/$CONSUL_ENV_NAME
echo "Launching your-app..."
java -jar /opt/your_app/your-app.jar
One piece of advice I can give you (from my admittedly short experience working with containers) is to treat your containers as stateless once they are provisioned (i.e. after all the commands that run before the entrypoint).
I had to do this, and I ended up doing a docker run -d, which just created a detached container with bash running in the background, followed by a docker exec that did the necessary initialization. Here's an example:
docker run -itd --name=myContainer myImage /bin/bash
docker exec -it myContainer /bin/bash -c /init.sh
Now when I restart my container I can just do
docker start myContainer
docker attach myContainer
This may not be ideal, but it works fine for me.
I wanted to do the same in a Windows container. On Windows it can be achieved using Task Scheduler (the Linux equivalent of Task Scheduler is cron, which you could use in your case). To do this, edit the Dockerfile and add the following lines at the end:
WORKDIR /app
COPY myTask.ps1 .
RUN schtasks /Create /TN myTask /SC ONSTART /TR "c:\WINDOWS\system32\WindowsPowerShell\v1.0\powershell.exe C:\app\myTask.ps1" /ru SYSTEM
This creates a task named myTask that runs ONSTART; the task itself executes the PowerShell script placed at "C:\app\myTask.ps1".
This myTask.ps1 script does whatever initialization you need on container startup. Make sure you delete the task once it has executed successfully, or else it will run at every startup. To delete it you can use the following command at the end of the myTask.ps1 script:
schtasks /Delete /TN myTask /F

Jenkins parameterization issue using Cucumber

I'm trying to find the right syntax for an instruction that runs a Docker image, maps a volume, and calls tests written in Cucumber with JUnit output.
When I set the following instruction with "Execute shell" in a job configuration and I don't map any volume, the tests run:
docker run docker-registry.dev.xoom.com/agrimaldi/jasper:${VERSION} cucumber -t #co -f junit -o /opt/xbp_stamp_jasper/features/output
The problem is, I need a volume in order to read the output of the tests. So I tried the following line:
docker run --rm -v /var/lib/jenkins/jobs/qacolombia/workspace/default/features/output:/opt/xbp_stamp_jasper/features/output docker-registry.dev.xoom.com/agrimaldi/jasper:${VERSION} cucumber -t #co -f junit -o /opt/xbp_stamp_jasper/features/output
But Jenkins doesn't seem to recognize the "#" symbol. I've tried several positions of single quotes, for example '#co' or 'cucumber -t #co -f junit -o /opt/xbp_stamp_jasper/features/output', as well as backslashes and double quotes, and Jenkins doesn't recognize the whole instruction. Could you please suggest a way of sending these parameters?
Any help is highly appreciated.
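For what it's worth, a likely culprit is that in a POSIX shell an unquoted # at the start of a word begins a comment, so everything from #co onward is stripped before docker even runs. One sketch worth trying is to push the whole Cucumber invocation into a quoted bash -c string, assuming the image contains bash:
docker run --rm -v /var/lib/jenkins/jobs/qacolombia/workspace/default/features/output:/opt/xbp_stamp_jasper/features/output docker-registry.dev.xoom.com/agrimaldi/jasper:${VERSION} /bin/bash -c 'cucumber -t "#co" -f junit -o /opt/xbp_stamp_jasper/features/output'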