Remove docker container at the end of each test - docker

I'm using Docker to scale the test infrastructure / browsers based on the number of requests received in Jenkins.
I created a Python script that identifies the total number of spec files and the browser type, and spins up that many Docker containers. The Python code contains the logic to determine how many nodes are currently in use or stale, and from that it works out the required number of containers.
I want to programmatically delete the container / de-register the Selenium node at the end of each spec file (the Docker --rm flag is not helping me), so that the next test gets a clean browser and environment.
The Selenium grid runs on the same box as Jenkins. Once I invoke protractor protractor.conf.js (Step 3), the Selenium grid starts distributing the tests to the containers created in Step 1.
When I say '--rm' is not helping, I mean that after Step 3 the communication is mainly between the Selenium hub and the nodes. I'm finding it difficult to determine which node / container the grid used to execute a test, and to remove that container before the grid sends another test to it.
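A minimal sketch of one possible approach (not from the original post): ask the hub which node served a given session, then remove the matching container. This assumes a Selenium Grid 2/3 hub at 172.17.0.1:4444 exposing the /grid/api/testsession endpoint, the docker-py Client API already used in create_test_machine.py below, and that the WebDriver session id is passed in from the test run (e.g. from a protractor hook).
import json
import requests
import docker

c = docker.Client(base_url='unix://var/run/docker.sock', version='1.23')

def remove_node_for_session(session_id):
    # Ask the hub which proxy (node) served this session.
    res = requests.get('http://172.17.0.1:4444/grid/api/testsession',
                       params={'session': session_id})
    proxy_id = json.loads(res.content)['proxyId']   # e.g. 'http://172.17.0.3:5555'
    node_ip = proxy_id.split('//')[1].split(':')[0]
    # Remove the container whose bridge IP matches that node.
    for cont in c.containers():
        ip = cont['NetworkSettings']['Networks']['bridge']['IPAddress']
        if ip == node_ip:
            c.stop(container=cont['Id'])
            c.remove_container(container=cont['Id'])
            break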
-- Jenkins Build Stage --
Shell:
# Step 1
python ./create_test_machine.py ${no_of_containers} # This will spin-up selenium nodes
# Step 2
npm install # install node modules
# Step 3
protractor protractor.conf.js # Run the protractor tests
-- Python code to spin up containers - create_test_machine.py --
Python Script:
import sys
import docker
import docker.utils
import requests
import json
import time

c = docker.Client(base_url='unix://var/run/docker.sock', version='1.23')
my_envs = {'HUB_PORT_4444_TCP_ADDR': '172.17.0.1', 'HUB_PORT_4444_TCP_PORT': 4444}

def check_available_machines(no_of_machines):
    t = c.containers()
    noof_running_containers = len(t)
    if noof_running_containers == 0:
        print("0 containers running. Creating " + str(no_of_machines) + " new containers...")
        spinup_test_machines(no_of_machines)
    else:
        for obj_container in t:
            print(obj_container)
            container_ip_addr = obj_container['NetworkSettings']['Networks']['bridge']['IPAddress']
            container_state = obj_container['State']
            res = requests.get('http://' + container_ip_addr + ':5555/wd/hub/sessions')
            obj = json.loads(res.content)
            node_inuse = len(obj['value'])
            if node_inuse != 0:
                # A node with active sessions is not available for new tests
                noof_running_containers -= 1
        if noof_running_containers < no_of_machines:
            spinup_test_machines(no_of_machines - noof_running_containers)
    return

def spinup_test_machines(no_of_machines):
    '''
    Parameter : Number of test nodes to spin up
    '''
    print("Creating " + str(no_of_machines) + " new containers...")
    # my_envs = docker.utils.parse_env_file('docker.env')
    for i in range(0, no_of_machines):
        new_container = c.create_container(image='selenium/node-chrome', environment=my_envs)
        response = c.start(container=new_container.get('Id'))
        print(new_container, response)
    return

if len(sys.argv) - 1 == 1:
    no_of_machines = int(sys.argv[1]) + 2
    check_available_machines(no_of_machines)
    time.sleep(30)
else:
    print("Invalid number of parameters")

Here the difference between docker run with -d and with --rm can be seen clearly.
Using the -d option:
C:\Users\apps>docker run -d --name testso alpine /bin/echo 'Hello World'
5d447b558ae6bf58ff6a2147da8bdf25b526bd1c9f39117498fa017f8f71978b
Check the logs
C:\Users\apps>docker logs testso
'Hello World'
Check the last run containers
C:\Users\apps>docker ps -a
CONTAINER ID   IMAGE     COMMAND                  CREATED        STATUS                      PORTS   NAMES
5d447b558ae6   alpine    "/bin/echo 'Hello Wor"   35 hours ago   Exited (0) 11 seconds ago           testso
Finally, the user has to remove it explicitly:
C:\Users\apps>docker rm -f testso
testso
Using --rm, the container vanishes, including its logs, as soon as the process run inside the container completes. No trace of the container remains.
C:\Users\apps>docker run --rm --name testso alpine /bin/echo 'Hello World'
'Hello World'
C:\Users\apps>docker logs testso
Error: No such container: testso
C:\Users\apps>docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
I believe it is now clear how to run a container and leave no trace after the process inside it has finished.

To start a container in detached mode, you use -d=true or just the -d option. By design, containers started in detached mode exit when the root process used to run the container exits. A container in detached mode cannot be automatically removed when it stops; this means you cannot use the --rm option together with -d.
Have a look at:
https://docs.docker.com/engine/reference/run/
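Since a detached node container cannot clean itself up with --rm (per the above), the removal has to be done explicitly once the spec file is done. A minimal sketch of that cleanup with the same old docker-py Client API used in the question (the image and environment are taken from the question; when exactly to call the cleanup is up to you):
import docker

c = docker.Client(base_url='unix://var/run/docker.sock', version='1.23')
my_envs = {'HUB_PORT_4444_TCP_ADDR': '172.17.0.1', 'HUB_PORT_4444_TCP_PORT': 4444}

# Start a node container detached (the docker-py equivalent of `docker run -d`).
node = c.create_container(image='selenium/node-chrome', environment=my_envs)
c.start(container=node.get('Id'))

# ... run the spec file against the grid ...

# Explicit cleanup, the equivalent of `docker rm -f <id>`.
c.stop(container=node.get('Id'))
c.remove_container(container=node.get('Id'))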

You can use nose tests. For every def test_xxx(), nose will call the setup and teardown functions declared with the @with_setup decorator. Below is an example:
from nose.tools import with_setup
import docker

c = docker.Client(base_url='unix://var/run/docker.sock', version='1.23')
my_envs = {'HUB_PORT_4444_TCP_ADDR': '172.17.0.1', 'HUB_PORT_4444_TCP_PORT': 4444}
my_containers = {}

def setup_docker():
    """ Set up the test environment:
    create/start your docker container(s), populate the my_containers dict.
    """

def tear_down_docker():
    """Tear down the test environment.
    """
    for container in my_containers.values():
        try:
            c.stop(container=container.get('Id'))
            c.remove_container(container=container.get('Id'))
        except Exception as e:
            print(e)

@with_setup(setup=setup_docker, teardown=tear_down_docker)
def test_xxx():
    # do your test here
    # you can call a subprocess to run your selenium tests
    pass
Or, you can write a separate Python script that detects the containers you set up for your test and then does something like this:
for container in my_containers.values():
    try:
        c.stop(container=container.get('Id'))
        c.remove_container(container=container.get('Id'))
    except Exception as e:
        print(e)
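A minimal sketch of such a standalone cleanup script (an assumption, not from the original answer: the node containers are identified here by the selenium/node-chrome image name used in the question):
import docker

c = docker.Client(base_url='unix://var/run/docker.sock', version='1.23')

# Remove every container created from the selenium node image.
for cont in c.containers(all=True):
    if 'selenium/node-chrome' in cont.get('Image', ''):
        try:
            c.stop(container=cont['Id'])
            c.remove_container(container=cont['Id'])
            print("Removed " + cont['Id'][:12])
        except Exception as e:
            print(e)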

Related

How to run docker image inside GCP Compute Engine instance with Apache Airflow

I am trying to create an Airflow DAG from which I want to spin up a Compute Engine instance with a Docker image stored in Google Container Registry.
In other words, I want to replicate gcloud compute instances create-with-container in an Airflow DAG using the gcloud operators. I searched for Airflow operators for such operations but couldn't find any way to make them work.
Possible references:
https://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/cloud/compute.html
https://cloud.google.com/composer/docs/connect-gce-vm-sshoperator
A simple and clean solution for running a premade container on a VM with Airflow is to chain the three steps below:
create a fresh VM (through a BashOperator) with a startup script that pulls/runs the container and shuts down the VM when the run is done;
use a PythonSensor to check when the VM has stopped (i.e. the container has finished running);
delete the VM (through a BashOperator) so that the previous steps can be repeated the next time the Airflow DAG is triggered.
All we need are the bash commands below:
bash_cmd = {
    'active_account': \
        'gcloud auth activate-service-account MYCLIENTEMAIL '
        '--key-file=/PATH/TO/MY/JSON/SERVICEACCOUNT',
    'set_project': \
        'gcloud config set project MYPROJECTID',
    'list_vm': \
        'gcloud compute instances list',
    'create_vm': \
        'gcloud compute instances create-with-container VMNAME '
        '--project=MYPROJECTID --zone=MYZONE --machine-type=e2-medium '
        '--image=projects/cos-cloud/global/images/cos-stable-101-17162-40-5 '
        '--boot-disk-size=10GB --boot-disk-type=pd-balanced '
        '--boot-disk-device-name=VMNAME '
        '--container-image=eu.gcr.io/MYPROJECTID/MYCONTAINER --container-restart-policy=always '
        '--labels=container-vm=cos-stable-101-17162-40-5 --no-shielded-secure-boot '
        '--shielded-vtpm --shielded-integrity-monitoring '
        '--metadata startup-script="#!/bin/bash\n sleep 10\n sudo useradd -m bob\n sudo -u bob docker-credential-gcr configure-docker\n sudo usermod -aG docker bob\n sudo -u bob docker run eu.gcr.io/MYPROJECTID/MYCONTAINER\n sudo poweroff" ',
    'delete_vm': \
        'gcloud compute instances delete VMNAME --zone=MYZONE --delete-disks=boot',
}
active_account and set_project are used respectively to activate the service account and set the correct working project (where we want to run the VMs). This is needed when Airflow is running outside the GCP project where the VMs are instantiated. It's also important that the service account used has Compute Engine privileges. The container image to run must be located in the container registry of the same project where the VMs are instantiated.
list_vm returns the list of existing VMs in the project with their properties and status (RUNNING/TERMINATED).
create_vm creates the VM, attaching the container to run from the container registry. The command to create the VM can be customized according to your needs. Importantly, you must add a --metadata startup-script that runs the container and powers off the VM when the container finishes running (to see how the startup script is generated, see here).
delete_vm simply deletes the VM created by create_vm.
All these commands can be combined to work together in an Airflow DAG in this way:
import re
import os
import datetime
import subprocess

from airflow import models
from airflow.sensors.python import PythonSensor
from airflow.operators.bash_operator import BashOperator

VMNAME = 'VMNAME'  # placeholder: must match the VM name used in bash_cmd

def vm_run_check():
    "function to list all the VMs and check their status"
    finish_run = False
    output = subprocess.check_output(
        bash_cmd['active_account'] + " && " + \
        bash_cmd['set_project'] + " && " + \
        bash_cmd['list_vm'],
        shell=True
    )
    output = output.decode("utf-8").split("\n")[:-1]
    machines = []
    for i in range(1, len(output)):
        m = {}
        # parse the fixed-width gcloud output, using the header row as a template
        for match in re.finditer(r"([A-Z_]+)( +)?", output[0] + " " * 10):
            span = match.span()
            m[match.group().strip()] = output[i][span[0]:span[1]].strip()
        machines.append(m)
    machines = {m['NAME']: m for m in machines}
    if VMNAME in machines:
        if machines[VMNAME]['STATUS'] == 'TERMINATED':
            finish_run = True
    return finish_run

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'email': [''],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 0,
}

with models.DAG(
    'MYDAGNAME',
    catchup=False,
    default_args=default_args,
    start_date=datetime.datetime.now() - datetime.timedelta(days=3),
    schedule_interval='0 4 * * *',  # every day at 04:00 AM UTC
) as dag:

    create_vm = BashOperator(
        task_id="create_vm",
        bash_command=bash_cmd['active_account'] + " && " + \
                     bash_cmd['set_project'] + " && " + \
                     bash_cmd['create_vm']
    )

    sensor_vm_run = PythonSensor(
        task_id="sensor_vm_run",
        python_callable=vm_run_check,
        poke_interval=60 * 2,   # check every 2 minutes
        timeout=60 * 60,        # give up after an hour
        soft_fail=True,
        mode="reschedule",
    )

    delete_vm = BashOperator(
        task_id="delete_vm",
        bash_command=bash_cmd['active_account'] + " && " + \
                     bash_cmd['set_project'] + " && " + \
                     bash_cmd['delete_vm']
    )

    create_vm >> sensor_vm_run >> delete_vm

Can't run Docker command via ssh and python's subprocess module

I am trying to automatically run a docker build command using the subprocess module, as follows:
command = "docker build -t image_name ."
ssh_command = "ssh -o 'StrictHostKeyChecking=no' -i 'XXXXX.pem' ubuntu@" + cur_public_ip + " " + command
retval = subprocess.run(command.split(" "), stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
if retval.stderr != '':
    print('Error trace: ')
    print(retval.stderr)
else:
    print("Docker image successfully built.")
    print(retval.stdout)
Interestingly, if I run this command (the string that is the command variable) after I manually SSH into my ec2 instance, it works fine.
But when I run the code above, I get this error:
Error trace:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
I can't seem to solve this problem, and I am stuck, since I don't see how what I am doing differs from manually SSHing into the instance and running the command.
The Docker daemon is definitely running, since I can build manually through an SSH terminal. I've tried changing the rwx permissions of the Dockerfile and all related files on the EC2 instance, but that did not help either.
How do I make this work? I need to programmatically be able to do this.
Thank you.
Your first problem is that you're only passing command to subprocess.run, so you're running docker build locally:
                        +--- look here
                        |
                        v
retval = subprocess.run(command.split(" "), stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
Your second problem is that you've got way too much quoting going on in ssh_command, which is going to result in a number of problems. As written, for example, you'll be passing the literal string 'StrictHostKeyChecking=no' to ssh, resulting in an error like:
command-line: line 0: Bad configuration option: 'stricthostkeychecking
Because you're not executing your command via a shell, all of those quotes will be passed literally in the command line.
Rather than calling command.split(" "), you would be better off just building the command as a list, something like this:
import subprocess

cur_public_ip = "1.2.3.4"

command = ["docker", "build", "-t", "image_name", "."]

ssh_command = [
    "ssh",
    "-o",
    "stricthostkeychecking=no",
    "-i",
    "XXXXX.pem",
    f"ubuntu@{cur_public_ip}",
] + command

retval = subprocess.run(
    ssh_command,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
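As a follow-up (not part of the original answer), the result can then be checked via the return code instead of inspecting stderr:
# Hypothetical follow-up: fail loudly if the remote docker build exits non-zero.
if retval.returncode != 0:
    print("Error trace:")
    print(retval.stderr)
else:
    print("Docker image successfully built.")
    print(retval.stdout)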

Cronjob Not Running via Crontab -e

I am trying to run a script daily that connects to my ESXi host, deletes all snapshots of all my VMs, then creates new snapshots. I am attempting to do this by running the script within a Docker container using the VMware PowerCLI Docker image (https://hub.docker.com/r/vmware/powerclicore) on my Docker VM running Ubuntu.
I am able to successfully run this script by running the following command in terminal:
/usr/bin/docker run --rm -it --name=powerclicore --entrypoint="/usr/bin/pwsh" -v /home/<redacted>/config/powercli:/scripts vmware/powerclicore /scripts/VMSnapshot.ps1
However, after adding the above command to my cronfile via crontab -e, my job is not running.
GNU nano 4.8 /tmp/crontab.EZkliu/crontab
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').
#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h dom mon dow command
0 0 * * * /usr/bin/docker run --rm -it --name=powerclicore --entrypoint="/usr/bin/pwsh" -v /home/<redacted>/config/powercli:/scripts vmware/powerclicore /scripts/VMSnapshot.ps1
Am I doing this wrong? A second pair of eyes would be much appreciated!
@KazikM. After some troubleshooting, I was finally able to get this to work using your recommendation. I created a Bash script that calls the Docker command I was trying to put into crontab, then just called that Bash script from crontab: 0 0 * * * /bin/bash /home/<redacted>/config/powerclicore/VMSnapshot-bash.sh. Thanks again for your help!

who and w commands in CentOS 8 Docker container

While playing with CentOS 8 in a Docker container I found out that the output of the who and w commands is always empty.
[root@9e24376316f1 ~]# who
[root@9e24376316f1 ~]# w
01:01:50 up 7:38, 0 users, load average: 0.00, 0.04, 0.00
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
Even when I'm logged in as a different user in a second terminal.
When I want to write to this user, it shows:
[root@9e24376316f1 ~]# write test
write: test is not logged in
Is this because of Docker? Maybe it works in some way that prevents sessions from seeing each other?
Or maybe that's some other issue. I would really appreciate some explanation.
These utilities obtain the information about current logins from the utmp file (/var/run/utmp). You can easily check that in ordinary circumstances (e.g. on the desktop system) this file contains something like the following string (here qazer is my login and tty7 is a TTY where my desktop environment runs):
$ cat /var/run/utmp
tty7:0qazer:0�o^�
while in the container this file is (usually) empty:
$ docker run -it centos
[root@5e91e9e1a28e /]# cat /var/run/utmp
[root@5e91e9e1a28e /]#
Why?
The utmp file is usually modified by programs which authenticate the user and start the session: login(1), sshd(8), lightdm(1). However, the container engine cannot rely on them, as they may be absent from the container file system, so "logging in" and "executing on behalf of" is implemented in the most primitive and straightforward manner, avoiding any reliance on anything inside the container.
When any container is started or any command is exec'd inside it, the container engine just spawns a new process, arranges some security settings, calls setgid(2)/setuid(2) to forcibly (without any authentication) alter the process's UID/GID and then executes the required binary (the entry point, the command, and so on) within this process.
Say, I start the CentOS container running its main process on behalf of UID 42:
docker run -it --user 42 centos
and then try to execute sleep 1000 inside it:
docker exec -it $CONTAINER_ID sleep 1000
The container engine will perform something like this:
[pid 10170] setgid(0) = 0
[pid 10170] setuid(42) = 0
...
[pid 10170] execve("/usr/bin/sleep", ["sleep", "1000"], 0xc000159740 /* 4 vars */) = 0
There will be no writes to /var/run/utmp, thus it will remain empty, and who(1)/w(1) will not find any logins inside the container.

docker exec command doesn't return after completing execution

I started a Docker container based on an image which has a file run.sh in it. Within a shell script, I use docker exec as shown below:
docker exec <container-id> sh /test.sh
test.sh completes execution, but docker exec does not return until I press Ctrl+C. As a result, my shell script never ends. Any pointers to what might be causing this?
I could get it working by adding the -it parameters:
docker exec -it <container-id> sh /test.sh
Mine works like a charm with this command. Maybe you only forgot the path to the binary (/bin/sh)?
docker exec 7bd877d15c9b /bin/bash /test.sh
File location at
/test.sh
File Content:
#!/bin/bash
echo "Hi"
echo
echo "This works fine"
sleep 5
echo "5"
Output:
ArgonQQ@Terminal ~ docker exec 7bd877d15c9b /bin/bash /test.sh
Hi
This works fine
5
ArgonQQ@Terminal ~
My case is a script a.sh with content like:
php test.php &
If I execute it like:
docker exec container1 a.sh
it also never returns.
After half a day of googling and trying, I changed a.sh to:
php test.php >/tmp/test.log 2>&1 &
It works!
So it seems related to stdin/stdout/stderr.
>/tmp/test.log 2>&1
Please try.
And please note that my test.php is an infinite-loop script that monitors a specified process; if the process is down, it restarts it. So test.php never exits.
As described here, this "hanging" behavior occurs when you have processes that keep stdout or stderr open.
To prevent this from happening, each long-running process should:
be executed in the background, and
close both stdout and stderr or redirect them to files or /dev/null.
I would therefore make sure that any processes already running in the container, as well as the script passed to docker exec, conform to the above.
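For instance, if the long-running process is launched from a Python script inside the container, a minimal sketch following that advice could look like this (the php test.php command is the one from the answer above; everything else is illustrative):
import subprocess

# Launch the long-running process in the background with stdin/stdout/stderr
# detached, so the `docker exec` that started this script can return.
subprocess.Popen(
    ["php", "test.php"],          # the long-running command
    stdin=subprocess.DEVNULL,
    stdout=subprocess.DEVNULL,    # or redirect to a log file such as /tmp/test.log
    stderr=subprocess.DEVNULL,
)
# This script exits immediately; the child keeps running inside the container.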
OK, I got it.
docker stop a590382c2943
docker start a590382c2943
Then it will be OK:
docker exec -ti a590382c2943 echo "5"
returns immediately, whether -it is added or not.
Actually, in my program the daemon holds stdin, stdout and stderr, so I changed my Python daemon as follows, and things work like a charm:
import os
import sys
import time

# serve() is the daemon's main loop, defined elsewhere in the program.

if __name__ == '__main__':
    # do the UNIX double-fork magic, see Stevens' "Advanced
    # Programming in the UNIX Environment" for details (ISBN 0201563177)
    try:
        pid = os.fork()
        if pid > 0:
            # exit first parent
            os._exit(0)
    except OSError as e:
        print("fork #1 failed: %d (%s)" % (e.errno, e.strerror))
        os._exit(0)

    # decouple from parent environment
    # os.chdir("/")
    os.setsid()
    os.umask(0)

    # redirect stdin, stdout and stderr to /dev/null
    si = open('/dev/null', 'r')
    so = open('/dev/null', 'a+')
    se = open('/dev/null', 'a+')
    os.dup2(si.fileno(), sys.stdin.fileno())
    os.dup2(so.fileno(), sys.stdout.fileno())
    os.dup2(se.fileno(), sys.stderr.fileno())

    # do second fork
    while True:
        try:
            pid = os.fork()
            if pid == 0:
                serve()
            if pid > 0:
                print("Server PID %d, Daemon PID: %d" % (pid, os.getpid()))
                os.wait()
                time.sleep(3)
        except OSError as e:
            # print("fork #2 failed: %d (%s)" % (e.errno, e.strerror))
            os._exit(0)
