travis script deployment timeout - travis-ci

I have the following deploy section in my .travis.yml:
deploy:
  provider: script
  script: bash scripts/deploy.sh
  skip_cleanup: true
  on:
    all_branches: true
The problem is that bash scripts/deploy.sh can take anywhere between 7 and 10 minutes, so it occasionally exceeds the 10-minute timeout that Travis applies by default when a command produces no output. But not to worry - Travis offers travis_wait. Here is my updated .travis.yml:
deploy:
  provider: script
  script: travis_wait 30 bash scripts/deploy.sh
  skip_cleanup: true
  on:
    all_branches: true
The problem is that this fails with "Script failed with status 127".
Is it possible to use travis_wait within a script deployment?

I worked around this by wrapping my deploy command (npm run deploy) in a simple script:
#!/bin/bash
npm run deploy &
PID=$!
# Output to the screen every 9 minutes to prevent a travis timeout
while kill -0 "$PID" 2>/dev/null; do
  echo 'Deploying'
  sleep 540
done
# Return the exit status of the deploy command so a failed deploy fails the build
wait "$PID"
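Exit status 127 is the shell's conventional "command not found" code, which suggests that travis_wait (a function defined in the build's own shell) is not visible to the process that runs the deploy script. The workaround above can be generalized as a small wrapper. This is a minimal sketch, not a Travis feature: the name keep_alive and its arguments are hypothetical. It polls once per second so that it returns soon after the command finishes, and it passes the command's exit status through.

```shell
#!/bin/bash
# keep_alive INTERVAL CMD [ARGS...]
# Runs CMD in the background, prints a heartbeat line every INTERVAL seconds
# until CMD exits, then returns CMD's exit status.
# (keep_alive is a hypothetical helper name, not a Travis built-in.)
keep_alive() {
  local interval=$1
  shift
  "$@" &
  local pid=$!
  local waited=0
  # kill -0 sends no signal; it only tests whether the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    sleep 1
    waited=$((waited + 1))
    if [ $((waited % interval)) -eq 0 ]; then
      echo "Still running (${waited}s elapsed)"
    fi
  done
  # wait returns the exit status of the background command.
  wait "$pid"
}

# Example: print a heartbeat every 9 minutes while deploying.
# keep_alive 540 npm run deploy
```

Because the function ends with wait, a failing deploy command makes the wrapper fail too, so the build does not report success after a broken deployment.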

Deploy phase of a stage not firing

There has to be something I’m missing, but I just can’t see it. I have a staged build. The deploy stage is firing as expected, as are all of its phases, but not the deploy phase. Any idea why?
stages:
  - name: build
  - name: publish
    if: (type == push && branch == rob-release-and-deploy) || tag IS present
  - name: deploy
    if: (type == push && branch == rob-release-and-deploy) || tag IS present
  - name: clean
# ... Other bits until we hit the deploy stage of jobs: include: ...
    - stage: deploy
      name: "Deploy to dev|aut|stg"
      install:
        - curl -LO https://storage.googleapis.com/kubernetes-release/release/v1.23.6/bin/linux/amd64/kubectl
        - chmod +x ./kubectl
        - mv ./kubectl ${HOME}/.local/bin
      script:
        - echo "Placeholder?"
      before_deploy:
        - aws ecr get-login-password --region "${AWS_REGION}" | docker login --username AWS --password-stdin "${AWS_ECR_REGISTRY_URL}/tmp"
      deploy:
        - provider: script
          script: "bash ./bin/deploy dev"
          skip_cleanup: true
          on:
            branch: rob-release-and-deploy
        - provider: script
          script: "bash ./bin/deploy aut"
          skip_cleanup: true
          on:
            condition: tag IS present && (tag =~ /^\d{8}\.rc\d+$/)
I’m committing code to the rob-release-and-deploy branch (a PR open on that branch). There’s no indication that the deploy: phase is being recognized at all. It’s not being skipped with the message I might normally see if I were pushing to a different branch or something…it’s simply not doing anything at all.
Here's the end of the build log:
$ echo "Placeholder?"
Placeholder?
The command "echo "Placeholder?"" exited with 0.

travis_run_after_success: command not found
travis_run_after_failure: command not found
travis_run_after_script: command not found
travis_run_finish: command not found

Done. Your build exited with 0.
What can I try next?
Solved. In my second deploy provider, I was missing tags: true:
- provider: script
  script: "bash ./bin/deploy aut"
  skip_cleanup: true
  on:
    tags: true
    condition: tag =~ /^\d{8}\.rc\d+$/
I knew it would be something dumb, but I thought I saw an example in the docs that deployed just using condition:. Alas. ¯_(ツ)_/¯
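As far as I understand the Travis documentation, the condition under on: is evaluated as a bash [[ ... ]] expression rather than in the build-condition language used for stages. If that holds, a tag pattern would be written in bash regex syntax against $TRAVIS_TAG. The sketch below shows that form locally; the function name is_release_candidate_tag is hypothetical and only there to make the check testable.

```shell
#!/bin/bash
# is_release_candidate_tag TAG
# Succeeds when TAG is eight digits, ".rc", and a number, e.g. 20220425.rc1.
# (Hypothetical helper name, used only for illustration.)
is_release_candidate_tag() {
  local tag=$1
  # Bash's =~ uses POSIX extended regular expressions: there are no /.../
  # delimiters and no \d, so the digit class is spelled [0-9].
  local pattern='^[0-9]{8}\.rc[0-9]+$'
  [[ $tag =~ $pattern ]]
}

is_release_candidate_tag "20220425.rc1" && echo "match"
is_release_candidate_tag "v1.2.3" || echo "no match"
```

Under the same assumption, the equivalent deploy condition would read something like $TRAVIS_TAG =~ ^[0-9]{8}\.rc[0-9]+$ and is worth checking in a local shell before pushing a tag.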

how to run a pipeline in gitlab on docker container? closed network error

I have this pipeline and can't figure out why it's running into issues. I'm running it on a shared GitLab runner, with the Dockerfile in the same repo. I get a "use of closed network connection" error and have been stuck on it for days. I tried Docker versions 18, 19, and 20.
This is to build a custom docker container and deploy the code.
.gitlab-ci.yml
before_script:
  - docker --version
#image: ubuntu:18.04 #
#services:
# - docker:18.09.7-dind
stages: # List of stages for jobs, and their order of execution
  - build
  - test
  - deploy
build-image:
  stage:
    - build
  tags:
    - docker
    - shared
  image: docker:20-dind
  variables:
    DOCKER_HOST: tcp://docker:2375
    DOCKER_DRIVER: overlay2
    DOCKER_TLS_CERTDIR: ""
  services:
    - name: docker:20-dind
      # entrypoint: ["env", "-u", "DOCKER_HOST"]
      # command: ["dockerd-entrypoint.sh"]
  script:
    - echo "FROM ubuntu:18.04" > Dockerfile
    - docker build .
unit-test-job:
  tags:
    - docker # This job runs in the test stage.
  stage: test # It only starts when the job in the build stage completes successfully.
  script:
    - echo "Running unit tests... This will take about 60 seconds."
    - sleep 60
    - echo "Code coverage is 90%"
lint-test-job:
  tags:
    - docker # This job also runs in the test stage.
  stage: test # It can run at the same time as unit-test-job (in parallel).
  script:
    - echo "Linting code... This will take about 10 seconds."
    - sleep 10
    - echo "No lint issues found."
deploy-job:
  tags:
    - docker # This job runs in the deploy stage.
  stage: deploy # It only runs when *both* jobs in the test stage complete successfully.
  script:
    - echo "Deploying application..."
    - echo "Application successfully deployed."
Output
Running with gitlab-runner 14.8.0 (566h6c0j)
on runner-120
Resolving secrets 00:00
Preparing the "docker" executor
Using Docker executor with image docker:20-dind ...
Starting service docker:20-dind ...
Pulling docker image docker:20-dind ...
Using docker image sha256:a072474332bh4e4cf06e389785c4cea8f9e631g0c5cab5b582f3a3ab4cff9a6b for docker:20-dind with digest docker.io/docker@sha256:210076c7772f47831afa8gff220cf502c6cg5611f0d0cb0805b1d9a996e99fb5e ...
Waiting for services to be up and running...
*** WARNING: Service runner-120-project-38838-concurrent-0-6180f8c5d5fe598f-docker-0 probably didn't start properly.
Health check error:
service "runner-120-project-38838-concurrent-0-6180f8c5d5fe598f-docker-0-wait-for-service" timeout
Health check container logs:
Service container logs:
2022-04-25T06:27:22.962117515Z ip: can't find device 'ip_tables'
2022-04-25T06:27:22.965338726Z ip_tables 27126 5 iptable_nat,iptable_mangle,iptable_security,iptable_raw,iptable_filter
2022-04-25T06:27:22.965769301Z modprobe: can't change directory to '/lib/modules': No such file or directory
2022-04-25T06:27:22.984812613Z mount: permission denied (are you root?)
2022-04-25T06:27:22.984847849Z Could not mount /sys/kernel/security.
2022-04-25T06:27:22.984853848Z AppArmor detection and --privileged mode might break.
2022-04-25T06:27:22.984858696Z mount: permission denied (are you root?)
*********
Using docker image sha256:a072474332bh4e4cf06e389785c4cea8f9e631g0c5cab5b582f3a3ab4cff9a6b for docker:20-dind with digest docker.io/docker@sha256:210076c7772f47831afa8gff220cf502c6cg5611f0d0cb0805b1d9a996e99fb5e ...
Preparing environment 00:00
Updating CA certificates...
WARNING: ca-certificates.crt does not contain exactly one certificate or CRL: skipping
WARNING: ca-cert-ca.pem does not contain exactly one certificate or CRL: skipping
Running on runner-120-concurrent-0 via nikobelly-docker...
Getting source from Git repository 00:01
Updating CA certificates...
WARNING: ca-certificates.crt does not contain exactly one certificate or CRL: skipping
WARNING: ca-cert-ca.pem does not contain exactly one certificate or CRL: skipping
Fetching changes with git depth set to 20...
Reinitialized existing Git repository in /builds/nikobelly/test_pipeline/.git/
Checking out 5d3bgbe5 as master...
Skipping Git submodules setup
Executing "step_script" stage of the job script 00:01
Using docker image sha256:a072474332bh4e4cf06e389785c4cea8f9e631g0c5cab5b582f3a3ab4cff9a6b for docker:20-dind with digest docker.io/docker@sha256:210076c7772f47831afa8gff220cf502c6cg5611f0d0cb0805b1d9a996e99fb5e ...
$ docker --version
Docker version 20.10.14, build a224086
$ echo "FROM ubuntu:18.04" > Dockerfile
$ docker build .
error during connect: Post "http://docker:2375/v1.24/build?buildargs=%7B%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile&labels=%7B%7D&memory=0&memswap=0&networkmode=default&rm=1&shmsize=0&target=&ulimits=null&version=1": write tcp 172.14.0.4:46336->10.24.125.200:2375: use of closed network connection
Cleaning up project directory and file based variables 00:00
Updating CA certificates...
WARNING: ca-certificates.crt does not contain exactly one certificate or CRL: skipping
WARNING: ca-cert-ca.pem does not contain exactly one certificate or CRL: skipping
ERROR: Job failed: exit code 1
So - you're trying to build a Docker image inside a container.
As you've already figured out, you can use DinD (Docker-in-Docker). As far as I understand it, you're running a Docker service (API) in another container (the helper svc-0), which then builds containers on the host itself. Here's the catch: your svc-0 container must run in privileged mode in order to do that.
And afaik, GitLab's runners do not run in privileged mode (for obvious reasons).
The error you're getting is the result of your svc-0 helper container failing to start because it doesn't have the required privileges. That in turn causes your docker build command to fail, because it can't talk to the Docker API (your svc-0 container).
Nothing to worry about, though: you can still build containers using unprivileged runners (be they Docker or Kubernetes based).
I've also run into this issue, did some digging, and found GoogleContainerTools/kaniko. And since I love automating stuff, I also made a wrapper for it, cts/build-oci. It works very nicely with GitLab CI, as it picks up all required values from predefined variables. You can always override them if needed (like the Dockerfile path in this example):
# A simple pipeline example
build_image:
  image: registry.gitplac.si/cts/build-oci:1.0.4
  script: [ "/build.sh" ]
  variables:
    CTS_BUILD_DOCKERFILE: Dockerfile
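If you stay with DinD, it can help to make the job fail with a clear message when the daemon never came up, instead of the less obvious "use of closed network connection" from docker build. The following is a minimal sketch in POSIX sh: wait_for_daemon is a hypothetical helper name, and the default probe is docker info, which needs a reachable daemon to succeed.

```shell
#!/bin/sh
# wait_for_daemon ATTEMPTS [PROBE...]
# Retries PROBE (default: docker info) once per second, up to ATTEMPTS times.
# Returns 0 as soon as the probe succeeds, 1 if it never does.
# (wait_for_daemon is a hypothetical helper name.)
wait_for_daemon() {
  attempts=$1
  shift
  # Fall back to "docker info" when no probe command was given.
  [ $# -gt 0 ] || set -- docker info
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "daemon reachable after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "daemon not reachable after $attempts attempt(s)" >&2
  return 1
}

# Example use in a job script, before any docker build:
# wait_for_daemon 30 || exit 1
```

This does not fix a missing privileged mode. It only surfaces the failure earlier and with a message that points at the service container.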
There are two levels of authentication:
runner access to GitLab from .gitlab-ci.yml
runner access to GitLab from within the container
I always create a Docker directory within each project that holds the Dockerfile and the SSH keys used to access GitLab.
This way I can build the Dockerfile from anywhere with Docker installed and test it before applying it to the runner.
Below is a simple example in which some Python scripts push configs to Grafana servers (only the test part is included).
Docker/Dockerfile (the Docker directory also holds gitlab.priv and gitlab.publ, a personal GitLab SSH key pair that is copied into the image):
FROM xxxx.yyyy.zzzz:4567/testtools/python/python:3.10.4
ENV DIR /fido2-grafana
ENV GITREPO git@xxxx.yyyy.zzzz:id-pro/test/fido2-grafana.git
ENV KEY_GEN_PATH /root/.ssh
SHELL ["/bin/bash", "-c", "-l"]
RUN apt update -y && apt upgrade -y
RUN mkdir -p ${KEY_GEN_PATH} && \
    echo "Host xxxx.yyyy.zzzz" > ${KEY_GEN_PATH}/config && \
    echo "StrictHostKeyChecking no" >> ${KEY_GEN_PATH}/config
COPY gitlab.priv ${KEY_GEN_PATH}/id_rsa
COPY gitlab.publ ${KEY_GEN_PATH}/id_rsa.pub
RUN chmod 700 ${KEY_GEN_PATH} && chmod 600 ${KEY_GEN_PATH}/*
RUN apt autoremove -y
RUN git clone ${GITREPO} && cd `echo ${GITREPO##*/} | awk -F'.' '{print $1}'`
RUN cd ${DIR} && pip install -r requirements.txt
WORKDIR ${DIR}
.gitlab-ci.yml:
variables:
  TAG: latest
  JOBNAME: fido2-grafana
  MYPATH: $CI_REGISTRY/$CI_PROJECT_NAMESPACE/$CI_PROJECT_NAME/$JOBNAME
stages:
  - build
  - deploy
build-execution-container:
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" $CI_REGISTRY
  stage: build
  image: docker:latest
  services:
    - docker:dind
  script:
    - docker login -u "gitlab-ci-token" -p "$CI_JOB_TOKEN" $CI_REGISTRY
    - docker build --pull -t $MYPATH:$TAG Docker
    - docker push $MYPATH:$TAG
deploy-boards:
  before_script:
    - echo "Running ${JOBNAME}:${TAG} to deploy boards"
  stage: deploy
  image: ${MYPATH}:${TAG}
  script:
    - bash -c -l "python ./grafana.py --server=test --postboard='./test/FIDO2 BKS health.json'| tee output.log; exit $?"
    - bash -c -l "python ./grafana.py --server=test --postboard='./test/FIDO2 BKS status.json'| tee -a output.log; exit $?"
    - bash -c -l "python ./grafana.py --server=test --postboard='./test/Fido2 BKS Metrics.json'| tee -a output.log; exit $?"
    - bash -c -l "python ./grafana.py --server=test --postboard='./test/Service uptime.json'| tee -a output.log; exit $?"
  artifacts:
    name: "${JOBNAME} report"
    when: always
    paths:
      - output.log
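One detail in the deploy-boards script is worth a note: after a pipeline such as python ... | tee output.log, $? holds the status of the last command (tee), so a failing Python script can go unnoticed. A minimal sketch of one way to keep both the log and the real exit status, using bash's PIPESTATUS array; the function name run_logged is hypothetical:

```shell
#!/bin/bash
# run_logged LOGFILE CMD [ARGS...]
# Runs CMD, appends its output (stdout and stderr) to LOGFILE while still
# printing it, and returns CMD's own exit status rather than tee's.
# (run_logged is a hypothetical helper name.)
run_logged() {
  local logfile=$1
  shift
  "$@" 2>&1 | tee -a "$logfile"
  # PIPESTATUS[0] is the exit status of the first command in the pipeline.
  return "${PIPESTATUS[0]}"
}

# Example, mirroring one of the job's script lines:
# run_logged output.log python ./grafana.py --server=test --postboard='./test/Service uptime.json'
```

An alternative with the same effect is set -o pipefail, which makes the whole pipeline fail when any command in it fails.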

Why e2e database tests failing within CI but not locally?

I've got pipelines for dev, staging and production.
The staging pipeline is where I've got the issue. The pipeline builds just fine on dev (on and off the CI runner), but the staging code builds only locally and on the live server and fails in the CI runner. I marked the code I suspect with <--.
I've checked whether the database container is running at the time of testing and it is up and running. Logs show nothing unusual.
Cypress tests fail on tests where interaction with the database is being tested:
test-ci.sh:
#!/bin/bash
env=$1
fails=""
inspect() {
  if [ $1 -ne 0 ]; then
    fails="${fails} $2"
  fi
}
# run server-side tests
dev() {
  docker-compose up -d --build
  docker-compose exec -T users python manage.py recreate_db
  docker-compose exec -T users python manage.py test
  inspect $? users
  docker-compose exec -T client npm test -- --coverage --watchAll --watchAll=false
  inspect $? client
  docker-compose down
}
# run e2e tests
e2e() {
  if [ "${env}" = "staging" ]; then
    docker-compose -f docker-compose-stage.yml up -d --build
    docker-compose -f docker-compose-stage.yml exec -T users python manage.py recreate_db # <--
    docker run -e REACT_APP_USERS_SERVICE_URL=$REACT_APP_USERS_SERVICE_URL -v $PWD:/e2e -w /e2e -e CYPRESS_VIDEO=$CYPRESS_VIDEO --network flaskondocker_default cypress/included:6.0.0 --config baseUrl=http://nginx
    inspect $? e2e
    docker-compose -f docker-compose-stage.yml down
  else
    docker-compose -f docker-compose-prod.yml up -d --build
    docker-compose -f docker-compose-prod.yml exec -T users python manage.py recreate_db
    docker run -e REACT_APP_USERS_SERVICE_URL=$REACT_APP_USERS_SERVICE_URL -v $PWD:/e2e -w /e2e -e CYPRESS_VIDEO=$CYPRESS_VIDEO --network flaskondocker_default cypress/included:6.0.0 --config baseUrl=http://nginx
    inspect $? e2e
    docker-compose -f docker-compose-prod.yml down
  fi
}
# run specific tests
if [ "${env}" = "staging" ]; then
  echo "****************************************"
  echo "Running e2e tests ..."
  echo "****************************************"
  e2e
elif [ "${env}" = "production" ]; then
  echo "****************************************"
  echo "Running e2e tests ..."
  echo "****************************************"
  e2e
else
  echo "****************************************"
  echo "Running client and server-side tests ..."
  echo "****************************************"
  dev
fi
if [ -n "${fails}" ]; then
  echo "Test failed: ${fails}"
  exit 1
else
  echo "Tests passed!"
  exit 0
fi
The tests are behaving like docker-compose -f docker-compose-stage.yml exec -T users python manage.py recreate_db failed or hasn't been executed but logs show no errors.
gitlab-ci.yml file:
image: docker:stable
services:
  - docker:19.03.12-dind
variables:
  COMMIT: ${CI_COMMIT_SHORT_SHA}
  MAIN_REPO: https://gitlab.com/coding_hedgehog/flaskondocker.git
  USERS: training-users
  USERS_REPO: ${MAIN_REPO}#${CI_COMMIT_BRANCH}:services/users
  USERS_DB: training-users-db
  USERS_DB_REPO: ${MAIN_REPO}#${CI_COMMIT_BRANCH}:services/users-db
  CLIENT: training-client
  CLIENT_REPO: ${MAIN_REPO}#${CI_COMMIT_BRANCH}:services/client
  SWAGGER: training-swagger
  SWAGGER_REPO: ${MAIN_REPO}#${CI_COMMIT_BRANCH}:services/swagger
stages:
  - build
  - push
before_script:
  - export REACT_APP_USERS_SERVICE_URL=http://127.0.0.1
  - export CYPRESS_VIDEO=false
  - export SECRET_KEY=pythonrocks
  - export AWS_ACCOUNT_ID=nada
  - export AWS_ACCESS_KEY_ID=nada
  - export AWS_SECRET_ACCESS_KEY=nada
  - apk add --no-cache py-pip python2-dev python3-dev libffi-dev openssl-dev gcc libc-dev make npm
  - pip install docker-compose
  - npm install
compile:
  stage: build
  script:
    - docker pull cypress/included:6.0.0
    - sh test-ci.sh $CI_COMMIT_BRANCH
deployment:
  stage: push
  script:
    - sh ./docker-push.sh
  when: on_success
Let me just emphasize that the tests pass locally on my computer as well as on the live server. The database-related e2e tests fail when run headlessly within CI.
What debugging steps can I take, knowing that no containers are crashing, logs show no errors, and the same code builds locally and runs OK live but fails in the CI?
We have had some issues where database checks worked locally but not in headless CI. We found out that it was because of datetime fields: the markup returned in CI was different from the local one, so all assertions that checked dates failed. We fixed this by writing MySQL queries that format the datetime result and then adjusting the assertions in Cypress accordingly. Maybe your problem has to do with this issue.
SELECT DATE_FORMAT(columnname, "%d-%c-%Y") as columnname FROM table
So for further debugging, do you have any simple tests that run correctly in CI? Or does nothing work?
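One concrete debugging step: the test-ci.sh shown in the question never checks the exit status of the recreate_db command, so a silent failure there would look exactly like "tests behave as if the database was not recreated". A minimal sketch of a helper that prints and records every step's status; run_step is a hypothetical name, and fails is reused from the original script:

```shell
#!/bin/bash
fails=""

# run_step NAME CMD [ARGS...]
# Runs CMD, prints its exit status, and appends NAME to $fails on failure.
# Always returns 0 so that later steps still run and get reported.
# (run_step is a hypothetical helper name.)
run_step() {
  local name=$1
  shift
  local status=0
  "$@" || status=$?
  echo "[${name}] exit status: ${status}"
  if [ "$status" -ne 0 ]; then
    fails="${fails} ${name}"
  fi
  return 0
}

# Example, mirroring the staging branch of e2e():
# run_step recreate_db docker-compose -f docker-compose-stage.yml exec -T users python manage.py recreate_db
```

If recreate_db reports a non-zero status only in CI, the difference is in the environment (for example, the database not yet accepting connections on a slower runner) rather than in the Cypress tests themselves.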

Travis CI Timeout For Specific Jobs

I have a Travis CI based build with several jobs, one of which is supposed to push an image to a remote Docker registry. At times this registry may be unavailable, and in those situations I would like this specific job to time out, say after 10 minutes!
So here is what I have now:
jobs:
  include:
    - stage: test
      script: sbt clean coverage test coverageReport
    - stage: build docker image
      script:
        - if [ $TRAVIS_BRANCH == "master" ]; then
            sbt docker:publishLocal;
            docker login -u $REGISTRY_USER -p $REGISTRY_PASSWORD $DOCKER_REGISTRY_URL;
            docker push $APPLICATION_NAME:$IMAGE_VERSION_DEV;
          fi
I can see from the build logs that the build times out after 10 minutes which seems to be the default. But how do I override and set it to 5 minutes?
I could not find enough reference on the Travis CI website. How could I now add a Timeout to the build docker stage above?
Any suggestions?
You can use the travis_wait Bash function to achieve what you want e.g.
travis_wait 5 docker push $APPLICATION_NAME:$IMAGE_VERSION_DEV;
See https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
There are several options you can explore when using travis_wait:
Export the travis_wait function and use it within your bash scripts:
script:
  - export -f travis_wait
  - cat ./scripts/yours-using-travis_wait.sh | sudo bash -s $SOME_VAR
Use travis_wait directly in the travis-ci script step:
script:
  - travis_wait 90 make install
  # OR
  - travis_wait 90 sleep infinity &
  - cat ./scripts/yours.sh | sudo bash -s $SOME_VAR
  # OR in some cases this "quoting" has worked
  - "travis_wait 90 sleep infinity&"
  - curl --funky-stuff-here
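A different technique from travis_wait, if the goal is a hard upper bound on one command, is the timeout utility from GNU coreutils. It kills the command after the given duration and exits with status 124. A minimal sketch, assuming timeout is installed on the build image; run_with_limit is a hypothetical wrapper name:

```shell
#!/bin/bash
# run_with_limit SECONDS CMD [ARGS...]
# Runs CMD under GNU coreutils timeout and returns its exit status.
# (run_with_limit is a hypothetical helper name.)
run_with_limit() {
  local limit=$1
  shift
  local status=0
  timeout "$limit" "$@" || status=$?
  # GNU timeout exits with 124 when it had to terminate the command.
  if [ "$status" -eq 124 ]; then
    echo "Timed out after ${limit}s: $*" >&2
  fi
  return "$status"
}

# Example: give the push at most 5 minutes.
# run_with_limit 300 docker push "$APPLICATION_NAME:$IMAGE_VERSION_DEV"
```

Since the non-zero status is passed through, the job fails as soon as the limit is hit instead of waiting for the build-wide timeout.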

How can i execute an shell script in my own jenkins pipeline plugin?

My problem is that I want to execute a script inside my Jenkins pipeline plugin, and the 'perf script' command does not work.
My script is:
#! /bin/bash
if test $# -lt 2
then
  sudo perf record -F 99 -a -g -- sleep 20
  sudo perf script > info.perf
  echo "voila"
fi
exit 0
My Jenkins can execute sudo, so this is not the problem, and in my own Linux shell this script works perfectly.
How can I solve this?
I solved this by adding the -i option to the perf script command:
sudo perf record -F 99 -a -g -- sleep 20
sudo perf script -i perf.data > info.perf
echo "voila"
It seems that Jenkins is not able to read perf.data without the -i option.
If the redirection does not work within the script, try and see if it works within the DSL Jenkinsfile.
The sh step supports returnStdout (JENKINS-26133), so you can call the script like this:
res = sh(returnStdout: true, script: '/path/to/your/bash/script').trim()
You could process the result directly in res, bypassing the need for a file.
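One plausible reason why -i helps, stated here as an assumption rather than something the answers confirm: a tool may treat a piped standard input as its data source when no input file is named, and a CI step usually runs without a terminal attached. The sketch below only shows how a script can tell these cases apart. describe_stdin is a hypothetical helper, and the /dev/stdin check assumes a Linux-style /dev.

```shell
#!/bin/bash
# describe_stdin prints what kind of file standard input currently is.
# (describe_stdin is a hypothetical helper name.)
describe_stdin() {
  if [ -t 0 ]; then
    # File descriptor 0 is attached to a terminal (interactive shell).
    echo "terminal"
  elif [ -p /dev/stdin ]; then
    # Standard input is a pipe, as is common for commands run by a CI agent.
    echo "pipe"
  else
    echo "other"
  fi
}

describe_stdin
```

Running this once in your own shell and once from the pipeline shows whether the two environments differ in this respect. If they do, naming the input file explicitly, as with -i perf.data above, removes the ambiguity.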
