Will new GCP Kubeflow Pipelines deployment support Containerd? - kubeflow

The default node image for version 1.18.17-gke.100 will be changed to Container-Optimized OS with containerd. As far as I know, this doesn't work with Kubeflow Pipelines 1.4.1, specifically with the underlying Argo implementation. This is the typical error you will run into:
Failed to wait for container id '60b93fa7392926e132e8a3c3c336d55d1ba85734a85cd519a786c2be86b4334a':
Error response from daemon: No such container: 60b93fa7392926e132e8a3c3c336d55d1ba85734a85cd519a786c2be86b4334a
Will the pipeline deployment get an update to support containerd?

Support for Docker-less execution will come after the upgrade to Argo 3, which is currently pending.
We've tried the PNS executor of Argo 2, but found it to be flaky.

Related

Gitlab runner with docker-windows executor is pulling linux docker images and failing

I've been running a GitLab CI runner on a Linux server for a while, configured with the docker executor. This worked fine until I was asked to add a docker-windows executor to the pool in order to compile Windows projects under Docker.
It seems that the docker-windows executor is now also trying to execute Linux-based jobs and failing with the following error:
Running with gitlab-runner 14.0.1 (c1edb478)
on XXXXXXXXXXXXX_
Preparing the "docker-windows" executor 00:14
Using Docker executor with image SOME_LINUX_BASED_IMAGE_HERE ...
Authenticating with credentials from job payload (GitLab Registry)
Pulling docker image SOME_LINUX_BASED_IMAGE_HERE ...
WARNING: Failed to pull image with policy "always": image operating system "linux" cannot be used on this platform (manager.go:205:2s)
ERROR: Preparation failed: failed to pull image "SOME_LINUX_BASED_IMAGE_HERE" with specified policies [always]: image operating system "linux" cannot be used on this platform (manager.go:205:2s)
I have searched the GitLab documentation for a way to avoid this, without success. How can I configure my runners to avoid this?
Thanks in advance
You can assign tags to runners and then a specific runner will only accept jobs that have a specific tag.
cf. https://docs.gitlab.com/runner/#tags and https://docs.gitlab.com/ee/ci/yaml/#tags
I introduced OS-specific tags for all my runners and configured my default runners (the majority are Linux) to also accept jobs without tags, so that I don't have to tag every job that should run on the default runners.
My final configuration to avoid the problem is the following:
Tag every runner with the corresponding OS tag when registering it.
Because my organization runs on Linux by default, the Linux runners are configured to also pick up jobs without tags.
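The registration described above could look roughly like the following sketch. It assumes non-interactive registration with a registration token (as used by gitlab-runner 14.x); the helper name `register_runner` and every URL, token and image value are placeholders, not taken from the original setup.

```shell
# Hypothetical helper: register a runner with one OS tag and decide
# whether it may pick up untagged jobs. All arguments are placeholders.
register_runner() {
  url="$1"            # GitLab instance URL
  token="$2"          # runner registration token
  executor="$3"       # "docker" or "docker-windows"
  image="$4"          # default image for the executor
  os_tag="$5"         # OS tag, e.g. "linux" or "windows"
  run_untagged="$6"   # "true" for default runners, "false" otherwise

  gitlab-runner register \
    --non-interactive \
    --url "${url}" \
    --registration-token "${token}" \
    --executor "${executor}" \
    --docker-image "${image}" \
    --tag-list "${os_tag}" \
    --run-untagged="${run_untagged}"
}

# Example calls (values are illustrative only):
#   register_runner "https://gitlab.example.com/" "$TOKEN" docker "alpine:latest" linux true
#   register_runner "https://gitlab.example.com/" "$TOKEN" docker-windows "$WINDOWS_IMAGE" windows false
```

With this in place, only jobs that list the `windows` tag under the `tags` keyword in .gitlab-ci.yml are offered to the Windows runner, while untagged jobs keep going to the Linux runners.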

Getting accurate status of openshift pod deployment

I have a deployment stage in Jenkins that executes "oc patch" and "oc rollout" commands. These commands replace the Docker image name in the DeploymentConfig and roll out the changes in OpenShift.
As you can imagine, this is an asynchronous call: in Jenkins I am not able to verify whether the newly deployed pod is running or failing. Jenkins just executes the oc commands and proceeds to the next stage. My requirement is to get the actual status of the pod back in Jenkins so that I can mark the pipeline as success/failed.
I could not find any oc or kubectl command that gives me the exact status of the deployment through a synchronous call. As a workaround, I wrote a shell script that checks the status of the pod (using grep) for a certain amount of time after "oc rollout" and sends the exit status back to the Jenkins shell. I feel this is not the correct way to validate the deployment, as it increases my pipeline execution time.
Is there a standard utility in OpenShift/Kubernetes that can give me the exact status of the pod after deployment, which I can rely on to mark my deployment pipeline as success/fail in Jenkins?
Please note, I am opening a shell session in the pipeline and executing oc CLI commands on an agent that has the oc CLI installed. This agent is not part of the OpenShift cluster, and Jenkins sits outside the OpenShift cluster.
You have to use oc rollout status
Synopsis
oc rollout status
Description
Watch the status of the latest rollout, until it's done.
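As a minimal sketch of how this could be used from a Jenkins shell step: the function below blocks on `oc rollout status` and passes its exit code back to the caller, so the stage fails when the rollout fails. The helper name `wait_for_rollout` and the DeploymentConfig and namespace values are assumptions for illustration.

```shell
# Hypothetical helper: wait for the latest rollout of a DeploymentConfig
# and report success or failure through the exit status.
wait_for_rollout() {
  dc_name="$1"      # assumed DeploymentConfig name, e.g. "my-app"
  namespace="$2"    # assumed project/namespace

  # `oc rollout status` watches the rollout until it is done and exits
  # non-zero if the rollout did not complete successfully.
  if oc rollout status "dc/${dc_name}" -n "${namespace}"; then
    echo "rollout of ${dc_name} succeeded"
    return 0
  fi
  echo "rollout of ${dc_name} failed" >&2
  return 1
}

# Example call inside the pipeline's shell step (names are illustrative):
#   wait_for_rollout my-app my-project || exit 1
```

Because the command itself waits for the rollout, the separate polling loop with grep should no longer be needed.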

JX Promote returns 404 and exits from the job

Summary
jx is installed on GKE (Google Kubernetes Engine) and configured with a Bitbucket Cloud repository.
Trying to promote the build with jx promote returns:
Promoting app sample-spring version 0.0.11 to namespace jx-staging
error: finding existing PRs using filter on repo baskar030/environment-XXX-staging: listing open pull requests on baskar030/environment-XXX-staging: Status: 404 Not Found, Body: {"type": "error", "error": {"message": "Resource not found", "detail": "There is no API hosted at this URL.\n\nFor information about our API's, please refer to the documentation at: https://developer.atlassian.com/bitbucket/api/2/reference/"}}
script returned exit code 1
Jx version
The output of jx version is:
NAME                 VERSION
jx                   2.0.398
jenkins x platform   2.0.744
Kubernetes cluster   v1.12.8-gke.10
kubectl              v1.12.8-dispatcher
helm client          Client: v2.13.1+g618447c
git                  git version 2.17.2 (Apple Git-113)
Operating System     Mac OS X 10.13.6 build 17G65
Jenkins type
- Classic Jenkins
Kubernetes cluster
GKE Cluster version 1.12.8-gke.10
Operating system / Environment
Ubuntu
This error is related to this issue: Bitbucket Cloud API deprecations
Until the Bitbucket Cloud API client used by Jenkins X (see jx\pkg\gits\bitbucket_cloud.go) is fixed, you'll keep running into this error.

docker pipeline in jenkins issue

I am trying to run Docker steps using the Docker Pipeline plugin in Jenkins, but I am facing the following issue:
Cannot run program "nohup" (in directory "/var/lib/jenkins/workspace/pipeline/java"): error=7, Argument list too long
The issue occurs while running:
app = docker.build('java:1.7')
I went through the plugin code, and it seems to be a shell issue where the argument length exceeds the default limit.
It was working fine earlier, but all of a sudden I started hitting this issue.
I appreciate any kind of help here.
Environment details are as follows:
Jenkins Version: 2.89.2 running on AWS RHEL EC2 instance
Docker pipeline plugin version: 1.14
Thanks,
Sanjiv

CI/CD with Docker - what is the final deployment step?

I am developing a small website (Ruby/Sinatra) to be used internally where I work. (Simply, it crunches some source data and generates reports.)
I want to deploy it using Docker and have a setup that works in my dev environment, but I'm trying to understand the workflow for "production" deployment (we're using Jenkins).
I've read lots of articles about deployment workflows using Docker, but they all seem to stop at "and then push your image to the Docker registry". What seems to be missing is how to then take that image and actually update the application.
I appreciate that every application is likely to be different, but what is the next step? I'm aware of lots of different frameworks like Chef, Puppet, Ansible that could be used, but my question really is - how do I integrate that into my CI/CD pipeline? E.g. does a job "push" the changes to the production server, or should a Jenkins slave be running on the production server to execute a job directly on the server?
There are several orchestration tools, such as Docker Swarm, Kubernetes and Rancher. In Docker Swarm, for example, you create services and can update their versions in a blue-green manner, even for just one instance (then there is no blue-green :) ). If you just use docker run, you should check for your running container, stop and remove it if it's running, and then start a new container with the newer image version.
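For the plain docker run case, the check/stop/remove/start sequence might be sketched as below. The helper name `replace_container` and the container and image names are hypothetical, and a real deployment would normally also pass the ports, volumes and environment it needs to docker run.

```shell
# Hypothetical helper: replace a running container with one started
# from a newer image. Container and image names are placeholders.
replace_container() {
  name="$1"     # container name, e.g. "reports"
  image="$2"    # image reference, e.g. "registry.example.com/reports:2"

  docker pull "${image}" || return 1

  # `docker ps -a -q --filter name=...` prints the ID of any container
  # whose name matches (substring match), running or stopped.
  if [ -n "$(docker ps -a -q --filter "name=${name}")" ]; then
    docker stop "${name}" || return 1
    docker rm "${name}" || return 1
  fi

  # Start a fresh container from the newly pulled image.
  docker run -d --name "${name}" "${image}"
}
```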
It depends on how your application is configured to run. In my case, I have a call to "docker run" in a systemd script. It's configured to just restart if it ever stops.
So, in my Jenkinsfile, after I push the image to the registry, I do a "docker pull" (my Jenkins agent is running on the same box that the application is running on), and then a "docker stop". That causes the application to exit; systemd then restarts it, so it picks up the new version that was just pulled.
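A rough sketch of that pull-then-stop step, as it might be called from a Jenkinsfile shell step. It assumes, as described above, that a systemd unit runs "docker run" and restarts the container whenever it stops; the helper name `redeploy` and the image and container names are hypothetical.

```shell
# Hypothetical helper: fetch the new image, then stop the running
# container so that systemd (assumed to restart it) starts the new version.
redeploy() {
  image="$1"       # image that was just pushed, e.g. "registry.example.com/reports:latest"
  container="$2"   # name of the container started by the systemd unit

  # Pull first, so the old container keeps running if the pull fails.
  docker pull "${image}" || return 1

  # Stopping the container makes the service exit; the restart is left to systemd.
  docker stop "${container}"
}

# Example call (names are illustrative):
#   redeploy registry.example.com/reports:latest reports
```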
