Missing jobs in Jenkins when a Kubernetes deployment has multiple replicas - jenkins

I have a deployment of Jenkins in Kubernetes with 2 replicas, exposed as a service behind the nginx-ingress. After creating a project, the next refresh would yield no result for it, as if it had never been created; the third refresh would show the created project again.
I'm new to Jenkins and Kubernetes, so I'm not really sure what is happening.
Maybe the service is routing to a different pod each time, so only one of them has the project created and the other has none. If this is the case, how could I fix it?
PS: I reduced the replicas to 1 and it works as intended, but I am trying to make this project fault tolerant.

To my knowledge, Jenkins doesn't support HA by design; you can't scale it out just by adding more replicas. There is a similar question to yours on Stack Overflow.
Nginx is load balancing between the two Jenkins replicas you created.
These two instances are not aware of each other and have separate data, so you alternate between two totally separate Jenkins instances.
One way you can try solving this is by setting session affinity on the Ingress object:
nginx.ingress.kubernetes.io/affinity: cookie
so that your browser session sticks to one pod.
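For illustration, a minimal sketch of such an Ingress (hostname, service name, port and cookie name are placeholders, not taken from your setup):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jenkins
  annotations:
    # enable cookie-based session affinity in the nginx ingress controller
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "jenkins-route"
spec:
  rules:
  - host: jenkins.example.com          # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: jenkins              # your Jenkins Service
            port:
              number: 8080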
Also remember to share the $JENKINS_HOME directory between these pods, e.g. using an NFS volume (see the sketch below).
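A minimal sketch of what that could look like, assuming an NFS export already exists (server address, export path and names are placeholders):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
spec:
  replicas: 2
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
      - name: jenkins
        image: jenkins/jenkins:lts
        volumeMounts:
        - name: jenkins-home
          mountPath: /var/jenkins_home   # default $JENKINS_HOME in the official image
      volumes:
      - name: jenkins-home
        nfs:
          server: nfs.example.com        # placeholder NFS server
          path: /exports/jenkins         # placeholder export path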
And let me know if you find this helpful.

Related

On Demand Container Serving With GCP

I have a Docker image that will execute different logic depending on the environment variables that it is run with. You can imagine that running it with VAR=A will produce slightly different logic compared to running it with VAR=B.
We have an application that is meant to allow users to kick off tasks that are defined within this Docker image. Depending on the user attributes, different environment variables will need to be passed into the Docker container when it is run. The task is meant to run each time an event is generated through user action and then the container should shut down/be removed.
I'm trying to determine if GCP has any container services that best match what I'm looking for. My understanding of some of the services is:
Cloud Functions - can work well for consuming events and taking specific actions each time an event is triggered, but it is not suited for containerized workloads.
Cloud Run - a serverless way of deploying containers. As I understand it, a deployment on cloud run spins up a "service", and the environment variables must be passed in as part of the service definition. Because we may have a large number of combinations of environment variables (many of which may need to be running at once), it seems that this would end up creating a large number of services, which feels potentially clunky. This approach seems better for deploying a single service with static environment variables that needs to be auto-scaled by GCP.
GKE - another container orchestration platform. This is what I'm considering at the moment. The idea is that we would define a single job definition that can vary according to the environment variables passed into it. The problem is that these jobs would need to be kicked off dynamically via code. This is straightforward with kubectl, but the Kubernetes REST API seems fairly underdeveloped (or at least not that well documented). And the lack of information online on how to start jobs on-demand through the Kubernetes API makes me question whether this is the best approach.
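For illustration, here is a minimal sketch of the kind of Job manifest that could be templated per event (the name prefix, image and TTL are assumptions, not from a real project). Creating it amounts to a single POST to /apis/batch/v1/namespaces/<namespace>/jobs, which every Kubernetes client library wraps:
apiVersion: batch/v1
kind: Job
metadata:
  generateName: task-runner-           # unique Job name generated per event
spec:
  ttlSecondsAfterFinished: 300         # auto-clean the Job after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: task
        image: gcr.io/my-project/task-image:latest   # placeholder image
        env:
        - name: VAR
          value: "A"                   # injected per user/event before submission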
Are there any tools that I'm missing that would be useful in spinning up containers on-demand with dynamic sets of environment variables and removing them when done?

Jenkins connect to AWS instances dynamically

I have one question regarding Jenkins and connecting it to AWS instances.
I have a Jenkins server/master node and I want it to connect to my AWS instances, which are created and destroyed by the Auto Scaling Group they belong to.
I checked the EC2 and EC2 Fleet plugins, and even though they do what I need (connect to each currently existing instance and create an agent for it), they both interfere with the creation/termination of the instances as well as with my Auto Scaling Group settings.
I simply want Jenkins to connect to my instances, do some CI/CD stuff there and that's about it.
I don't want Jenkins to either create or terminate any of my instances.
I can do that with static instances by using their IPs, but I can't seem to find a way to do it with dynamically created instances.
Any ideas on how to work around that?
Thank you!

Presto with Kubernetes

We are trying to run Presto on Kubernetes. We have a Kubernetes cluster running in the cloud as a service. I tried to Google this but could not find a conclusive answer on the best practices for deploying Presto with Kubernetes. The official Presto GitHub exists, but it does not help here. Below are the two questions I am trying to find an answer to:
What is the best approach to configuring Presto on Kubernetes, e.g. how to determine settings such as the ideal number of worker replicas?
How can we go ahead and performance test this deployment?
You could install with the official helm chart from https://github.com/helm/charts/tree/master/stable/presto. It provides an option to set the number of workers. With the official chart you should be able to ask questions in the Kubernetes charts Slack channel (through http://slack.k8s.io) and raise issues on GitHub if you hit any. Or there are non-helm examples such as https://github.com/dharmeshkakadia/presto-kubernetes
The question of how many workers isn't specific to Kubernetes. It's a question of how much and what kind of load the deployment will need to handle, and it will also depend on what hardware your Kubernetes cluster is using. If you're not sure, then perhaps you can deploy with the defaults and adjust as needed; this is suggested by https://prestodb.io/presto-admin/docs/current/installation/presto-configuration.html You'll find some of the settings, such as memory per node, in the Deployment parts of the Kubernetes YAML descriptors, or in the values.yaml in the case of the helm chart. A sketch of such a worker Deployment is below.
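As a rough illustration only (image tag, names, replica count and memory value are assumptions, not recommendations), the worker count is simply the replica count on the worker Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: presto-worker
spec:
  replicas: 3                          # scale the number of workers here
  selector:
    matchLabels:
      app: presto-worker
  template:
    metadata:
      labels:
        app: presto-worker
    spec:
      containers:
      - name: presto
        image: prestodb/presto:latest      # placeholder image tag
        resources:
          requests:
            memory: "8Gi"              # size this to match Presto's JVM/query memory settings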
To performance test your deployment you will need test data, and you can then run queries against the cluster: the same process you would follow outside of Kubernetes. There are tools to help, such as https://www.lewuathe.com/use-benchto-for-evaluation-of-presto.html or https://github.com/prestodb/tempto You may also want to look at https://kognitio.com/blog/presto-performance-powerful-or-problematic/
There are a couple of examples of how it could be achieved, for example dharmeshkakadia/presto-kubernetes, but I guess you might rather want to use a StatefulSet here. I am not sure concerning perf tests, because much of it will depend on the kind of persistent volume you choose, or rather what it is backed by; for example NFS, Ceph, or maybe you are in a cloud environment with native storage?

how to make two docker containers share sqllite db on kubernetes?

I am trying to build an application which in essence consists of two parts:
a Django-based API
an SQLite database.
The API interacts with the SQLite database but has read-only access. However, the SQLite database needs to be updated every N minutes. So my plan was to make two Docker containers: the first one for the API, and the second one for the script which is executed every N minutes using cron (Ubuntu) to update the database.
I am using Kubernetes to serve my applications. So I was wondering if there is a way to achieve what I want here?
I've researched about Persistent Volumes in Kubernetes, but still do not see how I can make it work.
EDIT:
So I have figured out that I can run two containers in one pod on Kubernetes and this way make use of an emptyDir volume. My question then is: how do I define the path to this directory in my Python files?
Thanks,
Lukas
Take into account that an emptyDir is erased every time the pod is stopped/killed (you do a deployment, a node crashes, etc.). See the docs for more details: https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
Taking that into account, if that still solves your problem, then you just need to set the mountPath to the directory you want, as the example in the link above shows.
Also take into account that the whole directory will start out empty, so if you have other things at that path they won't be visible once you mount an emptyDir over it (just typical Unix mount semantics, nothing Kubernetes-specific here).
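For illustration, a minimal sketch of such a pod, with both containers mounting the same emptyDir (image names and the /data path are placeholders). Your Python code then simply opens files under the mountPath, e.g. sqlite3.connect("/data/db.sqlite3"):
apiVersion: v1
kind: Pod
metadata:
  name: api-with-updater
spec:
  volumes:
  - name: shared-db
    emptyDir: {}
  containers:
  - name: django-api
    image: my-registry/django-api:latest     # placeholder image
    volumeMounts:
    - name: shared-db
      mountPath: /data                 # the API reads /data/db.sqlite3
  - name: db-updater
    image: my-registry/db-updater:latest     # placeholder image
    volumeMounts:
    - name: shared-db
      mountPath: /data                 # the cron script rewrites /data/db.sqlite3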

What to use to orchestrate a few long running web services on few machines?

Investigating the possibilities, I am quite confused about what the best tool for us is.
We want to deploy a few web services, for a start a GitLab and a wiki.
The plan is to use Docker images for these services and to store the data externally.
These services need to be accessible from outside.
I looked into Marathon and Kubernetes and both seemed like overkill.
A problem we face as academics is that most people only stay for about three years and it's not our main job to administer stuff. So we would like an easy-to-use, easy-to-maintain solution.
We have 3-4 nodes we want to use for this, and we'd like it to be fault tolerant (restarting the service on another node if one dies, for example).
So to sum up:
3-4 nodes
gitlab with CI and runners
a wiki
possibly one or two services more
auto deployment, load balancing
as failsafe as possible
What would you recommend?
I would recommend a managed container service like https://aws.amazon.com/ecs/
Running your own container manager (Swarm/Kubernetes) comes with a whole host of issues that it sounds like you should avoid.
