Jenkins 2 clusters, agent connection issue on 2nd cluster

I'm sitting with a new issue that you might also face soon. I need a little help if possible. I've spent almost two working weeks on this.
I have 2 possible solutions for my problem.
CONTEXT
I have 2 kubernetes clusters called FS and TC.
The Jenkins I am using runs on TC.
The slaves do deploy in FS from the TC Jenkins; however, the slaves in FS will not connect to the Jenkins master in TC.
The slaves make use of a TCP connection that requires a HOST and PORT. However, the exposed JNLP service on TC is HTTP (http://jenkins-jnlp.tc.com/), which uses nginx to auto-generate the URL.
Even if I use
HOST: jenkins-jnlp.tc.com
PORT: 80
It will still complain that it's getting serial data instead of binary data.
For TC I made use of the local jnlp service HOST (jenkins-jnlp.svc.cluster.local) with PORT (50000). This works well for our current TC environment.
SOLUTIONS
Solution #1
A possible solution would involve having an HTTP-to-TCP relay container running between the slave and master on FS. It would then be linked up to the HTTP URL in TC (http://jenkins-jnlp.tc.com/), encapsulating the HTTP connection to TCP (localhost:50000) and vice versa.
The slaves on FS can then connect to the TC master using the TCP port exposed from that container in the middle.
Diagram to understand better
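To make the idea concrete, here is a minimal sketch of the relay I have in mind. It is a plain threaded byte-pump in Python: the agent connects to it on localhost:50000 as if it were the master's JNLP port, and it forwards the bytes upstream. The listen port and the ingress hostname are just the values from above, and the upstream leg is plain TCP here purely for illustration; the real sidecar would wrap that leg in HTTP.

```python
# relay.py - minimal sketch of the HTTP-to-TCP relay idea (illustrative only).
# The FS-side agent connects to this relay on localhost:50000 as if it were the
# Jenkins master's JNLP port; the relay pumps bytes to an upstream endpoint.
# In the real sidecar, the upstream leg would be wrapped in whatever the TC
# ingress will pass (plain TCP is used here just to show the structure).
import socket
import threading

LISTEN_ADDR = ("0.0.0.0", 50000)             # where the agent thinks the master is
UPSTREAM_ADDR = ("jenkins-jnlp.tc.com", 80)  # assumption: the ingress forwards to the master

def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until either side closes."""
    try:
        while True:
            data = src.recv(65536)
            if not data:
                break
            dst.sendall(data)
    finally:
        src.close()
        dst.close()

def handle(agent_conn: socket.socket) -> None:
    upstream = socket.create_connection(UPSTREAM_ADDR)
    # Downstream (agent -> master) and upstream (master -> agent) legs run in parallel.
    threading.Thread(target=pump, args=(agent_conn, upstream), daemon=True).start()
    threading.Thread(target=pump, args=(upstream, agent_conn), daemon=True).start()

def main() -> None:
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(LISTEN_ADDR)
    server.listen(16)
    while True:
        conn, _addr = server.accept()
        threading.Thread(target=handle, args=(conn,), daemon=True).start()

if __name__ == "__main__":
    main()
```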
Solution #2
People kept complaining, and eventually someone added new functionality to Jenkins around 20 Feb 2020. They introduced WebSockets that can run over HTTP and convert the connection to TCP on the slave.
I did set it up, but it seems too new and is not working for me. Even though the slave on FS says it's connected, it's still not properly communicating with the Jenkins master on TC, which still sees the agent/slave pod as offline. (A sketch of the agent-side setup follows the links below.)
Here are the links I used
Original post
Update note on Jenkins
Details on Jenkins WebSocket
Jenkins inbound-agent github
DockerHub jenkins-inbound-agent
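For reference, the agent-side setup looks roughly like this: a sketch using the Kubernetes Python client to launch an inbound agent that dials back over WebSocket instead of the TCP port. The secret, agent name and namespace are placeholders, and it assumes the jenkins/inbound-agent image forwards these arguments to agent.jar, as its README describes.

```python
# ws_agent_pod.py - sketch: launch a Jenkins inbound agent in WebSocket mode.
# Secret, agent name and namespace are placeholders; adjust to your setup.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when run inside the cluster

agent = client.V1Container(
    name="jnlp",
    image="jenkins/inbound-agent:latest",
    args=[
        "-url", "http://jenkins-jnlp.tc.com/",  # the plain HTTP ingress, no TCP port needed
        "-webSocket",                           # connect back over WebSocket instead of port 50000
        "AGENT_SECRET_PLACEHOLDER",             # placeholder agent secret
        "fs-agent-1",                           # placeholder agent name
    ],
)

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="fs-agent-1", labels={"jenkins": "agent"}),
    spec=client.V1PodSpec(restart_policy="Never", containers=[agent]),
)

client.CoreV1Api().create_namespaced_pod(namespace="jenkins", body=pod)
```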
CONCLUSION
After a lot of fiddling, research and banging my head on the wall, I think the only solution is solution #1. The problem with solution #1 is that a simple tool or service to encapsulate HTTP to TCP and back does not exist (that I know of; I searched for days). This means I'll have to make one myself.
Solution #2 is still too new, with little to no docs to help me out or make setting it up easy, and it seems to come with some bugs. It seems the only way to fix these bugs would be to modify both Jenkins and the JNLP agent's code, and I have no idea where to even start.
UPDATE #1
I'm halfway done with the code for the intermediate container. I can now get a downstream from HTTP to TCP; I just have to set up an upstream from TCP to HTTP.
Also, considering the amount of multi-threading required to run a single central Docker container to convert the protocols, I figured on adding the HTTP-to-TCP container as a sidecar to the Jenkins agent when I'm done.
This way, every time a slave spins up in a different cluster, it will automatically be able to connect and I won't have to worry about multiple connections. That is the theory, but obviously I want results, and so do you guys.

Related

Port allocation when running build job in Jenkins

My project is structured in such a way that the build job in Jenkins is triggered by a push to Git. As part of my application logic, I spin up Kafka and Elasticsearch instances to be used in my test cases downstream.
The issue I have right now is that when a developer pushes his changes to Git, it triggers a build in Jenkins, which in turn runs our code and spawns a Kafka broker on localhost:9092 and Elasticsearch on localhost:9200.
When another developer, working on some other change at the same time, pushes his code, it triggers the build job again and tries to spin up another instance of Kafka/Elasticsearch, but it fails with the exception “Port already in use”.
I am looking at options on how to handle this scenario.
Will running these instances inside a Docker container help to some extent? How do I handle the port issue in that case?
Yes, dockerizing these instances can indeed help, as you can spawn them multiple times.
You could create a Docker container per component, including your application, and then let them talk to each other by linking them or using docker-compose.
That way you would not have to expose the ports to the "outside" world but keep them internal within the Docker environment.
That way you would not get the “Port already in use” error. The only problem in that case is memory: e.g. if 100 pushes are done to the Git repo, you might run out of memory...
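If you do need the ports published on the host (for example so that code running outside Docker can still reach them), another option is to let Docker pick a free host port instead of hard-coding 9092/9200, so concurrent builds never collide. A rough sketch with the Docker SDK for Python; the image name and environment are illustrative only.

```python
# start_kafka.py - sketch: let Docker choose a free host port instead of
# hard-coding localhost:9092, so concurrent Jenkins builds don't collide.
# Assumes the `docker` Python SDK and a reachable Docker daemon; the image
# name is illustrative only.
import docker

client = docker.from_env()

container = client.containers.run(
    "confluentinc/cp-kafka:latest",   # illustrative image
    detach=True,
    ports={"9092/tcp": None},         # None => Docker assigns a random free host port
)

container.reload()                    # refresh attrs so the port mapping is populated
host_port = container.attrs["NetworkSettings"]["Ports"]["9092/tcp"][0]["HostPort"]
print(f"Kafka broker published on localhost:{host_port}")

# The build can now hand this port to the tests (environment variable, system
# property, ...) instead of assuming 9092.
```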

Jenkins Double SSH Remote Connection

I have a somewhat unusual environment. In order to connect to machine B via SSH, I need to connect to machine A and, from that box, connect to B and execute a number of commands there.
Local --ssh--> Machine A --ssh--> Machine B (some commands to execute here)
Generally speaking, Machine A is my entry point to all servers.
I am trying to automate the deployment process with Jenkins and am wondering if it supports such an unusual scenario.
So far, I have installed the SSH plugin and am able to connect to Machine A, yet I am struggling with the connection to Machine B. The Jenkins process freezes on the ssh command to Machine B and nothing happens.
Does anyone have any ideas how I can make such a scenario work?
The term for Machine A is a "bastion host", which might help your googling.
This link calls it a "jump host" and describes a number of ways to use SSH's ProxyCommand setting to set up all manner of inter-host SSH communication:
https://www.cyberciti.biz/faq/linux-unix-ssh-proxycommand-passing-through-one-host-gateway-server/
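If you prefer to script the hop rather than lean on plugin settings, the same jump-host idea can be expressed directly in code. A rough sketch with Paramiko; hostnames, usernames and key paths are placeholders.

```python
# jump_ssh.py - sketch of SSH'ing to Machine B through Machine A (a jump/bastion host).
# Hostnames, usernames and key paths below are placeholders.
import paramiko

BASTION = ("machine-a.example.com", 22)
TARGET = ("machine-b.example.com", 22)

# 1. Connect to Machine A (the bastion).
bastion = paramiko.SSHClient()
bastion.set_missing_host_key_policy(paramiko.AutoAddPolicy())
bastion.connect(BASTION[0], port=BASTION[1], username="deploy", key_filename="/path/to/key")

# 2. Open a direct-tcpip channel from A to B, the same trick ProxyCommand uses.
channel = bastion.get_transport().open_channel(
    "direct-tcpip", dest_addr=TARGET, src_addr=("127.0.0.1", 0)
)

# 3. Connect to Machine B over that channel and run the deployment commands.
target = paramiko.SSHClient()
target.set_missing_host_key_policy(paramiko.AutoAddPolicy())
target.connect(TARGET[0], username="deploy", key_filename="/path/to/key", sock=channel)

stdin, stdout, stderr = target.exec_command("hostname && uptime")
print(stdout.read().decode())

target.close()
bastion.close()
```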

Jenkins: 2 master nodes using NFS

I'm thinking about the following high-availability solution for my environment:
Datacenter with one powered-on Jenkins master node.
Datacenter for disasters with one powered-off Jenkins master node.
Datacenter one is always powered on; the second is only for disasters. My idea is to install the two Jenkins instances using the same IP but with a shared NFS. If the first goes down, the second starts with the same IP and I still have my service.
My question is: can this solution work?
Thanks all for the help ;)
I don't see any particular reason why it should not work. But you still have to monitor in case of a switch-over, because I have faced the situation where jobs that were running when Jenkins abruptly shut down were still in the queue when the service was recovered, but they never completed afterwards; I had to manually delete the builds using the script console.
On the Jenkins forums a lot of people have reported such bugs. Most of them seem to have been fixed, but there are still cases where this might happen, because every time Jenkins is started or restarted the configuration is reloaded from disk. So there is inconsistency at times between the in-memory config that was there earlier and the reloaded config.
So in your case, it might happen that your executor thread is still blocked when the service is recovered. Thus you have to make sure that everything is running fine after recovery.
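As a small aid for that post-recovery check, you could poll the recovered master and flag anything still sitting in the queue or still marked as building. A rough sketch against the Jenkins JSON API; the URL and credentials are placeholders.

```python
# post_failover_check.py - sketch: after a failover, list queued items and jobs
# that Jenkins still believes are building, so stuck builds can be cleaned up.
# The Jenkins URL and credentials are placeholders.
import requests

JENKINS = "https://jenkins.example.com"
AUTH = ("admin", "api-token")

queue = requests.get(f"{JENKINS}/queue/api/json", auth=AUTH, timeout=10).json()
for item in queue.get("items", []):
    print("queued:", item.get("task", {}).get("name"), "-", item.get("why"))

jobs = requests.get(
    f"{JENKINS}/api/json",
    params={"tree": "jobs[name,lastBuild[number,building]]"},
    auth=AUTH,
    timeout=10,
).json()
for job in jobs.get("jobs", []):
    last = job.get("lastBuild") or {}
    if last.get("building"):
        print("still marked as building:", job["name"], "#", last.get("number"))
```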

Run multiple instances on a Mesos slave node

I'm building an Apache Mesos cluster with 3 masters and 3 slaves. I installed Docker on the slave nodes and it's able to create instances which are visible in Marathon. Then I tried to install the HAProxy server on top of it, but that didn't work out that well, so I deleted it.
The problem is that since then I'm only able to scale my application to a maximum of 3 instances, the exact number of nodes. When I want to scale to 5, there are 2 instances stuck at the 'deploying' stage.
Does anyone know how to fix this issue so I'm able to create more instances again?
Thank you
To do that, you really need to set up Marathon service discovery with HAProxy, as arbitrary ports on the same slave machine will be bound to your containers.
First, install HAProxy on every slave. If you need SSL, you will need to build HAProxy with SSL support.
Then, once the HAProxy service is running, follow this very well explained tutorial to enable Marathon service discovery on every slave:
HAProxy marathon Service discovery
Pay close attention to the tutorial; it is very well explained and straightforward.
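To illustrate the "arbitrary ports bound to your containers" part: the app definition should let Marathon assign the host ports (hostPort 0) so that several tasks can land on the same slave, and HAProxy/service discovery then routes traffic to whatever ports were assigned. A rough sketch posting such an app to the Marathon REST API; the Marathon URL and image are placeholders.

```python
# scale_app.py - sketch: Marathon app whose host ports are assigned by Marathon
# (hostPort 0), so multiple tasks can run on the same slave; HAProxy service
# discovery routes to the assigned ports. URL and image are placeholders.
import requests

MARATHON = "http://marathon.example.com:8080"

app = {
    "id": "/my-web-app",
    "instances": 5,
    "cpus": 0.25,
    "mem": 256,
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "example/my-web-app:latest",
            "network": "BRIDGE",
            "portMappings": [
                {"containerPort": 8080, "hostPort": 0, "protocol": "tcp"}
            ],
        },
    },
}

resp = requests.post(f"{MARATHON}/v2/apps", json=app, timeout=10)
resp.raise_for_status()
print("deployment started:", resp.json().get("deployments"))
```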

Networking among kubernetes minions

I installed an 8-node Kubernetes cluster (1 master + 7 minions), but I am facing a networking problem among the minions.
I installed my cluster according to this step-by-step Fedora manual, so I use Fedora 20 with its testing repository to get kubernetes binaries.
After installing, I wanted to try the guestbook example, but it seems to me there is a problem with the inter-container networking.
Although the containers/pods are in the running state and I can reach my 3 frontend containers (via browser) and the Redis containers as well (via netcat), the frontends which are not on the same host as the Redis cannot reach the Redis master. The frontend's PHP gives back a network exception.
Can anybody help me why the containers cannot reach each other among the hosts?
I hope I have described my setup accurately enough; thanks in advance.
The Fedora guide you followed will only get you running on a single machine. It avoids the issues around setting up networking across nodes.
For kubernetes to work, the following network set up must be satisfied:
Every container should be able to talk to every other container, even across nodes. This means also that the bridge IP range for those containers must not overlap.
Code running on any node that isn't in a container should be able to reach every container (and vice versa), even across nodes.
It is not necessary (but useful) for computers on the network that aren't part of the cluster to be able to reach the containers directly.
There are a lot of ways to achieve this -- for instance the set up for vagrant sets up GRE tunnels between each node. On GCE we use features of the platform to do the routing. If you are on physical machines on a switch you can probably just do a big layer 2 network w/ bridges. A bulletproof way to get started (but perhaps not the most performant, depending on your set up) is to use something like flannel.
We are working on making this stuff easier to start up (without using a mess of shell scripts) and are thinking of building something like flannel in so that there is a reasonable default.
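If you want to see which of the requirements above is failing in your guestbook setup, a quick check is to try the Redis port from a frontend running on a different node. A minimal sketch; the container IPs below are placeholders for your actual container/pod IPs.

```python
# check_cross_node.py - sketch: quick cross-node reachability check for the
# guestbook setup. Run it from inside (or alongside) each frontend container to
# see whether the redis master's container IP is reachable. IPs are placeholders.
import socket

TARGETS = [
    ("10.244.1.5", 6379),   # redis-master container IP (placeholder)
    ("10.244.2.7", 6379),   # redis-slave on another node (placeholder)
]

for host, port in TARGETS:
    try:
        with socket.create_connection((host, port), timeout=3):
            print(f"OK   {host}:{port}")
    except OSError as exc:
        print(f"FAIL {host}:{port} -> {exc}")
```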

Resources