Unable to install Jenkins on Minikube using Helm due to a permission error - jenkins

I've been trying to install Jenkins using Helm on Minikube, following the official article
https://www.jenkins.io/doc/book/installing/kubernetes/
It turns out that I can't bring up the Jenkins Pod; kubectl logs -f jenkins-0 -c init -n jenkins gives me this error:
disable Setup Wizard
/var/jenkins_config/apply_config.sh: 4: /var/jenkins_config/apply_config.sh: cannot create /var/jenkins_home/jenkins.install.UpgradeWizard.state: Permission denied
My assumption is that this issue relates to permissions in the Dockerfile, or it might relate to the values defined in jenkins-values.yaml. I've changed some parameters to the recommended values:
storageClass: jenkins-pv
serviceAccount:
  create: false
  name: jenkins
  annotations: {}
serviceType: NodePort
Release details:
NAME     NAMESPACE   REVISION   UPDATED                                   STATUS     CHART            APP VERSION
jenkins  jenkins     1          2021-01-04 15:58:00.022465588 +0700 +07   deployed   jenkins-3.0.14   2.263.1
Is there any way to fix this?
Thanks

It seems that, for some reason, the volume is mounted with insufficient access rights. You can try running your container as the root user; it may solve the issue. Put these lines into your values.yaml:
runAsUser: 0
fsGroup: 0
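For context, these two values end up on the pod-level security context of the rendered Jenkins StatefulSet; depending on the chart version they may need to sit under a dedicated securityContext (or controller) block, so check the chart's default values.yaml for the exact location. A minimal sketch of what they amount to on the pod spec:

securityContext:
  runAsUser: 0   # run the Jenkins containers as root
  fsGroup: 0     # group ownership applied to the mounted volumes

Running as root works around the permission error, but keep the security trade-off in mind (see the related answers below).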

Related

Unable to install Jenkins on Minikube using Helm due to a permission error on Mac

I've tried to install Jenkins on Minikube according to this article:
https://www.jenkins.io/doc/book/installing/kubernetes/
When I run kubectl logs pod/jenkins-0 init -n jenkins
I get:
disable Setup Wizard
/var/jenkins_config/apply_config.sh: 4: /var/jenkins_config/apply_config.sh: cannot create /var/jenkins_home/jenkins.install.UpgradeWizard.state: Permission denied
I'm almost sure that I have some problem with the file system on Mac.
I did not create the serviceAccount from the article, because Helm did not see it and returned an error.
Instead, I changed this in jenkins-values.yaml:
serviceAccount:
  create: true
  name: jenkins
  annotations: {}
Then I tried setting the following values to 0. It had no effect:
runAsUser: 1000
fsGroup: 1000
Additional info:
kubectl get all -n jenkins
NAME            READY   STATUS                  RESTARTS   AGE
pod/jenkins-0   0/2     Init:CrashLoopBackOff   7          15m

NAME                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service/jenkins         ClusterIP   10.104.114.29    <none>        8080/TCP    15m
service/jenkins-agent   ClusterIP   10.104.207.201   <none>        50000/TCP   15m

NAME                       READY   AGE
statefulset.apps/jenkins   0/1     15m
I also tried using different directories for the volume, like /Volumes/data, and adding 777 permissions to them.
There are a couple of potential causes here, but there is a solution that doesn't require switching to runAsUser 0 (which breaks security assessments).
The folder /data/jenkins-volume is created as root by default, with a 755 permission set, so you can't create persistent data in this directory with the default Jenkins build.
To fix this, enter Minikube with $ minikube ssh and run: $ chown 1000:1000 /data/jenkins-volume
The other thing that could be biting you (after fixing the folder permissions) is SELinux policies, when you are running Kubernetes on a RHEL-based OS.
To fix this: $ chcon -R -t svirt_sandbox_file_t /data/jenkins-volume
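For reference, /data/jenkins-volume is the path behind the hostPath PersistentVolume that the jenkins.io article has you create for the jenkins-pv storage class. A minimal sketch of such a volume (the name and size are illustrative, not taken from the article):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: jenkins-pv
spec:
  storageClassName: jenkins-pv
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 10Gi
  hostPath:
    path: /data/jenkins-volume/   # must be writable by uid/gid 1000, the jenkins user in the official image

Once that directory is owned by 1000:1000, the init container can create jenkins.install.UpgradeWizard.state without running as root.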
It was resolved: I just set runAsUser to 0 everywhere.
Setting runAsUser to 0 everywhere works, but it is not the ideal solution due to potential security issues. It is fine for a dev environment but not for prod.

Jenkins jobs now fail after upgrading the Jenkins Kubernetes plugin from 1.14.2 --> 1.26.0. The plugin update changes the build dir path - re-clarifying question

The Jenkins version is 2.222.4.
We upgraded the Jenkins Kubernetes plugin from 1.14.2 --> 1.26.0.
Before the plugin upgrade, the Jenkins slave would mount /home/jenkins as read/write so it could use the .gradle files in there for its build.
After the plugin upgrade, /home/jenkins is now read-only, and instead the directory /home/jenkins/agent has become the read/write one.
However, the build job no longer has read/write access to the files in /home/jenkins, which it needs.
I did a df -h on our slave jnlp pod pre-upgrade (k8s plugin v1.14.2) and saw the following:
Filesystem       Size     Used   Available   Use%   Mounted on
overlay          119.9G   5.6G   109.1G      5%     /
/dev/nvme0n1p2   119.9G   5.6G   109.1G      5%     /home/jenkins
and can see it is mounted as read/write:
cat /proc/mounts | grep -i jenkins
/dev/nvme0n1p2 /home/jenkins ext4 rw,relatime,data=ordered 0 0
Post-upgrade, if I run df -h I don't even see /home/jenkins mounted, only:
/dev/nvme0n1p2 120G 5.6G 110G 5% /etc/hosts
and if I cat /proc/mounts I only see this post-upgrade:
jenkins#buildpod:~$ cat /proc/mounts | grep -i jenkins
/dev/nvme0n1p2 /home/jenkins/agent ext4 rw,relatime,data=ordered 0 0
/dev/nvme0n1p2 /home/jenkins/.jenkins ext4 rw,relatime,data=ordered 0 0
I'm also seeing this in the Jenkins job log, but I'm not sure if it is relevant:
[WARNING] HOME is set to / in the jnlp container. You may encounter troubles when using tools or ssh client. This usually happens if the uid doesnt have any entry in /etc/passwd. Please add a user to your Dockerfile or set the HOME environment variable to a valid directory in the pod template definition.
Any ideas or workarounds would be most welcome, as I'm badly stuck on this issue.
Brian
My colleague just figured this out. He found that it goes back to a change the plugin developers made sometime in August 2019, to be compatible with Kubernetes 1.18: that's when they changed the default workspace, in release 1.18.0 of the plugin. It was spotted and supposed to be fixed here github.com/jenkinsci/kubernetes-plugin/pull/713, but it persists in our case. The workaround is to hardcode workingDir: '/home/jenkins' under the container definition in each job's Jenkinsfile.
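A minimal sketch of that workaround in a scripted pipeline, assuming a containerTemplate-based pod definition (the container name, image, and build command are placeholders):

podTemplate(containers: [
    containerTemplate(
        name: 'gradle',               // placeholder build container
        image: 'gradle:6.8-jdk11',    // placeholder image
        command: 'sleep', args: '99d',
        workingDir: '/home/jenkins'   // pin the old default workspace path
    )
]) {
    node(POD_LABEL) {
        container('gradle') {
            sh 'gradle build'         // placeholder build step
        }
    }
}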

Trying to Implement Jupyterhub on Kubernetes

I am trying to implement JupyterHub on a set of 8 unclustered, completely identical computers at my school. My instructions were first to cluster the 8 systems (all running Ubuntu 18.04 LTS) and then to implement JupyterHub on that cluster.
After searching the net, these are the steps that I followed:
Installed Docker on both systems using these instructions
(Tried to) implement a Kubernetes cluster using these instructions and this
Implement JupyterHub using the zero-to-jupyterhub instructions
Using those instructions I managed to do steps 1 and 2 already. But after installing Helm following the zero-to-jupyterhub guide, I came across an error while doing step 2 of the "Installing JupyterHub" section of that webpage.
My exact error is:
Error: Get https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps?labelSelector=NAME%D(MISSING)jhub%!(MISSING)OWNER%D(MISSING)TILLER%!D(MISSING)DEPLOYED: dial tcp 10.96.0.1:443: i/o timeout
Error: UPGRADE FAILED : Get https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps?labelSelector=NAME%D(MISSING)jhub%!(MISSING)OWNER%D(MISSING)TILLER%!D(MISSING)DEPLOYED: dial tcp 10.96.0.1:443: i/o timeout
Then, when I open the URL, I get this: [https://10.96.0.1:443/api/v1/namespaces/...]
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "configmaps is forbidden: User \"system:anonymous\" cannot list resource \"configmaps\" in API group \"\" in the namespace \"kube-system\"",
  "reason": "Forbidden",
  "details": {
    "kind": "configmaps"
  },
  "code": 403
}
Has anyone encountered this problem? What did you do?
Thank you to anyone who answers.
Also, feel free to tell me if I'm wrong in the implementation, as I am open to new ideas. If you have a better way to do this, please leave instructions on how to implement it. Thank you very much.
It looks like you have RBAC enabled and are trying to access resources that your account is not permitted to access.
Did you follow the instructions to set up Helm/Tiller? There should be two commands that will create the proper permissions to deploy JupyterHub:
kubectl --namespace kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
Hope this helps!
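After creating the service account and cluster role binding, Tiller itself has to be (re)initialized with that account. A sketch, assuming Helm 2 / Tiller:

helm init --service-account tiller --upgrade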
I had exactly the same issue when I upgraded my minikube. In my case I had to delete the cluster and init it again - everything worked fine from there.
In your case it seems like requests from Tiller are blocked and can't reach the API. For your fresh cluster, I think the issue might be an incorrect CNI configuration, but to confirm that you would have to add information on which CNI you used, whether you used the --pod-network-cidr= flag, and any other steps that could end up conflicting with or blocking the Tiller requests.
Before adding that information I can only recommend running:
kubeadm reset
Let's assume you want to use Calico:
kubeadm init --pod-network-cidr=192.168.0.0/16
kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml
Install Helm:
curl https://raw.githubusercontent.com/helm/helm/master/scripts/get > get_helm.sh
chmod 700 get_helm.sh
./get_helm.sh
kubectl create serviceaccount tiller --namespace kube-system
kubectl create clusterrolebinding tiller-cluster-rule \
--clusterrole=cluster-admin \
--serviceaccount=kube-system:tiller
helm init --service-account=tiller
Now follow the JupyterHub tutorial:
Create the config.yaml as described here.
And install JupyterHub:
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
RELEASE=jhub
NAMESPACE=jhub
helm upgrade --install $RELEASE jupyterhub/jupyterhub \
--namespace $NAMESPACE \
--version=0.8.0 \
--values config.yaml
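For reference, a minimal config.yaml for the 0.8.x chart only needs the proxy secret token; a sketch (generate the value yourself, e.g. with openssl rand -hex 32, and do not commit the real token):

proxy:
  secretToken: "<32-byte-hex-string>"   # placeholder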

Error when deploying the stable Jenkins chart via Kubernetes: curl performs SSL certificate verification by default

I have installed Rancher 2 and created a Kubernetes cluster of internal VMs (no AWS / gcloud).
The cluster is up and running. We are behind a corporate proxy.
1) Installed kubectl and executed kubectl cluster-info. It listed my cluster information correctly.
2) Installed Helm.
3) Configured Helm, referencing Rancher Helm Init.
4) Tried installing the Jenkins chart via Helm:
helm install --namespace jenkins --name jenkins -f values.yaml stable/jenkins
The values.yaml has proxy details.
---
Master:
  ServiceType: "ClusterIP"
  AdminPassword: "adminpass111"
  Cpu: "200m"
  Memory: "256Mi"
  InitContainerEnv:
    - name: "http_proxy"
      value: "http://proxyuserproxypass#proxyname:8080"
    - name: "https_proxy"
      value: "http://proxyuserproxypass#proxyname:8080"
  ContainerEnv:
    - name: "http_proxy"
      value: "http://proxyuserproxypass#proxyname:8080"
    - name: "https_proxy"
      value: "http://proxyuserproxypass#proxyname:8080"
  JavaOpts: >-
    -Dhttp.proxyHost=proxyname
    -Dhttp.proxyPort=8080
    -Dhttp.proxyUser=proxyuser
    -Dhttp.proxyPassword=proxypass
    -Dhttps.proxyHost=proxyname
    -Dhttps.proxyPort=8080
    -Dhttps.proxyPassword=proxypass
    -Dhttps.proxyUser=proxyuser
Persistence:
  ExistingClaim: "jk-volv-pvc"
  Size: "10Gi"
5) The workloads are created. However, the Pods are stuck. The logs complain about SSL certificate verification.
How do I turn SSL verification off? I don't see an option to set it in values.yaml.
We cannot turn off installing plugins during deployment either.
Do we need to add an SSL cert when deploying charts?
Any idea how to solve this issue?
I had the same issues you had. In my case it was due to the fact that my DNS domain had a wildcard A record, so updates.jenkins.io.mydomain.com would resolve fine. After removing the wildcard, that lookup fails, so the host then properly resolves updates.jenkins.io as updates.jenkins.io.
This is fully documented here:
https://github.com/kubernetes/kubernetes/issues/64924
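A quick way to check whether a wildcard record is hijacking the update-center hostname (mydomain.com stands in for your own DNS search domain):

nslookup updates.jenkins.io.mydomain.com

If that returns an address instead of NXDOMAIN, the wildcard is the likely culprit.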

Kubernetes pods hanging in Init state

I am facing a weird issue with my pods. I am launching around 20 pods in my environment, and every time some random 3-4 of them hang in Init:0/1 status. On checking the status of the pod, the init container shows a Running status (it should terminate after its task is finished), and the app container shows the Waiting/PodInitializing stage. The same init container image and specs are used across all 20 pods, but this issue happens with different random pods every time. On terminating these stuck pods, they get stuck in the Terminating state. If I SSH onto the node on which the pod is launched and run docker ps, it shows the init container in a running state, but running docker exec throws an error that the container doesn't exist. This init container pulls configs from the Consul server, and on checking its volume (obtained from docker inspect), I found that it has pulled all the key-value pairs correctly and saved them under the defined file name. I have checked resources on all the nodes and more than enough is available everywhere.
Below is a detailed example of one pod acting like this.
Kubectl Version :
kubectl version
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.0", GitCommit:"925c127ec6b946659ad0fd596fa959be43f0cc05", GitTreeState:"clean", BuildDate:"2017-12-15T21:07:38Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.2", GitCommit:"5fa2db2bd46ac79e5e00a4e6ed24191080aa463b", GitTreeState:"clean", BuildDate:"2018-01-18T09:42:01Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
Pods :
kubectl get pods -n dev1|grep -i session-service
session-service-app-75c9c8b5d9-dsmhp 0/1 Init:0/1 0 10h
session-service-app-75c9c8b5d9-vq98k 0/1 Terminating 0 11h
Pods Status :
kubectl describe pods session-service-app-75c9c8b5d9-dsmhp -n dev1
Name:           session-service-app-75c9c8b5d9-dsmhp
Namespace:      dev1
Node:           ip-192-168-44-18.ap-southeast-1.compute.internal/192.168.44.18
Start Time:     Fri, 27 Apr 2018 18:14:43 +0530
Labels:         app=session-service-app
                pod-template-hash=3175746185
                release=session-service-app
Status:         Pending
IP:             100.96.4.240
Controlled By:  ReplicaSet/session-service-app-75c9c8b5d9
Init Containers:
  initpullconsulconfig:
    Container ID:  docker://c658d59995636e39c9d03b06e4973b6e32f818783a21ad292a2cf20d0e43bb02
    Image:         shr-u-nexus-01.myops.de:8082/utils/app-init:1.0
    Image ID:      docker-pullable://shr-u-nexus-01.myops.de:8082/utils/app-init#sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd
    Port:          <none>
    Args:
      -consul-addr=consul-server.consul.svc.cluster.local:8500
    State:          Running
      Started:      Fri, 27 Apr 2018 18:14:44 +0530
    Ready:          False
    Restart Count:  0
    Environment:
      CONSUL_TEMPLATE_VERSION:  0.19.4
      POD:                      sand
      SERVICE:                  session-service-app
      ENV:                      dev1
    Mounts:
      /var/lib/app from shared-volume-sidecar (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bthkv (ro)
Containers:
  session-service-app:
    Container ID:
    Image:          shr-u-nexus-01.myops.de:8082/sand-images/sessionservice-init:sitv12
    Image ID:
    Port:           8080/TCP
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/appenv from shared-volume-sidecar (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-bthkv (ro)
Conditions:
  Type           Status
  Initialized    False
  Ready          False
  PodScheduled   True
Volumes:
  shared-volume-sidecar:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-bthkv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-bthkv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>
Container status on the node:
sudo docker ps|grep -i session
c658d5999563 shr-u-nexus-01.myops.de:8082/utils/app-init#sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd "/usr/bin/consul-t..." 10 hours ago Up 10 hours k8s_initpullconsulconfig_session-service-app-75c9c8b5d9-dsmhp_dev1_c2075f2a-4a18-11e8-88e7-02929cc89ab6_0
da120abd3dbb gcr.io/google_containers/pause-amd64:3.0 "/pause" 10 hours ago Up 10 hours k8s_POD_session-service-app-75c9c8b5d9-dsmhp_dev1_c2075f2a-4a18-11e8-88e7-02929cc89ab6_0
f53d48c7d6ec shr-u-nexus-01.myops.de:8082/utils/app-init#sha256:7b0692e3f2e96c6e54c2da614773bb860305b79922b79642642c4e76bd5312cd "/usr/bin/consul-t..." 10 hours ago Up 10 hours k8s_initpullconsulconfig_session-service-app-75c9c8b5d9-vq98k_dev1_42837d12-4a12-11e8-88e7-02929cc89ab6_0
c26415458d39 gcr.io/google_containers/pause-amd64:3.0 "/pause" 10 hours ago Up 10 hours k8s_POD_session-service-app-75c9c8b5d9-vq98k_dev1_42837d12-4a12-11e8-88e7-02929cc89ab6_0
On running docker exec (same result with kubectl exec):
sudo docker exec -it c658d5999563 bash
rpc error: code = 2 desc = containerd: container not found
A Pod can be stuck in Init status for many reasons.
PodInitializing or Init status means that the Pod contains an init container that hasn't finished (init containers are specialized containers that run before the app containers in a Pod; they can contain utilities or setup scripts). If the pod's status is Init:0/1, it means one init container has not finished; Init:N/M means the Pod has M init containers and N of them have completed so far.
Gathering information
For these scenarios the best approach is to gather information, as the root cause can be different in every PodInitializing issue.
kubectl describe pods pod-XXX - with this command you can get a lot of information about the pod, and check whether there are any meaningful events. Note down the init container name.
kubectl logs pod-XXX - this command prints the logs for a container in a pod or specified resource.
kubectl logs pod-XXX -c init-container-xxx - this is the most useful one, as it prints the logs of the init container itself. You can get the init container name by describing the pod and use it in place of "init-container-xxx", for example "copy-default-config".
The output of kubectl logs pod-XXX -c init-container-xxx can reveal meaningful information about the issue. In the referenced example, the logs showed that the root cause was the init container failing to download the plugins from Jenkins (a timeout); from there you can check the connection config, proxy, and DNS, or simply modify the YAML to deploy the container without the plugins.
Additional:
kubectl describe node node-XXX - describing the pod will give you the name of its node, which you can then inspect with this command.
kubectl get events - lists the cluster events.
journalctl -xeu kubelet | tail -n 10 - kubelet logs on systemd (journalctl -xeu docker | tail -n 1 for Docker).
Solutions
The solution depends on the information gathered, once the root cause is found.
When you find a log that gives insight into the root cause, you can investigate that specific root cause.
Some examples:
1 > In that case the issue happened when the init container was deleted; it can be fixed by deleting the pod so that it is recreated, or by redeploying it. Same scenario in 1.1.
2 > If you find "bad address 'kube-dns.kube-system'", the PVC may not be recycled correctly; the solution provided in 2 is running /opt/kubernetes/bin/kube-restart.sh.
3 > There, an sh file was not found; the solution would be to modify the YAML file, or remove the container if it is unnecessary.
4 > A FailedSync was found, and it was solved by restarting Docker on the node.
In general you can modify the YAML (for example to avoid using an outdated URL), try to recreate the affected resource, or just remove the init container that causes the issue from your deployment. However, the specific solution will depend on the specific root cause.
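As a small illustration of the "recreate the affected resource" option (the pod name and namespace are placeholders):

# delete the stuck pod; its controller (ReplicaSet/StatefulSet) recreates it
kubectl delete pod pod-XXX -n my-namespace
# if the pod hangs in Terminating, force the deletion
kubectl delete pod pod-XXX -n my-namespace --grace-period=0 --force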
My problem was related to the ebs-csi-controller (AWS EKS 1.24).
The EBS add-on needs access to a role, and in my case the role's trust relationship was broken. It uses OIDC, so I had to add my cluster's OIDC provider manually in the IAM identity provider section.
kubectl logs deployment/ebs-csi-controller -n kube-system -c ebs-plugin
helped diagnose this, as well as
https://aws.amazon.com/premiumsupport/knowledge-center/eks-troubleshoot-ebs-volume-mounts/
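If you need to wire up the OIDC provider yourself, a sketch of the usual checks (the cluster name is a placeholder):

# print the cluster's OIDC issuer URL
aws eks describe-cluster --name my-cluster --query "cluster.identity.oidc.issuer" --output text
# create and associate an IAM OIDC provider for that issuer
eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve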
