I'm trying to deploy an app on k8s, but I keep getting the following error:
NAME READY STATUS RESTARTS AGE
pod_name 1/2 CreateContainerConfigError 0 26m
If I try to see what went wrong using kubectl describe pod pod_name, I get another error:
Error from server (NotFound): pods "pod_name" not found
You didn't include the command that generated the output shown, so it's hard to tell. Perhaps you're looking at different namespaces?
One of the parameter keys in the file was misspelled, making the deploy fail. Unfortunately, the error message was not helpful...
CreateContainerConfigError means Kubernetes is unable to find the ConfigMap you have referenced for a volume. Make sure both the Pod and the ConfigMap are deployed in the same namespace.
Also cross-verify that the name of the ConfigMap you created matches the ConfigMap name specified in the Pod's volume definition.
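For reference, a minimal sketch of a Pod mounting a ConfigMap as a volume (all names here are illustrative, not from the question); the configMap.name must match a ConfigMap that exists in the Pod's namespace:
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ['sh', '-c', 'sleep 3600']
      volumeMounts:
        - name: config-volume
          mountPath: /etc/config
  volumes:
    - name: config-volume
      configMap:
        name: app-config    # must exist in the same namespace as the Pod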
The message Error from server (NotFound): pods "pod_name" not found makes it clear to me that you deployed your pod in a different namespace.
In your deployment YAML file, check the value of namespace and execute:
kubectl describe pod pod_name -n namespace_from_yaml_file
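If you are not sure which namespace the pod actually landed in, you can also search across all of them first (pod_name is a placeholder, as in the question):
$ kubectl get pods --all-namespaces | grep pod_name
$ kubectl describe pod pod_name -n <namespace-from-output>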
Red Hat OpenShift automatically creates a range of user IDs that can be used in a given namespace, e.g.
$ oc describe namespace xyz-123
Name: xyz-123
Labels: <none>
Annotations: xx.com/accountID: id-testcluster-account
xx.com/type: System
openshift.io/sa.scc.mcs: s0:c25,c20
openshift.io/sa.scc.supplemental-groups: 1000640000/10000
openshift.io/sa.scc.uid-range: 1000640000/10000
Here is the problem:
While creating the Docker image, I am setting the USER ID in the Dockerfile:
USER 1001121001:1001121001
I am specifying runAsUser in Helm charts to deploy this image:
runAsUser: 1001121001
When I try to create the deployment, it fails because the user ID 1001121001 does not fall in the range above, i.e. [1000640000, 1000640000+10000].
The deployment error:
$ oc get deployment abc-123 -n xyz-123 -o yaml
....
....
message: 'pods "abc-123-7f8fc74765-" is forbidden: unable to validate against any security context constraint: [spec.containers[0].securityContext.runAsUser: Invalid value: 1000321001: must be in the ranges: [1000660000, 1000669999]]'
....
....
Tried option 1:
Using anyuid works, as described here: https://www.openshift.com/blog/a-guide-to-openshift-and-uids
But the document says:
"Upon closer inspection of the “anyuid” SCC, it is clear that any user and any group can be used by the Pod launched by a ServiceAccount with access to the “anyuid” SCC. The “RunAsAny” strategy is effectively skipping the default OpenShift restrictions and authorization allowing the Pod to choose any ID."
Hence, I don't want to use this anyuid option.
Tried option 2:
After creating a namespace, get the UID range allowed for that namespace, select an ID from it (say 1000660000), and use it while deploying by setting runAsUser: 1000660000.
However, all files/folders in the Docker image have their ownership/permissions set for USER 1001121001, while the container starts with ID 1000660000, so the container hits read/write/execute permission issues.
To overcome this I would need to give o+rwx permissions to all the files, which is risky.
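A common middle ground, described in the OpenShift guide linked above, is to rely on group 0 rather than world-writable permissions: OpenShift runs the container's arbitrary UID with GID 0, so giving the root group the same permissions as the owner avoids o+rwx. A sketch, assuming the application content lives under /app (the path and base image are illustrative):
FROM registry.example.com/base:latest
COPY app/ /app/
# Give group 0 the same permissions as the owner, so any arbitrary UID works
RUN chgrp -R 0 /app && chmod -R g=u /app
USER 1001121001:0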
Is there any other way to specify a USER in Dockerfile and use the same USER id during deployment in Redhat Openshift?
$ oc version
Client Version: 4.6.9
Server Version: 4.6.9
Kubernetes Version: v1.19.0+7070803
Solution:
The suggestion from Ritesh worked.
Created a namespace with a predefined UID range covering the specific USER ID 1001121001, then created the deployment in that namespace:
apiVersion: v1
kind: Namespace
metadata:
  name: xyz-123
  annotations:
    openshift.io/sa.scc.mcs: 's0:c26,c5'
    openshift.io/sa.scc.supplemental-groups: 1001120001/10000
    openshift.io/sa.scc.uid-range: 1001120001/10000
If you are creating the namespace during deployment, or beforehand, you can use the following option. With this you can use runAsUser: 1001121001 (or any other user ID).
Define the YAML file:
apiVersion: v1
kind: Namespace
metadata:
  name: dspx-dummy-runtimeid
  annotations:
    openshift.io/sa.scc.mcs: <>
    openshift.io/sa.scc.supplemental-groups: <>
    openshift.io/sa.scc.uid-range: <>
Use kubectl apply -f <namespace.yaml> or oc apply -f <namespace.yaml>.
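For completeness, a sketch of the matching Deployment fragment (the names are illustrative; the key point is that runAsUser matches the UID baked into the image by the Dockerfile USER instruction and falls inside the namespace's sa.scc.uid-range):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: abc-123
  namespace: xyz-123
spec:
  replicas: 1
  selector:
    matchLabels:
      app: abc-123
  template:
    metadata:
      labels:
        app: abc-123
    spec:
      securityContext:
        runAsUser: 1001121001    # same UID as in the Dockerfile, inside the namespace range
      containers:
        - name: app
          image: <image>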
We are deploying Java microservices to AWS ECR and EKS using Helm 3 and a Jenkins CI/CD pipeline. However, what we see is that if we re-run the Jenkins job to re-install the deployment/pod, the pod is not re-installed when there are no code changes; it keeps the old running pod as is. The use case considered here is that the AWS Secrets Manager configuration for a DB secret pulled during deployment has changed, so the service needs to be redeployed by re-triggering the Jenkins job.
Approach 1 : https://helm.sh/docs/helm/helm_upgrade/
I tried using 'helm upgrade --install --force ....' as suggested in the Helm 3 upgrade documentation, but it fails with the below error in the Jenkins log:
"Error: UPGRADE FAILED: failed to replace object: Service "dbservice" is invalid: spec.clusterIP: Invalid value: "": field is immutable"
Approach 2: using --recreate-pods from earlier Helm versions
With 'helm upgrade --install --recreate-pods ....', I am getting the below warning in the Jenkins log:
"Flag --recreate-pods has been deprecated, functionality will no longer be updated. Consult the documentation for other methods to recreate pods"
However, the pod does get recreated. But as we know, --recreate-pods is not a soft restart, so we would have downtime, which breaks the microservice principle.
Helm version used:
version.BuildInfo{Version:"v3.4.0", GitCommit:"7090a89efc8a18f3d8178bf47d2462450349a004", GitTreeState:"clean", GoVersion:"go1.14.10"}
Questions:
1. How do I use --force with helm upgrade in Helm 3, given the above error?
2. How do I achieve a soft restart now that --recreate-pods is deprecated?
This is nicely described in Helm documentation: https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
Below is how I configured it. Thanks to @vasili-angapov for pointing to the correct documentation section.
In deployment.yaml, I added a rollme annotation:
kind: Deployment
spec:
  template:
    metadata:
      annotations:
        rollme: {{ randAlphaNum 5 | quote }}
As per the documentation, each invocation of the template function randAlphaNum generates a unique random string, so the annotation value changes on every upgrade and causes the deployment to roll.
The other way described in the document is to base the annotation on the SHA checksum of a file, so that pods roll only when that file changes.
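For reference, a minimal sketch of that checksum-based variant from the same documentation section (it assumes your chart contains a configmap.yaml template; adjust the path to your chart):
kind: Deployment
spec:
  template:
    metadata:
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
Unlike rollme, this rolls the pods only when the rendered ConfigMap actually changes, not on every helm upgrade.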
In the past helm recommended using the --recreate-pods flag as another option. This flag has been marked as deprecated in Helm 3 in favor of the more declarative method above.
OpenShift/OKD version: 3.11
I'm using the jenkins-ephemeral app from the OpenShift catalog and a BuildConfig to create a pipeline. Reference: https://docs.okd.io/3.11/dev_guide/dev_tutorials/openshift_pipeline.html
When I start the pipeline, one of the Jenkins stages needs to create a persistent volume, and at that point I'm getting the following error:
Error from server (Forbidden): persistentvolumes is forbidden: User "system:serviceaccount:pipelineproject:jenkins" cannot create persistentvolumes at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io "create" not found
I have tried giving the create cluster role to the jenkins service account with the following command, but I'm still getting the same error.
oc adm policy add-cluster-role-to-user create system:serviceaccount:pipelineproject:jenkins
Creating a PersistentVolume is typically not something you should be doing manually; you should ideally rely on PersistentVolumeClaims. PersistentVolumeClaims are namespaced resources that your service account should be able to create with the edit role.
$ oc project pipelineproject
$ oc policy add-role-to-user edit -z jenkins
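With the edit role granted, the pipeline can create a claim like this instead of a PersistentVolume (a minimal sketch; the name, size, and access mode are illustrative):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-data
  namespace: pipelineproject
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi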
However, if it's required that you interact with PersistentVolume objects directly, there is a storage-admin ClusterRole that should be able to give your ServiceAccount the necessary permissions.
$ oc project pipelineproject
$ oc adm policy add-cluster-role-to-user storage-admin -z jenkins
I am trying to run the artifact passing example on Argoproj. However, I am getting the following error:
failed to save outputs: verify serviceaccount platform:default has necessary privileges
This error is appearing in the first step (generate-artifact) itself.
Selecting the generate-artifact component and clicking YAML shows the failing line highlighted. Nothing appears on clicking LOGS.
I need to understand the correct sequence of steps for running the YAML file so that this error does not appear and artifacts are passed. I could not find many resources on this issue other than the page where it is discussed on the Argo repository.
All pods in a workflow run with the service account specified in workflow.spec.serviceAccountName, or if omitted, the default service account of the workflow's namespace.
Here, the default service account of that namespace doesn't appear to have been granted any roles.
Try granting a role to the “default” service account in a namespace:
kubectl create rolebinding argo-default-binding \
--clusterrole=cluster-admin \
--serviceaccount=platform:default \
--namespace=platform
Since the default service account now has full access via the cluster-admin role, the example should work.
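Alternatively, since pods run with whatever workflow.spec.serviceAccountName specifies (as noted above), you can point the workflow at a dedicated service account instead of widening default. A sketch, where argo-workflow is an illustrative service account you would create and bind yourself:
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-passing-
spec:
  serviceAccountName: argo-workflow  # illustrative; the rest of the workflow spec is unchanged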
I'm trying to delete a failed pod from the Pods page on the Kubernetes Web UI, but it is not getting deleted.
I understand what the error itself is, and I believe I have resolved it by using secrets, since this is a private repo. That said, I cannot re-add the pod again correctly, since it already exists.
Here is what I am seeing on the Pods page in the Kubernetes UI:
Pod Status: Waiting: ContainerCreating
Error:
Failed to pull image "<USERNAME>/<REPO>:<TAG>":
failed to run [fetch --no-store docker://<USERNAME>/<REPO>:<TAG>]:
exit status 254 stdout: stderr: Flag --no-store has been deprecated,
please use --pull-policy=update fetch: Unexpected HTTP code: 401, URL: https://xxxxxx.docker.io/v2/<USERNAME>/<NAMESPACE>/manifests/<TAG>
Error syncing pod
I have also tried deleting the pod with kubectl, but kubectl can't even see the failed pod!
$ kubectl get pods
No resources found
$ kubectl get pods --show-all
No resources found
Is there any other way that I can delete this pod?
I just found a solution to my own problem: go to the Workloads page in the Kubernetes Web UI and delete the associated Deployment; the Pod will be deleted as well.
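The CLI equivalent, assuming the pod is managed by a Deployment (the name is a placeholder):
$ kubectl get deployments
$ kubectl delete deployment <DEPLOYMENT_NAME>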
If the pod does not get deleted after this, you will need to force a delete from the command line.
kubectl delete pod <POD_NAME> --grace-period=0 --force
Use the command below to delete a terminated or failed pod:
kubectl delete pods pod_name --grace-period=0 --force -n namespace_name