Kubernetes pod not binding volumes to container - docker

I've got the following ReplicationController JSON defined:
{
"id": "PHPController",
"kind": "ReplicationController",
"apiVersion": "v1beta1",
"desiredState": {
"replicas": 2,
"replicaSelector": {"name": "php"},
"podTemplate": {
"desiredState": {
"manifest": {
"version": "v1beta1",
"id": "PHPController",
"volumes": [{ "name": "wordpress", "path": "/mnt/nfs/wordpress_a", "hostDir": "/mnt/nfs/wordpress_a"}],
"containers": [{
"name": "php",
"image": "internaluser/php53",
"ports": [{"containerPort": 80, "hostPort": 9021}],
"volumeMounts": [{"name": "wordpress", "mountPath": "/mnt/nfs/wordpress_a"}]
}]
}
},
"labels": {"name": "php"}
}},
"labels": {"name": "php"}
}
The container starts correctly when run with "docker run -t -i -p 0.0.0.0:9021:80 -v /mnt/nfs/wordpress_a:/mnt/nfs/wordpress_a:rw internaluser/php53".
/mnt/nfs/wordpress_a is an NFS share, mounted on all of the minions. Each minion has full RW access and I have verified that the share is present.
After creating the pod containers with the Replication Controller, I can see that the volume was never actually bound, and/or incorrectly mounted:
"Volumes": {
"/mnt/nfs/wordpress_a": "/var/lib/docker/vfs/dir/8b5dc8477958f5c1b894e68ab9412b41e81a34ef16dac81f0f9d4884352a90b7"
},
"VolumesRW": {
"/mnt/nfs/wordpress_a": true
}
"HostConfig": {
"Binds": null,
"ContainerIDFile": "",
"LxcConf": null,
"Privileged": false,
"PortBindings": {
"80/tcp": [
{
"HostIp": "",
"HostPort": "9021"
}
]
},
I find it strange that the container believes /mnt/nfs/wordpress_a is mapped to "/var/lib/docker/vfs/dir/8b5dc8477958f5c1b894e68ab9412b41e81a34ef16dac81f0f9d4884352a90b7".
From the kubelet log:
Desired [10.101.4.15]: [{Namespace:etcd Name:c823da9e-4437-11e4-a3b1-0050568421eb Manifest:{Version:v1beta1 ID:c823da9e-4437-11e4-a3b1-0050568421eb UUID:c823da9e-4437-11e4-a3b1-0050568421eb Volumes:[{Name:wordpress Source:}] Containers:[{Name:php Image:internaluser/php53 Command:[] WorkingDir: Ports:[{Name: HostPort:9021 ContainerPort:80 Protocol:TCP HostIP:}] Env:[{Name:SERVICE_HOST Value:10.1.1.1}] Memory:0 CPU:0 VolumeMounts:[{Name:wordpress ReadOnly:false MountPath:/mnt/nfs/wordpress_a}] LivenessProbe: Lifecycle: Privileged:false}] RestartPolicy:{Always:0xa99a20 OnFailure: Never:}}}]
Does anyone have experience with this sort of thing? I've been driving myself crazy troubleshooting this. Thanks!

Solved. The volumes syntax was incorrect.
https://github.com/GoogleCloudPlatform/kubernetes/issues/1446

Related

Rabbitmq in Kubernetes: Command not found

Trying to start up rabbitmq in K8s while attaching a configmap gives me the following error:
/usr/local/bin/docker-entrypoint.sh: line 367: rabbitmq-plugins: command not found
/usr/local/bin/docker-entrypoint.sh: line 405: exec: rabbitmq-server: not found
Exactly the same setup is working fine with docker-compose, so I am a bit lost. Using rabbitmq:3.8.3
Here is a snippet from my deployment:
"template": {
"metadata": {
"creationTimestamp": null,
"labels": {
"app": "rabbitmq"
}
},
"spec": {
"volumes": [
{
"name": "rabbitmq-configuration",
"configMap": {
"name": "rabbitmq-configuration",
"defaultMode": 420
}
}
],
"containers": [
{
"name": "rabbitmq",
"image": "rabbitmq:3.8.3",
"ports": [
{
"containerPort": 5672,
"protocol": "TCP"
}
],
"env": [
{
"name": "RABBITMQ_DEFAULT_USER",
"value": "guest"
},
{
"name": "RABBITMQ_DEFAULT_PASS",
"value": "guest"
},
{
"name": "RABBITMQ_ENABLED_PLUGINS_FILE",
"value": "/opt/enabled_plugins"
}
],
"resources": {},
"volumeMounts": [
{
"name": "rabbitmq-configuration",
"mountPath": "/opt/"
}
],
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent"
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"securityContext": {},
"schedulerName": "default-scheduler"
}
},
And here is the configuration:
{
"kind": "ConfigMap",
"apiVersion": "v1",
"metadata": {
"name": "rabbitmq-configuration",
"namespace": "e360",
"selfLink": "/api/v1/namespaces/default/configmaps/rabbitmq-configuration",
"uid": "28071976-98f6-11ea-86b2-0244a03303e1",
"resourceVersion": "1034540",
"creationTimestamp": "2020-05-18T10:55:58Z"
},
"data": {
"enabled_plugins": "[rabbitmq_management].\n"
}
}
That's because you're monting a volume in /opt, which is the rabbitmq home path.
So, the entrypoint script cannot find any of the rabbitmq binaries.
You can see the rabbitmq Dockerfile here

Orphaned Tasks in Docker Swarm after removal of failed node

Last week I had to remove a failed node from my Docker Swarm Cluster, leaving some tasks that ran on that node in desired state "Remove".
Even after deleting the stack and recreating it with the same name, docker stack ps stackname still shows them.
Interestingly enough, after recreating the stack, the tasks are still there, but with no node assigned.
Here's what I tried so far to "cleanup" the stack:
Recreating the stack with the same name
docker container prune
docker volume prune
docker system prune
Is there a way to remove a specific task?
Here's the output for docker inspect fkgz0oihexzs, the first task in the list:
[
{
"ID": "fkgz0oihexzsjqwv4ju0szorh",
"Version": {
"Index": 14422171
},
"CreatedAt": "2018-11-05T16:15:31.528933998Z",
"UpdatedAt": "2018-11-05T16:27:07.422368364Z",
"Labels": {},
"Spec": {
"ContainerSpec": {
"Image": "redacted",
"Labels": {
"com.docker.stack.namespace": "redacted"
},
"Env": [
"redacted"
],
"Privileges": {
"CredentialSpec": null,
"SELinuxContext": null
},
"Isolation": "default"
},
"Resources": {},
"Placement": {
"Platforms": [
{
"Architecture": "amd64",
"OS": "linux"
}
]
},
"Networks": [
{
"Target": "3i998stqemnevzgiqw3ndik4f",
"Aliases": [
"redacted"
]
}
],
"ForceUpdate": 0
},
"ServiceID": "g3vk9tgfibmcigmf67ik7uhj6",
"Slot": 1,
"Status": {
"Timestamp": "2018-11-05T16:15:31.528892467Z",
"State": "new",
"Message": "created",
"PortStatus": {}
},
"DesiredState": "remove"
}
]
I had the same problem. I resolved it following this instructions :
docker run --rm -v /var/run/docker/swarm/control.sock:/var/run/swarmd.sock dperny/tasknuke <taskid>
Be sure to use the full long task id or it will not work (fkgz0oihexzsjqwv4ju0szorh in your case).

How to directly mount external NFS share/volume in kubernetes(1.10.3)

I am using kubernetes : v1.10.3 , i have one external NFS server which i am able to mount anywhere ( any physical machines). I want to mount this NFS directly to pod/container . I tried but every time i am getting error. don't want to use privileges, kindly help me to fix.
ERROR: MountVolume.SetUp failed for volume "nfs" : mount failed: exit
status 32 Mounting command: systemd-run Mounting arguments:
--description=Kubernetes transient mount for /var/lib/kubelet/pods/d65eb963-68be-11e8-8181-00163eeb9788/volumes/kubernetes.io~nfs/nfs
--scope -- mount -t nfs 10.225.241.137:/stagingfs/alt/ /var/lib/kubelet/pods/d65eb963-68be-11e8-8181-00163eeb9788/volumes/kubernetes.io~nfs/nfs
Output: Running scope as unit run-43393.scope. mount: wrong fs type,
bad option, bad superblock on 10.225.241.137:/stagingfs/alt/, missing
codepage or helper program, or other error (for several filesystems
(e.g. nfs, cifs) you might need a /sbin/mount. helper program)
In some cases useful info is found in syslog - try dmesg | tail or so.
NFS server : mount -t nfs 10.X.X.137:/stagingfs/alt /alt
I added two things for volume here but getting error every time.
first :
"volumeMounts": [
{
"name": "nfs",
"mountPath": "/alt"
}
],
Second :
"volumes": [
{
"name": "nfs",
"nfs": {
"server": "10.X.X.137",
"path": "/stagingfs/alt/"
}
}
],
---------------------complete yaml --------------------------------
{
"kind": "Deployment",
"apiVersion": "extensions/v1beta1",
"metadata": {
"name": "jboss",
"namespace": "staging",
"selfLink": "/apis/extensions/v1beta1/namespaces/staging/deployments/jboss",
"uid": "6a85e235-68b4-11e8-8181-00163eeb9788",
"resourceVersion": "609891",
"generation": 2,
"creationTimestamp": "2018-06-05T11:34:32Z",
"labels": {
"k8s-app": "jboss"
},
"annotations": {
"deployment.kubernetes.io/revision": "2"
}
},
"spec": {
"replicas": 1,
"selector": {
"matchLabels": {
"k8s-app": "jboss"
}
},
"template": {
"metadata": {
"name": "jboss",
"creationTimestamp": null,
"labels": {
"k8s-app": "jboss"
}
},
"spec": {
"volumes": [
{
"name": "nfs",
"nfs": {
"server": "10.X.X.137",
"path": "/stagingfs/alt/"
}
}
],
"containers": [
{
"name": "jboss",
"image": "my.abc.com/alt:7.1_1.1",
"resources": {},
"volumeMounts": [
{
"name": "nfs",
"mountPath": "/alt"
}
],
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent",
"securityContext": {
"privileged": true
}
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"securityContext": {},
"schedulerName": "default-scheduler"
}
},
"strategy": {
"type": "RollingUpdate",
"rollingUpdate": {
"maxUnavailable": "25%",
"maxSurge": "25%"
}
},
"revisionHistoryLimit": 10,
"progressDeadlineSeconds": 600
},
"status": {
"observedGeneration": 2,
"replicas": 1,
"updatedReplicas": 1,
"readyReplicas": 1,
"availableReplicas": 1,
"conditions": [
{
"type": "Available",
"status": "True",
"lastUpdateTime": "2018-06-05T11:35:45Z",
"lastTransitionTime": "2018-06-05T11:35:45Z",
"reason": "MinimumReplicasAvailable",
"message": "Deployment has minimum availability."
},
{
"type": "Progressing",
"status": "True",
"lastUpdateTime": "2018-06-05T11:35:46Z",
"lastTransitionTime": "2018-06-05T11:34:32Z",
"reason": "NewReplicaSetAvailable",
"message": "ReplicaSet \"jboss-8674444985\" has successfully progressed."
}
]
}
}
Regards
Anupam Narayan
As stated in the error log:
for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount. helper program
According to this question, you might be missing the nfs-commons package which you can install using sudo apt install nfs-common

Mesos Marathon(ctl) Debugging - "Abnormal executor termination: unknown container"

I'm still new to Mesos, but am trying to figure out the best way to debug a Mesos application I'm attempting to develop. I'm getting the error message "Abnormal executor termination: unknown container" through the web application, and am unsure how to get more descriptive error messages to figure out what's going on. The error message would seem to indicate it can't find the Docker image, but I know for a fact it's referencing the correct image that is installed and running.
{
"id": "pgprimary",
"cmd": null,
"cpus": 1,
"mem": 128,
"disk": 0,
"instances": 1,
"container": {
"docker": {
"image": "example/postgres:centos7-10.0-1.6.0",
"network": "BRIDGE",
"parameters": [{
"key": "hostname",
"value": "pgprimary"
}],
"portMappings": [
]
},
"type": "DOCKER",
"volumes": [
{
"hostPath": "/mnt/nfsfileshare/pgdata",
"containerPath": "/pgdata",
"mode": "RW"
}
]
},
"env": {
"PG_MODE": "primary",
"PG_USER": "testuser",
"PG_PASSWORD": "testuser",
"PG_DATABASE": "userdb",
"PG_ROOT_PASSWORD": "password",
"PG_PRIMARY_USER": "primaryuser",
"PG_PRIMARY_PASSWORD": "password",
"PG_PRIMARY_PORT": "5432"
},
"labels": {},
"healthChecks": [
{
"protocol": "COMMAND",
"command": {
"value": "/usr/pgsql-10/bin/pg_isready --host=pgprimary.marathon.mesos"
},
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 3,
"ignoreHttp1xx": false
}
]
}
The command I'm using to deploy the Marathon app:
marathonctl -h http://10.0.2.15:8080 app create postgres.json
Not image, but docker is what marathon cannot find.
Specify the use of the Docker containerizer:
echo 'docker,mesos' > /etc/mesos-slave/containerizers
Provisioning Containers with the Docker Containerizer
https://mesosphere.github.io/marathon/docs/native-docker.html

Service host/port undefined, Kubernetes/Google Container Engine

I have a service with the name mongodb. According to the documentation, the service host and port should be available to other pods in the same cluster through $MONGODB_SERVICE_HOST and $MONGODB_SERVICE_PORT.
However, neither of these are set in my frontend pods. What are the requirements for this to work?
frontend-controller.json
{
"id": "frontend",
"kind": "ReplicationController",
"apiVersion": "v1beta1",
"desiredState": {
"replicas": 1,
"replicaSelector": {"name": "spatula", "role": "frontend"},
"podTemplate": {
"desiredState": {
"manifest": {
"version": "v1beta1",
"id": "frontend",
"containers": [{
"name": "frontend",
"image": "gcr.io/crafty_apex_841/spatula_frontend",
"cpu": 100,
"ports": [{"name": "spatula-server", "containerPort": 80}]
}]
}
},
"labels": { "name": "spatula", "role": "frontend" }
}
},
"labels": { "name": "spatula", "role": "frontend" }
}
frontend-service.json
{
"apiVersion": "v1beta1",
"kind": "Service",
"id": "frontend",
"port": 80,
"containerPort": "spatula-server",
"labels": { "name": "spatula", "role": "frontend" },
"selector": { "name": "spatula", "role": "frontend" },
"createExternalLoadBalancer": true
}
mongodb-service.json
{
"apiVersion": "v1beta1",
"kind": "Service",
"id": "mongodb",
"port": 27017,
"containerPort": "mongodb-server",
"labels": { "name": "spatula", "role": "mongodb" },
"selector": { "name": "spatula", "role": "mongodb" }
}
mongodb-controller.json
{
"id": "mongodb",
"kind": "ReplicationController",
"apiVersion": "v1beta1",
"desiredState": {
"replicas": 1,
"replicaSelector": {"name": "spatula", "role": "mongodb"},
"podTemplate": {
"desiredState": {
"manifest": {
"version": "v1beta1",
"id": "mongodb",
"containers": [{
"name": "mongodb",
"image": "dockerfile/mongodb",
"cpu": 100,
"ports": [{"name": "mongodb-server", "containerPort": 27017}]
}]
}
},
"labels": { "name": "spatula", "role": "mongodb" }
}
},
"labels": { "name": "spatula", "role": "mongodb" }
}
The service:
$ gcloud preview container services list
NAME LABELS SELECTOR IP PORT
mongodb name=spatula,role=mongodb name=spatula,role=mongodb 10.111.240.154 27017
The pod:
$ gcloud preview container pods list
POD IP CONTAINER(S) IMAGE(S) HOST LABELS STATUS
9ffd980f-ab56-11e4-ad76-42010af069b6 10.108.0.11 mongodb dockerfile/mongodb k8s-spatula-node-1.c.crafty-apex-841.internal/104.154.44.77 name=spatula,role=mongodb Running
Because environment variables for pods are only created when the pod is started, the service has to exist before a given pod in order for that pod to see the service's environment variables. You should be able to see them from all new pods you create.
If you'd like to learn more, additional explanation of how services work can be found in the documentation.
Alternatively, all newly created clusters in Container Engine (version 0.9.2 and above) have a SkyDNS service running in the cluster that you can use to access services from pods, even those without the environment variables.

Resources