I've been working with Mesos, Marathon, and Docker for quite a while, but I'm stuck. I'm trying to set up a persistent data container and have been experimenting with the "volumes-from" parameter, but I can't make it work because I don't know how to find the name of the data container to use as the value in the JSON. I tried the example from here:
{
"id": "privileged-job",
"container": {
"docker": {
"image": "mesosphere/inky",
"privileged": true,
"parameters": [
{ "key": "hostname", "value": "a.corp.org" },
{ "key": "volumes-from", "value": "another-container" },
{ "key": "lxc-conf", "value": "..." }
]
},
"type": "DOCKER",
"volumes": []
},
"args": ["hello"],
"cpus": 0.2,
"mem": 32.0,
"instances": 1
}
I would really appreciate any kind of help :-)
From what I know:
docker run --volumes-from takes the ID or the name of a container.
Since your data container is launched with Marathon too, it gets an ID (I'm not sure how to get this ID from Marathon) and a name of the form mesos-0fb2e432-7330-4bfe-bbce-4f77cf382bb4, which is related neither to the task ID in Mesos nor to the Docker ID.
The solution would be to write something like this for your web-ubuntu application:
"parameters": [
{ "key": "volumes-from", "value": "mesos-0fb2e432-7330-4bfe-bbce-4f77cf382bb4" }
]
Since this Docker name is not known to Marathon, it is not practical to use data containers that are started with Marathon.
You can try to start a data container directly with Docker (without using Marathon) and use it as you did before, but since you don't know in advance where web-ubuntu will be scheduled (unless you add a constraint to force it), that is not practical either.
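The constraint option mentioned above could look roughly like the following. This is only a sketch in Python that builds the app definition as a dictionary, assuming Marathon's hostname CLUSTER constraint; the container name my-data-container, the image ubuntu, and the host agent-1.example.org are hypothetical placeholders, not values from the question.

```python
import json


def pinned_app(app_id, image, data_container, hostname):
    """Build a Marathon app definition that mounts volumes from an existing
    Docker container and is pinned to the host where that container lives."""
    return {
        "id": app_id,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                # Passed through to `docker run --volumes-from <name>`.
                "parameters": [
                    {"key": "volumes-from", "value": data_container}
                ],
            },
        },
        # Marathon constraint: only schedule on the agent with this hostname.
        "constraints": [["hostname", "CLUSTER", hostname]],
        "cpus": 0.2,
        "mem": 32.0,
        "instances": 1,
    }


# All four arguments are hypothetical placeholders.
app = pinned_app("web-ubuntu", "ubuntu", "my-data-container", "agent-1.example.org")
print(json.dumps(app, indent=2))
```

The printed JSON can be posted to Marathon like any other app definition. The data container itself would still have to be started by hand with Docker on that one host.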
{
"id": "data-container",
"container": {
"docker": {
"image": "mesosphere/inky"
},
"type": "DOCKER",
"volumes": [
{
"containerPath": "/data",
"hostPath": "/var/data/a",
"mode": "RW"
}
]
},
"args": ["data-only"],
"cpus": 0.2,
"mem": 32.0,
"instances": 1
}
{
"id": "privileged-job",
"container": {
"docker": {
"image": "mesosphere/inky",
"privileged": true,
"parameters": [
{ "key": "hostname", "value": "a.corp.org" },
{ "key": "volumes-from", "value": "data-container" },
{ "key": "lxc-conf", "value": "..." }
]
},
"type": "DOCKER",
"volumes": []
},
"args": ["hello"],
"cpus": 0.2,
"mem": 32.0,
"instances": 1
}
Something like that maybe?
Mesos supports passing volume plugin parameters as "key"/"value" pairs. The issue is how to pass the volume name: Mesos expects it to be an absolute path, and if it is not, Mesos joins the given name with the slave's container sandbox folder. This is done primarily to support checkpointing, in case the slave goes down unexpectedly.
The only option, until this is enhanced, is to use another key/value parameter. For example, in the case above:
{ "key": "volumes-from", "value": "databox" },
{ "key": "volume", "value": "datebox_volume" }
I have tested the above with a plugin and it works.
Another approach is to write a custom Mesos framework capable of running the docker command you want. To know which offers to accept and where to place each task, you can use the Marathon information from /v2/apps/ (under the tasks key).
A good starting point for writing a new mesos framework is: https://github.com/mesosphere/RENDLER
I have 2 tasks in Visual Studio Code that run 2 different images in containers. Only the last docker-run task is recognized by VS Code.
This is my tasks.json file:
{
"version": "2.0.0",
"tasks": [
{
"label": "docker-build-1",
"type": "docker-build",
"platform": "python",
"dockerBuild": {
"tag": "image1:latest",
"dockerfile": "${workspaceFolder}/app1/dev.Dockerfile",
"context": "${workspaceFolder}/",
"pull": true
}
},
{
"label": "docker-build-2",
"type": "docker-build",
"platform": "python",
"dockerBuild": {
"tag": "image2:latest",
"dockerfile": "${workspaceFolder}/app2/dev.Dockerfile",
"context": "${workspaceFolder}/",
"pull": true
}
},
{
"label": "docker-run-1",
"type": "docker-run",
"dependsOn": [
"docker-build-1"
],
"python": {
"module": "app.main"
},
"dockerRun": {
"network": "mynetwork"
}
},
{
"label": "docker-run-2",
"type": "docker-run",
"dependsOn": [
"docker-build-2"
],
"python": {
"module": "app.main"
},
"dockerRun": {
"network": "mynetwork"
}
}
]
}
When VS Code shows the menu for running tasks, only the task docker-run-2 is listed.
In fact, only the last docker-run task in the tasks.json file is shown. If I change the order of the tasks in the list, VS Code only recognizes docker-run-1. I searched the documentation and it doesn't say anything about this behaviour. Any idea why this is happening? The goal is to set up 2 debug configurations in VS Code for the 2 apps, but running the debug config for the app that is not the last one produces an error in VS Code.
I came across this same issue today. It seems that the "dockerRun" attribute has to differ between the run tasks. In my case I just added a test environment variable to one of the tasks, and then both started to appear in the task list.
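To spot which tasks are affected, a small check like the one below can help. It is only a sketch in Python based on the observation above (that identical "dockerRun" blocks seem to cause the clash), not on documented VS Code behaviour. It works on an already-parsed dictionary, since a real tasks.json may contain comments or trailing commas that Python's json module rejects; the helper name is made up for this example.

```python
import json
from collections import defaultdict


def duplicate_docker_run_tasks(tasks_config):
    """Group docker-run task labels whose "dockerRun" blocks are identical."""
    groups = defaultdict(list)
    for task in tasks_config.get("tasks", []):
        if task.get("type") != "docker-run":
            continue
        # Canonical JSON so that key order does not matter when comparing.
        key = json.dumps(task.get("dockerRun", {}), sort_keys=True)
        groups[key].append(task["label"])
    return [labels for labels in groups.values() if len(labels) > 1]


config = {
    "tasks": [
        {"label": "docker-run-1", "type": "docker-run",
         "dockerRun": {"network": "mynetwork"}},
        {"label": "docker-run-2", "type": "docker-run",
         "dockerRun": {"network": "mynetwork"}},
    ]
}
print(duplicate_docker_run_tasks(config))
```

With the two tasks from the question this reports both labels as one clashing group. Adding something like an "env" entry to one of the "dockerRun" blocks makes the list come back empty.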
Last week I had to remove a failed node from my Docker Swarm Cluster, leaving some tasks that ran on that node in desired state "Remove".
Even after deleting the stack and recreating it with the same name, docker stack ps stackname still shows them.
Interestingly enough, after recreating the stack, the tasks are still there, but with no node assigned.
Here's what I tried so far to "cleanup" the stack:
Recreating the stack with the same name
docker container prune
docker volume prune
docker system prune
Is there a way to remove a specific task?
Here's the output for docker inspect fkgz0oihexzs, the first task in the list:
[
{
"ID": "fkgz0oihexzsjqwv4ju0szorh",
"Version": {
"Index": 14422171
},
"CreatedAt": "2018-11-05T16:15:31.528933998Z",
"UpdatedAt": "2018-11-05T16:27:07.422368364Z",
"Labels": {},
"Spec": {
"ContainerSpec": {
"Image": "redacted",
"Labels": {
"com.docker.stack.namespace": "redacted"
},
"Env": [
"redacted"
],
"Privileges": {
"CredentialSpec": null,
"SELinuxContext": null
},
"Isolation": "default"
},
"Resources": {},
"Placement": {
"Platforms": [
{
"Architecture": "amd64",
"OS": "linux"
}
]
},
"Networks": [
{
"Target": "3i998stqemnevzgiqw3ndik4f",
"Aliases": [
"redacted"
]
}
],
"ForceUpdate": 0
},
"ServiceID": "g3vk9tgfibmcigmf67ik7uhj6",
"Slot": 1,
"Status": {
"Timestamp": "2018-11-05T16:15:31.528892467Z",
"State": "new",
"Message": "created",
"PortStatus": {}
},
"DesiredState": "remove"
}
]
I had the same problem. I resolved it by following these instructions:
docker run --rm -v /var/run/docker/swarm/control.sock:/var/run/swarmd.sock dperny/tasknuke <taskid>
Be sure to use the full long task id or it will not work (fkgz0oihexzsjqwv4ju0szorh in your case).
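If you only have the short ID from docker stack ps, the full one is in the "ID" field of the docker inspect output shown in the question. A minimal sketch in Python for pulling it out of that JSON follows; the helper name full_task_id is made up for this example, and the sample string is a shortened copy of the output above.

```python
import json


def full_task_id(inspect_output):
    """Return the full "ID" field from `docker inspect <task>` JSON output."""
    objects = json.loads(inspect_output)
    if len(objects) != 1:
        raise ValueError(f"expected exactly one object, got {len(objects)}")
    return objects[0]["ID"]


# Shortened copy of the inspect output from the question.
sample = '[{"ID": "fkgz0oihexzsjqwv4ju0szorh", "DesiredState": "remove"}]'
task_id = full_task_id(sample)
print(task_id)
```

In practice the input string would come from running docker inspect with the short ID, for example through subprocess.run with capture_output=True and text=True.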
I am using Kubernetes v1.10.3. I have an external NFS server which I am able to mount anywhere (on any physical machine). I want to mount this NFS share directly into a pod/container. I tried, but I get an error every time. I don't want to use privileges; kindly help me fix this.
ERROR: MountVolume.SetUp failed for volume "nfs" : mount failed: exit
status 32 Mounting command: systemd-run Mounting arguments:
--description=Kubernetes transient mount for /var/lib/kubelet/pods/d65eb963-68be-11e8-8181-00163eeb9788/volumes/kubernetes.io~nfs/nfs
--scope -- mount -t nfs 10.225.241.137:/stagingfs/alt/ /var/lib/kubelet/pods/d65eb963-68be-11e8-8181-00163eeb9788/volumes/kubernetes.io~nfs/nfs
Output: Running scope as unit run-43393.scope. mount: wrong fs type,
bad option, bad superblock on 10.225.241.137:/stagingfs/alt/, missing
codepage or helper program, or other error (for several filesystems
(e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program)
In some cases useful info is found in syslog - try dmesg | tail or so.
NFS server : mount -t nfs 10.X.X.137:/stagingfs/alt /alt
I added two things for the volume, but I get the error every time.
First:
"volumeMounts": [
{
"name": "nfs",
"mountPath": "/alt"
}
],
Second:
"volumes": [
{
"name": "nfs",
"nfs": {
"server": "10.X.X.137",
"path": "/stagingfs/alt/"
}
}
],
---------------------complete yaml --------------------------------
{
"kind": "Deployment",
"apiVersion": "extensions/v1beta1",
"metadata": {
"name": "jboss",
"namespace": "staging",
"selfLink": "/apis/extensions/v1beta1/namespaces/staging/deployments/jboss",
"uid": "6a85e235-68b4-11e8-8181-00163eeb9788",
"resourceVersion": "609891",
"generation": 2,
"creationTimestamp": "2018-06-05T11:34:32Z",
"labels": {
"k8s-app": "jboss"
},
"annotations": {
"deployment.kubernetes.io/revision": "2"
}
},
"spec": {
"replicas": 1,
"selector": {
"matchLabels": {
"k8s-app": "jboss"
}
},
"template": {
"metadata": {
"name": "jboss",
"creationTimestamp": null,
"labels": {
"k8s-app": "jboss"
}
},
"spec": {
"volumes": [
{
"name": "nfs",
"nfs": {
"server": "10.X.X.137",
"path": "/stagingfs/alt/"
}
}
],
"containers": [
{
"name": "jboss",
"image": "my.abc.com/alt:7.1_1.1",
"resources": {},
"volumeMounts": [
{
"name": "nfs",
"mountPath": "/alt"
}
],
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent",
"securityContext": {
"privileged": true
}
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"securityContext": {},
"schedulerName": "default-scheduler"
}
},
"strategy": {
"type": "RollingUpdate",
"rollingUpdate": {
"maxUnavailable": "25%",
"maxSurge": "25%"
}
},
"revisionHistoryLimit": 10,
"progressDeadlineSeconds": 600
},
"status": {
"observedGeneration": 2,
"replicas": 1,
"updatedReplicas": 1,
"readyReplicas": 1,
"availableReplicas": 1,
"conditions": [
{
"type": "Available",
"status": "True",
"lastUpdateTime": "2018-06-05T11:35:45Z",
"lastTransitionTime": "2018-06-05T11:35:45Z",
"reason": "MinimumReplicasAvailable",
"message": "Deployment has minimum availability."
},
{
"type": "Progressing",
"status": "True",
"lastUpdateTime": "2018-06-05T11:35:46Z",
"lastTransitionTime": "2018-06-05T11:34:32Z",
"reason": "NewReplicaSetAvailable",
"message": "ReplicaSet \"jboss-8674444985\" has successfully progressed."
}
]
}
}
Regards
Anupam Narayan
As stated in the error log:
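for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program
According to this question, you might be missing the nfs-common package, which you can install with sudo apt install nfs-common. It has to be installed on the node that runs the pod, because the kubelet on that node performs the mount.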
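To confirm whether the helper is really missing on a node, something like the following can be run there. It is only a sketch in Python; the function name and the two default directories are assumptions for this example, so adjust the search paths for your distribution.

```python
import os


def has_mount_helper(fs_type, search_dirs=("/sbin", "/usr/sbin")):
    """Return True if an executable mount.<fs_type> helper exists in search_dirs."""
    name = f"mount.{fs_type}"
    return any(
        os.access(os.path.join(directory, name), os.X_OK)
        for directory in search_dirs
    )


print("mount.nfs helper present:", has_mount_helper("nfs"))
```

If this prints False on the node where the pod was scheduled, the mount error above is the expected result.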
I'm still new to Mesos, but am trying to figure out the best way to debug a Mesos application I'm attempting to develop. I'm getting the error message "Abnormal executor termination: unknown container" through the web application, and am unsure how to get more descriptive error messages to figure out what's going on. The error message would seem to indicate it can't find the Docker image, but I know for a fact it's referencing the correct image that is installed and running.
{
"id": "pgprimary",
"cmd": null,
"cpus": 1,
"mem": 128,
"disk": 0,
"instances": 1,
"container": {
"docker": {
"image": "example/postgres:centos7-10.0-1.6.0",
"network": "BRIDGE",
"parameters": [{
"key": "hostname",
"value": "pgprimary"
}],
"portMappings": [
]
},
"type": "DOCKER",
"volumes": [
{
"hostPath": "/mnt/nfsfileshare/pgdata",
"containerPath": "/pgdata",
"mode": "RW"
}
]
},
"env": {
"PG_MODE": "primary",
"PG_USER": "testuser",
"PG_PASSWORD": "testuser",
"PG_DATABASE": "userdb",
"PG_ROOT_PASSWORD": "password",
"PG_PRIMARY_USER": "primaryuser",
"PG_PRIMARY_PASSWORD": "password",
"PG_PRIMARY_PORT": "5432"
},
"labels": {},
"healthChecks": [
{
"protocol": "COMMAND",
"command": {
"value": "/usr/pgsql-10/bin/pg_isready --host=pgprimary.marathon.mesos"
},
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 3,
"ignoreHttp1xx": false
}
]
}
The command I'm using to deploy the Marathon app:
marathonctl -h http://10.0.2.15:8080 app create postgres.json
It is not the image that Marathon cannot find, but the Docker containerizer.
Specify the use of the Docker containerizer:
echo 'docker,mesos' > /etc/mesos-slave/containerizers
Provisioning Containers with the Docker Containerizer
https://mesosphere.github.io/marathon/docs/native-docker.html
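A quick way to see whether an agent is configured this way is to read the file back and check that docker is in the list. The sketch below, in Python, only parses the value; the function name is made up for this example, and it assumes the file holds a single comma-separated line as written by the echo command above. The agent typically needs a restart before a change to this file takes effect.

```python
def enabled_containerizers(flag_value):
    """Split a Mesos containerizers value such as "docker,mesos" into names."""
    return [name.strip() for name in flag_value.split(",") if name.strip()]


# This is what `echo 'docker,mesos'` writes, including the trailing newline.
value = "docker,mesos\n"
names = enabled_containerizers(value)
print("docker enabled:", "docker" in names)
```

On a real agent, value would be the contents of /etc/mesos-slave/containerizers.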
Can someone give an example of how to use the gitRepo type of volume in Kubernetes?
The doc says it's a plugin, but I'm not sure what that means. I could not find an example anywhere and I don't know the proper syntax.
In particular, are there parameters to pull a specific branch, use credentials (username, password, or SSH key), etc.?
EDIT:
Going through the Kubernetes code, this is what I have figured out so far:
- name: data
  gitRepo:
    repository: "git repo url"
    revision: "hash of the commit to use"
But I can't seem to make it work, and I'm not sure how to troubleshoot this issue.
This is a sample application I used:
{
"kind": "ReplicationController",
"apiVersion": "v1",
"metadata": {
"name": "tess.io",
"labels": {
"name": "tess.io"
}
},
"spec": {
"replicas": 3,
"selector": {
"name": "tess.io"
},
"template": {
"metadata": {
"labels": {
"name": "tess.io"
}
},
"spec": {
"containers": [
{
"image": "tess/tessio:0.0.3",
"name": "tessio",
"ports": [
{
"containerPort": 80,
"protocol": "TCP"
}
],
"volumeMounts": [
{
"mountPath": "/tess",
"name": "tess"
}
]
}
],
"volumes": [
{
"name": "tess",
"gitRepo": {
"repository": "https://<TOKEN>:x-oauth-basic@github.com/tess/tess.io"
}
}
]
}
}
}
}
And you can use the revision too.
PS: The repo above does not exist anymore.
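For the revision case, the volume entry could be built as in the sketch below. It is written in Python and only produces the dictionary for the "volumes" list; the function name is made up for this example and the commit hash is a placeholder. As far as I know, gitRepo only accepts repository, revision, and directory, so there is no separate field for a branch or for credentials.

```python
def git_repo_volume(name, repository, revision=None, directory=None):
    """Build a gitRepo volume entry for the "volumes" list of a pod spec."""
    git_repo = {"repository": repository}
    if revision is not None:
        git_repo["revision"] = revision    # commit hash to check out
    if directory is not None:
        git_repo["directory"] = directory  # target directory inside the volume
    return {"name": name, "gitRepo": git_repo}


# "<commit-hash>" is a placeholder, not a real revision.
volume = git_repo_volume(
    "tess", "https://github.com/tess/tess.io", revision="<commit-hash>"
)
print(volume)
```

The volume name has to match the "name" used under "volumeMounts", as in the sample application above.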
UPDATE:
gitRepo is now deprecated
https://github.com/kubernetes/kubernetes/issues/60999
ORIGINAL ANSWER:
Going through the code, this is what I figured out:
- name: data
  gitRepo:
    repository: "git repo url"
    revision: "hash of the commit to use"
After fixing typos in my mountPath, it works fine.