Mesos/Marathon memory usage limits for Docker

We created a WordPress container using Mesos/Marathon and allocated 0.1 CPU and 64 MB RAM.
When we check docker stats, we observe that the memory allocation differs from what we allocated in Marathon.
Is there any way to update the memory usage limit for a Docker container? Can we set default limits for all containers at the daemon level (either via Mesos or at the Docker daemon level)?
We tried to load test the WordPress site with JMeter, and the container got killed at just 500 connections.
Thanks in advance.

Docker doesn't yet have a default memory option for the Docker daemon. Memory limits for containers can only be set at run time (not after the container has started), with the following options:
-m, --memory="" Memory limit
--memory-swap="" Total memory (memory + swap), '-1' to disable swap
As per this
I also see that there's still an issue open here. Make sure you are using Mesos 0.22.1 or later.
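On a plain Docker host that looks like the following (a minimal sketch; the image name and limit values are placeholders, not from the original question):
# Start a container capped at 64 MB of RAM and 128 MB of RAM+swap
docker run -d -m 64m --memory-swap 128m --name wordpress-test wordpress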
How about creating your containers with something like this Marathon request?
curl -X POST -H "Content-Type: application/json" http://<marathon-server>:8080/v2/apps -d@helloworld.json
helloworld.json:
{
  "id": "helloworld",
  "container": {
    "docker": {
      "image": "ubuntu:14.04"
    },
    "type": "DOCKER",
    "volumes": []
  },
  "cmd": "while true; do echo hello world; sleep 1; done",
  "cpus": 0.1,
  "mem": 96.0,
  "instances": 1
}
Update the "mem" value here to the limit you want Marathon (and Docker) to apply.
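If the app already exists, Marathon can also change the limit in place with a PUT to the app's endpoint, which triggers a redeploy of its tasks (a sketch; the app id "helloworld" and the 128 MB value are just examples):
curl -X PUT -H "Content-Type: application/json" http://<marathon-server>:8080/v2/apps/helloworld -d '{"mem": 128.0}'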

Related

Docker local volume increases in size every day

I have five Docker containers running in a VM (mostly with no volumes attached).
After a while, with no new containers deployed, I've noticed Docker consuming more disk space every day.
I've tried to remove logs and/or unused images with:
sudo sh -c "truncate -s 0 /var/lib/docker/containers/*/*-json.log"
docker system prune --volumes
This reclaimed very little disk space.
Then I found one local volume that uses 30-ish GB (it was 29 GB yesterday, a growth rate of roughly 1 GB per day):
docker volume inspect <volume id>
[
    {
        "CreatedAt": "2022-06-28T12:00:15+07:00",   << created last hour
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/4301ac15fed0bec1cd93aa181ab18c5227577c2532fff0a5f4e23956da1cfe4f/_data",
        "Name": "4301ac15fed0bec1cd93aa181ab18c5227577c2532fff0a5f4e23956da1cfe4f",
        "Options": null,
        "Scope": "local"
    }
]
And I don't even know which service/container uses or created this volume.
How do I know whether it is safe to remove this volume, and how can I limit its disk space consumption?
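One way to check which container references the volume (a sketch, using the volume name from the inspect output above):
# List all containers, including stopped ones, that mount this volume
docker ps -a --filter volume=4301ac15fed0bec1cd93aa181ab18c5227577c2532fff0a5f4e23956da1cfe4f
# If nothing is listed, the volume is dangling and can usually be removed
docker volume rm 4301ac15fed0bec1cd93aa181ab18c5227577c2532fff0a5f4e23956da1cfe4f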

Interact with podman docker via socket in Redhat 9

I'm trying to migrate one of my dev boxes over from CentOS 8 to RHEL 9. I rely heavily on Docker, and when I tried to run a docker command on the RHEL box it installed podman-docker. This seemed to go smoothly; I was able to pull an image, launch, build, and commit a new version without problems using the docker commands I already knew.
The problem I have encountered, though, is that I can't seem to interact with it via the docker socket (which appears to be a link to the podman one).
If I run the docker command:
[#rhel9 ~]$ docker images
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/redhat/ubi9 dev_image de371523ca26 6 hours ago 805 MB
docker.io/redhat/ubi9 latest 9ad46cd10362 6 days ago 230 MB
It lists my images as expected. I should also be able to run:
[#rhel9 ~]$ curl --unix-socket /var/run/docker.sock -H 'Content-Type: application/json' http://localhost/images/json | jq .
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 3 100 3 0 0 55 0 --:--:-- --:--:-- --:--:-- 55
[]
but as you can see, nothing comes back. The socket is up and running, as I can ping it without issue:
[#rhel9 ~]$ curl -H "Content-Type: application/json" --unix-socket /var/run/docker.sock http://localhost/_ping
OK
I also tried the curl commands using the podman socket directly but it had the same results. Is there something I am missing or a trick to getting it to work so that I can interact with docker/podman via the socket?
Podman isn't implemented using a client/server model like Docker. By default there is no socket, because there's no equivalent to the docker daemon. Podman does provide a compatibility interface that you can use by enabling the podman.socket unit:
$ systemctl enable --now podman.socket
This exposes a Unix socket at /run/podman/podman.sock that responds to Docker API commands. But!
The socket connects you to podman running as root, whereas you've been running podman as a non-root user: so you won't see the same list of images, containers, networks, etc.
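To see the difference, compare the two image stores (an illustration, not output from the original question's machine):
podman images        # your per-user (rootless) store
sudo podman images   # the root store, which is what the root socket serves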
Some random notes:
Podman by default runs "rootless": you can run it as an unprivileged user, and all of its storage, metadata, etc, is stored in your home directory.
You can also run Podman as root, in which case the behavior is more like Docker.
If you enable the podman socket, you can replace podman-docker with the actual Docker client (and use things like docker-compose), although I have run into occasional issues with this (mostly I just use podman, and run Docker Engine in a VM). You will need to configure Docker to look at the podman socket in /run/podman/podman.sock.
I have podman.socket enabled on my system, so this works:
$ curl --unix-socket /run/podman/podman.sock -H 'content-type: application/json' http://localhost/_ping
OK
Or:
$ curl --unix-socket /run/podman/podman.sock -H 'content-type: application/json' -sf http://localhost/containers/json | jq
[
    {
        "Id": "f0d9a880c45bb5857b24f46bcb6eeeca162eb68d574c8ba16c4a03703c2d60f4",
        "Names": [
            "/sleeper"
        ],
        "Image": "docker.io/library/alpine:latest",
        "ImageID": "14119a10abf4669e8cdbdff324a9f9605d99697215a0d21c360fe8dfa8471bab",
        "Command": "sleep inf",
        "Created": 1655418914,
        "Ports": [],
        "Labels": {},
        "State": "running",
        "Status": "Up 3 days",
        "NetworkSettings": {
            "Networks": {
                "podman": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "podman",
                    "EndpointID": "",
                    "Gateway": "10.88.0.1",
                    "IPAddress": "10.88.0.2",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "06:55:82:1b:1a:41",
                    "DriverOpts": null
                }
            }
        },
        "Mounts": null,
        "Name": "",
        "Config": null,
        "NetworkingConfig": null,
        "Platform": null,
        "AdjustCPUShares": false
    }
]
I managed to solve my problem although I'm not entirely sure how the scenario came about. I was looking through the output of docker info and podman info and noticed that they both had the remote socket set as:
remoteSocket:
  exists: true
  path: /run/user/1000/podman/podman.sock
rather than /run/podman/podman.sock, which is where I thought it was (that socket does actually exist on my machine). Looking at the systemd unit for podman.socket, I can see that the socket is specified as %t/podman/podman.sock, and checking the man page for podman-system-service, it specifies the rootless socket as unix://$XDG_RUNTIME_DIR/podman/podman.sock (where my $XDG_RUNTIME_DIR is /run/user/1000).
To get it all working with my software, I just needed to make sure the DOCKER_HOST environment variable was set correctly, e.g. export DOCKER_HOST=unix:///run/user/1000/podman/podman.sock
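Putting the rootless pieces together looks roughly like this (a sketch; the /run/user/1000 path assumes $XDG_RUNTIME_DIR=/run/user/1000 as above):
# Enable the per-user (rootless) Podman API socket
systemctl --user enable --now podman.socket
# Point Docker-compatible clients at it
export DOCKER_HOST=unix:///run/user/1000/podman/podman.sock
# Verify the socket answers
curl --unix-socket /run/user/1000/podman/podman.sock http://localhost/_ping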

Increasing the disk size that docker can access in Container Optimized OS

I am attempting to run a simple daily batch script that can run for some hours, after which it will send the data it generated and shut down the instance. To achieve that, I have put the following into user-data:
users:
  - name: cloudservice
    uid: 2000

runcmd:
  - sudo HOME=/home/root docker-credential-gcr configure-docker
  - |
    sudo HOME=/home/root docker run \
      --rm -u 2000 --name={service_name} {image_name} {command}
  - shutdown

final_message: "machine took $UPTIME seconds to start"
I am creating the instance using a python script to generate the configuration for the API like so:
def build_machine_configuration(
    compute, name: str, project: str, zone: str, image: str
) -> Dict:
    image_response = (
        compute.images()
        .getFromFamily(project="cos-cloud", family="cos-stable")
        .execute()
    )
    source_disk_image = image_response["selfLink"]

    machine_type = f"zones/{zone}/machineTypes/n1-standard-1"

    # returns the cloud init from above
    cloud_config = build_cloud_config(image)

    config = {
        "name": f"{name}",
        "machineType": machine_type,
        # Specify the boot disk and the image to use as a source.
        "disks": [
            {
                "type": "PERSISTENT",
                "boot": True,
                "autoDelete": True,
                "initializeParams": {"sourceImage": source_disk_image},
            }
        ],
        # Specify a network interface with NAT to access the public
        # internet.
        "networkInterfaces": [
            {
                "network": "global/networks/default",
                "accessConfigs": [{"type": "ONE_TO_ONE_NAT", "name": "External NAT"}],
            }
        ],
        # Allow the instance to access cloud storage and logging.
        "serviceAccounts": [
            {
                "email": "default",
                "scopes": [
                    "https://www.googleapis.com/auth/devstorage.read_write",
                    "https://www.googleapis.com/auth/logging.write",
                    "https://www.googleapis.com/auth/datastore",
                    "https://www.googleapis.com/auth/bigquery",
                ],
            }
        ],
        # Metadata is readable from the instance and allows you to
        # pass configuration from deployment scripts to instances.
        "metadata": {
            "items": [
                {
                    # Startup script is automatically executed by the
                    # instance upon startup.
                    "key": "user-data",
                    "value": cloud_config,
                },
                {"key": "google-monitoring-enabled", "value": True},
            ]
        },
    }
    return config
I am however running out of disk space inside the docker engine.
Any ideas on how to increase the size of the volume available to docker services?
The Docker engine uses the disk of the instance, so if the container runs out of space it is because the instance's disk is full.
The first thing you can try is creating an instance with a bigger disk. The documentation says:
disks[ ].initializeParams.diskSizeGb string (int64 format)
Specifies the size of the disk in base-2 GB. The size must be at least
10 GB. If you specify a sourceImage, which is required for boot disks,
the default size is the size of the sourceImage. If you do not specify
a sourceImage, the default disk size is 500 GB.
You could increase the size by adding the diskSizeGb field to the deployment:
"disks": [
{
[...]
"initializeParams": {
"diskSizeGb": 50,
[...]
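Applied to the Python configuration from the question, the disks entry would become something like this (a sketch; 50 GB is just an example value):
"disks": [
    {
        "type": "PERSISTENT",
        "boot": True,
        "autoDelete": True,
        "initializeParams": {
            "sourceImage": source_disk_image,
            "diskSizeGb": 50,  # example size; must be at least the source image size
        },
    }
],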
Another thing you could try is to execute the following command on the instance to see whether the disk is full and which partition is full:
$ df -h
In the same way you could execute the following command to see the disk usage of the Docker Engine:
$ docker system df
The client and daemon API must both be at least 1.25 to use this command. Use the docker version command on the client to check your client and daemon API versions.
If you want more information, you can use the -v flag:
$ docker system df -v

Running Chronos docker image in BRIDGE mode

I've been putting together a POC mesos/marathon system that I am using to launch and control docker images.
I have a Vagrant virtual machine running in VirtualBox on which I run docker, marathon, zookeeper, mesos-master and mesos-slave processes, with everything working as expected.
I decided to add Chronos into the mix and initially I started with it running as a service on the vagrant VM, but then opted to switch to running it in a docker container using the mesosphere/chronos image.
I have found that I can get the container image to start and run successfully when I specify HOST network mode for the container, but when I change to BRIDGE mode I run into problems.
In BRIDGE mode, the Chronos framework registers successfully with Mesos (I can see the entry on the frameworks page of the Mesos UI), but it looks as though the framework itself doesn't know that the registration was successful. The Mesos master log is full of messages like:
I1009 09:47:35.876454 3131 master.cpp:2094] Received SUBSCRIBE call for framework 'chronos-2.4.0' at scheduler-16d21dac-b6d6-49f9-90a3-bf1ba76b4b0d@172.17.0.59:37318
I1009 09:47:35.876832 3131 master.cpp:2164] Subscribing framework chronos-2.4.0 with checkpointing enabled and capabilities [ ]
I1009 09:47:35.876924 3131 master.cpp:2174] Framework 20151009-094632-16842879-5050-3113-0001 (chronos-2.4.0) at scheduler-16d21dac-b6d6-49f9-90a3-bf1ba76b4b0d@172.17.0.59:37318 already subscribed, resending acknowledgement
This implies some sort of configuration/communication issue but I have not been able to work out exactly what the root of the problem is. I'm not sure if there is any way to confirm if the acknowledgement from mesos is making it back to chronos or to check the status of the communication channels between the components.
I've done a lot of searching, and I can find posts by folk who have encountered the same issue, but I haven't found a detailed explanation of what needs to be done to correct it.
For example, I found the following post which mentions a problem that was resolved and which implies the user successfully ran their Chronos container in bridge mode, but their description of the resolution was vague. There was also this post, but the change suggested did not resolve the issue that I am seeing.
Finally, there was a post by someone at ILM who had what sounds like exactly my problem, and the resolution appeared to involve a fix to Mesos introducing two new environment variables, LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT (on top of LIBPROCESS_IP and LIBPROCESS_PORT), but I can't find a decent explanation of what values should be assigned to any of these variables, so I have yet to work out whether the change will resolve the issue I am having.
It's probably worth mentioning that I've also posted a couple of questions on the chronos-scheduler group, but I haven't had any responses to these.
If it's of any help, the versions of software I'm running are as follows (the volume mount allows me to provide values of other parameters [e.g. master, zk_hosts] as files, without having to keep changing the JSON):
Vagrant: 1.7.4
VirtualBox: 5.0.2
Docker: 1.8.1
Marathon: 0.10.1
Mesos: 0.24.1
Zookeeper: 3.4.5
The JSON that I am using to launch the chronos container is as follows:
{
  "id": "chronos",
  "cpus": 1,
  "mem": 1024,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "mesosphere/chronos",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 4400,
          "hostPort": 0,
          "servicePort": 4400,
          "protocol": "tcp"
        }
      ]
    },
    "volumes": [
      {
        "containerPath": "/etc/chronos/conf",
        "hostPath": "/vagrant/vagrantShared/chronos",
        "mode": "RO"
      }
    ]
  },
  "cmd": "/usr/bin/chronos --http_port 4400",
  "ports": [
    4400
  ]
}
If anyone has any experience of using chronos in a configuration like this then I'd appreciate any help that you might be able to provide in resolving this issue.
Regards,
Paul Mateer
I managed to work out the answer to my problem (with a little help from the sample framework here), so I thought I should post a solution to help anyone else who runs into the same issue.
Both the Chronos service and the sample framework were configured to communicate with ZooKeeper on the IP associated with the docker0 interface on the host (Vagrant) VM (in this case 172.17.42.1).
ZooKeeper would report the master as being available on 127.0.1.1, which was the IP address of the host VM that the mesos-master process started on, but although this IP address could be pinged from the container, any attempt to connect to specific ports was refused.
The solution was to start the mesos-master with the --advertise_ip parameter and specify the IP of the docker0 interface. This meant that although the service started on the host machine, it would appear as though it had been started on the docker0 interface.
Once this was done, communication between Mesos and the Chronos framework started completing, and the tasks scheduled in Chronos ran successfully.
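For illustration, the master invocation would look roughly like this (a sketch; the ZooKeeper address, work directory, and docker0 IP are specific to this Vagrant setup):
mesos-master --zk=zk://127.0.0.1:2181/mesos \
             --work_dir=/var/lib/mesos \
             --advertise_ip=172.17.42.1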
Running Mesos 1.1.0 and Chronos 3.0.1, I was able to successfully configure Chronos in BRIDGE mode by explicitly setting LIBPROCESS_ADVERTISE_IP and LIBPROCESS_ADVERTISE_PORT, and by pinning its second port to a hostPort, which isn't ideal but was the only way I could find to make it advertise its port to Mesos properly:
{
  "id": "/core/chronos",
  "cmd": "LIBPROCESS_ADVERTISE_IP=$(getent hosts $HOST | awk '{ print $1 }') LIBPROCESS_ADVERTISE_PORT=$PORT1 /chronos/bin/start.sh --hostname $HOST --zk_hosts master-1:2181,master-2:2181,master-3:2181 --master zk://master-1:2181,master-2:2181,master-3:2181/mesos --http_credentials ${CHRONOS_USER}:${CHRONOS_PASS}",
  "cpus": 0.1,
  "mem": 1024,
  "disk": 100,
  "instances": 1,
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "mesosphere/chronos:v3.0.1",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 9900,
          "hostPort": 0,
          "servicePort": 0,
          "protocol": "tcp",
          "labels": {}
        },
        {
          "containerPort": 9901,
          "hostPort": 9901,
          "servicePort": 0,
          "protocol": "tcp",
          "labels": {}
        }
      ],
      "privileged": true,
      "parameters": [],
      "forcePullImage": true
    }
  },
  "env": {
    "CHRONOS_USER": "admin",
    "CHRONOS_PASS": "XXX",
    "PORT1": "9901",
    "PORT0": "9900"
  }
}

Docker container on Marathon doesn't finish

I have a Mesos cluster consisting of 3 CentOS 6.5 machines.
ZooKeeper and Mesos-Master are running on one of the machines, and Mesos-Slave is running on each machine.
Also, Marathon is running on the master node.
Then I try to run Docker containers on Marathon, following these instructions from Mesosphere.
job.json is as follows:
{
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "libmesos/ubuntu"
    }
  },
  "id": "ubuntu",
  "instances": 1,
  "cpus": 0.5,
  "mem": 512,
  "uris": [],
  "cmd": "date -u +%T"
}
Then I run the following command:
curl -X POST -H "Accept: application/json" -H "Content-Type: application/json" <master-hostname>:8080/v2/apps -d#job.json
Then on the Marathon Web UI, I can see the Docker container stays in "Deploying" status even after a long time.
And on the Mesos-Master Web UI, I can see the task stays in "STAGING" status after a long time.
On the Sandbox pane, I can see the stdout, and the command seems to have completed successfully. No problem.
stderr is like this,
I0416 19:19:49.254998 29178 exec.cpp:132] Version: 0.22.0
I0416 19:19:49.257824 29193 exec.cpp:206] Executor registered on slave 20150416-160950-109643786-5050-30728-S0
stdout is like this,
Registered executor on master-hostname
10:19:49
But I expect the container (task) to finish after the command has completed.
Is it possible?
If possible, how to do that way?
Thank you.
The task will finish (you should be able to see it in the Mesos completed tasks), but the container will be restarted by Marathon. Marathon is for long-running apps.
If you don't want your application to be running continuously, you should take a look at another framework like Chronos.
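For reference, a one-shot Chronos job for the same command might look roughly like this, posted to Chronos's ISO 8601 endpoint (a sketch; field names and the endpoint can vary between Chronos versions):
{
  "name": "ubuntu-date",
  "command": "date -u +%T",
  "schedule": "R1//P1D",
  "epsilon": "PT30M",
  "owner": "someone@example.com",
  "container": {
    "type": "DOCKER",
    "image": "libmesos/ubuntu"
  }
}
curl -X POST -H "Content-Type: application/json" <chronos-host>:4400/scheduler/iso8601 -d@job.json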
Marathon is for long-running processes. Even if you remove the containers, Marathon will try to restart them. One more thing that I observed is that Marathon keeps trying to launch containers until you are left with no memory and CPU. When you are out of resources, your task will go into the STAGING state.
