qsub disregarded memory limit

My command is:
qsub -t 1:30:1 -q test.q -l r_core=5 -l r_mem=30 run.sh
It launches 30 tasks, one per server, but they tend to consume more than the specified 30 GB of RAM.
What could be the reasons for this?

The only real-time resource enforcement you get is A) checking of min/max requests at submission, and B) walltime, and even walltime may not be enforced reliably, depending on the node. For solid resource enforcement, impose default resource restrictions, then upgrade to a scheduler version that supports cgroups and enable cgroup-based enforcement.
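If the scheduler here is Grid Engine (or a derivative), the following is a minimal sketch of what an enforced limit can look like; it assumes your cluster defines the h_vmem complex as a consumable, and it reuses the command from the question with r_mem swapped for h_vmem:

# Sketch, assuming a Grid Engine-style scheduler where the h_vmem complex exists
# and is marked consumable; the execution daemon then applies it as a hard,
# kernel-enforced limit (via setrlimit) and kills jobs that exceed it.
qsub -t 1:30:1 -q test.q -l r_core=5 -l h_vmem=30G run.sh

# As an operator, edit the complex configuration so h_vmem is requestable,
# consumable, and has a sensible default, giving every job a limit by default:
qconf -mc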

Related

Setting CPU and Memory limits globally for all docker containers

There are many examples that talk about setting memory, CPU, etc. with docker run. Is it possible to set these globally, so that every container is created with those values?
There may be other ways, perhaps using AppArmor; I'll check. But the first thing that comes to mind is this project from a friend:
Docker enforcer
https://github.com/piontec/docker-enforcer
This project is a Docker plugin that kills containers if they don't meet certain predefined policies, such as having strict memory and CPU limits.
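The same idea can be approximated with a small watchdog script; this is only a sketch of the policy-enforcement concept, not how docker-enforcer itself works, and the "a memory limit must be set" rule is just an example policy:

#!/bin/sh
# Sketch of the "kill containers that violate a policy" idea (example policy:
# every container must have a memory limit). Not the docker-enforcer plugin itself.
for id in $(docker ps -q); do
  mem=$(docker inspect --format '{{.HostConfig.Memory}}' "$id")  # 0 means no limit
  if [ "$mem" -eq 0 ]; then
    echo "stopping $id: started without a memory limit"
    docker stop "$id"
  fi
done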

Enabling cgroup cpu real-time runtime in ubuntu kernel

I am trying to use real-time scheduling in a docker container running on Ubuntu 18.04.
I have already installed a realtime kernel following the method given here. I have selected kernel version 5.2.9 and its associated rt patch.
The output of uname -a confirms that the realtime kernel is well installed and running:
Linux myLaptop 5.2.9-rt3 #1 SMP PREEMPT RT ...
To run my container I issue the following command:
docker run --cpu-rt-runtime=95000 \
--ulimit rtprio=99 \
--ulimit memlock=102400 \
--cap-add=sys_nice \
--privileged \
-it \
myimage:latest
However, the output I got is:
docker: Error response from daemon: Your kernel does not support cgroup cpu real-time runtime.
I have seen that this can be linked to the missing CONFIG_RT_GROUP_SCHED as detailed in the issue here. Indeed if I run the script provided at this page to check the kernel compatibility with Docker I get:
- CONFIG_RT_GROUP_SCHED: missing
This seems to confirm that Docker relies on this option for real-time scheduling, but the option is not provided by the kernel, even though the kernel is patched to be real-time.
From there, I tried in vain to find a solution. I am not well versed enough in kernel configuration to know whether I need to compile the kernel with a specific option, and which one to choose, to add the missing CONFIG_RT_GROUP_SCHED.
Thanks a lot in advance for recommendations and help.
When talking about real-time Linux there are different approaches, ranging from single-kernel approaches (like PREEMPT_RT) to dual-kernel approaches (such as Xenomai). You can use real-time capable Docker containers in combination with all of them (clearly the kernel of your host machine has to match) to produce real-time capable systems, but the approaches differ. In your case you are mixing up two different approaches: you installed PREEMPT_RT while following a guide for control groups, which are incompatible with PREEMPT_RT.
The Linux kernel can be compiled with different levels of preemptibility (see e.g. Reghenzani et al., "The real-time Linux kernel: a Survey on PREEMPT_RT"):
PREEMPT_NONE has no way of forced preemption
PREEMPT_VOLUNTARY where preemption is possible in some locations in order to reduce latency
PREEMPT where preemption can occur in any part of the kernel (excluding spinlocks and other critical sections)
These can be combined with the feature of control groups (cgroups for short) by setting CONFIG_RT_GROUP_SCHED=y during kernel compilation, which reserves a certain fraction of CPU-time for processes of a certain (user-defined) group.
PREEMPT_RT developed from PREEMPT and is a set of patches that aims at making the kernel fully preemptible, even in critical sections (PREEMPT_RT_FULL). For this purpose e.g. spinlocks are largely replaced by mutexes.
As of 2021 it is slowly being merged into the mainline and will be available to the general public without the need to patch the kernel. As stated here, PREEMPT_RT currently can't be compiled with CONFIG_RT_GROUP_SCHED and therefore can't be used with control groups (see here for a comparison). From what I have read this is due to high latency spikes, something that I have already observed with control groups by means of cyclictest.
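To see which of the two your running kernel currently provides, you can inspect its build configuration; the sketch below assumes Ubuntu's convention of shipping the config under /boot (other distributions may expose /proc/config.gz instead):

# Check whether real-time group scheduling (needed for Docker's --cpu-rt-* flags) is enabled:
grep CONFIG_RT_GROUP_SCHED /boot/config-$(uname -r)
# Check whether the kernel was built with the PREEMPT_RT patch set:
grep -i preempt_rt /boot/config-$(uname -r)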
This means you can compile your kernel (see the Ubuntu manual for details) either:
Without PREEMPT_RT but with CONFIG_RT_GROUP_SCHED (see this post for details) and follow the Docker guide on real-time with control groups as well as my post here. In my experience, though, this produces quite high latency spikes, which is undesirable for a real-time system, where the worst-case latency matters much more than the average latency.
With PREEMPT_RT and without CONFIG_RT_GROUP_SCHED (which can also be installed from a Debian package such as this one). In this case it is sufficient to run the container with the options --privileged --net=host, or the Docker Compose equivalent privileged: true and network_mode: host. Then any process inside the container can set real-time priorities rtprio (e.g. by calling ::pthread_setschedparam from inside the code or by using chrt from the command line; see the sketch at the end of this answer).
If you are not running as root inside the container, you will furthermore have to make sure your user belongs to a group with real-time privileges on your host computer (check with $ ulimit -r). This can be done by configuring the PAM limits (/etc/security/limits.conf) accordingly (as described here), copying the section for the @realtime group and creating a new group (e.g. @some_group), or adding the user (e.g. some_user) directly:
@some_group soft rtprio 99
@some_group soft priority 99
@some_group hard rtprio 99
@some_group hard priority 99
In this context rtprio is the maximum real-time priority allowed for non-privileged processes. The hard limit is the real limit up to which the soft limit may be raised; hard limits are set by the super-user and enforced by the kernel, and a user cannot run code at a higher priority than the hard limit. The soft limit, on the other hand, is the default value, bounded by the hard limit. For more information see e.g. here.
I use the latter option for real-time capable robotic applications and could not observe any difference in latency with and without Docker. You can find a guide on how to set up PREEMPT_RT, and automated scripts for building it, on my GitHub.
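A minimal sketch of that second option, reusing the image name from the question (my_rt_app is a placeholder binary, and the priority value is just an example):

# On a PREEMPT_RT host no cgroup real-time flags are needed;
# just grant the container the required privileges:
docker run --privileged --net=host -it myimage:latest

# Inside the container, launch a process under SCHED_FIFO with priority 80:
chrt -f 80 ./my_rt_app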

How to deploy a large docker image to Pivotal Cloud Foundry

I'm trying to push this docker image to my PCF environment:
https://hub.docker.com/r/jupyter/tensorflow-notebook/tags/
The image is 3.9GB when extracted.
When I do:
cf push jupyter-minimal-notebook --docker-image jupyter/minimal-notebook -m 8GB -k 4G
I get the error:
The app is invalid: disk_quota too much disk requested (requested 4096 MB - must be less than 2048 MB)
The default disk space an app gets is 1024 MB. This is set in cloud_controller_ng via the default_app_disk_in_mb parameter.
The maximum amount of disk a user can request is 2048 MB by default. This is set in cloud_controller_ng via the maximum_app_disk_in_mb parameter.
I believe the solution is to increase the value of maximum_app_disk_in_mb; however, after much searching, I cannot figure out how to set it.
I have tried the following in the manifest.yml:
---
applications:
- name: jupyter-tensorflow-notebook
  docker:
    image: jupyter/tensorflow-notebook
  cloud_controller_ng:
    maximum_app_disk_in_mb: 4096
  disk_quota: 4G
  memory: 8G
  instances: 1
This does not work and returns the same error:
The app is invalid: disk_quota too much disk requested (requested 4096 MB - must be less than 2048 MB)
UPDATE September 3rd 2019:
I didn't give enough background. I've set up a Small Footprint PCF environment on AWS using the AWS Quickstart, so I have full control over the deployment and can tweak whatever parameters I'd like; in effect, I'm the platform operator. So the question is: given that I have the rights to change maximum_app_disk_in_mb, how would I go about doing that? I'd like to change the maximum_app_disk_in_mb parameter but can't see how to do that without redeploying the entire environment.
To get the manifest that was used in the Quickstart, I figured I needed to do the following:
bosh -e aws -d [my-deployment] manifest
This, from what I understand, is the complete manifest. There are a lot of variable parameters in it, such as ((cloud-controller-internal-api-user-credentials.password)).
Is there a way to update maximum_app_disk_in_mb without redeploying the complete manifest?
If I have to redeploy the complete manifest, is the best way to do this:
bosh -e aws -d [my-deployment] manifest > manifest.yml
Then editing the maximum_app_disk_in_mb parameter value in manifest.yml and redeploying? If I do this, will it pick up all the values for parameters that use variables in the manifest, such as ((cloud-controller-internal-api-user-credentials.password))?
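Or would an ops file be the cleaner route? Something like the sketch below is what I had in mind, though I'm guessing at the property path (it depends on how the Quickstart manifest names its instance groups and jobs):

# Sketch only: override the Cloud Controller property with an ops file instead
# of hand-editing the stored manifest. The path below is a guess based on typical
# cf deployments and may need adjusting for the Quickstart manifest.
cat > max-app-disk.yml <<'EOF'
- type: replace
  path: /instance_groups/name=api/jobs/name=cloud_controller_ng/properties/cc/maximum_app_disk_in_mb?
  value: 4096
EOF

bosh -e aws -d [my-deployment] deploy manifest.yml -o max-app-disk.yml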
When I do:
bosh -e aws deployments
There seem to be two deployments, aws-service-broker-xxxxxxxxxxxxxxx and cf-xxxxxxxxxxxxxxxx (I've replaced the id with x's for anonymity). The former doesn't have any running instances so I guess I don't need to make any changes to that one?
Let me just start by saying that you, as a developer/end user of Cloud Foundry, cannot change settings in Cloud Controller, which is a system-level component. Only your operator can do that.
If you could change settings in Cloud Controller, that would void all the limits set by Cloud Controller and your platform operators, as you could just set them as high as you want.
Cloud Controller will set a default disk size, which takes effect if you do not set one, and it will set a maximum disk size, which limits you from consuming too much disk on the foundation and impacting other users.
I don't believe there is a way to fetch the max disk space allowed by Cloud Controller & your platform operator other than just specifying a large disk quota (try cf push -k 99999G) and looking at the response.
As you can see from your response, Cloud Controller will tell you in the error message the maximum allowed value. In your case, the most it will allow is 2G.
The app is invalid: disk_quota too much disk requested (requested 4096 MB - must be less than 2048 MB)
There's nothing you can do to get a disk quota above 2G, other than write to your platform operator and ask them to increase this value. If this is your company's platform, then that is probably the best option.
If you are using a public provider, then you're probably better off looking at other providers. There are several public, certified providers you can use. See the list here -> https://www.cloudfoundry.org/thefoundry/
Hope that helps!

Docker parallel operations limit

Is there a limit to the number of parallel Docker push/pulls you can do?
E.g. if you thread docker pull / push commands such that they are pulling/pushing different images at the same time, what would be the upper limit on the number of parallel pushes/pulls?
Or alternatively:
On one terminal you run docker pull ubuntu, on another docker pull httpd, etc. What would be the limit Docker supports?
The options are set in the daemon configuration file (on a Linux-based OS it is located at /etc/docker/daemon.json; on Windows it is C:\ProgramData\docker\config\daemon.json).
Open /etc/docker/daemon.json (if it doesn't exist, create it).
Add the values for pushes/pulls to set the parallel operations limit:
{
"max-concurrent-uploads": 1,
"max-concurrent-downloads": 1
}
Restart daemon: sudo service docker restart
The docker daemon (dockerd) has two flags:
--max-concurrent-downloads int   Set the max concurrent downloads for each pull (default 3)
--max-concurrent-uploads int     Set the max concurrent uploads for each push (default 5)
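The same limits can also be passed directly when starting the daemon by hand; the values below are arbitrary examples:

# Example values only; equivalent to the daemon.json keys shown above.
sudo dockerd --max-concurrent-downloads 6 --max-concurrent-uploads 6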
The upper limit will likely depend on the number of open files you permit for the process (ulimit -n). There will be some overhead of other docker file handles, and I expect that each push and pull opens multiple handles, one for the remote connection, and another for the local file storage.
To compound the complication of this, each push and pull of an image will open multiple connections, one per layer, up to the concurrent limit. So if you run a dozen concurrent pulls, you may have 50-100 potential layers to pull.
While docker does allow these limits to be increased, there's a practical limit where you'll see diminishing returns if not a negative return to opening more concurrent connections. Assuming the bandwidth to the remote registry is limited, more connections will simply split that bandwidth, and docker itself will wait until the very first layer finishes before it starts unpacking that transmission. Also any aborted docker pull or push will lose any partial transmissions of a layer, so you increase the potential data you'd need to retransmit with more concurrent connections.
The default limits are well suited for a development environment, and if you find the need to adjust them, I'd recommend measuring the performance improvement before trying to find the max number of concurrent sessions.
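If you do decide to tune them, a rough way to measure the effect is to time a batch of concurrent pulls before and after the change; the image list below is arbitrary:

# Rough benchmark sketch: time several concurrent pulls, remove the images,
# adjust daemon.json, restart the daemon, and repeat to compare.
time ( docker pull ubuntu & docker pull httpd & docker pull redis & wait )
docker rmi ubuntu httpd redis   # so the next run pulls from scratch again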
For anyone using Docker for Windows and WSL2:
You can (and should) set the options on the Settings tab:
[Screenshot: Docker Desktop for Windows, Docker Engine settings]

Limiting a Docker Container to a single cpu core

I'm trying to build a system which runs pieces of code in consistent conditions, and one way I imagine this being possible is to run the various programs in docker containers with the same layout, reserving the same amount of memory, etc. However, I can't seem to figure out how to keep CPU usage consistent.
The closest thing I can seem to find are "cpu shares," which, if I understand the documentation, limit cpu usage with respect to what other containers/other processes are running on the system, and what's available on the system. They do not seem to be capable of limiting the container to an absolute amount of cpu usage.
Ideally, I'd like to set up docker containers that would be limited to using a single cpu core. Is this at all possible?
If you use a newer version of Docker, you can use --cpuset-cpus="" in docker run to specify the CPU cores you want to allocate:
docker run --cpuset-cpus="0" [...]
If you use an older version of Docker (< 0.9), which uses LXC as the default execution environment, you can use --lxc-conf to configure the allocated CPU cores:
docker run --lxc-conf="lxc.cgroup.cpuset.cpus = 0" [...]
In both of those cases, only the first CPU core will be available to the docker container. Both of these options are documented in the docker help.
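A quick sanity check of the pinning (the image here is just an example) is to ask the container how many CPUs it can actually use:

# The container should report a single usable CPU when pinned to core 0:
docker run --rm --cpuset-cpus="0" ubuntu nproc
# expected output: 1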
I've tried to provide a tutorial on container resource allocation:
https://gist.github.com/afolarin/15d12a476e40c173bf5f
