How to raise the ulimit hard limit for real-time priority programmatically with setuid or the CAP_SYS_RESOURCE capability?

I would like to run a program under the Linux SCHED_FIFO real-time class. I would prefer to keep the user's hard limit for RTPRIO set to 0 and to programmatically raise the hard limit just for the single process. It is broadly claimed that granting the process CAP_SYS_RESOURCE allows it to raise the hard limit; e.g., from man 2 setrlimit:
The soft limit is the value that the kernel enforces for the corresponding resource. The hard limit acts as a ceiling for the soft limit: an unprivileged process may only set its soft limit to a value in the range from 0 up to the hard limit, and (irreversibly) lower its hard limit. A privileged process (under Linux: one with the CAP_SYS_RESOURCE capability) may make arbitrary changes to either limit value.
However, I can't seem to get this to work for me. Here is test code:
#include <stdio.h>
#include <sched.h>
#include <errno.h>
#include <string.h>
#include <sys/resource.h>

#define PRIORITY (50)

int main(int argc, char **argv) {
    struct sched_param param;
    struct rlimit rl;
    int e, min_fifo, max_fifo;

    min_fifo = sched_get_priority_min(SCHED_FIFO);
    max_fifo = sched_get_priority_max(SCHED_FIFO);
    printf("For policy SCHED_FIFO min priority is %d, max is %d.\n",
           min_fifo, max_fifo);
    if ((min_fifo > PRIORITY) || (max_fifo < PRIORITY)) {
        printf("Desired priority of %d is out of range.\n", PRIORITY);
        return 1;
    }

    if (getrlimit(RLIMIT_RTPRIO, &rl) != 0) {
        e = errno;
        printf("Failed to getrlimit(): %s.\n", strerror(e));
        return 1;
    }
    printf("RTPRIO soft limit is %d, hard is %d.\n",
           (int) rl.rlim_cur, (int) rl.rlim_max);

    // Adjust hard limit if necessary
    if (rl.rlim_max < PRIORITY) {
        rl.rlim_max = PRIORITY;
        if (setrlimit(RLIMIT_RTPRIO, &rl) != 0) {
            e = errno;
            printf("Failed to raise hard limit for RTPRIO to %d: %s.\n",
                   (int) rl.rlim_max, strerror(e));
            return 1;
        }
        printf("Raised hard limit for RTPRIO to %d.\n", (int) rl.rlim_max);
    }

    // Adjust soft limit if necessary
    if (rl.rlim_cur < PRIORITY) {
        rl.rlim_cur = PRIORITY;
        if (setrlimit(RLIMIT_RTPRIO, &rl) != 0) {
            e = errno;
            printf("Failed to raise soft limit for RTPRIO to %d: %s.\n",
                   (int) rl.rlim_cur, strerror(e));
            return 1;
        }
        printf("Raised soft limit for RTPRIO to %d.\n", (int) rl.rlim_cur);
    }

    // Set desired priority with class SCHED_FIFO
    param.sched_priority = PRIORITY;
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0) {
        e = errno;
        printf("Setting policy failed: %s.\n", strerror(e));
        return 1;
    } else {
        printf("Set policy SCHED_FIFO, priority %d.\n", param.sched_priority);
    }
    return 0;
}
This works as expected without special privilege with a hard limit of 99:
$ ./rtprio
For policy SCHED_FIFO min priority is 1, max is 99.
RTPRIO soft limit is 0, hard is 99.
Raised soft limit for RTPRIO to 50.
Set policy SCHED_FIFO, priority 50.
$
It works as expected with hard limit of 0 using sudo:
$ sudo ./rtprio
For policy SCHED_FIFO min priority is 1, max is 99.
RTPRIO soft limit is 0, hard is 0.
Raised hard limit for RTPRIO to 50.
Raised soft limit for RTPRIO to 50.
Set policy SCHED_FIFO, priority 50.
$
However it does not work as expected when setuid root:
$ sudo chown root ./rtprio
$ sudo chgrp root ./rtprio
$ sudo chmod ug+s ./rtprio
$ ls -l ./rtprio
-rwsrwsr-x 1 root root 8948 Nov 28 12:04 ./rtprio
$ ./rtprio
For policy SCHED_FIFO min priority is 1, max is 99.
RTPRIO soft limit is 0, hard is 0.
Failed to raise hard limit for RTPRIO to 50: Operation not permitted.
It also unexpectedly fails with capability CAP_SYS_RESOURCE as well as with all capabilities:
$ sudo setcap cap_sys_resource=eip ./rtprio
$ getcap ./rtprio
./rtprio = cap_sys_resource+eip
$ ./rtprio
For policy SCHED_FIFO min priority is 1, max is 99.
RTPRIO soft limit is 0, hard is 0.
Failed to raise hard limit for RTPRIO to 50: Operation not permitted.
$ sudo setcap all=eip ./rtprio
$ getcap ./rtprio
./rtprio =eip
$ ./rtprio
For policy SCHED_FIFO min priority is 1, max is 99.
RTPRIO soft limit is 0, hard is 0.
Failed to raise hard limit for RTPRIO to 50: Operation not permitted.
What am I missing here?
$ uname -srv
Linux 3.13.0-100-generic #147-Ubuntu SMP Tue Oct 18 16:48:51 UTC 2016
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
$ bash --version | head -1
GNU bash, version 4.3.11(1)-release (x86_64-pc-linux-gnu)

The fact that setuid root didn't work is the clue.
It turns out that the test program above was in a partition mounted with nosuid, and hence the setuid bits have no effect. If you don't trust the setuid bits in the partition, then you probably shouldn't trust the file capabilities either. And, indeed, it turns out that when mounting with nosuid the file capabilities are also ignored.
It seems that LUKS-encrypted home directories tend to be mounted nosuid.
I'm leaving this question up because there are a lot of search engine hits for "linux capabilities nosuid" indicating that a lot of time has been wasted on this issue (but of course you don't know to search for this until you have figured it out).
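For anyone hitting the same wall, the mount flags can be checked programmatically before blaming the capability machinery. The following is a minimal sketch of my own (not part of the test program above) that uses statvfs() and the POSIX ST_NOSUID flag to report whether the filesystem holding a given path is mounted nosuid; it assumes the path is passed as argv[1].

#include <stdio.h>
#include <sys/statvfs.h>

int main(int argc, char **argv) {
    struct statvfs sv;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <path>\n", argv[0]);
        return 1;
    }
    /* Query the filesystem that contains the given path. */
    if (statvfs(argv[1], &sv) != 0) {
        perror("statvfs");
        return 1;
    }
    /* ST_NOSUID is set when the filesystem is mounted nosuid; on such a
     * mount the kernel ignores the setuid bit and, as noted above, file
     * capabilities as well. */
    if (sv.f_flag & ST_NOSUID) {
        printf("%s is on a nosuid mount: setuid bits and file capabilities are ignored.\n", argv[1]);
    } else {
        printf("%s is on a mount that honors setuid bits and file capabilities.\n", argv[1]);
    }
    return 0;
}

Run against ./rtprio this would have pointed at the nosuid mount immediately; findmnt -T ./rtprio shows the same information from the shell.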

Related

Maximum size of the string that can be passed to Docker CMD

In the Docker reference documentation, I didn't find any information about how long a string can be passed to Docker CMD.
What are the limitations?
What is the maximum number of characters I can pass to CMD?
I did a simple test and found that the limit for a Dockerfile line is 65535 on my CentOS 7/x64 machine.
#./build.sh
Sending build context to Docker daemon 363kB
Error response from daemon: failed to parse Dockerfile: dockerfile line greater than max allowed size of 65535
@ZenithS You're right for Windows (8192 characters), but Linux is not that simple.
To make it short: on Linux it's hardcoded to 64 or 128 KiB. You can check with xargs --show-limits, which gives a pretty detailed overview:
Your environment variables take up 5354 bytes
POSIX upper limit on argument length (this system): 2089750 <-- ARG_MAX
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2084396 <-- ARG_MAX - ENV
Size of command buffer we are actually using: 131072 <-- Hardcoded limit which applies actually (see below)
Maximum parallelism (--max-procs must be no greater): 2147483647
The hardcoded limit comes from MAX_ARG_STRLEN, which is set to PAGE_SIZE * 32:
https://github.com/torvalds/linux/blob/v5.16-rc7/include/uapi/linux/binfmts.h#L15
You can check the page size with getconf PAGE_SIZE (mostly 2048 or 4096 on modern platforms), which results in 64 or 128 KiB.
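If you want to reproduce those numbers on a given machine, a few lines of C are enough; the sketch below just illustrates the arithmetic above (ARG_MAX for the total argv + environment budget, PAGE_SIZE * 32 for the per-string MAX_ARG_STRLEN limit) and is not anything Docker-specific.

#include <stdio.h>
#include <unistd.h>

int main(void) {
    long page_size = sysconf(_SC_PAGESIZE); /* kernel page size, e.g. 4096 */
    long arg_max   = sysconf(_SC_ARG_MAX);  /* total space for argv + environ */

    /* MAX_ARG_STRLEN (the limit on a single argument string) is defined in
     * include/uapi/linux/binfmts.h as PAGE_SIZE * 32, so recompute it here. */
    printf("PAGE_SIZE      = %ld bytes\n", page_size);
    printf("ARG_MAX        = %ld bytes\n", arg_max);
    printf("PAGE_SIZE * 32 = %ld bytes (MAX_ARG_STRLEN)\n", page_size * 32);
    return 0;
}

On a 4096-byte-page system the last line prints 131072, matching the "Size of command buffer we are actually using" figure reported by xargs above.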

How to generate 100Mpps traffic of 64B packet size with Pktgen?

I tried to use dpdk-pktgen 3.7.2 with dpdk 18.11, but it only reached about 35 Mpps of traffic with a 64B packet size. The following is my Lua script:
package.path = package.path .. ";?.lua;test/?.lua;app/?.lua;../?.lua"
require "Pktgen";

local time = 30;
local pcnt_rate = 100;
sendport = 0;
recvport = 1;
pkt_size = 64;
burst_cnt = 128
local dstip = "192.168.100.100";
local srcip = "192.168.0.0";

function main()
    pktgen.stop(sendport);
    sleep(2);
    pktgen.set(sendport, "size", burst_cnt);
    pktgen.set(sendport, "burst", 64);
    pktgen.set(sendport, "rate", pcnt_rate);
    pktgen.set_ipaddr(sendport, "dst", dstip);
    pktgen.set_ipaddr(sendport, "src", srcip);
    pktgen.set_proto(sendport..","..recvport, "udp");
    pktgen.start(sendport)
    sleep(time)
    pktgen.stop(sendport)
end

printf("\n**** Traffic Profile Rate for %d byte packets ***\n", pkt_size);
main();
printf("\n*** Traffic Profile Done (Total Time %d) ***\n", time);
I ran the script with the following command.
sudo pktgen -l 0-7 -n 4 -- -N -T -P -m "[1-7].0" -f script.lua
My NIC is a Mellanox ConnectX-5 100GbE, with a traffic limit of 200 Mpps and 100 Gbps. Is there anything in my script that restricts the performance of pktgen? Thank you for your suggestions.
As mentioned in the comments, this is more of a platform configuration issue, or a matter of not choosing the right platform. I am able to generate 120 Mpps with 64B packets on 100 Gbps (CVL NIC).
[EDIT-2] I finally got my hands on Mellanox ConnectX-6 DX cards (2 * 100 Gbps). With DPDK 21.11 and Pktgen 21.11, it is possible to generate over 100 Mpps.
MLX PMD args used: mprq_en=1,rxqs_min_mprq=1,mprq_log_stride_num=9,txq_inline_mpw=128,rxq_pkt_pad_en=1
Platform Details:
DPDK:21.08
PKTGEN:pktgen-dpdk-pktgen-21.03.1
NIC:Ethernet Controller E810-C for QSFP
CPU: Intel(R) Xeon(R) Gold 6152 CPU @ 2.10GHz
PKTGEN CMD:pktgen --legacy-mem -a 0000:86:00.0 -l 22-43 -- -P -m "[25-29:30-34].0" -N
[EDIT-1] @SoliRaven, as mentioned in the comments and in the answer, one can generate 120 to 125 Mpps with an Intel E810, and 120 Mpps with a Mellanox ConnectX-6 DX. Hence this looks more like a configuration, platform, or firmware issue, and not a problem with DPDK or DPDK-Pktgen.

check_cpu + nsclient : set critical threshold only on 5min period

I am using Centreon (Nagios) to monitor the CPUs of some VMs using NSClient. In my case it only makes sense to set the critical state of the CPU probe if the average CPU load is > 95 over the 5m period. Is this achievable?
I cannot find documentation on how to specify that in the critical parameter.
Default command
check_cpu
Returns
CPU Load ok
'total 5m load'=0%;80;90 'total 1m load'=0%;80;90 'total 5s load'=7%;80;90
Command with a specific threshold (but any time period can match)
check_cpu "critical=load > 90"
It is not exactly what I wanted to do, but what I did is the following:
check_nrpe -u -H XX.XXX.X.XXX -c check_cpu -a "crit=load > 95" "warn=load > 90" time=5m
Which limits the output to the 5m time period.
Note that to execute this from Centreon you have to set the following variables inside the nsclient.ini file (I wasted a lot of time on that one):
[/settings/NRPE/server]
allow nasty characters=true
[/settings/external scripts]
allow nasty characters=true
Check this script:
define service{
    use                 generic-service
    host_name           xxx
    service_description CPU Load
    check_command       check_nrpe!check_load
    contact_groups      sysadmin
}
---
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
You can try something like this:
check_nrpe -u -H XX.XXX.X.XXX -c check_cpu -a "warning=time = '5m' and load > 80" "critical=time = '5m' and load > 90" show-all
You can also check the documentation for more info.

Snakemake memory limiting

In Snakemake, I have 5 rules. For each, I set the memory limit with the resources mem_mb option.
It looks like this:
rule assembly:
    input:
        file1 = os.path.join(MAIN_DIR, "1.txt"), \
        file2 = os.path.join(MAIN_DIR, "2.txt"), \
        file3 = os.path.join(MAIN_DIR, "3.txt")
    output:
        foldr = dir, \
        file4 = os.path.join(dir, "A.png"), \
        file5 = os.path.join(dir, "A.tsv")
    resources:
        mem_mb=100000
    shell:
        " pythonscript.py -i {input.file1} -v {input.file2} -q {input.file3} --cores 5 -o {output.foldr} "
I want to limit the memory usage of the whole Snakefile by doing something like:
snakemake --snakefile mysnakefile_snakefile --resources mem_mb=100000
So not all jobs would use 100 GB each (if I have 5 rules, that would mean 500 GB of memory allocation); rather, all of their executions together would use a maximum of 100 GB (5 jobs, 100 GB total allocation)?
The command line argument sets the total limit. The Snakemake scheduler will ensure that for the set of running jobs, the sum of the mem_mb resources will not exceed the total limit.
I think this is exactly what you want, isn't it? You just need to set the per-job expected memory in the rule itself. Note that Snakemake does not measure this for you. You have to define that value yourself in the rule. E.g., if you expect your job to use 100MB memory, put mem_mb=100 into that rule.

Missing nvcc compiler - theano

I use Ubuntu 14.04 and CUDA 7.5. I get the CUDA version information using $ nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
$PATH and $LD_LIBRARY_PATH are below:
$ echo $PATH
/usr/local/cuda-7.5/bin:/usr/local/cuda-7.5/bin/:/opt/ros/indigo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
$ echo $LD_LIBRARY_PATH
/usr/local/cuda-7.5/lib64
I installed Theano. I can use it with the CPU but not the GPU. This guide says:
Testing Theano with GPU

To see if your GPU is being used, cut and paste the following program into a file and run it.

from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')

The program just computes the exp() of a bunch of random numbers. Note that we use the shared function to make sure that the input x is stored on the graphics device.

If I run this program (in check1.py) with device=cpu, my computer takes a little over 3 seconds, whereas on the GPU it takes just over 0.64 seconds. The GPU will not always produce the exact same floating-point numbers as the CPU. As a benchmark, a loop that calls numpy.exp(x.get_value()) takes about 46 seconds.

$ THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python check1.py
[Elemwise{exp,no_inplace}()]
Looping 1000 times took 3.06635117531 seconds
Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761 1.62323284]
Used the cpu

$ THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python check1.py
Using gpu device 0: GeForce GTX 580
[GpuElemwise{exp,no_inplace}(), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 0.638810873032 seconds
Result is [ 1.23178029 1.61879349 1.52278066 ..., 2.20771813 2.29967761 1.62323296]
Used the gpu

Note that GPU operations in Theano require for now floatX to be float32 (see also below).
When I run the GPU version of the command without sudo, it throws a permission denied error:
/theano/gof/cmodule.py", line 741, in refresh
files = os.listdir(root)
OSError: [Errno 13] Permission denied: '/home/user/.theano/compiledir_Linux-3.16--generic-x86_64-with-Ubuntu-14.04-trusty-x86_64-2.7.6-64/tmp077r7U'
If I run it with sudo, the compiler cannot find the nvcc path:
ERROR (theano.sandbox.cuda): nvcc compiler not found on $PATH. Check your nvcc installation and try again.
How can I fix this error?
Try running
chown -R user /home/user/.theano
chmod -R 775 /home/user/.theano
This will change the permissions of the folder that your Python script can't access. The first command makes the folder belong to your user, and the second makes it readable, writable, and executable by the user.
Regarding this error only:
You can check where nvcc is installed; the default path is '/usr/local/cuda/bin'. If you can see it there, then do the following:
$ export PATH="/usr/local/cuda/bin:$PATH"
$ source .bashrc
This worked for me and now I can use NVCC and it is no longer missing.
