GNU Parallel: make each remote server run 1 job using all its CPUs - gnu-parallel

I have jobs (multiprocessing Python code) that each ideally use 4 CPUs on a remote machine. In GNU Parallel, how do I set up the arguments so that each remote server (assuming 4 cores) runs one job at a time, using all 4 of its cores on that job (instead of using its 4 cores to run 4 jobs, which is the default)?

Run 25% of the number of cores as parallel jobs (i.e. 1 job at a time on a 4-core server), giving each job 4 arguments:
-j 25% -N4
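A fuller invocation might look like the sketch below; the host name, script name, and argument file are placeholders, not from the original question:
# One job per 4-core remote server (-j 25% = 25% of that host's cores),
# 4 arguments per job (-N4), read from the hypothetical args.txt.
# (process.py is assumed to already exist on the remote host.)
parallel -j 25% -N4 --sshlogin server1.example.com python process.py {1} {2} {3} {4} :::: args.txt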

Related

Jenkins Request Handling is Very Slow

We have a Jenkins server (master/agents) hosted in AWS, and we set up agents (on demand, based on the job queue) that connect through swarm-client.
The main issue: rebuilding a Jenkins job or replaying a Jenkins job takes a long time, approximately 7 minutes.
Scaling down agents: we have a Python script used for scaling down agents when they are idle; processing those requests also takes a long time, approximately 5 minutes on average (when 10+ requests come in at a time).
Scaling up agents: creating agents is a little faster than the two cases above, maybe 2 minutes on average.
GET/POST requests are very fast and are served within 100 milliseconds.
Below is the command used to connect an agent to the Jenkins master:
====
java -jar /usr/share/jenkins/swarm-client.jar -fsroot /var/jenkins_agent_home -deleteExistingClients -disableClientsUniqueId -executors 1 -master https://xxxxxx/ -passwordEnvVariable PASSWORD -e PASSWORD=redacted -username user -fsroot /var/jenkins/0 -name xxl-xxxxxxx -labels xxl_linux -mode exclusive
====
Jenkins server:
Memory: 72 GB
Cores: 32
Heap memory: 16 GB
Java Version: 11
Garbage collector used: G1GC
Please help us improve this performance. Thanks in advance!

presto docker containers on production environment

We intend to build a Presto cluster on Docker containers. We have 12 RHEL machines. The simple implementation is to run one Presto service in a Docker container per Linux machine.
On the other hand, we are considering the following alternative plan and would appreciate feedback on it: since we have 12 physical Linux machines, we can build 4 Docker containers on each Linux machine, where each Docker container runs a Presto service, so the total number of Presto workers will be 4 x 12 = 48.
I think the question is: should I run one Presto worker per machine or multiple?
In general: one Presto worker per machine will perform much better than multiple workers.
There are some edge cases though. If your machines have more than 200 GB memory, you may get some performance penalty from the JVM due to rather large heap sizes. (This, however, requires more thought, so don't take it as an advice to run multiple workers per machine.)
Make sure you run on Java 11 or newer. This is in fact one of the main reasons why Presto requires Java 11 starting with Presto 333.
Note: you do not need to build your own Docker image. We publish a CentOS-based image at https://hub.docker.com/r/prestosql/presto. Hope this is helpful.
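As an illustration, running that image on one of the hosts could look like the sketch below; the container name, port mapping, and absence of a custom catalog/config are assumptions, not part of the original answer:
# Pull and start the published Presto image (default config, HTTP UI on port 8080);
# mount your own etc/ directory for a real cluster configuration.
docker pull prestosql/presto
docker run -d --name presto -p 8080:8080 prestosql/presto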

Multiple GitLab Runner Docker instances at one host?

I need to configure GitLab Runner to run multiple shared runners in Docker containers on one server (host).
So I registered two runners with gitlab-runner register as shared runners with the same tag.
But there is an issue: only one of them is currently used, and all other jobs wait in Pending status until the first runner is free. The second runner instance is not used until the first one finishes.
All jobs have the same tag.
How can I run multiple runners on the same server (host)?
By default concurrent is 1, so unless you increase it your runner will just use one registration at a time: https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-global-section
From the docs: "limits how many jobs globally can be run concurrently. The most upper limit of jobs using all defined runners. 0 does not mean unlimited"
To utilize all your CPU cores, set concurrent in /etc/gitlab-runner/config.toml (when running as root) or ~/.gitlab-runner/config.toml (when running as non-root) to the number of your CPUs.
You can find the number of CPUs like this: grep -c ^processor /proc/cpuinfo.
In my case the config.toml says concurrent = 8
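A minimal sketch of making that change from the shell (assumes a root install at /etc/gitlab-runner/config.toml; the sed edit and restart are illustrative, not from the original answer):
# Set the global job limit to the number of CPU cores, then restart the runner.
CORES=$(grep -c ^processor /proc/cpuinfo)
sudo sed -i "s/^concurrent = .*/concurrent = ${CORES}/" /etc/gitlab-runner/config.toml
sudo gitlab-runner restart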
Citations:
Gitlab-Runner advanced config: https://docs.gitlab.com/runner/configuration/advanced-configuration.html#the-global-section
Find CPU cores on Linux: How to obtain the number of CPUs/cores in Linux from the command line?

Spark: what's the advantages of having multiple executors per node for a Job?

I am running my job on an AWS EMR cluster. It is a 40-node cluster using cr1.8xlarge instances. Each cr1.8xlarge has 240 GB of memory and 32 cores. I can run with the following config:
--driver-memory 180g --driver-cores 26 --executor-memory 180g --executor-cores 26 --num-executors 40 --conf spark.default.parallelism=4000
or
--driver-memory 180g --driver-cores 26 --executor-memory 90g --executor-cores 13 --num-executors 80 --conf spark.default.parallelism=4000
From the job-tracker web UI, the number of tasks running simultaneously is essentially just the number of cores (CPUs) available. So I am wondering: are there any advantages, or specific scenarios, where we would want more than one executor per node?
Thanks!
Yes, there are advantages of running multiple executors per node - especially on large instances like yours. I recommend that you read this blog post from Cloudera.
A snippet of the post that would be of particular interest to you:
To hopefully make all of this a little more concrete, here’s a worked example of configuring a Spark app to use as much of the cluster as possible: Imagine a cluster with six nodes running NodeManagers, each equipped with 16 cores and 64GB of memory. The NodeManager capacities, yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores, should probably be set to 63 * 1024 = 64512 (megabytes) and 15 respectively. We avoid allocating 100% of the resources to YARN containers because the node needs some resources to run the OS and Hadoop daemons. In this case, we leave a gigabyte and a core for these system processes. Cloudera Manager helps by accounting for these and configuring these YARN properties automatically.
The likely first impulse would be to use --num-executors 6 --executor-cores 15 --executor-memory 63G. However, this is the wrong approach because:
63GB + the executor memory overhead won’t fit within the 63GB capacity of the NodeManagers.
The application master will take up a core on one of the nodes, meaning that there won’t be room for a 15-core executor on that node.
15 cores per executor can lead to bad HDFS I/O throughput.
A better option would be to use --num-executors 17 --executor-cores 5 --executor-memory 19G. Why?
This config results in three executors on all nodes except for the one with the AM, which will have two executors.
--executor-memory was derived as (63/3 executors per node) = 21. 21 * 0.07 = 1.47. 21 – 1.47 ~ 19.
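Applying the same logic to the cr1.8xlarge nodes from the question (40 nodes, 32 cores and 240 GB each) might look roughly like the sketch below; these numbers are an illustrative assumption following the Cloudera guidance, not a tested configuration:
# Leave ~2 cores and some memory per node for the OS and daemons; 5 cores per
# executor then gives ~6 executors per node, i.e. ~240 executors cluster-wide
# (one slot is taken by the application master).
--num-executors 239 --executor-cores 5 --executor-memory 34g --conf spark.default.parallelism=4000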

Is there a limit of the number of tasks that can be launched by Mesos on a slave?

I am currently using Mesos + Marathon for the test.
When I launch a lot of tasks with the command ping 8.8.8.8, at some point the slaves cannot launch any more tasks. So I checked the stderr of the sandbox, and it shows
Failed to initialize, pthread_create
I launched tasks with 0.00001 cpus and 0.00001 mem, so enough resources to launch a task remained on the slaves.
Is there a limit of the number of tasks that can be launched by Mesos on a slave?
My first guess would be that you are hitting a ulimit on your slaves.
Can you try the following:
# Check the maximum number of user processes/threads:
$ ulimit -u
1024
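If that limit is indeed low, raising it is the likely fix; the value and the limits.conf entry below are illustrative assumptions, not from the original answer:
# Raise the per-user process/thread limit for the current shell:
$ ulimit -u 32768
# Or make it persistent for the user running the Mesos agent (here assumed to
# be 'mesos') by adding an nproc entry to /etc/security/limits.conf:
mesos  soft  nproc  32768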
Btw: if you just want to launch dummy tasks, I would probably use sleep 3000 or something like that.
Hope this helps
Joerg
