xvfb-run unreliable when multiple instances invoked in parallel - xvfb

Can you help me, why I get sometimes (50:50):
webkit_server.NoX11Error: Cannot connect to X. You can try running with xvfb-run.
When I start the script in parallel as:
xvfb-run -a python script.py
You can reproduce this yourself like so:
for ((i=0; i<10; i++)); do
xvfb-run -a xterm &
done
Of the 10 instances of xterm this starts, 9 of them will typically fail, exiting with the message Xvfb failed to start.

Looking at xvfb-run 1.0, it operates as follows:
# Find a free server number by looking at .X*-lock files in /tmp.
find_free_servernum() {
# Sadly, the "local" keyword is not POSIX. Leave the next line commented in
# the hope Debian Policy eventually changes to allow it in /bin/sh scripts
# anyway.
#local i
i=$SERVERNUM
while [ -f /tmp/.X$i-lock ]; do
i=$(($i + 1))
done
echo $i
}
This is very bad practice: If two copies of find_free_servernum run at the same time, neither will be aware of the other, so they both can decide that the same number is available, even though only one of them will be able to use it.
So, to fix this, let's write our own code to find a free display number, instead of assuming that xvfb-run -a will work reliably:
#!/bin/bash
# allow settings to be updated via environment
: "${xvfb_lockdir:=$HOME/.xvfb-locks}"
: "${xvfb_display_min:=99}"
: "${xvfb_display_max:=599}"
# assuming only one user will use this, let's put the locks in our own home directory
# avoids vulnerability to symlink attacks.
mkdir -p -- "$xvfb_lockdir" || exit
i=$xvfb_display_min # minimum display number
while (( i < xvfb_display_max )); do
if [ -f "/tmp/.X$i-lock" ]; then # still avoid an obvious open display
(( ++i )); continue
fi
exec 5>"$xvfb_lockdir/$i" || continue # open a lockfile
if flock -x -n 5; then # try to lock it
exec xvfb-run --server-num="$i" "$#" || exit # if locked, run xvfb-run
fi
(( i++ ))
done
If you save this script as xvfb-run-safe, you can then invoke:
xvfb-run-safe python script.py
...and not worry about race conditions so long as no other users on your system are also running xvfb.
This can be tested like so:
for ((i=0; i<10; i++)); do xvfb-wrap-safe xchat & done
...in which case all 10 instances correctly start up and run in the background, as opposed to:
for ((i=0; i<10; i++)); do xvfb-run -a xchat & done
...where, depending on your system's timing, nine out of ten will (typically) fail.

This questions was asked in 2015.
In my version of xvfb (2:1.20.13-1ubuntu1~20.04.2), this problem has been fixed.
It looks at /tmp/.X*-lock to find an available port, and then runs Xvfb. If Xvfb fails to start, it finds a new port and retries, up to 10 times.

Related

Why QProcess is not showing stdout from bash script executed in remote server?

I made an script (findx.h) that doesn't have any problem when i ran it on Solaris server via console (bash-3.2$ ./findx.sh)
The problem appears when i try to run it from a windows Qt app using QProcess (code below) where it doesn't display the ouput of the command.
I tried little variations and appear to show data when just use one pipe instead of two. But i need the two: grep and ggrep.
//findx.h in solaris
//WHAT WORKS
#!/bin/bash
echo pass | sudo -S /usr/sbin/snoop -x0 -ta HSM1000 port 1000
//WHAT I WANT
#!/bin/bash
echo pass | sudo -S /usr/sbin/snoop -x0 -ta HSM1000 port 1000 | /usr/sfw/bin/ggrep -A 2 KR01
//Qt on windows
QString commands="(";
commands +="source setpath.sh";
commands +=";/path/to/script/findx.sh";
commands +=")";
this->logged=false;
QString program = "plink.exe";
QStringList arguments;
arguments <<"-ssh"
<<ip
<<"-l"
<<user
<<"-pw"
<<pass
<<commands;
this->myProcess=new QProcess(this);
connect(this->myProcess,SIGNAL(started()),
this, SLOT(onprocess_started()));
connect(this->myProcess, SIGNAL(errorOccurred(QProcess::ProcessError)),
this, SLOT(onprocess_errorOcurred(QProcess::ProcessError)));
connect(this->myProcess, SIGNAL(finished(int, QProcess::ExitStatus)),
this, SLOT(onprocess_finished(int, QProcess::ExitStatus)));
connect(this->myProcess, SIGNAL(readyReadStandardError()),
this, SLOT(onprocess_readyReadStandardError()));
connect(this->myProcess, SIGNAL(readyReadStandardOutput()),
this, SLOT(onprocess_readyReadStandardOutput()));
connect(this->myProcess, SIGNAL(stateChanged(QProcess::ProcessState)),
this, SLOT(onprocess_stateChanged(QProcess::ProcessState)));
this->myProcess->start(program, arguments);
this->ui->labStatus->setText("Starting");
return 0;
// How i read, i do the same for stderr and put it also in plainOutput
QByteArray err=this->myProcess->readAllStandardOutput();
QString m="Standard output:"+QString(err.data());
this->ui->plainOutput->appendPlainText(m);
please any advice would be useful.
Thanks in advance.

How to monitor resources during slurm job?

I'm running jobs on our university cluster (regular user, no admin rights), which uses the SLURM scheduling system and I'm interested in plotting the CPU and memory usage over time, i.e while the job is running. I know about sacct and sstat and I was thinking to include these commands in my submission script, e.g. something in the line of
#!/bin/bash
#SBATCH <options>
# Running the actual job in background
srun my_program input.in output.out &
# While loop that records resources
JobStatus="$(sacct -j $SLURM_JOB_ID | awk 'FNR == 3 {print $6}')"
FIRST=0
#sleep time in seconds
STIME=15
while [ "$JobStatus" != "COMPLETED" ]; do
#update job status
JobStatus="$(sacct -j $SLURM_JOB_ID | awk 'FNR == 3 {print $6}')"
if [ "$JobStatus" == "RUNNING" ]; then
if [ $FIRST -eq 0 ]; then
sstat --format=AveCPU,AveRSS,MaxRSS -P -j ${SLURM_JOB_ID} >> usage.txt
FIRST=1
else
sstat --format=AveCPU,AveRSS,MaxRSS -P --noheader -j ${SLURM_JOB_ID} >> usage.txt
fi
sleep $STIME
elif [ "$JobStatus" == "PENDING" ]; then
sleep $STIME
else
sacct -j ${SLURM_JOB_ID} --format=AllocCPUS,ReqMem,MaxRSS,AveRSS,AveDiskRead,AveDiskWrite,ReqCPUS,AllocCPUs,NTasks,Elapsed,State >> usage.txt
JobStatus="COMPLETED"
break
fi
done
However, I'm not really convinced of this solution:
sstat unfortunately doesn't show how many cpus are used at the
moment (only average)
MaxRSS is also not helpful if I try to record memory usage over time
there still seems to be some error (script doesn't stop after job finishes)
Does anyone have an idea how to do that properly? Maybe even with top or htop instead of sstat? Any help is much appreciated.
Slurm offers a plugin to record a profile of a job (PCU usage, memory usage, even disk/net IO for some technologies) into a HDF5 file. The file contains a time series for each measure tracked, and you can choose the time resolution.
You can activate it with
#SBATCH --profile=<all|none|[energy[,|task[,|filesystem[,|network]]]]>
See the documentation here.
To check that this plugin is installed, run
scontrol show config | grep AcctGatherProfileType
It should output AcctGatherProfileType = acct_gather_profile/hdf5.
The files are created in the folder referred to in the ProfileHDF5Dir Slurm configuration parameter (in slurm.conf)
As for your script, you could try replacing sstat with an SSH connection to the compute nodes to run ps. Assuming pdsh or clush is installed, you could run something like:
pdsh -j $SLURM_JOB_ID ps -u $USER -o pid,state,cputime,%cpu,rssize,command --columns 100 >> usage.txt
This will give you CPU and memory usage per process.
As a final note, your job never terminates simply because it will terminate when the while loop terminates, and the while loop will terminate when the job terminates... The condition "$JobStatus" == "COMPLETED" will never be observed from within the script. When the job is completed, the script is killed.

jenkins plugin for triggering build whenever any file changed in a given directory

I am looking for functionality where we have a directory with some files in it.
Whenever any one makes a change in any of the files in the directory, jenkins shoukd trigger a build.
Is there any plugin or mathod for this functionality. Please advise.
Thanks in advance.
I have not tried it myself, but The FSTrigger plugin seems to do what you want:
FSTrigger provides polling mechanisms to monitor a file system and
trigger a build if a file or a set of files have changed.
If you can monitor the directory with a script, you can trigger the build with a HTTP GET, for example with wget or curl:
wget -O- $JENKINS_URL/job/JOBNAME/build
Although slightly related.. it seems like this issue was about monitoring static files on system.. however there are many version control systems for just this purpose.
I answered this in another post if you're using git to track changes on the files themselves:
#!/bin/bash
set -e
job_name="whatever"
JOB_URL="http://myserver:8080/job/${job_name}/"
FILTER_PATH="path/to/folder/to/monitor"
python_func="import json, sys
obj = json.loads(sys.stdin.read())
ch_list = obj['changeSet']['items']
_list = [ j['affectedPaths'] for j in ch_list ]
for outer in _list:
for inner in outer:
print inner
"
_affected_files=`curl --silent ${JOB_URL}${BUILD_NUMBER}'/api/json' | python -c "$python_func"`
if [ -z "`echo \"$_affected_files\" | grep \"${FILTER_PATH}\"`" ]; then
echo "[INFO] no changes detected in ${FILTER_PATH}"
exit 0
else
echo "[INFO] changed files detected: "
for a_file in `echo "$_affected_files" | grep "${FILTER_PATH}"`; do
echo " $a_file"
done;
fi;
You can add the check directly to the top of the job's exec shell, and it will exit 0 if no changes detected.. Hence, you can always poll the top level of the repo for check-in's to trigger a build. And only complete a build if the files in question change.

Best way to search the path in shell

I've got a small script called "onewhich". Its purpose is to behave like which, except that it will only give the FIRST occurrence of any executables specified as options, as found in the order they'd appear in the path.
So for example, if my path is /opt/bin:/usr/bin:/bin, and I have both /opt/bin/runme and /usr/bin/runme, then the command onewhich runme would return /opt/bin/runme.
But if I also have a /usr/bin/doit, then the command onewhich doit runme would return /usr/bin/doit instead.
The idea is to walk through the path, check for each executable specified, and if it exists, show it and exit.
Here's the script so far.
#!/bin/sh
for what in "$#"; do
for loc in `echo "${PATH}" | awk -vRS=: 1`; do
if [ -f "${loc}/${what}" ]; then
echo "${loc}/${what}"
exit 0
fi
done
done
exit 1
The problem is, I want to be better about PATH directories with special characters. Every second shell question here on StackOverflow talks about how bad it is to parse paths with tools like awk and sed. There's even a bash faq entry about it. (Proviso: I'm not using bash for this, but the recommendation is still valid.)
So I tried rewriting the script to separate paths in a pipe, like this"
#!/bin/sh
for what in "$#"; do
echo "${PATH}" | awk -vRS=: 1 | while read loc ; do
if [ -f "${loc}/${what}" ]; then
echo "${loc}/${what}"
exit 0
fi
done
done
exit 1
I'm not sure if this gives me any real advantage (since $loc is still inside quotes), but it also doesn't work because for some reason, the exit 0 seems to be ignored. Or ... it exits something (the sub-shell with the while loop that terminates the pipe, maybe), but the script exits with a value of 1 every time.
What's a better way to step through directories in ${PATH} without the risk that special characters will confuse things?
Alternately, am I reinventing the wheel? Is there maybe a way to do this that's built in to existing shell tools?
This needs to run in both Linux and FreeBSD, which is why I'm writing it in Bourne instead of bash.
Thanks.
This doesn't directly answer your question, but does eliminate the need to parse PATH at all:
onewhich () {
for what in "$#"; do
which "$what" 2>/dev/null && break
done
}
This just calls which on each command on the input list until it finds a match.
To parse PATH, you can simply set `IFS=':'.
if [ "${IFS:-x}" = "${IFS-x}" ]; then
# Only preserve the value of IFS if it is currently set
OLDIFS=$IFS
fi
IFS=":"
for f in $PATH; do # Do not quote $PATH, to allow word splitting
echo $f
done
if [ "${OLDIFS:-x}" = "${OLDIFS-x}" ]; then
IFS=$OLDIFS
fi
The above will fail if any of the directories in PATH actually contain colons.
Your first method looks to me as if it should work. In practical terms, if it's really the $PATH you'll be searching, it's unlikely you'll have spaces and newlines embedded in directories there. If you do, it's probably time to refactor.
But still, I don't think you're at risk from the possibility of bad names clobbering your loop, since you're wrapping variables in quotes. At worst, I suspect you might miss the odd valid executable, but I can't see how the script would generate errors. (I don't see how the script would miss valid executables, and I haven't tested - I'm just saying I don't see problems at first glance.)
As for your second question, about the loop, I think you've hit the nail on the head. When you run a pipe like this | that | while condition; do things; done, the while loop runs in its own shell at the end of the pipe. Exiting that shell may terminate the actions of the pipe, but that only brings you back to the parent shell, which has its own thread of execution that terminates with exit 1.
As for a better way to do this, I would consider which.
#!/bin/sh
for what in "$#"; do
which "$what"
done | head -1
And if you really want the exit values as well:
#!/bin/sh
for what in "$#"; do
which "$what" && exit 0
done
exit 1
The second might even be fewer resources, as it doesn't have to open a file handle and pipe through head.
You can also split your path using IFS. For example, if you wanted to wrap your loops the other way around, you could do this:
#!/bin/sh
IFS=":"
for loc in $PATH; do
for what in "$#"; do
if [ -x "$loc"/"$what" ]; then
echo "$loc"/"$what"
exit 0
fi
done
done
exit 1
Note that under normal circumstances, you might want to save the old value of $IFS, but you seem to be doing things in a stand-alone script, so the "new" value gets thrown out when the script exits.
All the above code is untested. YMMV.
Another way to get around the need to parse PATH at all is to run the builtin type command in new shell with a stripped environment (i. e. there simply are no functions or aliases to look up; cf. env -i sh -c 'type cmd 2>/dev/null).
# using `cmd` instead of $(cmd) for portability
onewhich() {
ec=0 # exit code
for cmd in "$#"; do
command -p env -i PATH="$PATH" sh -c '
export LC_ALL=C LANG=C
cmd="$1"
path="`type "$cmd" 2>/dev/null`"
if [ X"$path" = "X" ]; then
printf "%s\n" "error: command \"${cmd}\" not found in PATH" 1>&2
exit 1
else
case "$path" in
*\ /*)
path="/${path#*/}"
printf "%s\n" "$path";;
*)
printf "%s\n" "error: no disk file: $path" 1>&2
exit 1;;
esac
exit 0
fi
' _ "$cmd"
[ $? != 0 ] && ec=1
done
[ $ec != 0 ] && return 1
}
onewhich awk ls sed
onewhich builtin
onewhich if
Since which on success returns two full command paths if two commands are specified as arguments, exit 0 in the first onewhich script above aborts the program prematurely. In addition, if two commands are specified as arguments to which, the exit code of which is set to 1 even if only one command lookup failed (cf. which awk sedxyz ls; echo $?). To mimic this behaviour of the which command it is necessary to toggle on/off two variables (cnt and nomatches below).
onewhich() (
IFS=":"
nomatches=0
for cmd in "$#"; do
cnt=0
for loc in $PATH ; do
if [ $cnt = 0 ] && [ -x "$loc"/"$cmd" ]; then
echo "$loc"/"$cmd"
cnt=1
fi
done
[ $cnt = 0 ] && nomatches=1
done
[ $nomatches = 1 ] && exit 1 || exit 0 # exit 1: at least one cmd was not in PATH
)
onewhich awk ls sed
onewhich awk lsxyz sed
onewhich builtin
onewhich if

unix at command pass variable to shell script?

I'm trying to setup a simple timer that gets started from a Rails Application. This timer should wait out its duration and then start a shell script that will start up ./script/runner and complete the initial request. I need script/runner because I need access to ActiveRecord.
Here's my test lines in Rails
output = `at #{(Time.now + 60).strftime("%H:%M")} < #{Rails.root}/lib/parking_timer.sh STRING_VARIABLE`
return render :text => output
Then my parking_timer.sh looks like this
#!/bin/sh
~/PATH_TO_APP/script/runner -e development ~/PATH_TO_APP/lib/ParkingTimer.rb $1
echo "All Done"
Finally, ParkingTimer.rb reads the passed variable with
ARGV.each do|a|
puts "Argument: #{a}"
end
The problem is that the Unix command "at" doesn't seem to like variables and only wants to deal with filenames. I either get one of two errors depending on how I position "s
If I put quotes around the right hand side like so
... "~/PATH_TO_APP/lib/parking_timer.sh STRING_VARIABLE"
I get,
-bash: ~/PATH_TO_APP/lib/parking_timer.sh STRING_VARIABLE: No such file or directory
I I leave the quotes out, I get,
at: garbled time
This is all happening on a Mac OS 10.6 box running Rails 2.3 & Ruby 1.8.6
I've already messed around w/ BackgrounDrb, and decided its a total PITA. I need to be able to cancel the job at any time before it is due.
After playing around with irb a bit, here's what I found.
The backtick operator invokes the shell after ruby has done any interpretation necessary. For my test case, the strace output looked something like this:
execve("/bin/sh", ["sh", "-c", "echo at 12:57 < /etc/fstab"], [/* 67 vars */]) = 0
Since we know what it's doing, let's take a look at how your command will be executed:
/bin/sh -c "at 12:57 < RAILS_ROOT/lib/parking_timer.sh STRING_VARIABLE"
That looks very odd. Do you really want to pipe parking_timer.sh, the script, as input into the at command?
What you probably ultimately want is something like this:
/bin/sh -c "RAILS_ROOT/lib/parking_timer.sh STRING_VARIABLE | at 12:57"
Thus, the output of the parking_timer.sh command will become the input to the at command.
So, try the following:
output = `#{Rails.root}/lib/parking_timer.sh STRING_VARIABLE | at #{(Time.now + 60).strftime("%H:%M")}`
return render :text => output
You can always use strace or truss to see what's happening. For example:
strace -o strace.out -f -ff -p $IRB_PID
Then grep '^exec' strace.out* to see where the command is being executed.

Resources