I am running a rake task in the background (using '&'). Sometimes I want to stop it, so I wrote this:
pinger_pid = system "ps | grep rake | awk '{print $1}'"
puts pinger_pid
system "kill -9 #{pinger_pid}"
It seems the 'true' return value is being passed to kill as garbage! How can I avoid that?
Output:
ERROR: garbage process ID "true".
Usage:
kill pid ... Send SIGTERM to every process listed.
kill signal pid ... Send a signal to every process listed.
kill -s signal pid ... Send a signal to every process listed.
kill -l List all signal names.
kill -L List all signal names in a nice table.
kill -l signal Convert between signal numbers and names.
The system method returns true or false, depending on the success of the command. Use %x to capture the command's output instead:
pinger_pid = %x(ps | grep rake | awk '{print $1}')
puts pinger_pid
system 'kill', '-9', pinger_pid
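Note that %x returns the command's standard output as a string, including the trailing newline, and if more than one process matches (with ps aux or ps -ef the grep itself matches too) you will get several PIDs, one per line. In Ruby, calling .strip on the captured string removes the newline before it is handed to kill. A shell-level sketch of a tighter pipeline (my own suggestion, not part of the original answer):
ps aux | grep '[r]ake' | awk '{print $2}'   # the [r]ake trick keeps grep from matching itself
pgrep -f rake                               # or let pgrep match against the full command line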
I ran the jq command below and my PuTTY session became unresponsive. However, I can still see the process running using the top command.
Does jq --stream run in the background by default?
jq -cn --stream '
fromstream(1|truncate_stream(inputs | select(.[0][0] == "userActivities") | del(.[0][0])))
| select(.localDate[0:7] == "2018-10")
' 2018-10-01T21_45_56Z_triplem-baas_data.json > October_2018_triplem_events.json
Does jq --stream run in the background by default?
No.
The --stream option is usually only used for very large JSON texts, so if that is the case here, then it might take a long while for the job to finish. If you want to verify that progress is being made, consider adding one or more debug statements: each debug is like . but copies its input value to STDERR before passing the value along.
Sometimes it pays to be a bit devious with debug, as illustrated in this variant of your program:
jq -cn --stream '
fromstream(1|truncate_stream(inputs | select(.[0][0] == "userActivities") | del(.[0][0])))
| (.localDate|debug) as $debug
| select(.localDate[0:7] == "2018-10")
' 2018-10-01T21_45_56Z_triplem-baas_data.json > October_2018_triplem_events.json
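jq only runs in the background if the shell puts it there. If the goal is to get the PuTTY session back while the filter works through a large file, a hedged sketch (same filter as above; jq_debug.err is just an arbitrary name for the stderr log, which is where the debug output from the variant above would land):
nohup jq -cn --stream '
fromstream(1|truncate_stream(inputs | select(.[0][0] == "userActivities") | del(.[0][0])))
| select(.localDate[0:7] == "2018-10")
' 2018-10-01T21_45_56Z_triplem-baas_data.json > October_2018_triplem_events.json 2> jq_debug.err &
tail -f jq_debug.err                       # follow progress (debug output, if any)
ls -lh October_2018_triplem_events.json    # the output file should keep growing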
I'm running jobs on our university cluster (regular user, no admin rights), which uses the SLURM scheduling system, and I'm interested in plotting CPU and memory usage over time, i.e. while the job is running. I know about sacct and sstat, and I was thinking of including these commands in my submission script, e.g. something along the lines of:
#!/bin/bash
#SBATCH <options>
# Running the actual job in background
srun my_program input.in output.out &
# While loop that records resources
JobStatus="$(sacct -j $SLURM_JOB_ID | awk 'FNR == 3 {print $6}')"
FIRST=0
#sleep time in seconds
STIME=15
while [ "$JobStatus" != "COMPLETED" ]; do
#update job status
JobStatus="$(sacct -j $SLURM_JOB_ID | awk 'FNR == 3 {print $6}')"
if [ "$JobStatus" == "RUNNING" ]; then
if [ $FIRST -eq 0 ]; then
sstat --format=AveCPU,AveRSS,MaxRSS -P -j ${SLURM_JOB_ID} >> usage.txt
FIRST=1
else
sstat --format=AveCPU,AveRSS,MaxRSS -P --noheader -j ${SLURM_JOB_ID} >> usage.txt
fi
sleep $STIME
elif [ "$JobStatus" == "PENDING" ]; then
sleep $STIME
else
sacct -j ${SLURM_JOB_ID} --format=AllocCPUS,ReqMem,MaxRSS,AveRSS,AveDiskRead,AveDiskWrite,ReqCPUS,AllocCPUs,NTasks,Elapsed,State >> usage.txt
JobStatus="COMPLETED"
break
fi
done
However, I'm not really convinced by this solution:
- sstat unfortunately doesn't show how many CPUs are used at the moment (only the average)
- MaxRSS is also not helpful if I try to record memory usage over time
- there still seems to be some error (the script doesn't stop after the job finishes)
Does anyone have an idea how to do that properly? Maybe even with top or htop instead of sstat? Any help is much appreciated.
Slurm offers a plugin to record a profile of a job (CPU usage, memory usage, even disk/network IO for some technologies) into an HDF5 file. The file contains a time series for each measure tracked, and you can choose the time resolution.
You can activate it with
#SBATCH --profile=<all|none|[energy[,|task[,|filesystem[,|network]]]]>
See the Slurm profiling documentation for details.
To check that this plugin is installed, run
scontrol show config | grep AcctGatherProfileType
It should output AcctGatherProfileType = acct_gather_profile/hdf5.
The files are created in the folder referred to by the ProfileHDF5Dir Slurm configuration parameter (in slurm.conf).
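A hedged sketch of how that typically fits into a submission script (option spellings as in the sbatch man page; your site may cap or override the sampling interval set with --acctg-freq):
#!/bin/bash
#SBATCH <options>
#SBATCH --profile=task           # record per-task CPU and memory time series
#SBATCH --acctg-freq=task=15     # sample every 15 seconds

srun my_program input.in output.out
Afterwards, on a login node, the per-node profile files can typically be merged with sh5util:
sh5util -j <jobid>               # usually writes job_<jobid>.h5, readable with h5py or HDFView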
As for your script, you could try replacing sstat with an SSH connection to the compute nodes to run ps. Assuming pdsh or clush is installed, you could run something like:
pdsh -j $SLURM_JOB_ID ps -u $USER -o pid,state,cputime,%cpu,rssize,command --columns 100 >> usage.txt
This will give you CPU and memory usage per process.
As a final note, your script never terminates simply because the while loop only ends when the job state becomes COMPLETED, and the job cannot reach COMPLETED while the submission script itself is still running. The condition "$JobStatus" == "COMPLETED" will therefore never be observed from within the script; when the job is completed, the script is killed.
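One way around that (a sketch of my own, reusing the sstat calls from your script) is to invert the structure: let the monitoring loop run in the background and the actual program in the foreground, then stop the monitor once the program returns:
#!/bin/bash
#SBATCH <options>

# background monitor: keeps sampling until it is killed
(
    FIRST=1
    while true; do
        sleep 15
        if [ $FIRST -eq 1 ]; then
            sstat --format=AveCPU,AveRSS,MaxRSS -P -j ${SLURM_JOB_ID} >> usage.txt
            FIRST=0
        else
            sstat --format=AveCPU,AveRSS,MaxRSS -P --noheader -j ${SLURM_JOB_ID} >> usage.txt
        fi
    done
) &
MONITOR_PID=$!

# the actual job runs in the foreground, so the script simply waits for it
srun my_program input.in output.out

# the program has finished: stop the monitor and record a final accounting line
kill $MONITOR_PID 2>/dev/null
sacct -j ${SLURM_JOB_ID} --format=AllocCPUS,ReqMem,MaxRSS,AveRSS,Elapsed,State >> usage.txt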
I'm new to Geneos and would like to know how to show the output of an existing script that was previously used in Nagios. We're planning to use the Toolkit plugin, and we're not sure what commands to use to see the result in the Active Console.
Requirement - check the file for the session-timeout value; alert OK if the grepped value equals 20, and alert Warning if it is less than or greater than 20.
Output in Geneos:
column_title - TIMEOUT CHECK, STATUS
row_result - THE_FILE, OK: Session Timeout is 20
Here's our sample script:
#!/bin/ksh
OK=0
WARNING=1
CRITICAL=2
THE_FILE=/target/directory/web.txt
TIMEOUT=`grep "<session-timeout>" $THE_FILE | awk -F'>' '{print $2}' | awk -F'<' '{print $1}'`
if [ $TIMEOUT -eq 20 ]; then
    echo "OK: Session Timeout is $TIMEOUT"
    exit $OK
else
    echo "WARNING: Session Timeout is $TIMEOUT"
    exit $WARNING
fi
Thanks!
You can use the same script (adding the header) and add it to Geneos as a Toolkit sampler. But I highly recommend using an FKM sampler and checking this file with Geneos directly.
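If you do go the Toolkit route, the sampler runs your script and reads comma-separated output from stdout, with the first line naming the columns and the first column acting as the row name. A hedged sketch based on your script and the columns you listed (severity is then normally applied with Geneos rules on the STATUS column rather than via exit codes):
#!/bin/ksh
# Toolkit-style output: first line is the header row, columns are comma-separated
THE_FILE=/target/directory/web.txt
TIMEOUT=`grep "<session-timeout>" $THE_FILE | awk -F'>' '{print $2}' | awk -F'<' '{print $1}'`

echo "TIMEOUT CHECK,STATUS"
if [ "$TIMEOUT" -eq 20 ]; then
    echo "$THE_FILE,OK: Session Timeout is $TIMEOUT"
else
    echo "$THE_FILE,WARNING: Session Timeout is $TIMEOUT"
fi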
Hope this helps you.
I'd like to write an escript that reloads its configuration when it receives a HUP signal. I'm on OS X, and I watch for new processes in Activity Monitor when I start the escript. When I do, these pop up: inet_gethost (twice), erl_child_setup, and beam.smp. When I send a SIGHUP to erl_child_setup, it crashes with the message "erl_child_setup closed". When I send it to beam.smp I get "Hangup: 1", but my trapping code is not called.
Here's some example code that illustrates what I am trying to do:
defmodule TrapHup do
  def main(args) do
    Process.flag(:trap_exit, true)
    main_loop()
  end

  def main_loop() do
    receive do
      { :EXIT, _from, reason } ->
        IO.puts "Caught exit!"
        IO.inspect reason
        main_loop()
    end
  end
end
I found out this is not possible with just Elixir/Erlang. Apparently it is possible through bash as illustrated in this gist: https://gist.github.com/Djo/bfa9fa75928ce432ec51
Here's the code:
#!/usr/bin/env bash
set -x
term_handler() {
    echo "Stopping the server process with PID $PID"
    erl -noshell -name "term#127.0.0.1" -eval "rpc:call('app#127.0.0.1', init, stop, [])" -s init stop
    echo "Stopped"
}
trap 'term_handler' TERM INT
elixir --name app#127.0.0.1 -S mix run --no-halt &
PID=$!
echo "Started the server process with PID $PID"
wait $PID
# remove the trap if the first signal received or 'mix run' stopped for some reason
trap - TERM INT
# return the exit status of the 'mix run'
wait $PID
EXIT_STATUS=$?
exit $EXIT_STATUS
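The gist only traps TERM and INT; for the original SIGHUP use case the same pattern works, forwarding the signal to the running node as an ordinary RPC call. A hedged sketch of the additions, assuming your application exposes a MyApp.Config.reload/0 function (that module and function are hypothetical, named here only for illustration):
hup_handler() {
    echo "Asking the node to reload its configuration"
    # 'Elixir.MyApp.Config':reload() is the hypothetical reload entry point
    erl -noshell -name "hup#127.0.0.1" -eval "rpc:call('app#127.0.0.1', 'Elixir.MyApp.Config', reload, [])" -s init stop
}
trap 'hup_handler' HUP
Note that wait returns as soon as any trapped signal arrives, so if the script should keep running after a reload you would loop around the wait rather than falling through, as the gist does for TERM/INT.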
I made a file in my Rails app at bin/restart_resque.sh:
kill `cat tmp/pids/scheduler.pid`
When I execute bin/restart_resque.sh, I get the error:
: arguments must be process or job IDs624
and the process is still running.
Then I changed the file to:
kill 2624
I got the same error, but process 2624 does exist. Why?
You have an invalid PID in scheduler.pid, or the file does not exist.
Check the file and its permissions:
namei -lm tmp/pids/scheduler.pid
Check the PID (the running resque process must match the PID from the pid file):
cat tmp/pids/scheduler.pid
ps aux | grep `cat tmp/pids/scheduler.pid`
I found the issue.
When I execute file bin/resque_restart.sh, I get:
bin/resque_restart.sh: ASCII text, with CRLF line terminators
So the file format is the reason, but I don't know why.
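The reason is that with CRLF endings the carriage return becomes part of the kill argument (effectively "2624\r"), which the shell's kill rejects; the stray carriage return also makes the error message overwrite itself on screen, leaving the mangled "IDs624". A hedged sketch of the fix (GNU sed shown; dos2unix works as well):
sed -i 's/\r$//' bin/restart_resque.sh    # strip carriage returns in place (GNU sed)
# or: dos2unix bin/restart_resque.sh
file bin/restart_resque.sh                # should no longer report CRLF line terminators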