Pass a value to the grep command in Python

I am obtaining CPU and RAM statistics for the openvpn process by running the following command in a Python script on a Linux Debian 7 box.
>ps aux | grep openvpn
The output is parsed and sent to a zabbix monitoring server.
I currently use the following Python script called psperf.py.
If I want CPU% stats I run: psperf 2
>#!/usr/bin/env python
>
>import subprocess, sys, shlex
>
>psval=sys.argv[1] #ps aux val to extract such as CPU etc #2 = %CPU, 3 = %MEM, 4 = VSZ, 5 = RSS
>
>#https://stackoverflow.com/questions/6780035/python-how-to-run-ps-cax-grep-something-in-python
>proc1 = subprocess.Popen(shlex.split('ps aux'),stdout=subprocess.PIPE)
>proc2 = subprocess.Popen(shlex.split('grep openvpn'),stdin=proc1.stdout,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
>
>proc1.stdout.close() # Allow proc1 to receive a SIGPIPE if proc2 exits.
>out,err=proc2.communicate()
>
>#string stdout?
>output = (format(out))
>
>#create output list
>output = output.split()
>
>#make ps val an integer to enable list location
>psval = int(psval)
>
>#extract value to send to zabbix from output list
>val = output[psval]
>
>#OUTPUT
>print val
This script works fine for obtaining the data relating to openvpn. However, I now want to reuse the script by passing in the process name, so that I don't need a separate script for each individual process. For example, I might want CPU and RAM statistics for the zabbix process.
I have tried various solutions, including the following, but get an "index out of range" error.
For example I run: psperf 2 apache
>#!/usr/bin/env python
>
>import subprocess, sys, shlex
>
>psval=sys.argv[1] #ps aux val to extract such as CPU etc #2 = %CPU, 3 = %MEM, 4 = VSZ, 5 = RSS
>psname=sys.argv[2] #process details/name
>
>#https://stackoverflow.com/questions/6780035/python-how-to-run-ps-cax-grep-something-in-python
>proc1 = subprocess.Popen(shlex.split('ps aux'),stdout=subprocess.PIPE)
>proc2 = subprocess.Popen(shlex.split('grep', psname),stdin=proc1.stdout,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
>
>proc1.stdout.close() # Allow proc1 to receive a SIGPIPE if proc2 exits.
>out,err=proc2.communicate()
>
>#string stdout?
>output = (format(out))
>
>#create output list
>output = output.split()
>
>#make ps val an integer to enable list location
>psval = int(psval)
>
>#extract value to send to zabbix from output list
>val = output[psval]
>
>#OUTPUT
>print val
Error:
>root@Deb764opVPN:~# python /usr/share/zabbix/externalscripts/psperf.py 4 openvpn
>Traceback (most recent call last):
>  File "/usr/share/zabbix/externalscripts/psperf.py", line 25, in <module>
>    val = output[psval]
>IndexError: list index out of range
I haven't used the shlex module before; it is new to me. It was necessary in order to pipe the ps aux command to grep securely, avoiding shell=True, which is a security hazard (http://docs.python.org/2/library/subprocess.html).
I adapted the script from: How to run "ps cax | grep something" in Python?
I believe it's to do with how shlex handles my arguments, but I'm not too sure how to move forward.
Can you help? Specifically, how can I successfully pass a value to the grep command?
I can see this being beneficial to many others who pipe commands etc.
Regards
Aidan

I carried on researching and solved it using the following:
#!/usr/bin/env python
import subprocess, sys # , shlex
psval=sys.argv[1] #ps aux val to extract such as CPU etc #2 = %CPU, 3 = %MEM, 4 = VSZ, 5 = RSS
psname=sys.argv[2] #process details/name
#http://www.cyberciti.biz/tips/grepping-ps-output-without-getting-grep.html
proc1 = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE)
proc2 = subprocess.Popen(['grep', psname], stdin=proc1.stdout,stdout=subprocess.PIPE)
proc1.stdout.close() # Allow proc1 to receive a SIGPIPE if proc2 exits.
stripres = proc2.stdout.read()
#TEST RESULT
print stripres
#create output list
output = stripres.split()
#make ps val an integer to enable list location
psval = int(psval)
#extract value to send to zabbix from output list
val = output[psval]
#OUTPUT
print val
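For reference, here is a minimal sketch of the same idea without grep at all: filter the ps aux output directly in Python (assuming the same argument order, psperf.py <column> <name>):
#!/usr/bin/env python
# Sketch: filter `ps aux` output in Python instead of piping it to grep.
# Column indexes follow `ps aux`: 2 = %CPU, 3 = %MEM, 4 = VSZ, 5 = RSS.
import subprocess, sys

psval = int(sys.argv[1])  # column to extract
psname = sys.argv[2]      # process name to look for

out = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE).communicate()[0]

for line in out.splitlines():
    if psname in line:
        print line.split()[psval]
        break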
Regards
Aidan

Related

Python script has a problem with int() conversion with crontab

I'm checking the memory of my Raspberry Pi. It works fine.
But when I want to run it every minute with crontab, I get an error converting the string to int:
ValueError: invalid literal for int() with base 10: ''
My script.py :
from subprocess import Popen, PIPE  # import needed for Popen/PIPE

intmemused = 0
cmd = "top -n1 | grep 'Mem :'| awk '{print $6;}'"
output = Popen(cmd, shell=True, stdout=PIPE)
memused = output.communicate()[0].strip()
memused = str(memused.decode("utf-8"))
print(memused)  # prints 589020 when run by hand
intmemused = int(memused)  # error here when crontab executes my script
mem = intmemused * 100
mem = float(mem) / float(memtot)  # memtot (total memory) is set elsewhere in the script
mem = 100 - float(mem)
mem = round(mem, 2)
My crontab :
*/1 * * * * /home/dietpi/info.sh 2>/home/dietpi/marseille.log
My info.sh :
#!/bin/bash
/usr/bin/python3 /home/dietpi/script.py
marseille.log is created to log errors when the script is executed by crontab, and it contains:
TERM environment variable not set.
Traceback (most recent call last):
  File "/home/dietpi/config", line 56, in <module>
    intmemused = int(memused)
ValueError: invalid literal for int() with base 10: ''
When I saw this error I believed memused was empty, but it isn't: print gives 589020.
I thought it might be a blank character, but I've used .strip().
I thought TERM environment variable not set. might be the problem, but the command set | grep TERM gives a good answer: TERM=xterm.
I don't understand why it works with python3 but not with crontab.
Can you help me?
Thanks a lot!
MaxKweeger
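It may be relevant that under cron there is no terminal, so top -n1 (without batch mode) prints the TERM environment variable not set error and produces no data rows, leaving memused empty. A minimal sketch that avoids depending on top altogether by reading /proc/meminfo (assuming used = MemTotal - MemAvailable, which requires Linux 3.14+):
#!/usr/bin/env python3
# Sketch: compute the used-memory percentage from /proc/meminfo instead of
# parsing `top`, so it also works without a terminal (e.g. under cron).

def meminfo():
    info = {}
    with open('/proc/meminfo') as f:
        for line in f:
            key, value = line.split(':', 1)
            info[key] = int(value.split()[0])  # values are in kB
    return info

m = meminfo()
used_pct = round(100.0 * (m['MemTotal'] - m['MemAvailable']) / m['MemTotal'], 2)
print(used_pct)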

Count the number of running process with Telegraf

I'm using Telegraf, InfluxDB and Grafana to build a monitoring system for a distributed application. The first thing I want to do is count the number of Java processes running on a machine.
But when I make my request, the number of processes is nearly random (somewhere between 1 and 8 instead of always 8).
I think there is a mistake in my Telegraf configuration, but I don't see where. I tried changing the interval but nothing was different: it seems InfluxDB doesn't have all the data.
I'm running centos 7 and Telegraf v1.5.0 (git: release-1.5 a1668bbf)
All the Java processes I want to count:
[root@localhost ~]# pgrep -f java
10665
10688
10725
10730
11104
11174
16298
22138
My telegraf.conf :
[global_tags]
# Configuration for telegraf agent
[agent]
interval = "5s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
debug = true
quiet = false
logfile = "/var/log/telegraf/telegraf.log"
hostname = "my_server"
omit_hostname = false
My input.conf :
# Read metrics about disk usage
[[inputs.disk]]
fielddrop = [ "inodes*" ]
mount_points=["/", "/workspace"]
# File
[[inputs.filestat]]
files = ["myfile.log"]
# Read the number of running java process
[[inputs.procstat]]
user = "root"
pattern = "java"
My query and its response are shown in screenshots (not reproduced here).
If you just want to count PIDs, a good way is to use the exec plugin, like this:
[[inputs.exec]]
commands = ["pgrep -c java"] # command to execute
name_override = "the_name" # measurement name
data_format = "value" # parse the output as a single value
data_type = "integer"
For commands, use pgrep -c java rather than pgrep -c -f java: -f matches against the full command line, so it can also match other processes that merely mention java in their arguments (and you would have almost the same problem as with procstat).
Solution found here
With pattern matching, if the pattern matches multiple PIDs, multiple data points are generated with identical tags and timestamps. When these points are sent to InfluxDB, only the last point is stored.
Example of what may happen with your configuration:
00:00 => pid 1
00:05 => pid 2
00:10 => pid 1
00:15 => pid 5
00:20 => pid 7
00:25 => pid 3
00:30 => pid 3
00:35 => pid 4
00:40 => pid 6
00:45 => pid 7
00:50 => pid 6
00:55 => pid 5
Distinct PIDs stored over one minute = 7 (pid 8 was not stored a single time).
Since it's random, you sometimes hit all 8 different PIDs in a minute, but most of the time you don't.
To differentiate between processes whose tags are otherwise the same, use pid_tag = true:
[[inputs.procstat]]
user = "root"
pattern = "java"
pid_tag = true
However, if you just want to count the number of processes (and don't care about the per-process stats), just use the exec plugin with a custom command like pgrep -c -f java. This is more efficient than having multiple time series (with pid_tag you end up with one per PID).

Run command on GPS fix

I have GPSD running on a Linux system (specifically SkyTraq Venus 6 on a Raspberry Pi 3, but that shouldn't matter). Is there a way to trigger a command when the GPS first acquires or loses the 3D fix, almost like the scripts in /etc/network/if-up.d and /etc/network/if-down.d?
I found a solution:
Step 1: With GPSD running, gpspipe -w outputs JSON data, documented here. The TPV class has a mode value, which can take one of these values:
0=unknown mode
1=no fix
2=2D fix
3=3D fix
Step 2: Write a little program called gpsfix.py:
#!/usr/bin/env python
import sys
import errno
import json
modes = {
    0: 'unknown',
    1: 'nofix',
    2: '2D',
    3: '3D',
}

try:
    while True:
        line = sys.stdin.readline()
        if not line: break  # EOF
        sentence = json.loads(line)
        if sentence['class'] == 'TPV':
            sys.stdout.write(modes[sentence['mode']] + '\n')
            sys.stdout.flush()
except IOError as e:
    if e.errno == errno.EPIPE:
        pass
    else:
        raise e
For every TPV sentence, gpspipe -w | ./gpsfix.py will print the mode.
Step 3: Use grep 3D -m 1 to wait for the first fix, and then quit (which sends SIGPIPE to all other processes in the pipe).
gpspipe -w | ./gpsfix.py | grep 3D -m 1 will print 3D on the first fix.
Step 4: Put it in a bash script:
#!/usr/bin/env bash
# Wait for first 3D fix
gpspipe -w | ./gpsfix.py | grep 3D -m 1
# Do something nice
cowsay "TARGET LOCATED"
And run it:
$ ./act_on_gps_fix.sh
3D
 ________________
< TARGET LOCATED >
 ----------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
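If you also want to run a command when the fix is lost again (the if-up.d / if-down.d analogy from the question), a variation of gpsfix.py can watch for mode transitions and call a script on each change. A sketch, where on_fix.sh and on_fix_lost.sh are hypothetical scripts you would provide:
#!/usr/bin/env python
# Sketch: run a command when the 3D fix is acquired or lost.
# Usage: gpspipe -w | ./gpsfix_watch.py
import sys
import json
import subprocess

have_fix = False
while True:
    line = sys.stdin.readline()
    if not line:
        break  # EOF
    try:
        sentence = json.loads(line)
    except ValueError:
        continue  # skip anything that is not valid JSON
    if sentence.get('class') != 'TPV':
        continue
    fix = sentence.get('mode', 0) == 3
    if fix and not have_fix:
        subprocess.call(['./on_fix.sh'])       # hypothetical "fix acquired" hook
    elif not fix and have_fix:
        subprocess.call(['./on_fix_lost.sh'])  # hypothetical "fix lost" hook
    have_fix = fix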

The run result of Flume and how to test Flume

(Screenshots omitted.)
My Flume configuration file is as follows:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /home/hadoop/flume-1.5.0-bin/log_exec_tail
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
And I start my Flume agent with the following script:
bin/flume-ng agent -n a1 -c conf -f conf/flume_log.conf -Dflume.root.logger=INFO,console
Question 1: the run result is shown in the screenshots above; however, I don't know whether it ran successfully or not!
Question 2: there are also the following instructions, and I don't understand what they mean about testing Flume:
NOTE: To test that the Flume agent is running properly, open a new terminal window and change directories to /home/horton/solutions/:
horton@ip:~$ cd /home/horton/solutions/
Run the following script, which writes log entries to nodemanager.log:
$ ./test_flume_log.sh
If successful, you should see new files in the /user/horton/flume_sink directory in HDFS
Stop the logagent Flume agent
As per your Flume configuration, whenever the file /home/hadoop/flume-1.5.0-bin/log_exec_tail is appended to, the exec source tails it and the logger sink prints the new lines to the console.
So to test that it is working correctly:
1. Run the command bin/flume-ng agent -n a1 -c conf -f conf/flume_log.conf -Dflume.root.logger=INFO,console
2. Open a terminal and add a few lines to the file /home/hadoop/flume-1.5.0-bin/log_exec_tail
3. Save it
4. Now check the terminal where you started the flume-ng command
5. You will see the newly added lines displayed
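A minimal sketch of step 2, appending a few test lines to the tailed file from Python (any way of appending to that file works just as well, e.g. echo ... >> from a shell):
#!/usr/bin/env python
# Sketch: append test events to the file tailed by the Flume exec source.
# The path matches a1.sources.r1.command in the configuration above.
import time

with open('/home/hadoop/flume-1.5.0-bin/log_exec_tail', 'a') as f:
    for i in range(5):
        f.write('test event %d\n' % i)
        f.flush()          # make each line visible to `tail -F` immediately
        time.sleep(1)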

Erlang: how to start an external script in Linux

I want to run an external script and get the PID of the process (once it starts) from my erlang program. Later, I will want to send TERM signal to that PID from erlang code. How do I do it?
I tried this
P = os:cmd("myscript &"),
io:format("Pid = ~s ~n",[P]).
It starts the script in the background as expected, but I don't get the PID.
Update
I made the below script (loop.pl) for testing:
while(1){
    sleep 1;
}
Then I tried to spawn the script using open_port. The script runs OK, but erlang:port_info/2 throws an exception:
2> Port = open_port({spawn, "perl loop.pl"}, []).
#Port<0.504>
3> {os_pid, OsPid} = erlang:port_info(Port, os_pid).
** exception error: bad argument
in function erlang:port_info/2
called as erlang:port_info(#Port<0.504>,os_pid)
I checked that the script is running:
$ ps -ef | grep loop.pl
root 10357 10130 0 17:35 ? 00:00:00 perl loop.pl
You can open a port using spawn or spawn_executable, and then use erlang:port_info/2 to get its OS process ID:
1> Port = open_port({spawn, "myscript"}, PortOptions).
#Port<0.530>
2> {os_pid, OsPid} = erlang:port_info(Port, os_pid).
{os_pid,91270}
3> os:cmd("kill " ++ integer_to_list(OsPid)).
[]
Set PortOptions as appropriate for your use case.
As the last line above shows, you can use os:cmd/1 to kill the process if you wish.
