Jenkins - Posting results to a external monitoring job is adding garbage to the build job log - jenkins

I have a external monitor job that I'm pushing the result of another job to it with curl and base on this link :
Monitoring external jobs
After I create the job I just need to run a curl command with the body encoded in HEX to the specified url and then a build will be created and the output will be added to it but what I get instead is part of my output in clear text and the rest in weird characters like so :
Started
Asking akamai to purge this urls:
http://xxx/sites/all/modules/custom/uk.png http://aaaaaasites/all/modules/custom/flags/jp.png
<html><head><title>401 Unauthorized</title> </h�VC��&�G����CV�WF��&��VC�������R&R��BWF��&��VBF�66W72F�B&W6�W&6S�����&�G�����F����F�RW&�F �6�V6�7FGW2�bF�R&WVW7B�2��F�RF��RF�v�B�2��6�Ɩ�r&6�w&�V�B��"F�6�V6�7FGW2�bF�RF�6�W#�v�F��rf�"���F�W&vRF��6O request please keep in mind this is an estimated time
Waiting for another 60 seconds
Asking akamai to purge this urls:
...
..
..
This is how I'm doing it :
export output=`cat msg.out|xxd -c 256 -ps`
curl -k -X POST -d "<run><log encoding=\"hexBinary\">$output</log><result>0</result> <duration>2000</duration></run>" https://$jenkinsuser:$jenkinspass#127.0.0.1/jenkins/job/akamai_purge_results/postBuildResult -H'.crumb:c775f3aa15464563456346e'
If I cat that file is all fine and even if I edit it with vi I can't see any problem with it.
Do you guys have any idea how to fix this ?
Could it be a problem with the hex encoding ? ( I tried hex/enc/dec pages with the result of xxd and they look fine)
Thanks.

I had the same issue, and stumbled across this: http://blog.markfeeney.com/2010/01/hexbinary-encoding.html
From that page, you can get the encoding you need via this command:
echo "Hello world" | hexdump -v -e '1/1 "%02x"'
48656c6c6f20776f726c640a
An excerpt from the explanation:
So what the hell is that? -v means don't suppress any duplicate data
in the output, and -e is the format string. hexdump's very particular
about the formatting of the -e argument; so careful with the quotes.
The 1/1 means for every 1 byte encountered in the input, apply the
following formatting pattern 1 time. Despite this sounding like the
default behaviour in the man page, the 1/1 is not optional. /1 also
works, but the 1/1 is very very slightly more readable, IMO. The
"%02x" is just a standard-issue printf-style format code.
So in your case, you would do this (removing 'export' in favor of inline variable)
OUTPUT=`cat msg.out | hexdump -v -e '1/1 "%02x"'` curl -k -X POST -d "<run><log encoding=\"hexBinary\">$OUTPUT</log><result>0</result> <duration>2000</duration></run>" https://$jenkinsuser:$jenkinspass#127.0.0.1/jenkins/job/akamai_purge_results/postBuildResult -H'.crumb:c775f3aa15464563456346e'

Related

Vertica's vsql.exe returns errorlevel 0 when facing ERROR 3326: Execution time exceeded run time cap

I am using vsql.exe on an external Vertica database for which I don't have any administrative access. I use some views with simple SELECT+FROM+WHERE queries.
These queries 90% of the time work just fine, but some times, randomly, I get this error:
ERROR 3326:  Execution time exceeded run time cap of 00:00:45
The strange thing is that this error can happen way after those 45 seconds, even after 3 minutes. I've been told this is related to having different resource pools, but anyway I don't want to dig into that.
The problem is that when this occurs, vsql.exe returns errorlevel 0 and there is (apparently almost) no way to know this failed.
The output of the query is stored in a csv file. When it succeeds, it ends with (#### rows). But when it fails with this error, it just stops at any point of the csv, and its resulting size is around half of what's expected. This is of course not what you would expect when an error occurs, like no output or an empty one.
If there is a connection error or if the query has syntax errors, the errorlevel is not 0, so in those cases it behaves as expected.
I've tried many things, like increasing the timeout or adding -v ON_ERROR_STOP=ON to the vsql.exe parameters, but none of that helped.
I've googled a lot and found many people having this error, but the solutions are mostly related to increasing the timeouts, not related to the errorlevel returned.
Any help will be greatly appreciated.
TL;DR: how can I detect an error 3326 in a batch file like this?
#echo off
vsql.exe -h <hostname> -U <user> -w <pwd> -o output.cs -Ac "SELECT ....;"
echo %errorlevel% is always 0
if errorlevel 1 echo Error!! But this is never displayed.
Now that's really unexpected to me. I don't have Windows available just now, but trying on my Mac - at first just triggering a deliberate error:
$ vsql -h zbook -d sbx -U dbadmin -w $VSQL_PASSWORD -v ON_ERROR_STOP=ON -Ac "select * from foobarfoo"
ERROR 4566: Relation "foobarfoo" does not exist
$ echo $?
1
With ON_ERROR_STOP set to ON, this should be the behaviour everywhere.
Could you try what I did above through Windows, just with echo %ERRORLEVEL% instead of echo $?, just from the Windows command prompt and not in a batch file?
Next test: I run on resource pool general in my little test database, so I temporarily modify it to a runtime cap of 30 sec, run a silly query that will take over 30 seconds with ON_ERROR_STOP set to ON, collect the value returned by vsql and set the runtime cap of general back to NONE. I also have the %VSQL_* % env variables set so I don't have to repeat them all the time:
rem Windows way to set environment variables for vsql:
set VSQL_HOST=zbook
set VSQL_DATABASE=sbx
set VSQL_USER=dbadmin
set VSQL_PASSWORD=***masked***
Now for the test (backslashes, in Linux/MacOs escape a new line, which enables you to "word wrap" a shell command. Use the caret (^) in Windows for that):
marco ~/1/Vertica/supp $ # set a runtime cap
marco ~/1/Vertica/supp $ vsql -i -c \
"alter resource pool general runtimecap '00:00:30'"
ALTER RESOURCE POOL
Time: First fetch (0 rows): 116.326 ms. All rows formatted: 116.730 ms
marco ~/1/Vertica/supp $ vsql -v ON_ERROR_STOP=ON -iAc \
"select count(*) from one_million_rows a cross join one_million_rows b"
ERROR 3326: Execution time exceeded run time cap of 00:00:30
marco ~/1/Vertica/supp $ # test the return code
marco ~/1/Vertica/supp $ echo $?
1
marco ~/1/Vertica/supp $ # clear the runtime cap
marco ~/1/Vertica/supp $ vsql -i -c \
"alter resource pool general runtimecap NONE "
ALTER RESOURCE POOL
Time: First fetch (0 rows): 11.148 ms. All rows formatted: 11.383 ms
So it works in my case. Your line:
if errorlevel 1 echo Error!! But this is never displayed.
... never echoes anything because the previous line, with echo will return 0 to the shell, overriding the previous errorlevel.
Try it command by command on your Windows command prompt, and see what happens. Just echo %errorlevel%, without evaluating it.
And I notice that you are trying to export to CSV format. Then, try this:
Format the output unaligned (-A)
set the field separator to comma (-F ',')
remove the footer '(n rows)' (-P footer)
limit the output to 5 rows in the query for test
(I show the output before redirecting to file):
marco ~/1/Vertica/supp $ vsql -A -F ',' -P footer -c "select * from one_million_rows limit 5"
id,id_desc,dob,category,busid,revenue
0,0,1950-01-01,1,====== boss ========,0.000
1,-1,1950-01-02,2,kbv-000001kbv-000001,0.010
2,-2,1950-01-03,3,kbv-000002kbv-000002,0.020
3,-3,1950-01-04,4,kbv-000003kbv-000003,0.030
4,-4,1950-01-05,5,kbv-000004kbv-000004,0.040
Not aligning is much faster than aligning.
Then, as you spend most time in the fetching of the rows (that's because you get a timeout in the middle of an output file write process), try fetching more rows at a time than the default 1000. You will need to play with the value, depending on the network settings at your site until you get your best value:
-v ROWS_AT_A_TIME=10000
Once you're happy with the tested output, try this command (change the SELECT for your needs, of course ....):
marco ~/1/Vertica/supp $ vsql -A -F ',' -P footer \
-v ON_ERROR_STOP=ON -v ROWS_AT_A_TIME=10000 -o one_million_rows.csv \
-c "select * from one_million_rows"
marco ~/1/Vertica/supp $ wc -l one_million_rows.csv
1000001 one_million_rows.csv
The table actually contains one million rows. Note the line count in the file: 1,000,001. That's the title line included, but the footer (1000000 rows) removed.

Curl a URL list from a file and make it faster with parallel

Right now i'm using the followin code:
while read num;
do M=$(curl "myurl/$num")
echo "$M"
done < s.txt
where s.txt contains a list (1 per line) of a part of the url.
Is it correct to assume that curl is running sequentially?
Or is it running in thread/jobs/multiple conn at a time?
I've found this online:
parallel -k curl -s "http://example.com/locations/city?limit=100\&offset={}" ::: $(seq 100 100 30000) > out.txt
The problem is that my sequence is coming from a file or from a variable (one element per line) and i can't adapt it to my needs
I've not fully understood how to pass the list to parallel
Should i save all the curl commands in the list and run it with parallel -a ?
Regards,
parallel -j100 -k curl myurl/{} < s.txt
Consider spending an hour walking through man parallel_tutorial. Your command line will love you for it.

parse maven output in real time using sed

I am trying to parse my mvn verify output to only show lines with INFO tags. Please note that maven outputs line to stdout in real time and not by batch. I do not think that it is a problem with maven.
At first I tried to do it with grep:
$ mvn verify | grep INFO
but didn't seem to output lines in real time, as I understand grep buffers its lines before outputting, so I have to wait a few seconds between each flush and then I have tens of lines being printed at the same time, not very convenient. Then I thought I would try with sed.
According to this link, the following command:
sed -n '/PATTERN/p' file
// is equivalent to
grep PATTERN file
and according to this link, the -l option should force sed to flush its output buffer after every newline. So now I am using this command:
$ mvn verify | sed -ln -e '/INFO/p'
but I'm still getting the same result as before, I get a ton of output flushed every 30s or so and I don't know what I've done wrong. Can someone point me in the right direction please?
Try this, if your grep supports it:
mvn verify | grep --line-buffered INFO
If you're doing this in a terminal and still seeing buffered results, it would probably be something earlier than grep doing the buffering, but I'm not familiar with mvn. (And, yes, the -l option to sed should have done the same thing, so the problem may be upstream.)
try this line:
mvn verify | while read line; do echo $line|grep INFO; done
I found what was the problem, I was using a script to colorise maven output (see here) and in fact it was that script that was buffering the output down the pipe. I forgot about it as I was using it as an alias, I guess this is a good lesson, I won't alias as easily in the future. Anyway here is the fix, I changed -e to -le in the last line of the sed call:
mvn $# | sed -e "s/\(\[INFO\]\ \-.*\)/${TEXT_BLUE}${BOLD}\1/g" \
-e "s/\(\[INFO\]\ \[.*\)/${RESET_FORMATTING}${BOLD}\1${RESET_FORMATTING}/g" \
-e "s/\(\[INFO\]\ BUILD SUCCESSFUL\)/${BOLD}${TEXT_GREEN}\1${RESET_FORMATTING}/g" \
-e "s/\(\[WARNING\].*\)/${BOLD}${TEXT_YELLOW}\1${RESET_FORMATTING}/g" \
-e "s/\(\[ERROR\].*\)/${BOLD}${TEXT_RED}\1${RESET_FORMATTING}/g" \
-le "s/Tests run: \([^,]*\), Failures: \([^,]*\), Errors: \([^,]*\), Skipped: \([^,]*\)/${BOLD}${TEXT_GREEN}Tests run: \1${RESET_FORMATTING}, Failures: ${BOLD}${TEXT_RED}\2${RESET_FORMATTING}, Errors: ${BOLD}${TEXT_RED}\3${RESET_FORMATTING}, Skipped: ${BOLD}${TEXT_YELLOW}\4${RESET_FORMATTING}/g"
In effect this is telling sed to flush its output at every new line, which is what I wanted. I am sorry I didn't find another workaround that is more generic. I tried playing around with empty (see man page) and script but none of these solutions worked for me.

Monitoring URLs with Nagios

I'm trying to monitor actual URLs, and not only hosts, with Nagios, as I operate a shared server with several websites, and I don't think its enough just to monitor the basic HTTP service (I'm including at the very bottom of this question a small explanation of what I'm envisioning).
(Side note: please note that I have Nagios installed and running inside a chroot on a CentOS system. I built nagios from source, and have used yum to install into this root all dependencies needed, etc...)
I first found check_url, but after installing it into /usr/lib/nagios/libexec, I kept getting a "return code of 255 is out of bounds" error. That's when I decided to start writing this question (but wait! There's another plugin I decided to try first!)
After reviewing This Question that had almost practically the same problem I'm having with check_url, I decided to open up a new question on the subject because
a) I'm not using NRPE with this check
b) I tried the suggestions made on the earlier question to which I linked, but none of them worked. For example...
./check_url some-domain.com | echo $0
returns "0" (which indicates the check was successful)
I then followed the debugging instructions on Nagios Support to create a temp file called debug_check_url, and put the following in it (to then be called by my command definition):
#!/bin/sh
echo `date` >> /tmp/debug_check_url_plugin
echo $* /tmp/debug_check_url_plugin
/usr/local/nagios/libexec/check_url $*
Assuming I'm not in "debugging mode", my command definition for running check_url is as follows (inside command.cfg):
'check_url' command definition
define command{
command_name check_url
command_line $USER1$/check_url $url$
}
(Incidentally, you can also view what I was using in my service config file at the very bottom of this question)
Before publishing this question, however, I decided to give 1 more shot at figuring out a solution. I found the check_url_status plugin, and decided to give that one a shot. To do that, here's what I did:
mkdir /usr/lib/nagios/libexec/check_url_status/
downloaded both check_url_status and utils.pm
Per the user comment / review on the check_url_status plugin page, I changed "lib" to the proper directory of /usr/lib/nagios/libexec/.
Run the following:
./check_user_status -U some-domain.com.
When I run the above command, I kept getting the following error:
bash-4.1# ./check_url_status -U mydomain.com
Can't locate utils.pm in #INC (#INC contains: /usr/lib/nagios/libexec/ /usr/local/lib/perl5 /usr/local/share/perl5 /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5 /usr/share/perl5) at ./check_url_status line 34.
BEGIN failed--compilation aborted at ./check_url_status line 34.
So at this point, I give up, and have a couple of questions:
Which of these two plugins would you recommend? check_url or check_url_status?
(After reading the description of check_url_status, I feel that this one might be the better choice. Your thoughts?)
Now, how would I fix my problem with whichever plugin you recommended?
At the beginning of this question, I mentioned I would include a small explanation of what I'm envisioning. I have a file called services.cfg which is where I have all of my service definitions located (imagine that!).
The following is a snippet of my service definition file, which I wrote to use check_url (because at that time, I thought everything worked). I'll build a service for each URL I want to monitor:
###
# Monitoring Individual URLs...
#
###
define service{
host_name {my-shared-web-server}
service_description URL: somedomain.com
check_command check_url!somedomain.com
max_check_attempts 5
check_interval 3
retry_interval 1
check_period 24x7
notification_interval 30
notification_period workhours
}
I was making things WAY too complicated.
The built-in / installed by default plugin, check_http, can accomplish what I wanted and more. Here's how I have accomplished this:
My Service Definition:
define service{
host_name myers
service_description URL: my-url.com
check_command check_http_url!http://my-url.com
max_check_attempts 5
check_interval 3
retry_interval 1
check_period 24x7
notification_interval 30
notification_period workhours
}
My Command Definition:
define command{
command_name check_http_url
command_line $USER1$/check_http -I $HOSTADDRESS$ -u $ARG1$
}
The better way to monitor urls is by using webinject which can be used with nagios.
The below problem is due to the reason that you dont have the perl package utils try installing it.
bash-4.1# ./check_url_status -U mydomain.com Can't locate utils.pm in #INC (#INC contains:
You can make an script plugin. It is easy, you only have to check the URL with something like:
`curl -Is $URL -k| grep HTTP | cut -d ' ' -f2`
$URL is what you pass to the script command by param.
Then check the result: If you have an code greater than 399 you have a problem, else... everything is OK! THen an right exit mode and the message for Nagios.

unix at command pass variable to shell script?

I'm trying to setup a simple timer that gets started from a Rails Application. This timer should wait out its duration and then start a shell script that will start up ./script/runner and complete the initial request. I need script/runner because I need access to ActiveRecord.
Here's my test lines in Rails
output = `at #{(Time.now + 60).strftime("%H:%M")} < #{Rails.root}/lib/parking_timer.sh STRING_VARIABLE`
return render :text => output
Then my parking_timer.sh looks like this
#!/bin/sh
~/PATH_TO_APP/script/runner -e development ~/PATH_TO_APP/lib/ParkingTimer.rb $1
echo "All Done"
Finally, ParkingTimer.rb reads the passed variable with
ARGV.each do|a|
puts "Argument: #{a}"
end
The problem is that the Unix command "at" doesn't seem to like variables and only wants to deal with filenames. I either get one of two errors depending on how I position "s
If I put quotes around the right hand side like so
... "~/PATH_TO_APP/lib/parking_timer.sh STRING_VARIABLE"
I get,
-bash: ~/PATH_TO_APP/lib/parking_timer.sh STRING_VARIABLE: No such file or directory
I I leave the quotes out, I get,
at: garbled time
This is all happening on a Mac OS 10.6 box running Rails 2.3 & Ruby 1.8.6
I've already messed around w/ BackgrounDrb, and decided its a total PITA. I need to be able to cancel the job at any time before it is due.
After playing around with irb a bit, here's what I found.
The backtick operator invokes the shell after ruby has done any interpretation necessary. For my test case, the strace output looked something like this:
execve("/bin/sh", ["sh", "-c", "echo at 12:57 < /etc/fstab"], [/* 67 vars */]) = 0
Since we know what it's doing, let's take a look at how your command will be executed:
/bin/sh -c "at 12:57 < RAILS_ROOT/lib/parking_timer.sh STRING_VARIABLE"
That looks very odd. Do you really want to pipe parking_timer.sh, the script, as input into the at command?
What you probably ultimately want is something like this:
/bin/sh -c "RAILS_ROOT/lib/parking_timer.sh STRING_VARIABLE | at 12:57"
Thus, the output of the parking_timer.sh command will become the input to the at command.
So, try the following:
output = `#{Rails.root}/lib/parking_timer.sh STRING_VARIABLE | at #{(Time.now + 60).strftime("%H:%M")}`
return render :text => output
You can always use strace or truss to see what's happening. For example:
strace -o strace.out -f -ff -p $IRB_PID
Then grep '^exec' strace.out* to see where the command is being executed.

Resources