How to download a huge console output from Jenkins

My job executes ansible-playbook in debug mode (ansible-playbook -vvv), which generates a lot of output.
After the job finishes, it's very difficult to search the output in the browser because the page is very slow and keeps freezing.
I tried to download it with curl/wget, but the file is incomplete (I guess only about 10% was downloaded):
curl http://j:8080/job/my-job/5/consoleText -O
wget http://j:8080/job/my-job/5/consoleText
curl returns with error:
curl: (18) transfer closed with outstanding read data remaining

Try the following.
curl -u "admin":"admin" "http://localhost:8080/job/my-job/5/logText/progressiveText?start=0"
If that doesn't work, I think your best option is to get the log from the server itself. The log can be found at ${JENKINS_HOME}/jobs/${JOB_NAME}/builds/${BUILD_NUMBER}/log.
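If you do have HTTP access but a single request keeps getting cut off, a rough sketch of a workaround (untested against your server; host, credentials, and job path are placeholders) is to use the progressive-log endpoint's start byte offset to resume from the bytes you already have:
#!/usr/bin/env bash
URL="http://localhost:8080/job/my-job/5/logText/progressiveText"
OUT=full-console.log
: > "$OUT"
while :; do
    # Resume from however many bytes we have actually received so far
    START=$(stat --printf="%s" "$OUT")
    curl -sf -u "admin:admin" "$URL?start=$START" >> "$OUT" || { sleep 2; continue; }
    # If a successful request added nothing new, we have the whole log
    NEW=$(stat --printf="%s" "$OUT")
    [ "$NEW" -eq "$START" ] && break
done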

Related

How to fail an AppCenter build on script exit code?

I've got a bunch of scripts that get called when appcenter-pre-build.sh runs. For example, one of them is a simple check to see if the current branch tag already exists on the repository.
#!/usr/bin/env bash
set -e # Exit immediately if a command exits with a non-zero status (failure)
# Fetch tags
git fetch --tags
# See if the tag exists
if git tag --list | grep -Eq "^$VERSION_TAG$"
then
    echo "Error: Found tag. Exiting."
    exit 1
else
    git tag "$VERSION_TAG"
    git push origin "$VERSION_TAG"
fi
If the tag is found, I want to abort the build in AppCenter and fail it. This worked perfectly fine when I was running everything through Xcode Server, but for some reason I cannot figure out how to abort the build upon failure of my script. I'm not seeing much documentation on this particular subject, and the AppCenter folks over at Microsoft are taking their sweet time getting back to me.
Anyone have experience with this and/or know how to fail an AppCenter build from their scripts? Thanks in advance for your thoughts!
Okay, figured it out. Sending a curl request to cancel the build using the env variable $APPCENTER_BUILD_ID takes care of the issue. Exiting your script with a non-zero status does NOT work inside AppCenter.
Here's a sample of what to do. I just put it in a special "cancelAppCenterBuild.sh" script and called it in place of my exits.
API_TOKEN="<YourAppToken>"
OWNER_NAME="<YourOwnerOrOrganizationName>"
APP_NAME="<YourAppName>"
curl -iv "https://appcenter.ms/api/v0.1/apps/$OWNER_NAME/$APP_NAME/builds/$APPCENTER_BUILD_ID" \
-X PATCH \
-d "{\"status\":\"cancelling\"}" \
--header 'Content-Type: application/json' \
--header "X-API-Token: $API_TOKEN"
Pro tip: If you've ever renamed your app, AppCenter's servers can have issues referencing the new name. I was getting a 403 with a forbidden message. You might have to change your app name back to whatever the original name was, or just rebuild the app from scratch within AppCenter.

Firebase crash batch upload fails for unknown reason

I'm trying to upload dSYMs to Firebase, which worked perfectly until a few days ago. When I start the script it starts logging, but a few lines in it gets stuck for a few minutes and then fails:
/Users/..../dSYMs/DF...C47.dSYM/Contents/Resources/DWARF/leveldb: warning: function at offset 0x51662 has no name
./Pods/FirebaseCrash/upload-sym-util.bash:365: error: upload: Unable to upload symbol file (reason unknown).
The interesting thing is that the Firebase console tells me the upload was successful:
Future stack traces for UUID B4...AAF will
be symbolicated using the uploaded symbol file.
But it never is, because I've "uploaded" a few like this, and since then I've had a few more crashes, but they are still not symbolicated...
What's going on?
FYI: I've been using Firebase Crash Reporting since February, and it worked nicely. I updated my Mac to High Sierra a few days ago.
Thanks
TL;DR
Look for the following line in upload-sym-util.bash:
HTTP_STATUS=$(curl ${CURLOPT} -sfL -H 'Content-Type: text/plain' -H "Authorization: Bearer ${BEARER_TOKEN}" -w '%{http_code}' -T "${FILE}" "${UPLOAD_URL}")
And append --http1.1 at the end so that it becomes:
HTTP_STATUS=$(curl ${CURLOPT} -sfL -H 'Content-Type: text/plain' -H "Authorization: Bearer ${BEARER_TOKEN}" -w '%{http_code}' -T "${FILE}" "${UPLOAD_URL}" --http1.1)
Explanation
We have been having this issue when uploading the dSYM files to Firebase via Xcode. What was driving us crazy was that the process seemed to randomly succeed or fail, and when the upload failed, it did so only after a few minutes.
We managed to run the offending curl command manually and discovered that it was returning an HTTP status code of 000, which seems to happen when the connection is closed before the server returns anything (e.g. a timeout). By using the --verbose argument we discovered that curl was aborting the call with an INTERNAL_ERROR (err 2), which seemed to be linked to the use of HTTP/2. We managed to confirm this when we found out that the only machine able to upload the dSYM files correctly had the same version of curl as everyone else but with no HTTP/2 support, which was apparently added in High Sierra. We forced curl to use HTTP/1.1 and it did the trick.

Docker. How to resume downloading image when interrupted?

How can I resume a pull when disconnected? The pull process always starts from the beginning every time I run docker pull some-image again after a disconnect. My connection is so unstable that even downloading a 100MB image takes very long and almost always fails, so it is nearly impossible for me to pull a bigger image. How can I resume the pull process?
Update:
The pull process will now automatically resume based on which layers have already been downloaded. This was implemented with https://github.com/moby/moby/pull/18353.
Old:
There is no resume feature yet. However, there are discussions around this feature being implemented with Docker's download manager.
Docker's released code isn't as up to date as the moby development repository on GitHub. People have been having issues relating to this for several years. I tried to manually apply several patches which aren't in the upstream yet, and none worked decently.
The GitHub repository for moby (Docker's development repo) has a script called download-frozen-image-v2.sh. This script uses bash, curl, and other things like command-line JSON interpreters. It retrieves a Docker token and then downloads all of the layers to a local directory. You can then use docker load to insert them into your local Docker installation.
It does not do well with resume, though. There is a comment in the script noting that curl -C isn't working. I tracked down and fixed this problem. I made a modification which first retrieves a ".headers" file (which has always returned a 302 while I've been monitoring), and then retrieves the final URL with curl (plus resume support) into the layer tar file. It also has to loop in the calling function which retrieves a valid token, which unfortunately only lasts about 30 minutes.
It loops this process until it receives a 416 stating that no resume is possible since the requested ranges have already been fulfilled. It also verifies the size against a curl header retrieval. I have been able to retrieve all the images I needed using this modified script. Docker has many more layers relating to retrieval, and has remote control processes (the Docker client) which make it more difficult to control, and they viewed this issue as only affecting some people on bad connections.
I hope this script can help you as much as it has helped me:
Changes:
The fetch_blob function uses a temporary file for its first connection, then retrieves the 30x HTTP redirect from it. It attempts a header retrieval on the final URL and checks whether the local copy already has the full file; otherwise, it begins a resuming curl operation. The calling function which passes it a valid token has a loop surrounding the token retrieval and fetch_blob, which ensures the full file is obtained.
The only other variation is a bandwidth limit variable which can be set at the top, or via a "BW:10" command line parameter. I needed this to keep my connection usable for other operations.
Click here for the modified script.
In the future it would be nice if Docker's internal client performed resuming properly. Increasing the token's validity period would also help tremendously.
Brief views of change code:
#loop until FULL_FILE is set in fetch_blob.. this is for bad/slow connections
while [ "$FULL_FILE" != "1" ]; do
    local token="$(curl -fsSL "$authBase/token?service=$authService&scope=repository:$image:pull" | jq --raw-output '.token')"
    fetch_blob "$token" "$image" "$layerDigest" "$dir/$layerTar" --progress
    sleep 1
done
Another section from fetch_blob:
while :; do
    # If the file already exists, we will be resuming
    if [ -f "$targetFile" ]; then
        # Get the current size of the file we are resuming
        CUR=$(stat --printf="%s" "$targetFile")
        # Use curl to get the headers and find the content-length of the full file
        LEN=$(curl -I -fL "${curlArgs[@]}" "$blobRedirect" | grep -i content-length | cut -d" " -f2)
        # If we already have the entire file, stop curl from erroring with 416
        if [ "$CUR" == "${LEN//[!0-9]/}" ]; then
            FULL_FILE=1
            break
        fi
    fi
    HTTP_CODE=$(curl -w '%{http_code}' -C - --tr-encoding --compressed --progress-bar -fL "${curlArgs[@]}" "$blobRedirect" -o "$targetFile")
    if [ "$HTTP_CODE" == "403" ]; then
        # Token expired, so the server stopped allowing us to resume; return without
        # setting FULL_FILE and the caller will restart this function with a new token
        FULL_FILE=0
        break
    fi
    if [ "$HTTP_CODE" == "416" ]; then
        # 416 Range Not Satisfiable: the whole file has already been downloaded
        FULL_FILE=1
        break
    fi
    sleep 1
done
Try this:
ps -ef | grep docker
Get the PIDs of all the docker pull commands and do a kill -9 on them. Once killed, re-issue the docker pull <image>:<tag> command.
This worked for me!
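A shorter equivalent, assuming pkill is available on your system:
# Kill every running `docker pull` client process, then retry the pull
pkill -9 -f "docker pull"
docker pull <image>:<tag>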

Travis CI: Build intermittently fails and log takes forever to load (never loads)

This is my build.
https://travis-ci.org/gogo/protobuf
It intermittently fails for some of the builds.
I think it is struggling with installing a protocol buffer version using wget, but I can't see the logs, since they take forever to load.
It would be great if Travis could tell me that it has failed to load the logs instead of just pretending to load them. Sorry, I don't know if that is really the case, but that is how it feels.
Also, I don't understand why this works some of the time and randomly fails. If the server is overloaded, put me in a queue; please don't fail when there is nothing wrong with the code.
Please help; I am new to Travis, so maybe I am just doing it wrong.
Some of the other builds with the same use of PROTOBUF_VERSION are successful and show some output from the final step of install-protobuf.sh (./configure --prefix=/home/travis && make -j2 && make install). So, like you, I suspect that the wget step in install-protobuf.sh is what is failing in the failed builds.
I would suggest editing install-protobuf.sh so that you can better see what is going on in the Travis CI logs:
change the set command to: set -ex
remove the -q option from your use of wget in:
wget -q https://github.com/google/protobuf/releases/download/v$PROTOBUF_VERSION/$basename.tar.gz
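After those edits, the top of install-protobuf.sh might look something like this (a sketch; $PROTOBUF_VERSION and $basename are the variables from the question):
#!/usr/bin/env bash
set -ex # -e: abort on the first failing command, -x: echo each command into the Travis log
# Without -q, wget's progress and any HTTP errors show up in the build log
wget https://github.com/google/protobuf/releases/download/v$PROTOBUF_VERSION/$basename.tar.gz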

Icinga check_jboss "NRPE: unable to read output"

I'm using Icinga to monitor some servers and services. Most of them run fine. But now I'd like to monitor a JBoss AS on one server via NRPE, using the check_jboss plugin from MonitoringExchange. Each time I try running a test command from my Icinga server via NRPE, I get a NRPE: unable to read output error. When I execute the command directly on the monitored server, it runs fine. It's strange that the execution on the monitored server takes around 5 seconds to return an acceptable result, while the NRPE execution returns the error immediately. Raising the NRPE timeout didn't solve the problem. I also checked the permissions of the check_jboss plugin and set them to 777, so there should be no error there.
I don't think there's a general issue with NRPE, because some other checks (e.g. check_load, check_disk, ...) also run via NRPE and they all work fine. The permissions of those plugins are analogous to my check_jboss plugin's.
Here is one sample execution on the monitored server, which runs fine:
/usr/lib64/nagios/plugins/check_jboss.pl -T ServerInfo -J jboss.system -a MaxMemory -w 3000: -c 2000: -f
JBOSS OK - MaxMemory is 4049076224 | MaxMemory=4049076224
Here are two command executions via NRPE from my Icinga server. Both commands are correctly configured:
./check_nrpe -H xxx.xxx.xxx.xxx -c check_hda1
DISK OK - free space: / 47452 MB (76% inode=97%);| /=14505MB;52218;58745;0;65273
./check_nrpe -H xxx.xxx.xxx.xxx -c jboss_MaxMemory
NRPE: Unable to read output
Does anyone have a hint for me? If further config information is needed, please ask :)
Try to rule out SELinux either by disabling it globally or by changing the SELinux type to nagios_unconfined_plugin_exec_t.
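For example (a sketch; the plugin path is the one from the question, and both commands must run as root on the monitored server):
# Option 1: temporarily put SELinux into permissive mode to test
setenforce 0
# Option 2: relabel only the plugin so NRPE may execute it under SELinux
chcon -t nagios_unconfined_plugin_exec_t /usr/lib64/nagios/plugins/check_jboss.pl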
