Automated ansible output parsing in jenkins pipeline

Automated ansible output parsing in jenkins pipeline - jenkins

How do you automatically parse ansible warnings and errors in your jenkins pipeline jobs?
I greatly enjoy the power of leveraging in ansible in jenkins when it works. Upon a failure, the hunt to locate the actual error can be challenging.
I use WarningsNG which supports custom parsers (and allows their programmatic generation)
Do you know of any plugins or addons that already transform these logs into the kind charts similar to WarningsNG?
I figured I'd ask as I go off into deep regex land and make my own.

One good way to achieve this seems to be the following:
select an existing structured output ansible callback plugin (json, junit and yaml are all viable) . I selected junit as I can play with the format to get a really nice view into the playbook with errors reported in a very obvious way.
fork that GPL file (yes, so be careful with that license) to augment with the following:
store output as file
implement the missing callback methods (the three mentioned above do not implement the v2...item callbacks.
forward events to the default or debug callback to ensure operators see something when they execute the plan
add a secrets cleaner - if you use jenkins credentials-binding-plugin it will hide secrets from the console, it will not not hide secrets within stored files. You'll need to handle that in your playbook or via some groovy code (if groovy, try{...} finally { clean } seems a good pattern)
Snippet - forewarding to default callback
from ansible.plugins.callback.default import CallbackModule as CallbackModule_default
...
class CallbackModule(CallbackBase):
CALLBACK_VERSION = 2.0
CALLBACK_TYPE = 'stdout'
CALLBACK_NAME = 'json'
def __init__(self, display=None):
super(CallbackModule, self).__init__(display)
self.default_callback = CallbackModule_default()
...
def v2_on_file_diff(self, result):
self.default_callback.v2_on_file_diff(result)
... do whatever you'd want to ensure the content appears in the json file

Related

How to add more/custom data to be stored in jenkins rest api

My question is similar to this.
What I want is that when my pipelines run, I want to add some information in the job build so that when a REST API call is made it return existing info as well as info that I added to the job build-
Currently, the info present in this API has information like job names, build number, etc...
http://example.com/jenkins/<job_name>/<build_number>/api/json
I see there is a plugin that can be used to do this : Env Injector. But there is lot of effort to just add little info in existing API. It does not have good support of Jenkins pipeline and isn't that mainstream.
Other way is that I could just write a JSON file on the system where Jenkins is running and make it available over HTTP. This doesn't involve REST APIs but does what I want.
Is there any better way to do this?

If it's just metadata you can use the job description and parse it with Regex using Groovy
def jobDescription = job.getDescription();
// regex match of #tags, capture "tag" from "#tag"
def tagMatches = (jobDescription =~ /#(\S+)/)
Then iterate over tagMatches
tagMatches.each { match ->
}

Extract info from a Groovy DSL file?

I recently switched my logback configuration file from logback.xml to logback.groovy. Using a DSL with Groovy is more versatile than XML for this sort of thing.
I need to analyse this file programmatically, like I analysed the previous XML file (any of innumerable parsing tools). I realise that this will be imperfect, as a DSL config file sits on top of an object which it configures and must be executed, so its results are inevitably dynamic, whereas an XML file is static.
If you want to include one Groovy file in another file there are solutions. This one worked for me.
But I'm struggling to find what I need from the results.
If I put a function like this in the DSL file ...
def greet(){
println "hello world"
}
... not only can I execute it (config.greet() as below), but I can also see it listed when I go
GroovyShell shell = new GroovyShell()
def config = shell.parse( logfileConfigPath.toFile() )
println "config.class.properties ${config.class.properties}"
But if I put a line like this in the DSL file...
def MY_CONSTANT = "XXX"
... I have no idea how to find it and get its value (it is absent from the confusing and copious output from config.class.properties).
PS printing out config.properties just gives this:
[class:class logback, binding:groovy.lang.Binding#564fa2b]
... and yes, I did look at config.binding.properties: there was nothing.
further thought
My question is, more broadly, about what if any tools are available for analysis of Groovy DSL configuration files. Given that such a file is pretty meaningless without the underlying object it is configuring (an object implementing org.gradle.api.Project in the case of Gradle; I don't know what class it may be in the case of logback), you would have thought there would need to be instrumentation to kind of hitch up such an object and then observe the effects of the config file in a controlled, observable way. If Groovy DSL config files are to be as versatile as their XML counterparts surely you need something along those lines? NB I have a suspicion that org.gradle.tooling.model.GradleProject or org.gradle.tooling.model.ProjectModel might serve that purpose. Unfortunately, at the current time I am unable to get GradleConnector working, as detailed here.
I presume there is nothing of this kind for logback, and at the moment I have no knowledge of its DSL or configurable object, or the latter's class or interface...

The use of def creates a local variable in the execution of the script that is not available in the binding of the script; see this. Even dropping def will not expose MY_CONSTANT in the binding because parsing the script via GroovyShell.parse() does not interpret/execute the code.
To expose MY_CONSTANT in config's binding, change def MY_CONSTANT = "XXX" to MY_CONSTANT = "XXX" and execute the config script via config.run().

Perform action after Dataflow pipeline has processed all data

Is it possible to perform an action once a batch Dataflow job has finished processing all data? Specifically, I'd like to move the text file that the pipeline just processed to a different GCS bucket. I'm not sure where to place that in my pipeline to ensure it executes once after the data processing has completed.

I don't see why you need to do this post pipeline execution. You could use side outputs to write the file to multiple buckets, and save yourself the copy after the pipeline finishes.
If that's not going to work for you (for whatever reason), then you can simply run your pipeline in blocking execution mode i.e. use pipeline.run().waitUntilFinish(), and then just write the rest of your code (which does the copy) after that.
[..]
// do some stuff before the pipeline runs
Pipeline pipeline = ...
pipeline.run().waitUntilFinish();
// do something after the pipeline finishes here
[..]

A little trick I got from reading the source code of apache beam's PassThroughThenCleanup.java.
Right after your reader, create a side input that 'combine' the entire collection (in the source code, it is the View.asIterable() PTransform) and connect its output to a DoFn. This DoFn will be called only after the reader has finished reading ALL elements.
P.S. The code literally name the operation, cleanupSignalView which I found really clever
Note that you can achieve the same effect using Combine.globally() (java) or beam.CombineGlobally() (python). For more info check out section 4.2.4.3 here

I think two options can help you here:
1) Use TextIO to write to the bucket or folder you want, specifying the exact GCS path (for e.g. gs://sandbox/other-bucket)
2) Use Object Change Notifications in combination with Cloud Functions. You can find a good primer on doing this here and the SDK for GCS in JS here. What you will do in this option is basically setting up a trigger when something drops in a certain bucket, and move it to another one using your self-written Cloud Function.

How do we access the sensitive variables in a jenkins plugin that is workflow compatible?

I'm trying to take the jenkins gradle plugin and make it compatible with the new workflow job type. I've gotten it to the point where I can use something like this and it will run gradle pretty successfully:
step([$class: 'Gradle',
switches: "-PenableInstallerDistribution=true",
tasks: 'build install',
gradleName: '(Default)',
useWrapper: true,
makeExecutable: true,
fromRootBuildScriptDir: true,
useWorkspaceAsHome: true])
However, I had to make some sacrifices. I had to simply delete these lines:
Set<String> sensitiveVars = build.getSensitiveBuildVariables();
args.addKeyValuePairs("-D", fixParameters(build.getBuildVariables()), sensitiveVars);
I can't find any way to access the "sensitive variables" from the Run object that is supplied in place of the old AbstractBuild and popping passwords into the console output seems like a bad idea. (I believe that's what the code is trying to avoid doing; I didn't write the original.)

There is currently no Run.getSensitiveBuildVariables(), though it is possible one is needed. Anyway this method is merely communicating to other plugins which variables might be considered secrets for various purposes; it is not responsible for making passwords included in the command line from ProcStarter be shown as **** in the build log, which would be done using ArgumentListBuilder.addMasked.
The quick answer is that, pending newer APIs, you should just skip this block if not given an AbstractBuild.

Jenkins - retrieve full console output during build step

I have been scouring the internet for days, I have a problem similar to this.
I need to retrieve the console output in raw (plain) text. But if I can get it in HTML that is fine too, I can always parse it. The only thing is that I need to get it during the build step, which is a problem since the location where it should be available is truncated...
I have tried retrieving the console output from the following URL's (relative to the job):
/consoleText
/logText/progressiveText
/logText/progressiveHTML
The two text ones are plain text and would be perfect if not for the truncation, same goes for the HTML one... exactly what I need - only its truncated....
I am sure it is possible to retrieve this information somehow, since when viewing /consoleFull there is a real-time update of the console, without truncating or buffering.
However, upon examining that web page, instead of finding the content I desired, I found this code where it should have been (I did not include the full pages code, since it would be mostly irrelevant, and I believe those answering would be able to find out and know what should be there on their own)
new Ajax.Request(href,{
method: "post",
parameters: {"start":e.fetchedBytes},
requestHeaders: headers,
onComplete: function(rsp,_) {
var stickToBottom = scroller.isSticking();
var text = rsp.responseText;
if(text!="") {
var p = document.createElement("DIV");
e.appendChild(p); // Needs to be first for IE
// Use "outerHTML" for IE; workaround for:
// http://www.quirksmode.org/bugreports/archives/2004/11/innerhtml_and_t.html
if (p.outerHTML) {
p.outerHTML = '<pre>'+text+'</pre>';
p = e.lastChild;
}
else p.innerHTML = text;
Behaviour.applySubtree(p);
if(stickToBottom) scroller.scrollToBottom();
}
e.fetchedBytes = rsp.getResponseHeader("X-Text-Size");
e.consoleAnnotator = rsp.getResponseHeader("X-ConsoleAnnotator");
if(rsp.getResponseHeader("X-More-Data")=="true")
setTimeout(function(){fetchNext(e,href);},1000);
else
$("spinner").style.display = "none";
}
});
Specifically, I am hoping there is a way for me to get the content from text whatever it may be. I am not familiar with this language and so am not sure how I might be able to get the content I want. Plugins won't help since I want to retrieve this content as part of my script during the build step

You did pretty much good investigation already. I can only add the following: all console related plug-ins I know are designed as a post build actions.
The Log Trigger plugin provides a post-build action that allows Hudson
builds to search their console log for a given regular expression and
if found, trigger additional downstream jobs.
So it looks like there is no straightforward solution to your problem. I can see the following options:
1. Use tee or something similar (applicable to shell build steps only)
This solution is far from being universal, but it can provide quick access to the latest console output, produced by a command or set of command.
tee - read from standard input and write to standard output and files
Using synonyms on the system level other Jenkins build steps can modified in order to produce console output. File with console output can be referenced through Jenkins or using any other way.
2. Modify Jenkins code
You can just do a quick fix for internal usage or provide a patch introducing specific system-wide setting.
3. Mimic /console behavior
Code in your example is used to request updates from the Jenkins server. As you may expect the server side can return piece of information starting with some offset. Let me show.
Periodically console page sends requests to the server:
Parameters are straightforward:
Response is a chunk of information to be added:
Another request with updated offset (start) value
You can easily understand there is no data by analyzing Content-Length
So the answer is: use url/job-name/build-number/logText/progressiveHtml, specify start offset, send request and receive console update.

I had a similar issue, the last part of my Jenkinsfile build script needs to parse the ConsoleLog for particular error messages to put in an email build report.
First attempt: http request.
It felt like a hack, it mostly worked, but ran into issues when we locked down access to the Jenkins server & my build nodes could no longer perform annon http gets on the page
Second attempt: use the APIs to enumerate the log lines.
It felt like the right thing to do, but it failed horribly as my nodes would take 30 minutes to get through the 100 meg log files. My presumption is that the Jenkins server was not caching the file, so each request involved a re-reading of the entire file up until the point of the last read.
Third and most successful solution: run grep on the server.
node('master') {
sh 'grep some_criteria $JENKINS_HOME/workspace/path/to/job/console.log'
}
it was fast, reliable, and it didn't matter how big the log files were.
Yes, this required trust of the Jenkins admin and knowledge of the directory paths on the Jenkins server - but since I was the admin, I trusted myself to do the right thing. Your mileage may vary.

To add some insight: when the Jenkins build was in progress, the response for the .../consoleText URL maxed out at 10000 lines, exactly.
I was using 'requests()' package in Python. I have tried the same URL with curl and again received only the first 10K lines.
Only after the build has finished both methods returned the full log (>22K lines in my case).
I will research further and hope to report back.
[2015-08-18] Update: It seems that this is a known issue (see here) and it's fixed in Jenkins 1.618 and later. I am still running 1.615 so I cannot verify.
Amir

Develop Reference

ios ruby-on-rails asp.net-mvc docker delphi jenkins grails google-sheets machine-learning dart