Catch waf's run_str output for further processing

I have a task something defined in waf which uses run_str instead of def run(self).
How can I catch the output (stdout, stderr and returncode) to process it further?
I know I could put it all in def run(self), start the task manually, catch the output and process it. But for the sake of simplicity I would like to keep the run_str and do the output processing in an extra function.
Actual implementation:

class something(Task.Task):
    run_str = "${PROGRAM} ${ARGS}"

So the desired output would be:

class something(Task.Task):
    run_str = "${PROGRAM} ${ARGS}"

    def check_output(self):
        # stdout, stderr and returncode processing here
        pass
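
One possible approach (a sketch, untested, not from the original thread): the run() method that waf compiles from run_str ultimately calls the task's exec_command(), so overriding exec_command() and routing the command through bld.cmd_and_log() should let you keep run_str while capturing stdout, stderr and the return code. The cmd_and_log() behaviour described in the comments is an assumption to verify against your waf version.

from waflib import Task, Context, Errors

class something(Task.Task):
    run_str = "${PROGRAM} ${ARGS}"

    def exec_command(self, cmd, **kw):
        # run_str is compiled into a run() method that calls exec_command()
        bld = self.generator.bld
        kw['output'] = Context.BOTH   # ask cmd_and_log for (stdout, stderr)
        kw['quiet'] = Context.BOTH
        try:
            stdout, stderr = bld.cmd_and_log(cmd, **kw)
            ret = 0
        except Errors.WafError as e:
            # on failure, cmd_and_log attaches the captured streams to the exception
            stdout = getattr(e, 'stdout', '')
            stderr = getattr(e, 'stderr', '')
            ret = getattr(e, 'returncode', -1)
        self.check_output(stdout, stderr, ret)
        return ret

    def check_output(self, stdout, stderr, returncode):
        # stdout, stderr and returncode processing here
        pass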

Related

How do you terminate a ROS spin when receiving a message

We know the classic form of a subscriber node in ROS:

def callback(msg):
    # do something with the msg
    pass

rospy.init_node('the_node', anonymous=True)
sub = rospy.Subscriber('message', Image, callback)  # for example Images, but can be anything
rospy.spin()

Here the node will be receiving messages and processing them with callback, while ROS "spins".
My question is: is there a simple way to get out of this spin based on, for example, a message we receive?
def callback(msg):
    # if we receive a msg that says "FINISH", break the main spin
    pass

rospy.init_node('the_node', anonymous=True)
sub = rospy.Subscriber('message', Image, callback)  # for example Images, but can be anything
rospy.spin()
print("spin was broken")
The purpose of rospy.spin() is to go into an infinite loop processing callbacks until a shutdown signal is received. The way to get out of the spin, and the only reason you ever should, is when the process is shutting down. This can be done via sys.exit() in Python or rospy.signal_shutdown().
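For example, a minimal sketch of that shutdown path (assuming a std_msgs/String message carries the signal):

def callback(msg):
    if msg.data == "FINISH":
        rospy.signal_shutdown("received FINISH")  # makes rospy.spin() return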
Based on your example it seems like you want to break out of the spin but keep the node alive to do more work. If that's the case, this is not the correct use of rospy.spin() and you should reconsider what you're trying to accomplish and by what method. Consider instead using a run loop that sleeps on a rospy.Rate:
cb_signal = False

def callback(msg):
    global cb_signal
    cb_signal = msg.data

def run():
    rate = rospy.Rate(10)  # 10 Hz
    while not rospy.is_shutdown():
        # do some other work
        if cb_signal:
            some_other_method()
        rate.sleep()

if __name__ == '__main__':
    rospy.init_node('my_node')
    rospy.Subscriber('message', Bool, callback)
    run()

Jenkins pipeline replaces method call in vars/mavenBuildSpike.groovy with assignment "new NullObject"

I have this code in vars/mavenBuildSpike.groovy:
@NonCPS
def createSqBuilder(SqBuildConfig config) {
    System.out.println("createSqBuilder=${config}")
    // The constructor contains code which the CPS transformer can't handle.
    def result = new SqBuilder(config)
    System.out.println("result=${result}")
    return result
}
def call(Closure body) {
    echo 'Creating ConfigBuilderWrapper'
    def wrapper = new ConfigBuilderWrapper()
    echo 'Calling apply()'
    wrapper.apply(body)
    echo 'Done processing closure'
    def config = wrapper.builder.build()
    echo "config=${config.dump()}"
    echo 'Creating builder'
    def builder = createSqBuilder(config) // <<--- This doesn't work.
    echo "builder=${builder}"
    echo builder.dump()
    ...
The output is:
...everything looks good...
[Pipeline] echo
Creating builder
[Pipeline] echo
builder=null
[Pipeline] End of Pipeline
java.lang.NullPointerException: Cannot invoke method hashCode() on null object
at org.codehaus.groovy.runtime.NullObject.hashCode(NullObject.java:174)
at org.codehaus.groovy.runtime.DefaultGroovyMethods.dump(DefaultGroovyMethods.java:291)
...
at org.kohsuke.groovy.sandbox.impl.Checker.checkedCall(Checker.java:159)
at com.cloudbees.groovy.cps.sandbox.SandboxInvoker.methodCall(SandboxInvoker.java:17)
at mavenBuildSpike.call(...\branches\master\builds\16\libs\sq-pipeline-library-spike\vars\mavenBuildSpike.groovy:33)
at WorkflowScript.run(WorkflowScript:4)
at ___cps.transform___(Native Method)
....
That is, the method createSqBuilder is never actually called; the call is effectively replaced with the assignment def builder = new NullObject().
Why is that and how can I fix it?
Before running the code, Jenkins applies an AST transformation called the "CPS transformation". This transformer doesn't support everything that Groovy can do, and it won't tell you when it can't: you'll just get weird or useless error messages from the resulting code, and sometimes no errors at all - the build will simply fail without any error message or stack trace anywhere.
It seems that the CPS transform doesn't like calling constructors with arguments. This worked for me:
@Field // groovy.transform.Field
SqBuilder builder = new SqBuilder()

def call(Closure body) {
    ...
    //def builder = createSqBuilder(config) // Doesn't work!!!
    builder.init(config) // This works; move the code from the constructor to the init() method.
    ...
The @Field annotation is necessary to turn the local variable builder into a field of the class which Groovy will create at runtime. The name of this class is WorkflowScript.
You could also just type builder = new SqBuilder() (without a type or def before the variable name). But that would put builder into the pool of global variables (called the "Binding" in Groovy). Jenkins puts its own stuff there (like env or scm), so that could cause strange problems when you install more plugins.
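A compact way to see the difference (a sketch; builder2 is just an illustrative name):

import groovy.transform.Field

builder = new SqBuilder()   // no def/type: lands in the script Binding (global),
                            // next to Jenkins' own entries like env and scm
@Field
SqBuilder builder2 = new SqBuilder()   // becomes a field of the WorkflowScript class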
See also: Strange variable scoping behavior in Jenkinsfile

NotSerializableException in jenkinsfile

I'm working on a Jenkinsfile and I'm getting an exception in the third stage:
an exception which occurred:
in field com.cloudbees.groovy.cps.impl.BlockScopeEnv.locals
in object com.cloudbees.groovy.cps.impl.BlockScopeEnv@7bbae4fb
in field com.cloudbees.groovy.cps.impl.ProxyEnv.parent
in object com.cloudbees.groovy.cps.impl.CaseEnv@6896a2e3
in field com.cloudbees.groovy.cps.impl.ProxyEnv.parent
in object com.cloudbees.groovy.cps.impl.BlockScopeEnv@605ccbbc
in field com.cloudbees.groovy.cps.impl.CallEnv.caller
in object com.cloudbees.groovy.cps.impl.FunctionCallEnv@7b8ef914
in field com.cloudbees.groovy.cps.Continuable.e
in object org.jenkinsci.plugins.workflow.cps.SandboxContinuable@11e73f3c
in field org.jenkinsci.plugins.workflow.cps.CpsThread.program
in object org.jenkinsci.plugins.workflow.cps.CpsThread@b2df9bb
in field org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.threads
in object org.jenkinsci.plugins.workflow.cps.CpsThreadGroup@2b30596a
in object org.jenkinsci.plugins.workflow.cps.CpsThreadGroup@2b30596a
Caused: java.io.NotSerializableException: java.util.regex.Matcher
at org.jboss.marshalling.river.RiverMarshaller.doWriteObject(RiverMarshaller.java:860)
at org.jboss.marshalling.river.BlockMarshaller.doWriteObject(BlockMarshaller.java:65)
at org.jboss.marshalling.river.BlockMarshaller.writeObject(BlockMarshaller.java:56)
I've been reading about it and I know I can't create non-serializable variables. So I think it has to do with this part of my code:
def artifact_name = sh(
    script: "ls -b *.jar | head -1",
    returnStdout: true
).trim()

def has_snapshot = artifact_name =~ /-TEST\.jar/
if (has_snapshot) {
    // Do something
}
My question is, how do I define those two variables in order to avoid that exception?
Your problem is this line:
def has_snapshot = artifact_name =~ /-TEST\.jar/
The =~ is the Groovy find operator. It returns a java.util.regex.Matcher instance, which is not Serializable. If Jenkins decides to pause your script after you have stored the result in a local variable, that variable gets serialized by Jenkins, and that is when you get the exception. This can easily be tested by adding a sleep(1) step immediately after your invocation and watching that same exception get thrown.
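For instance, a sketch to reproduce it deliberately:

def has_snapshot = artifact_name =~ /-TEST\.jar/  // Matcher stored in a local variable
sleep 1  // pipeline step: forces a CPS checkpoint, so the Matcher must be serialized - boom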
To resolve this, you should:

Not store the java.util.regex.Matcher result in CPS-transformed code
Move the usage into a @NonCPS-annotated method, or use the match operator (==~), which returns a boolean (if it fits your use case)
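
For example, a boolean-only version of the check above (a sketch; note that ==~ requires the whole string to match, hence the leading .*):

def has_snapshot = (artifact_name ==~ /.*-TEST\.jar/)  // plain boolean, safe to store
if (has_snapshot) {
    // Do something
}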
Building on the accepted answer I came up with this solution:
def hello = {
    def matcher = ("Hello" =~ /Hello/)
    matcher.find()
    return matcher.group()
}.call()
I guess the stability of this is not so good, but I assume the likelihood of it failing is very low. So if the impact of this code failing is also low, it might be reasonable risk management to use it.
The following seems to fit the case of running in a NonCPS context, but I am not 100% sure. It definitely is working, though:
@NonCPS
def hello = {
    def matcher = ("Hello" =~ /Hello/)
    matcher.find()
    return matcher.group()
}
hello = hello.call()
println hello
The accepted answer is certainly correct. In my case, I was trying to parse some JSON from an API response like so:
@NonCPS
def parseJson(rawJson) {
    return new groovy.json.JsonSlurper().parseText(rawJson)
}
All this does is return the result parsed by JsonSlurper, which can then be used to walk down your JSON structure, like so:
def jsonOutput = parseJson(createIssueResponse)
echo "Jira Ticket Created. Key: ${jsonOutput.key}"
This snippet actually worked fine in my script, but later on, the script used jsonOutput.key to make a new web request. As stated in the other answer, if the script pauses while you have something stored in a local variable that cannot be serialized, you will get this exception.
When the script attempted to make the web request, it would pause (presumably because it was waiting for the request to respond), and the exception would get thrown.
In my scenario, I was able to fix this by doing this instead:
def ticketKey = parseJson(createIssueResponse).key.toString()
echo "Jira Ticket Created. Key: ${ticketKey}"
And later on, when the script attempts to send the web request, it no longer throws the exception. Now that no JsonSlurper object is present in my running script when it is paused, it works fine. I previously assumed that because the method was annotated with @NonCPS its returned object was safe to use, but that is not true.

Is there any fast and efficient way to get abstracts from pubmed?

I would like to download scientific abstract data in bulk for, let's say, about 2000 PubMed IDs. My Python code is sloppy and seems rather slow. Is there any fast and efficient method to harvest these abstracts?
If this is the fastest method, how do I measure it so I can compare it against others, or compare my home against my work situation (a different ISP may play a part in the speed)?
My code is attached below.
import sqlite3
from Bio.Entrez import read, efetch, email, tool
from metapub import PubMedFetcher
import pandas as pd
import requests
from datetime import date
import xml.etree.ElementTree as ET
import time
import sys

reload(sys)
sys.setdefaultencoding('utf8')

Abstract_data = pd.DataFrame(columns=["name", "pmid", "abstract"])

def abstract_download(self, dict_pmids):
    """
    This method returns the abstract for a given pmid and adds it to the abstract data.
    """
    index = 0
    baseUrl = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
    for names in dict_pmids:
        for pmid in dict_pmids[names]:
            try:
                abstract = []
                # note the stray "+" at the end of the next line - the malformed
                # URL construction diagnosed in the answer below
                url = baseUrl+"efetch.fcgi?db=pubmed&id="+pmid+"&rettype=xml"+
                response = requests.request("GET", url, timeout=500).text
                response = response.encode('utf-8')
                root = ET.fromstring(response)
                root_find = root.findall('./PubmedArticle/MedlineCitation/Article/Abstract/')
                if len(root_find) == 0:
                    root_find = root.findall('./PubmedArticle/MedlineCitation/Article/ArticleTitle')
                for i in range(len(root_find)):
                    if root_find[i].text != None:
                        abstract.append(root_find[i].text)
                if abstract is not None:
                    Abstract_data.loc[index] = names, pmid, "".join(abstract)
                    index += 1
            except:
                print "Connection Refused"
                time.sleep(5)
                continue
    return Abstract_data
EDIT: The general error this script produced was seemingly a "Connection Refused". See ZF007's answer below for how this was solved.
The code below works. Your script hung on malformed URL construction. Also, whenever anything went wrong inside the script, the reported response was a refused connection. This was in fact not the case: it was the code processing the retrieved data that failed. I've made some adjustments to get the code working for me and left comments in place where you need to adjust, due to the lack of the dict_pmids list.
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys, time, requests, sqlite3
import pandas as pd
import xml.etree.ElementTree as ET
from metapub import PubMedFetcher
from datetime import date
from Bio.Entrez import read, efetch, email, tool

def abstract_download(pmids):
    """
    This method returns the abstract for a given pmid and adds it to the abstract data.
    """
    index = 0
    baseUrl = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
    collected_abstract = []

    # code below disabled to get general abstract extraction from PubMed working; I don't have the dict_pmids list.
    """
    for names in dict_pmids:
        for pmid in dict_pmids[names]:

    move the working code below to the right to get it in place with the above two requirements prior to providing the dict_pmids list.
    # from here the code works up to the next comment. I don't have the dict_pmids list.
    """

    for pmid in pmids:
        print 'pmid : %s\n' % pmid
        abstract = []
        root = ''
        try:
            url = '%sefetch.fcgi?db=pubmed&id=%s&rettype=xml' % (baseUrl, pmid)
            # check my URL... paste the line into a web browser like Firefox.
            print 'url', url
            response = requests.request("GET", url, timeout=500).text
            # check if I got a response.
            print 'response', response
            # response = response.encode('utf-8')
            root = ET.fromstring(response)
        except Exception as inst:
            # besides a refused connection... the "why" it failed comes in handy to resolve
            # issues at hand if and when they happen.
            print "Connection Refused", inst
            time.sleep(5)
            continue
        root_find = root.findall('./PubmedArticle/MedlineCitation/Article/Abstract/')
        if len(root_find) == 0:
            root_find = root.findall('./PubmedArticle/MedlineCitation/Article/ArticleTitle')
        # check if I found something
        print 'root_find : %s\n\n' % root_find
        for i in range(len(root_find)):
            if root_find[i].text != None:
                abstract.append(root_find[i].text)
        Abstract_data = pd.DataFrame(columns=["name", "pmid", "abstract"])
        # check if I found something
        # print 'abstract : %s\n' % abstract
        # code works up to the print statement 'abstract', abstract; the rest is disabled because I don't have the dict_pmids list.
        if abstract is not None:
            # Abstract_data.loc[index] = names, pmid, "".join(abstract)
            index += 1
            collected_abstract.append(abstract)
    # change back to "return Abstract_data" when the dict_pmids list is administered.
    # return Abstract_data
    return collected_abstract

if __name__ == '__main__':
    sys.stdout.flush()
    reload(sys)
    sys.setdefaultencoding('utf8')

    pubmedIDs = range(21491000, 21491001)
    mydata = abstract_download(pubmedIDs)
    print 'mydata : %s' % (mydata)
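
On the original speed question: the E-utilities efetch endpoint accepts many comma-separated IDs per request, so batching (instead of one HTTP round-trip per PMID) is usually the biggest win. Below is a minimal sketch using Bio.Entrez, which the question already imports; the email value, batch size and parsed XML paths are assumptions to verify against your data. For rough comparisons, wrapping the call between two time.time() readings gives a wall-clock measure.

import time
import xml.etree.ElementTree as ET
from Bio import Entrez

Entrez.email = "you@example.com"  # NCBI asks for a contact address; placeholder

def fetch_abstracts(pmids, batch_size=200):
    """One efetch call per batch of IDs instead of one call per ID."""
    abstracts = {}
    for start in range(0, len(pmids), batch_size):
        batch = [str(p) for p in pmids[start:start + batch_size]]
        handle = Entrez.efetch(db="pubmed", id=",".join(batch), rettype="xml")
        root = ET.fromstring(handle.read())
        handle.close()
        for article in root.findall('./PubmedArticle'):
            pmid = article.findtext('./MedlineCitation/PMID')
            texts = [el.text for el in article.findall(
                './MedlineCitation/Article/Abstract/AbstractText') if el.text]
            abstracts[pmid] = "".join(texts)
    return abstracts

t0 = time.time()
result = fetch_abstracts(list(range(21491000, 21491020)))
print('fetched %d abstracts in %.1fs' % (len(result), time.time() - t0))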

PowerShell [System.IO.StreamWriter] to write to console (Write-Host)?

I know that I can write into a file by using:

$stream = [System.IO.StreamWriter] "file.txt"

As I need to use a stream, I would really like to know if I can replace "file.txt" with something so that whatever is written to the stream is printed to the console (Write-Host) or to Write-Progress.
Thanks in advance
I would look at writing a function that performs both of those operations. This is pretty generic, but it takes the input that you want from the pipeline and then writes to the stream as well as writing the output to the console.
# Generic function name modeled after Tee-Object, which outputs to console and file
Function Tee-Object1 {
    Param (
        [parameter(ValueFromPipeline=$True)]
        $InputObject
    )
    Process {
        # Write to stream
        [void]$Script:Stream.Write($_)
        # Write to console
        $_
    }
}

$Script:stream = [System.IO.StreamWriter] "file.txt"
# Whatever your data is will be piped into the function
$data | Tee-Object1
You should be able to use [Console]::Out now to get a reference to the current stdout stream
https://learn.microsoft.com/en-us/dotnet/api/system.console.out?view=netcore-3.1#System_Console_Out
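
For example (a quick sketch; [Console]::Out is a System.IO.TextWriter rather than a StreamWriter, but it exposes the same Write/WriteLine methods):

$stream = [Console]::Out
$stream.WriteLine("this goes straight to the console")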
