Maintain a state in Apache Beam (Google Dataflow) pipeline - google-cloud-dataflow

I have a dataflow deployed in Google Cloud. I am looking for a way to set a trace id so that it is available in all steps of the pipeline.
One way I was thinking of is to create a trace id and set it in each step:
ex:
Step A (create and set trace id) --> Step B (read and set to message C) --> Step C (read and set to message D) --> Step D (read trace id).
Is there any better way to make the trace id available in all steps ?
Srinivas
I tried to create and set the trace id in each step.
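For illustration, the per-step hand-off described above could look like the following minimal sketch in the Beam Python SDK (the DoFn names and the dict carrying trace_id and payload are purely illustrative assumptions, not the actual pipeline):

import uuid

import apache_beam as beam


class AttachTraceId(beam.DoFn):
    """Step A: create a trace id and attach it to every element."""
    def process(self, element):
        yield {'trace_id': str(uuid.uuid4()), 'payload': element}


class TransformKeepingTraceId(beam.DoFn):
    """Steps B/C: transform the payload and copy the trace id onto the output."""
    def process(self, element):
        yield {'trace_id': element['trace_id'], 'payload': element['payload'].upper()}


class ReadTraceId(beam.DoFn):
    """Step D: the trace id is still available here for logging/correlation."""
    def process(self, element):
        print(element['trace_id'], element['payload'])
        yield element


with beam.Pipeline() as p:
    (p
     | beam.Create(['message 1', 'message 2'])
     | 'StepA' >> beam.ParDo(AttachTraceId())
     | 'StepB' >> beam.ParDo(TransformKeepingTraceId())
     | 'StepD' >> beam.ParDo(ReadTraceId()))

The trade-off is exactly what the question describes: every intermediate DoFn has to copy the trace id onto its output explicitly.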

Related

Creating copy of Robot Framework output.xml in Jenkins pipeline

I am in the process of creating a Groovy email template for a Jenkins pipeline running Robot Framework tests. I intend to use Groovy's XMLSlurper to parse the output.xml created by Jenkins to extract the information I need. However, the template also relies on using Robot Publisher which I've now realized automatically deletes the output.xml. I would rather not have to archive the artifacts and access them that way, so is there a way to create a copy of the output.xml in the Jenkins pipeline before the Robot Publisher stage, that will not be deleted by Robot Publisher, that I can parse in my email stage?
Please bear with me as I'm relatively new to Jenkins (and stackoverflow for that matter), so apologies if I've excluded vital information, but any ideas would be much appreciated! Thanks
I would approach your problem from a different angle. First of all I do not suggest using Groovy's XMLSlurper or any other XML parser to extract the information you need from Robot Framework's output.xml.
What you should use is Robot Framework's own API, which already implements the parsers you need. You can easily access any information described in the robot.result.model module. You can find everything there: suites, tests and keywords with all their attributes, such as test messages, failure messages, execution times, test results, etc.
All in all, this is the most future-proof parsing solution, as the parser will always match the version of the framework. Make sure to use the API documentation that matches your current framework version.
Now back to your task: you should use the above-mentioned API via Robot Framework's listener interface. By implementing the output_file listener method you can access the output.xml file (and even make a copy of it) before the Robot Publisher plugin moves it. output_file is called automatically once the output.xml is ready, and it receives the path to the XML file as input. You can pass this path straight to the ExecutionResult class from the API, then "visit" the results with your own ResultVisitor to acquire the information you need.
The last step is to write the data into a file that serves as input to your e-mail stage. Note that this file won't be touched by Robot Publisher by default, as it is not a standard output but a custom one you just made using Robot Framework's API.
As this might sound like a lot, here is an example to demonstrate the idea. The listener and the result visitor in EmailInputProvider.py:
from robot.api import ExecutionResult, ResultVisitor

class MyTestResultVisitor(ResultVisitor):
    def __init__(self):
        self.test_results = dict()

    def visit_test(self, test):
        self.test_results[test.longname] = test.status

class EmailInputProvider:
    ROBOT_LISTENER_API_VERSION = 3

    def output_file(self, path):
        output = 'EmailInput.txt'
        visitor = MyTestResultVisitor()              # Instantiate the result visitor
        result = ExecutionResult(path)               # Parse the execution result using the Robot API
        result.visit(visitor)                        # Visit the results to collect the needed data
        with open(output, 'w') as f:                 # Write the retrieved data into a file
            for testname, status in visitor.test_results.items():
                print(f'{testname} - {status}', file=f)
        # You can make a copy of the output.xml here as well
        print(f'Email: Input saved into {output}')   # Log the custom output location to the console

globals()[__name__] = EmailInputProvider
This would give the following results for this dummy suite (SO2.robot):
*** Test Cases ***
Test A
    No Operation
Test B
    No Operation
Test C
    No Operation
Test D
    No Operation
Test E
    No Operation
Test F
    Fail
Console output:
$ robot --listener EmailInputProvider SO2.robot
==============================================================================
SO2
==============================================================================
Test A | PASS |
------------------------------------------------------------------------------
Test B | PASS |
------------------------------------------------------------------------------
Test C | PASS |
------------------------------------------------------------------------------
Test D | PASS |
------------------------------------------------------------------------------
Test E | PASS |
------------------------------------------------------------------------------
Test F | FAIL |
AssertionError
------------------------------------------------------------------------------
SO2 | FAIL |
6 critical tests, 5 passed, 1 failed
6 tests total, 5 passed, 1 failed
==============================================================================
Email: Input saved into EmailInput.txt
Output: ..\output.xml
Log: ..\log.html
Report: ..\report.html
Custom output file:
SO2.Test A - PASS
SO2.Test B - PASS
SO2.Test C - PASS
SO2.Test D - PASS
SO2.Test E - PASS
SO2.Test F - FAIL

Make global variable accessible in Robot Framework listener

In one of my projects we are using Robot Framework and a special listener to push results via XRay to Jira.
Now we want to call Robot Framework in two different modes, named A and B, and different labels need to be pushed via XRay to Jira.
I don't want to set environment variables prior to calling robot, as they are really hard to track.
What is the easiest way to make a global variable of a Robot Framework run accessible in a Robot Framework listener?
I just want to call robot something like this:
robot --listener XRayListener.py --variable Mode:A
How can I now access the variable Mode inside XRayListener.py?
As detailed in this article, from the listener Python code you can use BuiltIn().get_variables() to obtain a given variable's value.
from robot.libraries.BuiltIn import BuiltIn

ROBOT_LISTENER_API_VERSION = 2

def end_test(name, attributes):
    print("BROWSER = '%s'" % BuiltIn().get_variables()['${BROWSER}'])
Then run it as:
robot --listener ShowVariable simple.robot
The robot file, just for reference, was:
*** Settings ***
Library    SeleniumLibrary

*** Variables ***
${URL}           https://www.google.com/
${REMOTE_URL}    http://192.168.56.1:4444/wd/hub
${BROWSER}       Chrome

*** Test Cases ***
Confirm login popup is accessable
    #Go To    ${URL}
    open browser    ${URL}    ${BROWSER}
    set window size    350    800
    [Teardown]    Close Browser
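Applied to the original question, the same mechanism can read the Mode variable passed on the command line; a minimal sketch (the file name XRayListener.py and the --variable Mode:A call come from the question, the rest is illustrative):

from robot.libraries.BuiltIn import BuiltIn

ROBOT_LISTENER_API_VERSION = 2

def end_test(name, attributes):
    # ${Mode} is supplied on the command line via --variable Mode:A
    mode = BuiltIn().get_variables()['${Mode}']
    print("Mode = '%s'" % mode)

Run it as:
robot --listener XRayListener.py --variable Mode:A simple.robot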

How can I chain build pipelines in a blocking fashion?

I am trying to configure a Jenkins pipeline in the following fashion:
Build A, B and C non-blocking, as they don't depend on each other
(but block on the fact that A, B and C are still building)
Build D
I tried to configure two pipelines:
Pipeline 1: Build A, B and C non-blocking
Pipeline 2: Build D
But that did not work. A pipeline seems to report "Success" the moment a build starts, which is not what I need.
Ideally I would like to stay within the Jenkins UI instead of creating scripts to accomplish this.
Use the parallel syntax, documented here: https://jenkins.io/doc/book/pipeline/syntax/#parallel

Spring cloud dataflow - Composite Task with external configuration

I have a composite task with two cloud tasks (AAA && BBB).
I want to pass properties to the AAA and BBB tasks from a directory.
For example, like using "--spring.config.location=directory/" when launching a Spring Boot application.
As per the documentation, I understand that we can pass properties using app.CompositeTaskName.taskname.prop1=val1.
But I want to load a bunch of configuration at launch.
So, is there a way to launch the tasks with the "spring.config.location" argument?
I found the solution. I passed "--spring.config.location" in the task definition of my composite task.
task create myctr --definition "AAA --spring.config.location=/data/prop/ '*'->BBB"
I launched the composite task "myctr" and it picked up the property files from the "/data/prop/" directory.
Documentation reference :
http://docs.spring.io/spring-cloud-dataflow/docs/1.7.4.RELEASE/reference/htmlsingle/#spring-cloud-dataflow-composed-tasks
--> Task Application Parameters

Launching composed task built by DSL from stream application

Every example I've seen (task-launcher sink and triggertask source) shows how to launch the task defined by the uri attribute.
My task definitions look like this:
sampleTask <t2: timestamp || t1: timestamp>
sampleTask-t1 timestamp
sampleTask-t2 timestamp
sampleTaskRunner composed-task-runner --graph=sampleTask
My question is: how do I launch the composed task runner (sampleTaskRunner, defined by DSL) from a stream application?
Thanks
UPDATE
I ended up with the solution below, which triggers the task using the SCDF REST API:
composedTask definition:
<timestamp || mySampleTask>
Stream definition:
http | httpclient | log
Deployment properties:
app.http.port=81
app.httpclient.body=name=composedTask&arguments=--increment-instance-enabled=true
app.httpclient.http-method=POST
app.httpclient.url=http://localhost:9393/tasks/executions
app.httpclient.headers-expression={'Content-Type':'application/x-www-form-urlencoded'}
Though it's easy to implement an http sink component, it would be great if the stream application starters provided one out of the box.
Another concern I have is about discovering the SCDF REST URL when deployed in a distributed environment.
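For illustration, the REST call configured above can also be made with a plain HTTP client outside the stream; a minimal sketch in Python (assuming the SCDF server is reachable at http://localhost:9393 and a task definition named composedTask already exists):

import requests

# POST to the SCDF task-executions endpoint, mirroring the httpclient properties above
resp = requests.post(
    'http://localhost:9393/tasks/executions',
    data={
        'name': 'composedTask',
        'arguments': '--increment-instance-enabled=true',
    },
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
)
resp.raise_for_status()
print('Launched task execution:', resp.text)  # the server responds with the new task execution id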
Here's a quick take from one of SCDF's R&D team members (Glenn Renfro).
stream create foozer --definition "trigger --fixed-delay=5 | tasklaunchrequest-transform --uri=maven://org.springframework.cloud.task.app:composedtaskrunner-task:1.1.0.BUILD-SNAPSHOT --command-line-arguments='--graph=sampleTask-t1||sampleTask-t2 --increment-instance-enabled=true --spring.datasource.url=jdbc:mariadb://localhost:3306/test --spring.datasource.username=root --spring.datasource.password=password --spring.datasource.driverClassName=org.mariadb.jdbc.Driver' | task-launcher-local" --deploy
In the foozer stream definition,
1) "trigger" source happens to trigger an upstream event every 5s
2) "tasklaunchrequest-transform" processor takes a few arguments; more specifically, it uses "composedtaskrunner-task:1.1.0.BUILD-SNAPSHOT" to launch a composed-task graph (i.e., sampleTask-t1||sampleTask-t2)
3) Pay attention to --increment-instance-enabled. This was recently added to the CTR application, and it provides the ability to re-launch a composed task on a recurring cadence
4) Since the CTR and SCDF must share the same database, we are also passing datasource properties as command-line args. (SCDF-server is already started with the same datasource credentials)
Hope this helps.
Lastly, we will add a sample to the reference guide via: spring-cloud/spring-cloud-dataflow#1780
