I am building a Jenkins Declarative Pipeline.
Here's the gist of what I'm trying to do (as an arbitrary example):
There is a list of platforms. I have put those in a matrix axis for readability and parallelism.
Each of them has an associated browser.
I want the matrix to execute so that the two value lists iterate together in lockstep (pairwise), rather than as the full cross product.
For example:
Platforms = ["Windows", "Mac", "Linux"]
Browsers = ["Edge", "Chrome", "Firefox"]
I want the output stages to have these (Platform, Browser) pairings:
[("Windows", "Edge"),("Mac", "Chrome"),("Linux", "Firefox")]
In the actual case, the list is 12 entries long, so I don't want to define that many stages with when directives to pair the values manually, since everything else in these stages is the same.
Is there a way to do this, or a better approach?
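One possible approach (a minimal sketch, not from the original question; the lookup map, axis values, and stage names below are assumptions): drive the matrix with a single platform axis and look up the paired browser from a map, so only the intended pairs are ever generated.

def browserFor = [Windows: 'Edge', Mac: 'Chrome', Linux: 'Firefox']   // assumed pairing

pipeline {
    agent any
    stages {
        stage('Test') {
            matrix {
                axes {
                    axis {
                        name 'PLATFORM'
                        values 'Windows', 'Mac', 'Linux'
                    }
                }
                stages {
                    stage('Run') {
                        steps {
                            // The browser is looked up per cell instead of being a second axis,
                            // so no cross-product cells are created.
                            echo "Platform: ${PLATFORM}, Browser: ${browserFor[PLATFORM]}"
                        }
                    }
                }
            }
        }
    }
}

With 12 platforms, only the map grows; the matrix still produces exactly one cell per platform.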
I'm looking for a plugin that would give me an aggregated view of settings and results for many cases, the same way a multi-branch pipeline does. But instead of being based on various branches, I want it based on a single branch, varying only by parameters. The picture below is from the mentioned multi-branch pipeline view; instead of "Branches" I'm looking for "Cases", and instead of the "Name" column I need a configurable parameter.
In addition, I need various periodic build triggers of the form:
H 22 * * 5 %param1=value1 %param2=value3
H 22 * * 5 %param1=value2 %param2=value3
The second part could be done in a standard job, but there will be many such cases, launched periodically every week, every two weeks, or every month. The difference in param1 is crucial, and it is important to keep it readable and easily visible, so that it's quick to distinguish which case has failed.
I've been looking for such a plugin but couldn't find anything like this. Maybe someone knows of such a plugin, or another way to solve it.
The alternative I have is to create a "super" job whose build steps launch my current job with specific parameters. Then the readability would shift from many rows to many columns; since the number of cases is over 20, this would IMHO significantly decrease readability. Additionally, not all cases would be launched with the same periodicity, so I would need some ready-made sets selected by a parameter, and most of the super-build runs would consist mostly of skips. As a result, one might not see the last result for some of the cases.
Note that param2 always has the same value for periodic launches; other values are used only with a manual trigger. param2 can, but doesn't have to, be visible in the "multi-branch pipeline"-like solution.
I hope my explanation of the issue is clear. Looking forward to answers/suggestions, etc. :)
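For the periodic-trigger part specifically, here is a minimal sketch assuming the Parameterized Scheduler plugin is available (the parameter names and values mirror the crontab lines above; the stage is a placeholder, and the separator syntax follows that plugin's documentation):

pipeline {
    agent any
    parameters {
        string(name: 'param1', defaultValue: 'value1')
        string(name: 'param2', defaultValue: 'value3')
    }
    triggers {
        // Two weekly schedules, each forcing its own parameter set.
        parameterizedCron('''
            H 22 * * 5 %param1=value1;param2=value3
            H 22 * * 5 %param1=value2;param2=value3
        ''')
    }
    stages {
        stage('Run case') {
            steps {
                echo "Running case param1=${params.param1}, param2=${params.param2}"
            }
        }
    }
}

This only covers the scheduling half; the aggregated per-case view is a separate concern.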
I have a Jenkins pipeline, which runs a suite of automated tests against a variety of environments in separate workers using the matrix directive. At the end of this, I would like to combine the code coverage output of the various test suite runs into a single file before collecting them, to ensure that the results are accurate. This sounds like it should be simple:
For each matrix cell, stash the coverage output file with a unique stash name, based on the matrix cell values.
After the test runs are complete, unstash all of the files on the "main" worker and combine them.
However, the fact that the stashes are dynamically named makes step 2 difficult. This leaves me, seemingly, with three options:
Hardcode the matrix axes again when unstashing. Not particularly appealing.
Retrieve the matrix axes programmatically. It seems like it should be possible, but I'm uncertain how to go from the FlowNodeWrapper representing the matrix stage to the underlying axis strings.
List all stashes for the build, and pick the ones I want. Also a viable solution if it's possible, since the stash names follow a pattern, but I'm not even sure where to start with this one. There is an open issue related to this in the Jenkins issue board, but it doesn't seem like it'll be moving anytime soon.
In short: how can I achieve this? How can I either:
Go from a FlowNodeWrapper to the matrix axes?
Find my stashes in a different way?
1. For each matrix cell, stash the coverage output file with a unique stash name, based on the matrix cell values.
Right. I'm not familiar with matrix, so I don't know for sure how you can get a unique name, but in many cases you can use env.STAGE_NAME.
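If env.STAGE_NAME turns out not to be unique across matrix cells, the axis values are also exposed as environment variables inside each cell, so a per-cell name can be derived from them. A minimal sketch (the axis name ENVIRONMENT and the file name coverage.xml are assumptions, not from the question):

// Sketch only: ENVIRONMENT and coverage.xml are assumed, not taken from the question.
String stash_name = "coverage-${env.ENVIRONMENT}"
stash name: stash_name, includes: 'coverage.xml'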
2. After the test runs are complete, unstash all of the files on the "main" worker and combine them.
In step 1, keep track of the stash names you've used. Then step 2 is easy.
With a scripted pipeline, that's easy:
def stashes = [:]          // collect the stash names used in step 1
…
stage(…) {
    …
    String stash_name = env.STAGE_NAME
    stash name: stash_name, …          // stash the coverage file under a unique name
    stashes[stash_name] = 1            // remember the name for step 2
}
…
stage('Coverage analysis') {
    for (stash_name in stashes.keySet()) {   // iterate the recorded names, not the map entries
        unstash stash_name
    }
    …
}
I don't know if that works with a declarative pipeline.
I wrote a ParDo function that returns multiple side outputs.
Although the elements of a PCollection are unordered, I'd like to write these different PCollections sequentially.
Does the Beam SDK support this feature?
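For reference, a multi-output ParDo in the Java SDK can look roughly like the sketch below (the tags, element type, and routing logic are illustrative, not taken from the question):

// Illustrative only: the tags, element type, and routing logic are assumptions.
final TupleTag<String> ONE = new TupleTag<String>() {};
final TupleTag<String> TWO = new TupleTag<String>() {};
final TupleTag<String> THREE = new TupleTag<String>() {};

PCollectionTuple results = input.apply("Split",
    ParDo.of(new DoFn<String, String>() {
        @ProcessElement
        public void processElement(ProcessContext c) {
            String element = c.element();
            if (element.startsWith("a")) {
                c.output(element);          // main output, tagged ONE
            } else if (element.startsWith("b")) {
                c.output(TWO, element);     // additional output
            } else {
                c.output(THREE, element);   // additional output
            }
        }
    }).withOutputTags(ONE, TupleTagList.of(TWO).and(THREE)));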
If I understand your question correctly, you are looking to order the processing of each of those outputs in the subsequent steps? If so, you could potentially use the Wait transform.
So, for a PCollectionTuple "results" with three tuple tags (ONE, TWO, and THREE):
// ONE is processed first; TWO waits on ONE's output, and THREE waits on TWO's.
PCollection<?> one = results.get(ONE).apply(new ProcessOne());
PCollection<?> two = results.get(TWO).apply(Wait.on(one)).apply(new ProcessTwo());
PCollection<?> three = results.get(THREE).apply(Wait.on(two)).apply(new ProcessThree());
This should ensure ONE is processed before TWO, which in turn is processed before THREE.
I have multiple custom combine functions, which I apply as follows ('data' has been calculated earlier in the pipeline):
cd1 = data | customCombFn1()
cd2 = data | customCombFn2()
cd3 = data | customCombFn3()
How does the pipeline work in the above case? Is 'data' evaluated again and again? Or are cd1, cd2, and cd3 evaluated as by-products of the pipeline?
Your data object is a PCollection. Applying a combine transformation to a PCollection creates another PCollection, most often containing far fewer elements.
There is no 're-evaluation', as you call it. A PCollection is typically produced on multiple workers and immediately consumed by the transformations that need it. If that is not possible in a given case, the PCollection will typically be stored for processing at a later point.
Generally speaking, the Cloud Dataflow service automatically applies optimizations to users' pipelines. In most cases, including this one, that allows users to focus on their business logic instead of the underlying execution considerations.
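To make the branching concrete, here is a minimal sketch (the source and the three combines are stand-ins for the original data and customCombFn1/2/3): all three results come out of a single pipeline execution, with 'data' produced once.

# Illustrative sketch: the source and combine functions are stand-ins.
import apache_beam as beam

with beam.Pipeline() as p:
    data = p | 'Read' >> beam.Create([1, 2, 3, 4])     # 'data' is produced once
    cd1 = data | 'Sum' >> beam.CombineGlobally(sum)    # branch 1
    cd2 = data | 'Max' >> beam.CombineGlobally(max)    # branch 2
    cd3 = data | 'Min' >> beam.CombineGlobally(min)    # branch 3
    # All three branches consume the same PCollection within one pipeline run.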
As shown here, Dataflow pipelines are represented by a fixed DAG. I'm wondering whether it's possible to implement a pipeline where the processing proceeds until a dynamically evaluated condition is satisfied, based on the data computed so far.
Here's some pseudo code to illustrate what I'd like to implement:
PCollection pco = null
while(true):
    pco = pco.apply(someTransform())
    if (conditionSatisfied(pco)):
        break
pco.Write()
It seems like you really want iterative computations. Right now Dataflow does not provide support for that, but we are aware that it is a very important use case and we are working on finding the right set of APIs to express it.
For now your workarounds are:
Iteratively run whole pipelines (run pipeline, inspect output, run again if the condition is not satisfied, etc). This has the obvious downside of pipeline setup and teardown overhead.
Build a pipeline with a hard-coded number of iterations by .apply()'ing in a loop unconditionally, then run the whole pipeline (see the sketch after this list).
A combination of the two, e.g. run fixed 5-iteration pipelines until you're satisfied with the result.
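A minimal sketch of the second workaround in the Java SDK (ReadInput, SomeTransform, WriteOutput, and the element type MyRecord are placeholders, not real APIs): the loop runs at pipeline-construction time, so the fixed number of iterations is simply unrolled into the DAG.

// Sketch of the hard-coded-iterations workaround; all transforms and MyRecord are placeholders.
int iterations = 5;  // hard-coded iteration count
PCollection<MyRecord> pco = pipeline.apply(new ReadInput());
for (int i = 0; i < iterations; i++) {
    // Each apply() adds one more stage to the fixed DAG at construction time.
    pco = pco.apply("Iteration " + i, new SomeTransform());
}
pco.apply(new WriteOutput());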