Upload pipeline on Kubeflow

I am currently trying to set up a Kubeflow pipeline. My use case requires that the configuration for pipelines be provided via a YAML/JSON structure. Looking into the documentation for submitting pipelines, I came across this paragraph:
Each pipeline is defined as a Python program. Before you can submit a pipeline to the Kubeflow Pipelines service, you must compile the pipeline to an intermediate representation. The intermediate representation takes the form of a YAML file compressed into a .tar.gz file.
Is it possible to upload/submit a pipeline to Kubeflow via a JSON representation, or any other representation, instead of a compressed (tar.gz) file? Is there a way to bypass persisting files (zips and tar.gz) on the filesystem and store them in a database as a YAML/JSON representation?

When you compile your Python pipeline code, the result is a compressed file containing a YAML file. After decompressing it, you can take out the YAML file and store its contents in your database table.
Later, if you want to upload it to Kubeflow, you can use the following code:
import kfp

pipeline_file_path = 'pipelines.yaml'  # extract it from your database
pipeline_name = 'Your Pipeline Name'

client = kfp.Client()
pipeline = client.pipeline_uploads.upload_pipeline(
    pipeline_file_path, name=pipeline_name)

Related

How to execute a Groovy script from a Jenkins pipeline

I am new to Jenkins.
I have written a Groovy script, which loads a token secret from an adjacent config.properties file.
What's the best way to execute this script in a Groovy pipeline?
What I think I need to do is:
download that script from SCM
change the token in config.properties (retrieved from the stored Jenkins credentials)
execute that script
I've seen various options:
loading the script with load(), but this seems to be more dedicated to loading helper functions, whereas I only need to execute the whole script
using shared libraries, but this seems to be more dedicated to code snippets reused across multiple jobs, which is not my case
using withGroovy {}
using sh groovy
using a docker image that contains a groovy sdk
I tried these, and managed to get none of them to work (in part due to the extra difficulty of retrieving the credential), so before I go any further I'd like to know what's the best option to proceed. For example, instead of trying to change the file config.properties in my pipeline to set the correct token, should I rather change my Groovy script so that it takes the token from an environment variable?
Thanks
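For illustration only, a minimal sketch of the environment-variable variant mentioned in the question, assuming the secret is stored as a Jenkins "Secret text" credential with the hypothetical ID my-token, the script lives in the checked-out repository, and a Groovy runtime is available on the agent:

pipeline {
    agent any
    stages {
        stage('Run script') {
            steps {
                // fetch the repository that contains the script and config.properties
                checkout scm
                // bind the stored credential to an environment variable for the
                // duration of this block instead of rewriting config.properties
                withCredentials([string(credentialsId: 'my-token', variable: 'API_TOKEN')]) {
                    // the Groovy script would read System.getenv('API_TOKEN')
                    // as a fallback when config.properties holds no real value
                    sh 'groovy my-script.groovy'
                }
            }
        }
    }
}

The trade-off is that the script needs a small change to read the environment variable, but the pipeline no longer has to edit a file that is under version control.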

How to access a JSON file in a Bitbucket pipeline?

I have a Bitbucket pipeline that runs Google Lighthouse. I want to access the JSON output that is generated at the end of the pipeline and have it echo one of the variables. I understand that I can use artifacts, but I am unsure of how to access them.
Here is my bitbucket-pipelines.yml file:
script:
  - lhci collect
  - lhci upload
  - echo "===== Lighthouse has completed running ====="
artifacts: # defining the artifacts to be passed to each future step
  - .lighthouseci/*.json
Quoting the official docs:
Artifacts are files that are produced by a step. Once you've defined them in your pipeline configuration, you can share them with a following step or export them to keep the artifacts after a step completes. For example, you might want to use reports or JAR files generated by a build step in a later deployment step. Or you might like to download an artifact generated by a step, or upload it to external storage.
If your JSON file has already been generated by the time the echo "===== Lighthouse has completed running =====" line runs, you don't have to define a separate step for echoing its contents. Do it right there, in the same step. You don't even need artifacts if that's the only thing you want to do with your JSON.

Jenkins declarative pipeline: How to fingerprint a file without archiving it?

I have a Jenkins declarative pipeline job that has the end result of creating some very large output files (> 2 GB in size). I don't want to archive these files in Jenkins as artifacts.
However, I would like to fingerprint these large files so that I can associate them with other builds.
How can I do this, preferably in the post action of the pipeline?
In your pipeline script add: fingerprint 'module/dist/**/*.zip'
where 'module/dist/**/*.zip' is an Ant-style FileSet pattern matching the files you wish to fingerprint.
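Since the question asks about the post section specifically, here is a minimal declarative sketch (the build step and paths are illustrative, not taken from the question):

pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh './build.sh'   // produces the large output files
            }
        }
    }
    post {
        success {
            // records checksums of the files without archiving them
            fingerprint 'module/dist/**/*.zip'
        }
    }
}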
In the console log you should see:
Recording fingerprints
[Pipeline] ...
Although some users have noted in the Jenkins documentation that files also need to be archived for the build not to fail, this works for me on Jenkins ver. 2.180.

How to add a config file for the Config File Provider plugin with a Groovy script in Jenkins

I am using Job DSL in Jenkins. There is a seed job that generates some files that should be shared across other jobs, which could run on different nodes. If the files were not generated dynamically, the Config File Provider plugin could be used for this task. However, I need the files to be dynamic so that no Jenkins UI interaction is needed.
Is it possible to add a file to the plugin with a groovy script?
The only other option I could think of was to record the UI interaction and let a script replay it with modified data. On a more locked-down Jenkins, this would also require getting authentication and CSRF tokens right.
You can use Job DSL to create config files that are managed by the Config File Provider plugin:
configFiles {
    customConfig {
        id('one')
        name('Config 1')
        comment('lorem')
        content('ipsum')
        providerId('???')
    }
}
See https://github.com/jenkinsci/job-dsl-plugin/wiki/Job-DSL-Commands#config-file
When you are using Job DSL you can read in data from anywhere the Groovy runtime can access.
You could store shared config in a hard-coded variable in the script itself.
You could inject the data via a Jenkins parameter to your seed job.
You could retrieve the data from a file in the Git repo where you store your seed job (see the sketch below).
You could retrieve the data from a database, a REST API, and so on.
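As an illustration of the file-in-the-repo option combined with the configFiles block above, a minimal Job DSL sketch (the file name and config id are made up; providerId still depends on the config type, as in the wiki example):

// read a file that was checked out next to the seed job's DSL scripts
def sharedConfig = readFileFromWorkspace('config/settings.json')

configFiles {
    customConfig {
        id('shared-settings')
        name('Shared Settings')
        comment('generated by the seed job')
        content(sharedConfig)
        providerId('???')
    }
}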

Passing s3 artifacts from parallel builds to a single build in Jenkins Workflow

I am attempting to build a Windows installer through Jenkins.
I have a number of Jenkins projects that build individual modules and then save these artifacts in S3 via the S3 artifact plugin.
I'd like to run these in parallel and copy the artifacts to a final "build-installer" job that takes all of them and builds an installer image. I figured out how to run jobs in parallel with Jenkins Workflow, but I don't know where to look to figure out how to extract job result details, ensure they are all from the same changeset, and pass them to the 'build-installer' job.
So far I have workflow script like this:
def packageBuilds = [:]
// these save artifacts to s3:
packageBuilds['moduleA'] = { a_job = build 'a_job' }
packageBuilds['moduleB'] = { b_job = build 'b_job' }
parallel packageBuilds
// pass artifacts from the jobs above to the job below??
build job:'build-installer', parameters:????
Is this the right way? Or should I just have a mega build job that builds the modules and installer in one job?
A single job that does all the steps would be easier to manage.
I know file parameters are not yet supported for sending files to a Workflow job: JENKINS-27413. I have not tried sending files from a Workflow job using file parameters; that probably cannot work without some special support. (Not sure if you can even send file parameters between freestyle builds, for that matter.)
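If you do fold everything into a single Workflow job, the built-in stash/unstash steps can hand the module outputs to the installer stage without going through S3; a minimal sketch (build commands and paths are illustrative):

// build the modules in parallel, then assemble the installer in one job
parallel(
    moduleA: {
        node {
            checkout scm
            bat 'build_moduleA.cmd'                            // illustrative build command
            stash name: 'moduleA', includes: 'moduleA/dist/**'
        }
    },
    moduleB: {
        node {
            checkout scm
            bat 'build_moduleB.cmd'
            stash name: 'moduleB', includes: 'moduleB/dist/**'
        }
    }
)
node {
    unstash 'moduleA'   // restores the stashed files into this workspace
    unstash 'moduleB'
    bat 'build_installer.cmd'
}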
