How to specify task pod resources in Kubeflow Kale - kubeflow

Is it possible to specify the pod resources (memory and CPU requests and limits), for a task in a kubeflow pipeline created with Kale?
To provide more details, when writing the Kubeflow pipeline in Python using the DSL, I can specify the task pod resources as follows:
task1 = (
    train_op(...)
    .set_memory_request('600M')
    .set_memory_limit('1200M')
    .set_cpu_request('700m')
    .set_cpu_limit('1400m')
)
Is it possible to do the same with Kale?

From what I can see, setting pod resource requests and limits for each task in the Kubeflow pipeline is not possible with Kale. The best alternative is to set default resources for the namespace (for example, with a Kubernetes LimitRange), which will set the pod requests and limits for all tasks in the Kubeflow pipeline.
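For reference, namespace-wide defaults are configured with a Kubernetes LimitRange. Below is a minimal sketch using the official Python client; the namespace name and the quantities are illustrative and not specific to Kale.

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside the cluster

# Default requests/limits applied to every container in the namespace
# that does not declare its own resources (illustrative values).
limit_range = client.V1LimitRange(
    metadata=client.V1ObjectMeta(name="pipeline-defaults"),
    spec=client.V1LimitRangeSpec(
        limits=[
            client.V1LimitRangeItem(
                type="Container",
                default_request={"cpu": "700m", "memory": "600M"},
                default={"cpu": "1400m", "memory": "1200M"},
            )
        ]
    ),
)

client.CoreV1Api().create_namespaced_limit_range(
    namespace="kubeflow-user", body=limit_range
)

Once the LimitRange is in place, any pipeline pod that omits its own resources block picks up these defaults.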

Related

How to increase number executors on jenkins build agent(kubernetes pods)

I have a pod template declared in the configure clouds section and I am using the jenkins/inbound-agent:4.3-4 image. Build agents are coming up fine, but they come up with just one executor. Is there a way I can increase that number?
The reason I would like to increase the number of executors is that I want to create a job which triggers other jobs sequentially, and I want all the downstream projects to run on the same agent as the main job.
I don't see any option in the configure clouds section; any help or clue on workarounds is appreciated.
I ran into the same issue in the same situation.
I found a post describing a similar situation; the kubernetes plugin may work the same way as the amazon-ecs plugin, both of which hard-code the executor count to 1:
Jenkins inbound-agent container via ecs/fargate plugin - set # executors for node
So, running the pipeline steps one by one is the only way I know to avoid this issue.
If you need to call another job, setting wait: false will work, like this:
build(job: "xxxx", wait: false, parameters: [string(name: "key", value: "value")])

Spring Cloud Data Flow Pod Cleanup

We are repeatedly seeing resource quota limitation issues in the logs, and Task jobs fail on SCDF running on Kubernetes. The problem is that there are so many pods in "running" status even after they have completed. I understand SCDF does not delete the pods and that it is the developer's responsibility to clean them up.
Even when I run the Task Execution Cleanup from the SCDF dashboard UI, it only cleans up the execution logs and the task from the UI, but the pods created by that task still remain. Is this expected? Shouldn't Task Execution Cleanup also delete the pods? We are using Spring-Cloud-Dataflow-Server 2.4.2 Release.
Is there a way to clean up the pods right after the execution is complete? Any best practices here?
Method - 1
You can clean up task executions by using the RESTful API provided by spring-cloud-dataflow.
Request Structure
DELETE /tasks/executions/{ids}?action=CLEANUP,REMOVE_DATA HTTP/1.1
Host: localhost:9393
Fire a DELETE request:
http://<stream-url>/tasks/executions/<task-ids-separated-by-comma>?action=CLEANUP,REMOVE_DATA
eg: http://localhost:9393/tasks/executions/1,2?action=CLEANUP,REMOVE_DATA
Delete Task Execution
Note: The above API will clean up the resources that were used to deploy tasks and delete the data associated with task executions from the underlying persistence store.
CLEANUP: clean up the resources.
REMOVE_DATA: remove data from the persistence store.
You can pass both actions or a single one, depending on your use case.
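For illustration, a minimal sketch of firing that request with Python's requests library, assuming the Data Flow server is reachable at localhost:9393 and the task execution IDs are 1 and 2 (both assumptions, adjust for your deployment):

import requests

# Hypothetical server URL and execution IDs.
url = "http://localhost:9393/tasks/executions/1,2?action=CLEANUP,REMOVE_DATA"
resp = requests.delete(url)
resp.raise_for_status()  # raises if the server did not accept the cleanup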
Method - 2
Using the spring-cloud-dataflow shell.
Enter the spring-cloud-dataflow shell and execute the command below:
task execution cleanup --id <task-id>
eg: task execution cleanup --id 1
Cleanup task execution from spring-cloud-dataflow shell
Other option (applicable to the Kubernetes platform only)
If you want to delete all completed pods, you can do so with the kubectl tool:
kubectl delete pod --field-selector=status.phase==Succeeded -l role=spring-app -n <namespace-where-tasks-launched>
If you want to delete all pods with Error status, execute the command below:
kubectl delete pod --field-selector=status.phase==Failed -l role=spring-app -n <namespace-where-tasks-launched>

Allocate resources for Kubeflow pipeline using pipeline params

I would like to be able to create a Kubeflow pipeline that allows users to set the allocated resources for a run. The end result would be something like this:
Example of Kubeflow "Create Run" UI with ability to set resource allocation.
Definition of the pipeline params is possible; however, the syntax of the pipeline params does not match the validation regex used by Kubeflow to preprocess its YAML definition.
As an example, using the parameter values in the screenshot, I can hard-code the resources allocated to the pipeline by adding this to the pipeline's YAML definition:
resources:
  limits: {nvidia.com/gpu: 1}
  requests: {cpu: 16, memory: 32G}
However, what I want to do is use the pipeline's parameters to define these allocations for each run. Something like:
resources:
  limits: {nvidia.com/gpu: '{{inputs.parameters.gpu_limit}}'}
  requests: {cpu: '{{inputs.parameters.cpu_request}}', memory: '{{inputs.parameters.memory_request}}'}
When I use the second definition of pipeline resources, creation of the pipeline fails because Kubeflow cannot parse these resource parameters: the input parameter syntax '{{inputs.parameters.parameter}}' does not match the regular expression ^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$.
{
  "error_message": "Error creating pipeline: Create pipeline failed: Failed to get parameters from the workflow: InvalidInputError: Failed to parse the parameter.: error unmarshaling JSON: while decoding JSON: quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'",
  "error_details": "Error creating pipeline: Create pipeline failed: Failed to get parameters from the workflow: InvalidInputError: Failed to parse the parameter.: error unmarshaling JSON: while decoding JSON: quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'"
}
Has anyone found a workaround for this issue, or am I trying to force Kubeflow to do something it isn't built for? Defining and using pipeline parameters like I have in the second example works for other portions of the pipeline definition (e.g. args or commands to run in the Docker container).
This just can't be done in the current version of Kubeflow Pipelines. It is a limitation: you cannot change the resources from the pipeline's run parameters.
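To make the limitation concrete, here is a sketch against the KFP v1 Python DSL; the component, image, and parameter names are placeholders, not from the original post.

import kfp.dsl as dsl

@dsl.pipeline(name="resource-demo")
def resource_demo(cpu_request: str = "16"):
    # Placeholder step; any component would do.
    task = dsl.ContainerOp(name="train", image="python:3.9", command=["true"])

    # Hard-coded quantities compile and pass the validation regex:
    task.set_cpu_request("16").set_memory_request("32G").set_gpu_limit("1")

    # Passing the run parameter instead, e.g.
    #   task.set_cpu_request(cpu_request)
    # renders as '{{inputs.parameters.cpu_request}}' in the workflow YAML,
    # which the API server rejects as an invalid resource quantity.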

Jenkins slave pod keeps always running

I am trying to keep the slave pod always running. Unfortunately, using the Kubernetes agent inside the pipeline, I am still struggling with setting "podRetention" to always.
For a declarative pipeline you would use idleMinutes to keep the pod up longer:
pipeline {
    agent {
        kubernetes {
            label "myPod"
            defaultContainer 'docker'
            yaml readTrusted('kubeSpec.yaml')
            idleMinutes 30
        }
    }
    // stages { ... }
}
The idea is to keep the pod alive for a certain time for jobs that are triggered often, the one watching the master branch for instance. That way, if developers are on a rampage pushing to master, the builds will be fast. When the devs are done we don't need the pod to stay up forever, and we don't want to pay for extra resources for nothing, so we let the pod kill itself.

Jenkins scripted pipeline or declarative pipeline

I'm trying to convert my old-style, project-based workflow to a pipeline-based one on Jenkins. While going through the docs I found there are two different syntaxes, named scripted and declarative, with the declarative syntax released fairly recently (end of 2016). Although there is a new syntax release, Jenkins still supports the scripted syntax as well.
Now, I'm not sure in which situation each of these two types would be the best match. Will declarative be the future of the Jenkins pipeline?
Can anyone share some thoughts about these two syntax types?
When Jenkins Pipeline was first created, Groovy was selected as the foundation. Jenkins has long shipped with an embedded Groovy engine to provide advanced scripting capabilities for admins and users alike. Additionally, the implementors of Jenkins Pipeline found Groovy to be a solid foundation upon which to build what is now referred to as the "Scripted Pipeline" DSL.
As it is a fully featured programming environment, Scripted Pipeline offers a tremendous amount of flexibility and extensibility to Jenkins users. The Groovy learning-curve isn’t typically desirable for all members of a given team, so Declarative Pipeline was created to offer a simpler and more opinionated syntax for authoring Jenkins Pipeline.
The two are both fundamentally the same Pipeline sub-system underneath. They are both durable implementations of "Pipeline as code." They are both able to use steps built into Pipeline or provided by plugins. Both are able to utilize Shared Libraries.
Where they differ however is in syntax and flexibility. Declarative limits what is available to the user with a more strict and pre-defined structure, making it an ideal choice for simpler continuous delivery pipelines. Scripted provides very few limits, insofar that the only limits on structure and syntax tend to be defined by Groovy itself, rather than any Pipeline-specific systems, making it an ideal choice for power-users and those with more complex requirements. As the name implies, Declarative Pipeline encourages a declarative programming model. Whereas Scripted Pipelines follow a more imperative programming model.
Copied from Syntax Comparison
Another thing to consider is declarative pipelines have a script() step. This can run any scripted pipeline. So my recommendation would be to use declarative pipelines, and if needed use script() for scripted pipelines. Therefore you get the best of both worlds.
I made the switch to declarative recently from scripted with the kubernetes agent. Up until July '18 declarative pipelines didn't have the full ability to specify kubernetes pods. However with the addition of the yamlFile step you can now read your pod template from a yaml file in your repo.
This then lets you use e.g. vscode's great kubernetes plugin to validate your pod template, then read it into your Jenkinsfile and use the containers in steps as you please.
pipeline {
    agent {
        kubernetes {
            label 'jenkins-pod'
            yamlFile 'jenkinsPodTemplate.yml'
        }
    }
    stages {
        stage('Checkout code and parse Jenkinsfile.json') {
            steps {
                container('jnlp') {
                    script {
                        inputFile = readFile('Jenkinsfile.json')
                        config = new groovy.json.JsonSlurperClassic().parseText(inputFile)
                        containerTag = env.BRANCH_NAME + '-' + env.GIT_COMMIT.substring(0, 7)
                        println "pipeline config ==> ${config}"
                    } // script
                } // container('jnlp')
            } // steps
        } // stage
    } // stages
} // pipeline
As mentioned above, you can add script blocks. Here is an example pod template with custom jnlp and docker containers:
apiVersion: v1
kind: Pod
metadata:
  name: jenkins-pod
spec:
  containers:
    - name: jnlp
      image: jenkins/jnlp-slave:3.23-1
      imagePullPolicy: IfNotPresent
      tty: true
    - name: rsync
      image: mrsixw/concourse-rsync-resource
      imagePullPolicy: IfNotPresent
      tty: true
      volumeMounts:
        - name: nfs
          mountPath: /dags
    - name: docker
      image: docker:17.03
      imagePullPolicy: IfNotPresent
      command:
        - cat
      tty: true
      volumeMounts:
        - name: docker
          mountPath: /var/run/docker.sock
  volumes:
    - name: docker
      hostPath:
        path: /var/run/docker.sock
    - name: nfs
      nfs:
        server: 10.154.0.3
        path: /airflow/dags
Declarative appears to be the more future-proof option and the one that people recommend. It's the only one the Visual Pipeline Editor can support, it supports validation, and it ends up having most of the power of scripted since you can fall back to scripted in most contexts. Occasionally someone comes up with a use case where they can't quite do what they want to do with declarative, but this is generally people who have been using scripted for some time, and these feature gaps are likely to close in time.
more context: https://jenkins.io/blog/2017/02/03/declarative-pipeline-ga/
The Jenkins documentation properly explains and compares both the types.
To quote:
"Scripted Pipeline offers a tremendous amount of flexibility and extensibility to Jenkins users. The Groovy learning-curve isn’t typically desirable for all members of a given team, so Declarative Pipeline was created to offer a simpler and more opinionated syntax for authoring Jenkins Pipeline.
The two are both fundamentally the same Pipeline sub-system underneath."
Read more here: https://jenkins.io/doc/book/pipeline/syntax/#compare
The declarative pipeline is defined within a block labelled ‘pipeline’ whereas the scripted pipeline is defined within a ‘node’.
Syntax: a declarative pipeline is organised into 'stages' and 'steps'.
If the build fails, the declarative pipeline gives you an option to restart the build from that stage again, which is not possible with the scripted option.
If there is any issue in the script, the declarative pipeline will notify you as soon as you build the job, whereas a scripted one will pass the stages that are okay and only throw an error on the stage that is not.
You can also refer to this, a very good read: https://e.printstacktrace.blog/jenkins-scripted-pipeline-vs-declarative-pipeline-the-4-practical-differences/
@Szymon.Stepniak (https://stackoverflow.com/users/2194470/szymon-stepniak?tab=profile)
I also have this question, which brought me here. The declarative pipeline certainly seems like the preferred method, and I personally find it much more readable, but I'm trying to convert a mid-complexity Freestyle job to declarative and I've found at least one plugin, the Build Blocker plugin, that I can't get to run even in a script block in a step (I've tried putting the corresponding "blockOn" command everywhere with no luck, and the returned error is usually "No such DSL method 'blockOn' found among steps"). So I think plugin support is a separate issue even with the script block (someone please correct me if I'm wrong about this). I've also had to use the script block several times to get what I consider simple behaviours to work, such as setting the build display name.
Based on my experience, I'm leaning towards redoing my work as scripted, since declarative support still isn't where we need it to be. That's unfortunate, as I agree declarative seems the most future-proof option and it is officially supported. Maybe consider how many plugins you intend to use before making a choice.
