I'm considering the option to use Jenkins Pipeline to build data ETL pipelines. For the moment, it sounds more attractive, more modern and simpler to use than Make/Makefile.
However, I don't understand if the same Make/Makefile step-triggering behaviour is available, eg, let's say I have data2.xml built by the script csv2xml.sh, taking data1.csv as input: in a Makefile, it's pretty straightforward to declare that data2.xml must be built only if it doesn't exist or is older than data1.csv.
Is it possible to do the same in Jenkins Pipeline? Or am I looking at the wrong tool?
such steps are available under sh/bat execution, you can deal with your make/makefile as usual , Jenkins will just execute it and give you back results, afterwards inside pipeline you will decide what to do with it, e.g. upload to server or something similar
Related
Is there any way to compare two files in jenkins scripted pipeline and mail the difference? We can write a batch script and call it in jenkins but want to understand if there is any way to do it in jenkins pipeline.
Any suggestions on this is highly appreciated.
If you only want to compare size, the option suggested by Sharp would be a solution. Apart from that, if you want to get a diff, there aren't a lot of built in tools within Jenkins that can help you. You could try the Artifact Diff plugin, but it hasn't been updated in years and may not have the necessary capabilities you are looking for. Using external diff tools you can call would be the best way to go here. As you mentioned, you would have to write a batch script that performs the diff and writes it into a patch file. You could then use Jenkins to email the patch files if they exist.
I have a significant amount of pre-configuration that I want to automate for Jenkins. E.g. Pre configuring gerrit for the gerrit trigger plugins, pre configuring saml, libraries etc
I'm aware of two methods typically used to do similar tasks:
Configuration as code plugin + yaml configuration
Groovy scripts to execute from the init.groovy.d directory of jenkins home on Jenkins startup
My users want to be able to update Jenkins configuration from the UI without needing to update yaml, suggesting the config as code plugin isn't fit for our purpose as I believe it reapplies the config when the Jenkins container is restarted.
My hunch is to use groovy scripts that remove themselves after the first execution so that they don't reapply themselves on restart.
Is there a more standard way of pre configuring Jenkins? or is groovy my best bet?
TL;DR: Use the file system
Why? There is no "standard" way to achieve what you intend; the two approaches that you suggest are viable options for sure.
From operational point of view, however, it will be good to select a solution which is
generic (so it can cover all aspects of Jenkins configuration) and
"simple" to use
Now,
"Configuration as code" makes you depend on the corresponding plugin -- it may or may not support a specific configuration option
With groovy, it is sometimes quite difficult to find out how to set a Jenkins configuration option (and how to store the setting permanently).
Since all Jenkins configuration data is stored on-disk, another option for bootstrapping Jenkins with a well-defined configuration is to pre-fill those configuration files with proper content right away:
You can be sure that this works in all cases, including all border cases (like, secret/encrypyted data)
Users can change the data later on as needed
Usually, it's quite easy to find the proper configuration file
On the downside, there is a risk that the configuration file format might change with newer versions of the core or of some plugin. However, a similar risk exists for the two other solutions that you suggested.
Tip: for rolling out such pre-configured Jenkins setups, it is helpful to disable the Jenkins setup wizard by setting jenkins.install.runSetupWizard to false.
When you combine words like : pre-configuring Jenkins, init.groovy.d, jenkins home, jenkins startup, etc, it sounds confusing o_O
When Jenkins is ready to use, usual folks just need to create jobs or pipelines. If you need to create a job or pipeline, you just need to install and configure some plugins. Very few of them need groovy, because the goal is "Easy to use".
Advanced user are able to create its own plugins, with java. But almost all is available as plugins.
You can use groovy in a pipeline scripts or declarative pipelines.
So if your question is more like "What is the best way to create and configure jobs or pipelines", I can advise you:
Try much as possible to use pipeline scripts or declarative pipelines.
Use just verified and supported plugins.
Stop call shell scripts in hard drive.
Stop using complicated configurations. Almost all of requirements are already implemented and documented.
If you have a requirement and no one plugin seems to help you, ask here in stackoverflow or develop your own plugin focused in configurability, so you can release it, for the benefit of Jenkins Community.
I have 11 jobs running on the Jenkins master node, all of which have a very similar pipeline setup. For now I have integrated each job with its very own Jenkinsfile that specifies the stages within the job and all of them build just fine. But, wouldn't it be better to have a single repo that has some files (preferably a single Jenkinsfile and some libraries) required to run all the jobs that have similar pipeline structure with a few changes that can be taken care of with a work around?
If there is a way to accomplish this, please let me know.
Use a Shared Library to define common functionality. Your 11 Jenkinsfiles can then be as small as only a single call to the function implementing the pipeline.
Besides using a Shared Library, you can create a groovy file with common functionality and call its methods via load().
Documentation
and example. This is an easier approach, but in the future with the increasing complexity of pipelines, this may impose some limitations.
I have a build that's currently using the old build flow plugin that I'm trying to convert to pipeline.
This build can be massively parallelized (many units of work can run on many different nodes) but we only want to extract the source code once at the beginning, preferably with the Pipeline script from SCM option. I'm at a loss to understand how I can share the source extract (which apparently is on the master) with all of the "downstream" nodes that will be used by the pipeline script.
For build flow we extracted to a well-known location on a shared file system and all of the downstream jobs invoked by the flow were passed (or could derive) that location. That always felt icky & I was hoping that pipeline would have solved this problem but I can't find anything to suggest that it has. What am I missing?
I believe the official recommendation for this is to make bundles of the source and then use "stash" and "unstash" to make them available to deeper steps of your pipeline script.
See https://www.cloudbees.com/blog/parallelism-and-distributed-builds-jenkins
Keep in mind that this doesn't do anything to help with line-endings. If you have builds that span OSs with different line endings you either need to make OS-specific stashes, or just checkout to a safe label in each downstream step.
After further research it seems like the External Workspace Manager Plugin does what I'm looking for.
I want to setup a continous integration system that upon a commit or similar trigger should:
run tests on a fortran/C/C++ code, if needed.
compile that code using cmake.
run tests on a rails app.
compile the rails ap.
restart the server.
I'm looking at Jenkins. Is it the best choice for this kind of work? Also, what's the difference between using a bash script that makes all that (if possible) and using jenkins? I'm asking not because I'm thinking about using a script, but to better understand jenkins.
It sounds like Jenkins would certainly be a reasonable choice for this. Apart from the ability to run arbitrary scripts as build steps, there's also a large number of plugins, which provide better integration with cmake for example.
Even if you're using a single bash script to do all of this, using Jenkins on top of it would still have a number of advantages. You get a web interface, email notifications and build history for free, with all that this entails. By integrating your tests "properly" with Jenkins, you can also get things like graphs that show how many tests succeeded/failed over time.
I am using Jenkins for java projects and have to say it is easy to configure. I used to add lots of plugins for better configuration of build steps, but tend to go back to using scripting languages for build and deploy steps because of two main reasons. If I have a build script, it's easier to configure the same job on a different Jenkins server or run the script manually if need be and the build configuration is not so cluttered (I still have one maven job with more than 50 post build steps). The second reason is, that it is easier to version the scripts in SVN, compared to having the build config in SVN.
So to answer your questions. I don't know if it is the 'best' tool, but it is good enough for me. Regarding scripting: use each tool for what it is build for. Jenkins a glorified cron deamon with great options when it comes to displaying analysis. The learning curve for people to use it is minimal (i.e. starting a job, seeing whether it failed.) Configuring Jenkins needs a little bit more learning, but it's very easy to set up simple jobs and go then to the more complicated tasks.
For the first four activities Jenkins will do the job and is rather the best choice nowadays, but for things like restarting the server (which is actually "remote execution"), better have a look at:
http://saltstack.com/
or:
https://wiki.opscode.com/display/chef/Home
http://cfengine.com/
http://puppetlabs.com/
http://cfengine.com/
Libraries like Fabric(Python) or Capistrano(Ruby) might be useful too.