CI tools for static checking of environment variable definitions - jenkins

For security reasons, none of our projects are allowed to define environment variables in the Dockerfile or in entrypoint.sh (the script that starts the application). Currently, we have to review each merge request manually to enforce this rule. Is there a CI tool that can automate this check and make life easier? Thanks!

GitLab comes with container scanning by default, but I don't believe it can achieve this level of granularity. It might be possible in a pipeline that uses Trivy for scanning in conjunction with custom Trivy policies to detect misconfigurations in Dockerfiles. It will take some trial and error to write the custom policies, but I believe it can be done.
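If a full policy engine feels like overkill, another option is a small script run as an early CI job that simply fails the merge request whenever a Dockerfile or entrypoint.sh defines an environment variable. A minimal sketch, assuming the files can live anywhere in the repository and the rule is just "no ENV instructions and no variable exports":

    #!/usr/bin/env python3
    """Fail the CI job if a Dockerfile or entrypoint.sh defines environment variables."""
    import re
    import sys
    from pathlib import Path

    # Assumed rule set: ENV instructions in Dockerfiles, exports/assignments in entrypoint.sh.
    CHECKS = {
        "Dockerfile": re.compile(r"^\s*ENV\s+", re.IGNORECASE),
        "entrypoint.sh": re.compile(r"^\s*(export\s+)?[A-Z_][A-Z0-9_]*="),
    }

    def main() -> int:
        violations = []
        for filename, pattern in CHECKS.items():
            for path in Path(".").rglob(filename):
                for lineno, line in enumerate(path.read_text().splitlines(), start=1):
                    if pattern.match(line):
                        violations.append(f"{path}:{lineno}: {line.strip()}")
        if violations:
            print("Environment variable definitions are not allowed:")
            print("\n".join(violations))
            return 1  # non-zero exit fails the CI job / merge request check
        return 0

    if __name__ == "__main__":
        sys.exit(main())

Running something like this as a required job in the merge-request pipeline removes the need to review this particular rule by hand.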

Related

On Demand Container Serving With GCP

I have a Docker image that will execute different logic depending on the environment variables that it is run with. You can imagine that running it with VAR=A will produce slightly different logic compared to running it with VAR=B.
We have an application that is meant to allow users to kick off tasks that are defined within this Docker image. Depending on the user attributes, different environment variables will need to be passed into the Docker container when it is run. The task is meant to run each time an event is generated through user action and then the container should shut down/be removed.
I'm trying to determine if GCP has any container services that best match what I'm looking for. My understanding of some of the services is:
Cloud Functions - can work well for consuming events and taking specific actions each time an event is triggered, but it is not suited for containerized workloads.
Cloud Run - a serverless way of deploying containers. As I understand it, a deployment on cloud run spins up a "service", and the environment variables must be passed in as part of the service definition. Because we may have a large number of combinations of environment variables (many of which may need to be running at once), it seems that this would end up creating a large number of services, which feels potentially clunky. This approach seems better for deploying a single service with static environment variables that needs to be auto-scaled by GCP.
GKE - another container orchestration platform. This is what I'm considering at the moment. The idea is that we would define a single job definition that can vary according to the environment variables passed into it. The problem is that these jobs would need to be kicked off dynamically via code. This is a fairly straightforward process with kubectl, but the Kubernetes REST API seems fairly underdeveloped (or at least not that well documented), and the lack of information online on how to start jobs on-demand through the Kubernetes API makes me question whether this is the best approach (a sketch of this follows after the question).
Are there any tools that I'm missing that would be useful in spinning up containers on-demand with dynamic sets of environment variables and removing them when done?
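For what it's worth, the GKE/Kubernetes Job approach mentioned above is reasonably well supported by the official client libraries, so one-off jobs can be started on demand from application code. A minimal sketch with the Python client (the image name, namespace, and clean-up TTL are assumptions):

    # pip install kubernetes -- uses the official Kubernetes Python client
    from kubernetes import client, config

    def run_task(var_value: str) -> None:
        """Create a one-off Job whose container receives a dynamic environment variable."""
        config.load_kube_config()  # use config.load_incluster_config() when running inside the cluster
        job = client.V1Job(
            api_version="batch/v1",
            kind="Job",
            metadata=client.V1ObjectMeta(generate_name="task-"),
            spec=client.V1JobSpec(
                ttl_seconds_after_finished=300,  # have Kubernetes remove the finished Job automatically
                template=client.V1PodTemplateSpec(
                    spec=client.V1PodSpec(
                        restart_policy="Never",
                        containers=[
                            client.V1Container(
                                name="task",
                                image="gcr.io/my-project/task-image:latest",  # assumed image name
                                env=[client.V1EnvVar(name="VAR", value=var_value)],
                            )
                        ],
                    )
                ),
            ),
        )
        client.BatchV1Api().create_namespaced_job(namespace="default", body=job)

    run_task("A")  # same image, different behaviour via VAR=A vs VAR=B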

Complex/Orchestrated CD with AWS CodePipeline or others

Building an AWS serverless solution (Lambda, S3, CloudFormation, etc.), I need an automated build solution. The application should be stored in a Git repository (preferably Bitbucket or CodeCommit). I looked at Bitbucket Pipelines, AWS CodePipeline, CodeDeploy, and hosted CI/CD solutions, but it seems that all of these do something static: they receive a dumb signal that something changed and rebuild the whole environment... as if it were one app, not a distributed application.
I want to define ordered steps of what to do depending on the file type of each change.
E.g.
1. every updated .js file containing Lambda code should first be used to update the existing Lambda
2. after that, every new or changed CloudFormation file/stack should be used to update or create stacks; there may be a required order (stacks importing values from each other)
3. after that, the .js code for new Lambdas should be used to update the code of the Lambdas created in the previous step
Resources that were not changed should NOT be updated or recreated!
It seems that my pipelines need to be ordered AND able to filter their input (e.g. only .js files from a certain path), and they also need to receive the names of the changed resources as input.
I don't seem to find this functionality within AWS, hosted Git solutions like Bitbucket, or CI/CD pipelines like CircleCI, Codeship, AWS CodePipeline, CodeDeploy, etc.
How come? Doesn't anyone need this? Seems like a basic requirement in my eyes....
I looked again at the available AWS tooling and came to the following conclusion:
When coupling CodePipeline to a CodeCommit repository, every commit puts a complete package of the repository on S3 as input for the pipeline. So not only the changes, but everything.
In CodePipeline there is the orchestration functionality I was looking for. You can have actions for every component, such as create-change-set and execute-change-set for a SAM component, and you have control over the order of everything.
But:
Since all the code is given as input, I assume all actions in the pipeline will be triggered even for a small change that does not affect 99% of the resources. Under the hood, SAM or CloudFormation will determine for themselves what did or did not change, but it is not very efficient. See my post here.
I cannot see in the pipeline overview which pipeline was run last and what its status was...
I cannot temporarily disable a pipeline or trigger it with custom input.
In the end I am thinking of making a main pipeline with custom Lambda code that determines what actually changed using the CodeCommit API, and splitting all actions into sub-pipelines. From the main pipeline I will push the input they need to S3 and execute them.
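A rough sketch of that change-detection Lambda with boto3 (the repository name, event shape, and path-to-pipeline mapping are assumptions):

    # pip install boto3 -- sketch of a Lambda that maps changed paths to sub-pipelines
    import boto3

    codecommit = boto3.client("codecommit")
    codepipeline = boto3.client("codepipeline")

    # Hypothetical mapping from path prefixes to the sub-pipeline that handles them.
    PATH_TO_PIPELINE = {
        "lambdas/": "update-lambdas-pipeline",
        "cloudformation/": "update-stacks-pipeline",
    }

    def handler(event, context):
        """Find which files changed between two commits and start only the affected sub-pipelines."""
        # Note: get_differences is paginated; a real implementation would follow NextToken.
        diff = codecommit.get_differences(
            repositoryName="my-repo",                      # assumed repository name
            beforeCommitSpecifier=event["before_commit"],  # assumed event shape
            afterCommitSpecifier=event["after_commit"],
        )
        changed_paths = {d["afterBlob"]["path"] for d in diff["differences"] if "afterBlob" in d}
        to_start = {
            pipeline
            for path in changed_paths
            for prefix, pipeline in PATH_TO_PIPELINE.items()
            if path.startswith(prefix)
        }
        for pipeline in to_start:
            codepipeline.start_pipeline_execution(name=pipeline)
        return {"changed": sorted(changed_paths), "started": sorted(to_start)}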
(I'm not allowed to comment, so I'll try to provide an answer instead - probably not the one you were hoping for :) )
There is definitely a need and at Codeship we're looking into how best to support FaaS/Serverless workflows. It's been a bit of a moving target over the last years, but more common practices etc. are starting to emerge/mature to a point where it makes more sense to start codifying them.
For now, it seems most people working in this space have resorted to scripting (either with the Serverless framework, or directly against the FaaS providers), but everyone is struggling with the issue of deploying just what has changed vs. deploying everything, as you point out. Adding further complexity with sequencing obviously just makes things harder.
Most services (Codeship included) will allow you some form of sequenced/stepped approach to deploying, but you'll have to do all the heavy lifting of working out what has changed etc.
As to your question of "How come?", I think it's purely down to how fast the tooling has been changing lately, combined with how few are really doing it. There's a huge push for larger companies to move to K8s and I think they've basically just drowned out the FaaS adopters. Not that it should be like that, or that we at Codeship don't want to change that; it's just how I personally see things.

Jenkinsfile - Mutual exclusivity across multiple pipelines

I'm looking for a way to make multiple declaratively written Jenkinsfiles run only exclusively and block each other. They consume test instances that are terminated after they run, which causes problems when PRs are tested as they come in.
I cannot find an option to make the BuildBlocker plugin do this; the Jenkinsfiles that use this plugin don't run with our plugin/Jenkins version combination, and it seems the [$class: <some.java.expression>] strings exported by the syntax generator don't work here anyway.
I cannot find a way to apply these locks across all the steps involved in the pipeline.
I could hack together a file lock, but that won't help me with multi-node builds.
This plugin could perhaps help you, as it allows you to lock previously declared resources: if a resource is currently locked, any other job that requires the same resource will wait until it is released.
https://plugins.jenkins.io/lockable-resources/
Since you say you want declarative, you should probably wait for the currently-in-review "Allow locking multiple stages in declarative pipeline" Jira issue to be completed. You can also vote for it and watch it.
And if you can't wait, this is your opportunity to learn golang (or whatever language you want to learn) by implementing a microservice that holds these locks that you call from your pipeline scripts. :D
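In that spirit, here is a minimal sketch of such a lock-holding service in Python with Flask (the endpoint layout and the single-instance, in-memory state are my own assumptions, so treat it as a toy):

    # pip install flask -- a toy lock service that pipelines call before and after using a test instance
    # State is in memory only: restarting the service loses all locks, so this is just a sketch.
    import threading
    from flask import Flask, jsonify

    app = Flask(__name__)
    held = {}                      # resource name -> current owner (e.g. a build tag)
    state_lock = threading.Lock()  # protects `held` against concurrent requests

    @app.route("/acquire/<resource>/<owner>", methods=["POST"])
    def acquire(resource, owner):
        with state_lock:
            if resource in held and held[resource] != owner:
                return jsonify(acquired=False, holder=held[resource]), 409
            held[resource] = owner
            return jsonify(acquired=True), 200

    @app.route("/release/<resource>/<owner>", methods=["POST"])
    def release(resource, owner):
        with state_lock:
            if held.get(resource) == owner:
                del held[resource]
        return jsonify(released=True), 200

    if __name__ == "__main__":
        app.run(port=8080)

A pipeline would poll the acquire endpoint until it gets a 200, run its stages against the test instance, and call release in a cleanup step. In practice, though, the Lockable Resources plugin above covers the common cases with far less to maintain.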
The Job DSL plugin can be used to configure Jenkins execution policies, including blocking other jobs, and to call pipeline code.
https://jenkinsci.github.io/job-dsl-plugin/#method/javaposse.jobdsl.dsl.jobs.FreeStyleJob.blockOn describes the blockOn method, which the Build Blocker plugin also uses.
The tutorial for usage is https://github.com/jenkinsci/job-dsl-plugin/wiki/Tutorial---Using-the-Jenkins-Job-DSL and the API reference is https://github.com/jenkinsci/job-dsl-plugin/wiki/Job-DSL-Commands.
There is also a tutorial at https://www.digitalocean.com/community/tutorials/how-to-automate-jenkins-job-configuration-using-job-dsl.
It should also be possible to use the Dynamic DSL (https://github.com/jenkinsci/job-dsl-plugin/wiki/Dynamic-DSL), but I have not found a good usage example yet.

CI / CD: Principles for deployment of environments

I am not a developer, but I am reading about CI/CD at the moment and wondering about good practices for automated code deployment. So far I have read a lot about deploying code to a pre-existing environment.
My question now is whether it is also good practice to use e.g. a Jenkins workflow to deploy an environment from scratch when a new build is created, for example to test a newly created build and then delete the environment again after testing.
I know that there are various plugins to interact with AWS, Azure etc. that could be used to develop a job for deployment of a virtual machine.
There are also plugins to trigger Puppet to deploy infrastructure (as code), and there are plugins to invoke infrastructure orchestration.
So everything is available to be able to deploy the infrastructure and middleware before deploying code (with some extra effort of course).
Is this something that is used in real life? How is it done?
The background of my question is my interest in full automation of development with as few clicks as possible, and cost saving in a pay-per-use model by not having idle machines.
My question now is whether it is also good practice to use e.g. a Jenkins workflow to deploy an environment from scratch when a new build is created
Yes, it is good practice to deploy an environment from scratch. Like you say, Jenkins and Jenkins pipelines can certainly help with kicking off and orchestrating that process, depending on your specific requirements. Deploying a full environment from scratch is one of the hardest things to automate, and if that is automated, it implies that a lot of other things are also automated: infrastructure, application deployments, application configuration, and so on.
Is this something that is used in real life?
Yes, definitely. A lot of shops do this. The simpler your environments, the easier it is, and therefore, a startup with one backend app would have relatively little trouble achieving this valhalla state. But even the creation of the most complex environments--with hundreds of interdependent applications--can be fully automated; it just takes more time and effort.
The background of my question is my interest in full automation of development with as few clicks as possible, and cost saving in a pay-per-use model by not having idle machines.
Yes, definitely. The "spin up and destroy" strategy benefits all hosting models (since, after full automation, no one ever has to wait for someone to manually provision an environment), but those using public clouds see even larger benefits in terms of cost (vs always leaving AWS environments running, for example).
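To make the "spin up and destroy" idea concrete, here is a minimal sketch using CloudFormation through boto3 (the stack name, template file, and parameters are assumptions; the same pattern applies to ARM templates, Terraform, and similar tools):

    # pip install boto3 -- create a throwaway test environment, run the tests, then tear it down
    import boto3

    cfn = boto3.client("cloudformation")
    stack_name = "test-env-build-1234"  # assumed: derived from the build number

    def create_environment():
        with open("environment.yaml") as f:  # assumed template describing the whole environment
            cfn.create_stack(
                StackName=stack_name,
                TemplateBody=f.read(),
                Parameters=[{"ParameterKey": "EnvName", "ParameterValue": stack_name}],
                Capabilities=["CAPABILITY_IAM"],
            )
        cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)

    def destroy_environment():
        cfn.delete_stack(StackName=stack_name)
        cfn.get_waiter("stack_delete_complete").wait(StackName=stack_name)

    create_environment()
    try:
        pass  # deploy the new build into the fresh environment and run the tests here
    finally:
        destroy_environment()  # nothing is left running, so nothing is billed while idle

A Jenkins pipeline stage can run exactly this kind of script, which is what makes the create/test/destroy cycle a one-click (or zero-click) operation.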
I appreciate your thoughts.
Not a problem. I will advise that this question doesn't fit Stack Overflow's question-and-answer sweet spot super well, since it is quite general. In the future, I would recommend chatting with your developers, finding folks who are excited about this sort of thing, and formulating more specific questions when you all get stuck in the weeds on something. Welcome to Stack Overflow!
All of this is used in various combinations; the objective is to deliver continuous value to the end user. My two cents:
Build & Release
It depends on what you are using. I personally recommend using what is available with the tool. For example, VSTS (Visual Studio Team Services) offers a complete CI/CD pipeline. But if you have a unique need that can only be served by Jenkins, you can use that; VSTS integrates with it out of the box.
IaC (Infrastructure as Code)
In addition to Puppet etc., you can take advantage of Azure ARM (Azure Resource Manager) templates to build and destroy an environment. Again, see what is available out of the box with the toolset you have.
Pay-per-use
What I have personally used is Azure DevTest Labs, with code deployed to it via a CI/CD pipeline. I then set up an auto-start/auto-shutdown policy on the VMs based on the times provided. This is a great feature for saving cost on the resources being used and for replicating environments.
For example, the UAT environment might not be needed until QA has signed off. But using IaC you can quickly spin up the environment automatically and then have a one-click deployment set up to deploy code to UAT.

Creating a Docker UAT/Production image

Just a quick question about best practices for creating Docker images for critical environments. As we know, in the real world the team/company deploying to internal test is often not the same as the one deploying to client test environments and production. This becomes a problem because all the app configuration info may not be available when creating the Docker UAT/production image, e.g. with Jenkins. And then there is the question of passwords that are stored in the app configuration.
So my question is, how "fully configured" should the Docker image be? The way I see it, it is in practice not possible to fully configure the Docker image; some app passwords etc. must be left out. But then again, doesn't this slightly defeat the purpose of a Docker image?
how "fully configured" should the Docker image be? The way I see it, it is in practice not possible to fully configure the Docker image; some app passwords etc. must be left out. But then again, doesn't this slightly defeat the purpose of a Docker image?
There will always be tradeoffs between convenience, security, and flexibility.
An image that works with zero runtime configuration is very convenient to run, but it is not very flexible, and sensitive config like passwords will be baked in and exposed.
An image that takes all configuration at runtime is very flexible and doesn't expose sensitive info, but can be inconvenient to use if default values aren't provided. If a user doesn't know some values they may not be able to use the image at all.
Sensitive info like passwords usually lands on the runtime side when deciding what configuration to bake into images and what to require at runtime. However, this isn't always the case. As an example, you may want to build test images with zero runtime configuration that only point to test environments. Everyone has access to the test environment credentials anyway, zero configuration is more convenient for testers, and no one can accidentally run a build against the wrong database.
For configuration other than credentials (e.g. app properties, loglevel, logfile location) the organizational structure and team dynamics may dictate how much configuration you bake in. In a devops environment making changes and building a new image may be painless. In this case it makes sense to bake in as much configuration as you want to. If ops and development are separate it may take days to make minor changes to the image. In this case it makes sense to allow more runtime configuration.
Back to the original question, I'm personally in favor of choosing reasonable defaults for everything except credentials and allowing runtime overrides only as needed (convention with reluctant configuration). Runtime configuration is convenient for ops, but it can make tracking down issues difficult for the development team.
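As an illustration of that "defaults for everything except credentials" approach, here is a small sketch of how the application itself might read its configuration (the variable names and defaults are made up):

    # Sketch: sensible baked-in defaults, optional runtime overrides, credentials required at runtime.
    import os

    # Non-sensitive settings: defaults baked into the image, overridable with `docker run -e ...`.
    LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")
    LOG_FILE = os.environ.get("LOG_FILE", "/var/log/app/app.log")
    DB_HOST = os.environ.get("DB_HOST", "db.test.internal")  # assumed test-environment default

    # Credentials: never baked into the image, so fail fast if they are missing at runtime.
    try:
        DB_PASSWORD = os.environ["DB_PASSWORD"]
    except KeyError:
        raise SystemExit("DB_PASSWORD must be provided at runtime, e.g. docker run -e DB_PASSWORD=...")

With this layout the image runs out of the box against the test environment, ops can override any non-sensitive setting per environment, and the only thing that must always be supplied at deploy time is the credential.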
