I have a monorepo with some backend (Node.js) and frontend (Angular) services. Currently my deployment process looks like this:
Check if tests pass
Build docker images for my services
Push docker images to container registry
Apply changes to Kubernetes cluster (GKE) with kubectl
I'm aiming to automate all those steps with the help of Bazel and Cloud Build. But I am really struggling to get started with Bazel:
To make it work I'll probably need to add a WORKSPACE file with my external dependencies and multiple BUILD files for my own packages/services? I need help with the actual implementation:
How to build my Dockerfiles with Bazel?
How to push those images into a registry (preferably GCR)?
How to apply changes to Google Kubernetes Engine automatically?
How to integrate this toolchain with Google Cloud Build?
More information about the project
I've put together a tiny sample monorepo to showcase my use-case
Structure
├── kubernetes
├── packages
│ ├── enums
│ ├── utils
└── services
├── gateway
General
Gateway service depends on enums and utils
Everything is written in Typescript
Every service/package is a Node module
There is a Dockerfile inside the gateway folder, which I want to be built
The Kubernetes configurations are located in the kubernetes folder.
Note that I don't want to publish any npm packages!
What we want is a portable Docker container that holds our Angular app along with its server and whatever machine image it requires, so that we can bring it up on any cloud provider. The entire pipeline is going to be incremental. The Bazel "Docker rules" are fast: they work by adding new Docker layers, so the changes you make to the app are the only things sent over the wire to the cloud host. In addition, since Docker images are tagged with a SHA, we only redeploy images that changed. To manage our production deployment, we will use Kubernetes, for which Bazel rules also exist.
To my knowledge, building a Docker image from a Dockerfile with Bazel is not possible: by design it is not allowed, because a Dockerfile is non-hermetic. (Source: Building deterministic Docker images with Bazel)
Changes to the source code are deployed to the Kubernetes cluster. One way to achieve this with Bazel is to run it in watch mode; the deploy.replace target tells the Kubernetes cluster to update the deployed version of the app.
Command: ibazel run :deploy.replace
Then make a source code change in the Angular app.
Bazel incrementally rebuilds just the parts of the build graph that depend on the changed file. In this case, that includes the ng_module that was changed, the Angular app that includes that module, and the Docker nodejs_image that holds the server. Because we asked to update the deployment, once the build is complete Bazel pushes the new Docker container to Google Container Registry and the Kubernetes Engine instance starts serving it. Since Bazel understands the build graph, it only rebuilds what has changed.
Here are a few snippet-level tips that may help.
WORKSPACE file:
Create a Bazel WORKSPACE file. The WORKSPACE file tells Bazel that this directory is a "workspace", which is like a project root. The things to do inside the WORKSPACE file are listed below.
• The name of the workspace should match the npm package where we publish, so that these imports also make sense when referencing the published package.
• Declare all the rules in the WORKSPACE file using "http_archive". As we are using Angular and Node, rules should be declared for rxjs, angular, angular_material, io_bazel_rules_sass, angular-version, build_bazel_rules_typescript, build_bazel_rules_nodejs.
• Next, load the dependencies using "load": sass_repositories, ts_setup_workspace, angular_material_setup_workspace, ng_setup_workspace.
• Load the Docker base images as well; in our case that is "@io_bazel_rules_docker//nodejs:image.bzl".
• Don't forget to declare the browser and web test repositories:
web_test_repositories()
browser_repositories(
    chromium = True,
    firefox = True,
)
"BUILD.bazel" file.
• Load the rules that were downloaded (ng_module, the project module, etc.).
• Set the default visibility using "default_visibility".
• If you have any Jasmine tests, use the ts_config and list the dependencies inside it.
• ng_module (assets, sources, and dependencies should be listed here).
• If you have any lazy-loading scripts, list them as part of the bundle.
• List the root directories in the web_package.
• Finally, list the data and the welcome page / default page.
Sample Snippet:
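# Load statements first, then the targets.
load("@angular//:index.bzl", "ng_module")
load("@build_bazel_rules_nodejs//:future.bzl", "rollup_bundle")

# Compile the TypeScript sources of this package as an Angular module.
ng_module(
    name = "src",
    srcs = glob(["*.ts"]),
    tsconfig = ":tsconfig.json",
    deps = ["//src/hello-world"],
)

# Bundle the compiled module with Rollup.
rollup_bundle(
    name = "bundle",
    deps = [":src"],
    entry_point = "angular_bazel_example/src/main.js",
)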
Build the bundle with the command below.
bazel build :bundle
Pipeline: through Jenkins
A Jenkins pipeline runs in stages, and each stage performs a separate task. In our case, we use a single stage to publish the image with bazel run.
pipeline {
    agent any
    stages {
        stage('Publish image') {
            steps {
                sh 'bazel run //src/server:push'
            }
        }
    }
}
Note :
bazel run :dev.apply
dev.apply maps to kubectl apply, which will create or replace an existing configuration. (For more information see the kubectl documentation.) This applies the resolved template, which includes republishing images. This action is intended to be the workhorse of fast-iteration development (rebuilding / republishing / redeploying).
If you want to pull containers using the WORKSPACE file, use the rule below:
container_pull(
    name = "debian_base",
    digest = "sha256:**",
    registry = "gcr.io",
    repository = "google-appengine/debian9",
)
If GKE (Google Kubernetes Engine) is used, the gcloud SDK needs to be installed. kubectl can then be authenticated against the cluster with the command below.
gcloud container clusters get-credentials <CLUSTER NAME>
The Deployment object should be declared in the format below:
load("@io_bazel_rules_k8s//k8s:object.bzl", "k8s_object")

k8s_object(
    name = "dev",
    kind = "deployment",
    template = ":deployment.yaml",
    images = {
        "gcr.io/rules_k8s/server:dev": "//server:image",
    },
)
Sources :
https://docs.bazel.build/versions/0.19.1/be/workspace.html
https://github.com/thelgevold/angular-bazel-example
https://medium.com/@Jakeherringbone/deploying-an-angular-app-to-kubernetes-using-bazel-preview-91432b8690b5
https://github.com/bazelbuild/rules_docker
https://github.com/GoogleCloudPlatform/gke-bazel-demo
https://github.com/bazelbuild/rules_k8s#update
https://codefresh.io/howtos/local-k8s-draft-skaffold-garden/
https://github.com/bazelbuild/rules_k8s
A few months later and I've gone relatively far in the whole process.
Posting every detail here would just be too much!
So here is the open-source project which has most of the requirements implemented: https://github.com/flolu/fullstack-bazel
Feel free to contact me with specific questions! :)
Good luck
Flo, have you considered using terraform and a makefile for auto-building the cluster?
In my recent project, I automated infrastructure end to end with make & terraform. Essentially, that approach builds the entire cluster, build and deploys the entire project with one single command within 3 - 5 minutes. Depends on how fast gcp is on a given day.
There is a Google sample project showing the idea, although the Terraform config is outdated and needs to be replaced with a config adhering to the current 0.13 / 0.14 syntax.
https://github.com/GoogleCloudPlatform/gke-bazel-demo#build--deploy-with-bazel
The makefile that enables the one-command end to end automation:
https://github.com/GoogleCloudPlatform/gke-bazel-demo/blob/master/Makefile
Again, replace or customize the scripts for your project. I actually wrote two more scripts: one for checking / installing requirements on the client (i.e. git, kubectl and gcloud), and another one for checking, configuring and authenticating gcloud in case it's not yet configured and authenticated. From there, the Terraform script takes over and builds the entire cluster, and once that's done, the usual auto-deployment kicks in.
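A requirements check of the kind described above could look roughly like the sketch below. It is not the script from that project: the function names missing_tools and check_requirements are made up for illustration, and the list of tools is whatever you pass in.

```shell
#!/bin/sh
# Print the name of every required tool that is not on PATH, one per line.
# Which tools are required is an assumption; pass the list for your project.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Return non-zero (and report on stderr) when any required tool is missing.
check_requirements() {
    missing=$(missing_tools "$@")
    if [ -n "$missing" ]; then
        printf 'Missing required tools:\n%s\n' "$missing" >&2
        return 1
    fi
}
```

A Makefile target could then start with something like check_requirements git kubectl gcloud terraform bazel and stop early if it fails.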
I find the idea of layering make over terraform & bazel for end to end automation just brilliant.
Related
I have a multi-container application, with nginx as web server and reverse-proxy, and a simple 'Hello World' Streamlit app.
It is available on my Gitlab.
I am totally new to DevOps, and would therefore like to leverage Gitlab's Auto DevOps so as to make it easy.
By default Gitlab's Auto DevOps expects one Dockerfile only, and at the root of the project (source)
Surprisingly, I only found one resource on my multi-container use case that aimed to answer this issue: https://forum.gitlab.com/t/auto-build-for-multiple-docker-containers/46949
I followed the advice, and made only slight changes to the .gitlab-ci.yml for the path to my Dockerfiles.
But then I have an issue with the Dockerfiles not recognizing the files in their folders:
App's Dockerfile doesn't find the requirements.txt :
And Nginx's Dockerfile doesn't find the project.conf
It seems that the DOCKERFILE_PATH: src/nginx/Dockerfile variable gives access only to the Dockerfile itself, and is not understood as the location for the build.
How can I customize this .gitlab-ci.yml so that the build passes correctly ?
Thank you very much !
The files are not being found because of how Docker's build context works. Since you're running docker build from the repository root, the context is the root rather than the directory containing your Dockerfile. That means the build looks for requirements.txt at the root instead of at src/app/requirements.txt. You can fix this relatively easily by executing a cd into your src/app directory before you run docker build, and removing the -f flag from your docker build (since you no longer need to specify the Dockerfile's path).
Since each job executes in an isolated container, you don't need to worry about CDing back to your build root, since your job never runs any other non-docker commands.
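An equivalent alternative to the cd is to keep the -f flag and pass the Dockerfile's own directory as the build context. The sketch below only derives that directory and prints the resulting command; context_dir and build_command are hypothetical helper names, and the image name is a placeholder.

```shell
#!/bin/sh
# Derive the build context (the directory that contains the Dockerfile)
# from a Dockerfile path such as the DOCKERFILE_PATH value in the question.
context_dir() {
    dirname "$1"
}

# Print a docker build command that uses the Dockerfile's directory as context,
# so COPY instructions resolve relative to that directory.
build_command() {
    dockerfile=$1
    image=$2    # placeholder image name
    printf 'docker build -f %s -t %s %s\n' \
        "$dockerfile" "$image" "$(context_dir "$dockerfile")"
}
```

For src/nginx/Dockerfile the context becomes src/nginx, so a COPY of project.conf would be resolved inside that folder.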
TL;DR: I would like to use on a self-hosted Actions runner (itself a docker container on my docker engine) specific docker images to build artefacts that I would move between the build phases, and end with a standalone executable (not a docker container to be deployed). I do not know how to use docker containers as "building engines" in Actions.
Details: I have a home project consisting of a backend in Go (cross compiled to a standalone binary) and a frontend in Javascript (actually a framework: Quasar).
I develop on my laptop in Windows and use GitHub as the SCM.
The manual steps I do are:
build a static version of the frontend which lands in a directory spa
copy that directory to the backend directory
compile the executable that embeds the spa directory
copy (scp) this executable to the final destination
For development purposes this works fine.
I now would like to use Actions to automate the whole thing. I use docker based self-hosted runners (tcardonne/github-runner).
My problem: the containers do a great job isolating the build environment from the server they run on. They are however reused across build jobs and this may create conflicts. More importantly, the default versions of software provided by these containers are not the right ones (usually they are the latest).
The solution would be to run the build phases in disposable docker containers (that would base on the right image, shortening the build time as a collateral nice to have). Unfortunately, I do not know how to set this up.
Note: I do not want to ultimately create docker containers, I just want to use them as "building engines" and extract the artefacts from them, and share between the jobs (in my specific case - one job would be to build the front with quasar and generate a directory, the other one would be a compilation ending up with a standalone executable copied elsewhere)
Interesting premise, you can certainly do this!
I think you may be slightly mistaken with regards to:
They are however reused across build jobs and this may create conflicts
If you run a new container from an image, then you will start with a fresh instance of that container. Files, software, etc, all adhering to the original image definition. Which is good, as this certainly aids your efforts. Let me know if I have the wrong end of the stick in regards to the above though.
Base Image
You can define your own image for building, in order to mitigate shortfalls of public images that may not be up to date or may not suit your requirements. In fact, this is a common pattern for CI, and Google does something similar with their Cloud Build configuration. For either approach below, you will likely want to do something like the following to ensure you have all the build tools you may need.
As a rough example:
FROM golang:1.16.7-buster
RUN apt update && apt install -y \
    git \
    make \
    ...
    && useradd <myuser> \
    && mkdir /dist
USER <myuser>
You could build and publish this with the following tag:
docker build . -t <containerregistry>/buildr/golang
It would also be recommended that you maintain a separate builder image for other types of projects, such as node, python, etc.
Approaches
Building with layers
If you're looking to leverage build caching for your applications, this will be the better option for you. Caching is only effective if nothing has changed, and since the projects will be built in isolation, it makes it relatively safe.
Building your app may look something like the following:
FROM <containerregistry>/buildr/golang AS builder
COPY src/ .
RUN make dependencies
RUN make
RUN mv /path/to/compiled/app /dist
FROM scratch
COPY --from=builder /dist /dist
The gist of this is that you start building your app within the builder image, so that it includes all the build deps you require, and then use a multi-stage Dockerfile to publish a final static container that includes your compiled code with no dependencies (using the scratch image as the smallest image possible).
Getting the final files out of your image is a bit harder with this approach: you would have to run an instance of the container once published in order to mount the files and persist them to disk, or use docker cp to retrieve the files from a container (not directly from an image) to your disk.
In GitHub Actions, this would look like running a step that builds a Docker container, where the step can run anywhere Docker is accessible.
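As a rough sketch of the docker cp route: a container only has to be created, not started, before files can be copied out of it. The helper name extract_from_image and the DOCKER override variable are assumptions made for this example; docker create, docker cp and docker rm are the standard CLI subcommands.

```shell
#!/bin/sh
# Copy a path out of an image without ever starting it.
# DOCKER can be overridden (for example with a stub) to dry-run the commands.
DOCKER=${DOCKER:-docker}

extract_from_image() {
    image=$1    # image produced by the multi-stage build
    src=$2      # path inside the image, e.g. /dist
    dest=$3     # destination on the host
    # Create (but do not start) a container. A dummy command is passed because
    # a scratch-based image without CMD cannot be created otherwise.
    container=$($DOCKER create "$image" /nonexistent) || return 1
    $DOCKER cp "$container:$src" "$dest"
    status=$?
    # Always remove the temporary container, then report the copy result.
    $DOCKER rm "$container" >/dev/null
    return $status
}
```

Calling extract_from_image user/app:latest /dist ./dist would leave the compiled files in ./dist on the runner, ready to be uploaded as an artefact or copied elsewhere.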
For example:
jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      ...
      - name: Build and push
        id: docker_build
        uses: docker/build-push-action@v2
        with:
          push: true
          tags: user/app:latest
Building as a process
This approach cannot leverage build caching as well, but you may be able to do clever things like mounting a host npm cache into your container to speed up actions like npm install.
This approach differs from the former in that the way you build your app is defined via CI or a purpose-built script, as opposed to the Dockerfile.
In this scenario, it would make more sense to define the CMD in the parent image and mount your source code in, so that you do not maintain an image per project you are building.
This would shift the responsibility of building your application from the buildtime of the image, to the runtime. Retrieving your code from the container would be doable through volume mounting for example:
docker run -v /path/to/src:/src -v /path/to/dist:/dist <containerregistry>/buildr/golang
If the CMD was defined in the builder, that single script would execute and build the mounted in source code, and subsequently publish to /dist in the container, which would then be persisted to your host via that volume mapping.
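The script behind such a CMD might look roughly like this. It is only a sketch: BUILD_CMD and OUT_DIR are hypothetical knobs invented for the example, while SRC_DIR and DIST_DIR default to the /src and /dist mount points from the docker run command above.

```shell
#!/bin/sh
# Entry script for a builder image: build the mounted source tree, then
# publish the results to the mounted output directory.
run_build() {
    src=${SRC_DIR:-/src}
    dist=${DIST_DIR:-/dist}
    # Run the build inside the source tree; stop if it fails.
    # BUILD_CMD is an assumed override, defaulting to plain "make".
    ( cd "$src" && sh -c "${BUILD_CMD:-make}" ) || return 1
    # Copy the contents of the build output directory (assumed to be
    # "build" unless OUT_DIR says otherwise) into the mounted /dist.
    cp -R "$src/${OUT_DIR:-build}/." "$dist/"
}
```

The image's CMD would point at a script that ends by calling run_build, so the same builder image can compile any project mounted into it.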
Of course, this applies if you're building locally. It actually becomes a bit nicer in a GitHub Actions context if you wish to keep your build instructions there. You can choose to run steps within your builder container using something like the following:
jobs:
  ...
  container:
    runs-on: ubuntu-latest
    container: <containerregistry>/buildr/golang
    steps:
      - run: |
          echo This job does specify a container.
          echo It runs in the container instead of the VM.
        name: Run in container
Within that run: spec, you could choose to call a build script, or enter the commands that might be present in the script yourself.
What you do with the compiled source is muchly up to you once acquired 👍
Chaining (Frontend / Backend)
You mentioned that you build static assets for your site and then embed them into your golang binary to be served.
Something like that introduces complications of course, but nothing untoward. If you do not need to retrieve your web files until you build your golang container, then you may consider taking the first approach, and copying the content from the published image as part of a Docker directive. This makes more sense if you have two separate projects, one for frontend and backend.
If everything is in one folder, then it sounds like you may just want to extend your build image to facilitate go and js, and then take the latter approach and define those build instructions in a script, makefile, or your run: config in your actions file.
Conclusion
This is a lot of info. I hope it's digestible for you, and more importantly, I hope it gives you some ideas as to how you can tackle your current issue. Let me know in the comments if you would like anything clarified.
I want my Cloud Build to push an image to a registry with an incremented tag. So, when the trigger arrives from GitHub, build the image, and if the latest tag was 1.10, tag the new one 1.11. Similarly, the 1.11 value will serve in multiple other steps in the build.
Reading the registry and incrementing the tag is easy (in a bash Cloud Build step), but Cloud Build has no way to pass parameters. (Substitutions come from outside the Cloud Build process, for example from the Git tags, and are not generated inside the process.)
This StackOverflow question and this article say that Cloud Build steps can communicate by writing files to the workspace directory.
That is clumsy. But worse, this requires using shell steps exclusively, not the native docker-building steps, nor the native image command.
How can I do this?
Sadly you can't. Each Cloud Build step runs in its own sandboxed container, and only the /workspace directory is mounted into each of them. Environment variables, installed binaries and so on do not persist from one container to the next.
You have to use a shell script each time :( The easiest way is to keep a file in your /workspace directory (for example an env.var file):
# load the environment variable
source /workspace/env.var
# Add variable
echo "NEW=Variable" >> /workspace/env.var
For this, Cloud Build is boring...
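For the incrementing-tag case from the question, the file-based hand-off could be sketched as follows. The helpers next_tag and save_var and the ENV_FILE override are made up for this example, and the sketch assumes tags of the plain decimal form MAJOR.MINOR, as in 1.10.

```shell
#!/bin/sh
# Increment the minor part of a MAJOR.MINOR tag, e.g. 1.10 -> 1.11.
# Assumes both parts are plain decimal numbers without leading zeros.
next_tag() {
    major=${1%%.*}
    minor=${1#*.}
    printf '%s.%s\n' "$major" "$((minor + 1))"
}

# Append NAME=value to the shared file so that later steps can source it.
# ENV_FILE is an assumed override; it defaults to the path used above.
save_var() {
    printf '%s=%s\n' "$1" "$2" >> "${ENV_FILE:-/workspace/env.var}"
}
```

One step would call save_var NEW_TAG "$(next_tag "$LATEST")" after reading the latest tag from the registry, and every later step would source /workspace/env.var in a shell entrypoint before using $NEW_TAG.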
We are thinking of moving our CI from Jenkins to GitLab. We have several projects that have the same build workflow. Right now we use a shared library where the pipelines are defined, and the Jenkinsfile inside the project only calls a method defined in the shared library that defines the actual pipeline. So changes only have to be made at a single point to affect several projects.
I am wondering if the same is possible with GitLab CI? As far as I have found out, it is not possible to define the .gitlab-ci.yml outside the repository. Is there another way to define a pipeline and share this config with several projects to simplify maintenance?
GitLab 11.7 introduces new include methods, such as include:file:
https://docs.gitlab.com/ee/ci/yaml/#includefile
include:
  - project: 'my-group/my-project'
    ref: master
    file: '/templates/.gitlab-ci-template.yml'
This will allow you to create a new project on the same GitLab instance which contains a shared .gitlab-ci.yml.
First let me start by saying: Thank you for asking this question! It triggered me to search for a solution (again) after often wondering if this was even possible myself. We also have like 20 - 30 projects that are quite identical and have .gitlab-ci.yml files of about 400 - 500 loc that have to each be changed if one thing changes.
So I found a working solution:
Inspired by the Auto DevOps .gitlab-ci.yml template Gitlab itself created, and where they use one template job to define all functions used and call every before_script to load them, I came up with the following setup.
Multiple project repos (project-1, project-2) requiring a shared set of CI jobs / functions
Functions script containing all shared functions in a separate repo
Files
So using a shared CI jobs script:
#!/bin/bash
function list_files {
  ls -lah
}
function current_job_info {
  echo "Running job $CI_JOB_ID on runner $CI_RUNNER_ID ($CI_RUNNER_DESCRIPTION) for pipeline $CI_PIPELINE_ID"
}
A common and generic .gitlab-ci.yml:
image: ubuntu:latest
before_script:
  # Install curl
  - apt-get update -qqq && apt-get install -qqqy curl
  # Get shared functions script
  - curl -s -o functions.sh https://gitlab.com/giix/demo-shared-ci-functions/raw/master/functions.sh
  # Set permissions
  - chmod +x functions.sh
  # Run script and load functions
  - . ./functions.sh
job1:
  script:
    - current_job_info
    - list_files
You could copy-paste your file from project-1 to project-2 and it would be using the same shared Gitlab CI functions.
These examples are pretty verbose for example purposes, optimize them any way you like.
Lessons learned
So after applying the construction above on a large scale (40+ projects) I want to share some lessons learned so you don't have to find out the hard way:
Version (tag / release) your shared ci functions script. Changing one thing can now make all pipelines fail.
Using different Docker images could cause an issue in the requirement for bash to load the functions (e.g. I use some Alpine-based images for CLI tool based jobs that have sh by default)
Use project-based CI/CD secret variables to personalize build jobs for projects, such as environment URLs.
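One way to sidestep the bash requirement mentioned in the second lesson is to write the shared functions in POSIX syntax. The "function name { ... }" form is not part of POSIX sh and some shells (dash, for example) reject it, whereas "name() { ... }" should load under sh and bash alike. A sketch of the same two functions rewritten that way:

```shell
#!/bin/sh
# POSIX-style definitions: "name() { ... }" instead of "function name { ... }",
# so the file can be sourced with ". ./functions.sh" from sh as well as bash.
list_files() {
    ls -lah
}

current_job_info() {
    echo "Running job $CI_JOB_ID on runner $CI_RUNNER_ID ($CI_RUNNER_DESCRIPTION) for pipeline $CI_PIPELINE_ID"
}
```

The before_script shown earlier would stay the same; only the function definitions in the shared script change.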
Since GitLab version 12.6, it's possible to define an external .gitlab-ci.yml file.
To customize the path:
Go to the project's Settings > CI / CD.
Expand the General pipelines section.
Provide a value in the Custom CI configuration path field.
Click Save changes.
...
If the CI configuration will be hosted on an external site, the URL link must end with .yml:
http://example.com/generate/ci/config.yml
If the CI configuration will be hosted in a different project within
GitLab, the path must be relative to the root directory in the other
project, with the group and project name added to the end:
.gitlab-ci.yml@mygroup/another-project
my/path/.my-custom-file.yml@mygroup/another-project
Use include feature, (available from GitLab 10.6):
https://docs.gitlab.com/ee/ci/yaml/#include
So, I always wanted to post what I came up with:
Right now we use a mixed approach of @stefan-van-gastel's idea of a shared CI library and the relatively new include feature of GitLab 11.7. We are very satisfied with this approach, as we can now manage our build pipeline for 40+ repositories in a single repository.
I have created a repository called ci_shared_library containing
a shell script for every single build job, containing the execution logic for the step.
a pipeline.yml file containing the whole pipeline config. In the before_script we load the ci_shared_library to /tmp/shared to be able to execute the scripts.
stages:
  - test
  - build
  - deploy
  - validate
services:
  - docker:dind
before_script:
  # Clear existing shared library
  - rm -rf /tmp/shared
  # Get shared library
  - git clone https://oauth2:${GITLAB_TOKEN}@${SHARED_LIBRARY} /tmp/shared
  - cd /tmp/shared && git checkout master && cd $CI_PROJECT_DIR
  # Set permissions
  - chmod -R +x /tmp/shared
  # open access to registry
  - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY
test:
  stage: test
  script:
    - /tmp/shared/test.sh
build:
  stage: build
  script:
    - /tmp/shared/build.sh
  artifacts:
    paths:
      - $CI_PROJECT_DIR/target/RPMS/x86_64/*.rpm
    expire_in: 3h
  only:
    - develop
    - /release/.*/
deploy:
  stage: deploy
  script:
    - /tmp/shared/deploy.sh
  artifacts:
    paths:
      - $CI_PROJECT_DIR/tmp/*
    expire_in: 12h
  only:
    - develop
    - /release/.*/
validate:
  stage: validate
  script:
    - /tmp/shared/validate.sh
  only:
    - develop
    - /release\/.*/
Every project that wants to use this pipeline config has to have a .gitlab-ci.yml. The only thing to do in this file is to import the shared pipeline.yml file from the ci_shared_library repo.
# .gitlab-ci.yml
include:
  - project: 'ci_shared_library'
    ref: master
    file: 'pipeline.yml'
With this approach, really everything regarding the pipeline lives in one single repository and is reusable. We have the whole pipeline template in one file, but I think it would even be possible to split this up so that every single job has its own yml file. That would be more flexible, and one could create default jobs that can be merged together differently for projects that have similar jobs but do not all need every job.
With GitLab 13.5 (October 2020), the include feature is even more useful:
Validate expanded GitLab CI/CD configuration with the API
Writing and debugging complex pipelines is not a trivial task. You can use the include keyword to help reduce the length of your pipeline configuration files.
However, if you wanted to validate your entire pipeline via the API previously, you had to validate each included configuration file separately which was complicated and time consuming.
Now you have the ability to validate a fully-expanded version of your pipeline configuration through the API, with all the include configuration included.
Debugging large configurations is now easier and more efficient.
See Documentation and Issue.
And:
See GitLab 13.6 (November 2020)
Include multiple CI/CD configuration files as a list
Previously, when adding multiple files to your CI/CD configuration using the include:file syntax, you had to specify the project and ref for each file. In this release, you now have the ability to specify the project, ref, and provide a list of files all at once. This prevents you from having to repeat yourself and makes your pipeline configuration less verbose.
See Documentation and Issue.
You could look into the concept of Dynamic Child pipeline.
It has evolved with GitLab 13.2 (July 2020):
Dynamically generate Child Pipeline configurations with Jsonnet
We released Dynamic Child Pipelines back in GitLab 12.9, which allow you to generate an entire .gitlab-ci.yml file at runtime.
This is a great solution for monorepos, for example, when you want runtime behavior to be even more dynamic.
We’ve now made it even easier to create CI/CD YAML at runtime by including a project template that demonstrates how to use Jsonnet to generate the YAML.
Jsonnet is a data templating language that provides functions, variables, loops, and conditionals that allow for fully parameterized YAML configuration.
See documentation and issue.
On the readme of the fabric8-pipeline-library project there is a chapter "Mixing and Matching", which describes how multiple pod templates can be combined to come out with a pod that mixes multiple building capabilities, like having both the maven and docker binaries available.
From this document:
There are cases where we might need a more complex setup that may require more than a single template. (e.g. a maven container that can run docker builds).
For this case you can combine add the docker template and the maven template together:
dockerTemplate {
    mavenTemplate(label: 'maven-and-docker') {
        node('maven-and-docker') {
            container(name: 'maven') {
                sh 'mvn clean package fabric8:build fabric8:push'
            }
        }
    }
}
This mvn call obviously would need maven plus local docker capabilities to be available on the "maven" container.
I have quite a similar problem with one fabric8-enabled project but also need to add the "npm" binary on top. The npm calls are also done from the maven build via exec plugin, so I need a container featuring maven+docker+npm.
While I'm struggling to get this to work I also crosschecked the doc of the jenkins kubernetes plugin. It also describes how pod templates can be nested/inherited, but there it does not sound like the result would actually be a single container combining the capabilities of both templates. It just sounds like this mechanism can merely add or modify container configuration on top of the parent template. It does not really combine images.
From this document:
A podTemplate may or may not inherit from an existing template. This means that the podTemplate will inherit node selector, service account, image pull secrets, containerTemplates and volumes from the template it inheritsFrom.
Container templates that are added to the podTemplate, that has a matching containerTemplate (a containerTemplate with the same name) in the 'parent' template, will inherit the configuration of the parent containerTemplate. If no matching containerTemplate is found, the template is added as is.
This is also how I interpret the outcomes of my combination attempts. Combining the docker and maven templates does not create a single docker+maven container but two containers, "docker" and "maven", inside the pod, each with one capability. There is no container in which the capabilities are combined so that a mvn call that also needs docker would work.
Why does docker+maven still work? I think the maven builder image "fabric8/maven-builder:2.2.297" simply already features docker out of the box. A terminal on a container based on that image can call the docker binary, without any parent templates involved.
So, I think I'm clearly misunderstanding something horribly. Can someone shed some light on this topic?
UPDATE
Right now I'm much deeper into this topic. No, you cannot combine the capabilities of multiple containers into one with this functionality. I believe the person writing the readme on fabric8-pipeline-library was not really aware of this. You need a single container based on an image that combines everything you need.
I don't think it's technically possible to just combine two docker containers together, so I think the documentation is very misleading.