All of us know that helm charts are amazing and make our lives easier.
However, I have a use case where I would like to use helm charts - WITHOUT INTERNET ACCESS
There are two steps:
Downloading the chart from Git
Pulling Docker images from Dockerhub (specified in values.yaml files)
How can I do this?
Using a helm chart offline involves first pulling the chart while you still have internet access, then installing it locally:
$ helm pull <chart name>
$ ls  # the chart will be pulled as a .tgz archive into the current directory
$ helm install <whatever release name you want> <chart name>.tgz
For this method to work, you'll need all the docker images the chart uses available locally, as you mentioned.
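One way to see exactly which images a chart needs (a common trick, not part of the original answer) is to render the chart locally and grep the manifests; the release name and chart archive below are placeholders:
$ helm template my-release ./mychart-1.2.3.tgz | grep 'image:'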
I know the answer to the first part of my question.
you can actually git clone https://github.com/kubernetes/charts.git to get all the official charts from GitHub, and then specify the path to the chart (folder) on your filesystem that you want to install.
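As a rough sketch, using the same install syntax as above (the release name and chart path are just examples):
$ git clone https://github.com/kubernetes/charts.git
$ helm install my-prometheus ./charts/stable/prometheus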
In my case this is done in the form of a helmfile. Execute the command like this:
helmfile -f deployment.yaml sync
cat deployment.yaml
...
repositories:
- name: roboll
url: http://roboll.io/charts
context: example.int.com # kube-context (--kube-context)
releases:
# Prometheus deployment
- name: my_prometheus # name of this release
namespace: monitoring # target namespace
chart: /opt/heml/charts/stable/prometheus # the chart being installed to create this release, referenced by `repository/chart` syntax
values: ["values/values_prometheus_ns_monitoring.yaml"]
set: # values (--set)
- name: rbac.create
value: true
# Grafana deployment
- name: my_grafana # name of this release
namespace: monitoring # target namespace
chart: /opt/heml/charts/stable/grafana
values: ["values/values_grafana_ns_monitoring.yaml"]
So as you can see I have specified some custom values_<software>_ns_monitoring.yaml files.
The second part of my original question is still unanswered.
I want to be able to tell docker to use a local docker image in this section
cat values_grafana_ns_monitoring.yaml
replicas: 1
image:
repository: grafana/grafana
tag: 5.0.4
pullPolicy: IfNotPresent
I have managed to manually copy the image over and docker load it,
so it is visible on my machine, but I can't figure out how to convince
docker + helmfile to use my image. The goal is a totally offline installation.
Any ideas?
sudo docker images
[sudo] password for jantoth:
REPOSITORY TAG IMAGE ID CREATED SIZE
my_made_up_string/custom_grafana/custom_grafana 5.1.2 917f46a60761 6 days ago 238 MB
Pulling docker images from Dockerhub WITHOUT INTERNET ACCESS
Obviously, that is not possible. However, the problem can be addressed by splitting it into stages:
Pull the docker images from Dockerhub to a system that has internet access.
Save the docker images with docker save, copy them to the destination environment where you want to do the offline installation, and load them back with docker load.
Set up a docker registry in the destination environment and tag / push the docker images into this registry (see: Local docker registry).
In the helm charts / kubernetes yaml files, update the image references to point to the local docker registry (see: Kubernetes and private docker registry); a sketch of these steps follows below.
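A rough sketch of these steps, using the grafana image from the values file shown earlier (the registry address and port are only examples):
# On a machine with internet access: pull the image and save it to an archive
docker pull grafana/grafana:5.0.4
docker save grafana/grafana:5.0.4 -o grafana_5.0.4.tar

# Copy the archive to the offline environment, then load it there
docker load -i grafana_5.0.4.tar

# Optionally run a local registry and push the image into it
docker run -d -p 5000:5000 --name registry registry:2
docker tag grafana/grafana:5.0.4 localhost:5000/grafana/grafana:5.0.4
docker push localhost:5000/grafana/grafana:5.0.4
The image section of values_grafana_ns_monitoring.yaml would then point repository at the registry as the cluster nodes see it (for example localhost:5000/grafana/grafana) with the same tag; with pullPolicy: IfNotPresent, nodes that already have the image loaded will not attempt to pull at all.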
Alternatively, you can look at offline packaging / deployment tools like Gravity
Related
When running the following command:
helm upgrade --cleanup-on-fail \
  --install $releaseName $dockerHubName/$dockerHubRepo:$tag \
  --namespace $namespace \
  --create-namespace \
  --values config.yaml
I get the following error:
Error: Failed to download "$dockerHubName/$dockerHubRepo"
I've also tried different tags, including semantic versioning (tag="1.0.0"), and there's an image with the tag "latest" on the DockerHub repo (which is public).
This also works with the base JupyterHub image jupyterhub/jupyterhub.
Based on information from the jupyterhub for kubernetes site, to use a different image from jupyter/docker-stacks, the following steps are required:
Modify your config.yaml file to specify the image. For example:
singleuser:
image:
# You should replace the "latest" tag with a fixed version from:
# https://hub.docker.com/r/jupyter/datascience-notebook/tags/
# Inspect the Dockerfile at:
# https://github.com/jupyter/docker-stacks/tree/HEAD/datascience-notebook/Dockerfile
name: jupyter/datascience-notebook
tag: latest
Apply the changes by following the directions listed in apply the changes; a sketch of the upgrade command is shown after these steps.
If you have configured prePuller.hook.enabled, all the nodes in your
cluster will pull the image before the hub is upgraded to let users
use the image. The image pulling may take several minutes to complete,
depending on the size of the image.
Restart your server from JupyterHub control panel if you are already logged in.
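For reference, the upgrade mentioned in the steps above typically looks something like this; the release name, namespace, and chart repository are assumptions, not values taken from the question:
helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
helm repo update
helm upgrade --cleanup-on-fail --install jhub jupyterhub/jupyterhub \
  --namespace jhub --create-namespace \
  --values config.yaml
Note that the positional argument after the release name must be a chart reference, not a DockerHub image; the custom image itself belongs in config.yaml as shown above.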
I'm using Kubernetes for the production environment (I'm new to this kind of configuration). This is an example of one of my deployment files (with changes):
apiVersion: apps/v1
kind: Deployment
metadata:
name: myProd
labels:
app: thisIsMyProd
spec:
replicas: 3
selector:
matchLabels:
app: thisIsMyProd
template:
metadata:
labels:
app: thisIsMyProd
spec:
containers:
- name: myProd
image: DockerUserName/MyProdProject # <==== Latest
ports:
- containerPort: 80
Now, I wanted to make it work with Travis CI, so I made something similar to this:
sudo: required
services:
- docker
env:
global:
- LAST_COMMIT_SHA=$(git rev-parse HEAD)
- SERVICE_NAME=myProd
- DOCKER_FILE_PATH=.
- DOCKER_CONTEXT=.
addons:
apt:
packages:
- sshpass
before_script:
- docker build -t $SERVICE_NAME:latest -f $DOCKER_FILE_PATH $DOCKER_CONTEXT
script:
# Mocking run test cases
deploy:
- provider: script
script: bash ./deployment/deploy-production.sh
on:
branch: master
And finally here is the deploy-production.sh script:
#!/usr/bin/env bash
# Log in to the docker CLI
echo "$DOCKER_PASSWORD" | docker login -u "$DOCKER_USERNAME" --password-stdin
# Build images
docker build -t $DOCKER_USERNAME/$SERVICE_NAME:latest -t $DOCKER_USERNAME/$SERVICE_NAME:$LAST_COMMIT_SHA -f $DOCKER_FILE_PATH $DOCKER_CONTEXT
# Take those images and push them to docker hub
docker push $DOCKER_USERNAME/$SERVICE_NAME:latest
docker push $DOCKER_USERNAME/$SERVICE_NAME:$LAST_COMMIT_SHA
# Run deployment script in deployment machine
export SSHPASS=$DEPLOYMENT_HOST_PASSWORD
ssh-keyscan -H $DEPLOYMENT_HOST >> ~/.ssh/known_hosts
# Run Kubectl commands
kubectl apply -f someFolder
kubectl set image ... # instead of the `...` the rest of the command that sets the image with the SHA on the deployments
Now here are my questions:
When Travis finishes its work, the deploy-production.sh script will run on merges to the master branch. Now I have a concern about the kubectl step: for the first deployment, when we apply the deployment it will pull the image from Dockerhub and try to run it, and after that the set image command will run, changing the image of the deployment. Will this make the deployment happen twice?
When I deployed for the second time, I noticed the deployment used an old version of the latest image because it found it locally. After searching, I found imagePullPolicy and set it to Always. But imagine I hadn't used the imagePullPolicy attribute; what would really happen in that case? I know the first apply command would start containers running old-version code, but wouldn't running set image fix that? To clarify my question: does Kubernetes use some random way to select the pods that are going to go down? Or does it track the order in which the commands ran, so it knows the set image pods should remain and the apply pods are the ones that need to be terminated?
Isn't pulling every time harmful? Would it be better to make the deployment image never use latest, to avoid that hassle?
Thanks
If the image tag is the same in both apply and set image, then only the apply action re-deploys the Deployment (in which case you do not need the set image command). If they refer to different image tags, then yes, the deployment will run twice.
If you use the latest tag, applying a manifest that uses the latest tag with no modification WILL NOT re-deploy the Deployment. You need to introduce a modification to the manifest file in order to force Kubernetes to re-deploy. In my case, I use the date command to generate a TIMESTAMP variable that is passed in the env spec of the pod container (my container does not use it in any way), just to force a re-deploy of the Deployment. Or you can use kubectl rollout restart deployment/name if you are on Kubernetes 1.15 or later.
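As a minimal sketch, using the Deployment name from the manifest earlier in this question, either of these forces a fresh rollout:
# Kubernetes 1.15+: restart the rollout without editing the manifest
kubectl rollout restart deployment/myProd

# Older clusters: change the pod template (here via a throwaway env var) to trigger a re-deploy
kubectl set env deployment/myProd REDEPLOY_TIMESTAMP="$(date +%s)"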
Other than wasted bandwidth, or if you are being charged by how many times you pull a docker image (poor you), there is no harm in an additional image pull just to be sure you are using the latest image version. Even if you use a specific image tag with version numbers like 1.10.112-rc5, there will be cases where you or your fellow developers forget to update the version number when pushing a modified image. IMHO, imagePullPolicy: Always should be the default rather than something you have to set explicitly.
I work on a Kubernetes cluster based CI-CD pipeline.
The pipeline runs like this:
An ECR machine has Docker.
Jenkins runs as a container.
"Builder image" with Java, Maven etc is built.
Then this builder image is run to build an app image(s)
Then the app is run in kubernetes AWS cluster (using Helm).
Then the builder image is run with params to run Maven-driven tests against the app.
Now, some of these steps don't require the image to be pushed. E.g. the builder image can be cached or disposed of at will; it would be rebuilt if needed.
So these images are named like mycompany/mvn-builder:latest.
This works fine when used directly through Docker.
When Kubernetes and Helm come into play, they want image URIs and try to fetch them from the remote repo. So using the "local" name mycompany/mvn-builder:latest doesn't work:
Error response from daemon: pull access denied for collab/collab-services-api-mvn-builder, repository does not exist or may require 'docker login'
Technically, I can name it <AWS-repo-ID>/mvn-builder and push it, but that makes it impossible to run all of this locally in minikube, because it's quite hard to stay authenticated against the silly AWS 12-hour token (remember, it all runs in a cluster).
Is it possible to mix the remote repo and local cache? In other words, can I have Docker look at the remote repository and if it's not found or fails (see above), it would take the cached image?
So that if I use foo/bar:latest in a Kubernetes resource, it will try to fetch, find out that it can't, and would take the local foo/bar:latest?
I believe an initContainer would do that, provided it had access to /var/run/docker.sock (and your cluster allows such a thing) by conditionally pulling (or docker load-ing) the image, such that when the "main" container starts, the image will always be cached.
Approximately like this:
spec:
  initContainers:
  - name: prime-the-cache
    image: docker:18-dind
    command:
    - sh
    - -c
    - |
      if something_awesome; then
        docker pull from/a/registry
      else
        docker load -i some/other/path
      fi
    volumeMounts:
    - name: docker-sock
      mountPath: /var/run/docker.sock
      readOnly: true
  containers:
  - name: primary
    image: a-local-image
  volumes:
  - name: docker-sock
    hostPath:
      path: /var/run/docker.sock
I totally understand that Concourse is meant to be stateless, but nevertheless is there any way to re-use already pulled docker images?
In my case, I build ~10 docker images which share the same base image, but each time a build is triggered Concourse pulls the base image 10 times.
Is it possible to pull that image once and re-use it later (at least within the scope of the same build) using the standard docker resource?
Yeah, it should be possible to do that with a custom image and some sh scripting, but I'd rather not reinvent the wheel.
If the standard docker resource does not allow that, is it possible to extend it somehow to enable such behaviour?
--cache-from is not helpful, as CI spends most of its time pulling the image, not building new layers.
Theory
First, some Concourse theory (at least as of v3.3.1):
People often talk about Concourse having a "cache", but misinterpret what that means. Every concourse worker has a set of volumes on disk which are left around, forming a volume cache. This volume cache contains volumes that have been populated by resource get and put and task outputs.
People also often misunderstand how the docker-image-resource uses Docker. There is no global docker server running with your Concourse installation, in fact Concourse containers are not Docker containers, they are runC containers. Every docker-image-resource process (check, get, put) is run inside of its own runC container, inside of which there is a local docker server running. This means that there's no global docker server that is pulling docker images and caching the layers for further use.
What this implies is that when we talk about caching with the docker-image-resource, it means loading or pre-pulling images into the local docker server.
Practice
Now to the options for optimizing build times:
load_base
Background
The load_base param in your docker-image-resource put tells the resource to first docker load an image (retrieved via a get) into its local docker server, before building the image specified via your put params.
This is useful when you need to pre-populate an image into your "docker cache." In your case, you would want to preload the image used in the FROM directive. This is more efficient because it uses Concourse's own volume caching to only pull the "base" once, making it available to the docker server during the execution of the FROM command.
Usage
You can use load_base as follows:
Suppose you want to build a custom python image, and you have a git repository with a file ci/Dockerfile as follows:
FROM ubuntu
RUN apt-get update
RUN apt-get install -y python python-pip
If you wanted to automate building/pushing of this image while taking advantage of Concourse volume caching as well as Docker image layer caching:
resources:
- name: ubuntu
type: docker-image
source:
repository: ubuntu
- name: python-image
type: docker-image
source:
repository: mydocker/python
- name: repo
type: git
source:
uri: ...
jobs:
- name: build-image-from-base
plan:
- get: repo
- get: ubuntu
params: {save: true}
- put: python-image
params:
load_base: ubuntu
dockerfile: repo/ci/Dockerfile
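If you want to try this out, registering the pipeline with fly would look roughly like this (the target and pipeline names are placeholders):
fly -t ci set-pipeline -p build-python-image -c pipeline.yml
fly -t ci unpause-pipeline -p build-python-image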
cache & cache_tag
Background
The cache and cache_tag params in your docker-image-resource put tell the resource to first pull a particular image+tag from your remote source, before building the image specified via your put params.
This is useful when it's easier to pull down the image than to build it from scratch, e.g. when you have a very long build process, such as expensive compilations.
This DOES NOT utilize Concourse's volume caching, and utilizes Docker's --cache-from feature (which runs the risk of needing to first perform a docker pull) during every put.
Usage
You can use cache and cache_tag as follows:
Suppose you want to build a custom ruby image, where you compile ruby from source, and you have a git repository with a file ci/Dockerfile as follows:
FROM ubuntu
# Install Ruby
RUN mkdir /tmp/ruby;\
cd /tmp/ruby;\
curl ftp://ftp.ruby-lang.org/pub/ruby/2.0/ruby-2.0.0-p247.tar.gz | tar xz;\
cd ruby-2.0.0-p247;\
chmod +x configure;\
./configure --disable-install-rdoc;\
make;\
make install;\
gem install bundler --no-ri --no-rdoc
RUN gem install nokogiri
If you wanted to automate building/pushing of this image while taking advantage of only Docker image layer caching:
resources:
- name: compiled-ruby-image
type: docker-image
source:
repository: mydocker/ruby
tag: 2.0.0-compiled
- name: repo
type: git
source:
uri: ...
jobs:
- name: build-image-from-cache
plan:
- get: repo
- put: compiled-ruby-image
params:
dockerfile: repo/ci/Dockerfile
cache: mydocker/ruby
cache_tag: 2.0.0-compiled
Recommendation
If you want to increase efficiency of building docker images, my personal belief is that load_base should be used in most cases. Because it uses a resource get, it takes advantage of Concourse volume caching, and avoids needing to do extra docker pulls.
In my CI environment there are several build versions and I need to get a specific version for docker-compose:
registry.example.com/foo/core 0.1.3
registry.example.com/foo/core 0.2.2
... # multiple packages in several versions like this
The images are built like this:
build:
stage: build
script:
...
- docker build -t $CI_REGISTRY_IMAGE:$VERSION .
- docker push $CI_REGISTRY_IMAGE:$VERSION
And they are pulled on deploy pipeline like this:
production:
stage: deploy
script:
- docker pull $CI_REGISTRY_IMAGE:$VERSION
I'm also using docker-compose to start all microservices from that images:
$ docker-compose up -d
But now here is my problem: where can I store which version of each image should be used? Hardcoded, it looks like this, but this will always start 0.2.2, even though the deploy pipeline could have pulled a different version, like 0.1.3:
core:
container_name: core
image: 'registry.example.com/foo/core:0.2.2'
restart: always
links:
- 'mongo_live'
environment:
- ROOT_URL=https://example.com
- MONGO_URL=mongodb://mongo_live/example
But the version would be better set as a variable. So I would think that on deploy I have to store the current $VERSION value somewhere, and when running docker-compose that value should be read back to get the correct version, since the latest version is not always the one that was selected.
If you pass it as a variable, you'd store it wherever you define your environment variables in your CI environment. You can also store the values in a .env file, which is described here.
Using the variable defined in the environment (or .env file) would have a line in the docker-compose.yml like:
image: registry.example.com/foo/core:${VERSION}
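A minimal sketch of that approach, assuming the deploy job runs in the same directory as docker-compose.yml:
# In the deploy stage, persist the version that was just pulled
echo "VERSION=${VERSION}" > .env

# docker-compose reads .env from the project directory and substitutes ${VERSION}
docker-compose pull
docker-compose up -d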
Personally, I'd take a different approach and let the registry server maintain the version of the image with tags. If you have 3 versions for dev, stage, and prod, you can build your registry.example.com/foo/core:0.2.2 and then tag that same image as registry.example.com/foo/core:dev. The backend image checksum can be referenced by multiple tags and not take up additional disk space on the registry server or the docker hosts. Then in your dev environment, you'd just do a docker-compose pull && docker-compose up -d to grab the dev image and spin it up. The only downside of this approach is that the tag masks which version of the image is currently being used as dev, so you'll need to track that some other way.
Tagging an image in docker uses the docker tag command. You would run:
docker tag registry.example.com/foo/core:${VERSION} registry.example.com/foo/core:dev
docker push registry.example.com/foo/core:${VERSION}
docker push registry.example.com/foo/core:dev
If the registry already has a tag for registry.example.com/foo/core:dev it gets replaced with the new tag pointing to the new image id.
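With this approach, the dev environment's docker-compose.yml simply references the :dev tag, so refreshing it is just a pull and restart; to see which underlying image the dev tag currently points at, you can inspect its digest:
docker-compose pull && docker-compose up -d

# Which image does the dev tag currently resolve to?
docker image inspect --format '{{index .RepoDigests 0}}' registry.example.com/foo/core:dev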