How to use Cloudify to auto-heal/scale Docker containers

In my project, I'm using Cloudify to start and configure Docker containers.
Now I'm wondering how to write the YAML files to auto-heal/scale those containers.
My topology is like this: a Compute node contains a Docker-Container node, and the latter runs several containers.
I've noticed that Cloudify does auto-healing at the level of the Compute node. Is there a way to trigger the auto-heal workflow based on the containers' status instead?
As for auto-scaling, I installed the monitoring agent and configured the basic collectors, but the CPU usage percentage does not seem to trigger the workflow. The Cloudify docs for the Diamond plugin mention some built-in collectors; unfortunately, I couldn't figure out how to configure them.
I'm hoping for some inspiration. Any suggestions are appreciated. Thanks!

The Docker nodes should be in the right groups for scale and heal.
You can look at this example:
scale-heal example
It does exactly what you are looking for.
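For a concrete picture of what "the right groups" means, here is a minimal sketch in the spirit of that example. It is an assumption-laden illustration, not a drop-in blueprint: the node name docker_host, the collector list, the metric names, and the thresholds are all placeholders to adapt, and the linked example wires a scale group the same way using the built-in scale workflow.

```yaml
node_templates:
  docker_host:
    type: cloudify.nodes.Compute
    interfaces:
      # Diamond collectors feed the metrics that the policies below react to
      cloudify.interfaces.monitoring:
        start:
          implementation: diamond.diamond_agent.tasks.add_collectors
          inputs:
            collectors_config:
              CPUCollector: {}
              MemoryCollector: {}

groups:
  autohealing_group:
    members: [docker_host]
    policies:
      simple_autoheal_policy:
        type: cloudify.policies.types.host_failure
        properties:
          service:
            # heal when this metric stops arriving for the host instance
            - .*docker_host.*.cpu.total.system
        triggers:
          auto_heal_trigger:
            type: cloudify.policies.triggers.execute_workflow
            parameters:
              workflow: heal
              workflow_parameters:
                node_instance_id: { get_property: [SELF, node_id] }
                diagnose_value: Not responding
```

Because the policies watch host-level metrics, healing and scaling are driven from the Compute node that the containers run on, rather than from the individual containers' statuses.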

Related

How do I implement Prometheus monitoring in OpenShift projects?

We have an OpenShift Container Platform URL that contains multiple projects, like:
project1
project2
project3
Each project contains several pods that we are currently monitoring with New Relic, like:
pod1
pod2
pod3
We are trying to implement Prometheus + Grafana for all these projects separately.
The online articles are confusing, as none of them describe a configuration like the one we have now.
Where do we start?
What do we add to docker images?
Is there any procedure to monitor the containers using cAdvisor on openshift?
Some say we need to add a Maven dependency to the project. Some say we need to modify the code. Some say we need to add Prometheus annotations to the Docker containers. Some say to add node-exporter. What is node-exporter in the first place? Is it another container that collects container metrics? Can I install it as part of my Docker images? Can anyone point me to an article or something with a similar configuration?
Your question is pretty broad, so the answer will be the same :)
Just to clarify - in your question:
implement Prometheus + Grafana for all these projects separately
Are you going to have a dedicated Kubernetes installation and Prometheus + Grafana for each project, or one cluster for all of them?
In general, I think, the answer should be:
Use Prometheus Operator as recommended (https://github.com/coreos/prometheus-operator)
Once the operator is installed, you'll be able to get most of what you need through config changes alone; for example, you get Grafana and the Node Exporters in the cluster with single config changes.
In our case (we are not running OpenShift, but a vanilla k8s cluster) we run multiple namespaces (like your projects), each of which is represented in Prometheus.
To monitor a pod's application metrics, you need to use a Prometheus client library for your language and tell Prometheus to scrape the metrics (usually this is done with ServiceMonitors).
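As a rough illustration of the ServiceMonitor approach (the names project1-app, the metrics port, and the release: prometheus label are hypothetical and must match your own Service and Prometheus setup), a minimal manifest could look like this:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: project1-app
  namespace: monitoring
  labels:
    release: prometheus        # must match the serviceMonitorSelector of your Prometheus CR
spec:
  namespaceSelector:
    matchNames:
      - project1               # the project/namespace whose pods you want scraped
  selector:
    matchLabels:
      app: project1-app        # label on the Service that fronts the pods
  endpoints:
    - port: metrics            # named Service port where the app exposes /metrics
      path: /metrics
      interval: 30s
```

One ServiceMonitor per project (or per service) keeps the projects separated while a single Prometheus scrapes them all.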
Hope this will shed some light.

CI testing with docker-compose on Jenkins with Kubernetes

I have tests that I run locally using a docker-compose environment.
I would like to implement these tests as part of our CI using Jenkins with Kubernetes on Google Cloud (following this setup).
I have been unsuccessful because docker-in-docker does not work.
It seems that right now there is no solution for this use-case. I have found other questions related to this issue; here, and here.
I am looking for solutions that will let me run docker-compose. I have found solutions for running docker, but not for running docker-compose.
I am hoping someone else has had this use-case and found a solution.
Edit: Let me clarify my use-case:
1. When I detect a valid trigger (i.e. a push to the repo), I need to start a new job.
2. I need to set up an environment with multiple Docker containers/instances (docker-compose).
3. The instances in this environment need access to code from git (mount volumes / create new images with the data).
4. I need to run the tests in this environment.
5. I then need to retrieve the results from these instances (JUnit test results for Jenkins to parse).
The problems I am having are with 2 and 3.
For 2, there is a problem running this in parallel (more than one job) since the Docker context is shared (docker-in-docker issues). If this is running on more than one node, I get clashes because of shared resources (ports, for example). My workaround is to limit it to one running instance and queue the rest (not ideal for CI).
For 3, there is a problem mounting volumes since the Docker context is shared (docker-in-docker issues). I cannot mount the code that I check out in the job because it is not present on the host that is responsible for running the Docker instances I trigger. My workaround is to build a new image from my template and copy the code into the new image, then use that for the tests (this works, but it means I need docker cp tricks to get the data back out, which is also not ideal).
I think the better way is to use pure Kubernetes resources and run the tests directly on Kubernetes, not through docker-compose.
You can convert your docker-compose files into Kubernetes resources using the kompose utility.
You will probably need to adapt the conversion result, or you may have to convert your docker-compose objects into Kubernetes objects manually. Possibly, you can just use Jobs with multiple containers instead of a combination of Deployments + Services.
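Just to make the "Job with multiple containers" idea concrete, here is a rough sketch; the images, command, and result path are hypothetical. Keep in mind that every container in the pod has to exit for the Job to complete, so a long-running dependency either needs to be shut down by the test entrypoint or moved into its own Deployment + Service.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: integration-tests
spec:
  backoffLimit: 0                  # fail fast instead of retrying the whole test pod
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: db                 # the dependency docker-compose used to start
          image: postgres:13
          env:
            - name: POSTGRES_PASSWORD
              value: test
        - name: tests              # test runner; containers in a pod share localhost
          image: registry.example.com/myapp-tests:latest
          command: ["pytest", "--junitxml=/results/junit.xml"]
          volumeMounts:
            - name: results
              mountPath: /results
      volumes:
        - name: results
          emptyDir: {}             # grab junit.xml with kubectl cp before the pod is cleaned up
```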
Anyway, I definitely recommend using Kubernetes abstractions instead of running tools like docker-compose inside Kubernetes.
Moreover, you will still be able to run the tests locally by using Minikube to spawn a small all-in-one cluster right on your PC.

Using custom docker containers in Dataflow

From this link I found that Google Cloud Dataflow uses Docker containers for its workers: Image for Google Cloud Dataflow instances
I see it's possible to find out the image name of the docker container.
But, is there a way I can get this docker container (ie from which repository do I go to get it?), modify it, and then indicate my Dataflow job to use this new docker container?
The reason I ask is that we need to install various C++, Fortran, and other libraries on our Docker images so that the Dataflow jobs can call them, but these installations are very time consuming, so we don't want to use the "resource" property option in df.
Update for May 2020
Custom containers are only supported within the Beam portability framework.
Pipelines launched within the portability framework currently must pass --experiments=beam_fn_api, either explicitly (a user-provided flag) or implicitly (for example, all Python streaming pipelines pass it).
See the documentation here: https://cloud.google.com/dataflow/docs/guides/using-custom-containers?hl=en#docker
There will be more Dataflow-specific documentation once custom containers are fully supported by Dataflow runner. For support of custom containers in other Beam runners, see: http://beam.apache.org/documentation/runtime/environments.
The docker containers used for the Dataflow workers are currently private, and can't be modified or customized.
In fact, they are served from a private docker repository, so I don't think you're able to install them on your machine.
Update Jan 2021: Custom containers are now supported in Dataflow.
https://cloud.google.com/dataflow/docs/guides/using-custom-containers?hl=en#docker
You can generate a template from your job (see https://cloud.google.com/dataflow/docs/templates/creating-templates for details), then inspect the template file to find the workerHarnessContainerImage that was used.
I just created one for a job using the Python SDK and the image used in there is dataflow.gcr.io/v1beta3/python:2.0.0
Alternatively, you can run a job, then ssh into one of the instances and use docker ps to see all running docker containers. Use docker inspect [container_id] to see more details about volumes bound to the container etc.

DC/OS on top of a docker container cluster

Given that I have only one machine (a high-spec laptop), can I run the entire DC/OS on my laptop, purely for simulation/learning purposes? The way I was thinking of setting this up was to use some number N of Docker containers (with networking enabled between them), where some of the N would be masters, some slaves, maybe one ZooKeeper, and one container to run the scheduler/application. So basically one Docker container would be synonymous with a machine instance in this case (since I don't have multiple machines, and using multiple VMs on one machine would be overkill).
Has this already been done, so that I can try it out right away, or am I completely missing something here in my understanding?
We're running such a development configuration, where ZooKeeper, the Mesos masters and slaves, as well as Marathon, run fully dockerized (but on a 3-machine bare-metal cluster) on the latest stable CoreOS. It has some known downsides; for example, when a slave dies, AFAIK the running tasks cannot be recovered by the restarted slave.
I think it also depends on the OS what you're running on your laptop. If it's non-Windows, you should normally be fine. If your system supports systemd, then you can have a look at tobilg/coreos-setup to see how I start the Mesos services via Docker.
Still, I would recommend using a Vagrant/VirtualBox solution if you just want to test how Mesos works/"feels"... it will probably save you some headaches compared to a "from scratch" solution. The tobilg/coreos-mesos-cluster project runs the services via Docker on CoreOS within Vagrant.
Also, you can have a look at dharmeshkakadia/awesome-mesos and especially the Vagrant based setup section to get some references.
Have a look at https://github.com/dcos/dcos-docker. It is quite young, but it enables you to do exactly what you want:
it starts a DC/OS cluster with masters and agents in Docker containers on a single node.

How can I run multiple docker nodes on my laptop to simulate a cluster?

My goal is to simulate a cluster environment where I can test my applications and tools.
I need a minimum of 3 Docker nodes (not containers) running, with access to them over SSH.
I have tried the following:
1. Installing multiple VMs from the Ubuntu minimal CD.
Result: I ended up with huge files to maintain, and repeating the process is really painful and unpleasant.
2. Downloading a Vagrant box that has Docker inside (there are some here).
Result: I can't access them over SSH, and can't really fire up more than one box (OK, I can, but it is still not optimal).
3. Running Kitematic multiple times, but I had no success with it.
What do you do to test your clustering tools for docker?
My only "easy" solution is to run multiple instances from some provider and pay per hour usage, but that is really not that easy when I am offline, and when I just don't want to pay.
I don't need to run multiple "containers", but multiple "hosts", which I can then join together into a single Cluster to simulate distributed Data Center.
You could use docker-machine to create a few VMs locally. You can connect to all of them by changing the environment variables.
You might also be interested in something like https://github.com/dnephin/compose-swarm-sandbox/. It creates multiple docker hosts inside containers using https://github.com/dnephin/docker-swarm-slave.
If you are using something other than swarm, you would just remove that service from /srv/.
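Purely to illustrate the shape of that sandbox approach (this is not the actual compose file from that repo, and classic Swarm has long been superseded by swarm mode, so treat it as a throwaway local setup): each "host" is a privileged docker:dind container, and a classic Swarm manager fronts them.

```yaml
version: "2"
services:
  node-1:
    image: docker:dind              # each "host" runs its own Docker daemon
    privileged: true
    environment:
      DOCKER_TLS_CERTDIR: ""        # plain TCP on 2375, acceptable only for a local sandbox
  node-2:
    image: docker:dind
    privileged: true
    environment:
      DOCKER_TLS_CERTDIR: ""
  swarm-manager:
    image: swarm                    # classic Swarm manager with a static node list
    command: manage -H tcp://0.0.0.0:3375 nodes://node-1:2375,node-2:2375
    ports:
      - "3375:3375"
    depends_on:
      - node-1
      - node-2
```

With that up, docker -H tcp://localhost:3375 info should report both "hosts", which is enough to exercise cluster-aware tooling without any VMs.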
I would recommend using docker-machine for this purpose, as the machines are very lightweight and very easy to install, run, and manage.
Try creating 3-4 Docker machines, pull the Swarm image on them to make a cluster, and use Docker Compose to manage the cluster in one go.
Option 2 should be valid, but what you looked at was a VM box using the Docker provisioner. I would recommend looking at the Vagrant Docker provider instead: you do not need a Vagrant box in this scenario, only Docker images. The Vagrantfile is still there, and you can easily set up your multiple machines from that single Vagrantfile.
Here is a nice blog post, but I am sure there are plenty of other good articles that explain it in detail.
I recommend running CoreOS on Vagrant; it has been designed for exactly this kind of request, with clustering enabled and 3 instances started by default.
With etcd and fleet, you should be able to get the cluster working properly.
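For reference, the coreos-vagrant repo drives the instances with a user-data cloud-config; a minimal sketch looks roughly like the following (the discovery token is a placeholder you generate yourself, and the instance count is set separately in the repo's config.rb via $num_instances):

```yaml
#cloud-config
coreos:
  etcd2:
    # generate a fresh token per cluster: https://discovery.etcd.io/new?size=3
    discovery: https://discovery.etcd.io/<your-token>
    advertise-client-urls: http://$public_ipv4:2379
    initial-advertise-peer-urls: http://$private_ipv4:2380
    listen-client-urls: http://0.0.0.0:2379
    listen-peer-urls: http://$private_ipv4:2380
  fleet:
    public-ip: $public_ipv4
  units:
    - name: etcd2.service
      command: start
    - name: fleet.service
      command: start
```

After vagrant up, vagrant ssh core-01 (and core-02, core-03) gives you SSH access to each node, which covers the "3 nodes reachable over SSH" requirement.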
