How can I spawn multiple instances of a container using Kubernetes? - docker

I have a container (Service C) which listens for certain user events and, based on the input, needs to spawn one or more instances of another container (Service X).

From your use case description, it looks like a Deployment is what you are looking for: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/. With a Deployment you can dynamically scale the number of replicas of the pod.
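For example, if Service X is packaged as a Deployment, Service C can adjust its replica count through the Kubernetes API. A minimal sketch with the official Python client, assuming a Deployment named service-x already exists in the default namespace (all names here are placeholders):

from kubernetes import client, config

def scale_service_x(replicas):
    # use load_incluster_config() when Service C runs as a pod in the cluster;
    # use config.load_kube_config() instead when running outside it
    config.load_incluster_config()
    apps = client.AppsV1Api()
    # patch only the replica count of the existing Deployment
    apps.patch_namespaced_deployment_scale(
        name="service-x",
        namespace="default",
        body={"spec": {"replicas": replicas}},
    )

# e.g. after handling a user event that needs three instances of Service X:
# scale_service_x(3)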

Related

Cassandra cluster using docker

I'm new to Cassandra and wanted to understand and implement the NetworkTopologyStrategy.
I want to create a Cassandra cluster using NetworkTopologyStrategy with multiple data centers. How do I do it?
I tried creating a Docker bridge network and three Cassandra nodes: cas1, cas2, cas3. When I used nodetool to check the status, only a cluster with a single datacenter was created. But I want to create 2 datacenters.
There's a document which walks you through this: Initializing a multiple node cluster (multiple datacenters). It's for Cassandra 3.x, but the procedure is pretty much the same for 4.x as well.
But if I had to guess, I'd say there are two things you're probably missing:
In cassandra.yaml, set the endpoint_snitch to GossipingPropertyFileSnitch:
endpoint_snitch: GossipingPropertyFileSnitch
That tells Cassandra to check the cassandra-rackdc.properties file for data center and rack information. Inside that file, you'll find the following settings (by default).
dc=dc1
rack=rack1
This is where you can set the name of the new DC. Then you can use those data center names to specify replication on keyspaces using NetworkTopologyStrategy.
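For example, once both data centers show up in nodetool status, you can reference them by the names from cassandra-rackdc.properties when creating a keyspace. A minimal sketch with the Python cassandra-driver (the contact point, keyspace name, and replication factors below are placeholders):

from cassandra.cluster import Cluster

# connect through any reachable node, e.g. the cas1 container
cluster = Cluster(["cas1"])
session = cluster.connect()

# replicate 3 copies in each data center; the DC names must match the
# dc= values set in cassandra-rackdc.properties on each node
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS my_keyspace
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'dc1': 3,
        'dc2': 3
    }
""")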

In jenkins-kubernetes-plugin, how to generate labels in Pod template that are based on a pattern

Set-Up
I am using jenkins-kubernetes-plugin to run our QE jobs. The QE jobs are executed over multiple PODs, and each POD has a static set of labels like testing, chrome.
Issue:
In these QE jobs, there is one port, say 7900, that I want to expose through the Kubernetes Ingress Controller.
The issue is that we have multiple PODs running from the same Pod Template, and they all have the same set of labels. For the Ingress Controller to work, I want these PODs to have labels that follow a pattern,
like POD1 having a label chrome-1, POD2 having a label chrome-2, and so on...
Is this possible?
This is not currently possible directly, but you could use Groovy in the pipeline to customize it, e.g. by adding the build ID to the label, as sketched below.
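A rough sketch of that approach with the plugin's scripted pipeline DSL (the label prefix, container name, and image are placeholders, and exact syntax may vary by plugin version):

def podLabel = "chrome-${env.BUILD_ID}"
podTemplate(label: podLabel, containers: [
    containerTemplate(name: 'chrome', image: 'selenium/standalone-chrome', ttyEnabled: true)
]) {
    node(podLabel) {
        // QE steps run here; the agent pod for this build comes from a
        // template whose label includes the build ID, so each build's pod
        // can be distinguished from the others
    }
}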

How to use custom docker storage in Prefect flows?

I have set up a Dask cluster and I'm happily sending basic Prefect flows to it.
Now I want to do something more interesting: take a custom Docker image with my Python library on it and execute flows/tasks on the Dask cluster.
My assumption was that I could leave the Dask cluster (scheduler and workers) as they are, with their own Python environment (after checking that all the various message-passing libraries have matching versions everywhere). That is to say, I do not expect to need to add my library to those machines if the Flow is executed within my custom storage.
However, either I have not set up storage correctly or it is not safe to assume the above. In other words, perhaps when pickling objects in my custom library, the Dask cluster does need to know about my Python library. Suppose I have some generic Python library called data...
import prefect
from prefect.engine.executors import DaskExecutor
# see https://docs.prefect.io/api/latest/environments/storage.html#docker
from prefect.environments.storage import Docker

# option 1
storage = Docker(registry_url="gcr.io/my-project/",
                 python_dependencies=["some-extra-public-package"],
                 dockerfile="/path/to/Dockerfile")
# this is the docker build and register workflow!
# storage.build()

# or option 2, specify image directly (this overrides option 1 above)
storage = Docker(
    registry_url="gcr.io/my-project/", image_name="my-image", image_tag="latest"
)
# storage.build()

def get_tasks():
    return [
        "gs://path/to/task.yaml"
    ]

@prefect.task
def run_task(uri):
    # fails because this data needs to be pickled ??
    from data.tasks import TaskBase
    task = TaskBase.from_task_uri(uri)
    # task.run()
    return "done"

with prefect.Flow("dask-example", storage=storage) as flow:
    # chain stuff...
    result = run_task.map(uri=get_tasks())

executor = DaskExecutor(address="tcp://127.0.0.1:8080")
flow.run(executor=executor)
Can anyone explain how/if this type of docker-based workflow should work?
Your dask workers will need access to the same python libraries that your tasks rely on to run. The simplest way to achieve this is to run your dask workers using the same image as your Flow. You could do this manually, or using something like the DaskCloudProviderEnvironment that will create short-lived Dask clusters per-flow run using the same image automatically.
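If you want to confirm whether the workers can actually see your library before running the Flow, here is a quick sketch of an import check (using the scheduler address from your example; data is your library's import name):

from dask.distributed import Client

def can_import_data():
    # True only if the custom library is importable on this worker
    try:
        import data  # noqa: F401
        return True
    except ImportError:
        return False

client = Client("tcp://127.0.0.1:8080")
print(client.run(can_import_data))  # {worker_address: True/False, ...}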

Set environment variables all over processes in Julia

I'm currently working with Julia (1.0) to run some parallel code on an HPC cluster managed with PBS. I'm trying to find a way to broadcast environment variables over all processes, i.e. to automatically broadcast a specific list of environment variables so that they are accessible in every Julia worker.
#!/bin/bash
#PBS ...
export TOTO=toto
julia --machine-file=$PBS_NODEFILE my_script.jl
In this example, I will not be able to access the variable TOTO in each Julia worker (via ENV["TOTO"]).
The only way I have found to do what I want is to set the variables in my .bashrc, but I want this to be script-specific. Another way is to put this in my startup.jl file:
@everywhere ENV["TOTO"] = $(ENV["TOTO"])
But that is not script-specific either, because I have to know in advance which variables I want to send. If I loop over the ENV keys, then I'll broadcast all the variables and overwrite variables I don't want to touch.
I tried to use DotEnv.jl but it doesn't work.
Thanks for your time.
The obvious way is to set the variables first thing in script.jl. You can also put the initialization in a separate file, e.g. environment.jl, and load that on all processes with the -L flag:
julia --machine-file=$PBS_NODEFILE -L environment.jl my_script.jl
where environment.jl would, in this case, contain
ENV["TOTO"] = "toto"
etc.

Problems with Dockerbeats dashboard containerName field

I have dockerbeat set up on a local cluster that is running the ELK stack and some other miscellaneous Docker containers (all controlled via Kubernetes). I set up the dashboard from Ingensi (Ingensi dockerbeat Dashboard) for Kibana and ran into an issue with the containerNames field while setting up the graphs. Now, for context, my Docker containers have names like these:
k8s_dockerbeats.79c42f90_dockerbeats-796n9_default_472faa11-1b3a-11e6-8bf4-28924a2bffbf_2832ea88
k8s_POD.6d00e006_dockerbeats-796n9_default_472faa11-1b3a-11e6-8bf4-28924a2bffbf_3ddcfe44
(as well as supporting containers for Kubernetes with similar names; screenshot: http://i.stack.imgur.com/hvIUG.png)
When I set up the dashboard in Kibana, I get multiple containerNames from the same container. For example, instead of a single containerName output, I get the containerName split up into smaller segments:
k8s_dockerbeats
79c42f90_dockerbeats
796n9
28924a2bffbf_3ddcfe44
and so on...
I assume that the format of the container name is confusing the dashboard (maybe in the way that it parses the name information), and I could probably work around this by renaming every container to something more sensible.
But before I do that, is there a way to configure the dashboard so that it reads in the entire container name string and does not break it up like in the screenshot linked above? (I'm assuming I'll have to dig into the .json files from the repository mentioned above.)
Thanks in advance if anyone answers this.
It sounds like the container name is being analyzed by Elasticsearch. You need to make sure that the container name field is marked as not_analyzed in the Elasticsearch index template. You can do this by installing the index template provided by Dockerbeat.
Marking the field as not_analyzed ensures that the data is not tokenized and it gets indexed as is. It will only be searchable by specifying the exact string.
You will need to delete your current indexes after installing the new index template in order to change the mappings.
Install the provided index template:
curl -XPUT 'http://elasticsearch:9200/_template/dockerbeat' -d@dockerbeat.template.json
Delete the existing indexes:
curl -XDELETE 'http://elasticsearch:9200/dockerbeat-*'
You can view your current mappings by querying Elasticsearch:
curl http://elasticsearch:9200/dockerbeat-*/_mapping
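If you want to script that check, here is a rough sketch with Python and requests (the host, index pattern, and exact field name are assumptions based on the question):

import requests

resp = requests.get("http://elasticsearch:9200/dockerbeat-*/_mapping")
for index, body in resp.json().items():
    for doc_type, mapping in body.get("mappings", {}).items():
        # look up the container name field; adjust the key to match your template
        field = mapping.get("properties", {}).get("containerName")
        print(index, doc_type, field)
# once the template is applied, you should see something like
# {'type': 'string', 'index': 'not_analyzed'} for that field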
