How to bundle elasticsearch templates with docker image? - docker

Im currently building a custom docker image to be used for integration test. My requirement is to set it up with custom configuration with default ingester pipeline and template mappings.
Dockerfile:
FROM docker.elastic.co/elasticsearch/elasticsearch:5.6.2
ADD config /usr/share/elasticsearch/config/
USER root
RUN chown -R elasticsearch:elasticsearch config
RUN chmod +x config/setup.sh
USER elasticsearch
RUN elasticsearch-plugin remove x-pack
EXPOSE 9200
EXPOSE 9300
where config is a directory which contains:
> elasticsearch.yml for the configuration
> templates in the form of json files
> setup.sh - script which executes curl to es in order to register pipelines to _ingester and template mappings
The setup script looks like this:
#!/bin/bash
# This script sets up the es5 docker instance with the correct pipelines and templates
baseUrl='127.0.0.1:9200'
contentType='Content-Type:application/json'
# filebeat
filebeatUrl=$baseUrl'/_ingest/pipeline/filebeat-pipeline?pretty'
filebeatPayload='#pipeline/filebeat-pipeline.json'
echo 'setting filebeat pipeline...'
filebeatResult=$(curl -XPUT $filebeatUrl -H$contentType -d$filebeatPayload)
echo -e "filebeat pipeline setup result: \n$filebeatResult"
# template
echo -e "\n\nsetting up templates..."
sleep 1
cd template
for f in *.json
do
templateName="${f%.*}"
templateUrl=$baseUrl'/_template/'$templateName
echo -e "\ncreating index template for $templateName..."
templateResult=$(curl -XPUT $templateUrl -H$contentType -d#$f)
echo -e "$templateName result: $templateResult"
sleep 1
done
echo -e "\n\n\nCompleted ES5 Setup, refer to logs for details"
How do i build and run the image in such a way that the script gets executed AFTER elastic is up and running?

What I usually do is to include a warmer script like yours and at the beginning I add the following lines. There's no other way that I know of in Docker to wait for the underlying service to launch
# wait until ES is up
until $(curl -o /dev/null -s --head --fail $baseUrl); do
echo "Waiting for ES to start..."
sleep 5
done

If template mapping is not evolving frequently then you can try below solution:
You can embed template in your custom image by saving container state(creating new image) using following steps:
Run your image as per your dockerfile(elasticsearch would have been stared in it)
Use docker exec command to run your template(curl command or script)
Use docker commit to save container state and create new image which will already have template
Use newly created image which already has template mapping.You don't need to run template mapping as part of script since your image itself will have it.

Related

Auto-create Rundeck jobs on startup (Rundeck in Docker container)

I'm trying to setup Rundeck inside a Docker container. I want to use Rundeck to provision and manage my Docker fleet. I found an image which ships an ansible-plugin as well. So far running simple playbooks and auto-discovering my Pi nodes work.
Docker script:
echo "[INFO] prepare rundeck-home directory"
mkdir ../../target/work/home
mkdir ../../target/work/home/rundeck
mkdir ../../target/work/home/rundeck/data
echo -e "[INFO] copy host inventory to rundeck-home"
cp resources/inventory/hosts.ini ../../target/work/home/rundeck/data/inventory.ini
echo -e "[INFO] pull image"
docker pull batix/rundeck-ansible
echo -e "[INFO] start rundeck container"
docker run -d \
--name rundeck-raspi \
-p 4440:4440 \
-v "/home/sebastian/work/workspace/workspace-github/raspi/target/work/home/rundeck/data:/home/rundeck/data" \
batix/rundeck-ansible
Now I want to feed the container with playbooks which should become jobs to run in Rundeck. Can anyone give me a hint on how I can create Rundeck jobs (which should invoke an ansible playbook) from the outside? Via api?
One way I can think of is creating the jobs manually once and exporting them as XML or YAML. When the container and Rundeck is up and running I could import the jobs automatically. Is there a certain folder in rundeck-home or somewhere where I can put those files for automatic import? Or is there an API call or something?
Could Jenkins be more suited for this task than Rundeck?
EDIT: just changed to a Dockerfile
FROM batix/rundeck-ansible:latest
COPY resources/inventory/hosts.ini /home/rundeck/data/inventory.ini
COPY resources/realms.properties /home/rundeck/etc/realms.properties
COPY resources/tokens.properties /home/rundeck/etc/tokens.properties
# import jobs
ENV RD_URL="http://localhost:4440"
ENV RD_TOKEN="yJhbGciOiJIUzI1NiIs"
ENV rd_api="36"
ENV rd_project="Test-Project"
ENV rd_job_path="/home/rundeck/data/jobs"
ENV rd_job_file="Ping_Nodes.yaml"
# copy job definitions and script
COPY resources/jobs-definitions/Ping_Nodes.yaml /home/rundeck/data/jobs/Ping_Nodes.yaml
RUN curl -kSsv --header "X-Rundeck-Auth-Token:$RD_TOKEN" \
-F yamlBatch=#"$rd_job_path/$rd_job_file" "$RD_URL/api/$rd_api/project/$rd_project/jobs/import?fileformat=yaml&dupeOption=update"
Do you know how I can delay the curl at the end until after the rundeck service is up and running?
That's right you can design an script with an API call using cURL (pointing to your Docker instance) after deploying your instance (a script that deploys your instance and later import the jobs), I leave a basic example (in this example you need the job definition in XML format).
For XML job definition format:
#!/bin/sh
# protocol
protocol="http"
# basic rundeck info
rdeck_host="localhost"
rdeck_port="4440"
rdeck_api="36"
rdeck_token="qNcao2e75iMf1PmxYfUJaGEzuVOIW3Xz"
# specific api call info
rdeck_project="ProjectEXAMPLE"
rdeck_xml_file="HelloWorld.xml"
# api call
curl -kSsv --header "X-Rundeck-Auth-Token:$rdeck_token" \
-F xmlBatch=#"$rdeck_xml_file" "$protocol://$rdeck_host:$rdeck_port/api/$rdeck_api/project/$rdeck_project/jobs/import?fileformat=xml&dupeOption=update"
For YAML job definition format:
#!/bin/sh
# protocol
protocol="http"
# basic rundeck info
rdeck_host="localhost"
rdeck_port="4440"
rdeck_api="36"
rdeck_token="qNcao2e75iMf1PmxYfUJaGEzuVOIW3Xz"
# specific api call info
rdeck_project="ProjectEXAMPLE"
rdeck_yml_file="HelloWorldYML.yaml"
# api call
curl -kSsv --header "X-Rundeck-Auth-Token:$rdeck_token" \
-F xmlBatch=#"$rdeck_yml_file" "$protocol://$rdeck_host:$rdeck_port/api/$rdeck_api/project/$rdeck_project/jobs/import?fileformat=yaml&dupeOption=update"
Here the API call.

Is it possible to add an installer, run it and delete it during one build step in Docker?

I'm trying to create a Docker image from a pretty large installer binary (300+ MB). I want to add the installer to the image, install it, and delete the installer. This doesn't seem to be possible:
COPY huge-installer.bin /tmp
RUN /tmp/huge-installer.bin
RUN rm /tmp/huge-installer.bin # <- has no effect on the image size
Using multiple build stages doesn't seem to solve this, since I need to run the installer in the final image. If I could execute the installer directly from a previous build stage, without copying it, that would solve my problem, but as far as I know that's not possible.
Is there any way to avoid including the full weight of the installer in the final image?
I ended up solving this by using the built-in HTTP server in Python to make the project directory available to the image over HTTP.
Inside the Dockerfile, I can run commands like this, piping scripts directly to bash using curl:
RUN curl "http://127.0.0.1:${SERVER_PORT}/installer-${INSTALLER_VERSION}.bin" | bash
Or save binaries, run them and delete them in one step:
RUN curl -O "http://127.0.0.1:${SERVER_PORT}/binary-${INSTALLER_VERSION}.bin" && \
./binary-${INSTALLER_VERSION}.bin && \
rm binary-${INSTALLER_VERSION}.bin
I use a Makefile to start the server and stop it after the build, but you can use a build script instead.
Here's a Makefile example:
SHELL := bash
IMAGE_NAME := app-test
VERSION := 1.0.0
SERVER_PORT := 8580
.ONESHELL:
.PHONY: build
build:
# Kills the HTTP server when the build is done
function cleanup {
pkill -f "python3 -m http.server.*${SERVER_PORT}"
}
trap cleanup EXIT
# Starts a HTTP server that makes the contents of the project directory
# available to the image
python3 -m http.server -b 127.0.0.1 ${SERVER_PORT} &>/dev/null &
sleep 1
EXTRA_ARGS=""
# Allows skipping the build cache by setting NO_CACHE=1
if [[ -n $$NO_CACHE ]]; then
EXTRA_ARGS="--no-cache"
fi
docker build $$EXTRA_ARGS \
--network host \
--build-arg SERVER_PORT=${SERVER_PORT} \
-t ${IMAGE_NAME}:latest \
.
docker tag ${IMAGE_NAME}:latest ${IMAGE_NAME}:${VERSION}
I think the best way is to download the bin from a website then run it:
RUN wget http://myweb/huge-installer.bin && /tmp/huge-installer.bin && rm /tmp/huge-installer.bin
in this way your image layer will not contain the binary you download
I didn't test it thoroughly, but wouldn't such an approach be viable? (Besides LinPy's answer, which is way easier if you have the possibility to just do it that way.)
Dockerfile:
FROM alpine:latest
COPY entrypoint.sh /tmp/entrypoint.sh
RUN \
echo "I am an image that can run your huge installer binary!" \
&& echo "I will only function when you give it to me as a volume mount."
ENTRYPOINT [ "/tmp/entrypoint.sh" ]
entrypoint.sh:
#!/bin/sh
/tmp/your-installer # install your stuff here
while true; do
echo "installer finished, commit me now!"
sleep 5
done
Then run:
$ docker build -t foo-1
$ docker run --rm --name foo-1 --rm -d -v $(pwd)/your-installer:/tmp/your-installer
$ docker logs -f foo-1
# once it echoes "commit me now!", run the next command
$ docker commit foo-1 foo-2
$ docker stop foo-1
Since the installer was only mounted as a volume, the image foo-2 should not contain it anymore. You could also go and build another Dockerfile based on foo-2 to change the entrypoint, for example.
Cf. docker commit

Rails migration on ECS

I am trying to figure out how to run rake db:migrate on my ECS service but only on one machine after deployment.
Anyone has experience with that?
Thanks
You may do it via Amazon ECS one-off task.
Build a docker image with rake db migrate as "CMD" in your docker file.
Create a task definition. You may choose one task per host while creating the task-definition and desired task number as "1".
Run a one-off ECS task inside your cluster. Make sure to make it outside service. Once It completed the task then the container will stop automatically.
You can write a script to do this before your deployment. After that, you can define your other tasks as usual.
You can also refer to the container lifecycle in Amazon ECS here: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_life_cycle.html. However, this is the default behavior of the docker.
Let me know if it works for you.
I built a custom shell script to run when my docker containers start ( CMD command in docker ):
#!/bin/sh
web_env=${WEB_ENV:-1}
rails_env=${RAILS_ENV:-staging}
rails_host=${HOST:-'https://www.example.com'}
echo "*****************RAILS_ENV is $RAILS_ENV default to $rails_env"
echo "***************** WEB_ENV is $WEB_ENV default to $web_env"
######## Rails migration ################################################
echo "Start rails migration"
echo "cd /home/app/example && bundle exec rake db:migrate RAILS_ENV=$rails_env"
cd /home/app/example
bundle exec rake db:migrate RAILS_ENV=$rails_env
echo "Complete migration"
if [ "$web_env" = "1" ]; then
######## Generate webapp.conf##########################################
web_app=/etc/nginx/sites-enabled/webapp.conf
replace_rails_env="s~{{rails_env}}~${rails_env}~g"
replace_rails_host="s~{{rails_host}}~${rails_host}~g"
# sed: -i may not be used with stdin in MacOsX
# Edit files in-place, saving backups with the specified extension.
# If a zero-length extension is given, no backup will be saved.
# we use -i.back as backup file for linux and
# In Macosx require the backup to be specified.
sed -i.back -e $replace_rails_env -e $replace_rails_host $web_app
rm "${web_app}.back" # remove webapp.conf.back cause nginx to fail.
# sed -i.back $replace_rails_host $web_app
# sed -i.back $replace_rails_server_name $web_app
######## Enable Web app ################################################
echo "Web app: enable nginx + passenger: rm -f /etc/service/nginx/down"
rm -f /etc/service/nginx/down
else
######## Create Daemon for background process ##########################
echo "Sidekiq service enable: /etc/service/sidekiq/run "
mkdir /etc/service/sidekiq
touch /etc/service/sidekiq/run
chmod +x /etc/service/sidekiq/run
echo "#!/bin/sh" > /etc/service/sidekiq/run
echo "cd /home/app/example && bundle exec sidekiq -e ${rails_env}" >> /etc/service/sidekiq/run
fi
echo ######## Custom Service setup properly"
What I did was to build a docker image to be run as a web server ( Nginx + passenger) or Sidekiq background process. The script will decide whether it is a web or Sidekiq via ENV variable WEB_ENV and rails migration will always get executed.
This way I can be sure the migration always up to date. I think this will work perfectly for a single task.
I am using a Passenger docker that has been designed very easy to customize but if you use another rails app server you can learn from the docker design of Passenger to apply to your own docker design.
For example, you can try something like:
In your Dockerfile:
CMD ["/start.sh"]
Then you create a start.sh where you put the commands which you want to execute:
start.sh
#! /usr/bin/env bash
echo "Migrating the database..."
rake db:migrate

How to create a Dockerfile for cassandra (or any database) that includes a schema?

I would like to create a dockerfile that builds a Cassandra image with a keyspace and schema already there when the image starts.
In general, how do you create a Dockerfile that will build an image that includes some step(s) that can't really be done until the container is running, at least the first time?
Right now, I have two steps: build the cassandra image from an existing cassandra Dockerfile that maps a volume with the CQL schema files into a temporary directory, and then run docker exec with cqlsh to import the schema after the image has been started as a container.
But that doesn't create an image with the schema - just a container. That container could be saved as an image, but that's cumbersome.
docker run --name $CASSANDRA_NAME -d \
-h $CASSANDRA_NAME \
-v $CASSANDRA_DATA_DIR:/data \
-v $CASSANDRA_DIR/target:/tmp/schema \
tobert/cassandra:2.1.7
then
docker exec $CASSANDRA_NAME cqlsh -f /tmp/schema/create_keyspace.cql
docker exec $CASSANDRA_NAME cqlsh -f /tmp/schema/schema01.cql
# etc
This works, but it makes it impossible to use with tools like Docker compose since linked containers/services will start up too and expect the schema to be in place.
I saw one attempt where the cassandra process as attempted to be started in the background in the Dockerfile during build, then cqlsh run, but I don't think that worked too well.
Ok I had this issue and someone advised me some strategy to deal with:
Start from an existing Cassandra Dockerfile, the official one for example
Remove the ENTRYPOINT stuff
Copy the schema (.cql) file and data (.csv) into the image and put it somewhere, /opt/data for example
create a shell script that will be used as the last command to start Cassandra
a. start cassandra with $CASSANDRA_HOME/bin/cassandra
b. IF there is a $CASSANDRA_HOME/data/data/your_keyspace-xxxx folder and it's not empty, do nothing more
c. Else
1. sleep some time to allow the server to listen on port 9042
2. when port 9042 is listening, execute the .cql script to load csv files
I found this procedure rather cumbersome but there seems to be no other way around. For Cassandra hands-on lab, I found it easier to create a VM image using Vagrant and Ansible.
Make a docker file Dockerfile_CAS:
FROM cassandra:latest
COPY ddl.cql docker-entrypoint-initdb.d/
COPY docker-entrypoint.sh /docker-entrypoint.sh
RUN ls -la *.sh; chmod +x *.sh; ls -la *.sh
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["cassandra", "-f"]
edit docker-entrypoint.sh, add
for f in docker-entrypoint-initdb.d/*; do
case "$f" in
*.sh) echo "$0: running $f"; . "$f" ;;
*.cql) echo "$0: running $f" && until cqlsh -f "$f"; do >&2 echo "Cassandra is unavailable - sleeping"; sleep 2; done & ;;
*) echo "$0: ignoring $f" ;;
esac
echo
done
above exec "$#"
docker build -t suraj1287/cassandra -f Dockerfile_CAS .
and rebuild the image...
Another approach used by our team is create schema on server init.
Our java code test if exist the SCHEMA, if not (new environment, new deployment) create it.
Same for every new TABLE, automatic CREATE TABLE creates required new tables for new data entities when they run in any new cluster (other developer local, preproduction, production).
All this code is isolated inside our DataDriver classes for portability, in case we change Cassandra for another DB in some client or project.
This prevent a lot of hassle both for admins and for developers.
This approach is even valid for initial data loading, we use on tests.

Simple Shell Script: Dead Docker Containers

I have a very simple Docker container that runs a bash shell script that returns something. My Dockerfile:
# Docker image to get stats from a rest interface using CURL and JSON parsing
FROM ubuntu
RUN apt-get update
# Install curl and jq, a lightweight command-line JSON processor
RUN apt-get install -y curl jq
COPY ./stats.sh /
# Make sure script has execute permissions for root
RUN chmod 500 stats.sh
# Define a custom entrypoint to execute stats commands easily within the container,
# using environment substitution and the like...
ENTRYPOINT ["/stats.sh"]
CMD ["info"]
The stats.sh looks like this:
#!/bin/bash
# ElasticSearch
## Get the total size of the elasticsearch DB in bytes
## Requires the elasticsearch container to be linked with alias 'elasticsearch'
function es_size() {
local size=$(curl $ELASTICSEARCH_PORT_9200_TCP_ADDR:$ELASTICSEARCH_PORT_9200_TCP_PORT/_stats/_all 2>/dev/null|jq ._all.total.store.size_in_bytes)
echo $size
}
if [[ "$1" == "info" ]]; then
echo "Check stats.sh for available commands"
elif [[ "$1" == "es_size" ]]; then
es_size
else
echo "Unknown command: $#"
fi
So basically, I have a Docker container that I will run with --rm to exit immediately after running and returning the value I want. More precise, I run it from another shell script (in the host) with:
local size=$(docker run --name stats-es-size --rm --link $esName:elasticsearch $ENV_DOCKER_REST_STATS_IMAGE:$ENV_DOCKER_REST_STATS_VERSION es_size)
Now I'm running this periodically to gather statistics, once a minute. While it works well in general, I end up getting containers with status Dead about once a day.
Can anybody tell me what I might be doing wrong? Is there some problem with my approach or why do my containers die with a certain frequency?

Resources