How to create a Dockerfile for cassandra (or any database) that includes a schema? - docker

I would like to create a Dockerfile that builds a Cassandra image with a keyspace and schema already in place when a container starts from it.
In general, how do you create a Dockerfile that will build an image that includes some step(s) that can't really be done until the container is running, at least the first time?
Right now, I have two steps: build the Cassandra image from an existing Cassandra Dockerfile and run it with a volume that maps the CQL schema files into a temporary directory, and then run docker exec with cqlsh to import the schema after the container has started.
But that doesn't create an image with the schema - just a container. That container could be saved as an image, but that's cumbersome.
docker run --name $CASSANDRA_NAME -d \
-h $CASSANDRA_NAME \
-v $CASSANDRA_DATA_DIR:/data \
-v $CASSANDRA_DIR/target:/tmp/schema \
tobert/cassandra:2.1.7
then
docker exec $CASSANDRA_NAME cqlsh -f /tmp/schema/create_keyspace.cql
docker exec $CASSANDRA_NAME cqlsh -f /tmp/schema/schema01.cql
# etc
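For completeness, saving that initialized container as an image, the cumbersome extra step mentioned above, would be something along these lines (the target tag is just an illustration; note that anything written under a mounted volume such as /data would not be captured by the commit):
docker commit $CASSANDRA_NAME cassandra-with-schema:2.1.7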
This works, but it makes it impossible to use with tools like Docker Compose, since linked containers/services will start up too and expect the schema to already be in place.
I saw one attempt where the Cassandra process was started in the background during the Dockerfile build and cqlsh was then run, but I don't think that worked too well.

OK, I had this issue and someone advised me a strategy to deal with it:
Start from an existing Cassandra Dockerfile, the official one for example
Remove the ENTRYPOINT stuff
Copy the schema (.cql) file and data (.csv) into the image and put it somewhere, /opt/data for example
Create a shell script that will be used as the last command to start Cassandra (a rough sketch is shown after this list)
a. start cassandra with $CASSANDRA_HOME/bin/cassandra
b. IF there is a $CASSANDRA_HOME/data/data/your_keyspace-xxxx folder and it's not empty, do nothing more
c. Else
1. sleep some time to allow the server to listen on port 9042
2. when port 9042 is listening, execute the .cql script to load csv files
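A minimal sketch of such a startup script, assuming the schema was copied to /opt/data and the keyspace is called your_keyspace (both are placeholders), and with the wait done through cqlsh instead of a raw port check:
#!/bin/bash
# Sketch only: the paths, keyspace name and schema file name are illustrative.
DATA_DIR="$CASSANDRA_HOME/data/data"
# b/c. If no your_keyspace-* folder exists yet, load the schema once the server is up.
if ! ls -d "$DATA_DIR"/your_keyspace-* >/dev/null 2>&1; then
    (
        # c.1/c.2: wait until CQL (port 9042) answers, then run the .cql script(s).
        until cqlsh -e 'DESCRIBE KEYSPACES' >/dev/null 2>&1; do
            sleep 2
        done
        cqlsh -f /opt/data/schema.cql
    ) &
fi
# a. Start Cassandra in the foreground so it remains the container's main process.
exec "$CASSANDRA_HOME/bin/cassandra" -f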
I found this procedure rather cumbersome, but there seems to be no way around it. For a Cassandra hands-on lab, I found it easier to create a VM image using Vagrant and Ansible.

Make a Dockerfile, Dockerfile_CAS:
FROM cassandra:latest
COPY ddl.cql docker-entrypoint-initdb.d/
COPY docker-entrypoint.sh /docker-entrypoint.sh
RUN ls -la *.sh; chmod +x *.sh; ls -la *.sh
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["cassandra", "-f"]
Then edit docker-entrypoint.sh and add
for f in docker-entrypoint-initdb.d/*; do
case "$f" in
*.sh) echo "$0: running $f"; . "$f" ;;
*.cql) echo "$0: running $f" && until cqlsh -f "$f"; do >&2 echo "Cassandra is unavailable - sleeping"; sleep 2; done & ;;
*) echo "$0: ignoring $f" ;;
esac
echo
done
above the existing exec "$@" line. Then build:
docker build -t suraj1287/cassandra -f Dockerfile_CAS .
and rebuild the image...

Another approach used by our team is to create the schema on server init.
Our Java code tests whether the schema exists and, if not (new environment, new deployment), creates it.
The same goes for every new table: automatic CREATE TABLE statements create the required tables for new data entities when they run in any new cluster (another developer's local machine, preproduction, production).
All this code is isolated inside our DataDriver classes for portability, in case we swap Cassandra for another DB in some client or project.
This prevents a lot of hassle both for admins and for developers.
This approach is even valid for the initial data loading we use in tests.
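The answer above does this in Java; the same idea can also be sketched with plain cqlsh by making every statement idempotent with IF NOT EXISTS (the keyspace, table and host variable here are made-up examples, not the answer's actual code):
cqlsh "$CASSANDRA_HOST" <<'CQL'
CREATE KEYSPACE IF NOT EXISTS myapp
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
CREATE TABLE IF NOT EXISTS myapp.users (
  id uuid PRIMARY KEY,
  name text
);
CQL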

Related

Docker container does not run crontab

I have a Docker image, built from a Dockerfile, based on Ubuntu. I am trying to make a bash script run each day, but the cron job never runs. When the container is running, I check whether cron is running, and it is. The bash script works perfectly and the crontab file is correctly copied inside the container. I can't seem to find where the problem is coming from.
Here is the Dockerfile:
FROM snipe/snipe-it:latest
ENV TZ=America/Toronto
RUN apt-get update \
&& apt-get install awscli -y \
&& apt-get clean \
&& apt-get install cron -y \
&& rm -rf /var/lib/apt/lists/*
RUN mkdir /var/www/html/backups_scripts /var/www/html/config/scripts
COPY config/crontab.txt /var/www/html/backups_scripts
RUN /usr/bin/crontab /var/www/html/backups_scripts/crontab.txt
COPY config/scripts/backups.sh /var/www/html/config/scripts
CMD ["cron","-f"]
The last CMD command doesn't work. And as soon as I remove the CMD command, I get this message when I check the cron status inside the container:
root@fcfb6052274a:/var/www/html# /etc/init.d/cron status
* cron is not running
Even if I start the cron process before installing the crontab, the crontab is still not launched.
This Dockerfile is called by a Docker swarm file (compose file type). Maybe cron must be activated from the compose file.
How can I tackle this problem? Thank you.
You need to approach this differently, as you have to remember that container images and containers are not virtual machines. They're a single process that starts and is maintained through its lifecycle. As such, background processes (like cron) don't exist in a container.
What I've seen most people do is have the container just execute whatever you're looking for it to do in a job like do_the_thing.sh, and then use the docker run command on the host machine to call it via cron.
So for sake of argument, let's say you had an image called myrepo/task with a default entrypoint of do_the_thing.sh
On the host, you could add an entry to crontab:
# m h dom mon dow user command
0 */2 * * * root docker run --rm myrepo/task
Then it's down to a question of design. If the task needs files, you can pass them down via volume. If it needs to put something somewhere when it's done, maybe look at blob storage.
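For instance, if the job reads or writes files, the host crontab entry could bind-mount a directory into the task container (the /srv/task-data path is only an illustration):
# m h dom mon dow user command
0 */2 * * * root docker run --rm -v /srv/task-data:/data myrepo/task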
I think this question is a duplicate, with a detailed response with lots of upvotes here. I followed the top-most Dockerfile example without issues.
Your CMD running cron in the foreground isn't the problem. I ran a quick version of your Dockerfile, and exec'ing into the container I could confirm cron was running. I recommend checking how the cron entries in your crontab file redirect their output.
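One way to make that output visible, assuming the backups.sh path from the question and an arbitrary schedule, is to redirect each entry to the container's stdout (file descriptor 1 of PID 1), so it shows up in docker logs:
0 3 * * * /var/www/html/config/scripts/backups.sh >> /proc/1/fd/1 2>&1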
Expanding on one of the other answers here: a container is actually a lot like a virtual machine, and they often do run many processes concurrently. If you happen to have any other containers running, you might be able to see this most easily by running docker stats and looking at the PID column.
It is also easy to examine this interactively yourself, like this:
$ # Create a simple ubuntu running container named my-ubuntu
$ docker run -it -h my-ubuntu ubuntu
root@my-ubuntu$ ps aw # Shows bash and ps processes running.
root@my-ubuntu$ # Launch a ten minute sleep in the background.
root@my-ubuntu$ sleep 600 &
root@my-ubuntu$ ps aw # Now shows sleep also running.

Start Node Manager in WebLogic (Docker) using a script

I tried to dockerize a WebLogic server. Now I am facing an issue with starting the node manager after the server is started inside the docker container. My Dockerfile is below.
FROM oracle/weblogic:12.1.3-generic
ENV JAVA_OPTIONS="${JAVA_OPTIONS} -Dweblogic.nodemanager.SecureListener=false" \
ADMIN_PORT="7001" \
ADMIN_HOST="localhost"
USER oracle
COPY dockerfiles/keyStore/keystore_ss.jks /u01/oracle/keystore/
COPY dockerfiles/patch/* /u01/oracle/patch/
COPY dockerfiles/local_domainScripts /u01/oracle/local_domainScripts/
COPY dockerfiles/scripts/* /u01/oracle/
COPY dockerfiles/applicationFiles/ /u01/oracle/applicationFiles/
USER root
RUN yum install -y procps
RUN chmod +x startWeblogic.sh
USER oracle
RUN /u01/oracle/wlst /u01/oracle/local_domainScripts/config.py
RUN nohup bash -c "/u01/oracle/user_projects/domains/local_domain/bin/startNodeManager.sh &" && sleep 4
CMD ["/u01/oracle/user_projects/domains/local_domain/startWebLogic.sh"]
This will create a WebLogic server instance. I want to start the node manager after this server is started.
Run command:
docker run -d --name wls_local_domain --network=host --hostname localhost -p 7001:7001 test-docker:0-SNAPSHOT
When ./startNodeManager.sh is executed inside the container, it starts the node manager. To start the node manager, the WebLogic server needs to be started first.
I want to do this using a bash script. I tried this one but it didn't help:
github link
You can't (usefully) RUN a background process. That Dockerfile command launches an intermediate container executing the RUN command, saves its filesystem, and exits; there is no process running any more by the time the next Dockerfile command executes.
If this is a commercially maintained image, you might look into whether Oracle has instructions on how to use it. (From clicking around, none of the samples there start a node manager; is it necessary?)
Best practice is generally to run only one server in a Docker container (and ideally in the foreground and as the container's main process). If that will work and there aren't shared filesystem dependencies, you can split all of this except the final CMD into one base Dockerfile, then have two additional Dockerfiles that just have a FROM line pointing at your mostly-built image and a requested CMD.
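A rough sketch of that split, with a made-up base image tag: build everything except the final CMD into a base image, then derive one tiny Dockerfile per process, for example:
# Dockerfile.nodemanager (illustrative): reuse the common image, override only the command
FROM wls-base:0-SNAPSHOT
CMD ["/u01/oracle/user_projects/domains/local_domain/bin/startNodeManager.sh"]
with a second, analogous Dockerfile whose CMD is the existing startWebLogic.sh line.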
If that really won't work then you'll have to fall back to running some init system in your container, typically supervisord.
You need to start the node manager as a background process and then start the server. In order to keep the docker container alive while you are running the background processes, you can use the tail command.
This is how I start the node managed and the WebLogic server in my container:
#!/bin/bash
# ------------------------------------------------------------------------------
# start the Node Manager
# ------------------------------------------------------------------------------
function startNodeManager() {
echo "starting the node manager for $MANAGED_SERVER_NAME server..."
"$ORACLE_HOME/user_projects/domains/$DOMAIN_NAME/bin/startNodeManager.sh" &
while ! nc -z "$HOSTNAME" "$NODE_MANAGER_PORT"; do
sleep 0.5
done
echo "node manager is up and ready to receive requests"
}
# ------------------------------------------------------------------------------
# start the WebLogic Admin server
# ------------------------------------------------------------------------------
function startAdminServer() {
echo "starting the $ADMIN_SERVER_NAME server..."
local logHome
logHome="$ORACLE_HOME/user_projects/domains/$DOMAIN_NAME/servers/$ADMIN_SERVER_NAME/logs"
mkdir -p "$logHome"
"$ORACLE_HOME/user_projects/domains/$DOMAIN_NAME/bin/startWebLogic.sh" > "$logHome/$ADMIN_SERVER_NAME.out" 2>&1 &
}
# ------------------------------------------------------------------------------
# main app starts here
# ------------------------------------------------------------------------------
startNodeManager
startAdminServer
# this command keeps alive the docker container
tail -F \
"$ORACLE_HOME/user_projects/domains/$DOMAIN_NAME/servers/$ADMIN_SERVER_NAME/logs/$ADMIN_SERVER_NAME.log" \
"$ORACLE_HOME/user_projects/domains/$DOMAIN_NAME/servers/$ADMIN_SERVER_NAME/logs/$ADMIN_SERVER_NAME.nohup" \
"$ORACLE_HOME/user_projects/domains/$DOMAIN_NAME/servers/$ADMIN_SERVER_NAME/logs/$ADMIN_SERVER_NAME.out"
This is a complete startup script that you can use as an example and improve upon. It starts the node manager and the admin server: https://github.com/zappee/docker-images/blob/master/oracle-weblogic/oracle-weblogic-12.2.1.4-admin-server/container-scripts/startup.sh
From here you can download the complete working solution.

Docker Logs Issue: Logs are not created or displayed in Tomcat's logs folder in a docker container

We are using a Docker container created from a Dockerfile. Inside this container we deployed a war file using the Tomcat image,
and we can see the Tomcat logs on the console, but the console log does not update
after sending a request to Tomcat via a URL.
Also, we cannot see any log file inside Tomcat's logs folder.
Can anyone help me out with how we can see the Tomcat logs like localhost.log / catalina.log / manager.log, etc.?
My Dockerfile is:
FROM openjdk:6-jre
ENV CATALINA_HOME /usr/local/tomcat
ENV PATH $CATALINA_HOME/bin:$PATH
COPY tomcat $CATALINA_HOME
ADD newui.war $CATALINA_HOME/webapps
CMD $CATALINA_HOME/bin/startup.sh && tail -F $CATALINA_HOME/logs/catalina.out
EXPOSE 8080
I used the command below to build:
$ docker build -t tomcat .
and the command below to run Tomcat:
$ docker run -p 8080:8080 tomcat
Here are a few things wrong with your Dockerfile:
You mention that you need Java 6, and yet the line FROM java as of this writing is set to use java:8.
You need to replace the FROM line with FROM java:6-jre or, as suggested by the official page, FROM openjdk:6-jre if in 2018 you still need Java 6, which is dangerous. I would also strongly suggest using at least FROM tomcat:7, which should be able to run Java 6 applets but will include some bug fixes, including support for longer Diffie-Hellman primes for HTTPS (if you are serious about your app's security).
Copt tomcat $CATALINA_HOME: you either mistyped the line when posting to SO, or your image should not build at all. It should be COPY tomcat $CATALINA_HOME.
Given that you are using the COPY command there is no need to use RUN mkdir -p prior to this, since the COPY command will automatically create all the required folders.
CMD $CATALINA_HOME/bin/startup.sh && tail -f $CATALINA_HOME/logs/catalina.out
First, the tail -f part: since you are looking to tail a log file which might be created and recreated during the server's operation, instead of following the file descriptor you should be following the path by doing tail -F (capital F).
startup.sh && tail: tail will never start until startup.sh exits. A better approach is to do tail -F $CATALINA_HOME/logs/catalina.out & inside your startup.sh right before you start your Tomcat server. That way tail will be running in the background.
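A hedged sketch of such a wrapper (not your actual startup.sh): tail follows catalina.out in the background and wait keeps the container alive after Tomcat's startup.sh returns:
#!/bin/bash
# Stream catalina.out to the container's stdout in the background.
touch "$CATALINA_HOME/logs/catalina.out"
tail -F "$CATALINA_HOME/logs/catalina.out" &
# Start Tomcat (startup.sh backgrounds it), then wait on the tail job so the container keeps running.
"$CATALINA_HOME/bin/startup.sh"
wait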
Regardless, this is a somewhat dangerous approach and you risk zombie processes, because bash does not manage its child processes and neither does docker. I would recommend using supervisord or something similar.
(From https://docs.docker.com/engine/admin/multi-service_container/)
FROM ubuntu:latest
RUN apt-get update && apt-get install -y supervisor
RUN mkdir -p /var/log/supervisor
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
COPY my_first_process my_first_process
COPY my_second_process my_second_process
CMD ["/usr/bin/supervisord"]
Note: this Dockerfile sample omits a few best practices, e.g. removing the apt cache in the same RUN command as the apt-get update.
My personal favorite is phusion/baseimage, but it is harder to set up since you'll need to install everything, including Java, into the image.
If, with all of these modifications, you still have no luck in seeing the console update, then you'll also need to post the contents of your startup.sh file or other Tomcat-related configuration.
P.S.: it might be a good idea to do RUN mkdir -p $CATALINA_HOME/logs just to make sure that the logs folder exists for tomcat to write to.
P.P.S.: the java base image is actually using OpenJDK instead of the Oracle one. Just thought I'd point it out.
You should check the Tomcat logging settings. The default logging.properties in the JRE specifies a ConsoleHandler that routes logging to System.err. The default conf/logging.properties in Apache Tomcat also adds several FileHandlers that write to files.
Example logging.properties file to be placed in $CATALINA_BASE/conf:
handlers = 1catalina.org.apache.juli.FileHandler, \
2localhost.org.apache.juli.FileHandler, \
3manager.org.apache.juli.FileHandler, \
java.util.logging.ConsoleHandler
.handlers = 1catalina.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
############################################################
# Handler specific properties.
# Describes specific configuration info for Handlers.
############################################################
1catalina.org.apache.juli.FileHandler.level = FINE
1catalina.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
1catalina.org.apache.juli.FileHandler.prefix = catalina.
2localhost.org.apache.juli.FileHandler.level = FINE
2localhost.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
2localhost.org.apache.juli.FileHandler.prefix = localhost.
3manager.org.apache.juli.FileHandler.level = FINE
3manager.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
3manager.org.apache.juli.FileHandler.prefix = manager.
3manager.org.apache.juli.FileHandler.bufferSize = 16384
java.util.logging.ConsoleHandler.level = FINE
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
############################################################
# Facility specific properties.
# Provides extra control for each logger.
############################################################
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].handlers = \
2localhost.org.apache.juli.FileHandler
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].handlers = \
3manager.org.apache.juli.FileHandler
# For example, set the org.apache.catalina.util.LifecycleBase logger to log
# each component that extends LifecycleBase changing state:
#org.apache.catalina.util.LifecycleBase.level = FINE
Example logging.properties for the servlet-examples web application to be placed in WEB-INF/classes inside the web application:
handlers = org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
############################################################
# Handler specific properties.
# Describes specific configuration info for Handlers.
############################################################
org.apache.juli.FileHandler.level = FINE
org.apache.juli.FileHandler.directory = ${catalina.base}/logs
org.apache.juli.FileHandler.prefix = servlet-examples.
java.util.logging.ConsoleHandler.level = FINE
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
More info at https://tomcat.apache.org/tomcat-6.0-doc/logging.html
We cannot see the logs in the Docker container unless we mount the logs directory.
To build the image:
docker build -t tomcat .
To run the image:
docker run -p 8080:8080 tomcat
To make the Tomcat logs inside the docker container visible on the host, mount a volume.
Run this command to mount it (the format is host-path:container-path):
docker run -d -p 8085:8085 -v /usr/local/tomcat/logs:/usr/local/tomcat/logs tomcat
or simply
docker run -d -v /usr/local/tomcat/logs:/usr/local/tomcat/logs tomcat
1st /usr/local/tomcat/logs: the path on the host where we want the logs to be copied (the destination).
2nd /usr/local/tomcat/logs: the path of the tomcat/logs folder inside the docker container.
tomcat: the name of the image.
Change the published port if it is busy.
Now the volume is mounted.
To get the list of containers, run: docker ps -a
Then get the container id of the most recently created container and open a shell in it:
docker exec -it <container-id> bash
Then we can see the logs:
cd /usr/local/tomcat/logs
less <log-file-name>
To copy any folder from the docker container to the host:
docker cp <containerId>:/file/path/within/container /host/path/target

Delete volumes from images

When I create a container from docker-compose with some volumes and then commit that container, the volumes in the docker-compose file are committed too. Is there a way to not commit the volumes to the image?
With the command below I can only add a volume, not delete one:
docker commit -c 'VOLUME /foo' container_name image_name
Thank you.
Update (April 2018): In "How can I edit an existing docker image metadata?", Guido U. Draheim proposes gdraheim/docker-copyedit, a Python script that can edit docker image metadata.
It can remove or override image metadata, including volumes.
The command would be:
./docker-copyedit.py FROM image1 INTO image2 REMOVE ALL VOLUMES
Since 2018, the same issue also includes this feedback (from Aalex Gabi):
For building a CI image with an embedded MySQL database snapshot I ended up using this solution: "Persist & share dev data in a Docker image with commit" from Steven Landow.
FROM mysql:5.7
ADD snapshots/default.sql /tmp/default.sql
# Using separate data folder outside of mysql image declared volume
# https://github.com/moby/moby/issues/3465
# https://medium.com/@stevenlandow/persist-share-dev-mysql-data-in-a-docker-image-with-commit-f9aa9910be0a
RUN mkdir /var/lib/mysql-no-volume
RUN set -exu ;\
MYSQL_ROOT_PASSWORD=root docker-entrypoint.sh --datadir /var/lib/mysql-no-volume &\
MYSQL_PID=$! &&\
timeout 22 bash -c 'until printf "" 2>>/dev/null >>/dev/tcp/$0/$1; do sleep 1; done' localhost 3306 &&\
mysql -proot -e 'create database `mydb` collate "utf8mb4_general_ci"' &&\
mysql -proot mydb < /tmp/default.sql &&\
kill $MYSQL_PID &&\
tail --pid=$MYSQL_PID -f /dev/null # Using tail to wait for PID to end https://unix.stackexchange.com/questions/427115/listen-for-exit-of-process-given-pid
# Using separate data folder outside of mysql image declared volume
# https://github.com/moby/moby/issues/3465
# https://medium.com/@stevenlandow/persist-share-dev-mysql-data-in-a-docker-image-with-commit-f9aa9910be0a
CMD ["--datadir", "/var/lib/mysql-no-volume"]
It seems that this is currently not possible, though there are many people requesting the feature and someone might be working on it. This Github issue discusses the topic:
https://github.com/moby/moby/issues/3465

Run command in Docker Container only on the first start

I have a Docker image which uses a script (/bin/bash /init.sh) as its entrypoint. I would like to execute this script only on the first start of a container. It should be skipped when the container is restarted or started again after a crash of the docker daemon.
Is there any way to do this with docker itself, or do I have to implement some kind of check in the script?
I had the same issue; here is a simple procedure (i.e. workaround) to solve it:
Step 1:
Create a "myStartupScript.sh" script that contains this code:
#!/bin/bash
CONTAINER_ALREADY_STARTED="CONTAINER_ALREADY_STARTED_PLACEHOLDER"
if [ ! -e $CONTAINER_ALREADY_STARTED ]; then
touch $CONTAINER_ALREADY_STARTED
echo "-- First container startup --"
# YOUR_JUST_ONCE_LOGIC_HERE
else
echo "-- Not first container startup --"
fi
Step 2:
Replace the line "# YOUR_JUST_ONCE_LOGIC_HERE" with the code you want to be executed only the first time the container is started
Step 3:
Set the script as the entrypoint of your Dockerfile:
ENTRYPOINT ["/myStartupScript.sh"]
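For context, wiring that script into an image could look like this minimal sketch (the base image is only a placeholder):
FROM ubuntu:22.04
COPY myStartupScript.sh /myStartupScript.sh
# The script must be executable for the exec-form ENTRYPOINT to start it.
RUN chmod +x /myStartupScript.sh
ENTRYPOINT ["/myStartupScript.sh"]
Your actual long-running command would typically be exec'd at the end of myStartupScript.sh (or passed as CMD and launched with exec "$@").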
In summary, the logic is quite simple: it checks whether a specific file is present in the filesystem; if not, it creates it and executes your just-once code. The next time you start your container the file is already in the filesystem, so the code is not executed.
The entrypoint for a docker container tells the docker daemon what to run when you want to "run" that specific container. Let's ask the questions: "what should the container run when it's started the second time?" or "what should the container run after being rebooted?"
Probably, what you are doing is following the same approach you do with "old-school" provisioning mechanisms. Your script is "installing" the needed scripts and you will run your app as a systemd/upstart service, right? If you are doing that, you should change that into a more "dockerized" definition.
The entrypoint for that container should be a script that actually launches your app instead of setting things up. Let's say that you need Java installed to be able to run your app. So in the Dockerfile you set up the base container to install all the things you need, like:
FROM alpine:edge
RUN apk --update upgrade && apk add openjdk8-jre-base
RUN mkdir -p /opt/your_app/ && adduser -HD userapp
ADD target/your_app.jar /opt/your_app/your-app.jar
ADD scripts/init.sh /opt/your_app/init.sh
USER userapp
EXPOSE 8081
CMD ["/bin/bash", "/opt/your_app/init.sh"]
At the company I work for, before running the actual app, our containers' init.sh script fetches the configs from Consul (instead of providing a mount point and placing the configs on the host, or embedding them into the container). So the script will look something like:
#!/bin/bash
echo "Downloading config from consul..."
confd -onetime -backend consul -node $CONSUL_URL -prefix /cfgs/$CONSUL_APP/$CONSUL_ENV_NAME
echo "Launching your-app..."
java -jar /opt/your_app/your-app.jar
One piece of advice I can give you (from my really short experience working with containers) is to treat your containers as if they were stateless once they are provisioned (provisioning being all the commands you run before the entrypoint).
I had to do this and I ended up doing a docker run -d, which just created a detached container and started bash (in the background), followed by a docker exec that did the necessary initialization. Here's an example:
docker run -itd --name=myContainer myImage /bin/bash
docker exec -it myContainer /bin/bash -c /init.sh
Now when I restart my container I can just do
docker start myContainer
docker attach myContainer
This may not be ideal, but it works fine for me.
I wanted to do the same in a Windows container. It can be achieved using Task Scheduler on Windows; the Linux equivalent of Task Scheduler is cron, which you can use in your case. To do this, edit the Dockerfile and add the following lines at the end:
WORKDIR /app
COPY myTask.ps1 .
RUN schtasks /Create /TN myTask /SC ONSTART /TR "c:\WINDOWS\system32\WindowsPowerShell\v1.0\powershell.exe C:\app\myTask.ps1" /ru SYSTEM
This creates a task named myTask, runs it ONSTART, and the task itself executes a PowerShell script placed at "c:\app\myTask.ps1".
This myTask.ps1 script will do whatever initialization you need on container startup. Make sure you delete this task once it has executed successfully, or else it will run at every startup. To delete it, you can use the following command at the end of the myTask.ps1 script.
schtasks /Delete /TN myTask /F
