Running mongodump from within a docker container - docker

I need to be able to run the mongodump utility from within a Docker container because the developers on my team do not have the utilities installed locally. So, I created a node script, export.js, that handles running the tool and also zips the output.
Ideally what I want is to be able to run an npm script:
{
"db:export": "docker build -t some-local-container -f docker-images/Dockerfile.export . && docker run some-local-container"
}
some-local-container would have access to the node_modules and the export.js script I would like to run. It would also run this script as the default ENTRYPOINT. Obviously, it would be built with both node and the mongo tools installed.
My question is vague, but is there an easier way to do this? I feel like this is overkill for what I want to do. This is a development-only script, so it didn't seem to make sense for it to live within the Dockerfile for our Mongo instance.
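For reference, here is a minimal sketch of what such a Dockerfile.export could look like. The base images, paths, and the trick of copying mongodump out of the official mongo image are assumptions based on the description above, not something from the question:

# docker-images/Dockerfile.export (sketch; names and tags are assumptions)
FROM mongo:6 AS mongo-tools

FROM node:18
# mongodump from the MongoDB database tools is a mostly static Go binary,
# so copying it from the official mongo image generally works on glibc-based images
COPY --from=mongo-tools /usr/bin/mongodump /usr/local/bin/mongodump

WORKDIR /app
COPY package*.json ./
RUN npm install
COPY export.js ./

# Run the export script by default, as described above
ENTRYPOINT ["node", "export.js"]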

Related

Use Docker to Distribute CLI Application

I'm somewhat new to Docker. I would like to be able to use Docker to distribute a CLI program, but run the program normally once it has been installed. To be specific, after running docker build on the system, I need to be able to simply run my-program in the terminal, not docker run my-program. How can I do this?
I tried something with a Makefile which runs docker build -t my-program . and then writes a shell script to ~/.local/bin/ called my-program that runs docker run my-program, but this adds another container every time I run the script.
EDIT: I realize this is the expected behavior of docker run, but it does not work for my use case.
Any help is greatly appreciated!
If you want to keep your script, add the remove flag --rm to the docker run command. The remove flag removes the container automatically after the entry-point process has exited.
Additionally, I would personally prefer an alias for this. Simply add something like alias my-program="docker run --rm my-program" to your ~/.bashrc or ~/.zshrc file. This even has the advantage that all other parameters after the alias (my-program param1 param2) are automatically forwarded to the entry point of your image without any additional effort.
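If you prefer the wrapper-script route from the question over an alias, the same idea looks roughly like this (the script path and image name are taken from the question; the rest is a sketch):

#!/bin/sh
# ~/.local/bin/my-program: thin wrapper around the image.
# --rm removes the container after it exits; "$@" forwards all arguments to the entry point.
exec docker run --rm my-program "$@"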

Running DBT within Airflow through the Docker Operator

Building on my question How to run DBT in airflow without copying our repo, I am currently running Airflow and syncing the DAGs via git. I am considering different options to include DBT within my workflow. One suggestion by louis_guitton is to Dockerize the DBT project and run it in Airflow via the Docker Operator.
I have no prior experience using the Docker Operator in Airflow or with DBT in general. I am wondering if anyone has tried this or can provide some insights about their experience incorporating that workflow. My main questions are:
Should DBT as a whole project be run as one Docker container, or is it broken down? (for example: are tests run as a separate container from dbt tasks?)
Are logs and the UI from DBT accessible and/or still useful when run via the Docker Operator?
How would partial pipelines be run? (example: wanting to run only a part of the pipeline)
Judging by your questions, you would benefit from trying to dockerise dbt on its own, independently from airflow. A lot of your questions would disappear. But here are my answers anyway.
Should DBT as a whole project be run as one Docker container, or is it broken down? (for example: are tests run as a separate container from dbt tasks?)
I suggest you build one docker image for the entire project. The docker image can be based on the python image since dbt is a python CLI tool. You then use the CMD arguments of the docker image to run any dbt command you would run outside docker.
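For illustration only, here is a minimal sketch of such an image; the dbt adapter, version, and project layout are assumptions (a real project also needs a profiles.yml mounted or baked in):

FROM python:3.9-slim

# Install dbt for the warehouse you use (Postgres assumed here); pin a version
RUN pip install --no-cache-dir dbt-postgres==1.5.0

# Copy the dbt project into the image
WORKDIR /usr/app/dbt
COPY . .

# Default command; override it at docker run time with any other dbt command
CMD ["dbt", "run"]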
Please remember the syntax of docker run (which has nothing to do with dbt): you can specify any COMMAND you want to run at invocation time:
$ docker run [OPTIONS] IMAGE[:TAG|@DIGEST] [COMMAND] [ARG...]
Also, the first hit on Google for "docker dbt" is this dockerfile, which can get you started.
Are logs and the UI from DBT accessible and/or still useful when run via the Docker Operator?
Again, it's not a dbt question but rather a docker question or an airflow question.
Can you see the logs in the Airflow UI when using a DockerOperator? Yes, see this how-to blog post with screenshots.
Can you access logs from a docker container? Yes, Docker containers emit logs to the stdout and stderr output streams (which you can see in Airflow, since Airflow picks these up). But logs are also stored as JSON files on the host machine under /var/lib/docker/containers/. If you have more advanced needs, you can pick up those logs with a tool (or a simple BashOperator or PythonOperator) and do what you need with them.
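For example, with Docker's default json-file logging driver, those logs can be inspected on the host directly (the container ID below is a placeholder):

# via the docker CLI
docker logs <container-id>

# or the raw JSON file written by the json-file logging driver
sudo cat /var/lib/docker/containers/<container-id>/<container-id>-json.log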
How would partial pipelines be run? (example: wanting to run only a part of the pipeline)
See answer 1: you would run your dbt docker image with the command
$ docker run my-dbt-image dbt run -m stg_customers

Run docker inside docker to avoid installing multiple dependencies

I'm facing a dilemma that I'd like to bring up here to open a constructive discussion.
My use case is very simple:
I need to run a bash script which executes several commands: installing npm packages, running the aws-cli, and querying PostgreSQL. For the last task I use psql. An easy task, I'd say, but Docker slightly complicates the situation.
The problem would be solved if I created an image with all the dependencies installed. However, the result would be a pretty big image, and I'd rather not go with that solution.
What about running the script in one Docker image and then, from the script (inside Docker), running something like
docker run postgres:9.6.3-alpine psql
docker run node:9.8 npm
In other words, the idea would be to run Docker inside Docker. What do you think?
If you want to execute docker run inside a container, just start the outer container with the -v /var/run/docker.sock:/var/run/docker.sock option.
With this, the container can talk to the host's Docker daemon and therefore access any images available on the host (or pull new ones).
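As a sketch of that setup (the image names are only examples): the outer container needs the docker CLI and the socket mount, and then any docker command inside it runs against the host's daemon.

# on the host: the docker:cli image ships the docker client; the socket mount is the key part
docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock docker:cli sh

# inside that container, commands run against the host daemon, e.g.:
docker run --rm postgres:9.6.3-alpine psql --version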

webpack improve automation via improving dockerfile

I want to know how to automate my npm project better with docker.
I'm using webpack with a Vue.js project. When I run npm run build I get an output folder ./dist; this is fine. If I then build a docker image via docker build -t projectname . and run this container, everything works perfectly.
This is my Dockerfile (found here)
FROM httpd:2.4
COPY ./dist /usr/local/apache2/htdocs/
But it would be nice if I could just build the docker image and not have to build the project manually via npm run build. Do you understand my problem?
What could be possible solutions?
If you're doing all of your work (npm build and others) outside of the container, and changes are infrequent, you could use a simple shell script to wrap the two commands.
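For that first option, the wrapper could be as small as this sketch (the script name is made up; the commands are the ones from the question):

#!/bin/sh
# build.sh: build the bundle, then bake it into the image
set -e
npm run build
docker build -t projectname .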
If you're doing more frequent iterative development you might consider using a task runner (grunt maybe?) as a container service (or running it locally).
If you want to do the task running/building inside of Docker, you might look at docker-compose. The exact details of how to set this up would require more detail about your workflow, but docker-compose makes it relatively easy to define and link multiple services in a single file, and to start and stop them with a simple set of commands.
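Another common approach for exactly this, not covered above, is a multi-stage Dockerfile so that npm run build happens during docker build itself. This is only a sketch, and the Node version and paths are assumptions:

# Stage 1: build the Vue.js bundle inside the image build
FROM node:18 AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Stage 2: serve the built files with the same httpd base as before
FROM httpd:2.4
COPY --from=build /app/dist /usr/local/apache2/htdocs/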

is it possible to wrap an entire ubuntu 14 os in a docker image

I have a Ubuntu 14 desktop, on which I do some of my development work.
This work mainly revolves around Django & Flask development using PyCharm
I was wondering if it was possible to wrap the entire OS file system in a Docker container, so my whole development environment, including PyCharm and any other tools, would become portable.
Yes, this is where Docker shines. Once you install Docker you can run:
docker run --name my-dev -it ubuntu:14.04 /bin/bash
and this will put you, as root, at a bash prompt inside a Docker container. It is, for all intents and purposes, the entire OS without anything extra; you will need to install the extras yourself (PyCharm, Flask, Django, etc.) to build up your entire environment. The environment you start with has nothing, so you will have to add things like pip (apt-get install -y python-pip) and other goodies. Once you have your entire environment, you can exit (with exit or ^D) and you will be back in your host operating system. Then you can commit:
docker commit -m 'this is my development image' my-dev my-dev
This takes the Docker container you just ran (and updated with changes) and saves it on your machine as the image my-dev; any time in the future you can run it again using the invocation:
docker run -it my-dev /bin/bash
Building a Docker image interactively like this is the harder way; it is easier once you learn how to write a Dockerfile, a file that describes the base image (ubuntu:14.04) and all of the modifications you want to make to it. I have an example of a Dockerfile here:
https://github.com/tacodata/pythondev
This builds my python development environment, including git, ssh keys, compilers, etc. It does have my name hardcoded in it, so it won't help you much for your own development (I need to fix that). Anyway, you can download the Dockerfile, change the details in it to your own, and create your own image like this:
docker build -t my-dev - < Dockerfile
There are hundreds of examples on Docker Hub, which is where I started with mine.
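As a bare-bones starting point, a Dockerfile along those lines might look like this; the package list is only an example, so extend it with whatever your environment needs:

FROM ubuntu:14.04

# basic tooling plus the frameworks mentioned in the question
RUN apt-get update && apt-get install -y \
    git \
    build-essential \
    python-pip \
    python-flask \
    python-django \
 && rm -rf /var/lib/apt/lists/*

CMD ["/bin/bash"]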
-g
