How to configure a Databricks token inside a Dockerfile - docker

I have a Dockerfile where I want to:
Download the Databricks CLI
Configure the CLI by adding a host and token
And then run a python file that hits the Databricks API
I am able to install the CLI in the docker image, and I have a working python file that is able to submit a job to the Databricks API, but I'm unsure of how to configure my CLI within docker.
Here is what I have
FROM python
MAINTAINER nope
# Creating Application Source Code Directory
RUN mkdir -p /src
# Setting Home Directory for containers
WORKDIR /src
# Installing python dependencies
RUN pip install databricks_cli
# Not sure how to do this part???
# databricks token kicks off the config via CLI
RUN databricks configure --token
# Copying src code to Container
COPY . /src
# Start Container
CMD echo $(databricks --version)
# Kicks off the Python job
CMD ["python", "get_run.py"]
If I were to run databricks configure --token in the CLI, it would prompt for the configs like this:
databricks configure --token
Databricks Host (should begin with https://):

It's better not to do it this way for multiple reasons:
It's insecure - if you configure the Databricks CLI this way, it will generate a file inside the container that can be read by anyone who has access to the image
The token has a time-to-live (the default is 90 days) - this means that you'll need to rebuild your containers regularly...
Instead, it's better to pass two environment variables to the container; they will be picked up by the databricks command. These are DATABRICKS_HOST and DATABRICKS_TOKEN, as described in the documentation.
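For example, a minimal sketch (the image name is a placeholder; the host and token values come from your environment or secret store, not from anything in the question):
docker run --rm \
  -e DATABRICKS_HOST="https://your-databricks-host-url" \
  -e DATABRICKS_TOKEN="$DATABRICKS_TOKEN" \
  my-databricks-image
# the image's CMD (python get_run.py) then runs with these variables available to the databricks CLI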

When databricks configure is run successfully, it writes the information to the file ~/.databrickscfg:
[DEFAULT]
host = https://your-databricks-host-url
token = your-api-token
One way you could set this in the container is by using a startup command (syntax here for docker-compose.yml):
/bin/bash -ic "echo '[DEFAULT]\nhost = ${HOST_URL}\ntoken = ${TOKEN}' > ~/.databrickscfg"

It is not very secure to put your token in the Dockerfile. However, if you want to pursue this approach, you can use the code below.
RUN export DATABRICKS_HOST=XXXXX && \
export DATABRICKS_API_TOKEN=XXXXX && \
export DATABRICKS_ORG_ID=XXXXX && \
export DATABRICKS_PORT=XXXXX && \
export DATABRICKS_CLUSTER_ID=XXXXX && \
echo "{\"host\": \"${DATABRICKS_HOST}\",\"token\": \"${DATABRICKS_API_TOKEN}\",\"cluster_id\":\"${DATABRICKS_CLUSTER_ID}\",\"org_id\": \"${DATABRICKS_ORG_ID}\", \"port\": \"${DATABRICKS_PORT}\" }" >> /root/.databricks-connect
Make sure to run all the commands in one RUN instruction, as above. Otherwise, variables such as DATABRICKS_HOST or DATABRICKS_API_TOKEN will not propagate properly to the later commands.
If you want to connect to a Databricks Cluster within a docker container you need more configuration. You can find the required details in this article: How to Connect a Local or Remote Machine to a Databricks Cluster

Note that the number of personal access tokens per user is limited to 600.
But via bash it's easy:
echo "y
$(WORKSPACE-REGION-URL)
$(CSE-DEVELOP-PAT)
$(EXISTING-CLUSTER-ID)
$(WORKSPACE-ORG-ID)
15001" | databricks-connect configure

If you want to access databricks models/download_artifacts using a hostname and access token, the way you do with the databricks CLI:
databricks configure --token --profile profile_name
Databricks Host (should begin with https://): your_hostname
Token : token
If you have created a profile name, pushed models, and just want to access the models/artifacts in docker using this profile,
add the code below to the Dockerfile.
RUN pip install databricks_cli
ARG HOST_URL
ARG TOKEN
RUN echo "[<profile name>]\nhost = ${HOST_URL}\ntoken = ${TOKEN}" >> ~/.databrickscfg
# this will create your .databrickscfg file with the host and token at build time, the same way databricks configure would
Pass the HOST_URL and TOKEN args in the docker build, e.g.
your host name = https://adb-5443106279769864.19.azuredatabricks.net/
your access token = dapi********************53b1-2
sudo docker build -t test_tag --build-arg HOST_URL=<your host name> --build-arg TOKEN=<your access token> .
And now you can access your experiments using this profile name, Databricks:profile_name, in your code.

Related

Mounted the AWS CLI credentials as a volume into the docker container, however the credentials are still not being picked up

I have created a docker image using the AmazonLinux:2 base image in my Dockerfile. This docker container will run as a Jenkins build agent on a Linux server and has to make certain AWS API calls. In my Dockerfile, I'm copying a shell script called assume-role.sh.
Code snippet:-
COPY ./assume-role.sh .
RUN ["chmod", "+x", "assume-role.sh"]
ENTRYPOINT ["/assume-role.sh"]
CMD ["bash", "--"]
Shell script definition:-
#!/usr/bin/env bash
#echo Your container args are: "${1} ${2} ${3} ${4} ${5}"
echo Your container args are: "${1}"
ROLE_ARN="${1}"
AWS_DEFAULT_REGION="${2:-us-east-1}"
SESSIONID=$(date +"%s")
DURATIONSECONDS="${3:-3600}"
#Temporary loggings starts here
id
pwd
ls .aws
cat .aws/credentials
#Temporary loggings ends here
# AWS STS AssumeRole
RESULT=(`aws sts assume-role --role-arn $ROLE_ARN \
--role-session-name $SESSIONID \
--duration-seconds $DURATIONSECONDS \
--query '[Credentials.AccessKeyId,Credentials.SecretAccessKey,Credentials.SessionToken]' \
--output text`)
# Setting up temporary creds
export AWS_ACCESS_KEY_ID=${RESULT[0]}
export AWS_SECRET_ACCESS_KEY=${RESULT[1]}
export AWS_SECURITY_TOKEN=${RESULT[2]}
export AWS_SESSION_TOKEN=${AWS_SECURITY_TOKEN}
echo 'AWS STS AssumeRole completed successfully'
# Making test AWS API calls
aws s3 ls
echo 'test calls completed'
I'm running the docker container like this:-
docker run -d -v $PWD/.aws:/.aws:ro -e XDG_CACHE_HOME=/tmp/go/.cache arn:aws:iam::829327394277:role/myjenkins test-image
What I'm trying to do here is mount the .aws credentials from the host directory into a volume in the container at the root level. The volume mount is successful and I can see the log output, as described in the shell file :-
ls .aws
cat .aws/credentials
It tells me there is a .aws folder with the credentials inside it at the root level (/). However, somehow the AWS CLI is not picking them up, and as a result the remaining API calls like AWS STS assume-role are failing.
Can somebody please advise?
[Output of docker run]
Your container args are: arn:aws:iam::829327394277:role/myjenkins
uid=0(root) gid=0(root) groups=0(root)
/
config
credentials
[default]
aws_access_key_id = AKXXXXXXXXXXXXXXXXXXXP
aws_secret_access_key = e8SYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYxYm
Unable to locate credentials. You can configure credentials by running "aws configure".
AWS STS AssumeRole completed successfully
Unable to locate credentials. You can configure credentials by running "aws configure".
test calls completed
I found the issue finally.
The path was wrong while mounting the .aws volume to the container.
Instead of this -v $PWD/.aws:/.aws:ro, it was supposed to be -v $PWD/.aws:/root/.aws:ro
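Putting it together, a minimal sketch of the corrected run command (only the mount target is changed, and the image name is placed before the role-ARN argument per docker run's IMAGE [ARG...] ordering):
docker run -d \
  -v $PWD/.aws:/root/.aws:ro \
  -e XDG_CACHE_HOME=/tmp/go/.cache \
  test-image arn:aws:iam::829327394277:role/myjenkins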

Auto-create Rundeck jobs on startup (Rundeck in Docker container)

I'm trying to set up Rundeck inside a Docker container. I want to use Rundeck to provision and manage my Docker fleet. I found an image which ships an ansible-plugin as well. So far running simple playbooks and auto-discovering my Pi nodes work.
Docker script:
echo "[INFO] prepare rundeck-home directory"
mkdir ../../target/work/home
mkdir ../../target/work/home/rundeck
mkdir ../../target/work/home/rundeck/data
echo -e "[INFO] copy host inventory to rundeck-home"
cp resources/inventory/hosts.ini ../../target/work/home/rundeck/data/inventory.ini
echo -e "[INFO] pull image"
docker pull batix/rundeck-ansible
echo -e "[INFO] start rundeck container"
docker run -d \
--name rundeck-raspi \
-p 4440:4440 \
-v "/home/sebastian/work/workspace/workspace-github/raspi/target/work/home/rundeck/data:/home/rundeck/data" \
batix/rundeck-ansible
Now I want to feed the container with playbooks which should become jobs to run in Rundeck. Can anyone give me a hint on how I can create Rundeck jobs (which should invoke an ansible playbook) from the outside? Via api?
One way I can think of is creating the jobs manually once and exporting them as XML or YAML. When the container and Rundeck is up and running I could import the jobs automatically. Is there a certain folder in rundeck-home or somewhere where I can put those files for automatic import? Or is there an API call or something?
Could Jenkins be more suited for this task than Rundeck?
EDIT: just changed to a Dockerfile
FROM batix/rundeck-ansible:latest
COPY resources/inventory/hosts.ini /home/rundeck/data/inventory.ini
COPY resources/realms.properties /home/rundeck/etc/realms.properties
COPY resources/tokens.properties /home/rundeck/etc/tokens.properties
# import jobs
ENV RD_URL="http://localhost:4440"
ENV RD_TOKEN="yJhbGciOiJIUzI1NiIs"
ENV rd_api="36"
ENV rd_project="Test-Project"
ENV rd_job_path="/home/rundeck/data/jobs"
ENV rd_job_file="Ping_Nodes.yaml"
# copy job definitions and script
COPY resources/jobs-definitions/Ping_Nodes.yaml /home/rundeck/data/jobs/Ping_Nodes.yaml
RUN curl -kSsv --header "X-Rundeck-Auth-Token:$RD_TOKEN" \
-F yamlBatch=@"$rd_job_path/$rd_job_file" "$RD_URL/api/$rd_api/project/$rd_project/jobs/import?fileformat=yaml&dupeOption=update"
Do you know how I can delay the curl at the end until after the rundeck service is up and running?
That's right, you can write a script with an API call using cURL (pointing to your Docker instance) that runs after deploying the instance (a script that deploys your instance and later imports the jobs). I leave basic examples below, one for XML and one for YAML job definitions.
For XML job definition format:
#!/bin/sh
# protocol
protocol="http"
# basic rundeck info
rdeck_host="localhost"
rdeck_port="4440"
rdeck_api="36"
rdeck_token="qNcao2e75iMf1PmxYfUJaGEzuVOIW3Xz"
# specific api call info
rdeck_project="ProjectEXAMPLE"
rdeck_xml_file="HelloWorld.xml"
# api call
curl -kSsv --header "X-Rundeck-Auth-Token:$rdeck_token" \
-F xmlBatch=#"$rdeck_xml_file" "$protocol://$rdeck_host:$rdeck_port/api/$rdeck_api/project/$rdeck_project/jobs/import?fileformat=xml&dupeOption=update"
For YAML job definition format:
#!/bin/sh
# protocol
protocol="http"
# basic rundeck info
rdeck_host="localhost"
rdeck_port="4440"
rdeck_api="36"
rdeck_token="qNcao2e75iMf1PmxYfUJaGEzuVOIW3Xz"
# specific api call info
rdeck_project="ProjectEXAMPLE"
rdeck_yml_file="HelloWorldYML.yaml"
# api call
curl -kSsv --header "X-Rundeck-Auth-Token:$rdeck_token" \
-F xmlBatch=#"$rdeck_yml_file" "$protocol://$rdeck_host:$rdeck_port/api/$rdeck_api/project/$rdeck_project/jobs/import?fileformat=yaml&dupeOption=update"
Here is the API call.
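To delay the import until Rundeck is actually up (as asked above), one option is a small wrapper that polls the instance before calling the import script; a minimal sketch, where import_jobs.sh is a hypothetical name for one of the scripts above and the URL and interval are assumptions to adjust:
#!/bin/sh
# start the container first (see the docker run command in the question),
# then wait until Rundeck answers on its port before importing the job definitions
until curl -s -o /dev/null "http://localhost:4440"; do
  echo "waiting for rundeck..."
  sleep 5
done
./import_jobs.sh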

New Docker Build secret information for use with aws cli

I would like to use the new --secret flag in order to retrieve something from aws with its cli during the build process.
# syntax = docker/dockerfile:1.0-experimental
FROM alpine
RUN --mount=type=secret,id=mysecret,dst=/root/.aws cat /root/.aws
I can see the credentials when running the following command:
docker build --no-cache --progress=plain --secret id=mysecret,src=%USERPROFILE%/.aws/credentials .
However, if I adjust the command to be run, the aws cli cannot find the credentials file and asks me to do aws configure:
RUN --mount=type=secret,id=mysecret,dst=/root/.aws aws ssm get-parameter
Any ideas?
The following works:
# syntax = docker/dockerfile:1.0-experimental
FROM alpine
RUN --mount=type=secret,id=aws,dst=/aws export AWS_SHARED_CREDENTIALS_FILE=/aws && aws ssm get-parameter ...
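And the corresponding build command, as a sketch (BuildKit must be enabled, the secret id has to match the id in the RUN --mount line, and $HOME/.aws/credentials stands in for the %USERPROFILE% path from the question):
DOCKER_BUILDKIT=1 docker build --no-cache --progress=plain \
  --secret id=aws,src=$HOME/.aws/credentials .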

Docker Logs Issue : Logs are not created or displayed in Tomcat's logs folder in docker container

We are using a Docker container and created a Dockerfile. Inside this container we deployed a war file using a tomcat image,
and we can see the tomcat logs at the console, but the console log is not updating
after sending a request to tomcat via the URL.
Also we cannot see any log file inside tomcat's logs folder.
Can anyone help me out with how we can see tomcat logs like localhost.log / catalina.log / manager.log etc.?
My Dockerfile is :-
FROM openjdk:6-jre
ENV CATALINA_HOME /usr/local/tomcat
ENV PATH $CATALINA_HOME/bin:$PATH
COPY tomcat $CATALINA_HOME
ADD newui.war $CATALINA_HOME/webapps
CMD $CATALINA_HOME/bin/startup.sh && tail -F $CATALINA_HOME/logs/catalina.out
EXPOSE 8080
I used the below command to build
$ docker build -t tomcat .
and the below to run tomcat
$ docker run -p 8080:8080 tomcat
Here are a few things wrong with your dockerfile:
You mention that you need java 6, and yet the line FROM java as of this writing is set to use java:8.
You need to replace the FROM line with FROM java:6-jre or, as suggested by the official page, FROM openjdk:6-jre if in 2018 you still need java 6, which is dangerous. I would also strongly suggest using at least FROM tomcat:7, which should be able to run java 6 applets but includes some bug fixes, including support for longer Diffie-Hellman primes for HTTPS (if you are serious about your app's security).
Copt tomcat $CATALINA_HOME - you either mistyped the line when posting to SO, or your image should not build at all. It should be COPY tomcat $CATALINA_HOME
Given that you are using the COPY command there is no need to use RUN mkdir -p prior to this, since the COPY command will automatically create all the required folders.
CMD $CATALINA_HOME/bin/startup.sh && tail -f $CATALINA_HOME/logs/catalina.out
First, the tail -f part: since you are looking to tail a log file which might be created and recreated during the server's operation, instead of following the file descriptor you should be following the path by doing tail -F (capital F).
startup.sh && tail - tail will never start until startup.sh exits. A better approach is to do tail -F $CATALINA_HOME/logs/catalina.out & inside your startup.sh right before you start your tomcat server. That way tail will be running in the background (see the sketch at the end of this answer).
Regardless, this is a somewhat dangerous approach and you risk zombie processes, because bash does not manage its child processes and neither does docker. I would recommend using supervisord or something similar.
(From https://docs.docker.com/engine/admin/multi-service_container/)
FROM ubuntu:latest
RUN apt-get update && apt-get install -y supervisor
RUN mkdir -p /var/log/supervisor
COPY supervisord.conf /etc/supervisor/conf.d/supervisord.conf
COPY my_first_process my_first_process
COPY my_second_process my_second_process
CMD ["/usr/bin/supervisord"]
Note: this dockerfile sample omits a few of the best practices, e.g. removing the apt cache in the same RUN command as the apt-get update.
A personal favorite is phusion/baseimage, but it is harder to set up since you'll need to install everything, including java, into the image.
If with all of these modifications you still have no luck in seeing the console update, then you'll need to also post the contents of your startup.sh file or other tomcat related configurations.
P.S.: it might be a good idea to do RUN mkdir -p $CATALINA_HOME/logs just to make sure that the logs folder exists for tomcat to write to.
P.P.S.: the java base image is actually using openjdk instead of the oracle one. Just thought I'd point it out
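As a concrete sketch of the tail -F & idea discussed above - a wrapper around the stock startup.sh rather than a definitive fix, assuming the standard tomcat layout:
#!/bin/sh
# stream the log to the container console in the background
tail -F "$CATALINA_HOME/logs/catalina.out" &
# start tomcat the usual way (startup.sh daemonizes and writes to logs/catalina.out)
"$CATALINA_HOME/bin/startup.sh"
# keep the container alive by waiting on the backgrounded tail
wait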
You should check tomcat logging settings. The default logging.properties in the JRE specifies a ConsoleHandler that routes logging to System.err. The default conf/logging.properties in Apache Tomcat also adds several FileHandlers that write to files.
Example logging.properties file to be placed in $CATALINA_BASE/conf:
handlers = 1catalina.org.apache.juli.FileHandler, \
2localhost.org.apache.juli.FileHandler, \
3manager.org.apache.juli.FileHandler, \
java.util.logging.ConsoleHandler
.handlers = 1catalina.org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
############################################################
# Handler specific properties.
# Describes specific configuration info for Handlers.
############################################################
1catalina.org.apache.juli.FileHandler.level = FINE
1catalina.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
1catalina.org.apache.juli.FileHandler.prefix = catalina.
2localhost.org.apache.juli.FileHandler.level = FINE
2localhost.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
2localhost.org.apache.juli.FileHandler.prefix = localhost.
3manager.org.apache.juli.FileHandler.level = FINE
3manager.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
3manager.org.apache.juli.FileHandler.prefix = manager.
3manager.org.apache.juli.FileHandler.bufferSize = 16384
java.util.logging.ConsoleHandler.level = FINE
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
############################################################
# Facility specific properties.
# Provides extra control for each logger.
############################################################
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].handlers = \
2localhost.org.apache.juli.FileHandler
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].level = INFO
org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/manager].handlers = \
3manager.org.apache.juli.FileHandler
# For example, set the org.apache.catalina.util.LifecycleBase logger to log
# each component that extends LifecycleBase changing state:
#org.apache.catalina.util.LifecycleBase.level = FINE
Example logging.properties for the servlet-examples web application to be placed in WEB-INF/classes inside the web application:
handlers = org.apache.juli.FileHandler, java.util.logging.ConsoleHandler
############################################################
# Handler specific properties.
# Describes specific configuration info for Handlers.
############################################################
org.apache.juli.FileHandler.level = FINE
org.apache.juli.FileHandler.directory = ${catalina.base}/logs
org.apache.juli.FileHandler.prefix = servlet-examples.
java.util.logging.ConsoleHandler.level = FINE
java.util.logging.ConsoleHandler.formatter = java.util.logging.SimpleFormatter
More info at https://tomcat.apache.org/tomcat-6.0-doc/logging.html
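To make this take effect inside the container, the customized file also has to end up in the image; a minimal sketch, assuming the $CATALINA_HOME layout from the question's Dockerfile and a logging.properties file in the build context:
# ship the customized logging configuration with the image
COPY logging.properties $CATALINA_HOME/conf/logging.properties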
We cannot see the logs in the Docker container unless we mount the logs directory.
To build the Dockerfile:-
docker build -t tomcat .
To run the image:-
docker run -p 8080:8080 tomcat
To copy tomcat's logs from the docker container to the host, mount the logs directory as a volume :-
Run this cmd to mount it (1st path : 2nd path):
docker run -d -p 8085:8085 -v /usr/local/tomcat/logs:/usr/local/tomcat/logs tomcat
or simply
docker run -d -v /usr/local/tomcat/logs:/usr/local/tomcat/logs tomcat
1st:- /usr/local/tomcat/logs: path on the host (root dir) where we want the logs to be copied, i.e. the destination
2nd:- /usr/local/tomcat/logs: path of the tomcat/logs folder present in the docker container
tomcat:- name of the image
Change the port if it is busy.
Now the volume is mounted.
To get the list of containers, run: docker ps -a
Then take the container id of the latest created container and open a shell in it:
docker exec -it <mycontainer> bash
Then we can see the logs:
cd /usr/local/tomcat/logs
usr/local/tomcat/logs# less <log name here>
To copy any folder from the docker container to the host :-
docker cp <containerId>:/file/path/within/container /host/path/target

Docker: share private key via arguments

I want to share my github private key with my docker container.
I'm thinking about sharing it via docker-compose.yml via ARGs.
Is it possible to share a private key using ARG as described here?
Pass a variable to a Dockerfile from a docker-compose.yml file
# docker-compose.yml file
version: '2'
services:
my_service:
build:
context: .
dockerfile: ./docker/Dockerfile
args:
- PRIVATE_KEY=MULTI-LINE PLAIN TEXT RSA PRIVATE KEY
and then I expect to use it in my Dockerfile as:
ARG PRIVATE_KEY
RUN echo $PRIVATE_KEY >> ~/.ssh/id_rsa
RUN pip install git+ssh://git@github.com/...
Is it possible via ARGs?
If you can use the latest docker 1.13 (or 17.03 ce), you could then use the docker swarm secret: see "Managing Secrets In Docker Swarm Clusters"
That allows you to associate a secret to a container you are launching:
docker service create --name test \
--secret my_secret \
--restart-condition none \
alpine cat /run/secrets/my_secret
If docker swarm is not an option in your case, you can try and setup a docker credential helper.
See "Getting rid of Docker plain text credentials". But that might not apply to a private ssh key.
You can check other relevant options in "Secrets and LIE-abilities: The State of Modern Secret Management (2017)", using standalone secret manager like Hashicorp Vault.
Although the ARG itself will not persist in the built image, when you reference the ARG variable somewhere in the Dockerfile, that will be in the history:
FROM busybox
ARG SECRET
RUN set -uex; \
echo "$SECRET" > /root/.ssh/id_rsa; \
do_deploy_work; \
rm /root/.ssh/id_rsa
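You can verify this yourself after a build that references the ARG in a RUN step (the names below are just for illustration, and do_deploy_work would need to be a real command for the build above to finish):
docker build --build-arg SECRET=dummyvalue -t arg-test .
docker history --no-trunc arg-test | grep SECRET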
As VonC notes there's now a swarm feature to store and manage secrets but that doesn't (yet) solve the build time problem.
Builds
Coming in Docker ~ 1.14 (or whatever the equivalent new release name is) should be the --build-secret flag (also #28079) that lets you mount a secret file during a build.
In the meantime, one of the solutions is to run a network service somewhere that you can pull secrets from with a client during the build. Then, if the build puts the secret in a file like ~/.ssh/id_rsa, the file must be deleted before the RUN step that created it completes.
The simplest solution I've seen is serving a file with nc:
docker network create build
docker run --name=secret \
--net=build \
--detach \
-v ~/.ssh/id_rsa:/id_rsa \
busybox \
sh -c 'nc -lp 8000 < /id_rsa'
docker build --network=build .
Then collect the secret, store it, use it and remove it in the Dockerfile RUN step.
FROM busybox
RUN set -uex; \
nc secret 8000 > /id_rsa; \
cat /id_rsa; \
rm /id_rsa
Projects
There are a number of utilities with this same premise, at various levels of complexity/features. Some are generic solutions, like Hashicorp's Vault.
Dockito Vault
Hashicorp Vault
docker-ssh-exec
