Airflow Version - 2.3.0
Helm Chart - apache-airflow/airflow
I have been working on setting up airflow using helm on kubernetes.
Currently, I am planning to set Airflow connections using the values.yaml file and env variables instead of configuring them in the web UI.
I believe the setting to tweak to define the connections is:
extraSecrets: {}
# eg:
# extraSecrets:
#   '{{ .Release.Name }}-airflow-connections':
#     type: 'Opaque'
#     data: |
#       AIRFLOW_CONN_GCP: 'base64_encoded_gcp_conn_string'
#       AIRFLOW_CONN_AWS: 'base64_encoded_aws_conn_string'
#     stringData: |
#       AIRFLOW_CONN_OTHER: 'other_conn'
#   '{{ .Release.Name }}-other-secret-name-suffix':
#     data: |
#       ...
I am not sure how to set all the key-value pairs for a Databricks/EMR connection, or how to use the Kubernetes secrets (already set up as env vars in the pods) to supply the values.
It would be great to get some insights on how to resolve this issue.
I looked up this link: managing_connection on airflow.
Changes tried in the values.yaml file:
#extraSecrets:
#  '{{ .Release.Name }}-airflow-connections':
#    type: 'Opaque'
#    data:
#      AIRFLOW_CONN_DATABRICKS_DEFAULT_two:
#        conn_type: "emr"
#        host: <host_url>
#        extra:
#          token: <token string>
#          host: <host_url>
Error Occurred:
While updating helm release:
extraSecrets.{{ .Release.Name }}-airflow-connections expects string, got object
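For context: in the chart's example above, the values of data and stringData are YAML block strings (note the |), not nested objects, which is exactly what this error is pointing at. A variant the chart would accept would look something like this (host and token are placeholders, using the URI format described in the answer below):

extraSecrets:
  '{{ .Release.Name }}-airflow-connections':
    type: 'Opaque'
    stringData: |
      AIRFLOW_CONN_DATABRICKS_DEFAULT: 'databricks://@<DATABRICKS_HOST>?token=<DATABRICKS_TOKEN>'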
Airflow connections can be set using Kubernetes secrets and env variables.
For setting secrets directly from the CLI, the easiest way is to:
Create a kubernetes secret
The secret value (connection string) has to be in the URI format suggested by airflow
my-conn-type://login:password@host:port/schema?param1=val1&param2=val2
Create an env variable in the airflow-suggested format
Airflow format for connection - AIRFLOW_CONN_{connection_name in all CAPS}
Set the value of the connection env variable using the secret
How to manage airflow connections: here
Example:
To set the default Databricks connection (databricks_default) in Airflow:
Create the secret:
kubectl create secret generic airflow-connection-databricks \
    --from-literal=AIRFLOW_CONN_DATABRICKS_DEFAULT='databricks://@<DATABRICKS_HOST>?token=<DATABRICKS_TOKEN>'
In helm's values.yaml, add a new env variable using the secret (the chart's secret: list maps a secret key onto an env variable):
secret:
  - envName: "AIRFLOW_CONN_DATABRICKS_DEFAULT"
    secretName: "airflow-connection-databricks"
    secretKey: "AIRFLOW_CONN_DATABRICKS_DEFAULT"
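To confirm the variable actually lands in the pods, something like the following should show it (the pod name is a placeholder):

kubectl exec -ti <airflow-webserver-pod> -- env | grep AIRFLOW_CONN_DATABRICKS_DEFAULT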
Some useful links:
Managing Airflow connections
Databricks connection
Related
I am trying to deploy Jenkins using helm with JCasC to get Vault secrets. I am using a local minikube to create my k8s cluster and a local Vault instance on my machine (not in the k8s cluster).
Even though I am using initContainerEnv and ContainerEnv, I am not able to reach the Vault values. For the CASC_VAULT_TOKEN value I am using the Vault root token.
This is the helm command I run locally:
helm upgrade --install -f values.yml mijenkins jenkins/jenkins
And here is my values.yml file code:
controller:
  installPlugins:
    # need to add this configuration-as-code due to a known jenkins issue: https://github.com/jenkinsci/helm-charts/issues/595
    - "configuration-as-code:1414.v878271fc496f"
    - "hashicorp-vault-plugin:latest"
  # passing initial environments values to docker basic container
  initContainerEnv:
    - name: CASC_VAULT_TOKEN
      value: "my-vault-root-token"
    - name: CASC_VAULT_URL
      value: "http://localhost:8200"
    - name: CASC_VAULT_PATHS
      value: "cubbyhole/jenkins"
    - name: CASC_VAULT_ENGINE_VERSION
      value: "2"
  ContainerEnv:
    - name: CASC_VAULT_TOKEN
      value: "my-vault-root-token"
    - name: CASC_VAULT_URL
      value: "http://localhost:8200"
    - name: CASC_VAULT_PATHS
      value: "cubbyhole/jenkins"
    - name: CASC_VAULT_ENGINE_VERSION
      value: "2"
  JCasC:
    configScripts:
      here-is-the-user-security: |
        jenkins:
          securityRealm:
            local:
              allowsSignup: false
              enableCaptcha: false
              users:
                - id: "${JENKINS_ADMIN_ID}"
                  password: "${JENKINS_ADMIN_PASSWORD}"
And in my local vault I can see/reach values:
> vault kv get cubbyhole/jenkins
============= Data =============
Key                       Value
---                       -----
JENKINS_ADMIN_ID          alan
JENKINS_ADMIN_PASSWORD    acosta
Do any of you have an idea what I could be doing wrong?
I haven't used Vault with Jenkins, so I'm not exactly sure about your particular situation, but I am very familiar with how finicky the Jenkins helm chart is. I was able to configure my securityRealm (with the Google Login plugin) by first creating a k8s secret with the values needed:
kubectl create secret generic googleoauth --namespace jenkins \
--from-literal=clientid=${GOOGLE_OAUTH_CLIENT_ID} \
--from-literal=clientsecret=${GOOGLE_OAUTH_SECRET}
then passing those values into helm chart values.yml via:
controller:
  additionalExistingSecrets:
    - name: googleoauth
      keyName: clientid
    - name: googleoauth
      keyName: clientsecret
then reading them into JCasC like so:
...
JCasC:
  configScripts:
    authentication: |
      jenkins:
        securityRealm:
          googleOAuth2:
            clientId: ${googleoauth-clientid}
            clientSecret: ${googleoauth-clientsecret}
In order for that to work the values.yml also needs to include the following settings:
serviceAccount:
  name: jenkins
rbac:
  readSecrets: true # allows jenkins serviceAccount to read k8s secrets
Note that I am running Jenkins as a k8s serviceAccount called jenkins in the namespace jenkins.
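If you want to double-check that the service account is actually allowed to read secrets, kubectl can ask on its behalf (a sketch, using the jenkins serviceAccount and namespace from the note above):

kubectl auth can-i get secrets -n jenkins --as=system:serviceaccount:jenkins:jenkins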
After debugging my Jenkins installation I figured out that the main issue was neither my values.yml nor my JCasC integration, as I was able to see the ContainerEnv values when I went inside my Jenkins pod with:
kubectl exec -ti mijenkins-0 -- sh
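Once inside, a quick check lists the variables in question:

env | grep CASC_VAULT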
So I needed to expose my Vault server so that my Jenkins could reach it; I used this Vault tutorial to achieve it. In brief, instead of using the normal:
vault server -dev
We need to use:
vault server -dev -dev-root-token-id root -dev-listen-address 0.0.0.0:8200
Then we need to export an environment variable for the vault CLI to address the Vault server.
export VAULT_ADDR=http://0.0.0.0:8200
After that, we need to determine the Vault address to which we are going to redirect our Jenkins pings. To do that, we start a minikube SSH session:
minikube ssh
Within this SSH session, retrieve the value of the Minikube host.
$ dig +short host.docker.internal
192.168.65.2
After retrieving the value, we are going to retrieve the status of the Vault server to verify network connectivity.
$ dig +short host.docker.internal | xargs -I{} curl -s http://{}:8200/v1/sys/seal-status
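If the connectivity is good, that curl returns Vault's seal status as JSON, shaped roughly like this (values will differ; for a dev server, sealed should be false):

{"type":"shamir","initialized":true,"sealed":false,...}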
And now we can connect our Jenkins pod with our Vault; we just need to change CASC_VAULT_URL to use http://192.168.65.2:8200 in our main .yml file, like this:
- name: CASC_VAULT_URL
  value: "http://192.168.65.2:8200"
I'm trying to set up a local Kibana instance with ActiveMQ for testing purposes. I've created a docker network called elastic-network. I have 3 containers in my network: elasticsearch, kibana, and finally activemq. In my kibana container, I downloaded Metricbeat using the following shell command:
curl -L -O https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.11.2-linux-x86_64.tar.gz
In the configuration file metricbeat.reference.yml, I've changed the host for my ActiveMQ instance running under the container activemq:
- module: activemq
  metricsets: ['broker', 'queue', 'topic']
  period: 10s
  hosts: ['activemq:8161']
  path: '/api/jolokia/?ignoreErrors=true&canonicalNaming=false'
  username: admin # default username
  password: admin # default password
When I run Metricbeat using the verbose parameter ./metricbeat -e, I get an error mentioning that the ActiveMQ API is unreachable. My problem is that Metricbeat ignores my ActiveMQ broker configuration and tries to connect to localhost instead.
Is there a reason why my configuration could be ignored?
After looking through the documentation, I saw that for Linux, unlike the other OSes, you also have to change the configuration in the module directory modules.d/activemq.yml, not just metricbeat.reference.yml:
# Module: activemq
# Docs: https://www.elastic.co/guide/en/beats/metricbeat/7.11/metricbeat-module-activemq.html
- module: activemq
  metricsets: ['broker', 'queue', 'topic']
  period: 10s
  hosts: ['activemq:8161']
  path: '/api/jolokia/?ignoreErrors=true&canonicalNaming=false'
  username: admin # default username
  password: admin # default password
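When running from the extracted tarball, Metricbeat can also manage that file for you, and its test subcommand is a quick way to confirm the module actually reaches the broker (these subcommands exist in Metricbeat 7.x; paths assume the tarball layout):

./metricbeat modules enable activemq   # enables modules.d/activemq.yml
./metricbeat test modules activemq     # fetches the metricsets once and prints the result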
Is there any way to create a proper, really custom .lando.yml file so it will not use any recipe? How do I specify "just give me Apache, MariaDB, PHP" in Lando?
I tried this:
# The name of the app
name: mariadb

# Give me http://mariadb.lndo.site and https://mariadb.lndo.site
proxy:
  html:
    - mariadb.lndo.site

# Set up my services
services:
  # Set up a basic webserver running the latest nginx with ssl turned on
  html:
    type: nginx
    ssl: true
    webroot: www
  # Spin up a mariadb container called "database"
  # NOTE: "database" is arbitrary, you could just as well call this "db" or "kanye"
  database:
    # Use mariadb version 10.1
    type: mariadb:10.1
    # Optionally allow access to the database at localhost:3307
    # You will need to make sure port 3307 is open on your machine
    #
    # You can also set `portforward: true` to have Lando dynamically assign
    # a port. Unlike specifying an actual port setting this to true will give you
    # a different port every time you restart your app
    portforward: 3307
    # Optionally set the default db credentials
    #
    # Note: You will need to `lando destroy && lando start` to change these if you've
    # already started your app
    # See: https://docs.devwithlando.io/tutorials/lando-info.html
    creds:
      user: mariadb
      password: mariadb
      database: mariadb
    # Optionally load in all the mariadb config files in the config directory
    # This is relative to the app root
    # NOTE: these files need to end in .cnf
    config:
      confd: config
but after lando start I am getting an ERROR: No such service: appserver error, and the documentation for this is extremely confusing.
Thanks.
You'll want to look at the Building a Custom Stack section of the lando custom project page.
I won't do your entire project, but the basics are as follows:
# LAMP stack example
name: lamp
proxy:
  appserver:
    - lamp.lndo.site # Allows you to access the site at http[s]://lamp.lndo.site
                     # This may actually get done automatically
services: # Define your services
  appserver: # Create a web server container
    type: php:5.3 # Specify what version of php to use
    via: apache # This could be nginx, should you choose so
    webroot: www # Specify webroot
    config: # If you want to add/edit
      server: config/apache/lamp.conf # Use an alternate apache config file
      conf: path/from/app/root/php.ini # Alter php configuration with a custom file
  database: # Create a database server container
    type: mysql
    portforward: 3308
    creds: # Specify what creds/db to use
      user: lamp
      password: lamp
      database: lamp
tooling: # These toolings allow you to connect lando <command> to the appropriate containers
  composer: # Call with "lando composer..."
    service: appserver
    description: Run composer commands
    cmd: composer --ansi
  php: # Call with "lando php..."
    service: appserver
  mysql: # Call with "lando mysql..."
    user: root
    service: database
    description: Drop into a MySQL shell
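With a file like that in place, day-to-day usage would look something like this (service and tooling names as defined above):

lando start              # build and start the appserver and database containers
lando composer install   # runs composer inside the appserver container
lando mysql              # drops you into a MySQL shell on the database container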
I'm creating a docker image for our fluentd.
The image contains a file called http_forward.conf
It contains:
<store>
  type http
  endpoint_url ENDPOINTPLACEHOLDER
  http_method post        # default: post
  serializer json         # default: form
  rate_limit_msec 100     # default: 0 = no rate limiting
  raise_on_error true     # default: true
  authentication none     # default: none
  username xxx            # default: ''
  password xxx            # default: '', secret: true
</store>
So this is in our image, but we want to use the same image for all our environments, with the differences supplied via environment variables.
So we create an environment variable for our environment:
ISSUE_SERVICE_URL = http://xxx.dev.xxx.xx/api/fluentdIssue
This env variable contains dev on our dev environment, uat on uat, etc.
Then we want to replace our ENDPOINTPLACEHOLDER with the value of our env variable. In bash we can use:
sed -i -- 's|ENDPOINTPLACEHOLDER|'"$ISSUE_SERVICE_URL"'|g' .../http_forward.conf
(Note the | delimiter instead of /, since the URL value itself contains slashes.)
But how/when do we have to execute this command if we want to use this in our docker container? (we don't want to mount this file)
We did that via Ansible: put the file http_forward.conf in as a template, deploy the change depending on the environment, then mount the folder (including the conf file) into the docker container.
ISSUE_SERVICE_URL = http://xxx.{{ environment }}.xxx.xx/api/fluentdIssue
The playbook will be something like this (untested):
- template: src=http_forward.conf.j2 dest=/config/http_forward.conf mode=0644
- docker:
    name: "fluentd"
    image: "xxx/fluentd"
    restart_policy: always
    volumes:
      - /config:/etc/fluent
In your Dockerfile you should have a line starting with CMD somewhere; you should add it there.
Or you can do it cleaner: set the CMD line to call a script instead, for example CMD ./startup.sh. The file startup.sh will then contain your sed command, followed by the command to start your fluentd (I assume that is currently the CMD). A minimal sketch of that approach follows.
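Here the script path and the fluentd launch line are assumptions based on the official fluentd image layout; adjust both to match your image.

Dockerfile additions:

COPY startup.sh /startup.sh
RUN chmod +x /startup.sh
CMD ["/startup.sh"]

startup.sh:

#!/bin/sh
# Substitute the placeholder, then hand off to fluentd.
# '|' is used as the sed delimiter because $ISSUE_SERVICE_URL contains slashes.
sed -i 's|ENDPOINTPLACEHOLDER|'"$ISSUE_SERVICE_URL"'|g' /fluentd/etc/http_forward.conf
exec fluentd -c /fluentd/etc/fluent.conf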
I have built a 4-node kubernetes cluster running multi-container pods, all running on CoreOS. The images come from public and private repositories. Right now I have to log into each node and manually pull down the images each time I update them. I would like to be able to pull them automatically.
I have tried running docker login on each server and putting the .dockercfg file in /root and /core
I have also done the above with the .docker/config.json
I have added a secret to the kube master and added imagePullSecrets with name: docker.io to the Pod configuration file.
When I create the pod I get the error message:
Error: image <user/image>:latest not found
If I log in and run docker pull it will pull the image. I have tried this using docker.io and quay.io.
To add to what @rob said: as of docker 1.7, the use of .dockercfg has been deprecated and docker now uses a ~/.docker/config.json file. There is support for this type of secret in kube 1.1, but you must create it using different keys/type configuration in the yaml:
First, base64 encode your ~/.docker/config.json:
cat ~/.docker/config.json | base64 -w0
Note that the base64 encoding has to appear on a single line, so with -w0 we disable the wrapping.
Next, create a yaml file:
my-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: registrypullsecret
data:
  .dockerconfigjson: <base-64-encoded-json-here>
type: kubernetes.io/dockerconfigjson
$ kubectl create -f my-secret.yaml && kubectl get secrets
NAME                  TYPE                                  DATA
default-token-olob7   kubernetes.io/service-account-token   2
registrypullsecret    kubernetes.io/dockerconfigjson        1
Then, in your pod's yaml (or the pod template of a replication controller), you need to reference registrypullsecret:
apiVersion: v1
kind: Pod
metadata:
  name: my-private-pod
spec:
  containers:
    - name: private
      image: yourusername/privateimage:version
  imagePullSecrets:
    - name: registrypullsecret
If you need to pull an image from a private Docker Hub repository, you can use the following.
Create your secret key
kubectl create secret docker-registry myregistrykey --docker-server=DOCKER_REGISTRY_SERVER --docker-username=DOCKER_USER --docker-password=DOCKER_PASSWORD --docker-email=DOCKER_EMAIL
secret "myregistrykey" created.
Then add the newly created key to your Kubernetes service account.
Retrieve the current service account
kubectl get serviceaccounts default -o yaml > ./sa.yaml
Edit sa.yaml and add the imagePullSecrets section after secrets:
imagePullSecrets:
- name: myregistrykey
Update the service account
kubectl replace serviceaccount default -f ./sa.yaml
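As a one-line alternative that makes the same change, kubectl patch can attach the secret to the default service account directly:

kubectl patch serviceaccount default -p '{"imagePullSecrets": [{"name": "myregistrykey"}]}'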
I can confirm that imagePullSecrets was not working for me with a deployment, but you can:
kubectl create secret docker-registry myregistrykey --docker-server=DOCKER_REGISTRY_SERVER --docker-username=DOCKER_USER --docker-password=DOCKER_PASSWORD --docker-email=DOCKER_EMAIL
kubectl edit serviceaccounts default
Add
imagePullSecrets:
- name: myregistrykey
to the end, after secrets; save and exit.
And it works. Tested with Kubernetes 1.6.7.
Kubernetes supports a special type of secret that you can create that will be used to fetch images for your pods. More details here.
For CentOS 7, the docker config file is under /root/.dockercfg:
echo $(cat /root/.dockercfg) | base64 -w 0
Copy and paste the result into a secret YAML based on the old format:
apiVersion: v1
kind: Secret
metadata:
  name: docker-secret
type: kubernetes.io/dockercfg
data:
  .dockercfg: <YOUR_BASE64_JSON_HERE>
It worked for me; hope it can also help.
The easiest way to create the secret with the same credentials as your docker configuration is:
kubectl create secret generic myregistry --from-file=.dockerconfigjson=$HOME/.docker/config.json
This already encodes data in base64.
If you can download the images with docker, then kubernetes should be able to download them too. But it is required to add this to your kubernetes objects:
spec:
  template:
    spec:
      imagePullSecrets:
        - name: myregistry
      containers:
        # ...
Where myregistry is the name given in the previous command.
Go the easy way, but do not forget to define --type and to add the secret to the proper namespace:
kubectl create secret generic YOURS-SECRET-NAME \
--from-file=.dockerconfigjson=$HOME/.docker/config.json \
--type=kubernetes.io/dockerconfigjson
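To sanity-check what ended up in the secret, you can decode it back out (the jsonpath dot-key escaping here follows the Kubernetes docs):

kubectl get secret YOURS-SECRET-NAME --output="jsonpath={.data.\.dockerconfigjson}" | base64 --decode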